bishwaspraveen committed on
Commit 4dfb96e · verified · 1 Parent(s): 5d6247c

Add model from wandb - run astroBERT-80-epochs

Files changed (1)
  1. model_architecture.txt +45 -0
model_architecture.txt ADDED
@@ -0,0 +1,45 @@
+ BertForSequenceClassification(
+   (bert): BertModel(
+     (embeddings): BertEmbeddings(
+       (word_embeddings): Embedding(30000, 768, padding_idx=0)
+       (position_embeddings): Embedding(512, 768)
+       (token_type_embeddings): Embedding(2, 768)
+       (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+       (dropout): Dropout(p=0.1, inplace=False)
+     )
+     (encoder): BertEncoder(
+       (layer): ModuleList(
+         (0-11): 12 x BertLayer(
+           (attention): BertAttention(
+             (self): BertSdpaSelfAttention(
+               (query): Linear(in_features=768, out_features=768, bias=True)
+               (key): Linear(in_features=768, out_features=768, bias=True)
+               (value): Linear(in_features=768, out_features=768, bias=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+             (output): BertSelfOutput(
+               (dense): Linear(in_features=768, out_features=768, bias=True)
+               (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+               (dropout): Dropout(p=0.1, inplace=False)
+             )
+           )
+           (intermediate): BertIntermediate(
+             (dense): Linear(in_features=768, out_features=3072, bias=True)
+             (intermediate_act_fn): GELUActivation()
+           )
+           (output): BertOutput(
+             (dense): Linear(in_features=3072, out_features=768, bias=True)
+             (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
+             (dropout): Dropout(p=0.1, inplace=False)
+           )
+         )
+       )
+     )
+     (pooler): BertPooler(
+       (dense): Linear(in_features=768, out_features=768, bias=True)
+       (activation): Tanh()
+     )
+   )
+   (dropout): Dropout(p=0.1, inplace=False)
+   (classifier): Linear(in_features=768, out_features=36, bias=True)
+ )
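
The shapes printed in the dump above are enough to estimate the trainable parameter count by hand. The following is a minimal sketch, not part of the commit: it assumes every `Linear` carries a bias (as the dump shows with `bias=True`) and every `LayerNorm` has a weight and a bias (`elementwise_affine=True`), and it ignores non-trainable buffers. The helper and variable names are illustrative only.

```python
# Parameter count derived purely from the layer shapes in model_architecture.txt.
# Helper names and variable names are assumptions made for this sketch.

def linear(in_features, out_features):
    """Weights plus bias of a Linear layer with bias=True."""
    return in_features * out_features + out_features

def layer_norm(dim):
    """Scale and shift vectors of a LayerNorm with elementwise_affine=True."""
    return 2 * dim

hidden = 768        # width of every embedding and attention projection
ffn = 3072          # BertIntermediate output width
vocab = 30000       # word_embeddings rows
positions = 512     # position_embeddings rows
token_types = 2     # token_type_embeddings rows
num_layers = 12     # "(0-11): 12 x BertLayer"
num_labels = 36     # classifier out_features

embeddings = (vocab + positions + token_types) * hidden + layer_norm(hidden)

# query, key, value, and the BertSelfOutput dense layer, plus one LayerNorm
attention = 4 * linear(hidden, hidden) + layer_norm(hidden)

# BertIntermediate dense, BertOutput dense, plus one LayerNorm
feed_forward = linear(hidden, ffn) + linear(ffn, hidden) + layer_norm(hidden)

encoder = num_layers * (attention + feed_forward)
pooler = linear(hidden, hidden)
classifier = linear(hidden, num_labels)

total = embeddings + encoder + pooler + classifier

print(f"embeddings: {embeddings:,}")
print(f"encoder:    {encoder:,}")
print(f"pooler:     {pooler:,}")
print(f"classifier: {classifier:,}")
print(f"total:      {total:,}")
```

Under these assumptions the encoder accounts for most of the weights, while the 36-way classification head adds only 27,684 parameters on top of the BERT backbone.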