sjrhuschlee committed on
Commit
da11330
1 Parent(s): 859875b

Update README.md

  1. README.md +71 -2
README.md CHANGED
@@ -7,6 +7,39 @@ language:
  tags:
  - bart
  - question-answering
  ---

  # bart-base for Extractive QA
@@ -39,23 +72,59 @@ tokenizer = AutoTokenizer.from_pretrained(model_name)
  ```

  ## Metrics
- More information needed.

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 2e-06
  - train_batch_size: 16
  - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 6
  - total_train_batch_size: 96
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 4.0


  ### Framework versions
 
  tags:
  - bart
  - question-answering
+ model-index:
+ - name: sjrhuschlee/bart-base-squad2
+   results:
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad_v2
+       type: squad_v2
+       config: squad_v2
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 75.223
+       name: Exact Match
+     - type: f1
+       value: 78.443
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad
+       type: squad
+       config: plain_text
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 83.406
+       name: Exact Match
+     - type: f1
+       value: 90.377
+       name: F1
  ---

  # bart-base for Extractive QA
 
  ```

  ## Metrics
+
+ ```bash
+ # Squad v2
+ {
+   "eval_HasAns_exact": 76.45074224021593,
+   "eval_HasAns_f1": 82.88605283171232,
+   "eval_HasAns_total": 5928,
+   "eval_NoAns_exact": 74.01177460050462,
+   "eval_NoAns_f1": 74.01177460050462,
+   "eval_NoAns_total": 5945,
+   "eval_best_exact": 75.23793481007327,
+   "eval_best_exact_thresh": 0.0,
+   "eval_best_f1": 78.45098300230696,
+   "eval_best_f1_thresh": 0.0,
+   "eval_exact": 75.22951233892024,
+   "eval_f1": 78.44256053115387,
+   "eval_runtime": 131.875,
+   "eval_samples": 11955,
+   "eval_samples_per_second": 90.654,
+   "eval_steps_per_second": 3.784,
+   "eval_total": 11873
+ }
+
+ # Squad
+ {
+   "eval_exact_match": 83.40586565752129,
+   "eval_f1": 90.37706849113668,
+   "eval_runtime": 117.2093,
+   "eval_samples": 10619,
+   "eval_samples_per_second": 90.599,
+   "eval_steps_per_second": 3.78
+ }
+ ```
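Reviewer note on the SQuAD v2 numbers: the overall `eval_exact` and `eval_f1` values should equal the average of the `HasAns` and `NoAns` scores, weighted by the number of questions in each group. The following is a minimal Python sketch of that sanity check, using only values copied from the output above. The dictionary and function names (`HAS_ANS`, `NO_ANS`, `REPORTED`, `weighted_average`) are illustrative and not part of the repository.

```python
# Per-class results copied from the SQuAD v2 evaluation output in this diff.
HAS_ANS = {"exact": 76.45074224021593, "f1": 82.88605283171232, "total": 5928}
NO_ANS = {"exact": 74.01177460050462, "f1": 74.01177460050462, "total": 5945}
REPORTED = {"exact": 75.22951233892024, "f1": 78.44256053115387, "total": 11873}


def weighted_average(metric: str) -> float:
    """Combine the HasAns and NoAns scores, weighting each by its question count."""
    total = HAS_ANS["total"] + NO_ANS["total"]
    weighted_sum = (
        HAS_ANS[metric] * HAS_ANS["total"] + NO_ANS[metric] * NO_ANS["total"]
    )
    return weighted_sum / total


# Print the recomputed value next to the reported one for each metric.
for metric in ("exact", "f1"):
    recomputed = weighted_average(metric)
    print(f"{metric}: recomputed={recomputed:.5f} reported={REPORTED[metric]:.5f}")
```

If the recomputed and reported values diverged, that would suggest the per-class and overall numbers came from different evaluation runs.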
 
  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
+ - max_seq_length 512
+ - doc_stride 128
  - learning_rate: 2e-06
  - train_batch_size: 16
  - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 6
  - total_train_batch_size: 96
+ - optimizer: Adam8Bit with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 4.0
+ - gradient_checkpointing: True
+ - tf32: True


  ### Framework versions
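Reviewer note on the hyperparameter list in this diff: `total_train_batch_size: 96` follows from `train_batch_size: 16` and `gradient_accumulation_steps: 6` if training ran on a single device, which is an assumption since the diff does not state the device count. The plain-Python sketch below reproduces that arithmetic and illustrates the general shape of a linear schedule with a 0.1 warmup ratio. The total step count is not given in the diff, so `HYPOTHETICAL_TOTAL_STEPS` is a made-up value for illustration, and `linear_lr` is a simplified stand-in rather than the training framework's own scheduler.

```python
import math

# Values copied from the hyperparameter list in this diff.
LEARNING_RATE = 2e-06
TRAIN_BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 6
WARMUP_RATIO = 0.1

# Assumption: a single device, which makes 16 * 6 match the listed total of 96.
NUM_DEVICES = 1

# Hypothetical value for illustration only; the real step count is not in the diff.
HYPOTHETICAL_TOTAL_STEPS = 1000


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_ratio: float) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:",
      effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
for step in (0, 50, 100, 550, 1000):
    lr = linear_lr(step, HYPOTHETICAL_TOTAL_STEPS, LEARNING_RATE, WARMUP_RATIO)
    print(f"step {step}: lr={lr:.3e}")
```

With a peak learning rate of 2e-06 the rate is small throughout, so the warmup phase mainly matters for the first tenth of the run.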