thrunlab/t5-large_sst2_sp0_ar0

@@ -22,7 +22,7 @@ model-index:
     metrics:
     - name: Accuracy
       type: accuracy
-      value: 0.9560546875
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,8 +32,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the glue dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3456
-- Accuracy: 0.9561
 ## Model description
@@ -56,51 +56,50 @@ The following hyperparameters were used during training:
 - train_batch_size: 16
 - eval_batch_size: 32
 - seed: 1
-- distributed_type: tpu
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 20
-- training_steps: 750
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.6852        | 0.01  | 25   | 0.6952          | 0.5092   |
-| 0.6751        | 0.01  | 50   | 0.6331          | 0.7546   |
-| 0.603         | 0.02  | 75   | 0.4811          | 0.8899   |
-| 0.3459        | 0.02  | 100  | 0.2048          | 0.9335   |
-| 0.1808        | 0.03  | 125  | 0.2377          | 0.9300   |
-| 0.1933        | 0.04  | 150  | 0.3369          | 0.9323   |
-| 0.527         | 0.04  | 175  | 0.6582          | 0.9404   |
-| 0.2241        | 0.05  | 200  | 0.1874          | 0.9507   |
-| 0.1997        | 0.05  | 225  | 0.5160          | 0.9472   |
-| 0.2192        | 0.06  | 250  | 0.5193          | 0.9461   |
-| 0.168         | 0.07  | 275  | 0.4091          | 0.9484   |
-| 0.1879        | 0.07  | 300  | 0.3114          | 0.9427   |
-| 0.1653        | 0.08  | 325  | 0.5526          | 0.9484   |
-| 0.1847        | 0.08  | 350  | 0.6536          | 0.9450   |
-| 0.1449        | 0.09  | 375  | 0.6520          | 0.9438   |
-| 0.2485        | 0.1   | 400  | 0.4093          | 0.9518   |
-| 0.1604        | 0.1   | 425  | 0.2821          | 0.9461   |
-| 0.1316        | 0.11  | 450  | 0.8609          | 0.9461   |
-| 0.1754        | 0.11  | 475  | 0.4047          | 0.9472   |
-| 0.1524        | 0.12  | 500  | 0.4034          | 0.9495   |
-| 0.4571        | 0.13  | 525  | 0.2895          | 0.9495   |
-| 0.1448        | 0.13  | 550  | 0.5239          | 0.9484   |
-| 0.1459        | 0.14  | 575  | 0.2996          | 0.9518   |
-| 0.2131        | 0.14  | 600  | 0.2983          | 0.9495   |
-| 0.1298        | 0.15  | 625  | 0.5322          | 0.9484   |
-| 0.1519        | 0.16  | 650  | 0.5311          | 0.9518   |
-| 0.1809        | 0.16  | 675  | 0.5271          | 0.9495   |
-| 0.1495        | 0.17  | 700  | 0.5282          | 0.9495   |
-| 0.1665        | 0.17  | 725  | 0.5307          | 0.9507   |
-| 0.1978        | 0.18  | 750  | 0.5295          | 0.9507   |
 ### Framework versions
 - Transformers 4.33.2
-- Pytorch 2.0.0+cu118
 - Datasets 2.14.5
 - Tokenizers 0.11.6

     metrics:
     - name: Accuracy
       type: accuracy
+      value: 0.9453125
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the glue dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1944
+- Accuracy: 0.9453
 ## Model description
 - train_batch_size: 16
 - eval_batch_size: 32
 - seed: 1
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 20
+- num_epochs: 6
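
The `lr_scheduler_type: linear` and `lr_scheduler_warmup_steps: 20` entries above describe a learning rate that ramps up linearly over the first 20 steps and then decays linearly toward zero. A minimal sketch of that multiplier in plain Python follows. The function name and the `total_steps` default are illustrative assumptions: the card lists `training_steps: 750` before this change and `num_epochs: 6` after it, and the base learning rate is not shown in this hunk, so the numbers are placeholders rather than the run's actual schedule.

```python
def linear_warmup_multiplier(step: int, warmup_steps: int = 20, total_steps: int = 750) -> float:
    """Factor applied to the base learning rate at a given optimizer step.

    warmup_steps=20 comes from the card; total_steps=750 is an assumption
    taken from the pre-change `training_steps` value.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to 1.
        return step / max(1, warmup_steps)
    # Decay phase: fall linearly from 1 at the end of warmup to 0 at total_steps.
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for s in (0, 10, 20, 385, 750):
        print(s, linear_warmup_multiplier(s))
```

Multiplying the returned factor by the base learning rate gives the per-step rate. With a different `total_steps` (for example one derived from `num_epochs: 6`), only the slope of the decay phase changes.
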
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Accuracy |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.6815        | 0.01  | 25   | 0.6999          | 0.5092   |
+| 0.6592        | 0.01  | 50   | 0.6221          | 0.6445   |
+| 0.5832        | 0.02  | 75   | 0.4570          | 0.7993   |
+| 0.2882        | 0.02  | 100  | 0.2076          | 0.9358   |
+| 0.1894        | 0.03  | 125  | 0.3499          | 0.9404   |
+| 0.1864        | 0.04  | 150  | 0.2963          | 0.9461   |
+| 0.2553        | 0.04  | 175  | 0.6929          | 0.9289   |
+| 0.245         | 0.05  | 200  | 0.4761          | 0.9323   |
+| 0.2042        | 0.05  | 225  | 0.5294          | 0.9461   |
+| 0.2002        | 0.06  | 250  | 0.8441          | 0.9472   |
+| 0.1633        | 0.07  | 275  | 0.8560          | 0.9495   |
+| 0.1939        | 0.07  | 300  | 0.3197          | 0.9450   |
+| 0.1928        | 0.08  | 325  | 0.4214          | 0.9472   |
+| 0.2201        | 0.08  | 350  | 0.5266          | 0.9484   |
+| 0.143         | 0.09  | 375  | 0.8642          | 0.9450   |
+| 0.2354        | 0.1   | 400  | 1.2116          | 0.9335   |
+| 0.1692        | 0.1   | 425  | 0.1807          | 0.9472   |
+| 0.1531        | 0.11  | 450  | 0.6431          | 0.9484   |
+| 0.152         | 0.11  | 475  | 1.4046          | 0.9553   |
+| 0.1948        | 0.12  | 500  | 0.1596          | 0.9553   |
+| 0.2007        | 0.13  | 525  | 0.1779          | 0.9438   |
+| 0.1338        | 0.13  | 550  | 0.6476          | 0.9495   |
+| 0.3812        | 0.14  | 575  | 0.3901          | 0.9484   |
+| 0.7052        | 0.14  | 600  | 0.1740          | 0.9507   |
+| 0.8601        | 0.15  | 625  | 1.5226          | 0.9484   |
+| 1.384         | 0.16  | 650  | 0.6605          | 0.9427   |
+| 0.6833        | 0.16  | 675  | 0.7313          | 0.9484   |
+| 0.1833        | 0.17  | 700  | 0.4110          | 0.9438   |
+| 0.1968        | 0.17  | 725  | 0.2914          | 0.9450   |
+| 0.2001        | 0.18  | 750  | 0.1947          | 0.9335   |
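
Validation loss in the table above swings widely between evaluations (1.4046 at step 475, 0.1596 at step 500) while accuracy stays in a narrow band, so the last row is not necessarily the best one. The sketch below shows one way to rank logged evaluations. The rows are copied from the added table; `parse_rows` and the tie-breaking rule are hypothetical helpers for illustration, not something the card says was used to pick a checkpoint.

```python
# A few rows copied verbatim from the added training-results table.
ROWS = """
| 0.1692        | 0.1   | 425  | 0.1807          | 0.9472   |
| 0.1531        | 0.11  | 450  | 0.6431          | 0.9484   |
| 0.152         | 0.11  | 475  | 1.4046          | 0.9553   |
| 0.1948        | 0.12  | 500  | 0.1596          | 0.9553   |
| 0.2007        | 0.13  | 525  | 0.1779          | 0.9438   |
| 0.2001        | 0.18  | 750  | 0.1947          | 0.9335   |
"""


def parse_rows(text: str) -> list[dict]:
    """Turn markdown table rows into dicts keyed by column meaning."""
    records = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss, accuracy = cells
        records.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "accuracy": float(accuracy),
        })
    return records


records = parse_rows(ROWS)

# Lowest validation loss among the logged evaluations.
best_by_loss = min(records, key=lambda r: r["val_loss"])

# Highest accuracy; ties are broken in favor of the lower validation loss.
best_by_acc = max(records, key=lambda r: (r["accuracy"], -r["val_loss"]))

print(best_by_loss["step"], best_by_acc["step"])
```

For these rows both criteria select step 500. Steps 475 and 500 tie on accuracy (0.9553), which is why the tie-break on validation loss matters.
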
 ### Framework versions
 - Transformers 4.33.2
+- Pytorch 2.0.1+cu117
 - Datasets 2.14.5
 - Tokenizers 0.11.6