MatthisHoules/t5-large-finetuned-break-qdmr-decomposition

+---
+license: apache-2.0
+tags:
+- generated_from_trainer
+datasets:
+- break_data
+metrics:
+- bleu
+model-index:
+- name: t5-large-finetuned-break-qdmr-decomposition
+  results:
+  - task:
+      name: Sequence-to-sequence Language Modeling
+      type: text2text-generation
+    dataset:
+      name: break_data
+      type: break_data
+      config: QDMR
+      split: validation
+      args: QDMR
+    metrics:
+    - name: Bleu
+      type: bleu
+      value: 0.22169382457557757
+---
+# t5-large-finetuned-break-qdmr-decomposition
+This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the `QDMR` configuration of the `break_data` dataset.
+It achieves the following results on the evaluation set (the `validation` split):
+- Loss: 0.1729
+- Bleu: 0.2217
+- Precisions: [0.928997558602713, 0.8089017135403285, 0.702859772673759, 0.6237525532535746]
+- Brevity Penalty: 0.2926
+- Length Ratio: 0.4487
+- Translation Length: 108954
+- Reference Length: 242845
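
The BLEU score is low even though the n-gram precisions are high because the generated output is less than half the reference length, so the brevity penalty dominates. As a sanity check, the score can be recomputed from the components listed above. This is a minimal sketch that assumes the standard BLEU definition (brevity penalty times the geometric mean of the 1- to 4-gram precisions); it is not the evaluation script used during training, and the variable names are illustrative.

```python
import math

# Values copied from the evaluation results listed above.
precisions = [0.928997558602713, 0.8089017135403285,
              0.702859772673759, 0.6237525532535746]
translation_length = 108954
reference_length = 242845

# Length ratio: generated tokens divided by reference tokens.
length_ratio = translation_length / reference_length

# Standard BLEU brevity penalty: 1 if the output is at least as long as
# the reference, otherwise exp(1 - reference_length / translation_length).
if length_ratio >= 1.0:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1.0 - 1.0 / length_ratio)

# Geometric mean of the n-gram precisions (uniform weights assumed).
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean

print(f"length ratio    = {length_ratio:.4f}")
print(f"brevity penalty = {brevity_penalty:.4f}")
print(f"geometric mean  = {geo_mean:.4f}")
print(f"BLEU            = {bleu:.4f}")
```

Under these assumptions the recomputed length ratio, brevity penalty, and BLEU should agree closely with the values reported above. Without the brevity penalty, the geometric mean of the precisions alone would be roughly 0.76.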
+## Model description
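+T5-large fine-tuned as a text-to-text model to generate QDMR (Question Decomposition Meaning Representation) decompositions of questions, using the Break dataset.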
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
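+The model was fine-tuned on the `QDMR` configuration of the `break_data` dataset. The reported metrics are computed on its `validation` split.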
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 2
+- eval_batch_size: 2
+- seed: 42
+- gradient_accumulation_steps: 64
+- total_train_batch_size: 128
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 10
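
The following sketch shows how these settings combine. It rests on three assumptions that the card does not state: training ran on a single device (consistent with 2 x 64 = 128), there were no warmup steps, and the schedule spans the 3460 optimizer steps shown as the last step in the training results table. The `linear_lr` helper is hypothetical and only illustrates a linear decay to zero; it is not the scheduler object used by the Trainer.

```python
# Hyperparameters copied from the list above.
learning_rate = 1e-4
train_batch_size = 2
gradient_accumulation_steps = 64

# Assumption: one device, so the effective batch size is the per-device
# batch size multiplied by the number of accumulation steps.
num_devices = 1
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)

# Assumptions: no warmup, and 3460 optimizer steps in total
# (the final step listed in the training results table).
warmup_steps = 0
total_steps = 3460


def linear_lr(step: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
for step in (0, 346, 1731, 3460):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

With a per-device batch of 2, gradient accumulation is what makes the effective batch of 128 possible: the optimizer steps once every 64 forward/backward passes.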
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Bleu   | Precisions                                                                       | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
+|:-------------:|:-----:|:----:|:---------------:|:------:|:--------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|
+| No log        | 1.0   | 346  | 0.2217          | 0.2190 | [0.9212396799650076, 0.7929651493459373, 0.6788405612515656, 0.5938190356122556] | 0.2973          | 0.4519       | 109738             | 242845           |
+| 0.3597        | 2.0   | 692  | 0.1898          | 0.2213 | [0.9278319373884388, 0.8053505444154309, 0.6955454787943451, 0.6142312076867599] | 0.2944          | 0.4499       | 109245             | 242845           |
+| 0.1943        | 3.0   | 1038 | 0.1780          | 0.2213 | [0.9274868270332188, 0.805860010851872, 0.6987019924149351, 0.6179670572886331]  | 0.2936          | 0.4494       | 109125             | 242845           |
+| 0.1943        | 4.0   | 1385 | 0.1722          | 0.2209 | [0.9296421064226247, 0.8077246177717601, 0.6996456975263051, 0.618521199103474]  | 0.2926          | 0.4486       | 108943             | 242845           |
+| 0.1588        | 5.0   | 1731 | 0.1708          | 0.2221 | [0.9263551333376084, 0.8062900028599888, 0.7016414100962206, 0.6226711690731253] | 0.2938          | 0.4495       | 109159             | 242845           |
+| 0.1395        | 6.0   | 2077 | 0.1699          | 0.2209 | [0.9307313480922355, 0.8116381660470879, 0.7052247221178113, 0.6255682084446319] | 0.2907          | 0.4473       | 108635             | 242845           |
+| 0.1395        | 7.0   | 2423 | 0.1699          | 0.2219 | [0.9294629418890643, 0.8099284613256393, 0.7035550704165061, 0.623971523603898]  | 0.2927          | 0.4487       | 108964             | 242845           |
+| 0.1245        | 8.0   | 2770 | 0.1717          | 0.2215 | [0.9293905921457364, 0.8091923795588686, 0.7026416387368962, 0.6239635641714353] | 0.2924          | 0.4485       | 108909             | 242845           |
+| 0.1152        | 9.0   | 3116 | 0.1724          | 0.2215 | [0.9294489230034706, 0.8091424956007671, 0.7027003876051995, 0.6234366789280084] | 0.2924          | 0.4485       | 108914             | 242845           |
+| 0.1152        | 9.99  | 3460 | 0.1729          | 0.2217 | [0.928997558602713, 0.8089017135403285, 0.702859772673759, 0.6237525532535746]   | 0.2926          | 0.4487       | 108954             | 242845           |
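
Validation loss in the table bottoms out around epochs 6 and 7 and rises slightly afterwards, while BLEU varies by only about 0.003 across all epochs. A small sketch for locating those rows, with the (epoch, validation loss, BLEU) triples copied by hand from the table; the variable names are illustrative.

```python
# (epoch, validation loss, BLEU) copied from the training results table.
results = [
    (1.0, 0.2217, 0.2190),
    (2.0, 0.1898, 0.2213),
    (3.0, 0.1780, 0.2213),
    (4.0, 0.1722, 0.2209),
    (5.0, 0.1708, 0.2221),
    (6.0, 0.1699, 0.2209),
    (7.0, 0.1699, 0.2219),
    (8.0, 0.1717, 0.2215),
    (9.0, 0.1724, 0.2215),
    (9.99, 0.1729, 0.2217),
]

# min/max return the first matching row when several rows tie.
best_loss = min(results, key=lambda row: row[1])
best_bleu = max(results, key=lambda row: row[2])

# Spread between the highest and lowest BLEU over all epochs.
bleu_spread = best_bleu[2] - min(row[2] for row in results)

print("lowest validation loss:", best_loss)
print("highest BLEU:", best_bleu)
print(f"BLEU spread across epochs: {bleu_spread:.4f}")
```

The final checkpoint (epoch 9.99) is neither the lowest-loss nor the highest-BLEU row, although the differences are small.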
+### Framework versions
+- Transformers 4.30.2
+- Pytorch 2.0.1+cu118
+- Datasets 2.13.1
+- Tokenizers 0.13.3