emnlp2023 committed
Commit 5bcec32 · 1 parent: 48b6599

Update README.md

Files changed (1): README.md (+11, -18)
README.md CHANGED
@@ -67,13 +67,13 @@ which is subsequently served by extending model's decoder input context by addin
  - **Developed by:** Anonymous
  - **Model type:** Autoregressive Encoder-Decoder
  - **Language(s):** en
- - **Finetuned from:** google/calc-t5-large
+ - **Finetuned from:** t5-large

  ### Model Sources

  <!-- Provide the basic links for the model. -->

- - **Repository:** https://github.com/emnlp2023/gadgets
+ - **Repository:** https://github.com/emnlp2023sub/gadgets
  - **Paper:** Stay tuned!

  ## Usage
@@ -82,8 +82,8 @@ Additionally to conventional generation, using Tool-augmented generation require
  (1) implementation of the tool(s) and
  (2) a customization of the generate() method, augmenting the input context on demand with the outputs of the tools.

- You can find these two components implemented in the attached **gadget_assisted_model.py** and **gadget.py** in this model's repo
- and the project's [home repo](https://github.com/emnlp2023/gadgets).
+ You can find these two components implemented in **gadgets/gadget_assisted_model.py** and **gadgets/gadget.py** in the project's [home repo](https://github.com/emnlp2023sub/gadgets).
+

  After adding these two scripts to your directory, you can use the model as follows:
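A minimal sketch of that usage follows. It assumes the mixin in **gadget_assisted_model.py** is named `GadgetAssistedModel`, the calculator tool in **gadget.py** is named `Calculator`, and that a `prepare_for_generate()` helper registers the enabled gadgets; verify these names against the scripts themselves, and replace the placeholder Hub id with this model's actual id.

```python
# Hypothetical usage sketch. Class and method names are assumed from the
# repo's file layout (gadget_assisted_model.py, gadget.py); check the sources.
from transformers import T5ForConditionalGeneration, T5Tokenizer

from gadget_assisted_model import GadgetAssistedModel  # generate() override
from gadget import Calculator  # the calculator tool


class GadgetAssistedT5(GadgetAssistedModel, T5ForConditionalGeneration):
    """T5 whose generate() pauses whenever a <gadget>...</gadget> call is
    emitted, runs the gadget, and appends its output to the decoder
    context before resuming generation."""


model_id = "emnlp2023/calc-t5-large"  # placeholder: use this model's Hub id
model = GadgetAssistedT5.from_pretrained(model_id)
tokenizer = T5Tokenizer.from_pretrained(model_id)

# Register the tools the model is allowed to call during generation
# (assumed helper; the scripts define the actual registration API).
model.prepare_for_generate(tokenizer,
                           enabled_gadgets=[Calculator()],
                           default_max_tokens=512)

question = ("The profit from a business transaction is shared between two "
            "partners in the ratio 3:5. If the first partner gets $300, "
            "how much do the two partners get in total?")

inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```

The decoded sequence interleaves the model's reasoning with tool calls and their outputs, and closes with the answer wrapped in result tags, e.g. `Final result is<result>800</result></s>`.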
 
@@ -130,24 +130,17 @@ Final result is<result>800</result></s>
  Note that given the limited complexity of the exercises in its training data, this model will not work well for tasks requiring
  more complex algebraic operations, including equations, variables and operations outside the scope of (+-*/).

+
  ## Training Details

  ### Training Data
-
  <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
- This model was trained on our Calculator-augmented set of [ape210k dataset github](https://github.com/Chenny0808/ape210k),
- [mathqa HF dataset](https://huggingface.co/datasets/math_qa),
- [gsm8k HF dataset](https://huggingface.co/datasets/gsm8k),
- [aqua_rat](https://huggingface.co/datasets/aqua_rat),
+ This model was trained on our Calculator-augmented versions of
+
+ - [Calc Ape210k](https://huggingface.co/datasets/emnlp2023/Calc-ape210k) ([original Ape210k on GitHub](https://github.com/Chenny0808/ape210k))
+ - [Calc MathQA](https://huggingface.co/datasets/emnlp2023/Calc-math_qa) ([original MathQA on HF](https://huggingface.co/datasets/math_qa))
+ - [Calc GSM8K](https://huggingface.co/datasets/emnlp2023/Calc-gsm8k) ([original GSM8K on HF](https://huggingface.co/datasets/gsm8k))
+ - [Calc Aqua-RAT](https://huggingface.co/datasets/emnlp2023/Calc-aqua_rat) ([original Aqua-RAT on HF](https://huggingface.co/datasets/aqua_rat))
+
  in a standard auto-regressive setup, i.e., for conditional next-token prediction with a teacher-forced prefix.

- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- The model was fine-tuned from [google/calc-t5-large](https://huggingface.co/google/calc-t5-large) for TODO steps,
- aiming to maximise the exact-match ratio on a validation split of the questions from the [gsm8k dataset](https://huggingface.co/datasets/gsm8k).
- We fine-tune only TODO of the parameters, finding that this circumvents overfitting to the relatively small training dataset.
-
- The full training configuration can be identified from the [training script](https://github.com/emnlp2023/gadgets/blob/9185d1fc4b4812321179f8e5cad3e2f2a764f1df/examples/train_gsm8k_flan-t5-slice.py).
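To make the setup concrete, below is a minimal sketch of such teacher-forced seq2seq fine-tuning on one of the Calc datasets using Hugging Face `transformers`. The column names (`question`, `chain`) and all hyperparameters are illustrative assumptions, not the project's actual configuration; the training script referenced in the hunk above is the authoritative reference.

```python
# Illustrative sketch of teacher-forced seq2seq fine-tuning, NOT the
# project's exact configuration. Column names "question" and "chain" are
# assumptions; check the dataset card for the real schema.
from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForSeq2Seq,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments,
                          T5ForConditionalGeneration)

tokenizer = AutoTokenizer.from_pretrained("t5-large")
model = T5ForConditionalGeneration.from_pretrained("t5-large")

train = load_dataset("emnlp2023/Calc-gsm8k", split="train")

def preprocess(example):
    # The encoder sees the question; the decoder is trained to predict the
    # calculator-augmented reasoning chain token by token, conditioned on
    # the teacher-forced (gold) prefix.
    enc = tokenizer(example["question"], truncation=True, max_length=512)
    enc["labels"] = tokenizer(text_target=example["chain"],
                              truncation=True, max_length=512)["input_ids"]
    return enc

train = train.map(preprocess, remove_columns=train.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="checkpoints",
                                  per_device_train_batch_size=4,
                                  learning_rate=5e-5,
                                  max_steps=10_000,  # illustrative
                                  logging_steps=100),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

With `DataCollatorForSeq2Seq` and a T5-class model, the decoder input is the gold chain shifted right, which is exactly the teacher-forced prefix described above.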
 
146