---
datasets:
- MU-NLPC/Calc-gsm8k
- MU-NLPC/Calc-aqua_rat
metrics:
- exact_match
- rouge
license: apache-2.0
language:
- en
---

# Model Card for calcformer-t5-xl

This model generates reasoning chains over mathematical questions while **using an external tool: Sympy calculator**.

## Model Description

With the idea to offload the symbolic computation from the stochastic language model,
we train this model to utilize a calculator **for all applicable numeric operations**.
This is achieved by training the model to construct calls to the tool's API in this format:
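For instance, a single calculator call and the tool's injected output look like this (illustrative, mirroring the sample generation shown later in this card):

```html
each part is therefore $2500/5 = $<gadget id="calculator">2500/5</gadget><output>500</output> 500
```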
where the `<gadget>` segment triggers a call of the tool,
which is subsequently served by extending the model's decoder input context with the tool's output inside the `<output>` segment.

- **Developed by:** Calcformer team
- **Model type:** Autoregressive Encoder-Decoder
- **Language(s):** en
- **Finetuned from:** t5-large

## Sources

- **Repository:** <https://github.com/prompteus/calc-x>
- **Paper:** <https://arxiv.org/abs/2305.15017>
- [**Calcformer model family on HF**](https://huggingface.co/collections/MU-NLPC/calcformers-65367392badc497807b3caf5)
- [**Calc-X dataset collection on HF**](https://huggingface.co/collections/MU-NLPC/calc-x-652fee9a6b838fd820055483)
45 |
|
|
|
|
|
46 |
|
47 |
## Usage
|
48 |
|
49 |
Additionally to conventional generation, using Tool-augmented generation requires
|
50 |
(1) implementation of the tool(s) and
|
51 |
+
(2) a customization of `generate()` method augmenting input context on-demand with the outputs of the tools.
|
|
|
|
|
52 |
|
53 |
+
You can find these two components implemented in the attached **gadgets/model.py** and **gadgets/gadget.py** in this model's repo
|
54 |
+
and the project's [home repo](https://github.com/prompteus/calc-x).
|
55 |
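The control flow of component (2) can be sketched independently of any model: decode until a closing `</gadget>` tag, run the tool, splice its output into the context, and resume. This is only a simplified sketch with a stubbed decoder, not the actual implementation, which lives in **gadgets/model.py**:

```python
import re

def run_tool_augmented(decode_step, tool, prompt, max_rounds=10):
    """Alternate between free decoding and tool calls.

    decode_step(context) is a stand-in for the model: it returns text ending
    either in '</gadget>' (a pending tool call) or in a final '<result>...</result>'.
    """
    context = prompt
    for _ in range(max_rounds):
        chunk = decode_step(context)
        context += chunk
        call = re.search(r'<gadget id="calculator">(.*?)</gadget>$', chunk)
        if call is None:
            return context  # no pending tool call: generation has finished
        # Serve the call and extend the decoder context with the tool output.
        context += f"<output>{tool(call.group(1))}</output>"
    return context

# Stubbed decoder: first emits a calculator call, then the final answer.
def fake_decode(context):
    if "<output>" not in context:
        return '<gadget id="calculator">2500/5</gadget>'
    return '<result>500</result>'

out = run_tool_augmented(fake_decode, tool=lambda expr: "500", prompt="Q: ")
```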

After adding these two scripts to your directory, you can use the model as follows:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

from gadgets.model import gadget_assisted_model
from gadgets.gadget import Calculator

GadgetAssistedT5 = gadget_assisted_model(T5ForConditionalGeneration)

model_name = "MU-NLPC/calcformer-t5-large"
model = GadgetAssistedT5.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)

model.prepare_for_generate(tokenizer,
                           enabled_gadgets=[Calculator()])

query = "..."  # your math word problem as plain text

inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs)
tokenizer.decode(output_ids[0], spaces_between_special_tokens=False)
```

This returns:

```html
According to the ratio, for every 5 parts that Johnson gets, Mike gets 2 parts Since Johnson got $2500,
each part is therefore $2500/5 = $<gadget id="calculator">2500/5</gadget><output>500</output> 500
After buying the shirt he will have $1000-$200 = $<gadget id="calculator">1000-200</gadget><output>800</output> 800
Final result is<result>800</result></s>
```

## Out-of-Scope Usage

Note that given the limited complexity of the exercises seen in training, this model will not work well on tasks requiring
more complex algebraic operations, including equations, variables, and operations beyond (+, -, *, /).

## Training

This model was trained on [Calc-X](https://huggingface.co/collections/MU-NLPC/calc-x-652fee9a6b838fd820055483), a collection of math-problem datasets that we converted into chains of thought with calculator interactions.
We used standard auto-regressive transformer training, i.e. conditional next-token prediction with a cross-entropy loss. For more detail on the data, training, or evaluation, see the [Calc-X and Calcformers paper](https://arxiv.org/abs/2305.15017).

## Cite
|