Update README.md
README.md
CHANGED
@@ -1,6 +1,6 @@
-
+---
 library_name: transformers
-model_name:
+model_name: SmolLM2Prover
 tags:
 - text-generation
 - proof
@@ -22,7 +22,7 @@ datasets:
 base_model:
 - prithivMLmods/SmolLM2-CoT-360M
 pipeline_tag: text-generation
-
+---
 
 # Model Card for SmolLM2Prover
 
@@ -92,21 +92,23 @@ tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_gene
 
 outputs = model.generate(tokenized_chat, max_new_tokens=512)
 decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
-
 # Print only the generated part
 print(decoded_output.split("assistant\n")[-1])
+```
+### Training
 
-Training Procedure
 The model underwent several rounds of Supervised Fine-Tuning (SFT) using TRL's SFTTrainer.
 * Training Data: The primary dataset used was AI-MO/NuminaMath-1.5, augmented with approximately 1 million additional tokens. This data was formatted with a specific prompt structure designed to elicit step-by-step, chain-of-thought reasoning from the model.
 * Process: The iterative SFT approach allowed for progressive refinement of the model's reasoning capabilities.
-
+
+## Framework Versions
 * Transformers: 4.56.0
 * Pytorch: 2.8.0+cu126
 * TRL: 0.22.2
 * Datasets: 4.0.0
 * Tokenizers: 0.22.0
-
+
+### Intended Use
 This model is a versatile tool suitable for a range of applications, from everyday conversation to complex problem-solving.
 * Primary Use Cases (Specialized Skills):
 * Educational tools for higher-level mathematics and logic.
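Note: the hunk above captures only the tail of the quickstart snippet; the model and tokenizer loading falls outside the diff context. For anyone trying it, here is a minimal self-contained sketch of the same pattern. The repo id is a placeholder, since this excerpt does not show the checkpoint's final path:

```python
# Minimal end-to-end version of the usage snippet in the hunk above.
# NOTE: "your-org/SmolLM2Prover" is a placeholder repo id, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/SmolLM2Prover"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."}
]
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(tokenized_chat, max_new_tokens=512)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Print only the generated part (everything after the last assistant marker)
print(decoded_output.split("assistant\n")[-1])
```

With `skip_special_tokens=True` the chat-template control tokens are stripped but the literal role name remains in the decoded text, which is why splitting on "assistant\n" recovers just the reply.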
@@ -121,9 +123,9 @@ Limitations and Bias
 * Mathematical Accuracy: While highly capable, the model can still make errors or "hallucinate" incorrect steps or solutions in complex mathematical proofs. All outputs, especially for critical applications, should be verified by a human expert.
 * Domain Performance: The model's performance is most reliable on problems similar to its training data. While it is designed to handle higher levels of math and deep thinking, its accuracy in novel or esoteric domains should be carefully evaluated.
 * Inherited Bias: This model inherits any biases present in the base model (SmolLM2-CoT-360M) and the training datasets.
-Acknowledgements
+### Acknowledgements
 You're doing great!
-Citations
+## Citations
 If you use TRL in your work, please cite the library:
 @misc{vonwerra2022trl,
 title = {{TRL: Transformer Reinforcement Learning}},
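The Training section in the third hunk names TRL's SFTTrainer and AI-MO/NuminaMath-1.5 but publishes no hyperparameters or prompt template. Purely as an illustration of that setup, here is a minimal sketch under assumed settings; the prompt template, column names, and every numeric value below are assumptions, not the card's actual configuration:

```python
# Illustrative sketch only: the model card does not publish the actual SFT
# configuration, so hyperparameters and the prompt template are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

def to_text(example):
    # Assumed CoT-style template and assumed "problem"/"solution" columns.
    return {"text": f"Problem: {example['problem']}\nSolution: {example['solution']}"}

dataset = load_dataset("AI-MO/NuminaMath-1.5", split="train").map(to_text)

config = SFTConfig(
    output_dir="smollm2-prover-sft",  # hypothetical path
    per_device_train_batch_size=4,    # assumed
    num_train_epochs=1,               # assumed; the card describes several SFT rounds
    learning_rate=2e-5,               # assumed
)

trainer = SFTTrainer(
    model="prithivMLmods/SmolLM2-CoT-360M",  # the stated base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

The card describes "several rounds" of SFT, which would correspond to repeating a run like this, reloading each round from the previous checkpoint.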