reaperdoesntknow committed
Commit 4b7693b · verified · 1 Parent(s): 6411414

Update README.md

Files changed (1):
  1. README.md +11 -9
README.md CHANGED
@@ -1,6 +1,6 @@
-
+---
 library_name: transformers
-model_name: SmolLM2_Thinks
+model_name: SmolLM2Prover
 tags:
 - text-generation
 - proof
@@ -22,7 +22,7 @@ datasets:
 base_model:
 - prithivMLmods/SmolLM2-CoT-360M
 pipeline_tag: text-generation
-
+---
 
 # Model Card for SmolLM2Prover
 
@@ -92,21 +92,23 @@ tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_gene
 
 outputs = model.generate(tokenized_chat, max_new_tokens=512)
 decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)
-
 # Print only the generated part
 print(decoded_output.split("assistant\n")[-1])
+```
+### Training
 
-Training Procedure
 The model underwent several rounds of Supervised Fine-Tuning (SFT) using TRL's SFTTrainer.
 * Training Data: The primary dataset used was AI-MO/NuminaMath-1.5, augmented with approximately 1 million additional tokens. This data was formatted with a specific prompt structure designed to elicit step-by-step, chain-of-thought reasoning from the model.
 * Process: The iterative SFT approach allowed for progressive refinement of the model's reasoning capabilities.
-Framework Versions
+
+## Framework Versions
 * Transformers: 4.56.0
 * Pytorch: 2.8.0+cu126
 * TRL: 0.22.2
 * Datasets: 4.0.0
 * Tokenizers: 0.22.0
-Intended Use
+
+### Intended Use
 This model is a versatile tool suitable for a range of applications, from everyday conversation to complex problem-solving.
 * Primary Use Cases (Specialized Skills):
 * Educational tools for higher-level mathematics and logic.
@@ -121,9 +123,9 @@ Limitations and Bias
 * Mathematical Accuracy: While highly capable, the model can still make errors or "hallucinate" incorrect steps or solutions in complex mathematical proofs. All outputs, especially for critical applications, should be verified by a human expert.
 * Domain Performance: The model's performance is most reliable on problems similar to its training data. While it is designed to handle higher levels of math and deep thinking, its accuracy in novel or esoteric domains should be carefully evaluated.
 * Inherited Bias: This model inherits any biases present in the base model (SmolLM2-CoT-360M) and the training datasets.
-Acknowledgements
+### Acknowledgements
 You're doing great!
-Citations
+## Citations
 If you use TRL in your work, please cite the library:
 @misc{vonwerra2022trl,
 title = {{TRL: Transformer Reinforcement Learning}},
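
The diff shows only the tail of the README's usage example (the hunk context cuts off at `apply_chat_template(...)`). For readers of this commit, a minimal self-contained sketch of the flow that snippet implies is below; the repo id `reaperdoesntknow/SmolLM2Prover` and the example prompt are illustrative assumptions inferred from the committer name and the new `model_name`, not confirmed by the diff.

```python
# Sketch of the full usage flow implied by the README snippet.
# ASSUMPTION: "reaperdoesntknow/SmolLM2Prover" is an inferred repo id;
# substitute the actual checkpoint id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "reaperdoesntknow/SmolLM2Prover"  # hypothetical
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."}
]

# Build the prompt with the model's chat template and return input ids.
tokenized_chat = tokenizer.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(tokenized_chat, max_new_tokens=512)
decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Print only the generated part, as in the README.
print(decoded_output.split("assistant\n")[-1])
```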
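
The training section names TRL's SFTTrainer and AI-MO/NuminaMath-1.5 but records no configuration. As a rough illustration under the stated framework versions (TRL 0.22.2), one SFT round could look like the sketch below; the column names, prompt template, and output directory are assumptions, not the author's actual setup.

```python
# Illustrative sketch of one SFT round with TRL's SFTTrainer.
# ASSUMPTIONS: the "problem"/"solution" column names and the chain-of-thought
# prompt template are guesses; the commit specifies neither.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("AI-MO/NuminaMath-1.5", split="train")

def to_text(example):
    # Hypothetical prompt structure "designed to elicit step-by-step,
    # chain-of-thought reasoning", per the README's description.
    return {
        "text": (
            f"user\n{example['problem']}\n"
            f"assistant\nLet's think step by step.\n{example['solution']}"
        )
    }

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model="prithivMLmods/SmolLM2-CoT-360M",  # the stated base model
    args=SFTConfig(output_dir="smollm2-prover-sft"),  # hypothetical dir
    train_dataset=dataset,  # SFTTrainer reads the default "text" column
)
trainer.train()
```

Iterating this round on progressively refined data, as the README describes, would amount to repeating the loop from the previous round's checkpoint.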