CodeIsAbstract commited on
Commit
bf7d133
·
verified ·
1 Parent(s): 7dee348

Update README.md

Files changed (1)
  1. README.md +238 -3
README.md CHANGED
@@ -1,3 +1,238 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ datasets:
4
+ - CodeIsAbstract/reasoning_dataset
5
+ language:
6
+ - en
7
+ metrics:
8
+ - accuracy
9
+ - f1
10
+ - recall
11
+ base_model:
12
+ - answerdotai/ModernBERT-base
13
+ pipeline_tag: text-classification
14
+ ---
15
+
16
+ # Model Card for ReasoningTextClassifier
17
+
18
+ This model is a text classification model that identifies whether a given text expresses reasoning or not. It classifies text into two categories: "reasoning" (label 1) and "non-reasoning" (label 0).
19
+
20
+
21
+ ## Model Details
22
+
23
+ ### Model Description
24
+
25
+ This model is designed to classify text based on the presence of reasoning. It has been trained on the @CodeIsAbstract/reasoning_dataset, a dataset specifically created for this task. The model is intended to distinguish between text that presents logical arguments, explanations, or justifications (reasoning) and text that does not (non-reasoning).
26
+
27
+ - **Developed by:** Samarth Pusalkar
+ - **Shared by:** CodeIsAbstract
+ - **Model type:** Transformer-based text classification model (BertForSequenceClassification)
+ - **Language(s) (NLP):** English
+ - **License:** MIT
+ - **Finetuned from model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
+
+ ### Model Sources
+
+ - **Repository:** [https://huggingface.co/CodeIsAbstract/ReasoningTextClassifier](https://huggingface.co/CodeIsAbstract/ReasoningTextClassifier)
41
+
42
+ NOTE: Calling this a "reasoning classification" model can be ambiguous. Users may find that it does not classify a math problem solved step by step as reasoning. Instead, the model is inclined to detect reasoning patterns in natural-language text, specifically the reasoning and thinking patterns of LLMs such as DeepSeek and Gemini, which reflects the bias of its training data.
44
+
45
+
46
+ ## Uses
47
+
48
+ ### Direct Use
49
+ The primary direct use of this model is to classify English text as either expressing reasoning (label 1) or not (label 0). Researchers, educators, and content analysts can use this model to automatically identify and categorize text based on the presence of reasoning.
50
+ This model can also be used to score an LLM's output as reasoning or non-reasoning, potentially allowing that LLM to learn to produce reasoning-like output.
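One way to use the classifier as a scorer is sketched below. This is a minimal sketch, not the author's pipeline: the helper name `reasoning_score` is hypothetical, and the label names `LABEL_1` and `LABEL_0` assume the default `transformers` naming, which this model card does not confirm. The classifier is passed in as a callable so the helper can be exercised with a stub; a real `pipeline("text-classification", ...)` object returns the same list-of-dicts shape.

```python
from typing import Callable, Dict, List

# Shape returned by a transformers text-classification pipeline:
# one {"label": str, "score": float} dict per input text.
Classifier = Callable[[List[str]], List[Dict[str, object]]]


def reasoning_score(classifier: Classifier, texts: List[str],
                    positive_label: str = "LABEL_1") -> List[float]:
    """Return, for each text, the probability that it is "reasoning".

    positive_label is an ASSUMPTION (default transformers label naming);
    check the model's config.id2label before relying on it.
    """
    scores = []
    for pred in classifier(texts):
        p = float(pred["score"])
        # The pipeline reports only the winning class, so flip the
        # probability when the winner is the non-reasoning class.
        scores.append(p if pred["label"] == positive_label else 1.0 - p)
    return scores


def stub_classifier(texts: List[str]) -> List[Dict[str, object]]:
    """Stand-in for the real pipeline, used only to exercise the helper."""
    return [
        {"label": "LABEL_1", "score": 0.9} if "plan" in t.lower()
        else {"label": "LABEL_0", "score": 0.8}
        for t in texts
    ]


scores = reasoning_score(stub_classifier, ["Here's a plan: ...", "x + 3 = 5"])
```

Returning a single probability per text, rather than a label, gives a continuous signal that could serve as a reward or filter threshold. Whether that signal is useful for training another model is not evaluated in this card.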
51
+
52
+
53
+ ### Out-of-Scope Use
54
+
55
+ This model is intended for classifying English text. Its performance on other languages is not guaranteed. Misuse and out-of-scope scenarios include:
56
+
57
+ - **High-stakes decision making:** The model's output should not be used as the sole basis for critical decisions, especially in contexts where incorrect reasoning detection could have significant negative consequences (e.g., legal or medical domains), without careful validation and human oversight.
+ - **Detecting specific types of reasoning:** The model is trained on a general dataset and may not be optimized for detecting specific types or nuances of reasoning (e.g., causal reasoning, deductive reasoning, etc.).
+ - **Bias amplification:** If the training dataset contains biases, the model may perpetuate or amplify these biases in its predictions. Users should be aware of potential biases in the model's output, especially when used on text from underrepresented groups or sensitive topics.
+ - **Content generation:** This model is designed for classification and not for generating text. It should not be used for generating text that is supposed to exhibit reasoning.
+ - **Use on non-textual data:** The model is specifically designed for text and should not be applied to other data types such as images or audio.
62
+
63
+ ## Bias, Risks, and Limitations
64
+
65
+ The model's performance is subject to several limitations:
66
+
67
+ - **Dataset bias:** The CodeIsAbstract/reasoning_dataset dataset's inherent biases may be reflected in the model. The dataset characteristics and potential biases should be further investigated in the dataset card.
+ - **Generalization:** The model's ability to generalize to text significantly different from the training data is not fully evaluated and may be limited.
+ - **Ambiguity of "reasoning":** The concept of "reasoning" can be subjective and context-dependent. The model's definition of reasoning is based on the dataset labels and may not align perfectly with all users' interpretations.
+ - **Technical limitations:** As a machine learning model, it is not guaranteed to be perfectly accurate. Errors in classification are possible, particularly with complex or nuanced text.
+ - **Evaluation limitations:** The evaluation metrics and testing data used to assess the model's performance may not fully capture its real-world effectiveness across all use cases. Further evaluation on diverse datasets and in different application scenarios is recommended.
73
+
74
+ ### Recommendations
75
+
76
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
77
+ - **Dataset understanding:** Users are strongly encouraged to examine the CodeIsAbstract/reasoning_dataset dataset card to understand the data the model was trained on and potential biases within the data.
+ - **Careful evaluation:** Before deploying the model in any application, especially in sensitive contexts, thorough evaluation on data relevant to the specific use case is essential. Consider evaluating on diverse datasets and with metrics beyond standard accuracy.
+ - **Human oversight:** In applications where accuracy is critical, human review of the model's predictions is highly recommended. The model should be used as a tool to augment, not replace, human judgment.
+ - **Bias mitigation:** Explore techniques for bias detection and mitigation if the model is to be used in contexts where fairness is a concern.
+ - **Transparency:** When using this model, clearly communicate its limitations and potential biases to end-users.
82
+
83
+
84
+ ## How to Get Started with the Model
85
+
86
+ Use the code below to get started with the model.
87
+
88
+ ```
89
+ from transformers import pipeline
90
+
91
+ classifier = pipeline("text-classification", model="CodeIsAbstract/ReasoningTextClassifier")  # Hugging Face Hub model ID of this repository
92
+ text_example = """The user wants a detailed reasoning for each step in solving the equation x+3=5. I need to break down the process into smaller steps and explain the logic behind each manipulation.
93
+ Here's a plan:
94
+
95
+ Start with the given equation: State the equation we are trying to solve.
96
+ Identify the goal: Explain that the goal is to isolate 'x' to find its value.
97
+ Recognize the operation with x: Identify that 3 is being added to x.
98
+ Determine the inverse operation: Explain that to isolate x, we need to perform the inverse operation of addition, which""" # Example text exhibiting reasoning
99
+ output = classifier(text_example)
100
+ print(output)
101
+
102
+ text_example_non_reasoning = """.., we need to isolate x on one side of the equation.
+ to help user evaluate the output
+
+ The given equation is:
+ x + 3 = 5
+
+ To isolate x, we need to remove the +3 from the left side of the equation.
+ We can do this by subtracting 3 from both sides of the equation to maintain the equality."""  # Example text without reasoning. This is an extreme case: the text looks like mathematical reasoning but lacks the thinking patterns of LLMs.
110
+ output_non_reasoning = classifier(text_example_non_reasoning)
111
+ print(output_non_reasoning)
112
+ ```
113
+
114
+
115
+
116
+ ## Training Details
117
+
118
+ ### Training Data
119
+
120
+ The model was trained on the train split of [CodeIsAbstract/reasoning_dataset](https://huggingface.co/datasets/CodeIsAbstract/reasoning_dataset). The training data is specifically designed to help classify text as reasoning (1) or non-reasoning (0), and is a derivative of the [Dolphin R1 dataset](https://huggingface.co/datasets/cognitivecomputations/dolphin-r1).
122
+
123
+ ### Training Procedure
124
+ Training used the default Hugging Face `Trainer` implementation with `BertForSequenceClassification`.
125
+
126
+
127
+ #### Training Hyperparameters
128
+
129
+ - **Training regime:** fp16 mixed precision
+ - **model_architecture:** ModernBERT-base
+ - **learning_rate:** 3e-5
+ - **per_device_train_batch_size:** 704
+ - **per_device_eval_batch_size:** 512
+ - **num_train_epochs:** 2
+ - **gradient_accumulation_steps:** 4
+ - **dataloader_num_workers:** 4
+ - **weight_decay:** 0.001
+ - **warmup_ratio:** 0.03
+ - **logging_steps:** 50
+ - **evaluation_strategy:** steps
+ - **eval_steps:** 100
+ - **save_strategy:** steps
+ - **save_steps:** 200
+ - **load_best_model_at_end:** True
+ - **metric_for_best_model:** eval_loss
+ - **gradient_checkpointing:** True
+ - **fp16:** True
+ - **torch.backends.cudnn.benchmark:** True (enables the cuDNN auto-tuner)
+ - **torch.backends.cuda.matmul.allow_tf32:** True (allows TF32 on Ampere)
+ - **torch.backends.cudnn.allow_tf32:** True
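The hyperparameters above imply an effective batch size and an approximate step count, which the following sketch works out. It is a rough estimate under stated assumptions: a single GPU (the card lists "L40S" without a device count), a train set of exactly 1,040,000 texts (the card only says "1.04M"), and simple ceiling rounding, which may differ from how `Trainer` rounds partial accumulation steps.

```python
import math

# Values copied from the hyperparameter list and the hardware notes.
per_device_train_batch_size = 704
gradient_accumulation_steps = 4
num_train_epochs = 2
warmup_ratio = 0.03
train_set_size = 1_040_000  # "1.04M" in the card; the exact count is not given
num_devices = 1             # ASSUMPTION: a single L40S GPU

# Samples consumed per optimizer update.
effective_batch = (per_device_train_batch_size
                   * gradient_accumulation_steps
                   * num_devices)
steps_per_epoch = math.ceil(train_set_size / effective_batch)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions each optimizer update sees 2,816 samples and the whole run is only a few hundred updates, so `eval_steps` of 100 and `save_steps` of 200 would yield roughly seven evaluations and three checkpoints.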
151
+
152
+
153
+ ## Evaluation
154
+
155
+ ### Testing Data, Factors & Metrics
156
+
157
+ #### Testing Data
158
+
159
+ The test set comes from the same distribution as the train set: it is the test split of [CodeIsAbstract/reasoning_dataset](https://huggingface.co/datasets/CodeIsAbstract/reasoning_dataset). It contains about 165k samples.
161
+
162
+ #### Metrics
163
+
164
+ Evaluated on the test split of [CodeIsAbstract/reasoning_dataset](https://huggingface.co/datasets/CodeIsAbstract/reasoning_dataset):
165
+
166
+ - **eval_loss:** 0.003581336699426174
+ - **eval_model_preparation_time:** 0.0048
+ - **eval_accuracy:** 0.9991756576554733
+ - **eval_precision:** 0.9991760105961167
+ - **eval_recall:** 0.9991756576554733
+ - **eval_f1:** 0.9991756643183358
+ - **eval_runtime:** 447.9271
+ - **eval_samples_per_second:** 368.319
+ - **eval_steps_per_second:** 0.721
175
+
176
+ ### Results
177
+
178
+ The model classifies the test set samples with near 100% accuracy (eval_accuracy of about 0.9992).
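As a sanity check, the reported throughput and accuracy figures can be turned back into sample, step, and error counts. The sketch below uses only numbers from this card; because the throughput values are rounded, the recovered counts are estimates, and the assumption that evaluation ran on a single device with the listed eval batch size is not confirmed by the card.

```python
import math

# Values copied from the reported evaluation metrics.
eval_runtime = 447.9271            # seconds
eval_samples_per_second = 368.319
eval_steps_per_second = 0.721
eval_accuracy = 0.9991756576554733
per_device_eval_batch_size = 512   # ASSUMPTION: single-device evaluation

# Recover counts from rates multiplied by runtime.
n_samples = round(eval_runtime * eval_samples_per_second)
n_steps = round(eval_runtime * eval_steps_per_second)
expected_steps = math.ceil(n_samples / per_device_eval_batch_size)
n_errors = round((1 - eval_accuracy) * n_samples)

print(n_samples, n_steps, expected_steps, n_errors)
```

The recovered sample count is about 165k, matching the stated test set size, and the step count agrees with a batch size of 512. At this accuracy the number of misclassified test samples is on the order of a hundred or so.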
179
+
180
+
181
+ ## Environmental Impact
182
+
183
+ - **Hardware Type:** L40S
184
+ - **Hours used:** 6
185
+ - **Cloud Provider:** Lightning-AI
186
+ - **Compute Region:** N.A.
187
+ - **Carbon Emitted:** N.A.
188
+
189
+
190
+ ## Technical Specifications
191
+
192
+ ### Model Architecture and Objective
193
+
194
+ Derived from BERT; the classification head is `BertForSequenceClassification`.
195
+
196
+ ### Compute Infrastructure
197
+
198
+ #### Hardware
199
+
200
+ Trained on L40S hardware for 6 hours. Two epochs were completed over the entire train set of 1.04M texts.
201
+
202
+ #### Software
203
+
204
+ Trained with the Hugging Face `transformers` library, using `BertForSequenceClassification` with a cross-entropy loss.
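For readers unfamiliar with the loss, here is a minimal pure-Python sketch of cross-entropy for a two-class classifier. It illustrates the general formula only and is not the library's implementation; the logits are hypothetical values chosen for the example, not outputs of this model.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Cross-entropy of one example: -log(softmax(logits)[label])."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Hypothetical logits ordered as [non-reasoning, reasoning].
confident_right = cross_entropy([-4.0, 4.0], label=1)  # small loss
confident_wrong = cross_entropy([-4.0, 4.0], label=0)  # large loss
undecided = cross_entropy([0.0, 0.0], label=1)         # log(2)
```

The loss is near zero when the model is confident and correct, equals log(2) when it is undecided between the two classes, and grows roughly linearly with the logit gap when it is confident and wrong.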
205
+
206
+ ## Citation
207
+
208
+ **BibTeX:**
209
+ ```
210
+ @misc{pusalkar2025reasoningclf,
211
+ title={Sequence Classifier for classifying reasoning dataset},
212
+ author={Samarth Pusalkar},
213
+ year={2025}
214
+ }
215
+ ```
216
+
217
+ Taken from base model:
218
+ ```
219
+ @misc{modernbert,
220
+ title={Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference},
221
+ author={Benjamin Warner and Antoine Chaffin and Benjamin Clavié and Orion Weller and Oskar Hallström and Said Taghadouini and Alexis Gallagher and Raja Biswas and Faisal Ladhak and Tom Aarsen and Nathan Cooper and Griffin Adams and Jeremy Howard and Iacopo Poli},
222
+ year={2024},
223
+ eprint={2412.13663},
224
+ archivePrefix={arXiv},
225
+ primaryClass={cs.CL},
226
+ url={https://arxiv.org/abs/2412.13663},
227
+ }
228
+ ```
229
+
230
+
231
+ ## Model Card Authors
232
+
233
+ Samarth Pusalkar
234
+
235
+
236
+ ## Model Card Contact
237
+
238