Add metadata, link to code
This PR adds relevant metadata to improve model discoverability. It also includes a concise model description and links to the paper and code.
README.md
CHANGED
@@ -1,3 +1,11 @@
----
-license: apache-2.0
-
+---
+license: apache-2.0
+pipeline_tag: question-answering
+library_name: transformers
+---
+
+This repository contains the model described in the paper [SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild](https://arxiv.org/abs/2503.18892).
+
+This model has been trained with a simple reinforcement learning (RL) recipe to improve reasoning abilities. Training starts from base models and uses rule-based rewards and the GSM8K/Math datasets. This approach has been successfully applied to diverse base models with limited data (8K examples), achieving significant accuracy gains ranging from 10 to more than 20 absolute points.
+
+Code: https://github.com/hkust-nlp/simpleRL-reason/tree/v1
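For reviewers who want to sanity-check the new metadata block, here is a minimal sketch of how the front matter could be read back out of the README. It assumes flat `key: value` pairs between two `---` lines, as in this diff. The helper name `parse_front_matter` and the shortened README body are hypothetical, and this is not how the Hub itself parses model cards.

```python
def parse_front_matter(readme_text):
    """Return the flat key: value pairs from a leading '---' delimited block.

    Illustrative helper only: handles simple scalar entries, not full YAML.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


# Metadata block as added by this PR; the body text is a placeholder.
README = """---
license: apache-2.0
pipeline_tag: question-answering
library_name: transformers
---

Model description goes here.
"""

metadata = parse_front_matter(README)
print(metadata)
```

A real model card can contain nested YAML (lists of tags, datasets, evaluation results), so a proper YAML parser would be the safer choice outside of a quick check like this.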