Minimal-RL
Qwen-Math-7B-base.
If you find this work useful, please consider citing:
@article{Xiong2025AMA,
title={A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce},
author={Wei Xiong and Jiarui Yao and Yuhui Xu and Bo Pang and Lei Wang and Doyen Sahoo and Junnan Li and Nan Jiang and Tong Zhang and Caiming Xiong and Hanze Dong},
journal={arXiv preprint arXiv:2504.11343},
year={2025},
}