RLHFlow MATH Process Reward Model Collection This is a collection of datasets and models for process reward modeling. • 15 items • Updated Nov 9, 2024 • 10
Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 11 items • Updated 29 days ago • 71
LLM Reasoning Papers Collection Papers to improve reasoning capabilities of LLMs • 20 items • Updated 28 days ago • 115
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated 10 days ago • 50
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Paper • 2501.18512 • Published 12 days ago • 25
IndicBERT v2 Collection IndicBERT v2 is a multilingual BERT model pretrained on IndicCorp v2, an Indic monolingual corpus of 20.9 billion tokens, covering 24 constitutionally • 4 items • Updated Oct 15, 2024 • 3
IndicLLMSuite Collection Largest collection of pretraining and instruction fine-tuning datasets for 22 Indic languages. • 4 items • Updated Nov 5, 2024 • 15
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 11 items • Updated about 19 hours ago • 49