Rui Yang

Ray2333

https://yangrui2015.github.io

YangRui2015

AI & ML interests

Deep Reinforcement Learning

Ray2333's activity

upvoted a paper about 12 hours ago

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published about 22 hours ago • 17

commented on a paper about 12 hours ago

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents

Paper • 2502.09560 • Published about 22 hours ago • 17

upvoted a paper 2 days ago

Teaching Language Models to Critique via Reinforcement Learning

Paper • 2502.03492 • Published 10 days ago • 21

updated 12 models 9 days ago

Ray2333/Gemma-2B-rewardmodel-baseline

Text Classification • Updated 9 days ago • 769

Ray2333/GRM-llama3-8B-distill

Text Classification • Updated 9 days ago • 115 • 6

Ray2333/reward-model-Mistral-7B-instruct-Unified-Feedback

Text Classification • Updated 9 days ago • 1.78k • 11

Ray2333/GRM-Gemma-2B-rewardmodel-ft

Updated 9 days ago • 222 • 1

Ray2333/Gemma-2B-rewardmodel-ft

Updated 9 days ago • 15 • 1

Ray2333/GRM-llama3.2-3B-sftreg

Text Classification • Updated 9 days ago • 57 • 1

Ray2333/GRM-Gemma-2B-sftreg

Text Classification • Updated 9 days ago • 452 • 3

Ray2333/GRM-llama3-8B-sftreg

Text Classification • Updated 9 days ago • 133 • 5

Ray2333/GRM-Gemma2-2B-sftreg

Text Classification • Updated 9 days ago • 24 • 1

Ray2333/GRM-gemma2-2B-rewardmodel-ft

Text Classification • Updated 9 days ago • 2.76k • 6

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated 9 days ago • 66 • 3

Ray2333/GRM-Llama3-8B-rewardmodel-ft

Updated 9 days ago • 1.05k • 1

New activity in Ray2333/GRM-Llama3.2-3B-rewardmodel-ft 9 days ago

Update default tokenization behavior to "longest" in README

#2 opened 9 days ago by

liked a model 2 months ago

Ray2333/GRM_Llama3.1_8B_rewardmodel-ft

Text Classification • Updated 9 days ago • 66 • 3

updated a model 3 months ago

Ray2333/GRM-Llama3.2-3B-rewardmodel-ft

Text Classification • Updated 9 days ago • 5.91k • 7

updated a collection 3 months ago

GRM

Generalizable Reward Models • 11 items • Updated Nov 25, 2024 • 4

upvoted a collection 3 months ago

Papers - Math - Reasoning

11 items • Updated Nov 10, 2024 • 1