Hugging Face
Zikang Shan
zkshan2002
0 followers · 1 following
AI & ML interests
Reinforcement Learning
Recent Activity
published a model 2 days ago: RTO-RL/Llama3-8B-TDPO
updated a model 2 days ago: RTO-RL/Llama3-8B-TDPO
published a model 2 days ago: RTO-RL/Llama3-8B-SimPO
zkshan2002's activity
published a model 2 days ago
RTO-RL/Llama3-8B-TDPO • Updated 2 days ago • 2 • 1
updated a model 2 days ago
RTO-RL/Llama3-8B-TDPO • Updated 2 days ago • 2 • 1
published a model 2 days ago
RTO-RL/Llama3-8B-SimPO • Updated 2 days ago • 2
updated a model 2 days ago
RTO-RL/Llama3-8B-SimPO • Updated 2 days ago • 2
published a model 2 days ago
RTO-RL/Llama3-8B-RDPO • Updated 2 days ago • 3 • 1
updated a model 2 days ago
RTO-RL/Llama3-8B-RDPO • Updated 2 days ago • 3 • 1
published a model 2 days ago
RTO-RL/Llama3-8B-PPO • Updated 2 days ago • 3 • 1
updated 5 models 2 days ago
RTO-RL/Llama3-8B-PPO • Updated 2 days ago • 3 • 1
RTO-RL/Llama3-8B-RTO • Updated 2 days ago • 2 • 1
RTO-RL/Llama3.2-1B-RewardModel • Updated 2 days ago • 81
RTO-RL/Llama3-8B-RewardModel • Updated 2 days ago • 80
RTO-RL/Llama3-8B-DPO • Updated 2 days ago • 31
published a model 2 days ago
RTO-RL/Llama3-8B-RTO • Updated 2 days ago • 2 • 1
published a dataset 20 days ago
zkshan2002/hh-rlhf_preprocessed • Viewer • Updated 20 days ago • 46.1k • 38
updated a dataset 20 days ago
zkshan2002/hh-rlhf_preprocessed • Viewer • Updated 20 days ago • 46.1k • 38
updated 2 models about 2 months ago
RTO-RL/Llama3-8B-RTO • Updated 2 days ago • 2 • 1
RTO-RL/Llama3-8B-RTO • Updated 2 days ago • 2 • 1