See our paper at https://huggingface.co/papers/2405.19332.
Shenao Zhang
ZhangShenao
Recent Activity
- Updated a dataset 6 days ago: ZhangShenao/bt-math_gsm-gemma-1.1-7b-it-iter_sample_7500_temp_1.0_gen_1
- Updated a model 11 days ago: ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_qk_ep_10
Collections (3)
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-3 • Text Generation • Updated • 59 • 5
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-2 • Text Generation • Updated • 19
- ZhangShenao/SELM-Llama-3-8B-Instruct-iter-1 • Text Generation • Updated • 14
- Self-Exploring Language Models: Active Preference Elicitation for Online Alignment • Paper • 2405.19332 • Published • 22
Models (394)
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_qk_ep_10 • Updated • 11
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_noneep_10 • Updated • 10
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_vo_ep_10 • Updated • 10
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_reverse_vo • Updated • 12
- ZhangShenao/math_gsm-gemma-1.1-7b-it-msft-sample_7473_tp_unfreeze_reverse_vo • Updated • 16
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_qk • Updated • 9
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_vo • Updated • 10
- ZhangShenao/math_gsm-Mistral-7B-Instruct-v0.2-msft-sample_7473_tp_unfreeze_none • Text Generation • Updated • 16
- ZhangShenao/math_gsm-gemma-1.1-7b-it-msft-sample_7473_tp_unfreeze_none • Text Generation • Updated • 9
- ZhangShenao/math_gsm-gemma-1.1-7b-it-msft-sample_7473_tp_unfreeze_vo • Text Generation • Updated • 7
Datasets (236)
- ZhangShenao/bt-math_gsm-gemma-1.1-7b-it-iter_sample_7500_temp_1.0_gen_1 • Viewer • Updated • 3.74k • 67
- ZhangShenao/msft-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7473_tp • Viewer • Updated • 7.47k • 61
- ZhangShenao/msft-math_gsm-gemma-1.1-7b-it-iter_sample_7473_tp • Viewer • Updated • 7.47k • 75
- ZhangShenao/bt_norej-math_gsm-gemma-1.1-7b-it-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • Viewer • Updated • 7.47k • 60
- ZhangShenao/bt_norej-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • Viewer • Updated • 7.47k • 51
- ZhangShenao/bt_norej_full-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • Viewer • Updated • 7.47k • 47
- ZhangShenao/bigmath_chat • Viewer • Updated • 251k • 2
- ZhangShenao/bt-math_gsm-Meta-Llama-3-8B-Instruct-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • Viewer • Updated • 5.43k • 85
- ZhangShenao/bt-math_gsm-Mistral-7B-Instruct-v0.2-iter_sample_7500_temp_1.0_gen_1_mlr5e-5 • Viewer • Updated • 3.05k • 75
- ZhangShenao/bt_sft-math_gsm-gemma-1.1-7b-it-iter_sample_7500_tp • Viewer • Updated • 7.47k • 84