GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_1e4_bs_128_1722279786 Text Generation • 8B • Updated Jul 30, 2024 • 5
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_1e3_bs_128_1722275374 Text Generation • 8B • Updated Jul 29, 2024 • 5
GitBag/rebel_multiturn_chat_pairx_continue_1400_batch_size_32_kl_0_lr_3e-7_1722191156 Updated Jul 29, 2024
GitBag/simpo_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_beta_10_gb_0.3_bs_128_1722052465 Text Generation • 8B • Updated Jul 27, 2024 • 5
GitBag/dpo_ultrafeedback_armo_OneBatch_newprob_full_lr_3e-7_beta_0.01_bs_128_1722056874 Text Generation • 8B • Updated Jul 27, 2024 • 5
GitBag/rebel_multiturn_chat_pairx_continue_1000_batch_size_32_kl_0_lr_3e-7_1722030780 Updated Jul 27, 2024
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_5e-7_eta_1_bs_64_1721891508 Text Generation • 8B • Updated Jul 26, 2024 • 7
GitBag/rebel_ultrafeedback_armo_OneBatch_newprob_full_lr_1e-6_eta_10_bs_128_1721872264 Text Generation • 8B • Updated Jul 26, 2024 • 5
GitBag/rebel_multiturn_chat_hybrid_gen_pairx_wm_0.1_batch_size_32_kl_0_lr_3e-7_1721860439 Updated Jul 26, 2024
GitBag/rebel_multiturn_chat_hybrid_pairx_wm_0.1_eta_0.3_batch_size_32_kl_0_lr_3e-7_1721862031 Updated Jul 25, 2024
GitBag/rebel_multiturn_chat_hybrid_pairx_wm_0.1_eta_0.1_batch_size_32_kl_0_lr_3e-7_1721862112 Updated Jul 25, 2024
GitBag/rebel_multiturn_chat_hybrid_gen_pairx_batch_size_32_kl_0_lr_3e-7_1721671331 Updated Jul 24, 2024
GitBag/rebel_multiturn_chat_hybrid_gen_p_1_pairx_batch_size_32_kl_0_lr_3e-7_1721670552 Updated Jul 24, 2024