SURESHBEEKHANI/llama_3_2_3B-dpo-rlhf-fine-tuning
Question Answering
GGUF
Intel/orca_dpo_pairs
English
llama
Inference Endpoints
conversational
License: mit
main
llama_3_2_3B-dpo-rlhf-fine-tuning
Commit History
Update README.md
5e90b32
verified
SURESHBEEKHANI
committed on
25 days ago
Update README.md
6b1c651
verified
SURESHBEEKHANI
committed on
25 days ago
Update README.md
d5edca4
verified
SURESHBEEKHANI
committed on
25 days ago
Update README.md
e27bdd2
verified
SURESHBEEKHANI
committed on
25 days ago
Update README.md
4476b03
verified
SURESHBEEKHANI
committed on
25 days ago
Create README.md
129a994
verified
SURESHBEEKHANI
committed on
25 days ago
(Trained with Unsloth)
4792ef2
verified
SURESHBEEKHANI
committed on
25 days ago
(Trained with Unsloth)
1fb8a92
verified
SURESHBEEKHANI
committed on
25 days ago
initial commit
d6d353d
verified
SURESHBEEKHANI
committed on
26 days ago