SURESHBEEKHANI/llama_3_2_3B-dpo-rlhf-fine-tuning
Tags: Question Answering, GGUF, Intel/orca_dpo_pairs, English, llama, Inference Endpoints, conversational
License: mit
Files and versions
Branch: main
1 contributor
History: 9 commits
Latest commit: Update README.md (5e90b32, verified) by SURESHBEEKHANI, 22 days ago
File                 Scan  Size           Last commit             Updated
.gitattributes       Safe  1.58 kB        (Trained with Unsloth)  22 days ago
README.md            Safe  4.63 kB        Update README.md        22 days ago
config.json          Safe  29 Bytes       (Trained with Unsloth)  22 days ago
unsloth.Q4_K_M.gguf  Safe  2.02 GB (LFS)  (Trained with Unsloth)  22 days ago