zolicsaki committed · verified
Commit 3029cd4 · 1 Parent(s): 8a89b81

Update README.md

Files changed (1):
  1. README.md +6 -0
README.md CHANGED
@@ -46,6 +46,12 @@ model = AutoModelForCausalLM.from_pretrained("sambanovasystems/SambaLingo-Turkis
  ## Evaluation Results

  ## Training Details
+ The alignment phase follows the recipe for [Zephyr-7B](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta), and comprises two stages: supervised fine-tuning (SFT) and Direct Preference Optimization (DPO).
+
+ The SFT phase was done on the ultrachat_200k dataset mixed with a Google-translated version of the same dataset. The model was trained for one epoch with a global batch size of 512 and a maximum sequence length of 2048 tokens, using a linearly decaying learning rate of 2e-5 with 10% warmup.
+
+ The DPO phase was done on the ultrafeedback dataset and the cai-conversation-harmless dataset, with 10% of the data Google-translated. The model was trained for three epochs with a global batch size of 32, using a linearly decaying learning rate of 5e-7, 10% warmup, and β=0.1 as the regularization factor for DPO.
+

  ## Tokenizer Details
  We extended the vocabulary of the base Llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
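
For concreteness, here is a minimal sketch of the SFT stage described in the added lines, assuming TRL's `SFTTrainer` (the card does not name a training framework) and a placeholder id for the Google-translated dataset; note that `SFTTrainer` argument names have shifted across TRL releases:

```python
# Hypothetical sketch of the SFT stage; the training framework and the
# translated-dataset id are assumptions, the hyperparameters come from the card.
from datasets import load_dataset, concatenate_datasets
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# English ultrachat_200k mixed with its (placeholder) Google-translated copy.
en = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
tr = load_dataset("your-org/ultrachat_200k-tr", split="train")  # hypothetical id
mixed = concatenate_datasets([en, tr]).shuffle(seed=42)

def to_text(example):
    # Flatten each chat into one training string (simplified formatting;
    # a real run would apply the model's chat template instead).
    return {"text": "\n".join(f"{m['role']}: {m['content']}" for m in example["messages"])}

mixed = mixed.map(to_text)

args = TrainingArguments(
    output_dir="sft-out",
    num_train_epochs=1,             # one epoch, per the card
    per_device_train_batch_size=8,  # 8 * 64 accumulation = 512 global on one device
    gradient_accumulation_steps=64,
    learning_rate=2e-5,             # per the card
    lr_scheduler_type="linear",     # linear decay
    warmup_ratio=0.10,              # 10% warmup
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=mixed,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=2048,            # per the card
)
trainer.train()
```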
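A similar hedged sketch of the DPO stage, assuming TRL's `DPOTrainer`; the dataset ids, split names, and the mapping into the `prompt`/`chosen`/`rejected` string columns are assumptions, while β=0.1, the 5e-7 linear-decay learning rate, 10% warmup, global batch size 32, and three epochs come from the added lines:

```python
# Hypothetical sketch of the DPO stage; TRL's DPOTrainer and the preference
# dataset details are assumptions.
from datasets import load_dataset, concatenate_datasets
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

sft_ckpt = "sft-out"  # checkpoint produced by the SFT stage above
tokenizer = AutoTokenizer.from_pretrained(sft_ckpt)
model = AutoModelForCausalLM.from_pretrained(sft_ckpt)
ref_model = AutoModelForCausalLM.from_pretrained(sft_ckpt)  # frozen reference policy

# ultrafeedback + cai-conversation-harmless; split names are assumptions.
ultra = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")
cai = load_dataset("HuggingFaceH4/cai-conversation-harmless", split="train")

def to_pairs(ex):
    # Reduce each example to the string columns DPOTrainer expects; the
    # upstream schemas differ, so this mapping is illustrative only.
    return {
        "prompt": ex["prompt"],
        "chosen": ex["chosen"][-1]["content"],
        "rejected": ex["rejected"][-1]["content"],
    }

ultra = ultra.map(to_pairs, remove_columns=ultra.column_names)
cai = cai.map(to_pairs, remove_columns=cai.column_names)
prefs = concatenate_datasets([ultra, cai]).shuffle(seed=42)

args = TrainingArguments(
    output_dir="dpo-out",
    num_train_epochs=3,             # three epochs, per the card
    per_device_train_batch_size=4,  # 4 * 8 accumulation = 32 global on one device
    gradient_accumulation_steps=8,
    learning_rate=5e-7,             # per the card
    lr_scheduler_type="linear",     # linear decay
    warmup_ratio=0.10,              # 10% warmup
)

trainer = DPOTrainer(
    model=model,
    ref_model=ref_model,
    args=args,
    beta=0.1,                       # DPO regularization factor, per the card
    train_dataset=prefs,
    tokenizer=tokenizer,
)
trainer.train()
```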
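The tokenizer note says the base vocabulary grew from 32,000 to 57,000 tokens. Below is a minimal sketch of one way to add non-overlapping tokens and resize the embeddings with `transformers`; how the new Turkish tokens were actually selected is not specified, so `new_turkish_tokens` is a placeholder:

```python
# Sketch of vocabulary extension; the token selection method is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Placeholder: tokens learned on Turkish text, e.g. from a new SentencePiece model.
new_turkish_tokens = ["merhaba", "teşekkür", "▁değil"]  # illustrative only

# Keep only tokens that do not overlap with the existing 32,000-entry vocab.
existing = set(tokenizer.get_vocab())
to_add = [t for t in new_turkish_tokens if t not in existing]
num_added = tokenizer.add_tokens(to_add)

# Grow the embedding (and tied LM head) matrices to match the new vocab size.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens; vocab size is now {len(tokenizer)}")
```

Resizing the embeddings is required because the pretrained weight matrices only cover the original 32,000 rows; the new rows are freshly initialized and then learned during continued pretraining or fine-tuning.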