Update README.md
README.md
CHANGED
@@ -46,6 +46,12 @@ model = AutoModelForCausalLM.from_pretrained("sambanovasystems/SambaLingo-Turkis
 ## Evaluation Results
 
 ## Training Details
+The alignment phase follows the recipe for [Zephyr-7B](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta) and comprises two stages: supervised fine-tuning (SFT) and Direct Preference Optimization (DPO).
+
+The SFT phase was done on the ultrachat_200k dataset mixed with a Google-translated version of the ultrachat_200k dataset. It was trained for one epoch with a global batch size of 512 and a maximum sequence length of 2048 tokens. We used a linear decay learning rate of 2e-5 with 10% warmup.
+
+The DPO phase was done on the ultrafeedback and cai-conversation-harmless datasets, mixed with 10% of the data Google-translated. It was trained for three epochs with a global batch size of 32. We used a linear decay learning rate of 5e-7, 10% warmup, and β=0.1 as the regularization factor for DPO.
+
 
 ## Tokenizer Details
 We extended the vocabulary of the base llama model from 32,000 tokens to 57,000 tokens by adding up to 25,000 non-overlapping tokens from the new language.
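As a reference for the SFT hyperparameters above, here is a minimal sketch of the matching Hugging Face `TrainingArguments`. The output path and the per-device/accumulation split are hypothetical (they only need to multiply out to the stated global batch size of 512), and the 2048-token max sequence length is enforced by the data pipeline rather than by `TrainingArguments`.

```python
from transformers import TrainingArguments

# SFT hyperparameters from the Training Details above. The split of
# 8 GPUs x 8 per device x 8 accumulation steps is one hypothetical way
# to reach the stated global batch size of 512.
sft_args = TrainingArguments(
    output_dir="sft-checkpoints",    # hypothetical path
    num_train_epochs=1,              # one epoch
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,              # peak learning rate
    lr_scheduler_type="linear",      # linear decay
    warmup_ratio=0.1,                # 10% warmup
)
```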
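Likewise, a minimal sketch of the DPO stage, assuming TRL's `DPOConfig`/`DPOTrainer` (the card does not name a training framework, and TRL argument names have shifted across releases, e.g. newer versions accept `processing_class` in place of `tokenizer`). The paths, batch split, and the `model`/`preference_dataset`/`tokenizer` variables are hypothetical placeholders.

```python
from trl import DPOConfig, DPOTrainer

# DPO hyperparameters from the Training Details above.
dpo_args = DPOConfig(
    output_dir="dpo-checkpoints",    # hypothetical path
    num_train_epochs=3,              # three epochs
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # 4 x 8 = global batch size 32 on one GPU
    learning_rate=5e-7,              # peak learning rate
    lr_scheduler_type="linear",      # linear decay
    warmup_ratio=0.1,                # 10% warmup
    beta=0.1,                        # DPO regularization factor
)

trainer = DPOTrainer(
    model=model,                       # the SFT checkpoint
    args=dpo_args,
    train_dataset=preference_dataset,  # prompt/chosen/rejected pairs
    tokenizer=tokenizer,
)
trainer.train()
```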
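The vocabulary extension under Tokenizer Details follows the standard transformers pattern of adding tokens and resizing the embeddings. A minimal sketch, assuming Llama-2-7b as the base checkpoint and a hypothetical `new_language_tokens` list mined from a corpus in the new language:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# add_tokens() skips strings already present in the vocabulary, so only
# non-overlapping tokens from the new language are actually added.
num_added = tokenizer.add_tokens(new_language_tokens)

# Grow the embedding matrices to cover the extended vocabulary.
model.resize_token_embeddings(len(tokenizer))
```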