Tags: PyTorch · mistral · Krutrim · language-model
krutrim-admin committed (verified) · Commit fb2ce88 · 1 Parent(s): 63bafd8

minor changes in the description

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -28,7 +28,7 @@ widget:
   ## Model Overview
   Krutrim-2 is a 12B parameter language model developed by the OLA Krutrim team. It is built on the Mistral-NeMo 12B architecture and trained across various domains, including web data, code, math, Indic languages, Indian context data, synthetic data, and books. Following pretraining, the model was finetuned for instruction following on diverse data covering a wide range of tasks, including knowledge recall, math, reasoning, coding, safety, and creative writing.
 
- After fine-tuning, the model underwent Direct Preference Optimization (DPO) to enhance alignment across multiple aspects. DPO was applied to improve response helpfulness, safety, and compliance, making the model more robust against harmful prompts, reducing biases, and improving factual consistency.
+ After fine-tuning, the model underwent Direct Preference Optimization (DPO) to enhance alignment across multiple aspects. DPO was applied to improve response helpfulness, safety, and reasoning.
 
   ## Key Features
   - 12B parameter dense transformer model leading to better generalization compared to Krutrim-1 7B;
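For context on the model this README describes, here is a minimal sketch of loading and querying it with the Hugging Face `transformers` library. The repo id `krutrim-ai-labs/Krutrim-2-instruct` is an assumption and should be checked against the published model card; the generation settings are illustrative, not the team's recommended configuration.

```python
# Minimal sketch: loading a 12B instruction-tuned model via transformers.
# NOTE: the repo id below is an assumption; verify it on the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "krutrim-ai-labs/Krutrim-2-instruct"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 12B parameters; bf16 roughly halves memory vs fp32
    device_map="auto",           # requires the `accelerate` package
)

prompt = "Explain Direct Preference Optimization in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```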