PyTorch · mistral · Krutrim · language-model
krutrim-admin committed · Commit fb081df · verified · 1 Parent(s): 08b869e

Updated model features

Files changed (1): README.md +7 -3
README.md CHANGED
@@ -24,10 +24,14 @@ Krutrim-2 is a 12B parameter language model developed by the OLA Krutrim team. I
  After fine-tuning, the model underwent Direct Preference Optimization (DPO) with 300K data points to enhance alignment across multiple aspects. DPO was applied to improve response helpfulness, safety, and compliance, making the model more robust against harmful prompts, reducing biases, and improving factual consistency.

  ## Key Features
- - Supports long context upto 128k tokens
+ - 12B parameter dense transformer model leading to better generalization compared to Krutrim-1 7B;
+ - Supports context up to 128K tokens making it suitable for long multi-turn conversations, long-form generations, document translations and others;
+ - Retains the original performance of MN-12B on most En benchmarks with x3.5 improvement on HumanEval coding task;
+ - Natively multilingual delivering best-in-class performance on Indic benchmarks;
+ - Matches or exceeds performance of models much larger (x6) on multilingual Indic generation tasks including creative writing, summarization, and translation;
+ - Stronger Indian cultural context relevance - scored the highest in manual evaluation with multiple models in an anonymised setting;
+ - Delivers top-3 performance on 5 (out of 7) tasks in BharatBench among much larger open source and commercial models.
  - Available in both pre-trained and instruction-tuned versions
- - Supports English and 22 scheduled Indian languages
- - Demonstrates robust knowledge of Indic culture and context, responding with an Indian-centric perspective unless specified otherwise

  ## Model Developer
  - OLA Krutrim Team
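The updated card above still describes the alignment step: after supervised fine-tuning, the model went through Direct Preference Optimization (DPO) on 300K preference data points. For readers unfamiliar with that step, here is a minimal, illustrative sketch of a DPO pass using the open-source `trl` library. This is not the Krutrim training code; the base model, dataset, and hyperparameters are placeholder assumptions, and it assumes a recent `trl` release in which `DPOTrainer` accepts a `DPOConfig` and a `processing_class`.

```python
# Minimal DPO sketch with trl (illustrative only, not the Krutrim training setup).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "gpt2"  # placeholder stand-in; the real run would start from the fine-tuned 12B checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

# DPO trains directly on preference triples: a prompt plus a preferred ("chosen")
# and a dispreferred ("rejected") response. The card's "300K data points" refers
# to examples of this kind; one toy triple is shown here.
preference_data = Dataset.from_dict({
    "prompt":   ["Explain DPO in one sentence."],
    "chosen":   ["DPO optimizes a model directly on human preference pairs, without training a separate reward model."],
    "rejected": ["DPO is a type of database index."],
})

training_args = DPOConfig(
    output_dir="dpo-sketch",
    beta=0.1,                      # strength of the implicit KL penalty toward the reference model
    per_device_train_batch_size=1,
    num_train_epochs=1,
    logging_steps=1,
)

# ref_model is omitted, so trl creates a frozen copy of `model` as the reference.
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=preference_data,
    processing_class=tokenizer,
)
trainer.train()
```

The helpfulness, safety, and bias improvements mentioned in the card come from the composition of the preference pairs rather than from the optimizer itself.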