PyTorch · mistral · Krutrim · language-model
krutrim-admin committed · Commit fb081df · verified · 1 Parent(s): 08b869e

Updated model features

Files changed (1): README.md +7 -3
README.md CHANGED
@@ -24,10 +24,14 @@ Krutrim-2 is a 12B parameter language model developed by the OLA Krutrim team. I
  After fine-tuning, the model underwent Direct Preference Optimization (DPO) with 300K data points to enhance alignment across multiple aspects. DPO was applied to improve response helpfulness, safety, and compliance, making the model more robust against harmful prompts, reducing biases, and improving factual consistency.

  ## Key Features
- - Supports long context upto 128k tokens
+ - 12B parameter dense transformer model leading to better generalization compared to Krutrim-1 7B;
+ - Supports context up to 128K tokens making it suitable for long multi-turn conversations, long-form generations, document translations and others;
+ - Retains the original performance of MN-12B on most En benchmarks with x3.5 improvement on HumanEval coding task;
+ - Natively multilingual delivering best-in-class performance on Indic benchmarks;
+ - Matches or exceeds performance of models much larger (x6) on multilingual Indic generation tasks including creative writing, summarization, and translation;
+ - Stronger Indian cultural context relevance - scored the highest in manual evaluation with multiple models in an anonymised setting;
+ - Delivers top-3 performance on 5 (out of 7) tasks in BharatBench among much larger open source and commercial models.
  - Available in both pre-trained and instruction-tuned versions
- - Supports English and 22 scheduled Indian languages
- - Demonstrates robust knowledge of Indic culture and context, responding with an Indian-centric perspective unless specified otherwise

  ## Model Developer
  - OLA Krutrim Team
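The updated card above still describes the alignment step: after supervised fine-tuning, the model went through Direct Preference Optimization (DPO) on 300K preference data points. For readers unfamiliar with that step, here is a minimal, illustrative sketch of a DPO pass using the open-source `trl` library. This is not the Krutrim training code; the base model, dataset, and hyperparameters are placeholder assumptions, and it assumes a recent `trl` release in which `DPOTrainer` accepts a `DPOConfig` and a `processing_class`.

```python
# Minimal DPO sketch with trl (illustrative only, not the Krutrim training setup).
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "gpt2"  # placeholder stand-in; the real run would start from the fine-tuned 12B checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

# DPO trains directly on preference triples: a prompt plus a preferred ("chosen")
# and a dispreferred ("rejected") response. The card's "300K data points" refers
# to examples of this kind; one toy triple is shown here.
preference_data = Dataset.from_dict({
    "prompt":   ["Explain DPO in one sentence."],
    "chosen":   ["DPO optimizes a model directly on human preference pairs, without training a separate reward model."],
    "rejected": ["DPO is a type of database index."],
})

training_args = DPOConfig(
    output_dir="dpo-sketch",
    beta=0.1,                      # strength of the implicit KL penalty toward the reference model
    per_device_train_batch_size=1,
    num_train_epochs=1,
    logging_steps=1,
)

# ref_model is omitted, so trl creates a frozen copy of `model` as the reference.
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=preference_data,
    processing_class=tokenizer,
)
trainer.train()
```

The helpfulness, safety, and bias improvements mentioned in the card come from the composition of the preference pairs rather than from the optimizer itself.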