kz-transformers committed on
Commit 970ec6e · verified · 1 Parent(s): ed9137f

Update src/display/about.py

  1. src/display/about.py +10 -22
src/display/about.py CHANGED
@@ -20,27 +20,21 @@ Kaz LLM is a benchmark for LLM with multiple-choice tasks on the following topic
  - mmlu-translated-kk
  - kazakh-constitution-mc
  - kazakh-dastur-mc
- - kazakh-unified-national-testing-mc
  
  Each task contains from 4 to 8 answer choices.
  ## Instructions for Use
  ### Installation
  To install the necessary library, run the following command:
  ```bash
- git clone --depth 1 https://github.com/horde-research/lm-evaluation-harness-kk.git
- cd lm-evaluation-harness-kk
- pip install -e .
  ```
  ### Execution
  To run the benchmark, use the following command:
  ```bash
- lm_eval \
- --model hf \
- --model_args pretrained={hf/model} \
- --batch_size 8 \
- --num_fewshot 0 \
- --tasks mmlu_translated_kk,kazakh_and_literature_unt_mc,kk_biology_unt_mc,kk_constitution_mc,kk_dastur_mc,kk_english_unt_mc,kk_geography_unt_mc,kk_history_of_kazakhstan_unt_mc,kk_human_society_rights_unt_mc,kk_unified_national_testing_mc,kk_world_history_unt_mc \
- --output output
  ```
  ### Results
  After executing the above command, a JSON file will be created in the `output` directory, which must be attached. This file contains the results of the tasks and a description of the session, and **must not be modified**.
@@ -55,7 +49,7 @@ Kaz LLM – бұл төмендегі тақырыптар бойынша көп
  - mmlu-translated-kk
  - kazakh-constitution-mc
  - kazakh-dastur-mc
- - kazakh-unified-national-testing-mc
  
  Әр тапсырмада 4-8 жауап нұсқасы бар.
  
@@ -63,21 +57,15 @@ Kaz LLM – бұл төмендегі тақырыптар бойынша көп
  ### Орнату
  Қажетті кітапхананы орнату үшін төмендегі команданы орындаңыз:
  ```bash
- git clone --depth 1 https://github.com/horde-research/lm-evaluation-harness-kk.git
- cd lm-evaluation-harness-kk
- pip install -e .
  ```
  
  ### Орындау
  Бенчмаркті іске қосу үшін келесі команданы пайдаланыңыз:
  ```bash
- lm_eval \
- --model hf \
- --model_args pretrained={hf/model} \
- --batch_size 8 \
- --num_fewshot 0 \
- --tasks mmlu_translated_kk,kazakh_and_literature_unt_mc,kk_biology_unt_mc,kk_constitution_mc,kk_dastur_mc,kk_english_unt_mc,kk_geography_unt_mc,kk_history_of_kazakhstan_unt_mc,kk_human_society_rights_unt_mc,kk_unified_national_testing_mc,kk_world_history_unt_mc \
- --output output
  ```
  
  ### Нәтижелер
 
  - mmlu-translated-kk
  - kazakh-constitution-mc
  - kazakh-dastur-mc
+ - kazakh-unified-national-testing-mc(biology,english, geography, history of kz, world's history, society rights, literature, kazakh language)
  
  Each task contains from 4 to 8 answer choices.
  ## Instructions for Use
  ### Installation
  To install the necessary library, run the following command:
  ```bash
+ git clone https://github.com/horde-research/horde-common.git
+ cd scripts
+ pip install -r requirements.txt
  ```
  ### Execution
  To run the benchmark, use the following command:
  ```bash
+ python mc-eval-simplified-inference.py --model_id deepseek-ai/DeepSeek-R1-Distill-Qwen-14B --output_path .
  ```
  ### Results
  After executing the above command, a JSON file will be created in the `output` directory, which must be attached. This file contains the results of the tasks and a description of the session, and **must not be modified**.
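
Review note: the Results paragraph requires that the generated JSON file be attached unmodified. A minimal sketch of how a submitter might check this is to record a SHA-256 digest right after the run and compare it again before attaching. The helper name `fingerprint_results` and the assumption that results are written as `*.json` files directly inside the output directory are illustrative only; they are not taken from the benchmark scripts.

```python
import hashlib
import json
from pathlib import Path


def fingerprint_results(output_dir: str) -> dict[str, str]:
    """Map each JSON file name in output_dir to its SHA-256 hex digest.

    Hypothetical helper: assumes results are plain *.json files located
    directly inside output_dir (not confirmed by the benchmark scripts).
    """
    digests = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        raw = path.read_bytes()
        # Sanity check: raises ValueError if the file is not valid JSON.
        json.loads(raw)
        # Hash the exact bytes on disk so any later edit changes the digest.
        digests[path.name] = hashlib.sha256(raw).hexdigest()
    return digests
```

Calling `fingerprint_results(".")` would match the `--output_path .` value in the new command. Comparing the returned mapping immediately after the run and again before upload shows whether any file changed in between.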
 
  - mmlu-translated-kk
  - kazakh-constitution-mc
  - kazakh-dastur-mc
+ - kazakh-unified-national-testing-mc(biology,english, geography, history of kz, world's history, society rights, literature, kazakh language)
  
  Әр тапсырмада 4-8 жауап нұсқасы бар.
  
 
  ### Орнату
  Қажетті кітапхананы орнату үшін төмендегі команданы орындаңыз:
  ```bash
+ git clone https://github.com/horde-research/horde-common.git
+ cd scripts
+ pip install -r requirements.txt
  ```
  
  ### Орындау
  Бенчмаркті іске қосу үшін келесі команданы пайдаланыңыз:
  ```bash
+ python mc-eval-simplified-inference.py --model_id deepseek-ai/DeepSeek-R1-Distill-Qwen-14B --output_path .
  ```
  
  ### Нәтижелер