ptrdvn committed
Commit cef295c · verified · 1 Parent(s): faabbfb

Update README.md

Files changed (1): README.md (+4 -4)
@@ -107,7 +107,7 @@ We compare this to the original R1 model and test in both regimes where repetiti
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.0 | 66 | 92 |
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.1 | 70 | 98 |

- Code for the SakanaAI/gsm8k-ja-test_250-1319 evaluation can be found [here](https://drive.google.com/file/d/1gCzCJv5vasw8R3KVQimfoIDFyfxwxNvC/view?usp=sharing):
+ Code for the SakanaAI/gsm8k-ja-test_250-1319 evaluation can be found [here](https://drive.google.com/file/d/1gCzCJv5vasw8R3KVQimfoIDFyfxwxNvC/view?usp=sharing).


  We further use the first 50 prompts from (DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja)[https://huggingface.co/datasets/DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja] to evaluate the percentage of valid Japanese `\<think\>` sections in model responses.
@@ -120,7 +120,7 @@ This benchmark contains more varied and complex prompts, meaning this is a more
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.0 | 84 |
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.1 | 94 |

- Code for the DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja evaluation can be found [here](https://drive.google.com/file/d/1f75IM5x1SZrb300odkEsLMfKsfibrxvR/view?usp=sharing):
+ Code for the DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja evaluation can be found [here](https://drive.google.com/file/d/1f75IM5x1SZrb300odkEsLMfKsfibrxvR/view?usp=sharing).

  # How this model was made

@@ -228,7 +228,7 @@ for output in outputs:
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.0 | 66 | 92 |
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.1 | 70 | 98 |

- SakanaAI/gsm8k-ja-test_250-1319の評価コードは[こちら](https://drive.google.com/file/d/1gCzCJv5vasw8R3KVQimfoIDFyfxwxNvC/view?usp=sharing)にあります:
+ SakanaAI/gsm8k-ja-test_250-1319の評価コードは[こちら](https://drive.google.com/file/d/1gCzCJv5vasw8R3KVQimfoIDFyfxwxNvC/view?usp=sharing)にあります。

  さらに、(DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja)[https://huggingface.co/datasets/DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja]の最初の50プロンプトを使用して、モデル応答における有効な日本語の`<think>`セクションの割合を評価します。このベンチマークにはより多様で複雑なプロンプトが含まれており、モデルが日本語を信頼性高く出力できるかどうかを、より現実的に評価します。

@@ -239,7 +239,7 @@ SakanaAI/gsm8k-ja-test_250-1319の評価コードは[こちら](https://drive.go
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.0 | 84 |
  | lightblue/DeepSeek-R1-Distill-Qwen-7B-Japanese | 1.1 | 94 |

- DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja評価コードは[こちら](https://drive.google.com/file/d/1f75IM5x1SZrb300odkEsLMfKsfibrxvR/view?usp=sharing)にあります:
+ DeL-TaiseiOzaki/Tengentoppa-sft-reasoning-ja評価コードは[こちら](https://drive.google.com/file/d/1f75IM5x1SZrb300odkEsLMfKsfibrxvR/view?usp=sharing)にあります。

  # 作成方法

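
The README lines touched by this commit describe measuring the percentage of valid Japanese `<think>` sections in model responses. The linked evaluation scripts are not reproduced in this diff, so the following is only a minimal sketch of how such a check might look. The function names, the kana-based heuristic, and the `min_ratio` threshold are all assumptions made for illustration and are not taken from the authors' code.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think>, as in R1-style outputs.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# Hiragana/katakana block plus the main CJK ideograph block (hypothetical choice).
JAPANESE_RE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")
KANA_RE = re.compile(r"[\u3040-\u30ff]")


def is_valid_japanese_think(response: str, min_ratio: float = 0.5) -> bool:
    """Heuristic: the <think> section exists, contains kana, and is mostly Japanese."""
    match = THINK_RE.search(response)
    if match is None:
        return False  # no <think> section at all
    section = match.group(1)
    letters = [ch for ch in section if ch.isalpha()]
    if not letters:
        return False  # empty or punctuation-only section
    japanese = sum(1 for ch in letters if JAPANESE_RE.match(ch))
    # Requiring kana separates Japanese from Chinese text, which shares the kanji range.
    has_kana = KANA_RE.search(section) is not None
    return has_kana and japanese / len(letters) >= min_ratio


def percent_valid(responses: list[str]) -> float:
    """Percentage of responses whose <think> section passes the heuristic."""
    if not responses:
        return 0.0
    valid = sum(is_valid_japanese_think(r) for r in responses)
    return 100.0 * valid / len(responses)


# Made-up example responses, not real model outputs.
responses = [
    "<think>まず、問題を整理します。</think>答えは4です。",
    "<think>First, let me think.</think>The answer is 4.",
    "答えは4です。",
    "<think>首先，我们来分析问题。</think>答案是4。",
]
print(percent_valid(responses))  # 25.0
```

Only the first example passes: the second reasons in English, the third has no `<think>` section, and the fourth is Chinese, which the kana requirement rejects. A real evaluation might instead use a language-identification model, which this sketch does not attempt.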