noneUsername committed on
Commit 03b70af · verified · 1 Parent(s): 5d36df4

Update README.md

Files changed (1)
  1. README.md +22 -0
README.md CHANGED
@@ -5,6 +5,28 @@ base_model:
  selected 70-1024-df10-u2k


+ vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto
+ |Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
+ |-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
+ |gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.912|±  | 0.018|
+ |     |       |strict-match    |     5|exact_match|↑  |0.912|±  | 0.018|
+
+ vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 500.0, num_fewshot: 5, batch_size: auto
+ |Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
+ |-----|------:|----------------|-----:|-----------|---|----:|---|-----:|
+ |gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.904|±  |0.0132|
+ |     |       |strict-match    |     5|exact_match|↑  |0.894|±  |0.0138|
+
+ vllm (pretrained=/root/autodl-tmp/Cydonia-24B-v2,add_bos_token=true,max_model_len=700,dtype=bfloat16), gen_kwargs: (None), limit: 15.0, num_fewshot: None, batch_size: 1
+ |      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
+ |------------------|------:|------|------|------|---|-----:|---|-----:|
+ |mmlu              |      2|none  |      |acc   |↑  |0.7942|±  |0.0131|
+ | - humanities     |      2|none  |      |acc   |↑  |0.8205|±  |0.0257|
+ | - other          |      2|none  |      |acc   |↑  |0.8103|±  |0.0271|
+ | - social sciences|      2|none  |      |acc   |↑  |0.8500|±  |0.0257|
+ | - stem           |      2|none  |      |acc   |↑  |0.7298|±  |0.0249|
+
+
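Reviewer note on the added gsm8k rows: the Stderr values look consistent with the standard error of a mean of per-problem 0/1 scores, computed with the sample (n - 1) variance and n equal to the `limit` setting. That reading is an assumption checked against the numbers, not something the README states. The helper name `sample_stderr_of_mean` is hypothetical. A minimal sketch:

```python
import math


def sample_stderr_of_mean(accuracy: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores whose mean is `accuracy`.

    Assumption: each problem scores 0 or 1 (exact_match) and the variance
    uses the n - 1 denominator. This is inferred, not stated in the README.
    """
    sample_variance = accuracy * (1.0 - accuracy) * n / (n - 1)
    return math.sqrt(sample_variance / n)


# (label, Value, limit, reported Stderr) copied from the gsm8k rows.
gsm8k_rows = [
    ("limit 250, both filters", 0.912, 250, 0.018),
    ("limit 500, flexible-extract", 0.904, 500, 0.0132),
    ("limit 500, strict-match", 0.894, 500, 0.0138),
]

for label, value, limit, reported in gsm8k_rows:
    estimate = sample_stderr_of_mean(value, limit)
    print(f"{label}: estimated {estimate:.4f}, reported {reported}")
```

The mmlu group rows aggregate several subtasks, so this single-proportion formula is not expected to reproduce their Stderr column directly.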
  vllm (pretrained=/root/autodl-tmp/70-512-df10,add_bos_token=true,max_model_len=2048), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: 1
  |Tasks|Version|     Filter     |n-shot|  Metric   |   |Value|   |Stderr|
  |-----|------:|----------------|-----:|-----------|---|----:|---|-----:|