LLMJapan committed on
Commit
afa67e8
·
verified ·
1 Parent(s): baf4f6f

Update README.md


Quantization updated

Files changed (1)
  1. README.md +94 -3
README.md CHANGED
@@ -1,3 +1,94 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - open-r1/codeforces-cots
+ language:
+ - en
+ base_model:
+ - Qwen/Qwen2.5-Coder-7B-Instruct
+ - open-r1/OlympicCoder-7B
+ base_model_relation: quantized
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+
+ ## ExLlamaV2 Quantizations of OlympicCoder-7B
+
+ Quantized with <a href="https://github.com/turboderp/exllamav2/releases/tag/v0.2.8">turboderp's ExLlamaV2 v0.2.8</a>.
+
+ - average: 4.5 bpw
+ - lm_head: 6.0 bpw
+ ```sh
+ python convert.py \
+     -i {path}/OlympicCoder-7B \
+     -o {path}/OlympicCoder-7B/workingdir/ \
+     -cf {path}/OlympicCoder-7B_4.5bpw/ \
+     -b 4.5 \
+     -hb 6
+ ```
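
As a rough sanity check on the bit rates above, the quantized weight size can be estimated from bits per weight. This is only a sketch: the parameter counts are hypothetical round numbers rather than values read from the checkpoint, it assumes the 4.5 bpw average applies to everything except `lm_head`, and it ignores quantization metadata and any tensors kept in higher precision.

```python
def estimate_weight_gib(body_params: float, head_params: float,
                        body_bpw: float = 4.5, head_bpw: float = 6.0) -> float:
    """Approximate quantized weight size in GiB from bits per weight."""
    total_bits = body_params * body_bpw + head_params * head_bpw
    return total_bits / 8 / 1024**3  # bits -> bytes -> GiB


# Hypothetical round numbers (assumptions, not measured from the model).
BODY_PARAMS = 7.0e9
HEAD_PARAMS = 0.5e9

size_gib = estimate_weight_gib(BODY_PARAMS, HEAD_PARAMS)
print(f"approx. {size_gib:.2f} GiB of quantized weights")
```

The actual files on disk will differ somewhat; the estimate is mainly useful for judging whether a given bit rate is likely to fit in available VRAM before running the conversion.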
+
+ # Model Card for OlympicCoder-7B
+
+ OlympicCoder-7B is a code model that achieves strong performance on competitive coding benchmarks such as LiveCodeBench and the 2024 International Olympiad in Informatics.
+
+ * Repository: https://github.com/huggingface/open-r1
+ * Blog post: https://huggingface.co/blog/open-r1/update-3
+
+ ## Model description
+
+ - **Model type:** A 7B parameter model fine-tuned on a decontaminated version of the Codeforces dataset.
+ - **Language(s) (NLP):** Primarily English
+ - **License:** apache-2.0
+ - **Finetuned from model:** [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)
+
+ ## Evaluation
+
+ ![](./ioi-evals.png)
+
+
+
+ ## Usage
+ Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:
+
+ ```python
+ # pip install transformers
+ # pip install accelerate
+ import torch
+ from transformers import pipeline
+ pipe = pipeline("text-generation", model="open-r1/OlympicCoder-7B", torch_dtype=torch.bfloat16, device_map="auto")
+ # We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
+ messages = [
+     {"role": "user", "content": "Write a python program to calculate the 10th Fibonacci number"},
+ ]
+ prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ outputs = pipe(prompt, max_new_tokens=8000, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ #<|im_start|>user
+ #Write a python program to calculate the 10th Fibonacci number<|im_end|>
+ #<|im_start|>assistant
+ #<think>Okay, I need to write a Python program that calculates the 10th Fibonacci number. Hmm, the Fibonacci sequence starts with 0 and 1. Each subsequent number is the sum of the two preceding ones. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. ...
+ ```
+
+ > [!WARNING]
+ > To ensure that the model consistently outputs a long chain-of-thought, we have edited the chat template to prefill the first assistant turn with a `<think>` token. As a result, the outputs from this model will not show the opening `<think>` token if you use the model's `generate()` method. To apply reinforcement learning with a format reward, either prepend the `<think>` token to the model's completions or amend the chat template to remove the prefill.
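
The first option in the warning, prepending the `<think>` token to completions, might look like the minimal sketch below. The helper names and the format check are hypothetical and purely illustrative; they are not part of the model's tooling or of any particular reward implementation.

```python
import re

THINK_OPEN = "<think>"


def restore_think_prefix(completion: str) -> str:
    """Prepend the opening <think> token if the chat template prefilled it."""
    if completion.lstrip().startswith(THINK_OPEN):
        return completion  # already present, leave unchanged
    return THINK_OPEN + completion


def has_think_format(text: str) -> bool:
    """Hypothetical format check: a <think>...</think> block, then the answer."""
    pattern = r"\s*<think>.*?</think>.*"
    return re.fullmatch(pattern, text, flags=re.DOTALL) is not None


raw = "Okay, I need the 10th Fibonacci number...</think>print(34)"
fixed = restore_think_prefix(raw)
print(has_think_format(raw), has_think_format(fixed))
```

Keeping the restore step idempotent means it can be applied safely to completions whether or not the prefill was in effect when they were generated.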
+
+ ## Training procedure
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+
+ - dataset: open-r1/codeforces-cots
+ - learning_rate: 4.0e-5
+ - train_batch_size: 2
+ - seed: 42
+ - packing: false
+ - distributed_type: deepspeed-zero-3
+ - num_devices: 8
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine_with_min_lr
+ - min_lr_rate: 0.1
+ - lr_scheduler_warmup_ratio: 0.03
+ - num_epochs: 10.0
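
To illustrate how the scheduler settings in the list above interact, here is a plain-Python sketch of a cosine schedule with linear warmup and a minimum learning rate floor. It assumes the usual shape for this scheduler type (linear warmup to the peak rate, then cosine decay down to `min_lr_rate` times the peak); the total step count is a hypothetical value, since the card does not state it.

```python
import math


def cosine_with_min_lr(step: int, total_steps: int, peak_lr: float = 4.0e-5,
                       warmup_ratio: float = 0.03, min_lr_rate: float = 0.1) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to a floor."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return peak_lr * (min_lr_rate + (1.0 - min_lr_rate) * cosine)


TOTAL_STEPS = 1000  # hypothetical; not given in the model card
schedule = [cosine_with_min_lr(s, TOTAL_STEPS) for s in range(TOTAL_STEPS + 1)]
print(f"start={schedule[0]:.1e} peak={max(schedule):.1e} end={schedule[-1]:.1e}")
```

With these settings the rate never decays below one tenth of the peak, so the final epochs still train at a non-negligible learning rate.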