Commit eae888f by ozone-research (verified, parent: 3b6c1cb): Update README.md
license_name: qwen-research
license_link: >-
  https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT
---
# Chirp-3b

## Overview

Chirp-3b is a 3B-parameter language model built by the Ozone Research team. It was fine-tuned from a robust base model on 50 million tokens of data distilled from GPT-4o. Despite its compact size, it delivers strong results, outperforming expectations on benchmarks such as MMLU Pro and IFEval.

Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.
## Key Features

- **Parameters**: 3 billion
- **Training Data**: 50M tokens distilled from GPT-4o
- **Fine-Tuned From**: [Base model name TBD—update if applicable]
- **License**: [Specify license, e.g., MIT, Apache 2.0, etc.—update as needed]
## Benchmarks

Chirp-3b performs strongly on rigorous evaluation suites for a model of its size.

### MMLU Pro

| Subject             | Average Accuracy |
|---------------------|------------------|
| Biology             | 0.6234           |
| Business            | 0.5032           |
| Chemistry           | 0.3701           |
| Computer Science    | 0.4268           |
| Economics           | 0.5284           |
| Engineering         | 0.3013           |
| Health              | 0.3900           |
| History             | 0.3885           |
| Law                 | 0.2252           |
| Math                | 0.5736           |
| Other               | 0.4145           |
| Philosophy          | 0.3687           |
| Physics             | 0.3995           |
| Psychology          | 0.5589           |
| **Overall Average** | **0.4320**       |

- **Improvement**: 9 points above the base model.
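As a quick sanity check on the table, note that the reported overall average (0.4320) sits slightly below the unweighted mean of the per-subject scores (about 0.434), which is consistent with the overall figure being weighted by the number of questions per subject. A minimal sketch of the unweighted calculation:

```python
# Per-subject MMLU Pro accuracies copied from the table above
scores = {
    "Biology": 0.6234, "Business": 0.5032, "Chemistry": 0.3701,
    "Computer Science": 0.4268, "Economics": 0.5284, "Engineering": 0.3013,
    "Health": 0.3900, "History": 0.3885, "Law": 0.2252, "Math": 0.5736,
    "Other": 0.4145, "Philosophy": 0.3687, "Physics": 0.3995,
    "Psychology": 0.5589,
}

# Simple (unweighted) mean over the 14 subjects
unweighted_mean = sum(scores.values()) / len(scores)
print(f"{unweighted_mean:.4f}")  # ≈ 0.4337
```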
### IFEval

- **Score**: 72%
- **Improvement**: 14% better than the base model.

More benchmarks are in the works and will be shared soon!
## Download

Access Chirp-3b here: https://huggingface.co/ozone-research/Chirp-01
## Usage

### Requirements

- Recommended GPU: 8 GB VRAM minimum
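A rough way to sanity-check that requirement: weight memory is roughly the parameter count times bytes per parameter, so 3B parameters in fp16/bf16 (2 bytes each) come to about 5.6 GB of weights alone, before activations and KV cache — hence 8 GB as a comfortable floor. A minimal back-of-the-envelope sketch (the helper name is ours, not part of this model card):

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Estimate model weight memory in GiB.

    Counts only the weights; activations and KV cache add more on top.
    bytes_per_param: 2 for fp16/bf16, 4 for fp32, 1 for int8.
    """
    return num_params * bytes_per_param / 1024**3

# 3B parameters in half precision (2 bytes per parameter)
print(f"{weight_memory_gb(3_000_000_000):.1f} GiB")  # ≈ 5.6 GiB
```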
### Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "What's the future of AI?"
inputs = tokenizer(input_text, return_tensors="pt")
# max_new_tokens bounds the generated continuation itself,
# rather than the total length including the prompt
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Future Work

The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!
## Feedback

We're eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue here or contact us via [contact method—update as needed].
## Acknowledgments

A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.