---
license_name: qwen-research
license_link: >-
  https://raw.githubusercontent.com/QwenLM/Qwen/refs/heads/main/Tongyi%20Qianwen%20RESEARCH%20LICENSE%20AGREEMENT
---

# Chirp-3b

## Overview

Chirp-3b is a high-performing 3B-parameter language model built by the Ozone Research team. It was fine-tuned from a strong base model on 50 million tokens of data distilled from GPT-4o, and for its size it delivers notably strong results on benchmarks such as MMLU Pro and IFEval.

Chirp-3b is an open-source effort to push the limits of what small-scale LLMs can achieve, making it a valuable tool for researchers and enthusiasts alike.

## Key Features

- **Parameters**: 3 billion
- **Training Data**: 50M tokens distilled from GPT-4o
- **Fine-Tuned From**: [Base model name TBD—update if applicable]
- **License**: Qwen Research License (see the `license_link` in the metadata above)

## Benchmarks

Chirp-3b excels on rigorous evaluation datasets, showcasing its strength for a 3B model.

### MMLU Pro

| Subject             | Average Accuracy |
|---------------------|------------------|
| Biology             | 0.6234           |
| Business            | 0.5032           |
| Chemistry           | 0.3701           |
| Computer Science    | 0.4268           |
| Economics           | 0.5284           |
| Engineering         | 0.3013           |
| Health              | 0.3900           |
| History             | 0.3885           |
| Law                 | 0.2252           |
| Math                | 0.5736           |
| Other               | 0.4145           |
| Philosophy          | 0.3687           |
| Physics             | 0.3995           |
| Psychology          | 0.5589           |
| **Overall Average** | **0.4320**       |

- **Improvement**: 9 points above the base model.

### IFEval

- **Score**: 72%
- **Improvement**: 14% better than the base model.

More benchmarks are in the works and will be shared soon!
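
The README does not say which harness produced these numbers. For a sanity check, EleutherAI's lm-evaluation-harness exposes a Python entry point; the sketch below is an assumption about tooling, not the configuration used for the scores above, and task identifiers vary across harness versions.

```python
# Hypothetical reproduction sketch with lm-evaluation-harness
# (pip install lm-eval). Task names and settings are assumptions,
# not the configuration used for the reported scores.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ozone-research/Chirp-01",
    tasks=["mmlu_pro", "ifeval"],  # assumed task identifiers
)
print(results["results"])
```
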
## Download

Access Chirp-3b on Hugging Face: https://huggingface.co/ozone-research/Chirp-01
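
If you prefer to fetch the weights ahead of time, the `huggingface_hub` client can mirror the repository locally. A minimal sketch, assuming `huggingface_hub` is installed:

```python
# Sketch: download the Chirp-01 repository locally with huggingface_hub
# (pip install huggingface_hub). Assumes network access to the Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("ozone-research/Chirp-01")
print(f"Model files are in: {local_dir}")
```
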
## Usage

### Requirements

- GPU: 8 GB of VRAM minimum recommended (a half-precision loading sketch follows below)
- Libraries: `transformers` and `torch` (used in the examples)
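
To stay inside an 8 GB VRAM budget, loading the weights in half precision is a common approach. A minimal sketch, assuming a CUDA-capable GPU and the `accelerate` package for `device_map` support:

```python
# Sketch: load Chirp-3b in float16 to roughly halve GPU memory use.
# Assumes PyTorch with CUDA and `accelerate` installed for device_map.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "ozone-research/Chirp-01",
    torch_dtype=torch.float16,  # half precision instead of float32
    device_map="auto",          # place the weights on the available GPU
)
```
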
### Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub
model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and generate up to 50 new tokens
input_text = "What's the future of AI?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
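
Since Chirp-3b was distilled from GPT-4o data, prompting through a chat template may match the fine-tuning format better than raw completion. A hedged sketch, assuming the tokenizer ships a chat template (check the repository's `tokenizer_config.json`):

```python
# Sketch: chat-style prompting, assuming the tokenizer defines a chat
# template; fall back to plain completion if it does not.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ozone-research/Chirp-01"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [{"role": "user", "content": "What's the future of AI?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=100)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
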
## Future Work

The Ozone Research team is exploring additional models, including 2B and larger variants. Keep an eye out for upcoming releases!
## Feedback

We're eager for your input! Try Chirp-3b and let us know your thoughts, use cases, or ideas for improvement. Open an issue here or contact us via [contact method—update as needed].
## Acknowledgments

A big thanks to the open-source community for driving projects like this forward. Chirp-3b is our contribution to making AI research more accessible.