Update README.md
README.md (CHANGED)
@@ -4,57 +4,31 @@ datasets: eyad-silx/Quasar-Max-3.3
Before:

library_name: transformers
model_name: Quasar-3.0-Max
tags:
- trl
- sft
licence: license
---
# Model Card for Quasar-3.0-Max

## Quick start

```python
from transformers import pipeline

question = "..."
generator = pipeline("text-generation", model="silx-ai/Quasar-3.0-Max", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
### Framework versions

- TRL: 0.16.0.dev0
- Transformers: 4.49.0
- Pytorch: 2.5.1
- Datasets: 3.3.2
- Tokenizers: 0.21.0
## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
  title = {{TRL: Transformer Reinforcement Learning}},
  author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
  year = 2020,
  journal = {GitHub repository},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/huggingface/trl}}
}
```
After:

library_name: transformers
model_name: Quasar-3.0-Max
tags:
- rl
- silx
- trl
- sft
licence: license
---
# Quasar Series of Models

<p align="center">
  <img src="https://pbs.twimg.com/media/GlaGzuIWcAAI1JO?format=png&name=small" alt="Quasar Model Image">
</p>

## Introducing Quasar-3.3-Max
This model is provided by **SILX INC**. It was fine-tuned with supervised fine-tuning (SFT) using the **open-r1** repository, and the training data includes sequences of varying lengths (32k, 16k, and 8k tokens) to broaden the model's knowledge and adaptability.
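The SFT stage described above can be approximated with TRL's `SFTTrainer` (open-r1 builds on TRL, whose version is pinned under "Framework versions" in the previous card). This is a minimal, hypothetical sketch, not the authors' actual configuration: the dataset id comes from this card's front matter, while the base checkpoint, split name, and hyperparameters are assumptions.

```python
# Hypothetical SFT sketch; only the dataset id is taken from this card.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("eyad-silx/Quasar-Max-3.3", split="train")  # split name assumed

config = SFTConfig(
    output_dir="quasar-3.3-max-sft",
    max_seq_length=32768,  # the card mentions 32k/16k/8k-token training sequences
)

trainer = SFTTrainer(
    model="base-model-id",  # placeholder: the base checkpoint is not stated on this page
    args=config,
    train_dataset=dataset,
)
trainer.train()
```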
Quasar-3.3-Max represents the **first step** in the Quasar project before Reinforcement Learning (RL). At this stage, the model's reasoning steps are capped at a **maximum length of 8129 tokens** to optimize processing efficiency and contextual understanding.
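Because this update drops the quick-start snippet from the card, here is a minimal usage sketch adapted from the removed code above. The repository id `silx-ai/Quasar-3.0-Max` and the reasoning-length cap come from this page; the prompt and the choice to map the cap onto `max_new_tokens` are illustrative assumptions.

```python
# Usage sketch adapted from the removed quick-start; prompt and settings are assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="silx-ai/Quasar-3.0-Max", device="cuda")

question = "Summarize the idea behind long-form reasoning in two sentences."  # placeholder prompt
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=8129,  # mirrors the stated cap on reasoning length
    return_full_text=False,
)[0]
print(output["generated_text"])
```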
Stay tuned for further updates as we advance the Quasar project with RL enhancements!
## Resources

- [Research Paper](https://arxiv.org/abs/2412.06822)
- [Website](https://sicopilot.cloud)

## Founders

- **Eyad Gomaa**
- **Gomaa Salah**