slightly reorganize readme
README.md
CHANGED
````diff
@@ -8,9 +8,9 @@ base_model:
 
 Fine-tune of https://huggingface.co/suayptalha/Falcon3-Jessi-v0.4-7B-Slerp using a custom training script + custom optimizer.
 
-
+Small early-stage reasoning model project that uses its own symbolic language to represent different reasoning steps.
 
-
+Trained on a synthetic hand-edited dataset with 65 samples.
 
 **Changes from previous versions**
 - Unlike the prior versions, this only modifies `mlp.gate_proj`, leaving `self_attn` from the original model fully intact.
@@ -38,7 +38,7 @@ This is the syntax for the DSL I trained it on, which is called ROL (Reasoning O
 `↺` Self-Reflect - Reconsider assumptions, mitigate overconfidence. Usually begins with the word "Wait" or "Alternatively" or "Actually". Use this at least 3 times.
 `➤` Output - Structure tone, format, intent.
 ```
-ROL was invented by Deepseek R1 and tweaked by me. The spec for it was not included in the training data - the model figured out how it works based on
+ROL was invented by Deepseek R1 and tweaked by me. The spec for it was not included in the training data - the model figured out how it works based on training data.
 
 The total training time for this was under 15 minutes.
 
````
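For illustration, a hypothetical ROL-style trace using only the two operators visible in this hunk (the full symbol set is defined earlier in the README and not reproduced here; the task and wording below are invented, not from the model) might look like:

```
↺ Wait - I assumed the input list was sorted, but nothing in the prompt says so.
↺ Actually, the list may also contain duplicates, so a set-based count would undercount.
↺ Alternatively, sorting a copy first and scanning adjacent pairs handles both issues.
➤ Answer concisely: sort a copy of the list, then count equal adjacent elements.
```

This follows the stated convention that `↺` steps open with "Wait" / "Actually" / "Alternatively" and appear at least three times before the single `➤` output step.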