nkpz committed on
Commit 5d12050 · verified · 1 Parent(s): 5888a6a

slightly reorganize readme

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -8,9 +8,9 @@ base_model:
 
 Fine-tune of https://huggingface.co/suayptalha/Falcon3-Jessi-v0.4-7B-Slerp using a custom training script + custom optimizer.
 
-Third try at creating a small reasoning model that uses its own symbolic language to represent different reasoning steps.
+Small early-stage reasoning model project that uses its own symbolic language to represent different reasoning steps.
 
-Still in early stages, and I plan on building publicly and releasing as often as possible.
+Trained on a synthetic hand-edited dataset with 65 samples.
 
 **Changes from previous versions**
 - Unlike the prior versions, this only modifies `mlp.gate_proj`, leaving `self_attn` from the original model fully intact.
@@ -38,7 +38,7 @@ This is the syntax for the DSL I trained it on, which is called ROL (Reasoning O
 `↺` Self-Reflect - Reconsider assumptions, mitigate overconfidence. Usually begins with the word "Wait" or "Alternatively" or "Actually". Use this at least 3 times.
 `➤` Output - Structure tone, format, intent.
 ```
-ROL was invented by Deepseek R1 and tweaked by me. The spec for it was not included in the training data - the model figured out how it works based on a synthetic hand-edited dataset with 65 samples.
+ROL was invented by Deepseek R1 and tweaked by me. The spec for it was not included in the training data - the model figured out how it works based on training data.
 
 The total training time for this was under 15 minutes.
 