Update README.md
README.md
CHANGED
@@ -18,6 +18,33 @@ base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
- **License:** apache-2.0
- **Finetuned from model:** unsloth/meta-llama-3.1-8b-bnb-4bit

## Trainer Configuration

| **Parameter** | **Value** |
|------------------------------|------------------------------------------|
| **Model** | `model` |
| **Tokenizer** | `tokenizer` |
| **Train Dataset** | `dataset` |
| **Dataset Text Field** | `text` |
| **Max Sequence Length** | `max_seq_length` |
| **Dataset Number of Processes** | `2` |
| **Packing** | `False` (can make training 5x faster for short sequences) |
| **Training Arguments** | |
| - **Per Device Train Batch Size** | `2` |
| - **Gradient Accumulation Steps** | `4` |
| - **Warmup Steps** | `5` |
| - **Number of Train Epochs** | `1` (set this for 1 full training run) |
| - **Max Steps** | `60` |
| - **Learning Rate** | `2e-4` |
| - **FP16** | `not is_bfloat16_supported()` |
| - **BF16** | `is_bfloat16_supported()` |
| - **Logging Steps** | `1` |
| - **Optimizer** | `adamw_8bit` |
| - **Weight Decay** | `0.01` |
| - **LR Scheduler Type** | `linear` |
| - **Seed** | `3407` |
| - **Output Directory** | `outputs` |
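The table mirrors the standard Unsloth/TRL fine-tuning setup. Below is a minimal sketch of the corresponding trainer code, assuming Unsloth's `FastLanguageModel` loader and TRL's `SFTTrainer`. The `max_seq_length` value and the dataset source are placeholders, since the card does not state them, and exact argument names can vary with the TRL version.

```python
from unsloth import FastLanguageModel, is_bfloat16_supported
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 2048  # placeholder: the card does not state the actual value

# Load the 4-bit base model named in the card's metadata.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Placeholder dataset: any dataset with a "text" column works here.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,  # can make training 5x faster for short sequences
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        # num_train_epochs=1,  # set this for 1 full training run
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

trainer.train()
```

With a per-device batch size of 2 and 4 gradient accumulation steps, the effective batch size is 8, and `max_steps=60` caps the run at 60 optimizer steps regardless of the epoch setting.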