Update README.md
README.md
CHANGED
@@ -18,6 +18,33 @@ base_model: unsloth/meta-llama-3.1-8b-bnb-4bit
- **License:** apache-2.0
- **Finetuned from model:** unsloth/meta-llama-3.1-8b-bnb-4bit

## Trainer Configuration

| **Parameter** | **Value** |
|------------------------------|------------------------------------------|
| **Model** | `model` |
| **Tokenizer** | `tokenizer` |
| **Train Dataset** | `dataset` |
| **Dataset Text Field** | `text` |
| **Max Sequence Length** | `max_seq_length` |
| **Dataset Number of Processes** | `2` |
| **Packing** | `False` (can make training 5x faster for short sequences) |
| **Training Arguments** | |
| - **Per Device Train Batch Size** | `2` |
| - **Gradient Accumulation Steps** | `4` |
| - **Warmup Steps** | `5` |
| - **Number of Train Epochs** | `1` (set this for 1 full training run) |
| - **Max Steps** | `60` |
| - **Learning Rate** | `2e-4` |
| - **FP16** | `not is_bfloat16_supported()` |
| - **BF16** | `is_bfloat16_supported()` |
| - **Logging Steps** | `1` |
| - **Optimizer** | `adamw_8bit` |
| - **Weight Decay** | `0.01` |
| - **LR Scheduler Type** | `linear` |
| - **Seed** | `3407` |
| - **Output Directory** | `outputs` |
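The table mirrors the standard Unsloth/TRL fine-tuning setup. Below is a minimal sketch of the corresponding trainer code, assuming Unsloth's `FastLanguageModel` loader and TRL's `SFTTrainer`. The `max_seq_length` value and the dataset source are placeholders, since the card does not state them, and exact argument names can vary with the TRL version.

```python
from unsloth import FastLanguageModel, is_bfloat16_supported
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 2048  # placeholder: the card does not state the actual value

# Load the 4-bit base model named in the card's metadata.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/meta-llama-3.1-8b-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Placeholder dataset: any dataset with a "text" column works here.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,  # can make training 5x faster for short sequences
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        # num_train_epochs=1,  # set this for 1 full training run
        max_steps=60,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)

trainer.train()
```

With a per-device batch size of 2 and 4 gradient accumulation steps, the effective batch size is 8, and `max_steps=60` caps the run at 60 optimizer steps regardless of the epoch setting.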