datajose/pruebas-ft
README.md
CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

This model is a fine-tuned version of [TheBloke/Mistral-7B-Instruct-v0.2-GPTQ](https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.2-GPTQ) on an unknown dataset.
It achieves the following results on the evaluation set:
-- Loss:
+- Loss: 0.5016

## Model description

@@ -45,27 +45,28 @@ The following hyperparameters were used during training:
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2
- num_epochs: 10
+- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
-
-
-
-
-
-
-
-
-
-
+| 2.2715        | 0.96  | 20   | 1.7064          |
+| 0.71          | 1.98  | 41   | 1.1687          |
+| 0.5515        | 2.99  | 62   | 1.0146          |
+| 0.5052        | 4.0   | 83   | 0.8605          |
+| 0.4887        | 4.96  | 103  | 0.7023          |
+| 0.4311        | 5.98  | 124  | 0.6066          |
+| 0.418         | 6.99  | 145  | 0.5606          |
+| 0.4088        | 8.0   | 166  | 0.5206          |
+| 0.4243        | 8.96  | 186  | 0.5048          |
+| 0.3898        | 9.64  | 200  | 0.5016          |


### Framework versions

- PEFT 0.9.0
-- Transformers 4.38.
-- Pytorch 2.
-- Datasets 2.
+- Transformers 4.38.2
+- Pytorch 2.1.0+cu121
+- Datasets 2.18.0
- Tokenizers 0.15.2
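The scheduler settings in the card (`lr_scheduler_type: linear` with `lr_scheduler_warmup_steps: 2`, over the roughly 200 training steps shown in the last row of the results table) can be sketched as a plain function. This is an illustrative reimplementation, not the Transformers source, and `base_lr` is a placeholder since the diff does not show the learning rate itself:

```python
def linear_lr_with_warmup(step, total_steps=200, warmup_steps=2, base_lr=1.0):
    """Linear warmup for `warmup_steps` steps, then linear decay to 0.

    Returns the learning-rate multiplier times `base_lr` at `step`.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With a 2-step warmup the rate peaks almost immediately and then decays for the rest of the run, which matches the short-warmup configuration listed above.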
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:4b8f41a103ccfef48a0c7101fd706957a4b02d57796fae22b701eec88a4af294
+size 8397056
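The binary files in this commit are stored through Git LFS, so the repository itself tracks only a three-line pointer like the one above. A minimal sketch of parsing such a pointer into its fields, using the new oid and size from this hunk:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of key/value fields.

    A pointer is a series of "key value" lines; the `oid` value embeds
    the hash algorithm, e.g. "sha256:<hex digest>".
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4b8f41a103ccfef48a0c7101fd706957a4b02d57796fae22b701eec88a4af294
size 8397056
"""
fields = parse_lfs_pointer(pointer)
print(fields["size"])  # 8397056
```

The `size` field is the byte count of the real object, which is why the pointer diffs change whenever the underlying weights do.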
runs/Mar12_13-41-31_57d4afd0bde0/events.out.tfevents.1710250893.57d4afd0bde0.3382.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:428c8c736b5c0e189a5d69d91ad1de488c85399c37b59fbd0cbb352e736b0326
+size 10328
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:30e010fd8cbc265a352f57d16ac7c99310ada8ea19b2d05a7f4aa4a081d8dd63
+size 4856