crscardellino committed on
Commit 48a9e41 (verified)
1 Parent(s): f781de0

End of training

Files changed (5)
  1. README.md +42 -77
  2. config.json +1 -1
  3. generation_config.json +1 -1
  4. model.safetensors +3 -0
  5. training_args.bin +2 -2
README.md CHANGED
@@ -1,112 +1,77 @@
  ---
- language:
- - es
- license: gpl-3.0
  tags:
  - generated_from_trainer
  model-index:
  - name: flisol-cba-martin-fierro
  results: []
- widget:
- - text: "Aqui me pongo a cantar"
-   example_title: "Martin Fierro"
  ---

- Hugging Face: IA Colaborativa
- =============================
-
- This repository contains the code and model I trained for the talk
- ["Hugging Face: IA Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/)
- at the 2023 [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina.
-
- To set everything up you need [`git-lfs`](https://git-lfs.com/) installed
- and activated.
-
- Clone the repository with:
-
-     $ git clone https://huggingface.co/crscardellino/flisol-cba-martin-fierro
-
- Then create the environment and install the requirements:
-
-     $ python -m venv flisol-venv
-     $ source ./flisol-venv/bin/activate
-     (flisol-venv) $ pip install -r requirements.txt
-
- The code is tested with Python 3.10, but it should work with Python >= 3.8.
- The requirements file installs [PyTorch](https://pytorch.org/) v2.0.0 for CPU,
- but you can adjust it to use GPUs provided you meet the CUDA requirements.
-
- ## License
-
- flisol-cba-martin-fierro
- Copyright (C) 2023 Cristian Cardellino
-
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org/licenses/>.
-
- ## Model Specifications (Auto Generated)
-
- This model is a fine-tuned version of
- [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on the
- `./data/martin-fierro_train.txt` dataset. It achieves the following results on
- the evaluation set:
-
- - Loss: 3.9067

  ## Model description

- A GPT-2 model fine-tuned on the poem ["El Gaucho Martín
- Fierro"](https://es.wikipedia.org/wiki/El_Gaucho_Mart%C3%ADn_Fierro).

  ## Intended uses & limitations

- This was trained for the talk ["Hugging Face: IA
- Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/) at the
- [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina, 2023.

  ## Training and evaluation data

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 10

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 4.3864 | 1.0 | 18 | 4.2025 |
- | 3.948 | 2.0 | 36 | 4.0440 |
- | 3.7962 | 3.0 | 54 | 3.9804 |
- | 3.6105 | 4.0 | 72 | 3.9458 |
- | 3.4444 | 5.0 | 90 | 3.9280 |
- | 3.3855 | 6.0 | 108 | 3.9192 |
- | 3.3142 | 7.0 | 126 | 3.9091 |
- | 3.2192 | 8.0 | 144 | 3.9074 |
- | 3.1615 | 9.0 | 162 | 3.9070 |
- | 3.1637 | 10.0 | 180 | 3.9067 |

  ### Framework versions

- - Transformers 4.28.1
- - Pytorch 2.0.0+cpu
- - Datasets 2.11.0
- - Tokenizers 0.13.3

  ---
+ library_name: transformers
+ license: mit
+ base_model: DeepESP/gpt2-spanish
  tags:
  - generated_from_trainer
  model-index:
  - name: flisol-cba-martin-fierro
  results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # flisol-cba-martin-fierro

+ This model is a fine-tuned version of [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 3.9095
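+
+ A minimal inference sketch (illustrative, not part of the auto-generated card;
+ it assumes the Hub model id `crscardellino/flisol-cba-martin-fierro`, matching
+ the clone URL in the previous README):
+
+ ```python
+ # Load the fine-tuned model from the Hub and generate a short continuation.
+ from transformers import pipeline
+
+ generator = pipeline(
+     "text-generation",
+     model="crscardellino/flisol-cba-martin-fierro",
+ )
+
+ # Opening verse used as the widget example in the old model card.
+ print(generator("Aqui me pongo a cantar", max_new_tokens=50)[0]["generated_text"])
+ ```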
  ## Model description

+ More information needed

  ## Intended uses & limitations

+ More information needed

  ## Training and evaluation data

+ More information needed
+
+ ## Training procedure
+
  ### Training hyperparameters

  The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
  - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - num_epochs: 20
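+
+ A hedged sketch of how this list maps onto `TrainingArguments`; the `output_dir`
+ is a hypothetical name, the eval strategy is assumed from the per-epoch
+ validation losses below, and the Adam betas/epsilon above are the defaults:
+
+ ```python
+ from transformers import TrainingArguments
+
+ # Mirrors the hyperparameters listed above; everything else stays at defaults.
+ training_args = TrainingArguments(
+     output_dir="flisol-cba-martin-fierro",  # hypothetical output path
+     learning_rate=2e-05,
+     per_device_train_batch_size=16,
+     per_device_eval_batch_size=16,
+     seed=42,
+     lr_scheduler_type="linear",
+     num_train_epochs=20,
+     eval_strategy="epoch",  # assumed: one validation pass per epoch
+ )
+ ```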
  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
+ | 5.1028 | 1.0 | 9 | 4.3257 |
+ | 4.2296 | 2.0 | 18 | 4.1607 |
+ | 3.983 | 3.0 | 27 | 4.0513 |
+ | 3.838 | 4.0 | 36 | 3.9989 |
+ | 3.6462 | 5.0 | 45 | 3.9705 |
+ | 3.5612 | 6.0 | 54 | 3.9456 |
+ | 3.432 | 7.0 | 63 | 3.9310 |
+ | 3.3604 | 8.0 | 72 | 3.9196 |
+ | 3.2739 | 9.0 | 81 | 3.9135 |
+ | 3.2296 | 10.0 | 90 | 3.9082 |
+ | 3.1513 | 11.0 | 99 | 3.9078 |
+ | 3.0913 | 12.0 | 108 | 3.9057 |
+ | 3.054 | 13.0 | 117 | 3.9072 |
+ | 2.9832 | 14.0 | 126 | 3.9052 |
+ | 2.9653 | 15.0 | 135 | 3.9060 |
+ | 2.9376 | 16.0 | 144 | 3.9050 |
+ | 2.9133 | 17.0 | 153 | 3.9070 |
+ | 2.917 | 18.0 | 162 | 3.9082 |
+ | 2.8816 | 19.0 | 171 | 3.9093 |
+ | 2.8821 | 20.0 | 180 | 3.9095 |

  ### Framework versions

+ - Transformers 4.45.2
+ - Pytorch 2.3.1+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
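+
+ A quick way to check a local environment against these pins (an illustrative
+ sketch, not part of the repository):
+
+ ```python
+ import datasets
+ import tokenizers
+ import torch
+ import transformers
+
+ # Compare installed versions with the ones this model was trained with.
+ print(transformers.__version__)  # expected: 4.45.2
+ print(torch.__version__)         # expected: 2.3.1+cu121
+ print(datasets.__version__)      # expected: 3.0.1
+ print(tokenizers.__version__)    # expected: 0.20.1
+ ```
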
config.json CHANGED
@@ -34,7 +34,7 @@
  }
  },
  "torch_dtype": "float32",
- "transformers_version": "4.28.1",
+ "transformers_version": "4.45.2",
  "use_cache": true,
  "vocab_size": 50257
  }
generation_config.json CHANGED
@@ -2,5 +2,5 @@
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
- "transformers_version": "4.28.1"
+ "transformers_version": "4.45.2"
  }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d2fc09705329e56a6b091c5efc42089a80bd7d60e3c52dcff0233bbd4aad9693
+ size 497774208
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:402f127db0f19abbb4782498dd89f615bb96f958fb37ddfd8087acf6cc097fe4
- size 3579
+ oid sha256:e4cc7667595ff2e035e6b2dc82ecaef1ab3c1052c8f8e014a3c431e68e19a99e
+ size 5176