crscardellino committed on
Commit 9516c78
2 Parent(s): 8cbe4b9 92f59d3

Merge branch 'main' of hf.co:crscardellino/flisol-cba-martin-fierro

README.md CHANGED
@@ -1,112 +1,77 @@
  ---
- language:
- - es
- license: gpl-3.0
  tags:
  - generated_from_trainer
  model-index:
  - name: flisol-cba-martin-fierro
    results: []
- widget:
- - text: "Aqui me pongo a cantar"
-   example_title: "Martin Fierro"
  ---

- Hugging Face: IA Colaborativa
- =============================
-
- This repository holds the code and model I trained for the talk
- ["Hugging Face: IA Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/)
- at the 2023 [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina.
-
- To get set up you need [`git-lfs`](https://git-lfs.com/) installed and
- activated.
-
- You can clone the repository with:
-
-     $ git clone https://huggingface.co/crscardellino/flisol-cba-martin-fierro
-
- Then create the environment and install the requirements:
-
-     $ python -m venv flisol-venv
-     $ source ./flisol-venv/bin/activate
-     (flisol-venv) $ pip install -r requirements.txt
-
- The code is tested with Python 3.10, but it should work with Python >= 3.8.
- The requirements are set up to install [PyTorch](https://pytorch.org/) v2.0.0
- for CPU, but you can adjust them to use GPUs provided you meet the CUDA
- requirements.
-
- ## License
-
- flisol-cba-martin-fierro
- Copyright (C) 2023 Cristian Cardellino
-
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org/licenses/>.
-
- ## Model Specifications (Auto Generated)
-
- This model is a fine-tuned version of
- [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on the
- `./data/martin-fierro_train.txt` dataset. It achieves the following results on
- the evaluation set:
-
- - Loss: 3.9067

  ## Model description

- GPT-2 model fine-tuned on the poem ["El Gaucho Martín
- Fierro"](https://es.wikipedia.org/wiki/El_Gaucho_Mart%C3%ADn_Fierro).

  ## Intended uses & limitations

- This was trained for the talk ["Hugging Face: IA
- Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/) @
- [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina, 2023.

  ## Training and evaluation data

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 2e-05
- - train_batch_size: 8
- - eval_batch_size: 8
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 10

  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 4.3864 | 1.0 | 18 | 4.2025 |
- | 3.948 | 2.0 | 36 | 4.0440 |
- | 3.7962 | 3.0 | 54 | 3.9804 |
- | 3.6105 | 4.0 | 72 | 3.9458 |
- | 3.4444 | 5.0 | 90 | 3.9280 |
- | 3.3855 | 6.0 | 108 | 3.9192 |
- | 3.3142 | 7.0 | 126 | 3.9091 |
- | 3.2192 | 8.0 | 144 | 3.9074 |
- | 3.1615 | 9.0 | 162 | 3.9070 |
- | 3.1637 | 10.0 | 180 | 3.9067 |

  ### Framework versions

- - Transformers 4.28.1
- - Pytorch 2.0.0+cpu
- - Datasets 2.11.0
- - Tokenizers 0.13.3
 
  ---
+ base_model: DeepESP/gpt2-spanish
+ library_name: transformers
+ license: mit
  tags:
  - generated_from_trainer
  model-index:
  - name: flisol-cba-martin-fierro
    results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # flisol-cba-martin-fierro

+ This model is a fine-tuned version of [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 3.9095

  ## Model description

+ More information needed

  ## Intended uses & limitations

+ More information needed

  ## Training and evaluation data

+ More information needed
+
+ ## Training procedure
+
  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - num_epochs: 20
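An editorial aside, not part of the committed card: in `transformers`, the hyperparameter list above maps roughly onto a `TrainingArguments` setup like the following sketch. Only the values listed in the card are real; `output_dir` is a placeholder, and the Adam betas/epsilon are spelled out explicitly even though they are the defaults.

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="flisol-cba-martin-fierro",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    adam_beta1=0.9,      # defaults, as reported in the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```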
 
  ### Training results

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
+ | 5.1028 | 1.0 | 9 | 4.3257 |
+ | 4.2296 | 2.0 | 18 | 4.1607 |
+ | 3.983 | 3.0 | 27 | 4.0513 |
+ | 3.838 | 4.0 | 36 | 3.9989 |
+ | 3.6462 | 5.0 | 45 | 3.9705 |
+ | 3.5612 | 6.0 | 54 | 3.9456 |
+ | 3.432 | 7.0 | 63 | 3.9310 |
+ | 3.3604 | 8.0 | 72 | 3.9196 |
+ | 3.2739 | 9.0 | 81 | 3.9135 |
+ | 3.2296 | 10.0 | 90 | 3.9082 |
+ | 3.1513 | 11.0 | 99 | 3.9078 |
+ | 3.0913 | 12.0 | 108 | 3.9057 |
+ | 3.054 | 13.0 | 117 | 3.9072 |
+ | 2.9832 | 14.0 | 126 | 3.9052 |
+ | 2.9653 | 15.0 | 135 | 3.9060 |
+ | 2.9376 | 16.0 | 144 | 3.9050 |
+ | 2.9133 | 17.0 | 153 | 3.9070 |
+ | 2.917 | 18.0 | 162 | 3.9082 |
+ | 2.8816 | 19.0 | 171 | 3.9093 |
+ | 2.8821 | 20.0 | 180 | 3.9095 |

  ### Framework versions

+ - Transformers 4.45.2
+ - Pytorch 2.3.1+cu121
+ - Datasets 3.0.1
+ - Tokenizers 0.20.1
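Neither version of the README shows inference code; as a minimal sketch, the fine-tuned model can be loaded with the standard `transformers` pipeline API, here prompted with the opening verse that the old card used as its widget example:

```python
# Minimal sketch: load the fine-tuned model from the Hub and sample from it.
from transformers import pipeline

generator = pipeline("text-generation", model="crscardellino/flisol-cba-martin-fierro")

# Prompt taken from the widget example in the previous model card.
print(generator("Aqui me pongo a cantar", max_new_tokens=50)[0]["generated_text"])
```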
config.json CHANGED
@@ -34,7 +34,7 @@
    }
  },
  "torch_dtype": "float32",
- "transformers_version": "4.28.1",
  "use_cache": true,
  "vocab_size": 50257
  }

    }
  },
  "torch_dtype": "float32",
+ "transformers_version": "4.45.2",
  "use_cache": true,
  "vocab_size": 50257
  }
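The only change here is the `transformers_version` stamp. As a sketch (the stamp from `config.json` is exposed on the loaded config object):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("crscardellino/flisol-cba-martin-fierro")
print(config.transformers_version)  # "4.45.2" as of this commit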
generation_config.json CHANGED
@@ -2,5 +2,5 @@
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
- "transformers_version": "4.28.1"
  }

  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
+ "transformers_version": "4.45.2"
  }
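Same version-stamp bump here. This file is what `GenerationConfig.from_pretrained` reads; a quick sketch of checking the token ids it declares:

```python
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("crscardellino/flisol-cba-martin-fierro")
print(gen_config.bos_token_id, gen_config.eos_token_id)  # 50256 50256
```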
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d2fc09705329e56a6b091c5efc42089a80bd7d60e3c52dcff0233bbd4aad9693
+ size 497774208
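What got committed is a git-lfs pointer (spec version, SHA-256, and size of the real file, ~498 MB), not the weights themselves. As a sketch, one way to verify a downloaded copy against the pointer:

```python
# Sketch: fetch the real weights and check them against the LFS pointer above.
import hashlib
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="crscardellino/flisol-cba-martin-fierro",
    filename="model.safetensors",
)

sha256 = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha256.update(chunk)

# Expected digest comes from the pointer committed here.
assert sha256.hexdigest() == "d2fc09705329e56a6b091c5efc42089a80bd7d60e3c52dcff0233bbd4aad9693"
```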
special_tokens_map.json CHANGED
@@ -11,8 +11,32 @@
    "<|ax8|>",
    "<|ax9|>"
  ],
- "bos_token": "<|endoftext|>",
- "eos_token": "<|endoftext|>",
- "pad_token": "<|endoftext|>",
- "unk_token": "<|endoftext|>"
  }

    "<|ax8|>",
    "<|ax9|>"
  ],
+ "bos_token": {
+   "content": "<|endoftext|>",
+   "lstrip": false,
+   "normalized": true,
+   "rstrip": false,
+   "single_word": false
+ },
+ "eos_token": {
+   "content": "<|endoftext|>",
+   "lstrip": false,
+   "normalized": true,
+   "rstrip": false,
+   "single_word": false
+ },
+ "pad_token": {
+   "content": "<|endoftext|>",
+   "lstrip": false,
+   "normalized": true,
+   "rstrip": false,
+   "single_word": false
+ },
+ "unk_token": {
+   "content": "<|endoftext|>",
+   "lstrip": false,
+   "normalized": true,
+   "rstrip": false,
+   "single_word": false
+ }
  }
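The change is purely in serialization: the special tokens are now stored as full `AddedToken` dicts rather than bare strings, with the same `<|endoftext|>` content. Loading the tokenizer should give the same mapping either way; a sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("crscardellino/flisol-cba-martin-fierro")

# All four special tokens still resolve to the same string as before.
print(tokenizer.special_tokens_map["bos_token"])  # <|endoftext|>
print(tokenizer.bos_token == tokenizer.eos_token == tokenizer.pad_token)  # True
```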
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -1,41 +1,115 @@
  {
    "add_bos_token": false,
    "add_prefix_space": false,
-   "bos_token": {
-     "__type": "AddedToken",
-     "content": "<|endoftext|>",
-     "lstrip": false,
-     "normalized": true,
-     "rstrip": false,
-     "single_word": false
-   },
-   "clean_up_tokenization_spaces": true,
-   "eos_token": {
-     "__type": "AddedToken",
-     "content": "<|endoftext|>",
-     "lstrip": false,
-     "normalized": true,
-     "rstrip": false,
-     "single_word": false
    },
    "errors": "replace",
    "full_tokenizer_file": null,
    "model_max_length": 1000000000000000019884624838656,
-   "pad_token": {
-     "__type": "AddedToken",
-     "content": "<|endoftext|>",
-     "lstrip": false,
-     "normalized": true,
-     "rstrip": false,
-     "single_word": false
-   },
    "tokenizer_class": "GPT2Tokenizer",
-   "unk_token": {
-     "__type": "AddedToken",
-     "content": "<|endoftext|>",
-     "lstrip": false,
-     "normalized": true,
-     "rstrip": false,
-     "single_word": false
-   }
  }

  {
    "add_bos_token": false,
    "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<|talk|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<|ax1|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "<|ax2|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<|ax3|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "4": {
+       "content": "<|ax4|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "5": {
+       "content": "<|ax5|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "6": {
+       "content": "<|ax6|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "7": {
+       "content": "<|ax7|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "8": {
+       "content": "<|ax8|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "9": {
+       "content": "<|ax9|>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50256": {
+       "content": "<|endoftext|>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
    },
+   "additional_special_tokens": [
+     "<|talk|>",
+     "<|ax1|>",
+     "<|ax2|>",
+     "<|ax3|>",
+     "<|ax4|>",
+     "<|ax5|>",
+     "<|ax6|>",
+     "<|ax7|>",
+     "<|ax8|>",
+     "<|ax9|>"
+   ],
+   "bos_token": "<|endoftext|>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<|endoftext|>",
    "errors": "replace",
    "full_tokenizer_file": null,
    "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<|endoftext|>",
    "tokenizer_class": "GPT2Tokenizer",
+   "unk_token": "<|endoftext|>"
  }
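The new `added_tokens_decoder` block is the serialization format used by recent Transformers releases: it maps token ids to their `AddedToken` definitions (ids 0-9 for `<|talk|>` and `<|ax1|>` through `<|ax9|>`, plus 50256 for `<|endoftext|>`). A sketch of inspecting them after loading:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("crscardellino/flisol-cba-martin-fierro")

# Ids 0-9 hold the custom special tokens declared in added_tokens_decoder.
print(tokenizer.convert_tokens_to_ids("<|talk|>"))  # 0
print(tokenizer.additional_special_tokens)  # ['<|talk|>', '<|ax1|>', ..., '<|ax9|>']
```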
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:402f127db0f19abbb4782498dd89f615bb96f958fb37ddfd8087acf6cc097fe4
- size 3579

  version https://git-lfs.github.com/spec/v1
+ oid sha256:e4cc7667595ff2e035e6b2dc82ecaef1ab3c1052c8f8e014a3c431e68e19a99e
+ size 5176
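`training_args.bin` is a pickled `TrainingArguments` object (hence the size change along with the new training setup). If you trust the repository, it can be unpickled with `torch.load`; a sketch (recent PyTorch requires opting out of `weights_only` for pickled objects):

```python
import torch

# training_args.bin is a pickled TrainingArguments; weights_only=False is
# required on recent PyTorch and is only safe for a repo you trust.
training_args = torch.load("training_args.bin", weights_only=False)
print(training_args.num_train_epochs)  # expected: 20, per the updated card
```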