Merge branch 'main' of hf.co:crscardellino/flisol-cba-martin-fierro
Changed files:
- README.md (+42 -77)
- config.json (+1 -1)
- generation_config.json (+1 -1)
- model.safetensors (+3 -0)
- special_tokens_map.json (+28 -4)
- tokenizer.json (+0 -0)
- tokenizer_config.json (+106 -32)
- training_args.bin (+2 -2)
README.md
CHANGED
@@ -1,112 +1,77 @@
 ---
-
-
-license:
+base_model: DeepESP/gpt2-spanish
+library_name: transformers
+license: mit
 tags:
 - generated_from_trainer
 model-index:
 - name: flisol-cba-martin-fierro
   results: []
-widget:
-- text: "Aqui me pongo a cantar"
-  example_title: "Martin Fierro"
 ---

-Model for the talk ["Hugging Face: IA Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/)
-at the [FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina, 2023.
-
-You can clone the repository with:
-
-    $ git clone https://huggingface.co/crscardellino/flisol-cba-martin-fierro
-
-Then create the environment and install the requirements:
-
-    $ python -m venv flisol-venv
-    $ source ./flisol-venv/bin/activate
-    (flisol-venv) $ pip install -r requirements.txt
-
-The code is tested with Python 3.10, but it should work with Python >=
-3.8. The requirements are set up to install
-[PyTorch](https://pytorch.org/) v2.0.0 for CPU, but you can adjust them to
-use GPUs, provided you meet the CUDA requirements.
-
-## License
-
-flisol-cba-martin-fierro
-Copyright (C) 2023 Cristian Cardellino
-
-This program is free software: you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation, either version 3 of the License, or
-(at your option) any later version.
-
-This program is distributed in the hope that it will be useful,
-but WITHOUT ANY WARRANTY; without even the implied warranty of
-MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-GNU General Public License for more details.
-
-You should have received a copy of the GNU General Public License
-along with this program. If not, see <https://www.gnu.org/licenses/>.
-
-## Model Specifications (Auto Generated)
-
-This model is a fine-tuned version of
-[DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on the
-`./data/martin-fierro_train.txt` dataset. It achieves the following results on
-the evaluation set:
-
-- Loss: 3.9067
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+
+# flisol-cba-martin-fierro
+
+This model is a fine-tuned version of [DeepESP/gpt2-spanish](https://huggingface.co/DeepESP/gpt2-spanish) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 3.9095

 ## Model description

-GPT-2 model for the poem ["El Gaucho Martín
-Fierro"](https://es.wikipedia.org/wiki/El_Gaucho_Mart%C3%ADn_Fierro)
+More information needed

 ## Intended uses & limitations

-Model for the talk ["Hugging Face: IA
-Colaborativa"](https://eventol.flisol.org.ar/events/cordoba2023/activity/378/) @
-[FLISoL de Córdoba](https://cordoba.flisol.org.ar), Argentina, 2023.
+More information needed

 ## Training and evaluation data

+More information needed
+
+## Training procedure
+
 ### Training hyperparameters

 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 16
+- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 20

 ### Training results

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-
-
-| 3.
-| 3.
-| 3.
-| 3.
-| 3.
-| 3.
-| 3.
-| 3.
+| 5.1028        | 1.0   | 9    | 4.3257          |
+| 4.2296        | 2.0   | 18   | 4.1607          |
+| 3.983         | 3.0   | 27   | 4.0513          |
+| 3.838         | 4.0   | 36   | 3.9989          |
+| 3.6462        | 5.0   | 45   | 3.9705          |
+| 3.5612        | 6.0   | 54   | 3.9456          |
+| 3.432         | 7.0   | 63   | 3.9310          |
+| 3.3604        | 8.0   | 72   | 3.9196          |
+| 3.2739        | 9.0   | 81   | 3.9135          |
+| 3.2296        | 10.0  | 90   | 3.9082          |
+| 3.1513        | 11.0  | 99   | 3.9078          |
+| 3.0913        | 12.0  | 108  | 3.9057          |
+| 3.054         | 13.0  | 117  | 3.9072          |
+| 2.9832        | 14.0  | 126  | 3.9052          |
+| 2.9653        | 15.0  | 135  | 3.9060          |
+| 2.9376        | 16.0  | 144  | 3.9050          |
+| 2.9133        | 17.0  | 153  | 3.9070          |
+| 2.917         | 18.0  | 162  | 3.9082          |
+| 2.8816        | 19.0  | 171  | 3.9093          |
+| 2.8821        | 20.0  | 180  | 3.9095          |

 ### Framework versions

-- Transformers 4.
-- Pytorch 2.
-- Datasets
-- Tokenizers 0.
+- Transformers 4.45.2
+- Pytorch 2.3.1+cu121
+- Datasets 3.0.1
+- Tokenizers 0.20.1
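Since the regenerated card drops the old usage snippet, a quick way to sanity-check the updated weights is the standard `transformers` pipeline. This is a minimal sketch, not part of the commit; it assumes `transformers` and `torch` are installed, and reuses the repo id plus the widget prompt from the previous card:

```python
# Minimal sketch, assuming transformers and torch are installed.
# The model id is this repository; the prompt is the widget example
# ("Aqui me pongo a cantar") from the previous model card.
from transformers import pipeline

generator = pipeline("text-generation", model="crscardellino/flisol-cba-martin-fierro")

# Generate a short continuation of the opening line of Martin Fierro.
output = generator("Aqui me pongo a cantar", max_new_tokens=50, do_sample=True)
print(output[0]["generated_text"])
```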
config.json
CHANGED
@@ -34,7 +34,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.
+  "transformers_version": "4.45.2",
   "use_cache": true,
   "vocab_size": 50257
 }
generation_config.json
CHANGED
@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 50256,
   "eos_token_id": 50256,
-  "transformers_version": "4.
+  "transformers_version": "4.45.2"
 }
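The pinned `bos_token_id`/`eos_token_id` (50256, GPT-2's `<|endoftext|>`) are the defaults that `generate()` picks up for this model. A hedged sketch of how to inspect them:

```python
# Sketch: read the generation defaults shipped with this commit.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("crscardellino/flisol-cba-martin-fierro")
print(gen_config.bos_token_id, gen_config.eos_token_id)  # 50256 50256, per this diff
```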
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d2fc09705329e56a6b091c5efc42089a80bd7d60e3c52dcff0233bbd4aad9693
+size 497774208
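The three added lines are a Git LFS pointer, not the weights themselves: the ~497 MB `model.safetensors` blob is fetched from LFS storage, and `oid` is the SHA-256 of the real file. A small sketch (the local path is an assumption) to verify a downloaded copy against the pointer:

```python
# Sketch: check a downloaded model.safetensors against the LFS pointer above.
# The path is hypothetical; the digest and size are taken from the pointer.
import hashlib
from pathlib import Path

blob = Path("model.safetensors").read_bytes()  # e.g. after `git lfs pull`
assert len(blob) == 497774208, "size differs from the LFS pointer"
assert hashlib.sha256(blob).hexdigest() == (
    "d2fc09705329e56a6b091c5efc42089a80bd7d60e3c52dcff0233bbd4aad9693"
), "sha256 differs from the LFS pointer"
print("model.safetensors matches its LFS pointer")
```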
special_tokens_map.json
CHANGED
@@ -11,8 +11,32 @@
     "<|ax8|>",
     "<|ax9|>"
   ],
-  "bos_token": "<|endoftext|>",
-  "eos_token": "<|endoftext|>",
-  "pad_token": "<|endoftext|>",
-  "unk_token": "<|endoftext|>"
+  "bos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
 }
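The four role tokens (`bos`, `eos`, `pad`, `unk`) now carry full added-token metadata but still all resolve to GPT-2's `<|endoftext|>`. A hedged check with `AutoTokenizer`:

```python
# Sketch: confirm all four special-token roles resolve to <|endoftext|>.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("crscardellino/flisol-cba-martin-fierro")
print(tok.bos_token, tok.eos_token, tok.pad_token, tok.unk_token)
# expected: <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|>
```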
tokenizer.json
CHANGED
The diff for this file is too large to render. See the raw diff.
tokenizer_config.json
CHANGED
@@ -1,41 +1,115 @@
 {
   "add_bos_token": false,
   "add_prefix_space": false,
-  "bos_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
-  "eos_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<|talk|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<|ax1|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "<|ax2|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "3": {
+      "content": "<|ax3|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "4": {
+      "content": "<|ax4|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "5": {
+      "content": "<|ax5|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "6": {
+      "content": "<|ax6|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "7": {
+      "content": "<|ax7|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "8": {
+      "content": "<|ax8|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "9": {
+      "content": "<|ax9|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "50256": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": true,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
   },
+  "additional_special_tokens": [
+    "<|talk|>",
+    "<|ax1|>",
+    "<|ax2|>",
+    "<|ax3|>",
+    "<|ax4|>",
+    "<|ax5|>",
+    "<|ax6|>",
+    "<|ax7|>",
+    "<|ax8|>",
+    "<|ax9|>"
+  ],
+  "bos_token": "<|endoftext|>",
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|endoftext|>",
   "errors": "replace",
   "full_tokenizer_file": null,
   "model_max_length": 1000000000000000019884624838656,
-  "pad_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  },
+  "pad_token": "<|endoftext|>",
   "tokenizer_class": "GPT2Tokenizer",
-  "unk_token": {
-    "__type": "AddedToken",
-    "content": "<|endoftext|>",
-    "lstrip": false,
-    "normalized": true,
-    "rstrip": false,
-    "single_word": false
-  }
+  "unk_token": "<|endoftext|>"
 }
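The new `added_tokens_decoder` section makes the id-to-token mapping for the custom specials (`<|talk|>` and `<|ax1|>` through `<|ax9|>` at ids 0-9, `<|endoftext|>` at 50256) explicit in the config rather than implicit in `tokenizer.json`. A sketch of what that mapping looks like at runtime:

```python
# Sketch: ids 0-9 decode to the custom special tokens declared in
# added_tokens_decoder; skip_special_tokens=True drops them entirely.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("crscardellino/flisol-cba-martin-fierro")
print(tok.convert_ids_to_tokens(list(range(10))))
# expected: ['<|talk|>', '<|ax1|>', ..., '<|ax9|>']
print(repr(tok.decode([0, 1, 50256], skip_special_tokens=True)))  # ''
```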
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:e4cc7667595ff2e035e6b2dc82ecaef1ab3c1052c8f8e014a3c431e68e19a99e
+size 5176
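`training_args.bin` holds the `TrainingArguments` object serialized by the `Trainer` (hence the small ~5 kB size). Its values should match the hyperparameters listed in the README; the following is a hedged reconstruction, where `output_dir` is hypothetical and mapping `train_batch_size`/`eval_batch_size` to the per-device arguments is an assumption:

```python
# Sketch: TrainingArguments equivalent to the hyperparameters in the README.
# The authoritative copy is the serialized object in training_args.bin;
# output_dir here is hypothetical.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="flisol-cba-martin-fierro",  # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=20,
    lr_scheduler_type="linear",
    seed=42,
)
```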