---
base_model: AI-Sweden-Models/gpt-sw3-6.7b-v2-instruct
language:
- sv
- da
- 'no'
- en
pipeline_tag: text-generation
tags:
- translation
---

# Model Card for gpt-sw3-6.7b-v2-translator-gguf

The `gpt-sw3-6.7b-v2-translator` is a fine-tuned version of `gpt-sw3-6.7b-v2-instruct`, trained on a carefully selected dataset of translation pairs gathered by AI Sweden.

## Intended usage:

Translate text from English to Swedish, or from Swedish to English.

|
## How to use:

The snippets below are Ollama `Modelfile` definitions. Replace `SIZE` in the filename with the quantization suffix of the file you downloaded (see Versions below).

Translate from English to Swedish:

```
FROM ./gpt-sw3-6-7b-v2-translator-SIZE.gguf
TEMPLATE "<|endoftext|><s>User: Översätt till Svenska från Engelska\n{{ .Prompt }}<s>Bot:"
PARAMETER stop <s>
```
|
Translate from Swedish to English:

```
FROM ./gpt-sw3-6-7b-v2-translator-SIZE.gguf
TEMPLATE "<|endoftext|><s>User: Översätt till Engelska från Svenska\n{{ .Prompt }}<s>Bot:"
PARAMETER stop <s>
```
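Independently of Ollama, the effect of the `TEMPLATE` and `stop` settings above can be reproduced in plain code. The sketch below is illustrative (the helper functions are not part of any released API): it assembles the exact prompt string the Modelfile template produces, and truncates a raw completion at the `<s>` stop sequence the way the `PARAMETER stop` setting would.

```python
# Sketch of the prompt format used by the translator Modelfiles, and of how
# the "<s>" stop parameter cuts the model's raw output. Helper names here
# are hypothetical, chosen for illustration only.

def build_prompt(text: str, direction: str = "en-sv") -> str:
    """Wrap user text in the chat template from the Modelfile TEMPLATE line."""
    instruction = {
        "en-sv": "Översätt till Svenska från Engelska",
        "sv-en": "Översätt till Engelska från Svenska",
    }[direction]
    return f"<|endoftext|><s>User: {instruction}\n{text}<s>Bot:"

def apply_stop(completion: str, stop: str = "<s>") -> str:
    """Truncate a raw completion at the first stop sequence."""
    return completion.split(stop, 1)[0]

prompt = build_prompt("The weather is nice today.")
print(prompt)
# A raw continuation may run past the answer; the stop sequence trims it:
print(apply_stop(" Vädret är fint idag.<s>User: next turn"))
# → " Vädret är fint idag."
```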
|

## Versions:

```
gpt-sw3-6-7b-v2-translator-q4.gguf
gpt-sw3-6-7b-v2-translator-q8.gguf
gpt-sw3-6-7b-v2-translator-f16.gguf
gpt-sw3-6-7b-v2-translator-f32.gguf
```
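As a rough guide to which file to download, the approximate size of each quantization can be estimated from the parameter count. The bits-per-weight figures below are ballpark assumptions (actual GGUF files vary with metadata and mixed-precision tensors), not measured sizes of these files:

```python
# Back-of-the-envelope GGUF size estimate for a 6.7B-parameter model.
# Bits-per-weight values are rough assumptions, not measured from these files.
PARAMS = 6.7e9
BITS_PER_WEIGHT = {"q4": 4.5, "q8": 8.5, "f16": 16.0, "f32": 32.0}

def estimate_gb(bits: float, params: float = PARAMS) -> float:
    """Approximate file size in GB: params * bits per weight / 8 bits per byte."""
    return params * bits / 8 / 1e9

for name, bits in BITS_PER_WEIGHT.items():
    print(f"gpt-sw3-6-7b-v2-translator-{name}.gguf ~ {estimate_gb(bits):.1f} GB")
```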
|

## Training & Data:

Training was done on one NVIDIA DGX node with DeepSpeed ZeRO 3, for three epochs on roughly 4 GB of carefully selected translation data. It is a full fine-tune of all model parameters.

|
| Epoch | Training Loss | Evaluation Loss |
|-------|---------------|-----------------|
| 1     | 1.309         | 1.281           |
| 2     | 1.161         | 1.242           |
| 3     | 1.053         | 1.219           |
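For reference, a full fine-tune with DeepSpeed ZeRO 3 is typically driven by a JSON configuration along these lines. This is a generic sketch of the relevant ZeRO stage-3 settings, not the configuration actually used to train this model:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```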