---
language: id
tags:
- bart
- id
license: mit
---

# Indonesia Recipe Ingredients Generator Model

**WARNING: hosted inference on Hugging Face might not run, since the tokenizer used is not a `transformers` tokenizer.**

Feel free to test the model [in this space](https://huggingface.co/spaces/haryoaw/id-recigen).

**Have fun generating ingredients!**

This model is fine-tuned to generate the ingredients of Indonesian food. It is one of my personal projects, done in my free time.

Basically, you give it the name of a food and it produces the ingredients for that food.

## Model

Data: [Indonesian Recipe Data on Kaggle](https://www.kaggle.com/datasets/canggih/indonesian-food-recipes)

Pre-trained Model: [IndoBART-v2](https://huggingface.co/indobenchmark/indobart-v2)

## How to use

This section covers how to load the tokenizer and the model.

### Tokenizer

Since the model is based on `indobart-v2`, it needs the IndoBART tokenizer.

First, install the tokenizer with `pip install indobenchmark-toolkit`.

After that, you can load the tokenizer:

```python
from indobenchmark.tokenization_indonlg import IndoNLGTokenizer

tokenizer = IndoNLGTokenizer.from_pretrained("haryoaw/id-recigen-bart")
```

**EDIT**:

It seems the tokenizer in the package is not the same as the one I used to fine-tune the model. There are some noticeable bugs, such as some subword tokens not being treated as subwords. Nevertheless, it still works!

### Model

The model can be loaded with `AutoModelForSeq2SeqLM`.

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("haryoaw/id-recigen-bart")
```

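
Once both objects are loaded, generation could look roughly like the sketch below. This is not taken from the original training or demo code: `generate_ingredients` is a hypothetical helper, the `max_length` and `num_beams` values are illustrative guesses, and it assumes the IndoNLG tokenizer accepts the standard `transformers` calls `encode(..., return_tensors="pt")` and `decode(..., skip_special_tokens=True)`.

```python
def generate_ingredients(model, tokenizer, food_name, max_length=128, num_beams=4):
    """Generate ingredient text for one food name (hypothetical helper).

    `food_name` is expected to be lowercase already, because the
    tokenizer is case-sensitive.
    """
    # Assumption: the tokenizer follows the standard transformers
    # `encode` interface and can return a PyTorch tensor of token ids.
    input_ids = tokenizer.encode(food_name, return_tensors="pt")

    # Beam search decoding; the length limit and beam count are
    # illustrative defaults, not values taken from the model card.
    output_ids = model.generate(input_ids, max_length=max_length, num_beams=num_beams)

    # Decode the first (and only) generated sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `tokenizer` and `model` loaded as shown earlier, calling `generate_ingredients(model, tokenizer, "sayur asam")` should return the generated ingredient text. If the tokenizer rejects these arguments, its own encoding helpers may be needed instead.
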
## Input Example

Make sure to input a **LOWERCASE** food name. The tokenizer is case-sensitive!

```
sayur asam
```

```
nasi goreng ayam
```

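
Because the tokenizer is case-sensitive, it may help to normalize user input before passing it on. The helper below is a minimal sketch: `normalize_food_name` is a hypothetical name, and collapsing extra whitespace is an added assumption, since the model card only requires lowercase input.

```python
def normalize_food_name(name: str) -> str:
    """Lowercase a food name and collapse runs of whitespace (hypothetical helper)."""
    # lower() satisfies the lowercase requirement; split()/join() trims
    # the ends and reduces repeated spaces to a single space.
    return " ".join(name.lower().split())


print(normalize_food_name("  Nasi Goreng  Ayam "))  # prints "nasi goreng ayam"
```
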
~To be continued...