---
base_model: inceptionai/jais-adapted-7b-chat
language:
- ar
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
datasets:
- Wajdi1976/Tunisian_Derja_Dataset
library_name: transformers
---

## Model Overview

Labess is an open, instruction-tuned model for Tunisian Derja. It was produced by continual pre-training of jais-adapted-7b-chat on Tunisian_Derja_Dataset.

## Uploaded model

- **Developed by:** Wajdi1976
- **License:** apache-2.0
- **Finetuned from model:** inceptionai/jais-adapted-7b-chat

## Usage

The snippets below show how to get started quickly with the model. First, install the Unsloth library:

```sh
pip install unsloth
```

### First, load the model

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 128  # Choose any value; Unsloth handles RoPE scaling internally.
dtype = None  # None for auto-detection. Float16 for Tesla T4/V100, Bfloat16 for Ampere+.
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Wajdi1976/Labess",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
```

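The `dtype` comment above maps GPU generations to precisions. As a rough sketch of that rule (not Unsloth's own auto-detection logic, which is not shown in this card), the hypothetical helper `pick_dtype_name` picks a dtype name from a CUDA compute capability. It assumes that Ampere and newer GPUs report a major version of 8 or higher, while the Tesla T4 (7.5) and V100 (7.0) report 7:

```python
def pick_dtype_name(compute_capability):
    """Return 'bfloat16' for Ampere or newer GPUs, else 'float16'.

    `compute_capability` is a (major, minor) tuple. This helper is
    illustrative only and is not part of Unsloth or Transformers.
    """
    major, _minor = compute_capability
    return "bfloat16" if major >= 8 else "float16"


print(pick_dtype_name((7, 5)))  # Tesla T4
print(pick_dtype_name((8, 0)))  # A100 (Ampere)
```

On a CUDA machine, `torch.cuda.get_device_capability()` returns such a `(major, minor)` tuple, and `getattr(torch, pick_dtype_name(...))` gives a value that can be passed as `dtype`. Leaving `dtype = None` remains the simpler option.
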
### Second, try the model

```python
# Prompt template (Arabic). In English: "You may answer only in the Tunisian dialect.
# Complete the conversation below between [|Human|] and [|AI|]:"
prompt_ar = " يمكنك الإجابة باللهجة التونسية فقط.\n\nأكمل المحادثة أدناه بين [|Human|] و [|AI|]:\n### Input: [|Human|] {Question}\n### Response: [|AI|]"
device = "cuda" if torch.cuda.is_available() else "cpu"
FastLanguageModel.for_inference(model)  # Switch Unsloth to inference mode.

# Fall back to the EOS token when the tokenizer defines no padding token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token


def get_response(text, tokenizer=tokenizer, model=model):
    tokenized = tokenizer(text, return_tensors="pt")
    input_ids = tokenized["input_ids"].to(device)
    attention_mask = tokenized["attention_mask"].to(device)
    input_len = input_ids.shape[-1]
    generate_ids = model.generate(
        input_ids,
        attention_mask=attention_mask,
        top_p=1,
        temperature=0.3,
        max_length=128,  # Counts prompt tokens plus generated tokens.
        min_length=input_len + 4,
        repetition_penalty=1.2,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
    )
    response = tokenizer.batch_decode(
        generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True
    )[0]
    # Keep only the text after the response marker used in the prompt template.
    response = response.split("### Response:")[-1].replace("[|AI|]", "", 1).strip()
    return response


# In English: "What do we mean when we say 'labess'?"
ques = " آش نقصدو كي نقولو لاباس"
text = prompt_ar.format_map({"Question": ques})
print(get_response(text))
```

- Response: لا باس معناها اللي الشخص موشي في مشكلة ولا مش مرتاح من الموضوع كيفاش نجم نعاونك باش تفهمو خير كان عندك تفاصيل أكثر على الوضعية والا السؤال متاعك تحب نساعدك بشوية سؤال آخر توة نهارك زين شكرا برشا عالمساعدة متاعيمحبت نقلب حاجة أخرى برك الله يباركفي هالمحادثة استعمل
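
The prompt construction and response parsing from the snippet above can also be factored into two small helpers that run without loading the model, which makes the template easy to check in isolation. This is a minimal sketch: `build_prompt`, `extract_response`, and the constant names are hypothetical, and it assumes the decoded output still contains the `### Response:` marker from the template.

```python
# Same template as in the usage snippet, split across lines for readability.
PROMPT_AR = (
    " يمكنك الإجابة باللهجة التونسية فقط.\n\n"
    "أكمل المحادثة أدناه بين [|Human|] و [|AI|]:\n"
    "### Input: [|Human|] {Question}\n"
    "### Response: [|AI|]"
)
RESPONSE_MARKER = "### Response:"
AI_TAG = "[|AI|]"


def build_prompt(question):
    """Insert the user's question into the prompt template."""
    return PROMPT_AR.format_map({"Question": question})


def extract_response(decoded):
    """Return the text after the last response marker, without the AI tag.

    If the marker is missing, the whole decoded string is returned stripped.
    """
    tail = decoded.split(RESPONSE_MARKER)[-1].strip()
    if tail.startswith(AI_TAG):
        tail = tail[len(AI_TAG):]
    return tail.strip()


prompt = build_prompt("آش نقصدو كي نقولو لاباس")
# Simulate a decoded generation: the prompt followed by the model's answer.
print(extract_response(prompt + " لا باس"))
```

Matching the marker exactly as it appears in the template (no space before the colon) matters here: if the marker string differs, `split` finds nothing and the full prompt is returned along with the answer.
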

## Citations

When using the **Tunisian Derja Dataset**, please cite:

```bibtex
@model{linagora2025LLM-tn,
  author = {Wajdi Ghezaiel and Jean-Pierre Lorré},
  title = {Labess: Tunisian Derja LLM},
  year = {2025},
  month = {January},
  url = {https://huggingface.co/datasets/Wajdi1976/Labess}
}
```

This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)