---
library_name: transformers
license: apache-2.0
datasets:
- MohammedNasser/ARabic_Reasoning_QA
language:
- ar
metrics:
- accuracy
base_model: silma-ai/SILMA-9B-Instruct-v1.0
pipeline_tag: question-answering
---

# SILMA-9B-Instruct Fine-Tuned for Arabic Reasoning-QA


[![Generic badge](https://img.shields.io/badge/🤗-Hugging%20Face-blue.svg)](https://huggingface.co/MohammedNasser/silma_9b_instruct_ft)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-yellow.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/release/python-390/)

This model is a fine-tuned version of [silma-ai/SILMA-9B-Instruct-v1.0](https://huggingface.co/silma-ai/SILMA-9B-Instruct-v1.0), optimized for Arabic Question Answering tasks. It excels at providing numerical answers to a wide range of questions in Arabic.

## Model Description

This fine-tuned model is based on silma-ai/SILMA-9B-Instruct-v1.0 and is designed to answer reasoning questions in Arabic with integer-based answers. It has been fine-tuned on a custom Arabic Reasoning QA dataset, specifically tailored to questions ranging from easy to difficult across a variety of topics.

## Model Details

- **Model Name**: silma_9b_instruct_ft
- **Model Type**: Language Model
- **Language**: Arabic
- **Base Model**: silma-ai/SILMA-9B-Instruct-v1.0
- **Fine-Tuning Method**: PEFT with LoraConfig
- **Task**: Arabic Question Answering (Numerical Responses)
- **Training Data**: [Custom Arabic Reasoning QA dataset](https://huggingface.co/datasets/MohammedNasser/ARabic_Reasoning_QA)
- **Quantization**: 4-bit quantization using bitsandbytes

## Features

- Optimized for Arabic language understanding and generation
- Specialized in providing numerical answers to questions
- Efficient inference with 4-bit quantization (see the loading sketch below)
- Parameter-efficient fine-tuning via PEFT with `LoraConfig`
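
The Usage example further down loads the model in float16; for the 4-bit path highlighted above, here is a minimal loading sketch (assuming a CUDA GPU with `bitsandbytes` and `accelerate` installed; the NF4 quant type and compute dtype are common defaults, not published values for this model):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "MohammedNasser/silma_9b_instruct_ft"

# 4-bit quantization config; quant type and compute dtype are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```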

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.1356        | 0.04  | 10   | 1.4071          |
| 0.8079        | 0.08  | 20   | 0.2825          |
| 0.1592        | 0.12  | 30   | 0.1427          |
| 0.1202        | 0.16  | 40   | 0.1121          |
| 0.1095        | 0.2   | 50   | 0.1071          |
| 0.1024        | 0.24  | 60   | 0.1036          |
| 0.0993        | 0.28  | 70   | 0.1002          |
| 0.091         | 0.32  | 80   | 0.0992          |
| 0.1096        | 0.36  | 90   | 0.0965          |
| 0.0943        | 0.4   | 100  | 0.0916          |
| 0.0882        | 0.44  | 110  | 0.0896          |
| 0.0853        | 0.48  | 120  | 0.0848          |
| 0.0767        | 0.52  | 130  | 0.0808          |
| 0.0778        | 0.56  | 140  | 0.0765          |
| 0.0698        | 0.6   | 150  | 0.0734          |
| 0.0784        | 0.64  | 160  | 0.0694          |
| 0.0648        | 0.68  | 170  | 0.0658          |
| 0.0797        | 0.72  | 180  | 0.0630          |
| 0.0591        | 0.76  | 190  | 0.0604          |
| 0.0557        | 0.8   | 200  | 0.0582          |
| 0.0567        | 0.84  | 210  | 0.0561          |
| 0.057         | 0.88  | 220  | 0.0534          |
| 0.0505        | 0.92  | 230  | 0.0515          |
| 0.0483        | 0.96  | 240  | 0.0482          |
| 0.0463        | 1.0   | 250  | 0.0463          |


### Training Metrics
[Training Loss on wandb 🔗](https://wandb.ai/mohnasgbr/huggingface/reports/train-loss-24-09-07-03-41-58---Vmlldzo5MjgxMTY4)


## Usage

Here's a quick example of how to use the model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

model_name = "MohammedNasser/silma_9b_instruct_ft"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Create pipeline
qa_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    return_full_text=False
)

# Example usage. The question asks: "If you have three cars and you
# sell one of them, how many cars will you have left?"
question = "إذا كان لديك ثلاث سيارات، وبعت واحدة منها، كم سيارة ستبقى لديك؟"
prompt = f"Question: {question}\nAnswer:"
response = qa_pipeline(prompt)[0]['generated_text']

print(f"Question: {question}")
print(f"Answer: {response}")
```
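
Because the base model is instruction-tuned, routing the question through the tokenizer's chat template may produce more reliable completions. A hedged variant of the example above (it assumes the tokenizer ships a chat template, which this card does not confirm):

```python
# Reuses `question`, `tokenizer`, and `qa_pipeline` from the example above.
messages = [{"role": "user", "content": question}]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response = qa_pipeline(chat_prompt)[0]["generated_text"]
print(f"Answer: {response}")
```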

## Performance

The model demonstrates strong performance on Arabic QA tasks, particularly on questions requiring numerical answers. Key metric:

- **Eval Loss**: 0.046

## Limitations

- The model is optimized for numerical answers and may not perform as well on open-ended questions.
- Performance may vary for dialects or regional variations of Arabic not well-represented in the training data.
- The model may occasionally generate incorrect numerical answers for very complex or ambiguous questions.

## Fine-tuning Details

The model was fine-tuned using the following configuration (sketched in code after the list):

- **LoRA Config**:
  - Alpha: 16
  - Dropout: 0.1
  - R: 4
- **Training Hyperparameters**:
  - Batch Size: 4
  - Learning Rate: 2e-4
  - Epochs: 3
- **Hardware**: 4 x NVIDIA A100 GPUs
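
A minimal sketch of how this configuration might be wired together with `peft` and `transformers`; the `target_modules` list, quantization settings, and other unlisted arguments are assumptions, not the original training script:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# Load the base model in 4-bit, as described under Model Details.
base_model = AutoModelForCausalLM.from_pretrained(
    "silma-ai/SILMA-9B-Instruct-v1.0",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",             # assumption
        bnb_4bit_compute_dtype=torch.float16,  # assumption
    ),
    device_map="auto",
)

# LoRA configuration from the list above; target_modules is an assumption.
lora_config = LoraConfig(
    r=4,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()

# Training hyperparameters from the list above.
training_args = TrainingArguments(
    output_dir="silma_9b_instruct_ft",
    per_device_train_batch_size=4,
    learning_rate=2e-4,
    num_train_epochs=3,
    fp16=True,
)
```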

---

Made with ❤️ by [M. N. Gaber/aiNarabic]