---
license: mit
datasets:
- arbml/arabic_100k_reviews
language:
- ar
- en
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
tags:
- fine-tuning-bert-arabic
- fine-tuning-bert-sentiment-analysis
- sentiment-analysis
- text-classification
- ktrain-library
---

# Fine-Tuned Arabic Sentiment Analysis with BERT 🚀

This repository contains a fine-tuned **BERT** model for sentiment analysis of Arabic reviews. The model is trained on the **[Arabic 100k Reviews](https://www.kaggle.com/datasets/abedkhooli/arabic-100k-reviews)** dataset and classifies reviews into three sentiment categories: **Positive**, **Negative**, and **Mixed**.

## Author 🧑‍💻

**Khaled Soudy**

GitHub: [khaledsoudy-1](https://github.com/khaledsoudy-1)

---

## Source Code 💻

You can find the source code and full implementation of this project in my [GitHub repository](https://github.com/khaledsoudy-1/FineTuning-BERT-Arabic-Sentiment/tree/main). The repository contains the Google Colab notebook, dataset, and scripts used to fine-tune the model for Arabic sentiment analysis.

---

## How to Use the Model

### 1. Install Required Libraries

Make sure you have the **transformers** and **tensorflow** libraries installed:

```bash
pip install transformers
```

```bash
pip install tensorflow
```

### 2. Load the Fine-Tuned Model

You can load the fine-tuned model and tokenizer directly from Hugging Face using the following code:

```python
from transformers import TFBertForSequenceClassification, BertTokenizer

# Load model and tokenizer from Hugging Face
model_name = "khaledsoudy/arabic-sentiment-bert-model"

# Load model
model = TFBertForSequenceClassification.from_pretrained(model_name)

# Load tokenizer
tokenizer = BertTokenizer.from_pretrained(model_name)
```

### 3. Use the Model for Prediction

To run sentiment analysis on an Arabic text, follow these steps:

```python
import tensorflow as tf

# Sample Arabic text for sentiment prediction
text = "الفندق رائع و الخدمة ممتازة"  # "The hotel is wonderful and the service is excellent"

# Tokenize the input text
inputs = tokenizer(text, return_tensors="tf")

# Get the model's prediction
outputs = model(**inputs)

# Get the predicted sentiment (assuming 3 classes: Positive, Negative, Mixed)
predicted_class = tf.argmax(outputs.logits, axis=-1).numpy()

# Map the predicted class index to sentiment labels
sentiment_labels = ['Mixed', 'Negative', 'Positive']
print(f"Predicted sentiment: {sentiment_labels[predicted_class[0]]}")
```

A batched variant that also reports class probabilities is sketched in the appendix below.

### 4. Input Format

The model expects raw Arabic text. For better results, preprocess the text to remove diacritics and other unnecessary characters before tokenization; a minimal cleaning sketch is included in the appendix below.

### 5. Sentiment Labels

The model classifies the sentiment into three categories:

- **Positive** 🌟
- **Negative** 😠
- **Mixed** 🤔

## Model Details

- **Model Name:** `khaledsoudy/arabic-sentiment-bert-model`
- **Model Type:** `TFBertForSequenceClassification`
- **Language:** Arabic
- **Sentiment Classes:** Positive, Negative, Mixed

## How to Fine-Tune This Model

You can fine-tune this model further on your own dataset. A plain TensorFlow/Keras sketch is included in the appendix below, and the source code and notebooks on my GitHub walk through the full process step by step.

## License 📜

This model is licensed under the MIT License.

## Acknowledgments 🙏

- **Hugging Face** for providing the platform to host models.
- **Google BERT** for the pre-trained model.
- **Kaggle** for the **Arabic 100k Reviews** dataset.
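---

## Appendix: Extra Examples 🧪

### Batch Prediction with Probabilities

The snippet in step 3 classifies a single review. The sketch below (my addition, not part of the original notebook) scores a small batch and reports softmax probabilities. It assumes `model` and `tokenizer` are already loaded as in step 2; the sample texts are hypothetical, and the `['Mixed', 'Negative', 'Positive']` label order is the same assumption made in step 3.

```python
import tensorflow as tf

# Illustrative reviews (hypothetical inputs, not taken from the dataset)
texts = [
    "الفندق رائع و الخدمة ممتازة",   # "The hotel is wonderful and the service is excellent"
    "الغرفة سيئة و الاستقبال أسوأ",  # "The room is bad and the reception is even worse"
]

# Pad/truncate so the batch has a uniform shape
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="tf")

# Forward pass; logits have shape (batch_size, 3)
outputs = model(**inputs)

# Softmax turns logits into class probabilities
probs = tf.nn.softmax(outputs.logits, axis=-1).numpy()

# Same assumed label order as in step 3
sentiment_labels = ['Mixed', 'Negative', 'Positive']
for text, p in zip(texts, probs):
    print(f"{sentiment_labels[p.argmax()]} ({p.max():.2f}): {text}")
```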
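### Cleaning Arabic Text Before Tokenization

As noted in step 4, stripping diacritics can help. This is a minimal sketch using the standard Unicode ranges for the Arabic harakat plus the tatweel character; `clean_arabic` is a hypothetical helper of mine, not part of the model or the original repository.

```python
import re

# Arabic diacritics (harakat), small high marks, superscript alef, and tatweel
DIACRITICS = re.compile(r"[\u0617-\u061A\u064B-\u0652\u0670\u0640]")

def clean_arabic(text: str) -> str:
    """Strip diacritics/tatweel and collapse repeated whitespace."""
    text = DIACRITICS.sub("", text)
    return re.sub(r"\s+", " ", text).strip()

print(clean_arabic("الخِدْمَةُ مُمْتَازَةٌ"))  # -> "الخدمة ممتازة" ("The service is excellent")
```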
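### Further Fine-Tuning Sketch

The original notebooks fine-tune with the **ktrain** library; the sketch below is instead a plain TensorFlow/Keras training loop. The tiny in-memory dataset, the label mapping (0 = Mixed, 1 = Negative, 2 = Positive), and the hyperparameters are all illustrative assumptions, not the values used to train this model.

```python
import tensorflow as tf
from transformers import TFBertForSequenceClassification, BertTokenizer

model_name = "khaledsoudy/arabic-sentiment-bert-model"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = TFBertForSequenceClassification.from_pretrained(model_name)

# Placeholder training data -- replace with your own reviews and labels
train_texts = [
    "الخدمة ممتازة",   # "The service is excellent"
    "المنتج سيئ جدا",  # "The product is very bad"
]
train_labels = [2, 1]  # assumed mapping: 0 = Mixed, 1 = Negative, 2 = Positive

# Tokenize and wrap in a tf.data pipeline
encodings = tokenizer(train_texts, padding=True, truncation=True,
                      max_length=128, return_tensors="tf")
dataset = tf.data.Dataset.from_tensor_slices(
    (dict(encodings), train_labels)).shuffle(100).batch(16)

# A small learning rate is the usual choice when fine-tuning BERT
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dataset, epochs=3)

# Save the further fine-tuned model and tokenizer
model.save_pretrained("./arabic-sentiment-bert-finetuned")
tokenizer.save_pretrained("./arabic-sentiment-bert-finetuned")
```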