---
language: ar
license: other
tags:
- vision
- image-captioning
pipeline_tag: image-to-text
---

# 🦚 Peacock

🦚 Peacock is an InstructBLIP-based model that uses AraLLaMA as its language model. It was introduced in the paper [Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks](https://arxiv.org/abs/2403.01031).

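# How to use

The example below loads the model and its processor with 🤗 Transformers, then generates an Arabic description of an image fetched from a URL:
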
```python
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration
import torch
from PIL import Image
import requests

# Load the model and its processor from the Hugging Face Hub.
model = InstructBlipForConditionalGeneration.from_pretrained("UBC-NLP/Peacock")
processor = InstructBlipProcessor.from_pretrained("UBC-NLP/Peacock")

# Run on a GPU when one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Download an example image and convert it to RGB.
url = "https://upload.wikimedia.org/wikipedia/commons/8/83/Socotra_dragon_tree.JPG"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Arabic prompt meaning "Describe the image".
prompt = "اوصف الصوره"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)

# Beam search with sampling disabled; top_p and temperature
# only take effect when do_sample=True.
outputs = model.generate(
    **inputs,
    do_sample=False,
    num_beams=5,
    max_length=256,
    min_length=1,
    top_p=0.9,
    repetition_penalty=1.5,
    length_penalty=1.0,
    temperature=1,
)

# Decode the first (and only) generated sequence.
generated_text = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
print(generated_text)
```
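
The `generate` call above disables sampling (`do_sample=False`) but still passes `top_p` and `temperature`, which Transformers uses only when sampling is enabled. As a minimal sketch, the settings can be kept in a plain dictionary and the sampling-only entries filtered out before the call. The helper name `drop_sampling_only` and the constant `SAMPLING_ONLY_KEYS` are hypothetical and are not part of Transformers or the Peacock release.

```python
# Hypothetical helper for illustration; not part of transformers or Peacock.
# Keys from the example above that only matter when do_sample=True.
SAMPLING_ONLY_KEYS = {"top_p", "temperature"}


def drop_sampling_only(gen_kwargs):
    """Return a copy of gen_kwargs without sampling-only keys when sampling is off."""
    if gen_kwargs.get("do_sample", False):
        # Sampling is enabled, so every option is kept.
        return dict(gen_kwargs)
    return {k: v for k, v in gen_kwargs.items() if k not in SAMPLING_ONLY_KEYS}


# Same values as in the usage example above.
gen_kwargs = {
    "do_sample": False,
    "num_beams": 5,
    "max_length": 256,
    "min_length": 1,
    "top_p": 0.9,
    "repetition_penalty": 1.5,
    "length_penalty": 1.0,
    "temperature": 1,
}

cleaned = drop_sampling_only(gen_kwargs)
print(sorted(cleaned))
```

With the model and inputs from the example above, the filtered settings would then be passed as `model.generate(**inputs, **cleaned)`. The original dictionary is left unchanged, so switching `do_sample` to `True` restores the sampling options.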

# Citation

If you use this model, please cite the following paper:

```bibtex
@inproceedings{alwajih2024peacock,
  title = {Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks},
  author = {Alwajih, Fakhraddin and Nagoudi, El Moatez Billah and Bhatia, Gagan and Mohamed, Abdelrahman and Abdul-Mageed, Muhammad},
  booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages = {12753--12776},
  year = {2024},
  address = {Bangkok, Thailand},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.acl-long.689}
}
```