---
base_model: facebook/detr-resnet-50
tags:
- object-detection
---

This model detects the screen area in app introduction images so that it can be cropped out. It is based on `facebook/detr-resnet-50`.

The code and dataset are available at https://github.com/Jl-wei/guing

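Usage with the `transformers` pipeline:

```py
from PIL import Image
from transformers import pipeline

# device=0 selects the first GPU; use device=-1 to run on CPU.
detector = pipeline("object-detection", model="Jl-wei/app-intro-img-cropper", device=0)

img_path = "app_intro.png"  # placeholder: path to an app introduction image
image = Image.open(img_path)

# Keep only detections whose confidence score exceeds 0.5.
result = detector(image, threshold=0.5)
```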
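
The pipeline returns detections rather than cropped images, so the cropping step still has to be done by the caller. The following is a minimal sketch of that step, assuming the standard output format of the `transformers` object-detection pipeline (a list of dicts with `score`, `label`, and `box` keys, where `box` holds `xmin`, `ymin`, `xmax`, `ymax`). The helper name `crop_detections` and its `min_score` parameter are hypothetical and not part of this model or its repository.

```py
def crop_detections(image, detections, min_score=0.5):
    """Crop every detected region out of `image`.

    `image` is a PIL image (or any object with a PIL-style
    `crop((left, upper, right, lower))` method).
    `detections` is the list returned by the object-detection pipeline.
    This is an illustrative helper, not the authors' implementation.
    """
    crops = []
    # Process the highest-confidence detections first.
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if det["score"] < min_score:
            continue  # skip low-confidence boxes
        box = det["box"]
        crops.append(
            image.crop((box["xmin"], box["ymin"], box["xmax"], box["ymax"]))
        )
    return crops
```

With the variables from the snippet above, `crop_detections(image, result)` would return the cropped screen areas as a list ordered by descending score, and each item can be saved with `save(...)` as usual for PIL images.
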

This model is the app introduction image cropper described in the following paper:

```bibtex
@article{wei2024guing,
  author = {Wei, Jialiang and Courbis, Anne-Lise and Lambolais, Thomas and Xu, Binbin and Bernard, Pierre Louis and Dray, G\'{e}rard and Maalej, Walid},
  title = {GUing: A Mobile GUI Search Engine using a Vision-Language Model},
  year = {2025},
  volume = {34},
  number = {4},
  doi = {10.1145/3702993},
  journal = {ACM Trans. Softw. Eng. Methodol.},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA}
}
```

Please note that this model may be used for academic purposes only.