Commit 609d191 by qianhuiwu · verified · 1 Parent(s): 1b198a2

update model card.

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -8,14 +8,14 @@ base_model:

  - [GUI-Actor-7B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2-VL)
  - [GUI-Actor-2B-Qwen2-VL](https://huggingface.co/microsoft/GUI-Actor-2B-Qwen2-VL)
- - [GUI-Actor-7B-Qwen2.5-VL](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL)
- - [GUI-Actor-3B-Qwen2.5-VL](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL)
+ - [GUI-Actor-7B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-7B-Qwen2.5-VL)
+ - [GUI-Actor-3B-Qwen2.5-VL (coming soon)](https://huggingface.co/microsoft/GUI-Actor-3B-Qwen2.5-VL)
  - [GUI-Actor-Verifier-2B](https://huggingface.co/microsoft/GUI-Actor-Verifier-2B)

- This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents** (Wu et al, 2025)](https://github.com/microsoft/GUI-Actor).
+ This model was introduced in the paper [**GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents**](https://aka.ms/GUI-Actor).
  It is developed based on [Qwen2-VL-7B-Instruct ](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct), augmented by an attention-based action head and finetuned to perform GUI grounding using the dataset [here (coming soon)]().

- For more details on model design and evaluation, please check the project page at [GUI-Actor](https://aka.ms/GUI-Actor).
+ For more details on model design and evaluation, please check: [🏠 Project Page](https://aka.ms/GUI-Actor) | [💻 Github Repo](https://github.com/microsoft/GUI-Actor) | [📑 Paper]().

  ## 📊 Performance Comparison on GUI Grounding Benchmarks
  Table 1. Main results on ScreenSpot-Pro, ScreenSpot, and ScreenSpot-v2 with **Qwen2-VL** as the backbone. † indicates scores obtained from our own evaluation of the official models on Huggingface.
@@ -118,7 +118,7 @@ print(f"Predicted click point: [{round(px, 4)}, {round(py, 4)}]")
  # Predicted click point: [0.9709, 0.1548]
  ```

- ## Citation
+ ## 📝 Citation
  ```
  @article{wu2025guiactor,
  title={GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents},