---
datasets:
- shuaishuaicdp/GUI-World
language:
- en
license: cc-by-4.0
metrics:
- bertscore
- LLM-as-a-Judge
tags:
- gui
- agent
pipeline_tag: video-text-to-text
---

GUI-Vid is the first VideoLLM with strong GUI-oriented capabilities, retrained on the [GUI-World](https://gui-world.github.io) dataset. See the [GitHub repository](https://github.com/Dongping-Chen/GUI-World) for instructions on using GUI-Vid for GUI understanding tasks.