---
datasets:
  - shuaishuaicdp/GUI-World
language:
  - en
license: cc-by-4.0
metrics:
  - bertscore
  - LLM-as-a-Judge
tags:
  - gui
  - agent
pipeline_tag: video-text-to-text
---

GUI-Vid is the first VideoLLM with powerful GUI-oriented capabilities, retrained on the GUI-World dataset (`shuaishuaicdp/GUI-World`). See the GitHub repository for how to use GUI-Vid for GUI understanding tasks.
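
The official usage instructions live in the GitHub repository and are not reproduced here. As a rough sketch only, the snippet below shows one way to fetch the GUI-World dataset named in this card's metadata, using `snapshot_download` from the `huggingface_hub` package. The helper names `snapshot_kwargs` and `download_gui_world` and the default `local_dir` are hypothetical, chosen for illustration; the repository layout and download size are not described in this card, so inspect the dataset page before pulling everything.

```python
from pathlib import Path

# Dataset repo id taken from this card's `datasets:` metadata.
DATASET_REPO_ID = "shuaishuaicdp/GUI-World"


def snapshot_kwargs(local_dir: str) -> dict:
    """Build keyword arguments for huggingface_hub.snapshot_download.

    Hypothetical helper: it only assembles arguments and performs no I/O.
    """
    return {
        "repo_id": DATASET_REPO_ID,
        "repo_type": "dataset",  # the repo is a dataset, not a model
        "local_dir": str(Path(local_dir)),
    }


def download_gui_world(local_dir: str = "GUI-World") -> str:
    """Download a snapshot of the dataset repo and return its local path.

    Hypothetical helper; requires network access and `pip install huggingface_hub`.
    """
    # Imported lazily so snapshot_kwargs works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**snapshot_kwargs(local_dir))
```

Calling `download_gui_world("data/gui_world")` would place the files under `data/gui_world`. Running GUI-Vid itself on the downloaded videos still requires the code from the GitHub repository.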