Safetensors
internvl_chat
custom_code

Improve model card: Add pipeline tag, library name, abstract, and links

#1
by nielsr - opened
Files changed (1)
  1. README.md +12 -4
README.md CHANGED
@@ -1,10 +1,12 @@
  ---
- license: apache-2.0
+ base_model:
+ - OpenGVLab/InternVL3-8B
  datasets:
  - Code2Logic/GameQA-140K
  - Code2Logic/GameQA-5K
- base_model:
- - OpenGVLab/InternVL3-8B
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---

  ***This model (GameQA-InternVL3-8B) results from training InternVL3-8B with GRPO solely on our [GameQA-5K](https://huggingface.co/datasets/Code2Logic/GameQA-5K) (sampled from the full [GameQA-140K](https://huggingface.co/datasets/Gabriel166/GameQA-140K) dataset).***
@@ -17,9 +19,15 @@ base_model:

  # Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

+ **Paper Abstract:**
+ Visual-language Chain-of-Thought (CoT) data resources are relatively scarce compared to text-only counterparts, limiting the improvement of reasoning capabilities in Vision Language Models (VLMs). However, high-quality vision-language reasoning data is expensive and labor-intensive to annotate. To address this issue, we leverage a promising resource: game code, which naturally contains logical structures and state transition processes. Therefore, we propose Code2Logic, a novel game-code-driven approach for multimodal reasoning data synthesis. Our approach leverages Large Language Models (LLMs) to adapt game code, enabling automatic acquisition of reasoning processes and results through code execution. Using the Code2Logic approach, we developed the GameQA dataset to train and evaluate VLMs. GameQA is cost-effective and scalable, offers controllable difficulty gradation and is diverse with 30 games and 158 tasks. Surprisingly, despite training solely on game data, VLMs demonstrated out of domain generalization, specifically Qwen2.5-VL-7B improving performance by 2.33% across 7 diverse vision-language benchmarks. Our code, dataset and models are available at this https URL .
+
+ This model was presented in the paper [Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning](https://huggingface.co/papers/2505.13886).
+ Code: [https://github.com/tongjingqi/Code2Logic](https://github.com/tongjingqi/Code2Logic)
+
  This is the first work, to the best of our knowledge, that leverages ***game code*** to synthesize multimodal reasoning data for ***training*** VLMs. Furthermore, when trained with a GRPO strategy solely on **GameQA** (synthesized via our proposed **Code2Logic** approach), multiple cutting-edge open-source models exhibit significantly enhanced out-of-domain generalization.

- [[πŸ“– Paper](https://arxiv.org/abs/2505.13886)] [[πŸ€— GameQA-140K Dataset](https://huggingface.co/datasets/Gabriel166/GameQA-140K)] [[πŸ€— GameQA-5K Dataset](https://huggingface.co/datasets/Code2Logic/GameQA-5K)] [[πŸ€— GameQA-InternVL3-8B](https://huggingface.co/Code2Logic/GameQA-InternVL3-8B) ] [[πŸ€— GameQA-Qwen2.5-VL-7B](https://huggingface.co/Code2Logic/GameQA-Qwen2.5-VL-7B)] [[πŸ€— GameQA-LLaVA-OV-7B](https://huggingface.co/Code2Logic/GameQA-llava-onevision-qwen2-7b-ov-hf) ]
+ [[πŸ€— GameQA-140K Dataset](https://huggingface.co/datasets/Gabriel166/GameQA-140K)] [[πŸ€— GameQA-5K Dataset](https://huggingface.co/datasets/Code2Logic/GameQA-5K)] [[πŸ€— GameQA-InternVL3-8B](https://huggingface.co/Code2Logic/GameQA-InternVL3-8B) ] [[πŸ€— GameQA-Qwen2.5-VL-7B](https://huggingface.co/Code2Logic/GameQA-Qwen2.5-VL-7B)] [[πŸ€— GameQA-LLaVA-OV-7B](https://huggingface.co/Code2Logic/GameQA-llava-onevision-qwen2-7b-ov-hf) ]

  <div align=center><img src="https://raw.githubusercontent.com/tongjingqi/Code2Logic/refs/heads/main/assets/categorized_30_games_images.png"></div>

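Since the PR adds `pipeline_tag: image-text-to-text` and `library_name: transformers`, and the repo carries the `custom_code` / `internvl_chat` tags, a loading example may help readers. The sketch below is a minimal, non-authoritative illustration: it assumes the checkpoint inherits the base OpenGVLab/InternVL3-8B custom code (`AutoModel` plus a `model.chat` method), which this card does not explicitly confirm; the repo id comes from the links above, and the question and generation settings are purely illustrative.

```python
# Minimal loading sketch. Assumption: GameQA-InternVL3-8B keeps InternVL3-8B's
# custom-code interface (AutoModel + model.chat), as suggested by the
# `custom_code` / `internvl_chat` tags. Generation settings are illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

path = "Code2Logic/GameQA-InternVL3-8B"  # repo id taken from the links in this card

model = AutoModel.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,  # needed because the architecture ships as custom code
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# Text-only round via the (assumed) InternVL chat API; pixel_values=None means
# no image is supplied for this turn.
generation_config = dict(max_new_tokens=512, do_sample=False)
question = "What does a valid move look like in Sokoban?"
response = model.chat(tokenizer, None, question, generation_config)
print(response)
```

For image inputs, the base model's card describes how to build `pixel_values` with its preprocessing helpers before calling `model.chat`; the same flow should apply here if the custom code is unchanged.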
33