VQA evaluation on VLMs

Rank  Model                     Score
1     Qwen2.5-VL-7B (SFT)       60.5
2     Qwen2.5-VL-72B            40.7
3     Qwen2.5-VL-32B            38.2
4     GPT-4o                    29.8
5     Gemini-2.5-Pro            28.2
-     Random Chance             25.0
6     mPLUG-Owl3-7B             25.4
7     Gemini-2-Flash            24.9
8     LLaVA-OneVision-7B        24.7
9     LLaVA-Video-7B            24.1
10    Qwen2.5-VL-7B             22.3
11    InternVL2.5-26B           19.8
12    InternVL2.5-8B            16.7
13    InternLMXComposer2.5-7B   9.3
14    InternVideo2-Chat-8B      5.3
15    Tarsier-Recap-7B          4.8