Guanzhou Ke's picture

1 79 5

Guanzhou Ke

guanzhouk

·

Guanzhou-Ke

AI & ML interests

Multi-modal learning

Recent Activity

upvoted a paper 4 days ago

Qwen2.5-VL Technical Report

liked a dataset 6 days ago

lmms-lab/LLaVA-Video-178K

upvoted a paper 6 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

View all activity

Organizations

None yet

guanzhouk's activity

upvoted a paper 4 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 5 days ago • 139

upvoted a paper 6 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published 8 days ago • 134

upvoted a paper 8 days ago

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published 11 days ago • 38

upvoted 2 papers 13 days ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published 30 days ago • 62

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 62

upvoted a paper 21 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 24 days ago • 106

upvoted 2 papers 30 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 331

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 97

upvoted 3 papers about 1 month ago

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though

Paper • 2501.04682 • Published Jan 8 • 90

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 257

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 90

upvoted 2 papers about 2 months ago

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Paper • 2412.21187 • Published Dec 30, 2024 • 39

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Paper • 2412.20070 • Published Dec 28, 2024 • 46

upvoted 7 papers 2 months ago

Parallelized Autoregressive Visual Generation

Paper • 2412.15119 • Published Dec 19, 2024 • 51

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 346

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 41

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 92

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 92

Analyzing The Language of Visual Tokens

Paper • 2411.05001 • Published Nov 7, 2024 • 24

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 140