32 96 11

Byung-Kwan Lee

BK-Lee

https://sites.google.com/view/byungkwanlee

AI & ML interests

Computer Vision, Machine Learning, Vision Language Models

BK-Lee's activity

upvoted a paper 4 days ago

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 8 days ago • 77

upvoted a paper 15 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 22 days ago • 163

upvoted a paper about 1 month ago

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Paper • 2501.14818 • Published Jan 20 • 4

New activity in nvidia/Eagle2-9B about 1 month ago

Deepspeed ZeRO3 Compatible Issue

#4 opened about 1 month ago by BK-Lee

upvoted a paper about 2 months ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published Jan 23 • 44

liked a Space about 2 months ago

662

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection

upvoted 5 papers about 2 months ago

SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Paper • 2501.13200 • Published Jan 22 • 65

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 103

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22 • 85

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 346

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Paper • 2501.12368 • Published Jan 21 • 42

New activity in nvidia/Eagle2-9B about 2 months ago

For training

#2 opened about 2 months ago by BK-Lee

liked a model about 2 months ago

nvidia/Eagle2-9B

Image-Text-to-Text • Updated Jan 28 • 1k • 45

New activity in nvidia/Eagle2-9B about 2 months ago

Version Crash for Qwen2 from Transformers

#1 opened about 2 months ago by BK-Lee

upvoted a collection about 2 months ago

Eagle 2

Collection

Eagle 2 is a family of frontier vision-language models with vision-centric design. The model supports 4K HD input, long-context video, and grounding. • 9 items • Updated Jan 23 • 31

upvoted 2 papers about 2 months ago

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 106

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Paper • 2501.08326 • Published Jan 14 • 32

upvoted a collection about 2 months ago

Multimodal LLM

Collection

172 items • Updated 6 days ago • 14

upvoted 2 papers about 2 months ago

GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI

Paper • 2411.14522 • Published Nov 21, 2024 • 34

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Paper • 2501.09755 • Published Jan 16 • 34