Guangxuan Xiao

Guangxuan-Xiao

http://guangxuanx.com

AI & ML interests

Efficient Machine Learning

Guangxuan-Xiao's activity

upvoted a collection 4 days ago

🧠 Reasoning datasets

Collection

Datasets with reasoning traces for math and code released by the community • 14 items • Updated 3 days ago • 102

authored a paper 21 days ago

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention

Paper • 2502.14866 • Published 22 days ago • 12

authored a paper 5 months ago

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7

upvoted a paper 5 months ago

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7

commented a paper 5 months ago

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Paper • 2410.10819 • Published Oct 14, 2024 • 7

updated 2 models 5 months ago

mit-han-lab/Llama-3-8B-Instruct-Gradient-4194k-w8a8kv4-per-channel

Updated Oct 9, 2024 • 18

mit-han-lab/Llama-3-8B-Instruct-Gradient-1048k-w8a8kv4-per-channel

Updated Oct 9, 2024 • 16

authored 5 papers 7 months ago

InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory

Paper • 2402.04617 • Published Feb 7, 2024 • 4

updated 2 models 8 months ago

Guangxuan-Xiao/fastcomposer-models

Updated Jul 22, 2024

Guangxuan-Xiao/cat_quatitative_imgs

Updated Jul 20, 2024

updated a dataset 8 months ago

Guangxuan-Xiao/cat_quatitative_imgs

Updated Jul 20, 2024 • 5

updated a model about 1 year ago

mit-han-lab/smoothquant-scales

Updated Feb 27, 2024

liked a model about 1 year ago

jinaai/jina-colbert-v1-en

Updated Jan 6 • 1.04k • 99

authored a paper about 1 year ago

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15, 2024 • 22

updated a model over 1 year ago

mit-han-lab/offsite-tuning

Updated Nov 27, 2023

authored a paper over 1 year ago

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Paper • 2211.10438 • Published Nov 18, 2022 • 4