Concept Steerers: Leveraging K-Sparse Autoencoders for Controllable Generations Paper • 2501.19066 • Published 12 days ago • 10
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought Paper • 2501.04682 • Published Jan 8 • 89
Position Information Emerges in Causal Transformers Without Positional Encodings via Similarity of Nearby Embeddings Paper • 2501.00073 • Published Dec 30, 2024 • 1
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models Paper • 2412.07171 • Published Dec 10, 2024 • 1
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding Paper • 2501.00712 • Published Jan 1 • 6
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing Paper • 2501.00658 • Published Dec 31, 2024 • 7
Revisiting In-Context Learning with Long Context Language Models Paper • 2412.16926 • Published Dec 22, 2024 • 30
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published Dec 23, 2024 • 30
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization Paper • 2412.17739 • Published Dec 23, 2024 • 40
Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published Nov 26, 2024 • 49
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference Paper • 2403.09636 • Published Mar 14, 2024 • 2
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 51
Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study Paper • 2401.17981 • Published Jan 31, 2024 • 1
What Algorithms can Transformers Learn? A Study in Length Generalization Paper • 2310.16028 • Published Oct 24, 2023 • 2