Collections including paper arxiv:2403.13187

- Attention Is All You Need
  Paper • 1706.03762 • Published • 55
- Language Models are Few-Shot Learners
  Paper • 2005.14165 • Published • 13
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  Paper • 2201.11903 • Published • 11
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 73

- Can LLMs Follow Simple Rules?
  Paper • 2311.04235 • Published • 14
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 79
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 186
- Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
  Paper • 2402.17177 • Published • 88

- Ultra-Long Sequence Distributed Transformer
  Paper • 2311.02382 • Published • 6
- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 20
- Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
  Paper • 2311.02103 • Published • 21
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 15

- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 16
- AgentTuning: Enabling Generalized Agent Abilities for LLMs
  Paper • 2310.12823 • Published • 35
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
  Paper • 2308.10848 • Published • 1
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10