- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 48
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 7
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 23
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 610
Collections including paper arxiv:2402.17753
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- miracl/miracl-corpus
  Viewer • Updated • 77.2M • 5.8k • 44
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 34
- How is ChatGPT's behavior changing over time?
  Paper • 2307.09009 • Published • 24

- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
  Paper • 2310.11954 • Published • 25
- Training Chain-of-Thought via Latent-Variable Inference
  Paper • 2312.02179 • Published • 11
- Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
  Paper • 2401.16158 • Published • 19
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 38

- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 64
- Learning To Teach Large Language Models Logical Reasoning
  Paper • 2310.09158 • Published • 1
- ChipNeMo: Domain-Adapted LLMs for Chip Design
  Paper • 2311.00176 • Published • 9
- WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
  Paper • 2308.09583 • Published • 7

- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 16
- AgentTuning: Enabling Generalized Agent Abilities for LLMs
  Paper • 2310.12823 • Published • 35
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
  Paper • 2308.10848 • Published • 1
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10