- Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 64
- DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 180
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper • 2401.01055 • Published • 54
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 27
Collections including paper arxiv:2401.01325

- Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Paper • 2401.01854 • Published • 11
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper • 2401.01055 • Published • 54
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 27
- Improving Text Embeddings with Large Language Models
Paper • 2401.00368 • Published • 80

- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper • 2312.15166 • Published • 58
- PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Paper • 2312.12456 • Published • 42
- Cached Transformers: Improving Transformers with Differentiable Memory Cache
Paper • 2312.12742 • Published • 14
- Mini-GPTs: Efficient Large Language Models through Contextual Pruning
Paper • 2312.12682 • Published • 10

- A survey on Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
Paper • 2009.10521 • Published • 1
- Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
Paper • 1910.02190 • Published • 1
- Learning Symmetrization for Equivariance with Orbit Distance Minimization
Paper • 2311.07143 • Published • 1
- GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Paper • 2311.11700 • Published • 4

- PockEngine: Sparse and Efficient Fine-tuning in a Pocket
Paper • 2310.17752 • Published • 14
- Instruction-tuning Aligns LLMs to the Human Brain
Paper • 2312.00575 • Published • 14
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 27
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper • 2401.06080 • Published • 28

- LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery
Paper • 2310.18356 • Published • 24
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Paper • 2401.01325 • Published • 27
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 45

- TRAMS: Training-free Memory Selection for Long-range Language Modeling
Paper • 2310.15494 • Published • 2
- A Long Way to Go: Investigating Length Correlations in RLHF
Paper • 2310.03716 • Published • 10
- YaRN: Efficient Context Window Extension of Large Language Models
Paper • 2309.00071 • Published • 68
- Giraffe: Adventures in Expanding Context Lengths in LLMs
Paper • 2308.10882 • Published • 1