Collections
Collections including paper arxiv:2402.19173

- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 138
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 55
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 20
- Priority Sampling of Large Language Models for Compilers
  Paper • 2402.18734 • Published • 18

- A Survey on Data Selection for Language Models
  Paper • 2402.16827 • Published • 4
- Instruction Tuning with Human Curriculum
  Paper • 2310.09518 • Published • 3
- Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs
  Paper • 2312.05934 • Published • 1
- Language Models as Agent Models
  Paper • 2212.01681 • Published

- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 55
- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 51
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 138
- Simple linear attention language models balance the recall-throughput tradeoff
  Paper • 2402.18668 • Published • 20

- LoRA+: Efficient Low Rank Adaptation of Large Models
  Paper • 2402.12354 • Published • 6
- The FinBen: An Holistic Financial Benchmark for Large Language Models
  Paper • 2402.12659 • Published • 21
- TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization
  Paper • 2402.13249 • Published • 13
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69

- cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser
  Text Generation • Updated • 2.48k • 119
- Evaluating Large Language Models Trained on Code
  Paper • 2107.03374 • Published • 8
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  Paper • 2002.08155 • Published • 2
- code2seq: Generating Sequences from Structured Representations of Code
  Paper • 1808.01400 • Published • 2

- ReGAL: Refactoring Programs to Discover Generalizable Abstractions
  Paper • 2401.16467 • Published • 10
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 138
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 82
- Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
  Paper • 2402.14261 • Published • 11

- CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
  Paper • 2401.03065 • Published • 11
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 60
- WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation
  Paper • 2312.14187 • Published • 52
- On the Effectiveness of Large Language Models in Domain-Specific Code Generation
  Paper • 2312.01639 • Published • 2