Collections

Collections including paper arxiv:2401.04088

Llemma: An Open Language Model For Mathematics

Paper • 2310.10631 • Published Oct 16, 2023 • 53
Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 46
Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 35
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model

Paper • 2309.11568 • Published Sep 20, 2023 • 10

Papers: MoE/Ensemble

Papers related to Mixture of Experts topics.

QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Paper • 2310.16795 • Published Oct 25, 2023 • 27
Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

Paper • 2310.13961 • Published Oct 21, 2023 • 5
The Consensus Game: Language Model Generation via Equilibrium Search

Paper • 2310.09139 • Published Oct 13, 2023 • 14
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

Paper • 2310.03094 • Published Oct 4, 2023 • 13

FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation

Paper • 2310.03214 • Published Oct 5, 2023 • 19
SteP: Stacked LLM Policies for Web Actions

Paper • 2310.03720 • Published Oct 5, 2023 • 8
Large Language Models Cannot Self-Correct Reasoning Yet

Paper • 2310.01798 • Published Oct 3, 2023 • 35
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 158

Stuff I (TheProjectsGuy) have summarized (for time pass). Mostly papers. I do not guarantee that the summaries are fully correct (as I am no expert).

SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving

Paper • 2402.02519 • Published Feb 4, 2024
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 158
Optimal Transport Aggregation for Visual Place Recognition

Paper • 2311.15937 • Published Nov 27, 2023
GOAT: GO to Any Thing

Paper • 2311.06430 • Published Nov 10, 2023 • 16

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Paper • 2310.09478 • Published Oct 14, 2023 • 21
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

Paper • 2310.08678 • Published Oct 12, 2023 • 14
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 244
LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 14

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 54
Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 38
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Paper • 2309.12307 • Published Sep 21, 2023 • 88

Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

Paper • 2309.08532 • Published Sep 15, 2023 • 53
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 158
How faithful are RAG models? Quantifying the tug-of-war between RAG and LLMs' internal prior

Paper • 2404.10198 • Published Apr 16, 2024 • 7
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 50

NLP Paper Reading

NLP Paper Reading

Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 76
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting

Paper • 2309.04269 • Published Sep 8, 2023 • 33
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 87
Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 25
