- mistralai/Mistral-7B-v0.1
  Text Generation • Updated • 541k • 3.58k
- Llama 2: Open Foundation and Fine-Tuned Chat Models
  Paper • 2307.09288 • Published • 244
- togethercomputer/RedPajama-Data-V2
  Updated • 3.15k • 358
- AI Comic Factory
  9.41k • Create your own AI comic with a single prompt
Collections including paper arxiv:2307.09288
- Attention Is All You Need
  Paper • 1706.03762 • Published • 51
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 8
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Mistral 7B
  Paper • 2310.06825 • Published • 46

- Llemma: An Open Language Model For Mathematics
  Paper • 2310.10631 • Published • 52
- Mistral 7B
  Paper • 2310.06825 • Published • 46
- Qwen Technical Report
  Paper • 2309.16609 • Published • 35
- BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
  Paper • 2309.11568 • Published • 10

- When can transformers reason with abstract symbols?
  Paper • 2310.09753 • Published • 3
- In-Context Pretraining: Language Modeling Beyond Document Boundaries
  Paper • 2310.10638 • Published • 29
- Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
  Paper • 2310.09520 • Published • 11
- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
  Paper • 2309.08532 • Published • 53

- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 26
- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 15
- Personality Traits in Large Language Models
  Paper • 2307.00184 • Published • 20
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13

- Attention Is All You Need
  Paper • 1706.03762 • Published • 51
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
  Paper • 2005.11401 • Published • 10
- LoRA: Low-Rank Adaptation of Large Language Models
  Paper • 2106.09685 • Published • 32
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 13

- SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving
  Paper • 2402.02519 • Published
- Mixtral of Experts
  Paper • 2401.04088 • Published • 157
- Optimal Transport Aggregation for Visual Place Recognition
  Paper • 2311.15937 • Published
- GOAT: GO to Any Thing
  Paper • 2311.06430 • Published • 14