Running • 2.25k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Article • Model2Vec: Distill a Small Fast Model from any Sentence Transformer • By Pringled and 1 other • Oct 14, 2024 • 77
hanspeterlyngsoeraaschoujensen/week41_train_en_input_output Viewer • Updated Sep 24, 2024 • 6.41k • 77
hanspeterlyngsoeraaschoujensen/deberta-v3-base-finetuned-nlp-course Question Answering • Updated Sep 23, 2024 • 15
hanspeterlyngsoeraaschoujensen/distilbert-base-uncased-finetuned-nlp-course Question Answering • Updated Sep 23, 2024 • 21
hanspeterlyngsoeraaschoujensen/mt5-base-finetuned-nlp-course Question Answering • Updated Sep 21, 2024 • 10
Llama 3.1 Evals Collection This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.1 models, including the configurations, • 6 items • Updated Dec 6, 2024 • 15
Running • 872 • FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality web text data for LLM training