Fine-tune ModernBERT for RAG with Synthetic Data • By sdiazlor and 2 others • Jan 20 • 37
FineWeb2-C: Help Build Better Language Models in Your Language • By davanstrien and 5 others • Dec 23, 2024 • 18
Introducing the Synthetic Data Generator - Build Datasets with Natural Language • Dec 16, 2024 • 116
Open Preference Dataset for Text-to-Image Generation by the 🤗 Community • Dec 9, 2024 • 55
Let’s make a generation of amazing image generation models • By burtenshaw and 4 others • Nov 26, 2024 • 33
Argilla 2.4: Easily Build Fine-Tuning and Evaluation datasets on the Hub — No Code Required • Nov 4, 2024 • 41
How to build a custom text classifier without days of human labeling • By sdiazlor and 4 others • Oct 17, 2024 • 55
How to optimize your data labelling project with custom interfaces • By burtenshaw and 9 others • Oct 16, 2024 • 18
🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 • By dvilasuero • Jul 30, 2024 • 37
Llama 3.1 - 405B, 70B & 8B with multilinguality and long context • Jul 23, 2024 • 230
Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality • Jun 24, 2024 • 34
🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets • By dvilasuero • Jun 4, 2024 • 78