Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2311.07911

A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book.

Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning

Paper • 2211.04325 • Published Oct 26, 2022
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 17
On the Opportunities and Risks of Foundation Models

Paper • 2108.07258 • Published Aug 16, 2021
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Paper • 2204.07705 • Published Apr 16, 2022 • 1

ibm-research/AttaQ

Viewer • Updated Jan 26, 2024 • 1.4k • 2.32k • 15
snorkelai/snorkel-curated-instruction-tuning

Preview • Updated Mar 11, 2024 • 163 • 8
corbyrosset/researchy_questions

Viewer • Updated Feb 29, 2024 • 96.4k • 214 • 27
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 634 • 72

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147
ReFT: Reasoning with Reinforced Fine-Tuning

Paper • 2401.08967 • Published Jan 17, 2024 • 30
Tuning Language Models by Proxy

Paper • 2401.08565 • Published Jan 16, 2024 • 23
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 20

papers bookmarks

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 20

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 20
HuggingFaceH4/mt_bench_prompts

Viewer • Updated Jul 3, 2023 • 80 • 318 • 17
vectara/hallucination_evaluation_model

Text Classification • Updated Oct 30, 2024 • 159k • 267
GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 192

Holistic Evaluation of Text-To-Image Models

Paper • 2311.04287 • Published Nov 7, 2023 • 16
MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks

Paper • 2311.07463 • Published Nov 13, 2023 • 15
Trusted Source Alignment in Large Language Models

Paper • 2311.06697 • Published Nov 12, 2023 • 12
DiLoCo: Distributed Low-Communication Training of Language Models

Paper • 2311.08105 • Published Nov 14, 2023 • 15

Evals & Monitoring

G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment

Paper • 2303.16634 • Published Mar 29, 2023 • 3
miracl/miracl-corpus

Viewer • Updated Jan 5, 2023 • 77.2M • 6.13k • 44
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Paper • 2306.05685 • Published Jun 9, 2023 • 33
How is ChatGPT's behavior changing over time?

Paper • 2307.09009 • Published Jul 18, 2023 • 24

Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 20
Prompt Engineering a Prompt Engineer

Paper • 2311.05661 • Published Nov 9, 2023 • 24
Contrastive Chain-of-Thought Prompting

Paper • 2311.09277 • Published Nov 15, 2023 • 36

ChatAnything: Facetime Chat with LLM-Enhanced Personas

Paper • 2311.06772 • Published Nov 12, 2023 • 35
Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 29
A Survey on Language Models for Code

Paper • 2311.07989 • Published Nov 14, 2023 • 22
Instruction-Following Evaluation for Large Language Models

Paper • 2311.07911 • Published Nov 14, 2023 • 20

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs