Collections
Collections including paper arxiv:2311.07911
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 146
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 30
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 22
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69

- Instruction-Following Evaluation for Large Language Models
  Paper • 2311.07911 • Published • 20
- HuggingFaceH4/mt_bench_prompts
  Viewer • Updated • 80 • 165 • 17
- vectara/hallucination_evaluation_model
  Text Classification • Updated • 79.4k • 251
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 192

- Holistic Evaluation of Text-To-Image Models
  Paper • 2311.04287 • Published • 12
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks
  Paper • 2311.07463 • Published • 14
- Trusted Source Alignment in Large Language Models
  Paper • 2311.06697 • Published • 11
- DiLoCo: Distributed Low-Communication Training of Language Models
  Paper • 2311.08105 • Published • 15

- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- miracl/miracl-corpus
  Viewer • Updated • 77.2M • 4.69k • 44
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 33
- How is ChatGPT's behavior changing over time?
  Paper • 2307.09009 • Published • 24

- ChatAnything: Facetime Chat with LLM-Enhanced Personas
  Paper • 2311.06772 • Published • 35
- Fine-tuning Language Models for Factuality
  Paper • 2311.08401 • Published • 29
- A Survey on Language Models for Code
  Paper • 2311.07989 • Published • 22
- Instruction-Following Evaluation for Large Language Models
  Paper • 2311.07911 • Published • 20