
Collections


Collections including paper arxiv:2404.01954

Papers - Multilingual - Benchmarks

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23
ByT5: Towards a token-free future with pre-trained byte-to-byte models

Paper • 2105.13626 • Published May 28, 2021 • 3
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 93

Papers - Fine-tuning - PPO

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23
UltraFeedback: Boosting Language Models with High-quality Feedback

Paper • 2310.01377 • Published Oct 2, 2023 • 5
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

Paper • 2305.14387 • Published May 22, 2023 • 1
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 107

Long-context LLMs Struggle with Long In-context Learning

Paper • 2404.02060 • Published Apr 2, 2024 • 37
HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

Papers - Text - Supervised Fine-tuning - Batch Grouping

Batches are grouped by similar token length to make better use of the GPU/hardware. Mini-batches differ in sequence length, but the maximum number of tokens per mini-batch is the same.
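
A minimal sketch of the batch grouping described in this note, not the paper's actual implementation. The function name `group_by_length` and the `max_tokens` parameter are hypothetical, and the sketch assumes the token budget counts padded tokens (batch size times the longest sequence in the batch).

```python
from typing import List, Sequence


def group_by_length(lengths: Sequence[int], max_tokens: int) -> List[List[int]]:
    """Group example indices into mini-batches of similar token length.

    Assumption: a batch's cost is its padded size, i.e.
    (number of examples) * (longest example in the batch),
    and that cost must not exceed max_tokens.
    """
    # Sort indices by length so neighbours in a batch need little padding.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])

    batches: List[List[int]] = []
    current: List[int] = []
    for idx in order:
        n = lengths[idx]
        if n > max_tokens:
            raise ValueError(f"example {idx} has {n} tokens, over the budget")
        # Lengths are ascending, so the incoming example is the longest one.
        if current and (len(current) + 1) * n > max_tokens:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    lengths = [5, 100, 7, 98, 6, 50]
    for batch in group_by_length(lengths, max_tokens=200):
        print(batch, [lengths[i] for i in batch])
```

Short examples end up in large batches and long examples in small ones, so the batch size varies while the token budget per mini-batch stays fixed.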

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

Papers - Text - Supervised Fine-tuning

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

Papers - Pre-training - Dynamic Context Length

For HyperCLOVA X, pre-training is split by context length: 90% at 4096 and the remaining 10% at 32k.
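
One way to read this split is as a step-based schedule. The sketch below is illustrative only: the function name `context_length_at` is hypothetical, it assumes the short-context phase comes first, and it assumes "32k" means 32768 tokens.

```python
def context_length_at(
    step: int,
    total_steps: int,
    short_len: int = 4096,
    long_len: int = 32768,  # assumption: "32k" taken as 32768
    short_percent: int = 90,
) -> int:
    """Return the context length to use at a given training step.

    Assumption: the first short_percent of steps use short_len and the
    rest use long_len. The note only gives the 90% / 10% split.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    switch_step = total_steps * short_percent // 100
    return short_len if step < switch_step else long_len


if __name__ == "__main__":
    for step in (0, 899, 900, 999):
        print(step, context_length_at(step, total_steps=1000))
```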

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

Papers - Pre-training - In-filling - PSM and SPM ordering

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

non-english llm

RakutenAI-7B: Extending Large Language Models for Japanese

Paper • 2403.15484 • Published Mar 21, 2024 • 14
HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23

Model Stock: All we need is just a few fine-tuned models

Paper • 2403.19522 • Published Mar 28, 2024 • 12
HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23
Instruction Tuning with Human Curriculum

Paper • 2310.09518 • Published Oct 14, 2023 • 3

Papers - Reward Model - Bradley-Terry

https://web.stanford.edu/class/archive/stats/stats200/stats200.1172/Lecture24.pdf

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 53
HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2, 2024 • 23
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15, 2024 • 12
Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 84
