Quentin Gallouédec

qgallouedec

AI & ML interests

None yet

Recent Activity

updated a model about 2 hours ago

qgallouedec/gemma-3-12b-it-codeforces-SFT

published a model about 2 hours ago

qgallouedec/gemma-3-12b-it-codeforces-SFT

updated a model about 3 hours ago

qgallouedec/gemma-3-12b-it-codeforces-SFT-eager-no-packing

qgallouedec's activity

upvoted a collection 1 day ago

Gemma 3 Release

Collection • 9 items • Updated about 6 hours ago • 221

upvoted an article 2 days ago

Open R1: Update #3

Article • and 9 others • 2 days ago • 197

upvoted a paper 2 days ago

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 8

upvoted an article 3 days ago

The N Implementation Details of RLHF with PPO

Article • Oct 24, 2023 • 42

upvoted 2 papers 14 days ago

ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

Paper • 1910.02054 • Published Oct 4, 2019 • 5

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 114

upvoted a paper 22 days ago

Presumed Cultural Identity: How Names Shape LLM Responses

Paper • 2502.11995 • Published 24 days ago • 10

upvoted an article about 1 month ago

Open R1: Update #2

Article • and 6 others • Feb 10 • 202

upvoted a collection about 1 month ago

DeepSeek-R1

Collection • 8 items • Updated Jan 21 • 573

upvoted 2 articles about 1 month ago

Open-R1: Update #1

Article • and 7 others • Feb 2 • 295

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • Jan 31 • 44

upvoted a paper about 2 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 346

upvoted a paper 2 months ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 107

upvoted 2 papers 3 months ago

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper • 1910.10683 • Published Oct 23, 2019 • 11

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 9

upvoted a collection 3 months ago

Tiny models

Collection • 23 items • Updated Nov 30, 2024 • 1

upvoted a paper 4 months ago

QLoRA: Efficient Finetuning of Quantized LLMs

Paper • 2305.14314 • Published May 23, 2023 • 50

upvoted an article 5 months ago

Finetuning PaliGemma with AutoTrain

Article • Jul 25, 2024 • 10

upvoted 2 papers 5 months ago

The Perfect Blend: Redefining RLHF with Mixture of Judges

Paper • 2409.20370 • Published Sep 30, 2024 • 5

Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation

Paper • 2401.08417 • Published Jan 16, 2024 • 35