QUANG HUY CHU

cqhofsns

AI & ML interests

Deep Reinforcement Learning --- Natural Language Processing

Recent Activity

commented on an article 29 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

upvoted an article 29 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

upvoted an article about 1 month ago

Open-R1: a fully open reproduction of DeepSeek-R1

Organizations

None yet

cqhofsns's activity

commented on DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge 29 days ago

Thank you for this post. Very clear explanation and nice example ;)

upvoted an article 29 days ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7

• 70

upvoted an article about 1 month ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 803

liked a model about 1 month ago

mistralai/Mistral-7B-Instruct-v0.3

Text Generation • Updated Aug 21, 2024 • 905k • 1.48k

upvoted a collection about 2 months ago

DeepSeek-R1

8 items • Updated Jan 21 • 573

liked 4 models about 2 months ago

answerdotai/ModernBERT-base

Fill-Mask • Updated Jan 15 • 3.44M • 790

nlpaueb/legal-bert-small-uncased

Fill-Mask • Updated Apr 28, 2022 • 28.6k • 21

meta-llama/Meta-Llama-3-70B-Instruct

Text Generation • Updated Dec 15, 2024 • 307k • 1.46k

google/gemma-2-9b-it

Text Generation • Updated Aug 27, 2024 • 319k • 687

upvoted 3 collections about 2 months ago

Gemma 2 Release

15 items • Updated 2 days ago • 216

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 359

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 719

liked 2 models about 2 months ago

Qwen/Qwen2-7B-Instruct

Text Generation • Updated Aug 21, 2024 • 267k • 620

Qwen/Qwen2-57B-A14B-Instruct

Text Generation • Updated Aug 21, 2024 • 2.91k • 80

liked 4 datasets 2 months ago

CCLV/CausalBench

Preview • Updated Jun 13, 2024 • 137 • 5

uonlp/CulturaX

Viewer • Updated Dec 16, 2024 • 7.18B • 16.2k • 498

allenai/c4

Viewer • Updated Jan 9, 2024 • 10.4B • 359k • 380

wikimedia/wikipedia

Viewer • Updated Jan 9, 2024 • 61.6M • 99.2k • 758

liked a model 4 months ago

tohoku-nlp/bert-base-japanese-whole-word-masking

Fill-Mask • Updated Feb 22, 2024 • 126k • 64

liked a model 5 months ago

meta-llama/Llama-3.2-3B-Instruct

Text Generation • Updated Oct 24, 2024 • 3.03M • 1.21k