202 137 524

Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

reacted to AdinaY's post with 🔥 about 5 hours ago

Open Sora 2.0 is out 🔥 https://huggingface.co/collections/hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5 ✨ 11B with Apache2.0 ✨ Low training cost - $200k ✨ open weights, code and training workflow

reacted to AdinaY's post with 🤗 about 5 hours ago

reacted to burtenshaw's post with 🤗 about 5 hours ago

everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go! 1. has to be install everything form main and nightly. this is what I'm working with to get unsloth and TRL running ```txt git+https://github.com/huggingface/transformers@main git+https://github.com/huggingface/trl.git@main bitsandbytes peft ``` plus this with `--no-deps` ```txt git+https://github.com/unslothai/unsloth-zoo.git@nightly git+https://github.com/unslothai/unsloth.git@nightly ``` 2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb 3. with a learning rate of 5e-6 rewards and loss stayed flat for the first 100 or so steps. 4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters. ```python from trl import GRPOConfig training_args = GRPOConfig( learning_rate = 5e-6, adam_beta1 = 0.9, adam_beta2 = 0.99, weight_decay = 0.1, warmup_ratio = 0.1, lr_scheduler_type = "cosine", optim = "adamw_8bit", logging_steps = 1, per_device_train_batch_size = 2, gradient_accumulation_steps = 1, num_generations = 2, max_prompt_length = 256, max_completion_length = 1024 - 256, num_train_epochs = 1, max_steps = 250, save_steps = 250, max_grad_norm = 0.1, report_to = "none", ) ``` 5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth ```python from transformers import AutoModelForImageTextToText model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it) ``` if you want an introduction to GRPO, check out the reasoning course, it walks you through the algorithm, theory, and implementation in a smooth way. https://huggingface.co/reasoning-course

View all activity

Organizations

KingNish's activity

reacted to AdinaY's post with 🔥🤗 about 5 hours ago

Post

250

Open Sora 2.0 is out 🔥
hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5
✨ 11B with Apache2.0
✨ Low training cost - $200k
✨ open weights, code and training workflow

reacted to burtenshaw's post with 🤗 about 5 hours ago

Post

359

git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft

plus this with --no-deps

git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly

2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6 rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",
    logging_steps = 1,
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,
    num_train_epochs = 1,
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,
    report_to = "none",
)

5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it)

if you want an introduction to GRPO, check out the reasoning course, it walks you through the algorithm, theory, and implementation in a smooth way.

https://huggingface.co/reasoning-course

2 replies

reacted to thomwolf's post with 🔥 1 day ago

Post

1140

We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: ⚡️OlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming –a domain Anthropic has been historically really strong at– and it's getting close to o1-mini/R1 on olympiad level coding with just 7B parameters!

And the best part is that we're open-sourcing all about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets are are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions

reacted to BrigitteTousi's post with 🤗 1 day ago

Post

3548

Regardless of X being down or not, so glad I can rely on HF Posts for AI news ❤️🤗

1 reply

reacted to Smooke's post with 👍 1 day ago

Post

1729

upvoted a collection 1 day ago

Gemma 3 Release

Collection

9 items • Updated 1 day ago • 202

liked a model 2 days ago

RekaAI/reka-flash-3

Updated 6 minutes ago • 1.3k • 209

liked a model 3 days ago

EuroBERT/EuroBERT-210m

Fill-Mask • Updated 3 days ago • 3.25k • 46

upvoted a collection 3 days ago

EuroBERT

Collection

Scaling Multilingual Encoders for European Languages • 4 items • Updated 3 days ago • 8

reacted to JingzeShi's post with 🚀❤️ 3 days ago

Post

4621

We distill a more accurate and concise dataset from DeepSeek R1, and also provide a distillation pipeline code repository.🤗

Dataset: SmallDoge/SmallThoughts
Code: https://github.com/SmallDoges/small-thoughts

liked a Space 3 days ago

488

RWKV-Gradio-1

💻

reacted to BlinkDL's post with 🔥 3 days ago

Post

4939

RWKV-7 "Goose" 0.4B trained w/ ctx4k automatically extrapolates to ctx32k+, and perfectly solves NIAH ctx16k 🤯 100% RNN and attention-free. Only trained on the Pile. No finetuning. Replicable training runs. tested by our community: https://github.com/Jellyfish042/LongMamba

liked a Space 3 days ago

618

RWKV-Gradio-2

🚀

Generate text responses from prompts

liked a model 4 days ago

SparkAudio/Spark-TTS-0.5B

Text-to-Speech • Updated 6 days ago • 7.82k • 370

reacted to fdaudens's post with 🤗 4 days ago

Post

5642

Honored to be named among their 12 pioneers and power players in the news industry in the 2025 Tech Trends Report from Future Today Strategy Group.

Incredible group to be part of - each person is doing groundbreaking work at the intersection of AI and journalism. Worth following them all: they're consistently sharing practical insights on building the future of news.

Take the time to read this report, it's packed with insights as always. The news & information section's #1 insight hits hard: "The most substantive economic impact of AI to date has been licensing payouts for a handful of big publishers. The competition will start shifting in the year ahead to separate AI 'haves' that have positioned themselves to grow from the 'have-nots.'"

This AI-driven divide is something I've been really concerned about. Now is the time to build more than ever!

👉 Full report here: https://ftsg.com/wp-content/uploads/2025/03/FTSG_2025_TR_FINAL_LINKED.pdf

2 replies

upvoted a paper 4 days ago

Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing

Paper • 2502.14458 • Published 21 days ago • 2

New activity in KingNish/qwen-1b-continued-v2.2 4 days ago

Adding `safetensors` variant of this model

#1 opened 4 days ago by

SFconvertbot

updated a model 5 days ago

KingNish/qwen-1b-continued-v2.2

Text Generation • Updated 4 days ago • 29