Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

reacted to AdinaY's post with 🔥 about 15 hours ago

Open Sora 2.0 is out 🔥 https://huggingface.co/collections/hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5 ✨ 11B with Apache2.0 ✨ Low training cost - $200k ✨ open weights, code and training workflow

reacted to AdinaY's post with 🤗 about 15 hours ago

reacted to burtenshaw's post with 🤗 about 15 hours ago

everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. has to be: install everything from main and nightly. this is what I'm working with to get unsloth and TRL running

```txt
git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft
```

plus this with `--no-deps`

```txt
git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly
```

2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6, rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

```python
from trl import GRPOConfig

training_args = GRPOConfig(
    learning_rate=5e-6,
    adam_beta1=0.9,
    adam_beta2=0.99,
    weight_decay=0.1,
    warmup_ratio=0.1,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",
    logging_steps=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    num_generations=2,
    max_prompt_length=256,
    max_completion_length=1024 - 256,  # 768 tokens left after the prompt budget
    num_train_epochs=1,
    max_steps=250,
    save_steps=250,
    max_grad_norm=0.1,
    report_to="none",
)
```

5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

```python
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it")
```

if you want an introduction to GRPO, check out the reasoning course. it walks you through the algorithm, theory, and implementation in a smooth way. https://huggingface.co/reasoning-course
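As a rough illustration of point 2 in the post above, here is a minimal sketch of how a GSM8K reference answer and a model completion might be compared to produce a reward. It is not the code from the linked gist. The function names (`extract_gsm8k_answer`, `extract_tagged_answer`, `correctness_reward`) are hypothetical, and the `<answer>...</answer>` completion format is an assumption made for this example. The only thing taken from the dataset itself is that GSM8K solutions end with `#### ` followed by the final answer. Wiring such a function into a trainer is not shown.

```python
import re
from typing import List, Optional


def extract_gsm8k_answer(solution: str) -> str:
    """Return the final answer that follows '####' in a GSM8K solution string."""
    return solution.split("####")[-1].strip().replace(",", "")


def extract_tagged_answer(completion: str) -> Optional[str]:
    """Pull the text inside <answer>...</answer> (assumed format, hypothetical)."""
    match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def correctness_reward(completions: List[str], answers: List[str]) -> List[float]:
    """Score 1.0 when the tagged answer matches the reference exactly, else 0.0."""
    rewards = []
    for completion, answer in zip(completions, answers):
        predicted = extract_tagged_answer(completion)
        rewards.append(1.0 if predicted == answer else 0.0)
    return rewards


if __name__ == "__main__":
    reference = extract_gsm8k_answer("She sold 48 + 24 = 72 clips.\n#### 72")
    samples = [
        "<reasoning>48 + 24 = 72</reasoning>\n<answer>72</answer>",
        "The answer is 72",  # no tags, so this scores 0.0 under the assumed format
    ]
    print(correctness_reward(samples, [reference, reference]))
```

An exact-match reward like this is deliberately strict: a completion with the right number but the wrong format earns nothing, which is one reason rewards can stay flat early in training.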

Organizations

KingNish's activity

New activity in KingNish/qwen-1b-continued-v2.2 5 days ago

Adding `safetensors` variant of this model

#1 opened 5 days ago by

New activity in KingNish/qwen-1b-continued-v2 6 days ago

Adding `safetensors` variant of this model

#1 opened 6 days ago by

New activity in KingNish/qwen-1b-continued 7 days ago

Adding `safetensors` variant of this model

#1 opened 7 days ago by

New activity in KingNish/Doc-Reader-and-Chat about 1 month ago

Upgrade gradio version

#1 opened 4 months ago by

New activity in fffiloni/YuE about 1 month ago

Optimized for speed

#7 opened about 1 month ago by

New activity in KingNish/OpenGPT-4o about 1 month ago

Update chatbot.py

#68 opened about 1 month ago by

New activity in innova-ai/README 2 months ago

Background Video Removal

#1 opened 3 months ago by

New activity in KingNish/Reasoning-Llama-3b-v0.2 5 months ago

Adding `safetensors` variant of this model

#1 opened 5 months ago by

New activity in innova-ai/video-background-removal 5 months ago

rick

#3 opened 5 months ago by

Upload 382661044_3835227736701043_416786998373332440_n.mp4

#4 opened 5 months ago by

Delete rickroll-2sec.mp4

#5 opened 5 months ago by

I wondered whether the uploaded background video can only be used as an image with its first frame

#6 opened 5 months ago by

New activity in huggingchat/chat-ui 5 months ago

[FEATURE] Community Tools

#569 opened 6 months ago by

New activity in KingNish/reasoning-base-20k 5 months ago

Distributed training code (no Unsloth)

#3 opened 5 months ago by

The data source

#4 opened 5 months ago by

New activity in KingNish/Instant-Video 5 months ago

what?

#10 opened 5 months ago by

New activity in KingNish/reasoning-base-20k 5 months ago

The difference between training a separate reasoning role and directly training assistant to conduct the reasoning process

#5 opened 5 months ago by

New activity in innova-ai/video-background-removal 5 months ago

The speed of inference is a bit slow!

#2 opened 5 months ago by

New activity in KingNish/reasoning-base-20k 5 months ago

Generation model?

#2 opened 5 months ago by

New activity in KingNish/Reasoning-Llama-3b-v0.1 5 months ago

Adding `safetensors` variant of this model

#1 opened 5 months ago by