Nishith Jain

KingNish

AI & ML interests

AI is fun actually. Busy till June 2025.

Recent Activity

reacted to burtenshaw's post with 🤗 about 15 hours ago
everybody and their dog is fine-tuning Gemma 3 today, so I thought I'd do a longer post on the tips and sharp edges I find. let's go!

1. has to be: install everything from main and nightly. this is what I'm working with to get unsloth and TRL running

```txt
git+https://github.com/huggingface/transformers@main
git+https://github.com/huggingface/trl.git@main
bitsandbytes
peft
```

plus this with `--no-deps`

```txt
git+https://github.com/unslothai/unsloth-zoo.git@nightly
git+https://github.com/unslothai/unsloth.git@nightly
```

2. will brown's code to turn GSM8k into a reasoning dataset is a nice toy experiment https://gist.github.com/willccbb/4676755236bb08cab5f4e54a0475d6fb

3. with a learning rate of 5e-6, rewards and loss stayed flat for the first 100 or so steps.

4. so far none of my runs have undermined the outputs after 1 epoch. therefore, I'm mainly experimenting with bigger LoRA adapters.

```python
from trl import GRPOConfig

training_args = GRPOConfig(
    # optimizer and schedule
    learning_rate = 5e-6,
    adam_beta1 = 0.9,
    adam_beta2 = 0.99,
    weight_decay = 0.1,
    warmup_ratio = 0.1,
    lr_scheduler_type = "cosine",
    optim = "adamw_8bit",
    logging_steps = 1,
    # batch and generation sizes
    per_device_train_batch_size = 2,
    gradient_accumulation_steps = 1,
    num_generations = 2,
    max_prompt_length = 256,
    max_completion_length = 1024 - 256,
    # run length and checkpointing
    num_train_epochs = 1,
    max_steps = 250,
    save_steps = 250,
    max_grad_norm = 0.1,
    report_to = "none",
)
```

5. vision fine-tuning isn't available in TRL's GRPOTrainer, so stick to text datasets. but no need to load the model differently in transformers or Unsloth

```python
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained("google/gemma-3-4b-it")
```

if you want an introduction to GRPO, check out the reasoning course. it walks you through the algorithm, theory, and implementation in a smooth way. https://huggingface.co/reasoning-course
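To make the schedule settings in the post's config concrete, here is a minimal sketch in plain Python of the learning-rate curve they describe. It assumes the usual shape for a "cosine" scheduler with warmup (linear ramp from zero, then a half-cosine decay to zero) and assumes the warmup length is `max_steps * warmup_ratio` rounded up. The names `lr_at`, `BASE_LR`, and `WARMUP_STEPS` are hypothetical and are not part of TRL or transformers; this is an illustration, not the trainer's own code.

```python
import math

# Values copied from the GRPOConfig in the post.
BASE_LR = 5e-6
MAX_STEPS = 250
WARMUP_RATIO = 0.1

# Assumption: warmup length is the ratio times total steps, rounded up.
WARMUP_STEPS = math.ceil(MAX_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / max(1, MAX_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 10, 25, 100, 250):
        print(step, lr_at(step))
```

Under these assumptions the peak rate of 5e-6 is reached at step 25 and the rate is still about three quarters of the peak at step 100, so the schedule alone does not obviously account for the flat first 100 steps mentioned in point 3.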

Organizations

Wikimedia
OpenGVLab
Blog-explorers
Multi🤖Transformers
The Collectionists
HelpingAI
ZeroGPU Explorers
Project Fluently
Poscye
INNOVA AI
Narra
Social Post Explorers
Cognitive Computations
Dev Mode Explorers
Stable Diffusion Community (Unofficial, Non-profit)
ONNX Community
Hugging Face Discord Community
Nerdy Face
grafite
None yet
Project R
Doge Face

KingNish's activity

New activity in KingNish/Doc-Reader-and-Chat about 1 month ago

Upgrade gradio version

#1 opened 4 months ago by
Csplk
New activity in fffiloni/YuE about 1 month ago

Optimized for speed

1
#7 opened about 1 month ago by
KingNish
New activity in KingNish/OpenGPT-4o about 1 month ago

Update chatbot.py

#68 opened about 1 month ago by
mancooper
New activity in innova-ai/README 2 months ago

Background Video Removal

1
#1 opened 3 months ago by
Adeal1
New activity in huggingchat/chat-ui 5 months ago

[FEATURE] Community Tools

76
#569 opened 6 months ago by
nsarrazin
New activity in KingNish/reasoning-base-20k 5 months ago

The data source

2
#4 opened 5 months ago by
lockon
New activity in KingNish/Instant-Video 5 months ago

what?

2
#10 opened 5 months ago by
pencilmender
New activity in KingNish/reasoning-base-20k 5 months ago

Generation model?

4
#2 opened 5 months ago by
Benjoyo