Jaiyam Sharma

dataplayer12

http://www.jaiyam.in

AI & ML interests

Computer Vision

Recent Activity

liked a model about 2 months ago

maya-research/maya1

Text-to-Speech • 3B • Updated Nov 12 • 80.1k • 833

New activity in mlx-community/embeddinggemma-300m-8bit 4 months ago

`gemma3_text` is not supported in mlx

#1 opened 4 months ago by

updated a model 10 months ago

dataplayer12/Mistral-Small-24B-Reasoning-Q4_K_M-GGUF

24B • Updated Feb 18 • 12

published a model 10 months ago

dataplayer12/Mistral-Small-24B-Reasoning-Q4_K_M-GGUF

24B • Updated Feb 18 • 12

updated a model 11 months ago

dataplayer12/phi-4-Q6_K

15B • Updated Feb 3 • 6

published a model 11 months ago

dataplayer12/phi-4-Q6_K

15B • Updated Feb 3 • 6

updated a model 11 months ago

dataplayer12/phi-4-Q4_K_M

15B • Updated Feb 2 • 13

published a model 11 months ago

dataplayer12/phi-4-Q4_K_M

15B • Updated Feb 2 • 13

liked a model 11 months ago

microsoft/phi-4

Text Generation • 15B • Updated Nov 24 • 504k • 2.2k

reacted to chansung's post with 👍 11 months ago

Simple summary on DeepSeek AI's Janus-Pro: A fresh take on multimodal AI!

It builds on its predecessor, Janus, by tweaking the training methodology rather than the model architecture. The result? Improved performance in understanding and generating multimodal data.

Janus-Pro uses a three-stage training strategy, similar to Janus, but with key modifications:
✦ Stage 1 & 2: Focus on separate training for specific objectives, rather than mixing data.
✦ Stage 3: Fine-tuning with a careful balance of multimodal data.

Benchmarks show Janus-Pro holds its own against specialized models like TokenFlow XL and MetaMorph, and other multimodal models like SD3 Medium and DALL-E 3.

The main limitation? Low image resolution (384x384). However, this seems like a strategic choice to focus on establishing a solid "recipe" for multimodal models. Future work will likely leverage this recipe and increased computing power to achieve higher resolutions.

updated a model 11 months ago

dataplayer12/Mistral-Small-24B-Instruct-Q4_K_M-GGUF

24B • Updated Feb 1 • 7

published a model 11 months ago

dataplayer12/Mistral-Small-24B-Instruct-Q4_K_M-GGUF

24B • Updated Feb 1 • 7

liked 2 models almost 2 years ago

TheBloke/CodeLlama-70B-Instruct-GGUF

Text Generation • 69B • Updated Jan 30, 2024 • 1.84k • 62

TheBloke/CodeLlama-70B-hf-GGUF

Text Generation • 69B • Updated Jan 30, 2024 • 1.01k • 42

New activity in nvidia/GPT-2B-001 over 2 years ago

Does not work with NeMo container

#2 opened over 2 years ago by

gibberish on 4090

#4 opened over 2 years ago by

liked a dataset over 3 years ago

ILSVRC/imagenet-1k

Viewer • Updated Sep 17 • 1.43M • 65.4k • 648