Rajdeep Borgohain

rbgo

RajdeepBorgohain

AI & ML interests

Solving language barriers.


rbgo's activity

upvoted 2 articles 6 days ago

Article

Inside the family of Smol models

By … and 1 other • 14 days ago • 7

Article

SmolLM - blazingly fast and remarkably powerful

Jul 16, 2024

• 332

upvoted a paper 14 days ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published 15 days ago • 62

upvoted a collection 14 days ago

Phi-4

Phi-4 family of small language and multi-modal models. • 7 items • Updated 10 days ago • 109

upvoted a paper 16 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 22 days ago • 161

upvoted a paper 30 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published about 1 month ago • 142

upvoted 2 articles about 1 month ago

Article

Mastering Long Contexts in LLMs with KVPress

By … and 1 other • Jan 23 • 64

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28

• 802

upvoted 4 collections about 2 months ago

Qwen2.5-VL

Vision-language model series based on Qwen2.5 • 8 items • Updated 17 days ago • 396

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 15 days ago • 106

DeepSeek-V2

8 items • Updated Jan 3 • 28

DeepSeek-LLM

DeepSeek LLM series • 5 items • Updated Aug 16, 2024 • 13

upvoted a paper about 2 months ago

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

Paper • 2412.06071 • Published Dec 8, 2024 • 9

upvoted an article about 2 months ago

Article

Timm ❤️ Transformers: Use any timm model with transformers

Jan 16

• 44

upvoted a paper 2 months ago

Phi-4 Technical Report

Paper • 2412.08905 • Published Dec 12, 2024 • 111

upvoted a collection 3 months ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated 1 day ago • 145

upvoted a paper 3 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 352

upvoted 3 collections 3 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 15 days ago • 560

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 359

Llama 3.2

This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 576