Omar Sanseviero

osanseviero

https://osanseviero.github.io/hackerllama/

AI & ML interests

Llamas, model merging, massive ASR for data collection, 3D ML, on-device ML, quantization, model judging, ML in browser, healthcare applications, education, intersection of art and ML.🦙

Recent Activity

liked a model about 12 hours ago

google/gemma-3-12b-it

liked a model about 12 hours ago

google/gemma-3-4b-it

liked a model about 12 hours ago

google/gemma-3-4b-pt

osanseviero's activity

upvoted a collection about 12 hours ago

Gemma 3 Release

9 items • Updated about 3 hours ago • 217

upvoted an article 1 day ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

2 days ago • 207

upvoted a collection 4 days ago

C4AI Aya Vision

Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated 9 days ago • 63

upvoted an article 9 days ago

Article

A Deepdive into Aya Vision: Advancing the Frontier of Multilingual Multimodality

10 days ago • 65

upvoted a paper 11 days ago

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 23 days ago • 56

upvoted an article 17 days ago

Article

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

23 days ago • 65

upvoted a collection 20 days ago

GemmaX2

GemmaX2 language models, including pretrained and instruction-tuned models in two sizes: 2B and 9B. • 7 items • Updated Feb 7 • 20

upvoted 2 papers 20 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 22 days ago • 163

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 21 days ago • 129

upvoted a collection 22 days ago

PaliGemma 2 Mix

13 items • Updated 2 days ago • 60

upvoted a paper 28 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published about 1 month ago • 29

upvoted 2 articles about 1 month ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4 • 1.16k

Article

Open-R1: Update #1

By

and 7 others •

Feb 2 • 295

upvoted a paper about 1 month ago

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Paper • 2501.18512 • Published Jan 30 • 27

upvoted an article about 1 month ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28 • 803

upvoted an article about 2 months ago

Article

Mastering Long Contexts in LLMs with KVPress

By

and 1 other •

Jan 23 • 64

upvoted 3 papers about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published Jan 9 • 50

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 276

upvoted a paper 2 months ago

Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 93