Khalil Guetari

KhalilGuetari

AI & ML interests

None yet

Recent Activity

Organizations

huggingPartyParis, Moments Lab

KhalilGuetari's activity

reacted to prithivMLmods's post with 🔥 1 day ago
Gemma-3-4B : Image and Video Inference 🖼️🎥

🧀 Space: prithivMLmods/Gemma-3-Multimodal

@gemma3-4b : {Tag + Space_+ 'prompt'}
@video-infer : {Tag + Space_+ 'prompt'}

+ Gemma3-4B : google/gemma-3-4b-it
+ By default, it runs : prithivMLmods/Qwen2-VL-OCR-2B-Instruct

Gemma 3 Technical Report : https://storage.googleapis.com/deepmind-media/gemma/Gemma3Report.pdf

Additionally, I tested Aya-Vision 8B against a custom Qwen2-VL-OCR model on messy-handwriting samples, as an experiment toward optimizing edge-device VLMs for Optical Character Recognition.

📜 Read the blog here: https://huggingface.co/blog/prithivMLmods/aya-vision-vs-qwen2vl-ocr-2b
reacted to cfahlgren1's post with 👀 4 months ago
If you are like me, you like to find up-and-coming datasets and Spaces before everyone else.

I made a trending-repos Space, cfahlgren1/trending-repos, which shows:

- New up-and-coming Spaces from the last day
- New up-and-coming Datasets from the last 2 weeks

It's a really good way to find some new gems before they become popular. For example, someone is working on a way to dynamically create assets inside a video game here: gptcall/AI-Game-Creator

reacted to fdaudens's post with 👍 4 months ago
🔍 NYT leveraged AI to investigate election interference by analyzing 400+ hours of recorded meetings - that's 5M words of data!

AI spotted patterns, humans verified facts. Every AI-flagged quote was manually verified against source recordings. Really appreciate that they published their full methodology - transparency matters when using AI in journalism.

A perfect blend of tech & journalism.

The future of journalism isn't robots replacing reporters - it's AI helping humans process massive datasets more efficiently. Sometimes the most powerful tech solutions are the least flashy ones.

Read the article: https://www.nytimes.com/interactive/2024/10/28/us/politics/inside-the-movement-behind-trumps-election-lies.html?unlocked_article_code=1.Vk4.ucv9.dbHVquTQaf0G&smid=nytcore-ios-share
upvoted an article 6 months ago
FineVideo: behind the scenes

• 30
upvoted an article 9 months ago
PaliGemma – Google's Cutting-Edge Open Vision Language Model

• 243
New activity in momentslab/AstroCaptions 10 months ago
reacted to merve's post with 🔥 10 months ago
New open Vision Language Model by @Google : PaliGemma 💙🤍

Comes in 3B, pretrained, mix and fine-tuned models in 224, 448 and 896 resolution
🧩 Combination of Gemma 2B LLM and SigLIP image encoder
🤗 Supported in transformers

PaliGemma can do:
🧩 Image segmentation and detection! 🤯
📑 Detailed document understanding and reasoning
🙋 Visual question answering, captioning and any other VLM task!

Read our blog 🔖 hf.co/blog/paligemma
Try the demo 🪀 hf.co/spaces/google/paligemma
Check out the Spaces and the models all in the collection 📚 google/paligemma-release-6643a9ffbf57de2ae0448dda
Collection of fine-tuned PaliGemma models google/paligemma-ft-models-6643b03efb769dad650d2dda