
Britny Farahdel PRO

britny

AI & ML interests

None yet

Recent Activity

updated a collection 7 days ago
Audio
updated a collection 8 days ago
Image Editing
updated a collection 8 days ago
Image Editing

Organizations

Hugging Face Discord Community, AI Starter Pack

britny's activity

reacted to merve's post with 🚀 17 days ago
smolagents can see 🔥
we just shipped vision support to smolagents 🤗 agentic computers FTW

you can now:
💻 let the agent get images dynamically (e.g. agentic web browser)
📑 pass images at the init of the agent (e.g. chatting with documents, filling forms automatically etc)
with only a few LoC changed! 🤯
you can use transformers models locally (like Qwen2VL) OR plug in your favorite multimodal inference provider (gpt-4o, Anthropic & co) 🤠

read our blog http://hf.co/blog/smolagents-can-see
reacted to merve's post with ❤️ about 1 month ago
What a beginning to this year in open ML 🤠
Let's unwrap! merve/jan-10-releases-677fe34177759de0edfc9714

Multimodal 🖼️
> ByteDance released SA2VA: a family of vision LMs that can take image, video, text and visual prompts
> moondream2 is out with new capabilities like outputting structured data and gaze detection!
> Dataset: Alibaba DAMO lab released multimodal textbook: 22k hours worth of samples from instruction videos 🤯
> Dataset: SciCap captioning on scientific documents benchmark dataset is released along with the challenge!

LLMs 💬
> Microsoft released Phi-4, sota open-source 14B language model 🔥
> Dolphin is back with Dolphin 3.0 Llama 3.1 8B 🐬🐬
> Prime-RL released Eurus-2-7B-PRIME, a new language model trained using PRIME alignment
> SmallThinker-3B is a new small reasoning LM based on Qwen2.5-3B-Instruct 💭
> Dataset: QWQ-LONGCOT-500K is the dataset used to train SmallThinker, generated using QwQ-32B-preview 📕
> Dataset: @cfahlgren1 released React Code Instructions: a dataset of code instruction-code pairs 📕
> Dataset: Qwen team is on a roll, they just released CodeElo, a dataset of code preferences 👩🏻‍💻

Embeddings 🔖
> @MoritzLaurer released zero-shot version of ModernBERT large 👏
> KaLM is a new family of performant multilingual embedding models with MIT license built using Qwen2-0.5B

Image/Video Generation ⏯️
> NVIDIA released Cosmos, a new family of diffusion/autoregressive World Foundation Models generating worlds from images, videos and texts 🔥
> Adobe released TransPixar: a new text-to-video model that can generate assets with transparent backgrounds (a first!)
> Dataset: fal released cosmos-openvid-1m, a Cosmos-tokenized dataset with samples from OpenVid-1M

Others
> Prior Labs released TabPFNv2: the best tabular transformer for classification and regression is out
> Metagene-1 is a new RNA language model that can be used for pathogen detection, zero-shot embedding and genome understanding
reacted to merve's post with 🚀 about 1 month ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face to make the LLM write code, do analysis and automate boring stuff!

Here's our blog for you to get started https://huggingface.co/blog/smolagents
reacted to ginipick's post with 🔥 about 2 months ago
🎬 Revolutionize Your Video Creation
Dokdo Multimodal AI: Transform a single image into a stunning video with perfect audio harmony! 🚀

Superior Technology 💫
Advanced Flow Matching: Smoother video transitions surpassing Kling and Sora
Intelligent Sound System: Automatically generates perfect audio by analyzing video mood
Multimodal Framework: Advanced AI integrating image, text, and audio analysis
Outstanding Performance 🎯
Ultra-High Resolution: 4K video quality with bfloat16 acceleration
Real-Time Optimization: 3x faster processing with PyTorch GPU acceleration
Smart Sound Matching: Real-time audio effects based on scene transitions and motion
Exceptional Features ✨
Custom Audio Creation: Natural soundtrack matching video tempo and rhythm
Intelligent Watermarking: Adaptive watermark adjusting to video characteristics
Multilingual Support: Precise translation engine powered by Helsinki-NLP
Versatile Applications 🌟
Social Media Marketing: Create engaging shorts for Instagram and YouTube
Product Promotion: Dynamic promotional videos highlighting product features
Educational Content: Interactive learning materials with enhanced engagement
Portfolio Enhancement: Professional-grade videos showcasing your work
Experience the video revolution with Dokdo Multimodal, where anyone can create professional-quality content from a single image. Elevate your content with perfectly synchronized video and audio that captivates your audience! 🎨

Start creating stunning videos that stand out from the crowd - whether you're a marketer, educator, content creator, or business owner. Join the future of AI-powered video creation today!

ginipick/Dokdo-multimodal

#VideoInnovation #AITechnology #PremiumContent #MarketingSolution

🔊 Please turn on your sound for the best viewing experience!
reacted to hexgrad's post with 🔥 about 2 months ago
Merry Christmas! 🎄 Open sourced a small TTS model at hexgrad/Kokoro-82M
reacted to prithivMLmods's post with 🤗 about 2 months ago
reacted to InferenceIllusionist's post with 🔥 about 2 months ago
MilkDropLM-32b-v0.3: Unlocking Next-Gen Visuals ✨

Stoked to release the latest iteration of our MilkDropLM project! This new release is based on the powerful Qwen2.5-Coder-32B-Instruct model using the same great dataset that powered our 7b model.

What's new?

- Genome Unlocked: Deeper understanding of preset relationships for more accurate and creative generations.

- Preset Revival: Breathe new life into old presets with our upgraded model!

- Loop-B-Gone: Say goodbye to pesky loops and hello to smooth generation.

- Natural Chats: Engage in more natural sounding conversations with our LLM than ever before.

Released under Apache 2.0, because sharing is caring!

Try it out: InferenceIllusionist/MilkDropLM-32b-v0.3

Shoutout to @superwatermelon for his invaluable insights and collab, and to all those courageous members in the community that have tested and provided feedback before!
updated a collection about 2 months ago