Unlike the impressive DeepSeek-R1(-Zero), which targets tasks with verifiable rewards, this project is a pure reinforcement learning (RL) experiment applied to an open-domain task: creative advertisement generation.
Objectives:
- To investigate the feasibility of applying R1-like methods to an open-domain task without a verifiable ground-truth reward, and to at least demonstrate their potential.
- To explore whether <think> and <answer> rewards can be explicitly designed, based on human prior knowledge, to provide strong guidance through RL (see the sketch below).

Note: Our goal is not to induce self-reflective thinking, but to align with human thought processes purely through RL, without any supervised fine-tuning (SFT) on a constructed dataset.
Despite its small size, the resulting 1.5B-GRPO model demonstrates intriguing generative capabilities, though it's still far from perfect.
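As an illustration of how such explicit rewards can be designed, here is a minimal sketch of a rule-based format reward for <think>/<answer> outputs; the regex checks, weights, and length bonus are illustrative assumptions, not the project's actual reward design.

```python
import re

def format_reward(completion: str) -> float:
    """Toy reward scoring whether a completion follows the
    <think>...</think><answer>...</answer> template. Illustrative only."""
    reward = 0.0
    think = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if think:
        reward += 0.5  # an explicit reasoning block is present
    if answer:
        reward += 0.5  # a final answer block is present
    # Small bonus for non-trivial reasoning length (hypothetical threshold).
    if think and len(think.group(1).split()) >= 30:
        reward += 0.25
    return reward

# Example: a well-formatted completion earns the maximum of 1.25.
sample = "<think>" + "reasoning " * 30 + "</think><answer>Buy our coffee!</answer>"
print(format_reward(sample))
```

In GRPO-style training, rewards like this are typically combined with content rewards and normalized within each group of sampled completions to form advantages.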
reacted to AdinaY's post with 🔥 about 12 hours ago
Ovis2 🔥 a multimodal LLM released by Alibaba AIDC team. AIDC-AI/ovis2-67ab36c7e497429034874464
✨ 1B/2B/4B/8B/16B/34B
✨ Strong CoT for deeper problem solving
✨ Multilingual OCR: expanded beyond English & Chinese, with better data extraction
reacted to onekq's post about 12 hours ago
Players include Hugging Face (Open R1), Stanford (simple scaling), Berkeley (Bespoke, Open Thoughts, etc.), ServiceNow, and others. I know there is another work from HKUST but couldn't find it on 🤗. Let me know if I missed any teams.
reacted to Keltezaa's post about 19 hours ago
Why do all the text-to-image models running on the HF Inference API fail with the error "Model strangerzonehf/Neon-Impressionism-Flux does not exist"?
It used to work last month.
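For context, this is the kind of call that surfaces the error; a minimal sketch using huggingface_hub's InferenceClient, with a made-up prompt:

```python
from huggingface_hub import InferenceClient

# Serverless Inference API call for the model in question.
client = InferenceClient(model="strangerzonehf/Neon-Impressionism-Flux")

try:
    image = client.text_to_image("an impressionist street scene with neon lights")
    image.save("output.png")
except Exception as err:
    # Currently this fails with a message along the lines of "Model ... does not exist".
    print(f"Inference API call failed: {err}")
```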
reacted to fdaudens's post with ❤️ about 19 hours ago
⭐️ The AI Energy Score project just launched - this is a game-changer for making informed decisions about AI deployment.
You can now see exactly how much energy your chosen model will consume, with a simple 5-star rating system. Think appliance energy labels, but for AI.
Looking at transcription models on the leaderboard is fascinating: choosing between whisper-tiny and whisper-large-v3 can make a 7x difference in energy use. Real-time data on these tradeoffs changes everything.
166 models already evaluated across 10 different tasks, from text generation to image classification. The whole thing is public and you can submit your own models to test.
Why this matters:
- Teams can pick efficient models that still get the job done
- Developers can optimize for energy use from day one
- Organizations can finally predict their AI environmental impact
If you're building with AI at any scale, definitely worth checking out.
Specifically, the duplication of layers in frankenmerges serves a purpose similar to what occurs in their recurrent-depth architecture. Successful frankenmerges that operate without additional fine-tuning are able to recover, or "heal", from any damage caused by abrupt transitions between layer blocks. Replicated layer blocks that remain operational can provide functional benefits grounded in latent reasoning. Frankenmerges can also produce hybrid reasoning by splicing together the latent reasoning of different models.
Back in April 2024, I was able to duplicate a few layers in the Llama 3 8B model, turning it into a 9B model without significantly harming benchmarks, despite any transition damage. grimjim/llama-3-experiment-v1-9B My informal experimentation suggested that latent reasoning circuits could occupy contiguous stacks of 2-4 layers, though the result was highly sensitive to the choice of transition location between layers.
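For readers who want to experiment with this, here is a minimal sketch of duplicating a contiguous block of decoder layers in a Llama-style model using plain transformers; the layer range and output path are illustrative assumptions, not the exact configuration behind grimjim/llama-3-experiment-v1-9B.

```python
import copy
import torch
from transformers import AutoModelForCausalLM

# Load the base model (any Llama-style checkpoint works the same way).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)

layers = model.model.layers  # ModuleList of decoder layers
start, end = 12, 16          # illustrative contiguous block to duplicate

# Splice a deep copy of the chosen block back in right after the original block.
duplicated = [copy.deepcopy(layers[i]) for i in range(start, end)]
new_layers = list(layers[:end]) + duplicated + list(layers[end:])

model.model.layers = torch.nn.ModuleList(new_layers)
model.config.num_hidden_layers = len(new_layers)

# Re-index attention modules so each layer writes to its own KV-cache slot.
for idx, layer in enumerate(model.model.layers):
    layer.self_attn.layer_idx = idx

model.save_pretrained("llama-3-layer-duplicated")
```

Checking perplexity or a few quick benchmarks after each candidate insertion point is an easy way to find transition locations the model tolerates.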
⚡ Can Stable Diffusion's visual expertise enhance Llama-3.2?
Lavender: efficiently fine-tunes advanced vision-language models by aligning their text-vision attention with Stable Diffusion.
Paper: Diffusion Instruction Tuning (2502.06814)
Key Highlights:
✅ Significant Gains: +30% on 20 tasks, +68% on OOD WorldMedQA
✅ Data-Efficient: Needs only 0.13M samples (~2.5% of typical VLM datasets)
✅ Low Compute: Finetunes in ~1 day on 8 NVIDIA A10G GPUs
✅ Model-Agnostic: Works with Llama-3.2-11B, MiniCPM-Llama3-v2.5 & more
✅ Precise Alignment: Transfers strong text-vision alignment from Stable Diffusion
✅ Open-Source: Code, data & finetuned models will be available
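To make "aligning text-vision attention with Stable Diffusion" more concrete, here is a rough conceptual sketch of an attention-alignment loss; this is my own illustration under assumed tensor shapes, not the paper's actual objective or code.

```python
import torch
import torch.nn.functional as F

def attention_alignment_loss(vlm_attn: torch.Tensor, sd_attn: torch.Tensor) -> torch.Tensor:
    """Conceptual illustration: pull a VLM's text-to-image attention maps toward
    the attention maps Stable Diffusion produces for the same text/image pair.

    Both tensors are assumed to have shape (batch, text_tokens, image_patches)
    over the same patch grid; a real implementation would first need to align
    token and patch resolutions between the two models.
    """
    # Normalize each map into a distribution over image patches.
    vlm_dist = vlm_attn / (vlm_attn.sum(dim=-1, keepdim=True) + 1e-8)
    sd_dist = sd_attn / (sd_attn.sum(dim=-1, keepdim=True) + 1e-8)
    # Penalize the mismatch; this term would be added to the usual fine-tuning loss.
    return F.mse_loss(vlm_dist, sd_dist)

# Toy usage with random tensors standing in for real attention maps.
loss = attention_alignment_loss(torch.rand(2, 16, 64), torch.rand(2, 16, 64))
print(loss.item())
```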
Toward the end of last year, the Xet team provided an inside look into the foundations of how we plan to enable rapid experimentation and iteration for the AI builders on the Hub: https://huggingface.co/blog/from-files-to-chunks
But it turns out chunks aren't all you need!
Our goal is to bring:
- Faster uploads
- Speedy downloads
- All without sacrificing your workflow
To do that, we need the infrastructure and system design to back it up. As we prepare to roll out the first Xet-backed repositories on the Hub, we wrote up a post explaining the nitty-gritty details of the decisions that bring this to life: https://huggingface.co/blog/from-chunks-to-blocks
Complete with an interactive visualization that shows the power of deduplication in action - taking a 191GB repo to ~97GB and shaving a few hours off upload times.
The darker each block in the heatmap, the more we dedupe, the less we have to transfer. Clicking on a file's blocks shows all other files that share blocks.
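As a back-of-the-envelope illustration of why dedup shrinks transfers (not the Xet implementation, which uses content-defined chunking and groups chunks into blocks), here is a toy sketch with fixed-size chunks and made-up file contents:

```python
import hashlib

def dedup_upload_size(files: dict[str, bytes], chunk_size: int = 64 * 1024) -> tuple[int, int]:
    """Toy model of chunk-level dedup: hash fixed-size chunks and count how many
    bytes actually need to be transferred when duplicate chunks are skipped."""
    seen: set[str] = set()
    total_bytes = transferred_bytes = 0
    for data in files.values():
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            total_bytes += len(chunk)
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in seen:  # only chunks we haven't seen cross the wire
                seen.add(digest)
                transferred_bytes += len(chunk)
    return total_bytes, transferred_bytes

# Two file revisions that share most of their bytes dedupe heavily.
v1 = b"A" * 1_000_000
v2 = b"A" * 900_000 + b"B" * 100_000
total, sent = dedup_upload_size({"model-v1.bin": v1, "model-v2.bin": v2})
print(f"{total} bytes logical, {sent} bytes transferred")
```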
A collection of 1.38M educational texts featuring:
- 1.33M educational presentations with full slide content
- 47K academic documents with complete text
- Multilingual content (Russian, Ukrainian, English)
- Full metadata including titles and descriptions
All content is available under CC0 license, allowing unrestricted use including commercial applications.
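If you want to skim the collection before committing to a full download, a streaming load along these lines should work; the repo id below is a hypothetical placeholder for the dataset's actual path on the Hub.

```python
from datasets import load_dataset

# Hypothetical repo id: substitute the dataset's real path on the Hub.
ds = load_dataset("username/educational-texts", split="train", streaming=True)

# Peek at the first record's fields without downloading all 1.38M texts.
first = next(iter(ds))
print(first.keys())
```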
I am excited to share that I've successfully completed Unit 1: Foundations of Agents in the Hugging Face Agents Course. Exploring the fundamentals of AI agents has been an insightful journey, and I'm looking forward to applying these concepts in real-world applications. Big thanks to the Hugging Face team for this amazing learning opportunity! 🤗 Check out the course here: https://huggingface.co/learn/agents-course/
It now includes:
- a live stream of the progress being made on the task (see included video),
- the following components:
  1. Automatic prompt optimization
  2. An orchestrator deciding which agent to call dynamically, including feedback from a human (human-in-the-loop)
  3. A coding agent to complete the task
  4. A code-reviewing agent that iteratively provides feedback to improve the code generated by the coding agent until the code meets the required criteria, after which it is approved
  5. A testing agent that tests the approved code or provides information on how to test it
  6. A documentation agent that provides documentation and a help message for the approved and tested code
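Here is a minimal, framework-agnostic sketch of how such a review loop between the coding and reviewing agents could be wired up; the agent callables and the "APPROVED" convention are hypothetical stand-ins, not the author's actual implementation.

```python
from typing import Callable

# Hypothetical agent callables: each takes a prompt/context string and returns text.
Agent = Callable[[str], str]

def orchestrate(task: str, coder: Agent, reviewer: Agent, tester: Agent,
                documenter: Agent, max_rounds: int = 5) -> dict:
    """Sketch of an orchestration loop: code -> review -> revise until approved,
    then test and document. Prompt optimization and human-in-the-loop omitted."""
    code = coder(task)
    for _ in range(max_rounds):
        review = reviewer(f"Task:\n{task}\n\nCode:\n{code}")
        if "APPROVED" in review:  # hypothetical approval convention
            break
        code = coder(f"Task:\n{task}\n\nCode:\n{code}\n\nReviewer feedback:\n{review}")
    test_report = tester(code)
    docs = documenter(code)
    return {"code": code, "tests": test_report, "docs": docs}
```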
"๐ฎ๐ฌ๐ฎ๐ฑ ๐๐ถ๐น๐น ๐ฏ๐ฒ ๐๐ต๐ฒ ๐๐ฒ๐ฎ๐ฟ ๐ผ๐ณ ๐๐ ๐ฎ๐ด๐ฒ๐ป๐๐": this statement has often been made, here are numbers to support it.
I've plotted the progress of AI agents on the GAIA test set, and it seems they're on track to catch up with the human baseline in early 2026.
And that progress is still driven mostly by the improvement of base LLMs: progress would be even faster with fine-tuned agentic models.