
Prithiv Sakthi

prithivMLmods

AI & ML interests

computer vision, multimodality, adapters @starngerzonehf @strangerguardhf

Recent Activity

updated a model about 6 hours ago
prithivMLmods/Open-R1-Mini-Experimental
liked a model about 7 hours ago
prithivMLmods/Open-R1-Math-7B-Instruct
liked a dataset about 7 hours ago
prithivMLmods/Deepthink-Reasoning-Ins

Organizations

Stanford AI, DataScienceEngineering, AI FILMS, Samsung Electronics, MISATO-dataset, GEM benchmark, OpenGVLab, MusicAI, BigScience Biomedical Datasets, OpenVINO Toolkit, LLMs, ONNXConfig for all, Gradio-Themes-Party, scikit-learn, lora concepts library, Open-Source AI Meetup, Kornia AI, Université Dauphine-PSL, Platzi Community, Tune a video concepts library, Keras Dreambooth Event, Stable Diffusion Dreambooth Concepts Library, The Waifu Research Department, Musika, Blog-explorers, OpenSky, AI Tamil Nadu, OpenLLM France, huggingPartyParis, Team Tonic, That Time I got Reincarnated as a Hugging Face Organization, LocalLLaMA, Major TOM, MLX Community, C4AI Community, M4-ai, Chinese LLMs on Hugging Face, ONNX Community, Dataset Tools, Nerdy Face, Stranger Zone, open/ acc, Data Is Better Together Contributor, None yet, Doge Face, Stranger Guard

prithivMLmods's activity

reacted to burtenshaw's post with 🔥 about 9 hours ago
The Hugging Face agents course is finally out!

👉 https://huggingface.co/agents-course

This first unit of the course sets you up with all the fundamentals to become a pro in agents.

- What's an AI Agent?
- What are LLMs?
- Messages and Special Tokens
- Understanding AI Agents through the Thought-Action-Observation Cycle (a toy sketch of this loop follows the list)
- Thought: Internal Reasoning and the ReAct Approach
- Actions: Enabling the Agent to Engage with Its Environment
- Observe: Integrating Feedback to Reflect and Adapt
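
For readers who want to see the cycle in code, here is a minimal, hypothetical sketch of a Thought-Action-Observation loop in Python. The llm callable, the tool names, and the stopping rule are illustrative assumptions, not the course's reference implementation.

def run_agent(task, llm, tools, max_steps=5):
    """Toy Thought-Action-Observation loop (illustrative only)."""
    history = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm(history)                      # model emits a thought and, possibly, an action
        history += step + "\n"
        if "Final Answer:" in step:              # the model decided it is done
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:                    # parse "Action: <tool> <argument>"
            tool, _, arg = step.split("Action:", 1)[1].strip().partition(" ")
            observation = tools.get(tool, lambda a: "unknown tool")(arg)
            history += f"Observation: {observation}\n"   # feed the result back into the context
    return "No answer within the step budget."

# Tiny fake LLM and tool so the sketch runs end to end.
def fake_llm(history):
    if "Observation:" in history:
        return "Thought: I have what I need.\nFinal Answer: 42"
    return "Thought: I should look this up.\nAction: search the answer"

print(run_agent("What is the answer?", fake_llm, {"search": lambda q: "the answer is 42"}))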
reacted to nicolay-r's post with 🚀 3 days ago
📢 If you wish to empower an LLM with IR and a named entity recognition module, I have relevant findings.
I just tested Flair; below is how you can get started adapting it to process your CSV / JSONL data via bulk-ner
👩‍💻 code: https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/ner_flair_0151.sh
🤖 models: https://huggingface.co/flair

Provider: https://raw.githubusercontent.com/nicolay-r/nlp-thirdgate/refs/heads/master/ner/flair_0151.py
Framework: https://github.com/nicolay-r/bulk-ner

🚀 Performance with the default NER model (ThinkPad X1 Nano):
Batch size 1: 6 it/sec
Batch size 10+: 12 it/sec

🌌 Other wrappers for bulk-ner in nlp-thirdgate: https://github.com/nicolay-r/nlp-thirdgate
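
If you prefer to call Flair directly over a CSV file rather than going through the bulk-ner wrapper, a minimal sketch looks like the snippet below; the file name, the "text" column, and the model choice are assumptions for illustration.

import csv

from flair.data import Sentence
from flair.models import SequenceTagger

# Any tagger from https://huggingface.co/flair works here.
tagger = SequenceTagger.load("flair/ner-english-large")

with open("data.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        sentence = Sentence(row["text"])          # assumes a "text" column
        tagger.predict(sentence)
        for span in sentence.get_spans("ner"):
            print(span.text, span.tag, round(span.score, 3))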
posted an update 3 days ago
QwQ Edge Gets a Small Update..! 💬
try now: prithivMLmods/QwQ-Edge

🚀 Now you can use the following commands for different tasks:

🖼️ @image 'prompt...' → Generates an image
🔉 @tts1 'prompt...' → Generates speech in a female voice
🔉 @tts2 'prompt...' → Generates speech in a male voice
🅰️ @text 'prompt...' → Enables textual conversation (if not specified, text-to-text generation is the default mode)

💬 Multimodal support: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
💬 Text generation: the FastThink-0.5B model ensures quick and efficient responses (prithivMLmods/FastThink-0.5B-Tiny)
💬 Image generation: the SDXL Lightning model SG161222/RealVisXL_V4.0_Lightning

Github: https://github.com/PRITHIVSAKTHIUR/QwQ-Edge

graph TD
    A[User Interface] --> B[Chat Logic]
    B --> C{Command Type}
    C -->|Text| D[FastThink-0.5B]
    C -->|Image| E[Qwen2-VL-OCR-2B]
    C -->|@image| F[Stable Diffusion XL]
    C -->|@tts| G[Edge TTS]
    D --> H[Response]
    E --> H
    F --> H
    G --> H
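
A minimal sketch of the prefix-based routing shown in the diagram above; the handler functions here are hypothetical placeholders that only echo their input, not the actual QwQ-Edge code (see the GitHub repo for the real implementation).

def handle_text(prompt):
    return f"[FastThink-0.5B would answer: {prompt!r}]"

def handle_image(prompt):
    return f"[SDXL Lightning would render: {prompt!r}]"

def handle_tts(prompt, voice):
    return f"[Edge TTS ({voice}) would speak: {prompt!r}]"

def route(message):
    """Dispatch a chat message to the model implied by its command prefix."""
    if message.startswith("@image"):
        return handle_image(message.removeprefix("@image").strip())
    if message.startswith("@tts1"):
        return handle_tts(message.removeprefix("@tts1").strip(), voice="female")
    if message.startswith("@tts2"):
        return handle_tts(message.removeprefix("@tts2").strip(), voice="male")
    if message.startswith("@text"):
        return handle_text(message.removeprefix("@text").strip())
    return handle_text(message)  # plain text-to-text is the default mode

print(route("@image a watercolor fox"))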
reacted to burtenshaw's post with 🤗 4 days ago
SmolLM2 paper is out! 😊

😍 Why do I love it? Because it facilitates teaching and learning!

Over the past few months I've engaged with (no joke) thousands of students based on SmolLM.

- People have run inference with, fine-tuned, aligned, and evaluated this smol model.
- People used their own machines as well as free tools like Colab, Kaggle, and Spaces.
- People tackled use cases in their jobs, for fun, in their own languages, and with their friends.

upvote the paper SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2502.02737)
reacted to nicolay-r's post with 🧠 5 days ago
🚨 Key takeaway for quickly mastering sentiment analysis nowadays. Through the questionnaire 📝 of the past RuOpinionNE-2024 competition we gathered insights into participants' model preference choices. Our main conclusion:

✨ The top-performing submissions exploit few-shot learning with LLMs.

Takeaway note compared with the prior RuSentNE-2023 competition:
🧠 Reasoning in steps requires more tweaking. Most recent solutions empowered with Chain-of-Thought tend to think too much. Earlier we saw improvements for Flan-T5 (2.8B) in fine-tuned mode, but not among the zero-shot approaches.
nicolay-r/flan-t5-tsa-thor-xl
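
As a rough illustration of the few-shot setup that the top submissions favour, a prompt for targeted (entity-oriented) sentiment can be assembled as below; the examples and labels are made up for illustration, not taken from the competition data.

# Hypothetical few-shot prompt construction for entity-oriented sentiment.
FEW_SHOT = [
    ("The new plant created hundreds of jobs in the region.", "plant", "positive"),
    ("Critics accused the ministry of hiding the report.", "ministry", "negative"),
    ("The delegation arrived in Geneva on Monday.", "delegation", "neutral"),
]

def build_prompt(text, entity):
    lines = ["Classify the sentiment toward the entity as positive, negative, or neutral.", ""]
    for ex_text, ex_entity, label in FEW_SHOT:
        lines.append(f"Text: {ex_text}\nEntity: {ex_entity}\nSentiment: {label}\n")
    lines.append(f"Text: {text}\nEntity: {entity}\nSentiment:")
    return "\n".join(lines)

print(build_prompt("The bank froze the journalist's accounts without notice.", "bank"))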

Related materials:
https://github.com/dialogue-evaluation/RuOpinionNE-2024
RuSentNE-2023: Evaluating Entity-Oriented Sentiment Analysis on Russian News Texts (2305.17679)
Large Language Models in Targeted Sentiment Analysis (2404.12342)
reacted to IliaLarchenko's post with 🔥 5 days ago
I am presenting the Decoder-Only Transformer (DOT) Policy, a simple behavioral control policy that outperforms SOTA models on two simple benchmark tasks:

✅ PushT (pushing an object to a goal) – 84% success on keypoints, 74% on images (previous best: 75% / 69%)
✅ ALOHA Insert (precise bimanual insertion) – 30% success (previous best: ~21%)

The best part? DOT is much smaller (sometimes 100 times fewer parameters) than previous SOTA models, trains faster, and avoids complexity:
🚫 No generative models (Diffusion, VAE, GANs)
🚫 No discretization/tokenization of actions
🚫 No reinforcement learning or multi-stage training
✅ Just learns from human demos, plain and simple

This is still early — more complex real-life tasks need testing, and no guarantees it will actually work well there, but I think it's interesting to share. Sometimes, simpler approaches can be just as effective (or even better) than complex ones.

🔗 Open-source code and detailed description: https://github.com/IliaLarchenko/dot_policy

Trained models on Hugging Face:
IliaLarchenko/dot_pusht_keypoints
IliaLarchenko/dot_pusht_images
IliaLarchenko/dot_bimanual_insert
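
To make the "just learns from human demos" point concrete, here is a toy behaviour-cloning loop; a small MLP stands in for the decoder-only transformer and the data is random, so this is only a sketch of the training signal, not the DOT policy itself.

import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """Toy stand-in for the policy: maps observations to continuous actions."""
    def __init__(self, obs_dim, act_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs):
        return self.net(obs)

# Fake data standing in for human demonstrations.
obs = torch.randn(256, 10)      # observations (e.g. keypoints)
actions = torch.randn(256, 2)   # expert actions

policy = TinyPolicy(obs_dim=10, act_dim=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(100):
    loss = nn.functional.mse_loss(policy(obs), actions)  # plain regression on demos
    opt.zero_grad()
    loss.backward()
    opt.step()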
reacted to davidberenstein1957's post with 🤗 6 days ago
replied to victor's post 7 days ago

The Space Cards and App Directory are really cool, especially with the dark theme!
🤗

reacted to victor's post with ❤️ 7 days ago
Hey everyone, we've given https://hf.co/spaces page a fresh update!

Smart Search: Now just type what you want to do—like "make a viral meme" or "generate music"—and our search gets it.

New Categories: Check out the cool new filter bar with icons to help you pick a category fast.

Redesigned Space Cards: Reworked a bit to really show off the app descriptions, so you know what each Space does at a glance.

Random Prompt: Need ideas? Hit the dice button for a burst of inspiration.

We’d love to hear what you think—drop us some feedback plz!
reacted to m-ric's post with 🔥 7 days ago
Introducing 𝗼𝗽𝗲𝗻 𝗗𝗲𝗲𝗽-𝗥𝗲𝘀𝗲𝗮𝗿𝗰𝗵 by Hugging Face! 💥

OpenAI's latest agentic app Deep Research seems really good... But it's closed, as usual.

⏱️ So with a team of cracked colleagues, we set ourselves a 24-hour deadline to replicate and open-source Deep Research! ⏱️

➡️ We built open-Deep-Research, an entirely open agent that can: navigate the web autonomously, scroll and search through pages, download and manipulate files, run calculations on data...

We aimed for the best performance: are the agent's answers really rigorous?

On GAIA benchmark, Deep Research had 67% accuracy on the validation set.
➡️ open-Deep-Research is at 55% (powered by o1); it is:
- the best pass@1 solution submitted
- the best open solution 💪💪

And it's only getting started! Please jump in, drop PRs, and let's bring it to the top!

Read the blog post 👉 https://huggingface.co/blog/open-deep-research
reacted to rubenroy's post with ❤️ 7 days ago
🔥🚀 Hey everyone! I'm excited to share my latest LLM release: Gilgamesh 72B, a model built on Qwen 2.5-72B Instruct. Gilgamesh was trained on a couple of my GammaCorpus datasets, specifically:

- rubenroy/GammaCorpus-CoT-Math-170k
- rubenroy/GammaCorpus-v2-5m
- rubenroy/GammaCorpus-Fact-QA-450k

I've submitted GGM 72B to the Open LLM Leaderboard for benchmarking; I'll post an update once the results are in!

You can try it out and share your feedback; check out the model page and see what it can do:
👉 rubenroy/Gilgamesh-72B

Would love to hear your thoughts!
reacted to nicolay-r's post with 👀 8 days ago
📢 Qwen has released Qwen2.5-Max, which claims to outperform DeepSeek-V3 [Edited: not R1].
Here is how you can start applying it to handle CSV / JSONL data.
The model is compatible with the OpenAI API, so here is my wrapper for it:
🌌 https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/openai_156.py

🚀 All you have to do is set
base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
and the platform's API key.

↗️ Below is the link to the complete example (see screenshot):
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_qwen_25_max_chat.sh

📰 Source: https://www.alibabacloud.com/help/en/model-studio/developer-reference/what-is-qwen-llm
📺 Official Sandbox Demo: Qwen/Qwen2.5-Max-Demo
📜 Paper: https://arxiv.org/abs/2412.15115
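
For reference, a minimal call through the OpenAI-compatible endpoint mentioned above might look like the snippet below; the model identifier "qwen-max" and the environment variable name are assumptions, so check the Model Studio docs before relying on them.

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # platform API key (variable name is an assumption)
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-max",  # assumed model id
    messages=[{"role": "user", "content": "Summarize this row: ..."}],
)
print(response.choices[0].message.content)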
posted an update 9 days ago
o3-Mini and DeepSeek R1
Worked out with some famous and weird examples.

🔥Blog: https://huggingface.co/blog/prithivMLmods/o3-mini-vs-deepseek-r1

Prompt : Using HTML, CSS, and JavaScript in a single HTML file to create a simulation of the solar system. Pay extreme attention to the UI to make it as intuitive as possible. Ensure that every planet appears as a sphere and is labeled with its corresponding name.

Example 1: o3-Mini, Example 2: DeepSeek R1

Q2 : https://huggingface.co/blog/prithivMLmods/o3-mini-vs-deepseek-r1#q2--web-solar-system-explorer
reacted to jasoncorkill's post with 🚀 11 days ago
We benchmarked @xai-org's Aurora model; as far as we know, this is the first public evaluation of the model at scale.

We collected 401k human annotations over the past ~2 days for this and have uploaded all of the annotation data here on Hugging Face under a fully permissive license:
Rapidata/xAI_Aurora_t2i_human_preferences
reacted to mkurman's post with ❤️ 12 days ago
Blurred-Thoughts Supervised Fine-Tuning (BT-SFT) 🤖

Can we teach a model to think completely on its own without reinforcement learning? Actually, yes.

We can do straightforward supervised fine-tuning using a relatively simple trick: blurring a part of CoT thoughts. But why is this effective?

We observed that various models differ in their thinking processes, and fine-tuning one model on another model’s thoughts (CoT) can sometimes be inefficient—often resulting in the model simply memorizing reasoning rather than learning how to actually think.

I discovered that this process can still be efficient if we clearly indicate when the model should start and stop thinking and uncover only a part of CoT and the expected answer, blurring the other part of CoT. This approach allows the model to learn only a portion of the thought process while still arriving at an expected answer.
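
A rough sketch of this "blurring" idea in terms of label masking is shown below; the -100 ignore index follows the usual Hugging Face causal-LM convention, and the random span selection is a guess at the recipe for illustration, not the author's exact method.

import random

IGNORE_INDEX = -100  # positions with this label contribute no loss in HF causal-LM training

def blur_cot_labels(input_ids, cot_start, cot_end, blur_fraction=0.5):
    """Return labels where a random slice inside [cot_start, cot_end) is ignored."""
    labels = list(input_ids)
    span_len = int((cot_end - cot_start) * blur_fraction)
    if span_len > 0:
        start = random.randint(cot_start, cot_end - span_len)
        for i in range(start, start + span_len):
            labels[i] = IGNORE_INDEX  # no supervision on the "blurred" thought tokens
    return labels

# Toy example: positions 3..8 hold the CoT, the rest is prompt and answer.
ids = list(range(12))
print(blur_cot_labels(ids, cot_start=3, cot_end=9))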

To demonstrate this, you can watch my experimental BT-SFT on meditsolutions/Llama-3.2-SUN-2.5B-chat model, which was fine-tuned on 151 million tokens from the Magpie-Align/Magpie-Reasoning-V2-250K-CoT-Deepseek-R1-Llama-70B dataset.

Enjoy! 🚀

PS. If you were curious enough to read this, leave me a comment. It's always nice to chat with open-minded and intelligent ppl.
reacted to not-lain's post with 🤗 13 days ago
reacted to hexgrad's post with 🚀🚀🚀🚀 13 days ago