
Alex

AlexPoto

AI & ML interests

None yet

Recent Activity

reacted to Kseniase's post with 🚀 4 days ago
liked a dataset 11 days ago
kristaller486/Nebo-T1-Russian

Organizations

None yet

AlexPoto's activity

reacted to Kseniase's post with 🚀 4 days ago
8 New Types of RAG

RAG techniques continuously evolve to enhance LLM response accuracy by retrieving relevant external data during generation. To keep up with current AI trends, new RAG types incorporate deep step-by-step reasoning, tree search, citations, multimodality and other effective techniques.

Here's a list of the 8 latest RAG advancements:

1. DeepRAG -> DeepRAG: Thinking to Retrieval Step by Step for Large Language Models (2502.01142)
Models retrieval-augmented reasoning as a Markov Decision Process, enabling strategic retrieval. It dynamically decides when to retrieve external knowledge and when to rely on parametric reasoning.

2. RealRAG -> RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning (2502.00848)
Enhances novel object generation by retrieving real-world images and using self-reflective contrastive learning to fill knowledge gaps, improve realism and reduce distortions.

3. Chain-of-Retrieval Augmented Generation (CoRAG) -> Chain-of-Retrieval Augmented Generation (2501.14342)
Retrieves information step by step and adjusts it, reformulating queries if needed. It also decides how much compute to use at test time.

4. VideoRAG -> VideoRAG: Retrieval-Augmented Generation over Video Corpus (2501.05874)
Enables unlimited-length video processing, using a dual-channel architecture that integrates graph-based textual grounding and multi-modal context encoding.

5. CFT-RAG -> CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter (2501.15098)
A tree-RAG acceleration method that uses an improved Cuckoo Filter to optimize entity localization, enabling faster retrieval.

6. Contextualized Graph RAG (CG-RAG) -> CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs (2501.15067)
Uses Lexical-Semantic Graph Retrieval (LeSeGR) to integrate sparse and dense signals within a graph structure and capture citation relationships.

7. GFM-RAG -> GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation (2502.01113)
A graph foundation model that uses a graph neural network to refine query-knowledge connections.

8. URAG -> URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT (2501.16276)
A hybrid system combining rule-based and RAG methods to improve lightweight LLMs for educational chatbots.
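
Item 5 in the list above relies on a Cuckoo Filter for fast entity lookup. As a rough illustration of the underlying data structure, here is a minimal sketch of a plain cuckoo filter. It is not the improved variant from the CFT-RAG paper, and the class name, bucket sizes, fingerprint width and SHA-256-based hashing are all assumptions made for this example.

```python
import hashlib
import random


def _hash(data: bytes) -> int:
    """Deterministic 64-bit hash taken from SHA-256 (an arbitrary choice)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class CuckooFilter:
    """Plain cuckoo filter: approximate set membership with deletion."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR trick below in range.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self._rng = random.Random(seed)

    def _fingerprint(self, item: str) -> int:
        # Non-zero fingerprint in 1..65535 (fits in two bytes).
        return _hash(b"fp:" + item.encode("utf-8")) % 65535 + 1

    def _index(self, item: str) -> int:
        return _hash(item.encode("utf-8")) & (self.num_buckets - 1)

    def _alt_index(self, index: int, fp: int) -> int:
        # XOR with the fingerprint hash; applying it twice returns the
        # original index, so either bucket can recover the other.
        return (index ^ _hash(fp.to_bytes(2, "big"))) & (self.num_buckets - 1)

    def insert(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both candidate buckets are full: evict and relocate fingerprints.
        i = self._rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self._rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Filter treated as full; the last evicted fingerprint is dropped.
        return False

    def contains(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        for i in (i1, self._alt_index(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


# Hypothetical usage: register entity names, then test membership.
entity_filter = CuckooFilter()
for entity in ["DeepRAG", "VideoRAG", "GFM-RAG"]:
    entity_filter.insert(entity)
```

A lookup touches at most two buckets regardless of how many entities are stored, which is what makes the structure attractive for locating entities quickly. False positives are possible; false negatives are not, as long as only inserted items are deleted.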
reacted to kristaller486's post with 🚀 11 days ago
Nebo-T1-Russian

(Probably) the first "longCoT" dataset for the Russian language created via DeepSeek-R1.

- Prompts taken from the Sky-T1 dataset and translated via Llama3.3-70B.
- Answers and reasoning generated by DeepSeek-R1 (685B).
- 16.4K samples in total, ≈12.4K Russian-only (in the rest, either the answer or reasoning is in English).
- Languages in the answers and reasoning are labeled using fastText.
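
The Russian-only split described above could be reproduced roughly as follows. This is only a sketch: the field names `answer` and `reasoning` are hypothetical, and the Cyrillic-share heuristic is a dependency-free stand-in for the fastText language labels the dataset actually uses.

```python
def cyrillic_share_label(text, threshold=0.5):
    """Stand-in detector (not fastText): 'ru' if most letters are Cyrillic."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "unknown"
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    return "ru" if cyrillic / len(letters) >= threshold else "other"


def split_russian_only(samples, detect=cyrillic_share_label):
    """Split samples into (russian_only, mixed) using the given detector.

    A sample counts as Russian-only when both its answer and its reasoning
    are labeled 'ru'. The 'answer' and 'reasoning' keys are assumed names.
    """
    russian_only, mixed = [], []
    for sample in samples:
        labels = {detect(sample["answer"]), detect(sample["reasoning"])}
        (russian_only if labels == {"ru"} else mixed).append(sample)
    return russian_only, mixed


# Hypothetical samples, not taken from the dataset.
samples = [
    {"answer": "Ответ: 4", "reasoning": "Сложим два и два."},
    {"answer": "Ответ: 4", "reasoning": "Let us add two and two."},
]
russian_only, mixed = split_russian_only(samples)
```

Passing a different `detect` callable, for example one backed by a language-identification model, swaps the heuristic out without changing the splitting logic.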

kristaller486/Nebo-T1-Russian
New activity in benxh/Qwen2.5-VL-7B-Instruct-GGUF 17 days ago

Wrong format?
#1 opened 17 days ago by AlexPoto
reacted to nyuuzyou's post with 🤗 about 1 month ago
🗂️ I don't think the collections feature of Hugging Face is widely used, even though it's an excellent way to organize and discover interesting resources. To do my bit to change that, I've created two carefully curated collections that combine both my original work and other valuable datasets:

Educational Datasets
- Mostly English-Russian, but other languages are also included
- Extended by my new Begemot.ai dataset (2.7M+ Russian education records) nyuuzyou/begemot

Link: nyuuzyou/educational-datasets-677c268978ac1cec96cc3605

Anime & Art
- Extensive art-focused collection, including my new datasets:
  - Buzzly.art (2K artworks) nyuuzyou/buzzlyart
  - Paintberri (60K+ pieces) nyuuzyou/paintberri
  - Itaku.ee (924K+ items) nyuuzyou/itaku
- Extended with other amazing datasets from the community

Link: nyuuzyou/anime-and-art-677ae996682a389fccd892c3

Collections should become a more common feature - hopefully this will encourage others to create and share their own curated collections. By organizing related datasets into these themed collections, I hope to make it easier for researchers and developers to discover and use these valuable resources.