# DeepRetrieval

## Overview

DeepRetrieval is a novel approach that uses reinforcement learning (RL) to train Large Language Models (LLMs) for query generation without requiring supervised data. Instead of relying on expensive human-annotated or distilled reference queries, DeepRetrieval enables LLMs to learn through direct trial and error, using retrieval metrics as rewards.
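
To make "retrieval metrics as rewards" concrete, here is a minimal sketch of how a generated query could be scored. It is an illustration under stated assumptions, not the project's training code: `recall_at_k`, `query_reward`, `toy_search`, and the three-document corpus are all hypothetical names and data, and recall@k stands in for whichever metric a given task uses.

```python
from typing import Callable, Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top-k retrieved IDs."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def query_reward(
    query: str,
    search: Callable[[str], List[str]],
    relevant: Set[str],
    k: int = 10,
) -> float:
    """Scalar reward for one generated query: run the search, score the results."""
    return recall_at_k(search(query), relevant, k)


# Hypothetical toy corpus, used only to make the sketch runnable.
CORPUS: Dict[str, str] = {
    "d1": "statin therapy lowers ldl cholesterol",
    "d2": "aspirin for secondary stroke prevention",
    "d3": "ldl cholesterol and cardiovascular risk",
}


def toy_search(query: str) -> List[str]:
    """Rank documents by keyword overlap with the query (a stand-in for a real search engine)."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(text.split())), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


if __name__ == "__main__":
    relevant_docs = {"d1", "d3"}
    for candidate in ["ldl cholesterol", "statin", "stroke"]:
        print(candidate, query_reward(candidate, toy_search, relevant_docs))
```

In an RL loop, a scalar like this would be fed back to the policy for each sampled query, so no reference query is ever needed, only the set of relevant documents.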

## Key Features

- No Supervision Required: Eliminates the need for expensive human-annotated or distilled reference queries
- RL-Based Framework: Uses reinforcement learning to optimize query generation directly for retrieval performance
- Reasoning-Enhanced Generation: Incorporates a structured generation method with explicit reasoning before query formulation
- State-of-the-Art Performance: Achieves strong results across literature search, evidence-seeking retrieval, classic IR, and SQL database search
- Parameter Efficiency: With just 3B parameters, outperforms much larger models such as GPT-4o and Claude-3.5-Sonnet
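
The "reasoning before query formulation" idea can be sketched as a structured output that is parsed before the query is sent to the retriever. This README does not specify the output format, so the `<think>` and `<answer>` tag names and the `extract_query` helper below are assumptions chosen for illustration only.

```python
import re
from typing import Optional

# Assumed layout: one reasoning block followed by one query block.
# The tag names are hypothetical, not taken from the DeepRetrieval codebase.
_STRUCTURED_OUTPUT = re.compile(
    r"\s*<think>(?P<reasoning>.+?)</think>\s*<answer>(?P<query>.+?)</answer>\s*",
    re.DOTALL,
)


def extract_query(generation: str) -> Optional[str]:
    """Return the query if the generation follows the assumed layout, else None."""
    match = _STRUCTURED_OUTPUT.fullmatch(generation)
    if match is None:
        return None
    query = match.group("query").strip()
    return query or None


if __name__ == "__main__":
    well_formed = (
        "<think>The topic is cholesterol drugs, so add a synonym.</think>\n"
        "<answer>statin OR \"HMG-CoA reductase inhibitor\"</answer>"
    )
    print(extract_query(well_formed))
    print(extract_query("<answer>statin</answer>"))  # no reasoning block
```

One plausible design choice is to give a malformed generation (where this returns `None`) a low reward, so the model learns to keep the reasoning step; whether DeepRetrieval does this is not stated here.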

## Performance Highlights

- Literature Search: Raises recall roughly 2.6x on PubMed (65.07% vs previous SOTA 24.68%) and roughly 2x on ClinicalTrials.gov (63.18% vs previous SOTA 32.11%)
- Evidence-Seeking Retrieval: Matches industry-leading LLMs on NQ and TriviaQA, and significantly outperforms them on SQuAD
- Classic IR: Shows superior performance across diverse retrieval benchmarks
- SQL Database Search: Excels in text-to-SQL generation for database search


## About

DeepRetrieval was developed by researchers from the University of Illinois Urbana-Champaign. For more information, visit the [GitHub repository](https://github.com/pat-jj/DeepRetrieval).
