title: Pas2 Llm Hallucination Detector
emoji: 🐠
colorFrom: purple
colorTo: yellow
sdk: gradio
sdk_version: 5.20.1
app_file: app.py
pinned: false
license: mit
short_description: pas2 is an llm-as-a-judge system used to verify outputs
PAS2 - Hallucination Detection System
A system for detecting hallucinations in AI responses. It paraphrases the user's query, collects a response for each phrasing, and uses a second model as a judge to check the responses for factual inconsistencies.
Features
- Paraphrase Generation: Automatically generates semantically equivalent variations of user queries
- Multi-Model Architecture: Uses Mistral Large for responses and OpenAI's o3-mini as a judge
- Real-time Progress Tracking: Visual feedback during the analysis process
- Permanent Cloud Storage: User feedback and results are stored in MongoDB Atlas and persist across restarts
- Interactive Web Interface: Clean, responsive Gradio interface with example queries
- Detailed Analysis: Provides confidence scores, reasoning, and specific conflicting facts
- Statistics Dashboard: Real-time tracking of hallucination detection statistics
Setup
- Clone this repository
- Install dependencies:
pip install -r requirements.txt
- Set up your API keys as environment variables:
HF_MISTRAL_API_KEY: Your Mistral AI API key
HF_OPENAI_API_KEY: Your OpenAI API key
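Before starting the app, it can help to confirm that both variables are actually set. The following is a minimal sketch, not code from this repository; the helper names `missing_keys` and `require_keys` are hypothetical.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED_KEYS = ("HF_MISTRAL_API_KEY", "HF_OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def require_keys(env=None):
    """Raise early with a readable message if any key is missing."""
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_keys()` at startup turns a missing key into an immediate, readable error rather than a failure on the first API request.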
Deployment on Hugging Face Spaces
- Create a new Space on Hugging Face
- Select "Gradio" as the SDK
- Add your repository
- Set up a MongoDB Atlas database (see below)
- Set the following secrets in your Space's settings:
HF_MISTRAL_API_KEY
HF_OPENAI_API_KEY
MONGODB_URI
MongoDB Atlas Setup
For permanent data storage that persists across Hugging Face Space restarts:
- Create a free MongoDB Atlas account
- Create a new cluster (the free tier is sufficient)
- In the "Database Access" menu, create a database user with read/write permissions
- In the "Network Access" menu, add IP 0.0.0.0/0 to allow access from anywhere (required for Hugging Face Spaces)
- In the "Databases" section, click "Connect" and choose "Connect your application"
- Copy the connection string and replace <password> with your database user's password
- Set this as your MONGODB_URI secret in Hugging Face Spaces settings
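If the database password contains characters such as `@`, `/`, or `:`, it must be percent-encoded before it goes into the connection string. The sketch below shows one way to fill in the placeholder safely using only the standard library; the function name `fill_password` and the example host are hypothetical, not part of this project.

```python
from urllib.parse import quote_plus


def fill_password(connection_string, password):
    """Replace the <password> placeholder with a percent-encoded password."""
    placeholder = "<password>"
    if placeholder not in connection_string:
        raise ValueError("connection string has no <password> placeholder")
    return connection_string.replace(placeholder, quote_plus(password))


# Hypothetical connection string copied from the Atlas "Connect" dialog.
template = "mongodb+srv://pas2_user:<password>@cluster0.example.mongodb.net/"
mongodb_uri = fill_password(template, "p@ss/word")
```

The resulting value is what would be stored as the MONGODB_URI secret.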
Usage
- Enter a factual question or select from example queries
- Click "Detect Hallucinations" to start the analysis
- Review the detailed results:
- Hallucination detection status
- Confidence score
- Original and paraphrased responses
- Detailed reasoning and analysis
- Provide feedback to help improve the system
How It Works
Query Processing:
- Your question is paraphrased in multiple ways
- Each version is sent to Mistral Large
- Responses are collected and compared
Hallucination Detection:
- OpenAI's o3-mini analyzes responses
- Identifies factual inconsistencies
- Provides confidence scores and reasoning
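The two stages above can be sketched as a small pipeline. This is an illustrative outline under stated assumptions, not the code in app.py: the three callables stand in for the paraphrasing step, the Mistral Large call, and the o3-mini judge, and the function name `run_pipeline` and the verdict keys are hypothetical.

```python
from typing import Callable, Dict, List


def run_pipeline(
    query: str,
    paraphrase: Callable[[str, int], List[str]],
    respond: Callable[[str], str],
    judge: Callable[[str, List[str]], Dict],
    n_paraphrases: int = 3,
) -> Dict:
    """Paraphrase the query, answer every phrasing, then ask a judge."""
    # Step 1: the original query plus its paraphrases.
    questions = [query] + paraphrase(query, n_paraphrases)
    # Step 2: one response per phrasing (the responder model).
    answers = [respond(question) for question in questions]
    # Step 3: the judge compares the answers and returns its verdict.
    verdict = judge(query, answers)
    return {"questions": questions, "answers": answers, **verdict}


# Stand-in callables so the sketch runs without any API access.
def fake_paraphrase(query, n):
    return [f"{query} (variant {i + 1})" for i in range(n)]


def fake_respond(question):
    return "Paris"


def fake_judge(query, answers):
    # A real judge would be a model call; this stub only checks equality.
    return {"hallucination_detected": len(set(answers)) > 1, "confidence": 1.0}


result = run_pipeline("What is the capital of France?",
                      fake_paraphrase, fake_respond, fake_judge)
```

Passing the model calls in as functions keeps the control flow testable; real implementations would wrap the respective API clients.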
Feedback Collection:
- User feedback is stored in MongoDB Atlas
- Cloud-based persistent storage keeps feedback available after Space restarts
- Statistics are updated in real-time
- Data can be exported for further analysis
Data Persistence
The application uses MongoDB Atlas for data storage, providing several benefits:
- Permanent Storage: Data persists even when Hugging Face Spaces restart
- Scalability: MongoDB scales as your data grows
- Cloud-based: No reliance on Space-specific storage that can be lost
- Query Capabilities: Powerful query functionality for data analysis
- Export Options: Built-in methods to export data to CSV
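As a rough illustration of the CSV export idea, stored records can be flattened with the standard `csv` module once they have been read from the database. This is a sketch only: the function `records_to_csv` and the field names are hypothetical and do not reflect the application's actual schema or export methods.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text.

    Keys not listed in fieldnames (for example MongoDB's _id) are dropped;
    missing keys are written as empty cells.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


# Hypothetical feedback records, shaped like documents read from a collection.
feedback = [
    {"_id": 1, "query": "Who wrote Hamlet?", "hallucination_detected": False},
    {"_id": 2, "query": "When did Apollo 11 land?"},
]
csv_text = records_to_csv(feedback, ["query", "hallucination_detected"])
```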
Contributing
Contributions are welcome! Please feel free to submit pull requests.
License
This project is licensed under the MIT License - see the LICENSE file for details.
About
This application uses a combination of paraphrasing techniques and model-as-judge approaches to identify potential hallucinations in LLM responses. It provides confidence scores, identifies conflicting facts, and offers detailed reasoning for its judgments.
Features
- Generates paraphrased versions of input queries
- Evaluates responses using semantic similarity analysis
- Provides match percentage and similarity metrics
- Includes visualization tools for similarity matrices
- Web interface for interactive testing
- Benchmarking capabilities for bulk evaluation
Installation
git clone https://github.com/serhanylmz/pas2
cd pas2
pip install -r requirements.txt
Set up your OpenAI API key in a .env file:
OPENAI_API_KEY=your_api_key_here
Usage
Web Interface
Run the Gradio interface:
python pas2-gradio.py
Benchmark Tool
Run the benchmark tool:
python pas2-benchmark.py --json_file your_data.json --num_samples 10
Library Usage
from pas2 import PAS2

detector = PAS2()
hallucinated, response, questions, answers = detector.detect_hallucination(
    "your question",
    n_paraphrases=5,
    similarity_threshold=0.9,
    match_percentage_threshold=0.7
)
Configuration
- Default model: gpt-4-2024-08-06
- Default embedding model: text-embedding-3-small
- Adjustable similarity and match percentage thresholds
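To make the two thresholds concrete, here is one plausible way they could interact. How PAS2 actually combines them is not stated here, so treat the following as an assumption: two answers "match" when the cosine similarity of their embeddings reaches `similarity_threshold`, and a hallucination is flagged when the fraction of matching pairs falls below `match_percentage_threshold`. The function names are hypothetical, and plain lists stand in for embedding vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_hallucinated(embeddings, similarity_threshold=0.9,
                    match_percentage_threshold=0.7):
    """Flag a hallucination when too few answer pairs agree (assumed rule)."""
    pairs = [(i, j)
             for i in range(len(embeddings))
             for j in range(i + 1, len(embeddings))]
    if not pairs:
        return False  # a single answer has nothing to disagree with
    matches = sum(
        cosine_similarity(embeddings[i], embeddings[j]) >= similarity_threshold
        for i, j in pairs
    )
    match_percentage = matches / len(pairs)
    return match_percentage < match_percentage_threshold
```

Under this rule, raising `similarity_threshold` makes individual matches stricter, while raising `match_percentage_threshold` requires more of the answers to agree before a response is accepted.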
Output Files
- Similarity matrix plots (PNG)
- Match matrix plots (PNG)
- Benchmark results (CSV, TXT)
- User feedback logs (XLSX)
License
This project is licensed under the MIT License with an attribution requirement - see the LICENSE file for details.
Citation
If you use PAS2 in your research or project, please cite it as:
@software{pas2_2024,
  author = {Serhan Yilmaz},
  title = {PAS2 - Paraphrase-based AI System for Semantic Similarity},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/serhanylmz/pas2}
}
Attribution Requirements
When using PAS2, you must provide appropriate attribution by:
- Including the copyright notice and license in any copy or substantial portion of the software
- Citing the project in any publications, presentations, or documentation that uses or builds upon this work
- Maintaining a link to the original repository in any forks or derivative works
Contact
Serhan Yilmaz, Sabanci University