---
title: Gaia Llamaindex Agent
emoji: 🦙
colorFrom: red
colorTo: pink
sdk: docker
app_file: app.py
pinned: false
short_description: Test To Pass GAIA
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# 🦙 GAIA Benchmark Agent with LlamaIndex

This Space implements a complete LlamaIndex agent designed to tackle the GAIA (General AI Assistants) benchmark questions.

## Features

- **Local LLM**: The model runs inside the Hugging Face Space, so no external LLM API is required
- **LlamaIndex Integration**: Uses the ReAct agent framework for reasoning and tool use
- **GAIA API Integration**: Fetches questions and submits answers automatically
- **Tool Suite**: Web search, calculation, file reading, and more
- **User-Friendly Interface**: Gradio UI for testing and submission
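
As an illustration of what a calculation tool from the suite above might look like, here is a minimal sketch in plain Python. It is not the Space's actual implementation: the function name `calculate` and the choice to evaluate expressions through the `ast` module (rather than `eval`) are assumptions made for this example.

```python
import ast
import operator

# Map supported AST operator nodes to their arithmetic functions.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.USub: operator.neg,
    ast.UAdd: operator.pos,
}


def _evaluate(node):
    """Recursively evaluate a restricted arithmetic AST node."""
    if isinstance(node, ast.Constant):
        # Accept plain numbers only (bool is a subclass of int, so exclude it).
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
        raise ValueError("only numeric literals are supported")
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def calculate(expression: str) -> str:
    """Hypothetical calculator tool: returns the result as text, or an error message."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


if __name__ == "__main__":
    print(calculate("2 + 3 * 4"))
```

Returning a string in both the success and the error case keeps the tool easy to hand to an agent, which typically reads tool output as text. A function like this could then be wrapped as an agent tool in whatever way the agent framework expects.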

## Architecture

```
🦙 GAIA Agent
├── 🧠 Local LLM (DialoGPT/GPT-2)
├── 🔧 Agent Tools
│   ├── Web Search
│   ├── Calculator
│   ├── File Reader
│   └── GAIA API Client
├── 🤖 ReAct Agent (LlamaIndex)
└── 🖥️ Gradio Interface
```
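
To make the interaction between the LLM, the tools, and the ReAct agent more concrete, the following sketch shows a ReAct-style loop in plain Python. It is a simplified illustration and does not use LlamaIndex. The `Thought` / `Action` / `Action Input` / `Answer` text format, the `word_count` tool, and the `scripted_llm` stand-in for the local model are all assumptions made for this example.

```python
import re
from typing import Callable, Dict

# Hypothetical tool registry: tool name -> function taking and returning text.
TOOLS: Dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
}

ACTION_PATTERN = re.compile(r"Action:\s*(\w+)\s*\nAction Input:\s*(.*)")
ANSWER_PATTERN = re.compile(r"Answer:\s*(.*)")


def run_react(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate between model output and tool calls until an answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"

        # A final answer ends the loop.
        answer = ANSWER_PATTERN.search(reply)
        if answer:
            return answer.group(1).strip()

        # Otherwise the reply must request a tool call.
        action = ACTION_PATTERN.search(reply)
        if action is None:
            return "No answer: the model produced neither an action nor an answer."

        name, tool_input = action.group(1), action.group(2).strip()
        tool = TOOLS.get(name)
        observation = tool(tool_input) if tool else f"Unknown tool: {name}"

        # Feed the tool result back to the model on the next step.
        transcript += f"Observation: {observation}\n"
    return "No answer: step limit reached."


def scripted_llm(transcript: str) -> str:
    """Stand-in for the local model: replies depend on whether an observation exists."""
    if "Observation:" not in transcript:
        return (
            "Thought: I should count the words.\n"
            "Action: word_count\n"
            "Action Input: the quick brown fox"
        )
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the count.\nAnswer: {observation}"


if __name__ == "__main__":
    print(run_react("How many words are in 'the quick brown fox'?", scripted_llm))
```

The step limit matters with small local models, which may never emit a well-formed answer. Bounding the loop ensures every question finishes with some result.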

## Usage

1. **Test Single Questions**: Try individual GAIA questions
2. **Full Evaluation**: Process all 20 questions from the dataset
3. **Submit to GAIA**: Send answers for official scoring
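
The submission step can be pictured as collecting one answer per question into a single request body. The sketch below only assembles such a body and makes no network call. The field names (`username`, `agent_code`, `task_id`, `submitted_answer`) and the example URL are assumptions for illustration and should be checked against the scoring API actually used.

```python
import json
from typing import Dict


def build_submission(username: str, agent_code_url: str, answers: Dict[str, str]) -> Dict[str, object]:
    """Assemble a submission body; all field names here are assumptions."""
    return {
        "username": username.strip(),
        "agent_code": agent_code_url,
        "answers": [
            # One entry per question, with surrounding whitespace removed.
            {"task_id": task_id, "submitted_answer": answer.strip()}
            for task_id, answer in answers.items()
        ],
    }


if __name__ == "__main__":
    payload = build_submission(
        "alice",
        "https://example.com/my-space",  # placeholder URL
        {"task-1": " Paris ", "task-2": "42"},
    )
    print(json.dumps(payload, indent=2))
```

Stripping whitespace before submission is a cheap safeguard, since answers produced by a language model often carry trailing newlines or spaces.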

## Scoring Target

The goal is to reach **30% accuracy** on GAIA Level 1 questions, that is, 6 correct answers out of the 20 questions in a full evaluation run.
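
For a local sanity check before submitting, accuracy can be estimated against any questions for which reference answers are available. The sketch below assumes a simple case-insensitive exact match after whitespace normalization. The official scorer may apply different matching rules, and the names `normalize` and `accuracy` are made up for this example.

```python
from typing import Dict

TARGET_ACCURACY = 0.30  # the 30% goal stated above


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.lower().split())


def accuracy(predictions: Dict[str, str], references: Dict[str, str]) -> float:
    """Fraction of reference questions whose prediction matches after normalization."""
    if not references:
        return 0.0
    correct = sum(
        1
        for task_id, expected in references.items()
        if normalize(predictions.get(task_id, "")) == normalize(expected)
    )
    return correct / len(references)


if __name__ == "__main__":
    # Made-up data: 20 questions, of which the first 6 are answered correctly.
    references = {f"q{i}": f"answer {i}" for i in range(20)}
    predictions = {f"q{i}": f"Answer  {i}" for i in range(6)}
    score = accuracy(predictions, references)
    print(f"accuracy={score:.2f}, target met: {score >= TARGET_ACCURACY}")
```

Unanswered questions count as wrong here, which mirrors how a missing answer would lower the score in a full run.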

## Hardware Requirements

- CPU: Works on the free tier
- Memory: ~8 GB recommended
- GPU: Optional, but improves performance

## Getting Started

1. Clone or duplicate this Space
2. Run the application
3. Start with single question testing
4. Process all questions when ready
5. Submit to GAIA leaderboard

Built with ❤️ for the GAIA benchmark challenge!