metadata
title: Gaia Llamaindex Agent
emoji: 🦙
colorFrom: red
colorTo: pink
sdk: docker
app_file: app.py
pinned: false
short_description: Test To Pass GAIA
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
🦙 GAIA Benchmark Agent with LlamaIndex
This Space implements a LlamaIndex ReAct agent that answers questions from the GAIA (General AI Assistants) benchmark.
Features
- Local LLM: The model runs inside the Hugging Face Space itself, so no external LLM API is required
- LlamaIndex Integration: Uses ReAct agent framework for reasoning and tool use
- GAIA API Integration: Fetches questions and submits answers automatically
- Tool Suite: Web search, calculation, file reading, and more
- User-Friendly Interface: Gradio UI for testing and submission
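As an illustration of what a calculation tool in the suite might look like, here is a minimal sketch that evaluates arithmetic by walking the parsed syntax tree instead of calling `eval()`. The function name `calculate` and the set of allowed operators are assumptions made for this example; the Space's actual calculator may work differently.

```python
import ast
import operator

# Whitelisted operators (an assumption for this sketch); anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept numeric literals only (bool is a subclass of int, so exclude it).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval"))


print(calculate("2 + 3 * 4"))
```

Rejecting every node type that is not explicitly whitelisted keeps model-generated input from reaching function calls or attribute lookups. A production tool would probably also cap exponent sizes, which this sketch does not do.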
Architecture
🦙 GAIA Agent
├── 🧠 Local LLM (DialoGPT/GPT-2)
├── 🔧 Agent Tools
│   ├── Web Search
│   ├── Calculator
│   ├── File Reader
│   └── GAIA API Client
├── 🤖 ReAct Agent (LlamaIndex)
└── 🖥️ Gradio Interface
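To make the interaction between the LLM, the tools, and the ReAct agent more concrete, here is a library-free sketch of a ReAct-style loop. It does not use LlamaIndex and is not this Space's implementation: `react_loop`, `scripted_llm`, `multiply`, and the `Action: tool[input]` text format are all hypothetical names and conventions chosen for illustration, and a scripted function stands in for the local model.

```python
import re
from typing import Callable, Dict

# Assumed text conventions for this sketch, not LlamaIndex's actual prompt format.
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
ANSWER_RE = re.compile(r"Final Answer:\s*(.+)")


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate between model output and tool observations until an answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            transcript += "Observation: no action or final answer found\n"
            continue
        name, arg = action.group(1), action.group(2)
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"
    return ""  # step budget exhausted without a final answer


def multiply(arg: str) -> str:
    """Toy tool: multiply two integers written as 'a * b'."""
    a, b = (int(part) for part in arg.split("*"))
    return str(a * b)


def scripted_llm(transcript: str) -> str:
    """Stand-in for the local model: call a tool once, then answer from the observation."""
    match = re.search(r"Observation: (.+)", transcript)
    if match:
        return f"Thought: I have the result.\nFinal Answer: {match.group(1)}"
    return "Thought: I should compute this.\nAction: multiply[6 * 7]"


print(react_loop("What is 6 times 7?", scripted_llm, {"multiply": multiply}))
```

The `max_steps` cap matters with small local models, which may never emit a well-formed final answer; the loop then returns an empty string rather than running indefinitely.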
Usage
- Test Single Questions: Try individual GAIA questions
- Full Evaluation: Process all 20 questions from the dataset
- Submit to GAIA: Send answers for official scoring
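The full-evaluation step can be sketched as a loop that runs the agent on every question and collects answers ready for submission. The field names `task_id`, `question`, and `submitted_answer`, and the function `run_evaluation`, are assumptions for this example; check them against the payload format the GAIA API actually expects.

```python
from typing import Callable, Dict, List


def run_evaluation(
    questions: List[Dict[str, str]],
    agent: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Run the agent on each question; a failing question yields an empty answer."""
    answers = []
    for item in questions:
        try:
            answer = str(agent(item["question"])).strip()
        except Exception:
            # One broken question should not abort the whole run.
            answer = ""
        answers.append({"task_id": item["task_id"], "submitted_answer": answer})
    return answers


# Usage with a trivial stand-in agent (hypothetical data).
sample_questions = [
    {"task_id": "t1", "question": "What is 2 + 2?"},
    {"task_id": "t2", "question": "Name the capital of France."},
]
print(run_evaluation(sample_questions, lambda q: " 4 " if "2 + 2" in q else "Paris"))
```

Catching exceptions per question keeps a single tool failure from discarding the answers already collected.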
Scoring Target
The goal is at least 30% accuracy on GAIA Level 1 questions. For the 20 questions processed in a full evaluation, that means at least 6 correct answers.
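For a quick local check against the target before submitting, a rough sketch like the following can help. Official scoring happens on the GAIA side; the normalization here (trimming, lowercasing, collapsing whitespace) and the helper names `accuracy` and `meets_target` are assumptions, not the benchmark's actual matching rules.

```python
from typing import List


def normalize(answer: str) -> str:
    """Trim, lowercase, and collapse internal whitespace."""
    return " ".join(answer.strip().lower().split())


def accuracy(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions that match their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)


def meets_target(correct: int, total: int = 20, target_percent: int = 30) -> bool:
    """Integer comparison avoids floating-point edge cases at the threshold."""
    return correct * 100 >= target_percent * total


print(accuracy(["Paris ", "4"], ["paris", "5"]))
print(meets_target(6), meets_target(5))
```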
Hardware Requirements
- CPU: Works on free tier
- Memory: ~8GB recommended
- GPU: Optional but improves performance
Getting Started
- Clone or duplicate this Space
- Run the application
- Start with single question testing
- Process all questions when ready
- Submit to GAIA leaderboard
Built with ❤️ for the GAIA benchmark challenge!