metadata
title: Gaia Llamaindex Agent
emoji: 🦙
colorFrom: red
colorTo: pink
sdk: docker
app_file: app.py
pinned: false
short_description: Test To Pass GAIA

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

🦙 GAIA Benchmark Agent with LlamaIndex

This Space implements a complete LlamaIndex agent designed to tackle the GAIA (General AI Assistants) benchmark questions.

Features

  • Local LLM: Runs the language model inside the Hugging Face Space, so no external LLM API is required
  • LlamaIndex Integration: Uses ReAct agent framework for reasoning and tool use
  • GAIA API Integration: Fetches questions and submits answers automatically
  • Tool Suite: Web search, calculation, file reading, and more
  • User-Friendly Interface: Gradio UI for testing and submission
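As an illustration of one entry in the tool suite, here is a minimal sketch of how a calculator tool could evaluate arithmetic without calling `eval()`. The function name `safe_calculate` and the operator whitelist are assumptions made for this example; the Space's actual calculator may work differently.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_calculate(expression: str):
    """Evaluate a plain arithmetic expression by walking its syntax tree."""

    def _eval(node):
        # Numeric literals only (bool is excluded although it subclasses int).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

Function calls, attribute access, and names all raise `ValueError`, which keeps model-generated input from executing arbitrary code.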

Architecture

📦 GAIA Agent
├── 🧠 Local LLM (DialoGPT/GPT-2)
├── 🔧 Agent Tools
│   ├── Web Search
│   ├── Calculator
│   ├── File Reader
│   └── GAIA API Client
├── 🤖 ReAct Agent (LlamaIndex)
└── 🖥️ Gradio Interface
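To show how the pieces in this tree interact, the following is a library-free sketch of a ReAct-style loop: the model proposes an action, a tool runs, and the observation is fed back until a final answer appears. It does not use LlamaIndex, and every name in it (`TOOLS`, `run_react`, `scripted_llm`, the `Action: tool[arg]` text format) is hypothetical. The scripted function stands in for the local LLM so the example runs on its own.

```python
import re
from typing import Callable, Dict

# Hypothetical tool registry; the Space's real tools are not shown here.
TOOLS: Dict[str, Callable[[str], str]] = {
    # Toy calculator that only handles sums such as "2+3".
    "calculator": lambda expr: str(sum(float(x) for x in expr.split("+"))),
    "file_reader": lambda path: f"(contents of {path})",
}

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")
ANSWER_RE = re.compile(r"Final Answer:\s*(.*)")


def run_react(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate between model output and tool calls until an answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        answer = ANSWER_RE.search(reply)
        if answer:
            return answer.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            break  # The model produced neither an action nor an answer.
        name, arg = action.group(1), action.group(2)
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"
    return "no answer"


def scripted_llm(transcript: str) -> str:
    """Stand-in for the local model: the reply depends only on the transcript."""
    if "Observation:" not in transcript:
        return "Thought: I need to add the numbers.\nAction: calculator[2+3]"
    return "Thought: I have the result.\nFinal Answer: 5.0"


result = run_react("What is 2+3?", scripted_llm)
```

The `max_steps` cap matters with small local models, which can otherwise loop on the same action indefinitely.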

Usage

  1. Test Single Questions: Try individual GAIA questions
  2. Full Evaluation: Process all 20 questions from the dataset
  3. Submit to GAIA: Send answers for official scoring
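The submission step can be pictured as packaging the collected answers into one JSON payload and posting it to the scoring service. The sketch below uses only the standard library. The field names (`username`, `answers`, `task_id`, `submitted_answer`) are assumptions, not a confirmed schema, and the service URL is deliberately left as a parameter because it is not stated here.

```python
import json
import urllib.request
from typing import Dict


def build_submission(username: str, answers: Dict[str, str]) -> dict:
    """Package answers keyed by task id into a JSON-serializable payload.

    The field names are assumptions, not the scoring service's confirmed schema.
    """
    return {
        "username": username,
        "answers": [
            {"task_id": task_id, "submitted_answer": text.strip()}
            for task_id, text in sorted(answers.items())
        ],
    }


def post_json(url: str, payload: dict, timeout: float = 60.0) -> dict:
    """POST a JSON payload to the given URL and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the network call makes it possible to inspect exactly what would be sent before submitting for official scoring.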

Scoring Target

The goal is at least 30% accuracy on GAIA Level 1 questions. On the 20-question full evaluation, that means 6 or more correct answers.
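For local tracking, the 30% target can be checked with a short script. This sketch assumes a simple normalized exact match (lowercased, whitespace collapsed); the official scorer's matching rules are not described here and may differ. The function names are made up for this example.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(answer.lower().split())


def accuracy(predictions: dict, references: dict) -> float:
    """Fraction of reference questions whose normalized prediction matches exactly."""
    if not references:
        return 0.0
    correct = sum(
        1
        for task_id, expected in references.items()
        if normalize(predictions.get(task_id, "")) == normalize(expected)
    )
    return correct / len(references)


def correct_needed(total_questions: int, target_percent: int) -> int:
    """Smallest number of correct answers that reaches the target percentage."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (total_questions * target_percent + 99) // 100


needed = correct_needed(20, 30)  # 6 of 20 questions
```

Unanswered questions count as wrong in `accuracy`, since a missing prediction is compared as an empty string.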

Hardware Requirements

  • CPU: Works on free tier
  • Memory: ~8GB recommended
  • GPU: Optional but improves performance
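Because the GPU is optional, the model-loading code needs to fall back to CPU when no GPU is present. A minimal sketch, assuming the local model runs on PyTorch (an assumption, since the backend is not named here), with `pick_device` as a hypothetical helper:

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so nothing can use a GPU.
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
```

On the free CPU tier this resolves to `"cpu"`, so the same code runs unchanged on upgraded hardware.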

Getting Started

  1. Clone or duplicate this Space
  2. Run the application
  3. Start with single question testing
  4. Process all questions when ready
  5. Submit to GAIA leaderboard

Built with ❤️ for the GAIA benchmark challenge!