Building a Hugging Face AI Agent with Bright Data’s Web MCP Integration

Community Article Published August 26, 2025

Hugging Face recently added the mcp package to the huggingface_hub Python library, enabling you to build AI agents powered by HF models that can connect to MCP servers.

In this walkthrough, we’ll show you how to use the new Agent class to create an AI agent that connects to the Bright Data MCP server for retrieving data from the web and interacting with it.

By the end of this tutorial, you’ll have a web data-grounded AI agent that can assist you in deciding whether to buy products on Amazon.

The New, Experimental Agent Class in Hugging Face Hub SDK

As explained in more detail in the blog post Tiny Agents in Python: an MCP-powered agent in ~70 lines of code, the huggingface_hub client SDK can now act as an MCP client. Specifically, it can pull tools from MCP servers and pass them to an LLM during inference. Learn more in the MCPClient class documentation.

This update also introduces a dedicated Agent class that extends MCPClient and implements conversation management logic. The Agent class can connect to models exposed by the Inference Providers using your Hugging Face API key. It’s intentionally lightweight, focusing on the conversational loop, as you can verify in the class code on GitHub.
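To make the "pull tools from MCP servers and pass them to an LLM" step more concrete, here is a minimal sketch of the kind of conversion involved. It assumes an MCP tool description carries name, description, and inputSchema fields and maps them to the function-calling tool format used by chat-completion APIs. The helper name to_chat_tool and the sample tool schema are hypothetical, and this is not the SDK's actual code.

```python
from typing import Any


def to_chat_tool(mcp_tool: dict[str, Any]) -> dict[str, Any]:
    """Map an MCP tool description to a chat-completion tool definition."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            # MCP describes tool arguments with a JSON Schema object
            "parameters": mcp_tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Hypothetical tool description, shaped like an entry listed by an MCP server
sample_tool = {
    "name": "search_engine",
    "description": "Scrape search results from Google, Bing, or Yandex.",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

chat_tool = to_chat_tool(sample_tool)
print(chat_tool["function"]["name"])
```

The resulting dictionaries are what a model sees as callable functions during inference. The Agent class handles this step for you.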

Note: The Agent class is experimental and may change in future releases without prior notice.

How to Build a Hugging Face Agent that Connects to the Bright Data Web MCP

If you’re not familiar with it, the Bright Data Web MCP (also referred to simply as Web MCP) is an open-source MCP server that connects to the Bright Data AI infrastructure. Specifically, it exposes more than 60 tools, including:

| Tool | Description |
| --- | --- |
| search_engine | Scrape search results from Google, Bing, or Yandex. It returns SERP results in Markdown. |
| scrape_as_markdown | Scrape a single webpage and return the page content in Markdown. It can unlock pages, even with bot detection or CAPTCHA. |

On top of those two, there are other tools for web search, browser interaction, and structured data retrieval from popular domains like LinkedIn, Amazon, TikTok, and many others. For example, the web_data_amazon_product tool takes an Amazon product page and returns structured data, scraping the target page on the fly.

In this tutorial section, you’ll use the new Agent class to build an AI agent powered by a Hugging Face inference model and integrated with Bright Data’s Web MCP tools. This will enable your agent to ground its responses with fresh data, search the web, scrape web pages, and interact with online content.

The agent will generate a Markdown report that includes product data and insights about the Nintendo Switch 2, helping you decide whether it’s worth buying.

Follow the steps below to build it!

Prerequisites

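To follow this tutorial, you need Python and Node.js (for the npx command) installed locally, a Hugging Face API key, and a Bright Data API key.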

Install the required dependencies with:

pip install "huggingface_hub[mcp]>=0.32.2"

Step #1: Define the MCP Connection

The Agent class extends the MCPClient class, inheriting all its capabilities. In particular, if you look at the Agent code, you’ll notice it includes a servers field:

[Image: The servers field on Agent]

This field accepts configuration dictionaries for MCP integration. To connect to the Bright Data Web MCP, define the MCP configuration as follows:

bright_data_mcp_server = {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "@brightdata/mcp"],
    "env": {
      "API_TOKEN": "<YOUR_BRIGHT_DATA_API_KEY>",
      "PRO_MODE": "true"
    }
} 

Replace <YOUR_BRIGHT_DATA_API_KEY> with your actual API key.
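If you would rather not hardcode the key in your script, one option is to read it from an environment variable. The sketch below assumes a variable named BRIGHT_DATA_API_KEY. Both that name and the build_bright_data_config helper are hypothetical choices for this example, not something the Web MCP requires.

```python
import os


def build_bright_data_config(pro_mode: bool = False) -> dict:
    """Build the Web MCP server config, reading the API key from the environment."""
    # BRIGHT_DATA_API_KEY is a hypothetical variable name chosen for this sketch
    api_token = os.environ.get("BRIGHT_DATA_API_KEY")
    if not api_token:
        raise RuntimeError("Set the BRIGHT_DATA_API_KEY environment variable first")

    env = {"API_TOKEN": api_token}
    if pro_mode:
        # Environment variable values are strings, hence "true" rather than True
        env["PRO_MODE"] = "true"

    return {
        "type": "stdio",
        "command": "npx",
        "args": ["-y", "@brightdata/mcp"],
        "env": env,
    }
```

You could then write bright_data_mcp_server = build_bright_data_config(pro_mode=True) and use the result exactly like the dictionary above.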

When passed to the servers option in Agent, that configuration results in the following bash command being executed locally:

API_TOKEN=<YOUR_BRIGHT_DATA_API_KEY> PRO_MODE=true npx -y @brightdata/mcp

That's the command required to launch the Web MCP via the @brightdata/mcp npm package. Find out more in the official docs.

In other words, your agent will be able to start a local Bright Data Web MCP process and connect to it.
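To see how a stdio configuration lines up with the command shown above, the following sketch renders a config dictionary as a shell-style string. The equivalent_shell_command helper is hypothetical and exists only for illustration. The SDK launches the process itself and does not need this step.

```python
import shlex


def equivalent_shell_command(server: dict) -> str:
    """Render a stdio server config as the shell command it corresponds to."""
    env_prefix = [
        f"{key}={shlex.quote(value)}" for key, value in server.get("env", {}).items()
    ]
    command = [server["command"], *server.get("args", [])]
    return " ".join(env_prefix + [shlex.quote(part) for part in command])


server = {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "@brightdata/mcp"],
    "env": {"API_TOKEN": "my-key", "PRO_MODE": "true"},  # "my-key" is a dummy value
}

# prints: API_TOKEN=my-key PRO_MODE=true npx -y @brightdata/mcp
print(equivalent_shell_command(server))
```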

⚠️ Note: The PRO_MODE: true setting is optional! Enabling it gives you access to all 60+ tools provided by the server, but it may incur additional charges. That's because only the search_engine and scrape_as_markdown tools are included in the generous free tier (5,000 tool calls per month).

Step #2: Initialize an Agent Instance

Use the bright_data_mcp_server variable defined earlier to instantiate an instance of the Agent class:

from huggingface_hub import Agent

agent = Agent(
    servers=[bright_data_mcp_server],
    provider="nebius",
    model="Qwen/Qwen2.5-72B-Instruct",
    api_key="<YOUR_HUGGING_FACE_API_KEY>",
)

Replace the <YOUR_HUGGING_FACE_API_KEY> placeholder with your actual Hugging Face API token.

This will create an AI agent powered by the Qwen/Qwen2.5-72B-Instruct model, running through the Nebius inference provider available on the Hugging Face Hub. The agent will also be able to use the tools provided by the Bright Data Web MCP.

Feel free to replace the inference provider and model with the ones you prefer.
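To make the provider and model easy to swap, you could gather the constructor arguments in one place. This is a minimal sketch: the build_agent_kwargs helper is hypothetical, and it assumes your token is available in the HF_TOKEN environment variable.

```python
import os


def build_agent_kwargs(
    servers: list,
    provider: str = "nebius",
    model: str = "Qwen/Qwen2.5-72B-Instruct",
) -> dict:
    """Collect the keyword arguments for Agent, reading the token from HF_TOKEN."""
    api_key = os.environ.get("HF_TOKEN")
    if not api_key:
        raise RuntimeError("Set the HF_TOKEN environment variable first")
    return {
        "servers": servers,
        "provider": provider,
        "model": model,
        "api_key": api_key,
    }
```

With this in place, agent = Agent(**build_agent_kwargs([bright_data_mcp_server])) reproduces the setup above, and passing different provider and model values switches the backend without touching the rest of the script.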

Step #3: Load the MCP Tools

By default, the Agent class doesn’t load MCP tools automatically, even if you populated the servers field in the constructor. To actually load the tools, call the asynchronous load_tools() method:

await agent.load_tools()

You can then log all loaded MCP tools with the following code:

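print("Loaded MCP tools:")
for i, tool in enumerate(agent.available_tools, start=1):
    # Collapse newlines up front: reusing double quotes and backslashes inside
    # an f-string expression is a syntax error before Python 3.12
    description = tool.function.description.replace("\n", " ")
    print(f"{i}. {tool.function.name}: {description}")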

The result will look something like this:

Loaded MCP tools:
1. search_engine: Scrape search results from Google, Bing or Yandex. Returns SERP results in markdown (URL, title, description)
2. scrape_as_markdown: Scrape a single webpage URL with advanced options for content extraction and get back the results in MarkDown language. This tool can unlock any webpage even if it uses bot detection or CAPTCHA.
3. scrape_as_html: Scrape a single webpage URL with advanced options for content extraction and get back the results in HTML. This tool can unlock any webpage even if it uses bot detection or CAPTCHA.    

# Omitted for brevity...

58. scraping_browser_get_html: Get the HTML content of the current page. Avoid using the full_page option unless it is important to see things like script tags since this can be large
59. scraping_browser_scroll: Scroll to the bottom of the current page
60. scraping_browser_scroll_to: Scroll to a specific element on the page

Amazing! You can now confirm that your agent has access to the 60+ tools available in the Bright Data Web MCP.
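With this many tools, a grouped overview can be easier to scan than a flat list. The sketch below groups tool names by prefix. The group_tool_names helper and the chosen prefixes are assumptions based on the names printed above, not an official categorization.

```python
from collections import defaultdict


def group_tool_names(names: list[str]) -> dict[str, list[str]]:
    """Group tool names by a few known prefixes; the rest go under 'other'."""
    prefixes = ("web_data_", "scraping_browser_", "scrape_as_")
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        prefix = next((p for p in prefixes if name.startswith(p)), "other")
        groups[prefix].append(name)
    return dict(groups)


# A handful of tool names taken from this tutorial
names = [
    "search_engine",
    "scrape_as_markdown",
    "scrape_as_html",
    "web_data_amazon_product",
    "scraping_browser_scroll",
]

for prefix, members in group_tool_names(names).items():
    print(f"{prefix}: {', '.join(members)}")
```

In the real script, you would pass [tool.function.name for tool in agent.available_tools] instead of the hardcoded list.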

Step #4: Run the Agent

You now have an AI agent with MCP integration in place. To test the capabilities provided by the Web MCP tools, try a task that a regular LLM could not perform. For example, ask it to retrieve data from an Amazon page (e.g., Nintendo Switch 2) and analyze it to generate the top 3 reasons to buy and the top 3 reasons not to buy.

Execute the task and stream the agent’s response as below:

prompt = """
  You are a shopping expert. Retrieve data from the Amazon product page 
  "https://www.amazon.com/Nintendo-Switch-2-System/dp/B0F3GWXLTS" and create a concise Markdown report that includes the main product information (title, price, key features, etc.), the top 3 reasons to buy the product, and the top 3 reasons not to buy the product.
"""

# Run the task with the agent and stream the response in the terminal
async for chunk in agent.run(prompt):
    if hasattr(chunk, "role") and chunk.role == "tool":
        print(f"[TOOL] {chunk.name}: {chunk.content}\n\n", flush=True)
    else:
        delta_content = chunk.choices[0].delta.content
        if delta_content:
            print(delta_content, end="", flush=True)

Notice how the code distinguishes between tool logs and the agent’s streaming response. Tool outputs are marked with [TOOL], while the rest of the agent’s response is streamed incrementally in the terminal.
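The same distinction can be factored into a small helper, which also makes it easy to collect only the model's text (for example, to save the report later). This is a sketch under the assumption that chunks have the attributes used in the loop above. The split_chunk name and the SimpleNamespace stand-ins are hypothetical and only serve to exercise the helper without calling a model.

```python
from types import SimpleNamespace


def split_chunk(chunk):
    """Return ("tool", text) for tool results and ("text", delta) for model output."""
    if getattr(chunk, "role", None) == "tool":
        return "tool", f"{chunk.name}: {chunk.content}"
    delta = chunk.choices[0].delta.content
    return "text", delta or ""


# Stand-ins for streamed chunks; the content values are made up for this example
tool_chunk = SimpleNamespace(
    role="tool", name="web_data_amazon_product", content='{"title": "example"}'
)
text_chunk = SimpleNamespace(
    choices=[SimpleNamespace(delta=SimpleNamespace(content="# Report"))]
)

report_parts = []
for chunk in (tool_chunk, text_chunk):
    kind, text = split_chunk(chunk)
    if kind == "text":
        report_parts.append(text)

print("".join(report_parts))
```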

Once the task completes, don’t forget to close the agent and release its resources with:

await agent.cleanup()
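If the run raises an exception midway, the cleanup call above would be skipped. One way to guard against that is a try/finally block, sketched below. The run_and_cleanup helper and the FakeAgent class are hypothetical. FakeAgent only mimics the two methods used here so the example runs without any API keys.

```python
import asyncio


async def run_and_cleanup(agent, prompt: str) -> list:
    """Collect every chunk from agent.run() and always call agent.cleanup()."""
    chunks = []
    try:
        async for chunk in agent.run(prompt):
            chunks.append(chunk)
    finally:
        # Runs even if the stream raises, so the agent's resources are released
        await agent.cleanup()
    return chunks


class FakeAgent:
    """Hypothetical stand-in exposing the same two methods used above."""

    def __init__(self):
        self.cleaned_up = False

    async def run(self, prompt):
        yield f"echo: {prompt}"

    async def cleanup(self):
        self.cleaned_up = True


fake = FakeAgent()
result = asyncio.run(run_and_cleanup(fake, "hello"))
print(result, fake.cleaned_up)
```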

Step #5: Put It All Together

Your final Hugging Face Python agent script should contain:

import asyncio
from huggingface_hub import Agent

async def main():
    # Bright Data Web MCP configuration
    bright_data_mcp_server = {
        "type": "stdio",
        "command": "npx",
        "args": ["-y", "@brightdata/mcp"],
        "env": {
          "API_TOKEN": "<YOUR_BRIGHT_DATA_API_KEY>", # Replace with your BD API key
          "PRO_MODE": "true"
        }
    }

    # Initialize the agent
    agent = Agent(
        servers=[bright_data_mcp_server],
        provider="nebius",
        model="Qwen/Qwen2.5-72B-Instruct",
        api_key="<YOUR_HUGGING_FACE_API_KEY>", # Replace with your HF API key
    )
    # Load the MCP tools
    await agent.load_tools()

    # The task to be performed by the agent
    prompt = """
        You are a shopping expert. Retrieve data from the Amazon product page "https://www.amazon.com/Nintendo-Switch-2-System/dp/B0F3GWXLTS" and create a concise Markdown report that includes the main product information (title, price, key features, etc.), the top 3 reasons to buy the product, and the top 3 reasons not to buy the product.
    """

    # Run a task with the agent and stream the response in the terminal
    async for chunk in agent.run(prompt):
        if hasattr(chunk, "role") and chunk.role == "tool":
            print(f"[TOOL] {chunk.name}: {chunk.content}\n\n", flush=True)
        else:
            delta_content = chunk.choices[0].delta.content
            if delta_content:
                print(delta_content, end="", flush=True)

    # Close the agent and release its resources
    await agent.cleanup()

if __name__ == "__main__":
    asyncio.run(main())

Run the agent, and in the terminal you’ll see:

[Image: The tool output]

This confirms that the agent successfully called the web_data_amazon_product tool from the Bright Data Web MCP, which is the best tool for completing this task. As you can tell, the tool returns structured data for the specified Amazon product page.

If you save that data to a JSON file and open it in Visual Studio Code, you’ll see:

[Image: The tool output as formatted JSON in Visual Studio Code]
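Saving the tool output to a JSON file can be done with a few lines of Python. The sketch below assumes the tool result arrives as a JSON string and falls back to writing the raw text otherwise. The save_tool_output helper, the file name, and the sample field names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_tool_output(content: str, path: str = "tool_output.json") -> Path:
    """Pretty-print a tool result to disk if it is valid JSON, else save the raw text."""
    target = Path(path)
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError:
        target.write_text(content, encoding="utf-8")
    else:
        target.write_text(
            json.dumps(parsed, indent=2, ensure_ascii=False), encoding="utf-8"
        )
    return target


# Made-up sample; the real tool output has many more fields
sample = '[{"title": "Nintendo Switch 2 System", "brand": "Nintendo"}]'

with tempfile.TemporaryDirectory() as tmp:
    saved = save_tool_output(sample, f"{tmp}/tool_output.json")
    print(saved.read_text(encoding="utf-8"))
```

In the agent script, you could call save_tool_output(chunk.content) inside the branch that handles tool chunks.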

That is the same data shown on the target Amazon product page, only in a structured format:

[Image: The Amazon product target page]

Finally, the Markdown report produced by the agent will look something like this:

Great, we have retrieved the data for the Nintendo Switch 2 system from the Amazon product page. Now, let's summarize the key information and identify the top 3 reasons to buy and the top 3 reasons not to buy the product.

### Main Product Information

- **Title:** Nintendo Switch 2 System
- **Brand:** Nintendo
- **Price:** $449.99
- **Rating:** 4.7 out of 5 stars
- **Reviews Count:** 343
- **Availability:** Available
- **Product Dimensions:** 4.21 x 8.66 x 8.86 inches; 5.29 ounces
- **Included Items:**
  - Nintendo Switch 2 Console
  - Joy-Con 2 (L) in Light Blue
  - Joy-Con 2 (R) in Light Red
  - Nintendo Switch 2 AC Adapter
  - USB-C® Charging Cable
  - Nintendo Switch 2 Dock
  - Joy-Con 2 Grip
  - Joy-Con 2 Strap (×2)
  - Ultra High Speed HDMI™ Cable

### Key Features

- **The next evolution of Nintendo Switch**
- **Three play modes:** TV, Tabletop, and Handheld
- **Larger, vivid, 7.9” LCD touch screen** with support for HDR and up to 120 fps
- **Dock that supports 4K** when connected to a compatible TV*
- **GameChat** lets you voice chat, share your game screen, and connect via video chat as you play
- **256GB internal storage**, expandable with microSD Express cards (sold separately)
- **Backward compatibility** with physical and digital Nintendo Switch games**
- **Joy-Con 2 controllers** attach magnetically and now offer mouse controls
- **Same-system multiplayer, local wireless, and online multiplayer**

### Top 3 Reasons to Buy

1. **Enhanced Performance and Graphics:**
   - The Nintendo Switch 2 offers significantly improved performance with faster game loading, smoother frame rates, and sharper visuals. The 7.9” 1080p screen supports HDR and up to 120 fps, providing a much better gaming experience compared to the original Switch.

2. **Improved Hardware Design:**
   - The Joy-Con 2 controllers have been redesigned for better ergonomics and to address stick drift issues. They can now be used as mouse controls, enhancing the versatility of the console. The overall design is more premium and responsive.

3. **Extended Battery Life:**
   - Despite the performance boost, the battery life has been extended, allowing for longer gaming sessions in handheld mode. This is a significant improvement for portable gaming enthusiasts.

### Top 3 Reasons Not to Buy

1. **Limited Availability:**
   - Currently, the Nintendo Switch 2 is available by invitation only, which might make it difficult for everyone to purchase immediately.

2. **Higher Price Point:**
   - At $449.99, the Nintendo Switch 2 is more expensive than the original Switch. This might be a barrier for budget-conscious consumers.

3. **Not All Games Are Compatible:**
   - While the console is backward compatible with most Nintendo Switch games, some older games may not be fully supported or might require additional accessories. It’s recommended to check Nintendo’s website for compatibility information.

### Conclusion

The Nintendo Switch 2 is a significant upgrade over the original Switch, offering better performance, an improved design, and extended battery life. However, its higher price point and limited availability might be factors to consider before making a purchase. Additionally, ensure that your favorite games are compatible with the new system.

Is there anything else you would like to include in the report or any further questions about the product?

That’s exactly what you asked for! You can see it even more clearly by copying this content into a report.md file and exploring it in Visual Studio Code:

[Image: Exploring the resulting Markdown report in VS Code]

Et voilà! Thanks to Bright Data’s MCP integration, your Hugging Face agent was able to scrape data from a complex site like Amazon (known for its anti-scraping and anti-bot measures) and generate a concise, accurate, and actionable report.

Without this tool, the result would have been either:

  1. A message saying that the LLM cannot access Amazon pages, or
  2. Fabricated/hallucinated data that you can’t trust.

This quick example demonstrates the power of MCP integration in the Hugging Face Agent class. With other prompts, you can test numerous additional supported use cases.

[Extra] Define the Agent Using Tiny Agents

The Agent implementation comes from Tiny Agents, which supports configuring a custom agent with MCP integration directly through an agent.json file.

This means you can achieve the same functionality as before without writing a single line of Python code. Here’s how!

Start by installing the required dependencies:

pip install -U "huggingface_hub[mcp]"

Then, authenticate your Hugging Face connection in the CLI with:

hf auth login

You’ll be prompted to enter your Hugging Face API token:

[Image: Entering your Hugging Face API token in the terminal]

The key will be stored in your HF_HOME directory for future use.

Next, define a local agent.json file as below:

{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "provider": "nebius",
    "servers": [
        {
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@brightdata/mcp"],
            "env": {
              "API_TOKEN": "<YOUR_BRIGHT_DATA_API_KEY>",
              "PRO_MODE": "true"
            }
        }
    ]
}

This is equivalent to the bright_data_mcp_server variable and the Agent instantiation defined in the previous section.
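A common slip with configuration files like this is leaving a placeholder in place. The sketch below runs a quick sanity check on an agent.json-style document before you launch it. The find_config_problems helper and the checks it performs are assumptions made for this example, not part of Tiny Agents or its schema.

```python
import json


def find_config_problems(config: dict) -> list[str]:
    """Return human-readable problems found in an agent.json-style config."""
    problems = []
    servers = config.get("servers") or []
    if not servers:
        problems.append("no MCP servers configured")
    for index, server in enumerate(servers):
        for key, value in server.get("env", {}).items():
            # Angle-bracket values are the tutorial's placeholders, not real keys
            if value.startswith("<") and value.endswith(">"):
                problems.append(f"servers[{index}].env.{key} still holds a placeholder")
    return problems


raw = """
{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "provider": "nebius",
    "servers": [
        {
            "type": "stdio",
            "command": "npx",
            "args": ["-y", "@brightdata/mcp"],
            "env": {"API_TOKEN": "<YOUR_BRIGHT_DATA_API_KEY>", "PRO_MODE": "true"}
        }
    ]
}
"""

# prints: ['servers[0].env.API_TOKEN still holds a placeholder']
print(find_config_problems(json.loads(raw)))
```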

Run the agent with:

tiny-agents run agent.json

You’ll see the agent connect to all 60 MCP tools:

[Image: Note all the available tools]

Paste the prompt in the >> section:

You are a shopping expert. Retrieve data from the Amazon product page "https://www.amazon.com/Nintendo-Switch-2-System/dp/B0F3GWXLTS" and create a concise Markdown report that includes the main product information (title, price, key features, etc.), the top 3 reasons to buy the product, and the top 3 reasons not to buy the product.

Press Enter, and you should get:

[Image: Agent execution in Tiny Agents mode]

Notice how the agent calls the web_data_amazon_product tool and produces a Markdown report, just like in the Python-based approach. Mission complete!

Next Steps

In this guided blog post, you learned how to create a Hugging Face-powered agent that can connect to the Bright Data Web MCP for web data retrieval, page interaction, structured data extraction, web search, and more.

This was just a simple example, but you can easily extend it to build more complex AI agents. Thanks to Hugging Face and Bright Data, you can cover sophisticated, modern agentic AI scenarios!

Now it’s your turn: let us know what you think about this implementation, share any feedback, and ask any questions you may have.
