---
title: Baseer Self-Driving API
emoji: 🚗
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: Hierarchical API for Interfuser-HDPE self-driving system.
tags:
  - computer-vision
  - autonomous-driving
  - deep-learning
  - fastapi
  - pytorch
  - carla
  - self-driving
  - graduation-project
  - control-systems
  - object-tracking
---

# 🚗 Baseer Self-Driving API: The Hierarchical Brain

| **Service** | **Status** |
| :--- | :--- |
| **API Status** | ✅ **Online & Ready** |
| **Perception Engine** | 🧠 **[Interfuser-HDPE Model](https://huggingface.co/BaseerAI/Interfuser-Baseer-v1)** |
| **Core Logic** | 🚀 **FastAPI, Python** |

---

## 📋 Project Overview

Welcome to the **Baseer Self-Driving API**, the real-time, stateful decision-making engine for our advanced autonomous driving system. This Space hosts a high-performance FastAPI server that encapsulates the complete "brain" of our agent, going far beyond simple model inference.

This API orchestrates the entire driving task: it takes raw sensor data from a simulator like CARLA, processes it through our foundational **Interfuser-HDPE** perception model, and then uses our custom-built **Temporal Tracker** and **Hierarchical Controller** to output intelligent, safe, and interpretable driving commands.

---

## 🏗️ System Architecture: Where Perception Meets Control

Our system demonstrates a clean separation between the core perception model and the decision-making logic, which is hosted entirely within this Space.
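Conceptually, each step through this Space runs three stages in order: perception, tracking, then hierarchical control. A rough Python sketch of that orchestration follows; all class and method names here are hypothetical, for illustration only, and are not the Space's actual internals.

```python
# Hypothetical sketch of the per-step orchestration inside this Space.
# Class and method names are illustrative only, not the actual source.
class BaseerSession:
    def __init__(self, perception_model, tracker, controller):
        self.perception = perception_model  # Interfuser-HDPE wrapper
        self.tracker = tracker              # identity-aware temporal tracker
        self.controller = controller        # hierarchical rule-based controller

    def step(self, image, measurements):
        # 1. Perception: analyze the camera frame and vehicle measurements.
        scene = self.perception(image, measurements)
        # 2. Tracking: associate detections with persistent vehicle IDs.
        tracks = self.tracker.update(scene["detections"])
        # 3. Control: Safety First > Dynamic Obstacle Avoidance > Navigation.
        commands, reason = self.controller.decide(scene, tracks, measurements)
        return {"control_commands": commands, "reason": reason}
```

Each session holds its own tracker and controller instances, which is why the API is stateful and session-scoped.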
```
+-----------+        +------------------------------------+        +-------------------------+
|           |        |                                    |        |                         |
|  Client   | -----> |      Baseer API (This Space)       | -----> |  Interfuser-HDPE Model  |
|(e.g.CARLA)|  HTTP  |  +------------------------------+  |        |   (Perception Engine)   |
|           |        |  | FastAPI Server               |  |        |                         |
+-----------+        |  |  + Identity-Aware Tracker ✅ |  |        +-------------------------+
                     |  |  + Hierarchical Controller ✅|  |
                     |  +------------------------------+  |
                     +------------------------------------+
```

The client sends sensor data, and this API orchestrates everything: it calls the perception model for analysis, then uses its internal stateful modules (Tracker and Controller) to make a final, context-aware decision.

---

## ✨ Key Features & Innovations

This API's intelligence comes from our custom-built downstream modules:

### 🧠 **Intelligent Hierarchical Controller**

- **What it is:** A sophisticated decision-making module that operates on a clear hierarchy of rules: **`Safety First > Dynamic Obstacle Avoidance > Navigation`**.
- **Why it matters:** This structured approach ensures predictable, safe behavior and avoids the pitfalls of overly simplistic controllers. It produces actions that are aware of the full driving context.

### 💡 **Cautious Memory for Occlusions**

- **What it is:** Our controller features a short-term "grace period" memory. If a lead vehicle is temporarily occluded (e.g., behind a truck), the system remains cautious instead of accelerating dangerously into the unknown.
- **Why it matters:** This elegantly solves the "deadly flutter" problem common in autonomous agents and drastically improves safety in dynamic traffic.

### 👁️ **Identity-Aware Temporal Tracking**

- **What it is:** A custom-built, object-oriented tracker that maintains a consistent ID for every vehicle in the scene.
- **Why it matters:** It provides the stable, long-term context needed for our controller to make informed decisions about following, yielding, or reacting to specific agents in the environment.

### 🗣️ **Human-Readable Decisions**

- **What it is:** The API doesn't just return numbers; it returns a `reason` string (e.g., `"Following vehicle ID 15"`, `"Slowing for red light"`, `"Cautious: Lost track of lead vehicle"`).
- **Why it matters:** This provides unparalleled interpretability, making it easy to understand and debug the agent's behavior in real time.

---

## 🚀 How to Use

Interact with the API by making HTTP requests to its endpoints. The typical workflow is to start a session, run steps in a loop, and then end the session.

### 1. Start a New Session

This will initialize a new set of tracker and controller instances on the server.

**Request:**

```bash
curl -X POST "https://baseerai-baseer-server.hf.space/start_session"
```

**Example Response:**

```json
{
  "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}
```

### 2. Run a Simulation Step

Send the current camera view and vehicle measurements to be processed. The API will return control commands and a full analysis.
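The same workflow (start a session, send steps, end the session) can also be driven from Python. The sketch below is a minimal client, assuming the `requests` library; the base URL and JSON field names follow the curl examples in this section, and the exact image encoding expected by the server (e.g., JPEG bytes of a BGR frame) should match the `image_b64` description given with the request example.

```python
# Minimal client sketch for the session workflow described above.
# Assumes the `requests` library; base URL and field names are taken
# from the curl examples in this README.
import base64

BASE_URL = "https://baseerai-baseer-server.hf.space"

def build_step_payload(session_id, image_bytes, measurements):
    """Assemble the JSON body for /run_step from raw image bytes."""
    return {
        "session_id": session_id,
        # Encoding details (e.g., JPEG of a BGR frame) must match the server.
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "measurements": measurements,  # pos_global, theta, speed, target_point
    }

def run_episode(frames):
    """Drive one session; `frames` yields (image_bytes, measurements) pairs."""
    import requests  # imported here so the payload helper stays dependency-free

    session_id = requests.post(f"{BASE_URL}/start_session").json()["session_id"]
    try:
        for image_bytes, measurements in frames:
            payload = build_step_payload(session_id, image_bytes, measurements)
            result = requests.post(f"{BASE_URL}/run_step", json=payload).json()
            print(result["reason"], result["control_commands"])
    finally:
        # Always clean up the server-side session state.
        requests.post(f"{BASE_URL}/end_session", params={"session_id": session_id})
```

The `try`/`finally` ensures `/end_session` is called even if a step fails mid-episode, so stale sessions do not accumulate on the server.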
**Request:**

```bash
curl -X POST "https://baseerai-baseer-server.hf.space/run_step" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
    "image_b64": "your-base64-encoded-bgr-image-string",
    "measurements": {
      "pos_global": [105.0, -20.0],
      "theta": 1.57,
      "speed": 5.5,
      "target_point": [10.0, 0.0]
    }
  }'
```

**Example Response:**

```json
{
  "control_commands": {
    "steer": 0.05,
    "throttle": 0.6,
    "brake": false
  },
  "scene_analysis": {
    "is_junction": 0.02,
    "traffic_light_state": 0.95,
    "stop_sign": 0.01
  },
  "predicted_waypoints": [
    [1.0, 0.05], [2.0, 0.06], [3.0, 0.07], [4.0, 0.07], [5.0, 0.08],
    [6.0, 0.08], [7.0, 0.09], [8.0, 0.09], [9.0, 0.10], [10.0, 0.10]
  ],
  "dashboard_b64": "a-very-long-base64-string-representing-the-dashboard-image...",
  "reason": "Following vehicle ID 15"
}
```

**Response Fields:**

- **`control_commands`**: The final commands to be applied to the vehicle.
- **`scene_analysis`**: Probabilities for different road hazards. A high `traffic_light_state` value (e.g., > 0.5) indicates a red light.
- **`predicted_waypoints`**: The model's intended path, relative to the vehicle.
- **`dashboard_b64`**: A Base64-encoded JPEG image of the full dashboard view, which can be directly displayed in a client application.
- **`reason`**: A human-readable string explaining the primary reason for the control action (e.g., "Following vehicle ID 15", "Slowing for red light", "Cautious: Lost track of lead vehicle").

### 3. End the Session

This will clean up the session data from the server.

**Request:**

```bash
curl -X POST "https://baseerai-baseer-server.hf.space/end_session?session_id=a1b2c3d4-e5f6-7890-1234-567890abcdef"
```

**Example Response:**

```json
{
  "message": "Session a1b2c3d4-e5f6-7890-1234-567890abcdef ended."
}
```

---

## 📡 API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Landing page with API status. |
| `/docs` | GET | Interactive API documentation (Swagger UI). |
| `/start_session` | POST | Initializes a new driving session. |
| `/run_step` | POST | Processes a single frame and returns control commands. |
| `/end_session` | POST | Terminates a specific session. |
| `/sessions` | GET | Lists all currently active sessions. |

---

## 🎯 Intended Use Cases & Limitations

✅ **Optimal Use Cases**

- **Closed-loop simulation** and evaluation of full-stack driving agents in CARLA.
- **Researching the interplay** between advanced perception and intelligent control.
- **Rapid prototyping** of complex driving behaviors, such as cautious following and yielding.
- Serving as a "smart agent" for creating dynamic traffic scenarios.

⚠️ **Limitations**

- **Simulation-Only:** Designed for CARLA. Not for real-world vehicles.
- **Rule-Based High-Level Logic:** The controller's decision-making, while advanced, is based on a deterministic, hierarchical rule set, not end-to-end learning.
- **Vision-Based:** Inherits the limitations of its camera-only perception model in adverse weather or lighting.

---

## 🛠️ Development & Relation to Research

This API is the practical, deployed application of the research conducted for an Artificial Intelligence graduation thesis. It serves as the "brain" that utilizes the perception outputs from our foundational model.

- **Core Perception Model:** **[Interfuser-HDPE (Model Weights & Research)](https://huggingface.co/BaseerAI/Interfuser-Baseer-v1)**
- **Core Logic:** The Tracker and Controller modules are custom Python implementations hosted in this Space.
- **API Framework:** FastAPI

## 👨‍💻 Development

- **Lead Researcher:** Adam Altawil
- **Project Type:** Graduation Project - AI & Autonomous Driving
- **Contact:** [Your Contact Information]

## 📄 License

This project is licensed under the MIT License.

---