---
title: Baseer Self-Driving API
emoji: π
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: A RESTful API for an InterFuser-based self-driving model.
tags:
- computer-vision
- autonomous-driving
- deep-learning
- fastapi
- pytorch
- interfuser
- graduation-project
- carla
- self-driving
---
# Baseer Self-Driving API
## Project Description
Baseer is an advanced self-driving system that provides a robust, real-time API for autonomous vehicle control. This Space hosts the FastAPI server that acts as an interface to the fine-tuned Interfuser-Baseer-v1 model.
The system is designed to take a live camera feed and vehicle measurements, process them through the deep learning model, and return actionable control commands and a comprehensive scene analysis.
## Architecture
This project follows a decoupled client-server architecture, where the model and the application are managed separately for better modularity and scalability.
```
+-----------+      +------------------------+      +--------------------------+
|           |      |                        |      |                          |
|  Client   |  ->  |   Baseer API (Space)   |  ->  |  Interfuser Model (Hub)  |
|(e.g.CARLA)|      |    (FastAPI Server)    |      | (Private/Gated Weights)  |
|           |      |                        |      |                          |
+-----------+      +------------------------+      +--------------------------+
     HTTP                Loads Model                   Model Repository
   Request
```
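As a rough illustration of the right-hand side of this diagram, the sketch below shows one way a Docker Space can fetch gated weights from the Hub at startup using `huggingface_hub`. The repository ID, filename, and `HF_TOKEN` secret are assumptions for illustration; they are not taken from this Space's actual code.

```python
import os

import torch
from huggingface_hub import hf_hub_download

# Hypothetical repo ID and filename; the real weights are private/gated on the Hub.
WEIGHTS_REPO = "Adam-IT/Interfuser-Baseer-v1"
WEIGHTS_FILE = "model.pth"

# A Space secret (e.g. HF_TOKEN) lets the server download gated weights at startup.
weights_path = hf_hub_download(
    repo_id=WEIGHTS_REPO,
    filename=WEIGHTS_FILE,
    token=os.environ.get("HF_TOKEN"),
)

# Load the checkpoint on CPU; the server can move it to GPU if one is available.
checkpoint = torch.load(weights_path, map_location="cpu")
```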
## Key Features
### Advanced Perception Engine
- Powered by: The Interfuser-Baseer-v1 model.
- Focus: High-accuracy traffic object detection and safe waypoint prediction.
- Scene Analysis: Real-time assessment of junctions, traffic lights, and stop signs.
### High-Performance API
- Framework: Built with FastAPI for high throughput and low latency.
- Stateful Sessions: Manages multiple, independent driving sessions, each with its own tracker and controller state (see the sketch after this list).
- RESTful Interface: Intuitive and easy-to-use API design.
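To make the stateful-session point concrete, here is a deliberately simplified sketch (not this Space's actual server code) of a session registry in which every session ID owns its own tracker and controller; `Tracker` and `Controller` are placeholders for the real classes.

```python
import uuid


class Tracker:
    """Placeholder for the per-session object tracker."""


class Controller:
    """Placeholder for the per-session low-level controller."""


# One entry per active driving session, keyed by session ID.
SESSIONS: dict[str, dict] = {}


def start_session() -> str:
    """Create an independent tracker/controller pair and return its session ID."""
    session_id = str(uuid.uuid4())
    SESSIONS[session_id] = {"tracker": Tracker(), "controller": Controller()}
    return session_id


def end_session(session_id: str) -> None:
    """Drop all state associated with a finished session."""
    SESSIONS.pop(session_id, None)
```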
### Comprehensive Outputs
- Control Commands: `steer`, `throttle`, `brake`.
- Scene Analysis: Probabilities for junctions, traffic lights, and stop signs.
- Predicted Waypoints: The model's intended path for the next 10 steps.
- Visual Dashboard: A generated image that provides a complete, human-readable overview of the current state.
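Put together, a `/run_step` result can be pictured roughly as below. The key names are illustrative guesses, not the API's documented schema; only the kinds of fields mirror the list above.

```python
# Illustrative shape only; the real /run_step response keys may differ.
example_result = {
    "control": {"steer": -0.05, "throttle": 0.40, "brake": 0.0},
    "scene": {"junction": 0.02, "traffic_light": 0.91, "stop_sign": 0.01},
    "waypoints": [[0.0, 1.2], [0.1, 2.4]],   # up to 10 predicted (x, y) points
    "dashboard_b64": "<base64-encoded dashboard image>",
}
```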
## How to Use
Interact with the API by making HTTP requests to its endpoints.
### 1. Start a New Session
This will initialize a new set of tracker and controller instances on the server.
```bash
curl -X POST "https://adam-it-baseer-server.hf.space/start_session"
```

Response: `{"session_id": "your-new-session-id"}`
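The same call from Python, using the `requests` package:

```python
import requests

BASE_URL = "https://adam-it-baseer-server.hf.space"

resp = requests.post(f"{BASE_URL}/start_session")
resp.raise_for_status()
session_id = resp.json()["session_id"]
print("Started session:", session_id)
```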
### 2. Run a Simulation Step
Send the current camera view and vehicle measurements to be processed.
```bash
curl -X POST "https://adam-it-baseer-server.hf.space/run_step" \
  -H "Content-Type: application/json" \
  -d '{
        "session_id": "your-new-session-id",
        "image_b64": "your-base64-encoded-bgr-image-string",
        "measurements": {
          "pos_global": [105.0, -20.0],
          "theta": 1.57,
          "speed": 5.5,
          "target_point": [10.0, 0.0]
        }
      }'
```
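The same request can be built in Python. The sketch below encodes an OpenCV (BGR) frame to base64 before sending it; JPEG encoding and the units noted in the comments (radians for `theta`, m/s for `speed`) are assumptions rather than documented guarantees.

```python
import base64

import cv2
import requests

BASE_URL = "https://adam-it-baseer-server.hf.space"
session_id = "your-new-session-id"        # use the ID returned by /start_session

# cv2.imread returns a BGR image, matching the "bgr" expectation of image_b64.
frame = cv2.imread("front_camera.png")

# JPEG is an assumption; any encoding the server can decode would work.
_, buffer = cv2.imencode(".jpg", frame)
image_b64 = base64.b64encode(buffer.tobytes()).decode("ascii")

payload = {
    "session_id": session_id,
    "image_b64": image_b64,
    "measurements": {
        "pos_global": [105.0, -20.0],     # global (x, y) position
        "theta": 1.57,                    # heading (assumed radians)
        "speed": 5.5,                     # assumed m/s
        "target_point": [10.0, 0.0],      # target point relative to the vehicle
    },
}

result = requests.post(f"{BASE_URL}/run_step", json=payload).json()
print(result)
```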
### 3. End the Session
This will clean up the session data from the server.
```bash
curl -X POST "https://adam-it-baseer-server.hf.space/end_session?session_id=your-new-session-id"
```
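Or from Python, passing the session ID as a query parameter exactly as in the curl example:

```python
import requests

requests.post(
    "https://adam-it-baseer-server.hf.space/end_session",
    params={"session_id": "your-new-session-id"},
)
```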
## API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Landing page with API status. |
| `/docs` | GET | Interactive API documentation (Swagger UI). |
| `/start_session` | POST | Initializes a new driving session. |
| `/run_step` | POST | Processes a single frame and returns control commands. |
| `/end_session` | POST | Terminates a specific session. |
| `/sessions` | GET | Lists all currently active sessions. |
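For completeness, the listing endpoint can be queried the same way; its response format is not documented here, so this sketch simply prints whatever JSON the server returns.

```python
import requests

active = requests.get("https://adam-it-baseer-server.hf.space/sessions").json()
print(active)
```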
## Intended Use Cases & Limitations
### Optimal Use Cases
- Simulating driving in CARLA environments.
- Research in end-to-end autonomous driving.
- Testing perception and control modules in a closed-loop system.
- Real-time object detection and trajectory planning.
### Limitations
- Simulation-Only: Trained exclusively on CARLA data. Not suitable for real-world driving.
- Vision-Based: Relies on a single front-facing camera and has inherent blind spots.
- No LiDAR: Lacks the robustness of sensor fusion in adverse conditions.
## Development
This project is part of a graduation thesis in Artificial Intelligence.
- Deep Learning: PyTorch
- API Server: FastAPI
- Image Processing: OpenCV
- Scientific Computing: NumPy
## Contact
For inquiries or support, please use the Community tab in this Space or open an issue in the project's GitHub repository (if available).
Developed by: Adam-IT
License: MIT