title: Baseer Self-Driving API
emoji: 🚗
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: A RESTful API for an InterFuser-based self-driving model.
tags:
  - computer-vision
  - autonomous-driving
  - deep-learning
  - fastapi
  - pytorch
  - interfuser
  - graduation-project
  - carla
  - self-driving

🚗 Baseer Self-Driving API

Frameworks: FastAPI, PyTorch

📋 Project Description

Baseer is a self-driving system that exposes a real-time API for autonomous vehicle control in simulation. This Space hosts the FastAPI server that serves as the interface to the fine-tuned Interfuser-Baseer-v1 model.

The system is designed to take a live camera feed and vehicle measurements, process them through the deep learning model, and return actionable control commands and a comprehensive scene analysis.


🏗️ Architecture

This project follows a decoupled client-server architecture, where the model and the application are managed separately for better modularity and scalability.

+-----------+    +------------------------+    +--------------------------+
|           |    |                        |    |                          |
|  Client   | -> |   Baseer API (Space)   | -> |  Interfuser Model (Hub)  |
|(e.g.CARLA)|    |   (FastAPI Server)     |    | (Private/Gated Weights)  |
|           |    |                        |    |                          |
+-----------+    +------------------------+    +--------------------------+
     HTTP              Loads Model                    Model Repository
   Request

✨ Key Features

🧠 Advanced Perception Engine

  • Powered by: The Interfuser-Baseer-v1 model.
  • Focus: High-accuracy traffic object detection and safe waypoint prediction.
  • Scene Analysis: Real-time assessment of junctions, traffic lights, and stop signs.

⚑ High-Performance API

  • Framework: Built with FastAPI for high throughput and low latency.
  • Stateful Sessions: Manages multiple, independent driving sessions, each with its own tracker and controller state.
  • RESTful Interface: Intuitive and easy-to-use API design.
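
The stateful-session idea can be pictured as a small in-memory registry keyed by session ID. The following is a minimal sketch and not the server's actual code: `SessionStore`, `SessionState`, and their fields are hypothetical names, and plain dictionaries stand in for the real tracker and controller objects.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Per-session state; the dicts stand in for real tracker/controller objects."""
    tracker: dict = field(default_factory=dict)
    controller: dict = field(default_factory=dict)
    frame_count: int = 0


class SessionStore:
    """Hypothetical in-memory registry of independent driving sessions."""

    def __init__(self) -> None:
        self._sessions: dict[str, SessionState] = {}

    def start(self) -> str:
        # Each session gets a fresh state object under a random UUID.
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = SessionState()
        return session_id

    def get(self, session_id: str) -> SessionState:
        # Raises KeyError for unknown or already-ended sessions.
        return self._sessions[session_id]

    def end(self, session_id: str) -> bool:
        # True if a session was removed, False if the ID was unknown.
        return self._sessions.pop(session_id, None) is not None

    def active(self) -> list[str]:
        return list(self._sessions)
```

Because every session owns its own state object, updating one session's tracker or frame counter cannot affect another session.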

📊 Comprehensive Outputs

  • Control Commands: steer, throttle, brake.
  • Scene Analysis: Probabilities for junctions, traffic lights, and stop signs.
  • Predicted Waypoints: The model's intended path for the next 10 steps.
  • Visual Dashboard: A generated image that provides a complete, human-readable overview of the current state.

🚀 How to Use

Interact with the API by making HTTP requests to its endpoints. The typical workflow is to start a session, run steps in a loop, and then end the session.

1. Start a New Session

This initializes a new tracker and controller on the server and returns a session ID to use in subsequent requests.

Request:

curl -X POST "https://BaseerAI-baseer-server.hf.space/start_session"

Example Response:

{
  "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}

2. Run a Simulation Step

Send the current camera view and vehicle measurements to be processed. The API will return control commands and a full analysis.

Request:

curl -X POST "https://BaseerAI-baseer-server.hf.space/run_step" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
    "image_b64": "your-base64-encoded-bgr-image-string",
    "measurements": {
      "pos_global": [105.0, -20.0],
      "theta": 1.57,
      "speed": 5.5,
      "target_point": [10.0, 0.0]
    }
  }'
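
The same request body can be assembled in Python. This is a sketch under stated assumptions: `build_run_step_payload` is a hypothetical helper name, and the image bytes are a placeholder, since the README does not specify how the BGR frame should be encoded before Base64 conversion.

```python
import base64
import json


def build_run_step_payload(session_id, image_bytes, pos_global, theta, speed, target_point):
    """Assemble the JSON body for POST /run_step from raw values."""
    return {
        "session_id": session_id,
        # Base64 text is JSON-safe; the expected image encoding is an assumption.
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "measurements": {
            "pos_global": list(pos_global),
            "theta": float(theta),
            "speed": float(speed),
            "target_point": list(target_point),
        },
    }


payload = build_run_step_payload(
    session_id="a1b2c3d4-e5f6-7890-1234-567890abcdef",
    image_bytes=b"placeholder-image-bytes",  # stand-in for an encoded camera frame
    pos_global=(105.0, -20.0),
    theta=1.57,
    speed=5.5,
    target_point=(10.0, 0.0),
)
body = json.dumps(payload)  # string to send as the request body
```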

Example Response:

{
  "control_commands": {
    "steer": 0.05,
    "throttle": 0.6,
    "brake": false
  },
  "scene_analysis": {
    "is_junction": 0.02,
    "traffic_light_state": 0.95,
    "stop_sign": 0.01
  },
  "predicted_waypoints": [
    [1.0, 0.05],
    [2.0, 0.06],
    [3.0, 0.07],
    [4.0, 0.07],
    [5.0, 0.08],
    [6.0, 0.08],
    [7.0, 0.09],
    [8.0, 0.09],
    [9.0, 0.10],
    [10.0, 0.10]
  ],
  "dashboard_b64": "a-very-long-base64-string-representing-the-dashboard-image...",
  "reason": "Red Light"
}

Response Fields:

  • control_commands: The final commands to be applied to the vehicle.
  • scene_analysis: Probabilities for junctions, traffic lights, and stop signs. A high traffic_light_state value (e.g., > 0.5) indicates a red light.
  • predicted_waypoints: The model's intended path, relative to the vehicle.
  • dashboard_b64: A Base64-encoded JPEG image of the full dashboard view, which can be directly displayed in a client application.
  • reason: A human-readable string explaining the primary reason for the control action (e.g., "Following ID 12", "Red Light", "Cruising").
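
A client might reduce a response to the handful of values its control loop needs. The sketch below assumes the field names shown in the example response; `summarize_step` and its `red_light_threshold` parameter are hypothetical, and the dashboard bytes in the example are fake placeholder data.

```python
import base64


def summarize_step(response, red_light_threshold=0.5):
    """Extract the fields a client loop typically needs from a /run_step response."""
    controls = response["control_commands"]
    scene = response["scene_analysis"]
    return {
        "steer": controls["steer"],
        "throttle": controls["throttle"],
        "brake": controls["brake"],
        # The 0.5 default follows the example threshold for traffic_light_state.
        "red_light": scene["traffic_light_state"] > red_light_threshold,
        "next_waypoint": response["predicted_waypoints"][0],
        # Decoded image bytes, ready to write to disk or display.
        "dashboard_jpeg": base64.b64decode(response["dashboard_b64"]),
        "reason": response["reason"],
    }


example = {
    "control_commands": {"steer": 0.05, "throttle": 0.6, "brake": False},
    "scene_analysis": {"is_junction": 0.02, "traffic_light_state": 0.95, "stop_sign": 0.01},
    "predicted_waypoints": [[1.0, 0.05], [2.0, 0.06]],
    "dashboard_b64": base64.b64encode(b"fake-jpeg-bytes").decode("ascii"),
    "reason": "Red Light",
}
summary = summarize_step(example)
```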

3. End the Session

This removes the session's data from the server.

Request:

curl -X POST "https://BaseerAI-baseer-server.hf.space/end_session?session_id=a1b2c3d4-e5f6-7890-1234-567890abcdef"

Example Response:

{
  "message": "Session a1b2c3d4-e5f6-7890-1234-567890abcdef ended."
}
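
Putting the three calls together, a client loop could look roughly like the sketch below. It uses only the Python standard library; `http_post` and `drive` are hypothetical helper names, and the `post` parameter exists so the transport can be swapped for a stub when no server is reachable.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://BaseerAI-baseer-server.hf.space"


def http_post(path, payload=None):
    """POST JSON to the Space and decode the JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return json.loads(reply.read().decode("utf-8"))


def drive(frames, post=http_post):
    """Start a session, run one step per frame, and always end the session."""
    session_id = post("/start_session")["session_id"]
    results = []
    try:
        for image_b64, measurements in frames:
            results.append(post("/run_step", {
                "session_id": session_id,
                "image_b64": image_b64,
                "measurements": measurements,
            }))
    finally:
        # /end_session takes the ID as a query parameter, as in the curl example.
        query = urllib.parse.urlencode({"session_id": session_id})
        post("/end_session?" + query)
    return results
```

Ending the session in a `finally` block means the server-side state is released even if a step raises an error partway through the loop.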

📡 API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| / | GET | Landing page with API status. |
| /docs | GET | Interactive API documentation (Swagger UI). |
| /start_session | POST | Initializes a new driving session. |
| /run_step | POST | Processes a single frame and returns control commands. |
| /end_session | POST | Terminates a specific session. |
| /sessions | GET | Lists all currently active sessions. |

🎯 Intended Use Cases & Limitations

✅ Optimal Use Cases

  • Simulating driving in CARLA environments.
  • Research in end-to-end autonomous driving.
  • Testing perception and control modules in a closed-loop system.
  • Real-time object detection and trajectory planning.

⚠️ Limitations

  • Simulation-Only: Trained exclusively on CARLA data. Not suitable for real-world driving.
  • Vision-Based: Relies on a single front-facing camera and has inherent blind spots.
  • No LiDAR: Lacks the robustness of sensor fusion in adverse conditions.

🛠️ Development

This project is part of a graduation thesis in Artificial Intelligence.

  • Deep Learning: PyTorch
  • API Server: FastAPI
  • Image Processing: OpenCV
  • Scientific Computing: NumPy

📞 Contact

For inquiries or support, please use the Community tab in this Space or open an issue in the project's GitHub repository (if available).


Developed by: Adam Altawil
License: MIT