---
title: Baseer Self-Driving API
emoji: 🚗
colorFrom: blue
colorTo: red
sdk: docker
app_port: 7860
pinned: true
license: mit
short_description: Hierarchical API for Interfuser-HDPE self-driving system.
tags:
  - computer-vision
  - autonomous-driving
  - deep-learning
  - fastapi
  - pytorch
  - carla
  - self-driving
  - graduation-project
  - control-systems
  - object-tracking
---

# 🚗 Baseer Self-Driving API: The Hierarchical Brain

**Service** | **Status**
:--- | :---
**API Status** | ✅ **Online & Ready**
**Perception Engine** | 🧠 **[Interfuser-HDPE Model](https://huggingface.co/BaseerAI/Interfuser-Baseer-v1)**
**Core Logic** | 🚀 **FastAPI, Python**

---

## 📋 Project Overview

Welcome to the **Baseer Self-Driving API**, the real-time, stateful decision-making engine for our autonomous driving system. This Space hosts a FastAPI server that encapsulates the complete "brain" of our agent: perception, tracking, and control, rather than model inference alone.

This API orchestrates the entire driving task: it takes raw sensor data from a simulator like CARLA, processes it through our foundational **Interfuser-HDPE** perception model, and then uses our custom-built **Temporal Tracker** and **Hierarchical Controller** to output intelligent, safe, and interpretable driving commands.

---

## 🏗️ System Architecture: Where Perception Meets Control

Our system demonstrates a clean separation between the core perception model and the decision-making logic, which is hosted entirely within this Space.

```
+-----------+       +--------------------------------------------+      +-------------------------+
|           |       |                                            |      |                         |
|  Client   | ----> |          Baseer API (This Space)           | ---> |  Interfuser-HDPE Model  |
|(e.g.CARLA)|       |     +--------------------------------+     |      |   (Perception Engine)   |
|           | HTTP  |     |         FastAPI Server         |     |      |                         |
+-----------+       |     | + Identity-Aware Tracker ✅    |     |      +-------------------------+
                    |     | + Hierarchical Controller ✅   |     |
                    |     +--------------------------------+     |
                    |                                            |
                    +--------------------------------------------+
```

The client sends sensor data, and this API orchestrates everything: it calls the perception model for analysis, then uses its internal stateful modules (Tracker and Controller) to make a final, context-aware decision.

---

## ✨ Key Features & Innovations

This API's intelligence comes from our custom-built downstream modules:

### 🧠 **Intelligent Hierarchical Controller**
- **What it is:** A sophisticated decision-making module that operates on a clear hierarchy of rules: **`Safety First > Dynamic Obstacle Avoidance > Navigation`**.
- **Why it matters:** This structured approach ensures predictable, safe behavior and avoids the pitfalls of overly simplistic controllers. It produces actions that are aware of the full driving context.
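
As a rough illustration of how such a priority ordering can be encoded, the sketch below checks the rules from highest to lowest priority and returns on the first match. The `Scene` and `Decision` types, the 0.5 thresholds, the `safe_gap_m` parameter, the throttle values, and the `"Stopping for stop sign"` and `"Navigating to target point"` strings are all hypothetical; this is not the project's actual controller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scene:
    """Hypothetical per-frame inputs; field names are illustrative only."""
    red_light_prob: float            # e.g. scene_analysis["traffic_light_state"]
    stop_sign_prob: float            # e.g. scene_analysis["stop_sign"]
    lead_vehicle_id: Optional[int]   # tracked vehicle ahead, if any
    lead_distance_m: Optional[float]


@dataclass
class Decision:
    throttle: float
    brake: bool
    reason: str


def decide(scene: Scene, safe_gap_m: float = 8.0) -> Decision:
    # 1. Safety first: traffic rules override everything else.
    if scene.red_light_prob > 0.5:
        return Decision(0.0, True, "Slowing for red light")
    if scene.stop_sign_prob > 0.5:
        return Decision(0.0, True, "Stopping for stop sign")
    # 2. Dynamic obstacle avoidance: react to a tracked lead vehicle.
    if scene.lead_vehicle_id is not None and scene.lead_distance_m is not None:
        reason = f"Following vehicle ID {scene.lead_vehicle_id}"
        if scene.lead_distance_m < safe_gap_m:
            return Decision(0.0, True, reason)    # too close: brake
        return Decision(0.3, False, reason)       # keep a gentle pace
    # 3. Navigation: no higher-priority rule fired, so follow the route.
    return Decision(0.6, False, "Navigating to target point")
```

Because each rule returns immediately, a lower-priority rule can never override a higher one, which is what makes the behavior predictable.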

### 💡 **Cautious Memory for Occlusions**
- **What it is:** Our controller features a short-term "grace period" memory. If a lead vehicle is temporarily occluded (e.g., behind a truck), the system remains cautious instead of accelerating dangerously into the unknown.
- **Why it matters:** This mitigates the "deadly flutter" problem common in autonomous agents and improves safety in dynamic traffic.
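
One way to picture the grace period is a small counter that keeps the last lead vehicle "remembered" for a fixed number of frames after it disappears. The sketch below is a minimal, hypothetical version: the class name, the `grace_frames` default, and the `"Clear"` label are assumptions made for illustration, not the controller's real implementation.

```python
from typing import Optional


class LeadVehicleMemory:
    """Remembers a lead vehicle for a few frames after it disappears."""

    def __init__(self, grace_frames: int = 10):
        self.grace_frames = grace_frames   # assumed value, not from the project
        self.last_id: Optional[int] = None
        self.frames_missing = 0

    def update(self, visible_id: Optional[int]) -> str:
        if visible_id is not None:
            # Lead vehicle is visible: refresh the memory.
            self.last_id = visible_id
            self.frames_missing = 0
            return f"Following vehicle ID {visible_id}"
        if self.last_id is not None:
            # Lead vehicle vanished: stay cautious until the grace period ends.
            self.frames_missing += 1
            if self.frames_missing <= self.grace_frames:
                return "Cautious: Lost track of lead vehicle"
            self.last_id = None            # grace period expired, forget it
        return "Clear"
```

A caller would keep its speed capped while `update` returns the cautious state, and only resume normal driving once the memory reports that the road is clear.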

### 👁️ **Identity-Aware Temporal Tracking**
- **What it is:** A custom-built, object-oriented tracker that maintains a consistent ID for every vehicle in the scene.
- **Why it matters:** It provides the stable, long-term context needed for our controller to make informed decisions about following, yielding, or reacting to specific agents in the environment.
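
To make the idea of consistent IDs concrete, here is a minimal sketch that uses greedy nearest-neighbor matching between frames. This is a stand-in technique chosen for brevity; the class name, the `max_match_dist` parameter, and the 2D point representation are assumptions, and the project's tracker may work quite differently.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


class SimpleTracker:
    """Greedy nearest-neighbor tracker that keeps IDs stable across frames."""

    def __init__(self, max_match_dist: float = 2.0):
        self.max_match_dist = max_match_dist   # assumed gating distance
        self.tracks: Dict[int, Point] = {}
        self._next_id = 0

    def update(self, detections: List[Point]) -> Dict[int, Point]:
        unmatched = dict(self.tracks)          # tracks not yet claimed this frame
        updated: Dict[int, Point] = {}
        for det in detections:
            best_id, best_dist = None, self.max_match_dist
            for track_id, pos in unmatched.items():
                dist = math.dist(det, pos)
                if dist <= best_dist:
                    best_id, best_dist = track_id, dist
            if best_id is None:
                # Nothing close enough: start a new track with a fresh ID.
                best_id = self._next_id
                self._next_id += 1
            else:
                del unmatched[best_id]         # each track matches at most once
            updated[best_id] = det
        self.tracks = updated                  # unmatched tracks are dropped
        return updated
```

Note that this sketch drops a track as soon as it goes unmatched for a single frame; a tracker meant to survive occlusions would keep it alive for a while instead.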

### 🗣️ **Human-Readable Decisions**
- **What it is:** The API doesn't just return numbers; it returns a `reason` string (e.g., `"Following vehicle ID 15"`, `"Slowing for red light"`, `"Cautious: Lost track of lead vehicle"`).
- **Why it matters:** This makes the agent's behavior interpretable and easy to debug in real time.

---

## 🚀 How to Use

Interact with the API by making HTTP requests to its endpoints. The typical workflow is to start a session, run steps in a loop, and then end the session.

### 1. Start a New Session
This will initialize a new set of tracker and controller instances on the server.

**Request:**
```bash
curl -X POST "https://baseerai-baseer-server.hf.space/start_session"
```

**Example Response:**
```json
{
  "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef"
}
```

### 2. Run a Simulation Step

Send the current camera view and vehicle measurements to be processed. The API will return control commands and a full analysis.

**Request:**
```bash
curl -X POST "https://baseerai-baseer-server.hf.space/run_step" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "a1b2c3d4-e5f6-7890-1234-567890abcdef",
    "image_b64": "your-base64-encoded-bgr-image-string",
    "measurements": {
      "pos_global": [105.0, -20.0],
      "theta": 1.57,
      "speed": 5.5,
      "target_point": [10.0, 0.0]
    }
  }'
```
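
The same request body can be assembled in Python. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes `image_bytes` already holds the camera frame in whatever encoded form the server expects, which the example above does not specify.

```python
import base64
import json
from typing import Any, Dict, Sequence


def build_run_step_payload(
    session_id: str,
    image_bytes: bytes,
    pos_global: Sequence[float],
    theta: float,
    speed: float,
    target_point: Sequence[float],
) -> Dict[str, Any]:
    """Mirror the JSON structure of the curl example for /run_step."""
    return {
        "session_id": session_id,
        # Base64 turns the raw bytes into a JSON-safe ASCII string.
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "measurements": {
            "pos_global": list(pos_global),
            "theta": theta,
            "speed": speed,
            "target_point": list(target_point),
        },
    }


def encode_body(payload: Dict[str, Any]) -> bytes:
    """Serialize the payload to UTF-8 JSON, ready to send as a request body."""
    return json.dumps(payload).encode("utf-8")
```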

**Example Response:**
```json
{
  "control_commands": {
    "steer": 0.05,
    "throttle": 0.6,
    "brake": false
  },
  "scene_analysis": {
    "is_junction": 0.02,
    "traffic_light_state": 0.95,
    "stop_sign": 0.01
  },
  "predicted_waypoints": [
    [1.0, 0.05],
    [2.0, 0.06],
    [3.0, 0.07],
    [4.0, 0.07],
    [5.0, 0.08],
    [6.0, 0.08],
    [7.0, 0.09],
    [8.0, 0.09],
    [9.0, 0.10],
    [10.0, 0.10]
  ],
  "dashboard_b64": "a-very-long-base64-string-representing-the-dashboard-image...",
  "reason": "Following vehicle ID 15"
}
```

**Response Fields:**
- **`control_commands`**: The final commands to be applied to the vehicle.
- **`scene_analysis`**: Model probabilities for scene conditions (junction, traffic light, stop sign). A high `traffic_light_state` value (e.g., > 0.5) indicates a red light.
- **`predicted_waypoints`**: The model's intended path, relative to the vehicle.
- **`dashboard_b64`**: A Base64-encoded JPEG image of the full dashboard view, which can be directly displayed in a client application.
- **`reason`**: A human-readable string explaining the primary reason for the control action (e.g., "Following vehicle ID 15", "Slowing for red light", "Cautious: Lost track of lead vehicle").
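
A client might unpack these fields roughly as follows. The function name and the shape of the returned dictionary are hypothetical, and the 0.5 cutoff simply follows the example threshold given for `traffic_light_state`.

```python
import base64
from typing import Any, Dict


def summarize_step(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the most useful values out of a /run_step response."""
    controls = response["control_commands"]
    analysis = response["scene_analysis"]
    return {
        "steer": controls["steer"],
        "throttle": controls["throttle"],
        "brake": controls["brake"],
        # Values above 0.5 are treated as a red light in this sketch.
        "red_light": analysis["traffic_light_state"] > 0.5,
        # First point of the predicted path, relative to the vehicle.
        "first_waypoint": response["predicted_waypoints"][0],
        # Decode the dashboard back into JPEG bytes for display or saving.
        "dashboard_jpeg": base64.b64decode(response["dashboard_b64"]),
        "reason": response["reason"],
    }
```

The decoded `dashboard_jpeg` bytes can be written straight to a `.jpg` file or handed to an image widget.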

### 3. End the Session

This will clean up the session data from the server.

**Request:**
```bash
curl -X POST "https://baseerai-baseer-server.hf.space/end_session?session_id=a1b2c3d4-e5f6-7890-1234-567890abcdef"
```

**Example Response:**
```json
{
  "message": "Session a1b2c3d4-e5f6-7890-1234-567890abcdef ended."
}
```
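
Putting the three calls together, a full session could look like the sketch below. It uses only the standard library, and the `http_post` and `run_episode` helpers are hypothetical names. The transport is passed in as a parameter so the loop can be exercised without a network connection, and the `finally` clause ensures the session is closed even if a step fails.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://baseerai-baseer-server.hf.space"


def http_post(path, body=None, params=None):
    """POST to the API and decode the JSON reply (stdlib only)."""
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else b""
    request = urllib.request.Request(
        url, data=data, method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


def run_episode(frames, post=http_post):
    """Start a session, run one step per frame, then always end the session.

    `frames` yields (image_b64, measurements) pairs; returns the `reason`
    string from each step.
    """
    session_id = post("/start_session")["session_id"]
    reasons = []
    try:
        for image_b64, measurements in frames:
            step = post("/run_step", body={
                "session_id": session_id,
                "image_b64": image_b64,
                "measurements": measurements,
            })
            reasons.append(step["reason"])
    finally:
        # /end_session takes the session ID as a query parameter.
        post("/end_session", params={"session_id": session_id})
    return reasons
```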

---

## 📡 API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Landing page with API status. |
| `/docs` | GET | Interactive API documentation (Swagger UI). |
| `/start_session` | POST | Initializes a new driving session. |
| `/run_step` | POST | Processes a single frame and returns control commands. |
| `/end_session` | POST | Terminates a specific session. |
| `/sessions` | GET | Lists all currently active sessions. |

---

## 🎯 Intended Use Cases & Limitations

✅ **Optimal Use Cases**
- **Closed-loop simulation** and evaluation of full-stack driving agents in CARLA.
- **Researching the interplay** between advanced perception and intelligent control.
- **Rapid prototyping** of complex driving behaviors (like cautious following, yielding, etc.).
- Serving as a "smart agent" for creating dynamic traffic scenarios.

⚠️ **Limitations**
- **Simulation-Only:** Designed for CARLA. Not for real-world vehicles.
- **Rule-Based High-Level Logic:** The controller's decision-making, while advanced, is based on a deterministic, hierarchical rule set, not end-to-end learning.
- **Vision-Based:** Inherits the limitations of its camera-only perception model in adverse weather or lighting.

---

## 🛠️ Development & Relation to Research

This API is the practical, deployed application of the research conducted for an Artificial Intelligence graduation thesis. It serves as the "brain" that utilizes the perception outputs from our foundational model.

- **Core Perception Model:** **[Interfuser-HDPE (Model Weights & Research)](https://huggingface.co/BaseerAI/Interfuser-Baseer-v1)**
- **Core Logic:** The Tracker and Controller modules are custom Python implementations hosted in this Space.
- **API Framework:** FastAPI

## 👨‍💻 Development

- **Lead Researcher:** Adam Altawil
- **Project Type:** Graduation Project - AI & Autonomous Driving
- **Contact:** [Your Contact Information]

## 📄 License

This project is licensed under the MIT License.

---

<div align="center">
  <strong>🚗 Driving the Future with Hierarchical Intelligence 🚗</strong>
</div>