altawil committed: Update README.md

---
license: mit
language:
pipeline_tag: object-detection
---

# InterFuser-Baseer-v1: Autonomous Driving Model

[License: MIT](https://opensource.org/licenses/MIT)
[PyTorch](https://pytorch.org/)
[CARLA](https://carla.org/)

## Overview

InterFuser-Baseer-v1 is a transformer-based model for autonomous driving, fine-tuned for the **Baseer Self-Driving API**. It combines computer vision and deep learning to provide real-time traffic object detection and trajectory planning in simulated driving environments.

### Key Capabilities

- **Multi-Task Learning**: Simultaneous traffic object detection and waypoint prediction
- **Transformer Architecture**: Attention-based fusion for scene understanding
- **Real-Time Processing**: Optimized for real-time inference in driving scenarios
- **CARLA Integration**: Specifically tuned for the CARLA simulation environment

## Architecture

### Model Components

| Component | Specification |
|-----------|---------------|
| **Image Backbone** | ResNet-50 (ImageNet pretrained) |
| **LiDAR Backbone** | ResNet-18 (disabled in this version) |
| **Transformer** | 6-layer encoder / 6-layer decoder, 8 attention heads |
| **Embedding Dimension** | 256 |
| **Prediction Heads** | GRU-based waypoint predictor + detection head |

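For quick reference, these hyperparameters can be gathered into a single config-style dictionary. The key names below are illustrative assumptions and may not match the actual InterFuser constructor:

```python
# Hyperparameters from the table above, expressed as a plain dictionary.
# The key names are assumptions, not the confirmed InterFuser constructor signature.
interfuser_baseer_config = dict(
    rgb_backbone="resnet50",      # ImageNet-pretrained image encoder
    lidar_backbone="resnet18",    # architecture defined, but LiDAR input disabled
    embed_dim=256,
    enc_depth=6,
    dec_depth=6,
    num_heads=8,
    waypoint_head="gru",          # GRU-based waypoint predictor
    detection_grid=(20, 20, 7),   # detection head output shape
)
```
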
### Output Format

- **Traffic Detection**: 20×20×7 grid (confidence, position, dimensions, orientation)
- **Waypoint Prediction**: 10 future trajectory points
- **Scene Understanding**: Junction, traffic light, and stop sign detection

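The exact output structure depends on the InterFuser implementation in use; the sketch below is only a hypothetical illustration of how a 20×20×7 detection grid, 10 waypoints, and scene flags might be unpacked. The key names, tensor shapes, and channel order are assumptions, not the confirmed model interface.

```python
import torch

# Hypothetical outputs matching the description above (dummy tensors for illustration).
outputs = {
    "traffic": torch.rand(1, 20, 20, 7),    # per-cell channels: confidence, position, dimensions, orientation (order assumed)
    "waypoints": torch.rand(1, 10, 2),      # 10 future (x, y) points in ego coordinates
    "is_junction": torch.rand(1, 2),        # scene-understanding logits
    "traffic_light": torch.rand(1, 2),
    "stop_sign": torch.rand(1, 2),
}

grid = outputs["traffic"][0]                # (20, 20, 7) detection grid
confidence = grid[..., 0]                   # first channel treated as object confidence
occupied = (confidence > 0.5).nonzero()     # grid cells above a confidence threshold

next_waypoint = outputs["waypoints"][0, 0]  # nearest predicted trajectory point
print(f"{occupied.shape[0]} occupied cells, next waypoint: {next_waypoint.tolist()}")
```
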
## Quick Start

### Installation

```bash
pip install torch torchvision timm huggingface_hub
```

### Usage Example

```python
import torch
from huggingface_hub import hf_hub_download

# Download the model weights from the Hub
model_path = hf_hub_download(
    repo_id="Adam-IT/Interfuser-Baseer-v1",
    filename="best_model.pth"
)

# Load the model (the InterFuser class definition must be importable)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load(model_path, map_location=device)
model.eval()

# Inference
# input_data = ...  # prepare the camera image (and measurements) in the format the model expects
with torch.no_grad():
    outputs = model(input_data)
```

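The loading call above assumes `best_model.pth` is a full pickled model; on recent PyTorch releases you may also need to pass `weights_only=False` to `torch.load`. If the file instead holds only a `state_dict` (earlier revisions of this repository loaded it through a `config_loader.load_and_prepare_model` helper), you would rebuild the model and load the weights. A minimal sketch, assuming a locally importable `InterFuser` class and illustrative constructor arguments:

```python
import torch

# `InterFuser` and its constructor arguments are illustrative assumptions; use the
# class and configuration from your local InterFuser / Baseer code base.
from interfuser.model import InterFuser  # hypothetical import path

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = InterFuser(embed_dim=256, enc_depth=6, dec_depth=6, num_heads=8)  # assumed kwargs
state_dict = torch.load(model_path, map_location=device)  # model_path from the download step above
model.load_state_dict(state_dict)
model.to(device)
model.eval()
```
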
## Performance

### Training Details

- **Dataset**: PDM-Lite-CARLA (urban driving scenarios)
- **Training Objective**: Multi-task learning with IoU optimization
- **Framework**: PyTorch

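The loss formulation is not documented in this card; the sketch below only illustrates what a multi-task objective combining a waypoint regression term with a detection-grid term could look like. The loss functions and weights are assumptions, and the "IoU optimization" mentioned above would typically add a box-IoU term that is omitted here for brevity.

```python
import torch.nn.functional as F

def multi_task_loss(pred_waypoints, gt_waypoints, pred_grid, gt_grid,
                    w_waypoints=1.0, w_detection=1.0):
    """Illustrative multi-task objective; terms and weights are assumptions."""
    # Waypoint regression: L1 distance over the 10 predicted trajectory points.
    waypoint_loss = F.l1_loss(pred_waypoints, gt_waypoints)
    # Detection: regression over the 20x20x7 grid (a real setup would usually treat
    # the confidence channel and the box channels separately).
    detection_loss = F.l1_loss(pred_grid, gt_grid)
    return w_waypoints * waypoint_loss + w_detection * detection_loss
```
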
### Key Metrics

- Optimized for traffic detection accuracy
- Enhanced bounding-box IoU performance
- Robust waypoint prediction in urban scenarios

## Limitations

### Current Constraints

- **Simulation Only**: Trained exclusively on CARLA data
- **Single Camera**: Front-facing camera view only
- **No LiDAR**: Vision-based approach without LiDAR fusion
- **Dataset Scope**: Limited to PDM-Lite-CARLA scenarios

### Recommended Use Cases

- ✅ CARLA simulation environments
- ✅ Research and development
- ✅ Autonomous driving prototyping
- ❌ Real-world deployment (requires additional training)

## Integration

This model is designed to work with:

- **Baseer Self-Driving API**
- **CARLA Simulator**
- **PyTorch Inference Pipeline**
- **Custom Autonomous Driving Systems**

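As a rough illustration of the CARLA side of such an integration, the sketch below spawns an RGB camera and converts each frame into a `(1, 3, H, W)` tensor that could be passed to the model. The sensor placement, image size, and preprocessing are assumptions; the model's exact input format is defined by the InterFuser code, not by this snippet.

```python
import numpy as np
import torch
import carla

# Connect to a running CARLA server on the default port.
client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# Spawn an RGB camera (in a real setup you would attach it to the ego vehicle).
blueprint_library = world.get_blueprint_library()
camera_bp = blueprint_library.find("sensor.camera.rgb")
camera_bp.set_attribute("image_size_x", "800")
camera_bp.set_attribute("image_size_y", "600")
spawn_point = world.get_map().get_spawn_points()[0]
camera = world.spawn_actor(camera_bp, spawn_point)

def to_tensor(image: carla.Image) -> torch.Tensor:
    """Convert a CARLA BGRA frame to a (1, 3, H, W) float tensor in [0, 1]."""
    array = np.frombuffer(image.raw_data, dtype=np.uint8)
    array = array.reshape((image.height, image.width, 4))[:, :, :3][:, :, ::-1]  # BGRA -> RGB
    return torch.from_numpy(array.copy()).permute(2, 0, 1).unsqueeze(0).float() / 255.0

# Replace the print with a call into the model's inference pipeline.
camera.listen(lambda image: print(to_tensor(image).shape))
```
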
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{interfuser-baseer-v1,
  title        = {InterFuser-Baseer-v1: Fine-tuned Autonomous Driving Model},
  author       = {Adam-IT},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Adam-IT/Interfuser-Baseer-v1}}
}
```

## Development

**Developed by**: Adam-IT
**Project Type**: Graduation Project - AI & Autonomous Driving
**Institution**: [Your Institution Name]

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the [issues page](../../issues).

## Support

For questions and support:

- Create an issue in this repository
- Contact: [Your Contact Information]

---

<div align="center">
<strong>Drive the Future with AI</strong>
</div>