added model (v5) desc
app.py
CHANGED
@@ -7,6 +7,49 @@ yolov8_result = os.path.join(os.getcwd(), "data/xai/yolov8.png")
 yolov5_dff = os.path.join(os.getcwd(), "data/xai/yolov5_dff.png")
 yolov8_dff = os.path.join(os.getcwd(), "data/xai/yolov8_dff.png")
 
+
+architecture_description = """
+# YOLOv5 Architecture Overview
+
+YOLOv5 consists of three main components: the **backbone**, the **neck**, and the **head**.
+
+### 1. **Backbone**: Feature Extraction
+- **CSPDarknet53**: A modified version of **Darknet53**, leveraging **CSPNet** for improved gradient flow and memory usage.
+- **Residual Connections**: Utilizes **ResNet**-like residual connections to enable deeper learning without vanishing gradients.
+- **Focus Layer**: Performs convolutional downsampling to focus on key image features before feature extraction.
+
+### 2. **Neck**: Aggregation of Features
+- **PANet (Path Aggregation Network)**: Used for better feature aggregation and enhanced information flow across scales.
+- **FPN (Feature Pyramid Network)**: Helps in detecting objects at multiple scales by generating a pyramid of feature maps.
+- **Upsample and Downsample**: Combines low-level and high-level features for accurate localization and detection.
+
+### 3. **Head**: Detection and Output Generation
+- **Bounding Box Prediction**: Predicts **center (x, y)**, **width (w)**, and **height (h)** for each bounding box.
+- **Class Prediction**: Outputs the **class probabilities** for each detected object.
+- **Objectness Score**: Predicts whether the bounding box contains an object.
+- **Anchor Boxes**: Uses predefined anchor boxes to assist with bounding box prediction.
+
+### 4. **Detection Layer**: Grid-Based Prediction
+- **Grid-Based Prediction**: Divides the image into a grid where each cell predicts multiple bounding boxes.
+- **Non-Maximum Suppression (NMS)**: Filters out redundant bounding boxes with high overlap based on confidence scores.
+
+### 5. **Loss Function**
+- **CIoU Loss**: Used for bounding box regression to measure the overlap, aspect ratio, and center distance.
+- **Binary Cross-Entropy**: For **objectness score** prediction.
+- **Cross-Entropy**: For **classification** of the detected objects.
+
+### 6. **Post-Processing**
+- **Non-Maximum Suppression (NMS)**: Eliminates duplicate boxes with high overlap, keeping only the most confident predictions.
+
+### Summary
+1. **Input Image**: Pre-processed and fed into the model.
+2. **Backbone**: Feature extraction with **CSPDarknet53** and **Residual Connections**.
+3. **Neck**: Feature aggregation with **PANet** and **FPN**.
+4. **Head**: Outputs bounding boxes, objectness scores, and class probabilities.
+5. **Loss Functions**: CIoU, binary cross-entropy, and cross-entropy.
+6. **Post-Processing**: **NMS** to filter overlapping detections.
+"""
+
 # Netron HTML templates
 def get_netron_html(model_url):
     return f"""
@@ -60,6 +103,7 @@ with gr.Blocks(css=custom_css) as demo:
 
     with gr.Row():
         with gr.Column():
+            gr.Markdown(architecture_description)
            gr.HTML(get_netron_html(yolov5_url))
            gr.Image(yolov5_result, label="Detections & Interpretability Map")
            gr.Image(yolov5_dff, label="Feature Factorization & discovered concept")
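The head section of the added description lists the raw outputs (box center and size, objectness, class probabilities) without showing how network activations become boxes. Below is a minimal sketch of YOLOv5-style decoding for a single grid cell; the `decode_cell` helper, the `(tx, ty, tw, th, obj, *cls)` prediction layout, and the use of NumPy are illustrative assumptions, not code from this Space:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_cell(pred, anchor_wh, cell_xy, stride):
    """Decode one raw head prediction (tx, ty, tw, th, obj, *cls) into an
    image-space (cx, cy, w, h) box using a YOLOv5-style parameterization."""
    tx, ty, tw, th = pred[:4]
    # Box center: bounded sigmoid offset inside the grid cell, scaled to pixels
    cx = (2.0 * sigmoid(tx) - 0.5 + cell_xy[0]) * stride
    cy = (2.0 * sigmoid(ty) - 0.5 + cell_xy[1]) * stride
    # Box size: the predefined anchor stretched by a bounded factor in (0, 4)
    w = anchor_wh[0] * (2.0 * sigmoid(tw)) ** 2
    h = anchor_wh[1] * (2.0 * sigmoid(th)) ** 2
    objectness = sigmoid(pred[4])    # confidence that this box holds an object
    class_probs = sigmoid(pred[5:])  # per-class probabilities
    return (cx, cy, w, h), objectness, class_probs
```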
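Both the detection-layer and post-processing sections rely on non-maximum suppression. This is a self-contained NumPy sketch of the greedy NMS step described there; the `nms` helper and the `(x1, y1, x2, y2)` box layout are assumptions for illustration, not part of app.py:

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: repeatedly keep the highest-scoring box and drop any
    remaining box whose IoU with it exceeds the threshold."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the chosen box with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter + 1e-7)
        # Keep only the boxes that do not overlap the chosen box too much
        order = order[1:][iou <= iou_threshold]
    return keep
```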
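The loss section names CIoU for box regression. The sketch below computes the CIoU score under its usual formulation (IoU penalized by normalized center distance and an aspect-ratio consistency term), assuming `(cx, cy, w, h)` boxes; the regression loss would then be `1 - ciou(pred, target)`:

```python
import math

def ciou(box1, box2):
    """Complete IoU between two (cx, cy, w, h) boxes."""
    cx1, cy1, w1, h1 = box1
    cx2, cy2, w2, h2 = box2
    # Corner coordinates
    b1x1, b1y1, b1x2, b1y2 = cx1 - w1 / 2, cy1 - h1 / 2, cx1 + w1 / 2, cy1 + h1 / 2
    b2x1, b2y1, b2x2, b2y2 = cx2 - w2 / 2, cy2 - h2 / 2, cx2 + w2 / 2, cy2 + h2 / 2
    # Plain IoU
    iw = max(0.0, min(b1x2, b2x2) - max(b1x1, b2x1))
    ih = max(0.0, min(b1y2, b2y2) - max(b1y1, b2y1))
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter + 1e-7
    iou = inter / union
    # Squared center distance over squared diagonal of the enclosing box
    rho2 = (cx2 - cx1) ** 2 + (cy2 - cy1) ** 2
    cw = max(b1x2, b2x2) - min(b1x1, b2x1)
    ch = max(b1y2, b2y2) - min(b1y1, b2y1)
    c2 = cw ** 2 + ch ** 2 + 1e-7
    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan(w2 / (h2 + 1e-7)) - math.atan(w1 / (h1 + 1e-7))) ** 2
    alpha = v / (1 - iou + v + 1e-7)
    return iou - rho2 / c2 - alpha * v
```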
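Finally, for context on where the second hunk lands: the snippet below sketches a minimal standalone version of the patched Gradio layout. The body of `get_netron_html` is elided in the diff, so the iframe here is only a plausible stand-in, as are the model URL and the trimmed description string:

```python
import gradio as gr

architecture_description = "# YOLOv5 Architecture Overview\n..."  # trimmed stand-in
yolov5_url = "https://example.com/yolov5s.onnx"  # stand-in model URL

def get_netron_html(model_url):
    # Embed the Netron model viewer pointed at the given model file (assumed body)
    return f'<iframe src="https://netron.app/?url={model_url}" width="100%" height="600"></iframe>'

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column():
            gr.Markdown(architecture_description)  # the line this commit adds
            gr.HTML(get_netron_html(yolov5_url))

demo.launch()
```

Placing the `gr.Markdown` call first in the column renders the architecture text above the Netron viewer and result images, which matches the commit's intent of introducing the model before showing its graph and interpretability maps.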