Update README.md
README.md
CHANGED
@@ -10,4 +10,118 @@ pipeline_tag: depth-estimation
tags:
- xgboost
- python
---

# Depth Estimation Using ResNet50 and XGBoost

## Overview
This project demonstrates a depth estimation model that predicts the average depth of an image. Features are extracted with a pre-trained ResNet50 and passed to an XGBoost regressor. The model was trained on the **NYUv2 dataset** hosted on Hugging Face ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2)). The trained regressor is saved as `model.pkl` with Python's `pickle` module for deployment and reuse.

## Features
- **Model Architecture**:
  - Feature extraction: ResNet50 (pre-trained on ImageNet, with the top layers removed and global average pooling).
  - Regression: XGBoost, trained on the extracted feature vectors.
- **Training GPU**: NVIDIA RTX 4060 Ti.
- **Target**: Predict the average depth of an image, computed from the dataset's depth maps.

## Dataset
- Dataset: **NYUv2** ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2))
- Format: The dataset includes RGB images and corresponding depth maps.
- Preprocessing:
  - Images were resized to 224x224 pixels to match the input requirements of ResNet50.
  - Each depth map was reduced to a single average depth value by taking the mean of the depth map.

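The depth-map reduction described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `mean_depth_target` is hypothetical, and it takes a plain mean over every pixel because the README does not say whether missing or invalid depth readings were masked out.

```python
import numpy as np

def mean_depth_target(depth_map):
    """Collapse a 2-D depth map into one scalar: the mean over all pixels."""
    depth = np.asarray(depth_map, dtype=np.float32)
    return float(depth.mean())

# Tiny synthetic depth map (values are made up for illustration).
example = np.array([[1.0, 2.0], [3.0, 4.0]])
print(mean_depth_target(example))  # 2.5
```

The resulting scalar is in whatever unit the depth maps use, so predictions from the regressor carry the same unit.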
## Model Training
1. **Feature Extraction**:
   - ResNet50 was used to extract a fixed-length feature vector from each image.
   - Preprocessing: Images were normalized using the `preprocess_input` function from TensorFlow's ResNet50 module.
2. **Regression**:
   - An XGBoost regressor was trained on the extracted features to predict average depth values.
   - Hyperparameters were tuned using cross-validation.

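The README does not show how the cross-validation was run, so the following is only a sketch of the general technique. The helpers `k_fold_indices` and `cross_val_rmse` are hypothetical names, RMSE is an assumed scoring metric, and `make_model` is any zero-argument callable returning an object with `fit` and `predict` (an `XGBRegressor` fits that interface).

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, non-overlapping folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

def cross_val_rmse(make_model, X, y, k=5):
    """Average validation RMSE of a freshly built model over k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = make_model()  # new, untrained model for every fold
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        scores.append(float(np.sqrt(np.mean((pred - y[val_idx]) ** 2))))
    return float(np.mean(scores))
```

A candidate configuration could then be scored with something like `cross_val_rmse(lambda: XGBRegressor(max_depth=4), X_train, y_train)`, where `max_depth=4` is an arbitrary illustrative value, and the configuration with the lowest score kept.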
## Results
- **R² Score**: 0.541 (the model explains about 54.1% of the variance in average depth).
- This is a reasonable baseline for a first implementation; further hyperparameter tuning or a different feature extractor may improve it.

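For readers unfamiliar with the metric, R² compares the model's squared error against that of always predicting the mean target. A minimal sketch of the standard definition follows; the function name `r_squared` is illustrative, and the README does not state which tool or data split produced the reported score.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # model error
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # baseline error
    return 1.0 - ss_res / ss_tot

print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0 (perfect fit)
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0 (no better than the mean)
```

A score of 0.541 therefore sits roughly halfway between the mean-only baseline and a perfect fit.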
## How to Use
### Requirements
1. Python 3.8+
2. Required libraries:
   - `numpy`
   - `tensorflow`
   - `xgboost`
   - `pickle` (part of the Python standard library; no installation needed)
   - `opencv-python`
   - `pillow` (used by the feature extraction code in this README)
   - `datasets`

Install the dependencies using pip:
```bash
pip install numpy tensorflow xgboost opencv-python pillow datasets
```

### Loading the Model
The model is saved as `model.pkl` using `pickle`. You can load and use it as follows:

```python
import pickle

# Load the trained XGBoost regressor
with open("model.pkl", "rb") as f:
    regressor = pickle.load(f)

# Example usage: extract_features is the ResNet50 pipeline from the Training Pipeline section
features = extract_features("path/to/image.jpg")
predicted_depth = regressor.predict(features.reshape(1, -1))  # one row per image
print("Predicted Depth:", predicted_depth[0])
```

### Training Pipeline
If you want to retrain the model, follow these steps:
1. Download the **NYUv2 dataset** from Hugging Face:
   ```python
   from datasets import load_dataset
   dataset = load_dataset("0jl/NYUv2")
   ```
2. Extract features using ResNet50:
   ```python
   import numpy as np
   from PIL import Image
   from tensorflow.keras.applications import ResNet50
   from tensorflow.keras.applications.resnet50 import preprocess_input

   # ResNet50 without the classification head; global average pooling
   # yields one 2048-dimensional vector per image
   feature_extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")

   def extract_features(image_path):
       """Return the ResNet50 feature vector for a single image file."""
       image = Image.open(image_path).convert("RGB")
       image = image.resize((224, 224))
       image_array = np.array(image).astype("float32")
       image_array = preprocess_input(image_array)
       image_array = np.expand_dims(image_array, axis=0)  # add batch dimension
       features = feature_extractor.predict(image_array)
       return features.flatten()
   ```
3. Train the XGBoost regressor on the extracted features and save the model:
   ```python
   import pickle
   from xgboost import XGBRegressor

   # X_train: feature vectors from extract_features, shape (n_samples, 2048)
   # y_train: average depth of each image, shape (n_samples,)
   regressor = XGBRegressor()
   regressor.fit(X_train, y_train)

   # Save the trained model
   with open("model.pkl", "wb") as f:
       pickle.dump(regressor, f)
   ```

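The steps above use `X_train` and `y_train` without showing how they were separated from evaluation data. One possible way to hold out a test set is sketched below; the helper `holdout_split`, the 20% test fraction, and the seed are all assumptions for illustration, not details taken from the original training run.

```python
import numpy as np

def holdout_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle the samples once and split them into train and test parts."""
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must contain the same number of samples")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = max(1, int(round(len(X) * test_fraction)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```

Here `X` would be the stacked feature vectors, one row per image, and `y` the matching average depth values; the held-out part can then be used to compute a score such as R².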
## License
This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for more information.

## Author
**Vishal Adithya.A**

## Acknowledgments
- Hugging Face for hosting the NYUv2 dataset.
- NVIDIA RTX 4060 Ti for providing efficient GPU acceleration.
- TensorFlow and XGBoost for robust machine learning frameworks.