 tags:
 - xgboost
 - python
+---
+# Depth Estimation Using ResNet50 and XGBoost
+## Overview
+This project demonstrates a depth estimation model that predicts the average depth of images using features extracted from a pre-trained ResNet50 model and an XGBoost regressor. The model was trained using the **NYUv2 dataset** hosted on Hugging Face ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2)). The trained model is saved as `model.pkl` using Python's `pickle` library for easy deployment and reuse.
+## Features
+- **Model Architecture**:
+  - Feature extraction: ResNet50 pre-trained on ImageNet, with the classification head removed and global average pooling applied, which yields a 2048-dimensional feature vector per image.
+  - Regression: an XGBoost regressor trained on the extracted feature vectors.
+- **Training GPU**: NVIDIA RTX 4060 Ti.
+- **Target**: Predict the average depth of images based on the depth maps from the dataset.
+## Dataset
+- Dataset: **NYUv2** ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2))
+- Format: The dataset includes RGB images and corresponding depth maps.
+- Preprocessing:
+  - Images were resized to 224x224 pixels to match the input requirements of ResNet50.
+  - Depth maps were converted into single average depth values for each image by taking the mean of the depth map.
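The target construction described above can be sketched with NumPy. This is a minimal illustration rather than the author's code: `mean_depth` is a hypothetical helper, and the toy depth map is made up to show the reduction from a 2D map to a single scalar.

```python
import numpy as np

def mean_depth(depth_map):
    """Collapse a 2D depth map into one scalar: the mean over all pixels."""
    depth = np.asarray(depth_map, dtype=np.float32)
    return float(depth.mean())

# Toy 2x2 depth map (illustrative values, not taken from NYUv2)
toy_depth = np.array([[1.0, 2.0], [3.0, 4.0]])
target = mean_depth(toy_depth)
print(target)  # 2.5
```

Each image then contributes exactly one regression target, regardless of the resolution of its depth map.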
+## Model Training
+1. **Feature Extraction**:
+   - ResNet50 was used to extract a fixed-length feature vector from each image.
+   - Preprocessing: Images were normalized using the `preprocess_input` function from TensorFlow's ResNet50 module.
+2. **Regression**:
+   - XGBoost regressor was trained on the extracted features to predict average depth values.
+   - Hyperparameters were tuned using cross-validation.
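Cross-validated model selection of the kind mentioned above might look roughly like the sketch below. The original does not state the number of folds, the search space, or the selection metric, so all of those are assumptions here; `cross_val_mse` and `select_best` are hypothetical helpers that accept any model object exposing `fit` and `predict`.

```python
import numpy as np

def cross_val_mse(make_model, X, y, n_folds=5, seed=0):
    """Mean validation MSE of a model over n_folds shuffled folds."""
    X, y = np.asarray(X), np.asarray(y)
    order = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(order, n_folds)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on every fold except the i-th, validate on the i-th
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = make_model()  # fresh, unfitted model for each fold
        model.fit(X[train_idx], y[train_idx])
        errors = y[val_idx] - model.predict(X[val_idx])
        scores.append(np.mean(errors ** 2))
    return float(np.mean(scores))

def select_best(candidates, X, y, n_folds=5):
    """Score each candidate factory and return the label with the lowest MSE."""
    results = {name: cross_val_mse(make, X, y, n_folds) for name, make in candidates.items()}
    best = min(results, key=results.get)
    return best, results
```

With `xgboost` installed, a candidate could be a factory such as `lambda: XGBRegressor(max_depth=4, n_estimators=200)`; those values are placeholders, not the settings used for this model.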
+## Results
+- **R² Score**: 0.541, meaning the model explains 54.1% of the variance in the average depth values.
+- This is a reasonable result for a first implementation and may improve with further hyperparameter tuning or a different feature extractor.
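For readers unfamiliar with the metric, R² is defined as 1 - SS_res / SS_tot, the share of the target's variation around its mean that the predictions account for. The snippet below is a small NumPy sketch of that definition; the sample values are invented, and the original does not say which data split the reported score was measured on.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # variation left unexplained
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Made-up depths in arbitrary units, only to exercise the function
y_true = [1.0, 2.0, 3.0, 4.0]
print(r2_score(y_true, y_true))                # perfect predictions give 1.0
print(r2_score(y_true, [2.5, 2.5, 2.5, 2.5]))  # always predicting the mean gives 0.0
```

A score of 0.541 therefore sits roughly halfway between always predicting the mean depth and predicting every image perfectly.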
+## How to Use
+### Requirements
+1. Python 3.8+
+2. Required libraries:
+   - `numpy`
+   - `tensorflow`
+   - `xgboost`
+   - `pillow`
+   - `opencv-python`
+   - `datasets`
+Install the dependencies using pip (`pickle` ships with the Python standard library, so it needs no separate installation):
+```bash
+pip install numpy tensorflow xgboost pillow opencv-python datasets
+```
+### Loading the Model
+The model is saved as `model.pkl` using `pickle`. You can load and use it as follows:
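+```python
+import pickle
+import numpy as np
+# Load the trained XGBoost regressor (only unpickle files from a source you trust)
+with open("model.pkl", "rb") as f:
+    regressor = pickle.load(f)
+# Example usage: extract_features is the ResNet50 pipeline defined under "Training Pipeline"
+features = extract_features("path/to/image.jpg")
+# predict expects a 2D array of shape (n_samples, n_features)
+predicted_depth = regressor.predict(np.asarray([features]))
+print("Predicted Depth:", predicted_depth[0])
+```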
+### Training Pipeline
+If you want to retrain the model, follow these steps:
+1. Download the **NYUv2 dataset** from Hugging Face:
+   ```python
+   from datasets import load_dataset
+   dataset = load_dataset("0jl/NYUv2")
+   ```
+2. Extract features using ResNet50:
+   ```python
+   import numpy as np
+   from PIL import Image
+   from tensorflow.keras.applications import ResNet50
+   from tensorflow.keras.applications.resnet50 import preprocess_input
+   # Load ResNet50 without the classification head; global average pooling yields a 2048-dimensional vector
+   feature_extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")
+   def extract_features(image_path):
+       # Load the image as RGB and resize it to the 224x224 input size used in preprocessing
+       image = Image.open(image_path).convert("RGB")
+       image = image.resize((224, 224))
+       image_array = np.array(image).astype("float32")
+       # Apply ResNet50's ImageNet preprocessing, then add a batch dimension
+       image_array = preprocess_input(image_array)
+       image_array = np.expand_dims(image_array, axis=0)
+       features = feature_extractor.predict(image_array)
+       return features.flatten()
+   ```
+3. Train the XGBoost regressor on the extracted features and save the model:
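+   ```python
+   import pickle
+   from xgboost import XGBRegressor
+   # X_train: array of shape (n_samples, 2048) holding one ResNet50 feature vector per image
+   # y_train: array of shape (n_samples,) holding the mean depth of each image
+   regressor = XGBRegressor()
+   regressor.fit(X_train, y_train)
+   # Save the trained model
+   with open("model.pkl", "wb") as f:
+       pickle.dump(regressor, f)
+   ```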
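The steps above leave `X_train` and `y_train` undefined. One way to assemble them is sketched below, using random stand-in data so it runs without the dataset or TensorFlow. `build_arrays` and `split_train_test` are hypothetical helpers, and the 80/20 split and the seed are assumptions; the original does not describe how the data was split.

```python
import numpy as np

def build_arrays(feature_vectors, depth_maps):
    """Stack per-image features into X and reduce each depth map to its mean for y."""
    X = np.stack([np.asarray(f, dtype=np.float32) for f in feature_vectors])
    y = np.array([float(np.mean(d)) for d in depth_maps], dtype=np.float32)
    return X, y

def split_train_test(X, y, test_fraction=0.2, seed=42):
    """Shuffle once, then hold out test_fraction of the samples for testing."""
    order = np.random.default_rng(seed).permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Stand-in data: 10 random 2048-dimensional vectors and 10 random 4x4 depth maps
rng = np.random.default_rng(0)
features = [rng.normal(size=2048) for _ in range(10)]
depths = [rng.uniform(0.5, 10.0, size=(4, 4)) for _ in range(10)]
X, y = build_arrays(features, depths)
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(X_train.shape, X_test.shape)  # (8, 2048) (2, 2048)
```

In a real run, `features` would hold the outputs of `extract_features` and `depths` the dataset's depth maps; the held-out portion can then be used to compute the R² score.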
+## License
+This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for more information.
+## Author
+**Vishal Adithya.A**
+## Acknowledgments
+- Hugging Face for hosting the NYUv2 dataset.
+- NVIDIA RTX 4060 Ti for providing efficient GPU acceleration.
+- TensorFlow and XGBoost for robust machine learning frameworks.