vishal-adithya committed on
Commit
e4c8282
·
verified ·
1 Parent(s): 11e08d8

Update README.md

Files changed (1)
  1. README.md +115 -1
README.md CHANGED
@@ -10,4 +10,118 @@ pipeline_tag: depth-estimation
10
  tags:
11
  - xgboost
12
  - python
13
- ---
13
+ ---
14
+
15
+ # Depth Estimation Using ResNet50 and XGBoost
16
+
17
+ ## Overview
18
+ This project demonstrates a depth estimation model that predicts the average depth of images using features extracted from a pre-trained ResNet50 model and an XGBoost regressor. The model was trained using the **NYUv2 dataset** hosted on Hugging Face ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2)). The trained model is saved as `model.pkl` using Python's `pickle` library for easy deployment and reuse.
19
+
20
+ ## Features
21
+ - **Model Architecture**:
22
+   - Feature extraction: ResNet50 (pre-trained on ImageNet, with the top classification layers removed and global average pooling applied).
23
+   - Regression: XGBoost, a gradient-boosted tree library well suited to fixed-length feature vectors.
24
+ - **Training GPU**: NVIDIA RTX 4060 Ti, ensuring efficient computation.
25
+ - **Target**: Predict the average depth of images based on the depth maps from the dataset.
26
+
27
+ ## Dataset
28
+ - Dataset: **NYUv2** ([0jl/NYUv2](https://huggingface.co/datasets/0jl/NYUv2))
29
+ - Format: The dataset includes RGB images and corresponding depth maps.
30
+ - Preprocessing:
31
+   - Images were resized to 224x224 pixels to match the input requirements of ResNet50.
32
+   - Each depth map was reduced to a single target value per image by taking the mean of the depth map.
33
+
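The depth-map preprocessing described above can be sketched as follows. This is a minimal illustration, assuming each depth map is available as a 2D array of depth values; `average_depth` and `build_targets` are hypothetical helper names and are not taken from the project's code.

```python
import numpy as np


def average_depth(depth_map):
    """Reduce one depth map (2D array-like) to its mean depth value."""
    depth = np.asarray(depth_map, dtype=np.float64)
    return float(depth.mean())


def build_targets(depth_maps):
    """Build the regression target vector: one mean depth per image."""
    return np.array([average_depth(d) for d in depth_maps], dtype=np.float32)
```

Calling `build_targets` on the depth maps of the dataset would give the `y` vector that the regressor is trained on, one entry per image.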
34
+ ## Model Training
35
+ 1. **Feature Extraction**:
36
+    - ResNet50 was used to extract a fixed-length (2048-dimensional) feature vector from each image.
37
+    - Preprocessing: Images were normalized using the `preprocess_input` function from `tensorflow.keras.applications.resnet50`.
38
+ 2. **Regression**:
39
+    - An XGBoost regressor was trained on the extracted features to predict average depth values.
40
+    - Hyperparameters were tuned using cross-validation.
41
+
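The README does not say which hyperparameters were searched or how many folds were used. As a rough sketch of what k-fold cross-validation involves, the hypothetical helper below (`k_fold_indices`, with an assumed default of 5 folds) splits sample indices into training and validation parts. It is an illustration under those assumptions, not the project's actual tuning code.

```python
import numpy as np


def k_fold_indices(n_samples, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs; the validation parts partition the samples."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)        # shuffle once, reproducibly
    folds = np.array_split(order, n_splits)   # near-equal sized folds
    for i in range(n_splits):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx
```

Each candidate hyperparameter setting would be fitted on the training indices of every fold and scored on the matching validation indices, and the setting with the best average score would be kept.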
42
+ ## Results
43
+ - **R² Score**: 0.541 (the model explains about 54.1% of the variance in average depth).
44
+ - Performance is reasonable for a first implementation and could likely be improved with further tuning or different feature extraction methods.
45
+
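For reference, the R² score is defined as 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the true values from their mean. The sketch below implements that definition with the standard library only; `r2_score` here is a hypothetical stand-alone helper, and the README does not state which implementation or data split produced the reported value.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = [float(v) for v in y_true]
    y_pred = [float(v) for v in y_pred]
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R² is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot
```

A score of 1.0 means perfect predictions, while 0.0 means the model does no better than always predicting the mean depth.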
46
+ ## How to Use
47
+ ### Requirements
48
+ 1. Python 3.8+
49
+ 2. Required libraries:
50
+    - `numpy`
51
+    - `tensorflow`
52
+    - `xgboost`
53
+    - `pillow` (for image loading; `pickle` is part of the Python standard library)
54
+    - `opencv-python`
55
+    - `datasets`
56
+
57
+ Install the dependencies using pip:
58
+ ```bash
59
+ pip install numpy tensorflow xgboost pillow opencv-python datasets
60
+ ```
61
+
62
+ ### Loading the Model
63
+ The model is saved as `model.pkl` using `pickle`. You can load and use it as follows:
64
+
65
+ ```python
66
+ import pickle
67
+
68
+ # Load the trained model
69
+ with open("model.pkl", "rb") as f:
70
+     model = pickle.load(f)
71
+
72
+ # Example usage
73
+ features = extract_features("path/to/image.jpg")  # Same feature extraction pipeline as in training (see Training Pipeline)
74
+ predicted_depth = model.predict(features.reshape(1, -1))  # 2D input: (n_samples, n_features)
75
+ print("Predicted Depth:", predicted_depth[0])
76
+ ```
77
+
78
+ ### Training Pipeline
79
+ If you want to retrain the model, follow these steps:
80
+ 1. Download the **NYUv2 dataset** from Hugging Face:
81
+ ```python
82
+ from datasets import load_dataset
83
+ dataset = load_dataset("0jl/NYUv2")
84
+ ```
85
+ 2. Extract features using ResNet50:
86
+ ```python
87
+ from tensorflow.keras.applications import ResNet50
88
+ from tensorflow.keras.applications.resnet50 import preprocess_input
89
+ import numpy as np
90
+
91
+ # Load ResNet50 without its classification head; global average pooling yields a 2048-dimensional vector
92
+ feature_extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")
93
+
94
+ def extract_features(image_path):
95
+     from PIL import Image
96
+     image = Image.open(image_path).convert("RGB")
97
+     image = image.resize((224, 224))
98
+     image_array = np.array(image).astype("float32")
99
+     image_array = preprocess_input(image_array)
100
+     image_array = np.expand_dims(image_array, axis=0)
101
+     features = feature_extractor.predict(image_array)
102
+     return features.flatten()
103
+ ```
104
+ 3. Train the XGBoost regressor on the extracted features and save the model:
105
+ ```python
106
+ from xgboost import XGBRegressor
107
+ import pickle
108
+
109
+ regressor = XGBRegressor()
110
+ regressor.fit(X_train, y_train)  # X_train: (n_samples, 2048) feature matrix, y_train: average depths
111
+
112
+ # Save the trained model
113
+ with open("model.pkl", "wb") as f:
114
+     pickle.dump(regressor, f)
115
+ ```
116
+
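The training snippet above uses `X_train` and `y_train` without showing how they are built. One possible way to get them is a simple shuffled hold-out split of the feature matrix and the average-depth targets. The sketch below assumes a 20% test fraction and a fixed seed, neither of which is stated in the README; `train_test_split_arrays` is a hypothetical helper, and random stand-in data is used in place of real ResNet50 features.

```python
import numpy as np


def train_test_split_arrays(X, y, test_fraction=0.2, seed=42):
    """Shuffle samples and split features/targets into train and test parts."""
    X = np.asarray(X, dtype=np.float32)
    y = np.asarray(y, dtype=np.float32)
    if X.shape[0] != y.shape[0]:
        raise ValueError("X and y must contain the same number of samples")

    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    n_test = max(1, int(round(test_fraction * X.shape[0])))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Stand-in data: 10 samples of 2048 features (the ResNet50 output size) and 10 depths.
features = np.random.default_rng(0).normal(size=(10, 2048))
depths = np.linspace(1.0, 5.0, 10)
X_train, X_test, y_train, y_test = train_test_split_arrays(features, depths)
```

With real data, `features` would be the stacked outputs of `extract_features` and `depths` the per-image mean depths; the held-out `X_test` and `y_test` can then be used to evaluate the trained regressor.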
117
+ ## License
118
+ This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for more information.
119
+
120
+ ## Author
121
+ **Vishal Adithya.A**
122
+
123
+ ## Acknowledgments
124
+ - Hugging Face for hosting the NYUv2 dataset.
125
+ - NVIDIA RTX 4060 Ti for providing efficient GPU acceleration.
126
+ - TensorFlow and XGBoost for robust machine learning frameworks.
127
+