XiangpengYang committed
Commit a043943 · 1 parent: a064439

update all

This view is limited to 50 files because it contains too many changes. See raw diff.
Files changed (50)
  1. .gitignore +2 -1
  2. README.md +2 -2
  3. __pycache__/ptp_utils_null_text_inversion.cpython-310.pyc +0 -0
  4. __pycache__/ptp_utils_null_text_inversion.cpython-38.pyc +0 -0
  5. __pycache__/utils.cpython-310.pyc +0 -0
  6. __pycache__/xformers.cpython-310.pyc +0 -0
  7. annotator/__pycache__/util.cpython-310.pyc +0 -0
  8. annotator/dwpose/__pycache__/__init__.cpython-310.pyc +0 -0
  9. annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc +0 -0
  10. annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc +0 -0
  11. annotator/dwpose/__pycache__/util.cpython-310.pyc +0 -0
  12. annotator/dwpose/__pycache__/wholebody.cpython-310.pyc +0 -0
  13. annotator/midas/__pycache__/__init__.cpython-310.pyc +0 -0
  14. annotator/midas/__pycache__/api.cpython-310.pyc +0 -0
  15. annotator/midas/midas/__pycache__/__init__.cpython-310.pyc +0 -0
  16. annotator/midas/midas/__pycache__/base_model.cpython-310.pyc +0 -0
  17. annotator/midas/midas/__pycache__/blocks.cpython-310.pyc +0 -0
  18. annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc +0 -0
  19. annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc +0 -0
  20. annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc +0 -0
  21. annotator/midas/midas/__pycache__/transforms.cpython-310.pyc +0 -0
  22. annotator/midas/midas/__pycache__/vit.cpython-310.pyc +0 -0
  23. annotator/openpose/__pycache__/__init__.cpython-310.pyc +0 -0
  24. annotator/openpose/__pycache__/body.cpython-310.pyc +0 -0
  25. annotator/openpose/__pycache__/face.cpython-310.pyc +0 -0
  26. annotator/openpose/__pycache__/hand.cpython-310.pyc +0 -0
  27. annotator/openpose/__pycache__/model.cpython-310.pyc +0 -0
  28. annotator/openpose/__pycache__/util.cpython-310.pyc +0 -0
  29. annotator/zoe/__pycache__/__init__.cpython-310.pyc +0 -0
  30. annotator/zoe/zoedepth/data/__init__.py +24 -0
  31. annotator/zoe/zoedepth/data/data_mono.py +573 -0
  32. annotator/zoe/zoedepth/data/ddad.py +117 -0
  33. annotator/zoe/zoedepth/data/diml_indoor_test.py +125 -0
  34. annotator/zoe/zoedepth/data/diml_outdoor_test.py +114 -0
  35. annotator/zoe/zoedepth/data/diode.py +125 -0
  36. annotator/zoe/zoedepth/data/hypersim.py +138 -0
  37. annotator/zoe/zoedepth/data/ibims.py +81 -0
  38. annotator/zoe/zoedepth/data/preprocess.py +154 -0
  39. annotator/zoe/zoedepth/data/sun_rgbd_loader.py +106 -0
  40. annotator/zoe/zoedepth/data/transforms.py +481 -0
  41. annotator/zoe/zoedepth/data/vkitti.py +151 -0
  42. annotator/zoe/zoedepth/data/vkitti2.py +187 -0
  43. annotator/zoe/zoedepth/models/__init__.py +24 -0
  44. annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-310.pyc +0 -0
  45. annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-38.pyc +0 -0
  46. annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-39.pyc +0 -0
  47. annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-310.pyc +0 -0
  48. annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-38.pyc +0 -0
  49. annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-39.pyc +0 -0
  50. annotator/zoe/zoedepth/models/__pycache__/model_io.cpython-310.pyc +0 -0
.gitignore CHANGED
@@ -1,3 +1,4 @@
 annotator/ckpts/**
 result/**
-trash/**
+trash/**
+data/**
README.md CHANGED
@@ -6,9 +6,9 @@ Our method is tested using cuda12.1, fp16 of accelerator and xformers on a singl
 conda create -n st-modulator python==3.10
 conda activate st-modulator
 
-# Step 2: Install PyTorch and CUDA
+# Step 2: Install PyTorch, CUDA and Xformers
 conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
-
+pip install --pre -U xformers==0.0.27
 # Step 3: Install additional dependencies with pip
 pip install -r requirements.txt
 ```
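The updated Step 2 pins xformers 0.0.27 alongside PyTorch 2.3.1 / CUDA 12.1. A minimal sanity check of the resulting environment could look like the following hypothetical snippet (not part of the repository):

```python
# Hypothetical post-install check: confirm the pinned versions resolved correctly.
import torch
import xformers

assert torch.__version__.startswith("2.3.1"), torch.__version__
assert xformers.__version__.startswith("0.0.27"), xformers.__version__
print("CUDA available:", torch.cuda.is_available())  # expected True on a CUDA 12.1 machine
```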
__pycache__/ptp_utils_null_text_inversion.cpython-310.pyc DELETED
Binary file (10 kB)
 
__pycache__/ptp_utils_null_text_inversion.cpython-38.pyc DELETED
Binary file (9.33 kB)
 
__pycache__/utils.cpython-310.pyc DELETED
Binary file (2.01 kB)
 
__pycache__/xformers.cpython-310.pyc DELETED
Binary file (359 Bytes)
 
annotator/__pycache__/util.cpython-310.pyc CHANGED
Binary files a/annotator/__pycache__/util.cpython-310.pyc and b/annotator/__pycache__/util.cpython-310.pyc differ
 
annotator/dwpose/__pycache__/__init__.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/__init__.cpython-310.pyc and b/annotator/dwpose/__pycache__/__init__.cpython-310.pyc differ
 
annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc and b/annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc differ
 
annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc and b/annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc differ
 
annotator/dwpose/__pycache__/util.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/util.cpython-310.pyc and b/annotator/dwpose/__pycache__/util.cpython-310.pyc differ
 
annotator/dwpose/__pycache__/wholebody.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc and b/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc differ
 
annotator/midas/__pycache__/__init__.cpython-310.pyc CHANGED
Binary files a/annotator/midas/__pycache__/__init__.cpython-310.pyc and b/annotator/midas/__pycache__/__init__.cpython-310.pyc differ
 
annotator/midas/__pycache__/api.cpython-310.pyc CHANGED
Binary files a/annotator/midas/__pycache__/api.cpython-310.pyc and b/annotator/midas/__pycache__/api.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/__init__.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/__init__.cpython-310.pyc and b/annotator/midas/midas/__pycache__/__init__.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/base_model.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/base_model.cpython-310.pyc and b/annotator/midas/midas/__pycache__/base_model.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/blocks.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/blocks.cpython-310.pyc and b/annotator/midas/midas/__pycache__/blocks.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc and b/annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc and b/annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc and b/annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/transforms.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/transforms.cpython-310.pyc and b/annotator/midas/midas/__pycache__/transforms.cpython-310.pyc differ
 
annotator/midas/midas/__pycache__/vit.cpython-310.pyc CHANGED
Binary files a/annotator/midas/midas/__pycache__/vit.cpython-310.pyc and b/annotator/midas/midas/__pycache__/vit.cpython-310.pyc differ
 
annotator/openpose/__pycache__/__init__.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/__init__.cpython-310.pyc and b/annotator/openpose/__pycache__/__init__.cpython-310.pyc differ
 
annotator/openpose/__pycache__/body.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/body.cpython-310.pyc and b/annotator/openpose/__pycache__/body.cpython-310.pyc differ
 
annotator/openpose/__pycache__/face.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/face.cpython-310.pyc and b/annotator/openpose/__pycache__/face.cpython-310.pyc differ
 
annotator/openpose/__pycache__/hand.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/hand.cpython-310.pyc and b/annotator/openpose/__pycache__/hand.cpython-310.pyc differ
 
annotator/openpose/__pycache__/model.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/model.cpython-310.pyc and b/annotator/openpose/__pycache__/model.cpython-310.pyc differ
 
annotator/openpose/__pycache__/util.cpython-310.pyc CHANGED
Binary files a/annotator/openpose/__pycache__/util.cpython-310.pyc and b/annotator/openpose/__pycache__/util.cpython-310.pyc differ
 
annotator/zoe/__pycache__/__init__.cpython-310.pyc CHANGED
Binary files a/annotator/zoe/__pycache__/__init__.cpython-310.pyc and b/annotator/zoe/__pycache__/__init__.cpython-310.pyc differ
 
annotator/zoe/zoedepth/data/__init__.py ADDED
@@ -0,0 +1,24 @@
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
annotator/zoe/zoedepth/data/data_mono.py ADDED
@@ -0,0 +1,573 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ # This file is partly inspired from BTS (https://github.com/cleinc/bts/blob/master/pytorch/bts_dataloader.py); author: Jin Han Lee
26
+
27
+ import itertools
28
+ import os
29
+ import random
30
+
31
+ import numpy as np
32
+ import cv2
33
+ import torch
34
+ import torch.nn as nn
35
+ import torch.utils.data.distributed
36
+ from zoedepth.utils.easydict import EasyDict as edict
37
+ from PIL import Image, ImageOps
38
+ from torch.utils.data import DataLoader, Dataset
39
+ from torchvision import transforms
40
+
41
+ from zoedepth.utils.config import change_dataset
42
+
43
+ from .ddad import get_ddad_loader
44
+ from .diml_indoor_test import get_diml_indoor_loader
45
+ from .diml_outdoor_test import get_diml_outdoor_loader
46
+ from .diode import get_diode_loader
47
+ from .hypersim import get_hypersim_loader
48
+ from .ibims import get_ibims_loader
49
+ from .sun_rgbd_loader import get_sunrgbd_loader
50
+ from .vkitti import get_vkitti_loader
51
+ from .vkitti2 import get_vkitti2_loader
52
+
53
+ from .preprocess import CropParams, get_white_border, get_black_border
54
+
55
+
56
+ def _is_pil_image(img):
57
+ return isinstance(img, Image.Image)
58
+
59
+
60
+ def _is_numpy_image(img):
61
+ return isinstance(img, np.ndarray) and (img.ndim in {2, 3})
62
+
63
+
64
+ def preprocessing_transforms(mode, **kwargs):
65
+ return transforms.Compose([
66
+ ToTensor(mode=mode, **kwargs)
67
+ ])
68
+
69
+
70
+ class DepthDataLoader(object):
71
+ def __init__(self, config, mode, device='cpu', transform=None, **kwargs):
72
+ """
73
+ Data loader for depth datasets
74
+
75
+ Args:
76
+ config (dict): Config dictionary. Refer to utils/config.py
77
+ mode (str): "train" or "online_eval"
78
+ device (str, optional): Device to load the data on. Defaults to 'cpu'.
79
+ transform (torchvision.transforms, optional): Transform to apply to the data. Defaults to None.
80
+ """
81
+
82
+ self.config = config
83
+
84
+ if config.dataset == 'ibims':
85
+ self.data = get_ibims_loader(config, batch_size=1, num_workers=1)
86
+ return
87
+
88
+ if config.dataset == 'sunrgbd':
89
+ self.data = get_sunrgbd_loader(
90
+ data_dir_root=config.sunrgbd_root, batch_size=1, num_workers=1)
91
+ return
92
+
93
+ if config.dataset == 'diml_indoor':
94
+ self.data = get_diml_indoor_loader(
95
+ data_dir_root=config.diml_indoor_root, batch_size=1, num_workers=1)
96
+ return
97
+
98
+ if config.dataset == 'diml_outdoor':
99
+ self.data = get_diml_outdoor_loader(
100
+ data_dir_root=config.diml_outdoor_root, batch_size=1, num_workers=1)
101
+ return
102
+
103
+ if "diode" in config.dataset:
104
+ self.data = get_diode_loader(
105
+ config[config.dataset+"_root"], batch_size=1, num_workers=1)
106
+ return
107
+
108
+ if config.dataset == 'hypersim_test':
109
+ self.data = get_hypersim_loader(
110
+ config.hypersim_test_root, batch_size=1, num_workers=1)
111
+ return
112
+
113
+ if config.dataset == 'vkitti':
114
+ self.data = get_vkitti_loader(
115
+ config.vkitti_root, batch_size=1, num_workers=1)
116
+ return
117
+
118
+ if config.dataset == 'vkitti2':
119
+ self.data = get_vkitti2_loader(
120
+ config.vkitti2_root, batch_size=1, num_workers=1)
121
+ return
122
+
123
+ if config.dataset == 'ddad':
124
+ self.data = get_ddad_loader(config.ddad_root, resize_shape=(
125
+ 352, 1216), batch_size=1, num_workers=1)
126
+ return
127
+
128
+ img_size = self.config.get("img_size", None)
129
+ img_size = img_size if self.config.get(
130
+ "do_input_resize", False) else None
131
+
132
+ if transform is None:
133
+ transform = preprocessing_transforms(mode, size=img_size)
134
+
135
+ if mode == 'train':
136
+
137
+ Dataset = DataLoadPreprocess
138
+ self.training_samples = Dataset(
139
+ config, mode, transform=transform, device=device)
140
+
141
+ if config.distributed:
142
+ self.train_sampler = torch.utils.data.distributed.DistributedSampler(
143
+ self.training_samples)
144
+ else:
145
+ self.train_sampler = None
146
+
147
+ self.data = DataLoader(self.training_samples,
148
+ batch_size=config.batch_size,
149
+ shuffle=(self.train_sampler is None),
150
+ num_workers=config.workers,
151
+ pin_memory=True,
152
+ persistent_workers=True,
153
+ # prefetch_factor=2,
154
+ sampler=self.train_sampler)
155
+
156
+ elif mode == 'online_eval':
157
+ self.testing_samples = DataLoadPreprocess(
158
+ config, mode, transform=transform)
159
+ if config.distributed: # redundant. here only for readability and to be more explicit
160
+ # Give whole test set to all processes (and report evaluation only on one) regardless
161
+ self.eval_sampler = None
162
+ else:
163
+ self.eval_sampler = None
164
+ self.data = DataLoader(self.testing_samples, 1,
165
+ shuffle=kwargs.get("shuffle_test", False),
166
+ num_workers=1,
167
+ pin_memory=False,
168
+ sampler=self.eval_sampler)
169
+
170
+ elif mode == 'test':
171
+ self.testing_samples = DataLoadPreprocess(
172
+ config, mode, transform=transform)
173
+ self.data = DataLoader(self.testing_samples,
174
+ 1, shuffle=False, num_workers=1)
175
+
176
+ else:
177
+ print(
178
+ 'mode should be one of \'train, test, online_eval\'. Got {}'.format(mode))
179
+
180
+
181
+ def repetitive_roundrobin(*iterables):
182
+ """
183
+ cycles through iterables but sample wise
184
+ first yield first sample from first iterable then first sample from second iterable and so on
185
+ then second sample from first iterable then second sample from second iterable and so on
186
+
187
+ If one iterable is shorter than the others, it is repeated until all iterables are exhausted
188
+ repetitive_roundrobin('ABC', 'D', 'EF') --> A D E B D F C D E
189
+ """
190
+ # Repetitive roundrobin
191
+ iterables_ = [iter(it) for it in iterables]
192
+ exhausted = [False] * len(iterables)
193
+ while not all(exhausted):
194
+ for i, it in enumerate(iterables_):
195
+ try:
196
+ yield next(it)
197
+ except StopIteration:
198
+ exhausted[i] = True
199
+ iterables_[i] = itertools.cycle(iterables[i])
200
+ # First elements may get repeated if one iterable is shorter than the others
201
+ yield next(iterables_[i])
202
+
203
+
204
+ class RepetitiveRoundRobinDataLoader(object):
205
+ def __init__(self, *dataloaders):
206
+ self.dataloaders = dataloaders
207
+
208
+ def __iter__(self):
209
+ return repetitive_roundrobin(*self.dataloaders)
210
+
211
+ def __len__(self):
212
+ # First samples get repeated, thats why the plus one
213
+ return len(self.dataloaders) * (max(len(dl) for dl in self.dataloaders) + 1)
214
+
215
+
216
+ class MixedNYUKITTI(object):
217
+ def __init__(self, config, mode, device='cpu', **kwargs):
218
+ config = edict(config)
219
+ config.workers = config.workers // 2
220
+ self.config = config
221
+ nyu_conf = change_dataset(edict(config), 'nyu')
222
+ kitti_conf = change_dataset(edict(config), 'kitti')
223
+
224
+ # make nyu default for testing
225
+ self.config = config = nyu_conf
226
+ img_size = self.config.get("img_size", None)
227
+ img_size = img_size if self.config.get(
228
+ "do_input_resize", False) else None
229
+ if mode == 'train':
230
+ nyu_loader = DepthDataLoader(
231
+ nyu_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
232
+ kitti_loader = DepthDataLoader(
233
+ kitti_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
234
+ # It has been changed to repetitive roundrobin
235
+ self.data = RepetitiveRoundRobinDataLoader(
236
+ nyu_loader, kitti_loader)
237
+ else:
238
+ self.data = DepthDataLoader(nyu_conf, mode, device=device).data
239
+
240
+
241
+ def remove_leading_slash(s):
242
+ if s[0] == '/' or s[0] == '\\':
243
+ return s[1:]
244
+ return s
245
+
246
+
247
+ class CachedReader:
248
+ def __init__(self, shared_dict=None):
249
+ if shared_dict:
250
+ self._cache = shared_dict
251
+ else:
252
+ self._cache = {}
253
+
254
+ def open(self, fpath):
255
+ im = self._cache.get(fpath, None)
256
+ if im is None:
257
+ im = self._cache[fpath] = Image.open(fpath)
258
+ return im
259
+
260
+
261
+ class ImReader:
262
+ def __init__(self):
263
+ pass
264
+
265
+ # @cache
266
+ def open(self, fpath):
267
+ return Image.open(fpath)
268
+
269
+
270
+ class DataLoadPreprocess(Dataset):
271
+ def __init__(self, config, mode, transform=None, is_for_online_eval=False, **kwargs):
272
+ self.config = config
273
+ if mode == 'online_eval':
274
+ with open(config.filenames_file_eval, 'r') as f:
275
+ self.filenames = f.readlines()
276
+ else:
277
+ with open(config.filenames_file, 'r') as f:
278
+ self.filenames = f.readlines()
279
+
280
+ self.mode = mode
281
+ self.transform = transform
282
+ self.to_tensor = ToTensor(mode)
283
+ self.is_for_online_eval = is_for_online_eval
284
+ if config.use_shared_dict:
285
+ self.reader = CachedReader(config.shared_dict)
286
+ else:
287
+ self.reader = ImReader()
288
+
289
+ def postprocess(self, sample):
290
+ return sample
291
+
292
+ def __getitem__(self, idx):
293
+ sample_path = self.filenames[idx]
294
+ focal = float(sample_path.split()[2])
295
+ sample = {}
296
+
297
+ if self.mode == 'train':
298
+ if self.config.dataset == 'kitti' and self.config.use_right and random.random() > 0.5:
299
+ image_path = os.path.join(
300
+ self.config.data_path, remove_leading_slash(sample_path.split()[3]))
301
+ depth_path = os.path.join(
302
+ self.config.gt_path, remove_leading_slash(sample_path.split()[4]))
303
+ else:
304
+ image_path = os.path.join(
305
+ self.config.data_path, remove_leading_slash(sample_path.split()[0]))
306
+ depth_path = os.path.join(
307
+ self.config.gt_path, remove_leading_slash(sample_path.split()[1]))
308
+
309
+ image = self.reader.open(image_path)
310
+ depth_gt = self.reader.open(depth_path)
311
+ w, h = image.size
312
+
313
+ if self.config.do_kb_crop:
314
+ height = image.height
315
+ width = image.width
316
+ top_margin = int(height - 352)
317
+ left_margin = int((width - 1216) / 2)
318
+ depth_gt = depth_gt.crop(
319
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
320
+ image = image.crop(
321
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
322
+
323
+ # Avoid blank boundaries due to pixel registration?
324
+ # Train images have white border. Test images have black border.
325
+ if self.config.dataset == 'nyu' and self.config.avoid_boundary:
326
+ # print("Avoiding Blank Boundaries!")
327
+ # We just crop and pad again with reflect padding to original size
328
+ # original_size = image.size
329
+ crop_params = get_white_border(np.array(image, dtype=np.uint8))
330
+ image = image.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))
331
+ depth_gt = depth_gt.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))
332
+
333
+ # Use reflect padding to fill the blank
334
+ image = np.array(image)
335
+ image = np.pad(image, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right), (0, 0)), mode='reflect')
336
+ image = Image.fromarray(image)
337
+
338
+ depth_gt = np.array(depth_gt)
339
+ depth_gt = np.pad(depth_gt, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right)), 'constant', constant_values=0)
340
+ depth_gt = Image.fromarray(depth_gt)
341
+
342
+
343
+ if self.config.do_random_rotate and (self.config.aug):
344
+ random_angle = (random.random() - 0.5) * 2 * self.config.degree
345
+ image = self.rotate_image(image, random_angle)
346
+ depth_gt = self.rotate_image(
347
+ depth_gt, random_angle, flag=Image.NEAREST)
348
+
349
+ image = np.asarray(image, dtype=np.float32) / 255.0
350
+ depth_gt = np.asarray(depth_gt, dtype=np.float32)
351
+ depth_gt = np.expand_dims(depth_gt, axis=2)
352
+
353
+ if self.config.dataset == 'nyu':
354
+ depth_gt = depth_gt / 1000.0
355
+ else:
356
+ depth_gt = depth_gt / 256.0
357
+
358
+ if self.config.aug and (self.config.random_crop):
359
+ image, depth_gt = self.random_crop(
360
+ image, depth_gt, self.config.input_height, self.config.input_width)
361
+
362
+ if self.config.aug and self.config.random_translate:
363
+ # print("Random Translation!")
364
+ image, depth_gt = self.random_translate(image, depth_gt, self.config.max_translation)
365
+
366
+ image, depth_gt = self.train_preprocess(image, depth_gt)
367
+ mask = np.logical_and(depth_gt > self.config.min_depth,
368
+ depth_gt < self.config.max_depth).squeeze()[None, ...]
369
+ sample = {'image': image, 'depth': depth_gt, 'focal': focal,
370
+ 'mask': mask, **sample}
371
+
372
+ else:
373
+ if self.mode == 'online_eval':
374
+ data_path = self.config.data_path_eval
375
+ else:
376
+ data_path = self.config.data_path
377
+
378
+ image_path = os.path.join(
379
+ data_path, remove_leading_slash(sample_path.split()[0]))
380
+ image = np.asarray(self.reader.open(image_path),
381
+ dtype=np.float32) / 255.0
382
+
383
+ if self.mode == 'online_eval':
384
+ gt_path = self.config.gt_path_eval
385
+ depth_path = os.path.join(
386
+ gt_path, remove_leading_slash(sample_path.split()[1]))
387
+ has_valid_depth = False
388
+ try:
389
+ depth_gt = self.reader.open(depth_path)
390
+ has_valid_depth = True
391
+ except IOError:
392
+ depth_gt = False
393
+ # print('Missing gt for {}'.format(image_path))
394
+
395
+ if has_valid_depth:
396
+ depth_gt = np.asarray(depth_gt, dtype=np.float32)
397
+ depth_gt = np.expand_dims(depth_gt, axis=2)
398
+ if self.config.dataset == 'nyu':
399
+ depth_gt = depth_gt / 1000.0
400
+ else:
401
+ depth_gt = depth_gt / 256.0
402
+
403
+ mask = np.logical_and(
404
+ depth_gt >= self.config.min_depth, depth_gt <= self.config.max_depth).squeeze()[None, ...]
405
+ else:
406
+ mask = False
407
+
408
+ if self.config.do_kb_crop:
409
+ height = image.shape[0]
410
+ width = image.shape[1]
411
+ top_margin = int(height - 352)
412
+ left_margin = int((width - 1216) / 2)
413
+ image = image[top_margin:top_margin + 352,
414
+ left_margin:left_margin + 1216, :]
415
+ if self.mode == 'online_eval' and has_valid_depth:
416
+ depth_gt = depth_gt[top_margin:top_margin +
417
+ 352, left_margin:left_margin + 1216, :]
418
+
419
+ if self.mode == 'online_eval':
420
+ sample = {'image': image, 'depth': depth_gt, 'focal': focal, 'has_valid_depth': has_valid_depth,
421
+ 'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1],
422
+ 'mask': mask}
423
+ else:
424
+ sample = {'image': image, 'focal': focal}
425
+
426
+ if (self.mode == 'train') or ('has_valid_depth' in sample and sample['has_valid_depth']):
427
+ mask = np.logical_and(depth_gt > self.config.min_depth,
428
+ depth_gt < self.config.max_depth).squeeze()[None, ...]
429
+ sample['mask'] = mask
430
+
431
+ if self.transform:
432
+ sample = self.transform(sample)
433
+
434
+ sample = self.postprocess(sample)
435
+ sample['dataset'] = self.config.dataset
436
+ sample = {**sample, 'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1]}
437
+
438
+ return sample
439
+
440
+ def rotate_image(self, image, angle, flag=Image.BILINEAR):
441
+ result = image.rotate(angle, resample=flag)
442
+ return result
443
+
444
+ def random_crop(self, img, depth, height, width):
445
+ assert img.shape[0] >= height
446
+ assert img.shape[1] >= width
447
+ assert img.shape[0] == depth.shape[0]
448
+ assert img.shape[1] == depth.shape[1]
449
+ x = random.randint(0, img.shape[1] - width)
450
+ y = random.randint(0, img.shape[0] - height)
451
+ img = img[y:y + height, x:x + width, :]
452
+ depth = depth[y:y + height, x:x + width, :]
453
+
454
+ return img, depth
455
+
456
+ def random_translate(self, img, depth, max_t=20):
457
+ assert img.shape[0] == depth.shape[0]
458
+ assert img.shape[1] == depth.shape[1]
459
+ p = self.config.translate_prob
460
+ do_translate = random.random()
461
+ if do_translate > p:
462
+ return img, depth
463
+ x = random.randint(-max_t, max_t)
464
+ y = random.randint(-max_t, max_t)
465
+ M = np.float32([[1, 0, x], [0, 1, y]])
466
+ # print(img.shape, depth.shape)
467
+ img = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
468
+ depth = cv2.warpAffine(depth, M, (depth.shape[1], depth.shape[0]))
469
+ depth = depth.squeeze()[..., None] # add channel dim back. Affine warp removes it
470
+ # print("after", img.shape, depth.shape)
471
+ return img, depth
472
+
473
+ def train_preprocess(self, image, depth_gt):
474
+ if self.config.aug:
475
+ # Random flipping
476
+ do_flip = random.random()
477
+ if do_flip > 0.5:
478
+ image = (image[:, ::-1, :]).copy()
479
+ depth_gt = (depth_gt[:, ::-1, :]).copy()
480
+
481
+ # Random gamma, brightness, color augmentation
482
+ do_augment = random.random()
483
+ if do_augment > 0.5:
484
+ image = self.augment_image(image)
485
+
486
+ return image, depth_gt
487
+
488
+ def augment_image(self, image):
489
+ # gamma augmentation
490
+ gamma = random.uniform(0.9, 1.1)
491
+ image_aug = image ** gamma
492
+
493
+ # brightness augmentation
494
+ if self.config.dataset == 'nyu':
495
+ brightness = random.uniform(0.75, 1.25)
496
+ else:
497
+ brightness = random.uniform(0.9, 1.1)
498
+ image_aug = image_aug * brightness
499
+
500
+ # color augmentation
501
+ colors = np.random.uniform(0.9, 1.1, size=3)
502
+ white = np.ones((image.shape[0], image.shape[1]))
503
+ color_image = np.stack([white * colors[i] for i in range(3)], axis=2)
504
+ image_aug *= color_image
505
+ image_aug = np.clip(image_aug, 0, 1)
506
+
507
+ return image_aug
508
+
509
+ def __len__(self):
510
+ return len(self.filenames)
511
+
512
+
513
+ class ToTensor(object):
514
+ def __init__(self, mode, do_normalize=False, size=None):
515
+ self.mode = mode
516
+ self.normalize = transforms.Normalize(
517
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if do_normalize else nn.Identity()
518
+ self.size = size
519
+ if size is not None:
520
+ self.resize = transforms.Resize(size=size)
521
+ else:
522
+ self.resize = nn.Identity()
523
+
524
+ def __call__(self, sample):
525
+ image, focal = sample['image'], sample['focal']
526
+ image = self.to_tensor(image)
527
+ image = self.normalize(image)
528
+ image = self.resize(image)
529
+
530
+ if self.mode == 'test':
531
+ return {'image': image, 'focal': focal}
532
+
533
+ depth = sample['depth']
534
+ if self.mode == 'train':
535
+ depth = self.to_tensor(depth)
536
+ return {**sample, 'image': image, 'depth': depth, 'focal': focal}
537
+ else:
538
+ has_valid_depth = sample['has_valid_depth']
539
+ image = self.resize(image)
540
+ return {**sample, 'image': image, 'depth': depth, 'focal': focal, 'has_valid_depth': has_valid_depth,
541
+ 'image_path': sample['image_path'], 'depth_path': sample['depth_path']}
542
+
543
+ def to_tensor(self, pic):
544
+ if not (_is_pil_image(pic) or _is_numpy_image(pic)):
545
+ raise TypeError(
546
+ 'pic should be PIL Image or ndarray. Got {}'.format(type(pic)))
547
+
548
+ if isinstance(pic, np.ndarray):
549
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
550
+ return img
551
+
552
+ # handle PIL Image
553
+ if pic.mode == 'I':
554
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
555
+ elif pic.mode == 'I;16':
556
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
557
+ else:
558
+ img = torch.ByteTensor(
559
+ torch.ByteStorage.from_buffer(pic.tobytes()))
560
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
561
+ if pic.mode == 'YCbCr':
562
+ nchannel = 3
563
+ elif pic.mode == 'I;16':
564
+ nchannel = 1
565
+ else:
566
+ nchannel = len(pic.mode)
567
+ img = img.view(pic.size[1], pic.size[0], nchannel)
568
+
569
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
570
+ if isinstance(img, torch.ByteTensor):
571
+ return img.float()
572
+ else:
573
+ return img
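data_mono.py exposes DepthDataLoader, which dispatches to the dataset-specific loaders above based on config.dataset. A hedged usage sketch follows; get_config and its argument names are assumptions borrowed from upstream ZoeDepth and are not part of this commit:

```python
# Hypothetical usage sketch; get_config and its signature are assumptions taken
# from upstream ZoeDepth's zoedepth/utils/config.py, not from files in this commit.
from zoedepth.utils.config import get_config
from zoedepth.data.data_mono import DepthDataLoader

config = get_config("zoedepth", "eval", "nyu")            # assumed helper/signature
loader = DepthDataLoader(config, mode="online_eval").data
batch = next(iter(loader))
print(batch["image"].shape, batch["has_valid_depth"])
```

For the train and online_eval branches the config additionally needs filenames_file / filenames_file_eval, data_path / gt_path, and the augmentation flags read in DataLoadPreprocess.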
annotator/zoe/zoedepth/data/ddad.py ADDED
@@ -0,0 +1,117 @@
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
+ import os
+
+ import numpy as np
+ import torch
+ from PIL import Image
+ from torch.utils.data import DataLoader, Dataset
+ from torchvision import transforms
+
+
+ class ToTensor(object):
+     def __init__(self, resize_shape):
+         # self.normalize = transforms.Normalize(
+         #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+         self.normalize = lambda x : x
+         self.resize = transforms.Resize(resize_shape)
+
+     def __call__(self, sample):
+         image, depth = sample['image'], sample['depth']
+         image = self.to_tensor(image)
+         image = self.normalize(image)
+         depth = self.to_tensor(depth)
+
+         image = self.resize(image)
+
+         return {'image': image, 'depth': depth, 'dataset': "ddad"}
+
+     def to_tensor(self, pic):
+
+         if isinstance(pic, np.ndarray):
+             img = torch.from_numpy(pic.transpose((2, 0, 1)))
+             return img
+
+         # # handle PIL Image
+         if pic.mode == 'I':
+             img = torch.from_numpy(np.array(pic, np.int32, copy=False))
+         elif pic.mode == 'I;16':
+             img = torch.from_numpy(np.array(pic, np.int16, copy=False))
+         else:
+             img = torch.ByteTensor(
+                 torch.ByteStorage.from_buffer(pic.tobytes()))
+         # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
+         if pic.mode == 'YCbCr':
+             nchannel = 3
+         elif pic.mode == 'I;16':
+             nchannel = 1
+         else:
+             nchannel = len(pic.mode)
+         img = img.view(pic.size[1], pic.size[0], nchannel)
+
+         img = img.transpose(0, 1).transpose(0, 2).contiguous()
+
+         if isinstance(img, torch.ByteTensor):
+             return img.float()
+         else:
+             return img
+
+
+ class DDAD(Dataset):
+     def __init__(self, data_dir_root, resize_shape):
+         import glob
+
+         # image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
+         self.image_files = glob.glob(os.path.join(data_dir_root, '*.png'))
+         self.depth_files = [r.replace("_rgb.png", "_depth.npy")
+                             for r in self.image_files]
+         self.transform = ToTensor(resize_shape)
+
+     def __getitem__(self, idx):
+
+         image_path = self.image_files[idx]
+         depth_path = self.depth_files[idx]
+
+         image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
+         depth = np.load(depth_path)  # meters
+
+         # depth[depth > 8] = -1
+         depth = depth[..., None]
+
+         sample = dict(image=image, depth=depth)
+         sample = self.transform(sample)
+
+         if idx == 0:
+             print(sample["image"].shape)
+
+         return sample
+
+     def __len__(self):
+         return len(self.image_files)
+
+
+ def get_ddad_loader(data_dir_root, resize_shape, batch_size=1, **kwargs):
+     dataset = DDAD(data_dir_root, resize_shape)
+     return DataLoader(dataset, batch_size, **kwargs)
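The DDAD loader pairs each *_rgb.png with a *_depth.npy of metric depths, per the glob/replace logic above. A hypothetical invocation (the directory path is a placeholder; the (352, 1216) resize shape mirrors the default used in data_mono.py):

```python
# Hypothetical call; "path/to/ddad_val" is a placeholder directory containing
# flat *_rgb.png / *_depth.npy pairs, as implied by DDAD.__init__ above.
from zoedepth.data.ddad import get_ddad_loader

loader = get_ddad_loader("path/to/ddad_val", resize_shape=(352, 1216), batch_size=1)
sample = next(iter(loader))
print(sample["image"].shape, sample["depth"].shape)  # images resized to 352x1216
```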
annotator/zoe/zoedepth/data/diml_indoor_test.py ADDED
@@ -0,0 +1,125 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import os
26
+
27
+ import numpy as np
28
+ import torch
29
+ from PIL import Image
30
+ from torch.utils.data import DataLoader, Dataset
31
+ from torchvision import transforms
32
+
33
+
34
+ class ToTensor(object):
35
+ def __init__(self):
36
+ # self.normalize = transforms.Normalize(
37
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
38
+ self.normalize = lambda x : x
39
+ self.resize = transforms.Resize((480, 640))
40
+
41
+ def __call__(self, sample):
42
+ image, depth = sample['image'], sample['depth']
43
+ image = self.to_tensor(image)
44
+ image = self.normalize(image)
45
+ depth = self.to_tensor(depth)
46
+
47
+ image = self.resize(image)
48
+
49
+ return {'image': image, 'depth': depth, 'dataset': "diml_indoor"}
50
+
51
+ def to_tensor(self, pic):
52
+
53
+ if isinstance(pic, np.ndarray):
54
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
55
+ return img
56
+
57
+ # # handle PIL Image
58
+ if pic.mode == 'I':
59
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
60
+ elif pic.mode == 'I;16':
61
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
62
+ else:
63
+ img = torch.ByteTensor(
64
+ torch.ByteStorage.from_buffer(pic.tobytes()))
65
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
66
+ if pic.mode == 'YCbCr':
67
+ nchannel = 3
68
+ elif pic.mode == 'I;16':
69
+ nchannel = 1
70
+ else:
71
+ nchannel = len(pic.mode)
72
+ img = img.view(pic.size[1], pic.size[0], nchannel)
73
+
74
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
75
+ if isinstance(img, torch.ByteTensor):
76
+ return img.float()
77
+ else:
78
+ return img
79
+
80
+
81
+ class DIML_Indoor(Dataset):
82
+ def __init__(self, data_dir_root):
83
+ import glob
84
+
85
+ # image paths are of the form <data_dir_root>/{HR, LR}/<scene>/{color, depth_filled}/*.png
86
+ self.image_files = glob.glob(os.path.join(
87
+ data_dir_root, "LR", '*', 'color', '*.png'))
88
+ self.depth_files = [r.replace("color", "depth_filled").replace(
89
+ "_c.png", "_depth_filled.png") for r in self.image_files]
90
+ self.transform = ToTensor()
91
+
92
+ def __getitem__(self, idx):
93
+ image_path = self.image_files[idx]
94
+ depth_path = self.depth_files[idx]
95
+
96
+ image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
97
+ depth = np.asarray(Image.open(depth_path),
98
+ dtype='uint16') / 1000.0 # mm to meters
99
+
100
+ # print(np.shape(image))
101
+ # print(np.shape(depth))
102
+
103
+ # depth[depth > 8] = -1
104
+ depth = depth[..., None]
105
+
106
+ sample = dict(image=image, depth=depth)
107
+
108
+ # return sample
109
+ sample = self.transform(sample)
110
+
111
+ if idx == 0:
112
+ print(sample["image"].shape)
113
+
114
+ return sample
115
+
116
+ def __len__(self):
117
+ return len(self.image_files)
118
+
119
+
120
+ def get_diml_indoor_loader(data_dir_root, batch_size=1, **kwargs):
121
+ dataset = DIML_Indoor(data_dir_root)
122
+ return DataLoader(dataset, batch_size, **kwargs)
123
+
124
+ # get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/HR")
125
+ # get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/LR")
annotator/zoe/zoedepth/data/diml_outdoor_test.py ADDED
@@ -0,0 +1,114 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import os
26
+
27
+ import numpy as np
28
+ import torch
29
+ from PIL import Image
30
+ from torch.utils.data import DataLoader, Dataset
31
+ from torchvision import transforms
32
+
33
+
34
+ class ToTensor(object):
35
+ def __init__(self):
36
+ # self.normalize = transforms.Normalize(
37
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
38
+ self.normalize = lambda x : x
39
+
40
+ def __call__(self, sample):
41
+ image, depth = sample['image'], sample['depth']
42
+ image = self.to_tensor(image)
43
+ image = self.normalize(image)
44
+ depth = self.to_tensor(depth)
45
+
46
+ return {'image': image, 'depth': depth, 'dataset': "diml_outdoor"}
47
+
48
+ def to_tensor(self, pic):
49
+
50
+ if isinstance(pic, np.ndarray):
51
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
52
+ return img
53
+
54
+ # # handle PIL Image
55
+ if pic.mode == 'I':
56
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
57
+ elif pic.mode == 'I;16':
58
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
59
+ else:
60
+ img = torch.ByteTensor(
61
+ torch.ByteStorage.from_buffer(pic.tobytes()))
62
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
63
+ if pic.mode == 'YCbCr':
64
+ nchannel = 3
65
+ elif pic.mode == 'I;16':
66
+ nchannel = 1
67
+ else:
68
+ nchannel = len(pic.mode)
69
+ img = img.view(pic.size[1], pic.size[0], nchannel)
70
+
71
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
72
+ if isinstance(img, torch.ByteTensor):
73
+ return img.float()
74
+ else:
75
+ return img
76
+
77
+
78
+ class DIML_Outdoor(Dataset):
79
+ def __init__(self, data_dir_root):
80
+ import glob
81
+
82
+ # image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
83
+ self.image_files = glob.glob(os.path.join(
84
+ data_dir_root, "*", 'outleft', '*.png'))
85
+ self.depth_files = [r.replace("outleft", "depthmap")
86
+ for r in self.image_files]
87
+ self.transform = ToTensor()
88
+
89
+ def __getitem__(self, idx):
90
+ image_path = self.image_files[idx]
91
+ depth_path = self.depth_files[idx]
92
+
93
+ image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
94
+ depth = np.asarray(Image.open(depth_path),
95
+ dtype='uint16') / 1000.0 # mm to meters
96
+
97
+ # depth[depth > 8] = -1
98
+ depth = depth[..., None]
99
+
100
+ sample = dict(image=image, depth=depth, dataset="diml_outdoor")
101
+
102
+ # return sample
103
+ return self.transform(sample)
104
+
105
+ def __len__(self):
106
+ return len(self.image_files)
107
+
108
+
109
+ def get_diml_outdoor_loader(data_dir_root, batch_size=1, **kwargs):
110
+ dataset = DIML_Outdoor(data_dir_root)
111
+ return DataLoader(dataset, batch_size, **kwargs)
112
+
113
+ # get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/HR")
114
+ # get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/LR")
annotator/zoe/zoedepth/data/diode.py ADDED
@@ -0,0 +1,125 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import os
26
+
27
+ import numpy as np
28
+ import torch
29
+ from PIL import Image
30
+ from torch.utils.data import DataLoader, Dataset
31
+ from torchvision import transforms
32
+
33
+
34
+ class ToTensor(object):
35
+ def __init__(self):
36
+ # self.normalize = transforms.Normalize(
37
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
38
+ self.normalize = lambda x : x
39
+ self.resize = transforms.Resize(480)
40
+
41
+ def __call__(self, sample):
42
+ image, depth = sample['image'], sample['depth']
43
+ image = self.to_tensor(image)
44
+ image = self.normalize(image)
45
+ depth = self.to_tensor(depth)
46
+
47
+ image = self.resize(image)
48
+
49
+ return {'image': image, 'depth': depth, 'dataset': "diode"}
50
+
51
+ def to_tensor(self, pic):
52
+
53
+ if isinstance(pic, np.ndarray):
54
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
55
+ return img
56
+
57
+ # # handle PIL Image
58
+ if pic.mode == 'I':
59
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
60
+ elif pic.mode == 'I;16':
61
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
62
+ else:
63
+ img = torch.ByteTensor(
64
+ torch.ByteStorage.from_buffer(pic.tobytes()))
65
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
66
+ if pic.mode == 'YCbCr':
67
+ nchannel = 3
68
+ elif pic.mode == 'I;16':
69
+ nchannel = 1
70
+ else:
71
+ nchannel = len(pic.mode)
72
+ img = img.view(pic.size[1], pic.size[0], nchannel)
73
+
74
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
75
+
76
+ if isinstance(img, torch.ByteTensor):
77
+ return img.float()
78
+ else:
79
+ return img
80
+
81
+
82
+ class DIODE(Dataset):
83
+ def __init__(self, data_dir_root):
84
+ import glob
85
+
86
+ # image paths are of the form <data_dir_root>/scene_#/scan_#/*.png
87
+ self.image_files = glob.glob(
88
+ os.path.join(data_dir_root, '*', '*', '*.png'))
89
+ self.depth_files = [r.replace(".png", "_depth.npy")
90
+ for r in self.image_files]
91
+ self.depth_mask_files = [
92
+ r.replace(".png", "_depth_mask.npy") for r in self.image_files]
93
+ self.transform = ToTensor()
94
+
95
+ def __getitem__(self, idx):
96
+ image_path = self.image_files[idx]
97
+ depth_path = self.depth_files[idx]
98
+ depth_mask_path = self.depth_mask_files[idx]
99
+
100
+ image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
101
+ depth = np.load(depth_path) # in meters
102
+ valid = np.load(depth_mask_path) # binary
103
+
104
+ # depth[depth > 8] = -1
105
+ # depth = depth[..., None]
106
+
107
+ sample = dict(image=image, depth=depth, valid=valid)
108
+
109
+ # return sample
110
+ sample = self.transform(sample)
111
+
112
+ if idx == 0:
113
+ print(sample["image"].shape)
114
+
115
+ return sample
116
+
117
+ def __len__(self):
118
+ return len(self.image_files)
119
+
120
+
121
+ def get_diode_loader(data_dir_root, batch_size=1, **kwargs):
122
+ dataset = DIODE(data_dir_root)
123
+ return DataLoader(dataset, batch_size, **kwargs)
124
+
125
+ # get_diode_loader(data_dir_root="datasets/diode/val/outdoor")
annotator/zoe/zoedepth/data/hypersim.py ADDED
@@ -0,0 +1,138 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import glob
26
+ import os
27
+
28
+ import h5py
29
+ import numpy as np
30
+ import torch
31
+ from PIL import Image
32
+ from torch.utils.data import DataLoader, Dataset
33
+ from torchvision import transforms
34
+
35
+
36
+ def hypersim_distance_to_depth(npyDistance):
37
+ intWidth, intHeight, fltFocal = 1024, 768, 886.81
38
+
39
+ npyImageplaneX = np.linspace((-0.5 * intWidth) + 0.5, (0.5 * intWidth) - 0.5, intWidth).reshape(
40
+ 1, intWidth).repeat(intHeight, 0).astype(np.float32)[:, :, None]
41
+ npyImageplaneY = np.linspace((-0.5 * intHeight) + 0.5, (0.5 * intHeight) - 0.5,
42
+ intHeight).reshape(intHeight, 1).repeat(intWidth, 1).astype(np.float32)[:, :, None]
43
+ npyImageplaneZ = np.full([intHeight, intWidth, 1], fltFocal, np.float32)
44
+ npyImageplane = np.concatenate(
45
+ [npyImageplaneX, npyImageplaneY, npyImageplaneZ], 2)
46
+
47
+ npyDepth = npyDistance / np.linalg.norm(npyImageplane, 2, 2) * fltFocal
48
+ return npyDepth
49
+
50
+
51
+ class ToTensor(object):
52
+ def __init__(self):
53
+ # self.normalize = transforms.Normalize(
54
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
55
+ self.normalize = lambda x: x
56
+ self.resize = transforms.Resize((480, 640))
57
+
58
+ def __call__(self, sample):
59
+ image, depth = sample['image'], sample['depth']
60
+ image = self.to_tensor(image)
61
+ image = self.normalize(image)
62
+ depth = self.to_tensor(depth)
63
+
64
+ image = self.resize(image)
65
+
66
+ return {'image': image, 'depth': depth, 'dataset': "hypersim"}
67
+
68
+ def to_tensor(self, pic):
69
+
70
+ if isinstance(pic, np.ndarray):
71
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
72
+ return img
73
+
74
+ # # handle PIL Image
75
+ if pic.mode == 'I':
76
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
77
+ elif pic.mode == 'I;16':
78
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
79
+ else:
80
+ img = torch.ByteTensor(
81
+ torch.ByteStorage.from_buffer(pic.tobytes()))
82
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
83
+ if pic.mode == 'YCbCr':
84
+ nchannel = 3
85
+ elif pic.mode == 'I;16':
86
+ nchannel = 1
87
+ else:
88
+ nchannel = len(pic.mode)
89
+ img = img.view(pic.size[1], pic.size[0], nchannel)
90
+
91
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
92
+ if isinstance(img, torch.ByteTensor):
93
+ return img.float()
94
+ else:
95
+ return img
96
+
97
+
98
+ class HyperSim(Dataset):
99
+ def __init__(self, data_dir_root):
100
+ # image paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.tonemap.jpg
101
+ # depth paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.depth_meters.hdf5
102
+ self.image_files = glob.glob(os.path.join(
103
+ data_dir_root, '*', 'images', 'scene_cam_*_final_preview', '*.tonemap.jpg'))
104
+ self.depth_files = [r.replace("_final_preview", "_geometry_hdf5").replace(
105
+ ".tonemap.jpg", ".depth_meters.hdf5") for r in self.image_files]
106
+ self.transform = ToTensor()
107
+
108
+ def __getitem__(self, idx):
109
+ image_path = self.image_files[idx]
110
+ depth_path = self.depth_files[idx]
111
+
112
+ image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
113
+
114
+ # depth from hdf5
115
+ depth_fd = h5py.File(depth_path, "r")
116
+ # in meters (Euclidean distance)
117
+ distance_meters = np.array(depth_fd['dataset'])
118
+ depth = hypersim_distance_to_depth(
119
+ distance_meters) # in meters (planar depth)
120
+
121
+ # depth[depth > 8] = -1
122
+ depth = depth[..., None]
123
+
124
+ sample = dict(image=image, depth=depth)
125
+ sample = self.transform(sample)
126
+
127
+ if idx == 0:
128
+ print(sample["image"].shape)
129
+
130
+ return sample
131
+
132
+ def __len__(self):
133
+ return len(self.image_files)
134
+
135
+
136
+ def get_hypersim_loader(data_dir_root, batch_size=1, **kwargs):
137
+ dataset = HyperSim(data_dir_root)
138
+ return DataLoader(dataset, batch_size, **kwargs)
annotator/zoe/zoedepth/data/ibims.py ADDED
@@ -0,0 +1,81 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
+ import os
+
+ import numpy as np
+ import torch
+ from PIL import Image
+ from torch.utils.data import DataLoader, Dataset
+ from torchvision import transforms as T
+
+
+ class iBims(Dataset):
+     def __init__(self, config):
+         root_folder = config.ibims_root
+         with open(os.path.join(root_folder, "imagelist.txt"), 'r') as f:
+             imglist = f.read().split()
+
+         samples = []
+         for basename in imglist:
+             img_path = os.path.join(root_folder, 'rgb', basename + ".png")
+             depth_path = os.path.join(root_folder, 'depth', basename + ".png")
+             valid_mask_path = os.path.join(
+                 root_folder, 'mask_invalid', basename+".png")
+             transp_mask_path = os.path.join(
+                 root_folder, 'mask_transp', basename+".png")
+
+             samples.append(
+                 (img_path, depth_path, valid_mask_path, transp_mask_path))
+
+         self.samples = samples
+         # self.normalize = T.Normalize(
+         #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+         self.normalize = lambda x: x
+
+     def __getitem__(self, idx):
+         img_path, depth_path, valid_mask_path, transp_mask_path = self.samples[idx]
+
+         img = np.asarray(Image.open(img_path), dtype=np.float32) / 255.0
+         depth = np.asarray(Image.open(depth_path),
+                            dtype=np.uint16).astype('float')*50.0/65535
+
+         mask_valid = np.asarray(Image.open(valid_mask_path))
+         mask_transp = np.asarray(Image.open(transp_mask_path))
+
+         # depth = depth * mask_valid * mask_transp
+         depth = np.where(mask_valid * mask_transp, depth, -1)
+
+         img = torch.from_numpy(img).permute(2, 0, 1)
+         img = self.normalize(img)
+         depth = torch.from_numpy(depth).unsqueeze(0)
+         return dict(image=img, depth=depth, image_path=img_path, depth_path=depth_path, dataset='ibims')
+
+     def __len__(self):
+         return len(self.samples)
+
+
+ def get_ibims_loader(config, batch_size=1, **kwargs):
+     dataloader = DataLoader(iBims(config), batch_size=batch_size, **kwargs)
+     return dataloader
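iBims only reads config.ibims_root, so a lightweight namespace is enough to exercise the loader; a minimal sketch with a placeholder path (module path assumed from this diff):

    from types import SimpleNamespace
    from annotator.zoe.zoedepth.data.ibims import get_ibims_loader

    config = SimpleNamespace(ibims_root="/path/to/ibims1")  # placeholder path
    loader = get_ibims_loader(config, batch_size=1)
    sample = next(iter(loader))
    # depth is in metres; invalid and transparent pixels carry the -1 sentinel set above
    print(sample["image"].shape, sample["depth"].shape)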
annotator/zoe/zoedepth/data/preprocess.py ADDED
@@ -0,0 +1,154 @@
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
+ import numpy as np
+ from dataclasses import dataclass
+ from typing import Tuple, List
+
+ # dataclass to store the crop parameters
+ @dataclass
+ class CropParams:
+     top: int
+     bottom: int
+     left: int
+     right: int
+
+
+ def get_border_params(rgb_image, tolerance=0.1, cut_off=20, value=0, level_diff_threshold=5, channel_axis=-1, min_border=5) -> CropParams:
+     gray_image = np.mean(rgb_image, axis=channel_axis)
+     h, w = gray_image.shape
+
+     def num_value_pixels(arr):
+         return np.sum(np.abs(arr - value) < level_diff_threshold)
+
+     def is_above_tolerance(arr, total_pixels):
+         return (num_value_pixels(arr) / total_pixels) > tolerance
+
+     # Crop top border until number of value pixels become below tolerance
+     top = min_border
+     while is_above_tolerance(gray_image[top, :], w) and top < h-1:
+         top += 1
+         if top > cut_off:
+             break
+
+     # Crop bottom border until number of value pixels become below tolerance
+     bottom = h - min_border
+     while is_above_tolerance(gray_image[bottom, :], w) and bottom > 0:
+         bottom -= 1
+         if h - bottom > cut_off:
+             break
+
+     # Crop left border until number of value pixels become below tolerance
+     left = min_border
+     while is_above_tolerance(gray_image[:, left], h) and left < w-1:
+         left += 1
+         if left > cut_off:
+             break
+
+     # Crop right border until number of value pixels become below tolerance
+     right = w - min_border
+     while is_above_tolerance(gray_image[:, right], h) and right > 0:
+         right -= 1
+         if w - right > cut_off:
+             break
+
+     return CropParams(top, bottom, left, right)
+
+
+ def get_white_border(rgb_image, value=255, **kwargs) -> CropParams:
+     """Crops the white border of the RGB.
+
+     Args:
+         rgb: RGB image, shape (H, W, 3).
+     Returns:
+         Crop parameters.
+     """
+     if value == 255:
+         # assert range of values in rgb image is [0, 255]
+         assert np.max(rgb_image) <= 255 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 255]."
+         assert rgb_image.max() > 1, "RGB image values are not in range [0, 255]."
+     elif value == 1:
+         # assert range of values in rgb image is [0, 1]
+         assert np.max(rgb_image) <= 1 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 1]."
+
+     return get_border_params(rgb_image, value=value, **kwargs)
+
+
+ def get_black_border(rgb_image, **kwargs) -> CropParams:
+     """Crops the black border of the RGB.
+
+     Args:
+         rgb: RGB image, shape (H, W, 3).
+
+     Returns:
+         Crop parameters.
+     """
+
+     return get_border_params(rgb_image, value=0, **kwargs)
+
+
+ def crop_image(image: np.ndarray, crop_params: CropParams) -> np.ndarray:
+     """Crops the image according to the crop parameters.
+
+     Args:
+         image: RGB or depth image, shape (H, W, 3) or (H, W).
+         crop_params: Crop parameters.
+
+     Returns:
+         Cropped image.
+     """
+     return image[crop_params.top:crop_params.bottom, crop_params.left:crop_params.right]
+
+
+ def crop_images(*images: np.ndarray, crop_params: CropParams) -> Tuple[np.ndarray]:
+     """Crops the images according to the crop parameters.
+
+     Args:
+         images: RGB or depth images, shape (H, W, 3) or (H, W).
+         crop_params: Crop parameters.
+
+     Returns:
+         Cropped images.
+     """
+     return tuple(crop_image(image, crop_params) for image in images)
+
+
+ def crop_black_or_white_border(rgb_image, *other_images: np.ndarray, tolerance=0.1, cut_off=20, level_diff_threshold=5) -> Tuple[np.ndarray]:
+     """Crops the white and black border of the RGB and depth images.
+
+     Args:
+         rgb: RGB image, shape (H, W, 3). This image is used to determine the border.
+         other_images: The other images to crop according to the border of the RGB image.
+     Returns:
+         Cropped RGB and other images.
+     """
+     # crop black border
+     crop_params = get_black_border(rgb_image, tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
+     cropped_images = crop_images(rgb_image, *other_images, crop_params=crop_params)
+
+     # crop white border
+     crop_params = get_white_border(cropped_images[0], tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
+     cropped_images = crop_images(*cropped_images, crop_params=crop_params)
+
+     return cropped_images
+
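The border-cropping helpers above can be sanity-checked on a synthetic frame without any dataset; a minimal sketch (module path assumed from this diff, values kept in [0, 255] as the asserts expect):

    import numpy as np
    from annotator.zoe.zoedepth.data.preprocess import crop_black_or_white_border

    rgb = np.full((240, 320, 3), 128, dtype=np.float32)  # grey frame
    rgb[:12, :], rgb[-12:, :] = 0, 0                     # burn in a black band top/bottom
    rgb[:, :12], rgb[:, -12:] = 0, 0                     # and left/right
    depth = np.random.rand(240, 320).astype(np.float32)

    rgb_c, depth_c = crop_black_or_white_border(rgb, depth)
    print(rgb.shape, "->", rgb_c.shape)  # the same crop parameters are applied to both arrays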
annotator/zoe/zoedepth/data/sun_rgbd_loader.py ADDED
@@ -0,0 +1,106 @@
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
+ import os
+
+ import numpy as np
+ import torch
+ from PIL import Image
+ from torch.utils.data import DataLoader, Dataset
+ from torchvision import transforms
+
+
+ class ToTensor(object):
+     def __init__(self):
+         # self.normalize = transforms.Normalize(
+         #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
+         self.normalize = lambda x: x
+
+     def __call__(self, sample):
+         image, depth = sample['image'], sample['depth']
+         image = self.to_tensor(image)
+         image = self.normalize(image)
+         depth = self.to_tensor(depth)
+
+         return {'image': image, 'depth': depth, 'dataset': "sunrgbd"}
+
+     def to_tensor(self, pic):
+
+         if isinstance(pic, np.ndarray):
+             img = torch.from_numpy(pic.transpose((2, 0, 1)))
+             return img
+
+         # # handle PIL Image
+         if pic.mode == 'I':
+             img = torch.from_numpy(np.array(pic, np.int32, copy=False))
+         elif pic.mode == 'I;16':
+             img = torch.from_numpy(np.array(pic, np.int16, copy=False))
+         else:
+             img = torch.ByteTensor(
+                 torch.ByteStorage.from_buffer(pic.tobytes()))
+         # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
+         if pic.mode == 'YCbCr':
+             nchannel = 3
+         elif pic.mode == 'I;16':
+             nchannel = 1
+         else:
+             nchannel = len(pic.mode)
+         img = img.view(pic.size[1], pic.size[0], nchannel)
+
+         img = img.transpose(0, 1).transpose(0, 2).contiguous()
+         if isinstance(img, torch.ByteTensor):
+             return img.float()
+         else:
+             return img
+
+
+ class SunRGBD(Dataset):
+     def __init__(self, data_dir_root):
+         # test_file_dirs = loadmat(train_test_file)['alltest'].squeeze()
+         # all_test = [t[0].replace("/n/fs/sun3d/data/", "") for t in test_file_dirs]
+         # self.all_test = [os.path.join(data_dir_root, t) for t in all_test]
+         import glob
+         self.image_files = glob.glob(
+             os.path.join(data_dir_root, 'rgb', 'rgb', '*'))
+         self.depth_files = [
+             r.replace("rgb/rgb", "gt/gt").replace("jpg", "png") for r in self.image_files]
+         self.transform = ToTensor()
+
+     def __getitem__(self, idx):
+         image_path = self.image_files[idx]
+         depth_path = self.depth_files[idx]
+
+         image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
+         depth = np.asarray(Image.open(depth_path), dtype='uint16') / 1000.0
+         depth[depth > 8] = -1
+         depth = depth[..., None]
+         return self.transform(dict(image=image, depth=depth))
+
+     def __len__(self):
+         return len(self.image_files)
+
+
+ def get_sunrgbd_loader(data_dir_root, batch_size=1, **kwargs):
+     dataset = SunRGBD(data_dir_root)
+     return DataLoader(dataset, batch_size, **kwargs)
annotator/zoe/zoedepth/data/transforms.py ADDED
@@ -0,0 +1,481 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import math
26
+ import random
27
+
28
+ import cv2
29
+ import numpy as np
30
+
31
+
32
+ class RandomFliplr(object):
33
+ """Horizontal flip of the sample with given probability.
34
+ """
35
+
36
+ def __init__(self, probability=0.5):
37
+ """Init.
38
+
39
+ Args:
40
+ probability (float, optional): Flip probability. Defaults to 0.5.
41
+ """
42
+ self.__probability = probability
43
+
44
+ def __call__(self, sample):
45
+ prob = random.random()
46
+
47
+ if prob < self.__probability:
48
+ for k, v in sample.items():
49
+ if len(v.shape) >= 2:
50
+ sample[k] = np.fliplr(v).copy()
51
+
52
+ return sample
53
+
54
+
55
+ def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
56
+ """Resize the sample to ensure the given size. Keeps aspect ratio.
57
+
58
+ Args:
59
+ sample (dict): sample
60
+ size (tuple): image size
61
+
62
+ Returns:
63
+ tuple: new size
64
+ """
65
+ shape = list(sample["disparity"].shape)
66
+
67
+ if shape[0] >= size[0] and shape[1] >= size[1]:
68
+ return sample
69
+
70
+ scale = [0, 0]
71
+ scale[0] = size[0] / shape[0]
72
+ scale[1] = size[1] / shape[1]
73
+
74
+ scale = max(scale)
75
+
76
+ shape[0] = math.ceil(scale * shape[0])
77
+ shape[1] = math.ceil(scale * shape[1])
78
+
79
+ # resize
80
+ sample["image"] = cv2.resize(
81
+ sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method
82
+ )
83
+
84
+ sample["disparity"] = cv2.resize(
85
+ sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST
86
+ )
87
+ sample["mask"] = cv2.resize(
88
+ sample["mask"].astype(np.float32),
89
+ tuple(shape[::-1]),
90
+ interpolation=cv2.INTER_NEAREST,
91
+ )
92
+ sample["mask"] = sample["mask"].astype(bool)
93
+
94
+ return tuple(shape)
95
+
96
+
97
+ class RandomCrop(object):
98
+ """Get a random crop of the sample with the given size (width, height).
99
+ """
100
+
101
+ def __init__(
102
+ self,
103
+ width,
104
+ height,
105
+ resize_if_needed=False,
106
+ image_interpolation_method=cv2.INTER_AREA,
107
+ ):
108
+ """Init.
109
+
110
+ Args:
111
+ width (int): output width
112
+ height (int): output height
113
+ resize_if_needed (bool, optional): If True, sample might be upsampled to ensure
114
+ that a crop of size (width, height) is possible. Defaults to False.
115
+ """
116
+ self.__size = (height, width)
117
+ self.__resize_if_needed = resize_if_needed
118
+ self.__image_interpolation_method = image_interpolation_method
119
+
120
+ def __call__(self, sample):
121
+
122
+ shape = sample["disparity"].shape
123
+
124
+ if self.__size[0] > shape[0] or self.__size[1] > shape[1]:
125
+ if self.__resize_if_needed:
126
+ shape = apply_min_size(
127
+ sample, self.__size, self.__image_interpolation_method
128
+ )
129
+ else:
130
+ raise Exception(
131
+ "Output size {} bigger than input size {}.".format(
132
+ self.__size, shape
133
+ )
134
+ )
135
+
136
+ offset = (
137
+ np.random.randint(shape[0] - self.__size[0] + 1),
138
+ np.random.randint(shape[1] - self.__size[1] + 1),
139
+ )
140
+
141
+ for k, v in sample.items():
142
+ if k == "code" or k == "basis":
143
+ continue
144
+
145
+ if len(sample[k].shape) >= 2:
146
+ sample[k] = v[
147
+ offset[0]: offset[0] + self.__size[0],
148
+ offset[1]: offset[1] + self.__size[1],
149
+ ]
150
+
151
+ return sample
152
+
153
+
154
+ class Resize(object):
155
+ """Resize sample to given size (width, height).
156
+ """
157
+
158
+ def __init__(
159
+ self,
160
+ width,
161
+ height,
162
+ resize_target=True,
163
+ keep_aspect_ratio=False,
164
+ ensure_multiple_of=1,
165
+ resize_method="lower_bound",
166
+ image_interpolation_method=cv2.INTER_AREA,
167
+ letter_box=False,
168
+ ):
169
+ """Init.
170
+
171
+ Args:
172
+ width (int): desired output width
173
+ height (int): desired output height
174
+ resize_target (bool, optional):
175
+ True: Resize the full sample (image, mask, target).
176
+ False: Resize image only.
177
+ Defaults to True.
178
+ keep_aspect_ratio (bool, optional):
179
+ True: Keep the aspect ratio of the input sample.
180
+ Output sample might not have the given width and height, and
181
+ resize behaviour depends on the parameter 'resize_method'.
182
+ Defaults to False.
183
+ ensure_multiple_of (int, optional):
184
+ Output width and height is constrained to be multiple of this parameter.
185
+ Defaults to 1.
186
+ resize_method (str, optional):
187
+ "lower_bound": Output will be at least as large as the given size.
188
+ "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
189
+ "minimal": Scale as least as possible. (Output size might be smaller than given size.)
190
+ Defaults to "lower_bound".
191
+ """
192
+ self.__width = width
193
+ self.__height = height
194
+
195
+ self.__resize_target = resize_target
196
+ self.__keep_aspect_ratio = keep_aspect_ratio
197
+ self.__multiple_of = ensure_multiple_of
198
+ self.__resize_method = resize_method
199
+ self.__image_interpolation_method = image_interpolation_method
200
+ self.__letter_box = letter_box
201
+
202
+ def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
203
+ y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
204
+
205
+ if max_val is not None and y > max_val:
206
+ y = (np.floor(x / self.__multiple_of)
207
+ * self.__multiple_of).astype(int)
208
+
209
+ if y < min_val:
210
+ y = (np.ceil(x / self.__multiple_of)
211
+ * self.__multiple_of).astype(int)
212
+
213
+ return y
214
+
215
+ def get_size(self, width, height):
216
+ # determine new height and width
217
+ scale_height = self.__height / height
218
+ scale_width = self.__width / width
219
+
220
+ if self.__keep_aspect_ratio:
221
+ if self.__resize_method == "lower_bound":
222
+ # scale such that output size is lower bound
223
+ if scale_width > scale_height:
224
+ # fit width
225
+ scale_height = scale_width
226
+ else:
227
+ # fit height
228
+ scale_width = scale_height
229
+ elif self.__resize_method == "upper_bound":
230
+ # scale such that output size is upper bound
231
+ if scale_width < scale_height:
232
+ # fit width
233
+ scale_height = scale_width
234
+ else:
235
+ # fit height
236
+ scale_width = scale_height
237
+ elif self.__resize_method == "minimal":
238
+ # scale as little as possible
239
+ if abs(1 - scale_width) < abs(1 - scale_height):
240
+ # fit width
241
+ scale_height = scale_width
242
+ else:
243
+ # fit height
244
+ scale_width = scale_height
245
+ else:
246
+ raise ValueError(
247
+ f"resize_method {self.__resize_method} not implemented"
248
+ )
249
+
250
+ if self.__resize_method == "lower_bound":
251
+ new_height = self.constrain_to_multiple_of(
252
+ scale_height * height, min_val=self.__height
253
+ )
254
+ new_width = self.constrain_to_multiple_of(
255
+ scale_width * width, min_val=self.__width
256
+ )
257
+ elif self.__resize_method == "upper_bound":
258
+ new_height = self.constrain_to_multiple_of(
259
+ scale_height * height, max_val=self.__height
260
+ )
261
+ new_width = self.constrain_to_multiple_of(
262
+ scale_width * width, max_val=self.__width
263
+ )
264
+ elif self.__resize_method == "minimal":
265
+ new_height = self.constrain_to_multiple_of(scale_height * height)
266
+ new_width = self.constrain_to_multiple_of(scale_width * width)
267
+ else:
268
+ raise ValueError(
269
+ f"resize_method {self.__resize_method} not implemented")
270
+
271
+ return (new_width, new_height)
272
+
273
+ def make_letter_box(self, sample):
274
+ top = bottom = (self.__height - sample.shape[0]) // 2
275
+ left = right = (self.__width - sample.shape[1]) // 2
276
+ sample = cv2.copyMakeBorder(
277
+ sample, top, bottom, left, right, cv2.BORDER_CONSTANT, None, 0)
278
+ return sample
279
+
280
+ def __call__(self, sample):
281
+ width, height = self.get_size(
282
+ sample["image"].shape[1], sample["image"].shape[0]
283
+ )
284
+
285
+ # resize sample
286
+ sample["image"] = cv2.resize(
287
+ sample["image"],
288
+ (width, height),
289
+ interpolation=self.__image_interpolation_method,
290
+ )
291
+
292
+ if self.__letter_box:
293
+ sample["image"] = self.make_letter_box(sample["image"])
294
+
295
+ if self.__resize_target:
296
+ if "disparity" in sample:
297
+ sample["disparity"] = cv2.resize(
298
+ sample["disparity"],
299
+ (width, height),
300
+ interpolation=cv2.INTER_NEAREST,
301
+ )
302
+
303
+ if self.__letter_box:
304
+ sample["disparity"] = self.make_letter_box(
305
+ sample["disparity"])
306
+
307
+ if "depth" in sample:
308
+ sample["depth"] = cv2.resize(
309
+ sample["depth"], (width,
310
+ height), interpolation=cv2.INTER_NEAREST
311
+ )
312
+
313
+ if self.__letter_box:
314
+ sample["depth"] = self.make_letter_box(sample["depth"])
315
+
316
+ sample["mask"] = cv2.resize(
317
+ sample["mask"].astype(np.float32),
318
+ (width, height),
319
+ interpolation=cv2.INTER_NEAREST,
320
+ )
321
+
322
+ if self.__letter_box:
323
+ sample["mask"] = self.make_letter_box(sample["mask"])
324
+
325
+ sample["mask"] = sample["mask"].astype(bool)
326
+
327
+ return sample
328
+
329
+
330
+ class ResizeFixed(object):
331
+ def __init__(self, size):
332
+ self.__size = size
333
+
334
+ def __call__(self, sample):
335
+ sample["image"] = cv2.resize(
336
+ sample["image"], self.__size[::-1], interpolation=cv2.INTER_LINEAR
337
+ )
338
+
339
+ sample["disparity"] = cv2.resize(
340
+ sample["disparity"], self.__size[::-
341
+ 1], interpolation=cv2.INTER_NEAREST
342
+ )
343
+
344
+ sample["mask"] = cv2.resize(
345
+ sample["mask"].astype(np.float32),
346
+ self.__size[::-1],
347
+ interpolation=cv2.INTER_NEAREST,
348
+ )
349
+ sample["mask"] = sample["mask"].astype(bool)
350
+
351
+ return sample
352
+
353
+
354
+ class Rescale(object):
355
+ """Rescale target values to the interval [0, max_val].
356
+ If input is constant, values are set to max_val / 2.
357
+ """
358
+
359
+ def __init__(self, max_val=1.0, use_mask=True):
360
+ """Init.
361
+
362
+ Args:
363
+ max_val (float, optional): Max output value. Defaults to 1.0.
364
+ use_mask (bool, optional): Only operate on valid pixels (mask == True). Defaults to True.
365
+ """
366
+ self.__max_val = max_val
367
+ self.__use_mask = use_mask
368
+
369
+ def __call__(self, sample):
370
+ disp = sample["disparity"]
371
+
372
+ if self.__use_mask:
373
+ mask = sample["mask"]
374
+ else:
375
+ mask = np.ones_like(disp, dtype=bool)
376
+
377
+ if np.sum(mask) == 0:
378
+ return sample
379
+
380
+ min_val = np.min(disp[mask])
381
+ max_val = np.max(disp[mask])
382
+
383
+ if max_val > min_val:
384
+ sample["disparity"][mask] = (
385
+ (disp[mask] - min_val) / (max_val - min_val) * self.__max_val
386
+ )
387
+ else:
388
+ sample["disparity"][mask] = np.ones_like(
389
+ disp[mask]) * self.__max_val / 2.0
390
+
391
+ return sample
392
+
393
+
394
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
395
+ class NormalizeImage(object):
396
+ """Normalize image by given mean and std.
397
+ """
398
+
399
+ def __init__(self, mean, std):
400
+ self.__mean = mean
401
+ self.__std = std
402
+
403
+ def __call__(self, sample):
404
+ sample["image"] = (sample["image"] - self.__mean) / self.__std
405
+
406
+ return sample
407
+
408
+
409
+ class DepthToDisparity(object):
410
+ """Convert depth to disparity. Removes depth from sample.
411
+ """
412
+
413
+ def __init__(self, eps=1e-4):
414
+ self.__eps = eps
415
+
416
+ def __call__(self, sample):
417
+ assert "depth" in sample
418
+
419
+ sample["mask"][sample["depth"] < self.__eps] = False
420
+
421
+ sample["disparity"] = np.zeros_like(sample["depth"])
422
+ sample["disparity"][sample["depth"] >= self.__eps] = (
423
+ 1.0 / sample["depth"][sample["depth"] >= self.__eps]
424
+ )
425
+
426
+ del sample["depth"]
427
+
428
+ return sample
429
+
430
+
431
+ class DisparityToDepth(object):
432
+ """Convert disparity to depth. Removes disparity from sample.
433
+ """
434
+
435
+ def __init__(self, eps=1e-4):
436
+ self.__eps = eps
437
+
438
+ def __call__(self, sample):
439
+ assert "disparity" in sample
440
+
441
+ disp = np.abs(sample["disparity"])
442
+ sample["mask"][disp < self.__eps] = False
443
+
444
+ # print(sample["disparity"])
445
+ # print(sample["mask"].sum())
446
+ # exit()
447
+
448
+ sample["depth"] = np.zeros_like(disp)
449
+ sample["depth"][disp >= self.__eps] = (
450
+ 1.0 / disp[disp >= self.__eps]
451
+ )
452
+
453
+ del sample["disparity"]
454
+
455
+ return sample
456
+
457
+
458
+ class PrepareForNet(object):
459
+ """Prepare sample for usage as network input.
460
+ """
461
+
462
+ def __init__(self):
463
+ pass
464
+
465
+ def __call__(self, sample):
466
+ image = np.transpose(sample["image"], (2, 0, 1))
467
+ sample["image"] = np.ascontiguousarray(image).astype(np.float32)
468
+
469
+ if "mask" in sample:
470
+ sample["mask"] = sample["mask"].astype(np.float32)
471
+ sample["mask"] = np.ascontiguousarray(sample["mask"])
472
+
473
+ if "disparity" in sample:
474
+ disparity = sample["disparity"].astype(np.float32)
475
+ sample["disparity"] = np.ascontiguousarray(disparity)
476
+
477
+ if "depth" in sample:
478
+ depth = sample["depth"].astype(np.float32)
479
+ sample["depth"] = np.ascontiguousarray(depth)
480
+
481
+ return sample
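All of the transforms above operate on a plain dict sample ('image', optional 'disparity'/'depth', 'mask'), so they compose with torchvision's Compose; a minimal sketch on dummy data (module path assumed from this diff):

    import numpy as np
    from torchvision.transforms import Compose
    from annotator.zoe.zoedepth.data.transforms import Resize, PrepareForNet

    transform = Compose([
        Resize(384, 384, keep_aspect_ratio=True, ensure_multiple_of=32,
               resize_method="lower_bound"),
        PrepareForNet(),
    ])
    sample = {
        "image": np.random.rand(480, 640, 3).astype(np.float32),   # HWC in [0, 1]
        "disparity": np.random.rand(480, 640).astype(np.float32),
        "mask": np.ones((480, 640), dtype=bool),
    }
    out = transform(sample)
    print(out["image"].shape)  # CHW, here (3, 384, 512), ready to feed a network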
annotator/zoe/zoedepth/data/vkitti.py ADDED
@@ -0,0 +1,151 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import torch
26
+ from torch.utils.data import Dataset, DataLoader
27
+ from torchvision import transforms
28
+ import os
29
+
30
+ from PIL import Image
31
+ import numpy as np
32
+ import cv2
33
+
34
+
35
+ class ToTensor(object):
36
+ def __init__(self):
37
+ self.normalize = transforms.Normalize(
38
+ mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
39
+ # self.resize = transforms.Resize((375, 1242))
40
+
41
+ def __call__(self, sample):
42
+ image, depth = sample['image'], sample['depth']
43
+
44
+ image = self.to_tensor(image)
45
+ image = self.normalize(image)
46
+ depth = self.to_tensor(depth)
47
+
48
+ # image = self.resize(image)
49
+
50
+ return {'image': image, 'depth': depth, 'dataset': "vkitti"}
51
+
52
+ def to_tensor(self, pic):
53
+
54
+ if isinstance(pic, np.ndarray):
55
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
56
+ return img
57
+
58
+ # # handle PIL Image
59
+ if pic.mode == 'I':
60
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
61
+ elif pic.mode == 'I;16':
62
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
63
+ else:
64
+ img = torch.ByteTensor(
65
+ torch.ByteStorage.from_buffer(pic.tobytes()))
66
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
67
+ if pic.mode == 'YCbCr':
68
+ nchannel = 3
69
+ elif pic.mode == 'I;16':
70
+ nchannel = 1
71
+ else:
72
+ nchannel = len(pic.mode)
73
+ img = img.view(pic.size[1], pic.size[0], nchannel)
74
+
75
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
76
+ if isinstance(img, torch.ByteTensor):
77
+ return img.float()
78
+ else:
79
+ return img
80
+
81
+
82
+ class VKITTI(Dataset):
83
+ def __init__(self, data_dir_root, do_kb_crop=True):
84
+ import glob
85
+ # image paths are of the form <data_dir_root>/{HR, LR}/<scene>/{color, depth_filled}/*.png
86
+ self.image_files = glob.glob(os.path.join(
87
+ data_dir_root, "test_color", '*.png'))
88
+ self.depth_files = [r.replace("test_color", "test_depth")
89
+ for r in self.image_files]
90
+ self.do_kb_crop = True
91
+ self.transform = ToTensor()
92
+
93
+ def __getitem__(self, idx):
94
+ image_path = self.image_files[idx]
95
+ depth_path = self.depth_files[idx]
96
+
97
+ image = Image.open(image_path)
98
+ depth = Image.open(depth_path)
99
+ depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
100
+ cv2.IMREAD_ANYDEPTH)
101
+ print("depth min max", depth.min(), depth.max())
102
+
103
+ # print(np.shape(image))
104
+ # print(np.shape(depth))
105
+
106
+ # depth[depth > 8] = -1
107
+
108
+ if self.do_kb_crop and False:
109
+ height = image.height
110
+ width = image.width
111
+ top_margin = int(height - 352)
112
+ left_margin = int((width - 1216) / 2)
113
+ depth = depth.crop(
114
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
115
+ image = image.crop(
116
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
117
+ # uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]
118
+
119
+ image = np.asarray(image, dtype=np.float32) / 255.0
120
+ # depth = np.asarray(depth, dtype=np.uint16) /1.
121
+ depth = depth[..., None]
122
+ sample = dict(image=image, depth=depth)
123
+
124
+ # return sample
125
+ sample = self.transform(sample)
126
+
127
+ if idx == 0:
128
+ print(sample["image"].shape)
129
+
130
+ return sample
131
+
132
+ def __len__(self):
133
+ return len(self.image_files)
134
+
135
+
136
+ def get_vkitti_loader(data_dir_root, batch_size=1, **kwargs):
137
+ dataset = VKITTI(data_dir_root)
138
+ return DataLoader(dataset, batch_size, **kwargs)
139
+
140
+
141
+ if __name__ == "__main__":
142
+ loader = get_vkitti_loader(
143
+ data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti_test")
144
+ print("Total files", len(loader.dataset))
145
+ for i, sample in enumerate(loader):
146
+ print(sample["image"].shape)
147
+ print(sample["depth"].shape)
148
+ print(sample["dataset"])
149
+ print(sample['depth'].min(), sample['depth'].max())
150
+ if i > 5:
151
+ break
annotator/zoe/zoedepth/data/vkitti2.py ADDED
@@ -0,0 +1,187 @@
1
+ # MIT License
2
+
3
+ # Copyright (c) 2022 Intelligent Systems Lab Org
4
+
5
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ # of this software and associated documentation files (the "Software"), to deal
7
+ # in the Software without restriction, including without limitation the rights
8
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ # copies of the Software, and to permit persons to whom the Software is
10
+ # furnished to do so, subject to the following conditions:
11
+
12
+ # The above copyright notice and this permission notice shall be included in all
13
+ # copies or substantial portions of the Software.
14
+
15
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ # SOFTWARE.
22
+
23
+ # File author: Shariq Farooq Bhat
24
+
25
+ import os
26
+
27
+ import cv2
28
+ import numpy as np
29
+ import torch
30
+ from PIL import Image
31
+ from torch.utils.data import DataLoader, Dataset
32
+ from torchvision import transforms
33
+
34
+
35
+ class ToTensor(object):
36
+ def __init__(self):
37
+ # self.normalize = transforms.Normalize(
38
+ # mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
39
+ self.normalize = lambda x: x
40
+ # self.resize = transforms.Resize((375, 1242))
41
+
42
+ def __call__(self, sample):
43
+ image, depth = sample['image'], sample['depth']
44
+
45
+ image = self.to_tensor(image)
46
+ image = self.normalize(image)
47
+ depth = self.to_tensor(depth)
48
+
49
+ # image = self.resize(image)
50
+
51
+ return {'image': image, 'depth': depth, 'dataset': "vkitti"}
52
+
53
+ def to_tensor(self, pic):
54
+
55
+ if isinstance(pic, np.ndarray):
56
+ img = torch.from_numpy(pic.transpose((2, 0, 1)))
57
+ return img
58
+
59
+ # # handle PIL Image
60
+ if pic.mode == 'I':
61
+ img = torch.from_numpy(np.array(pic, np.int32, copy=False))
62
+ elif pic.mode == 'I;16':
63
+ img = torch.from_numpy(np.array(pic, np.int16, copy=False))
64
+ else:
65
+ img = torch.ByteTensor(
66
+ torch.ByteStorage.from_buffer(pic.tobytes()))
67
+ # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
68
+ if pic.mode == 'YCbCr':
69
+ nchannel = 3
70
+ elif pic.mode == 'I;16':
71
+ nchannel = 1
72
+ else:
73
+ nchannel = len(pic.mode)
74
+ img = img.view(pic.size[1], pic.size[0], nchannel)
75
+
76
+ img = img.transpose(0, 1).transpose(0, 2).contiguous()
77
+ if isinstance(img, torch.ByteTensor):
78
+ return img.float()
79
+ else:
80
+ return img
81
+
82
+
83
+ class VKITTI2(Dataset):
84
+ def __init__(self, data_dir_root, do_kb_crop=True, split="test"):
85
+ import glob
86
+
87
+ # image paths are of the form <data_dir_root>/rgb/<scene>/<variant>/frames/<rgb,depth>/Camera<0,1>/rgb_{}.jpg
88
+ self.image_files = glob.glob(os.path.join(
89
+ data_dir_root, "rgb", "**", "frames", "rgb", "Camera_0", '*.jpg'), recursive=True)
90
+ self.depth_files = [r.replace("/rgb/", "/depth/").replace(
91
+ "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
92
+ self.do_kb_crop = True
93
+ self.transform = ToTensor()
94
+
95
+ # If train test split is not created, then create one.
96
+ # Split is such that 8% of the frames from each scene are used for testing.
97
+ if not os.path.exists(os.path.join(data_dir_root, "train.txt")):
98
+ import random
99
+ scenes = set([os.path.basename(os.path.dirname(
100
+ os.path.dirname(os.path.dirname(f)))) for f in self.image_files])
101
+ train_files = []
102
+ test_files = []
103
+ for scene in scenes:
104
+ scene_files = [f for f in self.image_files if os.path.basename(
105
+ os.path.dirname(os.path.dirname(os.path.dirname(f)))) == scene]
106
+ random.shuffle(scene_files)
107
+ train_files.extend(scene_files[:int(len(scene_files) * 0.92)])
108
+ test_files.extend(scene_files[int(len(scene_files) * 0.92):])
109
+ with open(os.path.join(data_dir_root, "train.txt"), "w") as f:
110
+ f.write("\n".join(train_files))
111
+ with open(os.path.join(data_dir_root, "test.txt"), "w") as f:
112
+ f.write("\n".join(test_files))
113
+
114
+ if split == "train":
115
+ with open(os.path.join(data_dir_root, "train.txt"), "r") as f:
116
+ self.image_files = f.read().splitlines()
117
+ self.depth_files = [r.replace("/rgb/", "/depth/").replace(
118
+ "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
119
+ elif split == "test":
120
+ with open(os.path.join(data_dir_root, "test.txt"), "r") as f:
121
+ self.image_files = f.read().splitlines()
122
+ self.depth_files = [r.replace("/rgb/", "/depth/").replace(
123
+ "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
124
+
125
+ def __getitem__(self, idx):
126
+ image_path = self.image_files[idx]
127
+ depth_path = self.depth_files[idx]
128
+
129
+ image = Image.open(image_path)
130
+ # depth = Image.open(depth_path)
131
+ depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
132
+ cv2.IMREAD_ANYDEPTH) / 100.0 # cm to m
133
+ depth = Image.fromarray(depth)
134
+ # print("dpeth min max", depth.min(), depth.max())
135
+
136
+ # print(np.shape(image))
137
+ # print(np.shape(depth))
138
+
139
+ if self.do_kb_crop:
140
+ if idx == 0:
141
+ print("Using KB input crop")
142
+ height = image.height
143
+ width = image.width
144
+ top_margin = int(height - 352)
145
+ left_margin = int((width - 1216) / 2)
146
+ depth = depth.crop(
147
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
148
+ image = image.crop(
149
+ (left_margin, top_margin, left_margin + 1216, top_margin + 352))
150
+ # uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]
151
+
152
+ image = np.asarray(image, dtype=np.float32) / 255.0
153
+ # depth = np.asarray(depth, dtype=np.uint16) /1.
154
+ depth = np.asarray(depth, dtype=np.float32) / 1.
155
+ depth[depth > 80] = -1
156
+
157
+ depth = depth[..., None]
158
+ sample = dict(image=image, depth=depth)
159
+
160
+ # return sample
161
+ sample = self.transform(sample)
162
+
163
+ if idx == 0:
164
+ print(sample["image"].shape)
165
+
166
+ return sample
167
+
168
+ def __len__(self):
169
+ return len(self.image_files)
170
+
171
+
172
+ def get_vkitti2_loader(data_dir_root, batch_size=1, **kwargs):
173
+ dataset = VKITTI2(data_dir_root)
174
+ return DataLoader(dataset, batch_size, **kwargs)
175
+
176
+
177
+ if __name__ == "__main__":
178
+ loader = get_vkitti2_loader(
179
+ data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti2")
180
+ print("Total files", len(loader.dataset))
181
+ for i, sample in enumerate(loader):
182
+ print(sample["image"].shape)
183
+ print(sample["depth"].shape)
184
+ print(sample["dataset"])
185
+ print(sample['depth'].min(), sample['depth'].max())
186
+ if i > 5:
187
+ break
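Note that the VKITTI2 constructor above materialises the train/test split the first time it runs; a minimal sketch with a placeholder path (module path assumed from this diff):

    from annotator.zoe.zoedepth.data.vkitti2 import VKITTI2

    root = "/path/to/vkitti2"                 # placeholder path
    train_set = VKITTI2(root, split="train")  # writes train.txt / test.txt (~92% / 8% per scene) on first use
    test_set = VKITTI2(root, split="test")    # reuses the same files, so the split stays fixed across runs
    print(len(train_set), len(test_set))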
annotator/zoe/zoedepth/models/__init__.py ADDED
@@ -0,0 +1,24 @@
+ # MIT License
+
+ # Copyright (c) 2022 Intelligent Systems Lab Org
+
+ # Permission is hereby granted, free of charge, to any person obtaining a copy
+ # of this software and associated documentation files (the "Software"), to deal
+ # in the Software without restriction, including without limitation the rights
+ # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ # copies of the Software, and to permit persons to whom the Software is
+ # furnished to do so, subject to the following conditions:
+
+ # The above copyright notice and this permission notice shall be included in all
+ # copies or substantial portions of the Software.
+
+ # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ # AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ # LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ # OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ # SOFTWARE.
+
+ # File author: Shariq Farooq Bhat
+
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-310.pyc ADDED
Binary file (165 Bytes).
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-38.pyc ADDED
Binary file (167 Bytes).
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-39.pyc ADDED
Binary file (167 Bytes).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-310.pyc ADDED
Binary file (6.26 kB).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-38.pyc ADDED
Binary file (6.33 kB).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-39.pyc ADDED
Binary file (6.31 kB).
annotator/zoe/zoedepth/models/__pycache__/model_io.cpython-310.pyc ADDED
Binary file (2.27 kB).