Commit a043943
Parent(s): a064439
update all

This view is limited to 50 files because it contains too many changes. See raw diff.
- .gitignore +2 -1
- README.md +2 -2
- __pycache__/ptp_utils_null_text_inversion.cpython-310.pyc +0 -0
- __pycache__/ptp_utils_null_text_inversion.cpython-38.pyc +0 -0
- __pycache__/utils.cpython-310.pyc +0 -0
- __pycache__/xformers.cpython-310.pyc +0 -0
- annotator/__pycache__/util.cpython-310.pyc +0 -0
- annotator/dwpose/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc +0 -0
- annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc +0 -0
- annotator/dwpose/__pycache__/util.cpython-310.pyc +0 -0
- annotator/dwpose/__pycache__/wholebody.cpython-310.pyc +0 -0
- annotator/midas/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/midas/__pycache__/api.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/base_model.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/blocks.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/transforms.cpython-310.pyc +0 -0
- annotator/midas/midas/__pycache__/vit.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/body.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/face.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/hand.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/model.cpython-310.pyc +0 -0
- annotator/openpose/__pycache__/util.cpython-310.pyc +0 -0
- annotator/zoe/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/zoe/zoedepth/data/__init__.py +24 -0
- annotator/zoe/zoedepth/data/data_mono.py +573 -0
- annotator/zoe/zoedepth/data/ddad.py +117 -0
- annotator/zoe/zoedepth/data/diml_indoor_test.py +125 -0
- annotator/zoe/zoedepth/data/diml_outdoor_test.py +114 -0
- annotator/zoe/zoedepth/data/diode.py +125 -0
- annotator/zoe/zoedepth/data/hypersim.py +138 -0
- annotator/zoe/zoedepth/data/ibims.py +81 -0
- annotator/zoe/zoedepth/data/preprocess.py +154 -0
- annotator/zoe/zoedepth/data/sun_rgbd_loader.py +106 -0
- annotator/zoe/zoedepth/data/transforms.py +481 -0
- annotator/zoe/zoedepth/data/vkitti.py +151 -0
- annotator/zoe/zoedepth/data/vkitti2.py +187 -0
- annotator/zoe/zoedepth/models/__init__.py +24 -0
- annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-310.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-38.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-39.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-310.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-38.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-39.pyc +0 -0
- annotator/zoe/zoedepth/models/__pycache__/model_io.cpython-310.pyc +0 -0
.gitignore
CHANGED
@@ -1,3 +1,4 @@
 annotator/ckpts/**
 result/**
-trash/**
+trash/**
+data/**
README.md
CHANGED
@@ -6,9 +6,9 @@ Our method is tested using cuda12.1, fp16 of accelerator and xformers on a singl
 conda create -n st-modulator python==3.10
 conda activate st-modulator
 
-# Step 2: Install PyTorch and
+# Step 2: Install PyTorch, CUDA and Xformers
 conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
-
+pip install --pre -U xformers==0.0.27
 # Step 3: Install additional dependencies with pip
 pip install -r requirements.txt
 ```
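A quick way to confirm the updated Step 2 took effect is a minimal sanity check (a sketch, not part of the commit; it relies only on `torch.cuda.is_available()` and the packages' `__version__` attributes):

```python
# Hypothetical post-install check: verifies the CUDA-enabled PyTorch build and the
# pinned xformers wheel from the README are importable in the st-modulator env.
import torch
import xformers

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("xformers", xformers.__version__)  # the README pins 0.0.27
```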
__pycache__/ptp_utils_null_text_inversion.cpython-310.pyc
DELETED
Binary file (10 kB)
__pycache__/ptp_utils_null_text_inversion.cpython-38.pyc
DELETED
Binary file (9.33 kB)
__pycache__/utils.cpython-310.pyc
DELETED
Binary file (2.01 kB)
__pycache__/xformers.cpython-310.pyc
DELETED
Binary file (359 Bytes)
annotator/__pycache__/util.cpython-310.pyc
CHANGED
Binary files a/annotator/__pycache__/util.cpython-310.pyc and b/annotator/__pycache__/util.cpython-310.pyc differ
annotator/dwpose/__pycache__/__init__.cpython-310.pyc
CHANGED
Binary files a/annotator/dwpose/__pycache__/__init__.cpython-310.pyc and b/annotator/dwpose/__pycache__/__init__.cpython-310.pyc differ
annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc
CHANGED
Binary files a/annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc and b/annotator/dwpose/__pycache__/onnxdet.cpython-310.pyc differ
annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc
CHANGED
Binary files a/annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc and b/annotator/dwpose/__pycache__/onnxpose.cpython-310.pyc differ
annotator/dwpose/__pycache__/util.cpython-310.pyc
CHANGED
Binary files a/annotator/dwpose/__pycache__/util.cpython-310.pyc and b/annotator/dwpose/__pycache__/util.cpython-310.pyc differ
annotator/dwpose/__pycache__/wholebody.cpython-310.pyc
CHANGED
Binary files a/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc and b/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc differ
annotator/midas/__pycache__/__init__.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/__pycache__/__init__.cpython-310.pyc and b/annotator/midas/__pycache__/__init__.cpython-310.pyc differ
annotator/midas/__pycache__/api.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/__pycache__/api.cpython-310.pyc and b/annotator/midas/__pycache__/api.cpython-310.pyc differ
annotator/midas/midas/__pycache__/__init__.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/__init__.cpython-310.pyc and b/annotator/midas/midas/__pycache__/__init__.cpython-310.pyc differ
annotator/midas/midas/__pycache__/base_model.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/base_model.cpython-310.pyc and b/annotator/midas/midas/__pycache__/base_model.cpython-310.pyc differ
annotator/midas/midas/__pycache__/blocks.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/blocks.cpython-310.pyc and b/annotator/midas/midas/__pycache__/blocks.cpython-310.pyc differ
annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc and b/annotator/midas/midas/__pycache__/dpt_depth.cpython-310.pyc differ
annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc and b/annotator/midas/midas/__pycache__/midas_net.cpython-310.pyc differ
annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc and b/annotator/midas/midas/__pycache__/midas_net_custom.cpython-310.pyc differ
annotator/midas/midas/__pycache__/transforms.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/transforms.cpython-310.pyc and b/annotator/midas/midas/__pycache__/transforms.cpython-310.pyc differ
annotator/midas/midas/__pycache__/vit.cpython-310.pyc
CHANGED
Binary files a/annotator/midas/midas/__pycache__/vit.cpython-310.pyc and b/annotator/midas/midas/__pycache__/vit.cpython-310.pyc differ
annotator/openpose/__pycache__/__init__.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/__init__.cpython-310.pyc and b/annotator/openpose/__pycache__/__init__.cpython-310.pyc differ
annotator/openpose/__pycache__/body.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/body.cpython-310.pyc and b/annotator/openpose/__pycache__/body.cpython-310.pyc differ
annotator/openpose/__pycache__/face.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/face.cpython-310.pyc and b/annotator/openpose/__pycache__/face.cpython-310.pyc differ
annotator/openpose/__pycache__/hand.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/hand.cpython-310.pyc and b/annotator/openpose/__pycache__/hand.cpython-310.pyc differ
annotator/openpose/__pycache__/model.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/model.cpython-310.pyc and b/annotator/openpose/__pycache__/model.cpython-310.pyc differ
annotator/openpose/__pycache__/util.cpython-310.pyc
CHANGED
Binary files a/annotator/openpose/__pycache__/util.cpython-310.pyc and b/annotator/openpose/__pycache__/util.cpython-310.pyc differ
annotator/zoe/__pycache__/__init__.cpython-310.pyc
CHANGED
Binary files a/annotator/zoe/__pycache__/__init__.cpython-310.pyc and b/annotator/zoe/__pycache__/__init__.cpython-310.pyc differ
annotator/zoe/zoedepth/data/__init__.py
ADDED
@@ -0,0 +1,24 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

annotator/zoe/zoedepth/data/data_mono.py
ADDED
@@ -0,0 +1,573 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

# This file is partly inspired from BTS (https://github.com/cleinc/bts/blob/master/pytorch/bts_dataloader.py); author: Jin Han Lee

import itertools
import os
import random

import numpy as np
import cv2
import torch
import torch.nn as nn
import torch.utils.data.distributed
from zoedepth.utils.easydict import EasyDict as edict
from PIL import Image, ImageOps
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

from zoedepth.utils.config import change_dataset

from .ddad import get_ddad_loader
from .diml_indoor_test import get_diml_indoor_loader
from .diml_outdoor_test import get_diml_outdoor_loader
from .diode import get_diode_loader
from .hypersim import get_hypersim_loader
from .ibims import get_ibims_loader
from .sun_rgbd_loader import get_sunrgbd_loader
from .vkitti import get_vkitti_loader
from .vkitti2 import get_vkitti2_loader

from .preprocess import CropParams, get_white_border, get_black_border


def _is_pil_image(img):
    return isinstance(img, Image.Image)


def _is_numpy_image(img):
    return isinstance(img, np.ndarray) and (img.ndim in {2, 3})


def preprocessing_transforms(mode, **kwargs):
    return transforms.Compose([
        ToTensor(mode=mode, **kwargs)
    ])


class DepthDataLoader(object):
    def __init__(self, config, mode, device='cpu', transform=None, **kwargs):
        """
        Data loader for depth datasets

        Args:
            config (dict): Config dictionary. Refer to utils/config.py
            mode (str): "train" or "online_eval"
            device (str, optional): Device to load the data on. Defaults to 'cpu'.
            transform (torchvision.transforms, optional): Transform to apply to the data. Defaults to None.
        """

        self.config = config

        if config.dataset == 'ibims':
            self.data = get_ibims_loader(config, batch_size=1, num_workers=1)
            return

        if config.dataset == 'sunrgbd':
            self.data = get_sunrgbd_loader(
                data_dir_root=config.sunrgbd_root, batch_size=1, num_workers=1)
            return

        if config.dataset == 'diml_indoor':
            self.data = get_diml_indoor_loader(
                data_dir_root=config.diml_indoor_root, batch_size=1, num_workers=1)
            return

        if config.dataset == 'diml_outdoor':
            self.data = get_diml_outdoor_loader(
                data_dir_root=config.diml_outdoor_root, batch_size=1, num_workers=1)
            return

        if "diode" in config.dataset:
            self.data = get_diode_loader(
                config[config.dataset+"_root"], batch_size=1, num_workers=1)
            return

        if config.dataset == 'hypersim_test':
            self.data = get_hypersim_loader(
                config.hypersim_test_root, batch_size=1, num_workers=1)
            return

        if config.dataset == 'vkitti':
            self.data = get_vkitti_loader(
                config.vkitti_root, batch_size=1, num_workers=1)
            return

        if config.dataset == 'vkitti2':
            self.data = get_vkitti2_loader(
                config.vkitti2_root, batch_size=1, num_workers=1)
            return

        if config.dataset == 'ddad':
            self.data = get_ddad_loader(config.ddad_root, resize_shape=(
                352, 1216), batch_size=1, num_workers=1)
            return

        img_size = self.config.get("img_size", None)
        img_size = img_size if self.config.get(
            "do_input_resize", False) else None

        if transform is None:
            transform = preprocessing_transforms(mode, size=img_size)

        if mode == 'train':

            Dataset = DataLoadPreprocess
            self.training_samples = Dataset(
                config, mode, transform=transform, device=device)

            if config.distributed:
                self.train_sampler = torch.utils.data.distributed.DistributedSampler(
                    self.training_samples)
            else:
                self.train_sampler = None

            self.data = DataLoader(self.training_samples,
                                   batch_size=config.batch_size,
                                   shuffle=(self.train_sampler is None),
                                   num_workers=config.workers,
                                   pin_memory=True,
                                   persistent_workers=True,
                                   # prefetch_factor=2,
                                   sampler=self.train_sampler)

        elif mode == 'online_eval':
            self.testing_samples = DataLoadPreprocess(
                config, mode, transform=transform)
            if config.distributed:  # redundant. here only for readability and to be more explicit
                # Give whole test set to all processes (and report evaluation only on one) regardless
                self.eval_sampler = None
            else:
                self.eval_sampler = None
            self.data = DataLoader(self.testing_samples, 1,
                                   shuffle=kwargs.get("shuffle_test", False),
                                   num_workers=1,
                                   pin_memory=False,
                                   sampler=self.eval_sampler)

        elif mode == 'test':
            self.testing_samples = DataLoadPreprocess(
                config, mode, transform=transform)
            self.data = DataLoader(self.testing_samples,
                                   1, shuffle=False, num_workers=1)

        else:
            print(
                'mode should be one of \'train, test, online_eval\'. Got {}'.format(mode))


def repetitive_roundrobin(*iterables):
    """
    cycles through iterables but sample wise
    first yield first sample from first iterable then first sample from second iterable and so on
    then second sample from first iterable then second sample from second iterable and so on

    If one iterable is shorter than the others, it is repeated until all iterables are exhausted
    repetitive_roundrobin('ABC', 'D', 'EF') --> A D E B D F C D E
    """
    # Repetitive roundrobin
    iterables_ = [iter(it) for it in iterables]
    exhausted = [False] * len(iterables)
    while not all(exhausted):
        for i, it in enumerate(iterables_):
            try:
                yield next(it)
            except StopIteration:
                exhausted[i] = True
                iterables_[i] = itertools.cycle(iterables[i])
                # First elements may get repeated if one iterable is shorter than the others
                yield next(iterables_[i])


class RepetitiveRoundRobinDataLoader(object):
    def __init__(self, *dataloaders):
        self.dataloaders = dataloaders

    def __iter__(self):
        return repetitive_roundrobin(*self.dataloaders)

    def __len__(self):
        # First samples get repeated, thats why the plus one
        return len(self.dataloaders) * (max(len(dl) for dl in self.dataloaders) + 1)


class MixedNYUKITTI(object):
    def __init__(self, config, mode, device='cpu', **kwargs):
        config = edict(config)
        config.workers = config.workers // 2
        self.config = config
        nyu_conf = change_dataset(edict(config), 'nyu')
        kitti_conf = change_dataset(edict(config), 'kitti')

        # make nyu default for testing
        self.config = config = nyu_conf
        img_size = self.config.get("img_size", None)
        img_size = img_size if self.config.get(
            "do_input_resize", False) else None
        if mode == 'train':
            nyu_loader = DepthDataLoader(
                nyu_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
            kitti_loader = DepthDataLoader(
                kitti_conf, mode, device=device, transform=preprocessing_transforms(mode, size=img_size)).data
            # It has been changed to repetitive roundrobin
            self.data = RepetitiveRoundRobinDataLoader(
                nyu_loader, kitti_loader)
        else:
            self.data = DepthDataLoader(nyu_conf, mode, device=device).data


def remove_leading_slash(s):
    if s[0] == '/' or s[0] == '\\':
        return s[1:]
    return s


class CachedReader:
    def __init__(self, shared_dict=None):
        if shared_dict:
            self._cache = shared_dict
        else:
            self._cache = {}

    def open(self, fpath):
        im = self._cache.get(fpath, None)
        if im is None:
            im = self._cache[fpath] = Image.open(fpath)
        return im


class ImReader:
    def __init__(self):
        pass

    # @cache
    def open(self, fpath):
        return Image.open(fpath)


class DataLoadPreprocess(Dataset):
    def __init__(self, config, mode, transform=None, is_for_online_eval=False, **kwargs):
        self.config = config
        if mode == 'online_eval':
            with open(config.filenames_file_eval, 'r') as f:
                self.filenames = f.readlines()
        else:
            with open(config.filenames_file, 'r') as f:
                self.filenames = f.readlines()

        self.mode = mode
        self.transform = transform
        self.to_tensor = ToTensor(mode)
        self.is_for_online_eval = is_for_online_eval
        if config.use_shared_dict:
            self.reader = CachedReader(config.shared_dict)
        else:
            self.reader = ImReader()

    def postprocess(self, sample):
        return sample

    def __getitem__(self, idx):
        sample_path = self.filenames[idx]
        focal = float(sample_path.split()[2])
        sample = {}

        if self.mode == 'train':
            if self.config.dataset == 'kitti' and self.config.use_right and random.random() > 0.5:
                image_path = os.path.join(
                    self.config.data_path, remove_leading_slash(sample_path.split()[3]))
                depth_path = os.path.join(
                    self.config.gt_path, remove_leading_slash(sample_path.split()[4]))
            else:
                image_path = os.path.join(
                    self.config.data_path, remove_leading_slash(sample_path.split()[0]))
                depth_path = os.path.join(
                    self.config.gt_path, remove_leading_slash(sample_path.split()[1]))

            image = self.reader.open(image_path)
            depth_gt = self.reader.open(depth_path)
            w, h = image.size

            if self.config.do_kb_crop:
                height = image.height
                width = image.width
                top_margin = int(height - 352)
                left_margin = int((width - 1216) / 2)
                depth_gt = depth_gt.crop(
                    (left_margin, top_margin, left_margin + 1216, top_margin + 352))
                image = image.crop(
                    (left_margin, top_margin, left_margin + 1216, top_margin + 352))

            # Avoid blank boundaries due to pixel registration?
            # Train images have white border. Test images have black border.
            if self.config.dataset == 'nyu' and self.config.avoid_boundary:
                # print("Avoiding Blank Boundaries!")
                # We just crop and pad again with reflect padding to original size
                # original_size = image.size
                crop_params = get_white_border(np.array(image, dtype=np.uint8))
                image = image.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))
                depth_gt = depth_gt.crop((crop_params.left, crop_params.top, crop_params.right, crop_params.bottom))

                # Use reflect padding to fill the blank
                image = np.array(image)
                image = np.pad(image, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right), (0, 0)), mode='reflect')
                image = Image.fromarray(image)

                depth_gt = np.array(depth_gt)
                depth_gt = np.pad(depth_gt, ((crop_params.top, h - crop_params.bottom), (crop_params.left, w - crop_params.right)), 'constant', constant_values=0)
                depth_gt = Image.fromarray(depth_gt)

            if self.config.do_random_rotate and (self.config.aug):
                random_angle = (random.random() - 0.5) * 2 * self.config.degree
                image = self.rotate_image(image, random_angle)
                depth_gt = self.rotate_image(
                    depth_gt, random_angle, flag=Image.NEAREST)

            image = np.asarray(image, dtype=np.float32) / 255.0
            depth_gt = np.asarray(depth_gt, dtype=np.float32)
            depth_gt = np.expand_dims(depth_gt, axis=2)

            if self.config.dataset == 'nyu':
                depth_gt = depth_gt / 1000.0
            else:
                depth_gt = depth_gt / 256.0

            if self.config.aug and (self.config.random_crop):
                image, depth_gt = self.random_crop(
                    image, depth_gt, self.config.input_height, self.config.input_width)

            if self.config.aug and self.config.random_translate:
                # print("Random Translation!")
                image, depth_gt = self.random_translate(image, depth_gt, self.config.max_translation)

            image, depth_gt = self.train_preprocess(image, depth_gt)
            mask = np.logical_and(depth_gt > self.config.min_depth,
                                  depth_gt < self.config.max_depth).squeeze()[None, ...]
            sample = {'image': image, 'depth': depth_gt, 'focal': focal,
                      'mask': mask, **sample}

        else:
            if self.mode == 'online_eval':
                data_path = self.config.data_path_eval
            else:
                data_path = self.config.data_path

            image_path = os.path.join(
                data_path, remove_leading_slash(sample_path.split()[0]))
            image = np.asarray(self.reader.open(image_path),
                               dtype=np.float32) / 255.0

            if self.mode == 'online_eval':
                gt_path = self.config.gt_path_eval
                depth_path = os.path.join(
                    gt_path, remove_leading_slash(sample_path.split()[1]))
                has_valid_depth = False
                try:
                    depth_gt = self.reader.open(depth_path)
                    has_valid_depth = True
                except IOError:
                    depth_gt = False
                    # print('Missing gt for {}'.format(image_path))

                if has_valid_depth:
                    depth_gt = np.asarray(depth_gt, dtype=np.float32)
                    depth_gt = np.expand_dims(depth_gt, axis=2)
                    if self.config.dataset == 'nyu':
                        depth_gt = depth_gt / 1000.0
                    else:
                        depth_gt = depth_gt / 256.0

                    mask = np.logical_and(
                        depth_gt >= self.config.min_depth, depth_gt <= self.config.max_depth).squeeze()[None, ...]
                else:
                    mask = False

            if self.config.do_kb_crop:
                height = image.shape[0]
                width = image.shape[1]
                top_margin = int(height - 352)
                left_margin = int((width - 1216) / 2)
                image = image[top_margin:top_margin + 352,
                              left_margin:left_margin + 1216, :]
                if self.mode == 'online_eval' and has_valid_depth:
                    depth_gt = depth_gt[top_margin:top_margin +
                                        352, left_margin:left_margin + 1216, :]

            if self.mode == 'online_eval':
                sample = {'image': image, 'depth': depth_gt, 'focal': focal, 'has_valid_depth': has_valid_depth,
                          'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1],
                          'mask': mask}
            else:
                sample = {'image': image, 'focal': focal}

        if (self.mode == 'train') or ('has_valid_depth' in sample and sample['has_valid_depth']):
            mask = np.logical_and(depth_gt > self.config.min_depth,
                                  depth_gt < self.config.max_depth).squeeze()[None, ...]
            sample['mask'] = mask

        if self.transform:
            sample = self.transform(sample)

        sample = self.postprocess(sample)
        sample['dataset'] = self.config.dataset
        sample = {**sample, 'image_path': sample_path.split()[0], 'depth_path': sample_path.split()[1]}

        return sample

    def rotate_image(self, image, angle, flag=Image.BILINEAR):
        result = image.rotate(angle, resample=flag)
        return result

    def random_crop(self, img, depth, height, width):
        assert img.shape[0] >= height
        assert img.shape[1] >= width
        assert img.shape[0] == depth.shape[0]
        assert img.shape[1] == depth.shape[1]
        x = random.randint(0, img.shape[1] - width)
        y = random.randint(0, img.shape[0] - height)
        img = img[y:y + height, x:x + width, :]
        depth = depth[y:y + height, x:x + width, :]

        return img, depth

    def random_translate(self, img, depth, max_t=20):
        assert img.shape[0] == depth.shape[0]
        assert img.shape[1] == depth.shape[1]
        p = self.config.translate_prob
        do_translate = random.random()
        if do_translate > p:
            return img, depth
        x = random.randint(-max_t, max_t)
        y = random.randint(-max_t, max_t)
        M = np.float32([[1, 0, x], [0, 1, y]])
        # print(img.shape, depth.shape)
        img = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
        depth = cv2.warpAffine(depth, M, (depth.shape[1], depth.shape[0]))
        depth = depth.squeeze()[..., None]  # add channel dim back. Affine warp removes it
        # print("after", img.shape, depth.shape)
        return img, depth

    def train_preprocess(self, image, depth_gt):
        if self.config.aug:
            # Random flipping
            do_flip = random.random()
            if do_flip > 0.5:
                image = (image[:, ::-1, :]).copy()
                depth_gt = (depth_gt[:, ::-1, :]).copy()

            # Random gamma, brightness, color augmentation
            do_augment = random.random()
            if do_augment > 0.5:
                image = self.augment_image(image)

        return image, depth_gt

    def augment_image(self, image):
        # gamma augmentation
        gamma = random.uniform(0.9, 1.1)
        image_aug = image ** gamma

        # brightness augmentation
        if self.config.dataset == 'nyu':
            brightness = random.uniform(0.75, 1.25)
        else:
            brightness = random.uniform(0.9, 1.1)
        image_aug = image_aug * brightness

        # color augmentation
        colors = np.random.uniform(0.9, 1.1, size=3)
        white = np.ones((image.shape[0], image.shape[1]))
        color_image = np.stack([white * colors[i] for i in range(3)], axis=2)
        image_aug *= color_image
        image_aug = np.clip(image_aug, 0, 1)

        return image_aug

    def __len__(self):
        return len(self.filenames)


class ToTensor(object):
    def __init__(self, mode, do_normalize=False, size=None):
        self.mode = mode
        self.normalize = transforms.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if do_normalize else nn.Identity()
        self.size = size
        if size is not None:
            self.resize = transforms.Resize(size=size)
        else:
            self.resize = nn.Identity()

    def __call__(self, sample):
        image, focal = sample['image'], sample['focal']
        image = self.to_tensor(image)
        image = self.normalize(image)
        image = self.resize(image)

        if self.mode == 'test':
            return {'image': image, 'focal': focal}

        depth = sample['depth']
        if self.mode == 'train':
            depth = self.to_tensor(depth)
            return {**sample, 'image': image, 'depth': depth, 'focal': focal}
        else:
            has_valid_depth = sample['has_valid_depth']
            image = self.resize(image)
            return {**sample, 'image': image, 'depth': depth, 'focal': focal, 'has_valid_depth': has_valid_depth,
                    'image_path': sample['image_path'], 'depth_path': sample['depth_path']}

    def to_tensor(self, pic):
        if not (_is_pil_image(pic) or _is_numpy_image(pic)):
            raise TypeError(
                'pic should be PIL Image or ndarray. Got {}'.format(type(pic)))

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img
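For reference, the sample-wise mixing used by `RepetitiveRoundRobinDataLoader` above can be exercised on plain Python iterables (a minimal sketch; the import path is an assumption and is not stated in this commit):

```python
# Hypothetical standalone check of the round-robin generator defined above.
# Shorter iterables are repeated via itertools.cycle until every iterable has
# been exhausted once; the docstring's example order is A D E B D F C D E,
# with a few extra repeated samples possible at the tail because the inner
# for-loop always finishes its current round before re-checking exhaustion.
from zoedepth.data.data_mono import repetitive_roundrobin  # assumed import path

print(list(repetitive_roundrobin('ABC', 'D', 'EF')))
```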
annotator/zoe/zoedepth/data/ddad.py
ADDED
@@ -0,0 +1,117 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self, resize_shape):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x : x
        self.resize = transforms.Resize(resize_shape)

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "ddad"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()

        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class DDAD(Dataset):
    def __init__(self, data_dir_root, resize_shape):
        import glob

        # image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
        self.image_files = glob.glob(os.path.join(data_dir_root, '*.png'))
        self.depth_files = [r.replace("_rgb.png", "_depth.npy")
                            for r in self.image_files]
        self.transform = ToTensor(resize_shape)

    def __getitem__(self, idx):

        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
        depth = np.load(depth_path)  # meters

        # depth[depth > 8] = -1
        depth = depth[..., None]

        sample = dict(image=image, depth=depth)
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_ddad_loader(data_dir_root, resize_shape, batch_size=1, **kwargs):
    dataset = DDAD(data_dir_root, resize_shape)
    return DataLoader(dataset, batch_size, **kwargs)
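Typical use of the loader above is a single call (a minimal sketch; the import path and `DDAD_ROOT` are placeholders, and the `(352, 1216)` resize is the shape `data_mono.py` passes for the `ddad` dataset):

```python
# Hypothetical usage of get_ddad_loader. Images are read as float RGB in [0, 1]
# and depths are loaded from the matching *_depth.npy files in meters, as
# implemented in DDAD.__getitem__ above.
from annotator.zoe.zoedepth.data.ddad import get_ddad_loader  # assumed import path

DDAD_ROOT = "/path/to/ddad/pngs"  # placeholder, not a path from the commit
loader = get_ddad_loader(DDAD_ROOT, resize_shape=(352, 1216), batch_size=1)
for batch in loader:
    print(batch["image"].shape, batch["depth"].shape)
    break
```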
annotator/zoe/zoedepth/data/diml_indoor_test.py
ADDED
@@ -0,0 +1,125 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x : x
        self.resize = transforms.Resize((480, 640))

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "diml_indoor"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class DIML_Indoor(Dataset):
    def __init__(self, data_dir_root):
        import glob

        # image paths are of the form <data_dir_root>/{HR, LR}/<scene>/{color, depth_filled}/*.png
        self.image_files = glob.glob(os.path.join(
            data_dir_root, "LR", '*', 'color', '*.png'))
        self.depth_files = [r.replace("color", "depth_filled").replace(
            "_c.png", "_depth_filled.png") for r in self.image_files]
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
        depth = np.asarray(Image.open(depth_path),
                           dtype='uint16') / 1000.0  # mm to meters

        # print(np.shape(image))
        # print(np.shape(depth))

        # depth[depth > 8] = -1
        depth = depth[..., None]

        sample = dict(image=image, depth=depth)

        # return sample
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_diml_indoor_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = DIML_Indoor(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)

# get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/HR")
# get_diml_indoor_loader(data_dir_root="datasets/diml/indoor/test/LR")
annotator/zoe/zoedepth/data/diml_outdoor_test.py
ADDED
@@ -0,0 +1,114 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x : x

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        return {'image': image, 'depth': depth, 'dataset': "diml_outdoor"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class DIML_Outdoor(Dataset):
    def __init__(self, data_dir_root):
        import glob

        # image paths are of the form <data_dir_root>/{outleft, depthmap}/*.png
        self.image_files = glob.glob(os.path.join(
            data_dir_root, "*", 'outleft', '*.png'))
        self.depth_files = [r.replace("outleft", "depthmap")
                            for r in self.image_files]
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
        depth = np.asarray(Image.open(depth_path),
                           dtype='uint16') / 1000.0  # mm to meters

        # depth[depth > 8] = -1
        depth = depth[..., None]

        sample = dict(image=image, depth=depth, dataset="diml_outdoor")

        # return sample
        return self.transform(sample)

    def __len__(self):
        return len(self.image_files)


def get_diml_outdoor_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = DIML_Outdoor(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)

# get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/HR")
# get_diml_outdoor_loader(data_dir_root="datasets/diml/outdoor/test/LR")
annotator/zoe/zoedepth/data/diode.py
ADDED
@@ -0,0 +1,125 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# MIT License
|
2 |
+
|
3 |
+
# Copyright (c) 2022 Intelligent Systems Lab Org
|
4 |
+
|
5 |
+
# Permission is hereby granted, free of charge, to any person obtaining a copy
|
6 |
+
# of this software and associated documentation files (the "Software"), to deal
|
7 |
+
# in the Software without restriction, including without limitation the rights
|
8 |
+
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9 |
+
# copies of the Software, and to permit persons to whom the Software is
|
10 |
+
# furnished to do so, subject to the following conditions:
|
11 |
+
|
12 |
+
# The above copyright notice and this permission notice shall be included in all
|
13 |
+
# copies or substantial portions of the Software.
|
14 |
+
|
15 |
+
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16 |
+
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17 |
+
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x: x
        self.resize = transforms.Resize(480)

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "diode"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()

        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class DIODE(Dataset):
    def __init__(self, data_dir_root):
        import glob

        # image paths are of the form <data_dir_root>/scene_#/scan_#/*.png
        self.image_files = glob.glob(
            os.path.join(data_dir_root, '*', '*', '*.png'))
        self.depth_files = [r.replace(".png", "_depth.npy")
                            for r in self.image_files]
        self.depth_mask_files = [
            r.replace(".png", "_depth_mask.npy") for r in self.image_files]
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]
        depth_mask_path = self.depth_mask_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
        depth = np.load(depth_path)  # in meters
        valid = np.load(depth_mask_path)  # binary

        # depth[depth > 8] = -1
        # depth = depth[..., None]

        sample = dict(image=image, depth=depth, valid=valid)

        # return sample
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_diode_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = DIODE(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)

# get_diode_loader(data_dir_root="datasets/diode/val/outdoor")
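For quick reference, a minimal sketch of how the DIODE loader above might be driven; the import path and the dataset location below are assumptions, not values fixed by this commit:

# Hypothetical smoke test for get_diode_loader (module path and data path are assumed).
from annotator.zoe.zoedepth.data.diode import get_diode_loader

loader = get_diode_loader("data/diode/val/outdoor", batch_size=1, num_workers=0)
sample = next(iter(loader))
print(sample["image"].shape, sample["dataset"])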
annotator/zoe/zoedepth/data/hypersim.py
ADDED
@@ -0,0 +1,138 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import glob
import os

import h5py
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


def hypersim_distance_to_depth(npyDistance):
    intWidth, intHeight, fltFocal = 1024, 768, 886.81

    npyImageplaneX = np.linspace((-0.5 * intWidth) + 0.5, (0.5 * intWidth) - 0.5, intWidth).reshape(
        1, intWidth).repeat(intHeight, 0).astype(np.float32)[:, :, None]
    npyImageplaneY = np.linspace((-0.5 * intHeight) + 0.5, (0.5 * intHeight) - 0.5,
                                 intHeight).reshape(intHeight, 1).repeat(intWidth, 1).astype(np.float32)[:, :, None]
    npyImageplaneZ = np.full([intHeight, intWidth, 1], fltFocal, np.float32)
    npyImageplane = np.concatenate(
        [npyImageplaneX, npyImageplaneY, npyImageplaneZ], 2)

    npyDepth = npyDistance / np.linalg.norm(npyImageplane, 2, 2) * fltFocal
    return npyDepth


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x: x
        self.resize = transforms.Resize((480, 640))

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "hypersim"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class HyperSim(Dataset):
    def __init__(self, data_dir_root):
        # image paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.tonemap.jpg
        # depth paths are of the form <data_dir_root>/<scene>/images/scene_cam_#_final_preview/*.depth_meters.hdf5
        self.image_files = glob.glob(os.path.join(
            data_dir_root, '*', 'images', 'scene_cam_*_final_preview', '*.tonemap.jpg'))
        self.depth_files = [r.replace("_final_preview", "_geometry_hdf5").replace(
            ".tonemap.jpg", ".depth_meters.hdf5") for r in self.image_files]
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0

        # depth from hdf5
        depth_fd = h5py.File(depth_path, "r")
        # in meters (Euclidean distance)
        distance_meters = np.array(depth_fd['dataset'])
        depth = hypersim_distance_to_depth(
            distance_meters)  # in meters (planar depth)

        # depth[depth > 8] = -1
        depth = depth[..., None]

        sample = dict(image=image, depth=depth)
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_hypersim_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = HyperSim(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)
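hypersim_distance_to_depth above converts Hypersim's Euclidean ray distance into planar depth by dividing by the per-pixel ray norm and rescaling by the focal length, i.e. depth = distance * cos(viewing angle). A hedged sanity check on a synthetic, constant-distance map (the function above and numpy are assumed in scope):

import numpy as np
# Constant 5 m Euclidean distance over the hard-coded 1024x768 Hypersim resolution.
dist = np.full((768, 1024), 5.0, dtype=np.float32)
depth = hypersim_distance_to_depth(dist)
print(depth.shape)      # (768, 1024)
print(depth[384, 512])  # ~5.0 near the principal point (ray almost parallel to the optical axis)
print(depth[0, 0])      # ~4.06 in the corner, where the oblique ray shortens the planar depth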
annotator/zoe/zoedepth/data/ibims.py
ADDED
@@ -0,0 +1,81 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms as T


class iBims(Dataset):
    def __init__(self, config):
        root_folder = config.ibims_root
        with open(os.path.join(root_folder, "imagelist.txt"), 'r') as f:
            imglist = f.read().split()

        samples = []
        for basename in imglist:
            img_path = os.path.join(root_folder, 'rgb', basename + ".png")
            depth_path = os.path.join(root_folder, 'depth', basename + ".png")
            valid_mask_path = os.path.join(
                root_folder, 'mask_invalid', basename + ".png")
            transp_mask_path = os.path.join(
                root_folder, 'mask_transp', basename + ".png")

            samples.append(
                (img_path, depth_path, valid_mask_path, transp_mask_path))

        self.samples = samples
        # self.normalize = T.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x: x

    def __getitem__(self, idx):
        img_path, depth_path, valid_mask_path, transp_mask_path = self.samples[idx]

        img = np.asarray(Image.open(img_path), dtype=np.float32) / 255.0
        depth = np.asarray(Image.open(depth_path),
                           dtype=np.uint16).astype('float') * 50.0 / 65535

        mask_valid = np.asarray(Image.open(valid_mask_path))
        mask_transp = np.asarray(Image.open(transp_mask_path))

        # depth = depth * mask_valid * mask_transp
        depth = np.where(mask_valid * mask_transp, depth, -1)

        img = torch.from_numpy(img).permute(2, 0, 1)
        img = self.normalize(img)
        depth = torch.from_numpy(depth).unsqueeze(0)
        return dict(image=img, depth=depth, image_path=img_path, depth_path=depth_path, dataset='ibims')

    def __len__(self):
        return len(self.samples)


def get_ibims_loader(config, batch_size=1, **kwargs):
    dataloader = DataLoader(iBims(config), batch_size=batch_size, **kwargs)
    return dataloader
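get_ibims_loader expects a config object that exposes ibims_root; a minimal, hedged sketch (the SimpleNamespace config and the local path are assumptions):

from types import SimpleNamespace
# Hypothetical driver for the iBims-1 loader above; ibims_root is an assumed local path.
config = SimpleNamespace(ibims_root="data/ibims1_core_raw")
loader = get_ibims_loader(config, batch_size=1)
batch = next(iter(loader))
print(batch["image"].shape, batch["depth"].shape, batch["dataset"])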
annotator/zoe/zoedepth/data/preprocess.py
ADDED
@@ -0,0 +1,154 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import numpy as np
from dataclasses import dataclass
from typing import Tuple, List


# dataclass to store the crop parameters
@dataclass
class CropParams:
    top: int
    bottom: int
    left: int
    right: int


def get_border_params(rgb_image, tolerance=0.1, cut_off=20, value=0, level_diff_threshold=5, channel_axis=-1, min_border=5) -> CropParams:
    gray_image = np.mean(rgb_image, axis=channel_axis)
    h, w = gray_image.shape

    def num_value_pixels(arr):
        return np.sum(np.abs(arr - value) < level_diff_threshold)

    def is_above_tolerance(arr, total_pixels):
        return (num_value_pixels(arr) / total_pixels) > tolerance

    # Crop top border until number of value pixels become below tolerance
    top = min_border
    while is_above_tolerance(gray_image[top, :], w) and top < h - 1:
        top += 1
        if top > cut_off:
            break

    # Crop bottom border until number of value pixels become below tolerance
    bottom = h - min_border
    while is_above_tolerance(gray_image[bottom, :], w) and bottom > 0:
        bottom -= 1
        if h - bottom > cut_off:
            break

    # Crop left border until number of value pixels become below tolerance
    left = min_border
    while is_above_tolerance(gray_image[:, left], h) and left < w - 1:
        left += 1
        if left > cut_off:
            break

    # Crop right border until number of value pixels become below tolerance
    right = w - min_border
    while is_above_tolerance(gray_image[:, right], h) and right > 0:
        right -= 1
        if w - right > cut_off:
            break

    return CropParams(top, bottom, left, right)


def get_white_border(rgb_image, value=255, **kwargs) -> CropParams:
    """Crops the white border of the RGB.

    Args:
        rgb: RGB image, shape (H, W, 3).
    Returns:
        Crop parameters.
    """
    if value == 255:
        # assert range of values in rgb image is [0, 255]
        assert np.max(rgb_image) <= 255 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 255]."
        assert rgb_image.max() > 1, "RGB image values are not in range [0, 255]."
    elif value == 1:
        # assert range of values in rgb image is [0, 1]
        assert np.max(rgb_image) <= 1 and np.min(rgb_image) >= 0, "RGB image values are not in range [0, 1]."

    return get_border_params(rgb_image, value=value, **kwargs)


def get_black_border(rgb_image, **kwargs) -> CropParams:
    """Crops the black border of the RGB.

    Args:
        rgb: RGB image, shape (H, W, 3).

    Returns:
        Crop parameters.
    """

    return get_border_params(rgb_image, value=0, **kwargs)


def crop_image(image: np.ndarray, crop_params: CropParams) -> np.ndarray:
    """Crops the image according to the crop parameters.

    Args:
        image: RGB or depth image, shape (H, W, 3) or (H, W).
        crop_params: Crop parameters.

    Returns:
        Cropped image.
    """
    return image[crop_params.top:crop_params.bottom, crop_params.left:crop_params.right]


def crop_images(*images: np.ndarray, crop_params: CropParams) -> Tuple[np.ndarray]:
    """Crops the images according to the crop parameters.

    Args:
        images: RGB or depth images, shape (H, W, 3) or (H, W).
        crop_params: Crop parameters.

    Returns:
        Cropped images.
    """
    return tuple(crop_image(image, crop_params) for image in images)


def crop_black_or_white_border(rgb_image, *other_images: np.ndarray, tolerance=0.1, cut_off=20, level_diff_threshold=5) -> Tuple[np.ndarray]:
    """Crops the white and black border of the RGB and depth images.

    Args:
        rgb: RGB image, shape (H, W, 3). This image is used to determine the border.
        other_images: The other images to crop according to the border of the RGB image.
    Returns:
        Cropped RGB and other images.
    """
    # crop black border
    crop_params = get_black_border(rgb_image, tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
    cropped_images = crop_images(rgb_image, *other_images, crop_params=crop_params)

    # crop white border
    crop_params = get_white_border(cropped_images[0], tolerance=tolerance, cut_off=cut_off, level_diff_threshold=level_diff_threshold)
    cropped_images = crop_images(*cropped_images, crop_params=crop_params)

    return cropped_images
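A hedged, self-contained example of the border-cropping helpers above on synthetic data (sizes and values are illustrative; note that get_border_params stops widening a border once it exceeds cut_off pixels per side):

import numpy as np
# Synthetic RGB frame with a black border; the depth map is cropped with the same parameters.
rgb = np.zeros((480, 640, 3), dtype=np.float32)
rgb[40:440, 60:580] = 255.0
depth = np.random.rand(480, 640).astype(np.float32)

rgb_cropped, depth_cropped = crop_black_or_white_border(rgb, depth)
print(rgb.shape, "->", rgb_cropped.shape)  # border rows/columns removed (bounded by cut_off)
print(depth_cropped.shape)                 # identical crop applied to the depth map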
annotator/zoe/zoedepth/data/sun_rgbd_loader.py
ADDED
@@ -0,0 +1,106 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x: x

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']
        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        return {'image': image, 'depth': depth, 'dataset': "sunrgbd"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class SunRGBD(Dataset):
    def __init__(self, data_dir_root):
        # test_file_dirs = loadmat(train_test_file)['alltest'].squeeze()
        # all_test = [t[0].replace("/n/fs/sun3d/data/", "") for t in test_file_dirs]
        # self.all_test = [os.path.join(data_dir_root, t) for t in all_test]
        import glob
        self.image_files = glob.glob(
            os.path.join(data_dir_root, 'rgb', 'rgb', '*'))
        self.depth_files = [
            r.replace("rgb/rgb", "gt/gt").replace("jpg", "png") for r in self.image_files]
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = np.asarray(Image.open(image_path), dtype=np.float32) / 255.0
        depth = np.asarray(Image.open(depth_path), dtype='uint16') / 1000.0
        depth[depth > 8] = -1
        depth = depth[..., None]
        return self.transform(dict(image=image, depth=depth))

    def __len__(self):
        return len(self.image_files)


def get_sunrgbd_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = SunRGBD(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)
annotator/zoe/zoedepth/data/transforms.py
ADDED
@@ -0,0 +1,481 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import math
import random

import cv2
import numpy as np


class RandomFliplr(object):
    """Horizontal flip of the sample with given probability.
    """

    def __init__(self, probability=0.5):
        """Init.

        Args:
            probability (float, optional): Flip probability. Defaults to 0.5.
        """
        self.__probability = probability

    def __call__(self, sample):
        prob = random.random()

        if prob < self.__probability:
            for k, v in sample.items():
                if len(v.shape) >= 2:
                    sample[k] = np.fliplr(v).copy()

        return sample


def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
    """Resize the sample to ensure the given size. Keeps aspect ratio.

    Args:
        sample (dict): sample
        size (tuple): image size

    Returns:
        tuple: new size
    """
    shape = list(sample["disparity"].shape)

    if shape[0] >= size[0] and shape[1] >= size[1]:
        return sample

    scale = [0, 0]
    scale[0] = size[0] / shape[0]
    scale[1] = size[1] / shape[1]

    scale = max(scale)

    shape[0] = math.ceil(scale * shape[0])
    shape[1] = math.ceil(scale * shape[1])

    # resize
    sample["image"] = cv2.resize(
        sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method
    )

    sample["disparity"] = cv2.resize(
        sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST
    )
    sample["mask"] = cv2.resize(
        sample["mask"].astype(np.float32),
        tuple(shape[::-1]),
        interpolation=cv2.INTER_NEAREST,
    )
    sample["mask"] = sample["mask"].astype(bool)

    return tuple(shape)


class RandomCrop(object):
    """Get a random crop of the sample with the given size (width, height).
    """

    def __init__(
        self,
        width,
        height,
        resize_if_needed=False,
        image_interpolation_method=cv2.INTER_AREA,
    ):
        """Init.

        Args:
            width (int): output width
            height (int): output height
            resize_if_needed (bool, optional): If True, sample might be upsampled to ensure
                that a crop of size (width, height) is possible. Defaults to False.
        """
        self.__size = (height, width)
        self.__resize_if_needed = resize_if_needed
        self.__image_interpolation_method = image_interpolation_method

    def __call__(self, sample):

        shape = sample["disparity"].shape

        if self.__size[0] > shape[0] or self.__size[1] > shape[1]:
            if self.__resize_if_needed:
                shape = apply_min_size(
                    sample, self.__size, self.__image_interpolation_method
                )
            else:
                raise Exception(
                    "Output size {} bigger than input size {}.".format(
                        self.__size, shape
                    )
                )

        offset = (
            np.random.randint(shape[0] - self.__size[0] + 1),
            np.random.randint(shape[1] - self.__size[1] + 1),
        )

        for k, v in sample.items():
            if k == "code" or k == "basis":
                continue

            if len(sample[k].shape) >= 2:
                sample[k] = v[
                    offset[0]: offset[0] + self.__size[0],
                    offset[1]: offset[1] + self.__size[1],
                ]

        return sample


class Resize(object):
    """Resize sample to given size (width, height).
    """

    def __init__(
        self,
        width,
        height,
        resize_target=True,
        keep_aspect_ratio=False,
        ensure_multiple_of=1,
        resize_method="lower_bound",
        image_interpolation_method=cv2.INTER_AREA,
        letter_box=False,
    ):
        """Init.

        Args:
            width (int): desired output width
            height (int): desired output height
            resize_target (bool, optional):
                True: Resize the full sample (image, mask, target).
                False: Resize image only.
                Defaults to True.
            keep_aspect_ratio (bool, optional):
                True: Keep the aspect ratio of the input sample.
                Output sample might not have the given width and height, and
                resize behaviour depends on the parameter 'resize_method'.
                Defaults to False.
            ensure_multiple_of (int, optional):
                Output width and height is constrained to be multiple of this parameter.
                Defaults to 1.
            resize_method (str, optional):
                "lower_bound": Output will be at least as large as the given size.
                "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
                "minimal": Scale as little as possible. (Output size might be smaller than given size.)
                Defaults to "lower_bound".
        """
        self.__width = width
        self.__height = height

        self.__resize_target = resize_target
        self.__keep_aspect_ratio = keep_aspect_ratio
        self.__multiple_of = ensure_multiple_of
        self.__resize_method = resize_method
        self.__image_interpolation_method = image_interpolation_method
        self.__letter_box = letter_box

    def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
        y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)

        if max_val is not None and y > max_val:
            y = (np.floor(x / self.__multiple_of)
                 * self.__multiple_of).astype(int)

        if y < min_val:
            y = (np.ceil(x / self.__multiple_of)
                 * self.__multiple_of).astype(int)

        return y

    def get_size(self, width, height):
        # determine new height and width
        scale_height = self.__height / height
        scale_width = self.__width / width

        if self.__keep_aspect_ratio:
            if self.__resize_method == "lower_bound":
                # scale such that output size is lower bound
                if scale_width > scale_height:
                    # fit width
                    scale_height = scale_width
                else:
                    # fit height
                    scale_width = scale_height
            elif self.__resize_method == "upper_bound":
                # scale such that output size is upper bound
                if scale_width < scale_height:
                    # fit width
                    scale_height = scale_width
                else:
                    # fit height
                    scale_width = scale_height
            elif self.__resize_method == "minimal":
                # scale as little as possible
                if abs(1 - scale_width) < abs(1 - scale_height):
                    # fit width
                    scale_height = scale_width
                else:
                    # fit height
                    scale_width = scale_height
            else:
                raise ValueError(
                    f"resize_method {self.__resize_method} not implemented"
                )

        if self.__resize_method == "lower_bound":
            new_height = self.constrain_to_multiple_of(
                scale_height * height, min_val=self.__height
            )
            new_width = self.constrain_to_multiple_of(
                scale_width * width, min_val=self.__width
            )
        elif self.__resize_method == "upper_bound":
            new_height = self.constrain_to_multiple_of(
                scale_height * height, max_val=self.__height
            )
            new_width = self.constrain_to_multiple_of(
                scale_width * width, max_val=self.__width
            )
        elif self.__resize_method == "minimal":
            new_height = self.constrain_to_multiple_of(scale_height * height)
            new_width = self.constrain_to_multiple_of(scale_width * width)
        else:
            raise ValueError(
                f"resize_method {self.__resize_method} not implemented")

        return (new_width, new_height)

    def make_letter_box(self, sample):
        top = bottom = (self.__height - sample.shape[0]) // 2
        left = right = (self.__width - sample.shape[1]) // 2
        sample = cv2.copyMakeBorder(
            sample, top, bottom, left, right, cv2.BORDER_CONSTANT, None, 0)
        return sample

    def __call__(self, sample):
        width, height = self.get_size(
            sample["image"].shape[1], sample["image"].shape[0]
        )

        # resize sample
        sample["image"] = cv2.resize(
            sample["image"],
            (width, height),
            interpolation=self.__image_interpolation_method,
        )

        if self.__letter_box:
            sample["image"] = self.make_letter_box(sample["image"])

        if self.__resize_target:
            if "disparity" in sample:
                sample["disparity"] = cv2.resize(
                    sample["disparity"],
                    (width, height),
                    interpolation=cv2.INTER_NEAREST,
                )

                if self.__letter_box:
                    sample["disparity"] = self.make_letter_box(
                        sample["disparity"])

            if "depth" in sample:
                sample["depth"] = cv2.resize(
                    sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST
                )

                if self.__letter_box:
                    sample["depth"] = self.make_letter_box(sample["depth"])

            sample["mask"] = cv2.resize(
                sample["mask"].astype(np.float32),
                (width, height),
                interpolation=cv2.INTER_NEAREST,
            )

            if self.__letter_box:
                sample["mask"] = self.make_letter_box(sample["mask"])

            sample["mask"] = sample["mask"].astype(bool)

        return sample


class ResizeFixed(object):
    def __init__(self, size):
        self.__size = size

    def __call__(self, sample):
        sample["image"] = cv2.resize(
            sample["image"], self.__size[::-1], interpolation=cv2.INTER_LINEAR
        )

        sample["disparity"] = cv2.resize(
            sample["disparity"], self.__size[::-1], interpolation=cv2.INTER_NEAREST
        )

        sample["mask"] = cv2.resize(
            sample["mask"].astype(np.float32),
            self.__size[::-1],
            interpolation=cv2.INTER_NEAREST,
        )
        sample["mask"] = sample["mask"].astype(bool)

        return sample


class Rescale(object):
    """Rescale target values to the interval [0, max_val].
    If input is constant, values are set to max_val / 2.
    """

    def __init__(self, max_val=1.0, use_mask=True):
        """Init.

        Args:
            max_val (float, optional): Max output value. Defaults to 1.0.
            use_mask (bool, optional): Only operate on valid pixels (mask == True). Defaults to True.
        """
        self.__max_val = max_val
        self.__use_mask = use_mask

    def __call__(self, sample):
        disp = sample["disparity"]

        if self.__use_mask:
            mask = sample["mask"]
        else:
            # use the builtin bool; np.bool has been removed in recent NumPy releases
            mask = np.ones_like(disp, dtype=bool)

        if np.sum(mask) == 0:
            return sample

        min_val = np.min(disp[mask])
        max_val = np.max(disp[mask])

        if max_val > min_val:
            sample["disparity"][mask] = (
                (disp[mask] - min_val) / (max_val - min_val) * self.__max_val
            )
        else:
            sample["disparity"][mask] = np.ones_like(
                disp[mask]) * self.__max_val / 2.0

        return sample


# mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
class NormalizeImage(object):
    """Normalize image by given mean and std.
    """

    def __init__(self, mean, std):
        self.__mean = mean
        self.__std = std

    def __call__(self, sample):
        sample["image"] = (sample["image"] - self.__mean) / self.__std

        return sample


class DepthToDisparity(object):
    """Convert depth to disparity. Removes depth from sample.
    """

    def __init__(self, eps=1e-4):
        self.__eps = eps

    def __call__(self, sample):
        assert "depth" in sample

        sample["mask"][sample["depth"] < self.__eps] = False

        sample["disparity"] = np.zeros_like(sample["depth"])
        sample["disparity"][sample["depth"] >= self.__eps] = (
            1.0 / sample["depth"][sample["depth"] >= self.__eps]
        )

        del sample["depth"]

        return sample


class DisparityToDepth(object):
    """Convert disparity to depth. Removes disparity from sample.
    """

    def __init__(self, eps=1e-4):
        self.__eps = eps

    def __call__(self, sample):
        assert "disparity" in sample

        disp = np.abs(sample["disparity"])
        sample["mask"][disp < self.__eps] = False

        # print(sample["disparity"])
        # print(sample["mask"].sum())
        # exit()

        sample["depth"] = np.zeros_like(disp)
        sample["depth"][disp >= self.__eps] = (
            1.0 / disp[disp >= self.__eps]
        )

        del sample["disparity"]

        return sample


class PrepareForNet(object):
    """Prepare sample for usage as network input.
    """

    def __init__(self):
        pass

    def __call__(self, sample):
        image = np.transpose(sample["image"], (2, 0, 1))
        sample["image"] = np.ascontiguousarray(image).astype(np.float32)

        if "mask" in sample:
            sample["mask"] = sample["mask"].astype(np.float32)
            sample["mask"] = np.ascontiguousarray(sample["mask"])

        if "disparity" in sample:
            disparity = sample["disparity"].astype(np.float32)
            sample["disparity"] = np.ascontiguousarray(disparity)

        if "depth" in sample:
            depth = sample["depth"].astype(np.float32)
            sample["depth"] = np.ascontiguousarray(depth)

        return sample
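These transforms operate on plain dict samples, so they can be chained directly; a minimal sketch using the classes above (the network input size and normalization constants are illustrative, and resize_target=False avoids the need for a mask entry):

import numpy as np
# Chain Resize -> NormalizeImage -> PrepareForNet on a synthetic HWC float image.
pipeline = [
    Resize(384, 384, resize_target=False, keep_aspect_ratio=True,
           ensure_multiple_of=32, resize_method="lower_bound"),
    NormalizeImage(mean=0.5, std=0.5),
    PrepareForNet(),
]

sample = {"image": np.random.rand(480, 640, 3).astype(np.float32)}
for t in pipeline:
    sample = t(sample)
print(sample["image"].shape)  # (3, 384, 512): short side >= 384, both sides multiples of 32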
annotator/zoe/zoedepth/data/vkitti.py
ADDED
@@ -0,0 +1,151 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import os

from PIL import Image
import numpy as np
import cv2


class ToTensor(object):
    def __init__(self):
        self.normalize = transforms.Normalize(
            mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        # self.resize = transforms.Resize((375, 1242))

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']

        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        # image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "vkitti"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class VKITTI(Dataset):
    def __init__(self, data_dir_root, do_kb_crop=True):
        import glob
        # image paths are of the form <data_dir_root>/{HR, LR}/<scene>/{color, depth_filled}/*.png
        self.image_files = glob.glob(os.path.join(
            data_dir_root, "test_color", '*.png'))
        self.depth_files = [r.replace("test_color", "test_depth")
                            for r in self.image_files]
        self.do_kb_crop = True
        self.transform = ToTensor()

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = Image.open(image_path)
        depth = Image.open(depth_path)  # immediately replaced by the cv2 read below
        depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
                           cv2.IMREAD_ANYDEPTH)
        print("depth min max", depth.min(), depth.max())

        # print(np.shape(image))
        # print(np.shape(depth))

        # depth[depth > 8] = -1

        if self.do_kb_crop and False:
            height = image.height
            width = image.width
            top_margin = int(height - 352)
            left_margin = int((width - 1216) / 2)
            depth = depth.crop(
                (left_margin, top_margin, left_margin + 1216, top_margin + 352))
            image = image.crop(
                (left_margin, top_margin, left_margin + 1216, top_margin + 352))
            # uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]

        image = np.asarray(image, dtype=np.float32) / 255.0
        # depth = np.asarray(depth, dtype=np.uint16) /1.
        depth = depth[..., None]
        sample = dict(image=image, depth=depth)

        # return sample
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_vkitti_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = VKITTI(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)


if __name__ == "__main__":
    loader = get_vkitti_loader(
        data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti_test")
    print("Total files", len(loader.dataset))
    for i, sample in enumerate(loader):
        print(sample["image"].shape)
        print(sample["depth"].shape)
        print(sample["dataset"])
        print(sample['depth'].min(), sample['depth'].max())
        if i > 5:
            break
annotator/zoe/zoedepth/data/vkitti2.py
ADDED
@@ -0,0 +1,187 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat

import os

import cv2
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms


class ToTensor(object):
    def __init__(self):
        # self.normalize = transforms.Normalize(
        #     mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        self.normalize = lambda x: x
        # self.resize = transforms.Resize((375, 1242))

    def __call__(self, sample):
        image, depth = sample['image'], sample['depth']

        image = self.to_tensor(image)
        image = self.normalize(image)
        depth = self.to_tensor(depth)

        # image = self.resize(image)

        return {'image': image, 'depth': depth, 'dataset': "vkitti"}

    def to_tensor(self, pic):

        if isinstance(pic, np.ndarray):
            img = torch.from_numpy(pic.transpose((2, 0, 1)))
            return img

        # handle PIL Image
        if pic.mode == 'I':
            img = torch.from_numpy(np.array(pic, np.int32, copy=False))
        elif pic.mode == 'I;16':
            img = torch.from_numpy(np.array(pic, np.int16, copy=False))
        else:
            img = torch.ByteTensor(
                torch.ByteStorage.from_buffer(pic.tobytes()))
        # PIL image mode: 1, L, P, I, F, RGB, YCbCr, RGBA, CMYK
        if pic.mode == 'YCbCr':
            nchannel = 3
        elif pic.mode == 'I;16':
            nchannel = 1
        else:
            nchannel = len(pic.mode)
        img = img.view(pic.size[1], pic.size[0], nchannel)

        img = img.transpose(0, 1).transpose(0, 2).contiguous()
        if isinstance(img, torch.ByteTensor):
            return img.float()
        else:
            return img


class VKITTI2(Dataset):
    def __init__(self, data_dir_root, do_kb_crop=True, split="test"):
        import glob

        # image paths are of the form <data_dir_root>/rgb/<scene>/<variant>/frames/<rgb,depth>/Camera<0,1>/rgb_{}.jpg
        self.image_files = glob.glob(os.path.join(
            data_dir_root, "rgb", "**", "frames", "rgb", "Camera_0", '*.jpg'), recursive=True)
        self.depth_files = [r.replace("/rgb/", "/depth/").replace(
            "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
        self.do_kb_crop = True
        self.transform = ToTensor()

        # If train test split is not created, then create one.
        # Split is such that 8% of the frames from each scene are used for testing.
        if not os.path.exists(os.path.join(data_dir_root, "train.txt")):
            import random
            scenes = set([os.path.basename(os.path.dirname(
                os.path.dirname(os.path.dirname(f)))) for f in self.image_files])
            train_files = []
            test_files = []
            for scene in scenes:
                scene_files = [f for f in self.image_files if os.path.basename(
                    os.path.dirname(os.path.dirname(os.path.dirname(f)))) == scene]
                random.shuffle(scene_files)
                train_files.extend(scene_files[:int(len(scene_files) * 0.92)])
                test_files.extend(scene_files[int(len(scene_files) * 0.92):])
            with open(os.path.join(data_dir_root, "train.txt"), "w") as f:
                f.write("\n".join(train_files))
            with open(os.path.join(data_dir_root, "test.txt"), "w") as f:
                f.write("\n".join(test_files))

        if split == "train":
            with open(os.path.join(data_dir_root, "train.txt"), "r") as f:
                self.image_files = f.read().splitlines()
            self.depth_files = [r.replace("/rgb/", "/depth/").replace(
                "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]
        elif split == "test":
            with open(os.path.join(data_dir_root, "test.txt"), "r") as f:
                self.image_files = f.read().splitlines()
            self.depth_files = [r.replace("/rgb/", "/depth/").replace(
                "rgb_", "depth_").replace(".jpg", ".png") for r in self.image_files]

    def __getitem__(self, idx):
        image_path = self.image_files[idx]
        depth_path = self.depth_files[idx]

        image = Image.open(image_path)
        # depth = Image.open(depth_path)
        depth = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR |
                           cv2.IMREAD_ANYDEPTH) / 100.0  # cm to m
        depth = Image.fromarray(depth)
        # print("depth min max", depth.min(), depth.max())

        # print(np.shape(image))
        # print(np.shape(depth))

        if self.do_kb_crop:
            if idx == 0:
                print("Using KB input crop")
            height = image.height
            width = image.width
            top_margin = int(height - 352)
            left_margin = int((width - 1216) / 2)
            depth = depth.crop(
                (left_margin, top_margin, left_margin + 1216, top_margin + 352))
            image = image.crop(
                (left_margin, top_margin, left_margin + 1216, top_margin + 352))
            # uv = uv[:, top_margin:top_margin + 352, left_margin:left_margin + 1216]

        image = np.asarray(image, dtype=np.float32) / 255.0
        # depth = np.asarray(depth, dtype=np.uint16) /1.
        depth = np.asarray(depth, dtype=np.float32) / 1.
        depth[depth > 80] = -1

        depth = depth[..., None]
        sample = dict(image=image, depth=depth)

        # return sample
        sample = self.transform(sample)

        if idx == 0:
            print(sample["image"].shape)

        return sample

    def __len__(self):
        return len(self.image_files)


def get_vkitti2_loader(data_dir_root, batch_size=1, **kwargs):
    dataset = VKITTI2(data_dir_root)
    return DataLoader(dataset, batch_size, **kwargs)


if __name__ == "__main__":
    loader = get_vkitti2_loader(
        data_dir_root="/home/bhatsf/shortcuts/datasets/vkitti2")
    print("Total files", len(loader.dataset))
    for i, sample in enumerate(loader):
        print(sample["image"].shape)
        print(sample["depth"].shape)
        print(sample["dataset"])
        print(sample['depth'].min(), sample['depth'].max())
        if i > 5:
            break
annotator/zoe/zoedepth/models/__init__.py
ADDED
@@ -0,0 +1,24 @@
# MIT License

# Copyright (c) 2022 Intelligent Systems Lab Org

# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:

# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.

# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# File author: Shariq Farooq Bhat
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-310.pyc
ADDED
Binary file (165 Bytes).
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-38.pyc
ADDED
Binary file (167 Bytes).
annotator/zoe/zoedepth/models/__pycache__/__init__.cpython-39.pyc
ADDED
Binary file (167 Bytes).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-310.pyc
ADDED
Binary file (6.26 kB).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-38.pyc
ADDED
Binary file (6.33 kB).
annotator/zoe/zoedepth/models/__pycache__/depth_model.cpython-39.pyc
ADDED
Binary file (6.31 kB).
annotator/zoe/zoedepth/models/__pycache__/model_io.cpython-310.pyc
ADDED
Binary file (2.27 kB).