outofray committed on
Commit
2f3b6c4
·
1 Parent(s): a446599

update_repo

Files changed (15)
  1. README.md +143 -3
  2. add_clearml_yolov5.patch +215 -0
  3. dataset.py +526 -0
  4. demo.bat +12 -0
  5. demo.py +781 -0
  6. demo_headless.sh +27 -0
  7. eval.py +397 -0
  8. plots.py +303 -0
  9. predict.py +470 -0
  10. requirements.txt +26 -0
  11. roi.py +34 -0
  12. try_chart.ipynb +0 -0
  13. usgfw2wrapper.dll +0 -0
  14. weights/.keep +1 -0
  15. weights/yolov5s-v2 +1 -0
README.md CHANGED
@@ -1,3 +1,143 @@
1
- ---
2
- license: mit
3
- ---
1
+ <!-- #region -->
2
+ # Automation of Aorta Measurement in Ultrasound Images
3
+
4
+ ## Env setup
5
+
6
+ Suggested hardware:
7
+
8
+ - GPU: 1x NVIDIA RTX 3090 or higher (model training with PyTorch)
9
+ - CPU: 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz, or higher (model inference using OpenVINO)
10
+
11
+ Software stack:
12
+
13
+ - OS: Ubuntu 20.04 LTS
14
+ - Python: 3.8+
15
+ - Python Env: conda
16
+
17
+ ```shell
18
+ conda create -n aorta python=3.8 -y
19
+ conda activate aorta
20
+ pip install -r requirements.txt
21
+ ```
22
+
23
+ ## Dataset
24
+
25
+ Steps to prepare the dataset:
26
+
27
+ 1. Collect images and import to CVAT
28
+ 2. Label the images in CVAT
29
+ 3. Export the labelled data in `COCO 1.0` format using CVAT
30
+
31
+ 1. Go to CVAT > `Projects` page
32
+ 2. Click `⋮` on `aorta` project
33
+ 3. Click `Export dataset`
34
+ - Format: `COCO 1.0`
35
+ - Save images: `Yes`
36
+
37
+ 4. Convert the exported (or re-split) data into YOLOv5 format (more `dataset.py` utilities are shown right after this list)
38
+
39
+ ```shell
40
+ python dataset.py coco2yolov5 [path/to/coco/input/dir] [path/to/yolov5/output/dir]
41
+ ```
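+
+ `dataset.py` bundles several other dataset utilities (merging/re-splitting, ROI cropping, listing image sizes); the exact sub-command names and options are best discovered through the built-in help:
+
+ ```shell
+ # list all dataset.py sub-commands
+ python dataset.py --help
+
+ # options of the COCO -> YOLOv5 converter shown above
+ python dataset.py coco2yolov5 --help
+ ```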
42
+
43
+ [CVAT](https://github.com/cvat-ai/cvat/tree/v2.3.0) info (set up with Docker Compose):
44
+
45
+ - Server version: 2.3
46
+ - Core version: 7.3.0
47
+ - Canvas version: 2.16.1
48
+ - UI version: 1.45.0
49
+
50
+ Dataset related scripts:
51
+
52
+ - [coco2yolov5seg.ipynb](../coco2yolov5seg.ipynb): Convert COCO format to YOLOv5 format for segmentation task
53
+ - [coco_merge_split.ipynb](../coco_merge_split.ipynb): Merge and split COCO format dataset
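+
+ `dataset.py` also provides a `newsplit` command that merges the exported train/test annotations and re-splits them leave-one-person-out (samples whose filepath matches the excluded name become the test set). A sketch only; the option names are inferred from the function signature, so double-check with `python dataset.py newsplit --help`:
+
+ ```shell
+ python dataset.py newsplit [path/to/coco/input/dir] --exclude-name Ellen
+ ```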
54
+
55
+ ## Training / Validation / Export
56
+
57
+ Model choice: prefer [yolov5-seg] over [yolov7-seg] for training, validation, and export, based on this comparison:
58
+
59
+ - yolov5s-seg: fast transfer learning (~5-10 min for 100 epochs on an RTX 3090) and fast CPU inference
60
+ - yolov7-seg: noticeably heavier, with slower CPU inference
61
+
62
+ Please refer to the [yolov5-seg] and [yolov7-seg] repos for details on training, validation, and export.
63
+
64
+ [yolov5-seg]: https://github.com/ultralytics/yolov5/blob/master/segment/tutorial.ipynb
65
+ [yolov7-seg]: https://github.com/WongKinYiu/yolov7/tree/u7/seg
66
+
67
+ ### yolov5-seg
68
+
69
+ Tested commit:
70
+
71
+ ```shell
72
+ # Assume work dir is aorta/
73
+ git clone https://github.com/ultralytics/yolov5
74
+ cd yolov5
75
+ git checkout 23c492321290266810e08fa5ee9a23fc9d6a571f
76
+ git apply ../add_clearml_yolov5.patch
77
+ ```
78
+
79
+ As of 2023, yolov5 seg does not support ClearML logging out of the box, but there is a [PR](https://github.com/ultralytics/yolov5/pull/10752) for it. Either update the files manually following that PR, or apply [add_clearml_yolov5.patch](./add_clearml_yolov5.patch) (as in the clone step above) to track the training process with ClearML.
80
+
81
+ ```shell
82
+ # Example
83
+ ## Original training script
84
+ python segment/train.py --img 640 --batch 16 --epochs 3 --data coco128-seg.yaml --weights yolov5s-seg.pt --cache
85
+
86
+ ## Updated training script with ClearML support
87
+ python segment/train.py --project [clearml_project_name] --name [task_name] --img 640 --batch 16 --epochs 3 --data coco128-seg.yaml --weights yolov5s-seg.pt --cache
88
+ ```
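+
+ After training, export the best checkpoint for CPU inference; the demo below expects an OpenVINO IR (`.xml`) or ONNX model. A minimal sketch using yolov5's `export.py` (the weights path and flags may differ between yolov5 versions, so verify with `python export.py --help`):
+
+ ```shell
+ # export the trained segmentation weights to ONNX and OpenVINO
+ python export.py --weights runs/train-seg/exp/weights/best.pt --include onnx openvino --imgsz 640
+ ```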
89
+
90
+ ## Test video
91
+
92
+ - Test video: [Demo.mp4](./Demo.mp4)
93
+ - The mp4 was converted from the original avi using `ffmpeg`:
94
+
95
+ ```shell
96
+ ffmpeg -i "Demo.avi" -vcodec h264 -acodec aac -b:v 500k -strict -2 Demo.mp4
97
+ ```
98
+
99
+ ## Demo (POC for 2022 Intel DevCup)
100
+
101
+ ```shell
102
+ # run demo, using openvino model
103
+ python demo.py --video Demo.mp4 --model weights/yolov5s-v2/best_openvino_model/yolov5-640-v2.xml --plot-mask --img-size 640
104
+
105
+ # or run the demo using onnx model
106
+ python demo.py --video Demo.mp4 --model weights/yolov5s-v2/yolov5-640.onnx --plot-mask --img-size 640
107
+
108
+ # or run in the headless mode, generating a recording of the demo
109
+ ./demo_headless.sh --video Demo.mp4 --model [path/to/model]
110
+ ```
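+
+ When `--video` is omitted, `demo.py` reads frames live from a Telemed probe through `usgfw2wrapper.dll` (Windows only); for example, `demo.bat` launches it as:
+
+ ```shell
+ python demo.py --device GPU --jobs 2
+ ```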
111
+
112
+ ## Deploy PyInstaller EXE
113
+
114
+ Only tested on Windows 10:
115
+
116
+ ```shell
117
+ pip install pyinstaller==5.9
118
+ pyinstaller demo.py
119
+ # (TODO) Replace the following manual steps with pyinstaller --add-data or spec file
120
+ #
121
+ # Manually copy these files to dist\demo
122
+ # 1. Copy best_openvino_model folder to dist\demo\
123
+ # 2. Copy openvino files to dist\demo
124
+ # C:\Users\sa\miniforge3\envs\echo\Lib\site-packages\openvino\libs
125
+ # plugins.xml
126
+ # openvino_ir_frontend.dll
127
+ # openvino_intel_cpu_plugin.dll
128
+ # openvino_intel_gpu_plugin.dll
129
+ ```
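+
+ The manual copy steps above could likely be folded into the build with PyInstaller's `--add-data` (a sketch only; `SRC;DEST` with a `;` separator is the Windows form, and the source path must match your local layout):
+
+ ```shell
+ pyinstaller demo.py --add-data "weights/yolov5s-v2/best_openvino_model;best_openvino_model"
+ ```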
130
+
131
+ Troubleshooting: if the deployed EXE fails with `ValueError: --plotlyjs argument is not a valid URL or file path:`, move the dist folder to a location whose path contains no special or Chinese characters. Reference: <https://github.com/plotly/Kaleido/issues/57>
132
+
133
+
134
+ ## Paper
135
+
136
+ https://www.nature.com/articles/s41746-024-01269-4
137
+
138
+ Chiu, IM., Chen, TY., Zheng, YC. et al. Prospective clinical evaluation of deep learning for ultrasonographic screening of abdominal aortic aneurysms. npj Digit. Med. 7, 282 (2024).
139
+ <!-- #endregion -->
140
+
141
+ ```python
142
+
143
+ ```
add_clearml_yolov5.patch ADDED
@@ -0,0 +1,215 @@
1
+ diff --git a/utils/loggers/__init__.py b/utils/loggers/__init__.py
2
+ index 9de1f22..93b9ba2 100644
3
+ --- a/utils/loggers/__init__.py
4
+ +++ b/utils/loggers/__init__.py
5
+ @@ -110,7 +110,7 @@ class Loggers():
6
+ if clearml and 'clearml' in self.include:
7
+ try:
8
+ self.clearml = ClearmlLogger(self.opt, self.hyp)
9
+ - except Exception:
10
+ + except Exception as e:
11
+ self.clearml = None
12
+ prefix = colorstr('ClearML: ')
13
+ LOGGER.warning(f'{prefix}WARNING ⚠️ ClearML is installed but not configured, skipping ClearML logging.'
14
+ @@ -159,10 +159,11 @@ class Loggers():
15
+ paths = self.save_dir.glob('*labels*.jpg') # training labels
16
+ if self.wandb:
17
+ self.wandb.log({'Labels': [wandb.Image(str(x), caption=x.name) for x in paths]})
18
+ - # if self.clearml:
19
+ - # pass # ClearML saves these images automatically using hooks
20
+ if self.comet_logger:
21
+ self.comet_logger.on_pretrain_routine_end(paths)
22
+ + if self.clearml:
23
+ + for path in paths:
24
+ + self.clearml.log_plot(title=path.stem, plot_path=path)
25
+
26
+ def on_train_batch_end(self, model, ni, imgs, targets, paths, vals):
27
+ log_dict = dict(zip(self.keys[:3], vals))
28
+ @@ -289,6 +290,8 @@ class Loggers():
29
+ self.wandb.finish_run()
30
+
31
+ if self.clearml and not self.opt.evolve:
32
+ + self.clearml.log_summary(dict(zip(self.keys[3:10], results)))
33
+ + [self.clearml.log_plot(title=f.stem, plot_path=f) for f in files]
34
+ self.clearml.task.update_output_model(model_path=str(best if best.exists() else last),
35
+ name='Best Model',
36
+ auto_delete_file=False)
37
+ @@ -303,6 +306,8 @@ class Loggers():
38
+ self.wandb.wandb_run.config.update(params, allow_val_change=True)
39
+ if self.comet_logger:
40
+ self.comet_logger.on_params_update(params)
41
+ + if self.clearml:
42
+ + self.clearml.task.connect(params)
43
+
44
+
45
+ class GenericLogger:
46
+ @@ -315,7 +320,7 @@ class GenericLogger:
47
+ include: loggers to include
48
+ """
49
+
50
+ - def __init__(self, opt, console_logger, include=('tb', 'wandb')):
51
+ + def __init__(self, opt, console_logger, include=('tb', 'wandb', 'clearml')):
52
+ # init default loggers
53
+ self.save_dir = Path(opt.save_dir)
54
+ self.include = include
55
+ @@ -333,6 +338,22 @@ class GenericLogger:
56
+ config=opt)
57
+ else:
58
+ self.wandb = None
59
+ +
60
+ + if clearml and 'clearml' in self.include:
61
+ + try:
62
+ + # Hyp is not available in classification mode
63
+ + if 'hyp' not in opt:
64
+ + hyp = {}
65
+ + else:
66
+ + hyp = opt.hyp
67
+ + self.clearml = ClearmlLogger(opt, hyp)
68
+ + except Exception:
69
+ + self.clearml = None
70
+ + prefix = colorstr('ClearML: ')
71
+ + LOGGER.warning(f'{prefix}WARNING ⚠️ ClearML is installed but not configured, skipping ClearML logging.'
72
+ + f' See https://github.com/ultralytics/yolov5/tree/master/utils/loggers/clearml#readme')
73
+ + else:
74
+ + self.clearml = None
75
+
76
+ def log_metrics(self, metrics, epoch):
77
+ # Log metrics dictionary to all loggers
78
+ @@ -349,6 +370,9 @@ class GenericLogger:
79
+
80
+ if self.wandb:
81
+ self.wandb.log(metrics, step=epoch)
82
+ +
83
+ + if self.clearml:
84
+ + self.clearml.log_scalars(metrics, epoch)
85
+
86
+ def log_images(self, files, name='Images', epoch=0):
87
+ # Log images to all loggers
88
+ @@ -361,6 +385,12 @@ class GenericLogger:
89
+
90
+ if self.wandb:
91
+ self.wandb.log({name: [wandb.Image(str(f), caption=f.name) for f in files]}, step=epoch)
92
+ +
93
+ + if self.clearml:
94
+ + if name == 'Results':
95
+ + [self.clearml.log_plot(f.stem, f) for f in files]
96
+ + else:
97
+ + self.clearml.log_debug_samples(files, title=name)
98
+
99
+ def log_graph(self, model, imgsz=(640, 640)):
100
+ # Log model graph to all loggers
101
+ @@ -373,11 +403,17 @@ class GenericLogger:
102
+ art = wandb.Artifact(name=f'run_{wandb.run.id}_model', type='model', metadata=metadata)
103
+ art.add_file(str(model_path))
104
+ wandb.log_artifact(art)
105
+ +
106
+ + if self.clearml:
107
+ + self.clearml.log_model(model_path=model_path, model_name=model_path.stem)
108
+
109
+ def update_params(self, params):
110
+ # Update the parameters logged
111
+ if self.wandb:
112
+ wandb.run.config.update(params, allow_val_change=True)
113
+ +
114
+ + if self.clearml:
115
+ + self.clearml.task.connect(params)
116
+
117
+
118
+ def log_tensorboard_graph(tb, model, imgsz=(640, 640)):
119
+ diff --git a/utils/loggers/clearml/clearml_utils.py b/utils/loggers/clearml/clearml_utils.py
120
+ index 2764abe..e7525da 100644
121
+ --- a/utils/loggers/clearml/clearml_utils.py
122
+ +++ b/utils/loggers/clearml/clearml_utils.py
123
+ @@ -3,6 +3,9 @@ import glob
124
+ import re
125
+ from pathlib import Path
126
+
127
+ +import matplotlib.image as mpimg
128
+ +import matplotlib.pyplot as plt
129
+ +
130
+ import numpy as np
131
+ import yaml
132
+
133
+ @@ -79,13 +82,16 @@ class ClearmlLogger:
134
+ # Maximum number of images to log to clearML per epoch
135
+ self.max_imgs_to_log_per_epoch = 16
136
+ # Get the interval of epochs when bounding box images should be logged
137
+ - self.bbox_interval = opt.bbox_interval
138
+ + # Only for detection task though!
139
+ + if 'bbox_interval' in opt:
140
+ + self.bbox_interval = opt.bbox_interval
141
+ self.clearml = clearml
142
+ self.task = None
143
+ self.data_dict = None
144
+ if self.clearml:
145
+ self.task = Task.init(
146
+ - project_name=opt.project if opt.project != 'runs/train' else 'YOLOv5',
147
+ + # project_name=opt.project if opt.project != 'runs/train' else 'YOLOv5',
148
+ + project_name=opt.project if not str(opt.project).startswith('runs/') else 'YOLOv5',
149
+ task_name=opt.name if opt.name != 'exp' else 'Training',
150
+ tags=['YOLOv5'],
151
+ output_uri=True,
152
+ @@ -112,6 +118,53 @@ class ClearmlLogger:
153
+ # Set data to data_dict because wandb will crash without this information and opt is the best way
154
+ # to give it to them
155
+ opt.data = self.data_dict
156
+ +
157
+ + def log_scalars(self, metrics, epoch):
158
+ + """
159
+ + Log scalars/metrics to ClearML
160
+ + arguments:
161
+ + metrics (dict) Metrics in dict format: {"metrics/mAP": 0.8, ...}
162
+ + epoch (int) iteration number for the current set of metrics
163
+ + """
164
+ + for k, v in metrics.items():
165
+ + title, series = k.split('/')
166
+ + self.task.get_logger().report_scalar(title, series, v, epoch)
167
+ +
168
+ + def log_model(self, model_path, model_name, epoch=0):
169
+ + """
170
+ + Log model weights to ClearML
171
+ + arguments:
172
+ + model_path (PosixPath or str) Path to the model weights
173
+ + model_name (str) Name of the model visible in ClearML
174
+ + epoch (int) Iteration / epoch of the model weights
175
+ + """
176
+ + self.task.update_output_model(model_path=str(model_path),
177
+ + name=model_name,
178
+ + iteration=epoch,
179
+ + auto_delete_file=False)
180
+ +
181
+ + def log_summary(self, metrics):
182
+ + """
183
+ + Log final metrics to a summary table
184
+ + arguments:
185
+ + metrics (dict) Metrics in dict format: {"metrics/mAP": 0.8, ...}
186
+ + """
187
+ + for k, v in metrics.items():
188
+ + self.task.get_logger().report_single_value(k, v)
189
+ +
190
+ + def log_plot(self, title, plot_path):
191
+ + """
192
+ + Log image as plot in the plot section of ClearML
193
+ + arguments:
194
+ + title (str) Title of the plot
195
+ + plot_path (PosixPath or str) Path to the saved image file
196
+ + """
197
+ + img = mpimg.imread(plot_path)
198
+ + fig = plt.figure()
199
+ + ax = fig.add_axes([0, 0, 1, 1], frameon=False, aspect='auto', xticks=[], yticks=[]) # no ticks
200
+ + ax.imshow(img)
201
+ +
202
+ + self.task.get_logger().report_matplotlib_figure(title, "", figure=fig, report_interactive=False)
203
+
204
+ def log_debug_samples(self, files, title='Debug Samples'):
205
+ """
206
+ @@ -126,7 +179,8 @@ class ClearmlLogger:
207
+ it = re.search(r'_batch(\d+)', f.name)
208
+ iteration = int(it.groups()[0]) if it else 0
209
+ self.task.get_logger().report_image(title=title,
210
+ - series=f.name.replace(it.group(), ''),
211
+ + # series=f.name.replace(it.group(), ''),
212
+ + series=f.name.replace(f"_batch{iteration}", ''),
213
+ local_path=str(f),
214
+ iteration=iteration)
215
+
dataset.py ADDED
@@ -0,0 +1,526 @@
1
+ import typer
2
+ import fiftyone as fo
3
+ from fiftyone import ViewField as F
4
+ from pathlib import Path
5
+ from pycocotools.coco import COCO
6
+ from loguru import logger
7
+ import cv2
8
+ import shutil
9
+ import os
10
+ import random
11
+ from collections import defaultdict
12
+ import csv
13
+
14
+
15
+ DEFAULT_EXCLUDE_NAME = "Ellen"
16
+ DEFAULT_INS_TRAIN = "instances_Train.json"
17
+ DEFAULT_INS_TEST = "instances_Test.json"
18
+
19
+ app = typer.Typer()
20
+
21
+
22
+ @app.command()
23
+ def newsplit(
24
+ in_dir: str,
25
+ train_json=DEFAULT_INS_TRAIN,
26
+ test_json=DEFAULT_INS_TEST,
27
+ exclude_name=DEFAULT_EXCLUDE_NAME,
28
+ ):
29
+ """
30
+ Merge the train and test datasets,
31
+ and then split them into new train/test by leaving one person out.
32
+ """
33
+
34
+ # load the dataset
35
+ logger.info("Loading datasets...")
36
+ ds1 = fo.Dataset.from_dir(
37
+ dataset_type=fo.types.COCODetectionDataset,
38
+ data_path=Path(in_dir) / "images",
39
+ labels_path=Path(in_dir) / "annotations" / train_json,
40
+ )
41
+ ds2 = fo.Dataset.from_dir(
42
+ dataset_type=fo.types.COCODetectionDataset,
43
+ data_path=Path(in_dir) / "images",
44
+ labels_path=Path(in_dir) / "annotations" / test_json,
45
+ )
46
+
47
+ logger.info(f"[Before] Num samples in train: {len(ds1)}")
48
+ logger.info(f"[Before] Num samples in test: {len(ds2)}")
49
+
50
+ # merge the datasets
51
+ ds1.merge_samples(ds2)
52
+
53
+ # generate the new split
54
+     logger.info(f"Excluding samples whose filepath matches '{exclude_name}' from the train set")
55
+ new_train_view = ds1.match(~F("filepath").re_match(exclude_name))
56
+ new_test_view = ds1.match(F("filepath").re_match(exclude_name))
57
+ assert len(new_train_view) + len(new_test_view) == len(ds1)
58
+ logger.info(f"[After] Num samples in train: {len(new_train_view)}")
59
+ logger.info(f"[After] Num samples in test: {len(new_test_view)}")
60
+ train_counts = new_train_view.count_values("detections.detections.label")
61
+ test_counts = new_test_view.count_values("detections.detections.label")
62
+ logger.info(f"[After] Train counts: {train_counts}")
63
+ logger.info(f"[After] Test counts: {test_counts}")
64
+
65
+ # export the new split
66
+ logger.info("Exporting new train/test...")
67
+ new_train_p = Path(in_dir) / "annotations" / f"new_train_no-{exclude_name}.json"
68
+ new_test_p = Path(in_dir) / "annotations" / f"new_test_{exclude_name}.json"
69
+ new_train_view.export(
70
+ dataset_type=fo.types.COCODetectionDataset,
71
+ labels_path=new_train_p,
72
+ label_field="segmentations",
73
+ classes=ds1.default_classes,
74
+ abs_paths=True,
75
+ )
76
+ new_test_view.export(
77
+ dataset_type=fo.types.COCODetectionDataset,
78
+ labels_path=new_test_p,
79
+ label_field="segmentations",
80
+ classes=ds2.default_classes,
81
+ abs_paths=True,
82
+ )
83
+ logger.info(f"Exported new train: {new_train_p}")
84
+ logger.info(f"Exported new test: {new_test_p}")
85
+
86
+
87
+ def _normalize(img_size, xy_s):
88
+ assert len(xy_s) % 2 == 0
89
+ normalized_xy_s = []
90
+ dw = 1.0 / (img_size[0])
91
+ dh = 1.0 / (img_size[1])
92
+ for i in range(len(xy_s)):
93
+ p = xy_s[i]
94
+ p = p * dw if i % 2 == 0 else p * dh
95
+ assert p <= 1.0 and p >= 0.0, f"{p} should < 1 and > 0"
96
+ normalized_xy_s.append(p)
97
+ return normalized_xy_s
98
+
99
+
100
+ def _coco2yolo(coco_img_dir, coco_json_path, out_dir, bbox_only=False, rois=None):
101
+ logger.info(f"Reading {Path(coco_json_path).name}...")
102
+ coco = COCO(coco_json_path)
103
+
104
+ cats = coco.loadCats(coco.getCatIds())
105
+ cats = sorted(cats, key=lambda x: x["id"], reverse=False)
106
+ assert cats[0]["id"] == 1, f"Assume cat id starts from 1, but got {cats[0]['id']}"
107
+ logger.info(f"{len(cats)} categories: {[cat['name'] for cat in cats]}")
108
+
109
+ img_ids = coco.getImgIds()
110
+ prefix = Path(coco_json_path).stem.split("_")[-1].lower() # either train or test
111
+
112
+ # create output directories
113
+ target_txt_r = Path(out_dir) / prefix / "labels"
114
+ target_img_r = Path(out_dir) / prefix / "images"
115
+ target_txt_r.mkdir(parents=True, exist_ok=False)
116
+ target_img_r.mkdir(parents=True, exist_ok=False)
117
+
118
+ logger.info(f"Num of imgs: {len(img_ids)}")
119
+
120
+ n_imgs_no_annos = 0
121
+ num_zero_area = 0
122
+ for img_id in img_ids:
123
+ img = coco.loadImgs(img_id)[0]
124
+ img_p = Path(coco_img_dir) / img["file_name"]
125
+ assert img_p.exists(), f"{img_p} does not exist"
126
+
127
+ anno_ids = coco.getAnnIds(imgIds=img["id"])
128
+ annos = coco.loadAnns(anno_ids)
129
+
130
+ new_filename = f"{img['id']}_{img_p.stem}"
131
+
132
+ out_img_p = target_img_r / (new_filename + img_p.suffix)
133
+
134
+ # get roi for the image if any
135
+ im_cv = cv2.imread(img_p.as_posix())
136
+ im_width, im_height = im_cv.shape[1], im_cv.shape[0]
137
+ roi = rois[(im_width, im_height)] if rois is not None else None
138
+ has_roi = (rois is not None) and (roi is not None) and len(roi) == 4
139
+ if not has_roi:
140
+ # copy image to target dir
141
+ shutil.copy(img_p, out_img_p)
142
+ else:
143
+ # crop the image to target dir
144
+ assert len(roi) == 4, f"ROI should have 4 values, but got {roi}"
145
+ cropped_img = im_cv[roi[1] : roi[1] + roi[3], roi[0] : roi[0] + roi[2]]
146
+ cv2.imwrite(out_img_p.as_posix(), cropped_img)
147
+
148
+ # bg imgs: only need to copy img, no need to create label file
149
+ if len(annos) == 0:
150
+ n_imgs_no_annos += 1
151
+ continue
152
+
153
+ # create the label txt file
154
+ txt_p = Path(target_txt_r) / (new_filename + ".txt")
155
+ if txt_p.exists():
156
+             logger.warning(f"{txt_p} already exists and will be overwritten by {img_p}")
157
+ txt_f = open(txt_p, "w")
158
+ img = cv2.imread(img_p.as_posix())
159
+ h, w, _ = img.shape
160
+
161
+ # generate txt file for each image
162
+ for ann in annos:
163
+ cls_id = ann["category_id"] - 1 # yolov5 uses zero-based class idx
164
+
165
+ # region bbox, for object detection
166
+ if bbox_only:
167
+ bbox = ann["bbox"]
168
+ # convert coco to yolo: top-x, top-y, w, h -> center-x, center-y, w, h
169
+ bbox_yolo = [
170
+ bbox[0] + bbox[2] / 2,
171
+ bbox[1] + bbox[3] / 2,
172
+ bbox[2],
173
+ bbox[3],
174
+ ]
175
+ n_bbox_p = " ".join([str(a) for a in _normalize((w, h), bbox_yolo)])
176
+ txt_f.write(f"{cls_id} {n_bbox_p}{os.linesep}")
177
+ continue
178
+ # endregion
179
+
180
+ # region seg, for instance segmentation
181
+ seg = ann["segmentation"]
182
+ if len(seg) > 1:
183
+ # TODO: Investigate why sometimes there are multiple segs
184
+ logger.warning(f"Skip {img_p} with {len(seg)} segs of {ann}")
185
+ continue
186
+
187
+ if len(seg) == 1:
188
+ xy_s = seg[0]
189
+ # handle roi if any
190
+ if has_roi:
191
+ xy_s = [xy - roi[i % 2] for i, xy in enumerate(xy_s)]
192
+ w, h = roi[2], roi[3]
193
+ # remove the points outside of roi
194
+ new_xy_s = []
195
+ for i in range(0, len(xy_s), 2):
196
+ x, y = xy_s[i], xy_s[i + 1]
197
+ if x >= 0 and x <= w and y >= 0 and y <= h:
198
+ new_xy_s.extend([x, y])
199
+ xy_s = new_xy_s
200
+ n_xy_s = _normalize((w, h), xy_s)
201
+ seg_p = " ".join([str(a) for a in n_xy_s])
202
+ txt_f.write(f"{cls_id} {seg_p}{os.linesep}")
203
+ # endregion
204
+
205
+ # region keypoint, for pose estimation
206
+ if "keypoints" in ann:
207
+ # skip area 0 keypoints which could cause yolov8 training error
208
+ if int(ann["area"]) == 0:
209
+ num_zero_area += 1
210
+ continue
211
+ kps = ann["keypoints"]
212
+ bbox = ann["bbox"]
213
+ # convert coco to yolo: top-x, top-y, w, h -> center-x, center-y, w, h
214
+ bbox_yolo = [
215
+ bbox[0] + bbox[2] / 2,
216
+ bbox[1] + bbox[3] / 2,
217
+ bbox[2],
218
+ bbox[3],
219
+ ]
220
+ n_bbox_p = " ".join([str(a) for a in _normalize((w, h), bbox_yolo)])
221
+ # normalize x,y of each keypoint and keep visibility as is
222
+ n_kp = []
223
+ for i in range(0, len(kps), 3):
224
+ n_kp.append(kps[i] / w)
225
+ n_kp.append(kps[i + 1] / h)
226
+ n_kp.append(kps[i + 2])
227
+ n_kp_p = " ".join([str(a) for a in n_kp])
228
+ txt_f.write(f"{cls_id} {n_bbox_p} {n_kp_p}{os.linesep}")
229
+ # endregion
230
+ txt_f.close()
231
+ # remove empty label file which has no annos
232
+ if txt_p.stat().st_size == 0:
233
+ txt_p.unlink()
234
+ n_imgs_no_annos += 1
235
+ empty_ratio = 100 * float(n_imgs_no_annos) / len(img_ids)
236
+ n_imgs_anns = len(img_ids) - n_imgs_no_annos
237
+ logger.info(f"# imgs w anns: {n_imgs_anns} {(100-empty_ratio):.2f}%")
238
+ logger.info(f"# imgs w/o anns: {n_imgs_no_annos} {empty_ratio:.2f}%")
239
+ logger.info(f"# zero area kps: {num_zero_area}")
240
+ txts = [f for f in target_txt_r.iterdir() if f.is_file()]
241
+ imgs = [f for f in target_img_r.iterdir() if f.is_file()]
242
+ assert (len(txts) + n_imgs_no_annos) == len(imgs) == len(img_ids)
243
+ return target_img_r
244
+
245
+
246
+ @app.command(help="Convert COCO dataset to YOLOv5 format")
247
+ def coco2yolov5(
248
+ in_dir: str,
249
+ out_dir: str,
250
+ split_val_ratio: float = 0.2,
251
+ seed: int = 42,
252
+ bbox_only: bool = False,
253
+ crop_roi_file: str = None,
254
+ ):
255
+ """
256
+ Convert COCO dataset to YOLOv5 format.
257
+ Support 3 task types: object detection, instance segmentation, pose estimation.
258
+
259
+ YOLOv5 seg labels are the same as detection labels, using txt files with one object per line.
260
+ The difference is that instead of "class, xywh" they are "class xy1, xy2, xy3,...".
261
+ Ref: https://github.com/ultralytics/yolov5/issues/10161#issuecomment-1315672357
262
+
263
+     YOLOv5 keypoint labels use txt files with one object per line.
264
+ class cx cy w h x1 y1 v1 ... xn yn vn
265
+ All coordinates are normalized by image width and height.
266
+ vn (visibility): 0, 1, or 2 => not labeled, labeled but invisible, labeled and visible
267
+ Ref: https://github.com/ultralytics/ultralytics/issues/1870#issuecomment-1498909244
268
+ Example: https://ultralytics.com/assets/coco8-pose.zip
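+
+     Illustrative label lines (values are made up; all coordinates normalized to [0, 1]):
+         seg:      "0 0.51 0.43 0.55 0.47 0.52 0.50"    (class x1 y1 x2 y2 x3 y3)
+         keypoint: "0 0.50 0.50 0.20 0.30 0.51 0.48 2"  (class cx cy w h x y v)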
269
+ """
270
+ if Path(out_dir).exists():
271
+         delete = typer.confirm(f"{out_dir} already exists. Are you sure you want to delete it?")
272
+ if not delete:
273
+ logger.info("Not deleting")
274
+ raise typer.Abort()
275
+ shutil.rmtree(out_dir)
276
+ logger.info(f"Deleted {Path(out_dir).name}")
277
+
278
+ ann_dir_p = Path(in_dir) / "annotations"
279
+ img_dir_p = Path(in_dir) / "images"
280
+ assert ann_dir_p.exists(), f"{ann_dir_p} does not exist"
281
+ assert img_dir_p.exists(), f"{img_dir_p} does not exist"
282
+
283
+ # try to find the json files of train & test in annotations dir
284
+ train_json_p = None
285
+ test_json_p = None
286
+ for f in ann_dir_p.iterdir():
287
+ if f.stem.lower().endswith("train"):
288
+ train_json_p = f
289
+ logger.info(f"Found train json: {f.name}")
290
+ elif f.stem.lower().endswith("test"):
291
+ test_json_p = f
292
+ logger.info(f"Found test json: {f.name}")
293
+ # must have train, while test is optional
294
+ assert train_json_p is not None, f"Cannot find train json in {ann_dir_p}"
295
+ do_split = False
296
+ if test_json_p is None:
297
+ logger.warning("Cannot find test json in [in_dir]/annotations")
298
+ do_split = typer.confirm("Do you want to split val from train?")
299
+
300
+ # region handle ROIs
301
+ rois = None
302
+ if crop_roi_file is not None:
303
+ roi_csv_p = Path(crop_roi_file)
304
+ assert roi_csv_p.exists(), f"{roi_csv_p} does not exist"
305
+ # read ROIs from csv, each image size should have one ROI
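+         # expected CSV header: ori_width,ori_height,roi_x,roi_y,roi_width,roi_height (one row per image size)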
306
+ rois = defaultdict(lambda: [], {})
307
+ with open(roi_csv_p, "r") as f:
308
+ for roi in csv.DictReader(f):
309
+ ori_width = int(roi["ori_width"])
310
+ ori_height = int(roi["ori_height"])
311
+ roi_x = int(roi["roi_x"])
312
+ roi_y = int(roi["roi_y"])
313
+ roi_width = int(roi["roi_width"])
314
+ roi_height = int(roi["roi_height"])
315
+
316
+ key = (ori_width, ori_height)
317
+ assert key not in rois, f"Duplicate ROI for {key}"
318
+ rois[key] = [roi_x, roi_y, roi_width, roi_height]
319
+ # endregion
320
+
321
+ yolo_train_img_dir = None
322
+ yolo_test_img_dir = None
323
+ yolo_train_img_dir = _coco2yolo(img_dir_p, train_json_p, out_dir, bbox_only, rois)
324
+ if test_json_p is not None:
325
+ yolo_test_img_dir = _coco2yolo(img_dir_p, test_json_p, out_dir, bbox_only, rois)
326
+
327
+ if do_split:
328
+ yolo_test_img_dir = Path(out_dir) / "val" / "images"
329
+ # randomly select 20% of train images
330
+ train_imgs = [f for f in yolo_train_img_dir.iterdir() if f.is_file()]
331
+ n_test = int(len(train_imgs) * split_val_ratio)
332
+ logger.info(f"Split ratio {split_val_ratio}: {n_test} test images from train")
333
+ # set random seed to make sure the same images are selected
334
+ random.seed(seed)
335
+ test_imgs = random.sample(train_imgs, n_test)
336
+ # move test images to val/images
337
+ yolo_test_img_dir.mkdir(parents=True, exist_ok=True)
338
+ for f in test_imgs:
339
+ shutil.move(str(f), str(yolo_test_img_dir))
340
+ # move labels of test images to val/labels
341
+ yolo_test_label_dir = Path(out_dir) / "val" / "labels"
342
+ yolo_test_label_dir.mkdir(parents=True, exist_ok=True)
343
+ for f in test_imgs:
344
+ label_f = yolo_train_img_dir.parent / "labels" / f"{f.stem}.txt"
345
+ if label_f.exists():
346
+ shutil.move(str(label_f), str(yolo_test_label_dir))
347
+
348
+ # region create yaml file
349
+
350
+ logger.info(f"Reading {Path(train_json_p).name}...")
351
+ train_coco = COCO(train_json_p)
352
+ train_cats = train_coco.loadCats(train_coco.getCatIds())
353
+ num_kps = [
354
+ len(c["keypoints"])
355
+ for c in train_cats
356
+ if "keypoints" in c and len(c["keypoints"]) > 0
357
+ ]
358
+ # check if all categories have the same number of keypoints
359
+ if len(num_kps) > 0:
360
+ assert len(set(num_kps)) == 1, "Categories have different number of keypoints"
361
+ logger.info(f"Number of keypoints: {set(num_kps)}")
362
+ train_cats = [c["name"] for c in train_cats]
363
+ # ensure having the same categories in the json of train & test
364
+ # test_coco = COCO(test_json_p)
365
+ # test_cats = test_coco.loadCats(test_coco.getCatIds())
366
+ # test_cats = sorted(test_cats, key=lambda x: x["id"], reverse=False)
367
+ # test_cats = [c["name"] for c in test_cats]
368
+ # assert ",".join(train_cats) == ",".join(test_cats), "Categories mismatch"
369
+
370
+ out_config_file = Path(out_dir) / "data.yaml"
371
+ with open(out_config_file, "w") as f:
372
+ if len(num_kps) > 0:
373
+ f.write(f"kpt_shape: [{num_kps[0]},3]" + os.linesep)
374
+ assert num_kps[0] == 1, "Only support 1 keypoint for now"
375
+ f.write("flip_idx: [0]" + os.linesep)
376
+ f.write("names:" + os.linesep)
377
+ for c in train_cats:
378
+ f.write(f"- {c}" + os.linesep)
379
+ f.write(f"nc: {len(train_cats)}" + os.linesep)
380
+ f.write(f"path: {Path(out_dir).absolute()}" + os.linesep)
381
+ train_rel_path = f"{yolo_train_img_dir.parent.name}/{yolo_train_img_dir.name}"
382
+ f.write(f"train: {train_rel_path}" + os.linesep)
383
+ if yolo_test_img_dir is not None:
384
+ val_rel_path = f"{yolo_test_img_dir.parent.name}/{yolo_test_img_dir.name}"
385
+ f.write(f"val: {val_rel_path}" + os.linesep)
386
+
387
+ logger.info(f"Config file saved: {out_config_file}")
388
+ # endregion
389
+
390
+ logger.info("Done ✅")
391
+
392
+
393
+ @app.command(help="List all image sizes and counts in a directory recursively")
394
+ def list_img_sizes(
395
+ in_dir: str = typer.Argument(..., help="Input directory"),
396
+ ):
397
+ in_dir_p = Path(in_dir)
398
+ assert in_dir_p.exists(), f"{in_dir_p} does not exist"
399
+ assert in_dir_p.is_dir(), f"{in_dir_p} is not a directory"
400
+
401
+ ds = fo.Dataset.from_images_dir(in_dir_p)
402
+ ds.compute_metadata()
403
+
404
+ logger.info(f"Found {len(ds)} images in {in_dir_p}")
405
+
406
+ # count number of images for each size
407
+ sizes = defaultdict(lambda: 0, {})
408
+ for sample in ds:
409
+ metadata = sample.metadata
410
+ width = metadata.width
411
+ height = metadata.height
412
+ sizes[(width, height)] += 1
413
+ # sort with the most frequent size first
414
+ sizes = dict(sorted(sizes.items(), key=lambda x: x[1], reverse=True))
415
+ for k, v in sizes.items():
416
+ # find one example image for each size
417
+ sample = ds.match({"metadata.width": k[0], "metadata.height": k[1]}).first()
418
+ print(f"Size (w, h) {k}: {v} image(s), e.g., {sample.filepath}")
419
+
420
+
421
+ @app.command(help="Crop images in a directory recursively with ROIs from csv")
422
+ def crop_imgs(
423
+ in_dir: str = typer.Argument(..., help="Input directory"),
424
+ roi_csv: str = typer.Argument(..., help="CSV file containing ROIs"),
425
+ ):
426
+ in_dir_p = Path(in_dir)
427
+ assert in_dir_p.exists(), f"{in_dir_p} does not exist"
428
+ assert in_dir_p.is_dir(), f"{in_dir_p} is not a directory"
429
+
430
+ roi_csv_p = Path(roi_csv)
431
+ assert roi_csv_p.exists(), f"{roi_csv_p} does not exist"
432
+
433
+ # read ROIs from csv, each image size should have one ROI
434
+ rois = defaultdict(lambda: [], {})
435
+ with open(roi_csv_p, "r") as f:
436
+ for roi in csv.DictReader(f):
437
+ ori_width = int(roi["ori_width"])
438
+ ori_height = int(roi["ori_height"])
439
+ roi_x = int(roi["roi_x"])
440
+ roi_y = int(roi["roi_y"])
441
+ roi_width = int(roi["roi_width"])
442
+ roi_height = int(roi["roi_height"])
443
+
444
+ key = (ori_width, ori_height)
445
+ assert key not in rois, f"Duplicate ROI for {key}"
446
+ rois[key] = [roi_x, roi_y, roi_width, roi_height]
447
+
448
+ # read and crop images
449
+ # write the cropped images to a new directory
450
+ out_dir_p = in_dir_p.parent / f"{in_dir_p.name}_cropped"
451
+ Path(out_dir_p).mkdir(parents=True, exist_ok=True)
452
+ ds = fo.Dataset.from_images_dir(in_dir_p)
453
+ logger.info(f"Found {len(ds)} images in {in_dir_p}")
454
+ for sample in ds:
455
+ img_path = sample.filepath
456
+
457
+ # read and crop the image
458
+ img = cv2.imread(img_path)
459
+ width, height = img.shape[1], img.shape[0]
460
+ roi = rois[(width, height)]
461
+ cropped_img = img[roi[1] : roi[1] + roi[3], roi[0] : roi[0] + roi[2]]
462
+
463
+ # keep the original folder structure
464
+ out_img_p = out_dir_p / Path(img_path).relative_to(in_dir_p.absolute())
465
+ # create the subfolder if not exist
466
+ if not out_img_p.parent.exists():
467
+ out_img_p.parent.mkdir(parents=True, exist_ok=True)
468
+ cv2.imwrite(str(out_img_p), cropped_img)
469
+ logger.info(f"Cropped images saved to {out_dir_p}")
470
+
471
+
472
+ @app.command(help="Count num of images without aorta annotations")
473
+ def count_n_imgs_no_aorta(
474
+ in_coco_json_p: str = typer.Argument(..., help="Input coco json file"),
475
+ aorta_cat_name: str = typer.Argument("aorta", help="Name of aorta category"),
476
+ ):
477
+ logger.info(f"Reading {Path(in_coco_json_p).name}...")
478
+ assert Path(in_coco_json_p).exists(), f"{in_coco_json_p} does not exist"
479
+ coco = COCO(in_coco_json_p)
480
+
481
+ cats = coco.loadCats(coco.getCatIds())
482
+ cats = sorted(cats, key=lambda x: x["id"], reverse=False)
483
+ # find the category id of aorta
484
+ aorta_cat_id = None
485
+ for cat in cats:
486
+ if cat["name"] == aorta_cat_name:
487
+ aorta_cat_id = cat["id"]
488
+ break
489
+ assert aorta_cat_id is not None, f"Cannot find {aorta_cat_name} in {in_coco_json_p}"
490
+ logger.info(f"Found {aorta_cat_name} with id {aorta_cat_id}")
491
+
492
+ n_img_no_aorta = 0
493
+ for img_id in coco.getImgIds():
494
+ anno_ids = coco.getAnnIds(imgIds=img_id)
495
+ annos = coco.loadAnns(anno_ids)
496
+ has_aorta = False
497
+ for anno in annos:
498
+ if anno["category_id"] == aorta_cat_id:
499
+ has_aorta = True
500
+ break
501
+ if not has_aorta:
502
+ n_img_no_aorta += 1
503
+ logger.info(f"Found {n_img_no_aorta} images without {aorta_cat_name}")
504
+
505
+
506
+ @app.command(help="Remove non-aorta annotations from a YOLOv5 dataset")
507
+ def keep_only_aorta_labels_in_yolo(
508
+ in_dir: str = typer.Argument(..., help="Input label directory"),
509
+ aorta_class_id: int = typer.Argument(0, help="Class id of aorta"),
510
+ ):
511
+ txts = list(Path(in_dir).glob("*.txt"))
512
+ logger.info(f"Found {len(txts)} txt files in {in_dir}")
513
+ for txt_p in txts:
514
+ ori_lines, new_lines = [], []
515
+ with open(txt_p, "r") as f:
516
+ ori_lines = f.readlines()
517
+ for line in ori_lines:
518
+ nums = line.split(" ")
519
+ if int(nums[0]) == aorta_class_id:
520
+ new_lines.append(line)
521
+ with open(txt_p, "w") as new_f:
522
+ new_f.writelines(new_lines)
523
+
524
+
525
+ if __name__ == "__main__":
526
+ app()
demo.bat ADDED
@@ -0,0 +1,12 @@
1
+ @echo off
2
+
3
+ REM "Please change this path to your own"
4
+ cd "C:\Users\chenp\Downloads\aorta_demo_v3"
5
+
6
+ REM "Please change this path to your own"
7
+ call C:\ProgramData\miniconda3\Scripts\activate.bat
8
+
9
+ call conda activate echo
10
+ call python demo.py --device GPU --jobs 2
11
+
12
+ pause
demo.py ADDED
@@ -0,0 +1,781 @@
1
+ from pathlib import Path
2
+ import sys
3
+ import time
4
+ from time import perf_counter
5
+ import argparse
6
+ from loguru import logger
7
+ import os
8
+
9
+ from predict import Model
10
+
11
+ from datetime import datetime
12
+ from scipy import signal
13
+ import plotly.graph_objects as go
14
+ import numpy as np
15
+ import io
16
+ from PIL import Image
17
+
18
+ import cv2
19
+ from PySide6.QtCore import Qt, QThread, Signal, Slot
20
+ from PySide6.QtGui import QImage, QPixmap
21
+ from PySide6.QtWidgets import (
22
+ QApplication,
23
+ QHBoxLayout,
24
+ QLabel,
25
+ QMainWindow,
26
+ QPushButton,
27
+ QSizePolicy,
28
+ QVBoxLayout,
29
+ QWidget,
30
+ )
31
+
32
+ # for telemed
33
+ import matplotlib.pyplot as plt
34
+ import ctypes
35
+ from ctypes import *
36
+
37
+ # 720p
38
+ video_w = 1280
39
+ video_h = 720
40
+
41
+
42
+ # Copy from detection.py from telemed sample code
43
+ class Telemed:
44
+ def __init__(self):
45
+         # start of code copied from the original main
46
+
47
+ # Setting ultrasound size
48
+ # w = 512
49
+ # h = 512
50
+ w = 640
51
+ h = 640
52
+
53
+ # Load dll
54
+ # usgfw2 = cdll.LoadLibrary('./usgfw2wrapper_C++_sources/usgfw2wrapper/x64/Release/usgfw2wrapper.dll')
55
+ usgfw2 = cdll.LoadLibrary("./usgfw2wrapper.dll")
56
+
57
+ # Ultrasound initialize
58
+ usgfw2.on_init()
59
+ ERR = usgfw2.init_ultrasound_usgfw2()
60
+
61
+ # Check probe
62
+ if ERR == 2:
63
+ logger.error("Main Usgfw2 library object not created")
64
+ usgfw2.Close_and_release()
65
+ sys.exit()
66
+
67
+ ERR = usgfw2.find_connected_probe()
68
+
69
+ if ERR != 101:
70
+ logger.error("Probe not detected")
71
+ usgfw2.Close_and_release()
72
+ sys.exit()
73
+
74
+ ERR = usgfw2.data_view_function()
75
+
76
+ if ERR < 0:
77
+ logger.error(
78
+ "Main ultrasound scanning object for selected probe not created"
79
+ )
80
+ sys.exit()
81
+
82
+ ERR = usgfw2.mixer_control_function(0, 0, w, h, 0, 0, 0)
83
+ if ERR < 0:
84
+ logger.error("B mixer control not returned")
85
+ sys.exit()
86
+
87
+ # Probe setting
88
+ res_X = ctypes.c_float(0.0)
89
+ res_Y = ctypes.c_float(0.0)
90
+ usgfw2.get_resolution(ctypes.pointer(res_X), ctypes.pointer(res_Y))
91
+
92
+ X_axis = np.zeros(shape=(w))
93
+ Y_axis = np.zeros(shape=(h))
94
+ if w % 2 == 0:
95
+ k = 0
96
+ for i in range(-w // 2, w // 2 + 1):
97
+ if i < 0:
98
+ j = i + 0.5
99
+ X_axis[k] = j * res_X.value
100
+ k = k + 1
101
+ else:
102
+ if i > 0:
103
+ j = i - 0.5
104
+ X_axis[k] = j * res_X.value
105
+ k = k + 1
106
+
107
+ else:
108
+ for i in range(-w // 2, w // 2):
109
+ X_axis[i + w / 2 + 1] = i * res_X.value
110
+
111
+ for i in range(0, h - 1):
112
+ Y_axis[i] = i * res_Y.value
113
+
114
+ old_resolution_x = res_X.value
115
+         old_resolution_y = res_Y.value
116
+
117
+ # Image setting
118
+ p_array = (ctypes.c_uint * w * h * 4)()
119
+
120
+ fig, ax = plt.subplots()
121
+ usgfw2.return_pixel_values(ctypes.pointer(p_array))
122
+ buffer_as_numpy_array = np.frombuffer(p_array, np.uint)
123
+ reshaped_array = np.reshape(buffer_as_numpy_array, (w, h, 4))
124
+
125
+ img = ax.imshow(
126
+ reshaped_array[:, :, 0:3],
127
+ cmap="gray",
128
+ vmin=0,
129
+ vmax=255,
130
+ origin="lower",
131
+ extent=[np.amin(X_axis), np.amax(X_axis), np.amax(Y_axis), np.amin(Y_axis)],
132
+ )
133
+
134
+         # start of code copied from the original __init__
135
+ self.w = w
136
+ self.h = h
137
+
138
+ (
139
+ self.usgfw2,
140
+ self.p_array,
141
+ self.res_X,
142
+ self.res_Y,
143
+ self.old_resolution_x,
144
+ self.old_resolution_y,
145
+ self.X_axis,
146
+ self.Y_axis,
147
+ self.img,
148
+ ) = (
149
+ usgfw2,
150
+ p_array,
151
+ res_X,
152
+ res_Y,
153
+ old_resolution_x,
154
+ old_resolution_y,
155
+ X_axis,
156
+ Y_axis,
157
+ img,
158
+ )
159
+
160
+ # return the image from telemed
161
+ def imaging(self):
162
+ self.usgfw2.return_pixel_values(ctypes.pointer(self.p_array))
163
+ buffer_as_numpy_array = np.frombuffer(self.p_array, np.uint)
164
+ reshaped_array = np.reshape(buffer_as_numpy_array, (self.w, self.h, 4))
165
+
166
+ self.usgfw2.get_resolution(
167
+ ctypes.pointer(self.res_X), ctypes.pointer(self.res_Y)
168
+ )
169
+ if (
170
+ self.res_X.value != self.old_resolution_x
171
+ or self.res_Y.value != self.old_resolution_y
172
+ ):
173
+ if self.w % 2 == 0:
174
+ k = 0
175
+ for i in range(-self.w // 2, self.w // 2 + 1):
176
+ if i < 0:
177
+ j = i + 0.5
178
+ self.X_axis[k] = j * self.res_X.value
179
+ k = k + 1
180
+ else:
181
+ if i > 0:
182
+ j = i - 0.5
183
+ self.X_axis[k] = j * self.res_X.value
184
+ k = k + 1
185
+ else:
186
+ for i in range(-self.w // 2, self.w // 2):
187
+ self.X_axis[i + self.w / 2 + 1] = i * self.res_X.value
188
+
189
+ for i in range(0, self.h - 1):
190
+ self.Y_axis[i] = i * self.res_Y.value
191
+
192
+ self.old_resolution_x = self.res_X.value
193
+             self.old_resolution_y = self.res_Y.value
194
+
195
+ self.img.set_data(reshaped_array[:, :, 0:3])
196
+ self.img.set_extent(
197
+ [
198
+ np.amin(self.X_axis),
199
+ np.amax(self.X_axis),
200
+ np.amax(self.Y_axis),
201
+ np.amin(self.Y_axis),
202
+ ]
203
+ )
204
+
205
+ # Transfer image format to cv2
206
+ img_array = np.asarray(self.img.get_array())
207
+         img_array = img_array[::-1, :, ::-1] # format same as plt image, RGB to BGR
208
+ return img_array
209
+
210
+
211
+ class Thread(QThread):
212
+ updateFrame = Signal(QImage)
213
+
214
+ def __init__(self, parent=None, args=None):
215
+ QThread.__init__(self, parent)
216
+ self.status = True
217
+ self.cap = True
218
+ self.args = args
219
+
220
+ # init telemed
221
+ if args.video is None:
222
+ self.telemed = Telemed()
223
+
224
+ # init model
225
+ is_async = (
226
+ True if self.args.jobs == "auto" or int(self.args.jobs) > 1 else False
227
+ )
228
+ self.model = Model(
229
+ model_path=self.args.model,
230
+ imgsz=self.args.img_size,
231
+ classes=self.args.classes,
232
+ device=self.args.device,
233
+ plot_mask=self.args.plot_mask,
234
+ conf_thres=self.args.conf_thres,
235
+ is_async=is_async,
236
+ n_jobs=self.args.jobs,
237
+ )
238
+
239
+ def get_stats_fig(self, aorta_widths, aorta_confs, fig_w, fig_h, ts):
240
+ title_font_size = 28
241
+ body_font_size = 24
242
+ img_quality = 100 * np.mean(aorta_confs)
243
+ avg_width = np.mean(aorta_widths)
244
+ max_width = np.max(aorta_widths)
245
+ suggestions = [
246
+ "N/A, within normal limit",
247
+ "Follow up in 5 years",
248
+ "Make an appointment as soon as possible",
249
+ ]
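+         # thresholds used below: mean width < 3 cm -> suggestions[0], 3-5 cm -> suggestions[1], >= 5 cm -> suggestions[2]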
250
+ s = None
251
+ if avg_width < 3:
252
+ s = suggestions[0]
253
+ elif avg_width < 5:
254
+ s = suggestions[1]
255
+ else:
256
+ s = suggestions[2]
257
+
258
+ # region smoothing: method 2, keep the peaks
259
+ # peaks = signal.find_peaks(aorta_widths, height=0.5, distance=40)
260
+ # new_y = []
261
+ # # smooth the values between the peaks
262
+ # start = 0
263
+ # end = peaks[0][0]
264
+ # new_y.extend(signal.savgol_filter(aorta_widths[start:end], end - start, 2))
265
+ # for i in range(len(peaks[0]) - 1):
266
+ # start = peaks[0][i] + 1
267
+ # end = peaks[0][i + 1]
268
+ # new_y.append(aorta_widths[peaks[0][i]]) # add peak value
269
+ # new_y.extend(
270
+ # signal.savgol_filter(
271
+ # aorta_widths[start:end],
272
+ # end - start, # window size used for filtering
273
+ # 2,
274
+ # )
275
+ # ) # order of fitted polynomial
276
+ # # add the last peak
277
+ # new_y.append(aorta_widths[peaks[0][-1]])
278
+ # start = peaks[0][-1] + 1
279
+ # end = len(aorta_widths)
280
+ # new_y.extend(signal.savgol_filter(aorta_widths[start:end], end - start, 2))
281
+ # endregion
282
+
283
+ # region smoothing: method 1, do not keep the peaks
284
+ window_size = 53
285
+ if len(aorta_widths) < window_size:
286
+ window_size = len(aorta_widths) - 1
287
+ new_y = signal.savgol_filter(aorta_widths, window_size, 3)
288
+ # endregion
289
+
290
+ x = np.arange(1, len(aorta_widths) + 1, dtype=int)
291
+
292
+ fig = go.Figure()
293
+ fig.add_trace(
294
+ go.Scatter(
295
+ x=x, y=aorta_widths, mode="lines", line=dict(color="royalblue", width=1)
296
+ )
297
+ )
298
+ fig.add_trace(
299
+ go.Scatter(
300
+ x=x,
301
+ y=new_y,
302
+ mode="lines",
303
+ marker=dict(
304
+ size=3,
305
+ color="mediumpurple",
306
+ ),
307
+ )
308
+ )
309
+ fig.update_layout(
310
+ autosize=False,
311
+ width=fig_w,
312
+ height=fig_h,
313
+ margin=dict(l=50, r=50, b=50, t=400, pad=4),
314
+ paper_bgcolor="LightSteelBlue",
315
+ showlegend=False,
316
+ )
317
+ fig.add_annotation(
318
+ text=f"max={max_width:.2f} cm",
319
+ x=np.argmax(aorta_widths),
320
+ y=np.max(aorta_widths),
321
+ xref="x",
322
+ yref="y",
323
+ showarrow=True,
324
+ font=dict(color="#ffffff"),
325
+ arrowhead=2,
326
+ arrowsize=1,
327
+ arrowwidth=2,
328
+ borderpad=4,
329
+ bgcolor="#ff7f0e",
330
+ opacity=0.8,
331
+ )
332
+ fig.add_annotation(
333
+ text=f"smoothed max={np.max(new_y):.2f} cm",
334
+ x=np.argmax(new_y),
335
+ y=np.max(new_y),
336
+ xref="x",
337
+ yref="y",
338
+ showarrow=True,
339
+ font=dict(color="#ffffff"),
340
+ arrowhead=2,
341
+ arrowsize=1,
342
+ arrowwidth=2,
343
+ ax=-100,
344
+ ay=-50,
345
+ borderpad=4,
346
+ bgcolor="#ff7f0e",
347
+ opacity=0.8,
348
+ )
349
+ fig.add_annotation(
350
+ text="<b>Report of Abdominal Aorta Examination</b>",
351
+ xref="paper",
352
+ yref="paper",
353
+ x=0.5,
354
+ y=2.3,
355
+ showarrow=False,
356
+ font=dict(size=title_font_size),
357
+ )
358
+ fig.add_annotation(
359
+ text=f"Image acquisition quality: {img_quality:.0f}%",
360
+ xref="paper",
361
+ yref="paper",
362
+ x=0,
363
+ y=2.0,
364
+ showarrow=False,
365
+ font=dict(size=body_font_size),
366
+ )
367
+ fig.add_annotation(
368
+ text=f"Aorta Maximal Width: {max_width:.2f} cm",
369
+ xref="paper",
370
+ yref="paper",
371
+ x=0,
372
+ y=1.8,
373
+ showarrow=False,
374
+ font=dict(size=body_font_size),
375
+ )
376
+ fig.add_annotation(
377
+ text=f"Aorta Maximal Width (Smoothed): {np.max(new_y):.2f} cm",
378
+ xref="paper",
379
+ yref="paper",
380
+ x=0,
381
+ y=1.6,
382
+ showarrow=False,
383
+ font=dict(size=body_font_size),
384
+ )
385
+ fig.add_annotation(
386
+ text=f"Average: {avg_width:.2f} cm",
387
+ xref="paper",
388
+ yref="paper",
389
+ x=0,
390
+ y=1.4,
391
+ showarrow=False,
392
+ font=dict(size=body_font_size),
393
+ )
394
+ fig.add_annotation(
395
+ text=f"Suggestion: {s}",
396
+ xref="paper",
397
+ yref="paper",
398
+ x=0,
399
+ y=1.2,
400
+ showarrow=False,
401
+ font=dict(size=body_font_size),
402
+ )
403
+ fig.add_annotation(
404
+ text=f"Generated at {ts}",
405
+ xref="paper",
406
+ yref="paper",
407
+ x=1,
408
+ y=1,
409
+ showarrow=False,
410
+ )
411
+ return fig
412
+
413
+ def run(self):
414
+ one_cm_in_pixels = 48 # hard-coded
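+         # NOTE: assumed pixel-to-cm scale of the demo video / probe view;
+         # recalibrate if the input resolution or imaging depth changes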
415
+ aorta_cm_thre1 = 3
416
+ aorta_cm_thre2 = 5
417
+ black = (0, 0, 0)
418
+ white = (255, 255, 255)
419
+ red = (0, 0, 255)
420
+ green = (0, 255, 0)
421
+
422
+ aorta_widths_stats = [0, 0, 0] # three ranges: <3, 3-5, >5
423
+ aorta_widths = []
424
+ aorta_confs = []
425
+
426
+ expected_fps = None
427
+ frame_count = None
428
+ frame_w = None
429
+ frame_h = None
430
+ if self.args.video:
431
+ self.cap = cv2.VideoCapture(self.args.video)
432
+ expected_fps = self.cap.get(cv2.CAP_PROP_FPS)
433
+ secs_per_frame = 1 / expected_fps
434
+ frame_w, frame_h = int(self.cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(
435
+ self.cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
436
+ )
437
+ frame_count = int(self.cap.get(cv2.CAP_PROP_FRAME_COUNT))
438
+ logger.info(f"Video source FPS: {expected_fps}")
439
+             logger.info(f"Seconds per frame: {secs_per_frame}")
440
+ logger.info(f"Video source resolution (WxH): {frame_w}x{frame_h}")
441
+ logger.info(f"Video source frame count: {frame_count}")
442
+ assert frame_count > 0, "No frame found"
443
+
444
+ n_read_frames = 0
445
+ next_frame_to_infer = 0
446
+ next_frame_to_show = 0
447
+ n_repeat_failure = 0
448
+ is_last_failed = False
449
+ start_time = perf_counter()
450
+ while self.status:
451
+ frame = None
452
+
453
+ # avoid infinite loop
454
+ if n_repeat_failure > 30:
455
+ break
456
+
457
+ # inference
458
+ color_frame, others, results, xyxy, conf = None, None, None, None, None
459
+ if self.model.is_async:
460
+ results = self.model.get_result(next_frame_to_show)
461
+ if results:
462
+ color_frame, others = results
463
+ xyxy, conf, _ = others
464
+ next_frame_to_show += 1
465
+
466
+ if self.model.is_async and self.model.is_free_to_infer_async():
467
+ if self.args.video:
468
+ ret, frame = self.cap.read()
469
+
470
+ if not ret:
471
+ n_repeat_failure += 1 if is_last_failed else 0
472
+ is_last_failed = True
473
+ continue
474
+ else:
475
+ # read the frame from telemed
476
+ # TODO(martin): Check read failure
477
+ frame = self.telemed.imaging()
478
+
479
+ n_read_frames += 1
480
+ self.model.predict_async(frame, next_frame_to_infer)
481
+ next_frame_to_infer += 1
482
+ elif not self.model.is_async:
483
+ if self.args.video:
484
+ ret, frame = self.cap.read()
485
+ if not ret:
486
+ n_repeat_failure += 1 if is_last_failed else 0
487
+ is_last_failed = True
488
+ continue
489
+ else:
490
+ # read the frame from telemed
491
+ # TODO(martin): Check read failure
492
+ frame = self.telemed.imaging()
493
+
494
+ n_read_frames += 1
495
+                 results = self.model.predict(frame)
496
+                 if results is None:
497
+                     continue
498
+                 color_frame, others = results
499
+                 xyxy, conf, _ = others # bbox and confidence
500
+
501
+ is_last_failed = False
502
+
503
+ # check if aorta is within the ROI box, and draw the box
504
+ aorta_width_in_cm = 0
505
+ is_found = xyxy is not None
506
+ is_in_box = False
507
+ is_too_left, is_too_right = False, False
508
+ w, h = color_frame.shape[1], color_frame.shape[0]
509
+ box_w = int(w * 0.1)
510
+ box_h = int(h * 0.5)
511
+ box_top_left = (w // 2 - box_w // 2, h // 4)
512
+ box_bottom_right = (w // 2 + box_w // 2, h // 4 + box_h)
513
+ if xyxy is not None:
514
+ x1, y1, x2, y2 = xyxy
515
+
516
+ # check aorta width
517
+ aorta_width_in_cm = (x2 - x1) / one_cm_in_pixels
518
+ aorta_widths.append(aorta_width_in_cm)
519
+ aorta_confs.append(conf)
520
+ if aorta_width_in_cm < aorta_cm_thre1:
521
+ aorta_widths_stats[0] += 1
522
+ elif aorta_width_in_cm < aorta_cm_thre2:
523
+ aorta_widths_stats[1] += 1
524
+ else:
525
+ aorta_widths_stats[2] += 1
526
+
527
+ # check whether aorta is in the box
528
+ if (
529
+ x1 > box_top_left[0]
530
+ and x2 < box_bottom_right[0]
531
+ and y1 > box_top_left[1]
532
+ and y2 < box_bottom_right[1]
533
+ ):
534
+ is_in_box = True
535
+ is_too_right = x2 > box_bottom_right[0]
536
+ is_too_left = x1 < box_top_left[0]
537
+
538
+ # plot ROI box with color status
539
+ box_color = green if is_in_box else red
540
+ color_frame = cv2.rectangle(
541
+ color_frame, box_top_left, box_bottom_right, box_color, 2
542
+ )
543
+ assert not (
544
+ is_too_left and is_too_right
545
+ ), "Cannot be both too left and too right"
546
+ if is_too_left:
547
+ start_p = (box_top_left[0], int(h * 0.9))
548
+ end_p = (box_bottom_right[0], int(h * 0.9))
549
+ cv2.arrowedLine(color_frame, start_p, end_p, red, 3)
550
+ if is_too_right:
551
+ start_p = (box_bottom_right[0], int(h * 0.9))
552
+ end_p = (box_top_left[0], int(h * 0.9))
553
+ cv2.arrowedLine(color_frame, start_p, end_p, red, 3)
554
+ if is_in_box:
555
+ cv2.putText(
556
+ color_frame,
557
+ "GOOD",
558
+ (box_top_left[0], int(h * 0.9)),
559
+ cv2.FONT_HERSHEY_SIMPLEX,
560
+ 1,
561
+ green,
562
+ 3,
563
+ )
564
+
565
+ # plot aorta width
566
+ text = (
567
+ f"Aorta width: {aorta_width_in_cm:.2f} cm"
568
+ if is_found
569
+ else "Aorta width: N/A"
570
+ )
571
+ cv2.putText(
572
+ color_frame, text, (50, 90), cv2.FONT_HERSHEY_SIMPLEX, 1, white, 3
573
+ )
574
+
575
+ # region FPS
576
+ fps = None
577
+ if n_read_frames > 0:
578
+ fps = n_read_frames / (perf_counter() - start_time)
579
+
580
+ # Slow down the loop if FPS is too high
581
+ if self.args.sync:
582
+ while fps > expected_fps:
583
+ time.sleep(0.001)
584
+ fps = n_read_frames / (perf_counter() - start_time)
585
+
586
+ cv2.putText(
587
+ color_frame,
588
+ f"FPS: {fps:.2f}",
589
+ (50, 30),
590
+ cv2.FONT_HERSHEY_SIMPLEX,
591
+ 1,
592
+ white,
593
+ 3,
594
+ )
595
+ # endregion
596
+
597
+ # Creating and scaling QImage
598
+ h, w, ch = color_frame.shape
599
+ img = QImage(color_frame.data, w, h, ch * w, QImage.Format_BGR888)
600
+ scaled_img = img.scaled(video_w, video_h, Qt.KeepAspectRatio)
601
+
602
+ # Emit signal
603
+ self.updateFrame.emit(scaled_img)
604
+
605
+ if self.args.video:
606
+ progress = 100 * n_read_frames / frame_count
607
+ fps_msg = f", FPS: {fps:.2f}" if fps is not None else ""
608
+ print(
609
+ f"Processed {n_read_frames}/{frame_count} ({progress:.2f}%) frames"
610
+ + fps_msg,
611
+ end="\r" if n_read_frames < frame_count else os.linesep,
612
+ )
613
+ if n_read_frames >= frame_count:
614
+ logger.info("Finished processing video")
615
+ break
616
+ if self.args.video:
617
+ self.cap.release()
618
+
619
+ if not self.status:
620
+ logger.info("Stopped by user")
621
+ return
622
+
623
+ # draw a black image with frame width & height
624
+ # with some text in center indicating generating report
625
+ # it's just a dummy step to make demo more real
626
+ im = np.zeros((frame_h, frame_w, 3), np.uint8)
627
+ cv2.putText(
628
+ im,
629
+ "Generating report for you...",
630
+ (frame_w // 3, frame_h // 2),
631
+ cv2.FONT_HERSHEY_SIMPLEX,
632
+ 1,
633
+ white,
634
+ 3,
635
+ )
636
+ img = QImage(im.data, frame_w, frame_h, ch * w, QImage.Format_BGR888)
637
+ scaled_img = img.scaled(video_w, video_h, Qt.KeepAspectRatio)
638
+ self.updateFrame.emit(scaled_img)
639
+ time.sleep(3)
640
+
641
+ # plot aorta width tracing line chart
642
+ now_t = datetime.now()
643
+ ts1 = now_t.strftime("%Y%m%d_%H%M%S")
644
+ ts2 = now_t.strftime("%Y/%m/%d %I:%M:%S")
645
+ Path("runs").mkdir(parents=True, exist_ok=True)
646
+ # np.save("runs/aorta_widths.npy", aorta_widths)
647
+ fig_out_p = f"runs/aorta_report_{ts1}.jpeg"
648
+ fig = self.get_stats_fig(aorta_widths, aorta_confs, video_w, video_h, ts2)
649
+
650
+ # This may hang under Windows: https://github.com/plotly/Kaleido/issues/110
651
+ # The workaround is to install older kaleido version (see requirements.txt)
652
+ fig.write_image(fig_out_p)
653
+
654
+ logger.info(f"Saved aorta report: {fig_out_p}")
655
+ img_bytes = fig.to_image(format="jpg", width=video_w, height=video_h)
656
+ line_chart = np.array(Image.open(io.BytesIO(img_bytes)))
657
+ line_chart = cv2.cvtColor(line_chart, cv2.COLOR_RGB2BGR)
658
+ h, w, ch = line_chart.shape
659
+ img = QImage(line_chart.data, video_w, video_h, ch * w, QImage.Format_BGR888)
660
+ scaled_img = img.scaled(w, h, Qt.KeepAspectRatio)
661
+ # Emit signal
662
+ self.updateFrame.emit(scaled_img)
663
+ time.sleep(5)
664
+
665
+ # keep report open until user closes the window
666
+ while self.status and not self.args.exit_on_end:
667
+ time.sleep(0.1)
668
+
669
+
670
+ class Window(QMainWindow):
671
+ def __init__(self, args=None):
672
+ super().__init__()
673
+ # Title and dimensions
674
+ self.setWindowTitle("Demo")
675
+ self.setGeometry(0, 0, 800, 500)
676
+
677
+ # Create a label for the display camera
678
+ self.label = QLabel(self)
679
+ # self.label.setFixedSize(self.width(), self.height())
680
+ self.label.setFixedSize(video_w, video_h)
681
+
682
+ # Thread in charge of updating the image
683
+ self.th = Thread(self, args)
684
+ self.th.finished.connect(self.close)
685
+ self.th.updateFrame.connect(self.setImage)
686
+
687
+ # Buttons layout
688
+ buttons_layout = QHBoxLayout()
689
+ self.button1 = QPushButton("Start")
690
+ self.button2 = QPushButton("Stop/Close")
691
+ self.button1.setSizePolicy(QSizePolicy.Preferred, QSizePolicy.Expanding)
692
+ self.button2.setSizePolicy(QSizePolicy.Preferred, QSizePolicy.Expanding)
693
+ buttons_layout.addWidget(self.button2)
694
+ buttons_layout.addWidget(self.button1)
695
+
696
+ right_layout = QHBoxLayout()
697
+ # right_layout.addWidget(self.group_model, 1)
698
+ right_layout.addLayout(buttons_layout, 1)
699
+
700
+ # Main layout
701
+ layout = QVBoxLayout()
702
+ layout.addWidget(self.label)
703
+ layout.addLayout(right_layout)
704
+
705
+ # Central widget
706
+ widget = QWidget(self)
707
+ widget.setLayout(layout)
708
+ self.setCentralWidget(widget)
709
+
710
+ # Connections
711
+ self.button1.clicked.connect(self.start)
712
+ self.button2.clicked.connect(self.kill_thread)
713
+ self.button2.setEnabled(False)
714
+
715
+ if args is not None and args.start_on_open:
716
+ # start thread
717
+ self.start()
718
+
719
+ @Slot()
720
+ def kill_thread(self):
721
+ logger.info("Finishing...")
722
+ self.th.status = False
723
+ time.sleep(1)
724
+ # Give time for the thread to finish
725
+ self.button2.setEnabled(False)
726
+ self.button1.setEnabled(True)
727
+ cv2.destroyAllWindows()
728
+ self.th.exit()
729
+ # Give time for the thread to finish
730
+ time.sleep(1)
731
+
732
+ @Slot()
733
+ def start(self):
734
+ logger.info("Starting...")
735
+ self.button2.setEnabled(True)
736
+ self.button1.setEnabled(False)
737
+ self.th.start()
738
+ logger.info("Thread started")
739
+
740
+ @Slot(QImage)
741
+ def setImage(self, image):
742
+ self.label.setPixmap(QPixmap.fromImage(image))
743
+
744
+
745
+ if __name__ == "__main__":
746
+ # get user inputs using argparse
747
+ parser = argparse.ArgumentParser()
748
+ parser.add_argument(
749
+ "--video",
750
+ type=str,
751
+ default=None,
752
+ help="path to video file, if None (default) would read from telemed",
753
+ )
754
+ parser.add_argument(
755
+ "--model",
756
+ type=str,
757
+ default="best_openvino_model/best.xml",
758
+ help="path to model file",
759
+ )
760
+ parser.add_argument("--img-size", type=int, default=640, help="image size")
761
+ parser.add_argument(
762
+ "--classes", nargs="+", type=int, default=[0], help="filter by class"
763
+ )
764
+ parser.add_argument("--device", type=str, default="CPU", help="device to use")
765
+ parser.add_argument("--sync", action="store_true", help="sync video FPS")
766
+ parser.add_argument("--plot-mask", action="store_true", help="plot mask")
767
+ parser.add_argument("--conf-thres", type=float, default=0.25, help="conf thresh")
768
+ parser.add_argument("--jobs", type=str, default=1, help="num of jobs, async if > 1")
769
+ parser.add_argument("--start-on-open", action="store_true", help="start on open")
770
+ parser.add_argument("--exit-on-end", action="store_true", help="exit if video ends")
771
+ args = parser.parse_args()
772
+ assert (
773
+ args.jobs == "auto" or int(args.jobs) > 0
774
+ ), f"--jobs must be > 0 or auto, got {args.jobs}"
775
+ if args.video:
776
+ assert Path(args.video).exists(), f"Video file {args.video} not found"
777
+ assert Path(args.model).exists(), f"Model file {args.model} not found"
778
+ app = QApplication()
779
+ w = Window(args)
780
+ w.show()
781
+ sys.exit(app.exec())
demo_headless.sh ADDED
@@ -0,0 +1,27 @@
1
+ #!/bin/bash
2
+
3
+ VIRTUAL_DISPLAY_NUM=99
4
+
5
+ OUTPUT_VIDEO="runs/demo_recording_$(date +"%Y-%m-%d_%H-%M-%S").mp4"
6
+
7
+ # start xvfb server
8
+ Xvfb :$VIRTUAL_DISPLAY_NUM -screen 0 1280x720x24 > /dev/null & XVFB_PID=$!
9
+
10
+ # start recording
11
+ ffmpeg -f x11grab -draw_mouse 0 -video_size 1280x720 \
12
+ -i :$VIRTUAL_DISPLAY_NUM \
13
+ -codec:v libx264 -r 25 $OUTPUT_VIDEO \
14
+ > /dev/null 2>&1 < /dev/null & FFMPEG_PID=$!
15
+
16
+ # start the demo program
17
+ DISPLAY=:$VIRTUAL_DISPLAY_NUM QT_QPA_PLATFORM=xcb \
18
+ python demo.py "$@" --start-on-open --exit-on-end
19
+
20
+ # kill the recording
21
+ kill $FFMPEG_PID
22
+
23
+ # kill xvfb server
24
+ kill $XVFB_PID
25
+
26
+ # success msg
27
+ echo -e "Recording saved: $OUTPUT_VIDEO"
eval.py ADDED
@@ -0,0 +1,397 @@
1
+ import typer
2
+ from typing import Optional
3
+ from pathlib import Path
4
+ from loguru import logger
5
+ import cv2
6
+ from tqdm import tqdm
7
+ import numpy as np
8
+ import pandas as pd
9
+ import shutil
10
+ from datetime import datetime
11
+ import matplotlib
12
+ import os
13
+
14
+ matplotlib.use("Agg") # use non-interactive backend
15
+ import matplotlib.pyplot as plt
16
+
17
+ from predict import Model
18
+
19
+ app = typer.Typer()
20
+
21
+
22
+ @app.command(help="Export videos to images (to a dir per video)")
23
+ def export_videos_to_images(
24
+ input_dir: Path = typer.Argument(..., help="Input directory"),
25
+ output_dir: Path = typer.Argument(..., help="Output directory"),
26
+ ext: str = typer.Option("avi", help="Video Extension"),
27
+ path_filter: Optional[str] = typer.Option(None, help="input path filter"),
28
+ patient_prefix: Optional[bool] = typer.Option(
29
+ True, help="use patient info as output dir prefix"
30
+ ),
31
+ copy_extent: Optional[bool] = typer.Option(
32
+ True, help="copy extent files to output dir"
33
+ ),
34
+ ):
35
+ # log all the arguments passed in
36
+ logger.info(f"Function called with arguments: {locals()}")
37
+
38
+ # find all video files in input_dir
39
+ input_dir = Path(input_dir)
40
+ output_dir = Path(output_dir)
41
+ output_dir.mkdir(parents=True, exist_ok=True)
42
+ video_files = list(input_dir.glob(f"**/*.{ext.lower()}"))
43
+ video_files.extend(list(input_dir.glob(f"**/*.{ext.upper()}")))
44
+ logger.info(f"# of avi videos found: {len(video_files)}")
45
+ if path_filter is not None:
46
+ logger.info(f"Filtering videos with {path_filter}")
47
+ video_files = [x for x in video_files if path_filter in str(x)]
48
+ logger.info(f"# of avi videos found after filtering: {len(video_files)}")
49
+
50
+ video_files.sort(key=lambda x: x.name) # sort by name ascending
51
+ # log each video path after filtering, one per line
52
+ logger.info(f"{os.linesep}" + f"{os.linesep}".join([str(x) for x in video_files]))
53
+
54
+ # check that all the extent files exist
55
+ # the extent (.csv) should be in the same directory as the video
56
+ # the video filename would start with video_
57
+ # the extent filename would start with extents_
58
+ if copy_extent:
59
+ all_exist = True
60
+ for video_path in video_files:
61
+ extent_filename = video_path.stem.replace("video_", "extents_")
62
+ extent_path = video_path.parent / f"{extent_filename}.csv"
63
+ if not extent_path.exists():
64
+ logger.error(f"Extent file {extent_path} does not exist")
65
+ all_exist = False
66
+ if not all_exist:
67
+ logger.error("Extent files do not exist for all videos")
68
+ return
69
+
70
+ for video_path in video_files:
71
+ # copy extent file to output dir
72
+ if copy_extent:
73
+ extent_filename = video_path.stem.replace("video_", "extents_")
74
+ extent_path = video_path.parent / f"{extent_filename}.csv"
75
+ shutil.copy(extent_path, output_dir)
76
+
77
+ # Dir structure: Patient_Info / [PATIENT_ID] / [DATE] / video / xxx.avi
78
+ patient_id = (
79
+ video_path.parent.parent.parent.name
80
+ ) # WARNING: Hard-coded based on dir structure
81
+
82
+ video_name = video_path.stem
83
+ logger.info(f"Processing video {video_name} of patient {patient_id}")
84
+
85
+ # create subdirectory for each video
86
+ sub_dir = output_dir / (
87
+ f"{patient_id}-{video_name}" if patient_prefix else video_name
88
+ )
89
+ sub_dir.mkdir(parents=True, exist_ok=True)
90
+
91
+ # read video and export frames
92
+ cap = cv2.VideoCapture(str(video_path))
93
+ frame_count = 0
94
+ while cap.isOpened():
95
+ ret, frame = cap.read()
96
+ if ret:
97
+ # padding frame_count with zeros
98
+ cv2.imwrite(str(sub_dir / f"{frame_count:03}.jpg"), frame)
99
+ frame_count += 1
100
+ else:
101
+ break
102
+
103
+
104
+ @app.command(help="Evaluate model on a directory of images")
105
+ def eval(
106
+ input_dir: Path = typer.Argument(..., help="Input directory"),
107
+ input_model: Path = typer.Argument(..., help="Input model"),
108
+ imgsz: int = typer.Option(640, help="Image size"),
109
+ class_id: int = typer.Option(0, help="Class id to filter"),
110
+ conf_thresh: float = typer.Option(0.5, help="Confidence threshold"),
111
+ video_ext: str = typer.Option("avi", help="Video Extension"),
112
+ out_dir: Path = typer.Option("runs", help="Output directory"),
113
+ gt_csv_path: Path = typer.Option(
114
+ "results_20230822_aorta_identified_added_by_Ray.csv",
115
+ help="Ground truth csv path",
116
+ ),
117
+ no_extent: Optional[bool] = typer.Option(True, help="no extent file"),
118
+ write_viz: Optional[bool] = typer.Option(False, help="write viz images"),
119
+ gt_column_name: str = typer.Option("aorta_identified", help="Ground truth column"),
120
+ ):
121
+ # check inputs are valid
122
+ assert input_dir.exists(), f"Input directory {input_dir} does not exist"
123
+ assert input_model.exists(), f"Input model {input_model} does not exist"
124
+ assert gt_csv_path.exists(), f"Ground truth csv {gt_csv_path} does not exist"
125
+
126
+ # load model
127
+ model = Model(
128
+ model_path=str(input_model),
129
+ imgsz=imgsz,
130
+ classes=[class_id], # filter by class id, only aorta
131
+ device="CPU",
132
+ plot_mask=True,
133
+ conf_thres=conf_thresh,
134
+ is_async=False,
135
+ n_jobs=1,
136
+ )
137
+
138
+ # setup output directory
139
+ out_dir = Path(out_dir)
140
+ # create a sub output directory of current date and time
141
+ start_t = datetime.now()
142
+ start_timestamp = start_t.strftime("%Y_%m_%d_%H_%M_%S")
143
+ out_dir = out_dir / f"max_aorta_result-{start_timestamp}"
144
+ out_dir.mkdir(parents=True, exist_ok=True)
145
+
146
+ # log to file
147
+ logger.add(str(out_dir.absolute()) + "/eval_{time}.log")
148
+
149
+ out_csv_p = out_dir / "results.csv"
150
+ out_trace_csv_p = out_dir / "trace.csv"
151
+ logger.info(f"Output directory: {out_dir}")
152
+
153
+ # find all directories in input_dir
154
+ input_dir = Path(input_dir)
155
+ sub_dirs = [x for x in input_dir.iterdir() if x.is_dir()]
156
+ sub_dirs.sort(key=lambda x: x.name) # sort sub_dirs by name ascending
157
+ logger.info(f"# of subdirectories found: {len(sub_dirs)}")
158
+ num_sub_dirs = len(sub_dirs)
159
+ has_patient_prefix = not sub_dirs[0].name.startswith("video")
160
+
161
+ # setup csv headers
162
+ trace_headers = ["video", "image_idx", "aorta_pixels", "aorta_mm", "conf"]
163
+ headers = ["video", "max_aorta_pixels", "max_aorta_mm", "max_image_idx", "conf"]
164
+ if has_patient_prefix:
165
+ headers.insert(0, "patient_info")
166
+
167
+ # loop through each subdirectory of images
168
+ for idx, sub_dir in enumerate(sub_dirs):
169
+ max_aorta_w = -1 # max aorta width in pixels
170
+ max_aorta_w_mm = -1 # max aorta width in mm
171
+ max_aorta_viz = None
172
+ max_aorta_im_path = None
173
+ max_center_x, max_center_y = -1, -1
174
+ max_conf = None
175
+ max_im_n = ""
176
+
177
+ # read the extent file of the images
178
+ # the extent file should be in the same directory as the video
179
+ video_filename = (
180
+ sub_dir.name
181
+ if not has_patient_prefix
182
+ else "-".join(sub_dir.name.split("-")[1:])
183
+ )
184
+ extent_filename = video_filename.replace("video_", "extents_")
185
+ extent_file = sub_dir.parent / f"{extent_filename}.csv"
186
+ extents = None
187
+ if not no_extent:
188
+ assert extent_file.exists(), f"Extent file {extent_file} does not exist"
189
+ extents = pd.read_csv(extent_file).to_dict("records")
190
+
191
+ logger.info(f"Processing subdir {sub_dir.name} ({idx+1}/{num_sub_dirs})")
192
+ # find all images in sub_dir
193
+ images = list(sub_dir.glob("*.jpg"))
194
+ # Sort the list of images in ascending order by name
195
+ images.sort(key=lambda img: img.name)
196
+ logger.info(f"\t# of images found: {len(images)}")
197
+
198
+ # create a viz output directory for each sub_dir
199
+ out_sub_viz_dir = out_dir / sub_dir.name
200
+ Path(out_sub_viz_dir).mkdir(parents=True, exist_ok=True)
201
+
202
+ for im_idx, image_path in enumerate(tqdm(images)):
203
+ # read image
204
+ cv_frame = cv2.imread(str(image_path))
205
+ cv_width = cv_frame.shape[1]
206
+
207
+ # inference
208
+ viz_frame, results = model.predict(cv_frame)
209
+ bbox_xyxy = results[0]
210
+ conf = results[1]
211
+ masks = results[2]
212
+
213
+ # output viz image if the flag is set
214
+ if write_viz:
215
+ cv2.imwrite(
216
+ str(out_sub_viz_dir / f"{image_path.stem}_viz.jpg"),
217
+ viz_frame,
218
+ )
219
+
220
+ trace_row = [
221
+ f"{sub_dir.name}.{video_ext}",
222
+ image_path.stem,
223
+ -1,
224
+ -1,
225
+ conf,
226
+ ]
227
+
228
+ if masks is not None or bbox_xyxy is not None:
229
+ # method 1: find the largest contour
230
+ # find min enclosing circle of mask
231
+ # mask = (masks * 255).astype(np.uint8)
232
+ # contours, _ = cv2.findContours(
233
+ # mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
234
+ # )
235
+ # largest_contour = max(contours, key=cv2.contourArea)
236
+ # (center_x, center_y), radius = cv2.minEnclosingCircle(largest_contour)
237
+ # aorta_width = radius * 2
238
+
239
+ # method 2: use the height of the bbox as a measure of aorta width
240
+ # because we observed that the width of the bbox is too large
241
+ aorta_width = bbox_xyxy[3] - bbox_xyxy[1]
242
+
243
+ # get physical unit
244
+ w_mm_left, w_mm_right, w_mm_per_pixel = None, None, None
245
+ if not no_extent:
246
+ w_mm_left = extents[im_idx]["Width-Left(mm)"]
247
+ w_mm_right = extents[im_idx]["Width-Right(mm)"]
248
+ assert w_mm_right > 0 and w_mm_left < 0
249
+ w_mm_per_pixel = (w_mm_right - w_mm_left) / cv_width
250
+
251
+ # update trace when aorta is found
252
+ trace_row[2] = aorta_width
253
+ trace_row[3] = aorta_width * w_mm_per_pixel if not no_extent else None
254
+
255
+ # output viz image when aorta is found
256
+ cv2.imwrite(
257
+ str(out_sub_viz_dir / f"{image_path.stem}_viz.jpg"),
258
+ viz_frame,
259
+ )
260
+ # copy the raw image to the output directory
261
+ shutil.copy(image_path, out_sub_viz_dir)
262
+
263
+ if aorta_width > max_aorta_w:
264
+ max_aorta_w = aorta_width
265
+ max_aorta_viz = viz_frame.copy()
266
+ max_aorta_im_path = image_path
267
+
268
+ # Note: only need to calculate the center if using method 1
269
+ # max_center_x = center_x
270
+ # max_center_y = center_y
271
+
272
+ max_im_n = image_path.stem
273
+ max_conf = conf
274
+ logger.info(
275
+ f"\tNew max aorta (pixels): {max_aorta_w:.2f}, conf: {max_conf:.2f}"
276
+ )
277
+
278
+ # convert pixels to mm
279
+ max_aorta_w_mm = (
280
+ max_aorta_w * w_mm_per_pixel if not no_extent else None
281
+ )
282
+
283
+ # save trace to csv
284
+ df = pd.DataFrame([trace_row], columns=trace_headers)
285
+ df.to_csv(
286
+ out_trace_csv_p,
287
+ mode="a",
288
+ header=not out_trace_csv_p.exists(),
289
+ index=False,
290
+ float_format="%.3f",
291
+ )
292
+
293
+ if max_aorta_w > 0:
294
+ logger.info(f"\tMax aorta (pixels): {max_aorta_w:.2f}")
295
+ # copy the raw image to the output directory
296
+ out_raw_p = out_dir / f"raw_{sub_dir.name}_{max_im_n}.jpg"
297
+ shutil.copy(max_aorta_im_path, out_raw_p)
298
+
299
+ # method 1 viz: draw enclosing circle on max_aorta_viz
300
+ # plot circle on max_aorta_viz
301
+ # cv2.circle(
302
+ # max_aorta_viz,
303
+ # (int(max_center_x), int(max_center_y)),
304
+ # int(max_aorta_w / 2),
305
+ # (0, 255, 0),
306
+ # 2,
307
+ # )
308
+
309
+ # region Save the image with extent
310
+ # convert the BGR image to RGB image
311
+ out_viz_p = out_dir / f"viz_{sub_dir.name}_{max_im_n}.jpg"
312
+ max_aorta_viz_rgb = cv2.cvtColor(max_aorta_viz, cv2.COLOR_BGR2RGB)
313
+ # Use matplotlib to save the image
314
+ # Get the size of the image in inches
315
+ dpi = plt.rcParams["figure.dpi"] # Get the default dpi value
316
+ figsize = (
317
+ max_aorta_viz_rgb.shape[1] / dpi,
318
+ max_aorta_viz_rgb.shape[0] / dpi,
319
+ ) # width, height
320
+ # Create a new figure with the same aspect ratio as the image
321
+ fig = plt.figure(figsize=figsize)
322
+ if not no_extent:
323
+ # specify the extent of the image in the form [xmin, xmax, ymin, ymax]
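+ # Note: im_idx here is the index of the last frame processed in the loop above;
+ # this assumes the extent values are constant across all frames of a video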
324
+ extent = [
325
+ extents[im_idx]["Width-Left(mm)"],
326
+ extents[im_idx]["Width-Right(mm)"],
327
+ extents[im_idx]["Depth-Bottom(mm)"],
328
+ extents[im_idx]["Depth-Top(mm)"],
329
+ ]
330
+ plt.imshow(max_aorta_viz_rgb, extent=extent)
331
+ plt.xlabel("Width [mm]")
332
+ plt.ylabel("Depth [mm]")
333
+ else:
334
+ plt.imshow(max_aorta_viz_rgb)
335
+ plt.savefig(str(out_viz_p))
336
+ plt.close(fig)
337
+ # cv2.imwrite(str(out_viz_p), max_aorta_viz)
338
+ # endregion
339
+ else:
340
+ logger.warning(f"\tNo aorta found in {sub_dir.name}")
341
+ patient_info = sub_dir.name.split("-")[0] if has_patient_prefix else ""
342
+ row = [
343
+ f"{sub_dir.name}.{video_ext}",
344
+ max_aorta_w,
345
+ max_aorta_w_mm,
346
+ max_im_n,
347
+ max_conf,
348
+ ]
349
+ if has_patient_prefix:
350
+ row.insert(0, patient_info)
351
+ # remove patient info from sub_dir name
352
+ video_name = "-".join(sub_dir.name.split("-")[1:]) + f".{video_ext}"
353
+ row[1] = video_name
354
+
355
+ # export results to csv
356
+ # If file does not exist, this will create it, otherwise it will append to the file
357
+ df = pd.DataFrame([row], columns=headers)
358
+ df.to_csv(
359
+ out_csv_p,
360
+ mode="a",
361
+ header=not out_csv_p.exists(),
362
+ index=False,
363
+ float_format="%.3f",
364
+ )
365
+
366
+ # join the results with ground truth to add the ground truth column
367
+ # df_results = pd.read_csv(out_csv_p)
368
+ # df_gt = pd.read_csv(gt_csv_path)[["video", gt_column_name]] # id & gt columns
369
+ # df_gt_first = df_gt.drop_duplicates(subset="video", keep="first") # avoid new rows
370
+ # df_merged = pd.merge(df_results, df_gt_first, on="video", how="left")
371
+ # df_merged.to_csv(out_csv_p, header=True, index=False, float_format="%.3f")
372
+
373
+ # # show stats
374
+ # value_counts_with_nan = df_merged[gt_column_name].value_counts(dropna=False)
375
+ # total = len(df_merged)
376
+ # percentage = (value_counts_with_nan / total) * 100
377
+ # # Combine value counts and percentages into a DataFrame for better visualization
378
+ # stats = pd.DataFrame({"Count": value_counts_with_nan, "Percentage": percentage})
379
+ # logger.info(stats)
380
+
381
+ logger.info(f"Done! Results written to {out_csv_p}")
382
+
383
+
384
+ @app.command(help="Copy source images to viz result folder")
385
+ def copy_srcimg_to_vizdir(
386
+ src_img_dir: Path = typer.Argument(..., help="Source Images root directory"),
387
+ out_viz_dir: Path = typer.Argument(..., help="Target viz directory"),
388
+ ):
389
+ vizs = list(Path(out_viz_dir).glob("**/*.jpg"))
390
+ for viz in vizs:
391
+ splits = viz.stem.split("_")
392
+ ori_img = Path(src_img_dir) / splits[1] / f"{splits[2]}.jpg"
393
+ shutil.copy(ori_img, Path(out_viz_dir) / f"{viz.stem}_src.jpg")
394
+
395
+
396
+ if __name__ == "__main__":
397
+ app()
plots.py ADDED
@@ -0,0 +1,303 @@
1
+ import torch
2
+ import torch.nn.functional as F
3
+ import cv2
4
+ from PIL import Image, ImageDraw  # ImageDraw is required by the PIL branch of Annotator
5
+ import numpy as np
6
+
7
+
8
+ class Colors:
9
+ def __init__(self):
10
+ # hexs = matplotlib.colors.TABLEAU_COLORS.values()
11
+ hexs = (
12
+ "00FF00", # aorta class 0
13
+ "FF3838",
14
+ "FF701F",
15
+ "FFB21D",
16
+ "CFD231",
17
+ "48F90A",
18
+ "92CC17",
19
+ "3DDB86",
20
+ "1A9334",
21
+ "00D4BB",
22
+ "2C99A8",
23
+ "00C2FF",
24
+ "344593",
25
+ "6473FF",
26
+ "0018EC",
27
+ "8438FF",
28
+ "520085",
29
+ "CB38FF",
30
+ "FF95C8",
31
+ "FF37C7",
32
+ )
33
+ self.palette = [self.hex2rgb(f"#{c}") for c in hexs]
34
+ self.n = len(self.palette)
35
+
36
+ def __call__(self, i, bgr=False):
37
+ c = self.palette[int(i) % self.n]
38
+ return (c[2], c[1], c[0]) if bgr else c
39
+
40
+ @staticmethod
41
+ def hex2rgb(h): # rgb order (PIL)
42
+ return tuple(int(h[1 + i : 1 + i + 2], 16) for i in (0, 2, 4))
43
+
44
+
45
+ colors = Colors() # create instance for 'from utils.plots import colors'
46
+
47
+
48
+ def is_ascii(s=""):
49
+ # Is string composed of all ASCII (no UTF) characters? (note str().isascii() introduced in python 3.7)
50
+ s = str(s) # convert list, tuple, None, etc. to str
51
+ return len(s.encode().decode("ascii", "ignore")) == len(s)
52
+
53
+
54
+ def clip_boxes(boxes, shape):
55
+ # Clip boxes (xyxy) to image shape (height, width)
56
+ if isinstance(boxes, torch.Tensor): # faster individually
57
+ boxes[:, 0].clamp_(0, shape[1]) # x1
58
+ boxes[:, 1].clamp_(0, shape[0]) # y1
59
+ boxes[:, 2].clamp_(0, shape[1]) # x2
60
+ boxes[:, 3].clamp_(0, shape[0]) # y2
61
+ else: # np.array (faster grouped)
62
+ boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, shape[1]) # x1, x2
63
+ boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, shape[0]) # y1, y2
64
+
65
+
66
+ def scale_boxes(img1_shape, boxes, img0_shape, ratio_pad=None):
67
+ # Rescale boxes (xyxy) from img1_shape to img0_shape
68
+ if ratio_pad is None: # calculate from img0_shape
69
+ gain = min(
70
+ img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1]
71
+ ) # gain = old / new
72
+ pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (
73
+ img1_shape[0] - img0_shape[0] * gain
74
+ ) / 2 # wh padding
75
+ else:
76
+ gain = ratio_pad[0][0]
77
+ pad = ratio_pad[1]
78
+
79
+ boxes[:, [0, 2]] -= pad[0] # x padding
80
+ boxes[:, [1, 3]] -= pad[1] # y padding
81
+ boxes[:, :4] /= gain
82
+ clip_boxes(boxes, img0_shape)
83
+ return boxes
84
+
85
+
86
+ def crop_mask(masks, boxes):
87
+ """
88
+ "Crop" predicted masks by zeroing out everything not in the predicted bbox.
89
+ Vectorized by Chong (thanks Chong).
90
+ Args:
91
+ - masks should be a size [h, w, n] tensor of masks
92
+ - boxes should be a size [n, 4] tensor of bbox coords in relative point form
93
+ """
94
+
95
+ n, h, w = masks.shape
96
+ x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, 1) # x1 shape(1,1,n)
97
+ r = torch.arange(w, device=masks.device, dtype=x1.dtype)[
98
+ None, None, :
99
+ ] # rows shape(1,w,1)
100
+ c = torch.arange(h, device=masks.device, dtype=x1.dtype)[
101
+ None, :, None
102
+ ] # cols shape(h,1,1)
103
+
104
+ return masks * ((r >= x1) * (r < x2) * (c >= y1) * (c < y2))
105
+
106
+
107
+ def process_mask(protos, masks_in, bboxes, shape, upsample=False):
108
+ """
109
+ Crop before upsample.
110
+ proto_out: [mask_dim, mask_h, mask_w]
111
+ out_masks: [n, mask_dim], n is number of masks after nms
112
+ bboxes: [n, 4], n is number of masks after nms
113
+ shape:input_image_size, (h, w)
114
+ return: h, w, n
115
+ """
116
+
117
+ c, mh, mw = protos.shape # CHW
118
+ ih, iw = shape
119
+ masks = (masks_in @ protos.float().view(c, -1)).sigmoid().view(-1, mh, mw) # CHW
120
+
121
+ downsampled_bboxes = bboxes.clone()
122
+ downsampled_bboxes[:, 0] *= mw / iw
123
+ downsampled_bboxes[:, 2] *= mw / iw
124
+ downsampled_bboxes[:, 3] *= mh / ih
125
+ downsampled_bboxes[:, 1] *= mh / ih
126
+
127
+ masks = crop_mask(masks, downsampled_bboxes) # CHW
128
+ if upsample:
129
+ masks = F.interpolate(masks[None], shape, mode="bilinear", align_corners=False)[
130
+ 0
131
+ ] # CHW
132
+ return masks.gt_(0.5)
133
+
134
+
135
+ def scale_image(im1_shape, masks, im0_shape, ratio_pad=None):
136
+ """
137
+ img1_shape: model input shape, [h, w]
138
+ img0_shape: origin pic shape, [h, w, 3]
139
+ masks: [h, w, num]
140
+ """
141
+ # Rescale coordinates (xyxy) from im1_shape to im0_shape
142
+ if ratio_pad is None: # calculate from im0_shape
143
+ gain = min(
144
+ im1_shape[0] / im0_shape[0], im1_shape[1] / im0_shape[1]
145
+ ) # gain = old / new
146
+ pad = (im1_shape[1] - im0_shape[1] * gain) / 2, (
147
+ im1_shape[0] - im0_shape[0] * gain
148
+ ) / 2 # wh padding
149
+ else:
150
+ pad = ratio_pad[1]
151
+ top, left = int(pad[1]), int(pad[0]) # y, x
152
+ bottom, right = int(im1_shape[0] - pad[1]), int(im1_shape[1] - pad[0])
153
+
154
+ if len(masks.shape) < 2:
155
+ raise ValueError(
156
+ f'"len of masks shape" should be 2 or 3, but got {len(masks.shape)}'
157
+ )
158
+ masks = masks[top:bottom, left:right]
159
+ # masks = masks.permute(2, 0, 1).contiguous()
160
+ # masks = F.interpolate(masks[None], im0_shape[:2], mode='bilinear', align_corners=False)[0]
161
+ # masks = masks.permute(1, 2, 0).contiguous()
162
+ masks = cv2.resize(masks, (im0_shape[1], im0_shape[0]))
163
+
164
+ if len(masks.shape) == 2:
165
+ masks = masks[:, :, None]
166
+ return masks
167
+
168
+
169
+ class Annotator:
170
+ # YOLOv5 Annotator for train/val mosaics and jpgs and detect/hub inference annotations
171
+ def __init__(
172
+ self,
173
+ im,
174
+ line_width=None,
175
+ font_size=None,
176
+ font="Arial.ttf",
177
+ pil=False,
178
+ example="abc",
179
+ ):
180
+ assert (
181
+ im.data.contiguous
182
+ ), "Image not contiguous. Apply np.ascontiguousarray(im) to Annotator() input images."
183
+ non_ascii = not is_ascii(
184
+ example
185
+ ) # non-latin labels, i.e. asian, arabic, cyrillic
186
+ self.pil = pil or non_ascii
187
+ if self.pil: # use PIL
188
+ self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
189
+ self.draw = ImageDraw.Draw(self.im)
190
+ self.font = check_pil_font(
191
+ font="Arial.Unicode.ttf" if non_ascii else font,
192
+ size=font_size or max(round(sum(self.im.size) / 2 * 0.035), 12),
193
+ )
194
+ else: # use cv2
195
+ self.im = im
196
+ self.lw = line_width or max(round(sum(im.shape) / 2 * 0.003), 2) # line width
197
+
198
+ def box_label(
199
+ self, box, label="", color=(128, 128, 128), txt_color=(255, 255, 255)
200
+ ):
201
+ # Add one xyxy box to image with label
202
+ if self.pil or not is_ascii(label):
203
+ self.draw.rectangle(box, width=self.lw, outline=color) # box
204
+ if label:
205
+ w, h = self.font.getsize(label) # text width, height
206
+ outside = box[1] - h >= 0 # label fits outside box
207
+ self.draw.rectangle(
208
+ (
209
+ box[0],
210
+ box[1] - h if outside else box[1],
211
+ box[0] + w + 1,
212
+ box[1] + 1 if outside else box[1] + h + 1,
213
+ ),
214
+ fill=color,
215
+ )
216
+ # self.draw.text((box[0], box[1]), label, fill=txt_color, font=self.font, anchor='ls') # for PIL>8.0
217
+ self.draw.text(
218
+ (box[0], box[1] - h if outside else box[1]),
219
+ label,
220
+ fill=txt_color,
221
+ font=self.font,
222
+ )
223
+ else: # cv2
224
+ p1, p2 = (int(box[0]), int(box[1])), (int(box[2]), int(box[3]))
225
+ cv2.rectangle(
226
+ self.im, p1, p2, color, thickness=self.lw, lineType=cv2.LINE_AA
227
+ )
228
+ if label:
229
+ tf = max(self.lw - 1, 1) # font thickness
230
+ w, h = cv2.getTextSize(label, 0, fontScale=self.lw / 3, thickness=tf)[
231
+ 0
232
+ ] # text width, height
233
+ outside = p1[1] - h >= 3
234
+ p2 = p1[0] + w, p1[1] - h - 3 if outside else p1[1] + h + 3
235
+ cv2.rectangle(self.im, p1, p2, color, -1, cv2.LINE_AA) # filled
236
+ cv2.putText(
237
+ self.im,
238
+ label,
239
+ (p1[0], p1[1] - 2 if outside else p1[1] + h + 2),
240
+ 0,
241
+ self.lw / 3,
242
+ txt_color,
243
+ thickness=tf,
244
+ lineType=cv2.LINE_AA,
245
+ )
246
+
247
+ def masks(self, masks, colors, im_gpu, alpha=0.5, retina_masks=False):
248
+ """Plot masks at once.
249
+ Args:
250
+ masks (tensor): predicted masks on cuda, shape: [n, h, w]
251
+ colors (List[List[Int]]): colors for predicted masks, [[r, g, b] * n]
252
+ im_gpu (tensor): img is in cuda, shape: [3, h, w], range: [0, 1]
253
+ alpha (float): mask transparency: 0.0 fully transparent, 1.0 opaque
254
+ """
255
+ im_gpu = torch.from_numpy(im_gpu) # not sure why we need this fix?
256
+ # print(im_gpu)
257
+ if self.pil:
258
+ # convert to numpy first
259
+ self.im = np.asarray(self.im).copy()
260
+ if len(masks) == 0:
261
+ self.im[:] = im_gpu.permute(1, 2, 0).contiguous().cpu().numpy() * 255
262
+ colors = torch.tensor(colors, device=im_gpu.device, dtype=torch.float32) / 255.0
263
+ colors = colors[:, None, None] # shape(n,1,1,3)
264
+ masks = masks.unsqueeze(3) # shape(n,h,w,1)
265
+ masks_color = masks * (colors * alpha) # shape(n,h,w,3)
266
+
267
+ inv_alph_masks = (1 - masks * alpha).cumprod(0) # shape(n,h,w,1)
268
+ mcs = (masks_color * inv_alph_masks).sum(
269
+ 0
270
+ ) * 2 # mask color summand shape(n,h,w,3)
271
+
272
+ im_gpu = im_gpu.flip(dims=[0]) # flip channel
273
+ im_gpu = im_gpu.permute(1, 2, 0).contiguous() # shape(h,w,3)
274
+ im_gpu = im_gpu * inv_alph_masks[-1] + mcs
275
+ im_mask = (im_gpu * 255).byte().cpu().numpy()
276
+ self.im[:] = (
277
+ im_mask
278
+ if retina_masks
279
+ else scale_image(im_gpu.shape, im_mask, self.im.shape)
280
+ )
281
+ if self.pil:
282
+ # convert im back to PIL and update draw
283
+ self.fromarray(self.im)
284
+
285
+ def rectangle(self, xy, fill=None, outline=None, width=1):
286
+ # Add rectangle to image (PIL-only)
287
+ self.draw.rectangle(xy, fill, outline, width)
288
+
289
+ def text(self, xy, text, txt_color=(255, 255, 255), anchor="top"):
290
+ # Add text to image (PIL-only)
291
+ if anchor == "bottom": # start y from font bottom
292
+ w, h = self.font.getsize(text) # text width, height
293
+ xy[1] += 1 - h
294
+ self.draw.text(xy, text, fill=txt_color, font=self.font)
295
+
296
+ def fromarray(self, im):
297
+ # Update self.im from a numpy array
298
+ self.im = im if isinstance(im, Image.Image) else Image.fromarray(im)
299
+ self.draw = ImageDraw.Draw(self.im)
300
+
301
+ def result(self):
302
+ # Return annotated image as array
303
+ return np.asarray(self.im)
predict.py ADDED
@@ -0,0 +1,470 @@
1
+ # Must import torch before onnxruntime, else could not create cuda context
2
+ # ref: https://github.com/microsoft/onnxruntime/issues/11092#issuecomment-1386840174
3
+ import torch, torchvision
4
+ import onnxruntime
5
+
6
+ from time import perf_counter
7
+ from openvino.runtime import Core, Layout, get_batch, AsyncInferQueue
8
+ from pathlib import Path
9
+ import yaml
10
+ import cv2
11
+ import numpy as np
12
+ import time
13
+ from plots import Annotator, process_mask, scale_boxes, scale_image, colors
14
+ from loguru import logger
15
+
16
+
17
+ def from_numpy(x):
18
+ return torch.from_numpy(x) if isinstance(x, np.ndarray) else x
19
+
20
+
21
+ def yaml_load(file="data.yaml"):
22
+ # Single-line safe yaml loading
23
+ with open(file, errors="ignore") as f:
24
+ return yaml.safe_load(f)
25
+
26
+
27
+ def load_metadata(f=Path("path/to/meta.yaml")):
28
+ # Load metadata from meta.yaml if it exists
29
+ if f.exists():
30
+ d = yaml_load(f)
31
+ return d["stride"], d["names"] # assign stride, names
32
+ return None, None
33
+
34
+
35
+ def letterbox(
36
+ im,
37
+ new_shape=(640, 640),
38
+ color=(114, 114, 114),
39
+ auto=True,
40
+ scale_fill=False,
41
+ scaleup=True,
42
+ stride=32,
43
+ ):
44
+ # Resize and pad image while meeting stride-multiple constraints
45
+ shape = im.shape[:2] # current shape [height, width]
46
+ if isinstance(new_shape, int):
47
+ new_shape = (new_shape, new_shape)
48
+
49
+ # Scale ratio (new / old)
50
+ r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
51
+ if not scaleup: # only scale down, do not scale up (for better val mAP)
52
+ r = min(r, 1.0)
53
+
54
+ # Compute padding
55
+ ratio = r, r # width, height ratios
56
+ new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
57
+ dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
58
+ if auto: # minimum rectangle
59
+ dw, dh = np.mod(dw, stride), np.mod(dh, stride) # wh padding
60
+ elif scale_fill: # stretch
61
+ dw, dh = 0.0, 0.0
62
+ new_unpad = (new_shape[1], new_shape[0])
63
+ ratio = new_shape[1] / shape[1], new_shape[0] / shape[0] # width, height ratios
64
+
65
+ dw /= 2 # divide padding into 2 sides
66
+ dh /= 2
67
+
68
+ if shape[::-1] != new_unpad: # resize
69
+ im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
70
+ top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
71
+ left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
72
+ im = cv2.copyMakeBorder(
73
+ im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color
74
+ ) # add border
75
+ return im, ratio, (dw, dh)
76
+
77
+
78
+ def xywh2xyxy(x):
79
+ # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
80
+ y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
81
+ y[:, 0] = x[:, 0] - x[:, 2] / 2 # top left x
82
+ y[:, 1] = x[:, 1] - x[:, 3] / 2 # top left y
83
+ y[:, 2] = x[:, 0] + x[:, 2] / 2 # bottom right x
84
+ y[:, 3] = x[:, 1] + x[:, 3] / 2 # bottom right y
85
+ return y
86
+
87
+
88
+ def box_iou(box1, box2, eps=1e-7):
89
+ # https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py
90
+ """
91
+ Return intersection-over-union (Jaccard index) of boxes.
92
+ Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
93
+ Arguments:
94
+ box1 (Tensor[N, 4])
95
+ box2 (Tensor[M, 4])
96
+ Returns:
97
+ iou (Tensor[N, M]): the NxM matrix containing the pairwise
98
+ IoU values for every element in boxes1 and boxes2
99
+ """
100
+
101
+ # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
102
+ (a1, a2), (b1, b2) = box1.unsqueeze(1).chunk(2, 2), box2.unsqueeze(0).chunk(2, 2)
103
+ inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(0).prod(2)
104
+
105
+ # IoU = inter / (area1 + area2 - inter)
106
+ return inter / ((a2 - a1).prod(2) + (b2 - b1).prod(2) - inter + eps)
107
+
108
+
109
+ def non_max_suppression(
110
+ prediction,
111
+ conf_thres=0.25,
112
+ iou_thres=0.45,
113
+ classes=None,
114
+ agnostic=False,
115
+ multi_label=False,
116
+ labels=(),
117
+ max_det=300,
118
+ nm=0, # number of masks
119
+ redundant=True, # require redundant detections
120
+ ):
121
+ """Non-Maximum Suppression (NMS) on inference results to reject overlapping detections
122
+ Returns:
123
+ list of detections, on (n,6) tensor per image [xyxy, conf, cls]
124
+ """
125
+
126
+ if isinstance(
127
+ prediction, (list, tuple)
128
+ ): # YOLOv5 model in validation model, output = (inference_out, loss_out)
129
+ prediction = prediction[0] # select only inference output
130
+
131
+ device = prediction.device
132
+ mps = "mps" in device.type # Apple MPS
133
+ if mps: # MPS not fully supported yet, convert tensors to CPU before NMS
134
+ prediction = prediction.cpu()
135
+ bs = prediction.shape[0] # batch size
136
+ nc = prediction.shape[2] - nm - 5 # number of classes
137
+ xc = prediction[..., 4] > conf_thres # candidates
138
+
139
+ # Checks
140
+ assert (
141
+ 0 <= conf_thres <= 1
142
+ ), f"Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0"
143
+ assert (
144
+ 0 <= iou_thres <= 1
145
+ ), f"Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0"
146
+
147
+ # Settings
148
+ # min_wh = 2 # (pixels) minimum box width and height
149
+ max_wh = 7680 # (pixels) maximum box width and height
150
+ max_nms = 30000 # maximum number of boxes into torchvision.ops.nms()
151
+ multi_label &= nc > 1 # multiple labels per box (adds 0.5ms/img)
152
+ merge = False # use merge-NMS
153
+
154
+ t = time.time()
155
+ mi = 5 + nc # mask start index
156
+ output = [torch.zeros((0, 6 + nm), device=prediction.device)] * bs
157
+ for xi, x in enumerate(prediction): # image index, image inference
158
+ # Apply constraints
159
+ # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0 # width-height
160
+ x = x[xc[xi]] # confidence
161
+
162
+ # Cat apriori labels if autolabelling
163
+ if labels and len(labels[xi]):
164
+ lb = labels[xi]
165
+ v = torch.zeros((len(lb), nc + nm + 5), device=x.device)
166
+ v[:, :4] = lb[:, 1:5] # box
167
+ v[:, 4] = 1.0 # conf
168
+ v[range(len(lb)), lb[:, 0].long() + 5] = 1.0 # cls
169
+ x = torch.cat((x, v), 0)
170
+
171
+ # If none remain process next image
172
+ if not x.shape[0]:
173
+ continue
174
+
175
+ # Compute conf
176
+ x[:, 5:] *= x[:, 4:5] # conf = obj_conf * cls_conf
177
+
178
+ # Box/Mask
179
+ box = xywh2xyxy(
180
+ x[:, :4]
181
+ ) # center_x, center_y, width, height) to (x1, y1, x2, y2)
182
+ mask = x[:, mi:] # zero columns if no masks
183
+
184
+ # Detections matrix nx6 (xyxy, conf, cls)
185
+ if multi_label:
186
+ i, j = (x[:, 5:mi] > conf_thres).nonzero(as_tuple=False).T
187
+ x = torch.cat((box[i], x[i, 5 + j, None], j[:, None].float(), mask[i]), 1)
188
+ else: # best class only
189
+ conf, j = x[:, 5:mi].max(1, keepdim=True)
190
+ x = torch.cat((box, conf, j.float(), mask), 1)[conf.view(-1) > conf_thres]
191
+
192
+ # Filter by class
193
+ if classes is not None:
194
+ x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]
195
+
196
+ # Apply finite constraint
197
+ # if not torch.isfinite(x).all():
198
+ # x = x[torch.isfinite(x).all(1)]
199
+
200
+ # Check shape
201
+ n = x.shape[0] # number of boxes
202
+ if not n: # no boxes
203
+ continue
204
+ elif n > max_nms: # excess boxes
205
+ x = x[x[:, 4].argsort(descending=True)[:max_nms]] # sort by confidence
206
+ else:
207
+ x = x[x[:, 4].argsort(descending=True)] # sort by confidence
208
+
209
+ # Batched NMS
210
+ c = x[:, 5:6] * (0 if agnostic else max_wh) # classes
211
+ boxes, scores = x[:, :4] + c, x[:, 4] # boxes (offset by class), scores
212
+ i = torchvision.ops.nms(boxes, scores, iou_thres) # NMS
213
+ if i.shape[0] > max_det: # limit detections
214
+ i = i[:max_det]
215
+ if merge and (1 < n < 3e3): # Merge NMS (boxes merged using weighted mean)
216
+ # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
217
+ iou = box_iou(boxes[i], boxes) > iou_thres # iou matrix
218
+ weights = iou * scores[None] # box weights
219
+ x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(
220
+ 1, keepdim=True
221
+ ) # merged boxes
222
+ if redundant:
223
+ i = i[iou.sum(1) > 1] # require redundancy
224
+
225
+ output[xi] = x[i]
226
+ if mps:
227
+ output[xi] = output[xi].to(device)
228
+
229
+ return output
230
+
231
+
232
+ class Model:
233
+ def __init__(
234
+ self,
235
+ model_path,
236
+ imgsz=320,
237
+ classes=None,
238
+ device="CPU",
239
+ plot_mask=False,
240
+ conf_thres=0.7,
241
+ n_jobs=1,
242
+ is_async=False,
243
+ ):
244
+ # filter by class: classes=[0], or classes=[0, 2, 3]
245
+ model_type = "onnx" if Path(model_path).suffix == ".onnx" else "openvino"
246
+ assert Path(model_path).exists(), f"Model {model_path} not found"
247
+ assert Path(model_path).suffix in (
248
+ ".onnx",
249
+ ".xml",
250
+ ), "Model must be .onnx or .xml"
251
+ self.model_type = model_type
252
+ self.model_path = model_path
253
+ self.imgsz = imgsz
254
+ self.classes = classes
255
+ self.plot_mask = plot_mask
256
+ self.conf_thres = conf_thres
257
+
258
+ # async settings
259
+ self.n_jobs = n_jobs
260
+ self.is_async = is_async
261
+ self.completed_results = {} # key: frame_id, value: inference results
262
+ self.ori_cv_imgs = {} # key: frame_id, value: original cv image
263
+ self.prep_cv_imgs = {} # key: frame_id, value: preprocessed cv image
264
+
265
+ if self.model_type == "onnx":
266
+ assert is_async is False, "Async mode is not supported for ONNX models"
267
+ providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
268
+ session = onnxruntime.InferenceSession(model_path, providers=providers)
269
+ self.session = session
270
+ output_names = [x.name for x in session.get_outputs()]
271
+ self.output_names = output_names
272
+ meta = session.get_modelmeta().custom_metadata_map # metadata
273
+ if "stride" in meta:
274
+ stride, names = int(meta["stride"]), eval(meta["names"])
275
+ self.stride = stride
276
+ self.names = names
277
+ elif self.model_type == "openvino":
278
+ # load OpenVINO model
279
+ assert Path(model_path).suffix == ".xml", "OpenVINO model must be .xml"
280
+ ie = Core()
281
+ weights = Path(model_path).with_suffix(".bin").as_posix()
282
+ network = ie.read_model(model=model_path, weights=weights)
283
+ if network.get_parameters()[0].get_layout().empty:
284
+ network.get_parameters()[0].set_layout(Layout("NCHW"))
285
+ batch_dim = get_batch(network)
286
+ if batch_dim.is_static:
287
+ batch_size = batch_dim.get_length()
288
+
289
+ # To run inference on M1, we must export the IR model using "mo --use_legacy_frontend"
290
+ # Otherwise, we would get the following error when compiling the model
291
+ # https://github.com/openvinotoolkit/openvino/issues/12476#issuecomment-1222202804
292
+ config = {}
293
+ if n_jobs == "auto":
294
+ config = {"PERFORMANCE_HINT": "THROUGHPUT"}
295
+ self.executable_network = ie.compile_model(
296
+ network, device_name=device, config=config
297
+ )
298
+ num_requests = self.executable_network.get_property(
299
+ "OPTIMAL_NUMBER_OF_INFER_REQUESTS"
300
+ )
301
+ self.n_jobs = num_requests if n_jobs == "auto" else int(n_jobs)
302
+ logger.info(f"Optimal number of infer requests should be: {num_requests}")
303
+ self.stride, self.names = load_metadata(
304
+ Path(weights).with_suffix(".yaml")
305
+ ) # load metadata
306
+
307
+ if is_async:
308
+ logger.info(f"Using num of infer requests jobs: {n_jobs}")
309
+ self.pipeline = AsyncInferQueue(self.executable_network, self.n_jobs)
310
+ self.pipeline.set_callback(self.callback)
311
+
312
+ def preprocess(self, cv_img, pt=False):
313
+ im = letterbox(cv_img, self.imgsz, stride=self.stride, auto=pt)[
314
+ 0
315
+ ] # padded resize
316
+ im = im.transpose((2, 0, 1))[::-1] # HWC to CHW, BGR to RGB
317
+ im = np.ascontiguousarray(im) # contiguous
318
+ im = torch.from_numpy(im)
319
+ im = im.float() # uint8 to fp16/32
320
+ im /= 255 # 0 - 255 to 0.0 - 1.0
321
+ if len(im.shape) == 3:
322
+ im = im[None] # expand for batch dim
323
+ im = im.cpu().numpy() # torch to numpy
324
+ return im
325
+
326
+ def postprocess(self, y, ori_cv_im, prep_im):
327
+ y = [from_numpy(x) for x in y]
328
+ pred, proto = y[0], y[-1]
329
+
330
+ im0 = ori_cv_im
331
+
332
+ # NMS
333
+ iou_thres = 0.45
334
+ agnostic_nms = False
335
+ max_det = 1 # maximum detections per image, only 1 aorta is needed
336
+ pred = non_max_suppression(
337
+ pred,
338
+ self.conf_thres,
339
+ iou_thres,
340
+ self.classes,
341
+ agnostic_nms,
342
+ max_det=max_det,
343
+ nm=32,
344
+ )
345
+
346
+ # Process predictions
347
+ line_thickness = 3
348
+ annotator = Annotator(
349
+ np.ascontiguousarray(im0),
350
+ line_width=line_thickness,
351
+ example=str(self.names),
352
+ )
353
+ i = 0
354
+ det = pred[0]
355
+ im = prep_im
356
+ r_xyxy, r_conf, r_masks = None, None, None
357
+ if len(pred[0]):
358
+ masks = process_mask(
359
+ proto[i],
360
+ det[:, 6:],
361
+ det[:, :4],
362
+ (self.imgsz, self.imgsz),
363
+ upsample=True,
364
+ ) # HWC
365
+ det[:, :4] = scale_boxes(
366
+ (self.imgsz, self.imgsz), det[:, :4], im0.shape
367
+ ).round() # rescale boxes to im0 size
368
+
369
+ # Mask plotting
370
+ if self.plot_mask:
371
+ annotator.masks(
372
+ masks,
373
+ colors=[colors(x, True) for x in det[:, 5]],
374
+ im_gpu=im[i],
375
+ alpha=0.1,
376
+ )
377
+
378
+ # Write results
379
+ for j, (*xyxy, conf, cls) in enumerate(reversed(det[:, :6])):
380
+ # Add bbox to image
381
+ c = int(cls) # integer class
382
+ label = f"{self.names[c]} {conf:.2f}"
383
+ annotator.box_label(xyxy, label, color=colors(c, True))
384
+ r_xyxy = xyxy
385
+ r_conf = conf
386
+ r_xyxy = [i.int().numpy().item() for i in r_xyxy]
387
+ r_conf = r_conf.numpy().item()
388
+ r_masks = scale_image((self.imgsz, self.imgsz), masks.numpy()[0], im0.shape)
389
+ return annotator.result(), (r_xyxy, r_conf, r_masks)
390
+
391
+ def predict(self, cv_img):
392
+ # return the annotated image and the bounding box
393
+ result_cv_img, xyxy = None, None
394
+ im = self.preprocess(cv_img)
395
+ if self.model_type == "onnx":
396
+ y = self.session.run(
397
+ self.output_names, {self.session.get_inputs()[0].name: im}
398
+ )
399
+ elif self.model_type == "openvino":
400
+ # OpenVINO model inference
401
+ # Note: Please use FP32 model on M1, otherwise you will get many runtime errors
402
+ # Very slow on M1, but works
403
+ # start = perf_counter()
404
+ y = list(self.executable_network([im]).values())
405
+ # logger.info(f"OpenVINO inference time: {perf_counter() - start:.3f}s")
406
+ result_cv_img, others = self.postprocess(y, cv_img, im)
407
+ return result_cv_img, others
408
+
409
+ def callback(self, request, userdata):
410
+ # callback function for AsyncInferQueue
411
+ outputs = request.outputs
412
+ frame_id = userdata
413
+ self.completed_results[frame_id] = [i.data for i in outputs]
414
+
415
+ def predict_async(self, cv_img, frame_id):
416
+ assert self.is_async, "Please set is_async=True when initializing the model"
417
+ self.ori_cv_imgs[frame_id] = cv_img
418
+ im = self.preprocess(cv_img)
419
+ self.prep_cv_imgs[frame_id] = im
420
+
421
+ # Note: The start_async function call is not required to be synchronized - it waits for any available job if the queue is busy/overloaded.
422
+ # https://docs.openvino.ai/latest/openvino_docs_OV_UG_Python_API_exclusives.html#asyncinferqueue
423
+ #
424
+ # idle_id = self.pipeline.get_idle_request_id()
425
+ # self.pipeline.start_async({idle_id: im}, frame_id)
426
+ self.pipeline.start_async({0: im}, frame_id)
427
+
428
+ def is_free_to_infer_async(self):
429
+ """Returns True if any free request in the pool, otherwise False"""
430
+ assert self.is_async, "Please set is_async=True when initializing the model"
431
+ return self.pipeline.is_ready()
432
+
433
+ def get_result(self, frame_id):
434
+ """Returns the inference result for the given frame_id"""
435
+ assert self.is_async, "Please set is_async=True when initializing the model"
436
+ if frame_id in self.completed_results:
437
+ y = self.completed_results.pop(frame_id)
438
+ cv_img = self.ori_cv_imgs.pop(frame_id)
439
+ im = self.prep_cv_imgs.pop(frame_id)
440
+ result_cv_img, others = self.postprocess(y, cv_img, im)
441
+ return result_cv_img, others
442
+ return None
443
+
444
+
445
+ if __name__ == "__main__":
446
+ m_p = "weights/yolov7seg-JH-v1.onnx"
447
+ m_p = "weights/yolov5s-seg-MK-v1.onnx"
448
+ m_p = "weights/best_openvino_model/best.xml"
449
+ imgsz = 320
450
+ # imgsz = 640
451
+ model = Model(model_path=m_p, imgsz=imgsz)
452
+
453
+ # inference an image using the loaded model
454
+ # source = 'Tim_3-0-00-20.05.jpg'
455
+ path = "data/Jimmy_2-0-00-04.63.jpg"
456
+ assert Path(path).exists(), f"Input image {path} doesn't exist"
457
+
458
+ # output path
459
+ save_dir = "runs/predict"
460
+ Path(save_dir).mkdir(parents=True, exist_ok=True)
461
+ out_p = f"{save_dir}/{Path(path).stem}.jpg"
462
+
463
+ # load image and preprocess
464
+ im0 = cv2.imread(path) # BGR
465
+ result_cv_img, _ = model.predict(im0)
466
+ if result_cv_img is not None:
467
+ cv2.imwrite(out_p, result_cv_img)
468
+ logger.info(f"Saved result to {out_p}")
469
+ else:
470
+ logger.error("No result, something went wrong")
requirements.txt ADDED
@@ -0,0 +1,26 @@
1
+ # dataset processing
2
+ loguru==0.6.0
3
+ typer[all]==0.7.0
4
+ fiftyone==0.19.1
5
+ pycocotools==2.0.6
6
+
7
+ torch==1.13.0
8
+ torchvision==0.14.0
9
+ openvino==2022.2.0; sys_platform != "darwin"
10
+ openvino-arm==2022.1.0.1; sys_platform == "darwin"
11
+ opencv-python==4.6.0.66
12
+ PyYAML==6.0
13
+ onnx==1.13.1
14
+ onnxruntime==1.13.1
15
+ onnxruntime-gpu==1.13.1; sys_platform != "darwin"
16
+
17
+ # demo GUI
18
+ PySide6==6.4.1
19
+ scipy==1.9.3
20
+ matplotlib==3.5.2
21
+
22
+ # demo plot
23
+ plotly==5.11.0
24
+ pandas==1.5.2
25
+ kaleido==0.2.1; platform_system != "Windows"
26
+ kaleido==0.1.0post1; platform_system == "Windows"
roi.py ADDED
@@ -0,0 +1,34 @@
1
+ import cv2
2
+ import json
3
+ import argparse
4
+ from pathlib import Path
5
+
6
+
7
+ # get the image from arguments
8
+ ap = argparse.ArgumentParser()
9
+ ap.add_argument("-i", "--image", required=True, help="Path to the image")
10
+ args = vars(ap.parse_args())
11
+
12
+ # check if the image does exist
13
+ img_p = Path(args["image"])
14
+ assert img_p.exists(), "Image does not exist"
15
+
16
+ # Read the image
17
+ img = cv2.imread(args["image"])
18
+
19
+ # Select the ROI from the image
20
+ ROI = cv2.selectROI("Image", img, False, False)
21
+
22
+ # Append the ROI coordinates to the csv file
23
+ # header: filename, ori_width, ori_height, roi_x, roi_y, roi_width, roi_height
24
+ with open("roi.csv", "a") as f:
25
+ # if no file exists, create a new one with the header
26
+ if f.tell() == 0:
27
+ f.write("filename,ori_width,ori_height,roi_x,roi_y,roi_width,roi_height\n")
28
+ ori_w, ori_h = img.shape[1], img.shape[0]
29
+ f.write(f"{img_p.name},{ori_w},{ori_h},{ROI[0]},{ROI[1]},{ROI[2]},{ROI[3]}\n")
30
+
31
+ # Display cropped image
32
+ cropped = img[ROI[1] : ROI[1] + ROI[3], ROI[0] : ROI[0] + ROI[2]]
33
+ cv2.imshow("Cropped Image", cropped)
34
+ cv2.waitKey(0)
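+
+ # Illustrative usage (the image path is an example): python roi.py -i data/sample_frame.jpg
+ # Each run appends one ROI row to roi.csv in the current directory.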
try_chart.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
usgfw2wrapper.dll ADDED
Binary file (15.4 kB). View file
 
weights/.keep ADDED
@@ -0,0 +1 @@
1
+
weights/yolov5s-v2 ADDED
@@ -0,0 +1 @@
1
+