PanNuke Preparation
The original PanNuke dataset has the following layout, storing the images, masks, and types of each dataset split (fold) as one large array each:
├── fold0
│   ├── images.npy
│   ├── masks.npy
│   └── types.npy
├── fold1
│   ├── images.npy
│   ├── masks.npy
│   └── types.npy
└── fold2
    ├── images.npy
    ├── masks.npy
    └── types.npy
For memory efficiency and to make use of multi-threaded data loading with our augmentation pipeline, we reassemble the dataset into the following structure:
├── fold0
│   ├── cell_count.csv      # cell-count for each image to be used in sampling
│   ├── images              # H&E image for each sample as .png files
│   │   ├── 0_0.png
│   │   ├── 0_1.png
│   │   ├── 0_2.png
...
│   ├── labels              # label as .npy arrays for each sample
│   │   ├── 0_0.npy
│   │   ├── 0_1.npy
│   │   ├── 0_2.npy
...
│   └── types.csv           # csv file with type for each image
├── fold1
│   ├── cell_count.csv
│   ├── images
│   │   ├── 1_0.png
...
│   ├── labels
│   │   ├── 1_0.npy
...
│   └── types.csv
├── fold2
│   ├── cell_count.csv
│   ├── images
│   │   ├── 2_0.png
...
│   ├── labels
│   │   ├── 2_0.npy
...
│   └── types.csv
├── dataset_config.yaml     # dataset config with dataset information
└── weight_config.yaml      # config file for our sampling
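
The idea behind the reassembly can be illustrated with a minimal sketch. This is not the prepare_pannuke.py script. It only shows how one large array might be split into per-sample files, assuming that the first axis of masks.npy indexes the samples. The function name `split_masks` is hypothetical, each label is written as a raw mask slice (the real label format may differ), and writing the .png images and .csv files is omitted.

```python
from pathlib import Path

import numpy as np


def split_masks(fold_dir, out_dir, fold_idx):
    """Write one .npy label file per sample (illustrative, not the official script)."""
    # mmap_mode="r" reads slices on demand instead of loading the whole array
    masks = np.load(Path(fold_dir) / "masks.npy", mmap_mode="r")
    label_dir = Path(out_dir) / f"fold{fold_idx}" / "labels"
    label_dir.mkdir(parents=True, exist_ok=True)
    for i in range(masks.shape[0]):
        # Assumption: axis 0 indexes samples; files are named <fold>_<index>.npy
        np.save(label_dir / f"{fold_idx}_{i}.npy", np.asarray(masks[i]))
    return masks.shape[0]
```

With one file per sample, each data loader worker can read only the samples it needs rather than holding a full fold in memory.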
We provide all configuration files for the PanNuke dataset in the configs/datasets/PanNuke folder. Please copy them into your dataset folder. Images and masks have to be extracted using the cell_segmentation/datasets/prepare_pannuke.py script.
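
After preparation, it may be useful to check that a fold matches the structure shown above. The following is a minimal sketch using only the Python standard library. The helper `check_fold` is hypothetical and not part of the repository, and it checks only file and folder names, not their contents.

```python
from pathlib import Path


def check_fold(fold_dir):
    """Return a list of problems found in one reassembled fold directory."""
    fold_dir = Path(fold_dir)
    problems = []
    for name in ("cell_count.csv", "types.csv"):
        if not (fold_dir / name).is_file():
            problems.append(f"missing file: {name}")
    stems = {}
    for name, suffix in (("images", ".png"), ("labels", ".npy")):
        folder = fold_dir / name
        if folder.is_dir():
            stems[name] = {p.stem for p in folder.glob(f"*{suffix}")}
        else:
            problems.append(f"missing folder: {name}")
            stems[name] = set()
    # Every image <fold>_<index>.png should have a label with the same stem
    for stem in sorted(stems["images"] ^ stems["labels"]):
        problems.append(f"unpaired sample: {stem}")
    return problems
```

An empty list means the fold contains both .csv files and that every image has a label of the same name.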