# Data Preparation

## InstructPix2Pix

```shell
bash scripts/download_data.sh path/to/clip-filtered-dataset
python convert_instructp2p.py --data-dir /path/to/clip-filtered-dataset/ --output-dir /path/to/output-dir/ --num-process 64
```

## OpenImage

```shell
wget https://storage.googleapis.com/openimages/2018_04/image_ids_and_rotation.csv
python convert_openimage.py --data-dir /path/to/image_ids_and_rotation.csv --output-dir /path/to/output-dir/ --num-process 8 --cuda_device [0, 1, 2, 3, 4, 5, 6, 7]
```

To preprocess the data across multiple nodes, specify the `--num-machine` and `--machine-id` arguments. For example, to preprocess the data on 8 nodes, run the following command on node 0:

```shell
python convert_openimage.py --data-dir /path/to/image_ids_and_rotation.csv --output-dir /path/to/output-dir/ --num-process 8 --cuda_device [0, 1, 2, 3, 4, 5, 6, 7] --num-machine 8 --machine-id 0
```

Then run the following command on node 1:

```shell
python convert_openimage.py --data-dir /path/to/image_ids_and_rotation.csv --output-dir /path/to/output-dir/ --num-process 8 --cuda_device [0, 1, 2, 3, 4, 5, 6, 7] --num-machine 8 --machine-id 1
```

Repeat this on each remaining node, incrementing `--machine-id` by one each time.
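
The per-node pattern can be sketched as a small helper that prints the command for every node, so each line can be copied to the matching machine. This is only an illustration: the function name `print_node_commands` is hypothetical, and it assumes machine IDs are 0-based and run up to `--num-machine` minus one, as the node 0 and node 1 commands suggest. It prints the commands and does not launch anything.

```shell
#!/usr/bin/env bash
# Print (do not run) the preprocessing command for each node.
# Assumption: machine IDs start at 0 and end at num_machines - 1.
print_node_commands() {
  local num_machines="$1"
  local machine_id
  for ((machine_id = 0; machine_id < num_machines; machine_id++)); do
    # Paths are the same placeholders used in the commands above.
    echo "python convert_openimage.py --data-dir /path/to/image_ids_and_rotation.csv --output-dir /path/to/output-dir/ --num-process 8 --cuda_device [0, 1, 2, 3, 4, 5, 6, 7] --num-machine ${num_machines} --machine-id ${machine_id}"
  done
}

print_node_commands 8
```

Each printed line is meant to be run on the node whose ID it names. How the commands reach the nodes (for example over SSH or through a job scheduler) is left open here.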