doc: add broken detection
README.md
CHANGED
|
@@ -44,6 +44,18 @@ The generated dataset will be saved in the `dataset/font_img` directory.
|
|
| 44 |
|
| 45 |
Note that `batch_generate_script_cmd_32.bat` and `batch_generate_script_cmd_64.bat` are batch scripts for Windows that can be used to generate the dataset in parallel with 32 and 64 partitions, respectively.
|
| 46 |
| 47 |
+
### Final Check
|
| 48 |
+
|
| 49 |
+
Since the task might be terminated unexpectedly or deliberately by the user, the script has a caching mechanism to avoid re-generating images that already exist.
|
| 50 |
+
|
| 51 |
+
In this case, the script might not be able to detect corruption in the cache during the task (for example, a file left incomplete because the task was terminated while writing it), so we also provide a script that checks the generated dataset and removes the corrupted images and labels.
|
| 52 |
+
|
| 53 |
+
```bash
|
| 54 |
+
python font_ds_detect_broken.py
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
After running the script, you might want to rerun the generation script to regenerate the corrupted files that were removed.
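
For readers curious what such a check can look like, here is a minimal sketch of one possible approach. It is not the implementation of `font_ds_detect_broken.py`. It assumes, hypothetically, that each sample is a JPEG image paired with a label file of the same stem, and that the suffixes are `.jpg` and `.bin`; the function names are made up for illustration. A sample is treated as broken when either file is missing, the label is empty, or the image lacks the JPEG start/end markers, which is only a heuristic for truncated writes.

```python
from pathlib import Path


def looks_like_complete_jpeg(path: Path) -> bool:
    """Heuristic: a complete JPEG starts with the SOI marker and ends with EOI."""
    data = path.read_bytes()
    return len(data) >= 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"


def find_broken_samples(root, image_suffix=".jpg", label_suffix=".bin"):
    """Return the sorted stems whose image or label is missing, empty, or truncated.

    The suffixes are assumptions for this sketch, not the project's actual format.
    """
    root = Path(root)
    stems = {p.stem for p in root.iterdir() if p.suffix in (image_suffix, label_suffix)}
    broken = []
    for stem in sorted(stems):
        image = root / f"{stem}{image_suffix}"
        label = root / f"{stem}{label_suffix}"
        if not image.is_file() or not label.is_file():
            broken.append(stem)  # one half of the pair was never written
        elif label.stat().st_size == 0 or not looks_like_complete_jpeg(image):
            broken.append(stem)  # a write was probably interrupted
    return broken


def remove_broken_samples(root, broken, image_suffix=".jpg", label_suffix=".bin"):
    """Delete both files of every broken sample so the generator can recreate them."""
    root = Path(root)
    for stem in broken:
        for suffix in (image_suffix, label_suffix):
            (root / f"{stem}{suffix}").unlink(missing_ok=True)
```

Under these assumptions, calling `remove_broken_samples("dataset/font_img", find_broken_samples("dataset/font_img"))` would clear the damaged pairs, after which rerunning the generation script could recreate them through the cache check.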
|
| 58 |
+
|
| 59 |
### (Optional) Linux Cluster Generation Walkthrough
|
| 60 |
|
| 61 |
If you would like to run the generation script on Linux clusters, we also provide the environment setup script `linux_venv_setup.sh`.
|