Gerold Meisinger committed 61baa04 (1 parent: a141c2f): "blub"

README.md CHANGED
@@ -6,7 +6,7 @@ language:
 - en
 ---
 
-Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (
+Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (ed), parameter-free (edpf), color (edcolor).
 
 * Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
 * For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
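For readers who want to reproduce an edpf conditioning image, here is a minimal sketch using the Edge Drawing implementation in OpenCV's `ximgproc` module; it extends the `detectEdges`/`getEdgeImage` snippet quoted under Experiment 3.0 below, and the file names are placeholders:

```python
import cv2  # requires opencv-contrib-python for cv2.ximgproc

# Edge Drawing expects an 8-bit grayscale input ("input.png" is a placeholder).
image = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc_EdgeDrawing_Params()
params.PFmode = True  # parameter-free mode (edpf), no manual thresholds
ed.setParams(params)

ed.detectEdges(image)
edge_map = ed.getEdgeImage()  # binary edge map, same size as the input

cv2.imwrite("edge_map.png", edge_map)
```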
@@ -60,12 +60,12 @@ accelerate launch train_controlnet.py ^
 # Evaluation
 
 To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf in order not to compare apples with oranges:
-* canny 1.0 model was trained on 3M images, canny 1.1 model on even more, while edpf model so far is only trained on a 180k-360k.
+* canny 1.0 model was trained on 3M images with fp32, canny 1.1 model on even more, while the edpf model so far is only trained on 180k-360k images with fp16.
 * canny edge-detector requires parameter tuning while edpf is parameter-free.
 * Do we manually fine-tune canny to find the perfect input image or do we leave it at default? We could argue that "no fine-tuning required" is the USP of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
-* Would the canny model actually benefit from a edpf pre-processor and we might not even require a edpf model?
+* Would the canny model actually benefit from an edpf pre-processor, so that we might not even require an edpf model? (2023-09-25: see `eval_canny_edpf.zip`, but it seems it doesn't work, so the edpf model may be justified)
 * When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
-* When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might tend to generate "aesthetic" images, and not actually work "intrisically better".
+* When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend to generate "aesthetic" images, and not actually work "intrinsically better".
 
 # Versions
 
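The "edpf pre-processor on the canny model" question can be probed directly by feeding an edpf edge map to the published canny ControlNet. A minimal diffusers sketch; the model ids and prompt are illustrative and not necessarily the setup behind `eval_canny_edpf.zip`:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Load the public canny ControlNet and feed it an edpf edge map instead.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# "edge_map.png" is the edpf map from the snippet above (placeholder path).
edpf_map = load_image("edge_map.png")
image = pipe("a house at the lake", image=edpf_map, num_inference_steps=20).images[0]
image.save("canny_model_with_edpf_input.png")
```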
@@ -86,7 +86,7 @@ Images converted with https://github.com/shaojunluo/EDLinePython (based on origi
 
 additional arguments: `--proportion_empty_prompts=0.5`.
 
-Trained for 40000 steps with default settings => empty prompts were probably too excessive
+Trained for 40000 steps with default settings => results are not good. empty prompts were probably too excessive. retry with no drops and different algorithm parameters.
 
 Update 2023-09-22: bug in algorithm produces too sparse images on default, see https://github.com/shaojunluo/EDLinePython/issues/4
 
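For context, `--proportion_empty_prompts` in diffusers' `train_controlnet.py` replaces that fraction of training captions with empty strings so the model learns to follow the conditioning image alone. Conceptually it behaves like this sketch (not the exact diffusers code):

```python
import random

def maybe_drop_prompt(caption: str, proportion_empty_prompts: float = 0.5) -> str:
    # With probability `proportion_empty_prompts`, train on an empty caption
    # so the ControlNet must rely on the edge map alone (caption dropout).
    return "" if random.random() < proportion_empty_prompts else caption
```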
@@ -94,7 +94,7 @@ Update 2023-09-22: bug in algorithm produces too sparse images on default, see h
 
 Same as experiment 1 with `smoothed=True` and `--proportion_empty_prompts=0`.
 
-Trained for 40000 steps with default settings => conditioning images
+Trained for 40000 steps with default settings => results are not good. conditioning images look too noisy. investigate algorithm.
 
 **Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**
 
@@ -109,7 +109,7 @@ edges = ed.detectEdges(image)
 edge_map = ed.getEdgeImage(edges)
 ```
 
-45000 steps =>
+45000 steps => looks good. released as **version 0.1 on civitai**.
 
 **Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**
 
@@ -121,13 +121,13 @@ resumed with epoch 2 from 90000 using `--proportion_empty_prompts=0.5` => result
 
 **Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
 
-see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` =>
+see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` => results are not good; 50% is probably too much for 45k steps. guess mode still doesn't work and tends to produce humans. resuming until 90k with right-left flipped images in the hope it will get better with more images.
 
 **Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
 
 # Ideas
 
-* fine-tune
+* fine-tune from canny
 * cleanup image dataset (l65)
 * uncropped mod64 images
 * integrate edcolor
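Regarding "uncropped mod64 images": assuming this means training on uncropped images resized so both sides are multiples of 64 (which Stable Diffusion's downsampling stack requires), a hypothetical helper could look like this; the function name and `max_side` default are illustrative:

```python
def mod64_size(width: int, height: int, max_side: int = 768) -> tuple[int, int]:
    # Scale the longer side down to max_side, then round both sides
    # down to the nearest multiple of 64 (minimum 64), avoiding any crop.
    scale = min(1.0, max_side / max(width, height))
    w = max(64, int(width * scale) // 64 * 64)
    h = max(64, int(height * scale) // 64 * 64)
    return w, h
```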