# Model Overview

This model is a fine-tuned Denoising Diffusion Probabilistic Model (DDPM) for generating images of flowers from the Oxford Flowers dataset. It builds on the pretrained google/ddpm-cifar10-32 model and is optimized for training on a GPU.

# Model Details

```
Architecture:     UNet2DModel
Noise Scheduler:  DDPMScheduler
Training Data:    Oxford Flowers dataset (nelorth/oxford-flowers)
Optimizer:        AdamW
Learning Rate:    1e-4, adjusted with a cosine scheduler
Training Length:  100 epochs
Batch Size:       64
Image Size:       32x32 pixels
```

# Training Configuration

The training process involves the following steps:

## Data Preprocessing

- Images are resized to 32x32.
- Random horizontal flipping is applied for augmentation.
- Pixel values are normalized to the range [-1, 1].

## Noise Addition

Random noise is added to the images using a linear beta schedule.

## Model Training

- The UNet model predicts the noise added to the images.
- Mean squared error (MSE) loss is used.
- The learning rate is adjusted with a cosine scheduler.

## Checkpointing

Model checkpoints are saved every 1000 steps.

# Usage

Once trained, the model can be used to generate images of flowers. The trained model is saved as a DDPMPipeline and can be loaded for inference.

# Model Inference

A quantized version of the model is available for reduced memory usage and faster inference on resource-limited devices.

```python
from optimum.intel.openvino import OVModelForImageGeneration

pipeline = OVModelForImageGeneration.from_pretrained("flower_diffusion_quantized", export=True)
images = pipeline(batch_size=4, num_inference_steps=50).images
images[0].show()
```

# Model Variants

- FP32 version: standard-precision model.
- FP16 version: reduced precision for lower memory usage.
- INT8 (quantized) version: used for the OpenVINO inference path described above.

# Limitations and Considerations

- Image resolution: trained at 32x32, which may limit fine detail.
- Computational requirements: a GPU is recommended for inference.
- Dataset bias: the model is trained solely on Oxford Flowers, so generalization to other image domains is limited.
- Quantized model accuracy: INT8 quantization may slightly reduce output quality but speeds up inference.