|
---
library_name: keras
license: mit
language:
- en
pipeline_tag: image-to-image
---
|
# Autoencoder Grayscale2Color Landscape
|
|
|
[Hugging Face Hub](https://huggingface.co/docs/hub)

[Pillow](https://pypi.org/project/pillow/)

[NumPy](https://numpy.org/)

[TensorFlow](https://www.tensorflow.org/)

[Gradio](https://gradio.app/)

[MIT License](https://opensource.org/licenses/MIT)
|
|
|
## Introduction |
|
Transform grayscale landscape images into vibrant, full-color visuals with this autoencoder model. Built from scratch, the network learns to predict the a*b* color channels of the L*a*b* color space from the grayscale (L*) input.
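
For reference, the scaling used throughout this card follows directly from the L*a*b* decomposition. Below is a minimal sketch with scikit-image (the same library used in the Usage section) of how an RGB image splits into the L* channel the model receives and the a*b* channels it predicts:

```python
import numpy as np
from skimage.color import rgb2lab

rgb = np.random.rand(64, 64, 3)  # any RGB image with values in [0, 1]
lab = rgb2lab(rgb)

L = lab[..., 0]    # lightness, range [0, 100] -> the grayscale input
ab = lab[..., 1:]  # a*/b* color channels, roughly [-128, 127] -> what the model predicts
```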
|
|
|
## Key Features |
|
- Converts grayscale landscape images to vivid RGB.

- Custom autoencoder with spatial attention for enhanced detail.

- Optimized for high-quality inference at 512x512 resolution.

- Achieves a PSNR of 21.70 on the validation set.
|
|
|
## Notebook |
|
Explore the implementation in our Jupyter notebook: |
|
[Open in Colab](https://colab.research.google.com/#fileId=https://huggingface.co/danhtran2mind/autoencoder-grayscale2color-landscape/blob/main/notebooks/autoencoder-grayscale-to-color-landscape.ipynb)

[View on Hugging Face](https://huggingface.co/danhtran2mind/autoencoder-grayscale2color-landscape/blob/main/notebooks/autoencoder-grayscale-to-color-landscape.ipynb)
|
|
|
## Dataset |
|
Details about the dataset are available in the [Dataset README](./dataset/README.md).
|
|
|
## From Scratch Model |
|
Custom-built autoencoder with a spatial attention mechanism, trained **FROM SCRATCH** to predict a*b* color channels from grayscale (L*) inputs.
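
The exact layer lives in `models/auto_encoder_gray2color.py`; as an illustration only, a CBAM-style spatial attention block in Keras can look like the sketch below (the real `SpatialAttention` implementation may differ):

```python
import tensorflow as tf

class SpatialAttentionSketch(tf.keras.layers.Layer):
    """Illustrative spatial attention block; not the repo's exact layer."""

    def __init__(self, kernel_size=7, **kwargs):
        super().__init__(**kwargs)
        self.conv = tf.keras.layers.Conv2D(
            1, kernel_size, padding="same", activation="sigmoid"
        )

    def call(self, inputs):
        # Collapse the channel axis into average and max maps, then learn a
        # per-pixel attention weight from them.
        avg_pool = tf.reduce_mean(inputs, axis=-1, keepdims=True)
        max_pool = tf.reduce_max(inputs, axis=-1, keepdims=True)
        attention = self.conv(tf.concat([avg_pool, max_pool], axis=-1))
        return inputs * attention  # reweight each spatial location
```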
|
|
|
## Demonstration |
|
Try the interactive demo and transform grayscale landscapes into vibrant colors.
|
|
|
[Open the demo on Hugging Face Spaces](https://huggingface.co/spaces/danhtran2mind/autoencoder-grayscale2color-landscape)
|
|
|
 |
|
|
|
## Installation |
|
|
|
### Step 1: Clone the Repository |
|
```bash
git clone https://huggingface.co/danhtran2mind/autoencoder-grayscale2color-landscape
cd ./autoencoder-grayscale2color-landscape
git lfs pull
```
|
|
|
### Step 2: Install Dependencies |
|
```bash
pip install -r requirements.txt
```
|
|
|
## Usage |
|
|
|
Follow these steps to colorize images programmatically using Python. |
|
|
|
### 1. Import Required Libraries |
|
Install and import the necessary libraries for image processing and model inference. |
|
|
|
```python
import os

import matplotlib.pyplot as plt
import numpy as np
import requests
import tensorflow as tf
from PIL import Image
from skimage.color import lab2rgb

from models.auto_encoder_gray2color import SpatialAttention
```
|
|
|
### 2. Load the Pre-trained Model |
|
Download the autoencoder checkpoint if it's not already available locally, then load it with the custom `SpatialAttention` layer registered.
|
|
|
```python
load_model_path = "./ckpts/best_model.h5"
os.makedirs(os.path.dirname(load_model_path), exist_ok=True)

# Fetch the checkpoint from the Hugging Face repo if it is missing locally
# (URL assumes the ckpts/best_model.h5 layout of this repository).
if not os.path.exists(load_model_path):
    url = "https://huggingface.co/danhtran2mind/autoencoder-grayscale2color-landscape/resolve/main/ckpts/best_model.h5"
    response = requests.get(url)
    response.raise_for_status()
    with open(load_model_path, "wb") as f:
        f.write(response.content)

print(f"Loading model from {load_model_path}...")
loaded_autoencoder = tf.keras.models.load_model(
    load_model_path, custom_objects={"SpatialAttention": SpatialAttention}
)
print("Model loaded successfully.")
```
|
|
|
### 3. Define Image Processing Functions |
|
These functions handle image preprocessing, colorization, and visualization. |
|
|
|
```python
def process_image(input_img):
    """Convert a grayscale image to color using the autoencoder."""
    # Store original dimensions
    original_width, original_height = input_img.size

    # Preprocess: Convert to grayscale, resize, and normalize
    img = input_img.convert("L").resize((512, 512))
    img_array = tf.keras.preprocessing.image.img_to_array(img) / 255.0
    img_array = img_array[None, ..., 0:1]  # Add batch dimension

    # Predict color channels
    output_array = loaded_autoencoder.predict(img_array)

    # Reconstruct LAB image
    L_channel = img_array[0, :, :, 0] * 100.0  # Scale L channel
    ab_channels = output_array[0] * 128.0  # Scale ab channels
    lab_image = np.stack([L_channel, ab_channels[:, :, 0], ab_channels[:, :, 1]], axis=-1)

    # Convert to RGB and clip values
    rgb_array = lab2rgb(lab_image)
    rgb_array = np.clip(rgb_array, 0, 1) * 255.0

    # Create and resize output image
    rgb_image = Image.fromarray(rgb_array.astype(np.uint8), mode="RGB")
    return rgb_image.resize((original_width, original_height), Image.Resampling.LANCZOS)


def process_and_save_image(image_path):
    """Process an image and save the colorized result."""
    input_img = Image.open(image_path)
    output_img = process_image(input_img)
    output_img.save("output.jpg")
    return input_img, output_img


def plot_images(input_img, output_img):
    """Display input and output images side by side."""
    plt.figure(figsize=(17, 8), dpi=300)

    # Plot input grayscale image
    plt.subplot(1, 2, 1)
    plt.imshow(input_img, cmap="gray")
    plt.title("Input Grayscale Image")
    plt.axis("off")

    # Plot output colorized image
    plt.subplot(1, 2, 2)
    plt.imshow(output_img)
    plt.title("Colorized Output Image")
    plt.axis("off")

    # Save and display the plot
    plt.savefig("output.jpg", dpi=300, bbox_inches="tight")
    plt.show()
```
|
|
|
### 4. Perform Inference |
|
Run the colorization process on a sample image. |
|
|
|
```python
# Set image dimensions and path
WIDTH, HEIGHT = 512, 512
image_path = "<path_to_input_image.jpg>"  # Replace with your image path

# Process and visualize the image
input_img, output_img = process_and_save_image(image_path)
plot_images(input_img, output_img)
```
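
To colorize several images in one go, a small loop over `process_image` is enough. The folder names below are illustrative:

```python
from pathlib import Path

input_dir, output_dir = Path("inputs"), Path("outputs")
output_dir.mkdir(exist_ok=True)

# Colorize every .jpg in the input folder and save each result separately.
for path in sorted(input_dir.glob("*.jpg")):
    colorized = process_image(Image.open(path))
    colorized.save(output_dir / f"{path.stem}_colorized.jpg")
```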
|
|
|
### 5. Example Output |
|
The output is a side-by-side comparison of the input grayscale image and the colorized result, saved as `output.jpg`.
|
 |
|
|
|
## Training Hyperparameters |
|
- **Resolution**: 512x512 pixels |
|
- **Color Space**: L*a*b* |
|
- **Custom Layer**: SpatialAttention |
|
- **Model File**: `best_model.h5` |
|
- **Epochs**: 100 |
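
The exact training pipeline is in the notebook; the sketch below shows how 512x512 training pairs consistent with these settings (and with the scaling used in `process_image` above) can be built. The helper name is illustrative:

```python
import numpy as np
from PIL import Image
from skimage.color import rgb2lab

def make_training_pair(image_path, size=(512, 512)):
    """Hypothetical helper: L*/100 as input, a*b*/128 as target."""
    rgb = np.asarray(Image.open(image_path).convert("RGB").resize(size)) / 255.0
    lab = rgb2lab(rgb)
    x = lab[..., 0:1] / 100.0  # grayscale input, shape (512, 512, 1)
    y = lab[..., 1:3] / 128.0  # color target, shape (512, 512, 2)
    return x.astype("float32"), y.astype("float32")
```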
|
|
|
## Callbacks |
|
- **Early Stopping**: Monitors `val_loss`, patience of 20 epochs, restores best weights. |
|
- **ReduceLROnPlateau**: Monitors `val_loss`, reduces learning rate by 50% after 5 epochs without improvement, minimum learning rate of 1e-6.
|
- **BackupAndRestore**: Saves checkpoints to `./ckpts/backup`. |
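
In Keras, these settings map onto the standard callbacks as in the sketch below (the variable name is illustrative):

```python
import tensorflow as tf

callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=20, restore_best_weights=True
    ),
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6
    ),
    tf.keras.callbacks.BackupAndRestore(backup_dir="./ckpts/backup"),
]
# Passed to model.fit(..., callbacks=callbacks) during training.
```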
|
|
|
## Metrics |
|
- **PSNR (Validation)**: 21.70
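
PSNR here is the usual 10 * log10(MAX^2 / MSE); with TensorFlow it can be computed directly, for example:

```python
import tensorflow as tf

def validation_psnr(y_true, y_pred, max_val=1.0):
    """Mean PSNR over a batch of images scaled to [0, 1] (illustrative helper)."""
    return tf.reduce_mean(tf.image.psnr(y_true, y_pred, max_val=max_val))
```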
|
|
|
## Environment |
|
- Python 3.11.11 |
|
- Libraries:

```
numpy==1.26.4
tensorflow==2.18.0
opencv-python==4.11.0.86
scikit-image==0.25.2
matplotlib==3.7.2
```
|
|
|
## Contact |
|
For questions or issues, reach out via the [HuggingFace Community](https://huggingface.co/danhtran2mind/autoencoder-grayscale2color-landscape/discussions) tab.
|
|