Diffusers
Safetensors
English
vae
convolutional
generative
Vittorio Pippi committed on
Commit dd90e2c · 1 Parent(s): 9b27178

Fix the YAML metadata

Files changed (1)
  1. README.md +1 -5
README.md CHANGED
@@ -1,6 +1,3 @@
- # Emuru Convolutional VAE
-
- ```yaml
  ---
  language:
  - "en"
@@ -18,9 +15,8 @@ metrics:
  - CER
  library_name: diffusers
  ---
- ```

- ## Model Description
+ ## Emuru Convolutional VAE

  This repository hosts the **Emuru Convolutional VAE**, described in our paper. The model features a convolutional encoder and decoder, each with four layers. The output channels for these layers are 32, 64, 128, and 256, respectively. The encoder downsamples an input RGB image \( I \in \mathbb{R}^{3 \times W \times H} \) to a latent representation with a single channel and spatial dimensions \( h \times w \) (where \( h = H/8 \) and \( w = W/8 \)). This design compresses the style information in the image, allowing a lightweight Transformer Decoder to efficiently process the latent features.
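The dimensions quoted in the description can be checked with a few lines of arithmetic. This is only a sketch: `latent_shape` is a hypothetical helper, and the requirement that both sides be divisible by 8 is an assumption made here, since the README only states \( h = H/8 \), \( w = W/8 \), and a single latent channel.

```python
from typing import Tuple


def latent_shape(height: int, width: int, factor: int = 8) -> Tuple[int, int, int]:
    """Return (channels, h, w) of the latent for an RGB input of size height x width.

    Assumption: the input sides must be exact multiples of the downsampling
    factor. The README does not say how other sizes are handled.
    """
    if height % factor != 0 or width % factor != 0:
        raise ValueError("height and width must be multiples of %d" % factor)
    # The README describes a single-channel latent, downsampled by 8 per side.
    return (1, height // factor, width // factor)


print(latent_shape(64, 768))  # (1, 8, 96)
```

For a 64 x 768 text-line image this gives 8 x 96 = 768 latent values against 3 x 64 x 768 = 147456 input values, which illustrates why a lightweight decoder can process the latents.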
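The reason for the change is that the Hub reads model-card metadata from a YAML block at the very top of README.md, delimited by `---` lines. Before this commit the block sat under a heading and inside a `yaml` code fence, so it was ordinary Markdown content and not metadata. A minimal sketch of that distinction follows. It assumes a simplified rule (the first line is `---` and a closing `---` appears later), and `has_front_matter` is a hypothetical helper, not part of any Hub library.

```python
FENCE = "`" * 3  # built programmatically to avoid nesting literal fences here


def has_front_matter(readme_text: str) -> bool:
    """Simplified check: file starts with '---' and has a later closing '---'."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False
    return any(line.strip() == "---" for line in lines[1:])


# README layout before the commit: heading, then the YAML inside a code fence.
before = "\n".join([
    "# Emuru Convolutional VAE",
    "",
    FENCE + "yaml",
    "---",
    "language:",
    '- "en"',
    "---",
    FENCE,
    "",
])

# README layout after the commit: YAML block first, heading afterwards.
after = "\n".join([
    "---",
    "language:",
    '- "en"',
    "---",
    "",
    "## Emuru Convolutional VAE",
    "",
])

print(has_front_matter(before))  # False
print(has_front_matter(after))   # True
```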