danjacobellis committed on
Commit 66b9dac · verified · 1 Parent(s): b941198

Update README.md

Files changed (1)
  1. README.md +4 -260
README.md CHANGED
@@ -4,265 +4,9 @@ datasets:
  - danjacobellis/musdb_segments
  ---
 
- - [WaLLoC repository](https://github.com/danjacobellis/walloc/)
+ # Wavelet Learned Lossy Compression
+
+ - [Project page and documentation](https://danjacobellis.net/walloc)
  - [Paper: "Learned Compression for Compressed Learning"](https://danjacobellis.net/_static/walloc.pdf)
  - [Additional code accompanying the paper](https://github.com/danjacobellis/lccl)
-
-
- # Wavelet Learned Lossy Compression (WaLLoC)
-
- WaLLoC sandwiches a convolutional autoencoder between time-frequency analysis and synthesis transforms using
- CDF 9/7 wavelet filters. The time-frequency transform increases the number of signal channels, but reduces the temporal or spatial resolution, resulting in lower GPU memory consumption and higher throughput. WaLLoC's training procedure is highly simplified compared to other $\beta$-VAEs, VQ-VAEs, and neural codecs, but still offers significant dimensionality reduction and compression. This makes it suitable for dataset storage and compressed-domain learning. It currently supports 1D and 2D signals, including mono, stereo, or multi-channel audio, and grayscale, RGB, or hyperspectral images.
-
- ## Installation
-
- 1. Follow the installation instructions for [torch](https://pytorch.org/get-started/locally/)
- 2. Install WaLLoC and other dependencies via pip
-
- ```pip install walloc PyWavelets pytorch-wavelets```
-
- ## Pre-trained checkpoints
-
- Pre-trained checkpoints are available on [Hugging Face](https://huggingface.co/danjacobellis/walloc).
-
- ## Training
-
- [RGB Images](https://github.com/danjacobellis/walloc/blob/main/train/train_rgb.ipynb)
- [Stereo Audio](https://github.com/danjacobellis/walloc/blob/main/train/train_stereo.ipynb)
- If your are interested in training a codec on a different modality, contact Dan via [email.](mailto:[email protected])
-
- ## Usage example
-
-
- ```python
- import os
- import torch
- import matplotlib.pyplot as plt
- import numpy as np
- from PIL import Image, ImageEnhance
- from IPython.display import display
- from torchvision.transforms import ToPILImage, PILToTensor
- from walloc import walloc
- from walloc.walloc import latent_to_pil, pil_to_latent
- class Config: pass
- ```
-
- ### Load the model from a pre-trained checkpoint
-
- ```wget https://hf.co/danjacobellis/walloc/resolve/main/RGB_Li_27c_J3_nf4_v1.0.2.pth```
-
-
- ```python
- device = "cpu"
- checkpoint = torch.load("RGB_Li_27c_J3_nf4_v1.0.2.pth",map_location="cpu",weights_only=False)
- codec_config = checkpoint['config']
- codec = walloc.Codec2D(
-     channels = codec_config.channels,
-     J = codec_config.J,
-     Ne = codec_config.Ne,
-     Nd = codec_config.Nd,
-     latent_dim = codec_config.latent_dim,
-     latent_bits = codec_config.latent_bits,
-     lightweight_encode = codec_config.lightweight_encode
- )
- codec.load_state_dict(checkpoint['model_state_dict'])
- codec = codec.to(device)
- codec.eval();
- ```
-
- ### Load an example image
-
- ```wget "https://r0k.us/graphics/kodak/kodak/kodim05.png"```
-
-
- ```python
- img = Image.open("kodim05.png")
- img
- ```
-
-
-
-
-
- ![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/README_6_0.png)
-
-
-
-
- ### Full encoding and decoding pipeline with .forward()
-
- * If `codec.eval()` is called, the latent is rounded to nearest integer.
-
- * If `codec.train()` is called, uniform noise is added instead of rounding.
-
-
- ```python
- with torch.no_grad():
-     codec.eval()
-     x = PILToTensor()(img).to(torch.float)
-     x = (x/255 - 0.5).unsqueeze(0).to(device)
-     x_hat, _, _ = codec(x)
- ToPILImage()(x_hat[0]+0.5)
- ```
-
-
-
-
-
- ![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/README_8_0.png)
-
-
-
-
- ### Accessing latents
-
-
- ```python
- with torch.no_grad():
-     codec.eval()
-     X = codec.wavelet_analysis(x,J=codec.J)
-     Y = codec.encoder(X)
-     X_hat = codec.decoder(Y)
-     x_hat = codec.wavelet_synthesis(X_hat,J=codec.J)
-
- print(f"dimensionality reduction: {x.numel()/Y.numel()}×")
- ```
-
- dimensionality reduction: 7.111111111111111×
-
-
-
- ```python
- Y.unique()
- ```
-
-
-
-
- tensor([-7., -6., -5., -4., -3., -2., -1., -0., 1., 2., 3., 4., 5., 6.,
- 7.])
-
-
-
-
- ```python
- plt.figure(figsize=(5,3),dpi=150)
- plt.hist(
-     Y.flatten().numpy(),
-     range=(-7.5,7.5),
-     bins=15,
-     density=True,
-     width=0.9);
- plt.title("Histogram of latents")
- plt.xticks(range(-7,8,1));
- plt.xlim([-7.5,7.5])
- ```
-
-
-
-
- (-7.5, 7.5)
-
-
-
-
-
- ![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/README_12_1.png)
-
-
-
- # Lossless compression of latents
-
-
- ```python
- def scale_for_display(img, n_bits):
-     scale_factor = (2**8 - 1) / (2**n_bits - 1)
-     lut = [int(i * scale_factor) for i in range(2**n_bits)]
-     channels = img.split()
-     scaled_channels = [ch.point(lut * 2**(8-n_bits)) for ch in channels]
-     return Image.merge(img.mode, scaled_channels)
- ```
-
- ### Single channel PNG (L)
-
-
- ```python
- Y_padded = torch.nn.functional.pad(Y, (0, 0, 0, 0, 0, 9))
- Y_pil = latent_to_pil(Y_padded,codec.latent_bits,1)
- display(scale_for_display(Y_pil[0], codec.latent_bits))
- Y_pil[0].save('latent.png')
- png = [Image.open("latent.png")]
- Y_rec = pil_to_latent(png,36,codec.latent_bits,1)
- assert(Y_rec.equal(Y_padded))
- print("compression_ratio: ", x.numel()/os.path.getsize("latent.png"))
- ```
-
-
-
- ![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/README_16_0.png)
-
-
-
- compression_ratio: 15.171345894154717
-
-
- ### Three channel WebP (RGB)
-
-
- ```python
- Y_pil = latent_to_pil(Y,codec.latent_bits,3)
- display(scale_for_display(Y_pil[0], codec.latent_bits))
- Y_pil[0].save('latent.webp',lossless=True)
- webp = [Image.open("latent.webp")]
- Y_rec = pil_to_latent(webp,27,codec.latent_bits,3)
- assert(Y_rec.equal(Y))
- print("compression_ratio: ", x.numel()/os.path.getsize("latent.webp"))
- ```
-
-
-
- ![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/README_18_0.png)
-
-
-
- compression_ratio: 16.451175633838172
-
-
- ### Four channel TIF (CMYK)
-
-
- ```python
- Y_padded = torch.nn.functional.pad(Y, (0, 0, 0, 0, 0, 9))
- Y_pil = latent_to_pil(Y_padded,codec.latent_bits,4)
- display(scale_for_display(Y_pil[0], codec.latent_bits))
- Y_pil[0].save('latent.tif',compression="tiff_adobe_deflate")
- tif = [Image.open("latent.tif")]
- Y_rec = pil_to_latent(tif,36,codec.latent_bits,4)
- assert(Y_rec.equal(Y_padded))
- print("compression_ratio: ", x.numel()/os.path.getsize("latent.tif"))
- ```
-
-
-
- ![jpeg](README_files/README_20_0.jpg)
-
-
-
- compression_ratio: 12.40611656815935
-
-
-
- ```python
- !jupyter nbconvert --to markdown README.ipynb
- ```
-
- [NbConvertApp] Converting notebook README.ipynb to markdown
- [NbConvertApp] Support files will be in README_files/
- [NbConvertApp] Making directory README_files
- [NbConvertApp] Writing 6024 bytes to README.md
-
-
-
- ```python
- !sed -i 's|!\[png](README_files/\(README_[0-9]*_[0-9]*\.png\))|![png](https://huggingface.co/danjacobellis/walloc/resolve/main/README_files/\1)|g' README.md
- ```
+ - [Pre-trained codecs available on Hugging Face](https://huggingface.co/danjacobellis/walloc)
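
Note on the removed usage example: the reported `dimensionality reduction: 7.111111111111111×` can be checked with simple arithmetic. The sketch below is not part of the commit. It assumes that each of the `J` wavelet levels halves every spatial axis and that the encoder changes only the channel count; the README does not state this explicitly, but it is consistent with the printed ratio. The helper name `dimensionality_reduction` is hypothetical, and the values `J = 3` and `latent_dim = 27` are read off the checkpoint filename `RGB_Li_27c_J3_nf4_v1.0.2.pth` and the `pil_to_latent(webp,27,...)` call.

```python
def dimensionality_reduction(in_channels: int, levels: int, latent_dim: int, ndim: int = 2) -> float:
    """Ratio of input elements to latent elements.

    Assumption (not confirmed by the README): each wavelet level halves every
    spatial axis, and the encoder only changes the number of channels.
    """
    downsample = 2 ** levels  # per-axis reduction after `levels` wavelet levels
    return in_channels * downsample ** ndim / latent_dim


# RGB input (3 channels), J = 3, 27 latent channels, 2D signal
ratio = dimensionality_reduction(in_channels=3, levels=3, latent_dim=27)
print(f"dimensionality reduction: {ratio}×")
```

Under these assumptions the ratio is 3 · 8² / 27 = 64/9 regardless of image size, which agrees with the value `x.numel()/Y.numel()` printed in the removed README. For 1D signals such as audio, `ndim=1` would apply instead.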