Experiments in generating textures using Variational Autoencoders (VAEs).
VAEs have the benefit of fast inference, which makes them suitable for deployment in the browser or in games.
The VAE model can be used beyond simple sampling: since image generation is fully differentiable, the synthesis can be guided towards an objective.
See notebooks/stich.ipynb
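A minimal sketch of such guided synthesis, optimizing a latent code by gradient descent. The tiny `decoder` and the target-matching objective below are stand-ins, not the repo's actual model:

```python
import torch
import torch.nn as nn

# Stand-in decoder; in practice this would be the trained VAE decoder.
decoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 3 * 128 * 128))

# Example objective: steer the generated image towards a target image.
target = torch.rand(1, 3 * 128 * 128)

z = torch.randn(1, 32, requires_grad=True)   # latent_dims = 32 as in config.yml
optimizer = torch.optim.Adam([z], lr=1e-2)

for step in range(200):
    image = decoder(z)                  # generation is fully differentiable
    loss = ((image - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()                     # gradients flow through the decoder into z
    optimizer.step()
```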
- Python >= 3.8
- PyTorch >= 1.10.2
- Linux OS
- CUDA compatible GPU
# create venv and install dependencies
make setup
# download the training data
wget https://build-fitid-s3-bucket-download.s3.eu-central-1.amazonaws.com/motesque/bricks1000.zip
unzip bricks1000.zip
model:
  latent_dims: 32
  texture_size: 128
  kl_weight: 0.7
  lr: 1e-4
  epochs: 500
  snapshots: "../snapshots"
  train_images: "../bricks1000"
logs:
  logdir: "../tb/runs"
  experiment_name: "default"
cd texture_vae
python train.py -c config.yml
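For reference, a minimal sketch of how such a config can be read, assuming PyYAML; the actual parsing in train.py may differ:

```python
import yaml

with open("config.yml") as f:
    cfg = yaml.safe_load(f)

latent_dims = cfg["model"]["latent_dims"]   # 32
kl_weight = cfg["model"]["kl_weight"]       # 0.7
logdir = cfg["logs"]["logdir"]              # "../tb/runs"
```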
TensorBoard
make tb
HTTP dev server with ONNX Runtime
make http
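Running the decoder in the browser requires exporting it to ONNX first. A minimal export sketch; the tiny `decoder` below is a hypothetical stand-in for the trained model loaded from a snapshot:

```python
import torch
import torch.nn as nn

# Stand-in for the trained VAE decoder restored from the snapshots directory.
decoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 3 * 128 * 128))

dummy_z = torch.randn(1, 32)  # latent_dims = 32
torch.onnx.export(
    decoder,
    dummy_z,
    "decoder.onnx",
    input_names=["latent"],
    output_names=["image"],
    dynamic_axes={"latent": {0: "batch"}, "image": {0: "batch"}},
)
```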
The training images were compiled from high-resolution Creative Commons brick textures sourced from Texture Ninja.
Each high-res texture was randomly cropped into 128x128 images using cli/crop_gen.py.
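A minimal sketch of that cropping step, assuming Pillow; cli/crop_gen.py itself may work differently:

```python
import random
from pathlib import Path
from PIL import Image

SIZE = 128  # matches texture_size in config.yml

def random_crops(path: Path, out_dir: Path, n: int = 50) -> None:
    """Cut n random SIZE x SIZE patches out of one high-res texture."""
    image = Image.open(path).convert("RGB")
    w, h = image.size
    for i in range(n):
        x = random.randint(0, w - SIZE)
        y = random.randint(0, h - SIZE)
        crop = image.crop((x, y, x + SIZE, y + SIZE))
        crop.save(out_dir / f"{path.stem}_{i:04d}.png")
```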
- A latent code size of 32 appears sufficient; increasing it has no dramatic effect.
- The KL divergence weighting has a huge influence on sampling quality: a lower weight improves reconstructions, but images sampled from the prior then degrade noticeably (see the loss sketch below).
- https://www.jeremyjordan.me/variational-autoencoders/
- https://avandekleut.github.io/vae/
- https://towardsdatascience.com/generating-images-with-autoencoders-77fd3a8dd368
- https://medium.com/mlearning-ai/enerating-images-with-ddpms-a-pytorch-implementation-cef5a2ba8cb1