Normalizing-flow models are invertible neural networks (INNs) - a type of
generative model that not only generates new samples from the learned
distribution (as GANs and VAEs do) but also supports exact likelihood
computation (which GANs and VAEs do not). This is accomplished with an
architecture that ensures all transformations are reversible and the Jacobian
determinant is efficiently computable. By modeling probabilities directly,
INNs enable a range of other applications too - a real Swiss-army-knife of
the modeling world that has recently fascinated me.
These normalizing-flow models transform complex data distributions into more tractable ones (usually Gaussian), in which probabilistic calculations such as anomaly detection become feasible. But these models allow far more than anomaly detection: their capabilities let INNs cover generative image modeling, generative classification, parameter estimation for ill-conditioned problems, and (ill-posed) inverse problems with or without noise on the data. All of these stem from the same theme of mapping one probability distribution into another.
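To make that concrete, here is a minimal sketch of such a flow using TFP building blocks (masked-autoregressive layers here; an illustrative toy, not this repo's actual model or dimensions). The flow is a `TransformedDistribution` whose base is a standard normal latent; `log_prob` gives the exact likelihood via the change-of-variables formula log p_X(x) = log p_Z(f(x)) + log |det J_f(x)|:

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfb = tfp.bijectors

DIMS = 4  # toy dimensionality; image data would be much larger

# Stack a few masked-autoregressive layers, permuting dimensions in between
# so every dimension eventually conditions on every other.
bijectors = []
for _ in range(3):
    bijectors.append(tfb.MaskedAutoregressiveFlow(
        shift_and_log_scale_fn=tfb.AutoregressiveNetwork(
            params=2, event_shape=[DIMS], hidden_units=[64, 64])))
    bijectors.append(tfb.Permute(permutation=list(reversed(range(DIMS)))))

flow = tfd.TransformedDistribution(
    distribution=tfd.MultivariateNormalDiag(loc=tf.zeros(DIMS)),
    bijector=tfb.Chain(bijectors))

x = tf.random.normal([8, DIMS])   # stand-in for real data vectors
log_p = flow.log_prob(x)          # exact log-likelihood (change of variables)
x_new = flow.sample(8)            # generation: sample latent, map through flow
```

Training such a flow just maximizes `flow.log_prob` on the data; after training, unusually low `log_prob` values are exactly the anomaly-detection signal mentioned above.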
Other implementations of INNs I've seen out there each cover only one specific application, with a lot of bespoke code. But the TensorFlow Probability package provides almost everything needed to implement these models in a more encapsulated, cleaner, and easier-to-understand way (at least for me!). Of course, as I expand this work I'm wrestling with a number of tradeoffs between what to generalize/simplify via TFP and what to implement explicitly - part of the learning process for me.
The above diagram summarizes, for different applications, variations in how the N-dimensional model inputs are mapped through the flow model to N-dimensional outputs that include a latent multivariate standard normal distribution capturing some or all of the complex variations on the input side. All those output points can each be mapped back through the model to the inputs as well, which is important in image generation, uncertainty quantification, and inverse problems, among others. The little images in each frame of the gif are subtle references to the example applications I'm implementing for each variation, and to key research papers from the literature that describe these variations one at a time. I acknowledge that at this summary level I'm not yet describing what all those little images and details are; the papers are referenced at the bottom of this readme though.
Work is still in progress - I'm gradually implementing the series of 7 applications in the figure. Currently #2 is fully implemented (documented at "Flow_models 2: Image generation and anomaly detection as two sides of the same coin"), and since this first one comprised the bulk of the work, the rest are variations using the same modeling code. Instructions for using/running it follow below, and similar ones are upcoming for the other applications as well. The point being: it's all the same model, just with a few variations in the partitioning of the inputs and outputs. The overview/introduction article for the series is also available, at "Flow_models: Overview / Introduction".
- For full runs on a GPU-enabled EC2 instance (as opposed to just initial smaller-scale testing on a CPU-only instance), I recommend following these instructions from my py_tf2_gpu_dock_mlflow repository to set that up. I'm also working on some scripts to kick off the training remotely in a Docker container via AWS ECR using AWS Batch (that's what's in subdir `awsbatch-support`), but that's not ready yet. Meanwhile, simply installing on the GPU-enabled instance per those instructions linked above lets you run the training there.
- Create the python environment and install dependencies:

  ```
  > make create-env
  Creating/installing new python env /home/ubuntu/src/python/flow_models/.venv3
  ```

  (This is just a convenience macro that runs the usual `python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt`, except note this macro creates new `.venvN` subdirectories, incrementing N to avoid overwriting existing env subdirectories.)
- Get images to train/test with. Of course you can use whatever images you want; for my experimentation I used the really nicely curated Kaggle dataset animal-faces. If using a dedicated GPU-enabled instance, you could save these image files directly on that instance in a `data` subdir within the `flow_models` repo directory. In that case the URIs for `train_generator` and `other_generator` in `train.py` can simply be `"data/train"` for example. Or you can use image files in an S3 bucket, whether with the dedicated GPU-enabled instance or in a batch configuration; in that case the URIs should have the form `"s3://mybucket/myprefix/train"`. This is not supervised learning, so labels are not used for training, but it can still be useful to reserve some validation data to experiment with after training anyway. Whether locally or in S3, I find the following directory structure helpful. Note the data generator reading the files will combine all subdirectories of files together, so `cat` and `beachball` images will be mixed together in the validation dataset (see the scoring sketch after this block):

  ```
  data/
    train/
      cat/
    val/
      beachball/   <-- these show up as outliers in gaussian latent points
      cat/         <-- these don't
  ```
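As a sketch of the outlier check referenced in the tree above (hypothetical code, not the repo's: it assumes a trained `flow` model like the earlier sketch whose event size matches the flattened images, and the image size here is an arbitrary choice):

```python
import tensorflow as tf

# Load validation images without labels; all subdirs (cat, beachball) get mixed.
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data/val", labels=None, image_size=(64, 64), batch_size=32)

for batch in val_ds:
    x = tf.reshape(batch / 255.0, [tf.shape(batch)[0], -1])  # flatten to vectors
    scores = flow.log_prob(x)  # 'flow' is a trained model as sketched earlier
    # beachball images should score markedly lower (latent outliers) than cats
```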
- Enter the python environment created above if not already in it: `source .venv/bin/activate`
- Set the environment variable `export TF_CPP_MIN_LOG_LEVEL=2` to squelch a number of status/info lines spewed by TensorFlow and TensorFlow Probability (TFP) that I don't find too helpful and that make a mess in the console output. (Similarly, note I've put a python line at the top of `train.py` to squelch `UserWarning`s that are spewed by TFP - see the sketch after this list.)
- Set desired parameters in `train.py`.
- Run `python train.py`.
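For reference, that warning-suppression line at the top of `train.py` is likely just the standard Python filter shown below (a sketch of the usual approach; the repo's exact line may differ):

```python
import warnings

# Suppress UserWarnings (e.g. those emitted by TFP) before importing TF/TFP.
warnings.filterwarnings("ignore", category=UserWarning)
```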
- Distribution mapping and generative image modeling with INNs
- Generative classification and ill-conditioned parameter estimation with INNs
- Bayesian inverse problems with INNs
- TensorFlow Probability components