explain-mnist

Experiments in explainable AI with exact optimization tools on the MNIST image dataset. Evaluating the robustness of a classifier based on (provably, globally) smallest adversarial perturbations.

Dependencies:

  • python3
  • CPLEX (and docplex python library)
  • matplotlib
  • numpy
  • keras

Proof-of-concept scripts for a simple neural network on a binary classification task.

train_twoclass.py

Train a simple fully connected neural network with one hidden layer.

min_explanation.py

Compute an "explanation" of a prediction. Given an input image, this is a minimal set of pixels that determines the output label, regardless of the values of all other pixels. Uses CPLEX as a decision procedure for a "destructive MUS" algorithm.
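The deletion-based ("destructive") loop can be sketched as follows. This is a minimal illustration, not the code in min_explanation.py: the function names are hypothetical, and the CPLEX decision procedure is replaced by a closed-form oracle for a toy linear classifier with pixels in [0, 1], which is an assumption made only to keep the example self-contained.

```python
from typing import Callable, List, Set


def minimal_explanation(n_pixels: int,
                        is_sufficient: Callable[[Set[int]], bool]) -> Set[int]:
    """Deletion-based search for a subset-minimal set of fixed pixels.

    is_sufficient(S) must return True when fixing the pixels in S to their
    observed values forces the predicted label, whatever the others are.
    """
    fixed = set(range(n_pixels))
    for i in range(n_pixels):
        candidate = fixed - {i}
        if is_sufficient(candidate):
            fixed = candidate  # pixel i is not needed for the prediction
    return fixed


def linear_oracle(w: List[float], b: float,
                  x: List[float]) -> Callable[[Set[int]], bool]:
    """Hypothetical stand-in for the solver call.

    Toy model: the label is positive when w.x + b > 0, pixels lie in [0, 1],
    and the original prediction is assumed to be positive.
    """
    def is_sufficient(fixed: Set[int]) -> bool:
        worst = b
        for i, (wi, xi) in enumerate(zip(w, x)):
            # A free pixel takes whichever value in [0, 1] lowers the score most.
            worst += wi * xi if i in fixed else min(0.0, wi)
        return worst > 0
    return is_sufficient


w = [2.0, -1.0, 0.5, -3.0]
x = [1.0, 0.0, 1.0, 0.0]
explanation = minimal_explanation(len(x), linear_oracle(w, -0.5, x))
print(sorted(explanation))  # [0, 3]
```

The result is minimal with respect to set inclusion, not necessarily the smallest possible set; the order in which pixels are tried can change which explanation is found.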

min_adv_sum.py

Compute the smallest adversarial example with respect to the sum of squared differences from the original image, using mixed integer quadratic programming (MIQP).
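One common way to write such a problem for a network with a single ReLU hidden layer is the big-M formulation below. It is a sketch under assumptions, not necessarily the encoding the script uses: x is the original image, x' the perturbed one, W, b and v, c are the hidden and output layer parameters, y in {-1, +1} is the label predicted for x, M is any valid upper bound on |z_j|, and epsilon is a small margin.

```latex
\begin{aligned}
\min_{x',\,z,\,h,\,a}\quad & \sum_i (x'_i - x_i)^2 \\
\text{s.t.}\quad & z_j = \sum_i W_{ji}\, x'_i + b_j, \\
& h_j \ge z_j, \quad h_j \ge 0, \\
& h_j \le z_j + M (1 - a_j), \quad h_j \le M a_j, \quad a_j \in \{0, 1\}, \\
& 0 \le x'_i \le 1, \\
& y \Big( \sum_j v_j h_j + c \Big) \le -\varepsilon .
\end{aligned}
```

The binary variable a_j selects the active (h_j = z_j) or inactive (h_j = 0) branch of each ReLU, and the last constraint forces the output to switch sign.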

min_adv_card.py

Compute a smallest adversarial example with respect to the total number of pixels changed from the original input, using mixed integer programming (MIP).
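To illustrate the cardinality objective without a solver, the sketch below swaps the MIP for exhaustive search over pixel subsets on a toy linear classifier. The model, the function name, and the assumption that the original prediction is positive are all hypothetical; the approach only scales to a handful of pixels.

```python
from itertools import combinations
from typing import List, Optional, Tuple


def min_cardinality_attack(
    w: List[float], b: float, x: List[float]
) -> Optional[Tuple[Tuple[int, ...], List[float]]]:
    """Fewest pixel changes that flip a positive prediction of w.x + b > 0.

    Pixels are assumed to lie in [0, 1]. Subsets are tried in order of
    increasing size, so the first hit uses the minimum number of pixels.
    """
    n = len(x)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            x_adv = list(x)
            for i in subset:
                # For a linear model the best change is always an extreme value.
                x_adv[i] = 0.0 if w[i] > 0 else 1.0
            score = b + sum(wi * vi for wi, vi in zip(w, x_adv))
            if score < 0:
                return subset, x_adv
    return None  # no assignment of the pixels flips the label


w = [2.0, -1.0, 0.5, -3.0]
x = [1.0, 0.0, 1.0, 0.0]
result = min_cardinality_attack(w, -0.5, x)
print(result)  # ((3,), [1.0, 0.0, 1.0, 1.0])
```

In a MIP, the same objective is typically expressed with one binary indicator per pixel that is forced to 1 whenever the pixel differs from the original, and the sum of the indicators is minimized.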

Apply the above techniques to a multiclass classifier.

train_multiclass.py

Train a somewhat more complicated neural network with multiple hidden layers and output classes.

min_adv_sum.py

For a given input image, compute the minimal changes needed to predict each possible label.
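A targeted variant can be stated as one optimization problem per target label t. The formulation below is a sketch under assumptions, not taken from the script: x is the original image, x' the perturbed one, o_k(x') is the network output for class k, and epsilon is a small margin.

```latex
\begin{aligned}
\min_{x'}\quad & \sum_i (x'_i - x_i)^2 \\
\text{s.t.}\quad & o_t(x') \ge o_k(x') + \varepsilon \quad \text{for all } k \ne t, \\
& 0 \le x'_i \le 1 .
\end{aligned}
```

Solving this once for each t yields one minimally changed image per label.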

Can we do the same with simple convolutional networks?

train_cnn_simple.py

Train a simple CNN with 10 3x3 convolution kernels.

min_adv_sum.py

As above, but we observe clear visual differences in the minimum adversarial changes compared to the network without convolution.

min_adv_card.py

Compute the minimum number of changed pixels instead.

binary.py

Implements the encodings from the AAAI paper "Verifying Properties of Binarized Deep Neural Networks".

  • MIP: working
  • IP: working
  • CNF: producing intractably large formulas