# Adversarial-MNIST

This README was generated from `Generating_Adversarial_Images_with_MNIST.ipynb`.

```python
import theano
import theano.tensor as T
import numpy as np
from neuralnet import build_neural_net
```

```
Using gpu device 0: GeForce GT 755M (CNMeM is disabled, cuDNN 5005)
```

When imported, the MNIST images are one-dimensional uint8 arrays. For the sake of the neural network, the entries in each image are converted to floats in the range [0, 1] by dividing by 255. The `deprocess` function below reverses that scaling so an image can be displayed.

```python
def deprocess(array, new_shape=None):
    # Undo the [0, 1] scaling and return a displayable uint8 image.
    if new_shape is not None:
        array = array.reshape(new_shape)
    return np.clip(array * 255., 0, 255).astype('uint8')
```
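For reference, the forward transformation happens inline when the data is loaded below, but a standalone counterpart to `deprocess` would look something like this (a hypothetical helper, not part of the original notebook):

```python
def preprocess(array):
    # Hypothetical counterpart to deprocess: flatten a uint8 image and
    # scale it into the [0, 1] float range the network expects.
    return array.astype(theano.config.floatX).ravel() / 255.
```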

Here I am setting up the relevant symbolic variables. Note that the neural network is built using weights that have already been trained on the MNIST dataset. We will use this network to generate adversarial images; an adversarial image is simply an MNIST image with a bit of noise added.

```python
image = T.vector()
target = T.vector()
# The noise is a shared variable so that Theano can update it in place.
noise = theano.shared(np.asarray(np.random.randn(784), dtype=theano.config.floatX))
adv_image = image + noise
output, _, _ = build_neural_net(adv_image, 'mnist_weights.npy')
```
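`build_neural_net` comes from the local `neuralnet` module, which isn't shown here. Based only on how it is called (it takes an input expression plus a weights file, and its first return value is the network's output), it is presumably something like the following sketch; the architecture, layer sizes, and the layout of `mnist_weights.npy` are all guesses:

```python
def build_neural_net(input_expr, weights_file):
    # Hypothetical reconstruction: a single-hidden-layer MLP whose
    # parameters are loaded from a .npy file. Only the call signature
    # is known from the notebook; everything else is assumed.
    w1, b1, w2, b2 = np.load(weights_file, allow_pickle=True)
    hidden = T.nnet.sigmoid(T.dot(input_expr, w1) + b1)
    output = T.nnet.softmax(T.dot(hidden, w2) + b2)
    params = [w1, b1, w2, b2]
    return output, params, hidden
```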

I have defined the loss function to be the mean squared error of the output, plus the mean of the squared entries in our noise vector. The second summand ensures that the optimized noise alters the image as little as possible. The whole point is that we want to trick the neural network using an amount of noise that is imperceptible or, at least, not obvious to the human eye. This way the adversarial image will look more or less the same to us as the original, but to the computer it will "look" completely different.
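Written out explicitly, with $x$ the input image, $n$ the 784-entry noise vector, $t$ the 10-entry one-hot target, and $f$ the network, the loss being minimized is:

$$\mathcal{L}(n) = \frac{1}{10}\sum_{j=1}^{10}\bigl(f(x+n)_j - t_j\bigr)^2 + \frac{1}{784}\sum_{i=1}^{784} n_i^2$$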

```python
loss = T.mean(T.sqr(output - target)) + T.mean(T.sqr(noise))
updates = [(noise, noise - T.grad(cost=loss, wrt=noise))]
```

Here I am defining a few functions that will be compiled by Theano at runtime. Note that when `optimize_noise` is called, `noise` is re-assigned to `noise - T.grad(cost=loss, wrt=noise)`. As such, we're really just performing gradient descent, with a fixed step size of 1.
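In symbols, each call to `optimize_noise` applies one plain gradient-descent step to the noise:

$$n \leftarrow n - \nabla_n \mathcal{L}(n)$$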

Each of these functions should be self-explanatory, except for `evaluate`, which returns two items: the predicted digit and the network's confidence in that prediction.

```python
optimize_noise = theano.function(inputs=[image, target], outputs=loss, updates=updates, allow_input_downcast=True)
get_adv_image = theano.function(inputs=[image], outputs=adv_image, allow_input_downcast=True)

# The predicted digit is the index of the largest output entry;
# that entry itself is the network's confidence.
i = T.argmax(output)
evaluate = theano.function(inputs=[image], outputs=[i, output[0, i]], allow_input_downcast=True)
```

Now let's get the MNIST dataset.

```python
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original', data_home='./data')
# Scale the first image (a zero) into the [0, 1] range.
zero_img = mnist.data[0] / 255.
```
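If you are running this against a recent scikit-learn, note that `fetch_mldata` has since been removed (mldata.org went offline); an equivalent load via `fetch_openml` would look roughly like this:

```python
from sklearn.datasets import fetch_openml

# fetch_openml returns the 70,000 MNIST images as a (70000, 784) array.
# Note: OpenML's row ordering differs from the old mldata.org dump, so
# data[0] is not guaranteed to be the same zero as in the original run.
mnist = fetch_openml('mnist_784', version=1, as_frame=False, data_home='./data')
zero_img = mnist.data[0] / 255.
```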

Now let's take a look at the first digit in the MNIST dataset, which happens to be a zero, and run it through our neural network.

```python
%matplotlib inline
import matplotlib.pyplot as plt

plt.imshow(deprocess(zero_img, (28, 28)), cmap='Greys')
plt.show()
```

*(Image: the first MNIST digit, a handwritten zero.)*

```python
prediction, confidence = evaluate(zero_img)
print 'predicted the number', prediction, 'with a confidence of', confidence * 100, '%'
```

```
predicted the number 2 with a confidence of 82.348793745 %
```

Note that `evaluate` feeds its input through `adv_image = image + noise`, so even this first prediction includes the randomly initialized noise; that is why the clean zero is misclassified here.

Now we will use the previously defined machinery to generate an adversarial image. Notice that we're using a one-hot encoding to represent a target output value of seven.
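Equivalently, the target vector written out by hand below could be built with numpy (a trivial sketch, not from the notebook):

```python
# One-hot encoding for the digit 7: a 1 at index 7, zeros elsewhere.
target_seven = np.eye(10)[7]  # array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.])
```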

```python
# Run 5000 gradient-descent steps on the noise, pushing the network's
# output for this image towards the one-hot target for the digit 7.
for i in range(5000):
    error = optimize_noise(zero_img, [0, 0, 0, 0, 0, 0, 0, 1, 0, 0])
adv_zero_img = get_adv_image(zero_img)
plt.imshow(deprocess(adv_zero_img, (28, 28)), cmap='Greys')
plt.show()
```

*(Image: the adversarial zero, still recognizable as a zero to the eye, with faint added noise.)*

```python
# The optimized noise lives inside the graph, so we still pass the
# clean image; evaluate adds the noise for us.
prediction, confidence = evaluate(zero_img)
print 'predicted the number', prediction, 'with a confidence of', confidence * 100, '%'
```

```
predicted the number 7 with a confidence of 95.9307074547 %
```
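As a sanity check on the "small perturbation" claim, one could inspect the optimized noise directly (a sketch, not part of the original notebook):

```python
# Pull the optimized noise out of the shared variable.
optimized_noise = noise.get_value()

# With pixel values in [0, 1], these statistics indicate how small the
# per-pixel perturbation is.
print 'mean |noise|:', np.abs(optimized_noise).mean()
print 'max  |noise|:', np.abs(optimized_noise).max()

# Visualize the noise itself (it has negative entries, so skip deprocess).
plt.imshow(optimized_noise.reshape(28, 28), cmap='Greys')
plt.show()
```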