How to Fool A CNN in 5 easy steps

A lighthearted article on the importance of data quality.

Written by Isabella Gagner

Published on Medium

Within machine learning, and especially in my team at NordAxon, we are constantly talking about the importance of data quality when we are building our models. For beginners, it is easy to forget that we must be mindful of data quality when using our fully trained machine learning models as well: just because we built a model on perfect data (which does not even exist, except in theory!) does not mean that our model can handle any and all challenges coming its way!

Let me show you a very pedagogical example of this. This article contains some code for those who wish to experiment on their own, but you should be able to follow along even if you have no prior programming experience. This is a fun project that I did during work hours, as we at NordAxon believe strongly in having our own projects and continuously educating ourselves. That is our Kaizen!

Step 1: Choose a pre-trained model that you wish to fool.

As my data will be images, I want to use a Convolutional Neural Network (CNN). The strength of CNNs lies in the convolutional layers, which essentially filter out and boil down the important features in each image, all while keeping spatial information! An intuitive way to think about this “spatial information” is that a CNN finds relations between pixels, leading it to understand that a cat consists of two ears at the top, two eyes below, and a nose underneath that. We rarely see cats with the ears below the nose!
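To make the idea of a convolutional filter a bit more concrete, here is a minimal sketch of my own (not part of the VGG16 code; the tiny image, the hand-written `convolve2d` helper and the kernel are all made up for illustration) that slides a 3x3 edge-detection kernel over a small grayscale image. A CNN learns many such kernels by itself instead of us hand-picking them:

```python
import numpy as np

def convolve2d(img, kernel):
    """Slide a kernel over an image (valid mode, no padding)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Each output pixel summarises a local neighbourhood,
            # which is how spatial information is preserved
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

# A tiny made-up image with a vertical edge: dark left, bright right
img = np.array([[0, 0, 10, 10],
                [0, 0, 10, 10],
                [0, 0, 10, 10],
                [0, 0, 10, 10]], dtype=float)

# A classic vertical edge-detection kernel
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

edges = convolve2d(img, kernel)
print(edges)  # Large magnitudes appear where the edge is
```

Every window here straddles the edge, so every output value has a large magnitude; on a real photo, only the regions containing edges would light up.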

I will make this very easy and use a pre-trained model from Keras, namely the VGG16 net with ‘imagenet’ weights! For those of you who are not familiar with this concept, it is essentially a fully trained, fairly large convolutional neural network that has 1000 classes as outputs, which means that the model can recognize 1000 different objects in images. It can classify everything from coffee mugs and cars to elephants and dogs, and it was trained on 1.2 million images from the ImageNet dataset! As I did not have 1.2 million images myself (or the computing power to train such a large network from scratch…), I decided to use the VGG16 and predict on my images directly.

Step 2: Download or create images!

Now, let us see what we can use to try to fool this VGG16 network; what could be “dirty data”? As it happens, I stumbled upon the koty_vezde Instagram account and thought it was perfect. With the approval of the account owner, I have now downloaded 20 images and tried to classify them…

So, let us look at the images!

Step 3: Enjoy the images for a while

And while you’re at it, take a look at the wonderful Instagram account where I found these cat fusions! There are a lot more images like these.

Step 4: Predict!

This section shows the code for creating the predictions for the images, i.e. what does the model think that it is looking at when presented with these cat-fusion images?

Alright, let’s get down to business! Easy peasy lemon squeezy, we read our images into the shape that our net expects. For VGG16, that would be tensors of shape (224, 224, 3): images that are 224 pixels wide and high, and 3 channels deep for the three colours in RGB.

import os
from collections import defaultdict

import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input, decode_predictions
from PIL import Image
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# Load the pre-trained VGG16 net with its ImageNet weights
model = VGG16(weights='imagenet')

# Read paths into a list
dirr = 'Cat-animals'
imgs_paths = [os.path.join(dirr, path)
              for path in os.listdir(dirr)
              if path.endswith('jpg')]

# Read images and resize them (yes, I do love list comprehensions)
imgs = [image.img_to_array(image.load_img(path, target_size=(224, 224)))
        for path in imgs_paths]

# Normalize the colour channels
imgs = np.asarray([preprocess_input(img) for img in imgs])

# Print the final tensor shape = (20, 224, 224, 3)
print(imgs.shape)

Now, we have our images saved in our imgs-tensor. Let’s predict!

preds = model.predict(imgs)

# Pick the top 3 predictions for each image
all_top_3_preds = defaultdict(list)
for img_pred in range(len(imgs)):
    top_3_preds = decode_predictions(
        np.expand_dims(preds[img_pred], axis=0), top=3)[0]
    # Save the top 3 (class name, probability) pairs along with the file name
    all_top_3_preds[imgs_paths[img_pred]] = [(name, prob)
                                             for _, name, prob in top_3_preds]

Step 5: Evaluate

So, let us see how the VGG16 net classified these images (and remember that the percentages can be interpreted as how sure the model is that the prediction is correct):
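A short note on why we may read the outputs as percentages: the last layer of VGG16 is a softmax, which squashes the 1000 raw scores into non-negative values that sum to 1. Here is a minimal sketch, with three made-up scores standing in for the 1000 outputs:

```python
import numpy as np

def softmax(scores):
    """Turn raw network scores (logits) into probabilities."""
    exp = np.exp(scores - np.max(scores))  # subtract max for numerical stability
    return exp / exp.sum()

# Three made-up scores standing in for VGG16's 1000 output neurons
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)

print(probs)        # Each value lies between 0 and 1...
print(probs.sum())  # ...and together they sum to 1, like percentages
```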

My personal favourites are the black fish classified as Hummingbird and Bow Tie, and the giraffe that was classified as a Cheetah.

While it is very easy for us humans to see exactly what has happened in the images, it proved difficult for our dear VGG16 net. None of the images had “cat” among its top 3 predictions, which actually surprised me! But when thinking about it a bit more, it makes sense: the model takes a lot of information into consideration when classifying, and while it is easy for us to see “cat” and “squirrel” as two separate objects that have been fused, the model will try to make sense of these things together.

In fact, the model assumes that all the data it sees is correct. It would never dream of us trying to fool it, and as such, it tries to make sense out of all the information it has. For the model, the majority of the information in most of the pictures points to one animal, or at least one type of animal. The fact that these photoshopped animals are completely outside the realms of reality is not considered, although in some cases the predictions do get a bit confused (such as the cat-giraffe, where the model sees a cat face and the giraffe pattern and draws the conclusion Cheetah).

An important consideration is that the model you have built only knows what it has previously seen. And that leads me to my point:

It is so important to keep track of data quality, not only when building your model, but also when using the model. Just because you trained a model on clean data does not mean that it will perform well on dirty data afterwards. Continuous controls of data, model performance and model verification are of the utmost importance: we cannot have a model that predicts “hummingbird” and “bow tie” where the obvious answers are “fish” and possibly “cat”.
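As a sketch of what such a control could look like in practice (the `sanity_check` helper and the 0.5 threshold are hypothetical, purely for illustration), one simple guardrail is to refuse to act on a prediction when the model’s own confidence is too low:

```python
def sanity_check(top_predictions, threshold=0.5):
    """Flag predictions that are too uncertain to trust.

    `top_predictions` is a list of (label, probability) pairs,
    sorted by probability, like the pairs decode_predictions
    returns once the class id is dropped.
    """
    label, prob = top_predictions[0]
    if prob < threshold:
        return None  # too unsure: route the image to a human instead
    return label

# The model is far from sure this fish is a hummingbird, so we abstain
print(sanity_check([('hummingbird', 0.32), ('bow_tie', 0.18)]))  # None
# A confident prediction passes through
print(sanity_check([('tabby', 0.91), ('tiger_cat', 0.05)]))      # tabby
```

A real deployment would of course monitor much more than one threshold (input statistics, drift, per-class performance), but even a check this simple would have caught the “hummingbird” fish.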

Step 6 (optional): Create heatmaps

One of the perks of working with CNNs is that we can actually visualize what the model activated on when classifying an image. Let us take a closer look at that!

This section is very much inspired by Chollet’s notebook on heatmaps, which you can find at this link. This is mostly for fun, and the analysis above still holds.

Essentially, what happens in this code is that we look at the gradients of the convolutional layers in the model. These gradients give us a hint of how much the model looks at the different parts of an image when classifying. Intuitively, this can be thought of as asking “how important is this specific part of the image for the model’s prediction?”, where the colour red indicates high importance.

import cv2

# We start by looking at the summary of the model, partly as a
# reminder, but also to get the names of the different layers
model.summary()

layer = 'block5_conv3'

# Build a model that maps the input image to both the output feature
# map of `block5_conv3` (the last convolutional layer in VGG16) and
# the final predictions
last_conv_layer = model.get_layer(layer)
grad_model = tf.keras.models.Model(
    model.inputs, [last_conv_layer.output, model.output])

# Let us loop through each of the cat images
for i, img_path in enumerate(imgs_paths):
    # This is the entry of the top predicted class in the prediction vector
    idx = np.argmax(preds[i])
    x = np.expand_dims(imgs[i], axis=0)

    # This is the gradient of the top predicted class with regard
    # to the output feature map of `block5_conv3`
    with tf.GradientTape() as tape:
        conv_output, predictions = grad_model(x)
        output = predictions[:, idx]
    grads = tape.gradient(output, conv_output)

    # This is a vector of shape (512,), where each entry is the mean
    # intensity of the gradient over a specific feature map channel
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2)).numpy()
    conv_layer_output_value = conv_output.numpy()[0]

    # We multiply each channel in the feature map array by "how
    # important this channel is" with regard to the predicted class
    for ch in range(pooled_grads.shape[0]):
        conv_layer_output_value[:, :, ch] *= pooled_grads[ch]

    # The channel-wise mean of the resulting feature map
    # is our heatmap of class activation
    heatmap = np.mean(conv_layer_output_value, axis=-1)
    heatmap = np.maximum(heatmap, 0)
    heatmap /= np.max(heatmap)

    # We use cv2 to load the original image; I loaded it as a
    # gray-scale image as I had a hard time seeing the heatmap
    # otherwise
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    img = np.expand_dims(img, axis=2)

    # We resize the heatmap to have the same size as the original
    # image
    heatmap = cv2.resize(heatmap, (img.shape[1], img.shape[0]))

    # We convert the heatmap to RGB and apply it to the original image
    heatmap = np.uint8(255 * heatmap)
    heatmap = cv2.applyColorMap(heatmap, cv2.COLORMAP_JET)

    # 0.5 here is a heatmap intensity factor
    superimposed_img = heatmap * 0.5 + img

    # Save the image to disk
    cv2.imwrite(img_path.split('.')[0] + '_' + layer + '.jpg',
                superimposed_img)

Alright, so far we have only looked at the last convolutional layer in the model. If we repeat these steps for each layer, and combine the resulting images into a GIF, we get a better overview of how the model activates as we move through the layers! Let us look at a few.
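For the curious, one way to stitch the saved per-layer heatmap images into a GIF is with Pillow. This is a minimal sketch that uses generated stand-in frames; in the real project you would open the saved heatmap files instead:

```python
from PIL import Image

# Stand-in frames: in the real project these would be the saved
# heatmap images, one per convolutional layer
frames = [Image.new('RGB', (64, 64), color=(i * 40, 0, 0))
          for i in range(5)]

# Save the first frame and append the rest;
# 500 ms per frame, looping forever
frames[0].save('cat_layers.gif', save_all=True,
               append_images=frames[1:], duration=500, loop=0)

gif = Image.open('cat_layers.gif')
print(gif.n_frames)  # 5 frames, one per layer
```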

We now see that the model does indeed consider several parts of each image as it moves through the layers, starting with edges and moving on to larger areas. The first layers extract local features, such as edges and patterns, and with each layer, more global features and patterns are taken into consideration. At the end, it seems like the model in the majority of cases discards the “cat face” feature as unimportant for the classification, and focuses on the areas around it instead. This shows that the model essentially sees what it wants to see: it does not care that a detail such as the face does not cohere with what it has seen before.

Finishing this little project, I was curious about how some of the predictions went wrong even when the model ignored the cat face. As an example, a quick Google image search for Flatworm, which the Octopus was predicted as, shows that the model must have confused the Octopus arms with the sides of a Flatworm; it is possible that this would happen with the original image as well. And the obvious cat faces? Apparently not so obvious to the VGG16 net.

Imagine what could happen if a model misclassified objects in a self-driving car…