CNN visualization of layer activations

Using TensorFlow 2.0 and Keras

Based on the CIFAR-10 deep neural network prepared in the CnnCifar10 notebook (HTML / Jupyter), let's try simple visualization techniques.

Learning goals:

  • Visualize intermediate layers of a CNN
  • Visualize activation maps
In [1]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import datasets, models
import tensorflow as tf
import seaborn as sns
In [2]:
import os
# Work around the "duplicate OpenMP runtime" crash seen on some macOS / Anaconda setups
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

Data - CIFAR-10

Images are normalized to $[0, 1]$

In [3]:
(xTrain, yTrain),(xTest, yTest) = datasets.cifar10.load_data()
xTrain = xTrain / 255.
xTest  = xTest  / 255.

classNames = ['plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

xTrain.shape, xTest.shape
Out[3]:
((50000, 32, 32, 3), (10000, 32, 32, 3))
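As a quick sanity check (a small sketch added for this write-up, not an original notebook cell), the pixel values should indeed lie in $[0, 1]$ after the division by 255:

# Pixel values should lie in [0, 1] after scaling; labels are integer class ids of shape (n, 1)
print(xTrain.dtype, xTrain.min(), xTrain.max())
print(yTrain.shape, yTest.shape)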

Model

In [4]:
model0 = models.load_model('models/CIFAR-10_CNN5.h5')
model0.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_0 (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv_1 (Conv2D)              (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv_2 (Conv2D)              (None, 4, 4, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0         
_________________________________________________________________
dense_0 (Dense)              (None, 64)                65600     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
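The helper below selects layers by index, so it is convenient to print the index-to-name mapping first (a small sketch added for reference, not an original cell):

# Layer indices as used by predictUntilLayer() in the next section
for idx, layer in enumerate(model0.layers):
    print(idx, layer.name, layer.output_shape)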

Helpers

In [5]:
def predictUntilLayer(model, layerIndex, data):
    """ Execute prediction on a portion of the model """
    intermediateModel = models.Model(inputs=model.input,
                                 outputs=model.layers[layerIndex].output)
    return intermediateModel.predict(data)

def plotHeatMap(X, classes='auto', title=None, fmt='.2g', ax=None, xlabel=None, ylabel=None,
               vmin=None, vmax=None, cbar=True):
    """ Fix heatmap plot from Seaborn with pyplot 3.1.0, 3.1.1
        https://stackoverflow.com/questions/56942670/matplotlib-seaborn-first-and-last-row-cut-in-half-of-heatmap-plot
    """
    ax = sns.heatmap(X, xticklabels=classes, yticklabels=classes, annot=True, 
                     fmt=fmt, vmin=vmin, vmax=vmax, cbar=cbar, cmap=plt.cm.bwr, ax=ax) 
    bottom, top = ax.get_ylim()
    ax.set_ylim(bottom + 0.5, top - 0.5)
    if title:
        ax.set_title(title)
    if xlabel:
        ax.set_xlabel(xlabel)
    if ylabel:
        ax.set_ylabel(ylabel)
        
def plotOutputs(numConvo, numCols, samples, originals, labels):
    """ Plot a layer output on selected samples """
    numRowsPerSample = int(np.ceil(numConvo / numCols))
    numRows = len(originals) * numRowsPerSample
    fig, axes = plt.subplots(numRows, numCols + 1, figsize=((numCols + 1) * 2, numRows * 2))
    axes = axes.ravel()
    for i, sample in enumerate(samples):
        ax = axes[i * (numRowsPerSample * (numCols + 1))]
        ax.imshow(originals[i])
        ax.set_title(labels[i])
        if(i == 0): 
            ax.xaxis.tick_top()
        for row in range(numRowsPerSample):
            for col in range(numCols):
                c = row * numCols + col
                ax = axes[(i * numRowsPerSample + row) * (numCols + 1) + col + 1]
                ax.imshow(samples[i,:, :, c], cmap='gray')
                ax.set_title('Conv #' + str(c))
                if(i == 0): 
                    ax.xaxis.tick_top()
    plt.setp(axes, xticks=[], yticks=[], frame_on=False)

Convolution layer #0 activation

Convolution layer #0 is connected to the 32x32, three-channel (RGB) image input. There are 32 convolution filters in this layer, each with a 3x3x3 kernel.

Let's draw the layer's convolution units' coefficients as heat maps:

In [6]:
weights0 = model0.get_weights()

# 32 filters shown on a 6x6 grid (the last 4 axes stay empty);
# only the first (red) input channel of each 3x3x3 kernel is displayed
fig, axes = plt.subplots(6, 6, figsize=(16, 12), sharex=True, sharey=True)
for i in range(32):
    ax = axes.ravel()[i]
    plotHeatMap(weights0[0][:,:, 0, i], [0, 1, 2], ax=ax, vmin=-1., vmax=1., cbar=False)

axes[0,0].set_title("Convolution layer #0 weights");
plt.setp(axes, frame_on=False);

It is quite difficult to extract information from so many small heat maps.

It is also difficult to evaluate the impact of each neuron unit given the connections to previous and following layers.
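One way to condense this view is to summarize each filter by a single number, for instance the L2 norm of its weights; a minimal sketch reusing the weights0 array loaded above:

# One bar per conv_0 filter: L2 norm of its 3x3x3 kernel
filterNorms = np.linalg.norm(weights0[0].reshape(-1, 32), axis=0)
plt.figure(figsize=(10, 3))
plt.bar(range(32), filterNorms)
plt.xlabel('Filter index')
plt.ylabel('L2 norm of weights');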

Sample set to compute activations

The above display of weights carries limited useful information about the network.

Many visualization techniques use a sample set of images to inspect the activations within the network.

In [7]:
selectedSampleIndexes = [0, 2, 3, 4, 6, 17]
selectedSamples = xTest[selectedSampleIndexes]
selectedLabels = [classNames[yTest[i][0]] for i in selectedSampleIndexes]
preds = np.argmax(model0.predict(selectedSamples), axis=1)
selectedPredictions = [classNames[p] for p in preds]

fig, axes = plt.subplots(1, len(selectedSampleIndexes), figsize=(16, 7))
for ax, sample, estLabel, trueLabel in zip(axes, selectedSamples, selectedPredictions, selectedLabels):
    ax.imshow(sample)
    ax.set_title("predict : %s\nactual : %s" % (estLabel, trueLabel))
plt.setp(axes, xticks=[], yticks=[], frame_on=False);

Note that the horse on the right is wrongly labeled as a dog by the classifier.
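To see how close that call was, we can inspect the classifier's scores for that sample; a small sketch, assuming the horse is the last of the six selected samples:

# Top-3 scores for the last selected sample (the misclassified horse);
# these are probabilities if dense_1 ends with a softmax activation, raw logits otherwise
scores = model0.predict(selectedSamples[-1:])[0]
for name, s in sorted(zip(classNames, scores), key=lambda t: -t[1])[:3]:
    print('%-6s %.3f' % (name, s))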

In [8]:
sampleAtLayer0 = predictUntilLayer(model0, 0, selectedSamples)
sampleAtLayer0.shape
Out[8]:
(6, 30, 30, 32)
In [9]:
plotOutputs(32, 8, sampleAtLayer0, selectedSamples, selectedLabels)

We may observe that some filters are focusing on edges, for example #3, #4 and #8.

Here again, the information load is high.
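One way to reduce the load is to follow a single filter across all six samples; a small sketch, with the filter index chosen arbitrarily:

# Activation of one conv_0 filter (here #4) for each selected sample
f = 4
fig, axes = plt.subplots(1, len(selectedSamples), figsize=(16, 3))
for ax, act, label in zip(axes, sampleAtLayer0, selectedLabels):
    ax.imshow(act[:, :, f], cmap='gray')
    ax.set_title(label)
plt.setp(axes, xticks=[], yticks=[], frame_on=False);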

Layer #0 max pooling activations

In [10]:
sampleAtLayer0drop = predictUntilLayer(model0, 1, selectedSamples[0:2])
sampleAtLayer0drop.shape
Out[10]:
(2, 15, 15, 32)
In [11]:
plotOutputs(32, 8, sampleAtLayer0drop, selectedSamples[0:2], selectedLabels[0:2])

As expected, the 2x2 max pooling acts as a downsampler, keeping the strongest local activations.
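For reference, here is a tiny numeric example of what a 2x2 max pooling does (illustrative only, not taken from the notebook):

# 2x2 max pooling keeps the largest value of each non-overlapping 2x2 block
x = tf.constant([[1., 2., 3., 4.],
                 [5., 6., 7., 8.],
                 [9., 1., 2., 3.],
                 [4., 5., 6., 7.]])[tf.newaxis, :, :, tf.newaxis]
print(tf.nn.max_pool2d(x, ksize=2, strides=2, padding='VALID')[0, :, :, 0].numpy())
# [[6. 8.]
#  [9. 7.]]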

Convolution layer #1 activation

Convolution layer #1 is connected to layer #0 through a 2x2 max pooling (halving the size of the feature maps along each of the two spatial dimensions). It is made of 64 convolution filters, each connected to the 32 input channels.

In [12]:
weights0[2].shape
Out[12]:
(3, 3, 32, 64)
In [13]:
sampleAtLayer1 = predictUntilLayer(model0, 2, selectedSamples)
sampleAtLayer1.shape
Out[13]:
(6, 13, 13, 64)
In [14]:
plotOutputs(64, 8, sampleAtLayer1, selectedSamples, selectedLabels)

We see that at the output of the second convolutional layer, the filters are focusing on more detailed parts of the images, but it becomes harder to state exactly what each filter is focusing on.

The jet plane, with its very sharp edges, is the easiest to read, as the edges clearly appear in some of the filters' activations.

Conclusion

Activation maps are a nice exploration tool. However, their interpretability is limited to the first couple of layers. In the following notebook (HTML / Jupyter), we will experiment with other techniques based on gradients in order to get a more consolidated view of the hidden layers.
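As a preview of those gradient-based techniques, here is a minimal saliency-map sketch using tf.GradientTape (an illustrative sketch, not the method developed in the next notebook):

# Gradient of the top class score with respect to the input pixels
img = tf.convert_to_tensor(selectedSamples[:1], dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(img)
    scores = model0(img)
    topScore = tf.reduce_max(scores, axis=1)
grads = tape.gradient(topScore, img)
saliency = tf.reduce_max(tf.abs(grads), axis=-1)[0]   # max over the 3 color channels
plt.imshow(saliency, cmap='gray');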