CNN visualization of layer activations

Using TensorFlow 2.0 and Keras

Based on the CIFAR-10 deep neural network prepared in the CnnCifar10 notebook (HTML / Jupyter), let's try simple visualization techniques.

Learning goals:

  • Visualize intermediate layers of a CNN
  • Visualize activation maps
In [1]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import datasets, models
import tensorflow as tf
import seaborn as sns
In [2]:
import os
# Work around the "duplicate OpenMP runtime" crash seen on some macOS / Anaconda setups
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

Data - CIFAR-10

Images are normalized to $[0, 1]$

In [3]:
(xTrain, yTrain),(xTest, yTest) = datasets.cifar10.load_data()
xTrain = xTrain / 255.
xTest  = xTest  / 255.

classNames = ['plane', 'car', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

xTrain.shape, xTest.shape
Out[3]:
((50000, 32, 32, 3), (10000, 32, 32, 3))
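As a quick sanity check (a small sketch added for this write-up, not an original notebook cell), the pixel values should indeed lie in $[0, 1]$ after the division by 255:

# Pixel values should lie in [0, 1] after scaling; labels are integer class ids of shape (n, 1)
print(xTrain.dtype, xTrain.min(), xTrain.max())
print(yTrain.shape, yTest.shape)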

Model

In [4]:
model0 = models.load_model('models/CIFAR-10_CNN5.h5')
model0.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_0 (Conv2D)              (None, 30, 30, 32)        896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 32)        0         
_________________________________________________________________
conv_1 (Conv2D)              (None, 13, 13, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv_2 (Conv2D)              (None, 4, 4, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0         
_________________________________________________________________
dense_0 (Dense)              (None, 64)                65600     
_________________________________________________________________
dense_1 (Dense)              (None, 10)                650       
=================================================================
Total params: 122,570
Trainable params: 122,570
Non-trainable params: 0
_________________________________________________________________
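The helper below selects layers by index, so it is convenient to print the index-to-name mapping first (a small sketch added for reference, not an original cell):

# Layer indices as used by predictUntilLayer() in the next section
for idx, layer in enumerate(model0.layers):
    print(idx, layer.name, layer.output_shape)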

Helpers

In [5]:
def predictUntilLayer(model, layerIndex, data):
    """ Execute prediction on a portion of the model """
    intermediateModel = models.Model(inputs=model.input,
                                 outputs=model.layers[layerIndex].output)
    return intermediateModel.predict(data)

def plotHeatMap(X, classes='auto', title=None, fmt='.2g', ax=None, xlabel=None, ylabel=None,
               vmin=None, vmax=None, cbar=True):
    """ Fix heatmap plot from Seaborn with pyplot 3.1.0, 3.1.1
        https://stackoverflow.com/questions/56942670/matplotlib-seaborn-first-and-last-row-cut-in-half-of-heatmap-plot
    """
    ax = sns.heatmap(X, xticklabels=classes, yticklabels=classes, annot=True, 
                     fmt=fmt, vmin=vmin, vmax=vmax, cbar=cbar, cmap=plt.cm.bwr, ax=ax) 
    bottom, top = ax.get_ylim()
    ax.set_ylim(bottom + 0.5, top - 0.5)
    if title:
        ax.set_title(title)
    if xlabel:
        ax.set_xlabel(xlabel)
    if ylabel:
        ax.set_ylabel(ylabel)
        
def plotOutputs(numConvo, numCols, samples, originals, labels):
    """ Plot a layer output on selected samples """
    numRowsPerSample = int(np.ceil(numConvo / numCols))
    numRows = len(originals) * numRowsPerSample
    fig, axes = plt.subplots(numRows, numCols + 1, figsize=((numCols + 1) * 2, numRows * 2))
    axes = axes.ravel()
    for i, sample in enumerate(samples):
        ax = axes[i * (numRowsPerSample * (numCols + 1))]
        ax.imshow(originals[i])
        ax.set_title(labels[i])
        if(i == 0): 
            ax.xaxis.tick_top()
        for row in range(numRowsPerSample):
            for col in range(numCols):
                c = row * numCols + col
                ax = axes[(i * numRowsPerSample + row) * (numCols + 1) + col + 1]
                ax.imshow(samples[i,:, :, c], cmap='gray')
                ax.set_title('Conv #' + str(c))
                if(i == 0): 
                    ax.xaxis.tick_top()
    plt.setp(axes, xticks=[], yticks=[], frame_on=False)

Convolution layer #0 activation

Convolution layer #0 is connected to the 32x32, three-channel (RGB) image input. There are 32 convolution filters in this layer, each with a 3x3x3 kernel.

Let's draw the layer's convolution units' coefficients as heat maps:

In [6]:
weights0 = model0.get_weights()

# 32 filters shown on a 6x6 grid (the last 4 axes stay empty);
# only the first (red) input channel of each 3x3x3 kernel is displayed
fig, axes = plt.subplots(6, 6, figsize=(16, 12), sharex=True, sharey=True)
for i in range(32):
    ax = axes.ravel()[i]
    plotHeatMap(weights0[0][:,:, 0, i], [0, 1, 2], ax=ax, vmin=-1., vmax=1., cbar=False)

axes[0,0].set_title("Convolution layer #0 weights");
plt.setp(axes, frame_on=False);

It is quite difficult to extract information from so many small heat maps.

It is also difficult to evaluate the impact of each neuron unit given the connections to previous and following layers.
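One way to condense this view is to summarize each filter by a single number, for instance the L2 norm of its weights; a minimal sketch reusing the weights0 array loaded above:

# One bar per conv_0 filter: L2 norm of its 3x3x3 kernel
filterNorms = np.linalg.norm(weights0[0].reshape(-1, 32), axis=0)
plt.figure(figsize=(10, 3))
plt.bar(range(32), filterNorms)
plt.xlabel('Filter index')
plt.ylabel('L2 norm of weights');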

Sample set to compute activations

The above display of weights carries limited useful information about the network.

Many visualization techniques use a sample set of images to inspect the activations within the network.

In [7]:
selectedSampleIndexes = [0, 2, 3, 4, 6, 17]
selectedSamples = xTest[selectedSampleIndexes]
selectedLabels = [classNames[yTest[i][0]] for i in selectedSampleIndexes]
preds = np.argmax(model0.predict(selectedSamples), axis=1)
selectedPredictions = [classNames[p] for p in preds]

fig, axes = plt.subplots(1, len(selectedSampleIndexes), figsize=(16, 7))
for ax, sample, estLabel, trueLabel in zip(axes, selectedSamples, selectedPredictions, selectedLabels):
    ax.imshow(sample)
    ax.set_title("predict : %s\nactual : %s" % (estLabel, trueLabel))
plt.setp(axes, xticks=[], yticks=[], frame_on=False);

Note that the horse on the right is wrongly labeled as a dog by the classifier.
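To see how close that call was, we can inspect the classifier's scores for that sample; a small sketch, assuming the horse is the last of the six selected samples:

# Top-3 scores for the last selected sample (the misclassified horse);
# these are probabilities if dense_1 ends with a softmax activation, raw logits otherwise
scores = model0.predict(selectedSamples[-1:])[0]
for name, s in sorted(zip(classNames, scores), key=lambda t: -t[1])[:3]:
    print('%-6s %.3f' % (name, s))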

In [8]:
sampleAtLayer0 = predictUntilLayer(model0, 0, selectedSamples)
sampleAtLayer0.shape
Out[8]:
(6, 30, 30, 32)
In [9]:
plotOutputs(32, 8, sampleAtLayer0, selectedSamples, selectedLabels)

We may observe that some filters are focusing on edges, for example #3, #4 and #8.

Here again, the information load is high.
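One way to reduce the load is to follow a single filter across all six samples; a small sketch, with the filter index chosen arbitrarily:

# Activation of one conv_0 filter (here #4) for each selected sample
f = 4
fig, axes = plt.subplots(1, len(selectedSamples), figsize=(16, 3))
for ax, act, label in zip(axes, sampleAtLayer0, selectedLabels):
    ax.imshow(act[:, :, f], cmap='gray')
    ax.set_title(label)
plt.setp(axes, xticks=[], yticks=[], frame_on=False);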

Layer #0 max pooling activations

In [10]:
sampleAtLayer0drop = predictUntilLayer(model0, 1, selectedSamples[0:2])
sampleAtLayer0drop.shape
Out[10]:
(2, 15, 15, 32)
In [11]:
plotOutputs(32, 8, sampleAtLayer0drop, selectedSamples[0:2], selectedLabels[0:2])

As expected, the 2x2 max pooling acts as a downsampler, keeping the strongest local activations.
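For reference, here is a tiny numeric example of what a 2x2 max pooling does (illustrative only, not taken from the notebook):

# 2x2 max pooling keeps the largest value of each non-overlapping 2x2 block
x = tf.constant([[1., 2., 3., 4.],
                 [5., 6., 7., 8.],
                 [9., 1., 2., 3.],
                 [4., 5., 6., 7.]])[tf.newaxis, :, :, tf.newaxis]
print(tf.nn.max_pool2d(x, ksize=2, strides=2, padding='VALID')[0, :, :, 0].numpy())
# [[6. 8.]
#  [9. 7.]]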

Convolution layer #1 activation

Convolution layer #1 is connected to layer #0 through a 2x2 max pooling (halving the size of the feature maps along each of the two spatial dimensions). It is made of 64 convolution filters, each connected to the 32 input channels.

In [12]:
weights0[2].shape
Out[12]:
(3, 3, 32, 64)
In [13]:
sampleAtLayer1 = predictUntilLayer(model0, 2, selectedSamples)
sampleAtLayer1.shape
Out[13]:
(6, 13, 13, 64)
In [14]:
plotOutputs(64, 8, sampleAtLayer1, selectedSamples, selectedLabels)

We see that at the output of the second convolutional layer, the filters are focusing on more detailed parts of the images, but it becomes harder to state exactly what each filter is focusing on.

The jet plane, with its very sharp edges, is the easiest to read, as the edges clearly appear in some of the filters' activations.

Conclusion

Activation maps are a nice exploration tool. However, their interpretability is limited to the first couple of layers. In the following notebook (HTML / Jupyter), we will experiment with other techniques based on gradients in order to get a more consolidated view of the hidden layers.
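As a preview of those gradient-based techniques, here is a minimal saliency-map sketch using tf.GradientTape (an illustrative sketch, not the method developed in the next notebook):

# Gradient of the top class score with respect to the input pixels
img = tf.convert_to_tensor(selectedSamples[:1], dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(img)
    scores = model0(img)
    topScore = tf.reduce_max(scores, axis=1)
grads = tape.gradient(topScore, img)
saliency = tf.reduce_max(tf.abs(grads), axis=-1)[0]   # max over the 3 color channels
plt.imshow(saliency, cmap='gray');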