
Understanding the Design of a Convolutional Neural Network


Last Updated on July 13, 2022

Convolutional neural networks have proven successful in computer vision applications. Various network architectures have been proposed, and they are neither magical nor hard to understand.

In this tutorial, we will make sense of the operation of convolutional layers and their role in a larger convolutional neural network.

After finishing this tutorial, you will know:

  • How convolutional layers extract features from an image
  • How different convolutional layers can stack up to build a neural network

Let’s get started.

[Figure: Understanding the Design of a Convolutional Neural Network. Photo by Kin Shing Lai. Some rights reserved.]

Overview

This article is split into three sections; they are:

  • An Example Network
  • Showing the Feature Maps
  • Effect of the Convolutional Layers

An Example Network

The example used throughout this tutorial is a program that trains a convolutional neural network for image classification on the CIFAR-10 dataset.

This network should be able to achieve around 70% accuracy in classification. The images are 32×32 pixels in RGB color. They belong to 10 different classes, which are labeled with integers from 0 to 9.
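A minimal sketch of such a program is given here, assuming the tensorflow.keras API. Only the 32×32×3 input, the two 3×3 convolutional layers with 32 output channels, and the ten classes come from the description in this tutorial. The function names, padding, dropout rates, dense layer width, optimizer, and training settings are assumptions, so the accuracy of this sketch may differ from the figure quoted above.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model():
    """Two 3x3 convolutional layers with dropout and pooling, then dense layers."""
    model = keras.Sequential([
        keras.Input(shape=(32, 32, 3)),              # 32x32 RGB images
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.Dropout(0.3),                         # assumed rate
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),        # assumed width
        layers.Dropout(0.5),                         # assumed rate
        layers.Dense(10, activation="softmax"),      # one output per class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["sparse_categorical_accuracy"])
    return model


def train_on_cifar10(model, epochs=25, batch_size=32):
    """Download CIFAR-10, rescale pixels to [0, 1], and fit the model."""
    (X_train, y_train), (X_test, y_test) = keras.datasets.cifar10.load_data()
    X_train, X_test = X_train / 255.0, X_test / 255.0
    return model.fit(X_train, y_train,
                     validation_data=(X_test, y_test),
                     epochs=epochs, batch_size=batch_size)


model = build_model()
# train_on_cifar10(model)  # uncomment to train; this downloads the dataset
```

Training is left commented out so that the model can be built and inspected without downloading the data.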

We can print the network using Keras’ summary() function, which shows each layer of the network on the screen together with its output shape and number of parameters.

It is typical for an image classification network to be composed of convolutional layers at the early stage, with dropout and pooling layers interleaved. At a later stage, the output from the convolutional layers is flattened and processed by some fully connected layers.

Showing the Feature Maps

In the above network, we used two convolutional layers (Conv2D). The first layer has a 3×3 kernel and is applied to an input image of 32×32 pixels with 3 channels (the RGB colors). The output of this layer has 32 channels.

To make sense of the convolutional layer, we can check out its kernel. The variable model holds the network, and the first convolutional layer is model.layers[0], so we can find its kernel by printing it from that layer object, for example via model.layers[0].kernel.

We can tell that model.layers[0] is the correct layer by comparing its name, conv2d, to the output of model.summary(). This layer has a kernel of shape (3, 3, 3, 32), whose dimensions are respectively the height, width, input channels, and output feature maps.
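The kernel inspection described above can be sketched as follows, assuming the tensorflow.keras API and a stand-in model that contains only the first convolutional layer. The padding and activation arguments are assumptions. The weights here are untrained, so only the shapes are meaningful.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in model holding just the first convolutional layer
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
])

first = model.layers[0]
k, bias = first.get_weights()   # kernel and bias as NumPy arrays

print(first.name)               # layer name, as listed by model.summary()
print(k.shape)                  # (height, width, input channels, output feature maps)
print(bias.shape)               # one bias value per output feature map
print(first.count_params())     # 3*3*3*32 kernel weights + 32 biases
```

Using get_weights() returns plain NumPy arrays, which is convenient when the kernel is to be examined or plotted outside Keras.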

Assume the kernel is a NumPy array k. A convolutional layer takes k[:, :, 0, n] (a 3×3 array) and applies it to the first channel of the image, then applies k[:, :, 1, n] to the second channel, and so on. Afterward, the results of the convolution on all the channels are added up to become feature map n of the output, where n runs from 0 to 31 for the 32 output feature maps.
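The per-channel computation above can be sketched in plain NumPy. This is an illustration under simplifying assumptions rather than what Keras runs internally: there is no padding, bias, or activation, the helper name conv2d_feature_map is hypothetical, and random arrays stand in for a real image and a trained kernel.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_feature_map(image, k, n):
    """Feature map n: apply k[:, :, c, n] to channel c, then sum over channels."""
    height, width, channels = image.shape
    kh, kw = k.shape[:2]
    out = np.zeros((height - kh + 1, width - kw + 1))
    for c in range(channels):
        # All kh x kw windows of channel c, shape (out_h, out_w, kh, kw)
        patches = sliding_window_view(image[:, :, c], (kh, kw))
        # Multiply each window elementwise by the 2D kernel slice and sum
        out += np.einsum("ijab,ab->ij", patches, k[:, :, c, n])
    return out


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))            # stand-in for one RGB image
k = rng.standard_normal((3, 3, 3, 32))     # stand-in for a trained kernel

feature_maps = np.stack(
    [conv2d_feature_map(image, k, n) for n in range(32)], axis=-1)
print(feature_maps.shape)                  # (30, 30, 32)
```

Without padding, each 32×32 channel shrinks to 30×30 because a 3×3 window fits in only 30 positions along each axis.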

In Keras, we can extract the output of each layer using an extractor model. We create a batch with one input image, send it to the network, and then look at the feature maps of the first convolutional layer.
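A minimal sketch of such an extractor is given here, assuming the tensorflow.keras API. The stand-in model keeps only the layers up to the pooling step, its dropout rate and padding are assumptions, and a random array replaces the CIFAR-10 image. The weights are untrained, so the resulting maps only demonstrate the mechanics and shapes.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.Dropout(0.3),
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D(),
])

# The extractor shares the layers of model and returns every layer's output
extractor = keras.Model(inputs=model.inputs,
                        outputs=[layer.output for layer in model.layers])

image = np.random.default_rng(0).random((32, 32, 3)).astype("float32")
batch = image[np.newaxis, ...]                   # batch of one: (1, 32, 32, 3)
features = extractor(batch, training=False)      # list with one tensor per layer

first_conv_maps = np.asarray(features[0])[0]     # first Conv2D, (32, 32, 32)
second_conv_maps = np.asarray(features[2])[0]    # second Conv2D, (32, 32, 32)
print(first_conv_maps.shape, second_conv_maps.shape)
```

Each slice first_conv_maps[:, :, n] is one feature map and can be displayed as a grayscale image, for example with matplotlib's imshow.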

The feature maps look like the following:

[Figure: featuremap1]

They correspond to the following input image:

[Figure: input image]

They are called feature maps because they highlight certain features of the input image. A feature is identified using a small window (in this case, a 3×3 pixel filter). The input image has 3 color channels. Each channel has a different filter applied, and their results are combined into one output feature.

We can similarly display the feature maps from the output of the second convolutional layer, which look like the following:

[Figure: featuremap2]

From the above, you can see that the features extracted are more abstract and less recognizable.

Effect of the Convolutional Layers

The most important hyperparameter of a convolutional layer is the size of the filter. Usually it is square, and we can think of it as a window or receptive field through which we look at the input image. Therefore, the higher the resolution of the image, the larger the filter we would expect to use.

On the other hand, a filter that is too large will blur the detailed features, because all pixels in the receptive field are combined through the filter into one pixel of the output feature map. Therefore, there is a trade-off in choosing the appropriate size of the filter.
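The blurring effect can be illustrated with a small NumPy sketch. It assumes a simple averaging filter, which is only one possible filter, and a test image containing a single bright pixel as the detail. The helper name mean_filter is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def mean_filter(image, size):
    """Apply a size x size averaging filter without padding."""
    patches = sliding_window_view(image, (size, size))
    return patches.mean(axis=(-2, -1))


image = np.zeros((15, 15))
image[7, 7] = 1.0                  # one bright pixel as the fine detail

small = mean_filter(image, 3)
large = mean_filter(image, 7)

# Peak response and number of output pixels the detail is smeared over
print(small.max(), np.count_nonzero(small))
print(large.max(), np.count_nonzero(large))
```

With the 3×3 filter the detail is spread over 9 output pixels with a peak of 1/9, while the 7×7 filter spreads it over 49 output pixels with a peak of only 1/49.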

Stacking two convolutional layers (without any other layers or nonlinear activation in between) is equivalent to a single convolutional layer with a larger filter. But it is a typical design nowadays to stack two layers with small filters rather than use one layer with a larger filter, as there are fewer parameters to train.
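The equivalence and the parameter saving can be checked numerically. The sketch below assumes single-channel inputs and purely linear layers (no bias, no activation, no padding); the helper names correlate2d and combine_kernels are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate2d(x, k):
    """Slide kernel k over x without padding, as a convolutional layer does."""
    patches = sliding_window_view(x, k.shape)
    return np.einsum("ijab,ab->ij", patches, k)


def combine_kernels(k1, k2):
    """Single kernel with the same effect as applying k1 and then k2."""
    h = k1.shape[0] + k2.shape[0] - 1
    w = k1.shape[1] + k2.shape[1] - 1
    combined = np.zeros((h, w))
    for a in range(k1.shape[0]):
        for b in range(k1.shape[1]):
            # Each weight of k1 contributes a shifted, scaled copy of k2
            combined[a:a + k2.shape[0], b:b + k2.shape[1]] += k1[a, b] * k2
    return combined


rng = np.random.default_rng(0)
x = rng.random((16, 16))
k1 = rng.standard_normal((3, 3))
k2 = rng.standard_normal((3, 3))

stacked = correlate2d(correlate2d(x, k1), k2)    # two 3x3 layers
k5 = combine_kernels(k1, k2)                     # one 5x5 layer
single = correlate2d(x, k5)

print(np.allclose(stacked, single))              # True
print(k1.size + k2.size, k5.size)                # 18 25
```

Two stacked 3×3 filters cover the same 5×5 receptive field with 18 weights instead of 25. Once a nonlinear activation sits between the layers, the two designs are no longer exactly equivalent, although the receptive field is still the same.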

The exception would be a convolutional layer with a 1×1 filter. It is usually found as the beginning layer of a network. The purpose of such a convolutional layer is to combine the input channels into one rather than to transform the pixels. Conceptually, this can convert a color image into grayscale, but usually we use multiple ways of conversion to create more input channels than merely RGB for the network.
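Because a 1×1 filter looks at one pixel at a time, it reduces to a weighted sum of the channels at each pixel. The NumPy sketch below illustrates this with the common luma weights for grayscale conversion and with 8 random channel mixes; the number 8 and the random image are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))              # stand-in for one RGB image

# 1x1 filter with 3 input channels and 1 output channel: RGB to grayscale
gray_weights = np.array([0.299, 0.587, 0.114]).reshape(3, 1)
gray = image @ gray_weights                  # shape (32, 32, 1)

# 1x1 filters with 8 outputs: 8 different mixes of the RGB channels
mix_weights = rng.standard_normal((3, 8))
mixed = image @ mix_weights                  # shape (32, 32, 8)

print(gray.shape, mixed.shape)
```

In a trained network the mixing weights are learned rather than fixed, and the layer also adds a bias and an activation.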

Also note that in the above network, we are using Conv2D, for a 2D filter. There is also a Conv3D layer for a 3D filter. The difference is whether we apply the filter separately to each channel or feature map, or consider the input feature maps stacked up as a 3D array and apply a single filter to transform it altogether. Usually the former is used, as it is more reasonable to assume there is no particular order in which the feature maps should be stacked.

Summary

In this post, you have seen how we can visualize the feature maps from a convolutional neural network and how the network extracts them.

Specifically, you learned:

  • The structure of a typical convolutional neural network
  • The effect of the filter size on a convolutional layer
  • The effect of stacking convolutional layers in a network
