Calculating Output dimensions in a CNN for Convolution and Pooling Layers with KERAS

3 min read · Jun 25, 2020

This article outlines how the dimensions of an input image change as it passes through the convolutional and pooling layers of a Convolutional Neural Network (CNN). As a bonus, it also covers how to calculate the number of parameters. The article assumes that you are familiar with the fundamentals of Keras and CNNs.

If you are new to CNNs, I recommend the following sources:-

A great intro video on how CNNs work http://brohrer.github.io/how_convolutional_neural_networks_work.html
The Keras book by its creator, François Chollet https://www.manning.com/books/deep-learning-with-python

Calculating the output when an image passes through a Convolutional layer:-

NOTE:- All matrices are assumed to be square, i.e. the image width = height.

Parameters that influence the output shape:-

The input dimensions of the image --> I (I x I)
The size of the filter/kernel --> F (F x F)
Strides --> S (integer)
Padding --> P (integer)
Depth/number of feature maps/activation maps --> D (integer)

Convolution output dimension = [(I - F + 2 * P) / S] + 1 x D --> Formula1

NOTE:- The “x D” above does not denote multiplication; it indicates the depth, i.e. the number of activation maps.
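Formula1 can be written as a small helper. This is a minimal sketch, not part of Keras: the function names `conv_output_size` and `conv_output_shape` are made up for illustration, and the square brackets are assumed to mean rounding down when the stride does not divide evenly.

```python
def conv_output_size(i, f, p=0, s=1):
    """Width (= height) after a convolution: [(I - F + 2P) / S] + 1."""
    # Assumption: the brackets in Formula1 are treated as floor division.
    return (i - f + 2 * p) // s + 1


def conv_output_shape(i, f, d, p=0, s=1):
    """Full output shape; D is the number of feature maps, not a multiplier."""
    o = conv_output_size(i, f, p, s)
    return (o, o, d)


# A 32x32 input, 3x3 filter, no padding, stride 1, 5 feature maps
print(conv_output_shape(32, 3, d=5))  # (30, 30, 5)
```

With P = 1 and the same 3x3 filter, `conv_output_size(32, 3, p=1)` gives 32, so the spatial size is preserved.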

Let us look at an example with a Python snippet:-

An input image I with dimensions (32x32x3), i.e. 32 pixels wide and 32 pixels high with 3 channels (I = 32),
A filter of size 3x3 (F = 3),
Stride is 1 (S = 1),
No padding (P = 0), and
Depth/feature maps are 5 (D = 5)

The output dimensions are [(32 - 3 + 2 * 0) / 1] + 1 x 5 = (30x30x5)

Keras Code snippet for the above example

import numpy as np
from tensorflow import keras

model = keras.models.Sequential()

# In the snippet below:
# D = 5 (first argument, the number of filters)
# strides = (1, 1) by default
# padding = 'valid' by default, i.e. P = 0
model.add(keras.layers.Conv2D(5, kernel_size=3, activation='relu', input_shape=(32, 32, 3)))
model.summary()

Calculating the number of parameters: the kernel size is (3x3) with 3 channels (RGB in the input), there is one bias term per filter, and there are 5 filters.

Parameters = (F * F * number of channels + bias term) * D

In our example, Parameters = (3 * 3 * 3 + 1) * 5 = 140
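The parameter formula can be checked with a few lines of plain Python. This is only a sketch: `conv_param_count` is a hypothetical helper name, and the `use_bias` flag is an illustrative addition for the case where the bias term is left out.

```python
def conv_param_count(f, channels, d, use_bias=True):
    """Trainable parameters: (F * F * channels + bias term) * D."""
    bias = 1 if use_bias else 0  # one bias term per filter
    return (f * f * channels + bias) * d


# 3x3 kernel, 3 input channels (RGB), 5 filters
print(conv_param_count(3, 3, 5))  # 140
```

The result should match the "Param #" column that `model.summary()` reports for the Conv2D layer.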

Calculating the output when an image passes through a Pooling (Max) layer:-

For a pooling layer, the output size in this article depends only on the filter/kernel size (F) and the strides (S); no padding is applied.

Pooling output dimension = [(I - F) / S] + 1 x D --> Formula2

Note: the depth D is the same as in the previous layer (i.e. the depth dimension remains unchanged; in our case D = 5).

Let F = 2 (2x2 window)
Stride, S = 2
Depth, D = 5 (depth from the previous layer)

In our example we have

Output = [(30 - 2) / 2] + 1 x D = (15x15x5)
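Formula2 can be sketched the same way. The helper name `pool_output_shape` is made up for illustration, and the brackets are again assumed to mean rounding down. When no stride is given, the sketch falls back to the window size, mirroring how Keras pooling layers default `strides` to `pool_size`.

```python
def pool_output_shape(i, d, f=2, s=None):
    """Output shape after pooling: [(I - F) / S] + 1, depth unchanged."""
    if s is None:
        s = f  # no stride given: step by the window size
    # Assumption: the brackets in Formula2 are treated as floor division.
    o = (i - f) // s + 1
    return (o, o, d)


# The 30x30x5 convolution output passed through a 2x2 window with stride 2
print(pool_output_shape(30, d=5))  # (15, 15, 5)
```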

# strides defaults to pool_size in a pooling layer, so S = 2 here
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.summary()

Note:- Since the pooling operation is a fixed function, it introduces no additional parameters.

[Figure: a visual summary of the two operations in our example.]

The same Formula1 and Formula2 apply as the depth grows.
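To see the two formulas applied repeatedly, here is a rough sketch that traces the shape through a stack of layers. Everything in it is illustrative: `trace_shapes` and the dictionary keys are invented names, the second convolution (10 filters) and second pooling layer are hypothetical additions to the article's example, and division is assumed to round down.

```python
def trace_shapes(size, channels, layers):
    """Apply Formula1 (conv) and Formula2 (pool) layer by layer."""
    shapes = [(size, size, channels)]
    for layer in layers:
        f = layer["f"]
        if layer["kind"] == "conv":
            p = layer.get("p", 0)
            s = layer.get("s", 1)
            size = (size - f + 2 * p) // s + 1
            channels = layer["d"]  # depth becomes the number of filters
        elif layer["kind"] == "pool":
            s = layer.get("s", f)  # stride falls back to the window size
            size = (size - f) // s + 1  # depth is left unchanged
        else:
            raise ValueError(f"unknown layer kind: {layer['kind']}")
        shapes.append((size, size, channels))
    return shapes


layers = [
    {"kind": "conv", "f": 3, "d": 5},   # the article's convolution
    {"kind": "pool", "f": 2},           # the article's pooling
    {"kind": "conv", "f": 3, "d": 10},  # hypothetical extra layer
    {"kind": "pool", "f": 2},           # hypothetical extra layer
]
for shape in trace_shapes(32, 3, layers):
    print(shape)
# (32, 32, 3)
# (30, 30, 5)
# (15, 15, 5)
# (13, 13, 10)
# (6, 6, 10)
```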

Thank you for reading this article. Please let me know your thoughts in the comments below.

REFERENCES:-

Great Notes on CNN from Stanford https://cs231n.github.io/convolutional-networks/
Detailed coverage of convolution arithmetic https://arxiv.org/pdf/1603.07285.pdf
A great theoretical book for Deep Learning https://www.deeplearningbook.org/


Written by Virajdatt Kohir
