What is a convolutional neural network?

Convolutional neural networks (CNNs) are a popular approach in machine learning. There are many types of neural networks to choose from in machine learning projects, for example recurrent neural networks, feed-forward neural networks, and modular neural networks. A convolutional neural network is one of the most common types. Before going into the details of convolutional neural networks, though, it helps to look at an ordinary neural network.

What is a neural network?

When people mention deep learning, a subfield of machine learning, they are usually talking about neural networks. Neural networks are loosely modeled on our brains: nodes are arranged in layers and connected to one another, much like the neurons in our brains.

Each input to a node is assigned a weight that controls how strongly that input affects the overall prediction. Because weights belong to the links between nodes, each node can be influenced by a different set of weights.

The neural network takes the training data at the input layer. It then passes the data through the hidden layers, transforming the values according to the weights of each node, and finally returns a value at the output layer.
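
The flow from input layer to output layer can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes, the weight and bias values, and the sigmoid activation are all made up for the example and do not come from any trained model.

```python
import math

def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """Compute one fully connected layer.

    weights[j][i] is the weight on the link from input i to node j.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(weighted_sum))
    return outputs

# Hypothetical network: 2 inputs -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]
hidden_biases = [0.0, -1.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense_layer(inputs, hidden_weights, hidden_biases)
    return dense_layer(hidden, output_weights, output_biases)

prediction = forward([1.0, 1.0])
print(prediction)
```

Changing any single weight changes how much the corresponding input contributes to the final value, which is what training adjusts.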

Figure: A neural network

Properly tuning a neural network to achieve consistent and reliable results can take some time. Training and testing a neural network is a balancing act in which you work out which features matter most to the model.

How does a CNN work differently?

A convolutional neural network is a special type of multilayer neural network that processes data arranged in a grid and extracts important features from it. A big advantage of using CNNs is that there is no need to do a lot of pre-processing on the images.

In most traditional image-processing algorithms, an engineer creates the filters by hand, based on heuristics. A CNN can learn the most useful filters by itself, and because it needs relatively few parameters, it saves a lot of time and trial and error.

The savings won’t be noticeable unless you’re working with high-resolution images with thousands of pixels. The main goal of a CNN is to reduce the data into forms that are easier to process while preserving the features that are important to understanding what the data represents. CNNs are also a good option for working with large data sets.

A big difference between a CNN and a regular neural network is that CNNs use convolution to handle the math behind the scenes. In at least one layer of a CNN, convolution is used instead of general matrix multiplication. A convolution takes two functions and returns one function; in a CNN, the two inputs are the data and a filter (also called a kernel), and the result is a feature map.
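
To make the operation concrete, here is a minimal pure-Python sketch of a two-dimensional convolution with no padding and a stride of 1. The 4x4 image and the hand-made edge filter are assumptions chosen for illustration; in a real CNN the filter values are learned rather than written by hand.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    map of weighted sums. Like most deep learning libraries, this does not
    flip the kernel, so strictly speaking it is cross-correlation."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for row in range(out_h):
        output_row = []
        for col in range(out_w):
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[row + i][col + j] * kernel[i][j]
            output_row.append(total)
        output.append(output_row)
    return output

# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-made filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is largest exactly where the image changes from dark to bright, and zero where the image is uniform. The same four filter weights are reused at every position, which is why a convolutional layer needs far fewer parameters than a fully connected one.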


CNNs work by applying filters to your input data. What makes them so special is that they adjust the filters during training, as part of the training process itself. This way, even when you have a huge data set such as a large collection of images, the filters are tuned to the data and the results can be accurate.

Since the filters can be updated to train the CNN better, the need for manual filters is eliminated, and this gives us more flexibility in the number and relevance of filters applied to the dataset. Using this algorithm, we can work on more complex problems such as face recognition.

Lack of data is one of the problems that can prevent the use of a CNN. Although a network can be trained with a relatively small amount of data (approximately 10,000 samples), the more data is available, the better the CNN can be tuned.

Just keep in mind that this data needs to be clean and labeled before a CNN can use it, which makes preparing it time-consuming and training on it computationally demanding.

How does a convolutional neural network work?

Convolutional neural networks are inspired by findings from neuroscience. They are made of layers of artificial neurons called nodes. Each node is a function that calculates the weighted sum of its inputs and returns an activation value; together, the outputs of a layer form an activation map. This is the convolutional part of the neural network.


Each node in a layer is defined by its weight values. When you feed data to a layer, for example an image, it takes the pixel values and extracts some visual properties.

When you feed data to the CNN, each layer returns activation maps. These maps highlight important features of the data. If you give a CNN an image, it will detect features based on pixel values, such as colors, and provide you with an activation map.

With images, a CNN usually finds the edges first. This partial description of the image is then passed to the next layer, which begins to identify things like corners and color groups. This new description of the image is passed on to the next layer in turn, and the cycle continues until the network makes its prediction.

As the number of layers increases, max pooling is typically applied between them. Max pooling returns only the most relevant features from the activation map of a layer and passes them to subsequent layers until you reach the last layer.
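
The pooling step described above can be sketched in plain Python. The example assumes non-overlapping 2x2 windows, which is a common choice, and the activation values are made up for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    pooled = []
    for row in range(0, len(feature_map) - size + 1, size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[row + i][col + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled

# Hypothetical 4x4 activation map produced by an earlier layer.
activation_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 1],
]

pooled = max_pool2d(activation_map)
print(pooled)  # [[4, 2], [2, 7]]
```

The 4x4 map shrinks to 2x2, so the next layer has a quarter as many values to process while the strongest response in each region is preserved.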

The last layer of a CNN is the classification layer, which determines the predicted value based on the activation map. If you feed a CNN a handwriting sample, the classification layer will tell you the letters in the image. This is what self-driving vehicles use to determine whether an object is a car, a person, or an obstacle.
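
A common way to build a classification layer is the softmax function, which turns one raw score per class into probabilities. The sketch below assumes three classes borrowed from the self-driving example; the score values are invented, and in a real network they would be computed from the final activation map.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class_names = ["car", "person", "obstacle"]
scores = [2.0, 0.5, 0.1]  # made-up outputs of the last layer

probabilities = softmax(scores)
predicted = class_names[probabilities.index(max(probabilities))]
print(predicted, probabilities)
```

The class with the highest probability is reported as the prediction, and the probabilities themselves give a rough sense of how confident the network is.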


Training a CNN is similar to training many other machine learning algorithms. Using training data that is separate from your test data, you adjust the biases and weights based on the accuracy of the predicted values. Just be careful not to overfit the model.
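
The idea of adjusting weights and biases on training data and then checking the result on separate test data can be shown with a deliberately tiny model. The sketch below is not a CNN: it fits a single linear neuron with gradient descent on made-up data that follows y = 2x + 1, and the learning rate and number of epochs are arbitrary choices for the example.

```python
# Toy data following y = 2x + 1, split into training and test sets.
train_data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
test_data = [(4.0, 9.0), (5.0, 11.0)]

def mean_squared_error(weight, bias, data):
    """Average squared difference between predictions and targets."""
    return sum((weight * x + bias - y) ** 2 for x, y in data) / len(data)

weight, bias = 0.0, 0.0
learning_rate = 0.05
n = len(train_data)

for epoch in range(2000):
    # Gradients of the mean squared error with respect to weight and bias.
    grad_w = sum(2 * (weight * x + bias - y) * x for x, y in train_data) / n
    grad_b = sum(2 * (weight * x + bias - y) for x, y in train_data) / n
    # Move each parameter a small step against its gradient.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

train_error = mean_squared_error(weight, bias, train_data)
test_error = mean_squared_error(weight, bias, test_data)
print(weight, bias, train_error, test_error)
```

Comparing the two errors is a simple overfitting check: a model that scores well on the training data but much worse on the test data has memorized its training examples instead of learning the underlying pattern.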

Types of CNN

One-dimensional CNN: In this case, the CNN kernel moves in one direction. One-dimensional CNNs are commonly used on time series data.

Two-dimensional CNN: In this type of CNN, the kernels move in two directions. Two-dimensional CNNs are used in image labeling and processing.

Three-dimensional CNN: This type of CNN has a kernel that moves in three directions. Researchers use this type of CNN on 3D images such as CT and MRI scans.
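
The one-dimensional case is the easiest to see in code. The following plain-Python sketch slides a kernel in a single direction along a short time series; the readings and the two-value difference kernel are assumptions made for illustration. The two- and three-dimensional cases follow the same pattern with one or two more nested loops.

```python
def conv1d(series, kernel):
    """Slide `kernel` along `series` in one direction (no padding, stride 1)."""
    width = len(kernel)
    return [
        sum(series[start + k] * kernel[k] for k in range(width))
        for start in range(len(series) - width + 1)
    ]

# Hypothetical sensor readings with a spike in the middle.
readings = [1, 2, 3, 10, 3, 2, 1]
# Kernel that measures the change between neighboring readings.
difference_kernel = [-1, 1]

changes = conv1d(readings, difference_kernel)
print(changes)  # [1, 1, 7, -7, -1, -1]
```

The large positive and negative values mark where the series jumps up and back down, which is the kind of local pattern a one-dimensional CNN can learn to pick out of time series data.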

Since many practical problems involve image data, 2D CNNs are the most commonly used type. The following are some of the possible applications of CNNs:

  • Image recognition with some pre-processing
  • Recognition of different handwriting
  • Computer Vision
  • Used in banking to read the figures on a check
  • Used in postal services to read the postal code on the envelope

Example of CNN in Python

As an example of using a CNN on a real-world problem, let’s recognize handwritten digits using the MNIST dataset.

The first thing we need to do is define the CNN model. Then we load the training and test data and prepare it, and finally we use the training data to train the model and the test data to evaluate it. Look at the code below:

from keras import layers
from keras import models
from keras.datasets import mnist
from keras.utils import to_categorical

# Define the CNN model
model = models.Sequential()

model.add(layers.Conv2D(32, (5,5), activation='relu', input_shape=(28, 28,1)))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Conv2D(64, (5, 5), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

model.add(layers.Flatten())
model.add(layers.Dense(10, activation='softmax'))

model.summary()

# Load the data, which is already split into training and test sets
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Use the training data to train the model
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

model.fit(train_images, train_labels,
          batch_size=100,
          epochs=5,
          verbose=1)

# Test the model's accuracy with the test data
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('Test accuracy:', test_acc)
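
The layer shapes that model.summary() prints for the model above can also be worked out by hand. The sketch below assumes the Keras defaults used in that code: no padding and a stride of 1 for Conv2D, and non-overlapping windows for MaxPooling2D. The helper function names are hypothetical and exist only for this example.

```python
def conv_output_size(input_size, kernel_size):
    """Output width/height of a convolution with no padding and stride 1."""
    return input_size - kernel_size + 1

def pool_output_size(input_size, pool_size):
    """Output width/height of non-overlapping pooling."""
    return input_size // pool_size

def conv_params(kernel_size, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters

size = 28                           # MNIST images are 28x28
size = conv_output_size(size, 5)    # first Conv2D: 24
size = pool_output_size(size, 2)    # first MaxPooling2D: 12
size = conv_output_size(size, 5)    # second Conv2D: 8
size = pool_output_size(size, 2)    # second MaxPooling2D: 4
flattened = size * size * 64        # Flatten: 4 * 4 * 64 = 1024

total_params = (
    conv_params(5, 1, 32)           # first Conv2D: 832
    + conv_params(5, 32, 64)        # second Conv2D: 51264
    + flattened * 10 + 10           # Dense: 10250
)
print(size, flattened, total_params)  # 4 1024 62346
```

Each convolution trims the image by four pixels and each pooling layer halves it, so the 28x28 input reaches the classification layer as 1,024 values.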

Conclusion

Convolutional neural networks are multilayer neural networks that are good at capturing the features of data. They work well with images and don’t require much pre-processing. By using convolution and pooling to reduce an image to its main features, you can identify images correctly.

CNN models are easier to train than many other neural networks because they start with fewer parameters. Since convolutional layers can extract many features by themselves, you won’t need as many hidden layers. One of the interesting things about CNNs is the number of complex problems they can be used on. From self-driving cars to diagnosing diabetes, CNNs can process data and make accurate predictions.
