Using a Neural Network to Classify Lego Figures

- September 15, 2020

Image classification has improved by leaps and bounds in recent years with the advent of neural networks. Without going into too much detail, an artificial neural network consists of a series of layers through which inputs flow, becoming output predictions. Each layer is composed of a predetermined number of neurons, and each neuron feeds information to the next layer of neurons via weighted connections. The weights are real numbers that scale the values passed along each connection; those values flow from layer to layer until they reach the output layer, which produces a numerical prediction.
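To make the idea of layers and weighted connections concrete, here is a minimal sketch of a forward pass in plain Python. The layer sizes, weights, biases, and the choice of a sigmoid activation are all made up for illustration; a real network learns its weights from data and is usually built with a dedicated library.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """Compute one fully connected layer.

    weights[j][i] is the weight on the connection from input i to neuron j.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(sigmoid(weighted_sum))
    return outputs


def forward(inputs, layers):
    """Pass the inputs through every layer in order and return the final outputs."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
prediction = forward([1.0, 0.5, -1.0], layers)
print(prediction)  # a single number between 0 and 1
```

Each neuron multiplies its inputs by its weights, adds them up, and passes the result on; stacking layers simply repeats that step.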

For the purposes of image classification, the inputs are pixels. Each pixel is converted to numeric inputs, either a gray-scale intensity or a color code (RGB, hex, etc.). The output is the neural network's prediction of what the image contains. The neural network learns from a dataset of labeled images: it compares its predictions against the correct labels and adjusts its weights, gradually working out which elements of a picture correspond to which labels.
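As a rough illustration of how pixels become numbers, the sketch below decodes a hex color into RGB values and converts a tiny image to gray-scale intensities between 0 and 1. The 2x2 image is invented for the example, and the gray-scale weights are the common ITU-R BT.601 luma coefficients, which is an assumption on my part rather than something the dataset prescribes.

```python
def hex_to_rgb(hex_code):
    """Decode a color such as '#FF0000' into an (R, G, B) tuple of 0-255 integers."""
    hex_code = hex_code.lstrip("#")
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) pixels into rows of gray values in [0, 1]."""
    gray = []
    for row in rgb_image:
        gray.append([(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for r, g, b in row])
    return gray


def flatten(image):
    """Turn a 2D grid of values into one flat list of network inputs."""
    return [value for row in image for value in row]


# Hypothetical 2x2 image: white, black, red, and blue pixels.
image = [
    [hex_to_rgb("#FFFFFF"), hex_to_rgb("#000000")],
    [hex_to_rgb("#FF0000"), hex_to_rgb("#0000FF")],
]
inputs = flatten(to_grayscale(image))
print(inputs)  # four numbers, one per pixel
```

A color model like the one used later keeps all three channels instead of collapsing them, so each pixel contributes three numbers rather than one.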

This abstract concept can be intuitively understood with a fun example. My favorite: Legos. I accessed a dataset of different Lego figures with assigned figure names from Kaggle.com. It included characters from Star Wars, Marvel, Jurassic World, and Harry Potter. In total, there were images of 27 different characters arranged in several different positions and taken from different angles. Here are some examples:

One useful feature of artificial neural networks is that a pre-trained network can be reused for a new task, an approach known as transfer learning. For this experiment, I piggybacked off one such methodology and used DenseNet121, a densely connected convolutional network whose architecture and weights were already built for image classification. This model expects inputs of a specific size, so I resized the images to 224x224x3 (height, width, and color channels). Additionally, I replaced the output layer so that it has one output for each of the 27 classes in the dataset.
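For readers who want to see roughly what this setup looks like, here is a sketch using the Keras API that ships with TensorFlow. It is not necessarily the exact code behind this post: the function names, the ImageNet weights, the frozen base, the average pooling, the Adam optimizer, and the loss function are all assumptions chosen for illustration.

```python
INPUT_SHAPE = (224, 224, 3)  # height, width, color channels
NUM_CLASSES = 27             # one output per Lego character in the dataset


def preprocess(image):
    """Resize an image to the expected shape and scale it the way DenseNet expects."""
    # Imported lazily so the sketch can be loaded without TensorFlow installed.
    import tensorflow as tf

    resized = tf.image.resize(image, INPUT_SHAPE[:2])
    return tf.keras.applications.densenet.preprocess_input(resized)


def build_model(weights="imagenet", freeze_base=True):
    """Build DenseNet121 with its original classifier swapped for a 27-way output."""
    import tensorflow as tf

    # include_top=False drops the original ImageNet output layer.
    base = tf.keras.applications.DenseNet121(
        include_top=False,
        weights=weights,
        input_shape=INPUT_SHAPE,
        pooling="avg",
    )
    # Assumption: keep the pre-trained weights fixed and train only the new layer.
    base.trainable = not freeze_base

    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` downloads the pre-trained weights the first time it runs; passing `weights=None` instead would build the same architecture with random weights.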

With the model ready, we can fit it to our data and record accuracy. After 50 epochs, the model was extremely accurate, with 99.3% correct predictions.
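Accuracy here simply means the share of images whose predicted label matches the true label. The sketch below shows that arithmetic on a handful of invented labels; it does not reproduce the figure reported above.

```python
def accuracy(predicted_labels, true_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predicted and true labels must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


# Hypothetical class indices for five images; four of the five predictions match.
true_labels = [3, 7, 7, 12, 26]
predicted_labels = [3, 7, 9, 12, 26]
print(accuracy(predicted_labels, true_labels))  # 0.8
```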

As a proof of concept, if we feed this model an image and ask it to make predictions (printed out as text above the picture), we can see it is performing quite well!
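Under the hood, a classifier like this outputs one probability per class, and the prediction is the class with the highest probability. Here is a small sketch of that last step; the class names and probabilities are invented, and the real model has 27 outputs rather than three.

```python
def top_prediction(probabilities, class_names):
    """Return the most likely class name together with its probability."""
    if len(probabilities) != len(class_names):
        raise ValueError("need exactly one probability per class name")
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best_index], probabilities[best_index]


# Hypothetical class names and model output for a single image.
class_names = ["YODA", "SPIDER-MAN", "HARRY POTTER"]
probabilities = [0.05, 0.85, 0.10]

label, confidence = top_prediction(probabilities, class_names)
print(f"Predicted: {label} ({confidence:.0%})")  # Predicted: SPIDER-MAN (85%)
```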

This is mainly a toy example and a proof of concept for image classification. However, deep learning image classification has dozens of valuable real-world use cases. From healthcare to security to gaming to retail, image classification is already a part of our daily lives. As these models become more and more sophisticated, their capabilities continue to expand. It is a fascinating topic, and a thrilling time to be a data analyst!