Image Classification in Python

Image classification is a fundamental task in computer vision, where the goal is to categorize an image into one of several predefined classes. From recognizing handwritten digits to identifying objects in photos, image classification has a wide range of applications. In this blog, we'll explore how to build an image classification model in Python using TensorFlow and Keras.

What is Image Classification in Machine Learning?

Image classification is the machine learning task of assigning an image a label from a predefined set of classes. It forms the basis of numerous applications, such as object recognition, face detection, and medical imaging. A model is trained on a dataset of labeled images and learns the patterns, features, and structures that indicate each class. For example, given a dataset of animals, the model learns to tell cats, dogs, and birds apart by their distinguishing characteristics. Convolutional Neural Networks (CNNs) are the most commonly used models for this task because they learn spatial hierarchies of features directly from the pixels, with the filter weights fitted through backpropagation. The aim is a model that generalizes, meaning it correctly classifies new images it never saw during training. That ability to interpret unseen visual data is what makes image classification a foundational component of computer vision.

Getting Started with Image Classification in Python

Getting started with image classification in Python is straightforward thanks to libraries like TensorFlow and Keras. The basic workflow has a handful of steps. First, set up your environment by installing the necessary packages. Next, load a dataset such as CIFAR-10 or MNIST, both commonly used image classification benchmarks, and preprocess it: normalizing pixel values and reshaping images where needed helps the model learn more effectively. Then build a Convolutional Neural Network (CNN), the go-to architecture for image classification, by stacking layers that extract features from the images. Compile the model with an appropriate optimizer and loss function and train it on the dataset, monitoring its performance on a validation set and adjusting parameters if necessary. Finally, evaluate the trained model on test data and use it to make predictions on new images. This workflow is the foundation for further experimentation and improvement.

Step 1: Setting Up the Environment

First, make sure TensorFlow is installed, along with Matplotlib, which Step 7 uses for plotting. If either is missing, you can install both using pip:

pip install tensorflow matplotlib
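Before moving on, it can help to confirm that the packages are visible to the Python interpreter you plan to use. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is made up for this illustration and is not part of TensorFlow or pip.

```python
import importlib.util


def is_installed(package):
    """Return True if a top-level package can be found by this interpreter."""
    return importlib.util.find_spec(package) is not None


# Report on the packages this tutorial relies on.
for name in ("tensorflow", "matplotlib"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a package is reported as missing even after installing it, pip and Python are probably pointing at different environments.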

Step 2: Loading and Preprocessing the Data

We'll begin by loading the CIFAR-10 dataset and preprocessing the data:

import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# Load the CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# Verify the shape of the data
print(f'Train Images: {train_images.shape}')
print(f'Test Images: {test_images.shape}')

Output for the above code:

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 52s 0us/step
Train Images: (50000, 32, 32, 3)
Test Images: (10000, 32, 32, 3)
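To see what the normalization step does, here is a small sketch on synthetic data. The random array below is only a stand-in with the same layout as a CIFAR-10 batch (it is not the real dataset), so the example runs without downloading anything. It also shows an optional variation that is not in the code above: casting to float32 first, which halves the memory used by the scaled images.

```python
import numpy as np

# Synthetic stand-in for a batch of 4 CIFAR-10 images: 32x32 pixels, 3 channels,
# stored as unsigned 8-bit integers in the range 0..255.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

# Dividing by 255.0 rescales every pixel into [0, 1] and promotes to float64.
scaled = raw / 255.0
print(raw.dtype, scaled.dtype)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)

# Optional variation: cast to float32 before dividing to use half the memory.
scaled32 = raw.astype(np.float32) / 255.0
print(scaled32.dtype, scaled32.nbytes, scaled.nbytes)
```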

Step 3: Building the Convolutional Neural Network (CNN)

Next, we'll build a CNN model using Keras. The CNN consists of the following layers:

Convolutional Layers: Extract features from the input images by applying filters.
MaxPooling Layers: Reduce the spatial dimensions of the feature maps, retaining the most important information.
Flatten Layer: Converts the stack of 2D feature maps into a 1D vector.
Dense Layers: Perform classification based on the features extracted by the convolutional layers.

The model stacks three convolutional layers, the first two each followed by max pooling, and ends with two fully connected layers:

# Model architecture
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Print the model summary (summary() prints the table itself and returns None)
model.summary()

Output for the above code: a table listing each layer with its output shape and parameter count.
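The shapes and parameter counts in the summary can be worked out by hand, which is a useful check on your understanding of the architecture. The sketch below assumes the Keras defaults that the model code relies on: 'valid' padding with stride 1 for Conv2D, and a pooling stride equal to the pool size for MaxPooling2D. The helper functions are written for this illustration only.

```python
def conv_out(size, kernel=3):
    """Width/height after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Width/height after non-overlapping max pooling (rounds down)."""
    return size // pool


def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out


# Follow a 32x32 input through the layers in order.
size = 32
size = conv_out(size)    # Conv2D(32):   30 x 30 x 32
size = pool_out(size)    # MaxPooling2D: 15 x 15 x 32
size = conv_out(size)    # Conv2D(64):   13 x 13 x 64
size = pool_out(size)    # MaxPooling2D:  6 x  6 x 64
size = conv_out(size)    # Conv2D(64):    4 x  4 x 64
flat = size * size * 64  # Flatten:      1024 values

params = [
    conv_params(3, 3, 32),
    conv_params(3, 32, 64),
    conv_params(3, 64, 64),
    dense_params(flat, 64),
    dense_params(64, 10),
]
print(flat)         # 1024
print(params)       # [896, 18496, 36928, 65600, 650]
print(sum(params))  # 122570
```

Most of the parameters sit in the first Dense layer, which is typical for small CNNs that flatten their feature maps.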

Step 4: Compiling and Training the Model

Before training the model, we need to compile it by specifying the optimizer, loss function, and metrics:

# Model compilation
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

Output for the above code:

Epoch 1/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 71s 44ms/step - accuracy: 0.3282 - loss: 1.7978 - val_accuracy: 0.5078 - val_loss: 1.3578
Epoch 2/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 80s 43ms/step - accuracy: 0.5461 - loss: 1.2788 - val_accuracy: 0.5708 - val_loss: 1.2039
Epoch 3/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 82s 43ms/step - accuracy: 0.6022 - loss: 1.1271 - val_accuracy: 0.6341 - val_loss: 1.0400
Epoch 4/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 82s 43ms/step - accuracy: 0.6431 - loss: 1.0169 - val_accuracy: 0.6566 - val_loss: 0.9899
Epoch 5/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 84s 44ms/step - accuracy: 0.6749 - loss: 0.9307 - val_accuracy: 0.6678 - val_loss: 0.9478
Epoch 6/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 68s 43ms/step - accuracy: 0.6965 - loss: 0.8657 - val_accuracy: 0.6889 - val_loss: 0.8938
Epoch 7/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 82s 43ms/step - accuracy: 0.7117 - loss: 0.8335 - val_accuracy: 0.6987 - val_loss: 0.8784
Epoch 8/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 82s 43ms/step - accuracy: 0.7228 - loss: 0.7958 - val_accuracy: 0.6945 - val_loss: 0.8989
Epoch 9/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 82s 43ms/step - accuracy: 0.7411 - loss: 0.7502 - val_accuracy: 0.6972 - val_loss: 0.8843
Epoch 10/10
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 81s 43ms/step - accuracy: 0.7536 - loss: 0.7023 - val_accuracy: 0.6927 - val_loss: 0.9154
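Two numbers in this log are worth unpacking. The 1563 steps per epoch follow from the 50,000 training images and the default Keras batch size of 32. The loss is sparse categorical cross-entropy: the average negative log of the probability the model assigns to the correct class. The NumPy functions below are a simplified sketch of that quantity and of accuracy, not the Keras implementation (which also clips probabilities for numerical safety); the probabilities and labels are made up for illustration.

```python
import math

import numpy as np


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels).reshape(-1)  # accept shape (N,) or (N, 1)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class is the true one."""
    labels = np.asarray(labels).reshape(-1)
    return float(np.mean(np.argmax(probs, axis=1) == labels))


# 50,000 images in batches of 32; the last batch is smaller.
steps_per_epoch = math.ceil(50_000 / 32)
print(steps_per_epoch)  # 1563

# A model that guesses uniformly over 10 classes scores ln(10) on any labels.
uniform = np.full((4, 10), 0.1)
labels = np.array([[3], [8], [8], [0]])
chance_loss = sparse_categorical_crossentropy(uniform, labels)
print(round(chance_loss, 4))  # 2.3026
```

Against that chance baseline of about 2.30, the first-epoch loss of 1.80 already shows learning, and the later epochs push it well below 1.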

Step 5: Evaluating the Model

Once the model is trained, we can evaluate its performance on the test data:

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f'\nTest accuracy: {test_acc}')

Output for the above code:

313/313 - 4s - 12ms/step - accuracy: 0.6927 - loss: 0.9154

Test accuracy: 0.6927000284194946
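This test accuracy is identical to the val_accuracy of epoch 10 because Step 4 passed the test set as validation_data, so the same images were scored twice. If you want the test set to stay untouched until the final evaluation, one option is to hold out part of the training data for validation instead. The sketch below shows one way to do that with NumPy; the function name `train_val_split` and the 10% fraction are choices made for this example, and the arrays are small synthetic stand-ins for the CIFAR-10 data.

```python
import numpy as np


def train_val_split(images, labels, val_fraction=0.1, seed=0):
    """Shuffle once, then hold out `val_fraction` of the samples for validation."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(round(len(images) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])


# Synthetic stand-ins: 100 blank "images" with a unique label each,
# so overlap between the two splits would be easy to spot.
images = np.zeros((100, 32, 32, 3), dtype=np.float32)
labels = np.arange(100).reshape(-1, 1)

(train_x, train_y), (val_x, val_y) = train_val_split(images, labels)
print(train_x.shape, val_x.shape)  # (90, 32, 32, 3) (10, 32, 32, 3)
```

With the real data you would pass validation_data=(val_x, val_y) to model.fit and keep model.evaluate on the test set for the end. Keras also accepts a validation_split argument in model.fit, which holds out the last fraction of the training arrays without shuffling first.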

Step 6: Making Predictions

Finally, we can use the trained model to make predictions on new images:

# Predictions on the test images
predictions = model.predict(test_images)

# Print the prediction for the first test image
# (.numpy() turns the tensor returned by tf.argmax into a plain integer)
print(f'Predicted label: {tf.argmax(predictions[0]).numpy()}')
print(f'Actual label: {test_labels[0]}')

Output for the above code:

313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 13ms/step
Predicted label: 3
Actual label: [3]
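A label of 3 is not very informative on its own. CIFAR-10 labels are indices into a fixed list of ten class names, and the actual label prints as [3] because the label arrays have shape (N, 1). The sketch below maps an index back to a name. The class list follows the standard CIFAR-10 ordering (it does not appear in the code above), the helper `decode` is written for this example, and the probability vector is made up rather than taken from the trained model.

```python
import numpy as np

# Standard CIFAR-10 class order; index i is the name for label i.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def decode(probabilities):
    """Return (index, class name, probability) for the most likely class."""
    index = int(np.argmax(probabilities))
    return index, CIFAR10_CLASSES[index], float(probabilities[index])


# Illustrative softmax output for one image (ten values summing to 1).
example = np.array([0.02, 0.01, 0.05, 0.60, 0.04, 0.20, 0.03, 0.02, 0.02, 0.01])
index, name, confidence = decode(example)
print(index, name, confidence)  # 3 cat 0.6

# Labels come as a column of shape (N, 1); ravel() gives a flat vector.
example_labels = np.array([[3], [8]])
print([CIFAR10_CLASSES[i] for i in example_labels.ravel()])  # ['cat', 'ship']
```

With the trained model, decode(predictions[0]) would give the name and confidence for the first test image.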

Step 7: Visualizing the Training and Validation Accuracy

Once the model is trained, we can plot its training and validation accuracy for each epoch:

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.title('Model Evaluation')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()  # show the labels passed to plt.plot
plt.show()

Output for the above code: a line plot of training and validation accuracy against epoch.
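The same curves can also be read numerically. The sketch below uses the accuracy values copied from the training log in Step 4 (in a live session they would come from history.history) to find the epoch where validation accuracy peaked and the gap between training and validation accuracy at the end. The variable names are chosen for this example.

```python
# Per-epoch values copied from the Step 4 training log.
train_accuracy = [0.3282, 0.5461, 0.6022, 0.6431, 0.6749,
                  0.6965, 0.7117, 0.7228, 0.7411, 0.7536]
val_accuracy = [0.5078, 0.5708, 0.6341, 0.6566, 0.6678,
                0.6889, 0.6987, 0.6945, 0.6972, 0.6927]

# Epochs are numbered from 1 in the log, list indices from 0.
best_index = max(range(len(val_accuracy)), key=val_accuracy.__getitem__)
best_epoch = best_index + 1
final_gap = train_accuracy[-1] - val_accuracy[-1]

print(best_epoch, val_accuracy[best_index])  # 7 0.6987
print(round(final_gap, 4))                   # 0.0609
```

Validation accuracy peaks at epoch 7 and then hovers just below 0.70 while training accuracy keeps climbing, which is the usual sign that the model is starting to overfit. Stopping around that point, or adding regularization such as dropout or data augmentation, are common next experiments.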

Full code for Image Classification in Python

Here's a full code example for image classification in Python using TensorFlow and Keras: it loads the CIFAR-10 dataset, builds a Convolutional Neural Network (CNN), trains the model, and evaluates its performance on test data. This concise implementation demonstrates the core steps of image classification, from data preprocessing to making predictions on new images.

# Import dependencies
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# Load the CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# Verify the shape of the data
print(f'Train Images: {train_images.shape}')
print(f'Test Images: {test_images.shape}')

# Model architecture
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Print the model summary (summary() prints the table itself and returns None)
model.summary()

# Model compilation
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

# Evaluate the model on the test data
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f'\nTest accuracy: {test_acc}')

# Predictions on the test images
predictions = model.predict(test_images)

# Print the prediction for the first test image
# (.numpy() turns the tensor returned by tf.argmax into a plain integer)
print(f'Predicted label: {tf.argmax(predictions[0]).numpy()}')
print(f'Actual label: {test_labels[0]}')

# Plot training and validation accuracy per epoch
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.title('Model Evaluation')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()  # show the labels passed to plt.plot
plt.show()

Conclusion

Image classification in Python is an accessible entry point into computer vision and machine learning. With TensorFlow and Keras, a few dozen lines of code are enough to load CIFAR-10, build a small CNN, and train it to roughly 69% test accuracy in ten epochs. Working through data preprocessing, model building, and evaluation gives you a solid understanding of how an image classifier is put together. From here, you can experiment with more complex architectures and techniques to improve on this baseline. Whether you're classifying simple objects or tackling more intricate visual tasks, mastering image classification is a significant step towards building real-world machine learning solutions.
