The Visual Geometry Group (VGG) network is a deep convolutional neural network architecture that has become a cornerstone in the field of computer vision. Known for its simplicity and depth, VGG was introduced by Simonyan and Zisserman in their 2014 paper, "Very Deep Convolutional Networks for Large-Scale Image Recognition." Keras, a popular deep learning library, provides pre-built versions of VGG, such as VGG16 and VGG19, making it easier for developers to leverage this powerful architecture in their projects.
In this blog, we'll explore how to use Keras' built-in VGG models, focusing on how to load, modify, and apply them to image classification tasks.
Table of Contents
Introduction to VGG and Its Variants
Loading Pre-Trained VGG Network with Keras in Python
Using VGG for Feature Extraction
Fine-Tuning VGG for Custom Tasks
Implementing VGG on the CIFAR-10 Dataset
Conclusion
1. Introduction to VGG and Its Variants
VGG is renowned for its depth and uniform architecture, which consists of a series of convolutional layers with small (3x3) filters and ReLU activation, followed by max-pooling layers. The two most common versions are:
VGG16: Contains 16 layers with learnable weights.
VGG19: Contains 19 layers with learnable weights.
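Both models are available in Keras, pre-trained on ImageNet. The weights come from the ILSVRC classification subset, which has about 1.28 million training images across 1,000 categories (the full ImageNet database holds over 14 million images in more than 20,000 categories). This pre-training makes them highly effective for various image recognition tasks.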
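The layer counts behind the two names can be checked with a little arithmetic. The sketch below is plain Python, not Keras code. The constants and helper names (`VGG_CONFIGS`, `count_weight_layers`, `count_parameters`) are introduced here for illustration, and it assumes the standard configurations from the paper with a 224x224 RGB input and a 1,000-class output.

```python
# Number of 3x3 conv layers in each of the five blocks (illustrative constants).
VGG_CONFIGS = {
    "VGG16": [2, 2, 3, 3, 3],
    "VGG19": [2, 2, 4, 4, 4],
}
BLOCK_CHANNELS = [64, 128, 256, 512, 512]  # output channels of each block
FC_UNITS = [4096, 4096, 1000]              # the three fully connected layers


def count_weight_layers(name):
    """Conv layers plus fully connected layers (pooling has no weights)."""
    return sum(VGG_CONFIGS[name]) + len(FC_UNITS)


def count_parameters(name, input_size=224, in_channels=3):
    """Total weights and biases for the full network."""
    total = 0
    channels = in_channels
    size = input_size
    for n_convs, out_channels in zip(VGG_CONFIGS[name], BLOCK_CHANNELS):
        for _ in range(n_convs):
            # 3x3 kernel weights plus one bias per output channel
            total += 3 * 3 * channels * out_channels + out_channels
            channels = out_channels
        size //= 2  # each block ends with 2x2 max-pooling
    features = size * size * channels
    for units in FC_UNITS:
        total += features * units + units
        features = units
    return total


for name in VGG_CONFIGS:
    print(name, count_weight_layers(name), f"{count_parameters(name):,}")
```

This gives 16 and 19 weight layers, and about 138 million and 144 million parameters respectively. Most of those parameters sit in the first fully connected layer, not in the convolutions.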
2. Loading Pre-Trained VGG Network with Keras in Python
Keras makes it simple to load VGG models. Here's how you can load the VGG16 model pre-trained on ImageNet:
from tensorflow.keras.applications import VGG16
# Load the VGG16 model with pre-trained ImageNet weights
vgg16 = VGG16(weights='imagenet', include_top=True)
# Display the model's architecture
vgg16.summary()
In this example, include_top=True means that the fully connected layers at the top of the network are included, making the model ready for classification with 1,000 classes. If you want to use VGG for a different task, you can set include_top=False to exclude these layers.
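To see what `include_top=False` leaves you with, it helps to trace the tensor shape through the five convolutional blocks. The following is a plain-Python sketch, not a Keras call. `conv_base_output_shape` is a hypothetical helper, and it assumes 'same'-padded 3x3 convolutions (which keep the spatial size) and a 2x2 max-pool with stride 2 at the end of each block.

```python
BLOCK_CHANNELS = [64, 128, 256, 512, 512]  # output channels of each block


def conv_base_output_shape(height, width):
    """Shape of the final feature map produced by the VGG convolutional base."""
    for _ in BLOCK_CHANNELS:
        # 'same'-padded convs leave height/width unchanged; pooling halves them.
        height //= 2
        width //= 2
    return (height, width, BLOCK_CHANNELS[-1])


print(conv_base_output_shape(224, 224))  # (7, 7, 512)
print(conv_base_output_shape(32, 32))    # (1, 1, 512)
```

A 224x224 input ends up as a 7x7x512 feature map, while a 32x32 input collapses to 1x1x512. That is consistent with the 32x32 minimum input size that Keras documents for VGG16.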
3. Using VGG for Feature Extraction
One common use of pre-trained models like VGG is to use them as feature extractors. This is done by removing the fully connected layers and using the convolutional base to generate feature maps.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras import layers
# Load VGG16 without the top fully connected layers
vgg16_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the convolutional base
for layer in vgg16_base.layers:
    layer.trainable = False
# Add custom classification layers on top
x = vgg16_base.output
x = layers.Flatten()(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(10, activation='softmax')(x)
# Create a new model
model = Model(inputs=vgg16_base.input, outputs=x)
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Display the model's architecture
model.summary()
In this setup, the convolutional layers act as a feature extractor, and the newly added fully connected layers are trained for a specific task, such as classifying images from the CIFAR-10 dataset.
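A quick count shows how much of this model is actually trained while the base is frozen. The arithmetic below is a sketch in plain Python. It assumes the 7x7x512 feature map that a 224x224 input produces, and `dense_params` is a helper introduced here for illustration.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


flattened = 7 * 7 * 512                # size of the Flatten() output
hidden = dense_params(flattened, 256)  # Dense(256, activation='relu')
output = dense_params(256, 10)         # Dense(10, activation='softmax')
trainable = hidden + output            # Dropout adds no parameters

frozen = 14_714_688                    # parameters in the VGG16 convolutional base

print(f"flattened features:   {flattened:,}")
print(f"trainable (new head): {trainable:,}")
print(f"frozen (conv base):   {frozen:,}")
```

The new head has about 6.4 million trainable parameters, nearly all of them in the first `Dense` layer. If that is too many for a small dataset, replacing `Flatten` with `GlobalAveragePooling2D` is a common way to shrink it.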
4. Fine-Tuning VGG for Custom Tasks
Fine-tuning is a process where you unfreeze some of the top layers of the convolutional base and train both the newly added layers and the top layers of the base model. This allows the model to adapt more specifically to the new dataset while leveraging the pre-trained weights.
import tensorflow as tf
# Unfreeze the top layers of the convolutional base
for layer in vgg16_base.layers[-4:]:
    layer.trainable = True
# Recompile the model with a lower learning rate
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
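Fine-tuning tends to work best after the new layers have first been trained with the base frozen, and when the new dataset is reasonably similar to the one the model was originally trained on. With a very small dataset, unfreeze only a few layers and keep the learning rate low to limit overfitting.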
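It is worth checking exactly which layers the `[-4:]` slice selects. The list below reproduces the layer names Keras assigns to the VGG16 base with `include_top=False`. The input layer's name can differ between versions, so `input_layer` is a placeholder. The rest is plain Python.

```python
# Layer names of the VGG16 convolutional base, in order.
convs_per_block = [2, 2, 3, 3, 3]
layer_names = ["input_layer"]  # placeholder: the actual name varies by Keras version
for block, n_convs in enumerate(convs_per_block, start=1):
    layer_names += [f"block{block}_conv{i}" for i in range(1, n_convs + 1)]
    layer_names.append(f"block{block}_pool")

unfrozen = layer_names[-4:]
with_weights = [name for name in unfrozen if "conv" in name]

print(len(layer_names))  # 19
print(unfrozen)
print(with_weights)
```

The slice covers the three convolutions of block 5 plus the final pooling layer. The pooling layer has no weights, so in practice only `block5_conv1` to `block5_conv3` are updated during fine-tuning.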
5. Implementing VGG on the CIFAR-10 Dataset
Let’s apply VGG16 to the CIFAR-10 dataset. Since CIFAR-10 images are only 32x32 pixels, they are resized to 224x224 pixels to match the input shape the model above was built with.
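from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical
import tensorflow as tf
# Load the CIFAR-10 dataset and one-hot encode the labels
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)
# Resize and preprocess one batch at a time; resizing all 50,000 training
# images up front would need roughly 30 GB of memory as float32
def prepare(images, labels):
    images = tf.image.resize(tf.cast(images, tf.float32), (224, 224))
    # Apply the same input preprocessing the ImageNet weights were trained with
    return preprocess_input(images), labels
train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(10000).batch(32)
train_ds = train_ds.map(prepare).prefetch(tf.data.AUTOTUNE)
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
test_ds = test_ds.map(prepare).prefetch(tf.data.AUTOTUNE)
# Train the model (the batch size is set by the dataset pipeline)
history = model.fit(train_ds, epochs=10, validation_data=test_ds)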
This script fine-tunes the VGG16 model on the CIFAR-10 dataset, enabling the powerful pre-trained model to adapt to the new classification task.
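Before running this on the full dataset, a rough memory estimate is useful. The sketch below is plain arithmetic. It assumes the resized images are held as 32-bit floats (the default output type of `tf.image.resize` with bilinear interpolation), and `array_gigabytes` is a helper name made up for this example.

```python
def array_gigabytes(n_images, height, width, channels=3, bytes_per_value=4):
    """Size of a dense image array in gigabytes (1 GB = 10**9 bytes)."""
    return n_images * height * width * channels * bytes_per_value / 1e9


# CIFAR-10 training set as loaded: 50,000 uint8 images of 32x32x3
original = array_gigabytes(50_000, 32, 32, bytes_per_value=1)
# The same images after resizing to 224x224 as float32
resized = array_gigabytes(50_000, 224, 224)

print(f"original training set: {original:.2f} GB")
print(f"resized training set:  {resized:.1f} GB")
```

At roughly 30 GB for the training images alone, resizing everything up front will not fit in memory on most machines. Resizing one batch at a time, for example inside a `tf.data` pipeline or with a `Resizing` layer at the start of the model, is a common way around this.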
6. Conclusion
Keras' built-in VGG models provide a powerful and convenient way to leverage pre-trained deep learning architectures for various image classification tasks. Whether you're using VGG as a feature extractor, fine-tuning it for specific applications, or simply exploring its architecture, Keras makes it accessible and easy to implement.
By using VGG with Keras, you can get strong results on many image recognition tasks with relatively little code, which makes it a good starting point for both beginners and experienced practitioners.