This blog walks through the process of building a simple neural network in PyTorch to classify images from the Fashion MNIST dataset. It covers everything from loading and preprocessing the data to building, training, and evaluating the model. The tutorial provides a solid foundation for further exploration into more complex models and techniques, such as convolutional neural networks (CNNs) or transfer learning.
Introduction to Fashion MNIST in PyTorch
The Fashion MNIST dataset is a popular alternative to the classic MNIST dataset. It contains 70,000 grayscale images of clothing items in 10 categories, split into 60,000 training images and 10,000 test images. Each image is 28x28 pixels, which keeps the dataset small enough for testing and learning image classification techniques. In this blog, we will explore how to use PyTorch to build a neural network for classifying these images. Each image is labeled with one of the following categories (integer labels 0 through 9, in this order):
T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot
Because the images are low-resolution (28x28 pixels) and grayscale, models train quickly on them, which makes the dataset well suited to quick experimentation with image classification algorithms.
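The dataset stores each label as an integer from 0 to 9, and those integers map to the categories above in the order listed. A minimal sketch of that lookup follows; `CLASS_NAMES` and `label_to_name` are illustrative names chosen here, not part of torchvision.

```python
# Fashion MNIST class names, indexed by integer label (0-9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label: int) -> str:
    """Return the category name for an integer label (hypothetical helper)."""
    if not 0 <= label < len(CLASS_NAMES):
        raise ValueError(f"label must be in 0..{len(CLASS_NAMES) - 1}, got {label}")
    return CLASS_NAMES[label]


print(label_to_name(0))  # T-shirt/top
print(label_to_name(9))  # Ankle boot
```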
Setting Up the Environment
To get started, ensure that you have PyTorch and other necessary libraries installed. You can install them using pip:
pip install torch torchvision matplotlib
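Before moving on, it can help to confirm that the packages are importable. The sketch below uses only the standard library; `missing_packages` is a hypothetical helper, and matplotlib is included in the list because the visualization step later in this blog imports it.

```python
import importlib.util


def missing_packages(names):
    """Return the packages from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["torch", "torchvision", "matplotlib"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```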
Loading the Fashion MNIST Dataset
PyTorch provides easy access to the Fashion MNIST dataset via the torchvision library. Let's start by loading the dataset and applying basic transformations such as converting images to tensors and normalizing them.
import torch
from torchvision import datasets, transforms
# Define a transform: convert images to tensors in [0, 1], then normalize to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
# Load the Fashion MNIST dataset
trainset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)
Output for the above code:
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 26421880/26421880 [00:03<00:00, 6682548.34it/s]
Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 29515/29515 [00:00<00:00, 310989.28it/s]
Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 4422102/4422102 [00:00<00:00, 5418161.51it/s]
Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 5148/5148 [00:00<00:00, 14599240.70it/s]
Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
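As a rough illustration of what the transform does to each pixel: `ToTensor()` scales 8-bit pixel values from [0, 255] to [0, 1], and `Normalize((0.5,), (0.5,))` then computes `(x - 0.5) / 0.5`, which maps the values to [-1, 1]. The sketch below reproduces that arithmetic in plain Python; `normalize_pixel` and `unnormalize` are illustrative helpers, not torchvision functions.

```python
def normalize_pixel(raw, mean=0.5, std=0.5):
    """Mimic ToTensor() followed by Normalize((mean,), (std,)) for one pixel."""
    scaled = raw / 255.0           # ToTensor: [0, 255] -> [0, 1]
    return (scaled - mean) / std   # Normalize: [0, 1] -> [-1, 1]


def unnormalize(value, mean=0.5, std=0.5):
    """Invert the normalization, returning a value in [0, 1]."""
    return value * std + mean


print(normalize_pixel(0))    # -1.0
print(normalize_pixel(255))  # 1.0
```

The inverse, `value * 0.5 + 0.5`, is the same operation as the `img / 2 + 0.5` step used when displaying images later in this blog.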
Building the Neural Network
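We'll create a simple feedforward neural network for classifying the Fashion MNIST images. The network takes the flattened 784-pixel image as input, passes it through two hidden layers (512 and 256 units) with ReLU activations, and produces 10 output scores, one per class.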
import torch.nn as nn
import torch.nn.functional as F
class FashionMNISTClassifier(nn.Module):
    def __init__(self):
        super(FashionMNISTClassifier, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten the image
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # Raw class scores (logits); no softmax here
        return x

model = FashionMNISTClassifier()
print(model)
Output for the above code:
FashionMNISTClassifier(
(fc1): Linear(in_features=784, out_features=512, bias=True)
(fc2): Linear(in_features=512, out_features=256, bias=True)
(fc3): Linear(in_features=256, out_features=10, bias=True)
)
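To get a feel for the size of this model, the layer shapes printed above can be turned into a parameter count. Each `Linear` layer holds `in_features * out_features` weights plus `out_features` biases. The sketch below does this arithmetic in plain Python; `linear_params` is an illustrative helper, not a PyTorch function.

```python
def linear_params(in_features, out_features):
    """Number of trainable parameters in a fully connected layer with bias."""
    return in_features * out_features + out_features


# (in_features, out_features) for fc1, fc2, fc3
layer_sizes = [(28 * 28, 512), (512, 256), (256, 10)]

per_layer = [linear_params(i, o) for i, o in layer_sizes]
total = sum(per_layer)

print(per_layer)  # [401920, 131328, 2570]
print(total)      # 535818
```

With the model instantiated, `sum(p.numel() for p in model.parameters())` should report the same total.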
Training the Model
Next, we need to define the loss function and the optimizer. We'll use cross-entropy loss, which is standard for multi-class classification problems, and the Adam optimizer.
import torch.optim as optim
# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training loop
def train_model(trainloader, model, criterion, optimizer, num_epochs=5):
    for epoch in range(num_epochs):
        running_loss = 0.0
        for images, labels in trainloader:
            optimizer.zero_grad()              # Clear gradients from the previous step
            outputs = model(images)            # Forward pass
            loss = criterion(outputs, labels)
            loss.backward()                    # Backpropagate
            optimizer.step()                   # Update the weights
            running_loss += loss.item()
        # Report the average loss per batch for this epoch
        print(f"Epoch {epoch+1}, Loss: {running_loss / len(trainloader)}")

train_model(trainloader, model, criterion, optimizer)
Output for the above code:
Epoch 1, Loss: 0.4830017135913438
Epoch 2, Loss: 0.3657274188231558
Epoch 3, Loss: 0.3273437559318695
Epoch 4, Loss: 0.3029962965706264
Epoch 5, Loss: 0.278847184866222
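To put these loss values in context, it may help to see what cross-entropy computes for a single sample: the negative log of the softmax probability assigned to the true class. `nn.CrossEntropyLoss` takes raw logits and, by default, averages this quantity over the batch. The sketch below is a plain-Python illustration of the per-sample formula, not PyTorch's implementation; `cross_entropy` is a name chosen here.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # Subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# A model that scores all 10 classes equally is guessing at random.
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 4))  # 2.3026, i.e. ln(10)

# A higher score on the correct class gives a lower loss.
confident_loss = cross_entropy([5.0] + [0.0] * 9, target=0)
print(confident_loss < uniform_loss)  # True
```

Against that random-guess baseline of about 2.30, the first-epoch average of roughly 0.48 shown above indicates the network is already learning useful features.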
Evaluating the Model
After training, it's crucial to evaluate the model's performance on the test set to see how well it generalizes to new data.
def evaluate_model(testloader, model):
    correct = 0
    total = 0
    with torch.no_grad():  # Gradients are not needed for evaluation
        for images, labels in testloader:
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)  # Index of the highest score per image
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f"Accuracy: {100 * correct / total}%")

evaluate_model(testloader, model)
Output for the above code:
Accuracy: 87.28%
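A single accuracy number can hide classes that the model handles poorly. One way to look closer is to compute accuracy per class. The sketch below works on plain lists of integer labels; `per_class_accuracy` is a hypothetical helper, and the toy inputs are made up for illustration rather than taken from the model above.

```python
from collections import Counter


def per_class_accuracy(predicted, labels, num_classes=10):
    """Return {class_index: accuracy} for every class that appears in `labels`."""
    correct = Counter()
    total = Counter()
    for p, y in zip(predicted, labels):
        total[y] += 1
        if p == y:
            correct[y] += 1
    return {c: correct[c] / total[c] for c in range(num_classes) if total[c] > 0}


# Toy example with 3 classes (illustrative data only)
toy_predicted = [0, 1, 1, 2, 2, 2]
toy_labels = [0, 1, 2, 2, 2, 0]
acc = per_class_accuracy(toy_predicted, toy_labels, num_classes=3)
print(acc)  # {0: 0.5, 1: 1.0, 2: 0.6666666666666666}
```

With PyTorch tensors, the predictions and labels collected over the test set could be passed in as `predicted.tolist()` and `labels.tolist()`.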
Visualizing the Results
To get a better understanding of how well the model is performing, let's visualize some predictions on the test data.
import matplotlib.pyplot as plt
# Class names in Fashion MNIST
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Function to show an image (reads the global `predicted` and `labels` defined below)
def imshow(img, i):
    img = img / 2 + 0.5  # Unnormalize from [-1, 1] back to [0, 1]
    plt.imshow(img.numpy().squeeze(), cmap='gray')
    plt.title(f'Predicted: {classes[predicted[i]]}\nTrue: {classes[labels[i]]}')
    plt.show()
# Display some predictions
dataiter = iter(testloader)
images, labels = next(dataiter)
outputs = model(images)
_, predicted = torch.max(outputs, 1)
# Show images and predictions
for i in range(2):
    imshow(images[i], i)
Output for the above code: two test images, each shown with its predicted and true class in the title.
Full Code for Fashion MNIST Dataset with PyTorch
The full code for working with the Fashion MNIST dataset using PyTorch covers the entire pipeline from data loading and preprocessing to model building, training, and evaluation. It includes setting up the dataset using torchvision, defining a neural network model, and training the model using a loop that updates the model's weights based on the loss. The code concludes with evaluating the model's accuracy on test data and visualizing the results, providing a comprehensive example of how to approach image classification tasks in PyTorch.
# Loading dependencies
import torch
from torchvision import datasets, transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt
# Define a transform to normalize the data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
# Load the Fashion MNIST dataset
trainset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)
# Neural network
class FashionMNISTClassifier(nn.Module):
    def __init__(self):
        super(FashionMNISTClassifier, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten the image
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # Raw class scores (logits); no softmax here
        return x

model = FashionMNISTClassifier()
print(model)
# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Training loop
def train_model(trainloader, model, criterion, optimizer, num_epochs=5):
    for epoch in range(num_epochs):
        running_loss = 0.0
        for images, labels in trainloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f"Epoch {epoch+1}, Loss: {running_loss / len(trainloader)}")

train_model(trainloader, model, criterion, optimizer)
# Model evaluation
def evaluate_model(testloader, model):
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in testloader:
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f"Accuracy: {100 * correct / total}%")

evaluate_model(testloader, model)
# Class names in Fashion MNIST
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# Function to show an image
def imshow(img, i):
    img = img / 2 + 0.5  # Unnormalize
    plt.imshow(img.numpy().squeeze(), cmap='gray')
    plt.title(f'Predicted: {classes[predicted[i]]}\nTrue: {classes[labels[i]]}')
    plt.show()
# Display some predictions
dataiter = iter(testloader)
images, labels = next(dataiter)
outputs = model(images)
_, predicted = torch.max(outputs, 1)
# Show images and predictions
for i in range(2):
    imshow(images[i], i)
Conclusion
The process of building a neural network to classify images from the Fashion MNIST dataset demonstrates the foundational steps of deep learning and image classification with PyTorch. Starting with data preparation and loading, we've seen how important it is to properly transform and normalize the dataset to ensure effective model training. The simple feedforward neural network used in this tutorial provides a basic yet powerful introduction to image classification, highlighting how layers, activation functions, and loss calculation contribute to the learning process.