Fashion MNIST Dataset with PyTorch: A Step-by-Step Tutorial

samuel black

Aug 30, 2024 · 6 min read

In this post, we walk through building a simple neural network to classify images from the Fashion MNIST dataset using PyTorch. We cover everything from loading and preprocessing the data to building, training, and evaluating the model. The tutorial provides a foundation for exploring more complex models and techniques, such as convolutional neural networks (CNNs) or transfer learning.

Introduction to Fashion MNIST in PyTorch

The Fashion MNIST dataset is a popular alternative to the classic MNIST dataset, featuring 70,000 grayscale images of 10 different categories of clothing items. Each image is 28x28 pixels in size, making it ideal for testing and learning image classification techniques. In this blog, we will explore how to use PyTorch to build a neural network for classifying images from the Fashion MNIST dataset. Fashion MNIST consists of 60,000 training images and 10,000 test images, each labeled with one of the following categories:

T-shirt/top
Trouser
Pullover
Dress
Coat
Sandal
Shirt
Sneaker
Bag
Ankle boot

The labels are stored as integers from 0 to 9, in the order listed above. Because the images are small and grayscale, models train quickly even on a CPU, which makes the dataset well suited to quick experimentation with image classification algorithms.
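As a small illustration of how the labels relate to the category names, the sketch below assumes the integer labels 0 through 9 follow the order of the list above, which matches the class ordering used by torchvision. The helper name label_name is hypothetical and only used for this example.

```python
# Category names in label order (index 0 is 'T-shirt/top', index 9 is 'Ankle boot')
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


def label_name(index):
    """Map an integer label (0-9) to its human-readable category name."""
    return classes[index]


# Print the full label-to-name table
for index, name in enumerate(classes):
    print(index, name)
```

The same list is reused later in the tutorial to turn predicted class indices into readable titles.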

Setting Up the Environment

To get started, ensure that you have PyTorch and other necessary libraries installed. You can install them using pip:

pip install torch torchvision matplotlib

Loading the Fashion MNIST Dataset

PyTorch provides easy access to the Fashion MNIST dataset via the torchvision library. Let's start by loading the dataset and applying basic transformations such as converting images to tensors and normalizing them.

import torch
from torchvision import datasets, transforms

# Define a transform that converts images to tensors and normalizes them to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the Fashion MNIST dataset
trainset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)

# Wrap the datasets in DataLoaders that yield batches of 64 images
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

Output for the above code:

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|██████████| 26421880/26421880 [00:03<00:00, 6682548.34it/s] 
Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|██████████| 29515/29515 [00:00<00:00, 310989.28it/s]
Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|██████████| 4422102/4422102 [00:00<00:00, 5418161.51it/s]
Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|██████████| 5148/5148 [00:00<00:00, 14599240.70it/s]
Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw
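To see what the normalization and batching steps do without downloading anything, here is a minimal sketch on synthetic data. The random tensors fake_images and fake_labels are stand-ins invented for this example, not real Fashion MNIST samples, and the normalization is written out by hand using the same arithmetic as transforms.Normalize((0.5,), (0.5,)), namely (x - mean) / std.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for the dataset: 200 single-channel 28x28 images in [0, 1]
fake_images = torch.rand(200, 1, 28, 28)
fake_labels = torch.randint(0, 10, (200,))

# Same arithmetic as Normalize with mean 0.5 and std 0.5: maps [0, 1] to [-1, 1]
normalized = (fake_images - 0.5) / 0.5

loader = DataLoader(TensorDataset(normalized, fake_labels), batch_size=64, shuffle=True)

# 200 samples with batch_size=64 give three full batches and one partial batch
batch_sizes = [images.shape[0] for images, _ in loader]
first_images, first_labels = next(iter(loader))

print(batch_sizes)
print(first_images.shape, first_labels.shape)
print(normalized.min().item(), normalized.max().item())
```

With the real training set of 60,000 images, the same batch size yields 938 batches per epoch, the last of which is smaller than 64.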

Building the Neural Network

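We'll create a simple feedforward neural network for classifying the Fashion MNIST images. The network flattens each 28x28 image into a 784-element vector and passes it through three fully connected layers: two hidden layers with 512 and 256 units, each followed by a ReLU activation, and an output layer with 10 units, one per class.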

import torch.nn as nn
import torch.nn.functional as F

class FashionMNISTClassifier(nn.Module):
    def __init__(self):
        super(FashionMNISTClassifier, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 512)  # Input to first hidden layer
        self.fc2 = nn.Linear(512, 256)      # First to second hidden layer
        self.fc3 = nn.Linear(256, 10)       # Second hidden layer to class scores

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten the image
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)          # Raw scores (logits), no softmax here
        return x

model = FashionMNISTClassifier()
print(model)

Output for the above code:

FashionMNISTClassifier(
  (fc1): Linear(in_features=784, out_features=512, bias=True)
  (fc2): Linear(in_features=512, out_features=256, bias=True)
  (fc3): Linear(in_features=256, out_features=10, bias=True)
)
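To get a feel for the size of this model, the sketch below counts its trainable parameters by hand. It assumes only the layer shapes printed above; each linear layer has in_features * out_features weights plus out_features biases. The helper name linear_params is hypothetical.

```python
def linear_params(in_features, out_features):
    """Number of trainable parameters in a fully connected layer with bias."""
    return in_features * out_features + out_features


# (in_features, out_features) for fc1, fc2, and fc3, taken from the printed model
layer_sizes = [(784, 512), (512, 256), (256, 10)]

per_layer = [linear_params(i, o) for i, o in layer_sizes]
total_params = sum(per_layer)

print(per_layer)      # [401920, 131328, 2570]
print(total_params)   # 535818
```

Most of the roughly 536,000 parameters sit in the first layer, because it connects all 784 input pixels to 512 hidden units.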

Training the Model

Next, we need to define the loss function and the optimizer. We'll use cross-entropy loss, which is standard for multi-class classification problems, and the Adam optimizer.

import torch.optim as optim

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
def train_model(trainloader, model, criterion, optimizer, num_epochs=5):
    for epoch in range(num_epochs):
        running_loss = 0.0
        for images, labels in trainloader:
            optimizer.zero_grad()              # Clear gradients from the previous step
            outputs = model(images)            # Forward pass
            loss = criterion(outputs, labels)  # Compute the batch loss
            loss.backward()                    # Backpropagate
            optimizer.step()                   # Update the weights
            running_loss += loss.item()
        # Report the average loss per batch for this epoch
        print(f"Epoch {epoch+1}, Loss: {running_loss / len(trainloader)}")

train_model(trainloader, model, criterion, optimizer)

Output for the above code:

Epoch 1, Loss: 0.4830017135913438
Epoch 2, Loss: 0.3657274188231558
Epoch 3, Loss: 0.3273437559318695
Epoch 4, Loss: 0.3029962965706264
Epoch 5, Loss: 0.278847184866222
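One detail worth spelling out is why the model's forward pass returns raw scores without a softmax: nn.CrossEntropyLoss applies log-softmax internally and then takes the negative log-probability of the correct class. The sketch below checks this on a tiny hand-made batch; the logits and targets values are made up purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two made-up samples with three classes each (raw scores, not probabilities)
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Built-in loss, as used in the training loop
builtin_loss = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, pick the log-probability of the true class, average
log_probs = F.log_softmax(logits, dim=1)
manual_loss = -log_probs[torch.arange(len(targets)), targets].mean()

print(builtin_loss.item(), manual_loss.item())
```

Both values agree, which is why adding a softmax layer before this loss would be redundant and would distort the gradients.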

Evaluating the Model

After training, it's crucial to evaluate the model's performance on the test set to see how well it generalizes to new data.

def evaluate_model(testloader, model):
    correct = 0
    total = 0
    with torch.no_grad():  # Gradients are not needed for evaluation
        for images, labels in testloader:
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)  # Index of the highest score per image
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f"Accuracy: {100 * correct / total}%")

evaluate_model(testloader, model)

Output for the above code:

Accuracy: 87.28%
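A single accuracy number can hide classes the model handles poorly. As a possible extension, the sketch below computes accuracy per class from tensors of predictions and labels. The function per_class_accuracy is a hypothetical helper, and the small tensors are made-up values rather than real model outputs.

```python
import torch


def per_class_accuracy(predicted, labels, num_classes=10):
    """Return a list with the fraction of correct predictions for each class.

    Classes that never appear in `labels` get NaN.
    """
    correct = torch.zeros(num_classes)
    total = torch.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c                          # Samples whose true class is c
        total[c] = mask.sum()
        correct[c] = (predicted[mask] == c).sum()   # Of those, how many were predicted as c
    return (correct / total).tolist()


# Made-up example with three classes
labels = torch.tensor([0, 0, 1, 1, 1, 2])
predicted = torch.tensor([0, 1, 1, 1, 0, 2])

acc = per_class_accuracy(predicted, labels, num_classes=3)
print(acc)
```

To use it with the tutorial's model, one would collect the predicted and labels tensors from every batch of the test loader (for example with torch.cat) and pass them in together.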

Visualizing the Results

To get a better understanding of how well the model is performing, let's visualize some predictions on the test data.

import matplotlib.pyplot as plt

# Class names in Fashion MNIST
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Function to show an image (reads the global `predicted` and `labels` tensors)
def imshow(img, i):
    img = img / 2 + 0.5  # Unnormalize
    plt.imshow(img.numpy().squeeze(), cmap='gray')
    plt.title(f'Predicted: {classes[predicted[i]]}\nTrue: {classes[labels[i]]}')
    plt.show()

# Display some predictions
dataiter = iter(testloader)
images, labels = next(dataiter)
outputs = model(images)
_, predicted = torch.max(outputs, 1)

# Show images and predictions
for i in range(2):
    imshow(images[i], i)

Output for the above code:

[Figure: two test images, each titled with its predicted and true class]

Full Code for Fashion MNIST Dataset with PyTorch

The full code for working with the Fashion MNIST dataset using PyTorch covers the entire pipeline from data loading and preprocessing to model building, training, and evaluation. It includes setting up the dataset using torchvision, defining a neural network model, and training the model using a loop that updates the model's weights based on the loss. The code concludes with evaluating the model's accuracy on test data and visualizing the results, providing a comprehensive example of how to approach image classification tasks in PyTorch.

# Loading dependencies
import torch
from torchvision import datasets, transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt

# Define a transform that converts images to tensors and normalizes them to [-1, 1]
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load the Fashion MNIST dataset
trainset = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)

trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

# Neural network
class FashionMNISTClassifier(nn.Module):
    def __init__(self):
        super(FashionMNISTClassifier, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # Flatten the image
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

model = FashionMNISTClassifier()
print(model)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
def train_model(trainloader, model, criterion, optimizer, num_epochs=5):
    for epoch in range(num_epochs):
        running_loss = 0.0
        for images, labels in trainloader:
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        print(f"Epoch {epoch+1}, Loss: {running_loss / len(trainloader)}")

train_model(trainloader, model, criterion, optimizer)

# Model evaluation
def evaluate_model(testloader, model):
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in testloader:
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f"Accuracy: {100 * correct / total}%")

evaluate_model(testloader, model)

# Class names in Fashion MNIST
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Function to show an image (reads the global `predicted` and `labels` tensors)
def imshow(img, i):
    img = img / 2 + 0.5  # Unnormalize
    plt.imshow(img.numpy().squeeze(), cmap='gray')
    plt.title(f'Predicted: {classes[predicted[i]]}\nTrue: {classes[labels[i]]}')
    plt.show()

# Display some predictions
dataiter = iter(testloader)
images, labels = next(dataiter)
outputs = model(images)
_, predicted = torch.max(outputs, 1)

# Show images and predictions
for i in range(2):
    imshow(images[i], i)

Conclusion

The process of building a neural network to classify images from the Fashion MNIST dataset demonstrates the foundational steps of deep learning and image classification with PyTorch. Starting with data preparation and loading, we've seen how important it is to properly transform and normalize the dataset to ensure effective model training. The simple feedforward neural network used in this tutorial provides a basic yet powerful introduction to image classification, highlighting how layers, activation functions, and loss calculation contribute to the learning process.
