💊 Pill of the Week
Welcome back! In Part 1, we explored what makes PyTorch special and why it's widely adopted. Now, let's dive into how PyTorch is structured, its key modules, and how to set up your environment. We'll also build a basic PyTorch model to understand its workings from a high level. Don't worry if you're new to PyTorch - we'll walk through everything step by step!
Understanding the Structure of PyTorch
PyTorch is more than just a deep-learning framework - it's a complete ecosystem designed to make your machine learning journey easier and more efficient.
Let's take a closer look at its main components:
Core PyTorch
At the heart of PyTorch are two essential elements:
Tensors: Think of tensors as multi-dimensional arrays, similar to NumPy arrays, but with the added bonus of being optimized for GPUs. This means you can perform high-speed computations and train your models faster.
Autograd: This module takes care of automatically computing gradients for backpropagation, which is a crucial step in training neural networks. With autograd, you can focus on building your model while PyTorch handles the complex math behind the scenes.
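To make these two ideas concrete, here is a minimal sketch (the values and variable names are just for illustration):
import torch

# A tensor that tracks gradients
x = torch.tensor([2.0, 3.0], requires_grad=True)

# A simple computation: y = sum(x^2)
y = (x ** 2).sum()

# Autograd computes dy/dx = 2x for us
y.backward()
print(x.grad)  # tensor([4., 6.])

# On a machine with an NVIDIA GPU, the same tensor math can run there via x.to("cuda")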
PyTorch Ecosystem
The following are the key libraries in the PyTorch ecosystem:
TorchVision: If you're working on computer vision projects, TorchVision is your go-to library. It includes pre-trained models (like ResNet for image classification), popular datasets (like CIFAR-10), and handy utilities for image transformations and augmentations.
TorchText: For natural language processing (NLP) tasks, TorchText provides tools for text tokenization, vocabulary management, and even pre-trained word embeddings to get you started quickly.
TorchAudio: Dealing with audio or speech data? TorchAudio has you covered with pre-trained models and utilities for loading, augmenting, and analyzing audio data.
TorchServe: When it's time to deploy your trained models, TorchServe simplifies the process of serving PyTorch models in production environments. It supports model versioning and provides metrics for monitoring.
TorchRec: Building recommendation systems? TorchRec is designed specifically for constructing scalable, high-performance recommendation algorithms.
TorchMetrics: Originally developed as part of PyTorch Lightning, TorchMetrics offers a collection of evaluation metrics for various tasks like vision, NLP, and more, making it easier to track your model's performance during training.
PyTorch Lightning: If you want to focus more on the high-level design of your models and less on the boilerplate code, PyTorch Lightning is a great choice. It provides a structured workflow and helps you experiment faster.
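To give you a feel for how these libraries plug into PyTorch, here is a small TorchVision sketch that loads a pre-trained ResNet-18 and runs a dummy image through it (the weights API shown assumes torchvision 0.13 or newer):
import torch
from torchvision.models import resnet18, ResNet18_Weights

# Load ResNet-18 with ImageNet pre-trained weights
weights = ResNet18_Weights.DEFAULT
resnet = resnet18(weights=weights)
resnet.eval()

# The weights object bundles the matching preprocessing pipeline
# (you would apply it to real images; we skip it for the random tensor below)
preprocess = weights.transforms()

# A fake batch of one 3x224x224 image
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = resnet(dummy)
print(logits.shape)  # torch.Size([1, 1000]), one score per ImageNet class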
Setting Up PyTorch
Option 1: Installing PyTorch Locally
If you prefer to work on your local machine, you can install PyTorch via pip:
pip install torch torchvision torchaudio
To ensure you have the right installation command for your environment (CPU/GPU, CUDA version, etc.), check out the official PyTorch website. If you have a Mac with an Apple Silicon chip (like the M1, M2, M3, or newer), you can use the Metal Performance Shaders (MPS) backend to accelerate PyTorch code execution. To check whether it's available, run a simple snippet in Python:
import torch

print(torch.backends.mps.is_available())  # True means the MPS backend is ready to use
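A common follow-up pattern (a sketch, not required for this tutorial) is to pick the fastest available device at runtime and move your models and tensors to it:
import torch

# Prefer CUDA (NVIDIA GPUs), then MPS (Apple Silicon), then fall back to CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

print(f"Using device: {device}")
# Later: model.to(device) and images.to(device) move the work onto that device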
Option 2: Setting Up PyTorch on Google Colab
If you don't want to deal with local installations, Google Colab is a great alternative. Just go to Google Colab, create a new notebook, and start writing and running PyTorch code - no installation required!
Now, let's get to the fun part!
Building Your First Model
Now that you have PyTorch set up, let's build a simple neural network to classify handwritten digits using the famous MNIST dataset. We'll break it down into three easy steps:
Step 1: Prepare the Dataset
First, we need to load and transform the MNIST dataset using torchvision.datasets:
import torch
from torchvision import datasets, transforms

# Define a transform to convert images to tensors and normalize them
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

# Download and load the training data
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
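As an optional sanity check, you can pull a single batch from the loader and inspect its shape (the sizes printed below follow from the batch size of 64 used above):
# Grab one batch to verify shapes before training
images, labels = next(iter(trainloader))
print(images.shape)  # torch.Size([64, 1, 28, 28])
print(labels.shape)  # torch.Size([64])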
Step 2: Define the Model
Next, let's define a simple feedforward neural network:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Three fully connected layers: 784 -> 512 -> 256 -> 10
        self.fc1 = nn.Linear(784, 512)
        self.fc2 = nn.Linear(512, 256)
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = x.view(-1, 784)       # Flatten each 28x28 image into a 784-vector
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)           # Raw logits; CrossEntropyLoss applies softmax internally
        return x

model = Net()
Here, we define a neural network with two hidden layers and a ReLU activation function. The forward method defines how the input data flows through the network.
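Before training, a quick dummy forward pass (a sketch, not part of the original flow) can confirm the wiring is correct:
# Feed a fake batch of 4 flattened images through the untrained model
dummy = torch.randn(4, 784)
logits = model(dummy)
print(logits.shape)  # torch.Size([4, 10]), one score per digit class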
Step 3: Train the Model
Finally, we'll train our model using a simple training loop:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epochs = 5
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten the input images of [28, 28] to [784]
        images = images.view(images.shape[0], -1)

        # Training pass
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)

        # This is where the model learns by backpropagating
        loss.backward()

        # And optimizes its weights here
        optimizer.step()

        running_loss += loss.item()
    else:
        # The for/else runs once the full pass over the data completes
        print(f"Training loss: {running_loss/len(trainloader)}")
Here's a high-level overview of what's happening in the training loop:
Defining the Loss Function and Optimizer
The loss function, such as CrossEntropyLoss, measures how well the model performs by comparing its predictions to the true labels.
The optimizer, like Adam, updates the model's weights based on the gradients computed during backpropagation.
Training Loop: Iterating Over Epochs and Batches
The outer loop iterates over a fixed number of epochs, where each epoch is a complete pass through the entire training dataset.
The inner loop iterates over the training data in batches, allowing for more efficient training by leveraging the parallel processing capabilities of GPUs.
Preparing the Input Data
Before passing the input data through the model, ensure it has the correct shape.
For MNIST images, flatten the 2D images of shape [28, 28] into 1D vectors of length 784 to match the model's input size.
Resetting Gradients
Before each training step, reset the gradients of all model parameters to zero.
This ensures that each step starts with a clean slate, as PyTorch accumulates gradients over successive backward passes.
Forward Pass
Pass the input data through the model to obtain the output predictions.
The model's forward method defines how the input data flows through the network's layers.
Calculating the Loss
Compute the loss between the model's output predictions and the true labels using the defined loss function.
The loss value indicates how well the model is performing on the current batch.
Backpropagation
Backpropagate the loss through the network to calculate the gradients of the model's parameters with respect to the loss.
This step determines how each parameter should be adjusted to reduce the loss.
Optimization Step
Use the optimizer to update the model's weights based on the computed gradients.
The optimizer adjusts the weights in the direction that minimizes the loss, helping the model learn from the training data.
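If you want to check how well the trained model generalizes, a sketch like the one below evaluates accuracy on the MNIST test split. It reuses the transform and model defined above, and goes slightly beyond what we covered:
# Load the test split with the same transform as the training data
testset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

correct = 0
total = 0
with torch.no_grad():  # no gradients needed for evaluation
    for images, labels in testloader:
        images = images.view(images.shape[0], -1)  # flatten, as in training
        preds = model(images).argmax(dim=1)        # pick the highest-scoring class
        correct += (preds == labels).sum().item()
        total += labels.size(0)

print(f"Test accuracy: {correct / total:.2%}")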
And there you have it! You've just built and trained your first PyTorch model. In Part 3, we'll do a deep dive into the mechanics of the code we wrote throughout this notebook.
Here is the Google Colab link:
🎓Further Learning*
Let us present: “From Beginner to Advanced LLM Developer”. This comprehensive course takes you from foundational skills to mastering scalable LLM products through hands-on projects, fine-tuning, RAG, and agent development. Whether you're building a standout portfolio, launching a startup idea, or enhancing enterprise solutions, this program equips you to lead the LLM revolution and thrive in a fast-growing, in-demand field.
Who Is This Course For?
This certification is for software developers, machine learning engineers, data scientists, and computer science or AI students who want to rapidly transition into an LLM Developer role and start building.
*Sponsored: by purchasing any of their courses, you will also be supporting MLPills.
⚡Power-Up Corner: Understanding Tensors in Neural Networks
When we built our MNIST classifier, we worked with tensors without diving too deep into what they actually are. Let's lift the hood and understand why tensors are fundamental to deep learning, and how PyTorch makes working with them intuitive.
What Are Tensors, Really?
Think of tensors as a hierarchy of nested arrays. While this might sound complex, you're already familiar with simpler forms of tensors: a single number is a 0-dimensional tensor (a scalar), a list of numbers is a 1-dimensional tensor (a vector), and a grid of numbers is a 2-dimensional tensor (a matrix).
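In PyTorch, that hierarchy looks like this (a minimal sketch, with shapes chosen purely for illustration):
import torch

scalar = torch.tensor(3.14)             # 0-D tensor: a single number
vector = torch.tensor([1.0, 2.0, 3.0])  # 1-D tensor: an array of numbers
matrix = torch.ones(2, 3)               # 2-D tensor: rows and columns
batch = torch.zeros(64, 1, 28, 28)      # 4-D tensor: a batch of MNIST images

print(scalar.ndim, vector.ndim, matrix.ndim, batch.ndim)  # 0 1 2 4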