PyTorch and Python
What is PyTorch?
PyTorch is a Python library for machine learning: working with data, creating models, optimizing model parameters, and saving the trained models.
PyTorch tutorials - Learn the Basics, Quickstart, Tensors, Datasets and DataLoaders, Transforms, Build Model, Autograd, Optimization, Save and Load Model - Download Notebook
PyTorch tutorials in Microsoft Learn
PyTorch tutorials in Google Colab
Why PyTorch?
Learn the Basics
Most machine learning workflows involve working with data, creating models, optimizing model parameters, and saving the trained models.
The PyTorch tutorial introduces you to a complete ML workflow to learn these concepts.
Learn to use a dataset to train a neural network that predicts whether an input image belongs to one of the known classes.
The tutorial assumes a basic familiarity with Python and Deep Learning concepts.
Tutorial - How to run the code.
You can run the tutorial in the cloud or locally on your own machine:
In the cloud: This is the easiest way to get started! Each section has a “Run in Microsoft Learn” and “Run in Google Colab” link at the top, which opens an integrated notebook in Microsoft Learn or Google Colab, respectively, with the code in a fully-hosted environment.
Locally: This option requires you to set up PyTorch and TorchVision first on your local machine with the installation instructions. Download the notebook or copy the code into your favorite IDE.
If you're familiar with other deep learning frameworks, check out the Quickstart first to quickly familiarize yourself with PyTorch's API.
If you’re new to deep learning frameworks, head right into the first section of our step-by-step guide.
Tensors, Datasets and DataLoaders and Transforms.
Build Model - Build the Neural Network.
Automatic Differentiation with torch.autograd
Optimization Loop - Optimizing Model Parameters
How to Save, Load and Use a Model.
Examples, Recipes, source code and more tutorials.
Introduction to PyTorch on YouTube Videos.
Introduction to PyTorch Tensors
The Fundamentals of Autograd
Building Models with PyTorch
PyTorch TensorBoard Support
Training with PyTorch
Model Understanding with Captum
Learning PyTorch topics:
Text, Audio, Image and Video, Mobile
Reinforcement Learning, Recommendation Systems
Model Optimization, Deploying PyTorch Models in Production
Code Transforms with FX
Frontend APIs
Extending PyTorch
Parallel and Distributed Training
Introduction to TorchRec
Exploring TorchRec sharding
Multimodality
TorchMultimodal Tutorial: Finetuning FLAVA
PyTorch Tutorials
Learn PyTorch in a Day! Literally! A great video tutorial with 24 hours of PyTorch examples, from basic to advanced lessons.
What is PyTorch?
Implementing High Performance Transformers with Scaled Dot Product Attention
torch.compile Tutorial
Per Sample Gradients
Jacobians, Hessians, hvp, vhp, and more: composing function transforms
Model Ensembling
Neural Tangent Kernels
Reinforcement Learning (PPO) with TorchRL Tutorial
Changing Default Device
Learn the Basics
Familiarize yourself with PyTorch concepts and modules. Learn how to load data, build deep neural networks, train and save your models in this quickstart guide.
PyTorch Recipes
Bite-size, ready-to-deploy PyTorch code examples: All, Attention, Audio, Ax, Best Practice, C++, CUDA, Extending PyTorch, FX, Frontend APIs, Getting Started, Image/Video, Interpretability, Memory Format, Mobile, Model Optimization, Parallel and Distributed Training, Production, Profiling, Quantization, Recommender, Reinforcement Learning, TensorBoard, Text, TorchMultimodal, TorchRec, TorchScript, TorchX, Transformer
Learn the Basics - A step-by-step guide to building a complete ML workflow with PyTorch.
Getting Started - Introduction to PyTorch on YouTube
An introduction to building a complete ML workflow with PyTorch. Follows the PyTorch Beginner Series on YouTube.
Learning PyTorch with Examples - This tutorial introduces the fundamental concepts of PyTorch through self-contained examples.
What is torch.nn really?
Use torch.nn to create and train a neural network.
Visualizing Models, Data, and Training with TensorBoard
Learn to use TensorBoard to visualize data and model training.
Interpretability, Getting Started, TensorBoard
TorchVision Object Detection Finetuning Tutorial
Finetune a pre-trained Mask R-CNN model.
Image/Video
Additional Resources
Examples of PyTorch - A set of examples around PyTorch in Vision, Text, Reinforcement Learning that you can incorporate in your existing work.
PyTorch Cheat Sheet - Quick overview to essential PyTorch elements.
Tutorials on GitHub - Access PyTorch Tutorials from GitHub.
Run Tutorials on Google Colab - Learn how to copy tutorial data into Google Drive so that you can run tutorials on Google Colab.
PyTorch Cheat Sheet

Imports



General
import torch # root package
from torch.utils.data import Dataset, DataLoader # dataset representation and loading

Neural Network API



import torch.autograd as autograd # computation graph
from torch import Tensor # tensor node in the computation graph
import torch.nn as nn # neural networks
import torch.nn.functional as F # layers, activations and more
import torch.optim as optim # optimizers e.g. gradient descent, ADAM, etc.
from torch.jit import script, trace # hybrid frontend decorator and tracing jit
See autograd, nn, functional and optim

Torchscript and JIT



torch.jit.trace() # takes your module or function and an example
# data input, and traces the computational steps
# that the data encounters as it progresses through the model
@script # decorator used to indicate data-dependent
# control flow within the code being traced
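For example, a minimal sketch of how tracing and scripting differ (TinyNet is a hypothetical module made up only for this illustration):

import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical module, for illustration only
    def forward(self, x):
        return torch.relu(x) * 2

net = TinyNet()
example = torch.rand(1, 4)
traced = torch.jit.trace(net, example)  # records the operations executed on the example input
scripted = torch.jit.script(net)        # compiles the Python source, preserving control flow
print(traced(example))
print(scripted(example))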

ONNX



torch.onnx.export(model, dummy_data, "model.proto") # exports an ONNX formatted
# model using a trained model, dummy
# data and the desired file name
model = onnx.load("alexnet.proto") # load an ONNX model
onnx.checker.check_model(model) # check that the model
# IR is well formed
onnx.helper.printable_graph(model.graph) # print a human readable
# representation of the graph

Vision



from torchvision import datasets, models, transforms # vision datasets,
# architectures &
# transforms
import torchvision.transforms as transforms # composable transforms

Distributed Training


Distributed Training and multiprocessing
import torch.distributed as dist # distributed communication
from torch.multiprocessing import Process # memory sharing processes
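As a rough sketch of how these pieces fit together (the address, port, gloo backend, and two-process setup below are arbitrary choices for a single-machine demo, not part of the cheat sheet):

import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    # Each process joins the same group and all-reduces a tensor.
    dist.init_process_group(
        backend="gloo",
        init_method="tcp://127.0.0.1:29500",
        rank=rank,
        world_size=world_size,
    )
    t = torch.ones(1) * rank
    dist.all_reduce(t, op=dist.ReduceOp.SUM)  # t now holds the sum across all ranks
    print(f"rank {rank}: {t.item()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    mp.spawn(worker, args=(2,), nprocs=2)     # launch two worker processes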

Tensors


Tensors are the central data abstraction in PyTorch. This interactive notebook provides an in-depth introduction to the torch.Tensor class.
First things first, let’s import the PyTorch module. We’ll also add Python’s math module to facilitate some of the examples.
PyTorch Tensors, Creating Tensors, Math and Logic with PyTorch Tensors, Copying Tensors, Moving to GPU, Manipulating Tensor Shapes, NumPy Bridge


import torch
import math

Creation
x = torch.randn(*size) # tensor with independent N(0,1) entries
x = torch.[ones|zeros](*size) # tensor with all 1's [or 0's]
x = torch.tensor(L) # create tensor from [nested] list or ndarray L
y = x.clone() # clone of x
with torch.no_grad(): # code wrap that stops autograd from tracking tensor history
requires_grad=True # arg, when set to True, tracks computation
# history for future derivative calculations

Tensors and their number of dimensions, and terminology:
a 1-dimensional tensor is called a vector.
a 2-dimensional tensor is often referred to as a matrix.
anything with more than two dimensions is generally just called a tensor.
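A quick way to check this terminology is the ndim attribute, for example:

import torch

v = torch.tensor([1.0, 2.0, 3.0])  # vector: 1 dimension
m = torch.zeros(2, 3)              # matrix: 2 dimensions
t = torch.zeros(2, 3, 4)           # higher-rank tensor: 3 dimensions
print(v.ndim, m.ndim, t.ndim)      # 1 2 3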


More often than not, you’ll want to initialize your tensor with some value. Common cases are all zeros, all ones, or random values, and the torch module provides factory methods for all of these.
zeros = torch.zeros(2, 3)
print(zeros)

ones = torch.ones(2, 3)
print(ones)

torch.manual_seed(1729)
random = torch.rand(2, 3)
print(random)

Tensor Shapes
x = torch.empty(2, 2, 3)
print(x.shape)
print(x)

Tensor data types include:
torch.bool, torch.int8, torch.uint8, torch.int16, torch.int32,
torch.int64, torch.half, torch.float, torch.double, torch.bfloat16

Dimensionality



x.size() # return tuple-like object of dimensions
x = torch.cat(tensor_seq, dim=0) # concatenates tensors along dim
y = x.view(a,b,...) # reshapes x into size (a,b,...)
y = x.view(-1,a) # reshapes x into size (b,a) for some b
y = x.transpose(a,b) # swaps dimensions a and b
y = x.permute(*dims) # permutes dimensions
y = x.unsqueeze(dim) # tensor with added axis
y = x.unsqueeze(dim=2) # (a,b,c) tensor -> (a,b,1,c) tensor
y = x.squeeze() # removes all dimensions of size 1 (a,1,b,1) -> (a,b)
y = x.squeeze(dim=1) # removes specified dimension of size 1 (a,1,b,1) -> (a,b,1)
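A small sketch of how these shape operations behave (the sizes are arbitrary example values):

import torch

x = torch.rand(2, 3, 4)
print(x.view(6, 4).shape)                  # torch.Size([6, 4])
print(x.view(-1, 4).shape)                 # torch.Size([6, 4]); -1 is inferred
print(x.transpose(0, 2).shape)             # torch.Size([4, 3, 2])
print(x.permute(2, 0, 1).shape)            # torch.Size([4, 2, 3])
print(x.unsqueeze(dim=1).shape)            # torch.Size([2, 1, 3, 4])
print(x.unsqueeze(dim=1).squeeze().shape)  # torch.Size([2, 3, 4])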

Math


Algebra and math operations
ret = A.mm(B) # matrix multiplication
ret = A.mv(x) # matrix-vector multiplication
x = x.t() # matrix transpose
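For instance, with arbitrarily sized example tensors:

import torch

A = torch.rand(2, 3)
B = torch.rand(3, 4)
x = torch.rand(3)
print(A.mm(B).shape)  # matrix-matrix product: torch.Size([2, 4])
print(A.mv(x).shape)  # matrix-vector product: torch.Size([2])
print(A.t().shape)    # transpose: torch.Size([3, 2])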

GPU


GPU Usage
torch.cuda.is_available() # check for cuda
x = x.cuda() # move x's data from
# CPU to GPU and return new object
x = x.cpu() # move x's data from GPU to CPU
# and return new object
if not args.disable_cuda and torch.cuda.is_available(): # device agnostic code
    args.device = torch.device('cuda') # and modularity
else:
    args.device = torch.device('cpu')
net.to(device) # recursively convert their
# parameters and buffers to
# device specific tensors
x = x.to(device) # copy your tensors to a device
# (gpu, cpu)

Deep Learning


NN - Deep Learning
nn.Linear(m,n) # fully connected layer from
# m to n units
nn.ConvXd(m,n,s) # X dimensional conv layer from
# m to n channels where X⍷{1,2,3}
# and the kernel size is s
nn.MaxPoolXd(s) # X dimension pooling layer
# (notation as above)
nn.BatchNormXd # batch norm layer
nn.RNN/LSTM/GRU # recurrent layers
nn.Dropout(p=0.5, inplace=False) # dropout layer for any dimensional input
nn.Dropout2d(p=0.5, inplace=False) # 2-dimensional channel-wise dropout
nn.Embedding(num_embeddings, embedding_dim) # (tensor-wise) mapping from
# indices to embedding vectors
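A minimal sketch combining several of these layers into a small image classifier (the layer sizes and the 28x28 grayscale input are assumptions chosen only to make the shapes work out):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 16, 3),          # 1 -> 16 channels, 3x3 kernel: 28x28 -> 26x26
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.MaxPool2d(2),              # 26x26 -> 13x13
    nn.Flatten(),
    nn.Dropout(p=0.5),
    nn.Linear(16 * 13 * 13, 10),  # fully connected layer to 10 classes
)
x = torch.rand(8, 1, 28, 28)      # batch of 8 grayscale 28x28 images
print(model(x).shape)             # torch.Size([8, 10])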

Loss Functions



nn.X # where X is L1Loss, MSELoss, CrossEntropyLoss
# CTCLoss, NLLLoss, PoissonNLLLoss,
# KLDivLoss, BCELoss, BCEWithLogitsLoss,
# MarginRankingLoss, HingeEmbeddingLoss,
# MultiLabelMarginLoss, SmoothL1Loss,
# SoftMarginLoss, MultiLabelSoftMarginLoss,
# CosineEmbeddingLoss, MultiMarginLoss,
# or TripletMarginLoss
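For example, computing a cross-entropy loss on a made-up batch of logits and class labels:

import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()
logits = torch.randn(4, 10)           # raw scores for a batch of 4 over 10 classes
targets = torch.tensor([1, 0, 9, 3])  # ground-truth class indices
loss = loss_fn(logits, targets)
print(loss.item())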

Activation Functions



nn.X # where X is ReLU, ReLU6, ELU, SELU, PReLU, LeakyReLU,
# RReLu, CELU, GELU, Threshold, Hardshrink, HardTanh,
# Sigmoid, LogSigmoid, Softplus, SoftShrink,
# Softsign, Tanh, TanhShrink, Softmin, Softmax,
# Softmax2d, LogSoftmax or AdaptiveSoftmaxWithLoss

Optimizers



opt = optim.X(model.parameters(), ...) # create optimizer
opt.step() # update weights
optim.X # where X is SGD, Adadelta, Adagrad, Adam,
# AdamW, SparseAdam, Adamax, ASGD,
# LBFGS, RMSprop or Rprop
Learning rate scheduling
scheduler = optim.X(optimizer,...) # create lr scheduler
scheduler.step() # update lr after optimizer updates weights
optim.lr_scheduler.X # where X is LambdaLR, MultiplicativeLR,
# StepLR, MultiStepLR, ExponentialLR,
# CosineAnnealingLR, ReduceLROnPlateau, CyclicLR,
# OneCycleLR or CosineAnnealingWarmRestarts
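Putting optimizer and scheduler together, a minimal sketch of a training loop (the stand-in linear model, random data, and the SGD/StepLR choices are assumptions for illustration):

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)                 # stand-in model
loss_fn = nn.CrossEntropyLoss()
opt = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(opt, step_size=5, gamma=0.1)

for epoch in range(10):
    x = torch.randn(32, 10)              # dummy batch of features
    y = torch.randint(0, 2, (32,))       # dummy class labels
    opt.zero_grad()                      # clear gradients from the previous step
    loss = loss_fn(model(x), y)
    loss.backward()                      # compute gradients
    opt.step()                           # update weights
    scheduler.step()                     # update the learning rate after the weight update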

Data


Data Utilities
Datasets
Dataset # abstract class representing dataset
TensorDataset # labelled dataset in the form of tensors
ConcatDataset # concatenation of Datasets
See datasets
Dataloaders and DataSamplers
DataLoader(dataset, batch_size=1, ...) # loads data batches agnostic
# of structure of individual data points
sampler.Sampler(dataset,...) # abstract class dealing with
# ways to sample from dataset
sampler.XSampler where ... # Sequential, Random, SubsetRandom,
# WeightedRandom, Batch, Distributed
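A short sketch tying these utilities together, using made-up random tensors as the data:

import torch
from torch.utils.data import TensorDataset, DataLoader, SubsetRandomSampler

features = torch.randn(100, 4)
labels = torch.randint(0, 2, (100,))
dataset = TensorDataset(features, labels)  # labelled dataset built from tensors

sampler = SubsetRandomSampler(range(80))   # draw batches only from the first 80 samples
loader = DataLoader(dataset, batch_size=16, sampler=sampler)

for batch_features, batch_labels in loader:
    print(batch_features.shape, batch_labels.shape)  # torch.Size([16, 4]) torch.Size([16])
    break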


Deep Learning with PyTorch: A 60 Minute Blitz (pytorch.org)
PyTorch Forums (discuss.pytorch.org)
PyTorch for Numpy users (github.com/wkentaro/pytorch-for-numpy-users)


Tensors
Tensors are a specialized data structure that are very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters.
Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data (see Bridge with NumPy). Tensors are also optimized for automatic differentiation (we’ll see more about that later in the Autograd section). If you’re familiar with ndarrays, you’ll be right at home with the Tensor API. If not, follow along!
import torch
import numpy as np
Initializing a Tensor
Tensors can be initialized in various ways. Take a look at the following examples:
Directly from data
Tensors can be created directly from data. The data type is automatically inferred.
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
From a NumPy array
Tensors can be created from NumPy arrays (and vice versa - see Bridge with NumPy).
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
From another tensor:
The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")
Ones Tensor:
tensor([[1, 1],
[1, 1]])
Random Tensor:
tensor([[0.9262, 0.3414],
[0.3801, 0.9828]])
With random or constant values:
shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.
shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
Random Tensor:
tensor([[0.3168, 0.4644, 0.0181],
[0.2735, 0.2558, 0.6432]])
Ones Tensor:
tensor([[1., 1., 1.],
[1., 1., 1.]])
Zeros Tensor:
tensor([[0., 0., 0.],
[0., 0., 0.]])
Attributes of a Tensor
Tensor attributes describe their shape, datatype, and the device on which they are stored.
tensor = torch.rand(3,4)
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
Operations on Tensors
Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling and more are comprehensively described here.
Each of these operations can be run on the GPU (at typically higher speeds than on a CPU). If you’re using Colab, allocate a GPU by going to Runtime > Change runtime type > GPU.
By default, tensors are created on the CPU. We need to explicitly move tensors to the GPU using .to method (after checking for GPU availability). Keep in mind that copying large tensors across devices can be expensive in terms of time and memory!
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to("cuda")
Try out some of the operations from the list. If you’re familiar with the NumPy API, you’ll find the Tensor API a breeze to use.
Standard numpy-like indexing and slicing:
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)
First row: tensor([1., 1., 1., 1.])
First column: tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.]])
Joining tensors You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor joining operator that is subtly different from torch.cat.
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
Arithmetic operations
# This computes the matrix multiplication between two tensors. y1, y2, y3 will have the same value
# ``tensor.T`` returns the transpose of a tensor
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
y3 = torch.rand_like(y1)
torch.matmul(tensor, tensor.T, out=y3)
# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)
z3 = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z3)
tensor([[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.]])
Single-element tensors If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using item():
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))
12.0 <class 'float'>
In-place operations Operations that store the result into the operand are called in-place. They are denoted by a _ suffix. For example: x.copy_(y), x.t_(), will change x.
print(f"{tensor} \n")
tensor.add_(5)
print(tensor)
tensor([[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.],
[1., 0., 1., 1.]])
tensor([[6., 5., 6., 6.],
[6., 5., 6., 6.],
[6., 5., 6., 6.],
[6., 5., 6., 6.]])
Note : In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.
Bridge with NumPy
Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.
Tensor to NumPy array
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")
t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]
A change in the tensor reflects in the NumPy array.
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")
t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]
NumPy array to Tensor
n = np.ones(5)
t = torch.from_numpy(n)
Changes in the NumPy array reflects in the tensor.
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")
t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]

Datasets & DataLoaders



Code for processing data samples can get messy and hard to maintain; we ideally want our dataset code to be decoupled from our model training code for better readability and modularity. PyTorch provides two data primitives: torch.utils.data.DataLoader and torch.utils.data.Dataset that allow you to use pre-loaded datasets as well as your own data. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.
PyTorch domain libraries provide a number of pre-loaded datasets (such as FashionMNIST) that subclass torch.utils.data.Dataset and implement functions specific to the particular data. They can be used to prototype and benchmark your model. You can find them here: Image Datasets, Text Datasets, and Audio Datasets
Loading a Dataset
Here is an example of how to load the Fashion-MNIST dataset from TorchVision. Fashion-MNIST is a dataset of Zalando’s article images consisting of 60,000 training examples and 10,000 test examples. Each example comprises a 28×28 grayscale image and an associated label from one of 10 classes.
We load the FashionMNIST Dataset with the following parameters:
root is the path where the train/test data is stored,
train specifies training or test dataset,
download=True downloads the data from the internet if it’s not available at root.
transform and target_transform specify the feature and label transformations
import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Iterating and Visualizing the Dataset

We can index Datasets manually like a list: training_data[index]. We use matplotlib to visualize some samples in our training data.
labels_map = {
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(training_data), size=(1,)).item()
    img, label = training_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(labels_map[label])
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()
(The figure shows a 3x3 grid of sample images labeled: Sandal, Trouser, Sandal, T-Shirt, Coat, Sandal, Dress, Sneaker, Coat.)
Creating a Custom Dataset for your files
A custom Dataset class must implement three functions: __init__, __len__, and __getitem__. Take a look at this implementation; the FashionMNIST images are stored in a directory img_dir, and their labels are stored separately in a CSV file annotations_file.
In the next sections, we’ll break down what’s happening in each of these functions.
import os
import pandas as pd
from torchvision.io import read_image
class CustomImageDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_labels)

    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label
__init__
The __init__ function is run once when instantiating the Dataset object. We initialize the directory containing the images, the annotations file, and both transforms (covered in more detail in the next section).
The labels.csv file looks like:
tshirt1.jpg, 0
tshirt2.jpg, 0
......
ankleboot999.jpg, 9
def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
    self.img_labels = pd.read_csv(annotations_file)
    self.img_dir = img_dir
    self.transform = transform
    self.target_transform = target_transform
__len__
The __len__ function returns the number of samples in our dataset.
Example:
def __len__(self):
    return len(self.img_labels)
__getitem__
The __getitem__ function loads and returns a sample from the dataset at the given index idx. Based on the index, it identifies the image’s location on disk, converts that to a tensor using read_image, retrieves the corresponding label from the csv data in self.img_labels, calls the transform functions on them (if applicable), and returns the tensor image and corresponding label in a tuple.
def __getitem__(self, idx):
    img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
    image = read_image(img_path)
    label = self.img_labels.iloc[idx, 1]
    if self.transform:
        image = self.transform(image)
    if self.target_transform:
        label = self.target_transform(label)
    return image, label
Preparing your data for training with DataLoaders
The Dataset retrieves our dataset’s features and labels one sample at a time. While training a model, we typically want to pass samples in “minibatches”, reshuffle the data at every epoch to reduce model overfitting, and use Python’s multiprocessing to speed up data retrieval.
DataLoader is an iterable that abstracts this complexity for us in an easy API.
from torch.utils.data import DataLoader
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)
Iterate through the DataLoader
We have loaded that dataset into the DataLoader and can iterate through the dataset as needed. Each iteration below returns a batch of train_features and train_labels (containing batch_size=64 features and labels respectively). Because we specified shuffle=True, after we iterate over all batches the data is shuffled (for finer-grained control over the data loading order, take a look at Samplers).
# Display image and label.
train_features, train_labels = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_labels.size()}")
img = train_features[0].squeeze()
label = train_labels[0]
plt.imshow(img, cmap="gray")
plt.show()
print(f"Label: {label}")
Feature batch shape: torch.Size([64, 1, 28, 28])
Labels batch shape: torch.Size([64])
Label: 8
Further Reading
torch.utils.data API


Transforms


Data does not always come in its final processed form that is required for training machine learning algorithms. We use transforms to perform some manipulation of the data and make it suitable for training.
All TorchVision datasets have two parameters - transform to modify the features and target_transform to modify the labels - that accept callables containing the transformation logic. The torchvision.transforms module offers several commonly-used transforms out of the box.
The FashionMNIST features are in PIL Image format, and the labels are integers. For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors. To make these transformations, we use ToTensor and Lambda.
import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda
ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)
ToTensor()
ToTensor converts a PIL image or NumPy ndarray into a FloatTensor and scales the image's pixel intensity values to the range [0., 1.].
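For example, applying ToTensor to a made-up HxWxC uint8 array shows the channel-first layout and the [0., 1.] scaling:

import numpy as np
from torchvision.transforms import ToTensor

img = (np.random.rand(28, 28, 3) * 255).astype(np.uint8)  # fake 28x28 RGB image
t = ToTensor()(img)
print(t.shape, t.dtype)                # torch.Size([3, 28, 28]) torch.float32
print(t.min().item(), t.max().item())  # values lie in [0., 1.]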
Lambda Transforms
Lambda transforms apply any user-defined lambda function. Here, we define a function to turn the integer into a one-hot encoded tensor. It first creates a zero tensor of size 10 (the number of labels in our dataset) and calls scatter_ which assigns a value=1 on the index as given by the label y.
target_transform = Lambda(lambda y: torch.zeros( 10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))

Captum (“comprehension” in Latin) is an open source, extensible library for model interpretability built on PyTorch.
With the increase in model complexity and the resulting lack of transparency, model interpretability methods have become increasingly important. Model understanding is both an active area of research as well as an area of focus for practical applications across industries using machine learning. Captum provides state-of-the-art algorithms, including Integrated Gradients, to provide researchers and developers with an easy way to understand which features are contributing to a model’s output.
Full documentation, an API reference, and a suite of tutorials on specific topics are available at the captum.ai website.
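As a rough sketch of how Captum is used (the tiny stand-in classifier and random input below are assumptions for illustration; see the Captum tutorials for real models):

import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))  # stand-in classifier
model.eval()

inputs = torch.randn(1, 4, requires_grad=True)
ig = IntegratedGradients(model)
# Attribute the model's score for class 0 back to the input features.
attributions, delta = ig.attribute(inputs, target=0, return_convergence_delta=True)
print(attributions)
print(delta)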
PyTorch technology news, 2025
Arm Accelerates AI From Cloud to Edge With New PyTorch and ExecuTorch Integrations to Deliver Immediate Performance Improvements for Developers - Arm Newsroom
Using Hugging Face Transformers with PyTorch and TensorFlow - KDnuggets
This Deep Learning Paper from Eindhoven University of Technology Releases Nerva: A Groundbreaking Sparse Neural Network Library Enhancing Efficiency and Performance - MarkTechPost
New Arm partnerships extend AI performance from edge to cloud - InfoWorld