PyTorch Basics#
PyTorch is a library for creating neural networks in Python.
This notebook draws heavily from the following sources:
PyTorch’s official Tensors notebook
Phillip Lippe’s Intro to PyTorch notebook
Overview of the library#
torch
The top-level PyTorch package that provides an entry point to all PyTorch modules, including the Tensor object.
torch.nn
A subpackage that contains modules and classes for building neural networks.
torch.utils.data
A subpackage that provides tools for working with data.
torch.distributed
A subpackage that provides support for training on multiple GPUs and multiple nodes.
torch.autograd
A package that provides automatic differentiation for all operations on Tensors.
torchvision
A package that provides access to popular datasets, model architectures, and image transformations for computer vision.
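As a rough sketch of how these fit together (note that torchvision is installed separately from torch):
import torch                             # tensors and core operations
import torch.nn as nn                    # layers, losses, and other NN building blocks
from torch.utils.data import DataLoader  # batching and data-loading utilities
import torchvision                       # vision datasets, models, and transforms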
Tensors#
Tensors are specialized data structures that are very similar to arrays and matrices. In PyTorch, we use tensors to store:
model inputs
model outputs
model parameters
Tensors can have many dimensions (at least 10,000 in version 2.0 – I checked).
Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. Tensors are also optimized for automatic differentiation. If you’re familiar with NumPy arrays, you’ll be right at home with the Tensor API.
To start working with tensors, we import the top-level PyTorch package:
import torch
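With torch imported, we can check the two claims above: tensors can be moved to a GPU, and they can track gradients for automatic differentiation. A minimal sketch (the move to "cuda" only runs if a GPU is present):
t = torch.ones(2, 3)
if torch.cuda.is_available():
    t = t.to("cuda")        # move the tensor to the GPU
t.requires_grad_()          # ask autograd to track operations on t
print(t.device, t.requires_grad)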
Initializing a Tensor#
Tensors can be initialized in various ways. Take a look at the following examples:
A range of values
torch.arange(10)
Directly from data
Tensors can be created directly from data. The data type is automatically inferred.
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
x_data
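A quick check of the inferred type; nested Python ints become a 64-bit integer tensor:
x_data.dtype   # torch.int64, inferred from the Python ints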
From/To a NumPy array
Tensors can be created from NumPy arrays.
import numpy as np
np_arr = np.array([[1, 2], [3, 4]])
tensor = torch.from_numpy(np_arr)
np_arr_2 = tensor.numpy()
print("Numpy array:\n", np_arr)
print("PyTorch tensor:\n", tensor)
print("Numpy array 2:\n", np_arr_2)
From another tensor:
The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")
With random or constant values:
shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.
shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
Attributes of a Tensor#
Tensor attributes describe their shape, datatype, and the device on which they are stored.
tensor = torch.rand(3,4)
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
Operations on Tensors#
Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling and more are comprehensively described in the PyTorch documentation.
Indexing
We often need to select part of a tensor. Indexing works just like in NumPy, so let’s try it:
x = torch.arange(12).view(3, 4)
x
x[:, 1] # Second column
x[0] # First row
x[:2, -1] # First two rows, last column
x[1:3] # Middle two rows
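As in NumPy, you can also index with a boolean mask; for example, selecting every entry greater than 5:
x[x > 5]   # 1-D tensor of the entries 6 through 11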
Adding/removing indices
We often need to add empty indices (axes of size 1).
x[None].shape # add new index at front
x[:, None].shape # in the 2nd position
x[..., None].shape # in the last position
unsqueeze accomplishes the same thing:
print(x.unsqueeze(0).shape)
print(x.unsqueeze(1).shape)
print(x.unsqueeze(-1).shape)
We can remove empty indices as well.
x = torch.randn(1,3,4)
x.shape
x[0].shape
x.squeeze().shape
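squeeze also accepts a dimension argument, in which case only that axis is removed, and only if it has size 1:
print(x.squeeze(0).shape)  # torch.Size([3, 4]): the size-1 axis is removed
print(x.squeeze(1).shape)  # torch.Size([1, 3, 4]): dim 1 has size 3, so nothing changes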
Changing the shape
There are many ways to change a tensor’s shape:
x = torch.randn(2,3)
x, x.shape
x.T, x.T.shape # transpose
x.reshape(3,2)
Question
How would we create a tensor with shape `torch.Size([6,1])`?
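One possible answer (a sketch; either line works):
print(x.reshape(6, 1).shape)             # torch.Size([6, 1])
print(x.flatten().unsqueeze(-1).shape)   # same result via unsqueeze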
Shapes must be compatible: the new shape must contain the same total number of elements.
try:
x.reshape(2,6)
except RuntimeError as e:
print(e)
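A convenient shortcut: one dimension may be given as -1, and PyTorch infers it from the total number of elements:
x.reshape(-1, 2).shape   # torch.Size([3, 2]); the -1 is inferred as 3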
permute allows us to rearrange indices more flexibly:
# create a tensor to play with
y = torch.arange(24).reshape(2,3,4)
y, y.shape
y.permute(1, 0, 2).shape
y.permute(0,2,1).shape
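One caveat: permute returns a view with rearranged strides rather than a copy, so the result is generally not contiguous in memory, and some operations (e.g. view) require calling .contiguous() first:
print(y.permute(1, 0, 2).is_contiguous())              # False
print(y.permute(1, 0, 2).contiguous().view(-1).shape)  # works after .contiguous()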
Joining tensors
You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor-joining option that is subtly different from torch.cat.
tensor = torch.arange(12).reshape(3,4)
tensor
torch.cat([tensor, tensor, tensor], dim=0)
torch.cat([tensor, tensor, tensor], dim=1)
Sometimes, you want to create a new dimension when combining:
torch.stack([tensor, tensor, tensor])
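The difference from cat shows up in the shapes: cat grows an existing dimension, while stack adds a new one, whose position you can choose:
print(torch.cat([tensor, tensor, tensor]).shape)            # torch.Size([9, 4]): dim 0 grows
print(torch.stack([tensor, tensor, tensor]).shape)          # torch.Size([3, 3, 4]): new dim 0
print(torch.stack([tensor, tensor, tensor], dim=-1).shape)  # torch.Size([3, 4, 3]): new last dim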
Arithmetic operations
# This computes the matrix multiplication between two tensors; y1 and y2 will have the same value
# ``tensor.T`` returns the transpose of a tensor
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)
y1, y2, y1-y2
# This computes the element-wise product; z1 and z2 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)
z1, z2, z1-z2
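PyTorch also provides in-place variants of these operations, named with a trailing underscore; they modify the tensor directly instead of allocating a new one. A small sketch, working on a copy so that tensor itself is unchanged:
t = tensor.clone()   # copy, so the original is untouched
t.add_(100)          # in-place addition
t.mul_(2)            # in-place multiplication
t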
Aggregations
x = torch.arange(12, dtype=torch.float32).reshape(3,4)
x
# sum, mean, std, etc.
x.sum(), x.mean(), x.std()
# sum along the first axis
print(x.sum(dim=0))
# or the second
x.sum(dim=1)
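By default the reduced dimension is dropped; pass keepdim=True to retain it as a size-1 axis, which is handy for broadcasting:
print(x.sum(dim=1).shape)                 # torch.Size([3])
print(x.sum(dim=1, keepdim=True).shape)   # torch.Size([3, 1])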