PyTorch Basics#

PyTorch is a library for creating neural networks in Python.

This notebook draws heavily from the following sources:

Overview of the library#

  • torch The top-level PyTorch package that provides an entry point to all PyTorch modules including the Tensor object.

  • torch.nn A subpackage that contains modules and classes for building neural networks.

  • torch.utils.data A subpackage that provides tools for working with data.

  • torch.distributed A subpackage that provides support for training on multiple GPUs and multiple nodes.

  • torch.autograd A package that provides automatic differentiation for all operations on Tensors.

  • torchvision A package that provides access to popular datasets, model architectures, and image transformations for computer vision.


A tensor is a specialized data structure that is very similar to an array or a matrix. In PyTorch, we use tensors to store:

  1. model inputs

  2. model outputs

  3. model parameters.

Tensors can have many dimensions (at least 10,000 in version 2.0 – I checked).

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other hardware accelerators. Tensors are also optimized for automatic differentiation. If you’re familiar with NumPy arrays, you’ll be right at home with the Tensor API.

To start working with tensors, we import the top-level PyTorch package:

import torch

Initializing a Tensor#

Tensors can be initialized in various ways. Take a look at the following examples:

A range of values
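
A minimal sketch of range-style constructors, assuming torch.arange and torch.linspace are the functions of interest under this heading. The variable names r1, r2, and r3 are illustrative only.

```python
import torch

# torch.arange(start, end, step): values from start up to, but excluding, end
r1 = torch.arange(5)          # 0, 1, 2, 3, 4
r2 = torch.arange(2, 10, 2)   # 2, 4, 6, 8

# torch.linspace(start, end, steps): evenly spaced values, end included
r3 = torch.linspace(0, 1, steps=5)

print(r1)
print(r2)
print(r3)
```

Note that torch.arange infers an integer dtype from integer arguments, while torch.linspace returns floating-point values by default.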


Directly from data

Tensors can be created directly from data. The data type is automatically inferred.

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
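
To make the type inference concrete, the following sketch compares the dtype PyTorch picks for integer and floating-point input, and shows an explicit override. It assumes the default floating-point dtype has not been changed; the variable names are illustrative.

```python
import torch

int_tensor = torch.tensor([[1, 2], [3, 4]])            # Python ints
float_tensor = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # Python floats
forced = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)  # explicit dtype

print(int_tensor.dtype)    # torch.int64
print(float_tensor.dtype)  # torch.float32
print(forced.dtype)        # torch.float64
```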

From/To a NumPy array

Tensors can be created from NumPy arrays, and converted back to NumPy arrays with the numpy() method.

import numpy as np
np_arr = np.array([[1, 2], [3, 4]])
tensor = torch.from_numpy(np_arr)
np_arr_2 = tensor.numpy()

print("Numpy array:\n", np_arr)
print("PyTorch tensor:\n", tensor)
print("Numpy array 2:\n", np_arr_2)
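
One detail worth knowing is that torch.from_numpy shares memory with the source array, whereas torch.tensor copies the data. The sketch below illustrates the difference on the CPU; the names shared_arr, shared_tensor, and copied_tensor are illustrative.

```python
import numpy as np
import torch

shared_arr = np.array([1, 2, 3])
shared_tensor = torch.from_numpy(shared_arr)  # shares memory with shared_arr
copied_tensor = torch.tensor(shared_arr)      # independent copy of the data

shared_arr[0] = 100  # modify the NumPy array in place

print(shared_tensor)  # reflects the change
print(copied_tensor)  # unchanged
```

If in-place changes should not propagate, copying the data with torch.tensor (or calling clone() on the result) is the safer choice.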

From another tensor:

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")

With random or constant values:

shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.

shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Attributes of a Tensor#

Tensor attributes describe their shape, datatype, and the device on which they are stored.

tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
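
The dtype and device of a tensor can be changed with the to method. The following is a sketch that falls back to the CPU when no GPU is available; the variable names are illustrative.

```python
import torch

# Use the GPU only if one is available; otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

t = torch.rand(3, 4)
t_dev = t.to(device)          # tensor on the chosen device
t_half = t.to(torch.float16)  # .to also converts between dtypes

print(t_dev.device)
print(t_half.dtype)
```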

Operations on Tensors#

Over 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling and more, are comprehensively described in the PyTorch documentation.


We often need to select part of a tensor. Indexing works just like in NumPy, so let’s try it:

x = torch.arange(12).view(3, 4)
x[:, 1]   # Second column
x[0]      # First row
x[:2, -1] # First two rows, last column
x[1:3]    # Last two rows (rows 1 and 2)

adding/removing indices

We often need to add empty indices, that is, dimensions of size 1.

x[None].shape # add new index at front
x[:, None].shape # in the 2nd position
x[..., None].shape # in the last position

unsqueeze accomplishes the same thing
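
A short sketch of the unsqueeze equivalents of the None-indexing expressions above. The tensor a is an illustrative stand-in with the same shape as x.

```python
import torch

a = torch.arange(12).view(3, 4)

print(a.unsqueeze(0).shape)   # same as a[None].shape
print(a.unsqueeze(1).shape)   # same as a[:, None].shape
print(a.unsqueeze(-1).shape)  # same as a[..., None].shape
```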


We can remove empty indices as well.

x = torch.randn(1,3,4)
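
One way to remove size-1 dimensions is squeeze. The sketch below uses an illustrative tensor b with the same shape as x above.

```python
import torch

b = torch.randn(1, 3, 4)

print(b.squeeze().shape)   # removes every size-1 dimension
print(b.squeeze(0).shape)  # removes dimension 0 only (it has size 1)
print(b.squeeze(1).shape)  # dimension 1 has size 3, so nothing changes
print(b[0].shape)          # indexing also drops the leading dimension
```

Passing an explicit dimension to squeeze is usually safer, because the argument-free form also removes a batch dimension that happens to have size 1.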

Changing the shape

There are many ways to change the shape of a tensor.

x = torch.randn(2,3)
x, x.shape
x.T, x.T.shape  # transpose


How would we create a tensor with shape `torch.Size([6,1])`?
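
A few possible answers, sketched on an illustrative 2 x 3 tensor c. These are not the only options, and which one reads best depends on context.

```python
import torch

c = torch.randn(2, 3)

c1 = c.reshape(6, 1)            # explicit target shape
c2 = c.reshape(-1, 1)           # -1 lets PyTorch infer the size (here 6)
c3 = c.flatten().unsqueeze(-1)  # flatten to 6 elements, then add a trailing dimension

print(c1.shape, c2.shape, c3.shape)
```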

Shapes must be compatible

try:
    x.reshape(4, 2)  # x has 6 elements, but a 4 x 2 tensor needs 8
except RuntimeError as e:
    print(e)

permute allows us to rearrange indices more flexibly

# create a tensor to play with
y = torch.arange(24).reshape(2,3,4)
y, y.shape
y.permute(1, 0, 2).shape
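
To see what permute does to individual elements, the sketch below compares it with transpose and reshape on an illustrative tensor p shaped like y.

```python
import torch

p = torch.arange(24).reshape(2, 3, 4)
q = p.permute(1, 0, 2)  # new dim 0 is old dim 1, new dim 1 is old dim 0

print(q.shape)                              # torch.Size([3, 2, 4])
print(q[2, 1, 3] == p[1, 2, 3])             # same element, first two indices swapped
print(torch.equal(q, p.transpose(0, 1)))    # swapping two dims is a transpose
print(torch.equal(q, p.reshape(3, 2, 4)))   # reshape yields the same shape but a different layout
```

The last line is the important one: permute reorders the axes, while reshape keeps the elements in their original order and only regroups them.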

Joining tensors

You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor joining option that is subtly different from torch.cat.

tensor = torch.arange(12).reshape(3,4)
torch.cat([tensor, tensor, tensor], dim=0)
torch.cat([tensor, tensor, tensor], dim=1)

Sometimes, you want to create a new dimension when combining:

torch.stack([tensor, tensor, tensor])
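
The difference between the two joining functions is easiest to see in the resulting shapes. A small sketch, using an illustrative 3 x 4 tensor m:

```python
import torch

m = torch.arange(12).reshape(3, 4)

# cat joins along an existing dimension, so the number of dimensions stays at 2
print(torch.cat([m, m, m], dim=0).shape)    # torch.Size([9, 4])
print(torch.cat([m, m, m], dim=1).shape)    # torch.Size([3, 12])

# stack inserts a new dimension, so the result has 3 dimensions
print(torch.stack([m, m, m]).shape)         # torch.Size([3, 3, 4])
print(torch.stack([m, m, m], dim=2).shape)  # torch.Size([3, 4, 3])
```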

Arithmetic operations

# This computes the matrix multiplication between two tensors. y1 and y2 will have the same value
# ``tensor.T`` returns the transpose of a tensor
y1 = tensor @ tensor.T
y2 = tensor.matmul(tensor.T)

y1, y2, y1-y2
# This computes the element-wise product. z1 and z2 will have the same value
z1 = tensor * tensor
z2 = tensor.mul(tensor)

z1, z2, z1-z2


x = torch.arange(12, dtype=torch.float32).reshape(3,4)
# sum, mean, std, etc.
x.sum(), x.mean(), x.std()
# sum along first axis
x.sum(dim=0)

# or second
x.sum(dim=1)