Move reused code into Python script files#
Jupyter is a great place to test out ideas and get code working, and it can also be a good place to run and document your experiments. Once your code is working, however, it's usually a good idea to move much of it into Python script files. This lets you create copies of your notebook without duplicating the logic for how data is loaded, models are defined, and training is performed. In the end, the notebook should simply document the experiment you performed.
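In this lesson, the reused code lives in a utils package next to the notebook. Its layout, as implied by the imports used later in this notebook, looks roughly like this (the notebook filename is just a placeholder):
experiment.ipynb      # this notebook
utils/
    data.py           # dataloader helpers
    models.py         # model definitions
    training.py       # training and testing loop
    response.py       # answer-box helper used in this course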
By default, Jupyter does not reload imported modules. If you are editing local .py files, it's a good idea to use the autoreload extension so that your changes are picked up automatically.
# use autoreload because, by default, Python will not re-import modules
%load_ext autoreload
%autoreload 2
import os
import torch
from torchvision import transforms
import matplotlib.pyplot as plt
Settings#
These parameters are inputs to the fitting process. We leave them in the notebook because we may change them from one experiment to the next.
data_dir = f"/scratch/{os.environ['USER']}/data"
model_path = f"/scratch/{os.environ['USER']}/model.pt"
# Model and Training
epochs = 5              # number of training epochs
batch_size = 128        # input batch size for training (default: 64)
test_batch_size = 1000  # input batch size for testing (default: 1000)
num_workers = 10        # parallel data loading to speed things up
lr = 1.0                # learning rate (default: 1.0)
gamma = 0.7             # learning rate step gamma (default: 0.7)
no_cuda = False         # disables CUDA training (default: False)
seed = 355              # random seed (default: 355)
log_interval = 10       # how many batches to wait before logging training status (default: 10)
save_model = False      # save the trained model (default: False)
# additional derived settings
use_cuda = not no_cuda and torch.cuda.is_available()
torch.manual_seed(seed)
device = torch.device("cuda" if use_cuda else "cpu")
print("Device:", device)
Dataset#
The logic for loading data will be repeated across several experiments. To avoid code duplication, we move code out of the notebook and into a separate .py file. This also reduces the number of import statements needed in the notebook itself.
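To make the division of labor concrete, here is a minimal sketch of what a module like utils/data.py might contain. It assumes the MNIST dataset from torchvision (consistent with the normalization constants used below); the actual course file may differ:
# utils/data.py (illustrative sketch -- assumes MNIST; see the real file for details)
from torch.utils.data import DataLoader
from torchvision import datasets

def get_train_dataloader(data_dir, transforms, batch_size, num_workers):
    """Return a shuffled DataLoader over the training split."""
    dataset = datasets.MNIST(data_dir, train=True, download=True, transform=transforms)
    return DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)

def get_test_dataloader(data_dir, transforms, batch_size, num_workers):
    """Return a DataLoader over the test split (no shuffling needed)."""
    dataset = datasets.MNIST(data_dir, train=False, download=True, transform=transforms)
    return DataLoader(dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)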
from utils import data
# transforms (we may wish to experiment with these so leave as inputs)
train_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
test_transforms = train_transforms
train_loader = data.get_train_dataloader(data_dir, train_transforms, batch_size, num_workers)
test_loader = data.get_test_dataloader(data_dir, test_transforms, test_batch_size, num_workers)
# save a test batch for later testing
image_gen = iter(test_loader)
test_img, test_trg = next(image_gen)
from utils.response import create_answer_box
create_answer_box("After you run the above code cell to load the data, open `utils/data.py` and check out the code there. Add a `print` statement to each of `get_train_dataloader` and `get_test_dataloader` to say that each dataset was successfully loaded. Then run the code cell again to see the print statements appear. This immediate change is possible because of the autoreload magic we used! If you have any questions, submit them here.", "02-01")
print("Training dataset:", train_loader.dataset)
print("Testing dataset:", test_loader.dataset)
Model definition#
We move the model definitions to a models.py file. This file also contains test code for developing the model. In the future we may place several different model definitions into this file, so that we can compare different architecture choices.
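For orientation, a model module along these lines would be consistent with how the notebook uses it. The architecture below is only a sketch modeled on the standard PyTorch MNIST example; the real utils/models.py may differ:
# utils/models.py (illustrative sketch -- the real file defines the actual architecture)
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)   # 28x28x1 -> 26x26x32
        self.conv2 = nn.Conv2d(32, 64, 3)  # 26x26x32 -> 24x24x64
        self.fc1 = nn.Linear(9216, 128)    # 12x12x64 = 9216 after pooling
        self.fc2 = nn.Linear(128, 10)      # 10 output classes

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        return F.log_softmax(self.fc2(x), dim=1)

    def num_params(self):
        # count only trainable parameters
        return sum(p.numel() for p in self.parameters() if p.requires_grad)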
from utils import models
# Create the model
model = models.Classifier()
# let's make sure we can run a batch of data through the model
with torch.no_grad():
    x, y = next(iter(train_loader))
    y_hat = model(x)
y_hat.shape, y_hat
model
print("Number of parameters:", model.num_params())
Training and testing#
We also move our training logic into its own .py file.
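A minimal version of train_and_test might look like the sketch below. It assumes an Adadelta optimizer with a StepLR schedule (which matches the lr and gamma defaults above and the standard PyTorch MNIST example) and a model that returns log-probabilities; the real utils/training.py may log more detail:
# utils/training.py (illustrative sketch -- the real file may differ)
import torch
import torch.nn.functional as F
from torch.optim.lr_scheduler import StepLR

def train_and_test(model, train_loader, test_loader, epochs, lr, gamma, device):
    optimizer = torch.optim.Adadelta(model.parameters(), lr=lr)
    scheduler = StepLR(optimizer, step_size=1, gamma=gamma)
    for epoch in range(1, epochs + 1):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = F.nll_loss(model(x), y)  # model outputs log-probabilities
            loss.backward()
            optimizer.step()
        scheduler.step()

        # evaluate on the held-out test set after each epoch
        model.eval()
        correct = 0
        with torch.no_grad():
            for x, y in test_loader:
                x, y = x.to(device), y.to(device)
                correct += (model(x).argmax(dim=1) == y).sum().item()
        print(f"Epoch {epoch}: test accuracy = {correct / len(test_loader.dataset):.4f}")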
from utils import training
model = models.Classifier().to(device)
model
training.train_and_test(model, train_loader, test_loader, epochs, lr, gamma, device)
Code challenge#
Open `utils/models.py` and make two new classes like `Classifier` – `Classifier_deep` and `Classifier_wide`. For `Classifier_deep`, add another convolution layer (or a few!). For `Classifier_wide`, keep the same number of layers as in `Classifier`, but make the layers wider – i.e., use more output channels than `Classifier`. Report on your results: 1) How many parameters does each model have? 2) What final test accuracy does each model achieve?
# Use this code cell to test your modified model
model = models.Classifier().to(device)
print("Number of trainable parameters:", sum(p.numel() for p in model.parameters() if p.requires_grad))
training.train_and_test(model, train_loader, test_loader, epochs, lr, gamma, device)
create_answer_box("How many parameters does each of `Classifer_deep` and `Classifier_wide` have?", "01-02")
create_answer_box("What's the final test accuracy you achieved with each model?", "01-03")