PyTorch
Page last updated: 2022-08-08
PyTorch is a commonly used Python package for deep learning.
Basic usage
First, check the tutorials up to and including GPU computing.
If you plan on using NVIDIA’s containers to run your model, please check the page about NVIDIA’s singularity containers.
The basic way to use PyTorch is via the Python provided by the anaconda module. Do not load any additional CUDA modules; anaconda includes everything needed.
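After loading the module, a quick way to check that PyTorch is usable is the following sketch (run it in an interactive Python session or inside a job):

```python
import torch

print(torch.__version__)          # PyTorch version shipped with the anaconda module
print(torch.cuda.is_available())  # True only where a GPU is visible, e.g. in a GPU job
```

On the login node the second line prints False even though the installation is CUDA-enabled, simply because no GPU is attached there.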
Building your own environment with PyTorch
If you need a PyTorch version different from the one supplied with anaconda, we recommend installing your own anaconda environment as detailed here.
Creating an environment with GPU-enabled PyTorch
To create an environment with GPU-enabled PyTorch you can use an environment file like this (pytorch-env.yml):
name: pytorch-env
channels:
- nvidia
- pytorch
- conda-forge
dependencies:
- pytorch
- pytorch-cuda=11.7
- torchvision
- torchaudio
Here we install the latest pytorch version from the pytorch channel, together with the pytorch-cuda metapackage that makes certain the CUDA-enabled build is selected. Additional packages required by pytorch are installed from the conda-forge channel.
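Once the environment has been created and activated, you can verify that the CUDA-enabled build was actually selected. A minimal sketch:

```python
import torch

print(torch.__version__)
print(torch.version.cuda)         # CUDA toolkit version the build was compiled against;
                                  # None means a CPU-only build was installed
print(torch.cuda.is_available())  # False on the login node even for a CUDA build,
                                  # because no GPU is visible there
```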
Hint
During installation, conda will try to verify the maximum CUDA version that the installed graphics cards can support, and by default it will install non-CUDA-enabled versions if no cards are found (as is the case on the login node, where environments are normally built). This can usually be overcome by stating explicitly that the packages should be the CUDA-enabled ones. It might, however, happen that the environment creation process aborts with a message similar to:
nothing provides __cuda needed by tensorflow-2.9.1-cuda112py310he87a039_0
In this instance it might be necessary to override the CUDA settings used by conda/mamba. To do this, prefix your environment creation command with CONDA_OVERRIDE_CUDA=CUDAVERSION, where CUDAVERSION is the CUDA toolkit version you intend to use, as in:
CONDA_OVERRIDE_CUDA="11.2" mamba env create -f cuda-env.yml
This will allow conda to assume that the respective CUDA libraries will be present at a later point and so it will skip those requirements during installation.
For more information, see this helpful post in Conda-Forge’s documentation.
Examples:
Simple PyTorch model
Let’s run the MNIST example from PyTorch’s tutorials:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
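A quick way to sanity-check the network is to forward a batch of random 28x28 images; the sketch below repeats the class so it is self-contained. Each 28x28 input shrinks to 24 after the first 5x5 convolution, to 12 after pooling, to 8 after the second convolution, and to 4 after the second pooling, which is why fc1 expects 4*4*50 features:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4*4*50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)

# Forward a batch of 8 random single-channel 28x28 "images".
net = Net()
out = net(torch.randn(8, 1, 28, 28))
print(out.shape)  # torch.Size([8, 10]): one log-probability vector per image
```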
The full code for the example is in pytorch_mnist.py. One can run this example with srun:
$ wget https://raw.githubusercontent.com/AaltoSciComp/scicomp-docs/master/triton/examples/pytorch/pytorch_mnist.py
$ module load anaconda
$ srun --time=00:15:00 --gres=gpu:1 python pytorch_mnist.py
or with sbatch by submitting pytorch_mnist.sh:
#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --time=00:15:00
module load anaconda
python pytorch_mnist.py
The Python script will download the MNIST dataset to the data folder.
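Inside a script like this, the usual PyTorch pattern for using the GPU when one is available is the following minimal sketch:

```python
import torch

# Select the GPU when one is visible (e.g. inside a GPU job), otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and models) are moved to the chosen device with .to(device).
x = torch.randn(4, 3).to(device)
print(x.device)
```

Because of this pattern, the same script runs unchanged on a CPU-only node and on a GPU node; only the device it computes on differs.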
Running simple PyTorch model with NVIDIA’s containers
Let’s run the same MNIST example from PyTorch’s tutorials as above; the model code is identical, and the full code is again in pytorch_mnist.py.
One can run this example with srun:
$ wget https://raw.githubusercontent.com/AaltoSciComp/scicomp-docs/master/triton/examples/pytorch/pytorch_mnist.py
$ module load nvidia-pytorch/20.02-py3
$ srun --time=00:15:00 --gres=gpu:1 singularity_wrapper exec python pytorch_mnist.py
or with sbatch by submitting pytorch_singularity_mnist.sh:
#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --time=00:15:00
module load nvidia-pytorch/20.02-py3
singularity_wrapper exec python pytorch_mnist.py
The Python script will download the MNIST dataset to the data folder.