TensorFlow

Page last updated: 2022-08-08

TensorFlow is a commonly used Python package for deep learning.

Basic usage

First, check the tutorials up to and including GPU computing.

If you plan to run your model in NVIDIA’s containers, please check the page about NVIDIA’s Singularity containers.

We provide a module for GPU-enabled TensorFlow 2.6, which can be loaded with module load tensorflow. If you need a newer TensorFlow version, we suggest installing it in your own conda environment (see the instructions below).
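After loading the module (or activating your own environment), a quick check like the following can confirm that TensorFlow actually sees a GPU. This is a minimal sketch, not part of the provided module: the helper name visible_gpus is made up for illustration, it assumes TensorFlow 2.1 or newer, and it only reports a GPU when run on a GPU node (for example through srun --gres=gpu:1).

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices returns an empty list when no GPU is visible
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not importable in this environment")
    elif gpus:
        print("TensorFlow sees these GPUs:", gpus)
    else:
        print("TensorFlow is installed, but no GPU is visible")
```

On the login node the last message is the expected result, since no GPU is available there.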

Installing via conda

See the conda page for details on how to install conda environments.

Creating an environment with packages requiring CUDA

Many tools check whether the system has a CUDA-capable graphics card set up, and install non-CUDA versions by default if none is found. This is the case on the login node, where environments are normally built. You can overcome this by installing CUDA-specific versions (as detailed below). However, the environment creation process might still abort with a message similar to:

nothing provides __cuda needed by tensorflow-2.9.1-cuda112py310he87a039_0

In this case, it might be necessary to override the CUDA settings used by conda/mamba. To do this, prefix your environment creation command with CONDA_OVERRIDE_CUDA=CUDAVERSION, where CUDAVERSION is the CUDA toolkit version you intend to use, as in:

CONDA_OVERRIDE_CUDA="11.2" mamba env create -f cuda-env.yml

This lets conda assume that the corresponding CUDA libraries will be present at a later point, so those requirements are skipped during installation.
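The value for CONDA_OVERRIDE_CUDA can usually be read off the build string in the error message. The sketch below assumes the naming pattern seen in the message above, where cuda112 stands for CUDA 11.2 (a two-digit major version followed by the minor version). The helper name cuda_version_from_build is hypothetical, and the pattern may not hold for every package.

```python
import re


def cuda_version_from_build(build_string):
    """Extract a CUDA version such as "11.2" from text like "cuda112py310he87a039_0".

    Assumes the tag is "cuda" + two-digit major + minor; returns None if no tag is found.
    """
    match = re.search(r"cuda(\d{2})(\d+)", build_string)
    if match is None:
        return None
    major, minor = match.groups()
    return f"{major}.{minor}"


message = "nothing provides __cuda needed by tensorflow-2.9.1-cuda112py310he87a039_0"
version = cuda_version_from_build(message)
# Print the environment creation command with the override prefix
print(f'CONDA_OVERRIDE_CUDA="{version}" mamba env create -f cuda-env.yml')
```

For the error message shown earlier, this prints the same command as the example above, with the version set to 11.2.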

Creating an environment with GPU-enabled TensorFlow

To create an environment with GPU-enabled TensorFlow, you can use an environment file like this tensorflow-env.yml:

name: tensorflow-env
channels:
  - conda-forge
dependencies:
  - tensorflow=*=*cuda*

Here we install the latest TensorFlow from the conda-forge channel, with the additional requirement that the build string of the tensorflow package must contain a reference to a CUDA toolkit. For a specific version, replace =*=*cuda* with e.g. =2.8.1=*cuda* for version 2.8.1.
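To make the pattern concrete, the following sketch builds the dependency line for a given version and prints a matching environment file. The function names are illustrative only; the output uses the same environment name, channel, and package=version=build format as the file above.

```python
def tensorflow_spec(version="*"):
    """Build a conda spec (package=version=build) that only matches CUDA builds."""
    return f"tensorflow={version}=*cuda*"


def environment_file(version="*", name="tensorflow-env"):
    """Return the text of an environment file like tensorflow-env.yml."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",
        "dependencies:",
        f"  - {tensorflow_spec(version)}",
    ]
    return "\n".join(lines) + "\n"


# Pin TensorFlow 2.8.1 while still requiring a CUDA build
print(environment_file("2.8.1"))
```

Calling environment_file() without arguments reproduces the unpinned file shown above.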

Examples:

Let’s run the MNIST example from TensorFlow’s tutorials:

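import tensorflow as tf

# Small fully connected classifier for 28x28 MNIST images
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),       # image -> 784-element vector
  tf.keras.layers.Dense(512, activation=tf.nn.relu),   # hidden layer
  tf.keras.layers.Dropout(0.2),                        # drop 20% of units during training
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)  # one probability per digit
])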

The full code for the example is in tensorflow_mnist.py. One can run this example with srun:

wget https://raw.githubusercontent.com/AaltoSciComp/scicomp-docs/master/triton/examples/tensorflow/tensorflow_mnist.py
module load anaconda
srun --time=00:15:00 --gres=gpu:1 python tensorflow_mnist.py

or with sbatch by submitting tensorflow_mnist.sh:

#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --time=00:15:00

module load anaconda

python tensorflow_mnist.py

Note that, by default, Keras downloads datasets to $HOME/.keras/datasets.
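Since these downloads accumulate in your home directory, it can be useful to see what is cached there. The following is a minimal sketch using only the standard library; the helper name cached_datasets is made up, and the default path is simply the location mentioned above.

```python
from pathlib import Path


def cached_datasets(cache_dir=None):
    """Return {relative path: size in bytes} for files under the Keras dataset cache."""
    if cache_dir is None:
        cache_dir = Path.home() / ".keras" / "datasets"
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return {}  # nothing has been downloaded yet
    return {
        str(path.relative_to(cache_dir)): path.stat().st_size
        for path in sorted(cache_dir.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    for name, size in cached_datasets().items():
        print(f"{size / 1e6:8.1f} MB  {name}")
```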

The same tensorflow_mnist.py example can also be run inside NVIDIA’s TensorFlow Singularity container. With srun:

wget https://raw.githubusercontent.com/AaltoSciComp/scicomp-docs/master/triton/examples/tensorflow/tensorflow_mnist.py
module load nvidia-tensorflow/20.02-tf1-py3
srun --time=00:15:00 --gres=gpu:1 singularity_wrapper exec python tensorflow_mnist.py

or with sbatch by submitting tensorflow_singularity_mnist.sh:

#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --time=00:15:00

module load nvidia-tensorflow/20.02-tf1-py3

singularity_wrapper exec python tensorflow_mnist.py
