Running Python with OpenMP parallelization
Various Python packages such as NumPy, SciPy and pandas can utilize OpenMP
to run on multiple CPUs. As an example, let’s run the Python script
python_openmp.py
that calculates the inverse (via the pseudo-inverse routine np.linalg.pinv) of five
random symmetric matrices of size 2000x2000.
from time import time

import numpy as np

nrounds = 5
t_start = time()
for i in range(nrounds):
    # Build a random symmetric 2000x2000 matrix and compute its pseudo-inverse
    a = np.random.random([2000, 2000])
    a = a + a.T
    b = np.linalg.pinv(a)
t_delta = time() - t_start
print('Seconds taken to invert %d symmetric 2000x2000 matrices: %f' % (nrounds, t_delta))
The full code for the example is in the hpc-examples repository.
One can run this example with srun:
wget https://raw.githubusercontent.com/AaltoSciComp/hpc-examples/master/python/python_openmp/python_openmp.py
module load anaconda/2022-01
export OMP_PROC_BIND=true
srun --cpus-per-task=2 --mem=2G --time=00:15:00 python python_openmp.py
or with sbatch by submitting python_openmp.sh:
#!/bin/bash -l
#SBATCH --time=00:10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem-per-cpu=1G
#SBATCH -o python_openmp.out
module load anaconda/2022-01
export OMP_PROC_BIND=true
echo 'Running on: '$HOSTNAME
srun python python_openmp.py
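To check that a job actually received the CPUs it requested, a short script along the following lines can be run with srun in place of python_openmp.py. This is a minimal sketch and not part of the hpc-examples repository: the function name cpu_report is hypothetical, os.sched_getaffinity is only available on Linux, and whether OMP_NUM_THREADS is set (and honoured by the numerical library behind NumPy) depends on the cluster and the installation.

```python
import os


def cpu_report(environ=os.environ):
    """Collect what the job environment says about available CPUs.

    cpu_report is a hypothetical helper used only for illustration.
    """
    # sched_getaffinity is Linux-only; fall back to cpu_count elsewhere.
    if hasattr(os, "sched_getaffinity"):
        usable = len(os.sched_getaffinity(0))
    else:
        usable = os.cpu_count() or 1
    return {
        "usable_cpus": usable,
        # Environment variables are returned as strings, or None if unset.
        "SLURM_CPUS_PER_TASK": environ.get("SLURM_CPUS_PER_TASK"),
        "OMP_NUM_THREADS": environ.get("OMP_NUM_THREADS"),
        "OMP_PROC_BIND": environ.get("OMP_PROC_BIND"),
    }


if __name__ == "__main__":
    for key, value in cpu_report().items():
        print(f"{key}: {value}")
```

If usable_cpus is smaller than the value given to --cpus-per-task, the script is probably not running inside the allocation it was meant for.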
Important
Python has a global interpreter lock (GIL), which forces some operations to be executed on only one thread at a time; while these operations are occurring, other threads are idle. Such operations include reading files and executing print statements. One should therefore be extra careful with multithreaded code, as it is easy to write seemingly parallel code that does not actually utilize multiple CPUs.
There are ways to minimize the effects of the GIL on your Python code, and if you’re creating your own multithreaded code, we recommend that you take this into account.
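The effect of the GIL can be seen with a small experiment. The sketch below runs the same pure-Python, CPU-bound task first serially and then in a thread pool. The names count_primes and compare are hypothetical and chosen only for this illustration. On a standard CPython build with the GIL enabled, the threaded run is usually not noticeably faster than the serial one, although exact timings depend on the machine and the Python version.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Pure-Python, CPU-bound work: the thread holds the GIL while it runs."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def compare(limit=20000, workers=2):
    """Time identical tasks run one after another and in threads."""
    tasks = [limit] * workers

    t0 = time.perf_counter()
    serial = [count_primes(x) for x in tasks]
    t_serial = time.perf_counter() - t0

    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        threaded = list(pool.map(count_primes, tasks))
    t_threaded = time.perf_counter() - t0

    return serial, threaded, t_serial, t_threaded


if __name__ == "__main__":
    serial, threaded, t_serial, t_threaded = compare()
    print(f"serial:   {t_serial:.2f} s")
    print(f"threaded: {t_threaded:.2f} s")
```

By contrast, the matrix inversion in python_openmp.py spends its time inside compiled numerical routines, which is why it can use several CPUs even though it is driven from a single Python thread.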