# Storage: local drives

Local disks on compute nodes are the preferred place for doing your IO. The general idea is to use network storage as a backend and the local disk for the actual data processing.

• At the beginning of the job, cd to /tmp and make a unique directory for your run
• Copy the needed input from WRKDIR to there
• Run your calculation normally, directing all output to /tmp
• At the end, copy the relevant output to WRKDIR for analysis and further use

Pros

• You get better and steadier IO performance. WRKDIR is shared among all users, which makes per-user performance rather poor.
• You leave WRKDIR performance to those who cannot use local disks.
• You get much better performance when using many small files (Lustre works poorly here).
• It saves your quota if your code generates lots of data but you ultimately need only part of it.
• In general, it is an excellent choice for single-node runs (that is, all of the job's tasks run on the same node).

Cons

• Not feasible for huge files (>100GB). Use WRKDIR instead.
• Small learning curve (must copy files before and after the job).
• Not feasible for cross-node IO (MPI jobs). Use WRKDIR instead.
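The size limit above can be checked before staging anything. The following is a minimal sketch, not a site-provided tool: `INPUT` and `SCRATCH` are hypothetical variable names, and the comparison simply uses the standard `du` and `df` utilities to see whether the input would fit on the local disk.

```shell
#!/bin/bash
# Sketch: decide whether the input fits on the local disk before copying it.
# INPUT and SCRATCH are hypothetical names; the defaults only exist so that
# the snippet also runs outside a job.
INPUT="${INPUT:-.}"
SCRATCH="${SCRATCH:-/tmp}"

# Size of the input and free space on the scratch filesystem, both in KiB.
need_kb=$(du -sk "$INPUT" | awk '{print $1}')
free_kb=$(df -Pk "$SCRATCH" | awk 'NR==2 {print $4}')

if [ "$need_kb" -lt "$free_kb" ]; then
    echo "input fits on local disk: ${need_kb} KiB needed, ${free_kb} KiB free"
    use_local=yes
else
    echo "input too large for local disk, work in WRKDIR instead"
    use_local=no
fi
```

Output files need space too, so in practice you would likely want a generous safety margin rather than a bare comparison.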

## How to use local drives on compute nodes

Local drives are NOT for long-term data. They are cleaned every time your job finishes.

You have to use --gres=spindle to ensure that you get a hard disk (note, January 2019: except on GPU nodes).

/tmp is a bind-mounted, user-specific directory. The directory is per-user, not per-job: if you have two jobs running on the same node, they share the same /tmp.
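Because /tmp is shared between all of your jobs on a node, each job should work in its own subdirectory. A minimal sketch of one way to do that is below. The `nojob` fallback and the variable names `job_id` and `scratch` are assumptions for illustration, so that the snippet also runs outside Slurm.

```shell
#!/bin/bash
# Sketch: create a scratch directory that is unique to this run.
# Assumption: outside Slurm SLURM_JOB_ID is unset, so fall back to "nojob".
job_id="${SLURM_JOB_ID:-nojob}"

# mktemp -d appends a random suffix, so repeated runs that happen to use the
# same prefix still get separate directories.
scratch="$(mktemp -d "/tmp/${job_id}.XXXXXX")"
echo "scratch directory: $scratch"
```

Using plain /tmp/$SLURM_JOB_ID, as the examples on this page do, is enough when each job creates only one directory. The random suffix is just extra protection against collisions.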

### Interactively

How to use /tmp when you log in interactively:

$ sinteractive -t 1:00:00                       # request a node for one hour
(node)$ mkdir /tmp/$SLURM_JOB_ID                # create a unique directory, here we use the job ID
(node)$ cd /tmp/$SLURM_JOB_ID
... do what you wanted ...
(node)$ cp your_files $WRKDIR/my/valuable/data  # copy what you need
(node)$ cd; rm -rf /tmp/$SLURM_JOB_ID           # clean up after yourself
(node)$ exit


### In batch script

Batch job example that prevents data loss if the program gets terminated (either because of scancel or because of the time limit).

#!/bin/bash

#SBATCH --time=0-12:00:00 --mem-per-cpu=2500                  # time and memory requirements

mkdir /tmp/$SLURM_JOB_ID                                      # get a directory where you will send all output from your program
cd /tmp/$SLURM_JOB_ID

## set the trap: when killed or exits abnormally you get the
## output copied to $WRKDIR/$SLURM_JOB_ID anyway
trap "mkdir $WRKDIR/$SLURM_JOB_ID; mv -f /tmp/$SLURM_JOB_ID $WRKDIR/$SLURM_JOB_ID; exit" TERM EXIT

## run the program and redirect all IO to a local drive
## assuming that you have your program and input at $WRKDIR
srun $WRKDIR/my_program $WRKDIR/input > output

mv /tmp/$SLURM_JOB_ID/output $WRKDIR/SOMEDIR                  # move your output fully or partially
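The trap in the example above is registered for both TERM and EXIT, so its commands can run twice when the job is killed. One possible variation, sketched here under stated assumptions, puts the logic in a function that is safe to call more than once. The names `SCRATCH`, `DEST` and `save_output`, and the `demo_$$` and /tmp fallbacks, are hypothetical and only there so the snippet runs outside Slurm. The `echo` line stands in for the real `srun` call.

```shell
#!/bin/bash
# Sketch of a save-on-exit trap written as a function.
# Assumptions: SCRATCH and DEST are hypothetical names; the fallbacks let the
# snippet run outside Slurm.
SCRATCH="/tmp/${SLURM_JOB_ID:-demo_$$}"
DEST="${WRKDIR:-/tmp}/results_${SLURM_JOB_ID:-demo_$$}"

save_output() {
    # Do nothing if the scratch directory was already saved and removed.
    [ -d "$SCRATCH" ] || return 0
    mkdir -p "$DEST"
    # Copy first and delete afterwards, so nothing is lost if the copy fails.
    cp -r "$SCRATCH"/. "$DEST"/ && rm -rf "$SCRATCH"
}
trap save_output EXIT
trap 'exit 143' TERM   # turn SIGTERM into a normal exit so the EXIT trap runs

mkdir -p "$SCRATCH"
cd "$SCRATCH" || exit 1
echo "result" > output   # stand-in for: srun $WRKDIR/my_program $WRKDIR/input > output
```

Copying instead of moving costs some extra time, but a failed copy then leaves the data in /tmp rather than half-moved.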


### Batch script for thousands of input/output files

If your job requires a large number of files as input/output, using the tar utility can greatly reduce the load on the $WRKDIR filesystem. Methods like this are recommended if you're working with thousands of files.

Working with tar balls is done in the following fashion:

1. Determine if your input data can be collected into analysis-sized chunks that can be (if possible) re-used
2. Make a tar ball out of the input data (tar cf <tar filename>.tar <input files>)
3. At the beginning of the job, copy the tar ball into /tmp and untar it there (tar xf <tar filename>.tar)
4. Do the analysis there, on the local disk
5. If the output is a large number of files, tar them and copy them out. Otherwise write the output to $WRKDIR

Sample code is below:

#!/bin/bash

#SBATCH --time=0-12:00:00 --mem-per-cpu=2000      # time and memory requirements
mkdir /tmp/$SLURM_JOB_ID                          # get a directory where you will put your data
cp $WRKDIR/input.tar /tmp/$SLURM_JOB_ID           # copy tarred input files
cd /tmp/$SLURM_JOB_ID

trap "rm -rf /tmp/$SLURM_JOB_ID; exit" TERM EXIT  # set the trap: when killed or exits abnormally you clean up your stuff

tar xf input.tar                                  # untar the files
srun input/*                                      # do the analysis, or what ever else
tar cf output.tar output/*                        # tar output
mv output.tar $WRKDIR/SOMEDIR                     # copy results back
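The batch script starts from an existing input.tar. Building that tar ball (step 2 of the list) might look like the following minimal sketch. The directory layout, the file names and the count of 100 files are made up for the demonstration, and the files are generated in a temporary directory instead of using real data.

```shell
#!/bin/bash
# Sketch of step 2: pack many small input files into a single tar ball.
# Assumption: the directory and file names below are made up for the demo.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Stand-in for real input data: a directory with many small files.
mkdir input
for i in $(seq 1 100); do
    echo "sample $i" > "input/file_$i.txt"
done

tar cf input.tar input                           # one archive instead of 100 small files
n_packed=$(tar tf input.tar | grep -c 'file_')   # list the archive to verify its contents
echo "packed $n_packed files into input.tar"
```

Listing the archive with `tar tf` before submitting the job is a cheap way to confirm that the expected files are inside.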