Triton quick reference¶
This page collects the most important Triton reference information in one place.
Modules¶
Command | Description
---|---
`module load <module>` | load a module
`module avail` | list all modules
`module spider <name>` | search modules
`module list` | list currently loaded modules
`module show <module>` | details on a module
`module help <module>` | details on a module
`module unload <module>` | unload a module
`module save <alias>` | save the current module collection to this alias (saved in `~/.lmod.d/`)
`module restore <alias>` | load a saved module collection (faster than loading individually)
`module purge` | unload all loaded modules (faster than unloading individually)
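A typical module workflow might look like the following sketch. The module and collection names (`gcc`, `my-tools`) are placeholders, not modules guaranteed to exist on Triton.

```bash
# Search for a module, load it, and check what is loaded
module spider gcc          # search modules matching "gcc"
module load gcc            # load the version you picked from the search
module list                # confirm what is currently loaded

# Save the current set of modules under an alias and restore it later
module save my-tools       # stored under ~/.lmod.d/
module purge               # unload everything
module restore my-tools    # reload the saved collection in one step
```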
Common software¶
Storage¶
Name | Path | Quota | Backup | Locality | Purpose
---|---|---|---|---|---
Home | | hard quota 10GB | Nightly | all nodes | Small user-specific files, no calculation data.
Work | | 200GB and 1 million files | x | all nodes | Personal working space for every user. Calculation data etc. Quota can be increased on request.
Scratch | | on request | x | all nodes | Department/group specific project directories.
Local temp | | limited by disk size | x | single-node | Primary (and usually fastest) place for single-node calculation data. Removed once the user's jobs are finished on the node.
Local persistent | | varies | x | dedicated group servers only | Local disk persistent storage. On servers purchased for a specific group. Not backed up.
ramfs (login nodes only) | | limited by memory | x | single-node | In-memory filesystem (ramfs) on the login nodes only.
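As a sketch of how the tiers are typically combined in a job: stage input onto fast node-local temporary disk, run there, and copy results back to network-visible storage before the job ends. The directory variable, program, and file names below are placeholders, not paths defined by Triton.

```bash
#!/bin/bash
# Hypothetical staging pattern: fast local disk during the run,
# results copied back to shared storage before the job finishes.
WORKDIR=$HOME/myproject        # placeholder for your own work directory
LOCALTMP=$(mktemp -d)          # node-local temporary directory

cp "$WORKDIR/input.dat" "$LOCALTMP/"
cd "$LOCALTMP"
./my_program input.dat > output.dat   # placeholder program

cp output.dat "$WORKDIR/"      # copy results back; local temp is cleaned after the job
```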
Partitions¶
Partition | Max job size | Mem/core (GB) | Tot mem (GB) | Cores/node | Limits | Use
---|---|---|---|---|---|---
&lt;default&gt; | If you leave the partition off, all possible partitions are used (based on time/mem) | | | | |
debug | 2 nodes | 2.66 - 12 | 32-256 | 12, 20, 24 | 15 min | testing and debugging, short interactive work; 1 node of each arch
batch | 16 nodes | 2.66 - 12 | 32-256 | 12, 20, 24 | 5d | primary partition, all serial & parallel jobs
short | 8 nodes | 4 - 12 | 48-256 | 12, 20, 24 | 4h | short serial & parallel jobs, +96 dedicated CPU cores
hugemem | 1 node | 43 | 1024 | 24 | 3d | huge memory jobs, 1 node only
gpu | 1 node, 2-8 GPUs | 2 - 10 | 24-128 | 12 | 5d | GPU jobs
gpushort | 4 nodes, 2-8 GPUs | 2 - 10 | 24-128 | 12 | 4h | short GPU jobs
interactive | 2 nodes | 5 | 128 | 24 | 1d | interactive work (default partition for `sinteractive`)

Use `slurm partitions` to see more details.
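For example, a job that fits the `short` limits could be steered there explicitly (normally you can leave `-p` off and let Slurm pick). The script name and resource values here are illustrative:

```bash
# Explicitly target the short partition with a 2-hour limit
sbatch -p short -t 02:00:00 --mem-per-cpu=2000 my_job.sh
```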
Job submission¶
Command | Description
---|---
`sbatch <script.sh>` | submit a job to the queue (see standard options below)
`srun <command>` | within a running job script/environment: run code using the allocated resources (see options below)
`srun <command>` | on the frontend: submit to the queue, wait until done, show output (see options below)
`sinteractive` | submit a job, wait, and get a shell on the node for interactive work (X forwarding works, default partition `interactive`); exit the shell when done (see options below)
`srun --pty bash` | (advanced) another way to run interactive jobs, no X forwarding but simpler; exit the shell when done
`scancel <jobid>` | cancel a job in the queue
`salloc` | (advanced) allocate resources from the frontend node; use `srun` to run commands on the allocation
`scontrol` | view/modify job and Slurm configuration
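A minimal batch script tying these commands together might look like this sketch; the resource values and the `hello.out` file name are illustrative only.

```bash
#!/bin/bash
#SBATCH --time=01:00:00        # 1 hour time limit
#SBATCH --mem-per-cpu=2000     # 2000 MB of memory per core
#SBATCH --ntasks=1             # one task/process
#SBATCH --output=hello.out     # stdout goes here

# srun launches the command as a job step on the allocated resources
srun hostname
```

Submit it with `sbatch hello.sh`, follow it with `slurm q`, and cancel it with `scancel <jobid>` if needed.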
Command | Option | Description
---|---|---
`sbatch`/`srun`/`sinteractive` | `-t, --time=hh:mm:ss` | time limit
 | `-t, --time=days-hours` | time limit, days-hours
 | `-p, --partition=<partition>` | job partition; usually leave this off and things are auto-detected
 | `--mem-per-cpu=<n>` | request n MB of memory per core
 | `--mem=<n>` | request n MB of memory per node
 | `-c, --cpus-per-task=<n>` | allocate n CPUs for each task; for multithreaded jobs (compare `--ntasks`: `-c` means the number of cores for each process started)
 | `-N, --nodes=<n>-<m>` | allocate a minimum of n, maximum of m nodes
 | `-n, --ntasks=<n>` | allocate resources for and start n tasks (one task = one process started; it is up to you to make them communicate; the main script runs only on the first node, but sub-processes started with `srun` run this many times)
 | `-J, --job-name=<name>` | short job name
 | `-o, --output=<file>` | print output into this file
 | `-e, --error=<file>` | print errors into this file
 | `--exclusive` | allocate exclusive access to nodes; for large parallel jobs
 | `--constraint=<feature>` | request a node feature (see `slurm features` for the list)
 | `--array=<indices>` | run the job multiple times; use the variable `$SLURM_ARRAY_TASK_ID` to adjust parameters
 | `--gres=gpu`, `--gres=gpu:<n>` | request a GPU, or n GPUs
 | | request nodes that have local disks
 | `--mail-type=<type>` | notify of events: `BEGIN`, `END`, `FAIL` or `ALL`
 | `--mail-user=<email>` | whom to send the email
 | | print the allocated nodes (from within the script)
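For instance, an array job sketch combining several of the options above; the index range, the `my_program` executable, and its `--seed` flag are just illustrations of how `$SLURM_ARRAY_TASK_ID` can be used.

```bash
#!/bin/bash
#SBATCH --time=00:30:00
#SBATCH --mem-per-cpu=1000
#SBATCH --array=0-9              # run 10 copies of this script
#SBATCH --output=array_%a.out    # %a expands to the array task id

# Each copy sees a different $SLURM_ARRAY_TASK_ID and can use it
# to pick its own input file or parameter set.
srun ./my_program --seed "$SLURM_ARRAY_TASK_ID"
```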
Command | Description
---|---
`slurm queue` / `slurm q` | status of your queued jobs (long/short)
`slurm partitions` | overview of partitions (A/I/O/T = allocated, idle, other, total)
`slurm cpus <partition>` | list free CPUs in a partition
`slurm history` | show status of recent jobs
`seff <jobid>` | show percent of mem/CPU used in a job
`slurm j <jobid>` | job details (only while running)
`slurm full` | show status of all jobs
`sacct` | full history information (advanced, needs arguments)
Full slurm command help:
```
$ slurm
Show or watch job queue:
 slurm [watch] queue       show own jobs
 slurm [watch] q           show user's jobs
 slurm [watch] quick       show quick overview of own jobs
 slurm [watch] shorter     sort and compact entire queue by job size
 slurm [watch] short       sort and compact entire queue by priority
 slurm [watch] full        show everything
 slurm [w] [q|qq|ss|s|f]   shorthands for above!
 slurm qos                 show job service classes
 slurm top [queue|all]     show summary of active users
Show detailed information about jobs:
 slurm prio [all|short]    show priority components
 slurm j|job               show everything else
 slurm steps               show memory usage of running srun job steps
Show usage and fair-share values from accounting database:
 slurm h|history           show jobs finished since, e.g. "1day" (default)
 slurm shares
Show nodes and resources in the cluster:
 slurm p|partitions        all partitions
 slurm n|nodes             all cluster nodes
 slurm c|cpus              total cpu cores in use
 slurm cpus                cores available to partition, allocated and free
 slurm cpus jobs           cores/memory reserved by running jobs
 slurm cpus queue          cores/memory required by pending jobs
 slurm features            List features and GRES
Examples:
 slurm q
 slurm watch shorter
 slurm cpus batch
 slurm history 3hours
```
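A typical monitoring round after submitting might look like the lines below. The job id `12345678` is hypothetical, and `seff` is used here assuming it is installed alongside Slurm.

```bash
slurm q                  # quick view of your own jobs
slurm watch queue        # keep refreshing the same view
slurm history 3hours     # jobs finished in the last three hours
seff 12345678            # CPU/memory efficiency of a finished job (hypothetical job id)
```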
Other advanced commands (many require lots of parameters to be useful):
Command | Description
---|---
`squeue` | full info on queues
`sinfo` | advanced info on partitions
`sinfo -N` | list all nodes
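Quick examples of the plain Slurm equivalents; these are standard Slurm commands, and the exact output columns vary by configuration.

```bash
squeue -u "$USER"    # your jobs only, using plain squeue
sinfo -p batch       # node states in the batch partition
sinfo -N -l          # one line per node, long format
```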
Toolchains¶
Toolchain | Compiler version | MPI version | BLAS version | ScaLAPACK version | FFTW version | CUDA version
---|---|---|---|---|---|---
GOOLF Toolchains: | | | | | |
goolf/triton-2016a | GCC/4.9.3 | OpenMPI/1.10.2 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 |
goolf/triton-2016b | GCC/5.4.0 | OpenMPI/1.10.3 | OpenBLAS/0.2.18 | ScaLAPACK/2.0.2 | FFTW/3.3.4 |
goolfc/triton-2016a | GCC/4.9.3 | OpenMPI/1.10.2 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 | 7.5.18
goolfc/triton-2017a | GCC/5.4.0 | OpenMPI/2.0.1 | OpenBLAS/0.2.19 | ScaLAPACK/2.0.2 | FFTW/3.3.4 | 8.0.61
GMPOLF Toolchains: | | | | | |
gmpolf/triton-2016a | GCC/4.9.3 | MPICH/3.0.4 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 |
gmpolfc/triton-2016a | GCC/4.9.3 | MPICH/3.0.4 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 | 7.5.18
GMVOLF Toolchains: | | | | | |
gmvolf/triton-2016a | GCC/4.9.3 | MVAPICH2/2.0.1 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 |
gmvolfc/triton-2016a | GCC/4.9.3 | MVAPICH2/2.0.1 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 | 7.5.18
IOOLF Toolchains: | | | | | |
ioolf/triton-2016a | icc/2015.3.187 | OpenMPI/1.10.2 | OpenBLAS/0.2.15 | ScaLAPACK/2.0.2 | FFTW/3.3.4 |
IOMKL Toolchains: | | | | | |
iomkl/triton-2016a | icc/2015.3.187 | OpenMPI/1.10.2 | imkl/11.3.1.150 | imkl/11.3.1.150 | imkl/11.3.1.150 |
iomkl/triton-2016b | icc/2015.3.187 | OpenMPI/1.10.3 | imkl/11.3.1.150 | imkl/11.3.1.150 | imkl/11.3.1.150 |
iompi/triton-2017a | icc/2017.1.132 | OpenMPI/2.0.1 | imkl/2017.1.132 | imkl/2017.1.132 | imkl/2017.1.132 |
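As a sketch, compiling and running an MPI program against one of these toolchains: the toolchain module name comes from the table above, while `hello_mpi.c` is a placeholder source file.

```bash
module load goolf/triton-2016b      # GCC 5.4.0 + OpenMPI 1.10.3 + OpenBLAS + ScaLAPACK + FFTW
mpicc -O2 -o hello_mpi hello_mpi.c  # mpicc is the MPI compiler wrapper around gcc

# Inside a batch job, launch one copy per allocated task:
srun ./hello_mpi
```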
Hardware¶
Node name | Number of nodes | Node type | Year | Arch (constraint) | CPU type | Memory Configuration | Infiniband | GPUs
---|---|---|---|---|---|---|---|---
ivb[1-24] | 24 | ProLiant SL230s G8 | 2014 | ivb avx | 2x10 core Xeon E5 2680 v2 2.80GHz | 256GB DDR3-1667 | FDR |
ivb[25-48] | 24 | ProLiant SL230s G8 | 2014 | ivb avx | 2x10 core Xeon E5 2680 v2 2.80GHz | 64GB DDR3-1667 | FDR |
pe[1-48,65-81] | 65 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 128GB DDR4-2133 | FDR |
pe[49-64,82] | 17 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 256GB DDR4-2133 | FDR |
pe[83-91] | 8 | Dell PowerEdge C4130 | 2017 | bdw avx avx2 | 2x14 core Xeon E5 2680 v4 2.40GHz | 128GB DDR4-2400 | FDR |
c[579-628,639-698] | 110 | ProLiant XL230a Gen9 | 2017 | hsw avx avx2 | 2x12 core Xeon E5 2690 v3 2.60GHz | 128GB DDR4-2666 | FDR |
c[629-638] | 10 | ProLiant XL230a Gen9 | 2017 | hsw avx avx2 | 2x12 core Xeon E5 2690 v3 2.60GHz | 256GB DDR4-2400 | FDR |
skl[1-48] | 48 | Dell PowerEdge C6420 | 2019 | skl avx avx2 avx512 | 2x20 core Xeon Gold 6148 2.40GHz | 192GB DDR4-2667 | EDR |
csl[1-48] | 48 | Dell PowerEdge C6420 | 2020 | csl avx avx2 avx512 | 2x20 core Xeon Gold 6248 2.50GHz | 192GB DDR4-2667 | EDR |
fn3 | 1 | Dell PowerEdge R940 | 2020 | avx avx2 avx512 | 4x20 core Xeon Gold 6148 2.40GHz | 2TB DDR4-2666 | EDR |
gpu[1-10] | 10 | Dell PowerEdge C4140 | 2020 | skl avx avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB
gpu[20-22] | 3 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 kepler | 2x6 core Xeon E5 2620 v3 2.50GHz | 128GB DDR4-2133 | EDR | 4x K80 (2 GPUs per card)
gpu[23-27] | 5 | Dell PowerEdge C4130 | 2017 | hsw avx avx2 pascal | 2x12 core Xeon E5-2680 v3 @ 2.5GHz | 256GB DDR4-2400 | EDR | 4x P100
dgx[01-02] | 2 | Nvidia DGX-1 | 2018 | bdw avx avx2 volta | 2x20 core Xeon E5-2698 v4 @ 2.2GHz | 512GB DDR4-2133 | EDR | 8x V100
gpu[28-37] | 10 | Dell PowerEdge C4140 | 2019 | skl avx avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB
Node type | CPU count
---|---
48GB Xeon Westmere (2012) | 1404
24GB Xeon Westmere + 2x GPU (2012) | 120
96GB Xeon Westmere (2012) | 288
1TB Xeon Westmere (2012) | 48
256GB Xeon Ivy Bridge (2014) | 480
64GB Xeon Ivy Bridge (2014) | 480
128GB Xeon Haswell (2016) | 1224
256GB Xeon Haswell (2016) | 360
128GB Xeon Haswell + 4x GPU (2016) | 36
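The `Arch (constraint)` column can be used with `--constraint` to pin a job to a particular CPU generation. The constraint names below come from the table above; the script name and time limit are illustrative.

```bash
# Require AVX-512 capable nodes (the skl/csl rows above)
sbatch --constraint=avx512 -t 04:00:00 my_job.sh

# Require specifically the Haswell nodes
sbatch --constraint=hsw my_job.sh
```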
GPUs¶
Card | total amount | nodes | architecture | compute threads per GPU | memory per card | CUDA compute capability | Slurm feature name | Slurm gres name
---|---|---|---|---|---|---|---|---
Tesla K80* | 12 | gpu[20-22] | Kepler | 2x2496 | 2x12GB | 3.7 | kepler |
Tesla P100 | 20 | gpu[23-27] | Pascal | 3584 | 16GB | 6.0 | pascal |
Tesla V100 | 40 | gpu[1-10] | Volta | 5120 | 32GB | 7.0 | volta |
Tesla V100 | 40 | gpu[28-37] | Volta | 5120 | 32GB | 7.0 | volta |
Tesla V100 | 16 | dgx[01-02] | Volta | 5120 | 16GB | 7.0 | volta |
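A GPU job sketch using the feature names above to pick a card generation; `my_gpu_program` is a placeholder executable and the resource values are illustrative.

```bash
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1             # one GPU
#SBATCH --constraint=volta       # restrict to V100 nodes (feature name from the table)
#SBATCH --time=02:00:00
#SBATCH --mem-per-cpu=4000

srun ./my_gpu_program            # placeholder executable
```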