Triton quick reference
This page collects the most important reference information.
Quick reference guide for the Triton cluster at Aalto University, but also useful for many other Slurm clusters. See also this printable Triton cheatsheet, as well as other cheatsheets.
Connecting
See also: Connecting to Triton.
Method |
Description |
From where? |
|---|---|---|
Standard way of connecting via command line. Hostname is
>Linux/Mac/Win from command line: >Windows: same, see Connecting via ssh for details options. |
VPN and Aalto networks (which is VPN, most wired,
internal servers, |
|
Use Aalto VPN and row above. If needed: same as above, but must set up SSH key and then |
Whole Internet, if you first set up SSH key AND also use passwords (since 2023) |
|
https://ondemand.triton.aalto.fi, Web-based interface to the cluster. Also known as OOD. Includes shell access, GUI, data transfer, Jupyter and a number of GUI applications like Matlab etc. More info. |
Whole internet |
|
Since April 2024, Jupyter is part of Open OnDemand, see above. Use the “Jupyter” app to get the same environment as before. More info. |
See Open OnDemand above |
|
With the “Remote-SSH” extension it can provide shell access and file transfer. See the VS Code page for some important usage warnings. |
Same as SSH options above, but connect to
|
|
They connect via SSH (see above), but read the AI coding agents page carefully for common problems. |
Same as SSH options above, but connect to
|
Modules
See also: Software modules.
Command / concept |
Description |
|---|---|
‘module’ |
A piece of software that can be made available. |
‘software stack’ |
Includes the compilers (etc.) needed for other modules. Must be loaded before other modules. |
|
Load module NAME, can combine multiple names. |
|
Load software stack module NAME. Makes other compiled software available. Generally, run |
|
List all modules available with current software stack. |
|
Search modules for PATTERN |
|
Show prerequisite modules (the software stack) for this one |
|
List currently loaded modules |
|
Details on a module |
|
Details on a module |
|
Unload a module |
|
Save module collection to this alias (saved in |
|
List all saved collections |
|
Details on a collection |
|
Load saved module collection (faster than loading individually) |
|
Unload all loaded modules (faster than unloading individually) |
Common software
See also: Applications.
Python:
- module load scicomp-python-env for the Aalto Scientific Computing managed Python environment with common packages. More info.
- module load mamba for mamba/conda for making your own environments (see below)
Note
For PyTorch users: scicomp-python-env only supports older GPUs (V100s). For newer GPUs (B300s), use the PyTorch-specific environment below.
PyTorch:
- module load scicomp-pytorch-env/2026.1 for the latest PyTorch environment optimized for newer GPUs (B300s) with common deep learning packages. More info.
R:
- module load r for a basic R package. More info.
- module load scicomp-r-env for an R module with various packages pre-installed
Matlab:
- module load matlab for the latest Matlab version. More info.
Storage
See also: Data storage
Name |
Path |
Quota |
Backup |
Sharing across |
Purpose |
|---|---|---|---|---|---|
Home |
|
hard quota 10GB |
Nightly |
all nodes |
Small user specific files, no calculation data. |
Work |
|
200GB and 1 million files |
x |
all nodes |
Personal working space for every user. Calculation data etc. Quota can be increased on request. |
Scratch |
|
on request |
x |
all nodes |
Department/group specific project directories. |
|
local disk size |
x |
single-node |
(Usually fastest) place for single-node calculation data. Removed once user’s jobs are finished on the node. Request with |
|
|
limited by memory |
x |
single-node |
Very fast but small in-memory filesystem |
Remote data access
See also: Remote access to data.
Method |
Description |
|---|---|
rsync transfers |
Transfer back and forth via command line. Set up ssh first.
|
SFTP transfers |
Operates over SSH. sftp://triton.aalto.fi in file browsers
(Linux at least), FileZilla (to |
SMB mounting |
Mount (make remote viewable locally) to your own computer. Linux: File browser, MacOS: File browser, same URL as Linux Windows: |
Quotas
See also: Quotas.
Read about data storage at Aalto and requesting storage space. We strongly recommend project-based storage instead of increasing personal quotas.
- quota - print your quota and usage
Finding what is using space
- The dust tool prints a nice tree of largest directories. Run module load dust, then dust $HOME on Triton. $HOME can be replaced with any other directory (or left off for the current directory).
- du -h $HOME/ | sort -h: Like above but works everywhere. Use du -ah to list all files.
- du -h --max-depth=1 $HOME/ | sort -h: Similar, but only list down to --max-depth levels.
- du --inodes --max-depth=1 $HOME/ | sort -n: Similar, but list the number of files in the directories.
- rm removes a single file, rm -r removes a whole directory tree. Warning: on scratch and Linux in general (unless backed up), there is no recovery from this! Think twice before you push enter. If you have any questions, come to a garage and get help.
- conda clean cleans up downloaded conda files (but not environments).
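The du patterns above can be tried safely on a throwaway directory before pointing them at $HOME. The following is a minimal sketch, assuming GNU coreutils (for --max-depth and --inodes); the directory and file names are hypothetical and exist only for the demonstration.

```shell
#!/bin/bash
# Build a small throwaway tree (hypothetical names, created under a temp dir).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/small" "$demo_dir/big"
touch "$demo_dir/small/one.txt"
touch "$demo_dir/big/a.txt" "$demo_dir/big/b.txt"
head -c 1048576 /dev/zero > "$demo_dir/big/data.bin"   # 1 MiB of zeros

# Size of each first-level directory, smallest first.
du -h --max-depth=1 "$demo_dir" | sort -h

# Number of files (inodes) per first-level directory, smallest first.
file_counts="$(du --inodes --max-depth=1 "$demo_dir" | sort -n)"
echo "$file_counts"

# Clean up. rm -r has no undo, so double-check the path before running it.
rm -r "$demo_dir"
```

In both listings the last line is the total for the top-level directory, so the largest subdirectory is the line just above it.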
Job submission
See also: Serial Jobs, Array jobs: embarrassingly parallel execution, Parallel computing: different methods explained.
Command |
Description |
|---|---|
|
submit a job to queue (see standard options below) |
|
Within a running job script/environment: Run code using the allocated resources (see options below) |
|
On frontend: submit to queue, wait until done, show output. (see options below) |
|
Submit job, wait, provide shell on node for interactive playing (X forwarding works, default partition interactive). Exit shell when done. (see options below) |
|
(advanced) Another way to run interactive jobs, no X forwarding but simpler. Exit shell when done. |
|
Cancel a job in queue |
|
(advanced) Allocate resources from frontend node. Use |
|
View/modify job and slurm configuration |
Command |
Option |
Description |
|---|---|---|
|
|
time limit |
|
time limit, days-hours |
|
|
job partition. Usually leave off and things are auto-detected. |
|
|
request N MB of memory per core |
|
|
request N MB memory per node |
|
|
Allocate n CPUs for each task. For multithreaded jobs. (Compare --ntasks: -c means the number of cores for each process started.) |
|
|
allocate minimum of N, maximum of M nodes. |
|
|
allocate resources for and start n tasks (one task = one process started; it is up to you to make them communicate. However, the main script runs only on the first node; the sub-processes run with “srun” are run this many times.) |
|
|
request a GPU, or |
|
|
request GPUs with at least |
|
|
request GPUs with CUDA compute capability of at least N.N. See above for combining with other GRES. |
|
|
short job name |
|
|
print output into file output |
|
|
print errors into file error |
|
|
allocate exclusive access to nodes. For large parallel jobs. |
|
|
request feature (see |
|
|
request a node with a local disk storage space and |
|
|
Run job multiple times, use variable |
|
|
notify of events: |
|
|
Aalto email address to send the job notifications to. External email addresses do not work. |
|
|
|
Print allocated nodes (from within script) |
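The days-hours time limit format in the table above is easy to get wrong when a limit is known only as a number of hours. The helper below is a small sketch, not part of Triton or Slurm: the function name slurm_time_from_hours is hypothetical, and it only produces the days-hours:minutes:seconds string that Slurm accepts for a time limit.

```shell
#!/bin/bash
# Hypothetical helper: convert a whole number of hours into
# Slurm's days-hours:minutes:seconds time limit format.
slurm_time_from_hours() {
    local hours="$1"
    printf '%d-%02d:00:00\n' "$((hours / 24))" "$((hours % 24))"
}

slurm_time_from_hours 36     # prints 1-12:00:00
slurm_time_from_hours 120    # prints 5-00:00:00
```

The result can be passed on the command line, for example sbatch --time="$(slurm_time_from_hours 36)" the_script.sh.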
Command |
Description |
|---|---|
|
Status of your queued jobs (long/short) |
|
Overview of partitions (A/I/O/T=active,idle,other,total) |
|
list free CPUs in a partition |
|
Show status of recent jobs |
|
Show percent of mem/CPU used in job. See Monitoring. |
|
Show GPU efficiency |
|
Job details (only while running) |
|
Show status of all jobs |
|
Full history information (advanced, needs args) |
Full slurm command help:
$ slurm
Show or watch job queue:
slurm [watch] queue show own jobs
slurm [watch] q show user's jobs
slurm [watch] quick show quick overview of own jobs
slurm [watch] shorter sort and compact entire queue by job size
slurm [watch] short sort and compact entire queue by priority
slurm [watch] full show everything
slurm [w] [q|qq|ss|s|f] shorthands for above!
slurm qos show job service classes
slurm top [queue|all] show summary of active users
Show detailed information about jobs:
slurm prio [all|short] show priority components
slurm j|job show everything else
slurm steps show memory usage of running srun job steps
Show usage and fair-share values from accounting database:
slurm h|history show jobs finished since, e.g. "1day" (default)
slurm shares
Show nodes and resources in the cluster:
slurm p|partitions all partitions
slurm n|nodes all cluster nodes
slurm c|cpus total cpu cores in use
slurm cpus cores available to partition, allocated and free
slurm cpus jobs cores/memory reserved by running jobs
slurm cpus queue cores/memory required by pending jobs
slurm features List features and GRES
Examples:
slurm q
slurm watch shorter
slurm cpus batch
slurm history 3hours
Other advanced commands (many require lots of parameters to be useful):
Command |
Description |
|---|---|
|
Full info on queues |
|
Advanced info on partitions |
|
List all nodes |
Slurm examples
See also: Serial Jobs, Array jobs: embarrassingly parallel execution.
Simple batch script, submit with sbatch the_script.sh:
#!/bin/bash -l
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1G
module load scicomp-python-env
python my_script.py
Simple batch script with array (can also submit with
sbatch --array=1-10 the_script.sh):
#!/bin/bash -l
#SBATCH --array=1-10
python my_script.py --seed=$SLURM_ARRAY_TASK_ID
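A common variation on the array script is to let each array task pick its own input file. The sketch below assumes a hypothetical naming scheme (input_1.txt, input_2.txt, ...) and falls back to task 1 when SLURM_ARRAY_TASK_ID is unset, so the script can be dry-run outside Slurm before submitting it.

```shell
#!/bin/bash -l
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=1G
#SBATCH --array=1-10

# Slurm sets SLURM_ARRAY_TASK_ID for each array task.
# The ":-1" default is only for testing the script outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per array index.
input_file="input_${task_id}.txt"

echo "Array task ${task_id} would process ${input_file}"
# The real work would go here, e.g.: python my_script.py "$input_file"
```

Running the script directly with bash exercises the fallback path; under sbatch each of the ten tasks sees a different index.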
Slurm Partitions
Partition |
Max job size |
Mem/core (GB) |
Tot mem (GB) |
Cores/node |
Limits |
Use |
|---|---|---|---|---|---|---|
<default> |
If you leave off all possible partitions will be used (based on time/mem) |
Use slurm partitions to see more details.
Hardware
See also: Cluster technical overview.
| Node name | Number of nodes | Node type | Year | Arch ( | CPU type | Memory Configuration | Infiniband | GPUs | Disks |
|---|---|---|---|---|---|---|---|---|---|
| pe[1-48,65-81] | 65 | Dell PowerEdge C4130 | 2016 | hsw avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 128GB DDR4-2133 | FDR | | 900GB HDD |
| pe[49-64,82] | 17 | Dell PowerEdge C4130 | 2016 | hsw avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 256GB DDR4-2133 | FDR | | 900GB HDD |
| pe[83-91] | 8 | Dell PowerEdge C4130 | 2017 | bdw avx2 | 2x14 core Xeon E5 2680 v4 2.40GHz | 128GB DDR4-2400 | FDR | | 900GB HDD |
| skl[1-48] | 48 | Dell PowerEdge C6420 | 2019 | skl avx2 avx512 | 2x20 core Xeon Gold 6148 2.40GHz | 192GB DDR4-2667 | EDR | | No disk |
| csl[1-48] | 48 | Dell PowerEdge C6420 | 2020 | csl avx2 avx512 | 2x20 core Xeon Gold 6248 2.50GHz | 192GB DDR4-2667 | EDR | | No disk |
| milan[1-32] | 32 | Dell PowerEdge C6525 | 2023 | milan avx2 | 2x64 core AMD EPYC 7713 @2.0 GHz | 512GB DDR4-3200 | HDR-100 | | No disk |
| fn3 | 1 | Dell PowerEdge R940 | 2020 | avx2 avx512 | 4x20 core Xeon Gold 6148 2.40GHz | 2TB DDR4-2666 | EDR | | No disk |
| gpu[1-10] | 10 | Dell PowerEdge C4140 | 2020 | skl avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB | 1.5 TB SSD |
| gpu[11-17,38-44] | 14 | Dell PowerEdge XE8545 | 2021, 2023 | milan avx2 ampere a100 | 2x24 core AMD EPYC 7413 @ 2.65GHz | 503GB DDR4-3200 | EDR | 4x A100 80GB | 440 GB SSD |
| gpu[28-37] | 10 | Dell PowerEdge C4140 | 2019 | skl avx2 avx512 volta | 2x8 core Intel Xeon Gold 6134 @ 3.2GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB | 1.5 TB SSD |
| dgx[1-2,8-10,15-27] | 22 | Nvidia DGX-1 | 2018, 2025 | bdw avx2 volta | 2x20 core Xeon E5-2698 v4 @ 2.2GHz | 512GB DDR4-2133 | EDR | 8x V100 16GB | 7 TB SSD |
| dgx[3,5-7] | 4 | Nvidia DGX-1 | 2018 | bdw avx2 volta | 2x20 core Xeon E5-2698 v4 @ 2.2GHz | 512GB DDR4-2133 | EDR | 8x V100 32GB | 7 TB SSD |
| gpuamd1 | 1 | Dell PowerEdge R7525 | 2021 | rome avx2 mi100 | 2x8 core AMD EPYC 7262 @3.2GHz | 250GB DDR4-3200 1 | EDR | | 32GB SSD |
| gpu[45-48] | 4 | Dell PowerEdge XE8640 | 2024 | saphr avx2 h100 hopper | 2x48 core Xeon Platinum 8468 2.1GHz | 1024GB DDR5-4800 | HDR | 4x H100 SXM 80GB | 21 TB SSD |
| gpu[49] | 1 | Dell PowerEdge XE9680 | 2025 | emerald avx2 h200 hopper | 2x32 core Xeon® Platinum 8562Y+ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x H200 SXM each split to 3x35GB | 20 TB SSD |
| gpu[50-63] | 14 | Dell PowerEdge XE9680 | 2025 | emerald avx2 h200 hopper | 2x32 core Xeon® Platinum 8562Y+ 2.8GHz | 2048GB DDR5-5600 | HDR | 8x H200 SXM 141GB | 20 TB SSD |
| gpu[64-65] | 2 | Dell PowerEdge XE9780 | 2026 | granite avx2 b300 blackwell | 2x64 core Xeon® 6767P 2.4GHz | 3072GB DDR5-6400 | NDR | 8x B300 SXM6 AC 288GB | 28 TB SSD |
| gpuarm[1-2] | 2 | Supermicro ARS-221GL-NHIR | 2026 | NVIDIA grace h200 hopper | 2x72 core Grace A02 Neoverse-V2 3.4GHz | 1318GB LPDDR5-6400 | NDR | 2x H200 H200 SXM 144GB HBM3e | 21 TB SSD |
GPUs
See also: GPU computing.
GPU brand name |
GPU name in Slurm ( |
VRAM GB ( |
CUDA compute capability ( |
total amount |
nodes |
GPUs per node |
Compute threads per GPU |
Slurm partition ( |
|---|---|---|---|---|---|---|---|---|
NVIDIA B300(*) |
|
|
10.3 ( |
16 |
gpu[64-65] |
8 |
18944 |
|
NVIDIA H200(*) |
|
|
9.0 ( |
112 |
gpu[50-63] |
8 |
16896 |
|
NVIDIA H200(**) |
|
|
9.0 ( |
24 |
gpu[49] |
24 |
4224 |
|
NVIDIA Grace-H200(+) |
|
|
9.0 ( |
4 |
gpuarm[1-2] |
2 |
16896 |
|
NVIDIA H100 |
|
|
9.0 ( |
16 |
gpu[45-48] |
4 |
16896 |
|
NVIDIA A100 |
|
|
8.0 ( |
56 |
gpu[11-17,38-44] |
4 |
7936 |
|
NVIDIA V100 |
|
|
7.0 ( |
40 |
gpu[28-37] |
4 |
5120 |
|
NVIDIA V100 |
|
|
7.0 ( |
40 |
gpu[1-10] |
4 |
5120 |
|
NVIDIA V100 |
|
|
7.0 ( |
32 |
dgx[3,5-7] |
8 |
5120 |
|
NVIDIA V100 |
|
|
7.0 ( |
176 |
dgx[1-2,8-27] |
8 |
5120 |
|
AMD MI210 |
|
|
2 |
gpuamd[1] |
2 |
7680 |
|
|
AMD MI100 |
|
|
1 |
gpuamd[1] |
1 |
6656 |
|
Since 2025, the main way to request certain types of GPUs is with
--gres, for example --gpus=1 --gres=min-vram:32. Only one
--gres option is allowed, so to combine gres, use a comma
separated list: --gres=gpu-vram:32g,min-cuda-cc:80.
Running CPU-only workloads on GPU partitions is not permitted.
If you encounter an error along the lines of
srun: error: Unable to allocate resources: Job violates accounting/QOS policy
(job submit limit, user's size and/or time limits), one possible
cause is that you requested a GPU partition but the script is missing --gpus=n.
(*) These GPUs have a priority queue for the Ellis project, since they were procured for this project. Any job submitted to the short queue might be preempted if a job requiring the resources comes in from the Ellis queue.
(**) These GPUs are split from a single GPU with NVIDIA’s Multi-Instance GPU (MIG) feature.
(+) These computers have the Nvidia Grace CPU, an ARM-based CPU. Normal software compiled for x86 does not run on these nodes. See Grace Hopper Super Chips.
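To tie the notes above together, here is a minimal sketch of a GPU batch script. It uses only --gpus=1 from the text above and leaves the partition to be auto-detected. Selecting a GPU type via --gres is left out on purpose, since the exact gres names should be taken from the GPU computing page. The time and memory values are placeholder assumptions, and the nvidia-smi check is only a convenience for confirming that a GPU is visible.

```shell
#!/bin/bash -l
#SBATCH --time=00:15:00
#SBATCH --mem-per-cpu=4G
#SBATCH --gpus=1

# Report which GPU (if any) is visible to this job.
# nvidia-smi -L lists the GPUs; it is only present on nodes with NVIDIA drivers.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_report="$(nvidia-smi -L 2>&1 || true)"
else
    gpu_report="nvidia-smi not found; this does not look like a GPU node"
fi
echo "$gpu_report"
```

Forgetting the --gpus line is one of the causes of the accounting/QOS policy error quoted above.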
Conda Environments (Mamba)
See also: Conda. Note that mamba is a
drop-in replacement for conda.
Command |
Description |
|---|---|
|
Load the module that provides conda/mamba on Triton, for making
your own environments. |
See link for six commands to run once per user account on Triton (to avoid filling up all space on your home directory). |
|
name: conda-example
channels:
- conda-forge
dependencies:
- numpy
- pandas
|
Minimal |
Environment management: |
Creating, activating, removing: |
|
Create environment from yaml file. Use |
|
Activate environment of name NAME. Note we use this and not
|
|
Deactivate conda from this session. HPC Cluster specific. |
|
List all environments. |
|
Remove the environment of that name. |
Package management: |
Inside the activated environment |
|
List packages in currently active environment. |
|
Update an environment based on updated environment.yml |
|
Install packages in an environment with minimal changes to what is already installed. If the package is a dependency of your project, it is usually better to add it to environment.yml and update the environment as in the previous line. |
|
Export an environment.yml that describes the current
environment. Add |
|
Search for a package. List includes name, version, build version (often including linked libraries like Python/CUDA), and channel. |
Other notes: |
|
|
Use |
|
Clean up cached files to free up space (not environments or packages in them). |
|
Used when making CUDA environment on login node (choose right
CUDA version for you). Used with |
Channel Package selection |
Package selection for tensorflow. The first |
Channels Package selection |
Package selection for pytorch. The first |
CUDA |
In channel conda-forge, automatically selected based on
software you need. For manual compilation, package
|
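As a sketch of the workflow in the table above, the minimal environment.yml can be written from the shell and then handed to mamba. The block below only writes the file into a temporary directory (a choice made here for the demonstration) and leaves the creation command as a comment, since mamba env create is only available after the mamba module has been loaded.

```shell
#!/bin/bash
# Write the minimal environment file into a scratch directory.
env_dir="$(mktemp -d)"
cat > "$env_dir/environment.yml" <<'EOF'
name: conda-example
channels:
  - conda-forge
dependencies:
  - numpy
  - pandas
EOF

cat "$env_dir/environment.yml"

# On Triton, after "module load mamba", the environment could be created with:
#   mamba env create -f "$env_dir/environment.yml"
```

Keeping the file under version control next to the project makes the environment reproducible on other machines.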
Command line
See also: Linux shell crash course.
- General notes
The command line has many small programs that, when connected, allow you to do many things. Only a little bit of this is shown here.
Programs are generally silent if everything worked, and only print an error if something goes wrong.
ls [DIR]: List current directory (or DIR if given).
pwd: Print current directory.
cd DIR: Change directory. .. is the parent directory, / is the root, and / also chains directories, e.g. dir1/dir2 or ../../
nano FILE: Edit a file (there are many other editors, but nano is common, nice, and simple).
mkdir DIR-NAME: Make a new directory.
cat FILE: Print entire contents of file to standard output (the terminal).
less FILE: Less is a “pager”, and lets you scroll through a file (up/down/pageup/pagedown). q to quit, / to search.
mv SOURCE DEST: Move (=rename) a file. mv SOURCE1 SOURCE2 DEST-DIRECTORY/ moves multiple files to a directory.
cp SOURCE DEST: Copy a file. The DEST-DIRECTORY/ syntax of mv works as well.
rm FILE ...: Remove a file. Note, from the command line there is no recovery, so always pause and check before running this command! The -i option will make it confirm before removing each file. Add -r to remove whole directories recursively.
head [FILE]: Print the first 10 lines (or N lines with -n N) of a file. Can take input from standard input instead of FILE. tail is similar but prints the end of the file.
tail [FILE]: See above.
grep PATTERN [FILE]: Print lines matching a pattern in a file, suitable as a primitive find feature, or for quickly searching output. Can also use standard input instead of FILE.
du [-ash] [DIR]: Print disk usage of a directory. Default is KiB, rounded up to block sizes (1 or 4 KiB), -h means “human readable” (MB, GB, etc), -s means “only of DIR, not all subdirectories also”. -a means “all files, not only directories”. A common pattern is du -h DIR | sort -h to print all directories and their sizes, sorted by size.
stat: Show detailed information on a file’s properties.
find [DIR]: find can do almost anything, but that means it’s really hard to use it well. Let’s be practical: with only a directory argument, it prints all files and directories recursively, which might be useful itself. Many of us do find DIR | grep NAME to grep for the name we want (even though this isn’t the “right way”, there are find options which do this same thing more efficiently).
| (pipe): COMMAND1 | COMMAND2: The output of COMMAND1 is sent to the input of COMMAND2. Useful for combining simple commands together into complex operations - a core part of the unix philosophy.
> (output redirection): COMMAND > FILE: Write standard output of COMMAND to FILE. Any existing content is lost.
>> (appending output redirection): COMMAND >> FILE: Like above, but doesn’t lose content: it appends.
< (input redirection): COMMAND < FILE: Opposite of >, input to COMMAND comes from FILE.
type COMMAND or which COMMAND: Show exactly what will be run, for a given command (e.g. type python3).
man COMMAND-NAME: Browse on-line help for a command. q will exit, / will search (it uses less as its pager by default).
-h and --help: Common command line options to print help on a command. But, it has to be implemented by each command.
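The pipe and redirection operators described above are easiest to understand by running them once on a scratch file. The following is a minimal sketch; the file name notes.txt and its contents are made up for the demonstration, and everything happens inside a temporary directory.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing important is overwritten.
notes_dir="$(mktemp -d)"
notes_file="$notes_dir/notes.txt"

# > creates the file (or replaces its content); >> appends to it.
printf 'alpha\nbeta\n' > "$notes_file"
printf 'gamma\nalphabet\n' >> "$notes_file"

# < feeds the file to the command's standard input.
line_count="$(wc -l < "$notes_file")"

# | sends the output of grep to the input of wc.
match_count="$(grep alpha "$notes_file" | wc -l)"

echo "lines: $line_count, lines containing 'alpha': $match_count"

# head and tail pick out the first and last lines.
first_line="$(head -n 1 "$notes_file")"
last_line="$(tail -n 1 "$notes_file")"
echo "first: $first_line, last: $last_line"
```

Replacing the second >> with > would leave only the last two lines in the file, which is a quick way to see why > needs care.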