Cluster overview

Shared resource

Triton is a joint installation of several Aalto University School of Science departments within the Science-IT project, which was founded in 2009 to provide HPC infrastructure across the School of Science. It is now available to all Aalto researchers.

As of 2016, Triton is part of FGCI, the Finnish Grid and Cloud Infrastructure (successor of the Finnish Grid Infrastructure). Through this national grid and cloud infrastructure, Triton is also part of the European Grid Infrastructure.

Hardware

| Node name          | Nodes | Node type            | Year | Arch (constraint)         | CPU type                          | Memory          | InfiniBand | GPUs                 |
|--------------------|-------|----------------------|------|---------------------------|-----------------------------------|-----------------|------------|----------------------|
| ivb[1-24]          | 24    | ProLiant SL230s G8   |      | ivb avx                   | 2x10 core Xeon E5-2680 v2 2.80GHz | 256GB DDR3-1667 | FDR        |                      |
| ivb[25-48]         | 24    | ProLiant SL230s G8   |      | ivb avx                   | 2x10 core Xeon E5-2680 v2 2.80GHz | 64GB DDR3-1667  | FDR        |                      |
| pe[1-48,65-81]     | 65    | Dell PowerEdge C4130 | 2016 | hsw avx avx2              | 2x12 core Xeon E5-2680 v3 2.50GHz | 128GB DDR4-2133 | FDR        |                      |
| pe[49-64,82]       | 17    | Dell PowerEdge C4130 | 2016 | hsw avx avx2              | 2x12 core Xeon E5-2680 v3 2.50GHz | 256GB DDR4-2133 | FDR        |                      |
| pe[83-91]          | 8     | Dell PowerEdge C4130 | 2017 | bdw avx avx2              | 2x14 core Xeon E5-2680 v4 2.40GHz | 128GB DDR4-2400 | FDR        |                      |
| c[579-628,639-698] | 110   | ProLiant XL230a Gen9 | 2017 | hsw avx avx2              | 2x12 core Xeon E5-2690 v3 2.60GHz | 128GB DDR4-2666 | FDR        |                      |
| c[629-638]         | 10    | ProLiant XL230a Gen9 | 2017 | hsw avx avx2              | 2x12 core Xeon E5-2690 v3 2.60GHz | 256GB DDR4-2400 | FDR        |                      |
| skl[1-48]          | 48    | Dell PowerEdge C6420 | 2019 | skl avx avx2 avx512       | 2x20 core Xeon Gold 6148 2.40GHz  | 192GB DDR4-2667 | EDR        |                      |
| csl[1-48]          | 48    | Dell PowerEdge C6420 | 2020 | csl avx avx2 avx512       | 2x20 core Xeon Gold 6248 2.50GHz  | 192GB DDR4-2667 | EDR        |                      |
| fn3                | 1     | Dell PowerEdge R940  | 2020 | avx avx2 avx512           | 4x20 core Xeon Gold 6148 2.40GHz  | 2TB DDR4-2666   | EDR        |                      |
| gpu[1-10]          | 10    | Dell PowerEdge C4140 | 2020 | skl avx avx2 avx512 volta | 2x8 core Xeon Gold 6134 3.20GHz   | 384GB DDR4-2667 | EDR        | 4x V100 32GB         |
| gpu[20-22]         | 3     | Dell PowerEdge C4130 | 2016 | hsw avx avx2 kepler       | 2x6 core Xeon E5-2620 v3 2.50GHz  | 128GB DDR4-2133 | EDR        | 4x K80 (2 GPUs each) |
| gpu[23-27]         | 5     | Dell PowerEdge C4130 | 2017 | hsw avx avx2 pascal       | 2x12 core Xeon E5-2680 v3 2.50GHz | 256GB DDR4-2400 | EDR        | 4x P100              |
| dgx[01-02]         | 2     | Nvidia DGX-1         | 2018 | bdw avx avx2 volta        | 2x20 core Xeon E5-2698 v4 2.20GHz | 512GB DDR4-2133 | EDR        | 8x V100              |
| gpu[28-37]         | 10    | Dell PowerEdge C4140 | 2019 | skl avx avx2 avx512 volta | 2x8 core Xeon Gold 6134 3.20GHz   | 384GB DDR4-2667 | EDR        | 4x V100 32GB         |
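The tags in the "Arch (constraint)" column can be used to restrict a job to a particular node type. Below is a minimal sketch of a batch script, assuming these tags are exposed as Slurm node features (the usual setup for this kind of column); the resource numbers and the program name are placeholders:

    #!/bin/bash
    #SBATCH --time=00:30:00          # 30 minutes of run time
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=4G
    #SBATCH --constraint=avx512      # only avx512-capable nodes (skl/csl rows above)

    srun ./my_program                # placeholder for the actual application

A GPU job would additionally request a GPU, for example with --gres=gpu:1, and could combine it with --constraint=volta to land on one of the V100 nodes.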

All Triton compute nodes are identical with respect to software and access to the common file systems. Each node has its own unique hostname and IP address.

Networking

The cluster has two internal networks: InfiniBand for MPI and the Lustre filesystem, and Gigabit Ethernet for everything else, such as the NFS-mounted /home directories and SSH.

The internal networks are not accessible from outside. Only the login node, triton.aalto.fi, has an additional Ethernet connection to the outside world.
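In practice, all interactive access therefore goes through the login node, for example (USERNAME is a placeholder for your Aalto account):

    # connect to the login node from outside the cluster
    ssh USERNAME@triton.aalto.fi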

The high-performance InfiniBand network has, in general, a fat-tree configuration. Triton has several InfiniBand segments (often called islands), distinguished by CPU architecture. The nodes within an island are connected with different blocking ratios such as 2:1, 4:1 or 8:1 (i.e. in the 4:1 case, for every 4 downlinks there is 1 uplink to the spine switches). The islands are ivb[1-45] with 540 cores, pe[3-91] with 2152 cores (keep in mind that pe[83-91] have 28 cores per node), four c[xxx-xxx] segments with 600 cores each, and skl[1-48] and csl[1-48] with 1920 cores each [CHECKME]. Uplinks from those islands are mainly used for Lustre traffic. Running MPI jobs is possible within an entire island or a segment of it, but not across the whole cluster; one way to keep a job inside a single island is shown in the sketch below.
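A simple way to confine an MPI job to one island is to pin it to a single CPU architecture with a constraint. A sketch, assuming a standard Slurm setup; the module name and executable are placeholders:

    #!/bin/bash
    #SBATCH --nodes=4
    #SBATCH --ntasks-per-node=40     # csl nodes have 2x20 cores
    #SBATCH --time=01:00:00
    #SBATCH --constraint=csl         # all allocated nodes come from the csl island

    module load openmpi              # placeholder module name
    srun ./mpi_app                   # placeholder MPI executable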

Disk arrays

All compute nodes and the front end are connected to a DDN SFA12k storage system: large disk arrays with the Lustre filesystem on top of them, cross-mounted under the /scratch directory. The system provides about 1.8 PB of disk space to end users.
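Because /scratch is a Lustre filesystem, the standard Lustre client tools can be used on it; for example, to check your usage and the overall free space (assuming quotas are tracked per user, which may differ from the actual site policy):

    # per-user usage and quota on the Lustre /scratch filesystem
    lfs quota -h -u $USER /scratch

    # free space per Lustre storage target
    lfs df -h /scratch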

Software

The cluster runs an open-source software stack: CentOS 7, with Slurm as the scheduler and batch system.
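The node features and GPU resources listed in the hardware table above are visible to Slurm, so they can also be queried directly on the cluster; for example (the output columns here are just one possible format):

    # list node names, CPUs, memory, feature tags and generic resources (GPUs)
    sinfo --Node -o "%20N %8c %10m %30f %15G"

    # full details of a single node, e.g. one of the V100 nodes
    scontrol show node gpu1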