Cluster overview¶
Hardware¶
Node name | Number of nodes | Node type | Year | Arch (constraint) | CPU type | Memory configuration | Infiniband | GPUs
---|---|---|---|---|---|---|---|---
ivb[1-24] | 24 | ProLiant SL230s G8 | | ivb avx | 2x10 core Xeon E5 2680 v2 2.80GHz | 256GB DDR3-1667 | FDR |
ivb[25-48] | 24 | ProLiant SL230s G8 | | ivb avx | 2x10 core Xeon E5 2680 v2 2.80GHz | 64GB DDR3-1667 | FDR |
pe[1-48,65-81] | 65 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 128GB DDR4-2133 | FDR |
pe[49-64,82] | 17 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 | 2x12 core Xeon E5 2680 v3 2.50GHz | 256GB DDR4-2133 | FDR |
pe[83-91] | 8 | Dell PowerEdge C4130 | 2017 | bdw avx avx2 | 2x14 core Xeon E5 2680 v4 2.40GHz | 128GB DDR4-2400 | FDR |
c[579-628,639-698] | 110 | ProLiant XL230a Gen9 | 2017 | hsw avx avx2 | 2x12 core Xeon E5 2690 v3 2.60GHz | 128GB DDR4-2666 | FDR |
c[629-638] | 10 | ProLiant XL230a Gen9 | 2017 | hsw avx avx2 | 2x12 core Xeon E5 2690 v3 2.60GHz | 256GB DDR4-2400 | FDR |
skl[1-48] | 48 | Dell PowerEdge C6420 | 2019 | skl avx avx2 avx512 | 2x20 core Xeon Gold 6148 2.40GHz | 192GB DDR4-2667 | EDR |
csl[1-48] | 48 | Dell PowerEdge C6420 | 2020 | csl avx avx2 avx512 | 2x20 core Xeon Gold 6248 2.50GHz | 192GB DDR4-2667 | EDR |
fn3 | 1 | Dell PowerEdge R940 | 2020 | avx avx2 avx512 | 4x20 core Xeon Gold 6148 2.40GHz | 2TB DDR4-2666 | EDR |
gpu[1-10] | 10 | Dell PowerEdge C4140 | 2020 | skl avx avx2 avx512 volta | 2x8 core Xeon Gold 6134 3.20GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB
gpu[20-22] | 3 | Dell PowerEdge C4130 | 2016 | hsw avx avx2 kepler | 2x6 core Xeon E5 2620 v3 2.50GHz | 128GB DDR4-2133 | EDR | 4x2 GPU K80
gpu[23-27] | 5 | Dell PowerEdge C4130 | 2017 | hsw avx avx2 pascal | 2x12 core Xeon E5 2680 v3 2.50GHz | 256GB DDR4-2400 | EDR | 4x P100
dgx[01-02] | 2 | Nvidia DGX-1 | 2018 | bdw avx avx2 volta | 2x20 core Xeon E5 2698 v4 2.20GHz | 512GB DDR4-2133 | EDR | 8x V100
gpu[28-37] | 10 | Dell PowerEdge C4140 | 2019 | skl avx avx2 avx512 volta | 2x8 core Xeon Gold 6134 3.20GHz | 384GB DDR4-2667 | EDR | 4x V100 32GB
All Triton compute nodes are identical with respect to software and access to the common file systems. Each node has its own unique host name and IP address.
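The values in the "Arch (constraint)" column can be used as Slurm feature constraints to request a particular CPU generation or instruction set. A minimal sketch (the resource amounts and the program name are placeholders, not taken from this page):

```bash
#!/bin/bash
# Request any node that advertises the avx512 feature
# (e.g. the skl and csl rows in the table above).
#SBATCH --constraint=avx512
#SBATCH --nodes=1
#SBATCH --time=00:10:00

srun hostname   # replace with your actual program
```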
Networking¶
The cluster has two internal networks: InfiniBand for MPI and the Lustre
filesystem, and Gigabit Ethernet for everything else, such as the NFS-mounted
/home directories and ssh.
The internal networks are inaccessible from outside. Only the login node
triton.aalto.fi
has an additional Ethernet connection to the outside world.
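Since only the login node is reachable from outside, all access goes through it; for example (USERNAME is a placeholder for your own account):

```bash
# Log in to the externally reachable login node; compute nodes are only
# reachable from inside the cluster's internal networks.
ssh USERNAME@triton.aalto.fi
```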
The high-performance InfiniBand network generally has a fat-tree configuration. Triton
has several InfiniBand segments (often called islands), distinguished by
CPU architecture. The nodes within those islands are connected with different
blocking ratios such as 2:1, 4:1 or 8:1 (i.e. in the 4:1 case, for every 4
downlinks there is 1 uplink to the spine switches). The islands are
ivb[1-45]
with 540 cores, pe[3-91]
with 2152 cores
(keep in mind that pe[83-91]
have 28 cores per node), four c[xxx-xxx]
segments
with 600 cores each, and skl[1-48] and csl[1-48] with 1920 cores each [CHECKME]. Uplinks from
those islands are mainly used for Lustre communication.
Running MPI jobs is possible on an entire island or a segment of it, but not
across the whole cluster.
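In practice, an MPI job can be kept on a single island by constraining it to one CPU architecture. A hedged sketch of such a batch script (node and task counts and the program name are placeholders):

```bash
#!/bin/bash
# Keep all ranks on the csl island so MPI traffic stays within one
# InfiniBand segment; the constraint name comes from the hardware table.
#SBATCH --constraint=csl
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=40   # csl nodes have 2x20 cores
#SBATCH --time=01:00:00

srun ./my_mpi_program   # ./my_mpi_program is a placeholder for your binary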
Disk arrays¶
All compute nodes and the front-end are connected to a DDN SFA12k storage
system:
large disk arrays with the Lustre filesystem on top, cross-mounted
under the /scratch
directory. The system provides about 1.8PB of disk
space available to end users.
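Standard Lustre client tools can be used to check the space on /scratch; for example (MYGROUP is a placeholder for your group name):

```bash
# Show overall capacity and usage of the Lustre filesystem mounted at /scratch.
lfs df -h /scratch

# Show the quota of a group on /scratch.
lfs quota -hg MYGROUP /scratch
```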
Software¶
The cluster runs an open-source software infrastructure: CentOS 7, with Slurm as the scheduler and batch system.
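With Slurm as the scheduler, the usual commands apply; a quick sketch (USERNAME is a placeholder):

```bash
# List partitions and node states known to Slurm.
sinfo

# Show your own queued and running jobs.
squeue -u USERNAME
```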