Data

Data connects most research together. It’s easy to produce in the short term, but in the long term it can become so chaotic that it loses its value. Can you access your research group’s data from 5 years ago and use it?

Summary for Triton data

If you use Triton or go beyond your own device/simple cloud services, you have the following basic options:

| Location | Shared | Size | Backups | When you leave | Usage | Path |
|---|---|---|---|---|---|---|
| Triton Work | no | 100s GB+ | no | deleted | Small projects, testing work, or getting started. Switch to a Triton project directory when a project becomes large. | /scratch/work/USER |
| Triton project | yes | No limit | no | stays | The most recommended place for large data. Make sure you back up irreplaceable data (see below). | /scratch/DEPT/PROJECT/ (needs to be requested as described here) |
| (Aalto/Dept) Project directory | yes | GBs-TBs | yes | stays | Recommended for keeping a copy of original or irreplaceable data. Only available on login nodes, so you also need a Scratch/Work directory to copy the data to. | /m/DEPT/PROJECT/ |
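Since the Triton directories are not backed up, irreplaceable data needs a second copy in a backed-up location such as the department project directory. The following is a minimal sketch of one way to make and verify such a copy, assuming Python 3.9+ is available on a login node. The function names `sha256_of` and `backup_tree` and the `raw` subdirectory in the commented call are hypothetical, and `DEPT` and `PROJECT` are placeholders you need to fill in yourself. Tools such as `rsync` can do the same job.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(source: Path, destination: Path) -> list[Path]:
    """Copy source into destination and return source files that failed verification."""
    # dirs_exist_ok=True allows re-running the copy onto an existing backup.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    mismatched = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = destination / src_file.relative_to(source)
        # A file counts as verified only if the copy exists and the checksums match.
        if not dst_file.is_file() or sha256_of(src_file) != sha256_of(dst_file):
            mismatched.append(src_file)
    return mismatched


# Example call (hypothetical paths; replace DEPT and PROJECT with your own values):
# failures = backup_tree(Path("/scratch/DEPT/PROJECT/raw"), Path("/m/DEPT/PROJECT/raw"))
# print("all files verified" if not failures else f"mismatched: {failures}")
```

The checksum pass reads every file twice, which is slow for very large datasets but gives direct evidence that the backup matches the original.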

You really should talk to your group leader about data storage locations. Data is important, and your group leader needs to know where all of the group’s data is stored in order to coordinate its use.

Data storage in Aalto

This section covers general data storage at Aalto. These references aren’t focused on Triton or scientific computing data, although both are mentioned.

Data in Science-IT departments (CS, NBE, PHYS)

Common practices in our departments.

Getting space:

More details:

Data on Triton

Triton is a computer cluster that provides large and fast data storage connected to significant computing power, but it is not backed up.

Data management

This section covers administrative and organizational matters about data.

Other

Summary table

This is a broad summary of many of the locations mentioned above.

O = good, x = bad

Requirements table

Different data has different needs.

Large

Fast

Confidential

Frequent backups

Long-term archival

Shareable

Code

OO

OO

O

Original data

O

O

OO?

OO

OO

O

Intermediate files

OO

OO

OO?

Final results/open data

OO

OO

Storage location table

Large

Fast

Confidential

Backups

Long-term archival

Shareable

Triton

Triton project

OO

OO*

O

x

x

O

work

OO

OO*

O

x

x

Triton home

x

O

OO

Local disks

O

OO

O

ramfs

OOO

OO

Depts

/m/…/project

O

O

OO

OO

O

/m/…/archive

O

O

OO

OO

O

O

Aalto

Aalto home

OO

OO

Aalto work

O

O

OO

OO

O

Aalto teamwork

O

O

OO

OO

O

Aalto laptops

x

x

x

Aalto webspace

OO

version.aalto.fi

OO

OO

O

OO

ACRIS

O

O

Eduuni

Aalto Wiki

Finland

Funet filesender

O

OO

CSC cPouta

O

O

O

CSC Ida

OOO

x

OO

O

O

FSD

OO

O

OO

O

Public

github

x

OO

Zenodo

OO

OO

Google drive

x

O

OneDrive

Own computers

x

x

x

Emails

x

x

x

EUDAT B2SHARE

O

O

O

(*) For details, see the description of the Lustre file system and the issue of small files on Triton.
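Parallel file systems such as Lustre generally handle a few large files much better than very many small ones. This page does not describe a specific remedy, so the following is only a sketch of one common mitigation: bundling a directory of small files into a single archive before storing it on scratch. The function name `pack_small_files` is hypothetical, and the archive should be written outside the directory being packed.

```python
import tarfile
from pathlib import Path


def pack_small_files(directory: Path, archive: Path) -> int:
    """Pack every regular file under directory into one tar archive; return the file count."""
    count = 0
    # Mode "w" writes an uncompressed archive; "w:gz" would gzip-compress it.
    with tarfile.open(archive, "w") as tar:
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                # Store paths relative to the directory so the archive unpacks cleanly.
                tar.add(path, arcname=str(path.relative_to(directory)))
                count += 1
    return count
```

Reading the data back then means opening one file with `tarfile.open(archive, "r")` instead of touching thousands of separate files.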