Data
Data connects most research together. It's easy to produce in the short term, but in the long term it can become so chaotic that it loses its value. Can you access your research group's data from 5 years ago and use it?
Summary for Triton data
If you use Triton or go beyond your own device/simple cloud services, you have the following basic options:
| Location | Shared | Size / backups | When you leave | Usage |
|---|---|---|---|---|
| Triton Work | no | 100s GB+ / no | deleted | Small projects, testing work, or getting started. Switch to a Triton project directory when a project becomes large. |
| Triton project | yes | No limit / no | stays | The most recommended place for large data. Make sure you back up irreplaceable data (see below). |
| (Aalto/Dept) Project directory | yes | GBs-TBs / yes | stays | Recommended place to keep a copy of original or irreplaceable data. Only available on login nodes, so you also need a Scratch/Work directory to copy the data to. |
Talk to your group leader about data storage locations. Data is important, and your group leader needs to know where all of the group's data is stored in order to coordinate its use.
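The table above suggests keeping original or irreplaceable data in a backed-up project directory and working on a copy in a Scratch/Work directory. A minimal sketch of making such a copy and checking it with checksums might look like the following. The function names and the example paths are hypothetical and only for illustration; in practice a tool such as `rsync` may be more convenient for large trees.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> list[Path]:
    """Copy a directory tree, then return the relative paths whose checksums differ."""
    # dirs_exist_ok=True lets the copy be repeated into an existing directory.
    shutil.copytree(source, destination, dirs_exist_ok=True)
    mismatched = []
    for original in sorted(source.rglob("*")):
        if original.is_file():
            relative = original.relative_to(source)
            if sha256_of(original) != sha256_of(destination / relative):
                mismatched.append(relative)
    return mismatched


# Hypothetical usage; replace both paths with your own directories:
# problems = copy_and_verify(Path("/path/to/project/raw"), Path("/path/to/work/raw"))
# if problems:
#     print("Files that did not copy correctly:", problems)
```

An empty list means every file in the source has an identical copy at the destination. The check reads each file twice, so it is slow for very large datasets.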
Data storage in Aalto
This section is for general data in Aalto. These references aren't focused on Triton and scientific computing data, but they do mention them.
Quick summary: What file storage to use?
Teamwork storage space (Aalto project directories)
Data in Science-IT departments (CS, NBE, PHYS)
Common practices in our departments.
Getting space:
More details:
Data on Triton
Triton is a computer cluster that provides large and fast data storage connected to significant computing power, but it is not backed up.
Tutorial: Data storage
Tutorial: Remote access to data
Triton quick reference: Storage and Remote data access
Overview with checklist: Storage
Data management
This section covers administrative and organizational matters about data.
See also the Aalto Research Data Management pages; here we focus on the practical side of things.
Other
Summary table
This is a broad summary of many of the locations mentioned above.
O = good, x = bad
Requirements table
Different data has different needs.
Large |
Fast |
Confidential |
Frequent backups |
Long-term archival |
Shareable |
|
|---|---|---|---|---|---|---|
Code |
OO |
OO |
O |
|||
Original data |
O |
O |
OO? |
OO |
OO |
O |
Intermediate files |
OO |
OO |
OO? |
|||
Final results/open data |
OO |
OO |
Storage location table
Large |
Fast |
Confidential |
Backups |
Long-term archival |
Shareable |
||
|---|---|---|---|---|---|---|---|
Triton |
OO |
OO* |
O |
x |
x |
O |
|
OO |
OO* |
O |
x |
x |
|||
x |
O |
OO |
|||||
O |
OO |
O |
|||||
OOO |
OO |
||||||
Depts |
/m/…/project |
O |
O |
OO |
OO |
O |
|
/m/…/archive |
O |
O |
OO |
OO |
O |
O |
|
Aalto |
Aalto home |
OO |
OO |
||||
Aalto work |
O |
O |
OO |
OO |
O |
||
Aalto teamwork |
O |
O |
OO |
OO |
O |
||
Aalto laptops |
x |
x |
X |
||||
Aalto webspace |
OO |
||||||
version.aalto.fi |
OO |
OO |
O |
OO |
|||
ACRIS |
O |
O |
|||||
Eduuni |
|||||||
Aalto Wiki |
|||||||
Finland |
Funet filesender |
O |
OO |
||||
CSC cPouta |
O |
O |
O |
||||
CSC Ida |
OOO |
x |
OO |
O |
O |
||
FSD |
OO |
O |
OO |
O |
|||
Public |
github |
x |
OO |
||||
Zenodo |
OO |
OO |
|||||
Google drive |
x |
O |
|||||
OneDrive |
|||||||
Own computers |
x |
x |
x |
||||
Emails |
x |
x |
x |
||||
EUDAT B2SHARE |
O |
O |
O |
(*) For details, see the description of the Lustre file system and the issue of small files on Triton.
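The footnote above refers to the problem of many small files on a Lustre file system. One common mitigation on parallel file systems is to bundle small files into a single archive, so that the file system handles one large file instead of thousands of tiny ones. The sketch below illustrates that idea with Python's standard `tarfile` module. It is an assumption that this suits your workflow and it may differ from what the Triton documentation recommends; the function name and the paths are hypothetical.

```python
from __future__ import annotations

import tarfile
from pathlib import Path


def bundle_directory(directory: Path, archive: Path) -> int:
    """Pack every file under `directory` into one uncompressed tar archive.

    Returns the number of files added. The archive itself is skipped in
    case it was created inside `directory`.
    """
    count = 0
    archive_resolved = archive.resolve()
    with tarfile.open(archive, "w") as tar:
        for path in sorted(directory.rglob("*")):
            if path.is_file() and path.resolve() != archive_resolved:
                # Store paths relative to `directory` so the archive unpacks cleanly.
                tar.add(path, arcname=str(path.relative_to(directory)))
                count += 1
    return count


# Hypothetical usage; replace both paths with your own:
# n = bundle_directory(Path("/path/to/many_small_files"), Path("/path/to/bundle.tar"))
# print(f"Packed {n} files into one archive")
```

Individual members can later be read straight from the archive with `tarfile.open(...).extractfile(name)`, without unpacking everything back onto the shared file system.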