Training

Scientific computing and data science require special, practical skills in programming and computer use. However, these skills are rarely taught in regular courses.

This page is your portal for training. The focus is practical, hands-on courses for scientists, not theoretical academic courses. As a broad classification, we divide the skills a scientist would need into four big levels A-D:

A (basics) Having a basic knowledge of university resources, so that you use the right tool for the right job and don’t lose your data.
B (scientific computing) When you are doing science but existing software isn’t enough: you have to connect things together (or make your own).
C (high performance computing) Using large computer clusters for large-scale analysis. Basically, at Aalto, Triton.
D (advanced high performance computing) Catch-all for everything past level C.
Special tracks Programming, scientific papers/posters/presentations, etc. Can be at any level.

If you’re starting research, ask your advisor what level of skill they expect you to have. There are both courses and self-study materials below. A few hours now can save you days of time during your career.

For course announcements at Aalto, see the Science-IT training page.

A: Basics

A01 University IT systems

This covers the basics of research facilities at Aalto and how to use them.

There is currently no dedicated course, but all of our information is found at Welcome, researchers!.

A10 Configuring Mac for scientific work Getting your Mac computer set up for scientific computing tasks. After this, you can follow most of the other instructions below which assume a Linux-like system.
A11 Configuring Windows for scientific work Like A10, but for Windows. (Why isn’t there a Linux course? Because the point of these courses is to get you close enough to Linux to have the power you need for computing.)

B: Scientific computing

Core courses:

B10

Basic shell

Let’s face it: the Linux command line is the basis of most data science. Check out Software Carpentry shell-novice, sections 1-4.

B14

Data management

If you do the obvious thing, your data will turn into a huge mess and you won’t be able to work anymore. This course gives some practical hints. (For now, check out the data section)

B23

Text editors and IDEs

Your best friend is a good text editor. Software Carpentry shell-novice, part of section 3.

B20

Shell scripting

If you can do it on the Linux shell, you can automate it. Continue with the Science-IT Linux shell tutorial, first few sections.

B21

Version control for you

Version control lets you track changes, go back in time, and collaborate on code and papers: an absolute requirement for scientific computing. CodeRefinery Introduction to version control

B22

SSH and remote access

A short but important course: how to do work remotely. It also includes expert tips for making SSH more convenient.

Other courses:

B30

Makefiles

Makefiles are like smart shell scripts. We learn a bit about them and, in the process, become even more efficient. Software Carpentry make-novice.
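What makes a Makefile "smart" compared to a plain shell script is that a step is rerun only when its output is missing or older than its inputs. The Python sketch below illustrates that one rule. It is not how make itself is implemented, and the function name `needs_rebuild` is made up for this example.

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if `target` is missing or older than any dependency.

    Illustrative sketch only: real make also handles rule chains,
    pattern rules, phony targets and much more.
    """
    if not os.path.exists(target):
        # Nothing has been built yet, so the step must run.
        return True
    target_mtime = os.path.getmtime(target)
    # Rebuild if at least one input was modified after the target was made.
    # (A missing dependency raises FileNotFoundError here.)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

A plain shell script would rerun every step every time. With a check like this, only the steps whose inputs changed are repeated.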

B50

Version control for teams

Previously, you learned only the basics. Now for the real stuff. CodeRefinery collaborative distributed version control lesson

B51

Jupyter Notebooks

Notebooks are an efficient way to make self-documenting code and scripts and do data science well. CodeRefinery Jupyter course.

Software development track: Do you do programming? These courses are for you. They do not teach you how to program (you need to find your own course for that), but they will make sure you can do scientific programming well.

B60

Modular code development

CodeRefinery lesson

B61

Software testing

CodeRefinery lesson

B62

Profiling

Aalto course, see Profiling for now.

B63

Debugging

Aalto course; for example, the course by Janne.

B02

Software Licensing

CodeRefinery lesson

C: High performance computing

When your own computer is not enough, you need more power. For that, high-performance computing is your next step. Level C is about using HPC; level D is about programming it yourself.

Core courses:

C01

What is HPC?

See training by Science-IT

C20

Modules and software

See training by Science-IT or Software Modules

C21

Slurm

See training by Science-IT or interactive, serial, array
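To give a feel for array jobs: Slurm runs the same script many times and sets the environment variable `SLURM_ARRAY_TASK_ID` to a different number in each run, so each run can pick its own piece of work. The sketch below shows one way a Python script might do that. The function name `select_task` and the input file names are made up for illustration, and the example assumes the job is submitted with something like `sbatch --array=0-2`.

```python
import os


def select_task(inputs, env=None):
    """Return the input assigned to this array task."""
    env = os.environ if env is None else env
    # Slurm sets SLURM_ARRAY_TASK_ID for each task of an array job.
    # Defaulting to 0 lets you try the script outside Slurm as well.
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    if not 0 <= task_id < len(inputs):
        raise IndexError(f"task id {task_id} has no matching input")
    return inputs[task_id]


if __name__ == "__main__":
    # Hypothetical input names, for illustration only.
    inputs = ["sample_a.csv", "sample_b.csv", "sample_c.csv"]
    print("This task processes:", select_task(inputs))
```

Passing `env` explicitly makes the selection logic easy to test on your own computer before you submit anything to the cluster.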

C22

HPC Storage

See training by Science-IT or storage basics, lustre, local storage, small files

C23

Parallel computing

See training by Science-IT

C24

Advanced shell scripting and automation

Hands-on, putting everything together. Various courses; finishing the Linux shell tutorial is a good start.

D: Advanced high performance computing

Dxx

Programming Parallel Computers

This is an academic course taught in the CS department. It mainly covers OpenMP and CUDA. It is usually taught in 5th period (Apr-May); search MyCourses/Oodi for CS-E4580.

Dxx

GPU Programming

This was an advanced guest course, useful if you want to know how to program GPU applications. Materials here.

Dxx

MPI Programming

This was an advanced guest course, useful if you want to know the internals of MPI or program MPI applications. Materials here.

Dxx

HTCondor

Condor allows you to use many workstations as a high throughput cluster, ideal for mid-range embarrassingly parallel problems. Materials here.
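"Embarrassingly parallel" means the work splits into tasks that never need to talk to each other, so they can run anywhere, in any order, and the results are simply combined at the end. The sketch below shows that shape with a toy Monte Carlo estimate of pi. It does not use HTCondor at all: a local thread pool stands in for the pool of workstations, and the function names are made up for this example.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, n_samples=10_000):
    """One independent task: count random points inside the unit quarter-circle."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(seeds, n_samples=10_000, max_workers=4):
    """Run one task per seed and combine the independent results."""
    # Threads only stand in for separate machines here. For real CPU-bound
    # work you would use processes, or submit each task as its own job.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        hits = list(pool.map(lambda s: count_hits(s, n_samples), seeds))
    return 4.0 * sum(hits) / (n_samples * len(hits))


if __name__ == "__main__":
    print(estimate_pi(range(8)))
```

Because each task depends only on its own seed, nothing changes in the logic if the tasks run on eight different workstations instead of one.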

Also see the Science-IT training archive for more level D courses.