June 2021 / Intro to Scientific Computing (FGCI HPC Summer Kickstart)
News
Watch at https://twitch.tv/coderefinery
The livestream is archived on Twitch for 14 days. Videos will be posted on this playlist once they are ready.
Before the workshop:
Registration is open: https://forms.gle/yNFLYt676kKorF3X7
View the prerequisites below.
Check back here for other updates that don’t get their own email.
Part of the Scientific Computing in Practice lecture series at Aalto University.
Audience: All researchers looking for a start to scientific computing. We go over the various options and tools that everyone needs to know about, and then go in-depth about using a remote computational cluster (though these skills will be useful to everyone). This is specifically designed for our summer workers who are just starting their internship, but anyone who is doing computing or data-focused work can get something from this course. Anyone is welcome to listen along and learn from some experts.
Most examples use Aalto University resources, but everyone can learn something and we are careful to explain local vs general practices.
About the course:
Summer Kickstart is a three-day course for researchers to get started with the available computational resources at FGCI (Finnish Grid and Cloud Infrastructure, basically HPC, high-performance computing, at universities) and CSC (the Finnish national computing center). On day one we start with a basic HPC intro, plus a basic intro to the Linux command line and Git version control for those who are not yet familiar with these tools.
On days two and three we cover, step by step, how to get started on the local computational clusters: learning by doing with lots of examples and hands-on exercises.
By the end of the course you will have hints, ready-made solutions, and copy/paste examples for finding, running, and monitoring your applications and managing your data, as well as for optimizing your workflow in terms of filesystem traffic, memory usage, etc.
University specific information:
Aalto: this course is obligatory for all new Triton users and recommended to all interested in scientific computing in general. Basic reference information is at the Triton page
Tampere: this course is recommended for all new Narvi users and also all interested in HPC. Most things should work with simply replacing triton -> narvi. Some differences in configuration are listed in Narvi differences
Practical information
Time, date: Mon 7.6, Tue 8.6, Wed 9.6, 11:50-16:00 EEST
Place: Online, see below
Lecturing by: Aalto Scientific Computing (Science-IT) and others
Registration: https://forms.gle/yNFLYt676kKorF3X7
Cost: Free of charge for FGCI consortium members including Aalto employees and students. Livestream is free to everyone.
Additional course info at: scip@aalto.fi
How to attend
This is an online course, a hybrid of a MOOC and an interactive workshop:
Livestream: anyone may watch at https://twitch.tv/coderefinery, no registration needed!
Zoom: if you register, you will be able to attend a Zoom meeting that includes interactive breakout rooms and hands-on help. We watch the livestream for the main material.
HackMD: instead of chat, this is used for Q&A. See the CodeRefinery HackMD manual for how this works.
Schedule
The daily schedule will be adjusted based on the audience; below is the tentative plan. There will be frequent breaks. You will be given time to try things and ask questions; it is more like an informal help session to get you started with the computing resources. All times are EEST (Helsinki) time.
Preparatory material. Each year the first day has varying topics presented. We don’t repeat these every year, but we strongly recommend that you watch these videos yourself as preparation:
Day #1 (Mon 7.jun): Basics and background
11:50: Joining time and pre-discussion, please join 10 minutes early. (Richard Darst, Enrico Glerean)
12:00: Welcome, general introduction (Notes) (Enrico Glerean and all)
12:10: HPC crash course: what is behind the front-end. HPC fundamentals: terminology, architectures, interconnects, and the infrastructure behind them, as well as MPI vs shared memory. Continued on day 3. (Ivan Degtyarenko, Simppa Äkäslompolo) (Slides (.pdf))
12:40: Summary and discussion about the videos “Basic linux shell scripting” and “Scientific computing workflows” (see videos in preparatory material above) (Notes) (Richard Darst, Enrico Glerean)
12:50: Break
13:00: Currently available resources at CSC. CSC is the Finnish center for scientific computing, and also has many resources for research. (Slides) (Jussi Enkovaara, CSC)
13:45: Break
14:00: Git intro: why you need version control for any scientific work and how to get started. We don’t go in depth into theory, but talk about the simplest usage by yourself. (Richard Darst, Jarno Rantaharju)
14:45: Break
15:00: Your future career in scientific computing (and this course). (Notes) (Enrico Glerean, TBA)
15:15: Connecting to the cluster, hands-on. Get connected in preparation for day 2 (Enrico Glerean)
Aalto: Connecting to Triton tutorial. If you can ssh to Triton and run hostname, you are ready for tomorrow.
Helsinki: general information
Tampere: Connecting to Narvi
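The readiness check described for day 1 (ssh to the cluster and run hostname) can be sketched as two small shell functions. This is a minimal sketch, not the official connection procedure: the function names, the username, and the hostnames in the usage comments are placeholders, and the jump-host variant assumes an OpenSSH client that supports the -J option. See your site's connecting tutorial for the real hostnames.

```shell
# Run `hostname` on the login node and exit.
# $1 = your cluster username, $2 = login node hostname (both site-specific).
check_connection() {
    ssh "$1@$2" hostname
}

# Same check, but going through a jump host first (OpenSSH -J option).
# $1 = username, $2 = jump host, $3 = login node hostname.
check_connection_via_jump() {
    ssh -J "$1@$2" "$1@$3" hostname
}

# Example usage (placeholder names, replace with your own):
#   check_connection myuser login.cluster.example.org
#   check_connection_via_jump myuser jump.example.org login.cluster.example.org
```

If the command prints the name of a login node, the connection works.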
Day #2 (Tue 8.jun): Basic use of a cluster (Richard Darst, Simo Tuomisto)
This day will go over all practical aspects of using the cluster
11:50: Joining time/icebreaker
12:00: Connecting to Triton
Every site will have its own ways of connecting. The basic lessons of ssh are the same for everyone, but each site will have a different hostname and possibly different initial steps (jump hosts).
Aalto: (same)
Helsinki: general information
Tampere: Connecting to Narvi. Note that you will need SSH keys.
12:30: Applications
Each site will be quite different here, so don’t worry about making the exercises work outside of Aalto, but think and prepare for what comes next (where we’ll explain the differences).
12:50: Break
13:00: Software modules
13:20: Data storage
Aalto: (same)
Helsinki: general information
Tampere: Narvi storage
This topic is very site-specific. The general principles will apply everywhere, but the exact paths/servers will vary.
13:50: Break
14:00: Short talk: Radovan Bast (UiT The Arctic University of Norway): Asking for help with supercomputers
How should you write support requests so that you get quick (and useful!) answers? Radovan, one of the founders of CodeRefinery, will talk about how we can all improve the dialogue between supercomputer user community and support staff so that we always remain respectful and try to learn and solve problems together.
14:35: Interactive jobs
The basic Slurm concepts are the same across all clusters (at least all those that use Slurm, which is everyone in Finland). However, partition names may be different. You can list partitions at your site using sinfo -O partition and list nodes at your site with sinfo -N. How these work will vary depending on your site, so definitely read up on this.
14:50: Break
15:00: Continuing with interactive Slurm jobs and exercises
16:00: End
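The interactive-job steps from day 2 can be sketched as follows. The sinfo commands are the ones mentioned in the schedule; the srun options (-p, --time, --mem, --pty) are standard Slurm options, but the function names, the resource values, and the partition name in the usage comment are assumptions chosen for illustration and will differ by site.

```shell
# Show which partitions and nodes exist at your site.
list_resources() {
    sinfo -O partition
    sinfo -N
}

# Start an interactive shell on a compute node.
# $1 = partition name (site-specific; pick one from the sinfo output).
# The time and memory limits below are illustrative values only.
interactive_shell() {
    srun -p "$1" --time=00:30:00 --mem=1G --pty bash
}

# Example usage (placeholder partition name):
#   list_resources
#   interactive_shell interactive
```

Keeping the limits small is a reasonable default while learning, since smaller requests tend to be scheduled sooner.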
Day #3 (Wed 9.jun): Advanced cluster use (Simo Tuomisto, Richard Darst)
11:50: Joining time/icebreaker
12:00: Serial jobs
Array jobs: embarrassingly parallel execution
Array jobs allow you to quickly run many jobs, and are the simplest unit of advanced computing. We will go over them in detail.
At other sites, you should module load fgci-common to make the Aalto modules available. Other specifics, such as matlab, won’t directly work.
GPU computing (Simo Tuomisto)
Aalto: (same as above)
Helsinki: general information
Tampere: Narvi GPU computing differences
At other sites, you may need to use -p gpu in addition to --gres=gpu.
Parallel computing: different methods explained (Simo Tuomisto)
Parallel computing programming (Ivan Degtyarenko, Simo Tuomisto)
16:00: End
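As a rough illustration of the array jobs covered on day 3, a batch script along these lines runs the same commands once per array index. It is a sketch under assumptions, not course material: the index range, resource limits, output file name, and the seed calculation are made up for illustration. Slurm sets SLURM_ARRAY_TASK_ID for each task; the fallback to 0 is only there so the script can be tried outside Slurm.

```shell
#!/bin/bash
#SBATCH --array=0-4              # five tasks, with indices 0..4 (illustrative range)
#SBATCH --time=00:05:00          # illustrative time limit per task
#SBATCH --mem=500M               # illustrative memory limit per task
#SBATCH --output=array_%a.out    # %a is replaced by the array index

# Each array task sees its own index in SLURM_ARRAY_TASK_ID.
# Outside Slurm the variable is unset, so fall back to 0 for a dry run.
task_id="${SLURM_ARRAY_TASK_ID:-0}"

# Derive a per-task parameter from the index (hypothetical example: a seed).
seed=$((100 + task_id))

echo "Task ${task_id} would run with seed ${seed}"
```

Saved as, say, array_example.sh (a hypothetical file name), it would be submitted with sbatch array_example.sh, and each task writes to its own output file.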
Follow-up suggestions: While not an official part of this course, we suggest these videos (co-produced by our staff) as a follow-up perspective:
Attend a CodeRefinery workshop, which teaches more useful tools for scientific software development.
Look at Hands-on Scientific Computing for an online course to either browse or take for credits.
Cluster Etiquette (in Research Software Hour): The Summer Kickstart teaches what you can do, but what should you do to be a good user?
How to tame the cluster (in Research Software Hour). This mostly repeats the contents of this course, with a bit more discussion, and working one example from start to parallel.
Prerequisites
Participants will be provided with either access to their university’s cluster or Triton for running examples.
You should have an account on your university’s HPC cluster:
Aalto: if you do not yet have access to Triton, request an account in advance.
Helsinki: Account notes at the bottom of this page
Tampere: your cluster will require ssh keys to connect.
Others: Aalto will provide you with a guest Triton account; check back for more information.
Participants are expected to have an SSH client installed (for options, see the Triton connecting tutorial).
You should install Zoom. Hints on installation.
If you aren’t familiar with the Linux shell, read the crash course, watch the video, or watch the relevant preparatory video linked as part of the schedule.
Try to get connected to your cluster in advance. We have some time scheduled for this, but you need to also try in advance, or else we can’t keep up.
Aalto: connecting to Triton
Helsinki: general information
Tampere: Connecting to Narvi
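For sites that require SSH keys, a key pair can be generated with OpenSSH's ssh-keygen. This is a minimal sketch assuming an OpenSSH client; the function name, key file path, and comment string are placeholders, and how the public key is registered with the cluster is site-specific and not shown.

```shell
# Generate an Ed25519 key pair; ssh-keygen will prompt for a passphrase.
# $1 = path for the private key (the public key is written to "$1.pub")
# $2 = comment stored in the public key, to help identify it later
make_cluster_key() {
    ssh-keygen -t ed25519 -f "$1" -C "$2"
}

# Example usage (placeholder path and comment):
#   make_cluster_key "$HOME/.ssh/id_ed25519_cluster" "me@my-laptop"
#   cat "$HOME/.ssh/id_ed25519_cluster.pub"   # the public half, safe to share
```

Only the .pub file should ever leave your computer; the private key stays local.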
Other preparation
How to attend this course:
Take this seriously. There is a lot of material and hands-on exercises. Don’t overbook your time, don’t skip hands-on parts, and come prepared.
Anyone may watch via the livestream at https://twitch.tv/coderefinery. Register anyway to get emails.
You will be given a Zoom link to join. Join each session 10 minutes early.
Join with a name of “(University) First Last”, e.g. “(Aalto) Richard Darst”. This will help us to put people into university-specific breakout rooms.
There will be a HackMD.io document sent to all participants. This is for communication and asking questions. Read more about how this works here
Always write new questions or comments at the bottom of the document.
Moderators will follow the developments, and answer questions and comments. You may get several answers from different perspectives, even. Our focus is the bottom, but we will scan the whole document and keep it organized.
The final document (excluding personal data and questions about individual circumstances) will be published as the notes at the end.