Triton’s JupyterHub is available at http://jupyter.triton.aalto.fi.
For new users
Are you new to Triton and want to access JupyterHub? Triton is a high-performance computing cluster, and JupyterHub is just one of our services - one of the easiest ways to get started. You still need a Triton account. This site has many instructions, but you should read at least:
- About us, how to get help, and acknowledging Triton usage (this JupyterHub is part of Triton, and thus Science-IT must be acknowledged in publications).
- The accounts page, in order to request a Triton account.
- Possibly the storage page to learn about the places to store data and how to transfer data.
- The JupyterHub section of this page (below).
If you want to use Triton more, you should finish the entire tutorials section.
Jupyter notebooks are a way of doing interactive, web-based computing: instead of either scripts or interactive shells, the notebooks allow you to see a whole script + output and experiment interactively and visually. They are good for developing and testing things, but once things work and you need to scale up, it is best to put your code into proper programs (more info). You must do this if you are going to do large-scale parallel computing.
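As a rough illustration of what "putting your code into a proper program" can look like, here is a minimal sketch in Python. The function name simulate and its toy computation are made up for this example; the point is only the shape: the notebook cell becomes a function, and the values you used to edit by hand in the cell become command-line arguments.

```python
import argparse
import random
import statistics


def simulate(n_samples, seed):
    """The computation that used to live in a notebook cell (toy example)."""
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


def main(argv=None):
    # Values that were edited by hand in the notebook become arguments.
    parser = argparse.ArgumentParser(description="Toy simulation as a script")
    parser.add_argument("--n-samples", type=int, default=1000)
    parser.add_argument("--seed", type=int, default=0)
    args = parser.parse_args(argv)

    mean, stdev = simulate(args.n_samples, args.seed)
    print(f"n={args.n_samples} seed={args.seed} mean={mean:.4f} stdev={stdev:.4f}")
    return 0


if __name__ == "__main__":
    main()
```

A script like this can be run non-interactively with different arguments, which is what you would need for batch jobs, while the notebook can still import simulate for interactive exploration.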
You can always run notebooks yourself on your own (or remote) computers, but on Triton we have some facilities already set up to make it easier.
How Jupyter notebooks work¶
- Start a notebook
- Enter some code into a cell.
- Run it with the buttons or Shift-Enter.
- Edit/create new cells, run again. Repeat indefinitely.
- You have a visual history of what you have run, with code and results nicely interspersed. With certain languages such as Python, you can have plots and other things embedded, so that it becomes a complete reproducible story.
JupyterLab is the next iteration of this and has many more features, making it closer to an IDE or RStudio.
Notebooks are without a doubt a great tool. However, they are only one tool, and you need to know their limitations. See our other page on limitations of notebooks.
JupyterHub on Triton is still under development, and features will be added as they are needed or requested. Please use the Triton issue tracker.
The easiest way of using Jupyter is through JupyterHub - it is a multi-user Jupyter server which takes a web-based login and spawns your own single-user server. This is available on Triton.
Connecting and starting¶
Currently JupyterHub is available only within Aalto networks, or from the rest of the internet after a first Aalto login: https://jupyter.triton.aalto.fi.
Once you log in, you must start your single-user server. There are
several options available that trade off run time against memory: a
long run time, or a shorter run time with more memory available. Your
server runs in the Slurm queue, so the first start-up takes a few
seconds, but after that it will stay running even if you log out. The
resources you request are managed by Slurm: if you go over the memory
limit, your server will be killed without warning or notification (but
you can see it in the output log, ~/jupyterhub_slurmspawner_*.log).
The Jupyter
server nodes are oversubscribed, which means that we can allocate more
memory and CPU than is actually available. We will monitor the nodes
to try to ensure that there are enough resources available, so do
report problems to us. Please request the minimum amount of memory
you think you need - you can always restart with more memory. You
can go over your memory request a little bit before you get problems.
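To help pick a memory request, you can check how much memory a notebook actually used. The following is a minimal sketch using only the Python standard library; the helper name peak_memory_mib is made up for this example. It reports the peak for the current kernel process only, while the Slurm limit presumably applies to your whole single-user server (all kernels together), so treat the number as a lower bound.

```python
import resource
import sys


def peak_memory_mib():
    """Peak resident memory of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kibibytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


print(f"Peak memory of this kernel: {peak_memory_mib():.0f} MiB")
```

Running this in the last cell of a notebook gives a rough idea of whether a smaller memory option would have been enough.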
When you use Jupyter via this interface, the Slurm billing weights are lower, so that the rest of your Triton priority does not decrease by as much.
Once you get to your single-user server, Jupyter is running as your own
user on Triton. You begin in a convenience directory which has links to
scratch, etc. You cannot make files in this directory
(it is read-only), but you can navigate to the other folders to create
your notebooks. You have access to all the Triton filesystems (not
project/archive) and all normal software.
We have some basic extensions installed:
- JupyterLab (to use it, change /tree in the URL to /lab). JupyterLab will eventually be made the default.
- modules integration
- jupyter_contrib_nbextensions - check out the variable inspector
- diff and merge tools (these currently do not work, for unknown reasons)
The log files for your single-user servers can be found in
~/jupyterhub_slurmspawner_*.log. When a new server starts, logs that
are more than one week old are automatically cleaned up.
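If you prefer to look at these logs from a notebook or script rather than a shell, something like the following sketch would work. It relies only on the file name pattern given above; the function names are made up for this example, and nothing is assumed about what the log lines contain.

```python
from pathlib import Path


def spawner_logs(home=None):
    """Return the spawner log files in the home directory, newest first."""
    home = Path(home) if home is not None else Path.home()
    logs = home.glob("jupyterhub_slurmspawner_*.log")
    return sorted(logs, key=lambda p: p.stat().st_mtime, reverse=True)


def tail(path, n_lines=20):
    """Return the last n_lines lines of a text file as a list of strings."""
    with open(path, errors="replace") as f:
        return f.read().splitlines()[-n_lines:]


def show_latest_log(home=None, n_lines=20):
    """Print the end of the most recently modified spawner log, if any."""
    logs = spawner_logs(home)
    if not logs:
        print("No spawner logs found")
        return
    print(f"== {logs[0]} ==")
    for line in tail(logs[0], n_lines):
        print(line)
```

Calling show_latest_log() after a server has died prints the end of the newest log, which is usually the first thing to check before reporting a problem.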
For reasons of web security, you can’t install your own extensions (but you can install your own kernels). Send your requests to us instead.
This service is currently in beta and under active development. If you notice problems or would like any more extensions or features, let us know. If this is useful to you, please let us know your user story, too. In the current development stage, the threshold for feedback should be very low.
Currently, the service level is best effort. The service may go down at any time and/or notebooks may be killed whenever there is a shortage of resources or need of maintenance. However, notebooks auto-save and do survive service restarts, and we will try to avoid killing things unnecessarily.
Software and kernels¶
We have various kernels automatically installed (these instructions
should apply to both JupyterHub and
- Python (2 and 3 via anacondaN/latest modules + a few more Python modules)
- Matlab (latest module)
- Bash kernel
- R (a default R environment you can get by module load r-triton). “R (safe)” is similar but tries to block some local user configuration which sometimes breaks things; see the FAQ for more hints.
- Julia: currently doesn’t seem to play nicely with global
installations (if anyone knows something otherwise, let us know).
Just load two modules, module load julia and
module load jupyterhub/live, and then install the kernel with
Pkg.add("IJulia"), and it will install locally for you.
- We do not yet have a kernel management policy. Kernels may be added or removed over time. We would like to keep them synced with the most common Triton modules, but it will take some time to get this automatic. Send requests and problem reports.
Since these are the normal Triton modules, you can submit installation requests for software in these so that it is automatically available.
If you want to install your own kernels:
- First, run module load jupyterhub/live. This loads the anaconda environment which contains all the server code and configuration. (This step may not be needed for all kernels.)
- Follow the instructions you find for your kernel. You may need to
pass --user or some such to have it install in your user directory.
- You can check your own kernels in
If your kernel involves loading a module,
you can either a) load the modules within the notebook server
(“softwares” tab in the menu), or b) update your
include the required environment variables (see kernelspec).
(We need to do some work to figure out just how this works). Check
for an example of a kernel that loads a module first.
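As a rough sketch of option b), the Jupyter kernelspec format allows an "env" dictionary in kernel.json whose entries are added to the kernel's environment at start-up. The helper below writes such a file for a Python kernel. The function name, the target directory, and the variable names and values are all placeholders for illustration: the variables a given module actually sets have to be looked up yourself (for example with module show), and this is not necessarily how the centrally installed kernels are set up.

```python
import json
import sys
from pathlib import Path


def write_kernelspec(kernel_dir, display_name, env):
    """Write a kernel.json that starts ipykernel with extra environment variables."""
    spec = {
        # Standard launch command for the IPython kernel.
        "argv": [sys.executable, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
        # Extra environment variables for the kernel (placeholder values here).
        "env": env,
    }
    kernel_dir = Path(kernel_dir)
    kernel_dir.mkdir(parents=True, exist_ok=True)
    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2) + "\n")
    return path
```

The directory passed as kernel_dir has to be one of the places Jupyter searches for kernels; jupyter kernelspec list shows the kernels it currently finds and where they live.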
You can enable git integration on Triton by using the following
lines from inside a git repository. (This is normal nbdime, but uses
the centrally installed one so that you don’t have to load a
particular conda environment first. The
sed command fixes
relative paths to absolute paths, so that you use the tools no matter
what modules you have loaded):
/share/apps/jupyterhub/live/miniconda/bin/nbdime config-git --enable
sed --in-place -r 's@(= )[ a-z/-]*(git-nb)@\1/share/apps/jupyterhub/live/miniconda/bin/\2@' .git/config
- Jupyterhub won’t spawn my server: “Error: HTTP 500: Internal
Server Error (Spawner failed to start [status=1].”. Is your home
directory quota exceeded? If that’s not it, check the
~/jupyterhub_slurmspawner_* logs, then contact us.
- My server has died mysteriously. This may happen if your resource
usage exceeds the limits - Slurm will then kill your
notebook. You can check the
~/jupyterhub_slurmspawner_* log files to be sure.
- My server seems inaccessible / I can’t get to the control panel to
restart my server. Especially with JupyterLab. In JupyterLab,
there is a “Hub” menu that lets you go to the control panel. If
that doesn’t work, change your browser URL path to
/hub/home and you can get to the control panel.
- My R kernel keeps dying. Some people seem to have global R
configuration, in .Renviron or some such, which affects even the R kernel here. Things we have seen: pre-loading modules in .bashrc which conflict with the kernel R module; changing .Renviron. You can either (temporarily or permanently) remove these changes, or you could install your own R kernel. If you install your own, it is up to you to maintain it (and remember that you installed it).
- “Spawner pending” when you try to start - this is hopefully fixed in issue #1534/#1533 in JupyterHub. Current recommendation: wait a bit and return to JupyterHub home page and see if the server has started. Don’t click the button twice!
- Online demos and live tutorial: https://jupyter.org/try (use the Python one)
- Jupyter basic tutorial: https://www.youtube.com/watch?v=HW29067qVWk (this is just the first link on youtube - there are many more too)
- More advanced tutorial: Data Science is Software (this is not just a Jupyter tutorial, but about the whole data science workflow using Jupyter. It is annoyingly long (2 hours), but very complete and could be considered good “required watching”)
- Pitfalls of Jupyter Notebooks
- CSC has this service, too; however, there is no long-term storage yet, so its usefulness for research is limited: https://notebooks.csc.fi/
Our configuration is available on GitHub. Theoretically, all the pieces are here, but it is not yet well documented and not yet generalizable. The Ansible role is a good start, but the JupyterHub config and setup is hackish.
- Ansible config role: https://github.com/AaltoScienceIT/ansible-role-fgci-jupyterhub
- Configuration and automated conda environment setup: https://github.com/AaltoScienceIT/triton-jupyterhub