Prerequisites
On High-Performance Computing (HPC) facilities, we primarily interact with the systems through a command-line shell. We expect participants in this course to arrive with a wide range of command-line experience, so some parts will be easier or more challenging depending on your background.
Similarly, we expect participants to start from different operating systems: not only Windows, but also macOS and various Linux flavors. To keep the course consistent, we recommend that Windows users work through the materials using the Windows Subsystem for Linux (WSL 2.0). This should be available on any up-to-date Windows 10 or later, with installation instructions available here. Should this not work, MobaXterm can be used instead.
Likely, macOS and Linux users are already familiar with their terminal.
Note: On WSL, your home directory (whenever we refer to paths starting with ~) will likely be in a folder like /mnt/c/Users/<UserName>.
Note: If using MobaXterm, be sure the directory you are running it from is local (not a networked file system) and you have write permissions there.
Note: If using WSL, you’ll want to enable Ctrl+Shift+V for pasting. Otherwise, you can still paste using Shift+Right Mouse.
Connecting to the computing cluster
Command-line connections to the HPC are established using the Secure Shell tool (SSH). From a terminal, you can connect to Sulis using the following line:
ssh -i <your keyfile> <user>@login.sulis.ac.uk
Since this is a bit tedious to type every time, we can move most of this information into a ~/.ssh/config file with the following contents:
Host sulis
Hostname login.sulis.ac.uk
User <username>
# ProxyJump <proxy> # when connecting via the host entry "<proxy>"
IdentityFile ~/.ssh/<sulis_rsa>
Having it set up like this, establishing the connection is as simple as typing:
ssh sulis
This will come in handy not only when connecting to the HPC, but also when copying files between your local machine and the HPC.
Note, however, that in both cases you will still need to decrypt your private key file with your passphrase (which, at least on macOS and Linux, can be automated using your OS keychain) and enter the 2FA token once a day.
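As a sketch of the keychain automation mentioned above, two extra options can be added to the same Host block in ~/.ssh/config. AddKeysToAgent is standard OpenSSH; UseKeychain is macOS-specific, so omit it on Linux (the key file name is the same placeholder as before):

```
Host sulis
    Hostname login.sulis.ac.uk
    User <username>
    IdentityFile ~/.ssh/<sulis_rsa>
    # load the decrypted key into the ssh-agent on first use
    AddKeysToAgent yes
    # macOS only: store/retrieve the passphrase from the system keychain
    UseKeychain yes
```

With this in place, you should only need to type the passphrase once per session (the 2FA token is still required).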
Login to Sulis is documented here in more detail: https://sulis-hpc.github.io/gettingstarted/connecting/
Setting up R
Software on the cluster is provided via environment modules, so R is not available until we load the corresponding module:
which R
# /usr/bin/which: no R in [...]
module load GCC/11.2.0 OpenMPI/4.1.1 R/4.1.2
which R
# /sulis/easybuild/software/R/4.1.2-foss-2021b/bin/R
Running an interactive job
The simplest way to request an interactive job is to use Slurm’s srun command, where we specify that we want to run a shell (given by the $SHELL environment variable) that is connected to our terminal input and output (--pty). In addition, we need to specify the account our requested resources will be budgeted to (--account). We can do that by running:
srun --account su105 --pty $SHELL
You will see that your command prompt changes from user@login to user@nodeXX, which means that we are now connected to a compute node instead of the login node. Here, we are allowed to run heavy computations within the resource constraints that we specified.
First, let’s get an overview of which processes are already running on this node. We can do this by running the resource monitor htop:
htop # quit by typing 'q'
In this overview, we can see how many cores the compute node has, how many processes are running, and how much memory is used. Depending on how many of these resources we requested (and the overall load), we will see that at least our own resource allocation is still free. We do, however, need to stay within our allocation (not the overall amount of available resources on the node), because otherwise our processes will be terminated automatically.
If we have loaded the R module beforehand, we’ll see that $PATH (the environment variable where our shell looks for executables) is still set up to include R. We can check this by asking for the R path:
which R
We can then run R from the command line, just as on our local machines. Running R will give us an R command prompt:
#>
#> R version 4.1.3 (2022-03-10) -- "One Push-Up"
#> Copyright (C) 2022 The R Foundation for Statistical Computing
#> Platform: x86_64-pc-linux-gnu (64-bit)
#>
#> R is free software and comes with ABSOLUTELY NO WARRANTY.
#> You are welcome to redistribute it under certain conditions.
#> Type 'license()' or 'licence()' for distribution details.
#>
#> Natural language support but running in an English locale
#>
#> R is a collaborative project with many contributors.
#> Type 'contributors()' for more information and
#> 'citation()' on how to cite R or R packages in publications.
#>
#> Type 'demo()' for some demos, 'help()' for on-line help, or
#> 'help.start()' for an HTML browser interface to help.
#> Type 'q()' to quit R.
#>
#> >
We can use R here just as we would use the R console in e.g. RStudio:
x = 5
y = 3
x * y
#> [1] 15
When we are done, we can exit R by typing quit(save="no"). Note that after exiting R, the interactive job is still running. We can exit the job shell by typing exit or pressing Ctrl+d (unless we started the interactive job with srun --pty R instead of $SHELL, in which case quitting R ends the job). Note that exiting the job will sometimes display the message Exited with exit code 127. This can safely be ignored.
Copying files
There are different options for getting your files to the compute cluster. One option is to edit all files locally and then copy them over SSH. For instance, one could have a local file test.txt and copy it over:
scp test.txt sulis:
where you specified sulis in your ~/.ssh/config before; otherwise you may need to specify the user name, key file, and host manually (this will look different if you are using a graphical SSH client). The : is how the scp command knows which end is the remote one, i.e. you could use the following to copy the same file from the remote to your local end (run this on your machine, not sulis):
scp sulis:test.txt . # '.' means current directory (`pwd`)
Here, the scp command looks for test.txt in your home directory (~; this is the default if no directory is specified) and copies it to your current directory (.).
If you want to copy directories, you need to use recursive copying, i.e. scp -r.
Another, and perhaps better, alternative is the rsync command. It keeps timestamps intact and can be used to copy only files with updated timestamps (-u) compared to the local files (-v prints each file while copying):
rsync -uvr sulis:test.txt .
Note: all copy commands, both from and to Sulis, should be run from your local machine.
Editing files
For small changes or iterative work, it is often more convenient to edit files directly on the computing cluster instead of editing them locally and then copying them over.
There are multiple text-based editors that work in a terminal, such as nano, emacs and vim.
nano is a minimalist editor without any special features, and is often recommended to users new to the terminal. The problem with this is that we quickly get stuck at a local optimum, where we can make simple changes to a file but will never get features such as syntax highlighting. The other two editors, on the other hand, either have or can be extended with any feature imaginable.
This is why, for this course, we will show some basic features of nvim (neovim, a modern implementation of vim) instead. If you are already an emacs user, please feel free to use that editor instead.
To edit a simple text file, we can run:
# first run 'module load Neovim/0.6.1' if in an interactive job
nvim myfile.txt
You will see that the console gets cleared, and we are shown the contents of an empty file instead. Try typing a couple of a characters in the file:
aaaa
You will see the characters show up, as you would in other editors as well. Notice, however, that the first a you typed did not show up, only the subsequent as. This is because this editor has a “normal” and an “insert” mode. By typing the first a, we switched from the former to the latter.
We can now press Esc to switch from insert mode back to normal mode. In normal mode, we can type :w followed by Enter to write the file, or :wq to write the file and quit the editor. :q! exits without saving any changes.
This is all you need to know to make vim as useful as nano. However, if we now edit e.g. an .r file, we also get syntax highlighting. This feature alone makes it worth using nvim instead of nano.
You can explore more features by running vimtutor from the command line, an interactive tutorial that familiarizes you with using the editor effectively.
Compute resources
So far, we have submitted our job specifying only the minimum required parameters and relying on the defaults for the others. For instance, we have not specified a partition, which is one of several job queues that we can submit our jobs to. To get an overview of which ones are available, we can use the sinfo command:
sinfo
Here, we see the different partitions listed along with the number of nodes associated with each, including the walltime (the maximum amount of time a job can request) for each queue. You will see that one queue is marked with a *. This denotes the default queue, which we have been using by not specifying any particular queue via the --partition parameter.
One argument that we did specify but have not explained in more detail is the account. This specifies the connection between your user name and a collection of available resources that you can use, which are then subtracted from this budget. You can check which accounts your user has access to by typing:
sacctmgr show associations where user=<your user name>
You will likely only belong to one account, or project, at this time (which was the one created for this course).
Job submission scripts
Usually, you will want to run more complex computations than can be specified with a single srun. For running multiple commands on multiple hosts, it is often better to specify the resource requirements and exact commands using a job submission script. This may look like the following:
#!/bin/sh
#SBATCH --account su105
#SBATCH --partition compute
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 1
#SBATCH --mem 1024M
#SBATCH --time 8:00:00
srun uname -n
We can submit this script by saving it to a script file and then running sbatch <script>. It should tell us something like:
Submitted batch job 1290046
where the number is the identifier of the job (and will be different with multiple runs).
After we started it, we can check all jobs that our user is running by typing:
squeue -u <username>
This should list a job with the above ID, show that it is currently running, and show the node the job is running on. We can also get detailed information about the job using scontrol:
scontrol show jobid <jobid>
This information will be available while the job is running and for some (short) time after it finishes. We can also see values that we did not explicitly request and that were filled in with defaults, e.g. the default time limit or memory (the exact values may depend on the partition and node).
After the job has started, it will create an output file called slurm-xxxxxx.out (where xxxxxx is the job id) containing the standard output of the command (i.e. what would otherwise have been printed to the console). After the job finishes, it will contain the output of uname -n, which is the name of the node the command was run on. If all went well, it should contain nodeXX and not login, meaning it was run on one of the compute nodes.
There’s quite a few lines in there, so let’s break this down:
- #!/bin/sh is called a shebang and specifies which application should be used to run the script; it is required by sbatch (otherwise it will refuse to submit the job)
- #SBATCH --account is needed again to budget the resources correctly
- #SBATCH --partition this time we specify the compute partition explicitly
- #SBATCH --ntasks lists the number of tasks (computations) this job contains
- #SBATCH --cpus-per-task specifies the number of CPUs per task
- #SBATCH --mem specifies the amount of memory requested; this is the job total, other options include --mem-per-cpu or --mem-per-gpu. Memory multipliers such as K, M and G are supported (kilobytes, megabytes and gigabytes, respectively)
- #SBATCH --time is the maximum amount of time after which the job will be terminated (dd-hh:mm:ss)
- #SBATCH directives need to directly follow the shebang, otherwise they will be ignored
- srun specifies the command to be run; it is not required for running individual computations, but helps set up parallel execution, such as running the call once per task, or e.g. setting up MPI if this is used
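As a variation on the script above (a sketch; the account and partition names are the ones used in this course, the other values are chosen for illustration), memory can also be requested per CPU, and the time limit can use the full dd-hh:mm:ss form. Since #SBATCH lines are ordinary comments to the shell and srun is optional for a single task, this script can even be executed directly with sh as a quick local check:

```shell
#!/bin/sh
#SBATCH --account su105
#SBATCH --partition compute
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 2
# request memory per CPU instead of a job total (2 x 512M = 1024M here)
#SBATCH --mem-per-cpu 512M
# one day and two hours, in dd-hh:mm:ss format
#SBATCH --time 01-02:00:00
# srun is optional for a single task; a plain command works too
uname -n
```

Submitted with sbatch, this requests the same 1024M total as before; run locally with sh, it simply prints your machine's hostname.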
R on HPC
R’s best-developed parallel capability is running computations on multiple cores. Here, we will briefly outline which approaches are commonly used. For simplicity, let’s consider a simple R function that sleeps for a second:
fsleep = function(i) {
  print(paste("starting", i))
  Sys.sleep(1)
  print("done!")
}
fsleep(1)
#> [1] "starting 1"
#> [1] "done!"
If we need to call this function 10 times, we could use something like lapply:
system.time({ lapply(1:10, fsleep) })
#> [1] "starting 1"
#> [1] "done!"
#> [1] "starting 2"
#> [1] "done!"
#> [1] "starting 3"
#> [1] "done!"
#> [1] "starting 4"
#> [1] "done!"
#> [1] "starting 5"
#> [1] "done!"
#> [1] "starting 6"
#> [1] "done!"
#> [1] "starting 7"
#> [1] "done!"
#> [1] "starting 8"
#> [1] "done!"
#> [1] "starting 9"
#> [1] "done!"
#> [1] "starting 10"
#> [1] "done!"
#>    user  system elapsed
#>   0.005   0.000  10.014
This will, unsurprisingly, take 10 times as long as an individual call to fsleep(). For many use cases, it makes sense to run computations (real work instead of just sleeping) in parallel. Probably the most integrated solution is the parallel package, which is one of the core packages distributed with a standard R installation by default.
system.time({ parallel::mclapply(1:10, fsleep) })
#> user system elapsed
#> 0.013 0.000 5.021
However, there are also other options, e.g. the foreach package, which enables parallel processing via the %dopar% operator, or the future package using plan(multisession).
Exercise
- Create a sleep.r script based on the fsleep function and the mclapply call above, calling the function 10 times. Print the result of system.time().
- Create a submission script that will request 1 task with 5 cores (hint: see the documentation here).
- Submit this script as a job. Does it run successfully? How long does it take according to the job log file? Does the runtime make sense?
- Submit the same script using a parallel foreach %dopar% loop and a parallel::makePSOCKcluster. Does it run? Do the results make sense?
- A newer approach that aims at providing a common interface to many parallel backends is the future package. Can you run the above example using plan(multisession) and future.apply?
Solution mclapply
To start off, you will have created two files, sleep.r and submit_sleep.sh.
sleep.r
fsleep = function(i) {
  print(paste("starting", i))
  Sys.sleep(1)
  print("done!")
}
system.time({ parallel::mclapply(1:10, fsleep) })
submit_sleep.sh
#!/bin/sh
#SBATCH --account su105
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 1024M
#SBATCH --time 0:10:00
# export MC_CORES=$SLURM_CPUS_ON_NODE
srun Rscript sleep.r
Then run in the terminal:
sbatch submit_sleep.sh
Running this script, you will see in the slurm-XXXXX.out file that the runtime is 5 seconds, which corresponds to 2 cores used, not the 5 requested. This is because parallel::mclapply uses mc.cores = getOption("mc.cores", 2L), which, as you can see, defaults to 2. We can check this by inspecting the mclapply function:
head(parallel::mclapply)
#>
#> 1 function (X, FUN, ..., mc.preschedule = TRUE, mc.set.seed = TRUE,
#> 2 mc.silent = FALSE, mc.cores = getOption("mc.cores", 2L),
#> 3 mc.cleanup = TRUE, mc.allow.recursive = TRUE, affinity.list = NULL)
#> 4 {
#> 5 cores <- as.integer(mc.cores)
#> 6 if ((is.na(cores) || cores < 1L) && is.null(affinity.list))
getOption("mc.cores")
#> NULL
getOption("mc.cores", 2L)
#> [1] 2
One way to specify the correct number of cores is to uncomment the # export line in the submission script. Running this again will report 2 seconds, as expected.
Solution foreach
Using foreach and %dopar% for parallel processing, we would write the following script:
sleep_foreach.r
library(foreach)
# library(doParallel)
# registerDoParallel(cores=getOption("mc.cores", 2L))
fsleep = function(i) {
  print(paste("starting", i))
  Sys.sleep(1)
  print("done!")
}
system.time({ foreach(i=1:10) %dopar% fsleep(i) })
submit_sleep_foreach.sh
#!/bin/sh
#SBATCH --account su105
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 1024M
#SBATCH --time 0:10:00
export MC_CORES=$SLURM_CPUS_ON_NODE
srun Rscript sleep_foreach.r
Then run in the terminal:
sbatch submit_sleep_foreach.sh
We will get the following warning message:
Warning message: executing %dopar% sequentially: no parallel backend registered
To register the backend, we can uncomment the commented lines in sleep_foreach.r. Now it also finishes within the expected 2 seconds instead of 10.
Solution makePSOCKcluster
sleep_psock.r
library(parallel)
cl = makePSOCKcluster(getOption("mc.cores"))
fsleep = function(i) {
  print(paste("starting", i))
  Sys.sleep(1)
  print("done!")
}
system.time({ parLapply(cl, 1:10, fsleep) })
stopCluster(cl)
submit_sleep_psock.sh
#!/bin/sh
#SBATCH --account su105
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 1024M
#SBATCH --time 0:10:00
export MC_CORES=$SLURM_CPUS_ON_NODE
srun Rscript sleep_psock.r
Then run in the terminal:
sbatch submit_sleep_psock.sh
This will report that it ran in 2 seconds.
Solution future
sleep_future.r
library(future.apply)
plan(multicore)
fsleep = function(i) {
  print(paste("starting", i))
  Sys.sleep(1)
  print("done!")
}
system.time({ future_lapply(1:10, fsleep) })
submit_sleep_future.sh
#!/bin/sh
#SBATCH --account su105
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 1024M
#SBATCH --time 0:10:00
export MC_CORES=$SLURM_CPUS_ON_NODE
srun Rscript sleep_future.r
Then run in the terminal:
sbatch submit_sleep_future.sh
This will report a runtime of a bit over the 2 seconds we saw previously. The reason is that the future framework adds some overhead on top of what the parallel package provides.