CLUSTER MANUAL
Getting Started with HPC on psychp01
What's on this page:
- Prerequisites
- High Performance Computing on psychp01
- First Steps on psychp01
1. Prerequisites
Psychp01 is a High Performance Computing (HPC) cluster running as a virtualized system on Debian 11, a Linux operating system. You will need to become familiar with the Linux command line interface to use it effectively. While learning a bit of Linux can be time-consuming, it is an investment in your scientific skills.
This YouTube video starts from the very basics of the Linux command line, and Fig. 1 shows a handy Linux cheat sheet.
You can also find some useful tutorials on HPC clusters here. In addition, the Department of Psychology has a GitHub page with tutorials and resources to help you get started, guide you through some practicalities, and provide useful code snippets.
For any technical assistance, please submit a ticket to the IT service desk at itservicedesk@rhul.ac.uk or email Gabriele.Bellucci@rhul.ac.uk.
Figure 1. A nice cheat sheet for Linux
2. High Performance Computing on psychp01
Objective
Here you will get to know psychp01 and learn how to connect to it.
What's psychp01?
Psychp01 is a virtual computer cluster in a cloud environment at Royal Holloway. It provides a local HPC resource with an end-user experience similar to most Linux HPC clusters.
The currently available resources are 128 cores and 256 GB of memory. The server mounts NFS filesystems from a NetApp file server (see Fig. 2).
Psychp01 utilizes Linux (Debian), a batch scheduler (SLURM), and various software packages deployed using a module system.
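For example, software deployed through the module system is typically listed and loaded with the standard module commands (the module name below is purely illustrative; run module avail to see what is actually installed):

    module avail          # list all available software modules
    module load fsl       # load a module into your environment (name is illustrative)
    module list           # show the modules currently loaded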
There will also be an option for Singularity containers, which enable users to bring their own software environment and preserve it in the name of reproducible research.
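As a minimal sketch of that workflow (assuming Singularity is available on the system; the image is illustrative), you could pull a container and run a command inside it:

    singularity pull docker://ubuntu:22.04                   # fetch an image from Docker Hub
    singularity exec ubuntu_22.04.sif cat /etc/os-release    # run a command inside the container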
Figure 2. Psychology cluster specifications
3. First Steps on psychp01
New HPC users are strongly advised to explore the many tutorials and documentation resources available on the web, for example the HPC Cluster Tutorials or our RHULPsychology GitHub page; see Prerequisites above.
Before starting, you will need a short introduction to the usage of the cluster and the current guidelines in place at the department. For that, please contact Gabriele.Bellucci@rhul.ac.uk.
On psychp01, there are four main locations you need to get familiar with:
- /home
- /MRIWork
- /MRIArchive
- /MRIRaw
We will explain step by step how to move to these locations and what they are for. Importantly, data in the /MRI* locations will be backed up to avoid data loss.
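As a quick sketch of moving around (the folder name yourname is hypothetical; use the workspace folder the IT team creates for you):

    cd /MRIWork/yourname    # change into your workspace folder
    ls -lh                  # list its contents
    pwd                     # confirm where you are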
See also this CUBIC wiki page for further details on data structure, storage, backup and data sharing.
Account and Password
Access to psychp01 can be made available to all staff members with an @rhul.ac.uk email address. To request an account for HPC access, please send an email to itservicedesk@rhul.ac.uk. Creating your account and the corresponding access to the system is expected to take 2-3 days; someone from the Psychology IT team will get in touch with you.
When you get confirmation, you will be able to connect to the cluster.
Access to psychp01
You can access psychp01 through the command line (for computing purposes) as well as through a Graphical User Interface (GUI; for visualization purposes only). The ways of accessing the cluster are described here.
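As a minimal sketch of command-line access (the hostname is an assumption; use the address provided by IT when your account is created):

    ssh username@psychp01.rhul.ac.uk    # replace username with your own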
Psychp01 Environment
Once you have logged into psychp01, you are in a basic Linux Debian command line environment. You will need to be familiar with the basics of the Linux command line interface to use psychp01. Luckily, there are many good tutorials on the web to help with this.
On psychp01 you can set up compute jobs and submit them for processing. You have an interactive environment that lets you edit files, write scripts, load software modules, and compile programs. You can also download resources from the internet, such as Git repositories or Singularity containers.
Psychp01 is a batch computing system, which means you must submit your computational work to a job scheduler, in our case SLURM. To submit a job to the scheduler you will need to create a job script. Creating job scripts is central to batch HPC cluster computing; if you are not familiar with batch jobs and SLURM, please see the Running Jobs section of this document.
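As an illustrative sketch (the resource requests and file names are assumptions; see the Running Jobs section for the settings that apply on psychp01), a minimal SLURM job script might look like:

    #!/bin/bash
    #SBATCH --job-name=myjob         # a label for your job
    #SBATCH --cpus-per-task=4        # CPU cores requested
    #SBATCH --mem=8G                 # memory requested
    #SBATCH --time=01:00:00         # wall-time limit (hh:mm:ss)
    #SBATCH --output=myjob_%j.log    # log file (%j expands to the job ID)

    echo "Running on $(hostname)"
    # your analysis commands go here

You would then submit and monitor it with:

    sbatch myjob.sh     # submit the script to SLURM
    squeue -u $USER     # list your queued and running jobs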
Files and Data Access
When you log in to psychp01 via ssh, you will be in your cluster home directory:
/home/username
There will be a quota of 1.5 TB on this directory. This is different from your campus home directory or network file share. It is the place to set up your programs and scripts for jobs that will be submitted to run on the compute resources.
Given the limited space of your home directory, no data should be uploaded to it. Instead, place all your data in /MRIWork, which is your workspace. A specific workspace folder will be created for you by the Psychology IT team and will be regularly backed up. Users are encouraged to keep the workspace clean and to delete cached and unnecessary files at regular intervals.
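To keep an eye on your usage (a sketch; exact quota reporting depends on the filesystem setup):

    du -sh ~            # total size of your home directory
    df -h /MRIWork      # free space on the workspace mount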
Data Staging
To move files from your computer to psychp01 or vice versa, you may use any tool that works with ssh.
On Linux and macOS, these include scp, sftp, rsync, and similar programs. Please see Transferring files to/from psychp01.
On Windows, you may use VNC.
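As a sketch (the hostname and remote folder are assumptions; use your own workspace path), copying data from your computer to the cluster might look like:

    # copy a single file to your workspace on the cluster
    scp mydata.nii.gz username@psychp01.rhul.ac.uk:/MRIWork/yourname/

    # synchronize a whole directory; -avP preserves attributes and shows progress
    rsync -avP dataset/ username@psychp01.rhul.ac.uk:/MRIWork/yourname/dataset/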