CLUSTER MANUAL

Transferring files to/from psychp01

What's on this page:

  1. Transferring data
  2. Transferring code
  3. File Transfer Clients

1. Transferring data

You need to upload your data into the folder that IT created for you within the /MRIWork folder on the cluster. Your folder name will be something like MRIWork#, where # stands for a number assigned to you. For example, if your number is 25, the path for uploading your data to the cluster is:

/MRIWork/MRIWork25

The most secure way to transfer data is with command-line File Transfer Clients such as sftp and scp.
To learn more about /MRIWork, the folder structure, data archiving, data backup and data sharing, have a look at the CUBIC wiki.

Please NOTE: in principle, you can also upload your data into your home directory (see Transferring code below) and save your analysis results there. However, the home directory has limited storage space and is not periodically backed up. It is therefore advisable to save your data and analysis results in your /MRIWork/MRIWork# folder, even if you do not have MRI data and do not expect your data to take up much space. This avoids problems with storage space and minimizes the risk of data loss.

2. Transferring code

You can upload your scripts into a folder within your home directory on the cluster. IT will probably have created a folder in the home directory named with the first letter of your first name followed by your surname (e.g., gbellucci). Hence, if your name is Gabriele Bellucci, the path for uploading your code to the cluster will be:

/home/gbellucci

Your folder in the home directory has the same name as the username you use to access the cluster. This username was provided to you by IT when you asked for access. Be aware that your folder name may deviate from this pattern (especially if you were granted access to the cluster before 2024). If you have access to the cluster, just log in using ssh and type pwd to see your folder name in the home directory. Something like the line above will be printed on your command line.
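As a minimal sketch of the check described above, the snippet below builds the ssh command that prints your remote home directory without opening an interactive session. The username gbellucci is the example used on this page; the function name remote_home_cmd is an illustrative assumption, not something provided on the cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the ssh command that prints the remote home directory.
CLUSTER_HOST="psychp01.rhul.ac.uk"

remote_home_cmd() {
    # $1 = your cluster username (e.g. gbellucci)
    printf 'ssh %s@%s pwd\n' "$1" "$CLUSTER_HOST"
}

# Print the command only; nothing is sent to the cluster by this script.
# Copy the printed line into your terminal to run it (you will be asked for your password).
remote_home_cmd gbellucci
```

Passing a command after the address makes ssh run that single command on the cluster and return, which is convenient when you only need to look up a path.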

The most secure way to transfer code is using File Transfer Clients like sftp and scp via the command line. Please see below for how to use these clients and page 20 of the pdf file on the home page for a demonstration video.

3. File Transfer Clients

Primary access to psychp01 is via ssh-based tools (on the command line). To upload or download data and code, File Transfer Clients such as scp and sftp can be used.

To transfer data to and from psychp01 use the following address:

psychp01.rhul.ac.uk

SFTP

sftp, which stands for SSH File Transfer Protocol (also commonly expanded as Secure File Transfer Protocol), is an encrypted file transfer protocol that runs over SSH and provides commands for transferring files between your local machine and a remote system over a secure connection. There are many resources on the web on how to use sftp (e.g., here). Here, example applications to transfer data onto psychp01 will be shown.
First, you need to establish a secure connection with the server. This is very similar to how you would connect to the server using ssh (see here).

sftp username@psychp01.rhul.ac.uk

As for the ssh connection, “username” is the username provided to you by IT when you asked for access to the cluster. Hit enter and you will be asked for your password. Once you are connected, the prompt at the beginning of your command line shows that a connection has been established:

sftp>

Now, you can use sftp commands to (among other things) upload, download, remove, and move files. Type help to list all available commands.

sftp> help

The sftp connection puts you on the cluster. Here, you can use the sftp versions of common commands (such as pwd, cd and ls) to get the current directory, change the current directory and so on. To run the same kind of command on your local computer, add an “l” in front of the command (e.g., lpwd, lcd, lls). This “l” stands for “local” and tells sftp to run the command on the local machine as opposed to the remote one.
For instance, when you establish an sftp connection, you will find yourself in your home directory. Hence, if your home directory path is /home/gbellucci, when you type pwd, you will see the second line of the code below appearing:

sftp> pwd
Remote working directory: /home/gbellucci

In contrast, if you type lpwd, you will see the second line of the code below appearing:

sftp> lpwd
Local working directory: /Users/Gab

where /Users/Gab is my (local) current directory on my computer. Type help to see the difference in the commands for remote and local implementations.

To download data from the cluster onto your local directory, you need to use the get command, like this:

sftp> get remote_filename_path local_dirpath

For example, to get a file called results_matrix.mat from the folder results in your home directory /home/gbellucci and download it into the folder project_results in your local directory /Users/Gab, type:

sftp> get /home/gbellucci/results/results_matrix.mat /Users/Gab/project_results

Alternatively, you can cd to results (on the cluster), lcd to project_results (on your local machine), and then just type get results_matrix.mat, like this:

sftp> cd /home/gbellucci/results
sftp> lcd /Users/Gab/project_results
sftp> get results_matrix.mat

NOTE: If a folder name contains spaces, an unquoted path will fail. For instance, sftp> lcd /Users/Gab/project results (i.e., a folder named “project results” with a space) would not work! Enclose such a path in quotes (sftp> lcd "/Users/Gab/project results") or, better, avoid spaces in folder names.
To download a folder, use the -r argument like this:

sftp> get -r remote_dirpath local_dirpath

Conversely, to upload data from your local machine to the cluster, use the put command:

sftp> put local_filename_path remote_dirpath
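When several files have to be uploaded, the sftp commands can be collected in a plain text file instead of being typed one by one. The sketch below only writes and prints such a file; the file names analysis.py, run_analysis.sh and upload_commands.txt are hypothetical, and the folders are the examples used on this page.

```shell
#!/bin/sh
# Sketch: write a list of sftp commands to a file (all names below are examples).
commands_file="upload_commands.txt"

cat > "$commands_file" <<'EOF'
lcd /Users/Gab
cd /home/gbellucci/coolest_project
put analysis.py
put run_analysis.sh
EOF

# Show what would be sent; nothing is transferred by this script itself.
cat "$commands_file"

# To run the commands against the cluster, feed the file to sftp on standard input:
#   sftp gbellucci@psychp01.rhul.ac.uk < upload_commands.txt
```

With password login, sftp should still ask for your password in the terminal before it runs the listed commands. This has not been verified on psychp01, so test it with a small file first.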

In the video on page 20 of the pdf file on the main page, you will see how to transfer a Python script and a bash file to psychp01 using put.

SCP

scp (secure copy) is a command-line utility that allows you to securely copy files and directories between two locations. Using scp requires your password, and both the files and the password are encrypted in transit. scp uses the ssh protocol for both authentication and encryption. See here for more information.

When transferring data, scp takes two main arguments:

scp source destination

The first argument is the address of the source file to transfer, the second is the address it has to be transferred to. A good way to remember the order is that scp needs to know what to send and then where to send it.
For example, to transfer files from the remote cluster (source) to your local machine (destination), use:

scp username@address_name:pathname_remote_src pathname_local_dest

To transfer files from your local machine (source) to the remote cluster (destination), use:

scp pathname_local_src username@address_name:pathname_remote_dest

Suppose my username (the one given to me by IT when I got access to the cluster) is gbellucci, the file I need to transfer (e.g., a MATLAB .m file) is called best_analysis.m, the folder containing that file on my local computer is /Users/Gab, and the remote folder on the cluster I need to send my file to is /home/gbellucci/coolest_project. The line I need to type in the terminal to transfer my file will be:

scp /Users/Gab/best_analysis.m gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/coolest_project

Remember, your data should not go into your folder in the home directory but into your MRIWork# folder in /MRIWork. Hence, to upload a data file (say, data.mat), you’d need to type:

scp /Users/Gab/data.mat gbellucci@psychp01.rhul.ac.uk:/MRIWork/MRIWork25/data_coolest_project

If you have to upload or download a folder (i.e., a directory containing multiple files), you will have a directory path rather than a file path, and you can use the -r argument to copy the directory recursively, with all the files it contains, like this:

scp -r dirpath_local_src username@psychp01.rhul.ac.uk:dirpath_remote_dest

For example, if your directory path is to the folder called analyses_folder, you can type the following:

scp -r /Users/Gab/analyses_folder gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/coolest_project

If you have a whole data folder to transfer, you will upload it into your /MRIWork/MRIWork# folder like this:

scp -r /Users/Gab/data gbellucci@psychp01.rhul.ac.uk:/MRIWork/MRIWork25/data_coolest_project

If the folder is on the cluster and you need to get it onto your local computer, swap the two arguments:

scp -r username@psychp01.rhul.ac.uk:dirpath_remote_src dirpath_local_dest

For example, if your directory path is to the folder on the cluster called results_folder that you need to download into your analyses_folder on your local computer, you can type the following:

scp -r gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/results_folder /Users/Gab/analyses_folder

RSYNC

rsync, which stands for remote sync, is a remote and local file synchronization tool. It uses an algorithm to minimize the amount of data copied by only moving the portions of files that have changed. Please see here for more information.
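As a minimal sketch of typical usage, the snippet below builds an rsync command that would synchronize a local data folder with a folder in /MRIWork. The username and paths are the examples used elsewhere on this page; the choice of options is an assumption about what is usually convenient, not a site requirement.

```shell
#!/bin/sh
# Sketch: build an rsync command for syncing a local data folder to the cluster.
# The trailing slash on the source means "copy the contents of data", not the folder itself.
local_src="/Users/Gab/data/"
remote_dest="gbellucci@psychp01.rhul.ac.uk:/MRIWork/MRIWork25/data_coolest_project/"

# -a  archive mode (recursive, preserves modification times and permissions)
# -v  verbose output
# -z  compress data during the transfer
# -n  dry run: list what would be copied without copying anything
rsync_cmd="rsync -avzn $local_src $remote_dest"

# Print the command instead of running it here.
echo "$rsync_cmd"
```

Running the printed command first with -n shows which files would be transferred. Removing the n then performs the actual copy, and repeating the same command later only sends files that have changed.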

SSHFS

sshfs allows you to mount a folder of the cluster's file system on your local machine over ssh, so that remote files appear inside a local folder. See here for more details. Basic usage for Linux users:

sshfs username@psychp01.rhul.ac.uk:dirpath mountpoint [options]
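For illustration, the sketch below prints the commands a Linux user might use to mount the example home directory and to unmount it again. The mount point psychp01_home is a hypothetical local folder name, and fusermount -u is the usual unmount command on Linux systems with FUSE (some newer systems name it fusermount3).

```shell
#!/bin/sh
# Sketch for Linux: commands to mount and later unmount a cluster folder.
# The mount point below is a hypothetical, empty local folder.
remote="gbellucci@psychp01.rhul.ac.uk:/home/gbellucci"
mountpoint="$HOME/psychp01_home"

mount_cmd="sshfs $remote $mountpoint"
unmount_cmd="fusermount -u $mountpoint"

# Print the commands in the order you would run them; nothing is mounted here.
echo "mkdir -p $mountpoint"
echo "$mount_cmd"
echo "$unmount_cmd"
```

The mount point has to exist and be empty before sshfs is called, which is why the mkdir line comes first.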

FileZilla

FileZilla is a free and open-source file transfer client that supports the ftp, ftps and sftp protocols. It offers the same kind of transfers as the above command-line programs through a graphical interface. Please have a look at this step-by-step guide on how to use FileZilla.

ExpanDrive

An alternative to File Transfer Clients like those mentioned above is ExpanDrive. ExpanDrive is a network filesystem client for macOS, Microsoft Windows and Linux that maps remote storage (including many different types of cloud storage) as a local volume. It differs from the above File Transfer Clients because it is integrated into all applications on the operating system and does not require a file to be downloaded onto the local machine. Instead, remote files can be accessed, managed and changed as if they were stored locally.

The downside is that it is a non-free commercial tool.

cluster-file_transfer.txt · Last modified: 2024/05/09 12:48 by gabriele
