====== CLUSTER MANUAL ======

====== Transferring files to/from psychp01 ======

What's on this page:\\
  - Transferring data
  - Transferring code
  - File Transfer Clients

===== 1. Transferring data =====

You need to upload your data into the folder that IT created for you within the ''/MRIWork'' folder on the cluster. Your folder name will be something like ''MRIWork#'', where ''#'' stands for a number assigned to you. Hence, if your number is “25”, the path for uploading your data to the cluster will be:

  /MRIWork/MRIWork25

The most secure way to transfer data is to use File Transfer Clients like ''sftp'' and ''scp'' via the command line.\\
To learn more about ''/MRIWork'', folder structure, data archiving, data backup and data sharing, have a look at the [[data-storage|CUBIC wiki]].

Please NOTE: in principle, you can also upload your data into your home directory (see Transferring code below) and save your analysis results there. However, the home directory has a limited amount of storage space and is **not** periodically backed up. Hence, it is advisable to save your data and analysis results in your ''/MRIWork/MRIWork#'' folder, even if you do not have MRI data and do not think your data take up much space. This avoids problems with storage space and minimizes the risk of data loss.

===== 2. Transferring code =====

You can upload your scripts into a folder within your home directory on the cluster. IT will probably have created a folder in the home directory named with the first letter of your first name followed by your surname (e.g., ''gbellucci''). Hence, if your name is Gabriele Bellucci, the path for uploading your code to the cluster will be:

  /home/gbellucci

Your folder name in the home directory is the same as the username you use to access the cluster. This name was provided to you by IT when you asked for access.
Be aware that your folder name in the home directory may deviate from this pattern (especially if you were granted access to the cluster before 2024). If you have access to the cluster, just log in using ''ssh'' and type ''pwd'' to see your folder name in the home directory. Something like ''/home/gbellucci'' will appear on your command line.

The most secure way to transfer code is to use File Transfer Clients like ''sftp'' and ''scp'' via the command line. Please see below for how to use these clients, and page 20 of the pdf file on the [[cluster-guide|home page]] for a demonstration video.

===== 3. File Transfer Clients =====

Primary access to psychp01 is via ssh-based tools (on the command line). To upload or download data and code, File Transfer Clients such as ''scp'' and ''sftp'' can be used. To transfer data to and from psychp01, use the following address:

  psychp01.rhul.ac.uk

==== SFTP ====

''sftp'', which stands for SSH File Transfer Protocol (often also called Secure File Transfer Protocol), is a protocol that runs over SSH and provides commands for transferring files between two systems over an encrypted connection. There are many resources on the web on how to use ''sftp'' (e.g., [[https://www.digitalocean.com/community/tutorials/how-to-use-sftp-to-securely-transfer-files-with-a-remote-server|here]]). Here, example applications for transferring data onto psychp01 are shown.\\
First, you need to establish a secure connection with the server. This is very similar to how you would connect to the server using ''ssh'' (see [[cluster-access|here]]).

  sftp username@psychp01.rhul.ac.uk

As for the ssh connection, “username” is the username provided to you by IT when you asked for access to the cluster. Hit enter and you will be asked for your password. Once you are connected, the prompt at the beginning of your command line shows that a connection has been established:

  sftp>

Now, you can use ''sftp'' commands to (among other things) upload, download, remove, and move files.
Type ''help'' to check all commands available.

  sftp> help

The ''sftp'' connection puts you on the cluster. Here, you can use many of the [[cluster-linux|common commands]] you would use on your local machine, for example to get the current directory, change the current directory and so on. If you would like to use the same commands on //your local computer//, you can do that by adding an “l” in front of the command you want to use. This “l” stands for “local” and tells ''sftp'' to run the command on the local machine as opposed to the remote one.\\
For instance, when you establish an ''sftp'' connection, you will find yourself in your home directory. Hence, if your home directory path is ''/home/gbellucci'', when you type ''pwd'', you will see the second line of the code below appearing:

  sftp> pwd
  Remote working directory: /home/gbellucci

By contrast, if you type ''lpwd'', you will see the second line of the code below appearing:

  sftp> lpwd
  Local working directory: /Users/Gab

where ''/Users/Gab'' is my (local) current directory on my computer. Type ''help'' to see the difference between the commands for remote and local use.

To download data from the cluster into your local directory, you need to use the ''get'' command, like this:

  sftp> get remote_filename_path local_dirpath

For example, if you have to get a file called ''results_matrix.mat'' from the folder ''results'' in your home directory ''/home/gbellucci'' and download it into the folder ''project_results'' in your local directory ''/Users/Gab'', you will do:

  sftp> get /home/gbellucci/results/results_matrix.mat /Users/Gab/project_results

Alternatively, you can ''cd'' to ''results'' (on the cluster), ''lcd'' to ''project_results'' (on your local machine), and then just type ''get results_matrix.mat'', like this:

  sftp> cd /home/gbellucci/results
  sftp> lcd /Users/Gab/project_results
  sftp> get results_matrix.mat

__NOTE__: If a folder name contains spaces, ''sftp'' will fail unless the path is enclosed in quotes.
For instance, something like ''sftp> lcd /Users/Gab/project results'' (i.e., a results folder named “project results”, with a space) would not work unless the path is enclosed in quotes!\\
If you have to download a folder, you will need to use the ''-r'' option, like this:

  sftp> get -r remote_dirpath local_dirpath

Conversely, if you have to upload data from your local machine to the cluster, you will need to use the ''put'' command:

  sftp> put local_filename_path remote_dirpath

In the video on page 20 of the pdf file on the [[cluster-guide|main page]], you will see how to transfer a Python script and a bash file to psychp01 using ''put''.

==== SCP ====

''scp'' (secure copy) is a command-line utility that allows you to securely copy files and directories between two locations. Using ''scp'' requires a password, and both the files and the password are encrypted in transit. ''scp'' uses the ''ssh'' protocol for both authentication and encryption. See [[https://linuxize.com/post/how-to-use-scp-command-to-securely-transfer-files/|here]] for more information.

When transferring data, ''scp'' takes two main arguments:

  scp source destination

The first argument is the address of the source file to transfer; the second is the address it should be transferred to.
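The argument order can be checked safely before any real transfer. The following is a minimal sketch, not part of the cluster setup: the helper names ''scp_upload_cmd'' and ''scp_download_cmd'' are hypothetical, and the functions only print the ''scp'' command line instead of running it. The username and paths are the example values used on this page.

```shell
#!/bin/sh
# Sketch only: print scp command lines so the source/destination order
# can be inspected. Nothing is transferred.

CLUSTER_HOST="psychp01.rhul.ac.uk"   # cluster address given on this page

# Upload: local path is the source (first), remote path is the destination (second).
# $1 = username, $2 = local source path, $3 = remote destination path
scp_upload_cmd() {
    printf 'scp %s %s@%s:%s\n' "$2" "$1" "$CLUSTER_HOST" "$3"
}

# Download: remote path is the source (first), local path is the destination (second).
# $1 = username, $2 = remote source path, $3 = local destination path
scp_download_cmd() {
    printf 'scp %s@%s:%s %s\n' "$1" "$CLUSTER_HOST" "$2" "$3"
}

scp_upload_cmd gbellucci /Users/Gab/best_analysis.m /home/gbellucci/coolest_project
scp_download_cmd gbellucci /home/gbellucci/results/results_matrix.mat /Users/Gab/project_results
```

Once the printed line looks right, it can be copied to the terminal and run as an ordinary ''scp'' command.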
A good way to remember this is that ''scp'' needs to know ''what'' to send and ''where'' to send it.\\
For example, to transfer files from the remote cluster (source) to your local machine (destination), use:

  scp username@address_name:pathname_remote_src pathname_local_dest

To transfer files from your local machine (source) to the remote cluster (destination), use:

  scp pathname_local_src username@address_name:pathname_remote_dest

Suppose my username (the one given to me by IT when I got access to the cluster) is ''gbellucci'', the file I need to transfer (e.g., a MATLAB ''.m'' file) is ''best_analysis.m'', the folder containing that file on my local computer is ''/Users/Gab'', and the remote folder on the cluster I need to send my file to is ''/home/gbellucci/coolest_project''. The line I need in the terminal to transfer my file will be:

  scp /Users/Gab/best_analysis.m gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/coolest_project

Remember, your data will not be in your folder in the home directory but in your ''MRIWork#'' folder in ''/MRIWork''.
Hence, to upload a data file (say, ''data.mat''), you’d need to type:

  scp /Users/Gab/data.mat gbellucci@psychp01.rhul.ac.uk:/MRIWork/MRIWork25/data_coolest_project

If you have to upload or download multiple files, or a folder that contains multiple files, you will have a directory path (rather than a file path), and you can use the ''-r'' option to copy all the files it contains recursively, like this:

  scp -r dirpath_local_src username@psychp01.rhul.ac.uk:dirpath_remote_dest

For example, to upload the folder called ''analyses_folder'', you can type the following:

  scp -r /Users/Gab/analyses_folder gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/coolest_project

If you have a whole data folder to transfer, you will upload it into your ''/MRIWork/MRIWork#'' folder like this:

  scp -r /Users/Gab/data gbellucci@psychp01.rhul.ac.uk:/MRIWork/MRIWork25/data_coolest_project

If the folder is on the cluster and you need to get it onto your local computer, swap the two arguments:

  scp -r username@psychp01.rhul.ac.uk:dirpath_remote_src dirpath_local_dest

For example, to download the folder called ''results_folder'' from the cluster into your ''analyses_folder'' on your local computer, you can type the following:

  scp -r gbellucci@psychp01.rhul.ac.uk:/home/gbellucci/results_folder /Users/Gab/analyses_folder

==== RSYNC ====

''rsync'', which stands for //remote sync//, is a remote and local file synchronization tool. It uses an algorithm that minimizes the amount of data copied by transferring only the portions of files that have changed. Please see [[https://www.digitalocean.com/community/tutorials/how-to-use-rsync-to-sync-local-and-remote-directories|here]] for more information.

==== SSHFS ====

''sshfs'' allows you to mount the remote file system on your local machine. See [[https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh|here]] for more details.
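In practice, a mount usually needs an existing, empty local folder as the mount point and a matching unmount step afterwards. The following is a rough sketch under assumptions: the helper names and the mount point ''$HOME/psychp01_mount'' are hypothetical, and ''fusermount -u'' is the usual unmount command for FUSE file systems on Linux (other systems may differ). The helpers only print the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Sketch only: print the commands for mounting and unmounting a cluster
# folder with sshfs. Nothing is mounted by this script.

CLUSTER_HOST="psychp01.rhul.ac.uk"   # cluster address given on this page

# $1 = username, $2 = remote directory, $3 = local mount point (hypothetical)
sshfs_mount_cmd() {
    printf 'mkdir -p %s && sshfs %s@%s:%s %s\n' "$3" "$1" "$CLUSTER_HOST" "$2" "$3"
}

# $1 = local mount point to release (Linux, FUSE)
sshfs_unmount_cmd() {
    printf 'fusermount -u %s\n' "$1"
}

sshfs_mount_cmd gbellucci /MRIWork/MRIWork25 "$HOME/psychp01_mount"
sshfs_unmount_cmd "$HOME/psychp01_mount"
```

While the folder is mounted, the remote files appear under the mount point and can be opened like local files.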
Basic usage for Linux users:

  sshfs username@psychp01.rhul.ac.uk:dirpath mountpoint [options]

==== FileZilla ====

''FileZilla'' is a free and open-source //File Transfer Protocol (FTP)// client that supports the ''ftp'', ''ftps'' and ''sftp'' protocols. It offers the file transfers provided by the above command-line programs through a graphical interface. Please have a look at [[http://54.236.43.240/doku.php?id=data-download|this step-by-step guide]] on how to use FileZilla.

==== ExpanDrive ====

An alternative to File Transfer Clients like the ones mentioned above is [[https://www.expandrive.com/|ExpanDrive]]. ExpanDrive is a network filesystem client for macOS, Microsoft Windows and Linux that facilitates mapping a local volume to many different types of cloud storage. It differs from the above File Transfer Clients in that it is integrated into all applications on the operating system and does not require a file to be downloaded onto the local machine. Instead, remote files can be accessed, managed and changed as if they were stored locally. The downside is that it is a non-free commercial tool.

[[{:backward_arrow.png?40|width: 12em}cluster-analyses|Running analyses on psychp01]][[{:forward_arrow.png?40|width: 12em}cluster-batch|Bash files and Batch system]]\\
[[{:toc.png?40|width: 12em}cluster-toc|Return to Table of Contents]][[{:main_page.png?40|width: 12em}cluster-guide|Return to main page]]

~~DISCUSSION|Discussion~~