In the field of geo-information, the amount of available data is staggering. This Big Data is often too large to download and too intensive to process on a personal computer. Therefore, data providers such as the European Space Agency and research organisations like SURFsara now offer virtual machines (VMs) running on their own cloud infrastructure for scientific purposes, so that this Big Data can be processed faster and more efficiently. These cloud VMs typically run Linux distributions and have ample resources (processor cores, RAM, hard disk space) available.
Let’s start with SURFsara cloud. First go to https://ui.hpccloud.surfsara.nl/ and log in with the credentials that have been provided to you. Each student has a unique account. Then, click on your username and go to Settings. Click “Update SSH Key” and paste the public key you generated in the previous lesson into the text box, and click the “Update SSH Key” button.
Hint: the easiest way to find it is by running Git GUI, through Help → Show SSH Key. In the PC lab the SSH keys are stored on the M: drive, so they are unique to the WUR user account. If you want to access the cloud from your own device, you will need to update the SSH key in your SURFsara account to the one on your device.
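If you prefer the command line, the public key can also be printed from Git Bash. The following is a minimal sketch, assuming the key pair from the previous lesson lives in the default ~/.ssh directory; the function name show_public_key is made up for illustration.

```shell
# Print the first public key (*.pub) found in a directory.
# Assumption: keys are in ~/.ssh unless another directory is given.
show_public_key() {
    key_dir="${1:-$HOME/.ssh}"
    for key_file in "$key_dir"/*.pub; do
        # If nothing matches, the pattern stays unexpanded, so check it is a real file.
        if [ -f "$key_file" ]; then
            cat "$key_file"
            return 0
        fi
    done
    echo "No public key found in $key_dir" >&2
    return 1
}
```

Running show_public_key without an argument prints the key so that you can copy it into the text box. If your keys are stored elsewhere, pass that directory as the first argument.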
You can start a virtual machine by going to the VMs tab and clicking the + button. Select the “Ubuntu 16.04 GeoScripting” template and press “Create” (do not change the size of the disks). Wait a bit (several minutes) until your VM enters the ready state (icon turns green). Note that the page does not update automatically: you need to click the Refresh (circling arrows) button in the SURFsara UI (not in your browser!) to update the view.
There are two ways to log into your virtual machine: through a graphical interface using the X2Go Client software, and through a command line using Git Bash. We will mainly use the graphical interface in this course. To do that, launch X2Go Client on Windows, and the Session Preferences window for a new session will open automatically. Fill in the details as follows:
In addition, in the Connections tab you can set the compression method to “4k-png”. This is a lower-colour setting that saves bandwidth and thus makes the desktop more responsive.
Press OK, and from here on you can press the bubble on the right-hand side to launch the desktop of the virtual machine. You will see a Host key verification failed warning; this is expected every time you launch a new VM, so answer Yes. You might also see a warning about shared folders and printers; this is safe to ignore. Turn off folder and printer sharing to make the message go away.
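The command-line route mentioned above uses the ssh program from Git Bash. A minimal sketch, assuming the user on the VM is ubuntu (the user name used in the commands later in this lesson); is_ipv4 and vm_login are hypothetical helper names, and the address check is deliberately loose.

```shell
# Succeed only if the argument looks like a dotted IPv4 address.
# This is a loose format check: it does not verify that each part is <= 255.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Log in to the VM as the ubuntu user, refusing obviously wrong input
# such as a forgotten <ip> placeholder.
vm_login() {
    if ! is_ipv4 "$1"; then
        echo "usage: vm_login <IP address of your VM>" >&2
        return 1
    fi
    ssh "ubuntu@$1"
}
```

In Git Bash, vm_login followed by the IP address of your VM opens a shell on the VM; type exit to leave it. As with X2Go, expect to be asked to confirm the host key the first time you connect to a new VM.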
Very important: Remember to properly shut down VMs that you are not actively using, as they take up precious resources from the cloud! If you do not, other students might not be able to start their own VMs.
To properly shut down your VM, first you need to make sure your computer is no longer connected to it. If you have X2Go running, open the X2Go Client window again, and press the Terminate button on the lower right part of the main pane. This will gracefully log your user out.
Then on the SURFsara website, select your VM, press the Power off button and confirm Send the power off signal. Wait for the state to become OFF, then press the Delete button and confirm deletion.
Important: The resources are only freed for other students once your VM is deleted (that is, it no longer shows up in the VMs tab at all)!
If you change something, then restart the VM, you may notice that your changes have been lost. This is because by default VM hard drives are not persistent. You need to clone them and set them as persistent to make the changes stick.
To do that, when your VM is in the OFF state, press the green floppy disk icon. Give your new template a descriptive name (for example, your name or the name of your group) and choose “Persistent”. Wait a while until your changes are saved.
From here on, as long as you start the VM from the template in the Saved tab, the changes you make will be saved.
Important: Do not make more than one VM persistent, as it might go over the hard disk quota! If you want to start fresh, make sure your previous saved template is deleted first.
You might want to transfer files to or from the VM. There are two ways to do so. The more convenient method is to use Network File System (NFS) to access the VM drive as if it were a local hard drive. To do that, you need to establish a secure connection to the VM and map the network drive to your PC.
First, start Git Bash. This will open a terminal window. In it, enter:
ssh -N -L 2049:localhost:2049 -L 2050:localhost:2050 -L 111:localhost:111 ubuntu@<ip>
where <ip> is the IP address of your VM. This will establish a secure SSH connection to the VM and forward ports 2049, 2050 and 111 on your PC to the same ports on the VM (the console appearing to hang with seemingly nothing happening is good and expected). You can use this method for other applications and ports as well (such as to access RStudio Server on some VMs). When you are done, to disconnect the secure connection, press Ctrl+C in the Git Bash window, which brings you back to the prompt.
To map the network drive, there are also two options. The GUI option is to open My Computer, press the Map Network Drive button, and enter
localhost:/home/ubuntu/userdata (use localhost here rather than the IP of your VM, because the connection goes through the SSH tunnel). Double-clicking on the drive, you will see a folder with nothing but an inaccessible
lost+found folder. It is the /home/ubuntu/userdata directory on the VM, and you can drag and drop files and folders to and from it. To disconnect, right-click on the network drive and click Disconnect.
Note: If Windows treats the files you create as read-only when you use the GUI option, try mounting the remote drive from the command line instead. To do that, run
cmd.exe (Command Prompt) from the Start menu, then enter
mount -o fileaccess=777 localhost:/home/ubuntu/userdata Z:. This will make the drive appear in My Computer just like the GUI method did, with all new files being treated as read-write.
Another, less complicated but also less convenient method to transfer files is the Secure File Transfer Protocol (SFTP). Its command-line client, sftp, comes with SSH. You can use it to transfer any file you can access using SSH, so it is not limited to prespecified directories like NFS is. This means you can also use it to get or send files outside of your userdata directory.
To use it, open Git Bash again, and enter:
sftp ubuntu@<ip>. This will give you a prompt in which you can enter commands. Use
get <file> to download files and
put <file> to upload them, where
<file> is the path to the file. You can change the directory you download the files to by using
lcd <path>, and the directory you download from by using
cd <path>. Enter
? for a complete list of commands,
exit to quit.
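For transfers you repeat often, the same commands can be placed in a batch file and passed to sftp with its -b option. Batch mode cannot ask for input, so this assumes your SSH key can be used without typing a passphrase. A minimal sketch; make_sftp_batch and the file names are hypothetical.

```shell
# Write an sftp batch file: one command per line, in the order they should run.
# Arguments: <batch file> <remote directory> <local directory> <file>...
make_sftp_batch() {
    batch_file="$1"
    remote_dir="$2"
    local_dir="$3"
    shift 3
    {
        echo "cd $remote_dir"
        echo "lcd $local_dir"
        for file_name in "$@"; do
            echo "get $file_name"
        done
    } > "$batch_file"
}

# Hypothetical example: download two files from the userdata directory.
make_sftp_batch fetch.batch /home/ubuntu/userdata . results.csv map.tif
cat fetch.batch

# Then run it against your VM (replace <ip> first):
#   sftp -b fetch.batch ubuntu@<ip>
```

The helper does no quoting, so file names containing spaces would need to be quoted by hand in the batch file. Replacing get with put in the loop would give the upload variant.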