Advanced workshop ex2

PURPOSE

In this exercise, we will learn how to start and access a Linux server on AWS (in AWS jargon, this is called spinning up or launching an instance). We will also learn how to transfer files between your local computer and the remote AWS instance.

DATA

A set of paired-end reads generated by an Illumina MiSeq sequencer using the Nextera XT library preparation and the 2x300-cycle MiSeq sequencing kit. The sequenced isolate is Staphylococcus aureus ATCC 25923.

File 1: SRR4114395_1.fastq.gz

File 2: SRR4114395_2.fastq.gz

These files will also be used in Ex. 3.

Note: As the commands are getting longer, you might want to copy and paste them from the exercise guide to the command prompt instead of typing them in character by character. You can copy text as you usually would, but to paste it at the command prompt, right-click and select “Paste”. If that doesn't work, try copying as usual and then pasting by selecting the AWS terminal window, holding down the key marked "alt", and left-clicking with your mouse.

WORKFLOW

Accessing AWS control panel

Prior to the workshop, an AWS user account has been generated for each participant using a temporary email address that should be used for login. The temporary email address and the password associated with it will be handed out to you. If you wish to continue using AWS when the workshop is over, you should sign up to AWS using your own email address.

To log in to the AWS control panel, go to the AWS login page: https://console.aws.amazon.com

Note: You will not have access to billing information as you are only a sub-user under the main GoSeqIt account.

Launching an AWS instance

Once you are logged in, go to Services (top bar) > EC2 (under Compute).

In the top bar to the right, you can see which AWS location you are accessing. Use the dropdown menu to select “EU (Frankfurt)”.

Now you are ready to launch an instance. Click the blue button marked “Launch Instance”. For a thorough description of how to launch AWS instances, you could go through this tutorial: https://aws.amazon.com/getting-started/tutorials/launch-a-virtual-machine/ , but you don’t need to do that just now. Instead follow these simple steps:

Step 1: Choose an Amazon Machine Image (AMI):

Select the option “Amazon Linux AMI 2018.03.0 (HVM), SSD Volume Type”.

Step 2: Choose an Instance Type:

Keep the default selection “t2.micro”.

Click the button “Next: Configure Instance Details”.

Step 3: Configure Instance Details:

Change Shutdown behaviour from “Stop” to “Terminate”.

Click “Next: Add Storage”.

Step 4: Add Storage:

Keep the default storage size of 8 GiB.

Click “Next: Add Tags”.

Step 5: Add Tags:

You do not need to change anything under “Add Tags”.

Click “Next: Configure Security Group”.

Step 6: Configure Security Group:

Under “Source”, change the IP addresses that can access the instance from “Custom” to “My IP”.

Click “Review and Launch”.

This will take you to a review of your selections. If everything looks OK, click “Launch”.

You will next be asked to either add an existing key pair or create a new key pair. A key pair is used to securely access your AWS instance using Secure Shell (SSH).

Select “Create a new key pair” (you can read more about AWS key pairs here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html).

Give your key pair a name that contains only letters and numbers, no special characters and no spaces.

Click “Download Key Pair” and save the file in your Downloads folder. Note: You only need to create a key pair the first time you launch an AWS instance. The next time, you can select “Choose an existing key pair” and pick the one you are creating now.

Save your key pair in the .ssh sub-directory in your home directory. This is most easily done by entering the following command in the terminal window:

$ mv ~/Downloads/[MyKeyPair.pem] ~/.ssh/[MyKeyPair.pem]

Where MyKeyPair.pem is the file you just downloaded. If you get an error message that MyKeyPair.pem does not exist, check the name in the Downloads folder. Sometimes “.txt” is added to the file name. If that is the case, use this command instead:

$ mv ~/Downloads/[MyKeyPair.pem.txt] ~/.ssh/[MyKeyPair.pem]

Note that this will move and rename the file in one command.
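The two cases above can be combined into one small helper. This is only a sketch: the function name install_key is made up for this guide, MyKeyPair.pem is a stand-in for your own file name, and the default folders are assumed to be ~/Downloads and ~/.ssh. It also creates the .ssh folder if it does not exist yet.

```shell
# Hypothetical helper (not part of AWS or ssh): move a downloaded key file
# into an .ssh directory, accepting an optional ".txt" suffix.
# Usage: install_key KEY_NAME [DOWNLOAD_DIR] [SSH_DIR]
install_key() {
    key_name=$1
    download_dir=${2:-"$HOME/Downloads"}
    ssh_dir=${3:-"$HOME/.ssh"}

    # Create the target directory if it is missing.
    mkdir -p "$ssh_dir"

    if [ -f "$download_dir/$key_name" ]; then
        mv "$download_dir/$key_name" "$ssh_dir/$key_name"
    elif [ -f "$download_dir/$key_name.txt" ]; then
        # ".txt" was added to the name; move and rename in one step.
        mv "$download_dir/$key_name.txt" "$ssh_dir/$key_name"
    else
        echo "No file named $key_name or $key_name.txt in $download_dir" >&2
        return 1
    fi
}

# Example call, with MyKeyPair.pem standing in for your own file name:
# install_key MyKeyPair.pem
```

The example call is commented out so that pasting the block only defines the function; remove the "#" and adjust the file name to run it.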

Go back to the AWS console and click “Launch Instances”.

It will take a few minutes before your instance has launched.

Accessing your AWS instance

Go to Services (top bar) > EC2 (under Compute) again.

In front of “Running Instances” it should now say “1”.

Click “Running Instances”, which will take you to a page that contains an overview of your instances. Here you can, e.g., see which types of instances you have and their state (running/stopped/terminated). If you scroll to the right, you can see the IP address of the instance you just launched under “IPv4 Public IP”. Write the address down, as you will need it to access the instance.

To access the instance, go to the terminal window and type:

$ ssh -i ~/.ssh/[KeyPair.pem] ec2-user@[IP-Address]

Where KeyPair.pem is the name of the Key Pair file you previously downloaded, and IP-Address is the IP address of the instance. For me, the specific command would look like this if my Key Pair file was named qc-test.pem and the IP address was 35.157.21.25:

$ ssh -i ~/.ssh/qc-test.pem ec2-user@35.157.21.25

Note: If ssh is not found on your computer, see HERE on how to fix it. An alternative to using ssh on Windows is to use PuTTY (can be downloaded from HERE), but ssh should be your first choice.

On Windows, you might get an error message if there are spaces in, e.g., your username. If that is the case, put single quotes around the username:

$ ssh -i /c/Users/'Mette Larsen'/.ssh/qc-test.pem ec2-user@35.157.21.25
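An alternative to quoting only the username is to store the whole path in a variable and put double quotes around the variable. The sketch below shows why the quotes matter in bash (including Git Bash). The function count_args is a made-up helper for this illustration, and the path reuses the example username from above.

```shell
# Hypothetical helper: print how many separate arguments the shell passes on.
count_args() {
    echo "$#"
}

key_path="/c/Users/Mette Larsen/.ssh/qc-test.pem"

count_args $key_path      # unquoted: bash splits at the space and prints 2
count_args "$key_path"    # quoted: the path stays in one piece and prints 1

# The same quoting applied to the ssh command from above:
# ssh -i "$key_path" ec2-user@35.157.21.25
```

When the path is split in two, ssh receives only the first half as the key file name, which is why the unquoted command fails.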

If you get an error message regarding the permissions of the KeyPair.pem file you can fix it like this:

Move to the .ssh folder:

$ cd ~/.ssh

Change the permissions related to the pem file:

$ chmod 400 [MyKeyPair.pem]
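To confirm that the change took effect, you can look at the permission string that “ls -l” prints for the file. A minimal sketch, where key_mode is a made-up helper name and MyKeyPair.pem again stands in for your own file name:

```shell
# Hypothetical helper: print the 10-character permission string of a file,
# as shown in the first column of "ls -l".
key_mode() {
    ls -l "$1" | cut -c1-10
}

# Example, run after the chmod command above:
# key_mode ~/.ssh/MyKeyPair.pem
# "-r--------" means that only you, the owner, can read the file.
```

If the string still shows an “r” in the later positions (for example “-r--r--r--”), other users can still read the key and ssh is likely to keep complaining.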

If it doesn't work and you are on a Windows PC, the following might help:

Right-click on the file in Windows Explorer and choose Properties > Security > Advanced.

Modify the permissions so that:

  • The key file doesn't inherit from the container
  • You (the owner) have full access
  • Permission entries for any other users (e.g., SYSTEM, Administrator) are removed

If you don’t get any error messages, you will be asked if you are sure you want to continue connecting. Type “yes”.

You should now see something similar to this:

Note how the prompt has changed, as you are no longer on your local computer, but on your AWS Linux virtual machine in the cloud!

You are now in your home folder, “/home/ec2-user” or just “~”.

Type “ls” to see what’s in the folder (nothing).

If you type “ls /bin”, you can see the content of the /bin folder including some commands that are available from the start.

Make a folder called “data” in the /home/ec2-user folder:

$ mkdir data

Copying data from your local computer to the AWS instance

Now copy the two data files, SRR4114395_1.fastq.gz and SRR4114395_2.fastq.gz, from your local computer to the data folder you just created on the AWS instance, as follows:

First, open a new terminal window:

Mac:

Make sure the Terminal program is selected, then select Shell > New Window from the top bar.

Windows:

In Desktop view, right click and select “Git Bash Here”.

In the new terminal window, use the command below to copy SRR4114395_1.fastq.gz to the AWS instance. For it to work, you must be in the folder where the file you want to copy is located. If you are not, you must type the full path to the file. Note that you cannot use cp to copy between two systems; you have to use secure copy, “scp”:

$ scp -i ~/.ssh/[KeyPair.pem] SRR4114395_1.fastq.gz ec2-user@[IP-Address]:~/data/.

Note the “.” at the end of the command. It means that the file is saved in the data directory under its original name.
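The effect of the trailing “.” can be tried out locally with cp, which handles a directory destination in the same way. This is only an illustration on your own computer: the temporary folder and the file name reads.txt are made up for the example.

```shell
# Local illustration with cp; nothing is sent to the AWS instance.
workdir=$(mktemp -d)               # temporary scratch folder
mkdir "$workdir/data"
echo "example" > "$workdir/reads.txt"

# "data/." points at the data directory itself, so the copy keeps its name.
cp "$workdir/reads.txt" "$workdir/data/."

ls "$workdir/data"                 # lists reads.txt
```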

Again, if you are on a Windows computer and your username has spaces in it, use single quotes around it.

It takes a minute or so to copy the entire file.

Try to figure out for yourself how to copy the other file, SRR4114395_2.fastq.gz, to the AWS instance.

Once both files have been copied, go to the AWS instance and ensure that the two files can now be found in the data folder you created in your home directory.

Have a look at the size of each of the two files using “ls -l”. The size should be identical to the size of the corresponding file on your local computer:

SRR4114395_1.fastq.gz is 66258835 bytes

SRR4114395_2.fastq.gz is 77714179 bytes
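Instead of comparing the numbers by eye, the check can be scripted. The sketch below uses “wc -c” to count the bytes in a file; check_size is a made-up helper name, and the expected sizes are the two values listed above.

```shell
# Hypothetical helper: compare a file's size in bytes with an expected value.
# Usage: check_size FILE EXPECTED_BYTES
check_size() {
    file=$1
    expected=$2
    # wc -c counts bytes; tr removes the padding some systems add.
    actual=$(wc -c < "$file" | tr -d '[:space:]')
    if [ "$actual" -eq "$expected" ]; then
        echo "$file: OK ($actual bytes)"
    else
        echo "$file: size is $actual bytes, expected $expected" >&2
        return 1
    fi
}

# Example calls, run from inside the data folder:
# check_size SRR4114395_1.fastq.gz 66258835
# check_size SRR4114395_2.fastq.gz 77714179
```

The same function works on your local computer and on the AWS instance, so you can run it on both sides of the transfer.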