AWS General Info

Developer AWS Accounts

See instructions for General Sage AWS accounts.

Use your individual AWS account under the Sage consolidated bill for AWS experiments. The rule of thumb is that if you cannot shut off whatever you are running while you are on vacation, it belongs in the Production AWS Account.

Production AWS Account

Use the platform@sagebase.org account for:

  • S3

  • EC2

  • Elastic Beanstalk

  • Elastic MapReduce

  • Relational Database Service

  • Identity and Access Management Service

You will need to log in to the AWS console with the platform@sagebase.org username and password: https://console.aws.amazon.com/

You can also use your IAM account if you like, but many AWS services, such as Elastic Beanstalk, do not support IAM yet. There is a different link to log in to the AWS console with your IAM username and password: https://325565585839.signin.aws.amazon.com/console/ec2

Credentials, passwords, ssh keys

You can find them on our shared servers. When storing passwords locally on your laptop (which already has an encrypted drive, yay!) you might also consider using Password Safe.

/work/platform> hostname
sodo
/work/platform/PasswordsAndCredentials> ls
AtlassianAccountAWSCredentials     platformStagingEncryptionKey.txt
crowdServerCertificate             SshCertificates
passwords.txt                      SshKeys
PlatformAWSCredentials             StackCredentials
PlatformIAMCreds                   wildcard-sagebase.org-cert
platformPropertyEncryptionKey.txt

Miscellaneous How To's

How to SSH to an EC2 Host

Connecting from Linux

ssh -i PlatformKeyPairEast.pem ec2-user@<the ec2 host>

For screen shots see EC2 docs
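One common stumbling block: ssh will refuse a private key whose permissions are too open, so tighten them before connecting. A minimal sketch (the `touch` merely creates a placeholder file so the snippet runs standalone; in practice use the real key from the credentials share):

```shell
# ssh rejects key files readable by group/other ("UNPROTECTED PRIVATE KEY FILE").
KEY=PlatformKeyPairEast.pem
touch "$KEY"        # placeholder only; normally the key already exists
chmod 400 "$KEY"    # owner read-only

# Then connect as the default Amazon Linux user:
# ssh -i "$KEY" ec2-user@<the ec2 host>
```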

Connecting from Windows using Putty

For screen shots see EC2 docs

Windows users can also connect using PuTTY or WinSCP; however, you will need to first create a PuTTY private key file using puttygen.exe.
Here is how to create the private key file:

  1. Run the 'puttygen.exe' tool

  2. Select the 'Load' button from the UI.

  3. From the file dialog, select the key pair file (i.e. PlatformKeyPairEast.pem)

  4. A popup dialog should tell you the key file was imported successfully and that you should save it using 'Save private key'

  5. Select 'Save Private Key' and give it a name such as PlatformKeyPairEast.ppk to create the PuTTY private key file.

Once you have a PuTTY private key file you can use it to connect to your host using PuTTY or WinSCP.
To connect with WinSCP:

  1. Set the host name, and keep the default port (22). Note: Make sure port 22 is open on the box you are connecting to.

  2. Set the user name to ec2-user

  3. Select the '...' button under 'Private Key File' and select the .ppk file you created above.

  4. Select 'Login'

Figure out if AWS is broken

AWS occasionally has issues. To figure out whether the problem you are currently experiencing is their fault or not:

  1. Check the AWS status console to see whether they are reporting any problems: http://status.aws.amazon.com/

  2. Check the most recent messages on the forums (https://forums.aws.amazon.com/index.jsp); problems often get reported there first.

  3. If you still do not find evidence that the problem is AWS's fault, search the forums for your particular issue. It's likely that someone else has run into the exact same problem in the past.

  4. Still no luck? Ask your coworkers and/or post a question to the forums.

How to save money on the AWS bill

If you use EBS-backed AMIs you can "stop" (not "terminate") your instance when you are not using it. Your root partition and other EBS volumes stick around and you are only charged for EBS usage while the instance is "stopped". When you need to use it again you "start" the instance and then re-start your applications.

You can also start with a less expensive instance type and easily upgrade to a larger size in this same manner. One thing to note is that you cannot switch between a 32-bit and a 64-bit OS, so choose well initially.
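The stop/start cycle above can be scripted with the EC2 API command-line tools. A dry-run sketch (the instance id is a made-up placeholder, and the commands are echoed rather than executed so nothing is actually stopped):

```shell
# Hypothetical instance id -- substitute the real one from the EC2 console.
INSTANCE_ID="i-1234beef"

# Stop (not terminate) at the end of the day: the EBS root volume persists
# and only EBS storage is billed while the instance is stopped.
STOP_CMD="ec2-stop-instances $INSTANCE_ID"
# When you need the instance again, start it and re-start your applications.
START_CMD="ec2-start-instances $INSTANCE_ID"

echo "$STOP_CMD"
echo "$START_CMD"
```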

S3 How To's

How to enforce HTTPS-only access to S3

We enforce HTTPS-only access to S3 for all buckets. The bucket policies can be found here: http://sagebionetworks.jira.com/source/browse/PLFM/trunk/configuration/s3Policies
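The policies in that repository deny plain-HTTP requests. An illustrative fragment of such a policy (the bucket name is a placeholder, not the actual policy text from the repository):

```json
{
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::examplebucket",
        "arn:aws:s3:::examplebucket/*"
      ],
      "Condition": {
        "Bool": { "aws:SecureTransport": "false" }
      }
    }
  ]
}
```

The `aws:SecureTransport` condition key is false for requests made over plain HTTP, so this Deny statement leaves only HTTPS access allowed.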

Run a Report to Know Who has Accessed What When

Use Elastic MapReduce to run a script on all our logs in the bucket logs.sagebase.org. There are some scripts in bucket emr.sagebase.org/scripts that will do the trick. If you want to change what they do, feel free to make new scripts.

Here is what a configured job looks like, for the purpose of cutting and pasting:

Input Location: s3n://prodlogs.sagebase.org/

Output Location: s3n://emr.sagebase.org/output/report20110809

Mapper: s3n://emr.sagebase.org/scripts/downloadsByUserMapper.py

Reducer: s3n://emr.sagebase.org/scripts/downloadsByUserReducer.py

Amazon S3 Log Path: s3n://emr.sagebase.org/output/debugLogs

And here is some sample output from the job.  Note that:

  • All Sage employees will have their sagebase.org username as their IAM username

  • Platform users register with an email address and we will use that email address as their IAM username.

  • User d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 is the platform@sagebase.org user.

    arn:aws:iam::325565585839:user/prod-nicole.deflaux@sagebase.org [09/Aug/2011:01:07:49 +0000] REST.GET.OBJECT 4621/0.0.0/mouse_model_of_sexually_dimorphic_atherosclerotic_traits.phenotype.zip
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:48:49 +0000] REST.GET.BUCKET -
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:55:47 +0000] REST.GET.BUCKETPOLICY -
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:55:53 +0000] REST.GET.BUCKET -
    ...
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:56:30 +0000] REST.GET.ACL 5031/0.0.0/rClient/5030/sangerIC50.zip
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:56:30 +0000] REST.HEAD.OBJECT 5031/0.0.0/rClient/5030/sangerIC50.zip
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 [09/Aug/2011:01:56:30 +0000] REST.GET.OBJECT 5031/0.0.0/rClient/5030/sangerIC50.zip

    Downloads per file:
    - 14
    5031/0.0.0/rClient/5030/sangerIC50.zip 3
    4621/0.0.0/mouse_model_of_sexually_dimorphic_atherosclerotic_traits.phenotype.zip 1

    Downloads per user:
    arn:aws:iam::325565585839:user/prod-nicole.deflaux@sagebase.org 1
    d9df08ac799f2859d42a588b415111314cf66d0ffd072195f33b921db966b440 17
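The mapper/reducer scripts essentially tally accesses per user and per file. As a toy, non-MapReduce sketch of the per-user tally, here is an awk one-liner over log lines in the layout shown above (the field positions and sample user names are assumptions for illustration, not the actual script):

```shell
# Count REST.GET.OBJECT requests per requester (field 1 = requester,
# field 3 = operation in these simplified sample lines).
count_downloads() {
  awk '$3 == "REST.GET.OBJECT" { n[$1]++ } END { for (u in n) print u, n[u] }'
}

report="$(printf '%s\n' \
  'userA [09/Aug/2011:01:07:49] REST.GET.OBJECT file1.zip' \
  'userA [09/Aug/2011:01:08:00] REST.GET.OBJECT file2.zip' \
  'userB [09/Aug/2011:01:09:00] REST.GET.BUCKET -' \
  | count_downloads | sort)"
echo "$report"
```

Only GET.OBJECT lines count as downloads, so userB's bucket listing is ignored and the report is `userA 2`.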

Upload a dataset to S3

UPDATE: We no longer use Bucket Explorer to upload datasets. Instead, we now use the R client to perform uploads. You can still use Bucket Explorer to browse datasets.