  1. Use the AWS console to create a new SSH key named SageKeyPair
  2. Download it to ~/.ssh on the shared servers
  3. ssh to sodo
  4. Fix the permissions on it
    Code Block
    ~>chmod 600 ~/.ssh/SageKeyPair.pem
    mode of `/home/ndeflaux/.ssh/SageKeyPair.pem' retained as 0600 (rw-------)
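The permission fix above matters because ssh refuses to use a private key that other users can read. A minimal sketch of a check you could run before connecting follows; the function name check_key_mode is hypothetical and not part of any tool installed on the shared servers.

```shell
# Hypothetical helper: succeed only if the key file exists and its mode is exactly 0600.
check_key_mode() {
    key_file="$1"
    if [ ! -f "$key_file" ]; then
        echo "missing: $key_file" >&2
        return 1
    fi
    # find prints the path only when the permission bits are exactly 600
    if [ -n "$(find "$key_file" -prune -perm 600)" ]; then
        echo "ok: $key_file is mode 0600"
    else
        echo "fix needed: run chmod 600 $key_file" >&2
        return 1
    fi
}
```

For example, check_key_mode ~/.ssh/SageKeyPair.pem prints an ok line once step 4 has been done, and returns a non-zero status otherwise.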
    

Configure S3

  1. Use the AWS console to make a new S3 bucket named sagebio-YourUnixUsername. Note: Do not put any underscores in your bucket name. Use only hyphens, lowercase letters, and numbers.
  2. Make these five subdirectories in the bucket:
    1. scripts
    2. input
    3. output
    4. results
    5. logs
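The naming rule and the folder layout above can be checked from the shell before you create anything in the console. This is a sketch only: the function names valid_bucket_name and bucket_prefixes are hypothetical, and the check encodes just the rule stated on this page (lowercase letters, numbers, hyphens), not the full set of S3 naming rules.

```shell
# Hypothetical helper: accept only lowercase letters, digits and hyphens.
# The body runs in a subshell so the locale change does not leak out.
valid_bucket_name() (
    LC_ALL=C
    case "$1" in
        ''|*[!a-z0-9-]*) false ;;   # empty, or contains a disallowed character
        *) true ;;
    esac
)

# Hypothetical helper: print the five subdirectories expected in the bucket.
bucket_prefixes() {
    for prefix in scripts input output results logs; do
        printf '%s/%s/\n' "$1" "$prefix"
    done
}
```

For example, valid_bucket_name sagebio-ndeflaux succeeds while valid_bucket_name sagebio_ndeflaux fails because of the underscore, and bucket_prefixes sagebio-ndeflaux lists the five paths to create.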

Set up your config file for the AWS Elastic MapReduce command line tool installed on the shared servers

Get your credentials

Get your security credentials from your AWS account:

  • Access Key ID
  • Secret Access Key

Set up the Elastic MapReduce command line tools

Set up your configuration files for the Elastic MapReduce AWS tool installed on the shared servers (belltown, sodo, ballard, ...)

  1. ssh to sodo
  2. Create the configuration file for the Elastic MapReduce command line tool
    Code Block
    ~>cat ~/.ssh/$USER-credentials.json
    
    {
    "access_id": "YourAWSAccessKeyID",
    "private_key": "YourAWSSecretAccessKey",
    "keypair": "SageKeyPair",
    "key-pair-file": "/home/YourUnixUsername/.ssh/SageKeyPair.pem",
    "log_uri": "s3n://sagebio-YourUnixUsername/logs/",
    "region": "us-east-1"
    }
    
  3. Test that you can run it
    Code Block
    ~>/work/platform/bin/elastic-mapreduce-cli/elastic-mapreduce --credentials ~/.ssh/$USER-credentials.json --help
    Usage: elastic-mapreduce [options]
    
      Creating Job Flows
            --create                     Create a new job flow
            --name NAME                  The name of the job flow being created
            --alive                      Create a job flow that stays running even though it has executed all its steps
            --with-termination-protection
                                         Create a job with termination protection (default is no termination protection)
            --num-instances NUM          Number of instances in the job flow
    ...
    
  4. For less typing, you can make an alias to this command. If you use bash, you can put the following in your .bashrc:
    Code Block
    
    alias emr='/work/platform/bin/elastic-mapreduce-cli/elastic-mapreduce --credentials ~/.ssh/$USER-credentials.json'
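If you would rather not type the credentials file by hand, the following sketch writes it for you. The field names and fixed values are the ones shown in step 2; the function name write_emr_credentials and its arguments are hypothetical. It writes an absolute path for key-pair-file so that nothing depends on tilde expansion, and it restricts the file to its owner because it holds your secret key.

```shell
# Hypothetical helper: write the Elastic MapReduce credentials file.
# Arguments: access key ID, secret access key, output file.
write_emr_credentials() {
    access_id="$1"
    secret_key="$2"
    out_file="$3"
    user="${USER:-$(id -un)}"
    home_dir="${HOME:-/home/$user}"
    cat > "$out_file" <<EOF
{
"access_id": "$access_id",
"private_key": "$secret_key",
"keypair": "SageKeyPair",
"key-pair-file": "$home_dir/.ssh/SageKeyPair.pem",
"log_uri": "s3n://sagebio-$user/logs/",
"region": "us-east-1"
}
EOF
    # The file contains a secret, so make it readable by the owner only.
    chmod 600 "$out_file"
}
```

For example, write_emr_credentials YourAWSAccessKeyID YourAWSSecretAccessKey ~/.ssh/$USER-credentials.json produces the file used by the commands above.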
    

Other useful tools

s3curl

You can use the AWS Console to upload files to and download files from S3, but it is sometimes handy to do this from the command line. s3curl lets you do that.

To set up your configuration file for the s3curl AWS tool installed on the shared servers (belltown, sodo, ballard, ...):

  1. ssh to sodo
  2. Create the configuration file for the s3curl command line tool
    Code Block
    
    ~>cat ~/.ssh/s3curl 
    #!/bin/perl
    %awsSecretAccessKeys = (
        YourUnixUsername => {
            id => 'YourAccessKeyID',
            key => 'YourSecretAccessKey',
        },
    );
    
  3. Make a symlink to it in your home directory
    Code Block
    ~>ln -s ~/.ssh/s3curl ~/.s3curl
  4. Test that you can run s3curl
    Code Block
    
    ~>/work/platform/bin/s3curl.pl --id $USER https://s3.amazonaws.com/sagebio-$USER/ | head -c 200
    <?xml version="1.0" encoding="UTF-8"?>
    <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>sagebio-ndeflaux</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><IsTruncated>
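Steps 2 and 3 can also be scripted. The sketch below writes the same Perl snippet shown in step 2 and restricts the file to its owner, since it holds your secret key. The function name write_s3curl_config and its arguments are hypothetical.

```shell
# Hypothetical helper: write the s3curl configuration file.
# Arguments: profile name (your Unix username), access key ID,
# secret access key, output file.
write_s3curl_config() {
    profile="$1"
    access_id="$2"
    secret_key="$3"
    out_file="$4"
    cat > "$out_file" <<EOF
#!/bin/perl
%awsSecretAccessKeys = (
    $profile => {
        id => '$access_id',
        key => '$secret_key',
    },
);
EOF
    # The file contains a secret, so make it readable by the owner only.
    chmod 600 "$out_file"
}
```

For example, write_s3curl_config "$USER" YourAccessKeyID YourSecretAccessKey ~/.ssh/s3curl writes the file, after which the symlink in step 3 makes it visible as ~/.s3curl.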
    

nano text editor

The nano editor is available on the shared servers (sodo, ballard, belltown, ...) and on the miami cluster. It does not use X windows. If you need a simple text editor and are not familiar with vi or emacs, nano is a good choice, and it is installed by default on many Linux systems.
