...
- Use the AWS console to create a new S3 bucket named sagetest-YourUsername, then create the following subdirectories in it:
- scripts
- input
- output
- results
- logs
Note: Do not put any underscores in your bucket name. Only use hyphens, lowercase letters and numbers.
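The naming rule in the note above can be checked before you create the bucket. The following is a minimal sketch, not part of any AWS tool: `is_valid_bucket_name` is a hypothetical helper name, and it tests only the rule stated in the note (lowercase letters, numbers and hyphens), not every constraint S3 itself may apply.

```shell
# Hypothetical helper: succeeds only when the name is non-empty and made up
# solely of lowercase letters, digits, and hyphens.
# The allowed characters are listed explicitly so the check does not depend
# on the current locale.
is_valid_bucket_name() {
    case "$1" in
        ''|*[!abcdefghijklmnopqrstuvwxyz0123456789-]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, `is_valid_bucket_name "sagetest-$USER" && echo "name looks fine"` prints the message only when the name passes the check.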
Set up your config files for the AWS command line tools installed on the shared servers
...
Set up your configuration files for the s3curl AWS tool installed on the shared servers (belltown, sodo, ballard, ...)
- ssh to belltown
- Create the configuration file for the s3curl command line tool

      ~>cat .s3curl
      #!/bin/perl
      %awsSecretAccessKeys = (
          YourUsername => {
              id => 'YourAccessKeyID',
              key => 'YourSecretAccessKey',
          },
      );
- Test that you can run s3curl

      ~>/work/platform/bin/s3curl.pl --id $USER https://s3.amazonaws.com/sagetest-$USER/ | head -c 200
      <?xml version="1.0" encoding="UTF-8"?>
      <ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>sagetestemr</Name><Prefix></Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><IsTruncated>
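The .s3curl file from the steps above can also be written by a small shell function instead of by hand. This is a sketch under stated assumptions: `write_s3curl_config` is a hypothetical helper name, and the owner-only file mode is a precaution for a file that holds secret keys (s3curl is commonly reported to reject a credentials file that other users can read, but verify that against your installed version).

```shell
# Hypothetical helper: write an s3curl credentials file.
# Arguments: target path, profile name, access key ID, secret access key.
write_s3curl_config() {
    target=$1
    profile=$2
    key_id=$3
    secret=$4
    # Create the file empty and owner-only before any secret is written to it.
    ( umask 077 && : > "$target" ) || return 1
    chmod 600 "$target" || return 1
    # Same layout as the .s3curl example above, with the placeholders filled in.
    cat > "$target" <<EOF
#!/bin/perl
%awsSecretAccessKeys = (
    $profile => {
        id => '$key_id',
        key => '$secret',
    },
);
EOF
}
```

A call such as `write_s3curl_config "$HOME/.s3curl" "$USER" YourAccessKeyID YourSecretAccessKey` would produce the file shown in the configuration step, with the profile name matching the `--id $USER` argument used in the test command.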
Set up the Elastic MapReduce command line tools
Set up your configuration files for the Elastic MapReduce AWS tool installed on the shared servers (belltown, sodo, ballard, ...)
- ssh to belltown
- Create the configuration file for the Elastic MapReduce command line tool

      ~>cat YourUsername-credentials.json
      {
        "access_id": "YourAccessKeyID",
        "private_key": "YourSecretAccessKey",
        "keypair": "SageKeyPair",
        "key-pair-file": "/home/YourUsername/SageKeyPair.pem",
        "log_uri": "s3n://sagetest-YourUsername/logs/",
        "region": "us-east-1"
      }
- Test that you can run it

      ~>/work/platform/bin/elastic-mapreduce-cli/elastic-mapreduce --credentials ~/$USER-credentials.json --help
      Usage: elastic-mapreduce [options]

        Creating Job Flows
          --create                         Create a new job flow
          --name NAME                      The name of the job flow being created
          --alive                          Create a job flow that stays running even though it has executed all its steps
          --with-termination-protection    Create a job with termination protection (default is no termination protection)
          --num-instances NUM              Number of instances in the job flow
          ...
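The credentials file from the steps above contains your username in three places (the file name, the key-pair path and the log URI), so it can be convenient to generate it. The sketch below assumes a hypothetical helper name, `write_emr_credentials`, and reuses the field names and fixed values (`SageKeyPair`, `us-east-1`, the `/home/...` path) exactly as they appear in the example file. The shell fills in the username when the file is written, because a JSON file cannot expand shell variables itself.

```shell
# Hypothetical helper: write an Elastic MapReduce CLI credentials file.
# Arguments: target path, username, access key ID, secret access key.
# Note: the values are inserted as-is, so they must not contain
# double quotes or backslashes.
write_emr_credentials() {
    target=$1
    user=$2
    key_id=$3
    secret=$4
    # Create the file empty and owner-only before any secret is written to it.
    ( umask 077 && : > "$target" ) || return 1
    chmod 600 "$target" || return 1
    cat > "$target" <<EOF
{
  "access_id": "$key_id",
  "private_key": "$secret",
  "keypair": "SageKeyPair",
  "key-pair-file": "/home/$user/SageKeyPair.pem",
  "log_uri": "s3n://sagetest-$user/logs/",
  "region": "us-east-1"
}
EOF
}
```

A call such as `write_emr_credentials "$HOME/$USER-credentials.json" "$USER" YourAccessKeyID YourSecretAccessKey` would write the file at the path that the `--credentials` option in the test command expects.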