Computing squares in R

...

The following script will download and install the latest version of R on each of your Elastic MapReduce hosts. (The default version of R is very old.)

Download the script bootstrapLatestR.sh. It should contain the following code:

http://sagebionetworks.jira.com/source/browse/~raw,r=HEAD/PLFM/users/deflaux/scripts/EMR/rWordCountExample/bootstrapLatestR.sh

...

What is going on in this script?

...

The following script will download and install several packages needed for RHadoop.

Download the script bootstrapRHadoop.sh. It should contain the following code:

http://sagebionetworks.jira.com/source/browse/~raw,r=HEAD/PLFM/users/deflaux/scripts/EMR/rmrExample/bootstrapRHadoop.sh

...

Upload your scripts to S3
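
The bootstrap actions used later on this page read both scripts from s3://sagebio-$USER/scripts/. The following is a minimal sketch of the upload, assuming the AWS CLI (`aws s3 cp`) is installed and configured, which this page does not state. The `DRY_RUN` switch and the `youruser` fallback are hypothetical; by default the sketch only prints the commands it would run.

```shell
# Assumption: the bucket layout matches the --bootstrap-action paths on this page.
BUCKET="s3://sagebio-${USER:-youruser}/scripts"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: set to 0 to really upload

for script in bootstrapLatestR.sh bootstrapRHadoop.sh; do
  cmd="aws s3 cp $script $BUCKET/$script"
  if [ "$DRY_RUN" = "0" ]; then
    # Assumes the AWS CLI is configured and the script is in the current directory.
    aws s3 cp "$script" "$BUCKET/$script"
  else
    # Dry run: show the command instead of running it.
    echo "$cmd"
  fi
done
```

Running it with `DRY_RUN=0` performs the actual copy; any other S3 upload tool works as long as the scripts end up at the paths the bootstrap actions reference.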

...

Code Block
~>elastic-mapreduce --credentials ~/.ssh/$USER-credentials.json --create \
--master-instance-type=m1.small --slave-instance-type=m1.small \
--num-instances=1 --enable-debugging \
--bootstrap-action s3://sagebio-$USER/scripts/bootstrapLatestR.sh \
--bootstrap-action s3://sagebio-$USER/scripts/bootstrapRHadoop.sh \
--name rmrTry1 --alive

Created job flow j-79VXH9Z07ECL
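
The job flow ID printed by --create is needed again by the --ssh and --terminate commands. The following is a minimal sketch of capturing it in a shell variable, assuming the CLI output keeps the "Created job flow <id>" form shown above. The variable names are hypothetical, and the sample string stands in for a real --create call.

```shell
# Sample output standing in for a real `elastic-mapreduce ... --create` call
# (assumption: the CLI prints "Created job flow <id>" as shown above).
create_output='Created job flow j-79VXH9Z07ECL'

# The job flow ID is the last whitespace-separated field of that line.
JOBFLOW_ID=$(printf '%s\n' "$create_output" | awk '/^Created job flow/ {print $NF}')

echo "$JOBFLOW_ID"
```

The variable can then replace the literal ID in later commands, for example `--ssh --jobflow "$JOBFLOW_ID"`.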

...

Code Block
~>elastic-mapreduce --credentials ~/.ssh/$USER-credentials.json --ssh --jobflow j-79VXH9Z07ECL
ssh -i /home/ndeflaux/.ssh/SageKeyPair.pem hadoop@ec2-107-20-44-27.compute-1.amazonaws.com 
Linux domU-12-31-39-04-08-C8 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:39:36 EST 2008 i686
--------------------------------------------------------------------------------

Welcome to Amazon Elastic MapReduce running Hadoop and Debian/Lenny.
 
Hadoop is installed in /home/hadoop. Log files are in /mnt/var/log/hadoop. Check
/mnt/var/log/hadoop/steps for diagnosing step failures.

The Hadoop UI can be accessed via the following commands: 

  JobTracker    lynx http://localhost:9100/
  NameNode      lynx http://localhost:9101/
 
--------------------------------------------------------------------------------
hadoop@domU-12-31-39-04-08-C8:~$ 

Set JAVA_HOME and start R

...

Code Block
> q()
Save workspace image? [y/n/c]: n
hadoop@ip-10-114-89-121:/mnt/var/log/bootstrap-actions$ exit
logout
Connection to ec2-107-20-108-57.compute-1.amazonaws.com closed.
~>elastic-mapreduce --credentials ~/.ssh/$USER-credentials.json --terminate --jobflow j-79VXH9Z07ECL
Terminated job flow j-79VXH9Z07ECL

What next?