...

Code Block
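# Set the library path on each worker so packages install to a known location.
clusterEvalQ(cl, { .libPaths( c('/home/ubuntu/R/library', .libPaths()) ) })

# Install and load the package on every worker.
# library() stops with an error if the package is missing, whereas
# require() would only return FALSE.
clusterEvalQ(cl, {
    install.packages("someUsefulPackage")
    library(someUsefulPackage)
})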

...

Sage packages

Code Block
clusterEvalQ(cl, {
    options(repos=structure(c(CRAN="http://cran.fhcrc.org/")))
    source('http://depot.sagebase.org/CRAN.R')
 
    pkgInstall("synapseClient")
    pkgInstall("predictiveModeling")
     
    library(synapseClient)
    library(predictiveModeling)
})

Logging workers into Synapse:

Code Block
clusterEvalQ(cl, { synapseLogin('joe.user@mydomain.com','secret') })

Asking many worker nodes to load packages and request Synapse entities all at once isn't a recommended or scalable approach. It is a fun and easy way to mount a distributed denial of service attack on the repository service. The service deals with this by timing out requests, which means some workers will succeed while others will fail. Instead, see Attaching a shared EBS volume below. If you do load from every worker, a couple of tricks will help smooth over these problems:

  1. Check whether our target data already exists. That way, we can retry after a partial failure without redoing work and unnecessarily thrashing Synapse.
  2. Throw in a few random seconds of rest for our workers. This spreads out the load on Synapse.

Code Block
clusterEvalQ(cl, {
    if (!exists('expr')) {
        Sys.sleep(runif(1,0,5))
        expr_entity <- loadEntity('syn269056')
        expr <- expr_entity$objects$eSet_expr
    }
})
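
The two tricks above can be wrapped in a small reusable function. The following is only a sketch: load_once and fake_loader are hypothetical names, not part of synapseClient, and the stand-in loader lets the example run without a Synapse connection.

Code Block
```r
# Hypothetical helper: assign a value into `envir` only if `name` is not
# already defined there, sleeping a random interval first to spread out load.
load_once <- function(name, loader, max_wait = 5, envir = globalenv()) {
    if (exists(name, envir = envir, inherits = FALSE)) {
        return(invisible(FALSE))  # already loaded, so skip the request
    }
    Sys.sleep(runif(1, 0, max_wait))  # random rest of up to max_wait seconds
    assign(name, loader(), envir = envir)
    invisible(TRUE)
}

# Stand-in loader (an assumption for this sketch); on a real worker this
# would wrap the loadEntity() call shown above.
calls <- 0
fake_loader <- function() {
    calls <<- calls + 1
    data.frame(gene = c("a", "b"), value = c(1.5, 2.5))
}

first  <- load_once("expr_demo", fake_loader, max_wait = 0.1)  # performs the load
second <- load_once("expr_demo", fake_loader, max_wait = 0.1)  # skipped on retry
```

Because the second call is a no-op, the same clusterEvalQ block can be re-run after a partial failure, and only the workers that timed out will hit Synapse again.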

Attaching a shared EBS volume

It might be worth attaching a shared EBS volume and adding it to R's .libPaths(). See Configuration of Cluster for Scientific Computing for an example of connecting a shared EBS volume to the nodes in StarCluster. How to do this in the context of a CloudFormation stack has yet to be figured out.

<<attached shared EBS volume for R packages and files>>
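
Once such a volume is mounted, pointing R at it might look like the sketch below. The helper name use_shared_lib is hypothetical, and a temporary directory stands in for the mount point, since the actual mount location is not settled here.

Code Block
```r
# Hypothetical helper: prepend a shared library directory to .libPaths()
# if the directory exists. Returns TRUE when the path was added.
use_shared_lib <- function(path) {
    if (!dir.exists(path)) {
        warning("shared library path not found: ", path)
        return(invisible(FALSE))
    }
    .libPaths(c(path, .libPaths()))
    invisible(TRUE)
}

# Stand-in for the mounted volume so the sketch runs anywhere (an
# assumption); on a cluster this would be the shared volume's mount point.
shared_lib <- file.path(tempdir(), "shared-R-library")
dir.create(shared_lib, recursive = TRUE, showWarnings = FALSE)

added <- use_shared_lib(shared_lib)
```

With packages installed once into the shared directory, each worker would only need to run this helper inside clusterEvalQ rather than calling install.packages itself.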

Accessing source code repos on worker nodes

...