...
- move queue to AWS SWF or SQS
Requirements (or at least desired initial functionality)
- Master-slave architecture that collects the results of distributed computations into a single object that can be used in subsequent computations. Revolution's foreach package meets this requirement by collecting results into an R list. Traditional batch submission systems do not meet it without additional engineering: each job may write its results to a separate text file, and those files must then be aggregated by a separate program, which becomes cumbersome.
- Code should run both in serial (for normal interactive computing) and in parallel, with as little modification to the user's typical workflow as possible. Again, Revolution's foreach meets this requirement, as parallelization only requires changing %do% to %dopar% (or running the same %dopar% code with or without a registered parallel backend). Traditional batch submission systems require a significantly different workflow and code modifications to run in parallel.
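The two requirements above can be illustrated with a minimal Python sketch. Python is used only for illustration; this is not the foreach API. The helper `run_jobs` and the placeholder job `fit_one` are hypothetical names. The idea is that results always come back as one list, and the same call runs serially or in parallel depending on whether an executor is supplied, loosely analogous to switching %do% to %dopar%.

```python
from concurrent.futures import ThreadPoolExecutor


def run_jobs(func, inputs, executor=None):
    """Apply func to every input and collect the results into one list.

    With executor=None the jobs run serially in the calling process; with
    an executor they are dispatched through its map() method. Result order
    matches input order in both cases.
    """
    if executor is None:
        return [func(x) for x in inputs]
    return list(executor.map(func, inputs))


def fit_one(seed):
    """Hypothetical stand-in for one independent job (e.g. one model fit)."""
    return seed * seed


if __name__ == "__main__":
    # Serial run, as in normal interactive use.
    serial = run_jobs(fit_one, range(5))
    # Same call, now with a parallel backend supplied.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parallel = run_jobs(fit_one, range(5), executor=pool)
    print(serial == parallel)
```

A thread pool is used here only to keep the sketch small. For CPU-bound work, a process pool or a cluster backend that exposes the same `map` method would presumably be substituted without changing the calling code.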
Out of scope for initial functionality (though desirable in the future)
- Inter-node communication, e.g. the reduce step in MapReduce. For initial functionality it is sufficient to assume that jobs are embarrassingly parallel.
Driving use cases to implement parallelization
- Elias' randomized simulation. Requires 10,000 runs of elastic net, lasso, and ridge, each using slightly different data.
- In Sock's prediction pipeline. Very similar to Elias' use case. Parallelization can be over: a) each predictive model (as in Elias' case); b) each bootstrap run; c) each cross-validation fold.
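The choice among options a), b), and c) above amounts to choosing how finely the work is split into independent tasks. The following is a rough sketch under stated assumptions: `build_tasks` is a hypothetical helper, the bootstrap and fold counts are illustrative defaults rather than values from either pipeline, and nesting folds inside bootstrap runs inside models is one possible arrangement, not necessarily the one either pipeline uses.

```python
from itertools import product

# Model names taken from the use cases above.
MODELS = ["elastic_net", "lasso", "ridge"]


def build_tasks(level, n_bootstrap=3, n_folds=2):
    """Return a list of independent task descriptors at the chosen granularity.

    level is "model", "bootstrap", or "fold" (options a, b, c).
    n_bootstrap and n_folds are illustrative defaults, not source values.
    """
    if level == "model":
        return [{"model": m} for m in MODELS]
    if level == "bootstrap":
        return [{"model": m, "bootstrap": b}
                for m, b in product(MODELS, range(n_bootstrap))]
    if level == "fold":
        return [{"model": m, "bootstrap": b, "fold": f}
                for m, b, f in product(MODELS, range(n_bootstrap), range(n_folds))]
    raise ValueError(f"unknown level: {level!r}")


if __name__ == "__main__":
    # Finer granularity yields more, smaller tasks to hand to the workers.
    for level in ("model", "bootstrap", "fold"):
        print(level, len(build_tasks(level)))
```

Finer granularity gives more tasks to spread across workers, at the cost of more scheduling overhead per unit of work.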
Solutions to explore
- IPython (on Amazon). Larsson says this allows parallelization in Python in the same way we are trying to design into BigR, and that it is already set up to run on Amazon using StarCluster.
- Revolution foreach (on Amazon). Chris Bare raises a good point: have we explored whether Revolution's foreach package can run on Amazon? I would think Amazon is the first place they would implement it, and it is likely that someone has already gotten it working.