Using the Package
From within R, install from the Sage internal CRAN server, or get the source from GitHub and build the package:
Code Block |
---|
source('http://sage.fhcrc.org/CRAN.R'); pkgInstall("SageBionetworksCoex")
|
...
# Install the dependencies, from R:
> install.packages(pkgs=c("WGCNA", "flashClust", "dynamicTreeCut"))
# Now, from the command line, clone the GitHub repository
git clone https://github.com/Sage-Bionetworks/SageBionetworksCoex.git
# Again, from the command line, build the package
R CMD INSTALL SageBionetworksCoex
|
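The dependency step above can be made repeatable by installing only the packages that are not already present. The following is a minimal sketch, not part of SageBionetworksCoex: `missing_packages` is a hypothetical helper name, and the dependency list is copied from the instructions above.

```r
# Hypothetical helper (not part of SageBionetworksCoex): return the packages
# from `pkgs` that are absent from `installed`.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Dependencies named in the build instructions
deps <- c("WGCNA", "flashClust", "dynamicTreeCut")

# Packages that still need to be installed in this R library
to_install <- missing_packages(deps)

# To actually install them, uncomment the next line:
# if (length(to_install) > 0) install.packages(pkgs = to_install)
```

The install call is left commented out so that running the sketch only reports what is missing and does not change the local library.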
In R, load the library:
Code Block |
---|
> library(SageBionetworksCoex)
|
For guidance on using the package:
Code Block |
---|
?SageBionetworksCoex
|
Goals
1) Make the Sage coexpression software runnable by any data analyst in R.
...
Performance questions: For datasets having >18,000 probes, how much time and space does each algorithm use?
Dataset | # Probes | # Samples | Sage Time | Sage Space | Package Time | Package Space | Sage beta | Package beta | Gene trees same, independent beta? | Gene trees same, same beta? | Module difference****, independent beta | Module difference****, same beta |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Female mouse liver | 3600 | 135 | --- | --- | --- | --- | 6.5 | 6.5 | TRUE | TRUE | 3.7% | 3.7% |
Cranio | 2534 | 249 | --- | --- | --- | --- | 4.0 | 4.5 | FALSE | TRUE | 44% | 0.9% |
Methylation, top 5K genes | 5000 | 555 | --- | --- | --- | --- | 8.5 | 8.5 | TRUE | TRUE | 0% | 0% |
Colon cancer, top 5K genes | 5000 | 322 | --- | --- | --- | --- | 3 | 3.5 | FALSE | TRUE | 11% | 0.5% |
Human liver cohort, top 5K genes | 5000 | 427 | --- | --- | --- | --- | 11 | 11 | TRUE | TRUE | 1.0% | 1.0% |
PARC* | 18,392 | 960 | 5h:55m | 83.9 GB | 1h:40m | 71 GB | 8 | 7.5 | FALSE | FALSE | 4.7% | 0.6% |
Methylation (full set)* | 27,578 | 555 | 24h:45m | 180 GB | 6h:38m | 196 GB | 8 | 11.5 | FALSE | FALSE | 14% | 0.2% |
Colon cancer, top 45K genes*** | 45,000 | 322 | --- | --- | 5h:52m | 368 GB | --- | --- | --- | --- | --- | --- |
Human liver cohort*** | 40,102 | 427 | --- | --- | 5h:13m | 313 GB | --- | --- | --- | --- | --- | --- |
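
For the two rows where both implementations were timed (PARC and the full methylation set), the speedup can be read directly from the reported times. The sketch below is illustrative only: `to_minutes` is a hypothetical helper, not part of SageBionetworksCoex, and the durations are copied from the table.

```r
# Hypothetical helper: convert a duration written like "5h:55m" (the table's
# format; the trailing "m" is optional) to minutes.
to_minutes <- function(x) {
  parts <- regmatches(x, regexec("^([0-9]+)h:([0-9]+)m?$", x))
  vapply(parts, function(p) as.numeric(p[2]) * 60 + as.numeric(p[3]), numeric(1))
}

# Times copied from the rows that report both Sage and package timings
timings <- data.frame(
  dataset = c("PARC", "Methylation (full set)"),
  sage    = c("5h:55m", "24h:45m"),
  package = c("1h:40m", "6h:38m"),
  stringsAsFactors = FALSE
)

# Speedup = Sage time divided by package time
timings$speedup <- to_minutes(timings$sage) / to_minutes(timings$package)
print(timings)
```

On these two rows the ratio comes out at roughly 3.5x to 3.7x in favor of the package. Memory use does not show the same pattern: it is lower for PARC (71 GB vs. 83.9 GB) but higher for the full methylation set (196 GB vs. 180 GB).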
...
Goals, Revisited
Goal | How we met it |
---|---|
Make the Sage coexpression software runnable by any data analyst in R | Created an easy-to-use, documented R package. (TODO: training class) |
Clearly explain the methodology underlying the coexpression algorithms. | Included links to literature in the R package documentation. |
Make the Sage coexpression software publicly available. | TBD (see below) |
Make the Sage coexpression software perform well, on commonly available hardware. | Used UCLA's accelerated algorithms. Accelerated the 'intra-module statistics' computation. Profiled datasets of up to 27,000 genes on inexpensive, high-capacity cloud resources. |
Choices for package 'publication' include:
...