Using the Package

From within R, install from the Sage internal CRAN server, or get the source from GitHub and build the package:

Code Block

source('http://sage.fhcrc.org/CRAN.R'); pkgInstall("SageBionetworksCoex")

...

# Install the dependencies, from R:
> install.packages(pkgs=c("WGCNA", "flashClust", "dynamicTreeCut"))
# now, from the command line, clone the GitHub repository
git clone https://github.com/Sage-Bionetworks/SageBionetworksCoex.git
# again, from the command line, build the package
R CMD INSTALL SageBionetworksCoex
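
Before building, it can help to check which of the dependencies are already present so that only the missing ones are installed. The following is a minimal sketch; the helper name `missing_packages` is hypothetical and is not part of SageBionetworksCoex.

```r
# Hypothetical helper (not part of the package): return the names of the
# packages in `pkgs` that cannot be loaded in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

deps <- c("WGCNA", "flashClust", "dynamicTreeCut")

# Guarded so that sourcing this file in a batch job does not trigger downloads.
if (interactive()) {
  to_install <- missing_packages(deps)
  if (length(to_install) > 0) install.packages(pkgs = to_install)
}
```

If one of the dependencies is not available from the configured CRAN mirror, `install.packages` issues a warning for it, and that package has to be obtained from another repository.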

In R, load the library:

Code Block
> library(SageBionetworksCoex)

For guidance on using the package:

Code Block

?SageBionetworksCoex

Table of Contents

Goals

1) Make the Sage coexpression software runnable by any data analyst in R.

...

Performance questions:  For datasets having >18,000 probes, how much time and space does each algorithm use?

| Dataset | # Probes | # Samples | Sage Time | Sage Space | Package Time | Package Space | Sage beta | Package beta | Gene trees same, independent beta? | Gene trees same, same beta? | Module difference****, independent beta | Module difference****, same beta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Female mouse liver | 3600 | 135 | --- | --- | --- | --- | 6.5 | 6.5 | TRUE | TRUE | 3.7% | 3.7% |
| Cranio | 2534 | 249 | --- | --- | --- | --- | 4.0 | 4.5 | FALSE | TRUE | 44% | 0.9% |
| Methylation, top 5K genes | 5000 | 555 | --- | --- | --- | --- | 8.5 | 8.5 | TRUE | TRUE | 0 | 0 |
| Colon cancer, top 5K genes | 5000 | 322 | --- | --- | --- | --- | 3 | 3.5 | FALSE | TRUE | 11% | 0.5% |
| Human liver cohort, top 5K genes | 5000 | 427 | --- | --- | --- | --- | 11 | 11 | TRUE | TRUE | 1.0% | 1.0% |
| PARC* | 18,392 | 960 | 5h:55m | 83.9 GB | 1h:40m | 71 GB | 8 | 7.5 | FALSE | FALSE | 4.7% | 0.6% |
| Methylation (full set)* | 27,578 | 555 | 24h:45m | 180 GB | 6h:38m | 196 GB | 8 | 11.5 | FALSE | FALSE | 14% | 0.2% |
| Colon cancer, top 45K genes*** | 45,000 | 322 | --- | --- | 5h:52m | 368 GB | --- | --- | --- | --- | --- | --- |
| Human liver cohort*** | 40,102 | 427 | --- | --- | 5h:13m | 313 GB | --- | --- | --- | --- | --- | --- |
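
Time and space figures of this kind can be collected from within R. The sketch below is one way to do it and is not necessarily how the numbers in the table were obtained. The helper names `measure` and `dense_matrix_gib` are hypothetical and are not part of the package. The size estimate assumes the analysis holds dense probe-by-probe matrices of 8-byte doubles, which would make memory grow with the square of the probe count.

```r
# Hypothetical helper: evaluate an expression and report elapsed wall-clock
# time plus the peak R heap usage observed by the garbage collector.
measure <- function(expr) {
  invisible(gc(reset = TRUE))            # reset the "max used" counters
  timing <- system.time(result <- expr)  # forces the lazily evaluated argument
  mem <- gc()
  list(result      = result,
       elapsed_sec = timing[["elapsed"]],
       peak_mb     = sum(mem[, ncol(mem)]))  # last column: max used, in Mb
}

# Hypothetical helper: size in GiB of one dense n-by-n matrix of doubles
# (8 bytes per entry).
dense_matrix_gib <- function(n) n^2 * 8 / 1024^3

# Small stand-in computation; a real run would call the coexpression routine.
m <- measure(cor(matrix(rnorm(200 * 50), nrow = 200)))
cat("elapsed (s):", m$elapsed_sec, " peak (Mb):", m$peak_mb, "\n")

# One dense matrix for the full methylation set (27,578 probes).
cat("GiB per matrix:", dense_matrix_gib(27578), "\n")
```

The peak reported by `gc()` covers only memory managed by R itself, so an operating-system level measurement may give a larger figure.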

...

Goals, Revisited

| Goal | How we met it |
|---|---|
| Make the Sage coexpression software runnable by any data analyst in R | Created easy to use, documented R package. (TODO: training class) |
| Clearly explain the methodology underlying the coexpression algorithms. | Included links to literature in the R package documentation. |
| Make the Sage coexpression software publicly available. | TBD (see below) |
| Make the Sage coexpression software perform well, on commonly available hardware. | Used UCLA's accelerated algorithms. Accelerated the 'intra-module statistics' computation. Profiled datasets of up to 27,000 genes on inexpensive, high capacity cloud resources. |

Choices for package 'publication' include:

...