...

Goals

1) Make the Sage coexpression software runnable by any data analyst in R.

...

  1. Compute correlation coefficient matrix.
  2. Determine the optimal value for the scale-free exponent, beta, and collect regression statistics.
  3. Compute the topological overlap matrix (TOM).
  4. Perform hierarchical clustering of genes, based on TOM.
  5. Detect and label modules in TOM, using "Dynamic Tree Cutting".
  6. Merge modules based on hierarchical clustering of representative genes.
  7. Cluster samples hierarchically.
  8. Compute intra/inter-module network statistics, per gene.
  9. Produce diagnostic plots (dendrograms, heat maps, statistical scatter plots).
  10. Produce tabular output of module membership, network statistics, and scale-free regression statistics.
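
To make steps 1 and 3 concrete, here is a minimal pure-Python sketch of the two computations. It is an illustration only, not the Sage or UCLA-WGCNA implementation. It assumes Pearson correlation, an unsigned soft-threshold adjacency a_ij = |cor_ij|^beta, and the commonly cited unsigned TOM formula (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij); none of these choices is stated on this page. The function names, the toy expression data, and the value beta = 6 are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(genes):
    """Step 1: gene-by-gene correlation matrix; each row of `genes` is one gene."""
    n = len(genes)
    return [[1.0 if i == j else pearson(genes[i], genes[j]) for j in range(n)]
            for i in range(n)]


def adjacency(cor, beta):
    """Assumed soft threshold: a_ij = |cor_ij| ** beta, with a zero diagonal."""
    n = len(cor)
    return [[0.0 if i == j else abs(cor[i][j]) ** beta for j in range(n)]
            for i in range(n)]


def topological_overlap(adj):
    """Step 3: unsigned TOM computed from a symmetric adjacency matrix."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene (diagonal is zero)
    tom = [[1.0] * n for _ in range(n)]  # diagonal entries stay at 1
    for i in range(n):
        for j in range(i + 1, n):
            # Shared-neighbour term; u == i and u == j add 0 because the diagonal is zero.
            shared = sum(adj[i][u] * adj[u][j] for u in range(n))
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Toy data: 3 genes measured in 5 samples (made-up numbers).
expression = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.1, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
cor = correlation_matrix(expression)
tom = topological_overlap(adjacency(cor, beta=6))
```

The nested loops make the cost of both steps grow at least quadratically with the number of genes (cubically for the TOM as written here), which is consistent with these steps dominating run time on realistic datasets.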

Elsewhere (http://sagebionetworks.jira.com/wiki/display/SCICOMP/Coexpression+Evaluation) we have shown that, for realistic dataset sizes, the vast majority of run time is spent in steps 1 and 3. Steps 1, 3, and 4 are identical in the Sage and UCLA-WGCNA code bases, except that the UCLA-WGCNA code uses compiled/optimized software for these three steps. Further, we have seen that steps 2 and 5 (with the right parameter choices in the UCLA package) produce extremely similar results. (The observed similarity is no surprise, since the two code bases are forks of an original set of algorithms and have evolved separately for approximately 6 years.)

Our strategy, therefore, is:

  • Leverage the UCLA-WGCNA package for the "common" steps, 1 through 5, gaining significant performance.
  • Provide the user a parameter choice at step 5, to do "tree cutting" in the manner of either the Sage algorithm or the UCLA-WGCNA algorithm.
  • Provide two algorithms for step 6 (module merging), allowing the user to choose the Sage or the UCLA-WGCNA algorithm.
  • Leverage the UCLA-WGCNA dendrogram/module plotting algorithm in step 9.
  • Maintain the Sage algorithms for the Sage-specific post-processing, i.e. step 7, step 8, and the heat maps in step 9.
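
The division of labor described above can be summarized as a small options object plus a step-to-code-base map. This is only a sketch of the plan, written in Python for illustration. The names `PipelineOptions`, `tree_cut`, `module_merge`, and `step_providers` are hypothetical and do not correspond to any existing Sage or UCLA-WGCNA interface.

```python
from dataclasses import dataclass

CODE_BASES = ("sage", "wgcna")


@dataclass(frozen=True)
class PipelineOptions:
    """User-facing choices named in the strategy (option names are hypothetical)."""
    tree_cut: str = "wgcna"      # step 5: tree-cutting style
    module_merge: str = "wgcna"  # step 6: module-merging algorithm

    def __post_init__(self):
        # Reject anything other than the two code bases discussed on this page.
        for name in ("tree_cut", "module_merge"):
            value = getattr(self, name)
            if value not in CODE_BASES:
                raise ValueError(f"{name} must be one of {CODE_BASES}, got {value!r}")


def step_providers(options):
    """Map each step covered by the strategy to the code base that handles it."""
    plan = {step: "UCLA-WGCNA" for step in (1, 2, 3, 4)}
    plan[5] = f"UCLA-WGCNA ({options.tree_cut}-style tree cutting)"
    plan[6] = "Sage" if options.module_merge == "sage" else "UCLA-WGCNA"
    plan[7] = "Sage"
    plan[8] = "Sage"
    plan[9] = "UCLA-WGCNA (dendrogram/module plots) + Sage (heat maps)"
    # Step 10 is not assigned by the strategy text, so it is left out here.
    return plan


plan = step_providers(PipelineOptions(tree_cut="sage", module_merge="sage"))
```

Keeping the two user choices in one validated object means an unsupported value fails at configuration time rather than partway through a long-running analysis.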

UCLA-WGCNA dependencies

Sage software dependencies

...