Correlation of network modules with clinical traits
How: extract the first principal component (PC1) of each module and correlate it with known clinical traits (survival, cancer stage, cancer grade, age).
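As a rough illustration of this approach, here is a minimal Python sketch (standard library only) that extracts a module's PC1 by power iteration and correlates it with a continuous trait. The function names `module_pc1` and `pearson` and the toy numbers are hypothetical, made up for illustration. The sketch only centers each probe; whether the real analysis also scaled probes, or used a dedicated eigengene routine, is not stated here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def module_pc1(rows, n_iter=200):
    """Per-sample PC1 scores for one module.

    rows: list of samples, each a list of probe values (same probes per sample).
    Probes are centered (not scaled), then the leading eigenvector of X^T X
    is found by power iteration. The sign of a principal component is
    arbitrary, so only the magnitude of a downstream correlation is meaningful.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Uniform start; this would fail only if PC1 were exactly orthogonal to it.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        scores = [sum(xi[j] * v[j] for j in range(p)) for xi in X]        # X v
        w = [sum(X[i][j] * scores[i] for i in range(n)) for j in range(p)]  # X^T X v
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return [sum(xi[j] * v[j] for j in range(p)) for xi in X]


if __name__ == "__main__":
    # Hypothetical toy module: 3 probes that all track a latent trait.
    trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    rows = [[t + 0.1, 2 * t - 0.1, 0.5 * t] for t in trait]
    pc1 = module_pc1(rows)
    print("abs correlation of PC1 with trait:", abs(pearson(pc1, trait)))
```

For categorical traits or survival, the PC1 scores returned here would be passed to the corresponding test instead of `pearson`.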
M value (center, batch, plate row and column removed)
Survival
Today (Nov 10, 2011) I downloaded an updated clinical file from TCGA for the patients whose DNA methylation data were used for the network analysis (468 patients). While constructing the survival object from days to death, living status and days to the last follow-up, I found that 3 patients had living status "LIVING" but no days to the last follow-up, and one patient had no living status information at all. These four patients (TCGA-04-1341, TCGA-04-1357, TCGA-04-1360, TCGA-04-1519) were excluded from the survival analysis. Kaplan-Meier curve and a dot chart of the Wald test p-values (Cox proportional hazards model, coxph) for the association of each module's PC1 with survival:
This analysis shows no signal. In contrast, when the network was constructed using only the methylated probes, several modules were found to be associated with survival.
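The construction of the survival object and the exclusion rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names (`id`, `living_status`, `days_to_death`, `days_to_last_followup`), the status strings other than "LIVING", and the example patients are hypothetical and do not come from the TCGA file.

```python
def build_survival_records(patients):
    """Turn clinical rows into (id, time, event) tuples.

    event = 1 for a death at days_to_death, 0 for censoring at
    days_to_last_followup. Patients with no usable time or no living
    status are returned separately so the exclusions can be reported.
    Field names are assumptions for this sketch.
    """
    kept, excluded = [], []
    for p in patients:
        status = (p.get("living_status") or "").upper()
        if status == "DECEASED" and p.get("days_to_death") is not None:
            kept.append((p["id"], p["days_to_death"], 1))
        elif status == "LIVING" and p.get("days_to_last_followup") is not None:
            kept.append((p["id"], p["days_to_last_followup"], 0))
        else:
            excluded.append(p["id"])
    return kept, excluded


def kaplan_meier(records):
    """Kaplan-Meier estimate as [(time, survival)] at each distinct death time."""
    event_times = sorted({t for _, t, e in records if e == 1})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for _, u, _ in records if u >= t)
        deaths = sum(1 for _, u, e in records if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


if __name__ == "__main__":
    # Hypothetical patients, not real TCGA records.
    patients = [
        {"id": "P1", "living_status": "DECEASED", "days_to_death": 300},
        {"id": "P2", "living_status": "LIVING", "days_to_last_followup": 900},
        {"id": "P3", "living_status": "LIVING", "days_to_last_followup": None},
        {"id": "P4", "living_status": None},
    ]
    kept, excluded = build_survival_records(patients)
    print("kept:", kept)
    print("excluded:", excluded)
    print("KM curve:", kaplan_meier(kept))
```

Returning the excluded IDs alongside the kept records makes it easy to list exactly which patients were dropped and why.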
Days to tumor recurrence
This analysis was done in the same way as the survival analysis above.
There may be a weak signal here: we might be able to identify some loci associated with the days to tumor recurrence. Need to check what I get in the univariate analysis and elastic net.
Age
Spearman correlation, to see whether any modules are correlated with age, since age was not removed during normalization.
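A minimal Python sketch of a Spearman test between a module's PC1 and age is given below. Spearman's rho is computed as the Pearson correlation of average ranks. The p-value here comes from a permutation test, which is a substitute chosen so the sketch needs only the standard library; it is not necessarily how the p-values in this analysis were obtained. Function names and example values are hypothetical.

```python
import math
import random


def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho (an approximation)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical PC1 scores and ages.
    pc1 = [0.3, -0.1, 0.8, 0.5, -0.4, 0.9, 0.1, -0.6]
    age = [55, 48, 71, 63, 45, 66, 59, 41]
    print("rho:", spearman_rho(pc1, age))
    print("permutation p:", permutation_pvalue(pc1, age))
```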
Modules yellow, red, lightgreen, grey60 (and grey), greenyellow and black are correlated with age. This is not surprising, since many CpG loci come up in the univariate analysis of the correlation with age.
Tumor stage, tumor grade, tumor residual disease and primary therapy outcome
Kruskal-Wallis test
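For reference, a minimal Python sketch of the Kruskal-Wallis test applied to one module's PC1 scores split by a categorical trait (for example tumor grade) is shown here. It uses only the standard library, so the chi-square tail probability is computed from closed-form series for integer degrees of freedom. The function names and the example groups are hypothetical.

```python
import math
from collections import Counter


def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total, term = 0.0, 1.0
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def kruskal_wallis(groups):
    """Return (H, p) for a list of groups, with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - ties / (n_total ** 3 - n_total)
    if correction == 0.0:
        raise ValueError("all values are identical; the test is undefined")
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


if __name__ == "__main__":
    # Hypothetical PC1 scores grouped by a three-level clinical trait.
    by_grade = [[0.1, -0.2, 0.3], [0.4, 0.6, 0.2, 0.5], [0.9, 0.7, 1.1]]
    print(kruskal_wallis(by_grade))
```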
Module grey60 is correlated with tumor grade. It has the following GO categories: respiratory electron transport chain (10^-1), mitochondrial membrane part (10^-2)
No modules are correlated with the primary therapy outcome (which might not be very surprising, since no modules are correlated with survival either). Module "blue" is correlated with the tumor residual disease (P value = 0.04871); it has the following gene ontology categories (cellular component): plasma membrane, cell periphery, cell adhesion. This might (might!) be relevant, since in conversation Charles Drescher mentioned that larger tumors have very different histology. Need more reading.
M value (only batch, plate row and column removed)
Survival
Coxph model
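The per-module test can be sketched as a univariate Cox model fitted by Newton-Raphson, with the Wald p-value taken from the normal approximation. This Python sketch is an illustration only: it uses the Breslow approximation for tied event times, whereas R's coxph uses the Efron method by default, so results on tied data will differ slightly. It also does not handle a monotone likelihood (a covariate that perfectly orders the deaths). The function name and example values are hypothetical.

```python
import math


def cox_univariate(times, events, x, max_iter=50, tol=1e-9):
    """Fit h(t) = h0(t) * exp(beta * x) for a single covariate.

    times: follow-up times; events: 1 = death, 0 = censored; x: covariate
    (for example a module's PC1 score). Returns (beta, se, wald_p).
    """
    n = len(times)

    def pieces(beta):
        """Partial log-likelihood, its first derivative and its information."""
        loglik = score = info = 0.0
        for i in range(n):
            if events[i] != 1:
                continue
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            loglik += beta * x[i] - math.log(s0)
            score += x[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        return loglik, score, info

    beta = 0.0
    loglik, score, info = pieces(beta)
    for _ in range(max_iter):
        if info <= 0.0 or abs(score) < tol:
            break
        step = score / info
        trial = pieces(beta + step)
        # Halve the step if it would lower the partial log-likelihood.
        while trial[0] < loglik and abs(step) > tol:
            step /= 2.0
            trial = pieces(beta + step)
        beta += step
        loglik, score, info = trial
    if info <= 0.0:
        raise ValueError("no information about beta (no events or constant x)")
    se = 1.0 / math.sqrt(info)
    z = beta / se
    wald_p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, wald_p


if __name__ == "__main__":
    # Hypothetical follow-up data and PC1 scores for one module.
    times = [120, 340, 400, 560, 610, 800, 950, 1200]
    events = [1, 1, 0, 1, 1, 0, 1, 0]
    pc1 = [0.9, 0.4, 0.7, -0.1, 0.3, -0.5, 0.0, -0.8]
    print(cox_univariate(times, events, pc1))
```

Running this once per module and plotting the resulting p-values would give a dot chart of the kind referred to in these notes.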
The red module has a very weak association with survival (P value = 0.0509).
Age
Spearman correlation
Only my module is correlated with age, unlike in the network where the center was removed.
Tumor stage, grade, tumor residual disease, primary therapy outcome
Unlike in the network where the center was removed, no correlation with grade is found. Similarly, no correlation with stage is found (the dot chart of P values is not shown here). Since the red module is weakly associated with survival, I also see it associated with the primary therapy outcome success. The blue module is correlated with the tumor residual disease.