Important update (January 20th, 2011): the data below have been corrected for the BCR batch, which is not necessarily the processing batch. The dataset needs to be reanalyzed.
Correlation of batch with clinical traits
Date: December 20th, 2011
Number of batches based on DNA methylation samples (not necessarily available samples): 10
Correlation of batch with clinical traits (complete table) can be found here
Cross-tabulation of batch vs center (two=second field in the patient barcode)
> table(batchID,two)
       two
batchID 05 35 38 44 49 50 53 55 64 67 71 73 75 78 80 86 91 93 95 97 99
  0945   0  1  0  7  0  0  0  4  4  5  0  0  0  0  0  0  0  0  0  0  0
  1104   3  2  0  7  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  1205  14  0  6  1  7  0  0  0  0  0  0  7  0  0  0  0  0  0  0  0  0
  1551   0  0  3 10  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  1626   6  1  0  2  0  7  0  1  6  0  0  0  2  0  1  0  0  0  0  0  0
  1756   3  0  3  5  1 13  0  1  0  4  0  2  9  0  0  1  0  0  0  0  0
  1856   8  0  0  6  9  6  0  2  0  0  1  0  0  0  0  0  5  0  0  0  0
  1947   0  0  0  0  1  1  0 16  0  0  0  0  5  0  2  1  5  0  2  0  0
  2037   0  0  1  1  0  1  0  6  0  0  0  0  0 17  0  0  0  1  0  5  1
  2064   0  0  0  9  0  0  2  1  2  0  0  0  0 11  0  2  0  0  1  0  0
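A minimal sketch of how a cross-tabulation like the one above could be produced in R, assuming `barcode` holds dash-separated patient barcodes and `batchID` holds the batch of each sample. The toy vectors are made up for illustration and are not the data in this report.

```r
# Toy, made-up barcodes and batch IDs (illustration only, not the real data).
barcode <- c("TCGA-05-0001", "TCGA-05-0002", "TCGA-44-0003",
             "TCGA-44-0004", "TCGA-35-0005")
batchID <- c("1104", "1104", "0945", "1104", "0945")

# Take the second dash-separated field of each barcode.
two <- vapply(strsplit(barcode, "-", fixed = TRUE),
              function(parts) parts[2], character(1))

# Count samples for every batch / second-field combination.
tab <- table(batchID, two)
print(tab)
```

Keeping `two` as a character vector preserves leading zeros such as `05`, which would be lost if the field were converted to a number.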
Significant correlations of batch with the clinical variables:
"distant_metastasis_pathologic_spread","factor",115,"Pearson's Chi-squared test",5.22E-09
"days_to_form_completion","integer",124,"Kruskal-Wallis rank sum test",6.42E-04
"year_of_initial_pathologic_diagnosis","integer",108,"Kruskal-Wallis rank sum test",1.98E-03
"vital_status","factor",108,"Pearson's Chi-squared test",4.50E-03
Note that days_to_form_completion is most likely not a real clinical trait.
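The tests listed above pair factor variables with Pearson's Chi-squared test and integer variables with the Kruskal-Wallis rank sum test. A sketch of how that choice could be automated follows. The helper `batch_test` and the toy data are hypothetical and are not the script used for this report.

```r
# Hypothetical helper: choose the test from the type of the clinical variable.
batch_test <- function(x, batch) {
  keep <- !is.na(x) & !is.na(batch)      # drop samples with missing values
  x <- x[keep]
  batch <- factor(batch[keep])
  if (is.factor(x) || is.character(x)) {
    # Categorical trait: Chi-squared test on the batch-by-level table.
    # Warnings about small expected counts are suppressed in this sketch.
    res <- suppressWarnings(chisq.test(table(batch, factor(x))))
  } else {
    # Numeric trait: Kruskal-Wallis test across batches.
    res <- kruskal.test(x, batch)
  }
  data.frame(type = class(x)[1], n_nonmissing = length(x),
             test = res$method, p_value = res$p.value)
}

# Toy data (made up): three batches of ten samples each.
batch  <- rep(c("A", "B", "C"), each = 10)
status <- factor(rep(c("alive", "dead"), 15))
year   <- rep(c(2001L, 2005L, 2009L), each = 10) + rep(0:1, 15)

result <- rbind(batch_test(status, batch), batch_test(year, batch))
print(result)
```

With many clinical variables tested at once, the resulting p-values could additionally be adjusted with `p.adjust`; whether any adjustment was applied to the values above is not stated.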
DNA methylation data (tumor matched): 27k platform, available for 32 patients. At this point, that is not enough to proceed with analyses.