Correlation of batch with clinical traits (READ: can be found here; COAD: can be found here). Number of batches: READ, 11; COAD, 15.
Additional technical variables to be considered in normalization: batch (6th field in the patient barcode), center (2nd field in the patient barcode), amount, barcode bottom (?), concentration, day, month and year (to be concatenated and considered as a single variable), plate row, plate column. This information is available from the clinical_aliquot_public_CANCERTYPE.txt files.
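A minimal Python sketch of how the batch and center fields could be pulled from a barcode, and how day, month and year could be merged into one variable. The function names, the example barcode, and the year-month-day ordering of the merged date are assumptions made for illustration; the notes only say that the fields are the 6th and 2nd dash-separated fields and that the three date parts are concatenated.

```python
def parse_barcode(barcode):
    """Split a dash-separated barcode and return the center (2nd field)
    and batch (6th field). Field positions follow the notes above."""
    fields = barcode.split("-")
    if len(fields) < 6:
        raise ValueError(
            f"expected at least 6 fields, got {len(fields)}: {barcode!r}"
        )
    return {"center": fields[1], "batch": fields[5]}


def merge_date(day, month, year):
    """Concatenate day, month and year into a single categorical value.
    The YYYY-MM-DD ordering is an assumption; any fixed ordering would do."""
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"


# Hypothetical barcode, made up for illustration (not taken from the data).
example = "TCGA-AA-0000-01A-01D-0820-05"
info = parse_barcode(example)
print(info["center"], info["batch"])
print(merge_date(7, 3, 2010))
```

Treating the merged date as one factor keeps a single date from being split across three partly redundant covariates.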
DNA methylation: 27k array, level 1 data, downloaded on December 19th, 2011.
Total number of COAD patients (tumor/matched normal/unmatched normal) = 212
Total number of READ patients (tumor/matched normal/unmatched normal) = 81
Place the COAD and READ data files in the same directory and combine them into a single matrix of M values, where M = log2(methylated/unmethylated).
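The M-value step can be sketched in Python as follows. This is only an illustration: the input layout (a dictionary of per-sample probe intensities), the function names, the toy probe and sample identifiers, and the optional `offset` guard are all assumptions. With the default `offset=0.0` the formula is exactly log2(methylated/unmethylated) as stated above.

```python
import math


def m_value(methylated, unmethylated, offset=0.0):
    """Return log2(methylated / unmethylated).
    `offset` is an optional guard against zero intensities (an assumption,
    not part of the notes); non-positive inputs yield NaN."""
    m = methylated + offset
    u = unmethylated + offset
    if m <= 0 or u <= 0:
        return float("nan")
    return math.log2(m / u)


def build_m_matrix(samples):
    """samples: {sample_id: {probe_id: (methylated, unmethylated)}}.
    Returns (probe_ids, sample_ids, matrix), with probes as rows and
    samples as columns, restricted to probes present in every sample."""
    sample_ids = sorted(samples)
    probe_ids = sorted(set.intersection(*(set(p) for p in samples.values())))
    matrix = [
        [m_value(*samples[s][p]) for s in sample_ids] for p in probe_ids
    ]
    return probe_ids, sample_ids, matrix


# Toy intensities, invented for illustration only.
toy = {
    "sample_A": {"cg_1": (800.0, 200.0), "cg_2": (100.0, 400.0)},
    "sample_B": {"cg_1": (500.0, 500.0), "cg_2": (300.0, 150.0)},
}
probes, ids, M = build_m_matrix(toy)
for probe, row in zip(probes, M):
    print(probe, row)
```

Restricting to probes shared by all samples keeps the combined COAD and READ matrix rectangular.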
Total number of tumor samples: 237
Extracted batch and center information, relationship between them:
> table(center,batch)
      batch
center 0820 0825 0904 1020 1110 1116 A004 A00B A081
    A6    4    2    0    4    0    0    0    0    0
    AA   26    5   31   44    4    0   15   17   13
    AF    4    0    0    0    0    1    0    0    0
    AG   12   10    8    0    0   17    9    9    0
    AY    0    0    0    0    2    0    0    0    0
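As a quick consistency check, the counts from the table above can be summed per center and per batch. The Python sketch below is an illustration, not part of the original analysis: the counts are copied from the table, while the variable names and the single-center check are assumptions added here.

```python
# Batch labels and per-center counts, copied from the table(center, batch) output.
batches = ["0820", "0825", "0904", "1020", "1110", "1116", "A004", "A00B", "A081"]
counts = {
    "A6": [4, 2, 0, 4, 0, 0, 0, 0, 0],
    "AA": [26, 5, 31, 44, 4, 0, 15, 17, 13],
    "AF": [4, 0, 0, 0, 0, 1, 0, 0, 0],
    "AG": [12, 10, 8, 0, 0, 17, 9, 9, 0],
    "AY": [0, 0, 0, 0, 2, 0, 0, 0, 0],
}

# Row sums: number of samples per center.
center_totals = {center: sum(row) for center, row in counts.items()}

# Column sums: number of samples per batch.
batch_totals = {
    batch: sum(row[i] for row in counts.values())
    for i, batch in enumerate(batches)
}

# Batches whose samples all come from one center; in such a batch the
# batch effect cannot be separated from the center effect.
single_center_batches = [
    batch
    for i, batch in enumerate(batches)
    if sum(1 for row in counts.values() if row[i] > 0) == 1
]

print(center_totals)
print(batch_totals)
print(sum(center_totals.values()))  # 237, matching the tumor sample total
print(single_center_batches)
```

The grand total agrees with the 237 tumor samples reported above.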