Important update (January 20th, 2011): the data below have been corrected for the BCR batch, which is not necessarily the processing batch. The dataset needs to be reanalyzed.
Batch vs clinical traits
Number of clinical traits: 57, number of theoretical DNA methylation batches: 20
Correlation of batch with the center
> table(batchID, center)
        center
batchID A5 AJ AP AW AX B5 BG BK BS D1 DF DI E6 EC EO EY FI H5
  A00U   0  0  9  0 13  0  0  2  0  0  0  0  0  0  0  0  0  0
  A039  18  0 13  0  2  7  6  0  0  0  0  0  0  0  0  0  0  0
  A105   7  0  4  0  1  7 13  1 14  0  0  0  0  0  0  0  0  0
  A10A   1  0  0  0  1  0  3  0  3  0  0  0  0  0  0  0  0  0
  A10N   0  0  0  0  1  7  3  0  1  8  0  0  0  0  0  0  0  0
  A10Q   7  0  4  0  1  7 13  1 14  0  0  0  0  0  0  0  0  0
  A123   4  0  4  0  4 13  4  3  4 11  0  0  0  0  0  0  0  0
  A12K   0  0  0  0  0  0  5  0  1 40  0  1  0  0  0  0  0  0
  A138   0  0 12  0 15  0  0  0  0  0  0  3  0  0  0  0  0  0
  A13K   0  0  0  0  0  2  0  0  0  0  0  0  0  0  0 20  0  0
  A145   0  0  0  0  0  4  0  0  0  0  0  0  2  0  0  1  0  0
  A14H   3  0  1  0  0  5  0  0  2  6  0  0  1  1  0  2  0  0
  A14N   1  0  0  0  0  2  0  0  0  1  0  0  0  0  0  2  0  0
  A161   0  2  0  1  0  0  3  0  0  0  0  1  0  0  3  2  0  0
  A16G   0  0  0  0  0  0  2  1  0  2  0  1  0  2  0  0  0  0
  A17F   0  0  0  0  9  1  1  0  0  4  0  0  0  0  0  0 13  0
  A17H   0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
  A17Z   3  0  0  0  3  0  0  0  0  1  5  0  0  0  5  0  0  1
  A18O   2  5  0  0  3  0  1  0  0  0  1  1  0  0  3  2  1  0
  A19Z   0  6  0  0  0  4  1  0  0  2  0  2  2  0  7  3  0  0
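One way to put a number on the batch/center association in the table above is a chi-squared test with a simulated p-value, plus Cramér's V as an effect size. The sketch below is only illustrative: the `toy` data frame and its batch and center labels are made up (not the TCGA annotation), and it assumes `batchID` and `center` are available as per-sample vectors, as in the `table()` call.

```r
# Hypothetical toy data: one row per array (NOT the real TCGA annotation).
toy <- data.frame(
  batchID = rep(c("B1", "B2", "B3"), times = c(20, 20, 20)),
  center  = c(rep("C1", 18), rep("C2", 2),
              rep("C1", 3),  rep("C2", 17),
              rep("C3", 20))
)

tab <- table(toy$batchID, toy$center)

# Many cells are sparse, so use a Monte Carlo p-value instead of the
# asymptotic chi-squared approximation.
set.seed(1)
test <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)

# Cramer's V: 0 = no association, 1 = batch fully determined by center.
# The statistic is still the Pearson chi-squared statistic.
cramersV <- sqrt(unname(test$statistic) / (sum(tab) * (min(dim(tab)) - 1)))

print(test$p.value)
print(cramersV)
```

A value of V close to 1 would mean that batch and center are nearly confounded, in which case a batch effect could not be separated from a center effect.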
Correlation with clinical traits (complete table is here)
Batch vs survival
Call:
coxph(formula = survivalObject ~ batchVector)
n= 369, number of events= 29
                      coef exp(coef)  se(coef)      z Pr(>|z|)
batchVectorA039  3.652e-01 1.441e+00 8.239e-01  0.443   0.6576
batchVectorA105 -4.042e-01 6.675e-01 1.000e+00 -0.404   0.6862
batchVectorA10A -1.746e+01 2.609e-08 8.054e+03 -0.002   0.9983
batchVectorA10N -1.750e+01 2.510e-08 6.350e+03 -0.003   0.9978
batchVectorA123 -5.327e-02 9.481e-01 9.163e-01 -0.058   0.9536
batchVectorA12K  1.225e+00 3.405e+00 8.810e-01  1.391   0.1643
batchVectorA138 -7.666e-01 4.646e-01 1.225e+00 -0.626   0.5315
batchVectorA13K  1.542e+00 4.675e+00 9.249e-01  1.667   0.0954 .
batchVectorA145 -1.745e+01 2.644e-08 1.376e+04 -0.001   0.9990
batchVectorA14H -1.650e-01 8.479e-01 1.240e+00 -0.133   0.8942
batchVectorA14N -1.745e+01 2.639e-08 1.486e+04 -0.001   0.9991
batchVectorA161  2.565e+00 1.301e+01 1.286e+00  1.994   0.0461 *
batchVectorA16G -1.744e+01 2.657e-08 1.697e+04 -0.001   0.9992
batchVectorA17F  1.342e+00 3.829e+00 8.677e-01  1.547   0.1218
batchVectorA17Z -1.746e+01 2.602e-08 1.347e+04 -0.001   0.9990
batchVectorA18O  1.955e+00 7.066e+00 1.001e+00  1.953   0.0509 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
                exp(coef) exp(-coef) lower .95 upper .95
batchVectorA039 1.441e+00  6.940e-01   0.28660     7.244
batchVectorA105 6.675e-01  1.498e+00   0.09396     4.742
batchVectorA10A 2.609e-08  3.832e+07   0.00000       Inf
batchVectorA10N 2.510e-08  3.985e+07   0.00000       Inf
batchVectorA123 9.481e-01  1.055e+00   0.15737     5.712
batchVectorA12K 3.405e+00  2.937e-01   0.60563    19.143
batchVectorA138 4.646e-01  2.153e+00   0.04209     5.127
batchVectorA13K 4.675e+00  2.139e-01   0.76293    28.647
batchVectorA145 2.644e-08  3.783e+07   0.00000       Inf
batchVectorA14H 8.479e-01  1.179e+00   0.07459     9.639
batchVectorA14N 2.639e-08  3.790e+07   0.00000       Inf
batchVectorA161 1.301e+01  7.689e-02   1.04492   161.869
batchVectorA16G 2.657e-08  3.763e+07   0.00000       Inf
batchVectorA17F 3.829e+00  2.612e-01   0.69893    20.972
batchVectorA17Z 2.602e-08  3.844e+07   0.00000       Inf
batchVectorA18O 7.066e+00  1.415e-01   0.99266    50.292
Rsquare= 0.064 (max possible= 0.545 )
Likelihood ratio test= 24.27 on 16 df, p=0.08375
Wald test = 18.5 on 16 df, p=0.2953
Score (logrank) test = 30.57 on 16 df, p=0.01524
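For reference, a model of this form can be fit with the survival package roughly as follows. The sketch uses simulated data; the batch labels `X1` to `X3`, the sample size, and the event rates are all made up, and only `coxph`, `Surv`, `survivalObject` and `batchVector` correspond to names in the output above. It also illustrates one likely reading of the coefficients near -17 with standard errors in the thousands and an infinite upper confidence limit: this is the pattern `coxph` typically reports when a batch contains no events, so its estimated hazard ratio drifts toward zero. Whether that is what happened here would need to be checked against the per-batch event counts.

```r
library(survival)

set.seed(42)
n <- 90
batchVector <- factor(rep(c("X1", "X2", "X3"), each = n / 3))  # hypothetical batches
time  <- rexp(n, rate = 0.1)     # simulated follow-up times
event <- rbinom(n, 1, 0.4)       # simulated event indicator (1 = event)
event[batchVector == "X3"] <- 0  # force one batch to have zero events

survivalObject <- Surv(time, event)

# coxph warns that a coefficient may be infinite; that is expected here.
fit <- suppressWarnings(coxph(survivalObject ~ batchVector))

# Events per batch: a batch with 0 events cannot give a finite hazard ratio.
eventsPerBatch <- tapply(event, batchVector, sum)
print(eventsPerBatch)
print(summary(fit)$coefficients)
```

With only 29 events spread over many batches, a per-batch event count like `eventsPerBatch` is a useful companion to the coefficient table, since the Wald test is unreliable for batches without events.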
DNA methylation
27k array, M values, not split into red and green channels. Two arrays with NA values for the unmethylated or methylated probe intensities had to be removed (TCGA-A5-A0VQ-01A-11D-A10Q-05, TCGA-BS-A0UF-01A-11D-A10Q-05), leaving 115 arrays in total.
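As a rough illustration of this preprocessing, the sketch below drops arrays that contain NA intensities and then computes M values as the log2 ratio of methylated to unmethylated signal. The matrices, probe and array names, and the offset `alpha = 1` are assumptions made for the example; the notes above do not say which offset, if any, was used.

```r
# Hypothetical intensities: rows = probes, columns = arrays.
# matrix() fills column by column, so each line of numbers below is one array.
meth <- matrix(c(1000, 200, 50,
                 800,  NA,  60,
                 1200, 300, 40),
               nrow = 3,
               dimnames = list(paste0("probe", 1:3), paste0("array", 1:3)))
unmeth <- matrix(c(100, 900, 50,
                   150, 700, 55,
                   90,  800, 45),
                 nrow = 3, dimnames = dimnames(meth))

# Drop any array (column) with an NA in either intensity matrix.
keep <- colSums(is.na(meth) | is.na(unmeth)) == 0
meth   <- meth[, keep, drop = FALSE]
unmeth <- unmeth[, keep, drop = FALSE]

# M value: log2 ratio of methylated to unmethylated signal.
# alpha is a small offset to avoid taking the log of zero (assumed value).
alpha <- 1
mValues <- log2((meth + alpha) / (unmeth + alpha))
print(round(mValues, 2))
```

Here `array2` is removed because one of its methylated intensities is NA, which mirrors the removal of the two arrays listed above.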