h5. Batch vs clinical traits

Clinical traits: 36, number of batches: 13

Batch vs center:
{code:collapse=true}
> table(batchID,two)
       two
batchID A3 AK AS B0 B2 B4 B8 BP CJ CW CZ DV EU
   0859 31  8  2  0  0  0  0  0  0  0  0  0  0
   1186  4  6  0  0  5  0  6  5  9  0  0  0  0
   1275  0 12  0 29  1  0  1  0  0  0  0  0  0
   1284  0  0  0  0  0  0  0 50  0  0  0  0  0
   1303  0  0  0  6  0  0  0 11 24  0  6  0  0
   1323 18  7  0  0  4  0  3  5  9  0  0  0  0
   1332  0  0  0  6  0  0  0 39  2  0  0  0  0
   1418  6  0  0 27  0  0  6  8  0  0  0  0  0
   1424  0  0  0  0  0  0  0 28 16  0  3  0  0
   1500  0  1  0 15  0  2  1  1  0  0 24  0  0
   1536  2  0  0 18  5  0  5  0 13  9  0  9  0
   1551  0  0  0  0  0  0  3  0  0  0  0  0  0
   1670  0  0  0  6  0  7  4  0  7  6  7  0  4
{code}

Significant batch/trait correlations (complete table can be found [here|^BatchClinicalInfoCorrelationsKIRC.txt]):
{csv}
KIRC_clinical_traits,DataType,NumberOfNAs,Test,Pvalue
white_cell_count_result,factor,82,Pearson's Chi-squared test,2.09E-13
serum_calcium_result,factor,160,Pearson's Chi-squared test,8.31E-13
tumor_stage,factor,21,Pearson's Chi-squared test,2.11E-11
tumor_grade,factor,5,Pearson's Chi-squared test,6.43E-09
vital_status,factor,0,Pearson's Chi-squared test,9.62E-09
days_to_form_completion,integer,0,Kruskal-Wallis rank sum test,1.16E-07
year_of_initial_pathologic_diagnosis,integer,0,Kruskal-Wallis rank sum test,1.38E-07
days_to_last_known_alive,integer,10,Kruskal-Wallis rank sum test,8.41E-07
days_to_last_followup,integer,4,Kruskal-Wallis rank sum test,1.94E-06
distant_metastasis_pathologic_spread,factor,11,Pearson's Chi-squared test,2.23E-06
primary_tumor_pathologic_spread,factor,0,Pearson's Chi-squared test,3.63E-06
person_neoplasm_cancer_status,factor,28,Pearson's Chi-squared test,4.26E-06
hemoglobin_result,factor,71,Pearson's Chi-squared test,2.66E-04
lymphnode_pathologic_spread,factor,2,Pearson's Chi-squared test,7.85E-04
lymphnodes_examined_prior_presentation,factor,43,Pearson's Chi-squared test,2.05E-03
gender,factor,0,Pearson's Chi-squared test,2.10E-02
age_at_initial_pathologic_diagnosis,integer,0,Kruskal-Wallis rank sum test,2.51E-02
days_to_birth,integer,8,Kruskal-Wallis rank sum test,2.87E-02
prior_diagnosis,factor,0,Pearson's Chi-squared test,4.75E-02
{csv}

h5. Survival vs Batch

!KaplanMeierCurveKIRC.png|thumbnail! !SurvivalByBatchKIRC.png|thumbnail!

Summary can be found [here|^SurvivalBatchSummaryStatisticsKIRC.txt]; batch is significantly correlated with survival:
Likelihood ratio test = 61.35 on 10 df, p=2.007e-09
Wald test = 64.35 on 10 df, p=5.39e-10
Score (logrank) test = 75.35 on 10 df, p=4.066e-12

h5. DNA methylation data analysis

27k dataset, downloaded on December 28, 2011. 219 samples.

Technical variables available: batch, amount, concentration, day of shipment, month of shipment, year of shipment, plate row, plate column. Day, month and year are combined into a single variable (dateCombined).

Info about technical variables:
{code:collapse=true}
> head(methNew)
    batchID  amount concentration plate_column plate_row dateCombined
2      0859 26.7 uL    0.14 ug/uL            1         A    17-3-2010
32     0859 26.7 uL    0.17 ug/uL            1         C    17-3-2010
59     0859 26.7 uL    0.15 ug/uL            1         D    17-3-2010
84     0859 26.7 uL    0.15 ug/uL            1         E    17-3-2010
110    0859 26.7 uL    0.15 ug/uL            1         F    17-3-2010
124    0859 26.7 uL    0.15 ug/uL            1         G    17-3-2010
> table(methNew$batchID)

0859 1186 1284 1303 1332
  40   35   50   47   47
> table(methNew$amount)

26.7 uL
    219
# It seems that all values of amount are 26.7 uL (the factor also has levels, with zero counts,
# for all values available for future DNA methylation datasets for patients for whom samples are already collected)
> table(methNew$concentration)

0.13 ug/uL 0.14 ug/uL 0.15 ug/uL 0.16 ug/uL 0.17 ug/uL
         7         50        122         30         10
> table(methNew$plate_column)

 1  2  3  4  5  6  7
39 40 40 40 35 23  2
> table(methNew$plate_row)

 A  B  C  D  E  F  G  H
30 28 28 27 27 27 27 25
> table(methNew$plate_column,methNew$plate_row)

    A B C D E F G H
  1 5 4 5 5 5 5 5 5
  2 5 5 5 5 5 5 5 5
  3 5 5 5 5 5 5 5 5
  4 5 5 5 5 5 5 5 5
  5 5 5 5 4 4 4 4 4
  6 4 3 3 3 3 3 3 1
  7 1 1 0 0 0 0 0 0
> table(methNew$dateCombined)

11-10-2010  17-3-2010  25-8-2010  27-9-2010  6-10-2010
        47         40         35         50         47
> table(methNew$dateCombined,methNew$batchID)

             0859 1186 1284 1303 1332
  11-10-2010    0    0    0    0   47
  17-3-2010    40    0    0    0    0
  25-8-2010     0   35    0    0    0
  27-9-2010     0    0   50    0    0
  6-10-2010     0    0    0   47    0
{code}

Exclude "amount" from the calculation of the correlations of the first principal components of the data with the technical variables, since all 219 samples have the same amount (26.7 uL).

Created a matrix of M values without splitting the red and green channels.

Relative variance, no normalization: !KIRC_Mval_noNorm_RelativeVariance.png|thumbnail!
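A minimal sketch of how the relative variance per component could be computed from an SVD. It assumes that each probe (row) is mean-centered first, which the notes here do not state, and it uses simulated values in place of the real M-value matrix. The names mval, centered and relVar are illustrative only:

{code}
# Simulated stand-in for the M-value matrix (probes in rows, samples in columns).
set.seed(1)
mval <- matrix(rnorm(200 * 20), nrow = 200, ncol = 20)

# Assumption: center each probe so the SVD describes variation around the probe mean.
# The vector of row means is recycled down the columns, so row i loses its own mean.
centered <- mval - rowMeans(mval)

# svd() returns the singular values in decreasing order in $d.
s <- svd(centered)

# Relative variance of each component: squared singular value over their total.
relVar <- s$d^2 / sum(s$d^2)

# Scree-style plot of the relative variance per component.
plot(relVar, type = "b", xlab = "Principal component", ylab = "Relative variance")
{code}

The number of components to inspect is then read off the point where the curve flattens.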
Based on the plot, I will look at the first 8 principal components:
{code:collapse=true}
        batchID concentration plate_column plate_row dateCombined
V1 2.024556e-22     0.5182919   0.22249235 0.9371285 2.024556e-22
V2 1.777673e-18     0.2878497   0.40175378 0.6195123 1.777673e-18
V3 3.196508e-01     0.3802798   0.27628233 0.5517096 3.196508e-01
V4 1.693859e-30     0.2449447   0.50367703 0.9672545 1.693859e-30
V5 2.435091e-03     0.1812444   0.08644977 0.5581507 2.435091e-03
V6 4.437547e-03     0.9473683   0.15938639 0.8458098 4.437547e-03
V7 1.271181e-03     0.3644802   0.79816984 0.7038321 1.271181e-03
V8 1.051940e-05     0.5905213   0.28713862 0.2173504 1.051940e-05
{code}

Batch and dateCombined are significantly associated with all of the first 8 principal components except V3 (p = 0.32); V1 - V8 are the principal components after performing an SVD on the unnormalized matrix. The two columns of p-values are identical because each batch was shipped on a single date, so batch and dateCombined define the same grouping of samples. None of the other technical variables reaches p < 0.05 for any of these components.

Start by removing the batch:
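The method used for this step is not shown in these notes. As a hedged illustration only, the sketch below assumes the simplest possible adjustment, subtracting each probe's per-batch mean, and assumes a Kruskal-Wallis test of the component scores against batch. It runs on simulated data; pcPvalues, removeBatchMeans and the batch labels are hypothetical names, not part of the original analysis:

{code}
set.seed(2)
nProbes <- 300
batch <- factor(rep(c("b1", "b2", "b3"), times = c(10, 12, 8)))  # hypothetical batch labels
nSamples <- length(batch)

# Simulated M values with an additive per-batch shift on every probe.
shift <- matrix(rnorm(nProbes * nlevels(batch)), nrow = nProbes)
mval <- matrix(rnorm(nProbes * nSamples), nrow = nProbes) + shift[, as.integer(batch)]

# Kruskal-Wallis p-values of the first k component scores against a factor.
pcPvalues <- function(m, f, k = 3) {
  # Columns of $v hold the per-sample scores (right singular vectors).
  v <- svd(m - rowMeans(m))$v[, seq_len(k), drop = FALSE]
  apply(v, 2, function(pc) kruskal.test(pc, f)$p.value)
}

# Subtract each probe's mean within every batch, then restore the overall probe mean.
removeBatchMeans <- function(m, f) {
  out <- m
  for (lev in levels(f)) {
    idx <- which(f == lev)
    out[, idx] <- m[, idx, drop = FALSE] - rowMeans(m[, idx, drop = FALSE])
  }
  out + rowMeans(m)
}

before <- pcPvalues(mval, batch)
adjusted <- removeBatchMeans(mval, batch)
after <- pcPvalues(adjusted, batch)

# Compare the association of the leading components with batch before and after.
print(rbind(before, after))
{code}

Mean-centering only removes additive shifts between batches. It also removes any biological signal that is confounded with batch, which matters here because batch is associated with several clinical traits and with survival.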