Wiki Markup |
---|
h5. Batch vs clinical traits

Clinical traits: 36, number of batches: 13

Batch vs center:
{code:collapse=true}
> table(batchID,two)
       two
batchID A3 AK AS B0 B2 B4 B8 BP CJ CW CZ DV EU
   0859 31  8  2  0  0  0  0  0  0  0  0  0  0
   1186  4  6  0  0  5  0  6  5  9  0  0  0  0
   1275  0 12  0 29  1  0  1  0  0  0  0  0  0
   1284  0  0  0  0  0  0  0 50  0  0  0  0  0
   1303  0  0  0  6  0  0  0 11 24  0  6  0  0
   1323 18  7  0  0  4  0  3  5  9  0  0  0  0
   1332  0  0  0  6  0  0  0 39  2  0  0  0  0
   1418  6  0  0 27  0  0  6  8  0  0  0  0  0
   1424  0  0  0  0  0  0  0 28 16  0  3  0  0
   1500  0  1  0 15  0  2  1  1  0  0 24  0  0
   1536  2  0  0 18  5  0  5  0 13  9  0  9  0
   1551  0  0  0  0  0  0  3  0  0  0  0  0  0
   1670  0  0  0  6  0  7  4  0  7  6  7  0  4
{code}

Significant batch/trait correlations (complete table can be found [here|^BatchClinicalInfoCorrelationsKIRC.txt]):
{csv}KIRC_clinical_traits,DataType,NumberOfNAs,Test,Pvalue
white_cell_count_result,factor,82,Pearson's Chi-squared test,2.09E-13
serum_calcium_result,factor,160,Pearson's Chi-squared test,8.31E-13
tumor_stage,factor,21,Pearson's Chi-squared test,2.11E-11
tumor_grade,factor,5,Pearson's Chi-squared test,6.43E-09
vital_status,factor,0,Pearson's Chi-squared test,9.62E-09
days_to_form_completion,integer,0,Kruskal-Wallis rank sum test,1.16E-07
year_of_initial_pathologic_diagnosis,integer,0,Kruskal-Wallis rank sum test,1.38E-07
days_to_last_known_alive,integer,10,Kruskal-Wallis rank sum test,8.41E-07
days_to_last_followup,integer,4,Kruskal-Wallis rank sum test,1.94E-06
distant_metastasis_pathologic_spread,factor,11,Pearson's Chi-squared test,2.23E-06
primary_tumor_pathologic_spread,factor,0,Pearson's Chi-squared test,3.63E-06
person_neoplasm_cancer_status,factor,28,Pearson's Chi-squared test,4.26E-06
hemoglobin_result,factor,71,Pearson's Chi-squared test,2.66E-04
lymphnode_pathologic_spread,factor,2,Pearson's Chi-squared test,7.85E-04
lymphnodes_examined_prior_presentation,factor,43,Pearson's Chi-squared test,2.05E-03
gender,factor,0,Pearson's Chi-squared test,2.10E-02
age_at_initial_pathologic_diagnosis,integer,0,Kruskal-Wallis rank sum test,2.51E-02
days_to_birth,integer,8,Kruskal-Wallis rank sum test,2.87E-02
prior_diagnosis,factor,0,Pearson's Chi-squared test,4.75E-02{csv}

h5. Survival vs Batch

!KaplanMeierCurveKIRC.png|thumbnail! !SurvivalByBatchKIRC.png|thumbnail!

Summary can be found [here|^SurvivalBatchSummaryStatisticsKIRC.txt], batch is significantly correlated with survival:
Likelihood ratio test= 61.35 on 10 df, p=2.007e-09
Wald test = 64.35 on 10 df, p=5.39e-10
Score (logrank) test = 75.35 on 10 df, p=4.066e-12

h5. DNA methylation data analysis

27k dataset, downloaded on December 28, 2011. 219 samples.

Technical variables available: batch, amount, concentration, day of shipment, month of shipment, year of shipment, plate row, plate column. Combine day, month and year in a single variable.

Info about technical variables:
{code:collapse=true}
> head(methNew)
   batchID  amount concentration plate_column plate_row dateCombined
2     0859 26.7 uL    0.14 ug/uL            1         A    17-3-2010
32    0859 26.7 uL    0.17 ug/uL            1         C    17-3-2010
59    0859 26.7 uL    0.15 ug/uL            1         D    17-3-2010
84    0859 26.7 uL    0.15 ug/uL            1         E    17-3-2010
> table(methNew$batchID)
0859 1186 1284 1303 1332
  40   35   50   47   47
> table(methNew$amount)
26.7 uL
    219
> table(methNew$concentration)
0.13 ug/uL 0.14 ug/uL 0.15 ug/uL 0.16 ug/uL 0.17 ug/uL
         7         50        122         30         10
> table(methNew$plate_column)
 1  2  3  4  5  6  7
39 40 40 40 35 23  2
> table(methNew$plate_row)
 A  B  C  D  E  F  G  H
30 28 28 27 27 27 27 25
> table(methNew$plate_column,methNew$plate_row)
    A B C D E F G H
  1 5 4 5 5 5 5 5 5
  2 5 5 5 5 5 5 5 5
  3 5 5 5 5 5 5 5 5
  4 5 5 5 5 5 5 5 5
  5 5 5 5 4 4 4 4 4
  6 4 3 3 3 3 3 3 1
  7 1 1 0 0 0 0 0 0
> table(methNew$dateCombined)
11-10-2010  17-3-2010  25-8-2010  27-9-2010  6-10-2010
        47         40         35         50         47
> table(methNew$dateCombined,methNew$batchID)
             0859 1186 1284 1303 1332
  11-10-2010    0    0    0    0   47
  17-3-2010    40    0    0    0    0
  25-8-2010     0   35    0    0    0
  27-9-2010     0    0   50    0    0
  6-10-2010     0    0    0   47    0
{code}

Exclude "amount" from calculations for the correlations of the first principal components of the data with the technical variables.
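
The two preparation steps above (combining the shipment date fields and leaving out "amount", which takes the single value 26.7 uL in all 219 samples) could look roughly like the R sketch below. It runs on a small toy data frame, not the real annotation. The column names {{day_of_shipment}}, {{month_of_shipment}} and {{year_of_shipment}} and the helper objects {{tech}}, {{constant_vars}} and {{tech_used}} are assumptions made for illustration; only {{batchID}}, {{amount}}, {{concentration}} and {{dateCombined}} appear in the output above.

```r
# Toy stand-in for the technical annotation (values are illustrative only).
# The day/month/year column names are assumed; they are not shown in the notes.
tech <- data.frame(
  batchID = c("0859", "0859", "1186", "1284"),
  amount = rep("26.7 uL", 4),
  concentration = c("0.14 ug/uL", "0.17 ug/uL", "0.15 ug/uL", "0.15 ug/uL"),
  day_of_shipment = c(17, 17, 25, 27),
  month_of_shipment = c(3, 3, 8, 9),
  year_of_shipment = c(2010, 2010, 2010, 2010),
  stringsAsFactors = FALSE
)

# Combine day, month and year into one variable, in the "17-3-2010" style.
tech$dateCombined <- paste(tech$day_of_shipment, tech$month_of_shipment,
                           tech$year_of_shipment, sep = "-")
date_parts <- c("day_of_shipment", "month_of_shipment", "year_of_shipment")
tech <- tech[, !(names(tech) %in% date_parts)]

# A variable with a single level carries no information for an association
# test, so it is dropped before testing against the principal components.
n_levels <- vapply(tech, function(x) length(unique(x)), integer(1))
constant_vars <- names(n_levels)[n_levels < 2]
tech_used <- tech[, n_levels >= 2, drop = FALSE]

print(constant_vars)
print(names(tech_used))
```

Detecting constant columns by counting levels, rather than naming "amount" directly, is a design choice of this sketch; it keeps the same code usable if another variable turns out to be constant in a different dataset.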
Created a matrix of M values without splitting red and green.

Relative variance without normalization, and the outliers:

!KIRC_Mval_noNorm_RelativeVariance.png|thumbnail! !KIRC_Mval_unnorm_PC1_outliers.png|thumbnail!

Based on the plot, we look at the first 8 principal components:
{code:collapse=true}
        batchID concentration plate_column plate_row dateCombined
V1 2.024556e-22     0.5182919   0.22249235 0.9371285 2.024556e-22
V2 1.777673e-18     0.2878497   0.40175378 0.6195123 1.777673e-18
V3 3.196508e-01     0.3802798   0.27628233 0.5517096 3.196508e-01
V4 1.693859e-30     0.2449447   0.50367703 0.9672545 1.693859e-30
V5 2.435091e-03     0.1812444   0.08644977 0.5581507 2.435091e-03
V6 4.437547e-03     0.9473683   0.15938639 0.8458098 4.437547e-03
V7 1.271181e-03     0.3644802   0.79816984 0.7038321 1.271181e-03
V8 1.051940e-05     0.5905213   0.28713862 0.2173504 1.051940e-05
{code}

Batch and dateCombined are highly correlated with the first principal components (V1 - V8 are the principal components after performing an SVD on the unnormalized matrix). The batchID and dateCombined columns are identical because each batch was shipped on a single date.

Start by removing the batch. Relative variance and the outliers after removing the batch. |
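
As a rough illustration of the workflow described above (SVD of the M-value matrix, relative variance per component, testing each component against each technical variable, then removing batch), here is a hedged R sketch on simulated data. The notes do not state which test produced the p-values or how batch was removed. The Kruskal-Wallis test and the per-probe linear-model residuals used here are assumptions, as are all object names ({{M}}, {{tech}}, {{pvals}}, {{M_adj}}).

```r
set.seed(1)
n_probes <- 200
n_samples <- 40

# Simulated technical variables: four batches of ten samples, two plate rows.
batch <- factor(rep(c("b1", "b2", "b3", "b4"), each = 10))
plate_row <- factor(rep(c("A", "B"), times = 20))

# Simulated M values (probes x samples); half of the probes get a batch shift.
M <- matrix(rnorm(n_probes * n_samples), nrow = n_probes)
shift <- c(0, 2, 4, 6)[as.integer(batch)]
M[1:100, ] <- sweep(M[1:100, ], 2, shift, "+")

# Centre each probe, then SVD; columns of s$v are the sample-level components.
Mc <- M - rowMeans(M)
s <- svd(Mc)
rel_var <- s$d^2 / sum(s$d^2)  # relative variance explained by each component

# Test the first 8 components against each technical variable
# (Kruskal-Wallis is an assumed choice, not confirmed by the notes).
n_pc <- 8
tech <- data.frame(batchID = batch, plate_row = plate_row)
pvals <- sapply(tech, function(f) {
  sapply(seq_len(n_pc), function(k) kruskal.test(s$v[, k], f)$p.value)
})
rownames(pvals) <- paste0("V", seq_len(n_pc))
print(signif(pvals, 3))

# One simple way to remove batch (assumed): keep the per-probe residuals of a
# linear model with batch as the only covariate, then repeat the SVD.
M_adj <- t(residuals(lm(t(M) ~ batch)))
s_adj <- svd(M_adj - rowMeans(M_adj))
p_after <- kruskal.test(s_adj$v[, 1], batch)$p.value
print(p_after)
```

In this simulation the first component is driven by batch before adjustment and no longer separates the batches afterwards. With real data, a dedicated batch-correction method may be preferable to plain residuals, since batch is also associated with clinical traits here.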