h5. Batch vs clinical traits

There are 12 batches. Cross-tabulation of center (rows) against batch (columns):
{code:collapse=true}table(two,batchID)
    batchID
two  0689 0848 0979 1096 1198 1440 1551 1633 1818 1871 1947 2043
  18    0    0   12    2    0    2    0    2    0    0    0    0
  21   13    0    0    0    0    0    0    4    0    0    0    0
  22    9    0    0    0    6    3    0   12    2    0    2    0
  33    0    0    0    0    4    5    0    0    1    0    1    0
  34    0    6    0    0    0    2    0    1    7    1    1    0
  37    0    0    2    6    1    0    0    1    0    0    0    0
  39    0    0    0    0    0   12    0    0    4    0    0    0
  43    0    3    2    0    0    0    2    1    4    0    1    0
  46    0    0    5    0    0    0    0    0    2    0    0    0
  51    0    0    0    3    0    0    0    0    0    0    0    1
  56    1    0    0    0    0    0    0    2    2    0    0    6
  60    0   20    0    0    1    0    0    0    1    2    0    2
  63    0    0    0    0    0    2    0    0    1    0    4    0
  66    0   18   15    0    6    0    0    0    0    0    0    0
  70    0    0    0    0    0    0    0    0    2    0    0    0
  77    0    0    0    0    0    0    0    0    0    0    4   10
  79    0    0    0    0    0    0    0    0    0    0    1    0
  85    0    0    0    0    0    0    0    0    3    0    1    0
  90    0    0    0    0    0    0    0    0    0    0    1    0
  92    0    0    0    0    0    0    0    0    0    0    0    2
  94    0    0    0    0    0    0    0    0    0    0    1    0
  96    0    0    0    0    0    0    0    0    0    0    0    2
  98    0    0    0    0    0    0    0    0    0    0    0    1{code}
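A cross-tabulation like the one above can be summarised further to see how strongly centers are confined to particular batches. The following is a minimal sketch on invented toy data; only the call {{table(two, batchID)}} comes from this page, and the helper names {{rowShare}} and {{batchesPerCenter}} are hypothetical.

```r
# Toy data only: hypothetical center codes and batch IDs, not the LUSC values.
set.seed(1)
two <- sample(c("18", "21", "22"), 60, replace = TRUE)
batchID <- sample(c("0689", "0848", "0979"), 60, replace = TRUE)

# Same call as on this page: rows are centers, columns are batches.
centerByBatch <- table(two, batchID)
print(centerByBatch)

# Share of each center's samples that falls into each batch (rows sum to 1).
rowShare <- prop.table(centerByBatch, margin = 1)
print(round(rowShare, 2))

# Number of distinct batches each center contributes to.
batchesPerCenter <- rowSums(centerByBatch > 0)
print(batchesPerCenter)
```

Centers whose samples fall almost entirely into a single batch are the ones for which batch and center cannot be separated.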
Significant correlations between batch and clinical traits (the complete list can be found [here|^BatchClinicalInfoCorrelationsLUSC.txt]):
{csv}LUSC,DataType,NumberOfNAs,Test,Pvalue
tumor_stage,factor,27,Pearson's Chi-squared test,8.78E-14
year_of_initial_pathologic_diagnosis,integer,23,Kruskal-Wallis rank sum test,7.95E-12
days_to_form_completion,integer,30,Kruskal-Wallis rank sum test,1.48E-09
primary_tumor_pathologic_spread,factor,23,Pearson's Chi-squared test,1.96E-09
distant_metastasis_pathologic_spread,factor,29,Pearson's Chi-squared test,3.77E-05
days_to_last_followup,integer,42,Kruskal-Wallis rank sum test,7.68E-05
vital_status,factor,23,Pearson's Chi-squared test,2.37E-03
year_of_tobacco_smoking_onset,integer,116,Kruskal-Wallis rank sum test,3.12E-03
year_of_tobacco_smoking_cessation,integer,88,Kruskal-Wallis rank sum test,5.84E-03
days_to_last_known_alive,integer,75,Kruskal-Wallis rank sum test,7.37E-03
residual_tumor,factor,46,Pearson's Chi-squared test,2.00E-02
lymphnode_pathologic_spread,factor,23,Pearson's Chi-squared test,5.48E-02
age_at_initial_pathologic_diagnosis,integer,30,Kruskal-Wallis rank sum test,9.24E-02
days_to_birth,integer,30,Kruskal-Wallis rank sum test,9.73E-02{csv}
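The table pairs each data type with a test: Pearson's chi-squared test for factors and the Kruskal-Wallis rank sum test for integers. A minimal sketch of that selection logic is given below, assuming that NAs are dropped before testing. The function {{batchTraitTest}} and the toy data are hypothetical and are not the script that produced the table.

```r
# Hypothetical helper: choose the test from the trait's data type, mirroring
# the DataType and Test columns of the table.
batchTraitTest <- function(trait, batch) {
  nNA <- sum(is.na(trait))                 # NAs in the trait itself
  keep <- !is.na(trait) & !is.na(batch)    # assumption: NAs are dropped
  x <- trait[keep]
  g <- factor(batch[keep])
  if (is.factor(x) || is.character(x)) {
    # Categorical trait: chi-squared test on the trait-by-batch table.
    # Small expected counts trigger a warning, silenced here for the toy data.
    test <- suppressWarnings(chisq.test(table(factor(x), g)))
  } else {
    # Numeric trait: Kruskal-Wallis test across batches.
    test <- kruskal.test(x, g)
  }
  data.frame(DataType = class(trait)[1], NumberOfNAs = nNA,
             Test = test$method, Pvalue = unname(test$p.value))
}

# Toy data only (invented values).
set.seed(2)
batch <- sample(c("0689", "0848", "0979"), 90, replace = TRUE)
stage <- factor(sample(c("Stage I", "Stage II", "Stage III"), 90, replace = TRUE))
year <- sample(1995:2010, 90, replace = TRUE)
year[1:5] <- NA

results <- rbind(batchTraitTest(stage, batch), batchTraitTest(year, batch))
print(results)
```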

h5. Batch vs survival

Again, for this type of cancer the clinical traits file contains days_to_last_known_alive, but it has more NAs than days_to_last_followup (75 vs. 42), so I use the latter to construct the survival object.
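The choice of time column and the construction of the survival object can be sketched as follows. The column names follow the trait names used on this page, but the values are invented. Coding the event as {{vital_status == "DECEASED"}} is an assumption, since the page does not say how the event indicator was derived.

```r
library(survival)

# Toy clinical table with invented values.
clinical <- data.frame(
  days_to_last_followup    = c(120, 450, NA, 800, 365, 1020),
  days_to_last_known_alive = c(NA, 450, NA, NA, 365, NA),
  vital_status = c("DECEASED", "LIVING", "LIVING",
                   "DECEASED", "LIVING", "DECEASED")
)

# Compare missingness of the two candidate time columns.
timeColumns <- c("days_to_last_followup", "days_to_last_known_alive")
naCounts <- colSums(is.na(clinical[, timeColumns]))
print(naCounts)

# Assumption: a patient recorded as DECEASED counts as an event,
# everyone else is censored at the last follow-up.
survivalObject <- Surv(time = clinical$days_to_last_followup,
                       event = clinical$vital_status == "DECEASED")
print(survivalObject)

# Overall Kaplan-Meier fit; rows with a missing time are dropped.
kmFit <- survfit(survivalObject ~ 1)
print(kmFit)
```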

!KaplanMeierCurveLUSC.png|thumbnail! !SurvivalByBatchLUSC.png|thumbnail!
{code:collapse=true}Call:
coxph(formula = survivalObject ~ batchVector)

  n= 223, number of events= 92

                    coef exp(coef) se(coef)      z Pr(>|z|)
batchVector0848 -0.14217   0.86748  0.37975 -0.374  0.70813
batchVector0979 -0.23685   0.78911  0.42661 -0.555  0.57877
batchVector1096  1.66699   5.29619  0.60925  2.736  0.00622 **
batchVector1198 -0.13837   0.87077  0.42245 -0.328  0.74325
batchVector1440 -0.25689   0.77345  0.38754 -0.663  0.50741
batchVector1633  0.27021   1.31025  0.37760  0.716  0.47423
batchVector1818 -0.21253   0.80853  0.46395 -0.458  0.64688
batchVector1947 -0.05172   0.94959  0.48598 -0.106  0.91524
batchVector2043  0.06161   1.06355  1.03684  0.059  0.95261
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                exp(coef) exp(-coef) lower .95 upper .95
batchVector0848    0.8675     1.1528    0.4121     1.826
batchVector0979    0.7891     1.2672    0.3420     1.821
batchVector1096    5.2962     0.1888    1.6046    17.481
batchVector1198    0.8708     1.1484    0.3805     1.993
batchVector1440    0.7735     1.2929    0.3619     1.653
batchVector1633    1.3102     0.7632    0.6251     2.746
batchVector1818    0.8085     1.2368    0.3257     2.007
batchVector1947    0.9496     1.0531    0.3663     2.462
batchVector2043    1.0636     0.9402    0.1394     8.116

Rsquare= 0.041   (max possible= 0.973 )
Likelihood ratio test= 9.36  on 9 df,   p=0.4044
Wald test            = 12.53  on 9 df,   p=0.1849
Score (logrank) test = 15.53  on 9 df,   p=0.07747
{code}
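As a reading aid for the output above, the derived columns for batch 1096 can be recomputed from its coefficient and standard error alone. This is plain arithmetic on the reported numbers; the variable names are illustrative.

```r
# Coefficient and standard error for batchVector1096, copied from the output.
beta <- 1.66699
se <- 0.60925

hazardRatio <- exp(beta)                            # exp(coef) column
z <- beta / se                                      # Wald z statistic
pValue <- 2 * pnorm(-abs(z))                        # two-sided Pr(>|z|)
ci95 <- exp(beta + c(-1, 1) * qnorm(0.975) * se)    # lower .95, upper .95

print(round(c(hazardRatio = hazardRatio, z = z, p = pValue,
              lower = ci95[1], upper = ci95[2]), 4))
```

The hazard ratio is relative to the reference batch, which is the batch that has no coefficient row in the output (0689).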

Overall, the correlation of batch with survival is not significant. One batch (1096) stands out, but it contains only 11 patients. When I removed all patients from that batch, the correlation of batch with survival became completely insignificant (likelihood ratio test = 2.47 on 8 df, p = 0.9631; Wald test = 2.61 on 8 df, p = 0.9563; score (logrank) test = 2.65 on 8 df, p = 0.9544).
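Refitting the Cox model without one batch can be sketched as below. The data are simulated with no real batch effect, so the printed statistics are meaningless; only the mechanics (subset, drop the unused factor level, refit, read the three global tests) are illustrated. The data frame {{toy}} and its columns are hypothetical.

```r
library(survival)

# Simulated toy data (invented): four batches, no true batch effect.
set.seed(3)
n <- 120
toy <- data.frame(
  time = rexp(n, rate = 1 / 500),
  event = rbinom(n, 1, 0.5),
  batchVector = factor(sample(c("0689", "0848", "0979", "1096"),
                              n, replace = TRUE))
)

# Model with all batches.
fitAll <- coxph(Surv(time, event) ~ batchVector, data = toy)

# Remove one batch and drop its now-empty factor level before refitting.
reduced <- droplevels(subset(toy, batchVector != "1096"))
fitReduced <- coxph(Surv(time, event) ~ batchVector, data = reduced)

# Global tests of the reduced model: likelihood ratio, Wald, score (logrank).
reducedSummary <- summary(fitReduced)
print(reducedSummary$logtest)
print(reducedSummary$waldtest)
print(reducedSummary$sctest)
```

Dropping the unused level matters: with one batch fewer, the global tests lose one degree of freedom, as in the change from 9 df to 8 df reported on this page.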


h5. DNA methylation