
  1. Carefully identify the probes and remove them from the raw data (5060 + ~800).
  2. Create two datasets: batch-only normalized (125 tumor patients) and gender- and age-normalized (100 patients; no need to remove batch since they all came from a single batch).
  3. Repeat ConsensusClusterPlus with the most variable probes (HC, complete linkage, Euclidean distance). Use 10% of the original probe number as described in the Sweave document. They followed the vignette directly.
  4. Repeat ConsensusClusterPlus using K-means, K=2:6, Pearson correlation. Use 10% of the original probe number as described in the Sweave document.
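The probe selection in steps 3 and 4 can be sketched roughly as follows. This is only an illustration on a synthetic matrix: `select_top_variable` is a hypothetical helper, and using the median absolute deviation (MAD) as the variability measure is an assumption borrowed from the ConsensusClusterPlus vignette, not something stated in these notes. The clustering call is guarded so that it only runs when the Bioconductor package is installed, and `reps = 50` is chosen for speed only.

```r
# Synthetic stand-in for the probe-by-sample matrix (rows = probes, columns = tumors).
set.seed(42)
expr <- matrix(rnorm(1000 * 20), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000), paste0("tumor", 1:20)))

# Hypothetical helper: keep the given fraction of probes with the highest MAD.
select_top_variable <- function(mat, fraction = 0.10) {
  n_keep <- round(nrow(mat) * fraction)
  variability <- apply(mat, 1, mad)
  keep <- order(variability, decreasing = TRUE)[seq_len(n_keep)]
  mat[keep, , drop = FALSE]
}

top <- select_top_variable(expr, fraction = 0.10)
dim(top)

# Optional: HC, complete linkage, Euclidean distance, as in step 3.
# Skipped when ConsensusClusterPlus is not installed.
if (requireNamespace("ConsensusClusterPlus", quietly = TRUE)) {
  results <- ConsensusClusterPlus::ConsensusClusterPlus(
    top, maxK = 6, reps = 50, pItem = 0.8, pFeature = 1,
    clusterAlg = "hc", distance = "euclidean",
    innerLinkage = "complete", finalLinkage = "complete",
    seed = 1, plot = "pdf", title = tempfile("ccp_"))
  # Class assignment per sample for K = 4
  consClass4 <- results[[4]][["consensusClass"]]
}
```

The class vector extracted at the end is what the association tests further down take as input.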


    Batch only normalized, HC, euclidean distance.

Maybe there are 5 or 6 clusters, but not 4. It definitely doesn't look like the clusters identified in the paper.

Gender and age adjusted, HC, euclidean distance. 

This looks significantly worse. 

 

Final attempt: K-means clustering, Pearson correlation and the seed value provided in the package. Batch removed:

> icl[["clusterConsensus"]]
      k cluster clusterConsensus
 [1,] 2       1        0.8790997
 [2,] 2       2        0.8825873
 [3,] 3       1        0.8900365
 [4,] 3       2        0.8615545
 [5,] 3       3        0.8763479
 [6,] 4       1        0.7145161
 [7,] 4       2        0.8060535
 [8,] 4       3        0.9781181
 [9,] 4       4        0.9889430
[10,] 5       1        0.8289404
[11,] 5       2        0.7577152
[12,] 5       3        0.8221909
[13,] 5       4        0.7363796
[14,] 5       5        0.9454712
[15,] 6       1        0.8789223
[16,] 6       2        0.7593188
[17,] 6       3        0.7090342
[18,] 6       4        0.7150963
[19,] 6       5        0.9857516
[20,] 6       6        0.9189523
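To compare the values of K at a glance, the per-cluster consensus values printed above can be summarized per K. The sketch below simply retypes those numbers into a data frame (the name `cc` is illustrative) and reports the minimum and mean cluster consensus for each K; a low minimum points to at least one unstable cluster at that K.

```r
# Cluster consensus values copied from the output above (batch removed, K-means).
cc <- data.frame(
  k = rep(2:6, times = 2:6),
  cluster = sequence(2:6),
  clusterConsensus = c(
    0.8790997, 0.8825873,
    0.8900365, 0.8615545, 0.8763479,
    0.7145161, 0.8060535, 0.9781181, 0.9889430,
    0.8289404, 0.7577152, 0.8221909, 0.7363796, 0.9454712,
    0.8789223, 0.7593188, 0.7090342, 0.7150963, 0.9857516, 0.9189523))

# Minimum and mean consensus per K.
min_by_k  <- tapply(cc$clusterConsensus, cc$k, min)
mean_by_k <- tapply(cc$clusterConsensus, cc$k, mean)
print(round(rbind(min = min_by_k, mean = mean_by_k), 3))
```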

I tried to correlate the clusters with age and gender. It looks like the clusters don't correlate with age at all but have some correlation with gender.

K = 4:

> kruskal.test(tumorMeta$Age,consClass4)
        Kruskal-Wallis rank sum test
data:  tumorMeta$Age and consClass4
Kruskal-Wallis chi-squared = 4.9015, df = 3, p-value = 0.1792
> chisq.test(tumorMeta$Gender,consClass4)
        Pearson's Chi-squared test
data:  tumorMeta$Gender and consClass4
X-squared = 14.7676, df = 3, p-value = 0.002026

Age distribution among clusters:

[Image not available: age distribution among clusters]

Test for association with mutation status:

> chisq.test(k,tumorMeta$BRAF_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$BRAF_mutation 
X-squared = 95.1974, df = 3, p-value < 2.2e-16
Warning message:
In chisq.test(k, tumorMeta$BRAF_mutation) :
  Chi-squared approximation may be incorrect
> chisq.test(k,tumorMeta$KRAS_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$KRAS_mutation 
X-squared = 26.6428, df = 3, p-value = 6.995e-06
Warning message:
In chisq.test(k, tumorMeta$KRAS_mutation) :
  Chi-squared approximation may be incorrect
> chisq.test(k,tumorMeta$TP53_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$TP53_mutation 
X-squared = 12.1586, df = 3, p-value = 0.006859
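The warning "Chi-squared approximation may be incorrect" is raised when some expected cell counts are small (below 5), which is likely here because a rare mutation is spread over four clusters. A possible cross-check, not part of the original analysis, is a Monte Carlo p-value or Fisher's exact test. The sketch below uses made-up data; `cluster` and `mutation` are hypothetical stand-ins for `k` and a `tumorMeta` mutation column.

```r
set.seed(1)
# Hypothetical data: 125 tumors, 4 clusters, a rare binary mutation.
cluster  <- factor(sample(1:4, 125, replace = TRUE))
mutation <- factor(rbinom(125, 1, 0.1), levels = c(0, 1), labels = c("wt", "mut"))
tab <- table(cluster, mutation)

# Expected counts under independence; cells below 5 are what trigger the warning.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
print(round(expected, 1))

# Alternative 1: chi-squared statistic with a simulated (Monte Carlo) p-value.
p_sim <- chisq.test(tab, simulate.p.value = TRUE, B = 10000)$p.value

# Alternative 2: Fisher's exact test on the same contingency table.
p_fisher <- fisher.test(tab)$p.value

c(simulated = p_sim, fisher = p_fisher)
```

If both alternatives agree with the asymptotic p-value to within an order of magnitude, the warning is unlikely to change the conclusion.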

 

K = 3:

 
> kruskal.test(tumorMeta$Age,consClass3)
        Kruskal-Wallis rank sum test
data:  tumorMeta$Age and consClass3
Kruskal-Wallis chi-squared = 2.4866, df = 2, p-value = 0.2884
> chisq.test(tumorMeta$Gender,consClass3)
        Pearson's Chi-squared test
data:  tumorMeta$Gender and consClass3
X-squared = 5.9141, df = 2, p-value = 0.05197

So for K = 3 the clusters are not associated with age (p = 0.29) and are only borderline associated with gender (p = 0.052).

Test for association with mutation status:

> k<-resultsK[[3]][["consensusClass"]]
> chisq.test(k,tumorMeta$BRAF_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$BRAF_mutation
X-squared = 50.5952, df = 2, p-value = 1.031e-11
Warning message:
In chisq.test(k, tumorMeta$BRAF_mutation) :
  Chi-squared approximation may be incorrect
> chisq.test(k,tumorMeta$KRAS_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$KRAS_mutation
X-squared = 6.7096, df = 2, p-value = 0.03492
> chisq.test(k,tumorMeta$TP53_mutation)
        Pearson's Chi-squared test
data:  k and tumorMeta$TP53_mutation
X-squared = 25.1538, df = 2, p-value = 3.451e-06

Summary table for association with clinical variables (p-values) for K = 2, 3, 4, 5, 6 (only batch is removed):

K               2          3          4          5          6
Age             0.78       0.88       0.7        0.6345     0.444
Gender          0.12       0.05       2.02e-03   5.35e-03   8.097155e-03
Rectal/colon    0.0016     4.74e-04   4.44e-03   7.41e-03   2.2e-03
Tumor stage     0.36       0.43       0.23       0.3        0.58
BRAF            1.33e-05   1.03e-11   1.67e-20   1.01e-19   5.15e-19
KRAS            3.87e-03   0.04       6.7e-06    1.22e-04   5.97e-06
KRAS type       1.97e-02   0.24       7.14e-03   1.59e-02   2.32e-03
TP53            0.96       3.45e-06   6.86e-03   1.75e-03   1.26e-03
MLH1 mutation   1.26e-04   5.11e-10   1.45e-20   8.96e-20   4.12e-19
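A table like the one above could be assembled in one pass instead of test by test. The sketch below is an assumption about how that might be done, not the code that produced the numbers: the data are synthetic, `assoc_p` is a hypothetical helper, and the choice of Kruskal-Wallis for numeric variables and Pearson's chi-squared for categorical ones mirrors the tests shown earlier on this page.

```r
set.seed(7)
n <- 125

# Synthetic clinical data; column names follow the ones used on this page.
tumorMeta <- data.frame(
  Age = round(rnorm(n, mean = 65, sd = 10)),
  Gender = factor(sample(c("F", "M"), n, replace = TRUE)),
  BRAF_mutation = factor(sample(c("wt", "mut"), n, replace = TRUE,
                                prob = c(0.8, 0.2))))

# Hypothetical class assignments for K = 2..6; with real results these would
# come from resultsK[[K]][["consensusClass"]].
classes <- lapply(2:6, function(K) sample(seq_len(K), n, replace = TRUE))
names(classes) <- paste0("K", 2:6)

# Hypothetical helper: Kruskal-Wallis for numeric variables,
# Pearson's chi-squared for categorical ones.
assoc_p <- function(x, cls) {
  if (is.numeric(x)) {
    kruskal.test(x, factor(cls))$p.value
  } else {
    chisq.test(table(x, cls))$p.value
  }
}

# Rows = clinical variables, columns = K.
p_table <- sapply(classes, function(cls) sapply(tumorMeta, assoc_p, cls = cls))
print(signif(p_table, 3))
```

With this many tests per table, a multiple-testing adjustment such as `p.adjust(p_table, method = "BH")` may be worth considering before reading too much into borderline values.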

 

Gender and age are removed, ConsensusClusterPlus, K-means, Pearson correlation:

Cluster consensus:

> icl[["clusterConsensus"]]
      k cluster clusterConsensus
 [1,] 2       1        0.9008927
 [2,] 2       2        0.9411635
 [3,] 3       1        0.8680856
 [4,] 3       2        0.8760303
 [5,] 3       3        0.7333630
 [6,] 4       1        0.7745014
 [7,] 4       2        0.7901681
 [8,] 4       3        0.8024310
 [9,] 4       4        0.8247401
[10,] 5       1        0.9153414
[11,] 5       2        0.8141924
[12,] 5       3        0.6573712
[13,] 5       4        0.6708655
[14,] 5       5        0.6319631
[15,] 6       1        0.8695192
[16,] 6       2        0.8568005
[17,] 6       3        0.7128842
[18,] 6       4        0.7407073
[19,] 6       5        0.5585342
[20,] 6       6        0.7426816

Test for association of clusters with mutation status. K = 2.

#w is the data frame with clinical information for 100 tumor patients
 
> chisq.test(k,w$BRAF_mutation)
        Pearson's Chi-squared test with Yates' continuity correction
data:  k and w$BRAF_mutation 
X-squared = 15.2964, df = 1, p-value = 9.189e-05
> chisq.test(k,w$KRAS_mutation)
        Pearson's Chi-squared test with Yates' continuity correction
data:  k and w$KRAS_mutation 
X-squared = 5.0882, df = 1, p-value = 0.02409
> chisq.test(k,w$TP53_mutation)
        Pearson's Chi-squared test with Yates' continuity correction
data:  k and w$TP53_mutation 
X-squared = 0.12, df = 1, p-value = 0.729
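The output for K = 2 reports "Yates' continuity correction" because `chisq.test` applies it by default (`correct = TRUE`) to 2 x 2 tables only, which is why it does not appear for K = 3 or K = 4. The sketch below shows the effect on a made-up 2 x 2 table of 100 patients; the counts are hypothetical and are not taken from `w`.

```r
# Hypothetical 2 x 2 table: cluster membership (K = 2) versus BRAF status.
tab <- matrix(c(30, 5, 40, 25), nrow = 2,
              dimnames = list(cluster = c("1", "2"), BRAF = c("wt", "mut")))

# Default behaviour: continuity correction is applied to 2 x 2 tables.
with_yates <- chisq.test(tab)

# Same test without the correction.
without_yates <- chisq.test(tab, correct = FALSE)

c(yates = with_yates$p.value, uncorrected = without_yates$p.value)
```

The corrected p-value is never smaller than the uncorrected one, so the K = 2 results above are the more conservative of the two.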

 

 
K               2          3              4          5          6
Tumor stage     0.36       0.44           0.66       0.83       0.76
Rectal/colon    8.69e-03   1.68e-02       6.12e-02   0.15       0.13
BRAF            9.18e-05   1.46e-12       1.81e-12   1.16e-13   1.63e-10
KRAS            2.41e-02   3.36e-02       5.51e-02   2.53e-04   3.32e-03
KRAS type       6.98e-02   3.705241e-01   0.55       9.23e-03   0.46
TP53            0.72       5.90e-03       4.15e-03   1.40e-02   2.14e-02
MLH1 methyl.    9.19e-05   1.46e-12       2.01e-10   1.03e-13   1.63e-10