...
Parameters: euclidean distance, hierarchical clustering, complete linkage, 10000 bootstraps (pvclust library)
Hinoue data: 125 tumor patients, batch removed, M values, with the 5060 "null probes" identified from the series matrix removed. Take the 10% most variable probes and use pvclust to assess cluster stability. Parameters: euclidean distance, hierarchical clustering, complete linkage, 10000 bootstraps.
The best result I can get is an AU value of 69 to 83 for cluster stability, even after 10,000 bootstraps. Does this mean that the clusters are weak and the authors just accepted them as is?
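The probe-selection step above (keeping the 10% most variable probes) can be sketched as follows. This is only an illustration in Python, not the code from the Sweave file: the function name `most_variable_probes` and the input layout (a dict of probe ID to M values) are hypothetical, and ranking by sample variance is an assumption, since the notes do not say which variability measure was used.

```python
from statistics import variance


def most_variable_probes(m_values, fraction=0.10):
    """Return the IDs of the most variable probes.

    m_values: dict mapping probe ID -> list of M values, one per sample
              (hypothetical layout, at least two samples per probe).
    fraction: share of probes to keep (0.10 = top 10%).
    """
    n_keep = max(1, int(len(m_values) * fraction))
    # Assumption: variability is measured by the sample variance.
    ranked = sorted(m_values, key=lambda p: variance(m_values[p]), reverse=True)
    return ranked[:n_keep]


# Toy data: ten probes whose spread grows with the probe index.
toy = {"cg%02d" % i: [0.0, float(i)] for i in range(10)}
print(most_variable_probes(toy))
```

Swapping `variance` for another spread measure (standard deviation, MAD) only changes the `key` function.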
ConsensusClusterPlus with the same parameters as pvclust above
Will using a different package make any difference? Use the ConsensusClusterPlus package with the same parameters as pvclust: hierarchical clustering, evaluate 20 clusters, use 80% of the data for bootstrapping. They claimed that they identified 4 clusters.
It seems that there is some separation, but the sizes of the clusters are very uneven, nothing like what was presented in the paper.
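To make the resampling idea above concrete, here is a minimal sketch of how a consensus matrix can be built from repeated clustering runs on subsets of the samples. It is not the ConsensusClusterPlus implementation: the function name `consensus_matrix` and the input format are hypothetical, the clustering itself is assumed to have been done elsewhere, and assigning 0.0 to pairs that were never drawn together is my own choice.

```python
from itertools import combinations


def consensus_matrix(runs, n_items):
    """Fraction of runs in which two items share a cluster.

    runs: list of dicts {item_index: cluster_label}, one per resampling
          run, containing only the items drawn in that run (hypothetical
          input format).
    Returns an n_items x n_items list of lists.
    """
    together = [[0] * n_items for _ in range(n_items)]
    sampled = [[0] * n_items for _ in range(n_items)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1
    cons = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            if i == j:
                cons[i][j] = 1.0
            elif sampled[i][j]:
                cons[i][j] = together[i][j] / sampled[i][j]
            # Pairs never drawn together stay at 0.0 (assumption).
    return cons


# Three toy runs over four samples.
runs = [{0: 1, 1: 1, 2: 2}, {0: 1, 1: 2}, {1: 1, 2: 1, 3: 2}]
for row in consensus_matrix(runs, 4):
    print(row)
```

Values near 1 or 0 mean a pair is consistently assigned together or apart; many intermediate values point to unstable cluster membership.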
After receiving the Sweave file I found that the 5060 "null" probes do not include probes from the X and Y chromosomes. I then checked the most variable probes I had used for clustering and found that 25% of them are from the X or Y chromosome.
- Carefully identify the probes and remove them from the raw data (5060 + ~800).
- Create two datasets: batch-only normalized (125 tumor patients) and gender- and age-normalized (no need to remove batch, since these all came from a single batch; 100 patients).
- Repeat ConsensusClusterPlus with the most variable probes (HC, complete linkage, euclidean distance). Use 10% of the original probe number as described in the Sweave document. They followed the vignette directly.
- Repeat ConsensusClusterPlus using K-means, K=2:6, Pearson correlation. Use 10% of the original probe number as described in the Sweave document.
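The first step in the list above (dropping the "null" probes and the X/Y probes before selecting variable probes) could look roughly like this. The names `drop_probes`, `sex_chromosome_fraction`, and the `probe_chrom` annotation mapping are all hypothetical; the real probe annotation would come from the array manifest.

```python
SEX_CHROMOSOMES = {"X", "Y"}


def drop_probes(probe_ids, null_probes, probe_chrom):
    """Keep probes that are neither 'null' nor on chromosome X or Y.

    probe_ids:   iterable of probe IDs in the raw data
    null_probes: set of probe IDs flagged as null
    probe_chrom: dict probe ID -> chromosome name (hypothetical annotation)
    """
    return [
        p for p in probe_ids
        if p not in null_probes and probe_chrom.get(p) not in SEX_CHROMOSOMES
    ]


def sex_chromosome_fraction(probe_ids, probe_chrom):
    """Share of the given probes that map to chromosome X or Y."""
    probe_ids = list(probe_ids)
    hits = sum(1 for p in probe_ids if probe_chrom.get(p) in SEX_CHROMOSOMES)
    return hits / len(probe_ids)


# Toy annotation for four probes.
chrom = {"cg01": "1", "cg02": "X", "cg03": "7", "cg04": "Y"}
print(sex_chromosome_fraction(chrom, chrom))
print(drop_probes(chrom, {"cg03"}, chrom))
```

Running the fraction check on the selected variable probes before and after filtering is a quick way to confirm that the filter did what was intended.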
ConsensusClusterPlus, XY probes removed, hierarchical/euclidean
Batch only normalized, HC, euclidean distance, XY probes removed
Maybe there are 5 or 6 clusters, but not 4. It definitely doesn't look like the clusters identified in the paper.
Gender and age adjusted
This looks significantly worse.
ConsensusClusterPlus, K-means, Pearson correlation
Batch removed (125 patients)

    > icl[["clusterConsensus"]]
          k cluster clusterConsensus
     [1,] 2       1        0.8790997
     [2,] 2       2        0.8825873
     [3,] 3       1        0.8900365
     [4,] 3       2        0.8615545
     [5,] 3       3        0.8763479
     [6,] 4       1        0.7145161
     [7,] 4       2        0.8060535
     [8,] 4       3        0.9781181
     [9,] 4       4        0.9889430
    [10,] 5       1        0.8289404
    [11,] 5       2        0.7577152
    [12,] 5       3        0.8221909
    [13,] 5       4        0.7363796
    [14,] 5       5        0.9454712
    [15,] 6       1        0.8789223
    [16,] 6       2        0.7593188
    [17,] 6       3        0.7090342
    [18,] 6       4        0.7150963
    [19,] 6       5        0.9857516
    [20,] 6       6        0.9189523

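For reading the `clusterConsensus` column above: as far as I understand the package, the cluster consensus is the average pairwise consensus among the members of a cluster. The sketch below computes that quantity from a consensus matrix and a vector of cluster labels. It is an assumption-based illustration, not the package's code; the function name `cluster_consensus` and the toy matrix are made up.

```python
from itertools import combinations


def cluster_consensus(cons, labels):
    """Mean pairwise consensus within each cluster.

    cons:   square consensus matrix (list of lists, values in [0, 1])
    labels: cluster label for each item, in matrix order
    Returns a dict label -> mean consensus (NaN for singleton clusters).
    """
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)
    result = {}
    for lab, idxs in members.items():
        pairs = list(combinations(idxs, 2))
        if pairs:
            result[lab] = sum(cons[i][j] for i, j in pairs) / len(pairs)
        else:
            result[lab] = float("nan")  # no pairs to average over
    return result


# Toy consensus matrix for four items; the first three form one cluster.
toy_cons = [
    [1.0, 1.0, 0.5, 0.0],
    [1.0, 1.0, 0.75, 0.0],
    [0.5, 0.75, 1.0, 0.25],
    [0.0, 0.0, 0.25, 1.0],
]
print(cluster_consensus(toy_cons, [1, 1, 1, 2]))
```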
Summary table for association with clinical variables for K=2,3,4,5,6 (only batch is removed)
K | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
Age | 0.78 | 0.88 | 0.7 | 0.6345 | 0.444 |
Gender | 0.12 | 0.05 | 2.02e-03 | 5.35e-03 | 8.097155e-03 |
Rectal/colon | 0.0016 | 4.74e-04 | 4.44e-03 | 7.41e-03 | 2.2e-03 |
Tumor stage | 0.36 | 0.43 | 0.23 | 0.3 | 0.58 |
BRAF | 1.33e-05 | 1.03e-11 | 1.67e-20 | 1.01e-19 | 5.15e-19 |
KRAS | 3.87e-03 | 0.04 | 6.7e-06 | 1.22e-04 | 5.97e-06 |
KRAS type | 1.97e-02 | 0.24 | 7.14e-03 | 1.59e-02 | 2.32e-03 |
TP53 | 0.96 | 3.45e-06 | 6.86e-03 | 1.75e-03 | 1.26e-03 |
MLH1 mutation | 1.26e-04 | 5.11e-10 | 1.45e-20 | 8.96e-20 | 4.12e-19 |
ConsensusClusterPlus, K means, Pearson correlation
Age/gender removed, 100 patients
Cluster consensus:

    > icl[["clusterConsensus"]]
          k cluster clusterConsensus
     [1,] 2       1        0.9008927
     [2,] 2       2        0.9411635
     [3,] 3       1        0.8680856
     [4,] 3       2        0.8760303
     [5,] 3       3        0.7333630
     [6,] 4       1        0.7745014
     [7,] 4       2        0.7901681
     [8,] 4       3        0.8024310
     [9,] 4       4        0.8247401
    [10,] 5       1        0.9153414
    [11,] 5       2        0.8141924
    [12,] 5       3        0.6573712
    [13,] 5       4        0.6708655
    [14,] 5       5        0.6319631
    [15,] 6       1        0.8695192
    [16,] 6       2        0.8568005
    [17,] 6       3        0.7128842
    [18,] 6       4        0.7407073
    [19,] 6       5        0.5585342
    [20,] 6       6        0.7426816

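One possible way to compare the values of K in the output above is to look at the weakest cluster for each K. The sketch below just copies the numbers from that output (age/gender removed, 100 patients) and takes the minimum per K; using the minimum as a summary is my own choice, not something taken from the paper or the package.

```python
# Cluster consensus values copied from the output above, grouped by K.
cluster_consensus_by_k = {
    2: [0.9008927, 0.9411635],
    3: [0.8680856, 0.8760303, 0.7333630],
    4: [0.7745014, 0.7901681, 0.8024310, 0.8247401],
    5: [0.9153414, 0.8141924, 0.6573712, 0.6708655, 0.6319631],
    6: [0.8695192, 0.8568005, 0.7128842, 0.7407073, 0.5585342, 0.7426816],
}

# Weakest (minimum) cluster consensus for each K.
weakest = {k: min(values) for k, values in cluster_consensus_by_k.items()}

for k in sorted(weakest):
    print("K=%d weakest cluster consensus: %.4f" % (k, weakest[k]))
```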
K | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
Tumor stage | 0.36 | 0.44 | 0.66 | 0.83 | 0.76 |
Rectal/colon | 8.69e-03 | 1.68e-02 | 6.12e-02 | 0.15 | 0.13 |
BRAF | 9.18e-05 | 1.46e-12 | 1.81e-12 | 1.16e-13 | 1.63e-10 |
KRAS | 2.41e-02 | 3.36e-02 | 5.51e-02 | 2.53e-04 | 3.32e-03 |
KRAS type | 6.98e-02 | 3.705241e-01 | 0.55 | 9.23e-03 | 0.46 |
TP53 | 0.72 | 5.90e-03 | 4.15e-03 | 1.40e-02 | 2.14e-02 |
MLH1 methyl. | 9.19e-05 | 1.46e-12 | 2.01e-10 | 1.03e-13 | 1.63e-10 |