
  1. Carefully identify the probes and remove them from the raw data (5060 + ~800).
  2. Create two datasets: batch-only normalized (125 tumor patients) and gender- and age-normalized (100 patients; no need to remove batch, since these all came from a single batch).
  3. Repeat ConsensusClusterPlus with the most variable probes (HC, complete linkage, Euclidean distance). Use 10% of the original probe number, as described in the Sweave document. They followed the vignette directly.
  4. Repeat ConsensusClusterPlus using K-means, K=2:6, Pearson correlation. Use 10% of the original probe number, as described in the Sweave document.
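The probe selection in steps 3 and 4 can be sketched in base R as below. This is only an illustration on synthetic data: ranking probes by median absolute deviation and median-centering them are assumptions (the notes only say "most variable probes"), and the matrix, the helper names `select_variable_probes` and `run_consensus_hc`, and the `reps`, `pItem` and `seed` values are all hypothetical placeholders rather than the settings actually used.

```r
# Synthetic stand-in for a probes-by-samples expression matrix (hypothetical data).
set.seed(42)
n_probes <- 200
n_samples <- 30
expr <- matrix(
  rnorm(n_probes * n_samples),
  nrow = n_probes,
  dimnames = list(paste0("probe", seq_len(n_probes)),
                  paste0("sample", seq_len(n_samples)))
)

# Keep the top fraction of probes, ranked by median absolute deviation (MAD).
# Using MAD as the variability measure is an assumption of this sketch.
select_variable_probes <- function(mat, fraction = 0.10) {
  n_keep <- ceiling(nrow(mat) * fraction)
  mads <- apply(mat, 1, mad)
  mat[order(mads, decreasing = TRUE)[seq_len(n_keep)], , drop = FALSE]
}

top <- select_variable_probes(expr)

# Median-center each probe before clustering (also an assumption).
centered <- sweep(top, 1, apply(top, 1, median))

# Wrapper for the hierarchical run of step 3. It is defined but not called
# here because it needs the Bioconductor package ConsensusClusterPlus.
# reps, pItem and seed are placeholder values, not the ones from the analysis.
run_consensus_hc <- function(mat, seed = 1) {
  ConsensusClusterPlus::ConsensusClusterPlus(
    mat, maxK = 6, reps = 100, pItem = 0.8, pFeature = 1,
    clusterAlg = "hc", distance = "euclidean",
    innerLinkage = "complete", finalLinkage = "complete",
    seed = seed, plot = NULL
  )
}
```

With the package installed, `results <- run_consensus_hc(centered)` would return one list entry per K, and `results[[4]]$consensusClass` would hold the K = 4 assignments.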


Batch-only normalized, HC, Euclidean distance.

Maybe there are 5 or 6 clusters, but not 4. It definitely doesn't look like the clusters identified in the paper.

Gender- and age-adjusted, HC, Euclidean distance.

This looks significantly worse. 

 

Final attempt: K-means clustering, Pearson correlation, and the seed value provided in the package. Batch removed:

I tried to correlate the clusters (K=3 and K=4) with age and gender. It looks like the clusters don't correlate with age at all but have some correlation with gender.

K = 4:

Code Block
> kruskal.test(tumorMeta$Age,consClass4)
        Kruskal-Wallis rank sum test
data:  tumorMeta$Age and consClass4
Kruskal-Wallis chi-squared = 4.9015, df = 3, p-value = 0.1792
> chisq.test(tumorMeta$Gender,consClass4)
        Pearson's Chi-squared test
data:  tumorMeta$Gender and consClass4
X-squared = 14.7676, df = 3, p-value = 0.002026

Age distribution among clusters:

 

K = 3:

Code Block
> kruskal.test(tumorMeta$Age,consClass3)
        Kruskal-Wallis rank sum test
data:  tumorMeta$Age and consClass3
Kruskal-Wallis chi-squared = 2.4866, df = 2, p-value = 0.2884
> chisq.test(tumorMeta$Gender,consClass3)
        Pearson's Chi-squared test
data:  tumorMeta$Gender and consClass3
X-squared = 5.9141, df = 2, p-value = 0.05197

So at K = 3 the clusters are not correlated with age (p = 0.29) and are only marginally associated with gender (p = 0.052).
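For reference, the two association tests can be run on any metadata table roughly as follows. This is a minimal sketch on synthetic data: the `meta` data frame and the `cluster` assignments are made up for illustration and stand in for `tumorMeta` and `consClass4`. The last two lines only recompute the p-values reported above for K = 4 from their test statistics, as a consistency check.

```r
# Hypothetical metadata for 125 samples and an arbitrary 4-cluster assignment.
set.seed(1)
n <- 125
meta <- data.frame(
  Age = round(runif(n, min = 20, max = 80)),
  Gender = sample(c("F", "M"), n, replace = TRUE)
)
cluster <- factor(rep(1:4, length.out = n))

# Continuous variable vs. cluster membership: Kruskal-Wallis rank sum test.
# Degrees of freedom = number of clusters - 1.
age_test <- kruskal.test(meta$Age, cluster)

# Categorical variable vs. cluster membership: Pearson's chi-squared test
# on the contingency table. Degrees of freedom = (2 - 1) * (4 - 1) = 3.
gender_tab <- table(meta$Gender, cluster)
gender_test <- chisq.test(gender_tab)

# Consistency check: recover the reported K = 4 p-values from the statistics.
p_age_k4 <- pchisq(4.9015, df = 3, lower.tail = FALSE)
p_gender_k4 <- pchisq(14.7676, df = 3, lower.tail = FALSE)
```

A boxplot such as `boxplot(meta$Age ~ cluster)` is one simple way to look at the age distribution per cluster alongside the test.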

Gender and age removed, ConsensusClusterPlus, K-means, Pearson correlation:
