...
- Carefully identify the probes and remove them from the raw data (5060 + ~800).
- Create two datasets: batch-only normalized (125 tumor patients) and gender- and age-normalized (100 patients; no need to remove batch since they all came from a single batch).
- Repeat ConsensusClusterPlus with the most variable probes (HC, complete linkage, Euclidean distance). Use 10% of the original probe number, as described in the Sweave document. They followed the vignette directly.
- Repeat ConsensusClusterPlus using K-means, K = 2:6, Pearson correlation. Use 10% of the original probe number, as described in the Sweave document.
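The probe-selection step above could look roughly like the sketch below. It uses simulated data, and it assumes that "most variable" means ranking probes by median absolute deviation (MAD) and that probes are median-centered before clustering; the Sweave document may define these differently. The names `expr`, `probe_mad` and `expr_top` are made up for illustration. The resulting matrix (probes in rows, samples in columns) is the kind of input that would then be handed to ConsensusClusterPlus.

```r
# Toy expression matrix: rows are probes, columns are samples (simulated data).
set.seed(1)
n_probes  <- 1000
n_samples <- 20
expr <- matrix(
  rnorm(n_probes * n_samples),
  nrow = n_probes,
  dimnames = list(paste0("probe", seq_len(n_probes)),
                  paste0("sample", seq_len(n_samples)))
)

# Assumption: variability is measured by the per-probe MAD.
probe_mad <- apply(expr, 1, mad)

# Keep the top 10% of probes by variability.
n_keep   <- ceiling(0.10 * nrow(expr))
top_idx  <- order(probe_mad, decreasing = TRUE)[seq_len(n_keep)]
expr_top <- expr[top_idx, , drop = FALSE]

# Assumption: median-center each probe before clustering.
expr_top <- sweep(expr_top, 1, apply(expr_top, 1, median))

dim(expr_top)
```

Using a fraction of the probe count rather than a fixed number keeps the selection comparable between the two datasets if their probe counts differ after filtering.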
Batch only normalized, HC, euclidean distance.
Maybe there are 5 or 6 clusters, but not 4. It definitely doesn't look like the clusters identified in the paper.
Gender and age adjusted, HC, euclidean distance.
This looks significantly worse.
Final attempt: K-means clustering, Pearson correlation and the seed value provided in the package. Batch removed:
I tried to correlate the clusters (K = 3 and K = 4) with age and gender. It looks like the clusters don't correlate with age at all but have some correlation with gender.
K = 4:
```
> kruskal.test(tumorMeta$Age, consClass4)

	Kruskal-Wallis rank sum test

data:  tumorMeta$Age and consClass4
Kruskal-Wallis chi-squared = 4.9015, df = 3, p-value = 0.1792

> chisq.test(tumorMeta$Gender, consClass4)

	Pearson's Chi-squared test

data:  tumorMeta$Gender and consClass4
X-squared = 14.7676, df = 3, p-value = 0.002026
```
Age distribution among clusters:
K = 3:
```
> kruskal.test(tumorMeta$Age, consClass3)

	Kruskal-Wallis rank sum test

data:  tumorMeta$Age and consClass3
Kruskal-Wallis chi-squared = 2.4866, df = 2, p-value = 0.2884

> chisq.test(tumorMeta$Gender, consClass3)

	Pearson's Chi-squared test

data:  tumorMeta$Gender and consClass3
X-squared = 5.9141, df = 2, p-value = 0.05197
```
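So the clusters are not clearly associated with age (p = 0.18 for K = 4, p = 0.29 for K = 3) and are somewhat associated with gender (p = 0.002 for K = 4, borderline p = 0.052 for K = 3).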
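For reference, the same two tests can be run on any cluster assignment along the lines of the sketch below. The data here are simulated, so the p-values mean nothing; `cluster`, `age` and `gender` are stand-ins for `consClass4`, `tumorMeta$Age` and `tumorMeta$Gender`, not the real objects.

```r
# Simulated stand-ins for the cluster labels and patient metadata.
set.seed(42)
n       <- 120
cluster <- factor(sample(1:4, n, replace = TRUE))
age     <- round(rnorm(n, mean = 60, sd = 10))
gender  <- factor(sample(c("F", "M"), n, replace = TRUE))

# Continuous covariate vs. cluster: Kruskal-Wallis rank sum test.
age_test <- kruskal.test(age ~ cluster)

# Categorical covariate vs. cluster: chi-squared test on the contingency table.
gender_tab  <- table(gender, cluster)
gender_test <- chisq.test(gender_tab)

# Inspect the table alongside the p-values to see which clusters drive any effect.
gender_tab
age_test$p.value
gender_test$p.value
```

Looking at the contingency table itself, not only the p-value, shows whether a gender association comes from one skewed cluster or is spread across all of them.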
Gender and age effects removed, ConsensusClusterPlus, K-means, Pearson correlation: