...
Code Block |
---|
x<-u507$v[,1]
names(x)<-colnames(exp507)
#Identification of outliers in the boxplot of the first eigengene grouped by batch
> boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")$out
TCGA-13-0762-01A TCGA-13-0768-01A TCGA-24-1614-01A TCGA-04-1655-01A
-0.12132315 0.06403801 -0.10078455 -0.05562895
TCGA-04-1649-01A TCGA-04-1652-01A TCGA-09-2049-01D TCGA-29-2425-01A
-0.07523963 -0.15237043 -0.09206651 -0.05197294
> #Draw the boxplot once, keep its statistics, and label the outliers with their sample names
> bp<-boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")
> text(x=bp$group,y=bp$out,labels=names(bp$out))
|
Kruskal-Wallis test for association of the first 4 eigengenes with batch:
...
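As a rough illustration of this kind of test, here is a minimal sketch on simulated data. The objects `v` and `batch` are hypothetical stand-ins for `u507$v[,1:4]` and `shortBatch`, and the batch shift is injected artificially, so the p-values say nothing about the real data set.

```r
# Simulated stand-ins only: the real analysis would use u507$v and shortBatch
set.seed(1)
n <- 60
batch <- factor(rep(paste0("B", 1:6), each = 10))  # hypothetical batch labels
v <- matrix(rnorm(n * 4), nrow = n)                # stand-in for the first 4 eigengenes
v[, 1] <- v[, 1] + as.integer(batch)               # inject a batch shift into eigengene 1

# One Kruskal-Wallis test per eigengene, grouping samples by batch
kw_p <- apply(v, 2, function(e) kruskal.test(e, batch)$p.value)
names(kw_p) <- paste0("eigengene", 1:4)
print(signif(kw_p, 3))
```

A small p-value for an eigengene suggests that its values differ between batches, which is the pattern visible in the boxplot of the first eigengene.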
Note: I also ran a few analyses in which I removed batch and center and then looked at the distribution of Kruskal-Wallis test results for day of shipment, month of shipment, year of shipment, concentration, plate column, plate row and amount. Justin suggested that the center effect is very minor and not worth removing. However, I noticed that removing batch and center also completely removed the day, month and year of shipment effects (in the DNA methylation normalization I saw that these technical factors were highly correlated with the batch effect), and that concentration, plate column, plate row and amount are not significant. The graphs for these analyses can be found here.
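The check described in this note can be sketched on simulated data as follows. How batch and center were actually removed is not stated here, so taking residuals from a linear model is an assumption, and every variable name (`eig`, `batch`, `center`, `ship_year`) is hypothetical.

```r
# Simulated data: shipment year is nested within batch, as a stand-in for
# technical factors that are highly correlated with the batch effect
set.seed(2)
n <- 120
batch <- factor(rep(paste0("B", 1:6), each = 20))
center <- factor(rep(c("C1", "C2"), times = 60))
ship_year <- factor(ifelse(as.integer(batch) <= 3, 2008, 2009))
eig <- rnorm(n) + 0.8 * as.integer(batch)          # eigengene with a batch shift

# Association with shipment year before any adjustment
p_before <- kruskal.test(eig, ship_year)$p.value

# Assumed removal step: residuals after regressing on batch and center
adj <- residuals(lm(eig ~ batch + center))
p_after <- kruskal.test(adj, ship_year)$p.value

print(signif(c(before = p_before, after = p_after), 3))
```

Because shipment year is determined by batch in this simulation, adjusting for batch also removes the apparent year effect, which mirrors the behaviour described in the note.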