...
Code Block |
---|
x<-u507$v[,1] |
names(x)<-colnames(exp507) |
#Identification of outliers in the boxplot of the first eigengene grouped by batch |
> boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")$out |
TCGA-13-0762-01A TCGA-13-0768-01A TCGA-24-1614-01A TCGA-04-1655-01A |
-0.12132315 0.06403801 -0.10078455 -0.05562895 |
TCGA-04-1649-01A TCGA-04-1652-01A TCGA-09-2049-01D TCGA-29-2425-01A |
-0.07523963 -0.15237043 -0.09206651 -0.05197294 |
> boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene") |
> text(x=boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")$group,y=boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")$out,labels=names(boxplot(split(x,shortBatch),xlab="batch",ylab="1st eigengene")$out)) |
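The transcript above calls `boxplot()` again for every argument of `text()`, which redraws the plot each time. A minimal sketch of the same labelling with the return value stored once is given here. The real `u507`, `exp507` and `shortBatch` objects are not available, so `x_sim` and `batch_sim` are hypothetical simulated stand-ins, and the `pos` and `cex` settings are only a suggestion.

```r
set.seed(1)

# Hypothetical stand-ins for the first eigengene and the batch labels
batch_sim <- rep(c("b9", "b11", "b12"), each = 20)
x_sim <- rnorm(60, sd = 0.02)
x_sim[c(5, 30)] <- c(-0.15, 0.12)   # plant two obvious outliers
names(x_sim) <- sprintf("sample%02d", seq_along(x_sim))

# Draw the plot once and keep the statistics that boxplot() returns invisibly
bp <- boxplot(split(x_sim, batch_sim), xlab = "batch", ylab = "1st eigengene")

# bp$out holds the outlier values (named by sample);
# bp$group holds the index of the box each outlier belongs to
if (length(bp$out) > 0) {
  text(x = bp$group, y = bp$out, labels = names(bp$out), pos = 4, cex = 0.7)
}
```

With the real data, `split(x, shortBatch)` would take the place of `split(x_sim, batch_sim)`.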
Kruskal-Wallis test for association of the first 4 eigengenes with batch:
...
I found that the first 150 eigengenes account for 80% of the variance, performed a Kruskal-Wallis test on each of the 150, and plotted the P values. I also looked at the correlation between the batch and the center effects, and at the association of the first 150 eigengenes with the center:
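The code for this step is not shown on the page, so the following is only a sketch of how the variance cutoff and the per-eigengene tests could be computed. It assumes the eigengenes come from an SVD of the gene-centered expression matrix (the centering is an assumption), and `expr_sim`, `batch_sim`, `rel_var`, `k80` and `kw_p` are hypothetical names on simulated data.

```r
set.seed(2)

# Hypothetical stand-in data: 200 genes x 40 samples in 4 batches
n_genes <- 200
n_samples <- 40
batch_sim <- factor(rep(1:4, each = 10))
expr_sim <- matrix(rnorm(n_genes * n_samples), n_genes, n_samples)
# Add a gene-specific shift that grows with the batch number
expr_sim <- expr_sim + outer(rnorm(n_genes), as.numeric(batch_sim))

# Center each gene (assumption), then take the SVD;
# the columns of sv$v play the role of the eigengenes
sv <- svd(expr_sim - rowMeans(expr_sim))

# Fraction of variance per eigengene, and how many are needed to reach 80%
rel_var <- sv$d^2 / sum(sv$d^2)
k80 <- which(cumsum(rel_var) >= 0.8)[1]

# Kruskal-Wallis P value of each retained eigengene against batch
kw_p <- sapply(seq_len(k80), function(i) {
  kruskal.test(sv$v[, i], batch_sim)$p.value
})

plot(-log10(kw_p), type = "h", xlab = "eigengene", ylab = "-log10(P)")
```

The same loop with a center label in place of `batch_sim` would give the association with the center.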
Remove the batch effect:
Code Block |
---|
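X <- model.matrix(~factor(batch))  # design matrix: intercept plus batch indicators |
bch <- solve(t(X) %*% X) %*% t(X) %*% t(expr)  # least-squares coefficients for every gene |
resExpr <- expr - t(X %*% bch)  # residuals after subtracting the fitted batch means |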
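The block above is an ordinary least-squares fit of every gene on the batch factor, with the residuals kept. As a sanity check, the sketch below reproduces it on simulated data and compares it with `lm()`. `expr_sim` (genes x samples), `batch_sim`, `res_manual` and `res_lm` are hypothetical names, not objects from the analysis.

```r
set.seed(3)

# Hypothetical stand-ins: 10 genes x 18 samples in 3 unequal batches
batch_sim <- factor(rep(c("A", "B", "C"), times = c(5, 7, 6)))
shift <- c(0, 2, -1)[as.integer(batch_sim)]      # per-sample batch offset
expr_sim <- matrix(rnorm(10 * length(batch_sim)), nrow = 10) +
  rep(shift, each = 10)                           # same offset for all genes in a sample

# Same projection as in the block above
X <- model.matrix(~ factor(batch_sim))
bch <- solve(t(X) %*% X) %*% t(X) %*% t(expr_sim)
res_manual <- expr_sim - t(X %*% bch)

# Equivalent fit with lm(): a matrix response fits every gene at once
res_lm <- t(residuals(lm(t(expr_sim) ~ factor(batch_sim))))

# Both routes should agree, and each gene's mean within each batch should be ~0
max_diff <- max(abs(res_manual - res_lm))
batch_means <- sapply(split(seq_along(batch_sim), batch_sim), function(idx) {
  rowMeans(res_manual[, idx, drop = FALSE])
})
```

Because the design matrix includes an intercept, the residuals also lose each gene's overall mean, so `resExpr` is centered per gene as well as per batch.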
Look at the relative variance and the association of the eigengenes with the batch and the center after removing the batch effect (again, looking at the first 150 eigengenes):
I looked specifically at the p values from the Kruskal-Wallis test for association with the center effect:
...