Beta value
Beta matrix
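The M value (analyzed here, and first introduced in this paper) and the beta value are related by a transformation (the formula is also described here). Writing Meth and Unmeth for the methylated and unmethylated probe intensities:
beta = 2^M / (2^M + 1); M = log2(Meth / Unmeth); beta = Meth / (Meth + Unmeth + offset)
The first identity is exact when the offset is zero.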
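The conversion above can be sketched in R as follows. This is a minimal illustration, not the code used for the analysis: the function names are hypothetical, and the default offset of 100 is an assumption (a commonly used value), not something stated on this page.

```r
# Convert an M value to a beta value: beta = 2^M / (2^M + 1).
m_to_beta <- function(m) 2^m / (2^m + 1)

# Inverse transformation (a logit in base 2): M = log2(beta / (1 - beta)).
beta_to_m <- function(beta) log2(beta / (1 - beta))

# Beta value from raw intensities. The default offset is an assumption
# made for illustration; set offset = 0 to match m_to_beta() exactly.
intensities_to_beta <- function(meth, unmeth, offset = 100) {
  meth / (meth + unmeth + offset)
}

# Toy M values (hypothetical, not taken from the dataset).
m <- c(-4, -1, 0, 1, 4)
beta <- m_to_beta(m)

# Round trip: converting back recovers the original M values.
m_back <- beta_to_m(beta)
```

With a zero offset the two routes agree: intensities of 300 (methylated) and 100 (unmethylated) give M = log2(3), and both `m_to_beta(log2(3))` and `intensities_to_beta(300, 100, offset = 0)` evaluate to 0.75.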
Data distribution based on the expression used for measuring DNA methylation (beta vs M value):
I used the matrix of M values and transformed them into beta values. Here are the plots of the first eigengene, first eigenarray (colored by the probe dye) and the outliers identified in the first eigengene:
Density plots by the color of the dye of the first 3 eigenarrays:
Consultation with Brig and Justin: don't split the data, keep all probes together.
Correlation with adjustment and biological variables:
PC | Batch | Center | Amount | Concentr. | Day | Month | Column | Row | Year | Grade | Stage | Age |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.1404 | 0.1263 | 0.004464 | 0.559545 | 0.120663 | 0.966426 | 0.423536 | 0.258863 | 0.036442 | 0.06361 | 0.35619 | 0.006236 |
2 | 2.2e-16 | 2.2e-16 | 2.447e-18 | 6.019e-03 | 2.198e-51 | 3.374e-32 | 1.389e-01 | 1.027e-01 | 2.397e-26 | 0.06419 | 0.12857 | 0.3884 |
3 | 8.057e-05 | 0.001862 | 0.0030439 | 0.5574784 | 0.0001560 | 0.0002149 | 0.7666292 | 0.5092963 | 0.0012852 | 0.6350 | 0.8488 | 0.1439 |
4 | 0.9221 | 0.2629 | 0.5526 | 0.3857 | 0.9925 | 0.7725 | 0.3971 | 0.8410 | 0.7211 | 0.0750 | 0.6507 | 0.0003187 |
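A table of this kind can be produced by decomposing the data with an SVD and testing each of the leading components against every sample-level variable. The sketch below shows one way to do that on simulated data. The page does not say which test was used, so the choice of a Kruskal-Wallis test for categorical variables and a Pearson correlation test for numeric ones is an assumption, and all object names are hypothetical.

```r
set.seed(1)
n_probes <- 200
n_samples <- 24

# Hypothetical sample annotations.
batch <- factor(rep(c("A", "B", "C"), each = 8))
age <- runif(n_samples, 30, 80)

# Simulated probes x samples matrix with a batch shift added to the noise.
dat <- matrix(rnorm(n_probes * n_samples, sd = 0.05), n_probes, n_samples)
dat <- dat + outer(rnorm(n_probes, sd = 0.1), as.numeric(batch) - 2)

# Center each probe, then decompose. For a probes x samples matrix the
# columns of v hold one score per sample for each component.
dat_centered <- dat - rowMeans(dat)
sv <- svd(dat_centered)
pcs <- sv$v[, 1:4]

# P-value for the association between one component and one variable.
pc_pvalue <- function(pc, covariate) {
  if (is.factor(covariate)) {
    kruskal.test(pc, covariate)$p.value   # categorical variable
  } else {
    cor.test(pc, covariate)$p.value       # numeric variable
  }
}

# Rows are components, columns are variables.
covariates <- list(Batch = batch, Age = age)
pvals <- sapply(covariates, function(cv) apply(pcs, 2, pc_pvalue, covariate = cv))
rownames(pvals) <- paste0("PC", 1:4)
print(signif(pvals, 3))
```

In this toy setup the simulated batch shift dominates the first component, so the `Batch` entry for `PC1` comes out very small, in the same way that several adjustment variables stand out for component 2 in the table above.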
Since we already know from the analyses of the M values that batch, center, plate row, and plate column affect the data, I will skip the preliminary steps and remove these factors. The dataset will also exclude batch number 0652.
> X <- model.matrix(~ factor(batch[mask]) + adj$plate_row[mask] + adj$plate_column[mask] + factor(center[mask]))
> Xmod <- solve(t(X) %*% X) %*% t(X) %*% t(beta[, mask])
> betaRes <- beta[, mask] - t(X %*% Xmod)
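The adjustment step is an ordinary least-squares fit of every probe on the design matrix, followed by taking residuals. The sketch below reproduces that on simulated data and compares the normal-equations form with a QR-based fit through `lm.fit`, which is generally the more numerically stable route. The data and variable names here are hypothetical stand-ins, not the objects used in the analysis.

```r
set.seed(2)
n_probes <- 50
n_samples <- 20

# Hypothetical adjustment variables and a toy probes x samples beta matrix.
batch <- factor(rep(1:4, each = 5))
plate_row <- rep(1:5, times = 4)
beta <- matrix(runif(n_probes * n_samples), n_probes, n_samples)

# Design matrix (samples x coefficients), including an intercept.
X <- model.matrix(~ batch + plate_row)

# Normal equations: coef = (X'X)^{-1} X'Y with Y = t(beta).
coef_ne <- solve(t(X) %*% X) %*% t(X) %*% t(beta)
res_ne <- beta - t(X %*% coef_ne)

# Same fit through a QR decomposition; lm.fit accepts a matrix response.
fit <- lm.fit(X, t(beta))
res_qr <- t(fit$residuals)

# The two sets of residuals agree up to floating-point error, and the
# residuals are orthogonal to every column of the design matrix.
max_diff <- max(abs(res_ne - res_qr))
max_proj <- max(abs(res_qr %*% X))
```

Because the design matrix contains an intercept, each probe's mean is removed as well, so the residuals are centered on zero and no longer lie on the 0 to 1 beta scale.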
Correlation with adjustment and biological variables after removing these factors:
PC | Batch | Center | Amount | Concentr. | Day | Month | Column | Row | Year | Grade | Stage | Age |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 0.9830 | 0.6760 | 0.9999 | 1.0000 | 0.4995 | 0.9871 | 0.9133 | 0.3422 | 0.2915 | 0.01044 |
2 | 1 | 1 | 0.9750 | 0.7878 | 1.0000 | 1.0000 | 0.8328 | 1.0000 | 0.8487 | 0.5115 | 0.6704 | 0.4074 |
3 | 0.99 | 1 | 0.9402 | 0.7498 | 0.9994 | 1.0000 | 0.8415 | 0.9993 | 0.9304 | 0.11282 | 0.02903 | 0.0001028 |
4 | 1 | 1 | 0.9521 | 0.1648 | 1.0000 | 0.9998 | 0.5466 | 1.0000 | 0.9786 | 0.4573 | 0.6384 | 0.8565 |
With beta values, the correlation with age seems stronger than with M values. First eigengene, first and second eigenarrays, and the outliers:
We can proceed with building a comethylation network, although the outliers look terrible.