Beta value

Beta matrix

The M value (analyzed here, and first introduced in this paper) and the beta value are related by a transformation (the formula is also described here):

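beta = 2^M / (2^M + 1); M = log2(Meth / Unmeth); beta = Meth / (Meth + Unmeth + offset)

Here Meth and Unmeth are the methylated and unmethylated probe intensities (written M and U in the original notation, which overloads M). The first identity is exact only when the offset is ignored.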
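
As a minimal sketch of the transformation above, the two directions can be written as small R functions. The function names m2beta and beta2m are hypothetical (they are not taken from the original analysis), and the offset is assumed to be zero so that the two functions are exact inverses of each other.

```r
# Convert M values to beta values: beta = 2^M / (2^M + 1)
# (assumes the offset in the beta definition is ignored)
m2beta <- function(m) {
  2^m / (2^m + 1)
}

# Inverse transformation: M = log2(beta / (1 - beta))
beta2m <- function(beta) {
  log2(beta / (1 - beta))
}

m <- c(-2, 0, 2)
b <- m2beta(m)      # 0.2, 0.5, 0.8
m_back <- beta2m(b) # recovers -2, 0, 2 up to rounding
```

Both functions are vectorized, so they can be applied directly to a whole probes-by-samples matrix.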

Data distribution based on the expression used for measuring DNA methylation (beta vs M value):
 

I used the matrix of M values and transformed them into beta values. Here are the plots of the first eigengene, first eigenarray (colored by the probe dye) and the outliers identified in the first eigengene:

Density plots by the color of the dye of the first 3 eigenarrays:
 
Consultation with Brig and Justin: don't split the data, keep all probes together.
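
The eigengenes and eigenarrays plotted above come from a singular value decomposition of the probes-by-samples matrix. The sketch below shows one way to obtain them in R on a toy matrix; the toy data, the variable names and the row-centering step are assumptions, since the exact preprocessing is not stated here.

```r
set.seed(1)
# Toy stand-in for the beta matrix: 200 probes (rows) x 10 samples (columns)
toy <- matrix(runif(200 * 10), nrow = 200, ncol = 10)

# Center each probe across samples before decomposing
# (an assumption; the original analysis may have normalized differently)
centered <- toy - rowMeans(toy)

s <- svd(centered)
eigenarrays <- s$u  # one column per component, one entry per probe
eigengenes  <- s$v  # one column per component, one entry per sample

# Fraction of the total variance captured by each component
var_explained <- s$d^2 / sum(s$d^2)
```

With this orientation the first eigenarray has one value per probe, which is why it can be colored by probe dye, and the first eigengene has one value per sample.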

Correlation with adjustment and biological variables:

PC  Batch      Center     Amount     Concentr.  Day        Month      Column     Row        Year       Grade    Stage    Age
1   0.1404     0.1263     0.004464   0.559545   0.120663   0.966426   0.423536   0.258863   0.036442   0.06361  0.35619  0.006236
2   2.2e-16    2.2e-16    2.447e-18  6.019e-03  2.198e-51  3.374e-32  1.389e-01  1.027e-01  2.397e-26  0.06419  0.12857  0.3884
3   8.057e-05  0.001862   0.0030439  0.5574784  0.0001560  0.0002149  0.7666292  0.5092963  0.0012852  0.6350   0.8488   0.1439
4   0.9221     0.2629     0.5526     0.3857     0.9925     0.7725     0.3971     0.8410     0.7211     0.0750   0.6507   0.0003187
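
A table of this kind can be produced by testing each principal component against each covariate. The sketch below is only an illustration on simulated data: the test actually used for the values above is not stated, so the choice of a one-way ANOVA for categorical covariates and a Pearson correlation test for continuous ones is an assumption, as are all variable names.

```r
set.seed(2)
n <- 40
# Hypothetical covariates for n samples
batch <- factor(rep(1:4, each = 10))
age   <- runif(n, 30, 80)

# Toy eigengene that depends on batch but not on age
pc <- as.numeric(batch) + rnorm(n, sd = 0.5)

# Categorical covariate: p-value of the one-way ANOVA F-test
p_batch <- anova(lm(pc ~ batch))[["Pr(>F)"]][1]

# Continuous covariate: p-value of the Pearson correlation test
p_age <- cor.test(pc, age)$p.value
```

Repeating the two tests over the first few eigengenes and all covariates fills in one row of the table per component.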

Since we already know from the analysis of the M values that batch, center, plate row and plate column affect the data, I will skip the preliminary steps and remove these factors directly. The dataset will also exclude batch number 0652.

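> # Design matrix for the factors to remove (samples selected by mask)
> X <- model.matrix(~ factor(batch[mask]) + adj$plate_row[mask] + adj$plate_column[mask] + factor(center[mask]))
> # Least-squares coefficients for every probe, via the normal equations
> Xmod <- solve(t(X) %*% X) %*% t(X) %*% t(beta[, mask])
> # Residuals: beta values minus the fitted values (intercept included)
> betaRes <- beta[, mask] - t(X %*% Xmod)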
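
The commands above solve the normal equations explicitly. As a hedged alternative, the same residuals can be obtained from a QR decomposition, which is usually more stable numerically. The sketch below checks the equivalence on simulated data; the toy matrix Y and the covariates batch and row_id are hypothetical stand-ins for the real variables.

```r
set.seed(3)
n <- 30
# Hypothetical sample-level covariates
batch  <- factor(rep(1:3, each = 10))
row_id <- rep(1:5, times = 6)
# Toy probes x samples matrix standing in for beta[, mask]
Y <- matrix(rnorm(50 * n), nrow = 50)

X <- model.matrix(~ batch + row_id)

# Normal-equation form, mirroring the commands above
coefs <- solve(t(X) %*% X) %*% t(X) %*% t(Y)
res_normal <- Y - t(X %*% coefs)

# QR-based form: residuals of each probe regressed on X
res_qr <- t(qr.resid(qr(X), t(Y)))

# Largest disagreement between the two forms (expected to be near zero)
max_diff <- max(abs(res_normal - res_qr))
```

Because the design matrix contains an intercept, the residuals of every probe average to zero across samples, so the adjusted matrix is centered as well as corrected.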

Correlation with adjustment and biological variables:

PC  Batch      Center     Amount     Concentr.  Day        Month      Column     Row        Year       Grade    Stage    Age
1   1          1          0.9830     0.6760     0.9999     1.0000     0.4995     0.9871     0.9133     0.3422   0.2915   0.01044
2   1          1          0.9750     0.7878     1.0000     1.0000     0.8328     1.0000     0.8487     0.5115   0.6704   0.4074
3   0.99       1          0.9402     0.7498     0.9994     1.0000     0.8415     0.9993     0.9304     0.11282  0.02903  0.0001028
4   1          1          0.9521     0.1648     1.0000     0.9998     0.5466     1.0000     0.9786     0.4573   0.6384   0.8565

With beta values, the correlation with age appears to be stronger. First eigengene, first and second eigenarrays, and the outliers:


We can proceed with building a comethylation network, although the outliers look terrible.