
Important update (January 20th, 2011): the data below have been corrected for the BCR batch, which is not necessarily the processing batch. The dataset needs to be reanalyzed.

 

Batch vs clinical traits

Batch vs center:

{code}
> table(batchID,center)
       center
batchID B7 BR CD CG D7 EQ F1
   1129  0 31  0  0  0  0  0
   1156  0 12  0 23  0  0  0
   1601  2  0  7 16  3  1  0
   1801  0  9  0  3 10  0  1
   1883  0 11  0  0  5  0  1
{code}

Most significant correlations (complete list can be found here):

{csv}STAD,DataType,NumberOfNAs,Test,Pvalue
residual_tumor,factor,8,Pearson's Chi-squared test,8.06E-17
year_of_initial_pathologic_diagnosis,integer,1,Kruskal-Wallis rank sum test,2.88E-13
days_to_form_completion,integer,1,Kruskal-Wallis rank sum test,3.72E-11
days_to_last_followup,integer,1,Kruskal-Wallis rank sum test,5.15E-11
primary_tumor_pathologic_spread,factor,1,Pearson's Chi-squared test,1.68E-06
histological_type,factor,6,Pearson's Chi-squared test,5.82E-06
lymphnode_pathologic_spread,factor,1,Pearson's Chi-squared test,6.09E-05
number_of_lymphnodes_examined,integer,53,Kruskal-Wallis rank sum test,1.25E-04
vital_status,factor,1,Pearson's Chi-squared test,2.95E-03
tumor_stage,factor,31,Pearson's Chi-squared test,3.49E-02{csv}
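The table above pairs each trait's data type with a test: Pearson's Chi-squared test for factors and the Kruskal-Wallis rank sum test for integers. A minimal sketch of that dispatch is shown below. The helper name `batch_association` and all of the data are made up for illustration; this is not the script that produced the table.

```r
# Hypothetical helper: choose the test from the trait's type,
# mirroring the DataType/Test columns of the table above.
batch_association <- function(trait, batch) {
  keep  <- !is.na(trait) & !is.na(batch)
  trait <- trait[keep]
  batch <- droplevels(factor(batch[keep]))
  if (is.factor(trait) || is.character(trait)) {
    # Categorical trait: contingency table of batch vs trait.
    counts <- table(batch, droplevels(factor(trait)))
    test   <- suppressWarnings(chisq.test(counts))
  } else {
    # Numeric trait: rank-based comparison across batches.
    test <- kruskal.test(trait ~ batch)
  }
  list(n_na = sum(!keep), test = test$method, p_value = test$p.value)
}

# Synthetic example data (not from the study).
set.seed(1)
batch <- factor(rep(c("1129", "1156", "1601"), each = 20))
stage <- factor(sample(c("I", "II", "III"), 60, replace = TRUE))
year  <- c(rep(2005L, 20), rep(2008L, 20), rep(2010L, 20)) +
  sample(0:1, 60, replace = TRUE)
year[3] <- NA

res_factor <- batch_association(stage, batch)
res_int    <- batch_association(year, batch)
print(res_factor)
print(res_int)
```

Missing values are dropped per trait, which is why the number of NAs is reported next to each p-value.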



Batch vs survival

[2 images: batch vs survival]

{code:collapse=true}
Call:
coxph(formula = survivalObject ~ batchVector)

  n= 134, number of events= 14

                     coef exp(coef)  se(coef)     z Pr(>|z|)
batchVector1156 1.722e+01 3.012e+07 7.152e+03 0.002    0.998
batchVector1601 1.786e+01 5.728e+07 7.152e+03 0.002    0.998
batchVector1801 1.663e+01 1.665e+07 7.152e+03 0.002    0.998
batchVector1883        NA        NA 0.000e+00    NA       NA

                exp(coef) exp(-coef) lower .95 upper .95
batchVector1156  30117568  3.320e-08         0       Inf
batchVector1601  57279474  1.746e-08         0       Inf
batchVector1801  16645384  6.008e-08         0       Inf
batchVector1883        NA         NA        NA        NA

Rsquare= 0.026   (max possible= 0.496 )
Likelihood ratio test= 3.58  on 3 df,   p=0.3109
Wald test            = 2.13  on 3 df,   p=0.545
Score (logrank) test = 3.1  on 3 df,   p=0.3762
Warning messages:
1: In fitter(X, Y, strats, offset, init, control, weights = weights,  :
  Loglik converged before variable  1,2,3 ; beta may be infinite.
2: In coxph(survivalObject ~ batchVector) :
  X matrix deemed to be singular; variable 4
{code}

No correlation with survival (likelihood ratio test p = 0.31), although with only 14 events the coefficients did not converge. For some reason I got NAs and a warning ("X matrix deemed to be singular") for the last batch (1883), although it is definitely not because of unused factor levels.
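Coefficients of about 17 with standard errors in the thousands, together with the "beta may be infinite" warning, usually mean that the estimate is running off to infinity. That typically happens when some batch (possibly the reference batch 1129 here) has no events among the rows used. One cheap check, sketched below, is to count usable rows and events per batch before fitting. The data are made up and the variable names `batch`, `time` and `status` are hypothetical. Whether this explains the NA for batch 1883 is an assumption that would need to be checked on the real data.

```r
# Synthetic stand-in data (not from the study); names are hypothetical.
batch  <- factor(c(rep("1129", 4), rep("1156", 3), rep("1601", 3), rep("1883", 3)))
time   <- c(120, 300, 210, NA, 90, 400, 150, 60, 330, 275, 45, 380, 190)
status <- c(0, 0, 0, NA, 1, 0, 0, 1, 0, 0, 1, 0, 0)  # 1 = event, 0 = censored

# Rows with missing time or status are dropped by the model fit.
usable <- !is.na(time) & !is.na(status)

# Usable rows and observed events per batch.
n_used   <- tapply(usable, batch, sum)
n_events <- tapply(status == 1 & usable, batch, sum)

diagnostics <- data.frame(
  batch    = names(n_used),
  n_used   = as.integer(n_used),
  n_events = as.integer(n_events)
)
print(diagnostics)

# Batches without any event cannot get a finite hazard ratio.
no_event_batches <- names(n_events)[n_events == 0]
print(no_event_batches)
```

If the reference batch turns out to have no events, releveling the factor so that a batch with events becomes the reference, or falling back to the log-rank test alone, may give a more interpretable result.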

 

DNA methylation

27k arrays, 66 patients. Create M values; don't split between red and green channels. SVD:

[3 images: SVD plots]
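For reference, the M value and SVD steps mentioned above can be sketched as follows. This assumes the common definition of the M value as the log2 ratio of methylated to unmethylated intensity with a small offset. The offset of 1, the matrix names and the simulated intensities are all assumptions rather than the settings used for the plots, and all probes are pooled regardless of colour channel.

```r
# Simulated intensities (probes in rows, samples in columns); not real data.
set.seed(42)
n_probes  <- 200
n_samples <- 10
meth   <- matrix(rgamma(n_probes * n_samples, shape = 2, scale = 500), n_probes)
unmeth <- matrix(rgamma(n_probes * n_samples, shape = 2, scale = 500), n_probes)

# M value: log2 ratio of methylated to unmethylated signal.
# The offset (assumed to be 1 here) guards against division by zero.
offset   <- 1
m_values <- log2((meth + offset) / (unmeth + offset))

# Centre each probe, then decompose.
centered <- m_values - rowMeans(m_values)
decomp   <- svd(centered)

# Fraction of variance carried by each component.
var_explained <- decomp$d^2 / sum(decomp$d^2)
print(round(var_explained, 3))

# Columns of decomp$v hold the per-sample scores of each component.
pc_scores <- decomp$v
```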
Summary of the technical variables:

{code}
> summary(methS)
 batchID       amount      concentration plate_column   plate_row
 1129:31   16.9 uL: 1   0.13 ug/uL: 6    1:16         A      :10
 1156:35   26.7 uL:65   0.14 ug/uL:27    2:13         C      : 9
                        0.15 ug/uL:25    3:13         D      : 9
                        0.16 ug/uL: 7    4:10         F      : 9
                        0.17 ug/uL: 1    5: 9         B      : 8
                                         6: 5         E      : 8
                                                      (Other):13
      shortDay
 21-7-2010:31
 28-7-2010:35
{code}

So this dataset has only 2 batches. Let's see if they have any correlation with the principal components:
{code:collapse=true}
> x
        batchID    amount concentration plate_column  plate_row     shortDay
V1 4.999780e-01 0.8132652     0.9636092    0.2126458 0.41035836 4.999780e-01
V2 1.080231e-07 0.1214957     0.8025371    0.2954381 0.91858389 1.080231e-07
V3 6.028215e-01 0.4465735     0.9897603    0.5199681 0.07110241 6.028215e-01
V4 7.947106e-02 0.2818850     0.3579813    0.8230956 0.52338954 7.947106e-02
V5 1.125719e-01 0.9790610     0.5150996    0.3113563 0.29650943 1.125719e-01
V6 5.502164e-01 0.4465735     0.3134523    0.3787485 0.50090145 5.502164e-01
V7 7.922533e-01 0.5117243     0.6591395    0.4459644 0.76917348 7.922533e-01
V8 9.614704e-02 0.1488680     0.2382575    0.3455933 0.94015824 9.614704e-02
{code}
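A matrix like the one above, with one p-value per principal component and technical variable, could be produced along the following lines. The page does not say which test was used. The repeated values in the amount column would be consistent with a rank-based test, so the sketch assumes Kruskal-Wallis. The data and the names `tech`, `pcs` and `pc_vs_tech` are made up.

```r
# Synthetic technical annotation and component scores (not from the study).
set.seed(7)
n <- 40
tech <- data.frame(
  batchID   = factor(rep(c("1129", "1156"), each = n / 2)),
  plate_row = factor(sample(LETTERS[1:4], n, replace = TRUE))
)

# Hypothetical scores: V2 carries a batch shift, V1 does not.
pcs <- cbind(
  V1 = rnorm(n),
  V2 = rnorm(n) + 3 * (tech$batchID == "1156")
)

# One Kruskal-Wallis p-value per (component, technical variable) pair.
pc_vs_tech <- sapply(tech, function(variable) {
  apply(pcs, 2, function(score) kruskal.test(score, variable)$p.value)
})
print(pc_vs_tech)
```

Rows are components and columns are technical variables, matching the layout of the output above.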

Looks like the second PC is highly correlated with the batch (p = 1.1e-07), and more weakly also the 4th and 8th (p = 0.079 and 0.096). The second PC explains 10% of the data variance. Remove the batch: [3 images: SVD plots after batch removal]

{code:collapse=true}
> x
     batchID    amount concentration plate_column plate_row  shortDay
V1 0.9538949 0.7329525     0.9668135    0.1956406 0.3925206 0.9538949
V2 0.6951568 0.1346448     0.7778342    0.6589222 0.1054539 0.6951568
V3 0.8522117 0.1642106     0.3273640    0.7278436 0.7377284 0.8522117
V4 0.9436648 0.8132652     0.2334584    0.4411353 0.9901676 0.9436648
V5 0.9743762 0.8955925     0.3016907    0.8663039 0.2159179 0.9743762
V6 0.9130370 0.4158556     0.3873149    0.5267212 0.4462888 0.9130370
V7 0.4145840 0.5815169     0.3605256    0.4940810 0.6986479 0.4145840
V8 0.9028540 0.1214957     0.5218528    0.3929218 0.5285360 0.9028540
{code}
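The page does not say how the batch was removed. One simple approach, sketched here as an assumption, is to regress each probe's M values on the batch factor and keep the residuals, adding the probe mean back so the values stay on the original scale. The data are simulated and the names `m_values`, `design` and `adjusted` are hypothetical.

```r
# Simulated M values: 50 probes x 20 samples, two batches (not real data).
set.seed(3)
batch    <- factor(rep(c("1129", "1156"), each = 10))
m_values <- matrix(rnorm(50 * 20), nrow = 50, ncol = 20)

# Inject a batch shift into the first 10 probes.
shifted <- batch == "1156"
m_values[1:10, shifted] <- m_values[1:10, shifted] + 2

# Fit one linear model per probe: M value ~ batch.
design <- model.matrix(~ batch)
fit    <- lm.fit(design, t(m_values))

# Residuals have the batch means removed; add the overall probe mean back.
adjusted <- t(fit$residuals) + rowMeans(m_values)

# After adjustment the per-batch means of every probe coincide.
gap <- rowMeans(adjusted[, !shifted]) - rowMeans(adjusted[, shifted])
print(max(abs(gap)))
```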

Removing batch took care of all other correlations. I was also wondering about the correlation of batch with the clinical traits in this smaller dataset (actual DNA methylation data, not potential). P-values for batch vs histological type: 0.001488 (Chi-square test) and 3.0e-05 (Fisher test); residual tumor: 7.465e-07 (Chi-square test) and 6.536e-09 (Fisher test). There wasn't any significant correlation with tumor grade. With tumor stage: 0.04773 (Chi-square test) and 0.009894 (Fisher test).
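With only 66 samples, some cells of a batch-by-trait table can have small expected counts. In that situation the Chi-square approximation is less reliable, which is one reason to report Fisher's exact test alongside it. The sketch below runs both tests on one table. Only the batch sizes (31 and 35) come from this page; the trait labels and counts are invented for illustration.

```r
# Batch sizes from the summary above; trait counts are hypothetical.
batch <- factor(c(rep("1129", 31), rep("1156", 35)))
trait <- factor(c(
  rep("R0", 28), rep("R1", 3),                  # batch 1129
  rep("R0", 12), rep("R1", 20), rep("RX", 3)    # batch 1156
))

counts <- table(batch, trait)
print(counts)

# Chi-square test (warns when expected counts are small) and Fisher's exact test.
chi <- suppressWarnings(chisq.test(counts))
fis <- fisher.test(counts)

# Smallest expected cell count; values below 5 make the approximation doubtful.
expected_min <- min(chi$expected)

print(c(chisq = chi$p.value, fisher = fis$p.value, expected_min = expected_min))
```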

 

Consider the data to be normalized. 
Expression set object is available.