IlluminaHumanMethylation450k.db Bioconductor package

To run a test analysis of DNA methylation data, I need to cluster the data based on the 27 markers described in Shen, 2007. Seven of those markers are MINTs, and I need to work out their most likely locations on the 450k array. I will use the Bioconductor IlluminaHumanMethylation450k.db package for annotation (i.e. the chromosome and position of each probe), but some of its annotation objects are confusing. Here I test what each annotation object in the package actually returns.

IlluminaHumanMethylation450kCHR36: What chromosome does the target sequence for a probe align to, in build 36?

> x <- IlluminaHumanMethylation450kCHR36
> xx <- as.list(x)[1:3]
> xx
$cg00000029
[1] "16"
$cg00000108
[1] "3"
$cg00000109
[1] "3"

IlluminaHumanMethylation450kCHR37: What chromosome does the target sequence for a probe align to, in build 37?

> x <- IlluminaHumanMethylation450kCHR37
> xx <- as.list(x)[1:3]
> xx
$cg00000029
[1] "16"
$cg00000108
[1] "3"
$cg00000109
[1] "3"

IlluminaHumanMethylation450kCHRLOC: IlluminaHumanMethylation450kCHRLOC is an R object that maps manufacturer identifiers to the starting position of the gene. The position of a gene is measured as the number of base pairs. The CHRLOCEND mapping is the same as the CHRLOC mapping except that it specifies the ending base of a gene instead of the start.

> x <- IlluminaHumanMethylation450kCHRLOC
> xx <- as.list(x)[1:3]
> xx
$cg00000029
      16 
53468350 
$cg00000108
       3        3 
37440967 37440967 
$cg00000109
        3         3 
171757417 171758343
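
Because each CHRLOC entry is a named vector (names are chromosomes, values are gene start positions) and a probe can carry more than one entry, it can help to flatten the result into a table before comparing it with other annotations. The following is only a minimal sketch in base R: the values are copied by hand from the output above rather than read from the package, and the names chrloc and chrloc_df are my own.

```r
# Gene start positions copied from the CHRLOC output above.
# Vector names are chromosomes; a probe may have several entries.
chrloc <- list(
  cg00000029 = c("16" = 53468350),
  cg00000108 = c("3" = 37440967, "3" = 37440967),
  cg00000109 = c("3" = 171757417, "3" = 171758343)
)

# One row per (probe, gene start) pair.
chrloc_df <- data.frame(
  probe = rep(names(chrloc), lengths(chrloc)),
  chr   = unlist(lapply(chrloc, names), use.names = FALSE),
  start = unlist(chrloc, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop exact duplicates, such as the repeated cg00000108 entry.
chrloc_df <- unique(chrloc_df)
print(chrloc_df)
```

With the real package, the same list would presumably come from as.list(IlluminaHumanMethylation450kCHRLOC), as in the session above.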

IlluminaHumanMethylation450kCHR: IlluminaHumanMethylation450kCHR is an R object that provides mappings between a manufacturer identifier and the chromosome that contains the gene of interest. Each manufacturer identifier maps to a vector of chromosomes. Due to inconsistencies that may exist at the time the object was built, the vector may contain more than one chromosome (e.g., the identifier may map to more than one chromosome). If the chromosomal location is unknown, the vector will contain an NA. Mappings were based on data provided by: Entrez Gene ftp://ftp.ncbi.nlm.nih.gov/gene/DATA. With a date stamp from the source of: 2010-Sep7

> x <- IlluminaHumanMethylation450kCHR
> xx <- as.list(x)[1:3]
> xx
$cg00000029
[1] "16"
$cg00000108
[1] "3"
$cg00000109
[1] "3"

IlluminaHumanMethylation450kCPG36: CpG location annotations against genome build 36

> as.list(IlluminaHumanMethylation450kCPG36)[1:3]
$cg00000029
[1] 52025613
$cg00000108
[1] 37434210
$cg00000109
[1] 173398731

IlluminaHumanMethylation450kCPG37: CpG location annotations against genome build 37

> as.list(IlluminaHumanMethylation450kCPG37)[1:3]
$cg00000029
[1] 53468112
$cg00000108
[1] 37459206
$cg00000109
[1] 171916037
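
To see how much the CpG coordinates move between builds, and how a chromosome plus CpG position could be used to look up probes near a region of interest (which is what I need for the MINTs), here is a rough sketch in base R. The values are copied by hand from the CHR37, CPG36 and CPG37 outputs above. The helper probes_in_window is a hypothetical function of my own, not part of the package, and the window used below is purely illustrative, not a real MINT locus.

```r
# Chromosome and CpG positions copied from the outputs above.
probes <- data.frame(
  probe = c("cg00000029", "cg00000108", "cg00000109"),
  chr   = c("16", "3", "3"),
  pos36 = c(52025613, 37434210, 173398731),
  pos37 = c(53468112, 37459206, 171916037),
  stringsAsFactors = FALSE
)

# Shift of each CpG coordinate between build 36 and build 37.
probes$shift <- probes$pos37 - probes$pos36

# Hypothetical helper: probes on `chr` whose build 37 CpG position
# falls inside the closed interval [start, end].
probes_in_window <- function(df, chr, start, end) {
  df$probe[df$chr == chr & df$pos37 >= start & df$pos37 <= end]
}

# Illustrative window only; not a real MINT locus.
hits <- probes_in_window(probes, "3", 37000000, 38000000)
print(probes)
print(hits)
```

The shifts differ in size and sign from probe to probe, so coordinates from the two builds cannot be mixed; any region of interest has to be expressed in the same build as the annotation it is compared against.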