To run a test analysis of DNA methylation data, I need to cluster the data based on the 27 markers described in Shen, 2007. Seven of those markers are MINTs, and I need to work out their most likely locations on the 450k array. I will use the Bioconductor IlluminaHumanMethylation450k.db package for annotation (i.e., the chromosome and position of each probe), but some of its features are confusing. Here I test what each annotation feature in the package really means.
IlluminaHumanMethylation450kCHR36: What chromosome does the target sequence for a probe align to, in build 36?
IlluminaHumanMethylation450kCHR37: What chromosome does the target sequence for a probe align to, in build 37?
IlluminaHumanMethylation450kCHRLOC: An R object that maps manufacturer identifiers to the starting position of the gene, measured in base pairs. The CHRLOCEND mapping is the same as the CHRLOC mapping, except that it gives the ending base of the gene instead of the start.
IlluminaHumanMethylation450kCHR: An R object that maps each manufacturer identifier to the chromosome that contains the gene of interest. Each identifier maps to a vector of chromosomes. Because of inconsistencies that may have existed when the object was built, the vector may contain more than one chromosome (i.e., an identifier may map to more than one chromosome). If the chromosomal location is unknown, the vector contains NA. Mappings were based on data provided by Entrez Gene (ftp://ftp.ncbi.nlm.nih.gov/gene/DATA), with a source date stamp of 2010-Sep7.
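Because an identifier can map to no chromosome, one chromosome, or several, it helps to separate those cases before relying on the mapping. The sketch below does this in base R on a small hand-made list. The probe names and chromosome values in `toy_chr` are invented for illustration, and `classify_chr_mapping` is a hypothetical helper, not part of the package. With the real package, I assume a list of the same shape could be obtained with `as.list(IlluminaHumanMethylation450kCHR)`.

```r
# Toy stand-in for the identifier -> chromosome mapping (values are made up).
toy_chr <- list(
  probe_a = "16",            # maps to exactly one chromosome
  probe_b = c("1", "X"),     # maps to more than one chromosome
  probe_c = NA_character_    # chromosomal location unknown
)

# Hypothetical helper: label each identifier by how many distinct,
# non-missing chromosomes it maps to.
classify_chr_mapping <- function(chr_list) {
  vapply(chr_list, function(chrs) {
    chrs <- unique(chrs[!is.na(chrs)])
    if (length(chrs) == 0) {
      "unknown"
    } else if (length(chrs) == 1) {
      "unique"
    } else {
      "ambiguous"
    }
  }, character(1))
}

status <- classify_chr_mapping(toy_chr)
print(status)
```

Identifiers labelled "ambiguous" or "unknown" are the ones worth checking by hand before using the chromosome assignment.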
IlluminaHumanMethylation450kCPG36: CpG location annotations against genome build 36
IlluminaHumanMethylation450kCPG37: CpG location annotations against genome build 37
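To put the build 37 chromosome and CpG position side by side for a set of probes, a lookup along these lines may work. It is a sketch under assumptions: `lookup_probe_positions` is a hypothetical helper, and I assume the package's map objects accept `mget()` the way standard AnnotationDbi maps do. The demo runs on toy environments with invented values; the commented lines at the end show the intended call with the real objects.

```r
# Hypothetical helper: look up chromosome and CpG coordinate for each probe.
# chr_map and cpg_map can be anything mget() accepts as its second argument
# (plain environments here; assumed to also work for the package's maps).
lookup_probe_positions <- function(probes, chr_map, cpg_map) {
  chr <- mget(probes, chr_map, ifnotfound = NA)
  pos <- mget(probes, cpg_map, ifnotfound = NA)

  # Collapse multi-valued entries into one string; keep missing entries as NA.
  collapse <- function(v) {
    if (all(is.na(v))) NA_character_ else paste(as.character(v), collapse = ";")
  }

  data.frame(
    probe = probes,
    chr = vapply(chr, collapse, character(1)),
    pos = vapply(pos, collapse, character(1)),  # stored as text, not numbers
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy maps with invented values, standing in for the CHR37 and CPG37 objects.
toy_chr37 <- list2env(list(probe_a = "16", probe_b = "3"))
toy_cpg37 <- list2env(list(probe_a = 1000L, probe_b = 2000L))

positions <- lookup_probe_positions(c("probe_a", "probe_b", "probe_x"),
                                    toy_chr37, toy_cpg37)
print(positions)

# Intended use with the real package (not run here; my_probes is a
# placeholder for a character vector of probe identifiers):
# library(IlluminaHumanMethylation450k.db)
# lookup_probe_positions(my_probes,
#                        IlluminaHumanMethylation450kCHR37,
#                        IlluminaHumanMethylation450kCPG37)
```

Probes missing from either map come back as NA rather than raising an error, which makes unannotated probes easy to spot.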