One can read my Sweave file (filename) for this analysis to see, step by step, what I did and how. In short, I performed PCA on each sub-dataset and identified the technical variables that had the biggest influence on my data. The strongest was batch, which was also highly correlated with month and center. After I removed batch, I could still see strange patterns in my data, so I ended up also removing the first principal component. Since I was not sure whether that was appropriate for my further analyses, I proceeded with two datasets: one from which only batch was removed (called mb) and one from which both batch and the first principal component were removed (called mbc). One more thing: I centered the red and green channels before combining them into the final datasets.
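
The two removal steps can be illustrated with a minimal sketch. This is not the code from the Sweave file. It assumes samples in rows and probes in columns, removes batch by subtracting each batch's per-probe mean (one common way to do it, and an assumption here), and removes the first principal component by subtracting its rank-1 reconstruction. The function names `remove_batch` and `remove_first_pc` and the simulated data are hypothetical.

```python
import numpy as np

def remove_batch(X, batch):
    """Subtract each batch's per-probe mean, then restore the overall per-probe mean."""
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    grand_mean = X.mean(axis=0)
    out = X.copy()
    for b in np.unique(batch):
        rows = batch == b
        out[rows] -= X[rows].mean(axis=0)
    return out + grand_mean

def remove_first_pc(X):
    """Subtract the rank-1 reconstruction of the first principal component."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                   # PCA works on column-centered data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = np.outer(U[:, 0] * s[0], Vt[0])           # scores times loadings for PC1
    return Xc - pc1 + mean

# Simulated stand-in data: 12 samples, 50 probes, 3 batches with a shifted mean.
rng = np.random.default_rng(0)
batch = np.repeat([0, 1, 2], 4)
X = rng.normal(size=(12, 50)) + batch[:, None] * 2.0

mb = remove_batch(X, batch)      # batch removed
mbc = remove_first_pc(mb)        # batch and first principal component removed
```

Here `mb` and `mbc` mirror the two dataset names above only by analogy; the real datasets were built from centered red and green channels, a step this sketch does not reproduce.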

This is all great, and I agree with Brig's approach, as it is very intuitive and makes few assumptions. However, now that I am proceeding with data analysis, it is critical for me to figure out which probes are actually methylated and which are not, especially because I don't have any control data. How should I approach this? For M values, methods have been developed for drawing a cutoff (because the data have a distinctive bimodal shape). Should I take the unmethylated probes, process them the same way as the methylated probes, combine the two into M values, and apply the existing method (described here) to decide what is methylated and what is not? Also, Bin has built comethylation networks based on the mb and mbc normalizations. Should I rebuild them with the new M values? Something to think about.
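
To make the question concrete, here is a rough sketch of the idea, not the linked method. It computes M values as the log2 ratio of methylated to unmethylated intensity with a small offset, then places a cutoff between the two modes with a simple one-dimensional two-means split. The two-means step is a stand-in chosen for illustration. The function names, the offset of 1, and the toy intensities are all assumptions, and the formula expects non-negative intensities, so channels that have already been centered would need to be handled differently.

```python
import numpy as np

def m_values(meth, unmeth, offset=1.0):
    """M = log2((max(meth, 0) + offset) / (max(unmeth, 0) + offset))."""
    meth = np.maximum(np.asarray(meth, dtype=float), 0.0)
    unmeth = np.maximum(np.asarray(unmeth, dtype=float), 0.0)
    return np.log2((meth + offset) / (unmeth + offset))

def two_means_cutoff(values, n_iter=100):
    """Midpoint between the two cluster means of a 1-D two-means split."""
    v = np.asarray(values, dtype=float).ravel()
    lo, hi = v.min(), v.max()                  # start the two centers at the extremes
    for _ in range(n_iter):
        cut = (lo + hi) / 2.0
        low, high = v[v <= cut], v[v > cut]
        if low.size == 0 or high.size == 0:    # degenerate case: only one group
            break
        new_lo, new_hi = low.mean(), high.mean()
        if new_lo == lo and new_hi == hi:      # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0

# Toy intensities for four probes (hypothetical numbers).
meth = np.array([5000.0, 200.0, 4000.0, 100.0])
unmeth = np.array([100.0, 6000.0, 300.0, 5000.0])

M = m_values(meth, unmeth)
cutoff = two_means_cutoff(M)
is_methylated = M > cutoff       # True where a probe falls in the upper mode
```

A split like this only makes sense if the M values really are bimodal, which would have to be checked on the processed data first.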