
Notes


Summary of MetaGEO steps:

Step 0:  Find studies that are to be processed and initiate workflow instances for these studies.
Step 1:  Download and modify/parse the study's series_matrix files. (Q:  Where should the file go:  file system, Synapse, other?  Is it temporary?)
Step 2:  Parse metadata from the series_matrix file and create folder hierarchy (Q:  Why create folders *now*?  What if folders already exist?  Where should the metadata go:  file system, Synapse?  Should steps 1&2 be combined?)

Step 3:  Create 'raw data' folders and 'makefiles'.  (Q:  Can 'makefile' logic be split between later steps and the 'decider'? Can 'raw data' folder creation be done in Step 1 or 2 instead of here?)

Step 4: Download CEL files.  (Q: Should there be one activity per CEL file?  If not, then the activity should know how to start from a partial result and/or how to recover if the FTP transfer fails. A: All CEL files can be downloaded as a single .tar.gz.)

Step 5: Unzip .tar.gz and discard non-CEL files.

Step 6: Extract scan timestamp and add to metadata. 

Step 7:  Reconcile CEL files with metadata.  If there is a mismatch then halt the process.

Step 8: Add the CEL file names to the metadata, then transpose the metadata file.
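
Steps 5 and 7 could be sketched roughly as follows. This is only an illustration, not the MetaGEO implementation: the function names, the `MetadataMismatch` exception, the accepted suffixes (`.cel`, `.cel.gz`) and the rule that the sample ID is the part of the file name before the first `_` or `.` are all assumptions made for the example.

```python
import tarfile
from pathlib import Path

# Assumption: CEL files may be stored plain or individually gzipped.
CEL_SUFFIXES = (".cel", ".cel.gz")


class MetadataMismatch(Exception):
    """Hypothetical error used to halt the workflow in Step 7."""


def extract_cel_files(archive_path, dest_dir):
    """Step 5: unpack only the CEL members of a .tar.gz archive."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Keep only the base name so nothing is written outside dest.
            name = Path(member.name).name
            if not member.isfile() or not name.lower().endswith(CEL_SUFFIXES):
                continue  # discard non-CEL content
            target = dest / name
            with archive.extractfile(member) as source, open(target, "wb") as out:
                out.write(source.read())
            extracted.append(target)
    return sorted(extracted)


def sample_id(cel_path):
    """Assumed naming rule: 'GSM123_xyz.CEL.gz' belongs to sample 'GSM123'."""
    return Path(cel_path).name.split(".")[0].split("_")[0]


def reconcile(cel_paths, metadata_sample_ids):
    """Step 7: return {sample ID: CEL file name}, or halt on any mismatch."""
    found = {sample_id(path): Path(path).name for path in cel_paths}
    expected = set(metadata_sample_ids)
    missing = sorted(expected - set(found))
    unexpected = sorted(set(found) - expected)
    if missing or unexpected:
        raise MetadataMismatch(
            f"no CEL file for {missing}; no metadata for {unexpected}"
        )
    return found
```

The mapping returned by `reconcile` is the kind of input Step 8 would need when adding CEL file names to the metadata.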

** at this point we run parallel steps for each platform (="array pattern") in the study **
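
One way to picture the fan-out is to group the study's samples by platform and hand each group to its own task. The sketch below is an assumption-laden illustration: `group_by_platform`, `run_per_platform` and the `process` callback are hypothetical names, and a thread pool merely stands in for whatever the workflow framework provides for parallel activities.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_platform(sample_platforms):
    """Turn {sample ID: platform ID} into {platform ID: sorted sample IDs}."""
    groups = defaultdict(list)
    for sample, platform in sample_platforms.items():
        groups[platform].append(sample)
    return {platform: sorted(samples) for platform, samples in sorted(groups.items())}


def run_per_platform(study, sample_platforms, process):
    """Run process(study, platform, samples) once per platform, in parallel.

    Returns {platform ID: result}. An exception raised by any task is
    re-raised here when its result is collected.
    """
    groups = group_by_platform(sample_platforms)
    with ThreadPoolExecutor() as pool:
        futures = {
            platform: pool.submit(process, study, platform, samples)
            for platform, samples in groups.items()
        }
        return {platform: future.result() for platform, future in futures.items()}
```

Here `process` would wrap Steps 9 to 11 for a single <study,platform> pair.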

Step 9: Create Sweave file for processing the <study,platform> in R.

Step 10: Run Sweave file.  Input:  CEL file set and metadata file; Output: processed data (R object), diagnostics (many image files), and inference (R objects).

Step 11: Clean up temporary files.  (Currently done by makefile.)

Output:

- processed data

- diagnostics

- inference data

- *might* need to save files for manual rework, i.e. makefiles.  (But this might be subsumed by the workflow framework.)

Temporary files, to be deleted:

- CEL files - can be deleted once the output is complete

- various files made by makefile
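
A cleanup step along these lines would delete the CEL files only after checking that the outputs exist. It is a minimal sketch: the `clean_up` name, the output file names passed in and the glob patterns are placeholders, since the notes do not say what the real outputs or makefile products are called.

```python
from pathlib import Path


def clean_up(work_dir, required_outputs, temp_patterns=("*.CEL", "*.cel")):
    """Delete temporary files under work_dir once all outputs are present.

    required_outputs: paths relative to work_dir that must exist first.
    temp_patterns: glob patterns for temporary files (assumed values).
    Returns the sorted list of deleted paths; deletes nothing and returns
    an empty list if any required output is missing.
    """
    work = Path(work_dir)
    if any(not (work / name).exists() for name in required_outputs):
        return []  # output incomplete: keep the CEL files for a rerun
    # Collect first, then delete, so the directory is not modified mid-scan.
    targets = {
        path
        for pattern in temp_patterns
        for path in work.rglob(pattern)
        if path.is_file()
    }
    for path in targets:
        path.unlink()
    return sorted(targets)
```

Refusing to delete while an output is missing means a failed run can be repeated without downloading the CEL files again.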
