...
Code Block |
---|
# Make sure you have the latest version |
svn up |
# Execute the loader |
# Replace <repo_instance> and <auth_instance> by the repository and authentication instances. |
# Either make sure that <platform_admin_email> is a Synapse administrator on crowd, or replace it by a Synapse administrator account |
python datasetCsvLoader.py -d ./AllDatasets.csv -l ./AllDatasetLayerLocations.csv -e http://<repo_instance>/repo/v1 v1 -a http://<auth_instance>/auth/v1 -m /work/platform/DatasetMetadataLoader/platform.md5sums.csv -u <platform_admin_email> -p <platform_admin_pw> |
This will create a publicly-accessible project called Sage BioCuration, and populate it with curated data from Sage's repository data team.
If you need to repopulate the data in S3, pass the -3 argument to the data loader. The loader currently uploads the data serially, so this takes an hour or two. We should only need to do this if something has gone wrong with our S3 bucket.
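When the loader is run from a script rather than typed by hand, the invocation can be assembled programmatically so that the -3 flag is only added on request. The following is a minimal sketch, not part of the loader itself: `build_loader_command` and `format_command` are hypothetical helper names, the file paths are copied from the command above, and the stray `v1` that follows the repository URL there is left out on the assumption that it is a typo.

```python
import shlex


def build_loader_command(repo_endpoint, auth_endpoint, admin_email,
                         admin_password, repopulate_s3=False):
    """Return the datasetCsvLoader.py invocation as an argument list.

    Hypothetical helper: it only assembles the flags shown on this page.
    """
    command = [
        "python", "datasetCsvLoader.py",
        "-d", "./AllDatasets.csv",
        "-l", "./AllDatasetLayerLocations.csv",
        "-e", repo_endpoint,
        "-a", auth_endpoint,
        "-m", "/work/platform/DatasetMetadataLoader/platform.md5sums.csv",
        "-u", admin_email,
        "-p", admin_password,
    ]
    if repopulate_s3:
        # -3 repopulates the data in S3; the upload is serial and slow.
        command.append("-3")
    return command


def format_command(command):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in command)
```

The returned list can be passed directly to `subprocess.run(command, check=True)`, or printed with `format_command` for review before it is run. Keeping `repopulate_s3` off by default mirrors the advice above that the S3 upload should be the exception.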
Verify Deployment
To verify deployment, run top-level queries against the repository instances from an authenticated account.
Make sure you can download the MSKCC clinical data layer from S3.
TODO: Add queries and expected counts returned.
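Once the queries and expected counts from the TODO above are filled in, the comparison itself could be scripted roughly as follows. This is only a sketch under stated assumptions: `check_counts` and `run_query` are hypothetical names, and `run_query` stands for whatever function the operator supplies to send one query to the repository instance from an authenticated account and return the number of results. No actual queries or counts are implied here.

```python
def check_counts(run_query, expected_counts):
    """Compare query result counts against the expected values.

    run_query: caller-supplied function (hypothetical) that takes a query
        string and returns the number of results from the repository.
    expected_counts: dict mapping query string -> expected count.

    Returns a list of (query, expected, actual) tuples, one per mismatch.
    An empty list means every query returned the expected count.
    """
    mismatches = []
    for query, expected in expected_counts.items():
        actual = run_query(query)
        if actual != expected:
            mismatches.append((query, expected, actual))
    return mismatches
```

Passing the query function in as an argument keeps the check independent of how authentication and the repository endpoint are handled, so it can be exercised with a stub before a real deployment exists.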