New Challenge Infrastructure

Infrastructure Setup

At Sage, we generally provision an EC2 Linux instance for a Challenge that leverages the SynapseWorkflowOrchestrator to run CWL workflows.  These workflows are responsible for evaluating and scoring submissions (see the model-to-data-challenge-workflow repository on GitHub for an example workflow).  If Sage is responsible for the cloud compute services, please give a general estimate of the computing power (memory, volume) needed.  We can also help with the estimates if you are unsure.

What Can Affect the Computing Power

By default, up to ten submissions can be evaluated concurrently, though this number can be increased or decreased in the orchestrator's .env file.  Generally, the more submissions you want to run concurrently, the more powerful the instance will need to be.
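For reference, the concurrency limit lives in the same .env file you will configure in Step 9; a minimal sketch, assuming the MAX_CONCURRENT_WORKFLOWS property described in the orchestrator README:

# .env (excerpt): cap concurrent submission evaluations at four
MAX_CONCURRENT_WORKFLOWS=4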

Example

Let’s say a submission file that is very large and/or complex requires up to 10GB of memory for evaluation.  If at most four submissions should run at the same time, then an instance with at least 40GB of memory will be required (plus some extra memory for system processes), whereas ten concurrent submissions would require at least 100GB.

The volume of the instance will depend on variables such as the size of the input files and the generated output files.  If running a model-to-data challenge, Docker images should also be taken into account.  On average, participants create Docker images that are around 2-4 GB in size, though some have exceeded 10 GB.  (When this happens, we encourage participants to revisit their Dockerfile and source code to ensure they are following best practices, as >10 GB is a bit high.)
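Once the instance is up, standard Linux and Docker commands can help you sanity-check how much volume is actually being consumed; for example:

df -h             # overall disk usage on the instance
docker system df  # space used by Docker images, containers, and volumes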

Sensitive Data

If data is sensitive and cannot leave the external site or data provider, please provide a remote server with (ideally) the following:

If Sage is not allowed access to the server, then it is the external site's responsibility to get the orchestrator running in whatever environment is chosen.  If Docker is not supported by the system, please let us know, as we have workaround solutions (e.g. running the orchestrator directly with Java).

Typical Infrastructure Setup Steps


1. Create a workflow infrastructure GitHub repository for the Challenge.  We have created two templates in Sage-Bionetworks-Challenges that you may use as a starting point.  The READMEs outline what will need to be updated within the scripts; we will return to this in Step 10.

a. data-to-model-challenge-workflow (submission type: prediction files)

b. model-to-data-challenge-workflow (submission type: Docker images)

2. Create the Challenge site on Synapse.  This can easily be done with challengeutils:

challengeutils create-challenge "challenge_name"

This command will create two Synapse Projects: one staging site and one live site.  You may think of them as development and production: all edits must be done in the staging site, NOT the live site.  Changes are instead synced over to the live site with challengeutils' mirror-wiki (more on this under Update the Challenge).
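If challengeutils is not yet installed, it is available from PyPI; a minimal setup sketch, assuming Python and pip are available and your Synapse credentials are configured (e.g. in ~/.synapseConfig, which challengeutils uses to log in):

pip install challengeutils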

Note: at first, the live site will be just one page providing a general overview of the Challenge.  There will also be a pre-register button that Synapse users can click if they are interested in the upcoming Challenge.

For the initial deployment of the staging site to live, use synapseutils' copyWiki command, NOT mirror-wiki (more on this under Launch the Challenge).

create-challenge will also create four Synapse Teams for the Challenge: Preregistrants, Participants, Organizers, and Admin, each prefixed with the Challenge name.  Add users to the Organizers and Admin teams as needed.

3. On the live site, go to the CHALLENGE tab and create as many Evaluation Queues as needed (e.g. one per sub-challenge) by clicking on Challenge Tools > Create Evaluation Queue.  By default, create-challenge will create an Evaluation Queue for writeups, which you will already see listed here.

Important: the 7-digit number in parentheses following each Evaluation Queue name is its evaluation ID.

You will need these IDs later for Step 9, so make note of them.

4. While still on the live site, go to the FILES tab and create a new Folder called "Logs" by clicking on Files Tools > Add New Folder.

Important: this is where the participants' submission logs and prediction files will be uploaded, so make note of its Synapse ID for later use in Step 9.

5. On the staging site, go to the FILES tab and create a new File by clicking on Files Tools > Upload or Link to a File > Link to URL.

For "URL", enter the link address to the zipped download of the workflow infrastructure repository.  You may get this address by going to the repository and clicking on Code > right-clicking Download Zip > Copy Link Address:

Name the File whatever you like (we generally use "workflow"), then hit Save.

Important: this File is what links the Evaluation Queue to the orchestrator, so make note of its Synapse ID for later use in Step 9.

6. Add an Annotation to the File called ROOT_TEMPLATE by clicking on Files Tools > Annotations > Edit.  The "Value" will be the path to the workflow script, written as:

{infrastructure workflow repo}-{branch}/path/to/workflow.cwl

For example, this is the path to workflow.cwl of the model-to-data template repo:

model-to-data-challenge-workflow-main/workflow.cwl

Important: the ROOT_TEMPLATE annotation is what the orchestrator uses to determine which file in the repo is the workflow script.

7. Create a cloud compute environment with the required memory and volume specifications.  Once it spins up, log into the instance and clone the orchestrator:

git clone https://github.com/Sage-Bionetworks/SynapseWorkflowOrchestrator.git


Follow the "Setting up linux environment" instructions in the README to install and run Docker, as well as docker-compose.

8. While still on the instance, change directories to SynapseWorkflowOrchestrator/ and create a copy of the .envTemplate file as .env (or simply rename it to .env):

cd SynapseWorkflowOrchestrator/
cp .envTemplate .env

9. Open .env and enter values for the following property variables:

Property | Description | Example
SYNAPSE_USERNAME | Synapse credentials under which the orchestrator will run.  The provided user must have access to the Evaluation Queue(s) being serviced. | dream_user
SYNAPSE_PASSWORD | API key for SYNAPSE_USERNAME.  This can be found under My Dashboard > Settings. | "abcdefghi1234=="
WORKFLOW_OUTPUT_ROOT_ENTITY_ID | Synapse ID of the "Logs" Folder.  Use the Synapse ID from Step 4. | syn123
EVALUATION_TEMPLATES | JSON map of evaluation IDs to the workflow archive, where each key is an evaluation ID and each value is the Synapse ID of the File linking to the archive.  Use the evaluation IDs from Step 3 as the key(s) and the Synapse ID from Step 5 as the value. | {"9810678": "syn456", "9810679": "syn456"}

Other properties may also be updated if desired, e.g. SUBMITTER_NOTIFICATION_MASK, SHARE_RESULTS_IMMEDIATELY, MAX_CONCURRENT_WORKFLOWS, etc.  Refer to the "Running the Orchestrator with Docker containers" notes in the README for more details.
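Putting it together, a filled-in .env might look like the following sketch (the IDs are the placeholder examples from the table above, not real values):

# .env
SYNAPSE_USERNAME=dream_user
SYNAPSE_PASSWORD=abcdefghi1234==
WORKFLOW_OUTPUT_ROOT_ENTITY_ID=syn123
EVALUATION_TEMPLATES={"9810678": "syn456", "9810679": "syn456"}
# optional overrides, e.g.:
MAX_CONCURRENT_WORKFLOWS=4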

10. Return to the workflow infrastructure repository and clone it onto your local machine.  Open the repo in your editor of choice and make the following edits to the scripts:
data-to-model template:

Script | What to Edit | Required TODO?
workflow.cwl | Update synapseid to the Synapse ID of the Challenge's goldstandard | TRUE
workflow.cwl | Set errors_only to false if an email notification about a valid submission should also be sent | FALSE
workflow.cwl | Add metrics and scores to private_annotations if they are to be withheld from the participants | FALSE
validate.cwl | Update the base image if the validation code is not Python | FALSE
validate.cwl | Remove the sample validation code and replace it with validation code for the Challenge | TRUE
score.cwl | Update the base image if the scoring code is not Python | FALSE
score.cwl | Remove the sample scoring code and replace it with scoring code for the Challenge | TRUE


model-to-data template:

Script | What to Edit | Required TODO?
workflow.cwl | Provide the admin user ID or admin team ID for principalid (2 steps: set_submitter_folder_permissions, set_admin_folder_permissions) | TRUE
workflow.cwl | Update synapseid to the Synapse ID of the Challenge's goldstandard | TRUE
workflow.cwl | Set errors_only to false if an email notification about a valid submission should also be sent (2 steps: email_docker_validation, email_validation) | FALSE
workflow.cwl | Provide the absolute path to the data directory, denoted as input_dir, to be mounted during the container runs | TRUE
workflow.cwl | Set store to false if log files should be withheld from the participants | FALSE
workflow.cwl | Add metrics and scores to private_annotations if they are to be withheld from the participants | FALSE
validate.cwl | Update the base image if the validation code is not Python | FALSE
validate.cwl | Remove the sample validation code and replace it with validation code for the Challenge | TRUE
score.cwl | Update the base image if the scoring code is not Python | FALSE
score.cwl | Remove the sample scoring code and replace it with scoring code for the Challenge | TRUE

Push the changes up to GitHub when done.
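A typical commit-and-push sequence for those edits (assuming the default branch is named main; adjust if yours differs):

git add workflow.cwl validate.cwl score.cwl
git commit -m "Configure workflow scripts for the challenge"
git push origin main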

11. On the instance, change directories to SynapseWorkflowOrchestrator/ and kick-start the orchestrator with:
docker-compose up

To have it run in the background, add the -d flag (for detached mode):

docker-compose up -d

Note: it may be helpful to not run the orchestrator in detached mode at first, so that you are immediately aware of any errors with the orchestrator setup.
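If you do run it detached, you can still follow the orchestrator's output with docker-compose's log command (run from the SynapseWorkflowOrchestrator/ directory):

docker-compose logs -f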

If successful, the orchestrator will continuously monitor the Evaluation Queues specified by EVALUATION_TEMPLATES for submissions with the status RECEIVED.  When it encounters a RECEIVED submission, it will run the workflow specified by ROOT_TEMPLATE and update the submission status from RECEIVED to EVALUATION_IN_PROGRESS.  The orchestrator will also upload logs and prediction files to the Folder specified by WORKFLOW_OUTPUT_ROOT_ENTITY_ID.  The Folder will be structured like this:

Logs
├── submitteridA
│   ├── submission01
│   │   └── submission01.zip
│   ├── submission02
│   │   └── submission02.zip
│   ...
│
├── submitteridA_LOCKED
│   ├── submission01
│   │   └── predictions.csv
│   ├── submission02
│   │   └── predictions.csv
│   ...
│
...

If an error is encountered during any of the workflow steps, the orchestrator will update the submission status to INVALID and the workflow will stop.  If, instead, the workflow runs to completion, the orchestrator will update the submission status to ACCEPTED.  Depending on how the workflow is set up (configured in Step 10), participants may periodically be notified by email of their submission's progress.

For a visual reference, a diagram of the orchestrator and its interactions with Synapse is provided below:

Display a Submissions Dashboard (optional)

12. (Optional) Go to the staging site and click on the TABLES tab.  Create a new Submission View by clicking on Table Tools > Add Submission View.  Under "Scope", add the Evaluation Queue(s) you are interested in monitoring (you may add more than one), then click Next.  On the next screen, select which information to display, then hit Save.  A Synapse table of the submissions and their metadata is now available for viewing and querying.

The displayed information can later be changed by going to the Submission View, then clicking on Submission View Tools > Show Submission View Schema > Edit Schema.
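If you prefer the command line, the Synapse client can query the same view; a sketch, where syn789 is a hypothetical Submission View ID (verify the subcommand and syntax against synapse --help for your installed client version):

synapse query "select * from syn789"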

Launch the Challenge (one-time only)

13. On the live site, go to the CHALLENGE tab and share the appropriate Evaluation Queues with the Participants team, giving them "Can submit" permissions.

14. Use the copyWiki command provided by synapseutils to copy over all pages from the staging site to the live site.  When using copyWiki, it is important to also specify the destinationSubPageId parameter.  This ID can be found in the URL of the live site, where it is the integer following .../wiki/, e.g.
https://www.synapse.org/#!Synapse:syn123/wiki/123456

Example Script

import synapseclient
import synapseutils
syn = synapseclient.login()

synapseutils.copyWiki(
   syn, 
   "syn1234",  # Synapse ID of staging site
   destinationId="syn2345",  # Synapse ID of live site
   destinationSubPageId=999999  # ID following ../wiki/ of live site URL
)


Important!! After the initial copying, all changes to the live site should now be synced over with mirror-wiki; DO NOT use copyWiki again.  More on updating the Wikis under Update the Challenge.
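For reference, a sync typically looks like the following sketch, reusing the placeholder IDs from the copyWiki example above (syn1234 is the staging site and syn2345 the live site; check challengeutils mirror-wiki --help for the exact arguments in your installed version):

challengeutils mirror-wiki syn1234 syn2345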


Stop the Orchestrator

15. On the instance, enter: 
Ctrl + C 

followed by:

docker-compose down


If you are running the orchestrator in the background, skip the first step and simply enter:

docker-compose down

Note: docker-compose must be run in the SynapseWorkflowOrchestrator/ directory.  If you are not already in that directory, change directories first.

Note that if the Challenge is currently active but you need to stop the orchestrator (e.g. to make updates to the .env file), it may be helpful to first check whether any submissions are currently being evaluated.  If you are running the orchestrator in the background, you can monitor its activity by entering:

docker ps

If only one container is listed (the orchestrator itself, e.g. sagebionetworks/synapse-workflow-orchestrator), then no submissions are currently running.

Otherwise, read the logs on the terminal screen to determine whether there is currently activity.


