This guide is meant to help organizers create a space within Synapse to host a crowd-sourced Challenge. A Challenge space provides participants with a Synapse project to learn about the Challenge, join the Challenge community, submit entries, track progress and view results. This article will focus on:

  • Setting up the Challenge infrastructure

  • Launching the Challenge space

  • Setting up the Challenge wiki

  • Updating the Challenge

  • Interacting with the submitted entries

Infrastructure Setup

At Sage Bionetworks, we generally provision an EC2 Linux instance for a challenge that leverages SynapseWorkflowOrchestrator to run CWL workflows.  These workflows are responsible for evaluating and scoring submissions (see the model-to-data-challenge-workflow GitHub repository for an example workflow).  If Sage Bionetworks is responsible for the cloud compute services, please give a general estimate of the computing power (memory, volume) needed.  We can also help with the estimates if you are unsure.

...

If Sage is not allowed access to the server, then it is the external site's responsibility to get the orchestrator running in whatever environment is chosen.  If Docker is not supported by the system, please let us know, as we do have workarounds (e.g. using Java to execute).
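If you are unsure whether the target environment supports Docker, a quick check such as the following can help (a minimal sketch; exact commands may vary by distribution):

Code Block
languagebash
# Verify that Docker and Docker Compose are available on the instance
docker --version
docker-compose --version

# Confirm that containers can actually be run
docker run --rm hello-world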

...

2. Create the Challenge site on Synapse.  This can easily be done with the challengeutils Python package. The instructions to install and use this package are located in the Challenge Utilities Repository.
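For quick reference, installation is typically a pip install (the repository linked above remains the authoritative source for installation and usage instructions):

Code Block
languagebash
# Install challengeutils from PyPI (see the Challenge Utilities Repository for details)
pip install challengeutils

# Confirm the command-line interface is available
challengeutils --help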

Code Block
languagebash
challengeutils create-challenge "challenge_name"

The create-challenge command will create two Synapse projects:


  • Staging - Organizers use this project during challenge development to share files and draft the challenge wiki. The create-challenge command initializes the wiki with the DREAM Challenge Wiki Template.

  • Live - Organizers use this project as the pre-registration page during challenge development, and it is replaced with a wiki once the challenge is ready to be launched. Organizers write the content of the wiki to provide detailed information about the challenge (e.g. challenge questions, data, participation, evaluation metrics). The wiki page must be made public to allow anyone to learn about the challenge and pre-register.

You may think of these two projects as development (staging project) and production (live project), in that all edits must be done in the staging site, NOT the live site. Maintenance of both projects enables wiki content to be edited and previewed in the staging project before the content is published to the live project. Changes to the live site are synced over with challengeutils' mirror-wiki (see Update the Challenge for more).

Info

Note: At first, the live site will be just one page providing a general overview of the Challenge.  There will also be a pre-register button that Synapse users can click if they are interested in the upcoming Challenge:

...

For the initial deployment of the staging site to live, use synapseutils' copyWiki command, NOT mirror-wiki (more on this under Launch the Challenge).

The create-challenge command will also create four Synapse teams (/wiki/spaces/DOCS/pages/1985446029) for the Challenge, each prefixed with the Challenge name:

  • Participants - This Synapse team includes the individual participants and teams who register for the challenge.

  • Organizers -

  • Administrators - The challenge organizers must be added to this team to grant them the permissions needed to share files and edit the wiki on the staging project.

  • Pre-registrants - This team is recommended while the challenge is under development. It allows participants to join a mailing list and receive notifications about the challenge launch.

Add Synapse users to the Organizers and Administrators teams as required.

3. On the live site, go to the CHALLENGE tab and create as many evaluation queues (/wiki/spaces/DOCS/pages/1985151345) as needed (for example, one per sub-challenge) by clicking on Challenge Tools > Create Evaluation Queue.  By default, create-challenge will create an evaluation queue for writeups, which you will already see listed here.

Note

Important: the 7-digit number in parentheses following each evaluation queue name is its evaluation ID. You will need these IDs later for Step 9, so make note of them.

...

4. While still on the live site, go to the FILES tab and create a new folder called "Logs" by clicking on Files Tools > Add New Folder.

Note

Important: This folder is where the participants' submission logs and prediction files are uploaded, so make note of its Synapse ID for use later in Step 9.

5. On the staging site, go to the FILES tab and create a new file by clicking on Files Tools > Upload or Link to a File > Link to URL.

For "URL", enter the link web address to the zipped download of the workflow infrastructure repository.  You may get this address by going to the repository and clicking on Code > right-clicking Download Zip > Copy Link Address:

...

Name the file whatever you like (we generally use "workflow"), then click Save.

Note

Important: This file will be what links the evaluation queue to the orchestrator, so make note of its Synapse ID for use later in Step 9.

6. Add an annotation to the file called ROOT_TEMPLATE by clicking on Files Tools > Annotations > Edit.  The "Value" will be the path to the workflow script, written as:

...

model-to-data-challenge-workflow-main/workflow.cwl

Note

Important: The ROOT_TEMPLATE annotation is what the orchestrator uses to determine which file in the repository is the workflow script.
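The UI steps above are the documented route; as an alternative sketch, the annotation can also be set from the Synapse command line client (assuming the client is installed and syn987 stands in for the file created in Step 5):

Code Block
languagebash
# Set the ROOT_TEMPLATE annotation on the linked workflow file (syn987 is a placeholder)
synapse set-annotations --id syn987 --annotations '{"ROOT_TEMPLATE": "model-to-data-challenge-workflow-main/workflow.cwl"}'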

7. Create a cloud compute environment with the required memory and volume specifications.  Once it spins up, log into the instance and clone the orchestrator:

...
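A typical clone step might look like the following (a sketch that assumes the orchestrator lives in the Sage-Bionetworks GitHub organization; adjust the URL if your copy is hosted elsewhere). The properties described in the table below are then set in the orchestrator's .env file before it is started:

Code Block
languagebash
# Clone the orchestrator onto the instance (URL assumed; adjust as needed)
git clone https://github.com/Sage-Bionetworks/SynapseWorkflowOrchestrator.git
cd SynapseWorkflowOrchestrator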

Property: SYNAPSE_USERNAME
Description: Synapse credentials under which the orchestrator will run. The provided user must have access to the evaluation queue(s) being serviced.
Example: dream_user

Property: SYNAPSE_PASSWORD
Description: API key for SYNAPSE_USERNAME. This can be found under My Dashboard > Settings.
Example: "abcdefghi1234=="

Property: WORKFLOW_OUTPUT_ROOT_ENTITY_ID
Description: Synapse ID of the "Logs" folder. Use the Synapse ID from Step 4.
Example: syn123

Property: EVALUATION_TEMPLATES
Description: JSON map of evaluation IDs to the workflow repo archive, where the key is the evaluation ID and the value is the Synapse ID of the file linked to the archive. Use the evaluation IDs from Step 3 as the key(s) and the Synapse ID from Step 5 as the value.
Example: {"9810678": "syn456", "9810679": "syn456"}

...

10. Return to the workflow infrastructure repository and clone it onto your local machine.  Open the repo in your editor of choice and make the following edits to the scripts:


Data-to-model template:

workflow.cwl
  • Update synapseid to the Synapse ID of the Challenge's goldstandard (required)
  • Set errors_only to false if an email notification about a valid submission should also be sent (optional)
  • Add metrics and scores to private_annotations if they are to be withheld from the participants (optional)

validate.cwl
  • Update the base image if the validation code is not Python (optional)
  • Remove the sample validation code and replace it with validation code for the Challenge (required)

score.cwl
  • Update the base image if the scoring code is not Python (optional)
  • Remove the sample scoring code and replace it with scoring code for the Challenge (required)

Model-to-data template:

workflow.cwl
  • Provide the admin user ID or admin team ID for principalid (2 steps: set_submitter_folder_permissions, set_admin_folder_permissions) (required)
  • Update synapseid to the Synapse ID of the Challenge's goldstandard (required)
  • Set errors_only to false if an email notification about a valid submission should also be sent (2 steps: email_docker_validation, email_validation) (optional)
  • Provide the absolute path to the data directory, denoted as input_dir, to be mounted during the container runs (required)
  • Set store to false if log files should be withheld from the participants (optional)
  • Add metrics and scores to private_annotations if they are to be withheld from the participants (optional)

validate.cwl
  • Update the base image if the validation code is not Python (optional)
  • Remove the sample validation code and replace it with validation code for the Challenge (required)

score.cwl
  • Update the base image if the scoring code is not Python (optional)
  • Remove the sample scoring code and replace it with scoring code for the Challenge (required)
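As a rough sketch of this edit cycle (the repository URL below is a placeholder for your own copy of the workflow template; replace it with your fork's address):

Code Block
languagebash
# Clone your copy of the workflow infrastructure repository (placeholder URL)
git clone https://github.com/<your-org>/model-to-data-challenge-workflow.git
cd model-to-data-challenge-workflow

# Edit workflow.cwl, validate.cwl, and score.cwl per the tables above,
# then commit and push so the orchestrator picks up the changes
git add workflow.cwl validate.cwl score.cwl
git commit -m "Customize workflow for the challenge"
git push origin main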

...

Code Block
docker-compose up -d

Info

Note: it may be helpful not to run the orchestrator in detached mode at first, so that you are made aware of any errors with the orchestrator setup right away.
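For example, running in the foreground first and switching to detached mode once things look stable might look like this (a minimal sketch):

Code Block
languagebash
# Run in the foreground first so setup errors are visible immediately
docker-compose up

# Once everything looks good, stop with Ctrl+C and restart in detached mode
docker-compose up -d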

If successful, the orchestrator will continuously monitor the evaluation queues specified by EVALUATION_TEMPLATES for submissions with the status RECEIVED.  When it encounters a RECEIVED submission, it will run the workflow specified by ROOT_TEMPLATE and update the submission status from RECEIVED to EVALUATION_IN_PROGRESS.  The orchestrator will also upload logs and prediction files to the folder specified by WORKFLOW_OUTPUT_ROOT_ENTITY_ID.  That folder will be structured like this:

...

For a visual reference, a diagram of the orchestrator and its interactions with Synapse is provided below:

...

Display a Submissions Dashboard (Optional)

12. Go to the staging site and click on the TABLES tab.  Create a new submission view by clicking on Table Tools > Add Submission View.  Under "Scope", add the evaluation queue(s) you are interested in monitoring (you may add more than one), then click Next.  On the next screen, select which information to display, then click Save.  A Synapse table of the submissions and their metadata is now available for viewing and querying.

The displayed information can be changed by going to the submission view, then clicking on Submission View Tools > Show Submission View Schema > Edit Schema.
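Once the submission view exists, it can also be queried like any other Synapse table, for example from the command line (a sketch, where syn789 is a placeholder for the submission view's Synapse ID):

Code Block
languagebash
# Query the submission view (syn789 is a placeholder ID)
synapse query "select * from syn789"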

Launch the Challenge (One-Time Only)

13. On the live site, go to the CHALLENGE tab and share the appropriate evaluation queues with the Participants team, giving them "Can submit" permissions.

...

Code Block
languagepy
import synapseclient
import synapseutils
syn = synapseclient.login()

synapseutils.copyWiki(
   syn, 
   "syn1234",  # Synapse ID of staging site
   destinationId="syn2345",  # Synapse ID of live site
   destinationSubPageId=999999  # ID following ../wiki/ of live site URL
)

Note

Important: After the initial copying, all changes to the live site should now be synced over with mirror-wiki; DO NOT use copyWiki again.  See more on updating the wikis under the Update the Challenge section below.

Stop the Orchestrator

15. On the instance, press Ctrl + C (or Cmd + C).

...

Code Block
docker-compose down

Info

Note: docker-compose must be run in the SynapseWorkflowOrchestrator/ directory.  If you are not already in that directory, change directories first.

If the Challenge is currently active but you need to stop the orchestrator for any reason (e.g. to make updates to the .env file), it may be helpful to first check whether any submissions are currently being evaluated.  If you are running the orchestrator in the background, you can monitor its activity by entering:

...
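If the orchestrator was started with Docker Compose, one way to follow its activity is to tail the Compose logs (a sketch; run from the SynapseWorkflowOrchestrator/ directory):

Code Block
languagebash
# Follow the orchestrator's logs to check for in-progress evaluations
docker-compose logs -f --tail=100

# List running containers; containers started for submissions may also appear here
docker ps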

Otherwise, if you are not running the orchestrator in the background, read the logs on the terminal screen to determine whether there is current activity.

Wiki Setup

Use the following questions to help plan and set up the Challenge site and evaluation queues.

How many Challenge questions (“sub-challenges”) will there be? 

...

What is the general timeline for the Challenge? 

Will there be challenge rounds?  If so, how many? 

Using rounds may help increase participation levels throughout the Challenge, as submission activity is usually high near the end of rounds/phases. It is best to have end dates fall mid-week if possible; this ensures that someone is on hand to help monitor and resolve issues should any arise.

Can users submit multiple submissions to a sub-challenge? 

If so, should there be a limit in frequency?  Examples: one submission per day, 3 submissions per week, 5 total, etc.

Setting a limit may help with potential overfitting, as well as prevent a user or team from monopolizing the compute resources.

What sort of submissions will the participants submit?
Common formats supported by Sage: prediction file (e.g. a CSV file), Docker image

...

Is the data sensitive?
If so, will a clickwrap be needed? A clickwrap is an agreement between the participant and the data provider that requires the participant to click a button agreeing to the policies put in place for data usage. Should log files be returned? Will there be a need to generate synthetic data?

...

Who will be responsible for providing/writing the validation and/or scoring scripts?
If Sage will be responsible, please provide as many details as possible regarding the format of a valid predictions file (for example, number of columns, names of column headers, valid values) and all exceptional cases. For scoring, please provide the primary and secondary metrics, as well as any special circumstances for evaluation (for example, the CTD2 BeatAML primary metric is an average Spearman correlation, calculated from each drug's Spearman correlation).

If Sage will not be responsible, please provide the scripts in either Python or R.  If needed, we provide sample scoring models in both Python and R that you may use as templates.

...

Regarding writeups: when will these be accepted?
Should participants submit their writeups during submission evaluations or after the Challenge has closed?

A writeup is required of all participants in order for them to be considered for final evaluation and ranking. A writeup should include all contributing persons, a thorough description of the methods and any usage of data outside of the Challenge data, as well as all scripts, code, and prediction file(s)/Docker image(s). We require all of these so that, if the team is a top performer, we can ensure their code and final output are reproducible.

Update the Challenge

Challenge Site and Wikis

Any changes to the Challenge site and its wiki/sub-wiki contents must be made in the staging site, not the live site.  The steps for updating the site are outlined below:

  1. Make whatever changes are needed to the staging Synapse project.

  2. Use challengeutils' mirror-wiki to push the changes to the live project.

...

Using the --dryrun flag prior to officially mirroring can be helpful in ensuring that the pages to be updated are actually the ones intended.  For example, an update was only made to the main wiki page of this particular Challenge; therefore, it is expected that only the first page will be updated:

...
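A minimal sketch of the mirroring step (assuming syn1234 is the staging project and syn2345 is the live project, as in the copyWiki example above; check challengeutils mirror-wiki --help for the exact arguments):

Code Block
languagebash
# Preview which wiki pages would change (no edits are made)
challengeutils mirror-wiki syn1234 syn2345 --dryrun

# Push the staging wiki changes to the live project
challengeutils mirror-wiki syn1234 syn2345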

Evaluation Queue Quotas

Updating an evaluation queue's quota can be done in one of two ways:

...

The deadline for the first round of a Challenge has been shifted from 12 January 2021 to 9 February 2021. The queues currently enforce a "1 submission per day" limit; therefore, the Number of Rounds will need to be updated, NOT the Round Duration.  This is because a "round" needs to stay one day long (86400000 epoch milliseconds) so that Synapse can enforce the Submission Limit of 1 per day.

Info

Note: If there is no daily (or weekly) submission limit, then updating the Round Duration is appropriate.  For example, the final round of a Challenge has a total Submission Limit of 2; that is, participants are only allowed two submissions during the entire phase.  A "round", this time, is considered to be the entire phase, so updating Round Duration (or end_date when using set-evaluation-quota) will be the appropriate step to take when updating the deadline for the queue(s).


To update the quota(s) on Synapse, go to the live site, then head to the CHALLENGE tab.  Edit the evaluation queues as needed; in this case, there are three queues and all of them will need to be updated.  There are 57 days between the start date (14 December 2020) and the new end date (9 February 2021), which translates to 57 "rounds".  Number of Rounds will therefore be increased to 57:

...


There is one important caveat to using the latter approach: blank values will replace any existing quotas if they are not set in this command.  That is, if the command above had not set round_start as 2020-12-14T21:00:00, then when the command completes, the evaluation queue will no longer have a starting date, e.g.

...
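As a sketch of the command-line route, a quota update might look like the following (the flag names are assumptions based on the quota fields described above; confirm them with challengeutils set-evaluation-quota --help). Because unset values are blanked, every field you want to keep should be supplied:

Code Block
languagebash
# Hypothetical quota update for evaluation 9810678; supply every field you want to keep
challengeutils set-evaluation-quota 9810678 \
    --round_start 2020-12-14T21:00:00 \
    --round_duration 86400000 \
    --num_rounds 57 \
    --submission_limit 1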

For any changes to the infrastructure workflow steps and/or the scripts involved in the workflow (e.g. run_docker.py), simply make the edits to the scripts, then push the changes.

Info

Note: dry-runs should always follow a change to the workflow; this will ensure things are still working as expected.

Interacting with Submissions

Throughout the challenge, participants will continuously submit to the evaluation queues. To manage continuous submissions, organizers can automate validation and scoring with the Synapse Python client evaluation commands.

Revealing Submissions and Scores

Organizers can create leaderboards using submission views (/wiki/spaces/DOCS/pages/2011070739) when scores are ready to be revealed to participants.

Submission views are sorted, paginated, tabular forms that can display submission data and annotations (e.g. scores from the scoring application and other metadata) and update as annotations or scores change. A submission view can provide real-time insight into the progress of a challenge.

Learn more about adding leaderboards in the evaluation queues documentation (/wiki/spaces/DOCS/pages/1985151345).

/wiki/spaces/DOCS/pages/1985151441, /wiki/spaces/DOCS/pages/1985151345, /wiki/spaces/DOCS/pages/2011070739
