...

Otherwise, read the logs on the terminal screen to determine whether there is currently activity.

Wiki Setup

Use the following questions to help plan and set up the Challenge site and Evaluation Queues.

  • How many Challenge questions (“sub-challenges”) will there be? 

Will the participants submit a single file/model to answer all sub-challenges or will they submit a different file/model per sub-challenge?


  • What is the general timeline for the Challenge? 

Will there be rounds?  If so, how many? 


Using rounds may help increase participation levels throughout the Challenge, as submission activity is usually high near the end of rounds/phases.

  • It is best to have end dates fall mid-week if possible; this will ensure that someone is on hand to help monitor and resolve issues should any arise.


  • Can users submit multiple submissions to a sub-challenge? 

If so, should there be a limit on submission frequency?  Examples: one submission per day, 3 submissions per week, 5 total, etc.


Setting a limit may help reduce potential overfitting as well as prevent a user/team from monopolizing the compute resources.


  • What sort of submissions will the participants submit?
    Common formats supported by Sage: prediction file (e.g. CSV file), Docker image

  • When can the truth files (gold standard) and training data (if any) be expected?
    Will the data be released upon the Challenge end? After the embargo? Never?

  • Is the data sensitive?
    If so, will a clickwrap be needed (an agreement between the participant and the data provider that requires the former to click a button indicating that they agree to the policies put in place regarding data usage)? Should log files be returned? Will there be a need to generate synthetic data?

  • Who will be responsible for providing/writing the validation and/or scoring scripts?
    If Sage, please provide as many details as possible regarding the format of a valid predictions file (e.g. number of columns, names of column headers, valid values, etc.) and all exceptional cases. For scoring, please provide the primary and secondary metrics, as well as any special circumstances for evaluations, e.g. the CTD2 BeatAML primary metric is an average Spearman correlation, calculated from each drug’s Spearman correlation.

    If not Sage, please provide the scripts in either Python or R. If needed, we do provide sample scoring models that you may use as a template, available in both Python and R (a minimal sketch of a scoring script is included after this list).

  • Are scores returned to the participants immediately or should they be withheld until the Challenge end?
    A typical Challenge will immediately return the scores in an email upon evaluation completion; however, there have been past Challenges that did not return scores until after the end date.

    There is also a “hybrid” approach, in which scores are immediately returned during the Leaderboard/Testing Phase but withheld during the Final/Validation Phase (in which participants do not know their performance until after the Challenge end).

  • When should the evaluation results/leaderboard be accessible to the participants?
    Some past Challenges had active leaderboards (i.e. participants could readily view their ranking throughout the evaluation round) whereas other Challenges did not release the leaderboards until the round/Challenge was over.

  • Regarding writeups: when will these be accepted?
    Should participants submit their writeups during submission evaluations or after the Challenge has closed?

    A writeup is required of all participants in order to be considered for final evaluation and ranking. A writeup should list all contributing persons, provide a thorough description of the methods and any usage of data outside of the Challenge data, and include all scripts, code, and prediction file(s)/Docker image(s). We require all of these so that, should a team be a top performer, we can ensure their code and final output are reproducible.
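
As a point of reference for the scoring-script question above, here is a minimal sketch of what a scoring script might look like in Python. The file names, column headers ("drug", "sample", "auc"), and choice of metric (an average per-drug Spearman correlation, echoing the CTD2 BeatAML example) are illustrative assumptions only; the actual format and metrics are defined per Challenge.

import pandas as pd
from scipy.stats import spearmanr


def score(predictions_csv, goldstandard_csv):
    """Return the primary metric for one submission (sketch only)."""
    pred = pd.read_csv(predictions_csv)
    gold = pd.read_csv(goldstandard_csv)

    # Join predictions to the gold standard on the (assumed) identifier columns.
    merged = gold.merge(pred, on=["drug", "sample"], suffixes=("_gold", "_pred"))

    # Average per-drug Spearman correlation (as in the CTD2 BeatAML example);
    # "auc" is an assumed value column and will differ per Challenge.
    per_drug = merged.groupby("drug").apply(
        lambda d: spearmanr(d["auc_gold"], d["auc_pred"]).correlation
    )
    return float(per_drug.mean())


if __name__ == "__main__":
    print(score("predictions.csv", "goldstandard.csv"))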



Update the Challenge

Challenge Site

Wikis

Any changes to the Challenge site and its Wiki/sub-Wiki contents must be done in the staging site, not the live one.  Steps for updating the site are outlined below:


  1. Make whatever changes needed to the staging Synapse Project.

  2. Use challengeutils' mirror-wiki to push the changes to the live Project.


Note: using the --dryrun flag prior to officially mirroring can be helpful in ensuring that the pages to be updated are actually the ones intended.  For example, an update is only made on the main Wiki page of this particular Challenge; therefore, it is expected that only the first page will be updated:


...

Evaluation Queue Quotas

Updating an Evaluation Queue’s quota can be done in one of two ways:


  1. On Synapse via Edit Evaluation Queue.


  2. In the terminal via challengeutils' set-evaluation-quota:


...


Example

The deadline for the first round of a Challenge has been shifted from 12 January 2021 to 9 February 2021. The queues currently implement a "1 submission per day" limit; therefore, the Number of Rounds will need to be updated, NOT the Round Duration.  This is because a "round" needs to stay one day long (86400000 Epoch milliseconds) so that Synapse can enforce the Submission Limit of 1 per day.


Note: if there is no daily (or weekly) submission limit, then updating the Round Duration would be appropriate.  For example, the final round of a Challenge has a total Submission Limit of 2, that is, participants are only allowed two submissions during the entire phase.  A "round", this time, is considered to be the entire phase, so updating Round Duration (or end_date when using set-evaluation-quota) will be the appropriate step to take when updating the deadline for the queue(s).


To update the quota(s) on Synapse, go to the live site, then head to the CHALLENGE tab.  Edit the Evaluation Queues as needed; in this case, there are three queues and all of them will need to be updated.  There are 57 days between the start date (14 December 2020) and the new end date (9 February 2021), which translates to 57 "rounds".  Number of Rounds will therefore be increased to 57:


...

Notice how Round Duration remains the same at 1 day.
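
As a sanity check on the arithmetic above, a short calculation (just a sketch, not part of the infrastructure) reproduces both the 57-round figure and the one-day round duration in Epoch milliseconds:

from datetime import datetime

# A "round" stays one day long so Synapse can keep enforcing the 1-per-day limit.
ROUND_DURATION_MS = 24 * 60 * 60 * 1000   # 86400000 Epoch milliseconds

round_start = datetime(2020, 12, 14)
new_deadline = datetime(2021, 2, 9)

number_of_rounds = (new_deadline - round_start).days
print(number_of_rounds)    # 57
print(ROUND_DURATION_MS)   # 86400000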


Updating with challengeutils' set-evaluation-quota is more or less the same (except that round_duration must be given in Epoch milliseconds):


...


There is one important caveat to using the latter approach:


blank values will replace any existing quotas if they are not set during this command 


That is, if the command above had not set round_start as 2020-12-14T21:00:00, then when the command completes, the Evaluation Queue will no longer have a starting date, e.g.


...

quota no longer has a firstRoundStart property
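
If you prefer to update the quota programmatically while avoiding this caveat, one option is to read the current quota first and only change the field(s) that actually need updating, so that nothing is silently blanked. Below is a minimal sketch using the Synapse Python client (not challengeutils); the evaluation ID is a placeholder, and the quota field names beyond firstRoundStart (roundDurationMillis, numberOfRounds, submissionLimit) are assumed from Synapse's Evaluation quota model.

import synapseclient

syn = synapseclient.login()

# "9614112" is a placeholder evaluation ID -- substitute the real queue's ID.
evaluation = syn.getEvaluation("9614112")

# Carry over the existing quota so unchanged fields (e.g. firstRoundStart)
# are preserved, then update only what actually changes.
quota = dict(evaluation.get("quota", {}))
quota["numberOfRounds"] = 57
evaluation.quota = quota

evaluation = syn.store(evaluation)
print(evaluation.quota)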

Workflow Steps

For any changes to the infrastructure workflow steps and/or scripts involved with the workflow (e.g. run_docker.py), simply make the edits to the scripts, then push the changes.


Note: dry-runs should always follow a change to the workflow; this will ensure things are still working as expected.