
Collecting Writeups

At the end of a typical challenge on Synapse, all participating users and teams will be asked to provide a writeup on their submission(s). The writeup should include details such as methodologies, external data sources (if any), and source code and/or Docker images. Participants and teams will submit their writeups as Synapse projects.

Although not required for every challenge, writeups are typically a criterion for challenge incentives such as “top performer” eligibility, byline authorship, and more.

If you are a challenge organizer and your challenge will require writeups, this article will help you:

  • Set up the infrastructure for collecting and validating writeup submissions

  • Display the writeups on a leaderboard


Workflow Setup

Requirements

  • One Sage account

  • (for local testing) CWL runner of choice, e.g. cwltool

  • Access to cloud compute services, e.g. AWS, GCP, etc.

Outcome

This infrastructure will continuously monitor the writeup queue for new submissions, perform a quick validation, and, if valid, create an archive of the submission. The archive ensures that a copy of the writeup is always available to the organizers team, in case the project owner deletes the original copy or removes access to it.
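For orientation, here is a rough sketch (Python, using synapseclient and synapseutils; all IDs are placeholders, and error handling is simplified) of the monitor-validate-archive loop. The authoritative logic lives in the CWL workflow configured in the steps below.

import synapseclient
import synapseutils
from synapseclient.core.exceptions import SynapseHTTPError

syn = synapseclient.login()

CHALLENGE_SITE = "syn123"  # placeholder: synID of the challenge live site
WRITEUP_QUEUE = 9614543    # placeholder: 7-digit evaluation ID

# Check every newly received writeup submission.
for submission, status in syn.getSubmissionBundles(WRITEUP_QUEUE, status="RECEIVED"):
    try:
        # Participants must not submit the challenge site itself.
        assert submission.entityId != CHALLENGE_SITE, "challenge site submitted"
        # If the organizers cannot retrieve the project, it was not shared correctly.
        syn.get(submission.entityId, downloadFile=False)
    except (AssertionError, SynapseHTTPError):
        status.status = "INVALID"
    else:
        # Archive a copy so it survives deletion of the original project.
        synapseutils.copy(syn, submission.entityId)
        status.status = "ACCEPTED"
    syn.store(status)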

Steps

1. On the live project, go to the Challenge tab and check whether there is already an evaluation queue for collecting writeups (skip to Step 3 if so).

Note that by default, evaluation queues are only accessible to the evaluation creator; if you are not currently an admin for the challenge, double-check with other organizers to ensure that you have the correct permissions to all available queues.

 

2. Create a new evaluation queue if one is not available. Click on Challenge Tools > Add Evaluation Queue, and name it something like <Challenge Name> Writeup (a scripted alternative is sketched at the end of this step).

By default, the newly-created queue will only be accessible to the creator, in this case, you. For now, update its sharing settings so that:

  • The organizers team has Can score permissions

  • The admin team has Administrator permissions

  • Anyone on the web has Can view permissions

We recommend not sharing the evaluation queue with the participants team until after the workflow has been tested.

See Submit Writeup Dry-runs below for more details.
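If you prefer to script queue creation, here is a minimal sketch with synapseclient; the queue name and contentSource synID are placeholders, and the sharing settings still need to be applied as described above:

import synapseclient
from synapseclient import Evaluation

syn = synapseclient.login()

# Placeholders: queue name and the synID of the live challenge project.
queue = Evaluation(
    name="Awesome Challenge Writeup",
    description="Writeup submissions for the Awesome Challenge.",
    contentSource="syn123",
)
queue = syn.store(queue)
print(queue.id)  # the 7-digit evaluation ID used in later steps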

 

3. Update the quota of the writeup evaluation queue, such as the Duration and Submission Limits. Generally, there are no submission limits for writeups, so this field can be left blank.

Take note of the 7-digit evaluation ID, as it will be needed later in Step 9.
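The quota can also be set programmatically; a sketch assuming the queue's quota accepts the fields firstRoundStart, roundDurationMillis, numberOfRounds, and submissionLimit (the evaluation ID is a placeholder):

import synapseclient

syn = synapseclient.login()

queue = syn.getEvaluation(9614543)  # placeholder evaluation ID
queue.quota = {
    "firstRoundStart": "2024-01-01T00:00:00.000Z",    # when the queue opens
    "roundDurationMillis": 90 * 24 * 60 * 60 * 1000,  # one 90-day round
    "numberOfRounds": 1,
    # "submissionLimit" is omitted: writeups are typically unlimited
}
syn.store(queue)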

 

4. Go to Sage-Bionetworks-Challenges and create a New repo.

For “Repository template”, select Sage-Bionetworks-Challenges/writeup-workflow. The “Owner” can be left as the default, and “Repository name” can be anything you’d like.

Past naming examples are:

  • Anti-PD1-DREAM-Writeups

  • CTD2-Chemosensitivity-Writeup

  • RA2-Writeup-Infrastructure

Before creating the repo, switch the visibility from Private to Public.

 

5. Clone the repo onto your machine. Using a text editor or IDE, make the following updates to workflow.cwl:

Line Number | TODO | Motivation
31 | Update "jane.doe" with the organizers team name. | This identifies the Synapse team of organizers for the challenge.
45 | Update "syn123" with the synID of the challenge live site. | This will check that the participant/team did not submit the challenge site itself as their writeup.
47-50 | Uncomment. | Lines 49-50 check that the writeup project is AT LEAST accessible to the organizers team (as defined on line 31); lines 47-48 check that the writeup is accessible to anyone on the web.

Push the changes up to GitHub when done.

 

6. On the staging project, go to the Files tab and click on the upload icon to Upload or Link to a File:

 

7. In the pop-up window, switch tabs to Link to URL. For "URL", enter the web address of the zipped download of the workflow infrastructure repository. You can get this address by going to the repository and clicking on Code > right-clicking Download ZIP > Copy Link Address:

Click Save.

 

8. Add an annotation called ROOT_TEMPLATE to the file by clicking on the annotations icon, followed by Edit:

For “Value”, enter the path to the workflow script, written as:

{name of repo}-{branch}/workflow.cwl

For example: my-writeup-repo-main/workflow.cwl
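Steps 7 and 8 can also be scripted together; a minimal sketch with synapseclient, where the zip URL, staging project synID, and repository name are placeholders:

import synapseclient
from synapseclient import File

syn = synapseclient.login()

# Step 7: link to the zipped download of the workflow repo.
zip_url = "https://github.com/Sage-Bionetworks-Challenges/my-writeup-repo/archive/refs/heads/main.zip"
workflow = syn.store(File(path=zip_url, name="writeup-workflow", parent="syn789", synapseStore=False))

# Step 8: annotate the file with the path to the workflow script.
workflow["ROOT_TEMPLATE"] = "my-writeup-repo-main/workflow.cwl"
syn.store(workflow, forceVersion=False)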

 

9. There are two approaches to running the writeup workflow. You can either:

a) add the evaluation queue and workflow to an existing orchestrator instance, by adding another key-value pair to EVALUATION_TEMPLATES in the .env file (see the example below), or

b) create a new instance (a t3.small EC2 instance should be sufficient) and set up the orchestrator on that machine. See Steps 7-9 and 11 of Creating and Managing a Challenge for more details on how to set this up.
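For option (a), EVALUATION_TEMPLATES is a JSON map from evaluation ID to the synID of the linked workflow archive (the file from Step 7). With placeholder IDs, the updated entry might look like:

EVALUATION_TEMPLATES={"9614001": "syn890", "9614543": "syn891"}

where 9614001/syn890 represent an existing entry (e.g. the main prediction queue), and 9614543/syn891 are the writeup evaluation ID from Step 3 and the workflow file from Step 7.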

10. (optional, but recommended for Display Writeups in a Leaderboard below) Create a Submission View so that you can track and monitor the writeup submissions.

We recommend the following schema for monitoring writeup submissions (a scripted alternative follows the table):

Column Name | Description | Facet values?
id | Submission ID |
createdOn | Date and time of the submission (stored as Epoch, rendered as MM/dd/yyyy, hh:mm:ss) |
submitterid | User or team who submitted (stored as user or team ID, rendered as username or team name) | Recommended
entityid | Link to the writeup | Not recommended
archived | Link to the writeup archive | Not recommended
status | Workflow status of the submission (one of RECEIVED, EVALUATION_IN_PROGRESS, ACCEPTED, INVALID) | Recommended
submission_status | Evaluation status of the submission (one of None, VALIDATED, SCORED, INVALID) | Recommended
submission_errors | Validation errors for the predictions file (if any) | Not recommended
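If you prefer to script this step, here is a minimal sketch assuming synapseclient's SubmissionViewSchema (IDs are placeholders); the columns can then be adjusted in the web schema editor to match the recommendations above:

import synapseclient
from synapseclient import SubmissionViewSchema

syn = synapseclient.login()

# Placeholders: the staging project and the writeup evaluation ID.
view = SubmissionViewSchema(
    name="Writeup Submissions",
    parent="syn789",
    evaluationIds=["9614543"],
)
view = syn.store(view)
print(view.id)  # synID of the Submission View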

Submit Writeup Dry-runs

Use an existing Synapse project (or create a new one) and submit it to the writeup queue. You can submit directly to the queue from the Challenge tab of the live project (or programmatically; see the sketch after the table below):

Some cases we recommend testing for are:

Test Case | Workflow Configuration | Expected Outcome
Submitting the challenge site | | INVALID
Submitting a private Synapse project | Lines 49-50 are used (writeup should be accessible to the organizers team) | INVALID
Submitting a private Synapse project | Lines 47-48 are used (writeup should be publicly accessible) | INVALID
Submitting a private Synapse project that is shared with the organizers team | Lines 49-50 are used (writeup should be accessible to the organizers team) | VALID
Submitting a private Synapse project that is shared with the organizers team | Lines 47-48 are used (writeup should be publicly accessible) | INVALID
Submitting a public Synapse project | Lines 47-48 and/or lines 49-50 are used | VALID
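Dry-run submissions can also be made programmatically; a minimal sketch with synapseclient, using placeholder IDs:

import synapseclient

syn = synapseclient.login()

# Placeholders: a test writeup project and the writeup evaluation ID.
writeup = syn.get("syn999")
syn.submit(evaluation=9614543, entity=writeup, name="Dry-run writeup")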

Once you are satisfied that the writeup workflow meets your expectations, remember to open the queue to the challenge participants!

You can do so by updating its sharing settings so that the participants team has Can submit permissions.

Display Writeups in a Leaderboard

Once the challenge has concluded and top performers are ready to be announced, a “Final Results” leaderboard is typically created to display writeups alongside the final submissions and their scores. This enables the sharing of solutions and algorithms submitted to the challenge.

Creating this leaderboard can be achieved with a Materialized View table.

Requirements

  • synID of Submission View for final submissions

  • synID of Submission View for writeups

  • challengeutils (for syncing changes)

Steps

1. Go to the staging project and click on the Tables tab. Create a new Materialized View by clicking on Add New… > Add Materialized View.

2. Under “Defining SQL”, enter the following query:

SELECT s.id AS id,
       s.createdOn AS createdOn,
       s.submitterid AS submitterid,
       s.status AS status,
       (score columns),
       w.entityid AS writeup,
       w.archived AS archived_writeup
  FROM syn123 s
  JOIN syn456 w ON (s.submitterid = w.submitterid)
 WHERE s.status = 'ACCEPTED'
   AND w.status = 'ACCEPTED'

 

This query will join the Submission View for submissions (s) with the Submission View for writeups (w), using the submitterids from both Views as the index. Rows are filtered to only include valid submissions from both Views (*.status = 'ACCEPTED') – these clauses are optional and can be removed if desired.

Columns selected from the submissions View (s) include:

  • submission ID (s.id)

  • submission date (s.createdOn)

  • participant/team (s.submitterid)

  • status (s.status)

  • any annotations related to scores

Columns selected from the writeups View (w) include:

  • synID of the writeup (w.entityid)

  • synID of the writeup archive (w.archived)

The query also assigns aliases, e.g. w.entityid AS writeup, for readability and easier querying.

 

3. Click Save. A table with the selected columns from each Submission View will now be available for viewing and querying.
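Steps 1-3 can also be scripted; a minimal sketch assuming synapseclient exposes MaterializedViewSchema (the score-annotation columns are omitted here, and the synIDs are the same placeholders as in the query above):

import synapseclient
from synapseclient import MaterializedViewSchema

syn = synapseclient.login()

# syn123/syn456 are the placeholder Submission Views from the query above;
# the parent (staging project) synID is also a placeholder.
leaderboard = MaterializedViewSchema(
    name="Final Results",
    parent="syn789",
    definingSQL=(
        "SELECT s.id AS id, s.createdOn AS createdOn, "
        "s.submitterid AS submitterid, s.status AS status, "
        "w.entityid AS writeup, w.archived AS archived_writeup "
        "FROM syn123 s JOIN syn456 w ON (s.submitterid = w.submitterid) "
        "WHERE s.status = 'ACCEPTED' AND w.status = 'ACCEPTED'"
    ),
)
leaderboard = syn.store(leaderboard)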

 

4. Continuing on the staging project, switch to the Wiki tab and go to the “Results” page (create a new page if it’s not available). Click on the pencil icon to Edit Project Wiki:

 

5. In the pop-up window, click on + Insert button > Table: Query on a Synapse Table/View:

 

6. For “Query”, enter something like:

SELECT * FROM syn123

where syn123 is the synID of the Materialized View created in Step 3. Optionally add aliases and/or clauses to the query, e.g. to exclude certain columns such as archived_writeup. Click on the Preview button in the top-left corner to preview the prospective leaderboard, e.g.

 

7. Click Save once you are satisfied with the final leaderboard.

 

8. Sync changes to the live site with challengeutils' mirror-wiki:

challengeutils mirror-wiki staging-synid live-synid

where staging-synid and live-synid are the synIDs of the staging and live projects, respectively.

For additional assistance or guidance, contact the Challenges and Benchmarking team at cnb@sagebase.org.