For the Common Mind Consortium, data access is to be approved via the NIMH Repository and Genomic Resources (NRGR) while the data is to be hosted in Synapse. The two systems each have their own user accounts and identifiers and cannot (at least currently) reconcile users against an external user database. Moreover, NRGR exposes no API for checking approval status, nor can it (at least currently) be modified to signal approval via Synapse's APIs. These constraints make it difficult to communicate a user's NRGR approval back to Synapse securely and unambiguously. Fortunately, NRGR does allow arbitrary supplemental information to accompany an application, and we can leverage this to solve the problem.
Our solution (diagrammed below) is as follows:
(1) An applicant starts the process by clicking a button in the Synapse UI. This creates a "membership request" event for a predefined data access group.
(2) A background process monitors the data access group for new membership requests. When a new request appears, the process generates an approval token and emails it to the applicant. This approval token has the following fields:
- the user's Synapse ID,
- the IDs of the Access Requirement objects in Synapse which restrict the hosted data,
- a time stamp,
- a hash-based message authentication code (HMAC) which digitally signs the token, ensuring that such tokens can be generated only by Synapse.
Additionally, a record of the sent email is placed in a Synapse table for future reference.
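The token described in step (2) could be built roughly as follows. This is a minimal sketch, not the actual implementation: the field separator, the choice of HMAC-SHA256, the secret key value, and the function name create_token are all assumptions made for illustration.

```python
import hashlib
import hmac
import time

# Hypothetical secret; the scheme relies on this key being known only to Synapse.
SECRET_KEY = b"example-secret-key"


def create_token(user_id, access_requirement_ids, secret=SECRET_KEY, timestamp=None):
    """Return a token of the (assumed) form userId|arId,arId|timestamp|hmac."""
    if timestamp is None:
        timestamp = int(time.time())
    # Sort the Access Requirement IDs so the same request always yields the same payload.
    ar_part = ",".join(str(i) for i in sorted(access_requirement_ids))
    payload = f"{user_id}|{ar_part}|{timestamp}"
    # The HMAC covers every field, so changing any of them invalidates the token.
    mac = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}|{mac}"
```

Because the HMAC is computed over the user ID, the Access Requirement IDs, and the time stamp together, a token cannot be reused for a different user or a different data set without knowledge of the key.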
(3) The user receives the email and is instructed to include it with their NRGR application. If the user is the head of a lab, then each lab member wishing to access the data must perform step (1) and the applicant must include all the emailed tokens with the NRGR application.
(4) Upon approval of the applicant(s), the token email(s) are sent to a predefined email address. The email includes a digital signature, authenticating it as being from the NIH.
(5) Upon receipt of the email, the digital signature is validated, the tokens are extracted, and their HMACs are validated. Since the tokens are time stamped, a time limit can be imposed, ensuring out-of-date requests are rejected. The tokens' contents are used to generate Access Approvals in Synapse, unlocking the data for those approved in NRGR. The applicants are added to the data access group. An email notification alerts the applicants to the completion of the process. The Synapse table record created in step (2) is updated, providing the Synapse Access and Compliance Team (ACT) a dashboard of approval progress. If a token is rejected (e.g., if the data is corrupt, the token is too old, or the signature is invalid), this is noted in the table. If the applicant's Synapse user ID can be discerned from the record, an email rejection notice is sent to them.
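The token checks in step (5) can be sketched as below. This is an illustration under stated assumptions rather than the deployed code: it assumes a token of the form userId|arId,arId|timestamp|hmac signed with HMAC-SHA256, and the key, the 180-day limit, and the names validate_token and TokenRejected are hypothetical. Validation of the email's own digital signature is a separate step and is not shown.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"example-secret-key"      # hypothetical; must match the signing key
MAX_AGE_SECONDS = 180 * 24 * 3600       # hypothetical time limit for a token


class TokenRejected(Exception):
    """Raised with a reason that can be recorded in the tracking table."""


def validate_token(token, secret=SECRET_KEY, max_age=MAX_AGE_SECONDS, now=None):
    """Return (user_id, access_requirement_ids, timestamp) or raise TokenRejected."""
    parts = token.strip().split("|")
    if len(parts) != 4:
        raise TokenRejected("corrupt token")
    user_id, ar_part, ts_text, mac = parts
    payload = "|".join(parts[:3])
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8")):
        raise TokenRejected("invalid signature")
    try:
        timestamp = int(ts_text)
        ar_ids = [int(x) for x in ar_part.split(",")]
    except ValueError:
        raise TokenRejected("corrupt token")
    now = time.time() if now is None else now
    if now - timestamp > max_age:
        raise TokenRejected("token too old")
    return user_id, ar_ids, timestamp
```

The three rejection reasons mirror the cases listed in step (5). Checking the HMAC before parsing the fields means that no unauthenticated content is interpreted.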
The NRGR approval process considers an entire lab to be approved once a lab P.I. has completed their process. Thus a new lab member may request access without involving NRGR. In this case we require the new user to perform step (1) and provide the token to the ACT, which is authorized to trigger step (5), bypassing the email from NIH.