...
The flow starts when the user indicates to Synapse that they want to log in with an external IdP. Synapse redirects the browser (the “user agent”) to the IdP, which, after authenticating the user, returns an authorization code. The code is forwarded to Synapse, which uses its so-called client credentials to exchange it for an access token, an id token and (optionally) a refresh token. The inclusion of the id token is the fundamental extension that OpenID Connect makes to OAuth 2.0: in addition to authorizing access (via the access token), the IdP returns information it has about the user. The id token is a JSON Web Token (JWT), so it has a JSON payload, i.e. a key-value map. The keys are “claims” about the user, like “family_name” or “email”, and the values are the user data. If the IdP is a so-called “Broker” then it can return a GA4GH data passport. The claim name is “passport_jwt_v11” and the value is another, embedded JWT which has a claim “ga4gh_passport_v1”. The value of this claim is an array of GA4GH “visas”, described further below. An example from NIH RAS is here.
The OIDC specification provides for defining an expiration for user information. That is, the IdP can indicate that the recipient of an id token should consider the user information valid only for a limited time. Once the information has expired, a new id token should be obtained. An OIDC IdP has a “/userInfo” endpoint to which an access token can be passed. The result is a new collection of user info, returned either as a JWT or as a JSON object.
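As a minimal sketch of how a client might read these structures: the code below parses a userInfo response that arrives either as JSON or as a JWT, then pulls the visa array out of the embedded passport JWT. The outer claim name “passport_jwt_v11” is an assumption taken from the NIH RAS example and is passed as a parameter. The helper make_unsigned_jwt and the sample values are hypothetical and exist only to make the example runnable. Signature verification is deliberately omitted; a real client would have to verify every JWT against the issuer's published keys before trusting its claims.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def jwt_claims(token: str) -> dict:
    """Return the payload of a compact JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def parse_userinfo(content_type: str, body: str) -> dict:
    """Parse a userInfo response delivered as a JWT or as a JSON object."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/jwt":
        return jwt_claims(body)
    return json.loads(body)


def extract_visas(claims: dict, passport_claim: str = "passport_jwt_v11") -> list:
    """Return the visa array from the embedded passport JWT, or [] if absent."""
    embedded = claims.get(passport_claim)
    if embedded is None:
        return []
    return jwt_claims(embedded).get("ga4gh_passport_v1", [])


def make_unsigned_jwt(claims: dict) -> str:
    """Hypothetical helper: build a demo token with a dummy signature."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}.sig"


# Demo data only: the visa entries stand in for real visa JWTs.
passport = make_unsigned_jwt({"ga4gh_passport_v1": ["visa-jwt-1", "visa-jwt-2"]})
userinfo_body = json.dumps({"sub": "abc123", "passport_jwt_v11": passport})

claims = parse_userinfo("application/json; charset=utf-8", userinfo_body)
visas = extract_visas(claims)
print(visas)
```

Keeping the passport claim name as a parameter lets the same code handle a Broker that uses a different outer claim.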
...
In the DAC review phase, the DAC must verify the identity of the data user and determine if the proposed research is within the bounds of the permitted use(s) of the dataset. If approved, the data user and their institution must agree to the terms of use of the repository’s data through a data use or processing agreement. In the data use phase, the data user gains access to the dataset(s).
We may then conceive of different sorts of passport-linked access requirements in Synapse. One type would grant access to data if a trusted Broker indicates that the researcher has a certain status. Another type would grant access only if a visa provided by a trusted Broker indicates that the user has access to a certain (controlled) data set. We would expect the visa to refer to the data set by its ID in the namespace known to the Broker, as opposed to its Synapse ID. Therefore the Synapse access requirement would have to include the former ID in order to be able to evaluate the user’s visas.
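The two kinds of access requirement described above could be modeled roughly as follows. This is only a sketch under stated assumptions: the class and field names (Visa, PassportAccessRequirement, is_satisfied) are hypothetical and not part of Synapse, the issuer URL and data set ID are made up, and the visas are assumed to have already been verified and decoded. The visa type strings “ResearcherStatus” and “ControlledAccessGrants” are GA4GH standard visa types.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Visa:
    """Already-verified visa contents (hypothetical, simplified shape)."""
    issuer: str  # who asserted the visa
    type: str    # e.g. "ResearcherStatus" or "ControlledAccessGrants"
    value: str   # for a grant: the data set ID in the Broker's namespace


@dataclass(frozen=True)
class PassportAccessRequirement:
    """Hypothetical Synapse access requirement evaluated against visas."""
    trusted_issuers: FrozenSet[str]
    visa_type: str
    # External (non-Synapse) data set ID; None means any value is accepted.
    required_value: Optional[str] = None


def is_satisfied(req: PassportAccessRequirement, visas: Iterable[Visa]) -> bool:
    """True if any visa from a trusted issuer matches the requirement."""
    return any(
        v.issuer in req.trusted_issuers
        and v.type == req.visa_type
        and (req.required_value is None or v.value == req.required_value)
        for v in visas
    )


TRUSTED = frozenset({"https://broker.example.org"})  # made-up issuer

status_req = PassportAccessRequirement(TRUSTED, "ResearcherStatus")
dataset_req = PassportAccessRequirement(
    TRUSTED, "ControlledAccessGrants", required_value="dataset-123"
)

user_visas = [
    Visa("https://broker.example.org", "ControlledAccessGrants", "dataset-123"),
    Visa("https://untrusted.example.com", "ResearcherStatus", "bona-fide"),
]

print(is_satisfied(dataset_req, user_visas))  # True: trusted grant for the data set
print(is_satisfied(status_req, user_visas))   # False: status visa is not from a trusted issuer
```

Storing the external data set ID on the requirement, rather than on the Synapse entity, keeps the mapping between namespaces in one place.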
...
OIDC provides for an 'expires_in' value in the token response. This is the time, in seconds, until the provided access token expires. The client can use this to decide when to use the refresh token to get a new access token. Note that doing so may also update the refresh token.
ID Tokens, being JWTs, have an 'exp' time stamp, which is the epoch time after which the user information should no longer be considered valid. A passport should not be respected beyond this time limit.
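The two expiry checks can be sketched as below. The function names and the 30-second safety margin are assumptions made for illustration, not values from any specification, and a token that lacks an 'exp' claim is treated as not valid, which is a conservative choice rather than a stated requirement.

```python
import time
from typing import Optional

CLOCK_SKEW_SECONDS = 30  # assumed safety margin, not taken from any spec


def access_token_needs_refresh(received_at: float, expires_in: int,
                               now: Optional[float] = None) -> bool:
    """True once the access token is within the safety margin of expiring.

    received_at is the epoch time at which the token response arrived;
    expires_in is the lifetime in seconds reported in that response.
    """
    now = time.time() if now is None else now
    return now >= received_at + expires_in - CLOCK_SKEW_SECONDS


def passport_is_current(claims: dict, now: Optional[float] = None) -> bool:
    """True while the current time is still before the JWT's 'exp' claim."""
    now = time.time() if now is None else now
    exp = claims.get("exp")
    return exp is not None and now < exp


received_at = 1_700_000_000.0  # arbitrary fixed time for the example
print(access_token_needs_refresh(received_at, 3600, now=received_at + 60))    # False
print(access_token_needs_refresh(received_at, 3600, now=received_at + 3590))  # True
print(passport_is_current({"exp": 1_700_003_600}, now=received_at))           # True
```

Passing the current time as a parameter keeps both checks deterministic and easy to test.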
...
It seems some in the GA4GH community view passports as being provided to the server Clearinghouse by the client, rather than the server Clearinghouse retrieving them from Brokers as proposed above. See:
https://docs.google.com/document/d/1T3uYGS2yZflDbLRbG4uxhi8ICqk9C9xWPmJ0DQpFvDU1amOGLwAbKkMSU6up_dHGhsuUF9SVFlllO0qjQSJYuVI/edit#heading=h.5atd0vqkj5vqhttps://docs.google.com/document/d/1amOGLwAbKkMSU6up_dHGhsuUF9SVFlllO0qjQSJYuVI/edit#heading=h.4uzh9l6x5chihh644d5mw17b
In the diagram shown in “Approach 2”, the client (the blue column in the sequence diagram) will receive a userInfo object from the RAS server with a subject ID “paired” to that client by RAS. The Auth server, upon receiving a passport containing that subject ID, will not be able to resolve it against the subject IDs it received from RAS. It will not “know” which of its own users the passport represents. The question then is what security compliance standards (HIPAA, FISMA) and Sage Governance require with respect to tracking the identity of users who download controlled data.