This document has been superseded by: API Extensions for GA4GH Passport Integration
Table of Contents |
---|
Introduction
A data contributor works with the Sage Access and Compliance Team (ACT) to establish that a new data set added to Synapse can only be downloaded by NIH-qualified researchers. This means that when a caller attempts to download this data set, Synapse must first check with the NIH to determine whether the caller is actually an NIH-qualified researcher. GA4GH provides a technical specification to facilitate this type of authentication/authorization exchange between two systems: GA4GH Passports.
According to the GA4GH specification, Synapse would be a Passport Clearinghouse, while the NIH service would be the Passport Broker. This document covers the API changes needed to support this type of use case. Here is a high-level summary of what we propose to build into Synapse:
A new AccessRequirement (AR) type that can be created/managed by ACT to define one or more Claims that the caller must have in order to download restricted data.
A new Action type that informs callers when a passport visa is required in order to download a file.
Extend the Synapse OIDC Authentication system:
Add new OAuthProviderBinding implementation to connect with each passport Broker that we wish to support.
Extend the Synapse generated access_tokens system to append passport claims provided by passport Brokers to the Synapse access_token.
Add a passport visa interceptor that will validate passport visas from the Synapse access token and forward the valid subset to the thread local. Extend UserManagerImpl.getUserInfo() to add visas from the thread local to the resulting UserInfo object.
Extend the EntityAuthorizationManagerImpl to match AR visa conditions to the principal’s visas in the UserInfo.
Extend AsynchJobStatusManagerImpl to append visas from the UserInfo to the job’s status.
PassportACTManagedAccessRequirement
...
Code Block | ||
---|---|---|
| ||
{ "description": "In order to download a file the user will need to provide one or more GA4GH passport visa claims. Such claims will be provided by the linked GA4GH passport broker.", "implements": [ { "$ref": "org.sagebionetworks.repo.model.download.Action" } ], "properties": { "brokerRedirectUrl": { "description": "The redirect URL of the passport broker that provides the passport visa claims needed to access data.", "type": "string" }, "visaNames": { "description": "The names of the visas to be provided.", "type": "array", "items": { "type": "string" } } } } |
...
Synapse already uses OpenID Connect (OIDC) to support login via “Google” and to link an ORCID to a Synapse account. For the login case, information from Google is used to link the caller to a Synapse user ID. The final product of the OIDC process is a new Synapse access token that encodes both the user’s ID and the scope of the token. The Synapse access token is a signed JSON Web Token (JWT). It can be used by both web and programmatic clients to authenticate all Synapse API requests.
The GA4GH ‘Data Passports’ specification extends the basic OIDC process to enable a passport broker to provide a passport clearinghouse with a passport containing one or more visa claims. See also: ‘AAI OIDC Profile’. Specifically, the access token (also a JWT) provided by the broker to the clearinghouse will include an entry for the caller’s passport.
We propose extending the Synapse OIDC support not only to “login” via a broker but also to capture the broker-provided passport in the resulting Synapse access token.
...
By appending claims to the resulting Synapse access token, we can ensure that the visas are available to both web and command-line clients. In the next section we will cover how Synapse access tokens with embedded visa claim JWTs can be used for download authorization.
...
Validate signature and expiration of each visa.
Validate the conditional relationship between visas. For example, a visa might include a condition such that it is only valid if another visa also exists. In such a case, the dependent visa would be invalid if its dependency were missing.
...
Note: Since the interceptor excludes invalid visas, PassportVisa.json does not include any fields used for validation or signing, such as conditions, asserted, alg, exp, jti, iat…
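The two validation steps can be sketched roughly as follows. This is a minimal illustration and not the Synapse implementation: the `Visa` and `VisaFilter` classes are hypothetical, signature verification is reduced to a pre-computed boolean, and a condition is reduced to a single required visa name, which is much simpler than the conditions allowed by the GA4GH specification.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified visa. Real visas are signed JWTs.
final class Visa {
    final String name;
    final long expiresAtEpochSec;
    final boolean signatureValid;  // stand-in for real JWT signature verification
    final String requiredVisaName; // null when the visa carries no condition

    Visa(String name, long expiresAtEpochSec, boolean signatureValid, String requiredVisaName) {
        this.name = name;
        this.expiresAtEpochSec = expiresAtEpochSec;
        this.signatureValid = signatureValid;
        this.requiredVisaName = requiredVisaName;
    }
}

// Hypothetical helper that an interceptor could call before populating the thread local.
final class VisaFilter {
    static List<Visa> filterValid(List<Visa> visas, long nowEpochSec) {
        // Step 1: keep only visas with a valid signature that have not expired.
        List<Visa> candidates = new ArrayList<>();
        for (Visa v : visas) {
            if (v.signatureValid && v.expiresAtEpochSec > nowEpochSec) {
                candidates.add(v);
            }
        }
        // Step 2: drop visas whose required visa is missing. Repeat until nothing
        // changes, because dropping one visa can invalidate another that depends on it.
        boolean changed = true;
        while (changed) {
            Set<String> names = new HashSet<>();
            for (Visa v : candidates) {
                names.add(v.name);
            }
            List<Visa> next = new ArrayList<>();
            for (Visa v : candidates) {
                if (v.requiredVisaName == null || names.contains(v.requiredVisaName)) {
                    next.add(v);
                }
            }
            changed = next.size() != candidates.size();
            candidates = next;
        }
        return candidates;
    }
}
```

Under these assumptions, a visa that depends on an expired visa is dropped along with anything that in turn depends on it, so only the self-consistent subset is forwarded.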
...
Currently, the service layer calls UserManagerImpl.getUserInfo() to get an in-memory representation of the user (UserInfo). This UserInfo object is then forwarded to all of the lower code layers. Therefore, we propose extending the UserManager to gather the visas from the thread local and add them to the resulting UserInfo object. This shields most of the code from the thread-local data.
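A rough sketch of this hand-off is shown below. All class names here (`VisaThreadLocal`, `SketchUserInfo`, `SketchUserManager`) are hypothetical stand-ins rather than the real Synapse classes, and visas are represented as plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder that the interceptor would write validated visas into.
final class VisaThreadLocal {
    private static final ThreadLocal<List<String>> VISAS =
            ThreadLocal.withInitial(() -> Collections.<String>emptyList());

    static void set(List<String> visas) {
        // Store an immutable copy so later changes by the caller do not leak in.
        VISAS.set(Collections.unmodifiableList(new ArrayList<>(visas)));
    }

    static List<String> get() {
        return VISAS.get();
    }

    static void clear() {
        // Should be called when the request completes, since threads are usually pooled.
        VISAS.remove();
    }
}

// Hypothetical, trimmed-down stand-in for UserInfo.
final class SketchUserInfo {
    final long userId;
    final List<String> visas;

    SketchUserInfo(long userId, List<String> visas) {
        this.userId = userId;
        this.visas = visas;
    }
}

// Hypothetical stand-in for the user manager.
final class SketchUserManager {
    SketchUserInfo getUserInfo(long userId) {
        // The only place that reads the thread local; lower layers see only UserInfo.
        return new SketchUserInfo(userId, VisaThreadLocal.get());
    }
}
```

With this shape, only the interceptor and the user manager touch the thread local, and every lower layer reads visas from the object it already receives.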
In the next section we will cover how download authorization code can use the passport visas to make download decisions.
Download Authorization
The EntityAuthorizationManagerImpl is responsible for making all entity related authorization decisions, including file download. The following is the current download decision chain:
Code Block |
---|
DENY_IF_DOES_NOT_EXIST, DENY_IF_IN_TRASH, GRANT_IF_ADMIN, DENY_IF_HAS_UNMET_ACCESS_RESTRICTIONS, DENY_IF_TWO_FA_REQUIREMENT_NOT_MET, GRANT_IF_OPEN_DATA_WITH_READ, DENY_IF_ANONYMOUS, DENY_IF_HAS_NOT_ACCEPTED_TERMS_OF_USE, GRANT_IF_HAS_DOWNLOAD, DENY |
Currently, the fourth step in the chain, DENY_IF_HAS_UNMET_ACCESS_RESTRICTIONS, is based on managed ARs, where the principal must be approved by ACT. This typically involves checking whether the principal has been granted approval for all ARs that have the given file as a subject.
We will need to extend the unmet AR check to look for the new passport AR type. The conditions of each passport AR must then be matched against the principal’s passport visas contained in the UserInfo object passed to the manager. The AR would be treated as ‘met’ if all visas match, and ‘unmet’ if one or more do not match. Note: The visa condition matching system should be the same as the system used to validate visas with conditions in the interceptor layer.
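The met/unmet rule could look something like the sketch below. It assumes, purely for illustration, that a passport AR's conditions reduce to a set of required visa names (mirroring the `visaNames` property shown earlier); `PassportArMatcher` is a hypothetical name, and real condition matching would likely compare more than names.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical matcher: an AR is modeled as the collection of visa names it requires.
final class PassportArMatcher {

    // An AR is 'met' only if every required visa is held by the principal.
    static boolean isMet(Collection<String> requiredVisaNames, Collection<String> principalVisaNames) {
        Set<String> held = new HashSet<>(principalVisaNames);
        return held.containsAll(requiredVisaNames);
    }

    // True if at least one passport AR on the file is unmet, which would
    // cause the download to be denied at the unmet-AR step of the chain.
    static boolean hasUnmet(Collection<? extends Collection<String>> passportArs,
                            Collection<String> principalVisaNames) {
        for (Collection<String> required : passportArs) {
            if (!isMet(required, principalVisaNames)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that under this rule a single missing visa makes the AR unmet, and a single unmet AR is enough to deny the download.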
Asynchronous Jobs
A caller can start an asynchronous job to download files as a zip. In this case, one or more of the files to be downloaded might require one or more visas for the download to be authorized. The machine that executes the job will not be the same as the machine that originated the request, so the thread-local visa information will be unavailable to the worker’s thread.
Note: A user might use multiple access tokens to make API calls at the same time. For example, a user might use one token to make edits to a Synapse project in the web UI. At the same time, they might be running a headless workflow to update data in a different project. We cannot assume that both access tokens will have the same passport visas. Passport visas cannot be treated as global data automatically applied to a user.
In order to maintain the stateless nature of passport visas, we propose copying visas from the thread that starts an asynchronous job into the job’s status. Specifically, the AsynchJobStatusManagerImpl.startJob() method can copy visas from the provided UserInfo into the job’s status. We can then extend the AsyncJobRunnerAdapter to pull the visas from the job’s status and add them to the UserInfo used at the start of each asynchronous worker run. This would allow download authorization checks from within asynchronous workers to behave the same as synchronous calls.
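The propagation can be sketched as below. The classes `JobUserInfo`, `JobStatus`, and `JobVisaPropagation` are hypothetical simplifications and do not reflect the actual signatures of AsynchJobStatusManagerImpl or AsyncJobRunnerAdapter; visas are again represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, trimmed-down UserInfo carrying the caller's visas.
final class JobUserInfo {
    final long userId;
    final List<String> visas;

    JobUserInfo(long userId, List<String> visas) {
        this.userId = userId;
        this.visas = Collections.unmodifiableList(new ArrayList<>(visas));
    }
}

// Hypothetical job status that is persisted and later read by a worker.
final class JobStatus {
    final long startedByUserId;
    final List<String> visas;

    JobStatus(long startedByUserId, List<String> visas) {
        this.startedByUserId = startedByUserId;
        this.visas = Collections.unmodifiableList(new ArrayList<>(visas));
    }
}

final class JobVisaPropagation {
    // Models the start of a job: snapshot the caller's visas into the job status,
    // so they travel with the job instead of with the originating thread.
    static JobStatus startJob(JobUserInfo caller) {
        return new JobStatus(caller.userId, caller.visas);
    }

    // Models the worker side: rebuild a UserInfo from the job status so that
    // authorization checks see the same visas as a synchronous call would.
    static JobUserInfo userInfoForWorker(JobStatus status) {
        return new JobUserInfo(status.startedByUserId, status.visas);
    }
}
```

Because the visas are captured per job rather than per user, two jobs started with different access tokens keep their own visa sets.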