Access Requirements with GA4GH Passport Visa Claims

Introduction

A data contributor works with the Sage Access and Compliance Team (ACT) to establish that a new dataset added to Synapse can only be downloaded by NIH qualified researchers. This means that when a caller attempts to download this dataset, Synapse must first check with the NIH to determine whether the caller is actually an NIH qualified researcher. GA4GH provides a technical specification to facilitate this type of authentication/authorization exchange between two systems: GA4GH Passports.

According to the GA4GH specification, Synapse would be a passport Clearinghouse, while the NIH service would be the passport Broker. This document covers the API changes needed to support this type of use case. Here is a high level summary of what we propose to build into Synapse:

  • A new AccessRequirement (AR) type that can be created/managed by ACT to define one or more Claims that the caller must have in order to download restricted data.

  • A new Action type that informs callers when a passport visa is required in order to download a file.

  • Extend the Synapse OIDC Authentication system:

    • Add a new OAuthProviderBinding implementation to connect with each passport Broker that we wish to support.

    • Extend Synapse access_token generation to append the passport claims provided by passport Brokers to the Synapse access_token.

  • Add a passport visa interceptor that will validate passport visas from the Synapse access token and forward the valid subset to the thread local. Extend UserManagerImpl.getUserInfo() to add visas from the thread local to the resulting UserInfo object.

  • Extend the EntityAuthorizationManagerImpl to match AR visa conditions to the principal’s visas in the UserInfo.

  • Extend AsynchJobStatusManagerImpl to append the visas from the UserInfo to the job’s status.

PassportACTManagedAccessRequirement

Currently, an ACT managed access requirement (AR) is created by a member of ACT to restrict download access to one or more files within Synapse. When a user wishes to download a file that is the subject of a managed AR, they will typically need to first submit a data access request to ACT. The user will only be able to download the file after ACT has approved their submission. The approval process often involves providing information that demonstrates their qualification as a researcher.

The GA4GH passport specification was designed for the case where the system that holds data and the system that approves data access are not the same. In the introduction we described an example where Synapse controls data that can only be accessed by NIH qualified researchers. For this example, Synapse must defer to an NIH system to determine if a user is an NIH qualified researcher. In GA4GH passport specification terms, Synapse would be the passport clearinghouse, while the NIH system would be the passport broker. The broker provides authentication information about the user in the form of one or more passport visas, and the clearinghouse uses the passport visas to make authorization decisions.

In order to support the approval delegation process in Synapse, ACT members need a new mechanism. Specifically, ACT needs a way to define cases where data access is contingent on one or more passport visa claims provided by a passport broker. We propose adding a new managed access requirement type: PassportACTManagedAccessRequirement. This new AR type will define the passport visa claims required to download data for its associated subjects.

GA4GH supports multiple types of visa claims, each with varying degrees of complexity. In addition, each claim contains temporal data used for validation. Some visas have conditions such that they are only valid if one or more other visas are also present. Deciding if a visa matches the access requirement conditions will often require more than a simple “equals” check.

The GA4GH visa claim specification includes a section called “conditions”, which covers cases where a visa is only valid if another visa is present. The conditions specification provides a syntax for defining visa matching rules. We propose reusing this syntax within the new passport AR to define the rules for matching the AR to the appropriate visa(s).

Note: As a passport clearinghouse we are required to parse visa claim conditions in order to determine if the claim is valid. For example, if visa A has a condition on visa B, then A must be invalid if B is missing. This means we already need a system for parsing conditions and matching them to visas. We should be able to reuse that system to match passport ARs to the user’s visa claims.

PassportACTManagedAccessRequirement.json

{
	"description": "This is an ACT managed access requirement used to require that a user has obtained one or more GA4GH Passport Visa Claims in order to access the associated subjects.",
	"extends": {
		"$ref": "org.sagebionetworks.repo.model.ManagedACTAccessRequirement"
	},
	"properties": {
		"visaConditions": {
			"description": "The conditions define how this access requirement matches to each required GA4GH passport visa.  Each condition group can contain one or more VisaConditions. Conditions within each group are delimited with an 'AND' while groups are delimited with an 'OR'",
			"type": "array",
			"items": {
				"$ref": "org.sagebionetworks.repo.model.ar.ConditionGroup"
			}
		}
	}
}

ConditionGroup.json

{
	"description": "A group of one or more VisaConditions.",
	"properties": {
		"andConditions": {
			"description": "A group of one or more visa conditions.  Each condition within the group is delimited with an 'AND'.",
			"type": "array",
			"items": {
				"$ref": "org.sagebionetworks.repo.model.ar.VisaCondition"
			}
		}
	}
}

VisaType.json

{
	"description": "Required.  The visa type to be matched.  Note: Custom types are not supported.",
	"type": "string",
	"enum": [
		{
			"name": "AffiliationAndRole"
		},
		{
			"name": "AcceptedTermsAndPolicies"
		},
		{
			"name": "ResearcherStatus"
		},
		{
			"name": "ControlledAccessGrants"
		},
		{
			"name": "LinkedIdentities"
		}
	]
}

VisaCondition.json

{
	"description": "Defines a match to a single GA4GH passport visa.  See: <a href=\"https://github.com/ga4gh-duri/ga4gh-duri.github.io/blob/master/researcher_ids/ga4gh_passport_v1.md#conditions\">GA4GH passport conditions</a>",
	"properties": {
		"type": {
			"description": "Required.  The visa type to be matched.  Note: Custom types are not supported.",
			"$ref": "org.sagebionetworks.repo.model.ar.VisaType"
		},
		"value": {
			"description": "Optional.  When provided defines an expected 'value' claim.",
			"$ref": "org.sagebionetworks.repo.model.ar.MatchTypeValue"
		},
		"source": {
			"description": "Optional.  When provided defines an expected 'source' claim.",
			"$ref": "org.sagebionetworks.repo.model.ar.MatchTypeValue"
		},
		"by": {
			"description": "Optional.  When provided defines an expected 'by' claim.",
			"$ref": "org.sagebionetworks.repo.model.ar.MatchTypeValue"
		},
		"brokerRedirectUrl": {
			"description": "The redirect URL of the passport broker that will provide this GA4GH passport visa for an authenticated caller.",
			"type": "string"
		},
		"visaName": {
			"description": "The name of the visa to request from the passport broker.",
			"type": "string"
		}
	}
}

MatchTypeValue.json

{
	"description": "Defines both the operation and value for matching a single visa claim.",
	"properties": {
		"type": {
			"description": "Required.  The type defines how value should be matched to a claim.",
			"name": "MatchType",
			"type": "string",
			"enum": [
				{
					"name": "const",
					"description": "A case sensitive full string match."
				},
				{
					"name": "pattern",
					"description": "Supports special meaning characters for matching values.  Use '?' to match any single character, and '*' to match multiple characters"
				},
				{
					"name": "split_pattern",
					"description": "A pattern match on part of a ';' delimited value."
				}
			]
		},
		"value": {
			"description": "The value depends on match type.  For 'const' a match requires a case sensitive full string match of this value.  For 'pattern', use a '?' to match any single character, and '*' to match multiple characters including the empty string and null string.  For 'split_pattern', the same pattern syntax is applied to each part of a ';' delimited claim.",
			"type": "string"
		}
	}
}
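
The three match types can be illustrated with a minimal sketch. This is not the proposed Synapse implementation; the function names are hypothetical, and treating a missing claim as the empty string is an assumption based on the “empty string and null string” wording above.

```python
import re
from typing import Optional


def _wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate '?' (any single character) and '*' (any run of characters,
    including none) into a regular expression. All other characters are
    matched literally."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(match_type: str, expected: str, claim: Optional[str]) -> bool:
    """Return True if a single visa claim satisfies one MatchTypeValue."""
    if match_type == "const":
        # Case sensitive full string match.
        return claim == expected
    # Assumption: a missing (null) claim is treated as the empty string.
    text = claim or ""
    regex = _wildcard_to_regex(expected)
    if match_type == "pattern":
        return regex.fullmatch(text) is not None
    if match_type == "split_pattern":
        # The pattern only needs to match one ';' delimited part.
        return any(regex.fullmatch(part) for part in text.split(";"))
    raise ValueError("Unknown match type: " + match_type)
```

For example, matches("pattern", "faculty@*.edu", "faculty@med.example.edu") is true, while matches("const", "Faculty", "faculty") is false because ‘const’ is case sensitive.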

Visa Action Required

Clients use the ‘GET /entity/{id}/actions/download’ service to help guide callers with “unmet” access requirements. This service provides a list of “Actions” that the caller will need to take in order to meet all of the ARs associated with a file.

With the new passport AR, a caller will need to acquire one or more GA4GH passport visas from one or more passport brokers before they will be permitted to download any file that is the subject of the passport AR.

Note: The caller might have many visas available to them from a passport broker. A broker might ask the caller which visa should be sent to Synapse. For such a case, we need to provide the user with the names of the visas to request from the broker.

In order to acquire the required visas, the web client will need to redirect the caller’s browser to the broker’s portal. We will cover the details of this redirect in a later section. The end result of a broker redirect will be the creation of a new Synapse access token that will include the visa claims provided by the broker. The resulting access token can then be used by either a web or programmatic client to download files that are subject to the passport AR.

Therefore, the “Action Required” for an unmet passport AR must provide both the broker’s redirect URL and the names of the visas to acquire.

PassportVisaClaimAction.json

{
	"description": "In order to download a file the user will need to provide one or more GA4GH passport visa claims.  Such a claim will be provided by the linked GA4GH passport broker.",
	"implements": [
		{
			"$ref": "org.sagebionetworks.repo.model.download.Action"
		}
	],
	"properties": {
		"brokerRedirectUrl": {
			"description": "The redirect URL of the passport broker that provides the passport visa claims needed to access data.",
			"type": "string"
		},
		"visaNames": {
			"description": "The names of the visas to be provided by the passport broker.",
			"type": "array",
			"items": {
				"type": "string"
			}
		}
	}
}

If more than one passport broker is needed to meet a single AR, ‘GET /entity/{id}/actions/download’ will provide a separate PassportVisaClaimAction for each broker.
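
The per-broker grouping could look something like the following sketch. It assumes the unmet VisaConditions are available as plain dictionaries with the brokerRedirectUrl and visaName fields defined above; the function name and the example URLs are hypothetical.

```python
from typing import Dict, List


def build_visa_actions(unmet_conditions: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Group unmet visa conditions into one action per passport broker."""
    names_by_broker: Dict[str, List[str]] = {}
    for condition in unmet_conditions:
        names = names_by_broker.setdefault(condition["brokerRedirectUrl"], [])
        # Avoid listing the same visa name twice for one broker.
        if condition["visaName"] not in names:
            names.append(condition["visaName"])
    # One PassportVisaClaimAction-shaped dictionary per broker, in first-seen order.
    return [
        {"brokerRedirectUrl": url, "visaNames": names}
        for url, names in names_by_broker.items()
    ]


# Hypothetical brokers, for illustration only.
actions = build_visa_actions([
    {"brokerRedirectUrl": "https://broker-a.example.org/authorize", "visaName": "researcher-status"},
    {"brokerRedirectUrl": "https://broker-b.example.org/authorize", "visaName": "dataset-grant"},
    {"brokerRedirectUrl": "https://broker-a.example.org/authorize", "visaName": "accepted-terms"},
])
```

Here the two conditions that share a broker collapse into a single action with two visa names, and the second broker gets its own action.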

Broker OIDC interaction

Synapse already uses OpenID Connect (OIDC) to support login via “Google” and to link an ORCID to a Synapse account. For the login case, information from Google is used to link the caller to a Synapse user ID. The final product of the OIDC process is a new Synapse access token that encodes both the user’s ID and the scope of the token. The Synapse access token is a signed JSON Web Token (JWT). The Synapse access token can be used by both web and programmatic clients to authenticate Synapse API requests.

The GA4GH ‘Data Passports’ specification extends the basic OIDC process to enable a passport broker to provide a passport clearinghouse with a passport containing one or more visa claims. See also: ‘AAI OIDC Profile’. Specifically, the access token (also a JWT) that the broker provides to the clearinghouse will include an entry for the caller’s passport.

We propose extending the Synapse OIDC support to not only “login” via a broker but to also capture the broker provided passport in the resulting Synapse access token.

Note: More than one passport broker might be needed to provide a full set of required passport visa claims. Therefore, it is important that newly provided visa claims accumulate with existing visa claims.

By appending claims to the resulting Synapse access token, we can ensure that the visas are available to both web and command line clients. In the next section we will cover how the Synapse access tokens with embedded visa claim JWTs can be used for download authorization.
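
The accumulation of visas across brokers might be sketched as follows. The claim name ga4gh_passport_v1 is the passport claim defined by GA4GH (a list of visa JWT strings); reusing that same name inside the Synapse access token, and the function name itself, are assumptions made for illustration.

```python
from typing import Any, Dict, List

# GA4GH passport claim name; using it inside the Synapse token is an assumption.
PASSPORT_CLAIM = "ga4gh_passport_v1"


def accumulate_visas(existing_claims: Dict[str, Any], broker_visas: List[str]) -> Dict[str, Any]:
    """Return the claims for a new Synapse access token that keep the visas
    from the caller's current token and add the visas from a broker."""
    merged = list(existing_claims.get(PASSPORT_CLAIM, []))
    for visa_jwt in broker_visas:
        # Visas are opaque signed JWT strings, so exact duplicates are skipped.
        if visa_jwt not in merged:
            merged.append(visa_jwt)
    new_claims = dict(existing_claims)
    new_claims[PASSPORT_CLAIM] = merged
    return new_claims
```

The existing claims are copied rather than modified, so the caller's current token is unaffected until the new token is issued.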

Passport Visa Interceptor

Currently, the primary job of the Synapse AuthenticationFilter is to validate a user provided access token in order to identify the caller. The filter also passes along the access token as a header that can be accessed by downstream code such as the OAuthScopeInterceptor. We propose adding a new interceptor for processing visa claims found in the access token. The pre-processing would include the following:

  • Validate signature and expiration of each visa.

  • Validate the conditional relationship between visas. For example, a visa might include a condition such that it is only valid if another visa also exists. For such a case, the dependent visa would be invalid if its dependency were missing.
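
The expiration and dependency checks above can be sketched as a filter that runs until no more visas are removed. This is a simplified illustration: visas are assumed to be already decoded into dictionaries, signature verification is omitted, and condition clauses use exact equality rather than the full const/pattern syntax. The function names are hypothetical.

```python
import time
from typing import Any, Dict, List, Optional

Visa = Dict[str, Any]


def _clause_matches(clause: Dict[str, str], visa: Visa) -> bool:
    # Simplification: every key in the clause must equal the visa's claim.
    return all(visa.get(key) == expected for key, expected in clause.items())


def _conditions_hold(visa: Visa, others: List[Visa]) -> bool:
    """Conditions are a list of OR groups, each a list of AND clauses."""
    conditions = visa.get("conditions")
    if not conditions:
        return True
    return any(
        all(any(_clause_matches(clause, other) for other in others) for clause in and_group)
        for and_group in conditions
    )


def filter_valid_visas(visas: List[Visa], now: Optional[float] = None) -> List[Visa]:
    """Drop expired visas, then repeatedly drop visas whose conditions are no
    longer satisfied by the remaining visas."""
    now = time.time() if now is None else now
    valid = [visa for visa in visas if visa["exp"] > now]
    while True:
        kept = [
            visa for visa in valid
            if _conditions_hold(visa, [other for other in valid if other is not visa])
        ]
        if len(kept) == len(valid):
            return kept
        valid = kept
```

Repeating the pass matters for chains: if visa A depends on B and B depends on an expired C, removing B on the first pass invalidates A on the second.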

After validation, the passport visa interceptor will bind all valid visas to the thread local. The thread local list would be of type:

PassportVisa.json

{
	"description": "A representation of a single validated GA4GH passport visa.  See: <a href=\"https://github.com/ga4gh-duri/ga4gh-duri.github.io/blob/master/researcher_ids/ga4gh_passport_v1.md#visa-format\">GA4GH passport visa</a>",
	"properties": {
		"type": {
			"description": "Required.",
			"$ref": "org.sagebionetworks.repo.model.ar.VisaType"
		},
		"value": {
			"description": "Required. A string that represents any of the scope, process, identifier and version of the assertion. The format of the string can vary by the Visa Type",
			"type": "string"
		},
		"source": {
			"description": "Required. A URL Claim that provides at a minimum the organization that made the assertion. If there is no organization making the assertion, the source claim value MUST be set to 'https://no.organization'.",
			"type": "string"
		},
		"by": {
			"description": "Optional. The level or type of authority within the 'source' organization of the assertion.",
			"type": "string"
		}
	}
}

Note: Since the interceptor excludes invalid visas, PassportVisa.json does not include any field used for validation or signing, such as: conditions, asserted, alg, exp, jti, iat…

Currently, the service layer calls UserManagerImpl.getUserInfo() to get an in-memory representation of the user (UserInfo). This UserInfo object is then forwarded to all of the lower code layers. Therefore, we propose extending the UserManager to gather the visas from the thread local and add them to the resulting UserInfo object. This insulates most of the code from the thread local data.
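
The hand-off from the interceptor to the UserInfo can be illustrated with a small sketch. It uses Python's threading.local as a stand-in for the Java thread local; the function names and the shape of UserInfo are hypothetical and do not reflect the actual Synapse classes.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, List

# Per-thread storage, standing in for the thread local described above.
_request_context = threading.local()


def bind_visas(visas: List[Dict[str, str]]) -> None:
    """Called by the interceptor after validation, once per request."""
    _request_context.visas = list(visas)


def clear_visas() -> None:
    """Called when the request completes so visas do not leak to the next
    request served by the same thread."""
    _request_context.visas = []


@dataclass
class UserInfo:
    user_id: int
    visas: List[Dict[str, str]] = field(default_factory=list)


def get_user_info(user_id: int) -> UserInfo:
    """Build the UserInfo and attach any visas bound to the current thread."""
    visas = list(getattr(_request_context, "visas", []))
    return UserInfo(user_id=user_id, visas=visas)
```

Only get_user_info() touches the thread local; lower layers read the visas from the UserInfo they are given. A thread that never had visas bound simply sees an empty list.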

In the next section we will cover how download authorization code can use the passport visas to make download decisions.

Download Authorization

The EntityAuthorizationManagerImpl is responsible for making all entity related authorization decisions, including file download. The following is the current download decision chain:

			DENY_IF_DOES_NOT_EXIST,
			DENY_IF_IN_TRASH,
			GRANT_IF_ADMIN,
			DENY_IF_HAS_UNMET_ACCESS_RESTRICTIONS,
			DENY_IF_TWO_FA_REQUIREMENT_NOT_MET,
			GRANT_IF_OPEN_DATA_WITH_READ,
			DENY_IF_ANONYMOUS,
			DENY_IF_HAS_NOT_ACCEPTED_TERMS_OF_USE,
			GRANT_IF_HAS_DOWNLOAD,
			DENY

Currently, the fourth entry in the chain, DENY_IF_HAS_UNMET_ACCESS_RESTRICTIONS, is based on managed ARs where the principal must be approved by ACT. This typically involves checking whether the principal has been granted approval for all ARs that have the given file as a subject.

We will need to extend the unmet AR check to look for the new passport AR type. The conditions of each passport AR must then be matched against the principal’s passport visas contained in the UserInfo object passed to the manager. Since conditions within a group are joined with ‘AND’ and groups are joined with ‘OR’, the AR would be treated as ‘met’ if every condition in at least one condition group matches one of the principal’s visas, and ‘unmet’ otherwise. Note: The visa condition matching system should be the same as the system used to validate visas with conditions in the interceptor layer.
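
The ‘met’ decision for a passport AR can be sketched as an OR over condition groups and an AND within each group. In this illustration, conditions and visas are plain dictionaries shaped like ConditionGroup.json, VisaCondition.json and PassportVisa.json; the function names are hypothetical, and only ‘const’ matching is shown by default, with the claim matcher left pluggable.

```python
from typing import Callable, Dict, List, Optional

ClaimMatcher = Callable[[Dict[str, str], Optional[str]], bool]


def _const_match(match: Dict[str, str], claim: Optional[str]) -> bool:
    # Default matcher: treat every MatchTypeValue as a 'const' match.
    return claim == match["value"]


def condition_matches(condition: Dict, visa: Dict, claim_matches: ClaimMatcher = _const_match) -> bool:
    """A condition matches a visa when the types agree and every provided
    'value', 'source' and 'by' matcher accepts the visa's claim."""
    if visa["type"] != condition["type"]:
        return False
    for claim_name in ("value", "source", "by"):
        match = condition.get(claim_name)
        if match is not None and not claim_matches(match, visa.get(claim_name)):
            return False
    return True


def is_requirement_met(condition_groups: List[Dict], visas: List[Dict],
                       claim_matches: ClaimMatcher = _const_match) -> bool:
    """Met if, for at least one group, every condition matches some visa."""
    return any(
        bool(group["andConditions"]) and all(
            any(condition_matches(condition, visa, claim_matches) for visa in visas)
            for condition in group["andConditions"]
        )
        for group in condition_groups
    )
```

An empty condition group is treated as not matching, so an AR cannot be met by accident when a group has no conditions.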

Asynchronous Jobs

A caller can start an asynchronous job to download files as a zip. For this case, one or more of the files to be downloaded might require one or more visas in order to be authorized for download. The machine that executes the job will not be the same as the machine that originated the request, so the thread local visa information will be unavailable on the worker’s thread.

Note: A user might use multiple access tokens to make API calls at the same time. For example, a user might use one token to make edits to a Synapse project in the web UI. At the same time, they might be running a headless workflow to update data in a different project. We cannot assume that both access tokens will have the same passport visas. Passport visas cannot be treated as global data automatically applied to a user.

In order to maintain the stateless nature of passport visas, we propose copying visas from the thread that starts an asynchronous job into the job’s status. Specifically, the AsynchJobStatusManagerImpl.startJob() method can copy visas from the provided UserInfo into the job’s status. We can then extend the AsyncJobRunnerAdapter to pull the visas from the job’s status and add them to the UserInfo used at the start of each asynchronous worker run. This would allow download authorization checks from within asynchronous workers to behave the same as synchronous calls.
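
The round trip from the requesting thread to the worker can be sketched as follows. The dataclasses and function names are hypothetical stand-ins for UserInfo, the job status, startJob() and the worker adapter; JSON serialization is used here only to illustrate that the visas must survive leaving the originating machine.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class UserInfo:
    user_id: int
    visas: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class JobStatus:
    job_id: str
    started_by_user_id: int
    visas: List[Dict[str, str]] = field(default_factory=list)


def start_job(job_id: str, user: UserInfo) -> JobStatus:
    """Copy the caller's visas into the job status when the job starts."""
    return JobStatus(job_id, user.user_id, [dict(visa) for visa in user.visas])


def user_info_for_worker(status: JobStatus) -> UserInfo:
    """Rebuild the UserInfo on the worker from the stored job status."""
    return UserInfo(status.started_by_user_id, [dict(visa) for visa in status.visas])


# The status is persisted by the originating machine and read back by a worker.
caller = UserInfo(7, [{"type": "ResearcherStatus", "value": "v", "source": "https://example.org"}])
stored = json.dumps(asdict(start_job("job-1", caller)))
worker_user = user_info_for_worker(JobStatus(**json.loads(stored)))
```

Because the visas travel with the job rather than with the user, two jobs started with different access tokens keep their own, possibly different, sets of visas.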
