...

  1. A set of files that can only be accessed by an NIH approved researcher. This use case assumes the use of the NIH Research Auth Service (RAS).

  2. A set of files that can only be accessed by a user with IRB approval. The IRB approval can be verified either by the Sage ACT or by the user’s own institution.

  3. A set of files that can only be accessed within a FISMA approved boundary. For example, the files can be accessed by a Cavatica workflow but cannot be downloaded to an unsecured laptop (PLFM-8264).

  4. A set of files with controlled access that will be downloaded using a command line client. For this case the client will need something like a Personal Access Token (PAT) instead of a web UI login.

Visas

The GA4GH Passport standard provides a standardized mechanism for data users to present their digital identity, including authenticated credentials and permissions, in the form of visas and to share this across distributed data systems and organizational boundaries.

...

In practice, a passport would be presented within an access token to a file download API call, such as the DRS. The passport would be used by the service to determine if the caller is authorized to download the requested data file/files. While the access token would only contain one passport, the passport can contain one or more visas. Each visa makes a single assertion about the caller. Each visa is a signed JSON Web Token (JWT) Claim. For example, we could create a visa that states that the caller has passed the Synapse Certification Quiz (see: Figure 1).

Code Block
languagejson
{
  "typ": "vnd.ga4gh.visa+jwt",
  "alg": "RS256",
  "jku": "https://repo-prod.prod.sagebase.org/auth/v1/oauth2/jwks",
  "kid": "W7NN:WLJT:J5RK:L7TL:T7L7:3VX6:JEOU:644R:U3IX:5KZ2:7ZCK:FPTH"
}.
{
  "iss": "https://repo-prod.prod.sagebase.org/auth/v1",
  "sub": "456",
  "scope": "download",
  "jti": "some-uuid",
  "iat": 1708651144,
  "exp": 1708665544,
  "ga4gh_visa_v1": {
    "type": "AcceptedTermsAndPolicies",
    "asserted": 1645593544,
    "value": "https://repo-prod.prod.sagebase.org/repo/v1/certified/user/456",
    "source": "https://repo-prod.prod.sagebase.org/auth/v1"
  }
}.<signature>

...

In the example visa in Figure 1, most of the fields are part of the general JWT Claim standard and help the receiver to validate both the signature and the claim's expiration. However, the “ga4gh_visa_v1” object defines the core of the visa’s assertion. The four sub-fields (type, asserted, value, & source) tell us exactly what the visa means.
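To make the structure concrete, here is a minimal Python sketch of how a receiver might pull the “ga4gh_visa_v1” object out of a compact JWT. It deliberately skips signature and expiration checks, which a real service must perform first (for example, using the key referenced by the header's jku and kid). The helper names are hypothetical, and the token is an unsigned stand-in built from the example payload rather than a real Synapse visa.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_visa_claim(visa_jwt: str) -> dict:
    """Return the ga4gh_visa_v1 object WITHOUT verifying the signature."""
    _header_b64, payload_b64, _signature = visa_jwt.split(".")
    payload = json.loads(b64url_decode(payload_b64))
    return payload["ga4gh_visa_v1"]


# Unsigned stand-in token assembled from the example visa (illustration only).
example_header = {"typ": "vnd.ga4gh.visa+jwt", "alg": "RS256"}
example_payload = {
    "iss": "https://repo-prod.prod.sagebase.org/auth/v1",
    "sub": "456",
    "iat": 1708651144,
    "exp": 1708665544,
    "ga4gh_visa_v1": {
        "type": "AcceptedTermsAndPolicies",
        "asserted": 1645593544,
        "value": "https://repo-prod.prod.sagebase.org/repo/v1/certified/user/456",
        "source": "https://repo-prod.prod.sagebase.org/auth/v1",
    },
}
example_jwt = ".".join([
    b64url_encode(json.dumps(example_header).encode("utf-8")),
    b64url_encode(json.dumps(example_payload).encode("utf-8")),
    "signature-placeholder",
])

claim = read_visa_claim(example_jwt)
print(claim["type"], claim["source"])
```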

...

Note: By making conditions immutable, users can be assured that it is safe to reuse them in multiple passport ARs. If conditions were mutable, it would be difficult for users to predict the impact any change would have across the system.

...

  • Defining the visas that make up a passport access requirement

  • Informing the caller as to which visas will be needed for a successful call

  • Requesting a new access token containing the desired visas issued by both Synapse and trusted 3rd party visa issuers/brokers.

Passport Access Requirement

The new passport access requirement needs to be able to define which visas are required to be present in order for a user to be able to download its associated subjects (typically files). This means a passport access requirement can be defined exclusively with visa condition IDs. A passport AR with a single condition will require the user to provide a single visa matching that condition to be considered “met”. A passport AR can also include multiple conditions with simple logical relationships between them. For example, given conditions with the IDs 1 through 5, we can have a passport AR that requires the following logical combination of conditions:

Code Block
(1 and 2) or (3 and 4) or 5

...

Note: There is an “and” relationship between elements within a single conditionIds array. There is an “or” relationship between each conditionIds array. This aligns with the GA4GH conditions definition.

Note: The GA4GH Visa conditions specification does not support nested logic such as:

Code Block
1 and (2 or (3 and 4))
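As a rough illustration of the “or of and-groups” semantics described above, the following Python sketch checks whether a set of met condition IDs satisfies a passport AR. It assumes each group is an object with a conditionIds array; the function name passport_ar_is_met and its inputs are hypothetical and not part of the Synapse API.

```python
from typing import Iterable, List, Set


def passport_ar_is_met(condition_groups: List[dict],
                       met_condition_ids: Iterable[str]) -> bool:
    """Return True if at least one group has all of its condition IDs met.

    Within a group the IDs are combined with AND; the groups themselves
    are combined with OR. Empty groups are ignored (assumption).
    """
    met: Set[str] = set(met_condition_ids)
    return any(
        bool(group["conditionIds"]) and set(group["conditionIds"]) <= met
        for group in condition_groups
    )


# (1 and 2) or (3 and 4) or 5
groups = [
    {"conditionIds": ["1", "2"]},
    {"conditionIds": ["3", "4"]},
    {"conditionIds": ["5"]},
]

print(passport_ar_is_met(groups, {"1", "2"}))  # first group is fully met
print(passport_ar_is_met(groups, {"1", "4"}))  # no group is fully met
```

Because the structure is a flat list of groups, an expression such as the nested one above has no direct representation; it would first have to be rewritten as an equivalent “or” of “and” groups.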

...

It is important to note that the new PassportAccessRequirement does not inherit any functionality directly from existing ARs (ManagedACTAccessRequirement, SelfSignAccessRequirement, LockAccessRequirement, TermsOfUseAccessRequirement). The new PassportAccessRequirement is intended to supplement, but not replace, any of the existing ARs, while at the same time providing a mechanism to delegate authorization decisions to trusted third parties, such as the NIH RAS. A key aspect of this feature will be providing existing AR approvals, and other Synapse authorization features, as Synapse-signed visas. To achieve this goal, Synapse will issue a visa to a user for each traditional AR for which they have been granted an approval. In the next section we will cover all of the types of visas that Synapse will issue.

Visa Issuer

By setting up Synapse to issue visas, we can enable 3rd party services to make their own authorization decisions based on a user’s Synapse accolades. For example, a 3rd party might limit access to a resource to users with a Synapse validated profile. Synapse would provide this information to the 3rd party service by issuing a signed visa claim only if the user’s profile is validated:

...

Note: According to the GA4GH specification, the ResearcherStatus value URL must point to a human-readable web page that asserts the researcher’s status. This is an odd requirement for the GA4GH passport specification, which is designed for the automatic machine validation of visas.

The above validated profile visa is just one example of a predefined visa that Synapse will be able to provide for users. Other such predefined visas will include:

...

It is important to note that in the previous example of IRB approval, ACT is not acting as an IRB. Instead, ACT is manually confirming that an external IRB has granted the researcher approval to access the data. If the external IRB’s system issued GA4GH visas that assert a researcher has their approval, then ACT would no longer need to manually confirm the assertion. Instead, ACT would first create a condition to match the visas issued by the IRB’s system. ACT would then use the resulting condition ID to define a new PassportAccessRequirement and bind the AR to the data. Once the passport AR is set up, a member of the ACT would no longer need to manually “confirm” each IRB approval. Instead, Synapse would automatically acquire and validate the researcher’s visa directly from the IRB’s system to confirm approvals.

...

When the UI calls GET /entity/syn123/actions/download with an access token, the service will check if the provided token includes the visas required to meet conditions 1 and 4. For this example, the user’s token includes a visa matching condition 4 but does not have a visa for condition 1. Therefore, the service will return the following:

...

In this example, the user’s ID within some.institution.org was 123. However, the same user’s ID within Synapse might be 88. Therefore, we need a mechanism that maps the userId in the visa claim, 123, to the Synapse user 88. The GA4GH specification defines a special visa type LinkedIdentities for such mapping.

Linked Identities

At the time Synapse acquires a visa from a 3rd party, Synapse will need to issue an additional LinkedIdentities visa that maps the 3rd party userId to the user’s Synapse ID. For the example above, the resulting link visa would look like:

Code Block
 {
	"ga4gh_visa_v1": {
		"type": "LinkedIdentities",
		"asserted": 1645593544,
		"value": "123,https://some.institution.org/irb/approval/;88,https://www.synapse.org/#!Profile:",
		"source": "https://repo-prod.prod.sagebase.org/auth/v1",
		"by": "dac"
	}
}

...

The above linked visa maps some.institution’s user ID 123 to Synapse user ID 88.
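The value of the LinkedIdentities visa above packs two identities into one string: entries are separated by “;”, and each entry is a subject followed by “,” and its issuer. A minimal Python sketch of unpacking that string is shown below. It assumes each part may be URL-encoded, which is how such a value can safely contain the separator characters; the function name is hypothetical.

```python
from typing import List, Tuple
from urllib.parse import unquote


def parse_linked_identities(value: str) -> List[Tuple[str, str]]:
    """Split a LinkedIdentities value into (subject, issuer) pairs."""
    pairs = []
    for entry in value.split(";"):
        if not entry:
            continue  # tolerate a trailing ';'
        subject, issuer = entry.split(",", 1)
        # Assumption: each part may be percent-encoded, so decode it.
        pairs.append((unquote(subject), unquote(issuer)))
    return pairs


linked = parse_linked_identities(
    "123,https://some.institution.org/irb/approval/;"
    "88,https://www.synapse.org/#!Profile:"
)
for subject, issuer in linked:
    print(subject, issuer)
```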

To receive a LinkedIdentities visa, the user needs to demonstrate that they are the holder of both identities.

--GA4GH Passport standard for digital identity and access permissions

Limit Download Context

We have a use case where a set of files must only be downloaded within a limited context (PLFM-8264). Specifically, a set of files must only be accessed from a FISMA approved system. In this example, Cavatica has received FISMA approval for their workflow system, while an individual researcher’s laptop has not been approved. In other words, a researcher can access the data via a Cavatica workflow, but the same researcher cannot download the same data to their laptop.

Synapse implements the OAuth 2.0 specification which allows users to “login” to Synapse via trusted 3rd party clients. Each time a user logs into Synapse from one of the trusted clients, the resulting access token will include the registered client ID. For our example, let us assume that the registered Cavatica client ID is 33. Note: If the user logs into Synapse using any other client (even the web client), the client ID in the resulting access token will not be 33. We can use this information to solve the limited download context problem.

Let us assume that Synapse can provide a visa that states which client ID was used for authentication. If a user were to authenticate via the Cavatica client, then such a visa might look like:

Code Block
languagejson
{
	"ga4gh_visa_v1": {
		"type": "ControlledAccessGrants",
		"asserted": 1645593544,
		"value": "https://repo-prod.prod.sagebase.org/auth/v1/oauth/client/id/33/user/111",
		"source": "https://repo-prod.prod.sagebase.org/auth/v1",
		"by": "system"
	}
}

If the same user were to log in via the Synapse web client, with a client ID of 22, then the same visa might be:

Code Block
languagejson
{
	"ga4gh_visa_v1": {
		"type": "ControlledAccessGrants",
		"asserted": 1645593544,
		"value": "https://repo-prod.prod.sagebase.org/auth/v1/oauth/client/id/22/user/111",
		"source": "https://repo-prod.prod.sagebase.org/auth/v1",
		"by": "system"
	}
}

We now have enough information to define a new visa condition for this use case:

Code Block
languagejson
{
	"id": "101",
	"name":"OAuth 2.0 Client ID",
	"type":"ControlledAccessGrants",
	"value":{
		"match-type":"pattern",
		"match-value":"https://repo-prod.prod.sagebase.org/auth/v1/oauth/client/id/33/user/*"
	},
	"source":{
		"match-type":"const",
		"match-value":"https://repo-prod.prod.sagebase.org/auth/v1"
	},
	"by":"system"
}

The resulting visa condition (id=101) states that the client ID must be 33. We are ready to create the new passport AR that will enforce the requirements for this use case when bound to the restricted dataset:

Code Block
languagejson
{
	"concreteType":"org.sagebionetworks.repo.model.ar.PassportAccessRequirement",
	"conditions":[
		{"conditionIds":["101"]}
	]
}
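To illustrate how condition 101 would separate the two visas shown earlier, here is a minimal Python sketch of the matching step. It assumes that a “pattern” match treats “*” as any run of characters and “?” as exactly one character, that “const” is an exact string comparison, and that “by” must be equal. The function names are hypothetical and this is not the Synapse implementation.

```python
import re


def matches(rule: dict, actual: str) -> bool:
    """Apply one match rule ({"match-type", "match-value"}) to a string."""
    match_type = rule["match-type"]
    expected = rule["match-value"]
    if match_type == "const":
        return actual == expected
    if match_type == "pattern":
        # Assumption: '*' matches any run of characters, '?' exactly one.
        regex = re.escape(expected).replace(r"\*", ".*").replace(r"\?", ".")
        return re.fullmatch(regex, actual) is not None
    raise ValueError(f"unsupported match-type: {match_type}")


def visa_meets_condition(condition: dict, visa: dict) -> bool:
    """Return True if the ga4gh_visa_v1 object satisfies the condition."""
    return (
        visa.get("type") == condition["type"]
        and matches(condition["value"], visa.get("value", ""))
        and matches(condition["source"], visa.get("source", ""))
        and visa.get("by") == condition["by"]
    )


AUTH = "https://repo-prod.prod.sagebase.org/auth/v1"
condition_101 = {
    "id": "101",
    "type": "ControlledAccessGrants",
    "value": {"match-type": "pattern",
              "match-value": AUTH + "/oauth/client/id/33/user/*"},
    "source": {"match-type": "const", "match-value": AUTH},
    "by": "system",
}


def client_visa(client_id: str) -> dict:
    """Build the example visa for user 111 and the given client ID."""
    return {
        "type": "ControlledAccessGrants",
        "asserted": 1645593544,
        "value": f"{AUTH}/oauth/client/id/{client_id}/user/111",
        "source": AUTH,
        "by": "system",
    }


print(visa_meets_condition(condition_101, client_visa("33")))  # Cavatica client
print(visa_meets_condition(condition_101, client_visa("22")))  # web client
```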

Personal Access Tokens

Most Synapse users will use one of the many Synapse web UIs at some point. However, there is a class of Synapse users that depend on one of the programmatic clients for their Synapse interactions. This is especially true for Synapse users that write or depend on scripts for automation. However, the GA4GH visa specification is an extension of the OpenID Connect (OIDC) specification, with a typical “log in” flow that involves redirecting a browser between various web pages. Since the programmatic clients do not have web pages, an alternate means of authentication is needed.

The recommended solution to the programmatic client authentication problem is to acquire a Personal Access Token (PAT) using POST /personalAccessToken. A PAT is a special access token that can be created using a web client (that supports redirects), and then saved to a programmatic client’s local credential store. The programmatic client is then free to use the resulting PAT for all calls to Synapse that require authentication.

Since the new passport AR feature will require clients to present GA4GH visas to access certain datasets, we will need to extend the current PAT service to allow users to request which visas to include within the generated PAT. Specifically, we will need to extend the AccessTokenGenerationRequest’s OAuthScope to include ga4gh_passport_v1. When included, Synapse will include the GA4GH visas found in the access token provided to create the PAT. Note: Each visa within the resulting PAT will have its own expiration that will almost certainly be shorter than the expiration date of the PAT. A PAT that includes expired visas will still be valid for any call that does not require the visas. Given the short-lived nature of GA4GH visas, we might want to consider adding a mechanism where a client can “refresh” a PAT and its visas.
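The expiration note above can be sketched as a simple filter: a service receiving a PAT would consider only the visas whose own exp has not yet passed, regardless of the PAT’s later expiration. This Python sketch assumes the visas have already been decoded and verified into payload dictionaries; the function name and the sample values are hypothetical.

```python
import time
from typing import List, Optional


def usable_visas(visa_payloads: List[dict],
                 now: Optional[float] = None) -> List[dict]:
    """Return only the visas whose 'exp' (seconds since epoch) is in the future."""
    current = time.time() if now is None else now
    return [visa for visa in visa_payloads if visa["exp"] > current]


# Two decoded visa payloads carried by the same (still valid) PAT.
visas_in_pat = [
    {"jti": "visa-a", "exp": 1708665544},
    {"jti": "visa-b", "exp": 1708700000},
]

# At this moment visa-a has expired while visa-b is still usable.
still_usable = usable_visas(visas_in_pat, now=1708670000)
print([visa["jti"] for visa in still_usable])
```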

New API Objects

org.sagebionetworks.repo.model.ar.Visa

...

Code Block
{
	"description": "A group of one or more GA4GH visa condition IDs.",
	"properties": {
		"conditionIds": {
			"description": "A group of one or more condition IDs.  There is an 'AND' relationship between the condition IDs.",
			"type": "array",
			"items": {
				"type": "string"
			}
		}
	}
}

...

The Passport broker collects the visas from the Visa Issuers, assembles the Passport, and gives it to the data user. When the data user accesses a Passport Clearinghouse, the Passport is included with requests such that the computing environment is made aware that access policies are met and access to the dataset can be permitted.

...