
OAuth 2.0 Development Phases

Epic Issue:

PLFM-4585


OAuth 2 is a large specification, and we are trying to address many use cases and goals by implementing it in Synapse. We can split this development into incremental components that are built separately and sequentially, each enabling different use cases. We define each development phase by the features it implements, and organize use cases around these phases.

Many potential use cases and considerations related to OAuth 2.0 will be mentioned in this document. This document does not represent a commitment to or prioritization of any specific use cases, and only exists as a tool to organize incremental development work on OAuth in Synapse.

Developers are encouraged to reanalyze requirements before each phase, because this document offers only example suggestions for which components of OAuth to implement for each use case. Better solutions may be available.

Use Cases

We separate use cases into three groups: selected use cases (those we are actively trying to enable), other use cases (those that have been conceptualized but are not yet committed to or being enabled), and implied use cases (those driven by goals, which are listed later, rather than by user needs). The priority and necessity of the other use cases is unclear, but because OAuth is so general, we track them to see whether their requirements are met when we achieve other use cases.

Selected Use Cases



Organizations want to use Synapse as an identity provider to link Synapse identities to their users. This use case has been developed and is in use today, so it is not elaborated further here.

Users are interested in running workflows (which require access to a user’s Synapse resources) on workflow engines operated by a third party. Because the workflow engine is operated by a third party, it is unacceptable for a user to authorize access using their username and password, or their API key. The short-lived OAuth access tokens currently in Synapse are not sufficient for this use case either, because a submission to a workflow engine may sit in a queue and/or execute for a duration exceeding the lifetime of an access token (currently 24 hours). These issues are overcome by issuing a refresh token to the workflow engine.
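As a rough illustration of how a workflow engine might redeem a refresh token for a new access token, the sketch below builds the request described in RFC 6749 section 6. It assumes the client authenticates with HTTP Basic credentials; the function name and the example values are hypothetical, and the actual Synapse token endpoint and client authentication method are not specified here.

```python
import base64
from urllib.parse import urlencode


def build_refresh_request(client_id: str, client_secret: str, refresh_token: str):
    """Return (headers, body) for an RFC 6749 section 6 refresh request.

    Assumption: the confidential client authenticates with HTTP Basic
    credentials. Nothing is sent over the network here.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # The grant type is fixed; the refresh token is the one issued earlier
    # through the authorization code flow.
    body = urlencode({"grant_type": "refresh_token", "refresh_token": refresh_token})
    return headers, body


# Hypothetical values, for illustration only.
headers, body = build_refresh_request("workflow-engine", "s3cret", "rt-123")
```

The engine would POST the body with these headers to the token endpoint whenever its access token has expired, so a job that sits in a queue for longer than the access token lifetime can still run.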

Other Use Cases

These require work to be done by maintainers of one or more tools not maintained by the Platform team. Because OAuth is such a general spec, we track these only to understand if they are enabled by work done for other use cases.

This is similar to the workflow use case. The key difference is that jobs may not be triggered by a submission to an evaluation queue or other process. A user would need to manually generate and provide a refresh token and pass it to the client. This is similar to API keys, but the token would be scoped, individually revocable, and only usable by a particular client.

The client credentials are presumed to be confidential.

From the epic,

PLFM-4585

Another use case is Dockstore integration. (Note: at the time of this writing, we do not have a specific plan to integrate with Dockstore.) This system, a catalog of 'dockerized' scientific software tools, is a thin layer on top of GitHub and a Docker registry (Quay.io). To get Dockstore to integrate with the Synapse Docker registry we would likely have to implement an API similar to Quay's, which includes being an OAuth provider (http://docs.quay.io/api/).

Dockstore is a platform used by GA4GH where users can share Docker images that are described with workflow languages. Dockstore currently integrates with Quay, GitHub, GitLab, and more via OAuth to enable users to share content. Synapse acting as an OAuth 2.0 provider could allow Dockstore users to share Docker images stored in the Synapse Docker registry on their platform.

Dockstore has documentation that explains their existing integrations. Dockstore also has an open issue in their issue tracker for using Synapse as a Docker registry, but it has not been addressed as of the time of writing, and it is not clear what supplemental API requirements would be needed to fully enable this use case.

Dockstore publishes synapse-plugin, which seems to provide a way to access Synapse files via the Dockstore CLI. To configure it, a user must enter their Synapse API key. Ideally, this application would use a scoped token to access exactly the resources it needs.

Goals

  • Enable selected use cases that require OAuth 2.0

  • Simplify authentication mechanisms in Synapse

    • Deprecate session tokens and API keys, leaving username/password and OAuth 2 access tokens as the only two supported authentication mechanisms in Synapse.

    • Remove/discourage password use where it is not absolutely necessary

    • Consideration: authorization with username/password may at one point require MFA – this is incompatible with certain OAuth flows



       

Implied Use Cases based on Goals

OAuth 2.0 to authorize a first-party Synapse command line app

  • One consideration is which flow the command line app is supposed to follow, and how it uses it

    • Authorization code flow

      • Requires a browser

        • Option 1: Show a link to authenticate, display the authorization code in the browser, user is required to copy and paste it

          • Works in browserless environments

          • See GSUtil

        • Option 2: Auto-open a browser, user authenticates, redirect to the terminal

          • Least friction

          • Additional complexity/vector for attack if not using PKCE

          • Impossible in browserless scenarios and requires a fallback
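For reference, the PKCE values mentioned above can be derived as in the following sketch, which uses the S256 method from RFC 7636. The function names are illustrative and are not taken from any Synapse client.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Create a random code verifier.

    32 random bytes encode to 43 URL-safe characters, the minimum
    length that RFC 7636 allows.
    """
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

The command line app would send the challenge with the authorization request and keep the verifier in memory, presenting it only when it exchanges the authorization code for tokens. An attacker who intercepts the redirect then holds a code that cannot be redeemed without the verifier.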

OAuth 2.0 to authorize the GWT Web Client

  • Assumed to use Authorization Code flow

  • GWT uses a client-server model, so we should consider the possibility of storing an OAuth client secret on the server, if possible

    • If implemented client-side (i.e. public client credentials), PKCE should be required

  • Would require shifting all authentication activities to one centralized source (e.g. signin.synapse.org)

  • Strongly consider creating exceptions for certain OAuth clients that should have implied consent (lots of friction and confusion if a user has to consent to the web client accessing their data every time they sign in)

    • Implications of this?

Phases and Requirements

Each phase will be complete when all of its requirements (as well as all requirements in previous phases) are complete.

Phase 1: Short-lived access tokens to confidential clients to access OAuth 2/OIDC resources

Use Cases

  • Synapse as an identity provider

  • Short-lived access to Synapse resources

Requirements

  • Register OAuth 2.0 clients in Synapse

  • Use the authorization code flow to generate scoped access tokens for a client to act on behalf of a user

  • Issue user identity information compliant with OIDC 1.0 via OAuth 2
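To make the authorization code flow requirement concrete, the sketch below shows how a registered client might build the authorization request URL it redirects the user to. The endpoint, client ID, redirect URI, and scope names are placeholders chosen for illustration; they are assumptions, not the actual Synapse values.

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes, state=None):
    """Return (url, state) for an RFC 6749 section 4.1.1 authorization request."""
    # The state value lets the client match the redirect back to this request
    # and reject forged callbacks.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",       # selects the authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),     # scopes are space-delimited
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


# Placeholder values, for illustration only.
url, state = build_authorization_url(
    "https://auth.example.org/authorize",
    "client-1",
    "https://app.example.org/callback",
    ["openid", "view"],
)
```

Including the `openid` scope is what asks for OIDC identity information in addition to the scoped access token.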

Notes

At the time of writing, this phase has already been completed. For the sake of simplicity, all components are consolidated into one phase, and not all requirements may be captured here.

Phase 2: Long-lived confidential client authorization using authorization code flow

Use cases

  • Workflows

Requirements

  • Issue refresh tokens to confidential clients using the authorization code flow

    • PLFM-5753

  • Provide mechanisms to resource owners and clients to revoke refresh tokens

    • PLFM-6120
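One way to picture these two requirements together is a token store in which each refresh token is tied to one resource owner, one client, and a set of scopes, and either party can revoke it. The in-memory sketch below is only an illustration under those assumptions; the class and method names are hypothetical and do not describe the Synapse implementation, which would persist tokens (ideally as hashes) rather than keep them in a dictionary.

```python
import secrets
from dataclasses import dataclass


@dataclass
class RefreshTokenRecord:
    owner_id: str        # resource owner who granted access
    client_id: str       # the only client allowed to use the token
    scopes: frozenset    # scopes granted at authorization time
    revoked: bool = False


class RefreshTokenStore:
    """Hypothetical in-memory store; illustration only."""

    def __init__(self):
        self._records = {}

    def issue(self, owner_id, client_id, scopes):
        token = secrets.token_urlsafe(32)
        self._records[token] = RefreshTokenRecord(owner_id, client_id, frozenset(scopes))
        return token

    def revoke_by_owner(self, token, owner_id):
        """A resource owner may revoke only tokens they granted."""
        record = self._records.get(token)
        if record is None or record.owner_id != owner_id:
            return False
        record.revoked = True
        return True

    def revoke_by_client(self, token, client_id):
        """A client may revoke only tokens issued to it."""
        record = self._records.get(token)
        if record is None or record.client_id != client_id:
            return False
        record.revoked = True
        return True

    def is_usable(self, token, client_id):
        record = self._records.get(token)
        return record is not None and not record.revoked and record.client_id == client_id
```

Because each record is independent, revoking one token leaves the owner's other authorizations untouched.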

Notes

The use case dictates that OAuth clients in this phase will necessarily

  • be confidential clients

  • use the authorization code flow

so we do not need PKCE at this time.

For more information, see the design document, OAuth 2.0 Refresh Tokens and Revocation.

Phase 3: OAuth2 public client access via authorization code flow or generated refresh tokens

Use cases

  • Synapse command line clients

  • Cron jobs

Requirements

  • Implement optional PKCE in authorization code flow

    • Consider identifying public and confidential status of clients, and handling them differently (e.g. public clients must use PKCE)

  • Allow users to manually issue scoped refresh tokens for particular clients

  • Consider supporting authorization code flow in the Python client

    • Not strictly necessary if users can manually issue a token.
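The idea of optional PKCE that is mandatory for public clients could be enforced at the token endpoint roughly as follows. This is a sketch under the assumption that the server stores the S256 code challenge alongside the authorization code; the function name and parameters are hypothetical.

```python
import base64
import hashlib
import hmac


def pkce_check(is_public_client, stored_challenge, code_verifier):
    """Decide whether a token request passes the PKCE policy.

    stored_challenge is the S256 code_challenge saved with the authorization
    code, or None if the authorization request carried none.
    """
    if stored_challenge is None:
        # PKCE was not used: acceptable only for confidential clients.
        return not is_public_client
    if code_verifier is None:
        # A challenge was registered, so the verifier is required.
        return False
    try:
        digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    except UnicodeEncodeError:
        return False
    computed = base64.urlsafe_b64encode(digest).rstrip(b"=")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(computed, stored_challenge.encode("utf-8"))
```

With this policy, confidential clients from Phase 2 keep working unchanged, while a public client such as a command line app is rejected unless it completes PKCE.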

Notes

Phase 4: “Frictionless” authorization for OAuth via first party apps

Use cases

  • OAuth in the GWT client

  • Deprecating session tokens

Requirements

  • Consent skipping for specific clients, including the GWT client.

  • Determination that session tokens are not used elsewhere

What Shouldn’t Be Built

This section includes a sample of proposed solutions that have been rejected, along with the reasoning for each rejection.

Item

Rejection reason

Resource-owner password credentials grant type

Documents about OAuth best practices argue against using this grant type. The crux of most arguments is this:

  • OAuth is designed to be used by third party clients

  • Third party clients should not be trusted with resource-owner credentials

  • ROPC relies on handing resource-owner credentials to a third party client

  • Thus ROPC is not secure, and should not be used

Clients considered secure enough to contain passwords should not use OAuth. OAuth may be used in first-party clients (e.g. the GWT web client), but if so, it should be treated like a third-party client, and credentials should not be entered into it.

Device code flow

The primary use case for device code flow is on input-constrained devices, like smart TVs. This type of use case does not exist for Synapse. In places where the device code flow could be used (e.g. a command line app), the authorization code flow will also be possible.