Implementing OAuth2 into Synapse

This page is starting as a collection of notes, design decisions, etc. related to implementing OAuth2 into Synapse. Part of the process has included considerations about developing our own library, or using an off-the-shelf solution like ORY Hydra. The information on this page may change as the project evolves.

See also: @Bruce Hoff's presentation on his preliminary research into OAuth2 and how it relates to Synapse.

More reading for historical purposes: Synapse as OAuth 2.0 Provider

And a Jira Epic: PLFM-4585

Good summary of OIDC: https://github.com/dexidp/dex/blob/master/Documentation/openid-connect.md

Use cases

High level use-cases, per @Bruce Hoff's presentation:

  1. Let third-party (web) applications securely access a user’s data in Synapse.  Today such applications must either

    1. predownload/embed data,

    2. use the application author’s Synapse credentials, or

    3. prompt the user for their Synapse credentials

  2. Let a headless batch job (e.g., a “workflow”) securely access a user’s data in Synapse.  Today such a process must either

    1. use predownloaded data, or

    2. use the job runner’s Synapse credentials

Brief Overview of OAuth2

This document presupposes a basic (not necessarily thorough) level of understanding of OAuth2. There are four authorization grant flows in the OAuth spec. They can be summarized:

  • Authorization Code grant (most secure, client secret confidentiality must be guaranteed)

    1. Upon user consent, an OAuth client (3rd party) is granted an authorization code

    2. The authorization code can be used with a client secret to obtain a scoped access token.

    3. The access token can be used to access resources until it expires or is revoked

    4. The access token can be refreshed by the client with a refresh token and the client secret.

  • Implicit grant (client secret confidentiality cannot be guaranteed)

    1. Upon user consent, an OAuth client (3rd party) is granted a scoped access token.

    2. The token can be used to access resources until it is expired or revoked, but it cannot be refreshed. The time that the token is active is typically very short (minutes).

  • Resource owner password credentials (not secure, especially with an untrusted client)

    1. The user provides their username and password to the OAuth client

    2. The OAuth client uses the credentials to obtain a scoped access token

  • Client credentials (used for cases where clients manage their own resources, i.e. not really authorization delegation)

    1. The OAuth client can request an access token with their client ID and client secret
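
Step 1 of the Authorization Code grant above starts with the client sending the user to the authorization server. A minimal sketch of how a client might build that request URL is below. The endpoint URL, client ID, and function name are hypothetical; the parameter names come from RFC 6749.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope, state=None):
    """Build the URL a client redirects the user to in order to request a code."""
    if state is None:
        # Unguessable value the client later checks on the callback
        state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",  # authorization code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


url, state = build_authorization_url(
    "https://auth.example.org/authorize",  # hypothetical endpoint
    client_id="client-123",                # hypothetical client ID
    redirect_uri="https://client.example.org/callback",
    scope="read",
)
params = parse_qs(urlparse(url).query)
```

The client keeps `state` (for example in the user's session) so it can compare it with the value returned alongside the authorization code.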

OAuth2 in Synapse - API Design

The current proposal is to introduce OAuth2 authorization code flow into Synapse. OAuth clients would be instructed to use authorization codes only.

Backend

We can separate the endpoints required on the backend based on task.

Client Management

Basic CRUD for OAuth2.0 clients.

Verb

Endpoint

Purpose

Request

Response

Notes

GET

/oauth2/client/{id}

Get details about one client

Path param:

id: an existing OAuth2 Client ID

OAuth2Client

name: String

redirect_uri: String

client_id: Unique

created_by/on

modified_by/on

Only the owner or a Synapse admin can make this request.

GET

/oauth2/client/

List clients created by user



List of above.

Don't return secret.

POST

/oauth2/client

Create a client

OAuth2Client

name: String

redirect_uri: String

Supplemental params

OAuth2Client

name: String

redirect_uri: String

client_id: Unique

client_secret: String

created_by/on

modified_by/on

Supplemental params could include the URL to a logo, link to a website for the app, terms of service, etc.  See https://openid.net/specs/openid-connect-registration-1_0.html

Typically the secret key is used to perform these actions, but if we give Synapse users a claim over an OAuth2 client, we could use their credentials.

DELETE

/oauth2/client/{id}

Delete a client

Path param:

id: an existing OAuth2 Client ID

None



PUT

/oauth2/client/{id}

Update a client

OAuth2Client

name: String

redirect_uri: String

Supplemental params

OAuth2Client

name: String

redirect_uri: String

client_id: Unique

created_by/on

modified_by/on

Supplemental params

Typically the secret key is used to perform these actions, but if we give Synapse users a claim over an OAuth2 client, we could use their credentials.
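
The client management rules above (the secret is returned only on creation, and only the owner or a Synapse admin can read a client) could be sketched as follows. This is an in-memory illustration only: the function names, the dictionary standing in for a database, and the choice to keep only a hash of the secret are assumptions, not decisions made on this page.

```python
import hashlib
import secrets
from dataclasses import dataclass, asdict


@dataclass
class OAuth2Client:
    name: str
    redirect_uri: str
    client_id: str
    created_by: str
    secret_hash: str  # assumption: only a hash of the secret is stored


_clients = {}  # in-memory stand-in for a database table


def public_view(client_id):
    """Representation used for GET/list responses; never includes the secret."""
    data = asdict(_clients[client_id])
    data.pop("secret_hash")
    return data


def create_client(name, redirect_uri, created_by):
    """Create a client; the plain-text secret is returned only here, once."""
    client_id = secrets.token_hex(8)
    client_secret = secrets.token_urlsafe(32)
    secret_hash = hashlib.sha256(client_secret.encode()).hexdigest()
    _clients[client_id] = OAuth2Client(name, redirect_uri, client_id, created_by, secret_hash)
    return {**public_view(client_id), "client_secret": client_secret}


def get_client(client_id, requester, is_admin=False):
    """Only the owner or a Synapse admin may read a client."""
    client = _clients[client_id]
    if requester != client.created_by and not is_admin:
        raise PermissionError("not the owner of this client")
    return public_view(client_id)
```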

Authorization/Consent Requests

These endpoints are necessary for users to approve/reject OAuth2.0 access requests

Verb

Endpoint

Purpose

Request Object/Params

Response Object/Params

Notes

POST

/login/scoped

Get a scoped access token

sessionToken: String

scope: String

scopedLoginResponse:

scopedSessionToken: String

acceptsTermsOfUse: Boolean

scope: String

exp: Integer (seconds until expiry)

This is a more secure alternative to the current session token, as it limits what can be done with the token.

These can (should?) expire quickly (minutes-hours).

This could be done by passing in username and password rather than sessionToken, but we'd have to handle the case where the user has no password and logs in via Google (or other OAuth provider).

GET

/oauth2/details

Get human-interpretable details about the requesting client, and the scope that they are requesting

Parameters

clientId: Unique (the ID of an existing OAuth2 client requesting access)

scope: String

OAuth2Client

scopes: Array<string> 

e.g. [("read", "syn123"), ("create","syn456")]

(actual representation TBD)

The web layer can use this to get details about a client requesting authorization and the scope they request

POST

/oauth2/consent

The user grants access to the OAuth2 Client to access protected resources

URL Parameters:

response_type: String (always "code")

client_id: Unique

redirect_uri: String (points to OAuth client)

scope: String

state: String

If scopedAccessToken is valid:

Body:

OAuthClientUrl: String

redirect_uri?code={code}&state={state} (all provided in request)

Parameters:

code: the authorization code

state: the same value in the request

Who should execute this? The User Agent or the Web Layer on behalf of the user agent?

Question: how to handle with various Synapse IdPs? (E.g. Synapse users who sign in with Google accounts).

The "state" parameter is designed to avoid CSRF attacks and the client must utilize it per RFC-6749 § 10.12. More info.

POST

/oauth2/revoke

A logged in user can revoke OAuth2 client access using this method.

OAuth2RevokeRequest

client_id: unique

Is there a need for more granularity?

None

Revoking access is not in the OAuth2 spec, but allowing users to revoke client access may be important.  Revocation should be at the token level, not at the client level.
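
As a rough illustration of the /oauth2/consent response and the "state" check described above, the sketch below builds the redirect back to the client and shows how a client might reject a callback whose state does not match. The function names are hypothetical, and it assumes the redirect URI has already been validated against the registered client.

```python
import hmac
from urllib.parse import urlencode, urlparse, parse_qs


def build_consent_redirect(redirect_uri, code, state):
    """Server side: append the authorization code and state to the redirect URI."""
    separator = "&" if urlparse(redirect_uri).query else "?"
    return redirect_uri + separator + urlencode({"code": code, "state": state})


def extract_code(callback_url, expected_state):
    """Client side: accept the code only if state matches what was sent."""
    params = parse_qs(urlparse(callback_url).query)
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison of the two state values
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: possible CSRF")
    return params["code"][0]
```

A client would call `extract_code` with the state value it generated before redirecting the user, and only then exchange the code for a token.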

Token Requests

These endpoints would be used by OAuth2.0 clients to retrieve tokens with an authorization code

Verb

Endpoint

Purpose

Request

Response

Notes

POST

/oauth2/token

Called by a client to get an access token

Body:

OAuth2AuthorizationCodeTokenRequest

grant_type: String (always "authorization_code" for this call)

code: String (the authorization code)

redirect_uri: String (should be the same as previous redirect uri)

client_id: Unique

client_secret: String

Body:

OAuth2AccessToken

access_token: String

token_type: String ("Bearer")

expires_in: Integer (seconds)

refresh_token: String

(optionally, scope)

As per OIDC the response should include both an access token and an ID Token, https://openid.net/specs/openid-connect-core-1_0.html#TokenResponse



Authorization codes must be single-use and short-lived. If an authorization code is used more than once, we should revoke the active access token retrieved with the code. More info @ RFC-6749 § 10.5

The redirect URI should be validated here before granting a token, along with the credentials in the request. More info @ RFC-6749 § 10.6.

The token should be opaque/unguessable. More info @ RFC-6749 § 10.3.  Update:  token could be JWT, https://www.oauth.com/oauth2-servers/access-tokens/self-encoded-access-tokens/.  Advantage is that permissions are 'built in' to the token.  Downside is that it becomes irrevocable.  So let's not use JWT.  

The token type in almost all OAuth2 cases is "Bearer". We can use a different token type (e.g. HMAC, or make our own) if we want to. See RFC-6749 § 7.1. Additional info.

POST

/oauth2/token/refresh

Called by a client to refresh an access token

Body:

OAuth2AuthorizationCodeRefreshTokenRequest

grant_type: String (always "refresh_token" for this call)

refresh_token: String

client_id: Unique

client_secret: String

Body:

OAuth2AccessToken

access_token: String

token_type: String ("Bearer")

expires_in: Integer (seconds)

refresh_token: String

Client authentication must be done here. This must be done over TLS. More info @ RFC-6749 § 10.4

POST

/oauth2/token/introspect

or

/oauth2/token/info

Clients can determine if an access token is valid (and get its scope, if the scope is not readable from the token itself)

Body:

OAuth2TokenIntrospectionRequest

token: String

client_id: Unique

client_secret: String

Body:

OAuth2TokenIntrospectionResponse

active: Boolean

client_id: Unique

username: String (principal of user who authorized the token)

exp: Date (seconds until expiration)

scope: Array<String> (human-interpretable scope)

This endpoint is not strictly necessary, but we should strongly consider including this if we decide to not include scope with the access token.



See RFC-7662.
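
The notes on /oauth2/token above require authorization codes to be single-use and short-lived, the redirect URI to be re-checked, and the issued token to be revoked if a code is reused. A minimal in-memory sketch of those rules follows. The names, the 60-second lifetime, and the dictionaries standing in for storage are assumptions; client secret verification is omitted for brevity.

```python
import secrets
import time

CODE_TTL_SECONDS = 60  # assumed lifetime; the real value is a design decision

_codes = {}          # authorization code -> record
_active_tokens = {}  # access token -> authorization code it came from


def issue_code(client_id, redirect_uri, now=None):
    """Issue a short-lived authorization code bound to client and redirect URI."""
    now = time.time() if now is None else now
    code = secrets.token_urlsafe(32)
    _codes[code] = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "expires_at": now + CODE_TTL_SECONDS,
        "token": None,  # set once the code has been redeemed
    }
    return code


def redeem_code(code, client_id, redirect_uri, now=None):
    """Exchange a code for an opaque access token, enforcing the rules above."""
    now = time.time() if now is None else now
    record = _codes.get(code)
    if record is None:
        raise ValueError("unknown authorization code")
    if record["token"] is not None:
        # Code reuse: revoke the token that was obtained with this code
        _active_tokens.pop(record["token"], None)
        raise ValueError("authorization code already used")
    if now > record["expires_at"]:
        raise ValueError("authorization code expired")
    if record["client_id"] != client_id or record["redirect_uri"] != redirect_uri:
        raise ValueError("client or redirect_uri mismatch")
    token = secrets.token_urlsafe(32)  # opaque, unguessable token
    record["token"] = token
    _active_tokens[token] = code
    return token


def is_token_active(token):
    return token in _active_tokens
```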

OpenID Connect services



Verb

Endpoint

Purpose

Request

Response

Notes

GET

/.well-known/openid-configuration

return metadata about the service

N/A

See spec →

Key elements are:

https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfigurationRespon

GET

/oauth2/userinfo

return information about the user

N/A

Body:

sub: "subject", Synapse user id

aud: "audience", the OAuth client ID

iat: "issued at", the timestamp when the response was created

given_name:  first name

family_name:  last name



https://openid.net/specs/openid-connect-basic-1_0.html#UserInfo



Note: if the response is signed or encrypted, the content type must be application/jwt, as per https://openid.net/specs/openid-connect-core-1_0.html#UserInfoResponse

POST

/oauth2/userinfo

as above

N/A

as above

Although GET is the recommended HTTP method, POST must be supported, as per https://openid.net/specs/openid-connect-core-1_0.html#UserInfo
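
If the UserInfo response is returned as a JWT, the claims listed above would be wrapped in a signed token. The sketch below shows the general shape using only the Python standard library and an HMAC (HS256) signature. This is an illustration under assumptions: the function names and key are hypothetical, and a real deployment would more likely use a vetted JWT library and an asymmetric key published through the discovery document.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def encode_userinfo_jwt(claims: dict, key: bytes) -> str:
    """Serialize claims as a compact HS256-signed JWT."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        _b64url(json.dumps(header, separators=(",", ":")).encode())
        + "."
        + _b64url(json.dumps(claims, separators=(",", ":")).encode())
    )
    signature = hmac.new(key, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def decode_userinfo_jwt(token: str, key: bytes) -> dict:
    """Verify the signature and return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = _b64url(hmac.new(key, signing_input.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("bad signature")
    payload = signing_input.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


claims = {
    "sub": "1234567",      # hypothetical Synapse user id
    "aud": "client-123",   # hypothetical OAuth client ID
    "iat": 1700000000,
    "given_name": "Jane",
    "family_name": "Doe",
}
token = encode_userinfo_jwt(claims, b"demo-key")
```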

Web Layer Interfaces (Portal)

The portal needs to implement interfaces for handling the components of OAuth2 that are best accomplished through user interfaces.

Page

Purpose

Actions

OAuth2 Authentication

Provides an interface for the user to authenticate in order to manage OAuth2 requests

Prompt for authentication (u:p/OAuth login)

Retrieve a scoped access token on behalf of the user (which can only be used to authorize OAuth2 authorization requests)

Redirect to/render OAuth2.0 Authorization Request

OAuth2 Authorization

Provides an interface for the user to approve/reject OAuth requests

Retrieve/interpret client details + scope and display to the user "Do you want Client123 to have full access over syn123"

Approve/reject OAuth2 request on behalf of the user

Redirect user-agent to OAuth2AuthorizedUrl (provided by backend)

Diagrams to show where these API endpoints would be used and what objects/params are needed:





How do we know if we're up-to-spec when we are done?

If we implement OIDC, there is a process to become OIDC-certified here: https://openid.net/certification/

I think that to have OIDC configured properly, you must have OAuth2 configured properly, so this would cover both cases. (I would like to investigate this further if we do decide to implement OIDC.)

If we do not implement OIDC (just OAuth2) there doesn't seem to be any certification process that I can find. We may have to read the spec ourselves and hope we don't miss anything with thorough tests, future security audits, etc.

What is "scope"?

JIRA: PLFM-5170

In short: scopes are clearly defined permissions that a user may grant to an OAuth client.

In ORY Hydra, OAuth clients may be limited in the scope they can request (e.g. a photo printing service (OAuth client) may be restricted to only acquire read-photo permission, even if they attempt to request edit-photo permission). This is not a requirement of our implementation, but it is worth consideration.

From RFC-6749

The authorization and token endpoints allow the client to specify the scope of the access request using the "scope" request parameter. In turn, the authorization server uses the "scope" response parameter to inform the client of the scope of the access token issued.

The value of the scope parameter is expressed as a list of space-delimited, case-sensitive strings. The strings are defined by the authorization server. If the value contains multiple space-delimited strings, their order does not matter, and each string adds an additional access range to the requested scope.

    scope = scope-token *( SP scope-token )
    scope-token = 1*( %x21 / %x23-5B / %x5D-7E )

The authorization server MAY fully or partially ignore the scope requested by the client, based on the authorization server policy or the resource owner's instructions. If the issued access token scope is different from the one requested by the client, the authorization server MUST include the "scope" response parameter to inform the client of the actual scope granted. If the client omits the scope parameter when requesting authorization, the authorization server MUST either process the request using a pre-defined default value or fail the request indicating an invalid scope. The authorization server SHOULD document its scope requirements and default value (if defined).
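
The RFC-6749 scope syntax quoted above can be checked mechanically. The sketch below parses a scope value into an unordered set of case-sensitive tokens and rejects characters outside the scope-token grammar. The function name is hypothetical; the character ranges are taken from the ABNF in the quote.

```python
import re

# scope-token = 1*( %x21 / %x23-5B / %x5D-7E )
_SCOPE_TOKEN = re.compile(r"[\x21\x23-\x5B\x5D-\x7E]+")


def parse_scope(value: str) -> frozenset:
    """Split a scope parameter on single spaces and validate each token."""
    tokens = value.split(" ")
    for token in tokens:
        if not _SCOPE_TOKEN.fullmatch(token):
            raise ValueError(f"invalid scope-token: {token!r}")
    # A set reflects that the order of scope tokens does not matter
    return frozenset(tokens)
```

Because the result is a set, `parse_scope("read write")` and `parse_scope("write read")` compare equal, while `"Read"` and `"read"` remain distinct tokens.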

The actual encoding and representation of scope is not necessarily within the scope of this document (ha ha), but it is something that will likely strongly guide our implementation of OAuth. Naturally, how we design scope should be informed by use cases. We should methodically determine what access we wish to grant OAuth2 clients, as well as how much granularity (both breadth of permissions and Synapse object access) we can/should reasonably encode. Additionally, this should be informed by how we have designed and use ACLs, since this paradigm of authorizing access to content in Synapse is likely to be the driver of OAuth-based content authorization.

An OAuth2 client developer must be able to determine the scope that they need when designing their OAuth2 client service, so it is also critical that we document the scopes we support.

Examples of OAuth2 scope documentation in the wild:

If we choose to be ultra-granular with scope (e.g. only granting read access to one particular file, with a permission structure like read:syn123), we will likely have to store these scopes in a database. One implementation option is to make scope a UUID and store the actual scope information in a database; the downside: client developers may have to generate scope UUIDs on-the-fly. This also seems to conflict with the design decisions of other OAuth providers (see external documentation samples above).
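
To make the ultra-granular option concrete, here is a small sketch of how scopes of the form read:syn123 might be parsed and checked. The "permission:entity" encoding, the names, and the exact-match check are all assumptions for illustration; the actual representation is still to be decided, and this sketch does not handle hierarchy (e.g. access to the children of a folder).

```python
from typing import NamedTuple


class GrantedScope(NamedTuple):
    permission: str
    entity_id: str


def parse_granular_scope(value: str) -> set:
    """Parse 'read:syn123 create:syn456' into (permission, entity) pairs."""
    scopes = set()
    for token in value.split():
        permission, sep, entity_id = token.partition(":")
        if not sep or not permission or not entity_id:
            raise ValueError(f"expected permission:entity, got {token!r}")
        scopes.add(GrantedScope(permission, entity_id))
    return scopes


def is_allowed(granted: set, permission: str, entity_id: str) -> bool:
    """Exact-match check of a requested action against the granted scopes."""
    return GrantedScope(permission, entity_id) in granted


granted = parse_granular_scope("read:syn123 create:syn456")
```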

Worth looking into: UMA

Stack overflow that led me towards UMA: https://softwareengineering.stackexchange.com/questions/372526/how-to-handle-per-resource-fine-grained-permissions-in-oauth

The spec: https://docs.kantarainitiative.org/uma/ed/uma-core-2.0-01.html

Possibly of interest is the HEART Working group: https://openid.net/wg/heart/

HEART WG aims to standardize data sharing in healthcare using FHIR, OAuth, OIDC, and UMA. They seem to be more aimed at patients having control over the data they share, but some of their specs may be appropriate for some of our use cases.

ORY Hydra

This is a good place to collect information and research relevant to using ORY Hydra to implement OAuth2 and OIDC in Synapse. This portion of the document is not complete, and may not be completed if we choose not to use ORY Hydra to implement OAuth2.

Why ORY Hydra:

Per @Bruce Hoff's preliminary research, ORY Hydra is one of very few (and perhaps the only) established, off-the-shelf OAuth2 + OIDC solutions that delegate the authorization provider and resource provider roles to external services. Since Synapse already has this infrastructure in place (excepting interfaces for the OAuth2 authorization flow), we can use Hydra to handle some of the more complicated parts of implementing OAuth2 and OIDC, and create our own authorization/resource provider services that use existing Synapse infrastructure.

Why NOT ORY Hydra:

To be expanded upon later, and points may be focused upon later in this document.

  • Possible future maintenance costs, incompatibilities with our infrastructure, and other tech-debt related concerns

  • Overkill for our use case?

Does Hydra work with our use cases?

For example, can we implement our proposed scope pattern into Hydra? What other constraints exist in Hydra that may be major roadblocks? 

ORY Hydra in the wild

Can we find cases of other engineers using Hydra?

What problems have they run into?

What benefits have they seen that Hydra provides?

What other potentially crucial information about Hydra can we find on the internet?

ORY Hydra authentication + consent flow

See the user guide

Hydra stores its own authentication sessions (this would be a redundancy in Synapse). Additionally, the Hydra documentation instructs implementers not to use sessions in the authentication provider (I think we probably could if we wanted to, but we would want to be thorough in our understanding of how Hydra handles sessions in these cases).

  1. An OAuth client will redirect a user to a page controlled by the Hydra server to authenticate the user.

    1. If the user has an existing Hydra authentication session go to #,

  2. With no existing authentication session, the user is redirected to an authentication provider (i.e. Synapse login) to enter their credentials.

  3. The authentication provider will authenticate the user and communicate info about the user directly to Hydra

  4. The user is redirected to Hydra for authorization, which will redirect the user to the consent provider (i.e. Synapse).

  5. The consent provider will ask the user for permission to grant access to the requested resources and will directly tell Hydra if the user accepts or rejects the request.

  6. The user is redirected to the OAuth client. Hydra handles authorization code and token generation.

Decisions and requirements to deploy via Cloudformation

See: PLFM-5163

Internal configuration

We should tailor ORY Hydra's configuration to meet our needs

Database