Implementing OAuth2 into Synapse
- 1 Use cases
- 2 Brief Overview of OAuth2
- 3 OAuth2 in Synapse - API Design
- 4 What is "scope"?
- 5 ORY Hydra
- 5.1 Why ORY Hydra:
- 5.2 Why NOT ORY Hydra:
- 5.3 Does Hydra work with our use cases?
- 5.4 ORY Hydra in the wild
- 5.5 ORY Hydra authentication + consent flow
- 5.6 Decisions and requirements to deploy via CloudFormation
- 5.6.1 Internal configuration
- 5.6.1.1 Database
- 5.6.2 Infrastructure Configuration
- 5.6.2.1 Choosing an ELB type
- 5.6.2.2 VPC
- 6 Spring Security
- 7 Another library to look into: Connect2ID's OAuth2.0 SDK with OpenID Connect
This page is starting as a collection of notes, design decisions, etc. related to implementing OAuth2 into Synapse. Part of the process has included considerations about developing our own library, or using an off-the-shelf solution like ORY Hydra. The information on this page may change as the project evolves.
See also: @Bruce Hoff's presentation on his preliminary research into OAuth2 and how it relates to Synapse.
More reading for historical purposes: Synapse as OAuth 2.0 Provider
And a Jira Epic:
Good summary of OIDC: https://github.com/dexidp/dex/blob/master/Documentation/openid-connect.md
Use cases
High level use-cases, per @Bruce Hoff's presentation:
Let third party (web) applications securely access a user’s data in Synapse. Today such applications must either
predownload/embed data,
use the application author’s Synapse credentials, or
prompt the user for their Synapse credentials
Let a headless batch job (e.g., a “workflow”) securely access a user’s data in Synapse. Today such a process must either
use predownloaded data, or
use the job runner’s Synapse credentials
Brief Overview of OAuth2
This document presupposes a basic (not necessarily thorough) level of understanding of OAuth2. There are four authorization grant flows in the OAuth spec. They can be summarized:
Authorization Code grant (most secure, client secret confidentiality must be guaranteed)
Upon user consent, an OAuth client (3rd party) is granted an authorization code
The authorization code can be used with a client secret to obtain a scoped access token.
The access token can be used to access resources until it expires or is revoked
The access token can be refreshed by the client with a refresh token and the client secret.
Implicit grant (client secret confidentiality cannot be guaranteed)
Upon user consent, an OAuth client (3rd party) is granted a scoped access token.
The token can be used to access resources until it is expired or revoked, but it cannot be refreshed. The time that the token is active is typically very short (minutes).
Resource owner password credentials (not secure, especially with an untrusted client)
The user provides their username and password to the OAuth client
The OAuth client uses the credentials to obtain a scoped access token
Client credentials (used for cases where clients manage their own resources, i.e. not really authorization delegation)
The OAuth client can request an access token with their client ID and client secret
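As a rough illustration of the first step of the authorization code grant summarized above, the sketch below shows how an OAuth client might build the authorization request and later validate the redirect it receives. The `AUTHORIZE_URL` value, the function names, and the example hosts are hypothetical and are not part of any existing Synapse API.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical authorization endpoint, for illustration only.
AUTHORIZE_URL = "https://synapse.example.org/oauth2/authorize"


def build_authorization_url(client_id, redirect_uri, scope):
    """Return (url, state) for the start of the authorization code grant."""
    # The state value ties the eventual callback to this request (CSRF defence).
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}", state


def extract_code(callback_url, expected_state):
    """Read the authorization code from the redirect, rejecting a state mismatch."""
    params = parse_qs(urlparse(callback_url).query)
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison of the returned and expected state values.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: possible CSRF")
    if "code" not in params:
        raise ValueError("no authorization code in callback")
    return params["code"][0]
```

The client would keep `state` in the user's session between the two calls, then exchange the returned code (together with its client secret) for a token.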
OAuth2 in Synapse - API Design
The current proposal is to introduce OAuth2 authorization code flow into Synapse. OAuth clients would be instructed to use authorization codes only.
Backend
We can separate the endpoints required on the backend based on task.
Client Management
Basic CRUD for OAuth2.0 clients.
Verb | Endpoint | Purpose | Request | Response | Notes |
|---|---|---|---|---|---|
GET | /oauth2/client/{id} | Get details about one client | Path param: id: an existing OAuth2 Client ID | OAuth2Client name: String redirect_uri: String client_id: Unique created_by/on modified_by/on | Only the owner or a Synapse admin can make this request. |
GET | /oauth2/client/ | List clients created by user | | List of above. | Don't return secret. |
POST | /oauth2/client | Create a client | OAuth2Client name: String redirect_uri: String Supplemental params | OAuth2Client name: String redirect_uri: String client_id: Unique client_secret: String created_by/on modified_by/on | Supplemental params could include the URL to a logo, link to a website for the app, terms of service, etc. See https://openid.net/specs/openid-connect-registration-1_0.html Typically the secret key is used to perform these actions, but if we give Synapse users a claim over an OAuth2 client, we could use their credentials. |
DELETE | /oauth2/client/{id} | Delete a client | Path param: id: an existing OAuth2 Client ID | None | |
PUT | /oauth2/client/{id} | Update a client | OAuth2Client name: String redirect_uri: String Supplemental params | OAuth2Client name: String redirect_uri: String client_id: Unique created_by/on modified_by/on Supplemental params | Typically the secret key is used to perform these actions, but if we give Synapse users a claim over an OAuth2 client, we could use their credentials. |
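To make the client management table more concrete, here is a minimal in-memory sketch of the CRUD behaviour it describes: the secret is returned only when the client is created, and only the owner or an admin can read or delete a client. The `ClientRegistry` class and its method names are assumptions made for illustration; a real implementation would sit behind the REST endpoints and persist to a database.

```python
import secrets
from dataclasses import asdict, dataclass


@dataclass
class OAuth2Client:
    name: str
    redirect_uri: str
    client_id: str
    client_secret: str
    created_by: str


class ClientRegistry:
    """Hypothetical in-memory stand-in for the /oauth2/client endpoints."""

    def __init__(self):
        self._clients = {}

    def create(self, user_id, name, redirect_uri):
        client = OAuth2Client(
            name=name,
            redirect_uri=redirect_uri,
            client_id=secrets.token_hex(8),
            client_secret=secrets.token_urlsafe(32),
            created_by=user_id,
        )
        self._clients[client.client_id] = client
        # The secret is included only in the creation response.
        return asdict(client)

    def _owned(self, user_id, client_id, is_admin):
        client = self._clients[client_id]
        if client.created_by != user_id and not is_admin:
            raise PermissionError("only the owner or an admin may access this client")
        return client

    def get(self, user_id, client_id, is_admin=False):
        return self._public_view(self._owned(user_id, client_id, is_admin))

    def list_for(self, user_id):
        return [self._public_view(c) for c in self._clients.values()
                if c.created_by == user_id]

    def delete(self, user_id, client_id, is_admin=False):
        self._owned(user_id, client_id, is_admin)
        del self._clients[client_id]

    @staticmethod
    def _public_view(client):
        # Never return the secret from read endpoints.
        view = asdict(client)
        del view["client_secret"]
        return view
```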
Authorization/Consent Requests
These endpoints are necessary for users to approve/reject OAuth2.0 access requests
Verb | Endpoint | Purpose | Request Object/Params | Response Object/Params | Notes |
|---|---|---|---|---|---|
POST | /login/scoped | Get a scoped access token | sessionToken: String scope: String | scopedLoginResponse: scopedSessionToken: String acceptsTermsOfUse: Boolean scope: String exp: Integer (seconds until expiry) | This is a more secure alternative to the current session token because it limits what can be done with the token. These can (should?) expire quickly (minutes to hours). This could be done by passing in username and password rather than sessionToken, but we'd have to handle the case where the user has no password and logs in via Google (or other OAuth provider). |
GET | /oauth2/details | Get human-interpretable details about the requesting client, and the scope that they are requesting | Parameters clientId: Unique (the ID of an existing OAuth2 client requesting access) scope: String | OAuth2Client scopes: Array<string> e.g. [("read", "syn123"), ("create","syn456")] (actual representation TBD) | The web layer can use this to get details about a client requesting authorization and the scope they request |
POST | /oauth2/consent | The user grants access to the OAuth2 Client to access protected resources | URL Parameters: response_type: String (always "code") client_id: Unique redirect_uri: String (points to OAuth client) scope: String state: String | If scopedAccessToken is valid: Body: OAuthClientUrl: String redirect_uri?code={code}&state={state} (all provided in request) Parameters: code: the authorization code state: the same value in the request | Who should execute this? The User Agent or the Web Layer on behalf of the user agent? Question: how to handle with various Synapse IdPs? (E.g. Synapse users who sign in with Google accounts). The "state" parameter is designed to avoid CSRF attacks and the client must utilize it per RFC-6749 § 10.12. More info. |
POST | /oauth2/revoke | A logged in user can revoke OAuth2 client access using this method. | OAuth2RevokeRequest client_id: unique Is there a need for more granularity? | None | Revoking access is not part of the core OAuth2 spec (RFC 6749; token revocation is specified separately in RFC 7009), but allowing users to revoke client access may be important. Revocation should be at the token level, not at the client level. |
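The consent step in the table above could look roughly like the following sketch: once the user approves, the server records a short-lived, single-use authorization code and returns the client's redirect URI with `code` and `state` appended. The `grant_consent` function, the `issued_codes` store, and the 60-second lifetime are assumptions for illustration only.

```python
import secrets
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CODE_TTL_SECONDS = 60  # assumed lifetime; the document only says "short-lived"

# Hypothetical store: authorization code -> details of the grant.
issued_codes = {}


def grant_consent(user_id, client_id, registered_redirect_uri,
                  redirect_uri, scope, state, now=None):
    """Record the user's consent and return the URL to redirect the user agent to."""
    # Only redirect to the URI the client registered.
    if redirect_uri != registered_redirect_uri:
        raise ValueError("redirect_uri does not match the registered value")
    now = time.time() if now is None else now
    code = secrets.token_urlsafe(32)
    issued_codes[code] = {
        "user_id": user_id,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "expires_at": now + CODE_TTL_SECONDS,
        "used": False,
    }
    # Append code and state to any query string already on the redirect URI.
    parts = urlsplit(redirect_uri)
    query = parse_qsl(parts.query) + [("code", code), ("state", state)]
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The `state` value is echoed back unchanged so that the client can match the response to its original request.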
Token Requests
These endpoints would be used by OAuth2.0 clients to retrieve tokens with an authorization code
Verb | Endpoint | Purpose | Request | Response | Notes |
|---|---|---|---|---|---|
POST | /oauth2/token | Called by a client to get an access token | Body: OAuth2AuthorizationCodeTokenRequest grant_type: String (always "authorization_code" for this call) code: String (the authorization code) redirect_uri: String (should be the same as previous redirect uri) client_id: Unique client_secret: String | Body: OAuth2AccessToken access_token: String token_type: String ("Bearer") expires_in: Integer (seconds) refresh_token: String (optionally, scope) | As per OIDC the response should include both an access token and an ID Token, https://openid.net/specs/openid-connect-core-1_0.html#TokenResponse Authorization codes must be single-use and short-lived. If an authorization code is used more than once, we should revoke the active access token retrieved with the code. More info @ RFC-6749 § 10.5 The redirect URI should be validated here before granting a token, along with the credentials in the request. More info @ RFC-6749 § 10.6. The token should be opaque/unguessable. More info @ RFC-6749 § 10.3. Update: token could be JWT, https://www.oauth.com/oauth2-servers/access-tokens/self-encoded-access-tokens/. Advantage is that permissions are 'built in' to the token. Downside is that it becomes irrevocable. So let's not use JWT. The token type in almost all OAuth2 cases is "Bearer". We can use a different token type (e.g. HMAC, or make our own) if we want to. See RFC-6749 § 7.1. Additional info. |
POST | /oauth2/token/refresh | Called by a client to refresh an access token | Body: OAuth2AuthorizationCodeRefreshTokenRequest grant_type: String (always "refresh_token" for this call) refresh_token: String client_id: Unique client_secret: String | Body: OAuth2AccessToken access_token: String token_type: String ("Bearer") expires_in: Integer (seconds) refresh_token: String | Client authentication must be done here. This must be done over TLS. More info @ RFC-6749 § 10.4 |
POST | /oauth2/token/introspect or /oauth2/token/info | Clients can determine if an authentication token is valid (and get scope, if it is opaque in the token) | Body: OAuth2TokenIntrospectionRequest token: String client_id: Unique client_secret: String | Body: OAuth2TokenIntrospectionResponse active: Boolean client_id: Unique username: String (principal of user who authorized the token) exp: Date (seconds until expiration) scope: Array<String> (human-interpretable scope) | This endpoint is not strictly necessary, but we should strongly consider including this if we decide to not include scope with the access token. See RFC-7662. |
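The notes in the token table (single-use codes, redirect URI validation, opaque tokens, revocation on code replay) can be sketched as follows. `TokenService`, its in-memory dictionaries, and the one-hour token lifetime are hypothetical; the introspection result is simplified and reports `exp` as seconds until expiration, following the table above.

```python
import hmac
import secrets
import time

ACCESS_TOKEN_TTL = 3600  # assumed lifetime in seconds


class TokenService:
    def __init__(self, clients, codes):
        self.clients = clients  # client_id -> client_secret
        self.codes = codes      # code -> {client_id, redirect_uri, scope, expires_at, used}
        self.tokens = {}        # access_token -> {client_id, scope, expires_at}

    def _authenticate(self, client_id, client_secret):
        expected = self.clients.get(client_id, "")
        # Constant-time comparison of the presented and stored secret.
        if not expected or not hmac.compare_digest(expected.encode(),
                                                   client_secret.encode()):
            raise PermissionError("invalid client credentials")

    def exchange(self, code, redirect_uri, client_id, client_secret, now=None):
        """Exchange an authorization code for an opaque bearer token."""
        now = time.time() if now is None else now
        self._authenticate(client_id, client_secret)
        record = self.codes.get(code)
        if record is None or record["client_id"] != client_id:
            raise ValueError("invalid_grant")
        if record["used"]:
            # Code replay: revoke the token issued on first use (RFC 6749 section 10.5).
            self.tokens.pop(record.get("access_token"), None)
            raise ValueError("invalid_grant")
        if now > record["expires_at"] or redirect_uri != record["redirect_uri"]:
            raise ValueError("invalid_grant")
        record["used"] = True
        access_token = secrets.token_urlsafe(32)  # opaque and unguessable
        record["access_token"] = access_token
        self.tokens[access_token] = {
            "client_id": client_id,
            "scope": record["scope"],
            "expires_at": now + ACCESS_TOKEN_TTL,
        }
        return {
            "access_token": access_token,
            "token_type": "Bearer",
            "expires_in": ACCESS_TOKEN_TTL,
            "refresh_token": secrets.token_urlsafe(32),
            "scope": record["scope"],
        }

    def introspect(self, token, now=None):
        now = time.time() if now is None else now
        info = self.tokens.get(token)
        if info is None or now > info["expires_at"]:
            return {"active": False}
        return {
            "active": True,
            "client_id": info["client_id"],
            "scope": info["scope"].split(),
            "exp": int(info["expires_at"] - now),
        }
```

Because the token is opaque, revoking it is a matter of deleting its record, which is the property the table cites as a reason to prefer opaque tokens over JWTs.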
OpenID Connect services
Verb | Endpoint | Purpose | Request | Response | Notes |
|---|---|---|---|---|---|
GET | | return metadata about the service | N/A | See spec → Key elements are: | https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfigurationRespon |
GET | /oauth2/userinfo | return information about the user | N/A | Body: sub: "subject", Synapse user id aud: "audience", the OAuth client ID iat: "issued at", the timestamp when the response was created given_name: first name family_name: last name | https://openid.net/specs/openid-connect-basic-1_0.html#UserInfo Note: content type must be application/jwt as per https://openid.net/specs/openid-connect-core-1_0.html#UserInfoResponse |
POST | /oauth2/userinfo | as above | N/A | as above | Although GET is the recommended HTTP method, POST must be supported, as per https://openid.net/specs/openid-connect-core-1_0.html#UserInfo |
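As a sketch of what a signed UserInfo response with the claims listed above might contain, the code below builds a compact JWT using only the standard library. It uses HS256 with a shared key purely to stay self-contained; that choice, the function names, and the shape of the `user` dictionary are assumptions, and a real provider would more likely use an asymmetric algorithm and a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input, key):
    return _b64url(hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest())


def build_userinfo_jwt(user, client_id, key, now=None):
    """Build a signed UserInfo token with the claims from the table above."""
    now = int(time.time()) if now is None else int(now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "sub": user["id"],                 # Synapse user id
        "aud": client_id,                  # the OAuth client ID
        "iat": now,                        # when the response was created
        "given_name": user["first_name"],
        "family_name": user["last_name"],
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _sign(signing_input, key)


def decode_claims(token, key):
    """Verify the signature and return the claims, or raise ValueError."""
    signing_input, _, signature = token.rpartition(".")
    if not hmac.compare_digest(signature.encode(), _sign(signing_input, key).encode()):
        raise ValueError("bad signature")
    payload = signing_input.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))
```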
Web Layer Interfaces (Portal)
The portal needs to implement interfaces for handling the components of OAuth2 that are best accomplished through user interfaces.
Page | Purpose | Actions |
|---|---|---|
OAuth2 Authentication | Provides an interface for the user to authenticate in order to manage OAuth2 requests | Prompt for authentication (u:p/OAuth login) Retrieve a scoped access token on behalf of the user (which can only be used to authorize OAuth2 authorization requests) Redirect to/render OAuth2.0 Authorization Request |
OAuth2 Authorization | Provides an interface for the user to approve/reject OAuth requests | Retrieve/interpret client details + scope and display to the user "Do you want Client123 to have full access over syn123" Approve/reject OAuth2 request on behalf of the user Redirect user-agent to OAuth2AuthorizedUrl (provided by backend) |
Diagrams to show where these API endpoints would be used and what objects/params are needed:
How do we know if we're up-to-spec when we are done?
If we implement OIDC, there is a process to become OIDC-certified here: https://openid.net/certification/
I think that to have OIDC configured properly, you must have OAuth2 configured properly, so this would cover both cases. (I would like to investigate this further if we do decide to implement OIDC)
If we do not implement OIDC (just OAuth2) there doesn't seem to be any certification process that I can find. We may have to read the spec ourselves and hope we don't miss anything with thorough tests, future security audits, etc.
What is "scope"?
JIRA:
In short: scopes are clearly defined permissions that a user may grant to an OAuth client.
In ORY Hydra, OAuth clients may be limited in the scope they can request (e.g. a photo printing service (OAuth client) may be restricted to only acquire read-photo permission, even if they attempt to request edit-photo permission). This is not a requirement of our implementation, but it is worth consideration.
From RFC-6749
The authorization and token endpoints allow the client to specify the scope of the access request using the "scope" request parameter. In turn, the authorization server uses the "scope" response parameter to inform the client of the scope of the access token issued.
The value of the scope parameter is expressed as a list of space-delimited, case-sensitive strings. The strings are defined by the authorization server. If the value contains multiple space-delimited strings, their order does not matter, and each string adds an additional access range to the requested scope.
scope = scope-token *( SP scope-token )
scope-token = 1*( %x21 / %x23-5B / %x5D-7E )
The authorization server MAY fully or partially ignore the scope requested by the client, based on the authorization server policy or the resource owner's instructions. If the issued access token scope is different from the one requested by the client, the authorization server MUST include the "scope" response parameter to inform the client of the actual scope granted.
If the client omits the scope parameter when requesting authorization, the authorization server MUST either process the request using a pre-defined default value or fail the request indicating an invalid scope. The authorization server SHOULD document its scope requirements and default value (if defined).
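The grammar quoted from RFC 6749 translates fairly directly into code. The sketch below validates a scope string against the `scope-token` rule, treats the tokens as an unordered set, and decides whether the "scope" response parameter has to be sent. The function names are illustrative only.

```python
import re

# scope-token = 1*( %x21 / %x23-5B / %x5D-7E )
# i.e. printable ASCII except space, double quote and backslash.
SCOPE_TOKEN = re.compile(r"[\x21\x23-\x5B\x5D-\x7E]+")


def parse_scope(value):
    """Parse a space-delimited scope string into an unordered set of tokens."""
    tokens = value.split(" ")
    for token in tokens:
        # An empty token (empty input or a doubled space) fails the grammar too.
        if not SCOPE_TOKEN.fullmatch(token):
            raise ValueError(f"invalid scope token: {token!r}")
    return frozenset(tokens)


def scope_response_parameter(requested, granted):
    """Return the "scope" value to send back, or None when it may be omitted.

    The parameter is required whenever the granted scope differs from the
    requested one.
    """
    if granted != requested:
        return " ".join(sorted(granted))
    return None
```

For example, `parse_scope("read write")` and `parse_scope("write read")` compare equal, reflecting that token order does not matter.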
The actual encoding and representation of scope is not necessarily within the scope of this document (ha ha), but it is something that will likely strongly guide our implementation of OAuth. Naturally, how we design scope should be informed by use cases. We should methodically determine what access we wish to grant OAuth2 clients, as well as how much granularity (both breadth of permissions and Synapse object access) we can/should reasonably encode. Additionally, this should be informed by how we have designed and use ACLs, since this paradigm of authorizing access to content in Synapse is likely to be the driver of OAuth-based content authorization.
An OAuth2 client developer must be able to determine the scope that they need when designing their OAuth2 client service, so it is also critical that we document the scopes we support.
Examples of OAuth2 scope documentation in the wild:
If we choose to be ultra-granular with scope (e.g. only granting read access to one particular file, with a permission structure like read:syn123), we will likely have to store these scopes in a database. One implementation option is to make scope a UUID and store the actual scope information in a database; the downside: client developers may have to generate scope UUIDs on-the-fly. This also seems to conflict with the design decisions of other OAuth providers (see external documentation samples above).
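If scopes were encoded directly in a `permission:entity` form such as `read:syn123`, parsing and checking them might look like the sketch below. The permission vocabulary (`read`, `create`, `update`, `delete`) and the helper names are hypothetical; the actual representation is still to be decided.

```python
import re
from typing import NamedTuple


class EntityScope(NamedTuple):
    permission: str
    entity_id: str


# Hypothetical vocabulary: one permission plus one Synapse entity ID per token.
_ENTITY_SCOPE = re.compile(r"(read|create|update|delete):(syn\d+)")


def parse_entity_scope(token):
    """Parse a token such as "read:syn123" into its permission and entity."""
    match = _ENTITY_SCOPE.fullmatch(token)
    if match is None:
        raise ValueError(f"unrecognized scope token: {token!r}")
    return EntityScope(match.group(1), match.group(2))


def is_allowed(granted_tokens, permission, entity_id):
    """Check whether the granted scope covers one action on one entity."""
    granted = {parse_entity_scope(token) for token in granted_tokens}
    return EntityScope(permission, entity_id) in granted
```

Encoding the entity in the token avoids a scope lookup table, at the cost of scope strings that grow with the number of entities a client needs.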
Worth looking into: UMA
Stack overflow that led me towards UMA: https://softwareengineering.stackexchange.com/questions/372526/how-to-handle-per-resource-fine-grained-permissions-in-oauth
The spec: https://docs.kantarainitiative.org/uma/ed/uma-core-2.0-01.html
Possibly of interest is the HEART Working group: https://openid.net/wg/heart/
HEART WG aims to standardize data sharing in healthcare using FHIR, OAuth, OIDC, UMA. They seem to be more aimed at patients having control over data they share, but some of their specs may be appropriate for some of our use cases
ORY Hydra
This is a good place to collect information and research relevant to using ORY Hydra to implement OAuth2 and OIDC in Synapse. This portion of the document is not complete, and may not be completed if we choose not to use ORY Hydra to implement OAuth2.
Why ORY Hydra:
Per @Bruce Hoff's preliminary research, ORY Hydra is one of very few (and perhaps the only) established, off-the-shelf OAuth2 + OIDC solutions that delegate the authorization provider and resource provider roles to external services. Since Synapse already has this infrastructure in place (excepting interfaces for OAuth2 authorization flow), we can use Hydra to handle some of the more complicated parts of implementing OAuth2 and OIDC, and create our own authorization/resource provider services that use existing Synapse infrastructure.
Why NOT ORY Hydra:
To be expanded upon later; some of these points may be covered in more detail later in this document.
Possible future maintenance costs, incompatibilities with our infrastructure, and other tech-debt related concerns
Overkill for our use case?
Does Hydra work with our use cases?
For example, can we implement our proposed scope pattern into Hydra? What other constraints exist in Hydra that may be major roadblocks?
ORY Hydra in the wild
Can we find cases of other engineers using Hydra?
What problems have they run into?
What benefits have they seen that Hydra provides?
What other potentially crucial information about Hydra can we find on the internet?
ORY Hydra authentication + consent flow
See the user guide
Hydra stores its own authentication sessions (this would be a redundancy in Synapse). Additionally, the Hydra documentation instructs implementers not to use sessions in the authentication provider (I think we probably could if we wanted to, but we would want to be thorough in our understanding of how Hydra handles sessions in these cases).
An OAuth client will redirect a user to a page controlled by the Hydra server to authenticate the user.
If the user has an existing Hydra authentication session, skip ahead to the authorization step.
With no existing authentication session, the user is redirected to an authentication provider (i.e. Synapse login) to enter their credentials.
The authentication provider will authenticate the user and communicate info about the user directly to Hydra
The user is redirected to Hydra for authorization, which will redirect the user to the consent provider (i.e. Synapse).
The consent provider will ask the user for permission to grant access to the requested resources and will directly tell Hydra if the user accepts or rejects the request.
The user is redirected to the OAuth client. Hydra handles authorization code and token generation.
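The sequence above can be summarized as a small model that lists the hops for a given situation. This is only a sketch of the flow as described in this section: it does not call or mirror Hydra's actual API, the function and step names are made up for illustration, and the error redirect on rejection is an assumption based on the standard OAuth2 "access_denied" response.

```python
def flow_steps(has_hydra_session, user_consents):
    """List the hops of the authentication + consent flow described above."""
    steps = ["client -> hydra: authorization request"]
    if not has_hydra_session:
        # No Hydra session yet, so the user must authenticate first.
        steps.append("hydra -> login provider: authenticate user")
        steps.append("login provider -> hydra: report authenticated user")
    steps.append("hydra -> consent provider: ask user for consent")
    decision = "accept" if user_consents else "reject"
    steps.append(f"consent provider -> hydra: {decision}")
    outcome = "authorization code" if user_consents else "error"
    steps.append(f"hydra -> client: redirect with {outcome}")
    return steps
```

Calling `flow_steps(True, True)` skips the two login hops, which corresponds to the case where the user already has a Hydra authentication session.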
Decisions and requirements to deploy via CloudFormation
See:
Internal configuration
We should tailor ORY Hydra's configuration to meet our needs