...
Support for Workflows. Workflows that act on behalf of a user may have a long queue of jobs. If a job does not execute within the lifetime of the issued access token (currently 24 hours), then the job will fail. When a user triggers a workflow job via a submission queue, Synapse could issue a refresh token to grant access to a workflow engine beyond the current duration of an access token. When the job is finished, the refresh token can be revoked.
OAuth in the Synapse Python Client. With refresh tokens, users can use OAuth to authenticate in the Python client, eliminating the use of their username and password, or an API key. As with the API key, a user is capable of revoking the token. A key advantage over the API key is that a user can issue multiple refresh tokens, allowing for granular, machine-level access tokens. Because access tokens are short-lived, a user authenticating into the Python client with our current implementation of OAuth would be forced to reauthenticate every 24 hours. A refresh token can be stored locally, so that users would not be required to reauthenticate.
...
User-centric token revocation is not defined in any of the OAuth Specifications. The OIDC Specification § 16.18 simply suggests it:
“…The Authorization Server SHOULD provide a mechanism for the End-User to revoke Access Tokens and Refresh Tokens granted to a Client.”
...
Thus the design for this feature is influenced by other services, such as those showcased by Okta/OAuth.com on their page Listing Authorizations https://www.oauth.com/oauth2-servers/listing-authorizations/ .
Use Cases for user-centric token revocation
A user may no longer trust or need to use an OAuth client that has been granted access to their Synapse identity and resources (revoke all tokens from one client)
A user may no longer have access to a machine with their Synapse credentials used for a command line application, but they do not wish to expire all machines on which they are authenticated (revoke one token from one client)
A user may forget about a machine on which they have authenticated, and would benefit from being signed out automatically (token expiration)
Client-centric token revocation is defined in RFC 7009 https://tools.ietf.org/html/rfc7009. From the RFC:
...
Resource access is no longer needed by the client. As an example, a workflow engine job executed by an OAuth client may no longer require access to a user’s Synapse account at the conclusion of a job. They may revoke the token to ensure it is no longer valid.
...
Since we will be issuing long-lived refresh tokens, we will need a mechanism to revoke refresh tokens. While not necessary, it would be ideal to also revoke the access tokens themselves. (RFC 7009 § 2)
To accomplish this, we can store refresh tokens in the database, which can be revoked by a user or client with a REST API call. Once the refresh token is revoked, it may no longer be used to generate access tokens.
By linking an access token to its associated refresh token, we are able to invalidate access tokens without storing them in the database.
...
Access tokens are JWTs. To link an access token to a refresh token, we can simply add a claim with a corresponding refresh token ID. The JWT specification § 4.2 suggests we use a namespace for this claim (e.g. Auth0 recommends a URL like https://synapse.org/refresh_token_id or https://sagebionetworks.org/refresh_token_id, but we should be able to use a reverse domain name like org.sagebionetworks.repo.model.oauth.claims.refresh_token_id). As a side note, I think we are already in violation of this specification, since we currently use nonstandard, non-namespaced claims such as orcid, is_certified, etc. We should determine if we should get back “in-spec” and add namespaces to the existing claims (breaking API change).
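The linkage described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Synapse implementation: the claim name reuses one of the candidate namespaces from the text, and is_access_token_active and the in-memory set of revoked IDs are hypothetical stand-ins for a database lookup. Signature verification and expiry checks are assumed to have happened before this point.

```python
# Hypothetical claim name, taken from one of the candidates discussed above.
REFRESH_TOKEN_ID_CLAIM = "https://synapse.org/refresh_token_id"


def is_access_token_active(claims: dict, revoked_refresh_token_ids: set) -> bool:
    """Return False if the access token is linked to a revoked refresh token.

    `claims` is the already-verified JWT payload. Tokens that carry no
    refresh token ID claim are not affected by refresh token revocation.
    """
    refresh_token_id = claims.get(REFRESH_TOKEN_ID_CLAIM)
    if refresh_token_id is None:
        return True
    return refresh_token_id not in revoked_refresh_token_ids


# Illustrative data only: refresh token "101" has been revoked.
revoked = {"101"}
linked = {"sub": "example-user", REFRESH_TOKEN_ID_CLAIM: "101"}
unlinked = {"sub": "example-user"}
```

With this shape, revoking a refresh token invalidates every access token that references it, and no access token needs to be stored.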
...
This section identifies three new objects used in the REST API, six new endpoints, and an extended implementation for an existing endpoint.
New objects
There are three new objects proposed in this document.
...
OAuthClientAuthorization
This object can be used to show a user the OAuth clients that have access to that user’s resources and identity. Using this information, the user can identify the client that has access, the amount of access that the client has (via scopes), how long the client has had access, and how recently the client has accessed that user’s resources by requesting a new access token.
Field | Type | Description |
---|---|---|
client | ... | Client information that can be displayed to the end user |
authorizedOn | date-time | The time when access was first granted (i.e. the issue date of the oldest active refresh token) |
lastUsed | date-time | The most recent time a refresh token was used to issue a new access token |
OAuthTokenInformation
This object captures information about an active refresh token, intended to be seen by the user whose resources can be accessed by the token. Note that the token itself is not shown.
Field | Type | Description |
---|---|---|
tokenId | integer | Unique ID of the token |
clientId | integer | Unique ID of the client that possesses this token |
name | string | A human-readable identifier for the token. We may initially set this to a string of random words. The user is able to overwrite this field (e.g to identify the machine on which this token lives) |
scopes | Array<OAuthScope> | The scopes that the client can request using this refresh token |
authorizedOn | date-time | The time when this token was first issued |
lastUsed | date-time | The most recent time this refresh token was used to issue a new access token |
modifiedOn | date-time | The last time this token’s metadata (i.e. name) was updated. |
etag | string | For OCC |
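The relationship between the two objects above (authorizedOn is the issue date of the oldest active token, lastUsed is the most recent use of any token) can be sketched as an aggregation over a user's tokens. This is only an illustration: TokenInfo and summarize_by_client are hypothetical names, and the real service would presumably compute this in a database query.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TokenInfo:
    """Subset of OAuthTokenInformation needed for the aggregation (hypothetical)."""
    token_id: int
    client_id: int
    authorized_on: datetime
    last_used: datetime


def summarize_by_client(tokens):
    """Map client_id -> (authorizedOn, lastUsed) across that client's active tokens."""
    summary = {}
    for t in tokens:
        if t.client_id not in summary:
            summary[t.client_id] = (t.authorized_on, t.last_used)
        else:
            first, last = summary[t.client_id]
            # Oldest issue date and most recent use win.
            summary[t.client_id] = (min(first, t.authorized_on), max(last, t.last_used))
    return summary


# Illustrative data only.
tokens = [
    TokenInfo(1, 100, datetime(2020, 1, 1), datetime(2020, 3, 1)),
    TokenInfo(2, 100, datetime(2020, 2, 1), datetime(2020, 4, 1)),
    TokenInfo(3, 200, datetime(2020, 5, 1), datetime(2020, 5, 2)),
]
summary = summarize_by_client(tokens)
```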
OAuthTokenRevocationRequest
This object is used when a client makes a request to revoke a refresh/access token. It is defined by RFC 7009 § 2.1.
Field | Type | Description |
---|---|---|
token | string | The token to revoke |
token_type_hint | enum | The type of token to revoke (must be ... or ...) |
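As a rough sketch of what a client might send, the snippet below builds a revocation request body. RFC 7009 § 2.1 transmits these parameters form-encoded and registers access_token and refresh_token as hint values; whether Synapse would accept exactly this encoding is an assumption, and build_revocation_body is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

# Hint values registered by RFC 7009 § 2.1.
ALLOWED_HINTS = {"access_token", "refresh_token"}


def build_revocation_body(token: str, token_type_hint: Optional[str] = None) -> str:
    """Return a form-encoded revocation request body (hypothetical helper)."""
    if not token:
        raise ValueError("token is required")
    params = {"token": token}
    if token_type_hint is not None:
        if token_type_hint not in ALLOWED_HINTS:
            raise ValueError("unsupported token_type_hint: " + token_type_hint)
        params["token_type_hint"] = token_type_hint
    return urlencode(params)


# Illustrative token value only.
body = build_revocation_body("example-token-value", "refresh_token")
```

The hint is optional in the RFC, so the helper only validates it when one is supplied.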
New API Endpoints
Six new endpoints and an extension of the implementation of one existing endpoint are proposed.
Viewing applications that have OAuth access to a user’s account
Endpoint: GET /oauth2/permissions/
Request body: none
Return body: PaginatedList<OAuthClientAuthorization>
Returns a paginated list of the clients and permissions that the user has granted. Allows a user to audit which parties have access to their resources.
Viewing tokens for an application that has OAuth access to a user’s account
Endpoint: GET /oauth2/permissions/:client_id/tokens
Path Parameter: client_id: returned tokens will be associated with this OAuth2 client
Request body: none
Return body: PaginatedList<OAuthTokenInformation>
Returns a paginated list of metadata for the active refresh tokens that the user has granted to the specified client. Allows a user to audit which tokens have access to their resources.
...
Endpoint: POST /oauth2/permissions/:client_id/revoke
Request Path Parameter: client_id: the OAuth2 client that will no longer have access to the user’s resources and/or identity
Response: On successful revocation, return HTTP 200. No body.
Upon calling this method, the refresh token and access tokens held by the specified client for the authenticated user making the API call will be revoked.
Update metadata for a token
Endpoint: PUT /oauth2/permissions/:client_id/tokens/:token_id
Request Parameters:
client_id: the OAuth2 client that is associated with the token
token_id: the token to update
Response: On successful update, return HTTP 200. No body.
Upon calling this method, the token’s metadata will be updated. In practice, only the token name can be updated.
User revocation of a particular access token
Endpoint: POST /oauth2/permissions/:client_id/tokens/:token_id/revoke
Request Parameters:
client_id: the OAuth2 client that is associated with the token to revoke
token_id: the token to revoke
Response: On successful revocation, return HTTP 200. No body.
Upon calling this method, the specified refresh token and the access tokens linked to it will be revoked for the authenticated user making the API call. Other tokens held by the same client are unaffected.
Client revocation of a token
...
Endpoint: POST /oauth2/token
This method exists. This feature proposal would add support for grant_type=refresh_token, and return a refresh token for grant_type=code. For details, see OIDC Core 1.0 § 12.1, 12.2.
Additionally, we should require that a refresh token be passed in the request body and not as a request parameter. If the token is passed as a request parameter, it will be logged in the web application firewall and on the server, so it is insecure.
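To illustrate the point about keeping the refresh token out of the URL, here is a minimal sketch that builds (but does not send) a refresh_token grant request with the token in the request body. The host name is hypothetical, and authenticating the client with HTTP Basic is an assumption; this document does not specify how the client presents its credentials.

```python
import base64
from urllib.parse import urlencode, urlsplit
from urllib.request import Request

# Hypothetical host; only the /oauth2/token path comes from the text above.
TOKEN_URL = "https://auth.example.org/oauth2/token"


def build_refresh_request(refresh_token: str, client_id: str, client_secret: str) -> Request:
    """Build a refresh_token grant request with the token in the body, not the URL."""
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    }).encode("ascii")
    # Assumption: the client authenticates with HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {basic}",
        },
    )


# Illustrative values only.
request = build_refresh_request("example-refresh-token", "100", "client-secret")
```

Because the URL carries no query string, the token would not appear in access logs that record only the request line.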
Public vs. Confidential Clients - A Security Note
It is important to understand the distinction between confidential and public OAuth clients, and how they would interact with Synapse (RFC 6749 § 2.1).
A confidential client is capable of keeping their credentials confidential. The example given in the OAuth specification is a web application running on a web server. In our context, this could be a workflow engine. Maintaining the confidentiality of the client credentials adds an additional layer of security because the credentials must be supplied when using a refresh token to request an access token.
A public client cannot ensure that their credentials are confidential. Examples given in the OAuth specification are user agent based applications (i.e. a single-page application where the client code runs in the browser) and native applications (where the client credentials may be extracted or decompiled, exposing a client secret). In our context, the Synapse CLI applications would act as public clients (i.e. the client secret is not a secret).
A major implication of this is that refresh tokens issued to public clients (e.g. the Synapse Python client) are no more secure than bearer tokens (any user may use it because the client secret is not confidential).
Despite the security flaw, this scheme is used in practice
For native apps integrating with Google, they simply suggest embedding your secret in the app, noting that it is not really a secret
GSUtil, the command line app for interfacing with Google Cloud, stores credentials in the app
MSAL, Microsoft’s current authorization library, indicates that Microsoft internally discerns public/confidential clients (public clients do not have a secret).
This doesn’t remove all of the benefits of using OAuth in Synapse CLIs. Using OAuth still accomplishes
Removing password from the authentication flow
Scoped access
Ability to issue multiple tokens and revoke them individually
Additional usage context for a user (e.g. for a particular refresh token or client you can say “Used for Synapse CLI 2 days ago”)
Backend Implementation Detail: Database Model
...
New DB Table: OAUTH_REFRESH_TOKENS
Name | Type | Notes |
---|---|---|
ID | INTEGER | Primary key |
TOKEN | CHAR(64) | SHA256 hash of the refresh token passed to the client |
NAME | VARCHAR(256) | Human-readable identifier for the token |
USER_ID | BIGINT | Foreign key reference to the principal whose resources this token grants access to |
CLIENT_ID | BIGINT | The client that this token is issued to |
CREATED_ON | TIMESTAMP | When this refresh token was created |
LAST_USED | TIMESTAMP | The last time this refresh token was used to issue an access token |
MODIFIED_ON | TIMESTAMP | The last time this token was modified (i.e. the name was changed) |
ETAG | CHAR(36) | For OCC |
Additional constraint: UNIQUE(USER_ID, NAME) – a user may not have duplicate names for their tokens
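The TOKEN column stores a hash rather than the token itself. A minimal sketch of that idea follows, using Python's secrets module as an analogue of the SecureRandom-based generation proposed in this document; the function names are hypothetical.

```python
import hashlib
import secrets


def generate_refresh_token() -> str:
    """Return a token built from 32 random bytes (256 bits), URL-safe encoded."""
    return secrets.token_urlsafe(32)


def hash_refresh_token(token: str) -> str:
    """Return the SHA-256 hex digest, which is 64 characters and fits CHAR(64)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


token = generate_refresh_token()       # handed to the client once
token_hash = hash_refresh_token(token)  # the value persisted in the TOKEN column
```

When a client later presents a refresh token, the server would hash it the same way and look up the row by hash, so a database leak does not expose usable tokens.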
FAQ/Anticipated Concerns
Does adding refresh tokens break any existing behavior?
It should not, because we are merely extending the access token with a reference to its refresh token ID. Current clients would not see the refresh token without requesting the offline_access scope, and if they did receive it, they may ignore it. The duration of access tokens is unchanged.
Open Questions
Choice of a refresh token. Proposed: 256-bit random string using SecureRandom
When should a refresh token expire?
Per sections 1.5 and 10.4 of the OAuth 2.0 spec (by way of this StackOverflow post), it seems we have some liberty in terms of the lifecycle of a refresh token.
Some options we have include
Refresh tokens last until they are revoked
Simple
Refresh tokens are leased (i.e. they expire X days after last use)
Rotate the refresh token after each use
Trades additional client complexity for security
Refresh tokens expire after a duration
User will be required to reauthorize the application after a certain period of time (e.g. one year)
Functionally, this is not much different than a long-lived access token, so I don’t think this is the best option
This would break things like cron jobs
This decision should be driven by use-cases
Proposal: refresh tokens are leased and expire 6 months after their last use
Should a client be able to possess more than one refresh token per user at a time?
The proposed API design for user revocation of a token assumes “no”, because a user revokes access using the client ID.
Other applications seem to only allow one token per user (listing your granted application integrations in Google, GitHub, etc. won’t show the same client more than once, as far as I know)
As a user, revoking a token is confusing if you have more than one token per client (“Which instance of Client X permissions do I want to revoke?”).
What happens when a client attempts to create a new refresh token when one exists (e.g. to ask for more scope)?
Probably just invalidate the old refresh token, and create a new one
Is there a compelling use case for a user to be able to see the access they have given in the past but have revoked? (i.e. do we keep a record of revoked refresh tokens?)
Should we limit the number of active refresh tokens per (user, client) pair?
Google limits this to 50 and expires the least recently used token when a new token is issued
Implications
Simplifying an interface where a user is trying to audit/revoke tokens
Limits number of jobs that can be submitted to a workflow queue
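Two of the ideas in these open questions, leasing (expiry a fixed time after last use) and a per-pair token limit that evicts the least recently used token, can be sketched as below. The 180-day lease approximates the proposed 6 months, the limit of 50 is the Google figure cited above rather than a decided Synapse value, and both function names are hypothetical.

```python
from datetime import datetime, timedelta

LEASE = timedelta(days=180)   # approximates the proposed 6 months
MAX_TOKENS_PER_PAIR = 50      # figure cited for Google; not a decided Synapse limit


def is_lease_expired(last_used: datetime, now: datetime, lease: timedelta = LEASE) -> bool:
    """A leased token expires once it has gone unused for longer than the lease."""
    return now - last_used > lease


def tokens_to_evict(last_used_by_token_id: dict, limit: int = MAX_TOKENS_PER_PAIR) -> list:
    """Return the least recently used token IDs to revoke before issuing one more token."""
    excess = len(last_used_by_token_id) + 1 - limit
    if excess <= 0:
        return []
    oldest_first = sorted(last_used_by_token_id, key=last_used_by_token_id.get)
    return oldest_first[:excess]


# Illustrative data only.
now = datetime(2021, 1, 1)
usage = {1: datetime(2020, 1, 1), 2: datetime(2020, 2, 1), 3: datetime(2020, 3, 1)}
```

Under a lease, each successful use of the refresh token would update LAST_USED, which restarts the clock; a cron job that runs regularly would therefore never need reauthorization.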