Goals of our design
1) Once a user signs in using a credential, we want them to be able to stay signed in indefinitely. This holds true even if the client returns outside of any time window we might anticipate. To support this, we issue a reauthentication token that the client can use to re-acquire a session after losing it.
2) We don't want the reauthentication token stored in plaintext on the server, since it is effectively a password.
3) The new reauthentication token can be lost in transit back to the client, along with the session that carries it. When this happens the client is in a "failed reauthentication" state, and these improvements are primarily designed to ensure the client can recover from that state.
4) We want to invalidate the existing session when issuing a new session (on sign in). There are some alternatives for how we might accomplish this, but because it's related to sign in, it can be done on a later update (see "Addressing Concurrent Sign In Requests" below).
In the current implementation, when a reauthentication request succeeds but the client fails to get back the session, we create a new reauth token and store the old token in Redis. The client can recover by resending the old reauth token, and they will get the session, but the session we send back does not include the new reauth token (we no longer have it in plaintext, per goal 2 above). We just return the old token in the session. As a result, that user will still have to authenticate again once the cached reauthentication token expires from the cache.
The proposed design would fix this.
Signing in
Signing in does not change:
- User signs in
- We create a new session, session token, reauthToken
- We store the following Redis mappings:
  - sessionToken ↝ userId
  - userId ↝ session
- return the session with the sessionToken and reauthToken in the session
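As a rough illustration of the sign-in steps above, here is a minimal Python sketch. The two dicts stand in for the Redis mappings, and `sign_in` and the session field names are hypothetical, not the actual service code.

```python
import secrets

# Plain dicts stand in for the two Redis mappings (an assumption for illustration).
session_token_to_user = {}  # sessionToken -> userId
user_to_session = {}        # userId -> session


def sign_in(user_id):
    """Create a session with fresh tokens and cache both mappings."""
    session = {
        "userId": user_id,
        "sessionToken": secrets.token_urlsafe(32),
        "reauthToken": secrets.token_urlsafe(32),
    }
    session_token_to_user[session["sessionToken"]] = user_id
    user_to_session[user_id] = session
    # The session returned to the client carries both tokens.
    return session
```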
Authenticating a request
This doesn't change (it might change if we attempt to address concurrent sign in requests; see below).
- User makes request with a sessionToken
- Retrieve the userId with the sessionToken (if this fails return 404)
- Retrieve the session with the userId (if this fails return 404)
- return the session with this session token and whatever reauthentication token is in the session
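The two-step lookup above could be sketched as follows. The `NotFound` exception and the dict-based caches are assumptions made for the example; only the 404 behavior comes from the steps listed.

```python
class NotFound(Exception):
    """Hypothetical exception that the web layer would map to a 404 response."""
    status = 404


def authenticate(session_token, session_token_to_user, user_to_session):
    """Resolve sessionToken -> userId -> session, failing with 404 at either step."""
    user_id = session_token_to_user.get(session_token)
    if user_id is None:
        raise NotFound("no userId cached for this session token")
    session = user_to_session.get(user_id)
    if session is None:
        raise NotFound("no session cached for this user")
    # The session is returned as cached, including its reauthentication token.
    return session
```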
Reauthentication
- User reauthenticates with the reauthentication token
- We retrieve the N most recent records by creation date, hash the submitted token with the algorithm named in each record, and compare the result to that record's stored hash. A match is an authentication success.
- If a session exists, we create a new session but keep the session token/internal session token;
- If a session does not exist, a new session includes new session token/internal session token;
- Persist a new record in the secrets table for the new reauthToken
- We store the following Redis mappings:
  - sessionToken ↝ userId
  - userId ↝ session
- return the session with the sessionToken and reauthToken in the session
Thus, the reauth tokens are rotated by successful reauthentication attempts, not by an expiration time. The session token is not rotated once it exists: reauthenticating before the session expires returns the same session token. If we rotated the session token on each reauth request and invalidated the previous one, a concurrent reauthentication request could end up holding a session token that has already been invalidated. The session token still expires after 12 hours.
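The reauthentication flow can be sketched like this. Everything here is illustrative: a list of dicts stands in for the secrets table, only the userId ↝ session mapping is modelled, `N` and `ITERATIONS` are made-up values, and the per-record `salt` field is an assumption (the proposed table has no salt column, so a real implementation would need to carry it some other way, for example inside the hash string).

```python
import hashlib
import hmac
import itertools
import os
import secrets

N = 3              # hypothetical: how many recent records are checked
ITERATIONS = 1000  # deliberately low, for illustration only
_clock = itertools.count(1)  # stand-in for the createdOn timestamp


def pbkdf2(token, salt):
    return hashlib.pbkdf2_hmac("sha256", token.encode(), salt, ITERATIONS)


HASHERS = {"PBKDF2_HMAC_SHA_256": pbkdf2}


def new_secret(user_id, token):
    """Build a secrets-table record holding only the hash of the token."""
    salt = os.urandom(16)
    return {"userId": user_id, "algorithm": "PBKDF2_HMAC_SHA_256",
            "salt": salt, "hash": pbkdf2(token, salt), "createdOn": next(_clock)}


def reauthenticate(user_id, token, secrets_table, sessions):
    # Take the N most recent records and hash the token with each record's algorithm.
    recent = sorted((r for r in secrets_table if r["userId"] == user_id),
                    key=lambda r: r["createdOn"], reverse=True)[:N]
    if not any(hmac.compare_digest(HASHERS[r["algorithm"]](token, r["salt"]), r["hash"])
               for r in recent):
        return None  # no match: authentication failure
    # Keep the session token if a session exists; otherwise issue a new one.
    existing = sessions.get(user_id)
    session_token = existing["sessionToken"] if existing else secrets.token_urlsafe(32)
    # Always issue and persist a fresh reauth token.
    reauth_token = secrets.token_urlsafe(32)
    secrets_table.append(new_secret(user_id, reauth_token))
    session = {"userId": user_id, "sessionToken": session_token, "reauthToken": reauth_token}
    sessions[user_id] = session
    return session
```

In this sketch an old token keeps working until N newer tokens have been issued, which is what lets a client that lost a response retry with the token it still holds.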
Sign Out
- Delete the userId ↝ session mapping
- Delete the sessionToken ↝ userId mapping
- Delete the reauth secret records for this user in the secrets table
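A minimal sketch of the three deletions, assuming dict-based stand-ins for the Redis mappings and a list of dicts for the secrets table (the function and field names are hypothetical):

```python
def sign_out(user_id, session_token_to_user, user_to_session, secrets_table):
    """Remove both cache mappings and every reauth secret for the user."""
    # Delete userId -> session, remembering the session token it held.
    session = user_to_session.pop(user_id, None)
    # Delete sessionToken -> userId.
    if session is not None:
        session_token_to_user.pop(session["sessionToken"], None)
    # Delete this user's reauth records, leaving other users' records alone.
    secrets_table[:] = [r for r in secrets_table if r["userId"] != user_id]
```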
Concurrency issues
There are some issues we have identified when the client makes multiple requests to reauthenticate:
- Multiple requests lead to a reauthentication token being consumed, and then subsequent reauthentication requests fail. Because we cannot keep the reauth token in plaintext on the server, we instead issue a unique token on each request, and every token remains valid until N newer tokens have been issued. So any response that the client persists will have a valid reauthentication token;
- Multiple requests lead to multiple session tokens being returned, only one of which can be valid. We return the existing session token if it exists, rather than rotating it. The token still expires every 12 hours, or can be deleted with a sign in/sign out operation (along with all valid reauth tokens). So any response that the client persists will have the valid session token;
- There were issues with updating outdated versions of the account record when two requests were both writing a reauth token to the account table. By moving the creation of new reauth tokens to a different table, we eliminate 409 responses during reauthentication (unless the health code is missing, no update to the account table occurs).
Persistence
Add this table, along with a DAO to manage writes to it. Possible names: AccountCredential, AccountSecret, AccountToken, Account(Secret)Key... this table could eventually hold other credentials, like passwords or API keys, so I would keep the nomenclature more general.
CREATE TABLE `AccountSecret` (
`userId` VARCHAR(255) NOT NULL,
`algorithm` ENUM('STORMPATH_HMAC_SHA_256', 'BCRYPT', 'PBKDF2_HMAC_SHA_256') NOT NULL,
`hash` VARCHAR(255) NOT NULL,
`createdOn` BIGINT NOT NULL,
`sessionToken` VARCHAR(255) NOT NULL, # maybe... not sure we'll ever need to know the pairing
`type` ENUM('REAUTH_TOKEN') DEFAULT 'REAUTH_TOKEN'
);
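To make the DAO idea concrete, here is a small sketch using Python's built-in `sqlite3` module in place of MySQL. This is only an illustration: SQLite has no ENUM type, so TEXT columns stand in for the ENUMs, the optional `sessionToken` column is left out, and the class and method names are hypothetical.

```python
import sqlite3

# TEXT columns replace the MySQL ENUMs; the "maybe" sessionToken column is omitted.
SCHEMA = """
CREATE TABLE AccountSecret (
    userId TEXT NOT NULL,
    algorithm TEXT NOT NULL,
    hash TEXT NOT NULL,
    createdOn INTEGER NOT NULL,
    type TEXT NOT NULL DEFAULT 'REAUTH_TOKEN'
)
"""


class AccountSecretDao:
    def __init__(self, conn):
        self.conn = conn
        conn.execute(SCHEMA)

    def create(self, user_id, algorithm, hash_, created_on, type_="REAUTH_TOKEN"):
        """Persist one hashed secret; the plaintext token is never stored."""
        self.conn.execute(
            "INSERT INTO AccountSecret (userId, algorithm, hash, createdOn, type) "
            "VALUES (?, ?, ?, ?, ?)",
            (user_id, algorithm, hash_, created_on, type_))

    def most_recent(self, user_id, n, type_="REAUTH_TOKEN"):
        """Return up to n (algorithm, hash, createdOn) rows, newest first."""
        cur = self.conn.execute(
            "SELECT algorithm, hash, createdOn FROM AccountSecret "
            "WHERE userId = ? AND type = ? ORDER BY createdOn DESC LIMIT ?",
            (user_id, type_, n))
        return cur.fetchall()

    def delete_all(self, user_id, type_="REAUTH_TOKEN"):
        """Remove every secret of the given type for a user (used on sign out)."""
        self.conn.execute(
            "DELETE FROM AccountSecret WHERE userId = ? AND type = ?",
            (user_id, type_))
```

Since every read filters on userId and type and orders by createdOn, an index over those columns would probably be worth considering in the real table.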
Migration
For some amount of time we'll need to read and incorporate the existing reauth token in the Accounts table into the records we load from this new table, and persist back to this new table. Once this is deployed, we can migrate the tokens out of the Accounts table, then remove the 3 columns from Accounts.
We could eventually migrate passwords out this way as well, if it's ever useful.
Addressing Concurrent Sign In Requests
Concurrent sign ins should be rarer because they involve human intervention (enter credentials, click on a link), but could still theoretically happen. There are two approaches to dealing with this.
The simpler option would be to record the timestamp when a session token is created, and reuse that token for a grace period on subsequent or concurrent sign ins.
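A sketch of the grace-period idea, assuming a hypothetical `GRACE_PERIOD_MS` value and a token record that carries its own `createdOn` timestamp in milliseconds:

```python
import secrets

GRACE_PERIOD_MS = 5_000  # hypothetical value; the real window would need tuning


def session_token_for_sign_in(existing, now_ms):
    """Reuse a recently created session token, otherwise issue a new one."""
    if existing is not None and now_ms - existing["createdOn"] <= GRACE_PERIOD_MS:
        return existing  # a concurrent or repeated sign in gets the same token
    return {"sessionToken": secrets.token_urlsafe(32), "createdOn": now_ms}
```

With this shape, two sign ins that land within the window both receive the same session token, so neither invalidates the other.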
A more complicated approach would be to issue a new token with each successful sign in. This would solve concurrent sign ins but it would also allow for sign ins on multiple devices. The logic could be as follows:
- issue new session token on each sign in, adding to a set of tokens in the session
- on access
  - sessionToken ↝ userId
  - userId ↝ session
  - is token in session tokens set?
    - NO: not authenticated
    - YES: is there more than one token in session?
      - NO: return session
      - YES: replace set with a set consisting only of this token, write session to cache, return session
Note that with this approach, we can later allow multiple clients to authenticate simultaneously by not stripping out other session tokens. Each token still expires on its own, because the sessionToken ↝ userId mapping used in the first lookup has an expiry that is independent of the session expiry.
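The access logic for the token-set approach might look roughly like this. The dicts again stand in for the Redis mappings, and the `sessionTokens` field name is an assumption for the example.

```python
def access(session_token, session_token_to_user, user_to_session):
    """Return the session if the token is in its token set, else None."""
    user_id = session_token_to_user.get(session_token)
    session = user_to_session.get(user_id) if user_id is not None else None
    if session is None or session_token not in session["sessionTokens"]:
        return None  # not authenticated
    if len(session["sessionTokens"]) > 1:
        # The first token to be used wins: drop the others and write back to cache.
        session["sessionTokens"] = {session_token}
        user_to_session[user_id] = session
    return session
```

Allowing multiple simultaneous clients later would amount to skipping the branch that collapses the set.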