The purpose of this feature is to allow administrators to selectively control which API calls are throttled, without having to rebuild and deploy a new version every time a throttle is changed.
The current frequency throttle can only limit each user's total call frequency, using a semaphore. The UserThrottleFilter uses an InMemoryTimeBlockCountingSemaphore that maps a key (the user ID in this case) to a SimpleSemaphore, which keeps a count of the calls made and the time after which the count will reset. The UserThrottleFilter then compares the user's call count to the maximum allowed.
The new throttle will employ a similar mechanism, except that the keys will be userId+normalizedAPI so that each user can be throttled on each API call.
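The composite key can be illustrated with a minimal Java sketch. The class below is a hypothetical stand-in for InMemoryTimeBlockCountingSemaphore, not its actual implementation. The class name, the key() helper, the ":" separator, and the explicit nowMillis parameter (passed in so the behavior is easy to follow) are all assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-key call counter; not the real Synapse class. */
class TimeBlockCounterSketch {

    /** Locks taken in the current time block and the time at which the block ends. */
    private static final class Block {
        int count;
        long expiresAtMillis;
    }

    private final Map<String, Block> blocks = new HashMap<>();

    /** Builds the userId+normalizedAPI key; the ":" separator is an assumption. */
    static String key(long userId, String normalizedApi) {
        return userId + ":" + normalizedApi;
    }

    /**
     * Tries to take one lock for the key. Returns false once maxLocks have been
     * taken in the current block; the count starts over when the block expires.
     */
    synchronized boolean tryAcquire(String key, int maxLocks, long timeoutSeconds, long nowMillis) {
        Block block = blocks.get(key);
        if (block == null || nowMillis >= block.expiresAtMillis) {
            // No block yet, or the old one has expired: start a fresh time block.
            block = new Block();
            block.expiresAtMillis = nowMillis + timeoutSeconds * 1000L;
            blocks.put(key, block);
        }
        if (block.count >= maxLocks) {
            return false;
        }
        block.count++;
        return true;
    }
}
```

With maxLocks = 2 and a 10-second block, the third call by the same user to the same normalized API within the block is refused, while that user's calls to a different API are counted under a separate key.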
The throttle limits for each API call will be defined by a SQL table, allowing administrators to set the throttling of any API calls dynamically.
Table Schema
This table would contain the API calls that administrators want to throttle, the maximum number of calls per period, and the length of the period after which the call count resets.
(optional: expiration of throttle rule?? not sure if necessary)
CREATE TABLE `THROTTLED_CALLS` (
    `THROTTLE_ID` int(20) PRIMARY KEY, -- id of the throttle rule
    `NORMALIZED_URI` varchar(256) NOT NULL, -- normalized API URL; numbers such as {id} are replaced with #
    `MAX_LOCKS` int(20) NOT NULL, -- maximum number of locks per time block
    `LOCK_TIMEOUT_SECONDS` int(20) NOT NULL, -- duration of each time block in seconds
    `THROTTLE_EXPIRATION` bigint(20) DEFAULT NULL, -- optional expiration of the rule as a Unix timestamp. Maybe not necessary?
    UNIQUE (`NORMALIZED_URI`)
)
Reducing Table Accesses
UserThrottleFilter will keep an in-memory cached copy of this table. It will periodically read the THROTTLED_CALLS table and update the cache to reflect the contents of the database.
The in-memory cache will be a Map from the normalized URI to a pair of values: maxLocks and lockTimeoutSeconds.
Proposed ways to update the throttle limits:
- Have UserThrottleFilter periodically read the table (via a DAO) and update the Map.
- Wrap the Map in a synchronized singleton, allowing a worker to update it.
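One possible shape for the cached rules is sketched below in Java. It is an illustration under stated assumptions rather than the planned implementation: ThrottleLimit and ThrottleRulesCacheSketch are hypothetical names, the Supplier stands in for the DAO that reads THROTTLED_CALLS, and the map is replaced as an immutable snapshot through an AtomicReference instead of being wrapped in a synchronized singleton.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** One cached row: the maxLocks and lockTimeoutSeconds pair for a normalized URI. */
final class ThrottleLimit {
    final int maxLocks;
    final long lockTimeoutSeconds;

    ThrottleLimit(int maxLocks, long lockTimeoutSeconds) {
        this.maxLocks = maxLocks;
        this.lockTimeoutSeconds = lockTimeoutSeconds;
    }
}

/** Hypothetical cache of the THROTTLED_CALLS table, keyed by normalized URI. */
final class ThrottleRulesCacheSketch {
    // Stand-in for the DAO call that reads every row of THROTTLED_CALLS.
    private final Supplier<Map<String, ThrottleLimit>> tableReader;
    // Readers always see a complete snapshot; refresh() swaps in a new one.
    private final AtomicReference<Map<String, ThrottleLimit>> snapshot =
            new AtomicReference<>(Collections.<String, ThrottleLimit>emptyMap());

    ThrottleRulesCacheSketch(Supplier<Map<String, ThrottleLimit>> tableReader) {
        this.tableReader = tableReader;
    }

    /** Re-reads the table and replaces the cached map; meant to be called by a timer or worker. */
    void refresh() {
        Map<String, ThrottleLimit> copy = new HashMap<>(tableReader.get());
        snapshot.set(Collections.unmodifiableMap(copy));
    }

    /** Returns the limit for the URI, or null if the call is not throttled. */
    ThrottleLimit lookup(String normalizedUri) {
        return snapshot.get().get(normalizedUri);
    }
}
```

Either proposal fits this shape: the filter itself can call refresh() on a schedule, or a separate worker can call it. In both cases a rule added to the table is not visible to lookup() until the next refresh.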
Throttle Logic
When a request comes in, the request URI will be normalized using normalizeMethodSignature() from AccessRecordUtils in the Synapse-Warehouse-Records project (copy it over). Once normalized, the URI is compared against the cached throttled calls to see whether it is being throttled. If it is, the filter attempts to get a lock from the semaphore. The key used for the semaphore will be the userID + normalizedThrottledCall.
If a lock cannot be obtained, block the request and return an HTTP 429 (Too Many Requests) status code. Otherwise, proceed with the other filters.
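The decision flow described above can be sketched as a single method. This is a hedged Java illustration, not the filter itself: ThrottleFlowSketch, Outcome, and LockSource are hypothetical names, the normalizer is passed in as a function so that the real normalizeMethodSignature() is not reimplemented here, the rules map holds {maxLocks, lockTimeoutSeconds} arrays purely for brevity, and the ":" key separator is an assumption.

```java
import java.util.Map;
import java.util.function.Function;

final class ThrottleFlowSketch {

    /** What the filter should do with the request. */
    enum Outcome { PROCEED, REJECT_429 }

    /** Hypothetical view of the semaphore: true if a lock was granted for the key. */
    interface LockSource {
        boolean tryAcquire(String key, int maxLocks, long lockTimeoutSeconds);
    }

    /**
     * Normalizes the URI, looks it up in the cached rules, and asks the
     * semaphore for a lock only when the call is actually throttled.
     */
    static Outcome decide(long userId,
                          String requestUri,
                          Function<String, String> normalizer,
                          Map<String, int[]> rules,
                          LockSource semaphore) {
        String normalized = normalizer.apply(requestUri);
        int[] limit = rules.get(normalized);
        if (limit == null) {
            // The call has no throttle rule: continue with the other filters.
            return Outcome.PROCEED;
        }
        String key = userId + ":" + normalized;
        boolean granted = semaphore.tryAcquire(key, limit[0], limit[1]);
        return granted ? Outcome.PROCEED : Outcome.REJECT_429;
    }
}
```

In a servlet filter, PROCEED would correspond to passing the request down the filter chain and REJECT_429 to sending the 429 response. Calls without a rule never touch the semaphore, so they create no counter entries.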
Services
These services make updating rules more convenient.
Administrators could also just directly update the SQL table.
Method | URI | Body | Parameters | Return | Description | Permission |
---|---|---|---|---|---|---|
GET | admin/throttle/ | -- | -- | PaginatedResults<Throttle> | Gets a list of throttle rules | Administrators only |
POST | admin/throttle/new | Throttle | -- | Id of throttle created | Creates a new throttle rule | Administrators only |
DELETE | admin/throttle/delete | -- | throttleId | -- | Removes a throttle rule given its id | Administrators only |
PUT | admin/throttle/update | Throttle | throttleId | -- | Updates an existing entry | Administrators only |
Potential problems
If many calls are being throttled, the throttle could use a lot of memory. With N throttled calls and M users, the throttle's map of call counts could hold up to M x N entries. Additionally, the map does not remove entries for users that are no longer making calls, so memory will not be freed until an administrator calls clearAllLocks().
Updates to the throttle rules will not immediately take effect because they are only written into the SQL table. The actual enforcement of the throttle will not happen until UserThrottleFilter updates its cached version of the rules.