Solution
The four hierarchy-related operations would be changed so that a single transaction updates only the container and not its dependents in the hierarchy. After the transaction commits, a message would be sent to a new asynchronous worker. The new worker would update each dependent in its own transaction. After each update commits, an entity update message would be sent to all listeners. With this option, the benefactor and project ID of each dependent become eventually consistent.
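The flow described above can be sketched as follows. This is a minimal in-memory illustration, not the actual implementation: the `nodes` dictionary, the `hierarchy_queue`, the `entity_updates` list, and the function names are all hypothetical stand-ins for the database, the worker queue, and the listener messages. Only the project ID is propagated here; the benefactor ID would presumably follow the same pattern.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the database and the message queues.
nodes = {
    "project-1": {"parent": None, "project_id": "project-1"},
    "project-2": {"parent": None, "project_id": "project-2"},
    "folder-1": {"parent": "project-1", "project_id": "project-1"},
    "file-1": {"parent": "folder-1", "project_id": "project-1"},
    "file-2": {"parent": "folder-1", "project_id": "project-1"},
}
hierarchy_queue = deque()  # messages for the asynchronous worker
entity_updates = []        # entity update messages sent to listeners


def move_container(container_id, new_parent_id):
    """Synchronous part: one transaction that touches only the container."""
    container = nodes[container_id]
    container["parent"] = new_parent_id
    container["project_id"] = nodes[new_parent_id]["project_id"]
    # The message is sent only after the (simulated) transaction commits.
    hierarchy_queue.append(container_id)


def descendants(container_id):
    """Return every node below container_id, parents before children."""
    found = []
    stack = [container_id]
    while stack:
        current = stack.pop()
        for node_id, node in nodes.items():
            if node["parent"] == current:
                found.append(node_id)
                stack.append(node_id)
    return found


def run_worker():
    """Asynchronous part: one (simulated) transaction per dependent."""
    while hierarchy_queue:
        container_id = hierarchy_queue.popleft()
        project_id = nodes[container_id]["project_id"]
        for node_id in descendants(container_id):
            nodes[node_id]["project_id"] = project_id  # individual transaction
            entity_updates.append(node_id)             # message after commit


move_container("folder-1", "project-2")
project_before_worker = nodes["file-1"]["project_id"]  # still the old value
run_worker()
project_after_worker = nodes["file-1"]["project_id"]   # now consistent
```

Between the call to `move_container` and the call to `run_worker`, the files still carry the old project ID, which is the window of eventual consistency that this option introduces.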
Concurrency & Consistency
Unlike the current implementation, this option does not require locks to ensure consistency when there are concurrent updates to the same hierarchy. In the example above, where user-A moves folder-1 to folder-2 while user-B moves folder-2 to project-2, both transactions would commit in constant time. In addition, the two moves would push two events to the asynchronous worker queue (one for each transaction). Both transactions will have been committed by the time the worker receives and processes the second message; therefore, processing the second event will leave all dependent benefactor IDs in a consistent state.
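The convergence argument can be checked with a small simulation. The sketch below assumes that folder-1 and folder-2 both start in project-1 with one file each, and that the worker propagates the container's committed value at the time it processes each event. All names are hypothetical, and the project ID stands in for the benefactor ID.

```python
def build_nodes():
    """Assumed starting state: both folders live in project-1."""
    return {
        "project-1": {"parent": None, "project_id": "project-1"},
        "project-2": {"parent": None, "project_id": "project-2"},
        "folder-1": {"parent": "project-1", "project_id": "project-1"},
        "folder-2": {"parent": "project-1", "project_id": "project-1"},
        "file-1": {"parent": "folder-1", "project_id": "project-1"},
        "file-2": {"parent": "folder-2", "project_id": "project-1"},
    }


def move(nodes, container_id, new_parent_id):
    """Constant-time transaction that updates only the container."""
    nodes[container_id]["parent"] = new_parent_id
    nodes[container_id]["project_id"] = nodes[new_parent_id]["project_id"]


def propagate(nodes, container_id):
    """Worker step: push the container's committed value to all descendants."""
    project_id = nodes[container_id]["project_id"]
    stack = [container_id]
    while stack:
        current = stack.pop()
        for node_id, node in nodes.items():
            if node["parent"] == current:
                node["project_id"] = project_id
                stack.append(node_id)


def simulate(event_order):
    """Commit both moves, then let the worker process events in the given order."""
    nodes = build_nodes()
    move(nodes, "folder-1", "folder-2")   # user-A
    move(nodes, "folder-2", "project-2")  # user-B
    for container_id in event_order:
        propagate(nodes, container_id)
    return {
        node_id: node["project_id"]
        for node_id, node in nodes.items()
        if not node_id.startswith("project")
    }


result_a_first = simulate(["folder-1", "folder-2"])
result_b_first = simulate(["folder-2", "folder-1"])
```

In this sketch both event orders end in the same state, because each event reads the committed hierarchy rather than a snapshot taken when the event was created.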
Migration
With this option, any large hierarchy change that occurs immediately before migration would still succeed. However, once the stack enters read-only mode, the worker updating the descendants of the hierarchy would stop, and the event would be returned to the worker queue. Both the change to the container and the hierarchy update event would migrate to staging. This means the full hierarchy change would eventually migrate to the staging stack.
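The stop-and-requeue behavior might look like the sketch below. It assumes that the worker checks for read-only mode before each individual update and that reapplying an update is harmless, so the whole event can simply be processed again later. The event shape, `process_event`, and the read-only check are hypothetical.

```python
from collections import deque


def process_event(event, queue, state, is_read_only):
    """Apply one hierarchy update event; requeue it if the stack is read-only."""
    for node_id in event["descendants"]:
        if is_read_only():
            # Return the event to the queue so it is picked up again later.
            queue.append(event)
            return False
        # Assumed idempotent: reapplying the same value is harmless.
        state[node_id] = event["project_id"]
    return True


queue = deque()
state = {"file-1": "project-1", "file-2": "project-1", "file-3": "project-1"}
event = {
    "descendants": ["file-1", "file-2", "file-3"],
    "project_id": "project-2",
}

checks = {"count": 0}


def read_only_after_first_update():
    """Simulate the stack entering read-only mode after one update."""
    checks["count"] += 1
    return checks["count"] > 1


# First attempt: interrupted part-way, event goes back on the queue.
finished_first = process_event(event, queue, state, read_only_after_first_update)
state_after_interrupt = dict(state)
requeued = len(queue)

# Later attempt (for example on the new stack): the event runs to completion.
finished_second = process_event(queue.popleft(), queue, state, lambda: False)
```

Because the event is requeued rather than dropped, the remaining descendants are updated once processing resumes.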
User Experience
All four hierarchy-related operations will have constant time complexity. Therefore, updates to large hierarchies will no longer result in web-service timeouts. This will improve the experience of the user who initiates a large change. However, the experience will change for users waiting for large hierarchies to become consistent. For example, if an ACL is added to a folder with 20K files to grant user-A access, user-A would need to wait for all of the files in the folder to become visible. In fact, user-A might see only a few files immediately after the ACL event, and subsequent refreshes of the view would reveal more files.
Problems with this Option
- Users might expect ACL changes to occur immediately.