Document toolboxDocument toolbox

Hierarchical Changes in Synapse (PLFM-4174)

Goal

To better handle linear time complexity hierarchical operations with consideration to consistency, concurrency, migration, and user experience.

Introduction

A core object in Synapse is the Entity.  Each entity has a type which can be one of the following: Project, Folder, File, Docker, Table or View.  All Entities reside within a hierarchy much like the files and folders of a computer.  Each Entity has a parent ID, that references its parent.  An Entity can only have a single parent.

Entity Data

Figure 1.


There are two special Entity containers in Synapse: Root & Trash.  Figure 2 shows an example hierarchy of Entities.

Figure 2.

Entity Permissions & Benefactor

Each Entity is controlled by an ACL.  An entity can either have its own ACL or it can inherit the ACL of its first ancestor that has an ACL.  The ID of the entity that owns the ACL that controls its access is called the Entity's 'benefactor'.  When Entity has its own ACL, its benefactor equals  its own entity ID.  When an Entity inherits its ACL from an ancestor, then the entity's benefactor equals the ID of the ancestor.  The entity's benefactor ID is used in all authorization checks. Since walking the entity hierarchy is expensive and complicated to do in SQL, each entity's benefactor ID is cached.

Project Id

Many operations depend on an entity's project ID.  For example, project ID is used to lookup the project setting that apply to an Entity.  Since project ID is commonly needed and walking the hierarchy is expensive and complicated to do in SQL, the project ID of each entity is cached (similar to benefactor ID).

Entity Operations

There are four Entity operations that when performed on a container (Projects & Folders) can change the values of the benefactor ID and project ID for each dependent child in the hierarchy.

  • Move - A move operation is defined as a change in an entity's parent ID.
  • Delete - A delete operation is a special type of move where the entity's parent ID is set to be the Trash.
  • Add an Access Control List (ACL) - When an ACL is added to an Entity, that ACL will control who has access to that Entity (and its descendants). 
  • Remove an ACL - When an ACL is removed from an Entity, access control to that Entity will be governed by the first ancestor of that Entity with an ACL.

Additionally, a move operation of container can change the project ID of each dependent child in the hierarchy.

Current Implementation

Currently, when a container is moved, deleted or an ACL is added or removed, an attempt is made to update project ID and benefactor ID for all dependent children in the same transaction as the original operation.  The time complexity for all four of these operations is linear (O(n)) with n equal to the number of dependent children in the hierarchy. For large hierarchies this leads to performance problems (PLFM-4160 & PLFM-3250).  In addition, a notification message is not sent for each dependent child as that would further degrade performance.  Since there is no notification for dependent children, data can be missing or stale in secondary systems such as search or table views (PLFM-4049 & PLFM-1723).

Concurrency & Inconsistency

Since the benefactor ID is used for all authorization checks, an inconsistent value becomes a serious issue.  In the past, inconsistency has occurred due to concurrent changes within the same hierarchy; see PLFM-3713. The following example illiterates how benefactor inconsistency can be introduced.  From Figure 2 assume all ACLs are at the project level, so all files and folders have their project as a benefactor.  If user-A moves folder-1 to folder-2 and at the same time user-B moves folder-2 to project-2.  All changes from user A's move occur in one transaction and all changes from user-B's move occur in another.  Since Synapse uses a transaction isolation level of Read_committed,  A's transaction does not see that the new benefactor should be project-2, and B's transaction does not see the new children in folder-2.  Therefore, neither transaction correctly sets the benefactor of files 1-n to project-2.  To address PLFM-3713, we changed the move operation such that an exclusive lock is acquired on both old and new benefactors.  This ensures that the two transactions occur in series rather than in parallel. In other words, the second transaction is blocked from running until the first transaction commits, so the second transaction will always set the benefactor correctly.

Migration

Our migration process replicates data from the current production database to the staging database.  The replication process is driven by comparing checksum ranges between production and staging for each migrated table.  Any deltas are then copied from production to staging.  To ensure all changes are migrated from staging to projection both stacks must be in in read-only mode for the final migration.

The migration process assumes all user driven transaction are committed before the scans for deltas starts.  Before the migration process starts scanning for deltas, the counts for all tables are gathered.  It can take more than ten minutes to gather the counts.  Therefore, there is a ten minute window between entering read-only mode and the start of the delta scan.  This is more than enough time for most transactions to commit before the delta scan starts.

However, since the four hierarchy related operations all have a linear time complexity, it is possible for their corresponding transactions to be open longer than ten minutes.  In fact, PLFM-4160 shows a single transaction can take twenty minutes or more.  This is a benefactor inconsistency problem waiting to occur.  For example, if a user were to start the move of a large hierarchy immediately before Synapse entered read-only mode, the benefactor ID of the children in the hierarchy could be inconsistent on the new stack.

Note: All other user driven transaction in Synapse have a constant time complexity (we limit the size of all batch updates).  The only exception, would be the upload of a large CSV to a table.  However, CSV upload is driven by an asynchronous worker that will terminate and rollback the transaction when the stack enters read-only mode.


User Experience

Since the four hierarchy related operations have a linear time complexity, the user experience with each operations varies with size of the hierarchy.  For example, if folder-1 contains three files, and an ACL is added to folder-1 granting user-B access to the files, then the web-services call would return within seconds, and user-b would have access to the files immediately after the call returns.   However, if folder-1 contained 20K files the user experience would not be good.  First, the web service call to add the ACL would timeout.  The user would likely assume the operation failed and would attempt the call again, making things worse.  Since the original transaction would still be running (despite the client-side timeout), it would eventually commit.  Users have reported it can take twenty minutes for their changes to be applied.  The user is not provided any feedback that their transaction is still executing.  During the twenty minute transaction user-b would not be able to see any of the files in folder-1.  After the the transaction commits user-b would be able to see all of the files in folder-1.


Final Resolution

We are removing both benefactorId and projectId from the JDONODE table therefore we no longer need a strategy for keeping these cached values up-to-date.  The benefactorId was replaced with a MySQL function called getEntityBenefactorId().  This function walks the entity hierarchy to find an entity's benefactor.  Similarly, the projectId was replaced with a function called getEntityProjectId.  All operations that depend on the either benefacorId or projectId were converted to use these new functions.


The last remaining dependency on the JDONODE benefactorId and projectId is the entity query services.  The long term plan is to completely remove this service.  In the short term, we refactored this service such that it runs queries against the ENTTITY_REPLICATION table instead of the JDONODE table.  At some point in the near future the entity query service will be completely removed.