Any REST call to the repository services that writes data runs in a single database transaction.  This includes writes where data is stored in S3.  In such cases, the data is first written to S3 under a key containing a UUID.  The key is then stored in RDS as part of that single transaction.  As a result, S3 data and RDS data always have read/write consistency (although "orphaned" files may remain in S3 if a transaction fails).  All secondary data sources are eventually consistent.
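The write path described above can be sketched as follows. This is only an illustration under stated assumptions, not the Synapse implementation: an in-memory SQLite database stands in for RDS, a small dictionary-backed class stands in for S3, and the names InMemoryObjectStore, write_file_record, and file_handle are hypothetical.

```python
import sqlite3
import uuid


class InMemoryObjectStore:
    """Hypothetical stand-in for S3: maps keys to bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data


def write_file_record(db, store, name, data, fail_in_transaction=False):
    """Upload data under a UUID-based key, then record the key in one transaction."""
    # A fresh UUID in the key means an upload never overwrites an existing object.
    key = f"{uuid.uuid4()}/{name}"
    # Step 1: write to the object store, outside the database transaction.
    store.put(key, data)
    # Step 2: record the key. The connection context manager commits on
    # success and rolls back if an exception is raised inside the block.
    with db:
        db.execute(
            "INSERT INTO file_handle (s3_key, name) VALUES (?, ?)", (key, name)
        )
        if fail_in_transaction:
            raise RuntimeError("simulated failure inside the transaction")
    return key


db = sqlite3.connect(":memory:")  # stand-in for RDS
db.execute("CREATE TABLE file_handle (s3_key TEXT PRIMARY KEY, name TEXT)")
store = InMemoryObjectStore()

ok_key = write_file_record(db, store, "a.txt", b"hello")
try:
    write_file_record(db, store, "b.txt", b"world", fail_in_transaction=True)
except RuntimeError:
    pass  # the insert was rolled back, but the uploaded object remains

recorded = {row[0] for row in db.execute("SELECT s3_key FROM file_handle")}
orphans = set(store.objects) - recorded  # objects that no database row points to
```

In this sketch, every key visible in the database refers to an object that was fully uploaded before the row was committed, which is the consistency property described above. The failed call leaves one object in the store with no row referencing it, which corresponds to an "orphaned" file.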

Although there are many logically independent web services in Synapse, they are all bundled into a single .war for deployment to a single Elastic Beanstalk environment, which provides autoscaling for all the services at minimal cost given the moderate traffic on Synapse.  The workers.war is deployed to a second Elastic Beanstalk environment, ensuring that worker processes cannot interfere with the performance of the web services tier.  Here too, several logically independent workers are bundled into one deployable unit for reasons of cost and operational simplicity.

Asynchronous processing

As mentioned above, the repository services write only to RDS and S3.   All other data sources (Dynamo, CloudSearch, etc.) are secondary and serve as indexes for quick data retrieval for things such as ad hoc queries and search. These secondary indexes are populated by the workers in the workers.war. The details of these workers will be covered in more detail later, but for now, think of the workers as a suite of processes that respond to messages generated by the repository services.
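The relationship between the repository services, the messages, and a secondary index can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: dictionaries stand in for RDS and for a secondary index, a queue.Queue stands in for the messaging system, and the function names and message fields are hypothetical. The choice to have the worker re-read the primary store, rather than trust the message payload, is also an assumption and not something this page states.

```python
import queue

primary = {}              # stand-in for RDS, the source of truth
search_index = {}         # stand-in for a secondary index such as CloudSearch
messages = queue.Queue()  # stand-in for the message queue


def repository_write(entity_id, name):
    """Write to the primary store, then announce the change."""
    primary[entity_id] = {"name": name}
    messages.put({"id": entity_id, "type": "UPDATE"})


def repository_delete(entity_id):
    """Delete from the primary store, then announce the change."""
    primary.pop(entity_id, None)
    messages.put({"id": entity_id, "type": "DELETE"})


def run_worker_once():
    """Drain pending messages and bring the secondary index in line with primary."""
    processed = 0
    while True:
        try:
            msg = messages.get_nowait()
        except queue.Empty:
            return processed
        # Look up the current primary state, so that stale or out-of-order
        # messages still converge on the right result.
        current = primary.get(msg["id"])
        if current is None:
            search_index.pop(msg["id"], None)
        else:
            search_index[msg["id"]] = current["name"].lower()
        processed += 1


repository_write("syn1", "Alpha")
repository_write("syn2", "Beta")
repository_delete("syn2")

index_before = dict(search_index)  # the index lags until the worker runs
count = run_worker_once()
```

Before the worker runs, the index is empty even though the primary store already holds the data; after it runs, the index reflects the primary state. That gap is what eventual consistency means for the secondary data sources.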

...