Asynchronous Table Operations

There are a series of table operations that we would like to support that involve moving potentially large amounts of data between clients and Synapse. For example, create a new table entity from a local CSV file. Or for a particularly large query result, download the entire result set as a CSV file. This document outlines the proposed services that will support these types of features.

Create or Update a Table from CSV

The client must first create a ColumnModel for each column in the CSV.
Create or update the TableEntity with the desired column model IDs from step one.
Upload the CSV as a file handle (see: File Services). The first row of the table should be a header with either the names or ids of the corresponding columns.
Start a table job passing the FileHandle id from step 3. and the TableEntity ID from step 2 (see: POST /entity/<id>/table/asynch in the table below). This call will return a AsynchTableJobStatus object with a job ID. A background process will be launched to create a RowSet from the data in the CSV files.
Monitor the status of the job using the job ID provide from step 2. Once the job completes all of the data from the CSV will be applied to the table. Either all of the data from the CSV will be applied to the table or none of the data will be applied (in the case of an error). If the job fails, the job status will provide details on the failure.

Query Results as CSV

Start a table job passing the query to be executed against an existing table (see: POST /table/query/asynch in the table below). A AsynchTableJobStatus object with a job ID will be returned.
Monitor the status of the job using the ID from the previous step. Once the CSV is ready for download a temporary URL will be provide in the job status.

Object Model

REST API

URL	Method	Request Body	Response body	Description	Authority
/entity/<id>/table/asynch	POST	AsynchTableUpdateRequest	AsynchTableJobStatus	Start an asynchronous job to update a table entity with the contents of a CSV file.	Must have UPDATE permission on the table entity.
/table/query/asynch	POST	AsynchTableQueryRequest	AsynchTableJobStatus	Start an asynchronous job that will execute the provide query post the results to a CSV file. When the job.status is set to COMPLETE, a pre-signed URL will be set in the AsynchTableJobStatus.url. Note, the resulting pre-signed URL will expire after 10 minutes.	Must have READ permission on the table entity.
/table/asynch/<job_Id>	GET	N/A	AsynchTableJobStatus	Get the status of an asynchronous table job using the job's id. Use this method to get the status of both query and updates jobs.	Only the creator of the job can access a job's status.

CSV Formatting

CSV files generated from a query request will always include the two system generated columns: ROW_ID and ROW_VERSION. The first row of the generated CSV will contain headers for each column.

In order to update rows with a CSV file, the file must include the ROW_ID column. If a ROW_ID column is not provide then the all rows will be treated as new and appended to the table.