Overview

  1. Bridge-EX-Scheduler lives inside AWS Lambda, with a schedule configured to execute every day at 10am UTC (2am PST, 3am PDT).
    1. Scheduler pulls configs from DynamoDB table Exporter-Scheduler-Config, which contains the SQS queue URL, time zone, and an optional request JSON override.
    2. Scheduler generates the request, fills in yesterday's date, and writes the request to the SQS queue.
  2. Bridge-EX polls SQS, waiting for a request.
    1. When the request arrives, Bridge-EX pulls user health data records from DynamoDB (DDB), along with study info, schemas, attachment metadata, and participant options (sharing options). It also pulls attachment content from S3.
    2. Bridge-EX uploads attachments to Synapse as file handles and writes each record to a TSV (tab-separated values) file on disk.
    3. On completion, Bridge-EX uploads the TSVs to the Synapse tables, creating new tables as necessary.
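
The scheduler's request-building step (1.2 above) can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: the class name SchedulerSketch, the "date" and "tag" field names, and the choice to apply the date after the override are all assumptions, and the DynamoDB read and SQS write are left out.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of how a scheduler could build the export request.
class SchedulerSketch {
    // Starts from the optional override fields, then fills in yesterday's date
    // as seen in the configured time zone (assumed field name: "date").
    static Map<String, String> buildRequest(ZonedDateTime now, String timeZone,
            Map<String, String> override) {
        Map<String, String> request = new LinkedHashMap<>();
        if (override != null) {
            request.putAll(override);
        }
        LocalDate yesterday = now.withZoneSameInstant(ZoneId.of(timeZone))
                .toLocalDate().minusDays(1);
        request.put("date", yesterday.toString());
        return request;
    }

    public static void main(String[] args) {
        // 10:00 UTC is 02:00 in Los Angeles during standard time.
        ZonedDateTime now = ZonedDateTime.parse("2016-03-02T10:00:00Z");
        Map<String, String> override = Map.of("tag", "daily-export");
        // Prints {tag=daily-export, date=2016-03-01}
        System.out.println(buildRequest(now, "America/Los_Angeles", override));
    }
}
```

Computing "yesterday" in the configured time zone rather than in UTC matters when the two fall on different calendar days at the moment the scheduler runs.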

Components

Bridge-EX components can be broken down into 3 major groups (not including Spring components and various helpers).

  1. Record Processor - This is the entry point into Bridge-EX. It polls SQS for a request. When a request comes in, it iterates through all health data records associated with that request, calling the Worker Manager with each record. When it's done iterating records, it signals "end of stream" to the Worker Manager.
  2. Worker Manager - This contains a thread pool, a collection of worker handlers, and a task queue, as well as various helper logic needed by the handlers. The Worker Manager is called for every record in a request, and queues a task onto each relevant handler. On end of stream, the Worker Manager signals to each handler to upload their TSVs to Synapse.
  3. Handlers - Various handlers that can be run asynchronously and in parallel. These include health data handlers, app version handlers, and legacy iOS survey handlers.
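
The interaction between the Worker Manager and the handlers can be sketched as below. This is an illustrative sketch only: TsvHandlerSketch and WorkerManagerSketch are hypothetical names, every handler is treated as relevant to every record, and the Synapse upload is replaced by a flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical handler: buffers one TSV row per record, then "uploads" at end of stream.
class TsvHandlerSketch {
    final String name;
    final List<String> rows = new ArrayList<>();
    boolean uploaded = false;

    TsvHandlerSketch(String name) { this.name = name; }

    // Called from worker threads, so access to the row buffer is synchronized.
    synchronized void handle(String recordId) { rows.add(recordId + "\t" + name); }

    // Stand-in for uploading the TSV to a Synapse table.
    synchronized void uploadTsv() { uploaded = true; }
}

// Hypothetical manager: a thread pool plus a list of outstanding tasks.
class WorkerManagerSketch {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final List<Future<?>> outstanding = new ArrayList<>();
    private final List<TsvHandlerSketch> handlers;

    WorkerManagerSketch(List<TsvHandlerSketch> handlers) { this.handlers = handlers; }

    // Called once per record: queues one asynchronous task per handler.
    void addSubtasksForRecord(String recordId) {
        for (TsvHandlerSketch handler : handlers) {
            outstanding.add(pool.submit(() -> handler.handle(recordId)));
        }
    }

    // Called on end of stream: waits for queued tasks, then asks each handler to upload.
    void endOfStream() {
        try {
            for (Future<?> task : outstanding) {
                task.get();
            }
        } catch (InterruptedException | ExecutionException ex) {
            throw new IllegalStateException("Export task failed", ex);
        } finally {
            pool.shutdown();
        }
        for (TsvHandlerSketch handler : handlers) {
            handler.uploadTsv();
        }
    }
}
```

Waiting on every outstanding task before uploading ensures that no TSV is uploaded while rows are still being written to it.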

Spring Configs and Launcher

(config) AppInitializer - This is called by Spring Boot and used to initialize the Spring context and the app.

(config) SpringConfig - Annotation-based Spring config. Self-explanatory.

(config) WorkerLauncher - This is a command-line runner that Spring Boot knows about. Spring Boot automatically calls the run() method on this when it's done loading the Spring context. This is what sets up and runs the PollSqsWorker, which in turn calls the BridgeExporterSqsCallback when it gets a request. The WorkerLauncher currently does everything single-threaded, since Bridge-EX workers are already heavily multi-threaded and we never need to run multiple Export requests in parallel.
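
As a rough illustration of the single-threaded poll-and-callback pattern described above, the sketch below uses an in-memory BlockingQueue as a stand-in for SQS. PollLoopSketch, the maxPolls cutoff, and the decision to log and continue when a callback fails are assumptions made for the example, not the behavior of PollSqsWorker itself.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of a single-threaded polling loop.
class PollLoopSketch {
    // Polls the queue on the calling thread and hands each message to the callback,
    // one at a time. Stops after maxPolls polls so the sketch terminates; a real
    // worker would loop until it is shut down. Returns the number of messages handled.
    static int pollAndDispatch(BlockingQueue<String> queue, Consumer<String> callback,
            int maxPolls) {
        int handled = 0;
        for (int i = 0; i < maxPolls; i++) {
            String message;
            try {
                message = queue.poll(10, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                break;
            }
            if (message == null) {
                continue; // nothing arrived in this interval, poll again
            }
            try {
                callback.accept(message);
                handled++;
            } catch (RuntimeException ex) {
                // Assumption: one failed request should not stop the polling loop.
                System.err.println("Request failed: " + ex.getMessage());
            }
        }
        return handled;
    }
}
```

Because the callback runs on the polling thread, the next request is not picked up until the current one finishes, which matches the one-request-at-a-time behavior described for the WorkerLauncher.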

Record Processor

(record) BridgeExporterRecordProcessor - This is the main entry point into the Exporter. It is called once per request, with the deserialized request. It calls the RecordIdSourceFactory to get the stream of record IDs, the RecordFilterHelper to decide whether to include or exclude each record, and the WorkerManager to queue an asynchronous task for each included record. It has a configurable loop delay to avoid browning out DDB, and it logs progress at configurable intervals.
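
The per-request loop can be sketched as follows. This is a simplified illustration under stated assumptions: RecordLoopSketch is a hypothetical name, the record source is a plain Iterator of IDs, the filter and the Worker Manager are passed in as functional interfaces, and the progress interval is assumed to be a record count (it must be greater than zero).

```java
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of the record-processing loop.
class RecordLoopSketch {
    // Iterates record IDs, skips excluded ones, queues a task for the rest,
    // and returns the number of records queued.
    static int processRecords(Iterator<String> recordIds, Predicate<String> shouldExclude,
            Consumer<String> queueTask, long loopDelayMillis, int progressInterval) {
        int seen = 0;
        int queued = 0;
        while (recordIds.hasNext()) {
            String recordId = recordIds.next();
            seen++;
            if (!shouldExclude.test(recordId)) {
                queueTask.accept(recordId);
                queued++;
            }
            if (seen % progressInterval == 0) {
                System.out.println("Processed " + seen + " records so far");
            }
            if (loopDelayMillis > 0) {
                // Throttle the loop so the backing table is not overwhelmed.
                try {
                    Thread.sleep(loopDelayMillis);
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return queued;
    }
}
```

A call such as processRecords(ids.iterator(), id -> id.startsWith("skip"), manager::addSubtasksForRecord, 5L, 1000) would queue every non-skipped record, pausing 5 ms between records and logging every 1000 records; here ids and manager are placeholders for the caller's own objects.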

(record) RecordFilterHelper

(record) RecordIdSource

(record) RecordIdSourceFactory

(request) BridgeExporterSqsCallback

Worker Manager

(worker) ExportSubtask

(worker) ExportTask

(worker) ExportWorker

(worker) ExportWorkerManager

(worker) TsvInfo

Handlers

(handler) AppVersionExportHandler

(handler) ExportHandler

(handler) HealthDataExportHandler

(handler) IosSurveyExportHandler

(handler) SynapseExportHandler

Helpers

(dynamo) DynamoHelper

(helper) ExportHelper

(metrics) Metrics

(metrics) MetricsHelper

(synapse) SynapseHelper

(synapse) SynapseStatusTableHelper

(synapse) SynapseTableIterator

(util) BridgeExporterUtil

Deployment

Troubleshooting

Redrives

Legacy Hacks

More Info

Bridge Data Pipeline

Bridge Upload Data Format

https://github.com/Sage-Bionetworks/Bridge-Exporter

https://github.com/Sage-Bionetworks/Bridge-EX-Scheduler
