Bridge Downstream is a data pipeline that takes data uploaded to Bridge by a digital health app and turns it into parquet datasets. These parquet datasets are written to an S3 bucket and can be accessed through Synapse.
Why would you want to do that?
A digital health app typically sends data to Bridge as a .zip archive of JSON files. This is not an easy data format for analysts to work with – we want data frames! A parquet dataset is a relational version of the data in these JSON files and can be easily loaded as a data frame.
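To make the motivation concrete, here is a minimal sketch of the kind of reshaping involved. It is not the pipeline's actual implementation: the file names (`metadata.json`, `taskData.json`), the field names, and both helper functions are hypothetical and chosen only for illustration. It builds a small .zip archive of JSON files in memory, then flattens the nested JSON into one row per record, which is the tabular shape an analyst would want.

```python
import io
import json
import zipfile


def make_example_archive() -> bytes:
    """Build a small in-memory .zip of JSON files (hypothetical contents)."""
    files = {
        "metadata.json": {"appVersion": "1.0", "deviceInfo": "example-device"},
        "taskData.json": {
            "steps": [
                {"identifier": "step1", "answer": 3},
                {"identifier": "step2", "answer": 5},
            ]
        },
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, json.dumps(content))
    return buffer.getvalue()


def archive_to_rows(archive_bytes: bytes) -> list[dict]:
    """Flatten the nested 'steps' list into one row per step.

    The archive-level metadata fields are repeated on every row so that
    each row is self-describing, as in a relational table.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        metadata = json.loads(archive.read("metadata.json"))
        task_data = json.loads(archive.read("taskData.json"))
    return [{**metadata, **step} for step in task_data["steps"]]


rows = archive_to_rows(make_example_archive())
for row in rows:
    print(row)
```

Each element of `rows` is a flat dictionary with the same keys, so the list maps directly onto the rows and columns of a data frame. If pandas is available, `pandas.DataFrame(rows)` would produce one, and a parquet dataset can be loaded into a data frame with `pandas.read_parquet`.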