Bridge Downstream is a data pipeline that takes data uploaded to Bridge by a digital health app and turns it into Parquet datasets. These Parquet datasets are written to an S3 bucket and can be accessed through Synapse.
Why bother doing all that?
A digital health app typically sends data to Bridge as a .zip archive of JSON files. This is not an easy data format for analysts to work with – we want data frames! A Parquet dataset is a normalized version of the data in these JSON files and can be easily loaded as a data frame.
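To make the idea of normalizing JSON concrete, here is a minimal sketch that turns one nested JSON document into flat rows, the shape a data frame expects. It is only an illustration and not how Bridge Downstream itself does the conversion. The field names (`taskIdentifier`, `steps`, `result`) and the helpers `flatten` and `to_rows` are hypothetical and do not reflect a real Bridge schema.

```python
import json

# Hypothetical contents of one JSON file from an upload archive.
# The field names are made up for illustration, not a real Bridge schema.
raw = """
{
  "taskIdentifier": "example-task",
  "steps": [
    {"identifier": "step1", "result": {"value": 3}},
    {"identifier": "step2", "result": {"value": 5}}
  ]
}
"""


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


def to_rows(document, list_key="steps"):
    """Return one flat row per element of the nested list under list_key.

    Top-level fields are repeated on every row so each row stands alone.
    """
    parent = {k: v for k, v in document.items() if k != list_key}
    return [{**parent, **flatten(item)} for item in document[list_key]]


rows = to_rows(json.loads(raw))
for row in rows:
    print(row)
```

Each row is a plain dictionary with the same columns (`taskIdentifier`, `identifier`, `result.value`), so a list like this can be handed to a data frame constructor such as `pandas.DataFrame(rows)` if pandas is installed.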
Why is the Parquet folder in my Synapse project empty?
The Parquet data is written to an S3 bucket that acts as the external storage location of the Parquet folder, so the datasets do not appear as files when you browse the folder in Synapse. For instructions on how to access the Parquet datasets in this external storage location and read each of them as a data frame, see Getting Started.
How do I interpret the names of the Parquet datasets?
Each Parquet dataset name indicates the type of assessment data the dataset contains. The naming scheme is explained in Understanding Parquet Datasets.