Getting Started

Accessing the data

Data comes in two formats, raw and parquet.

Raw Data

Raw data is the data exactly as the app sent it to Bridge. You can find it under the “Files” tab in a folder called “Bridge Raw Data”. You can also view this data and its metadata under the “Tables” tab, in a view named “Bridge Raw Data View”. We don’t recommend working with the data in this format, although you may find the view convenient for working with the file metadata.

Parquet Data

Parquet data is a relational version of the raw data. It can be found under the “Files” tab in a folder named “parquet”. Don’t freak out if this folder is empty! This is by design. Underneath the folder name you will notice a line of text describing the folder’s storage location.

This tells you where the data really is. We use Synapse to control access to this data, although all the real action happens in S3.

To interact with this data, you will need to authenticate with AWS so that its services know you have access to the data in S3. Synapse makes this authentication easy with STS tokens, which can be retrieved using the Python, R, or command line Synapse clients. If you have not installed one of those clients yet, do that first. Assuming a client is installed, here is some sample code which will allow you to access the parquet data.

Python
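
A minimal sketch of the Python route, not necessarily the exact sample this page originally showed. It assumes the `synapseclient` and `pyarrow` packages are installed and that you pass in the Synapse ID of the “parquet” folder. The helper names (`s3_location`, `s3_credentials`, `open_parquet_folder`) are made up for illustration, and using `pyarrow` to talk to S3 is one option among several.

```python
def s3_location(token):
    """Return 'bucket/baseKey', the S3 prefix the STS token grants access to."""
    return "{}/{}".format(token["bucket"], token["baseKey"])


def s3_credentials(token):
    """Map the fields of a JSON-format Synapse STS token onto the keyword
    arguments that pyarrow's S3FileSystem expects."""
    return {
        "access_key": token["accessKeyId"],
        "secret_key": token["secretAccessKey"],
        "session_token": token["sessionToken"],
    }


def open_parquet_folder(folder_id, region=None):
    """Log in to Synapse, request a read-only STS token for the folder, and
    return (filesystem, base_path) for browsing the parquet data in S3.

    folder_id: Synapse ID of the "parquet" folder (you must supply this).
    region:    optional AWS region of the bucket; an assumption, since this
               page does not say which region the data lives in.
    """
    # Imported here so the two helpers above work without these packages.
    import synapseclient
    from pyarrow import fs

    syn = synapseclient.login()  # relies on credentials you have already configured
    token = syn.get_sts_storage_token(
        folder_id, permission="read_only", output_format="json"
    )

    kwargs = s3_credentials(token)
    if region is not None:
        kwargs["region"] = region
    return fs.S3FileSystem(**kwargs), s3_location(token)
```

Calling `open_parquet_folder("syn00000000")` (a placeholder ID, replace it with your own) should hand back a filesystem object and a base path. From there, `s3.get_file_info(fs.FileSelector(base_path))` lists what sits under the folder. STS tokens expire, so request a fresh one if a long-running session starts getting access errors.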

R

Command Line

For those who are curious, full documentation on using STS with Synapse can be found here.