Portals are websites created to promote data sharing within a specific research community. These websites aggregate relevant data from Synapse, and they allow users to explore data, projects, people, and organizations within their research community.
Sage Bionetworks hosts a variety of Portals for different research communities. The AD Knowledge Portal, the NF Data Portal, and the PsychENCODE Knowledge Portal are a few examples. The following guide describes how to programmatically download data discovered via a Portal.
All entities in Synapse are automatically assigned a globally unique identifier of the form syn12345678, which is used for reference. Often abbreviated to “synID”, this identifier never changes, even if the entity's name does. You will use a synID to locate the files you wish to download.
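Because every later step depends on synIDs, it can be useful to check that a value looks like one before passing it to a client. The following is a minimal sketch, assuming a synID is the prefix syn followed only by digits (as in syn12345678); the helper name is_syn_id is hypothetical and not part of any Synapse client.

```python
import re

# Assumption: a synID is the literal prefix "syn" followed by one or more digits.
SYN_ID_PATTERN = re.compile(r"syn[0-9]+")


def is_syn_id(value):
    """Return True if value looks like a Synapse ID such as syn12345678."""
    if not isinstance(value, str):
        return False
    # Surrounding whitespace is ignored; the rest must match in full.
    return SYN_ID_PATTERN.fullmatch(value.strip()) is not None


candidates = ["syn12345678", "SYN123", "syn", "12345678", " syn42 "]
valid = [c.strip() for c in candidates if is_syn_id(c)]
print(valid)
```

This only checks the shape of the identifier. It does not confirm that the entity exists or that you have access to it.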
Find Files using Explore
Search the available data files via the Explore Data or Explore Files tab in the navigation bar. The Explore section presents several ways to select data files of interest. The top of the page displays pie charts that summarize the number of data files by file annotation, including Assay and Tissue, among others. Selecting a chart segment filters the table below to the matching subset of files. Alternatively, use the facet selection boxes to the left of the table to apply filters. For this example, you will download the processed data and metadata from the MC-CAA study in the Alzheimer’s Disease (AD) Knowledge Portal.
...
Code Block |
---|
synapse login -u <Synapse username> -p <API key> --rememberMe |
From Explore Data in the Portal, select the Download Options icon and then Programmatic Options to view the command that downloads the data subset.
...
The command synapse get with the -q argument downloads all files in the Portal that meet the specified condition. In this example, all processed data and metadata files from the MC-CAA study will be downloaded. Execute the following command from the directory where you would like to store the files.
...
Also in your working directory, you will find a SYNAPSE_TABLE_QUERY_###.csv file that lists the annotations associated with each downloaded file. This file contains experimental details relevant to how the data was processed, as well as details about each file itself, including the file version number.
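To get a quick overview of what was downloaded, the annotation file can be tallied by one of its columns. The sketch below uses only the Python standard library. The column names in the sample (id, name, version, assay) and the helper summarize_manifest are assumptions for illustration; the actual file may use different headers.

```python
import csv
import io
from collections import Counter


def summarize_manifest(handle, column):
    """Count how many rows carry each value of one annotation column."""
    reader = csv.DictReader(handle)
    counts = Counter()
    for row in reader:
        # Rows with no value for the column are grouped under "(missing)".
        counts[row.get(column) or "(missing)"] += 1
    return counts


# Hypothetical contents standing in for the real annotation file, which
# would instead be opened with open(<its actual filename>, newline="").
sample = io.StringIO(
    "id,name,version,assay\n"
    "syn111,a.bam,1,rnaSeq\n"
    "syn222,b.bam,2,rnaSeq\n"
    "syn333,meta.csv,1,\n"
)
assay_counts = summarize_manifest(sample, "assay")
for value, n in assay_counts.items():
    print(value, n)
```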
...
Once you have identified the files you want to download from Explore Data, use the Export Table option from Download Options. The exported table includes the annotations associated with each file.
...
You may choose to download the file as a .csv or .tsv. Files are named Job-#### (where #### is a long set of numbers). Move this file to your working directory to proceed with the following steps.
Install the Synapse R client synapser to download data from Synapse. Log in to Synapse with your API key.
Code Block | ||
---|---|---|
| ||
library(synapser) |
synLogin("my_username", "api_key") |
Read the exported table into R, replacing Job-#### with the complete filename of the downloaded table. Create a directory to store files and download data using synGet. If downloadLocation is not specified, the files are downloaded to a hidden directory called ~/.synapseCache.
Code Block | ||
---|---|---|
| ||
exported_table <- read.csv("Job-####.csv") |
dir.create("files") |
lapply(exported_table$id, synGet, downloadLocation = "./files") |
...
To download data programmatically, you need a list of synIDs that correspond to the files.
Once you have identified the files you want to download from Explore Data, use the Export Table option from Download Options. The exported table includes the annotations associated with each file.
...
You may choose to download the file as a .csv or .tsv. Files are named Job-#### (where #### is a long set of numbers). Move this file to your working directory to proceed with the following steps.
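Before downloading anything, it may help to confirm that the exported table really contains a list of synIDs. The following is a minimal standard-library sketch that reads either a .csv or a .tsv export and returns the values of its id column. The helper read_syn_ids and the stand-in file demo_export.tsv are hypothetical; the id column name follows the examples in this guide.

```python
import csv
import os
import tempfile


def read_syn_ids(path, id_column="id"):
    """Return the non-empty values of the id column from a .csv or .tsv table."""
    # Choose the delimiter from the file extension.
    delimiter = "\t" if path.lower().endswith(".tsv") else ","
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        if id_column not in (reader.fieldnames or []):
            raise ValueError(f"no '{id_column}' column in {path}")
        return [row[id_column] for row in reader if row[id_column]]


# Demonstration with a stand-in file; a real export keeps its Job-#### name.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_export.tsv")
    with open(demo_path, "w", newline="") as handle:
        handle.write("id\tname\nsyn111\ta.txt\nsyn222\tb.txt\n")
    demo_ids = read_syn_ids(demo_path)
print(demo_ids)
```

Raising an error when the id column is missing makes a wrong or truncated export visible early, rather than silently producing an empty download list.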
Install the Synapse Python client synapseclient to download data from Synapse. You will also use the pandas library to read a CSV file and the os module to make a directory. Log in to Synapse with your API key.
Code Block | ||
---|---|---|
| ||
import synapseclient, pandas, os |
syn = synapseclient.Synapse() |
syn.login('my_username', 'api_key') |
Read the exported table into Python, replacing Job-#### with the complete filename of the downloaded table. Create a directory to store files and download data using syn.get. If downloadLocation is not specified, the files are downloaded to a hidden directory called ~/.synapseCache.
Code Block | ||
---|---|---|
| ||
exported_table = pandas.read_csv("Job-####.csv") |
os.mkdir("files") |
[syn.get(x, downloadLocation="./files") for x in exported_table.id] |
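For longer lists of files, a slightly more defensive loop may be preferable: os.mkdir raises an error if the directory already exists, and a single failed download would otherwise stop the whole run. The sketch below is one possible arrangement, not part of the Synapse client. The helper download_all and the stand-in fake_fetch are hypothetical; with a logged-in client, the fetch argument would wrap syn.get as shown in the comment.

```python
import os
import tempfile


def download_all(syn_ids, fetch, dest="files"):
    """Fetch each synID into dest, collecting failures instead of stopping."""
    # Unlike os.mkdir, this does not raise if the directory already exists.
    os.makedirs(dest, exist_ok=True)
    downloaded, failed = [], []
    for syn_id in syn_ids:
        try:
            downloaded.append(fetch(syn_id, dest))
        except Exception as error:  # keep going and report failures afterwards
            failed.append((syn_id, str(error)))
    return downloaded, failed


# With a logged-in client, fetch could be defined as:
#   fetch = lambda syn_id, dest: syn.get(syn_id, downloadLocation=dest)
# The stand-in below only imitates a download so the sketch runs offline.
def fake_fetch(syn_id, dest):
    if syn_id == "syn000":
        raise ValueError("not found")
    return os.path.join(dest, syn_id + ".txt")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "files")
    done, failed = download_all(["syn111", "syn000", "syn222"], fake_fetch, target)
print(len(done), "downloaded;", len(failed), "failed")
```

Returning the failures as a list lets you retry only those synIDs instead of repeating the entire download.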
...