Document toolboxDocument toolbox

Querying Tables, Views, and Datasets

Not familiar with these terms? Find more information on Tables, Views, and Datasets.

You can query tables, views, and datasets to find the data you need quickly. Views are especially useful for finding data that may be spread across multiple folders or projects. You can take advantage of querying via the Synapse UI, or using one of the programmatic clients.

Querying Tables, Views, and Datasets via the Synapse UI

Using Simple Search

On the web, simple search mode displays radio buttons or sliders to the left of a table, view, or dataset. These are facets that you can use to search the data. Each facet corresponds to a column in the table, view, or dataset

If the simple search is not visible on a table, view, or dataset, then no facets were specified by the owner of that table, view, or dataset. In this case, use the advanced search described below to query for data.

To use simple search facets, navigate to a table, view, or dataset. Select the features you are interested in to filter the results. Note that the slider for range in simple search is inclusive, meaning the smallest and largest numbers in the selected range will be included in your search results.

Setting Up Simple Search

If you are the owner of a table, view, or dataset, you can choose what facets appear in a simple search menu. To select the facets, click on the Tools menu and select Show View Schema, Show Table Schema, or Show Dataset Schema. Click on the Edit Schema button to view the Edit Columns window. You can choose what facets appear by selecting Values or Range from the dropdown menus under the Facet option. Values can be thought of as categories whereas Range is a date or number. Selecting a blank field will remove the facet from the search window.

Note: If you change the Column Type in the schema, you must set its facet selection again.

Using Advanced Search Queries

The data within a table, view, or dataset can also be retrieved by using a SQL-like query language from the Synapse web interface. To query a table, view, or dataset, select the wrench icon in the upper right-hand menu to reveal the advanced search bar.

By default, the advanced search bar will be pre-populated with a query to capture all of the data contained in the table, view, or dataset. For example:

SELECT * FROM syn3079449

 

You may specify columns explicitly after the SELECT statement to subset the data:

SELECT age, gender FROM syn3079449

 

To keep all columns in the query and filter rows that meet a certain condition, incorporate a WHERE statement:

SELECT * FROM syn3079449 WHERE age > 50

 

Add an ORDER BY statement and a column name to sort the results by that column. The ASC statement sorts the data returned in ascending order.

 

COUNT, SELECT AS, and GROUP_CONCAT SQL statements are also supported. To count the number of rows:

 

Select and rename a subset of columns:

 

Group rows by name and identify distinct differences:

 

To list out the distinct treatment arms that were studied, by gender:

 

See the REST API docs for a list of all queries that can be performed.

Querying Tables, Views, and Datasets Programatically

Tables, views, and datasets can also be queried directly from the programmatic clients, which accept all of the SQL-like language used above. For example, to query for the contents of syn12345678:

Command Line

Python

R

The expressions are the conditions for limiting a search. Every entity has properties useful for searching:

  • All entities (projects, files, folders, tables/views, Docker containers):  idnamecreatedOncreatedBymodifiedOnmodifiedByetagtypeparentIdbenefactorIdprojectId

  • Versionable entities (files, table, views, datasets): currentVersion

  • Files only: dataFileHandleId

Files also have contentMd5contentSize, and contentType as properties. These properties are not available in a view and are not searchable.

For a complete list of example queries, see:

SQL Query Examples

Finding Files in a Specific Project

To find files in a specific project, create a file view in the web client. For example, if you’d like to see all files in a project, navigate to your project and then select the Tables tab. From there, click Tables Tools and Add File View. Click Add container and Enter Synapse ID to create a tabular file view that contains every file in the project, which you can now query. Importantly, if you want to later query on annotations, you must select Add All Annotations.

Listing Files in a Specific Folder

If you are using a programmatic client, you can list the files in a specific folder. First, you need to know the synID of the folder (for example syn1524884, which has data from TCGA related to melanoma). All entities in this folder will have a parentID of syn1524884.

The function to find all files in this folder is called “getChildren”:

Python

R

Queries on Annotations

If annotations have been added to files, they can be used to discover files of interest from a file view syn12345678. For example, you can identify all files annotated as bam files (fileFormat = bam) with the following query:

Likewise, if you put the RNA-Seq related files described in the section above into the project syn00123 with the described annotations, then you could find all of the files for conditionB and sampleA:

Lastly, you can query on a subset of entities that have a specific annotation. You can limit the annotations you want displayed as following.

Reproducible queries can be constructed using one of the analytical clients (command line, Python, and R) and on the web client, query results can be displayed in a table on a wiki page.

In a project, from the wiki page click Wiki Tools in the upper right corner to Edit Project Wiki. Click Insert and choose Table: Query on Files/Folders. Enter your query in the box and click the Insert button. Once you save the wiki page, the results will be displayed as a table.

Downloading from a Query

You can download files in a folder using queries. Currently this feature is only available in the command line client. For example, if you want to download all files in a file view that has a synapse ID of syn00123, use:

Troubleshooting

Single quotes in Synapse queries must be replaced by double quotes or two single quotes. In order to query for the chemicalStructure of 4'-chemical: