Skip to main content
Skip table of contents

Querying Tables, Views, and Datasets

Not familiar with these terms? Find more information on Tables, Views, and Datasets.

You can query tables, views, and datasets to find the data you need quickly. Views are especially useful for finding data that may be spread across multiple folders or projects. You can take advantage of querying via the Synapse UI, or using one of the programmatic clients.

Querying Tables, Views, and Datasets via the Synapse UI

Using Simple Search

On the web, simple search mode displays radio buttons or sliders to the left of a table, view, or dataset. These are facets that you can use to search the data. Each facet corresponds to a column in the table, view, or dataset

If the simple search is not visible on a table, view, or dataset, then no facets were specified by the owner of that table, view, or dataset. In this case, use the advanced search described below to query for data.

To use simple search facets, navigate to a table, view, or dataset. Select the features you are interested in to filter the results. Note that the slider for range in simple search is inclusive, meaning the smallest and largest numbers in the selected range will be included in your search results.

Setting Up Simple Search

If you are the owner of a table, view, or dataset, you can choose what facets appear in a simple search menu. To select the facets, click on the Tools menu and select Show View Schema, Show Table Schema, or Show Dataset Schema. Click on the Edit Schema button to view the Edit Columns window. You can choose what facets appear by selecting Values or Range from the dropdown menus under the Facet option. Values can be thought of as categories whereas Range is a date or number. Selecting a blank field will remove the facet from the search window.

Note: If you change the Column Type in the schema, you must set its facet selection again.

Using Advanced Search Queries

The data within a table, view, or dataset can also be retrieved by using a SQL-like query language from the Synapse web interface. To query a table, view, or dataset, select the funnel icon in the upper righthand menu to reveal the advanced search bar.

 

By default, the advanced search bar will be pre-populated with a query to capture all of the data contained in the table, view, or dataset. For example:

CODE
SELECT * FROM syn3079449

 

You may specify columns explicitly after the SELECT statement to subset the data:

CODE
SELECT age, gender FROM syn3079449

 

To keep all columns in the query and filter rows that meet a certain condition, incorporate a WHERE statement:

CODE
SELECT * FROM syn3079449 WHERE age > 50

 

Add an ORDER BY statement and a column name to sort the results by that column. The ASC statement sorts the data returned in ascending order.

CODE
SELECT * FROM syn3079449 WHERE age > 50 ORDER BY "treatmentArm" ASC

 

COUNT, SELECT AS, and GROUP_CONCAT SQL statements are also supported. To count the number of rows:

CODE
SELECT count(*) FROM syn3079449

 

Select and rename a subset of columns:

CODE
SELECT age AS "Age at Diagnosis", gender AS "Gender" FROM syn3079449

 

Group rows by name and identify distinct differences:

CODE
SELECT count(distinct(treatmentArm)) AS "Number of Treatments", gender FROM syn3079449 group by gender

 

To list out the distinct treatment arms that were studied, by gender:

CODE
SELECT GROUP_CONCAT(distinct(treatmentArm) SEPARATOR ', ') AS "Available Treatments", gender as "By Gender" FROM syn3079449 group by gender

 

See the REST API docs for a list of all queries that can be performed.

Toggling Between Simple and Advanced Search

You can toggle from the simple search to the advanced search without losing your query results. For example, if you selected treatment arm A, age of 23:64, and gender as female, the query will be preserved in the advanced search bar. However, this feature is unidirectional because the advanced search allows for parameters that are not available with facets. Therefore, switching from advanced to simple search will result in resetting the search query. Synapse will warn you before your search is reset.

 

Warning: When toggling back to simple search, the query will be reset.

Querying Tables, Views, and Datasets Programatically

Tables, views, and datasets can also be queried directly from the programmatic clients, which accept all of the SQL-like language used above. For example, to query for the contents of syn12345678:

Command Line

CODE
synapse query 'SELECT * FROM syn12345678'

Python

CODE
query = syn.tableQuery('SELECT * FROM syn12345678')

R

CODE
query <- synTableQuery('SELECT * FROM syn12345678')

The expressions are the conditions for limiting a search. Every entity has properties useful for searching:

  • All entities (projects, files, folders, tables/views, Docker containers):  idnamecreatedOncreatedBymodifiedOnmodifiedByetagtypeparentIdbenefactorIdprojectId

  • Versionable entities (files, table, views, datasets): currentVersion

  • Files only: dataFileHandleId

Files also have contentMd5contentSize, and contentType as properties. These properties are not available in a view and are not searchable.

CODE
SELECT * FROM syn12345678 WHERE "id" = 'syn00012'

For a complete list of example queries, see:

SQL Query Examples

Finding Files in a Specific Project

To find files in a specific project, create a file view in the web client. For example, if you’d like to see all files in a project, navigate to your project and then select the Tables tab. From there, click Tables Tools and Add File View. Click Add container and Enter Synapse ID to create a tabular file view that contains every file in the project, which you can now query. Importantly, if you want to later query on annotations, you must select Add All Annotations.

Listing Files in a Specific Folder

If you are using a programmatic client, you can list the files in a specific folder. First, you need to know the synID of the folder (for example syn1524884, which has data from TCGA related to melanoma). All entities in this folder will have a parentID of syn1524884.

The function to find all files in this folder is called “getChildren”:

Python

CODE
foo = list(syn.getChildren(parent='syn1524884', includeTypes=['file']))

R

CODE
foo <- as.list(synGetChildren(parent='syn1524884', includeTypes=list('file')))

Queries on Annotations

If annotations have been added to files, they can be used to discover files of interest from a file view syn12345678. For example, you can identify all files annotated as bam files (fileFormat = bam) with the following query:

CODE
SELECT * FROM syn123456 WHERE "fileFormat"='bam'

Likewise, if you put the RNA-Seq related files described in the section above into the project syn00123 with the described annotations, then you could find all of the files for conditionB and sampleA:

CODE
SELECT * FROM syn123456 WHERE "projectId"='syn00123' AND "specimenID"='sampleA_conditionB'

Lastly, you can query on a subset of entities that have a specific annotation. You can limit the annotations you want displayed as following.

CODE
SELECT specimenID,genomeBuild,fileFormat,platform FROM file WHERE "projectId"='syn00123' AND "specimenID"='sampleA_conditionB'

Reproducible queries can be constructed using one of the analytical clients (command line, Python, and R) and on the web client, query results can be displayed in a table on a wiki page.

In a project, from the wiki page click Wiki Tools in the upper right corner to Edit Project Wiki. Click Insert and choose Table: Query on Files/Folders. Enter your query in the box and click the Insert button. Once you save the wiki page, the results will be displayed as a table.

CODE
synapse query "SELECT specimenID,genomeBuild,fileFormat,platform FROM syn123456 WHERE \"specimenID\"='sampleA_conditionB'"

result = syn.tableQuery("SELECT specimenID,genomeBuild,fileFormat,platform FROM syn123456 WHERE \"specimenID\"='sampleA_conditionB'")

result = synTableQuery("SELECT specimenID,genomeBuild,fileFormat,platform FROM syn123456 WHERE \"specimenID\"='sampleA_conditionB'")

Downloading from a Query

You can download files in a folder using queries. Currently this feature is only available in the command line client. For example, if you want to download all files in a file view that has a synapse ID of syn00123, use:

CODE
synapse get -q "SELECT * FROM file WHERE parentId = 'syn00123'"

Troubleshooting

Single quotes in Synapse queries must be replaced by double quotes or two single quotes. In order to query for the chemicalStructure of 4'-chemical:

CODE
SELECT * FROM syn123 where "chemicalStructure" = '4"-chemical'
# OR
SELECT * FROM syn123 where "chemicalStructure" = '4''-chemical'

 

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.