- Created by Stacey Taylor (Unlicensed), last modified by Kevin Boske on Mar 06, 2024
About Synapse
Synapse is a cloud-based data repository and sharing platform where researchers can share and describe content to co-analyze, learn from, and improve knowledge of health and disease. Synapse was developed to encourage research collaborations across institutional boundaries and is therefore provided as “Software As A Service” with a single instance used by all users. This makes it easy to discover and share public User Content, including Data, analyses, tools, methods, and other content. Synapse also supports private project spaces where the individual content contributor controls User Content sharing.
Synapse provides a standard interface to describe User Content, where it comes from, and how to use it. Synapse also provides mechanisms for adding User Content and its descriptions.
Synapse can facilitate sharing User Content stored in many locations, including cloud storage. This allows Synapse to store metadata about the Content, such as annotations, descriptive wiki pages, and provenance, but not the actual data. Currently, Synapse supports files stored in AWS S3 buckets and Google Cloud Storage (see Custom Storage Locations).
Not directly. Synapse helps you manage User Content, including Data, analysis, tools, methods, and results. However, using the programmatic interfaces built into Synapse makes it easy to set up analytical pipelines and ad hoc analyses that interact with Synapse. By default, Synapse uses Amazon’s cloud infrastructure (S3) for storage, making it simple to allocate large compute resources and collocate them next to User Content storage.
Synapse does support users performing analysis on data stored in S3 buckets in Synapse by using the AWS Security Token Service (STS) (see Computing Directly on Data in Synapse In S3).
Anyone 18 years or older may create an account on Synapse. Sage offers different plans for Synapse. These plans have varying restrictions and limitations on user training requirements and allowable user content. We have highlighted a series of research communities currently using Synapse for collaborative work and some open resources hosted in Synapse.
Our Discussion Forum is a great place to reach out to the broader Synapse community to find others that may be interested in a collaboration. Depending on your plan level (Synapse Offerings), you may have access to our help desk.
The Terms and Conditions of Use fully describes the governance terms and conditions of Synapse. In order to register on Synapse, you must review and agree to the terms of the Synapse Awareness and Ethics Pledge. For more information, see Synapse Governance.
Yes, Synapse is released under the Apache 2.0 License. The source code is available on GitHub. Synapse is also offered free of charge as a hosted Software as a Service (SaaS) at https://www.synapse.org/.
Yes, Synapse is built on top of a RESTful service that is automatically documented, including an OpenAPI Specification. In addition, we have purpose-built APIs for Python, R, and Java, as well as a command line interface.
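As a brief illustration of the Python client, the sketch below looks up an entity's name by its Synapse ID. It assumes the `synapseclient` package is installed and that you hold a personal access token. The helper names `is_synapse_id` and `fetch_entity_name` are illustrative only, and the ID check assumes that Synapse IDs take the form `syn` followed by digits.

```python
import re

# Assumption: Synapse IDs look like "syn" followed by digits (e.g. "syn123").
_SYNAPSE_ID = re.compile(r"syn\d+")


def is_synapse_id(value):
    """Return True if the string looks like a Synapse ID."""
    return bool(_SYNAPSE_ID.fullmatch(value))


def fetch_entity_name(synapse_id, auth_token):
    """Log in and return the name of an entity without downloading its file."""
    if not is_synapse_id(synapse_id):
        raise ValueError(f"not a Synapse ID: {synapse_id!r}")
    # Imported here so the ID check above works even without the package.
    import synapseclient  # third-party package: pip install synapseclient

    syn = synapseclient.Synapse()
    syn.login(authToken=auth_token)
    # downloadFile=False fetches only the entity's metadata.
    entity = syn.get(synapse_id, downloadFile=False)
    return entity.name
```

Calling `fetch_entity_name` requires network access and an entity you are permitted to view; the ID check alone runs offline.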
Synapse was developed with the philosophy to encourage collaboration across institutional boundaries and is therefore provided as “Software As A Service” with a single instance used by all users. This makes it easy both to discover new content and share with new collaborators. We do support private project spaces where content sharing is controlled by the individual user. In addition, Synapse has the ability to reference resources that are stored elsewhere. This allows Synapse to store metadata about the content such as annotations, descriptive wiki pages and provenance but not the actual data. Currently Synapse has specific support for files stored at URLs, on SFTP servers, on AWS S3 and arbitrary file servers (see: /wiki/spaces/DOCS/pages/2048327803).
You may browse open issues or file a bug through our Jira tracker system. To file a bug, use the blue “Create” button in the top center of the page. Please be sure to include your email address in your submission so we may follow up with you.
See Getting Started for a breakdown of what you need to get started and how to make the most of Synapse. You may browse the public content catalog and access limited features, but to access most features of Synapse, you must register for a Synapse user account. Before uploading User Content, you will need to complete specific training demonstrating your understanding of the ethical, legal, and technical issues associated with using and sharing User Content and how User Content is managed and shared in Synapse.
You can browse public content in Synapse without registering. However, without an account, you cannot add new User Content to Synapse, download restricted files or tables, or access the most advanced features of Synapse. With an account, you can, among other things, create projects and wikis, download some open content, and request access to controlled User Content. Further, an account lets you collaborate with other Synapse users and create user teams. For more information, see the Account Types page.
Validating your profile is a process where your identity is established through a combination of your profile information, ORCID, and an external credential. Validation increases transparency between researchers and User Content donors. A validated profile is needed for access to specific User Content and is currently required for access to User Content collected through Sage Bionetworks’ research apps. Profile validation instructions can be found in the Settings tab of your Synapse profile page. Click the “Request Profile Validation” link for the required steps.
Accessing Content
This depends on whether the content is public or private. If it is private, make sure your colleague has shared the content with you. Shared content is visible from your “Dashboard” page under the “Shared directly with me” tab. If you favorite the content (using the star), it will appear in your list of favorites, visible from the /wiki/spaces/DOCS/pages/2048557182 or on your /wiki/spaces/DOCS/pages/2055405596.
All public data is queryable. For more information, see /wiki/spaces/DOCS/pages/2667642897, or use the “Search” box in the top right corner of any Synapse page.
Multiple research communities use Synapse to generate data that is released to the public. A description of some of these communities can be found on the Synapse Research Communities Page and public resources page.
You can browse public content in Synapse without registering. However, without an account you cannot add new content to Synapse, nor can you upload or download files or tables. With an account, you can create projects and wikis, download open data and request access to controlled data. Further, an account lets you collaborate with other Synapse users and create user teams. For more information see the /wiki/spaces/DOCS/pages/2007072795 page.
Validating your profile is a process where your identity is established through a combination of your profile information, your ORCID, a signed Synapse Pledge, and an external credential. Validation increases transparency between researchers and data donors. A validated profile is needed for access to specific datasets and is currently required for access to data collected through Sage Bionetworks’ research apps. Profile validation instructions can be found in the Settings tab of your Synapse profile page. Click on the ‘Request Profile Validation’ link to see the required steps.
Adding Content
Synapse makes it easy to share files of any sort, with whomever you choose whether a small group of collaborators or the general public. You may share raw data, summarized data, analysis results, or anything in between. See /wiki/spaces/DOCS/pages/2002846338 for information and instructions on sharing your data.
You must be a certified user to post User Content on Synapse. To become a certified user, you must demonstrate an understanding of your responsibilities for sharing User Content through Synapse, especially data derived from human participants, by completing the required training. These responsibilities include ensuring that Data derived from human participants is de-identified (unless unambiguously authorized in writing) and that all applicable privacy laws and regulations are observed.
To become a certified user, you will need to pass a brief quiz.
No. Use sharing settings to control who can see the content you create. By default, projects and their content are visible only to the user who created them. With the Synapse sharing settings, you can grant other Synapse users, Synapse teams, or the public access to your Project content. You can learn more here: /wiki/spaces/DOCS/pages/2024276030.
It depends. You may not store sensitive information about human subjects in Synapse if you have a Basic Plan. If you have a Self-Managed Plan or a Data Coordination Plan, you may store sensitive information about human subjects in Synapse if authorized. Synapse has an IRB-approved data governance procedure that employs Conditions for Use to allow for the sharing of sensitive data in a controlled manner. You can learn more by reading Sharing Settings, Permissions, and Conditions for Use. If you have questions or would like assistance in applying Conditions for Use to your User Content, please get in touch with the Synapse Access and Compliance Team at act@sagebase.org.
It depends. If you have a Basic Plan, you may only upload user content that is anonymized or user content that is not subject to the data protection laws outside the United States, e.g., GDPR. If you have a Self-Managed Plan or a Data Coordination Plan and have reached an agreement with Sage regarding the storage of overseas data, you may store such data in Synapse.
Synapse stores content in Amazon Web Services, which provides a layer of security measures designed and implemented by Amazon. While Synapse is an open access site, each user has control over who may access their content by using sharing settings.
By default, Synapse stores files in Amazon Simple Storage Services (S3). However, it is possible to set up Synapse to store files in different locations in S3 or Google Cloud Storage. For files stored outside of S3, Synapse can be used to organize, manage, and access files through the use of Synapse annotations to store file-specific metadata. (see: /wiki/spaces/DOCS/pages/2048327803)
Sage Offerings
Sage offers the following service plans: (1) a Basic Hosting Plan, (2) a paid-for Self-Managed Plan, and (3) a paid-for, customized, Data Coordination Plan. The Sage Terms of Service apply to each of these plans unless otherwise noted. Additional governance terms may apply.
Basic Hosting Plan
Included Services
The Basic Hosting Plan is intended for users who want to share small datasets for scientific, educational, research collaboration, and publication (including creating DOIs) purposes. This plan includes:
User Content hosting of up to 100GB of space
Egress capped at 4TB/year
Self-service Project set-up with no direct support from Sage staff
A basic portal landing page (Wiki)
Users can create Projects with administrative control over who is granted access to each Project
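To get a feel for what the 4TB/year egress cap means in practice, the following minimal sketch estimates how many complete downloads of a dataset fit within one year's allowance. It assumes decimal units (1 TB = 1000 GB), which the plan description does not specify, and the function name is purely illustrative.

```python
# Assumption: decimal units (1 TB = 1000 GB); the plan text does not say
# which convention applies.
GB_PER_TB = 1000

BASIC_EGRESS_TB_PER_YEAR = 4  # egress cap stated for the Basic Hosting Plan


def full_downloads_per_year(dataset_gb, egress_tb_per_year=BASIC_EGRESS_TB_PER_YEAR):
    """Return how many complete downloads of a dataset fit within the yearly egress cap."""
    if dataset_gb <= 0:
        raise ValueError("dataset_gb must be positive")
    egress_gb = egress_tb_per_year * GB_PER_TB
    # Floor division: a partial download does not count as a complete one.
    return int(egress_gb // dataset_gb)
```

Under these assumptions, a Project that fills the full 100GB allowance could be downloaded in its entirety 40 times per year, whereas a 25GB dataset could be downloaded 160 times.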
User Content Longevity
Sage will continue to host User Content in a Basic Hosting Plan for as long as the User Content is being viewed or accessed. Sage currently utilizes Amazon Web Services (AWS) Intelligent-Tiering storage in all service plans. Infrequently accessed User Content will be moved to access tiers that require longer retrieval times.
Sage may, at its own discretion and with notice to the Project Administrator, implement cost mitigation strategies, including, for example, moving User Content to lower-cost hosting services or employing fee sharing if such User Content accumulates high egress charges.
Sage reserves the right to delete User Content, including Private Content, if deemed inappropriate or if it conflicts with the Sage Terms of Service or at Sage’s discretion, after 24 months of inactivity (i.e., User Content is not viewed or downloaded). Sage will reach out to the Project Administrator of the User Content at least two times (at the email address provided upon registration) warning that the User Content may be removed if the user does not respond with a proposed use case for the User Content. If the Project Administrator does not respond within 30 days of the second warning, Sage has the right to delete the User Content.
Third-party access to User Content Hosted in a Basic Hosting Plan
Project Administrators are expected to respond to inquiries from Sage or other Synapse users without delay. If Sage receives complaints that a Project Administrator is not responding to inquiries, Sage will assume the Project Administrator’s account is inactive and reserves the right to suspend the Project Administrator’s account.
Self-Managed Plan
Included Services
The Self-Managed Plan includes:
Option 1: User Content hosting of up to 100GB of space; Egress capped at 4TB/year
Option 2: User Content hosting of up to 500GB of space; Egress capped at 20TB/year
Up to 15 hours of consulting services, including for setting up Projects and sharing User Content according to the F.A.I.R. principles and governance
Up to 25 hours of help desk support
Tools for self-managing User Content access requests
A basic portal landing page (Wiki)
Users can create Projects where they have administrative control over who is granted access to each Project and can deploy User Content restrictions by clickwrap agreements.
User Content Longevity
Sage will keep User Content in a Self-Managed Plan for 5 years (or as otherwise negotiated). Sage utilizes Amazon Web Services (AWS) Intelligent-Tiering storage in all of the service plans. Infrequently accessed User Content will be moved to access tiers that require longer retrieval times.
After the Self-Managed Plan expires, the Self-Managed Plan account holder will have 3 months to retrieve or make other arrangements with regard to their User Content or to renew the Self-Managed Plan. After this 3-month grace period, if the account holder does not retrieve or make other arrangements with regard to their User Content or renew the Self-Managed Plan, Sage reserves the right, at its sole discretion, to archive, freeze, delete, or move the User Content. For example, if the User Content meets the criteria, Sage may move the User Content to a Basic Hosting Plan or to the AWS Open Data (or a similar) program.
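The timeline described above can be sketched with simple date arithmetic. This is an illustration only: it assumes the term and grace period are counted in calendar months from a known plan start date, which the text does not state, and the function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` after `start`, clamped to the last day of the month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def self_managed_timeline(plan_start, term_years=5, grace_months=3):
    """Return (plan_end, grace_end) for a Self-Managed Plan.

    The defaults follow the stated 5-year term and 3-month grace period;
    a negotiated term can be passed via term_years.
    """
    plan_end = add_months(plan_start, 12 * term_years)
    grace_end = add_months(plan_end, grace_months)
    return plan_end, grace_end
```

For a plan starting on March 4, 2024, this gives a plan end of March 4, 2029 and a grace period ending on June 4, 2029.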
Third-Party Access to User Content Hosted in a Self-Managed Plan
Project Administrators are expected to respond to inquiries from Sage or other Synapse Users without delay. If Sage receives complaints that a Project Administrator is not responding to inquiries, Sage will try to reach out to such Project Administrator, and to the institution associated with the Self-Managed Plan account. If the Project Administrator and institution are not responsive, Sage reserves the right to suspend the Self-Managed Plan account. Sage will not provide a refund if a Self-Managed Plan is suspended for failure to meet service level expectations.
Data Coordination Plan
Included Services
The Data Coordination Plan offers customized, end-to-end management of User Content supporting collaborative, multi-institutional research consortia. This plan includes:
Customized consulting services, and data curation, harmonization, and validation services
Expert resources to localize data policies and governance controls to your particular country’s requirements as applicable
Services to create a customized and feature-rich data exploration portal for your coordination center that includes links to sophisticated computational environments
User Content Longevity
The longevity of Sage’s hosting of User Content will be mutually agreed upon by the account holder of the Data Coordination Plan and Sage.
Because Data Coordination Plans are typically customized to meet the needs of a particular proposal, they are subject to additional contractual documentation.
Basic Hosting Plan: Access to the Service is provided free of charge except in cases where users request additional services not included in the Basic Hosting Plan. Sage reserves the right to initiate fees for the Service, or portions thereof, at any time, by providing the user 30 days’ prior written notice via Synapse or via the email provided during registration. If users do not wish to pay such fees, they can remove their User Content and terminate their account. Continued use of the Service after 30 days may trigger fee payment obligations.
Self-Managed and Data Coordination Plans: These plans require users to pay fees. All fees are in U.S. Dollars and are non-refundable unless specified in the Sage Terms of Service or separate agreements.
Additional Services: If users need additional services that extend beyond those specified under their respective plans, Sage may provide such services at an additional cost.
The NIH Data Management and Sharing Policy (DMSP), which went into effect on January 25, 2023, applies to all research funded or conducted in whole or in part by NIH. The goal of the policy is to promote the sharing of scientific data, which can accelerate biomedical research discovery, enable validation of research results, and provide accessibility to high-value datasets.
Under the policy, NIH expects that investigators and institutions will:
Plan and budget for the managing and sharing of data
Submit a Data Management and Sharing (DMS) plan
Comply with the approved DMS plan and annually report on its implementation
The DMSP provides guidance on how to manage and share data in a responsible and ethical manner to help accelerate biomedical research discovery and improve human health.
How can Sage help? In most cases researchers must address the NIH DMSP when submitting their grant/funding proposal. For the Self-Managed and Data Coordination Plans, Sage can help:
Develop a data management and sharing plan that meets the requirements of the NIH Data Management and Sharing Policy.
Develop an appropriate budget for the data management and sharing plan that meets the F.A.I.R. principles.
Provide guidance on how to store and secure the data at a level appropriate for its sensitivity.
Promote the responsible sharing of data for scientific, education, and research purposes.
How to get started? Sage’s expertise makes it easy to create an NIH Data Management and Sharing plan and budget. Sage can provide budgeting quotes and text that researchers can use directly in their NIH Data Management and Sharing plan. To start, Contact Us.
If you have a Self-Managed or Data Coordination Plan, Sage can help you prepare materials for your data management and sharing plan and budget. We provide quotes for services for use in your budget and standardized text for your plan. To start, Contact Us.
For Basic plan users, by default, Synapse stores User Content in a shared bucket managed by Sage in Amazon Simple Storage Services (S3) in AWS US-EAST-1. Users can customize the location by providing Custom Storage.
For Self-Managed and Data Coordination Plans, User Content can be stored in individual buckets managed by Sage in Amazon Simple Storage Services (S3) in AWS US-EAST-1. Users can customize the location by providing Custom Storage.
No. Accounts and projects existing as of March 4, 2024 will continue under the terms of their previous agreement with Sage.
You can mint DOIs for projects and files uploaded to Synapse, which is extremely useful for citing data. Learn More
These are the general storage limitations for each plan: up to 100GB for Basic Hosting Plan and up to 500GB for the Self-Managed Plans. You may pay for additional storage if needed. Please see our Sage Offerings page for more information. There are no data limits if you use Custom Storage in your own cloud bucket, and depending on the dataset uploaded, storage may be sponsored by an existing project. Contact us to learn more.
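As a rough aid for comparing these limits against a dataset size, the sketch below lists the hosting options whose stated storage limit would cover a given dataset. The dictionary keys and function name are illustrative, and paid additional storage and sponsored storage are deliberately not modeled.

```python
# Storage limits in GB as stated in the text; key names are illustrative only.
PLAN_STORAGE_LIMITS_GB = {
    "Basic Hosting Plan": 100,
    "Self-Managed Plan": 500,
}


def plans_that_fit(dataset_gb):
    """List the hosting options whose stated storage limit covers a dataset of this size."""
    if dataset_gb < 0:
        raise ValueError("dataset_gb must not be negative")
    options = [
        name
        for name, limit_gb in PLAN_STORAGE_LIMITS_GB.items()
        if dataset_gb <= limit_gb
    ]
    # Custom Storage in the user's own cloud bucket carries no data limit.
    options.append("Custom Storage")
    return options
```

For example, `plans_that_fit(300)` returns the Self-Managed Plan and Custom Storage, while a 2000GB dataset leaves Custom Storage as the only option in this simplified model.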
You may create a User Project (or Workspace) where you have administrative control over who is granted access to the workspace. However, you cannot implement specific User Content restrictions for your User Content under a Basic Plan. You must have a Self-Managed Plan or Data Coordination Plan to implement such User Content restrictions.
As a non-profit organization, Sage strives to keep our services accessible and affordable for all users. Maintaining a reliable and secure cloud storage infrastructure comes with significant costs, particularly when it comes to data egress. These costs can quickly add up, especially as the volume of data downloaded increases.
To address this challenge, we implemented measures to monitor and control data downloads. By setting reasonable limits on data egress, we can manage our costs and ensure the long-term viability of our platform for the benefit of all users.
In general, we advise storing and processing user content in the cloud. Beyond cost efficiency, this provides greater data security, improved accessibility, faster insights, and better governance controls.
There may be exceptions to these standard egress limits under the following circumstances:
When a Synapse plan is used in combination with a user’s Custom Storage location. Under these circumstances, the user has greater ownership and control of their files, and storage and egress costs are the responsibility of the user.
When the Synapse plan user qualifies as a non-profit or newly established start-up entity. Under these circumstances, the user may be eligible for a discounted plan. This is judged on a case-by-case basis.
Sage offers integrations with multiple cloud compute environments for customized Data Coordination Plans. Synapse is open source with standard interfaces and programming models (see API Clients and Documentation).
You can manually add Synapse users to, or remove them from, your project spaces in the Basic Plan. In the Self-Managed Plan, Sage can set up User Content restrictions and require accessors to record their agreement to comply with such restrictions through a clickwrap agreement prior to accessing the User Content. For the Data Coordination Plan, we offer a complete data access committee service, managed either by you or by Sage, to review and administer requests for controlled-access data.
After the Self-Managed Plan expires, the Self-Managed Plan account holder will have 3 months to retrieve the User Content or to renew the plan. After this 3-month grace period, Sage reserves the right to delete the User Content. For the Data Coordination Plan, the longevity of Sage’s hosting of User Content and any transition period will be mutually agreed upon by the account holder of the Data Coordination Plan and Sage.
Sage offers bespoke portals for customized Data Coordination Plans. For the Basic and Self-Managed Plans, you can create your own lightweight portal directly in Synapse.
“Deidentified Data” refers to Personal Data (1) from which all directly identifiable elements (e.g., name, street address, date of birth, government identity number, etc.) are removed and in which the individual is solely identified by a random, unique reference number or code that is not derived from or related to the individual’s personal information; (2) that are provided, stored, and transmitted separately from the key that makes reidentification possible; and (3) that are subject to a binding contractual or other legal obligation not to attempt to reidentify the Data except as authorized by law.