Onboarding Procedures

To prepare for a research team’s data contribution, there are several steps the Data Curator will perform:

  • Open GitHub issue (View)

  • Add grant information to Projects table

  • Coordinate an Onboarding call between:

    • Principal Investigators

    • Data Liaison and Uploader

    • DCC Curator and Director

1. MODEL-AD: Onboarding Meeting

Onboarding Meeting Agenda

  • Tour AD Knowledge Portal

  • Discuss Study Organization

  • Discuss Roles & Responsibilities (sites, individuals, and timeline)

  • Discuss the Planned Data Contribution

    • Experimental Tools

    • Metadata Templates & Validation

    • Data (assays performed, data types)

  • DCC Promotion

    • Newsletter

    • Manuscript Pages

SUBJECT: MODEL-AD: Onboarding

I am contacting you on behalf of the MODEL-AD Data Coordination Center (DCC), managed by Sage Bionetworks, regarding your grant <GRANT_NUMBER>. MODEL-AD consortium members are required to share data according to the agreed-upon grant milestones. The DCC manages the data shared among consortium members and the public release of data, analyses, and tools through the AD Knowledge Portal.

  1. To get started, PIs and team members must register for a Synapse account.

    1. Register here: https://www.synapse.org/#

    2. Note the Synapse terms of use: https://docs.synapse.org/articles/governance.html

    3. Profiles should include your full name, institution, photo, and a brief bio. See example.

    4. MODEL-AD team access requires that PIs create a Synapse account and submit their username through this form.

  2. Submit the Data Contribution form with your study information. Please email me your grant's data sharing plan, including yearly milestones. Please send this information before our onboarding call so we can clarify any remaining questions and prepare the required infrastructure to securely receive and store your data.

  3. I will send you a Doodle poll to schedule an onboarding call (1 h) to discuss the data you expect to contribute, the timeline of the contribution, and the expectations involved in data transfer and sharing. Please feel free to extend the invitation to any members of your team.

2. Following the Onboarding Meeting

  • Governance Team will contact PIs to review contributions and addendum signing

  • Create a Jira ticket for Governance to perform Attachment process

  • PI will complete the Data Contribution form

  • Curator will schedule Metadata Templates and Validation Meeting

  • Update Grant Outputs table with details from data survey

SUBJECT: MODEL-AD Onboarding – Data Submission – Follow-up

Here is an overview of the expected data contributions and timelines for your study:

  • Program: MODEL-AD

  • Study: <STUDY_NAME>

  • Data:

    • Assay_a -- Data type, file format, # of samples

    • Assay_b -- Data type, file format, # of samples

Please reply to this email with the below information:

  1. A short phrase describing your study, and a study abbreviation 

    • We use the abbreviation to annotate all content associated with the study. Examples:

      • The Mount Sinai Brain Bank (MSBB) study

      • The Mayo RNAseq Study (MayoRNAseq)

  2. The name and contact info of a team liaison

    • This person will be my main contact for the team and will triage tasks. This does not have to be the person responsible for data upload, and ideally should not be the PI.

    • A list of the people on your team you want attributed in the portal as having contributed to your grant(s). 

      1. We do this by linking to people’s Synapse profiles (Synapse is the AD Knowledge Portal’s data management system). To create a Synapse profile, see: https://www.synapse.org/#!RegisterAccount:0

      2. Ideally, Synapse profiles should include a full name, a photo, and an affiliated institution. People can also include a bio (or a link to one) if they wish. Those whose profile contains a photo are featured on the front page of the Portal; all members are listed under Explore - People regardless of profile completeness.

    • In addition to the AD Knowledge Portal, we manage a private collaboration space within the Synapse platform. Access to this space is limited to members of the funded AD Knowledge Portal Programs, and it is a place for distributing meeting materials, information about working groups, etc. Please indicate which of your team members you would like to have access to the AD consortia private space/resources. We will add them to a Synapse team that provides access.

  3. Write an acknowledgment statement for your data

    • Users of the data will be requested to use this statement in publications.

    • Example:  "These data were generated by Kristen Brennand, a New York Stem Cell Foundation - Robertson Investigator. This work was supported by the Brain and Behavior Research Foundation, NIH grant R01 MH101454 and the New York Stem Cell Foundation."

    • The template for writing the acknowledgment statement is here: https://www.synapse.org/#!Synapse:syn25014532

    • The form to submit the completed statement is here: https://www.synapse.org/#!Synapse:syn25051271

  4. If you have a manuscript being prepared, please fill out this form detailing the data used in the manuscript so we can begin setting up the infrastructure for the data: https://docs.google.com/forms/d/e/1FAIpQLSeQjHkj72iZuwMzwhmXXwi7fhK4d57skDHwrjC5Kaldr4DHVw/viewform

    1. The NIA requests that AD Knowledge Portal funded partners include a specific data availability statement in manuscripts based on use of data from the portal:

      1. <data, analysis output, tools (describe content)> are available via the AD Knowledge Portal (https://adknowledgeportal.org/). The AD Knowledge Portal is a platform for accessing data, analyses, and tools generated by the Accelerating Medicines Partnership (AMP-AD) Target Discovery Program and other National Institute on Aging (NIA)-supported programs to enable open-science practices and accelerate translational learning. The data, analyses and tools are shared early in the research cycle without a publication embargo on secondary use. Data is available for general research use according to the following requirements for data access and data attribution (https://adknowledgeportal.org/DataAccess/Instructions).
        For access to content described in this manuscript see: <manuscript landing page DOI>

      2. The AD-DCC will provide you with a DOI to use in the manuscript as a data reference.

In addition, you will receive a second email shortly introducing you to our data sharing process and the Sage Bionetworks Governance Team contacts.

In the meantime, here are some ways we can get you involved in the community. Contact Zoe Leanza (zoe.leanza@sagebase.org) for more information on how we can help you:

  • spread the word about recent publications or news releases featuring your research

  • promote open positions you’re looking to fill

  • feature a member of your team in the quarterly newsletter 

  • present a webinar to discuss the science behind the data you are contributing

  • get feedback about the usage of your data

If you have any questions, please let me know!

 

3. Metadata Validation

SUBJECT: MODEL-AD Metadata Validation

This is the follow-up to the dccvalidator and template training. Here's a summary of what we went over, and some tips/things to look out for when you are creating your metadata.

Documentation:

  • Study description

  • Assay description

  • Acknowledgment Statement

Please submit these documents through this form: https://www.synapse.org/#!Synapse:syn25051271

  • Links to the templates for writing these are at the top of the form

  • You don't need to submit everything at once, and you can submit as many times as you want

  • If there are other team members that you want to be linked to the grant, have them create a Synapse account and submit their Synapse name through the form. I'll link them to the grant.

Data Submission Process Summary:

  1. You fill out your templates and manifest and validate them in the dccvalidator

  2. Continue to validate your files until there are no more failures (the red section)

    1. This is often a round of collaboration between you, me, and our data curator. The validator's failure messages can be cryptic, so email me if you run into problems or have questions.

    2. If you want to find any controlled values, for example, a platform, you can use our data dictionary - the main search box is in the upper right-hand corner: https://www.synapse.org/#!Synapse:syn20729790

  3. Once the files have passed validation, email me. Our data curator will do some manual curation, and then I'll give you the file location and permission to start the data and metadata upload.

Here is the link to the dccvalidator: https://www.synapse.org/#!Synapse:syn25878247

Here is the link to the templates: https://www.synapse.org/#!Synapse:syn18512044

  • MAKE SURE to check to see if the column uses controlled values, and use the values provided. If you don't see the value you need, let me know and we can add it. (the data dictionary for looking up values is here: https://www.synapse.org/#!Synapse:syn20729790 )

  • If the value you need isn't in the list, let me know. We can add it!

  • You'll need one individual human template, one biospecimen template, and one template for each type of assay you have.

  • Save the individual, biospecimen, and assay files as .csv

  • Save the manifest file as a .tsv

  • I'll be sending you synIDs for the "Parent" column in the manifest

  • For cells with no values, leave them blank - do not use "null" or "NA"

  • Do not delete any empty columns

  • Be careful when you drag-fill a column in the Excel templates to repeat the same value across many cells (for example, the grant number): Excel tends to increment the number in each cell

  • Validate all of your files at the same time - it does some cross-checking across files
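
Two of the tips above (blank cells instead of "null" or "NA", and incremented values from drag-fill) can be checked locally before running the dccvalidator. The following is a minimal Python sketch using only the standard library; it is not part of the dccvalidator and does not replace it. The function names and the sample column names (individualID, grant, sex) are illustrative assumptions, not taken from the actual templates.

```python
import csv
import io
import re

# Literal placeholders that should be left blank instead (per the tips above).
PLACEHOLDERS = {"na", "null"}


def find_placeholder_cells(rows):
    """Return (row_number, column) pairs whose value is a literal "NA" or "null"."""
    hits = []
    for row_number, row in enumerate(rows, start=2):  # row 1 is the header
        for column, value in row.items():
            if value is not None and value.strip().lower() in PLACEHOLDERS:
                hits.append((row_number, column))
    return hits


def find_drag_fill_drift(rows, column):
    """Return True if the column's values share a prefix and end in a number
    that goes up by exactly one on every row, the pattern drag-fill produces
    (for example U54AG123456, U54AG123457, ...)."""
    values = [row[column] for row in rows if row.get(column)]
    matches = [re.fullmatch(r"(.*?)(\d+)", value) for value in values]
    if len(matches) < 2 or not all(matches):
        return False
    prefixes = {m.group(1) for m in matches}
    numbers = [int(m.group(2)) for m in matches]
    steps = {later - earlier for earlier, later in zip(numbers, numbers[1:])}
    return len(prefixes) == 1 and steps == {1}


# Hypothetical metadata file: the grant column has drifted and one cell uses "NA".
SAMPLE = """individualID,grant,sex
m1,U54AG123456,female
m2,U54AG123457,NA
m3,U54AG123458,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(find_placeholder_cells(rows))         # [(3, 'sex')]
print(find_drag_fill_drift(rows, "grant"))  # True
```

The drift check takes a column name because ID columns that are meant to count upward would otherwise be flagged; run it only on columns that should hold a single repeated value.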

Please let me know if you have any questions!

 

4. Uploading Data to Synapse

SUBJECT: MODEL-AD - Study - Uploading Data

Here is an overview of the expected data contributions and timelines for your study. Please let me know if you have questions.

Study Information

Program: MODEL-AD

Grant: U54AG123456 

Study: Site_Study

Abbreviation: Study_Abbrev

Staging: URL_here

Transfer: Data will be ready for upload during January 2022

Metadata

Data

Upload

The web interface works well for quickly transferring a few files, but the Synapse client should be used for uploading files in bulk. 
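
As a rough illustration of a bulk upload with the Synapse Python client, the sketch below writes a tab-separated manifest for every file in a local folder and then passes it to `synapseutils.syncToSynapse`. It assumes the `synapseclient` package is installed and credentials are already configured. The folder name, the `syn00000000` ID, and the helper names are placeholders; the real Parent synIDs come from the curator, and the DCC manifest template may require more columns than the two shown here.

```python
import csv
from pathlib import Path


def build_manifest(data_dir, parent_synid, manifest_path):
    """Write a .tsv manifest with one row per file directly inside data_dir.

    Keep manifest_path outside data_dir so the manifest does not list itself.
    Returns the number of files listed.
    """
    files = sorted(p for p in Path(data_dir).iterdir() if p.is_file())
    with open(manifest_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["path", "parent"])  # columns read by syncToSynapse
        for path in files:
            writer.writerow([str(path.resolve()), parent_synid])
    return len(files)


def upload_manifest(manifest_path, dry_run=True):
    """Check the manifest (dry_run=True) or upload the files it lists."""
    # Imported here so build_manifest works without the client installed.
    import synapseclient
    import synapseutils

    syn = synapseclient.login()  # relies on cached or configured credentials
    synapseutils.syncToSynapse(syn, str(manifest_path), dryRun=dry_run)


# Example usage (placeholder values; requires network access and permissions):
# build_manifest("rnaseq_files", "syn00000000", "manifest.tsv")
# upload_manifest("manifest.tsv", dry_run=True)   # validate first
# upload_manifest("manifest.tsv", dry_run=False)  # then upload
```

Running with `dry_run=True` first lets the client check the manifest before any file is transferred.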

Documentation