Term

Definition

Discussion/Guidance

Details

References

1

Access Approval/Rejection

When a Synapse user submits an access request to a Managed AR, their request can either be approved or rejected by ACT.

Once ACT approves or rejects an access request, an email is sent to the access request submitter. The rejection email links the submitter to the AR in Synapse so that they can resubmit their access request, and it explains what they need to update in their access application for it to be approved. Note that automatic approval and rejection emails are sent only to the request submitter, not to everyone included in the access request.

2

Access Renewal

Some Managed ARs require request submitters to renew their data access after a specified time interval.

Automated Synapse emails are sent to request submitters before their access is set to expire, and if the user does not resubmit an access request application by the expiration date they will lose access to the respective data. Most access renewal periods are yearly, but this field can be customized by ACT during the Access Requirement setup. During the renewal, users should update their Intended Data Use statement and their list of Data Requestors from their institution that need access to the data (note: Data Requestors should be updated in both the access application and in the DUC if applicable).

3

Access Requirement (AR)

An Access Requirement (AR) is a data use restriction set up by ACT that defines conditions for access to a Synapse entity.

Managed ARs require a Data Access Committee (DAC) to approve the access request before users can obtain access to the Synapse entity. Other ARs include "click-wraps", which do not require DAC approval and only require users to read data use conditions and click "I accept" before obtaining access. ARs can

Access Renewal

Resubmission of accessor(s) Synapse access request to enable continued access to data governed by a Managed AR with an access expiration period.

*Applicability: Synapse

Access renewal settings, including specified intervals, are set when an access requirement is set up by ACT. Renewals and their intervals are determined as part of the Conditions of Use for the data.

Automated Synapse emails are sent to request submitters 2 months and 1 month before their access is set to expire. If the user does not resubmit an access request application by the expiration date they will lose access to the respective data. Most access renewal periods are yearly, but this field can be customized by ACT during the Access Requirement setup.
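
The reminder schedule described above can be illustrated with a short Python sketch. It is not Synapse's implementation: the function names are hypothetical, "month" is assumed to mean a calendar month (with the day clamped in shorter months), and access is assumed to lapse the day after the expiration date unless a renewal has been submitted.

```python
import calendar
from datetime import date


def months_before(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    Assumption: the day is clamped to the last day of shorter months
    (for example, 31 March minus one month gives 28 or 29 February).
    """
    index = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def reminder_dates(expiration: date) -> list[date]:
    """Dates of the 2-month and 1-month reminder emails (hypothetical helper)."""
    return [months_before(expiration, 2), months_before(expiration, 1)]


def has_access(expiration: date, today: date, renewed: bool) -> bool:
    """Assumption: access lapses the day after expiration unless renewed."""
    return renewed or today <= expiration


if __name__ == "__main__":
    for reminder in reminder_dates(date(2025, 3, 31)):
        print(reminder.isoformat())
```

The actual renewal interval and reminder timing are configured by ACT per Access Requirement, so the dates here are illustrative only.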

2

Access Requirement (AR)

A data restriction, or lock, applied by ACT to a Synapse entity (such as a folder, file, project, or team) according to the data contributor’s established Conditions of Use. It defines the requirements that a user must meet in order to be allowed access to the entity.

*Applicability: Synapse

ARs are applied to controlled-access data and may be applied in the form of a Managed AR and/or a Click-wrap. A Managed AR requires a user to submit an Access Request. Access Requests may be reviewed and approved by members of the Synapse Access and Compliance Team (ACT) or a Data Access Committee (DAC). A Click-wrap requires a user to read data terms and conditions and click "I accept" before obtaining access. Click-wraps do not require ACT or DAC approval. ARs can be set up for projects, folders, tables, and teams.

4

3

Access Request

An

access request is when a Synapse user submits a data access application to a Data Access Committee (DAC). Once an access request is approved, the user will gain access to one or more Synapse entities. A user can submit one access request

electronic application submitted via Synapse by a user seeking permission to access controlled-access data protected by a Managed AR. The user must fulfill the terms and conditions of the AR, and the request must be reviewed and approved by either the Access and Compliance Team (ACT) or a Data Access Committee (DAC). An Access Request may be submitted by a single user on behalf of several collaborating Synapse users at their institution.

5

Access Request Submitter

An Access Request Submitter is a Synapse user that submits an access request.

Multiple data requestors can be included within an access request, but each request can only have one submitter via Synapse. This person is the only user that receives approval/rejection emails generated from a request.

6

Acknowledgement Statement (Attribution)

A statement that a Data Contributor requires data recipients to include in publications, talks, presentations, etc. Its purpose is to ensure the Data Contributor (and any other relevant bodies, such as participants or funders) are recognized for their efforts surrounding the data.

Acknowledgement Statements are often included within the project wiki page or within a click-wrap agreement.

7

ACT (ACT Team)

The Access and Compliance Team (ACT) is a governance subteam that has special governance administration privileges on Synapse, allowing it to process access requests, create and manage ARs, validate user profiles, etc.

Current ACT members are Emily Lang and Hayley Sanchez.

8

AD Umbrella AR

This is an AR that governs most data within the AD Knowledge Portal.

When users gain access to this AR, they are able to access most data within the AD Knowledge Portal (exceptions: AddNeuroMed data, Exceptional Longevity data)

9

Anonymous Access Tier

A category of Synapse data that is available for anyone on the web without requiring them to fulfill Conditions for Use.

This data must be made available for anonymous download by Sage Engineering.

10

Anonymous Journal Review

A process by which journal reviewers anonymously access manuscript data in Synapse in order to evaluate it for publication.

Please file a Governance Jira ticket using the "Request Anonymous Reviewer Accounts" component if you wish to generate anonymous accounts for a journal review.

11

Certified User

Synapse users that have registered for an account and completed the certification quiz.

Certified users have access to full Synapse functionality, including the ability to upload files and tables as well as create folders. To become certified, users must take a short quiz about the Synapse Commons Data Use Procedure.

12

Click-wrap

A type of AR that can be satisfied by a user by selecting "I accept the terms of use".

Click-wraps generally contain Terms and Conditions of data use and often contain an Acknowledgement Statement.

13

Conditions of Use

A set of expectations and/or terms for data access applied to Synapse content.

Conditions of Use typically are structured to comply with the terms under which the data were collected or with other human subjects regulations. Data Contributors collaborate with ACT to set up Conditions for Use.

14

Controlled Access Data Tier

A category of Synapse data that is available to registered, certified, or validated users that

*Applicability: Synapse

Once the designated request reviewer - either Access and Compliance Team (ACT) or Data Access Committee (DAC) - issues an approval or rejection of an Access Request via Synapse, an email is generated and sent to the submitter of the Access Request.

If the Access Request is approved, the approval email will alert the submitter that they now have access to the Synapse entities associated with the Managed AR for which the request was submitted.

If the Access Request is rejected, the rejection email will include notes from the reviewer explaining the reason for rejection and guidance for successful resubmission, and it will direct the submitter to the Managed AR in Synapse to resubmit their Access Request. Approval/rejection emails are sent only to the user who submitted the Access Request and not to other users who may also have been listed on the Access Request.

4

Access Request Submitter

A Synapse user who submits an Access Request via Synapse for access to controlled-access data protected by a Managed AR. The Access Request Submitter may submit the Access Request on their own behalf only or may also list additional collaborating Synapse users from a single institution.

*Applicability: Synapse

A single Access Request will have a single submitter via Synapse who completes and submits the Access Request and is the only Synapse user who will receive approval/rejection emails generated for the Access Request. Multiple collaborating Synapse users from a single institution may be included by the submitter for data access through a single Access Request, but these additional users are not considered Access Request Submitters and will not receive approval/rejection emails generated for the Access Request.

5

Acknowledgement Statement

A statement set forth by a Data Contributor to be used by data recipients to include in publications, talks, presentations, etc., to ensure the Data Contributor (and any other relevant bodies, such as participants or funders, or Sage Bionetworks) are recognized for their efforts surrounding the data.

*Applicability: Synapse

Acknowledgement Statements are usually posted on the project wiki page or directly in a click-wrap agreement.

6

Access and Compliance Team (ACT)

A Sage Governance sub-team that has Synapse administration privileges enabling members to process access requests, create and manage ARs, validate user profiles, escalate data incidents and other violations of the Synapse Terms and Conditions of Use, and take other administrative actions for governance purposes.

*Applicability: Synapse

7

Access Tiers

A categorization used to designate the level of restriction that should be applied to data based on factors such as risk of identifiability or limitations on use.

*Applicability: Synapse/General

Access Tiers are defined by Governance in a manner appropriate to each individual study. Terms used to describe access tiers include “Open/Anonymous/Whitelisted”, “Registered,” “Restricted,” “Controlled,” and “Controlled-Plus.”

  • Open/Anonymous/Whitelisted: data that is available for anyone on the web without requiring them to fulfill Conditions of Use

  • Registered: data that is available to registered users of Synapse

  • Restricted/Controlled: data that is available to registered users of Synapse who fulfill specific requirements for data access, such as submitting an Intended Data Use statement,

obtaining IRB approval,
or other prerequisites.

Please file a Governance Jira ticket using the "Add, Edit, or Remove Synapse Access Requirement/Click-wrap" component if you wish to categorize your data in the Controlled Access Data Tier

15

Creative Commons Licensing

A Creative Commons license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted "work".

This is required for most data in the Open Access Data Tier.

16

Data Access Committee (DAC)

An individual or group that approves or rejects data access applications.

The ACT serves as the DAC at Sage.

17

Data Contributor

The individual or group that provides data to Synapse.

For Synapse communities, a DTA or other agreement is required before a Data Contributor can upload their data.

18

Data Requestors

All users listed on a data access request, including the request submitter.

The list of Data Requestors within a Data Access Application should exactly match the list of Data Requestors within the submitted DUC, if applicable for the specific AR.

19

Data Transfer Agreement (DTA)

An agreement that permits a Data Contributor to provide data to Synapse.

DTAs are required for institutions contributing data to a Synapse community, or for institutions that are having Sage manage data access for them. Note that a grant or other agreement such as a Data Use Agreement (DUA) can take the place of a DTA as long as it is signed by an institutional signing official and mentions that data will be stored in a repository.

20

Data Subject/Human Subject/Research Participant

GDPR Definition: Identified or identifiable living individual to whom personal data relates.
HIPAA Definition: The living individual about whom an investigator conducting research obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.

21

Data Use Certificate (DUC)

DUC is a physical agreement that the data requestor needs to sign. The agreement outlines terms of use for accessing the dataset on Synapse, and it's usually signed by a Signing Official.

Managed ARs can be set up so that a DUC is required for data access.

22

Federated Query Governance Structure

Data are housed in a variety of locations, and users are able to query those local data simultaneously. Access is typically restricted to pre-configured queries (rather than data exploration) and may require registration before use.

23

General Data Protection Regulation (GDPR)

Rules and privacy regulations governing European data

24

Health Insurance Portability and Accountability Act (HIPAA)

US health information privacy regulations

25

HIPAA Limited Data Set

Data that excludes all PHI (as defined by HIPAA) except for at least one of the following:

  • dates such as admission, discharge, service, DOB, DOD;

  • city, state, five digit or more zip code; and

  • ages in years, months or days or hours.

HIPAA-limited data should always be categorized in the Controlled Access Data Tier

26

Informed Consent

An agreement that data subjects must sign before participating in a research study

Informed consents often help to establish Conditions for Data Use within Synapse

27

Intended Data Use Statement (IDU)

A description of the research purpose for using requested Synapse data.

IDUs can be required to access certain data via a Managed AR. They are often posted publicly on Synapse wiki pages.

28

Institutional Review Board (IRB)

A committee that applies research ethics by reviewing the methods proposed for research to ensure that they are ethical.

IRB approval can be required to access certain data via a Managed AR.

29

Managed Access Requirement

An Access Requirement that requires data access to be granted via a Data Access Committee (DAC).

ACT often implements Managed ARs on data categorized in the Controlled Access Tier

30

Model-to-Data Governance Structure

Data are held by a steward who is responsible for running algorithms on the behalf of researchers. In some cases, a synthetic version of the data may be released openly to facilitate model training. Researchers develop algorithms, send them to the steward, and receive back output of their analysis as run on the real dataset. The variety of analyses that may be performed is restricted by this structure, because the data steward must ensure data are specifically curated for any analytical question at hand

31

Open Access Data Tier

A category of Synapse data available to all registered Synapse users without use limitations

Sensitive data should not be included in this category.

32

Open Source Governance Structure

Data are distributed for reuse with a license defining reuse rights and conditions. The creator is in charge of the negotiation at first (choice of license), but then rights to analyze and redistribute are permanently transferred to the user.

This governance structure is typical of a centralized project in the sciences, e.g., the Human Genome Project

33

Pairwise Governance Structure

Two parties agree to work together on and/or share a data set in some fashion, typically with a closed contract or an informal agreement. The negotiation terms depend on the relative status of the parties and/or the value of the data and knowledge.

34

Private Access Tier

A category of Synapse data only available to the Data Contributor (i.e. Project Administrator) and other users that they specify in the entity's Sharing Settings.

Often, Private Data is managed via sharing through Synapse Teams.

35

Registered User

Synapse users that have successfully created an account and agreed to the Synapse Pledge.

Registered users can create projects and wikis. They can collaborate with other registered users and create Synapse teams. Registered users can also download publicly available data and, if they fulfill the Conditions for Use, they can also access controlled data.

36

Sensitive Data

Data that must be protected from unauthorized access to safeguard the privacy or security of an individual or organization.

“De-identified” data (maintained in a way that does not allow association with a specific person) is not considered sensitive.

37

Sharing Settings

A setting on the Synapse platform that enables a Project Administrator to define with whom a project or entity may be shared

Within Sharing Settings, Project Administrators can grant users view, download, edit, edit/delete, and administrator access

38

Signing Official

An employee affiliated with the respective organization who has oversight authority over the research study or data collaboration

A DUC or DTA may require an Institutional Signing Official's signature to validate the document.

39

Teams

Multiple Synapse users accepted into a group

Teams can be used to share Synapse entities to multiple users at once. Access Requirements can be implemented on Synapse teams or directly on Synapse entities

40

Validated User

Synapse users that have submitted identity attestation information to the ACT and have had their identities confirmed

Users are required to be validated in order to access certain data in the Controlled Access Data Tier (ex: mHealth data).
  • and/or undergoing Profile Validation.

  • Controlled-Plus: data that is restricted/controlled and is sensitive enough that additional prerequisites are required such as submitting an IRB approval letter or other institutional documentation.

8

Aggregate Data

Data produced by grouping information into categories and combining values within these categories.

*Applicability: General

Also known as tabular data or macrodata. Often presented in tables. Since aggregate data is the combination of individual-level data, the term is often used to describe data from which it is “less easy” to identify individual subjects; however, disclosure risks can arise if a user can access multiple tables containing common data elements. Data reduction treatments (such as combining categories so sample sizes within categories represent a larger n) and data modification treatments (such as rounding or adding perturbations so the potential for re-identification is reduced) are example methods that can be applied to aggregate data as part of a robust data privacy strategy.

Definition Source: Data Confidentiality Guide, Australian Bureau of Statistics
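
As a rough illustration of the two treatments named above, the following Python sketch aggregates individual-level values into category counts, combines small categories (data reduction), and rounds counts to a base (data modification). The function names, the minimum cell size, and the rounding base of 5 are assumptions chosen for the example, not values prescribed by this glossary or its source.

```python
from collections import Counter


def aggregate(values):
    """Group individual-level values into per-category counts."""
    return dict(Counter(values))


def combine_small_categories(counts, min_cell_size, other_label="Other"):
    """Data reduction: fold categories below `min_cell_size` into one bucket."""
    combined = Counter()
    for category, n in counts.items():
        target = category if n >= min_cell_size else other_label
        combined[target] += n
    return dict(combined)


def round_to_base(counts, base=5):
    """Data modification: round each count to the nearest multiple of `base`.

    Integer arithmetic is used so halves always round up.
    """
    return {c: base * ((n + base // 2) // base) for c, n in counts.items()}


if __name__ == "__main__":
    raw = ["A"] * 12 + ["B"] * 2 + ["C"] * 1 + ["D"] * 7
    counts = aggregate(raw)
    reduced = combine_small_categories(counts, min_cell_size=5)
    print(reduced)
    print(round_to_base(reduced))
```

Note that the combined bucket can itself remain below the minimum cell size, so treatments like these are usually reviewed together rather than applied mechanically.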

9

Anonymous Access Data (Synapse)

Data available for download on Synapse by anyone on the web without requiring them to login to a Synapse account or fulfill Conditions for Use.

*Applicability: Synapse

10

Anonymous Data

Anonymized Data

(1) Broad Definition:

Individual-level data that has been stripped of personally identifiable information.

(2) Enhanced Definition:

Individual-level data that cannot be used alone or with other data to identify a unique individual.

*Applicability: General

Anonymization performed through simple de-identification techniques is useful as a primary safeguard for protecting privacy, but a growing body of literature has shown that as the size and diversity of available data grows, the likelihood of being able to re-identify individuals also grows substantially.

When communicating the protectiveness of de-identification, “anonymization” should be used carefully so as to not mislead participants or the community that “anonymized” data without additional treatment or analysis is a robust method of protecting against future re-identification.

In the “Enhanced Definition,” the data cannot be coded such that a link to the identifiers existing in a separate, existing data set could re-identify the individual.

11

Anonymous Journal Review

A process by which reviewers from a scientific journal anonymously access data in Synapse in order to evaluate it as part of their review of a manuscript being submitted for publication.

*Applicability: Synapse

To facilitate Anonymous Journal Review, ACT sets up a temporary account for journal reviewers to access data for a temporary period of time.

12

Anonymous User

A Synapse user interacting with the platform without creating (or logging into) a Synapse account.

*Applicability: Synapse

Anonymous Users are able to review platform features, public resources (including the catalog of public projects, files, and tables), and other Anonymous Access Data.

Anonymous Users cannot create Projects in Synapse, upload or download data, add wiki content, or comment in discussion forums.

13

Biometric Data

Personal data (see below) resulting from specific technical processing relating to the physical, physiological or behavioral characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data.

*Applicability: General, GDPR

Uses: This definition may be used broadly outside of the scope of GDPR, but the definition source is from GDPR.

Definition Source: GDPR Article 4, See also: 21 CFR 11.3(3)

14

Breach

(1) General Applicability:

The loss of control, compromise, unauthorized disclosure, unauthorized acquisition, or any similar occurrence where (1) a person other than an authorized user accesses or potentially accesses personally identifiable information or (2) an authorized user accesses or potentially accesses personally identifiable information for an other than authorized purpose.

(2) HIPAA Applicability:

The acquisition, access, use, or disclosure of protected health information in a manner not permitted [by the regulations] which compromises the security or privacy of the protected health information.

*Applicability and Uses - Definition 1: Can be used broadly for breach incidents that do not involve HIPAA-regulated data. This definition is adopted from the Office of Management and Budget; however, it does not carry regulatory weight.

*Applicability and Uses - Definition 2: Applies only to Breaches subject to HIPAA regulations.

Definition Source, HIPAA: 45 CFR 164.402

Definition Source, General: OMB M-17-12

15

Business Associate (under HIPAA)

A business associate, with respect to a covered entity, is a person or entity who:

On behalf of such covered entity […] creates, receives, maintains, or transmits protected health information for a function or activity regulated by [HIPAA], including claims processing or administration, data analysis, processing or administration, utilization review, quality assurance, patient safety activities […], billing, benefit management, practice management, and repricing.

*Applicability: HIPAA

Uses: This definition should only be applied to situations where Sage is agreeing to take on a Business Associate role.

A Business Associate role can only be taken on when a formal contract (Business Associate Agreement) meeting regulatory requirements has been executed. When an organization agrees to be a Business Associate, HIPAA regulations are applied in full effect (including enforcement requirements, breach requirements and the possibility of penalties).

Definition Source: 45 CFR 160.103

16

Business Associate Agreement (BAA)

A contract between a covered entity and the entity or person agreeing to participate in activities on behalf of the covered entity. The BAA establishes the permitted and required uses and disclosures of protected health information by the business associate.

*Applicability: HIPAA

Uses: This definition should only be applied to situations where Sage is agreeing to take on a Business Associate role.

Definition Source: 45 CFR 164.502(e)

17

Certified User

A Synapse user who has created a Synapse ID, has logged into Synapse using their email and password, and has successfully completed the Certification Quiz.

*Applicability: Synapse

To become a Certified User, a Registered User must pass a short quiz concerning the Synapse Commons Data Use Procedure to ensure the user understands the rules and policies that govern data sharing on Synapse.

Certified Users have access to full Synapse functionality, including the ability to upload files and tables as well as create folders.

18

Certification Quiz

A quiz which is taken by a Registered User to become a Certified User and ensures the user understands the rules and policies that govern data sharing on Synapse.

*Applicability: Synapse

To become a Certified User, a Registered User must pass a short quiz concerning the Synapse Commons Data Use Procedure to ensure the user understands the rules and policies that govern data sharing on Synapse.

The Certification Quiz consists of 15 questions and takes approximately 15-20 minutes to complete.

19

Certificate of Confidentiality (CoC)

A CoC is a tool to protect information, documents, and/or biospecimens that contain identifiable, sensitive information related to a research participant. It is intended to protect the privacy of research participants by prohibiting disclosure of identifiable, sensitive research information to anyone not connected to the research, except when the participant consents or in a few other specific situations.

*Applicability: NIH, General

CoCs are:

  • Established by the Public Health Service Act §301(d), 42 U.S.C. §241(d), "Protection of privacy of individuals who are research subjects"

  • Applicable only to human subjects research studies in which identifiable, sensitive information is collected or used.

  • Issued by NIH and other HHS agencies (e.g., CDC) for research studies.

    • Since 2017, NIH automatically issues CoCs for any NIH-funded research meeting their criteria.

    • Researchers can also apply for a CoC for non-NIH funded research studies.

For more information, see the NIH FAQs page.

OHRP Guidance: Certificates of Confidentiality - Privacy Protection for Research Subjects

NIH CoC

20

Click-wrap

A type of Access Requirement placed on a Synapse entity (a folder, file, project, or team) that can be satisfied by the user by reviewing the data contributor's conditions and clicking the button that states "I accept the terms of use."

*Applicability: Synapse

Click-wraps generally contain Terms and Conditions of data use (i.e., what you can and cannot do with the data) and often contain an Acknowledgement Statement.

21

Common Rule

A set of federal guidelines in the U.S. that protect people who participate in research studies, ensuring their rights, safety, and privacy are respected. It outlines how researchers must obtain consent from participants, how the proposed research must be reviewed and approved by an Institutional Review Board (IRB), and how researchers must handle human subject data.

*Applicability: General

45 CFR 46 - "Common Rule" | HHS.gov

22

Community Governance

Policies, processes, and structures that guide and oversee the research activities such as the research design, data collection, analysis, tools, methods, and dissemination. Community Governance:

1) ensures the ethical, responsible, and accountable conduct of research activities; and

2) protects the rights and well-being of research participants and maintains the integrity of the research process.

*Applicability: General

Who is involved: Research consortia, steering committees, funders

23

Conditions of Use

A set of expectations and/or terms for data access applied to Synapse content.

*Applicability: Synapse

Conditions of Use are organized to help Requesters comply with the terms under which the data were collected or with other human subjects regulations. Data Contributors collaborate with ACT to set up Conditions of Use in the form of an Access Requirement.

24

Coded Data

Data is coded when:

  1. Identifying information (such as name or social security number) that would enable the investigator to readily ascertain the identity of the individual to whom the private information or specimens pertain has been replaced with a number, letter, symbol, or combination thereof (i.e., the code); and

  2. A key to decipher the code exists, enabling linkage of the identifying information to the private information or specimens.

*Applicability: General

Uses: This definition may be used broadly.

Definition Source: OHRP Guidance: Coded Private Information or Specimens Use in Research (2008)
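
The two conditions in this definition (identifiers replaced by a code, and a key that allows re-linkage) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `code_records` and `relink` are hypothetical helper names, the code is a random hex token, and in practice the key would be stored separately from the coded records under its own access controls.

```python
import secrets


def code_records(records, identifying_fields):
    """Replace identifying fields with a random code.

    Returns (coded_records, key), where `key` maps each code back to the
    identifying information it replaced.
    """
    identifying = set(identifying_fields)
    coded, key = [], {}
    for record in records:
        code = secrets.token_hex(8)  # assumption: a random token serves as the code
        key[code] = {f: record[f] for f in identifying}
        remaining = {k: v for k, v in record.items() if k not in identifying}
        coded.append({"code": code, **remaining})
    return coded, key


def relink(coded_record, key):
    """Use the key to restore the identifying information for one record."""
    rest = {k: v for k, v in coded_record.items() if k != "code"}
    return {**key[coded_record["code"]], **rest}


if __name__ == "__main__":
    people = [{"name": "Ada", "ssn": "000-00-0000", "diagnosis": "X"}]
    coded, key = code_records(people, ["name", "ssn"])
    print(sorted(coded[0]))
```

Without the key, the coded records carry no direct identifiers; whoever holds the key can still re-identify them, which is what distinguishes coded data from the "Enhanced Definition" of anonymized data.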

25

Covered Entity

(“HIPAA Covered Entity”)

Covered entity means:

(1) A health plan.

(2) A health care clearinghouse.

(3) A health care provider who transmits any health information in electronic form in connection with a transaction covered by this subchapter (45 CFR 160.102).

*Applicability: HIPAA

Uses: This definition should only be used to determine whether an institution (entity) is subject to HIPAA regulations.

Sage is not a covered entity. Covered entities are generally organizations engaged in health care operations that cause them to be subject to HIPAA laws.

Related Definitions: Hybrid Entity, Business Associate

Definition Source: 45 CFR 160.103

26

Creative Commons License

One of several public copyright licenses that enable the free distribution of an otherwise copyrighted work and is used when an author wants to give other people the right to share, use, and build upon a work that the author has created.

*Applicability: Synapse, General

This is required for most data in the Open Access Data Tier.

https://creativecommons.org/licenses/

27

Data Access Committee (DAC)

An individual or group who reviews and approves or rejects applications or requests for access to and use of data governed by a managed AR.

*Applicability: Synapse, General

The Access and Compliance Team (ACT) serves as the Sage Data Access Committee (DAC).

28

Data Concerning Health

Personal data (see below) related to the physical or mental health of a natural person, including the provision of health care services, which reveal information about his or her health status.

*Applicability: GDPR

Uses: This definition need only be used when working with data subject to GDPR.

Related Definition (HIPAA): Health Information

Definition Source: GDPR Article 4

29

Data Contributor

The owner (individual, group or institution) of data (which may include analysis and/or tools) who uploads data content to Synapse.

*Applicability: Synapse, General

For Synapse communities involving Sage services (such as data curation), a Data Ingress / Egress Agreement may be required before a Data Contributor can upload their data.

30

Data Disposition

The process of determining when and how data is retained, archived, or deleted. It involves making decisions about what to do with data based on factors such as its value, relevance, and legal or regulatory requirements.

*Applicability: Synapse, General

31

Data Disposition View

A tool, such as a Synapse fileview or materialized view (if data spans multiple projects), .csv file, or R or Python script that enumerates synIDs, which is created by Sage to allow a Data Contributor to easily view and confirm their contributed data for data migration and/or removal from Synapse.

*Applicability: Synapse, General
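
As a rough illustration of the ".csv file" and "Python script that enumerates synIDs" forms mentioned above, the following minimal sketch writes one row per synID so that a Data Contributor can review and confirm each item. The record fields, the example synIDs, and the `confirmed_disposition` column are assumptions made for illustration only; in practice the rows would likely come from a Synapse fileview or materialized view.

```python
import csv
import io

# Hypothetical records for illustration; real rows would come from a Synapse view.
contributed_files = [
    {"synId": "syn0000003", "name": "clinical.csv", "project": "Project B"},
    {"synId": "syn0000001", "name": "samples.csv", "project": "Project A"},
    {"synId": "syn0000002", "name": "assay.tsv", "project": "Project A"},
]


def write_disposition_view(records, handle):
    """Write one CSV row per synID, sorted, with an empty column for confirmation."""
    fieldnames = ["synId", "name", "project", "confirmed_disposition"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    for record in sorted(records, key=lambda r: r["synId"]):
        # The contributor fills in the last column (for example, migrate or remove).
        writer.writerow({**record, "confirmed_disposition": ""})


buffer = io.StringIO()
write_disposition_view(contributed_files, buffer)
print(buffer.getvalue())
```

Writing to a file instead of an in-memory buffer only requires passing an open file handle in place of `buffer`.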

32

Data Encryption Key

A secret key used to encrypt data and to decrypt (unscramble) encrypted data so that it is readable.

*Applicability: Synapse, General

33

Data Governance

Policies, procedures, and controls of research assets including access management, safe and responsible use, supporting interoperability, and contributing to overall data lifecycle management. Data Governance:

1) ensures data is managed, protected, and utilized effectively and responsibly within an organization; and

2) ensures the availability, usability, integrity, and security of data.

*Applicability: General

Who is involved: ACT, IT and Security, Platform engineers, Privacy officers

34

Data Incident

Anchor
Data_Incident
Data_Incident

An occurrence that (1) actually or imminently jeopardizes the integrity, confidentiality, or availability of information or an information system, or (2) constitutes a violation or imminent threat of violation of law, security policies, security procedures, or acceptable use policies.

*Applicability: General

This definition is adopted from the Office of Management and Budget (OMB); however, it does not carry regulatory weight.

Definition Source: OMB M-17-12

35

Data Ingress / Egress Agreement

Anchor
Data_Ingress_/_Ingress_Agreements
Data_Ingress_/_Ingress_Agreements

A formal contract between Sage and one or more external parties that outlines the terms and conditions under which data is allowed to enter or be imported into Synapse and the intentions, responsibilities, and roles of each party.

*Applicability: General

This term represents:

  • Data Processing Agreement (DPA)

  • Data Sharing Agreement (DSA)

  • Data Sharing Permission (DSP) - this is a Sage term that is not commonly used externally.

  • Data Transfer Agreement (DTA)

  • Data Transfer and Use Agreement (DTUA)

  • Data Use Agreement (DUA) - See separate definition. Note that DUAs that are issued for HIPAA Limited Data Sets must meet specific requirements as dictated by HIPAA. “DUA” should be reserved for HIPAA-regulated data sets unless the countersigning party has a preference for using this term.

  • Materials Transfer Agreement (MTA)

  • Memorandum of Understanding (MOU)

Sage engages in a variety of agreements with external customers and partners to define: the expectations for providing data to Synapse; the roles and responsibilities each party takes to manage data; the conditions under which data will be shared with other users and/or institutions from a Sage platform (e.g., authorized persons, access tiers, security boundaries); and the roles and responsibilities that Sage may take on for reviewing access requests. The scope and applicability of these agreements depend on a number of project-specific factors, including participant consent, data types, contractual obligations, institutional policies, rules and regulations, funder mandate, and/or research community sharing expectations.

A Data Ingress / Egress Agreement is required for institutions contributing data to a Synapse community, and/or for institutions that are having Sage manage data access for them. Sage Governance may attempt to use a standard template to meet the agreement needs (e.g., using an FDP template), but the type and content of the agreement can vary widely depending on the nature of the data, the scope of work, and the preferences of the institution.

Note that a grant or other existing agreement such as a Data Use Agreement (DUA) can take the place of an additional Data Ingress / Egress Agreement as long as it is signed by an Institutional Signing Official and the existing document mentions that data will be stored in a repository matching the project’s access controls.

36

Data Landscape Survey

Phase of initial engagement between the DCC, data contributors, and collaborators which involves defining parameters for receiving and classifying data.

*Applicability: General

37

Data Management

Anchor
Data_Management
Data_Management

The process of validating, organizing, protecting, maintaining, and processing scientific data to ensure the accessibility, reliability, and quality of the scientific data for its users.

*Applicability: General

Definition Source: NIH NOT-OD-21-013 (Data Sharing and Management Plans)

38

Data Migration

A Data Disposition option which involves the retention and relocation of the Data Contributor’s data on Synapse.

*Applicability: Synapse, General

39

Data Protection Impact Assessment (DPIA)

Anchor
DPIA
DPIA

A tool used to identify the risks and impacts arising out of the processing of personal data and to build awareness so that these risks can be minimized as much and as early as possible.

*Applicability: General, GDPR

This is a general tool that may be used at Sage regardless of the regulatory oversight.

A Data Protection Impact Assessment (DPIA) is required under the GDPR any time a new project is initiated that is likely to involve “a high risk” to personal information. (More HERE.)

Sage Data Protection Policy

GDPR Article 35

40

Data Repository

Anchor
Data_Repository
Data_Repository

A database of research data maintained for the purpose of performing secondary research.

*Applicability: General

Additional synonyms: banks, registries, libraries

Data repository activities can include data curation and data maintenance (i.e., “data management”), and access management. A data repository containing de-identified data is not “research,” though the downstream product of the repository is for research.

41

Data Requester

Anchor
Data_Requesters
Data_Requesters

All individuals listed on a Synapse Access Request for access to data.

*Applicability: Synapse

When applicable for a Managed Access Requirement (AR), all Data Requesters listed on a Synapse data Access Request should exactly match the Data Requesters listed on the associated Data Use Certificate (DUC).
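
The matching rule above can be checked mechanically. The sketch below compares the two lists of names and reports anyone who appears on only one of them. The function name and the normalization (collapsing whitespace and ignoring case) are assumptions for illustration; a reviewer may prefer a stricter, character-for-character comparison.

```python
def normalize(name):
    """Collapse repeated whitespace and ignore case (an illustrative assumption)."""
    return " ".join(name.split()).casefold()


def requester_mismatches(access_request_names, duc_names):
    """Return the names that appear on only one of the two lists."""
    on_request = {normalize(n) for n in access_request_names}
    on_duc = {normalize(n) for n in duc_names}
    return {
        "missing_from_duc": sorted(on_request - on_duc),
        "missing_from_request": sorted(on_duc - on_request),
    }


result = requester_mismatches(
    ["Ada Lovelace", "Grace Hopper"],
    ["ada lovelace ", "Alan Turing"],
)
print(result)
```

Both lists being empty in the result indicates that the Access Request and the DUC name the same Data Requesters.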

42

Data Roadmap

An evidence-driven data plan, developed in the Data Landscape Phase by a DCC Team, which will be updated in subsequent stages as the data landscape changes and expands.

*Applicability: General

The Data Roadmap will answer the following questions:

  • who is contributing data

  • what types of data

  • how much (e.g., expected number of files, samples and individuals per data type)

  • who owns the data or has ability to approve sharing it

  • who should have access to the data

  • where is the data currently

  • where will it be stored

  • where the data will be analyzed (in the case of cloud computing)

  • when will it be transferred to the DCC

  • when should the DCC expect to release the data

  • how to communicate with the DCC regarding data

  • the governance conditions for sharing and using the data

43

Data Sharing

Anchor
Data_Sharing
Data_Sharing

The act of making scientific data available for use by others (e.g., the larger research community, institutions, the broader public), for example, via an established repository.

*Applicability: NIH, General

Definition Source: NIH NOT-OD-21-013 (Data Sharing and Management Plans)

44

Data Sharing and Management Plan (DSMP)

Anchor
DSMP
DSMP

A plan describing the data management, preservation, and sharing of scientific data and accompanying metadata.

*Applicability: NIH, General

See also NOT-OD-21-014: Supplemental Information to the NIH Policy for Data Management and Sharing: Elements of an NIH Data Management and Sharing Plan

Source: NIH NOT-OD-21-013 (Data Sharing and Management Plans)

45

Data Subject

Anchor
Data_Subject
Data_Subject

Identified or identifiable living individual to whom personal data relates.

*Applicability: GDPR

Uses: This definition need only be used when working with data subject to GDPR.

Related Definitions: Human Subject

Definition Source: GDPR Article 4

46

Data Use Agreement (DUA)

Anchor
DUA
DUA

(1) General Applicability:

A contractual document used for the transfer of data that has been developed by nonprofit, government or private industry, where the data are nonpublic or are otherwise subject to some restrictions on their use.

(2) HIPAA Applicability:

An agreement between a covered entity and a limited data set recipient to establish permitted uses and disclosures by the recipient.

*Applicability: General, HIPAA

Uses: This definition has broad uses, but the HIPAA definition is for specific circumstances where a covered entity is disclosing a Limited Data Set to another institution.

  • Non-HIPAA uses of the term generally refer to data sharing agreements between institutions and may be synonymous with “DTA,” “DSA,” “MOU,” and similar agreements to govern data sharing.

  • HIPAA: DUAs under HIPAA must meet specific regulatory requirements. The terms of the DUA define the allowed uses. HIPAA regulations prohibit the recipient from further disclosing or using the information in a manner that would violate HIPAA regulations or the agreement. Recipients under the agreement are required to use appropriate safeguards to prevent use or disclosure of information outside of the defined terms of the agreement.

Definition Sources:

General: UPitt Office of Sponsored Programs

HIPAA: 45 CFR 164.514(e)(4)

47

Data Use Certificate (DUC)

Anchor
DUC
DUC

A documented agreement outlining the terms of use for accessing a specific Synapse dataset, which must be signed by the Data Requester(s) and often also requires the signature of an institutional Signing Official.

*Applicability: Synapse

Managed ARs can be created to require submission of a Data Use Certificate (DUC) for data access.

48

Data Use Ontology (DUO) Codes

Standardized terms used to specify permissible data uses and access restrictions for shared genomic and health-related datasets to help ensure that researchers and data users comply with ethical and legal requirements, particularly regarding the privacy and consent preferences of data subjects (the individuals whose data is being shared).

*Applicability: General

DUO Codes simplify the interpretation of Data Ingress / Egress Agreements by providing a consistent framework, making it easier for data providers, access committees, and researchers to understand what uses of the data are allowed. Examples of DUO Codes include terms like "general research use," "disease-specific research," or "not-for-profit use."
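
To show how standardized use terms can make a permitted-use decision easier to reason about, here is a minimal sketch that uses the three example terms quoted above as plain labels. The labels are not official DUO identifiers, and the class, the function, and the decision logic are hypothetical simplifications rather than how a Data Access Committee or Synapse evaluates requests.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative labels taken from the examples in the text; not official DUO identifiers.
GENERAL_RESEARCH_USE = "general research use"
DISEASE_SPECIFIC = "disease-specific research"
NOT_FOR_PROFIT = "not-for-profit use"


@dataclass(frozen=True)
class DatasetConditions:
    labels: frozenset
    disease: Optional[str] = None  # only meaningful with DISEASE_SPECIFIC


def is_use_permitted(conditions, research_disease, requester_is_nonprofit):
    """Simplified check of a proposed use against a dataset's labeled conditions."""
    if NOT_FOR_PROFIT in conditions.labels and not requester_is_nonprofit:
        return False
    if DISEASE_SPECIFIC in conditions.labels:
        return research_disease == conditions.disease
    return GENERAL_RESEARCH_USE in conditions.labels


restricted = DatasetConditions(
    labels=frozenset({DISEASE_SPECIFIC, NOT_FOR_PROFIT}), disease="asthma"
)
print(is_use_permitted(restricted, "asthma", requester_is_nonprofit=True))
```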

49

De-identification

Anchor
De-identification
De-identification

De-Identified Data

(1) Non-HIPAA/General:

Information that has had personally identifiable information (PII), including PHI, removed.

(2) HIPAA Safe Harbor Method:

(i) Removal of the 18 identifiers defined in 45 CFR 164.514(b)(2)(i)(A)-(R) [paraphrased]

and

(ii) The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.

(3) HIPAA Expert Determination/Statistical Method:

A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:

(i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and

(ii) Documents the methods and results of the analysis that justify such determination.

*Applicability: HIPAA, General

Uses:

Non-HIPAA/General Considerations:

HIPAA’s de-identification standards are over 20 years old and numerous studies have demonstrated many ways in which data labeled as de-identified can be re-identified.

The U.S. Department of Health and Human Services (“HHS”) Secretary’s Advisory Committee on Human Research Protections (“SACHRP”) has noted, for example:

Though de-identification is commonly perceived to be an effective means to protect human participants, certain studies have shown convincingly that other data can be used in conjunction with de-identified data from research studies to re-identify individuals.  Increasingly, the protections afforded by removing the eighteen identifying data elements cited in HIPAA have become out of date, as technological advances and the combining of data sets increase the risk of re-identification.  For example, commercial interests have increasingly been trying to combine large, de-identified data sets with real-world data collected during the course of ordinary daily activities (e.g., credit card charges, driving habits), which increases the risk of re-identification and misuse of previously de-identified data. 

It is important to note that these de-identification methods are not recognized globally. GDPR requirements in the European Union, for example, are comparatively more rigorous. However, GDPR does not provide any specific de-identification methods.

At Sage, HIPAA standards for de-identification are applied broadly in recognition of national standards and as a basic foundation for protecting privacy; however, Governance’s evaluation of data sensitivity and privacy risks must take into account the limitations of HIPAA de-identification standards in favor of more rigorously protective methods or systems.

HIPAA:

HIPAA has defined two de-identification methods that have become a national standard. These definitions specifically apply to protected health information (PHI), which is created by and transmitted by a covered entity, but have been applied broadly across the U.S. and within the research profession.

For more information, see “Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the HIPAA Privacy Rule” (2012).

Definition Sources:

(2) HIPAA Safe Harbor Method: 45 CFR 164.514(b)(2)

(3) HIPAA Expert Determination Method: 45 CFR 164.514(b)(1)

50

Derived Data

Anchor
Derived-Data
Derived-Data

New data created by transforming, processing, or analyzing existing data.

*Applicability: General

51

FISMA

Anchor
FISMA
FISMA

(Federal Information Security Management Act of 2002 and Federal Information Security Modernization Act of 2014)

A U.S. federal law (FISMA 2002) which requires each federal agency to develop, document, and implement an agency-wide program to provide information security for the information and systems that support the operations and assets of the agency, including those provided or managed by another agency, contractor, or other sources.

FISMA 2014 amends FISMA 2002 by modernizing federal security practices to address evolving security concerns. The changes reduce overall reporting, strengthen the use of continuous monitoring in systems, and focus agency compliance and reporting more closely on the issues caused by security incidents.

FISMA 2014 also required the Office of Management and Budget (OMB) to amend/revise OMB Circular A-130 to eliminate inefficient and wasteful reporting and reflect changes in law and technological advances.

*Applicability: Synapse, General

Synapse is a FISMA-compliant platform. See the Synapse Platform page for more information.

Federal Information Security Management Act (FISMA)

52

Fully-Executed

Anchor
Fully-Executed
Fully-Executed

Term used when all Parties’ authorized representatives have formally signed the Project Material(s).

53

General Data Protection Regulation (GDPR)

Anchor
GDPR
GDPR

Rules and privacy regulations governing data in the European Union (EU). GDPR establishes personal data privacy protections as a fundamental right.

*Applicability: GDPR

Full text of GDPR: https://gdpr.eu/tag/gdpr/

54

Genetic Data

Anchor
Genetic_Data
Genetic_Data

Personal data (see below) relating to the inherited or acquired genetic characteristics of a natural person which give unique information about the physiology or the health of that natural person and which result, in particular, from an analysis of a biological sample from the natural person in question.

*Applicability: GDPR, General

Uses: This definition may be used broadly outside of the scope of GDPR.

Definition Source: GDPR Article 4

55

Governance Structures

Anchor
Governance_Structures
Governance_Structures

Governance Models

The data sharing framework that dictates what data to acquire, how to bring them into systems, how to store them, how to analyze them, and how to share downstream knowledge.

*Applicability: General

Types of Governance Structures:

  • Pairwise (One-to-one): Two parties agree to work together on and/or share a data set in some fashion, typically with a closed contract or an informal agreement. The negotiation terms depend on the relative status of the parties and/or the value of the data and knowledge.

  • Open Source (One-to-many or some-to-many): Data are distributed for reuse with a license defining reuse rights and conditions. The creator is in charge of the negotiation at first (choice of license), but then rights to analyze and redistribute are permanently transferred to the user. This is typical of a centralized project in the sciences, e.g., the Human Genome Project.

  • Federated Query (Many-to-many, via platform): Data are housed in a variety of locations, and users are able to query those local data simultaneously. Typically restricted to pre-configured queries (rather than data exploration) and may require registration before use.

  • Trusted research environment (Many-to-some): Data are housed in a central location under a contractual regime including Data Ingress / Egress Agreements. Users apply to use the data. Users must “visit” the data rather than download them, agree to be known, and, in some cases, agree to be surveilled by a data steward.

  • Model-to-data (One-to-many): Data are held by a steward who is responsible for running algorithms on the behalf of researchers. In some cases, a synthetic version of the data may be released openly to facilitate model training. Researchers develop algorithms, send them to the steward, and receive back output of their analysis as run on the real dataset. The variety of analyses that may be performed is restricted by this structure, because the data steward must ensure data are specifically curated for any analytical question at hand.

  • Open citizen science (Many-to-many): Rights to use and distribute data are often fully decentralized via license or contract. Open citizen science is a peer-to-peer version of open source science.

  • Clubs and Trusts (Some-to-some): Clubs and Trusts are versions of a common pool resource: a group of people and/or institutions who agree to share resources towards a common goal. Control over the development and negotiation of data sharing and use terms is often held by the founders/settlers (and/or funders) and then can be distributed amongst club participants. Importantly, clubs that operate in the cloud can easily publish data products that are more “open” than the club itself.

  • Closed: Data are held privately by a single party.

  • Closed and Restricted: Data are held privately in order to protect a population, meet a legal requirement, or protect a secret.

Mangravite, Lara M., Avery Sen, John T. Wilbanks, and Sage Bionetworks Team. Mechanisms to Govern Responsible Sharing of Open Data: A Progress Report. Manubot, 2020. https://github.com/Sage-Bionetworks/governanceGreenPaper/tree/3c2a648b892d8c672a3043c4bacda65505947921

56

Health Information

Anchor
Health_Information
Health_Information

Any information, including genetic information, whether oral or recorded in any form or medium, that:

(1) Is created or received by a health care provider, health plan, public health authority, employer, life insurer, school or university, or health care clearinghouse; and

(2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual.

*Applicability: HIPAA, General

Uses: This definition may be used broadly, but sub-definition (1) can be omitted if the use is not within the scope of HIPAA-regulated activities.

Related Definition (GDPR): Data Concerning Health

Definition Source: 45 CFR 160.103

57

Health Insurance Portability and Accountability Act (HIPAA)

Anchor
HIPAA
HIPAA

US health information privacy law. HIPAA legislation resulted in regulations that are collectively referred to as “HIPAA” and are made up of the “Privacy Rule,” “Security Rule,” and “Enforcement Rule.”

*Applicability: HIPAA

HIPAA Legislation:

https://www.govinfo.gov/content/pkg/PLAW-104publ191/pdf/PLAW-104publ191.pdf

Combined HIPAA Regulations:

https://www.hhs.gov/sites/default/files/ocr/privacy/hipaa/administrative/combined/hipaa-simplification-201303.pdf

58

Human Subject

Anchor
Human_Subject
Human_Subject

Research Participant

A living individual about whom an investigator (whether professional or student) conducting research:

(i) Obtains information or biospecimens through interaction or intervention with the individual, and uses, studies, or analyzes the information or biospecimens, or

(ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens.

*Applicability: Common Rule, FDA Regulations

Uses: This definition is used primarily to determine whether information, interactions, interventions, or biospecimens used for research purposes are subject to human subjects regulations (i.e., whether IRB review is required).

This is a truncated definition. Contact Governance for an in-depth discussion.

Definition Source: 45 CFR 46.102(e) (2018 revision)

See also: Chart 01: Is an Activity Human Subjects Research Covered by 45 CFR Part 46?

https://www.hhs.gov/ohrp/regulations-and-policy/decision-charts-2018/index.html#c1

59

Hybrid Entity

Anchor
Hybrid_Entity
Hybrid_Entity

A single legal entity:

(1) That is a covered entity;

(2) Whose business activities include both covered and non-covered functions; and

(3) That designates health care components in accordance with paragraph 164.105(a)(2)(iii)(D) of HIPAA regulations.

*Applicability: HIPAA

Uses: This definition only applies to HIPAA-regulated organizations.

A typical example of a hybrid entity is a university with an affiliated teaching hospital. The hospital portion of the organization performs HIPAA-covered health care functions, while the rest of the university performs non-covered functions.

Definition Source: 45 CFR 164.103

60

Identifiable Data/Information

Anchor
Identifiable_Data
Identifiable_Data

(1) Common Rule:

Data for which the identities of the source subjects are or may readily be ascertained by the investigator or associated with the information.

(2) NIH:

Data that are still attached to a readily available subject identifier such as name, social security number, study number, hospital number, medical record number, address, telephone number, etc., such that the identity of the subject can be ascertained.

(3) GDPR (“Identifiable Natural Person”):

One who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

*Applicability: Common Rule, NIH, General

Uses:

U.S. federal policies treat “identifiable” as meaning that the identities of the subjects can be readily ascertained or that there are readily available identifiers attached to the data that would allow individual subject identities to be ascertained. When navigating the applicability of federal policies and regulations, the definition provided by the regulatory source should be applied.

The GDPR definition of an “identifiable natural person” goes beyond the U.S. references to traditional identifiers (like name, address, phone number or SSN), and includes reference to “one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity” of the subject.

At Sage, we recognize the need to combine many definition sources when evaluating factors such as the level to which data is identifiable, data sensitivity, and the risk of re-identification. In practice, Sage Governance will always apply the definition applicable to the specific laws and regulations of the data, but will take a more protective stance whenever feasible. When evaluating data outside the scope of a specific regulatory question, Sage should apply the NIH definition. While the evaluation of data sensitivity and risk should include the combined nature of the individual factors listed in the GDPR definition, Sage will not label data “identifiable” due to these factors alone unless GDPR applies.

For GDPR-regulated data, also see Personal Data.

For HIPAA-regulated data, also see Individually Identifiable Health Information (IIHI).

Related Definitions: Personally Identifiable Information (PII), Coded Data, De-identified Data

Definition Sources:

Common Rule: 45 CFR 46.102(e)(5)

NIH: 3016 - Intramural Research Program Human Data Sharing (HDS) Policy

GDPR: GDPR Article 4

61

Identity Attestation Document

Documentation needed from a Synapse User to confirm their identity as part of the Synapse Profile Verification process.

*Applicability: Synapse

Acceptable forms of Identity Attestation Documents include:

  • Letter from a signing official (other than the person submitting) on official letterhead attesting to their identity

  • Notarized letter attesting to their identity

  • A copy of a professional license (e.g., a medical license)

62

Incident

Suspected event that impacts the computer or data environment within Sage Bionetworks.

*Applicability: Synapse, General

63

Individually Identifiable Health Information (IIHI)

Anchor
IIHI
IIHI

Individually identifiable health information is information that is a subset of health information, including demographic information collected from an individual, and:

(1) Is created or received by a health care provider, health plan, employer, or health care clearinghouse; and

(2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual; and

(i) That identifies the individual; or

(ii) With respect to which there is a reasonable basis to believe the information can be used to identify the individual

*Applicability: HIPAA

Uses: This definition need only be used when working with data subject to HIPAA regulations.

Definition Source: 45 CFR 160.103

64

Informed Consent

Anchor
Informed_Consent
Informed_Consent

The process of informed consent is a fundamental mechanism to ensure respect for persons through the provision of thoughtful consent for a voluntary act.

*Applicability: General

Depending on the research and the consenting plan approved by an Institutional Review Board (IRB), consent may be performed (1) orally without a signed document; (2) using a disclosure form without a signature; or (3) using an informed consent form with required signatures.

65

Informed Consent Form (ICF)

Anchor
ICF
ICF

Informed Consent Document (ICD)

Informed consent forms are written documents presented as part of an informed consent process when enrolling a human subject in research.

*Applicability: General

Informed consent forms must meet specific requirements defined by the regulations.

Informed consent is not the same as “HIPAA Authorization,” though some institutions may allow these distinct documents to be combined.

Informed consent forms often include restrictions on data sharing and future use limitations. Informed consent forms therefore help to establish Conditions for Data Use within Synapse.

Elements of informed consent are defined by the regulations at 45 CFR 46.116 (Common Rule) and 21 CFR 50.25 (for FDA-regulated studies).

Documentation requirements for informed consent are defined by the regulations at 45 CFR 46.117 (Common Rule) and 21 CFR 50.27 (for FDA-regulated studies).

66

Intended Data Use Statement (IDU)

Anchor
Intended-Data-Use-Statement-(IDU)
Intended-Data-Use-Statement-(IDU)

A detailed description submitted with a Data Access Request identifying the Data Requester's research purpose for accessing and using certain data stored in Synapse which is used by the Data Access Committee (DAC) to determine whether access to the data should be allowed. IDUs should address the following questions: What do you want to do with the data? Why are you doing it? How do you want to do it?

*Applicability: Synapse

IDUs can be required to access certain data via a Managed AR. They are often posted publicly on Synapse wiki pages or portal pages.

67

Institutional Review Board (IRB)

Anchor
IRB
IRB

An independent body constituted of medical, scientific, and nonscientific members, whose responsibility it is to ensure the protection of the rights, safety, and well-being of human subjects by, among other things, reviewing, approving, and providing continuing review of protocols, amendments, and the methods and material to be used in obtaining and documenting informed consent of the research subjects.

*Applicability: General

IRB approval may be required to access certain data via a Managed AR.

Adapted from ICH E6(R2) 1.31 Good Clinical Practice

68

Interconnection Security Agreement (ISA)

Anchor
ISA
ISA

An ISA captures the technical and security requirements to establish and maintain the interconnection between any two or more systems.

*Applicability: NIH

Federal policy recommends that agencies develop Interconnection Security Agreements (ISAs) when information is exchanged with another organization via a system interconnection. This is a FISMA-required document discussing security-relevant aspects of an intended connection between a federal agency system and an external system.

Reference: NIST

69

Journal

A periodical publication that disseminates original research, reviews, and scholarly articles in a specific field of study.

*Applicability: General

Scientific journals serve as the primary means of sharing new knowledge, discoveries, and theories among researchers, academics, and professionals.

70

Legacy Project

Anchor
Legacy_Project
Legacy_Project

Term used for Synapse Data Coordination Center (DCC) projects that are no longer actively funded, yet require Sage’s continued support, maintenance and closure, as needed. Work completed in support of such projects is funded through indirect funds.

*Applicability: General, Synapse

71

Limited Data Set

Anchor
Limited_Data_Set
Limited_Data_Set

“HIPAA Limited Data Set”

A limited data set is protected health information (PHI) that excludes the direct identifiers listed in 45 CFR 164.514(e)(2).

For simplification purposes, one or more of the following identifiers may be allowed:

  • dates such as admission, discharge, date of service, date of birth, date of death;

  • city, state, five-digit or longer ZIP code; and

  • calculated ages in years, months, days, or hours (including ages over 89).

*Applicability: HIPAA

Uses: The term “Limited Data Set” is only truly applicable when:

  1. The data was created or received by a covered entity,

  2. The data was stripped of all identifiers except one or more of the identifiers listed in the definition, AND

  3. There is a Data Use Agreement in place meeting the requirements specified by HIPAA regulations.
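
The three conditions above can be read as a simple conjunction, sketched below. The field names and the identifier category labels are hypothetical shorthand for the identifiers named in the definition; this is an aid to reading the rule, not a regulatory determination.

```python
from dataclasses import dataclass

# Shorthand for the identifier categories the definition allows (labels are assumptions).
ALLOWED_IDENTIFIERS = {"dates", "city", "state", "zip_code", "age"}


@dataclass
class DatasetProfile:
    from_covered_entity: bool  # condition 1: created or received by a covered entity
    identifiers_present: set   # identifier categories still present in the data
    has_hipaa_dua: bool        # condition 3: a HIPAA-compliant DUA is in place


def qualifies_as_limited_data_set(profile):
    """True only when all three conditions in the list above hold."""
    # Condition 2: nothing remains beyond the allowed identifier categories.
    only_allowed = profile.identifiers_present <= ALLOWED_IDENTIFIERS
    return profile.from_covered_entity and only_allowed and profile.has_hipaa_dua


example = DatasetProfile(True, {"dates", "zip_code"}, True)
print(qualifies_as_limited_data_set(example))
```

Note that an empty `identifiers_present` set also passes the check in this sketch; whether such data is better described as de-identified is outside its scope.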

At Sage, “Limited Data Set” is used broadly as Limited Data Sets are recognized benchmarks in de-identification in the U.S.; however, it is important to be aware of the regulatory applicability. Whereas de-identification of PHI (via the HIPAA Safe Harbor or Expert Determination methods) can convert data into a non-PHI state, Limited Data Sets remain as PHI with the DUA serving as the additional protection.

Generally, Limited Data Sets should always be categorized in the Controlled Access Data Tier.

Definition Source: 45 CFR 164.514(e)

72

Managed Access Requirement (AR)

An Access Requirement that requires data access to be granted via the Synapse Access and Compliance Team (ACT) and/or Data Access Committee (DAC).

*Applicability: Synapse

ACT often implements Managed ARs on data categorized in the Controlled Access Tier. Managed ARs often consist of:

  1. Data Access Application.

  2. One or more of the following: intended data use statement, IRB approval letter, or data use certificate.

  3. Requirement for data accessors to be registered, certified or validated.

73

Manuscript

A research paper or scholarly work that is submitted to a scientific journal for publication.

*Applicability: General

A manuscript typically contains original research findings, theoretical analysis, or a review of existing literature. Before being published, a manuscript goes through a peer-review process where it is evaluated by experts in the field.

74

Metadata

Data that provide additional information intended to make scientific data interpretable and reusable (e.g., date, independent sample and variable construction and description, methodology, data provenance, data transformations, any intermediate or descriptive observational variables).

*Applicability: NIH, General

Definition Source: NIH NOT-OD-21-013 (Data Sharing and Management Plans)

75

Peer Review

A process in which a submitted manuscript or research paper is evaluated by independent experts in the same field before it is accepted for publication in a journal.

*Applicability: General

The purpose of peer review is to ensure the quality, credibility, and validity of the research by subjecting it to scrutiny from knowledgeable professionals who are not involved in the work.

76

Personal Data

Personal data means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

*Applicability: GDPR

Uses: This definition should only be applied to GDPR-regulated data. See Identifiable Data/Information for more discussion and related terms.

Definition Source: GDPR Article 4

77

Personally Identifiable Information (PII)

Information that can be used to distinguish or trace an individual’s identity, either alone or when combined with other information that is linked or linkable to a specific individual.

(Because there are many different types of information that can be used to distinguish or trace an individual identity, the term PII is necessarily broad.)

*Applicability: General

PII is not officially defined by the U.S. government through legislative or regulatory bodies, but a definition has been offered through Office of Management and Budget (OMB) memoranda.

To determine whether information is PII, the OMB has recommended to executive agencies that they should perform assessments of the specific risk that an individual can be identified using the information with other information that is linked or linkable to the individual. This is because information that is not PII can become PII whenever additional information becomes available - in any medium or from any source - that would make it possible to identify an individual.

Definition Source: OMB M-17-12

78

Privacy Incident

An event where protected information is used or disclosed without authorization.

*Applicability: Synapse, General

79

Private Access

Private Project

A category of Synapse data only available to the Data Contributor (i.e., Project Administrator) and other users that they specify in the entity's Sharing Settings.

*Applicability: Synapse

Often, Private Data is managed via sharing through Synapse Teams.

80

Private Information

(1) Information about behavior that occurs in a context in which an individual can reasonably expect that no observation or recording is taking place, and

(2) Information that has been provided for specific purposes by an individual and that the individual can reasonably expect will not be made public (e.g., a medical record).

*Applicability: Common Rule

Uses: This definition may be used broadly.

Definition Source: 45 CFR 46.102(e)(4)

81

Project Materials

Project-specific governance documentation, e.g., agreements, amendments, memorandums of understanding, and related legal documents.

82

Protected Health Information (PHI)

Protected health information means individually identifiable health information:

(1) Except as provided in paragraph (2) of this definition, that is:

(i) Transmitted by electronic media;

(ii) Maintained in electronic media; or

(iii) Transmitted or maintained in any other form or medium.

(2) Protected health information excludes individually identifiable health information:

(i) In education records covered by the Family Educational Rights and Privacy Act, as amended, 20 U.S.C. 1232g;

(ii) In records described at 20 U.S.C. 1232g(a)(4)(B)(iv);

(iii) In employment records held by a covered entity in its role as employer; and

(iv) Regarding a person who has been deceased for more than 50 years.

*Applicability: HIPAA, General

Do not use this term to mean “Personal Health Information.”

Uses: Data is only PHI when it is regulated under HIPAA. This means that it was created and transmitted by a covered entity, and/or has either been transmitted to another covered entity or to a business associate (with a BAA in place). Sage is not a covered entity, but has, in some circumstances, served as a business associate.

HIPAA terminology has become commonplace when discussing health information used for research. Since health information is most often collected by or combined with data collected by covered entities (like hospitals and clinics), discussion of, and reference to PHI has served to keep a focus on data privacy and security, and the penalties that can arise when privacy rules are broken. Discussion of PHI also maintains a focus on de-identification processes, such as the removal of the 18 HIPAA identifiers, or use of Limited Data Sets.

At Sage, data will rarely meet the definition of being PHI when it is placed in Synapse. The exceptions are when Sage has signed a BAA, or if the data contributor is a covered entity and has put data in Synapse improperly.

Data may start as PHI (when Individually Identifiable Health Information [IIHI] is created by a covered entity and transmitted electronically), but through the process of compliant disclosure authorizations, releases, formal IRB-approved waivers, and/or de-identification procedures, PHI may be placed into Synapse and no longer meet the definition of PHI. Additionally, once data is transferred from a covered entity to a non-covered entity, HIPAA protections no longer apply.

In cases where PHI is put in Synapse “improperly,” this constitutes a privacy breach at the fault of the disclosing entity. These instances should be reported to Sage Governance for investigation and corrective action.

Definition Source: 45 CFR 160.103

83

Pseudonymization

(1) GDPR Applicability:

The processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.

(2) General Applicability:

Data where individual identifiers have been replaced by a code or pseudo (false) identifier.
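
As an illustration of the general definition, the sketch below replaces each identifier with a random code and returns the code key separately, so that the key can be stored apart from the coded data as the GDPR wording describes. The function name, the record layout, and the `participant_id` field are hypothetical, and this is not a description of how Sage or Synapse processes data.

```python
import secrets


def pseudonymize(records, id_field):
    """Replace the value of `id_field` in each record with a random code.

    Returns (coded_records, key). The key maps each real identifier to its
    code and is the "additional information" that would need to be kept
    separately and protected.
    """
    key = {}
    coded_records = []
    for record in records:
        real_id = record[id_field]
        if real_id not in key:
            # Random code with no relationship to the real identifier.
            key[real_id] = secrets.token_hex(8)
        # Copy the record so the input is left unchanged.
        coded_records.append({**record, id_field: key[real_id]})
    return coded_records, key


# Hypothetical example data.
records = [
    {"participant_id": "jane.doe", "score": 12},
    {"participant_id": "jane.doe", "score": 15},
    {"participant_id": "john.roe", "score": 9},
]
coded, key = pseudonymize(records, "participant_id")
```

Records from the same individual share one code, so the coded data can still be analyzed longitudinally; re-attributing the data to a person requires the key.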

*Applicability: GDPR

Uses: These definitions may be used broadly outside of the scope of GDPR.

Definition Source: GDPR Article 4

84

Publicly Accessible Data

Data are available to qualified researchers. It may include either data that are openly accessible and available for any use or data that are accessed in a controlled manner to protect appropriately certain interests, for example, the privacy of research subjects, intellectual property or security.

*Applicability: General

In some cases, “publicly accessible” data may include only “openly accessible” data.

Definition Source: NIH 3016 - Intramural Research Program Human Data Sharing (HDS) Policy

85

Registered User

Synapse user who has successfully created an account, has logged into Synapse using their email and password, and has agreed to the Synapse Pledge.

*Applicability: Synapse

Registered users can create projects and wikis. They can collaborate with other registered users and create Synapse teams. Registered users can also download publicly available data and, if they fulfill the Conditions for Use, they can also access controlled data.

86

Reliable Method (RM)

Internal process documents that provide detailed, step-by-step instructions for completing a task.

*Applicability: Governance Document Control

RMs are meant to elaborate on the more general instructions covered in SOP or Policy documents. Unlike SOPs or Policies, RMs are meant to be updated on a continual basis to best reflect the most reliable, comprehensive method for completing work.

87

Research

A systematic investigation, including research development, testing, and evaluation, designed to develop or contribute to generalizable knowledge.

*Applicability: Common Rule, HIPAA, General

Uses: This definition can be used broadly.

Definition Sources:

45 CFR 46.102(l)

45 CFR 164.501

88

Research Governance

Policies, processes, and structures that guide and oversee the research activities such as the research design, data collection, analysis, tools, methods, and dissemination. Research Governance:

1) ensures the ethical, responsible, and accountable conduct of research activities; and

2) protects the rights and well-being of research participants and maintains the integrity of the research process.

*Applicability: General

Who is involved: IRB, ethics committees

89

Scientific Data

The recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications. Scientific data do not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.

*Applicability: NIH, General

Definition Source: NIH NOT-OD-21-013 (Data Sharing and Management Plans)

90

Secondary Research

Reusing information or specimens that are collected for some other “primary” or “initial” activity for research purposes.

*Applicability: General

Secondary research will generally involve use of data or specimens that were collected for a reason other than the present research purpose. The “primary” or “initial” activity can be for research purposes or non-research purposes.

  • For example, research performed using medical records is an example of secondary research because the medical records data was collected for regular patient care. The “initial” activity in this case was for non-research purposes.

  • In another example, a researcher might collect data for a specific research purpose by consenting subjects and administering a validated assessment. Once that research study is completed, the researcher may store the data (if the subjects consented to future use of their data and an IRB approved the protocol) and another researcher may conduct secondary research analysis of the data for a different research study.

Definition Source: Preamble to 45 CFR 46 (82 F.R. 7191)

91

Secret Store

A secure service, such as LastPass, that stores sensitive information such as passwords, API keys, encryption keys, certificates, and other credentials, and that protects these secrets from unauthorized access, often through encryption and access controls.

*Applicability: General

92

Security Incident

A fault in the confidentiality, availability, or integrity of an information system.

*Applicability: Synapse, General

93

Security Incident Response Team (SIRT)

Sage workforce members who are responsible for the organizational response to incidents, including preparing for incidents, assessing risks, and maintaining the incident response process.

*Applicability: General

94

Sensitive Data

(1) General Applicability:

Data that must be protected from unauthorized access to safeguard the privacy or security of an individual or organization. This includes human data at risk of re-identification.

(2) GDPR Applicability:

The following personal data is considered sensitive:

  • personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs;

  • trade-union membership;

  • genetic data, biometric data processed solely to identify a human being;

  • health-related data;

  • data concerning a person’s sex life or sexual orientation.

(3) Veterans Affairs Applicability (for example purposes):

Sensitive personal information includes:
(A) Education, financial transactions, medical history, and criminal or employment history.
(B) Information that can be used to distinguish or trace the individual’s identity, including name, social security number, date and place of birth, mother’s maiden name, or biometric records.

*Applicability: GDPR, General

“Sensitivity” of information is highly subjective, and it is generally difficult to set a list of data elements that will reliably apply to every data set as a method to easily label information as “sensitive.” As a result, some governmental agencies choose to consider any personally identifiable information (PII) as “sensitive.”

Sage Governance processes may involve a risk-based approach to evaluating the sensitivity of data. This may include an analysis of the risk that the data could pose if data were re-identified, coupled with an analysis of the de-identification methods used to treat the data.

Definition Sources:

GDPR Article 4(13), (14) and (15), Article 9 and Recitals (51) to (56)

38 U.S.C. 5727(19)

95

Sharing Settings

Controls used by a Project Administrator to define and customize public or private access to a Synapse entity (Project, File, Folder, or Table). The Project Administrator also has the option to create "Local Sharing Settings," which allow different access customization for an entity within another entity (example: a parent Folder may have Sharing Settings that allow for "public" access, while a File within that parent Folder may have Local Sharing Settings restricting access to specific Users).

*Applicability: Synapse

Within Sharing Settings, Project Administrators can grant users view, download, edit, edit/delete, and administrator access.
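
The relationship between inherited Sharing Settings and Local Sharing Settings can be sketched as a lookup that walks up the containment hierarchy until it finds an entity with its own settings. This is a simplified model under stated assumptions: the `Entity` class, the `local_settings` attribute, and the `"PUBLIC"` principal name are hypothetical, and the sketch is not the Synapse implementation or its client API.

```python
from typing import Dict, Optional


class Entity:
    """Hypothetical stand-in for a Project, Folder, File, or Table."""

    def __init__(
        self,
        name: str,
        parent: Optional["Entity"] = None,
        local_settings: Optional[Dict[str, str]] = None,
    ):
        self.name = name
        self.parent = parent
        # Maps a user or group to an access level such as "view" or "download".
        # None means the entity has no Local Sharing Settings of its own.
        self.local_settings = local_settings


def effective_settings(entity: Entity) -> Dict[str, str]:
    """Return the nearest settings found on the entity or its ancestors."""
    node: Optional[Entity] = entity
    while node is not None:
        if node.local_settings is not None:
            return node.local_settings
        node = node.parent
    return {}  # nothing defined anywhere in the hierarchy


# Mirrors the example in the definition: a public parent Folder that
# contains one File restricted to a specific user.
folder = Entity("shared_folder", local_settings={"PUBLIC": "download"})
open_file = Entity("summary.csv", parent=folder)
restricted_file = Entity(
    "raw.csv", parent=folder, local_settings={"alice": "download"}
)
```

Here `effective_settings(open_file)` resolves to the folder's public settings, while `effective_settings(restricted_file)` resolves to the file's own, narrower settings.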

96

Signing Official

Institutional Signing Official

(1) General:

An employee affiliated with the respective organization who has oversight authority.

(2) NIH:

An Institutional Signing Official is generally a senior official at an institution who is credentialed through the NIH eRA Commons system and is authorized to enter the institution into a legally binding contract and sign on behalf of an investigator who has submitted data or a data access request to NIH.

*Applicability: Synapse, General, NIH

A Data Use Certificate (DUC) or Data Ingress / Egress Agreement may require a Signing Official's signature to validate the document. This term is not synonymous with “Institutional Official.”

For DUCs: Generally, the Signing Official should be a person meeting the following criteria:

  • Has oversight authority over the data requestor,

  • Is responsible for ensuring appropriate and ethical use of the Data by the data requestor, and

  • Is not a member of the study team (as this would introduce a conflict of interest).

For a DUC, the Signing Official role is generally best filled by a Department Head (or someone in a similar position), because the role calls for close oversight of the requestor.

For data ingress agreements (e.g., DTAs, DUAs, MOUs, etc.): A Signing Official must have institutional authority to enter their institution into legally binding contracts. For this reason, the Signing Official is typically a designee in a Grants & Contracts office (or similar).

For NIH Data Sharing Policy: The NIH requires additional credentialing and authority.

Definition Source (NIH): NOT-OD-14-124 Genomic Data Sharing Policy

97

Synthetic Data

Artificially generated data that mimics real-world data and is created using algorithms and simulations rather than collected from real-life events or observations.

*Applicability: General

98

Team (in Synapse)

Multiple Synapse users accepted into a group for the purpose of controlling access to projects, facilitating communication within Synapse, and/or allowing participation in Challenges.

*Applicability: Synapse

Teams can be used to share Synapse entities with multiple users at once. Access Requirements can be implemented on Synapse teams or directly on Synapse entities.

Synapse Docs > Collaborating in Synapse > Teams

99

Team Manager (Synapse)

Role for Synapse Team member(s) with authority to invite or remove team members and the ability to edit Synapse Team settings.

*Applicability: Synapse

Synapse Docs > Collaborating in Synapse > Teams

100

Two-Factor Authentication (2FA)

A security method that requires two different forms of identification to access data or resources, which can help protect against phishing, social engineering, password brute-force attacks, and weak or stolen credentials.

*Applicability: Synapse, General

Adding Two-Factor Authentication (2FA) to your Synapse account

101

Unlinked Data

Data that were initially collected with identifiers but, before research use, have been irreversibly stripped of all identifiers by use of an arbitrary or random alphanumeric code and the key to the code is destroyed, thus making it impossible for anyone to link the samples to the sources. This does not preclude linkage with existing clinical, pathological, and demographic information so long as all individual identifiers are removed prior to distribution or receipt.

*Applicability: General

Definition Source: NIH 3016 - Intramural Research Program Human Data Sharing (HDS) Policy

102

Validated User

Synapse user who has created a Synapse ID, has logged into Synapse using their email and password, has successfully completed the Certification Quiz, and has had their profile and identity validated by Sage Access and Compliance Team.

*Applicability: Synapse

The process of becoming a Validated User enables greater transparency within the research community, which promotes a reciprocal relationship between the Synapse user and the data participants and contributors. Validated Users are eligible to request access to specific controlled-access data and to Bridge data.

To become a Validated User, a Certified User must establish their identity by providing to the Sage Access and Compliance Team (ACT) a combination of Synapse profile information, ORCID profile information, a signed Synapse Pledge, and an external credential.

103

Violation

Any behavior or action that is not compliant with the Synapse Terms and Conditions of Use, Privacy Policy, or Community Standards.

*Applicability: Synapse

104

Whitelisting

The act of making data available on Synapse as Anonymous Access, a category of data available for download by anyone on the web without requiring them to login to a Synapse account or fulfill Conditions for Use.

*Applicability: Synapse