The periodic audit of Synapse activity is intended to surface potential threat scenarios concerning the privacy and security of data held in Synapse. The approach to this audit is informed by an assessment of risks to priority data, such as the data sets associated with Synapse projects marked with restricted access control lists. The risk assessment process considers access control at three points: when access is granted, when access is used, and when access may become uncontrolled.

Auditing may be done by analyzing a representative sample of activity or a comprehensive report of activity over the audit period. A comprehensive report is preferred when the queries driving the report can be targeted to precisely address the threat scenario. Sampling is used as an alternative when comprehensive reporting is not feasible to address a given audit query, such as for activity common to all users within the application.
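Where sampling is used, the sample should be reproducible so that a reviewer can re-draw the same records later. The following is a minimal sketch, assuming activity records have already been exported as a list; the function name `sample_activity` and the fixed seed are illustrative assumptions, not part of the Synapse audit tooling.

```python
import random

def sample_activity(records, k, seed=0):
    """Draw a reproducible random sample of up to k activity records."""
    rng = random.Random(seed)      # fixed seed so the same sample can be re-drawn
    records = list(records)
    return rng.sample(records, min(k, len(records)))

# Hypothetical activity records: (user_id, action)
activity = [(user_id, "download") for user_id in range(100)]
first_draw = sample_activity(activity, 10, seed=42)
second_draw = sample_activity(activity, 10, seed=42)
```

Recording the seed alongside the audit data lets the same sample be regenerated during review.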

Threat Scenarios

Data access

Synapse implements an access control system based on the properties of the profile attempting to access information and on properties of the data set itself. An account must be validated in order to access controlled use data, and an access restriction must be in place on data with a controlled use classification in order to implement this access control.

Threat: A Synapse user intentionally or inadvertently accesses controlled data without qualification of their account

Identify through data warehouse query and end user reporting:

  • Users who have posted or accessed controlled data without the appropriate validation property on their account.

  • Confirmation that users whose access should have been removed at a prior time no longer have access.
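As a minimal sketch of the first check, assuming access records have already been exported from the data warehouse as simple rows, the filter could look like the following. The `AccessRecord` type and its field names are hypothetical and do not correspond to actual warehouse columns.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRecord:
    user_id: int
    file_id: int
    is_controlled: bool   # data carries a controlled use classification
    is_validated: bool    # account had the validation property at access time

def unvalidated_controlled_access(records):
    """Return sorted user IDs that touched controlled data without validation."""
    return sorted({r.user_id for r in records
                   if r.is_controlled and not r.is_validated})

records = [
    AccessRecord(1, 100, True, True),    # validated user, controlled data: fine
    AccessRecord(2, 100, True, False),   # unvalidated user, controlled data: flag
    AccessRecord(3, 200, False, False),  # unvalidated user, open data: fine
]
flagged = unvalidated_controlled_access(records)
```

Only user 2 is flagged here, because that is the only row combining controlled data with an unvalidated account.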

Threat: A Synapse user with significant access to data intentionally or inadvertently shares access

Identify through data warehouse query and end user reporting:

  • A single file downloaded multiple times by a single user

Associated queries: MD5 duplicates, Restriction change of state, Top downloaders
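The repeated-download signal can be sketched as a simple count over (user, file) pairs. This is an illustration only, assuming download events are available as tuples; the function name and the default threshold of two downloads are assumptions.

```python
from collections import Counter

def repeated_downloads(downloads, threshold=2):
    """Keep (user_id, file_id) pairs seen at least `threshold` times."""
    counts = Counter(downloads)
    return {pair: n for pair, n in counts.items() if n >= threshold}

# Hypothetical download events: (user_id, file_id)
downloads = [(7, "fileA"), (7, "fileA"), (7, "fileB"), (8, "fileA"), (7, "fileA")]
repeats = repeated_downloads(downloads)
```

A repeated download is not a violation on its own; it is a prompt for follow-up with the user or project owner.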

Data handling

Synapse allows end users to upload data once they have certified their account through a training module. The certification process is an administrative control that trains users on appropriate data handling procedures. Once granted data upload rights, an end user is expected to respect the permission sets associated with the data sets they handle.

Threat: A Synapse superuser intentionally or accidentally copies or uploads a controlled data set without appropriate access controls

Identify through data warehouse query and end user reporting:

  • PHI accidentally/intentionally released without appropriate conditions

  • Original terms of data contribution are not respected: data is proliferated into Synapse beyond the original terms of use

  • Public Synapse spaces contain only data classified as public (audit through automation)

Associated queries: MD5 duplicates, Restriction change of state
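The MD5 duplicates query itself is not shown on this page. As a hedged sketch of the idea behind it, the following groups file records by content hash and flags any hash that appears in both a public and a non-public location. The record layout `(file_id, md5, is_public)` and the function name are assumptions.

```python
from collections import defaultdict

def md5_conflicts(files):
    """Map md5 -> sorted file IDs for content present in both public and non-public locations."""
    by_md5 = defaultdict(list)
    for file_id, md5, is_public in files:
        by_md5[md5].append((file_id, is_public))
    conflicts = {}
    for md5, entries in by_md5.items():
        visibility = {is_public for _, is_public in entries}
        if visibility == {True, False}:   # same content, mixed access levels
            conflicts[md5] = sorted(file_id for file_id, _ in entries)
    return conflicts

# Hypothetical file records
files = [
    ("syn1", "aaa", False),
    ("syn2", "aaa", True),    # same content as syn1 but public
    ("syn3", "bbb", True),
    ("syn4", "ccc", False),
    ("syn5", "ccc", False),   # duplicate, but both copies are non-public
]
conflicts = md5_conflicts(files)
```

Duplicates that share the same access level (such as `syn4` and `syn5`) are not flagged by this sketch, since they do not indicate a loss of access control.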

Data loss

A Synapse account may be permitted to access many data sets of differing classifications. An incident of account sharing or account compromise may result in the download of a data set beyond what is intended according to an access restriction.

Threat: A Synapse account with extensive access to controlled data sets may be compromised

Identify through data warehouse query and end user reporting:

  • Exfiltration of data from Synapse, indicated by large-scale download activity by a single user

Associated queries: Restriction change of state, Top downloaders

Audit Constraints

The Synapse audit approach was revised in 2020 to focus on specific threats identified through a risk assessment process. Automated queries were designed to report on the activity related to each threat.

The audit reports are limited by the time spans available to the automated queries. Some queries are based on changes to properties of objects and a query may not be able to compare an event with activity outside of its observation window. In these cases, the query will not surface a conflict between the event and a prior state.
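The observation-window constraint can be illustrated with a small sketch. It assumes that each object has a list of timestamped state snapshots and that snapshots outside the available time span are simply absent; the function name and the state labels are hypothetical.

```python
def state_change(snapshots, window_start):
    """Compare the last snapshot before `window_start` with the most recent one.

    Returns (old_state, new_state) when they differ, or None when they match
    or when no snapshot exists before the window to serve as a baseline.
    """
    if not snapshots:
        return None
    ordered = sorted(snapshots, key=lambda s: s[0])
    baseline = [state for ts, state in ordered if ts < window_start]
    if not baseline:
        return None   # no prior state available, so no conflict can be surfaced
    old_state, new_state = baseline[-1], ordered[-1][1]
    return (old_state, new_state) if old_state != new_state else None

# Baseline is available: the change from restricted to public is reported
seen = state_change([(10, "restricted"), (50, "public")], window_start=30)
# Earlier "restricted" snapshot is outside the available span: nothing is reported
missed = state_change([(40, "public")], window_start=30)
```

The second call returns nothing even though the object may have been restricted earlier, which is the limitation described above.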

Data warehouse queries

Restriction change of state

Code Block
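# Report nodes whose access flags differ between their most recent snapshot
# and their latest snapshot before the cutoff date.
#select t1.ID, t1.IS_CONTROLLED, t1.IS_RESTRICTED, t1.IS_PUBLIC, t2.IS_CONTROLLED, t2.IS_RESTRICTED, t2.IS_PUBLIC
select t1.*, t2.*
from (
    select ns2.*
    from NODE_SNAPSHOT ns2
    join (
        # most recent snapshot per node
        select ns1.ID, max(ns1.TIMESTAMP) as MAX_TIMESTAMP
        from NODE_SNAPSHOT ns1
        group by ns1.ID
    # match on the timestamp as well, so only the latest row per node is kept
    ) nsmax1 on nsmax1.ID=ns2.ID and nsmax1.MAX_TIMESTAMP=ns2.TIMESTAMP
) t1
join (
    select ns2.*
    from NODE_SNAPSHOT ns2
    join (
        # latest snapshot per node before the cutoff date (e.g. a month ago)
        select ns1.ID, max(ns1.TIMESTAMP) as MAX_TIMESTAMP
        from NODE_SNAPSHOT ns1
        where ns1.TIMESTAMP < unix_timestamp('2019-09-01 00:00:00')*1000
        group by ns1.ID
    ) nsmax1 on nsmax1.ID=ns2.ID and nsmax1.MAX_TIMESTAMP=ns2.TIMESTAMP
) t2 on t2.ID=t1.ID and t2.VERSION_NUMBER=t1.VERSION_NUMBER
# keep only nodes where at least one access flag changed
where not (t1.IS_PUBLIC = t2.IS_PUBLIC and t1.IS_CONTROLLED = t2.IS_CONTROLLED and t1.IS_RESTRICTED = t2.IS_RESTRICTED)
limit 100
;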

Top downloaders

Code Block
# top 20 downloaders by count(filehandle_id)
select fhdr.USER_ID, count(*) as c
from FILE_HANDLE_DOWNLOAD_RECORD fhdr
where fhdr.TIMESTAMP between unix_timestamp('2019-07-01 00:00:00')*1000 and unix_timestamp('2019-09-10 00:00:00')*1000
group by fhdr.USER_ID
order by c desc
limit 20;

...

generated by running queries that precisely target privacy threat scenarios.

Overview

The Synapse audit should occur twice a year, once in July and once in January. Each audit should contain data from the two quarters prior to the data pull. The purpose of the audit is to ensure that there have not been any data breaches or security risks during the respective audit period.
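The audit period can be derived from the date of the data pull. The sketch below assumes "the two quarters prior to the data pull" means the two full calendar quarters before the quarter in which the pull happens; that interpretation and the function name `audit_window` are assumptions.

```python
from datetime import date

def audit_window(pull_date):
    """Return (start, end) for the two full quarters before the pull; end is exclusive."""
    quarter_start_month = 3 * ((pull_date.month - 1) // 3) + 1
    end = date(pull_date.year, quarter_start_month, 1)
    # Step back six months (two quarters) from the exclusive end date
    month_index = end.year * 12 + (end.month - 1) - 6
    start = date(month_index // 12, month_index % 12 + 1, 1)
    return start, end

january_audit = audit_window(date(2021, 1, 10))
july_audit = audit_window(date(2021, 7, 5))
```

Under this reading, a January 2021 pull covers 1 July 2020 up to (but not including) 1 January 2021, and a July 2021 pull covers 1 January 2021 up to 1 July 2021.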

An audit report is generated during each audit to analyze the data and explain whether there have been any security breaches or privacy concerns. The Governance Regulatory Support Team should submit the audit report to WIRB annually during the Synapse continuing review, which occurs in October.

For more details, please reference the following pages:

Child pages (Children Display)

Audit Timeline

When: First two weeks of January and July

Who: Synapse Security Engineer

What: Run Automation

  • Pull MD5 Duplicate Data, State Change Data, Top Downloader Data from past 2 quarters

  • Post data files onto Synapse

  • Email data files to ACT@sagebionetworks.org

Reference “Engineering Audit Resources” page for details

When: Second two weeks of January and July

Who: Synapse ACT

What: Sort Data & Triage Threats

  • Sort MD5 Duplicate Data, State Change Data, & Top Downloader Data

  • Review top 20 downloaders and reach out to Community Managers, project owners, or Sage employees regarding potential security or privacy threats if any are suspected.

  • Document responses and note resolutions

Reference the “Audit Details for ACT” page for details

When: Mid September

Who: Synapse Security Engineer and Synapse ACT

What: Generate Audit report following this template

  • Synapse ACT will enter information based on the past two audit cycles. For example, the October 2021 report will contain data from the January 2020 and July 2020 audits

  • Synapse ACT will tag the Synapse Security Engineer for review

  • Once the draft is finalized, Synapse ACT will email the Director of Governance for final review

Reference the “Audit Report” page for details

When: Late September

Who: Director of Governance (Christine)

What: Review and Approve/Reject Audit Report

  • Synapse ACT will email the finalized draft to the Director of Governance and make modifications if necessary

When: October

Who: Synapse Security Engineer and Governance Regulatory Support Team

What:

  • Security Engineer: Submit Audit Report to HITRUST

  • Governance Regulatory Support Team: Submit Audit Report to WIRB during Synapse Continuing Review

Reference the “Audit Report” page for details