
Synapse Dependency Vulnerability Alerts Strategy

This document will address PLFM-6419.

Summary of Issue

GitHub provides the “Dependabot” service on our repositories, where a project’s dependencies will be scanned to see if any dependencies are vulnerable to known security issues. If a security issue is discovered, engineers with sufficient permissions will see an alert on the repository page, and may receive an email notification. If possible, Dependabot may also create a pull request.

Because the alerts are automated and based only on a project's dependency tree, a given security issue may not actually affect our products. For example, we may not be using the vulnerable feature, or we may only pass restricted input to a dependency whose vulnerability can only be exploited with unrestricted input. Assessing the true risk of a security alert may therefore require an engineer to investigate and determine an appropriate response.

I’ve tried to collect current (Feb. 2022) stats on Dependabot in our repos, but this information may be incomplete:

Considerations

  • We have many active repositories, so it would be more valuable to see vulnerability alerts for multiple projects at once.

Synapse-Repository-Services

https://github.com/Sage-Bionetworks/Synapse-Repository-Services

Synapse-Stack-Builder

https://github.com/Sage-Bionetworks/Synapse-Stack-Builder

Synapse-Warehouse-Workers

https://github.com/Sage-Bionetworks/Synapse-Warehouse-Workers

worker-utilities

https://github.com/Sage-Bionetworks/worker-utilities

Synapse-Migration-Utility

https://github.com/Sage-Bionetworks/Synapse-Migration-Utility

SynapseWorkflowOrchestrator

https://github.com/Sage-Bionetworks/SynapseWorkflowOrchestrator

file-proxy

https://github.com/Sage-Bionetworks/file-proxy

Synapse-Sftp-Proxy

https://github.com/Sage-Bionetworks/Synapse-Sftp-Proxy

aws-utilities

https://github.com/Sage-Bionetworks/aws-utilities

database-semaphore

https://github.com/Sage-Bionetworks/database-semaphore

schema-to-pojo

https://github.com/Sage-Bionetworks/schema-to-pojo

Synapse-User-Geolocation

https://github.com/Sage-Bionetworks/Synapse-User-Geolocation

SimpleHttpClient

https://github.com/Sage-Bionetworks/SimpleHttpClient

JSON-java

https://github.com/Sage-Bionetworks/JSON-java

EvaluationStatistics

https://github.com/Sage-Bionetworks/EvaluationStatistics

common-utilities

https://github.com/Sage-Bionetworks/common-utilities

ChallengeDockerAgent

https://github.com/Sage-Bionetworks/ChallengeDockerAgent

csv-utilities

https://github.com/Sage-Bionetworks/csv-utilities

database-utils

https://github.com/Sage-Bionetworks/database-utils

QuizTextToJSONConverter

https://github.com/Sage-Bionetworks/QuizTextToJSONConverter

service-redirect

https://github.com/Sage-Bionetworks/service-redirect

url-signer

https://github.com/Sage-Bionetworks/url-signer

portals

https://github.com/Sage-Bionetworks/portals

Synapse-React-Client

https://github.com/Sage-Bionetworks/Synapse-React-Client

synapse-oauth-signin

https://github.com/Sage-Bionetworks/synapse-oauth-signin

markdown-it-synapse-server

https://github.com/Sage-Bionetworks/markdown-it-synapse-server

markdown-it-synapse

https://github.com/Sage-Bionetworks/markdown-it-synapse

SynapseWebClient

https://github.com/Sage-Bionetworks/SynapseWebClient

GwtVisualizationWrappers

https://github.com/Sage-Bionetworks/GwtVisualizationWrappers

react-base-table

https://github.com/Sage-Bionetworks/react-base-table

nbconvert-webapp

https://github.com/Sage-Bionetworks/nbconvert-webapp

gwtbootstrap3

https://github.com/Sage-Bionetworks/gwtbootstrap3

gwtbootstrap3-extras

https://github.com/Sage-Bionetworks/gwtbootstrap3-extras

  • We would like to integrate the strategy into our existing SDLC cadences (e.g. addressing vulnerabilities at the weekly Stack Release Meeting), rather than sending additional notifications that could just be ignored.

  • It would be valuable for our approach to be easily adopted by other teams at Sage. Most of the technical approaches below could be easily modified to look at a different collection of repositories or specific GitHub team.

Proposals

I’ve proposed a few different options and summarized what I view to be the work required to accomplish each one. These are not detailed estimates and are subject to change.

Option 1: Build “Dependabot Dashboard” to be reviewed in weekly Stack Release meeting

In our Redash Stack Review Dashboard, add a table that is a collection of all of the un-dismissed Dependabot alerts across all repos. In the meeting, create Jira tickets to address any un-dismissed alerts, and then dismiss the alerts so they will be removed from the view in the future.

Prioritization of these tickets is at the discretion of the product manager, who may use the issue severity reported by GitHub or Synapse Engineers' estimated risk.

Technical Requirements:

  • GitHub App + granted permissions on Sage-Bionetworks GitHub Organization

  • Redash upgraded to recent version

  • Code to fetch and process security alert data on GitHub
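
As a rough illustration of the "process" step for the dashboard table, the sketch below flattens alert records into rows and sorts them by severity. It assumes each alert is a dict shaped like the GitHub REST API's organization-level Dependabot alert response (nested `repository`, `dependency` and `security_advisory` objects); the function name and the row columns are hypothetical and would need to be checked against real API output.

```python
from typing import Any, Dict, List

# Sort order for the severities reported on GitHub advisories.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def to_dashboard_rows(alerts: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Flatten open alerts into table rows, most severe first."""
    rows = []
    for alert in alerts:
        if alert.get("state") != "open":
            continue  # dismissed and fixed alerts stay out of the view
        advisory = alert.get("security_advisory", {})
        package = alert.get("dependency", {}).get("package", {})
        rows.append({
            "repository": alert.get("repository", {}).get("full_name", ""),
            "package": package.get("name", ""),
            "severity": advisory.get("severity", "unknown"),
            "summary": advisory.get("summary", ""),
            "url": alert.get("html_url", ""),
        })
    # Unknown severities sort last; ties are ordered by repository name.
    rows.sort(key=lambda row: (
        SEVERITY_ORDER.get(row["severity"], len(SEVERITY_ORDER)),
        row["repository"],
    ))
    return rows
```

Filtering on `state` here mirrors the "dismiss the alert once a ticket exists" workflow: a dismissed alert simply drops out of the table on the next refresh.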

Option 2: Assemble all Alerts in Email Digest

Send an email to Synapse Engineers at regular intervals (e.g. weekly) containing a digest of all security alerts that have not been dismissed on relevant repositories.

It would be the responsibility of the engineers or the PM to create tickets based on the email content.

Technical Requirements:

  • GitHub App + granted permissions on Sage-Bionetworks GitHub Organization

  • Code to fetch and process vulnerability data on GitHub, send email digest
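
A minimal sketch of the digest step, using only the Python standard library. It assumes the alerts have already been fetched and reduced to flat dicts with `repository`, `package`, `severity` and `url` keys; those keys, the function name, and the subject line are all hypothetical choices for illustration.

```python
from email.message import EmailMessage
from typing import Dict, List


def build_digest(alerts: List[Dict[str, str]], sender: str,
                 recipients: List[str]) -> EmailMessage:
    """Build a plain-text digest with one line per alert, grouped by repository."""
    by_repo: Dict[str, List[Dict[str, str]]] = {}
    for alert in alerts:
        by_repo.setdefault(alert["repository"], []).append(alert)

    lines = [f"{len(alerts)} open Dependabot alert(s) in {len(by_repo)} repo(s).", ""]
    for repo in sorted(by_repo):
        lines.append(repo)
        for alert in by_repo[repo]:
            lines.append(f"  [{alert['severity']}] {alert['package']}: {alert['url']}")
        lines.append("")

    message = EmailMessage()
    message["Subject"] = f"Dependabot digest: {len(alerts)} open alert(s)"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content("\n".join(lines))
    return message
```

The returned message could then be handed to `smtplib.SMTP(...).send_message(message)` from whatever scheduled job runs the digest; the mail server and schedule are left open here.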

Option 3: Automatically File Jira Issues for new Alerts

Similar to Option 2, but instead of sending an email, a job would automatically create issues in Jira. In that case we may want a more refined process for handling the vulnerability alerts (e.g. 10 alerts for the same vulnerability shared across 10 repos should result in one Jira ticket instead of 10). The job could run daily or weekly.

Technical Requirements:

  • GitHub App + granted permissions on Sage-Bionetworks GitHub Organization

  • Code to fetch and process vulnerability data on GitHub, create Jira issues

(https://blog.developer.atlassian.com/creating-a-jira-cloud-issue-in-a-single-rest-call/)
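
The "one ticket per shared vulnerability" idea could be sketched as follows. Alerts are grouped by advisory ID, and one request body is built per group for Jira's version 2 create-issue endpoint (`POST /rest/api/2/issue`), where the description is plain text. The flat alert shape (`ghsa_id`, `repository`), the function names, and the "Bug" issue type are assumptions made for illustration only.

```python
from typing import Any, Dict, List


def group_by_advisory(alerts: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each advisory ID to the sorted list of repositories it affects."""
    grouped: Dict[str, set] = {}
    for alert in alerts:
        grouped.setdefault(alert["ghsa_id"], set()).add(alert["repository"])
    return {ghsa_id: sorted(repos) for ghsa_id, repos in grouped.items()}


def jira_issue_payload(project_key: str, ghsa_id: str,
                       repos: List[str]) -> Dict[str, Any]:
    """Build the JSON body for one Jira issue covering every affected repo."""
    return {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Bug"},  # assumed issue type
            "summary": f"Dependabot: {ghsa_id} affects {len(repos)} repo(s)",
            "description": "Affected repositories:\n"
                           + "\n".join(f"- {repo}" for repo in repos),
        }
    }
```

A real job would also need to remember which advisories already have a ticket (for example by searching Jira for the advisory ID) so that a daily run does not file duplicates; that bookkeeping is omitted here.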

Option 4: Manually Inspect Alerts at Regular Cadence

This solution would not require any technical overhead.

For all active repositories, one or more members of the team will inspect the Dependabot alerts and file issues in Jira to address each alert. Issues could be prioritized by the product manager based on preliminary information, such as vulnerability severity score.

Since this would take a fair amount of time, this would likely happen less frequently than the other proposals. It is likely that we would not be able to respond as quickly to severe issues.

Technical Requirements

Brief description and justification of the individual technical requirements mentioned in the above proposals.

GitHub App

To collate all of the vulnerability data, our best option would be to create a GitHub App that can be installed on the Sage-Bionetworks GitHub Organization. The GitHub App should be scoped to have only the permissions required to view security alerts.

We should use a GitHub App rather than the GitHub OAuth integration to ensure we can see all security alerts, and not just the alerts visible to the logged-in user. We should use a GitHub App over a Personal Access Token for the same reason, plus we should not tie an individual’s account to this process, in case the individual leaves the organization.
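
Once the App is installed and has exchanged its credentials for an installation access token, fetching the alerts might look roughly like the sketch below. It assumes the organization-level Dependabot alerts endpoint of the GitHub REST API (`GET /orgs/{org}/dependabot/alerts`) and that the App's read permission on Dependabot alerts is sufficient to call it; both should be confirmed against the current GitHub documentation. Token generation and pagination are deliberately left out, and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Callable, Dict, List

API_ROOT = "https://api.github.com"


def build_alerts_request(org: str, installation_token: str) -> urllib.request.Request:
    """Build (but do not send) a request for an organization's open alerts."""
    query = urllib.parse.urlencode({"state": "open", "per_page": 100})
    return urllib.request.Request(
        f"{API_ROOT}/orgs/{org}/dependabot/alerts?{query}",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {installation_token}",
        },
    )


def fetch_alerts(request: urllib.request.Request,
                 opener: Callable[..., Any] = urllib.request.urlopen
                 ) -> List[Dict[str, Any]]:
    """Send the request and decode the JSON array of alerts.

    The opener is injectable so the function can be exercised without
    network access.
    """
    with opener(request) as response:
        return json.load(response)
```

Keeping the token as a function argument (rather than reading it inside the function) makes it easier to move credential handling to wherever it is safest to store them.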

Upgrade Redash

In newer versions of Redash (the visualization tool that is used for the Data Warehouse), you can create a data source for running arbitrary Python code, which we could use to fetch and process vulnerability data from the GitHub API.

We need to upgrade Redash at some point, so while this isn’t additional work, the priority of this task may be affected by relying on it here.

We must spend more time considering the risks of allowing arbitrary Python code execution in Redash. Here are some relevant details:

  • GitHub App credentials will likely be supplied as environment variables (which could simply be printed by a user)

  • Redash is in the VPN, so all users will be employees.

  • We may be able to configure permissions in Redash to restrict access to the Python data source.

A possible alternative to a Python data source is the JSON Data Connector. We could move all of the sensitive credentials and operations to a more secure environment (e.g. Jenkins or AWS), and then publish a JSON document containing Dependabot alert information. The document could then be loaded into Redash. Note that if the file needs to be both secured and reachable over the internet, the JSON data connector only supports HTTP Basic auth.
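
For the JSON alternative, the job running in the more secure environment would only need to serialize the processed alerts. A minimal sketch, assuming the alerts have already been reduced to a list of flat row dicts; the document layout (`generated_at` plus an `alerts` array) and the function name are hypothetical.

```python
import json
from datetime import datetime
from typing import Any, Dict, List


def build_alert_document(rows: List[Dict[str, Any]], generated_at: datetime) -> str:
    """Serialize alert rows into a JSON document that a dashboard can load."""
    document = {
        # The timestamp lets dashboard viewers see how stale the data is.
        "generated_at": generated_at.isoformat(),
        "alerts": rows,
    }
    return json.dumps(document, indent=2, sort_keys=True)
```

The resulting string could be uploaded wherever Redash can reach it. The document contains no credentials, only the alert data itself, which is the main point of this approach.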

Other Suggestions

I am including the following suggestions, which are relevant but do not directly achieve the goal of addressing Dependabot issues:

  • Maintain a list of active repositories that a particular team is responsible for

    • We could do this by splitting “Synapse-Developers” GitHub team into “Synapse-Developers” and “Bridge-Developers”. Consider all non-archived repos owned by “Synapse-Developers” to be the active list (we can get this info from the GitHub API).

  • Consider enabling CI runs for pull requests made by Dependabot. This could make it easy to apply certain vulnerability fixes with confidence that the upgrade is unlikely to cause a regression.

  • Consider enabling GitHub CodeQL code scanning on all repositories. Here’s an example of a CodeQL alert on our Portals repository that seemed helpful, though happened to not be applicable.
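
The "active list" in the first suggestion above could be derived with a small filter like the one below. It assumes the input is the list of repository objects returned by the GitHub REST API for a team's repositories, each carrying `full_name` and `archived` fields; the function name is hypothetical.

```python
from typing import Any, Dict, List


def active_repositories(team_repos: List[Dict[str, Any]]) -> List[str]:
    """Return the full names of a team's non-archived repositories, sorted."""
    return sorted(
        repo["full_name"]
        for repo in team_repos
        if not repo.get("archived", False)
    )
```

Because the list is computed from team ownership rather than maintained by hand, archiving a repository or moving it to another team would remove it from the alert view without any further changes.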