Executing Operational Tasks
The Synapse Engineering Team regularly runs operational tasks. These include major weekly tasks, such as deploying the system and migrating data, and a variety of minor, ad hoc tasks such as removing users, white-listing OAuth clients, and making select data sets publicly accessible.
GitHub workflows are a standard technology for such tasks, but two requirements make them challenging to use here: (1) some tasks make requests to administrative endpoints which, for security reasons, may only be invoked from within the Synapse private subnet, and (2) tasks can run for up to 30 hours. (The longest operational task is data migration.) Together, these two requirements preclude using GitHub-hosted runners. Moreover, the team needs to execute tasks in both the Synapse “dev” and “prod” environments. To avoid catastrophic errors, we must ensure that the “ops” dashboards for running tasks in the two environments are well separated.
To meet the above requirements we use GitHub workflows with:
Self-hosted runners, deployed within the Synapse (dev or prod) subnet;
Cross-repository workflow invocation to avoid duplication of workflows;
GitHub OIDC integration to selectively authorize GitHub repositories to access dev or prod accounts;
AWS Secrets Manager to authorize the aforementioned self-hosted runners to access administrative Synapse endpoints.
The arrangement of the GitHub repositories containing the relevant workflows (regular boxes) and the AWS accounts (curved boxes) is shown below. The dev and prod repositories invoke cross-repo workflows (“run”), and the core repository deploys self-hosted runners to AWS using CDK and runs tasks there.
The self-hosted runners are deployed within the VPC of their respective Synapse stacks. Each AWS account (synapse-dev and synapse-prod) is configured with an administrative access token, stored in AWS Secrets Manager. The administrative tasks, running on the self-hosted runner, retrieve the administrative token from Secrets Manager at run time.
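The run-time token lookup described above might be sketched as follows. This is a minimal illustration, not the team's actual task code: the function name `fetch_admin_token` and the secret identifier are hypothetical, and the client is passed in so the function can be exercised without AWS access. It relies only on the Secrets Manager `get_secret_value` call, which returns the secret text under the `SecretString` key.

```python
def fetch_admin_token(secrets_client, secret_id):
    """Return the administrative token stored under secret_id.

    secrets_client is expected to behave like a boto3 Secrets Manager
    client; secret_id is a hypothetical name, not taken from the source.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    token = response.get("SecretString")
    if not token:
        # Binary or empty secrets are not usable as an access token.
        raise RuntimeError(f"Secret {secret_id!r} has no string value")
    return token
```

On a self-hosted runner, a task would presumably call this with `boto3.client("secretsmanager")`, so the token is read when the task runs and never has to be stored in the GitHub repository.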
Each Ops repository is configured with:
Repository Secret:
ADMIN_READ_ORG_PAT: A GitHub fine-grained personal access token which allows reviewing and creating self-hosted runners in the repository. It must have the following permissions:
Repository Permission: Administration / Read and Write
Organization Permission: Self-Hosted Runners / Read and Write
The PAT should be scoped to this repository ONLY.
Repository Variables:
CONTEXT: dev or prod
ROLE_TO_ASSUME: AWS role enabled for the repository via OIDC integration
SYNAPSE_HOST: https://repo-dev.dev.sagebase.org or https://repo-prod.prod.sagebase.org
SYNAPSE_DEPLOYMENT_ROLE: a role for the self-hosted runner to assume which allows it to deploy AWS infrastructure
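Because mixing up dev and prod settings is the main risk here, a task could check the repository variables before doing any work. The sketch below is illustrative only. It assumes the workflow exposes the four repository variables to the task as environment variables of the same names, which the source does not state, and the function name `load_ops_config` is hypothetical. The two host URLs are the ones listed above.

```python
import os

# Expected Synapse host for each context, as listed in the repository variables.
EXPECTED_HOSTS = {
    "dev": "https://repo-dev.dev.sagebase.org",
    "prod": "https://repo-prod.prod.sagebase.org",
}

REQUIRED_VARIABLES = (
    "CONTEXT",
    "ROLE_TO_ASSUME",
    "SYNAPSE_HOST",
    "SYNAPSE_DEPLOYMENT_ROLE",
)


def load_ops_config(env=None):
    """Validate the ops variables and return them as a dict.

    Assumption: the workflow maps each repository variable to an
    environment variable with the same name.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise ValueError("Missing variables: " + ", ".join(missing))
    context = env["CONTEXT"]
    if context not in EXPECTED_HOSTS:
        raise ValueError(f"CONTEXT must be 'dev' or 'prod', got {context!r}")
    if env["SYNAPSE_HOST"] != EXPECTED_HOSTS[context]:
        # Refuse to run a dev task against prod, or the reverse.
        raise ValueError(
            f"SYNAPSE_HOST {env['SYNAPSE_HOST']!r} does not match context {context!r}"
        )
    return {name: env[name] for name in REQUIRED_VARIABLES}
```

Failing fast on a mismatched CONTEXT and SYNAPSE_HOST is one way to keep the two environments well separated, in addition to the separate Ops repositories themselves.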
Diagram source for mermaid.com
block-beta
columns 3
op["synapse-ops-prod<br/><small>(prod-aws-role)</small><br/><small>(GH PAT)</small>"] space od["synapse-ops-dev<br/><small>(dev-aws-role)</small><br/><small>(GH PAT)</small>"]
space space space
space oc["synapse-ops-core<br/><small>(inherits role, GH PAT)</small>"] space
space space space
pr("AWS self-hosted runner<br><small>(prod SYN-PAT)</small>") space pd("AWS self-hosted runner<br><small>(dev SYN-PAT)</small>")
op--"run"-->oc
od--"run"-->oc
oc--"deploy/run"-->pr
oc--"deploy/run"-->pd