Executing Operational Tasks
The Synapse Engineering Team regularly runs a variety of operational tasks. These include the major weekly tasks of deploying the system and migrating data, as well as minor, ad hoc tasks such as removing users, white-listing OAuth clients, and making select data sets publicly accessible.
GitHub workflows are a standard technology for such tasks, but two requirements make them challenging to use here: (1) some tasks must make requests to administrative endpoints which, for security reasons, may only be invoked from within the Synapse private subnet, and (2) tasks can run for up to 30 hours. (The longest operational task is data migration.) These two requirements preclude using GitHub-hosted runners. Moreover, the team needs to execute tasks in both the Synapse “dev” and “prod” environments. To avoid catastrophic errors, we must ensure that the “ops” dashboards for running tasks in the two environments are well separated.
To meet the above requirements we use GitHub workflows with:
AWS Code Pipeline using Code Build jobs running within the Synapse (dev or prod) subnet;
Cross-repository workflow invocation to avoid duplication of workflows;
GitHub OIDC integration to selectively authorize GitHub repositories to access dev or prod accounts;
AWS Secrets Manager to authorize the aforementioned Code Build jobs to access administrative Synapse endpoints.
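To illustrate how OIDC integration can restrict an AWS role to a single GitHub repository, here is a minimal Python sketch that builds an IAM trust policy document. It follows the commonly documented pattern for GitHub Actions OIDC on AWS and is not taken from the Synapse configuration: the account ID and the organization name `example-org` are hypothetical, and the helper `build_trust_policy` is introduced only for illustration.

```python
import json

# Standard OIDC issuer host for GitHub Actions.
GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def build_trust_policy(account_id: str, org: str, repo: str) -> dict:
    """Return an IAM trust policy that only workflows in org/repo can satisfy."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": (
                        f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                    )
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    # The token must have been issued for AWS STS.
                    "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                    # The token must come from the named repository (any branch or tag).
                    "StringLike": {f"{GITHUB_OIDC_HOST}:sub": f"repo:{org}/{repo}:*"},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account ID and organization, for illustration only.
    policy = build_trust_policy("111111111111", "example-org", "synapse-ops-dev")
    print(json.dumps(policy, indent=2))
```

Under this pattern, the dev account's role would name only the dev ops repository and the prod account's role only the prod ops repository, which is one way to keep the two environments separated.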
The arrangement of the GitHub repositories containing the relevant workflows (regular boxes) and the AWS accounts (curved boxes) is shown below. The dev and prod repositories invoke cross-repo workflows (“run”), and the core repository deploys Code Pipelines/Builds to AWS using CloudFormation and runs tasks there.
The Code Build jobs are configured with the VPC of their respective Synapse stacks. Each AWS account (synapse-dev and synapse-prod) is configured with an administrative access token, stored in AWS Secrets Manager. The administrative tasks running in Code Build retrieve the admin token from Secrets Manager at run time.
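The run-time token lookup might look roughly like the following Python sketch. It assumes the token is stored as a plain string secret and that the task receives a Secrets Manager client such as `boto3.client("secretsmanager")`. The secret name `synapse-admin-token` and the helper `fetch_admin_token` are hypothetical. A small stand-in client is included so the sketch runs without AWS access.

```python
from typing import Any


def fetch_admin_token(secrets_client: Any, secret_id: str) -> str:
    """Read the admin token from Secrets Manager at run time.

    secrets_client is expected to behave like boto3.client("secretsmanager").
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    token = response.get("SecretString")
    if not token:
        raise RuntimeError(f"Secret {secret_id!r} has no string value")
    return token


class FakeSecretsClient:
    """Stand-in for the boto3 client so the example runs without AWS."""

    def __init__(self, secrets: dict):
        self._secrets = secrets

    def get_secret_value(self, SecretId: str) -> dict:
        # Mirrors the shape of the real response for a string secret.
        return {"Name": SecretId, "SecretString": self._secrets[SecretId]}


if __name__ == "__main__":
    # Hypothetical secret name and value, for illustration only.
    client = FakeSecretsClient({"synapse-admin-token": "not-a-real-token"})
    token = fetch_admin_token(client, "synapse-admin-token")
    print(f"retrieved a token of length {len(token)}")
```

Fetching the token inside the job, instead of passing it in from GitHub, means the secret never has to leave the AWS account that owns it.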
Each Ops repository is configured with Repository Variables:
CONTEXT: dev or prod
ROLE_TO_ASSUME: AWS role enabled for the repository via OIDC integration
SYNAPSE_HOST: https://repo-dev.dev.sagebase.org or https://repo-prod.prod.sagebase.org
SYNAPSE_DEPLOYMENT_ROLE: a role for the GitHub workflow to assume which allows it to deploy Code Pipeline / Build.
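As a sketch of how a task could guard against mixing up the two environments, the following Python snippet loads and validates the four variables above. It assumes the workflow exports the repository variables as environment variables of the same names and that the Synapse host name embeds the context (as in the two URLs listed). The `OpsConfig` class, the `load_ops_config` helper, and the role ARNs are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping
from urllib.parse import urlparse

REQUIRED = ("CONTEXT", "ROLE_TO_ASSUME", "SYNAPSE_HOST", "SYNAPSE_DEPLOYMENT_ROLE")


@dataclass(frozen=True)
class OpsConfig:
    context: str
    role_to_assume: str
    synapse_host: str
    synapse_deployment_role: str


def load_ops_config(env: Mapping[str, str]) -> OpsConfig:
    """Build an OpsConfig from env, rejecting incomplete or inconsistent settings."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError(f"Missing repository variables: {', '.join(missing)}")

    context = env["CONTEXT"]
    if context not in ("dev", "prod"):
        raise ValueError(f"CONTEXT must be 'dev' or 'prod', got {context!r}")

    # Assumption: the host embeds the context, e.g. repo-dev.dev.sagebase.org.
    host = urlparse(env["SYNAPSE_HOST"]).hostname or ""
    if f".{context}." not in host:
        raise ValueError(f"SYNAPSE_HOST {host!r} does not match CONTEXT {context!r}")

    return OpsConfig(
        context=context,
        role_to_assume=env["ROLE_TO_ASSUME"],
        synapse_host=env["SYNAPSE_HOST"],
        synapse_deployment_role=env["SYNAPSE_DEPLOYMENT_ROLE"],
    )


if __name__ == "__main__":
    # Fails fast if the variables are absent or inconsistent.
    print(load_ops_config(os.environ))
```

Failing fast on a mismatch between `CONTEXT` and `SYNAPSE_HOST` is a cheap extra check that a dev workflow is not pointed at the prod stack, or vice versa.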
Diagram source for https://mermaid.ai/
block-beta
columns 3
op["synapse-ops-prod<br/><small>(prod-aws-role)</small><br/><small>(GH PAT)</small>"] space od["synapse-ops-dev<br/><small>(dev-aws-role)</small><br/><small>(GH PAT)</small>"]
space space space
space oc["synapse-ops-core<br/><small>(inherits role, GH PAT)</small>"] space
space space space
pr("AWS Code Build<br/><small>(prod SYN-PAT)</small>") space pd("AWS Code Build<br/><small>(dev SYN-PAT)</small>")
op--"run"-->oc
od--"run"-->oc
oc--"deploy/run"-->pr
oc--"deploy/run"-->pd