Synapse stacks deployment
Deployments
Creating branches (usually Fri night or Sat morning)
Each week, we make sure release branches are merged back into develop at the stack release meeting. Before building the next staging stack (N+1), we tag the existing release branch and create branches for the next stack.
You can create a shell script to simplify the process:
#!/bin/bash
# Usage: create-branch.sh <repo> <N> <N+1>
# Tags stack N's release branch for prod, merges it back into develop,
# then creates and tags the release branch for stack N+1 (staging).
set -e
mkdir -p ~/tmp/deploy
pushd ~/tmp/deploy
git clone git@github.com:Sage-Bionetworks/$1.git
pushd $1
# Tag the current release branch as the prod release
git checkout release-$2
git tag -a $2.1 -m stack-$2-prod
# Merge the release branch back into develop
git checkout develop
git merge release-$2
git push origin develop --tags
# Create and tag the next release branch for staging
git checkout -b release-$3
git tag -a $3.0 -m stack-$3-staging
git push origin release-$3 --tags
popd
popd
Then use it (here saved as ~/bin/create-branch.sh) to close/tag the current staging branches and create new ones:
~/bin/create-branch.sh Synapse-Stack-Builder <N> <N+1>
~/bin/create-branch.sh SynapseWebClient <N> <N+1>
~/bin/create-branch.sh Synapse-Repository-Services <N> <N+1>
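For example, closing stack 445 and opening stack 446 (numbers are illustrative):

~/bin/create-branch.sh Synapse-Stack-Builder 445 446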
Now run the build jobs, pointing to the new branches:
Release builds point to release-<N+1>
Connect to VPN
Go to swc-release-build
'Configure' > Source Code Management > "Branches to build" = release-<N+1>
example: release-5
Save
Click the 'Enable' button to enable the build if not already enabled
Click "Build now"
Go to repo-release-build
Configure > Source Code Management > 'Branches to build' = release-<N+1>
Build Steps > Trigger/call builds on other projects > Project to build = Stack-Builder-Parameterized > Predefined Parameters > 'BRANCH' = release-<N+1>
Production builds point to release-<N>
Go to swc-production-build
Same process as above but point to ‘release-<N>’
Go to repo-production-build
Same process as above but point to ‘release-<N>’
Note: the production builds are typically enabled when needed. Click the 'Enable' button to enable the build if you need to build the production artifacts.
Deploying dev stack (usually Fri night or Sat)
The dev stack's 'prod' environment runs the same version of the code as the staging stack (i.e. there is no permanently visible dev staging stack). Once we have artifacts from the builds above, we can deploy a staging version.
Go to build-dev-stack
Make any changes needed to the configuration (new config values, updated beanstalk solution, …)
Build with Parameters > specify the instance number (N+1), beanstalk numbers (0 if regular/non-patch), versions (from the builds above) and the git branch for the stack builder (matches the instance number)
After the build is done, go to SynapseDev AWS console / Elastic Beanstalk and restart the servers for the environments
Go to dev-link-dns-to-alb to configure the stack just deployed as a staging stack
Configure > Execute shell
In the command, for 'repo.staging.dev.sagebase.org' and 'portal.staging.dev.sagebase.org', replace 'none' with 'repo-dev-<N+1>-<beanstalk-number>' and 'portal-dev-<N+1>-<beanstalk-number>' respectively, where <N+1> and <beanstalk-number> are the values specified for the dev stack deployed above (see the illustrative mapping below)
==> This moves the instances into the target groups for the staging ALB (i.e. makes stack <N+1> the staging stack). The operation takes about 5 minutes.
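For illustration, for dev stack 445 with beanstalk number 0 (hypothetical values; the job's actual shell command is authoritative), the replacements would be:

repo.staging.dev.sagebase.org   -> repo-dev-445-0
portal.staging.dev.sagebase.org -> portal-dev-445-0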
While the linking is happening, set up a MySQL connection to the new stack's repo database:
dev-<N+1>-db.cdusmwdhqvso.us-east-1.rds.amazonaws.com as endpoint
dev-<N+1>user as user
dev-<N+1> as db
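A minimal sketch of the connection with the mysql command-line client (the password comes from the stack configuration and is not shown here):

# Prompts for the password, then opens the dev-<N+1> database
mysql -h dev-<N+1>-db.cdusmwdhqvso.us-east-1.rds.amazonaws.com -u dev-<N+1>user -p dev-<N+1>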
To verify that you have the correct prod and staging instances of the dev stack:
go to https://repo-prod.dev.sagebase.org/repo/v1/version ==> N
go to https://repo-staging.dev.sagebase.org/repo/v1/version ==> N+1
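For example, from a shell (any HTTP client works):

curl -s https://repo-prod.dev.sagebase.org/repo/v1/version     # expect the stack <N> version
curl -s https://repo-staging.dev.sagebase.org/repo/v1/version  # expect the stack <N+1> version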
Go to migrate-dev-stack
Migrate the stack by running the job
It should take 2-3 attempts to get the migration time down to about 5 minutes
When at 5 minutes, connect to the 'prod' dev stack and set it to read-only mode (the migration job sets the destination to RO and leaves it in RO).
After the final migration, both stacks are in RO mode. Verify that all types displayed have null values for (min,max) [these are empty tables]
Go back to dev-link-dns-to-alb, this time to configure the new stack as the 'prod' stack
Configure > Execute shell
In the command, for 'repo.prod.dev.sagebase.org' and 'portal.prod.dev.sagebase.org', replace '<N>-<beanstalk-number>' with '<N+1>-<beanstalk-number>' (i.e. make the new stack the prod stack), and for 'repo.staging.dev.sagebase.org' and 'portal.staging.dev.sagebase.org', set the values back to 'none' (there is no 'staging' stack except for migration purposes)
==> Again, the operation takes about 5 minutes
In the database for the new stack, set the status to read-write mode and verify that the prod stack is in read-write mode at https://repo-prod.dev.sagebase.org/repo/v1/status.
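A hedged sketch of the status update, using the connection above (the table and column names, STACK_STATUS and STATUS, are assumptions about the repo schema; verify them against the actual database before running):

# Table/column names assumed; status values are READ_WRITE / READ_ONLY
mysql -h dev-<N+1>-db.cdusmwdhqvso.us-east-1.rds.amazonaws.com -u dev-<N+1>user -p dev-<N+1> \
  -e "UPDATE STACK_STATUS SET STATUS = 'READ_WRITE'"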
In the Synapse Dev AWS console, go to CloudFormation and delete the stacks for portal-dev-<N>, repo-dev-<N> and workers-dev-<N>.
Run the Synapse Python client integration tests
Go to http://build-system-ops.dev.sagebase.org:8080/job/synapse-python-client-integration-test-prod/ > Build Now
Go to http://build-system-ops.dev.sagebase.org:8080/job/synapse-python-client-integration-test-staging/ > Build Now
NOTE: If you wait more than a few minutes after the switch to run these, hold off for a while; otherwise the messages the tests expect on the queues will start mixing with the replaying of messages and the tests will time out. This sometimes happens as soon as the second (dev branch) test.
If there is any error, open a Jira and work with the client dev team to resolve it.
Deploying upcoming staging stack (usually Sat or Sun in between migrations below)
Copy the configuration of the build-synapse-staging-stack job to the build-synapse-prod-stack job
On build-synapse-staging-stack, click Configure, then copy the content of the Execute Shell Command field to the clipboard
Go to build-synapse-prod-stack
Click Configure and paste into the Execute Shell command field
In Source Code Management / Branches to build, update the branch to 'release-<N>'
Update the configuration of the build-synapse-staging-stack job
Go to build-synapse-staging-stack
In Source Code Management / Branches to build, update the branch to 'release-<N+1>'
In Execute Shell Command (an illustrative excerpt of these properties appears at the end of this section)
update 'org.sagebionetworks.instance' to <N+1>
update 'org.sagebionetworks.beanstalk.version.[repo|workers|portal]' to the corresponding versions of the artifacts (usually 'release-<N+1>.0')
update 'org.sagebionetworks.vpc.subnet.color' to the next color in ('red', 'green', 'blue')
verify that the beanstalk numbers are all '0'
update 'org.sagebionetworks.beanstalk.image.version.amazonlinux' as needed
make any other change needed to the configuration (new config values?)
Build now
After the build is done, go to SynapseProd AWS console / Elastic Beanstalk, verify that the environments are up and restart the servers for the environments
Set up a connection to the repo database and set the stack to read-only mode.
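For reference, the properties in the Execute Shell Command end up looking something like this illustrative excerpt (assuming stack 446 with release-446.0 artifacts and the green subnet; the job's real command is authoritative):

org.sagebionetworks.instance=446
org.sagebionetworks.beanstalk.version.repo=release-446.0
org.sagebionetworks.beanstalk.version.workers=release-446.0
org.sagebionetworks.beanstalk.version.portal=release-446.0
org.sagebionetworks.vpc.subnet.color=green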
Final migrations
Migrations are scheduled Tue-Fri night and take 2-3 hours. The goal is to be around 30 mins when going into the final migration (Sat night or Sun night, depending on the progress of the table/search building and/or the availability of the migrator). To achieve that, we take advantage of the decrease in activity during the weekend and start several migrations during the day (at least one in the morning, one mid-afternoon to get the time to around 1 hr, then several back to back until the migration time dips towards 30 mins). In between migrations, perform final verifications, such as checking that the deployed artifacts are the latest builds and completing any validation left over from the stack meeting. Once we get to back-to-back migrations, the job is usually set up to keep the destination in read-only mode.
Go to migration-prod-staging
When the migration time is around 30 mins, Configure > in 'Execute shell', change the value of '-Dorg.sagebionetworks.remain.read.only.mode' from 'false' to 'true'. From this point on, the staging stack will stay in read-only mode when migration is done.
The final migration, where both prod and staging are in read-only mode, is usually timed for early evening to minimize the impact on users. Back-to-back migrations start around 2 hrs prior to that; at some point the migration time stops decreasing. That's the clue that it's ready to go.
Final migration - read-only mode
Connect to the repo database for the production stack (<N-1>) and set the stack to read-only mode
Go to migration-prod-staging and start the last migration.
At the end of the job, you should only see a couple of lines with (null, null) values; these are migrateable tables that are empty:
04:34:38,031 INFO - Currently running: 0 restore jobs. Waiting to start 0 restore jobs.
04:34:38,158 INFO - MULTIPART_UPLOAD_COMPOSER_PART_STATE: mins:(null, null) maxes:(null, null)
04:34:38,159 INFO - COMMENT: mins:(null, null) maxes:(null, null)
04:34:38,159 INFO - migration successful
04:34:38,159 INFO - Destination remains in READ-ONLY mode.
If you see anything else, do not go live without a clear understanding of what happened.
Open an issue naming the problematic type, and we'll discuss on Monday.
Promoting staging stack to production
Verify the versions for each stack
Go to build-synapse-prod-bind-nlb-to-alb
Build with parameters
specify N for 'REPO_PROD_INSTANCE_AND_VERSION' and 'PORTAL_PROD_INSTANCE_AND_VERSION'
specify N+1 for 'REPO_STAGING_INSTANCE_AND_VERSION' and 'PORTAL_STAGING_INSTANCE_AND_VERSION' (see the illustrative values below)
It takes about 5 minutes for the operation to complete
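Illustrative parameter values, assuming stack 445 is being promoted over 444 and stack 446 is the upcoming staging stack, all with beanstalk number 0 (the <release-number>-<beanstalk-number> format is noted in the patching section below):

REPO_PROD_INSTANCE_AND_VERSION=445-0
PORTAL_PROD_INSTANCE_AND_VERSION=445-0
REPO_STAGING_INSTANCE_AND_VERSION=446-0
PORTAL_STAGING_INSTANCE_AND_VERSION=446-0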
Verify the versions for each stack
Set the production stack to read-write mode
Connect to the repo database and set the status to READ_WRITE.
Create an invalidation for the CDN
Go to the AWS console, CloudFront > look for distribution E3BS1ZI5UHV8OS
On the Invalidations tab, click 'Create invalidation' and specify '/*' for the object path
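The same invalidation can be created from the AWS CLI (assuming credentials for the prod account are configured):

aws cloudfront create-invalidation --distribution-id E3BS1ZI5UHV8OS --paths '/*'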
Connect to https://www.synapse.org and verify that it works (e.g. create an annotation)
Removing old production stack
Go to the AWS console, CloudFormation
In the list of stacks
for each stack in ('portal-prod-<N-1>', 'repo-prod-<N-1>', 'workers-prod-<N-1>')
select the stack
Stack Actions > Edit Termination Protection > Deactivated
Delete
Wait for all the stacks to be deleted before deleting the shared resources stack
select 'prod-<N-1>-shared-resources' > Stack Actions > Edit Termination Protection > Deactivated (where <N-1> is just the version number, e.g. 444)
Go to delete-cloudformation-stack
Build with Parameters > replace '999' with the version above
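For reference, the per-stack console steps above map to these AWS CLI calls (a sketch, assuming prod-account credentials; stack number 444 is illustrative, and the shared-resources stack must still wait for these deletions to finish):

# Disable termination protection, then delete each old prod stack
for stack in portal-prod-444 repo-prod-444 workers-prod-444; do
  aws cloudformation update-termination-protection --stack-name $stack --no-enable-termination-protection
  aws cloudformation delete-stack --stack-name $stack
done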
Initial migration
At this point, you have a live production stack and an empty staging stack. You can start migration for next week.
Go to migration-prod-staging
Build
Note: typically after starting the job, I change the ‘org.sagebionetworks.remain.read.only.mode’ back to ‘false’ and queue up another job
Patching
Patching is deploying changes (parameters or artifacts) to a running stack. It happens routinely on the staging stack as we’re finding and fixing issues, less often on the production stack.
Patching staging
Patching staging is essentially deploying staging: just update the artifact versions in the 'Update the configuration of the build-synapse-staging-stack job' section above and run the job.
Patching production
Patching production is essentially deploying a ‘-1’ environment, mapping it as a ‘tst’ endpoint, validating the fixes and then promoting it to production.
Deploy ‘-1’ environment
Same as above, but specify '1' for the beanstalk numbers ('repo' and 'portal' only; 'workers' is always '0').
Map environment to ‘tst’ website
Go to http://build-system-ops.prod.sagebase.org:8080/job/build-synapse-prod-bind-nlb-to-alb/
Configure
In Execute Shell Command, edit the values for 'repotst.prod.sagebase.org' and 'tst.synapse.org' from 'none' to 'repo-prod-<stack-instance>-1' and 'portal-prod-<stack-instance>-1', where <stack-instance> is the release number you're on (e.g. '473')
Build with parameters
CAUTION: Make sure you enter the correct (<release-number>-<beanstalk-number>) parameters for the running stack (i.e. what was input when staging was promoted to prod above).
Promote to production
Go to http://build-system-ops.prod.sagebase.org:8080/job/build-synapse-prod-bind-nlb-to-alb/
Configure
In Execute Shell Command, set the values for 'repotst.prod.sagebase.org' and 'tst.synapse.org' back to 'none'
Build with parameters
CAUTION: Make sure you enter the correct (<release-number>-<beanstalk-number>) parameters for the running stack (i.e. what was input when staging was promoted to prod above).