
Platform AWS Log

This is a reverse-chronological log of all work done in the platform AWS account.

Deploy fix for staging portal v1.5.4 by Matt on 2012/08/07

  1. Upload .war file using SynapseDeployer
  2. Deploy
  3. Smoke test
    1. Login with openID, logout
    2. Login with correct Synapse user/pwd
    3. Verify fix for PLFM-1417 
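
A scripted availability check can complement the manual smoke test above; a minimal sketch (the endpoint names are the staging aliases recorded elsewhere in this log, and this is not the smoke test that was actually run):

    # Check that the staging endpoints answer over HTTPS before the manual checks.
    for url in https://synapse-staging.sagebase.org \
               https://auth-staging.sagebase.org \
               https://repo-staging.sagebase.org; do
      code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
      echo "$url -> HTTP $code"
    done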

Setup prodA as staging by Matt and Xa on 2012/07/31

  1. Upload .war files using SynapseDeployer
  2. Dropped/created the prodA schema
  3. Deploy environments
    1. repo-prod-a
      1. cert and param1 OK ==> started OK
    2. auth-prod-a
      1. cert and param1 OK ==> started OK
    3. portal-prod-a
      1. Built hotfix version 1.3.1 per below on prod-b, uploaded to S3.
      2. Deploy with AWS console
  4. Smoke test.
    1. Login with openID, logout.
    2. Login with incorrect Synapse user/pwd.
    3. Login with correct Synapse user/pwd, create project/data. Upload > 1 file. Delete created data.
    4. Register new user.
  5. Change CNAMES
    1. synapse-staging.sagebase.org >> synapse-prod-a.sagebase.org
    2. auth-staging.sagebase.org >> auth-prod-a.sagebase.org
    3. repo-staging.sagebase.org >> repo-prod-a.sagebase.org
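
Once the CNAME changes propagate they can be confirmed from any workstation; a small sketch using dig (record names taken from the list above):

    # Verify the staging aliases now resolve to the prod-a environments.
    for host in synapse-staging.sagebase.org auth-staging.sagebase.org repo-staging.sagebase.org; do
      echo -n "$host -> "
      dig +short "$host" CNAME
    done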

Finalize prodC deployment by Xa on 2012/07/30

  1. Final sync
    1. Deleted one entity (syn382168) causing FK problem on update
      1. 102838 entities, 0/0/0 submitted when sync'ed
    2. Put prod in read-only mode
    3. Migrated approval tables
      1. Ran Bruce's code from SODO before (resulted in empty list) and after (resulted in non-empty list)
  2. Switch CNAMES
    1. auth-prod -> auth-prod-c.sagebase.org
    2. repo-prod -> repo-prod-c.sagebase.org
    3. synapse-prod -> synapse-prod-c.sagebase.org
  3. Manual smoke test on prod
    1. Login with openID, logout.
    2. Login with incorrect Synapse user/pwd.
    3. Login with correct Synapse user/pwd, create project/data. Upload > 1 file. Delete created data.
    4. Register new user.
    5. Click on SCR/BCC as anonymous user.
    6. Search on cancer.
    7. Change pwd (valid and invalid)
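
Step 1.2 above freezes the source stack before the final sync. The mechanism is a direct update of the stack-status table; the exact statement recorded for the prodB switch appears further down this page, and a shell sketch of running it looks like the following (host, user, and schema are illustrative placeholders for whichever stack is being frozen, not values recorded for this switch):

    # Put the source stack into read-only mode (STATUS=1, matching the prodB
    # statement recorded later on this page). Host/user/schema are assumed
    # placeholders.
    DB_HOST=syndb-prod-2.sagebase.org
    DB_USER=admin
    SCHEMA=prodB
    mysql -h "$DB_HOST" -u "$DB_USER" -p \
      -e "UPDATE \`$SCHEMA\`.\`JDOSTACKSTATUS\` SET \`STATUS\`=1 WHERE \`ID\`='0';"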

Update prodC to /1.4.3 by Xa on 2012/07/28

  1. Uploaded .war file to S3 using SynapseDeployer
  2. Deploy to prodC from AWS console, rebuild environments.

Update prodC to 1.3.2/1.4.2 by Xa on 2012/07/27

  1. Uploaded .war files to S3 using SynapseDeployer
  2. Deploy to prodC from AWS console, rebuild environments.

Update prodC to 1.3.1-RC, 1.4.1-RC by Xa on 2012/07/26

  1. Uploaded .war files to S3 using SynapseDeployer
  2. Deploy to prodC from AWS console, rebuild environments.

Update prod-c to 1.3-RC/1.4-RC by Xa on 2012/07/25-26

  1. Setup SWC build process to use git tag, close release-1.2/3 branches, create release-1.3/4 branches.
  2. Upload .war files to S3 using SynapseDeployer
  3. Deploy to prod-C from AWS console. Rebuild environments per discussion at postmortem.
  4. Migrate from prod to prodC
    1. Issue with migrator: deleted all entities and started from scratch (expected it would just sync up)

Setup prod-a for deployment of test 1.2.0 by Xa on 2012/07/24

  1. Created prodA schema
    1. there's already a user ProdAuser with correct privileges:
      GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, CREATE VIEW, SHOW VIEW ON `prodA`.* TO 'prodAuser'@'%'
      GRANT SELECT, INSERT ON `idGeneratorDB`.* TO 'prodAuser'@'%'
  2. 1.1.0-RC and 1.2.x-RC .war files already pushed to S3
  3. Checked .properties file (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodA-stack.properties)
    1. Changed org.sagebionetworks.id.generator.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/idGeneratorDB
    2. Changed org.sagebionetworks.repository.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/prodA
    3. Changed org.sagebionetworks.bcc.spreadsheet.title=BCC Registrants
    4. CloudSearch settings are OK
    5. Changed org.sagebionetworks.portal.endpoint=https://synapse-prod-a.sagebase.org
    6. Push .properties file to S3
  4. Checked searchUpdater
    1. .properties file (https://s3.amazonaws.com/configuration.sagebase.org/SearchUpdater/prodA-searchUpdater.properties) OK
    2. supervisord.conf ()
      1. Changed .jar filename

        command= java -Djava.io.tmpdir=/mnt/data/prodA -Dorg.sagebionetworks.stack=prod -Dorg.sagebionetworks.stack.iam.id=AKIAI4GXXV3SDI3JXT5Q -Dorg.sagebionetworks.stack.iam.key=JAED9jTGOsIIWAFPl4b3rBFcp1IL/XI+iisSwQua -Dorg.sagebionetworks.stack.instance=A -Dorg.sagebionetworks.stack.configuration.url=https://s3.amazonaws.com/configuration.sagebase.org/SearchUpdater/prodA-searchUpdater.properties -Dorg.sagebionetworks.stackEncryptionKey=8469597bc7bd4d9ac12e851cdb146bab -cp /mnt/data/daemons/tool-migration-utility-1.2.0-jar-with-dependencies.jar org.sagebionetworks.tool.searchupdater.SearchMigrationDriver

    3. Uploaded supervisord.conf to S3, scp to hudson.sagebase.org (tool-migration-utility-1.2.0-jar-with-dependencies.jar is already there per prodB setup below)
  5. Create environments
    1. repo-prod-a
      1. cert and param1 OK ==> started OK
    2. auth-prod-a
      1. cert and param1 OK ==> started OK
    3. portal-prod-a
      1. Built hotfix version 1.3.1 per below on prod-b, uploaded to S3.
      2. Deploy with AWS console
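
Step 4.3 copies the updated supervisord.conf to hudson.sagebase.org; a sketch of that push plus the reload used elsewhere in this log (the destination path comes from the prodB notes further down; the pid-file path is an assumption, and the log itself reloads with a hard-coded PID):

    # Push the new supervisor config to the search-updater host and reload it.
    scp supervisord.conf hudson.sagebase.org:/tmp/supervisord.conf
    ssh hudson.sagebase.org 'sudo mv /tmp/supervisord.conf /etc/supervisor.conf'
    ssh hudson.sagebase.org 'sudo kill -SIGHUP $(cat /var/run/supervisord.pid)'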

Apply hotfix to prod-b stack by Xa on 2012/07/24

  1. Point Hudson build 'Project SwC-release' to SWC branch release-1.2
  2. Update root pom.xml on that branch to 1.2.4-RC and build.
  3. Check that .war appears on http://sagebionetworks.artifactoryonline.com/sagebionetworks.
  4. Upload .war to S3 using SynapseDeployer (TODO: Clearly document dependencies to get this working on any system.)
    1. Modify Synapse-Repository-Services/tools/SynapseDeployer/main.py (version and env_to_upgrade)
    2. run main.py
  5. Deploy on portal-prod-b-test environment in AWS console
  6. Point Hudson build back to branch release-1.3
  7. Test upload file using test portal 
    1. Login, go to one of my entities, create data entity, upload file, create data entity, upload file. ==> OK
  8. Deploy 1.2.4-RC on prod-b-portal (07/25, 10:18am)
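
Step 4 above is the usual SynapseDeployer flow; a brief sketch of how it is driven (the edited variables are the ones named in step 4.1; everything else about main.py's internals is assumed):

    # Edit the deployment parameters, then run the deployer from its own directory.
    cd Synapse-Repository-Services/tools/SynapseDeployer
    vi main.py      # set version (e.g. 1.2.4-RC) and env_to_upgrade
    python main.py  # pushes the .war to S3 / registers the beanstalk version, per other entries in this log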

07/24/2012

  • Deal with aftermath of PLFM-1399 (rollback to ProdB)
  • Finalize changes to PLFM-release build
  • Start looking at closing Synapse-Repository-Services/release-1.2 and SynapseWebClient/release-1.3 branches

07/25/2012

  • Close Synapse-Repository-Services/release-1.2 and SynapseWebClient/release-1.3 branches

    Merge SWC-release branch
    git clone https://github.com/Sage-Bionetworks/SynapseWebClient.git
    git checkout release-1.3
    git tag 1.3.1
    git checkout develop
    git merge release-1.3
    # Fix conflict on pom.xml (version must be develop-SNAPSHOT)
    git add pom.xml
    git commit -m "Merged release-1.3"
    git checkout master
    git merge release-1.3
    # Fix conflicts
    git add pom.xml
    git commit -m "Merged release-1.3"
  • Modify SWC builds on Hudson to generate the version from the git tag (same as repo service builds)

    • In Hudson, find SWC-release build.
      • Configure
        • In git section, Advanced: check 'Skip Internal Tag' so Hudson does not overwrite our tag
        • In Build section, add an Execute Shell step with following command:
chmod 777 pomSnapshotToLastGitTag.sh
./pomSnapshotToLastGitTag.sh
        • In Git/Branches to build, specify release-1.4 (NOTE: This step is part of the weekly routine)

  • Create Synapse-Repository-Services/release-1.3 and SynapseWebClient/release-1.4 branches
  • Deploy 1.3/1.4 artifacts on prodC and migrate data from prod to prodC
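
The Execute Shell step above runs pomSnapshotToLastGitTag.sh. The script itself is not reproduced in this log; purely as an illustration of the intent described here (replace the pom's SNAPSHOT version with the latest git tag), such a step could look like:

    # Illustrative only -- not the actual pomSnapshotToLastGitTag.sh script.
    TAG=$(git describe --tags --abbrev=0)                       # e.g. 1.3.1
    mvn versions:set -DnewVersion="$TAG" -DgenerateBackupPoms=false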

07/26/2012

  • Build and deploy fixes for PLFM-1390 and PLFM-1399.
    • Tag and push with tags to build 1.3.1-RC and 1.4.1-RC
    • Upload and deploy on prodC

 

Finalize prod-c deployment by Xa on 2012/07/23

  1. Final sync.
    1. deleted two entities at target to solve update issues (FK)
    2. Put prod in read-only mode
    3. Migrated approval tables
  2. Switched CNAMES
    1. auth-prod -> auth-prod-c.sagebase.org
    2. repo-prod -> repo-prod-c.sagebase.org
    3. synapse-prod -> synapse-prod-c.sagebase.org

Setup prod-c for deployment of production 1.2.0 by Bruce & Xa on 2012/07/19

1. Pushed .war to S3 using SynapseDeployer.py

2. Checked .properties file (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties)
- Changed org.sagebionetworks.id.generator.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/idGeneratorDB
- Changed org.sagebionetworks.repository.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/prodC
- Changed org.sagebionetworks.bcc.spreadsheet.title=BCC Registrants


3. Checked database
- create schema prodC;
- there's already a prodCuser user with correct privileges:
show grants for 'prodCuser';
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, REFERENCES, INDEX, ALTER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE ON `prodC`.* TO 'prodCuser'@'%'
GRANT SELECT, INSERT ON `idGeneratorDB`.* TO 'prodCuser'@'%'

4. Checked search index
- index specified in .properties seems old (201203030, 25 fields vs 26 for A and B)
- Created new search index
- ./createSearchIndex.sh prod-c-20120719 (in Synapse-Repository-Services/tools/SynapseDeployer)
- Changed service .properties file (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties)
org.sagebionetworks.cloudsearch.searchservice.endpoint=http://search-prod-c-20120719-th2tqyoo4npuincyzfdfhyqhsa.us-east-1.cloudsearch.amazonaws.com/2011-02-01/search
org.sagebionetworks.cloudsearch.documentservice.endpoint=https://doc-prod-c-20120719-th2tqyoo4npuincyzfdfhyqhsa.us-east-1.cloudsearch.amazonaws.com/2011-02-01/documents/batch
- Changed search updater .properties file ( https://s3.amazonaws.com/configuration.sagebase.org/SearchUpdater/prodC-searchUpdater.properties)
org.sagebionetworks.cloudsearch.searchservice.endpoint=http://search-prod-c-20120719-th2tqyoo4npuincyzfdfhyqhsa.us-east-1.cloudsearch.amazonaws.com/2011-02-01/search
org.sagebionetworks.cloudsearch.documentservice.endpoint=https://doc-prod-c-20120719-th2tqyoo4npuincyzfdfhyqhsa.us-east-1.cloudsearch.amazonaws.com/2011-02-01/documents/batch
- Changed search updater config (https://s3.amazonaws.com/configuration.sagebase.org/Hudson/supervisord.conf)
command= java -Djava.io.tmpdir=/mnt/data/prodC -Dorg.sagebionetworks.stack=prod -Dorg.sagebionetworks.stack.iam.id=xxxxx -Dorg.sagebionetworks.stack.iam.key=xxxxxx -Dorg.sagebionetworks.stack.instance=C -Dorg.sagebionetworks.stack.configuration.url=https://s3.amazonaws.com/configuration.sagebase.org/SearchUpdater/prodC-searchUpdater.properties -Dorg.sagebionetworks.stackEncryptionKey=xxxxx -cp /mnt/data/daemons/tool-migration-utility-0.12-SNAPSHOT-jar-with-dependencies.jar org.sagebionetworks.tool.searchupdater.SearchMigrationDriver
directory=/mnt/data/prodC ; directory to cwd to before exec (def no cwd)

5. Pushed .properties and .conf file to S3

6. Create environments
- repo-prod-c
failed to start
changed cert to arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert-2012
changed PARAM1 to https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties
started
- auth-prod-c
failed to start
changed cert to arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert-2012
changed PARAM1 to https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties
started
- portal-prod-c
failed to start
changed cert to arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert-2012
changed PARAM1 to https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties
started
TODO: Check AMI IDs, not sure we're using the same ones everywhere.

7. Smoke test
- Failed on anon browse because database is empty >> Add at least an entity creation/deletion.
- Failed to find Login button by text
- Logout is going to fail as well as UI has changed
- Reverted to manual smoke
- Login via Google OK
- Login with Synapse account OK
- Invalid login OK
- Registration OK
TODO: Update smokeTest to match UI changes

8. Update search updater svc
- ssh into hudson.sagebase.org
- mkdir /mnt/data/prodC
- upload supervisord.conf to hudson.sagebase.org
- restart supervisor with 'sudo kill -SIGHUP 1193'

9. Copied new .jar and .conf using it to hudson.sagebase.org. Restarted supervisor. (5:00pm)

Restarted prod-b portal on 6/15 at 5:30PM

Enabled BCC signup in the prodB-stack properties file and restarted the app server.

Setup prod-b for deployment of production 1.1.0 by John on 6/14 starting at 15:30

Setup the following:

  1. Manually pushed the following war files to AWS:
    1. portal-1.1.0.war
    2. services-authentication-1.1.0-RC.war
    3. services-repository-1.0.2.war
  2. Created a CNAME for the ID Generator database called syn-db-id-generator pointing to synapse-repo-prod.c5sxx7pot9i8.us-east-1.rds.amazonaws.com
  3. Created a CNAME for the new database called syndb-prod-2 pointing to synapse-repo-prod.c5sxx7pot9i8.us-east-1.rds.amazonaws.com
  4. Changed the prodB.properties file (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodB-stack.properties):
    1. org.sagebionetworks.repository.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/prodB
    2. NOTE: I left the ID generator database unchanged org.sagebionetworks.id.generator.database.connection.url=jdbc:mysql://syndb-prod.sagebase.org/idGeneratorDB as that database will still be used for the id generator while migration runs.
  5. Created the following beanstalk instances (by loading the old prod-b config):
    1. repo-prod-b with services-repository-1.1.0-RC.war
    2. auth-prod-b with services-authentication-1.1.0-RC.war.
    3. portal-prod-b with portal-1.1.0.war.
  6. The old configuration for each load balancer had the expired SSL certificate, so changed all three of them from arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert to arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert-2012
  7. Created a prod-b search index by running:
    1. ./Synapse-Repository-Services/tools/SynapseDeployer/createSearchIndex.sh prod-b-20120714
    2. Changed the prodB.properties file (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodB-stack.properties):
      1. org.sagebionetworks.cloudsearch.searchservice.endpoint=http://search-prod-b-20120714-lfn2tw3p326ckyscimm3awkuwq.us-east-1.cloudsearch.amazonaws.com/2011-02-01/search
      2. org.sagebionetworks.cloudsearch.documentservice.endpoint=https://doc-prod-b-20120714-lfn2tw3p326ckyscimm3awkuwq.us-east-1.cloudsearch.amazonaws.com/2011-02-01/documents/batch
    3. Changed the https://s3.amazonaws.com/configuration.sagebase.org/SearchUpdater/prodB-searchUpdater.properties
      1. org.sagebionetworks.cloudsearch.searchservice.endpoint=http://search-prod-b-20120714-lfn2tw3p326ckyscimm3awkuwq.us-east-1.cloudsearch.amazonaws.com/2011-02-01/search
      2. org.sagebionetworks.cloudsearch.documentservice.endpoint=https://doc-prod-b-20120714-lfn2tw3p326ckyscimm3awkuwq.us-east-1.cloudsearch.amazonaws.com/2011-02-01/documents/batch
    4. Updated the https://s3.amazonaws.com/configuration.sagebase.org/Hudson/supervisord.conf:
      1. Added a [program:prodBSearchUpdater]
    5. pushed the changed supervisor.conf file to hudson.sagebase.org/etc/supervisor.conf
    6. restarted supervisor by running: sudo kill -SIGHUP 1193
  8. Started migration of data from prodA to prodB (started at 18:15)
  9. Continued on 7/15 at 15:17.
  10. Restarted the migration process and the two stacks are currently synched.
  11. Change the prodB idGenerator database to:
    1. org.sagebionetworks.id.generator.database.connection.url=jdbc:mysql://syndb-prod-2.sagebase.org/idGeneratorDB
    2. re-uploaded (https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodB-stack.properties).
  12. Put the prod-A into read-only mode
  13. Set up the syndb-prod-2.idGeneratorDB table to be 375535:
    1. insert into idGeneratorDB (ID, CREATED_ON) VALUES (375535, 0);
  14. Swapped the CNAMES
    1. auth-prod -> auth-prod-b.sagebase.org
    2. repo-prod -> repo-prod-b.sagebase.org
    3. synapse-prod -> synapse-prod-b.sagebase.org

Migrated Crowd database by Bruce and John, 7/13-14

Created a new RDS called 'synapse-crowd' from a snapshot taken on 7-13.  Created the CNAME crowd-db.sagebase.org to point to it. 

Switched dev, staging, and production crowds to point to this database instance (each has its own schema).  Used the Crowd backup/restore function to get the most recent data from the original prod Crowd schema (~2:15PM on 7/14) and restored it to the production crowd instance after switching to the new database.

Setup prod-a for deployment of production 1.0.0 by Jon on 6/9 starting at 10:36

  1. setup the following:
    1. portal-prod-a using portal-1.0.2.war
    2. auth-prod-a using services-authentication-1.0.1.war
    3. repo-prod-a  using services-repository-1.0.1.war
  2. Since each environment was set up using an old saved configuration (prod-a-0.10), the SSL certificates were out of date, so switched each to use: arn:aws:iam::325565585839:server-certificate/sage-wildcart-cert-2012
  3. The prodA-stack.properties file was using the old non-CNAME database URL, so updated it to use the CNAME for the database: syndb-prod.sagebase.org
  4. Rebuilt repo-prod-a and auth-prod-a.
  5. Still failed to start. It turns out the old configuration was using the old property file, so each was changed to use https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodA-stack.properties
  6. All the services now seem to be up. Manually ran a search, logged in, and clicked on an entity.
  7. Saved the configuration of each so we do not need to go through this again to get a stack set up.
  8. services-repository-1.0.1.war still has PLFM-1353 so we deployed 1.0.2.war
  9. Terminated old environments

Restored production Synapse after Amazon US-EAST-1 (Northern Virginia) crashed (crashed on 6/29/2012 @ ~20:00)  by John and Mike on 6/30 starting around 8:40 AM

  1. Got information from http://status.aws.amazon.com/ that AWS was having power and availability issues starting at ~8:30PM on 6/29 affecting zone us-east-1a. This affected RDS, EC2 and beanstalk instances. This seems to be the root cause of PLFM-1358 and PLFM-1357.
  2. To investigate solution for PLFM-1358 we rebuilt the prod-a-auth beanstalk environment which was the only environment residing in us-east-1a.  It came up in a new zone and seems stable.
  3. First we restored the platform-build RDS instance from the "latest snapshot" giving the new instance name "platform-build-restored".
    1. Since this changed the public endpoint URLs and we are not using CNAMEs for these URLs, we are going to try deleting the original DB and then restoring to the old name.
    2. Deleted platform-build with final snapshot name: "final-snapshot-platform-build".
  4. We restored the "repo" RDS instance from the "latest snapshot" giving it the new name "production-restored".
    1. The current endpoint that all of the configuration files are using is "repo.c5sxx7pot9i8.us-east-1.rds.amazonaws.com".  Our goal will be to get a new instance restored with this same URL.
    2. Deleted the repo RDS instance with a final snapshot name: "final-snapshot-repo"
  5. Attempted to delete the original RDS instances under the theory that we could then do a second restore and bring up new RDS instances with same public endpoint as our original ones, allowing us to restore production with no configuration changes.  After long wait abandoned this approach.
  6. Now attempting to address PLFM-1360 by using CNAMEs as we re-wire the application
    1. In GoDaddy created  
      1. syndb-dev.sagebase.org -> platform-build-restored.c5sxx7pot9i8.us-east-1.rds.amazonaws.com
      2. syndb-prod -> production-restored.c5sxx7pot9i8.us-east-1.rds.amazonaws.com
  7. Updated https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodC-stack.properties to use syndb-prod.sagebase.org for both:
    1. org.sagebionetworks.id.generator.database.connection.url=jdbc:mysql://syndb-prod.sagebase.org/idGeneratorDB
    2. org.sagebionetworks.repository.database.connection.url=jdbc:mysql://syndb-prod.sagebase.org/prodC
  8. Attempting to change the database url for Crowd
    1. ssh into prod-crowd: ec2-50-17-222-147.compute-1.amazonaws.com
    2. Made a copy of the config file:
      1. cp /var/crowd-home/crowd.cfg.xml /var/crowd-home/crowd.cfg-old.xml
      2. changed /var/crowd-home/crowd.cfg.xml:
        1. from: <property name="hibernate.connection.url">jdbc:mysql://repo.c5sxx7pot9i8.us-east-1.rds.amazonaws.com/crowd_prod?autoReconnect=true&amp;characterEncoding=utf8&amp;useUnicode=true</property> 
        2. to: <property name="hibernate.connection.url">jdbc:mysql://syndb-prod.sagebase.org/crowd_prod?autoReconnect=true&amp;characterEncoding=utf8&amp;useUnicode=true</property>
  9. Restarted crowd:
    1. ssh into the machine
    2. sudo reboot
    3. Validated that crowd could be accessed from the web: https://prod-crowd.sagebase.org:8443/crowd/console/
  10. Rebuilt all three prod-c beanstalk instances from the AWS management console.
  11. Shut down all three prod-a environments as well as the 3 staging environments to save cash while on break.
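
The Crowd reconfiguration in step 8 was a hand edit of crowd.cfg.xml; an equivalent scripted form, kept here only as a sketch (the log records a manual edit):

    # Back up the Crowd config and rewrite the JDBC host to the new CNAME in place.
    sudo cp /var/crowd-home/crowd.cfg.xml /var/crowd-home/crowd.cfg-old.xml
    sudo sed -i 's|repo\.c5sxx7pot9i8\.us-east-1\.rds\.amazonaws\.com|syndb-prod.sagebase.org|' \
      /var/crowd-home/crowd.cfg.xml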

Update prod to 1.0.1 and 1.0.2 on stack A by John on 6/27

  1. Since the deployment script is broken (see PLFM-1351), I manually uploaded the war files (upload in the AWS Web Console has been fixed!):
    1. services-repository-1.0.1.war
    2. services-authentication-1.0.1.war
    3. portal-1.0.2.war
  2. Deployed the respective wars to Synapse prodA.
  3. I was unable to run the smoke tests on prod-a due to PLFM-1352.

Update prod to 1.0.0 on stack A by Xa on 6/20

  1. Drop/create prodA schema
  2. Update .properties file
  3. Deploy build using deployment script
    1. Change version to 1.0.0 and isSnapshot to False
    2. Remove buildNumber from name
  4. Up # repo instances to 4
  5. Migrate data from prod to prodA
    1. Use custom .properties file
  6. Smoke
    1. Initial failures on anon, register
    2. Fix search endpoints and stack configs
      1. Still have a problem with search, plus a problem finding projects created by me (malformed query and use of CreatedBy)
      2. Behavior change after registration (sign user agreement) >>> need to change test

Sync staging with prod by Xa on 5/3

  1. Drop and recreate stagingB schema
  2. Deploy build 7398 using deployment script
  3. Up # repo instances to 4
  4. Migrate data from prod to staging
  5. Set min inst # back to 2

Finalize prod by Xa on 4/16-17

  1. Start final sync.
  2. Put 0.11 prod-b in read-only mode.
  3. Finish final sync.
  4. Change CNAMES at godaddy.com
  5. Smoke
    1. OpenID login
    2. Normal login
    3. Register
      1. ==> When coming back with the link from the email, the UI displays Welcome Prod UserAdmin >> need to confirm and file a bug.
    4. Anon
  6. Deploy 0.12 (7231) using deployment scripts
  7. Deploy 0.12 (7232) using deployment scripts

Update prod to 0.12 (7190) on stack C by Xa on 4/13

  1. Drop/recreate prodC schema
  2. Deploy build 7190 using deployment script
  3. Start migration
    1. Migration done.
  4. Fixed searchUpdater params in prodC .properties files per Nicole's mail
  5. Deploy build 7202 using deployment script.

Update prod to 0.12 (7140) on stack C by Xa on 4/11

  1. Drop/recreate prodC schema.
  2. Deploy build 7140 using deployment script.
  3. Start migration

Update prod to 0.12 on stack C by Xa on 4/9-10

  1. Drop/recreate prodC schema.
  2. Deploy build 7083 using deployment script.
  3. Spin up C environments.
    1. Unable to smoke test UI due to bug.
  4. Deploy build 7088 using deployment script.
    1. Start migration.
      1. Deadlock problems (PLFM-1194)
    2. Bruce fixes config problem for OpenID login

Finalize staging by Mike on 4/4 at 4:30

  1. Shut down staging A beanstalk instances
  2. Change CNames in GoDaddy to point to staging B

Update staging to 0.12 on stack B by Xa on 4/4

  1. Drop and re-create stagingB schema.
  2. Deploy build 6987 using deployment script
    1. Spin up B environments
      1. Use 0.10 configs modified to point to new config file location
        1. Start OK, able to login, register user etc.
      2. Modify search params and upload, rebuild envs
        1. Cannot get env to restart (no table gets created)
        2. Fix error in param, restart envs
  3. Start migration from Prod to StagingB

Flip to Search Index prod-20120320 on prod by Nicole 3/19 16:00

already brought search index up to date

just changed the endpoints in https://s3.amazonaws.com/configuration.sagebase.org/Stack/prodB-stack.properties and restarted tomcat

Final sync prod 0.11 on stack B by Xa on 3/16

  1. Ran Migration UI to sync up
  2. Put Repo 0.10 in RO mode
  3. Complete sync-up
  4. Change xxx-prod CNAMES to point to prod-b

Resync prod 0.11 on stack B by Xa on 3/14

  1. Stop prodBSearchUpdater
  2. Drop and re-create prodB schema
  3. Restart App repo-prod-b on AWS
    • tables are not being created
      • looking at tomcat logs, looks like the app did not terminate in time
      • tried restart again but it looks like app not responding
    • try rebuild env to force shutdown of tomcat
      • had to do it a couple of times, no idea why it failed the first time...
  4. Start migration from Prod to ProdB
    • Migration took about 1.5hr
    • Put repo in RO mode
  5. Getting 'An error occured trying to load' from portal when trying to display entity (tried 4492, 4494, 4496)
    • Does not look like a repo issue, I can get the entities from Python
    • Cannot login using Google account (error from auth) or Sage account (says I'm not authenticated)
    • Can login from Python
  6. Restart App for portal and auth
    • No change
  7. Log into auth/portal EC2 to look at logs
    • Nothing special in auth
    • After getting error msg about load failure, following msg in portal log:
      "Cannot find user login data in the cookies using cookie.name=org.sagbionetworks.security.user.login.data" >> Normal since I did not login?
    • No log entry on login failure
    • ~2:30am: Timeouts from repo (maybe connected to instance being removed)
  8. Put repo back in RW mode

Deploy prod 0.11 (6529) to stack B by Xa on 3/10

  1. Drop and re-create prodB schema
  2. Deploy build 6529 using deployment script
  3. Start migration from Prod to ProdB
    • Verified that disease, species, tissue show up again

Deploy staging 0.11 (6529) to stack A by Xa on 3/10

  1. Drop and re-create stagingA schema
  2. Deploy build 6529 using deployment script
    • changed config to use new location for config files
    • Check able to connect to portal, login.
  3. Start migration from Prod to StagingA
    • Migration done in ~1.5hr (after switching to small instance)
    • Verified that tissue, disease and species now show up in dataset table again

Deploy prod 0.11 (6504) to stack B by Xa on 3/08

  1. Drop and re-create prodB schema
  2. Deploy build 6504 using deployment script
    • using config setup by Nicole below
    • all svcs up
      • can connect to portal and python
  3. Migrating from Prod to ProdB
    • Synced

Deploy latest 0.11 (6496) to staging A by Xa on 03/07 20:00

  1. Drop and re-create stagingA schema
  2. Deploy build 6496 with deployment script
    1. all svcs up
  3. Migrated from StagingB to StagingA
    • Note: seeing lots of read timeouts, CPU utilization close to 100%. Should deploy on a small instance and go back to micro after migration.
  4. Changed xxx-staging CNAMEs

Start using configuration.sagebase.org S3 bucket by Nicole on 03/05 16:00

Update staging to 0.11 on 'A' stack by Xa on 03/02 22:00

  1. Dropped and re-created stagingA db.
  2. Deployed new stagingA-stack.properties
    1. Added search config
    2. Added S3 config
    3. Changed crowd config to point to staging crowd (this is where I screwed up the encryption this afternoon)
  3. Deployed build 6433 with deployment script
    1. auth and repo come up
    2. portal does not
      1. ssh in and look at log >> No error but the only msgs are about reading config...
  4. Checked database from SQL, able to authenticate and connect to repo from Python.
  5. Migrated from stagingB to stagingA.

Updated IAM policies for stagingWorkflowIamUser and hudsonIamUser by Nicole on 2/29 14:30

moving configuration files to versioned S3 bucket configuration.sagebase.org

Deployed repo svc and search updaters to staging and prod by Nicole on 2/24 ~21:00

Deployed some bug fixes due to the AwesomeSearch API changes that were deployed earlier in the day

Update staging and prod by Mike on 2/18 at 7:45

  1. Ran Synapse Deployer script to update both stacks to 0.10-6241

Shut down prod 'c' stacks by Mike on 2/17 at 8:30

  1. Shut down the prod 'c' stacks, decided to mix old static content into demo and talk around differences.

Upgrade prod to 0.10 by Mike on 2/9 at 7:20

  1. Re-run migration utility until stacks identical
  2. Put prod-c into read-only mode
  3. Change GoDaddy names to point to prod-a
  4. Left prod-c running in read-only-mode to support old demo, set min instances to 1

Updated staging and prod by Mike on 2/8 at 21:00

  1. Deployed 0.10-6115 to staging and prod via deployment script
  2. Deployed new versions on staging
  3. Deployed new versions to prod-a
  4. Ran migration utility, filed PLFM-987

Prepped prod for 0.10 on 'A' stack; updated staging by Mike on 2/7 at 20:50

  1. Dropped and recreated prodA db
  2. Deployed new prodA-stack.properties
    1. New search configuration from Nicole
  3. Deployed 0.10 build 6107 to staging and prod via deployment script
  4. Deployed 6107 to all staging environments
  5. Spin up 'A' environments using saved configurations
    1. Updated Custom AMI ID to ami-37f2275e for all environments.
    2. Set Monitoring Interval to 1 minute for portal
    3. Set minimum instance count = 2 for all environments
    4. Saved new configurations as 0.10-<component>-staging-a
  6. Ran the Stack Migration UI to copy data from prod-c to prod-a

Updated staging to 0.10 on 'B' stack by Mike on 2/6 14:30

  1. Dropped and recreated stagingB db
  2. Deployed new stagingB-stack.properties
    1. New search configuration from Nicole
    2. New crowd instance for staging only created by Xa
    3. All URLs now go through public, stack-specific aliases
  3. Deployed 0.10 build 6087 and created new beanstalk versions via deployment script
  4. Spin up 'B' environments using saved configurations
    1. Updated Custom AMI ID to ami-37f2275e for all environments.
    2. Set Monitoring Interval to 1 minute for portal
    3. Portal minimum instance count = 2 for all environments
    4. Saved new configurations as 0.10-<component>-staging-b
  5. Updated aliases in GoDaddy to point to new stack
  6. Bruce and Xa fixed a Crowd configuration issue
  7. Verified log-in to web client via sagebase credentials.
  8. Ran the Stack Migration UI to copy data over from production to staging.

Updated staging/prod IAM users for Search Updater use case by Nicole 02/06 1:30pm

  1. Added search permissions for repository service user, see r6088            
  2. Added search update permissions for search updater user, see r6088            
  3. I deleted IAM users corresponding to platform team Synapse users for prod since they are obsolete, will do the rest at a later date

Updated bamboo IAM user for Search Updater use case by Nicole 02/02 at 2pm

remember to do this via the console

Updated bamboo IAM policy for Search Updater use case by Nicole 02/02 at noonish

See r6058 and I also updated https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/bambooa-v2-stack.properties

Updated dev IAM user for Search Updater use case by Nicole 01/31 at 17:00

See r6042 and CR-PLFM-295

This is for dev only but I'll do the same for other stacks later.

  1. removed IAM permissions to create IAM users and add users to IAM groups; the repo service no longer needs those permissions now that it is using STS tokens for signing S3 urls
  2. I deleted all the IAM users corresponding to Synapse users for dev/bamboo/staging since they are obsolete
  3. I've made a workflow bucket for each stack. Each particular workflow will not have access to the entire bucket, just a slice of the bucket. For example the SearchUpdater workflow can access files matching <stack>workflow.sagebase.org/Search/*

Preparing to collapse locations on prod-c by John 1/23 at 17:00

  1. Dropped/recreated the prod-c schema (prodC).
  2. Mike deployed the latest 0.9 wars to prod-c (5951)
  3. Deployed to repo-prod-c: services-repository-0.9-SNAPSHOT-5951
  4. Deployed to auth-prod-c: services-authentication-0.9-SNAPSHOT-5951.
  5. Deployed to portal-prod-c: portal-0.9-SNAPSHOT-5951.
  6. Started migrating data from prod-a to prod-c.
    1. Migration finished in 1:41 hours.
  7. Put the prod-a into read-only mode (sent out an email to synapse-users)
  8. Started the MigrateLocations.sh script on prod-c
    1. Finished running MigrateLocations.sh after correcting an error in the new path generation at 6:50 1/24.
  9. Spot checked prod-c and downloaded some data to validate.
  10. Switched the CNAME in GoDaddy to point to prod-c:
    1. repo-prod -> repo-prod-c.sagebase.org
    2. auth-prod -> auth-prod-c.sagebase.org
    3. synapse-prod -> synapse-prod-c.sagebase.org

The location collapse should now be complete.

Deployed latest wars to prod-c by John 1/17 at 9:15

  1. Created new war versions using SynapseDeployer main.py
  2. Deployed to repo-prod-c: services-repository-0.9-SNAPSHOT-5914
  3. Deployed to auth-prod-c: services-authentication-0.9-SNAPSHOT-5914.
  4. Deployed to portal-prod-c: portal-0.9-SNAPSHOT-5914.

Setup prod-c for the first time by John 1/16 at 20:24

  1. Created the prodC-stack.properties file and uploaded it to S3.
  2. Created the prodCuser in MySQL
    1. granted prodCuser select and insert on idGeneratorDB
    2. granted prodCuser all on prodC
  3. created the prodC schema
  4. created the prodC encryption key and saved it to sodo
  5. created repo-prod-c using services-repository-0.9-SNAPSHOT-5895.
  6. created auth-prod-c using services-authentication-0.9-SNAPSHOT-5830.
  7. created portal-prod-c using portal-0.9.1-SNAPSHOT-5916.
  8. started migrating data from prodA to prodC. Finished in 1:36 hours.

Edit bamboo properties

Added these two properties for the TCGA Workflow integration tests

org.sagebionetworks.synapse.username
org.sagebionetworks.synapse.password
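
These are plain key/value entries; a sketch of what the additions look like (the file name is taken from the bamboo entry further down this page, and the values are placeholders):

    # Append the two TCGA workflow test credentials (placeholder values).
    {
      echo 'org.sagebionetworks.synapse.username=<integration-test-user>'
      echo 'org.sagebionetworks.synapse.password=<integration-test-password>'
    } >> bambooa-v2-stack.properties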

Edit .properties files and restart portals

In both stagingA-stack.properties and prodA-stack.properties, changed org.sagebionetworks.portal.endpoint to be generic, not stack specific, then restarted the Portal apps.

This is meant to be a temporary work-around for http://sagebionetworks.jira.com/browse/PLFM-883.

Deploy 0.9 patches to prod by Nicole on 1/16 at 16:45

Fixes from John, Nicole, and Dave

  • portal-0.9-SNAPSHOT-5891
  • services-repository-0.9-SNAPSHOT-5895.war

Deploy trunk portal and repo to staging by Nicole on 1/16 at 16:30

  • portal-0.10-SNAPSHOT-5893                    
  • services-repository-0.10-SNAPSHOT-5894                    

Deploy stateless bug fix to repo svc staging by Nicole on 1/16 at 14:11

Deploying version services-repository-0.10-SNAPSHOT-5892 for SYNR-68

Shut down unused B stacks by Mike on 1/13 at 16:45

  1. Shut down beanstalk b environments in both staging and prod
  2. Dropped the min # of instances for production repo service from 4 to 2.
  3. Updated GoDaddy aliases for b stacks to match format conventions of the A stacks

Switched staging from stagingB to stagingA John 1/13 12:30

  1. Dropped/created the 'stagingA' database schema.
  2. To repo-staging-a deployed the new war: services-repository-0.9-SNAPSHOT-5853                               
  3. To portal-staging-a deployed the new war: portal-0.9-SNAPSHOT-5853                               
  4. To auth-staging-a deployed the new war: services-authentication-0.9-SNAPSHOT-5853                               
  5. Waited for all three applications to turn green.
  6. Started the migration from stagingB to stagingA using the following property file:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-staging-b-to-staging-a.properties
    2. Migration completed in 3 minutes.
  7. swapped the CNAMES from staging-b to staging-a in GoDaddy.

Switched production from prodB to prodA John 1/11 19:00

  1. First put prodB into READ-ONLY mode.

    UPDATE `prodB`.`JDOSTACKSTATUS` SET `STATUS`=1 WHERE `ID`='0';
    
  2. Migrated the remaining data from prodB to prodA using:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-prod-b-to-prod-a.properties
    2. Migration finished in 3 minutes.
  3. swapped the CNAMES from prod-b to prod-a in GoDaddy.
  4. Validated that changes in the web UI changed data in the prodA schema.
  5. Put prodB into DOWN mode

    UPDATE `prodB`.`JDOSTACKSTATUS` SET `STATUS`=2 WHERE `ID`='0';
    
  6. Took a database snapshot of the RDS instance named 'switch-to-prodA'

Setup prod-a using 0.9-SNAPSHOT-5828 John 1/10 16:38

  1. To repo-prod-a deployed the new war: services-repository-0.9-SNAPSHOT-5828                               
  2. To portal-prod-a deployed the new war: portal-0.9-SNAPSHOT-5828                               
  3. To auth-prod-a deployed the new war: services-authentication-0.9-SNAPSHOT-5828                               
  4. Waited for all three applications to turn green.
  5. Started the migration from prod-b to prod-a using the following property file:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-prod-b-to-prod-a.properties
    2. Migration completed in 106 minutes.

Setup prod-a using 0.9-SNAPSHOT-5647 and new AMI ami-37f2275e John 12/22 20:38

  1. To repo-prod-a deployed the new war: services-repository-0.9-SNAPSHOT-5647                               
  2. To portal-prod-a deployed the new war: portal-0.9-SNAPSHOT-5647                               
  3. To auth-prod-a deployed the new war: services-authentication-0.9-SNAPSHOT-5647                               
  4. Waited for all three applications to turn green.
  5. Started the migration from prod-b to prod-a using the following property file:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-prod-b-to-prod-a.properties

Setup prod-a using 0.9-SNAPSHOT-5640 and new AMI ami-37f2275e John 12/22 14:49

Our previous attempt to migrate the data from prodB to prodA bogged down to a crawl at about 86% complete due to PLFM-852.
We turned off all JDO caching and we are now ready to try again with build 5640.

  1. Set the minimum number of instances of repo-prod-a to be 4.
  2. To repo-prod-a deployed the new war: services-repository-0.9-SNAPSHOT-5640                               
  3. To portal-prod-a deployed the new war: portal-0.9-SNAPSHOT-5640                               
  4. To auth-prod-a deployed the new war: services-authentication-0.9-SNAPSHOT-5640                               
  5. Waited for all three applications to turn green.
  6. Started the migration from prod-b to prod-a using the following property file:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-prod-b-to-prod-a.properties

Note: The migration is already 86% complete so we are only testing the last 14% this time.
The last prodA.DAEMON_STATUS (select max(ID) from prodA.DAEMON_STATUS) is ID=2005. We will look at all new jobs to determine if
the performance fix for PLFM-852 is working as expected.

Setup prod-a using 0.9-SNAPSHOT-5551 and new AMI ami-37f2275e John 12/20 15:38

  1. Dropped the 'prodA' schema and recreated it.
  2. Changed the AMI all prod A to ami-37f2275e
    1. Changed portal-prod-a to use ami-37f2275e.
    2. Changed repo-prod-a to use ami-37f2275e.
    3. Changed auth-prod-a to use ami-37f2275e.
  3. Waited for all three applications to turn green.
  4. Set the minimum number of instances of repo-prod-a to be 4.
  5. To repo-prod-a deployed the new war: services-repository-0.9-SNAPSHOT-5551                               
  6. To portal-prod-a deployed the new war: portal-0.9-SNAPSHOT-5551                               
  7. To auth-prod-a deployed the new war: services-authentication-0.9-SNAPSHOT-5551                               
  8. Waited for all three applications to turn green.
  9. Started the migration from prod-b to prod-a using the following property file:
    1. https://s3.amazonaws.com/elasticbeanstalk-us-east-1-325565585839/migration-prod-b-to-prod-a.properties

Set staging-a to use new AMI John 12/20 13:30

  1. Change the image used by repo-staging-a to use 'ami-37f2275e'.
  2. The database tables were created as expected.
  3. Changed portal-staging-a to use 'ami-37f2275e'
  4. Changed auth-staging-a to use 'ami-37f2275e'

Migration of the data from staging-b to staging-a was successful with 520 entities migrated in 2 minutes.

Created a new AMI using Sun's JDK 1.6 John 12/20 13:10

See: http://sagebionetworks.jira.com/wiki/display/PLFM/Creating+an+AMI+for+elasticbeanstalk

The new AMI is 'ami-37f2275e' which has the following: elasticbeanstalk/32-bit/Tomcat 7/ Sun's JDK 1.6.25

Setup a test ESB instance using the new AMI John 12/19 at 11:13

Brian Holt created a new AMI (ami-1b8a5f72) with the Sun JDK. The current image uses OpenJDK and we suspect that is the cause of the OutOfMemoryError (see PLFM-792) we have been seeing.

  1. Used the python deployer to upload and version all three 0.9 wars to the Synapse-Staging environments.
    1. portal-0.9-SNAPSHOT-5551                                     .war
    2. services-repository-0.9-SNAPSHOT-5551                                     .war
    3. services-authentication-0.9-SNAPSHOT-5551                                     .war
  2. Manually upgraded portal, repo, auth staging-a to use the new 5551 wars.
  3. Set repo-staging-a to use the new image ami-1b8a5f72.

After changing the image the repo instance would not start. Reverted it back to the old image (ami-23de1f4a).

Updated the wars for prod-b to use the new wars by John 12/15 at 22:15

  1. prod-b-repo set to use version services-repository-0.8-SNAPSHOT-5539.
  2. prod-b-auth set to use version services-authentication-0.8-SNAPSHOT-5539.
  3. prod-b-portal set to use version portal-0.8-SNAPSHOT-5539.
    Also applied the same to prod-a for all three.

Created new A stack for staging and prod by Mike 12/14 at 12:20

  1. Created new CNAMES in GoDaddy giving us public names per stack instance, e.g. auth-staging-a.sagebase.org -> auth-staging-a.elasticbeanstalk.com
  2. Pointed existing CNAMES to existing functional 'B' stack through new CNAME layer e.g. auth-staging.sagebase.org -> auth-staging-b.sagebase.org
  3. Validated that existing staging B still works
  4. Updated stagingA-stack.properties to use new instance-specific CNAMES
  5. Dropped and recreated stagingA database
  6. Launched new staging repo, auth, and portal environments using 0.8-SNAPSHOT-5507 versions and saved staging A configs
  7. Validated staging A works
  8. Repeat all of the above for prod

Created new application versions by Mike 12/14 at 12:20

  1. Created 0.8-SNAPSHOT-5507 versions of all .wars in Synapse and Synapse-Staging beanstalk applications.

Deployed new 0.8 WAR to Staging John Hill 12/12 at 16:16

  1. Used Synapse deployer, which created: services-repository-0.8-SNAPSHOT-5474.war
  2. Deployed the new WAR from the web and validated startup.
  3. Set the minimum number of instances to 4 to support multiple backup threads.

Modified IAM perms for workflow users Nicole Deflaux 12/12/2011 10:20am

r5476 and r5477

Work for PLFM-482. John Hill 12/10 at 18:16

  1. Created a new S3 bucket for all backup files: 'shared.backups.sagebase.org'. We need a shared bucket to migrate data between stacks.
  2. Created a new IAM group for the new bucket 'backupBucketGroup'. Applied the policy (trunk\configuration\awsIamPolicies\BackupBucketGroup.policy) to the new group.
  3. Added all services users to this group:
    1. bambooServiceIamUser
    2. stagingServiceIamUser
    3. devServiceIamUser
    4. prodServiceIamUser
  4. Checked-in the code to use this new bucket for all backup/restore operations.

Restarted prod-b-repo because the single instance ran out of memory again. John Hill 12/08 at 11:02 am

Deployed 0.8 branch to Alpha on 12/07, 4:37PM.

Used Synapse deployer, which created:

services-authentication-0.8-SNAPSHOT-5414.war
services-repository-0.8-SNAPSHOT-5414.war
portal-0.8-SNAPSHOT-5414.war

Deployed 0.8 branch to Staging on 12/07 around 10AM.

Added IAM permissions to staging and prod service users for Security Token Service by Nicole on 12/05 at 10:00

per r5385                                                 

Added IAM permissions to bamboo service user for Security Token Service by Nicole on 12/01 at 15:30

  1. updated IAM policy for bambooServiceIamUser.policy r5369                                                 

Update Prod to 0.8.6 on new 'B' stacks by John & Xa on 11/30 at 18:00

  1. Took a backup of prod-a (https://s3.amazonaws.com/proddata.sagebase.org/BackupDaemonJob88486-800942718895052893.zip).
  2. Created a new EC2 instance (OneTimeRestoreDaemon i-8d0e20ee).
    1. Installed JDK 1.6 and maven 2.2 and set all environment variables.
    2. svn checkout of branch/Synapse-0.8
    3. mvn clean install -Dmaven.test.skip=true.
    4. In Synapse-0.8/integration-tests, ran mvn cargo:run -D(all of the prod-b specific startup properties).
    5. Once the repo services was running on i-8d0e20ee started the restore process on BackupDaemonJob88486-800942718895052893.zip (at about 19:30).
  3. After the restore process finished created a new environment for prod-B-repo (at about 5:00 12/1).
  4. Waited for the cache to warm up, then validated everything was working through http://prod-b-portal.elasticbeanstalk.com/: checked datasets, layers, and projects.
  5. Terminated prod-a-repo, prod-a-auth, prod-a-portal
  6. Change CNAMES in Godaddy to point to B stack
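
Steps 2.2 to 2.4 above, written out as the commands they describe (the SVN URL and the prod-b startup properties are not recorded in this log, so they are left as placeholders):

    # Build the 0.8 branch and run the repository service locally for the restore.
    svn checkout <svn-repo-url>/branch/Synapse-0.8 Synapse-0.8
    cd Synapse-0.8
    mvn clean install -Dmaven.test.skip=true
    cd integration-tests
    mvn cargo:run -D<prod-b-specific-startup-properties>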

Update Prod to 0.8.6 on new 'B' stacks by John & Xa on 11/30 at 12:37

  1. Created new environments for auth, prod, portal using the saved configs (prod-B-portal, prod-B-auth, prod-B-repo).
  2. Dropped and recreated the prodB schema.
  3. Restarted the prod-B-repo application server.
  4. Waited for all of the tables to show up in prodB schema.
  5. Started the restore process using BackupDaemonJob88445-857150449942082859.zip.

Update Prod to 0.8.6 on new 'B' stacks by Mike on 11/29 at 20:00

  1. Use Synapse Deployer script to create 0.8.6 versions in the Synapse application
  2. Spin up new environments using saved stack B configuration
  3. Drop and recreate prodB schema in db
  4. Recreate the stack B repo service
  5. Shutdown the A stack portal
  6. Back up the A stack using python client
  7. Shutdown rest of the A stack
  8. Change CNAMES in Godaddy to point to B stack
  9. Restore from backup to B stack using python client
  10. Logged PLFM-804 after failing to restore from backup 3 times
  11. Shutdown stack B, restored CNAMES to stack A, rebuilt all stack A environments on 0.8.1.

Added better health check urls to staging by Nicole on 11/28 at 13:30

  1. Added health check urls to beanstalk load balancer configuration for both the repo service and the web ui

Finish Update Staging to 0.8.5 by Mike on 11/28 at 11:30

  1. PLFM-799 fixed in branch, portal-only code change
  2. Determined that reopening of PLFM-775 is due to deploying an old version and hence an old schema
  3. Drop and recreate stagingB schema in db
  4. Restart repo service
  5. Deploy newest build of branch to portal only

Update Staging to 0.8.5 on new 'B' Stacks by Mike on 11/28 at 15:30

  1. Spin up new B beanstalk environments using saved configurations
  2. Turned off prod-A-portal
  3. Ran backup on staging services
  4. Shutdown prod-A repo and auth services
  5. Drop and recreate stagingB schema in db
  6. Change CNAMES in GoDaddy to new stack
  7. Run SynapseDeployer to move staging up to 0.8.5
  8. Restore from backup
  9. Couple of errors identified:
    1. Reopened PLFM-775 as data previews did not show up
    2. Opened PLFM-799, a cosmetic issue on the all datasets page

Testing Security Token Service by Nicole on 11/16 at 17:00

  1. updated IAM policy for devServiceIamUser r5150                                                              
  2. turned on S3 bucket logging for devdata.sagebase.org to devlogs.sagebase.org
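
For reference only, the bucket-logging change in step 2 maps to the current AWS CLI roughly as follows (the CLI did not exist when this was done; bucket names are from the entry above, and the target bucket must already grant the S3 log-delivery group write access):

    # Enable server access logging from devdata.sagebase.org into devlogs.sagebase.org.
    aws s3api put-bucket-logging --bucket devdata.sagebase.org \
      --bucket-logging-status '{"LoggingEnabled":{"TargetBucket":"devlogs.sagebase.org","TargetPrefix":"devdata/"}}'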

Update Prod to 0.8.1 on new 'A' stacks by Mike on 11/8 at 21:00

  1. Spun up new A beanstalk environments using saved configurations
  2. Updated prodA-stack.properties with new endpoints
  3. Turned off prod-B-portal. 
  4. Ran backup on production stack. 
  5. Shut down prod-B auth and repo environments.
  6. Drop and recreate prodA schema in db.
  7. Change CNAMES in GoDaddy to new stack
  8. Run SynapseDeployer to move prod up to 0.8.1
  9. Validate log-in, empty database
  10. Run restore from backup
  11. Validate log in and download a dataset

Update Staging to 0.8.1 by Mike on 11/8 at 11:00

  1. Generated 0.8.1 tag to address PLFM-742; created 0.8 branch for future patches
  2. Ran SynapseDeployer to move staging up to 0.8.1

Finish update Staging Stack to 0.8.0 on new 'A' stacks by Mike on 11/4 at 9:30

  1. Rebuilt all staging A beanstalk environments
  2. Verified SSO and download data now work as expected.

Update Staging Stack to 0.8.0 on new 'A' stacks by Mike on 11/3 at 22:00

  1. Created tag 0.8.0 in SVN
  2. Built new staging A beanstalk environments; deployed 0.7.9;
  3. Set endpoints to local values in stagingA-stack.properties
  4. Dropped and recreated the database schema
  5. Ran the Synapse Deployer to bring staging A up to 0.8.0, verified stack is functional
  6. Shut down portal B;
  7. Ran backup utility on staging B
  8. Shut down services for staging B
  9. Changed GoDaddy to point to stack A endpoints
  10. Updated stagingA-stack.properties to use public endpoints, deployed the .properties file and restarted the app servers.
  11. At this point staging A seems functional, except I get a security exception on logging via sagebase.org account.  Rechecking GoDaddy and .properties file does not resolve issue.
  12. Continue with restore from backup.  Completes normally.  Validate that I can log in with synapse account and download data.

Cleaned up elastic IPs by Nicole 10/27

Do not release any AWS elastic IPs if they are not assigned to an instance but ARE listed in GoDaddy, because those may be instances we spin up and down on demand, such as tranSMART or RStudio.

Update Prod Stack to 0.7.9 on new 'B' stacks by Mike on 10/4 at 20:00

  1. Followed the exact same procedure as below for staging, on prod, without the pause to correct certificate errors.

Update Staging Stack to 0.7.9 on new 'B' stacks by Mike on 10/4 at 16:00

  1. Created tag 0.7.9 including Bruce's fix for PLFM-630.
  2. Uploaded .wars to S3 and created Beanstalk versions.
  3. Set endpoints in stagingB-stack.properties as follows and uploaded to S3:
    1. org.sagebionetworks.authenticationservice.privateendpoint=https://staging-b-auth.elasticbeanstalk.com/auth/v1
    2. org.sagebionetworks.authenticationservice.publicendpoint=https://auth-staging.sagebase.org/auth/v1
    3. org.sagebionetworks.repositoryservice.endpoint=https://repo-staging.sagebase.org/repo/v1
    4. org.sagebionetworks.portal.endpoint=https://synapse-staging.sagebase.org
  4. Updated the beanstalk environments I created yesterday to 0.7.9
  5. Verified ability to log-in to Synapse
  6. Ran data migration utility to restore from yesterday's backup of old Staging-A instance
  7. Verified data shows up and is downloadable from Synapse web client

Update Staging and Prod Stacks to 0.7.8 on new 'B' stacks by Mike on 10/3 at 19:00

  1. Created tag 0.7.8 at SVN 4638, moved trunk to 0.8
  2. Dropped and recreated the stagingB database (stagingA is the one in use)
  3. Downloaded .wars from Artifactory (have new Python script that does this), uploaded to S3 using console, followed work-around below to create the beanstalk version.  AWS console bug persists.
  4. Updated stagingB-stack.properties to use internal URLs.
  5. Created new staging B beanstalk environments following wiki
  6. Validate functionality to log-in, get certificate errors
  7. Run the data migration to back up stack A
  8. Shut down stack A
  9. Move DNS names to point to stack B
  10. Update stagingB-stack.properties to use external URLs and restart all app servers.  Still get error message on attempted login.

Update Staging and Prod Portal to 0.7.4-SNAPSHOT-4544 by Mike on 9/26 at 21:30

  1. Deployed snapshot build 4544 of portal only. 
  2. Starting Thurs 9/22 we started experiencing problems creating a new AWS Beanstalk application through the AWS console.  Logged problem in AWS forum.
  3. Following advice this work-around allowed me to update the new version:
    1. Uploaded the portal.war to S3 in its normal bucket.
    2. Executed the following command line from sodo to create the application version in both staging and production stacks: elastic-beanstalk-create-application-version -j -a Synapse-Staging -d Build4454 -l portal-0.7.4-SNAPSHOT-4544 -s elasticbeanstalk-us-east-1-325565585839/portal-0.7-SNAPSHOT-4544.war
    3. Used the AWS console to update the application version.  It is apparently only the creation of application version via AWS console that is broken.
  4. At this point, after updating both staging and prod, I realize that I cannot log into Synapse, as reported by Matt. Logging in gives an error as Synapse tries to log in at: http://localhost:8080/services-authentication-0.7-SNAPSHOT/auth/v1/openid
  5. Rolling back to previous version does not fix issue.  Validate that environment is still configured to point to prodA-stack.properties and this file still has configuration of proper end points.
  6. Bring up brand new prodA portal environment in parallel.  This has same problem.  Terminate the environment.
  7. Notice that prodA-stack.properties has changes for recent work (after 4544) to separate public and private endpoints for auth service, and therefore auth service is not actually set.  Roll back .properties file in S3 to old auth endpoint key.
  8. This now allows login.  Now update to SNAPSHOT 4544.  Validate still works
  9. Apply same fix to staging
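
For reference, the work-around in step 3.2 corresponds to the following in the current AWS CLI (illustrative only; the log used the older elastic-beanstalk command-line tools from sodo, and the values are copied from step 3.2):

    # Register an already-uploaded .war as a new beanstalk application version.
    aws elasticbeanstalk create-application-version \
      --application-name Synapse-Staging \
      --version-label portal-0.7.4-SNAPSHOT-4544 \
      --source-bundle S3Bucket=elasticbeanstalk-us-east-1-325565585839,S3Key=portal-0.7-SNAPSHOT-4544.war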

Update Staging and Prod to 0.7.4 by Mike on 9/6/2011 at 15:30, paused and completed at 23:30

  1. Tagged trunk as 0.7.4 (R client was up to 0.7.3 so we skipped some numbers)
  2. Dropped and recreated prodA and stagingA schemas (B stacks were in use, A stacks were around from prior use)
  3. Updated stagingA-stack.properties to use internal beanstalk URLs
  4. Created new beanstalk environments and deployed new .war files
  5. Verified stagingA functionality
  6. Migrated data from staging B to staging A
  7. Shut down stack B environments
  8. Updated stagingA-stack.properties to use public CNAMES, restarted app server
  9. Verified functionality
  10. Repeat 3-9 for prod

Update Staging Portal to 0.7 SNAPSHOT by Mike on 9/6/2011 at 20:30

  1. Deployed build 4340 of portal.war to staging to get Dave's latest UI in state to demo.

Update Staging to 0.7.0 by Mike on 9/2/2011 at 16:00

  1. Tagged trunk as 0.7.0
  2. Updated the .wars on the staging stack to new versions

Create "B" Stacks for Staging and Prod, and Deploy 0.6.5 on these by Mike on 8/19 at 11:30

  1. Created a 0.6 branch and 0.6.5 tag in SVN, moved trunk to 0.7
  2. Uploaded 0.6.5 wars to prod and staging Beanstalk applications as new versions
  3. Followed instructions for Synapse Database Setup and Configuration to create staging B and prod B users and schemas
  4. Built new properties files for the instances
    1. Downloaded stagingA-stack.properties and prodA-stack.properties from S3
    2. Renamed them to stagingB-stack.properties and prodB-stack.properties
    3. Updated the db connection info with data from step 2
    4. Used new encryption key to generate crowd and email passwords and updated .properties file.
    5. Updated the endpoints to use staging / prod b URLs
    6. Uploaded the .properties file to S3, bucket elasticbeanstalk-us-east-1-325565585839; with this naming convention, IAM service users should be able to get the config file using the existing policy
  5. Follow Synapse Stack Deployment to create and configure staging B environments; saved configuration of environments
  6. Follow Repository Administration to migrate data from staging A to staging B, validate that data exists in db and web client
  7. Go to GoDaddy and point staging CNAMES to the new stack. We still get security errors from redirect bug PLFM-506. Changed stagingB-stack.properties to use CNAMEs and updated the file in S3, restarted all staging B environments.
  8. Repeat steps 5,6,7 for prod stack, added the platform account to the Administrators group of Crowd.

Change Instance Configuration on Prod by Mike on 8/15 at 13:30

  1. Set Prod-A-auth to small instance, min instances 1, max instances 4
  2. Set Prod-A-repo to small instance, min instances 1, max instances 4
  3. Set Prod-A-portal to small instance, min instances 1, max instances 10
  4. Note: There was a period of a few minutes where the application state went to RED for the repo and portal environments, but it transitioned to GREEN with no involvement on my part.
  5. Confirmed ability to log in, download a dataset, and change an annotation.

Shut down demo environment by Mike on 8/15 at 9:30

  1. Shut down the prod-A-demo-portal environment given Dave's fix to PLFM-341                                                                                                 
  2. In GoDaddy, changed synapse.sagebase.org and synapse-demo.sagebase.org to point to the alpha portal URL

Upgrade Staging and Production to 0.6.3 by Mike on 8/12

  1. Tagged trunk as 0.6.2 at SVN 3850
  2. Changed the platform (root) user password of RDS, updated /work/platform/PasswordsAndCredentials/passwords.txt with the next password (random-generated)
  3. Deleted the following schemas in our RDS instance:
    1. prodRepositoryDb
    2. repoProduction
    3. repoStaging
    4. repodb_alpha_0_5
    5. repositorydb
    6. repositorydbCongress
    7. repositorydbNOGAE
  4. Deleted the users beans-staging and beans-production from RDS
  5. Deployed repo, auth, and portal .wars as new versions in existing Staging A stack.
  6. Discovered bug in repo service related to anonymous not being a valid user identifier (PLFM-463); John checked in a fix at SVN 3857.
  7. Deployed repo, auth, and portal .wars to staging and production A stacks.