Introduction
When to Migrate
Quick Note: If you only need to spin up a new stack that is an exact copy of another stack, you do not need to migrate the data. Instead, just create a database backup of the source schema and apply it to your destination repository. However, if any data migration is required, this option will not work. For example, to deploy code changes that require schema changes, the data must be migrated from the old schema to the new schema. Data migration is required in the following cases:
- Database schema changes.
- JSON Schema changes:
  - Addition of a 'required' property (the default value must be applied to all existing entities of that type).
  - Deleting a property.
  - Renaming a property.
  - Moving fields between primary and additional fields.
- Converting Entity types.
Adding new entity types and adding new non-required properties are the only JSON Schema changes that do not require migration.
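To make the 'required' property case concrete: every entity created under the old schema is missing the new property, so the migration must write the schema default into each existing entity. The sketch below is a hypothetical illustration of that backfill step only; the map-based entity representation, class name, and method name are assumptions and are not part of the actual migration code.

import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration only: why adding a 'required' property forces a
 * migration. The entity representation and property name are assumptions,
 * not the real Synapse entity model.
 */
public class RequiredPropertyBackfillExample {

    /** Apply the schema default to every entity that predates the new required property. */
    public static void backfill(List<Map<String, Object>> entities,
                                String requiredProperty,
                                Object defaultValue) {
        for (Map<String, Object> entity : entities) {
            // Entities created under the old schema do not have the property yet,
            // so the default value must be written during migration.
            entity.putIfAbsent(requiredProperty, defaultValue);
        }
    }
}

Deleting, renaming, or moving a property requires an analogous transformation of every existing entity, which is why those changes also force a migration.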
How Incremental Migration Works
The migration tool drives the migration of an entity from the source to the destination repository as follows:
- A snapshot of the entity is taken from the source repository using the repo/v1/admin/daemon/backup web-service.
- The destination repository is told to restore the snapshot from step 1 using the repo/v1/admin/daemon/restore web-service. Data migration occurs as part of the restoration process:
- The restoration daemon will read the entity data from the snapshot.
- The daemon will then trigger any required data transformation on the entity model.
- Finally, the transformed entity will be written to the destination database.
Note: The source repository can read from the old database schema, while the destination repository can write to the new database schema. There is no code that can read from the old schema and write to the new schema.
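As a rough illustration of this backup-and-restore flow, the sketch below drives one round trip with plain java.net.HttpURLConnection. Only the endpoint paths (repo/v1/admin/daemon/backup and repo/v1/admin/daemon/restore) come from the steps above; the JSON payloads, the sessionToken header, and the response handling are assumptions made for illustration, since the real tool drives these daemons through its own client code.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/**
 * Rough sketch of the backup/restore round trip described above.
 * Endpoint paths come from this documentation; the payloads and the
 * sessionToken header are assumptions made for illustration only.
 */
public class BackupRestoreSketch {

    public static void main(String[] args) throws Exception {
        String sourceRepo = "https://<source_repository_host>/repo/v1";
        String destRepo = "https://<destination_repository_host>/repo/v1";
        String adminToken = "<admin_session_token>"; // the auth mechanism is an assumption

        // 1. Ask the source repository to take a snapshot of a batch of entities.
        String backupRequest = "{\"entityIdsToBackup\":[\"syn123\",\"syn456\"]}"; // payload is an assumption
        post(sourceRepo + "/admin/daemon/backup", backupRequest, adminToken);

        // 2. Tell the destination repository to restore (and thereby migrate) that snapshot.
        String restoreRequest = "{\"backupFileName\":\"<backup_file_from_step_1>\"}"; // payload is an assumption
        post(destRepo + "/admin/daemon/restore", restoreRequest, adminToken);
    }

    // Minimal JSON POST helper; real code would check status and wait for the daemon to finish.
    private static void post(String url, String jsonBody, String token) throws Exception {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setRequestProperty("Content-Type", "application/json");
        con.setRequestProperty("sessionToken", token); // header name is an assumption
        try (OutputStream out = con.getOutputStream()) {
            out.write(jsonBody.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("POST " + url + " -> HTTP " + con.getResponseCode());
        con.disconnect();
    }
}

The real tool also batches entities (the org.sagebionetworks.batch.size property below) and runs multiple worker threads, which this sketch omits.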
Running the Migration Tool User Interface
- First, make sure both the source and destination stacks are set up and running.
- Download the migration tool UI from Artifactory: http://sagebionetworks.artifactoryonline.com/sagebionetworks/libs-snapshots-local/org/sagebionetworks/tool-migration-utility/0.10-SNAPSHOT/tool-migration-utility-0.10-SNAPSHOT-jar-with-dependencies.jar. Note: This URL will need to be updated when the version changes.
- The migration tool is configured using a property file passed as a command line argument. In this property file you will need to provide the information required to connect to both the source and destination stacks. Create a new property file using the following as a template (a small sketch for checking the file locally is shown after this list):
# The source authentication service
org.sagebionetworks.source.authentication.endpoint=https://<source_authentication_host>/auth/v1
# The source repository.
org.sagebionetworks.source.repository.endpoint=https://<source_repository_host>/repo/v1
org.sagebionetworks.source.admin.username=<source_admin_username>
org.sagebionetworks.source.admin.password=<source_admin_password>
# The destination data.
# The destination authentication service
org.sagebionetworks.destination.authentication.endpoint=https://<destination_authentication_host>/auth/v1
# The destination repository.
org.sagebionetworks.destination.repository.endpoint=https://<destination_repository_host>/repo/v1
org.sagebionetworks.destination.admin.username=<destination_admin_username>
org.sagebionetworks.destination.admin.password=<destination_admin_password>
# The max number of entities submitted per job.
org.sagebionetworks.batch.size=25
# The max number of threads used by this tool.
org.sagebionetworks.max.threads=4
# Worker timeout in MS
org.sagebionetworks.worker.thread.timout.ms=240000
- Save the property file to your local machine. This file will be used in the next step.
- Start the migration tool from the command line using the following:
Note: Replace <path_property_file> with the path to your property file from the previous step.
java -cp tool-migration-utility-0.10-SNAPSHOT-jar-with-dependencies.jar org.sagebionetworks.tool.migration.gui.MigrationConsoleUI "<path_property_file>.properties"
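For example, if the property file were saved as /home/<user>/migration.properties (this path is only an illustration), the invocation would be:

java -cp tool-migration-utility-0.10-SNAPSHOT-jar-with-dependencies.jar org.sagebionetworks.tool.migration.gui.MigrationConsoleUI "/home/<user>/migration.properties"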
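If the tool fails to start, one quick way to verify that the property file itself is well-formed is to load it with java.util.Properties, since it is an ordinary Java properties file. The snippet below is only a local diagnostic aid and is not part of the migration tool; the key names are taken from the template above.

import java.io.FileInputStream;
import java.util.Properties;

/**
 * Local diagnostic only: confirms the migration property file parses and
 * prints a few of the keys from the template above (passwords omitted).
 */
public class CheckMigrationProperties {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // args[0] is the path to the property file created earlier.
        try (FileInputStream in = new FileInputStream(args[0])) {
            props.load(in);
        }
        String[] keysToCheck = {
            "org.sagebionetworks.source.repository.endpoint",
            "org.sagebionetworks.destination.repository.endpoint",
            "org.sagebionetworks.source.admin.username",
            "org.sagebionetworks.destination.admin.username",
            "org.sagebionetworks.batch.size",
            "org.sagebionetworks.max.threads"
        };
        for (String key : keysToCheck) {
            System.out.println(key + " = " + props.getProperty(key, "<MISSING>"));
        }
    }
}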