Shared Module Library

Status

Critical

These must be in-place before external developers can start using this.

Non-Critical

Not a blocker for external developers to use this, but should be prioritized high.

Nice to Have

Quality of life improvements.

Future Work

Possible phase 2, pending further requirements and design.

Overview

We want to allow study developers to rapidly bootstrap and develop new studies. A big part of this is setting up surveys, activities, and schemas, both server-side and client-side. Frequently, a new study will want to share modules with another study; examples include Tapping, Walking, Tremor, and PHQ8. As such, we want a shared library of modules that study developers can import into their study and their app.

Scenarios

Provisioning and populating a study. A study developer provisions a new study. They go into the Bridge Study Manager UI, see a list of modules (surveys, schemas), and choose which modules they want to import into their study. The Researcher Portal automatically populates the study with the schemas and surveys and tells the developer which code module (Github link? Maven package and version, or equivalent in iOS?) to import into their app and how to configure it.

Customizing Surveys and Schemas. Study developers may want to customize a survey or schema after they import it.

Bootstrapping a new app. A study app developer wants to quickly build their study app. They start with the Template App (aka "Super App", aka Sample App, aka the app that contains all of the publicly available modules), download a set of configuration files from the Bridge Study Manager UI, and have a working study app with basic functionality.

Data analysis for shared modules. Multiple researchers are using the same module from the shared module library. They want to run some analysis for basic stats. Bridge provides standard analysis code packages for some of the shared modules. (Examples: Tapping, Walking, Voice.)

Publishing new shared modules. Currently, Sage needs to curate what's available in the shared library. Whether this is promoting a module from an existing study or creating a new "ideal" module in the shared space is still up for discussion. Either way, Sage will need to provide metadata and tagging for the module. Note that module metadata needs to include:

  • copyright and/or licensing restrictions
  • whether they are OS-specific
  • if it requires app code, where to find the app code (repo/code link, specific code version, etc)
  • default schedule that would provision with the module, so the module is accessible and usable right away

Business Development. One customer for the shared module feature is a scientist / funder / patient advocate putting together a study design and evaluating Sage's platform and services. This person may never actually log into the Study Manager and do anything with their own account. But we may want to make it easy for this sort of person to see all the modules available for use, either in the context of a demo by Sage employees or by reviewing the modules on their own time. This scenario implies there might be some functionality visible to a non-logged-in user.

Non-Goals

  1. Study developer owns two studies. Study A has a module that they want to use in study B. They want to clone that module from study A to study B. This is currently outside the scope of this particular feature, but is something we still want to do separately. See BRIDGE-1494.
  2. Study developer provisions a study from a template, already populated with schemas and surveys and schedules. We don't have a use case for study developers cloning an entire study. The expected use case is that study developers will pick and choose what they need.
  3. Consents are out of scope, as they will almost always need to be customized, based on what's in the app and based on local requirements. We may want to tackle this at a later date, but it's not included in this project.
  4. Schedules are out of scope, but we may add them in the future.  They're much more likely to be developed from scratch for each study.
  5. Survey and schema metadata in Synapse tables. This is orthogonal to the Shared Module Library and is already being tracked in a separate project: Upload Format v2, Subtask 1c: Export Survey Questions to Synapse. Note that this is currently blocked on a Synapse feature request, PLFM-3831.

Open Questions

Do we need to block some modules from being edited? (For example, a copyrighted survey like UPDRS can't be updated, ever.)

Do we need a story to notify local developers of new shared module versions?

Design Details

Shared Module Library

JIRAs:

BRIDGE-1542

BRIDGE-1543

Shared Study

Create a new reserved study called "shared", in which all the shared modules (shared surveys and schemas) will live. Making this a full-fledged study allows us to re-use a lot of our existing code and UI for managing surveys and schemas. It also allows easy separation of permissions between admins, shared study developers, and local study developers.

  • Only developers in the shared study can create or modify objects in the shared study. Any developer can read surveys and schemas from the shared study.
    • We'll need new APIs to read surveys and schemas from the shared study. This will largely have the same Service and DAO code. Controller code will differ only in permissions and will reference the shared study instead of the local study.
  • Surveys and schemas are copied from the shared study to the local study.
    • This decision was made because "fallback logic" to first check the local study then fallback to the shared study would require touching code in about a dozen different places. This would also add both conceptual and mechanical complexity, which increases development time, maintenance time, and results in more bugs.
    • This logic can live in the Study Manager UI. The Study Manager will read the shared object, then call the create API to create that object in the local study. (We'll need to check that the create APIs can be used to "copy" objects from other studies.)
    • Study Manager will need to annotate the newly created object with references to the original object in the shared study.
    • Bridge APIs for creating and updating surveys and schemas will need to allow publishing by setting the "published" flag to true. This is to allow Study Manager to create and publish surveys and schemas in a single atomic step.
  • When copying surveys and schemas, you'll need a reference to the shared module the survey or schema was copied from. This means having the shared module ID and version as fields in surveys and schemas.
    • The shared module ID and version will persist even when local developers create a new revision of the survey or schema. This is to track where this object originally came from. There is no need to explicitly track whether the object has been edited from its original form or not.
  • Publish workflow for schemas. We'll need to add a "published" flag to schemas and a publish API. This is to enforce immutability in the shared study and to force a revision bump if a study developer tries to modify a schema copied from the shared study.
    • For testing purposes, we will still allow apps to submit data for unpublished schemas.

CLARIFYING NOTE: The shared study is different from the study that the "Sample App" points to, which is "sample-study". No app should be pointed at the shared study, and no users should be submitting data to the shared study.
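
The copy and publish rules in the Shared Study section can be sketched as follows. This is a minimal illustration under stated assumptions, not the Bridge implementation: the `Schema` class, its field names, and the helpers `copy_to_local` and `update_fields` are all hypothetical, and the sketch assumes the copy keeps the shared object's revision number, which this document does not specify.

```python
from dataclasses import dataclass, replace
from typing import Optional

SHARED_STUDY_ID = "shared"  # reserved study name from this design


@dataclass(frozen=True)
class Schema:
    # Hypothetical, simplified stand-in for a Bridge schema.
    study_id: str
    schema_id: str
    revision: int
    fields: tuple
    published: bool = False
    # Provenance: the shared module this object was copied from, if any.
    module_id: Optional[str] = None
    module_version: Optional[int] = None


def copy_to_local(shared: Schema, local_study_id: str,
                  module_id: str, module_version: int) -> Schema:
    """Copy a shared schema into a local study, created and published in one step."""
    if shared.study_id != SHARED_STUDY_ID:
        raise ValueError("source must live in the shared study")
    # Assumption: the local copy keeps the shared revision number.
    return replace(shared, study_id=local_study_id, published=True,
                   module_id=module_id, module_version=module_version)


def update_fields(schema: Schema, new_fields: tuple) -> Schema:
    """Edit a schema; a published schema is immutable, so this bumps the revision."""
    if schema.published:
        # New unpublished revision; module_id and module_version carry over.
        return replace(schema, revision=schema.revision + 1,
                       fields=new_fields, published=False)
    return replace(schema, fields=new_fields)
```

Because the provenance fields are ordinary fields that `update_fields` never touches, they persist across local revisions without any extra "has this been edited" tracking.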

Module Metadata

New object type (and APIs) for module metadata, which can only be written by shared developers, but can be read and listed by shared developers and normal developers. Since shared objects live in the special shared study, the module metadata is study-agnostic and can be accessed from any study (as long as the accessing account has the appropriate permissions).

To aid discoverability, we want to be able to search and filter on a variety of fields, such as name, type, or tags. This suggests we should use a relational database like Amazon RDS.

Module metadata includes:

  • ID - Unique key that identifies this shared module. Since Sage curates all of these, we can make these user-friendly, like study IDs. Note that multiple versions of the same module share the same ID and are distinguished by version.
  • version - Monotonically increasing version number.
  • publish workflow - Modules will need a publish workflow to prevent changes once they are "released and made public". As an optimization, publishing a module will also publish its corresponding survey or schema. For testing purposes, we will still allow local developers to import unpublished modules.
  • name (user-friendly name that describes what this is, used for discoverability and display)
  • type (survey or schema)
  • key (guid for surveys, ID for schemas)
  • revision (createdOn for surveys, revision number for schemas)
  • isCopyrighted (or isLicenseRestricted or other reasonable name)
  • OS (if it's iOS or Android specific, blank otherwise)
  • tags (for searching, filtering, and general discoverability)
  • notes (explanatory text shown in Bridge Study Manager UI) - This will be a generic String for now. The editor and view for this will be in the Study Manager and will most likely be HTML.
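
As a rough sketch of the metadata record and the search/filter behavior described above, assuming an in-memory list rather than the relational database this design suggests: the `ModuleMetadata` class, its Python field names, and the `search` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModuleMetadata:
    id: str                    # user-friendly ID, shared across versions
    version: int               # monotonically increasing
    name: str                  # display name, used for discoverability
    module_type: str           # "survey" or "schema"
    key: str                   # survey guid or schema ID
    revision: str              # survey createdOn or schema revision number
    published: bool = False
    is_copyrighted: bool = False
    os: Optional[str] = None   # "iOS" or "Android"; None if not OS-specific
    tags: frozenset = frozenset()
    notes: str = ""


def search(modules, name_contains=None, module_type=None, tags=None,
           published_only=False):
    """Return the modules that match every supplied criterion, in input order."""
    wanted = set(tags or ())
    results = []
    for m in modules:
        if name_contains and name_contains.lower() not in m.name.lower():
            continue
        if module_type and m.module_type != module_type:
            continue
        if not wanted <= m.tags:  # every requested tag must be present
            continue
        if published_only and not m.published:
            continue
        results.append(m)
    return results
```

In a relational store, each criterion would map to a `WHERE` clause, with tags most likely held in a separate join table.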

Shared Module Library Operations

Searching and Selecting from Bridge Study Manager UI. Study developer goes to a list of shared modules in the Bridge Study Manager. They can search or filter the shared modules based on criteria. They can view further information about a module (licensing restrictions, link to app code, tags, notes). They can click an "add to study" button, which copies the module into their study.

  • Bridge Study Manager UI for listing, searching, filtering, and selecting modules
  • Bridge Server API for the same
  • Bridge Study Manager operation for copying the survey or schema from the shared library to the local study.

Creating and Editing Shared Modules. If a shared developer goes to the list of shared modules, they also have the option of editing an existing module or creating a new module. This takes them to the module metadata page, where they can create and edit module metadata. They can click through to edit the survey or schema contained in the module.

  • Bridge Study Manager UI for creating and editing module metadata.
  • Bridge Server APIs for creating and editing module metadata.
  • Bridge Study Manager UI for creating and editing shared surveys and shared schemas. This can use the same UI as local surveys and schemas, but we may need separate navigation and logic as these will call different server APIs.
  • A version/publish workflow for shared modules.

Promoting to the Shared Library. Out of scope. We currently don't need a full-fledged experience, as this will require dev intervention anyway and a lot of curating and metadata editing. Instead, we can use the copy survey and schema tool if we want to copy something from (for example) mPower to the shared study. See BRIDGE-1494.

Template App

The expectation is that, once the Template App is completed, all future app development will be based on the Template App. This is to allow code consolidation so that (a) there are fewer "forks" of Bridge-based apps going around and (b) everyone's app works in a similar, predictable way. As such, it's critically important to have the Template App to launch the Shared Module Library. This includes app development on iOS (in progress), Android (not started), and Bridge Server work to facilitate the Template App.

As a corollary, this means that the Shared Module Library doesn't need to know where to find app code for specific modules. It will always be available in the Template App.

iOS: BRIDGE-1514

Android: BRIDGE-1515

  • App that, by default, does nothing but contains every public app module (example: Memory, Tapping, Voice, Walking).
  • Can be configured to point to any study.
  • Comes with a default lightweight sign-up and consent experience, but these can be swapped out.
  • Use case: App developer provisions a new study, wants to quickly bootstrap a new app. They create a new app project, import the Template App, and download configuration files. This gives them an app with basic functionality that talks to their study, and pulls down their schedules and surveys.
  • We'll want to build one for both iOS and for Android.
  • iOS version currently pending AppCore/BridgeAppSdk refactor (and prioritization).

Server-Side Work

JIRA: BRIDGE-1544

Study developers may need to make breaking changes to a schema (and hence bump a new schema rev) without needing to update the app. At the same time, if we make a breaking change to the app, we need a way for older app versions to use the older schema rev while newer app versions use the new schema rev. To solve this, the schedule API will return tasks populated with a list of schema IDs and revs.

Details:

  • Schema revisions have min/maxAppVersion per OS.
  • Tasks will return a list of surveys and/or schemas - The original assumption was each task corresponds to one "thing the user had to do". This assumption proved false in Lilly and Smart4Sure, where a single task could contain memory, tapping, voice, walking, potentially up to half a dozen things.
  • Study developers will define schedules and tasks that include a list of survey guids and schema IDs - They don't need to specify the survey createdOn or schema rev here. Otherwise, they end up having to create a separate schedule for each app version or specify a bunch of schema rev mappings, which can be tedious and redundant.
  • Schedule API will, at request time, look up the latest schema revs for the current app version and populate the task with those schema revs - This means even if the app upgrades after the task is created and scheduled, we can still give them the most up-to-date schema rev.
    • Similarly, we'll want to populate the task with survey createdOn.

Example: Foo App version 1 uses Tapping Activity v5. Foo App Version 2 introduces 2-handed tapping, which is Tapping Activity v6. We add maxAppVersion 1 to Tapping Activity v5 and minAppVersion 2 to Tapping Activity v6. When Foo App version 1 asks for the schedules, we return Tapping Activity v5. For Foo App version 2, we return Tapping Activity v6.

We find an error in Tapping Activity v6, so we correct it and bump the rev to v7. Foo App version 2 now receives Tapping Activity v7. Foo App version 1 still receives Tapping Activity v5.
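
The Foo App example can be sketched in a few lines. This is an illustration of the lookup rule only, under two assumptions: it handles a single OS (the design calls for min/maxAppVersion per OS), and "latest" means the highest revision number whose app version range includes the requesting app. `SchemaRevision` and `resolve_revision` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SchemaRevision:
    schema_id: str
    revision: int
    min_app_version: Optional[int] = None  # None means no lower bound
    max_app_version: Optional[int] = None  # None means no upper bound

    def supports(self, app_version: int) -> bool:
        """True if app_version falls inside this revision's allowed range."""
        if self.min_app_version is not None and app_version < self.min_app_version:
            return False
        if self.max_app_version is not None and app_version > self.max_app_version:
            return False
        return True


def resolve_revision(revisions, schema_id, app_version):
    """Return the highest revision of schema_id usable by app_version, or None."""
    candidates = [r.revision for r in revisions
                  if r.schema_id == schema_id and r.supports(app_version)]
    return max(candidates) if candidates else None


# Tapping v5 is capped at app version 1; v6 and the corrected v7 need version 2.
revisions = [
    SchemaRevision("tapping", 5, max_app_version=1),
    SchemaRevision("tapping", 6, min_app_version=2),
    SchemaRevision("tapping", 7, min_app_version=2),
]
print(resolve_revision(revisions, "tapping", 1))  # 5
print(resolve_revision(revisions, "tapping", 2))  # 7
```

Running the lookup at request time, rather than when the task is scheduled, is what lets an upgraded app pick up the newer rev for tasks that already exist.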

Shared Data Analysis

JIRA: BRIDGE-1545

Overview

The research team at Sage will own the standard libraries for post-processing and feature extraction for the shared components (Memory, Tapping, Voice, Walking, etc). These packages will be generic and can be run against any table in any study, as long as the data format is compatible. Bridge team will build a platform to automatically run these libraries as soon as the Bridge data is exported to Synapse. This Data Analysis Platform is also responsible for passing in the Project and Table IDs to the data processing scripts.

When a researcher (internal or external) adds a shared module to their study, if that module has a data analysis package, the Study Manager will call Bridge Server to register that package to their study. This signals to the Data Analysis Platform to include their study in that data analysis package's execution.

Bruce's work in https://github.com/Sage-Bionetworks/mPowerProcessing and https://github.com/Sage-Bionetworks/mPowerStatistics can form the basis of the Data Analysis Platform. This includes logic to determine the delta of work to be done. It does not include scheduling.

Requirements:

  • Data Analysis packages live in a location where they can be easily edited by Sage internal.
  • Versioning - Research teams may want to lock into a specific version of the data analysis code so they can have consistent analysis throughout the lifetime of their study.
    • As a corollary, this means different versions will need to be able to run concurrently.
  • Data Analysis platform can run nightly or hourly, depending on the needs of the study.
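
One way to picture the versioning requirement is as a grouping step that the Data Analysis Platform might run before each execution. This is a sketch only: `Registration`, `plan_runs`, the version strings, and the table identifiers are hypothetical, and the rule that an unpinned study follows the latest package version is an assumption this document does not state.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registration:
    # Hypothetical record created when a study registers an analysis package.
    study_id: str
    package: str
    table_id: str
    pinned_version: Optional[str] = None  # None: follow the latest version


def plan_runs(registrations, latest_versions):
    """Group registered tables by (package, version).

    Each group can be executed separately, so studies pinned to an older
    version run alongside studies that follow the latest one.
    """
    runs = defaultdict(list)
    for reg in registrations:
        version = reg.pinned_version or latest_versions[reg.package]
        runs[(reg.package, version)].append((reg.study_id, reg.table_id))
    return dict(runs)
```

For example, a study pinned to version "1.0" of a tapping package and an unpinned study would land in two separate groups once "2.0" becomes the latest version.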

Non-Goals:

  • Data cleansing - For reasons beyond the scope of this document, when Bridge first launched, Sage had little to no control over the data format, and many breaking changes were made to the data in the early days. This necessitated a "data cleansing" step to unify the incompatible data formats into a single data format that could be widely used. The data formats are much more stable today, so there's no need for a data cleansing step (at least not a generalized one that can be shared across different projects).
  • External research teams sharing their own data analysis packages - This is currently out of scope. Currently, Sage curates and manages all of the shared data analysis packages.

Scheduling and Bootstrapping

We need a way for BridgeEX to signal to the Data Analysis Platform to begin running. Options include:

  1. AWS R client to poll an SQS queue.
  2. Java app (which trivially interacts with AWS) which runs R code.
  3. Synapse triggers running R code off an update to a specific Synapse table.

Configuration

Similarly, we need a way for Bridge to tell the Data Analysis Platform which studies and which tables to process. Bridge Server will most likely track this data either in DynamoDB or in RDS. However, we still need to communicate it to the Data Analysis Platform somehow. Options include:

  1. AWS R client
  2. Java app running R
  3. BridgeEX reads from these tables and writes to a configuration table in the Synapse project.
  4. Bridge Server writes directly to the config tables in Synapse.

Change History

2016-10-05T18:14-0700:

  • For copied schemas and surveys, instead of having references to the original schema and survey, they have references to shared module metadata.
  • Module metadata needs versioning and publish workflow.
  • Instead of a "get me latest revs for each schema" API, this is now included in tasks (scheduled activities).
  • Additional minor changes.