...

Synapse Managed Plans require a new object that encapsulates the definition of a plan, that is, its limits (storage, egress, and others), in order to manage data and user governance across projects. Organizations are the proposed solution. Organizations are a paradigm used by other applications (GitHub, HuggingFace). Organizations allow us to support users with complex data curation needs that span projects.

What are Organizations? Organizations are objects that enforce Plan limits. Organizations serve no other purpose: they are not intended for governance or access (other than project creation and data upload as part of plan enforcement), data curation, or dissemination.

...

and encapsulate a collection of projects, storage locations or both.

Users may not desire Organizations (or managed plans, since usage of Synapse was previously unrestrained). But once Organizations are deployed, they represent substantial assets that users will want to protect from misuse.

To protect Sage from runaway or unmanaged costs of data egress or storage, we need to implement controls that prevent users from uploading too much data or egressing data faster than we can financially manage, and that support the managed plans through enforcement.

  • Eliminate unlimited free storage and egress by Synapse users.

  • Configurably manage the total size of content in Synapse to meet the requirements of managed plans

  • Display size limits and current size in SWC in a way that users can understand and act on

  • Prevent users from uploading additional content when project limit is reached

  • Restrict access to a plan’s resources to authorized users

Glossary

Content: User-contributed files, datasets. (as opposed to Data which is too ambiguous)

Download: Download is defined as a request for a pre-signed URL.

Limits: The restrictions on data, egress, or other processes, counts, etc.

For example, an Organization may have a limit on storage, egress, and/or other limits we may define in the future. 

Organization: An object to encapsulate the collection of users, projects, teams and storage.  An Organization is defined by a Plan and enforces the Plan’s Limits.

Plan (Managed Plan, plans): A plan is the construct that maps accounting and limits to an Organization.  All users of Synapse get a (free) Basic plan (and an associated Organization). Users who wish to purchase a paid plan (Self-Managed or DCC) will need to contact Sage to create the Organization associated with the Plan.  The Plan accounts for the limits and length of the agreement; the Organization implements the Plan within Synapse.

Project: A Synapse project.  Definition of a project is not changed for this discussion.

Storage (buckets): Storage can be one of the following types:

Shared Storage: Shared storage is any storage managed by Sage on behalf of the user.  Shared storage MAY contain data from many Organizations. Storage and egress are limited by the Organization’s Plan. Sage’s shared storage bucket proddata is an example of shared storage.

Custom Storage: Storage managed by the user. Users control and manage the storage. Storage and egress costs are paid by the user and no limits are imposed by Synapse.  Data from custom storage does not count against the plan limits.

Private Storage (previously BaaS): Storage managed by Sage on behalf of the Organization in the form of an S3 bucket that contains only that Organization’s content.  Storage and egress are limited by the Organization’s Plan.  AWS Open Data is an example of private storage.

Storage Limit: The maximum amount of storage (in GB) allowed for all data in the organization.

Egress Limit: The upper limit of egress of data from within projects in the organization (measured in GB/Year).
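The Download and Egress Limit definitions above can be illustrated with a minimal sketch. It assumes that egress is charged at the moment a pre-signed URL is requested, for the full file size, and that "GB/Year" means a rolling 365-day window; neither assumption is settled by this document. The class name EgressMeter and its methods are hypothetical and are not part of Synapse.

```python
from datetime import datetime, timedelta


class EgressMeter:
    """Hypothetical per-Organization egress tracker (illustration only)."""

    def __init__(self, limit_gb_per_year):
        self.limit_gb_per_year = limit_gb_per_year
        self._events = []  # list of (timestamp, size_gb) for granted downloads

    def used_gb(self, now):
        # Assumption: the year is a rolling 365-day window ending at `now`.
        cutoff = now - timedelta(days=365)
        return sum(size for ts, size in self._events if ts > cutoff)

    def request_presigned_url(self, size_gb, now):
        # A Download is a request for a pre-signed URL, so egress is
        # charged here rather than when bytes are actually transferred.
        if self.used_gb(now) + size_gb > self.limit_gb_per_year:
            return False  # request would exceed the Egress Limit
        self._events.append((now, size_gb))
        return True


meter = EgressMeter(limit_gb_per_year=100)
start = datetime(2023, 1, 1)
granted = meter.request_presigned_url(60, start)
```

A fixed contract year tied to the length of the Plan agreement would be an equally plausible reading; only the cutoff computation in used_gb would change.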

User Classes

  1. Downloader Only - User who does not need a plan - only wishes to browse and access data.

  2. Basic Plan User (aka: Independent Researchers) - User who wants a Basic Plan for publishing content. Each Certified User account has a single Default Organization as part of a Basic Plan.

  3. Multi-Plan User - User who has one Basic Plan and is a member of one or more Self-Managed Plans or DCC Plans.

  4. Legacy Plan User - User who uploaded content into Synapse prior to the plan rollout date (##/##/2023) or is a member of a Legacy Plan.

Use Cases & Scenarios

Actors:

Data Contributors: Data contributors are those individuals who upload data. They must be Certified (in order to upload) AND must be a member of the organization in order to create a project or upload data into an Organization.

Data Users (Downloaders): Data users need to have no relationship to an Organization. They can access data from any Organization using the governance and controls that exist today.

Organization Administrators

Organization Managers (may or may not be Data Contributors): The Organization Manager(s) are responsible for controlling membership in the organization and setting up the default storage location.

Organization Members:

System (Synapse):

Scenario 1: New user to the system

Dr Fauci is a new cancer researcher with a grant for his new vaccine. He is researching appropriate repositories for his lab’s work. He finds Synapse.org, creates an account, and Certifies it. He then navigates to his Dashboard, which shows an empty list of projects. He clicks on “Create New Project” and is prompted to name a new Organization and Project. He’s not sure what an Organization is, but assumes it’s similar to his GitHub experience. Nonetheless, he clicks on the ? link to go to help and reads about Organizations and plans. He names his Organization “CancerOrg123” and his new project “ProjectCancer1”. He now has a project and an Organization. After several days of kicking the tires, he decides that Synapse is his choice for his work, and he wants his research associates, Thing1 and Thing2, to upload data to his project. He goes back to his Synapse Organization page and chooses “Invite users”, where he invites Thing1 (already a Synapse user) by user name and Thing2 (not a Synapse user) by email.

Thing1 and Thing2 create new projects and start uploading data into Dr Fauci’s org. After a few days, he sees that he is close to his 100GB storage limit. He clicks on the Organizations menu on the left rail of Synapse and is shown his one organization, “CancerOrg123”. From there, he can view the Organization page, which describes his org as a Basic Plan and leads to the SYNSD Managed Plan form. He fills out the form to the best of his ability and clicks “Send”. He receives an email response from Sage, forms are exchanged, signatures are signed, and a few days later “CancerOrg123” is created as a Managed Plan with 500GB. He then receives an email from Ann at Sage offering to help him configure the governance for his data.

As a Project owner (necessarily an Organization Member), I must be able to see the storage used by my project, as well as the limit and the storage used for the organization, in order to make decisions on how to control my data usage.

As a System, Synapse needs to track a collection of projects and storage locations associated with those projects. (Organizations)

As a System, Synapse can limit storage based on Organization in order to limit storage costs for Sage.

As a System, Synapse can limit egress based on Organization for Shared and Private storage locations in order to limit Egress costs for Sage.

As an Organization Manager, I want to invite other Synapse users to my Organization so that they create projects and upload data as part of my Organization.

As an Organization Manager, I want to invite non-Synapse users to my Organization via email so that I can add people who are not yet Synapse users.

As an Organization Manager, I want to remove a user from my organization while retaining the data they contributed. (The Organization is the data owner; the data contributor only uploads data.)

As an Organization Manager, I want Synapse to limit the size of data uploaded into my Custom Storage through Synapse.

As an Organization Manager, I want to be able to rename my organization.

As an Organization Manager, I want to merge two organizations that I manage into a single organization.

As an Organization Manager, I want to remove empty project(s) from my organization, which reverts them to the project owner’s Basic Plan organization.

As an Organization Manager, I want to see usage against my organizations' storage limits, broken out by project, by storage location, and by contributor over time, so that I can make decisions about my organization (such as when to purchase more storage).

As a System, Synapse SHALL NOT have projects outside of o

Functional Design

Organization Properties

...

Synapse MUST store and maintain project and organization relationships.

Synapse Projects MUST be a member of one (and only one) organization.

Projects MAY BE migrated between two organizations.

Hierarchy:

Projects to Organizations (Many to One)
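The many-to-one relationship above can be sketched as a mapping keyed by project, which makes "one (and only one) organization" hold by construction. This is an illustrative sketch only: the class OrganizationRegistry, its method names, and the string identifiers are hypothetical and do not describe the Synapse data model.

```python
class OrganizationRegistry:
    """Hypothetical store of project-to-organization relationships."""

    def __init__(self):
        # Keying by project guarantees each project maps to exactly one org.
        self._org_of_project = {}

    def assign(self, project_id, org_id):
        # A project that already belongs to an organization must be migrated,
        # not assigned again.
        if project_id in self._org_of_project:
            raise ValueError(f"{project_id} already belongs to an organization")
        self._org_of_project[project_id] = org_id

    def migrate(self, project_id, new_org_id):
        # Projects MAY BE migrated between two organizations.
        if project_id not in self._org_of_project:
            raise KeyError(f"{project_id} is not assigned to any organization")
        self._org_of_project[project_id] = new_org_id

    def organization_of(self, project_id):
        return self._org_of_project[project_id]

    def projects_in(self, org_id):
        # One organization can hold many projects.
        return sorted(p for p, o in self._org_of_project.items() if o == org_id)
```

Keying the mapping by project rather than keeping a list of projects per organization means a project can never appear in two organizations at once, and migration is a single update.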

...

Storage limits are calculated against any content file handle in shared or private storage.  Content stored in Custom Storage is not counted against the limit. For example, take a simple organization with a Self-Managed plan (100 GB limit) and 3 projects (Projects A, B, and C).  Project A (30 GB) has its storage in private storage A.  Project B (40 GB) has its storage in private storage B.  Project C (700 GB) has its storage in custom storage.
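The arithmetic for the example organization can be worked through with a short sketch. It applies only the rule stated above (shared and private storage count, custom storage does not); the names StorageType, ProjectUsage, and counted_storage_gb are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class StorageType(Enum):
    SHARED = "shared"
    PRIVATE = "private"
    CUSTOM = "custom"


@dataclass(frozen=True)
class ProjectUsage:
    name: str
    storage_type: StorageType
    size_gb: float


# Only shared and private storage count against the Storage Limit.
COUNTED_TYPES = {StorageType.SHARED, StorageType.PRIVATE}


def counted_storage_gb(projects):
    """Sum the storage that counts against the organization's limit."""
    return sum(p.size_gb for p in projects if p.storage_type in COUNTED_TYPES)


projects = [
    ProjectUsage("Project A", StorageType.PRIVATE, 30),
    ProjectUsage("Project B", StorageType.PRIVATE, 40),
    ProjectUsage("Project C", StorageType.CUSTOM, 700),
]
limit_gb = 100
used_gb = counted_storage_gb(projects)
remaining_gb = limit_gb - used_gb
```

Under this rule the organization has used 70 GB of its 100 GB limit, leaving 30 GB, even though it holds 770 GB in total; the 700 GB in custom storage is excluded.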

...

Organization Admins can update the limits of an organization:

Notifications

...

Transitioning Existing Data to an Organized Synapse

Data and content exist in pre-organization Synapse today. This “Legacy” content MUST BE allocated to appropriate new Organizations.  This will require some data analysis.

Basic Plan Requirements

Qualified Synapse users MUST NOT BE required to contact Sage in order to create and begin utilizing a Basic Plan.

The user experience for users migrating from a Basic Plan to a Self-Managed Plan SHOULD BE easy and understandable.

Open Questions

Please put in any open questions you have and I will address them - Kevin

...