Synapse Entity Types
Entity Types
When users work with Synapse, whether through the Web UI, the R Client, or another client, they see objects such as the following:
- Project
- Study
- Data
- Analysis
- many more...
Each of these objects is a Synapse Entity defined by a JSON schema that follows the IETF draft draft-zyp-json-schema-03. The schema for each Entity type defines the properties that a user can expect to find on every instance of that type. The schema definition allows property inheritance, so a base type can be defined once and then extended, much like a class hierarchy in programming languages such as Java and C++. In fact, Entity is itself a base type that all Synapse Entities extend. This base Entity type describes the basic metadata needed to work with Synapse. The following table lists the most basic properties defined in the Entity JSON schema:
Field Name | Description |
---|---|
name | The name of the entity. This is the most prominent field of the entity and will show up everywhere. |
id | This is the Synapse ID for an entity. Synapse automatically issues an immutable Synapse ID to every entity upon creation. This ID is the key used for most entity-related API calls. |
parentId | The Synapse ID of an entity's parent. An entity can only have one parent. Note: If a parent is deleted, all of its children will be deleted, much like a folder on a file system. |
description | The description provides more details about an Entity. This field is very prominent on the Synapse web site. |
entityType | This field is used to communicate the Entity's type with all web-service calls. This is a required field that will be covered in more detail further on. |
createdBy | The user that created this entity. |
createdOn | The date-time that this entity was created. |
modifiedBy | The user that last modified this entity. |
modifiedOn | The date-time that this entity was last modified. |
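The note on parentId implies that deleting an entity removes its whole subtree. The following Python sketch illustrates that rule on a handful of made-up entities. The `syn1`-style IDs, the names, and the helper `ids_removed_by_delete` are hypothetical and are not part of the Synapse API; the sketch only models the rule and does not call Synapse.

```python
from collections import defaultdict


def ids_removed_by_delete(entities, deleted_id):
    """Return the IDs removed when `deleted_id` is deleted, children included."""
    # Index children by their parentId so each level is a dictionary lookup.
    children = defaultdict(list)
    for entity in entities:
        children[entity.get("parentId")].append(entity["id"])

    removed = []
    pending = [deleted_id]
    while pending:
        current = pending.pop()
        removed.append(current)
        # Every child of a removed entity is removed as well.
        pending.extend(children[current])
    return sorted(removed)


# Hypothetical entities; IDs and names are invented for illustration.
entities = [
    {"id": "syn1", "name": "My Project", "parentId": None},
    {"id": "syn2", "name": "My Study", "parentId": "syn1"},
    {"id": "syn3", "name": "My Data", "parentId": "syn2"},
    {"id": "syn4", "name": "Other Project", "parentId": None},
]

print(ids_removed_by_delete(entities, "syn2"))  # ['syn2', 'syn3']
```

Because each entity has exactly one parent, the entities form a tree and the traversal visits every descendant once.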
Entity Type Schema
Each entity type is defined with a JSON schema (see: draft-zyp-json-schema-03). For example, the Link Entity is defined by the following schema:
{
  "title":"Link",
  "description":"JSON schema for a Link",
  "implements":[
    { "$ref":"org.sagebionetworks.repo.model.Entity" }
  ],
  "properties":{
    "linksTo":{
      "type":"object",
      "$ref":"org.sagebionetworks.repo.model.Reference",
      "description":"The synapse id that this link points to.",
      "title":"Links To"
    },
    "linksToClassName":{
      "type":"string",
      "description":"The synapse Entity's class name that this link points to.",
      "title":"Links To Class Name"
    }
  }
}
In this example, we can see that Link implements Entity and adds two properties that are specific to Links: 'linksTo' and 'linksToClassName'. The links to the actual JSON for each entity schema can be found in the table below. The full list of the Entities currently registered in Synapse can be found in the following file: Register.json. We will cover more on the register in the next section.
Get Entity Registry
The following curl command shows how to fetch the current Registry from Synapse:
Request
$ curl -i -H Accept:application/json 'https://repo-prod.sagebase.org/repo/v1/entity/registry'
Response
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/json
Transfer-Encoding: chunked
Date: Fri, 01 Jun 2012 23:16:11 GMT

{
  "entityTypes":[
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project" ],
      "name":"dataset",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Study",
      "aliases":[ "dataset", "study" ]
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Study", "org.sagebionetworks.repo.model.Project" ],
      "name":"layer",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Data",
      "aliases":[ "layer", "data" ]
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Folder", "org.sagebionetworks.repo.model.Project", "DEFAULT" ],
      "name":"project",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Project"
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Data" ],
      "name":"preview",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Preview"
    },
    {
      "validParentTypes":[ "DEFAULT", "org.sagebionetworks.repo.model.Folder" ],
      "name":"folder",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Folder"
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project" ],
      "name":"analysis",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Analysis"
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Folder", "org.sagebionetworks.repo.model.Analysis", "DEFAULT" ],
      "name":"step",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Step"
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project" ],
      "name":"code",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Code"
    },
    {
      "validParentTypes":[
        "org.sagebionetworks.repo.model.Project",
        "org.sagebionetworks.repo.model.Folder",
        "org.sagebionetworks.repo.model.Study",
        "org.sagebionetworks.repo.model.Data",
        "org.sagebionetworks.repo.model.Step",
        "org.sagebionetworks.repo.model.Analysis",
        "DEFAULT"
      ],
      "name":"link",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.Link"
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project", "org.sagebionetworks.repo.model.Study" ],
      "name":"phenotypedata",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.PhenotypeData",
      "aliases":[ "phenotypedata", "layer" ]
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project", "org.sagebionetworks.repo.model.Study" ],
      "name":"genotypedata",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.GenotypeData",
      "aliases":[ "genotypedata", "layer" ]
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project", "org.sagebionetworks.repo.model.Study" ],
      "name":"expressiondata",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.ExpressionData",
      "aliases":[ "expressiondata", "layer" ]
    },
    {
      "validParentTypes":[ "org.sagebionetworks.repo.model.Project", "org.sagebionetworks.repo.model.Study" ],
      "name":"robject",
      "defaultParentPath":"/root",
      "entityType":"org.sagebionetworks.repo.model.RObject"
    }
  ]
}
Get Full Entity Schema
The "entityType" can be used like any REST Resource ID to get the full schema for an Entity. The following example shows how to get the full schema of a Study:
Request
curl -i -H Accept:application/json 'https://repo-prod.sagebase.org/repo/v1/REST/resources/schema?resourceId=org.sagebionetworks.repo.model.Study'
Response
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/json
Transfer-Encoding: chunked
Date: Fri, 01 Jun 2012 23:27:36 GMT

{
  "title":"Study",
  "description":"JSON schema for Study POJO",
  "implements":[
    { "$ref":"org.sagebionetworks.repo.model.Locationable" }
  ],
  "properties":{
    "platform":{
      "title":"Platform",
      "description":"Chip platform for the samples in this Study. Platform is described by the Synapse ontology concept: http://synapse.sagebase.org/ontology#12591",
      "links":[
        { "rel":"describedby", "href":"http://synapse.sagebase.org/ontology#12591" }
      ],
      "type":"string"
    },
    "species":{
      "title":"Species",
      "description":"The species associated with this Study",
      "type":"string"
    },
    "tissueType":{
      "title":"Tissue Type",
      "description":"Type of tissue for the samples in this Data. Tissue is described by the Synapse ontology concept: http://synapse.sagebase.org/ontology#11171",
      "links":[
        { "rel":"describedby", "href":"http://synapse.sagebase.org/ontology#11171" }
      ],
      "type":"string"
    },
    "numSamples":{
      "title":"Number of samples",
      "description":"Approximate number of samples in this Study",
      "type":"integer"
    },
    "disease":{
      "title":"Disease",
      "description":"The disease associated with this Study",
      "type":"string"
    }
  }
}
Get Effective Entity Schema
The full schema is useful for seeing the complete type hierarchy of an Entity. However, if you just need to know what the properties of an Entity are, it can be tedious to build up this list by navigating the type hierarchy. To get around this problem, the REST API provides an "Effective" schema for any REST Resource. The effective schema is the schema of a resource with its entire type hierarchy collapsed into a single list of properties. For example, here is how to get the effective schema from the REST API:
Request
curl -i -H Accept:application/json 'https://repo-prod.sagebase.org/repo/v1/REST/resources/effectiveSchema?resourceId=org.sagebionetworks.repo.model.Study'
Response
HTTP/1.1 200 OK Server: Apache-Coyote/1.1 Content-Type: application/json Transfer-Encoding: chunked Date: Fri, 01 Jun 2012 23:35:04 GMT { "id":"org.sagebionetworks.repo.model.Study", "title":"Study", "description":"JSON schema for Study POJO", "name":"Study", "properties":{ "accessControlList":{ "description":"The URI to get to this entity's access control list", "transient":true, "type":"string" }, "etag":{ "description":"Synapse employs an Optimistic Concurrency Control (OCC) scheme to handle concurrent updates. Since the E-Tag changes every time an entity is updated it is used to detect when a client's current representation of an entity is out-of-date.", "transient":true, "type":"string" }, "versionLabel":{ "title":"Version", "description":"The version label for this entity", "type":"string" }, "modifiedBy":{ "title":"Modified By", "description":"The user that last modified this entity.", "transient":true, "type":"string" }, "contentType":{ "description":"The type of file of this location", "type":"string" }, "disease":{ "title":"Disease", "description":"The disease associated with this Study", "type":"string" }, "entityType":{ "description":"The full class name of this entiy.", "transient":true, "type":"string" }, "id":{ "description":"The unique immutable ID for this entity. A new ID will be generated for new Entities. 
Once issued, this ID is guaranteed to never change or be re-issued", "transient":true, "type":"string" }, "parentId":{ "description":"The ID of the parent of this entity", "type":"string" }, "versionComment":{ "title":"Version Comment", "description":"The version comment for this entity", "type":"string" }, "locations":{ "items":{ "id":"org.sagebionetworks.repo.model.LocationData", "description":"JSON schema for Location Data POJO", "name":"LocationData", "properties":{ "path":{ "description":"The path of this location", "type":"string" }, "type":{ "id":"org.sagebionetworks.repo.model.LocationTypeNames", "description":"The type of this location", "name":"LocationTypeNames", "enum":[ "awss3", "awsebs", "sage", "external", "github" ], "type":"string" } }, "type":"object" }, "description":"The list of location data.", "contentEncoding":"binary", "uniqueItems":false, "type":"array" }, "tissueType":{ "title":"Tissue Type", "description":"Type of tissue for the samples in this Data. Tissue is described by the Synapse ontology concept: http://synapse.sagebase.org/ontology#11171", "links":[ { "rel":"describedby", "href":"http://synapse.sagebase.org/ontology#11171" } ], "type":"string" }, "description":{ "title":"Description", "description":"The description of this entity.", "type":"string" }, "name":{ "title":"Name", "description":"The name of this entity", "type":"string" }, "attachments":{ "items":{ "id":"org.sagebionetworks.repo.model.attachment.AttachmentData", "description":"JSON Data about a single attachment.", "name":"AttachmentData", "properties":{ "previewId":{ "description":"This token is used to get a pre-signed URL that can be used to download this attachment's preview.", "type":"string" }, "tokenId":{ "description":"This token is used to get a pre-signed URL that can be used to download this attachment.", "type":"string" }, "name":{ "description":"The name of this attachment.", "type":"string" }, "md5":{ "type":"string" }, "contentType":{ "type":"string" }, 
"url":{ "description":"When provided, the URL can be used to directly download this attachment.", "type":"string" }, "previewState":{ "id":"org.sagebionetworks.repo.model.attachment.PreviewState", "description":"The state of the preview for this attachment", "name":"PreviewState", "enum":[ "FAILED", "PREVIEW_EXISTS", "NOT_COMPATIBLE" ], "type":"string" } }, "type":"object" }, "description":"The list of attachment data.", "contentEncoding":"binary", "uniqueItems":false, "type":"array" }, "platform":{ "title":"Platform", "description":"Chip platform for the samples in this Study. Platform is described by the Synapse ontology concept: http://synapse.sagebase.org/ontology#12591", "links":[ { "rel":"describedby", "href":"http://synapse.sagebase.org/ontology#12591" } ], "type":"string" }, "s3Token":{ "description":"The URL to an S3 token for this entity. This URL is provided by Synapse.", "type":"string" }, "species":{ "title":"Species", "description":"The species associated with this Study", "type":"string" }, "versionUrl":{ "description":"The full URL of this exect version. This URL is provided by Synapse.", "transient":true, "type":"string" }, "uri":{ "description":"The Uniform Resource Identifier (URI) for this entity.", "transient":true, "type":"string" }, "createdOn":{ "title":"Created On", "description":"The date this entity was created.", "transient":true, "format":"date-time", "type":"string" }, "modifiedOn":{ "title":"Modified On", "description":"The date this entity was last modified.", "transient":true, "format":"date-time", "type":"string" }, "versions":{ "description":"The URL to get all versions of this entity. 
This URL is provided by Synapse.", "transient":true, "type":"string" }, "createdBy":{ "title":"Created By", "description":"The user that created this entity.", "transient":true, "type":"string" }, "md5":{ "description":"The checksum of this location", "type":"string" }, "numSamples":{ "title":"Number of samples", "description":"Approximate number of samples in this Study", "type":"integer" }, "annotations":{ "description":"The URI to get to this entity's annotations", "transient":true, "type":"string" }, "versionNumber":{ "title":"Version Number", "description":"The version number issued to this version on the entity.", "transient":true, "type":"integer" } }, "type":"object" }
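To make the idea of a collapsed hierarchy concrete, the following Python sketch merges the properties of a schema with the properties of every schema it lists under "implements". It is only an illustration of the concept: how Synapse computes the effective schema internally is not described here, the function name `effective_properties` is made up, and the Entity schema in the example is an abridged stand-in rather than the real definition.

```python
def effective_properties(resource_id, schemas):
    """Merge a schema's properties with those of everything it implements."""
    schema = schemas[resource_id]
    merged = {}
    # Collect inherited properties first so the type's own definitions win.
    for parent in schema.get("implements", []):
        merged.update(effective_properties(parent["$ref"], schemas))
    merged.update(schema.get("properties", {}))
    return merged


SCHEMAS = {
    # Abridged stand-in for the base Entity schema (assumption: the real
    # schema defines more properties than these three).
    "org.sagebionetworks.repo.model.Entity": {
        "properties": {
            "name": {"type": "string"},
            "id": {"type": "string"},
            "parentId": {"type": "string"},
        }
    },
    # The Link schema from the earlier example, with descriptions omitted.
    "org.sagebionetworks.repo.model.Link": {
        "implements": [{"$ref": "org.sagebionetworks.repo.model.Entity"}],
        "properties": {
            "linksTo": {
                "type": "object",
                "$ref": "org.sagebionetworks.repo.model.Reference",
            },
            "linksToClassName": {"type": "string"},
        },
    },
}

props = effective_properties("org.sagebionetworks.repo.model.Link", SCHEMAS)
print(sorted(props))
# ['id', 'linksTo', 'linksToClassName', 'name', 'parentId']
```

The result contains both the inherited and the Link-specific properties in one flat dictionary, which is the shape a client needs when it only wants to know which fields an Entity can carry.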
Get All REST Resources
Every REST Resource supported by the Synapse REST API has a JSON schema and a Resource ID. The full list of REST Resource IDs can be fetched with the following command:
Request
curl -i -H Accept:application/json 'https://repo-prod.sagebase.org/repo/v1/REST/resources'
Response
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: application/json
Transfer-Encoding: chunked
Date: Fri, 01 Jun 2012 23:42:41 GMT

{
  "list":[
    "org.sagebionetworks.repo.model.EntityHeader",
    "org.sagebionetworks.repo.model.attachment.AttachmentData",
    "org.sagebionetworks.repo.model.Study",
    "org.sagebionetworks.repo.model.ExpressionData",
    "org.sagebionetworks.repo.model.Analysis",
    "org.sagebionetworks.repo.model.daemon.BackupSubmission",
    "org.sagebionetworks.repo.model.Data",
    "org.sagebionetworks.repo.model.Link",
    "org.sagebionetworks.repo.model.S3Token",
    "org.sagebionetworks.repo.model.attachment.S3AttachmentToken",
    "org.sagebionetworks.repo.model.ontology.ConceptRequest",
    "org.sagebionetworks.repo.model.EnvironmentDescriptor",
    "org.sagebionetworks.repo.model.ResourceAccess",
    "org.sagebionetworks.repo.model.AccessControlList",
    "org.sagebionetworks.repo.model.Preview",
    "org.sagebionetworks.repo.model.DatasetTrackingData",
    "org.sagebionetworks.repo.model.AcquisitionTrackingData",
    "org.sagebionetworks.repo.model.attachment.PresignedUrl",
    "org.sagebionetworks.repo.model.search.Facet",
    "org.sagebionetworks.repo.model.registry.MigrationSpec",
    "org.sagebionetworks.repo.model.registry.FieldDescription",
    "org.sagebionetworks.repo.model.Media",
    "org.sagebionetworks.repo.model.ontology.Concept",
    "org.sagebionetworks.repo.model.auth.UserEntityPermissions",
    "org.sagebionetworks.repo.model.search.query.FacetSort",
    "org.sagebionetworks.repo.model.Project",
    "org.sagebionetworks.repo.model.Folder",
    "org.sagebionetworks.repo.model.search.query.KeyValue",
    "org.sagebionetworks.repo.model.search.query.KeyList",
    "org.sagebionetworks.repo.model.search.DocumentFields",
    "org.sagebionetworks.repo.model.PhenotypeData",
    "org.sagebionetworks.repo.model.RObject",
    "org.sagebionetworks.repo.model.ontology.ConceptSummary",
    "org.sagebionetworks.repo.model.status.StackStatus",
    "org.sagebionetworks.repo.model.CurationTrackingData",
    "org.sagebionetworks.repo.model.search.FacetConstraint",
    "org.sagebionetworks.repo.model.registry.EntityMigration",
    "org.sagebionetworks.repo.model.Reference",
    "org.sagebionetworks.repo.model.Row",
    "org.sagebionetworks.repo.model.LocationData",
    "org.sagebionetworks.repo.model.GenotypeData",
    "org.sagebionetworks.repo.model.StatusHistoryRecord",
    "org.sagebionetworks.repo.model.registry.FieldMigrationSpec",
    "org.sagebionetworks.repo.model.Code",
    "org.sagebionetworks.repo.model.search.SearchResults",
    "org.sagebionetworks.repo.model.ExampleEntity",
    "org.sagebionetworks.repo.model.registry.RenameData",
    "org.sagebionetworks.repo.model.ontology.ConceptResponsePage",
    "org.sagebionetworks.repo.model.search.Document",
    "org.sagebionetworks.repo.model.ontology.SummaryRequest",
    "org.sagebionetworks.repo.model.daemon.RestoreSubmission",
    "org.sagebionetworks.repo.model.UserProfile",
    "org.sagebionetworks.repo.model.Step",
    "org.sagebionetworks.repo.model.RestResourceList",
    "org.sagebionetworks.repo.model.UserGroup",
    "org.sagebionetworks.repo.model.EntityPath",
    "org.sagebionetworks.repo.model.registry.EntityRegistry",
    "org.sagebionetworks.repo.model.daemon.BackupRestoreStatus",
    "org.sagebionetworks.repo.model.registry.EntityTypeMetadata",
    "org.sagebionetworks.repo.model.search.query.SearchQuery",
    "org.sagebionetworks.repo.model.search.Hit",
    "org.sagebionetworks.repo.model.attachment.UploadResult",
    "org.sagebionetworks.repo.model.registry.EntityTypeMigrationSpec",
    "org.sagebionetworks.repo.model.search.query.FacetTopN",
    "org.sagebionetworks.repo.model.search.DocumentBatch"
  ]
}
As with Entities, the full schema and the effective schema of any REST Resource can be fetched using its Resource ID from this list.
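The two schema requests shown earlier differ only in one path segment, so a client can build either URL from a Resource ID. The Python sketch below does that, assuming the base URL and paths are exactly those used in the curl examples on this page. The function name `schema_url` is made up for illustration, and the sketch only builds the URL; it does not send a request.

```python
from urllib.parse import urlencode

# Base URL taken from the curl examples on this page.
BASE_URL = "https://repo-prod.sagebase.org/repo/v1"


def schema_url(resource_id, effective=False):
    """Build the full or effective schema URL for a REST Resource ID."""
    path = "effectiveSchema" if effective else "schema"
    # urlencode escapes the value so the ID is safe to place in a query string.
    query = urlencode({"resourceId": resource_id})
    return f"{BASE_URL}/REST/resources/{path}?{query}"


print(schema_url("org.sagebionetworks.repo.model.Reference"))
print(schema_url("org.sagebionetworks.repo.model.Reference", effective=True))
```

Passing the Resource ID through `urlencode` avoids the need to quote it by hand inside the URL.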
Entity Type Registration
Register.json
Each Entity type is registered with Synapse through the Register.json file. Synapse will use this register to discover all of the Entity types. The register also provides Synapse with some basic meta-data about each Entity type:
Register field | Description |
---|---|
validParentTypes | Lists the Entity types that are considered valid parents of a given Entity type. |
aliases | The list of extra valid values that can be used to query for an Entity. For example, 'dataset' is an alias for 'study'. This means a Synapse query to find all studies can be written as 'select * from dataset' or 'select * from study'. By default, any type that an Entity extends is automatically used as an alias. For example, since all Entity types extend 'Entity', they all have an implied alias of 'entity'. So a query of 'select * from entity' will return all Entities in Synapse regardless of type. |
name | The basic name of this entity. This name must be unique within Synapse. |
entityType | When creating a new Entity instance through the REST API, the caller must provide the 'entityType' field to tell Synapse the Entity type they wish to create. The 'entityType' field value must always match the full class name of the entity. |
defaultParentPath | This field tells Synapse where to place a new instance of this Entity type when the caller does not specify a 'parentId'. All Entity instances will have a parent in Synapse (with the only exception being the 'root' entity). |
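The register fields above can be read as simple lookups. The Python sketch below works on an abridged copy of four entries from the registry response shown earlier and answers two questions: whether one type is a valid parent of another, and which Entity types a query name refers to. The helper names are hypothetical, the special "DEFAULT" value is compared as a plain string because its meaning is not explained on this page, and matching a query name against both 'name' and 'aliases' is an assumption about how Synapse resolves names.

```python
PREFIX = "org.sagebionetworks.repo.model."

# Abridged copy of four entries from the registry response shown earlier.
REGISTRY = {
    "entityTypes": [
        {"validParentTypes": [PREFIX + "Project"], "name": "dataset",
         "entityType": PREFIX + "Study", "aliases": ["dataset", "study"]},
        {"validParentTypes": [PREFIX + "Study", PREFIX + "Project"],
         "name": "layer", "entityType": PREFIX + "Data",
         "aliases": ["layer", "data"]},
        {"validParentTypes": [PREFIX + "Project", PREFIX + "Study"],
         "name": "phenotypedata", "entityType": PREFIX + "PhenotypeData",
         "aliases": ["phenotypedata", "layer"]},
        {"validParentTypes": [PREFIX + "Folder", PREFIX + "Project", "DEFAULT"],
         "name": "project", "entityType": PREFIX + "Project"},
    ]
}


def is_valid_parent(registry, child_type, parent_type):
    """Return True if `parent_type` is a registered parent of `child_type`."""
    for entry in registry["entityTypes"]:
        if entry["entityType"] == child_type:
            return parent_type in entry["validParentTypes"]
    raise KeyError(f"{child_type} is not in the registry")


def types_for_alias(registry, alias):
    """Return the Entity types that a query name such as 'study' refers to."""
    matches = []
    for entry in registry["entityTypes"]:
        names = {entry["name"], *entry.get("aliases", [])}
        # Every type extends Entity, so 'entity' is an implied alias of all.
        if alias == "entity" or alias in names:
            matches.append(entry["entityType"])
    return matches


print(is_valid_parent(REGISTRY, PREFIX + "Data", PREFIX + "Study"))  # True
print(is_valid_parent(REGISTRY, PREFIX + "Study", PREFIX + "Data"))  # False
print(types_for_alias(REGISTRY, "layer"))
```

With this excerpt, 'dataset' and 'study' both resolve to the Study type, while 'layer' resolves to both Data and PhenotypeData because each lists it as an alias.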
Creating and Updating Entity Types
Additions or changes to Entity types in Synapse are made at compile time only. This means it is not possible to add or update an Entity type on the live Synapse services. Rather, all additions and changes are deployed with releases of the Synapse REST services. While it is possible that this restriction will be lifted in the future, for now it ensures that Entity data within Synapse always matches the current Entity type schema. For example, when an existing field of an Entity type is renamed, that field is renamed for every existing instance of that Entity type in Synapse as part of the normal Synapse data migration process. This ensures that all data within Synapse conforms to the current Entity type schema and that there is only one version of each Entity type schema: the current version.
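The effect of such a rename on stored data can be pictured with a short Python sketch. It is not the Synapse migration code, which is not shown on this page. The function `rename_field`, the sample instances, and the rename of 'numSamples' to 'sampleCount' are all hypothetical and exist only to illustrate that every instance of the affected type ends up with the new field name.

```python
def rename_field(instances, entity_type, old_name, new_name):
    """Return copies of `instances` with one field renamed for one type."""
    migrated = []
    for instance in instances:
        updated = dict(instance)  # copy, so the input list is left untouched
        if updated.get("entityType") == entity_type and old_name in updated:
            updated[new_name] = updated.pop(old_name)
        migrated.append(updated)
    return migrated


STUDY = "org.sagebionetworks.repo.model.Study"
DATA = "org.sagebionetworks.repo.model.Data"

# Hypothetical stored instances; IDs and values are invented.
instances = [
    {"id": "syn2", "entityType": STUDY, "numSamples": 12},
    {"id": "syn3", "entityType": DATA, "name": "My Data"},
]

# Hypothetical rename of a Study field; only Study instances change.
for row in rename_field(instances, STUDY, "numSamples", "sampleCount"):
    print(row)
```

After the migration no instance of the type carries the old field name, which is what allows a single current version of each schema to describe all stored data.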