JSON Schemas
In Synapse, you can streamline the annotation process and ensure that your metadata meets certain requirements using JSON Schemas. You can define a JSON Schema to require certain fields, restrict annotations to specific values, and even apply conditional logic to validate metadata.
JSON Schemas primarily supplement annotations; you should understand annotations in Synapse before using JSON Schemas. This document also assumes you are comfortable with using the Synapse Python Client.
JSON Schemas are an experimental feature in Synapse. Functionality in the web UI and programmatic clients is currently limited, but we have plans to improve support for managing organizations, schemas, and annotations in the near future.
JSON Schemas and Annotations
JSON Schema is a tool used to validate data. In Synapse, JSON Schemas can be used to validate the metadata applied to a project, file, folder, table, or view, including the Annotations applied to it. To learn more about JSON Schemas, check out JSON-Schema.org.
Synapse supports a subset of features from json-schema-draft-07. To see the list of features currently supported, see the JsonSchema object definition from our REST API Documentation.
When a JSON Schema is bound to an object in Synapse, a couple of things happen:
When the metadata or schema changes, the metadata is automatically validated against the applied JSON schema.
(Experimental Mode only) In the web UI, a custom form is shown when editing Annotations to help write Annotations that match the bound schema.
Organizations
JSON Schemas are managed by Organizations. At this time, Organizations must be created via a programmatic client or REST API call.
Organizations are different from teams, which can be used for collaboration, communication, and data sharing.
Create an Organization
To create an Organization, all you need is a name, which must meet certain requirements.
In Python, after logging in, you can create an organization. Note that you’ll have to change the organization name to something unique.
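import json
import synapseclient
syn = synapseclient.login()  # Assumes your credentials are already configured
organizationName = "SynapseDocs"  # Replace with your own unique name
# Serialize the request body with json.dumps instead of hand-escaping quotes
organizationRequestBody = json.dumps({"organizationName": organizationName})
organization = syn.restPOST("/schema/organization", organizationRequestBody)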
Create a JSON Schema
Once you’ve created an Organization, you can create a JSON Schema. We’ll create a simple schema that specifies an annotation called “color”. Note that you will have to modify the organization name in the schema $id to successfully create your own schema.
All JSON Schemas published to Synapse are publicly viewable by anyone on the internet, so make sure your schemas don’t include sensitive information.
schemaRequestBody = """
{
"schema": {
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "https://repo-prod.prod.sagebase.org/repo/v1/schema/type/registered/SynapseDocs-Color",
"properties": {
"color": {
"type": "string",
"title": "Color",
"description": "The color of the object",
"enum": [
"Red",
"Green",
"Blue",
"Yellow",
"Orange",
"Purple",
"Brown",
"Black",
"White"
]
}
},
"required": [
"color"
]
},
"dryRun": false
}
"""
import time
# Issue a request to create the schema
schemaJobResponse = syn.restPOST("/schema/type/create/async/start", schemaRequestBody)
# Check on the job until it completes.
asyncJobStatus = syn.restGET(f"/asynchronous/job/{schemaJobResponse['token']}")
while asyncJobStatus["jobState"] == "PROCESSING":
    time.sleep(1)
    asyncJobStatus = syn.restGET(f"/asynchronous/job/{schemaJobResponse['token']}")
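The loop above stops as soon as the job leaves the PROCESSING state, but it does not distinguish success from failure and it never gives up. The following is a minimal sketch of a polling helper with a timeout. The name `wait_for_job` is hypothetical, and the `COMPLETE` and `FAILED` job states and the `errorMessage` field are assumptions about the job status response that you should confirm against the REST API documentation.

```python
import time


def wait_for_job(syn, token, poll_interval=1.0, timeout=60.0):
    """Poll an asynchronous job until it leaves PROCESSING or the timeout expires.

    `syn` is any object with a restGET(path) method, such as a logged-in
    Synapse client. `wait_for_job` is a hypothetical helper, not part of
    the client library.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = syn.restGET(f"/asynchronous/job/{token}")
        if status["jobState"] != "PROCESSING":
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {token} still processing after {timeout}s")
        time.sleep(poll_interval)
    # Assumption: a failed job reports jobState == "FAILED" with an errorMessage.
    if status["jobState"] == "FAILED":
        raise RuntimeError(status.get("errorMessage", "Schema job failed"))
    return status
```

With a logged-in client, this could replace the loop above: `asyncJobStatus = wait_for_job(syn, schemaJobResponse["token"])`.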
Schema Versioning
You can create new versions of the schema by issuing a new request to the same endpoint, POST /schema/type/create/async/start.
When you bind a JSON schema to an object, you can choose to bind a particular version of the schema to prevent updates to the schema from applying to the object.
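As an illustration, the sketch below builds a create request for a specific schema version. It assumes that a version is expressed by appending a semantic version to the schema name in the `$id` (for example `SynapseDocs-Color-1.0.0`); that naming convention, the helper name `versioned_schema_request`, and the shortened color list are assumptions for this example, so check the REST API documentation before relying on them.

```python
import json

REGISTERED_PREFIX = "https://repo-prod.prod.sagebase.org/repo/v1/schema/type/registered/"


def versioned_schema_request(organization, name, version, properties, required):
    """Build a create-schema request body whose $id carries a semantic version."""
    # Assumption: the version is appended to the $id as "-<major>.<minor>.<patch>".
    schema_id = f"{REGISTERED_PREFIX}{organization}-{name}-{version}"
    body = {
        "schema": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "$id": schema_id,
            "properties": properties,
            "required": required,
        },
        "dryRun": False,
    }
    return json.dumps(body)


request_body = versioned_schema_request(
    "SynapseDocs", "Color", "1.0.0",
    properties={"color": {"type": "string", "enum": ["Red", "Green", "Blue"]}},
    required=["color"],
)
# With a logged-in client, submit it to the same endpoint as any other version:
# syn.restPOST("/schema/type/create/async/start", request_body)
```

Under the same assumption, binding an object to `SynapseDocs-Color-1.0.0` rather than `SynapseDocs-Color` would pin it to that version.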
Bind a JSON Schema to an Object
You can bind a JSON Schema to any project, folder, file, table, or view. When you bind a JSON Schema to a project or folder, then all items inside of the project or folder will inherit the schema binding, unless the item has a schema bound to itself. Only one schema can be bound to an item at a time.
Bound schema inheritance is similar to Sharing Settings inheritance, but is tracked separately.
If you have edit access on a Synapse object, you can bind a schema to the entity in Python:
objectId = 'syn########' # Replace the ID with your own
jsonSchemaObjectBinding = f"""{{ "entityId": "{objectId}", "schema$id": "SynapseDocs-Color"}}"""
syn.restPUT(f"/entity/{objectId}/schema/binding", jsonSchemaObjectBinding)
Even though only one schema can be applied to an item, you can use JSON schema references to create a schema composed of multiple sub-schemas.
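A composed schema might look like the following sketch, which combines sub-schemas with `allOf` and `$ref`. It assumes that Synapse resolves a `$ref` given as the registered `$id` of another schema and that `allOf` is among the supported draft-07 features. `SynapseDocs-Sample` and `SynapseDocs-Size` are hypothetical schemas used only for illustration.

```python
import json


def compose_schema(schema_id, ref_ids):
    """Return a schema that requires an object to satisfy every referenced schema."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "$id": schema_id,
        # Each entry points at another schema by its $id (assumed reference form).
        "allOf": [{"$ref": ref_id} for ref_id in ref_ids],
    }


composed = compose_schema(
    "SynapseDocs-Sample",                       # hypothetical composite schema
    ["SynapseDocs-Color", "SynapseDocs-Size"],  # "SynapseDocs-Size" is hypothetical
)
print(json.dumps(composed, indent=2))
```

The composite schema is then the single schema bound to the item, while each sub-schema can be maintained separately.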
In the jsonSchemaObjectBinding, you may also include the boolean property enableDerivedAnnotations to have Synapse automatically calculate derived annotations based on the schema. See the Derived Annotations section below for more information.
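A minimal sketch of a binding request that turns on derived annotations is shown below. The helper name `binding_request` is hypothetical. The body reuses the `entityId` and `schema$id` fields from the earlier binding example and adds `enableDerivedAnnotations`.

```python
import json


def binding_request(entity_id, schema_id, enable_derived_annotations=False):
    """Build the JSON body for PUT /entity/{id}/schema/binding."""
    return json.dumps({
        "entityId": entity_id,
        "schema$id": schema_id,
        "enableDerivedAnnotations": enable_derived_annotations,
    })


body = binding_request("syn########", "SynapseDocs-Color", enable_derived_annotations=True)
# With a logged-in client (replace the ID with your own):
# syn.restPUT("/entity/syn########/schema/binding", body)
```

Building the body with `json.dumps` avoids the brace and quote escaping that an f-string requires.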
Annotate an Object with a Schema
To use the schema-driven annotation form in the Synapse web UI, first ensure that Experimental Mode is toggled on at the bottom of the page. Enabling Experimental Mode may cause issues with other Synapse features that you use in your workflow.
Once you have enabled Experimental Mode, navigate to the file or folder for which you’ve bound a schema. As you edit the annotations on the file, you will see a form that corresponds to the schema that you have bound.
In Experimental Mode, you’ll also be able to see if a file’s metadata or annotations are invalid because of missing or invalid data.
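Validation results can presumably also be read programmatically. The sketch below assumes a `GET /entity/{id}/schema/validation` service whose response contains `isValid` and `allValidationMessages` fields. The endpoint path and both field names are assumptions to verify against the REST API documentation, and `summarize_validation` is a hypothetical helper.

```python
def summarize_validation(syn, entity_id):
    """Return (is_valid, messages) for an entity that has a bound schema.

    `syn` is any object with a restGET(path) method, such as a logged-in
    Synapse client.
    """
    # Assumed endpoint and field names; confirm against the REST API docs.
    results = syn.restGET(f"/entity/{entity_id}/schema/validation")
    is_valid = bool(results.get("isValid", False))
    messages = results.get("allValidationMessages", [])
    return is_valid, messages
```

For example, `summarize_validation(syn, "syn########")` would return the validity flag together with any messages explaining why the annotations failed validation.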
Derived Annotations
JSON Schemas can also be used to prescribe default annotation values. The annotations can be static, or based on conditional properties.
Defining Derived Annotations using a JSON Schema
Derived annotations can be enabled for a set of objects in Synapse when the schema is bound to the object. The JsonSchemaObjectBinding request object should contain the enableDerivedAnnotations property with a value of true.
The contents of the bound JSON Schema will be used to determine the derived annotations. Derived annotations are denoted using the JSON Schema keywords default and const. For example, consider the following JSON Schema:
{
"type": "object",
"properties": {
"derivedFromConst": {
"type": "string",
"const": "Derived Constant Value"
},
"derivedFromDefault": {
"type": "string",
"default": "Derived Default Value"
}
}
}
Any object that has this schema will have the following derived annotations:
Annotation Key | Value |
---|---|
derivedFromConst | Derived Constant Value |
derivedFromDefault | Derived Default Value |
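To make the rule concrete, here is a small sketch that collects `const` and `default` values from a schema's top-level properties. It only illustrates the idea. It is not Synapse's implementation, it ignores any annotations already set on the object, and `derive_static_annotations` is a hypothetical name.

```python
def derive_static_annotations(schema):
    """Collect property values fixed by "const" or supplied by "default"."""
    derived = {}
    for key, subschema in schema.get("properties", {}).items():
        if "const" in subschema:
            derived[key] = subschema["const"]
        elif "default" in subschema:
            derived[key] = subschema["default"]
    return derived


schema = {
    "type": "object",
    "properties": {
        "derivedFromConst": {"type": "string", "const": "Derived Constant Value"},
        "derivedFromDefault": {"type": "string", "default": "Derived Default Value"},
    },
}
derived = derive_static_annotations(schema)
print(derived)
```

The resulting dictionary contains the same two key and value pairs as the table above.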
Conditionally Derived Annotations
JSON Schemas can also be written to conditionally apply annotations. For example, consider the following JSON Schema:
{
  "type": "object",
  "properties": {
    "country": {
      "type": "string",
      "enum": ["United States", "Canada"]
    }
  },
  "if": {
    "properties": {
      "country": {
        "const": "United States"
      }
    },
    "required": ["country"]
  },
  "then": {
    "properties": {
      "measurementSystem": {
        "const": "Imperial"
      }
    }
  },
  "else": {
    "properties": {
      "measurementSystem": {
        "const": "Metric"
      }
    }
  }
}
On an object where this schema is bound, the derived annotation value for measurementSystem depends on the actual annotation value of country: Imperial when country is United States, and Metric when country is Canada.
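The branch selection can be sketched in a few lines of Python. This simplified evaluator handles only `required` and per-property `const` checks inside `if`, with `if`, `then`, and `else` as top-level keywords of the schema. It illustrates the JSON Schema semantics and is not Synapse's implementation. Both function names are hypothetical.

```python
def matches_const_condition(condition, annotations):
    """Evaluate an "if" clause limited to "required" and per-property "const"."""
    for key in condition.get("required", []):
        if key not in annotations:
            return False
    for key, subschema in condition.get("properties", {}).items():
        if key in annotations and "const" in subschema:
            if annotations[key] != subschema["const"]:
                return False
    return True


def derive_conditional_annotations(schema, annotations):
    """Return const values from the "then" or "else" branch selected by "if"."""
    if "if" not in schema:
        return {}
    branch = "then" if matches_const_condition(schema["if"], annotations) else "else"
    properties = schema.get(branch, {}).get("properties", {})
    return {key: sub["const"] for key, sub in properties.items() if "const" in sub}


schema = {
    "type": "object",
    "properties": {"country": {"type": "string", "enum": ["United States", "Canada"]}},
    "if": {"properties": {"country": {"const": "United States"}}, "required": ["country"]},
    "then": {"properties": {"measurementSystem": {"const": "Imperial"}}},
    "else": {"properties": {"measurementSystem": {"const": "Metric"}}},
}
us = derive_conditional_annotations(schema, {"country": "United States"})
ca = derive_conditional_annotations(schema, {"country": "Canada"})
print(us, ca)
```

Because the `if` clause lists country as required, an object without a country annotation fails the condition and falls through to the `else` branch in this sketch.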
Viewing Derived Annotations
Derived annotations will be returned when fetching annotations using the GET /entity/{id}/annotations2 API by setting the includeDerived parameter to true. Derived annotations will be included with other entity metadata when using the GET /entity/{id}/json service by setting the includeDerivedAnnotations parameter to true.
A table or view can be updated to show derived annotations using the POST /column/view/scope/async/start API by passing includeDerivedAnnotations in the request body with a value of true.
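The two GET services can be called from Python roughly as follows. This sketch assumes that the parameters are passed as URL query parameters. The helper names are hypothetical, and `syn` is any object with a `restGET(path)` method, such as a logged-in Synapse client.

```python
def get_annotations_with_derived(syn, entity_id):
    """Fetch annotations, including derived ones, from the annotations2 service."""
    # Assumption: includeDerived is passed as a query parameter.
    return syn.restGET(f"/entity/{entity_id}/annotations2?includeDerived=true")


def get_entity_json_with_derived(syn, entity_id):
    """Fetch entity metadata as JSON, including derived annotations."""
    # Assumption: includeDerivedAnnotations is passed as a query parameter.
    return syn.restGET(f"/entity/{entity_id}/json?includeDerivedAnnotations=true")
```

For example, `get_annotations_with_derived(syn, "syn########")` would return the object's annotations together with any values derived from the bound schema.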
Limitations of Derived Annotations
Conditionally derived annotations cannot be derived from other conditional annotations
Derived annotations will only be generated when the annotation data is valid against the bound JSON Schema