...

In summary, the PetPhoto.json schema encapsulates everything we have defined thus far and even outlines which sub-schema should be used under various conditions.

...

Alternative: Conditional Logic

One alternative to the base types (Pet.json) and extension types (Cat.json & Dog.json) is the use of JSON schema conditional logic: if/then/else. The following example is semantically equivalent to the combination of Pet.json, Cat.json, Dog.json, and PetPhoto.json, all in a single JSON schema:

Code Block
{
	"$schema": "http://json-schema.org/draft-07/schema",
	"$id": "my.organization-pets.ConditionalAlternative",
	"description": "Conditional Alternative to base types with extensions.",
	"properties": {
		"petName": {
			"type": "string",
			"description": "The name of the pet shown in the photo."
		},
		"birthday": {
			"description": "The birthday of the pet shown in the photo.",
			"type": "string",
			"format": "date-time"
		},
		"petType": {
			"enum": [
				"cat",
				"dog",
				"fish"
			]
		}
	},
	"allOf": [
		{
			"$ref": "org.sagebionetworks-repo.model.FileEntity-1.0.0"
		},
		{
			"if": {
				"properties": {
					"petType": {
						"const": "cat"
					}
				}
			},
			"then": {
				"properties": {
					"breed": {
						"description": "Enumeration of possible cat breeds.",
						"type": "string",
						"enum": [
							"Siamese",
							"Persian",
							"Maine Coon",
							"Ragdoll",
							"American Shorthair"
						]
					}
				}
			}
		},
		{
			"if": {
				"properties": {
					"petType": {
						"const": "dog"
					}
				}
			},
			"then": {
				"properties": {
					"breed": {
						"description": "Enumeration of possible dog breeds.",
						"type": "string",
						"enum": [
							"Labrador Retriever",
							"German Shepherd",
							"Golden Retriever",
							"Bulldog",
							"Beagle"
						]
					}
				}
			}
		}
	]
}

ConditionalAlternative.json
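
To make the branching concrete, the following is a minimal Python sketch of how the if/then branches above select a breed enumeration. It is a hand-rolled illustration that only understands the `const` and `enum` keywords used in ConditionalAlternative.json. The function names `if_matches` and `breed_errors` are hypothetical, and a full JSON schema validator would be used in practice.

```python
# Copy of the two conditional branches from ConditionalAlternative.json.
CONDITIONAL_BRANCHES = [
    {
        "if": {"properties": {"petType": {"const": "cat"}}},
        "then": {"properties": {"breed": {"enum": [
            "Siamese", "Persian", "Maine Coon", "Ragdoll", "American Shorthair",
        ]}}},
    },
    {
        "if": {"properties": {"petType": {"const": "dog"}}},
        "then": {"properties": {"breed": {"enum": [
            "Labrador Retriever", "German Shepherd", "Golden Retriever",
            "Bulldog", "Beagle",
        ]}}},
    },
]


def if_matches(instance, if_schema):
    """Return True when the instance satisfies the 'if' sub-schema.

    Only 'const' is handled. As in JSON schema, a property that is absent
    from the instance does not fail a 'properties' check.
    """
    for name, rule in if_schema.get("properties", {}).items():
        if name in instance and "const" in rule and instance[name] != rule["const"]:
            return False
    return True


def breed_errors(instance, branches=CONDITIONAL_BRANCHES):
    """Collect one message for each 'then' enum that the instance violates."""
    errors = []
    for branch in branches:
        if not if_matches(instance, branch["if"]):
            continue  # this branch does not apply to the instance
        for name, rule in branch["then"].get("properties", {}).items():
            if name in instance and "enum" in rule and instance[name] not in rule["enum"]:
                errors.append(f"{name}={instance[name]!r} is not allowed for this branch")
    return errors
```

Under these assumptions, an instance with petType "cat" and breed "Siamese" produces no errors, while the same breed with petType "dog" produces one.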

Schema Versions

A new JSON schema is created by calling: POST /schema/type/create/async/start. The same call is also used to make changes to an existing JSON schema. However, the exact impact of a JSON schema change depends on the inclusion or exclusion of the optional semantic version suffix in the schema $id.

When a semantic version is included in a schema $id, its value serves as a human-readable reference to a specific schema change. The semantic version must follow the rules of Semantic Versioning, with a major, minor, and patch number. If a semantic version is included in a JSON schema $id, its value must be unique within the schema $id space.
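
As an illustration of the optional version suffix, the sketch below splits a schema $id into its parts. It assumes the layout `<organization>-<schema name>[-<major>.<minor>.<patch>]`, inferred from the $id values shown in this document. The helper name `split_schema_id` is hypothetical, and pre-release or build suffixes are not handled.

```python
import re

# Assumed layout: <organization>-<schema name>[-<major>.<minor>.<patch>]
SEMANTIC_VERSION = re.compile(r"^\d+\.\d+\.\d+$")


def split_schema_id(schema_id):
    """Split a schema $id into (organization, schema_name, version or None)."""
    parts = schema_id.split("-")
    version = None
    # Treat the last segment as a version only if it looks like one.
    if len(parts) > 2 and SEMANTIC_VERSION.match(parts[-1]):
        version = parts.pop()
    organization = parts[0]
    schema_name = "-".join(parts[1:])
    return organization, schema_name, version
```

For example, `split_schema_id("org.sagebionetworks-repo.model.FileEntity-1.0.0")` separates the version "1.0.0" from the rest of the $id, while an $id such as "my.organization-pets.ConditionalAlternative" yields `None` for the version.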

...

  • Project designers will likely put a lot of details into their JSON schemas. Many of the same details will be required when defining the schemas of views in their project. Therefore, the project designer will need a way to transfer all of the relevant details from their JSON schemas to their view schemas. This includes both the creation of new views and updating relevant view schemas with JSON schema changes. This needs to work for both versioned and non-versioned schemas.

  • If a project designer creates a view of a specific type of object, such as Cats, then they expect all other types of objects, such as Dogs, to be excluded from the view. This assumes that the object type is defined by a JSON schema.

  • Project Designers might choose to use if/then/else conditions in their JSON schemas instead of defining and extending classes/types. For such cases, each branch of the conditional logic might have its own set of unique columns. See: ConditionalAlternative.json above. Just like the previous use cases, project designers need to create a view that only includes rows for a particular branch, such as all cats, with dogs excluded.

  • It is currently possible to “break” a view by adding/updating an annotation on an object in a view that does not conform to the view’s schema. For example, if a view schema has a column named “foo” of type string with a max length of 50 characters, adding an annotation to an object in the view with a 51-character value for “foo” will currently break the view. It would be nice if the JSON schemas could somehow prevent this type of breakage.

  • A project designer wants to use a view to find all objects that do not conform to their bound JSON schema, both to identify and repair the issues.

  • Data consumers can potentially find JSON schemas associated with Files either in views or in any other location where a file might be viewed (such as the file’s Entity page). The data consumer would like to discover similar views or files that are also associated with the same JSON schema.

  • A consumer of view data would like more information about a column of a view, such as its description and its origin. This information should help them better understand what they are looking at. For example, a view with a “Cat Breed” column should provide the consumer with information about the possible cat breeds, the description of the column, and maybe a link to the JSON schema where it was defined. The consumer might also want to discover other views that also use “Cat Breed”.

  • It is possible to create a snapshot of any view, including views created with JSON schemas. If a JSON schema without a semantic version is used to create a view, then a snapshot of that view should be immutable even if the JSON schema changes in the future.

  • When a base JSON schema, such as Pet.json, is used to create a view, which columns should be included? Should only the columns defined in the base JSON schema be used, or should all columns from everything that extends the base class be included?

Goal

The goal for phase three is to provide services for defining Synapse Views using JSON schemas. We also want to make some of the schema information available for both display and filtering in Views. This could include information about the schema bound to the Entity, the type of the file, and the validation state of the Entity. For example, is the Entity valid according to its bound schema? Users will likely also want to filter by these additional fields.

...

The following are two example views of the “All Pets” folder, one using Cat.json as a driver and the other using Dog.json as a driver:

...

  • Each property of a JSON schema is a candidate for a column in a View. This includes both direct properties of a schema and all of its inherited properties. For example, ‘petName’, ‘birthday’, and ‘petType’ are all direct properties of the Pet.json schema, while ‘name’, ‘id’, ‘createdOn’, ‘createdBy’, ‘modifiedOn’, ‘modifiedBy’, ‘etag’, etcetera, are properties that are inherited from FileEntity.

  • It is likely that view designers will want to limit the properties included from a JSON schema to a sub-set. They might not want all of the properties from a JSON schema included in the view.

  • It is likely that view designers will want to control the order of the columns in their view. Note: According to the JSON Specification, key-value pairs, such as the properties of a JSON schema, are “unordered”. While Synapse will attempt to maintain the provided order of the properties, we cannot expect that 3rd party libraries and clients will do the same. Therefore, we should consider the properties of a JSON schema to be unordered.

  • It is likely that view designers will want to include additional columns in their views that are not defined in the schema. The example view that includes schema validation information is an example of this. One option is to require a view designer to create a schema that includes all properties that they would want to include in the view, similar to schemas used to bind Entities.

There are at least two approaches available to defining a View’s schema using a JSON schema:

  • Direct - The schema ID and a list of property names are used to define a View. The schema would provide the full set of possible columns (from properties). The list of names would both control the order and define the sub-set of columns that should be used in the view.

  • Indirect - For the indirect case, we provide an API that, given a JSON schema ID, will generate a list of ColumnModels. The user then selects and orders the sub-set of columns they want to use and creates a regular View with the results.

  • Both - We might want to consider doing both.
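
The direct approach can be sketched as a selection over the schema's properties, where the caller's list of names supplies both the sub-set and the order. This is an illustration only: `select_columns` is a hypothetical helper, and it looks at direct properties without resolving inherited ones.

```python
def select_columns(schema, property_names):
    """Return (name, property definition) pairs in the requested order.

    The schema supplies the candidate columns; property_names supplies
    both the sub-set and the column order, since JSON properties are
    treated as unordered.
    """
    properties = schema.get("properties", {})
    unknown = [name for name in property_names if name not in properties]
    if unknown:
        raise KeyError(f"not defined by the schema: {unknown}")
    return [(name, properties[name]) for name in property_names]


# Direct properties taken from the example schema in this document.
EXAMPLE_SCHEMA = {
    "properties": {
        "petName": {"type": "string"},
        "birthday": {"type": "string", "format": "date-time"},
        "petType": {"enum": ["cat", "dog", "fish"]},
    }
}

# The view designer asks for two of the three properties, in their own order.
columns = select_columns(EXAMPLE_SCHEMA, ["petType", "petName"])
```

Because the order comes from the list of names rather than from the schema, the result does not depend on how a client or library happens to order the schema's properties.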

The direct option is not as flexible as the indirect option but would maintain a connection to the JSON schema used to define it. It would be possible to propagate JSON schema changes to driven Views. The direct option requires more back-end work than the indirect option. We would also need to either redesign the existing View editor UI or add a new JSON schema specific View editor to the UI.

The indirect option is more flexible, and it is similar to how Views are built today. For example, we currently have two buttons on the view creation UI, “Add Default View Columns” and “Add all Annotations”, that provide similar behavior. When the user clicks either button, the appropriate columns are added to the view editor. The view designer is then free to remove anything they do not want and reorder the columns as they see fit. With the indirect option we could add a button “Add Columns from a JSON Schema”. This option would require the least amount of work in the UI and the back end. However, the resulting view from the indirect option would be completely disconnected from the driving JSON schema.

Using JSON schemas to drive the rows of a View

How do we use JSON schemas to drive the rows of a View? The following is a list of possible requirements:

  • View designers would like to limit the rows in a view to Entities within a given scope, such as one or more Projects or Folders. This is directly carried over from existing Synapse Views.

  • Ideally, only Entities that “match” the driving JSON schema would be included as rows in the View. See the View examples from above. Specifically, the Views driven by Cat.json and Dog.json each include only rows that match the schema.

  • Ideally, a View would include a row for each Entity that matches the driving JSON schema by extension. For example, both Cat.json and Dog.json extend Pet.json, so a View driven by Pet.json (see above) would include rows for Entities that match either Cat or Dog. Note: Pet.json is “abstract”, so there is nothing that directly matches Pet.json. A match to Pet.json is only possible by extension.

  • A View should include a row for each Entity that matches the driving JSON schema even if the Entity is invalid according to the JSON schema validation. This would allow the view to be used as a tool to find Entities that need to be “fixed”.
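
The "match by extension" requirement in the list above can be sketched as a walk over `$ref` entries. The registry below is a hypothetical in-memory stand-in: the $id values for Pet, Cat, and Dog are assumed for illustration, and extension is assumed to be expressed as a `$ref` inside `allOf`, as the FileEntity reference is in ConditionalAlternative.json.

```python
# Hypothetical registry keyed by schema $id; only 'allOf' references matter here.
SCHEMAS = {
    "my.organization-pets.Pet": {
        "allOf": [{"$ref": "org.sagebionetworks-repo.model.FileEntity-1.0.0"}]
    },
    "my.organization-pets.Cat": {"allOf": [{"$ref": "my.organization-pets.Pet"}]},
    "my.organization-pets.Dog": {"allOf": [{"$ref": "my.organization-pets.Pet"}]},
}


def matches_driver(bound_id, driver_id, schemas=SCHEMAS):
    """Return True when bound_id is driver_id or extends it via allOf/$ref."""
    seen = set()
    pending = [bound_id]
    while pending:
        current = pending.pop()
        if current == driver_id:
            return True
        if current in seen:
            continue  # guard against circular references
        seen.add(current)
        for entry in schemas.get(current, {}).get("allOf", []):
            if "$ref" in entry:
                pending.append(entry["$ref"])
    return False
```

With this registry, an Entity bound to the Cat schema matches a View driven by Pet, but not a View driven by Dog. Whether the Entity is currently valid against its schema is deliberately not part of the check, in line with the last requirement above.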

...

  • View designers will want to include additional columns in their views that are not defined in the schema. The example view that includes schema validation information is an example of this.

Proposed Services

Service to Generate ColumnModels from a JSON Schema

We propose adding a new REST API service that, given the $id of a JSON schema, will generate a list of ColumnModels that contain all of the details captured in the JSON schema. Project designers could then use this service to transfer the details of their JSON schemas to their view schemas. The clients could help the project designer apply these new ColumnModels to their views, similar to how the “Add All Annotations” button works in the existing view editor user interface.

It is important to note that this does not change the fundamental nature of views in any way. The new service is simply a tool to help transfer details of a JSON schema into a view schema.

If the project designer changes their JSON schema, the changes will not automatically propagate to their views. Instead, if a project designer wishes to apply JSON schema changes to their existing views, they would need to manually re-run the service to generate new ColumnModels. In theory, the view editor user interface would help the user manage the ColumnModel delta to apply to their view. This means the project designer will be in full control of what changes propagate to their views.

In addition to the service to generate ColumnModels from JSON schemas, we propose adding a new field to ColumnModel to allow the column to be linked to a JSON schema. Specifically, we propose adding a string field to ColumnModel called “derivedFrom$id”. The value of this field would be the $id of the JSON schema from which the column was derived. The service to generate ColumnModels from JSON schemas would automatically provide a value for this new field based on the $id passed to the service as its parameter.
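
A minimal sketch of what such a generation step might look like is shown below. It is not the proposed implementation. The mapping from JSON schema types to column types, and the use of the ColumnModel fields `columnType` and `enumValues`, are assumptions made for illustration; only `name` and `derivedFrom$id` follow directly from the text above. Inherited properties are not resolved.

```python
# Assumed mapping from JSON schema 'type' to a column type name.
JSON_TYPE_TO_COLUMN_TYPE = {
    "string": "STRING",
    "integer": "INTEGER",
    "number": "DOUBLE",
    "boolean": "BOOLEAN",
}


def guess_column_type(definition):
    """Pick a column type for one property definition (assumed rules)."""
    if definition.get("format") == "date-time":
        return "DATE"
    return JSON_TYPE_TO_COLUMN_TYPE.get(definition.get("type"), "STRING")


def column_models_from_schema(schema):
    """Build one ColumnModel-like dict per direct property of the schema."""
    models = []
    for name, definition in schema.get("properties", {}).items():
        model = {
            "name": name,  # matches the property name
            "columnType": guess_column_type(definition),
            "derivedFrom$id": schema["$id"],  # link back to the source schema
        }
        if "enum" in definition:
            model["enumValues"] = list(definition["enum"])
        models.append(model)
    return models


# Direct properties of the example schema from this document.
EXAMPLE = {
    "$id": "my.organization-pets.ConditionalAlternative",
    "properties": {
        "petName": {"type": "string"},
        "birthday": {"type": "string", "format": "date-time"},
        "petType": {"enum": ["cat", "dog", "fish"]},
    },
}
models = column_models_from_schema(EXAMPLE)
```

Each resulting entry carries the same `derivedFrom$id`, which is what would later allow a client to find other columns or views derived from the same schema.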

REST API Additions

| Response | URL | Request | Description |
|---|---|---|---|
| JobToken | POST /schema/type/columnmodel/async/start | $id of the JSON schema | Start a job to create a list of ColumnModel objects that represent the properties of the given JSON schema. One ColumnModel will be returned for each property of the JSON schema. |
| ColumnModelResponse | GET /schema/type/columnmodel/async/get/{asyncToken} | JobToken | Get the results of the asynchronous job to create a list of ColumnModels from a JSON schema. |
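
A client for the two proposed calls might follow the start-then-poll pattern sketched below. The HTTP layer is injected as two callables so that nothing about authentication or base URLs is assumed. The request body shape (`{"$id": ...}`), the `token` field of the JobToken, and the convention that `get` returns `None` while the job is still running are all assumptions made for illustration, not part of the proposal.

```python
import time

START_PATH = "/schema/type/columnmodel/async/start"
GET_PATH = "/schema/type/columnmodel/async/get/{asyncToken}"


def fetch_column_models(schema_id, post, get, attempts=10, delay_seconds=1.0):
    """Start the job, then poll until a ColumnModelResponse is available.

    post(path, body) and get(path) are caller-supplied callables that
    perform the HTTP requests and return parsed JSON. get is assumed to
    return None while the job is still running.
    """
    job = post(START_PATH, {"$id": schema_id})  # assumed request body shape
    path = GET_PATH.format(asyncToken=job["token"])  # assumed JobToken field
    for _ in range(attempts):
        response = get(path)
        if response is not None:
            return response
        time.sleep(delay_seconds)
    raise TimeoutError(f"no result for schema {schema_id!r} after {attempts} attempts")
```

Injecting `post` and `get` keeps the sketch independent of any particular HTTP library and makes the polling logic easy to exercise with stand-in functions.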

Object Models

ColumnModel

| Name | Type | Description |
|---|---|---|
| All of the existing fields of ColumnModel | | |
| derivedFrom$id | String | The $id of the JSON schema that was used to define this ColumnModel. Note: The name of the ColumnModel will match the name of the property from the JSON schema. |

ColumnModelResponse

| Name | Type | Description |
|---|---|---|
| columnModels | List<ColumnModel> | The resulting list of ColumnModels |
| $id | String | The $id of the JSON schema used to define the list of ColumnModels |