Most of the basic objects that Synapse currently supports are Entities. Each Entity has first-class data that makes up its fields. All Entities also have Annotations that store additional data about the Entity.

The following are the current E

Currently, all Entities are defined by "hard-coded" Java objects. The fields of these Java objects define the first-class data of each Entity. The only mechanism we have for constraining the data of an Entity is to write Java validation code. We also lack a mechanism to constrain or define Annotations.

While defining entities using Java allowed us to quickly get a first version of Synapse built, we always planned on supporting a more dynamic approach to object definitions. Ideally, we would like our users to define entities without writing Java code. Over the past year we have considered several technologies for entity definitions, but ultimately decided to use JSON Schema 03. We will not cover the details of this selection process in this document.

As it stands now, if our users want to add a field to an entity, an engineering task must be scheduled to get the change implemented. In theory, if we used a schema like JSON Schema 03 for both entity definitions and data constraints, we could make changes to a schema with little or no engineering effort. Engineering would no longer be the bottleneck for the evolution of Synapse Entities and data.

Proposal

We propose using JSON Schema 03 to define both an Entity and the Annotations of an Entity. JSON Schema breaks an object definition into two major categories: properties and additional properties.

An example JSON Schema that describes products might look like:

Code Block

   {
     "name":"Product",
     "properties":{
       "id":{
         "type":"number",
         "description":"Product identifier",
         "required":true
       },
       "name":{
         "description":"Name of the product",
         "type":"string",
         "required":true
       },
       "price":{
         "type":"number",
         "minimum":0,
         "required":true
       },
       "tags":{
         "type":"array",
         "items":{
           "type":"string"
         }
       }
     },
     "additionalProperties":{
       "releaseStatus":{
         "type":"string",
         "description":"The release status of a product",
         "enum":[ "PROTOTYPE", "RELEASED", "RECALLED", "DEPRECIATED" ]
       }
     }
   }

In the above example, we can see how various types of data can be defined for a Product using JSON Schema. For example, "id" is a required number, while "releaseStatus" is an enumeration of strings.
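To make the constraints in the Product example concrete, the following Java sketch checks a candidate product against a hand-written, much simplified copy of that schema. The classes `PropertySketch` and `ProductValidationSketch` are hypothetical names introduced for illustration; they are not part of Synapse or of any JSON Schema library, and only the `required`, `type` (number or string), `minimum`, and `enum` keywords are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified property schema: only the keywords used in the Product example.
class PropertySketch {
    final String type;              // "number" or "string"; other types are ignored here
    final boolean required;
    final Double minimum;           // null when no minimum is declared
    final List<String> enumValues;  // null when any value is allowed

    PropertySketch(String type, boolean required, Double minimum, List<String> enumValues) {
        this.type = type;
        this.required = required;
        this.minimum = minimum;
        this.enumValues = enumValues;
    }
}

class ProductValidationSketch {
    // Hand-written copy of the scalar properties from the Product example (assumption: "tags" is omitted).
    static Map<String, PropertySketch> productSchema() {
        Map<String, PropertySketch> schema = new LinkedHashMap<>();
        schema.put("id", new PropertySketch("number", true, null, null));
        schema.put("name", new PropertySketch("string", true, null, null));
        schema.put("price", new PropertySketch("number", true, 0.0, null));
        schema.put("releaseStatus", new PropertySketch("string", false, null,
                Arrays.asList("PROTOTYPE", "RELEASED", "RECALLED", "DEPRECIATED")));
        return schema;
    }

    // Returns one message per violated constraint; an empty list means the value is valid.
    static List<String> validate(Map<String, PropertySketch> schema, Map<String, Object> value) {
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, PropertySketch> entry : schema.entrySet()) {
            String name = entry.getKey();
            PropertySketch property = entry.getValue();
            Object v = value.get(name);
            if (v == null) {
                if (property.required) {
                    errors.add(name + " is required");
                }
                continue;
            }
            if ("number".equals(property.type) && !(v instanceof Number)) {
                errors.add(name + " must be a number");
                continue;
            }
            if ("string".equals(property.type) && !(v instanceof String)) {
                errors.add(name + " must be a string");
                continue;
            }
            if (property.minimum != null && v instanceof Number
                    && ((Number) v).doubleValue() < property.minimum) {
                errors.add(name + " must be >= " + property.minimum);
            }
            if (property.enumValues != null && !property.enumValues.contains(v)) {
                errors.add(name + " must be one of " + property.enumValues);
            }
        }
        return errors;
    }
}
```

Calling `ProductValidationSketch.validate(ProductValidationSketch.productSchema(), product)` on a map that lacks "id" or carries a negative "price" would return one message per violation. This is the kind of check that today has to be written by hand in Java for each entity.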

The following pseudo-code snippet shows part of an implementation of json-schema-03 represented as a model class:

Code Block

// This class represents a JSON Schema as defined by: http://tools.ietf.org/html/draft-zyp-json-schema-03
import java.util.Map;

class ObjectSchema {
	// ... (other schema fields elided)
	// This map defines the primary fields of an Object; each value is the schema of that field.
	Map<String, ObjectSchema> properties;
	// This map defines the additional Annotations of an Object.
	Map<String, ObjectSchema> additionalProperties;
	// ... (other schema fields elided)
}

In the above class, each property has a name and a schema that defines the property. We propose using ObjectSchema.properties to define the primary fields of a Synapse Entity, and ObjectSchema.additionalProperties to define the Annotations of a Synapse Entity.
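As a rough illustration of this split, the sketch below partitions a flat map of submitted values into primary fields and Annotations, based on which of the two schema maps declares each key. The `ObjectSchema` class here is a minimal stand-in that models only the two maps, and `EntitySplitSketch` with its methods is a hypothetical helper, not existing Synapse code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal stand-in for the ObjectSchema model class; only the two maps are modeled.
class ObjectSchema {
    final Map<String, ObjectSchema> properties = new LinkedHashMap<>();
    final Map<String, ObjectSchema> additionalProperties = new LinkedHashMap<>();
}

// Hypothetical helper showing how the two maps could drive the field/annotation split.
class EntitySplitSketch {
    // Copies only the values whose keys are declared in the given schema map.
    private static Map<String, Object> select(Map<String, ObjectSchema> declared,
                                              Map<String, Object> values) {
        Map<String, Object> selected = new LinkedHashMap<>();
        for (String key : declared.keySet()) {
            if (values.containsKey(key)) {
                selected.put(key, values.get(key));
            }
        }
        return selected;
    }

    // Values declared under "properties" become the primary fields of the Entity.
    static Map<String, Object> primaryFields(ObjectSchema schema, Map<String, Object> values) {
        return select(schema.properties, values);
    }

    // Values declared under "additionalProperties" become the Annotations of the Entity.
    static Map<String, Object> annotations(ObjectSchema schema, Map<String, Object> values) {
        return select(schema.additionalProperties, values);
    }
}
```

In this sketch, keys declared in neither map are dropped from both results. Whether such values should be rejected or stored is a design decision this document does not settle.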

There is a lot more detail to the JSON Schema definition that we will not cover here.

Schema Life-cycle

For the initial implementation, we propose that an Entity Schema can only be defined and edited at compile time, as part of the Synapse build. This means run-time edits or additions to a schema will not be possible. The reason for this limitation is to keep the life-cycle of the schema as simple as possible. As we will see, the life-cycle is already complicated even with this limitation.

Entity Creation

A new entity will be created by first creating a new JSON text file in the lib-auto-generated project's src/main/resources folder. Folder hierarchies should be used to represent the equivalent of "packages" for each entity.
The following example shows where a Dataset entity might be created:

Code Block

/lib-auto-generated/src/main/resources/org/sagebionetworks/entities/Dataset.json
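Because Maven places the contents of src/main/resources at the root of the classpath, a schema file laid out this way could be located from the entity's package-qualified name. The following is a minimal sketch under that assumption. `SchemaResourceSketch` and its methods are hypothetical names, and the ".json" naming rule is inferred from the Dataset example rather than taken from an existing Synapse API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: maps an entity name to its schema resource and reads it.
class SchemaResourceSketch {
    // "org.sagebionetworks.entities.Dataset" becomes "org/sagebionetworks/entities/Dataset.json".
    static String resourcePath(String qualifiedEntityName) {
        return qualifiedEntityName.replace('.', '/') + ".json";
    }

    // Reads the schema text from the classpath; fails if no such resource exists.
    static String loadSchemaText(String qualifiedEntityName) throws IOException {
        String path = resourcePath(qualifiedEntityName);
        ClassLoader loader = SchemaResourceSketch.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(path)) {
            if (in == null) {
                throw new IOException("No schema resource found at " + path);
            }
            // readAllBytes requires Java 9 or later.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Keeping the folder hierarchy aligned with the package name means no separate registry of schema locations is needed; the name alone identifies the file.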