Introduction

Most of the basic objects that Synapse currently supports are Entities. Each Entity has first class data that makes up the fields of an entity. All Entities also have Annotations that store additional data about an entity.

The following are the current Synapse Entities:

  • Project
  • Folder
  • Dataset
  • Layer
  • Location
  • EULA

Currently, all Entities are defined by "hard-coded" Java objects. The fields of these Java objects define the first class data of each entity. The only mechanism we have for constraining data of an Entity is to write Java code to do the validation. We also lack a mechanism to constrain or define annotations.

While defining entities using Java allowed us to quickly get a first version of Synapse built, we always planned on supporting a more dynamic approach to object definitions. Ideally we would like our users to define entities without writing Java code. As it stands now, if our users want to add a field to an entity, an engineering task must be scheduled to get the change implemented. In theory, if we used a schema language such as JSON Schema 03 for both entity definitions and data constraints, we could make changes to a schema with little or no engineering effort. Engineering would no longer be the bottleneck for the evolution of Synapse Entities and data.

Proposal

We are proposing to use JSON Schema 03 to define an Entity type. The JSON Schema breaks an object definition into two major categories: properties and additional properties.

An example JSON Schema that describes products might look like:

Code Block
{
    "name":"Product",
    "properties":{
        "id":{
            "type":"number",
            "description":"Product identifier",
            "required":true
        },
        "name":{
            "description":"Name of the product",
            "type":"string",
            "required":true
        },
        "price":{
            "type":"number",
            "minimum":0,
            "required":true
        },
        "tags":{
            "type":"array",
            "items":{
                "type":"string"
            }
        },
        "releaseStatus":{
            "type":"string",
            "description":"The release status of a product",
            "enum":[ "PROTOTYPE", "RELEASED", "RECALLED", "DEPRECIATED"]
        }
    },
    "additionalProperties":{
    }
}
{code}

In the above example, we can see how various types of data can be defined for a Product using the JSON Schema. For example, "id" is a number and required, while "releaseStatus" is an enumeration of strings.

We are proposing to use "properties" to define the primary fields of a Synapse Entity. These primary fields can be considered the expected data of all instances of a given entity. Using the Product example from above, this implies that all instances of Product would have "id", "name", "price" and "tags".

Initially we were planning to use "additionalProperties" to define the Annotations of a Synapse Entity, but this raised a fundamental issue. If the Annotations of an entity are provided for ad hoc user data, then formally defining them in the entity schema for all instances of a type seems like a poor fit. That said, we still have many use cases where we want to constrain the data of an annotation when it is added to an instance of an entity. Therefore, we are proposing that these annotation types are set on a per-instance basis rather than at the entity schema level. Annotation types are covered in a separate document: Proposal for Annotation Types
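To make the role of "properties" more concrete, the following is a minimal Python sketch of how an instance could be checked against the Product schema. It is not how Synapse validates entities. The names `validate`, `check_value` and `PRODUCT_SCHEMA` are hypothetical, and only the keywords that appear in the Product example ("type", "required", "minimum", "enum", "items") are handled.

```python
# Minimal sketch: check an instance against the "properties" of a
# draft-03 style schema. Only the keywords used by the Product example
# are handled; this is not a complete JSON Schema validator.

PRODUCT_SCHEMA = {
    "name": "Product",
    "properties": {
        "id": {"type": "number", "required": True},
        "name": {"type": "string", "required": True},
        "price": {"type": "number", "minimum": 0, "required": True},
        "tags": {"type": "array", "items": {"type": "string"}},
        "releaseStatus": {
            "type": "string",
            "enum": ["PROTOTYPE", "RELEASED", "RECALLED", "DEPRECIATED"],
        },
    },
}


def _type_ok(value, type_name):
    """Return True when value matches one of the type names used above."""
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    return True  # other types are not checked in this sketch


def check_value(name, value, rule):
    """Return a list of error strings for one value and its property rule."""
    type_name = rule.get("type")
    if type_name is not None and not _type_ok(value, type_name):
        return [f"{name}: expected {type_name}"]
    errors = []
    if "minimum" in rule and _type_ok(value, "number") and value < rule["minimum"]:
        errors.append(f"{name}: below minimum {rule['minimum']}")
    if "enum" in rule and value not in rule["enum"]:
        errors.append(f"{name}: not one of the allowed values")
    if type_name == "array" and "items" in rule:
        for index, item in enumerate(value):
            errors.extend(check_value(f"{name}[{index}]", item, rule["items"]))
    return errors


def validate(instance, schema):
    """Check every declared property; draft-03 marks required per property."""
    errors = []
    for name, rule in schema["properties"].items():
        if name not in instance:
            if rule.get("required", False):
                errors.append(f"{name}: required field is missing")
            continue
        errors.extend(check_value(name, instance[name], rule))
    return errors
```

Calling `validate({"id": 1, "name": "Widget"}, PRODUCT_SCHEMA)` reports the missing "price", while optional fields such as "tags" may be left out without an error.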

Schema Life-cycle

For the initial implementation we are proposing that an Entity Schema can only be defined and edited as part of compiling Synapse. This means run-time edits or additions to each schema will not be possible. The reason for this limitation is to keep the life-cycle of the schema as simple as possible. As we will see, the life-cycle is already complicated even with this limitation.

Define Entities

A new entity will be created by first creating a new JSON text file in the lib-auto-generated project's src/main/resources folder. Folder hierarchies should be used to represent the equivalent of "packages" for each entity.

The following example shows where an Example entity might be created:

Code Block
/lib-auto-generated/src/main/resources/org/sagebionetworks/entity/type/Example.json
{code}

Let's say we also want to define an Annotation type and use it to help define our Example.json. This annotation type definition JSON text file might be created in the following location:

Code Block
/lib-auto-generated/src/main/resources/org/sagebionetworks/annotation/types/VertebrateOrganType.json
{code}

Before we look at the definition of our Example.json, let's first look at the definition of our new VertebrateOrganType.json. For this example we want to use the Basic Vertebrate Anatomy ontology to define the valid values for Organs:

VertebrateOrganType.json

Code Block
{
    "type":"string",
    "format":"uri",
    "enum":[
        "XQUERY":"doc(http://rest.bioontology.org/bioportal/concepts/4531?conceptid=tbio:Organ&light=1&apikey=2fb9306a-7f3f-477a-821e-e3ccd7356a18)/success/data/classBean/relations/entry[string=Subclass]/list/classBean/fullId"
    ]
}
{code}

In this example, the enumeration values are defined by an XQuery that is used to get the "fullId" (URIs) of all sub-classes of the term "Organ" using the XML returned from NCBO's BioPortal Term services. Here is the XML returned by the term service for this example: http://rest.bioontology.org/bioportal/concepts/4531?conceptid=tbio:Organ&light=1&apikey=2fb9306a-7f3f-477a-821e-e3ccd7356a18


Assuming the XQuery is set up correctly, the effective enum definition for this type would be:

Code Block
"enum":[
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Heart",
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Pericardium",
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Brain",
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Stomach",
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Lung",
    "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#Liver"
]
{code}
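As a rough illustration of how the query path selects the "fullId" values, here is a Python sketch that walks the same element path over a small XML document. Two things are assumptions: the sample XML is a hand-written stand-in that only contains the elements named in the path (the real BioPortal payload may differ), and Python's `xml.etree.ElementTree` is used with its limited XPath subset in place of a real XQuery engine. The function name `subclass_ids` is hypothetical.

```python
import xml.etree.ElementTree as ET

BASE = "http://www.co-ode.org/ontologies/basic-bio/basic-vertebrate-gross-anatomy.owl#"

# Hand-written stand-in for the term service response. Only the elements
# named in the query path are present; the real payload may differ.
SAMPLE_XML = f"""<success>
  <data>
    <classBean>
      <relations>
        <entry>
          <string>SuperClass</string>
          <list>
            <classBean><fullId>http://example.org/placeholder#Parent</fullId></classBean>
          </list>
        </entry>
        <entry>
          <string>Subclass</string>
          <list>
            <classBean><fullId>{BASE}Heart</fullId></classBean>
            <classBean><fullId>{BASE}Liver</fullId></classBean>
          </list>
        </entry>
      </relations>
    </classBean>
  </data>
</success>"""


def subclass_ids(xml_text):
    """Return the fullId text of every class listed under the 'Subclass' entry."""
    root = ET.fromstring(xml_text)  # the root element is <success>
    # Same steps as the query path, relative to the root element. The
    # predicate keeps only entries whose <string> child reads "Subclass".
    path = "data/classBean/relations/entry[string='Subclass']/list/classBean/fullId"
    return [node.text for node in root.findall(path)]
```

With the stand-in document, `subclass_ids(SAMPLE_XML)` yields the Heart and Liver URIs and skips the entry that is not labelled "Subclass". The resulting list is what would populate the effective enum.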

Now that we have defined an Annotation Type for Organ using the ontology, we can use this type in the definition of the entity.

Here is the definition of our example Entity:

Example.json

Code Block
{
    "extends":"org/sagebionetworks/entity/type/Entity.json",
    "name":"Example",
    "properties":{
        "id":{
            "type":"number",
            "description":"Example identifier",
            "required":true
        },
        "name":{
            "description":"Name of the Example",
            "type":"string",
            "required":true
        },
        "organ":{
            "$ref":"org/sagebionetworks/annotation/types/VertebrateOrganType.json"
        }
    }
}
{code}
The first thing to point out about our Example.json is that it extends Entity.json, which makes it a Synapse Entity. This implies it inherits all of its values from the base Entity. The second thing to point out is that the "organ" property is defined using the annotation type we created earlier.
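One way to picture what "extends" and "$ref" mean for the effective set of fields is the Python sketch below. It is an illustration under stated assumptions, not the behaviour of the schema-to-pojo plugin: the schemas are held in an in-memory dictionary keyed by resource path, the contents of Entity.json (a single "createdOn" field) are invented because the document does not define them, and `effective_properties` is a hypothetical helper.

```python
# In-memory stand-in for the schema files, keyed by resource path.
ENTITY = "org/sagebionetworks/entity/type/Entity.json"
ORGAN_TYPE = "org/sagebionetworks/annotation/types/VertebrateOrganType.json"
EXAMPLE = "org/sagebionetworks/entity/type/Example.json"

SCHEMAS = {
    # Hypothetical base entity; its real fields are not given in this document.
    ENTITY: {
        "properties": {
            "createdOn": {"type": "string"},
        },
    },
    ORGAN_TYPE: {
        "type": "string",
        "format": "uri",
    },
    EXAMPLE: {
        "extends": ENTITY,
        "name": "Example",
        "properties": {
            "id": {"type": "number", "required": True},
            "name": {"type": "string", "required": True},
            "organ": {"$ref": ORGAN_TYPE},
        },
    },
}


def effective_properties(path, schemas):
    """Merge inherited properties and replace each "$ref" with its target."""
    schema = schemas[path]
    merged = {}
    parent = schema.get("extends")
    if parent is not None:
        # Inherited fields come first; the child may override them below.
        merged.update(effective_properties(parent, schemas))
    for name, rule in schema.get("properties", {}).items():
        ref = rule.get("$ref")
        merged[name] = dict(schemas[ref]) if ref is not None else dict(rule)
    return merged
```

For the Example schema this returns the inherited "createdOn" field alongside "id", "name" and "organ", with "organ" carrying the string/URI definition from VertebrateOrganType.json.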

Compile POJOs (first time)

Since we still want Java POJOs to represent all entities, we will use the schema-to-pojo-maven-plugin to build these POJOs. This is done by simply adding the following to the lib-auto-generated/pom.xml file:

Code Block
<plugin>
    <groupId>org.sagebionetworks</groupId>
    <artifactId>schema-to-pojo-maven-plugin</artifactId>
    <version>${schema-to-pojo.version}</version>
    <executions>
        <execution>
            <goals>
                <goal>generate</goal>
            </goals>
            <configuration>
                <sourceDirectory>src/main/resources</sourceDirectory>
                <packageName>org.sagebionetworks</packageName>
                <outputDirectory>target/auto-generated-pojos</outputDirectory>
            </configuration>
        </execution>
    </executions>
</plugin>
{code}
The plugin will automatically create a POJO class for each JSON schema found in the resource directory. These POJOs will be placed in the target/auto-generated-pojos directory.

Synapse Deploy (first time)

The first time Synapse is deployed after creating Entities, the org.sagebionetworks.repo.model.bootstrap.EntityBootstrapper will read all JSON schema files found in the lib-auto-generated.jar file and create a Synapse SchemaEntity (to be defined) for each, using the directory structure to create each path. All schema entities will be placed in the folder:

Code Block
root/schemas
{code}

The resulting SchemaEntity objects from the two examples above would have the following paths:

Code Block
root/schemas/org/sagebionetworks/entity/type/Example.json
root/schemas/org/sagebionetworks/annotation/types/VertebrateOrganType.json
{code}

Folder entities will be created as needed to create each path. By giving each SchemaEntity a unique path, we can use this path to reference a schema before we have an entity to represent it. The API user will be able to get the SchemaEntity objects, but they will be READ-ONLY copies. This is important because the "truth" of each entity is the JSON text file from the lib-auto-generated project. Hopefully, this will make more sense as the rest of the life-cycle is outlined.
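The path mapping described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not the EntityBootstrapper itself; `schema_entity_path` and `folders_to_create` are hypothetical helpers, and the input paths are assumed to be relative to the resources folder.

```python
from pathlib import PurePosixPath

# Folder that holds every SchemaEntity, as named in the text above.
SCHEMA_ROOT = PurePosixPath("root/schemas")


def schema_entity_path(resource_path):
    """Map a schema's resource-relative path to its SchemaEntity path."""
    return str(SCHEMA_ROOT / resource_path)


def folders_to_create(resource_paths):
    """List every folder needed for the given schemas, parents before children."""
    folders = []
    for resource_path in resource_paths:
        parents = list((SCHEMA_ROOT / resource_path).parents)
        # .parents runs from the nearest folder outwards and ends with ".",
        # so reverse it and skip the "." entry.
        for folder in reversed(parents):
            text = str(folder)
            if text != "." and text not in folders:
                folders.append(text)
    return folders
```

Given the two example schemas, the folders shared by both paths (root, root/schemas, and so on down to sagebionetworks) are listed once, followed by the entity/type and annotation/types branches.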

Edit of a Schema

Imagine that we want to add a new primary field to our Example.json Entity. To do this we need to modify the original JSON file in the lib-auto-generated project:

Code Block
/lib-auto-generated/src/main/resources/org/sagebionetworks/entity/type/Example.json
{code}

We want to add a new required primary field called "status". Since "status" is required, we must provide a default value. This is a requirement because we already have instances of Example entities deployed to Synapse, and each of these must be given a default value. We will cover how these default values are applied shortly. Here is our new Example.json:

Example.json

Code Block
{
    "extends":"org/sagebionetworks/entity/type/Entity.json",
    "name":"Example",
    "properties":{
        "id":{
            "type":"number",
            "description":"Example identifier",
            "required":true
        },
        "name":{
            "description":"Name of the Example",
            "type":"string",
            "required":true
        },
        "organ":{
            "$ref":"org/sagebionetworks/annotation/types/VertebrateOrganType.json"
        },
        "status":{
            "type":"string",
            "required":true,
            "enum":[
                "PROTOTYPE",
                "RELEASED",
                "RECALLED",
                "DEPRECIATED"
            ],
            "default":"PROTOTYPE"
        }
    }
}
{code}

Compile POJOs (Nth Time)

This time when we compile the new Example.java POJO, the resulting POJO will have a new field called "status" with a default value of "PROTOTYPE".

Backup Deployed Synapse

Before we can deploy our updated schema, we must create a backup of the deployed Synapse. See: Repository+Administration


This is an important step. We will use this backup to deploy our changes to the repository.

Synapse Deploy (Nth Time)

Just like before, the bootstrap system will pre-populate all SchemaEntities on the new empty repository. At this point we have an empty Synapse that is up-to-date with regard to the current schema.

Restore Synapse from Backup

After we have a clean repository, we can restore the backup from the earlier step. See: Repository+Administration

The restore daemon will start off by deleting all of the data in Synapse. It will then restore all entities, including the SchemaEntities. One of the main tasks of the restore daemon is to migrate data to the current version during the restoration process. This means we need to detect that a new property was added to the Example.json schema, and ensure that migrated Example entities have this new field with the default value.

Once all data has been migrated to the current schema, the old SchemaEntity objects can be replaced using the new JSON schemas from the lib-auto-generated.jar file.
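The default-filling part of that migration could look roughly like the Python sketch below. It is a sketch under assumptions rather than the restore daemon's actual logic: entities are modelled as plain dictionaries, `migrate_entity` is a hypothetical helper, and a required property without a default is treated as an error because there would be no value to give existing instances.

```python
import copy

# The "properties" of the updated Example schema, reduced to the
# keywords that matter for migration.
NEW_EXAMPLE_PROPERTIES = {
    "id": {"type": "number", "required": True},
    "name": {"type": "string", "required": True},
    "status": {
        "type": "string",
        "required": True,
        "enum": ["PROTOTYPE", "RELEASED", "RECALLED", "DEPRECIATED"],
        "default": "PROTOTYPE",
    },
}


def migrate_entity(entity, properties):
    """Return a copy of entity with missing properties filled from defaults."""
    migrated = copy.deepcopy(entity)
    for name, rule in properties.items():
        if name in migrated:
            continue  # existing values are never overwritten
        if "default" in rule:
            migrated[name] = copy.deepcopy(rule["default"])
        elif rule.get("required", False):
            # A new required field without a default cannot be migrated.
            raise ValueError(f"required property {name!r} has no default value")
    return migrated
```

An Example entity backed up before the change, such as `{"id": 7, "name": "sample"}`, comes out of `migrate_entity` with "status" set to "PROTOTYPE", while an entity that already has a status keeps its value.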