Synapse Platform

Approaches for Capturing Property Order in JSON Schemas

Introduction

Many DCCs (Data Coordinating Centers) need to enforce metadata standards on the metadata they maintain in Synapse. In Synapse, this is achieved by binding JSON Schemas to Projects, Folders, Files, and RecordSets. Synapse uses the bound JSON Schemas to automatically validate annotations or curator grid sessions. Data managers also use the JSON Schemas to define the columns of FileViews and, by extension, of Data Curation grid sessions: in both cases, each property of the JSON Schema becomes a column.

The presentation order of the columns in both views and grid sessions is important. Data managers need to ensure that columns are presented to data curators in a logical order. However, the JSON Schema specification does not provide a mechanism for defining such an order.

Specifically, the JSON specification states:

An object is an unordered set of name/value pairs.

By extension, JSON Schema also treats object properties as unordered. Even if some parsers and tools appear to preserve key order, any step in the pipeline (a client library, code generator, or serializer) can reorder the keys, so preservation is not guaranteed. To support deterministic property order for downstream consumers (for example, grid columns or UI forms), we need to define explicit conventions as extensions on top of standard JSON Schema. This document outlines two concrete approaches and compares their trade-offs, with particular attention to nested schemas, composition, conditional logic, and long-term schema evolution.
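As a small illustration of how key order can be lost, the following Python sketch round-trips a schema fragment through the standard json module. The parser keeps keys in document order, but a serializer configured with sort_keys=True (a common setting for producing stable diffs) alphabetizes them. The schema fragment is a made-up example, not one taken from Synapse.

```python
import json

source = '{"properties": {"id": {}, "name": {}, "description": {}}}'

# Python's json module keeps keys in document order when parsing...
parsed = json.loads(source)
parsed_order = list(parsed["properties"])

# ...but a single serializer option elsewhere in the pipeline is enough to lose it.
rewritten = json.loads(json.dumps(parsed, sort_keys=True))
rewritten_order = list(rewritten["properties"])

print(parsed_order)     # ['id', 'name', 'description']
print(rewritten_order)  # ['description', 'id', 'name']
```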

While some data managers directly maintain the data models as JSON Schemas, it is more common for them to define and maintain data models using one of the following:

CSV – This is a bespoke format where each row of the CSV represents a single data field. A custom script is used to translate each row of the CSV into a property of a JSON Schema. Currently, the order of the rows in the CSV represents the presentation order of the properties.
LinkML – A standard data modeling system that includes tools for exporting to JSON Schema. LinkML provides a rank attribute that can be used to capture the presentation order of properties.

When considering which extension we should adopt for JSON Schemas, we need to consider the approach that would be most compatible with both the CSV and LinkML data model systems.

Approach 1: Object-level "propertyOrder" Array

In this approach, each object schema may declare a "propertyOrder" keyword whose value is an ordered array of property names. The array defines the desired ordering of properties for that object only. Properties that are not listed in the array are still valid but require a fallback ordering rule.

Example (single object level):


{
  "type": "object",
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string" },
    "description": { "type": "string" }
  },
  "propertyOrder": ["id", "name", "description"]
}

Example (nested objects, array is per object level):


{
  "type": "object",
  "properties": {
    "metadata": {
      "type": "object",
      "properties": {
        "createdBy": { "type": "string" },
        "createdOn": { "type": "string", "format": "date-time" }
      },
      "propertyOrder": ["createdBy", "createdOn"]
    },
    "data": {
      "type": "object",
      "properties": {
        "x": { "type": "number" },
        "y": { "type": "number" }
      },
      "propertyOrder": ["x", "y"]
    }
  },
  "propertyOrder": ["metadata", "data"]
}

Consumers that render or otherwise depend on ordering must use the following rules:

At each object level, read the "propertyOrder" array, and order matching properties accordingly.
Append any properties not present in "propertyOrder" using a deterministic fallback (for example, lexicographic by property name; insertion order as observed by the parser is an option only where every tool in the pipeline is known to preserve it).
Treat "propertyOrder" as an extension keyword; standard validators ignore it.
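The rules above can be sketched in Python as follows. This is a minimal illustration for a single object level, not Synapse's implementation: the function name order_by_property_order is hypothetical, the fallback is assumed to be lexicographic by property name, and names in the array that do not match a property are assumed to be skipped.

```python
from typing import Any


def order_by_property_order(schema: dict[str, Any]) -> list[str]:
    """Return the property names of one object schema in presentation order."""
    properties = schema.get("properties", {})
    declared = schema.get("propertyOrder", [])

    ordered: list[str] = []
    for name in declared:
        # Skip names that are stale (no matching property) or repeated.
        if name in properties and name not in ordered:
            ordered.append(name)

    # Assumed fallback: unlisted properties follow, sorted by name.
    unlisted = sorted(name for name in properties if name not in ordered)
    return ordered + unlisted


example_schema = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "id": {"type": "string"},
        "notes": {"type": "string"},
        "name": {"type": "string"},
    },
    "propertyOrder": ["id", "name", "description"],
}

print(order_by_property_order(example_schema))
# ['id', 'name', 'description', 'notes']
```

For nested objects, a consumer would call the same function again on each property whose schema is itself an object.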

Approach 2: Per-property Numeric Ordering Keyword

In this approach, each property schema may declare a numeric ordering keyword such as "orderWeight" (recommended to avoid confusion with the object-level "propertyOrder" array). The consumer sorts properties by this numeric value at each object level. Properties without a value use a fallback ordering rule. When generating schemas from a CSV where each row defines a column, assigning sequential integers (1, 2, 3, …) to "orderWeight" in row order provides an easy, deterministic mapping from the CSV’s global order to the schema.

Example (single object level):


{
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "orderWeight": 1
    },
    "name": {
      "type": "string",
      "orderWeight": 2
    },
    "description": {
      "type": "string",
      "orderWeight": 3
    }
  }
}

Example (nested objects, numeric keyword used per property):


{
  "type": "object",
  "properties": {
    "metadata": {
      "type": "object",
      "properties": {
        "createdBy": {
          "type": "string",
          "orderWeight": 1
        },
        "createdOn": {
          "type": "string",
          "format": "date-time",
          "orderWeight": 2
        }
      }
    },
    "data": {
      "type": "object",
      "properties": {
        "x": {
          "type": "number",
          "orderWeight": 4
        },
        "y": {
          "type": "number",
          "orderWeight": 3
        }
      }
    }
  },
  "propertiesOrderNote": "Order is inferred from per-property 'orderWeight' values at each object level. This field name is illustrative and not part of the proposed extension."
}

Consumers that render or otherwise depend on ordering must use the following rules:

At each object level, collect properties and sort them by the numeric ordering keyword in ascending order.
For ties or missing values, apply a deterministic fallback (for example, lexicographic by property name; insertion order is an option only where every tool in the pipeline is known to preserve it).
Treat the numeric ordering keyword as an extension; standard validators ignore it.
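A minimal Python sketch of these rules for a single object level is shown below. The function name order_by_weight is hypothetical, and the fallback is an assumption: properties with a missing or non-numeric value sort after all weighted properties, and ties are broken lexicographically by property name.

```python
import math
from typing import Any


def order_by_weight(schema: dict[str, Any], keyword: str = "orderWeight") -> list[str]:
    """Return the property names of one object schema sorted by a numeric keyword."""
    properties = schema.get("properties", {})

    def sort_key(name: str) -> tuple[float, str]:
        prop = properties[name]
        # Property schemas may be booleans (true/false), which carry no keywords.
        weight = prop.get(keyword) if isinstance(prop, dict) else None
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(weight, bool) or not isinstance(weight, (int, float)):
            weight = math.inf  # missing or non-numeric: after all weighted ones
        return (weight, name)  # the name breaks ties lexicographically

    return sorted(properties, key=sort_key)


weighted_schema = {
    "type": "object",
    "properties": {
        "x": {"type": "number", "orderWeight": 4},
        "y": {"type": "number", "orderWeight": 3},
        "label": {"type": "string"},
        "z": {"type": "number", "orderWeight": 4},
    },
}

print(order_by_weight(weighted_schema))
# ['y', 'x', 'z', 'label']
```

Because the sort is applied per object level, the weights only need to be meaningful relative to sibling properties; gaps and values that continue a global sequence are harmless.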

Comparison of Approaches

Scope and validation: The ordering use case is strictly about presentation (for example, grid/UI column ordering) and not data validation. Standard JSON Schema validators ignore extension keywords such as propertyOrder or orderWeight, and LinkML's ordering metadata (rank) likewise has no effect on validation; only consumers that implement ordering should interpret them.

Aspect	Approach 1: Object-level "propertyOrder" Array	Approach 2: Per-property Numeric Keyword
Conceptual model	Treats order as metadata on the object: a single, explicit list of property names in the desired order at each object level.	Treats order as metadata on each property: a numeric value that determines ordering when properties are grouped at an object level.
Deep nesting of objects	Scales naturally by defining a separate "propertyOrder" array for each nested object schema. The intent at each level is clear and easy to read.	Also scales, but developers must ensure that numeric values are locally meaningful at each object level. It can be less obvious which numbers apply to which nesting level when reading by eye.
Use with $ref	Referenced object schemas can define their own "propertyOrder" arrays. When composing via $ref, each resolved object-level schema brings its own ordering definition.	Numeric values travel with the referenced property definitions. When a property is reused via $ref in multiple parent objects, the same numeric value may not reflect the intended order in every context, unless overridden locally.
Use with allOf / oneOf / anyOf	Requires merging multiple "propertyOrder" arrays at each object level. Consumers must implement clear merge rules (for example, preserve base order, then insert new properties from additional subschemas at the end or at defined insertion points).	Numeric keywords merge more mechanically: all applicable definitions contribute values for their properties. Consumers then sort by the final numeric values. However, resolving conflicts (for example, two subschemas assigning different numbers to the same property) still requires explicit rules.
Use with conditional logic (if/then/else)	Different branches can specify different "propertyOrder" arrays. Consumers can evaluate the active branch and apply the corresponding order. This keeps conditional layouts explicit but requires the consumer to understand and resolve conditionally active schemas before ordering.	Numeric keywords can be defined in conditional branches and will override or refine base ordering where the condition applies. The resulting order may be harder to reason about statically, because it emerges from the combination of base and branch-specific numeric values.
Schema evolution and refactoring	Adding, removing, or renaming properties requires keeping the "propertyOrder" array in sync. Order is easy to inspect and update as a contiguous list, which often matches how product stakeholders think about ordering.	Adding or removing properties requires assigning or removing numeric values. Large gaps or re-numbering may appear over time, but that does not affect semantics as long as consumers sort numerically. It can be convenient for incremental evolution because new properties can be assigned numbers without editing a separate list.
Human readability and authoring	Ordering intent is immediately visible as a single ordered list of names, which often matches how data managers describe desired layouts. Changes can be made by manipulating a single array at each object level.	Ordering intent is distributed across properties. This can be convenient when editing individual properties in isolation but may be harder to audit as a whole ordered sequence without tooling support.
Handling unspecified properties	Unlisted properties must follow a documented fallback rule (for example, append after all explicitly ordered properties). This makes it explicit which properties are intentionally ordered and which follow a default ordering.	Properties without a numeric value use a fallback rule (for example, treat as if they have a default weight or append after explicitly weighted properties). The mix of weighted and unweighted properties can be flexible but may be less explicit.
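To make the allOf row more concrete, here is one possible merge rule for object-level "propertyOrder" arrays, sketched in Python. Everything about the rule is an assumption for illustration: allOf entries are read in listed order (so the first entry acts as the base), the enclosing schema's own array is read last, and the first occurrence of a name wins. The function name merge_property_order is hypothetical, and the sketch expects $ref to have been resolved already.

```python
from typing import Any


def merge_property_order(schema: dict[str, Any]) -> list[str]:
    """Merge the "propertyOrder" arrays of allOf subschemas and the schema itself."""
    merged: list[str] = []
    # allOf entries first (the first entry acts as the base), then the schema itself.
    sources = [s for s in schema.get("allOf", []) if isinstance(s, dict)] + [schema]
    for source in sources:
        for name in source.get("propertyOrder", []):
            if name not in merged:  # first occurrence wins
                merged.append(name)
    return merged


composed = {
    "allOf": [
        {"properties": {"id": {}, "name": {}}, "propertyOrder": ["id", "name"]},
        {"properties": {"age": {}, "name": {}}, "propertyOrder": ["age", "name"]},
    ],
    "properties": {"notes": {}},
    "propertyOrder": ["notes", "id"],
}

print(merge_property_order(composed))
# ['id', 'name', 'age', 'notes']
```

The per-property approach needs no such array merge, but it still needs a rule for the case where two subschemas assign different numbers to the same property.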

LinkML interoperability (rank ↔ orderWeight)

Scope and intent: As in the comparison above, ordering is for presentation (for example, grid column order) and does not affect validation. Neither LinkML rank nor the JSON Schema extension keyword orderWeight carries validation semantics; only ordering-aware consumers should interpret them.

Conceptual analogy: In LinkML, a Class is analogous to a JSON Schema object; a LinkML slot is analogous to a JSON Schema property. Ordering is interpreted per Class/object level.

Mapping guidance (rank ↔ orderWeight):

LinkML slot rank → JSON Schema property orderWeight (per-property numeric).
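Order is interpreted per Class/object level. A slot reused in multiple classes can have a different order via slot_usage in LinkML; analogously, a property reused via $ref can be assigned a different orderWeight in each object that includes it. Note that JSON Schema draft-07 and earlier direct processors to ignore keywords placed next to $ref, so on those drafts the override is only visible if the ordering-aware consumer reads the sibling keyword explicitly or the schema wraps the reference (for example, in allOf).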
CSV → Schema pipelines: If columns are authored in CSV, assign sequential integers from row order to rank (in LinkML) and to orderWeight (in JSON Schema).
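The CSV mapping can be sketched as follows. The CSV layout here is an assumption made for illustration (the column headers name and type are hypothetical and are not the bespoke format used by any particular DCC); the point is only that enumerating rows in file order yields the sequential orderWeight values.

```python
import csv
import io
from typing import Any

# Hypothetical data-model CSV: one row per field, in presentation order.
CSV_TEXT = """name,type
id,string
name,string
age,integer
"""


def csv_to_schema(text: str) -> dict[str, Any]:
    """Build an object schema whose orderWeight values follow CSV row order."""
    properties: dict[str, Any] = {}
    rows = csv.DictReader(io.StringIO(text))
    for weight, row in enumerate(rows, start=1):
        properties[row["name"]] = {"type": row["type"], "orderWeight": weight}
    return {"type": "object", "properties": properties}


generated = csv_to_schema(CSV_TEXT)
print(generated["properties"]["age"])
# {'type': 'integer', 'orderWeight': 3}
```

Because the weight is stored on each property, the order survives even if a later tool rewrites the "properties" object with its keys in a different order.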

Example: Participant class/object (LinkML → JSON Schema)


# LinkML snippet
classes:
  Participant:
    slots:
      - id
      - name
      - age
    slot_usage:
      id:
        rank: 1
      name:
        rank: 2
      age:
        rank: 3
slots:
  id:
    range: string
  name:
    range: string
  age:
    range: integer


JSON Schema snippet (Participant object):
{
  "type": "object",
  "title": "Participant",
  "properties": {
    "id": { "type": "string", "orderWeight": 1 },
    "name": { "type": "string", "orderWeight": 2 },
    "age": { "type": "integer", "orderWeight": 3 }
  }
}
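The mapping in this example can be expressed as a short hand-rolled Python sketch. It is not the LinkML JSON Schema generator, and it makes no claim about what that generator emits. The model is written as a plain dictionary (what the LinkML snippet above would look like after YAML parsing), the function name class_to_schema is hypothetical, and only the two ranges used in the example are handled.

```python
from typing import Any

# The LinkML snippet above, as a dictionary after YAML parsing.
linkml_model = {
    "classes": {
        "Participant": {
            "slots": ["id", "name", "age"],
            "slot_usage": {"id": {"rank": 1}, "name": {"rank": 2}, "age": {"rank": 3}},
        }
    },
    "slots": {
        "id": {"range": "string"},
        "name": {"range": "string"},
        "age": {"range": "integer"},
    },
}

# Only the ranges that appear in the example are mapped here.
RANGE_TO_TYPE = {"string": "string", "integer": "integer"}


def class_to_schema(model: dict[str, Any], class_name: str) -> dict[str, Any]:
    """Translate one class into an object schema, copying rank to orderWeight."""
    cls = model["classes"][class_name]
    usage = cls.get("slot_usage", {})
    properties: dict[str, Any] = {}
    for slot_name in cls.get("slots", []):
        slot = model["slots"][slot_name]
        prop: dict[str, Any] = {"type": RANGE_TO_TYPE[slot["range"]]}
        # A class-level slot_usage rank takes precedence over a slot-level rank.
        rank = usage.get(slot_name, {}).get("rank", slot.get("rank"))
        if rank is not None:
            prop["orderWeight"] = rank
        properties[slot_name] = prop
    return {"type": "object", "title": class_name, "properties": properties}


participant_schema = class_to_schema(linkml_model, "Participant")
print(participant_schema["properties"]["age"])
# {'type': 'integer', 'orderWeight': 3}
```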

Recommended Decision Criteria

Use these criteria to choose an ordering convention per feature or schema, focusing on outcomes (how curators see columns) rather than implementation. Decisions are evaluated per object/class (for example, a Participant object maps to a class-level decision). Ordering is presentation-only and does not affect validation.

Prefer object-level propertyOrder when:

Data managers think in terms of ordered lists of fields at each object level and are comfortable editing a single array to manage order.
Schemas have substantial reuse of property definitions via $ref, but the desired order may differ depending on context, so tying order to the property definition itself would be misleading.
It is important to make the full ordering of an object obvious to reviewers without tooling support.

Prefer the per-property numeric keyword when:

Schemas use heavy composition (including allOf, oneOf, anyOf) and authors frequently edit or override properties in isolation rather than at the object level.
Conditional logic and $ref reuse require fine-grained overrides of ordering where individual properties change their relative order only in specific branches or contexts.
Schema authoring tools can surface and manage the numeric ordering values for authors, reducing the cognitive cost of tracking numbers by hand.

Hybrid and Fallback Strategy

A hybrid strategy can combine both conventions to capture ordering consistently while allowing fine-grained overrides in complex cases. The following strategy is compatible with both approaches and can be implemented incrementally:

At each object level, prefer an explicit "propertyOrder" array as the primary source of truth for ordering. This is the recommended default for new schemas and for most straightforward object structures.
Allow an optional per-property numeric keyword (for example, "orderWeight") as an override mechanism in advanced scenarios, especially when properties are contributed via $ref, allOf, oneOf, anyOf, or conditional branches.
Define clear merge rules for object-level ordering when combining schemas:
If an object-level "propertyOrder" array is present in the final resolved schema, use it as the base sequence. For properties with a per-property numeric keyword, reorder them within that sequence according to their numeric values if necessary.
If no object-level "propertyOrder" array is present after resolution, derive order solely from the per-property numeric values at that level. For properties without numeric values, apply a deterministic fallback rule (for example, append them after explicitly ordered properties, using lexicographic order among themselves).
For schemas using $ref and allOf, define how arrays from multiple sources are combined (for example, take the base schema’s "propertyOrder" as primary, then insert newly introduced properties from additional subschemas either at defined insertion markers or at the end, optionally adjusted by per-property numeric overrides).
For schemas using oneOf, anyOf, and if/then/else, allow each branch to provide its own ordering metadata. Consumers should resolve which branch applies (when possible) and then apply that branch-specific ordering information.
For unspecified properties (that is, properties not listed in an object-level "propertyOrder" array and without a per-property numeric value), document and apply a consistent fallback behavior across all consumers so that the resulting order is deterministic and predictable.
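One way to read the merge rules above for a single, already-resolved object level is sketched below in Python. The strategy leaves the exact override behavior open, so the sketch makes its own assumptions, and other readings are equally valid: listed properties that carry an "orderWeight" keep the positions they jointly occupy in the "propertyOrder" array but are redistributed among those positions in ascending weight order; unlisted properties follow, sorted by weight and then by name, with unweighted ones last. The function name hybrid_order is hypothetical.

```python
import math
from typing import Any


def hybrid_order(schema: dict[str, Any]) -> list[str]:
    """Order one object level from "propertyOrder" plus "orderWeight" overrides."""
    properties = schema.get("properties", {})

    def weight_of(name: str) -> float:
        prop = properties[name]
        value = prop.get("orderWeight") if isinstance(prop, dict) else None
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return math.inf  # treated as "no weight"
        return value

    # Base sequence: names from the array that actually exist, without repeats.
    listed: list[str] = []
    for name in schema.get("propertyOrder", []):
        if name in properties and name not in listed:
            listed.append(name)

    # Assumed override rule: weighted names keep the positions they jointly
    # occupy in the array but are redistributed in ascending weight order.
    slots = [i for i, name in enumerate(listed) if weight_of(name) != math.inf]
    by_weight = sorted([listed[i] for i in slots], key=lambda n: (weight_of(n), n))
    for i, name in zip(slots, by_weight):
        listed[i] = name

    # Unlisted properties: by weight first, then unweighted names lexicographically.
    rest = sorted((n for n in properties if n not in listed),
                  key=lambda n: (weight_of(n), n))
    return listed + rest


hybrid_schema = {
    "properties": {
        "id": {},
        "name": {"orderWeight": 20},
        "age": {"orderWeight": 10},
        "notes": {},
        "site": {"orderWeight": 5},
        "batch": {},
    },
    "propertyOrder": ["id", "name", "notes", "age"],
}

print(hybrid_order(hybrid_schema))
# ['id', 'age', 'notes', 'name', 'site', 'batch']
```

When no "propertyOrder" array is present, the same function reduces to a pure weight sort with the lexicographic fallback, which matches the second merge rule.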

This hybrid strategy allows feature stakeholders to adopt the object-level "propertyOrder" array as the primary, human-friendly mechanism for most schemas while retaining the flexibility of per-property ordering in complex, deeply nested, or composition-heavy schemas. It does not require or imply that JSON Schema itself supports property ordering; instead, it defines a project-level extension and corresponding consumer behavior.