
Upload Validation

See DHP-1074.

JSON Validation

JSON Schemas will be used to validate the structure and type correctness of the JSON data. In Java, we can use https://github.com/networknt/json-schema-validator (see also https://mvnrepository.com/artifact/com.networknt/json-schema-validator). This library was chosen because it is compatible with both Jackson and JSON Schema Draft 7.

JSON Schemas will be loaded from the file manifest in each upload’s metadata.json. See https://github.com/BridgeDigitalHealth/mobile-client-json/pull/17 for documentation on the file format. For each file listed in the file manifest, we will download the JSON Schema and use that to validate the file.
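The per-file lookup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the manifest is assumed to be already parsed into a map from filename to schema URL (the real field names are defined in the linked pull request, not here), the download step is passed in as a function, and caching by URL is an optional extra rather than part of the design. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper; names are illustrative and not part of the existing codebase. */
class ManifestSchemaResolver {
    /**
     * Returns the schema text for each file in the manifest, keyed by filename.
     * The downloader is invoked at most once per distinct schema URL.
     */
    static Map<String, String> resolveSchemas(
            Map<String, String> fileToSchemaUrl,      // assumed shape: filename -> schema URL
            Function<String, String> downloader) {    // assumed: fetches schema text for a URL
        Map<String, String> cacheByUrl = new HashMap<>();
        Map<String, String> schemaByFile = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fileToSchemaUrl.entrySet()) {
            String schemaText = cacheByUrl.computeIfAbsent(entry.getValue(), downloader);
            schemaByFile.put(entry.getKey(), schemaText);
        }
        return schemaByFile;
    }

    public static void main(String[] args) {
        Map<String, String> manifest = new LinkedHashMap<>();
        manifest.put("taskData.json", "https://example.org/schemas/taskData.json");
        manifest.put("taskData2.json", "https://example.org/schemas/taskData.json");
        // Stand-in downloader that returns an empty schema instead of making a network call.
        Map<String, String> schemas = resolveSchemas(manifest, url -> "{}");
        System.out.println(schemas.keySet());
    }
}
```

Each resolved schema would then be handed to the JSON Schema validator together with the matching file from the upload.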

Previous experience tells us that researchers still want the data even if it fails validation, so a validation failure will not cause the data to be discarded. Instead, we will log a message whenever a file fails validation. We will use a Log Scan to surface these messages on our dashboards, so we can detect invalid uploads and incorrect schemas and fix both app issues and systemic issues.
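A minimal sketch of this "log, but keep the data" behavior is shown here. It makes several assumptions: the validator is represented by a plain function that returns a list of error messages (a stand-in for the real schema validator), and the fixed marker string that a Log Scan might match on is invented for illustration. None of these names come from the existing code.

```java
import java.util.List;
import java.util.function.Function;
import java.util.logging.Logger;

/** Hypothetical helper; names and the log marker are illustrative only. */
class LenientValidation {
    private static final Logger LOG = Logger.getLogger(LenientValidation.class.getName());

    // Assumed marker: a stable prefix makes the message easy to match in a log scan.
    static final String LOG_MARKER = "UPLOAD_JSON_VALIDATION_FAILED";

    /** Builds the log line for a failed validation. */
    static String formatFailure(String uploadId, String filename, List<String> errors) {
        return LOG_MARKER + " uploadId=" + uploadId + " file=" + filename + " errors=" + errors;
    }

    /**
     * Runs the validator and logs any errors. Returns the error count and never
     * throws on invalid data, so the caller can continue processing the file.
     */
    static int validateLeniently(String uploadId, String filename, String json,
            Function<String, List<String>> validator) {
        List<String> errors = validator.apply(json);
        if (!errors.isEmpty()) {
            LOG.warning(formatFailure(uploadId, filename, errors));
        }
        return errors.size();
    }

    public static void main(String[] args) {
        // Stand-in validator that rejects empty documents.
        Function<String, List<String>> validator =
                json -> json.isEmpty() ? List.of("document is empty") : List.of();
        int errorCount = validateLeniently("upload-1", "taskData.json", "", validator);
        System.out.println("errors: " + errorCount + ", file is still processed");
    }
}
```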

Table Column Validation

A list of table columns will be added as an Assessment Resource. The format will be a simple JSON Array of strings, where the strings are the names of the expected table columns (excluding common metadata columns).

Converting from JSON to table columns is done server-side, so we only need to write the code once (on the server) instead of twice (once on iOS, once on Android). This means that we wouldn’t have JSON when we have table columns, so a JSON schema wouldn’t be usable here. A list of column names is the simplest solution that doesn’t require us to build a new typing system from scratch.

If JSON-to-Table-Row generates columns that are not in this list, we will log a message. We will use Log Scans to add this to our dashboards, as with JSON validation above. (The unexpected columns will still be included in the CSV, because researchers usually want these columns anyway.)
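The column check could look something like the sketch below. It assumes the expected column list has already been read from the Assessment Resource into a list of strings, and that common metadata columns have been removed from the generated list before the comparison. The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

/** Hypothetical helper; names are illustrative and not part of the existing codebase. */
class TableColumnCheck {
    private static final Logger LOG = Logger.getLogger(TableColumnCheck.class.getName());

    /** Returns the generated columns that are absent from the expected list, in encounter order. */
    static Set<String> findUnexpectedColumns(List<String> expectedColumns, List<String> generatedColumns) {
        Set<String> unexpected = new LinkedHashSet<>(generatedColumns);
        unexpected.removeAll(expectedColumns);
        return unexpected;
    }

    /**
     * Logs any unexpected columns, then returns the generated columns unchanged,
     * so that nothing is dropped from the CSV.
     */
    static List<String> checkAndKeepAll(String assessmentId, List<String> expectedColumns,
            List<String> generatedColumns) {
        Set<String> unexpected = findUnexpectedColumns(expectedColumns, generatedColumns);
        if (!unexpected.isEmpty()) {
            LOG.warning("Unexpected table columns for assessment " + assessmentId + ": " + unexpected);
        }
        return generatedColumns;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("score", "durationMs");
        List<String> generated = List.of("score", "durationMs", "debugFlag");
        List<String> kept = checkAndKeepAll("example-assessment", expected, generated);
        System.out.println("columns written to CSV: " + kept);
    }
}
```

Comparing names only keeps the check independent of any typing system, which matches the reasoning given above for using a plain list of column names.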

Dwayne Jeng
October 2, 2023

My understanding is that we would be converting from JSON to table columns server-side, so that we only have to write this code once (on the server) instead of twice (once on iOS, once on Android).

This means that we wouldn’t have JSON when we have table columns, so a JSON schema wouldn’t be usable here. A list of column names is the simplest solution that doesn’t require us to build a new typing system from scratch.

JSON Schema is being used to parse the raw JSON files from the uploads.