...

  • Parquet is poorly supported in Java; depending on the library we choose, we may need to pull in Hadoop as a dependency.

  • Parquet is a file format with immutable files, so appending to Parquet tables means writing new files, which will involve a lot of file I/O.

  • The current Parquet implementation doesn’t prevent table fragments with different columns from appearing in the same partition, and the fragments don’t contain the assessment ID or revision. We would need to solve both problems ourselves if we adopt Parquet.

Alternate Design 2: Do both

  • Keep Exporter 3 with push to Synapse

  • For any supported assessment or survey, build an “answers.json” file that is a flat dictionary of scores/answers plus metadata (see the sketch after this list).

    • Build a unit-testing setup that can use Python or R scripts (which researchers like) and port the scoring to Kotlin for on-device, cross-platform scoring

    • Write the “answers” to the Adherence Record as a dictionary

    • Write an “answers.json” file to the archive

  • Add a back-end service to get the “answers.json” file into a table
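
As a concrete illustration, here is a minimal Kotlin sketch of what a flat “answers.json” payload could look like, using kotlinx.serialization. All field names (assessmentId, rawScore, and so on) are hypothetical placeholders, not a confirmed schema.

```kotlin
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put

// Sketch of a flat "answers.json" dictionary: scores/answers plus metadata
// at a single level, with no nesting. Field names are illustrative only.
fun buildAnswersJson(): JsonObject = buildJsonObject {
    // Metadata (hypothetical keys)
    put("assessmentId", "dccs")
    put("assessmentRevision", 5)
    put("startedOn", "2023-01-15T10:15:00Z")
    // Flat scores/answers (hypothetical keys)
    put("rawScore", 27)
    put("accuracy", 0.84)
    put("medianReactionTimeMs", 612)
}

fun main() {
    // Serialize to the JSON that would be written to the archive and
    // copied into the Adherence Record as a dictionary.
    println(Json { prettyPrint = true }
        .encodeToString(JsonObject.serializer(), buildAnswersJson()))
}
```

Because the dictionary is flat, a back-end service can map keys directly to table columns without any schema negotiation.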

Pros:

  • Feedback to the participant (if desired)

  • Easier to aggregate the data if we build aggregation into JSON-to-Table

  • Allows us to use an existing table solution, such as Synapse tables, which also allow us to easily export to CSV

Cons:

  • Why did we change from Bridge 1.0/Exporter 2.0 again?

  • Will require robust unit testing to verify that the Kotlin ports match the original R/Python scoring (see the sketch below)
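
To make that risk concrete, here is a minimal sketch of the kind of parity test such a setup could run: reference outputs are generated offline by the researchers’ R/Python script and checked in as fixtures, and the test asserts that the Kotlin port reproduces them. The scoring function and fixture values are hypothetical.

```kotlin
import kotlin.test.Test
import kotlin.test.assertEquals

class ScoringParityTest {
    // Hypothetical Kotlin port of a scoring function originally written in R/Python.
    private fun scoreAccuracy(correct: Int, total: Int): Double =
        correct.toDouble() / total

    @Test
    fun `kotlin port matches reference scores from the R script`() {
        // (correct, total, expectedAccuracy) triples exported from the reference script.
        val referenceCases = listOf(
            Triple(27, 30, 0.9),
            Triple(15, 30, 0.5),
        )
        for ((correct, total, expected) in referenceCases) {
            assertEquals(expected, scoreAccuracy(correct, total), absoluteTolerance = 1e-9)
        }
    }
}
```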

Surveys

Surveys are even easier than ARC measures. We already know what our survey table format looks like. (This is one of the things Exporter 2.0 actually did well. However, that survey engine is currently deprecated, as are Exporter 2.0 schemas.)

...