...
Parquet is poorly supported in Java; depending on the library we pick, we may have to pull in Hadoop as a dependency.
Parquet is a file format, so appending to Parquet tables will involve a lot of file I/O.
The current Parquet implementation doesn't prevent table fragments with different columns from appearing in the same partition, and the fragments don't contain the assessment ID or revision. We would need to solve this problem in any Parquet-based design (one possible mitigation is sketched below).
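One way to keep mismatched fragments out of the same partition (a sketch only; the Hive-style path layout and parameter names are assumptions, not a committed design) is to encode the assessment ID and revision directly into the partition path:

```kotlin
// Sketch: derive a Hive-style partition path that embeds the assessment
// identity, so fragments with different column sets can never share a
// partition. The parameter names (appId, assessmentId, revision) are
// illustrative, not an existing API.
fun partitionPath(appId: String, assessmentId: String, revision: Int, date: String): String =
    "parquet/appId=$appId/assessmentId=$assessmentId/revision=$revision/date=$date"

fun main() {
    // e.g. parquet/appId=mobile-toolbox/assessmentId=flanker/revision=3/date=2023-01-05
    println(partitionPath("mobile-toolbox", "flanker", 3, "2023-01-05"))
}
```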
Alternate Design 2: Do both
Keep Exporter 3 with push to Synapse
For any supported assessment or survey, build an “answers.json” file that is a flat dictionary of scores/answers + metadata (see the sketch after this list).
Build a unit-testing setup that can use Python or R scripts (what researchers like to write) and port them to Kotlin for on-device, cross-platform scoring
Write the “answers” to the Adherence Record as a dictionary
Write an “answers.json” file to the archive
Add a back-end service to get the “answers.json” file into a table
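To make the “answers.json” idea concrete, here is a minimal sketch of building that flat dictionary, assuming kotlinx.serialization for JSON encoding; every field name shown is illustrative, not a settled schema:

```kotlin
import kotlinx.serialization.json.buildJsonObject
import kotlinx.serialization.json.put

// Sketch of a flat "answers.json" payload: raw scores/answers plus enough
// metadata to join the row back to the assessment and participant. All
// field names and values here are hypothetical examples.
fun buildAnswersJson(): String {
    val answers = buildJsonObject {
        // metadata
        put("assessmentId", "flanker")
        put("assessmentRevision", 3)
        put("recordId", "rec-123")
        put("healthCode", "abc-health-code")
        // flat scores/answers
        put("totalScore", 42)
        put("reactionTimeMs", 512.5)
        put("completed", true)
    }
    return answers.toString() // serialized JSON, ready to write out
}
```

The same dictionary would be written to the Adherence Record and zipped into the archive as “answers.json”, so the back-end table service only has to flatten a single level to produce a row.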
Pros:
Feedback to the participant (if desired)
Easier to aggregate the data if we build aggregation in as part of JSON-to-Table
Allows us to use an existing table solution, such as Synapse tables, which also allow us to easily export to CSV
Cons:
Why did we change from Bridge 1.0/Exporter 2.0 again?
Will require robust unit testing to verify that the Kotlin ports of the R/Python scoring produce identical results (see the parity-test sketch below)
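A minimal sketch of what that parity testing could look like, assuming we commit fixtures generated offline by the researchers’ reference scripts (all class, function, and path names below are hypothetical):

```kotlin
import java.io.File
import kotlin.test.Test
import kotlin.test.assertEquals

// Parity-test sketch: the reference fixtures are generated offline by the
// researcher's script, e.g.
//   python score_flanker.py fixtures/flanker/input.json > fixtures/flanker/expected.json
// and committed alongside the Kotlin port.
class ScoringParityTest {

    // Stand-in for the real Kotlin port of the Python/R scoring code.
    private fun scoreFromJson(inputJson: String): String =
        TODO("call the Kotlin scoring port here")

    @Test
    fun kotlinPortMatchesReferenceScript() {
        val input = File("fixtures/flanker/input.json").readText()
        val expected = File("fixtures/flanker/expected.json").readText()
        assertEquals(expected, scoreFromJson(input))
    }
}
```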
Surveys
Surveys are even easier than ARC measures. We already know what our survey table format looks like. (This is one of the things Exporter 2.0 actually did well. However, that survey engine is currently deprecated, as are Exporter 2.0 schemas.)
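As a rough illustration (column names are assumptions modeled on the Exporter 2.0 style of standard metadata columns plus one column per question; nothing here is a settled schema), flattening one survey submission into a table row could look like:

```kotlin
// Sketch of flattening a survey submission into a table row: standard
// metadata columns first, then one column per question identifier.
fun surveyRow(
    recordId: String,
    healthCode: String,
    createdOn: String,
    answers: Map<String, Any?>, // questionIdentifier -> answer
): Map<String, Any?> =
    linkedMapOf<String, Any?>(
        "recordId" to recordId,
        "healthCode" to healthCode,
        "createdOn" to createdOn,
    ).apply { putAll(answers) } // one column per question

fun main() {
    println(surveyRow("rec-1", "hc-9", "2023-01-05T10:00:00Z",
        mapOf("mood" to 4, "sleepHours" to 7.5)))
}
```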
...