...
To help us understand all of the technical challenges for this project we built a toy example data model. This data model was designed to capture all of the technical challenges while remaining as simple and small as we could make it.
...
So rather than pushing the complexity to the UI, it would be better of if these perspectives behaves like any other Synapse table/view. Parsing and SQL manipulation should not be required. This means that the perspectives should behave as if they are as simple tables, instead of three layers of aggregation used to generate them. So, our new design attempts to hide most of the complexity without losing any of the functionality. We would also like the resulting feature to be generic enough to work for other (non-cohort-builder) use cases.
...
For example, the PART_COUNT column is does not exist in the source MATERIAL table, instead it is a derived aggregation. Count is a simple case that always returns an integer so it should be safe to assume a column type of INTEGER. However, what about the other three aggregate columns: PART_STAGE, PART_AGE, and PART_ID? Each contains different types of data.
...
The first five columns (11,22,33,44,55) and the last (88) are all existing column types supported by both the clients and server. This means the UI should be able to treat those columns exactly like any other column of the same type. However, PART_STAGE (76) and PART_AGE (77) include new types definitions. We will talk about these new types next.
Column
...
TypeJSON
Technically, the aggregate statics for the PART_STAGE are gathered in two separate columns using the following in the the SQL select:
...
Code Block | ||
---|---|---|
| ||
SELECT
CAST(FILE_ID, 11),
CAST(FILE_NAME, 22),
CAST(FILE_TYPE, 33),
CAST(FILE_SIZE, 44),
CAST(COUNT(PART_ID), 55),
AGG_EXPAND(STAGE, 66),
AGG_EXPAND(AGE, 77),
CAST(GROUP_CONCAT(DISTINCT PART_ID), 88)
FROM MATERIAL
WHERE FILE_ID IS NOT NULL GROUP BY GROUP BY FILE_ID, FILE_NAME, FILE_TYPE, FILE_SIZE |
...