...
MySQL Full Text Index:
A table can be configured to support full text search, the user can select the columns to index
Given the limits on indexes with MySQL a special column (potentially in a special table) contains the concatenated text of the selected columns
Provide a special construct in the SQL language
In some cases re-indexing of the whole data will be needed (e.g. if we drop/add a column snice since we concatenate the text)
Can be extended in the future with some more advanced features (e.g. we can setup stemming etc) as needed, we could explore Apache Lucene (Elasticsearch is based on it) or OpenNLP
Synapse Elasticsearch Cluster with one index-per-table configuration
Setup an elasticsearch cluster with the Synapse stack
A table can be configured to be indexed with a one index per table configuration
Provide the users with a special search API, or integrate with the SQL language
Limit the search results to a maximum value (e.g. 1000) and integrate the results with the faceted results
In some cases a full re-indexing might be necessary but with the one index per table configuration we might be able to run modify-by-query operations instead
We can extend it to allow the users to setup the index configuration as they see fit (e.g. custom analyzers, stop words, suggesters etc)
We can extend it to setup a default query “template” and let the users customize it per table
Synapse Elasticsearch Cluster with single index configuration (can be turned into multiple indexes for scalability)
Similar as 2. but we construct a special document with the concatenated values instead and the table id as a field
Full re-indexing might be necessary for some operations (add/drop column)
No customization for the index or query
We need to be careful not to leak information (e.g. suggestion APIs etc might leak data from different tables)
External Elasticsearch Cluster with one index-per-table configuration
Similar to 2. but the user specifies the cluster endpoint (e.g. similar to what we do for custom storage locations) and we verify ownership. The cluster setup would need to provide access to the Synapse account for a specific index.
Synapse handles the index synchronization, there might be some issues with the fact that tables are actually rebuilt every stack (e.g. the users might not be happy with us rebuilding indexes every week)