...
Feature | CloudSearch | AWS Elasticsearch | MySQL FTS | Notes
---|---|---|---|---
Schema Type | Fixed, with partial support for schema-less data through dynamic fields (capture all) | Fixed or schema-less (dynamic mapping) | Fixed. The index must list every column included in the search, and the query must specify all the columns in the index. | A schema-less approach allows indexing data whose structure is unknown. This might not be needed for tables, since by design we know the structure of the data.
Stemming | Yes | Yes | No - Requires pre-processing on the application side | Tokens are usually reduced to a word stem before indexing, which allows more flexibility when matching similar terms (e.g. a search for "database" might match documents containing "databases").
Fuzzy search | Yes | Yes | No - Could potentially be implemented using Soundex in a pre-processing step, but it’s very fragile | Fuzzy search can be useful in some cases to handle minor misspellings.
Field boosting | Yes (at query time) | Yes (at both query and schema time) | No - Not sure what a workaround would look like. | This is useful when specific columns are more relevant than others (e.g. a match in the title might be more meaningful than a match in a description).
Multiple indexes | No | Yes | Yes - Each Synapse table has its own DB table | In the context of Synapse tables it is relevant to be able to create an index per table: each table might have a different schema, and it might be more meaningful to reflect that schema in the search index.
Auto-complete | Yes (Suggester API) | Yes (through suggesters, various options) | No | This feature provides suggestions, useful for auto-complete (e.g. while you type).
Did-you-mean | Partial? - Maybe the suggester or fuzzy search can be used | Yes (through suggesters) | No | This feature provides potential suggestions after the search (e.g. for misspellings).
Highlighting | Yes | Yes | No |
Facets | Yes (but we wouldn’t be able to use it for tables, due to the limit on the number of fields in a domain and the sheer number of columns in tables) | Yes (but it might not be a good idea to use it, given that we would have to re-implement the whole faceting on top of Elasticsearch) | Partial - This is already supported for Synapse tables as a custom implementation | This might not be relevant, as Synapse tables already implement faceting.
Arrays | Yes | Yes | No - Not natively, but it could probably be worked around | This might be needed for multi-value columns.
Custom Synonyms | Yes (index time) | Yes (index or query time) | No | This feature can be useful to complement stemming or fuzzy search. Expanding the index/query with similar terms might yield better results.
Custom Stop words | Yes (global) | Yes (index) | Yes (global) |
Dedicated Java Client | Yes | No. It currently re-uses the client provided by Elastic, which intentionally broke compatibility with non-Elastic distributions in newer versions. There are plans to release forks that will maintain compatibility. | Yes, JDBC |
Maintenance and scalability | Managed, auto-scale | Managed, tuning suggestions | Managed RDS |
Synapse Tables Integration Effort | High | High | Medium |
Additional Costs | Yes, per cluster per instance type/hour, plus the amount of data in batches sent to the index. | Yes, per instance type/hour, plus the size of the data. | No | Elasticsearch might turn out to be cheaper than CloudSearch, since the instances are priced lower and we do not pay for sending batches to the index. However, sizing the cluster correctly can be complex with Elasticsearch, and ensuring availability can make it more expensive (e.g. dedicated master nodes, multiple availability zones and replicas).
...
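The Fuzzy search row of the comparison mentions Soundex as a possible pre-processing step for MySQL FTS and calls it fragile. The sketch below is a plain implementation of classic American Soundex, written only to illustrate that point; the function name and the sample words are assumptions, not something taken from the Synapse code base.

```python
# Letter groups of classic American Soundex.
_GROUPS = (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
)
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(word):
    """Return the 4-character Soundex code of an ASCII word ('' if no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    encoded = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded.append(code)
        prev = code  # vowels reset the previous code
    return ("".join(encoded) + "000")[:4]


# A transposition inside the word still matches...
print(soundex("database"), soundex("databse"))
# ...but a typo in the first letter does not,
print(soundex("batabase"))
# and unrelated words can collide.
print(soundex("Robert"), soundex("Rupert"))
```

Indexing such codes alongside the original text would catch some misspellings, but the first-letter sensitivity and the collisions between unrelated words show why the approach is hard to rely on.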
Solution | Pro | Con
---|---|---
MySQL Full Text Search | |
AWS Elasticsearch | |
CloudSearch | |
...