In cassandra SASI custom index, need to change Ana

Posted 2019-08-18 04:18

Question:

Which analyzer is costlier in terms of time and disk space, given the search criteria applied to the data?

Note: I'm using the NonTokenizingAnalyzer for its case-sensitivity feature.

Answer 1:

analyzer_class: specifies the analyzer that processes the text in the indexed column.

  • The NonTokenizingAnalyzer is used for cases where the text is not analyzed, but case normalization or sensitivity is required.
  • The StandardAnalyzer is used for analysis that involves stemming, case normalization, case sensitivity, skipping common words like "and" and "the", and localization of the language used to complete the analysis.
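As a rough sketch of how the two analyzers are selected, the CQL below creates one SASI index of each kind. The keyspace, table, column, and index names (demo.articles, title, body, and the two index names) are hypothetical and used only for illustration; the option values are examples, not recommendations.

```cql
-- Hypothetical table used only for this illustration.
CREATE TABLE IF NOT EXISTS demo.articles (
    id uuid PRIMARY KEY,
    title text,
    body text
);

-- NonTokenizingAnalyzer: the whole value is indexed as a single term;
-- only case handling is configurable.
CREATE CUSTOM INDEX articles_title_idx ON demo.articles (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'true'
};

-- StandardAnalyzer: the value is tokenized, with optional stemming,
-- stop-word skipping, lowercasing, and a locale.
CREATE CUSTOM INDEX articles_body_idx ON demo.articles (body)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.StandardAnalyzer',
    'analyzed': 'true',
    'tokenization_enable_stemming': 'true',
    'tokenization_skip_stop_words': 'true',
    'tokenization_normalize_lowercase': 'true',
    'tokenization_locale': 'en'
};
```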

So by moving from the StandardAnalyzer to the NonTokenizingAnalyzer you lose the ability to skip common words, apply localization, and so on. Whether to switch therefore depends on the queries you need to support.
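Because the options of an existing index cannot be altered in place, switching analyzers generally means dropping the index and creating it again, which rebuilds it over the existing data. A minimal sketch, assuming a hypothetical index named articles_body_idx on a hypothetical table demo.articles:

```cql
-- Remove the index that was built with the previous analyzer
-- (index and table names are hypothetical).
DROP INDEX IF EXISTS demo.articles_body_idx;

-- Recreate it with the NonTokenizingAnalyzer, keeping matches case-sensitive.
CREATE CUSTOM INDEX articles_body_idx ON demo.articles (body)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'true'
};
```

Queries that relied on the dropped index will fail or require ALLOW FILTERING until the new index has finished building, so it is worth trying the switch on a test table first.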

In terms of disk space, the StandardAnalyzer uses more because it has more processing to do, but it also provides more functionality. Whether that extra cost is worthwhile depends on your use case.