Sphinx Search / MySQL find most common words

2020-07-24 03:43发布

问题:

I have a sphinx search index and would like to find what are the most common words in my index. Ideally have a list of words ordered by frequency.

If it is not possible to do it using Sphinx, is there a way to query a mysql table's text fields to get the same stat?

回答1:

Yes. This is quite simple. Build them with indexer using the --buildstops and --buildfreqs flags.

indexer --config /path/to/sphinx.conf  indexName --buildfreqs --buildstops freq_wordlist.txt 100000

This example gives you the first 100000 word in your sphinx index ordered by its frequency



回答2:

Build them with indexer using the --buildstops and --buildfreqs flags.

Just be aware this does not build from the existing index, but runs against the data source as if indexing, and builds the word frequencies. It does not affect the index itself.

If you use delta indexes where you save the id of the last indexed document, this will read the last saved id and work from there.