How to get same result as following Mysql query fr

2019-09-07 01:36发布

问题:

Mysql Query : The inner query returns all the attribute_value containing "man" and it's position in attribute value. The outer query orders it in descending order of position number. Thereby giving results in order where "man" starts moving from 1st position to later positions Like

man
manager
aman
human
hanuman
assistant manager
indian institute of management

This is the SQL query:

SELECT f1.av
FROM (
    SELECT `attribute_value` av, LOCATE("man",LOWER(`attribute_value`)) po     
        FROM db_attributes WHERE `attribute_value` LIKE "%man%"
    ) f1 
ORDER BY f1.po

I want to achieve this using solr. Right now I am clueless about how to achieve this. Solr is loaded with all attribute values. Help is greatly appreciated.

回答1:

This question is about how to do partial string matching that is NOT left-anchored. This may be some misunderstanding of what Solr (and any index) provides and what it does not provide.

You can do this query in mysql because it is computed at execution time, at the cost of examining every row. But it is unnatural to attempt this query in Solr because the entire point of an index is to minimize cost at execution time and NOT touch every record. I.E., the index wants to precompute a subset for a given potential input.

Consider: your two basic fieldType for this are string and text. String only supports exact matching. Text does tokenizing and stemming. Do you want a search for "ingition" to match "ignite"? It appears you do not, since you are not treating the input as a word or word-stem, but rather a string.

In that case, you probably want to look at http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory, which can be used to produce all the left-anchored substrings of given tokens. By using a second field, you can also have EdgeNGramFilterFactory produce right anchored substrings (then search both for matches). But this is not the same as producing all possible substrings as your example usage suggests.

As for the resultset order, you would have to define a relevance that sorts the way you want. That probably means a separate string field with high score for exact match and the atomized field for matching at a lower relevance.

In short, you probably should not be thinking of reproducing these particular mysql queries exactly in Solr. I would push for clarification or redefinition of the use case (left or right anchoring).



标签: mysql solr