Favor exact matches over nGram in Elasticsearch

Posted 2019-05-17 20:29

I am trying to map a field both as nGram and as an 'exact' match, so that exact matches appear first in the search results. I followed an answer to a similar question, but I am struggling to make it work.

No matter what boost value I specify for the 'exact' field, I get the same result order each time. This is what my field mapping looks like:

"name" : {
    "type" : "multi_field",
    "fields" : {
      "name" : {
        "type" : "string",
        "boost" : 2.0,
        "analyzer" : "ngram"
      },
      "exact" : {
        "type" : "string",
        "boost" : 4.0,
        "analyzer" : "simple",
        "include_in_all" : false
      }
    }
  }

And this is what the query looks like:

{
    "query": {
        "filtered": {
            "query": {
                "query_string": {
                    "fields":["name","name.exact"],
                    "query":"Woods"
                }
            }
        }
    }
}

2 Answers
叛逆
#2 · 2019-05-17 20:37

Understanding how the score is calculated

Elasticsearch can produce an explanation of the score for every search result when you set the explain parameter to true:

POST  <Index>/<Type>/_search?explain&format=yaml
{
"query" : " ....."
}

This produces a lot of output for every hit, which can be overwhelming, but it is worth taking some time to understand what it all means.

The output of explain can be hard to read as JSON, so adding format=yaml makes it easier to read.
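If the explain output is still too noisy, one option is to flatten it on the client side. The sketch below assumes each explanation node is a dictionary with `value`, `description` and an optional `details` list of child nodes; the `flatten_explanation` helper and the `sample` tree are hypothetical and hand-written for illustration, not real Elasticsearch output.

```python
from typing import Any, Dict, List


def flatten_explanation(node: Dict[str, Any], depth: int = 0) -> List[str]:
    """Render one explanation node and its children as indented lines."""
    lines = [f"{'  ' * depth}{node['value']:.4f}  {node['description']}"]
    # "details" may be missing or empty on leaf nodes.
    for child in node.get("details") or []:
        lines.extend(flatten_explanation(child, depth + 1))
    return lines


# Hypothetical explanation tree, written by hand for this example.
sample = {
    "value": 1.5,
    "description": "max of:",
    "details": [
        {"value": 1.5, "description": "weight(name:woo in 0)", "details": []},
        {"value": 1.0, "description": "weight(name.exact:woods in 0)"},
    ],
}

for row in flatten_explanation(sample):
    print(row)
```

Reading the tree this way makes it easier to spot whether the top-level node combines the per-field scores with a "max of" or a "sum of".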

Understanding why a document is matched or not

You can run the query against a specific document, as shown below, to see an explanation of how it is matched:

GET <Index>/<Type>/<id>/_explain
{
"query": "....."
}
迷人小祖宗
#3 · 2019-05-17 20:54

The multi_field mapping is correct, but the search query needs to be changed like this:

{
    "query": {
        "filtered": {
            "query": {
                "multi_match": { # changed from "query_string"
                    "fields": ["name","name.exact"],
                    "query": "Woods",
                    # added this so the engine does a "sum of" instead of a "max of"
                    # this is deprecated in the latest versions but works with 0.x
                    "use_dis_max": false
                }
            }
        }
    }
}

Now the results take the 'exact' match into account, and its score is added to the total.
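To see why switching from "max of" to "sum of" can change the order, here is a simplified sketch of the two ways of combining per-field scores. The scores are made up for illustration, the `combine` helper is hypothetical, and real Elasticsearch scoring involves more factors (normalization, tie breakers), so treat this as arithmetic only.

```python
from typing import Dict


def combine(field_scores: Dict[str, float], mode: str) -> float:
    """Combine per-field scores by taking the best one or adding them up."""
    scores = list(field_scores.values())
    if mode == "max":
        return max(scores, default=0.0)
    if mode == "sum":
        return sum(scores)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical per-field scores for the query "Woods".
docs = {
    "Woods": {"name": 1.5, "name.exact": 1.0},    # nGram and exact both match
    "Woodson": {"name": 1.5, "name.exact": 0.0},  # only the nGram field matches
}

by_max = {title: combine(scores, "max") for title, scores in docs.items()}
by_sum = {title: combine(scores, "sum") for title, scores in docs.items()}

print("max of:", by_max)
print("sum of:", by_sum)
```

With these numbers, "max of" scores both documents the same because the nGram score dominates, while "sum of" lets the exact match lift "Woods" above "Woodson".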
