How to boost indices query inside a boolean query

2019-07-25 06:59发布

So what I'm trying to achieve is partial matching with customized searchable fields per index. I generate a match_phrase_prefix with the value to search, and if it is more than one word, I generate another one per word.(I could use prefix, but it bugged, or has undocumented settings).

In this case, I'm trying to look up for "belden cable"; the query looks like this:

{
    "query":{
        "bool":{
            "should":
            [
                {
                    "indices":{
                        "indices":["addresss"],
                        "query":{
                            "bool":{
                                "should":
                                [
                                    {"match_phrase_prefix":{"name":"BELDEN CABLE"}}
                                    {"match_phrase_prefix":{"name":"BELDEN"}},
                                    {"match_phrase_prefix":{"name":"CABLE"}}
                                ]
                            }
                        },
                        "no_match_query":"none"
                    }
                },
                {
                    "indices":{
                        "indices":["customers"],
                        "query":{
                            "bool":{
                                "should":[
                                    {"match_phrase_prefix":{"_all":"BELDEN CABLE"}},
                                    {"match_phrase_prefix":{"_all":"CABLE"}},
                                    {"match_phrase_prefix":{"_all":"BELDEN"}}
                                ]
                            }
                        },
                    "no_match_query":"none"
                }
            }
        ]
    }
}

My target search is to get the results that have "belden cable" first, then the searches for just "belden" or "cable".

This returns(by example) 4 results that have "belden cable", then a result that has only "cable", then more results of "belden cable".

How can I boost the results that have the complete value of the search?("belden cable")

I've tried separating the indices query of both words and separated words, but it gives worst relevance results.

Also I've tried using a boost statement inside the match_phrase_prefix for "belden cable" without change in the results..

1条回答
Root(大扎)
2楼-- · 2019-07-25 07:32

What you actually need is a different way of analyzing the input data. See below something that should be a starting point to your final solution (because you need to consider the full set of requirements for your queries and data analysis). Searching with ES is not only about queries, but also about how you structure and prepare the data.

The idea is that you want your data to be analyzed so that belden cable stays as is. With a mapping of "name": {"type": "string"} the standard analyzer is being used which means that the list of terms in your index is belden and cable. What you actually need is [belden cable, belden, cable]. So, I thought on suggesting the shingles token filter.

DELETE /addresss
PUT /addresss
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_shingle": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "analyzer_shingle"
        }
      }
    }
  }
}
DELETE /customers
PUT /customers
{
  "settings": {
    "analysis": {
      "analyzer": {
        "analyzer_shingle": {
          "tokenizer": "standard",
          "filter": [
            "standard",
            "lowercase",
            "shingle"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "_all": {
        "analyzer": "analyzer_shingle"
      }
    }
  }
}

POST /addresss/test/_bulk
{"index":{}}
{"name": "belden cable"}
{"index":{}}
{"name": "belden cable yyy"}
{"index":{}}
{"name": "belden cable xxx"}
{"index":{}}
{"name": "belden bla"}
{"index":{}}
{"name": "cable bla"}

POST /customers/test/_bulk
{"index":{}}
{"field1": "belden", "field2": "cable"}
{"index":{}}
{"field1": "belden cable yyy"}
{"index":{}}
{"field2": "belden cable xxx"}
{"index":{}}
{"field2": "belden bla"}
{"index":{}}
{"field2": "cable bla"}

GET /addresss,customers/test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "indices": {
            "indices": [
              "addresss"
            ],
            "query": {
              "bool": {
                "should": [
                  {
                    "match_phrase_prefix": {
                      "name": "BELDEN CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "name": "BELDEN"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "name": "CABLE"
                    }
                  }
                ]
              }
            },
            "no_match_query": "none"
          }
        },
        {
          "indices": {
            "indices": [
              "customers"
            ],
            "query": {
              "bool": {
                "should": [
                  {
                    "match_phrase_prefix": {
                      "_all": "BELDEN CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "_all": "CABLE"
                    }
                  },
                  {
                    "match_phrase_prefix": {
                      "_all": "BELDEN"
                    }
                  }
                ]
              }
            },
            "no_match_query": "none"
          }
        }
      ]
    }
  }
}
查看更多
登录 后发表回答