I would like to be able to match a multi-word search against multiple fields, where every word searched is contained in at least one of the fields, in any combination. The catch is I would like to avoid using query_string.
curl -X POST "http://localhost:9200/index/document/1" -d '{"id":1,"firstname":"john","middlename":"clark","lastname":"smith"}'
curl -X POST "http://localhost:9200/index/document/2" -d '{"id":2,"firstname":"john","middlename":"paladini","lastname":"miranda"}'
I would like the search for 'John Smith' to match only document 1. The following query does what I need but I would rather avoid using query_string in case the user passes "OR", "AND" and any of the other advanced params.
curl -X GET 'http://localhost:9200/index/_search?per_page=10&pretty' -d '{
  "query": {
    "query_string": {
      "query": "john smith",
      "default_operator": "AND",
      "fields": [
        "firstname",
        "lastname",
        "middlename"
      ]
    }
  }
}'
What you are looking for is the multi-match query, but it doesn't perform in quite the way you would like.
Compare the output of validate for multi_match vs query_string.
multi_match (with operator and) will make sure that ALL terms exist in at least one field, meaning all of them together in the same field.
While query_string (with default_operator AND) will check that EACH term exists in at least one field, which may be a different field for each term.
So you have a few choices to achieve what you are after:
1. Preparse the search terms, to remove things like wildcards, etc., before using the query_string query.
2. Preparse the search terms to extract each word, then generate a multi_match query per word.
3. Use index_name in your mapping for the name fields to index their data into a single field, which you can then use for search (like your own custom all field).
Note, however, that firstname and lastname are no longer searchable independently. The data for both fields has been indexed into name.
You could use multi-fields with the path parameter to make them searchable both independently and together. With such a mapping, searching the any_name field works, searching firstname for john AND smith doesn't work, but searching firstname for just john works correctly.
I think "match" query is what you are looking for:
"The match family of queries does not go through a “query parsing” process. It does not support field name prefixes, wildcard characters, or other “advance” features. For this reason, chances of it failing are very small / non existent, and it provides an excellent behavior when it comes to just analyze and run that text as a query behavior (which is usually what a text search box does)"
http://www.elasticsearch.org/guide/reference/query-dsl/match-query.html
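Since the match family skips query parsing, one way to get the "every word in any field" behaviour without query_string is to split the input into words and require one multi_match clause per word inside a bool query. The following is only a rough sketch of that idea in Python: the function name build_all_words_query is made up for illustration, and it assumes that plain whitespace splitting is good enough for your input.

```python
import json


def build_all_words_query(text, fields):
    """Build a search body in which every word must match in at least one field.

    Hypothetical helper, not part of any Elasticsearch client library.
    """
    words = text.split()  # assumption: whitespace splitting is sufficient
    if not words:
        # Nothing to search for, so fall back to matching everything.
        return {"query": {"match_all": {}}}
    # One multi_match clause per word; bool/must requires all clauses to match.
    must = [{"multi_match": {"query": word, "fields": list(fields)}}
            for word in words]
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    body = build_all_words_query("john smith",
                                 ["firstname", "lastname", "middlename"])
    # The printed JSON can be sent as the -d payload of the _search request.
    print(json.dumps(body, indent=2))
```

Words such as AND or OR typed by the user are treated here as ordinary search terms rather than operators, so you may want to drop them before building the query.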
In my experience, escaping the special characters with a backslash is a simple and effective solution. The list can be found in the documentation (http://lucene.apache.org/core/4_5_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description), plus AND/OR/NOT/TO.
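A minimal sketch of that escaping step in Python might look like the following. The function name escape_query_string is hypothetical, the character list is taken from the linked Lucene 4.5 documentation (with & and | escaped individually), and it assumes that a leading backslash is also enough to make the parser read AND/OR/NOT/TO as ordinary terms, so check it against your own version before relying on it.

```python
import json
import re

# Characters the Lucene classic query parser treats as syntax.
_SPECIAL_CHARS = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')
# Upper-case keywords that would otherwise act as operators.
_KEYWORDS = re.compile(r'\b(AND|OR|NOT|TO)\b')


def escape_query_string(text):
    """Backslash-escape query syntax so user input is searched literally.

    Hypothetical helper, not part of any Elasticsearch client library.
    """
    # Escape the special characters first, so the backslashes added for the
    # keywords in the next step are not escaped a second time.
    escaped = _SPECIAL_CHARS.sub(r'\\\1', text)
    return _KEYWORDS.sub(r'\\\1', escaped)


if __name__ == "__main__":
    user_input = 'john AND (smith OR "clark")'
    body = {
        "query": {
            "query_string": {
                "query": escape_query_string(user_input),
                "default_operator": "AND",
                "fields": ["firstname", "lastname", "middlename"],
            }
        }
    }
    print(json.dumps(body, indent=2))
```

Building the request body with json.dumps takes care of the second layer of escaping that JSON itself needs for backslashes and quotes.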