If I search for documents containing e.g. "called" in the "message" field, I get the expected result. But when I search for "was called", "was called*" or "*was called*", I get nothing, although I have a lot of documents whose message field contains "Application was called by REST API".
Here is a part of a query I send:
"wildcard": {
  "message": {
    "wildcard": "was called",
    "boost": 1.0
  }
}
Here is a part of the mapping:
"mappings": {
  "doc": {
    "dynamic_templates": [
      {
        "message_field": {
          "path_match": "message",
          "match_mapping_type": "string",
          "mapping": {
            "norms": false,
            "type": "text"
          }
        }
      },
      {
        "string_fields": {
          "match": "*",
          "match_mapping_type": "string",
          "mapping": {
            "fields": {
              "keyword": {
                "ignore_above": 256,
                "type": "keyword"
              }
            },
            "norms": false,
            "type": "text"
          }
        }
      }
    ],
    "properties": {
      ...
      "message": {
        "type": "text",
        "norms": false
      }
    }
  }
}
The indices I search are created automatically by Logstash.
I have a similar problem with another field. It holds the value "NP-00121": *00121 works, but *-00121 doesn't.
Edit: one more example. I have a "requestUri" field containing "/api/v1/log/rest", "/api/v1/log/notification", etc. When I send the wildcard query "/api/v1*", I get nothing.
So it looks like the problem appears when the pattern contains spaces or dashes. Could anyone help me solve this?
Wildcards are matched against individual tokens, not against the whole field value. Your message field is indexed as text, so it is tokenized into words, and no single token contains a space.
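To see the tokens a text field actually stores, the analyze API can help. The body below is a sketch meant to be sent to the `_analyze` endpoint; it assumes the field uses the default standard analyzer, which the mapping in the question does not override.

```json
{
  "analyzer": "standard",
  "text": "Application was called by REST API"
}
```

Under that assumption the response should list the lowercased tokens application, was, called, by, rest and api, each as a separate term. A wildcard pattern such as "was called" has to match one of those terms in full, which none of them can.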
Basically, you don't need wildcards for a query like "was called". Simply use a phrase query like:
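A minimal sketch of such a request body, using the field name and text from the question (the original snippet is not preserved here, so treat this as an illustration rather than the exact query):

```json
{
  "query": {
    "match_phrase": {
      "message": "was called"
    }
  }
}
```

The phrase is analyzed with the field's analyzer, so it matches documents where the tokens "was" and "called" appear next to each other in that order.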
or if you prefer a query string query:
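One possible form, again a sketch rather than the original snippet. The inner quotes are what make it a phrase search; `default_field` is set here only to limit the query to the message field:

```json
{
  "query": {
    "query_string": {
      "default_field": "message",
      "query": "\"was called\""
    }
  }
}
```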
A wildcard query would be useful for searching for partial terms, something like:
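For instance, a sketch with the pattern "call*". The pattern is written in lowercase on the assumption that the standard analyzer lowercased the indexed tokens, since the wildcard pattern is compared with the stored terms:

```json
{
  "query": {
    "wildcard": {
      "message": {
        "value": "call*"
      }
    }
  }
}
```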
If you wanted to find all docs that contain "call", "called" or "calling".
For values like NP-00121, or for URIs, it would likely be more useful if those fields were not analyzed. As it is, these values are split into tokens ('np' and '00121'), so a pattern that spans the dash, such as *-00121, cannot match any single token. That is the problem you are seeing. You can index these fields as the "keyword" type instead of "text" to have the whole value indexed as a single, unanalyzed token.
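The string_fields dynamic template in the question already adds a keyword sub-field to string fields other than message, so querying that sub-field may work without reindexing. The sketch below assumes requestUri was mapped through that template and therefore has a requestUri.keyword sub-field; check the actual mapping of the index to confirm.

```json
{
  "query": {
    "wildcard": {
      "requestUri.keyword": {
        "value": "/api/v1*"
      }
    }
  }
}
```

Keyword values are stored exactly as sent, so the pattern is case-sensitive, and with "ignore_above": 256 any value longer than 256 characters is not indexed in the sub-field and will not be found this way.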