Performance of elastic queries

Posted 2019-04-12 03:06

This query takes 200+ ms every time it is executed:

{
  "filter": {
    "term": {
      "id": "123456",
      "_cache": true
    }
  }
}

But this one takes only 2-3 ms every time it is executed after the first query:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "id": "123456"
        }
      }
    }
  }
}

Note that both queries use the same ID value. It looks like the second query uses cached results from the first query. But why can't the first query use the cached results itself? Removing "_cache": true from the first query doesn't change anything.

And when I execute the second query with some other ID, it takes ~40 ms the first time and 2-3 ms every time after that. So the second query is not only faster, it also caches the results and uses the cache for subsequent calls.

Is there an explanation for all this?

1 answer
爷、活的狠高调
#2 · 2019-04-12 03:40

The top-level filter element in the first request has a very special function in Elasticsearch. It's used to filter search results without affecting facets. To avoid interfering with facets, this filter is applied while results are being collected rather than during the search itself, which is why it is slow. Using a top-level filter without facets makes very little sense, because filtered and constant_score queries typically provide much better performance. If the verbosity of the filtered query with match_all bothers you, you can rewrite your second request as an equivalent constant_score query:

{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "id": "123456"
        }
      }
    }
  }
}
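
For contrast, here is a rough sketch of the case the top-level filter is meant for: a request that also asks for facets. The facet name by_category and the field category are made up for illustration and are not from the question. Under that assumption, the hits would be limited to the document with the given id, while the facet counts would still be computed over everything the query matches (here, all documents):

```json
{
  "query": {
    "match_all": {}
  },
  "filter": {
    "term": {
      "id": "123456"
    }
  },
  "facets": {
    "by_category": {
      "terms": {
        "field": "category"
      }
    }
  }
}
```

If you don't need facet counts that ignore the filter, the constant_score form above is the better choice.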