Show all Elasticsearch aggregation results/buckets

Posted 2019-01-30 02:54

I'm trying to list all buckets on an aggregation, but it seems to be showing only the first 10.

My search:

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 0, 
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw"
         }
      }
   }
}'

Returns:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 16920,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "bairro_count" : {
      "buckets" : [ {
        "key" : "Barra da Tijuca",
        "doc_count" : 5812
      }, {
        "key" : "Centro",
        "doc_count" : 1757
      }, {
        "key" : "Recreio dos Bandeirantes",
        "doc_count" : 1027
      }, {
        "key" : "Ipanema",
        "doc_count" : 927
      }, {
        "key" : "Copacabana",
        "doc_count" : 842
      }, {
        "key" : "Leblon",
        "doc_count" : 833
      }, {
        "key" : "Botafogo",
        "doc_count" : 594
      }, {
        "key" : "Campo Grande",
        "doc_count" : 456
      }, {
        "key" : "Tijuca",
        "doc_count" : 361
      }, {
        "key" : "Flamengo",
        "doc_count" : 328
      } ]
    }
  }
}

I have many more than 10 keys for this aggregation. In this example I'd have 145 keys, and I want the count for each of them. Is there some pagination on buckets? Can I get all of them?

I'm using Elasticsearch 1.1.0

4 Answers
可以哭但决不认输i
#2 · 2019-01-30 03:11

By the way, see https://github.com/elasticsearch/elasticsearch/issues/1776

It was closed on Jun 22. My Elasticsearch was downloaded and installed before that date, so I assume you can get this if you have the latest version.

爷的心禁止访问
#3 · 2019-01-30 03:19

The size parameter should be a parameter of the terms aggregation, for example:

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 0,
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw",
            "size": 0
         }
      }
   }
}'

As mentioned in the docs, this works only from version 1.1.0 onwards.

Edit

Updating the answer based on @PhaedrusTheGreek's comment.

Setting size: 0 is deprecated from 2.x onwards because of the memory problems it can cause on a cluster when the field has high-cardinality values. You can read more about it in the related GitHub issue.

It is recommended to explicitly set a reasonable value for size, a number between 1 and 2147483647.
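
A minimal sketch of building the request body with an explicit bucket limit, assuming Python is used on the client side. The helper name `terms_agg_body` and its default aggregation name are hypothetical; only the body shape and the 1 to 2147483647 range come from this answer. It only assembles the JSON and does not contact Elasticsearch.

```python
import json

MAX_TERMS_SIZE = 2147483647  # upper bound quoted in the answer


def terms_agg_body(field, size, agg_name="bairro_count"):
    """Build a search body with a terms aggregation and an explicit bucket limit.

    Hypothetical helper: it only assembles the JSON body.
    """
    if not 1 <= size <= MAX_TERMS_SIZE:
        raise ValueError("size must be between 1 and %d" % MAX_TERMS_SIZE)
    return {
        "size": 0,  # top-level size: return no hits, only aggregations
        "aggregations": {
            agg_name: {"terms": {"field": field, "size": size}}
        },
    }


if __name__ == "__main__":
    # 200 is an arbitrary limit above the 145 keys mentioned in the question.
    print(json.dumps(terms_agg_body("bairro.raw", 200), indent=2))
```

The printed JSON can be passed as the `-d` payload of the curl command shown in the question.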

【Aperson】
#4 · 2019-01-30 03:23

Increase the size (the second size) to 10000 in your terms aggregation and you will get up to 10000 buckets. By default it is set to 10. Also, if you want to see the search results, set the first size to 1 and you will see one document, since ES supports both searching and aggregation in the same request.

curl -XPOST "http://localhost:9200/imoveis/_search?pretty=1" -d'
{
   "size": 1,
   "aggregations": {
      "bairro_count": {
         "terms": {
            "field": "bairro.raw",
            "size": 10000
         }
      }
   }
}'
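
Once the response arrives, the buckets can be flattened into a simple mapping and their number compared against the requested size. The following is a sketch in Python, assuming the response has already been parsed into a dict; `bucket_counts` is a hypothetical helper, and the sample is a shortened version of the response shown in the question.

```python
def bucket_counts(response, agg_name):
    """Return {bucket key: doc_count} for one terms aggregation in a parsed response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key"]: b["doc_count"] for b in buckets}


# Shortened version of the response shown in the question.
sample = {
    "hits": {"total": 16920, "hits": []},
    "aggregations": {
        "bairro_count": {
            "buckets": [
                {"key": "Barra da Tijuca", "doc_count": 5812},
                {"key": "Centro", "doc_count": 1757},
                {"key": "Recreio dos Bandeirantes", "doc_count": 1027},
            ]
        }
    },
}

counts = bucket_counts(sample, "bairro_count")
requested_size = 10000  # the terms "size" sent in the request

# If as many buckets come back as were requested, some may have been cut off.
possibly_truncated = len(counts) >= requested_size
print(len(counts), possibly_truncated)
```

If `possibly_truncated` is true, the size was probably too small for the field and is worth raising.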
祖国的老花朵
#5 · 2019-01-30 03:25

How to show all buckets?

{
  "size": 0,
  "aggs": {
    "aggregation_name": {
      "terms": {
        "field": "your_field",
        "size": 10000
      }
    }
  }
}

Note

  • "size":10000 Get at most 10000 buckets. Default is 10.

  • "size":0 By default, "hits" in the result contains 10 documents. We don't need them, so the top-level size is set to 0.

  • By default, the buckets are ordered by the doc_count in decreasing order.


Why do I get the "Fielddata is disabled on text fields by default" error?

Because fielddata is disabled on text fields by default. If you have not explicitly chosen a field type mapping, the field has the default dynamic mapping for string fields.

So, instead of writing "field": "your_field" you need to have "field": "your_field.keyword".
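
A sketch of choosing the aggregatable field name from a mapping, assuming the default dynamic mapping for strings (a text field with a keyword sub-field). The helper `aggregatable_field` and the sample mapping are hypothetical; in practice the properties would come from the index's own mapping.

```python
def aggregatable_field(properties, field):
    """Pick the field name to use in a terms aggregation.

    `properties` is the "properties" object of an index mapping, already parsed.
    """
    spec = properties[field]
    if spec.get("type") != "text":
        return field  # not a text field, so it is used as-is
    # A text field: look for a sub-field of type keyword (named "keyword" by default).
    for sub_name, sub_spec in spec.get("fields", {}).items():
        if sub_spec.get("type") == "keyword":
            return "%s.%s" % (field, sub_name)
    raise ValueError("text field %r has no keyword sub-field to aggregate on" % field)


# Sample mapping: your_field was mapped dynamically, status was mapped as keyword explicitly.
properties = {
    "your_field": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "status": {"type": "keyword"},
}

print(aggregatable_field(properties, "your_field"))
print(aggregatable_field(properties, "status"))
```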
