Aggregating array of values in elasticsearch

2019-04-08 20:04发布

I need to aggregate an array as follows

Two document examples:

{
    "_index": "log",
    "_type": "travels",
    "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
    "_score": 1,
    "_source": {
        "state": "saopaulo",
        "date": "2014-10-30T17",
        "traveler": "patrick",
        "registry": "123123",
        "cities": {
            "saopaulo": 1,
            "riodejaneiro": 2,
            "total": 2
        },
        "reasons": [
            "Entrega de encomenda"
        ],
        "from": [
            "CompraRapida"
        ]
    }
},
{
    "_index": "log",
    "_type": "travels",
    "_id": "tnQsGy4lS0K6uT3Hwzzo-g",
    "_score": 1,
    "_source": {
        "state": "saopaulo",
        "date": "2014-10-31T17",
        "traveler": "patrick",
        "registry": "123123",
        "cities": {
            "saopaulo": 1,
            "curitiba": 1,
            "total": 2
        },
        "reasons": [
            "Entrega de encomenda"
        ],
        "from": [
            "CompraRapida"
        ]
    }
},

I want to aggregate the cities array, to find out all the cities the traveler has gone to. I want something like this:

{
    "traveler":{
        "name":"patrick"
    },
    "cities":{
        "saopaulo":2,
        "riodejaneiro":2,
        "curitiba":1,
        "total":3
    }
}

Where the total is the length of the cities array minus 1. I tried the terms aggregation and the sum, but couldn't output the desired output.

Changes in the document structure can be made, so if anything like that would help me, I'd be pleased to know.

1条回答
ゆ 、 Hurt°
2楼-- · 2019-04-08 20:21

in the document posted above "cities" is not a json array , it is a json object. If changing the document structure is a possibility I would change cities in the document to be an array of object

example document:

 cities : [
   {
     "name" :"saopaulo"
     "visit_count" :"2",

   },
   {
     "name" :"riodejaneiro"
     "visit_count" :"1",

   }
]

You would then need to set cities to be of type nested in the index mapping

   "mappings": {
         "<type_name>": {
            "properties": {
               "cities": {
                  "type": "nested",
                  "properties": {
                     "city": {
                        "type": "string"
                     },
                     "count": {
                        "type": "integer"
                     },
                     "value": {
                        "type": "long"
                     }
                  }
               },
               "date": {
                  "type": "date",
                  "format": "dateOptionalTime"
               },
               "registry": {
                  "type": "string"
               },
               "state": {
                  "type": "string"
               },
               "traveler": {
                  "type": "string"
               }
            }
         }
      }

After which you could use nested aggregation to get the city count per user. The query would look something on these lines :

{
   "query": {
      "match": {
         "traveler": "patrick"
      }
   },
   "aggregations": {
      "city_travelled": {
         "nested": {
            "path": "cities"
         },
         "aggs": {
            "citycount": {
               "cardinality": {
                  "field": "cities.city"
               }
            }
         }
      }
   }
}
查看更多
登录 后发表回答