elasticsearch aggregations separated words

Posted 2019-07-13 08:46

Question:

I simply ran an aggregation in a browser plugin (Marvel). As you can see in the picture below, only one doc matches the query, but the aggregation buckets are split on spaces. That doesn't make sense to me: I want to aggregate across different docs. In this scenario there should be only one group, with count 1 and key "Drow Ranger". What is the correct way to do this in Elasticsearch?

Answer 1:

It's probably because your heroname field is analyzed and thus "Drow Ranger" gets tokenized and indexed as "drow" and "ranger".
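
To make the effect concrete, here is a rough sketch that mimics the analysis step outside of Elasticsearch. It assumes that, for this input, the default analyzer behaves roughly like "lowercase, then split on spaces", which is a simplification of what the real analyzer does. The function names analyzed_terms and not_analyzed_terms are made up for illustration.

```shell
#!/bin/sh
# Rough simulation only: approximates an analyzed string field by
# lowercasing the value and splitting it on spaces (one term per line).
analyzed_terms() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '\n'
}

# A not_analyzed field keeps the whole value as a single term.
not_analyzed_terms() {
    printf '%s\n' "$1"
}

echo "analyzed:"
analyzed_terms "Drow Ranger"
echo "not_analyzed:"
not_analyzed_terms "Drow Ranger"
```

A terms aggregation builds one bucket per indexed term, so the first form gives two buckets for a single document, while the second gives the single "Drow Ranger" bucket the question expects.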

One way to get around this is to transform your heroname field to a multi-field with an analyzed part (the one you search on with the wildcard query) and another not_analyzed part (the one you can aggregate on).

You should create your index like this and specify the proper mapping for your heroname field:

curl -XPUT localhost:9200/dota2 -d '{
    "mappings": {
        "agust": {
            "properties": {
                "heroname": {
                    "type": "string",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                },
                ... your other fields go here
            }
        }
    }
}'

Then you can run your aggregation on the heroname.raw field instead of the heroname field.

UPDATE

If you just want to try this on the heroname field, you can update that field's mapping without recreating the whole index. The following command simply adds the new heroname.raw sub-field to your existing heroname field. Note that you still have to reindex your data afterwards, because documents that are already indexed do not get the new sub-field retroactively:

curl -XPUT localhost:9200/dota2/_mapping/agust -d '{
    "properties": {
        "heroname": {
            "type": "string",
            "fields": {
                "raw": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}'

Then you can keep using heroname in your wildcard query, but your aggregation will look like this:

{
    "aggs": {
        "asd": {
            "terms": {
                "field": "heroname.raw",    <--- use the raw field here
                "size": 0
            }
        }
    }
}
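
For completeness, here is a sketch of how the wildcard query and the aggregation might be combined into a single search request. The wildcard pattern "drow*" is an assumed example (the original query is only visible in the screenshot), and the search_body helper is a hypothetical name. The index and type names are the ones used in the mapping commands above.

```shell
#!/bin/sh
# Hypothetical search body: the wildcard pattern "drow*" is an assumed
# example, not taken from the original question.
search_body() {
    cat <<'EOF'
{
    "query": {
        "wildcard": { "heroname": "drow*" }
    },
    "aggs": {
        "asd": {
            "terms": { "field": "heroname.raw", "size": 0 }
        }
    }
}
EOF
}

# Print the body. To send it to a local node, pipe it to curl, for example:
#   search_body | curl -XPOST 'localhost:9200/dota2/agust/_search?pretty' -d @-
search_body
```

The query still runs against the analyzed heroname field, which is why the pattern is lowercase, while the buckets are built from the untouched heroname.raw values.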