How to properly define a hashmap in ElasticSearch

Posted 2019-07-12 19:20

Question:

I am using dynamic mapping in ElasticSearch.

I use it because my index mapping is long and I need the template feature to avoid updating definitions in multiple places (see my other SO question).

As an example, I am sending the following JSON (a Zoo object), which contains a HashMap:

PUT 127.0.0.1:9200/myIndex/Zoo/10
{
  "id" : 1,
  "Name" : "Madagascar",
  "map" : {
    "1" : {
      "id" : -4944060986111146989,
      "name" : null
    },
    "2" : {
      "id" : 5073063561743125202,
      "name" : null
    },
    "3" : {
      "id" : -1777985506870671559,
      "name" : null
    }
  }
}

This creates the following index mapping:

{
    "mm3_v2": {
        "mappings": {
            "Zoo": {
                "properties": {
                    "Name": {
                        "type": "string"
                    },
                    "id": {
                        "type": "long"
                    },
                    "map": {
                        "properties": {
                            "1": {
                                "properties": {
                                    "id": {
                                        "type": "long"
                                    }
                                }
                            },
                            "2": {
                                "properties": {
                                    "id": {
                                        "type": "long"
                                    }
                                }
                            },
                            "3": {
                                "properties": {
                                    "id": {
                                        "type": "long"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

In this example the inner object in the hashmap is short.
In real life the objects in my hashmap can be long.

This can create huge index files (easily 1M rows), and for each object it repeats exactly the same definition.
(By contrast, when the objects are stored in a list, the mapping does not repeat itself.)

Is there a way to properly define a hashmap in Elasticsearch?

Answer 1:

"Is there a way to properly define a hashmap in Elasticsearch?"

In terms of handling objects, Elasticsearch has the Object Type and the Nested Type. Nested objects are treated as separate documents and the documentation provides good examples for understanding the advantages (and disadvantages).

I think Elasticsearch's dynamic templates might be worth exploring for your case if you want to fine-tune how newly added key-value pairs are treated. You did mention, though, that the nested objects have a rigid definition?
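
As a rough sketch of what such a dynamic template could look like, the Python snippet below builds a mapping body in which every object whose path starts with `map.` is mapped by one shared rule. The template name `map_entries` and the choice of `"enabled": false` in the rule are assumptions made for illustration, and the typed layout (`Zoo`) simply follows the mapping shown in the question. Each key would still get its own entry in the mapping; the template only controls what that entry looks like.

```python
import json

# Hypothetical template name. path_match and match_mapping_type are
# dynamic-template options; the chosen rule is an assumption for illustration.
map_entry_template = {
    "map_entries": {
        "path_match": "map.*",           # full dotted path starts with "map."
        "match_mapping_type": "object",  # apply only when the value is a JSON object
        "mapping": {
            "type": "object",
            "enabled": False,            # do not parse or index the inner fields
        },
    }
}

mapping_body = {
    "mappings": {
        "Zoo": {
            "dynamic_templates": [map_entry_template],
            "properties": {
                "id": {"type": "long"},
                "Name": {"type": "string"},
            },
        }
    }
}

# Serialize to the JSON that would be sent when creating the index
print(json.dumps(mapping_body, indent=2))
```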

"This can create huge index files (easily 1M rows), and for each object it repeats exactly the same definition. (By contrast, when the objects are stored in a list, the mapping does not repeat itself.)"

Are you referring to the large mappings that will get created? As you mentioned, that can be avoided with the Array type, so would it be possible to change the structure of your map object to avoid this? Or are the map keys not as simple as 1, 2, 3, ...?
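
The restructuring suggested here can be sketched as follows: a minimal Python example, assuming the map keys can be moved into a field of each entry. The function name `map_to_entries` and the field name `key` are hypothetical, chosen only for illustration. Because every element of the resulting list has the same field names, the mapping should need only one definition for the entries, however many there are.

```python
import json

def map_to_entries(doc, map_field="map", key_field="key"):
    """Return a copy of doc with the hashmap at map_field turned into a list.

    Each former map key is stored inside its entry under key_field
    (a hypothetical field name), so all entries share one set of field names.
    """
    converted = dict(doc)  # shallow copy; the input document is left untouched
    converted[map_field] = [
        {key_field: key, **value}
        for key, value in doc.get(map_field, {}).items()
    ]
    return converted

# The Zoo document from the question
zoo = {
    "id": 1,
    "Name": "Madagascar",
    "map": {
        "1": {"id": -4944060986111146989, "name": None},
        "2": {"id": 5073063561743125202, "name": None},
        "3": {"id": -1777985506870671559, "name": None},
    },
}

print(json.dumps(map_to_entries(zoo), indent=2))
```

If entries need to be matched individually (for example, `key` and `id` within the same entry), the list would have to be mapped with the Nested Type rather than the default Object Type.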



Answer 2:

If you are not querying the field, you can set "enabled": false in the mapping for that field:

  "map": {
    "type": "object",
    "enabled": false
  }
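
To show where this fragment would sit, here is a minimal sketch in Python that builds a complete mapping body around it. The surrounding layout (the `Zoo` type and the `id` and `Name` fields) is taken from the mapping in the question; the variable name `mapping_body` is only for illustration. With `enabled` set to false, the contents of `map` are still kept in `_source` and returned with the document, but they are not parsed or indexed, so no per-key definitions are added to the mapping.

```python
import json

mapping_body = {
    "mappings": {
        "Zoo": {
            "properties": {
                "id": {"type": "long"},
                "Name": {"type": "string"},
                # The whole hashmap is kept in _source but not parsed,
                # so its keys never appear in the mapping.
                "map": {"type": "object", "enabled": False},
            }
        }
    }
}

# Serialize to the JSON that would be sent as the request body
print(json.dumps(mapping_body, indent=2))
```

The trade-off is that nothing inside `map` can be searched or aggregated on, so this only fits if the field is purely carried along with the document.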