Elasticsearch queries on “empty index”

Posted 2019-06-02 05:13

Question:

In my application I use several elasticsearch indices that contain no indexed documents in their initial state. I think that can be called "empty" :) The document mapping is correct and working.

The application also has a relational database that contains entities that MIGHT have associated documents in elasticsearch.

In the initial state of the application it is very common that there are only entities without documents, so not a single document has been indexed, hence "empty index". The index has been created nevertheless, and the document mapping has been put to the index and is present in the index's metadata.

Anyway, when I query elasticsearch with a SearchQuery to find a document for one of the entities (the document contains a unique id from the entity), elasticsearch throws an ElasticSearchException that complains about no mapping being present for field xy etc.

BUT IF I insert a single blank document into the index first, the query won't fail.

Is there a way to "initialize" an index in a way to prevent the query from failing and to get rid of the silly "dummy document workaround"?

UPDATE: Also, the dummy doc workaround pollutes the index; for example, a count query now always returns +1, so I added a deletion to the workaround as well...

Answer 1:

Your question lacks details and is not clear. If you had provided a gist of your index schema and query, that would have helped. You should also have provided the version of elasticsearch that you are using.

The "No mapping" exception that you mention has nothing to do with initializing the index with some data. Most likely you are sorting on a field that doesn't exist. This is common if you are querying multiple indexes at once.

Solution: The solution depends on your version of elasticsearch. If you are on 1.3.x or lower, use ignore_unmapped. If you are on 1.4 or later, use unmapped_type. The official documentation on sorting describes both options.

If you find the documentation confusing, this example will make it clear:

Let's create two indexes, testindex1 and testindex2:

curl -XPUT localhost:9200/testindex1 -d '{"mappings":{"type1":{"properties":{"firstname":{"type":"string"},"servers":{"type":"nested","properties":{"name":{"type":"string"},"location":{"type":"nested","properties":{"name":{"type":"string"}}}}}}}}}'

curl -XPUT localhost:9200/testindex2 -d '{"mappings":{"type1":{"properties":{"firstname":{"type":"string"},"computers":{"type":"nested","properties":{"name":{"type":"string"},"location":{"type":"nested","properties":{"name":{"type":"string"}}}}}}}}}'

The only difference between these two indexes is that testindex1 has a "servers" field and testindex2 has a "computers" field.

Now let's insert test data in both the indexes.

Index test data on testindex1:

curl -XPUT localhost:9200/testindex1/type1/1 -d '{"firstname":"servertom","servers":[{"name":"server1","location":[{"name":"location1"},{"name":"location2"}]},{"name":"server2","location":[{"name":"location1"}]}]}'

curl -XPUT localhost:9200/testindex1/type1/2 -d '{"firstname":"serverjerry","servers":[{"name":"server2","location":[{"name":"location5"}]}]}'

Index test data on testindex2:

curl -XPUT localhost:9200/testindex2/type1/1 -d '{"firstname":"computertom","computers":[{"name":"computer1","location":[{"name":"location1"},{"name":"location2"}]},{"name":"computer2","location":[{"name":"location1"}]}]}'

curl -XPUT localhost:9200/testindex2/type1/2 -d '{"firstname":"computerjerry","computers":[{"name":"computer2","location":[{"name":"location5"}]}]}'

Query examples:

  1. Using "unmapped_type" for elasticsearch version >= 1.4

    curl -XPOST 'localhost:9200/testindex2/_search?pretty' -d '{"fields":["firstname"],"query":{"match_all":{}},"sort":[{"servers.location.name":{"order":"desc","unmapped_type":"string"}}]}'

  2. Using "ignore_unmapped" for elasticsearch version <= 1.3.x

    curl -XPOST 'localhost:9200/testindex2/_search?pretty' -d '{"fields":["firstname"],"query":{"match_all":{}},"sort":[{"servers.location.name":{"order":"desc","ignore_unmapped":"true"}}]}'

Output of query1:

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [ {
      "_index" : "testindex2",
      "_type" : "type1",
      "_id" : "1",
      "_score" : null,
      "fields" : {
        "firstname" : [ "computertom" ]
      },
      "sort" : [ null ]
    }, {
      "_index" : "testindex2",
      "_type" : "type1",
      "_id" : "2",
      "_score" : null,
      "fields" : {
        "firstname" : [ "computerjerry" ]
      },
      "sort" : [ null ]
    } ]
  }
}

Output of query2:

{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [ {
      "_index" : "testindex2",
      "_type" : "type1",
      "_id" : "1",
      "_score" : null,
      "fields" : {
        "firstname" : [ "computertom" ]
      },
      "sort" : [ -9223372036854775808 ]
    }, {
      "_index" : "testindex2",
      "_type" : "type1",
      "_id" : "2",
      "_score" : null,
      "fields" : {
        "firstname" : [ "computerjerry" ]
      },
      "sort" : [ -9223372036854775808 ]
    } ]
  }
}

Note:

  1. These examples were created on elasticsearch 1.4.
  2. These examples also demonstrate how to do sorting on nested fields.
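
The version-dependent choice between the two sort options can be sketched as a small shell helper. This is only an illustration under assumptions: the function names sort_clause and search_body are hypothetical, the version is passed in by hand rather than read from the cluster, and "string" is used as the unmapped type only because the mappings above use it (later elasticsearch versions use different type names).

```shell
#!/bin/sh
# Hypothetical helper: print a sort clause for FIELD that tolerates a
# missing mapping. VERSION is a version string such as "1.3.5" or "1.4.0".
sort_clause() {
  field=$1
  version=$2
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # text before the next dot
  if [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 4 ]; }; then
    # 1.4 and later: name the type to assume for the unmapped field
    option='"unmapped_type":"string"'
  else
    # 1.3.x and earlier: just ignore the unmapped field
    option='"ignore_unmapped":true'
  fi
  printf '{"%s":{"order":"desc",%s}}' "$field" "$option"
}

# Hypothetical helper: wrap the clause in a match_all search body.
search_body() {
  printf '{"query":{"match_all":{}},"sort":[%s]}' "$(sort_clause "$1" "$2")"
}
```

The generated body could then be passed to curl in place of the hand-written JSON, for example with -d "$(search_body servers.location.name 1.4.0)".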


Answer 2:

Are you doing a sort when you search? I've run into the same issue ("No mapping found for [field] in order to sort on"), but only when trying to sort results. In that case, the solution is simply to add the ignore_unmapped: true property to the sort parameter in your query:

{
  ...
  "body": {
    ...
    "sort": [
      {"field_name": {
        "order": "asc",
        "ignore_unmapped": true
      }}
    ]
    ...
  }
  ...
}

I found my solution here: No mapping found for field in order to sort on in ElasticSearch