In Elasticsearch 5.6, I have created multiple indices from the same 15k documents. Specifically, there are 4 indices that all share the same mapping, settings, and content.
3 of the 4 have index sizes of ~1.0 GB, but index_1 has a size of only 52 MB. I've compared searches across the 4 indices, and index_1 returns fewer documents than the others for identical searches. I've seen anywhere from 1% to 80% fewer documents per query.
At this point, I don't trust the docs.count or the store.size_in_bytes on index_1 or on the others. This leaves me creating multiple indices just to get a single reliable answer.
Why would one of these indices be so off from the others?
From GET index_1,index_2,index_3,index_4/_stats/docs,store:
{
  "_shards": {
    "total": 72,
    "successful": 72,
    "failed": 0
  },
  "_all": { // Removed the primaries as they were identical (no replicas)
    "total": {
      "docs": {
        "count": 60000,
        "deleted": 0
      },
      "store": {
        "size_in_bytes": 3276001895,
        "throttle_time_in_millis": 0
      }
    }
  },
  "indices": {
    "index_1": {
      "total": {
        "docs": {
          "count": 15000,
          "deleted": 0
        },
        "store": {
          "size_in_bytes": 54854206, // <-- What is going on here?!?
          "throttle_time_in_millis": 0
        }
      }
    },
    "index_2": {
      "total": {
        "docs": {
          "count": 15000,
          "deleted": 0
        },
        "store": {
          "size_in_bytes": 1075205322,
          "throttle_time_in_millis": 0
        }
      }
    },
    "index_3": {
      "total": {
        "docs": {
          "count": 15000,
          "deleted": 0
        },
        "store": {
          "size_in_bytes": 1072993635,
          "throttle_time_in_millis": 0
        }
      }
    },
    "index_4": {
      "total": {
        "docs": {
          "count": 15000,
          "deleted": 0
        },
        "store": {
          "size_in_bytes": 1072948732,
          "throttle_time_in_millis": 0
        }
      }
    }
  }
}
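
To make the discrepancy easier to see, here is a rough Python sketch that takes the per-index numbers from the response above and flags any index whose stored bytes per document is far from the median. The function name flag_size_outliers and the 50% tolerance are my own assumptions for illustration, not anything Elasticsearch provides, and the dict is just the stats output trimmed to the fields used.

```python
from statistics import median


def flag_size_outliers(stats, tolerance=0.5):
    """Return {index_name: bytes_per_doc} for indices whose bytes per
    document differ from the median by more than `tolerance` (a fraction).

    `stats` is the parsed body of a _stats/docs,store response.
    The helper name and the tolerance are illustrative assumptions.
    """
    per_doc = {}
    for name, body in stats["indices"].items():
        total = body["total"]
        count = total["docs"]["count"]
        size = total["store"]["size_in_bytes"]
        # Guard against empty indices to avoid dividing by zero.
        per_doc[name] = size / count if count else 0.0

    mid = median(per_doc.values())
    if not mid:
        return {}
    return {
        name: value
        for name, value in per_doc.items()
        if abs(value - mid) / mid > tolerance
    }


# Values copied from the _stats response above (other fields omitted).
stats = {
    "indices": {
        "index_1": {"total": {"docs": {"count": 15000},
                              "store": {"size_in_bytes": 54854206}}},
        "index_2": {"total": {"docs": {"count": 15000},
                              "store": {"size_in_bytes": 1075205322}}},
        "index_3": {"total": {"docs": {"count": 15000},
                              "store": {"size_in_bytes": 1072993635}}},
        "index_4": {"total": {"docs": {"count": 15000},
                              "store": {"size_in_bytes": 1072948732}}},
    }
}

outliers = flag_size_outliers(stats)
for name, bytes_per_doc in outliers.items():
    print(f"{name}: {bytes_per_doc:.0f} bytes per document")
```

With these numbers only index_1 is flagged, at roughly 3.7 KB per document versus about 71.5 KB for the other three, even though all four report the same docs.count.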