This is related to my original question here: Elasticsearch Delete Mapping Property
From that post, I am assuming that I will have to "reindex" my data. What is a safe strategy for doing this?
To summarize from the original post I am trying to take the mapping from:
{
"propVal1": {
"type": "double",
"index": "analyzed"
},
"propVal2": {
"type": "string",
"analyzer": "keyword"
},
"propVal3": {
"type": "string",
"analyzer": "keyword"
}
}
to this:
{
"propVal1": {
"type": "double",
"index": "analyzed"
},
"propVal2": {
"type": "string",
"analyzer": "keyword"
}
}
In the process, all data stored for the removed property should be removed as well.
I have been contemplating using the REST API for this. This seems dangerous though since you are going to need to synchronize state with the client application making the REST calls, i.e. you need to send all of your documents to the client, modify them, and send them back.
What would be ideal is if there was a server side operation that could move and transform types around. Does something like this exist or am I missing something obvious with the "reindexing"?
Another approach would be to flag the data as no longer valid. Are there any built-in flags for this in the mapping, or is it necessary to create an auxiliary type to record whether a property of another type is valid?
You can have a look at the elasticsearch-reindex plugin. A more manual approach is to use the scan & scroll API to read back your original content and the bulk API to index it into a new index or type.
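As a rough sketch of the manual route, the transformation step between scroll and bulk could look like the following. It only builds the bulk request body from one page of hits, so it runs without a cluster. The index name `myindex_v2`, the type name `doc`, and the helper `build_bulk_body` are made up for illustration. The hits are assumed to have the `_id` / `_type` / `_source` shape that a search or scroll response returns.

```python
import json

def build_bulk_body(hits, new_index, dropped_field):
    """Turn one page of scroll hits into a bulk API request body (NDJSON)."""
    lines = []
    for hit in hits:
        # Copy the source without the field that was removed from the mapping.
        source = {k: v for k, v in hit["_source"].items() if k != dropped_field}
        # Reuse the original id so re-running the job overwrites documents
        # instead of duplicating them.
        meta = {"_index": new_index, "_id": hit["_id"]}
        if "_type" in hit:
            meta["_type"] = hit["_type"]
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(source))           # document line
    # The bulk API expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n" if lines else ""

# Hypothetical page of hits, as it might come back from a scroll request.
sample_hits = [
    {"_id": "1", "_type": "doc",
     "_source": {"propVal1": 1.5, "propVal2": "a", "propVal3": "x"}},
    {"_id": "2", "_type": "doc",
     "_source": {"propVal1": 2.0, "propVal2": "b"}},
]
bulk_body = build_bulk_body(sample_hits, "myindex_v2", "propVal3")
print(bulk_body)
```

You would POST each such body to the `_bulk` endpoint and keep scrolling until a page comes back empty. The new index should be created with the new mapping before the first bulk request.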
Finally, how did you get your docs into Elasticsearch in the first place? If you already have a data source somewhere, just use the same process as before. If you don't want any downtime, put an alias on top of your old index and, once the reindex is done, move the alias to the new index.
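For the alias switch, a minimal sketch of the request body is shown below. The alias and index names (`myindex`, `myindex_v1`, `myindex_v2`) and the helper `alias_swap_body` are placeholders, not anything from your setup. Sending the `remove` and `add` actions in a single request to the `_aliases` endpoint makes the switch atomic, so clients that query through the alias never see a moment where it points at nothing.

```python
import json

def alias_swap_body(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves an alias in one step."""
    actions = [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]
    return json.dumps({"actions": actions})

# Hypothetical names: the application only ever talks to "myindex".
swap_body = alias_swap_body("myindex", "myindex_v1", "myindex_v2")
print(swap_body)
```

Once the alias has moved and you have checked the new index, the old index can be deleted.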