Ways to implement data versioning in MongoDB

2019-01-02 16:08发布

Can you share your thoughts how would you implement data versioning in MongoDB. (I've asked similar question regarding Cassandra. If you have any thoughts which db is better for that please share)

Suppose that I need to version records in an simple address book. (Address book records are stored as flat json objects). I expect that the history:

  • will be used infrequently
  • will be used all at once to present it in a "time machine" fashion
  • there won't be more versions than few hundred to a single record. history won't expire.

I'm considering the following approaches:

  • Create a new object collection to store history of records or changes to the records. It would store one object per version with a reference to the address book entry. Such records would looks as follows:

    {
     '_id': 'new id',
     'user': user_id,
     'timestamp': timestamp,
     'address_book_id': 'id of the address book record' 
     'old_record': {'first_name': 'Jon', 'last_name':'Doe' ...}
    }
    

    This approach can be modified to store an array of versions per document. But this seems to be slower approach without any advantages.

  • Store versions as serialized (JSON) object attached to address book entries. I'm not sure how to attach such objects to MongoDB documents. Perhaps as an array of strings. (Modelled after Simple Document Versioning with CouchDB)

9条回答
无与为乐者.
2楼-- · 2019-01-02 16:59

If you're looking for a ready-to-roll solution -

Mongoid has built in simple versioning

http://mongoid.org/en/mongoid/docs/extras.html#versioning

mongoid-history is a Ruby plugin that provides a significantly more complicated solution with auditing, undo and redo

https://github.com/aq1018/mongoid-history

查看更多
琉璃瓶的回忆
3楼-- · 2019-01-02 16:59

I have used the below package for a meteor/MongoDB project, and it works well, the main advantage is that it stores history/revisions within an array in the same document, hence no need for an additional publications or middleware to access change-history. It can support a limited number of previous versions (ex. last ten versions), it also supports change-concatenation (so all changes happened within a specific period will be covered by one revision).

nicklozon/meteor-collection-revisions

Another sound option is to use Meteor Vermongo (here)

查看更多
心情的温度
4楼-- · 2019-01-02 17:08

Here's another solution using a single document for the current version and all old versions:

{
    _id: ObjectId("..."),
    data: [
        { vid: 1, content: "foo" },
        { vid: 2, content: "bar" }
    ]
}

data contains all versions. The data array is ordered, new versions will only get $pushed to the end of the array. data.vid is the version id, which is an incrementing number.

Get the most recent version:

find(
    { "_id":ObjectId("...") },
    { "data":{ $slice:-1 } }
)

Get a specific version by vid:

find(
    { "_id":ObjectId("...") },
    { "data":{ $elemMatch:{ "vid":1 } } }
)

Return only specified fields:

find(
    { "_id":ObjectId("...") },
    { "data":{ $elemMatch:{ "vid":1 } }, "data.content":1 }
)

Insert new version: (and prevent concurrent insert/update)

update(
    {
        "_id":ObjectId("..."),
        $and:[
            { "data.vid":{ $not:{ $gt:2 } } },
            { "data.vid":2 }
        ]
    },
    { $push:{ "data":{ "vid":3, "content":"baz" } } }
)

2 is the vid of the current most recent version and 3 is the new version getting inserted. Because you need the most recent version's vid, it's easy to do get the next version's vid: nextVID = oldVID + 1.

The $and condition will ensure, that 2 is the latest vid.

This way there's no need for a unique index, but the application logic has to take care of incrementing the vid on insert.

Remove a specific version:

update(
    { "_id":ObjectId("...") },
    { $pull:{ "data":{ "vid":2 } } }
)

That's it!

(remember the 16MB per document limit)

查看更多
登录 后发表回答