I want to have i18n support for objects stored in a MongoDB collection.
Currently our schema looks like this:
{
    _id: "id",
    name: "name",
    localization: [{
        lan: "en-US",
        name: "name_in_english"
    }, {
        lan: "zh-TW",
        name: "name_in_traditional_chinese"
    }]
}
But my thought is that since the "lan" field is unique, can I just use it as a key? The structure would then be:
{
    _id: "id",
    name: "name",
    localization: {
        "en-US": "name_in_english",
        "zh-TW": "name_in_traditional_chinese"
    }
}
which would be neater and easier to parse (just localization[language] would get the value I want for a specific language).
But then the question is: is this good practice for storing data in MongoDB? And how do I pass the JSON-schema check?
Obviously the second schema example is much better for your task (provided, of course, that the lan field is unique as you mentioned, which seems true to me as well). Getting an element from a dictionary/associative array/mapping/whatever it is called in your language is much cheaper than scanning a whole array of values. In this case it is also more efficient from a storage-size point of view: remember that all fields are stored in MongoDB as-is, so every record holds the full key name of each JSON field, not some compressed representation or index.

My experience shows that MongoDB is mature enough to be used as the main storage for your application, even under high load (whatever that means ;) ). The main problem is how you fight database-level locks (well, we'll wait for the promised collection-level locks, which I hope will speed MongoDB up a lot), though data loss is possible if your MongoDB cluster is built badly (dig into the docs and articles on the Internet for more information).

As for the schema check, you must do it in your programming language on the application side before inserting records; yeah, that's why Mongo is called schemaless.
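A minimal sketch of such an application-side check in mongo shell JavaScript; the collection name items and the language-tag regex are assumptions for illustration, not part of the answer above:

// Validate the object-keyed "localization" field before inserting.
// "items" is a hypothetical collection name.
function isValidLocalization(localization) {
    // Rough BCP 47-style language tags such as "en-US" or "zh-TW".
    var languageTag = /^[a-z]{2,3}(-[A-Za-z0-9]{2,8})*$/;
    return Object.keys(localization).every(function (key) {
        return languageTag.test(key) && typeof localization[key] === 'string';
    });
}

var doc = {
    _id: "id",
    name: "name",
    localization: {
        "en-US": "name_in_english",
        "zh-TW": "name_in_traditional_chinese"
    }
};

if (isValidLocalization(doc.localization)) {
    db.items.insert(doc);
}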
There is a case where an object is necessarily better than an array: supporting upserts into a set. For example, say you want to update an item having name 'item1' to have val 100, or insert such an item if one doesn't exist, all in one atomic operation. With an array, you'd have to do one of two operations. Given a schema like the one sketched below, you'd have the two commands that follow it.
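A minimal sketch of that array-based shape and the two commands, assuming a hypothetical collection named coll whose documents embed an items array:

// Array-based document shape (hypothetical example).
{
    _id: 1,
    items: [
        { name: 'item1', val: 50 }
    ]
}

// Either update the matching element in place with the positional operator...
db.coll.update(
    { _id: 1, 'items.name': 'item1' },
    { $set: { 'items.$.val': 100 } }
)

// ...or push a new element when no item with that name exists yet.
db.coll.update(
    { _id: 1 },
    { $push: { items: { name: 'item1', val: 100 } } }
)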
You'd have to query first to know which one is needed, which can exacerbate race conditions unless you implement some versioning. With an object, you can simply do the single update sketched below.
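A sketch of the single atomic upsert with the object form (same hypothetical coll collection, items now an object keyed by name):

// Sets items.item1 to 100, creating the key if it does not exist yet.
db.coll.update(
    { _id: 1 },
    { $set: { 'items.item1': 100 } }
)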
If this is a use case you have, then you should go with the object approach. One drawback is that querying for a specific name requires scanning; if that is also needed, you can add a separate array specifically for indexing. This is a trade-off with MongoDB. Upserts would then become something like the command below.
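A sketch of that combined upsert, with a hypothetical itemNames array maintained purely for indexing and querying:

// Keep the object for atomic upserts and mirror the key into an array.
db.coll.update(
    { _id: 1 },
    {
        $set: { 'items.item1': 100 },
        $addToSet: { itemNames: 'item1' }
    }
)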
and the query would then simply run against that array, as sketched below.
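Sketch of the query, which an index on itemNames can serve:

db.coll.find( { itemNames: 'item1' } )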
(Note: the $ positional operator does not support array upserts.)

It is not good practice to have values as keys. The language codes are values, and as you say you cannot validate them against a schema. It also makes querying against them impossible: for example, you can't figure out whether you have a translation for "nl-NL", because you can't compare against keys, and neither is it possible to easily index this. You should always have descriptive keys.
However, as you say, having the languages as keys makes it a lot easier to pull the data out, as you can just access it by ['nl-NL'] (or whatever your language's syntax is). I would suggest an alternative schema:
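A sketch of that alternative shape, inferred from the queries below: one document per translation, stored in its own collection (here called so, with your_id pointing back at the original document):

// One document per (object, language) pair in a dedicated collection.
{
    your_id: "id_for_name",
    lan: "en-US",
    name: "name_in_english"
}
{
    your_id: "id_for_name",
    lan: "zh-TW",
    name: "name_in_traditional_chinese"
}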
Now you can:

- create an index on { your_id: 1, lan: 1 } for speedy lookups (index creation sketched after this list)
- query the translation for one specific language: db.so.find( { your_id: "id_for_name", lan: 'en-US' } )
- query all translations for the name: db.so.find( { your_id: "id_for_name" } )
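Creating that compound index might look like this (a sketch; createIndex is the current shell helper, older shells used ensureIndex):

db.so.createIndex( { your_id: 1, lan: 1 } )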
You can also much more easily update the translation for a specific language:
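A sketch of that per-language update against the same so collection (the new value is made up):

db.so.update(
    { your_id: "id_for_name", lan: 'en-US' },
    { $set: { name: 'new_name_in_english' } }
)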
Neither of those points is possible with your suggested schemas.