I need to convert fields like this:
{
"_id" : ObjectId("576fd6e87d33ed2f37a6d526"),
"phoneme" : "JH OY1 N Z"
}
into an array of substrings like this:
{
"_id" : ObjectId("576fd6e87d33ed2f37a6d526"),
"phonemes" : [ "JH", "OY1", "N", "Z" ]
}
and sometimes into an array of characters like this:
{
"_id" : ObjectId("576fd6e87d33ed2f37a6d526"),
"phonemes" : ["J", "H", " ", "O", "Y", "1", " ", "N", " ", "Z"]
}
I found some code here which converts a string into an array, but it's too simple for my purposes because it only ever creates a single array element.
db.members.find().snapshot().forEach( function (x) {
    x.photos = [{"uri": "/images/" + x.photos}];
    db.members.save(x);
});
Is the entire JavaScript language available to me from within mongo shell statements?
Much easier than I thought: just use JavaScript's split function. Boom!
db.temp.find().snapshot().forEach( function (el) {
    el.phonemes = el.phoneme.split(' ');
    db.temp.save(el);
});
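The question also asks for an array of single characters. Here is a minimal sketch of both conversions in plain JavaScript, which can be tried outside the database first. The helper names toPhonemes and toChars are made up for illustration and are not part of the mongo shell.

```javascript
// Hypothetical helpers, not part of the mongo shell API.

// Split on single spaces: one array element per phoneme.
function toPhonemes(str) {
  return str.split(' ');
}

// Split into individual characters, spaces included.
// Note: split('') works on UTF-16 code units, so a character outside the
// Basic Multilingual Plane would be cut into two halves.
function toChars(str) {
  return str.split('');
}
```

Inside the forEach callback above, something like el.phonemes = toChars(el.phoneme) would then store the character array instead of the substring array.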
Suppose that the documents in our collection look like this:
{ "phoneme" : "JH OY1 N Z" }
{ "phoneme" : "foobar" }
In version 3.4+, we can use the $split operator to divide the field value into an array of substrings.
To split a string into an array of characters, we use the $map operator to apply a $substrCP expression to every character index of the string.
The array of indexes, that is, all integers from 0 to the string's length minus one, can be generated with the $range and $strLenCP operators.
We use the $addFields pipeline stage to add the new fields to the initial documents, but to make the result persistent we can either create a view or overwrite our collection using the $out aggregation pipeline stage.
[
    {
        "$addFields": {
            "arrayOfPhonemeChar": {
                "$map": {
                    "input": {
                        "$range": [ 0, { "$strLenCP": "$phoneme" } ]
                    },
                    "in": {
                        "$substrCP": [ "$phoneme", "$$this", 1 ]
                    }
                }
            },
            "phonemeSubstrArray": {
                "$split": [ "$phoneme", " " ]
            }
        }
    }
]
yields something that looks like this:
{
"phoneme" : "JH OY1 N Z",
"arrayOfPhonemeChar" : ["J", "H", " ", "O", "Y", "1", " ", "N", " ", "Z"],
"phonemeSubstrArray" : ["JH", "OY1", "N", "Z"]
},
{
"phoneme" : "foobar",
"arrayOfPhonemeChar" : ["f", "o", "o", "b", "a", "r"],
"phonemeSubstrArray" : ["foobar"]
}
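To make the index-based character split easier to follow, here is a rough plain-JavaScript emulation of what the $range / $map / $substrCP combination does. This is only an illustration of the logic, not how the server implements it; the function names are hypothetical, and Array.from requires an ES2015-capable engine.

```javascript
// Plain-JavaScript emulation of the pipeline logic; the function names are
// hypothetical and only mirror the operators they stand in for.

// Like $range: returns the integers start .. end - 1.
function range(start, end) {
  var out = [];
  for (var i = start; i < end; i++) {
    out.push(i);
  }
  return out;
}

// Like $map over $range, taking one code point per index (as $substrCP with
// a length of 1 would).
function charsByCodePoint(str) {
  var codePoints = Array.from(str);        // one entry per code point
  return range(0, codePoints.length)       // the length plays the role of $strLenCP
    .map(function (i) { return codePoints[i]; });
}
```

Calling charsByCodePoint on the "foobar" value from the sample document gives one element per character, matching the arrayOfPhonemeChar field above.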
How to split a string into an array?
In any halfway modern JavaScript engine, it is
var myString = 'foo bar baz';
var myArray = myString.split(' ');
which should work even in the mongo shell.
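One caveat: split(' ') produces empty strings when the input contains consecutive or leading spaces. If the data might be untidy, a slightly more defensive variant could look like the sketch below; the name splitWords is an assumption for illustration.

```javascript
// Hypothetical helper: split on runs of whitespace and ignore leading or
// trailing whitespace, so no empty strings end up in the result.
function splitWords(str) {
  var trimmed = str.trim();
  return trimmed === '' ? [] : trimmed.split(/\s+/);
}
```

For clean, single-space-separated input it behaves the same as split(' ').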
Does MongoDB's shell provide the full feature set of JavaScript?
Internally, MongoDB has used Google's V8 engine since version 2.4. V8 conforms to ECMA-262, so expect at least all the functionality defined in that standard.
I haven't checked it, but some objects you know from the browser really don't make much sense in the mongo shell, namely everything DOM-related. So before using them, I'd check whether they exist.
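A minimal sketch of such an existence check, using typeof so that an undeclared name does not throw. The variable and function names are made up for illustration.

```javascript
// typeof never throws for an undeclared identifier, so these checks are safe
// in any environment.
var hasDocument = typeof document !== 'undefined';            // DOM object
var hasSplit = typeof String.prototype.split === 'function';  // core language

// Hypothetical helper that reports the results of the checks above.
function describeEnvironment() {
  return { document: hasDocument, split: hasSplit };
}
```

Outside a browser, the document entry would be expected to come back false, while split should be available everywhere.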
This should work with Mongo 3.4+ (see here for more info). This is a bit more concise than user3100115's answer.
db.members.aggregate(
[
{ "$addFields": {
"phonemes": { "$split": [ "$phoneme", " " ] }
}},
{ "$out": "members" }
]
)
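If the same conversion is needed for other fields or collections, the pipeline can be built by a small function. This is only a sketch: buildSplitPipeline and its parameters are hypothetical names, and the function merely constructs the array of stages shown above.

```javascript
// Hypothetical helper: builds the pipeline for a given source field, target
// field, and collection name. It only constructs a plain array of stages.
function buildSplitPipeline(sourceField, targetField, collectionName) {
  var addFields = {};
  addFields[targetField] = { $split: ['$' + sourceField, ' '] };
  return [
    { $addFields: addFields },
    { $out: collectionName } // overwrites the contents of this collection
  ];
}

// Assumed usage in the shell:
// db.members.aggregate(buildSplitPipeline('phoneme', 'phonemes', 'members'));
```

Because $out replaces the target collection, it is worth trying the pipeline without the last stage first and inspecting the output.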