MongoDB - strip non numeric characters in field

2020-07-10 05:06发布

I have a field of phone numbers where a random variety of separators have been used, such as:

932-555-1515
951.555.1255
(952) 555-1414

I would like to go through each field that already exists and remove the non numeric characters.

Is that possible?

Whether or not it gets stored as an integer or as a string of numbers, I don't care either way. It will only be used for display purposes.

3条回答
We Are One
2楼-- · 2020-07-10 05:38

You'll have to iterate over all your docs in code and use a regex replace to clean up the strings.

Here's how you'd do it in the mongo shell for a test collection with a phone field that needs to be cleaned up.

db.test.find().forEach(function(doc) {
  doc.phone = doc.phone.replace(/[^0-9]/g, ''); 
  db.test.save(doc);
});
查看更多
Lonely孤独者°
3楼-- · 2020-07-10 05:45

Starting in Mongo 4.4, the $function aggregation operator allows applying a custom javascript function to implement behaviour not supported by the MongoDB Query Language.

And coupled with improvements made to db.collection.update() in Mongo 4.2 that can accept an aggregation pipeline, allowing the update of a field based on its own value,

We can manipulate and update a field in ways the language doesn't easily permit and avoid an inefficient find/foreach pattern:

// { "x" : "932-555-1515",   "y" : 3 }
// { "x" : "951.555.1255",   "y" : 7 }
// { "x" : "(952) 555-1414", "y" : 6 }
db.collection.updateMany(
  { "x": { $regex: /[^0-9]/g } },
  [{ $set:
    { "x":
      { $function: {
          body: function(x) { return x.replace(/[^0-9]/g, ''); },
          args: ["$x"],
          lang: "js"
      }}
    }
  }
])
// { "x" : "9325551515", "y" : 3 }
// { "x" : "9515551255", "y" : 7 }
// { "x" : "9525551414", "y" : 6 }

The update consist of:

  • a match query { "x": { $regex: /[^0-9]/g } }, filtering documents to update (in our case any document that contains non-numeric characters in the field we're interested on updating).

  • an update aggreation pipeline [ { $set: { active: { $eq: [ "$a", "Hello" ] } } } ] (note the squared brackets signifying the use of an aggregation pipeline). $set is a new aggregation operator and an alias for $addFields.

$function takes 3 parameters:

  • body, which is the function to apply, whose parameter is the string to modify. The function here simply consists in replacing characters matching the regex with empty characters.
  • args, which contains the fields from the record that the body function takes as parameter. In our case, "$x".
  • lang, which is the language in which the body function is written. Only js is currently available.
查看更多
虎瘦雄心在
4楼-- · 2020-07-10 05:52

Based on the previous example by @JohnnyHK, I added regex also to the find query:

/*
MongoDB: Find by regular expression and run regex replace on results
*/
db.test.find({"url": { $regex: 'http:\/\/' }}).forEach(function(doc) {
  doc.url = doc.url.replace(/http:\/\/www\.url\.com/g, 'http://another.url.com'); 
  db.test.save(doc);
});
查看更多
登录 后发表回答