Mongo find by regex: return only matching string

Posted 2020-04-11 11:07

Question:

My application has the following stack:

Sinatra on Ruby -> MongoMapper -> MongoDB

The application puts several entries in the database. In order to crosslink to other pages, I've added some sort of syntax. e.g.:

Coffee is a black, caffeinated liquid made from beans. {Tea} is made from leaves. Both drinks are sometimes enjoyed with {milk}

In this example {Tea} will link to another DB entry about tea.

I'm trying to query my MongoDB for all 'linked terms'. Usually in Ruby I would do something like this: /{([a-zA-Z0-9])+}/ where the () will return a matched string. In Mongo, however, I get the whole record.

How can I get Mongo to return only the matched parts of the record I'm looking for? For the example above it would return:

["Tea", "milk"]

I'm trying to avoid pulling the entire record into Ruby and processing it there.

Answer 1:

I'm not sure I understand the question, but something like this may help:

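db.yourColl.aggregate([
{
    // Keep only documents whose field contains at least one {term}
    $match:{"yourKey":{$regex:'\\{[a-zA-Z0-9]+\\}', "$options" : "i"}}
},
{
    // Collect the matching field values into one array
    $group:{
        _id:null,
        tot:{$push:"$yourKey"}
    }
}])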

If you don't want duplicates in tot, use $addToSet instead of $push.
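Note that the pipeline above still returns the whole field value, not just the linked terms. On MongoDB 4.2 or later, the $regexFindAll aggregation operator can pull out the capture groups on the server. The following is a minimal sketch under that version assumption; the field name yourKey follows the example above, while terms and the variable names are made up for illustration.

```javascript
// Assumed names: "yourKey" is the field from the example above;
// "terms", LINK_REGEX and linkTermsPipeline are hypothetical.
const LINK_REGEX = "\\{([a-zA-Z0-9]+)\\}";

const linkTermsPipeline = [
  // Skip documents that contain no {term} at all
  { $match: { yourKey: { $regex: LINK_REGEX } } },
  {
    $project: {
      _id: 0,
      terms: {
        // $regexFindAll yields one { match, idx, captures } object per hit;
        // keep only the first capture group of each hit
        $map: {
          input: { $regexFindAll: { input: "$yourKey", regex: LINK_REGEX } },
          as: "m",
          in: { $arrayElemAt: ["$$m.captures", 0] }
        }
      }
    }
  }
];

// In the mongo shell: db.yourColl.aggregate(linkTermsPipeline)
```

Each output document then carries only a terms array, so nothing but the matched names has to travel back to the application.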



Answer 2:

The way I solved this problem was to use the string aggregation operators: $indexOfCP to find the starting and ending code point indices of each term, and $substrCP to extract the string I wanted. Since a record can contain several of these {} markers, you need one projection to identify all of the code point indices in one shot and another projection to extract the words you need. Hope this helps.
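One possible way to sketch this idea, not necessarily the exact pipeline I used: to cope with several {} markers, the sketch below swaps the first $indexOfCP lookup for a $split on "{", then uses $indexOfCP to find the closing "}" in each piece and $substrCP to cut the term out. It folds both steps into a single $project. The field name yourKey is borrowed from Answer 1; terms and the variable names are hypothetical.

```javascript
// Assumed names: "yourKey" is borrowed from Answer 1;
// "terms", HAS_LINK_REGEX and splitTermsPipeline are hypothetical.
const HAS_LINK_REGEX = "\\{[^{}]*\\}";

// Every piece after the first one started right after a "{"
const piecesAfterBrace = {
  $slice: [
    { $split: ["$yourKey", "{"] },
    1,
    { $size: { $split: ["$yourKey", "{"] } }
  ]
};

const splitTermsPipeline = [
  // Only look at documents that contain at least one {...} pair
  { $match: { yourKey: { $regex: HAS_LINK_REGEX } } },
  {
    $project: {
      _id: 0,
      terms: {
        $map: {
          // Drop pieces with no closing "}" so $substrCP never gets a length of -1
          input: {
            $filter: {
              input: piecesAfterBrace,
              as: "piece",
              cond: { $gte: [{ $indexOfCP: ["$$piece", "}"] }, 0] }
            }
          },
          as: "piece",
          // Take the code points from the start of the piece up to the "}"
          in: { $substrCP: ["$$piece", 0, { $indexOfCP: ["$$piece", "}"] }] }
        }
      }
    }
  }
];

// In the mongo shell: db.yourColl.aggregate(splitTermsPipeline)
```

Unlike a character-class regex, this version accepts anything between the braces (including spaces or an empty string), so add a further $filter if only alphanumeric terms should count.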