How do I perform the SQL Join equivalent in MongoD

2018-12-30 23:29发布

How do I perform the SQL Join equivalent in MongoDB?

For example say you have two collections (users and comments) and I want to pull all the comments with pid=444 along with the user info for each.

comments
  { uid:12345, pid:444, comment="blah" }
  { uid:12345, pid:888, comment="asdf" }
  { uid:99999, pid:444, comment="qwer" }

users
  { uid:12345, name:"john" }
  { uid:99999, name:"mia"  }

Is there a way to pull all the comments with a certain field (eg. ...find({pid:444}) ) and the user information associated with each comment in one go?

At the moment, I am first getting the comments which match my criteria, then figuring out all the uid's in that result set, getting the user objects, and merging them with the comment's results. Seems like I am doing it wrong.

19条回答
荒废的爱情
2楼-- · 2018-12-31 00:12

As of Mongo 3.2 the answers to this question are mostly no longer correct. The new $lookup operator added to the aggregation pipeline is essentially identical to a left outer join:

https://docs.mongodb.org/master/reference/operator/aggregation/lookup/#pipe._S_lookup

From the docs:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

Of course Mongo is not a relational database, and the devs are being careful to recommend specific use cases for $lookup, but at least as of 3.2 doing join is now possible with MongoDB.

查看更多
妖精总统
3楼-- · 2018-12-31 00:14

With right combination of $lookup, $project and $match, you can join mutiple tables on multiple parameters. This is because they can be chained multiple times.

Suppose we want to do following (reference)

SELECT S.* FROM LeftTable S
LEFT JOIN RightTable R ON S.ID =R.ID AND S.MID =R.MID WHERE R.TIM >0 AND 
S.MOB IS NOT NULL

Step 1: Link all tables

you can $lookup as many tables as you want.

$lookup - one for each table in query

$unwind - because data is denormalised correctly, else wrapped in arrays

Python code..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"}

                        ])

Step 2: Define all conditionals

$project : define all conditional statements here, plus all the variables you'd like to select.

Python Code..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }}
                        ])

Step 3: Join all the conditionals

$match - join all conditions using OR or AND etc. There can be multiples of these.

$project: undefine all conditionals

Python Code..

db.LeftTable.aggregate([
                        # connect all tables

                        {"$lookup": {
                          "from": "RightTable",
                          "localField": "ID",
                          "foreignField": "ID",
                          "as": "R"
                        }},
                        {"$unwind": "$R"},

                        # define conditionals + variables

                        {"$project": {
                          "midEq": {"$eq": ["$MID", "$R.MID"]},
                          "ID": 1, "MOB": 1, "MID": 1
                        }},

                        # join all conditionals

                        {"$match": {
                          "$and": [
                            {"R.TIM": {"$gt": 0}}, 
                            {"MOB": {"$exists": True}},
                            {"midEq": {"$eq": True}}
                        ]}},

                        # undefine conditionals

                        {"$project": {
                          "midEq": 0
                        }}

                        ])

Pretty much any combination of tables, conditionals and joins can be done in this manner.

查看更多
旧时光的记忆
4楼-- · 2018-12-31 00:15

I think, if You need normalized data tables - You need to try some other database solutions.

But I've foun that sollution for MOngo on Git By the way, in inserts code - it has movie's name, but noi movie's ID.

Problem

You have a collection of Actors with an array of the Movies they've done.

You want to generate a collection of Movies with an array of Actors in each.

Some sample data

 db.actors.insert( { actor: "Richard Gere", movies: ['Pretty Woman', 'Runaway Bride', 'Chicago'] });
 db.actors.insert( { actor: "Julia Roberts", movies: ['Pretty Woman', 'Runaway Bride', 'Erin Brockovich'] });

Solution

We need to loop through each movie in the Actor document and emit each Movie individually.

The catch here is in the reduce phase. We cannot emit an array from the reduce phase, so we must build an Actors array inside of the "value" document that is returned.

The code
map = function() {
  for(var i in this.movies){
    key = { movie: this.movies[i] };
    value = { actors: [ this.actor ] };
    emit(key, value);
  }
}

reduce = function(key, values) {
  actor_list = { actors: [] };
  for(var i in values) {
    actor_list.actors = values[i].actors.concat(actor_list.actors);
  }
  return actor_list;
}

Notice how actor_list is actually a javascript object that contains an array. Also notice that map emits the same structure.

Run the following to execute the map / reduce, output it to the "pivot" collection and print the result:

printjson(db.actors.mapReduce(map, reduce, "pivot")); db.pivot.find().forEach(printjson);

Here is the sample output, note that "Pretty Woman" and "Runaway Bride" have both "Richard Gere" and "Julia Roberts".

{ "_id" : { "movie" : "Chicago" }, "value" : { "actors" : [ "Richard Gere" ] } }
{ "_id" : { "movie" : "Erin Brockovich" }, "value" : { "actors" : [ "Julia Roberts" ] } }
{ "_id" : { "movie" : "Pretty Woman" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }
{ "_id" : { "movie" : "Runaway Bride" }, "value" : { "actors" : [ "Richard Gere", "Julia Roberts" ] } }

查看更多
柔情千种
5楼-- · 2018-12-31 00:17

You can join two collection in Mongo by using lookup which is offered in 3.2 version. In your case the query would be

db.comments.aggregate({
    $lookup:{
        from:"users",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

or you can also join with respect to users then there will be a little change as given below.

db.users.aggregate({
    $lookup:{
        from:"comments",
        localField:"uid",
        foreignField:"uid",
        as:"users_comments"
    }
})

It will work just same as left and right join in SQL.

查看更多
余生请多指教
6楼-- · 2018-12-31 00:20

$lookup (aggregation)

Performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the “joined” collection. The $lookup stage passes these reshaped documents to the next stage. The $lookup stage has the following syntaxes:

Equality Match

To perform an equality match between a field from the input documents with a field from the documents of the “joined” collection, the $lookup stage has the following syntax:

{
   $lookup:
     {
       from: <collection to join>,
       localField: <field from the input documents>,
       foreignField: <field from the documents of the "from" collection>,
       as: <output array field>
     }
}

The operation would correspond to the following pseudo-SQL statement:

SELECT *, <output array field>
FROM collection
WHERE <output array field> IN (SELECT <documents as determined from the pipeline>
                               FROM <collection to join>
                               WHERE <pipeline> );

Mongo URL

查看更多
步步皆殇っ
7楼-- · 2018-12-31 00:20

playORM can do it for you using S-SQL(Scalable SQL) which just adds partitioning such that you can do joins within partitions.

查看更多
登录 后发表回答