$unwind an object in aggregation framework

2019-01-14 06:00发布

问题:

In the MongoDB aggregation framework, I was hoping to use the $unwind operator on an object (ie. a JSON collection). Doesn't look like this is possible, is there a workaround? Are there plans to implement this?

For example, take the article collection from the aggregation documentation . Suppose there is an additional field "ratings" that is a map from user -> rating. Could you calculate the average rating for each user?

Other than this, I'm quite pleased with the aggregation framework.

Update: here's a simplified version of my JSON collection per request. I'm storing genomic data. I can't really make genotypes an array, because the most common lookup is to get the genotype for a random person.

variants: [

    {
        name: 'variant1', 
        genotypes: {

            person1: 2,
            person2: 5,
            person3: 7,

        }
    }, 

    {
        name: 'variant2', 
        genotypes: {

            person1: 3,
            person2: 3,
            person3: 2,

        }
    }

]

回答1:

It is not possible to do the type of computation you are describing with the aggregation framework - and it's not because there is no $unwind method for non-arrays. Even if the person:value objects were documents in an array, $unwind would not help.

The "group by" functionality (whether in MongoDB or in any relational database) is done on the value of a field or column. We group by value of field and sum/average/etc based on the value of another field.

Simple example is a variant of what you suggest, ratings field added to the example article collection, but not as a map from user to rating but as an array like this:

{ title : title of article", ...
  ratings: [
         { voter: "user1", score: 5 },
         { voter: "user2", score: 8 },
         { voter: "user3", score: 7 }
  ]
}

Now you can aggregate this with:

[ {$unwind: "$ratings"},
  {$group : {_id : "$ratings.voter", averageScore: {$avg:"$ratings.score"} } } 
]

But this example structured as you describe it would look like this:

{ title : title of article", ...
  ratings: {
         user1: 5,
         user2: 8,
         user3: 7
  }
}

or even this:

{ title : title of article", ...
  ratings: [
         { user1: 5 },
         { user2: 8 },
         { user3: 7 }
  ]
}

Even if you could $unwind this, there is nothing to aggregate on here. Unless you know the complete list of all possible keys (users) you cannot do much with this. [*]

An analogous relational DB schema to what you have would be:

CREATE TABLE T (
   user1: integer,
   user2: integer,
   user3: integer
   ...
);

That's not what would be done, instead we would do this:

CREATE TABLE T (
   username: varchar(32),
   score: integer
);

and now we aggregate using SQL:

select username, avg(score) from T group by username;

There is an enhancement request for MongoDB that may allow you to do this in the aggregation framework in the future - the ability to project values to keys to vice versa. Meanwhile, there is always map/reduce.

[*] There is a complicated way to do this if you know all unique keys (you can find all unique keys with a method similar to this) but if you know all the keys you may as well just run a sequence of queries of the form db.articles.find({"ratings.user1":{$exists:true}},{_id:0,"ratings.user1":1}) for each userX which will return all their ratings and you can sum and average them simply enough rather than do a very complex projection the aggregation framework would require.



回答2:

Since 3.4.4, you can transform object to array using $objectToArray

See: https://docs.mongodb.com/manual/reference/operator/aggregation/objectToArray/



回答3:

This is an old question, but I've run across a tidbit of information through trial and error that people may find useful.

It's actually possible to unwind on a dummy value by fooling the parser this way:

db.Opportunity.aggregate(
  { $project: {
        Field1: 1, Field2: 1, Field3: 1,
        DummyUnwindField: { $ifNull: [null, [1.0]] }
    }
  },
  { $unwind: "$DummyUnwindField" }
);

This will produce 1 row per document, regardless of whether or not the value exists. You may be able tinker with this to generate the results you want. I had hoped to combine this with multiple $unwinds to (sort of like emit() in map/reduce), but alas, the last $unwind wins or they combine as an intersection rather than union which makes it impossible to achieve the results I was looking for. I am sadly disappointed with the aggregate framework functionality as it doesn't fit the one use case I was hoping to use it for (and seems strangely like a lot of the questions on StackOverflow in this area are asking) - ordering results based on match rate. Improving the poor map reduce performance would have made this entire feature unnecessary.



回答4:

This is what I found & extended.

Lets create experimental database in mongo

db.copyDatabase('livedb' , 'experimentdb')

Now Use experimentdb & convert Array to object in your experimentcollection

db.getCollection('experimentcollection').find({}).forEach(function(e){
    if(e.store){
        e.ratings = [e.ratings]; //Objects name to be converted to array eg:ratings
        db.experimentcollection.save(e);
    }
})

Some nerdy js code to convert json to flat object

var flatArray = [];

var data = db.experimentcollection.find().toArray();

for (var index = 0; index < data.length; index++) {

  var flatObject = {};

  for (var prop in data[index]) {

    var value = data[index][prop];

    if (Array.isArray(value) && prop === 'ratings') {
      for (var i = 0; i < value.length; i++) {
        for (var inProp in value[i]) {
          flatObject[inProp] = value[i][inProp];
        }
      }
    }else{
        flatObject[prop] = value;
    }
  }
  flatArray.push(flatObject);
}

printjson(flatArray);