Pymongo aggregation - passing python list for aggr

2019-07-29 07:06发布

问题:

Here is my attempt at performing the aggregation (day-wise) based on timestamp if all the elements are hardcoded inside the query.

pipe = [ 
{
     "$match": { 
        "cid": ObjectId("57fe39972b8dbc1387b20913")
        }
    },
{
    "$project":
    {

        "animal_dog": "$animal.dog",
        "animal_dog_tail": "$animal.dog.tail",
        "animal_cat": "$animal.cat",
        "tree": "$fruits",
        "day": {"$substr": ["$timestamp",  0, 10]} 
        }},
{ 
"$group":
    {
    "_id" : "$day",
    "animal_dog" : {"$sum": "$animal_dog"},
    "animal_dog_tail": {"$sum": "$animal_dog_tail"}, 
    "animal_cat": {"$sum": "$animal_cat"}, 
    "tree": {"$sum": "$tree"}, 
    "fruits": {"$sum": "$fruits"},

}} ]

output = dailycollection.aggregate(pipeline = pipe)

Assuming that I have a mongo-collection having the exact same nested structure, how do I pass a python_list with the respective elements for aggregating based on timestamp? Let's say my Python list has elements like this:

key_list = animal.dog, animal.dog.tail, animal.cat, tree, fruits, timestamp.

I would like to pass this list into the query I just wrote above without hardcoding each of the elements. I would like to perform projection, $sum, $group for the elements without hardcoding them as I did in the aforementioned query. I would like to simply iterate through the python list during $project and $group stage.

Is that possible?

And also how do I ensure that the output query also preserves the same nested-format without losing depth?

回答1:

You could try the following:

key_list =  ["animal.dog", "animal.dog.tail", "animal.cat", "tree", "fruits", "timestamp"]
match = { "$match": { "cid": ObjectId("57fe39972b8dbc1387b20913") } }
project = { "$project": {} }
group = { "$group": {} }

for item in key_list:
    if item == "timestamp":
        project["$project"]["day"] = { "$substr": ["$"+item,  0, 10] }
        group["$group"]["_id"] = "$day"
        break
    sum = {"$sum": ""}
    sum["$sum"] = "$"+item.replace(".", "_")
    project["$project"][item.replace(".", "_")] = "$"+item
    group["$group"][item.replace(".", "_")] = sum

pipeline = [match, project, group]