Find last record of each day

2019-02-19 11:47发布

问题:

I use mongodb (I'm new in mongodb) to store data about my power consumption , each minute there is a new record, here is an example :

{"id":"5309d4cae4b0fbd904cc00e1","adco":"O","hchc":7267599,"hchp":10805900,"hhphc":"g","ptec":"c","iinst":13,"papp":3010,"imax":58,"optarif":"s","isousc":60,"motdetat":"Á","date":1393156826114}

so I have around 1440 records a day.

I want to compute the cost by day, but the problem for this is that I need the last record of the day, because this record can give me the amount of absolute kWh (kiloWatt-Hour). So if I remove the kwh amount of the last record of the yesterday, I have the amount of kWh for the day.

The field hchp gives me this absolute kWh. The field date corresponds to the time of the measure in millisecond.

consumption of the day = absolute consumption of the end of the day - absolute consumption of the end of yesterday.

How is it possible to get the last record of each day in mongodb ?

Note : I use mongodb in spring java, so I need a query like this :

Example to get all measures :

@Query("{ 'date' : { $gt : ?0 }}")
public List<Mesure> findByDateGreaterThan(Date date, Sort sort);

回答1:

A bit more modern than the original answer:

db.collection.aggregate([
  { "$sort": { "date": 1 } },
  { "$group": {
    "_id": {
      "$subtract": ["$date",{"$mod": ["$date",86400000]}]
    },
    "doc": { "$last": "$$ROOT" }
  }},
  { "$replaceRoot": { "newDocument": "$doc" } }
])

The same principle applies that you essentially $sort the collection and then $group on the required grouping key picking up the $last data from the grouping boundary.

Making things a bit clearer since the original writing is that you can use $$ROOT instead of specifying every document property, and of course the $replaceRoot stage allows you to restore that data fully as the original document form.

But the general solution is still $sort first, then $group on the common key that is required and keep the $last or $first depending on sort order occurrences from the grouping boundary for the properties that are required.

Also for BSON Dates as opposed to a timestamp value as in the question, see Group result by 15 minutes time interval in MongoDb for different approaches on how to accumulate for different time intervals actually using and returning BSON Date values.


Not quite sure what you are going for here but you could do this in aggregate if my understanding is right. So to get the last record for each day:

db.collection.aggregate([
    // Sort in date order  as ascending
    {"$sort": { "date": 1 } },

    // Date math converts to whole day
    {"$project": {
        "adco": 1,
        "hchc": 1,
        "hchp": 1,
        "hhphc": 1,
        "ptec": 1,
        "iinst": 1,
        "papp": 1,
        "imax": 1,
        "optarif": 1,
        "isousc": 1,
        "motdetat": 1,
        "date": 1,
        "wholeDay": {"$subtract": ["$date",{"$mod": ["$date",86400000]}]} 
    }},

    // Group on wholeDay ( _id insertion is monotonic )
    {"$group": 
        "_id": "$wholeDay",
        "docId": {"$last": "$_id" },
        "adco": {"$last": "$adco" },
        "hchc": {"$last": "$hchc" },
        "hchp": {"$last": "$hchp" },
        "hhphc": {"$last": "$hhphc" },
        "ptec": {"$last": "$ptec" },
        "iinst": {"$last": "$iinst" },
        "papp": {"$last": "$papp" },
        "imax": {"$last": "$imax" },
        "optarif": {"$last": "$optarif",
        "isousc": {"$last": "$isouc" },
        "motdetat": {"$last": "$motdetat" },
        "date": {"$last": "$date" },
    }}
])

So the principle here is that given the timestamp value, do the date math to project that as the midnight time at the beginning of each day. Then as the _id key on the document is already monotonic (always increasing), then simply group on the wholeDay value while pulling the $last document from the grouping boundary.

If you don't need all the fields then only project and group on the ones you want.

And yes you can do this in the spring data framework. I'm sure there is a wrapped command in there. But otherwise, the incantation to get to the native command goes something like this:

mongoOps.getCollection("yourCollection").aggregate( ... )

For the record, if you actually had BSON date types rather than a timestamp as a number, then you can skip the date math:

db.collection.aggregate([
    { "$group": { 
        "_id": { 
            "year": { "$year": "$date" },
            "month": { "$month": "$date" },
            "day": { "$dayOfMonth": "$date" }
        },
        "hchp": { "$last": "$hchp" }
    }}
])