Calculating difference on datetime row betwen rows

2019-03-06 22:51发布

问题:

I have table that holds records with tasks, status and time when triggered:

Table tblwork:

+-------------+------------+---------------------+-----+
| task        | status     | stime               | id  |
+-------------+------------+---------------------+-----+
| A           | 1          | 2018-03-07 20:00:00 | 1   |
| A           | 2          | 2018-03-07 20:30:00 | 2   |
| A           | 1          | 2018-03-07 21:00:00 | 3   |
| A           | 3          | 2018-03-07 21:30:00 | 4   |
| B           | 1          | 2018-03-07 22:30:00 | 5   |
| B           | 3          | 2018-03-07 23:30:00 | 6   |
+-------------+------------+---------------------+-----+

Status 1 means start, 2 - pause, 3 - end

Then I need to calculate how much time is spent for each task excluding pause (status = 2). This is how I do it:

SELECT t1.id, t1.task,
SUM(timestampdiff(second,IFNULL( 
(SELECT MAX(t2.stime) FROM tblwork t2  WHERE t2.task='B' AND t2.stime< t1.stime) ,t1.stime),t1.stime)) myTimeDiffSeconds
FROM tblwork t1 
WHERE t1.task='B' and (t1.status = 1 or t1.status = 3);

Now I want to get table for all tasks

SELECT t1.id, t1.task,
SUM(timestampdiff(second,IFNULL( 
(SELECT MAX(t2.stime) FROM tblwork t2  WHERE t2.stime< t1.stime) ,t1.stime),t1.stime)) myTimeDiffSeconds
FROM tblwork t1 
WHERE (t1.status = 1 or t1.status = 3) GROUP BY t1.taks

I get this result:

+-------------+------------+---------------------+
| task        | id         | mytimedifference    |
+-------------+------------+---------------------+
| A           | 1          | 3600                |   
| B           | 3          | 2421217             |
+-------------+------------+---------------------+

Calculation for A is correct B is wrong, it should be 3600 second but i don't understand why.

回答1:

Assuming there is always a start for each pause and end, wouldn't something like this be more direct?

SELECT t.task
   , SUM(TO_SECONDS(t.stime) 
         * CASE WHEN t.status IN (1) THEN -1
                WHEN t.status IN (2, 3) THEN 1
                ELSE 0
           END
     ) AS totalTimeSecs
FROM tblwork AS task
GROUP BY t.task

I'm not quite sure offhand how big the values that come out of TO_SECONDS() are for current timestamps; but if they are an issue when being summed, if could be changed to

   , SUM((TO_SECONDS(t.stime) - some_constant_just_before_or_at_your_earliest_seconds)
         * CASE WHEN t.status IN (1) THEN -1
                WHEN t.status IN (2, 3) THEN 1
                ELSE 0
           END
     ) AS totalTimeSecs

You can detect "abnormal" data by adding the following to the select expression list

, CASE WHEN SUM(CASE 
                WHEN t.status IN (1) THEN -1 
                WHEN t.status IN (2, 3) THEN 1 
                ELSE 0 END
              ) = 0 
       THEN 'OK' 
       ELSE 'ABNORMAL' 
   END AS integrityCheck

Note: any "unclosed" intervals will be marked as abnormal; without much more complicated and expensive start and end checking for intervals to differentiate "open" from "invalid", it's probably the best that can be done. The sum used for additonal "integrityCheck" equaling -1 might hint at an open ended interval, but could also indicate an erroneous double-start.