MySQL group by date and count including missing da

2020-02-11 07:32发布

Previously I was doing following to get per day count from reports table.

SELECT COUNT(*) AS count_all, tracked_on
 FROM `reports`
 WHERE (domain_id = 939 AND tracked_on >= '2014-01-01' AND tracked_on <= '2014-12-31')
 GROUP BY tracked_on
 ORDER BY tracked_on ASC;

Obviously this wont give me 0 count for missing dates.

Then I finally found a optimum solution to generate date-series between given date range. But the next challenge am facing is to join it with my reports table and get the count grouped by date.

select count(*), all_dates.Date as the_date, domain_id
from (
    select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
    from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) all_dates
inner JOIN reports r
    on all_dates.Date >= '2014-01-01'
  and all_dates.Date <= '2014-12-31'
where all_dates.Date between '2014-01-01' and '2014-12-31' AND domain_id = 939 GROUP BY the_date order by the_date ASC ;

The result am getting is

count(*)    the_date    domain_id
46  2014-01-01  939
46  2014-01-02  939
46  2014-01-03  939
46  2014-01-04  939
46  2014-01-05  939
46  2014-01-06  939
46  2014-01-07  939
46  2014-01-08  939
46  2014-01-09  939
46  2014-01-10  939
46  2014-01-11  939
46  2014-01-12  939
46  2014-01-13  939
46  2014-01-14  939
...


Whereas am looking to fill in the missing dates with 0

something like

count(*)    the_date    domain_id
12  2014-01-01  939
23  2014-01-02  939
46  2014-01-03  939
0   2014-01-04  939
0   2014-01-05  939
99  2014-01-06  939
1   2014-01-07  939
5   2014-01-08  939
...


Another try that I gave was:

select count(*), all_dates.Date as the_date, domain_id
from (
    select curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
    from (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
    cross join (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as c
) all_dates
inner JOIN reports r
    on all_dates.Date = r.tracked_on
where all_dates.Date between '2014-01-01' and '2014-12-31' AND domain_id = 939 GROUP BY the_date order by the_date ASC ;

Results:

count(*)    the_date    domain_id
38        2014-09-03     939
8         2014-09-04     939

Minimal data with above queries: http://sqlfiddle.com/#!2/dee3e/6

2条回答
Evening l夕情丶
2楼-- · 2020-02-11 07:58

The all_dates subquery is only looking back from the current day (curdate()). If you want to include future dates, change the first line of the subquery to something like:

select '2015-01-01' - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY as Date
查看更多
太酷不给撩
3楼-- · 2020-02-11 08:03

You need an OUTER JOIN to arrive at every day between a start and an end because if you use an INNER JOIN it will restrict the output to just the dates that are joined (i.e. just those dates in the report table).

In addition, when you use an OUTER JOIN you must take care that conditions in the where clause don't cause an implicit inner join; for example AND domain_id = 1 if use in the where clause would suppress any row that did not have that condition met, but when used as a join condition it only restricts the rows of the report table.

SELECT
      COUNT(r.domain_id)
    , all_dates.Date AS the_date
    , domain_id
FROM (
        SELECT DATE_ADD(curdate(), INTERVAL 2 MONTH) - INTERVAL (a.a + (10 * b.a) ) DAY as Date
        FROM (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
        CROSS JOIN (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
      ) all_dates
      LEFT OUTER JOIN reports r
                  ON all_dates.Date = r.tracked_on
                        AND domain_id = 1
WHERE all_dates.Date BETWEEN '2014-09-01' AND '2014-09-30'
GROUP BY
      the_date
ORDER BY
      the_date ASC;

I have also changed the all_dates derived table, by using DATE_ADD() to push the starting point into the future, and I have reduced the it's size. Both of these are options and can be tweaked as you see fit.

Demo at SQLfiddle


to arrive at a domain_id for every row (as shown in your question) you would need to use someting like the following; Note you could use IFNULL() which is MySQL specific but I have used COALESCE() which is more generic SQL. However use of an @parameter as shown here is MySQL specific anyway.

SET @domain := 1;

SELECT
      COUNT(r.domain_id)
    , all_dates.Date AS the_date
    , coalesce(domain_id,@domain) AS domain_id
FROM (
        SELECT DATE_ADD(curdate(), INTERVAL 2 month) - INTERVAL (a.a + (10 * b.a) ) DAY as Date
        FROM (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as a
        CROSS JOIN (select 0 as a union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) as b
      ) all_dates
      LEFT JOIN reports r
                  ON all_dates.Date = r.tracked_on
                        AND domain_id = @domain
WHERE all_dates.Date BETWEEN '2014-09-01' AND '2014-09-30'
GROUP BY
      the_date
ORDER BY
      the_date ASC;

See this at SQLfiddle

查看更多
登录 后发表回答