I want to get stats for each day in a given month. However, if a day has no rows in the table, it doesn't show up in the results. How can I include days with no data, and show all days until the current date?
This is the query I have now:
SELECT DATE_FORMAT(FROM_UNIXTIME(timestamp), '%d'), COUNT(*)
FROM data
WHERE EXTRACT(MONTH FROM FROM_UNIXTIME(timestamp)) = 6
GROUP BY EXTRACT(DAY FROM FROM_UNIXTIME(timestamp))
So if I have
Row 1 | 01-06
Row 2 | 02-06
Row 3 | 03-06
Row 4 | 05-06
Row 5 | 05-06
(i changed timestamp values to a day/month date just to explain)
It should output
01 | 1
02 | 1
03 | 1
04 | 0
05 | 2
06 | 0
...Instead of ignoring day 4 and today (day 6).
You will need a calendar table to do something in the form
SELECT `date`, count(*)
FROM Input_Calendar c
LEFT JOIN Data d on c.date=d.date
GROUP BY `date`
I keep a full copy of a calendar table in my database and used a WHILE loop to fill it but you can populate one on the fly for use based on the different solutions out there like http://crazycoders.net/2012/03/using-a-calendar-table-in-mysql/
In MySQL, you can use MySQL variables (act like in-line programming values). You set and can manipulate as needed.
select
dayofmonth( DynamicCalendar.CalendarDay ) as `Day`,
count(*) as Entries
from
( select
@startDate := date_add( @startDate, interval 1 day ) CalendarDay
from
( select @startDate := '2013-05-31' ) sqlvars,
AnyTableThatHasAsManyDaysYouExpectToReport
limit
6 ) DynamicCalendar
LEFT JOIN Input_Calendar c
on DynamicCalendar.CalendarDay = date( from_unixtime( c.date ))
group by
DynamicCalendar.CalendarDay
In the above sample, the inner query can join against as the name implies "Any Table" in your database that has at least X number of records you are trying to generate for... in this case, you are dealing with only the current month of June and only need 6 records worth... But if you wanted to do an entire year, just make sure the "Any Table" has 365 records(or more).
The inner query will start by setting the "@startDate" to the day BEFORE June 1st (May 31). Then, by just having the other table, will result in every record joined to this variable (creates a simulated for/next loop) via a limit of 6 records (days you are generating the report for). So now, as the records are being queried, the Start Date keeps adding 1 day... first record results in June 1st, next record June 2nd, etc.
So now, you have a simulated calendar with 6 records dated from June 1 to June 6. Take that and join to your "data" table and you are already qualifying your dates via the join and get only those dates of activity. I'm joining on the DATE() of the from unix time since you care about anything that happend on June 1, and June 1 @ 12:00:00AM is different than June 1 @ 8:45am, so matching on the date only portion, they should remain in proper grouping.
You could expand this answer by changing the inner '2013-05-31' to some MySQL Date function to get the last day of the prior month, and the limit based on whatever day in the current month you are doing so these are not hard-coded.
Create a Time dimension. This is a standard OLAP reporting trick. You don't need a cube in order to do OLAP tricks, though. Simply find a script on the internet to generate a Calendar table and join to that table.
Also, I think your query is missing a WHERE clause.
Other useful tricks include creating a "Tally" table that is a list of numbers from 1 to N where N is usually the max of the bigint on your database management system.
No code provided here, as I am not a MySQL guru.
Pseudo-code is:
Select * from Data left join TimeDimension on data.date = timedimension.date