How to Select records that don't exist in Sql

2019-01-26 20:47发布

问题:

This isn't about checking if a record exists before selecting it, or viceversa.

The issue is this, I have a database with many records, they are stored by date, by day to be specific, each day should have a set of values, but some days may not have any values at all, they may not exist. So when I try to do a query, I need to show all the data, even if it doesn't technically exist, it could show at least as a blank space but it should show.

Think of this table:

 ___________________________________
| id |    date    | value1 | value2 |
|  1 | 02/01/2014 |    1   |    1   |
|  2 | 02/03/2014 |    2   |    2   |
|  3 | 02/04/2014 |    3   |    3   |
|  4 | 02/06/2014 |    4   |    4   |

Now what the query needs to do is to return all the data that's in between a certain date range, say from 02/01/2014 to 02/03/2014, it should return all the values between that date range, say:

SELECT * FROM myTable WHERE date>=`02/01/2014` AND date<=`02/03/2014`

But that will only return this:

 ___________________________________
| id |    date    | value1 | value2 |
|  1 | 02/01/2014 |    1   |    1   |
|  2 | 02/03/2014 |    2   |    2   |

Now I know this is fairly obvious, it is selecting all the data that exists in the table given the range I sent to the query, but I'm working with a report that doesn't care if there were values for that given day or not, it must return every single day, even if there were records for that day or not, meaning that even if 02/02/2014 is not on the table, it has to be returned as well.

Sort of like this:

 ___________________________________
| id |    date    | value1 | value2 |
|  1 | 02/01/2014 |    1   |    1   |
|    | 02/02/2014 |        |        |
|  2 | 02/03/2014 |    2   |    2   |

It doesn't matter it has no data, it just can't leave the day as if it didn't happen.

I already tried using COALESCE and ISNULL, but since there are no values for those days, there are no null values either, so they simply just cannot replace Null with something else.

Does anyone have any ideas?

回答1:

You could do this with a recursive CTE. Something like this:

DECLARE @startDate datetime = '2/1/2014'
DECLARE @endDate datetime = '2/6/2014'

;WITH DateRange(RunningDate) AS
(
    SELECT @startDate AS RunningDate
    UNION ALL
    SELECT RunningDate + 1
    FROM DateRange
    WHERE RunningDate < @endDate
)

SELECT id, RunningDate date, value1, value2 
FROM DateRange LEFT JOIN myTable ON myTable.date = DateRange.RunningDate

Edit... IF you do go with this solution (take note of Aaron Bertrand's comment under my answer), note that you'll also have to specify the max recursion if you intend on dealing with ranges greater than 3 months. The default is set to 100. This means that, as the query is currently written, it will only do a maximum of 101 dates (100 levels of recursion).

You can additionally specify OPTION (MAXRECURSION 0) at the end of the last SELECT to overcome this. 0 means infinite levels of recursion. But again, take note of Aaron's post.



回答2:

This should work:

WITH
    -- Numbers CTE, courtesy Itzik Ben-Gan
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS n FROM L4),
    -- Turn into Dates
  Dates AS(SELECT CAST('18991231' AS DATETIME)+n AS d FROM Nums)
SELECT * 
FROM Dates d
LEFT JOIN myTable t ON d.d = t.[date]
WHERE d.d>='02/01/2014' AND d.d<='02/03/2014'

Here is a version that is optimized for smaller date ranges:

DECLARE @startDate datetime = '2/1/2014'
DECLARE @endDate datetime = '2/3/2014'
DECLARE @days as INT = DATEDIFF(dd, @startDate, @enddate) + 1
;
WITH
    -- Numbers CTE, courtesy Itzik Ben-Gan
  L0   AS(SELECT 1 AS c UNION ALL SELECT 1),
  L1   AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
  L2   AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
  L3   AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
  L4   AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
  T4   AS(SELECT TOP (@days) 1 AS c FROM L4),
  Nums AS(SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1 AS n FROM T4),
    -- Turn into Dates
  Dates AS(SELECT @startDate+n AS d FROM Nums)
SELECT * 
FROM Dates d
LEFT JOIN myTable t ON d.d = t.[date]
WHERE d.d>=@startDate AND d.d<=@endDate


回答3:

Eric Fan had the right idea, but I thought it might help to see an implementation:

Here's a function that will return a table of the inclusive dates:

CREATE FUNCTION dbo.fDatesBetween(@startDate Date, @endDate Date)
    RETURNS @tbl TABLE
        (
            a_date      Date
        )
AS BEGIN
    DECLARE @thisDt Date
    SET @thisDt = @startDate

    WHILE @thisDt <= @endDate
    BEGIN
        INSERT INTO @tbl SELECT @thisDt
        SET @thisDt = DateAdd(day, 1, @thisDt)
    END
    RETURN
END

Now, if you do an outer join to your table from the function results, you will have what you are looking for:

SELECT DATES.a_date, SampleTable.value1, SampleTable.value2
  FROM dbo.fDatesBetween(@startDate, @endDate) DATES
       LEFT JOIN SampleTable
          ON DATES.a_date = SampleTable.dt

That should give you just what you asked for.