Filling in missing days for rows

2019-05-16 21:21发布

Given a table with schema like this:

id1    id2    day    number

How can I turn this:

a    b    day1    2
a    b    day5    4
a    b    day9    8
c    d    day2    1
c    d    day3    2
c    d    day5    4
c    d    day8    3

Into this?:

a    b    day1    2
a    b    day2    2
a    b    day3    2
a    b    day4    2
a    b    day5    4
a    b    day6    4
a    b    day7    4
a    b    day8    4
a    b    day9    8
c    d    day2    1
c    d    day3    2
c    d    day4    2
c    d    day5    4
c    d    day6    4
c    d    day7    4
c    d    day8    3

To clarify, for each group of id1 and id2, I need to fill in the missing rows with dates ranging from the minimum date for that grouping to the maximum date. Furthermore, the rows that get filled in must use the previous entry's number column for it's number column.

I need this to run as fast as possible.

Bonus points if you can do it in LINQ to SQL (assuming the class exists for the table).

EDIT: The day column is actually an int that represents the day, but for the sake of argument, it could be a date.

I've done the naive approach of iterating over each group and adding in the missing days, but it just seems terribly inefficient. I have to think that there's something faster or that someone has encountered this situation before.

2条回答
混吃等死
2楼-- · 2019-05-16 21:26

I started going down the same path as Quassnoi only in LINQ, but I kept having issues. I had to create a solution or it would bug me all day (and I really need to get some work done). This solution is not nearly as slick as Quassnoi's answer (he gets my vote), but I spent the time figuring it out so I thought I would share. This is probably no better than your existing solution but I had fun making it :)

// This selects each id group with their min and max dates into a collection
var query = (from m in MissingDaysTables
group m by new {Id1 = m.Id1, Id2 = m.Id2, } into myGroup
select new 
{ 
    Id1=myGroup.Key.Id1, 
    Id2=myGroup.Key.Id2, 
    StartDay=(from g in myGroup select g.Day).Min(), 
    EndDay=(from g in myGroup select g.Day).Max()
});

var myList = new List<MissingDaysTable>();

// Loop through each group and create the records
foreach (var row in query)
{
    DateTime curDate = row.StartDay; //used to track the current date
    while (curDate <= row.EndDay)
    {
        myList.Add(new MissingDaysTable() 
        { 
            Id1 = row.Id1, 
            Id2 = row.Id2, 
            Day=startDate, 
            Number=MissingDaysTables.Where(m => m.Id1==row.Id1 && m.Id2==row.Id2 && m.Day<=curDate).OrderByDescending (t => t.Day).First().Number
        });
        curDate = curDate.AddDays(1);
    }
}
查看更多
再贱就再见
3楼-- · 2019-05-16 21:28
WITH    dates (id1, id2, ds, de) AS
        (
        SELECT  id1, id2, MIN(d), MAX(d)
        FROM    mytable m
        GROUP BY
               id1, id2
        UNION ALL
        SELECT  d.id1, d.id2, DATEADD(d, 1, ds), de
        FROM    dates d
        WHERE   ds < de
        )
SELECT  id1, id2, ds, m.number
FROM    dates d
CROSS APPLY
        (
        SELECT  TOP 1 number
        FROM    mytable m
        WHERE   m.id1 = d.id1
                AND m.id2 = d.id2
                AND m.d <= d.ds
        ORDER BY
                m.d DESC
        ) m
OPTION (MAXRECURSION 0)
查看更多
登录 后发表回答