Left outer join and group by

Posted 2019-04-09 18:03

Question:

I have the following class definition:

public class MyData  
{  
    public DateTime Expiration { get; private set; }  
    public string Name { get; private set; }  
    public double Price { get; private set; }  
    public double Quantity { get; private set; }  
}

I need to join this with a list of years:

IEnumerable<int> years = Enumerable.Range(1, 20);

The result will ultimately be displayed in a grid, with the y-axis representing the Name field and the x-axis representing the years. Each cell will hold the aggregated Quantity * Price for that Name and year.

I am currently struggling with the syntax. I started off by joining the instances of MyData to the years and grouping as follows:

var myData = GetData();
var query = from data in myData
            join year in years on (data.Expiration.Year - DateTime.Now.Year) + 1 equals year
            group data by new { Year = (data.Expiration.Year - DateTime.Now.Year) + 1
                              , Name = data.Name } into grouped
            select new {Name = grouped.Key.Name
                      , Year = grouped.Key.Year
                      , Value = grouped.Sum(d => d.Quantity * d.Price) };

This gives me the data aggregated as I need, but obviously excludes any years where none of the instances of MyData contain a matching Expiration.

I can't figure out how to modify my query to get the data I need. Once I get the missing years included, my aggregation breaks down: I effectively end up with the sum of Price * Quantity for a Name across all years, rather than on a year-by-year basis.

Answer 1:

LINQ does support a left join, although the syntax is not obvious: join into a group, then select from that group with DefaultIfEmpty(). Your query would look like this:

var query = from year in years
            join data in myData
                on year equals (data.Expiration.Year - DateTime.Now.Year) + 1
                into year_data
            from y_d in year_data.DefaultIfEmpty()
            group y_d by new
            {
                Year = year,
                Name = y_d == null ? "" : y_d.Name
            }
            into grouped
            select new
            {
                Name = grouped.Key.Name,
                Year = grouped.Key.Year,
                Value = grouped.Sum(d => d == null ? 0 : d.Quantity * d.Price)
            };
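
To see what shape this left join produces, here is a minimal, self-contained sketch. The MyData constructor, the LeftJoinDemo class, the sample rows, and the thisYear parameter (standing in for DateTime.Now.Year) are assumptions added for illustration; they are not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyData
{
    // Hypothetical constructor: the question only shows private setters.
    public MyData(DateTime expiration, string name, double price, double quantity)
    {
        Expiration = expiration;
        Name = name;
        Price = price;
        Quantity = quantity;
    }

    public DateTime Expiration { get; private set; }
    public string Name { get; private set; }
    public double Price { get; private set; }
    public double Quantity { get; private set; }
}

public static class LeftJoinDemo
{
    // Runs the left join and returns one (Name, Year, Value) row per group.
    public static List<(string Name, int Year, double Value)> Run(
        IEnumerable<MyData> myData, IEnumerable<int> years, int thisYear)
    {
        var query = from year in years
                    join data in myData
                        on year equals (data.Expiration.Year - thisYear) + 1
                        into year_data
                    // DefaultIfEmpty() yields a single null for years with no match.
                    from y_d in year_data.DefaultIfEmpty()
                    group y_d by new { Year = year, Name = y_d == null ? "" : y_d.Name }
                    into grouped
                    select (Name: grouped.Key.Name,
                            Year: grouped.Key.Year,
                            Value: grouped.Sum(d => d == null ? 0 : d.Quantity * d.Price));
        return query.ToList();
    }

    public static void Main()
    {
        int thisYear = DateTime.Now.Year;
        // Sample data standing in for GetData().
        var sample = new List<MyData>
        {
            new MyData(new DateTime(thisYear, 6, 1), "A", 2.0, 3.0),
            new MyData(new DateTime(thisYear, 9, 1), "A", 1.0, 4.0),
            new MyData(new DateTime(thisYear + 2, 6, 1), "B", 1.5, 4.0),
        };

        foreach (var row in Run(sample, Enumerable.Range(1, 3), thisYear))
            Console.WriteLine($"Name='{row.Name}' Year={row.Year} Value={row.Value}");
    }
}
```

With this sample, year 2 has no matching data, so it shows up as a single row with an empty Name and a Value of 0. Every year is therefore present in the result, but a Name only gets a row for the years in which it has data.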


Answer 2:

You can also take a different approach: instead of joining, select all years first, then filter the data while computing the aggregate for each name and year. Like this:

var query = from year in years select new
{ 
    Year = year, 
    Items = myData
        .GroupBy(i => i.Name)
        .Select(grouped => new 
        { 
            Name = grouped.Key, 
            Aggregate = grouped
                .Where(d => (d.Expiration.Year - DateTime.Now.Year) + 1 == year)
                .Sum(d => d.Price * d.Quantity)
        })
};
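
The same filter-while-aggregating idea can be arranged name-first, which is closer to the grid described in the question (one row per Name, one column per year). The following is only a sketch under assumptions: the MyData constructor, the GridDemo class, the yearCount and thisYear parameters, and the sample rows are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyData
{
    // Hypothetical constructor: the question only shows private setters.
    public MyData(DateTime expiration, string name, double price, double quantity)
    {
        Expiration = expiration;
        Name = name;
        Price = price;
        Quantity = quantity;
    }

    public DateTime Expiration { get; private set; }
    public string Name { get; private set; }
    public double Price { get; private set; }
    public double Quantity { get; private set; }
}

public static class GridDemo
{
    // Builds Name -> array of yearly totals, where index 0 holds year 1.
    public static Dictionary<string, double[]> Build(
        IEnumerable<MyData> myData, int yearCount, int thisYear)
    {
        return myData
            .GroupBy(d => d.Name)
            .ToDictionary(
                g => g.Key,
                g => Enumerable.Range(1, yearCount)
                    // Sum over an empty sequence of doubles is 0, so years
                    // without data still get a cell.
                    .Select(year => g
                        .Where(d => (d.Expiration.Year - thisYear) + 1 == year)
                        .Sum(d => d.Price * d.Quantity))
                    .ToArray());
    }

    public static void Main()
    {
        int thisYear = DateTime.Now.Year;
        // Sample data standing in for GetData().
        var sample = new List<MyData>
        {
            new MyData(new DateTime(thisYear, 6, 1), "A", 2.0, 3.0),
            new MyData(new DateTime(thisYear, 9, 1), "A", 1.0, 4.0),
            new MyData(new DateTime(thisYear + 2, 6, 1), "B", 1.5, 4.0),
        };

        foreach (var row in Build(sample, 3, thisYear))
            Console.WriteLine(row.Key + ": " + string.Join(", ", row.Value));
    }
}
```

Like the query above, this rescans each group once per year, which is fine for a small grid but may be worth revisiting for large inputs.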


Answer 3:

If you can't solve the problem, replace it.

Build dummy MyData entries for each name and year, with Quantity = 0 and Price = 0.

Put them in a list and replace this line:

var query = from data in myData

with this line:

var query = from data in myData.Concat(dummyList)
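
One way dummyList might be built is sketched below. It assumes MyData has a constructor (the question only shows private setters), and the DummyDemo class, the thisYear parameter, and the sample rows are likewise hypothetical. Each placeholder expires on 1 January of the target year so that it maps onto the same year index as real data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyData
{
    // Hypothetical constructor: the question only shows private setters.
    public MyData(DateTime expiration, string name, double price, double quantity)
    {
        Expiration = expiration;
        Name = name;
        Price = price;
        Quantity = quantity;
    }

    public DateTime Expiration { get; private set; }
    public string Name { get; private set; }
    public double Price { get; private set; }
    public double Quantity { get; private set; }
}

public static class DummyDemo
{
    public static List<(string Name, int Year, double Value)> Run(
        List<MyData> myData, IEnumerable<int> years, int thisYear)
    {
        var yearList = years.ToList();

        // One zero-valued placeholder per (name, year) pair.
        var dummyList = (from name in myData.Select(d => d.Name).Distinct()
                         from year in yearList
                         select new MyData(new DateTime(thisYear + year - 1, 1, 1), name, 0, 0))
                        .ToList();

        // The original inner join now finds at least one match for every pair;
        // the placeholders add 0 to each sum.
        var query = from data in myData.Concat(dummyList)
                    join year in yearList on (data.Expiration.Year - thisYear) + 1 equals year
                    group data by new { Year = year, Name = data.Name } into grouped
                    select (Name: grouped.Key.Name,
                            Year: grouped.Key.Year,
                            Value: grouped.Sum(d => d.Quantity * d.Price));
        return query.ToList();
    }

    public static void Main()
    {
        int thisYear = DateTime.Now.Year;
        // Sample data standing in for GetData().
        var sample = new List<MyData>
        {
            new MyData(new DateTime(thisYear, 6, 1), "A", 2.0, 3.0),
            new MyData(new DateTime(thisYear + 2, 6, 1), "B", 1.5, 4.0),
        };

        foreach (var row in Run(sample, Enumerable.Range(1, 3), thisYear))
            Console.WriteLine($"{row.Name} {row.Year} {row.Value}");
    }
}
```

With two names and three years, this returns six rows, one per (Name, year) cell, with 0 wherever there is no real data.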