Default values for empty groups in Linq GroupBy qu

Posted 2019-04-14 00:07

I have a data set of values that I want to summarise in groups. For each group, I want to create an array big enough to contain the values of the largest group. When a group contains fewer values than this maximum, I want to insert a default value of zero for each missing key.

Dataset

Col1    Col2    Value
--------------------
A       X       10
A       Z       15
B       X       9
B       Y       12
B       Z       6

Desired result

X, [10, 9]
Y, [0, 12]
Z, [15, 6]

Note that value "A" in Col1 has no row for "Y" in Col2. Because "A" is the first group in the outer series, it is the first element of the Y array that is missing.

The following query creates the result dataset, but does not insert the default zero values for the Y group.

var result = data.GroupBy(item => item.Col2)
                 .Select(group => new
                 {
                     name = group.Key,
                     data = group.Select(item => item.Value)
                                 .ToArray()
                 });

Actual result

X, [10, 9]
Y, [12]
Z, [15, 6]

What do I need to do to insert a zero as the missing group value?

3 answers
乱世女痞
#2 · 2019-04-14 00:47

It won't be pretty, but you can do something like this:

var groups = data.GroupBy(d => d.Col2, d => d.Value)
                 .Select(g => new { g, count = g.Count() })
                 .ToList();
int maxG = groups.Max(p => p.count);
var paddedGroups = groups.Select(p => new {
                     name = p.g.Key,
                     data = p.g.Concat(Enumerable.Repeat(0, maxG - p.count)).ToArray() });
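
One thing worth checking with this approach is where the padding ends up. The sketch below runs the same query against the question's sample rows. The tuple-based `data` array and the `int` value type are assumptions, since the question does not show the element type. Because the zeros are appended with Concat, the Y group comes out as [12, 0] rather than the [0, 12] in the desired result.

```csharp
using System;
using System.Linq;

// Sample rows from the question, modelled as tuples (an assumption; the original type is not shown).
var data = new[]
{
    (Col1: "A", Col2: "X", Value: 10),
    (Col1: "A", Col2: "Z", Value: 15),
    (Col1: "B", Col2: "X", Value: 9),
    (Col1: "B", Col2: "Y", Value: 12),
    (Col1: "B", Col2: "Z", Value: 6),
};

// Group the values by Col2 and remember each group's size.
var groups = data.GroupBy(d => d.Col2, d => d.Value)
                 .Select(g => new { g, count = g.Count() })
                 .ToList();
int maxG = groups.Max(p => p.count);

// Pad every group at the end with zeros up to the size of the largest group.
var paddedGroups = groups
    .Select(p => new
    {
        name = p.g.Key,
        data = p.g.Concat(Enumerable.Repeat(0, maxG - p.count)).ToArray()
    })
    .ToList();

foreach (var p in paddedGroups)
    Console.WriteLine($"{p.name}, [{string.Join(", ", p.data)}]");
```

If the position of each value has to line up with its Col1 key, padding by count alone is not enough; the values need to be matched against the list of Col1 keys.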
Rolldiameter
#3 · 2019-04-14 01:08

Here is how I understand it.

Let's say we have this:

class Data
{
    public string Col1, Col2;
    public decimal Value;
}

Data[] source =
{
    new Data { Col1="A", Col2 = "X", Value = 10 },
    new Data { Col1="A", Col2 = "Z", Value = 15 },
    new Data { Col1="B", Col2 = "X", Value = 9 },
    new Data { Col1="B", Col2 = "Y", Value = 12 },
    new Data { Col1="B", Col2 = "Z", Value = 6 },
};

First we need to determine the "fixed" part

var columns = source.Select(e => e.Col1).Distinct().OrderBy(c => c).ToList();

Then we can proceed with the normal grouping, but inside each group we left join the columns with the group's elements. Columns that have no matching element yield null, which we replace with the default value of 0.

var result = source.GroupBy(e => e.Col2, (key, elements) => new
{
    Key = key,
    Elements = (from c in columns
             join e in elements on c equals e.Col1 into g
             from e in g.DefaultIfEmpty()
             select e != null ? e.Value : 0).ToList()
})
.OrderBy(e => e.Key)
.ToList();
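
The same positional fill can also be sketched with a per-group dictionary instead of a join. This is a variation on the query above, not a replacement for it. It assumes at most one row per (Col1, Col2) pair, since ToDictionary throws on duplicate keys, and it models the rows as tuples rather than the Data class to stay self-contained.

```csharp
using System;
using System.Linq;

// Same sample rows as above, modelled as tuples (an assumption made to keep the sketch short).
var source = new[]
{
    (Col1: "A", Col2: "X", Value: 10m),
    (Col1: "A", Col2: "Z", Value: 15m),
    (Col1: "B", Col2: "X", Value: 9m),
    (Col1: "B", Col2: "Y", Value: 12m),
    (Col1: "B", Col2: "Z", Value: 6m),
};

// The "fixed" part: every distinct Col1 key, in a stable order.
var columns = source.Select(e => e.Col1).Distinct().OrderBy(c => c).ToList();

var result = source
    .GroupBy(e => e.Col2)
    .OrderBy(g => g.Key)
    .Select(g =>
    {
        // Col1 -> Value for this group; throws if a Col1 key repeats within the group.
        var byCol1 = g.ToDictionary(e => e.Col1, e => e.Value);
        return new
        {
            Key = g.Key,
            // One slot per column; missing keys fall back to zero.
            Elements = columns.Select(c => byCol1.TryGetValue(c, out var v) ? v : 0m).ToList()
        };
    })
    .ToList();

foreach (var r in result)
    Console.WriteLine($"{r.Key}, [{string.Join(", ", r.Elements)}]");
```

With the sample data, the Y group has no entry for "A", so its first slot is filled with zero and the group maps to [0, 12].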
时光不老,我们不散
#4 · 2019-04-14 01:09

You can do it like this:

int maxCount = 0;
var result = data.GroupBy(x => x.Col2)
             .OrderByDescending(x => x.Count())
             .Select(x => 
                {
                   if (maxCount == 0)
                       maxCount = x.Count();
                   var Value = x.Select(z => z.Value);
                   return new 
                   {
                      name = x.Key,
                      data = maxCount == x.Count() ? Value.ToArray() : 
                                 Value.Concat(new int[maxCount - Value.Count()]).ToArray()
                   };
                });

Code explanation:

Since you need to append default zeros when a group has fewer items than the largest one, I order the groups by item count in descending order, so the first group projected is the largest, and I store its count in the maxCount variable. While projecting, I check whether the number of items in the current group equals maxCount. If it does not, I create an integer array of size maxCount - x.Count() (the maximum count minus the number of items in the current group), which is zero-filled by default, and append it to the group's values. Note that the zeros are added at the end of the array and that the groups come out ordered by size, not by key.

Working Fiddle.
