I have a data set of values that I want to summarise in groups. For each group, I want to create an array big enough to contain the values of the largest group. When a group contains less than this maximum number, I want to insert a default value of zero for the empty key values.
Dataset
Col1 Col2 Value
--------------------
A X 10
A Z 15
B X 9
B Y 12
B Z 6
Desired result
X, [10, 9]
Y, [0, 12]
Z, [15, 6]
Note that value "A" in Col1 in the dataset has no value for "Y" in Col2. Value "A" is first group in the outer series, therefore it is the first element that is missing.
The following query creates the result dataset, but does not insert the default zero values for the Y group.
result = data.GroupBy(item => item.Col2)
.Select(group => new
{
name = group.Key,
data = group.Select(item => item.Value)
.ToArray()
})
Actual result
X, [10, 9]
Y, [12]
Z, [15, 6]
What do I need to do to insert a zero as the missing group value?
It won't be pretty, but you can do something like this:
Here is how I understand it.
Let say we have this
First we need to determine the "fixed" part
Then we can process with the normal grouping, but inside the group we will left join the
columns
with group elements which will allow us to achieve the desired behaviorYou can do it like this:-
Code Explanation:-
Since you need to append
default zeros
in case when you have less items in any group, I am storing the maxCount (which any group can produce in a variablemaxCount
) for this I am ordering the items in descending order. Next I am storing the maximum count which the item can producr inmaxCount
variable. While projecting I am simply checking if number of items in the group is not equal tomaxCount
then create aninteger array
of size (maxCount - x.Count) i.e. maximum count minus number of items in current group and appending it to the array.Working Fiddle.