I was toying around with some of the LINQ samples that come with LINQPad. In the "C# 3.0 in a Nutshell" folder, under Chapter 9 - Grouping, there is a sample called "Grouping by Multiple Keys". It contains the following query:
from n in new[] { "Tom", "Dick", "Harry", "Mary", "Jay" }.AsQueryable()
group n by new
{
FirstLetter = n[0],
Length = n.Length
}
I added the string "Jon" to the end of the array to get an actual grouping, and came up with the following result:
This was exactly what I was expecting. Then, in LINQPad, I went to the VB.NET version of the same query:
' Manually added "Jon"
from n in new string() { "Tom", "Dick", "Harry", "Mary", "Jay", "Jon" }.AsQueryable() _
group by ng = new with _
{ _
.FirstLetter = n(0), _
.Length = n.Length _
} into group
The result does not properly group Jay/Jon together.
After pulling my hair out for a bit, I discovered this MSDN article discussing VB.NET anonymous types. In VB.NET they are mutable by default, as opposed to C#, where they are immutable. In VB, you need to add the Key keyword to make them immutable. So, I changed the query to this (notice the addition of Key):
from n in new string() { "Tom", "Dick", "Harry", "Mary", "Jay", "Jon" }.AsQueryable() _
group by ng = new with _
{ _
Key .FirstLetter = n(0), _
Key .Length = n.Length _
} into group
This gave me the correct result:
So my question is this:
- Why does mutability/immutability of anonymous types matter when LINQ does an equality comparison? Notably, in LINQ to SQL it doesn't matter at all, which is likely just a product of the translation to SQL. But in LINQ to Objects it apparently makes all the difference.
- Why would MS have chosen to make VB's anonymous types mutable? I see no real advantage, and after mucking around with this issue I see a very real disadvantage: your LINQ queries can have subtle bugs.
-- EDIT --
Just an interesting extra piece of info... Apparently this keyed property issue is widely known. I just didn't know what to Google for. It's been discussed here and here on Stack Overflow. Here's another example of the issue using anonymous types and Distinct:
Dim items = New String() {"a", "b", "b", "c", "c", "c"}
Dim result = items.Select(Function(x) New With {.MyValue = x}).Distinct()
Dim result2 = items.Select(Function(x) New With {Key .MyValue = x}).Distinct()
'Debug.Assert(result.Count() = 3) ' Nope... it's 6!
Debug.Assert(result2.Count() = 3)
The Key modifier doesn't just affect mutability - it also affects the behaviour of Equals and GetHashCode. Only Key properties are included in those calculations... which clearly affects grouping etc.
As for why it's different for VB - I don't know. It seems odd to me too. I know I'm glad that C# works the way it does though :) Even if it could be argued that making properties optionally mutable makes sense, I don't see why it should be the default.