I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently.
I've read the other thread about Enumerable.SequenceEqual, but it's not exactly what I'm looking for.
In my case, two collections would be equal if they both contain the same items (no matter the order).
Example:
collection1 = {1, 2, 3, 4};
collection2 = {2, 4, 1, 3};
collection1 == collection2; // true
What I usually do is to loop through each item of one collection and see if it exists in the other collection, then loop through each item of the other collection and see if it exists in the first collection. (I start by comparing the lengths).
if (collection1.Count != collection2.Count)
return false; // the collections are not equal
foreach (Item item in collection1)
{
if (!collection2.Contains(item))
return false; // the collections are not equal
}
foreach (Item item in collection2)
{
if (!collection1.Contains(item))
return false; // the collections are not equal
}
return true; // the collections are equal
However, this is not entirely correct, and it's probably not the most efficient way to do compare two collections for equality.
An example I can think of that would be wrong is:
collection1 = {1, 2, 3, 3, 4}
collection2 = {1, 2, 2, 3, 4}
Which would be equal with my implementation. Should I just count the number of times each item is found and make sure the counts are equal in both collections?
The examples are in some sort of C# (let's call it pseudo-C#), but give your answer in whatever language you wish, it does not matter.
Note: I used integers in the examples for simplicity, but I want to be able to use reference-type objects too (they do not behave correctly as keys because only the reference of the object is compared, not the content).
In the case of no repeats and no order, the following EqualityComparer can be used to allow collections as dictionary keys:
Here is the ToHashSet() implementation I used. The hash code algorithm comes from Effective Java (by way of Jon Skeet).
This simple solution forces the
IEnumerable
's generic type to implementIComparable
. Because ofOrderBy
's definition.If you don't want to make such an assumption but still want use this solution, you can use the following piece of code :
erickson is almost right: since you want to match on counts of duplicates, you want a Bag. In Java, this looks something like:
I'm sure C# has a built-in Set implementation. I would use that first; if performance is a problem, you could always use a different Set implementation, but use the same Set interface.
You could use a Hashset. Look at the SetEquals method.
Here is a solution which is an improvement over this one.
Here's my extension method variant of ohadsc's answer, in case it's useful to someone