I have a List<> of objects in C# and I need a way to return those objects that are considered duplicates within the list. I do not need the Distinct resultset, I need a list of those items that I will be deleting from my repository.
For the sake of this example, let's say I have a list of "Car" types and I need to know which of these cars are the same color as another in the list. Here are the cars in the list and their color property:
Car1.Color = Red;
Car2.Color = Blue;
Car3.Color = Green;
Car4.Color = Red;
Car5.Color = Red;
For this example I need the result (IEnumerable<>, List<>, or whatever) to contain Car4 and Car5 because I want to delete these from my repository or db so that I only have one car per color in my repository. Any help would be appreciated.
Here's a slightly different Linq solution that I think makes it more obvious what you're trying to do:
It's just grouping cars by color, tossing out all the groups that have more than one element, and then putting the rest into the returned IEnumerable.
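For illustration, a query matching that description might look roughly like the sketch below. The `Car` record, the `Color` enum and the method name are assumptions made for the example, not something taken from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for the example
public enum Color { Red, Blue, Green }
public sealed record Car(string Name, Color Color);

public static class UniqueColorQuery
{
    // Groups cars by color and keeps only the groups that contain exactly one car.
    public static IEnumerable<Car> CarsWithUniqueColor(IEnumerable<Car> cars) =>
        from car in cars
        group car by car.Color into g
        where g.Count() == 1
        select g.Single();
}
```

Note that for the five cars in the question this yields Car2 and Car3 (the cars whose color appears only once), not Car4 and Car5, so it would need adjusting if you want the items to delete.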
Without actually coding it, how about an algorithm something like this:
Iterate through your List<T>, creating a Dictionary<T, int> that counts how often each item occurs.
Iterate through your Dictionary<T, int>, deleting entries where the int is 1.
Anything left in the Dictionary has duplicates. The second part, where you actually delete, is optional, of course. You can just iterate through the Dictionary and look for the >1's to take action.
EDIT: OK, I bumped up Ryan's since he actually gave you code. ;)
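In case it helps, the counting idea could be sketched like this. The class and method names are made up for the example, and it assumes T has sensible equality (Equals/GetHashCode) so it can be used as a dictionary key.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCounter
{
    // Counts how often each item occurs, then drops the items seen only once.
    public static Dictionary<T, int> CountDuplicates<T>(IEnumerable<T> items) where T : notnull
    {
        var counts = new Dictionary<T, int>();
        foreach (var item in items)
        {
            counts.TryGetValue(item, out var seen); // seen stays 0 for a new item
            counts[item] = seen + 1;
        }

        // Collect the keys first so the dictionary is not modified while enumerating it.
        foreach (var key in counts.Where(p => p.Value == 1).Select(p => p.Key).ToList())
        {
            counts.Remove(key);
        }
        return counts; // every remaining entry has a count greater than 1
    }
}
```

For cars you would count by color (for example `CountDuplicates(cars.Select(c => c.Color))`), which tells you which colors are duplicated but not which individual cars to remove.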
public static IQueryable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source) where TSource : IComparable {
}
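The body of that method isn't shown, so the following is only a guess at one way to fill it in: keep every element that compares equal to at least one other element. The class name and the whole body are assumptions for the sake of the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateExtensions
{
    // Returns every element that compares equal to at least one other element.
    // This is O(n^2) and returns all copies, including the first occurrence.
    public static IQueryable<TSource> Duplicates<TSource>(this IEnumerable<TSource> source)
        where TSource : IComparable
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        var items = source.ToList(); // materialize once instead of re-enumerating the source
        return items
            .Where(x => items.Count(y => y.CompareTo(x) == 0) > 1)
            .AsQueryable();
    }
}
```

Because this version returns every copy, you would still have to decide which one of each set to keep before deleting anything.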
Create a new Dictionary<Color, Car> foundColors and a List<Car> carsToDelete.
Then you iterate through your original list of cars like so:
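A minimal sketch of such a loop, assuming a `Car` type with `Name` and `Color` properties (the record, the enum and the wrapper method are hypothetical):

```csharp
using System.Collections.Generic;

// Hypothetical types for the example
public enum Color { Red, Blue, Green }
public sealed record Car(string Name, Color Color);

public static class CarDeduper
{
    public static List<Car> FindCarsToDelete(IEnumerable<Car> cars)
    {
        var foundColors = new Dictionary<Color, Car>();
        var carsToDelete = new List<Car>();

        foreach (var car in cars)
        {
            if (foundColors.ContainsKey(car.Color))
            {
                carsToDelete.Add(car);           // color already seen: this car is a duplicate
            }
            else
            {
                foundColors.Add(car.Color, car); // first car of this color: keep it
            }
        }
        return carsToDelete;
    }
}
```

With the five cars from the question, carsToDelete ends up holding Car4 and Car5.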
Then you can delete every car that's in carsToDelete.
You could get a minor performance boost by putting your "delete record" logic in the if statement instead of creating a new list, but the way you worded the question suggested that you needed to collect them in a List.
I inadvertently coded this yesterday, when I was trying to write a "distinct by a projection". I included a ! when I shouldn't have, but this time it's just right:
You'd then call it with:
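As a rough sketch of what a "duplicates by a projection" helper could look like: the name `DuplicatesBy` and the containing class are assumptions here, and the `!` is the negated result of `HashSet<T>.Add`, which returns false when the key was already in the set.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // Yields every element whose key has already been seen earlier in the sequence.
    public static IEnumerable<TSource> DuplicatesBy<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // Add returns false when the key is already present, i.e. a duplicate.
            if (!seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
```

A call would then be something like `var duplicates = cars.DuplicatesBy(car => car.Color);`, which keeps the first car of each color and returns the later ones.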
This groups the cars by color and then skips the first result from each group, returning the remainder from each group flattened into a single sequence.
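That description could translate to something like the following, again assuming a hypothetical `Car` record and method name:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for the example
public enum Color { Red, Blue, Green }
public sealed record Car(string Name, Color Color);

public static class CarQueries
{
    // Groups by color, skips the first car of each group, and flattens the rest.
    public static IEnumerable<Car> ExtraCarsPerColor(IEnumerable<Car> cars) =>
        cars.GroupBy(car => car.Color)
            .SelectMany(g => g.Skip(1));
}
```

GroupBy preserves the order in which elements first appear, so for the question's list this returns Car4 and Car5.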
If you have particular requirements about which one you want to keep, e.g. if the car has an Id property and you want to keep the car with the lowest Id, then you could add some ordering in there, e.g.
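One way that ordering might be added, sketched with a hypothetical `Car` record that has an `Id` property:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for the example
public enum Color { Red, Blue, Green }
public sealed record Car(int Id, string Name, Color Color);

public static class OrderedCarQueries
{
    // Within each color group, keeps the car with the lowest Id and returns the others.
    public static IEnumerable<Car> ExtraCarsKeepingLowestId(IEnumerable<Car> cars) =>
        cars.GroupBy(car => car.Color)
            .SelectMany(g => g.OrderBy(car => car.Id).Skip(1));
}
```

Swapping OrderBy for OrderByDescending would keep the car with the highest Id instead.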