Checking for duplicates in a complex object using

Posted 2020-04-07 19:12

Question:

I've just started learning LINQ and lambda expressions, and they seem like a good fit for finding duplicates in a complex object collection, but I'm getting a little confused and hope someone can help put me back on the path to happy coding.

My object is structured like list.list.uniqueCustomerIdentifier

I need to ensure there are no duplicate uniqueCustomerIdentifier values anywhere within the complex object. If there are duplicates, I need to identify which identifiers are duplicated and return them as a list.

Answer 1:

  • Unpack the hierarchy
  • Project each element to its uniqueID property
  • Group the IDs
  • Keep only the groups that have more than one element
  • Project each group to the group's key (back to uniqueID)
  • Enumerate the query and store the result in a list.

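var result =
  myList
    .SelectMany(x => x.InnerList)              // unpack the hierarchy
    .Select(y => y.uniqueCustomerIdentifier)   // project each element to its ID
    .GroupBy(id => id)                         // group equal IDs together
    .Where(g => g.Skip(1).Any())               // keep groups with more than one element
    .Select(g => g.Key)                        // project each group back to its ID
    .ToList();                                 // enumerate the query into a list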
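For anyone who wants to run the query above end to end, here is a minimal sketch. The class names Customer, CustomerGroup and DuplicateFinder are hypothetical, and the identifier is assumed to be a string; the question only gives the property name uniqueCustomerIdentifier, and InnerList comes from the answer's own snippet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: only uniqueCustomerIdentifier and InnerList appear in the thread.
public class Customer
{
    public string uniqueCustomerIdentifier { get; set; }
}

public class CustomerGroup
{
    public List<Customer> InnerList { get; set; } = new List<Customer>();
}

public static class DuplicateFinder
{
    // Returns each identifier that occurs more than once across all inner lists.
    public static List<string> FindDuplicateIds(IEnumerable<CustomerGroup> groups)
    {
        return groups
            .SelectMany(g => g.InnerList)            // flatten outer list -> customers
            .Select(c => c.uniqueCustomerIdentifier) // customers -> identifiers
            .GroupBy(id => id)                       // one group per distinct identifier
            .Where(g => g.Skip(1).Any())             // group has a second element
            .Select(g => g.Key)
            .ToList();
    }

    public static void Main()
    {
        var myList = new List<CustomerGroup>
        {
            new CustomerGroup { InnerList = {
                new Customer { uniqueCustomerIdentifier = "A1" },
                new Customer { uniqueCustomerIdentifier = "B2" } } },
            new CustomerGroup { InnerList = {
                new Customer { uniqueCustomerIdentifier = "A1" },
                new Customer { uniqueCustomerIdentifier = "C3" } } },
        };

        Console.WriteLine(string.Join(", ", FindDuplicateIds(myList))); // prints "A1"
    }
}
```

Skip(1).Any() stops as soon as it finds a second element, so it avoids counting an entire group the way Count() > 1 would.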


Answer 2:

LINQ has a Distinct() operator that filters a sequence down to its distinct elements, which is enough if you only want the ids. If your class overrides Equals (together with GetHashCode), or you have an IEqualityComparer, you can call the Distinct extension method directly to get the unique results from the list. As an added bonus, you can also use the Union and Intersect methods to merge or filter between two lists.
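Note that Distinct() removes duplicates rather than listing them, so on its own it answers "are there any duplicates?" rather than "which ones?". A minimal sketch of that check, assuming the ids have already been extracted into a collection of strings (DistinctCheck and HasDuplicates are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctCheck
{
    // True when at least one value appears more than once:
    // removing duplicates then shrinks the collection.
    public static bool HasDuplicates(IReadOnlyCollection<string> ids)
    {
        return ids.Distinct().Count() != ids.Count;
    }

    public static void Main()
    {
        var ids = new List<string> { "A1", "B2", "A1", "C3" };
        Console.WriteLine(HasDuplicates(ids));       // True
        Console.WriteLine(ids.Distinct().Count());   // 3
    }
}
```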

Another option would be to group by the id and then select the first element.

var results = from item in list
              group item by item.id into g
              select g.First();
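The query above keeps one item per id, so it deduplicates the list. If the goal is instead to get back the items that share an id, the same grouping can be filtered before it is flattened again. The sketch below shows both variants; the Item class and its name property are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only the id property appears in the answer.
public class Item
{
    public string id { get; set; }
    public string name { get; set; }
}

public static class GroupingDemo
{
    // One representative per id: duplicates are dropped.
    public static List<Item> FirstPerId(IEnumerable<Item> list)
    {
        return (from item in list
                group item by item.id into g
                select g.First()).ToList();
    }

    // Every item whose id is shared with at least one other item.
    public static List<Item> ItemsWithSharedId(IEnumerable<Item> list)
    {
        return (from item in list
                group item by item.id into g
                where g.Count() > 1
                from dup in g
                select dup).ToList();
    }

    public static void Main()
    {
        var list = new List<Item>
        {
            new Item { id = "1", name = "Ann" },
            new Item { id = "2", name = "Bob" },
            new Item { id = "1", name = "Cat" },
        };

        Console.WriteLine(FirstPerId(list).Count);         // 2
        Console.WriteLine(ItemsWithSharedId(list).Count);  // 2
    }
}
```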


Answer 3:

If you want to flatten the two list hierarchies, use the SelectMany method to flatten an IEnumerable<IEnumerable<T>> into IEnumerable<T>.
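As a small illustration of the flattening step on its own, here is a sketch using nested lists of integers (FlattenDemo and Flatten are hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    // Collapses a sequence of sequences into a single list, in order.
    public static List<T> Flatten<T>(IEnumerable<IEnumerable<T>> nested)
    {
        return nested.SelectMany(inner => inner).ToList();
    }

    public static void Main()
    {
        var nested = new List<List<int>>
        {
            new List<int> { 1, 2 },
            new List<int> { 3 },
            new List<int>(),          // empty inner lists contribute nothing
        };

        Console.WriteLine(string.Join(", ", Flatten<int>(nested))); // prints "1, 2, 3"
    }
}
```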