How to get duplicate items from a list using LINQ?

2019-01-04 07:12发布

I'm having a List<string> like:

List<String> list = new List<String>{"6","1","2","4","6","5","1"};

I need to get the duplicate items in the list into a new list. Now I'm using a nested for loop to do this.

The resulting list will contain {"6","1"}.

Is there any idea to do this using LINQ or lambda expressions?

9条回答
够拽才男人
2楼-- · 2019-01-04 08:01

I wrote this extension method based off @Lee's response to the OP. Note, a default parameter was used (requiring C# 4.0). However, an overloaded method call in C# 3.0 would suffice.

/// <summary>
/// Method that returns all the duplicates (distinct) in the collection.
/// </summary>
/// <typeparam name="T">The type of the collection.</typeparam>
/// <param name="source">The source collection to detect for duplicates</param>
/// <param name="distinct">Specify <b>true</b> to only return distinct elements.</param>
/// <returns>A distinct list of duplicates found in the source collection.</returns>
/// <remarks>This is an extension method to IEnumerable&lt;T&gt;</remarks>
public static IEnumerable<T> Duplicates<T>
         (this IEnumerable<T> source, bool distinct = true)
{
     if (source == null)
     {
        throw new ArgumentNullException("source");
     }

     // select the elements that are repeated
     IEnumerable<T> result = source.GroupBy(a => a).SelectMany(a => a.Skip(1));

     // distinct?
     if (distinct == true)
     {
        // deferred execution helps us here
        result = result.Distinct();
     }

     return result;
}
查看更多
祖国的老花朵
3楼-- · 2019-01-04 08:03

Hope this wil help

int[] listOfItems = new[] { 4, 2, 3, 1, 6, 4, 3 };

var duplicates = listOfItems 
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key);

foreach (var d in duplicates)
    Console.WriteLine(d);
查看更多
别忘想泡老子
4楼-- · 2019-01-04 08:05

Here is one way to do it:

List<String> duplicates = lst.GroupBy(x => x)
                             .Where(g => g.Count() > 1)
                             .Select(g => g.Key)
                             .ToList();

The GroupBy groups the elements that are the same together, and the Where filters out those that only appear once, leaving you with only the duplicates.

查看更多
冷血范
5楼-- · 2019-01-04 08:05
  List<String> list = new List<String> { "6", "1", "2", "4", "6", "5", "1" };

    var q = from s in list
            group s by s into g
            where g.Count() > 1
            select g.First();

    foreach (var item in q)
    {
        Console.WriteLine(item);

    }
查看更多
我想做一个坏孩纸
6楼-- · 2019-01-04 08:09
var duplicates = lst.GroupBy(s => s)
    .SelectMany(grp => grp.Skip(1));

Note that this will return all duplicates, so if you only want to know which items are duplicated in the source list, you could apply Distinct to the resulting sequence or use the solution given by Mark Byers.

查看更多
倾城 Initia
7楼-- · 2019-01-04 08:10

All mentioned solutions until now perform a GroupBy. Even if I only need the first Duplicate all elements of the collections are enumerated at least once.

The following extension function stops enumerating as soon as a duplicate has been found. It continues if a next duplicate is requested.

As always in LINQ there are two versions, one with IEqualityComparer and one without it.

public static IEnumerable<TSource> ExtractDuplicates(this IEnumerable<TSource> source)
{
    return source.ExtractDuplicates(null);
}
public static IEnumerable<TSource> ExtractDuplicates(this IEnumerable<TSource source,
    IEqualityComparer<TSource> comparer);
{
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (comparer == null)
        comparer = EqualityCompare<TSource>.Default;

    HashSet<TSource> foundElements = new HashSet<TSource>(comparer);
    foreach (TSource sourceItem in source)
    {
        if (!foundElements.Contains(sourceItem))
        {   // we've not seen this sourceItem before. Add to the foundElements
            foundElements.Add(sourceItem);
        }
        else
        {   // we've seen this item before. It is a duplicate!
            yield return sourceItem;
        }
    }
}

Usage:

IEnumerable<MyClass> myObjects = ...

// check if has duplicates:
bool hasDuplicates = myObjects.ExtractDuplicates().Any();

// or find the first three duplicates:
IEnumerable<MyClass> first3Duplicates = myObjects.ExtractDuplicates().Take(3)

// or find the first 5 duplicates that have a Name = "MyName"
IEnumerable<MyClass> myNameDuplicates = myObjects.ExtractDuplicates()
    .Where(duplicate => duplicate.Name == "MyName")
    .Take(5);

For all these linq statements the collection is only parsed until the requested items are found. The rest of the sequence is not interpreted.

IMHO that is an efficiency boost to consider.

查看更多
登录 后发表回答