Fastest way to implement Duplicate Character Remov

In C#, what is the fastest way to detect duplicate characters in a String and remove them (removal including 1st instance of the duplicated character)?

Example Input: nbHHkRvrXbvkn

Example Output: RrX

标签： c# .net string

5条回答

放我归山

2楼-- · 2020-02-08 09:39

Here is a pretty fast one preserving order. But I'd be a little worried about how LINQ does Group and Where:

string s = "nbHHkRvrXbvkn";
Console.WriteLine( 
    s.ToCharArray()
        .GroupBy(c => c)
        .Where(g => g.Count() == 1)
        .Aggregate(new StringBuilder(), (b, g) => b.Append(g.Key)));

Edit: This one beats Luke's in some cases still slower than dtb's, but it preserves the order

private static string MyMethod(string s)
{
    StringBuilder sb = new StringBuilder(s.Length);
    foreach (var g in s.ToCharArray().GroupBy(c => c))
        if (g.Count() == 1) sb.Append(g.Key);

    return sb.ToString();
}

0人赞添加讨论(0) 举报

女痞

3楼-- · 2020-02-08 09:42

Fastest as in fewest-lines-of-code:

var s = "nbHHkRvrXbvkn";
var duplicates = s.Where(ch => s.Count(c => c == ch) > 1);
var result = new string(s.Except(duplicates).ToArray()); // = "RrX"

Fastest as in fastest-performance would probably be something like this (does not preserve order):

var h1 = new HashSet<char>();
var h2 = new HashSet<char>();

foreach (var ch in "nbHHkRvrXbvkn")
{
    if (!h1.Add(ch))
    {
        h2.Add(ch);
    }
}

h1.ExceptWith(h2); // remove duplicates

var chars = new char[h1.Count];
h1.CopyTo(chars);
var result = new string(chars); // = "RrX"

Performance test

When in doubt -- test it :)

Yuriy Faktorovich's answer        00:00:00.2360900
Luke's answer                     00:00:00.2225683
My 'few lines' answer             00:00:00.5318395
My 'fast' answer                  00:00:00.1842144

0人赞添加讨论(0) 举报

家丑人穷心不美

4楼-- · 2020-02-08 09:47

This one should be pretty quick (and it preserves the original order):

public static string RemoveDuplicates(string source)
{
    HashSet<char> found = new HashSet<char>();
    HashSet<char> dupes = new HashSet<char>();

    foreach (char c in source)
    {
        if (!found.Add(c))
        {
            dupes.Add(c);
        }
    }

    StringBuilder sb = new StringBuilder(source.Length);
    foreach (char c in source)
    {
        if (!dupes.Contains(c))
        {
            sb.Append(c);
        }
    }
    return sb.ToString();
}

0人赞添加讨论(0) 举报

疯言疯语

5楼-- · 2020-02-08 09:52

This preserves order and, based on my tests, is 4 times faster than using a HashSet. This assumes your character range is 0-255 but you can extend that easily. If you're planning on using this in a loop, move the int[] c = new int[255]; out and in the function do an Array.Clear(c,0,255).


        private static string RemoveDuplicates(string s)
        {
            int[] c = new int[255];
            for (int i = 0; i < s.Length; i++)
            {
                c[s[i]]++;
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < s.Length; i++)
            {
                if (c[s[i]] == 1) sb.Append(s[i]);
            }
            return sb.ToString();
        }

0人赞添加讨论(0) 举报

手持菜刀，她持情操

6楼-- · 2020-02-08 09:58

This algorithm is general, can be applied to any language

create a map(HashTable) char->int that holds the count of each character found, initially empty
scan the string once to populate the map.
create a new empty string that'll hold the output, may be you'll need to use a StringBuilder.
scan the string(orthe map, whichever is shorter) copying only characters which an occurrence of 1 to the output string/StringBuilder

0人赞添加讨论(0) 举报

Fastest way to implement Duplicate Character Remov

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间