How to remove duplicates from int[][]

2019-07-29 11:34发布

I have an array of arrays - information about selection in Excel using VSTO, where each element means start and end selection position.

For example,

int[][] selection = {
new int[] { 1 }, // column A
new int[] { 6 }, // column F
new int[] { 6 }, // column F
new int[] { 8, 9 } // columns H:I
new int[] { 8, 9 } // columns H:I
new int[] { 12, 15 } // columns L:O
};

Could you please help me to find a way, maybe using LINQ or Extension methods, to remove duplicated elements? I mean: F and F, H:I and H:I, etc.

2条回答
姐就是有狂的资本
2楼-- · 2019-07-29 12:07

If you want to use a pure LINQ/extension method solution, then you'll need to define your own implementation of IEqualityComparer for arrays/sequences. (Unless I'm missing something obvious, there's no pre-existing array or sequence comparer in the BCL). This isn't terribly hard however - here's an example of one that should do the job pretty well:

public class SequenceEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
    public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
    {
        return Enumerable.SequenceEqual(x, y);
    }

    // Probably not the best hash function for an ordered list, but it should do the job in most cases.
    public int GetHashCode(IEnumerable<T> obj)
    {
        int hash = 0;
        int i = 0;
        foreach (var element in obj)
            hash = unchecked((hash * 37 + hash) + (element.GetHashCode() << (i++ % 16)));
        return hash;
    }
}

The advantage of this is that you can then simply call the following to remove any duplicate arrays.

var result = selection.Distinct(new SequenceEqualityComparer<int>()).ToArray();

Hope that helps.

查看更多
Deceive 欺骗
3楼-- · 2019-07-29 12:09

First you need a way to compare the integer arrays. To use it with the classes in the framework, you do that by making an EquailtyComparer. If the arrays are always sorted, that is rather easy to implement:

public class IntArrayComparer : IEqualityComparer<int[]> {

    public bool Equals(int[] x, int[] y) {
        if (x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++) {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(int[] obj) {
        int code = 0;
        foreach (int value in obj) code ^= value;
        return code;
    }

}

Now you can use an integer array as key in a HashSet to get the unique arrays:

int[][] selection = {
    new int[] { 1 }, // column A
    new int[] { 6 }, // column F
    new int[] { 6 }, // column F
    new int[] { 8, 9 }, // columns H:I
    new int[] { 8, 9 }, // columns H:I
    new int[] { 12, 15 } // columns L:O
};

HashSet<int[]> arrays = new HashSet<int[]>(new IntArrayComparer());
foreach (int[] array in selection) {
    arrays.Add(array);
}

The HashSet just throws away duplicate values, so it now contains four integer arrays.

查看更多
登录 后发表回答