vba: get unique values from array

2019-01-01 10:48发布

问题:

is there built in functionality in vba to get unique values from a one-dimensional array? what about just getting rid of duplicates?

if not, then how would i get the unique values from an array?

回答1:

This post contains 2 examples. I like the 2nd one:

Sub unique() 
  Dim arr As New Collection, a 
  Dim aFirstArray() As Variant 
  Dim i As Long 

  aFirstArray() = Array(\"Banana\", \"Apple\", \"Orange\", \"Tomato\", \"Apple\", _ 
  \"Lemon\", \"Lime\", \"Lime\", \"Apple\") 

  On Error Resume Next 
  For Each a In aFirstArray 
     arr.Add a, a 
  Next 

  For i = 1 To arr.Count 
     Cells(i, 1) = arr(i) 
  Next 

End Sub 


回答2:

There\'s no built-in functionality to remove duplicates from arrays. Raj\'s answer seems elegant, but I prefer to use dictionaries.

Dim d As Object
Set d = CreateObject(\"Scripting.Dictionary\")
\'Set d = New Scripting.Dictionary

Dim i As Long
For i = LBound(myArray) To UBound(myArray)
    d(myArray(i)) = 1
Next i

Dim v As Variant
For Each v In d.Keys()
    \'d.Keys() is a Variant array of the unique values in myArray.
    \'v will iterate through each of them.
Next v

EDIT: I changed the loop to use LBound and UBound as per Tomalak\'s suggested answer. EDIT: d.Keys() is a Variant array, not a Collection.



回答3:

Update (6/15/16)

I have created much more thorough benchmarks. First of all, as @ChaimG pointed out, early binding makes a big difference (I originally used @eksortso\'s code above verbatim which uses late binding). Secondly, my original benchmarks only included the time to create the unique object, however, it did not test the efficiency of using the object. My point in doing this is, it doesn\'t really matter if I can create an object really fast if the object I create is clunky and slows me down moving forward.

Old Remark: It turns out, that looping over a collection object is highly inefficient

It turns out that looping over a collection can be quite efficient if you know how to do it (I didn\'t). As @ChaimG (yet again), pointed out in the comments, using a For Each construct is ridiculously superior to simply using a For loop. To give you an idea, before changing the loop construct, the time for Collection2 for the Test Case Size = 10^6 was over 1400s (i.e. ~23 minutes). It is now a meager 0.195s (over 7000x faster).

For the Collection method there are two times. The first (my original benchmark Collection1) show the time to create the unique object. The second part (Collection2) shows the time to loop over the object (which is very natural) to create a returnable array as the other functions do.

In the chart below, a yellow background indicates that it was the fastest for that test case, and red indicates the slowest (\"Not Tested\" algorithms are excluded). The total time for the Collection method is the sum of Collection1 and Collection2. Turquoise indicates that is was the fastest regardless of original order.

\"Benchmarks5\"

Below is the original algorithm I created (I have modified it slightly e.g. I no longer instantiate my own data type). It returns the unique values of an array with the original order in a very respectable time and it can be modified to take on any data type. Outside of the IndexMethod, it is the fastest algorithm for very large arrays.

Here are the main ideas behind this algorithm:

  1. Index the array
  2. Sort by values
  3. Place identical values at the end of the array and subsequently \"chop\" them off.
  4. Finally, sort by index.

Below is an example:

Let myArray = (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)

    1.  (86, 100, 33, 19, 33, 703, 19, 100, 703, 19)
        (1 ,   2,  3,  4,  5,   6,  7,   8,   9, 10)   <<-- Indexing

    2.  (19, 19, 19, 33, 33, 86, 100, 100, 703, 703)   <<-- sort by values     
        (4,   7, 10,  3,  5,  1,   2,   8,   6,   9)

    3.  (19, 33,  86, 100, 703)   <<-- remove duplicates    
        (4,   3,   1,   2,   6)

    4.  (86, 100,  33, 19, 703)   
        ( 1,   2,   3,  4,   6)   <<-- sort by index

Here is the code:

Function SortingUniqueTest(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
    Dim MyUniqueArr() As Long, i As Long, intInd As Integer
    Dim StrtTime As Double, Endtime As Double, HighB As Long, LowB As Long

    LowB = LBound(myArray): HighB = UBound(myArray)

    ReDim MyUniqueArr(1 To 2, LowB To HighB)
    intInd = 1 - LowB  \'Guarantees the indices span 1 to Lim

    For i = LowB To HighB
        MyUniqueArr(1, i) = myArray(i)
        MyUniqueArr(2, i) = i + intInd
    Next i

    QSLong2D MyUniqueArr, 1, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2
    Call UniqueArray2D(MyUniqueArr)
    If bOrigIndex Then QSLong2D MyUniqueArr, 2, LBound(MyUniqueArr, 2), UBound(MyUniqueArr, 2), 2

    SortingUniqueTest = MyUniqueArr()
End Function

Public Sub UniqueArray2D(ByRef myArray() As Long)
    Dim i As Long, j As Long, Count As Long, Count1 As Long, DuplicateArr() As Long
    Dim lngTemp As Long, HighB As Long, LowB As Long
    LowB = LBound(myArray, 2): Count = LowB: i = LowB: HighB = UBound(myArray, 2)

    Do While i < HighB
        j = i + 1
        If myArray(1, i) = myArray(1, j) Then
            Do While myArray(1, i) = myArray(1, j)
                ReDim Preserve DuplicateArr(1 To Count)
                DuplicateArr(Count) = j
                Count = Count + 1
                j = j + 1
                If j > HighB Then Exit Do
            Loop

            QSLong2D myArray, 2, i, j - 1, 2
        End If
        i = j
    Loop

    Count1 = HighB

    If Count > 1 Then
        For i = UBound(DuplicateArr) To LBound(DuplicateArr) Step -1
            myArray(1, DuplicateArr(i)) = myArray(1, Count1)
            myArray(2, DuplicateArr(i)) = myArray(2, Count1)
            Count1 = Count1 - 1
            ReDim Preserve myArray(1 To 2, LowB To Count1)
        Next i
    End If
End Sub

Here is the sorting algorithm I use (more about this algo here).

Sub QSLong2D(ByRef saArray() As Long, bytDim As Byte, lLow1 As Long, lHigh1 As Long, bytNum As Byte)
    Dim lLow2 As Long, lHigh2 As Long
    Dim sKey As Long, sSwap As Long, i As Byte

On Error GoTo ErrorExit

    If IsMissing(lLow1) Then lLow1 = LBound(saArray, bytDim)
    If IsMissing(lHigh1) Then lHigh1 = UBound(saArray, bytDim)
    lLow2 = lLow1
    lHigh2 = lHigh1

    sKey = saArray(bytDim, (lLow1 + lHigh1) \\ 2)

    Do While lLow2 < lHigh2
        Do While saArray(bytDim, lLow2) < sKey And lLow2 < lHigh1: lLow2 = lLow2 + 1: Loop
        Do While saArray(bytDim, lHigh2) > sKey And lHigh2 > lLow1: lHigh2 = lHigh2 - 1: Loop

        If lLow2 < lHigh2 Then
            For i = 1 To bytNum
                sSwap = saArray(i, lLow2)
                saArray(i, lLow2) = saArray(i, lHigh2)
                saArray(i, lHigh2) = sSwap
            Next i
        End If

        If lLow2 <= lHigh2 Then
            lLow2 = lLow2 + 1
            lHigh2 = lHigh2 - 1
        End If
    Loop

    If lHigh2 > lLow1 Then QSLong2D saArray(), bytDim, lLow1, lHigh2, bytNum
    If lLow2 < lHigh1 Then QSLong2D saArray(), bytDim, lLow2, lHigh1, bytNum

ErrorExit:

End Sub

Below is a special algorithm that is blazing fast if your data contains integers. It makes use of indexing and the Boolean data type.

Function IndexSort(ByRef myArray() As Long, bOrigIndex As Boolean) As Variant
\'\' Modified to take both positive and negative integers
    Dim arrVals() As Long, arrSort() As Long, arrBool() As Boolean
    Dim i As Long, HighB As Long, myMax As Long, myMin As Long, OffSet As Long
    Dim LowB As Long, myIndex As Long, count As Long, myRange As Long

    HighB = UBound(myArray)
    LowB = LBound(myArray)

    For i = LowB To HighB
        If myArray(i) > myMax Then myMax = myArray(i)
        If myArray(i) < myMin Then myMin = myArray(i)
    Next i

    OffSet = Abs(myMin)  \'\' Number that will be added to every element
                         \'\' to guarantee every index is non-negative

    If myMax > 0 Then
        myRange = myMax + OffSet  \'\' E.g. if myMax = 10 & myMin = -2, then myRange = 12
    Else
        myRange = OffSet
    End If

    If bOrigIndex Then
        ReDim arrSort(1 To 2, 1 To HighB)
        ReDim arrVals(1 To 2, 0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(1, myIndex) = myArray(i)
            If arrVals(2, myIndex) = 0 Then arrVals(2, myIndex) = i
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(1, count) = arrVals(1, i)
                arrSort(2, count) = arrVals(2, i)
            End If
        Next i

        QSLong2D arrSort, 2, 1, count, 2
        ReDim Preserve arrSort(1 To 2, 1 To count)
    Else
        ReDim arrSort(1 To HighB)
        ReDim arrVals(0 To myRange)
        ReDim arrBool(0 To myRange)

        For i = LowB To HighB
            myIndex = myArray(i) + OffSet
            arrBool(myIndex) = True
            arrVals(myIndex) = myArray(i)
        Next i

        For i = 0 To myRange
            If arrBool(i) Then
                count = count + 1
                arrSort(count) = arrVals(i)
            End If
        Next i

        ReDim Preserve arrSort(1 To count)
    End If

    ReDim arrVals(0)
    ReDim arrBool(0)

    IndexSort = arrSort
End Function

Here are the Collection (by @DocBrown) and Dictionary (by @eksortso) Functions.

Function CollectionTest(ByRef arrIn() As Long, Lim As Long) As Variant
    Dim arr As New Collection, a, i As Long, arrOut() As Variant, aFirstArray As Variant
    Dim StrtTime As Double, EndTime1 As Double, EndTime2 As Double, count As Long
On Error Resume Next

    ReDim arrOut(1 To UBound(arrIn))
    ReDim aFirstArray(1 To UBound(arrIn))

    StrtTime = Timer
    For i = 1 To UBound(arrIn): aFirstArray(i) = CStr(arrIn(i)): Next i \'\' Convert to string
    For Each a In aFirstArray               \'\'\' This part is actually creating the unique set
        arr.Add a, a
    Next
    EndTime1 = Timer - StrtTime

    StrtTime = Timer         \'\'\' This part is writing back to an array for return
    For Each a In arr: count = count + 1: arrOut(count) = a: Next a
    EndTime2 = Timer - StrtTime
    CollectionTest = Array(arrOut, EndTime1, EndTime2)
End Function

Function DictionaryTest(ByRef myArray() As Long, Lim As Long) As Variant
    Dim StrtTime As Double, Endtime As Double
    Dim d As Scripting.Dictionary, i As Long  \'\' Early Binding
    Set d = New Scripting.Dictionary
    For i = LBound(myArray) To UBound(myArray): d(myArray(i)) = 1: Next i
    DictionaryTest = d.Keys()
End Function

Here is the Direct approach provided by @IsraelHoletz.

Function ArrayUnique(ByRef aArrayIn() As Long) As Variant
    Dim aArrayOut() As Variant, bFlag As Boolean, vIn As Variant, vOut As Variant
    Dim i As Long, j As Long, k As Long
    ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
    i = LBound(aArrayIn)
    j = i

    For Each vIn In aArrayIn
        For k = j To i - 1
            If vIn = aArrayOut(k) Then bFlag = True: Exit For
        Next
        If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
        bFlag = False
    Next

    If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
    ArrayUnique = aArrayOut
End Function

Function DirectTest(ByRef aArray() As Long, Lim As Long) As Variant
    Dim aReturn() As Variant
    Dim StrtTime As Long, Endtime As Long, i As Long
    aReturn = ArrayUnique(aArray)
    DirectTest = aReturn
End Function

Here is the benchmark function that compares all of the functions. You should note that the last two cases are handled a little bit different because of memory issues. Also note, that I didn\'t test the Collection method for the Test Case Size = 10,000,000. For some reason, it was returning incorrect results and behaving unusual (I\'m guessing the collection object has a limit on how many things you can put in it. I searched and I couldn\'t find any literature on this).

Function UltimateTest(Lim As Long, bTestDirect As Boolean, bTestDictionary, bytCase As Byte) As Variant

    Dim dictionTest, collectTest, sortingTest1, indexTest1, directT \'\' all variants
    Dim arrTest() As Long, i As Long, bEquality As Boolean, SizeUnique As Long
    Dim myArray() As Long, StrtTime As Double, EndTime1 As Variant
    Dim EndTime2 As Double, EndTime3 As Variant, EndTime4 As Double
    Dim EndTime5 As Double, EndTime6 As Double, sortingTest2, indexTest2

    ReDim myArray(1 To Lim): Rnd (-2)   \'\' If you want to test negative numbers, 
    \'\' insert this to the left of CLng(Int(Lim... : (-1) ^ (Int(2 * Rnd())) *
    For i = LBound(myArray) To UBound(myArray): myArray(i) = CLng(Int(Lim * Rnd() + 1)): Next i
    arrTest = myArray

    If bytCase = 1 Then
        If bTestDictionary Then
            StrtTime = Timer: dictionTest = DictionaryTest(arrTest, Lim): EndTime1 = Timer - StrtTime
        Else
            EndTime1 = \"Not Tested\"
        End If

        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)

        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)

        If bTestDirect Then
            arrTest = myArray: StrtTime = Timer: directT = DirectTest(arrTest, Lim): EndTime3 = Timer - StrtTime
        Else
            EndTime3 = \"Not Tested\"
        End If

        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime

        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime

        bEquality = True
        For i = LBound(sortingTest1, 2) To UBound(sortingTest1, 2)
            If Not CLng(collectTest(0)(i)) = sortingTest1(1, i) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = sortingTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        For i = LBound(dictionTest) To UBound(dictionTest)
            If Not dictionTest(i) = indexTest1(1, i + 1) Then
                bEquality = False
                Exit For
            End If
        Next i

        If bTestDirect Then
            For i = LBound(dictionTest) To UBound(dictionTest)
                If Not dictionTest(i) = directT(i + 1) Then
                    bEquality = False
                    Exit For
                End If
            Next i
        End If

        UltimateTest = Array(bEquality, EndTime1, EndTime2, EndTime3, EndTime4, _
                        EndTime5, EndTime6, collectTest(1), collectTest(2), SizeUnique)
    ElseIf bytCase = 2 Then
        arrTest = myArray
        collectTest = CollectionTest(arrTest, Lim)
        UltimateTest = Array(collectTest(1), collectTest(2))
    ElseIf bytCase = 3 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest1 = SortingUniqueTest(arrTest, True): EndTime2 = Timer - StrtTime
        SizeUnique = UBound(sortingTest1, 2)
        UltimateTest = Array(EndTime2, SizeUnique)
    ElseIf bytCase = 4 Then
        arrTest = myArray
        StrtTime = Timer: indexTest1 = IndexSort(arrTest, True): EndTime4 = Timer - StrtTime
        UltimateTest = EndTime4
    ElseIf bytCase = 5 Then
        arrTest = myArray
        StrtTime = Timer: sortingTest2 = SortingUniqueTest(arrTest, False): EndTime5 = Timer - StrtTime
        UltimateTest = EndTime5
    ElseIf bytCase = 6 Then
        arrTest = myArray
        StrtTime = Timer: indexTest2 = IndexSort(arrTest, False): EndTime6 = Timer - StrtTime
        UltimateTest = EndTime6
    End If

End Function

And finally, here is the sub that produces the table above.

Sub GetBenchmarks()
    Dim myVar, i As Long, TestCases As Variant, j As Long, temp

    TestCases = Array(1000, 5000, 10000, 20000, 50000, 100000, 200000, 500000, 1000000, 2000000, 5000000, 10000000)

    For j = 0 To 11
        If j < 6 Then
            myVar = UltimateTest(CLng(TestCases(j)), True, True, 1)
        ElseIf j < 10 Then
            myVar = UltimateTest(CLng(TestCases(j)), False, True, 1)
        ElseIf j < 11 Then
            myVar = Array(\"Not Tested\", \"Not Tested\", 0.1, \"Not Tested\", 0.1, 0.1, 0.1, 0, 0, 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 2)
            myVar(7) = temp(0): myVar(8) = temp(1)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        Else
            myVar = Array(\"Not Tested\", \"Not Tested\", 0.1, \"Not Tested\", 0.1, 0.1, 0.1, \"Not Tested\", \"Not Tested\", 0)
            temp = UltimateTest(CLng(TestCases(j)), False, False, 3)
            myVar(2) = temp(0): myVar(9) = temp(1)
            myVar(4) = UltimateTest(CLng(TestCases(j)), False, False, 4)
            myVar(5) = UltimateTest(CLng(TestCases(j)), False, False, 5)
            myVar(6) = UltimateTest(CLng(TestCases(j)), False, False, 6)
        End If

        Cells(4 + j, 6) = TestCases(j)
        For i = 1 To 9: Cells(4 + j, 6 + i) = myVar(i - 1): Next i
        Cells(4 + j, 17) = myVar(9)
    Next j
End Sub

Summary
From the table of results, we can see that the Dictionary method works really well for cases less than about 500,000, however, after that, the IndexMethod really starts to dominate. You will notice that when order doesn\'t matter and your data is made up of positive integers, there is no comparison to the IndexMethod algorithm (it returns the unique values from an array containing 10 million elements in less than 1 sec!!! Incredible!). Below I have a breakdown of which algorithm is preferred in various cases.

Case 1
Your Data contains integers (i.e. whole numbers, both positive and negative): IndexMethod

Case 2
Your Data contains non-integers (i.e. variant, double, string, etc.) with less than 200000 elements: Dictionary Method

Case 3
Your Data contains non-integers (i.e. variant, double, string, etc.) with more than 200000 elements: Collection Method

If you had to choose one algorithm, in my opinion, the Collection method is still the best as it only requires a few lines of code, it\'s super general, and it\'s fast enough.



回答4:

No, nothing built-in. Do it yourself:

  • Instantiate a Scripting.Dictionary object
  • Write a For loop over your array (be sure to use LBound() and UBound() instead of looping from 0 to x!)
  • On each iteration, check Exists() on the dictionary. Add every array value (that doesn\'t already exist) as a key to the dictionary (use CStr() since keys must be strings as I\'ve just learned, keys can be of any type in a Scripting.Dictionary), also store the array value itself into the dictionary.
  • When done, use Keys() (or Items()) to return all values of the dictionary as a new, now unique array.
  • In my tests, the Dictionary keeps original order of all added values, so the output will be ordered like the input was. I\'m not sure if this is documented and reliable behavior, though.


回答5:

I don\'t know of any built-in functionality in VBA. The best would be to use a collection using the value as key and only add to it if a value doesn\'t exist.



回答6:

No, VBA does not have this functionality. You can use the technique of adding each item to a collection using the item as the key. Since a collection does not allow duplicate keys, the result is distinct values that you can copy to an array, if needed.

You may also want something more robust. See Distinct Values Function at http://www.cpearson.com/excel/distinctvalues.aspx

Distinct Values Function

A VBA Function that will return an array of the distinct values in a range or array of input values.

Excel has some manual methods, such as Advanced Filter, for getting a list of distinct items from an input range. The drawback of using such methods is that you must manually refresh the results when the input data changes. Moreover, these methods work only with ranges, not arrays of values, and, not being functions, cannot be called from worksheet cells or incorporated into array formulas. This page describes a VBA function called DistinctValues that accepts as input either a range or an array of data and returns as its result an array containing the distinct items from the input list. That is, the elements with all duplicates removed. The order of the input elements is preserved. The order of the elements in the output array is the same as the order in the input values. The function can be called from an array entered range on a worksheet (see this page for information about array formulas), or from in an array formula in a single worksheet cell, or from another VB function.



回答7:

The Collection and Dictionary solutions are all nice and shine for a short approach, but if you want speed try using a more direct approach:

Function ArrayUnique(ByVal aArrayIn As Variant) As Variant
\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'
\' ArrayUnique
\' This function removes duplicated values from a single dimension array
\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'\'
Dim aArrayOut() As Variant
Dim bFlag As Boolean
Dim vIn As Variant
Dim vOut As Variant
Dim i%, j%, k%

ReDim aArrayOut(LBound(aArrayIn) To UBound(aArrayIn))
i = LBound(aArrayIn)
j = i

For Each vIn In aArrayIn
    For k = j To i - 1
        If vIn = aArrayOut(k) Then bFlag = True: Exit For
    Next
    If Not bFlag Then aArrayOut(i) = vIn: i = i + 1
    bFlag = False
Next

If i <> UBound(aArrayIn) Then ReDim Preserve aArrayOut(LBound(aArrayIn) To i - 1)
ArrayUnique = aArrayOut
End Function

Calling it:

Sub Test()
Dim aReturn As Variant
Dim aArray As Variant

aArray = Array(1, 2, 3, 1, 2, 3, \"Test\", \"Test\")
aReturn = ArrayUnique(aArray)
End Sub

For speed comparasion, this will be 100x to 130x faster then the dictionary solution, and about 8000x to 13000x faster than the collection one.



回答8:

If the order of the deduplicated array does not matter to you, you can use my pragmatic function:

Function DeDupArray(ia() As String)
  Dim newa() As String
  ReDim newa(999)
  ni = -1
  For n = LBound(ia) To UBound(ia)
    dup = False
    If n <= UBound(ia) Then
      For k = n + 1 To UBound(ia)
        If ia(k) = ia(n) Then dup = True
      Next k

      If dup = False And Trim(ia(n)) <> \"\" Then
        ni = ni + 1
        newa(ni) = ia(n)
      End If
    End If
  Next n

  If ni > -1 Then
    ReDim Preserve newa(ni)
  Else
    ReDim Preserve newa(1)
  End If

  DeDupArray = newa
End Function



Sub testdedup()
Dim m(5) As String
Dim m2() As String

m(0) = \"Horse\"
m(1) = \"Cow\"
m(2) = \"Dear\"
m(3) = \"Horse\"
m(4) = \"Joke\"
m(5) = \"Cow\"

m2 = DeDupArray(m)
t = \"\"
For n = LBound(m2) To UBound(m2)
  t = t & n & \"=\" & m2(n) & \" \"
Next n
MsgBox t
End Sub

From the test function, it will result in the following deduplicated array:

\"0=Dear 1=Horse 2=Joke 3=Cow \"