Excel - Count unique values that match ID, optimiz

2019-07-03 17:46发布

in reference to the excel screen capture below, I'm looking for a formula solution that counts the number of unique values in Column B (Colour) for each ID number in Column A.

I've imputed the desired result in Column C. So, for instance, ID 1 (A2) has only one unique colour, Grey (B2), which would return a 1 in C2. ID 2 has only one unique colour, yellow (B3, B4), and returns 1 in C3 and C4. ID 3, has two unique colours, Blue and Purple, thus returning a 2 in C5 through C8. Etc.

Because this will be run for close to 100,000 rows, many of the index and/or match based solutions I've come across take way to long to compute. I have found that the ID values, which are all in ascending order, can be used to speed things up by starting the formula with =IF(A2=A1, or something like this. Thanks in advance to anyone with some ideas about how to solve this with a lean formula.

Note: I am working with files that also have close to 100 columns. A solution that doesn't require helper columns would be ideal.

EDIT/ADDITION: In my main data file, there are instances of blank cells in column B. Is there a way to ignore the blank cells when counting for column C results?

enter image description here

2条回答
The star\"
2楼-- · 2019-07-03 18:20

Ok, I solved your problem using this array formula:

=SUM(IF(FREQUENCY(IF($A$2:$A$21=A2,MATCH($B$2:$B$21,$B$2:$B$21,0),""),MATCH($B$2:$B$21,$B$2:$B$21,0))>0,1))

Put this formula into cell C2 and press CTRL+SHIFT+ENTER, then pull formula down. If any questions, comment below

查看更多
我欲成王,谁敢阻挡
3楼-- · 2019-07-03 18:33

Here is a VBA routine that should run quickly for that number of entries. We create a Class Module (User Defined Object) that consists of a collection (Dictionary) of Colors associated with each ID, and an count of that color. (Don't really need the count but it is trivial to add it in, in case you want it for some other purpose; and also as a demonstration of some of what could be done).

Then we output the results in the adjacent column, as you show in your screen shot. The results could be output elsewhere, even on a different worksheet, with minor code changes.

Be sure to read the notes at the beginning of the modules for important information and about setting things up properly.

Class Module


Option Explicit
'RENAME this module:  cID

Private pID As String
Private pColor As String
Private pColors As Dictionary

Public Property Get ID() As String
    ID = pID
End Property
Public Property Let ID(Value As String)
    pID = Value
End Property

Public Property Get Color() As String
    Color = pColor
End Property
Public Property Let Color(Value As String)
    pColor = Value
End Property

Public Property Get Colors() As Dictionary
    Set Colors = pColors
End Property
Public Function ADDColor(Value As String)
    'Might as well also count # of times this color assigned
    If Not pColors.Exists(Value) Then
        pColors.Add Key:=Value, Item:=1
    Else
        pColors(Value) = pColors(Value) + 1
    End If
End Function

Private Sub Class_Initialize()
    Set pColors = New Dictionary
End Sub

Regular Module

EDIT (edited to eliminate count for blank rows)


Option Explicit
'Set reference to Microsoft Scripting Runtime (Tools/References)

Sub IDColorCount()
    Dim cID As cID, dID As Dictionary
    Dim wsData As Worksheet, rData As Range
    Dim vData As Variant, vRes As Variant
    Dim I As Long

'Set the data worksheet and range
'Read the data into an array for faster calculations
Set wsData = Worksheets("sheet1")
With wsData
    Set rData = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=2)
    vData = rData
End With

'Go through the data and collect the information
Set dID = New Dictionary
For I = 2 To UBound(vData, 1)
  If Not vData(I, 1) = "" Then
    Set cID = New cID
    With cID
        .ID = vData(I, 1)
        .Color = vData(I, 2)
        .ADDColor .Color

        If Not dID.Exists(.ID) Then
            dID.Add Key:=.ID, Item:=cID
        Else
            dID(.ID).ADDColor .Color
        End If
    End With
  End If
Next I

'Size the results array
ReDim vRes(1 To UBound(vData), 1 To 1)
vRes(1, 1) = "Count"
For I = 2 To UBound(vData, 1)
    If Not vData(I, 1) = "" Then _
        vRes(I, 1) = dID(CStr(vData(I, 1))).Colors.Count
Next I

'The results can  be written anyplace
With rData.Offset(0, 2).Resize(columnsize:=1)
    .EntireColumn.Clear
    .Value = vRes
End With

End Sub

查看更多
登录 后发表回答