in reference to the excel screen capture below, I'm looking for a formula solution that counts the number of unique values in Column B (Colour) for each ID number in Column A.
I've imputed the desired result in Column C. So, for instance, ID 1 (A2) has only one unique colour, Grey (B2), which would return a 1 in C2. ID 2 has only one unique colour, yellow (B3, B4), and returns 1 in C3 and C4. ID 3, has two unique colours, Blue and Purple, thus returning a 2 in C5 through C8. Etc.
Because this will be run for close to 100,000 rows, many of the index and/or match based solutions I've come across take way to long to compute. I have found that the ID values, which are all in ascending order, can be used to speed things up by starting the formula with =IF(A2=A1, or something like this. Thanks in advance to anyone with some ideas about how to solve this with a lean formula.
Note: I am working with files that also have close to 100 columns. A solution that doesn't require helper columns would be ideal.
EDIT/ADDITION: In my main data file, there are instances of blank cells in column B. Is there a way to ignore the blank cells when counting for column C results?
Here is a VBA routine that should run quickly for that number of entries. We create a Class Module (User Defined Object) that consists of a collection (Dictionary) of Colors associated with each ID, and an count of that color. (Don't really need the count but it is trivial to add it in, in case you want it for some other purpose; and also as a demonstration of some of what could be done).
Then we output the results in the adjacent column, as you show in your screen shot. The results could be output elsewhere, even on a different worksheet, with minor code changes.
Be sure to read the notes at the beginning of the modules for important information and about setting things up properly.
Class Module
Option Explicit
'RENAME this module: cID
Private pID As String
Private pColor As String
Private pColors As Dictionary
Public Property Get ID() As String
ID = pID
End Property
Public Property Let ID(Value As String)
pID = Value
End Property
Public Property Get Color() As String
Color = pColor
End Property
Public Property Let Color(Value As String)
pColor = Value
End Property
Public Property Get Colors() As Dictionary
Set Colors = pColors
End Property
Public Function ADDColor(Value As String)
'Might as well also count # of times this color assigned
If Not pColors.Exists(Value) Then
pColors.Add Key:=Value, Item:=1
Else
pColors(Value) = pColors(Value) + 1
End If
End Function
Private Sub Class_Initialize()
Set pColors = New Dictionary
End Sub
Regular Module
EDIT (edited to eliminate count for blank rows)
Option Explicit
'Set reference to Microsoft Scripting Runtime (Tools/References)
Sub IDColorCount()
Dim cID As cID, dID As Dictionary
Dim wsData As Worksheet, rData As Range
Dim vData As Variant, vRes As Variant
Dim I As Long
'Set the data worksheet and range
'Read the data into an array for faster calculations
Set wsData = Worksheets("sheet1")
With wsData
Set rData = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=2)
vData = rData
End With
'Go through the data and collect the information
Set dID = New Dictionary
For I = 2 To UBound(vData, 1)
If Not vData(I, 1) = "" Then
Set cID = New cID
With cID
.ID = vData(I, 1)
.Color = vData(I, 2)
.ADDColor .Color
If Not dID.Exists(.ID) Then
dID.Add Key:=.ID, Item:=cID
Else
dID(.ID).ADDColor .Color
End If
End With
End If
Next I
'Size the results array
ReDim vRes(1 To UBound(vData), 1 To 1)
vRes(1, 1) = "Count"
For I = 2 To UBound(vData, 1)
If Not vData(I, 1) = "" Then _
vRes(I, 1) = dID(CStr(vData(I, 1))).Colors.Count
Next I
'The results can be written anyplace
With rData.Offset(0, 2).Resize(columnsize:=1)
.EntireColumn.Clear
.Value = vRes
End With
End Sub
Ok, I solved your problem using this array formula:
=SUM(IF(FREQUENCY(IF($A$2:$A$21=A2,MATCH($B$2:$B$21,$B$2:$B$21,0),""),MATCH($B$2:$B$21,$B$2:$B$21,0))>0,1))
Put this formula into cell C2
and press CTRL+SHIFT+ENTER, then pull formula down. If any questions, comment below