This is an embarrassingly simple question, but I was not able to find a way to figure it out for hours nor also to find an idea from online search.
Suppose I have the following data (the actual data can be thousands or millions) in Excel sheet (Table 1):
Name No. ID
A A1
B B4
B B5
C C0
D -
A A1
A A2
E -
F -
C C0
B B5
B B5
B B5
B B6
A A1
A A1
A A2
B B3
B B3
B B3
The character (-) in column No. ID above can be also a number 0 or a blank cell.
I want to format the above data as follow (Table 2)
Name Number of ID
A 2
B 4
C 1
D 0
E 0
F 0
It means A has 2 IDs (A1 and A2), B has 4 IDs (B1, B2, B3, and B4), C has 1 ID (C0), and D, E, and F have no ID.
The best I can get by using Pivot Table is like this
How does one perform a task like Table 2 in MS Excel? If possible a VBA script of it.
If anyone interested in a VBA solution for the above problem:
Sub GetUniquesCount()
Dim dict1 As Object, dict2 As Object
Dim c1 As Variant, c2 As Variant
Dim i As Long, lastRow As Long, count As Long
Set dict1 = CreateObject("Scripting.Dictionary")
Set dict2 = CreateObject("Scripting.Dictionary")
lastRow = Cells(Rows.count, "A").End(xlUp).Row
c1 = Range("A2:A" & lastRow)
For i = 1 To UBound(c1, 1)
dict1(c1(i, 1)) = 1
Next i
Range("D2").Resize(dict1.count) = Application.Transpose(dict1.keys)
lastRow = Cells(Rows.count, "B").End(xlUp).Row
c2 = Range("B2:B" & lastRow)
For i = 1 To UBound(c2, 1)
dict2(c2(i, 1)) = 1
Next i
For Each cel In Range("D2:D" & Cells(Rows.count, "D").End(xlUp).Row)
count = 0
For Each k2 In dict2.keys
If k2 Like cel.Value & "*" Then
count = count + 1
End If
Next k2
cel.Offset(0, 1).Value = count
Next cel
End Sub
Copy column A to another unused column and perform a Data ► Remove Duplicates command on it then sort it. Few formulas will work efficiently with thousands of rows of data and none will work at all with millions.
Do the same for column B right beside it. Do not extend the selection whjen asked by Remove Duplicates.
Next, use the following formula in the column next to the unduplicated set of column A data you just created:
=COUNTIF(F:F, E2&"*")
Fill down to the extent of the unduplicated data. I've assumed you used column E to create the first unduplicated data set and column F for the second.