Excel Macro append duplicates to first line

2019-09-13 22:38发布

I'm an Excel VBA newbie and i'm trying to get the duplicates rows to appends to the first occurence of that row.

Per exemple we have the table here First Table

I would like to format data as here Formatted table

The logic goes like this. Whenever we detect that the last name and the birth date are the same for the current and following line that mean we have a dependant and we need to append the dependant's data to the "Main"

I have started writing code but i'm not able to detect the dependants properly. Below is what i have. please consider that i'm a real noob and i'm trying hard.

Sub formatData()

    Dim sh As Worksheet
    Dim rw As Range
    Dim RowCount As Integer

    'This variable is checked to see if we have a first occurence of a line
    Dim firstOccurence

    'Initialise the variables for that will be used to match the data
    Dim LocationName
    Dim PlanCode
    Dim LastName
    Dim FirstName

    Dim dependantFirstName
    Dim dependantLastName
    Dim dependantBirthdate


    RowCount = 0
    firstOccurence = True

    'Check if the spreadsheet already exist if not create it.
    For i = 1 To Worksheets.Count
        If Worksheets(i).Name = "Benefits Census Formatted" Then
            exists = True
        End If
    Next i

    If Not exists Then
        'Create a new spreadsheet to add the data to
        Set ws = Sheets.Add
        Sheets.Add.Name = "Benefits Census Formatted"
    End If


    'Set the ActiveSheet to the one containing the original data
    Set sh = Sheets("BENEFIT Census")


    With ActiveSheet
        LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row


    For Each rw In sh.Rows
    'If the data of one cell is empty EXIT THE LOOP
        If sh.Cells(rw.Row, 1).Value = "" Then


    Exit For
        End If

    If rw.Row > 1 Then

       'Afffecting the variables to the next loop so we can compare the values
       nextLocationName = sh.Cells(rw.Row + 1, 1).Value
       nextPlanCode = sh.Cells(rw.Row + 1, 2).Value
       nextLastName = sh.Cells(rw.Row + 1, 3).Value
       nextFirstName = sh.Cells(rw.Row + 1, 4).Value
       nextEmploymentDate = sh.Cells(rw.Row + 1, 5).Value
       nextBirthDate = sh.Cells(rw.Row + 1, 6).Value
       nextDependantFirstName = sh.Cells(rw.Row + 1, 25).Value
       nextDependantLastName = sh.Cells(rw.Row + 1, 26).Value
       nextDependantBirthdate = sh.Cells(rw.Row + 1, 27).Value

       Debug.Print LastName & " - " & FirstName  & " ::: "  & nextLastName & " - " & nextFirstName & " : " & rw.Row & " : " & firstOccurence


       'First time you pass through the loop write the whole lane
        If firstOccurence = True Then

        'Affecting the variables to the current loops values
       LocationName = sh.Cells(rw.Row, 1).Value
       PlanCode = sh.Cells(rw.Row, 2).Value
       LastName = sh.Cells(rw.Row, 3).Value
       FirstName = sh.Cells(rw.Row, 4).Value
       dependantFirstName = sh.Cells(rw.Row, 25).Value
       dependantLastName = sh.Cells(rw.Row, 26).Value
       dependantBirthdate = sh.Cells(rw.Row, 27).Value

       'Write the current line
        sh.Rows(rw.Row).Copy
        'We copy the value into another sheet
        Set ns = Sheets("Benefits Census Formatted")
        LastRow = ns.Cells(ns.Rows.Count, "A").End(xlUp).Row + 1
        ns.Rows(LastRow).PasteSpecial xlPasteValues

        firstOccurence = False


    Else

      'We match the location with the plan code and the last name and first name of the user to find duplicates
       If dependantFirstName <> nextDependantFirstName And PlanCode <> nextPlanCode And LastName <> nextLastName And FirstName <> nextFirstName Then



      'We find a different dependant if the first name or the last name or the birthdate differs
       'If Not (dependantFirstName <> nextDependantFirstName) Or Not (dependantLastName <> nextDependantLastName) Or Not (dependantBirthdate <> nextDependantBirthdate) Then

       'We have a dependant Append it to the line
        'append the user to the currentLine
        'End If

        Else
        'If the dependantFirstName and the nextDependant First name doesn't match then on the next loop  we print the full line
        firstOccurence = True


        End If


        End If

        RowCount = RowCount + 1
        'End of if row > 2
        End If

        Next rw

    End With

End Sub

2条回答
迷人小祖宗
2楼-- · 2019-09-13 23:26

I would use an approach using Dictionaries to collect and organize the data, and then output it. Judging both by your comments, and the code, there is a lot of stuff you haven't included. But the following code will take your original data, and output a table close to what you show -- some of the results ordering is different, but it is standardized (i.e. there is a relation listed with every dependent name.

In the dictionary, we use Last Name and Birthdate as the "key" so as to combine what you stated were the duplicates.

We define two Class objects

  • Dependent object which includes the Name and the Relation
  • Family object which includes the First and Last Names, and Birthdate as well as a collection (dictionary) of the dependent objects.

Once we have it organized, it is relatively simple to output it as we want.

For a discussion of Classes, you can do an Internet search. I would recommend Chip Pearson's Introduction to Classes

Be sure to read the notes in the code about renaming the class modules, and also setting a reference to Microsoft Scripting Runtime

Class1

Option Explicit
'Rename this module: cDependents
'set reference to Microsoft Scripting Runtime
Private pRelation As String
Private pDepName As String

Public Property Get Relation() As String
    Relation = pRelation
End Property
Public Property Let Relation(Value As String)
    pRelation = Value
End Property

Public Property Get DepName() As String
    DepName = pDepName
End Property
Public Property Let DepName(Value As String)
    pDepName = Value
End Property

Class2

Option Explicit
'rename this module: cFamily
'set reference to Microsoft Scripting Runtime
Private pFirstName As String
Private pLastName As String
Private pBirthdate As Date
Private pDependents As Dictionary

Public Property Get FirstName() As String
    FirstName = pFirstName
End Property
Public Property Let FirstName(Value As String)
    pFirstName = Value
End Property

Public Property Get LastName() As String
    LastName = pLastName
End Property
Public Property Let LastName(Value As String)
    pLastName = Value
End Property

Public Property Get Birthdate() As Date
    Birthdate = pBirthdate
End Property
Public Property Let Birthdate(Value As Date)
    pBirthdate = Value
End Property

Public Function ADDDependents(Typ, Nme)
    Dim cD As New cDependents
    Dim sKey As String
    With cD
        .DepName = Nme
        .Relation = Typ
        sKey = .Relation & Chr(1) & .DepName
    End With

    If Not pDependents.Exists(sKey) Then
        pDependents.Add Key:=sKey, Item:=cD
    End If
End Function

Public Property Get Dependents() As Dictionary
    Set Dependents = pDependents
End Property


Private Sub Class_Initialize()
    Set pDependents = New Dictionary
End Sub

Regular Module

Option Explicit
'set reference to Microsoft Scripting Runtime
Sub Family()
    Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
    Dim vSrc As Variant, vRes As Variant
    Dim dF As Dictionary, cF As cFamily
    Dim I As Long, J As Long
    Dim sKey As String
    Dim V As Variant, W As Variant

'Set source and results worksheets and results range
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet2")
    Set rRes = wsRes.Cells(1, 1)

'read source data into array
With wsSrc
    vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=5)
End With

'Collect and organize the family and dependent objects
Set dF = New Dictionary
For I = 2 To UBound(vSrc, 1)
    Set cF = New cFamily
    With cF
        .FirstName = vSrc(I, 1)
        .LastName = vSrc(I, 2)
        .Birthdate = vSrc(I, 3)
        .ADDDependents vSrc(I, 4), vSrc(I, 5)
        sKey = .LastName & Chr(1) & .Birthdate
        If Not dF.Exists(sKey) Then
            dF.Add Key:=sKey, Item:=cF
        Else
            dF(sKey).ADDDependents vSrc(I, 4), vSrc(I, 5)
        End If
    End With
Next I

'Results will have two columns for each relation, including Main
' + three columns at the beginning

'get number of extra columns
    Dim ColCount As Long
    For Each V In dF
        I = dF(V).Dependents.Count
        ColCount = IIf(I > ColCount, I, ColCount)
    Next V
    ColCount = ColCount * 2 + 3


ReDim vRes(0 To dF.Count, 1 To ColCount)
    vRes(0, 1) = "First Name"
    vRes(0, 2) = "Last Name"
    vRes(0, 3) = "Birthdate"
    vRes(0, 4) = "Dependant"
    vRes(0, 5) = "Dependant Name"
    For J = 6 To UBound(vRes, 2) Step 2
        vRes(0, J) = "Relation " & J - 5
        vRes(0, J + 1) = "Dependant Name"
    Next J


I = 0
For Each V In dF
    I = I + 1
    With dF(V)
        vRes(I, 1) = .FirstName
        vRes(I, 2) = .LastName
        vRes(I, 3) = .Birthdate

        J = 2
        For Each W In .Dependents
            J = J + 2
            With .Dependents(W)
                vRes(I, J) = .Relation
                vRes(I, J + 1) = .DepName
            End With
        Next W
    End With
Next V

Set rRes = rRes.Resize(rowsize:=UBound(vRes, 1) + 1, columnsize:=UBound(vRes, 2))
With rRes
    .EntireColumn.Clear
    .Value = vRes
    With .Rows(1)
        .Font.Bold = True
        .HorizontalAlignment = xlCenter
    End With
    .EntireColumn.AutoFit
End With

End Sub

Source Data

enter image description here

Results

enter image description here

查看更多
迷人小祖宗
3楼-- · 2019-09-13 23:31

This is the code I wrote for you. (Glad to see that so many others did, too. So you got a choice :-))

Sub TransscribeData()
    ' 25 Mar 2017

    Dim WsS As Worksheet                    ' Source
    Dim WsT As Worksheet                    ' Target
    Dim TargetName As String
    Dim LastRow As Long                     ' in WsS
    Dim Rs As Long                          ' Source: row
    Dim Rt As Long, Ct As Long              ' Target: row / column
    Dim Tmp As String
    Dim Comp As String                      ' compare string

    ' Set Source sheet to the one containing the original data
    Set WsS = Worksheets("BENEFIT Census")
    LastRow = WsS.Cells(WsS.Rows.Count, NbcName).End(xlUp).Row

    Application.ScreenUpdating = False
    TargetName = "Benefits Census Formatted"
    On Error Resume Next
    Set WsT = Worksheets(TargetName)        ' Set the Target sheet
    If Err Then
        ' Create it if it doesn't exist
        Set WsT = Worksheets.Add(After:=Worksheets(Worksheets.Count))
        WsT.Name = TargetName
        ' insert the column captions here
    End If
    On Error GoTo 0

    Rt = WsT.Cells(WsS.Rows.Count, NfdName).End(xlUp).Row
    AddMain WsS, WsT, NbcFirstDataRow, Rt    ' Rt is counting in the sub
    For Rs = NbcFirstDataRow To LastRow - 1
        With WsS.Rows(Rs)
            Tmp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
        End With
        With WsS.Rows(Rs + 1)
            Comp = .Cells(NbcFname).Value & .Cells(NbcName).Value & .Cells(NbcDob).Value
        End With
        If StrComp(Tmp, Comp, vbTextCompare) Then
            AddMain WsS, WsT, Rs + 1, Rt
        Else
            Ct = WsT.Cells(Rt, WsT.Columns.Count).End(xlToLeft).Column
            If Ct > NfdMain Then Ct = Ct + 1
            With WsS.Rows(Rs + 1)
                WsT.Cells(Rt, Ct + NfdRelate).Value = .Cells(NbcRelate).Value
                WsT.Cells(Rt, Ct + NfdDepName).Value = .Cells(NbcDepName).Value
            End With
        End If
    Next Rs
    Application.ScreenUpdating = True
End Sub

The above code calls one Sub routine which you must add in the same code module which, by the way, should be a normal code module (by default "Module1" but you can rename it to whatever).

Private Sub AddMain(WsS As Worksheet, WsT As Worksheet, _
                    Rs As Long, Rt As Long)
    ' 25 Mar 2017

    Rt = Rt + 1
    With WsS.Rows(Rs)
        WsT.Cells(Rt, NfdFname).Value = .Cells(NbcFname).Value
        WsT.Cells(Rt, NfdName).Value = .Cells(NbcName).Value
        WsT.Cells(Rt, NfdDob).Value = .Cells(NbcDob).Value
        WsT.Cells(Rt, NfdMain).Value = "Main"
    End With
End Sub

Observe that I inserted the word "Main" as hard text. You could also copy the content of the appropriate call in the Source sheet. This procedure only writes the first entry. Dependents are written by another code.

The entire code is controlled by two "enums", enumerations, one for each of the worksheets. Enums are the quickest way to assign names to numbers. Please paste these two enums at the top of your code sheet, before either of the procedures.

Private Enum Nbc                        ' worksheet Benefit Census
    NbcFirstDataRow = 2                 ' Adjust as required
    NbcFname = 1                        ' columns:
    NbcName
    NbcDob
    NbcRelate
    NbcDepName
End Enum

Private Enum Nfd                        ' worksheet Formatted Data
    NfdFirstDataRow = 2                 ' Adjust as required
    NfdName = 1                         ' columns:
    NfdFname
    NfdDob
    NfdMain
    NfdRelate = 0                       ' Offset from NfdMain
    NfdDepName
End Enum

Note that the rule of enums is that you can assign any integer to them. If you don't assign any number the value will be one higher than the previous. So, NfdMain = 4, followed by NfdRelate which has an assigned value of 0, followed by NfdDepName which has a value of 0 + 1 = 1.

The numbers in these enumerations are columns (and rows). You can control the entire output by adjusting these numbers. For example, "Main" is written into column NfdMain (=4 =D). Change the value to 5 and "Main" will appear in column 5 = E. No need to go rummaging in the code. Consider this a control panel.

In the formatted output I introduced a logic which is slightly different from yours. If you don't like it you can change it easily by modifying the enums. My logic has the family name as the main criterion in the first column (switched from the raw data). In column D I write "Main". But when there is a dependent I write the relationship in column D. Therefore only entries without any dependents will have "Main" in that column. For your first example, the formatted row will show Rasmond / Shawn / 01-01-1990 / Spouse / Jessica, Child 1 / Vanessa. enter image description here

If you wish to keep the "Main and place "Spouse" in the next column, just set the enum NfdRelate = 1. With the "control panel" it's that simple.

查看更多
登录 后发表回答