Subtracting ranges in VBA (Excel)

2020-02-10 21:49发布

问题:

What I'm trying to do

I'm trying to write a function to subtract Excel ranges. It should take two input parameters: range A and range B. It should return a range object consisting of cells that are part of range A and are not part of range B (as in set subtraction)

What I've tried

I've seen some examples on the web that use a temporary worksheet to do this (fast, but might introduce some issues with protected workbooks and such) and some other examples that go cell by cell through the first range checking for intersections with the second one (extremely slow).

After some thinking I've come up with this code {1}, which works faster, but still is slow. Subtracting from a range representing the whole worksheet takes from 1 to 5 minutes depending on how complex the second range is.

When I looked over that code trying to find ways to make it faster I saw a possibility for applying the divide-and-conquer paradigm, which I did {2}. But that had made my code slower instead. I'm not much of a CS guy, so I might have done something wrong or this algorithm simply is not the one the divide-and-conquer should be used on, I don't know.

I have also tried rewriting it using mostly recursion, but that took forever to finish or (more often) had thrown Out of Stack Space errors. I didn't save the code.

The only (marginally) successful improvement I've been able to do is adding a flip switch {3} and going first through rows, then (in the next call) through columns instead of going through both in the same call, but the effect was not as good as I've hoped. Now I see that even though we don't go through all rows in the first call, in the second call we still loop through the same amount of rows we would in the first one, only these rows are a little bit shorter :)

I would appreciate any help in improving or rewriting this function, thank you!

The solution, based on the accepted answer by Dick Kusleika

Dick Kusleika, thank you very much for providing your answer! I think I'll use it with some modifications I've made:

  • Got rid of the global variable (mrBuild)
  • Fixed "some overlap" condition to exclude "no overlap" case
  • Added more complex conditions to choose whether to split the range top to bottom or left to right

With these modifications the code runs very fast on the most of common cases. As it's been pointed out, it will still be slow with checkerboard-style huge range which I agree is unavoidable.

I think this code still has a room for improvement and I'll update this post in case I modify it.

Improvement possibilities:

  • Heuristics of choosing how to split the range (by columns or by rows)

{0} Solution code

Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range
'
' Returns a range of cells that are part of rFirst, but not part of rSecond
' (as in set subtraction)
'
' This function handles big input ranges really well!
'
' The reason for having a separate recursive function is
' handling multi-area rFirst range
'
    Dim rInter As Range
    Dim rReturn As Range
    Dim rArea As Range

    Set rInter = Intersect(rFirst, rSecond)
    Set mrBuild = Nothing

    If rInter Is Nothing Then 'no overlap
        Set rReturn = rFirst
    ElseIf rInter.Address = rFirst.Address Then 'total overlap
        Set rReturn = Nothing
    Else 'partial overlap
        For Each rArea In rFirst.Areas
            Set mrBuild = BuildRange(rArea, rInter) 'recursive
        Next rArea
        Set rReturn = mrBuild
    End If

    Set SubtractRanges = rReturn
End Function


Private Function BuildRange(rArea As Range, rInter As Range, _
Optional mrBuild As Range = Nothing) As Range
'
' Recursive function for SubtractRanges()
'
' Subtracts rInter from rArea and adds the result to mrBuild
'
    Dim rLeft As Range, rRight As Range
    Dim rTop As Range, rBottom As Range
    Dim rInterSub As Range
    Dim GoByColumns As Boolean

    Set rInterSub = Intersect(rArea, rInter)
    If rInterSub Is Nothing Then 'no overlap
        If mrBuild Is Nothing Then
            Set mrBuild = rArea
        Else
            Set mrBuild = Union(mrBuild, rArea)
        End If
    ElseIf Not rInterSub.Address = rArea.Address Then 'some overlap
        If Not rArea.Cells.CountLarge = 1 Then 'just in case there is only one cell for some impossible reason

            ' Decide whether to go by columns or by rows
            ' (helps when subtracting whole rows/columns)
            If Not rInterSub.Columns.Count = rArea.Columns.Count And _
            ((Not rInterSub.Cells.CountLarge = 1 And _
            (rInterSub.Rows.Count > rInterSub.Columns.Count _
            And rArea.Columns.Count > 1) Or (rInterSub.Rows.Count = 1 _
            And Not rArea.Columns.Count = 1)) Or _
            (rInterSub.Cells.CountLarge = 1 _
            And rArea.Columns.Count > rArea.Rows.Count)) Then
                    GoByColumns = True
            Else
                    GoByColumns = False
            End If

            If Not GoByColumns Then
                Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
                Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
                Set mrBuild = BuildRange(rTop, rInterSub, mrBuild) 'rerun it
                Set mrBuild = BuildRange(rBottom, rInterSub, mrBuild)
            Else
                Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
                Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
                Set mrBuild = BuildRange(rLeft, rInterSub, mrBuild) 'rerun it
                Set mrBuild = BuildRange(rRight, rInterSub, mrBuild)
            End If
        End If
    End If

    Set BuildRange = mrBuild
End Function

Other code mentioned in the question

{1} Initial code (go row by row, column by column)

Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
    Dim CommonArea As Range
    Dim Result As Range

    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range

        For Each Area In RangeA.Areas
            For Each Row In Area.Rows
                Set RowCommonArea = Intersect(Row, CommonArea)
                If Not RowCommonArea Is Nothing Then
                    If Not RowCommonArea.Address = Row.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Row)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Row)
                End If
            Next Row

            For Each Column In Area.Columns
                Set ColumnCommonArea = Intersect(Column, CommonArea)
                If Not ColumnCommonArea Is Nothing Then
                    If Not ColumnCommonArea.Address = Column.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Column)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Column)
                End If
            Next Column
        Next Area

        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
            Next Area
        End If

        Set Result = GoodCells
    End If

    Set SubtractRanges = Result
End Function

{2} Divide and conquer

Function SubtractRanges(RangeA, RangeB) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
    Dim CommonArea As Range
    Dim Result As Range

    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range

        For Each Area In RangeA.Areas

            RowsNumber = Area.Rows.Count
            If RowsNumber > 1 Then
                Set RowsLeft = Range(Area.Rows(1), Area.Rows(RowsNumber / 2))
                Set RowsRight = Range(Area.Rows(RowsNumber / 2 + 1), Area.Rows(RowsNumber))
            Else
                Set RowsLeft = Area
                Set RowsRight = CommonArea.Cells(1, 1) 'the next best thing to Nothing - will end its cycle rather fast and won't throw an error with For Each statement
            End If
            For Each Row In Array(RowsLeft, RowsRight)
                Set RowCommonArea = Intersect(Row, CommonArea)
                If Not RowCommonArea Is Nothing Then
                    If Not RowCommonArea.Address = Row.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Row)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Row)
                End If
            Next Row

            ColumnsNumber = Area.Columns.Count
            If ColumnsNumber > 1 Then
                Set ColumnsLeft = Range(Area.Columns(1), Area.Columns(ColumnsNumber / 2))
                Set ColumnsRight = Range(Area.Columns(ColumnsNumber / 2 + 1), Area.Columns(ColumnsNumber))
            Else
                Set ColumnsLeft = Area
                Set ColumnsRight = CommonArea.Cells(1, 1)
            End If
            For Each Column In Array(ColumnsLeft, ColumnsRight)
                Set ColumnCommonArea = Intersect(Column, CommonArea)
                If Not ColumnCommonArea Is Nothing Then
                    If Not ColumnCommonArea.Address = Column.Address Then
                        Set UnworkedCells = AddRanges(UnworkedCells, Column)
                    End If
                Else
                    Set GoodCells = AddRanges(GoodCells, Column)
                End If
            Next Column
        Next Area

        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea))
            Next Area
        End If

        Set Result = GoodCells
    End If

    Set SubtractRanges = Result
End Function

{3} Initial code + flip switch (row by row OR column by column in turns)

Function SubtractRanges(RangeA, RangeB, Optional Flip As Boolean = False) As Range
'
' Returns a range of cells that are part of RangeA, but not part of RangeB
'
' This function handles big RangeA pretty well (took less than a minute
' on my computer with RangeA = ActiveSheet.Cells)
'
    Dim CommonArea As Range
    Dim Result As Range

    Set CommonArea = Intersect(RangeA, RangeB)
    If CommonArea Is Nothing Then
        Set Result = RangeA
    ElseIf CommonArea.Address = RangeA.Address Then
        Set Result = Nothing
    Else
        'a routine to deal with A LOT of cells in RangeA
        'go column by column, then row by row
        Dim GoodCells As Range
        Dim UnworkedCells As Range

        For Each Area In RangeA.Areas
            If Flip Then
                For Each Row In Area.Rows
                    Set RowCommonArea = Intersect(Row, CommonArea)
                    If Not RowCommonArea Is Nothing Then
                        If Not RowCommonArea.Address = Row.Address Then
                            Set UnworkedCells = AddRanges(UnworkedCells, Row)
                        End If
                    Else
                        Set GoodCells = AddRanges(GoodCells, Row)
                    End If
                Next Row
            Else
                For Each Column In Area.Columns
                    Set ColumnCommonArea = Intersect(Column, CommonArea)
                    If Not ColumnCommonArea Is Nothing Then
                        If Not ColumnCommonArea.Address = Column.Address Then
                            Set UnworkedCells = AddRanges(UnworkedCells, Column)
                        End If
                    Else
                        Set GoodCells = AddRanges(GoodCells, Column)
                    End If
                Next Column
            End If
        Next Area

        If Not UnworkedCells Is Nothing Then
            For Each Area In UnworkedCells
                Set GoodCells = AddRanges(GoodCells, SubtractRanges(Area, CommonArea, Not Flip))
            Next Area
        End If

        Set Result = GoodCells
    End If

    Set SubtractRanges = Result
End Function

A little helper function mentioned here and there:

Function AddRanges(RangeA, RangeB)
'
' The same as Union built-in but handles empty ranges fine.
'
    If Not RangeA Is Nothing And Not RangeB Is Nothing Then
        Set AddRanges = Union(RangeA, RangeB)
    ElseIf RangeA Is Nothing And RangeB Is Nothing Then
        Set AddRanges = Nothing
    Else
        If RangeA Is Nothing Then
            Set AddRanges = RangeB
        Else
            Set AddRanges = RangeA
        End If
    End If
End Function

回答1:

Your divide and conquer seems like a good way to go. You need to introduce some recursion and should be reasonably fast

Private mrBuild As Range

Public Function SubtractRanges(rFirst As Range, rSecond As Range) As Range

    Dim rInter As Range
    Dim rReturn As Range
    Dim rArea As Range

    Set rInter = Intersect(rFirst, rSecond)
    Set mrBuild = Nothing

    If rInter Is Nothing Then 'No overlap
        Set rReturn = rFirst
    ElseIf rInter.Address = rFirst.Address Then 'total overlap
        Set rReturn = Nothing
    Else 'partial overlap
        For Each rArea In rFirst.Areas
            BuildRange rArea, rInter
        Next rArea
        Set rReturn = mrBuild
    End If

    Set SubtractRanges = rReturn

End Function

Sub BuildRange(rArea As Range, rInter As Range)

    Dim rLeft As Range, rRight As Range
    Dim rTop As Range, rBottom As Range

    If Intersect(rArea, rInter) Is Nothing Then 'no overlap
        If mrBuild Is Nothing Then
            Set mrBuild = rArea
        Else
            Set mrBuild = Union(mrBuild, rArea)
        End If
    Else 'some overlap
        If rArea.Columns.Count = 1 Then 'we've exhausted columns, so split on rows
            If rArea.Rows.Count > 1 Then 'if one cell left, don't do anything
                Set rTop = rArea.Resize(rArea.Rows.Count \ 2) 'split the range top to bottom
                Set rBottom = rArea.Resize(rArea.Rows.Count - rTop.Rows.Count).Offset(rTop.Rows.Count)
                BuildRange rTop, rInter 'rerun it
                BuildRange rBottom, rInter
            End If
        Else
            Set rLeft = rArea.Resize(, rArea.Columns.Count \ 2) 'split the range left to right
            Set rRight = rArea.Resize(, rArea.Columns.Count - rLeft.Columns.Count).Offset(, rLeft.Columns.Count)
            BuildRange rLeft, rInter 'rerun it
            BuildRange rRight, rInter
        End If
    End If

End Sub

These aren't particularly huge ranges, but they all ran fast

?subtractranges(rangE("A1"),range("a10")).Address
$A$1
?subtractranges(range("a1"),range("a1")) is nothing
True
?subtractranges(range("$B$3,$B$6,$C$8:$W$39"),range("a1:C10")).Address
$C$11:$C$39,$D$8:$W$39
?subtractranges(range("a1:C10"),range("$B$3,$B$6,$C$8:$W$39")).Address
$A$1:$A$10,$B$1:$B$2,$B$4:$B$5,$B$7:$B$10,$C$1:$C$7


回答2:

My solution is shorter but I don't know if it is optimal one:

Sub RangeSubtraction()

    Dim firstRange As Range
    Dim secondRange As Range
    Dim rIntersect As Range
    Dim rOutput As Range
    Dim x As Range

    Set firstRange = Range("A1:B10")
    Set secondRange = Range("A5:B10")

    Set rIntersect = Intersect(firstRange, secondRange)

    For Each x In firstRange
        If Intersect(rIntersect, x) Is Nothing Then
            If rOutput Is Nothing Then 'ugly 'if-else' but needed, can't use Union(Nothing, Range("A1")) etc.
                Set rOutput = x
            Else
                Set rOutput = Application.Union(rOutput, x)
            End If
        End If
    Next x

    Msgbox rOutput.Address

End Sub


回答3:

Although iterative and not recursive, here's my solution. The function returns the rangeA subtracted by rangeB

public Function SubtractRange(rangeA Range, rangeB as Range) as Range
'rangeA is a range to subtract from
'rangeB is the range we want to subtract

 Dim existingRange As Range
  Dim resultRange As Range
  Set existingRange = rangeA
  Set resultRange = Nothing
  Dim c As Range
  For Each c In existingRange
  If Intersect(c, rangeB) Is Nothing Then
    If resultRange Is Nothing Then
      Set resultRange = c
    Else
      Set resultRange = Union(c, resultRange)
    End If
  End If
  Next c
  Set SubtractRange = resultRange
End Sub