I have an Excel workbook with 250,000 rows and 10 columns and I want to split up the data into different workbooks. My idea was to filter the list so that Excel/VBA doesn't have to go through all 250,000 rows every time my code says to look for something in the data.
However, I've run into one specific problem with Sort and also have a general question regarding hidden rows and SpecialCells(xlCellTypeVisible). First off, here's the code:
Option Explicit

Sub Filtering()
    Dim wsData As Worksheet
    Dim cell As Variant
    Dim lRowData As Long, lColData As Long

    'filter
    Set wsData = ThisWorkbook.Sheets(1)
    lRowData = wsData.Cells(Rows.Count, 1).End(xlUp).Row
    wsData.Range("A:A").AutoFilter Field:=1, Criteria1:="Name1"
    For Each cell In wsData.Range(wsData.Cells(2, 1), wsData.Cells(100, 1)).SpecialCells(xlCellTypeVisible)
        Debug.Print cell.Value
    Next cell

    'sort
    lColData = wsData.Cells(1, Columns.Count).End(xlToLeft).Column
    wsData.Range(wsData.Cells(1, 1), wsData.Cells(lRowData, lColData)).SpecialCells(xlCellTypeVisible).Sort _
        Key1:=wsData.Range("B1:B100"), Order1:=xlDescending, Header:=xlYes ' returns error because of SpecialCells
End Sub
- "Run-time error '1004': This can't be done on a multiple range selection. Select a single range and try again." This occurs in the last line:
wsData.Range(wsData.Cells(1, 1), wsData.Cells(lRowData, lColData)).SpecialCells(xlCellTypeVisible).Sort Key1:=wsData.Range("B1:B100"), Order1:=xlDescending, Header:=xlYes
It only happens when I use SpecialCells(xlCellTypeVisible), so this version works:
wsData.Range(wsData.Cells(1, 1), wsData.Cells(lRowData, lColData)).Sort Key1:=wsData.Range("B1:B100"), Order1:=xlDescending, Header:=xlYes
My thinking in using SpecialCells(xlCellTypeVisible) was that only then would VBA skip the filtered cells. I've tried it out, though, and to me it seems .Sort skips them anyway, with or without SpecialCells(xlCellTypeVisible). Can someone confirm this?
- And this leads to my more general question: one thing I'm not quite clear on is when Excel/VBA skips filtered rows and when it doesn't. To loop through the visible cells, I need to use SpecialCells(xlCellTypeVisible). With .Sort I (maybe) don't? And this question will pop up for any operation I do on these filtered lists.
This made me wonder: should I work with my original sheet where part of the data is hidden or should I temporarily create a new sheet, copy only the data I need (= excluding the rows I've hidden with the filter) and then work with that? Would this new sheet make it quicker or easier in any way? What is better in your experience?
As per bm13563's comment, you are copying nonadjacent cells. Also, using Sort will alter your base data, which could matter if you ever need to determine how it was originally ordered.
Working with filters can become quite complex so a simpler (and not particularly slow) method could be to do a string search with your filtering value in your chosen column and then loop through the instances returned performing actions on each result.
The (slightly adapted) code below from David Zemens would be a good starting point (copied from Find All Instances in Excel Column)
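As a rough illustration of the search-and-loop idea, here is a minimal sketch using Range.Find and Range.FindNext. It is not David Zemens's code; the Sub name and variable names are hypothetical, and "Name1" and Sheets(1) are simply taken from the question. It assumes no filter is active on the sheet, since, as far as I know, Find with LookIn:=xlValues does not look inside rows hidden by a filter.

```vba
Option Explicit

' Sketch only: visits every cell in column A whose whole value matches searchValue.
' FindAllInstances, searchRange and searchValue are hypothetical names.
Sub FindAllInstances()
    Const searchValue As String = "Name1"   ' filter value from the question
    Dim ws As Worksheet
    Dim searchRange As Range
    Dim found As Range
    Dim firstAddress As String

    Set ws = ThisWorkbook.Sheets(1)
    ' Column A from row 2 (below the header) down to the last used row.
    Set searchRange = ws.Range(ws.Cells(2, 1), ws.Cells(ws.Rows.Count, 1).End(xlUp))

    ' Start after the last cell so that the first cell of the range is checked first.
    Set found = searchRange.Find(What:=searchValue, _
                                 After:=searchRange.Cells(searchRange.Cells.Count), _
                                 LookIn:=xlValues, LookAt:=xlWhole, _
                                 SearchOrder:=xlByRows, SearchDirection:=xlNext, _
                                 MatchCase:=False)
    If found Is Nothing Then Exit Sub       ' no match at all

    firstAddress = found.Address
    Do
        Debug.Print found.Row, found.Value  ' replace with the real per-row action
        Set found = searchRange.FindNext(found)
    Loop While found.Address <> firstAddress ' stop once the search wraps around
End Sub
```

The loop only reads values. If the per-row action changes the matched cells, FindNext may eventually return Nothing, so the exit condition would need an extra check in that case.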
Your first error occurs when you attempt to copy nonadjacent cell or range selections, e.g. multiple nonadjacent rows within the same column (A1, A3, A5). This is because Excel "slides" the ranges together and pastes them as a single rectangle. Your visible special cells are nonadjacent, and therefore can't be treated as a single range.
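One way to see this for yourself is to print the areas that SpecialCells(xlCellTypeVisible) returns on the filtered sheet. The sketch below is only illustrative: the Sub name is hypothetical, and it assumes the data is a contiguous block starting at A1 on Sheets(1).

```vba
Option Explicit

' Sketch only: lists the separate blocks (areas) that make up the visible cells.
' InspectVisibleAreas is a hypothetical name.
Sub InspectVisibleAreas()
    Dim ws As Worksheet
    Dim visibleCells As Range
    Dim area As Range

    Set ws = ThisWorkbook.Sheets(1)

    ' SpecialCells raises run-time error 1004 when no cell qualifies, so guard the call.
    On Error Resume Next
    Set visibleCells = ws.Range("A1").CurrentRegion.SpecialCells(xlCellTypeVisible)
    On Error GoTo 0
    If visibleCells Is Nothing Then Exit Sub

    ' More than one area means the range is a multiple selection.
    Debug.Print "Areas: " & visibleCells.Areas.Count
    For Each area In visibleCells.Areas
        Debug.Print area.Address(False, False)
    Next area
End Sub
```

With a filter applied, the count is typically greater than 1, which matches the "multiple range selection" wording of the error message.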
It seems that Excel is looping through all of the cells in your range, not just the visible ones. Your Debug.Print is returning more rows than just those that are visible.
I would take a different approach to tackling your problem by using arrays, which VBA is able to loop through extremely quickly compared to worksheets.
Using this approach, I was able to copy 9k rows with 10 columns based on the value of the first column from a sample size of 190k in 4.55 seconds:
EDIT: I did some messing around with the arrays which brought the time down to 0.45 seconds to copy 9k rows based on the first column from an initial 190k using the following:
It isn't super clean and could probably do with some refining, but if speed is important (which it often seems to be), this should do the job well for you.
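For readers who want a starting point, the array approach described above could look roughly like the sketch below. This is not the code that was timed in this answer, so the figures quoted do not apply to it. The Sub name and variable names are hypothetical; "Name1" and Sheets(1) come from the question, and the sketch assumes a header in row 1.

```vba
Option Explicit

' Sketch only: copies rows whose first column equals matchValue into a new workbook,
' reading and writing the sheet once each. CopyMatchesViaArray is a hypothetical name.
Sub CopyMatchesViaArray()
    Const matchValue As String = "Name1"   ' criterion from the question
    Dim ws As Worksheet
    Dim data As Variant, result() As Variant
    Dim lastRow As Long, lastCol As Long
    Dim r As Long, c As Long, hits As Long, n As Long

    Set ws = ThisWorkbook.Sheets(1)
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
    If lastRow < 2 Then Exit Sub           ' header only, nothing to copy

    ' One read of the whole block into a 1-based, two-dimensional Variant array.
    data = ws.Range(ws.Cells(1, 1), ws.Cells(lastRow, lastCol)).Value

    ' First pass: count matches so the result array is sized only once.
    ' Note: this comparison is case-sensitive, unlike AutoFilter.
    For r = 2 To lastRow
        If CStr(data(r, 1)) = matchValue Then hits = hits + 1
    Next r
    If hits = 0 Then Exit Sub

    ' Second pass: copy the header and every matching row.
    ReDim result(1 To hits + 1, 1 To lastCol)
    For c = 1 To lastCol
        result(1, c) = data(1, c)
    Next c
    n = 1
    For r = 2 To lastRow
        If CStr(data(r, 1)) = matchValue Then
            n = n + 1
            For c = 1 To lastCol
                result(n, c) = data(r, c)
            Next c
        End If
    Next r

    ' One write of the collected rows into the first sheet of a new workbook.
    With Workbooks.Add.Worksheets(1)
        .Range("A1").Resize(UBound(result, 1), UBound(result, 2)).Value = result
    End With
End Sub
```

Counting first and copying second keeps the result array at its final size from the start, at the cost of scanning the first column twice in memory.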