I hope you can help me with this issue.
I have an Excel file with 146,459 rows and I need to delete blank cells to unify my data. Here is an image of what I mean:
When I select all blanks, my laptop takes around 2 minutes, but then when I try to delete the cells from one or more columns and shift up, Excel freezes and nothing happen. I already left my laptop for over 1 hours like that and I didn't have any results.
Do you know if there's a way to do it or if any alternatives can be implemented?
Thanks in advance!
Looping through cells takes a very long time, even with the Union optimisation. The code below was tested on an imitated data set, 5 columns x 200,000 records, and finished in 5.5 seconds.
Setup: Let's say your source data is in a range "A1:E200000" on a sheet named "Source", and you want clean data in a similar range on a sheet named "Target".
Code:
How it works: Sub Remove_Empty_Cells loops though the source range by columns, and calls sub "Clean_Column" that removes empty cells from the provided column.
Clean_Column loads all column cells into an ADO recordset using MSXML2.DOMDocument object. The recordset is then filtered for non-empty rows, and the result is copied to the target column. All these operations are very fast in VBA.
Ideally, I would love to load the entire range into a recordset at once, but unfortunately VBA function CopyFromRecordset does not alow to paste recordset field by field. So we have to load the data column by column (if somebody knows a more optimal way, I'd love to see it).
A couple of caveats:
[EDIT]: An alternative solution, implemented using arrays. The same data set 5x 200,000 with 40,000 valid records is cleaned in less than 1 second. It can be further optimized, I just prototyped a quick demo.
Working with arrays is either one of the fastest or the fastest method of dealing with large ranges of cells.
Start with:
Run code:
Results:
Seeded data:
The following was used to replicate the OP 'sample-data-in-an-image'.