I'm writing some code in VBA behind Excel to pull some summary numbers out of potentially huge text files (10M+ rows) out on a network drive. In the past, these numbers have been pulled using greps in linux, but I was hoping to implement something that could be done with a click of a button in Excel for ease of use.
My solution works, but it's like 25 times slower than a linux grep - takes 4 minutes to query 10M records, while the grep can do it in 10 seconds. Should I not be using ADO for this? Why is it so slow, aside from the fact that text files obviously aren't indexed? Is there a better solution that could still be coded without too much hassle in VBA, or is it a lost cause? I'm using Excel 2007 and the ADO 6.0 library. Here is some sample code:
Sub RunSQL()
Dim cn As New ADODB.Connection
Dim rs As New ADODB.Recordset
cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited"";" & _
"Data Source=\\network\share\path\;"
rs.Open "select count(*) from Customers.tab where CHANGE_FLAG = 'Y'", cn
Range("A1").CopyFromRecordset rs
rs.Close
cn.Close
End Sub