I have some very large .csv files that I want to query and pull entries out. I've set this up using ADO in Excel 2016 successfully, or so it seemed, and remarkably fast. I am looking to query a field ID using;
SELECT * FROM <file> WHERE ID = #.
This has worked fine up until the ID numbers which contain letters, i.e. 960545H4. These appear much further down the file, around ~400k rows in.
I believe the problem is that excel is assuming this field (under the header ID) is numeric based one the first x number of entries. I want this to be set to text.
Is there anyway to set the datatype of the field/column so that I can query successfully?
Sub testSQL()
Dim xlcon as ADOB.Connection
Dim xlrs as ADOB.RecordSet
Dim nextRow as Integer
Dim datafilepath as String
Dim datafilename as String
Set xlcon = New ADOB.Connection
Set xlrs = New ADOB.RecordSet
datafilepath = "U:\Common\"
datafilename = "test_file"
xlcon.Provider = "Microsoft.Jet.OLEDB.4.0"
xlcon.ConnectionString = "Data Source=" & datafilepath & ";" & "Extended Properties=""text;HDR=Yes;FMT=Delimited,"""
xlcon.Open
xlrs.Open "SELECT * FROM [" & datafilename & ".csv] WHERE ID = '023487562HH'", xlcon
'the rest of the code...
I was hoping something slong the lines of;
SELECT * FROM [file] WHERE CStr(ID) = 34897562FD
but this doesn't seem to work.
Any help is greatly appreciated!!
One could take Killuminati's solution and add also the field names as needed. The parameter
MaxScanRows
just makes sure that the OLEDB driver will scan the whole file as stated also in the documentationIn order to declare the data type for each field one could use a schema.ini like that
The entry after the equal sign is name of the field resp. column. If the file contains a header it can be the same but does not have to but it is required.
Code could look like that
Thanks to Storax's recommendation on using a schema file I have got this working.
I created a schema file in the same location as the file. The file looked like;
I can now return the IDs I am looking for! Many thanks to Storax for this solution!