Every now and then I come across a problem with an old system one of my colleagues has developed. They tend to have thousands of lines of code to do a simple thing like importing a csv file.
Currently the vba process is:
- open excel application
- create new worksheet
- populate the csv file
- into excel add the header names to the file
- save the worksheet as a new excel file
- imports the file into the access data project sql table.
- Process the data
What I want to do with it is:
- import the csv to the table (Like the get external data function)
- process the data
I have had a quick search and cannot see any easy methods of just sucking the file into the table.
Any help would be appreciated.
Thanks
Paul
There is an easier way to import a CSV! You can use the Microsoft Text Odbc Driver.
Sub Import()
Dim conn as new ADODB.Connection
Dim rs as new ADODB.Recordset
Dim f as ADODB.field
conn.Open "DRIVER={Microsoft Text Driver (*.txt; *.csv)};DBQ=c:\temp;"
rs.Open "SELECT * FROM [test.txt]", conn, adOpenStatic, adLockReadOnly, adCmdText
While Not rs.EOF
For Each f In rs.Fields
Debug.Print f.name & "=" & f.Value
Next
Wend
End Sub
You change from a Select to an INSERT INTO combined with a SELECT and there you are.
There are some settings you can do in the registry, in the key \\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text
:
Format: TabDelimited, CSVDelimited, Delimited(X) where X=some char
FirstRowHasNames: 0,1
CharacterSet: OEM, ANSI
In many cases, the easiest way to import CSV is with the TransferText method.
Refer to this MSDN link for details: TransferText Method [Access 2003 VBA Language Reference]
Here is an example TransferText command to import C:\SomeFolder\DataFile.csv into a table named tblImport. The last parameter, HasFieldNames, is False to indicate the CSV file doesn't include field names.
DoCmd.TransferText acImportDelim, "YourCustomSpecificationName", _
"tblImport", "C:\SomeFolder\DataFile.csv", False
The SpecificationName parameter is optional. However, you can often get better results by creating your own import specification and including its name in the TransferText command. The specification allows you to identify which table fields to load the data into, adjust data types, and a whole host of other options. You can create your own import specification when you manually import your data file ... select your custom import options, and save those choices as a named specification. (Look for an "Advanced" button on the Import Wizard dialog.)
I have found a nifty way to import entire CSVs into access.
I was tasked with importing three CSVs into three tables for a single database. This had to be done about 100 times, and each CSV would range from 200MB to 500MB. Since three table schemas were the same for each database I spent some time trying to find the best way to create a script to import all of these for me.
I first used
DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel9, p1, _
Application.CurrentProject.Path & "\Page1\_8_lift_base_" & dbName & ".csv",_
True, sh.Name & "!"
This worked for most cases, except occasionally upon opening the CSV a "read only" prompt would appear, and halt the imports until it was closed. Also, a 300MB CSV would take somewhere around 8 to 10 minutes. For 100 DBs, this is not acceptable.
What I ended up doing was to create my own XML import export specification.
Sub make_import_spec(filePath As String, tableName As String, pageNum As Long)
'By Ryan Griffin
Dim name_of_spec As String
name_of_spec = "imspec" & tableName
Dim xml As String
'This xml string contains the specifications the use for that table
xml = ""
xml = xml & "<?xml version=""1.0"" encoding=""utf-8"" ?>" & vbCrLf
xml = xml & "<ImportExportSpecification Path=" & Chr(34) & filePath & Chr(34) & " xmlns=""urn:www.microsoft.com/office/access/imexspec"">" & vbCrLf
xml = xml & " <ImportText TextFormat=""Delimited"" FirstRowHasNames=""true"" FieldDelimiter="","" CodePage=""437"" Destination=" & Chr(34) & tableName & Chr(34) & " >" & vbCrLf
xml = xml & " <DateFormat DateOrder=""MDY"" DateDelimiter=""/"" TimeDelimiter="":"" FourYearDates=""true"" DatesLeadingZeros=""false"" />" & vbCrLf
xml = xml & " <NumberFormat DecimalSymbol=""."" />" & vbCrLf
xml = xml & " <Columns PrimaryKey=""{none}"">" & vbCrLf
xml = xml & " <Column Name=""Col1"" FieldName=""field1"" Indexed=""YESDUPLICATES"" SkipColumn=""false"" DataType=""Text"" Width=""12"" />" & vbCrLf
xml = xml & " <Column Name=""Col2"" FieldName=""field2"" Indexed=""NO"" SkipColumn=""false"" DataType=""Text"" Width=""16"" />" & vbCrLf
xml = xml & " <Column Name=""Col3"" FieldName=""field3"" Indexed=""YESDUPLICATES"" SkipColumn=""false"" DataType=""Text"" Width=""24"" />" & vbCrLf
xml = xml & " <Column Name=""Col4"" FieldName=""field4"" Indexed=""NO"" SkipColumn=""false"" DataType=""Text"" Width=""16"" />" & vbCrLf
xml = xml & " </Columns>" & vbCrLf
xml = xml & " </ImportText>" & vbCrLf
xml = xml & "</ImportExportSpecification>"
'By Ryan Griffin
'Now you can add the specification to the project
CurrentProject.ImportExportSpecifications.Add name_of_spec, xml
' This will run your specification and import you csv file
DoCmd.RunSavedImportExport name_of_spec
End Sub
After running the code with this setup I was able to import a 300MB file in just over a minute (~62seconds), and was able to make sure every column had the appropriate dataType and correct indexing (without an extra step). So with this method I was able to achieve some where between a 7 to 9 times speed increase, with the ease of knowing that the data will be correct.
*Note: for this function I am providing the CSV file path (which includes the name.csv), the desired tablename, and pagenum, which is the table reference. (I used this to distinguish between the tables. In the xml string, I had an if statement based on that pageNum, where if pageNum was 1; add these columns to the string).
This will work beautifully for all your CSV importing desires, so long as the csv files do not include a "." (period) in the name [besides the extension]. For this you will need to use a Scripting FileSystemObject to get the file, and change its' name to use something like an underscore rather than a period, before importing.
I know it may be a little long winded, but there are very few resources out there that are reliable and useful in this area. Took me a almost a day to whittle down the options and sort out the VBA mess. I hope this can help anybody out there who is having the same trouble that I was.
BULK INSERT is generally faster, and it will work on an X64 machine. The text driver mentioned above will probably not work in some X64 environments.
I would reccomend NOT using the format file, it's a lot simpler without it.
http://msdn.microsoft.com/en-us/library/ms188365.aspx
BULK INSERT AdventureWorks2008R2.Sales.SalesOrderDetail
FROM 'f:\orders\lineitem.tbl'
WITH
(
FIELDTERMINATOR =' |',
ROWTERMINATOR =' |\n'
)