I currently have a variable name called InvoiceFileName that is creating .csv files through a foreach loop. A list of .csv is then outputted to a folder.
I will then need to query off of each .csv file to select the header and the first row of data for each .csv.
I believe I need to use the OPENROWSET
to query off of the .csv. I have 2 questions.
- What is the syntax to query off of the variable name InvoiceFileName.
- Is it possible to select the header field and first row of data
OPENROWSET
without inserting into a table.
Below is a simple OPENROWSET
that only provides the header of the file.
SELECT
top 1 *
FROM OPENROWSET(BULK N'\\myservername\f$\reports\Invoices\CokeFiles\54ASBSd.csv', SINGLE_CLOB) AS Report
Use a Foreach Loop container to query all files in a folder. You can use wildcards for the file name, or user the variables in your DTS to set the properties of the components.
Inside the loop container you place a Data Flow Task with your source file connection, your transformations, and your destination.
You can modify the file names and paths of all these objects by setting their properties to variables in your DTS.
With an Expresion Task inside the loop, you can change the path of the CSV file connection.
What kind of privs do you have on the database? If you have or can get slightly elevated privs, you can use
BULK INSERT
andxp_cmdShell
to accomplish this, but like @scsimon said, you will have to use dynamic sql. Here's a quick example:A few disclaimers:
BULK INSERT
andxp_cmdShell
.xp_cmdShell
(and for good reason) but this is a quick and dirty solution making a lot of assumptions about what your environment is like.For doing this through SSIS, ideally you'd probably need to use a format file for the bulk operation, but you'd have to have consistently formatted files and remove the SINGLE_CLOB option as well. A really hacky and non-ideal way to do this would be to do something like this:
Let's say your file contains this data:
Then you could basically just parse the data doing something like this:
And your result would be this:
This is obviously dependent on your line endings being consistent, but if you want the results in a single column and single row (as is the behavior of the bulk operation with the SINGLE_CLOB option), that should get you what you need.
You can take a look at the solution on this SO post for info on how to pass the SSIS variable value as a parameter to your query.