I currently store my CSV-formatted files on disk and then query them like this:
SELECT *
FROM OPENROWSET(BULK 'C:\myfile.csv',
FORMATFILE = 'C:\format.fmt',
FIRSTROW = 2) AS rs
where format.fmt defines the format of the columns in the CSV file. This works very well, but I'm interested in storing the files in a SQL Server table instead of on disk. So, given a VARBINARY(MAX) column, how do I query them?
If I have a table like:
CREATE TABLE FileTable
(
[FileName] NVARCHAR(256)
,[File] VARBINARY(MAX)
)
With one row 'myfile.csv', '0x427574696B3B44616....'
How can I read that file's content into a temporary table, for example?
If you really need to work with varbinary data, you can just cast it back to nvarchar (or to varchar, if the file was stored as single-byte text):
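A minimal sketch of that cast, assuming the FileTable and the 'myfile.csv' row from the question. Which target type gives readable text depends on how the file was encoded when it was stored: CAST to NVARCHAR(MAX) treats the bytes as UTF-16, while CAST to VARCHAR(MAX) treats them as single-byte text in the database's code page. The sample bytes in the question (0x42, 0x75, 0x74, ...) look like single-byte text, so both casts are shown side by side; only one of the two columns will be readable.

```sql
-- Assumes the FileTable from the question; the column aliases are illustrative.
SELECT [FileName],
       CAST([File] AS NVARCHAR(MAX)) AS ContentIfUtf16,       -- bytes read as UTF-16
       CAST([File] AS VARCHAR(MAX))  AS ContentIfSingleByte   -- bytes read as single-byte text
FROM FileTable
WHERE [FileName] = N'myfile.csv';
```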
Once you've got it into that format, you can use a split function to turn it into a table. Older versions of SQL Server have no built-in split function (STRING_SPLIT only arrived in SQL Server 2016), which is a screamingly obvious oversight. If yours lacks one, create your own with the code below:
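The original function did not survive here, so the following is only a sketch of one way to write such a split function, not the answer's exact code. The name dbo.SplitString, its parameters, and the Position/Value column names are all hypothetical. It walks the string with CHARINDEX and returns one row per piece, numbered in order.

```sql
-- Hypothetical helper: name, parameters and column names are illustrative.
CREATE FUNCTION dbo.SplitString
(
    @Text      NVARCHAR(MAX),
    @Delimiter NVARCHAR(10)
)
RETURNS @Parts TABLE
(
    Position INT IDENTITY(1, 1) PRIMARY KEY,  -- 1-based order of the piece
    Value    NVARCHAR(MAX)
)
AS
BEGIN
    -- DATALENGTH / 2 counts characters without dropping trailing spaces (LEN would)
    DECLARE @DelimLen INT = ISNULL(DATALENGTH(@Delimiter), 0) / 2;
    DECLARE @Start INT = 1;
    DECLARE @Next INT;

    IF @Text IS NOT NULL AND @DelimLen > 0
    BEGIN
        SET @Next = CHARINDEX(@Delimiter, @Text, @Start);

        WHILE @Next > 0
        BEGIN
            -- Piece between the previous delimiter and this one
            INSERT INTO @Parts (Value)
            VALUES (SUBSTRING(@Text, @Start, @Next - @Start));

            SET @Start = @Next + @DelimLen;
            SET @Next = CHARINDEX(@Delimiter, @Text, @Start);
        END;

        -- Remainder after the last delimiter (the whole string if none was found)
        INSERT INTO @Parts (Value)
        VALUES (SUBSTRING(@Text, @Start, DATALENGTH(@Text) / 2 - @Start + 1));
    END;

    RETURN;
END;
```

For example, `SELECT Position, Value FROM dbo.SplitString(N'a;b;c', N';') ORDER BY Position;` returns three rows holding a, b and c. A loop-based function like this is easy to follow but can be slow on very large files.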
Put it all together:
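A sketch of the combined query, under stated assumptions: a table-valued split function already exists (here hypothetically named dbo.SplitString, taking the text and a delimiter and returning Position and Value columns), the file uses CRLF line endings, and the first line is a header.

```sql
-- Assumes FileTable from the question and a hypothetical dbo.SplitString(text, delimiter)
-- that returns (Position, Value). CRLF line endings are also an assumption.
SELECT f.[FileName],
       s.Position AS LineNumber,
       s.Value    AS LineText
FROM FileTable AS f
CROSS APPLY dbo.SplitString(
                CAST(f.[File] AS NVARCHAR(MAX)),   -- use VARCHAR(MAX) for single-byte text
                NCHAR(13) + NCHAR(10)              -- split on CRLF, one row per line
            ) AS s
WHERE f.[FileName] = N'myfile.csv'
  AND s.Position > 1                               -- skip the header, like FIRSTROW = 2
ORDER BY s.Position;
```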
gives this result:
And of course, you can get the result into a temp table to work with if you so wish:
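Adding an INTO clause is probably the simplest way to do that. This sketch again assumes a hypothetical dbo.SplitString(text, delimiter) function returning Position and Value columns; the temp table name #FileLines is made up for illustration.

```sql
-- SELECT ... INTO creates #FileLines on the fly from the query's columns.
SELECT s.Position AS LineNumber,
       s.Value    AS LineText
INTO #FileLines
FROM FileTable AS f
CROSS APPLY dbo.SplitString(CAST(f.[File] AS NVARCHAR(MAX)),
                            NCHAR(13) + NCHAR(10)) AS s
WHERE f.[FileName] = N'myfile.csv';

-- Work with the rows, then clean up.
SELECT LineNumber, LineText
FROM #FileLines
ORDER BY LineNumber;

DROP TABLE #FileLines;
```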
You can also use BULK INSERT to do this as in this question.
Assuming you've created a table with the correct format to import the data into (e.g. 'MyImportTable'), something like the following could be used:
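A sketch of what that could look like, reusing the file path and format file from the question (reading the format file path as 'C:\format.fmt'). It assumes MyImportTable already exists and that its columns match the format file.

```sql
-- Paths come from the question; MyImportTable must already exist.
BULK INSERT MyImportTable
FROM 'C:\myfile.csv'
WITH (
    FORMATFILE = 'C:\format.fmt',  -- same column definitions as the OPENROWSET query
    FIRSTROW   = 2                 -- skip the header row
);
```

Note that the path is resolved on the SQL Server machine, not on the client running the query.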
EDIT 1:
With the data imported into the database, you can now query the table directly and avoid needing the CSV altogether, like so:
Once nothing references the original CSV any more, you can delete or archive it.
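For instance, assuming the import table is the 'MyImportTable' mentioned above (its column names are not given, so the sketch selects everything):

```sql
-- Replaces the OPENROWSET(BULK ...) query from the question.
SELECT *
FROM MyImportTable;
```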
EDIT 2:
If you've enabled xp_cmdshell, and you have the appropriate permissions, you can delete the file from SQL with the following:
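A sketch of that call, using the file path from the question. The command runs on the server under the SQL Server service account (or the configured proxy account), so that account needs permission to delete the file.

```sql
-- Deletes the file on the server's disk; no_output suppresses the command's result rows.
EXEC xp_cmdshell 'del "C:\myfile.csv"', no_output;
```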
Lastly, if you want to enable xp_cmdshell, use the following:
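The usual way is through sp_configure, sketched below. It requires server-level permission to change configuration options, and because xp_cmdshell lets SQL run operating system commands, it is worth checking your security policy before switching it on.

```sql
-- xp_cmdshell is an advanced option, so those must be visible first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Turn xp_cmdshell on (set to 0 to turn it off again).
EXEC sp_configure 'xp_cmdshell', 1;
RECONFIGURE;
```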