Is there a way to escape a double quote within a t

Posted 2019-02-16 17:05

Question:

I have a CSV I'm trying to import into SQL using SSIS packages through code. A line might look something like this

321,1234,"SOME MACHINE, MACHINE ACCESSORIES 1 1/2"" - 4"""

In this example they're using a double quote to symbolize inches, and they are trying to escape each inches double quote with a second double quote. SSIS, however, does not honour this escaping and fails.
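To make the expected behaviour concrete, here is a minimal sketch of how a reader that honours the doubled-quote convention would split the sample line. The class and method names (CsvLineSketch, SplitLine) are hypothetical and not part of SSIS; the sketch assumes a comma delimiter, a double quote as text qualifier, and no line breaks inside fields.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of SSIS: splits one CSV line using the
// doubled-quote convention ("" inside a quoted field means a literal ").
public static class CsvLineSketch
{
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c); // ordinary character, commas included
                }
                else if (i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote: one literal quote
                    i++;                 // skip the second quote of the pair
                }
                else
                {
                    inQuotes = false; // closing text qualifier
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening text qualifier
            }
            else if (c == ',')
            {
                fields.Add(current.ToString()); // delimiter ends the field
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // last field has no trailing comma
        return fields;
    }
}
```

Calling CsvLineSketch.SplitLine on the line above should yield three fields, with the third keeping both its comma and its inch marks. That is the result the file is asking for and the one the Flat File Source fails to produce.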

Is there any way I can still use the double quote symbol for inches and escape it within the quoted text?

Many suggestions are to replace the double quote with two single quotes. Is this the only workaround, or can I use some other escape technique?

I've seen people talk about using the Derived Column transformation, but in my case SSIS fails at the Flat File Source step, so I cannot get to a Derived Column transformation step.

I'm currently running a Script Task in the control flow, just before the data flow, to manipulate the CSV with some regexes to clean up the data.

I need the string to be text qualified with the two outer double quotes because of potential commas in the description column.

What can I do about the double quotes within the text qualified string?

Answer 1:

Wow, I expected to be able to answer with "Just set the text qualifier", but I figured you would have already tried that, so I tried it myself before answering. Surprise: SSIS doesn't support standard CSV files!

Looks like this is a common complaint. There is one comment in there from Microsoft about some samples that may help. Here is the CodePlex project; they mentioned that the Regular Expression Flat File Source sample and the Delimited File Reader Source sample in particular may help. I'm guessing the Delimited File Reader would be more worthwhile.



Answer 2:

I ran into a similar problem yesterday.

We got a CSV file that uses a comma (,) as the delimiter and a double quote (") as the text qualifier, but one field contains a double quote inside the qualifying double quotes (a non-escaped double quote within a string).

After spending half a day searching, I came up with the solution below:

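// Requires: using System.IO; using System.Linq; using System.Text.RegularExpressions;

// Load the file into a one-dimensional string array.
// fullFilePath is the full path + file name.
var fileContent = File.ReadAllLines(fullFilePath);

// Match a double quote that is not at the start or end of the line and is
// neither preceded nor followed by a comma, i.e. a quote inside a field.
var innerQuote = new Regex(@"(?<!^)(?<!\,)""(?!\,)(?!$)");

// Replace each such double quote with a single quote.
var fileContentUpdated = fileContent
    .Select(x => innerQuote.Replace(x, "'"))
    .ToArray();

// Write the string array back to the CSV file.
File.WriteAllLines(fullFilePath, fileContentUpdated);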

I don't see any other way than replacing the double quote with something else to avoid the issue.
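One caveat with a lookaround pattern like the one above: it never touches a double quote that is immediately followed by a comma, so an inch mark directly before a comma inside a field is left in place. If the inner quotes are already doubled, as in the original question, a small state machine can rewrite them without guessing. The following is only a sketch under that assumption; InnerQuoteRewriter and Rewrite are hypothetical names, and it assumes a comma delimiter, a double quote as text qualifier, and no line breaks inside fields.

```csharp
using System.Text;

// Hypothetical pre-processing helper for a Script Task: replaces each doubled
// quote ("") inside a quoted field with a substitute string, leaving the outer
// text qualifiers and the delimiters untouched.
public static class InnerQuoteRewriter
{
    public static string Rewrite(string line, string replacement)
    {
        var result = new StringBuilder(line.Length);
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c != '"')
            {
                result.Append(c); // anything but a quote is copied as-is
            }
            else if (!inQuotes)
            {
                inQuotes = true;  // opening text qualifier: keep it
                result.Append(c);
            }
            else if (i + 1 < line.Length && line[i + 1] == '"')
            {
                result.Append(replacement); // escaped quote: substitute it
                i++;                        // skip the second quote of the pair
            }
            else
            {
                inQuotes = false; // closing text qualifier: keep it
                result.Append(c);
            }
        }

        return result.ToString();
    }
}
```

It could be dropped into the same ReadAllLines / WriteAllLines loop in place of the regex, for example fileContent.Select(x => InnerQuoteRewriter.Rewrite(x, "'")). It will not help with files where the inner quotes are not doubled, since those are ambiguous without extra rules.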



Answer 3:

This answer is not applicable to 2005 as referenced here, but in case someone comes across this while searching and is using 2008, this other question appears to have a working answer: SSIS 2008 and Undouble



Answer 4:

There is a workaround: if you remove " as the text qualifier in the File connection, you can strip all the double quotes later with a Derived Column expression such as REPLACE(Item_Name,"\"",""). The downside is that you will need to do it for every field.



Answer 5:

Probably old news now, but this issue was fixed in SQL Server 2012. I was able to import the same file on a 2012 server that failed on my 2008 server.