I have a SSIS package importing data from a .csv file. This file has doulbe quotes ("
) qualifiers for each entry in it but also in between. I also added commas (,
) as a column delimiter. I can't give you the original data I'm working with but here is an example how my data is passed in Flat File Source:
"ID-1","A "B"", C, D, E","Today"
"ID-2","A, B, C, D, E,F","Yesterday"
"ID-3","A and nothing else","Today"
As you can see the second column can contain quotes (and commas) which smashes my SSIS import with an error pointing at this line. I'm not really familiar with regular expressions, but I've heard that this might help in this case.
In my eyes I need to replace all the double quotes ("
) by single quotes ('
) except...
- ...all quotes at the beginning of one line
- ...all quotes at the end of one line
- ...quotes which are part of
","
Can anyone of you help me out in this thing? Would be great!
Thanks in advance!
To replace double quotes with single quotes according to your specifications, use this simple regex. This regex will allow whitespace at the beginning and/or end of lines.
This is the explanation of the pattern:
Use text qualifier
"
for CSV destination before inserting values to CSV destination, add a derived column expressionThis will retain
"
in your text fieldwhile loading CSV with double quotes and comma there is one limitation that extra double quotes has been added and the data also enclosed with the double quotes you can check in the preview of source file. So, add the derived column task and give the below expression:-
(REPLACE(REPLACE(RIGHT(SUBSTRING(TRIM(COL2),1,LEN(COL2) - 1),LEN(COL2) - 2)," ","@"),"\"\"","\""),"@"," ")
the bold part removes the data enclosed with double quotes.
Try this and do let me know if this is helpful
You can split columns with regex match pattern
See this demo.