I have a set of data inside a CSV file, as below:
Given Data:
(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye),
(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)
I want to print the given data as 2 rows, each running from ( to ), while ignoring the , delimiter and any () inside the ' ' fields.
How can I do this using awk or sed on Linux?
Expected result as below:
Expected Result:
row 1 = 12,'hello','this girl,is lovely(adorable actually)',goodbye
row 2 = 13,'hello','this fruit,is super tasty (sweet actually)',goodbye
UPDATE:
I just noticed that there is a comma between the 2 rows. So how can I separate the data into 2 rows using the , that comes after ) and before (?
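You can use the following awk command to achieve your goal. Note that awk has no sed-style -i.bak flag; with GNU awk (gawk 4.1 or later), in-place editing with a backup is requested through -i inplace and the INPLACE_SUFFIX variable:
gawk -i inplace -v INPLACE_SUFFIX=.bak '{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in
Tested on your input.
Explanations:
- -i inplace -v INPLACE_SUFFIX=.bak rewrites file.in in place and keeps a backup of the original as file.in.bak. Drop these options if you only want the result printed to stdout with the file left untouched.
- {str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;} first strips the first and last character of the record (the enclosing parentheses), then removes the literal \r and \n sequences together with one following space, and finally prints the record in the format you want.
- If your file has a header line, add the condition NR>1 before the {...} block (use NR-1 in the print if the row numbers should still start at 1):
'NR>1{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}'
Following the change in your requirements, I have adapted the awk command so that the , at the end of a row is treated as part of the record separator (row separator):
gawk -i inplace -v INPLACE_SUFFIX=.bak 'BEGIN{RS=",\n|\n"}{str=substr($0,2,length($0)-2); gsub("\\\\r ?|\\\\n ?","",str); print "row "NR" = "str;}' file.in
where BEGIN{RS=",\n|\n"} defines the record separator: either a comma followed by a newline, or a newline on its own. A regular expression as RS is an extension supported by GNU awk and some other implementations; POSIX awk only guarantees a single-character RS.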
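To check the record-separator version without editing any file in place, a minimal sketch along these lines can be run in a shell. It assumes an awk that accepts a regular expression as RS (gawk and mawk do), recreates the two sample rows from the question in a temporary file, and writes the pattern as a /.../ regex literal, which should match the same literal \r and \n text as the quoted string form. The variable names input and result are only illustrative.

```shell
# Recreate the sample input from the question in a temporary file.
# The \r and \n below are literal backslash sequences, as in the question.
input=$(mktemp)
printf '%s\n' \
  "(12,'hello','this girl,is lovely(adorable \r\n actually)',goodbye)," \
  "(13,'hello','this fruit,is super tasty (sweet actually)',goodbye)" > "$input"

# Split records on "comma + newline" or a bare newline (regex RS: assumed
# to be supported by the installed awk), then clean up each record.
result=$(awk 'BEGIN { RS = ",\n|\n" }
{
    str = substr($0, 2, length($0) - 2)   # drop the enclosing ( and )
    gsub(/\\r ?|\\n ?/, "", str)          # remove the literal \r and \n text
    print "row " NR " = " str
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because nothing is written back to the input, this is a safe way to inspect the output before switching to in-place editing.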