I am trying to make a script for text editing. In this case I have a text file named text.csv, which reads:
first;48548a;48954a,48594B
second;58757a;5875b
third;58756a;58576b;5867d;56894d;45864a
I want to make text format to like this:
first;48548a
first;48954a
first;48594B
second;58757a
second;5875b
third;58756a
third;58576b
third;5867d
third;56894d
third;45864a
What is command should I use to make this happen?
I'd do this in awk.
Assuming your first line should have a ;
instead of a ,
:
$ awk -F\; '{for(n=2; n<=NF; n++) { printf("%s;%s\n",$1,$n); }}' input.txt
Untested.
Here is a pure bash solution that handles both ,
and ;
.
while IFS=';,' read -a data; do
id="${data[0]}"
data=("${data[@]:1}")
for item in "${data[@]}"; do
printf '%s;%s\n' "$id" "$item"
done
done < input.txt
UPDATED - alternate printing method based on chepner's suggestion:
while IFS=';,' read -a data; do
id="${data[0]}"
data=("${data[@]:1}")
printf "$id;%s\n" "${data[@]}"
done < input.txt
awk -v FS=';' -v OFS=';' '{for (i = 2; i <= NF; ++i) { print $1, $i }}'
Explanation: awk implicitly splits data into records(by default separeted by newline, i.e. line == record) which then are split into numbered fields by given field separator(FS
for input field separator and OFS
for output separator).
For each record this script prints first field(which is record name), along with i-th field, and that's exactly what you need.
while IFS=';,' read -a data; do
id="${data[0]}"
data=("${data[@]:1}")
printf "$id;%s\n" "${data[@]}"
done < input.txt
or
awk -v FS=';' -v OFS=';' '{for (i = 2; i <= NF; ++i) { print $1, $i }}'
And
$ awk -F\; '{for(n=2; n<=NF; n++) { printf("%s;%s\n",$1,$n); }}' input.txt
thanks all for your suggestions, :d. It's really give me a new knowledge..