Removing spaces for all the columns of a CSV file

2019-08-01 16:44发布

问题:

I have a CSV file in which every column contains unnecessary extra spaces added to it before the actual value. I want to create a new CSV file by removing all the spaces.

For example

One line in input CSV file

 123, ste hen, 456, out put

Expected output CSV file

123,ste hen,456,out put

I tried using awk to trim each column but it didn't work.

回答1:

echo " 123, ste hen, 456, out put" | awk '{sub(/^ +/,""); gsub(/, /,",")}1'
123,ste hen,456,out put


回答2:

This sed should work:

sed -i.bak -E 's/(^|,)[[:blank:]]+/\1/g; s/[[:blank:]]+(,|$)/\1/g' file.csv

This will remove leading spaes, trailing spaces and spaces around comma.

Update: Here is an awk command to do the same:

awk -F '[[:blank:]]*,[[:blank:]]*' -v OFS=, '{
  gsub(/^[[:blank:]]+|[[:blank:]]+$/, ""); $1=$1} 1' file


回答3:

awk is your friend.

Input

$ cat 38609590.txt
Ted Winter, Evelyn Salt, Peabody
  Ulrich, Ethan Hunt, Wallace
James Bond, Q,  M
(blank line)

Script

$ awk '/^$/{next}{sub(/^[[:blank:]]*/,"");gsub(/[[:blank:]]*,[[:blank:]]*/,",")}1' 38609590.txt

Output

Ted Winter,Evelyn Salt,Peabody
Ulrich,Ethan Hunt,Wallace
James Bond,Q,M

Note

  • This one removes the blank lines too - /^$/{next}.
  • See the [ awk ] manual for more information.


回答4:

To remove leading blank chars with sed:

$ sed -E 's/(^|,) +/\1/g' file
123,ste hen,456,out put

With GNU awk:

$ awk '{$0=gensub(/(^|,) +/,"\\1","g")}1' file
123,ste hen,456,out put

With other awks:

$ awk '{sub(/^ +/,""); gsub(/, +/,",")}1' file
123,ste hen,456,out put

To remove blank chars before and after the values with sed:

$ sed -E 's/ *(^|,|$) */\1/g' file
123,ste hen,456,out put

With GNU awk:

$ awk '{$0=gensub(/ *(^|,|$) */,"\\1","g")}1' file
123,ste hen,456,out put

With other awks:

$ awk '{gsub(/^ +| +$/,""); gsub(/ *, */,",")}1' file
123,ste hen,456,out put

Change (a single blank char) to [[:blank:]] if you can have tabs as well as blank chars.



回答5:

Another way to do with awk to remove multiple leading white-spaces is as below:-

$ awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "123, ste hen, 456, out put"
123,ste hen,456,out put
  • FS=OFS="," sets the input and output field separator to ,
  • s = ""; for (i = 1; i <= NF; i++) loops across each column entry up to the end (i.e. from $1,$2...NF) and the gsub(/^[ \t]+/,"",$i) trims only the leading white-space and not anywhere else (one ore more white-space, note the +) from each column.

If you are want to do this action for an entire file, suggest using a simple script like below

#!/bin/bash
# Output written to the file 'output.csv' in the same path    

while IFS= read -r line || [[ -n "$line" ]]; do   # Not setting IFS here, all done in 'awk', || condition for handling empty lines
   awk 'BEGIN{FS=OFS=","} {s = ""; for (i = 1; i <= NF; i++) gsub(/^[ \t]+/,"",$i);} 1' <<< "$line" >> output.csv
done <input.csv


回答6:

$ cat > test.in
 123, ste hen, 456, out put
$ awk -F',' -v OFS=',' '{for (i=1;i<=NF;i++) gsub(/^ +| +$/,"",$i); print $0}' test.in
123,ste hen,456,out put

or written out loud:

BEGIN {
  FS=","                  # set the input field separator
  OFS=","                 # and the output field separator
}
{
  for (i=1;i<=NF;i++)     # loop thru every field on record
    gsub(/^ +| +$/,"",$i) # remove leading and trailing spaces
  print $0                # print out the trimmed record
}

Run with:

$ awk -f test.awk test.in


回答7:

You could try :

  • your file : ~/path/file.csv

cat ~/path/file.csv | tr -d "\ " sed "s/, /,/g" ~/path/file.csv