可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

I have two CSV files and I want to compare them using AWK and generate a new file.

file1.csv:

"no","loc" 
"abc121","C:/pro/in" 
"abc122","C:/pro/abc"
"abc123","C:/pro/xyz"
"abc124","C:/pro/in"

file2.csv:

"no","loc" 
"abc121","C:/pro/in"
"abc122","C:/pro/abc"
"abc125","C:/pro/xyz"
"abc126","C:/pro/in"

output.csv:

"file1","file2","Diff" 
"abc121","abc121","Match" 
"abc122","abc122","Match" 
"abc123","","Unmatch" 
"abc124","","Unmatch" 
"","abc125","Unmatch" 
"","abc126","Unmatch"

回答1:

One way with awk:

script.awk:

BEGIN {
    FS = ","
}

NR>1 && NR==FNR {
    a[$1] = $2
    next
}

FNR>1 { 
    print ($1 in a) ? $1 FS $1 FS "Match" : "\"\"" FS $1 FS "Unmatch"
    delete a[$1] 
}

END {
    for (x in a) {
        print x FS "\"\"" FS "Unmatch"
    }
}

Output:

$ awk -f script.awk file1.csv file2.csv
"abc121","abc121",Match
"abc122","abc122",Match
"","abc125",Unmatch
"","abc126",Unmatch
"abc124","",Unmatch
"abc123","",Unmatch

回答2:

I didn't use awk alone, but if I understood the gist of what you're asking correctly, I think this long one-liner should do it...

join -t, -a 1 -a 2 -o 1.1 2.1 1.2 2.2 file1.csv file2.csv | awk -F, '{ if ( $3 == $4 ) var = "\"Match\""; else var = "\"Unmatch\"" ; print $1","$2","var }' | sed -e '1d' -e 's/^,/"",/' -e 's/,$/,"" /' -e 's/,,/,"",/g'

Description:

The join portion takes the two CSV files, joins them on the first column (default behavior of join) and outputs all four fields (-o 1.1 2.1 1.2 2.2), making sure to include rows that are unmatched for both files (-a 1 -a 2).
The awk portion takes that output and replaces combination of the 3rd and 4th columns to either "Match" or "Unmatch" based on if they do in fact match or not. I had to make an assumption on this behavior based on your example.
The sed portion deletes the "no","loc" header from the output (-e '1d') and replaces empty fields with open-close quote marks (-e 's/^,/"",/' -e 's/,$/,""/' -e 's/,,/,"",/g'). This last part might not be necessary for you.

EDIT: As tripleee points out, the above fails if the two initial files are unsorted. Here's an updated command to fix that. It punts the header line and sorts each file before passing them to join...

join -t, -a 1 -a 2 -o 1.1 2.1 1.2 2.2 <( sed 1d file1.csv | sort ) <( sed 1d file2.csv | sort ) | awk -F, '{ if ( $3 == $4 ) var = "\"Match\""; else var = "\"Unmatch\"" ; print $1","$2","var }' | sed -e 's/^,/"",/' -e 's/,$/,""/' -e 's/,,/,"",/g'