Awk replace a column with its hash value

Reply #2 · 2020-02-11 04:15

I copied larsks's answer and added the close(tmp) line, to avoid the problem described in this post: gawk / awk: piping date to getline *sometimes* won't work

awk '{
    tmp = "echo " $2 " | openssl md5 | cut -f2 -d\" \""
    tmp | getline cksum
    close(tmp)   # close the pipe so the same command can be run again later
    $2 = cksum
    print
}' < sample
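To see what close changes: awk identifies an open pipe by its command string, so without close a later line with the same column value would read from a pipe that is already at end of input, getline would return 0, and cksum would keep whatever the previous line left in it. The following is only a minimal sketch of the closed variant on input with a repeated value. It assumes md5sum (GNU coreutils) in place of openssl md5, and hash_col2 is a hypothetical name.

```shell
# hash_col2: replace column 2 of each whitespace-separated line on stdin
# with the md5 of that field (plus the newline that echo appends).
# Assumptions: md5sum is installed, and fields are plain words, because
# the field is pasted into the shell command unquoted.
hash_col2() {
    awk '{
        cmd = "echo " $2 " | md5sum"
        # getline returns 1 on success; leave the field alone otherwise
        if ((cmd | getline line) > 0) {
            split(line, parts, " ")   # md5sum prints: HASH  -
            $2 = parts[1]
        }
        close(cmd)   # without this, a repeated $2 would reuse the exhausted pipe
        print
    }'
}

printf '%s\n' 'a x 1' 'b y 2' 'c x 3' | hash_col2
```

The first and third lines should come out with the same hash in column 2.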


干净又极端

Reply #3 · 2020-02-11 04:19

So, you don't really want to be doing this with awk. Any of the popular high-level scripting languages -- Perl, Python, Ruby, etc. -- would do this in a way that was simpler and more robust. Having said that, something like this will work.

Given input like this:

this is a test

(i.e., a row with four columns), we can replace a given column with its md5 checksum like this:

awk '{
    tmp = "echo " $2 " | openssl md5 | cut -f2 -d\" \""
    tmp | getline cksum
    $2 = cksum
    print
}' < sample

This was written for GNU awk (which you'll probably have by default on a Linux system), although the cmd | getline var form is also part of POSIX awk, and it uses openssl to generate the md5 checksum. We first build a shell command line in tmp that passes the selected column to openssl md5. Then we read the command's output into the cksum variable with getline and replace column 2 with the checksum. Note that echo appends a newline, so the hash is that of the column text plus a newline. Given the sample input above, the output of this awk script would be:

this 7e1b6dbfa824d5d114e96981cededd00 a test
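Because the command is built by string concatenation, a field containing shell metacharacters (quotes, $, ;, and so on) would be interpreted by the shell. One possible way to guard against that is to wrap the field in single quotes first. This is only a sketch under assumptions: shquote and hash_col2_quoted are hypothetical names, md5sum (GNU coreutils) stands in for openssl md5, and printf with a trailing newline is used so the digest matches what echo would produce.

```shell
# hash_col2_quoted: replace column 2 with its md5, quoting the field so the
# shell treats it as literal text. Assumes md5sum is installed.
hash_col2_quoted() {
    awk '
    # Wrap s in single quotes for sh. \047 is the octal escape for a single
    # quote; each embedded quote becomes: quote, backslash, quote, quote.
    function shquote(s,    n, parts, i, out) {
        n = split(s, parts, "\047")
        out = "\047" parts[1]
        for (i = 2; i <= n; i++)
            out = out "\047\\\047\047" parts[i]
        return out "\047"
    }
    {
        cmd = "printf \"%s\\n\" " shquote($2) " | md5sum"
        if ((cmd | getline line) > 0) {
            split(line, hash, " ")   # md5sum prints: HASH  -
            $2 = hash[1]
        }
        close(cmd)
        print
    }'
}

echo "this is a test" | hash_col2_quoted
```

For the sample line this should give the same digest as the echo-based script, since both hash the field followed by a newline.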


疯言疯语

Reply #4 · 2020-02-11 04:26

This might work using Bash/GNU sed:

<<<"this is a test" sed -r 's/(\S+\s)(\S+)(.*)/echo "\1 $(md5sum <<<"\2") \3"/e;s/ - //'
this  7e1b6dbfa824d5d114e96981cededd00  a test

or a mostly sed solution:

<<<"this is a test" sed -r 'h;s/^\S+\s(\S+).*/md5sum <<<"\1"/e;G;s/^(\S+).*\n(\S+)\s\S+\s(.*)/\2 \1 \3/'
this 7e1b6dbfa824d5d114e96981cededd00 a test

Both replace "is" in "this is a test" with its md5sum.

Explanation:

In the first: identify the columns and use the back references as parameters in a Bash command, which is substituted into the pattern space and evaluated by the e flag. A second substitution then makes a cosmetic change to drop the file name (here - for standard input) printed by md5sum.

In the second: similar to the first, but stash the input line in the hold space with h. After evaluating the md5sum command, G appends the held line to the pattern space (which now contains the md5sum result), and a final substitution rearranges the pieces.


不美不萌又怎样

Reply #5 · 2020-02-11 04:28

You can also do that with Perl:

echo "aze qsd wxc" | perl -MDigest::MD5 -ne 'print "$1 ".Digest::MD5::md5_hex($2)." $3" if /([^ ]+) ([^ ]+) ([^ ]+)/' 
aze 511e33b4b0fe4bf75aa3bbac63311e5a wxc

If you want to obfuscate a large amount of data, this might be faster than sed and awk, which need to fork an md5sum process for each line.
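If the awk route is kept anyway, the fork cost can be reduced when the same value shows up on many lines by remembering hashes already computed. This is a rough sketch, not a benchmarked solution: hash_col2_cached is a hypothetical name, md5sum (GNU coreutils) is assumed, and it hashes the field plus a newline (as echo does), unlike the Perl one-liner above.

```shell
# hash_col2_cached: replace column 2 with its md5, forking md5sum only once
# per distinct value. Assumptions: md5sum is installed, and fields are plain
# words, because the field is pasted into the shell command unquoted.
hash_col2_cached() {
    awk '{
        if (!($2 in cache)) {
            cmd = "echo " $2 " | md5sum"
            if ((cmd | getline line) > 0) {
                split(line, parts, " ")   # md5sum prints: HASH  -
                cache[$2] = parts[1]
            } else {
                cache[$2] = $2            # command failed: keep the field as is
            }
            close(cmd)
        }
        $2 = cache[$2]
        print
    }'
}

printf '%s\n' 'a x 1' 'b x 2' 'c y 3' | hash_col2_cached
```

With this input md5sum should be started twice rather than three times. The cache grows with the number of distinct values, so it helps most when that number is small.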


淡お忘

Reply #6 · 2020-02-11 04:29

You might have a better time with read than awk, though I haven't done any benchmarking.

The input (scratch001.txt):

foo|bar|foobar|baz|bang|bazbang
baz|bang|bazbang|foo|bar|foobar

Transformed using read:

while IFS="|" read -r one fish twofish red fishy bluefishy; do
  twofish=$(printf '%s' "$twofish" | md5sum | tr -d ' -')
  echo "$one|$fish|$twofish|$red|$fishy|$bluefishy"
done < scratch001.txt

Produces the output:

foo|bar|3858f62230ac3c915f300c664312c63f|baz|bang|bazbang
baz|bang|19e737ea1f14d36fc0a85fbe0c3e76f9|foo|bar|foobar
