GNU awk: accessing captured groups in replacement

Posted 2019-01-23 11:18

This seems like it should be dirt simple, but the awk gensub/gsub/sub behavior has always been unclear to me, and now I just can't get it to do what the documentation says it should do (and what experience with a zillion other similar tools suggests should work). Specifically, I want to access "captured groups" from a regex in the replacement string. Here's what I think the awk syntax should be:

awk '{ gsub(/a(b*)c/, "Here are bees: \1"); print; }'

That should turn "abbbc" into "Here are bees: bbb". It does not, at least not for me on Ubuntu 9.04. Instead, the "\1" is rendered as a ^A; that is, the character with code 1. Not what I want, of course. How do I do this?

Thanks.

Tags: gawk

2 answers

倾城　Initia

#2 · 2019-01-23 11:59

Per the gawk manual:

gensub provides an additional feature that is not available in sub or gsub: the ability to specify components of a regexp in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying ‘\N’ in the replacement text, where N is a digit from 1 to 9.

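You must use gensub, you must pass a third argument ("g" to replace every match, or a number to replace only that occurrence), and you must grab the result of gensub, since it does not modify its target in place. Note also the doubled backslash in "\\1": inside an awk string literal, "\1" on its own is an octal escape for the character with code 1, which is the ^A you are seeing.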

awk '{ r = gensub(/a(b*)c/, "Here are bees: \\1", "g"); print r; }'


混吃等死

#3 · 2019-01-23 12:10

echo abbc | awk '{ print gensub(/a(b*)c/, "Here are bees: \\1", "g", $1);}'

See the gawk manual for the difference between gsub and gensub.

