Non greedy (reluctant) regex matching in sed?

Posted 2018-12-31 05:02

I'm trying to use sed to clean up lines of URLs to extract just the domain.

So from:

http://www.suepearson.co.uk/product/174/71/3816/

I want:

http://www.suepearson.co.uk/

(either with or without the trailing slash, it doesn't matter)

I have tried:

 sed 's|\(http:\/\/.*?\/\).*|\1|'

and (escaping the non-greedy quantifier)

sed 's|\(http:\/\/.*\?\/\).*|\1|'

but I cannot seem to get the non-greedy quantifier to work, so it always ends up matching the whole string.

20 answers
牵手、夕阳
#2 · 2018-12-31 05:44

This can be done using cut:

echo "http://www.suepearson.co.uk/product/174/71/3816/" | cut -d'/' -f1-3
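
This works because cut splits the line on every `/`: field 1 is `http:`, field 2 is the empty string between the two slashes, and field 3 is the host. A minimal sketch of the same idea (the variable names `url` and `domain` are only for illustration):

```shell
url="http://www.suepearson.co.uk/product/174/71/3816/"

# Fields split on '/': 1="http:", 2="" (between the two slashes), 3=host.
# cut rejoins the selected fields with the same delimiter.
domain=$(printf '%s\n' "$url" | cut -d'/' -f1-3)

printf '%s\n' "$domain"
```

Note that the result has no trailing slash, which the question says is acceptable.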
像晚风撩人
#3 · 2018-12-31 05:44

sed -E interprets regular expressions as extended (modern) regular expressions.

Update: use -E on macOS (BSD sed) and -r in GNU sed; newer GNU sed releases also accept -E.
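
As far as I know, extended syntax on its own still does not give sed a lazy quantifier like `.*?`, so the expression from the question would stay greedy. A common workaround, sketched here as an assumption about what the asker wants rather than as the only approach, is a negated character class that cannot run past the next slash (the variable names are illustrative):

```shell
url="http://www.suepearson.co.uk/product/174/71/3816/"

# [^/]* matches any run of non-slash characters, so the group ends at the
# first '/' after the host instead of the last one on the line.
domain=$(printf '%s\n' "$url" | sed -E 's|(http://[^/]*/).*|\1|')

printf '%s\n' "$domain"
```

With GNU sed that only understands -r, the same expression should work by swapping the flag.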
