SED replacing with 'possible' newline

Posted 2019-08-16 16:55

I have a sed command that works fine, except when the text it should match contains a newline. Here is my command:

sed -i 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g'

It works perfectly on single-line links, but I just ran across a file where the a tag is split across lines like so:

<a href="link">Click
        here now</a>

Of course it didn't find this one, so I need to modify the command to allow for line breaks in the search. I have no clue how to do that, short of going over the entire file first and removing every \n beforehand. The problem there is that I lose all formatting in the file.

Tags: linux sed
2 answers
爷、活的狠高调
Reply #2 · 2019-08-16 17:30

Here is a quick and dirty solution that assumes there will be no more than one newline in a link:

sed -i '' -e '/<a href=.*>/{/<\/a>/!{N;s|\n||;};}' -e 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g'

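The first command (/<a href=.*>/{/<\/a>/!{N;s|\n||;};}) checks for a line that contains <a href=...> but no </a>, in which case it appends the next line to the pattern space and removes the newline between them. The second is yours. Note that -i '' is the BSD/macOS form of in-place editing; with GNU sed, use -i on its own. Either way, the file name goes at the end of the command.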
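To see the effect without editing a file in place, here is a minimal sketch that pipes some sample text through the same two expressions. The sample text and the variable names (sample, result) are made up for illustration; only the sed expressions come from the answer above.

```shell
# Hypothetical sample input: one link on a single line, one split across two lines.
sample='<a href="one.html">One</a>
<a href="link">Click
        here now</a>'

# First -e: on a line with <a href=...> but no </a>, append the next line (N)
# and delete the newline that N put between the two lines.
# Second -e: the substitution from the question, unchanged.
result=$(printf '%s\n' "$sample" |
  sed -e '/<a href=.*>/{/<\/a>/!{N;s|\n||;};}' \
      -e 's,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g')

printf '%s\n' "$result"
```

The single-line link comes out as "One - one.html". The split link is joined onto one line, but the indentation of the second line stays in the link text, because only the newline is removed.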

放荡不羁爱自由
Reply #3 · 2019-08-16 17:33

You can do this by inserting a loop into your sed script:

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile

As-is, that will leave an embedded newline in the output, and it wasn't clear if you wanted it that way or not. If not, just substitute out the newline:

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile

And maybe clean up extra spaces (\s is a GNU sed extension; [[:space:]] is the portable spelling):

sed -e '/<a href/{;:next;/<\/a>/!{N;b next;};s/\n//g;s/\s\{2,\}/ /g;s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g;}' yourfile

Explanation: The /<a href/{...} lets us ignore lines we don't care about. Once we find one we like, we check to see if it has the end marker. If not (/<\/a>/!), we append a newline plus the next line to the pattern space (N) and branch (b) back to :next to see if we've found it yet. Once we find it, we continue on with the substitutions.
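The same loop can be written as a multi-line script with one command per line, which is easier to read and does not depend on sed accepting a semicolon right after a label. This is only a sketch: the sample text and the variable names (sample, looped) are hypothetical, and [[:space:]] is swapped in for \s so that it does not rely on the GNU extension.

```shell
# Hypothetical sample input: a link split across three lines, between two plain lines.
sample='before
<a href="link">Click
        here
        now</a>
after'

looped=$(printf '%s\n' "$sample" | sed -e '
# Only act on lines that start a link.
/<a href/{
  :next
  # No closing tag yet: append the next line and test again.
  /<\/a>/!{
    N
    b next
  }
  # The whole link is now in the pattern space.
  # Remove the embedded newlines.
  s/\n//g
  # Squeeze each run of two or more whitespace characters to one space.
  s/[[:space:]]\{2,\}/ /g
  # The substitution from the question.
  s,<a href="\(.*\)">\(.*\)</a>,\2 - \1,g
}')

printf '%s\n' "$looped"
```

With this input the output is three lines: "before", "Click here now - link", and "after". Lines outside the link pass through untouched, so the rest of the file keeps its formatting.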
