Need variable width negative lookbehind replacement

I have looked at many questions here (and many more websites) and some provided hints but none gave me a definitive answer. I know regular expressions but I am far from being a guru. This particular question deals with regex in PHP.

I need to locate words in a text that are not surrounded by a hyperlink of a given class. For example, I might have

This <a href="blabblah" class="no_check">elephant</a> is green and this elephant is blue while this <a href="blahblah">elephant</a> is red.

I need to match the second and third elephants but not the first (identified by the test class "no_check"). Note that a hyperlink could carry more attributes than just href and class. I came up with

((?<!<a .*class="no_check".*>)\belephant\b)

which works beautifully in regex test software but not in PHP.

Any help is greatly appreciated. If you cannot provide a regular expression but can find some sort of PHP code logic that would circumvent the need for it, I would be equally grateful.

Tags: php regex lookbehind negative-lookbehind

3 answers

何必那么认真

#2 · 2019-07-17 19:26

I think the simplest approach would be to match either a complete <a> element with a "no_check" attribute, or the word you're searching for. For example:

<a [^<>]*class="no_check"[^<>]*>.*?</a>|(\belephant\b)

If it was the word you matched, it will be in capture group #1; if not, that group should be empty or null.
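A minimal PHP sketch of that idea, using the sample text from the question. The function name findUnprotected and the s modifier are my own additions for illustration, not part of the answer above; the check on group 1 covers both the case where PHP omits the unmatched trailing group and the case where it reports an offset of -1.

```php
<?php
// Hypothetical helper: byte offsets of $word wherever it is NOT inside an
// <a> element whose class attribute is exactly $class.
function findUnprotected(string $html, string $word, string $class): array
{
    // Alternative 1 swallows a whole protected link; alternative 2 captures the word.
    $pattern = '~<a [^<>]*class="' . preg_quote($class, '~') . '"[^<>]*>.*?</a>'
             . '|(\b' . preg_quote($word, '~') . '\b)~s';

    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $offsets = [];
    foreach ($matches as $m) {
        // Group 1 only participates when the word itself was matched.
        if (isset($m[1]) && $m[1][1] >= 0) {
            $offsets[] = $m[1][1];
        }
    }
    return $offsets;
}

$html = 'This <a href="blabblah" class="no_check">elephant</a> is green and '
      . 'this elephant is blue while this <a href="blahblah">elephant</a> is red.';

print_r(findUnprotected($html, 'elephant', 'no_check'));
```

The first alternative has to come first in the pattern: the regex engine tries it at every position, so a protected link is consumed as a whole before the word inside it can be seen.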

Of course, by "simplest approach" I really meant the simplest regex approach. Even simpler would be to use an HTML parser.
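For the parser route, here is a rough sketch with PHP's bundled DOM extension (DOMDocument and DOMXPath). The function name textNodesOutsideClass is hypothetical, and the sketch assumes the input is an ASCII fragment and that $class is a plain token without quotes or spaces.

```php
<?php
// Hypothetical helper: returns the contents of every text node that contains
// $word and has no <a> ancestor carrying the class $class.
function textNodesOutsideClass(string $html, string $word, string $class): array
{
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $doc = new DOMDocument();
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // The concat/normalize-space idiom matches one token in a class list.
    // Assumption: $class contains no quotes, so it is safe to embed here.
    $query = '//text()[not(ancestor::a[contains('
           . 'concat(" ", normalize-space(@class), " "), " ' . $class . ' ")])]';

    $hits = [];
    foreach ($xpath->query($query) as $node) {
        if (preg_match('~\b' . preg_quote($word, '~') . '\b~', $node->nodeValue)) {
            $hits[] = $node->nodeValue;
        }
    }
    return $hits;
}

$html = 'This <a href="blabblah" class="no_check">elephant</a> is green and '
      . 'this elephant is blue while this <a href="blahblah">elephant</a> is red.';

print_r(textNodesOutsideClass($html, 'elephant', 'no_check'));
```

Unlike the regex, this also copes with class="no_check other" and with attributes in any order, because the markup is actually parsed.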


smile是对你的礼貌

#3 · 2019-07-17 19:26

I ended up using a mixed solution. It turns out that I had to parse a text for specific keywords and check if they were already part of a link and if not add them to a hyperlink. The solutions provided here were very interesting but not exactly tailored enough for what I needed.
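This answer does not show its code, so the following is only a guess at what such a step could look like, not the poster's actual solution: skip anything that is already inside a link and wrap the remaining occurrences of the keyword. The function name linkKeyword and the URL are made up for the example.

```php
<?php
// Hypothetical helper: wraps $word in a link to $url unless it already sits
// inside some <a>...</a> element.
function linkKeyword(string $html, string $word, string $url): string
{
    // Alternative 1 consumes any existing link; alternative 2 captures the bare keyword.
    $pattern = '~<a\b[^<>]*>.*?</a>|\b(' . preg_quote($word, '~') . ')\b~is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($url): string {
            if (!isset($m[1]) || $m[1] === '') {
                return $m[0];               // existing link: leave untouched
            }
            return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '">' . $m[1] . '</a>';
        },
        $html
    );

    return $result ?? $html;                // fall back to the input on a PCRE error
}

echo linkKeyword(
    'an <a href="x">elephant</a> and an elephant',
    'elephant',
    '/wiki/elephant'
);
```

One known gap: the keyword would also be wrapped if it appeared inside an attribute of some other tag, such as an alt text. A parser-based version would not have that problem.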

The idea of using an HTML parser was a good one though and I am currently using one in another project. So hats off to both Alan Moore and Eric Strom for suggesting that solution.


ゆ、 Hurt°

#4 · 2019-07-17 19:42

If variable-width negative lookbehind is not available, a quick and dirty solution is to reverse the string in memory and use a variable-width negative lookahead instead, then map the matches back to the original string.
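A sketch of the reversal trick in PHP, under some assumptions: the text is single-byte (strrev works on bytes), the function name findByReversal is made up, and [^<>]* stands in for the question's .* so the lookahead only inspects the tag directly in front of the word. Instead of reversing the string a second time, the sketch converts each match position back to an offset in the original.

```php
<?php
// Hypothetical helper: offsets of $word in $html where the word is NOT
// immediately preceded by an <a ... class="$class" ...> opening tag.
function findByReversal(string $html, string $word, string $class): array
{
    $rev = strrev($html);   // byte-wise reversal; assumes ASCII input

    // Mirror image of:  <a [^<>]*class="CLASS"[^<>]*>  directly before the word.
    $lookahead = '>[^<>]*'
               . preg_quote(strrev('class="' . $class . '"'), '~')
               . '[^<>]* a<';
    $pattern = '~\b' . preg_quote(strrev($word), '~') . '\b(?!' . $lookahead . ')~';

    preg_match_all($pattern, $rev, $m, PREG_OFFSET_CAPTURE);

    $len = strlen($html);
    $offsets = [];
    foreach ($m[0] as [$text, $pos]) {
        // A match at $pos in the reversed string starts here in the original.
        $offsets[] = $len - $pos - strlen($text);
    }
    sort($offsets);
    return $offsets;
}

$html = 'This <a href="blabblah" class="no_check">elephant</a> is green and '
      . 'this elephant is blue while this <a href="blahblah">elephant</a> is red.';

print_r(findByReversal($html, 'elephant', 'no_check'));
```

Every literal in the pattern has to be written backwards, which is why this gets hard to maintain quickly.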

But you may be better off using an HTML parser.

