PHP: Preg_replace, word boundaries and non word ch

Posted 2019-05-18 08:19

Question:

I need to replace words that start with a hash mark (#) inside a text. I know how to replace whole words:

preg_replace("/\b".$variable."\b/", $value, $text);

But \b only matches at a boundary between a word character and a non-word character, so a word that starts with a hash mark won't be replaced.


I have HTML that contains variables such as #companyName, which I replace with a value.

Answer 1:

\b matches between an alphanumeric character (shorthand \w) and a non-alphanumeric character (\W), counting underscores as alphanumeric. This means, as you have seen, that it won't match before a # (unless that's preceded by an alnum character).
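To make that behaviour concrete, here is a small sketch. The sample strings are made up for illustration and are not from the question.

```php
<?php
// Illustrative input: "#" is preceded by a space, which is a non-word character.
$text = 'Welcome to #companyName!';

// \b before "#" needs a word character directly to the left of "#".
// Here the left neighbour is a space, so there is no boundary and no match.
$withBoundary = preg_match('/\b#companyName\b/', $text);

// Dropping the leading \b lets the same token match.
$withoutBoundary = preg_match('/#companyName\b/', $text);

// When "#" follows an alnum character ("a"), \b before "#" does match.
$afterAlnum = preg_match('/\b#companyName\b/', 'a#companyName');

var_dump($withBoundary, $withoutBoundary, $afterAlnum);
```

The three results are 0, 1 and 1: the leading \b only succeeds in the last case, where a word character sits directly before the #.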

I suggest that you add \b only on the side of your query word that starts or ends with an alnum character.

So, perhaps something like this (although I don't know any PHP, so this may be syntactically completely wrong):

if (preg_match('/^\w/', $variable))
    $variable = '\b'.$variable;
if (preg_match('/\w$/', $variable))
    $variable = $variable.'\b';
preg_replace('/'.$variable.'/', $value, $text);
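The same idea can be wrapped in a function. This is only a sketch: the name replaceToken is hypothetical, and the preg_quote() call and the callback are additions that go beyond the snippet above. They assume the search term should be treated as literal text and that the replacement should be inserted verbatim.

```php
<?php
// Hypothetical helper: replaces $variable in $text, adding \b only on the
// sides where $variable begins or ends with a word character.
function replaceToken(string $variable, string $value, string $text): string
{
    // Escape regex metacharacters and the "/" delimiter in the search term.
    $pattern = preg_quote($variable, '/');

    if (preg_match('/^\w/', $variable)) {
        $pattern = '\b' . $pattern;
    }
    if (preg_match('/\w$/', $variable)) {
        $pattern .= '\b';
    }

    // A callback inserts $value as-is, so "$1" or "\1" inside $value is not
    // interpreted as a backreference.
    return preg_replace_callback(
        '/' . $pattern . '/',
        function ($matches) use ($value) {
            return $value;
        },
        $text
    );
}

echo replaceToken('#companyName', 'Acme', '<p>#companyName and #companyNameOld</p>');
```

This prints `<p>Acme and #companyNameOld</p>`: the trailing \b keeps the longer token #companyNameOld from being touched.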


Answer 2:

All \b does is match a change between non-word and word characters. Since you know $variable starts with non-word characters, you just need to precede the match by a non-word character (\W).

However, since you are replacing, you either need to make the non-word match zero-width, i.e. a look-behind:

preg_replace("/(?<=\\W)".$variable."\\b/", $value, $text); 

or incorporate the matched character into the replacement text:

preg_replace("/(\\W)".$variable."\\b/", '${1}'.$value, $text);
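One caveat with both forms is that they need an actual non-word character before the token, so a token at the very start of the text is skipped. A possible variation, not part of the answer above, is the negative look-behind (?<!\w), which only requires that no word character precedes the match. The sample text and values below are assumptions for illustration.

```php
<?php
// Illustrative values, not taken from the question.
$variable = '#companyName';
$value    = 'Acme';
$text     = '#companyName is great. Visit #companyName today, not x#companyName.';

$quoted = preg_quote($variable, '/');

// Positive look-behind: a non-word character must precede the token,
// so the occurrence at the start of the string is left alone.
$positive = preg_replace('/(?<=\W)' . $quoted . '\b/', $value, $text);

// Negative look-behind: only forbids a word character before the token,
// so the start of the string qualifies as well.
$negative = preg_replace('/(?<!\w)' . $quoted . '\b/', $value, $text);

echo $positive, "\n", $negative, "\n";
```

With this input, the positive look-behind replaces only the middle occurrence, while the negative look-behind also replaces the one at the start. Neither touches x#companyName.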


Answer 3:

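The following expression can also mark boundaries for words that contain non-word characters. Because the two groups consume the characters next to the word, they are put back in the replacement:

preg_replace("/(^|\s|\W)".$variable."($|\s|\W)/", '${1}'.$value.'${2}', $text);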
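Since the groups in this pattern match real characters rather than zero-width positions, it matters what goes into the replacement. The sketch below compares both options on a made-up sentence. It uses a shortened pattern without \s, because \W already covers whitespace.

```php
<?php
// Illustrative values, not taken from the question.
$variable = '#companyName';
$value    = 'Acme';
$text     = 'Hello #companyName, welcome.';

$pattern = '/(^|\W)' . preg_quote($variable, '/') . '($|\W)/';

// Replacing with $value alone also swallows the captured neighbours
// (the space before and the comma after the token).
$lossy = preg_replace($pattern, $value, $text);

// Re-inserting both groups keeps the surrounding characters.
$kept = preg_replace($pattern, '${1}' . $value . '${2}', $text);

echo $lossy, "\n", $kept, "\n";
```

The first call yields `HelloAcme welcome.` and the second `Hello Acme, welcome.`. The `${1}` form is used instead of `$1` so that a value starting with a digit cannot be misread as part of the group number.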


Answer 4:

Why not just

preg_replace("/#\b".$variable."\b/", $value, $text);