PHP preg_match and ereg syntax difference

Posted 2019-02-28 09:42

I found that the syntax of preg_match() and the deprecated ereg() is different.
For example:

I thought that

preg_match('/^<div>(.*)<\/div>$/', $content);

means the same as

ereg('^<div>(.*)</div>$', $content);

but I was wrong. In preg_match(), . doesn't match special characters such as newlines the way it does in ereg().

So I started to use this syntax:

preg_match('/^<div>([^<]*)<\/div>$/', $content);

but that isn't exactly what I need.

Can anyone suggest how to solve this problem without using deprecated functions?

1 answer
The star
#2 · 2019-02-28 10:22

For parsing HTML, I'd suggest reading this question and choosing a built-in PHP extension.
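
As a minimal sketch of the built-in-extension route, assuming the DOM extension is enabled and using a made-up $content value (the asker's real input is not shown), the inner HTML of the first <div> could be collected like this:

```php
<?php
// Hypothetical input; the asker's real $content is not shown.
$content = "<div>first line\n<b>bold</b> second line</div>";

$doc = new DOMDocument();
// LIBXML_NOERROR / LIBXML_NOWARNING keep libxml quiet about imperfect markup.
$doc->loadHTML($content, LIBXML_NOERROR | LIBXML_NOWARNING);

// First <div> in the document, or null if there is none.
$div = $doc->getElementsByTagName('div')->item(0);

$inner = '';
if ($div !== null) {
    // Re-serialize each child node to build the div's inner HTML.
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
}

echo $inner, "\n";
```

Unlike the ([^<]*) pattern, this keeps nested tags such as <b> and does not care about newlines inside the <div>.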

If for some reason you need or want to use RegEx to do it you should know that:

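  • preg_match() is a greedy little bugger: (.*) will try to eat everything it can until it gets sick (meaning it reaches the end of the input, or hits the recursion or backtracking limits). You can change this with the U modifier [1].

  • by default the engine treats the subject as a single line: . does not match newlines, and ^ and $ anchor to the start and end of the whole string. You can change this with the s modifier (. also matches newlines) or the m modifier (^ and $ match at every line boundary) [1].

  • using your 'not a < character' ([^<]*) hack does a good job, as it forces the engine to stop at the first < character, but it will work only if the <div> doesn't contain other tags inside!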
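
The points above can be sketched as follows. The input strings are made up for illustration, and # is used as the pattern delimiter so that the / in </div> does not need escaping:

```php
<?php
// Hypothetical input with a newline and a nested tag inside the <div>.
$content = "<div>first line\n<b>bold</b> second line</div>";

// Without the s modifier, '.' stops at the newline, so this does not match.
$plain = preg_match('#^<div>(.*)</div>$#', $content, $m1);

// With s (PCRE_DOTALL), '.' also matches newlines, as the ereg() version did.
$dotall = preg_match('#^<div>(.*)</div>$#s', $content, $m2);

// A lazy quantifier (.*?) stops at the first </div>.
// The U modifier would instead flip greediness for the whole pattern.
$two = "<div>a</div><div>b</div>";
preg_match('#<div>(.*?)</div>#s', $two, $m3);

var_dump($plain);  // int(0)
var_dump($dotall); // int(1)
var_dump($m2[1]);  // the full inner text, newline and <b> tag included
var_dump($m3[1]);  // string(1) "a"
```

For the pattern in the question, adding s is probably the closest equivalent to the old ereg() behaviour.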

ref: 1 PCRE Pattern Modifiers
