php preg_match and ereg syntax difference

Posted 2019-02-28 10:06

Question:

I found that the syntax of preg_match() differs from that of the deprecated ereg().
For example:

I thought that

preg_match('/^<div>(.*)<\/div>$/', $content);

means the same as

ereg('^<div>(.*)</div>$', $content);

but I was wrong. In preg_match(), . does not match special characters such as newlines the way it does in ereg().

So I started to use this syntax:

preg_match('/^<div>([^<]*)<\/div>$/', $content);

but that isn't exactly what I need.

Can anyone suggest how to solve this problem without using deprecated functions?

Answer 1:

For parsing HTML I'd suggest reading this question and choosing a built-in PHP extension.
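
As a rough illustration of the built-in route, here is a minimal sketch using the DOM extension (DOMDocument). The sample markup is made up for the example, and it assumes the DOM extension is enabled, which is the default in most PHP builds.

```php
<?php
// Hypothetical input; in the question this would be $content.
$content = "<div>Hello\n<b>world</b></div>";

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($content);
libxml_clear_errors();

// First <div> in the document.
$div = $doc->getElementsByTagName('div')->item(0);

// textContent is the text with all tags stripped, newlines included.
$text = $div->textContent;

// Serialize each child node to rebuild the inner markup of the <div>.
$inner = '';
foreach ($div->childNodes as $child) {
    $inner .= $doc->saveHTML($child);
}
```

Unlike the regex versions, this does not care whether the <div> contains line breaks or nested tags.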

If for some reason you need or want to use RegEx to do it, you should know that:

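preg_match() is a greedy little bugger: (.*) will try to eat as much as it can and gives characters back only when the rest of the pattern fails to match (on large inputs this can hit the recursion or backtracking limits). You can change this with the U modifier¹, or by writing the lazy form (.*?).
By default, . does not match a newline, and ^ and $ anchor to the start and end of the whole subject rather than to each line. You can change this with the s modifier (for .) or the m modifier (for ^ and $)¹.
Your 'not a < character' hack ([^<]*) does a good job, as it forces the engine to stop at the first < character, but it will work only if the <div> doesn't contain other tags inside!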
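
The points above can be sketched as follows. The sample strings are made up for illustration, and # is used as the delimiter so that the / in </div> does not need escaping.

```php
<?php
// Hypothetical sample input: a <div> whose content spans two lines.
$content = "<div>first line\nsecond line</div>";

// Without the s modifier, . does not match "\n", so the pattern fails.
$plain = preg_match('#^<div>(.*)</div>$#', $content, $m1);

// With s (PCRE_DOTALL), . also matches newlines, so the whole content is captured.
$dotall = preg_match('#^<div>(.*)</div>$#s', $content, $m2);

// A lazy quantifier (.*?) stops at the first closing tag instead of the last one.
$lazy = preg_match('#<div>(.*?)</div>#s', "<div>a</div><div>b</div>", $m3);

var_dump($plain);   // int(0)
var_dump($m2[1]);   // the two lines, newline included
var_dump($m3[1]);   // string(1) "a"
```

The s modifier is probably the closest replacement for the ereg() behaviour described in the question; the lazy form matters only when the subject can contain more than one </div>.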