Find PHP with REGEX

2019-08-19 00:50发布

I need a REGEX that can find blocks of PHP code in a file. For example:

    <? print '<?xml version="1.0" encoding="UTF-8"?>';?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    <head>
        <?php echo "stuff"; ?>
    </head>
    <html>

When parsed would by the REGEX would return:

array(
    "<? print '<?xml version=\"1.0\" encoding="UTF-8"?>';?>",
    "<? echo \"stuff\"; ?>"
);

You can assume the PHP is valid.

5条回答
不美不萌又怎样
2楼-- · 2019-08-19 01:30
<\?(?:php)?\s+.*?\?>$

with the following modifiers:

Dot match newlines

^& match at line breaks

查看更多
我命由我不由天
3楼-- · 2019-08-19 01:43

Try this regex(untested):

preg_match_all('@<\?.*?\?>@si',$html,$m);
print_r($m[0]);
查看更多
欢心
4楼-- · 2019-08-19 01:47

This is the type of task that is much better suited for a custom parser. You could relatively easily construct one using a stack and I can guarantee you will be done much quicker and pull less hair out than you would trying to debug your regex.

Regular expressions are great tools when used appropriately but not all text parsing tasks are equal.

查看更多
Melony?
5楼-- · 2019-08-19 01:52

With token_get_all you get a list of PHP language tokens of a given PHP code. Then you just need to iterate the list, look for the open tag tokens and for the corresponding close tags.

$blocks = array();
$opened = false;
foreach (token_get_all($code) as $token) {
    if (!$opened) {
        if (is_array($token) && ($token[0] === T_OPEN_TAG || $token[0] === T_OPEN_TAG_WITH_ECHO)) {
            $opened = true;
            $buffer = $token[1];
        }
    } else {
        if (is_array($token)) {
            $buffer .= $token[1];
            if ($token[0] === T_CLOSE_TAG) {
                $opened = false;
                $blocks[] = $buffer;
            }
        } else {
            $buffer .= $token;
        }
    }
}
查看更多
聊天终结者
6楼-- · 2019-08-19 01:53

Try the following regex using preg_match()

/<\?(?:php)?\s+(.*?)\?>/

That's untested, but is a start. It assumes a closing PHP tag (arguably well-formed).

查看更多
登录 后发表回答