Processing include/require directives in PHP

Posted 2019-02-28 10:02

Background: I'm building an automated test framework for a PHP application, and I need a way to efficiently "stub out" classes which encapsulate communication with external systems. For example, when testing class X that uses DB wrapper class Y, I would like to be able to "swap in" a "fake" version of class Y while running automated tests on class X (this way I don't have to do full setup + teardown of the state of the real DB as part of the test).

Problem: PHP allows "conditional includes", which means basically that include/require directives are handled as part of processing the "main" logic of a file, e.g.:

if (condition) {
    require_once('path/to/file');
}

The problem is that I can't figure out what happens when the "main" logic of the included file calls "return". Are all of the objects (defines, classes, functions, etc.) in the included file imported into the file which calls include/require? Or does processing stop with the return?

Example: Consider these three files:

A.inc

define('MOCK_Z', true);
require_once('Z.inc');
class Z {
    public function foo() {
        print "This is foo() from a local version of class Z.\n";
    }
}
$a = new Z();
$a->foo();

B.inc

define('MOCK_Z', true);
require_once('Z.inc');
$a = new Z();
$a->foo();

Z.inc

if (defined ('MOCK_Z')) {
    return true;
}
class Z {
    function foo() {
        print "This is foo() from the original version of class Z.\n";
    }
}

I observe the following behavior:

$ php A.inc
> This is foo() from a local version of class Z.

$ php B.inc
> This is foo() from the original version of class Z.

Why This is Strange: If require_once() included all of the defined code objects, then "php A.inc" ought to complain with a message like

Fatal error: Cannot redeclare class Z

And if require_once() included only the defined code objects up to "return", then "php B.inc" ought to complain with a message like:

Fatal error: Class 'Z' not found

Question: Can anyone explain exactly what PHP is doing, here? It actually matters to me because I need a robust idiom for handling includes for "mocked" classes.

Tags: php include
5 answers
成全新的幸福
#2 · 2019-02-28 10:32

Okay, so the behavior of the return statement in a PHP included file is to return control to the including file at execution time. The class definitions, on the other hand, are parsed and made accessible during the compile phase. For instance, if you change the above to the following

a.php:

<?php
define('MOCK_Z', true);

require_once('z.php');

class Z {
    public function foo() {
        print "This is foo() from a local version of class Z in a.php\n";
    }
}

$a = new Z();
$a->foo();

?> 

b.php:

<?php

define('MOCK_Z', true);
require_once('z.php');
$a = new Z();
$a->foo();

?>

z.php:

<?php

if (defined ('MOCK_Z')) {
    echo "MOCK_Z definition found, returning\n";
    return false;
}

echo "MOCK_Z definition not found defining class Z\n";

class X { syntax error here ; }

class Z {
    function foo() {
        print "This is foo() from the original version of class Z.\n";
    }
}

?>

then php a.php and php b.php will both die with syntax errors, which indicates that the return is not evaluated during the compile phase!

So this is how you get around it:

z.php:

<?php

$z_source = "z-real.inc";

if ( defined('MOCK_Z') ) {
    $z_source = "z-mock.inc";
}

include_once($z_source);

?>

z-real.inc:

<?php
class Z {
    function foo() {
        print "This is foo() from the z-real.inc.\n";
    }
}

?>

z-mock.inc:

<?php
class Z {
    function foo() {
        print "This is foo() from the z-mock.inc.\n";
    }
}

?>

Now the inclusion is determined at runtime :^) because the decision is not made until $z_source value is evaluated by the engine.

Now you get desired behavior, namely:

php a.php gives:

Fatal error: Cannot redeclare class Z in /Users/masud/z-real.inc on line 2

and php b.php gives:

This is foo() from the z-real.inc.

Of course you can do this directly in a.php or b.php but doing the double indirection may be useful ...

NOTE

Having SAID all of this, of course this is a terrible way to build stubs hehe for unit-testing or for any other purpose :-) ... but that's beyond the scope of this question so I shall leave it to your good devices.

Hope this helps.

Deceive 欺骗
#3 · 2019-02-28 10:40

It looks like the answer is that class declarations are compile-time, but duplicate class definition errors are run-time at the point in the code that the class is declared. The first time a class definition is in a parsed block, it is immediately made available for use; by returning from an included file early, you aren't preventing class declaration, but you are bailing out before the error is thrown.

For example, here are a bunch of class definitions for Z:

$ cat A.php
<?php
error_reporting(-1);

$init_classlist = get_declared_classes();
require_once("Z.php");
var_dump(array_diff(get_declared_classes(), $init_classlist));

class Z {
  function test() {
    print "Modified Z from A.php.\n";
  }
}

$z = new Z();
$z->test();

return;

class Z {
  function test() {
    print "Another Z from A.php.\n";
  }
}


$ cat Z.php
<?php
echo "In Z.php!\n";
return;

class Z {
  function test() {
    print "Original Z.\n";
  }
}

When A.php is called, the following is produced:

In Z.php!
array(0) {
}
Modified Z from A.php.

This shows that the declared classes don't change upon entering Z.php: the Z class is already declared by A.php further down the file. However, Z.php never gets a chance to complain about the duplicate definition because of the return before the class declaration. Similarly, A.php doesn't get a chance to complain about the second definition in the same file because it also returns before the second definition is reached.

Conversely, removing the first return; in Z.php instead produces:

In Z.php!

Fatal error: Cannot redeclare class Z in Z.php on line 4

By simply not returning early from Z.php, we reach the class declaration, which has a chance to produce its run-time error.

In summary: class declaration is compile-time, but duplicate definition errors are run-time at the point the class declaration appears in the code.

(Of course, having not confirmed this with the PHP internals, it might be doing something completely different, but the behavior is consistent with my description above. Tested in PHP 5.5.14.)
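The compile-time side of this can be seen in a single file. This is only a sketch: the class names Hoisted and Conditional are made up for illustration, and it assumes a plain top-level class with no parent class or interfaces.

```php
<?php
// Hoisted is declared further down, yet it already exists at this point:
// a plain top-level class is bound when the file is compiled.
$hoisted = class_exists('Hoisted', false);

// A class declared inside a block is only bound when execution reaches it.
$conditionalBefore = class_exists('Conditional', false);

class Hoisted {}

if (true) {
    class Conditional {}
}

$conditionalAfter = class_exists('Conditional', false);

var_dump($hoisted);           // bool(true)
var_dump($conditionalBefore); // bool(false)
var_dump($conditionalAfter);  // bool(true)
```

If this holds on your PHP version, wrapping a declaration in an if block is what turns it into a run-time declaration.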

神经病院院长
#4 · 2019-02-28 10:42

This is the closest thing I could find in the manual:

If there are functions defined in the included file, they can be used in the main file independent if they are before return() or after. If the file is included twice, PHP 5 issues fatal error because functions were already declared, while PHP 4 doesn't complain about functions defined after return().

And this is true regarding functions. If you define the same function in A and Z (after the return) with PHP 5, you'll get a fatal error as you expect.
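The part of the manual quote about functions can be checked with something like the following. It is a sketch, not a definitive test: the temp file and the function name defined_after_return are hypothetical, and the included file is written on the fly so the script is self-contained.

```php
<?php
// Write a small include file whose function is declared AFTER the return.
$file = sys_get_temp_dir() . '/after_return_' . uniqid() . '.php';
file_put_contents(
    $file,
    "<?php\nreturn 'early';\nfunction defined_after_return() { return 'still callable'; }\n"
);

// The include stops executing at the return and hands back its value...
$result = require $file;

// ...but the function declared after the return is available anyway.
$callable = function_exists('defined_after_return');
$output   = defined_after_return();

echo $result, "\n";
echo $output, "\n";

unlink($file); // clean up the temporary file
```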

However, classes seem to fall back to PHP 4 behavior, where it doesn't complain about functions defined after return. To me this seems like a bug, but I don't see where the documentation says what should happen with classes.

一纸荒年 Trace。
#5 · 2019-02-28 10:45

According to php.net, if you use a return statement in an included file, it returns execution to the script that included it. That means require_once stops executing the included file, but the overall script keeps running. The examples on php.net also show that if you return a value from an included file, you can write something like $foo = require_once('myfile.php'); and $foo will contain the returned value. If the included file doesn't return anything, $foo is 1, which shows that require_once succeeded. The php.net manual page for include has more examples.

And I don't see anything on php.net that says specifically how the PHP interpreter processes included files, but your testing shows that it resolves class definitions before executing the in-line code.
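A small sketch of the return-value behavior, with hypothetical temp files created on the fly so it runs on its own (the 'driver' key is just an illustration):

```php
<?php
$dir = sys_get_temp_dir();

// An include file that returns a value.
$withReturn = $dir . '/with_return_' . uniqid() . '.php';
file_put_contents($withReturn, "<?php\nreturn array('driver' => 'fake');\n");

// An include file with no return statement at all.
$noReturn = $dir . '/no_return_' . uniqid() . '.php';
file_put_contents($noReturn, "<?php\n\$unused = 42;\n");

$config = require $withReturn;      // the value passed to return
$plain  = require $noReturn;        // 1, meaning the include succeeded
$again  = require_once $withReturn; // true, the file was already included

var_dump($config['driver']); // string(4) "fake"
var_dump($plain);            // int(1)
var_dump($again);            // bool(true)

unlink($withReturn);
unlink($noReturn);
```

Note the last case: once a file has been included, a later require_once on it gives back true rather than the file's return value, so code relying on the returned value should probably use plain require.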

UPDATE

I added some tests as well, by modifying Z.inc as follows:

    $test = new Z();
    echo $test->foo();
    if (defined ('MOCK_Z')) {
        return true;
    }
    class Z {
        function foo() {
            print "This is foo() from the original version of class Z.\n";
        }
    }

And then tested on the command line as follows:

    %> php A.inc
    => This is foo() from a local version of class Z.
       This is foo() from a local version of class Z.

    %> php B.inc
    => This is foo() from the original version of class Z.
       This is foo() from the original version of class Z.

Obviously, name hoisting is happening here, but the question remaining is why there are no complaints about re-declarations?

UPDATE

So, I tried to declare class Z twice in A.inc and I got the fatal error, but when I tried to declare it twice in Z.inc, I didn't get an error. This leads me to believe that the php interpreter will return execution to the file that did the including when a fatal runtime error occurs in an included file. That is why A.inc did not use Z.inc's class definition. It was never put into the environment, because it caused a fatal error, returning execution back to A.inc.

UPDATE

I tried the die(); statement in Z.inc, and it actually does stop all execution. So, if one of your included scripts has a die statement, then you will kill your testing.

祖国的老花朵
#6 · 2019-02-28 10:46

I've thought about this for a while now, and nobody has been able to point me to a clear and consistent explanation for the way PHP (up to 5.3 anyway) processes includes.

I conclude that it would be better to avoid this issue entirely and achieve control over "test double" class substitution via autoloading:

spl_autoload_register()

In other words, replace the includes at the top of each PHP file with a require_once() which "bootstraps" a class which defines the logic for autoloading. And when writing automated tests, "inject" alternative autoloading logic for the classes to be "mocked" at the top of each test script.

It will naturally require a good deal of effort to modify existing code to follow this approach, but the effort appears to be worthwhile both to improve testability and to reduce the total number of lines in the codebase.
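A minimal sketch of what that could look like, assuming PHP 5.3+ for the closure. The directory layout (real/ and mock/), the $useMocks flag, and the one-class-per-file naming are all assumptions made for illustration, not something from the application in question; the class files are written to a temp directory so the sketch is self-contained.

```php
<?php
// Hypothetical layout: <base>/real/Z.php and <base>/mock/Z.php.
$base = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($base . '/real', 0777, true);
mkdir($base . '/mock', 0777, true);
file_put_contents($base . '/real/Z.php',
    "<?php\nclass Z { function foo() { return 'real Z'; } }\n");
file_put_contents($base . '/mock/Z.php',
    "<?php\nclass Z { function foo() { return 'mock Z'; } }\n");

// A test script would set this to true; production bootstrap would not.
$useMocks = true;

spl_autoload_register(function ($class) use ($base, $useMocks) {
    // In test mode, look in mock/ first and fall back to real/.
    $dirs = $useMocks ? array('mock', 'real') : array('real');
    foreach ($dirs as $dir) {
        $path = $base . '/' . $dir . '/' . $class . '.php';
        if (is_file($path)) {
            require $path;
            return;
        }
    }
});

// No include for Z anywhere: the autoloader decides which file defines it.
$z = new Z();
$which = $z->foo();
echo $which, "\n"; // mock Z
```

Because only one file defining Z is ever loaded, the question of what happens to declarations after an early return never comes up. Namespaced class names would need extra path handling that this sketch leaves out.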
