Are scalar and strict types in PHP7 a performance

Posted 2019-02-11 15:56

Question:

Since PHP7 we can use scalar type hints and ask for strict types on a per-file basis. Are there any performance benefits from using these features? If yes, how?

Around the interwebs I've only found conceptual benefits, such as:

  • more precise errors
  • avoiding issues with unwanted type coercion
  • more semantic code, avoiding misunderstandings when using others' code
  • better IDE evaluation of code
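The coercion point in the list above can be illustrated with a minimal sketch. The function name `add` is hypothetical and not taken from the question. With `declare(strict_types=1)` at the top of the file, passing a numeric string to an `int` parameter raises a `TypeError`; without the declaration, the same call would be silently coerced to `add(1, 2)`.

```php
<?php
declare(strict_types=1);

// Hypothetical example function, used only for illustration.
function add(int $a, int $b): int
{
    return $a + $b;
}

var_dump(add(1, 2)); // int(3)

try {
    // Strict mode: the string "1" is rejected instead of being coerced to 1.
    add("1", 2);
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}
```

Note that the declaration only affects calls made from the file that contains it.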

Answer 1:

Today, the use of scalar and strict types in PHP7 does not enhance performance.
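If you want to check this on your own PHP build, a rough micro-benchmark along the following lines can help. This is only a sketch: the names `addTyped`, `addUntyped` and `timeCalls` are made up for illustration, and timings from a loop like this are noisy and depend on the PHP version and on Opcache settings.

```php
<?php
// Hypothetical helper names; none of these come from the original answer.
function addTyped(int $a, int $b): int
{
    return $a + $b;
}

function addUntyped($a, $b)
{
    return $a + $b;
}

// Calls $fn repeatedly and returns the elapsed wall-clock time in seconds.
function timeCalls(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($i, 1);
    }
    return microtime(true) - $start;
}

$iterations = 1000000;
printf("typed:   %.4f s\n", timeCalls('addTyped', $iterations));
printf("untyped: %.4f s\n", timeCalls('addUntyped', $iterations));
```

Run it several times and compare the spread before drawing any conclusion from the numbers.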

PHP7 does not have a JIT compiler.

If at some time in the future PHP does get a JIT compiler, it is not too difficult to imagine optimizations that could be performed with the additional type information.

When it comes to optimizations without a JIT, scalar types are only partly helpful.

Let's take the following code:

<?php
function (int $a, int $b) : int {
    return $a + $b;
}
?>

This is the code generated by Zend for that:

function name: {closure}
L2-4 {closure}() /usr/src/scalar.php - 0x7fd6b30ef100 + 7 ops
 L2    #0     RECV                    1                                         $a                  
 L2    #1     RECV                    2                                         $b                  
 L3    #2     ADD                     $a                   $b                   ~0                  
 L3    #3     VERIFY_RETURN_TYPE      ~0                                                            
 L3    #4     RETURN                  ~0                                                            
 L4    #5     VERIFY_RETURN_TYPE                                                                    
 L4    #6     RETURN                  null

ZEND_RECV is the opcode that performs type verification and coercion for the received parameters. The next opcode is ZEND_ADD:

ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
{
    USE_OPLINE
    zend_free_op free_op1, free_op2;
    zval *op1, *op2, *result;

    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_LONG)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            fast_long_add_function(result, op1, op2);
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, ((double)Z_LVAL_P(op1)) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        }
    } else if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_DOUBLE)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + ((double)Z_LVAL_P(op2)));
            ZEND_VM_NEXT_OPCODE();
        }
    }

    SAVE_OPLINE();
    if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op1) == IS_UNDEF)) {
        op1 = GET_OP1_UNDEF_CV(op1, BP_VAR_R);
    }
    if (OP2_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op2) == IS_UNDEF)) {
        op2 = GET_OP2_UNDEF_CV(op2, BP_VAR_R);
    }
    add_function(EX_VAR(opline->result.var), op1, op2);
    FREE_OP1();
    FREE_OP2();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

Without understanding what any of that code does, you can see that it's rather complex.

So the target would be omitting ZEND_RECV completely, and replacing ZEND_ADD with a hypothetical ZEND_ADD_INT_INT that doesn't need to perform any checking (beyond guarding) or branching, because the types of the params are known.

In order to omit those checks and use a ZEND_ADD_INT_INT, you need to be able to reliably infer the types of $a and $b at compile time. Compile-time inference is sometimes easy, for example when $a and $b are literal integers or constants.
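To make the easy and hard cases concrete, here is a small sketch of two call sites. The names `add`, `OFFSET` and `readSetting` are hypothetical, and the comments describe what could be known from the source alone, not what any particular PHP version actually optimizes.

```php
<?php
// Hypothetical example function, constant and helper, used only for illustration.
function add(int $a, int $b): int
{
    return $a + $b;
}

const OFFSET = 10;

// Returns either an int or a numeric string, so its type is only known at run time.
function readSetting()
{
    return mt_rand(0, 1) === 1 ? 3 : '3';
}

// Easy case: both operands are visible in the source as an integer literal and a constant.
$known = add(5, OFFSET);

// Hard case: the argument type depends on a run-time value, so the receiving
// function still has to verify (and here, in non-strict mode, coerce) it.
$unknown = add(readSetting(), 1);

var_dump($known);   // int(15)
var_dump($unknown); // int(4)
```

The second call gives the same result either way only because, without strict types, the string '3' is coerced to the integer 3 when the parameter is received.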

Literally yesterday, PHP 7.1 got something really similar: there are now type-specific handlers for some high-frequency opcodes like ZEND_ADD. Opcache is able to infer the types of some variables, in some cases even the types of variables within an array, and to change the generated opcodes from the normal ZEND_ADD to a type-specific handler:

ZEND_VM_TYPE_SPEC_HANDLER(ZEND_ADD, (res_info == MAY_BE_LONG && op1_info == MAY_BE_LONG && op2_info == MAY_BE_LONG), ZEND_ADD_LONG_NO_OVERFLOW, CONST|TMPVARCV, CONST|TMPVARCV, SPEC(NO_CONST_CONST,COMMUTATIVE))
{
    USE_OPLINE
    zval *op1, *op2, *result;

    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    result = EX_VAR(opline->result.var);
    ZVAL_LONG(result, Z_LVAL_P(op1) + Z_LVAL_P(op2));
    ZEND_VM_NEXT_OPCODE();
}

Again, without understanding what any of that does, you can tell that this is much simpler to execute.
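The handler's name and its res_info == MAY_BE_LONG condition hint at one more requirement, as far as I read it: knowing that both operands are integers is not enough, because in PHP an integer addition that overflows produces a float. A minimal sketch of that language behaviour (the variable names are only illustrative):

```php
<?php
// Adding two small integers stays an integer.
$small = 1 + 2;

// Adding past PHP_INT_MAX overflows, and PHP yields a float instead.
$big = PHP_INT_MAX + 1;

var_dump($small);         // int(3)
var_dump(is_float($big)); // bool(true)
```

So a handler that writes a plain integer result, like the one above, is only safe when the result is also known to stay an integer.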

These optimizations are very cool. However, the most effective and most interesting optimizations will come when PHP has a JIT.



Answer 2:

Are there any performance benefits from using these features? If yes, how?

Not yet.

But this is the first step toward more efficient opcode generation. According to the Future Scope section of the Scalar Type Hints RFC:

Because scalar type hints guarantee that a passed argument will be of a certain type within a function body (at least initially), this could be used in the Zend Engine for optimisations. For example, if a function takes two float-hinted arguments and does arithmetic with them, there is no need for the arithmetic operators to check the types of their operands.

In previous versions of PHP there was no way to know what kind of parameter could be passed to a function, which makes it really hard to take a JIT compilation approach and achieve superior performance, as Facebook's HHVM does.

@ircmaxell in his blog mentions the possibility of bringing all this to the next level with native compilation, which would be even better than JIT.

From a performance point of view, scalar type hints open the door to implementing those optimizations, but they do not enhance performance in and of themselves.