Since PHP 7 we can use scalar type hints and ask for strict types on a per-file basis. Are there any performance benefits from using these features? If yes, how?
Around the interwebs I've only found conceptual benefits, such as:
- more precise errors
- avoiding issues with unwanted type coercion
- more semantic code, avoiding misunderstandings when using others' code
- better IDE evaluation of code
Today, the use of scalar and strict types in PHP7 does not enhance performance.
PHP7 does not have a JIT compiler.
If at some time in the future PHP does get a JIT compiler, it is not too difficult to imagine optimizations that could be performed with the additional type information.
When it comes to optimizations without a JIT, scalar types are only partly helpful.
Let's take the following code:
<?php
function (int $a, int $b) : int {
    return $a + $b;
}
?>
This is the code generated by Zend for that:
function name: {closure}
L2-4 {closure}() /usr/src/scalar.php - 0x7fd6b30ef100 + 7 ops
L2 #0 RECV 1 $a
L2 #1 RECV 2 $b
L3 #2 ADD $a $b ~0
L3 #3 VERIFY_RETURN_TYPE ~0
L3 #4 RETURN ~0
L4 #5 VERIFY_RETURN_TYPE
L4 #6 RETURN null
ZEND_RECV is the opcode that performs type verification and coercion for the received parameters. The next opcode is ZEND_ADD:
ZEND_VM_HANDLER(1, ZEND_ADD, CONST|TMPVAR|CV, CONST|TMPVAR|CV)
{
    USE_OPLINE
    zend_free_op free_op1, free_op2;
    zval *op1, *op2, *result;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_LONG)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            fast_long_add_function(result, op1, op2);
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, ((double)Z_LVAL_P(op1)) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        }
    } else if (EXPECTED(Z_TYPE_INFO_P(op1) == IS_DOUBLE)) {
        if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_DOUBLE)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + Z_DVAL_P(op2));
            ZEND_VM_NEXT_OPCODE();
        } else if (EXPECTED(Z_TYPE_INFO_P(op2) == IS_LONG)) {
            result = EX_VAR(opline->result.var);
            ZVAL_DOUBLE(result, Z_DVAL_P(op1) + ((double)Z_LVAL_P(op2)));
            ZEND_VM_NEXT_OPCODE();
        }
    }
    SAVE_OPLINE();
    if (OP1_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op1) == IS_UNDEF)) {
        op1 = GET_OP1_UNDEF_CV(op1, BP_VAR_R);
    }
    if (OP2_TYPE == IS_CV && UNEXPECTED(Z_TYPE_INFO_P(op2) == IS_UNDEF)) {
        op2 = GET_OP2_UNDEF_CV(op2, BP_VAR_R);
    }
    add_function(EX_VAR(opline->result.var), op1, op2);
    FREE_OP1();
    FREE_OP2();
    ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}
Without understanding what any of that code does, you can see that it's rather complex.
So the target would be omitting ZEND_RECV completely, and replacing ZEND_ADD with ZEND_ADD_INT_INT, which doesn't need to perform any checking (beyond guarding) or branching, because the types of params are known.
In order to omit those, and have a ZEND_ADD_INT_INT, you need to be able to reliably infer the types of $a and $b at compile time. Compile-time inference is sometimes easy, for example when $a and $b are literal integers or constants.
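To make the contrast concrete, here is a minimal standalone C sketch of a generic add next to a specialized one. It is not Zend code: the `value` struct is a simplified, hypothetical stand-in for a zval, `add_generic` and `add_long_long` are made-up names, and integer overflow and non-numeric operands are deliberately left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a zval: a type tag plus a payload. */
typedef enum { T_LONG, T_DOUBLE } value_type;

typedef struct {
    value_type type;
    union { long l; double d; } as;
} value;

static value make_long(long l)     { value v; v.type = T_LONG;   v.as.l = l; return v; }
static value make_double(double d) { value v; v.type = T_DOUBLE; v.as.d = d; return v; }

/* Generic add: has to inspect both type tags on every call, much like the
 * branching in ZEND_ADD. Overflow handling is omitted in this sketch. */
static value add_generic(const value *a, const value *b)
{
    if (a->type == T_LONG) {
        if (b->type == T_LONG) {
            return make_long(a->as.l + b->as.l);
        }
        return make_double((double)a->as.l + b->as.d);
    }
    if (b->type == T_DOUBLE) {
        return make_double(a->as.d + b->as.d);
    }
    return make_double(a->as.d + (double)b->as.l);
}

/* Specialized add: only valid when both operands are already known to be
 * integers, so the only "check" left is a debug-time guard. */
static value add_long_long(const value *a, const value *b)
{
    assert(a->type == T_LONG && b->type == T_LONG);
    return make_long(a->as.l + b->as.l);
}
```

Both functions give the same answer for two integers; the specialized one simply skips the run-time dispatch, which is only safe if something earlier has proven the operand types.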
Literally yesterday, PHP 7.1 got something really similar: there are now type-specific handlers for some high-frequency opcodes like ZEND_ADD. Opcache is able to infer the type of some variables (in some cases even the types of variables within an array) and change opcodes that would otherwise use the normal ZEND_ADD handler to use a type-specific handler instead:
ZEND_VM_TYPE_SPEC_HANDLER(ZEND_ADD, (res_info == MAY_BE_LONG && op1_info == MAY_BE_LONG && op2_info == MAY_BE_LONG), ZEND_ADD_LONG_NO_OVERFLOW, CONST|TMPVARCV, CONST|TMPVARCV, SPEC(NO_CONST_CONST,COMMUTATIVE))
{
    USE_OPLINE
    zval *op1, *op2, *result;
    op1 = GET_OP1_ZVAL_PTR_UNDEF(BP_VAR_R);
    op2 = GET_OP2_ZVAL_PTR_UNDEF(BP_VAR_R);
    result = EX_VAR(opline->result.var);
    ZVAL_LONG(result, Z_LVAL_P(op1) + Z_LVAL_P(op2));
    ZEND_VM_NEXT_OPCODE();
}
Again, without understanding what any of that does, you can tell that this is much simpler to execute.
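The selection condition in that handler declaration (`res_info`, `op1_info` and `op2_info` all equal to `MAY_BE_LONG`) can be pictured as a small rewrite step that runs before execution. The following standalone C sketch only models that idea; the `MAYBE_*` masks, the `instruction` struct and `specialize` are hypothetical names, not Opcache's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type-inference bitmasks, loosely modelled on MAY_BE_*. */
enum {
    MAYBE_LONG   = 1 << 0,
    MAYBE_DOUBLE = 1 << 1,
    MAYBE_OTHER  = 1 << 2
};

typedef enum { OP_ADD, OP_ADD_LONG_NO_OVERFLOW } opcode;

typedef struct {
    opcode   op;
    unsigned op1_info; /* set of types the first operand may have  */
    unsigned op2_info; /* set of types the second operand may have */
    unsigned res_info; /* set of types the result may have         */
} instruction;

/* Swap the generic opcode for the specialized one only when inference has
 * narrowed both operands AND the result to "long and nothing else". A result
 * that may also be a double means overflow could not be ruled out. */
static void specialize(instruction *ins)
{
    assert(ins != NULL);
    if (ins->op == OP_ADD
        && ins->op1_info == MAYBE_LONG
        && ins->op2_info == MAYBE_LONG
        && ins->res_info == MAYBE_LONG) {
        ins->op = OP_ADD_LONG_NO_OVERFLOW;
    }
}
```

If any of the three masks carries an extra bit, the instruction keeps the generic opcode, so the specialized handler never sees a value it cannot handle.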
These optimizations are very cool; however, the most effective and most interesting optimizations will come when PHP has a JIT.
Are there any performance benefits from using these features? If yes, how?
Not yet.
But this is the first step toward more efficient opcode generation. According to the Future Scope section of the RFC: Scalar Type Hints:
Because scalar type hints guarantee that a passed argument will be of
a certain type within a function body (at least initially), this could
be used in the Zend Engine for optimisations. For example, if a
function takes two float-hinted arguments and does arithmetic with
them, there is no need for the arithmetic operators to check the types
of their operands.
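The guarantee described in the quote can be modelled as one check at the function boundary, after which the body does plain arithmetic. This is a rough standalone C sketch under simplifying assumptions: `value`, `recv_float` and `mul_floats` are hypothetical names, and the weak-mode coercion of numeric strings is not modelled (any non-numeric type is simply rejected).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified value representation. */
typedef enum { T_LONG, T_DOUBLE, T_STRING } value_type;

typedef struct {
    value_type type;
    union { long l; double d; } as;
} value;

/* Entry check for a float-hinted parameter. Returns false where PHP would
 * raise a TypeError. Integers are widened to float, which PHP accepts even
 * with strict types enabled. */
static bool recv_float(value *arg)
{
    if (arg->type == T_DOUBLE) {
        return true;
    }
    if (arg->type == T_LONG) {
        arg->as.d = (double)arg->as.l;
        arg->type = T_DOUBLE;
        return true;
    }
    return false; /* simplification: no numeric-string coercion here */
}

/* Models: function (float $a, float $b) : float { return $a * $b; }
 * Once both arguments passed the entry check, the multiplication needs no
 * further type inspection. */
static bool mul_floats(value a, value b, double *out)
{
    if (!recv_float(&a) || !recv_float(&b)) {
        return false;
    }
    assert(a.type == T_DOUBLE && b.type == T_DOUBLE);
    *out = a.as.d * b.as.d;
    return true;
}
```

The point of the sketch is only where the type check lives: once, at entry, instead of inside every arithmetic operation.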
In previous versions of PHP there was no way to know what kind of parameter could be passed to a function, which makes it really hard for a JIT compilation approach to achieve superior performance, like Facebook's HHVM does.
@ircmaxell in his blog mentions the possibility of bringing all this to the next level with native compilation, which would be even better than JIT.
From a performance point of view, scalar type hints open the door to implementing those optimizations, but they don't enhance performance in and of themselves.