The PHP manual states the following about PCRE's "S" (extra analysis of pattern) modifier at http://php.net/manual/en/reference.pcre.pattern.modifiers.php:
S
When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up the time taken for matching. If this modifier is set, then this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.
So its usage is related to patterns that are used several times and that contain neither anchors (such as ^ or $) nor a fixed starting character sequence, unlike e.g. the pattern '/^abc/'.
But there are no specific details on where to apply this modifier or how it actually works.
Does it apply only to the PHP thread of the currently executing script, so that the "cached" analysis of the pattern is lost once the script finishes? Or does the engine store the analysis in a global cache that is then available to all PHP threads that use PCRE with a pattern marked with this modifier?
Also, from the PCRE introduction: http://php.net/manual/en/intro.pcre.php
Note: This extension maintains a global per-thread cache of compiled regular expressions (up to 4096)
If the "S" modifier applies per thread only, how does it differ from the PCRE cache of compiled regexps? I guess additional information is stored, similar to what MySQL does when it indexes the rows of a table (in the case of PCRE, this additional information would be stored in memory).
Last but not least, has anyone had a real use case for this modifier, and did you notice an improvement from it?
Thanks for the attention.
PHP docs quote a small part of the PCRE docs. Here are some more details (emphasis mine) from PCRE 8.36:
...
Please note that in the later PCRE version (v10.00, also called PCRE2), the lib has undergone a massive refactoring and API redesign. One of the consequences is that studying is always performed in PCRE 10.00 and above. I don't know when PHP will make use of PCRE2, but it will happen sooner or later because PCRE 8.x won't get any new features from now on.
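To see which side of that split a given PHP build falls on, the PCRE_VERSION constant can be inspected. The following is only a minimal sketch: the helper name pcreMajorVersion() is hypothetical, and it assumes the constant starts with a "major.minor" number, as in "8.36 2014-09-26".

```php
<?php
// pcreMajorVersion() is a hypothetical helper, introduced for illustration only.
// PCRE_VERSION is a constant provided by PHP's PCRE extension; it is assumed
// here to look like "8.36 2014-09-26" or "10.42 2022-12-11".
function pcreMajorVersion(string $version = PCRE_VERSION): int
{
    // Keep only the digits before the first dot.
    return (int) explode('.', $version, 2)[0];
}

$major = pcreMajorVersion();

echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;
echo $major >= 10
    ? 'PCRE2: studying is always performed'
    : 'PCRE 8.x or older: the S modifier requests the extra study step', PHP_EOL;
```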
Here's a quote from the PCRE2 release announcement:
As for your second question:
There's no cache in PCRE itself, but PHP maintains a cache of regexes to avoid recompiling the same pattern over and over again, for instance when you use a preg_ function inside a loop.
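The loop case can be sketched as follows. The sample data and variable names are made up for illustration. The pattern is non-anchored and has no single fixed starting character, which is the situation the manual describes for S. Whether the modifier has any measurable effect depends on the PCRE version PHP is built against, so no timing claim is made here.

```php
<?php
// Hypothetical sample input, for illustration only.
$lines = ['error: disk full', 'ok', 'warning: low memory', 'ok'];

// Non-anchored pattern that can start with either "e" or "w",
// with the S modifier appended after the closing delimiter.
$pattern = '/(?:error|warning):\s+(.+)/S';

$messages = [];
foreach ($lines as $line) {
    // The same pattern string is passed on every iteration, so PHP can reuse
    // the compiled regex from its per-thread cache instead of recompiling it.
    if (preg_match($pattern, $line, $m) === 1) {
        $messages[] = $m[1];
    }
}

print_r($messages);
```

Since the cache is keyed on the pattern as passed to the preg_ function, keeping the pattern in a single variable (or constant) is a simple way to make sure every iteration hits the same entry.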