I want to parse a PHP source file into an AST (preferably as a nested array of instructions).
I basically want to convert things like
f($a, $b + 1)
into something like
array( 'function_call',
    array(
        array( 'var', '$a' ),
        array( 'expression',
            array(
                array( 'binary_operation',
                    '+',
                    array( 'var', '$b' ),
                    array( 'int', '1' )
                )
            )
        )
    )
)
Are there any built-in PHP libraries or third-party libraries (preferably in PHP) that would let me do this?
HipHop
You can use Facebook's HHVM to dump the AST.
This worked for HipHop (the old PHP to C++ compiler) - back in the days of 2013!
HHVM
Update 2015: --parse is not supported. You will get an error:
HHVM The 'parse' command line option is not supported.
See https://github.com/facebook/hhvm/blob/c494c3a145008f65d349611eb2d09d0c33f1ab23/hphp/runtime/base/program_functions.cpp#L1111
Feature Request to support the CLI option again: https://github.com/facebook/hhvm/issues/4615
PHP 7
PHP 7 will have an AST; see the related RFC.
There are two extensions that expose the AST generated by PHP 7:
I have implemented a PHP Parser after I figured out that there was no existing parser. It parses the PHP code into a node tree.
No, there is no such feature built in. But you can use the Tokenizer extension to build it yourself.
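As a rough starting point, here is a minimal sketch of what the Tokenizer gives you for the call from the question. It only lists tokens; turning them into a nested array would still need a parser written on top. describeTokens() and the 'CHAR' label are hypothetical names made up for this example; only token_get_all() and token_name() are PHP built-ins.

```php
<?php
// Sketch: list the tokens of a PHP snippet using the built-in Tokenizer extension.
// describeTokens() is a hypothetical helper, not part of PHP.
function describeTokens(string $source): array
{
    $result = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens have the shape [id, text, line number].
            if ($token[0] === T_WHITESPACE) {
                continue; // whitespace does not matter for building a tree
            }
            $result[] = [token_name($token[0]), $token[1]];
        } else {
            // Single-character tokens such as "(" or "+" come back as plain strings.
            // 'CHAR' is an arbitrary label chosen for this sketch.
            $result[] = ['CHAR', $token];
        }
    }
    return $result;
}

$tokens = describeTokens('<?php f($a, $b + 1);');
foreach ($tokens as [$name, $text]) {
    echo $name, ' ', $text, PHP_EOL;
}
```

The token stream is flat, so operator precedence and nesting (for example that `$b + 1` is a single argument) have to be reconstructed by your own code, which is where most of the work lies.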
Pfff is an OCaml library for parsing and manipulating PHP code. See the manual of Pfff for more details.
Well, you can look at the answers to "Parsing and Printing PHP Code" and "Generating PHP code (from Parser Tokens)". Basically, PEAR's PHP_Beautifier package at http://pear.php.net/package/PHP_Beautifier can be extended to do what you want, but it sounds like it requires some heavy lifting.
And if you're not constrained to PHP, then http://www.eclipse.org/pdt/articles/ast/PHP_AST.html walks you through using the Eclipse PHP module's AST parser.