string to abstract syntax tree

Posted 2020-02-26 00:24

Question:

I would like to convert a string containing a valid Erlang expression to its abstract syntax tree representation, without any success so far.

Below is an example of what I would like to do. After compiling, calling z:z(). generates the module zed; calling zed:zed(). then returns the result of applying lists:reverse to the given list.

-module(z).
-export([z/0]).

z() ->
  ModuleAST = erl_syntax:attribute(erl_syntax:atom(module),
                                   [erl_syntax:atom("zed")]),

  ExportAST = erl_syntax:attribute(erl_syntax:atom(export),
                                   [erl_syntax:list(
                                    [erl_syntax:arity_qualifier(
                                     erl_syntax:atom("zed"),
                                     erl_syntax:integer(0))])]),

  %ListAST = ?(String),  % This is where I would put my AST
  ListAST = erl_syntax:list([erl_syntax:integer(1), erl_syntax:integer(2)]),

  FunctionAST = erl_syntax:function(erl_syntax:atom("zed"),
                                    [erl_syntax:clause(
                                     [], none,
                                     [erl_syntax:application(
                                        erl_syntax:atom(lists),
                                        erl_syntax:atom(reverse),
                                        [ListAST]
                                     )])]),

  Forms = [erl_syntax:revert(AST) || AST <- [ModuleAST, ExportAST, FunctionAST]],

  case compile:forms(Forms) of
    {ok,ModuleName,Binary}           -> code:load_binary(ModuleName, "z", Binary);
    {ok,ModuleName,Binary,_Warnings} -> code:load_binary(ModuleName, "z", Binary)
  end.

String could be "[1,2,3].", or "begin A=4, B=2+3, [A,B] end.", or anything similar.

(Note that this is just an example of what I would like to do, so evaluating String is not an option for me.)


EDIT:

Specifying ListAST as below generates a huge dict-digraph-error-monster, and says "internal error in lint_module".

String = "[1,2,3].",
{ok, Ts, _} = erl_scan:string(String),
{ok, ListAST} = erl_parse:parse_exprs(Ts),

EDIT2:

This solution works for simple terms:

{ok, Ts, _} = erl_scan:string(String),
{ok, Term} = erl_parse:parse_term(Ts),
ListAST = erl_syntax:abstract(Term),

Answer 1:

In your EDIT example:

String = "[1,2,3].",
{ok, Ts, _} = erl_scan:string(String),
{ok, ListAST} = erl_parse:parse_exprs(Ts),

the ListAST is actually a list of ASTs, because parse_exprs, as the name indicates, parses a sequence of expressions (separated by commas, with a single period terminating the whole sequence). Since your string contained a single expression, you got a list of one element. All you need to do is match that out:

{ok, [ListAST]} = erl_parse:parse_exprs(Ts),

so it has nothing to do with erl_syntax (which accepts all erl_parse trees); it's just that you had an extra list wrapper around the ListAST, which caused the compiler to puke.
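To tie this back to the question's module, here is a minimal sketch of the whole round trip: scan and parse the string, match out the single expression, splice it into the erl_syntax tree, then compile and load. The module name str_ast and the helpers parse_expr/1 and build/1 are made up for illustration, and the sketch assumes the string holds exactly one expression and that compile:forms/1 is called with default options.

```erlang
-module(str_ast).
-export([parse_expr/1, build/1]).

%% Parse a string holding exactly one expression (ending in a period)
%% into its erl_parse abstract form. Crashes with badmatch otherwise.
parse_expr(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    {ok, [Expr]} = erl_parse:parse_exprs(Tokens),
    Expr.

%% Generate, compile and load module zed, whose zed/0 applies
%% lists:reverse/1 to the expression parsed from String.
build(String) ->
    ListAST = parse_expr(String),
    ModuleAST = erl_syntax:attribute(erl_syntax:atom(module),
                                     [erl_syntax:atom(zed)]),
    ExportAST = erl_syntax:attribute(
                  erl_syntax:atom(export),
                  [erl_syntax:list(
                     [erl_syntax:arity_qualifier(erl_syntax:atom(zed),
                                                 erl_syntax:integer(0))])]),
    CallAST = erl_syntax:application(erl_syntax:atom(lists),
                                     erl_syntax:atom(reverse),
                                     [ListAST]),
    FunctionAST = erl_syntax:function(erl_syntax:atom(zed),
                                      [erl_syntax:clause([], none, [CallAST])]),
    %% revert/1 turns the erl_syntax nodes back into erl_parse forms;
    %% the spliced-in erl_parse expression is kept as it is.
    Forms = [erl_syntax:revert(F) || F <- [ModuleAST, ExportAST, FunctionAST]],
    {ok, Mod, Bin} = compile:forms(Forms),
    code:load_binary(Mod, "zed", Bin).
```

With this loaded, str_ast:build("begin A=4, B=2+3, [A,B] end.") should load zed, and zed:zed() then returns [5,4].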



Answer 2:

Some comments off the top of my head.

I have not really used the erl_syntax libraries but I do think they make it difficult to read and "see" what you are trying to build. I would probably import the functions or define my own API to make it shorter and more legible. But then I generally tend to prefer shorter function and variable names.

The AST created by erl_syntax and the "standard" one created by erl_parse and used in the compiler are different and cannot be mixed. So you have to choose one of them and stick with it.

The example in your second EDIT will work for terms but not in the more general case:

{ok, Ts, _} = erl_scan:string(String),
{ok, Term} = erl_parse:parse_term(Ts),
ListAST = erl_syntax:abstract(Term),

This is because erl_parse:parse_term/1 returns the actual term represented by the tokens, while the other erl_parse functions, parse_form and parse_exprs, return ASTs. Putting those into erl_syntax:abstract will do funny things.
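A small sketch to make the difference visible. The module name term_vs_expr and both helper functions are hypothetical and only wrap the two erl_parse calls discussed above.

```erlang
-module(term_vs_expr).
-export([as_term/1, as_exprs/1]).

%% parse_term/1 hands back the term itself and only accepts literal terms.
as_term(String) ->
    {ok, Tokens, _} = erl_scan:string(String),
    erl_parse:parse_term(Tokens).

%% parse_exprs/1 hands back a list of abstract forms, one per expression.
as_exprs(String) ->
    {ok, Tokens, _} = erl_scan:string(String),
    erl_parse:parse_exprs(Tokens).
```

For "[1,2,3]." as_term/1 gives {ok,[1,2,3]}, whereas as_exprs/1 gives {ok,[Cons]} with Cons a nested cons tuple. For "begin A=4, B=2+3, [A,B] end." as_term/1 gives an error tuple, because a block is not a term, while as_exprs/1 still returns the AST of the block.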

Depending on what you are trying to do, it might be easier to write out an Erlang file and compile it rather than working directly with the abstract forms. This goes against my ingrained feelings, but generating the Erlang ASTs is not trivial. What type of code do you intend to produce?
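A rough sketch of that source-file route, under some assumptions: the module name gen_src and build/2 are made up, the expression string is passed without its trailing period, and Dir is a directory you can write to.

```erlang
-module(gen_src).
-export([build/2]).

%% Write zed.erl into Dir, compile it to a binary and load it.
%% ExprString is an expression WITHOUT the trailing period, e.g. "[1,2,3]".
build(ExprString, Dir) ->
    Source = ["-module(zed).\n",
              "-export([zed/0]).\n\n",
              "zed() -> lists:reverse(", ExprString, ").\n"],
    File = filename:join(Dir, "zed.erl"),
    ok = file:write_file(File, Source),
    %% The binary option returns the object code instead of writing a .beam file.
    {ok, zed, Bin} = compile:file(File, [binary]),
    code:load_binary(zed, File, Bin).
```

Here the compiler's own scanner and parser deal with the expression, so no AST handling is needed, at the cost of a temporary file and of syntax errors only showing up at compile time.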

<shameless_plug>

If you are not scared of lists, you might try using LFE (Lisp Flavoured Erlang) to generate code. As with all Lisps, there is no special abstract form; it's all homoiconic and much easier to work with.

</shameless_plug>



Answer 3:

Zoltan

This is how we get the AST:

11> String = "fun() -> io:format(\"blah~n\") end.".
"fun() -> io:format(\"blah~n\") end."
12> {ok, Tokens, _} = erl_scan:string(String).     
{ok,[{'fun',1},
     {'(',1},
     {')',1},
     {'->',1},
     {atom,1,io},
     {':',1},
     {atom,1,format},
     {'(',1},
     {string,1,"blah~n"},
     {')',1},
     {'end',1},
     {dot,1}],
    1}
13> {ok, AbsForm} = erl_parse:parse_exprs(Tokens). 
{ok,[{'fun',1,
            {clauses,[{clause,1,[],[],
                              [{call,1,
                                     {remote,1,{atom,1,io},{atom,1,format}},
                                     [{string,1,"blah~n"}]}]}]}}]}
14>