
Difference in capturing and non-capturing regex sc

Posted 2020-04-06 16:16

Question:

Although the docs state that calling a token/rule/regex as <.foo> instead of <foo> makes them non-capturing, it seems there is a difference in scope, but I'm not sure if it's intended.

Here is a simplified test. In a module file:

unit module Foo;
my token y           {     y  }
my token a is export { x  <y> }
my token b is export { x <.y> }

Inside of another script file:

grammar A {
  use Foo;
  token TOP { <a> }
}

grammar B {
  use Foo;
  token TOP { <b> }
}

If we call A.parse("xy"), everything runs as expected. However, calling B.parse("xy") results in the error No such method 'y' for invocant of type 'B'. Is this expected behavior or a potential bug?

Answer 1:

The intention per S05

The intention according to the relevant speculation/design doc includes:

<foo ...>

This form always gives preference to a lexically scoped regex declaration, dispatching directly to it as if it were a function. If there is no such lexical regex (or lexical method) in scope, the call is dispatched to the current grammar, assuming there is one.

...

A leading . explicitly calls a method as a subrule; the fact that the initial character is not alphanumeric also causes the named assertion to not capture what it matches.

...

A call to <foo> will fail if there is neither any lexically scoped routine of that name it can call, nor any method of that name that can be reached via method dispatch. (The decision of which dispatcher to use is made at compile time, not at run time; the method call is not a fallback mechanism.)

Examples of forms

  • <bar> is as explained above. It preferentially resolves to an early bound lexical (my/our) routine/rule named &bar. Otherwise it resolves to a late bound attempt to call a has-scoped method/rule (one declared with has, explicitly or implicitly) named bar. If it succeeds it stores the match under a capture named bar.

  • <.bar> calls a has-scoped method/rule named bar if it finds one. It does not capture.

  • <bar=.bar> calls a has-scoped method/rule named bar if it finds one. If it succeeds it stores the match under a capture named bar. In other words, it's the same as <bar> except it only attempts to call a has-scoped method named bar; it doesn't first attempt to resolve to a lexical &bar.

  • <&bar> and <.&bar> mean the same thing. They call a lexical routine named &bar and do not capture. To do the same thing, but capture, use <bar=&bar> or <bar=.&bar>.

(If you read the speculation/design doc linked above and try things, you'll find most of the design details that doc mentions have already been implemented in Rakudo even if they're not officially supported/roasted/documented.)
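As a rough illustration, the forms listed above can be tried against the question's tokens in a single file. This is only a sketch: the grammar names Capturing, NonCapturing, Aliased and DotForm are made up for the example, and the module from the question is replaced by a file-scoped my token y. Assuming Rakudo follows the rules quoted above, the first and third grammars should capture under y, the second should match without capturing, and the last should throw an error like the one in the question.

```raku
# Single-file sketch. The grammar names are hypothetical; token y mirrors
# the lexically scoped token from the question's module.
my token y { y }

grammar Capturing    { token TOP { x <y>    } }  # lexical &y, captured as "y"
grammar NonCapturing { token TOP { x <&y>   } }  # lexical &y, no capture
grammar Aliased      { token TOP { x <y=&y> } }  # lexical &y, captured as "y"
grammar DotForm      { token TOP { x <.y>   } }  # method lookup only, no lexical lookup

my $capturing     = Capturing.parse('xy');
my $non-capturing = NonCapturing.parse('xy');
my $aliased       = Aliased.parse('xy');

say $capturing<y>.defined;       # is there a capture named y?
say $non-capturing<y>.defined;
say $aliased<y>.defined;

# DotForm has no method named y, so the call is expected to throw
# rather than fall back to the lexical token.
try DotForm.parse('xy');
my $error = $!;
say $error.message if $error.defined;
```

If this behaves as described, replacing <.y> with <&y> in the question's token b would be one way to get a non-capturing call to the lexical token.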

Scope examples

First the common case:

grammar c {
  has rule TOP { <bar> }
  has rule bar { . { say 'has rule' } }
}
say c.parse: 'a';

displays:

has rule
「a」
 bar => 「a」

(The has declarators are optional and it's idiomatic to exclude them.)

Now introducing a rule lexically scoped to the grammar block:

grammar c {
  my  rule bar { . { say 'inner my rule' } }
  has rule TOP { <bar> }
  has rule bar { . { say 'has rule' } }
}
say c.parse: 'a';

displays:

inner my rule
「a」
 bar => 「a」

Even a lexical rule declared outside the grammar block has precedence over has rules:

my rule bar { . { say 'outer my rule' } }
grammar c {
  has rule TOP { <bar> }
  has rule bar { . { say 'has rule' } }
}
say c.parse: 'a';

displays:

outer my rule
「a」
 bar => 「a」
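
For contrast, here is a sketch of the dot form with the same outer lexical rule in scope. The grammar name DotCall and the @trace array are made up for the example; the array stands in for the say calls so the result can be inspected afterwards. Going by the description of <.bar> above, the call should skip the lexical rule, reach the grammar's own bar, and leave no capture.

```raku
# @trace records which rule actually ran (hypothetical helper for this sketch).
my @trace;
my rule bar { . { @trace.push('outer my rule') } }

grammar DotCall {                 # hypothetical grammar name
    rule TOP { <.bar> }           # dot form: method dispatch only, non-capturing
    rule bar { . { @trace.push('has rule') } }
}

my $match = DotCall.parse: 'a';
say @trace;   # which rule was called
say $match;   # the overall match; no named capture is expected under it
```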