Regex capturing group not working as intended [dup

Posted 2020-03-26 04:03

I've been struggling for two days to get this to work, but I just can't (I'm terrible at regular expressions :S).

${test}[arg]

From this text, I need to retrieve two different things: test and arg. To do it, I have created this regex:

(\$\{(\b[a-zA-Z0-9.]+\b)\})(\[(.+)\])?

With that example, it works. However, if I try this other text: ${test}[arg1] - ${test2}[arg2], I just get one match with 2 groups: test and arg1] - ${test2}[arg2, instead of getting 2 different matches: one with groups test and arg1, and the other with groups test2 and arg2.

I hope you can help me out.

Thanks in advance.

Tags: java regex
1 answer
家丑人穷心不美
Reply #2 · 2020-03-26 04:42

This is a classic example of why the .+ combination can be evil. Use a negated character set instead:

(\$\{(\b[a-zA-Z0-9]+\b)\})(\[([^]]+)\])
                              ^^^
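
As a minimal sketch of how this expression could be used from Java, the snippet below loops over `Matcher.find()` and collects one name/argument pair per match. The class name `PlaceholderDemo` and the helper `extract` are hypothetical names chosen for illustration. The `]` inside the character class is written as `\]` only to make the intent explicit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderDemo {
    // The expression from the answer; group 2 is the name, group 4 the argument.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("(\\$\\{(\\b[a-zA-Z0-9]+\\b)\\})(\\[([^\\]]+)\\])");

    // Hypothetical helper: returns one {name, argument} pair per match.
    public static List<String[]> extract(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(input);
        while (m.find()) { // each find() call resumes after the previous match
            pairs.add(new String[] { m.group(2), m.group(4) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        for (String[] pair : extract("${test}[arg1] - ${test2}[arg2]")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
        // prints:
        // test -> arg1
        // test2 -> arg2
    }
}
```

Note that this expression, unlike the one in the question, makes the bracketed part mandatory (there is no trailing `?`) and does not allow a `.` in the name. If either is needed, it can be added back without affecting the negated class.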



Compare the behavior of the two expressions:

  • Match anything greedily. In the first expression, .+ is greedy: it consumes everything up to the end of the string, and the regex then has to backtrack one character at a time until the closing ] can match. The first ] it reaches while backtracking is the last one in the string, so it stops there, and the bracketed part ends up matching [arg1] - ${test2}[arg2].

  • Match anything but a ]. In the second expression, the negated class matches only characters that are not a ], so at every step it checks whether the next character is a ]. As soon as it finds one, it stops, and the bracketed part matches just [arg1].
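
The difference between the two expressions can be checked with a short Java sketch. Both patterns are copied from the thread. The class name `GreedyVsNegated` and the helpers `firstArg` and `countMatches` are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsNegated {
    // Expression from the question: (.+) is greedy and can run past a ].
    public static final Pattern GREEDY =
            Pattern.compile("(\\$\\{(\\b[a-zA-Z0-9.]+\\b)\\})(\\[(.+)\\])?");
    // Expression from the answer: [^\]]+ can never cross a closing bracket.
    public static final Pattern NEGATED =
            Pattern.compile("(\\$\\{(\\b[a-zA-Z0-9]+\\b)\\})(\\[([^\\]]+)\\])");

    // Returns group 4 (the text between the brackets) of the first match, or null.
    public static String firstArg(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(4) : null;
    }

    // Counts how many separate matches the pattern finds in the input.
    public static int countMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "${test}[arg1] - ${test2}[arg2]";
        System.out.println(countMatches(GREEDY, input));  // 1
        System.out.println(firstArg(GREEDY, input));      // arg1] - ${test2}[arg2
        System.out.println(countMatches(NEGATED, input)); // 2
        System.out.println(firstArg(NEGATED, input));     // arg1
    }
}
```

With the greedy version, the single match swallows the whole input, so there is nothing left for a second `find()` call to match.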
