Is this a bug in .NET's Regex.Split?

Posted 2020-02-07 03:24

I have two regular expressions, for use with Regex.Split:

(?<=\G[^,],[^,],)

and

(?<=\G([^,],){2})

When splitting the string "A,B,C,D,E,F,G,", the first one results in:

A,B, 
C,D, 
E,F, 
G, 

and the second results in:

A,B, 
A, 
C,D, 
C, 
E,F, 
E, 
G, 

What is going on here? I thought that (X){2} was always equivalent to XX, but I'm not sure anymore. In my actual problem, the pattern is quite a bit more complex and I need to repeat it sixty-nine times, so writing it out by hand is less than ideal.

2 answers
时光不老,我们不散
Reply #2 · 2020-02-07 03:52

From the docs:

If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array.

You have a capture group in your second expression. Try non-capturing parens:

(?<=\G(?:[^,],){2})
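
To see the difference side by side, here is a minimal C# sketch that runs both patterns over the string from the question. The class name SplitDemo and the helper SplitSample are made up for illustration; only Regex.Split comes from .NET. Going by the outputs shown in the question, the capturing version should return seven elements (the pieces plus the interleaved captures) and the non-capturing version the expected four.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original post.
public static class SplitDemo
{
    // Splits the sample string from the question with the given pattern.
    public static string[] SplitSample(string pattern)
    {
        const string input = "A,B,C,D,E,F,G,";
        return Regex.Split(input, pattern);
    }

    public static void Main()
    {
        // Capturing group: captured text is interleaved with the pieces.
        Console.WriteLine(string.Join(" | ", SplitSample(@"(?<=\G([^,],){2})")));

        // Non-capturing group: only the pieces between matches come back.
        Console.WriteLine(string.Join(" | ", SplitSample(@"(?<=\G(?:[^,],){2})")));
    }
}
```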
爱情/是我丢掉的垃圾
Reply #3 · 2020-02-07 03:54

From the documentation for Regex.Split:

If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array.

The internal parentheses are capturing. Try using (?:[^,],) instead.
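
If the real pattern is long and rewriting every group as (?:...) is tedious, another option that should work is RegexOptions.ExplicitCapture, which makes unnamed parentheses non-capturing so the pattern can stay as written. A rough sketch, assuming the same input as the question; the class and method names (ExplicitCaptureDemo, SplitPairs) are invented for the example:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original post.
public static class ExplicitCaptureDemo
{
    public static string[] SplitPairs(string input)
    {
        // With ExplicitCapture, plain (...) groups do not capture,
        // so Regex.Split has no captured text to add to the result.
        return Regex.Split(input, @"(?<=\G([^,],){2})", RegexOptions.ExplicitCapture);
    }

    public static void Main()
    {
        foreach (string piece in SplitPairs("A,B,C,D,E,F,G,"))
        {
            Console.WriteLine(piece);
        }
    }
}
```

Named groups such as (?<name>...) still capture under this option, so it only helps if the pattern does not rely on them.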
