Question:
I am trying to remove the white space that exists in a String input. My ultimate goal is to create an infix evaluator, but I am having issues with parsing the input expression.
It seems to me that the easy solution to this is using a regular expression function, namely Regex.Replace(...).
Here's what I have so far:
infixExp = Regex.Replace(infixExp, "\\s+", string.Empty);
string[] substrings = Regex.Split(infixExp, "(\\()|(\\))|(-)|(\\+)|(\\*)|(/)");
Assuming the user inputs the infix expression (2 + 3) * 4, I would expect that this would break the string into the array {(, 2, +, 3, ), *, 4}; however, after debugging, I am getting the following output:
infixExp = "(2+3)*7"
substrings = {"", (, 2, +, 3, ), "", *, 7}
It appears that the white space is being properly removed from the infix expression, but the resulting string is not being split the way I expect.
Could anyone give me insight as to why? Likewise, if you have any constructive criticism or suggestions, let me know!
Answer 1:
If a match is at one end of the string, you will get an empty string next to it. Likewise, if there are two adjacent matches, the string will be split on both of them, so you end up with an empty string in between. Citing MSDN:
If multiple matches are adjacent to one another, an empty string is inserted into the array. For example, splitting a string on a single hyphen causes the returned array to include an empty string in the position where two adjacent hyphens are found [...].
and
If a match is found at the beginning or the end of the input string, an empty string is included at the beginning or the end of the returned array.
Just filter them out in a second step.
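To make the quoted behavior concrete, here is a minimal sketch using the hyphen example from the documentation. It is written as C# 9+ top-level statements; the sample strings and variable names are illustrative assumptions, and Array.FindAll is just one possible way to do the filtering step.

```csharp
using System;
using System.Text.RegularExpressions;

// Two adjacent hyphens: an empty string appears between them.
string[] adjacent = Regex.Split("a--b", "-");
Console.WriteLine(string.Join("|", adjacent));   // a||b

// A match at the very start: an empty string leads the array.
string[] leading = Regex.Split("-a", "-");
Console.WriteLine(string.Join("|", leading));    // |a

// The second step: keep only the non-empty pieces.
string[] filtered = Array.FindAll(Regex.Split("-a--b", "-"), s => s.Length > 0);
Console.WriteLine(string.Join("|", filtered));   // a|b
```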
Also, please make your life easier and use verbatim strings:
infixExp = Regex.Replace(infixExp, @"\s+", string.Empty);
string[] substrings = Regex.Split(infixExp, @"(\(|\)|-|\+|\*|/)");
The second expression could be simplified even further:
@"([()+*/-])"
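As a quick check that the simplification does not change the result, the sketch below runs the three patterns (the question's original, the single-group alternation, and the character class) against the question's whitespace-free input. It is written as C# 9+ top-level statements, and the variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "(2+3)*7";
string[] patterns =
{
    @"(\()|(\))|(-)|(\+)|(\*)|(/)",  // one capture group per operator (the question)
    @"(\(|\)|-|\+|\*|/)",            // one capture group around the alternation
    @"([()+*/-])",                   // character class; '-' comes last so it is literal
};

foreach (string pattern in patterns)
{
    string[] parts = Regex.Split(input, pattern);
    // Quote each element so the empty strings are visible.
    Console.WriteLine(string.Join(",", parts.Select(p => "\"" + p + "\"")));
}
```

Each pattern prints the same line, "","(","2","+","3",")","","*","7", so the empty strings still have to be filtered out whichever form is used.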
Answer 2:
Please, ditch Regex. There are better tools to use. You can use String.Trim(), .TrimEnd(), and .TrimStart().
string inputString = " asdf ";
string output = inputString.Trim();
For whitespace within the string, use String.Replace.
string output2 = output.Replace(" ", "");
You will have to expand this to other whitespace characters.
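One way to cover the other whitespace characters without chaining a Replace call per character is to filter on char.IsWhiteSpace. This is only a sketch, written as C# 9+ top-level statements; RemoveWhitespace is a hypothetical helper name, not something from the answer.

```csharp
using System;
using System.Linq;

// Hypothetical helper: drop every character that char.IsWhiteSpace reports
// as whitespace (spaces, tabs, newlines, and other Unicode whitespace).
static string RemoveWhitespace(string s) =>
    new string(s.Where(c => !char.IsWhiteSpace(c)).ToArray());

Console.WriteLine(RemoveWhitespace(" (2 +\t3)\n* 4 "));  // (2+3)*4
```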
Answer 3:
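// Split on every token (a run of digits or any single non-digit) and drop the empty strings between matches.
// Needs System.Linq and System.Text.RegularExpressions; \D also matches whitespace, so strip it from input first.
var result = Regex.Split(input, "(\\d+|\\D)")
    .Where(x => x != "").ToArray();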
Answer 4:
m.buettner's answer is correct. Also consider that you can do this in one step. From MSDN:
If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array.
Therefore, if you include the whitespace in the split pattern but outside the capturing parentheses, you can split on it as well but not include it in the result array:
var substrings = Regex.Split("(2 + 3) * 7", @"([()+*/-])|\s+");
The result:
substrings = {"", ( , 2, "", +, "", 3, ), "", "", *, "", 7}
And your final result would be:
substrings.Where(s => s != String.Empty)
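Put together, the one-step version might look like the following sketch (C# 9+ top-level statements; the variable name tokens is illustrative).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Operators and parentheses are captured (and therefore kept); whitespace is
// matched outside the group, so it splits the input but is not returned.
string[] tokens = Regex.Split("(2 + 3) * 7", @"([()+*/-])|\s+")
    .Where(s => s != string.Empty)
    .ToArray();

Console.WriteLine(string.Join(" ", tokens));  // ( 2 + 3 ) * 7
```

One difference from removing the whitespace first: here "1 2" is split into the two tokens 1 and 2, whereas stripping whitespace beforehand would merge it into 12.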
Answer 5:
Why not just remove the white spaces and then split the string with normal string handling functions? Like this...
string x = "(2 + 3) * 4";
x = x.Replace(" ", "").Replace("\t",""); //etc...
char[] y = x.ToCharArray();
Why bother making this more complicated than it needs to be?
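One caveat with ToCharArray is that it yields one element per character, so a multi-digit number such as 12 ends up as '1' and '2'. If that matters, a plain loop can group the digits. The following is a rough sketch in C# 9+ top-level statements; Tokenize is a hypothetical helper, it does no validation, and it treats any non-digit, non-whitespace character as a single-character token.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: walk the characters once, skipping whitespace and
// grouping consecutive digits into a single token.
static List<string> Tokenize(string expression)
{
    var tokens = new List<string>();
    var number = new StringBuilder();

    foreach (char c in expression)
    {
        if (char.IsDigit(c))
        {
            number.Append(c);          // still inside a number
            continue;
        }
        if (number.Length > 0)
        {
            tokens.Add(number.ToString());  // a number just ended
            number.Clear();
        }
        if (!char.IsWhiteSpace(c))
        {
            tokens.Add(c.ToString());  // operator or parenthesis
        }
    }
    if (number.Length > 0)
    {
        tokens.Add(number.ToString()); // number at the end of the input
    }
    return tokens;
}

Console.WriteLine(string.Join(" ", Tokenize("(12 + 3) * 4")));  // ( 12 + 3 ) * 4
```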
Answer 6:
A non-regex solution would probably be String.Replace - you could simply replace " ", "\t", and other whitespace with the empty string "".
Answer 7:
I found the solution I was looking for thanks to all of your replies.
// Ignore all whitespace within the expression.
infixExp = Regex.Replace(infixExp, @"\s+", String.Empty);
// Separate the expression on the tokens (, ), +, -, *, and /,
// and ignore any of the empty Strings that Regex.Split adds
// next to adjacent or leading matches.
string[] substrings = Regex.Split(infixExp, @"([()+*/-])");
substrings = substrings.Where(s => s != String.Empty).ToArray();
This separates the String into parts based on the regular mathematical operators (+, -, *, /) and parentheses, and then eliminates any remaining empty Strings within substrings.