C# Command-Line Parsing of Quoted Paths and Avoidi

Posted 2019-01-18 18:20

Question:

How is it possible to parse command-line arguments that are to be interpreted as paths? args[] contains strings that are automatically joined if they are quoted, e.g.:

example.exe one two "three four"

args[0] = one
args[1] = two
args[2] = three four

However, args[] will not properly parse "C:\Example\" as an argument. Rather, it will supply the argument as C:\Example" (with the extra quote included). This is because the backslash in the path is treated as an escape character, so the closing quotation mark that the user supplied on the command line becomes part of the argument.

e.g.:

example.exe one "C:\InputFolder" "C:\OutputFolder\"

args[0] = one
args[1] = C:\InputFolder
args[2] = C:\OutputFolder"

An easy kludge might be:

_path = args[i].Replace("\"", @"\");

However, I'm sure there is a best practice for this. How might one correctly parse a command line that includes paths, preventing the args[] array from being improperly populated with strings that have been parsed for escape characters?

NOTE: I would not like to include an entire command-line parsing library in my project! I need only to handle quoted paths and wish to do so in a "manual" fashion. Please do not recommend NConsoler, Mono, or any other large "kitchen sink" command-line parsing library.

ALSO NOTE: As far as I can tell, this is not a duplicate question. While other questions focus on generic command-line parsing, this question is specific to the problem that paths introduce when parts of them are interpreted as escape sequences.

Answer 1:

Not an answer, but here's some background and explanation from Jeffrey Tan, Microsoft Online Community Support (12/7/2006):

Note: this is not a code defect but by design, since backslashes are normally used to escape certain special characters. Also, this algorithm is the same as the Win32 command-line argument parsing function CommandLineToArgvW. See the Remarks section below: http://msdn2.microsoft.com/en-us/library/bb776391.aspx

He also refers to the .NET Framework method Environment.GetCommandLineArgs for further explanation of the backslash-handling behavior.
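To make the rule concrete, here is a rough sketch of the backslash/quote handling as it is documented for CommandLineToArgvW: backslashes are literal unless they precede a double quote. An even run of backslashes before a quote yields half as many backslashes and the quote acts as a delimiter. An odd run yields half as many (rounded down) plus a literal quote. The class and method names (BackslashRuleSketch, Split) are made up for illustration, and the sketch deliberately ignores the special handling of the program name and of doubled quotes, so treat it as an approximation rather than the real implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class BackslashRuleSketch
{
    // Hypothetical helper: splits the argument portion of a command line
    // using the documented backslash/quote rules (simplified).
    public static List<string> Split(string commandLine)
    {
        var args = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool hasToken = false;
        int i = 0;

        while (i < commandLine.Length)
        {
            char c = commandLine[i];
            if (c == '\\')
            {
                // Count the run of consecutive backslashes.
                int start = i;
                while (i < commandLine.Length && commandLine[i] == '\\') i++;
                int count = i - start;

                if (i < commandLine.Length && commandLine[i] == '"')
                {
                    // Before a quote, each pair of backslashes yields one backslash.
                    current.Append('\\', count / 2);
                    if (count % 2 == 1)
                        current.Append('"');   // odd run: the quote is a literal character
                    else
                        inQuotes = !inQuotes;  // even run: the quote is a delimiter
                    i++; // consume the quote
                }
                else
                {
                    // Not followed by a quote: all backslashes are literal.
                    current.Append('\\', count);
                }
                hasToken = true;
            }
            else if (c == '"')
            {
                inQuotes = !inQuotes;
                hasToken = true;
                i++;
            }
            else if (!inQuotes && (c == ' ' || c == '\t'))
            {
                // Unquoted whitespace ends the current argument.
                if (hasToken)
                {
                    args.Add(current.ToString());
                    current.Clear();
                    hasToken = false;
                }
                i++;
            }
            else
            {
                current.Append(c);
                hasToken = true;
                i++;
            }
        }

        if (hasToken) args.Add(current.ToString());
        return args;
    }
}
```

Under these rules, "C:\OutputFolder\" ends up as C:\OutputFolder" while "C:\OutputFolder\\" ends up as C:\OutputFolder\, which is why doubling the trailing backslash on the command line avoids the problem.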

Personally I think this is a drag, and I'm surprised I haven't been bit by it before. Or maybe I have and don't know it? Blind replacement of quotes with slashes doesn't strike me as a solution, though. I'm voting the question up, because it was an eye opener.



Answer 2:

I like your idea:

_path = args[i].Replace("\"", @"\");

It is clean, and will have no effect unless the problem exists.
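A slightly narrower variant, as a sketch only: since a double quote is not a valid character in a Windows path, one could repair just a trailing quote instead of replacing every quote in the string. The helper name RestoreTrailingBackslash is hypothetical. This assumes the affected path is the last argument; if it appears earlier on the command line, the following arguments will already have been merged into it and this fix will not separate them.

```csharp
using System;

static class PathArgFix
{
    // Hypothetical helper: if the argument ends with a stray double quote,
    // assume it came from an escaped closing quote and put the backslash back.
    public static string RestoreTrailingBackslash(string arg)
    {
        if (arg.EndsWith("\"", StringComparison.Ordinal))
            return arg.Substring(0, arg.Length - 1) + "\\";

        // No trailing quote: leave the argument untouched.
        return arg;
    }
}
```

Usage would look like _path = PathArgFix.RestoreTrailingBackslash(args[i]);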



Answer 3:

I had the same frustration. My solution was to use regular expressions. My expected input is a list of paths, some of which may be quoted. The above kludge doesn't work unless all the last arguments are quoted.

using System;
using System.Linq;
using System.Text.RegularExpressions;

// Capture quoted strings or non-quoted strings, each followed by optional whitespace
string exp = @"^(?:""([^""]*)""\s*|([^""\s]+)\s*)+";
Match m = Regex.Match(Environment.CommandLine, exp);

// The pattern always defines three groups, so Groups.Count is always 3:
// Groups[0] = entire match
// Groups[1] = captures from the left (quoted) capturing group
// Groups[2] = captures from the right (unquoted) capturing group
// Test Success instead to detect a command line that did not match
if (!m.Success)
    throw new ArgumentException("The command line could not be parsed");

// Sort the captures by their original position
var captures = m.Groups[1].Captures.Cast<Capture>().Concat(
               m.Groups[2].Captures.Cast<Capture>()).
               OrderBy(x => x.Index).
               ToArray();

// captures[0] is the executable file
if (captures.Length < 3)
    throw new ArgumentException("A minimum of 2 arguments are required for this program");

Can anyone see a more efficient regex?
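Not a more efficient regex, but for comparison here is a sketch of the same idea without one: a single pass over the raw command line in which double quotes only group text and backslashes are ordinary characters. LiteralSplitter and Split are hypothetical names. The sketch assumes you pass in Environment.CommandLine on Windows, where it holds the raw, unparsed string, so the first token returned is the executable.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class LiteralSplitter
{
    // Hypothetical helper: splits on unquoted whitespace, treats quotes as
    // grouping markers only, and never interprets backslashes as escapes.
    public static List<string> Split(string commandLine)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        bool hasToken = false;

        foreach (char c in commandLine)
        {
            if (c == '"')
            {
                // Toggle quoting; the quote itself is not part of the token.
                inQuotes = !inQuotes;
                hasToken = true; // so that "" still produces an empty token
            }
            else if (!inQuotes && char.IsWhiteSpace(c))
            {
                // Unquoted whitespace ends the current token.
                if (hasToken)
                {
                    tokens.Add(current.ToString());
                    current.Clear();
                    hasToken = false;
                }
            }
            else
            {
                // Everything else, including backslashes, is copied literally.
                current.Append(c);
                hasToken = true;
            }
        }

        if (hasToken) tokens.Add(current.ToString());
        return tokens;
    }
}
```

The trade-off is that a path can no longer contain an escaped quote, which should not matter for Windows paths since a double quote is not a legal path character.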