Is there a native "PHP way" to parse command arguments from a string? For example, given the following string:
foo "bar \"baz\"" '\'quux\''
I'd like to create the following array:
array(3) {
[0] =>
string(3) "foo"
[1] =>
string(9) "bar "baz""
[2] =>
string(6) "'quux'"
}
I've already tried to leverage token_get_all(), but PHP's variable interpolation syntax (e.g. "foo ${bar} baz") pretty much rained on my parade.
I know full well that I could write my own parser. Command argument syntax is super simplistic, but if there's an existing native way to do it, I'd much prefer that over rolling my own.
EDIT: Please note that I am looking to parse the arguments from a string, NOT from the shell/command-line.
EDIT #2: Below is a more comprehensive example of the expected input -> output for arguments:
foo -> foo
"foo" -> foo
'foo' -> foo
"foo'foo" -> foo'foo
'foo"foo' -> foo"foo
"foo\"foo" -> foo"foo
'foo\'foo' -> foo'foo
"foo\foo" -> foo\foo
"foo\\foo" -> foo\foo
"foo foo" -> foo foo
'foo foo' -> foo foo
I suggest something like:
With some assistance from "string to array, split by single and double quotes" for the regexp.
You still have to unescape the strings in the array afterwards, but you get the picture.
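The split-then-unescape idea can be sketched as follows. This is a minimal sketch rather than the original snippet: the function name splitArguments and the exact pattern are assumptions made for illustration, and the input is assumed to have properly closed quotes.

```php
<?php
// Hypothetical helper: splits on whitespace while keeping quoted runs together,
// then strips the outer quotes and unescapes \" (or \') and \\ inside them.
function splitArguments(string $input): array
{
    // A double-quoted run, a single-quoted run, or a bare run of non-whitespace.
    $pattern = '/"(?:[^"\\\\]|\\\\.)*"|\'(?:[^\'\\\\]|\\\\.)*\'|\S+/s';
    preg_match_all($pattern, $input, $matches);

    $args = [];
    foreach ($matches[0] as $token) {
        $quote = $token[0];
        $isQuoted = ($quote === '"' || $quote === "'")
            && strlen($token) >= 2
            && substr($token, -1) === $quote;

        if ($isQuoted) {
            // Drop the surrounding quotes, then unescape in a single pass.
            $token = substr($token, 1, -1);
            $token = strtr($token, ['\\\\' => '\\', '\\' . $quote => $quote]);
        }
        $args[] = $token;
    }

    return $args;
}

$input = <<<'TXT'
foo "bar \"baz\"" '\'quux\''
TXT;

// Three arguments: foo, bar "baz", 'quux'
var_export(splitArguments($input));
```

A backslash in front of any other character is left untouched, which matches the "foo\foo" -> foo\foo case from the question.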
Since you request a native way to do this, and PHP doesn't provide any function that mirrors how $argv is created, you could work around this gap as follows:
Create an executable PHP script foo.php:
And use it to retrieve arguments, the way PHP will actually do if you exec $command:
Advantages:
Drawbacks:
There really is no native function for parsing commands, to my knowledge. However, I have created a function which does the trick in plain PHP. By using str_replace several times, you are able to convert the string into something that can be converted to an array. I don't know how fast you consider fast, but when running the function 400 times, the slowest run was under 34 microseconds.
If you want to follow the same parsing rules that a shell uses, there are some edge cases that I think aren't easy to cover with regular expressions, so you might want to write a method that does this (example):
Output:
I guess this pretty much matches what you're looking for. The function used in the example can be configured for the escape character as well as for the quotes; you can even use brackets like [ ] to form a "quote" if you like.
To allow strings other than native byte-safe strings with one character per byte, you can pass an array instead of a string. The array needs to contain one character per value as a binary-safe string. For example, pass Unicode in NFC form as UTF-8 with one code point per array value, and this should do the job for Unicode.
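A character-by-character parser with a configurable escape character and quote pairs could look roughly like this. It is an illustrative sketch, not the function from this answer: the name parseArguments, its parameters, and the choice to throw on an unterminated quote are all assumptions, and it works on bytes only (it does not accept the one-character-per-element array described above).

```php
<?php
/**
 * Hypothetical byte-based parser. $quotes maps each opening quote
 * character to the character that closes it.
 */
function parseArguments(
    string $input,
    string $escape = '\\',
    array $quotes = ['"' => '"', "'" => "'"]
): array {
    $args = [];
    $current = '';
    $inToken = false; // true once an argument has started (so "" yields an empty argument)
    $closing = null;  // closing quote we are waiting for, or null outside quotes
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        $next = $i + 1 < $length ? $input[$i + 1] : null;

        if ($closing !== null) {
            // Inside quotes the escape character only protects the closing quote or itself.
            if ($char === $escape && ($next === $closing || $next === $escape)) {
                $current .= $next;
                $i++;
            } elseif ($char === $closing) {
                $closing = null;
            } else {
                $current .= $char;
            }
        } elseif (isset($quotes[$char])) {
            $closing = $quotes[$char];
            $inToken = true;
        } elseif (strpos(" \t\r\n", $char) !== false) {
            // Unquoted whitespace ends the current argument, if there is one.
            if ($inToken) {
                $args[] = $current;
                $current = '';
                $inToken = false;
            }
        } else {
            $current .= $char;
            $inToken = true;
        }
    }

    if ($closing !== null) {
        throw new InvalidArgumentException('Unterminated quote in input');
    }
    if ($inToken) {
        $args[] = $current;
    }

    return $args;
}

// Brackets used as a "quote" pair:
var_export(parseArguments('a [b c] d', '\\', ['[' => ']']));
```

Keeping the state in two variables ($closing and $inToken) makes the edge cases explicit: an empty quoted string still produces an argument, and a quoted run directly next to bare text is joined into one argument.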
Regexes are quite powerful:
(?s)(?<!\\)("|')(?:[^\\]|\\.)*?\1|\S+

So what does this expression mean?

(?s) : set the s modifier to match newlines with a dot.
(?<!\\) : negative lookbehind, check that no backslash precedes the next token.
("|') : match a single or double quote and put it in group 1.
(?:[^\\]|\\.)*? : match everything that is not \, or match \ with the immediately following (escaped) character.
\1 : match what was matched in the first group.
| : or
\S+ : match anything except whitespace one or more times.

The idea is to capture a quote and group it to remember whether it's a single or a double one. The negative lookbehind is there to make sure we don't match escaped quotes. \1 is used to match the closing quote. Finally we use an alternation to match anything that's not whitespace. This solution is handy and is applicable to almost any language/flavor that supports lookbehinds and backreferences. Of course, this solution expects that the quotes are closed. The results are found in group 0.

Let's implement it in PHP:
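A minimal sketch of that implementation, assuming preg_match_all() with ~ as the delimiter (the variable names and the sample input taken from the question are illustrative choices, not necessarily the original snippet):

```php
<?php
$input = <<<'TXT'
foo "bar \"baz\"" '\'quux\''
TXT;

// In a single-quoted PHP string, four backslashes reach the regex engine as "\\",
// which matches one literal backslash.
$pattern = '~(?s)(?<!\\\\)("|\')(?:[^\\\\]|\\\\.)*?\1|\S+~';

preg_match_all($pattern, $input, $matches);

// Group 0 holds the raw tokens, quotes and backslashes included.
print_r($matches[0]);
```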
If you wonder why I used 4 backslashes, take a look at my previous answer.
Output
Removing the quotes
Quite simple using named groups and a simple loop:
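One way to do this, sketched under the assumption that the group names quoted and unquoted are free to choose (they are made up here for illustration):

```php
<?php
$input = <<<'TXT'
foo "bar \"baz\"" '\'quux\''
TXT;

// Same expression as before, with the quoted body and the bare word in named groups.
$pattern = '~(?s)(?<!\\\\)("|\')(?<quoted>(?:[^\\\\]|\\\\.)*?)\1|(?<unquoted>\S+)~';

preg_match_all($pattern, $input, $sets, PREG_SET_ORDER);

$arguments = [];
foreach ($sets as $set) {
    // \S+ never matches an empty string, so a non-empty "unquoted" group
    // means the second alternative matched; otherwise take the quoted body.
    $arguments[] = (isset($set['unquoted']) && $set['unquoted'] !== '')
        ? $set['unquoted']
        : $set['quoted'];
}

print_r($arguments);
```

This only removes the surrounding quotes; escape sequences such as \" are still present in the values and would need a separate unescaping step.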
I would recommend going another way. There is already a "standard" way of handling command-line arguments. It's called getopt:
http://php.net/manual/en/function.getopt.php
I would suggest that you change your script to use getopt. Then anyone using your script will be passing parameters in a way that is familiar to them and kind of "industry standard", instead of having to learn your way of doing things.