argparse on-demand imports for types, choices, etc.

I have a fairly large program whose CLI is built on argparse, with several subparsers. The lists of supported choices for the subparsers' arguments are determined by DB queries, parsing various XML files, running calculations, and so on, so building them is IO-intensive and time-consuming.

The problem is that argparse seems to fetch the choices for all subparsers when I run the script, which adds a considerable and annoying startup delay.

Is there a way to make argparse fetch and validate choices only for the subparser that is actually being used?

One solution could be to move all the validation logic deeper inside the code, but that would mean quite a lot of work, which I would like to avoid if possible.

Thank you

Tags: python argparse

4 Answers

放荡不羁爱自由

#2 · 2019-02-25 11:12

To delay the fetching of choices, you could parse the command-line in two stages: In the first stage, you find only the subparser, and in the second stage, the subparser is used to parse the rest of the arguments:

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('subparser', choices=['foo','bar'])

def foo_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('fooval', choices='123')
    return parser

def bar_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('barval', choices='ABC')
    return parser

dispatch = {'foo':foo_parser, 'bar':bar_parser}
args, unknown = parser.parse_known_args()
args = dispatch[args.subparser]().parse_args(unknown)
print(args)

It could be used like this:

% script.py foo 2
Namespace(fooval='2')

% script.py bar A
Namespace(barval='A')

Note that the top-level help message will be less friendly, since it can only tell you about the subparser choices:

% script.py -h
usage: script.py [-h] {foo,bar}
...

To find information about the choices in each subparser, the user would have to select the subparser and pass the -h to it:

% script.py bar -- -h
usage: script.py [-h] {A,B,C}

All arguments after the -- are considered non-options (to script.py) and are thus parsed by the bar_parser.
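One possible variation, sketched here as an assumption rather than as part of the answer above: if the first-stage parser is built with add_help=False, it no longer recognizes -h, so -h ends up in the leftover arguments and reaches the selected subparser without the --. The helper names (build_foo, build_bar, split_command, parse) are hypothetical.

```python
import argparse

def build_foo():
    # Stand-in for a subparser whose choices are expensive to compute.
    p = argparse.ArgumentParser(prog='script.py foo')
    p.add_argument('fooval', choices=['1', '2', '3'])
    return p

def build_bar():
    p = argparse.ArgumentParser(prog='script.py bar')
    p.add_argument('barval', choices=['A', 'B', 'C'])
    return p

DISPATCH = {'foo': build_foo, 'bar': build_bar}

def split_command(argv):
    # Stage 1: no -h/--help is registered here, so '-h' is left over for stage 2.
    first = argparse.ArgumentParser(add_help=False)
    first.add_argument('subparser', choices=sorted(DISPATCH))
    ns, rest = first.parse_known_args(argv)
    return ns.subparser, rest

def parse(argv):
    name, rest = split_command(argv)
    # Stage 2: only the selected subparser is ever constructed.
    return DISPATCH[name]().parse_args(rest)
```

The trade-off in this sketch is that the top-level parser has no help option of its own, so running script.py -h alone reports a missing subparser argument instead of printing a help message.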


放荡不羁爱自由

#3 · 2019-02-25 11:13

This is a script that tests the idea of delaying the creation of a subparser until it is actually needed. In theory it might save start up time, by only creating the subparser that's actually needed.

I use nargs=argparse.PARSER to replicate the subparser behavior in the main parser. The help behavior is similar.

# lazy subparsers test
# lazy behaves much like a regular subparser case, but only creates one subparser
# for N=5 time differences do not rise above the noise

import argparse

def regular(N):
    parser = argparse.ArgumentParser()
    sp = parser.add_subparsers(dest='cmd')
    for i in range(N):
        spp = sp.add_parser('cmd%s'%i)
        spp.set_defaults(func='cmd%s'%(10*i))
        spp.add_argument('-f','--foo')
        spp.add_argument('pos', nargs='*')
    return parser

def lazy(N):
    parser = argparse.ArgumentParser()
    sp = parser.add_argument('cmd', nargs=argparse.PARSER, choices=[])
    for i in range(N):
        sp.choices.append('cmd%s'%i)
    return parser

def subpar(cmd):
    cmd, argv = cmd[0], cmd[1:]
    parser = argparse.ArgumentParser(prog=cmd)
    parser.add_argument('-f','--foo')
    parser.add_argument('pos', nargs='*')
    parser.set_defaults(func=cmd)
    args = parser.parse_args(argv)
    return args

N = 5
mode = True #False
argv = 'cmd1 -f1 a b c'.split()
if mode:
    args = regular(N).parse_args(argv)
    print(args)
else:
    args = lazy(N).parse_args(argv)
    print(args)
    if isinstance(args.cmd, list):
        sargs = subpar(args.cmd)
        print(sargs)

Test runs with mode=True, then mode=False (N=5):

1004:~/mypy$ time python3 stack44315696.py 
Namespace(cmd='cmd1', foo='1', func='cmd10', pos=['a', 'b', 'c'])

real    0m0.052s
user    0m0.044s
sys 0m0.008s
1011:~/mypy$ time python3 stack44315696.py 
Namespace(cmd=['cmd1', '-f1', 'a', 'b', 'c'])
Namespace(foo='1', func='cmd1', pos=['a', 'b', 'c'])

real    0m0.051s
user    0m0.048s
sys 0m0.000s

N has to be much larger to start seeing an effect.
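To look at how construction cost grows with N without the interpreter start-up time drowning it out, the build step can be timed in-process. This is only a rough sketch: build_regular mirrors regular(N) above, build_time is a hypothetical helper, and the values of N are arbitrary.

```python
import argparse
import time

def build_regular(n):
    # Same shape as regular(N) above: n eagerly created subparsers.
    parser = argparse.ArgumentParser()
    sp = parser.add_subparsers(dest='cmd')
    for i in range(n):
        spp = sp.add_parser('cmd%s' % i)
        spp.add_argument('-f', '--foo')
        spp.add_argument('pos', nargs='*')
    return parser

def build_time(n, repeats=3):
    # Best-of-`repeats` wall-clock time for constructing the parser.
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        build_regular(n)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == '__main__':
    for n in (5, 500):
        print('N=%d: %.4fs' % (n, build_time(n)))
```

This only measures the cost of creating plain subparsers. When the choices come from DB queries or XML parsing, that work is likely to dominate well before N gets large.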


混吃等死

#4 · 2019-02-25 11:14

I have solved the issue by creating a simple ArgumentParser subclass:

import argparse

class ArgumentParser(argparse.ArgumentParser):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.lazy_init = None

    def parse_known_args(self, args=None, namespace=None):
        if self.lazy_init is not None:
            self.lazy_init()
            self.lazy_init = None

        return super().parse_known_args(args, namespace)

Then I can use it as follows:

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(dest='command', title='commands', parser_class=ArgumentParser)
subparsers.required = True

subparser = subparsers.add_parser(
    'do-something', help="do something",
    description="Do something great.",
)

def lazy_init():
    from my_database import data

    subparser.add_argument(
        '-o', '--option', choices=data.expensive_fetch(), action='store',
    )

subparser.lazy_init = lazy_init

This initializes a subparser only when the parent parser actually hands it arguments to parse. So running program -h does not initialize the subparser, but running program do-something -h does.
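The behavior can be checked with a small self-contained sketch. Everything here is illustrative: LazyParser is the same subclass under a different name, expensive_fetch is a hypothetical stand-in for the real DB lookup, and the calls list exists only to record when the fetch runs.

```python
import argparse

class LazyParser(argparse.ArgumentParser):
    """Same idea as the subclass above: run lazy_init once, on first parse."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.lazy_init = None

    def parse_known_args(self, args=None, namespace=None):
        if self.lazy_init is not None:
            # Clear the hook before calling it so it runs at most once.
            init, self.lazy_init = self.lazy_init, None
            init()
        return super().parse_known_args(args, namespace)

calls = []  # records each (fake) expensive fetch

def expensive_fetch():
    # Hypothetical stand-in for a DB query or XML parse.
    calls.append('fetch')
    return ['a', 'b', 'c']

def build_parser():
    parser = argparse.ArgumentParser()
    subparsers = parser.add_subparsers(dest='command', parser_class=LazyParser)
    subparsers.required = True
    subparsers.add_parser('other')
    heavy = subparsers.add_parser('do-something')
    # The expensive choices are only computed when this hook runs.
    heavy.lazy_init = lambda: heavy.add_argument(
        '-o', '--option', choices=expensive_fetch())
    return parser
```

With this setup, build_parser().parse_args(['other']) leaves calls empty, while parsing ['do-something', '-o', 'b'] triggers the fetch once.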


地球回转人心会变

#5 · 2019-02-25 11:21

Here's a quick and dirty example of 'lazy' choices. In this case the choices are a range of integers. I think a case that requires expensive DB lookups could be implemented in a similar fashion.

# argparse with lazy choices

class LazyChoice(object):
    # large range
    def __init__(self, argmax):
        self.argmax=argmax
    def __contains__(self, item):
        # a 'lazy' test that does not enumerate all choices
        return item<=self.argmax
    def __iter__(self):
        # iterable for display in error message
        # use is in:
        # tup = value, ', '.join(map(repr, action.choices))
        # metavar bypasses this when formatting help/usage
        return iter(['integers up to %s'%self.argmax])

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--regular','-r',choices=['one','two'])
larg = parser.add_argument('--lazy','-l', choices=LazyChoice(10))
larg.type = int
print(parser.parse_args())

Implementing the testing part (__contains__) is easy. The help/usage can be customized with help and metavar attributes. Customizing the error message is harder. http://bugs.python.org/issue16468 discusses alternatives when choices are not iterable. (also on long list choices: http://bugs.python.org/issue16418)
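For the expensive-lookup case, the same container idea might look roughly like the sketch below. It assumes the lookup can be wrapped in a callable; LazyChoices and fetch_names are hypothetical names, and the loads list is there only to show when the lookup happens.

```python
import argparse

class LazyChoices:
    """Container that calls `loader` only when the values are first needed."""
    def __init__(self, loader):
        self._loader = loader
        self._values = None

    def _load(self):
        if self._values is None:
            self._values = list(self._loader())
        return self._values

    def __contains__(self, item):
        # Called by argparse when it validates a supplied value.
        return item in self._load()

    def __iter__(self):
        # Called by argparse when it lists the choices in an error message.
        return iter(self._load())

loads = []  # records each (fake) expensive lookup

def fetch_names():
    # Hypothetical stand-in for the expensive DB/XML lookup.
    loads.append(1)
    return ['alice', 'bob']

parser = argparse.ArgumentParser()
# metavar is set so that usage/help formatting does not iterate the choices.
parser.add_argument('--name', metavar='NAME', choices=LazyChoices(fetch_names))
parser.add_argument('--verbose', action='store_true')
```

In this sketch the lookup runs only when --name is actually given on the command line, and at most once per LazyChoices instance.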

I've also shown how the type can be changed after the initial setup. That doesn't solve the problem of setting the type based on the subparser choice. But it isn't hard to write a custom type, one that does some sort of DB lookup. All a type function needs to do is take a string, return the correctly converted value, and raise ValueError if there's a problem.
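A custom type along those lines could be sketched as follows. The names known_ids and record are hypothetical, and the dictionary stands in for whatever the real DB query would return. functools.lru_cache is used so that the lookup runs at most once per process.

```python
import argparse
import functools

@functools.lru_cache(maxsize=None)
def known_ids():
    # Hypothetical stand-in for a DB query; cached so it runs at most once.
    return {'1': 'alpha', '2': 'beta'}

def record(string):
    # argparse calls this with the raw command-line string.
    try:
        return known_ids()[string]
    except KeyError:
        # argparse turns ValueError into a normal usage error.
        raise ValueError('unknown record id: %r' % string)

parser = argparse.ArgumentParser()
parser.add_argument('--record', type=record)
```

Because the type function is only called for values that actually appear on the command line, nothing is looked up when --record is omitted.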
