Dictionary or If statements, Jython

Posted 2019-02-14 20:37

Question:

I am writing a script at the moment that will grab certain information from HTML using dom4j.

Since Python/Jython does not have a native switch statement I decided to use a whole bunch of if statements that call the appropriate method, like below:

if type == 'extractTitle':
    extractTitle(dom)
if type == 'extractMetaTags':
    extractMetaTags(dom)

I will be adding more depending on what information I want to extract from the HTML, so I thought about taking the dictionary approach, which I found elsewhere on this site. Example below:

{
    'extractTitle':    extractTitle,
    'extractMetaTags': extractMetaTags
}[type](dom)

I know the dictionary will be rebuilt each time I run the script, but with the if statements the script has to check each condition until it hits the correct one. What I am really wondering is: which one performs better, and which is generally better practice?

Update: @Brian - Thanks for the great reply. I have a question, if any of the extract methods require more than one object, e.g.

def handle_extractTag(self, dom, anotherObject):
    pass  # Do something

How would you change the handle method to implement this? Hope you know what I mean :)

Cheers

Answer 1:

To avoid specifying the tag and handler in the dict, you could just use a handler class with methods named to match the type. E.g.:

class MyHandler(object):
    def handle_extractTitle(self, dom):
        pass  # do something

    def handle_extractMetaTags(self, dom):
        pass  # do something

    def handle(self, type, dom):
        func = getattr(self, 'handle_%s' % type, None)
        if func is None:
            raise Exception("No handler for type %r" % type)
        return func(dom)

Usage:

 handler = MyHandler()
 handler.handle('extractTitle', dom)

Update:

When you have multiple arguments, just change the handle function to take those arguments and pass them through to the function. If you want to make it more generic (so you don't have to change both the handler functions and the handle method when you change the argument signature), you can use the *args and **kwargs syntax to pass through all received arguments. The handle method then becomes:

def handle(self, type, *args, **kwargs):
    func = getattr(self, 'handle_%s' % type, None)
    if func is None:
        raise Exception("No handler for type %r" % type)
    return func(*args, **kwargs)
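A minimal, self-contained sketch of the generic handle method in use. The handler bodies and the plain dict standing in for the dom4j document are assumptions made purely for illustration; a real handler would query the DOM instead.

```python
class MyHandler(object):
    def handle_extractTitle(self, dom):
        # Hypothetical body: dom is a plain dict in this demo.
        return dom['title']

    def handle_extractTag(self, dom, anotherObject):
        # Hypothetical handler that needs a second argument.
        return (dom['title'], anotherObject)

    def handle(self, type, *args, **kwargs):
        # Look up the method by name and forward whatever arguments were given.
        func = getattr(self, 'handle_%s' % type, None)
        if func is None:
            raise Exception("No handler for type %r" % type)
        return func(*args, **kwargs)


dom = {'title': 'Example page'}  # stand-in for a parsed document
handler = MyHandler()
title = handler.handle('extractTitle', dom)
tagged = handler.handle('extractTag', dom, anotherObject='meta')
```

Because handle forwards both positional and keyword arguments, adding a parameter to one handler only requires changing that handler and its call sites.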


Answer 2:

With your code, all of your functions get called.

handlers = {
    'extractTitle': extractTitle,
    'extractMetaTags': extractMetaTags
}

handlers[type](dom)

This would work like your original if code.
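To make the difference concrete, here is a small sketch comparing a dict whose values are call results with one whose values are the function objects themselves. The `calls` list and the trivial function bodies are assumptions added only to record which functions actually run.

```python
calls = []  # records every function that gets executed


def extractTitle(dom):
    calls.append('extractTitle')
    return 'title'


def extractMetaTags(dom):
    calls.append('extractMetaTags')
    return 'meta'


dom = None  # the DOM is irrelevant for this demonstration
type = 'extractTitle'

# Values are call results: every function runs while the dict is built.
eager = {
    'extractTitle': extractTitle(dom),
    'extractMetaTags': extractMetaTags(dom),
}[type]
eager_calls = list(calls)

del calls[:]  # reset the record

# Values are function objects: only the one looked up is called.
lazy = {
    'extractTitle': extractTitle,
    'extractMetaTags': extractMetaTags,
}[type](dom)
lazy_calls = list(calls)
```

Both versions produce the same result, but the first one does the work of every extractor on each lookup.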



Answer 3:

It depends on how many if statements we're talking about; if it's a very small number, then the if chain will be more efficient than using a dictionary.

However, as always, I strongly advise you to do whatever makes your code look cleaner until experience and profiling tell you that a specific block of code needs to be optimized.
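If you do want numbers for your own setup, a rough measurement with the standard timeit module could look like the sketch below. The function bodies, the helper names `with_if` and `with_dict`, and the iteration count are assumptions for illustration; results will vary between CPython and Jython and between machines.

```python
import timeit


def extractTitle(dom):
    return 'title'


def extractMetaTags(dom):
    return 'meta'


def with_if(type, dom):
    # The if chain from the question.
    if type == 'extractTitle':
        return extractTitle(dom)
    if type == 'extractMetaTags':
        return extractMetaTags(dom)
    raise ValueError("No handler for type %r" % type)


# The dict is built once, outside the timed code.
HANDLERS = {
    'extractTitle': extractTitle,
    'extractMetaTags': extractMetaTags,
}


def with_dict(type, dom):
    return HANDLERS[type](dom)


if_seconds = timeit.timeit(lambda: with_if('extractMetaTags', None), number=100000)
dict_seconds = timeit.timeit(lambda: with_dict('extractMetaTags', None), number=100000)
print("if chain: %.4f s, dict: %.4f s" % (if_seconds, dict_seconds))
```

With only two branches the difference is likely to be small compared with the cost of the extraction itself, which is the point of the advice above.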



Answer 4:

Your use of the dictionary is not quite correct. In your implementation, all of the methods will be called and the results of the useless ones discarded. What is usually done is something more like:

switch_dict = {'extractTitle': extractTitle, 
               'extractMetaTags': extractMetaTags}
switch_dict[type](dom)

That way is faster and more extensible if you have a large (or variable) number of items.
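One possible way to package this, sketched under assumptions: the dict is built once when the module loads, and a small `dispatch` helper (a hypothetical name) reports unknown types clearly instead of raising a bare KeyError. The function bodies and the dict standing in for the DOM are illustrative only.

```python
def extractTitle(dom):
    # Hypothetical body: dom is a plain dict in this demo.
    return dom.get('title')


def extractMetaTags(dom):
    return dom.get('meta', [])


# Built once, when the module is loaded, not on every lookup.
HANDLERS = {
    'extractTitle': extractTitle,
    'extractMetaTags': extractMetaTags,
}


def dispatch(type, dom):
    handler = HANDLERS.get(type)
    if handler is None:
        raise ValueError("No handler for type %r" % type)
    return handler(dom)


dom = {'title': 'Example page', 'meta': ['description', 'keywords']}
title = dispatch('extractTitle', dom)
meta = dispatch('extractMetaTags', dom)
```

Adding a new extraction then means writing one function and adding one entry to HANDLERS.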



Answer 5:

The efficiency question is barely relevant. A dictionary lookup uses hashing, while if statements have to be evaluated one at a time, so dictionaries tend to be quicker.

I suggest that you actually have polymorphic objects that do extractions from the DOM.

It's not clear how type gets set, but it sure looks like it might be a family of related objects, not a simple string.

class ExtractTitle(object):
    def process(self, dom):
        return something  # placeholder for the extracted value

class ExtractMetaTags(object):
    def process(self, dom):
        return something  # placeholder for the extracted value

Instead of setting type="extractTitle", you'd do this:

type = ExtractTitle()  # or ExtractMetaTags() or ExtractWhatever()
type.process(dom)

Then, you wouldn't be building this particular dictionary or if-statement.
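A runnable sketch of the polymorphic approach, assuming for illustration that the DOM is a plain dict and that each extractor simply reads one key. The real process methods would use dom4j queries instead.

```python
class ExtractTitle(object):
    def process(self, dom):
        # Hypothetical body: read the title from the stand-in DOM.
        return dom.get('title')


class ExtractMetaTags(object):
    def process(self, dom):
        # Hypothetical body: read the meta tag names from the stand-in DOM.
        return dom.get('meta', [])


dom = {'title': 'Example page', 'meta': ['description', 'keywords']}

# Any object with a process(dom) method can go in this list.
extractors = [ExtractTitle(), ExtractMetaTags()]
results = [extractor.process(dom) for extractor in extractors]
```

The caller never inspects a type string; it just calls process on whatever extractor it was given.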