Is checking Perl function arguments worth it?

Posted 2019-03-11 01:24

There's a lot of buzz about MooseX::Method::Signatures and even before that, modules such as Params::Validate that are designed to type check every argument to methods or functions. I'm considering using the former for all my future Perl code, both personal and at my place of work. But I'm not sure if it's worth the effort.

I'm thinking of all the Perl code I've seen (and written) before that performs no such checking. I very rarely see a module do this:

my ($a, $b) = @_;
defined $a or croak '$a must be defined!';
!ref $a or croak '$a must be a scalar!';
...
@_ == 2 or croak "Too many arguments!";

Perhaps because it's simply too much work without some kind of helper module, but perhaps because in practice we don't send excess arguments to functions, and we don't send arrayrefs to methods that expect scalars - or if we do, we have use warnings; and we quickly hear about it - a duck typing approach.

So is Perl type checking worth the performance hit, or are its strengths predominantly shown in compiled, strongly typed languages such as C or Java?

I'm interested in answers from anyone who has experience writing Perl that uses these modules and has seen benefits (or not) from their use; if your company/project has any policies relating to type checking; and any problems with type checking and performance.

UPDATE: I read an interesting article on the subject recently, called Strong Testing vs. Strong Typing. Ignoring the slight Python bias, it essentially states that type checking can be suffocating in some instances, and even if your program passes the type checks, it's no guarantee of correctness - proper tests are the only way to be sure.

9 answers
Deceive 欺骗
#2 · 2019-03-11 01:28

Yes, it's worth it - defensive programming is one of those things that are always worth it.

啃猪蹄的小仙女
#3 · 2019-03-11 01:28

Params::Validate works great, but of course checking args slows things down. Tests are mandatory (at least in the code I write).

混吃等死
#4 · 2019-03-11 01:32

Sometimes. I generally do it whenever I'm passing options via hash or hashref. In these cases it's very easy to misremember or misspell an option name, and checking with Params::Check can save a lot of troubleshooting time.

For example:

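use Carp qw(croak);
use Params::Check qw(check);

sub revise {
    my ($file, $options) = @_;

    # Template: allowed values and defaults for each option
    my $tmpl = {
        test_mode => { allow => [0,1], 'default' => 0 },
        # qr// makes this a pattern match; qw// would compare against the literal string
        verbosity => { allow => qr/^\d+$/, 'default' => 1 },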
        force_update => { allow => [0,1], 'default' => 0 },
        required_fields => { 'default' => [] },
        create_backup => { allow => [0,1], 'default' => 1 },
    };

    my $args = check($tmpl, $options, 1)
      or croak "Could not parse arguments: " . Params::Check::last_error();
    ...
}

Prior to adding these checks, I'd forget whether the names used underscores or hyphens, pass require_backup instead of create_backup, etc. And this is for code I wrote myself--if other people are going to use it, you should definitely do some sort of idiot-proofing. Params::Check makes it fairly easy to do type checking, allowed value checking, default values, required options, storing option values to other variables and more.
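For comparison, the misspelled-name guard alone can be sketched in core Perl without any helper module. This is only an illustration: `check_option_names` is a made-up helper, not part of Params::Check, and it covers none of the defaults or allowed-value checks that the template above gives you.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: reject any option name that is not in the allowed list.
sub check_option_names {
    my ($options, @allowed) = @_;
    my %is_allowed = map { $_ => 1 } @allowed;
    my @unknown = sort grep { !$is_allowed{$_} } keys %$options;
    croak "Unknown option(s): @unknown" if @unknown;
    return 1;
}

# A misspelled name is reported at the call instead of being silently ignored.
my $ok = eval {
    check_option_names({ require_backup => 1 }, qw(test_mode create_backup));
    1;
};
print "Rejected: $@" unless $ok;
```

Once you need defaults and value checks as well, a module such as Params::Check saves you from writing this by hand for every sub.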

家丑人穷心不美
#5 · 2019-03-11 01:37

I want to mention two points here: first the tests, second the performance question.

1) Tests

You mentioned that tests can do a lot and that tests are the only way to be sure that your code is correct. In general I would say this is absolutely correct. But tests by themselves solve only one problem.

If you write a module, you have two problems, or let's say two different people who use your module.

You as the developer, and a user who just uses your module. Tests help the first: they show that your module is correct and does the right thing. But they don't help the user who just uses your module.

For the latter, I have an example. I had written a module using Moose and some other stuff, and my code always ended in a segmentation fault. I began to debug my code and search for the problem, and spent around 4 hours finding the error. In the end the problem was that I had used Moose with the Array trait: I called the "map" function and didn't provide a subroutine reference, just a string or something else.

Sure, this was an absolutely stupid error of mine, but I spent a long time debugging it. A check that the argument is a subref would have cost the developer 10 seconds, and the missing check cost me, and probably others, a lot more time.
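The kind of 10-second check meant here could look roughly like the following. It is a sketch in plain Perl, not the actual Moose code; `apply_to_each` is a hypothetical stand-in for a map-style method that expects a callback.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical stand-in for a map-style method that expects a callback.
sub apply_to_each {
    my ($callback, @items) = @_;
    ref $callback eq 'CODE'
        or croak 'First argument must be a code reference';
    return map { $callback->($_) } @items;
}

my @doubled = apply_to_each(sub { $_[0] * 2 }, 1, 2, 3);   # (2, 4, 6)
```

Passing a string instead of a sub now fails immediately with a message that points at the caller, rather than somewhere deep inside the library.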

I also know of other examples. I had written a REST client for an interface, completely OOP with Moose. You always got objects back and could change their attributes, but of course it didn't call the REST API for every change. Instead you change your values and at the end call an update() method that transfers the data and changes the values.

Now I had a user who then wrote:

$obj->update({ foo => 'bar' })

Sure, I got an error report back that update() does not work. But of course it didn't work, because the update() method doesn't accept a hashref. It only synchronises the actual state of the object with the online service. The correct code would be:

$obj->foo('bar');
$obj->update();

The first version runs without complaint because I never checked the arguments, and I don't throw an error if someone passes more arguments than I expect. The method just starts as usual:

sub update {
  my ( $self ) = @_;
  ...
}

Sure, all my tests pass 100% fine. But handling these errors that are not errors costs me time too. And it probably costs the user a lot more time.
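A guard that would have caught the update({ foo => 'bar' }) call might look like this. It is only a sketch: `My::Resource` and its constructor are invented for the example, and the real synchronisation code is left out.

```perl
package My::Resource;   # hypothetical class name, for illustration only
use strict;
use warnings;
use Carp qw(croak);

sub new {
    my ($class, %attrs) = @_;
    return bless {%attrs}, $class;
}

sub update {
    my ($self, @extra) = @_;
    croak 'update() takes no arguments; set attributes first, then call update()'
        if @extra;
    # ... synchronise the object state with the remote service here ...
    return $self;
}

package main;

my $obj = My::Resource->new(foo => 'bar');
my $ok  = eval { $obj->update({ foo => 'baz' }); 1 };
print "Caught: $@" unless $ok;
```

With that one line, the user gets told how the method is meant to be used instead of wondering why nothing changed.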

So in the end: yes, tests are the only correct way to ensure that your code works correctly. But that doesn't mean that type checking is meaningless. Type checking is there to help everyone who is not a developer of your module to use it correctly. And it saves you and others time finding dumb errors.

2) Performance

In short: you don't care about performance until you care.

That means that until your module runs too slowly, performance is fast enough and you don't need to care about it. If your module really runs too slowly, you need to investigate further. But for that investigation you should use a profiler like Devel::NYTProf to see what is slow.

And I would say that in 99% of cases slowness is not caused by type checking but by your algorithm: you do a lot of computation, call functions too often, etc. Often it helps to try a completely different solution: a better algorithm, caching, or something else. The performance hit is not your type checking. But even if the checking is the performance hit, then just remove it where it matters.
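If you want a rough number for what the checks themselves cost, the core Benchmark module can compare a checked and an unchecked sub in isolation. This is a sketch with two made-up subs; the results depend on your machine and Perl version, and a profiler is still the right tool for finding what is actually slow in a real program.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Carp qw(croak);

# Hypothetical subs: the same work with and without hand-written checks.
sub add_unchecked { my ($x, $y) = @_; return $x + $y }

sub add_checked {
    my ($x, $y) = @_;
    @_ == 2 or croak 'Expected exactly two arguments';
    defined($x) && defined($y) or croak 'Arguments must be defined';
    !ref($x) && !ref($y) or croak 'Arguments must be plain scalars';
    return $x + $y;
}

# Run each variant for about one CPU second and print a comparison table.
cmpthese(-1, {
    unchecked => sub { add_unchecked(2, 3) },
    checked   => sub { add_checked(2, 3) },
});
```

Whatever difference shows up here is per call, so it only matters for subs that are called very often.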

There is no reason to leave out type checking where performance doesn't matter. Do you think type checking matters in a case like the one above, where I had written a REST client? 99% of performance issues there are the number of requests that go to the web service or the time each request takes. Not using type checking or MooseX::Declare etc. would probably speed up absolutely nothing.

And even if you see performance disadvantages, sometimes that is acceptable, because the speed doesn't matter or because something gives you greater value. DBIx::Class is slower than pure SQL with DBI, but DBIx::Class gives you a lot in return.

对你真心纯属浪费
#6 · 2019-03-11 01:44

I basically concur with brian. How much you need to worry about your method's inputs depends heavily on how much you are concerned that a) someone will input bad data, and b) bad data will corrupt the purpose of the method. I would also add that there is a difference between external and internal methods. You need to be more diligent about public methods because you're making a promise to consumers of your class; conversely you can be less diligent about internal methods as you have greater (theoretical) control over the code that accesses it, and have only yourself to blame if things go wrong.

MooseX::Method::Signatures is an elegant solution to adding a simple declarative way to explain the parameters of a method. Method::Signatures::Simple and Params::Validate are nice but lack one of the features I find most appealing about Moose: the Type system. I have used MooseX::Declare and by extension MooseX::Method::Signatures for several projects and I find that the bar to writing the extra checks is so minimal it's almost seductive.

狗以群分
#7 · 2019-03-11 01:46

If it's important for you to check that an argument is exactly what you need, it's worth it. Performance only matters when you already have correct functioning. It doesn't matter how fast you can get a wrong answer or a core dump. :)

Now, that sounds like a stupid thing to say, but consider some cases where it isn't. Do I really care what's in @_ here?

sub looks_like_a_number { $_[0] !~ /\D/ }
sub is_a_dog            { eval { $_[0]->DOES( 'Dog' ) } }

In those two examples, if the argument isn't what you expect, you are still going to get the right answer because the invalid arguments won't pass the tests. Some people see that as ugly, and I can see their point, but I also think the alternative is ugly. Who wins?

However, there are going to be times when you need guard conditions because your situation isn't so simple. The next thing you have to pass your data to might expect them to be within certain ranges or of certain types, and might not fail elegantly.

When I think about guard conditions, I think through what could happen if the inputs are bad and how much I care about the failure. I have to judge that by the demands of each situation. I know that sucks as an answer, but I tend to like it better than a bondage-and-discipline approach where you have to go through all the mess even when it doesn't matter.

I dread Params::Validate because its code is often longer than my subroutine. The Moose stuff is very attractive, but you have to realize that it's a way for you to declare what you want and you still get what you could build by hand (you just don't have to see it or do it). The biggest thing I hate about Perl is the lack of optional method signatures, and that's one of the most attractive features in Perl 6 as well as Moose.
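On the signatures point: Perl itself has since gained subroutine signatures through the `signatures` feature (experimental from 5.20, stable as of 5.36). They check the number of arguments but not their types, so they only cover part of what the Moose modules offer. A minimal sketch, with `area` as a made-up example sub:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# $height is optional and defaults to $width.
sub area ($width, $height = $width) {
    return $width * $height;
}

print area(3, 4), "\n";   # 12
print area(5), "\n";      # 25

# Too many arguments is a runtime error raised by perl itself.
my $ok = eval { area(1, 2, 3); 1 };
print "Rejected: $@" unless $ok;
```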
