Passing two or more arrays to a Perl subroutine

Posted 2020-02-09 04:02

Question:

I am having trouble passing and reading arguments inside a subroutine that is expected to take two arrays.

sub two_array_sum { # two_array_sum( (1, 2, 3, 4), (2, 4, 0, 1) ) -> (3, 6, 3, 5)
  # I would like to use parameters @a and @b as simply as possible
}

# I would like to call two_array_sum here and pass two arrays, @c and @d

I have seen and tried several examples from the web, but none of them worked for me.

Answer 1:

There are two ways you can do this:

  1. by prototype
  2. by reference

But before I discuss these: if what your question shows is about the extent of what you want to do, let me suggest pairwise from List::MoreUtils.

So, where you would write this:

my @sum = two_array_sum( @a, @b );

You'd simply write this (with pairwise imported via use List::MoreUtils qw(pairwise)):

my @sum = pairwise { $a + $b } @a, @b;

By prototype

This works like push. (And just like push, it insists that each argument be written with an @ sigil.)

sub two_array_sub (\@\@) { 
    my ( $aref, $bref ) = @_;
    ...
}

That way when you do this

two_array_sub( @a, @b );

it works: the sub receives two array references. Without the prototype, both arrays would arrive in @_ flattened into one long list. Prototypes aren't for everybody, as you'll see in the discussion below.
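
As a minimal sketch of the prototype approach applied to the question's own example (the arrays @c and @d come from the question; the body is just one possible way to do the sum), note that the sub is declared before the call, since a prototype only takes effect for calls compiled after the declaration:

```perl
use strict;
use warnings;

# The prototype (\@\@) makes Perl pass a reference to each named array.
# The sub is declared before the call so that the prototype applies.
sub two_array_sum (\@\@) {
    my ( $x, $y ) = @_;    # two array references
    return map { $x->[$_] + $y->[$_] } 0 .. $#{$x};
}

my @c = ( 1, 2, 3, 4 );
my @d = ( 2, 4, 0, 1 );

my @sum = two_array_sum( @c, @d );    # no backslashes at the call site
print "@sum\n";                       # prints "3 6 3 5"
```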

By reference

This is the approach the other answers show: take the references yourself at the call site.

some_sub( \@a, \@b );

About prototypes

They're finicky. With the \@\@ prototype, this won't compile if what you have are array references:

two_array_sub( $arr_ref, $brr_ref );

You have to pass them like this:

two_array_sub( @$arr_ref, @$brr_ref );

However, array dereferencing expressions get really ugly quickly when the arrays are nested deep, so I often sidestep Perl's fussiness: you can widen the kinds of argument Perl will accept by listing them in a "character class" construct. \[$@] means that the argument can be either a scalar or an array, and the sub receives a reference to whichever one was passed.

sub new_two_array_sub (\[$@]\[$@]) { 
    my $ref = shift;
    my $arr = ref( $ref ) eq 'ARRAY' ? $ref : $$ref; # ref() returns 'REF' when a scalar holding a reference was passed
    $ref    = shift;
    my $brr = ref( $ref ) eq 'ARRAY' ? $ref : $$ref;
    ...
}

So all these work:

new_two_array_sub( @a, $self->{a_level}{an_array} );
new_two_array_sub( $arr, @b );
new_two_array_sub( @a, @b );
new_two_array_sub( $arr, $self->{a_level}{an_array} );

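However, Perl still rejects calls like these, because each \[$@] slot needs an array or an assignable scalar (a variable or an element), not an expression that merely produces a reference: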

new_two_array_sub( \@a, $b );
or
new_two_array_sub( $a, [ 1..3 ] );

The same goes for any other "constructor" expression that yields a reference to an array. Fortunately, you can bypass the prototype check by calling the sub with the old Perl 4 & sigil:

&new_two_array_sub( \@a, [ 1..3 ] );

Then the ref() checks in the sub take care of handling the two array references.
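
Putting the pieces of this answer together, here is a minimal sketch of the \[$@] variant. The helper to_array_ref and the name new_two_array_sum are hypothetical, introduced only for illustration; the summing body is an assumption, since the original leaves the sub body elided.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise what the prototype hands over.
# An array argument arrives as an ARRAY ref; a scalar that holds an
# array ref arrives as a ref to that scalar, so dereference it once.
sub to_array_ref {
    my ($ref) = @_;
    return ref($ref) eq 'ARRAY' ? $ref : $$ref;
}

sub new_two_array_sum (\[$@]\[$@]) {
    my $arr = to_array_ref(shift);
    my $brr = to_array_ref(shift);
    return map { $arr->[$_] + $brr->[$_] } 0 .. $#{$arr};
}

my @nums  = ( 1, 2, 3 );
my $other = [ 10, 20, 30 ];

# Prototype in effect: an array and a scalar holding an array ref.
my @via_prototype = new_two_array_sum( @nums, $other );
print "@via_prototype\n";    # prints "11 22 33"

# Leading & skips the prototype, so plain references are accepted.
my @via_ampersand = &new_two_array_sum( \@nums, [ 1, 1, 1 ] );
print "@via_ampersand\n";    # prints "2 3 4"
```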



Answer 2:

Pass references to your arrays to the function:

two_array_sum( \@a, \@b )

and don't use a or b as variable names, because $a and $b are special package variables (sort uses them for its comparisons).
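
A short illustration of why those two names are best left alone: sort sets $a and $b itself for every comparison, which is also why they need no declaration under strict.

```perl
use strict;
use warnings;

# sort assigns the package variables $a and $b for each comparison,
# so they work here without any "my" declaration.
my @sorted = sort { $a <=> $b } ( 3, 1, 2 );
print "@sorted\n";    # prints "1 2 3"
```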



Answer 3:

I'll quote from man perlref, but you should read it all:

   Making References

   References can be created in several ways.

   1.  By using the backslash operator on a variable, subroutine, or
       value.  (This works much like the & (address-of) operator in C.)
       This typically creates another reference to a variable, because
       there's already a reference to the variable in the symbol table.
       But the symbol table reference might go away, and you'll still have
       the reference that the backslash returned.  Here are some examples:

           $scalarref = \$foo;
           $arrayref  = \@ARGV;
           $hashref   = \%ENV;
           $coderef   = \&handler;
           $globref   = \*foo;

...

   Using References

   That's it for creating references.  By now you're probably dying to
   know how to use references to get back to your long-lost data.  There
   are several basic methods.

   1.  Anywhere you'd put an identifier (or chain of identifiers) as part
       of a variable or subroutine name, you can replace the identifier
       with a simple scalar variable containing a reference of the correct
       type:

           $bar = $$scalarref;
           push(@$arrayref, $filename);
           $$arrayref[0] = "January";
           $$hashref{"KEY"} = "VALUE";
           &$coderef(1,2,3);
           print $globref "output\n";


Answer 4:

my @sums = two_array_sum(\@aArray, \@bArray);

sub two_array_sum { # two_array_sum( (1, 2, 3, 4), (2, 4, 0, 1) ) -> (3, 6, 3, 5)
    my ($aRef, $bRef) = @_;
    my @result = ();

    my $idx = 0;
    foreach my $aItem (@{$aRef}) {
        my $bItem = $bRef->[$idx++];
        push (@result, $aItem + $bItem);
    }

    return @result;
}


Answer 5:

You need to pass arrays or hashes to your subroutine using references, for example:

sub two_array_sum {
  my ($x, $y) = @_;
  # process $x and $y here (both are array references)
}
two_array_sum(\@a, \@b);


Answer 6:

You can't pass arrays to functions. Functions can only accept a list of scalars as arguments. As such, you need to pass scalars that provide sufficient data to recreate the arrays.
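
To see the flattening this describes, a small sketch (count_args is a hypothetical helper that only reports how many scalars arrived in @_):

```perl
use strict;
use warnings;

# Hypothetical helper: report how many scalars the sub received.
sub count_args { return scalar @_ }

my @left  = ( 1, 2, 3, 4 );
my @right = ( 2, 4, 0, 1 );

my $flat_count = count_args( @left, @right );      # arrays flatten into one list
my $ref_count  = count_args( \@left, \@right );    # two references, two scalars

print "$flat_count $ref_count\n";    # prints "8 2"
```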

The simplest means of doing so is passing references to the arrays.

sub two_array_sum {
   my ($array0, $array1) = @_;

   my @array0 = @$array0;
   my @array1 = @$array1;

   return map { $array0[$_] + $array1[$_] } 0..$#array0;
}

You can even avoid reconstructing the arrays and work with the references directly.

sub two_array_sum {
   my ($array0, $array1) = @_;
   return map { $array0->[$_] + $array1->[$_] } 0..$#$array0;
}

Usage:

my @array0 = (1, 2, 3, 4);
my @array1 = (2, 4, 0, 1);
two_array_sum(\@array0, \@array1);

Square brackets construct an anonymous array (populated with the result of the expression within) and return a reference to that array. Therefore, the above could also be written as follows:

two_array_sum([1, 2, 3, 4], [2, 4, 0, 1]);
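
One practical difference between the copying version and the direct-reference version in this answer is what happens on assignment inside the sub. A sketch, using two hypothetical subs written only to show the effect:

```perl
use strict;
use warnings;

# Works on a copy: the caller's array is untouched.
sub zero_first_copy {
    my ($ref) = @_;
    my @copy = @$ref;
    $copy[0] = 0;
    return;
}

# Works through the reference: the caller's array is modified.
sub zero_first_in_place {
    my ($ref) = @_;
    $ref->[0] = 0;
    return;
}

my @values = ( 1, 2, 3, 4 );

zero_first_copy( \@values );
my $after_copy = "@values";
print "$after_copy\n";        # prints "1 2 3 4"

zero_first_in_place( \@values );
my $after_in_place = "@values";
print "$after_in_place\n";    # prints "0 2 3 4"
```

For a read-only sum like two_array_sum, either version gives the same result; the copy only costs extra memory.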


Answer 7:

The methods above are the canonical ones. Another way to do it is to keep the arrays in a single hash reference and pass that:

use strict;
my $data;

@{$data->{array1}} = qw(foo bar baz);
@{$data->{array2}} = qw(works for me);
testsub($data);

sub testsub
{
    my ($data) = @_;
    print join( "\t", @{$data->{array1}} ), "\n";    # parentheses keep "\n" out of the join
    print join( "\t", @{$data->{array2}} ), "\n";
    $data->{array1}[3] = "newitem";
    delete $data->{array2};
    push @{$data->{newarray}}, (1, 2, 3);
    return $data;
}

When you do it this way, you can keep a much tighter control of your variables, rather than suffer a rat's nest of program data intermingled with configuration information.

In general, I never have more than three or four variables in any program.

I also keep a system to it: I use a structure that alternates hashes and lists (a hash of lists of hashes of lists).

$config->{server}[0]{prod}[0]{input}[0] = 'inputfile';

The reason is that as long as I alternate consistently between the two, Data::Dumper can dump the entire structure, I can better control the scope of data, and I can pass entire structures around with ease.

I often find myself passing multiple structures like this to subroutines. As scalars, they pass quite well, thank you.
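
A minimal sketch of that workflow, assuming a structure like the one in this answer: the sub receives the whole structure as one scalar, and Data::Dumper (a core module) prints it. The name add_item is hypothetical, and $Data::Dumper::Sortkeys is set only to make the output order stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump

my $data = {
    array1 => [qw(foo bar baz)],
    array2 => [qw(works for me)],
};

# Hypothetical helper: the whole structure arrives as a single scalar.
sub add_item {
    my ($d) = @_;
    push @{ $d->{array1} }, 'newitem';
    return $d;
}

add_item($data);

my $dump = Dumper($data);    # text representation of the whole structure
print $dump;
```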