Hidden features of Perl?

Posted 2019-01-08 02:44

Question:

What are some really useful but esoteric language features in Perl that you've actually been able to employ to do useful work?

Guidelines:

  • Try to limit answers to the Perl core and not CPAN
  • Please give an example and a short description

Hidden Features also found in other languages' Hidden Features:

(These are all from Corion's answer)

  • C
    • Duff's Device
    • Portability and Standardness
  • C#
    • Quotes for whitespace delimited lists and strings
    • Aliasable namespaces
  • Java
    • Static Initializers
  • JavaScript
    • Functions are First Class citizens
    • Block scope and closure
    • Calling methods and accessors indirectly through a variable
  • Ruby
    • Defining methods through code
  • PHP
    • Pervasive online documentation
    • Magic methods
    • Symbolic references
  • Python
    • One line value swapping
    • Ability to replace even core functions with your own functionality

Other Hidden Features:

Operators:

  • The bool quasi-operator
  • The flip-flop operator
    • Also used for list construction
  • The ++ and unary - operators work on strings
  • The repetition operator
  • The spaceship operator
  • The || operator (and // operator) to select from a set of choices
  • The diamond operator
  • Special cases of the m// operator
  • The tilde-tilde "operator"

Quoting constructs:

  • The qw operator
  • Letters can be used as quote delimiters in q{}-like constructs
  • Quoting mechanisms

Syntax and Names:

  • There can be a space after a sigil
  • You can give subs numeric names with symbolic references
  • Legal trailing commas
  • Grouped Integer Literals
  • hash slices
  • Populating keys of a hash from an array

Modules, Pragmas, and command-line options:

  • use strict and use warnings
  • Taint checking
  • Esoteric use of -n and -p
  • CPAN
  • overload::constant
  • IO::Handle module
  • Safe compartments
  • Attributes

Variables:

  • Autovivification
  • The $[ variable
  • tie
  • Dynamic Scoping
  • Variable swapping with a single statement

Loops and flow control:

  • Magic goto
  • for on a single variable
  • continue clause
  • Desperation mode

Regular expressions:

  • The \G anchor
  • (?{}) and (??{}) in regexes

Other features:

  • The debugger
  • Special code blocks such as BEGIN, CHECK, and END
  • The DATA block
  • New Block Operations
  • Source Filters
  • Signal Hooks
  • map (twice)
  • Wrapping built-in functions
  • The eof function
  • The dbmopen function
  • Turning warnings into errors

Other tricks, and meta-answers:

  • cat files, decompressing gzips if needed
  • Perl Tips

See Also:

  • Hidden features of C
  • Hidden features of C#
  • Hidden features of C++
  • Hidden features of Java
  • Hidden features of JavaScript
  • Hidden features of Ruby
  • Hidden features of PHP
  • Hidden features of Python
  • Hidden features of Clojure

Answer 1:

The flip-flop operator is useful for skipping the first iteration when looping through the records (usually lines) returned by a file handle, without using a flag variable:

while(<$fh>)
{
  next if 1..1; # skip first record
  ...
}

Run perldoc perlop and search for "flip-flop" for more information and examples.
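
Beyond skipping the first record, the same operator can pick out a run of lines between two markers. The following is only a sketch: the BEGIN/END marker lines and the in-memory list standing in for a file handle are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical input; the BEGIN/END markers are invented for this example.
my @lines = ("header", "BEGIN", "alpha", "beta", "END", "footer");

my @block;
for (@lines) {
    # With pattern operands, the flip-flop becomes true when the left side
    # matches and stays true through the line on which the right side matches.
    push @block, $_ if /^BEGIN$/ .. /^END$/;
}

print "@block\n";   # BEGIN alpha beta END
```

With constant operands such as 1..1, the operands are instead compared against the current line number $., which is why the original example skips only the first record.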



Answer 2:

There are many non-obvious features in Perl.

For example, did you know that there can be a space after a sigil?

 $ perl -wle 'my $x = 3; print $ x'
 3

Or that you can give subs numeric names if you use symbolic references?

$ perl -lwe '*4 = sub { print "yes" }; 4->()' 
yes

There's also the "bool" quasi-operator, which returns 1 for true expressions and the empty string for false:

$ perl -wle 'print !!4'
1
$ perl -wle 'print !!"0 but true"'
1
$ perl -wle 'print !!0'
(empty line)

Other interesting stuff: with use overload you can overload string literals and numbers (and for example make them BigInts or whatever).

Many of these things are actually documented somewhere, or follow logically from the documented features, but nonetheless some are not very well known.

Update: Another nice one. Below the q{...} quoting constructs were mentioned, but did you know that you can use letters as delimiters?

$ perl -Mstrict  -wle 'print q bJet another perl hacker.b'
Jet another perl hacker.

Likewise you can write regular expressions:

m xabcx
# same as m/abc/


Answer 3:

Add support for compressed files via magic ARGV:

s{ 
    ^            # make sure to get whole filename
    ( 
      [^'] +     # at least one non-quote
      \.         # extension dot
      (?:        # now either suffix
          gz
        | Z 
       )
    )
    \z           # through the end
}{gzcat '$1' |}xs for @ARGV;

(The quotes around $1 are necessary to handle filenames containing shell metacharacters.)

Now the <> feature will decompress any @ARGV files that end with ".gz" or ".Z":

while (<>) {
    print;
}


Answer 4:

One of my favourite features in Perl is using the boolean || operator to select between a set of choices.

 $x = $a || $b;

 # $x = $a, if $a is true.
 # $x = $b, otherwise

This means one can write:

 $x = $a || $b || $c || 0;

to take the first true value from $a, $b, and $c, or a default of 0 otherwise.

Since Perl 5.10 there is also the // operator, which returns the left-hand side if it is defined, and the right-hand side otherwise. The following selects the first defined value from $a, $b, $c, or 0 otherwise:

$x = $a // $b // $c // 0;

These can also be used with their short-hand forms, which are very useful for providing defaults:

$x ||= 0;   # If $x was false, it now has a value of 0.

$x //= 0;   # If $x was undefined, it now has a value of zero.
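
The difference between the two operators matters when a legitimate value happens to be false. A small sketch, assuming a made-up options hash (the key names are hypothetical):

```perl
use strict;
use warnings;

my %opt = (retries => 0);   # hypothetical options; 0 is a deliberate value

my $with_or  = $opt{retries} || 3;    # 0 is false, so the default wins
my $with_dor = $opt{retries} // 3;    # 0 is defined, so it is kept
my $missing  = $opt{timeout} // 30;   # key absent (undef), so the default is used

print "$with_or $with_dor $missing\n";   # 3 0 30
```

So || is the right choice when any false value should be replaced, and // when only undef should be.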

Cheerio,

Paul



Answer 5:

The operators ++ and unary - don't only work on numbers, but also on strings.

$_ = "a";
print -$_;

prints -a

print ++$_;

prints b

$_ = 'z';
print ++$_;

prints aa
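
The increment carries across characters and preserves case and digits. A short sketch using the sample strings from perlop; as far as I know the magic only applies when the variable has been used purely as a string and matches /^[a-zA-Z]*[0-9]*$/:

```perl
use strict;
use warnings;

my @results;
for my $start ("Az", "zz", "a9", "Zz") {
    my $s = $start;
    $s++;                            # magical string increment
    push @results, "$start -> $s";
}

print "$_\n" for @results;
# Az -> Ba
# zz -> aaa
# a9 -> b0
# Zz -> AAa
```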



Answer 6:

As Perl has almost all "esoteric" parts from the other lists, I'll tell you the one thing that Perl can't:

The one thing Perl can't do is have bare arbitrary URLs in your code, because the // operator is used for regular expressions.

Just in case it wasn't obvious to you what features Perl offers, here's a selective list of the maybe not totally obvious entries:

Duff's Device - in Perl

Portability and Standardness - There are likely more computers with Perl than with a C compiler

A file/path manipulation class - File::Find works on even more operating systems than .Net does

Quotes for whitespace delimited lists and strings - Perl allows you to choose almost arbitrary quotes for your list and string delimiters

Aliasable namespaces - Perl has these through glob assignments:

*My::Namespace:: = \%Your::Namespace::;

Static initializers - Perl can run code in almost every phase of compilation and object instantiation, from BEGIN (code parse) to CHECK (after code parse) to import (at module import) to new (object instantiation) to DESTROY (object destruction) to END (program exit)

Functions are First Class citizens - just like in Perl

Block scope and closure - Perl has both

Calling methods and accessors indirectly through a variable - Perl does that too:

my $method = 'foo';
my $obj = My::Class->new();
$obj->$method( 'baz' ); # calls $obj->foo( 'baz' )

Defining methods through code - Perl allows that too:

*foo = sub { print "Hello world" };

Pervasive online documentation - Perl documentation is online and likely on your system too

Magic methods that get called whenever you call a "nonexisting" function - Perl implements that in the AUTOLOAD function

Symbolic references - you are well advised to stay away from these. They will eat your children. But of course, Perl allows you to offer your children to blood-thirsty demons.

One line value swapping - Perl allows list assignment

Ability to replace even core functions with your own functionality

use subs 'unlink'; 
sub unlink { print 'No.' }

or

BEGIN{
    *CORE::GLOBAL::unlink = sub {print 'no'}
};

unlink($_) for @ARGV
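
The AUTOLOAD entry in the list above can be illustrated with a few lines. This is a minimal sketch, not a recommended design; the Greeter class and its reply text are invented for the example.

```perl
use strict;
use warnings;

{
    package Greeter;   # hypothetical class, for illustration only

    our $AUTOLOAD;     # Perl stores the fully qualified missing name here

    sub new { return bless {}, shift }

    # Called for any method that Greeter does not define.
    sub AUTOLOAD {
        my $self = shift;
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;              # strip the package qualifier
        return if $name eq 'DESTROY';   # do not intercept object destruction
        return "no method '$name', got (@_)";
    }
}

my $reply = Greeter->new->hello('world');
print "$reply\n";   # no method 'hello', got (world)
```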


Answer 7:

Autovivification. AFAIK no other language has it.
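
For readers who have not met the term: nested containers spring into existence the first time they are dereferenced. A small sketch with made-up keys, which also shows the usual surprise that merely testing a nested key creates its parent:

```perl
use strict;
use warnings;

my %data;

# No intermediate containers were declared; Perl creates the nested
# hash and array references on first use.
$data{users}{alice}{logins}[2] = 'tuesday';

my $kind   = ref $data{users}{alice}{logins};            # 'ARRAY'
my $length = scalar @{ $data{users}{alice}{logins} };    # 3; slots 0 and 1 are undef

# The flip side: testing a nested key vivifies the parent hash.
my %seen;
my $found       = exists $seen{a}{b} ? 1 : 0;   # 0
my $side_effect = exists $seen{a}    ? 1 : 0;   # 1, $seen{a} now exists

print "$kind $length $found $side_effect\n";    # ARRAY 3 0 1
```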



Answer 8:

It's simple to quote almost any kind of strange string in Perl.

my $url = q{http://my.url.com/any/arbitrary/path/in/the/url.html};

In fact, the various quoting mechanisms in Perl are quite interesting. The Perl regex-like quoting mechanisms allow you to quote anything, specifying the delimiters. You can use almost any special character like #, /, or open/close characters like (), [], or {}. Examples:

my $var  = q#some string where the pound is the final escape.#;
my $var2 = q{A more pleasant way of escaping.};
my $var3 = q(Others prefer parens as the quote mechanism.);

Quoting mechanisms:

q : literal quote; the only character that needs to be escaped is the end character.

qq : an interpreted quote; processes variables and escape characters. Great for strings that you need to quote:

my $var4 = qq{This "$mechanism" is broken.  Please inform "$user" at "$email" about it.};

qx : Works like qq, but then executes the result as a system command, non-interactively, and returns everything the command writes to standard output. (Redirection, if the OS supports it, lets you capture more.) Back quotes (the ` character) do the same thing.

my $output  = qx{type "$path"};      # get just the output
my $moreout = qx{type "$path" 2>&1}; # get stuff on stderr too

qr : Interprets like qq, but then compiles it as a regular expression. Works with the various options on the regex as well. You can now pass the regex around as a variable:

sub MyRegexCheck {
    my ($string, $regex) = @_;
    if ($string)
    {
       return ($string =~ $regex);
    }
    return; # returns undef in scalar context, the empty list in list context
}

my $regex = qr{http://[\w]+\.com/([\w]+/)+};
my @results = MyRegexCheck(q{http://myurl.com/subpath1/subpath2/}, $regex);

qw : A very, very useful quote operator. Turns a quoted set of whitespace separated words into a list. Great for filling in data in a unit test.


   my @allowed = qw(A B C D E F G H I J K L M N O P Q R S T U V W X Y Z { });
   my @badwords = qw(WORD1 word2 word3 word4);
   my @numbers = qw(one two three four 5 six seven); # works with numbers too
   my @list = ('string with space', qw(eight nine), "a $var"); # works in other lists
   my $arrayref = [ qw(and it works in arrays too) ]; 

Use them whenever they make things clearer. For qx, qq, and q, I most often use {} as the delimiters. People using qw most commonly use (), but sometimes you also see qw//.



Answer 9:

Not really hidden, but many every day Perl programmers don't know about CPAN. This especially applies to people who aren't full time programmers or don't program in Perl full time.



Answer 10:

The "for" statement can be used the same way "with" is used in Pascal:

for ($item)
{
    s/&nbsp;/ /g;
    s/<.*?>/ /g;
    $_ = join(" ", split(" ", $_));
}

You can apply a sequence of s/// operations, etc. to the same variable without having to repeat the variable name.



Answer 11:

The quoteword operator is one of my favourite things. Compare:

my @list = ('abc', 'def', 'ghi', 'jkl');

and

my @list = qw(abc def ghi jkl);

Much less noise, easier on the eye. Another really nice thing about Perl, that one really misses when writing SQL, is that a trailing comma is legal:

print 1, 2, 3, ;

That looks odd, but not if you indent the code another way:

print
    results_of_foo(),
    results_of_xyzzy(),
    results_of_quux(),
    ;

Adding an additional argument to the function call does not require you to fiddle around with commas on previous or trailing lines. The single line change has no impact on its surrounding lines.

This makes it very pleasant to work with variadic functions. This is perhaps one of the most under-rated features of Perl.



Answer 12:

The ability to parse data directly pasted into a DATA block. No need to save to a test file to be opened in the program or similar. For example:

my @lines = <DATA>;
for (@lines) {
    print if /bad/;
}

__DATA__
some good data
some bad data
more good data 
more good data 


Answer 13:

New Block Operations

I'd say the ability to expand the language, creating pseudo block operations is one.

  1. You declare the prototype for a sub indicating that it takes a code reference first:

    sub do_stuff_with_a_hash (&\%) {
        my ( $block_of_code, $hash_ref ) = @_;
        while ( my ( $k, $v ) = each %$hash_ref ) { 
            $block_of_code->( $k, $v );
        }
    }
    
  2. You can then call it in the body like so

    use feature 'say';   # say needs Perl 5.10 or later
    use Data::Dumper;
    
    do_stuff_with_a_hash {
        local $Data::Dumper::Terse = 1;
        my ( $k, $v ) = @_;
        say qq(Hey, the key   is "$k"!);
        say sprintf qq(Hey, the value is "%s"!), Dumper( $v );
    
    } %stuff_for
    ;
    

(Data::Dumper::Dumper is another semi-hidden gem.) Notice how you don't need the sub keyword in front of the block, or the comma before the hash. It ends up looking a lot like: map { } @list

Source Filters

Also, there are source filters, where Perl passes you the source code so you can manipulate it before it is compiled. Both this and the block operations are pretty much don't-try-this-at-home type of things.

I have done some neat things with source filters, for example creating a very simple language to check the time, allowing short Perl one-liners for some decision making:

perl -MLib::DB -MLib::TL -e 'run_expensive_database_delete() if $hour_of_day < AM_7';

Lib::TL would just scan for both the "variables" and the constants, create them and substitute them as needed.

Again, source filters can be messy, but are powerful. But they can mess debuggers up something terrible--and even warnings can be printed with the wrong line numbers. I stopped using Damian's Switch because the debugger would lose all ability to tell me where I really was. But I've found that you can minimize the damage by modifying small sections of code, keeping them on the same line.

Signal Hooks

It's done often enough, but it's not all that obvious. Here's a die handler that piggybacks on the old one.

use feature 'say';

my $old_die_handler = $SIG{__DIE__};
$SIG{__DIE__}
    = sub {
        say q(Hey! I'm DYIN' over here!);
        goto &$old_die_handler if ref $old_die_handler;  # chain only if a handler was installed
    }
    ;

That means whenever some other module in the code wants to die, they gotta come to you (unless someone else does a destructive overwrite on $SIG{__DIE__}). And you can be notified that somebody thinks something is an error.

Of course, for enough things you can just use an END { } block, if all you want to do is clean up.

overload::constant

You can inspect literals of a certain type in packages that include your module. For example, if you use this in your import sub:

overload::constant 
    integer => sub { 
        my $lit = shift;
        return $lit > 2_000_000_000 ? Math::BigInt->new( $lit ) : $lit 
    };

it will mean that every integer greater than 2 billion in the calling packages will get changed to a Math::BigInt object. (See overload::constant).

Grouped Integer Literals

While we're at it: Perl lets you break up large numbers with underscores, conventionally into groups of three digits, and still get a parsable integer out of it. Note 2_000_000_000 above for 2 billion.



Answer 14:

Binary "x" is the repetition operator:

print '-' x 80;     # print row of dashes

It also works with lists:

print for (1, 4, 9) x 3; # print 149149149
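
The list form is handy for initialization. A short sketch with invented variable names; the last line combines it with a hash slice to mark every key as seen:

```perl
use strict;
use warnings;

my @zeros = (0) x 5;           # five-element list of zeros
my $rule  = '=' x 10;          # ten-character string

my @keys = qw(a b c);
my %seen;
@seen{@keys} = (1) x @keys;    # right operand is in scalar context: 3 repetitions

print "@zeros\n";                                        # 0 0 0 0 0
print "$rule\n";                                         # ==========
print join(",", map { "$_=$seen{$_}" } sort keys %seen), "\n";   # a=1,b=1,c=1
```

Note that the parentheses on the left decide the behavior: (0) x 5 repeats a list, while 0 x 5 would build the string 00000.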


Answer 15:

Taint checking. With taint checking enabled, perl will die (or warn, with -t) if you try to pass tainted data (roughly speaking, data from outside the program) to an unsafe function (opening a file, running an external command, etc.). It is very helpful when writing setuid scripts or CGIs or anything where the script has greater privileges than the person feeding it data.

Magic goto. goto &sub does an optimized tail call.

The debugger.

use strict and use warnings. These can save you from a bunch of typos.
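
To make the magic goto item above concrete: goto &sub replaces the current call frame, so the target sub cannot see the wrapper in its call stack. A minimal sketch with hypothetical sub names:

```perl
use strict;
use warnings;

# Reports the name of the sub that called us, if any.
sub report_caller {
    my $who = (caller(1))[3];
    return defined $who ? $who : 'top level';
}

sub plain_wrapper { return report_caller(@_) }   # ordinary call: wrapper stays on the stack
sub goto_wrapper  { goto &report_caller }        # frame is replaced; @_ is passed along

my $via_call = plain_wrapper();   # main::plain_wrapper
my $via_goto = goto_wrapper();    # whatever called goto_wrapper, never the wrapper itself

print "$via_call\n$via_goto\n";
```

This is why AUTOLOAD implementations often use it: the generated sub then behaves as if it had been called directly.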



Answer 16:

Based on the way the "-n" and "-p" switches are implemented in Perl 5, you can write a seemingly incorrect program including }{:

ls |perl -lne 'print $_; }{ print "$. Files"'

which is converted internally to this code:

LINE: while (defined($_ = <ARGV>)) {
    print $_; }{ print "$. Files";
}


Answer 17:

Let's start easy with the Spaceship Operator.

$a = 5 <=> 7;  # $a is set to -1
$a = 7 <=> 5;  # $a is set to 1
$a = 6 <=> 6;  # $a is set to 0
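
Its natural home is a sort block. A short sketch with made-up data, showing a numeric sort next to the default string sort, and <=> chained with cmp for a two-key sort:

```perl
use strict;
use warnings;

my @numeric = sort { $a <=> $b } (10, 9, 100, 1);   # 1 9 10 100
my @default = sort (10, 9, 100, 1);                 # string order: 1 10 100 9

# Hypothetical records: highest score first, ties broken by name.
my @players = (
    { name => 'carol', score => 90 },
    { name => 'bob',   score => 95 },
    { name => 'alice', score => 90 },
);
my @ranked = map { $_->{name} }
             sort { $b->{score} <=> $a->{score} or $a->{name} cmp $b->{name} }
             @players;

print "@numeric\n@default\n@ranked\n";   # last line: bob alice carol
```

The chaining works because <=> returns 0 for equal values, which is false, so the or falls through to the next comparison.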


Answer 18:

This is a meta-answer, but the Perl Tips archives contain all sorts of interesting tricks that can be done with Perl. The archive of previous tips is on-line for browsing, and can be subscribed to via mailing list or atom feed.

Some of my favourite tips include building executables with PAR, using autodie to throw exceptions automatically, and the use of the switch and smart-match constructs in Perl 5.10.

Disclosure: I'm one of the authors and maintainers of Perl Tips, so I obviously think very highly of them. ;)



Answer 19:

map - not only because it makes one's code more expressive, but because it gave me an impulse to read a little bit more about this "functional programming".



Answer 20:

The continue clause on loops. It will be executed at the bottom of every loop, even those which are next'ed.

while( <> ){
  print "top of loop\n";
  chomp;

  next if /next/i;
  last if /last/i;

  print "bottom of loop\n";
}continue{
  print "continue\n";
}


Answer 21:

My vote would go for the (?{}) and (??{}) groups in Perl's regular expressions. The first executes Perl code, ignoring the return value, the second executes code, using the return value as a regular expression.
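
A small sketch of both constructs. The counter and the variable names are made up for illustration; the balanced-parentheses pattern follows the well-known example from perlre. Package variables (our) are used because closures over lexicals in regex code blocks were unreliable before Perl 5.18.

```perl
use strict;
use warnings;

# (?{ ... }) runs code each time the regex engine passes that point.
our $count = 0;
"aaa" =~ /^(?:a(?{ $count++ }))*$/;
print "$count\n";   # 3

# (??{ ... }) runs code and uses the result as a sub-pattern. Here the
# pattern refers to itself in order to match balanced parentheses.
our $balanced;
$balanced = qr{ \( (?: (?> [^()]+ ) | (??{ $balanced }) )* \) }x;

my $ok  = "(a(b)(c(d)))" =~ /^$balanced$/ ? 1 : 0;   # 1
my $bad = "(a(b)"        =~ /^$balanced$/ ? 1 : 0;   # 0
print "$ok $bad\n";   # 1 0
```

Both constructs are documented as experimental-flavored features in perlre, and code blocks can run more often than expected when the engine backtracks, so counters like the one above are only reliable for simple patterns.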



Answer 22:

while(/\G(\b\w*\b)/g) {
     print "$1\n";
}

the \G anchor. It's hot.
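
Where \G really earns its keep is in hand-written lexers, combined with the /gc flags. A minimal sketch, assuming a toy input language of identifiers, integers, and the operators = and + (the token names are invented):

```perl
use strict;
use warnings;

my $input = "total = 42 + x1";
my @tokens;

for ($input) {
    while (1) {
        # /g advances pos() on success; /c leaves pos() alone on failure,
        # so every alternative is tried at the same position.
        if    (/\G\s+/gc)            { next }
        elsif (/\G(\d+)/gc)          { push @tokens, "NUM($1)" }
        elsif (/\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }
        elsif (/\G([=+])/gc)         { push @tokens, "OP($1)" }
        else                         { last }   # end of input or unknown character
    }
}

print "@tokens\n";   # ID(total) OP(=) NUM(42) OP(+) ID(x1)
```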



Answer 23:

The m// operator has some obscure special cases:

  • If you use ? as the delimiter it only matches once unless you call reset.
  • If you use ' as the delimiter the pattern is not interpolated.
  • If the pattern is empty it uses the pattern from the last successful match.


Answer 24:

The null filehandle diamond operator <> has its place in building command line tools. It acts like <FH> to read from a handle, except that it magically selects whichever is found first: command line filenames or STDIN. Taken from perlop:

while (<>) {
...         # code for each line
}


Answer 25:

Special code blocks such as BEGIN, CHECK and END. They come from Awk, but work differently in Perl, because it is not record-based.

The BEGIN block can be used to specify some code for the parsing phase; it is also executed when you do the syntax-and-variable-check perl -c. For example, to load in configuration variables:

BEGIN {
    eval {
        require 'config.local.pl';
    };
    if ($@) {
        require 'config.default.pl';
    }
}


Answer 26:

rename("$_.part", $_) for "data.txt";

renames data.txt.part to data.txt without having to repeat myself.



Answer 27:

A bit obscure is the tilde-tilde "operator" which forces scalar context.

print ~~ localtime;

is the same as

print scalar localtime;

and different from

print localtime;


Answer 28:

tie, the variable tying interface.
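
Since the answer gives no example, here is a minimal sketch of a tied scalar. The Counter class is hypothetical; TIESCALAR, FETCH, and STORE are the method names the tie interface expects.

```perl
use strict;
use warnings;

{
    package Counter;   # hypothetical class implementing the tied-scalar interface

    sub TIESCALAR { my ($class, $start) = @_; return bless { n => $start }, $class }
    sub FETCH     { my $self = shift; return $self->{n}++ }          # each read advances
    sub STORE     { my ($self, $value) = @_; $self->{n} = $value }   # assignment resets
}

tie my $next_id, 'Counter', 100;

my $first  = $next_id;   # 100
my $second = $next_id;   # 101
$next_id   = 500;        # goes through STORE
my $third  = $next_id;   # 500

print "$first $second $third\n";   # 100 101 500
```

The variable looks like an ordinary scalar to the calling code, which is both the appeal and the danger of tie.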



Answer 29:

The "desperation mode" of Perl's loop control constructs which causes them to look up the stack to find a matching label allows some curious behaviors which Test::More takes advantage of, for better or worse.

SKIP: {
    skip() if $something;

    print "Never printed";
}

sub skip {
    no warnings "exiting";
    last SKIP;
}

There's the little known .pmc file. "use Foo" will look for Foo.pmc in @INC before Foo.pm. This was intended to allow compiled bytecode to be loaded first, but Module::Compile takes advantage of this to cache source filtered modules for faster load times and easier debugging.

The ability to turn warnings into errors.

use warnings;   # the "isn't numeric" warning only fires when warnings are enabled
local $SIG{__WARN__} = sub { die @_ };
$num = "two";
$sum = 1 + $num;
print "Never reached";

That's what I can think of off the top of my head that hasn't been mentioned.



Answer 30:

The goatse operator*:

$_ = "foo bar";
my $count =()= /[aeiou]/g; #3

or

sub foo {
    return @_;
}

$count =()= foo(qw/a b c d/); #4

It works because list assignment in scalar context yields the number of elements in the list being assigned.

* Note, not really an operator