How do I read fixed-length records in Perl?

Posted 2020-03-12 02:22

Question:

What's the best way to read a fixed-length record in Perl? I know that to read a file like:

ABCDE 302
DEFGC 876

I can do

while (<FILE>) {
   $key = substr($_, 0, 5);
   $value = substr($_, 6, 3);   # value starts at offset 6: 5-char key plus one blank
}

but isn't there a way to do this with read/unpack?

Answer 1:

Update: For the definitive answer, see Jonathan Leffler's answer below.

I wouldn't use this for just two fields (I'd use pack/unpack directly), but for 20 or 50 or so fields I like to use Parse::FixedLength (though I'm biased). For your example (Update: you can also use $/ and <> as an alternative to read($fh, $buf, $buf_length); see below):

use Parse::FixedLength;

my $pfl = Parse::FixedLength->new([qw(
  key:5
  blank:1
  value:3
)]);
# Assuming trailing newline
# (or add newline to format above and remove "+ 1" below)
my $data_length = $pfl->length() + 1;

{
  local $/ = \$data_length;
  while(<FILE>) {
    my $data = $pfl->parse($_);
    print "$data->{key}:$data->{value}\n";
    # or
    print $data->key(), ":", $data->value(), "\n";
  }
}

There are some similar modules that make pack/unpack more "friendly" (See the "See Also" section of Parse::FixedLength).
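
The record-length trick used above does not depend on the module: setting $/ to a reference to an integer makes <> return fixed-size chunks. A minimal sketch with plain unpack, assuming 10-byte records (5-character key, one blank, 3-character value, newline) and an in-memory filehandle standing in for FILE; the sub name read_records is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read fixed-size records from an open filehandle.
# Assumes each record is a 5-char key, 1 blank, a 3-char value, 1 newline.
sub read_records {
    my ($fh) = @_;
    my @records;
    local $/ = \10;                      # <> now returns 10-byte chunks
    while (my $rec = <$fh>) {
        # "x" skips the blank; the trailing newline is simply not unpacked
        my ($key, $value) = unpack "A5 x A3", $rec;
        push @records, [$key, $value];
    }
    return @records;
}

# In-memory filehandle used only for this demonstration.
my $data = "ABCDE 302\nDEFGC 876\n";
open(my $fh, "<", \$data) or die "Cannot open in-memory file: $!";
print join(":", @$_), "\n" for read_records($fh);
close($fh);
```

Because $/ is set with local inside the sub, the normal line-by-line behaviour of <> is restored as soon as the sub returns.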

Update: Wow, this was meant to be an alternative answer, not the accepted one. Since it is what it is, I should include some of Jonathan Leffler's more straightforward code, which is likely how you should usually do it (see the pack/unpack docs and Jonathan Leffler's answer below):

$_ = "ABCDE 302";
my($key, $blank, $value) = unpack "A5A1A3";


Answer 2:

my($key, $value) = unpack "A5 A3";    # Original, but slightly dubious

We both need to check out the options at the unpack manual page (and, more particularly, the pack manual page).

Since the A pack operator removes trailing blanks, your example can be encoded as:

my($key, $value) = unpack "A6A3";

Alternatively (this is Perl, so TMTOWTDI):

my($key, $blank, $value) = unpack "A5A1A3";

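The 1 is optional but systematic and symmetric. One advantage of this form is that the separator gets its own field and can be validated. Note, however, that A strips trailing spaces, so with A1 a blank separator comes back as the empty string; use a1 for that field if you want to test $blank eq " ".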
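
A small sketch of validating the separator, since the choice between the A and a formats matters here: A strips trailing spaces, so an A1 field holding a single space unpacks to the empty string, while a1 returns the byte unchanged. The sub name parse_record is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split one record and check the separator byte.
# "a1" keeps the separator verbatim; "A1" would strip a space to "".
sub parse_record {
    my ($record) = @_;
    my ($key, $blank, $value) = unpack "A5 a1 A3", $record;
    die "Bad separator in record '$record'\n" unless $blank eq " ";
    return ($key, $value);
}

my ($key, $value) = parse_record("ABCDE 302");
print "$key:$value\n";

# For comparison: with A1 the blank field is empty, not a space.
my (undef, $stripped) = unpack "A5 A1 A3", "ABCDE 302";
print "A1 gave '", $stripped, "'\n";
```

Whitespace inside the template string is ignored by unpack, so "A5 a1 A3" and "A5a1A3" are equivalent.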



Answer 3:

Assume 10-character records, each made up of two five-character fields:

open(my $fh, "<", $filename) or die $!;
while (read($fh, my $buf, 10)) {
  my ($field1, $field2) = unpack("A5 A5", $buf);
  # ... do something with data ...
}
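
A slightly more defensive sketch of the same loop: read returns the number of bytes actually read, 0 at end of file, and undef on error, so the three cases can be told apart and a truncated final record can be reported. The in-memory filehandle and the sub name read_fixed are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read $len-byte records and unpack two 5-char fields.
sub read_fixed {
    my ($fh, $len) = @_;
    my @rows;
    while (1) {
        my $n = read($fh, my $buf, $len);
        die "Read error: $!" unless defined $n;   # undef signals an error
        last if $n == 0;                          # 0 signals end of file
        # On a regular file a short read means a truncated last record.
        die "Short record ($n of $len bytes)\n" if $n < $len;
        push @rows, [ unpack "A5 A5", $buf ];
    }
    return @rows;
}

# In-memory filehandle with two 10-byte records and no newlines.
my $data = "ABCDE00302DEFGC00876";
open(my $fh, "<", \$data) or die "Cannot open in-memory file: $!";
print join(",", @$_), "\n" for read_fixed($fh, 10);
close($fh);
```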


Answer 4:

Here's yet another way to do it:

while (<FILE>)
{
    chomp;
    if (/^([A-Z]{5}) ([0-9]{3})$/)
    {
        $key = $1;
        $value = $2;
    }
}
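
A variant of the same idea, as a sketch: named captures (available since Perl 5.10) make the fields self-documenting, and returning an empty list lets the caller decide what to do with lines that do not fit the layout. The sub name parse_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return (key, value) for a well-formed line,
# or an empty list when the line does not match the expected layout.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    if ($line =~ /^(?<key>[A-Z]{5}) (?<value>[0-9]{3})$/) {
        return ($+{key}, $+{value});
    }
    return;
}

for my $line ("ABCDE 302\n", "bad line\n") {
    my ($key, $value) = parse_line($line);
    print defined $key ? "$key:$value\n" : "skipped\n";
}
```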


Answer 5:

Regardless of whether your records and fields are fixed-length, if the fields are separated by uniform delimiters (such as a space or comma), you can use the split function more easily than unpack.

my ($field1, $field2) = split / /;

Look up the documentation for split. There are useful variations on the argument list and on the format of the delimiter pattern.
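
A brief sketch of those variations: the pattern / / treats every individual space as a delimiter, the special string ' ' splits on runs of whitespace and ignores leading whitespace, and a third argument limits the number of fields. The sample line is made up.

```perl
use strict;
use warnings;

my $line = "ABCDE  302 extra words";   # note the two spaces after ABCDE

# / / treats every single space as a delimiter, so the double space
# produces an empty field between "ABCDE" and "302".
my @by_space = split / /, $line;

# ' ' (a string, not a regex) splits on runs of whitespace.
my @by_ws = split ' ', $line;

# A limit of 3 keeps everything after the second delimiter together.
my @limited = split ' ', $line, 3;

print scalar(@by_space), " fields: ", join("|", @by_space), "\n";
print scalar(@by_ws),    " fields: ", join("|", @by_ws), "\n";
print scalar(@limited),  " fields: ", join("|", @limited), "\n";
```

For fixed-width data padded with a variable number of blanks, the ' ' form is usually the safer of the two, since / / yields empty fields wherever two spaces are adjacent.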