What's the best way to read a fixed-length record in Perl? I know that to read a file like:
ABCDE 302
DEFGC 876
I can do
while (<FILE>) {
$key = substr($_, 0, 5);
$value = substr($_, 6, 3);
}
but isn't there a way to do this with read/unpack?
Update: For the definitive answer, see Jonathan Leffler's answer below.
I wouldn't use this for just two fields (I'd use pack/unpack directly), but for 20 or 50 or so fields I like to use Parse::FixedLength (but I'm biased). For your example (Update: you can also set $/ and use <> as an alternative to read($fh, $buf, $buf_length), as the code here does):
use Parse::FixedLength;
my $pfl = Parse::FixedLength->new([qw(
key:5
blank:1
value:3
)]);
# Assuming trailing newline
# (or add newline to format above and remove "+ 1" below)
my $data_length = $pfl->length() + 1;
{
local $/ = \$data_length;
while(<FILE>) {
my $data = $pfl->parse($_);
print "$data->{key}:$data->{value}\n";
# or
print $data->key(), ":", $data->value(), "\n";
}
}
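The $/ trick used above does not depend on the module. As a rough sketch using only core Perl, and assuming each record is 9 data characters plus a newline as in the question, the same loop can hand each chunk straight to unpack. The in-memory filehandle and the %value_for hash are stand-ins of mine for illustration, not part of the original answer:

```perl
use strict;
use warnings;

# Sample data: two 9-character records, each followed by a newline.
my $input = "ABCDE 302\nDEFGC 876\n";
open(my $fh, '<', \$input) or die "Cannot open in-memory file: $!";

my %value_for;
{
    # A reference to an integer makes <$fh> return fixed-size chunks
    # instead of newline-terminated lines.
    local $/ = \10;    # 9 data characters + 1 newline
    while (my $record = <$fh>) {
        # A5 = key, x = skip the blank, A3 = value; the newline is ignored.
        my ($key, $value) = unpack 'A5 x A3', $record;
        $value_for{$key} = $value;
        print "$key:$value\n";
    }
}
close $fh;
```

With a real file, replace the in-memory open with an open on the file name; the rest stays the same.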
There are some similar modules that make pack/unpack more "friendly" (See the "See Also" section of Parse::FixedLength).
Update: Wow, this was meant to be an alternative answer, not the official answer. Since it is what it is, I should include some of Jonathan Leffler's more straightforward code, which is likely how you should usually do it (see the pack/unpack docs and Jonathan Leffler's answer below):
$_ = "ABCDE 302";
my($key, $blank, $value) = unpack "A5A1A3";
my($key, $value) = unpack "A5 A3"; # Original, but dubious: spaces in the template are ignored, so $value is " 30"
We both need to check out the options at the unpack manual page (and, more particularly, the pack manual page).
Since the A pack operator removes trailing blanks, your example can be encoded as:
my($key, $value) = unpack "A6A3";
Alternatively (this is Perl, so TMTOWTDI):
my($key, $blank, $value) = unpack "A5A1A3";
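The 1 is optional but systematic and symmetric. One advantage of this is that you can validate the separator, though note that A strips trailing blanks, so $blank comes back as the empty string; use a1 in the template if you want to test $blank eq " ".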
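To make the stripping behaviour concrete, here is a small sketch (the variable names $raw_key, $raw_blank and $raw_value are mine, not from the answer) that unpacks the sample record with both A and a. Per the pack documentation, A strips trailing spaces and nulls when unpacking while a returns the data untouched, so a check against a literal space only works on the a1 field:

```perl
use strict;
use warnings;

my $record = "ABCDE 302";

# "A" strips trailing spaces (and nulls), so the one-character
# blank field comes back as the empty string.
my ($key, $blank, $value) = unpack 'A5 A1 A3', $record;

# "a" returns the bytes untouched, so the separator can be
# compared against a literal space.
my ($raw_key, $raw_blank, $raw_value) = unpack 'a5 a1 a3', $record;

print "A1 gives '$blank', a1 gives '$raw_blank'\n";
warn "unexpected separator in record: $record\n" unless $raw_blank eq ' ';
```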
Assume 10-character records of two five-character fields per record:
open(my $fh, "<", $filename) or die $!;
while (read($fh, my $buf, 10)) {
my ($field1, $field2) = unpack("A5 A5", $buf);
# ... do something with data ...
}
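Applied to the records in the question, the same read/unpack loop might look like the sketch below. It assumes every record is exactly 10 bytes (5 for the key, 1 blank, 3 for the value, 1 newline) and reads from an in-memory filehandle so it can run as-is; $record_length and @records are names I made up for the example. It also separates end of file from read errors and truncated records, which the short loop above does not:

```perl
use strict;
use warnings;

my $input = "ABCDE 302\nDEFGC 876\n";
open(my $fh, '<', \$input) or die "Cannot open in-memory file: $!";

my $record_length = 10;   # 5 (key) + 1 (blank) + 3 (value) + 1 (newline)
my @records;
while (1) {
    my $bytes = read($fh, my $buf, $record_length);
    die "Read error: $!" unless defined $bytes;   # undef signals an error
    last if $bytes == 0;                          # 0 signals end of file
    die "Short record ($bytes bytes)\n" if $bytes < $record_length;

    # A5 = key, x = skip the blank, A3 = value
    my ($key, $value) = unpack 'A5 x A3', $buf;
    push @records, [$key, $value];
}
close $fh;

print "$_->[0] => $_->[1]\n" for @records;
```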
Here's yet another way to do it:
while (<FILE>)
{
chomp;
if (/^([A-Z]{5}) ([0-9]{3})$/)
{
$key = $1;
$value = $2;
}
}
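One thing the regex buys you over substr or unpack is validation: a line that doesn't fit the layout simply fails to match. A possible variation, assuming Perl 5.10 or later for named captures and using a hard-coded list of lines in place of the file (the @lines, %value_for and @rejected names are hypothetical), collects the bad lines instead of silently skipping them:

```perl
use strict;
use warnings;

my @lines = ("ABCDE 302\n", "bad line\n", "DEFGC 876\n");

my (%value_for, @rejected);
for my $line (@lines) {
    chomp $line;
    # Named captures (?<name>...) are available through the %+ hash.
    if ($line =~ /^(?<key>[A-Z]{5}) (?<value>[0-9]{3})$/) {
        $value_for{ $+{key} } = $+{value};
    }
    else {
        push @rejected, $line;   # keep malformed lines for reporting
    }
}

print "rejected: $_\n" for @rejected;
```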
Regardless of whether your records and fields are fixed-length, if the fields are separated by uniform delimiters (such as a space or comma), you can use the split function more easily than unpack.
my ($field1, $field2) = split / /;
Look up the documentation for split. There are useful variations on the argument list and on the format of the delimiter pattern.
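A few of those variations, as a quick sketch on made-up input (note the two spaces in the first sample line; all variable names here are mine):

```perl
use strict;
use warnings;

my $line = "ABCDE  302\n";   # note the two spaces between fields
chomp $line;                 # otherwise the newline stays on the last field

# A single-space regex splits on every space, so two adjacent
# spaces produce an empty field in between.
my @strict = split / /, $line;      # ("ABCDE", "", "302")

# The special string ' ' splits on runs of whitespace and
# skips any leading whitespace.
my @loose = split ' ', $line;       # ("ABCDE", "302")

# A limit caps the number of fields; the remainder stays in the last one.
my ($first, $rest) = split ' ', "ABCDE 302 extra text", 2;

print scalar(@strict), " vs ", scalar(@loose), " fields; rest = '$rest'\n";
```

If the data is padded with a variable number of blanks, split ' ' is usually the safer choice; for truly fixed columns that may themselves contain spaces, unpack is still the better fit.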