In Perl, I read in files from a directory, and I want to open them all simultaneously (but read them line by line) so that I can perform a function that uses all of their nth lines together (e.g. concatenation).
    my $text = `ls | grep ".txt"`;
    my @temps = split(/\n/, $text);
    my @files;
    for my $i (0..$#temps) {
        my $file;
        open($file, "<", $temps[$i]);
        push(@files, $file);
    }
    my $concat;
    for my $i (0..$#files) {
        my @blah = <$files[$i]>;
        $concat .= $blah;
    }
    print $concat;
I just get a bunch of errors: "use of uninitialized value" warnings and GLOB(...) errors. So how can I make this work?
A lot of issues, starting with the call to "ls | grep" :)
Let's start with some code. First, let's get a list of files:
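    # every name in the current directory that ends in ".txt"
    my @files = glob('*.txt');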
But it would be better to test whether a given name refers to a plain file or a directory:
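    # -f keeps only plain files, skipping directories and other oddities
    my @files = grep { -f } glob('*.txt');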
Now, let's open these files to read them:
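    # open every file for reading; @fhs collects the lexical filehandles
    # (error handling comes in a moment)
    my @fhs = map {
        open my $fh, '<', $_;
        $fh;
    } @files;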
But we need a way to handle errors - in my opinion, the best way is to add:
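    use autodie;    # makes open() (and friends) die with a useful message on failure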
At the beginning of the script (and install autodie first, if you don't have it yet). Alternatively, you can check each open by hand:
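    # e.g. inside the map block above:
    open my $fh, '<', $_ or die "Cannot open '$_': $!";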
Now that we have that, let's get the first line (as you showed in your example) from each of the inputs and concatenate them:
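    my $concatenated = '';
    for my $fh (@fhs) {
        my $line = <$fh>;       # read a single line from this handle
        $concatenated .= $line;
    }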
This is perfectly fine and readable, but it can still be shortened, while maintaining (in my opinion) readability, to:
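    # "scalar" makes <> read one line per handle instead of slurping everything
    my $concatenated = join '', map { scalar <$_> } @fhs;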
The effect is the same - $concatenated contains the first lines from all of the files.
So, the whole program would look like this:
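    #!/usr/bin/perl
    use strict;
    use warnings;
    use autodie;

    # list the *.txt entries that are plain files
    my @files = grep { -f } glob('*.txt');

    # open each one; autodie handles failures
    my @fhs = map {
        open my $fh, '<', $_;
        $fh;
    } @files;

    # concatenate the first line of every file
    my $concatenated = join '', map { scalar <$_> } @fhs;
    print $concatenated;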
Now, it might be that you want to concatenate not just the first lines, but all of them. In this situation, instead of the $concatenated = ... line, you'd need something like this:
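    # a sketch: take one line from each handle in turn, until all are exhausted
    my $concatenated = '';
    while ( my $fh = shift @fhs ) {
        my $line = <$fh>;
        if ( defined $line ) {
            $concatenated .= $line;
            push @fhs, $fh;    # this file still has lines - re-queue it
        } else {
            close $fh;         # this file is exhausted - drop it
        }
    }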
Here is your problem:
First, <$files[$i]> isn't a valid filehandle read. This is the source of your GLOB(...) errors. See mobrule's answer for why this is the case. So change it to this:
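    # readline() accepts an arbitrary expression, unlike <>
    my @blah = readline($files[$i]);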
Second problem: you're mixing @blah (an array named blah) and $blah (a scalar named blah). This is the source of your "uninitialized value" errors - $blah (the scalar) hasn't been initialized, but you're using it. If you want the $n-th line from @blah, use this:
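    $concat .= $blah[$n];    # element $n of @blah - Perl arrays are 0-based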
I don't want to keep beating a dead horse, but I do want to address a better way to do something: the ls | grep pipeline at the top of your script. It reads in a list of all files in the current directory that have a ".txt" extension in their names. This works, and is effective, but it can be rather slow - we have to call out to the shell, which has to fork off to run ls and grep, and that incurs a bit of overhead. Furthermore, ls and grep are simple and common programs, but they're not exactly portable. Surely there's a better way to do this:
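    # pure Perl: read the directory ourselves and filter it with a regex
    opendir my $dir, '.' or die "Can't open directory: $!";
    my @temps = grep { /\.txt/ } readdir $dir;
    closedir $dir;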
Simple, short, pure Perl, no forking, no non-portable shell commands, and we don't have to read in one big string and then split it - we only store the entries we really need. Plus, it becomes trivial to modify the conditions that files have to pass. Say we end up accidentally reading the file test.txt.gz because our regex matches it: we can easily change that line to:
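    my @temps = grep { /\.txt$/ } readdir $dir;    # anchored: ".txt" only at the very end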
We can do that one with grep (I believe), but why settle for grep's limited regular expressions when Perl has one of the most powerful regex libraries anywhere built in?
Use braces around $files[$i] inside the <> operator. Otherwise, Perl interprets <> as the file glob operator instead of the read-from-filehandle operator.
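For illustration, a minimal sketch - assigning the element to a plain scalar variable first likewise forces the read-from-filehandle interpretation:

    my $fh = $files[$i];    # a simple scalar variable...
    my @blah = <$fh>;       # ...so <> is now a readline, not a glob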
You've got some good answers already. Another way to tackle the problem is to create a list-of-lists containing all of the lines from the files (@content). Then use the each_arrayref function from List::MoreUtils, which will create an iterator that yields line 1 from all of the files, then line 2, and so on:
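A sketch of that approach (file names are taken from @ARGV here; the variable names are just illustrative):

    use strict;
    use warnings;
    use List::MoreUtils qw(each_arrayref);

    # A list-of-lists: one array of lines per file.
    my @content = map {
        open my $fh, '<', $_ or die "Can't open '$_': $!";
        [ <$fh> ];
    } @ARGV;

    my $iter = each_arrayref(@content);
    while ( my @nth_lines = $iter->() ) {
        # @nth_lines holds line N from every file (undef once a file runs out)
        print grep { defined } @nth_lines;
    }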