I have two text files containing the following:
FILE1.txt
dog
cat
antelope
FILE2.txt
1
2
Barry
The output I want to achieve is as follows:
dog1
dog2
dogBarry
cat1
cat2
catBarry
antelope1
antelope2
antelopeBarry
The way I have gone about it:
open (FILE1, "<File1.txt") || die $!;
open (FILE2, "<File2.txt") || die $!;
my @animals = (<FILE1>); #each line of the file into an array
my @otherStrings = (<FILE2>); #each line of the file into an array
close FILE1 || die $!;
close FILE2 || die $!;
my @bothTogether;
foreach my $animal (@animals) {
    chomp $animal;
    foreach my $otherString (@otherStrings) {
        chomp $otherString;
        push (@bothTogether, "$animal$otherString");
    }
}
print @bothTogether;
The way I have done it works, but I'm sure it is not the best way of going about it, especially when the files could both contain thousands of lines.
What would the best way of doing this be, to maybe use a hash?
Your approach will work fine for files with thousands of lines. That really isn't that big. For millions of lines, it might be a problem.
However, you could reduce the memory usage of your code by reading only one file into memory and by printing the results immediately instead of storing them in an array.
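A minimal sketch of that approach, reading only File2.txt into memory and printing each combination (with a newline added) as soon as it is built, might look like this:

use strict;
use warnings;

# Slurp only the payloads file into memory.
open my $payloads_fh, '<', 'File2.txt' or die $!;
chomp( my @other_strings = <$payloads_fh> );
close $payloads_fh or die $!;

# Walk the animals file line by line and print each combination
# immediately instead of collecting everything in an array.
open my $animals_fh, '<', 'File1.txt' or die $!;
while ( my $animal = <$animals_fh> ) {
    chomp $animal;
    print "$animal$_\n" for @other_strings;
}
close $animals_fh or die $!;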
With two huge files of equal size, this will use roughly 1/4 the memory of your original code.
Update: I also edited the code to include Simbabque's good suggestions for modernizing it.
Update 2: As others have noted, you could read neither file into memory, instead going through the payloads file line by line for each line of the animals file. However, that would be much slower and should be avoided unless absolutely necessary. The approach I have suggested will be about the same speed as your original code.
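For completeness, such a fully streaming variant might look roughly like this; File2.txt is reopened for every animal, which is exactly what makes it slow:

use strict;
use warnings;

open my $animals_fh, '<', 'File1.txt' or die $!;
while ( my $animal = <$animals_fh> ) {
    chomp $animal;

    # Re-open and re-read the payloads file for every single animal:
    # constant memory, but one extra pass over File2.txt per line of File1.txt.
    open my $payloads_fh, '<', 'File2.txt' or die $!;
    while ( my $other = <$payloads_fh> ) {
        chomp $other;
        print "$animal$other\n";
    }
    close $payloads_fh or die $!;
}
close $animals_fh or die $!;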
Besides certain Modern Perl aspects (two-argument open, for example), your code is pretty straightforward. The only improvement I can see is that you could move the inner chomp into an extra loop, or maybe do the chomping while you read the file. That would save some time. But all in all, if you want to do something with data for each row of some other data, you are doing it right.
You should use or die instead of || die because of precedence, and the final output will be one long line because there are no linebreaks left in the array's items (see the sketch below).
Update: @FrankB made a good suggestion in his comment above: if your files are huge and you are struggling with memory, you should not slurp them into the two arrays, but rather read and process the first one line by line, and open and read the second one for each of the first one's lines. That takes a lot longer, but saves a ton of memory. You would then also output the results directly instead of pushing them onto your results array.
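Folding the earlier points into the question's code (or die, three-argument open with lexical filehandles, chomping while reading, and keeping a linebreak on each item) might give something like this:

use strict;
use warnings;

open my $fh1, '<', 'File1.txt' or die $!;   # three-argument open, lexical filehandles
open my $fh2, '<', 'File2.txt' or die $!;

chomp( my @animals      = <$fh1> );         # chomp once, while reading,
chomp( my @otherStrings = <$fh2> );         # instead of inside the nested loops

close $fh1 or die $!;
close $fh2 or die $!;

my @bothTogether;
foreach my $animal (@animals) {
    foreach my $otherString (@otherStrings) {
        push @bothTogether, "$animal$otherString\n";   # keep a linebreak per item
    }
}
print @bothTogether;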
because of precedence, and the final output will be a long line because there are no more linebreaks in the array's items.Update: @FrankB made a good suggestion in his above comment: If your files are huge and you are struggling with memory you should not slurp them in and put them in the two arrays, but rather read and process the first one line by line, and open and read the second one for each of these first one's lines. That takes a lot longer, but saves up a ton of memory. You would then output the results directly as well instead of pushing them in your results array.