I am trying to process a text file in Perl. I need to store the data from the file in a database. The problem I'm having is that some fields contain newlines, which throws off my line-by-line parsing. What would be the best way to handle these fields?
Example data.txt file:
ID|Title|Description|Date
1|Example 1|Example Description|10/11/2011
2|Example 2|A long example description
Which contains
a bunch of newlines|10/12/2011
3|Example 3|Short description|10/13/2011
The current (broken) Perl script (example):
#!/usr/bin/perl -w
use strict;
open (MYFILE, 'data.txt');
while (<MYFILE>) {
    chomp;
    my ($id, $title, $description, $date) = split(/\|/);
    if ($id ne 'ID') {
        # processing certain fields (...)
        # insert into the database (example)
        $sqlInsert->execute($id, $title, $description, $date);
    }
}
close (MYFILE);
As you can see from the example, the record with ID 2 is split across several lines, which causes errors when the script references the resulting undefined variables. How would you group the lines back into the correct field?
Thanks in advance! (I hope the question was clear enough, difficult to define the title)
Keep reading the next line until you have the number of fields you need. Something like this (I haven't tested this code):
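A minimal sketch of that idea, not the answerer's exact code. It assumes every record has exactly four fields and that no field contains a literal `|`. The `read_records` helper name is hypothetical, and the sample data is read from an in-memory filehandle so the example is self-contained:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $EXPECTED_FIELDS = 4;    # ID|Title|Description|Date

# Join physical lines into logical records until each one has enough fields.
# Takes a filehandle and returns a list of array refs, one per record.
sub read_records {
    my ($fh) = @_;
    my @records;
    my $buffer = '';
    while (my $line = <$fh>) {
        $buffer .= $line;
        # A limit of -1 keeps trailing empty fields, so the count is accurate.
        my @fields = split /\|/, $buffer, -1;
        next if @fields < $EXPECTED_FIELDS;    # record continues on the next line
        chomp $fields[-1];                     # drop only the record's final newline
        push @records, [@fields];
        $buffer = '';
    }
    warn "Incomplete record at end of input\n" if length $buffer;
    return @records;
}

# Sample data from the question.
my $sample = join '',
    "ID|Title|Description|Date\n",
    "1|Example 1|Example Description|10/11/2011\n",
    "2|Example 2|A long example description\n",
    "Which contains\n",
    "a bunch of newlines|10/12/2011\n",
    "3|Example 3|Short description|10/13/2011\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

shift @records;    # discard the header row
for my $rec (@records) {
    my ($id, $title, $description, $date) = @$rec;
    # $sqlInsert->execute($id, $title, $description, $date);   # as in the question
    print "$id: description has ", length($description), " characters\n";
}
```

The embedded newlines are kept inside the description. If you would rather store the description on one line, you could replace them before the insert, for example with `$description =~ s/\n/ /g;`.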
I would just count the number of separators before splitting the line. If you don't have enough, read the next line and append it. The tr operator is an efficient way to count characters.

If you could change your data.txt file to include the pipe separator as the last character in every line/record, you could slurp in the whole file, splitting directly into the raw fields. This code would then do what you want:
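A minimal sketch of the slurp-and-split approach, not the answerer's exact code. It assumes the file has been changed so that every record ends with a trailing `|`, that no field contains a literal `|`, and that no field begins with a newline (the split pattern would swallow it). The data is inlined here so the example runs on its own:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $FIELDS_PER_RECORD = 4;    # ID|Title|Description|Date

# Assumed file contents: the question's data with "|" appended to each record.
my $data = join '',
    "ID|Title|Description|Date|\n",
    "1|Example 1|Example Description|10/11/2011|\n",
    "2|Example 2|A long example description\n",
    "Which contains\n",
    "a bunch of newlines|10/12/2011|\n",
    "3|Example 3|Short description|10/13/2011|\n";

# With a real file you would slurp it instead, for example:
#   my $data = do { local $/; open my $fh, '<', 'data.txt' or die $!; <$fh> };

# Split on "|" plus an optional newline. A newline that directly follows a
# pipe ends a record; a newline inside a description is left untouched.
my @fields = split /\|\n?/, $data, -1;
pop @fields if @fields && $fields[-1] eq '';    # empty piece after the last "|"

die "Field count is not a multiple of $FIELDS_PER_RECORD\n"
    if @fields % $FIELDS_PER_RECORD;

# Group the flat field list into records of four.
my @records;
while (my @rec = splice @fields, 0, $FIELDS_PER_RECORD) {
    push @records, [@rec];
}

shift @records;    # discard the header row
for my $rec (@records) {
    my ($id, $title, $description, $date) = @$rec;
    # $sqlInsert->execute($id, $title, $description, $date);   # as in the question
    print "$id: $title ($date)\n";
}
```

Slurping is simple but holds the whole file in memory, so for very large files the line-accumulating approach may be the safer choice.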