I'm writing a Perl script to run through and grab various data elements such as:
1253592000
1253678400 86400 6183.000000
1253764800 86400 4486.000000
1253851200 36.000000 86400 10669.000000
1253937600 0.000000 86400 9126.000000
1254024000 0.000000 86400 2930.000000
1254110400 0.000000 86400 2895.000000
1254196800 0.000000 8828.000000
I can grab each line of this text file no problem.
I have a working regex to grab each of those fields. Once I have the line in a variable, e.g. $line, how can I grab each of those fields and place them into their own variables even though they have different delimiters?
You can split the line. It appears that your delimiter is just whitespace? You can do something on the order of:
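A minimal sketch of that kind of split, assuming the line is already in $line. The sample value is one line from the question's data, and the array name @line is chosen only to match the $line[0] notation used in this answer:

```perl
use strict;
use warnings;

# Sample line from the question's data (assumed to be already read into $line).
my $line = "1253851200 36.000000 86400 10669.000000";

# A single-space string is split's special case: it skips leading
# whitespace and splits on any run of whitespace.
my @line = split ' ', $line;

# Bounds check before touching individual fields.
if (@line >= 4) {
    print "time=$line[0] value=$line[3]\n";
}
```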
This will match all whitespace. You can then do bounds checking and access each field via $line[0], $line[1], etc.
split can also take a regular expression rather than a string as its delimiter.
This might do the same thing.
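A sketch that puts the regex form of split next to unpack on a fixed-column layout. The column widths (11, 10, 6 and 12 characters) and the padded sample lines are assumptions made for illustration; the real file's layout cannot be recovered from the question as posted.

```perl
use strict;
use warnings;

# Hypothetical fixed-column layout: widths 11, 10, 6 and 12 are assumed.
my $template = 'A11 A10 A6 A12';
my $layout   = '%-11s%-10s%-6s%-12s';

# Two sample lines in that layout; the second has an empty column 2.
my $full    = sprintf $layout, '1253851200', '36.000000', '86400', '10669.000000';
my $missing = sprintf $layout, '1253678400', '',          '86400', '6183.000000';

# Regex delimiter: one or more whitespace characters.
my @by_split = split /\s+/, $missing;

# unpack slices by position; 'A' strips trailing spaces from each field.
my @by_unpack   = unpack $template, $missing;
my @full_fields = unpack $template, $full;

print scalar(@full_fields), " fields from unpack on the full line\n";
print scalar(@by_split),    " fields from split on the short line\n";
print scalar(@by_unpack),   " fields from unpack on the short line\n";
print "column 2 is empty\n" if $by_unpack[1] eq '';
```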
This example illustrates how to parse the line either with whitespace as the delimiter (split) or with a fixed-column layout (unpack). With unpack, if you use the upper-case A format (A10 etc), trailing whitespace will be removed for you. Note: as brian d foy points out, the split approach does not work well for a situation with missing fields (for example, the second line of data), because the field position information will be lost; unpack is the way to go here, unless we are misunderstanding your data.

I'm unsure of the column names and formatting, but you should be able to adjust this recipe to your liking using Text::FixedWidth
Use my module DataExtract::FixedWidth. It is the most full-featured and well-tested module for working with fixed-width columns in Perl. If this isn't fast enough, you can pass in an unpack_string and eliminate the need for heuristic detection of boundaries.

If all fields have the same fixed width and are formatted with spaces, you can use the following split:

where N is the width of the field. This will yield a space for each empty field.

Fixed width delimiting can be done like this:
My Perl is very rusty, so I am sure there are syntax errors there, but that is the gist of it.
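To make that gist concrete, here is a sketch of fixed-width slicing with substr. The shared field width of 12 and the sample line are assumptions for illustration, not the layout of the file in the question.

```perl
use strict;
use warnings;

my $width = 12;   # assumed width shared by every field

# Hypothetical line with three 12-character fields; the middle one is blank.
my $line = sprintf '%-12s%-12s%-12s', '1253678400', '', '6183.000000';

my @fields;
for (my $pos = 0; $pos < length($line); $pos += $width) {
    my $field = substr($line, $pos, $width);   # slice by position
    $field =~ s/^\s+|\s+$//g;                  # trim the padding
    push @fields, $field;
}

print join('|', @fields), "\n";   # prints 1253678400||6183.000000
```

Because the slices are taken by position, the blank middle field stays in its slot as an empty string instead of disappearing.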