Using Perl to split a line that may contain whitespace

Posted 2019-04-19 20:41

Question:

Okay, so I'm using Perl to read in a file that contains some general configuration data. The data is organized under section headers based on what it means. An example follows:

[vars]

# This is how we define a variable!
$var = 10;
$str = "Hello thar!";


# This section contains flags which can be used to modify module behavior
# All modules read this file and if they understand any of the flags, use them
[flags] 
  Verbose =       true; # Notice the errant whitespace!

[path]
WinPath = default; # Keyword which loads the standard PATH as defined by the operating system. Append with additional values.
LinuxPath = default;

Goal: Using the first variable line, "$var = 10;", as an example, I'd like to use Perl's split function to create an array that contains the strings "$var" and "10" as elements. Using another line as an example:

    Verbose    =         true;
    # Should become [Verbose, true] aka no whitespace is present

This is needed because I will be outputting these values to a new file (which a different piece of C++ code will read) to instantiate dictionary objects. Just to give you a little taste of what it might look like (just making it up as I go along):

define new dictionary
name: [flags]
# Start defining keys => values
new key name: Verbose
new value val: 10 
# End dictionary

Oh, and here is the code I currently have along with what it is doing (incorrectly):

sub makeref($)
{
    my @line = (split (/=/)); # Produces ["Verbose", "    true"];
}

To answer one question: I am not using Config::Simple because I originally did not know what my configuration file would look like, only what I wanted it to do. I made the format up as I went along, using whatever seemed sensible to me, and used Perl to parse it.

The problem is that I have some C++ code that will load the information in the config file, but since parsing in C or C++ is painful, I decided to use Perl. It's also a good learning exercise for me since I am new to the language. So this Perl code is not really a part of my application; it just makes it easier for the C++ code to read the information, and it keeps both the config file and the generated file more readable. Thanks for the feedback, it really helped.

Answer 1:

If you're doing this parsing as a learning exercise, that's fine. However, CPAN has several modules that will do a lot of the work for you.

use Config::Simple;
Config::Simple->import_from( 'some_config_file.txt', \my %conf );


Answer 2:

split splits on a regular expression, so you can simply put the whitespace around the = sign into its regex:

my @fields = split(/\s*=\s*/, $line);
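
To see what this split does and does not remove, here is a small sketch using the `Verbose` line from the question (the variable names are only illustrative). The whitespace around the `=` disappears, but leading whitespace on the first field and the trailing `;` on the value are still there, which is why some trimming of the line is still needed.

```perl
use strict;
use warnings;

# Sample line taken from the question's [flags] section (comment omitted).
my $line = "  Verbose =       true;";

# Split on '=' together with any whitespace on either side of it.
# The limit of 2 keeps any later '=' inside the value intact.
my @fields = split /\s*=\s*/, $line, 2;

# Brackets make any remaining whitespace visible.
print "[$_]\n" for @fields;
```

This prints `[  Verbose]` and `[true;]`, so only the whitespace next to the `=` has been consumed.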

You obviously do not want to remove all whitespace, or a line like this would be produced (the whitespace inside the string is lost):

$str="Hellothar!";

I guess that only removing whitespace from the beginning and end of the line is sufficient:

$line =~ s/^\s*(.*?)\s*$/$1/;

A simpler alternative with two statements:

$line =~ s/^\s+//;
$line =~ s/\s+$//;
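
Putting the pieces of this answer together might look like the sketch below. `parse_pair` is a hypothetical helper name, and the handling of the trailing `;` and `#` comment is an assumption based on the sample file in the question, not something the original code does.

```perl
use strict;
use warnings;

# parse_pair is a hypothetical helper name, not something from the question.
sub parse_pair {
    my ($line) = @_;

    # Assumption: a value ends at ';' and may be followed by a '#' comment.
    $line =~ s/;\s*(?:#.*)?$//;

    # Trim both ends of the line, as in the two statements above.
    $line =~ s/^\s+//;
    $line =~ s/\s+$//;

    # Split on the first '=' and the whitespace around it.
    return split /\s*=\s*/, $line, 2;
}

my ($key, $value) = parse_pair('  Verbose =       true; # Notice the errant whitespace!');
print "$key => $value\n";

my ($name, $string) = parse_pair('$str = "Hello thar!";');
print "$name => $string\n";
```

Because only the ends of the line and the area around the `=` are touched, the space inside `"Hello thar!"` survives.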


Answer 3:

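Seems like you've got it. Strip the whitespace before splitting.

sub makeref($)
{
    my ($line) = @_;                  # work on the argument rather than $_
    $line =~ s/\s+//g;                # removes ALL whitespace, even inside quoted values
    my @line = split(/=/, $line);     # "Verbose = true" gives ("Verbose", "true")
    return @line;
}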


Answer 4:

This code does the trick (and is more efficient without reversing).

for (@line) {
    s/^\s+//;
    s/\s+$//;
}
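
The loop works because `for` aliases `$_` to each element of `@line`, so the substitutions change the array in place. A short sketch, assuming `@line` holds the result of a plain `split /=/` on one of the question's sample lines:

```perl
use strict;
use warnings;

# Assumed input: the pieces left over after splitting on a bare '='.
my @line = split /=/, "  Verbose =       true";

# $_ is an alias for each element, so these edits modify @line itself.
for (@line) {
    s/^\s+//;    # drop leading whitespace
    s/\s+$//;    # drop trailing whitespace
}

print join('|', @line), "\n";    # prints "Verbose|true"
```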


Answer 5:

You probably have it all figured out, but I thought I'd add a little. If you do this:

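sub makeref($)
{
   my ($text) = @_;                 # the line to split
   my @line = split(/=/, $text);
   foreach (@line)
   {
      s/^\s+//;                     # trim leading whitespace (no /g needed with an anchor)
      s/\s+$//;                     # trim trailing whitespace
   }
   return @line;
}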

then you will remove the whitespace before and after both the left and right side. That way something like:

 this is a parameter         =      all sorts of stuff here

will not have crazy spaces.
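
Applied to the whole file, the same split-then-trim idea could be sketched as follows. `parse_config` is a hypothetical name, and the rules for `[section]` headers, `#` comments, and the `;` terminator are assumptions read off the sample file in the question rather than a defined format.

```perl
use strict;
use warnings;

# parse_config is a hypothetical name; it takes the config lines as a list.
sub parse_config {
    my %config;
    my $section = '';    # assumption: pairs before any [header] go under ''

    for my $line (@_) {
        next if $line =~ /^\s*(?:#|$)/;         # skip comments and blank lines

        if ($line =~ /^\s*\[([^\]]+)\]\s*$/) {  # a header such as [flags]
            $section = $1;
            next;
        }

        # Split on '=', then trim both sides as in the loop above.
        my @pair = split /=/, $line, 2;
        next unless @pair == 2;                 # ignore lines without '='
        for (@pair) {
            s/^\s+//;
            s/\s+$//;
        }
        $pair[1] =~ s/\s*;\s*(?:#.*)?$//;       # assumed ';' terminator and comment
        $config{$section}{ $pair[0] } = $pair[1];
    }
    return %config;
}

my %config = parse_config(
    '[flags] ',
    '  Verbose =       true; # Notice the errant whitespace!',
    '[path]',
    'LinuxPath = default;',
);
print "$config{flags}{Verbose}\n";    # prints "true"
```

A hash of hashes keyed by section name keeps each `[header]` separate, which lines up with the one-dictionary-per-section output described in the question.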

!!Warning: I probably don't know what I'm talking about!!