Convert string with sed

Posted 2019-08-16 07:54

Question:

I have a couple lines in my input where I am initializing structs. For example:

head = (struct node) {5, NULL};

I need to convert these lines into the following:

init_node( &head, 5, NULL);

That is, any time I see a line like the following in the input file:

something1 = (struct something2){ something3, something4};

I need to convert it to:

init_something2( &something1, something3, something4);

I think I need to use sed here. Can someone help? Thanks.

Answer 1:

Sed would probably work too, but here's something that works with perl:

perl -pe 's|(\w+) = \(struct node\) \{(.*), (.*)\};|init_node( &$1, $2, $3);|'

Notice that I'm capturing each "something" with a parenthesized group in the match and then retrieving them with $1, $2, etc. in the substitution (Perl also accepts \1, \2 there, but $1 is the preferred form). That's the only part you really need to know. Hopefully you can figure out how to make either expression flexible enough to fit your actual data (unless you miraculously have a consistent style for every single line).



Answer 2:

I would do:

sed -e 's/\s*\([_a-zA-Z][0-9a-zA-Z_]*\)\s*=\s*(\s*struct\s*\([_a-zA-Z][0-9a-zA-Z_]*\)\s*)\s*{\s*\([^}]*\)}\s*;/init_\2( \&\1, \3);/' -i your_file.c

Explaining the Crazy RegExp:

First: \s* skips zero or more whitespace characters (so the match becomes more flexible). Note that \s is a GNU sed extension.

Second: using \( \), we grab a C identifier, which (please someone correct me if I'm wrong) can start with a letter or an underscore and can contain letters, digits, and underscores: [_a-zA-Z][0-9a-zA-Z_]*.

Third: skip an equals sign surrounded by zero or more whitespace characters, then an open parenthesis followed by optional whitespace, then the keyword struct followed by optional whitespace.

Fourth: grab another identifier (the struct name).

Fifth: skip a close parenthesis surrounded by zero or more whitespace characters, then an open brace followed by optional whitespace.

Sixth: grab everything before the close brace (beware of this! the initializer list can't contain an expression that itself has braces in it).

Seventh: skip the close brace, then optional whitespace, then a semicolon.

Finally: rearrange what was grabbed =)
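The steps above can be sketched as a small shell function that builds the pattern from named pieces, which may be easier to read than the one-liner. This is only an illustration under some assumptions: the names WS, SP, IDENT, and convert_struct_init are made up here, [[:space:]] is used in place of \s so the pattern does not depend on GNU sed, and the leading whitespace skip is left out so that indentation is preserved.

```shell
# Hypothetical helper names; not part of the original answer.
WS='[[:space:]]*'                 # zero or more whitespace characters
SP='[[:space:]][[:space:]]*'      # one or more whitespace characters
IDENT='[_a-zA-Z][0-9a-zA-Z_]*'    # a C identifier

# Reads C source on stdin and rewrites lines of the form
#   var = (struct name) { a, b, ... };
# into
#   init_name( &var, a, b, ...);
convert_struct_init() {
  sed "s/\(${IDENT}\)${WS}=${WS}(${WS}struct${SP}\(${IDENT}\)${WS})${WS}{${WS}\([^}]*\)}${WS};/init_\2( \&\1, \3);/"
}

# Example: the indented sample line from the question.
printf '%s\n' '    head = (struct node) {5, NULL};' | convert_struct_init
# prints:     init_node( &head, 5, NULL);
```

Lines that do not look like a struct initialization pass through unchanged, so the function can be fed a whole file on stdin.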

EDIT: Notice that the "&" in the replacement must be escaped: "\&". If it isn't, sed replaces it with the entire matched text.
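A tiny demonstration of that difference, as a sketch (the function names amp_unescaped and amp_escaped are made up for this example):

```shell
# Unescaped "&" in the replacement stands for the whole matched text,
# so the match "head" is emitted twice.
amp_unescaped() {
  sed 's/head/&head/'
}

# "\&" is a literal ampersand, which is what C's address-of operator needs.
amp_escaped() {
  sed 's/head/\&head/'
}

printf 'head\n' | amp_unescaped   # prints: headhead
printf 'head\n' | amp_escaped     # prints: &head
```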

EDIT2: Thanks to Jonathan for the observation related to how to include an arbitrary number of initializers (with commas)

Hope this helps,

Janito



Answer 3:

Regular expressions will be your friend here (assuming your input is a consistent format).

The expression \([a-zA-Z]*\) = (struct \([a-zA-Z]*\)) {\([a-zA-Z0-9, ]*\)}; should model your input string. Using these capture groups, the expression init_\2( \&\1, \3); should generate your desired output string. Putting these together, the following sed command should do what you need:

sed -e 's/\([a-zA-Z]*\) = (struct \([a-zA-Z]*\)) {\([a-zA-Z0-9, ]*\)};/init_\2( \&\1, \3);/g'

This assumes that your struct and variable names only consist of upper- and lower-case letters (I kept it simple to try to prevent the example from becoming too wide for the page). If they contain other characters, you'll need to adjust the expressions accordingly.
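To see why the character classes matter, here is a rough sketch comparing the letters-only expression with one widened to C identifiers. The function names are made up for this example, and the widened classes are just one possible adjustment. With letters only, a name such as my_head is not skipped: the match starts at "head", so the line is rewritten incorrectly.

```shell
# The command above, unchanged: names may contain letters only.
letters_only() {
  sed 's/\([a-zA-Z]*\) = (struct \([a-zA-Z]*\)) {\([a-zA-Z0-9, ]*\)};/init_\2( \&\1, \3);/g'
}

# One possible adjustment: allow underscores and digits in names.
with_underscores_and_digits() {
  sed 's/\([_a-zA-Z][_a-zA-Z0-9]*\) = (struct \([_a-zA-Z][_a-zA-Z0-9]*\)) {\([_a-zA-Z0-9, ]*\)};/init_\2( \&\1, \3);/g'
}

printf '%s\n' 'my_head = (struct node) {5, NULL};' | letters_only
# prints: my_init_node( &head, 5, NULL);   (wrong: "my_" is left behind)

printf '%s\n' 'my_head = (struct node) {5, NULL};' | with_underscores_and_digits
# prints: init_node( &my_head, 5, NULL);
```

Even the widened version only handles plain variable names on the left-hand side; something like p->head would still need a different expression.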



Answer 4:

A more general sed match:

sed -e 's/\([a-zA-Z0-9]*\)\s*=\s*(\s*struct\s\([a-zA-Z0-9]*\)\s*)\s*{\s*\([a-zA-Z0-9]*\)\s*,\s*\([a-zA-Z0-9]*\)\s*}\s*;/init_\2( \&\1, \3, \4);/g'

This would match expressions like:

  • something=( struct something2) {something3,something4};
  • something = (struct something2) { something3 , something4 };

etc.
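A quick way to check this is to feed both spellings through the substitution and compare the results. The sketch below assumes a few things: the names S, W, and normalize_two_field_init are made up, and [[:space:]] replaces \s so it does not rely on GNU sed. Note that the pattern expects exactly two initializers, so a line with three is left untouched.

```shell
S='[[:space:]]*'     # zero or more whitespace characters
W='[a-zA-Z0-9]*'     # the "word" class used above (no underscore)

# Rewrites "var = (struct name) { a, b };" with any spacing
# into "init_name( &var, a, b);".
normalize_two_field_init() {
  sed "s/\(${W}\)${S}=${S}(${S}struct[[:space:]]\(${W}\)${S})${S}{${S}\(${W}\)${S},${S}\(${W}\)${S}}${S};/init_\2( \&\1, \3, \4);/g"
}

printf '%s\n' 'something=( struct something2) {something3,something4};' | normalize_two_field_init
printf '%s\n' 'something = (struct something2) { something3 , something4 };' | normalize_two_field_init
# both print: init_something2( &something, something3, something4);
```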



Tags: shell unix