I regularly use regex to transform text.
To transform giant text files from the command line, Perl lets me do this:
perl -pe < in.txt > out.txt
But this is inherently on a line-by-line basis. Occasionally, I want to match on multi-line things.
How can I do this on the command line?
To slurp a file instead of processing it line by line, use the -0777 switch, which is documented in perlrun #Command Switches.
Obviously, for large files this may not work well, in which case you'll need to code some type of buffer to do this replacement. We can't advise any better, though, without real information about your intent.
Grepping across line boundaries
So you want to grep across line boundaries...
You quite possibly already have pcregrep installed. As you may know, PCRE stands for Perl-Compatible Regular Expressions, and the library is definitely Perl-style, though not identical to Perl.
To match across multiple lines, you have to turn on the multi-line mode -M, which is not the same as (?m).
Running
pcregrep -M "(?s)^b.*\d+" text.txt
On this text file:
The output will be
whereas grep would return empty.
Excerpt from the doc: