Input file

aaa
Any--END--Pattern
bbb
ANY--BEGIN--PATTERN
ccc                   # do not print
ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
eee
fff
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8
iii                   # do not print
ANY--BEGIN--PATTERN
jjj

Wanted output

ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8

Notes

Print from the most recent ANY--BEGIN--PATTERN preceding the current Any--END--Pattern.
If no new ANY--BEGIN--PATTERN is met, keep printing up to the last Any--END--Pattern.

Many similar questions, but none answers this issue

The answers I have tested from those questions either print the line ccc and/or the line iii, or do not print the lines containing the BEGIN and END patterns. My own attempts have the same defects.

We could write a ten-line script, but I am sure there is an elegant one-line command that solves this issue; I just cannot find it. Therefore I think this could be a good SO question ;-)

I wonder which tricks sed, awk, perl or any other tool readily available on Unix-like systems offer here. Please provide a tiny command line using bash, grep, sed, awk, perl or any other tool you think fits.

EDIT:

Just to highlight the pretty simple command line from Sundeep's comment, which simplifies the problem by reversing the input file:

tac input.txt | sed -n '/END/,/BEGIN/p' | tac

But this command line also prints the beginning of the file (this case may not arise for other users looking at a similar issue):

aaa
Any--END--Pattern
ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8
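
One possible way to drop those leading lines, offered as a sketch rather than a definitive answer: the unwanted part can only come from the unterminated END..BEGIN range at the end of the reversed stream, so a final sed that starts printing at the first BEGIN line should remove it. The function name print_blocks is made up for illustration, and tac is assumed to be available (GNU coreutils).

```shell
# print_blocks is a hypothetical helper name, not part of the original comment.
# Assumes tac (GNU coreutils) is installed and BEGIN/END are the literal markers.
print_blocks() {
  tac "$1" |
    sed -n '/END/,/BEGIN/p' |   # on reversed input: from each END back to the nearest BEGIN
    tac |                       # restore the original line order
    sed -n '/BEGIN/,$p'         # skip everything before the first BEGIN line
}
```

Called as print_blocks input.txt, this prints nothing at all when the file contains no BEGIN line.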

(This answer is used within these C++ coding rules)

Tags: bash perl shell awk sed

3 answers

骚年 ilove

#2 · 2019-09-03 16:44

awk to the rescue!

$ awk '/BEGIN/{c=0; b=1} 
              {a[c++]=$0} 
      b&&/END/{for(i=0;i<c;i++) print a[i]; delete a; c=0}' file

ANY--BEGIN--PATTERN   # print 1
ddd                   # print 2
Any--END--Pattern     # print 3
ANY--BEGIN--PATTERN   # print 4
ggg                   # print 5
Any--END--Pattern     # print 6
hhh                   # print 7
Any--END--Pattern     # print 8
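
For readers who want to follow the logic, here is the same idea spelled out with comments. This is only my reading of the one-liner above: the function name print_blocks_awk and the variable names buf, n and seen_begin are made up for illustration, and the explicit delete is dropped because resetting the counter is enough.

```shell
# print_blocks_awk is a hypothetical wrapper; the awk program mirrors the one-liner above.
print_blocks_awk() {
  awk '
    /BEGIN/ { n = 0; seen_begin = 1 }   # a new BEGIN discards whatever was buffered so far
            { buf[n++] = $0 }           # buffer every line, BEGIN and END lines included
    seen_begin && /END/ {               # an END seen after at least one BEGIN flushes the buffer
      for (i = 0; i < n; i++) print buf[i]
      n = 0                             # restart with an empty buffer; old slots get overwritten
    }
  ' "$@"
}
```

Because the buffer is restarted after each flush rather than at the next BEGIN, lines that follow an END are still printed if another END arrives first, which is what produces "print 7" and "print 8" in the sample.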


叛逆

#3 · 2019-09-03 17:02

Perl to the rescue!

#!/usr/bin/perl
use warnings;
use strict;

my $last_end;
my @buffer;
while (<>) {
    if (/BEGIN/) {

        print @buffer[ 0 .. $last_end ] if defined $last_end;

        @buffer = $_;
        undef $last_end;
        next;
    }
    $last_end = @buffer if @buffer && /END/;
    push @buffer, $_ if @buffer;
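}
# Flush the final block when no BEGIN follows the last END.
print @buffer[ 0 .. $last_end ] if defined $last_end;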

@buffer accumulates the lines from BEGIN, $last_end points to, well, the last END in the buffer, so you can throw away accumulated lines that don't end in an END.

As a one-liner (but why?):

perl -ne 'defined $l && print(@B[0..$l]), (@B, $l) = $_, next if /BEGIN/; $l=@B if @B && /END/; push @B, $_ if @B; END { print @B[0..$l] if defined $l }' file


我想做一个坏孩纸

#4 · 2019-09-03 17:06

This should work with sed

sed '$b1;/BEGIN/{:1;x;s/\(BEGIN.*END[^\n]*\).*/\1/;t;x;h};H;d' file


