How to print 10 letters preceding every occurrence

Posted 2019-07-22 18:42

Using grep, I can print all occurrences of the uppercase letter "Z" in my document. The output, however, shows the entire line in which each "Z" was found. I need to limit this to printing only the 10 letters that appear before every occurrence of "Z".

E.g., if the document has a line with "AAAABBBBBBBBBCCCCCCDDDDDDDZ", it will print "CCCDDDDDDD", the 10 letters appearing before the "Z".

  • If there are fewer than 10 letters prior to "Z", then nothing needs to be printed.
  • If "Z" appears multiple times in a single line, the 10 letters preceding each of these "Z"s should be printed, e.g.: "AAAABBBBBBBBBZCCCCCDDDDDDDZ" will print "ABBBBBBBBB" and "CCCDDDDDDD".

The result will be an output list of these letters, e.g.:

ABBBBBBBBB
CCCDDDDDDD

How can I print the 10 letters preceding every occurrence of the letter "Z" in my document?

Tags: perl bash grep
2 answers
▲ chillily
#2 · 2019-07-22 19:18

Simple:

grep -oP '.{10}(?=Z)' <<< AAAABBBBBBBBBZCCCCCDDDDDDDZ

Explanation:

-o     : Print only match, not entire line
-P     : Use PCRE / Perl regex
.{10}  : Match is any 10 characters,
(?=Z)  : which are followed by "Z". (Search for "positive lookahead" for more details)
<<< ...: Here string

EDIT:

NOTE: This does not work if the 10-character windows overlap, e.g. for the input AAAABBBBBBBBBZDDDDDDDZ, where only the first window is printed. If the input contains such patterns, see igegami's answer.
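
To make the overlap limitation concrete, here is a minimal sketch that runs the same pattern on both inputs. The helper name grep_before_z is made up for illustration, and the sketch assumes GNU grep built with PCRE support (-P); printf is used instead of a here-string only so it also runs in plain sh.

```shell
# Hypothetical helper (the name is illustrative, not from the answer):
# run the answer's pattern against a single string.
# Assumes GNU grep built with PCRE support (-P).
grep_before_z() {
    printf '%s\n' "$1" | grep -oP '.{10}(?=Z)'
}

# Non-overlapping windows: both 10-character runs are reported.
grep_before_z 'AAAABBBBBBBBBZCCCCCDDDDDDDZ'

# Overlapping windows: the second run would have to start inside the
# first match, and grep -o resumes scanning after each match,
# so the second run is never reported.
grep_before_z 'AAAABBBBBBBBBZDDDDDDDZ'
```

The first call should print ABBBBBBBBB and CCCDDDDDDD; the second prints only ABBBBBBBBB, although BBZDDDDDDD also precedes a "Z".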

Emotional °昔
#3 · 2019-07-22 19:33
$ perl -nE'say for /(?<=(.{10}))Z/g' <<'__EOI__'
AAAABBBBBBBBBZCCCCCDDDDDDDZ
AAAABBBBBBBBBZDDDDDDDZ
__EOI__
ABBBBBBBBB
CCCDDDDDDD
ABBBBBBBBB
BBZDDDDDDD

or

$ perl -nE'say for /(?=(.{10})Z)/g' <<'__EOI__'
AAAABBBBBBBBBZCCCCCDDDDDDDZ
AAAABBBBBBBBBZDDDDDDDZ
__EOI__
ABBBBBBBBB
CCCDDDDDDD
ABBBBBBBBB
BBZDDDDDDD