Print only the strings that match a pattern

Posted 2019-08-26 06:30

Question:

I have a file that contains words such as

abciuf.com abdbhj.co.in abcshjkl.org.in.2 abciuf zasdg cbhjk asjk 

among other content. I only need the words that start with abci, abdb, abcs, or abai, so the output should look like this: abciuf.com abdbhj.co.in abcshjkl.org.in.2 abciuf

I tried the grep command, but it doesn't give me the result I want:

cat /etc/xyz.txt|egrep -o "abdb*|abci*|abcs*|abai*"
cat /etc/xyz.txt|egrep -Eow "abdb*|abci*|abcs*|abai*"

Answer 1:

grep -Eo '\<(abdb|abci|abcs|abai)\S*' </etc/xyz.txt
  • \< (or \b) matches start of "word" (or a "word" boundary)
  • (A|B) matches A or B
  • \S* matches zero or more non-whitespace characters (that is, up to the next whitespace character or the end of the line)

  • it was a good idea to try grep's -w option, but its definition of a "word" (letters, digits, and underscores) is too strict for these tokens: a . already counts as the end of a word

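  • the meaning of * in the shell is not the same as in grep: in a shell glob * matches any string, but in a regexp it means "zero or more of the preceding item", so abci* matches abc followed by any number of i
  • you can make the regexp shorter, for example \<ab(db|c[is]|ai)\S*, but it becomes harder to read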
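
The points above can be checked with a small sketch. It assumes GNU grep (for \< and \S) and uses the sample line from the question in place of /etc/xyz.txt; the variable names are only illustrative.

```shell
# Sample line from the question (stands in for /etc/xyz.txt)
input='abciuf.com abdbhj.co.in abcshjkl.org.in.2 abciuf zasdg cbhjk asjk'

# Original attempt: "abci*" is "abc" followed by zero or more "i",
# so -o prints only the short prefixes, not the whole tokens.
prefixes=$(printf '%s\n' "$input" | grep -Eo 'abdb*|abci*|abcs*|abai*')

# Corrected pattern: start of word, one of the four prefixes,
# then everything up to the next whitespace character.
words=$(printf '%s\n' "$input" | grep -Eo '\<(abdb|abci|abcs|abai)\S*')

printf 'original attempt:\n%s\n\ncorrected pattern:\n%s\n' "$prefixes" "$words"
```

The first command prints four-character fragments such as abci, while the second prints the complete tokens, one per line.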


Answer 2:

You can also try Perl:

 perl -ne ' while(/(\b(abdb|abci|abcs|abai)\S+)/g) { print "$1 \n" } '

With your input:

$ cat sin15.txt
abciuf.com abdbhj.co.in abcshjkl.org.in.2 abciuf zasdg cbhjk asjk

$ perl -ne ' while(/(\b(abdb|abci|abcs|abai)\S+)/g) { print "$1 \n" } ' sin15.txt
abciuf.com
abdbhj.co.in
abcshjkl.org.in.2
abciuf

$
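
One detail worth noting: this pattern ends in \S+ where the grep answer uses \S*. The difference only shows up when a token is exactly one of the prefixes. The sketch below illustrates it with GNU grep rather than Perl (an assumption made to keep the example to a single tool; the quantifiers behave the same way here), on a made-up two-word input.

```shell
# Hypothetical input: the first token is exactly the prefix "abci"
input='abci abciuf'

# \S+ requires at least one more non-whitespace character after the prefix,
# so the bare "abci" is skipped.
plus=$(printf '%s\n' "$input" | grep -Eo '\<(abdb|abci|abcs|abai)\S+')

# \S* also accepts the prefix on its own.
star=$(printf '%s\n' "$input" | grep -Eo '\<(abdb|abci|abcs|abai)\S*')

printf 'with \\S+:\n%s\n\nwith \\S*:\n%s\n' "$plus" "$star"
```

If bare prefixes should count as matches, \S* is probably the safer choice; for the sample data in the question both give the same result.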


Answer 3:

With GNU awk for multi-char RS and RT:

$ awk -v RS='\\<(abdb|abci|abcs|abai)\\S*' 'RT{print RT}' file
abciuf.com
abdbhj.co.in
abcshjkl.org.in.2
abciuf
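
If GNU awk is not available, a rough equivalent can be sketched with plain POSIX awk by looping over the whitespace-separated fields. This is an alternative technique, not the RS/RT approach above, and it differs slightly: it only accepts tokens that begin with a prefix, whereas \< would also match a prefix that follows punctuation inside a token.

```shell
# Sample line from the question (stands in for the real file)
input='abciuf.com abdbhj.co.in abcshjkl.org.in.2 abciuf zasdg cbhjk asjk'

# Visit every field on every line and print the ones that begin
# with one of the four prefixes.
portable=$(printf '%s\n' "$input" |
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^(abdb|abci|abcs|abai)/) print $i }')

printf '%s\n' "$portable"
```

On the sample line this prints the same four tokens as the GNU awk command.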


Tags: awk sed grep