-->

Can grep show only words that match search pattern

Posted 2019-01-01 01:25

看风景的人

Question:

Is there a way to make grep output "words" from files that match the search expression?

If I want to find all the instances of, say, "th" in a number of files, I can do:

grep "th" *

but the output will be something like:

some-text-file : the cat sat on the mat  
some-other-text-file : the quick brown fox  
yet-another-text-file : i hope this explains it thoroughly

What I want it to output, using the same search, is:

the
the
the
this
thoroughly

Is this possible using grep? Or using another combination of tools?

Answer 1:

Try grep -o:

grep -oh "\w*th\w*" *

Edit: matching from Phil's comment


From the docs:

-h, --no-filename
    Suppress the prefixing of file names on output. This is the default
    when there is only one file (or only standard input) to search.
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.
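
As a rough check of the two flags quoted above, the sketch below recreates the three sample files from the question in a scratch directory and runs the same command against them. The directory and the variable names are illustrative assumptions, and `\w` relies on GNU grep.

```shell
# Scratch directory and file contents mirror the question; names are illustrative.
workdir=$(mktemp -d)
printf '%s\n' 'the cat sat on the mat' > "$workdir/some-text-file"
printf '%s\n' 'the quick brown fox' > "$workdir/some-other-text-file"
printf '%s\n' 'i hope this explains it thoroughly' > "$workdir/yet-another-text-file"

# -o prints each match on its own line; -h drops the "filename:" prefix.
# \w (a word character) is a GNU grep extension.
words=$(grep -oh "\w*th\w*" "$workdir"/*)
printf '%s\n' "$words"

# Clean up the scratch files.
rm -rf "$workdir"
```

Without -h, each printed word would carry its file name as a prefix, because more than one file is searched.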

Answer 2:

Cross-distribution safe answer (including Windows MinGW?)

grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"

If you're using an older version of grep (like 2.4.2) that does not include the -o option, use the above. Otherwise, use the simpler, easier-to-maintain version below.

Linux cross-distribution safe answer

grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'

To summarize, -oh outputs the regular expression matches from the file content (and not the file name), just as you would expect a regular expression to work in vim, etc. What word or regular expression you search for is up to you, as long as you stick to POSIX and not Perl syntax (refer below).
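
To see that the two variants agree on simple input, here is a small sketch that runs both against a temporary file. The file contents and the variable names are assumptions made for the illustration.

```shell
# Temporary sample file; contents are illustrative.
sample=$(mktemp)
printf '%s\n' 'the cat sat on the mat' 'i hope this explains it thoroughly' > "$sample"

# Without -o: keep matching lines, split them on spaces, then filter the tokens.
without_o=$(grep -h "[[:alpha:]]*th[[:alpha:]]*" "$sample" | tr ' ' '\n' | grep "[[:alpha:]]*th[[:alpha:]]*")

# With -o: grep extracts the matching parts directly.
with_o=$(grep -oh "[[:alpha:]]*th[[:alpha:]]*" "$sample")

printf '%s\n' "$without_o"

# Clean up the temporary file.
rm -f "$sample"
```

The two are not strictly equivalent: the tr-based pipeline prints whole space-separated tokens, so a token such as "the," would keep its comma, while -o prints only the part the pattern matched.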

Answer 3:

You could translate spaces to newlines and then grep, e.g.:

cat * | tr ' ' '\n' | grep th

Answer 4:

Just awk; no need for a combination of tools.

# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
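
Note that the `^` anchor in the awk pattern keeps only fields that start with "th". The sketch below, using a made-up input line and illustrative variable names, contrasts that with an unanchored pattern that accepts "th" anywhere in the field.

```shell
# Illustrative input line; "bath" contains "th" but does not start with it.
line='the bath is thoroughly warm'

# Anchored: only fields beginning with "th".
starts_with=$(printf '%s\n' "$line" | awk '{for(i=1;i<=NF;i++) if($i ~ /^th/) print $i}')

# Unanchored: any field containing "th".
contains=$(printf '%s\n' "$line" | awk '{for(i=1;i<=NF;i++) if($i ~ /th/) print $i}')

printf 'starts with th: %s\n' "$(printf '%s' "$starts_with" | tr '\n' ' ')"
printf 'contains th:    %s\n' "$(printf '%s' "$contains" | tr '\n' ' ')"
```

Which form is right depends on whether words like "bath" should count as matches.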

Answer 5:

It's simpler than you think. Try this:

egrep -wo 'th.[a-z]*' filename.txt #### (Case Sensitive)

egrep -iwo 'th.[a-z]*' filename.txt  ### (Case Insensitive)

Where,

 egrep: grep with extended regular expressions.
 w    : Matches only whole words instead of substrings.
 o    : Displays only the matched pattern instead of the whole line.
 i    : Ignores case.
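
The effect of -w is easiest to see side by side. This sketch uses `grep -E` (the non-deprecated spelling of egrep) and, as an assumption for the illustration, a slightly simpler pattern, `th[a-z]*`, on a made-up input line.

```shell
# Illustrative input line; "bath" ends in "th".
line='the bath is thoroughly warm'

# Without -w, the "th" inside "bath" is reported as a match of its own.
any_match=$(printf '%s\n' "$line" | grep -Eo 'th[a-z]*')

# With -w, a match must be delimited by non-word characters (or line edges).
whole_words=$(printf '%s\n' "$line" | grep -Ewo 'th[a-z]*')

printf 'without -w: %s\n' "$(printf '%s' "$any_match" | tr '\n' ' ')"
printf 'with -w:    %s\n' "$(printf '%s' "$whole_words" | tr '\n' ' ')"
```

Adding -i to either command would also accept "The" and "THOROUGHLY".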

Answer 6:

grep with only-matching output and Perl regular expressions:

grep -o -P 'th.*? ' filename

Answer 7:

I was unsatisfied with awk's hard-to-remember syntax, but I liked the idea of using one utility to do this.

It seems like ack (or ack-grep if you use Ubuntu) can do this easily:

# ack-grep -ho "\bth.*?\b" *

the
the
the
this
thoroughly

If you omit the -h flag you get:

# ack-grep -o "\bth.*?\b" *

some-other-text-file
1:the

some-text-file
1:the
the

yet-another-text-file
1:this
thoroughly

As a bonus, you can use the --output flag to do this for more complex searches with just about the easiest syntax I've found:

# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file

1, 5, 12/27/2010

Answer 8:

cat *-text-file | grep -Eio "th[a-z]+"

Answer 9:

To find all the words containing "icon-", the following command works perfectly. I am using ack here, which is similar to grep but with better options and nicer formatting.

ack -oh --type=html "\w*icon-\w*" | sort | uniq

Answer 10:

You can also try pcregrep. There is also a -w option in grep, but in some cases it doesn't work as expected.

From Wikipedia:

cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple

grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple
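
The listing above can be reproduced with the sketch below. The reason for the surprise is that "-" is not a word-constituent character, so "apple-fruit" still contains "apple" as a whole word. The -x flag (match the whole line) is added here only as one possible alternative when exact lines are wanted; it is not part of the original answer, and the variable names are illustrative.

```shell
# Recreate the sample list in a temporary file.
fruitlist=$(mktemp)
printf '%s\n' apple apples pineapple apple- apple-fruit fruit-apple > "$fruitlist"

# -w: "-" is not a word character, so hyphenated entries still count.
word_hits=$(grep -w apple "$fruitlist")

# -x: the whole line must match, so only the bare "apple" line remains.
line_hits=$(grep -x apple "$fruitlist")

printf '%s\n' "$word_hits"
printf 'exact line: %s\n' "$line_hits"

# Clean up the temporary file.
rm -f "$fruitlist"
```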

Answer 11:

I had a similar problem: I wanted to grep with a regex and get only the matched pattern as output.

In the end I used egrep (the same regex with grep -e or -G didn't give me the same result as egrep) with the -o option.

So I think it could be something similar to this (I'm NOT a regex master):

egrep -o "the*|this{1}|thoroughly{1}" filename

Answer 12:

$ grep -w

Excerpt from grep man page:

-w: Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character.

Answer 13:

You could pipe your grep output into Perl like this (-h keeps the file names, which may themselves contain "th", out of the piped text):

grep -h "th" * | perl -n -e'while(/(\w*th\w*)/g) {print "$1\n"}'

Answer 14:

ripgrep

Here is an example using ripgrep:

rg -o "(\w+)?th(\w+)?"

It'll match all words containing th.

Tags: grep words
