PHP - Find if any of the keywords in an array exis

Posted 2020-05-29 11:12

Basically, I have an array of keywords, and a piece of text. I am wondering what would be the best way to find out if any of those keywords are present in the text, bearing in mind performance issues.

I was thinking of just looping over the array and doing a strpos() for each keyword, but with well over ten thousand words in the array, it takes PHP a bit of time to do it, and so I was wondering if there is a more efficient way to do it.
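
For reference, the baseline described above might look roughly like the following minimal sketch. The function name `containsAnyKeyword` is hypothetical and not part of any existing code; it stops at the first hit, so the worst case (no keyword present) still scans the text once per keyword.

```php
<?php
// Hypothetical helper: returns true as soon as one keyword is found in $text.
function containsAnyKeyword(array $keywords, string $text): bool
{
    foreach ($keywords as $keyword) {
        // strpos() returns false when there is no match and may return 0
        // for a match at the start, so compare strictly against false.
        // Empty keywords are skipped because an empty needle is not meaningful.
        if ($keyword !== '' && strpos($text, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(containsAnyKeyword(array('dog', 'cat'), 'there is a dog on the road'));
```

Note that this matches substrings, so "cat" would also be found inside "concatenate".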

Tags: php
5 answers
兄弟一词,经得起流年.
#2 · 2020-05-29 12:03

Depending on the size of the string, you could use a hash to make it faster.

First iterate over the text and store each word as a key in an array:

 $string = array();
 // split on runs of whitespace, dropping empty pieces
 foreach (preg_split("/\s+/", $text, -1, PREG_SPLIT_NO_EMPTY) as $word)
 {
     $string[$word] = 1;
 }

Then iterate over the keywords, checking each one against $string:

 foreach ($keywords as $keyword)
 {
     if (isset($string[$keyword]))
     {
         // $keyword exists in string
     }
 }
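
Putting the two steps together, a minimal self-contained sketch might look like this. The function name `findKeywordsInText` is hypothetical, and the sketch assumes that keywords are single words, that matching is case-sensitive, and that punctuation attached to a word (such as "cat.") is not stripped.

```php
<?php
// Hypothetical helper: returns the keywords that occur as whole words in $text.
function findKeywordsInText(array $keywords, string $text): array
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_flip() turns the words into array keys, so each isset()
    // below is a hash lookup rather than a scan of the text.
    $seen = array_flip($words);

    $found = array();
    foreach ($keywords as $keyword) {
        if (isset($seen[$keyword])) {
            $found[] = $keyword;
        }
    }
    return $found;
}

print_r(findKeywordsInText(array('dog', 'cat', 'bird'), 'there is a dog chasing a cat'));
```

If you only need a yes/no answer, you could return true at the first hit instead of collecting the matches.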

EDIT: If your text is much smaller than your keyword list, do it backwards: check each word of the text against the keywords. The isset() lookup needs the keywords as array keys (array_flip() produces that). This would likely be faster than the above if the text is pretty short.

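 // isset() needs the keywords as array keys, so flip the list once
 $keywordSet = array_flip($keywords);

 foreach (preg_split("/\s+/", $text, -1, PREG_SPLIT_NO_EMPTY) as $word)
 {
     if (isset($keywordSet[$word]))
     {
         // $word is a keyword; might be faster if the text has fewer words than $keywords
     }
 }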
够拽才男人
#3 · 2020-05-29 12:07

You could split the text into an array of words and run array_intersect_key() on the two arrays, provided both are keyed by word (array_flip() can do that). I am not sure of the performance of this, though.
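
A minimal sketch of that idea, assuming single-word, case-sensitive keywords; the function name `sharesAnyWord` is hypothetical:

```php
<?php
// Hypothetical helper: true if $text and $keywords share at least one word.
function sharesAnyWord(array $keywords, string $text): bool
{
    // Split on runs of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_intersect_key() compares keys, so both lists are flipped first.
    $common = array_intersect_key(array_flip($words), array_flip($keywords));

    return count($common) > 0;
}

var_dump(sharesAnyWord(array('dog', 'cat'), 'there is a dog chasing a cat'));
```

Unlike a loop with isset(), this always computes the full intersection, so it cannot stop early at the first match.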

小情绪 Triste *
#4 · 2020-05-29 12:08

I really don't know if it is more efficient, but you could try putting them all into a single regex like this: (keyword1|keyword2|...). The preg_quote() function escapes the keywords for use in the regex. PHP has no separate "compiled" option to set; it caches compiled patterns internally, so reusing the same pattern against multiple strings avoids recompiling it.

forever°为你锁心
#5 · 2020-05-29 12:16

Assuming the text is reasonably formatted and you only care whether any (not which) of the keywords exist, you could try something like:

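$keywords = array( "dog", "cat" );

// escape each keyword, then build a delimited regex with word boundaries
$quoted = array();
foreach( $keywords as $keyword )
    $quoted[] = preg_quote( $keyword, "/" );
$test = "/\\b(?:" . implode( "|", $quoted ) . ")\\b/";

if( preg_match( $test, "there is a dog chasing a cat down the road" ) )
    print "keyword hit";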
Luminary・发光体
#6 · 2020-05-29 12:18

Working off eWolf's idea...

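foreach($keywords as &$keyword) {
  // escape regex metacharacters, including the "/" delimiter
  $keyword = preg_quote($keyword, '/');
}
unset($keyword); // break the reference left over from the loop

$regex = "/(". implode('|', $keywords) .")/";

// preg_match() returns 1 on a match, 0 otherwise
return preg_match($regex, $str);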

You don't have to check for word boundaries if you don't want to, but if you do, just surround the group (the () characters) with \b and it will match only whole words. You'll also want to make sure every member of the array has been passed through preg_quote(), for safety.
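
To illustrate the difference the \b boundaries make, here is a minimal sketch. The function name `buildKeywordRegex` and its `$wholeWords` flag are hypothetical, and the sketch assumes a non-empty keyword list (an empty list would produce a pattern that matches everything).

```php
<?php
// Hypothetical helper: builds one alternation pattern from the keywords.
function buildKeywordRegex(array $keywords, bool $wholeWords): string
{
    $quoted = array();
    foreach ($keywords as $keyword) {
        // The second argument also escapes the "/" delimiter used below.
        $quoted[] = preg_quote($keyword, '/');
    }
    $group = '(?:' . implode('|', $quoted) . ')';

    // With $wholeWords set, \b on both sides restricts matches to whole words.
    return $wholeWords ? '/\b' . $group . '\b/' : '/' . $group . '/';
}

$keywords = array('cat', 'dog');

var_dump(preg_match(buildKeywordRegex($keywords, false), 'concatenate')); // int(1)
var_dump(preg_match(buildKeywordRegex($keywords, true), 'concatenate'));  // int(0)
var_dump(preg_match(buildKeywordRegex($keywords, true), 'a cat sat'));    // int(1)
```

Without the boundaries, "cat" matches inside "concatenate"; with them, only the standalone word matches.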
