How to match any string from a list of strings in

Let's say I have a list of strings,

string_lst = ['fun', 'dum', 'sun', 'gum']

I want to write a regular expression that, at some point in the pattern, matches any of the strings I have in that list, within a group, such as this:

import re
template = re.compile(r".*(elem for elem in string_lst).*")
template.match("I love to have fun.")

What would be the correct way to do this? Or would one have to make multiple regular expressions and match them all separately to the string?

Tags: python regex string python-3.x

5 Answers

做个烂人

#2 · 2020-05-20 05:27

Besides a regular expression, you can use a list comprehension. I hope that's not off topic.

import re
def match(input_string, string_list):
    words = re.findall(r'\w+', input_string)
    return [word for word in words if word in string_list]

>>> string_lst = ['fun', 'dum', 'sun', 'gum']
>>> match("I love to have fun.", string_lst)
['fun']

smile是对你的礼貌

#3 · 2020-05-20 05:30

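import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "I love to have fun."

# The capturing group sits inside a lookahead, so findall() returns the captured word
print(re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x))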

You cannot use match() with this pattern, because match() only matches at the start of the string. Use findall() instead.

Output: ['fun']

With search() you get only the first match, so use findall() to get all of them.

The lookahead is also what lets findall() return overlapping matches, provided they do not start at the same position.
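
To illustrate the point about overlapping matches, here is a small sketch. The word list and the sample text are made up for the example and are not from the question; they are chosen so that two words overlap.

```python
import re

# Hypothetical data: 'sun' and 'unit' overlap inside "sunit".
words = ['sun', 'unit']
text = "sunit"

alternation = '|'.join(map(re.escape, words))

# A plain alternation consumes the characters it matches,
# so the overlapping 'unit' is never tried.
plain = re.findall(alternation, text)

# A capturing group inside a lookahead consumes nothing,
# so every start position in the text is tried.
overlapping = re.findall('(?=(' + alternation + '))', text)

print(plain)        # ['sun']
print(overlapping)  # ['sun', 'unit']
```

If the words can never overlap in your input, the plain alternation is enough and is a little easier to read.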

Viruses.

#4 · 2020-05-20 05:38

Building on @vks's reply, I feel this actually does the complete task:

finds = re.findall(r"(?=(\b" + '\\b|\\b'.join(string_lst) + r"\b))", x)

Adding word boundaries (\b) around each word means only whole words are matched, which completes the task!
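
As a rough illustration of what the word boundaries change, the sketch below runs both patterns on the same text. The sample sentence is an assumption made up for the example; the word list is the one from the question.

```python
import re

string_lst = ['fun', 'dum', 'sun', 'gum']
x = "A funny day in the sun."  # hypothetical sample text

# Without \b, 'fun' is also found inside 'funny'.
loose = re.findall(r"(?=(" + '|'.join(string_lst) + r"))", x)

# With \b on both sides of every word, only whole words are reported.
strict = re.findall(r"(?=(\b" + r'\b|\b'.join(string_lst) + r"\b))", x)

print(loose)   # ['fun', 'sun']
print(strict)  # ['sun']
```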

狗以群分

#5 · 2020-05-20 05:39

The third-party regex module has named lists (sets, actually):

#!/usr/bin/env python
import regex as re # $ pip install regex

p = re.compile(r"\L<words>", words=['fun', 'dum', 'sun', 'gum'])
if p.search("I love to have fun."):
    print('matched')

Here, words is just a name; you can use anything you like instead.
The .search() method is used instead of putting .* before and after the named list.

To emulate named lists using stdlib's re module:

#!/usr/bin/env python
import re

words = ['fun', 'dum', 'sun', 'gum']
longest_first = sorted(words, key=len, reverse=True)
p = re.compile(r'(?:{})'.format('|'.join(map(re.escape, longest_first))))
if p.search("I love to have fun."):
    print('matched')

re.escape() is used to escape regex meta-characters such as .*? inside individual words (to match the words literally).
sorted() emulates the regex module's behavior by putting the longest words first among the alternatives. Compare:

>>> import re
>>> re.findall("(funny|fun)", "it is funny")
['funny']
>>> re.findall("(fun|funny)", "it is funny")
['fun']
>>> import regex
>>> regex.findall(r"\L<words>", "it is funny", words=['fun', 'funny'])
['funny']
>>> regex.findall(r"\L<words>", "it is funny", words=['funny', 'fun'])
['funny']

冷血范

#6 · 2020-05-20 05:43

You should make sure to escape the strings correctly before combining them into a regex:

>>> import re
>>> string_lst = ['fun', 'dum', 'sun', 'gum']
>>> x = "I love to have fun."
>>> regex = re.compile("(?=(" + "|".join(map(re.escape, string_lst)) + "))")
>>> re.findall(regex, x)
['fun']
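
To show why the escaping matters, here is a minimal sketch using a word list that contains regex metacharacters. The list and the sample text are assumptions invented for the example; the question's own words contain no metacharacters, so they behave the same either way.

```python
import re

# Hypothetical words containing the metacharacters '.' and '+'.
string_lst = ['a.b', '1+1']
x = "axb is not 1+1"

# Unescaped: '.' matches any character and '+' means "one or more",
# so the pattern finds 'axb' and misses the literal '1+1'.
unescaped = re.findall('|'.join(string_lst), x)

# Escaped: every word is matched literally.
escaped = re.findall('|'.join(map(re.escape, string_lst)), x)

print(unescaped)  # ['axb']
print(escaped)    # ['1+1']
```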
