Exact match for words

Posted 2020-05-04 04:58

I would like to use a regular expression that matches if a sentence contains one of the words I am looking for.

With the code below, all five test strings match, which is not correct. I tried surrounding every word in words with spaces (like " seven "), but then a word at the start or end of the string no longer matches.

import re

words = ('seven', 'eight')
regex = re.compile('|'.join(words))
print(regex.search('aaaaaasd seven asdfadsf'))   #1 - should match
print(regex.search('AAAsevenAAA'))               #2 - shouldn't match
print(regex.search('AAA eightaaa'))              #3 - shouldn't match
print(regex.search('eight aaa'))                 #4 - should match
print(regex.search('aaaa eight'))                #5 - should match

How can I stop my regular expression from matching when the word is only a substring of a longer word (like #2 and #3 above)?

Tags: python regex

1 answer

够拽才男人

Reply #2 · 2020-05-04 05:24

As @CasimiretHippolyte pointed out, you want to add word boundaries (\b). Rather than adding them to each word in your list by hand, wrap the joined alternation once when you compile the regular expression:

regex = re.compile(r'\b(?:%s)\b' % '|'.join(words))

Note: Because the pattern contains the escape sequence \b, use raw string notation (r'...'); in an ordinary string literal, '\b' is a backspace character. The non-capturing group (?:...) makes both word boundaries apply to every word in the alternation. Without the group, the pattern would be \bseven|eight\b, where the leading \b applies only to seven and the trailing \b only to eight.
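
To see the effect of the grouping, here is a minimal sketch that runs the five test strings from the question against the grouped pattern and compares it with the ungrouped one. The names grouped, ungrouped and cases are only illustrative, and the re.escape call is an addition not in the original answer; it guards against words that contain regex metacharacters.

```python
import re

words = ('seven', 'eight')

# Assumption: re.escape is added for safety; it leaves 'seven' and
# 'eight' unchanged because they contain only letters.
alternatives = '|'.join(re.escape(w) for w in words)

grouped = re.compile(r'\b(?:%s)\b' % alternatives)   # \b(?:seven|eight)\b
ungrouped = re.compile(r'\b%s\b' % alternatives)     # \bseven|eight\b

# (text, should_match) pairs taken from the question.
cases = [
    ('aaaaaasd seven asdfadsf', True),   # 1
    ('AAAsevenAAA', False),              # 2
    ('AAA eightaaa', False),             # 3
    ('eight aaa', True),                 # 4
    ('aaaa eight', True),                # 5
]

for text, should_match in cases:
    found = grouped.search(text) is not None
    print(repr(text), 'expected:', should_match, 'found:', found)

# The ungrouped pattern still accepts substrings: 'eight\b' carries no
# leading boundary and '\bseven' carries no trailing one.
for text in ('AAAeight', 'sevenAAA'):
    print(repr(text),
          'ungrouped:', ungrouped.search(text) is not None,
          'grouped:', grouped.search(text) is not None)
```

If the matching should ignore case, passing re.IGNORECASE as the second argument to re.compile is one option.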

