I'm working with documents, and I need to isolate the words without punctuation. I know how to use string.split(" ") to break the text into words, but I can't work out how to strip the punctuation.
Answer 1:
Here is an example using a regular expression; the result is ['this', 'is', 'a', 'string', 'with', 'punctuation'].
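import re
s = " ,this ?is a string! with punctuation. "
# \w+ matches each run of letters, digits, or underscores.
# The raw string (r'...') keeps Python from treating \w as a string escape.
pattern = re.compile(r'\w+')
result = pattern.findall(s)
print(result)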
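If you would rather stay close to the split() approach from the question, here is a minimal sketch of an alternative that deletes punctuation first and then splits on whitespace. The function name split_words is just something I made up for illustration, and it assumes ASCII punctuation only (whatever is in string.punctuation), so curly quotes and similar characters would pass through untouched.

```python
import string

def split_words(text):
    # Hypothetical helper: build a translation table that deletes
    # every ASCII punctuation character listed in string.punctuation.
    table = str.maketrans("", "", string.punctuation)
    # split() with no argument splits on any run of whitespace
    # and drops the empty strings that split(" ") would leave behind.
    return text.translate(table).split()

print(split_words(" ,this ?is a string! with punctuation. "))
# ['this', 'is', 'a', 'string', 'with', 'punctuation']
```

One difference worth knowing: on a contraction such as "don't", this version gives 'dont', while the \w+ regex gives 'don' and 't' as two separate matches. Which one you want probably depends on your documents.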