Extract relevant sentences to entity

2019-04-17 15:49发布

Do you know some paper or algorithm in NLP that is able to extract sentences from text that are related to given entity (term). I would like to process some reviews (mainly tech), but I found out that many reviews mention more then one product (they do comparation). I would like to extract from that text just sentences that are relevant to one product, or delete sentences that are irrelevant to particular named entity (product).

My questin is how to do it? Is there some related papers? Is something like this done by some toolkit or api?

标签： machine-learning nlp semantics semantic-web

1条回答

够拽才男人

2楼-- · 2019-04-17 16:23

What you want is a Named Entity Recognizer (NER). Given an input sentence, the NER will identify the various entities in the sentence as persons, organizations, products etc. You can then check entities recognized as products, and keep or discard the sentence accordingly. One very simple possibility would be to use the named entity recognizer of NLTK in Python. Here is an example:

import nltk
sent = "Albert Einstein spent many years at Princeton University in New Jersey"
sent1 = nltk.word_tokenize(sent)
sent2 = nltk.pos_tag(sent1)
sent3 = nltk.ne_chunk(sent2)
print sent3

The output will be:

(S
  (PERSON Albert/NNP)
  (PERSON Einstein/NNP)
  spent/VBD
  many/JJ 
  years/NNS
  at/IN
  (ORGANIZATION Princeton/NNP University/NNP)
  in/IN
  (GPE New/NNP Jersey/NNP))

NLTK works well for this simple example, but to be honest I'm not sure how accurate it is or if it can be customized to fit your purposes (identifying products). But I know that the Stanford NER is both customizable and accurate, so you might want to have a look at the above link.

0人赞添加讨论(0) 举报

Extract relevant sentences to entity

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间