With existing (supervised) text categorization techniques, why don't we consider the Named Entities (NEs) in the text as features in training and testing? Do you think we can improve precision by using NEs as features?
It depends a lot on the domain you are working in, because you have to define the features based on that domain. Say you are working on a learning-to-rank problem in a search engine, generating a dynamic rank: NEs won't give you any benefit there. It largely depends on the domain and on the output category labels you have defined for the supervised learning task.
Now say you are classifying documents as Soccer, Movie, Politics and so on. In this case Named Entities can work. Here is an example: say you are using a neural network that categorizes documents into Soccer, Movie, Politics, etc. Now a document comes in: "Lionel Messi was invited to attend the premiere of 'The Social Network'; also present were the cast and crew, including Jesse Eisenberg, Andrew Garfield and Justin Timberlake." Most of the named entities here are actors and a film title, so the connection between the named entities (input features) and Movie (the defined output) will be stronger, and the document will be classified as being about Movie.
Another example: say our document is "Tom Cruise is portraying the character of Lionel Messi in the movie 'The last soccer game'." Here is where the benefit shows up. Say your neural network has learnt that when an actor and a footballer appear together in one document, there is a high probability of it being about a movie. Again, this depends on the data and the training, and it may be the other way round too (but that is what learning is all about: looking at past data).
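To make the idea concrete, here is a minimal sketch of how NE types could be added on top of ordinary bag-of-words features. Everything in it is an assumption for illustration: the `GAZETTEER` lookup is a hypothetical stand-in for a real NER system, and the feature names (`NE_TYPE=...`, `NE_PAIR=...`) and the `extract_features` helper are made up for this example. The resulting counts could then be fed to whatever classifier you already use.

```python
import re
from collections import Counter

# Hypothetical toy gazetteer standing in for a real named-entity recognizer.
GAZETTEER = {
    "lionel messi": "FOOTBALLER",
    "tom cruise": "ACTOR",
    "jesse eisenberg": "ACTOR",
    "andrew garfield": "ACTOR",
    "justin timberlake": "ACTOR",
}


def extract_features(text):
    """Return bag-of-words counts plus one count per named-entity type."""
    lowered = text.lower()
    # Plain bag-of-words features.
    features = Counter(re.findall(r"[a-z]+", lowered))
    # One extra feature per entity type found in the text.
    for name, ne_type in GAZETTEER.items():
        hits = len(re.findall(r"\b" + re.escape(name) + r"\b", lowered))
        if hits:
            features["NE_TYPE=" + ne_type] += hits
    # Co-occurrence feature: an actor and a footballer in the same document.
    if features["NE_TYPE=ACTOR"] and features["NE_TYPE=FOOTBALLER"]:
        features["NE_PAIR=ACTOR+FOOTBALLER"] = 1
    return features


doc = ('Tom Cruise is portraying the character of Lionel Messi '
       'in the movie "The last soccer game".')
features = extract_features(doc)
print({k: v for k, v in features.items() if k.startswith("NE_")})
```

Using the entity type rather than the raw name is a design choice here: it lets the model generalize to actors or footballers it never saw in training, which is the situation described in the Tom Cruise example.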
So my answer would be: try it out. Nobody is stopping you from using named entities as features, and it might help in the domain you are working in.