Example of NLTK's Vader Scoring Text

2019-07-31 19:02发布

I would like someone to correct my understanding of how VADER scores text. I've read an explanation of this process here, however I cannot match the compound score of test sentences to Vader's output when recreating the process it describes. Lets say we have the sentence:

"I like using VADER, its a fun tool to use"

The words VADER picks up are 'like' (+1.5 score), and 'fun' (+2.3). According to the documentation, these values are summed (so +3.8), and then normalized to a range between 0 and 1 using the following function:

(alpha = 15)
x / x2 + alpha 

With our numbers, this should become:

3.8 / 14.44 + 15 = 0.1290

VADER, however, outputs the returned compound score as follows:

Scores: {'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.7003}

Where am I going wrong in my reasoning? Similar questions have been asked several times, however an actual example of VADER classifying has not yet been provided. Any help would be appreciated.

1条回答
三岁会撩人
2楼-- · 2019-07-31 19:11

It's just your normalization that is wrong. From the code the function is defined:

def normalize(score, alpha=15):
"""
Normalize the score to be between -1 and 1 using an alpha that
approximates the max expected value
"""
norm_score = score/math.sqrt((score*score) + alpha)
return norm_score

So you have 3.8/sqrt(3.8*3.8 + 15) = 0.7003

查看更多
登录 后发表回答