Example of NLTK's Vader Scoring Text

2019-07-31 19:03发布

问题:

I would like someone to correct my understanding of how VADER scores text. I've read an explanation of this process here, however I cannot match the compound score of test sentences to Vader's output when recreating the process it describes. Lets say we have the sentence:

"I like using VADER, its a fun tool to use"

The words VADER picks up are 'like' (+1.5 score), and 'fun' (+2.3). According to the documentation, these values are summed (so +3.8), and then normalized to a range between 0 and 1 using the following function:

(alpha = 15)
x / x2 + alpha

With our numbers, this should become:

3.8 / 14.44 + 15 = 0.1290

VADER, however, outputs the returned compound score as follows:

Scores: {'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.7003}

Where am I going wrong in my reasoning? Similar questions have been asked several times, however an actual example of VADER classifying has not yet been provided. Any help would be appreciated.

回答1:

It's just your normalization that is wrong. From the code the function is defined:

def normalize(score, alpha=15):
"""
Normalize the score to be between -1 and 1 using an alpha that
approximates the max expected value
"""
norm_score = score/math.sqrt((score*score) + alpha)
return norm_score

So you have 3.8/sqrt(3.8*3.8 + 15) = 0.7003