
Naive Bayes: the within-class variance in each feature of TRAINING must be positive

Published 2019-08-02 21:03

When trying to fit a Naive Bayes model:

    training_data = sample;   % NxD matrix of observations
    target_class = K8;        % class labels

    % train model
    nb = NaiveBayes.fit(training_data, target_class);

    % prediction
    y = nb.predict(cluster3);

I get the following error:

??? Error using ==> NaiveBayes.fit>gaussianFit at 535
The within-class variance in each feature of TRAINING
must be positive. The within-class variance in feature
2 5 6 in class normal. are not positive.

Error in ==> NaiveBayes.fit at 498
            obj = gaussianFit(obj, training, gindex);

Can anyone shed some light on how to solve this? Note that I have seen a similar post here, but I am not sure what to do about it. It seems that it is trying to fit based on columns rather than rows; the within-class variance should be based on the probability of each row belonging to a specific class. If I remove those columns then it works, but obviously that isn't what I want to do.

Answer 1:

Assuming that there is no bug anywhere in your code (or in the NaiveBayes code from MathWorks), and again assuming that your training_data is in the form NxD, where there are N observations and D features, then columns 2, 5, and 6 are completely constant for at least one class. This can happen if you have relatively small training data and a high number of classes, in which case a single class may be represented by only a few observations.

Since NaiveBayes by default treats every feature as normally distributed, it cannot work with a feature whose variance is zero within a single class. In other words, there is no way for NaiveBayes to estimate the parameters of the probability distribution by fitting a normal distribution to the features of that specific class (note: the default for 'Distribution' is 'normal').
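You can check this diagnosis directly. The sketch below is my own (not part of the original answer) and assumes training_data is an N-by-D double matrix and target_class is a cell array of class-name strings; it lists, for each class, the features whose within-class variance is zero:

```matlab
% Hypothetical diagnostic: find zero-variance features per class.
classes = unique(target_class);
for k = 1:numel(classes)
    rows = strcmp(target_class, classes{k});     % rows belonging to class k
    v = var(training_data(rows, :), 0, 1);       % per-feature variance within the class
    bad = find(v == 0);
    if ~isempty(bad)
        fprintf('Class %s: zero-variance features: %s\n', ...
                classes{k}, mat2str(bad));
    end
end
```

If target_class is numeric rather than a cell array of strings, replace the strcmp call with `target_class == classes(k)`.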

Take a look at the nature of your features. If they do not seem to follow a normal distribution within each class, then 'normal' is not the option you want to use. Maybe your data is closer to a multinomial model, 'mn':

nb = NaiveBayes.fit(training_data, target_class, 'Distribution', 'mn');
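Note that the 'mn' distribution expects non-negative count data (e.g. word counts), so it only applies if your features actually are counts. If they are continuous, one pragmatic workaround (my own suggestion, not part of the original answer) is to jitter the training data so that no feature has exactly zero variance within any class:

```matlab
% Hypothetical workaround: add tiny Gaussian noise to break exact ties.
% The scale 1e-6 is an arbitrary choice; it should be negligible
% relative to the spread of your actual feature values.
jittered = training_data + 1e-6 * randn(size(training_data));
nb = NaiveBayes.fit(jittered, target_class);
y = nb.predict(cluster3);
```

This keeps all columns in the model, at the cost of slightly perturbing the fitted class-conditional variances; whether that is acceptable depends on your application.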


Source: Naive Bayes: the within-class variance in each feature of TRAINING must be positive