How do I combine few weak learners into a strong classifier? I know the formula, but the problem is that in every paper about AdaBoost that I've read there are only formulas without any example. I mean - I got weak learners and their weights, so I can do what the formula tells me to do (multiply learner by its weight and add another one multiplied by its weight and another one etc.) but how exactly do I do that? My weak learners are decision stumps. They got attribute and treshold, so what do I multiply?
相关问题
- How to conditionally scale values in Keras Lambda
- Trying to understand Pytorch's implementation
- Bulding a classification model in R studio with ke
- ParameterError: Audio buffer is not finite everywh
- Splitting list and iterating in prolog
相关文章
- how to flatten input in `nn.Sequential` in Pytorch
- What are the problems associated to Best First Sea
- How to use cross_val_score with random_state
- Looping through training data in Neural Networks B
- How to measure overfitting when train and validati
- McNemar's test in Python and comparison of cla
- How to disable keras warnings?
- Invert MinMaxScaler from scikit_learn
In vanilla Ada Boost, you don't multiply learners by any weight. Instead, you increase the weight of the misclassified data. Imagine that you have an array, such as [1..1000], and you want to use neural networks to estimate which numbers are primes. (Stupid example, but suffices for demonstration).
Imagine that you have class NeuralNet. You instantiate the first one, n1 = NeuralNet.new. Then you have the training set, that is, another array of primes from 1 to 1000. (You need to make up some feature set for a number, such as its digits.). Then you train n1 to recognize primes on your training set. Let's imagine that n1 is weak, so after the training period ends, it won't be able to correctly classify all numbers 1..1000 into primes and non-primes. Let's imagine that n1 incorrectly says that 27 is prime and 113 is non-prime, and makes some other mistakes. What do you do? You instantiate another NeuralNet, n2, and increase the weight of 27, 113 and other mistaken numbers, let's say, from 1 to 1.5, and decrease the weight of the correctly classified numbers from 1 to 0.667. Then you train n2. After training, you'll find that n2 corrected most of the mistakes of n1, including no. 27, but no. 113 is still misclassified. So you instantiate n3, increase the weight of 113 to 2, decrease the weight of 27 and other now correctly classified numbers to 1, and decrease the weight of old correctly classified numbers to 0.5. And so on...
Am I being concrete enough?
If I understand your question correctly, you have a great explanation on how boosting ensambles the weak classifiers into a strong classifier with a lot of images in these lecture notes:
www.csc.kth.se/utbildning/kth/kurser/DD2427/bik12/DownloadMaterial/Lectures/Lecture8.pdf
Basically you are by taking the weighted combination of the separating hyperplanes creating a more complex decision surface (great plots showing this in the lecture notes)
Hope this helps.
EDIT
To do it practically:
in page 42 you see the formulae for
alpha_t = 1/2*ln((1-e_t)/e_t)
which easily can be calculated in a for loop, or if you are using some numeric library (I'm using numpy which is really great) directly by vector operations. Thealpha_t
is calculated inside of the adaboost so I assume you already have these.You have the mathematical formulae at page 38, the big sigma stands for sum over all.
h_t
is the weak classifier function and it returns either -1 (no) or 1 (yes).alpha_t
is basically how good the weak classifier is and thus how much it has to say in the final decision of the strong classifier (not very democratic).I don't really use forloops never, but I'll be easier to understand and more language independent (this is pythonish pseudocode):
This is mathematically called the dot product between the weights and the weak-responses (basically: strong(x) = alpha*weak(x)).
EDIT2
This is what is happening inside strongclassifier(x): Separating hyperplane is basically decided in the function weak(x), so all x's which has weak(x)=1 is on one side of the hyperplane while weak(x)=-1 is on the other side of the hyperplane. If you think about it has lines on a plane you have a plane separating the plane into two parts (always), one side is (-) and the other one is (+). If you now have 3 infinite lines in the shape of a triangle with their negative side faced outwards, you will get 3 (+)'s inside the triangle and 1 or 2 (-)'s outside which results (in the strong classifier) into a triangle region which is positive and the rest negative. It's an over simplification but the point is still there and it works totally analogous in higher dimensions.