I have a question about dealing with small probability values in machine learning models.
The standard way to avoid the underflow problems that result from multiplying small floating-point numbers is to use log(x) instead of x.
Suppose that x = 0.50, the log of which is log(x) = -0.301029996.
To recover x later on, the value of exp(log(x)) != x, that is,
0.740055574 != 0.50
So how is using the logarithm useful for dealing with underflow?
(Not at all sure I remember correctly, so please correct me if I'm wrong.)
This is not really about overflow or underflow, but about floating point precision.
The idea is that if you have many very small numbers, multiplying them will produce an extremely small number. Say you have ten probabilities of 1%, or 0.01, each. Multiply them, and the result is 1e-20. In those regions, floating-point precision is not very good, which can introduce errors. In the worst case, the number could be 'rounded' to zero, which would break the entire calculation. The trick with logarithms is that after conversion to logarithms, the multiplication becomes a sum of logs, and that sum stays in a range where floating point behaves well: each log(0.01) is -2 in base 10 (about -4.6 as a natural log), so ten of them add up to only -20 (or about -46), nowhere near the underflow limit.
Example (using Python, because I'm too lazy to fire up Eclipse, but the same works for Java):
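Something along these lines (my own minimal sketch of such an example, not the original answer's code):

```python
import math

probs = [0.01] * 200   # two hundred probabilities of 1% each

# Naive product: 0.01**200 = 1e-400 is smaller than the smallest
# positive double, so the result underflows to exactly 0.0
product = 1.0
for p in probs:
    product *= p
print(product)                        # 0.0

# Log-space version: add natural logarithms instead of multiplying
log_product = sum(math.log(p) for p in probs)
print(log_product)                    # about -921.03, easily representable
```

To compare two such products, you compare their log values directly; converting back with math.exp(log_product) would just underflow again.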
Also, as pointed out in the other answer, the problem with your particular calculation is that you seem to use a base-10 logarithm (log10 in Java), whose inverse function is not exp(x) but 10^x. Note, however, that in most languages / math libraries, log is in fact the natural logarithm.

This has nothing to do with overflow. In the first step, you compute the log in base 10 instead of the natural logarithm. You can either raise 10 to that value (10^log(x)) to get back x, or use the natural logarithm.
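For the question's concrete numbers, a quick check of both options (my own sketch, in Python for brevity):

```python
import math

x = 0.50

log10_x = math.log10(x)       # -0.30102999..., base-10 logarithm
print(10 ** log10_x)          # 0.5 -- invert a base-10 log with 10**(...)
print(math.exp(log10_x))      # 0.74005557... -- exp() is the wrong inverse here

ln_x = math.log(x)            # -0.69314718..., natural logarithm
print(math.exp(ln_x))         # 0.5 -- exp() correctly inverts the natural log
```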