I need to compute the mathematical expression floor(ln(u)/ln(1-p))
for 0 < u < 1
and 0 < p < 1
in C on an embedded processor with no floating point arithmetics and no ln
function. The result is a positive integer. I know about the limit cases (p=0), I'll deal with them later...
I imagine that the solution involves having u
and p
range over 0..UINT16_MAX
, and appeal to a lookup table for the logarithm, but I cannot figure out how exactly: what does the lookup table map to?
The result needs not be 100% exact, approximations are OK.
Thanks!
Since the logarithm is used in both dividend and divisor, there is no need to use log()
; we can use log2()
instead. Due to the restrictions on the inputs u
and p
the logarithms are known to be both negative, so we can restrict ourselves to compute the positive quantity -log2()
.
We can use fixed-point arithmetic to compute the logarithm. We do so by multiplying the original input by a sequence of factors of decreasing magnitude that approach 1. Considering each of the factor in sequence, we multiply the input only by those factors that result in a product closer to 1, but without exceeding it. While doing so, we sum the log2()
of the factors that "fit". At the end of this procedure we wind up with a number very close to 1 as our final product, and a sum that represents the binary logarithm.
This process is known in the literature as multiplicative normalization or pseudo division, and some early publications describing it are the works by De Lugish and Meggitt. The latter indicates that the origin is basically Henry Briggs's method for computing common logarithms.
B. de Lugish. "A Class of Algorithms for Automatic Evaluation of Functions and Computations in a Digital Computer". PhD thesis, Dept. of Computer Science, University of Illinois, Urbana, 1970.
J. E. Meggitt. "Pseudo division and pseudo multiplication processes". IBM Journal of Research and Development, Vol. 6, No. 2, April 1962, pp. 210-226
As the chosen set of factors comprises 2i and (1+2-i) the necessary multiplications can be performed without the need for a multiplication instruction: the products can be computed by either shift or shift plus add.
Since the inputs u
and p
are purely fractional numbers with 16 bits, we may want to chose a 5.16 fixed-point result for the logarithm. By simply dividing the two logarithm values, we remove the fixed-point scale factor, and apply a floor()
operation at the same time, because for positive numbers, floor(x)
is identical to trunc(x)
and integer division is truncating.
Note that the fixed-point computation of the logarithm results in large relative error for inputs near 1. This in turn means the entire function computed using fixed-point arithmetic may deliver results significantly different from the reference if p
is small. An example of this is the following test case: u=55af p=0052 res=848 ref=874
.
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
/* input x is a 0.16 fixed-point number in [0,1)
function returns -log2(x) as a 5.16 fixed-point number in (0, 16]
*/
uint32_t nlog2_16 (uint16_t x)
{
uint32_t r = 0;
uint32_t t, a = x;
/* try factors 2**i with i = 8, 4, 2, 1 */
if ((t = a << 8 ) < 0x10000) { a = t; r += 0x80000; }
if ((t = a << 4 ) < 0x10000) { a = t; r += 0x40000; }
if ((t = a << 2 ) < 0x10000) { a = t; r += 0x20000; }
if ((t = a << 1 ) < 0x10000) { a = t; r += 0x10000; }
/* try factors (1+2**(-i)) with i = 1, .., 16 */
if ((t = a + (a >> 1)) < 0x10000) { a = t; r += 0x095c0; }
if ((t = a + (a >> 2)) < 0x10000) { a = t; r += 0x0526a; }
if ((t = a + (a >> 3)) < 0x10000) { a = t; r += 0x02b80; }
if ((t = a + (a >> 4)) < 0x10000) { a = t; r += 0x01664; }
if ((t = a + (a >> 5)) < 0x10000) { a = t; r += 0x00b5d; }
if ((t = a + (a >> 6)) < 0x10000) { a = t; r += 0x005ba; }
if ((t = a + (a >> 7)) < 0x10000) { a = t; r += 0x002e0; }
if ((t = a + (a >> 8)) < 0x10000) { a = t; r += 0x00171; }
if ((t = a + (a >> 9)) < 0x10000) { a = t; r += 0x000b8; }
if ((t = a + (a >> 10)) < 0x10000) { a = t; r += 0x0005c; }
if ((t = a + (a >> 11)) < 0x10000) { a = t; r += 0x0002e; }
if ((t = a + (a >> 12)) < 0x10000) { a = t; r += 0x00017; }
if ((t = a + (a >> 13)) < 0x10000) { a = t; r += 0x0000c; }
if ((t = a + (a >> 14)) < 0x10000) { a = t; r += 0x00006; }
if ((t = a + (a >> 15)) < 0x10000) { a = t; r += 0x00003; }
if ((t = a + (a >> 16)) < 0x10000) { a = t; r += 0x00001; }
return r;
}
/* Compute floor(log(u)/log(1-p)) for 0 < u < 1 and 0 < p < 1,
where 'u' and 'p' are represented as 0.16 fixed-point numbers
Result is an integer in range [0, 1048676]
*/
uint32_t func (uint16_t u, uint16_t p)
{
uint16_t one_minus_p = 0x10000 - p; // 1.0 - p
uint32_t log_u = nlog2_16 (u);
uint32_t log_p = nlog2_16 (one_minus_p);
uint32_t res = log_u / log_p; // divide and floor in one go
return res;
}
The maximum value of this function basically depends on the precision limit; that is, how arbitrarily close to the limits (u -> 0)
or (1 - p -> 1)
the fixed point values can be.
If we assume (k)
fractional bits, e.g., with the limits: u = (2^-k)
and 1 - p = 1 - (2^-k)
,
then the maximum value is: k / (k - log2(2^k - 1))
(As the ratio of natural logarithms, we are free to use any base e.g., lb(x)
or log2
)
Unlike njuffa's answer, I went with a lookup table approach, settling on k = 10
fractional bits to represent 0 < frac(u) < 1024
and 0 < frac(p) < 1024
. This requires a log table with 2^k
entries. Using 32-bit table values, we're only looking at a 4KiB
table.
Any more than that, and you are using enough memory that you could seriously consider using the relevant parts of a 'soft-float' library. e.g., k = 16
would yield a 256KiB
LUT.
We're computing the values - log2(i / 1024.0)
for 0 < i < 1024
. Since these values are in the open interval (0, k)
, we only need 4 binary digits to store the integral part. So we store the precomputed LUT in 32-bit [4.28]
fixed-point format:
uint32_t lut[1024]; /* never use lut[0] */
for (uint32_t i = 1; i < 1024; i++)
lut[i] = (uint32_t) (- (log2(i / 1024.0) * (268435456.0));
Given: u, p
represented by [0.10]
fixed-point values in [1, 1023]
:
uint32_t func (uint16_t u, uint16_t p)
{
/* assert: 0 < u, p < 1024 */
return lut[u] / lut[1024 - p];
}
We can easily test all valid (u, p)
pairs against the 'naive' floating-point evaluation:
floor(log(u / 1024.0) / log(1.0 - p / 1024.0))
and only get a mismatch (+1 too high) on the following cases:
u = 193, p = 1 : 1708 vs 1707 (1.7079978488147417e+03)
u = 250, p = 384 : 3 vs 2 (2.9999999999999996e+00)
u = 413, p = 4 : 232 vs 231 (2.3199989016957960e+02)
u = 603, p = 1 : 542 vs 541 (5.4199909906444600e+02)
u = 680, p = 1 : 419 vs 418 (4.1899938077226307e+02)
Finally, it turns out that using the natural logarithm in a [3.29]
fixed-point format gives us even higher precision, where:
lut[i] = (uint32_t) (- (log(i / 1024.0) * (536870912.0));
only yields a single 'mismatch', though 'bignum' precision suggests it's correct:
u = 250, p = 384 : 3 vs 2 (2.9999999999999996e+00)