Is there any efficient way to convert an unary num

Posted 2020-06-16 02:21

Question:

Let those datatypes represent unary and binary natural numbers, respectively:

data UNat = Succ UNat | Zero
data BNat = One BNat | Zero BNat | End

u0 = Zero
u1 = Succ Zero
u2 = Succ (Succ Zero)
u3 = Succ (Succ (Succ Zero))
u4 = Succ (Succ (Succ (Succ Zero)))

b0 = End                   --   0
b1 = One End               --   1
b2 = One (Zero End)        --  10
b3 = One (One End)         --  11
b4 = One (Zero (Zero End)) -- 100

(Alternatively, one could use `Zero End` as b1, `One End` as b2, `Zero (Zero End)` as b3...)

My question is: is there any way to implement the function:

toBNat :: UNat -> BNat

That works in O(N), doing only one pass through UNat?

Answer 1:

To increment a binary number, you have to flip the first zero from the least significant end and all the ones that come before it. The cost of this operation is proportional to the number of 1s at the end of your input (for this you should represent the number as a right-to-left list, e.g. the list [1;0;1;1] encodes 13).

Let a(n) be the number of 1s at the end of the binary representation of n:

a(n) = 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, ...
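The sequence above can be reproduced with a few lines of Haskell. This is only a sketch on machine integers; the names `trailingOnes` and `firstValues` are made up for illustration and are not part of the answer.

```haskell
-- Number of consecutive 1 bits at the least significant end of n (n >= 0).
trailingOnes :: Int -> Int
trailingOnes n
  | odd n     = 1 + trailingOnes (n `div` 2)
  | otherwise = 0

-- The values a(0) .. a(15), for comparison with the list above.
firstValues :: [Int]
firstValues = map trailingOnes [0 .. 15]
```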

and let

s(k) = a(2^k) + a(2^k+1) + ... + a(2^(k+1)-1) 

be the sum of elements between two powers of 2. You should be able to convince yourself that s(k) = 2^k (s(0) = 1, s(1) = 2, s(2) = 4, ...) by noticing that, among the 2^k numbers

    2^k, ..., 2^(k+1) - 1

exactly 2^(k-j) end in at least j ones, for each j from 1 to k, and exactly one (namely 2^(k+1) - 1) ends in k+1 ones. Each number n is counted a(n) times this way, so, as a geometric series,

    s(k) = 2^(k-1) + 2^(k-2) + ... + 2^0 + 1 = (2^k - 1) + 1 = 2^k

Now the cost of incrementing a number N times should be proportional to

    a(0) + a(1) + ... + a(N)
 <= s(0) + s(1) + s(2) + ... + s(log(N))
  = 2^0 + 2^1 + 2^2 + ... + 2^log(N)
  = 2^(log(N) + 1) - 1 <= 2N - 1

where log(N) is the base-2 logarithm rounded down. (Each increment also flips one zero, which adds only N to the total.)

Therefore, if you take care to represent your numbers from right to left, then the naive algorithm is linear (note that you can perform a list reversal at the end and stay linear if you really need your numbers the other way around).
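To make the linear bound concrete, here is a minimal sketch that counts bit flips while incrementing a right-to-left bit list. It assumes a plain `[Bool]` representation instead of the question's BNat; `Bits`, `incCount`, `countUp` and `toInt` are hypothetical names introduced only for this illustration.

```haskell
-- Bits stored least significant first, so [True, False, True, True] is 13.
type Bits = [Bool]

-- Increment, also returning how many bits were flipped (trailing ones + 1).
incCount :: Bits -> (Bits, Int)
incCount []           = ([True], 1)
incCount (False : bs) = (True : bs, 1)
incCount (True : bs)  = let (bs', k) = incCount bs in (False : bs', k + 1)

-- Apply n increments starting from zero; return the result and the total flips.
countUp :: Int -> (Bits, Int)
countUp n = go n ([], 0)
  where
    go 0 acc = acc
    go k (bs, total) =
      let (bs', flips) = incCount bs
      in go (k - 1) (bs', total + flips)

-- Read a right-to-left bit list back as an Int, to check the result.
toInt :: Bits -> Int
toInt = foldr (\b acc -> fromEnum b + 2 * acc) 0
```

For any n, `snd (countUp n)` stays at or below `2 * n`, which is the linear behaviour the analysis predicts.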



Answer 2:

I like the other answers, but I find their asymptotic analyses complicated. I therefore propose another answer that has a very simple asymptotic analysis. The basic idea is to implement divMod 2 for unary numbers. Thus:

data UNat = Succ UNat | Zero
data Bit = I | O

divMod2 :: UNat -> (UNat, Bit)
divMod2 Zero = (Zero, O)
divMod2 (Succ Zero) = (Zero, I)
divMod2 (Succ (Succ n)) = case divMod2 n of
    ~(div, mod) -> (Succ div, mod)

Now we can convert to binary by iterating divMod2. The result lists the bits least significant first.

toBinary :: UNat -> [Bit]
toBinary Zero = []
toBinary n = case divMod2 n of
    ~(div, mod) -> mod : toBinary div

The asymptotic analysis is now pretty simple. Given a number n in unary notation, divMod2 takes O(n) time to produce a number half as big -- say, it takes at most c*n time for large enough n. Iterating this procedure therefore takes this much time:

c*(n + n/2 + n/4 + n/8 + ...)

As we all know, this series converges to c*(2*n), so toBinary is also in O(n) with witness constant 2*c.
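As a quick check of the halving approach, the sketch below restates the two functions in a self-contained form and adds two helpers. The helpers `fromInt` and `totalWork` are assumptions made for this example, as are the `deriving` clause and the variable names `q` and `r` (used to avoid shadowing the Prelude's `div` and `mod`).

```haskell
data UNat = Succ UNat | Zero
data Bit = I | O deriving (Eq, Show)

-- Halve a unary number, returning the quotient and the remainder bit.
divMod2 :: UNat -> (UNat, Bit)
divMod2 Zero            = (Zero, O)
divMod2 (Succ Zero)     = (Zero, I)
divMod2 (Succ (Succ n)) = case divMod2 n of
    ~(q, r) -> (Succ q, r)

-- Bits come out least significant first.
toBinary :: UNat -> [Bit]
toBinary Zero = []
toBinary n = case divMod2 n of
    ~(q, r) -> r : toBinary q

-- Hypothetical helper: build a unary number from a non-negative Int.
fromInt :: Int -> UNat
fromInt k = iterate Succ Zero !! k

-- Hypothetical helper: n + n/2 + n/4 + ..., the total size of the
-- inputs that divMod2 sees while converting n.
totalWork :: Int -> Int
totalWork n = sum (takeWhile (> 0) (iterate (`div` 2) n))
```

For 13, `toBinary (fromInt 13)` gives `[I,O,I,I]` and `totalWork 13` is 13 + 6 + 3 + 1 = 23, which is below 2 * 13.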



Answer 3:

If we have a function to increment a BNat, we can do this quite easily by running along the UNat, incrementing a BNat at each step:

toBNat :: UNat -> BNat
toBNat = toBNat' End
    where
    toBNat' :: BNat -> UNat -> BNat
    toBNat' c Zero     = c
    toBNat' c (Succ n) = toBNat' (increment c) n

Now, this is O(NM) where M is the worst case for increment. So if we can do increment in O(1), then the answer is yes.

Here's my attempt at implementing increment:

increment :: BNat -> BNat
increment = (reverse End) . inc' . (reverse End)
    where
    inc' :: BNat -> BNat
    inc' End      = One End
    inc' (Zero n) = One n
    inc' (One n)  = Zero (inc' n)

    -- Reverse the digits of a BNat onto the accumulator c.
    reverse :: BNat -> BNat -> BNat
    reverse c End = c
    reverse c (One n) = reverse (One c) n
    reverse c (Zero n) = reverse (Zero c) n

This implementation of increment is linear in the length of the BNat (O(log N) for a number N), because you have to reverse the BNat to look at the least significant bits, which gives you O(N log N) overall. If we consider the BNat type to represent reversed binary numbers, we don't need to reverse the BNat, and, as @augustss says, we have O(1) (amortized), which gives you O(N) overall.
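A minimal sketch of the reversed-representation variant might look like the following. It assumes that the outermost constructor of BNat is the least significant bit, and it renames the unary zero to `UZero` (a hypothetical name) because two constructors called `Zero` cannot live in the same Haskell module.

```haskell
data UNat = Succ UNat | UZero

-- Assumption: the outermost constructor is the least significant bit,
-- so Zero (Zero (One End)) stands for 4.
data BNat = One BNat | Zero BNat | End deriving (Eq, Show)

-- No reversal needed: the carry only walks over the trailing ones.
increment :: BNat -> BNat
increment End      = One End
increment (Zero n) = One n
increment (One n)  = Zero (increment n)

-- One pass over the UNat, incrementing the accumulator at each Succ.
toBNat :: UNat -> BNat
toBNat = go End
  where
    go acc UZero    = acc
    go acc (Succ n) =
      let acc' = increment acc
      in acc' `seq` go acc' n  -- force each step to avoid a long thunk chain
```

If the most-significant-first form is needed, one final reversal of the result keeps the whole conversion linear.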