Let these datatypes represent unary and binary natural numbers, respectively:
data UNat = Succ UNat | Zero
data BNat = One BNat | Zero BNat | End
u0 = Zero
u1 = Succ Zero
u2 = Succ (Succ Zero)
u3 = Succ (Succ (Succ Zero))
u4 = Succ (Succ (Succ (Succ Zero)))
b0 = End -- 0
b1 = One End -- 1
b2 = One (Zero End) -- 10
b3 = One (One End) -- 11
b4 = One (Zero (Zero End)) -- 100
(Alternatively, one could use `Zero End` as b1, `One End` as b2, `Zero (Zero End)` as b3...)
My question is: is there any way to implement the function `toBNat :: UNat -> BNat` so that it works in O(N), doing only one pass through the `UNat`?
I like the other answers, but I find their asymptotic analyses complicated. I therefore propose another answer that has a very simple asymptotic analysis. The basic idea is to implement `divMod 2` for unary numbers. Thus:
Now we can convert to binary by iterating `divMod`.
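
A minimal sketch of what such a unary `divMod 2` and the iteration over it could look like. The constructor names `UZero`, `BZero` and `BOne` are assumptions made here so that both types compile in one module (the question reuses `Zero` in both types), and the result is assumed to be least-significant bit first, with `End` standing for zero.

```haskell
-- Constructors renamed from the question (assumption) to avoid the
-- clash between the two `Zero` constructors.
data UNat = Succ UNat | UZero
  deriving (Eq, Show)
data BNat = BOne BNat | BZero BNat | End
  deriving (Eq, Show)

-- Halve a unary number: returns the quotient and whether the
-- remainder is 1. One pass over the input, so O(n).
divMod2 :: UNat -> (UNat, Bool)
divMod2 UZero           = (UZero, False)
divMod2 (Succ UZero)    = (UZero, True)
divMod2 (Succ (Succ n)) = let (q, r) = divMod2 n in (Succ q, r)

-- Emit one binary digit per halving step, least significant first.
toBinary :: UNat -> BNat
toBinary UZero = End
toBinary n     = case divMod2 n of
  (q, True)  -> BOne  (toBinary q)
  (q, False) -> BZero (toBinary q)
```

For example, `toBinary (Succ (Succ (Succ (Succ UZero))))` evaluates to `BZero (BZero (BOne End))`, i.e. 4 written with its lowest digit first.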
The asymptotic analysis is now pretty simple. Given a number `n` in unary notation, `divMod2` (the unary `divMod 2` above) takes O(n) time to produce a number half as big -- say, it takes at most `c*n` time for large enough `n`. Iterating this procedure therefore takes this much time: `c*(n + n/2 + n/4 + n/8 + ...)`. As we all know, this series converges to `c*(2*n)`, so `toBinary` is also in O(n) with witness constant `2*c`.
If we have a function to increment a `BNat`, we can do this quite easily by running along the `UNat`, incrementing a `BNat` at each step:
Now, this is `O(NM)`, where `M` is the worst case for `increment`. So if we can do `increment` in O(1), then the answer is yes.
Here's my attempt at implementing `increment`:
This implementation is `O(N)` because you have to `reverse` the `BNat` to look at the least significant bits, which gives you `O(N²)` overall by the `O(NM)` bound above. If we consider the `BNat` type to represent reversed binary numbers, we don't need to reverse the `BNat`, and, as @augustss says, we have O(1), which gives you O(N) overall.
To increment a binary number, you have to flip the first zero at the end of your number and all the ones preceding it. The cost of this operation is proportional to the number of 1s at the end of your input (for this you should represent the number as a right-to-left list, e.g. the list [1;0;1;1] codes for 13).
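
The two variants discussed above (reversing a most-significant-first number on every increment, versus keeping the digits least-significant-first) could be sketched as follows. This is an illustration under assumptions, not the code from the answers: the constructors are renamed to `UZero`, `BZero` and `BOne` so the two types can share a module, and `reverseB`, `incLSB`, `incMSB` and `toBNatLSB` are hypothetical names.

```haskell
data UNat = Succ UNat | UZero
data BNat = BOne BNat | BZero BNat | End
  deriving (Eq, Show)

-- Reverse the digit order of a BNat (linear in the number of digits).
reverseB :: BNat -> BNat
reverseB = go End
  where
    go acc End       = acc
    go acc (BOne b)  = go (BOne acc) b
    go acc (BZero b) = go (BZero acc) b

-- Increment with the least significant digit first: flip the
-- leading run of ones to zeros, then the first zero to a one.
incLSB :: BNat -> BNat
incLSB End       = BOne End
incLSB (BZero b) = BOne b
incLSB (BOne b)  = BZero (incLSB b)

-- Increment with the most significant digit first (as in b0..b4 of
-- the question): pays for two reversals on every call.
incMSB :: BNat -> BNat
incMSB = reverseB . incLSB . reverseB

-- One pass over the unary number, incrementing an LSB-first
-- accumulator at each step.
toBNatLSB :: UNat -> BNat
toBNatLSB = go End
  where
    go acc UZero    = acc
    go acc (Succ n) = let acc' = incLSB acc in acc' `seq` go acc' n

-- A single final reversal recovers the question's digit order.
toBNat :: UNat -> BNat
toBNat = reverseB . toBNatLSB
```

With these definitions, `toBNat (Succ (Succ (Succ (Succ UZero))))` gives `BOne (BZero (BZero End))`, matching `b4` in the question up to the renamed constructors.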
Let a(n) be the number of 1s at the end of n:
and let
be the sum of elements between two powers of 2. You should be able to convince yourself that s(k+1)=2*s(k) + 1 (with s(0) = 1) by noticing that
is obtained by concatenating
And therefore, as a geometric series, s(k) = 2^k - 1.
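
One way to see where a closed form of this shape comes from is a direct count. The sketch below uses its own sum S(k), taken over all n below 2^k; that indexing is an assumption made here for illustration and may differ from the definition of s(k) used above. It relies on the observation that n ends in at least j ones exactly when n ≡ 2^j − 1 (mod 2^j), and on charging each increment a(n) + 1 digit visits.

```latex
\begin{align*}
S(k) &= \sum_{n=0}^{2^k-1} a(n)
      = \sum_{j=1}^{k} \#\{\, 0 \le n < 2^k : n \equiv 2^j - 1 \pmod{2^j} \,\}
      = \sum_{j=1}^{k} 2^{k-j} = 2^k - 1, \\
S(k+1) &= 2\,S(k) + 1, \qquad S(0) = 0, \\
\sum_{n=0}^{N-1} \bigl(a(n) + 1\bigr) &= N + S(k) = 2N - 1
      \quad \text{for } N = 2^k .
\end{align*}
```

So under these assumptions, N successive increments starting from zero visit fewer than 2N digits in total.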
Now the cost of incrementing N times a number should be proportional to
Therefore, if you take care to represent your numbers from right to left, then the naive algorithm is linear (note that you can perform a list reversal at the end and stay linear if you really need your numbers the other way around).
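
The linearity claim can also be checked empirically. The sketch below is an illustration only: it counts how many digits each increment visits on a right-to-left representation and sums the counts over the whole conversion. The renamed constructors and the names `incCost`, `toBNatCost` and `fromInt` are assumptions introduced here.

```haskell
data UNat = Succ UNat | UZero
data BNat = BOne BNat | BZero BNat | End
  deriving (Eq, Show)

-- Increment a least-significant-first number and report how many
-- digits were visited (the trailing ones, plus one).
incCost :: BNat -> (BNat, Int)
incCost End       = (BOne End, 1)
incCost (BZero b) = (BOne b, 1)
incCost (BOne b)  = let (b', c) = incCost b in (BZero b', c + 1)

-- One pass over the unary number, threading the binary accumulator
-- and the running total of visited digits.
toBNatCost :: UNat -> (BNat, Int)
toBNatCost = go End 0
  where
    go acc cost UZero    = (acc, cost)
    go acc cost (Succ n) =
      let (acc', c) = incCost acc
          cost'     = cost + c
      in acc' `seq` cost' `seq` go acc' cost' n

-- Hypothetical helper for building inputs: k applications of Succ.
fromInt :: Int -> UNat
fromInt k = iterate Succ UZero !! k
```

For instance, `toBNatCost (fromInt 8)` returns `(BZero (BZero (BZero (BOne End))), 15)`: eight increments visit 15 digits in total, which stays below 2 * 8.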