I was wondering how I could go about multiplying a series of binary bits using bitwise operators. However, I'm interested in doing this to find the a decimal fraction value for the binary value. Here's an example of what I'm trying to do:
Given, say: 1010010,
I want to use each individual bit so that it will be computed as:
1*(2^-1) + 0*(2^-2) + 1*(2^-3) + 0*(2^-4).....
Though I'm interested in doing this in ARM assembly, having an example in C/C++ will still help as well.
I was thinking of performing a loop with a counter, where each time the loop iterates, a counter will increment, the value will be logically shifted left so that the first bit will be taken, and multiplied by 2^-counter.
However, I'm not entirely sure how I would go about just getting just the first bit/MSB to multiply, and I'm confused as to how I would multiply that value by base 2 to some negative power.
I know that logical shift lefts will multiply it with base two, but those usually have positive exponents. Ex. LSL r0, 2 means the value in r0 will be multiplied by 2^2 = 4.
Thanks in advance!
it sounds like you are on the right track...just do it.
if I have these binary numbers I am multiplying
abcd
* 1011
=======
where a, b, c, d are just bits, one or zero which dont matter just yet as you will see why
1011 as you know is 1*(2^0)+1(2^1)+0*(2^2)+1(2^3).
exactly like grade school math, but simpler since we only care about 1 and 0 and not 9 to 0..
abcd
* 1011
=======
abcd (abcd * 1 * 2^0)
0000 (abcd * 1 * 2^1)
abcd (abcd * 0 * 2^2)
+ abcd (abcd * 1 * 2^3)
==========
and if the light bulb didnt go off yet then
abcd
* 1011
=======
abcd (abcd << 0)
00000 (abcd << 1)
abcd00 (0 << 2)
+ abcd000 (abcd << 3)
==========
Then you add up those values.
unsigned long long mymult ( unsigned int a, unsigned int b )
{
unsigned long long ra;
unsigned int rb;
ra=0;
for(rb=0;rb<32;rb++)
{
if(b&(1<<rb)) ra+=(unsigned long long a)<<rb;
}
}
Which should be pretty easy to implement in assembly. Using a simple calculator (well using one that does hex) you can very quickly see that 0xFF * 0xFF = 0xFE01, and if you keep trying things you should realize that it can take up to twice as many bits as the width of the operand to hold the result. If you multiply two 8 bit numbers, to handle all the possible combinations you need a 16 bit results. Now too many processors that have a multiply dont actually do it that way making the processors multiply somewhat useless IMO. So you can choose to do a simple 32 bit = 32bit * 32 bit, or you can try something like above and do a 64 bit = 32 bit * 32bit (assuming the compiler interprets the lengths of long and int as I assume them to be). You might want to start with 32 bit = 32 bit * 32 bit and go from there. it gets pretty tricky beyond that, another topic. (which can also be easily modeled in a C example, fairly trivial but not as trivial as this below).
unsigned int mymult ( unsigned int a, unsigned int b )
{
unsigned int ra;
unsigned int rb;
ra=0;
for(rb=0;rb<32;rb++)
{
if(b&(1<<rb)) ra+=a<<rb;
}
}
Negative powers? well if 2^4 means shift left 4 then 2^(-4) would mean shift right 4 yes? that covers negative powers.
Just like floating point you need to arbitrarily choose where your decimal point is and normalize to it. So if you want 4 bits of fraction and 28 bits of whole number then if you want to multiply 5 * 7 you need to adjust those numbers to 5<<4 and 7<<4 normalizing them to your decimal location. 0b1010000 and 0b1110000 multiplying gives the result 0b1000110000. 35 (0x23) with no fraction. the first bit to the right of your imaginary decimal point is 2^-1 or 1/2, the next bit over is 2^-2 or 1/4th and so on.
This is how floating point works but it does it basically in a scientific notation style with most of the meat of the number to the right of the decimal place. Just like scientific notation math from middle or high school you have to follow those rules for preparing your numbers before and after the simple math operation. exponents have to match before you can add for example, multiplication they dont but you add the exponents as well as multiply the number portion...
Multiplying two numbers using only bitwise operations (AND
, OR
, XOR
, <<
, >>
) is perfectly possible, although probably not very efficient. You may want to read the relevant Wikipedia articles on Adder (electronics) and Binary multiplier.
A bottom-up approach for multiplication would be to create first a binary adder. For the lowest bit (bit 0) a half adder works fine. S means sum, C means carry.
For the rest of the bits you need a full adder for each bit. Cin means 'carry in', Cout means 'carry out':
The simplest logical circuit to sum several bits is called ripple-carry adder:
Ripple-carry adder is basically a series of full adders, the carry is propagated to the full adder computing the next more significant bit. Other more efficient methods do exist, but due to reasons of simplicity, I skip them. So now we have a binary adder.
The binary multiplier is a more difficult case. But as I see this more as a proof of concept than a practical way to multiply two numbers, let's take a simpler detour.
Let's assume we want to compute the product of a
and b
, a = 100
, b = 5
. a
and b
are both 16-bit unsigned integers (can be fixed-point too). We can create an array of summands in which we write the value of a
(100) b
(5) times, or vice versa. As the highest unsigned value that is representable in 16 bits is 2^16-1 (65535), we want to create an array of 65535 unsigned integers, filled with zeros. Then we need to set some 5 values of the array to 100, using only bitwise operations.
We can do this: first we fill an array (let's call it a_array
) with the value of a
(100). Then we want to zero some values in a_array
based on the value of b
, so that b
values of a_array
stay unmodified, the rest values of a_array
are zeroed. For that we use a binary mask and AND
bitwise operation.
So we loop through the bits of b
. For each bit of b
we create a binary mask based on the value of that bit in b
. Creating such binary mask requires only bit shifts (<<
, >>
), bitwise AND
and bitwise OR
.
0 -> 0b0000 0000 0000 0000
1 -> 0b1111 1111 1111 1111
So, now we have a binary mask. But how we use it? Well, the bit 0 of b
corresponds to numerical value of 0 or 1. The bit 1 of b
corresponds to numerical value 0 or 2. The bit 2 of b
corresponds to numerical value of 0 or 4. So bit n of b
corresponds to the numerical value of 0 or 2^n. So, as we loop through the bits of b
and create a binary mask for each bit, we AND
2^n values of a_array
with the corresponding binary mask. The corresponding value in a_array
either gets zeroed or stays unmodified. In C code I use a for
loop for AND
ing through the a_array
, together with incrementing and decrementing counters. Increment and decrement are not a bitwise operations. But the for
loop is not necessary, it's used only for readability (from human point of view). Actually, I first wrote in x86-64 assembly a 4-bit * 4-bit = 4-bit multiplier to try this concept, using only and
, or
, xor
, shl
(bit shift left), and shr
(bit shift right) and call
. call
is function or procedure call, that is, not a bitwise operation, but you can inline all those functions or procedures and thus compute the product using only AND
, OR
, XOR
, <<
and >>
. So instead of a for
loop, for each bit of b
, you can AND
n (n = 1, 2, 4, 8 ...) corresponding values of a_array
using the bitwise mask based on the corrsponding bit of b
. For a 16-bit * 16-bit = 16-bit multiplication that requires 65535 AND
commands (without a loop). Computers have no problem with such an input, but humans tend to have problems reading such code. For that reason a for
loop is used.
Now we have a_array
filled with b
values of a
, the rest are zeroes. The rest is simple: we just add all the values of a_array
using out bitwise adder (it's the function my_add
in the below C code).
Here's the code for 16-bit * 16-bit = 16-bit unsigned integer multiplication. Please note that the function memset16
assumes a little-endian architecture. Converting memset16
to a big-endian architecture should be trivial. The code works for fixed-point multiplication too, you only need to add a bit shift in the end. Converting to different variable sizes as well as implementing overflow detection should be trivial too. Tasks are left for the reader. Compiles with GCC, tested in x86-64 Linux.
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#define NUMBER_OF_BITS 16
#define MAX_VALUE 65535
typedef uint8_t u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef int8_t s8;
typedef int16_t s16;
typedef int32_t s32;
typedef struct result_struct{
u16 result;
u16 carry;
} result_struct;
u16 extend_lowest_bit(u16 a)
{
/* extends lowest bit (bit 0) to all bits. */
u16 a_extended;
a = (a & 1);
a_extended = a | (a << 1) | (a << 2) | (a << 3) | (a << 4);
a_extended = a_extended | (a << 5) | (a << 6) | (a << 7) | (a << 8);
a_extended = a_extended | (a << 9) | (a << 10) | (a << 11) | (a << 12);
a_extended = a_extended | (a << 13) | (a << 14) | (a << 15);
return a_extended;
}
result_struct my_add(u16 a, u16 b)
{
/* computes (a + b). */
result_struct add_results;
u16 a0, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15;
u16 b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12, b13, b14, b15;
u16 carry, result = 0;
/* prepare for bitwise addition by separating
* each bit of summands a and b using bitwise AND. */
a0 = a & 1;
a1 = a & (1 << 1);
a2 = a & (1 << 2);
a3 = a & (1 << 3);
a4 = a & (1 << 4);
a5 = a & (1 << 5);
a6 = a & (1 << 6);
a7 = a & (1 << 7);
a8 = a & (1 << 8);
a9 = a & (1 << 9);
a10 = a & (1 << 10);
a11 = a & (1 << 11);
a12 = a & (1 << 12);
a13 = a & (1 << 13);
a14 = a & (1 << 14);
a15 = a & (1 << 15);
b0 = b & 1;
b1 = b & (1 << 1);
b2 = b & (1 << 2);
b3 = b & (1 << 3);
b4 = b & (1 << 4);
b5 = b & (1 << 5);
b6 = b & (1 << 6);
b7 = b & (1 << 7);
b8 = b & (1 << 8);
b9 = b & (1 << 9);
b10 = b & (1 << 10);
b11 = b & (1 << 11);
b12 = b & (1 << 12);
b13 = b & (1 << 13);
b14 = b & (1 << 14);
b15 = b & (1 << 15);
add_results.result = a0 ^ b0;
/* result: 0000 0000 0000 000x */
carry = (a0 & b0) << 1;
add_results.result = add_results.result | (a1 ^ b1 ^ carry);
/* result: 0000 0000 0000 00xx */
carry = ((carry & (a1 ^ b1)) | (a1 & b1)) << 1;
add_results.result = add_results.result | (a2 ^ b2 ^ carry);
/* result: 0000 0000 0000 0xxx */
carry = ((carry & (a2 ^ b2)) | (a2 & b2)) << 1;
add_results.result = add_results.result | (a3 ^ b3 ^ carry);
/* result: 0000 0000 0000 xxxx */
carry = ((carry & (a3 ^ b3)) | (a3 & b3)) << 1;
add_results.result = add_results.result | (a4 ^ b4 ^ carry);
/* result: 0000 0000 000x xxxx */
carry = ((carry & (a4 ^ b4)) | (a4 & b4)) << 1;
add_results.result = add_results.result | (a5 ^ b5 ^ carry);
/* result: 0000 0000 00xx xxxx */
carry = ((carry & (a5 ^ b5)) | (a5 & b5)) << 1;
add_results.result = add_results.result | (a6 ^ b6 ^ carry);
/* result: 0000 0000 0xxx xxxx */
carry = ((carry & (a6 ^ b6)) | (a6 & b6)) << 1;
add_results.result = add_results.result | (a7 ^ b7 ^ carry);
/* result: 0000 0000 xxxx xxxx */
carry = ((carry & (a7 ^ b7)) | (a7 & b7)) << 1;
add_results.result = add_results.result | (a8 ^ b8 ^ carry);
/* result: 0000 000x xxxx xxxx */
carry = ((carry & (a8 ^ b8)) | (a8 & b8)) << 1;
add_results.result = add_results.result | (a9 ^ b9 ^ carry);
/* result: 0000 00xx xxxx xxxx */
carry = ((carry & (a9 ^ b9)) | (a9 & b9)) << 1;
add_results.result = add_results.result | (a10 ^ b10 ^ carry);
/* result: 0000 0xxx xxxx xxxx */
carry = ((carry & (a10 ^ b10)) | (a10 & b10)) << 1;
add_results.result = add_results.result | (a11 ^ b11 ^ carry);
/* result: 0000 xxxx xxxx xxxx */
carry = ((carry & (a11 ^ b11)) | (a11 & b11)) << 1;
add_results.result = add_results.result | (a12 ^ b12 ^ carry);
/* result: 000x xxxx xxxx xxxx */
carry = ((carry & (a12 ^ b12)) | (a12 & b12)) << 1;
add_results.result = add_results.result | (a13 ^ b13 ^ carry);
/* result: 00xx xxxx xxxx xxxx */
carry = ((carry & (a13 ^ b13)) | (a13 & b13)) << 1;
add_results.result = add_results.result | (a14 ^ b14 ^ carry);
/* result: 0xxx xxxx xxxx xxxx */
carry = ((carry & (a14 ^ b14)) | (a14 & b14)) << 1;
add_results.result = add_results.result | (a15 ^ b15 ^ carry);
/* result: xxxx xxxx xxxx xxxx */
add_results.carry = ((carry & (a15 ^ b15)) | (a15 & b15)) << 1;
return add_results;
}
result_struct add_array(void* array, s32 size)
{
/* adds together all u16 values of the array. */
result_struct add_results;
u16* i;
u16* top_address;
add_results.result = 0;
add_results.carry = 0;
for (i = array; i < ((u16*)array + size); i++)
{
add_results = my_add(add_results.result, *i);
}
return add_results;
}
void memset16(void* dest, u16 value, s32 size)
{
/* does a 16-bit memset. size is the number of u16's (words). */
u8* i;
for (i = (u8*)dest; i < ((u8*)dest+(2*size)); i+=2)
{
memset(i, (int)(value & 0xff), 1);
memset(i+1, (int)(value >> 8), 1);
}
}
result_struct my_mul(u16 a, u16 b)
{
/* computes (a * b) */
u16 bitmask, a_array[MAX_VALUE];
u32 block_length;
s16 bit_i;
s32 count, size;
u16* i;
void* p_a_array;
p_a_array = a_array;
result_struct mul_results;
mul_results.result = 0;
size = MAX_VALUE;
memset16(p_a_array, a, size); // can be replaced with AND.
/* mask the summands. can be unrolled to
* use only bitwise operations. */
i = p_a_array;
for (bit_i = 0, block_length = 1; bit_i < NUMBER_OF_BITS; bit_i++)
{
bitmask = extend_lowest_bit(b >> bit_i);
for (count = block_length; count > 0; count--)
{
*i = (*i & bitmask);
i++;
}
block_length <<= 1;
}
/* the array of summands is now masked. */
/* add the values of the array together. */
mul_results = add_array(p_a_array, MAX_VALUE);
return mul_results;
}
int main(void)
{
int a, b;
result_struct multiply_results;
printf("Enter the 1st unsigned 16-bit integer.\n");
scanf("%d", &a);
printf("Enter the 2nd unsigned 16-bit integer.\n");
scanf("%d", &b);
multiply_results = my_mul((u16)a, (u16)b);
printf("%d * %d = %d\n", a, b, multiply_results.result);
return 0;
}
As an exercise, perhaps this may help:
#include <iostream>
using namespace std;
int main()
{
unsigned bits = 0x52; // 01010010
int n = 7; // position of the radix point E.g. 0.1010010
double result = 0;
for( int i=0 ; i<n ; ++i )
{
result += (bits>>i) & 1;
result *= 0.5;
}
cout << result << endl; // 0.640625
return 0;
}
One way to think of this is to take
1*(2^-1) + 0*(2^-2) + 1*(2^-3) + 0*(2^-4) + ...
and rewrite it as
(1+(0+(1+(0+(0+(1+0/2)/2)/2)/2)/2)/2)/2
As you can see the original binary number 1010010 shows up directly in this evaluation.
Multiplication using only bitwise arithmetic is masochism, as is addition. The hardware you would be attempting to replace is nontrivial. (Note: the other answers here use native addition; such is not a bitwise implementation.)
If you have an integer type twice as wide as your operands, then just perform regular multiplication and use a right shift to align the radix point to the correct bit. If you multiply two 8-bit fractions in the range [0, 1), then the result should be shifted right by 8 bits to take the radix point from the 216's place back to the 28's place.
Given the prevalence of long long
, this technique portably handles multiplication of 32-bit fractions.
const uint64_t fix_factor
= static_cast< uint64_t >( std::numeric_limits< uint32_t >::max() ) + 1;
uint32_t a = (7./13) * fix_factor;
uint32_t b = (13./14) * fix_factor;
uint32_t prod = static_cast< uint64_t >( a ) * b / fix_factor;
(Live demos with x86-64 assembly and sensible output.)
If you have access to assembly, there should be no need for any tricks no matter what, since most CPUs offer a multiply-high instruction. (This is used to implement long long
multiplication, but with assembly you can be sure the unnecessary low-order result bits are not computed.)
Multiplication with bitwise operations is perfectly doable and it can be parallelized for efficiency; See how to perform addition of three bit vectors efficiently.
int sum_three(int a, int b, int c)
{
int sum=a^b^c,c2;
c=((c&a)|(a&b)|(c&b))<<1;
while (c) {
c2=(c&sum)<<1;
sum^=c;
c=c2;
}
return sum;
}
Now that we have a loop that can add three vectors to one, we can either recursively split the standard multiplication matrix, or to implement the CSA tree already mentioned in my earlier answer.
abcdefg
* hijklmn
---------
abcdefg * n <-- bitwise AND, sum these three vectors with the given
abcdefg * m <-- bitwise AND algorithm to produce vector NMLNMLNMLNML
abcdefg * l <-- bitwise AND
...
The first three rows will produce a 9-element vector 0000LMNLMNLMN = a1
The next three rows will produce a vector 0IJKIJKIJK000 = b1
And finally h*abcdefgh needs to be added: HHHHHHH000000 = c1
Fortunately these partial sums can be added with a single additional call to sum_three(a1,b1,c1);
The bit-serial multiplier, or CSA multiplier in iterative form can be expressed as:
acc_0 = 0;
cry_0 = 0;
loop (max 2n) times:
add_n = vectorY.bit0 AND (vectorX << n);
vectorY >>= 1;
cry_i+1 = (cry_i & add_i) | (add_i & acc_i) | (acc_i & cry_i);
acc_i+1 = acc_i ^ cry_i ^ add_i;
One can break the loop when both cry_i+1 == 0 and vectorY == 0.