How to handle arbitrarily large integers

2019-01-09 02:06发布

问题:

I'm working on a programming language, and today I got the point where I could compile the factorial function(recursive), however due to the maximum size of an integer the largest I can get is factorial(12). What are some techniques for handling integers of an arbitrary maximum size. The language currently works by translating code to C++.

回答1:

If you need larger than 32-bits you could consider using 64-bit integers (long long), or use or write an arbitrary precision math library, e.g. GNU MP.



回答2:

If you want to roll your own arbitrary precision library, see Knuth's Seminumerical Algorithms, volume 2 of his magnum opus.



回答3:

If you're building this into a language (for learning purposes I'd guess), I think I would probably write a little BCD library. Just store your BCD numbers inside byte arrays.

In fact, with today's gigantic storage abilities, you might just use a byte array where each byte just holds a digit (0-9). You then write your own routine to add, subtract multiply and divide your byte arrays.

(Divide is the hard one, but I bet you can find some code out there somewhere.)

I can give you some Java-like psuedocode but can't really do C++ from scratch at this point:

class BigAssNumber {
    private byte[] value;

    // This constructor can handle numbers where overflows have occurred.
    public BigAssNumber(byte[] value) {
        this.value=normalize(value);
    }

    // Adds two numbers and returns the sum.  Originals not changed.
    public BigAssNumber add(BigAssNumber other) {
        // This needs to be a byte by byte copy in newly allocated space, not pointer copy!
        byte[] dest = value.length > other.length ? value : other.value;         

        // Just add each pair of numbers, like in a pencil and paper addition problem.
        for(int i=0; i<min(value.length, other.value.length); i++)
            dest[i]=value[i]+other.value[i];

        // constructor will fix overflows.
        return new BigAssNumber(dest);
    }

    // Fix things that might have overflowed  0,17,22 will turn into 1,9,2        
    private byte[] normalize(byte [] value) {
        if (most significant digit of value is not zero)
            extend the byte array by a few zero bytes in the front (MSB) position.

        // Simple cheap adjust.  Could lose inner loop easily if It mattered.
        for(int i=0;i<value.length;i++)
            while(value[i] > 9) {
                value[i] -=10;
                value[i+1] +=1;
            }
        }
    }
}

I use the fact that we have a lot of extra room in a byte to help deal with addition overflows in a generic way. Can work for subtraction too, and help on some multiplies.



回答4:

There's no easy way to do it in C++. You'll have to use an external library such as GNU Multiprecision, or use a different language which natively supports arbitrarily large integers such as Python.



回答5:

Other posters have given links to libraries that will do this for you, but it seem like you're trying to build this into your language. My first thought is: are you sure you need to do that? Most languages would use an add-on library as others have suggested.

Assuming you're writing a compiler and you do need this feature, you could implement integer arithmetic functions for arbitrarily large values in assembly.

For example, a simple (but non-optimal) implementation would represent the numbers as Binary Coded Decimal. The arithmetic functions could use the same algorithms as you'd use if you were doing the math with pencil and paper.

Also, consider using a specialized data type for these large integers. That way "normal" integers can use the standard 32 bit arithmetic.



回答6:

My prefered approach would be to use my current int type for 32-bit ints(or maybe change it to internally to be a long long or some such, so long as it can continue to use the same algorithms), then when it overflows, have it change to storing as a bignum, whether of my own creation, or using an external library. However, I feel like I'd need to be checking for overflow on every single arithmetic operation, roughly 2x overhead on arithmetic ops. How could I solve that?



回答7:

If I were implement my own language and want to support arbitrary length numbers, I will use a target language with the carry/borrow concept. But since there is no HLL that implements this without severe performance implications (like exceptions), I will certainly go implement it in assembly. It will probably take a single instruction (as in JC in x86) to check for overflow and handle it (as in ADC in x86), which is an acceptable compromise for a language implementing arbitrary precision. Then I will use a few functions written in assembly instead of regular operators, if you can utilize overloading for a more elegant output, even better. But I don't expect generated C++ to be maintainable (or meant to be maintained) as a target language.

Or, just use a library which has more bells and whistles than you need and use it for all your numbers.

As a hybrid approach, detect overflow in assembly and call the library function if overflow instead of rolling your own mini library.