I was perusing some code that uses arbitrary-length integers via the GNU Multi-Precision (GMP) library. The type for an MP integer is mpz_t, as defined in the gmp.h header file.
But I have some questions about the lower-level definition of this library-defined mpz_t type. In the header code:
/* THIS IS FROM THE GNU MP LIBRARY gmp.h HEADER FILE */
typedef struct
{
/* SOME OTHER STUFF HERE */
} __mpz_struct;
typedef __mpz_struct mpz_t[1];
First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?
Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?
Third question (perhaps indirectly related to the second question): The GMP documentation for the mpz_init_set_ui(mpz_t, unsigned long int) function says to use it as pass-by-value only, although one would assume that this function modifies its argument's contents within the called function (and thus would need pass-by-reference syntax). Refer to my code:
/* FROM MY CODE */
mpz_t fact_val; /* declaration */
mpz_init_set_ui(fact_val, 1); /* Initialize fact_val */
Does the single-occurrence array enable pass-by-reference automatically (due to the breakdown of array/pointer semantics in C)? I freely admit I'm kinda over-analyzing this, but I'd certainly love any discussion on this. Thanks!
The reason for this comes from the implementation of mpn. Specifically, if you're mathematically inclined you'll realise N is the set of natural numbers (1, 2, 3, 4, ...) whereas Z is the set of integers (..., -2, -1, 0, 1, 2, ...). Implementing a bignum library for Z is equivalent to doing so for N and taking into account some special rules for sign operations, i.e. keeping track of whether you need to do an addition or a subtraction and what the sign of the result is.
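As a rough illustration of that point, here is a minimal sketch of signed addition built from unsigned operations on magnitudes. The names toy_int, toy_from_int and toy_add are hypothetical and are not part of GMP. The magnitude is a single 32-bit word to keep the example short, where a real bignum would use an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy type: a sign flag on top of a natural-number magnitude.
   A real bignum library would store the magnitude as an array of words. */
typedef struct {
    int negative;        /* 0 for values >= 0, 1 for values < 0 */
    uint32_t magnitude;  /* the "N" part */
} toy_int;

/* Build a toy_int from an ordinary int (helper for the example). */
toy_int toy_from_int(int v)
{
    toy_int r;
    r.negative = v < 0;
    r.magnitude = v < 0 ? (uint32_t)0 - (uint32_t)v : (uint32_t)v;
    return r;
}

/* Signed addition expressed as unsigned add/subtract on the magnitudes.
   Destination first, const sources after. Overflow of the single-word
   magnitude is ignored in this sketch. */
void toy_add(toy_int *r, const toy_int *a, const toy_int *b)
{
    toy_int t;
    assert(r != NULL && a != NULL && b != NULL);
    if (a->negative == b->negative) {
        /* Same sign: add the magnitudes and keep the sign. */
        t.negative = a->negative;
        t.magnitude = a->magnitude + b->magnitude;
    } else if (a->magnitude >= b->magnitude) {
        /* Opposite signs: subtract the smaller magnitude from the larger. */
        t.negative = a->negative;
        t.magnitude = a->magnitude - b->magnitude;
    } else {
        t.negative = b->negative;
        t.magnitude = b->magnitude - a->magnitude;
    }
    if (t.magnitude == 0)
        t.negative = 0;  /* normalise zero to non-negative */
    *r = t;
}
```

For example, adding 5 and -8 takes the last branch: the result is negative with magnitude 3.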
Now, as for how a bignum library is implemented, the clue is the limb type: gmp.h defines mp_limb_t as an unsigned integer type, and the mpn functions operate on pointers to arrays of limbs together with their lengths.
Basically, what it comes down to is that a "limb" is an integer holding one chunk of the number's bits, and the whole number is represented as an array of limbs. The clever part is that GMP does all this in a very efficient, well-optimised manner.
Anyway, back to the discussion. The only way to pass arrays around in C is, as you know, to pass pointers to those arrays, which effectively enables pass by reference. Now, in order to keep track of what's going on, two types are defined: mp_ptr, which points to an array of mp_limb_t big enough to store your number, and mp_srcptr, which is a const version of that, so that you cannot accidentally alter the bits of the source bignums you are operating on. The basic idea is that most of the functions follow the same pattern: the destination operand comes first, followed by the const source operands. Thus, I suspect the mpz_* functions follow this convention simply to be consistent, because that is how the authors are thinking.
Short version: Because of how you have to implement a bignum lib, this is necessary.
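To make the limb-array idea concrete, here is a minimal sketch of a multi-limb addition. The names toy_limb_t, toy_ptr, toy_srcptr and toy_add_n are hypothetical stand-ins loosely modelled on the mp_limb_t / mp_ptr / mp_srcptr naming. It is not GMP's implementation. It assumes 32-bit limbs stored least significant first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a limb type and its pointer typedefs. */
typedef uint32_t toy_limb_t;
typedef toy_limb_t *toy_ptr;           /* writable limb array */
typedef const toy_limb_t *toy_srcptr;  /* read-only limb array */

/* Add two n-limb numbers (least significant limb first) into rp.
   Destination comes first, const sources after.
   Returns the carry out of the top limb (0 or 1). */
toy_limb_t toy_add_n(toy_ptr rp, toy_srcptr ap, toy_srcptr bp, size_t n)
{
    toy_limb_t carry = 0;
    assert(rp != NULL && ap != NULL && bp != NULL);
    for (size_t i = 0; i < n; i++) {
        toy_limb_t a = ap[i];
        toy_limb_t sum = a + bp[i];
        toy_limb_t c1 = sum < a;     /* 1 if a + b wrapped around */
        toy_limb_t res = sum + carry;
        toy_limb_t c2 = res < sum;   /* 1 if adding the carry wrapped */
        rp[i] = res;                 /* sources were read before this write */
        carry = c1 | c2;
    }
    return carry;
}
```

Because the function receives pointers, the caller's result array is modified in place, and the const qualifier on the sources lets the compiler reject accidental writes to them.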
First question: Does the [1] associate with the __mpz_struct? In other words, is the typedef defining a mpz_t type as a __mpz_struct array with one occurrence?
Yes.
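A small sketch can confirm how such a typedef parses. The struct fields here are made up, since the real __mpz_struct members are elided in the question, and toy_struct, toy_t, toy_t_element_count and toy_demo are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct standing in for the elided __mpz_struct. */
typedef struct {
    int size;
    unsigned long *data;
} toy_struct;

typedef toy_struct toy_t[1];  /* toy_t is "array of 1 toy_struct" */

/* Number of elements in a toy_t, computed from the types alone. */
size_t toy_t_element_count(void)
{
    return sizeof(toy_t) / sizeof(toy_struct);
}

/* Declaring a toy_t reserves storage for exactly one struct. */
int toy_demo(void)
{
    toy_t x = {{0, NULL}};
    x[0].size = 7;   /* x[0] is the one and only element */
    x->size += 1;    /* x decays to &x[0], so -> reaches the same element */
    assert(x[0].size == 8);
    return x[0].size;
}
```

Here toy_t_element_count() evaluates to 1, and both x[0].size and x->size refer to the same member.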
Second question: Why the array? (And why only one occurrence?) Is this one of those struct hacks I've heard about?
Beats me. Don't know, but one possibility is that the author wanted to make an object that was passed by reference automatically, or, "yes", possibly the struct hack. If you ever see an mpz_t object as the last member of a struct, then "almost certainly" it's the struct hack. An allocation that requests extra space on top of the sizeof of the enclosing struct would be a dead giveaway.
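For comparison, this is roughly what the struct hack looks like, sketched here with the C99 flexible array member (older code declares the trailing member with one element instead). The names toy_vec, toy_vec_new and toy_vec_sum are hypothetical, and nothing here is taken from GMP.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record whose last member is a flexible array. */
typedef struct {
    size_t count;
    unsigned long items[];  /* storage lives past the end of the struct */
} toy_vec;

/* Allocate a toy_vec with room for n items; returns NULL on failure.
   Size overflow checking is omitted in this sketch. */
toy_vec *toy_vec_new(size_t n)
{
    /* The tell-tale allocation: sizeof the struct plus the trailing elements. */
    toy_vec *v = malloc(sizeof *v + n * sizeof v->items[0]);
    if (v == NULL)
        return NULL;
    v->count = n;
    for (size_t i = 0; i < n; i++)
        v->items[i] = 0;
    return v;
}

/* Sum the items; the trailing storage is indexed like a normal array. */
unsigned long toy_vec_sum(const toy_vec *v)
{
    unsigned long total = 0;
    assert(v != NULL);
    for (size_t i = 0; i < v->count; i++)
        total += v->items[i];
    return total;
}
```

The caller releases the whole object with a single free(). Neither ingredient, a trailing array member or an oversized malloc, shows up in the typedef from the question.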
Does the single-occurrence array enable pass-by-reference automatically...?
Aha, you figured it out too. "Yes", one possible reason is to simplify pass-by-reference at the expense of more complex references.
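A minimal sketch of that effect, using a made-up struct and the hypothetical names toy_struct, toy_t, toy_init_set and toy_get rather than the real GMP types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct; the fields are invented for the example. */
typedef struct {
    long value;
    int initialized;
} toy_struct;

typedef toy_struct toy_t[1];

/* The parameter type "toy_t x" is adjusted by the compiler to
   "toy_struct *x", so these writes land in the caller's object. */
void toy_init_set(toy_t x, long v)
{
    x->value = v;
    x->initialized = 1;
}

/* Read-only access through a pointer to const. */
long toy_get(const toy_struct *x)
{
    assert(x != NULL);
    return x->value;
}
```

A caller writes toy_t f; toy_init_set(f, 1); with no & in sight, yet f holds 1 afterwards because the array name is converted to a pointer to its first element at the call.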
I suppose another possibility is that something changed in the data model or the algorithm, and the author wanted to find every reference and change it in some way. A change in type like this would leave the program with the same base type but error-out every unconverted reference.
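This does not appear to be a struct hack in the sense described on C2. It appears that they want mpz_t to have pointer semantics (presumably, they want people to use it like an opaque pointer). Consider the syntactic difference between a bare __mpz_struct and a one-element array of it: the former needs . for member access and & to pass its address, while the latter uses -> and is passed by name.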
Because C arrays decay into pointers, this also allows for automatic pass by reference for the mpz_t type.
It also allows you to use a pointer-like type without needing to malloc or free it.
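The contrast can be sketched side by side. The struct and the names toy_struct, toy_t, toy_set and toy_contrast are hypothetical. Both variables have automatic storage, so no malloc or free is involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct; the field is invented for the example. */
typedef struct {
    long value;
} toy_struct;

typedef toy_struct toy_t[1];

/* One setter serves both styles, since both end up passing a pointer. */
void toy_set(toy_struct *p, long v)
{
    assert(p != NULL);
    p->value = v;
}

long toy_contrast(void)
{
    toy_struct plain;  /* bare struct on the stack */
    toy_t handle;      /* one-element array, also on the stack */

    toy_set(&plain, 20);  /* bare struct: the caller must write & */
    toy_set(handle, 22);  /* array: the name decays to a pointer */

    /* Member access differs too: . for the struct, -> for the array. */
    return plain.value + handle->value;
}
```

With the array typedef, user code reads as if it were handling a pointer throughout, which fits the opaque-pointer style suggested above.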