I found that both the MSVC and GCC compilers allocate at least one byte per class instance, even if the class is a predicate with no member variables (or with only static member variables). The following code illustrates the point.
#include <iostream>

class A
{
public:
    bool operator()(int x) const
    {
        return x > 0;
    }
};

class B
{
public:
    static int v;
    static bool check(int x)
    {
        return x > 0;
    }
};

int B::v = 0;

void test()
{
    A a;
    B b;
    std::cout << "sizeof(A)=" << sizeof(A) << "\n"
              << "sizeof(a)=" << sizeof(a) << "\n"
              << "sizeof(B)=" << sizeof(B) << "\n"
              << "sizeof(b)=" << sizeof(b) << "\n";
}

int main()
{
    test();
    return 0;
}
Output:
sizeof(A)=1
sizeof(a)=1
sizeof(B)=1
sizeof(b)=1
My question is: why does the compiler need it? The only reason I can come up with is to ensure that all member variable addresses differ, so that we can distinguish between two members of type A or B by comparing pointers to them. But the cost of this is quite severe when dealing with small containers. Considering possible data alignment, we can get up to 16 bytes per class without variables (?!). Suppose we have a custom container that will typically hold a few int values. Then consider an array of such containers (with about 1000000 members). The overhead will be 16*1000000 bytes! A typical case where this can happen is a container class with a comparison predicate stored in a member variable. Also, considering that a class instance should always occupy some space, what kind of overhead should be expected when calling A()(value)?
It’s necessary to satisfy an invariant from the C++ standard: every C++ object of the same type needs to have a unique address to be identifiable.
If objects took up no space, then items in an array would share the same address.
Basically, it's an interplay between two requirements:
- Two different objects of the same type must be at different addresses.
- In arrays, there may not be any padding between objects.
Note that the first condition alone does not require non-zero size: Given
struct empty {};
struct foo { empty a, b; };
the first requirement could easily be met by having a zero-size a followed by a single padding byte to enforce a different address, followed by a zero-size b. However, given
empty array[2];
that no longer works, because padding between the different objects array[0] and array[1] would not be allowed.
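The two requirements can be checked directly. This is only a minimal sketch reusing the empty and foo types from above; the helper name check_distinct_addresses is made up for illustration, and the exact sizes are left to the implementation, so only the relationships are asserted.

```cpp
#include <cassert>

struct empty {};
struct foo { empty a, b; };

// Even an empty class occupies at least one byte.
static_assert(sizeof(empty) >= 1, "empty objects have non-zero size");

// Arrays have no padding between elements, so the elements themselves
// must supply the distinct addresses.
static_assert(sizeof(empty[2]) == 2 * sizeof(empty), "no inter-element padding");

// Two members of the same empty type cannot overlap either.
static_assert(sizeof(foo) >= 2 * sizeof(empty), "a and b need distinct addresses");

// Runtime view of the same facts (hypothetical helper, not from the question).
inline void check_distinct_addresses() {
    empty array[2];
    foo f;
    assert(&array[0] != &array[1]);
    assert(&f.a != &f.b);
}
```

Calling check_distinct_addresses() from main should pass silently on a conforming compiler.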
All complete objects must have a unique address; so they must take up at least one byte of storage - the byte at their address.
A typical case where it can happen is a container class with a comparison predicate stored in a member variable.
In this case, you can use the empty base class optimisation: a base subobject is allowed to have the same address as the complete object that it's part of, so can take up no storage. So you can attach the predicate to a class as a (perhaps private) base class rather than a member. It's a bit more fiddly to deal with than a member, but should eliminate the overhead.
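As a rough sketch of the difference, here are two hypothetical containers (MemberBased and BaseBased are names invented for this example) that hold four ints plus a comparison predicate. One stores the predicate as a member, the other inherits from it privately. The sketch assumes the predicate is an empty, non-final class type such as std::less<int>.

```cpp
#include <functional>

// Predicate stored as a member: it needs its own address, so it costs
// at least one byte, usually rounded up by alignment padding.
template <typename Compare>
class MemberBased {
    int values_[4] = {3, 1, 4, 1};
    Compare cmp_;
public:
    bool less(int i, int j) const { return cmp_(values_[i], values_[j]); }
};

// Predicate attached as a private base: an empty base subobject may
// share its address with the derived object, so it can take no storage.
template <typename Compare>
class BaseBased : private Compare {
    int values_[4] = {3, 1, 4, 1};
public:
    bool less(int i, int j) const {
        const Compare& cmp = *this;  // view this object as its predicate base
        return cmp(values_[i], values_[j]);
    }
};

// On mainstream compilers this is typically 16 bytes versus 20 bytes.
static_assert(sizeof(BaseBased<std::less<int>>) < sizeof(MemberBased<std::less<int>>),
              "the empty base should not add storage");
```

The base-class trick does not work if Compare is a function pointer or a final class, so generic code usually needs a fallback to a member in those cases.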
what type of overhead should be expected when calling A()(value) ?
The only overhead compared to calling a non-member function will be passing the extra this argument. If the function is inlined, then this should be eliminated (as would be the case, in general, when calling a member function that doesn't access any member variables).
There are already excellent answers that answer the main question. I would like to address the concerns you expressed with:
But the cost of this is quite severe when dealing with small-size containers. Considering possible data alignment, we can get up to 16 bytes per class without vars (?!). Suppose we have a custom container that will typically hold a few int values. Then consider an array of such containers (with about 1000000 members). The overhead will be 16*1000000! A typical case where it can happen is a container class with a comparison predicate stored in a member variable.
Avoiding the cost of holding A
If all instances of a container depend on type A, then there is no need to hold instances of A in the container. The overhead associated with the non-zero size of A can be avoided by simply creating an instance of A on the stack when needed.
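A minimal sketch of that idea, assuming the predicate type is default-constructible and stateless. SmallBag, Positive, count_matching and check_small_bag are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Stateless predicate, same shape as A in the question.
struct Positive {
    bool operator()(int x) const { return x > 0; }
};

// The predicate type is a template parameter, but no instance is stored.
template <typename Pred>
class SmallBag {
    int values_[4] = {-2, 7, 0, 5};
public:
    std::size_t count_matching() const {
        Pred pred;  // created on the stack only for the duration of the call
        std::size_t n = 0;
        for (int v : values_) {
            if (pred(v)) ++n;
        }
        return n;
    }
};

// The container pays only for its data, not for the predicate.
static_assert(sizeof(SmallBag<Positive>) == sizeof(int[4]),
              "no per-container predicate storage");

// Usage check: two of the four stored values (7 and 5) are positive.
inline void check_small_bag() {
    SmallBag<Positive> bag;
    assert(bag.count_matching() == 2);
}
```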
Not being able to avoid the cost of holding A
You may be forced to hold a pointer to A in each instance of the container if A is expected to be polymorphic. For such a container, the cost of each container goes up by the size of a pointer. Whether there are any member variables in the base class A or not makes no difference to the size of the container.
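For illustration, a sketch of the polymorphic case, assuming a non-owning pointer to an abstract predicate. Predicate, Positive, Bag and check_polymorphic_bag are hypothetical names, not anything from the question.

```cpp
#include <cassert>
#include <cstddef>

// Abstract predicate interface; concrete predicates are chosen at run time.
struct Predicate {
    virtual ~Predicate() {}
    virtual bool operator()(int x) const = 0;
};

struct Positive : Predicate {
    bool operator()(int x) const override { return x > 0; }
};

// Each container carries one pointer, whatever the concrete predicate holds.
class Bag {
    int values_[4] = {-2, 7, 0, 5};
    const Predicate* pred_;  // non-owning; the caller keeps the predicate alive
public:
    explicit Bag(const Predicate& pred) : pred_(&pred) {}
    std::size_t count_matching() const {
        std::size_t n = 0;
        for (int v : values_) {
            if ((*pred_)(v)) ++n;
        }
        return n;
    }
};

// The per-container cost is the data plus one pointer (plus any padding).
static_assert(sizeof(Bag) >= sizeof(int[4]) + sizeof(const Predicate*),
              "container stores its data and a predicate pointer");

// Usage check: 7 and 5 are the positive values.
inline void check_polymorphic_bag() {
    Positive positive;
    Bag bag(positive);
    assert(bag.count_matching() == 2);
}
```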
Impact of sizeof A
In either case, the size of an empty class should have no bearing on the storage requirements of the container.