How to correctly fix “zero-sized array in struct/u

Posted 2020-05-31 09:22

戒情不戒烟

Question:

I'm integrating some code into my library. It is a complex data structure well optimized for speed, so I'm trying not to modify it too much. The integration is going well and is almost finished (it compiles). One thing still bothers me: I'm getting the C4200 warning multiple times:

warning C4200: nonstandard extension used : zero-sized array in struct/union
Cannot generate copy-ctor or copy-assignment operator when UDT contains a zero-sized array

The code works, but this warning gives me the creeps (especially the part about the copy-ctor). The warning appears because of structures declared like this:

#pragma pack( push )
#pragma pack( 1 )
// String
struct MY_TREEDATSTR
{
    BYTE btLen;
    DWORD dwModOff;
    BYTE btPat[0];
};

typedef MY_TREEDATSTR TREEDATSTR;
typedef MY_TREEDATSTR *PTREEDATSTR;

#pragma pack( pop )

Note the btPat[0]. Is there a way to get rid of this warning easily and correctly, without breaking the code or changing too much of it? Notice the #pragma's: do they have any significance for this warning? And why is the structure declared this way anyway? (I mean the btPat thing, not the #pragma's; those I understand.)

Note: I saw this similar question, but it really didn't help me.

Update: as I said, the code works and gives correct results, so a copy constructor or assignment operator is apparently not needed. And as far as I can see in the code, none of the structures get memcpy-ed.

Answer 1:

I'll assume that you do want this to be compiled in pure C++ mode, and that you don't want just to compile some files in C and some in C++ and later link.

The warning is telling you that the compiler-generated copy constructor and assignment operator will most probably be wrong for your structure. A zero-sized array at the end of a struct is a common way, in C, of getting an array whose length is decided at runtime. It is illegal in C++, but you can get similar behavior with a size of 1:

#include <cstdlib>   // malloc, free

struct runtime_array {
   int size;
   char data[1];
};
runtime_array* create( int size ) {
   // C++ requires an explicit cast from the void* returned by malloc
   runtime_array *a = (runtime_array*)malloc( sizeof(runtime_array) + size ); // [*]
   a->size = size;
   return a;
}
int main() {
   runtime_array *a = create( 10 );
   for ( int i = 0; i < a->size; ++i ) {
      a->data[i] = 0;
   }
   free(a);
}

Structures of this type are meant to be allocated dynamically (or with dynamic stack allocation trickery) and are not usually copied. If you tried to copy one, you would get weird results:

int main() {
   runtime_array *a = create(10);
   runtime_array b = *a;          // ouch!!
   free(a);
}

In this example the compiler-generated copy constructor would reserve exactly sizeof(runtime_array) bytes on the stack and then copy only the first part of the array into b. The problem is that b has a size field saying 10 but has memory for only the single declared element, not for ten.

If you still want to be able to compile this as C, then you must resolve the warning by closing your eyes: silence that specific warning. If you only need C++ compatibility, you can manually disable copy construction and assignment:

struct runtime_array {
   int size;
   char data[1];
private:
   runtime_array( runtime_array const & );            // undefined
   runtime_array& operator=( runtime_array const & ); // undefined
};

By declaring the copy constructor and assignment operator yourself, you stop the compiler from generating them (and it won't complain about not knowing how). Because the two are private, you get a compile-time error if you use them by mistake. Since they are never called, they can be left undefined. Leaving them undefined also prevents them from being called from within another method of the class, but I assume that there are no other methods.
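If a C++11 compiler is available, the same effect can be sketched with deleted functions. This is an alternative to the private-and-undefined idiom, not something the original answer used, and the static_assert checks are only there to illustrate the result:

```cpp
#include <type_traits>

struct runtime_array {
   int size;
   char data[1];
   // Deleted (C++11): any attempt to copy is rejected at compile time.
   runtime_array( runtime_array const & ) = delete;
   runtime_array& operator=( runtime_array const & ) = delete;
};

// Copying is disabled, and the type is still standard-layout,
// so the memory layout of the two data members is unchanged.
static_assert( !std::is_copy_constructible<runtime_array>::value, "copy construction disabled" );
static_assert( !std::is_copy_assignable<runtime_array>::value, "copy assignment disabled" );
static_assert( std::is_standard_layout<runtime_array>::value, "layout unaffected" );
```

Note that declaring a copy constructor also suppresses the implicit default constructor, which is harmless here because the objects are obtained from malloc rather than constructed directly.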

Since you are refactoring to C++, I would also make the default constructor private and provide a static public inlined method that will take care of the proper allocation of the contents. If you also make the destructor private you can make sure that user code does not try to call delete on your objects:

struct runtime_array {
   int size;
   char data[1];
   static runtime_array* create( int size ) {
      runtime_array* tmp = (runtime_array*)malloc(sizeof(runtime_array)+size);
      tmp->size = size;
      return tmp;
   }
   static void release( runtime_array * a ) {
      free(a);
   }
private:
   runtime_array() {}
   ~runtime_array() {}
   runtime_array( runtime_array const & );            // undefined
   runtime_array& operator=( runtime_array const & ); // undefined
};

This ensures that user code neither creates your objects on the stack by mistake nor mixes calls to malloc/free with calls to new/delete, since you manage creation and destruction of your objects. None of these changes affects the memory layout of your objects.

[*] The size calculation here is a bit off and will overallocate, probably by as much as sizeof(int), because sizeof(runtime_array) already counts the one declared element of data plus any padding at the end of the object.
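One way to avoid the overallocation described in the note is to measure the header with offsetof instead of sizeof. The sketch below is an illustration under that assumption, not part of the original answer; the helper name runtime_array_bytes is hypothetical.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct runtime_array {
   int size;
   char data[1];
};

// Bytes to request for a runtime_array holding `count` chars.
// offsetof(..., data) is the header size without tail padding; the result
// is never smaller than sizeof(runtime_array), so the object itself always fits.
constexpr std::size_t runtime_array_bytes( std::size_t count ) {
   return offsetof( runtime_array, data ) + count < sizeof( runtime_array )
        ? sizeof( runtime_array )
        : offsetof( runtime_array, data ) + count;
}
```

A call such as malloc( runtime_array_bytes( size ) ) would then replace malloc( sizeof(runtime_array) + size ) in create.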

Answer 2:

If this is an MSVC compiler (which is what the warning message suggests), then you can disable this warning using #pragma warning, e.g.:

#pragma warning( push )
#pragma warning( disable : 4200 )
struct _TREEDATSTR
{
    BYTE btLen;
    DWORD dwModOff;
    BYTE btPat[0];
};
#pragma warning( pop )

BTW, the message about the copy constructor is not creepy but a good thing: it means that you can't copy instances of _TREEDATSTR without the unknown bytes in btPat. The compiler has no idea how big _TREEDATSTR really is (because of the zero-sized array) and therefore refuses to generate a copy constructor. This means that you can't do this:

_TREEDATSTR x=y;

which shouldn't work anyway.

Answer 3:

Try changing it to btPat[1] instead. I think both the C++ and C standards dictate that an array cannot have 0 elements. The change could cause problems for any code that relies on the size of the _TREEDATSTR struct itself, but usually structs like this are typecast from buffers where (in this case) the first byte of the buffer determines how many bytes are actually in btPat. This kind of approach relies on the fact that there is no bounds checking on C arrays.

Answer 4:

If it's complaining about the copy constructor and assignment operator, couldn't you supply your own? If you don't want them, declare them private.

This may produce a lot of errors elsewhere in the code if you are assigning or copying without realising it, in which case it wouldn't have worked anyway because there are no automatically generated ones.

Answer 5:

The main idea behind this in C is to give each _TREEDATSTR element the extra memory it needs; in other words, the allocation is done with malloc(sizeof(_TREEDATSTR) + len).

Pragma pack asks the compiler to leave no empty space between the fields (normally compilers sometimes leave unused bytes between the fields of structures to guarantee alignment, because on many modern processors this is a huge speed improvement).

Note however that there are architectures where unaligned access is not just slow... but totally forbidden (segfault) so those compilers are free to ignore the pragma pack; code that uses pragma pack is inherently unportable.

I think I would have put the DWORD first in the structure, which probably wouldn't have required a pragma pack. Another way to silence the warning is to declare a one-element array and do the allocation with (len-1) extra bytes.
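The effect of packing and of field order on the layout can be checked at compile time. The following is a rough sketch with hypothetical struct names, using fixed-width types in place of BYTE and DWORD and assuming a compiler that honours #pragma pack (MSVC, GCC and Clang do):

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

// Default layout: the compiler may insert padding before `off` to align it.
struct DefaultLayout {
    std::uint8_t  len;
    std::uint32_t off;
    std::uint8_t  pat[1];
};

#pragma pack( push, 1 )
// Packed layout: no padding, so `off` starts at offset 1 and may be unaligned.
struct PackedLayout {
    std::uint8_t  len;
    std::uint32_t off;
    std::uint8_t  pat[1];
};
#pragma pack( pop )

// 4-byte field first: no padding is needed between the fields even without packing.
struct ReorderedLayout {
    std::uint32_t off;
    std::uint8_t  len;
    std::uint8_t  pat[1];
};

static_assert( offsetof( PackedLayout, pat ) == 5, "packed header is 5 bytes" );
static_assert( offsetof( ReorderedLayout, pat ) == 5, "reordered header is also 5 bytes" );
static_assert( sizeof( DefaultLayout ) >= sizeof( PackedLayout ), "padding can only add bytes" );
```

Reordering the fields changes the binary layout, so it is only an option when no existing data or code depends on the original order.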

In C++ all this is quite dangerous, because you're basically fooling the compiler into thinking that the object is smaller than it really is. Given that C++ is a copy-logic language, this means asking for trouble (for example, copy construction and assignment will act only on the first part of the object). For everyday use it's surely MUCH better to use, for example, a std::vector instead, but this of course comes at a higher price (double indirection, more memory for every _TREEDATSTR instance).
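As a rough illustration of the std::vector alternative, the sketch below keeps the pattern bytes in a vector so that the compiler-generated copy operations copy the whole payload. The name TreeString and its fields are hypothetical stand-ins for _TREEDATSTR, and the packed in-buffer layout of the original is not preserved.

```cpp
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical replacement for the variable-sized struct (names are assumptions).
struct TreeString {
    std::uint32_t modOff = 0;        // plays the role of dwModOff
    std::vector<std::uint8_t> pat;   // plays the role of btPat; pat.size() replaces btLen
};

// The implicit copy operations exist and copy every byte held by `pat`.
static_assert( std::is_copy_constructible<TreeString>::value, "copyable" );
static_assert( std::is_copy_assignable<TreeString>::value, "assignable" );
```

The cost mentioned above is visible here: the bytes live in a separate heap block owned by the vector, reached through one more pointer than in the single-allocation trick.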

I normally don't like thinking all other programmers are idiots, so if this kind of bad trickery has been used then there is probably a good, well-paying reason for it... For a definitive judgment, however, a much deeper inspection would be needed.

To summarize:

Using a zero-element array at the end of a struct is a trick for creating variable-sized objects. The allocation is done by requesting sizeof(structure) + n*sizeof(array_element) bytes.
Pragma pack tells the compiler to avoid adding extra padding bytes between structure fields. This is needed when precise control over the memory layout is required (for example, because those objects are being accessed by hand-written assembly).
Don't do that in C++ unless you really need it and you know what you're doing.

There is no way to "correctly" silence the warning, because the code wants to play dirty (and C++ compilers don't like to be fooled about object size). If you use this object inside other objects, or as a base for other objects, or pass it around by value, then whatever bad happens, you asked for it.

Answer 6:

Although I realise that this is an old thread, I would like to give my pure C++11 solution to the OP's question. The idea is to wrap the object to be allocated, adding padding so that the objects in an array are aligned to power-of-2 addresses, in the following way:

template<typename T, std::size_t ObjectPaddingSize>
struct PaddedType : private T { private: char padding [ ObjectPaddingSize ]; };

template<typename T> // No padding.
struct PaddedType<T, 0> : private T { };

template<typename T>
struct PaddedT : private PaddedType<T, NextPowerOfTwo<sizeof ( T )>::value - sizeof ( T )> { };

The object's padding size can be calculated at compile time with the following class (it returns L if L is a power of 2, else the next power of 2 greater than L):

template<std::size_t L>
class NextPowerOfTwo {

    template <std::size_t M, std::size_t N>
    struct NextPowerOfTwo1 {

        enum { value = NextPowerOfTwo1<N, N & ( N - 1 )>::value };
    };

    template <std::size_t M>
    struct NextPowerOfTwo1<M, 0> {

        enum { value = M << 1 };
    };

    // Determine whether L is a power of 2; if not, dispatch.

    template <std::size_t M, std::size_t N>
    struct NextPowerOfTwo2 {

        enum { value = NextPowerOfTwo1<M, M>::value };
    };

    template <std::size_t M>
    struct NextPowerOfTwo2<M, 0> {

        enum { value = M };
    };

public:

    enum { value = NextPowerOfTwo2<L, L & ( L - 1 )>::value };
};
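For comparison, the same calculation can be sketched as a C++11 constexpr function. This is an alternative formulation rather than the answer's code, and the name next_pow2 is hypothetical. Unlike the class above it yields 1 for an input of 0, and it assumes the input does not exceed the largest power of 2 representable in std::size_t.

```cpp
#include <cstddef>

// Smallest power of 2 that is >= n, found by doubling p until it reaches n.
// Assumption: n is not larger than the highest power of 2 that fits in std::size_t.
constexpr std::size_t next_pow2( std::size_t n, std::size_t p = 1 ) {
    return p >= n ? p : next_pow2( n, p << 1 );
}

static_assert( next_pow2( 5 ) == 8, "rounds up to the next power of 2" );
static_assert( next_pow2( 8 ) == 8, "a power of 2 is returned unchanged" );
```

The padding for a type T would then be next_pow2( sizeof( T ) ) - sizeof( T ), matching the expression used in PaddedT.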