Variable Sized Struct C++

2019-01-07 13:50发布

站内文章 / C++

62 0

叼着烟拽天下

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];
};

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bytelength = length;
    return output;
}

Edit: renamed variable names and changed code to be more correct.

回答1:

Some thoughts on what you're doing:

Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.
This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.
Likewise, you may need to take alignment and packing into account.
You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.
Instead of malloc and free, you could use Packet* p = ::operator new(size) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.
It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.

回答2:

If you never add a constructor/destructor, assignment operators or virtual functions to your structure using malloc/free for allocation is safe.

It's frowned upon in c++ circles, but I consider the usage of it okay if you document it in the code.

Some comments to your code:

struct Packet
{
    unsigned int bitlength;
    unsigned int data[];
};

If I remember right declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant declare your array of length 1.

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:

Packet* CreatePacket(unsigned int length)
{
    size_t s = sizeof (Packed) - sizeof (Packed.data);
    Packet *output = (Packet*) malloc(s + length * sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

And write a comment into your packet structure definition that data must be the last member.

Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packages.

Unfortunately c++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real world applications.

回答3:

This is OK (and was standard practice for C).

But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.

For Example:

void copyRHSToLeft(Packet& lhs,Packet& rhs)
{
    lhs = rhs;  // The compiler generated code for assignement kicks in here.
                // Are your objects going to cope correctly??
}


Packet*   a = CreatePacket(3);
Packet*   b = CreatePacket(5);
copyRHSToLeft(*a,*b);

Use the std::vector<> it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.

Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html

回答4:

You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];

private:
   // Will cause compiler error if you misuse this struct
   void Packet(const Packet&);
   void operator=(const Packet&);
};

回答5:

If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:

struct PacketStruct
{
    unsigned int bitlength;
    unsigned int data[];
};
class PacketClass
{
public:
    unsigned int bitlength;
    unsigned int data[];
};

The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.

struct Packet
{
    unsigned long bytelength;
    unsigned char data[];

    Packet(unsigned long length = 256)  // default constructor replaces CreatePacket()
      : bytelength(length),
        data(new unsigned char[length])
    {
    }

    ~Packet()  // destructor to avoid memory leak
    {
        delete [] data;
    }
};

A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:

Packet p;  // default packet with 256-byte data array
Packet p(1024);  // packet with 1024-byte data array

The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.

回答6:

I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.

However, there are several The advantages of going with vector<>:

it already handles copy, assignment & destruction properly - if you roll your own you need to ensure you handle these correctly
all the iterator support is there - again, you don't have to roll your own.
everybody already knows how to use it

If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member and only expose via methods that just thunk to the vector methods those bits of vector that you want clients to be able to use. That should help get you going quickly with pretty good assurance that leaks and what not are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.

回答7:

You probably want something lighter than a vector for high performances. You also want to be very specific about the size of your packet to be cross-platform. But you don't want to bother about memory leaks either.

Fortunately the boost library did most of the hard part:

struct packet
{
   boost::uint32_t _size;
   boost::scoped_array<unsigned char> _data;

   packet() : _size(0) {}

       explicit packet(packet boost::uint32_t s) : _size(s), _data(new unsigned char [s]) {}

   explicit packet(const void * const d, boost::uint32_t s) : _size(s), _data(new unsigned char [s])
   {
        std::memcpy(_data, static_cast<const unsigned char * const>(d), _size);
   }
};

typedef boost::shared_ptr<packet> packet_ptr;

packet_ptr build_packet(const void const * data, boost::uint32_t s)
{

    return packet_ptr(new packet(data, s));
}

回答8:

There are already many good thoughts mentioned here. But one is missing. Flexible Arrays are part of C99 and thus aren't part of C++, although some C++ compiler may provide this functionality there is no guarantee for that. If you find a way to use them in C++ in an acceptable way, but you have a compiler that doesn't support it, you perhaps can fallback to the "classical" way

回答9:

You should declare a pointer, not an array with an unspecified length.

回答10:

There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).