
Portable and safe way to add byte offset to any po

Posted 2019-03-12 09:10

Question:

I'm quite new at working with C++ and haven't grasped all the intricacies and subtleties of the language.

What is the most portable, correct and safe way to add an arbitrary byte offset to a pointer of any type in C++11?

SomeType* ptr;
int offset = 12345 /* bytes */;
ptr = ptr + offset;             // <--

I found many answers on Stack Overflow and Google, but they all propose different things. Some variants I have encountered:

  1. Cast to char *:

    ptr = (SomeType*)(((char*)ptr) + offset);
    
  2. Cast to unsigned int:

    ptr = (SomeType*)(((unsigned int)ptr) + offset);
    
  3. Cast to size_t:

    ptr = (SomeType*)(((size_t)ptr) + offset);
    
  4. "The size of size_t and ptrdiff_t always coincide with the pointer's size. Because of this, it is these types that should be used as indexes for large arrays, for storage of pointers and pointer arithmetic." - About size_t and ptrdiff_t on CodeProject

    ptr = (SomeType*)((size_t)ptr + (ptrdiff_t)offset);
    
  5. Or like the previous, but with intptr_t instead of size_t, which is signed instead of unsigned:

    ptr = (SomeType*)((intptr_t)ptr + (ptrdiff_t)offset);
    
  6. Only cast to intptr_t, since offset is already a signed integer and intptr_t is not size_t:

    ptr = (SomeType*)(((intptr_t)ptr) + offset);
    

And in all these cases, is it safe to use old C-style casts, or is it safer or more portable to use static_cast or reinterpret_cast for this?

Should I assume the pointer value itself is unsigned or signed?

Answer 1:

I would use something like:

unsigned char* bytePtr = reinterpret_cast<unsigned char*>(ptr);
bytePtr += offset;
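
To show the full round trip, here is a minimal sketch of the same idea applied to reaching a struct member by its byte offset. The `Packet` type and the `read_payload` function are made up for illustration, and the sketch assumes a standard-layout type so that `offsetof` is usable.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

// Hypothetical standard-layout type, used only for this illustration.
struct Packet {
    int    id;
    double payload;
};

// Reads Packet::payload by stepping offsetof(Packet, payload) bytes
// from the start of the object and casting to the member's type.
double read_payload(Packet* ptr) {
    assert(ptr != nullptr);  // never offset a null pointer
    unsigned char* bytePtr = reinterpret_cast<unsigned char*>(ptr);
    bytePtr += offsetof(Packet, payload);        // offset counted in bytes
    return *reinterpret_cast<double*>(bytePtr);  // cast back to the real type
}
```

The cast back is only meaningful if an object of the target type really lives at the resulting address; the compiler cannot check that for you.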


Answer 2:

Using reinterpret_cast (or a C-style cast) circumvents the type system and is neither portable nor safe. Whether it is correct depends on your architecture. If you (must) do it, you imply that you know what you are doing, and you are basically on your own from then on. So much for the warning.

If you add a number n to a pointer of type T*, you move this pointer by n elements of type T. What you are looking for is a type where 1 element means 1 byte.
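
To make the scaling concrete, here is a small sketch that measures how many bytes a pointer actually moves when n is added to it. The helper name `byte_distance` is an assumption for illustration, not something from the standard library.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t, std::ptrdiff_t

// Hypothetical helper: returns how many bytes lie between arr and arr + n
// for an array whose elements have type T.
template <typename T, std::size_t N>
std::ptrdiff_t byte_distance(T (&arr)[N], std::size_t n) {
    assert(n <= N);  // arr + n must stay inside the array or one past its end
    const char* first = reinterpret_cast<const char*>(arr);
    const char* moved = reinterpret_cast<const char*>(arr + n);
    return moved - first;  // counted in bytes, because sizeof(char) == 1
}
```

For an `int` array, `byte_distance(arr, 3)` comes out as `3 * sizeof(int)`; for a `char` array it is exactly 3.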

From the sizeof section 5.3.3.1.:

The sizeof operator yields the number of bytes in the object representation of its operand. [...] sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1. The result of sizeof applied to any other fundamental type (3.9.1) is implementation-defined.

Note that the standard makes no such guarantee about sizeof(int), etc.

Definition of byte (section 1.7.1.):

The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. [...] The memory available to a C++ program consists of one or more sequences of contiguous bytes. Every byte has a unique address.

So, if sizeof returns the number of bytes and sizeof(char) is 1, then char has the size of one byte as far as C++ is concerned. Therefore, char is logically a byte to C++, but not necessarily the de facto standard 8-bit byte. Adding n to a char* yields a pointer that is n bytes (in terms of the C++ memory model) away. Thus, if you want to play the dangerous game of manipulating an object's pointer bytewise, you should cast it to one of the char variants. If your type also has qualifiers like const, you should transfer them to your "byte type" too.

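    #include <cstddef>      // std::ptrdiff_t
    #include <type_traits>  // std::conditional, std::is_const, std::add_const, ...

    // Copies the const/volatile qualifiers of Src onto Dst.
    template <typename Dst, typename Src>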
    struct adopt_const {
        using type = typename std::conditional< std::is_const<Src>::value,
            typename std::add_const<Dst>::type, Dst>::type;
    };

    template <typename Dst, typename Src>
    struct adopt_volatile {
        using type = typename std::conditional< std::is_volatile<Src>::value,
            typename std::add_volatile<Dst>::type, Dst>::type;
    };

    template <typename Dst, typename Src>
    struct adopt_cv {
        using type = typename adopt_const<
            typename adopt_volatile<Dst, Src>::type, Src>::type;
    };

    template <typename T>
    T*  add_offset(T* p, std::ptrdiff_t delta) noexcept {
        using byte_type = typename adopt_cv<unsigned char, T>::type;
        return reinterpret_cast<T*>(reinterpret_cast<byte_type*>(p) + delta);
    }
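
As a usage sketch for the function above: stepping through an array by sizeof(T) bytes should land on the next element. The snippet restates `add_offset` in condensed form so it compiles on its own (it only propagates const, not volatile); the `Record` type and `next_record` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>      // std::ptrdiff_t
#include <type_traits>  // std::conditional, std::is_const

// Condensed restatement of add_offset; handles const but not volatile.
template <typename T>
T* add_offset(T* p, std::ptrdiff_t delta) noexcept {
    using byte_type = typename std::conditional<std::is_const<T>::value,
                                                const unsigned char,
                                                unsigned char>::type;
    return reinterpret_cast<T*>(reinterpret_cast<byte_type*>(p) + delta);
}

// Hypothetical record type, used only for this illustration.
struct Record {
    int    id;
    double value;
};

// Moves from one array element to the next by a byte offset of sizeof(Record).
const Record* next_record(const Record* current) {
    assert(current != nullptr);
    return add_offset(current, static_cast<std::ptrdiff_t>(sizeof(Record)));
}
```

Because T is deduced as `const Record` here, the intermediate byte pointer is `const unsigned char*`, so no qualifier is dropped along the way.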


Answer 3:

Please note that NULL is special: adding an offset to it is dangerous.
reinterpret_cast cannot remove const or volatile qualifiers, so a C-style cast is the more portable way to write this.
reinterpret_cast combined with traits, as in @user2218982's answer, seems safer.

#include <cstddef>  // std::ptrdiff_t

template <typename T>
inline void addOffset( std::ptrdiff_t offset, T *&ptr ) {
    if ( !ptr )
        return;
    ptr = (T*)( (unsigned char*)ptr + offset );
} 
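
The point about qualifiers can be sketched as follows. This is only an illustration under the assumption that the pointed-to data is const; the name `add_offset_const` is made up. Writing `reinterpret_cast<unsigned char*>(ptr)` on a `const T*` does not compile, but keeping const on the byte type does.

```cpp
#include <cassert>
#include <cstddef>  // std::ptrdiff_t

// Hypothetical helper: byte-offsets a pointer to const data.
template <typename T>
const T* add_offset_const(const T* ptr, std::ptrdiff_t offset) {
    if (!ptr)
        return ptr;  // same null guard as above
    // Keep the result suitably aligned for T.
    assert(offset % static_cast<std::ptrdiff_t>(alignof(T)) == 0);
    // reinterpret_cast<unsigned char*>(ptr) would be rejected here because it
    // drops const; a const byte type makes the cast legal.
    return reinterpret_cast<const T*>(
        reinterpret_cast<const unsigned char*>(ptr) + offset);
}
```

Unlike the C-style cast, this version fails to compile if a qualifier would be silently discarded.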


Answer 4:

If you have:

myType *ptr;

and you do:

ptr+=3;

The compiler will most certainly advance the address stored in your variable by this many bytes:

3*sizeof(myType)

And that is the standard way to do it, as far as I know.

If you want to iterate over, say, an array of elements of type myType, that is the way to do it.

OK, if you want to cast, do it using

myNewType *newPtr = reinterpret_cast<myNewType *>(ptr);

Or stick to plain old C and do:

myNewType *newPtr=(myNewType *) ptr;

And then increment the new pointer.