-->

Subtraction between pointers of different type [du

Published 2019-05-18 01:23

做个烂人

Question:

This question already has an answer here:

Pointer/Address difference [duplicate] 3 answers

I'm trying to find the distance in memory between two variables. Specifically I need to find the distance between a char[] array and an int.

    char data[5];
    int a = 0;

    printf("%p\n%p\n", &data[5], &a);

    long int distance = &a - &data[5];

    printf("%ld\n", distance);

When I run my program without the last two lines I get the proper memory addresses of the two variables, something like this:

   0x7fff5661aac7
   0x7fff5661aacc

Now I understand, if I'm not wrong, that there are 5 bytes of distance between the two (0x7fff5661aac8, 0x7fff5661aac9, 0x7fff5661aaca, 0x7fff5661aacb, 0x7fff5661aacc).

Why can't I subtract a pointer of type (int *) and one of type (char *)? Both refer to memory addresses. What should I do in order to calculate the distance, in bytes, between the two? I tried casting one of the two pointers but it's not working.

I get: "error: 'char *' and 'int *' are not pointers to compatible types". Thanks to everyone who helps.

Answer 1:

No, this is not possible.

First, you can only subtract pointers to "compatible" types, and int and char are not compatible types. Hence the subtraction is not allowed.

That said, even when both pointers point to compatible types, the following still applies.

Second, you cannot just subtract two arbitrary pointers: they must both point to elements of the same array (or one past its last element). Otherwise, the subtraction invokes undefined behavior.

Quoting C11, chapter §6.5.6, Additive operators

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. [....]

Third, another important point: the result of subtracting two pointers is of type ptrdiff_t, a signed integer type.

[...] The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. [...]

So, to print the result, you need to use the %td format specifier.

Answer 2:

Pointer subtraction is only defined for pointers within the same array (or just past the last element of an array). Any other use is undefined behavior. Let's ignore that for your experimentation.

When two pointers of the same type to elements of the same array object are subtracted, the result is the difference of the array indices. You could add that signed integer result (of type ptrdiff_t) to the first pointer and get the value of the second pointer, or subtract the result from the second pointer and get the value of the first pointer. So in effect, the result is the difference in the byte addresses of the two pointers divided by the size of the object being pointed to. This is why it makes no sense to allow subtraction of pointers of incompatible type, particularly when the referenced object types are of different size. How could you divide the difference in byte addresses by the size of the object being pointed to when the pointers being subtracted are referring to differently sized objects?

Still, for experimentation purposes, you can cast both pointers (pointing to different objects) to char * and subtract them, and many compilers will just give you the difference in their byte addresses as a number. However, the result could overflow an integer of type ptrdiff_t. Alternatively, you can convert both pointers to an integer of type intptr_t and subtract the integers to get the difference in byte addresses. Again, it's theoretically possible that the result of the subtraction could overflow an integer of type intptr_t.

Answer 3:

On a standard PC nothing keeps you from casting both pointers to an integer type which can hold the pointer value, and subtracting the two integers.

Such an integer type is not guaranteed to exist on all architectures (but on many common systems it does) — imagine segmented memory with more information than just a single number. If the integer type doesn't fit, the behavior of the cast is undefined.

From the standard draft n1570, 6.3.2.3/6:

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

Often the difference between addresses will be what one expects (variables declared in succession are next to each other in memory) and can be used to tell the direction the stack grows etc.

It may be interesting to explore what else you can do with integers and pointers.

Olaf commented that if you "cast [an arithmetic computation result] back to a pointer, you invoke UB." That is not necessarily so; it depends on the integer value. The standard draft says the following in 6.3.2.3/5:

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation

(Emphasis by me.) If we compute the address of a struct member by adding an offset to the struct's address we have obviously taken care of the mentioned issues, so it's up to the implementation. It is certainly not UB; a good many embedded systems would fail if we couldn't use an integer -> pointer conversion, and access that memory through the resulting pointer. We must make sure that the system allows it, and that the addresses are sound.

The paragraph has a footnote:

The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.

That is, they are meant to not surprise the user. While in theory the addresses of unrelated objects adjacent in memory could be projected to wildly different integer values, they are not supposed to. A user can for example reasonably expect that linear memory is projected into a linear integer number space, keeping ordering and distances.

I should also emphasize (as I did in one comment) that the standard is not the world. It must accommodate and give guarantees for a wide range of machines. Therefore the standard can only be the smallest common denominator. If we can narrow the range of architectures we consider, we can make much better guarantees.

One common example is the possible presence of trap values in integer registers, or flags indicating a read from an uninitialized register, which also traps; these are responsible for a wide range of UB cases in the standard which simply do not apply, for example, to your PC.

Answer 4:

    uint8_t * ptr = ...;
    uint8_t * ptr2 = ptr + 5;

Now if ptr was 100, what will ptr2 be? Correct, it will be 105. But now look at that code:

    uint32_t * ptr = ...;
    uint32_t * ptr2 = ptr + 5;

Again, if ptr was 100, what will ptr2 be? Wrong! It won't be 105, it will be 120.

Why? Pointer arithmetic is not integer arithmetic!

    ptr2 = ptr + 5;

Actually means:

    ptr2 = int_to_ptr(ptr_to_int(ptr) + (sizeof(*ptr) * 5));

The functions int_to_ptr and ptr_to_int don't really exist; I'm just using them for demonstration purposes, so you better understand what is going on behind the scenes. Note that the scale factor is the size of the pointed-to element, sizeof(*ptr), not the size of the pointer itself.

So if you subtract two pointers, the result is not the difference of their addresses, it's the number of elements between them:

    uint32_t test[50];
    ptrdiff_t diff = &test[20] - &test[10];

diff will be 10, as the two pointers are 10 elements apart (one element is one uint32_t value). That doesn't mean there are 10 bytes between test[10] and test[20]: there are 40 bytes between them, as every uint32_t value takes up 4 bytes of memory.

Now you may understand why subtracting pointers of different types makes no sense: different types have different element sizes, so what should such a subtraction return?

If you want to know how many bytes lie between two pointers, you need to cast them both to a pointer type with one-byte elements (e.g. uint8_t * or char * would work) or cast them to void *. Arithmetic on void * is a GNU extension, though many other compilers support it as well; the element type is then unknown, so the element size is unknown too, and the compiler assumes byte-sized elements. So this may work:

    ptrdiff_t diff = (void *)ptr2 - (void *)ptr1;

yet this

    ptrdiff_t diff = (char *)ptr2 - (char *)ptr1;

is more portable.

It will compile, and it will deliver a result. Whether that result is meaningful is a different topic. Unless both pointers point into the same memory "object" (same struct, same array, same allocated memory region), it is not, as the standard says that in that case the behavior is undefined. That means diff could (legally) have any value, so a compiler may as well always set diff to 0 in that case; that would be allowed by the standard.

If you want to avoid pointer subtraction altogether, try this instead:

    intptr_t diff = (intptr_t)ptr2 - (intptr_t)ptr1;

That is legal, and the conversions are implementation-defined rather than undefined. Every pointer can be cast to an integer type, and intptr_t (an optional type from <stdint.h>) is a signed integer type that, where it exists, is guaranteed to be big enough to hold any valid void pointer. Don't use ptrdiff_t, int or long for that purpose; they do not make any such guarantee. This code converts both pointers to integers and then subtracts them. I still don't see anything useful you can do with diff now, but on common platforms the code will at least deliver a predictable result, though maybe not the one you are expecting.

Answer 5:

Try typecasting each address to void *

    long int distance = (void *)&a - (void *)&data[5];

As others will point out, this is dangerous and undefined, but if you're just exploring how memory works, it should be fine.

Answer 6:

The sizes of the types that an int pointer and a char pointer point to are different. On a system where int is 4 bytes, int_pointer++ increases the address by 4 bytes, while char_ptr++ increments it by 1 byte. That is why you are getting the error.

Answer 7:

This is because pointer arithmetic is about offsets. For example, if you have an array and a pointer to that array like:

    int array[3] = { 1, 2, 3};
    int *ptr = array;

and then you increment ptr, you expect the next value from the array, e.g. array[1] after array[0], no matter what type is stored in it. So when you subtract pointers you don't get e.g. bytes, but an offset in elements.

Don't subtract pointers that are not part of the same array.