Rationale for pointer comparisons outside an array

Posted 2019-01-18 14:48

So, the standard (referring to N1570) says the following about comparing pointers:

C99 6.5.8/5 Relational operators

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... [snip obvious definitions of comparison within aggregates] ... In all other cases, the behavior is undefined.

What is the rationale for this instance of UB, as opposed to specifying (for instance) conversion to intptr_t and comparison of that?

Is there some machine architecture where a sensible total ordering on pointers is hard to construct? Is there some class of optimization or analysis that unrestricted pointer comparisons would impede?

A deleted answer to this question mentions that this piece of UB allows for skipping comparison of segment registers and only comparing offsets. Is that particularly valuable to preserve?

(That same deleted answer, as well as one here, note that in C++, std::less and the like are required to implement a total order on pointers, whether the normal comparison operator does or not.)
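For concreteness, here is a minimal sketch of the alternative I have in mind (ptr_before is a name invented for this question, and intptr_t is an optional type, so this assumes an implementation that provides it):

#include <stdint.h>

/* Hypothetical helper, not anything the standard blesses: compare two
   unrelated pointers by converting them to intptr_t first. The conversion
   is implementation-defined (6.3.2.3), so the resulting order is only as
   meaningful as the implementation makes it, but the comparison itself is
   well-defined integer comparison rather than UB. */
static int ptr_before(const void *a, const void *b)
{
  return (intptr_t)a < (intptr_t)b;
}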

4 Answers
够拽才男人 · 2019-01-18 15:25

Historically, saying that an action invoked Undefined Behavior meant that any program making use of that action could be expected to work correctly only on those implementations which defined, for that action, behavior meeting the program's requirements. Specifying that an action invoked Undefined Behavior didn't mean that programs using it should be considered "illegitimate"; rather, it was intended to allow C to be used to run programs that didn't require such actions on platforms which could not efficiently support them.

Generally, the expectation was that a compiler would either output the sequence of instructions which would most efficiently perform the indicated action in the cases required by the Standard, and do whatever that sequence of instructions happened to do in other cases, or else output a sequence of instructions whose behavior in such cases was deemed in some fashion more "useful" than the natural sequence. In cases where an action might trigger a hardware trap, or where triggering an OS trap might plausibly be considered preferable to executing the "natural" sequence of instructions, and where a trap might cause behaviors outside the control of the C compiler, the Standard imposes no requirements. Such cases are thus labeled as "Undefined Behavior".

As others have noted, there are some platforms where p1 < p2, for unrelated pointers p1 and p2, could be guaranteed to yield 0 or 1, but where the most efficient means of comparing p1 and p2 that works in the cases defined by the Standard might not uphold the usual expectation that exactly one of p1 < p2, p1 == p2, and p1 > p2 holds. If a program written for such a platform will never deliberately compare unrelated pointers (implying that any such comparison would represent a program bug), it may be helpful to have stress-testing or troubleshooting builds generate code which traps on any such comparison. The only way for the Standard to allow such implementations is to make such comparisons Undefined Behavior.

Until recently, the fact that a particular action would invoke behavior that was not defined by the Standard would generally only pose difficulties for people trying to write code on platforms where the action would have undesirable consequences. Further, on platforms where an action could only have undesirable consequences if a compiler went out of its way to make it do so, it was generally accepted practice for programmers to rely upon such an action behaving sensibly.

If one accepts the notions that:

  1. The authors of the Standard expected that comparisons between unrelated pointers would work usefully on those platforms, and only those platforms, where the most natural means of comparing related pointers would also work with unrelated ones, and

  2. There exist platforms where comparing unrelated pointers would be problematic

Then it makes complete sense for the Standard to regard unrelated-pointer comparisons as Undefined Behavior. Had they anticipated that even compilers for platforms which define a global total ordering for all pointers might make unrelated-pointer comparisons negate the laws of time and causality (e.g. given:

/* note: the relational operators here are UB when needle points outside the haystack */
int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{ return needle >= hs_base && needle < hs_base + hs_size; }

a compiler may infer that the program will never receive any input which would cause needle_in_haystack to be given unrelated pointers, and thus eliminate any code which would only be relevant when the program receives such input), I think they would have specified things differently. Compiler writers would probably argue that the proper way to write needle_in_haystack would be:

int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{
  /* equality comparison is defined even for unrelated pointers */
  for (int i = 0; i < hs_size; i++)
    if (hs_base + i == needle) return 1;
  return 0;
}

since their compilers would recognize what the loop is doing and also recognize that it's running on a platform where unrelated pointer comparisons work, and thus generate the same machine code as older compilers would have generated for the earlier-stated formulation. Whether it would be better to require that compilers provide a means of specifying that code resembling the former version should either behave sensibly on platforms that can support it or refuse compilation on those that can't, or to require that programmers intending the former semantics write the latter and hope that optimizers turn it into something useful, I leave to the reader's judgment.

贼婆χ · 2019-01-18 15:30

The 8086 is a processor with 16-bit registers and a 20-bit address space. To cope with the lack of bits in its registers, it has a set of segment registers. On a memory access, the dereferenced address is computed like this:

address = 16 * segment + offset

Notice that, among other things, an address generally has multiple representations. Comparing two arbitrary addresses is tedious, as the compiler has to first normalize both addresses and then compare the normalized addresses.

Many compilers specify (in the memory models where this is possible) that when doing pointer arithmetic, the segment part is to be left untouched. This has several consequences:

  • objects can have a size of at most 64 kB
  • all addresses in an object have the same segment part
  • comparing addresses in an object can be done just by comparing the offset part, which takes a single instruction

This fast comparison of course only works when the pointers are derived from the same base address, which is one of the reasons why the C standard defines pointer comparisons only for the case where both pointers point into the same object.
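To make the cost concrete, here is a toy model of real-mode far pointers (the struct and function names are invented for this sketch; actual compilers spelled this differently):

#include <stdint.h>

/* Illustrative model of an 8086 far pointer. */
struct far_ptr { uint16_t segment; uint16_t offset; };

/* The 20-bit linear address the hardware actually dereferences. */
static uint32_t linear(struct far_ptr p)
{
  return (uint32_t)p.segment * 16u + p.offset;
}

/* Comparing two arbitrary far pointers requires normalizing both to
   their linear addresses first. */
static int far_less(struct far_ptr a, struct far_ptr b)
{
  return linear(a) < linear(b);
}

/* Within a single object the segment parts are equal, so comparing
   the 16-bit offsets alone suffices -- one CMP instruction. */
static int same_object_less(struct far_ptr a, struct far_ptr b)
{
  return a.offset < b.offset;
}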

If you want a totally ordered comparison for all pointers, consider converting the pointers to uintptr_t values first.

Melony? · 2019-01-18 15:31

I believe it's undefined so that C can run on architectures where, in effect, "smart pointers" are implemented in hardware, with various checks to ensure that pointers never accidentally point outside the memory regions they're defined to refer to. I've never personally used such a machine, but the way to think about them is that computing an invalid pointer is precisely as forbidden as dividing by 0; you're likely to get a run-time exception that terminates your program. Furthermore, what's forbidden is merely computing the pointer; you don't even have to dereference it to get the exception.

Yes, I believe the definition also ended up permitting more-efficient comparisons of offset registers in old 8086 code, but that was not the only reason.

Yes, a compiler for one of these protected pointer architectures could theoretically implement the "forbidden" comparisons by converting to unsigned or the equivalent, but (a) it would likely be significantly less efficient to do so and (b) that would be a wantonly deliberate circumvention of the architecture's intended protection, protection which at least some of the architecture's C programmers would presumably want to have enabled (not disabled).

Summer. ? 凉城 · 2019-01-18 15:36

Various comments in the ub mailing list discussion Justification for < not being a total order on pointers? strongly allude to segmented architectures being the reason. Including the following comments, 1:

Separately, I believe that the Core Language should simply recognize the fact that all machines these days have a flat memory model.

and 2:

Then we maybe need an new type that guarantees a total order when converted from a pointer (e.g. in segmented architectures, conversion would require taking the address of the segment register and adding the offset stored in the pointer).

and 3:

Pointers, while historically not totally ordered, are practically so for all systems in existence today, with the exception of the ivory tower minds of the committee, so the point is moot.

and 4:

But, even if segmented architectures, unlikely though it is, do come back, the ordering problem still has to be addressed, as std::less is required to totally order pointers. I just want operator< to be an alternate spelling for that property.

Why should everyone else pretend to suffer (and I do mean pretend, because outside of a small contingent of the committee, people already assume that pointers are totally ordered with respect to operator<) to meet the theoretical needs of some currently non-existent architecture?

Counter to the trend of comments from the ub mailing list, FUZxxl points out that supporting DOS is a reason not to support totally ordered pointers.

Update

This is also supported by the Annotated C++ Reference Manual (ARM), which says this was due to the burden of supporting this on segmented architectures:

The expression may not evaluate to false on segmented architectures [...] This explains why addition, subtraction and comparison of pointers are defined only for pointers into an array and one element beyond the end. [...] Users of machines with a nonsegmented address space developed idioms, however, that referred to the elements beyond the end of the array [...] was not portable to segmented architectures unless special effort was taken [...] Allowing [...] would be costly and serve few useful purposes.
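For contrast, a quick sketch of the comparison the standard does define: pointers into the same array, including the one-past-the-end pointer the ARM mentions (count_zeros is just an illustrative name):

#include <stddef.h>

/* p and end both point into (or one past the end of) the same array,
   so the relational comparison below is well-defined. */
static size_t count_zeros(const int *a, size_t n)
{
  const int *end = a + n;  /* one past the end: valid to form and compare */
  size_t zeros = 0;
  for (const int *p = a; p < end; ++p)
    if (*p == 0)
      ++zeros;
  return zeros;
}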
