Are free operator->* overloads evil?

Posted 2019-01-16 16:15

Question:

I was perusing section 13.5 after refuting the notion that built-in operators do not participate in overload resolution, and noticed that there is no section on operator->*. It is just a generic binary operator.

Its brethren, operator->, operator*, and operator[], are all required to be non-static member functions. This precludes definition of a free function overload to an operator commonly used to obtain a reference from an object. But the uncommon operator->* is left out.

In particular, operator[] has many similarities. It is binary (they missed a golden opportunity to make it n-ary), and it accepts some kind of container on the left and some kind of locator on the right. Its special-rules section, 13.5.5, doesn't seem to have any actual effect except to outlaw free functions. (And that restriction even precludes support for commutativity!)

So, for example, this is perfectly legal:

#include <utility>
#include <iostream>
using namespace std;

template< class T >
T &
operator->*( pair<T,T> &l, bool r )
    { return r? l.second : l.first; }

template< class T >
 T & operator->*( bool l, pair<T,T> &r ) { return r->*l; }

int main() {
    pair<int, int> y( 5, 6 );
    y->*(0) = 7;       // 0 converts to false, so this assigns y.first
    y->*0->*y = 8;     // evaluates to 7->*y = y.second
    cerr << y.first << " " << y.second << endl; // prints "7 8"
}

It's easy to find uses, but alternative syntax tends not to be that bad. For example, scaled indexes for vector:

(v->*matrix_width)[2][5] = x; // ->* not hopelessly out of place; parentheses needed because [] binds tighter

my_indexer<2> m( v, dim ); // my_indexer being the type of (v->*matrix_width)
m[2][5] = x; // it is probably more practical to slice just once

Did the standards committee forget to prevent this, was it considered too ugly to bother, or are there real-world use cases?

Answer 1:

The best example I am aware of is Boost.Phoenix, which overloads this operator to implement lazy member access.

For those unfamiliar with Phoenix, it is a supremely nifty library for building actors (or function objects) that look like normal expressions:

( arg1 % 2 == 1 )     // this expression evaluates to an actor
                 (3); // returns true since 3 % 2 == 1

// these actors can also be passed to standard algorithms:
std::find_if(c.begin(), c.end(), arg1 % 2 == 1);
// returns iterator to the first odd element of c

It achieves the above by overloading operator% and operator==: applied to the actor arg1, these operators return another actor. The range of expressions which can be built in this manner is extreme:
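
To make the mechanism concrete, here is a minimal hand-rolled sketch of the same idea. It is not Phoenix's implementation; the names Arg1, ModActor, EqActor and my_arg1 are hypothetical, and only int constants on the right-hand side are supported.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical placeholder actor: evaluates to its argument.
struct Arg1 {
    template <class T>
    T operator()(T x) const { return x; }
};

// Hypothetical actor representing "lhs % constant".
template <class L>
struct ModActor {
    L lhs;
    int rhs;
    template <class T>
    int operator()(T x) const { return lhs(x) % rhs; }
};

// Hypothetical actor representing "lhs == constant".
template <class L>
struct EqActor {
    L lhs;
    int rhs;
    template <class T>
    bool operator()(T x) const { return lhs(x) == rhs; }
};

// Applying % to the placeholder computes nothing; it builds a new actor.
inline ModActor<Arg1> operator%(Arg1 a, int n) { return ModActor<Arg1>{a, n}; }

// Applying == to that actor builds yet another actor.
template <class L>
EqActor<ModActor<L>> operator==(ModActor<L> a, int n) {
    return EqActor<ModActor<L>>{a, n};
}

const Arg1 my_arg1 = {};

inline void demo_lazy_expression() {
    auto is_odd = (my_arg1 % 2 == 1);   // an actor, not a bool
    assert(is_odd(3));
    assert(!is_odd(4));

    // The actor is an ordinary function object, so algorithms accept it.
    std::vector<int> c = {2, 4, 7, 8};
    auto it = std::find_if(c.begin(), c.end(), my_arg1 % 2 == 1);
    assert(it != c.end() && *it == 7);
}
```

Calling demo_lazy_expression() exercises both the direct call and the find_if use. Phoenix generalizes this to arbitrary operand types and operators.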

// print each element in c, noting its value relative to 5:
std::for_each(c.begin(), c.end(),
  if_(arg1 > 5)
  [
    cout << arg1 << " > 5\n"
  ]
  .else_
  [
    if_(arg1 == 5)
    [
      cout << arg1 << " == 5\n"
    ]
    .else_
    [
      cout << arg1 << " < 5\n"
    ]
  ]
);

After you have been using Phoenix for a short while (not that you ever go back) you will try something like this:

typedef std::vector<MyObj> container;
container c;
//...
container::iterator inv = std::find_if(c.begin(), c.end(), arg1.ValidStateBit);
std::cout << "A MyObj was invalid: " << inv->Id() << std::endl;

Which will fail, because of course Phoenix's actors do not have a member ValidStateBit. Phoenix gets around this by overloading operator->*:

(arg1 ->* &MyObj::ValidStateBit)              // evaluates to an actor
                                (validMyObj); // returns true 

// used in your algorithm:
container::iterator inv = std::find_if(c.begin(), c.end(), 
      (arg1 ->* &MyObj::ValidStateBit)    );

operator->*'s arguments are:

  • LHS: an actor returning MyObj *
  • RHS: address of a member

It returns an actor which evaluates the LHS and looks for the specified member in it. (NB: You really, really want to make sure that arg1 returns MyObj *; you have not seen a massive template error until you get something wrong in Phoenix.) This little program generated 76,738 characters of pain (Boost 1.54, gcc 4.6):

#include <boost/phoenix.hpp>
using boost::phoenix::placeholders::arg1;

struct C { int m; };
struct D { int n; };

int main() {
  ( arg1  ->*  &D::n ) (new C);
  return 0;
}
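
For intuition about what such an overload does, lazy member access can be imitated with a free operator->* in a few lines. This is only a sketch under my own assumptions, not how Phoenix is written: PtrArg, MemberActor, my_arg and this cut-down MyObj are hypothetical names, and the sketch handles only non-const objects and data members.

```cpp
#include <cassert>

// Hypothetical placeholder actor: evaluates to the pointer it is given.
struct PtrArg {
    template <class T>
    T* operator()(T* p) const { return p; }
};

// Hypothetical actor that remembers a pointer to data member and applies it later.
template <class Actor, class C, class M>
struct MemberActor {
    Actor inner;
    M C::*member;

    // Evaluate the inner actor to get an object pointer,
    // then apply the built-in ->* to select the member.
    template <class T>
    M& operator()(T* p) const { return inner(p)->*member; }
};

// Free operator->*: LHS is the placeholder, RHS is the address of a member.
template <class C, class M>
MemberActor<PtrArg, C, M> operator->*(PtrArg a, M C::*member) {
    return MemberActor<PtrArg, C, M>{a, member};
}

const PtrArg my_arg = {};

// Cut-down stand-in for the MyObj used above.
struct MyObj {
    bool ValidStateBit;
    int id;
};

inline void demo_lazy_member_access() {
    MyObj obj = {true, 42};

    auto valid = (my_arg ->* &MyObj::ValidStateBit);  // builds an actor
    assert(valid(&obj));                              // member is read only now

    (my_arg ->* &MyObj::id)(&obj) = 7;                // the result is an lvalue
    assert(obj.id == 7);
}
```

Passing a pointer to an unrelated class fails inside MemberActor::operator(), which is the same kind of mismatch as the D::n versus C example above, just with a much shorter error message.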


Answer 2:

I agree with you that there is an inconsistency in the standard: it doesn't allow overloading operator[] with non-member functions, yet allows it for operator->*. From my point of view, operator[] is to arrays as operator->* is to structs/classes (a getter). Members of an array are selected using an index. Members of a struct are selected using member pointers.

The worst part is that we can be tempted to use ->* instead of operator[] to get an array-like element:

int& operator->*(Array& lhs, int i);

Array a;

a ->* 2 = 10;
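
A complete version of that fragment might look like the sketch below. Array here is a hypothetical fixed-size wrapper I made up for illustration; the point is only that operator[] has to live inside the class while operator->* does not.

```cpp
#include <cassert>

// Hypothetical fixed-size array wrapper.
struct Array {
    int data[4];

    // operator[] must be a non-static member function.
    int& operator[](int i) { return data[i]; }
};

// operator->* may be a free function doing exactly the same job.
inline int& operator->*(Array& lhs, int i) { return lhs.data[i]; }

inline void demo_array_access() {
    Array a = {{0, 0, 0, 0}};
    a[1] = 5;        // member operator[]
    a ->* 2 = 10;    // free operator->*
    assert(a[1] == 5 && a[2] == 10);
    assert((a ->* 1) == a[1]);
}
```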

There is another possible inconsistency. We can use a non-member function to overload operator+= (and all the operators of the form @=), but we cannot do so for operator=.

I don't really know the rationale for making the following legal

struct X {
    int val;
    explicit X(int i) : val(i) {}
};
struct Z {
    int val;
    explicit Z(int i) : val(i) {}
};
Z& operator+=(Z& lhs, const X& rhs) {
    lhs.val+=rhs.val;
    return lhs;
}

Z z(2);
X x(3);
z += x;

and forbidding

Z& operator=(Z& lhs, const X& rhs) {
    lhs.val=rhs.val;
    return lhs;
}

z = x;

Sorry for not answering your question and adding even more confusion instead.



Answer 3:

Googling around a bit, I found more instances of people asking whether operator->* is ever used than actual suggestions.

A couple of places suggest T &A::operator->*( T B::* ). I'm not sure whether this reflects the designer's intent or a misimpression that T &A::operator->*( T A::* ) is a builtin. It's not really related to my question, but it gives an idea of the depth I found in online discussion & literature.

There was a mention of "D&E 11.5.4" which I suppose is Design and Evolution of C++. Perhaps that contains a hint. Otherwise, I'm just gonna conclude it's a bit of useless ugliness that was overlooked by standardization, and most everyone else too.

Edit: See below for a paste of the D&E quote.

To put this quantitatively, ->* is the tightest-binding operator that can be overloaded by a free function. All the postfix-expression and unary operator overloads require nonstatic member function signatures. Next in precedence after unary operators are C-style casts, which could be said to correspond to conversion functions (operator type()), which also cannot be free functions. Then comes ->*, then multiplication. ->* could have been like [] or like %, they could have gone either way, and they chose the path of EEEEEEVIL.
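
The precedence ordering is easy to check with a throwaway overload. The selector below is a hypothetical free operator->* on std::pair<int,int>, along the lines of the one in the question, used only to show how expressions are grouped.

```cpp
#include <cassert>
#include <utility>

// Hypothetical free overload: true selects .second, false selects .first.
inline int& operator->*(std::pair<int, int>& l, bool r) {
    return r ? l.second : l.first;
}

inline void demo_precedence() {
    std::pair<int, int> p(5, 6);
    bool flags[] = {true, false};

    // ->* binds tighter than *: parsed as (p->*true) * 10.
    assert(p->*true * 10 == 60);

    // ->* binds tighter than +: parsed as (p->*false) + 1.
    assert(p->*false + 1 == 6);

    // Unary ! binds tighter than ->*: parsed as p->*(!true).
    assert(p->*!true == 5);

    // Postfix [] binds tighter than ->*: parsed as p->*(flags[0]).
    assert(p->*flags[0] == 6);
}
```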



Answer 4:

Standard (Working Draft 2010-02-16, § 5.5) says:

The result of an ->* expression is an lvalue only if its second operand is a pointer to data member. If the second operand is the null pointer to member value (4.11), the behavior is undefined.

You may want this behavior to be well-defined, for example by checking for a null pointer and handling that situation. So I guess it is the right decision for the standard to allow ->* overloading.
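
As a rough sketch of that idea, assuming a hypothetical CheckedPtr wrapper (my own name, not a standard or Boost type) and data members only, an overloaded ->* can turn the undefined behavior into an exception:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical thin wrapper around a raw object pointer.
template <class T>
struct CheckedPtr {
    T* ptr;
};

// Free operator->* that rejects null operands instead of invoking
// undefined behavior. Works for pointers to data members only.
template <class T, class M>
M& operator->*(const CheckedPtr<T>& p, M T::*member) {
    if (p.ptr == nullptr)
        throw std::logic_error("null object pointer");
    if (member == nullptr)
        throw std::logic_error("null pointer to member");
    return p.ptr->*member;   // built-in ->* on the raw pointer
}

struct Point {
    int x;
    int y;
};

// Returns true if a null pointer to member was reported as an exception.
inline bool demo_checked_member_access() {
    Point pt = {1, 2};
    CheckedPtr<Point> p = {&pt};

    p->*&Point::y = 20;          // normal access still yields an lvalue
    assert(pt.y == 20);

    int Point::*none = nullptr;  // the null pointer to member value
    try {
        (void)(p->*none);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```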