C++ Double Dispatch for Equals()

2019-01-12 09:15发布

问题:

Imagine I have abstract base class Shape, with derived classes Circle and Rectangle.

class Shape {};
class Circle : public Shape {};
class Rectangle : public Shape {};

I need to determine if two shapes are equal, assuming I have two Shape* pointers. (This is because I have two instances of vector<Shape*> and I want to see if they have the same shapes.)

The recommended way to do this is double dispatch. What I've come up with is this (greatly simplified here, so that shapes are equal to all other shapes of the same type):

class Shape {
public:
    virtual bool equals(Shape* other_shape) = 0;
protected:
    virtual bool is_equal(Circle& circle) { return false; };
    virtual bool is_equal(Rectangle& rect) { return false; };
    friend class Circle;    // so Rectangle::equals can access Circle::is_equal
    friend class Rectangle; // and vice versa
};

class Circle : public Shape {
public:
    virtual bool equals(Shape* other_shape) { return other_shape->is_equal(*this); };
protected:
    virtual bool is_equal(Circle& circle) { return true; };
};

class Rectangle : public Shape {
public:
    virtual bool equals(Shape* other_shape) { return other_shape->is_equal(*this); };
protected:
    virtual bool is_equal(Rectangle& circle) { return true; };
};

This works, but I have to add a separate equals function and friend declaration in Shape for each derived class. Then I have to copy-paste the exact same equals function into each derived class, too. This is an awful lot of boilerplate for say, 10 different shapes!

Is there a simpler way to do it?

dynamic_cast is out of the question; too slow. (Yes, I benchmarked it. Speed matters in my app.)

I tried this but it doesn't work:

class Shape {
public:
    virtual bool equals(Shape* other_shape) = 0;
private:
    virtual bool is_equal(Shape& circle) { return false; };
};

class Circle : public Shape {
public:
    virtual bool equals(Shape* other_shape) { return other_shape->is_equal(*this); };
private:
    virtual bool is_equal(Circle& circle) { return true; };
};

class Rectangle : public Shape {
public:
    virtual bool equals(Shape* other_shape) { return other_shape->is_equal(*this); };
private:
    virtual bool is_equal(Rectangle& circle) { return true; };
};

equals() always returns false, even on identical shapes. It seems dispatch is always choosing the is_equal(Shape&) base function, even when a "more specific" match is available. This probably makes sense but I don't understand C++ dispatch well enough to know why.

回答1:

When you create methods like this:

virtual bool is_equal(Shape& circle) { return false; };

And in the subclass,

virtual bool is_equal(Circle& circle) { return true; };

These are not the same method. You have two separate virtual methods, neither of which is overridden (they are overloaded not even overloaded, as Ben Voigt pointed out). When you call Shape::is_equal, there is only one version: Shape::is_equal(Shape&)... which is not overridden and always returns false.

You would have to define the separate overloaded methods in the parent class and then override them in the child class. For example,

class Shape {
    // Choice between these two methods happens at compile time...
    virtual bool is_equal(Circle& circle) { return false; };
    virtual bool is_equal(Rectangle& circle) { return false; };
};

class Rectangle : Shape {
    // Choice between this and Shape::is_equal(Rectangle&) happens at runtime...
    virtual bool is_equal(Rectangle& circle) { return true; };
};

However, using tricks like this, you will probably not approach the performance or simplicity of the way a C programmer would do it:

typedef enum {
    SHAPE_CIRCLE,
    SHAPE_RECTANGLE
} shape_type_t;

struct shape {
    shape_type_t type;
};

struct circle {
    shape_type_t type;
    ...
};

struct rectangle {
    shape_type_t type;
    ...
};

bool shape_equal(struct shape *x, struct shape *y)
{
    if (x->type != y->type)
        return false;
    switch (x->type) {
    case SHAPE_CIRCLE:
        return circle_equal((struct circle *) x, (struct circle *) y);
    case SHAPE_RECTANGLE:
        ...;
    }
}

If overloading and virtual methods are making your code more complicated than the C version, then you may wish to rethink whether you solve this particular problem with overloading and virtual methods.



回答2:

Double-dispatch has been well studied. The generalization of double-dispatch is called a "multi-method".

Chapter 11 of Modern C++ Design addresses this issue in detail. The approach using dynamic_cast<> that you described is in section 11.3 "Double Switch-on-Type: Brute Force". The author even describes how to automate most of the work and automatically generate the symmetric overloads. Then, the author introduces a logarithmic dispatch based on std::map<> and std::type_info. Finally, the section ends with "Constant-Time Multimethods: Raw Speed" that's (roughly) based on a matrix of callback functions.

The presented solution includes lengthy explanations on handling functors and casts to avoid nasty pitfalls in presence of multiple (and virtual) inheritance.

If you consider implementing multi-methods in C++, I stronly recommend that you read the book and implement the proposed solution.



回答3:

You could use a type enumeration and static casting if dynamic_cast is too slow...

enum ShapeType
{
    SHAPE_TYPE_CIRCLE,
    SHAPE_TYPE_RECTANGLE
};

struct Shape
{
    virtual ShapeType GetShapeType() const = 0;
    virtual bool isEqual(const Shape& other) const = 0;
};

struct Circle : Shape
{
    virtual ShapeType GetShapeType() const { return SHAPE_TYPE_CIRCLE; }

    virtual bool isEqual(const Shape& other) const
    {
        if (other.GetShapeType() == SHAPE_TYPE_CIRCLE)
        {
            const Circle *circle = static_cast<const Circle*>(&other);

            // do some circle specific comparison
            return true;
        }
        return false;
    }
};


回答4:

Virtual functions can easily replace dynamic_cast RTTI type-checking, like this: http://ideone.com/l7Jr5

struct Shape
{
    struct subtype { enum { Shape, Circle, Rectangle, ColoredCircle }; };

    virtual bool is_a( int type ) const { return type == subtype::Shape; }
    virtual bool is_equal(const Shape& s) const { return false; }
};

struct Rectangle : Shape
{
    virtual bool is_a( int type ) const { return type == subtype::Rectangle || Shape::is_a(type); }
    virtual bool is_equal(const Shape& s) const
    {
        if (!s.is_a(subtype::Rectangle)) return false;
        const Rectangle& r = static_cast<const Rectangle&>(s);
        return true; // or check width and height
    }
};

struct Circle : Shape
{
    virtual bool is_a( int type ) const { return type == subtype::Circle || Shape::is_a(type); }
    virtual bool is_equal(const Shape& s) const
    {
        if (!s.is_a(subtype::Circle)) return false;
        const Circle& c = static_cast<const Circle&>(s);
        return true; // or check radius
    }
};

struct ColoredCircle : Circle
{
    virtual bool is_a( int type ) const { return type == subtype::ColoredCircle || Circle::is_a(type); }
};

int main(void)
{
    Rectangle x;
    Shape y;
    return x.is_equal(y);
}

--

Why are there 10 copies of the "exact same" function? Shouldn't Rectangle::is_equal(const Rectangle&) const be comparing Rectangle-specific members?

If all rectangles fall into a single equivalence class, as is the case with the code you showed, then you can just have a single virtual function that returns the equivalence class.



回答5:

In my designs, I move the Shape::operator== method to private and not implement it. The amount of work to correctly resolve this is not worth the effort.

In other words, given a vector of Shape *:

std::vector<Shape *> my_shapes;

I can do the following:

my_shapes.push_back(new Rectangle);
my_shapes.push_back(new Circle);

The problem comes in when comparing objects:

Shape * p_shape_1 = my_shapes[0];
Shape * p_shape_2 = my_shapes[1];
if (*p_shape_1 == *p_shape_2) {...}

The expression is equivalent to:

p_shape_1->operator==(*p_shape_2);

If a virtual or polymorphic operation is in place, this becomes:

Rectangle::operator==((Circle));

In otherwords, there is a great possibility that Rectangle will be comparing itself to a Circle or other Shape; an invalid comparison.

So, in my designs I prohibit equality comparisons based on base class pointers. The only stuff that can be compared using pointers to base classes is the content in the base class.



回答6:

I usually refer to dynamic_cast and virtual funcntions. If the compiler is not too dumb, dynamic casting one step is not that different than making two jumps in a vtable.

class shape
{
protected:
   virtual bool is_equal(const shape* s) const=0;
   friend bool oeprator==(const shape& a, cost shape& b)
   { return a.is_equal(&b); }
};

class circle: public shape
{
    double radius;
    point<duouble> center;
protected:
    virtual bool is_equal(const shape* s) const
    {
        const circle* p = dynamic_cast<const circle*>(s);
        return p && p->radius==radius && p->center==center;
    }
};

Same for rectangle or whatever other shape. basically, dual dispatch requires - if N are the classees, N2 functions. In this way, you just need N functions (one per class).

If you feel dynamic cast to be too slow, you can use an enum, declared in the base class, and initialized properly by the derived classes. But this requires you to update the enum values every time a new class will be added. For example: class shape { protected: enum shapes_type { no_shape, circle_shape, rectangle_shape }; shapes_type my_type; virtual bool is_equal(const shape* s) const=0; friend bool oeprator==(const shape& a, cost shape& b) { return a.is_equal(&b); } shape() :my_type(no_shape) {} };

class circle: public shape
{
    double radius;
    point<duouble> center;
protected:
    virtual bool is_equal(const shape* s) const
    {
        const circle* p = static_cast<const circle*>(s);
        return my_type == s->my_type && p->radius==radius && p->center==center;
    }
public:
    circle() { my_type = circle_shape; }
};

In case relying on a base_defined enum is not acceptable (not known number of possible classes), you can rely on a simple value (e.g.: an integer) that can represent univocally a type with a trick like:

int int_generator()
{ static int x=0; return ++x; }

template<class T>
int  id_for_type()
{ static int z = int_generator(); return z; }

class shape
{
...
int my_type;
};


class circle
{
...
   circle() { my_type = id_for_type<circle>(); }
};