Object Oriented Polymorphism in Haskell

2019-04-10 08:18发布

问题:

So I have seen questions that ask how do you do Object Oriented Programming in Haskell, like this for example. To which the answer is along the lines of "type classes are like interfaces but not quite". In particular a type class doesn't allow a list to be built of all those types. E.g. we can't do map show [1, 1.4, "hello"] despite that having a logical result.

Given some time I wondered if it wasn't possible to do better. So I had an attempt at coding polymorphism for a simple Shape class, which can be found below (if you like sanity probably better to stop reading now, and apologies for it being so long).

module Shapes (
          Shape(..)
        , Point
        , Circle(..)
        , Triangle(..)
        , Square(..)
        , location
        , area
) where

data Point = Point {
          xcoord :: Float
        , ycoord :: Float
} deriving (Read, Show)

data Shape = CircleT Circle | PolygonT Polygon deriving (Read, Show)

data Circle = Circle {
          cLocation :: Point
        , cRadius :: Float
} deriving (Read, Show)

data Polygon = SquareT Square | TriangleT Triangle deriving (Read, Show)

data Square = Square {
          sLocation :: Point
        , sLength :: Float
} deriving (Read, Show)

-- only right angled triangles for ease of implementation!
data Triangle = Triangle {
          tLocation :: Point
        , tSide1 :: Float
        , tSide2 :: Float
} deriving (Read, Show)

class ShapeIf a where
        location :: a -> Point
        area :: a -> Float

instance ShapeIf Shape where
        location (CircleT a) = location a
        location (PolygonT a) = location a
        area (CircleT a) = area a
        area (PolygonT a) = area a

instance ShapeIf Polygon where
        location (SquareT a) = location a
        location (TriangleT a) = location a
        area (SquareT a) = area a
        area (TriangleT a) = area a

instance ShapeIf Square where
        location = sLocation
        area a = (sLength a) ^ 2

instance ShapeIf Circle where
        location = cLocation
        area a = pi * (cRadius a) ^ 2

instance ShapeIf Triangle where
        location = tLocation
        area a = 0.5 * (tSide1 a) * (tSide2 a)

Despite all the madness this ends up having some quite nice properties: I can have a list of shapes and I can map functions over them that make sense (like location and area). But also if I have a particular Shape (say a Triangle) then I can also call area just on that. But it is horrendous. I don't like the code at all (indeed I'm sure it would be much shorter in any object oriented programming language).

So where have I gone wrong? How can this be made nicer? Saying "don't think in terms of objects" is nice, but this seems to have several applications (e.g. a list of characters in a role playing game ... who have some shared attributes but different abilities, or GUI programming where objects tend to make sense).

回答1:

You can use simple data types for this purpose without resorting to typeclasses. If you do want to use typeclasses, it's better to use it to describe a conversion to your base type rather than having it include all the implementation details:

data Point = Point
    { xcoord :: Float
    , ycoord :: Float
    } deriving (Eq, Read, Show)

data Shape = Shape
    { shapeLocation :: Point
    , shapeArea :: Float
    } deriving (Eq, Show)

This might be the only two types you need, depending on your application, since you could write functions

circle :: Point -> Float -> Shape
circle loc radius = Shape loc $ pi * r * r

square :: Point -> Float -> Shape
square loc sLength = Shape loc $ sLength * sLength

triangle :: Point -> Float -> Float -> Shape
triangle loc base height = Shape loc $ 0.5 * base * height

But maybe you want to preserve those arguments. In which case, write a data type for each

data Circle = Circle
    { cLocation :: Point
    , cRadius :: Float
    } deriving (Eq, Show)

data Square = Square
    { sLocation :: Point
    , sLength :: Float
    } deriving (Eq, Show)

data Triangle = Triangle
    { tLocation :: Point
    , tBase :: Float
    , tHeight :: Float
    } deriving (Eq, Show)

Then for convenience, I'd use a typeclass here to define toShape:

class IsShape s where
    toShape :: s -> Shape

instance IsShape Shape where
    toShape = id

instance IsShape Circle where
    toShape (Circle loc radius) = Shape loc $ pi * radius * radius

instance IsShape Square where
    toShape (Square loc sideLength) = Shape loc $ sideLength * sideLength

instance IsShape Triangle where
    toShape (Triangle loc base height) = Shape loc $ 0.5 * base * height

But now there's the problem that you have to convert each type to Shape in order to get its area or location in a more generic way, except you can just add the functions

location :: IsShape s => s -> Point
location = shapeLocation . toShape

area :: IsShape s => s -> Float
area = shapeArea . toShape

I would keep these out of the IsShape class so that they can't be re-implemented, this is similar to functions like replicateM that work on all Monads, but aren't part of the Monad typeclass. Now you can write code like

twiceArea :: IsShape s => s -> Float
twiceArea = (2 *) . area

And this is fine when you're only operating on a single shape argument. If you want to operate on a collection of them:

totalArea :: IsShape s => [s] -> Float
totalArea = sum . map area

So that you don't have to rely on existentials to build a collection of them you can instead have

> let p = Point 0 0
> totalArea [toShape $ Circle p 5, toShape $ Square p 10, toShape $ Triangle p 10 20]
278.53983
> totalArea $ map (Square p) [1..10]
385.0

This gives you the flexibility to work on a list of objects of different types, or on a list of just a single type using the same function and absolutely no language extensions.

Bear in mind that this is still trying to implement a sort of object model in a strictly functional language, something that isn't going to be completely ideal, but considering this allows you to have

  • multiple "interfaces" (conversions to different types)
  • generics (totalArea :: IsShape s => [s] -> Float)
  • sealed methods if you were to use a smart constructor for Shape and add more methods to it then alias them like with area and location
  • unsealed methods if you just allowed those to be set by the smart constructor
  • public and private are set by module exports

and probably some other OOP paradigms, all with really less code than it would take in Java or C#, the only difference is that the code isn't all grouped together. This has it's benefits and disadvantages, such as being able to define new instances and data types more freely, but making the code somewhat more difficult to navigate.



回答2:

You can use existential quantification for such purposes:

{-# LANGUAGE ExistentialQuantification #-}

data Point = Point {
          xcoord :: Float
        , ycoord :: Float
} deriving (Read, Show)

data Circle = Circle {
          cLocation :: Point
        , cRadius :: Float
} deriving (Read, Show)

data Square = Square {
          sLocation :: Point
        , sLength :: Float
} deriving (Read, Show)

data Triangle = Triangle {
          tLocation :: Point
        , tSide1 :: Float
        , tSide2 :: Float
} deriving (Read, Show)

class ShapeIf a where
        location :: a -> Point
        area :: a -> Float

instance ShapeIf Square where
        location = sLocation
        area a = (sLength a) ^ 2

instance ShapeIf Circle where
        location = cLocation
        area a = pi * (cRadius a) ^ 2

instance ShapeIf Triangle where
        location = tLocation
        area a = 0.5 * (tSide1 a) * (tSide2 a)

data Shape = forall a. ShapeIf a => Shape a

instance ShapeIf Shape where
    location (Shape s) = location s
    area     (Shape s) = area s

p = Point 0 0

shlist :: [Shape]        
shlist = [Shape (Square p 0), Shape (Circle p 1), Shape (Triangle p 2 3)]

main = print $ map area shlist

But note, that there are no downcasts in Haskell, so it's not a direct analogue to Java-style subtyping. Have a look at this also.



回答3:

After being pointed to this blog post about existential quantification being an anti-pattern (which I had reinvented in a slightly clumsier way), I had a try at a rewrite and came up with:

module Shapes (Shape(), Point, Circle(..), Triangle(..), Square(..), location, area) where

data Point = Point {
          xcoord :: Float
        , ycoord :: Float
} deriving (Read, Show)

data Shape = Shape {
      location :: Point
    , shape :: ShapeT
}

data ShapeT = CircleT Circle | PolygonT Polygon deriving (Read, Show)

data Circle = Circle {
          cRadius :: Float
} deriving (Read, Show)

data Polygon = SquareT Square | TriangleT Triangle deriving (Read, Show)

data Square = Square {
          sLength :: Float
} deriving (Read, Show)

-- only right angled triangles for ease of implementation!
data Triangle = Triangle {
          tSide1 :: Float
        , tSide2 :: Float
} deriving (Read, Show)

square :: Point -> Float -> Shape
square p l = Shape p (PolygonT $ SquareT (Square l))

circle :: Point -> Float -> Shape
circle p r = Shape p (CircleT (Circle r))

triangle :: Point -> Float -> Float -> Shape
triangle p s1 s2 = Shape p (PolygonT $ TriangleT (Triangle s1 s2))

area :: Shape -> Float
area = area' . shape

area' (PolygonT (SquareT (a))) = (sLength a) ^ 2
area' (CircleT (a)) = pi * (cRadius a) ^ 2
area' (PolygonT (TriangleT (a))) = 0.5 * (tSide1 a) * (tSide2 a)


回答4:

You can get madder.

Analysed in Haskell terms, declaring a Java-style class does a number of things:

  1. Declares the existence of a set of types that share a common interface, requiring that all members of the set are also members of all the base classes' associated type sets
  2. Declares a concrete record type
  3. Declares the new data type is a member of the new set of types, and all the base classes' associated sets
  4. Declares an existential type, able to hold any concrete type that is a member of the new set of types

Whew. Features like interfaces, final classes, etc, basically allow you to skip parts of that list if you don't need/want the whole bundle. And on top of that, Java-style classes also provide a module system, which I'm not going to address at all.

Seen this way, you can get all of the above in Haskell if you use an "OO design pattern" to implement each of them yourself. But in a language like Java there's a lot of help that is provided by the language, which would manifest in Haskell as sensible defaults and syntactic sugar if it were present. An example is inheritance, which is basically automatic containment of superclass records within subclass records and automatic delegation to superclass implementations from subclass implementations. Haskell will give you none of this help, so everything must be explicit and the "OO design pattern" comes out incredibly verbose.

Part 1 is pretty easy to see; a set of types sharing a common interface is what a type class is. Haskell allows us to put superclass constraints on the new type class too. Done.

Part 2 is also straightforward; just declare a new data type holding all the member variables. Note that if you intend to be able to "inherit" from this "class" and use the same assessors to get member variables out, you'll want to have those as part of the type class, not just use Haskell's record syntax to declare them for you. And if you're "inheriting" from other "OO pattern" classes, you'll want to included their data types as members of your new data type.

Part 3 is where the lack of help from the language starts to get tedious. You need to implement instances for each type class implied by the OO inheritance hierarchy, going all the way up (i.e. not just the immediate bases). If you're not overriding "methods" then this will be extremely mechanical and tedious, because you can just delegate all the "inherited" methods to the contained member data of the base classes (which should already have all the needed instances if you're following the pattern). This is manually implementing what OO inheritance defaults for you.

Part 4 is the doozy. OO programmers are masters of existentially quantified types, they just don't know it. Haskell supports existentially quantified types, but only through extensions, and a little awkwardly. And the language, idioms, and libraries aren't really expecting you to make really heavy use of existential types, so you'll start to experience a lot of friction using them; mostly in the form of annoying type errors that go away when you manage to figure out the correct type to write explicitly, and occasionally you'll need to explicitly eta expand (i.e. turn f = foo into f x = foo x, where the logic of higher order functions should say that makes no difference).

You might think that we shouldn't need the existential types, since type-class-constrained type variables should be enough to allow code to work on any member of the type class. The trouble is that a type variable constrained by a type class must be instantiated at each call to any one type in the type class (and the choice is made by the caller, not by whatever data happens to arrive at runtime).

This is why type classes don't allow you to use heterogenous lists; although the type Shape a => [a] can hold objects of any type that implements Shape, there is only one single type variable for all the elements of the list, so they all must be the same "any type that implements Shape". An existential type is a wrapper that contains data with a type variable but where the wrapper does not itself have that type variable in its own type. This allows you to just have a list of [Shape], where it's Shape that internally contains a ShapeI a => a.

I think I've exhausted how well I can explain this without example code, so here goes. Warning, it's pretty ugly:

{-# LANGUAGE ExistentialQuantification, GADTs, RankNTypes #-}


newtype Point = Point (Double, Double)
  deriving (Show, Eq)


-- The Shape common interface
-- Shape is just an interface, so no member data type
class ShapeI a
  where area :: a -> Double

-- The Shape existential reference
data Shape 
  where Shape :: ShapeI a => a -> Shape 


-- The Polygon common interface: 'subtype' of Shape
-- Polygon is just an interface, so no member data type
class ShapeI a => PolygonI a
  where vertexes :: a -> [Point]

-- The Polygon existential reference
data Polygon
  where Polygon :: PolygonI a => a -> Polygon


-- The Circle common interface
class ShapeI a => CircleI a
  where centre :: a -> Point 
        radius :: a -> Double

-- The Circle existential reference
data Circle
  where Circle :: CircleI a => a -> Circle

-- The Circle member data type
data CircleM = CircleM Point Double
  deriving (Show, Eq)

-- Circles are Shapes
instance ShapeI CircleM
  where area (CircleM _ r) = pi * r * r 

-- Circles are Circles
instance CircleI CircleM
  where centre (CircleM c _) = c
        radius (CircleM _ r) = r


data Colour = Med | Blue
  deriving (Show, Eq)

-- The ColouredCircle member data type
-- ColouredCircle is final, so not bothering with a type class or existential reference
data CircleColouredM = CircleColouredM CircleM Colour
  deriving (Show, Eq)


-- ColouredCircles are Shapes
instance ShapeI CircleColouredM
  where area (CircleColouredM circle _) = area circle

-- ColouredCircles are Circles
-- Note there is no actual implementation logic here, ColouredCircleM implements
-- the Circle methods purely by using Circle's implementations
instance CircleI CircleColouredM
  where centre (CircleColouredM circle _) = centre circle
        radius (CircleColouredM circle _) = radius circle


-- The Triangle member data type
-- Triangle is final, so not bothering with a type class or existential refernce
data TriangleM = TriangleM Point Point Point
  deriving (Show, Eq)

instance ShapeI TriangleM
  where area = const 7 -- In this hypothetical universe, all triangles have area 7

instance PolygonI TriangleM
  where vertexes (TriangleM a b c) = [a, b, c]

Given all that:

-- Heterogenous list of different types of objects which are all Circles
circles :: [Circle]
circles = [Circle (CircleM (Point (3, 7)) 2), Circle (CircleColouredM (CircleM (Point (8, 1)) 1) Blue)]


-- Casts a Circle existential to a Shape existential
-- Note that the object *indside* the existential reference is the same; we're
-- just treating it as a Shape now
circleToShape :: Circle -> Shape
circleToShape (Circle c) = Shape c


-- Heterogenous list of different types of objects which are all Shapes
-- Note we are able to uniformly cast the circles list to shapes in order store
-- them in this list, even though they're not all the same type already
shapes :: [Shape]
shapes = [Shape (TriangleM (Point (0, 0)) (Point (10, 0)) (Point (0, 10)))] ++ map circleToShape circles


-- Helper function; can apply any function that is polymorphic in ShapeI to a
-- Shape existential; the explicit type is necessary, because it's a rank 2 type
apply :: (forall a. ShapeI a => a -> b) -> Shape -> b
apply f (Shape x) = f x


areas = map (apply area) shapes

So you see we do get heterogenous lists (or in general, types which can independently hold any member of a "class" and allow access to that class' common interface), OO style inheritance hierarchies (though with manual boilerplate to inherit methods unchanged), and even upcasting.

Another issue you'll likely run into is how strict Haskell is about type discipline. You won't be able to downcast; in fact you won't be able to refer to any property of a Shape existential other than what's implied by the ShapeI interface; all knowledge about what it particularly contains is gone.

This also means that the shapes list is close to useless; the only meaningful thing we can do with it is map (apply area) shapes, so we might as well have done away with the masses of boilerplate and just created a list of Double in the first place. Also, the root class in OO languages tends to provide a surprising amount of functionality; you can toString arbitrary objects in Java, compare them for equality, etc. You get none of that here. Once something's an existential reference, you can access nothing but what its constraints say you can. No Show constraint, no show method (even though all of the types I've used here do implement Show). Likewise, no Eq constraint, no == function; and that probably wouldn't work as you'd like here because (being an idiomatic Haskell function and not expecting to deal with existentials emulating OO class heirarchies) == only works on two values guaranteed to be of the same type, and the existential reference gives up all knowledge about being of any particular type so you could never guarantee that.

I'm certain you could refine the pattern above to make it more usable, maybe even automate bits of it (can we write a generic upcast function? can TemplateHaskell generate the boilerplate for us?). If you threw in constraints like Typeable into the mix you should even be able to get runtime-checked downcasts if you really want, and might be able to implement an equality operator that worked (returning False for different concrete types and delegating to == when the types do match). But personally I'm not terribly inclined to try to flesh this out further.

TLDR: OO style classes (ignoring mutations) are basically a particular combination of type classes, types holding member data, existential types, with a whole lot of default machinery to make it work easily instead of being a huge pain. Given that Haskell gives you each of those pieces as orthogonal minimal concepts, I find it much easier to know and understand those concepts separately and apply them individually or in concert as they are needed, rather than to take OO's swiss-army-knife approach and try to force every program to fit the facilities provided by that structure.