Why is Haskell missing “obvious” Typeclasses

2019-01-21 09:13发布

问题:

Consider the Object-Oriented Languages:

Most people coming from an object-oriented programming background, are familiar with the common and intuitive interfaces in various languages that capture the essence of Java's Collection & List interfaces. Collection refers to a collection of objects which doesn't necessarily have an natural ordering/indexing. A List is a collection which has a natural ordering/indexing. These interfaces abstract many library data-structures in Java, as do their equivalent interfaces in other languages, and an intimate understanding of these interfaces are required to work effectively with most library data-structures.

Transition to Haskell:

Haskell has a type-class system which acts on types analogously to interfaces on objects. Haskell seems to have a well designed type-class hierarchy with regard to Functors, Applicative, Monads, etc. when the type regard functionality. They obviously want correct and well-abstracted type-classes. Yet when you look at many Haskell's containers (List,Map,Sequence,Set,Vector) they almost all have very similar (or identical) functions, yet aren't abstracted through type-classes.

Some Examples:

  • null for testing "emptyness"
  • length/size for element count
  • elem/member for set inclusion
  • empty and/or singleton for default construction
  • union for set union
  • (\\)/diff for set difference
  • (!)/(!!) for unsafe indexing (partial function)
  • (!?)/lookup for safe indexing (total function)

If I want to use any of the functions above, but I have imported two or more containers I have to start hiding functions from the imported modules, or explicitly import only the necessary functions from the modules, or qualifying the imported modules. But since all the functions provide the same logical functionality, it just seems like a hassle. If the functions were defined from type-classes, and not separately in each module, the compiler's type inference mechanics could resolve this. It would also make switching underlying containers simple as long as they shared the type-classes (ie: lets just use a Sequence instead of List for better random access efficiency).

Why doesn't Haskell have a Collection and/or Indexable type-class(es) to unify & generalize some of these functions?

回答1:

Partly, the reason is that monads and arrows are new, innovative features of Haskell, while collections are relatively more mundane. Haskell has a long history as a research language; interesting research questions (designing monad instances & defining generic operations for monads) get more development effort than "industrial-strength" polishing (defining container APIs).

Partly, the reason is that those types come from three different packages (base, containers, and vector), with three separate histories and designers. That makes it harder for their designers to coordinate on providing instances of any single type class.

Partly, the reason is that defining a single type class to cover all five of the containers you mentioned is really hard. List, Sequence, and Vector are relatively similar, but Map and Set have completely different constraints. For List, Sequence, and Vector, you want a simple constructor class, but for Set that won't work, since Set requires an Ord instance on the element type. Even worse, Map can support most of your methods, but its singleton function needs two parameters where the rest need only one.



回答2:

The lens package provides some of this.

  • Testing for emptiness, creating empty containers These are both provided by the AsEmpty typeclass from Control.Lens.Empty.

  • Accessing elements by key/index. The At and Ixed typeclasses from Control.Lens.At.

  • Checking for membership in set-like containers. The Contains typeclass from Control.Lens.At.

  • Appending and deleting elements to sequence-like containers. The Cons and Snoc typeclasses from Control.Lens.Cons.

Also, the pure method of the Applicative typeclass can often be used to create "singleton" containers. For things that are not functors/applicatives in Haskell, like Set, perhaps point from Data.Pointed could be used.



回答3:

As other answers have pointed out, Haskell tends to use different vocabulary. However, I don't think they've explained the reason for the difference very well.

In a language like Java, functions are not "first class citizens"; it's true that anonymous functions are available in the latest versions, but this style of interface (Collection, Indexable, Interable, etc.) were designed before that.

This makes it tedious to pass our code around, so we prefer other people's data to be passed to our code. For example:

  • Data implementing Java's Iterable lets us write for (Foo x : anIterable) { ... }
  • Data implementing PHP's ArrayAccess lets us write anArrayAccess[anIndex]

This style can also be seen in OO languages which implement generators, since that's another way for us to write for yieldedElement in aGenerator: ....

Haskell takes a different approach with its typeclasses: we prefer our code to be passed to other people's data. Some (simplified) examples:

  • Functors accept our code and apply it to any elements they 'contain'
  • Monads accept our code and apply it in some kind of 'sequence'
  • Foldables accept our code and use it to 'reduce' their contents

Java only needs Iterable since we have to call our code in our for loop, so we can make sure it's called correctly. Haskell requires more specific typeclasses since someone else's code will be calling ours, so we need to specify how it should be called; is it a map, a fold, an unfold, etc.?

Thankfully, the type system helps us choose the right method ;)



回答4:

Haskell has some type classes for working with collections in the base package: Functor, Foldable and Traversable can be useful for working with collections, and the Monoid, Applicative and/or Alternative typeclasses can be useful for constructing collections.

Together, these classes cover most of the operations mentioned in the question, but maybe less efficient than a more container-specific function (though many of these are class methods, whose default definitions can be overriden if necessary).

null for testing "emptyness"

Foldable supports null since base 4.8 (any (const True) is an alternative for earlier versions).

length/size for element count:

Foldable supports length since base 4.8 (getSum . foldMap (const 1) is an alternative for earlier versions).

elem/member for set inclusion

Foldable supports elem, notElem and member.

empty and/or singleton for default construction

For empty, there is mempty from Monoid and empty from Alternative. For singleton, there is pure from Applicative.

union for set union

There is mappend from Monoid and <|> from Alternative. They don't necessarily implement set union, but they implement some form of union that works well together with empty and usually also with singleton and find.

(\)/diff for set difference

This one is not supported, unfortunately.

(!)/(!!) for unsafe indexing (partial function)

You could use fromJust together with a function for safe indexing.

(!?)/lookup for safe indexing (total function)

There is find from Foldable.



回答5:

Such typeclasses exist in standard Haskell, but they don't have the same name as their equivalent OO counterparts. The Collection typeclass, for example, is called Foldable in Haskell. You can use it to test if a structure is empty (foldr (const False) True x) or to count the number of elements (foldMap (const 1) x), or to test for set membership (foldr (\e' present -> (e==e') || present) False x for some e).

For operations like element lookup, you have the Array typeclass which might work for sequential data. For more flexibility, you can write your own Indexable class, like this for example (beware of lenses) :

class Indexable m k a where
  at :: k -> Lens' m (Maybe a)

The null element and set union belong to the Monoid typeclass (where mappend == union). In this light, set difference could also be implemented in its own typeclass Differentiable (which I'm sure already exists in dozens of Haskell libraries) and we would have total compatibility with imperative languages.

Haskell, by virtue of being designed by mathematicians and the like, doesn't employ the same vocabulary as most other languages, but rest assured, it doesn't mean that it's not a practical language in addition to being an awesome one :-)



回答6:

Laws. A good typeclass has laws. A great typeclass has enough parametricity so that it's laws are "theorems for free". A typeclass without laws is just ad-hoc name overloading.

Also, check out classy-prelude and Edison-API.



回答7:

You have typeclasses for different collection aspects:

  1. composition: Monoid (module Data.Monoid)

  2. sequential control: Applicative, Monad (modules Control.Applicative, Control.Monad)

  3. sequential composition: Alternative, MonadPlus (modules Control.Applicative, Control.Monad)

  4. non-sequential mapping and reduction: Functor (mod. Data.Functor), Foldable (mod. Data.Foldable)

  5. sequential mapping and reduction: Traversable (module Data.Traversable)

  6. serialisation: Binary (mod. Data.Binary)

  7. comparison: Eq, Ord (mod. Data.Eq, Data.Ord)

  8. textualisation: Show, Read

  9. deep evaluation (to Normal Form): NFData (mod. Control.DeepSeq)

  10. generic datatype traversability: Data (mod. Data.Data)

Except that monomorphic collections (ByteString, IntSet, Text) cannot implement Functor and Foldable (they require type arity == 1 (Kind: * -> *))

Also neither (Set a) implements Functor.

The package mono-traversable redefines some classes without the monomorphic types exclusion.

Update. There is an attempt to put most functions in typeclasses with the packages mono-traversable and classy-prelude.

library ref, platform