How does one declare an abstract data container ty

2020-08-11 10:58发布

问题:

I read William Cook's "On Data Abstraction, Revisited", and re-read Ralf Laemmel's "The expression lemma" to try to understand how to apply the former paper's ideas in Haskell. So, I'm trying to understand how could you implement, e.g., a set union function, in Haskell without specifying the types?

回答1:

There's multiple ways, depending on which version of "abstract data types" you're after.

  • Concrete but opaque types: It's been a little while since I read Cook's lovely paper, but glancing back over it I think this is closest to what he's talking about as ADTs. The standard way to do this in Haskell is to export a type without its constructors; what this means in Haskell:

    • No pattern matching on values of the abstracted type

    • No constructing values of the type, except using functions exported from its module

    How this relates to Cook's paper:

    • Representation independence: From the outside, the representation is inaccessible.

    • Inspection of multiple representations: Inside the ADT's module, representations may be inspected freely.

    • Unique implementations/modules: Different implementations can be provided by different modules, but the types cannot interoperate except by normal means. You can't use Data.IntMap.null to see whether a Data.Map.Map Int a is empty.

    This technique is used extensively in the Haskell standard libraries, particularly for data types that need to maintain some sort of invariant or otherwise restrict the ability to construct values. So in this case, the best way to implement the set ADT from the paper is the following code:

    import qualified Data.Set as S
    

    Although this is perhaps not as powerful a means of abstraction as it could be in a language with a more expressive module system.

  • Existential quantification and interface: Haskell doesn't actually have an exists keyword as such, but the term "existential" is used in various circumstances to describe certain kinds of polymorphic types. The general idea in each case is to combine a value with a collection of functions operating on it, such that the result is polymorphic in the value's type. Consider this function signature:

    foo :: (a, a -> Bool) -> Bool
    

    Although it receives a value of type a, because a is fully polymorphic the only thing it can possibly do with that value is apply the function to it. So in a sense, within this function, the first half of the tuple is an "abstract data type", while the second half is an "interface" for working with that type. We can make this idea explicit, and apply it outside a single function, using an existential data type:

    data FooADT = forall a. FooADT a (a -> Bool)
    
    foo :: FooADT -> Bool
    

    Now, any time we have a value of type FooADT, all we know is that there exists some type a such that we can apply FooADT's second argument to its first.

    The same idea applies to polymorphic types with class constraints; the only difference is that the functions operating on the type are provided implicitly by the type class, rather than explicitly bundled with the value.

    Now, what does this mean in terms of Cook's paper?

    • Representation independence still applies.

    • Total isolation: Unlike before, knowledge of the existentially quantified type is forever lost. Nothing can inspect the representation except the interface it itself provides.

    • Arbitrary implementations: Not only are implementations not necessarily unique, there's no way to limit them at all! Anything that can provide the same interface can be wrapped up inside an existential and be indistinguishable from other values.

    In short, this is very similar to Cook's description of objects. For more on existential ADTs, the paper Unfolding Abstract Datatypes isn't a bad place to start; but keep in mind that what it discusses is fundamentally not what Cook is calling an ADT.


And a short addendum: Having gone to all the trouble above to describe existential type abstractions, I'd like to highlight something about the FooADT type: Because all you can do with it is apply the function to get a Bool result, there is fundamentally no difference between FooADT and Bool, except that the former obfuscates your code and requires GHC extensions. I strongly encourage reading this blog post before setting out to use existential types in Haskell code.



回答2:

You can either require a comparison function to be provided or require the types to be instances of Eq. See nub and nubBy for examples of this technique:

nub :: (Eq a) => [a] -> [a]
nubBy :: (a -> a -> Bool) -> [a] -> [a]