What are the various use cases for union types and intersection types? There has been lately a lot of buzz about these type system features, yet somehow I have never felt need for either of these!
问题:
回答1:
If you want a more practice-oriented answer:
With union and recursive types you can encode regular tree types and therefore XML types.
With intersection types you can type BOTH overloaded functions and refinement types (what in a previous post is called coherent overloading)
So for instance you can write the function add (that overloads integer sum and string concatenation) as follows
let add ( (Int,Int)->Int ; (String,String)->String )
| (x & Int, y & Int) -> x+y
| (x & String, y & String) -> x@y ;;
Which has the intersection type
(Int,Int)->Int & (String,String)->String
But you can also refine the type above and type the function above as
(Pos,Pos) -> Pos &
(Neg,Neg) -> Neg &
(Int,Int)->Int &
(String,String)->String.
where Pos and Neg are positive and negative integer types.
The code above is executable in the language CDuce ( http://www.cduce.org ) whose type system includes union, intersections, and negation types (it is mainly targeted at XML transformations).
If you want to try it and you are on Linux, then it is probably included in your distribution (apt-get install cduce or yum install cduce should do the work) and you can use its toplevel (a la OCaml) to play with union and intersection types. On the CDuce site you will find a lot of practical examples of use of union and intersection types. And since there is a complete integration with OCaml libraries (you can import OCaml libraries in CDuce and export CDuce modules to OCaml) you can also check the correspondence with ML sum types (see here).
Here you are a complex example that mix union and intersection types (explained in the page "http://www.cduce.org/tutorial_overloading.html#val"), but to understand it you need to understand regular expression pattern matching, which requires some effort.
type Person = FPerson | MPerson
type FPerson = <person gender = "F">[ Name Children ]
type MPerson = <person gender = "M">[ Name Children ]
type Children = <children>[ Person* ]
type Name = <name>[ PCDATA ]
type Man = <man name=String>[ Sons Daughters ]
type Woman = <woman name=String>[ Sons Daughters ]
type Sons = <sons>[ Man* ]
type Daughters = <daughters>[ Woman* ]
let fun split (MPerson -> Man ; FPerson -> Woman)
<person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] ->
(* the above pattern collects all the MPerson in mc, and all the FPerson in fc *)
let tag = match g with "F" -> `woman | "M" -> `man in
let s = map mc with x -> split x in
let d = map fc with x -> split x in
<(tag) name=n>[ <sons>s <daughters>d ] ;;
In a nutshell it transforms values of type Person into values of type (Man | Women) (where the vertical bar denotes a union type) but keeping the correspondence between genres: split is a function with intersection type
MPerson -> Man & FPerson -> Woman
回答2:
Union Types
To quote Robert Harper, "Practical Foundations for Programming Languages", ch 15:
Most data structures involve alternatives such as the distinction between a leaf and an interior node in a tree, or a choice in the outermost form of a piece of abstract syntax. Importantly, the choice determines the structure of the value. For example, nodes have children, but leaves do not, and so forth. These concepts are expressed by sum types, specifically the binary sum, which offers a choice of two things, and the nullary sum, which offers a choice of no things.
Booleans
The simplest sum type is the Boolean,
data Bool = True
| False
Booleans have only two valid values, T or F. So instead of representing them as numbers, we can instead use a sum type to more accurately encode the fact there are only two possible values.
Enumerations
Enumerations are examples of more general sum types: ones with many, but finite, alternative values.
Sum types and null pointers
The best practically motivating example for sum types is discriminating between valid results and error values returned by functions, by distinguishing the failure case.
For example, null pointers and end-of-file characters are hackish encodings of the sum type:
data Maybe a = Nothing
| Just a
where we can distinguish between valid and invalid values by using the Nothing
or Just
tag to annotate each value with its status.
By using sum types in this way we can rule out null pointer errors entirely, which is a pretty decent motivating example. Null pointers are entirely due to the inability of older languages to express sum types easily.
Intersection Types
Intersection types are much newer, and their applications are not as widely understood. However, Benjamin Pierce's thesis ("Programming with Intersection Types and Bounded Polymorphism") gives a good overview:
The most intriguing and potentially useful property of intersection types is their ability to express an essentially unbounded (though of course finite) amount of information about the components of a program.
For example, the addition function
(+)
can be given the typeInt -> Int -> Int ^ Real -> Real -> Real
, capturing both the general fact that the sum of two real numbers is always a real and the more specialized fact that the sum of two integers is always an integer. A compiler for a language with intersection types might even provide two different object-code sequences for the two versions of(+)
, one using a floating point addition instruction and one using integer addition. For each instance of+ in a program, the compiler can decide whether both arguments are integers and generate the more efficient object code sequence in this case.This kind of finitary polymorphism or coherent overloading is so expressive, that ... the set of all valid typings for a program amounts to a complete characterization of the program’s behavior
They let us encode a lot of information in the type, explaining via type theory what multiple inheritance means, giving types to type classes,
回答3:
Union types are useful for typing dynamic languages or otherwise allowing more flexibility in the types passed around than most static languages allow. For example, consider this:
var a;
if (condition) {
a = "string";
} else {
a = 123;
}
If you have union types, it's easy to type a
as int | string
.
One use for intersection types is to describe an object that implements multiple interfaces. For example, C# allows multiple interface constraints on generics:
interface IFoo {
void Foo();
}
interface IBar {
void Bar();
}
void Method<T>(T arg) where T : IFoo, IBar {
arg.Foo();
arg.Bar();
}
Here, arg
's type is the intersection of IFoo
and IBar
. Using that, the type-checker knows both Foo()
and Bar()
are valid methods on it.
回答4:
For instance with union types one could describe json domain model without introducing actual new classes but using only type aliases.
type JObject = Map[String, JValue]
type JArray = List[JValue]
type JValue = String | Number | Bool | Null | JObject | JArray
type Json = JObject | JArray
def stringify(json: JValue): String = json match {
case String | Number | Bool | Null => json.toString()
case JObject => "{" + json.map(x y => x + ": " + stringify(y)).mkStr(", ") + "}"
case JArray => "[" + json.map(stringify).mkStr(", ") + "]"
}