I have noticed that, among the OCaml programmers I know, some always use polymorphic variants (variants that are not declared, prefixed with a backquote), while others never use them and prefer variants declared in types.
Performance aside (polymorphic variants are currently compiled less efficiently than simple variants), how do expert OCaml developers choose between them?
My usage falls into five categories, listed in order below: 1. interface, 2. modularity, 3. legibility, 4. brevity, 5. tricks.
- If the variant type is only internal to the module, I use regular variants, because as you said they are compiled more efficiently.
- If the variant type is exported in the interface and I feel that some cases could appear in other modules, but it wouldn't necessarily make sense to make those modules dependent on this one, I use polymorphic variants because they are not tied to the module namespace system. Example: the encoding type of Xmlm. Also, having the signal type as a polymorphic variant type means you can develop modules using the same idea for XML processing without introducing a dependency on Xmlm.
- If the variant type is exported in the interface, I sometimes find it too verbose to use regular variants when values of the variant type are given to functions of the module. Example: the version type of Uuidm. Instead of having to write Uuidm.create Uuidm.V4 you can simply write Uuidm.create `V4, which is as clear and less verbose.
- Sometimes a particular function may return different cases. If these cases are only used by this function, I write them directly in the function type in the interface, without having to introduce a type definition. For example:
parse : string -> [`Error of string | `Ok of t]
- Polymorphic variants and their subtyping allow you to enforce invariants statically with phantom types. Besides, the possibility of defining them incrementally can be useful, both for enforcing invariants statically and for documentation purposes.
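To make the phantom-type item concrete, here is a minimal sketch. The Ref module and its permission tags `Read and `Write are hypothetical names invented for this illustration; they are not taken from any library mentioned above.

```ocaml
(* Hypothetical module: the type parameter 'perm is a phantom, it never
   appears in the representation and only carries permission tags. *)
module Ref : sig
  type 'perm t
  val make : int -> [ `Read | `Write ] t
  val read_only : [> `Read ] t -> [ `Read ] t
  val get : [> `Read ] t -> int
  val set : [> `Write ] t -> int -> unit
end = struct
  type 'perm t = int ref
  let make x = ref x
  let read_only r = r
  let get r = !r
  let set r x = r := x
end

let r = Ref.make 1
let () = Ref.set r 2          (* accepted: r carries `Write *)
let ro = Ref.read_only r      (* same cell, fewer permissions *)
let v = Ref.get ro            (* accepted: ro carries `Read *)
(* Ref.set ro 3 would be rejected at compile time: ro lacks `Write *)
```

The invariant (no writes through a read-only handle) is checked entirely by the type checker; at run time both handles are the same int ref.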
Finally, I sometimes use polymorphic variants in the implementation of a module, as in category 4 (brevity), but without them showing up in the interface. Because this weakens the static typing discipline, I discourage it unless you declare the polymorphic variant types and close them.
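A small sketch of what "weakens the static typing discipline" can mean in practice. The function names and the status type are made up for this illustration.

```ocaml
(* Undeclared, open polymorphic variants: the misspelt tag `Eror is
   simply accepted as one more tag, and the wildcard hides the mistake. *)
let describe_open = function
  | `Ok -> "ok"
  | `Eror -> "error"          (* typo, no compiler complaint *)
  | _ -> "unknown"

(* Declared and closed: only the listed tags are allowed. *)
type status = [ `Ok | `Error ]

let describe (s : status) =
  match s with
  | `Ok -> "ok"
  | `Error -> "error"
  (* writing `Eror here would be a compile-time type error *)
```

With the open version, describe_open `Error type-checks and falls through to the wildcard case; with the closed version, the same typo in a pattern is caught by the compiler.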
The only reason why I use polymorphic variants in most module interfaces is to work around the naming issues of classic variants.
If the following could work, polymorphic variants would no longer be useful in a majority of cases:
type t1 = String of string | Int of int | Bool of bool | List of t1 list
type t2 = String of string | Int of int | Other
let simplify x =
  match (x : t1) with
  | String s -> String s
  | Int n -> Int n
  | Bool _
  | List _ -> Other
2014-02-21 Update: the code above is now valid in OCaml 4.01. Hurray!
It's not true that polymorphic variants are always less efficient. Using Martin's example:
type base = [`String of string | `Int of int]
type t1 = [base | `Bool of bool | `List of t1 list]
type t2 = [base | `Other]
let simplify (x:t1):t2 = match x with
| #base as b -> b
| `Bool _ | `List _ -> `Other
Doing this with standard variants requires two distinct types and a complete recoding; with polymorphic variants the base case is physically invariant (the matched value is returned as is, without being rebuilt). This feature really comes into its own when using open recursion for term rewriting:
type leaf = [`String of string | `Int of int]
type 'b base = [leaf | `List of 'b list]
type t1 = [t1 base | `Bool of bool ]
type t2 = [t2 base | `Other]
let rec simplify (x:t1):t2 = match x with
| #leaf as x -> x
| `List t -> `List (List.map simplify t)
| `Bool _ -> `Other
and the advantages are even greater when the rewriting functions are also factored with open recursion.
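As a rough sketch of what factoring the rewriting function could look like: the traversal of the shared cases is pulled out into a helper that takes the recursive call as a parameter. The helper name map_base is an assumption made for this illustration; the types are the ones defined above, repeated so that the block is self-contained.

```ocaml
type leaf = [ `String of string | `Int of int ]
type 'b base = [ leaf | `List of 'b list ]
type t1 = [ t1 base | `Bool of bool ]
type t2 = [ t2 base | `Other ]

(* Hypothetical helper: handles the cases shared by every extension of
   base and delegates the recursion on sub-terms to f. *)
let map_base f = function
  | #leaf as l -> l
  | `List ts -> `List (List.map f ts)

(* Only the cases specific to t1 are written here; the knot is tied by
   passing simplify itself to map_base. *)
let rec simplify (x : t1) : t2 =
  match x with
  | #base as b -> map_base simplify b
  | `Bool _ -> `Other
```

Another rewriting function over a different extension of base could reuse map_base in the same way. Note that simplify still carries explicit type annotations.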
Unfortunately, OCaml's Hindley-Milner type inference is not strong enough to do this kind of thing without explicit typing, which requires careful factorisation of the types, which in turn makes prototyping difficult. Additionally, explicit coercions are sometimes required.
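A minimal illustration of such an explicit coercion, using cut-down versions of the types above (the value names are made up): once a value has been given the closed type leaf, it has to be coerced with :> before it can be used where a t2 is expected.

```ocaml
type leaf = [ `String of string | `Int of int ]
type t2 = [ leaf | `Other ]

let default_leaf : leaf = `Int 0

(* let bad : t2 list = [ default_leaf; `Other ]
   is rejected: leaf and t2 are different closed types. *)

(* The coercion makes the subtyping step explicit. *)
let ok : t2 list = [ (default_leaf :> t2); `Other ]
```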
The big downside of this technique is that for terms with multiple parameters, one soon ends up with a rather confusing combinatorial explosion of types, and in the end it is easier to give up on static enforcement and use a kitchen sink type with wildcards and exceptions (i.e. dynamic typing).