Is it possible to define rest argument in OCaml?

2019-07-14 11:06发布

问题:

Relate thread in reddit.

In racket, we could define a function with rest argument as:

(define (avg . l)
    (/ (apply + l) (length l)))

We could call this function by:

> (avg 1 2 3)
2

There are plenty of ways to solve this particular avg mentioned by the replies from reddit.

But if I want to do something more complicated like:

(define *memoize-tbl* (make-hasheq))

(define (bind fn . args)
  (let ([res (apply fn args)])
    (hash-set! *memoize-tbl*
               (equal-hash-code (cons fn args))
               res)
    res))

(define (f1 loi i s)
  (+
   (length loi)
   i
   (string-length s)))

(bind f1 '(1 2 3) 8 "hi")

We can see that the function bind does not care the number of arguments fn has , and the type of arguments could be anything: integer, list, string.

I was wondering if maybe there is semantic similarly in OCaml?

回答1:

ML and Haskell don't really have a counterpart to the &rest of Lisp (or similar features of Lisp-like languages). Type issues aside, the way that functions are defined means that there is no good way to define "the remaining arguments" of a function.

The primary two applications of using "rest" arguments are variadic functions and function wrappers. The Reddit thread already answered how to do a variadic function (use a list argument), so I presume this is about function wrappers, and here things can get a bit hairy.

The underlying problem that you have is that the concept of a list or tuple of arguments does not really exist for ML functions that do not specifically use tuple or list arguments. For example, the function

let d x y = abs (x - y)

is equivalent to the function

let d x = (fun y -> abs (x - y))

In other words, an (n+1)-ary function is actually a function that, when applied to a single argument, yields an n-ary function. For example, d 0 returns a unary function that describes the distance from 0.

If you want to operate on tuples of arguments, then you need to specify them as such:

let d (x, y) = abs (x - y)

You can then call this with (e.g.) d(3, 5) instead of d 3 5. Note that the optimizing OCaml compiler should normally generate identical code for both cases.

You can easily tell the difference from the types. The first function (d x y) has the type

int -> int -> int

while the second one (d(x, y)) has the type

int * int -> int

Arity becomes a really fuzzy concept in the former case: do we have a binary function returning a value of type int or a unary function returning a value of int -> int? The compiler can't tell, you have to look at the programmer's intent, so you have to tell a wrapper exactly which parts to wrap.

When you have arguments in tuple form, you can easily define memoization, because a tuple is just a single argument. For example, let's define the Ackermann function and then apply memoization to it:

let rec ack = function
| (0, n) -> n + 1
| (m, 0) -> ack (m-1, 1)
| (m, n) -> ack (m-1, ack(m, n-1))

let memoize f =
  let memo_table = Hashtbl.create 0 in
  let f' x = try
    Hashtbl.find memo_table x
  with Not_found -> begin
    let y = f x in Hashtbl.add memo_table x y; y
  end in f'

let ack = memoize ack

Note that this simple example does not actually apply memoization to the recursive call, but only to the top-level invocation. However, the idea should still be clear.

Also, if the transformation involves non-trivial polymorphic behavior, you may need a functor instead of a function to express the transformation.

With a bit of boilerplate, you can also apply this to the f x y notation. For example, if you had ack written to accept two int arguments rather than a pair of ints, you could then write:

let ack m n = memoize (fun (m, n) -> ack m n)

And if you felt really ambitious, you could even write a PPX rewriter to encode this as part of a syntax extension point so that you can write something like:

let%memoize ack x y = ...


回答2:

There is no such thing in OCaml like the . rest parameter in Scheme (as described in R5RS 4.1.4 third bullet - link), where all remaining parameters are converted to a list.

This would not work in OCaml, list elements must be of the same type.

Convert your parameters to a list first. Then you can define your types like in the OCaml manual 4.02 Par. 1.4

type number = Int of int | Float of float | Error;;

and destructure the parameters with pattern matching.



回答3:

OCaml is a statically typed language. So it has a different flavor in comparison with Scheme, Python and other dynamically typed languages. Approaches, that are used in Scheme, will not work as is in OCaml. Moreover, it is usually a bad idea, to try to transfer your dynamic habits to programming in OCaml, Haskell, Rust, or any other statically typed language. This will lead to a non-idiomatic code, that is hard to understand and to use.

From the OCaml perspective, all functions in Scheme accepts one argument of type list. In scheme, a list can have values of different types. In OCaml lists are monomorphic, and in order to put values of different genera into the same list, you need to coerce them to some common ancestor type. You can use variants for this, like @Str has suggested. You can also use universal type, that is inhabited by any possible value. There're several libraries in OCaml that provide such types, e.g., Jane street's Univ library. In fact, scheme, python and other dynamic languages, are also using some sort of universal type to represent their values. Another approach, instead of using some ancestor type, would be to use type classes, i.e., to pass with the value a set of functions, that represents this value. Currently, OCaml doesn't have modular implicits in the official release, so you need to pass it explicitly. You can use first-class modules or just records. For example, to define an average function you need division, summation, zero (element neutral to the sum operation) and one (an element neutral to the multiplication) - a structure called field in mathematics.

open Core_kernel.Std

module type Field = sig
  type t
  val zero : t
  val one : t
  val of_int : int -> t
  val (+) : t -> t -> t
  val (/) : t -> t -> t
end

With this we can define the avg function:

let avg (type t) (module Field : Field with type t = t) xs : t = 
  Field.(List.fold ~f:(+) ~init:zero xs / of_int (List.length xs))

And uses as follows:

avg (module Int) [1;2;3;4]

After modular implicits will become an official part of OCaml, we will write just avg [1;2;3;4] and corresponding module will be found implicitly.

Using this framework, one can add memoizing, as in your extended example (if I understand it correctly). Instead of first-class modules one can use records or even classes (the latter will allow to express the subtyping in the algebraic fields). This would be still non-idiomatic code for OCaml, as it obscures the implementation.

P.S. Just in case if you're interested, what would be an idiomatic solution for adding memoization to an arbitrary function in modern OCaml, then it is to use an extension point, that can be used as follows

let avg xs = <implementation>
[@@memoized (module Int)]


标签: ocaml