Can I extract a Coq proof as a Haskell function?

Ever since I learned a little bit of Coq I wanted to learn to write a Coq proof of the so-called division algorithm that is actually a logical proposition: forall n m : nat, exists q : nat, exists r : nat, n = q * m + r
I recently accomplished that task using what I learned from Software Foundations.
Coq being a system for developing constructive proofs, my proof is in effect a method to construct suitable values q and r from values m and n.
Coq has an intriguing facility for "extracting" an algorithm in Coq's algorithm language (Gallina) to general-purpose functional programming languages including Haskell.
Separately I have managed to write the divmod operation as a Gallina Fixpoint and extract that. I want to note carefully that that task is not what I'm considering here.
Adam Chlipala has written in Certified Programming with Dependent Types that "Many fans of the Curry-Howard correspondence support the idea of extracting programs from proofs. In reality, few users of Coq and related tools do any such thing."

Is it even possible to extract the algorithm implicit in my proof to Haskell? If it is possible, how would it be done?

回答1:

Thanks to Prof. Pierce's summer 2012 video 4.1 as Dan Feltey suggested, we see that the key is that the theorem to be extracted must provide a member of Type rather than the usual kind of propositions, which is Prop.

For the particular theorem the affected construct is the inductive Prop ex and its notation exists. Similarly to what Prof. Pierce has done, we can state our own alternate definitions ex_t and exists_t that replace occurrences of Prop with occurrences of Type.

Here is the usual redefinition of ex and exists similarly as they are defined in Coq's standard library.

Inductive ex (X:Type) (P : X->Prop) : Prop :=
ex_intro : forall (witness:X), P witness -> ex X P.

Notation "'exists' x : X , p" := (ex _ (fun x:X => p))
(at level 200, x ident, right associativity) : type_scope.

Here are the alternate definitions.

Inductive ex_t (X:Type) (P : X->Type) : Type :=
ex_t_intro : forall (witness:X), P witness -> ex_t X P.

Notation "'exists_t' x : X , p" := (ex_t _ (fun x:X => p))
(at level 200, x ident, right associativity) : type_scope.

Now, somewhat unfortunately, it is necessary to repeat both the statement and the proof of the theorem using these new definitions.

What in the world??

Why is it necessary to make a reiterated statement of the theorem and a reiterated proof of the theorem, that differ only by using an alternative definition of the quantifier??

I had hoped to use the existing theorem in Prop to prove the theorem over again in Type. That strategy fails when Coq rejects the proof tactic inversion for a Prop in the environment when that Prop uses exists and the goal is a Type that uses exists_t. Coq reports "Error: Inversion would require case analysis on sort Set which is not allowed for inductive definition ex." This behavior occurred in Coq 8.3. I am not certain that it still occurs in Coq 8.4.

I think the need to repeat the proof is actually profound although I doubt that I personally am quite managing to perceive its profundity. It involves the facts that Prop is "impredicative" and Type is not impredicative, but rather, tacitly "stratified". Predicativity is (if I understand correctly) vulnerability to Russell's paradox that the set S of sets that are not members of themselves can neither be a member of S, nor a non-member of S. Type avoids Russell's paradox by tacitly creating a sequence of higher types that contain lower types. Because Coq is drenched in the formulae-as-types interpretation of the Curry-Howard correspondence, and if I am getting this right, we can even understand stratification of types in Coq as a way to avoid Gödel incompleteness, the phenomenon that certain formulae express constraints on formulae such as themselves and thereby become unknowable as to their truth or falsehood.

Back on planet Earth, here is the repeated statement of the theorem using "exists_t".

Theorem divalg_t : forall n m : nat, exists_t q : nat,
  exists_t r : nat, n = plus (mult q m) r.

As I have omitted the proof of divalg, I will also omit the proof of divalg_t. I will only mention that we do have the good fortune that proof tactics including "exists" and "inversion" work just the same with our new definitions "ex_t" and "exists_t".

Finally, the extraction itself is accomplished easily.

Extraction Language Haskell.

Extraction "divalg.hs" divalg_t.

The resulting Haskell file contains a number of definitions, the heart of which is the reasonably nice code, below. And I was only slightly hampered by my near-total ignorance of the Haskell programming language. Note that Ex_t_intro creates a result whose type is Ex_t; O and S are the zero and the successor function from Peano arithmetic; beq_nat tests Peano numbers for equality; nat_rec is a higher-order function that recurs over the function among its arguments. The definition of nat_rec is not shown here. At any rate it is generated by Coq according to the inductive type "nat" that was defined in Coq.

divalg :: Nat -> Nat -> Ex_t Nat (Ex_t Nat ())
divalg n m =
  case m of {
   O -> Ex_t_intro O (Ex_t_intro n __);
   S m' ->
    nat_rec (Ex_t_intro O (Ex_t_intro O __)) (\n' iHn' ->
      case iHn' of {
       Ex_t_intro q' hq' ->
        case hq' of {
         Ex_t_intro r' _ ->
          let {k = beq_nat r' m'} in
          case k of {
           True -> Ex_t_intro (S q') (Ex_t_intro O __);
           False -> Ex_t_intro q' (Ex_t_intro (S r') __)}}}) n}

Update 2013-04-24: I know a bit more Haskell now. To assist others in reading the extracted code above, I'm presenting the following hand-rewritten code that I claim is equivalent and more readable. I'm also presenting the extracted definitions Nat, O, S, and nat_rec that I did not eliminate.

-- Extracted: Natural numbers (non-negative integers)
-- in the manner in which Peano defined them.
data Nat =
   O
 | S Nat
   deriving (Eq, Show)

-- Extracted: General recursion over natural numbers,
-- an interpretation of Nat in the manner of higher-order abstract syntax.
nat_rec :: a1 -> (Nat -> a1 -> a1) -> Nat -> a1
nat_rec f f0 n =
  case n of {
   O -> f;
   S n0 -> f0 n0 (nat_rec f f0 n0)}

-- Given non-negative integers n and m, produce (q, r) with n = q * m + r.
divalg_t :: Nat -> Nat -> (Nat, Nat)
divalg_t n      O = (O, n)      -- n/0: Define quotient 0, remainder n.
divalg_t n (S m') = divpos n m' -- n/(S m')
  where
        -- Given non-negative integers  n and m',
        -- and defining m = m' + 1,
        -- produce (q, r) with n = q * m + r
        -- so that q = floor (n / m) and r = n % m.
        divpos :: Nat -> Nat -> (Nat, Nat)
        divpos n m' = nat_rec (O, O) (incrDivMod m') n
        -- Given a non-negative integer m' and
        -- a pair of non-negative integers (q', r') with r <= m',
        -- and defining m = m' + 1,
        -- produce (q, r) with q*m + r = q'*m + r' + 1 and r <= m'.
        incrDivMod :: Nat -> Nat -> (Nat, Nat) -> (Nat, Nat)
        incrDivMod m' _ (q', r')
          | r' == m'  = (S q', O)
          | otherwise = (q', S r')

回答2:

The current copy of Software Foundations dated July 25, 2012, answers this quite concisely in the late chapter "Extraction2". The answer is that it can certainly be done, much like this:

Extraction Language Haskell
Extraction "divalg.hs" divalg

One more trick is necessary. Instead of a Prop, divalg must be a Type. Otherwise it will be erased in the process of extraction.

Uh oh, @Anthill is correct, I haven't answered the question because I don't know how to explain how Prof. Pierce accomplished that in his NormInType.v variant of his Norm.v and MoreStlc.v.

OK, here's the rest of my partial answer anyway.

Where "divalg" appears above, it will be necessary to provide a space-separated list of all of the propositions (which must each be redefined as a Type rather than a Prop) on which divalg relies. For a thorough, interesting, and working example of a proof extraction, one may consult the chapter Extraction2 mentioned above. That example extracts to OCaml, but adapting it for Haskell is simply a matter of using Extraction Language Haskell as above.

In part, the reason that I spent some time not knowing the above answer is that I have been using the copy of Software Foundations dated October 14, 2010, that I downloaded in 2011.