Writing well-founded programs in Coq using Fix or

2019-01-28 20:10发布

Following the example given in the chapter GeneralRec of Chlipala book, I'm trying to write the mergesort algorithm.

Here is my code

Require Import Nat.

Fixpoint insert (x:nat) (l: list nat) : list nat :=
  match l with
  | nil => x::nil
  | y::l' => if leb x y then
              x::l
            else
              y::(insert x l')
  end.

Fixpoint merge (l1 l2 : list nat) : list nat :=
  match l1 with
  | nil => l2
  | x::l1' => insert x (merge l1' l2)
  end.

Fixpoint split (l : list nat) : list nat * list nat :=
  match l with
  | nil => (nil,nil)
  | x::nil => (x::nil,nil)
  | x::y::l' =>
    let (ll,lr) := split l' in
    (x::ll,y::lr)
  end.

Definition lengthOrder (l1 l2 : list nat) :=
  length l1 < length l2.

Theorem lengthOrder_wf : well_founded lengthOrder.
Admitted.

The problem is that it is not possible to write the mergeSort function with the command Fixpoint since the function is not structurally decreasing :

Fixpoint mergeSort (l: list nat) : list nat :=
  if leb (length l) 1 then l
  else
    let (ll,lr) := split l in
    merge (mergeSort ll) (mergeSort lr).

Instead, one can use the command Program Fixpoint or Definition with the term Fix (as in Chlipala book).

However, if I'm writing this

Definition mergeSort : list nat -> list nat.
refine (Fix lengthOrder_wf (fun (l: list nat) => list nat)
      (fun (l : list nat) => (fun mergeSort : (forall ls : list nat, lengthOrder ls l -> list nat )=>
                           if leb (length l) 1 then
                             let (ll,lr) := split l in
                             merge (mergeSort ll _) (mergeSort lr _)
                           else
                             l))).

I'm getting impossible goals :

2 subgoals, subgoal 1 (ID 65)

  l : list nat
  mergeSort : forall ls : list nat, lengthOrder ls l -> list nat
  ll, lr : list nat
  ============================
  lengthOrder ll l

subgoal 2 (ID 66) is:
 lengthOrder lr l

That is why Chlipala suggests to change the definition of mergeSort this way:

Definition mergeSort : list nat -> list nat.
  refine (Fix lengthOrder_wf (fun _ => list nat)
              (fun (ls : list nat)
                 (mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat) =>
                 if Compare_dec.le_lt_dec 2 (length ls)
                 then let lss := split ls in
                      merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)
                 else ls)).

that generates the following goals:

2 subgoals, subgoal 1 (ID 68)

  ls : list nat
  mergeSort : forall ls' : list nat, lengthOrder ls' ls -> list nat
  l : 2 <= length ls
  lss := split ls : list nat * list nat
  ============================
  lengthOrder (fst lss) ls

subgoal 2 (ID 69) is:
 lengthOrder (snd lss) ls

This new definition sounds like magic to me. So I wonder:

  • Fom the first definition, is it still possible to proof the well-foudness of the function?
  • Otherwise why the first definition cannot work?
  • How a basic user can go from the first definition to the second easily?

标签: coq
1条回答
成全新的幸福
2楼-- · 2019-01-28 20:41

It's easy to see that you need to make two changes in order to get to A. Chlipala's solution.

1) When doing split you somehow need to remember that ll and lr came from split, otherwise they would be some arbitrary lists, which cannot possibly be shorter than the original list l.

The following piece of code fails to save that kind of information:

let (ll,lr) := split l in
  merge (mergeSort ll _) (mergeSort lr _)

and, thus, needs to be replaced with

let lss := split ls in
  merge (mergeSort (fst lss) _) (mergeSort (snd lss) _)

which keeps what we need.

The failure happens due to Coq's inability to remember that ll and lr come from split l and that happens because let (ll,lr) is just match in disguise (see the manual, §2.2.3).

Recall that the aims of pattern-matching is to (loosely speaking)

  • unpack the components of some value of an inductive datatype and bind them to some names (we'll need this in the 2nd part of my answer) and
  • replace the original definition with its special cases in the corresponding pattern-match branches.

Now, observe that split l does not occur anywhere in the goal or context before we pattern-match on it. We just arbitrarily introduce it into the definition. That's why pattern-matching doesn't give us anything -- we can't replace split l with its "special case" ((ll,lr)) in the goal or context, because there is no split l anywhere.

There is an alternative way of doing this by using logical equality (=):

(let (ll, lr) as s return (s = split l -> list nat) := split l in
   fun split_eq => merge (mergeSort ll _) (mergeSort lr _)) eq_refl

This is analogous to using the remember tactic. We've got rid of fst and snd, but it is a huge overkill and I wouldn't recommend it.


2) Another thing we need to prove is the fact that ll and lr are shorter than l when 2 <= length l.

Since an if-expression is a match in disguise as well (it works for any inductive datatype with exactly two constructors), we need some mechanism to remember that leb 2 (length l) = true in the then branch. Again, since we don't have leb anywhere, this information gets lost.

There are at least two possible solutions to the problem:

  • either we remember leb 2 (length l) as an equation (just as we did in the 1st part), or
  • we can use some comparison function with result type behaving like bool (so it can represent two alternatives), but it should also remember some additional information we need. Then we could pattern-match on the comparison result and extract the information, which, of course, in this case have to be a proof of 2 <= length l.

What we need is a type which is able to carry a proof of m <= n in the case when leb m n returns true and a proof of, say, m > n otherwise. There is a type in the standard library that does exactly that! It's called sumbool:

Inductive sumbool (A B : Prop) : Set :=
    left : A -> {A} + {B} | right : B -> {A} + {B}

{A} + {B} is just a notation (syntactic sugar) for sumbool A B. Just as bool, it has two constructors, but in addition it remembers a proof of either of two propositions A and B. Its advantage over bool shows up when you do case analysis on it with if: you get a proof of A in the then branch and a proof of B in the else branch. In other words, you get to use context you saved beforehand, whereas bool doesn't carry any context (only in the mind of the programmer).

And we need exactly that! Well, not in the else branch, but we would like to get 2 <= length l in our then branch. So, let us ask Coq if it already has a comparison function with the return type like that:

Search (_ -> _ -> {_ <= _} + {_}).

(*
output:
le_lt_dec: forall n m : nat, {n <= m} + {m < n}
le_le_S_dec: forall n m : nat, {n <= m} + {S m <= n}
le_ge_dec: forall n m : nat, {n <= m} + {n >= m}
le_gt_dec: forall n m : nat, {n <= m} + {n > m}
le_dec: forall n m : nat, {n <= m} + {~ n <= m}
*)

Any of the five results would do, because we need a proof only in one case.

Hence, we can replace if leb 2 (length l) then ... with if le_lt_dec 2 (length l) ... and get 2 <= length in the proof context, which will let us finish the proof.

查看更多
登录 后发表回答