Isabelle: getting three different results with sle

2019-06-08 09:13发布

问题:

You need this to have my imports:

(* tested with Isabelle2013-2  *)
theory Notepad
imports
  Main
  "~~/src/HOL/Library/Polynomial"
begin

notepad
begin

I have three almost identical lemmas.

Version1:

 {
 fix a :: "('a:: comm_ring_1) poly"
  have "degree((monom 1 1) -CONST pCons a 0) =1" sledgehammer

Version2:

 {
 fix a :: "('a:: comm_ring_1) poly"
 def p ≡ "(monom 1 1) - CONST pCons a 0"
 from p_def have "degree p = 1"

Version3:

{
 fix a :: "('a:: comm_ring_1) poly"
 have "p ≡ (monom 1 1) - CONST pCons a 0 ⟹ degree p = 1" 

Here is the result of running sledgehammer, showing only the e theorem-prover results; it is showing three completely different results:

(*
for instance, the three results of the prover e are:
"e": Try this: by (metis One_nat_def degree_1 degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (2.12 s).
"e": Try this: by (metis coeff_diff coeff_monom coeff_pCons_Suc degree_pCons_eq_if diff_0_right leading_coeff_0_iff pCons_cases zero_neq_one) (> 3 s). 
"e": Try this: by (metis One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (214 ms).
*)

Question: Why is that? Is it that sledgehammer is non-deterministic? Or is it that the three version differ for Isabelle, so that sledgehammer gets three different inputs?

(* tested with Isabelle2013-2  *)
theory Notepad
imports
  Main
  "~~/src/HOL/Library/Polynomial"
begin

notepad
begin
Here is the full code:
 {
 fix a :: "('a:: comm_ring_1) poly"
  have "degree((monom 1 1) -CONST pCons a 0) =1"
(*
Sledgehammering... 
"spass": Try this: by (metis One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (2.13 s). 
"e": Try this: by (metis One_nat_def degree_1 degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (2.12 s).
To minimize: sledgehammer min (One_nat_def degree_1 degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one). 
"z3": Timed out. 
"remote_vampire": Timed out. 
"remote_e_sine": Try this: by (smt Nat.add_0_right Nat.add_diff_inverse Nat.diff_cancel One_nat_def Suc_diff_1 Suc_diff_Suc Suc_diff_diff Suc_diff_eq_diff_pred Suc_eq_plus1 Suc_eq_plus1_left Suc_inject Suc_le_mono Suc_lessD Suc_lessE Suc_lessI Suc_less_SucD Suc_less_eq Suc_mono Suc_neq_Zero Suc_pred Suc_pred' Zero_neq_Suc Zero_not_Suc add_Suc add_Suc_right add_Suc_shift add_diff_cancel_left' add_diff_cancel_right add_diff_cancel_right' add_eq_if add_eq_self_zero add_gr_0 add_is_0 add_is_1 add_left_cancel add_lessD1 add_less_cancel_left add_less_cancel_right add_less_mono add_less_mono1 add_pCons add_pos_pos add_right_cancel bool.size(1) bool.size(2) bool.size(3) bool.size(4) coeff_0 coeff_1 coeff_add coeff_diff coeff_linear_power coeff_monom coeff_pCons_0 coeff_pCons_Suc comm_monoid_add_class.add.left_neutral comm_monoid_add_class.add.right_neutral comm_monoid_diff_class.add_diff_cancel_left comm_monoid_diff_class.diff_cancel comm_semiring_1_class.normalizing_semiring_rules(32) comm_semiring_1_class.normalizing_semiring_rules(33) degree_0 degree_1 degree_linear_power degree_monom_eq degree_pCons_0 degree_pCons_eq degree_pCons_eq_if degree_synthetic_div diff_0_eq_0 diff_0_right diff_Suc_1 diff_Suc_Suc diff_Suc_eq_diff_pred diff_Suc_less diff_add_0 diff_add_inverse diff_add_inverse2 diff_add_zero diff_cancel2 diff_commute diff_diff_cancel diff_diff_left diff_eq_diff_eq diff_induct diff_is_0_eq diff_is_0_eq' diff_less diff_less_Suc diff_less_iff(1) diff_less_iff(2) diff_less_iff(3) diff_less_iff(4) diff_less_mono2 diff_monom diff_pCons diff_right_commute diff_self diff_self_eq_0 diff_zero diffs0_imp_equal double_add_less_zero_iff_single_add_less_zero double_zero double_zero_sym dvd.dual_order.refl dvd.lift_Suc_mono_less dvd.lift_Suc_mono_less_iff dvd_1_iff_1 dvd_1_left dvd_diff_nat dvd_minus_self dvd_plusE dvd_plus_eq_left dvd_plus_eq_right dvd_pos_nat dvd_reduce eq_diff_eq' eq_iff_diff_eq_0 exists_least_lemma field_power_not_zero gcd_lcm_complete_lattice_nat.bot.extremum gcd_lcm_complete_lattice_nat.bot.extremum_strict gcd_lcm_complete_lattice_nat.bot.extremum_unique gcd_lcm_complete_lattice_nat.bot.extremum_uniqueI gcd_lcm_complete_lattice_nat.bot.not_eq_extremum gcd_lcm_complete_lattice_nat.top.extremum_strict gcd_lcm_complete_lattice_nat.top.extremum_unique gcd_lcm_complete_lattice_nat.top.extremum_uniqueI gcd_lcm_complete_lattice_nat.top.not_eq_extremum gcd_lcm_complete_lattice_nat.top_greatest gr0I gr0_conv_Suc gr0_implies_Suc gr_implies_not0 ind_euclid infinite_descent infinite_descent0 is_zero_null le0 le_0_eq le_add_diff_inverse le_add_diff_inverse2 leading_coeff_0_iff leading_coeff_neq_0 lessE lessI less_Suc0 less_SucE less_SucI less_Suc_eq less_Suc_eq_0_disj less_Suc_induct less_add_Suc1 less_add_Suc2 less_add_eq_less less_antisym less_diff_conv less_dvd_minus less_iff_Suc_add less_imp_Suc_add less_imp_add_positive less_imp_diff_less less_irrefl_nat less_nat_zero_code less_not_refl less_not_refl2 less_not_refl3 less_trans_Suc less_zeroE lift_Suc_mono_less lift_Suc_mono_less_iff linorder_neqE_nat list_decode.cases minus_nat.diff_0 minus_poly.rep_eq minus_real_def monoid_add_class.add.left_neutral monoid_add_class.add.right_neutral monom.rep_eq monom_0 monom_Suc monom_eq_0 monom_eq_0_iff monom_eq_iff n_not_Suc_n nat.inject nat.size(1) nat.size(2) nat.size(3) nat.size(4) nat_add_assoc nat_add_commute nat_add_left_cancel nat_add_left_cancel_le nat_add_left_cancel_less nat_add_left_commute nat_add_right_cancel nat_diff_split nat_diff_split_asm nat_dvd_not_less nat_induct nat_less_cases nat_less_induct nat_lt_two_imp_zero_or_one nat_neq_iff nat_power_eq_Suc_0_iff nat_power_less_imp_less nat_zero_less_power_iff neq0_conv not0_implies_Suc not_add_less1 not_add_less2 not_less0 not_less_eq not_less_less_Suc_eq one_is_add one_poly_def one_reorient order_root pCons_0_0 pCons_cases pCons_eq_0_iff pCons_eq_iff pCons_induct pcompose_0 plus_nat.add_0 poly_0 poly_1 poly_add poly_all_0_iff_0 poly_diff poly_gcd_0_0 poly_gcd_monic poly_gcd_zero_iff poly_minus_degree_zero_const poly_power pow_divides_eq_int pow_divides_eq_nat pow_divides_pow_int pow_divides_pow_nat power_0 power_0_Suc power_0_left power_Suc_0 power_eq_0_iff power_inject_exp power_one power_one_right power_strict_increasing_iff real_0_less_add_iff real_add_eq_0_iff real_add_less_0_iff real_add_minus_iff real_arch_pow real_arch_pow_inv real_lbound_gt_zero realpow_pos_nth realpow_pos_nth2 realpow_pos_nth_unique semiring_numeral_div_class.diff_zero size_bool strict_inc_induct synthetic_div_0 synthetic_div_eq_0_iff synthetic_div_pCons trans_less_add1 trans_less_add2 transitive_stepwise_gt transitive_stepwise_lt_eq triangle_0 triangle_Suc zero_diff zero_induct zero_induct_lemma zero_less_Suc zero_less_diff zero_less_double_add_iff_zero_less_single_add zero_less_power_nat_eq zero_neq_one zero_poly.rep_eq zero_reorient) (> 3 s).
To minimize: sledgehammer min [remote_e_sine] (Nat.add_0_right Nat.add_diff_inverse Nat.diff_cancel One_nat_def Suc_diff_1 Suc_diff_Suc Suc_diff_diff Suc_diff_eq_diff_pred Suc_eq_plus1 Suc_eq_plus1_left Suc_inject Suc_le_mono Suc_lessD Suc_lessE Suc_lessI Suc_less_SucD Suc_less_eq Suc_mono Suc_neq_Zero Suc_pred Suc_pred' Zero_neq_Suc Zero_not_Suc add_Suc add_Suc_right add_Suc_shift add_diff_cancel_left' add_diff_cancel_right add_diff_cancel_right' add_eq_if add_eq_self_zero add_gr_0 add_is_0 add_is_1 add_left_cancel add_lessD1 add_less_cancel_left add_less_cancel_right add_less_mono add_less_mono1 add_pCons add_pos_pos add_right_cancel bool.size(1) bool.size(2) bool.size(3) bool.size(4) coeff_0 coeff_1 coeff_add coeff_diff coeff_linear_power coeff_monom coeff_pCons_0 coeff_pCons_Suc comm_monoid_add_class.add.left_neutral comm_monoid_add_class.add.right_neutral comm_monoid_diff_class.add_diff_cancel_left comm_monoid_diff_class.diff_cancel comm_semiring_1_class.normalizing_semiring_rules(32) comm_semiring_1_class.normalizing_semiring_rules(33) degree_0 degree_1 degree_linear_power degree_monom_eq degree_pCons_0 degree_pCons_eq degree_pCons_eq_if degree_synthetic_div diff_0_eq_0 diff_0_right diff_Suc_1 diff_Suc_Suc diff_Suc_eq_diff_pred diff_Suc_less diff_add_0 diff_add_inverse diff_add_inverse2 diff_add_zero diff_cancel2 diff_commute diff_diff_cancel diff_diff_left diff_eq_diff_eq diff_induct diff_is_0_eq diff_is_0_eq' diff_less diff_less_Suc diff_less_iff(1) diff_less_iff(2) diff_less_iff(3) diff_less_iff(4) diff_less_mono2 diff_monom diff_pCons diff_right_commute diff_self diff_self_eq_0 diff_zero diffs0_imp_equal double_add_less_zero_iff_single_add_less_zero double_zero double_zero_sym dvd.dual_order.refl dvd.lift_Suc_mono_less dvd.lift_Suc_mono_less_iff dvd_1_iff_1 dvd_1_left dvd_diff_nat dvd_minus_self dvd_plusE dvd_plus_eq_left dvd_plus_eq_right dvd_pos_nat dvd_reduce eq_diff_eq' eq_iff_diff_eq_0 exists_least_lemma field_power_not_zero gcd_lcm_complete_lattice_nat.bot.extremum gcd_lcm_complete_lattice_nat.bot.extremum_strict gcd_lcm_complete_lattice_nat.bot.extremum_unique gcd_lcm_complete_lattice_nat.bot.extremum_uniqueI gcd_lcm_complete_lattice_nat.bot.not_eq_extremum gcd_lcm_complete_lattice_nat.top.extremum_strict gcd_lcm_complete_lattice_nat.top.extremum_unique gcd_lcm_complete_lattice_nat.top.extremum_uniqueI gcd_lcm_complete_lattice_nat.top.not_eq_extremum gcd_lcm_complete_lattice_nat.top_greatest gr0I gr0_conv_Suc gr0_implies_Suc gr_implies_not0 ind_euclid infinite_descent infinite_descent0 is_zero_null le0 le_0_eq le_add_diff_inverse le_add_diff_inverse2 leading_coeff_0_iff leading_coeff_neq_0 lessE lessI less_Suc0 less_SucE less_SucI less_Suc_eq less_Suc_eq_0_disj less_Suc_induct less_add_Suc1 less_add_Suc2 less_add_eq_less less_antisym less_diff_conv less_dvd_minus less_iff_Suc_add less_imp_Suc_add less_imp_add_positive less_imp_diff_less less_irrefl_nat less_nat_zero_code less_not_refl less_not_refl2 less_not_refl3 less_trans_Suc less_zeroE lift_Suc_mono_less lift_Suc_mono_less_iff linorder_neqE_nat list_decode.cases minus_nat.diff_0 minus_poly.rep_eq minus_real_def monoid_add_class.add.left_neutral monoid_add_class.add.right_neutral monom.rep_eq monom_0 monom_Suc monom_eq_0 monom_eq_0_iff monom_eq_iff n_not_Suc_n nat.inject nat.size(1) nat.size(2) nat.size(3) nat.size(4) nat_add_assoc nat_add_commute nat_add_left_cancel nat_add_left_cancel_le nat_add_left_cancel_less nat_add_left_commute nat_add_right_cancel nat_diff_split nat_diff_split_asm nat_dvd_not_less nat_induct nat_less_cases nat_less_induct nat_lt_two_imp_zero_or_one nat_neq_iff nat_power_eq_Suc_0_iff nat_power_less_imp_less nat_zero_less_power_iff neq0_conv not0_implies_Suc not_add_less1 not_add_less2 not_less0 not_less_eq not_less_less_Suc_eq one_is_add one_poly_def one_reorient order_root pCons_0_0 pCons_cases pCons_eq_0_iff pCons_eq_iff pCons_induct pcompose_0 plus_nat.add_0 poly_0 poly_1 poly_add poly_all_0_iff_0 poly_diff poly_gcd_0_0 poly_gcd_monic poly_gcd_zero_iff poly_minus_degree_zero_const poly_power pow_divides_eq_int pow_divides_eq_nat pow_divides_pow_int pow_divides_pow_nat power_0 power_0_Suc power_0_left power_Suc_0 power_eq_0_iff power_inject_exp power_one power_one_right power_strict_increasing_iff real_0_less_add_iff real_add_eq_0_iff real_add_less_0_iff real_add_minus_iff real_arch_pow real_arch_pow_inv real_lbound_gt_zero realpow_pos_nth realpow_pos_nth2 realpow_pos_nth_unique semiring_numeral_div_class.diff_zero size_bool strict_inc_induct synthetic_div_0 synthetic_div_eq_0_iff synthetic_div_pCons trans_less_add1 trans_less_add2 transitive_stepwise_gt transitive_stepwise_lt_eq triangle_0 triangle_Suc zero_diff zero_induct zero_induct_lemma zero_less_Suc zero_less_diff zero_less_double_add_iff_zero_less_single_add zero_less_power_nat_eq zero_neq_one zero_poly.rep_eq zero_reorient).
*)

 {
 fix a :: "('a:: comm_ring_1) poly"
 def p ≡ "(monom 1 1) - CONST pCons a 0"
 from p_def have "degree p = 1"
 (*
Sledgehammering... 
"spass": Try this: by (metis One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (227 ms). 
"remote_e_sine": Timed out. 
"z3": Timed out. 
"remote_vampire": Timed out. 
"e": Try this: by (metis coeff_diff coeff_monom coeff_pCons_Suc degree_pCons_eq_if diff_0_right leading_coeff_0_iff pCons_cases zero_neq_one) (> 3 s). 
Structured proof (> 9.08 s):
proof -
  have f1: "⋀x⇩1 x⇩2. coeff 0 x⇩1 = (x⇩2∷'a poly) ∨ x⇩2 ≠ 0"
    using coeff_monom by simp
  have f2: "⋀x⇩1 x⇩2. (coeff (esk6⇩1 x⇩1) x⇩2∷'a poly) = coeff x⇩1 (Suc x⇩2)"
    by (metis coeff_pCons_Suc pCons_cases)
  have f3: "⋀x⇩1. coeff (monom 1 1) (Suc x⇩1) = coeff p (Suc x⇩1)"
    using p_def by fastforce
  have f4: "⋀x⇩1. Suc (degree (esk6⇩1 x⇩1)) = degree x⇩1 ∨ esk6⇩1 x⇩1 = 0"
    by (metis degree_pCons_eq_if pCons_cases)
  have "⋀x⇩1. degree x⇩1 = 0 ∨ esk6⇩1 x⇩1 ≠ 0"
    by (metis degree_pCons_eq_if pCons_cases)
  hence "esk6⇩1 p ≠ 0"
    using f1 f2 f3 by (metis coeff_monom leading_coeff_0_iff zero_neq_one)
  thus "degree p = 1"
    using f2 f3 f4 by (metis coeff_monom leading_coeff_0_iff)
qed
*) 

{
 fix a :: "('a:: comm_ring_1) poly"
 have "p ≡ (monom 1 1) - CONST pCons a 0 ⟹ degree p = 1" 
(*
Sledgehammering... 
"spass": Try this: by (metis One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (210 ms). 
"e": Try this: by (metis One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 monom_Suc one_poly_def zero_neq_one) (214 ms). 
"z3": Timed out. 
"remote_e_sine": Try this: by (smt Nat.add_0_right Nat.add_diff_inverse Nat.diff_cancel One_nat_def Suc_diff_1 Suc_diff_Suc Suc_diff_diff Suc_diff_eq_diff_pred Suc_eq_plus1 Suc_eq_plus1_left Suc_inject Suc_le_mono Suc_lessD Suc_lessE Suc_lessI Suc_less_SucD Suc_less_eq Suc_mono Suc_neq_Zero Suc_pred Suc_pred' Zero_neq_Suc Zero_not_Suc add_Suc add_Suc_right add_Suc_shift add_diff_cancel_left' add_diff_cancel_right add_diff_cancel_right' add_eq_if add_eq_self_zero add_gr_0 add_is_0 add_is_1 add_left_cancel add_lessD1 add_less_cancel_left add_less_cancel_right add_less_mono add_less_mono1 add_pCons add_pos_pos add_right_cancel bool.size(1) bool.size(2) bool.size(3) bool.size(4) coeff_0 coeff_1 coeff_add coeff_diff coeff_linear_power coeff_monom coeff_pCons_0 coeff_pCons_Suc comm_monoid_add_class.add.left_neutral comm_monoid_add_class.add.right_neutral comm_monoid_diff_class.add_diff_cancel_left comm_monoid_diff_class.diff_cancel comm_semiring_1_class.normalizing_semiring_rules(32) comm_semiring_1_class.normalizing_semiring_rules(33) degree_0 degree_1 degree_linear_power degree_monom_eq degree_pCons_0 degree_pCons_eq degree_pCons_eq_if degree_synthetic_div diff_0_eq_0 diff_0_right diff_Suc_1 diff_Suc_Suc diff_Suc_eq_diff_pred diff_Suc_less diff_add_0 diff_add_inverse diff_add_inverse2 diff_add_zero diff_cancel2 diff_commute diff_diff_cancel diff_diff_left diff_eq_diff_eq diff_induct diff_is_0_eq diff_is_0_eq' diff_less diff_less_Suc diff_less_iff(1) diff_less_iff(2) diff_less_iff(3) diff_less_iff(4) diff_less_mono2 diff_monom diff_pCons diff_right_commute diff_self diff_self_eq_0 diff_zero diffs0_imp_equal double_add_less_zero_iff_single_add_less_zero double_zero double_zero_sym dvd.dual_order.refl dvd.lift_Suc_mono_less dvd.lift_Suc_mono_less_iff dvd_1_iff_1 dvd_1_left dvd_diff_nat dvd_minus_self dvd_plusE dvd_plus_eq_left dvd_plus_eq_right dvd_pos_nat dvd_reduce eq_diff_eq' eq_iff_diff_eq_0 exists_least_lemma field_power_not_zero gcd_lcm_complete_lattice_nat.bot.extremum gcd_lcm_complete_lattice_nat.bot.extremum_strict gcd_lcm_complete_lattice_nat.bot.extremum_unique gcd_lcm_complete_lattice_nat.bot.extremum_uniqueI gcd_lcm_complete_lattice_nat.bot.not_eq_extremum gcd_lcm_complete_lattice_nat.top.extremum_strict gcd_lcm_complete_lattice_nat.top.extremum_unique gcd_lcm_complete_lattice_nat.top.extremum_uniqueI gcd_lcm_complete_lattice_nat.top.not_eq_extremum gcd_lcm_complete_lattice_nat.top_greatest gr0I gr0_conv_Suc gr0_implies_Suc gr_implies_not0 ind_euclid infinite_descent infinite_descent0 is_zero_null le0 le_0_eq le_add_diff_inverse le_add_diff_inverse2 leading_coeff_0_iff leading_coeff_neq_0 lessE lessI less_Suc0 less_SucE less_SucI less_Suc_eq less_Suc_eq_0_disj less_Suc_induct less_add_Suc1 less_add_Suc2 less_add_eq_less less_antisym less_diff_conv less_dvd_minus less_iff_Suc_add less_imp_Suc_add less_imp_add_positive less_imp_diff_less less_irrefl_nat less_nat_zero_code less_not_refl less_not_refl2 less_not_refl3 less_trans_Suc less_zeroE lift_Suc_mono_less lift_Suc_mono_less_iff linorder_neqE_nat list_decode.cases minus_nat.diff_0 minus_poly.rep_eq minus_real_def monoid_add_class.add.left_neutral monoid_add_class.add.right_neutral monom.rep_eq monom_0 monom_Suc monom_eq_0 monom_eq_0_iff monom_eq_iff n_not_Suc_n nat.inject nat.size(1) nat.size(2) nat.size(3) nat.size(4) nat_add_assoc nat_add_commute nat_add_left_cancel nat_add_left_cancel_le nat_add_left_cancel_less nat_add_left_commute nat_add_right_cancel nat_diff_split nat_diff_split_asm nat_dvd_not_less nat_induct nat_less_cases nat_less_induct nat_lt_two_imp_zero_or_one nat_neq_iff nat_power_eq_Suc_0_iff nat_power_less_imp_less nat_zero_less_power_iff neq0_conv not0_implies_Suc not_add_less1 not_add_less2 not_less0 not_less_eq not_less_less_Suc_eq one_is_add one_poly_def one_reorient order_root pCons_0_0 pCons_cases pCons_eq_0_iff pCons_eq_iff pCons_induct pcompose_0 plus_nat.add_0 poly_0 poly_1 poly_add poly_all_0_iff_0 poly_diff poly_gcd_0_0 poly_gcd_monic poly_gcd_zero_iff poly_minus_degree_zero_const poly_power pow_divides_eq_int pow_divides_eq_nat pow_divides_pow_int pow_divides_pow_nat power_0 power_0_Suc power_0_left power_Suc_0 power_eq_0_iff power_inject_exp power_one power_one_right power_strict_increasing_iff real_0_less_add_iff real_add_eq_0_iff real_add_less_0_iff real_add_minus_iff real_arch_pow real_arch_pow_inv real_lbound_gt_zero realpow_pos_nth realpow_pos_nth2 realpow_pos_nth_unique semiring_numeral_div_class.diff_zero size_bool strict_inc_induct synthetic_div_0 synthetic_div_eq_0_iff synthetic_div_pCons trans_less_add1 trans_less_add2 transitive_stepwise_gt transitive_stepwise_lt_eq triangle_0 triangle_Suc zero_diff zero_induct zero_induct_lemma zero_less_Suc zero_less_diff zero_less_double_add_iff_zero_less_single_add zero_less_power_nat_eq zero_neq_one zero_poly.rep_eq zero_reorient) (> 3 s).
To minimize: sledgehammer min [remote_e_sine] (Nat.add_0_right Nat.add_diff_inverse Nat.diff_cancel One_nat_def Suc_diff_1 Suc_diff_Suc Suc_diff_diff Suc_diff_eq_diff_pred Suc_eq_plus1 Suc_eq_plus1_left Suc_inject Suc_le_mono Suc_lessD Suc_lessE Suc_lessI Suc_less_SucD Suc_less_eq Suc_mono Suc_neq_Zero Suc_pred Suc_pred' Zero_neq_Suc Zero_not_Suc add_Suc add_Suc_right add_Suc_shift add_diff_cancel_left' add_diff_cancel_right add_diff_cancel_right' add_eq_if add_eq_self_zero add_gr_0 add_is_0 add_is_1 add_left_cancel add_lessD1 add_less_cancel_left add_less_cancel_right add_less_mono add_less_mono1 add_pCons add_pos_pos add_right_cancel bool.size(1) bool.size(2) bool.size(3) bool.size(4) coeff_0 coeff_1 coeff_add coeff_diff coeff_linear_power coeff_monom coeff_pCons_0 coeff_pCons_Suc comm_monoid_add_class.add.left_neutral comm_monoid_add_class.add.right_neutral comm_monoid_diff_class.add_diff_cancel_left comm_monoid_diff_class.diff_cancel comm_semiring_1_class.normalizing_semiring_rules(32) comm_semiring_1_class.normalizing_semiring_rules(33) degree_0 degree_1 degree_linear_power degree_monom_eq degree_pCons_0 degree_pCons_eq degree_pCons_eq_if degree_synthetic_div diff_0_eq_0 diff_0_right diff_Suc_1 diff_Suc_Suc diff_Suc_eq_diff_pred diff_Suc_less diff_add_0 diff_add_inverse diff_add_inverse2 diff_add_zero diff_cancel2 diff_commute diff_diff_cancel diff_diff_left diff_eq_diff_eq diff_induct diff_is_0_eq diff_is_0_eq' diff_less diff_less_Suc diff_less_iff(1) diff_less_iff(2) diff_less_iff(3) diff_less_iff(4) diff_less_mono2 diff_monom diff_pCons diff_right_commute diff_self diff_self_eq_0 diff_zero diffs0_imp_equal double_add_less_zero_iff_single_add_less_zero double_zero double_zero_sym dvd.dual_order.refl dvd.lift_Suc_mono_less dvd.lift_Suc_mono_less_iff dvd_1_iff_1 dvd_1_left dvd_diff_nat dvd_minus_self dvd_plusE dvd_plus_eq_left dvd_plus_eq_right dvd_pos_nat dvd_reduce eq_diff_eq' eq_iff_diff_eq_0 exists_least_lemma field_power_not_zero gcd_lcm_complete_lattice_nat.bot.extremum gcd_lcm_complete_lattice_nat.bot.extremum_strict gcd_lcm_complete_lattice_nat.bot.extremum_unique gcd_lcm_complete_lattice_nat.bot.extremum_uniqueI gcd_lcm_complete_lattice_nat.bot.not_eq_extremum gcd_lcm_complete_lattice_nat.top.extremum_strict gcd_lcm_complete_lattice_nat.top.extremum_unique gcd_lcm_complete_lattice_nat.top.extremum_uniqueI gcd_lcm_complete_lattice_nat.top.not_eq_extremum gcd_lcm_complete_lattice_nat.top_greatest gr0I gr0_conv_Suc gr0_implies_Suc gr_implies_not0 ind_euclid infinite_descent infinite_descent0 is_zero_null le0 le_0_eq le_add_diff_inverse le_add_diff_inverse2 leading_coeff_0_iff leading_coeff_neq_0 lessE lessI less_Suc0 less_SucE less_SucI less_Suc_eq less_Suc_eq_0_disj less_Suc_induct less_add_Suc1 less_add_Suc2 less_add_eq_less less_antisym less_diff_conv less_dvd_minus less_iff_Suc_add less_imp_Suc_add less_imp_add_positive less_imp_diff_less less_irrefl_nat less_nat_zero_code less_not_refl less_not_refl2 less_not_refl3 less_trans_Suc less_zeroE lift_Suc_mono_less lift_Suc_mono_less_iff linorder_neqE_nat list_decode.cases minus_nat.diff_0 minus_poly.rep_eq minus_real_def monoid_add_class.add.left_neutral monoid_add_class.add.right_neutral monom.rep_eq monom_0 monom_Suc monom_eq_0 monom_eq_0_iff monom_eq_iff n_not_Suc_n nat.inject nat.size(1) nat.size(2) nat.size(3) nat.size(4) nat_add_assoc nat_add_commute nat_add_left_cancel nat_add_left_cancel_le nat_add_left_cancel_less nat_add_left_commute nat_add_right_cancel nat_diff_split nat_diff_split_asm nat_dvd_not_less nat_induct nat_less_cases nat_less_induct nat_lt_two_imp_zero_or_one nat_neq_iff nat_power_eq_Suc_0_iff nat_power_less_imp_less nat_zero_less_power_iff neq0_conv not0_implies_Suc not_add_less1 not_add_less2 not_less0 not_less_eq not_less_less_Suc_eq one_is_add one_poly_def one_reorient order_root pCons_0_0 pCons_cases pCons_eq_0_iff pCons_eq_iff pCons_induct pcompose_0 plus_nat.add_0 poly_0 poly_1 poly_add poly_all_0_iff_0 poly_diff poly_gcd_0_0 poly_gcd_monic poly_gcd_zero_iff poly_minus_degree_zero_const poly_power pow_divides_eq_int pow_divides_eq_nat pow_divides_pow_int pow_divides_pow_nat power_0 power_0_Suc power_0_left power_Suc_0 power_eq_0_iff power_inject_exp power_one power_one_right power_strict_increasing_iff real_0_less_add_iff real_add_eq_0_iff real_add_less_0_iff real_add_minus_iff real_arch_pow real_arch_pow_inv real_lbound_gt_zero realpow_pos_nth realpow_pos_nth2 realpow_pos_nth_unique semiring_numeral_div_class.diff_zero size_bool strict_inc_induct synthetic_div_0 synthetic_div_eq_0_iff synthetic_div_pCons trans_less_add1 trans_less_add2 transitive_stepwise_gt transitive_stepwise_lt_eq triangle_0 triangle_Suc zero_diff zero_induct zero_induct_lemma zero_less_Suc zero_less_diff zero_less_double_add_iff_zero_less_single_add zero_less_power_nat_eq zero_neq_one zero_poly.rep_eq zero_reorient). 
Sledgehammer ran out of time.
*)

回答1:

The non-expert, short answer to your question is that your different versions generate different problems, which can be seen simply by doing a diff on the problem files that are generated with the overlord option, as I explain below.

Jasmin Blanchette is the main developer for Sledgehammer, but I haven't seen him on SO. He responds on the Isabelle user mailing list. Larry Paulson also has some doings with Sledgehammer.

I answer this question to provide an example of how to use the overlord option with sledgehammer. Vampire ATP on Windows doesn't work, and so in my reporting that to Jasmin a long time ago, he told me how to use overlord to generate a problem file to send to him, as I explain below.

I generate the problem files for your first two versions for the e prover.

I've pursued trying to understand Sledgehammer enough that I'm satisfied with a short answer like, "There's a language for e, Sledgehammer converts the lemma into a bunch of facts that are in the language of e, e does some magic and reports that back to Sledgehammer. A lemma stated slightly differently will result in Sledgehammer's algorithm generating a different set of facts. After all, we can many times prove a theorem more than one way also."

An ambitious person could match up what's in a problem file generated by Sledgehammer, with the language that e uses. That might make it obvious why a particular lemma, stated two different ways, results in two different problems for e.

There is a TUM link for e: The E Theorem Prover, and a Wiki link.

Now I give the details. The overlord option generates a problem file in my home folder named prob_e_1. It will report back with a file named prob_e_1_proof, but I don't concern myself with that.

(* Version 1 *)
notepad 
begin
  fix a :: "('a:: comm_ring_1) poly" 
  have "degree((monom 1 1) -CONST pCons a 0) =1"
sledgehammer[overlord=true, provers="e"]
oops

I do it again with your second version, and it overwrites the previous file:

(* Version 2 *)
notepad begin
 fix a :: "('a:: comm_ring_1) poly"
 def p ≡ "(monom 1 1) - CONST pCons a 0"
 from p_def have "degree p = 1"
sledgehammer[overlord=true, provers="e"]
oops

I had renamed the first file. The file sizes are different. I use the jEdit plugin JDiff to easily do a diff. They are different starting with the very first line, which is the command that e is given.

To show you what you would see, here are the first few lines of the first version problem:

% TIMEFORMAT='%3R'; { time (exec 2>&1; '/cygdrive/e/E_2/dev/Isabelle2013-2/contrib/e-1.8/x86-cygwin/eprover' --tstp-in --tstp-out --silent --split-clauses=4 --split-reuse-defs --simul-paramod --forward-context-sr --destructive-er-aggressive --destructive-er --presat-simplify --prefer-initial-clauses -winvfreqrank -c1 -Ginvfreqconjmax -F1 --delete-bad-limit=150000000 -WSelectMaxLComplexAvoidPosPred -H'(4*FunWeight(SimulateSOS,20,20,1.5,1.5,1,a2:0,degree_poly_a:0,minus_154650241poly_a:0,monom_poly_a:0,one_one_nat:0,one_one_poly_a:0,pCons_poly_a:0,zero_z2096148049poly_a:0,minus_1267152911poly_a:8,minus_1927295133poly_a:8,pCons_1263018438poly_a:8,minus_minus_a:9,minus_minus_poly_a:9,pCons_a:9,pCons_poly_poly_a:9,monom_1144868891atural:9,tt_poly_Code_natural:9,zero_z1864290105atural:9,zero_z2076475201atural:9,monom_poly_nat:9,tt_poly_nat:9,zero_z1059985641ly_nat:9,zero_zero_poly_nat:9,monom_67158909poly_a:9,tt_poly_poly_poly_a:9,zero_z1199790189poly_a:9,zero_z2064990175poly_a:9,monom_a:9,tt_a:9,zero_zero_a:9,zero_zero_poly_a:9,monom_poly_poly_a:10,tt_poly_poly_a:10,monom_nat:10,zero_zero_nat:10,monom_Code_natural:10,zero_z353611057atural:10,tt_poly_a:10,pCons_1690884498atural:11,tt_pol1445527701atural:11,pCons_poly_nat:11,tt_poly_poly_nat:11,tt_pol1146364953poly_a:12,pCons_nat:12,pCons_Code_natural:12,minus_1521903873atural:14,minus_minus_nat:14,one_one_poly_poly_a:15,one_one_a:16,one_on446885109atural:16,one_one_Code_natural:16,one_one_poly_nat:16,one_on392296739poly_a:16,degree_a:17,degree_Code_natural:18,degree_nat:18,degree_poly_poly_a:18,aa_poly_poly_a_bool:23,pp:23,aa_poly_a_bool:23,aa_pol1791115049l_bool:23,aa_poly_nat_bool:23,aa_pol1868968491a_bool:23,degree1502533245atural:24,degree_poly_nat:24,degree368812443poly_a:24,is_zero_poly_a:26,is_zero_a:26,is_zero_Code_natural:26,is_zero_nat:26,is_zero_poly_poly_a:26,pcompose_poly_a:27,pcompose_a:27,pcompo775675211atural:27,pcompose_nat:27,pcompose_poly_poly_a:27,power_276493840poly_a:27,power_power_poly_a:27,power_1061922746atural:28,power_power_poly_nat:28,power_1749536158poly_a:28,synthetic_div_poly_a:28,synthetic_div_a:28,synthe389744629atural:28,synthetic_div_nat:28,synthe1271504013poly_a:29,suc:29,coeff_a:30,coeff_poly_poly_a:30,coeff_nat:30,coeff_Code_natural:30,coeff_poly_a:30,coeff_2076392010atural:30,one_on1013003517atural:30,coeff_poly_nat:30,one_on1411366565ly_nat:30,coeff_36192014poly_a:30,one_on1584232881poly_a:30,poly_poly_a:31,poly_nat:31,poly_a:31,a:32,code_natural:32,fFalse:32,fTrue:32,nat:32,poly:32,tt_bool:32,undefi1030841758poly_a:32,undefi122925008atural:32,undefi1684997496ly_nat:32,undefi2131925448atural:32,undefi65090320poly_a:32,undefi880707458poly_a:32,undefined_a:32,undefined_poly_a:32,undefined_poly_nat:32,power_power_nat:36,aa_nat_bool:37,aa_nat_fun_nat_bool:37,code_natural_size:38,size_s686587580atural:38,ord_less_nat:39),3*ConjectureGeneralSymbolWeight(PreferNonGoals,200,100,200,50,50,1,100,1.5,1.5,1),1*Clauseweight(PreferProcessed,1,1,1),1*FIFOWeight(PreferProcessed))'  --term-ordering=KBO6 --cpu-limit=9 --proof-object=1 '/cygdrive/e/E_1/02-p/pi/home/.isabelle/Isabelle2013-2/prob_e_1' ) ; }
% This file was generated by Isabelle (most likely Sledgehammer)
% 2014-01-12 10:46:41.037

% Explicit typings (80)
fof(tsy_c_Groups_Ominus__class_Ominus_001t__Polynomial__Opoly_It__Polynomial__Opoly_It__Polynomial__Opoly_It__Polynomial__Opoly_Itf__a_J_J_J_J, axiom,
    ((![B_1, B_2]: tt_pol1146364953poly_a(minus_1927295133poly_a(B_1, B_2)) = minus_1927295133poly_a(B_1, B_2)))).
fof(tsy_c_Groups_Ominus__class_Ominus_001t__Polynomial__Opoly_It__Polynomial__Opoly_It__Polynomial__Opoly_Itf__a_J_J_J, axiom,
    ((![B_1, B_2]: tt_poly_poly_poly_a(minus_1267152911poly_a(B_1, B_2)) = minus_1267152911poly_a(B_1, B_2)))).
fof(tsy_c_Groups_Ominus__class_Ominus_001t__Polynomial__Opoly_It__Polynomial__Opoly_Itf__a_J_J, hypothesis,
    ((![B_1, B_2]: tt_poly_poly_a(minus_154650241poly_a(B_1, B_2)) = minus_154650241poly_a(B_1, B_2)))).

I converted your first version to use theorem instead of notepad, and the problem file is the same except for the timestamp, and about one other line.

(* Version 1 as theorem *)
theorem
  fixes a :: "('a:: comm_ring_1) poly"
  shows "degree((monom 1 1) -CONST pCons a 0) =1"
(*sledgehammer[overlord=true, provers="e"]*)
by(metis 
  One_nat_def degree_pCons_eq_if diff_0_right diff_pCons monom_0 
  monom_Suc one_poly_def zero_neq_one)

On Whether Sledgehammer Is Deterministic

Update 140112_1651

No experts have shown up on a Sunday evening, so I act as the document gofer by providing some Sledgehammer document links below.

First, though, I make 3 easy observations about whether Sledgehammer can be deterministic as defined by the OP, "in the sense that if you run sledgehammer consecutively several times (i.e., on the exactly same theorem definition) you get the same results". Other than his definition, I know nothing about determinism.

For Sledgehammer to be deterministic, I see that three things have to be deterministic, where I try to be general enough to be correct.

Also, there can be lots of multi-threading by the PIDE, and that can greatly affect how processing is allocated to anything and everything, which can affect whether a particular ATP will find a particular proof.

  1. Given a theorem statement and a fixed set of available facts (the prior theorems), for a particular automatic theorem prover (ATP), the Sledgehammer algorithm must always generate the same problem for the ATP. There is the initial problem, but also Sledgehammer may generate a series of problems for an ATP to try and eliminate unneeded facts. (It's probably more complicated, that some set of fixed problems will all produce the same result.)
  2. For a fixed problem that an ATP receives from Sledgehammer, the ATP proof attempt must always succeed or fail in the same way.
  3. The smt or metis methods, given the same facts, must always succeed or fail in the same way. This might be a no-brainer, but I list it.

Essentially, the value in my list above is that it might take away some of the mystery of what Sledgehammer is. What Sledgehammer is is a team effort between the Sledgehammer algorithm that generates a problem, 3rd-party ATPs, and the smt or metis proof methods.

Sledgehammer, out of thousands of theorems available, finds applicable theorems and uses them to give an ATP a problem in the language of the ATP. The ATP performs its logical magic with the theorems, and it reports back to Sledgehammer. Sledgehammer reads the return message from the ATP, and if a proof was found, it suggests in the output panel a smt or metis proof that can be used. In the process, Sledgehammer may play back proofs on its own to eliminate unneeded facts, or whatever various things it can do based on the Sledgehammer options chosen.

Here is Larry Paulson's web page with documents specific to Isabelle and ATPs:

Linking Isabelle with Automated Theorem Provers

At the top are various links to different pages of publications, some of which drop down.

Here is Jasmin Blanchette's publication page, which happens to contain all his stuff, so I link to his workshop papers section, and the one paper by Paulson and Blanchette which I found to be an easy overview to read.

  • Workshop Papers
  • Three years of experience with Sledgehammer, a practical link between automated and interactive theorem provers

We can assume that the Sledgehammer experts could answer 1 and 3 in my list. They're also going to be familiar with much of what the ATPs do, but they probably don't concern themselves with knowing everything about the ATPs, other than what's necessary to prevent things like inconsistencies.



回答2:

Many statements have multiple proofs. When you transform a statement as you've done here, you affect the direction of the search, and in particular, the relevance filter. Determinism was never a design goal for sledgehammer.