Counter-intuitive behavior of min_member/2

2019-02-12 15:40发布

问题:

min_member(-Min, +List)

True when Min is the smallest member in the standard order of terms. Fails if List is empty.

?- min_member(3, [1,2,X]).
X = 3.

The explanation is of course that variables come before all other terms in the standard order of terms, and unification is used. However, the reported solution feels somehow wrong.

How can it be justified? How should I interpret this solution?

EDIT:

One way to prevent min_member/2 from succeeding with this solution is to change the standard library (SWI-Prolog) implementation as follows:

xmin_member(Min, [H|T]) :-
    xmin_member_(T, H, Min).

xmin_member_([], Min0, Min) :-
    (   var(Min0), nonvar(Min)
    ->  fail
    ;   Min = Min0
    ).
xmin_member_([H|T], Min0, Min) :-
    (   H @>= Min0 
    ->  xmin_member_(T, Min0, Min)
    ;   xmin_member_(T, H, Min)
    ).

The rationale behind failing instead of throwing an instantiation error (what @mat suggests in his answer, if I understood correctly) is that this is a clear question:

"Is 3 the minimum member of [1,2,X], when X is a free variable?"

and the answer to this is (to me at least) a clear "No", rather than "I can't really tell."

This is the same class of behavior as sort/2:

?- sort([A,B,C], [3,1,2]).
A = 3,
B = 1,
C = 2.

And the same tricks apply:

?- min_member(3, [1,2,A,B]).
A = 3.

?- var(B), min_member(3, [1,2,A,B]).
B = 3.

回答1:

This is a common property of many (all?) predicates that depend on the standard order of terms, while the order between two terms can change after unification. Baseline is the conjunction below, which cannot be reverted either:

?- X @< 2, X = 3.
X = 3.

Most predicates using a -Value annotation for an argument say that pred(Value) is the same as pred(Var), Value = Var. Here is another example:

?- sort([2,X], [3,2]).
X = 3.

These predicates only represent clean relations if the input is ground. It is too much to demand the input to be ground though because they can be meaningfully used with variables, as long as the user is aware that s/he should not further instantiate any of the ordered terms. In that sense, I disagree with @mat. I do agree that constraints can surely make some of these relations sound.



回答2:

The actual source of confusion is a common problem with general Prolog code. There is no clean, generally accepted classification of the kind of purity or impurity of a Prolog predicate. In a manual, and similarly in the standard, pure and impure built-ins are happily mixed together. For this reason, things are often confused, and talking about what should be the case and what not, often leads to unfruitful discussions.

How can it be justified? How should I interpret this solution?

First, look at the "mode declaration" or "mode indicator":

min_member(-Min, +List)

In the SWI documentation, this describes the way how a programmer shall use this predicate. Thus, the first argument should be uninstantiated (and probably also unaliased within the goal), the second argument should be instantiated to a list of some sort. For all other uses you are on your own. The system assumes that you are able to check that for yourself. Are you really able to do so? I, for my part, have quite some difficulties with this. ISO has a different system which also originates in DEC10.

Further, the implementation tries to be "reasonable" for unspecified cases. In particular, it tries to be steadfast in the first argument. So the minimum is first computed independently of the value of Min. Then, the resulting value is unified with Min. This robustness against misuses comes often at a price. In this case, min_member/2 always has to visit the entire list. No matter if this is useful or not. Consider

?- length(L, 1000000), maplist(=(1),L), min_member(2, L).

Clearly, 2 is not the minimum of L. This could be detected by considering the first element of the list only. Due to the generality of the definition, the entire list has to be visited.

This way of handling output unification is similarly handled in the standard. You can spot those cases when the (otherwise) declarative description (which is the first of a built-in) explicitly refers to unification, like

8.5.4 copy_term/2

8.5.4.1 Description

copy_term(Term_1, Term_2) is true iff Term_2 unifies
with a term T which is a renamed copy (7.1.6.2) of
Term_1.

or

8.4.3 sort/2

8.4.3.1 Description

sort(List, Sorted) is true iff Sorted unifies with
the sorted list of List (7.1.6.5).

Here are those arguments (in brackets) of built-ins that can only be understood as being output arguments. Note that there are many more which effectively are output arguments, but that do not need the process of unification after some operation. Think of 8.5.2 arg/3 (3) or 8.2.1 (=)/2 (2) or (1).

8.5.4 1 copy_term/2 (2), 8.4.2 compare/3 (1), 8.4.3 sort/2 (2), 8.4.4 keysort/2 (2), 8.10.1 findall/3 (3), 8.10.2 bagof/3 (3), 8.10.3 setof/3 (3).

So much for your direct questions, there are some more fundamental problems behind:

Term order

Historically, "standard" term order1 has been defined to permit the definition of setof/3 and sort/2 about 1982. (Prior to it, as in 1978, it was not mentioned in the DEC10 manual user's guide.)

From 1982 on, term order was frequently (erm, ab-) used to implement other orders, particularly, because DEC10 did not offer higher-order predicates directly. call/N was to be invented two years later (1984) ; but needed some more decades to be generally accepted. It is for this reason that Prolog programmers have a somewhat nonchalant attitude towards sorting. Often they intend to sort terms of a certain kind, but prefer to use sort/2 for this purpose — without any additional error checking. A further reason for this was the excellent performance of sort/2 beating various "efficient" libraries in other programming languages decades later (I believe STL had a bug to this end, too). Also the complete magic in the code - I remember one variable was named Omniumgatherum - did not invite copying and modifying the code.

Term order has two problems: variables (which can be further instantiated to invalidate the current ordering) and infinite terms. Both are handled in current implementations without producing an error, but with still undefined results. Yet, programmers assume that everything will work out. Ideally, there would be comparison predicates that produce instantiation errors for unclear cases like this suggestion. And another error for incomparable infinite terms.

Both SICStus and SWI have min_member/2, but only SICStus has min_member/3 with an additional argument to specify the order employed. So the goal

?- min_member(=<, M, Ms).

behaves more to your expectations, but only for numbers (plus arithmetic expressions).


Footnotes:

1 I quote standard, in standard term order, for this notion existed since about 1982 whereas the standard was published 1995.



回答3:

Clearly min_member/2 is not a true relation:

?- min_member(X, [X,0]), X = 1.
X = 1.

yet, after simply exchanging the two goals by (highly desirable) commutativity of conjunction, we get:

?- X = 1, min_member(X, [X,0]).
false.

This is clearly quite bad, as you correctly observe.

Constraints are a declarative solution for such problems. In the case of integers, finite domain constraints are a completely declarative solution for such problems.

Without constraints, it is best to throw an instantiation error when we know too little to give a sound answer.



回答4:

I hope I am not off-topic with this third answer. I did not edit one of the previous two as I think it's a totally different idea. I was wondering if this undesired behaviour:

?- min_member(X, [A, B]), A = 3, B = 2.
X = A, A = 3,
B = 2.

can be avoided if some conditions can be postponed for the moment when A and B get instantiated.

promise_relation(Rel_2, X, Y):-
    call(Rel_2, X, Y),
    when(ground(X), call(Rel_2, X, Y)),
    when(ground(Y), call(Rel_2, X, Y)).

min_member_1(Min, Lst):-
    member(Min, Lst),
    maplist(promise_relation(@=<, Min), Lst).

What I want from min_member_1(?Min, ?Lst) is to expresses a relation that says Min will always be lower (in the standard order of terms) than any of the elements in Lst.

?- min_member_1(X, L), L = [_,2,3,4], X = 1.
X = 1,
L = [1, 2, 3, 4] .

If variables get instantiated at a later time, the order in which they get bound becomes important as a comparison between a free variable and an instantiated one might be made.

?- min_member_1(X, [A,B,C]), B is 3, C is 4, A is 1.
X = A, A = 1,
B = 3,
C = 4 ;
false.
?- min_member_1(X, [A,B,C]), A is 1, B is 3, C is 4.
false.

But this can be avoided by unifying all of them in the same goal:

?- min_member_1(X, [A,B,C]), [A, B, C] = [1, 3, 4].
X = A, A = 1,
B = 3,
C = 4 ;
false.

Versions

If the comparisons are intended only for instantiated variables, promise_relation/3 can be changed to check the relation only when both variables get instantiated:

promise_relation(Rel_2, X, Y):-
    when((ground(X), ground(Y)), call(Rel_2, X, Y)).

A simple test:

?- L = [_, _, _, _], min_member_1(X, L), L = [3,4,1,2].
L = [3, 4, 1, 2],
X = 1 ;
false.

! Edits were made to improve the initial post thanks to false's comments and suggestions.



回答5:

This is how min_member/2 is implemented:

min_member(Min, [H|T]) :-
    min_member_(T, H, Min).

min_member_([], Min, Min).
min_member_([H|T], Min0, Min) :-
    (   H @>= Min0
    ->  min_member_(T, Min0, Min)
    ;   min_member_(T, H, Min)
    ).

So it seems that min_member/2 actually tries to unify Min (the first argument) with the smallest element in List in the standard order of terms.



回答6:

I have an observation regarding your xmin_member implementation. It fails on this query:

?- xmin_member(1, [X, 2, 3]).
false.

I tried to include the case when the list might include free variables. So, I came up with this:

ymin_member(Min, Lst):-
    member(Min, Lst),
    maplist(@=<(Min), Lst).

Of course it's worse in terms of efficiency, but it works on that case:

?- ymin_member(1, [X, 2, 3]).
X = 1 ;
false.

?- ymin_member(X, [X, 2, 3]).
true ;
X = 2 ;
false.