Given the following facts in a database:
foo(a, 3).
foo(b, 2).
foo(c, 4).
foo(d, 3).
foo(e, 2).
foo(f, 6).
foo(g, 3).
foo(h, 2).
I want to collect all first arguments that have the smallest second argument, plus the value of the second argument. First try:
find_min_1(Min, As) :-
setof(B-A, foo(A, B), [Min-_|_]),
findall(A, foo(A, Min), As).
?- find_min_1(Min, As).
Min = 2,
As = [b, e, h].
Instead of setof/3
, I could use aggregate/3
:
find_min_2(Min, As) :-
aggregate(min(B), A^foo(A, B), Min),
findall(A, foo(A, Min), As).
?- find_min_2(Min, As).
Min = 2,
As = [b, e, h].
NB
This only gives the same results if I am looking for the minimum of a number. If an arithmetic expression in involved, the results might be different. If a non-number is involved, aggregate(min(...), ...)
will throw an error!
Or, instead, I can use the full key-sorted list:
find_min_3(Min, As) :-
setof(B-A, foo(A, B), [Min-First|Rest]),
min_prefix([Min-First|Rest], Min, As).
min_prefix([Min-First|Rest], Min, [First|As]) :-
!,
min_prefix(Rest, Min, As).
min_prefix(_, _, []).
?- find_min_3(Min, As).
Min = 2,
As = [b, e, h].
Finally, to the question(s):
Can I do this directly with library(aggregate)? It feels like it should be possible....
Or is there a predicate like
std::partition_point
from the C++ standard library?Or is there some easier way to do this?
EDIT:
To be more descriptive. Say there was a (library) predicate partition_point/4
:
partition_point(Pred_1, List, Before, After) :-
partition_point_1(List, Pred_1, Before, After).
partition_point_1([], _, [], []).
partition_point_1([H|T], Pred_1, Before, After) :-
( call(Pred_1, H)
-> Before = [H|B],
partition_point_1(T, Pred_1, B, After)
; Before = [],
After = [H|T]
).
(I don't like the name but we can live with it for now)
Then:
find_min_4(Min, As) :-
setof(B-A, foo(A, B), [Min-X|Rest]),
partition_point(is_min(Min), [Min-X|Rest], Min_pairs, _),
pairs_values(Min_pairs, As).
is_min(Min, Min-_).
?- find_min_4(Min, As).
Min = 2,
As = [b, e, h].
Many of the following remarks could be added to many programs here on SO.
Imperative names
Every time, you write an imperative name for something that is a relation you will reduce your understanding of relations. Not much, just a little bit. Many common Prolog idioms like
append/3
do not set a good example. Think ofappend(As,As,AsAs)
. The first argument offind_min(Min, As)
is the minimum. Sominimum_with_nodes/2
might be a better name.findall/3
Do not use
findall/3
unless the uses are rigorously checked, essentially everything must be ground. In your case it happens to work. But once you generalizefoo/2
a bit, you will lose. And that is frequently a problem: You write a tiny program ; and it seems to work. Once you move to bigger ones, the same approach no longer works.findall/3
is (compared tosetof/3
) like a bull in a china shop smashing the fine fabric of shared variables and quantification. Another problem is that accidental failure does not lead to failure offindall/3
which often leads to bizarre, hard to imagine corner cases.Untestable, too specific program
Another problem is somewhat related to
findall/3
, too. Your program is so specific, that it is quite improbable that you will ever test it. And marginal changes will invalidate your tests. So you will soon give up to perform testing. Let's see what is specific: Primarily thefoo/2
relation. Yes, only an example. Think of how to set up a test configuration wherefoo/2
may change. After each change (writing a new file) you will have to reload the program. This is so complex, chances are you will never do it. I presume you do not have a test harness for that. Plunit for one, does not cover such testing. As a rule of thumb: If you cannot test a predicate on the top level you never will. Consider insteadWith such a relation, you can now have a generalized
xfoo/3
with an additional parameter, say:and you most naturally get two answers for
minimum_with(xfoo(X), Min, Els)
. Would you have usedfindall/3
instead ofsetof/3
you already would have serious problems. Or just in general:minmum_with(\A^B^member(A-B, [x-10,y-20]), Min, Els)
. So you can play around on the top level and produce lots of interesting test cases.Unchecked border cases
Your version 3 is clearly my preferred approach, however there are still some parts that can be improved. In particular, if there are answers that contain variables as a minimum. These should be checked.
And certainly, also
setof/3
has its limits. And ideally you would test them. Answers should not contain constraints, in particular not in the relevant variables. This shows howsetof/3
itself has certain limits. After the pioneering phase, SICStus produced many errors for constraints in such cases (mid 1990s), later changed to consequently ignoring constraints in built-ins that cannot handle them. SWI on the other hand does entirely undefined things here. Sometimes things are copied, sometimes not. As an example take:setof(A, ( A in 1..3 ; A in 3..5 ), _)
andsetof(t, ( A in 1..3 ; A in 3.. 5 ), _)
.By wrapping the goal this can be avoided.
Beware, however, that SWI has spurious constraints:
Not clear if this is a feature in the meantime. It has been there since the introduction of
call_residue_vars/2
about 5 years ago.I don't think that library(aggregate) covers your use case. aggregate(min) allows for one witness:
Some time ago, I wrote a small 'library', lag.pl, with predicates to aggregate with low overhead - hence the name (LAG = Linear AGgregate). I've added a snippet, that handles your use case:
It's a simple minded extension of integrate(min)... The comparison method it's surely questionable (it uses less general operator for equality), could be worth to adopt instead a conventional call like that adopted for predsort/3. Efficiency wise, still better would be to encode the comparison method as option in the 'function selector' (min_list_associated in this case)
edit thanks @false and @Boris for correcting the bug relative to the state representation. Calling
nb_setarg(2, State, Ws)
actually changes the term' shape, whenState = (_,[],_)
was used. Will update the github repo accordingly...Using
library(pairs)
and [sort/4
], this can be simply written as:This call to
sort/4
can be replaced withkeysort/2
, but withsort/4
one can also find for example the first arguments associated with the largest second argument: just use@>=
as the second argument.This solution is probably not as time and space efficient as the other ones, but may be easier to grok.
But there is another way to do it altogether: