I want to solve a problem that is I have a Prolog list of elements. If the any of the element frequency is greater than N
then false is return. My expectation like below.
?- frequency([1,2,2,2,5],3).
true.
?- frequency([1,2,2,2,2,5],3).
false.
I have a code for get particular element frequency. Any idea for the problem.
count(_, [], 0) :-
!.
count(X, [X|T], N) :-
count(X, T, N2),
N is N2 + 1.
count(X, [Y|T], N) :-
X \= Y,
count(X, T, N).
Use clpfd!
:- use_module(library(clpfd)).
If we build on auxiliary predicate list_counts/2
, we can define frequency/2
like this:
frequency(Es, M) :-
list_counts(Es, Xss),
maplist(arg(2), Xss, Zs),
maplist(#>=(M), Zs).
Sample queries:
?- frequency([1,2,2,2,5], 3).
true.
?- frequency([1,2,2,2,2,5], 3).
false.
Thanks to clpfd we can ask quite general queries—and get logically sound answers, too!
?- frequency([A,B,C], 2).
A=B , dif(B,C)
; A=C , dif(B,C)
; dif(A,C), B=C
; dif(A,B), dif(A,C), dif(B,C).
You can first try to think about the most "cheap" (in terms of writing) way you could attack such a problem. I usually try to figure out how I would do it with standard command line tools. For example, to find the occurrences of all lines in a text file named foo
, one would write (in Bash):
sort foo | uniq --count
You can read the manuals, but here is one example run:
$ cat foo
a
b
b
b
c
d
d
a
b
d
c
$ sort foo | uniq --count
2 a
4 b
2 c
3 d
Now, one way to ask "is there any line that has a count above 3?" is to use awk
like this:
sort foo | uniq --count | awk '{ if ($1 > 3) exit(1) }'
(I am sure there are cleverer ways, too.)
With the above file, you get:
$ sort foo | uniq --count | awk '{ if ($1 > 3) exit(1) }'
$ echo $?
1
$ sort foo | uniq --count | awk '{ if ($1 > 4) exit(1) }'
$ echo $?
0
Alright, so how does that help you with Prolog? Well, one easy way to emulate a pipeline like this:
foo | bar | baz # etc
is to write in Prolog a conjunction like this:
foo(In, X0), bar(X0, X1), baz(X1, X2) % etc
Back to your problem: you can use msort/2
(or however a stable sort predicate is called in the implementation you are using). Then, you need to count runs of the same element. In SWI-Prolog at least you have group_pairs_by_key/2
. You could use it for example as follows (together with the other predicates in the same library, you can see the code at the same link):
pairs_keys_values(Pairs, Sorted, Sorted),
group_pairs_by_key(Pairs, Grouped),
pairs_values(Grouped, Runs),
maplist(length, Runs, Counts)
At that point, you have the number of occurrences of each element in Sorted
in Counts
(a bit more verbose than uniq --count
, admittedly), and you just need to check if any of these is above your limit. To do something very similar to the awk
invocation above in Prolog:
maplist(=<(3), Counts)
Disclaimer: this is just one way to approach the problem. I decided to type it out because it shows that you rarely need to write much code yourself if you know what tools are already available to you.
Edit
It is of course unnecessary to use group_pairs_by_key/2
; however, it is very useful to know about it, which is why I linked the implementation. For this problem, it would be enough to do a stable sort, followed by a predicate that simply counts the number of consecutive occurrences of the same element, and only succeed if all such runs are not longer than the limit. The basic structure of a predicate that does that is the same as group_pairs_by_key/2
.
What is about this code...
frequency(L,N):-getall(L,L1), max_member(A,L1),A=<N.
getall([],[]).
getall(L,N):-append([],[X1|T],L),count(X1,L,N1),getall(T,N2),append([N1],N2,N).