Count occurrences Prolog

Posted 2019-07-20 22:37

Question:

I'm new to Prolog and trying to do some programming with lists.
I want to do this:

?- count_occurrences([a,b,c,a,b,c,d], X).
X = [[d, 1], [c, 2], [b, 2], [a, 2]].

And this is my code. I know it's not complete, but I'm trying:

count_occurrences([],[]).
count_occurrences([X|Y],A):-
   occurrences([X|Y],X,N).

occurrences([],_,0).    
occurrences([X|Y],X,N):- occurrences(Y,X,W), N is W + 1.
occurrences([X|Y],Z,N):- occurrences(Y,Z,N), X\=Z.

My code is wrong, so I need some hints or help, please.

Answer 1:

Here's my solution using bagof/3 and findall/3:

count_occurrences(List, Occ):-
    findall([X,L], (bagof(true,member(X,List),Xs), length(Xs,L)), Occ).

An example

?- count_occurrences([a,b,c,b,e,d,a,b,a], Occ).
Occ = [[a, 3], [b, 3], [c, 1], [d, 1], [e, 1]].

How it works

bagof(true,member(X,List),Xs) succeeds once for each distinct element X of List, with Xs being a list whose length equals the number of occurrences of X in List:

?- bagof(true,member(X,[a,b,c,b,e,d,a,b,a]),Xs).
X = a,
Xs = [true, true, true] ;
X = b,
Xs = [true, true, true] ;
X = c,
Xs = [true] ;
X = d,
Xs = [true] ;
X = e,
Xs = [true].

The outer findall/3 collects element X and the length of the associated list Xs in a list that represents the solution.

Edit I: the original answer was improved thanks to suggestions from CapelliC and Boris.

Edit II: setof/3 can be used instead of findall/3 if there are free variables in the given list. The problem with setof/3 is that for an empty list it will fail, hence a special clause must be introduced.

count_occurrences([],[]).
count_occurrences(List, Occ):-
    setof([X,L], Xs^(bagof(a,member(X,List),Xs), length(Xs,L)), Occ).
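The difference on the empty list can be sketched as follows. The predicate names occ_findall/2 and occ_setof/2 are hypothetical, chosen here only so that both variants can be loaded side by side, and the setof/3 variant is deliberately written without the special clause:

```prolog
% Variant 1: findall/3 turns "no solutions" into the empty list.
occ_findall(List, Occ) :-
    findall([X,L], (bagof(true, member(X,List), Xs), length(Xs,L)), Occ).

% Variant 2: setof/3 fails when its goal has no solutions, so without
% an extra clause for [] this variant fails on the empty list.
occ_setof(List, Occ) :-
    setof([X,L], Xs^(bagof(true, member(X,List), Xs), length(Xs,L)), Occ).

% ?- occ_findall([], Occ).
% Occ = [].
%
% ?- occ_setof([], Occ).
% false.
```

This is why the setof/3 version above needs count_occurrences([],[]) as its first clause.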


Answer 2:

Note that so far all proposals have difficulties with lists that also contain variables. Think of the case:

?- count_occurrences([a,X], D).

There should be two different answers.

   X = a, D = [a-2] ;
   dif(X, a), D = [a-1,X-1].

The first answer means: the list [a,a] contains a twice, and thus D = [a-2]. The second answer covers all terms X that are different from a; for those, we have one occurrence of a and one occurrence of that other term. Note that this second answer includes an infinity of possible solutions, including X = b or X = c or whatever else you wish.

And if an implementation is unable to produce these answers, an instantiation error should protect the programmer from further damage. Something along these lines:

count_occurrences(Xs, D) :-
   ( ground(Xs) -> true ; throw(error(instantiation_error,_)) ),
   ... .

Ideally, a Prolog predicate is defined as a pure relation, like this one. But often, pure definitions are quite inefficient.

Here is a version that is pure and efficient. Efficient in the sense that it does not leave open any unnecessary choice points. I took @dasblinkenlight's definition as a source of inspiration.

Ideally, such definitions use some form of if-then-else. However, the traditional (;)/2 written

   ( If_0 -> Then_0 ; Else_0 )

is an inherently non-monotonic construct. I will use a monotonic counterpart

   if_( If_1, Then_0, Else_0)

instead. The major difference is the condition. The traditional control construct relies upon the success or failure of If_0, which destroys all purity. If you write ( X = Y -> Then_0 ; Else_0 ), the variables X and Y are unified, and at that very point in time the final decision is made whether to go for Then_0 or Else_0. What if the variables are not sufficiently instantiated? Well, then we have bad luck and get some random result by insisting on Then_0 only.

Contrast this to if_( If_1, Then_0, Else_0). Here, the first argument must be some goal that will describe in its last argument whether Then_0 or Else_0 is the case. And should the goal be undecided, it can opt for both.

count_occurrences(Xs, D) :-
   foldl(el_dict, Xs, [], D).

el_dict(K, [], [K-1]).
el_dict(K, [KV0|KVs0], [KV|KVs]) :-
    KV0 = K0-V0,
    if_( K = K0,
         ( KV = K-V1, V1 is V0+1, KVs0 = KVs ),
         ( KV = KV0, el_dict(K, KVs0, KVs ) ) ).

=(X, Y, R) :-
   equal_truth(X, Y, R).

This definition requires the following auxiliary definitions: if_/3, equal_truth/3, foldl/4.
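Since the auxiliary definitions are not reproduced here, the following is a minimal sketch of how they are commonly written, not necessarily the exact versions this answer refers to. It assumes a system that provides dif/2, call/N and foldl/4 (for example SWI-Prolog); the bodies of if_/3 and equal_truth/3 below are assumptions filled in for illustration:

```prolog
% if_/3: If_1 is called with one extra argument that it binds to
% true or false; the matching branch is then executed.
if_(If_1, Then_0, Else_0) :-
    call(If_1, T),
    (   T == true  -> call(Then_0)
    ;   T == false -> call(Else_0)
    ;   nonvar(T)  -> throw(error(type_error(boolean, T), _))
    ;   throw(error(instantiation_error, _))
    ).

% equal_truth/3: reified term equality. When X and Y are neither
% identical nor known to differ, both outcomes are offered.
equal_truth(X, Y, T) :-
    (   X == Y -> T = true
    ;   X \= Y -> T = false
    ;   T = true, X = Y
    ;   T = false, dif(X, Y)
    ).

=(X, Y, R) :-
    equal_truth(X, Y, R).

count_occurrences(Xs, D) :-
    foldl(el_dict, Xs, [], D).

el_dict(K, [], [K-1]).
el_dict(K, [KV0|KVs0], [KV|KVs]) :-
    KV0 = K0-V0,
    if_( K = K0,
         ( KV = K-V1, V1 is V0+1, KVs0 = KVs ),
         ( KV = KV0, el_dict(K, KVs0, KVs) ) ).

% ?- count_occurrences([a,b,c,a,b,c,d], D).
% D = [a-2, b-2, c-2, d-1].
```

With these definitions, the query count_occurrences([a,X], D) first binds X = a with D = [a-2] and, on backtracking, yields D = [a-1,X-1] under the constraint dif(X, a), which are the two answers described above.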



Answer 3:

If you use SWI-Prolog, you can do:

:- use_module(library(lambda)).

count_occurrences(L, R) :-
    foldl(\X^Y^Z^(member([X,N], Y)
             ->  N1 is N+1,
             select([X,N], Y, [X,N1], Z)
             ;   Z = [[X,1] | Y]),
          L, [], R).


Answer 4:

One thing that should make solving the problem easier would be to design a helper predicate to increment the count.

Imagine a predicate that takes a list of pairs [SomeAtom,Count] and an atom whose count needs to be incremented, and produces a list that has the incremented count, or [SomeAtom,1] for the first occurrence of the atom. This predicate is easy to design:

increment([], E, [[E,1]]).
increment([[H,C]|T], H, [[H,CplusOne]|T]) :-
    CplusOne is C + 1.
increment([[H,C]|T], E, [[H,C]|R]) :-
    H \= E,
    increment(T, E, R).

The first clause serves as the base case, when we add the first occurrence. The second clause serves as another base case when the head element matches the desired element. The last case is the recursive call for the situation when the head element does not match the desired element.

With this predicate in hand, writing count_occ becomes really easy:

count_occ([], []).
count_occ([H|T], R) :-
    count_occ(T, Temp),
    increment(Temp, H, R).

This is Prolog's run-of-the-mill recursive predicate, with a trivial base clause and a recursive call that processes the tail, and then uses increment to account for the head element of the list.
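To see the two predicates working together, here is the same code collected into one loadable block with a sample query. Because the tail is processed before the head, elements are counted from the end of the list, which happens to give exactly the order requested in the question:

```prolog
% Definitions repeated from above so that the block loads on its own.
increment([], E, [[E,1]]).
increment([[H,C]|T], H, [[H,CplusOne]|T]) :-
    CplusOne is C + 1.
increment([[H,C]|T], E, [[H,C]|R]) :-
    H \= E,
    increment(T, E, R).

count_occ([], []).
count_occ([H|T], R) :-
    count_occ(T, Temp),
    increment(Temp, H, R).

% ?- count_occ([a,b,c,a,b,c,d], R).
% R = [[d, 1], [c, 2], [b, 2], [a, 2]]
% (the only answer; the toplevel may still offer to backtrack,
%  because increment/3 can leave a choice point behind)
```

Note that H \= E only behaves as a reliable "different" test when both arguments are ground, so this sketch is meant for lists without variables.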




Answer 5:

You have gotten answers. Prolog is a language that often offers multiple "correct" ways to approach a problem. It is not clear from your question whether you insist on any particular order in the result. So, ignoring order, one way to do it would be:

  1. Sort the list using a sort that keeps duplicates (unlike sort/2, which removes them)
  2. Apply a run-length encoding on the sorted list

The main virtue of this approach is that it deconstructs your problem to two well-defined (and solved) sub-problems.

The first is easy: msort(List, Sorted)

The second one is a bit more involved, but still straightforward if you want the predicate to work only one way, that is, List --> Encoding. One possibility (quite explicit):

list_to_rle([], []).
list_to_rle([X|Xs], RLE) :-
    list_to_rle_1(Xs, [[X, 1]], RLE).

list_to_rle_1([], RLE, RLE).
list_to_rle_1([X|Xs], [[Y, N]|Rest], RLE) :-
    (    dif(X, Y)
    ->   list_to_rle_1(Xs, [[X, 1],[Y, N]|Rest], RLE)
    ;    succ(N, N1),
         list_to_rle_1(Xs, [[X, N1]|Rest], RLE)
    ).

So now, from the top level:

?- msort([a,b,c,a,b,c,d], Sorted), list_to_rle(Sorted, RLE).
Sorted = [a, a, b, b, c, c, d],
RLE = [[d, 1], [c, 2], [b, 2], [a, 2]].

On a side note, it is almost always better to prefer "pairs", as in X-N, instead of lists with two elements exactly, as in [X, N]. Furthermore, you should keep the original order of the elements in the list, if you want to be correct. From this answer:

rle([], []).
rle([First|Rest],Encoded):- 
    rle_1(Rest, First, 1, Encoded).               

rle_1([], Last, N, [Last-N]).
rle_1([H|T], Prev, N, Encoded) :-
    (   dif(H, Prev) 
    ->  Encoded = [Prev-N|Rest],
        rle_1(T, H, 1, Rest)
    ;   succ(N, N1),
        rle_1(T, H, N1, Encoded)
    ).

Why is it better?

  • we got rid of 4 pairs of unnecessary brackets in the code

  • we got rid of clutter in the reported solution

  • we got rid of a whole lot of unnecessary nested terms: compare .(a, .(1, [])) to -(a, 1)

  • we made the intention of the program clearer to the reader (this is the conventional way to represent pairs in Prolog)

From the top level:

?- msort([a,b,c,a,b,c,d], Sorted), rle(Sorted, RLE).
Sorted = [a, a, b, b, c, c, d],
RLE = [a-2, b-2, c-2, d-1].

The presented run-length encoder is very explicit in its definition, which of course has its pros and cons. See this answer for a much more succinct way of doing it.
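For comparison, one shorter way to write the same sort-then-encode idea is sketched below. It assumes SWI-Prolog, where library(lists) provides clumped/2 for run-length encoding a list into Item-Count pairs; it is not necessarily the approach the linked answer takes:

```prolog
:- use_module(library(lists)).   % clumped/2 (assumed: SWI-Prolog)

% msort/2 sorts without removing duplicates; clumped/2 then
% collapses each run of identical elements into Element-Count.
count_occurrences(List, Pairs) :-
    msort(List, Sorted),
    clumped(Sorted, Pairs).

% ?- count_occurrences([a,b,c,a,b,c,d], Pairs).
% Pairs = [a-2, b-2, c-2, d-1].
```

The trade-off is the usual one: less code to read and maintain, at the cost of depending on a library predicate that other Prolog systems may not offer.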



Answer 6:

Refining joel76's answer:

count_occurrences(L, R) :-
    foldl(\X^Y^Z^(select([X,N], Y, [X,N1], Z)
             ->  N1 is N+1
             ;   Z = [[X,1] | Y]),
          L, [], R).
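If library(lambda) is not available, the same fold can be sketched with a named helper instead of a lambda expression. The helper name count_step/3 is hypothetical, and foldl/4 and select/4 are assumed to be available as in SWI-Prolog:

```prolog
count_occurrences(L, R) :-
    foldl(count_step, L, [], R).

% count_step(+X, +Acc0, -Acc): if X already has an entry [X,N] in
% Acc0, replace it in place by [X,N1] with N1 = N+1; otherwise push
% a new entry [X,1] onto the front of the accumulator.
count_step(X, Acc0, Acc) :-
    (   select([X,N], Acc0, [X,N1], Acc)
    ->  N1 is N + 1
    ;   Acc = [[X,1]|Acc0]
    ).

% ?- count_occurrences([a,b,c,a,b,c,d], R).
% R = [[d, 1], [c, 2], [b, 2], [a, 2]].
```

Like the lambda versions, this commits to the first matching entry through the if-then-else, so it is intended for ground lists only.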


Tags: prolog