A positive_integer/1 predicate that works for big

2019-06-26 16:03发布

In my Prolog-inspired language Brachylog, there is the possibility to label CLP(FD)-equivalent variables that have potentially infinite domains. The code that does this labelization can be found here (thanks to Markus Triska @mat).

This predicate requires the existence of a predicate positive_integer/1, which must have the following behavior:

?- positive_integer(X).
X = 1 ;
X = 2 ;
X = 3 ;
X = 4 ;
…

This is implemented as such in our current solution:

positive_integer(N) :- length([_|_], N).

This has two problems that I can see:

  • This becomes slow fairly quickly:

    ?- time(positive_integer(100000)).
    % 5 inferences, 0.000 CPU in 0.001 seconds (0% CPU, Infinite Lips)
    
    ?- time(positive_integer(1000000)).
    % 5 inferences, 0.000 CPU in 0.008 seconds (0% CPU, Infinite Lips)
    
    ?- time(positive_integer(10000000)).
    % 5 inferences, 0.062 CPU in 0.075 seconds (83% CPU, 80 Lips)
    
  • This ultimately returns an Out of global stack error for numbers that are a bit too big:

    ?- positive_integer(100000000).
    ERROR: Out of global stack
    

This is obviously due to the fact that Prolog needs to instantiate the list, which is bad if its length is big.

How can we improve this predicate such that this works even for very big numbers, with the same behavior?

4条回答
爷、活的狠高调
2楼-- · 2019-06-26 16:20

In case you want your code to be runnable also on other systems, consider:

positive_integer(N) :-
   (  nonvar(N), % strictly not needed, but clearer
      integer(N),
      N > 0
   -> true
   ;  length([_|_], N)
   ).

This version produces exactly the same errors as your first try.

You have indeed spotted a bit towards a weakness in current length/2 implementations. Ideally, a goal like length([_|_], 1000000000000000) would take some time, but at least does not consume more than constant memory. On the other hand, I am not too sure if this is worth optimizing. After all, I do not see an easy way to solve the runtime problem for such cases.

Note that the version of between/3 in SWI-Prolog is highly specific to SWI. It makes termination arguments much more complex. In other systems like SICStus, you know for sure that between/3 is terminating, regardless of the arguments. In SWI you would have to prove, that the atom inf will not be encountered which raises the burden of proof obligation.

查看更多
We Are One
3楼-- · 2019-06-26 16:32

Since "Brachylog's interpreter is entirely written in Prolog" meaning SWI-Prolog, you can use between/3 with the second argument bound to inf.

Comparing your positive_integer with

positive_integer_b(X):- between(1,inf,X).

Tests on my machine:

?- time(positive_integer(10000000)).
% 5 inferences, 0.062 CPU in 0.072 seconds (87% CPU, 80 Lips)
true.
9 ?- time(positive_integer_b(10000000)).
% 2 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
true.

And showing "Out of global stack":

13 ?- time(positive_integer(100000000)).
% 5 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
ERROR: Out of global stack

14 ?- time(positive_integer_b(100000000)).
% 2 inferences, 0.000 CPU in 0.000 seconds (?% CPU, Infinite Lips)
true.

I don't think between is pure prolog though.

查看更多
Anthone
4楼-- · 2019-06-26 16:38

without between/3, and ISO compliant (I think)

positive_integer(1).
positive_integer(X) :-
 var(X),
 positive_integer(Y),
 X is Y + 1.
positive_integer(X) :-
 integer(X),
 X > 0.
查看更多
▲ chillily
5楼-- · 2019-06-26 16:43

There are already many good ideas posted, and they work to various degrees.

Additional test case

@vmg has the right intuition: between/3 does not mix well with constraints. To see this, I would like to use the following query as an additional benchmark:

?- X #> 10^30, positive_integer(X).

Solution

With the test case in mind, I suggest the following solution:

positive_integer(I) :-
        I #> 0,
        (   var(I) ->
            fd_inf(I, Inf),
            (   I #= Inf
            ;   I #\= Inf,
                positive_integer(I)
            )
        ;   true
        ).

The key idea is to use the CLP(FD) reflection predicate fd_inf/2 to reason about the smallest element in the domain of a variable. This is the only predicate you will need to change when you port the solution to further Prolog systems. For example, in SICStus Prolog, the predicate is called fd_min/2.

Main features

  • portable to several Prolog systems with minimal changes
  • fast in the shown cases
  • works also in the test case above
  • advertises and uses the full power of CLP(FD) constraints.

It is of course very clear which of these points is most important.

Sample queries

creatio ex nihilo:

?- positive_integer(X).
X = 1 ;
X = 2 ;
X = 3 .

fixed integer:

?- X #= 12^42, time(positive_integer(X)).
% 4 inferences, 0.000 CPU in 0.000 seconds (68% CPU, 363636 Lips)
X = 2116471057875484488839167999221661362284396544.

constrained integer:

?- X #> 10^30, time(positive_integer(X)).
% 124 inferences, 0.000 CPU in 0.000 seconds (83% CPU, 3647059 Lips)
X = 1000000000000000000000000000001 ;
% 206 inferences, 0.000 CPU in 0.000 seconds (93% CPU, 2367816 Lips)
X = 1000000000000000000000000000002 ;
% 204 inferences, 0.000 CPU in 0.000 seconds (92% CPU, 2428571 Lips)
X = 1000000000000000000000000000003 .

Other comments

  1. First, make sure to check out Brachylog and the latest Brachylog solutions on Code Golf. Thanks to Julien's efforts, a language inspired by Prolog is now increasingly often hosting some of the most concise and elegant programs that are posted there. Awesome work Julien!

  2. Please refrain from using implementation-specific anomalies of between/3: These destroy important semantic properties of the predicate and are not portable to other systems.

  3. If you ignore (2), please use infinite instead of inf. In the context of CLP(FD), inf denotes the infimum of the set of integers, which is the exact opposite of positive infinity.

  4. In the context of CLP(FD), I recommend to use CLP(FD) constraints instead of between/3 and other predicates that don't take constraints into account.

  5. In fact, I recommend to use CLP(FD) constraints instead of all low-level predicates that reason over integers. This can at most make your programs more general, never more specific.

Many thanks for your interest in this question and the posted solutions! I hope you find the test case above useful for your variants, and find ways to take CLP(FD) constraints into account in your versions so that they run faster and we can all upvote them!

查看更多
登录 后发表回答