What is the best algorithm for checking if a numbe

2018-12-31 08:03发布

Just an example of what I am looking for: I could represent every odd number with a bit e.g. for the given range of numbers (1, 10], starts at 3:

1110

The following dictionary can be squeezed more right? I could eleminate multiples of five with some work, but numbers that end with 1, 3, 7 or 9 must be there in the array of bits. Hope this would clarify what I want.

I am looking for the best algorithm, to check if a number is prime i.e. a boolean function:

bool isprime(number);

I would like to know the best algorithm to implement this functionality. Naturally, there would be a data structure I could query. I define the best algorithm, to be the algorithm that produces a data structure with lowest memory consumption for the range (1, N], where N is a constant.

23条回答
深知你不懂我心
2楼-- · 2018-12-31 08:29

To find if the number or numbers in a range is/are prime.

#!usr/bin/python3

def prime_check(*args):
    for arg in args:
        if arg > 1:     # prime numbers are greater than 1
            for i in range(2,arg):   # check for factors
                if(arg % i) == 0:
                    print(arg,"is not Prime")
                    print(i,"times",arg//i,"is",arg)
                    break
            else:
                print(arg,"is Prime")

            # if input number is less than
            # or equal to 1, it is not prime
        else:
            print(arg,"is not Prime")
    return

# Calling Now
prime_check(*list(range(101)))  # This will check all the numbers in range 0 to 100 
prime_check(#anynumber)         # Put any number while calling it will check.
查看更多
路过你的时光
3楼-- · 2018-12-31 08:30

The fastest algorithm for general prime testing is AKS. The Wikipedia article describes it at lengths and links to the original paper.

If you want to find big numbers, look into primes that have special forms like Mersenne primes.

The algorithm I usually implement (easy to understand and code) is as follows (in Python):

def isprime(n):
    """Returns True if n is prime."""
    if n == 2:
        return True
    if n == 3:
        return True
    if n % 2 == 0:
        return False
    if n % 3 == 0:
        return False

    i = 5
    w = 2

    while i * i <= n:
        if n % i == 0:
            return False

        i += w
        w = 6 - w

    return True

It's a variant of the classic O(sqrt(N)) algorithm. It uses the fact that a prime (except 2 and 3) is of form 6k - 1 or 6k + 1 and looks only at divisors of this form.

Sometimes, If I really want speed and the range is limited, I implement a pseudo-prime test based on Fermat's little theorem. If I really want more speed (i.e. avoid O(sqrt(N)) algorithm altogether), I precompute the false positives (see Carmichael numbers) and do a binary search. This is by far the fastest test I've ever implemented, the only drawback is that the range is limited.

查看更多
谁念西风独自凉
4楼-- · 2018-12-31 08:31
bool isPrime(int n)
{
    // Corner cases
    if (n <= 1)  return false;
    if (n <= 3)  return true;

    // This is checked so that we can skip 
    // middle five numbers in below loop
    if (n%2 == 0 || n%3 == 0) return false;

    for (int i=5; i*i<=n; i=i+6)
        if (n%i == 0 || n%(i+2) == 0)
           return false;

    return true;
}
this is just c++ implementation of above  AKS algorithm

https://en.wikipedia.org/wiki/AKS_primality_test

查看更多
零度萤火
5楼-- · 2018-12-31 08:31

Python 3:

def is_prime(a):
    return a > 1 and all(a % i for i in range(2, int(a**0.5) + 1))
查看更多
呛了眼睛熬了心
6楼-- · 2018-12-31 08:31

I compared the efficiency of the most popular suggestions to determine if a number is prime. I used python 3.6 on ubuntu 17.10; I tested with numbers up to 100.000 (you can test with bigger numbers using my code below).

This first plot compares the functions (which are explained further down in my answer), showing that the last functions do not grow as fast as the first one when increasing the numbers.

plot1

And in the second plot we can see that in case of prime numbers the time grows steadily, but non-prime numbers do not grow so fast in time (because most of them can be eliminated early on).

plot2

Here are the functions I used:

  1. this answer and this answer suggested a construct using all():

    def is_prime_1(n):
        return n > 1 and all(n % i for i in range(2, int(math.sqrt(n)) + 1))
    
  2. This answer used some kind of while loop:

    def is_prime_2(n):
        if n <= 1:
            return False
        if n == 2:
            return True
        if n == 3:
            return True
        if n % 2 == 0:
            return False
        if n % 3 == 0:
            return False
    
        i = 5
        w = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += w
            w = 6 - w
    
        return True
    
  3. This answer included a version with a for loop:

    def is_prime_3(n):
        if n <= 1:
            return False
    
        if n % 2 == 0 and n > 2:
            return False
    
        for i in range(3, int(math.sqrt(n)) + 1, 2):
            if n % i == 0:
                return False
    
        return True
    
  4. And I mixed a few ideas from the other answers into a new one:

    def is_prime_4(n):
        if n <= 1:          # negative numbers, 0 or 1
            return False
        if n <= 3:          # 2 and 3
            return True
        if n % 2 == 0 or n % 3 == 0:
            return False
    
        for i in range(5, int(math.sqrt(n)) + 1, 2):
            if n % i == 0:
                return False
    
        return True
    

Here is my script to compare the variants:

import math
import pandas as pd
import seaborn as sns
import time
from matplotlib import pyplot as plt


def is_prime_1(n):
    ...
def is_prime_2(n):
    ...
def is_prime_3(n):
    ...
def is_prime_4(n):
    ...

default_func_list = (is_prime_1, is_prime_2, is_prime_3, is_prime_4)

def assert_equal_results(func_list=default_func_list, n):
    for i in range(-2, n):
        r_list = [f(i) for f in func_list]
        if not all(r == r_list[0] for r in r_list):
            print(i, r_list)
            raise ValueError
    print('all functions return the same results for integers up to {}'.format(n))

def compare_functions(func_list=default_func_list, n):
    result_list = []
    n_measurements = 3

    for f in func_list:
        for i in range(1, n + 1):
            ret_list = []
            t_sum = 0
            for _ in range(n_measurements):
                t_start = time.perf_counter()
                is_prime = f(i)
                t_end = time.perf_counter()

                ret_list.append(is_prime)
                t_sum += (t_end - t_start)

            is_prime = ret_list[0]
            assert all(ret == is_prime for ret in ret_list)
            result_list.append((f.__name__, i, is_prime, t_sum / n_measurements))

    df = pd.DataFrame(
        data=result_list,
        columns=['f', 'number', 'is_prime', 't_seconds'])
    df['t_micro_seconds'] = df['t_seconds'].map(lambda x: round(x * 10**6, 2))
    print('df.shape:', df.shape)

    print()
    print('', '-' * 41)
    print('| {:11s} | {:11s} | {:11s} |'.format(
        'is_prime', 'count', 'percent'))
    df_sub1 = df[df['f'] == 'is_prime_1']
    print('| {:11s} | {:11,d} | {:9.1f} % |'.format(
        'all', df_sub1.shape[0], 100))
    for (is_prime, count) in df_sub1['is_prime'].value_counts().iteritems():
        print('| {:11s} | {:11,d} | {:9.1f} % |'.format(
            str(is_prime), count, count * 100 / df_sub1.shape[0]))
    print('', '-' * 41)

    print()
    print('', '-' * 69)
    print('| {:11s} | {:11s} | {:11s} | {:11s} | {:11s} |'.format(
        'f', 'is_prime', 't min (us)', 't mean (us)', 't max (us)'))
    for f, df_sub1 in df.groupby(['f', ]):
        col = df_sub1['t_micro_seconds']
        print('|{0}|{0}|{0}|{0}|{0}|'.format('-' * 13))
        print('| {:11s} | {:11s} | {:11.2f} | {:11.2f} | {:11.2f} |'.format(
            f, 'all', col.min(), col.mean(), col.max()))
        for is_prime, df_sub2 in df_sub1.groupby(['is_prime', ]):
            col = df_sub2['t_micro_seconds']
            print('| {:11s} | {:11s} | {:11.2f} | {:11.2f} | {:11.2f} |'.format(
                f, str(is_prime), col.min(), col.mean(), col.max()))
    print('', '-' * 69)

    return df

Running the function compare_functions(n=10**5) (numbers up to 100.000) I get this output:

df.shape: (400000, 5)

 -----------------------------------------
| is_prime    | count       | percent     |
| all         |     100,000 |     100.0 % |
| False       |      90,408 |      90.4 % |
| True        |       9,592 |       9.6 % |
 -----------------------------------------

 ---------------------------------------------------------------------
| f           | is_prime    | t min (us)  | t mean (us) | t max (us)  |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_1  | all         |        0.57 |        2.50 |      154.35 |
| is_prime_1  | False       |        0.57 |        1.52 |      154.35 |
| is_prime_1  | True        |        0.89 |       11.66 |       55.54 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_2  | all         |        0.24 |        1.14 |      304.82 |
| is_prime_2  | False       |        0.24 |        0.56 |      304.82 |
| is_prime_2  | True        |        0.25 |        6.67 |       48.49 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_3  | all         |        0.20 |        0.95 |       50.99 |
| is_prime_3  | False       |        0.20 |        0.60 |       40.62 |
| is_prime_3  | True        |        0.58 |        4.22 |       50.99 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_4  | all         |        0.20 |        0.89 |       20.09 |
| is_prime_4  | False       |        0.21 |        0.53 |       14.63 |
| is_prime_4  | True        |        0.20 |        4.27 |       20.09 |
 ---------------------------------------------------------------------

Then, running the function compare_functions(n=10**6) (numbers up to 1.000.000) I get this output:

df.shape: (4000000, 5)

 -----------------------------------------
| is_prime    | count       | percent     |
| all         |   1,000,000 |     100.0 % |
| False       |     921,502 |      92.2 % |
| True        |      78,498 |       7.8 % |
 -----------------------------------------

 ---------------------------------------------------------------------
| f           | is_prime    | t min (us)  | t mean (us) | t max (us)  |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_1  | all         |        0.51 |        5.39 |     1414.87 |
| is_prime_1  | False       |        0.51 |        2.19 |      413.42 |
| is_prime_1  | True        |        0.87 |       42.98 |     1414.87 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_2  | all         |        0.24 |        2.65 |      612.69 |
| is_prime_2  | False       |        0.24 |        0.89 |      322.81 |
| is_prime_2  | True        |        0.24 |       23.27 |      612.69 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_3  | all         |        0.20 |        1.93 |       67.40 |
| is_prime_3  | False       |        0.20 |        0.82 |       61.39 |
| is_prime_3  | True        |        0.59 |       14.97 |       67.40 |
|-------------|-------------|-------------|-------------|-------------|
| is_prime_4  | all         |        0.18 |        1.88 |      332.13 |
| is_prime_4  | False       |        0.20 |        0.74 |      311.94 |
| is_prime_4  | True        |        0.18 |       15.23 |      332.13 |
 ---------------------------------------------------------------------

I used the following script to plot the results:

def plot_1(func_list=default_func_list, n):
    df_orig = compare_functions(func_list=func_list, n=n)
    df_filtered = df_orig[df_orig['t_micro_seconds'] <= 20]
    sns.lmplot(
        data=df_filtered, x='number', y='t_micro_seconds',
        col='f',
        # row='is_prime',
        markers='.',
        ci=None)

    plt.ticklabel_format(style='sci', axis='x', scilimits=(3, 3))
    plt.show()
查看更多
还给你的自由
7楼-- · 2018-12-31 08:31

Smallest memory? This isn't smallest, but is a step in the right direction.

class PrimeDictionary {
    BitArray bits;

    public PrimeDictionary(int n) {
        bits = new BitArray(n + 1);
        for (int i = 0; 2 * i + 3 <= n; i++) {
            bits.Set(i, CheckPrimality(2 * i + 3));
        }
    }

    public PrimeDictionary(IEnumerable<int> primes) {
        bits = new BitArray(primes.Max());
        foreach(var prime in primes.Where(p => p != 2)) {
            bits.Set((prime - 3) / 2, true);
        }
    }

    public bool IsPrime(int k) {
        if (k == 2) {
            return true;
        }
        if (k % 2 == 0) {
            return false;
        }
        return bits[(k - 3) / 2];
    }
}

Of course, you have to specify the definition of CheckPrimality.

查看更多
登录 后发表回答