Saturday, August 29, 2009

isPrime algorithm

This is something of a continuation of my previous post. I stalled out trying to post the code snippets, posted them as cropped screenshots, and finally discovered a mildly convenient way to export from netbeans to html and paste into blogger with the help of a few lines of css in my blog layout.

I find the Project Euler problems helpful because they challenge you to tidy up the algorithms. A question may be of a very approachable nature for small numbers, but the solution for large numbers relies on a solid understanding of what is necessary and sufficient to find the solution.

For example, if you want to create an algorithm that returns "true" if a number is prime and "false" if it is composite, then you might start with this:

public static boolean isPrime(int args){
int n = args;
for(int i=2; i<n; i++){
if(n%i==0){
return false;
}
}
return true;
}

...which works by checking every integer in the closed interval [2, n-1] as potential factors of n. The condition n%i==0 uses the modulus "%" operator and it is true when the remainder when n is divided by i is equal to zero. If there are any factors of n on that interval, then the code will return a value of "true".

With small numbers and simple tasks, the computer calculates quickly enough to make this seem like a good algorithm. For example, I can use this algorithm to learn in the blink of an eye that 370009919 (for example) is not a prime. However, it wastes a lot of energy and in some circumstances, a lot of time. It is wasteful because it checks 370009917 potential factors. If 370009919 is a composite number, then once we have learned that 2 is not a factor, we do not need to look at other even numbers. We could use:

public static boolean isPrime(int args){
int n = args;
if(n==2){
return true;
}
else if(n%2==0){
return false;
}
for(int i=3; i<n; i=i+2){
if(n%i==0){
return false;
}
}
return true;
}

...which still covers all of the possibilities, but after determining that 2 is not a factor, it will not look at any more even numbers because i goes up by 2's from 3. It is still quite wasteful, however.

If 370009919 is a composite number, then it must have a factor smaller than or equal to its square root, which is about 19236. If we pass the square root without finding a factor, we know the number is prime and we can stop. This effectively allows us to stop after 1/19236 of the work that the previous code ordered. We can take a square root of an int using Java, but the value will be a double. We can still compare our int i to the double sqrtN, but this time we want less-than-or-equal-to "<=" instead of less-than, since the maximum value of the minimum factor may be equal to the suare root, as would be the case for the square of any prime number.



public static boolean isPrime(int args){
int n = args;
double sqrtN = Math.sqrt(n);
if(n==2){
return true;
}
else if(n%2==0){
return false;
}
for(int i=3; i<=sqrtN; i=i+2){
if(n%i==0){
return false;
}
}
return true;
}

The code above is correct as long as Math.sqrt(n) is never less than the actual square root of n. If rounding errors and the hidden Java algorithm that computes "sqrt" might permit such a thing, then the code would possibly be incorrect. I could (A) look at the documentation to reassure myself, (B) conservatively make the second condition of the for-loop read: i<=(sqrtN+1), or (C), test the code for all squares of known primes in the range of int values to see if it correctly detects the factor every time. Also, this code will only return a true or a false. If I want to know the factor that was found, I can add a line of code to print out the factor as soon as it is found.

public static boolean isPrime(int args){
int n = args;
double sqrtN = Math.sqrt(n);
if(n==2){
return true;
}
else if(n%2==0){
return false;
}
for(int i = 3; i<=(sqrtN+1); i=i+2){
if(n%i==0){
System.out.println("found a factor: "+i);
return false;
}
}
return true;
}


The smallest factor is 1699.


I have not looked around to see how this algorithm could be further improved. It is a good idea to skip the evens after we determine that two is not a factor, but skipping the multiples of three (or five, and so on) would be, I suspect, too costly to provide an advantage. The BigInteger class has a method that determines if a number is prime, but I do not know if we are guaranteed more efficiency there, and I do not understand the cost should I need to convert to a BigInteger just to find out. I have seen references to the use of stored lists of primes, which would cut down execution time, but it does not address the programming question of primality. Also, if Java's square root algorithm is considered cheap, we could just replace "i<=sqrtN" with "i*i<=n". This would cost an extra multiplication for every value of i, but we could avoid finding the square root.

Many of the more sophisticated primality tests are designed to be quick and probabilistic, meaning there is a small chance that they will give an incorrect result. However, according to Wolfram Mathworld, the fastest general deterministic test, giving absolute certainty, is "Elliptic Curve Primality Proving", which recently took only six years of CPU time to verify the primality of a 20562-digit prime number.

1 comment:

  1. you have a nice site. thanks for sharing this site. you can download lots of ebooks from here

    http://feboook.blogspot.com

    ReplyDelete