
Introduction
For a positive integer n, we have to determine if n is prime, where n is small (i.e., it can be assigned to a ulong type in C#). Thus, n will be constrained to be less than 2^64-1. The algorithm presented here will make some notable improvements over the brute force method by using Wheel Factorization.
Background
Prime numbers are interesting numbers to the math community. And, the search for prime numbers seems to be the hobby (or profession) of many mathematicians. Prime numbers are also important in the RSA encryption algorithm. There is even a prize money for finding very large primes.
The Algorithm
Our goal is to determine if a positive integer is prime or not. The brute force way would look something like this:
public bool IsPrime(ulong primeSuspect)
{
if (primeSuspect < 2) return false;
if (primeSuspect == 2) return true;
for (ulong divisor = 2; divisor < primeSuspect; divisor++)
{
if (primeSuspect % divisor == 0)
{
return false;
}
}
return true;
}
The above code is inefficient because it often checks a number as a possible divisor that has already been eliminated because it is a multiple of a smaller number. No need to check 4 as a divisor if 2 was already checked. And, no need to check 9 as a divisor if 3 has already been checked, and so on. Another problem is that it checks all numbers up to our prime candidate as a divisor. We will fix this first.
So, our first improvement to the algorithm will be to only search for factors for our prime candidate up to the square root of our prime candidate. If there is a factor to our candidate greater than its square root, then there must also be a factor less than its square root such that when these two factors are multiplied, it gives us our original candidate.
Our next improvement will be to use a method called Wheel Factorization. If we already know all the primes less than or equal to the square root of our candidate, this would be optimal. However, searching for all these lesser primes comes at the price of more computation time, and negates much of our gains. So, I'll use Wheel Factorization to speed up the search.
In Wheel Factorization, you start with the first few primes. In this example, I will use 2, 3, and 5, the first three primes, to make it simple. (In the downloaded code, I use more than the first three primes, which will give us some more improvement.) This gives us a Wheel Factorization of 30, the product of the first three primes (2*3*5). You then make a list of the integers from 1 to 30, and eliminate all the numbers in the list that are multiples of 2, 3, or 5. This gives us this list: {1, 7, 11, 13, 17, 19, 23, 29} of sieved numbers. These sieved numbers give us a pattern of numbers that repeat and are not multiples of 2, 3, or 5. Thus, if you add 30, or 60, or 90, etc. to each of these numbers in the list, none are divisible by 2, 3, or 5. I will make a small modification to this sieved list of numbers to make the loop simpler. I will remove the 1, and add it to 30 at the tail of the list. So now, our list of numbers is {7, 11, 13, 17, 19, 23, 29, 31}. This is so I can do a pass = 0 and don't have to divide by 1.
So, here is the heart of the program (simplified for this article by using just the first three primes to create the sieve):
private static ulong[] aSieve30 = new ulong[]
{7, 11, 13, 17, 19, 23, 29, 31};
public static ulong FirstDivisor(ulong candidatePrime)
{
if (candidatePrime == 0)
throw new ArgumentException ("Zero is an invalid parameter!");
List<ulong> firstPrimes =
new List<ulong>(new ulong[] {2, 3, 5});
WheelFactor = 30;
if (candidatePrime == 1)
{
return 0;
}
foreach (ulong prime in firstPrimes)
{
if (candidatePrime % prime == 0) return prime;
}
ulong theSqrt = (ulong)Math.Sqrt((double)candidatePrime);
for (ulong pass = 0; pass < theSqrt; pass += WheelFactor)
{
foreach (ulong sieve in aSieve30)
{
if (candidatePrime % (pass + sieve) == 0)
{
return pass + sieve;
}
}
}
return candidatePrime;
}
public static bool IsPrime(ulong primeSuspect)
{
if (primeSuspect == 0) return false;
return (FirstDivisor(primeSuspect) == primeSuspect);
}
As you can see from the for loop above, it is incremented by our WheelFactor, which is 30 in this case. And, the inner loop checks all 8 primes in our sieved list. Therefore, just 8 of 30 numbers are checked, which is more than a 73% improvement over the brute force way. Unfortunately, increasing the number of primes in our first primes list has a diminishing improvement on our search. The downloaded code uses the first 8 primes, which gives about an 83% improvement over the brute force way.
The figure illustrates a wheel factorization of 30. After performing trial divisions of 2, 3, and 5, then you only have to do trial divisions for those spokes of the wheel that are white. The spokes of the wheel that are red have been eliminated for consideration as possible divisors.
Conclusion
In Wheel Factorization, you get some good performance improvement in determining if a number is prime by skipping all the multiples of the first few primes.
Reference
History
- 19 Nov 2008 - Original submission
- 31 Jan 2009 - The
BuildSieve() method was improved
| You must Sign In to use this message board. |
|
|
 |
|
 |
I decided to run my own benchmarks against the Prime Suspect program to show the different performance levels at the different Wheel Factor values. I did not include in my timings the time it took to build the sieve list. This is because the sieve list is built once when the program initiates and remains in memory throughout the life of the program. I also modified the Brute Force method above to stop after reaching the square root of the prime number candidate. This is so that the benchmark would only show the effect of adding the wheel factorization to the algorithm. The brute force method is shown in the row with the # Primes in firstPrimes of zero. All tests used the largest prime that can fit into a ulong field in C#, which is the default value of the demo program above. The times are listed in milliseconds and is the average of 5 tests. The computer used for the test is a Dell DIMENSION 8400 3.40 GHz and 1.00 GB of RAM. (Your results may vary.)
# Primes in Time (ms) Theoretical % Actual % of firstPrimes of Brute Force Brute Force ------------------------------------------------------------------- 0 190220.4 1 130388.4 50.00% 68.55% 2 82826.0 33.33% 43.54% 3 65491.2 26.67% 34.43% 4 54261.8 22.85% 28.53% 5 49065.4 20.78% 25.79% 6 45318.0 19.18% 23.82% 7 42794.6 18.05% 22.50% 8 40594.6 17.10% 21.34% 9 39994.2 16.36% 21.03%
Some observations on the results:
Because of some additional overhead in the Wheel Factorization inner loop (the foreach loop) and the addition in the inner loop (pass + sieve) the actual gains do not quite meet the theoretical gains. But they get closer with more primes in the firstPrimes list.
Also note that the gains from row 8 to row 9 are small. This seems to be because the wheel factor for row 9 is over 223 million. So it may do millions of divisor checks before the code to check if it is over the square root of the candidate is executed.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Thanks for code.I wass looking a faster algorithm before.But now, i do not interest in that. Till times i was interested, i discovered some very important and mathematical shortcuts for brute algorithm.I tell you them: *All even numbers are divisible by 2 so do not need them: add +2 from starting 3 so halve count of divisions. *if we are looking for x for primity, so we need actually check up to sqrt(x) for it.Because if we look up to x/2, it is already divisible by 2 so we do not need more calculations. Brute force algorithm can improved such a way, so for number N we check nearly ~ sqrt(N)/2 a simple code to explain it:
bool check_prime(_int64 number) { int sqrt_count = ceil(sqrt(number)); if(number%2 == 0)return false;
for(_int64 x=3; x <= sqrt_count;x += 2) if(number%x == 0) return false; return true; }
So: if you check for 1.000.000.000.000, you need nearly 1.000.000/2 = 500.000 division, which is aceptable to me for now.
Sorry if i am mistaken.I welcome any information.
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." Albert Einstein
modified on Tuesday, November 25, 2008 2:53 AM
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
murti386,
You can further improve the performance of your code by
bool check_prime(_int64 number) { if(number%2 == 0)return false; int sqrt_count = ceil(sqrt(number)); for(_int64 x=3; x < sqrt_count;x += 2) if(number%x == 0) return false; return true; }
1) Moving the line with int sqrt_count = ceil(sqrt(number)); below the divided by 2 test eliminates unnecessary calculations for all even numbers.
2) The change from <= sqrt_count to < sqrt_count is possible, because if sqrt_count is a factor, then the only possibility is a composite number = sqrt_count * sqrt_count. Since we only want primes, we can skip this boundary. This is why rickoshay's code above only has <.
3) Rickoshay's algorithm above should be much faster than what you've shown, as it quickly eliminates many numbers at a time (30 per pass as shown, instead of 2 per pass). Might be fun to run both and report the times you see for each.
modified on Tuesday, November 25, 2008 5:13 AM
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Just tried to give en example.Of course, you are right, but it is not my expectation that it can be used for programming purposes. I disagree with < instead of <=.I tought it. You can see i modified comment, only for this change.As an example i give 25.It starts with 3, +2 = 5. 5 is sqrt of 25, but it will ignore it because 5 is not smaller, it is equal.Btw 5 is prime.Maybe ceil() can be removed, i am paranoid about such kind of bugs. Brute force does not need additional memory, it is simple for integrated circuits, i tried to show how to improve it.Access to memory requires time, may be equal for one division on many systems. You are right anyway.It has to be tested.
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." Albert Einstein
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
As stated in the subject line this brute force algorithm is just an code otimized wheelfactor 2 implementation. Wich means it has an efficency boost of 50% (using same calculus as in the article)
Here now is the optimized version for wheel factor 6 ( efficency boost of 66% )
bool check_prime(_int64 number) { if(number%2 == 0)return false; if(number%3 == 0)return false; int sqrt_count = ceil(sqrt(number)); for(_int64 x=0; x < sqrt_count;x += 6 ){ if(number%(x+5) == 0) return false; if(number%(x+7) == 0) return false; } return true;}
This can be extendet pretty easily by writing more lines of code: 2,3 are the "firstPrimes" , 5,7 is the "sieve" and 6 the wheelfactor
besides: on classical x86 machines modulo operations are much more expensive then plain divisions. but I can't compare that to Memory or cache reads.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
I'm afraid that you misunderstood the reason that I included the code for the brute force method. It is included to show how NOT to write the procedure. This is why I included the comment in the code that said "Warning: Slow code ahead, do not use." I repeat, the first method is bad code, do not use it. The code below it that uses Wheel Factorization is a much improved algorithm even over the examples you have shown. And the demo code that you can download has even a higher wheel factor.
There are better algorithms for proving primality for larger numbers than Wheel Factorization, for example the Elliptic Curve Primality Proof (ECPP). However it is beyond the scope of this article and much more difficult to understand without a good mathematics background.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
I misunderstood algorithm.Someone above simply told that they are already same with factor=2.But i always checked till sqrt of code, that is a great performance gain even with bad code in the code:
for (ulong pass = 0; pass < theSqrt; pass += WheelFactor) { foreach (ulong sieve in aSieve) { if (candidatePrime % (pass + sieve) == 0) { return pass + sieve; } } }
There is two loops(compiler may not unroll it) There is a list of aSieve instead of hardcoded numbers. There is an addition, but it is performed fast. There is nothing wrong with it.Comparing is better because it is not simply %70-%80 faster,it uses slower code.Btw, division with 64bit integer (and modulo) on 32bit systems is too slow as i seen in assembly.But *not* on a real 64 bit system, i think. Thanks for article.
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." Albert Einstein
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|
 |
|
 |
It is hard for me to comment on your benchmarks since I do not see your code and how it was done. Also I wonder if your benchmark removed the portion of my code that reported back progress to the GUI, which could slow down the search. I can also see some bugs in the example code that you listed above. You do not do a pass + 7 modulo the candidatePrime when it should, and you added a pass + 25 modulo the candidatePrime when it should not. This makes me wonder how you generated this injected code, given these bugs. It would also be unwieldy to do this for the larger wheel factors. In the last case you would need to inject over 36 million lines of code (although I did not double check your sieved list size).
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
Oops replyed to my own message instead of yours tak a look in the thread.
The top row - with 36Million entry sieve - in my Benchmark ist just there to show how big this thing gets.
Even your Code will fail on this, because some .Net function return int instead of long.
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
 |
I just used the 3 - 4 functions in PrimeDeterminer that are needed to call FirstDivisor, I called it directly , no backgroundprocessing, no GUI.
I also corrected the Error in my article, You found. But it's related to the fact I wrote the article after doing the benchmark and manually changed 8 numbers for the article.
For the benchmark I set a Breakpoint and copied exactly the content of aSieve from VS to Excel where this kind of code generation is super easy, no matter sieve size is 48 or 480 or ... (... does not mean infinit ).
As mentioned in the benchmark even the step from 48 to 480 lines cause the programm to decrease in speed, maybe because of instruction cache size on the cpu. I will not add the Test with 5760 lines inserted, because already the creation of the sieve takes quite a while.
public class PrimeDeterminer : BackgroundWorker { ... List<ulong>(new ulong[] {2, 3, 5, 7, 11, 13, 17, 19}); ...
public PrimeDeterminer(ulong primeSuspect) { WorkerReportsProgress = true; WorkerSupportsCancellation = true; possiblePrime = primeSuspect; ComputeWheelFactor(); BuildSieve(); }
public void SetPrimeSuspect(ulong anInt){...} private static void ComputeWheelFactor(){...} private static void BuildSieve(){...} public static ulong FirstDivisor(ulong candidatePrime) { ComputeWheelFactor(); BuildSieve();
if (candidatePrime == 0) throw new ArgumentException("Zero is an invalid parameter, unable to determine its divisors!"); if (candidatePrime == 1) { return 0; } foreach (ulong prime in firstPrimes) { if (candidatePrime % prime == 0) return prime; }
ulong theSqrt = (ulong)Math.Sqrt((double)candidatePrime); for (ulong pass = 0; pass < theSqrt; pass += WheelFactor)
modified on Friday, November 28, 2008 11:27 AM
|
| Sign In·View Thread·PermaLink | |
|
|
|
 |
|
|