Click here to Skip to main content
15,064,153 members
Articles / General Programming / Algorithms
Article
Posted 29 May 2019

Tagged as

Stats

17.2K views
200 downloads
8 bookmarked

Integer Factorization: Trial Division Algorithm

Rate me:
Please Sign up or sign in to vote.
4.90/5 (5 votes)
30 Jan 2021CPOL6 min read
Small review of Trial Division algorithm
In this article, we will do a small review of the trial division algorithm. We will look at the classic variants of trial division including brute force, basic version, getting rid of even factors, the wheel and list of primes. Finally, we will take a look at a benchmark.

Background

This article is part of a set of articles. Each article highlights an aspect around Integer Factorization.

The xPrompt Download and prg Files

xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.

Usage:

  • Copy prg file in same directory than xPrompt.
  • Launch xPrompt.
  • Type "do FileName.prg" to run the code.

xHarbour is freeware and can be downloaded at xHarbour.org[^].

xHarbour is part of the xBase family languages: xBase - Wikipedia[^].

Introduction

When starting to play with Integer Factorization, trying all possible factors is the first idea, that algorithm is named Trial Division.

The algorithm has two purposes - finding a prime factor or finding if an integer is a prime by not by finding a prime factor.

Since the algorithm is about finding a factor, the worst case is when the integer to factorize is a prime.

Trial Division: The Classic Variants

It looks obvious that the most efficient method is to check only prime numbers, but handling the list of primes is a problem by itself. The classical variants exist as solutions to avoid that problem. As methods get more sophisticated, they remove more useless non prime numbers, thus being more efficient.

Brute Force

Some newbies are testing every single number below n. That's overkill. The inefficiency remains hidden as long as the integer to factorize is not a prime.

The maximum cost for n is n-2 divisions. Complexity of O(n) is n.

C++
// Trial Division: Brute Force 1
// Check all numbers until Cand - 1
long long TD_BF1(long long Cand) {
    Count = 0;
    long long Top = Cand - 1;
    for (long long Div = 2; Div <= Top; Div++) {
        Count++;
        if (Cand % Div == 0) {
            return Div;
        }
    }
    return Cand;
}
dBase
//  Trial Division Brute Force 1
// May 2019
function TD_BF1(Prod)
    Local D, Top
    Top= Prod-1
    for D= 2 to Top
        if Prod % D = 0
            return D
        endif
    next
    return Prod

Testing all numbers until n/2 is better, but is also overkill. Just like with the previous method, the inefficiency remains hidden as long the integer to factorize is not a prime.

The maximum cost for n is n / 2 divisions. Complexity of O(n) is n / 2.

C++
// Trial Division: Brute Force 2
// Check all numbers until Cand / 2
long long TD_BF2(long long Cand) {
    Count = 0;
    long long Top = Cand / 2;
    for (long long Div = 2; Div <= Top; Div++) {
        Count++;
        if (Cand % Div == 0) {
            return Div;
        }
    }
    return Cand;
}
dBase
//  Trial Division Brute Force 2
// May 2019
function TD_BF2(Prod)
    Local D, Top
    Top= Prod/2
    for D= 2 to Top
        if Prod % D = 0
            return D
        endif
    next
    return Prod

Basic Version: Square Root

A little analysis shows that there is no factor to check after the square root.

The maximum cost for n is √n divisions. Complexity of O(n) is √n.

C++
// Trial Division: Square Root
// Check all numbers until Square Root
long long TD_SR(long long Cand) {
    Count = 0;
    long long Top = sqrt(Cand);
    for (long long Div = 2; Div <= Top; Div++) {
        Count++;
        if (Cand % Div == 0) {
            return Div;
        }
    }
    return Cand;
}
dBase
//  Trial Division Square Root
// May 2019
function TD_SR(Prod)
    Local D, Top
    Top= int(sqrt(Prod))
    for D= 2 to Top
        if Prod % D = 0
            return D
        endif
    next
    return Prod

Get Rid of Even Factors

With further analysis, one can see that there is no even factor beyond 2.

The maximum cost for n is (√n) * 1 / 2 => (√n) * 0.50 divisions. Complexity of O(n) is √n.

C++
// Trial Division: Square Root + Even Factors
// Check all numbers until Square Root and Skip Even Factors
long long TD_SREF(long long Cand) {
    Count = 0;
    // check 2 the only Even Prime
    Count++;
    if (Cand % 2 == 0) {
        return 2;
    }
    long long Top = sqrt(Cand);
    for (long long Div = 3; Div <= Top; Div += 2) {
        Count++;
        if (Cand % Div == 0) {
            return Div;
        }
    }
    return Cand;
}
dBase
//  Trial Division Square Root + Even Factors
// May 2019
function TD_SREF(Prod)
    Local D, Top
    if Prod % 2 = 0
        return 2
    endif
    Top= int(sqrt(Prod))
    for D= 3 to Top step 2
        if Prod % D = 0
            return D
        endif
    next
    return Prod

The Wheel

The wheel is an extension of the previous optimization. One builds the wheel from small primes, say 2 and 3, the size of the wheel is 2 * 3 = 6. When one writes a list of numbers in a 6 columns table and removes 2 and 3, one can see that primes are only in the first column and in the fifth column. That is the wheel, we only have to check numbers of columns 1 and 5.

1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36
37 38 39 40 41 42
43 44 45 46 47 48
49 50 51 52 53 54
55 56 57 58 59 60

The maximum cost for n is (√n) * (1 / 2) * (2 / 3) => (√n) * (1 / 3) => (√n) * 0.33 divisions. Complexity of O(n) is √n.

dBase
    //  Trial Division Square Root + Wheel
    // May 2019
<pre lang="c++">
// Trial Division: Square Root + Wheel
// Check all numbers until Square Root and Skip useless factors with a wheel
long long TD_SRW(long long Cand) {
    long long SPrimes[] = { 2, 3, 0 };
    long long Wheel[] = { 2, 4, 0 };
    long long Div;

    Count = 0;
    long long Top = sqrt(Cand);
    // Check small primes
    for (long long Ind = 0; SPrimes[Ind] != 0; Ind++) {
        Div = SPrimes[Ind]; // for debug purpose
        if (SPrimes[Ind] > Top) {
            return Cand;
        }
        Count++;
        if (Cand % SPrimes[Ind] == 0) {
            return SPrimes[Ind];
        }
    }
    // Start the Wheel
    Div = 1;
    while (Div <= Top) {
        for (long long Ind = 0; Wheel[Ind] != 0; Ind++) {
            Div += Wheel[Ind];
            if (Div > Top) {
                break;
            }
            Count++;
            if (Cand % Div == 0) {
                return Div;
            }
        }
    }
    return Cand;
}
function TD_SRW(Prod) local D, Top, SPrimes, Wheel, W // Check small primes SPrimes= {2, 3} Wheel= {4, 2} for each D in SPrimes if Prod % D = 0 return D endif next // Start the wheel D= 1 Top= int(sqrt(Prod)) while D <= Top for each W in wheel D += W if Prod % D = 0 return D endif next enddo return Prod

The wheel can include more primes. With 2, 3 and 5, the size of wheel is 30.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90

The maximum cost for n is (√n) * (1 / 2) * (2 / 3) * (4 / 5) => (√n) * (4 / 15) => (√n) * 0.27 divisions. Complexity of O(n) is √n.

C++
long long SPrimes[] = { 2, 3, 5, 0 };
long long Wheel[] = { 6, 4, 2, 4, 2, 4, 6, 2, 0 };
dBase
// only those 2 lines need change
SPrimes= {2, 3, 5}
Wheel= {6, 4, 2, 4, 2, 4, 6, 2}

Or get even bigger. With 2, 3, 5 and 7, the size of wheel is 210.

The maximum cost for n is (√n) * (1 / 2) * (2 / 3) * (4 / 5) * (6 / 7) => (√n) * (24 / 105) => (√n) * 0.23 divisions. Complexity of O(n) is √n.

The List of Primes

Basically, it is a wheel variant, where the list of small primes exceeds the ones needed for the wheel, and after the end of list, we fall back on the wheel. The advantage over the simple wheel is that it avoids testing non prime factors not filtered by the wheel. As long as we are in the list of primes, the workload is optimum.

Just have to take care about setting the wheel correctly at the end of prime list.

The maximum cost for n is slightly better than the wheel, the list primes just save non prime divisors that are in the wheel. So the longer the list of primes, the better it gets. Complexity of O(n) is √n.

C++
// Trial Division: Square Root + Prime list + Wheel
// Check all numbers until Square Root, start with a list of primes
// and Skip useless factors with a wheel
long long TD_SRPW(long long Cand) {
	long long SPrimes[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43,
			47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113,
			127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191,
			193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263,
			269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347,
			349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421,
			431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499,
			503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593,
			599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661,
			673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757,
			761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853,
			857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941,
			947, 953, 967, 971, 977, 983, 991, 997, 1009, 1013, 1019, 1021,
			1031, 1033, 1039, 1049, 1051, 1061, 1063, 1069, 1087, 1091, 1093,
			1097, 1103, 1109, 1117, 1123, 1129, 1151, 1153, 1163, 1171, 1181,
			1187, 1193, 1201, 1213, 1217, 1223, 1229, 1231, 1237, 1249, 1259,
			1277, 1279, 1283, 1289, 1291, 1297, 1301, 1303, 1307, 1319, 1321,
			1327, 1361, 1367, 1373, 1381, 1399, 1409, 1423, 1427, 1429, 1433,
			1439, 1447, 1451, 1453, 1459, 1471, 1481, 1483, 1487, 1489, 1493,
			1499, 1511, 1523, 1531, 1543, 1549, 1553, 1559, 1567, 1571, 1579,
			1583, 1597, 1601, 1607, 1609, 1613, 1619, 1621, 1627, 1637, 1657,
			1663, 1667, 1669, 1693, 1697, 1699, 1709, 1721, 1723, 1733, 1741,
			1747, 1753, 1759, 1777, 1783, 1787, 1789, 1801, 1811, 1823, 1831,
			1847, 1861, 1867, 1871, 1873, 1877, 1879, 1889, 1901, 1907, 1913,
			1931, 1933, 1949, 1951, 1973, 1979, 1987, 1993, 1997, 1999, 2003,
			2011, 2017, 2027, 2029, 2039, 2053, 2063, 2069, 2081, 2083, 2087,
			2089, 2099, 2111, 2113, 2129, 2131, 2137, 2141, 2143, 2153, 2161,
			2179, 2203, 2207, 2213, 2221, 2237, 2239, 2243, 2251, 2267, 2269,
			2273, 2281, 2287, 2293, 2297, 2309, 2311, 2333, 2339, 2341, 2347,
			2351, 2357, 2371, 2377, 2381, 2383, 2389, 2393, 2399, 2411, 2417,
			2423, 2437, 2441, 2447, 2459, 2467, 2473, 2477, 2503, 2521, 2531,
			2539, 2543, 2549, 2551, 2557, 2579, 2591, 2593, 2609, 2617, 2621,
			2633, 2647, 2657, 2659, 2663, 2671, 2677, 2683, 2687, 2689, 2693,
			2699, 2707, 2711, 2713, 2719, 2729, 2731, 2741, 2749, 2753, 2767,
			2777, 2789, 2791, 2797, 2801, 2803, 2819, 2833, 2837, 2843, 2851,
			2857, 2861, 2879, 2887, 2897, 2903, 2909, 2917, 2927, 2939, 2953,
			2957, 2963, 2969, 2971, 2999, 0 };
	long long Wheel[] = { 6, 4, 2, 4, 2, 4, 6, 2, 0 };
	long long Div;

	Count = 0;
	long long Top = sqrt(Cand);
	// Check small primes
	for (long long Ind = 0; SPrimes[Ind] != 0; Ind++) {
		Div = SPrimes[Ind]; // for debug purpose
		if (SPrimes[Ind] > Top) {
			return Cand;
		}
		Count++;
		if (Cand % SPrimes[Ind] == 0) {
			return SPrimes[Ind];
		}
	}
	// Start the Wheel
	Div = 3001;
	while (Div <= Top) {
		for (long long Ind = 0; Wheel[Ind] != 0; Ind++) {
			if (Div > Top) {
				return Cand;
			}
			Count++;
			if (Cand % Div == 0) {
				return Div;
			}
			Div += Wheel[Ind];
		}
	}
	return Cand;
}
dBase
    //  Trial Division Square Root + Wheel + list of primes
    // with a large list of primes
    // May 2019
function TD_SRW2(Prod)
    local D, Top, SPrimes, Wheel, W
    // Check small primes
    SPrimes= {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 
              71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 
              149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 
              223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293}
    Wheel= {6, 4, 2, 4, 2, 4, 6, 2}
    
    for each D in SPrimes
        if Prod % D = 0
            return D
        endif
    next
    // Start the wheel
    D= 301
    Top= int(sqrt(Prod))
    while D <= Top
        for each W in wheel
            D += W
            if Prod % D = 0
                return D
            endif
        next
    enddo
    return Prod

Benchmark 1

As divisions/modulos are what cost time, counting them is enough for the benchmark.

C++
long long Test[] = { 11, 31, 53, 71, 97, 0, 101, 1009, 2003, 3001, 4001,
		5003, 6007, 7001, 8009, 9001, 10007, 20011, 30011, 40009, 49999,
		1000003, 4000021, 9000011, 0 };
cout << "Comparaison Variantes avec Brute Force" << endl;
cout << "Number\tTD_BF1\tTD_BF2\tTD_SR\tTD_SREF\tTD_SRW\tTD_SRPW" << endl;
for (Ind = 0; Test[Ind] != 0; Ind++) {
	cout << Test[Ind] << "\t";
	TD_BF1(Test[Ind]);
	cout << Count << "\t";
	TD_BF2(Test[Ind]);
	cout << Count << "\t";
	TD_SR(Test[Ind]);
	cout << Count << "\t";
	TD_SREF(Test[Ind]);
	cout << Count << "\t";
	TD_SRW(Test[Ind]);
	cout << Count << "\t";
	TD_SRPW(Test[Ind]);
	cout << Count << endl;
}
cout << endl;
dBase
Test = { 11, 31, 53, 71, 97, 101, 1009, 2003, 3001, 4001, 5003, 6007, 7001, 
         8009, 9001, 10007, 20011, 30011, 40009, 49999, 1000003, 4000021, 9000011 }
? "Comparaison Variantes avec Brute Force"
? "Number","TD_BF1","TD_BF2","TD_SR","TD_SREF","TD_SRW","TD_SRPW"
for Ind = 1 to 5
	? Test[Ind]
	TD_BF1(Test[Ind])
	?? Count
	TD_BF2(Test[Ind])
	?? Count
	TD_SR(Test[Ind])
	?? Count
	TD_SREF(Test[Ind])
	?? Count
	TD_SRW(Test[Ind])
	?? Count
	TD_SRPW(Test[Ind])
	?? Count
next
?
C++
Number  TD_BF1  TD_BF2  TD_SR   TD_SREF TD_SRW  TD_SRPW
11      9       4       2       2       2       2
31      29      14      4       3       3       3
53      51      25      6       4       4       4
71      69      34      7       4       4       4
97      95      47      8       5       4       4

One can see the inefficiency of brute force variants.

Benchmark 2

C++
	cout << "Comparaison Variantes sans Brute Force" << endl;
	cout << "Number\tTD_SR\tTD_SREF\tTD_SRW\tTD_SRPW\tDelta" << endl;
// test after first 0
	for (Ind++; Test[Ind] != 0; Ind++) {
		cout << Test[Ind] << "\t";
		TD_SR(Test[Ind]);
		cout << Count << "\t";
		TD_SREF(Test[Ind]);
		cout << Count << "\t";
		TD_SRW(Test[Ind]);
		int C1 = Count;
		cout << Count << "\t";
		TD_SRPW(Test[Ind]);
		cout << Count << "\t";
		cout << C1 - Count << endl;
	}
	cout << endl;
dBase
? "Number","TD_SR","TD_SREF","TD_SRW","TD_SRPW"
for Ind = 6 to len(Test)
	? Test[Ind]
	TD_SR(Test[Ind])
	?? Count
	TD_SREF(Test[Ind])
	?? Count
	TD_SRW(Test[Ind])
	?? Count
	TD_SRPW(Test[Ind])
	?? Count
next
?
C++
Comparison Variants sans Brute Force
Number	TD_SR	TD_SREF	TD_SRW	TD_SRPW	Delta
101		9		5		4		4		0
1009	30		16		11		11		0
2003	43		22		14		14		0
3001	53		27		17		16		1
4001	62		32		19		18		1
5003	69		35		20		19		1
6007	76		39		23		21		2
7001	82		42		25		23		2
8009	88		45		26		24		2
9001	93		47		27		24		3
10007	99		50		28		25		3
20011	140		71		40		34		6
30011	172		87		49		40		9
40009	199		100		56		46		10
49999	222		112		62		48		14
1000003	999		500		268		168		100
4000021	1800	901		483		279		204
9000011	2999	1500	802		430		372

Delta shows how the list of primes improves things, and it gets better with long list of primes.

Checking Correctness of Variants

Rgis code checks correctness of code by comparing results of variants.

C++
cout << "Vérification Variantes" << endl;
for (long long Cand = 3; Cand < 10000000; Cand += 10) {
    long long d1 = TD_SREF(Cand);
    long long d2 = TD_SRPW(Cand);
    if (d1 != d2) {
        cout << Cand << "\t" << d1 << "\t" << d2 << endl;
    }
}

Points of Interest

Trial Division being brute force, one can see that there is brute force and brute force.

Links

History

  • 1st April, 2019: First version
  • 27th November, 2020: Second version
  • 20th December, 2020: Third version: some cleaning, moved download to top
  • 25th December, 2020: Fourth version: some cleaning and corrections
  • 31st January, 2021: Corrections

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Patrice T
Database Developer
France France
I am a professional programmer.
Problem analyse is certainly what I am best at.
My main programming expertise is in the xBase languages (dBase, Clipper, FoxPro, Harbour, xHarbour), then VBA for Excel and advanced Excel WorkBooks.

I also have knowledge on C/C++, d language, HTML, SVG, XML, XSLT, Javascript, PHP, BASICs, Python, COBOL, Assembly.
My personal interests goes to algorithm optimization and puzzles.

Comments and Discussions

 
SuggestionA Suggestion Pin
Rick York21-Dec-20 11:51
mveRick York21-Dec-20 11:51 
GeneralRe: A Suggestion Pin
Patrice T21-Dec-20 13:38
mvePatrice T21-Dec-20 13:38 
I did you have a look at Integer Factorization: Dreaded List of Primes[^] ?
Patrice

“Everything should be made as simple as possible, but no simpler.” Albert Einstein

GeneralRe: A Suggestion Pin
Rick York21-Dec-20 20:04
mveRick York21-Dec-20 20:04 
GeneralRe: A Suggestion Pin
Patrice T22-Dec-20 11:32
mvePatrice T22-Dec-20 11:32 
PraiseNice Work Pin
Rick York20-Dec-20 14:59
mveRick York20-Dec-20 14:59 
GeneralRe: Nice Work Pin
Patrice T20-Dec-20 15:02
mvePatrice T20-Dec-20 15:02 
GeneralRe: Nice Work Pin
Rick York20-Dec-20 16:48
mveRick York20-Dec-20 16:48 
GeneralRe: Nice Work Pin
Patrice T20-Dec-20 17:58
mvePatrice T20-Dec-20 17:58 
GeneralRe: Nice Work Pin
Rick York20-Dec-20 18:16
mveRick York20-Dec-20 18:16 
GeneralRe: Nice Work Pin
Patrice T20-Dec-20 18:22
mvePatrice T20-Dec-20 18:22 
QuestionNice work! Pin
ssa-ed1-Dec-20 11:09
Memberssa-ed1-Dec-20 11:09 
AnswerRe: Nice work! Pin
Patrice T20-Dec-20 0:47
mvePatrice T20-Dec-20 0:47 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.