Factoring Algorithms for Selected RSA Numbers

Roger Doss

5.00/5 (7 votes)

Nov 10, 2011

GPL3

6 min read

34520

479

Download source code - 1.17 KB

Introduction

RSA requires that we select two random prime numbers, p and q, and use them to generate a number n = p*q. n is called a semi-prime number since it has only two factors (aside from the number 1). In an attempt to learn about factoring by examining different approaches, one approach was to try to deduce the important value of p + q. The sum of the divisors p and q can be used to factor n=p*q.

Background

One of the approaches that was implemented uses the following polynomial as a representation of the problem of factoring a semi-prime number n:

x^2 - (p+q)x + pq = 0

This allows us to use the quadratic equation [2] to deduce the factors p and q by setting the coefficients as a = 1, b = -(p+q), and c = pq, in the general equation:

ax^2 + bx +c = 0

therefore, by substitution into the Quadratic formula [2]:

x = (-b +- sqrt(b^2 – 4ac))/2

we have the following solution:

p+q +- sqrt( (p+q)^2 - 4 * pq )               (Equation 1)
--------------------------
                    2

As a simple example, consider n=3*7=21. We now have:

x^2 - (3+7)x + 21

Which, given our usage of the quadratic equation, yields the formula:

10 +- sqrt ((10^2) - 4 * 21)
----------------------------
            2

Solving for the case of:

10 + sqrt ((10^2) - 4 * 21)
---------------------------- = 7
            2

And, for the case of:

10 - sqrt ((10^2) - 4 * 21)
---------------------------- = 3
           2

This allows us to reproduce the factors 3 and 7. This implies that if we can deduce the sum p + q somehow, we could trivially factor n.

We note that we have:

(p + q)^2 – (p – q)^2 = 4 * pq  (Equation 2)

And we can prove this statement by squaring p + q and p – q, and solving:

          p^2 + 2 * pq + q^2 – [p^2 – 2 *pq + q^2] = 4 * pq
          p^2 + 2 * pq + q^2 – p^2 + 2 *pq – q^2 = 4 * pq
                      2 * pq + 2 * pq = 4 * pq
                              4 * pq = 4 * pq

Given this, we can restate the problem of factoring a semi-prime as an application of the Pythagorean theorem:

c^2 = a^2 + b^2

where c = p + q, a = p – q, and b = sqrt(4 * pq), thus:

(p + q)^2 = (p – q)^2 + (sqrt(4 * pq))^2

This is because:

(p + q)^2 = (p – q)^2 + 4 * pq    (square the square root yields 4 * pq)
(p + q)^2 – (p – q)^2 = 4 * pq    (subtracting (p-2)^2 from both sides)

which is Equation 2. Using this, we can write that p + q is:

p + q = sqrt((p – q)^2 + 4 * pq)         (Equation 3)

which follows from Pythagorean theorem.

Given Equation 3 and noting that the right hand side must be a perfect square in order for us to produce an integer factoring for n, we can devise a factoring algorithm that is O((p – q)/2):

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int
main(int argc, char **argv)
{
    if(argc != 2) {
         fprintf(stderr, “syntax: factor semi_prime\n”);
         return 1;
    }
    {
         int n = strtoul(argv[1], NULL, 10);
         int i  = 2, j = 0, k = 0, trials = 0, p = 0, q = 0;
         if((n % i) == 0) {
             /* Divisible by 2. */
             printf(“n=%d p=%d q=%d trials=1\n”, n, i, p/i);
             return 0;
         }
         for(; i < n; i += 2, ++trials) {
              j = ( i * i );
              j = j + 4 * n;
              k = (int)sqrt(j);
              if((k * k) == j) {
                  /* k == p + q, i == p – q */
                              p = (k + i) / 2; 
                  q = (k – i) / 2;
                  printf(“n=%d p=%d q=%d trials=%d\n”,n,p,q,trials);
                  return 0;
              }
         }
    }
    return 0;
}

It should be clear that the algorithm is O((p – q)/2). The algorithm stops when we find a perfect square which is determined when the following if statement is true:

if ((k * k) == j)

and that this occurs precisely when i is equal to p – q. Since we count starting with i == 2 by a factor of 2, we only require half of the steps in order to find a factoring. The case where one of the factors, either p or q, is the number 2 is treated as a special case. Further, the case where p and q are the same number (and therefore n is a perfect square) is not handled by the algorithm. The algorithm allows us to factor semi-prime numbers whose factors differ by a countable difference in polynomial time. For cases where p and q differ significantly (as would be in the case of a real-world RSA engine we hope), this algorithm is not feasible.

Application of Euler Phi

Our interest now turns to the Euler Phi [1] function. Here, some results were computed from the definition of the Euler Phi function with respect to semi-prime numbers, n=p*q where p and q are unique primes. The following results were deduced:

Let eulerphi(n) equal (p-1)*(q-1), for semi-primes. We then compute eulerphi(n) as p*q - p - q + 1, by multiplying out (p-1)*(q-1).

It follows then, that n = eulerphi(n) + p + q - 1. If we let x = p + q, we may state that n = eulerphi(n) + x - 1. Solving for x, yields:

x = n - eulerphi(n) + 1, thus

p+q = n - eulerphi(n) + 1.

Now, given the polynomial: x^2 - (p + q)x + pq, we can use the Quadratic formula to deduce p and q, factors of n, as we stated earlier. Except this time, the value of Euler phi is computed. Thus, if we can compute Euler phi of n, we can factor n trivially.

To implement this idea, we use pari/GP, available from http://www.ufr-mi.u-bordeaux.fr/~belabas/pari, and code the following script using GP:

n=312825222036352242636619;
phi=eulerphi(n);
p_plus_q=n-phi+1;
sq=p_plus_q*p_plus_q;
t=sq-4*n;
determinant=sqrt(t);
p=(p_plus_q+determinant)/2;
q=(p_plus_q-determinant)/2;
 
print ("factoring: " n);
print ("q: " q);
print ("p: " p);
\q

Included in the article is a program called gen_prime.java which generates random primes using the number of bits desired as a guide. This tool will be used to test our Euler Phi based factoring. For example, the Linux command line of:

echo "`java gen_prime 64` * `java gen_prime 64`" | bc

will generate a 128 bit semi-prime.

Now, given our Java code to deduce random primes, another script is used to test several cases using those generated random primes.

Those values, when multiplied, are used as values of n in the GP factoring script.

We then let pari solve the equation and produce a result. An example of the Euler Phi application:

parisize = 4000000, primelimit = 500000
factoring: 312825222036352242636619
q: 63060978419.00000000000000000
p: 4960678217801.000000000000000
Goodbye!

In comparison, the current 'factor' command on Linux produces the following result:

$ factor 312825222036352242636619
factor: `312825222036352242636619' is too large

The test script to run is available from test_eulerphi.sh. It allows the user to customize the script to try out the idea on 128 bit semi-prime numbers by setting the script variable PRIMES to lg(sqrt(2^128)) or 64, and the number of trials to a desired number, currently 10. The script can test the current limitation of this approach easily by supplying different values for the two variables, PRIMES and TRIALS. The current limit was found to be 84 bit primes, which allowed for a factoring of 168 bit semi-primes.

Using the Code

The code to test out the O((p-q)/2) algorithm should compile with a C++ compiler, and was tested with GCC on Linux and MinGW on Windows. Running it requires you pass in a semi-prime number. The user will note how little the number of operations it requires to factor is, where the two factors differ by a countable difference. As coded, the algorithm uses the default built-in 32 bit integer types. For the pari/GP script, you need to have this software installed on your computer first. Then, you need to compile the Java code to generate random prime numbers (gen_prime.java). After that, you can run the script (test_eulerphi.sh). The code is configured to factor 128 bit semi-primes.

Points of Interest

The best factoring algorithm that currently exists is the Generalized Number Field Sieve (GNFS) [1]. For factoring larger numbers, it is recommended that the algorithm be investigated. Alternatively, perhaps there are other techniques that can be discovered based on the idea of calculating either (p + q) or (p – q) where n = p * q.

Conclusion

The article shows two techniques involving the Quadratic formula, with one technique designed to take advantage of the difference between the two factors, and another capable of factoring 128 bit+ semi-primes. The implementation of the latter is given as a script, and all codes are freely available on the web for further investigation and use. The problem of resolution will remain an interesting one. At this time, no one has deduced a method that runs in polynomial time or proves that the complexity of the resolution problem is not in polynomial time [1]. The use of the O((p-q)/2) algorithm is limited in cases where p-q is large. The use of Euler Phi of n is limited in that it is conjectured to be as difficult to compute as factoring n.

In conclusion, the largest number factored so far in Pari/GP using the Euler Phi technique was a 168 bit integer:

parisize = 4000000, primelimit = 500000
factoring: 133508893528006101286190330650870490524366217338949
q: 10226209140199387919889509.00
p: 13055560638123520736792161.00

References

[1] Pomerance, C. & Crandall, R.(2005) Prime Numbers, Springer, New York.
[2] Larson, R. E., Hostetler, R. P., Edwards, B. H., & Heyd, D. E. (1997)
PreCalculus, Houghton Mifflin Company, New York.