FactHacks

RSA factorization
in the real world

Introduction
Sage
Basic subroutines:
Euclid
Product tree
Remainder tree
Weak primes:
Batch gcd
Fermat
Lattice
Small primes:
Batch trial division
rho
p-1
ECM
Small keys:
Smooth
QS
NFS

Batch gcd

Given a sequence X of positive integers, the batch-gcd algorithm computes the sequence
  • gcd{X[0],X[1]X[2]X[3]...};
  • gcd{X[1],X[0]X[2]X[3]...};
  • gcd{X[2],X[0]X[1]X[3]...};
  • etc.
The point of this sequence is that it shows which integers share primes with other integers in the sequence. For example, if gcd{X[0],X[1]X[2]X[3]...} = 1, then X[0] has no primes shared with any of X[1], X[2], etc.; while if gcd{X[0],X[1]X[2]X[3]...} is larger than 1, then X[0] must have at least one shared prime. Often this gcd will simply equal the shared prime. If X[0] shares one prime with X[1] and another prime with X[2] then more work is required to separate those primes; this can become a speed problem if such sharing is very common, but it is handled at tolerable speed by more complicated algorithms for "factoring into coprimes".

The batch-gcd algorithm is much faster than separately computing each gcd shown above, and is much faster than computing gcds of all pairs. This algorithm was used to test millions of deployed RSA keys in 2012, factoring tens of thousands of the keys.

How it works

The batch-gcd algorithm has three steps:
  • Use a product tree to efficiently compute the product Z = X[0]X[1]X[2]X[3]...
  • Use a remainder tree to efficiently compute Z mod X[0]^2, Z mod X[1]^2, Z mod X[2]^2, etc.
  • Divide the ith remainder by X[i], and compute the gcd of the result with X[i].
The Sage code is very short:
     def batchgcd_simple(X):
       R = remainders(product(X),[n^2 for n in X])
       return [gcd(r/n,n) for r,n in zip(R,X)]
A faster version observes that the nodes of the product tree used inside the remainder tree in the second step are simply the squares of the nodes in the product tree used in the first step:
     def batchgcd_faster(X):
       prods = producttree(X)
       R = prods.pop()
       while prods:
         X = prods.pop()
         R = [R[floor(i/2)] % X[i]**2 for i in range(len(X))]
       return [gcd(r/n,n) for r,n in zip(R,X)]

     # example:
     print batchgcd_simple([1909,2923,291,205,989,62,451,1943,1079,2419])
     # output: [1909, 1, 1, 41, 23, 1, 41, 1, 83, 41]
     print batchgcd_faster([1909,2923,291,205,989,62,451,1943,1079,2419])
     # output: [1909, 1, 1, 41, 23, 1, 41, 1, 83, 41]

Version: This is version 2012.12.27 of the batchgcd.html web page.