
\chapter{Dixon's factorization method\label{chap:dixon}}
~\cite{dixon} describes a class of ``probabilistic algorithms'' for finding a
factor of any composite number, at a sub-exponential cost. They basically
consist in taking random integers $r$ in $\{1, \ldots, N\}$ and looking for
those where $r^2 \mod{N}$ is \emph{smooth}. If enough such integers are found,
they can be combined into a congruence of squares, and a factorization of $N$
attempted.
%% that's not really academic to be stated officially, but I would have never
%% understood this section without Firas (thanks).
%% <http://blog.fkraiem.org/2013/12/08/factoring-integers-dixons-algorithm/>
%% I kept the voila` phrase, that was so lovely.
\section{Interlude \label{sec:dixon:history}}
During the last century there has been a huge effort to approach the problem
formulated by Fermat ~\ref{eq:fermat_problem} from different perspectives. This
led to an entire family of algorithms, like \emph{Quadratic Sieve},
\emph{Dixon}, \ldots.
The core idea is still to find a pair of perfect squares whose difference can
factorize $N$, but Fermat's hypothesis can be weakened.
\paragraph{Kraitchick} was the first to popularize the idea that, instead of
looking for integers $\angular{x, y}$ such that $x^2 -y^2 = N$, it is
sufficient to look for \emph{multiples} of $N$:
\begin{align}
x^2 - y^2 \equiv 0 \pmod{N}
\end{align}
and, once found, claim that $\gcd(N, x \pm y)$ are non-trivial divisors of $N$,
just as we did in \ref{sec:fermat:implementation}.
Kraitchick did not stop here: instead of trying $x^2 \equiv y^2 \pmod{N}$
directly, he kept the values of previous attempts, and tried to find \emph{a
product} of such values which is also a square. So we have a sequence
\begin{align}
\label{eq:dixon:x_sequence}
\angular{x_0, \ldots, x_k} \mid \prod_{i \leq k} \; (x_i^2 - N)
\; \text{ is a perfect square}
\end{align}
and hence
\begin{align*}
\prod_i (x_i^2 - N) = y^2
\end{align*}
which $\mod{N}$ is equivalent to:
\begin{align}
\label{eq:dixon:fermat_revisited}
y^2 \equiv \prod_i (x_i^2 - N) \equiv \big( \prod_i x_i \big) ^2 \pmod{N}
\end{align}
and voil\`a our congruence of squares (\cite{discretelogs} \S 4). For what
concerns the generation of $x_i$ with the property \ref{eq:dixon:x_sequence},
they can simply be taken at random and tested using trial division.
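To make the congruence of squares concrete, here is a toy run of Kraitchick's idea in Python, on the illustrative modulus $N = 1649 = 17 \cdot 97$ (the specific numbers are mine, chosen only because neither $41^2 - N$ nor $43^2 - N$ is a square while their product is):

```python
from math import gcd, isqrt

N = 1649                                   # toy modulus: 17 * 97
x1, x2 = 41, 43
r1, r2 = x1 ** 2 - N, x2 ** 2 - N          # 32 and 200: neither is a square
assert isqrt(r1) ** 2 != r1 and isqrt(r2) ** 2 != r2

y = isqrt(r1 * r2)                         # but 32 * 200 = 6400 = 80^2
assert y * y == r1 * r2

# Hence (x1*x2)^2 ≡ y^2 (mod N), and gcd(N, x ± y) splits N.
x = (x1 * x2) % N
p, q = gcd(x - y, N), gcd(x + y, N)
print(p, q)                                # 17 97: both non-trivial divisors
```

Note that each residue taken alone is useless; only the product of the two attempts yields the square that Fermat's original method would have demanded from a single $x$.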
\paragraph{Brillhart and Morrison} later proposed (\cite{morrison-brillhart}
p.187) a better approach than trial division to find such $x$. Their idea aims
to ease the enormous effort required by the trial division. In order to achieve
this, they introduce a \emph{factor base} $\factorBase$ and generate random $x$
such that $x^2 - N$ is $\factorBase$-smooth. Recalling what we anticipated in
~\ref{chap:preq}, $\factorBase$ is a precomputed set of primes
$p_i \in \naturalPrime$.
This way the complexity of generating a new $x$ is dominated by
\bigO{|\factorBase|}. Now that the right side of \ref{eq:dixon:fermat_revisited}
has been satisfied, we have to select a subset of those $x$ so that their
product can be seen as a square. Consider an \emph{exponent vector}
$v_i = (\alpha_0, \alpha_1, \ldots, \alpha_r)$ associated with each $x_i$, where
\begin{align}
\label{eq:dixon:alphas}
\alpha_j = \begin{cases}
1 \quad \text{if $p_j$ divides $x_i^2 - N$ to an odd power} \\
0 \quad \text{otherwise}
\end{cases}
\end{align}
for each $1 \leq j \leq r$. There is no need to restrict ourselves to positive
values of $x^2 -N$, so we are going to use $\alpha_0$ to indicate the sign. This
benefit has a negligible cost: we have to add the non-prime $-1$ to our factor
base $\factorBase$.
Let now $\mathcal{M}$ be the rectangular matrix having as its $i$-th row the
$v_i$ associated to $x_i$: this way each element $m_{ij}$ will be $v_i$'s
$\alpha_j$. We are interested in finding the set(s) of subsequences of the $x_i$
whose product always has even powers (\ref{eq:dixon:fermat_revisited}).
It turns out that this is equivalent to looking for the set of vectors
$\{ w \mid w\mathcal{M} = 0 \} = \ker(\mathcal{M})$, by definition of matrix
multiplication in $\mathbb{F}_2$.
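The exponent-vector bookkeeping can be sketched in a few lines of Python; the factor base below and the helper's name are illustrative choices of mine, with the sign of the argument recorded in position $0$ as described above:

```python
# An entry per element of the factor base (sign bit first), reduced mod 2.
FB = [-1, 2, 3, 5, 7]  # illustrative factor base

def exponent_vector(y, fb=FB):
    """Exponents of y over fb reduced mod 2, or None if y is not fb-smooth."""
    v = [0] * len(fb)
    if y < 0:
        v[0], y = 1, -y
    for i, p in enumerate(fb[1:], start=1):
        while y % p == 0:
            v[i] ^= 1              # toggle: only the parity matters
            y //= p
    return v if y == 1 else None

v1 = exponent_vector(32)    # 32  = 2^5        -> (0, 1, 0, 0, 0)
v2 = exponent_vector(200)   # 200 = 2^3 * 5^2  -> (0, 1, 0, 0, 0)
# Their XOR is the zero vector, so 32 * 200 has only even exponents:
# exactly the condition for the product to be a perfect square.
print([a ^ b for a, b in zip(v1, v2)])  # [0, 0, 0, 0, 0]
```

A subset of rows of $\mathcal{M}$ summing to zero over $\mathbb{F}_2$ is precisely a subset whose XOR, as above, vanishes.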
\paragraph{Dixon} Morrison and Brillhart's ideas of \cite{morrison-brillhart}
were actually used for a slightly different factorization method, employing
continued fractions instead of the square difference polynomial. Dixon simply
ported these to the square problem, achieving a probabilistic factorization
method with a computational cost asymptotically better than all the others
previously described: \bigO{e^{\beta(\log N \log \log N)^{\rfrac{1}{2}}}} for
some constant $\beta > 0$ \cite{dixon}.
\section{Reduction Procedure}
The following reduction procedure, extracted from ~\cite{morrison-brillhart}, is
the forward part of the Gauss-Jordan elimination algorithm (carried out from
right to left), and can be used to determine whether the set of exponent
vectors is linearly dependent.
For each $v_i$ described as above, associate a \emph{companion history vector}
$h_i = (\beta_0, \beta_1, \ldots, \beta_f)$, where for $0 \leq m \leq f$:
\begin{align*}
\beta_m = \begin{cases}
1 \quad \text{ if $m = i$} \\
0 \quad \text{ otherwise}
\end{cases}
\end{align*}
At this point, we have all data structures needed:
\\
\\
\\
\begin{center}
\emph{Reduction Procedure}
\end{center}
\begin{enumerate}[(i)]
\item Set $j=r$;
\item find the ``pivot vector'', i.e. the first vector
$v_i, \quad 0 \leq i \leq f$ such that $\alpha_j = 1$. If none is found, go
to (iv);
\item
\begin{enumerate}[(a)]
\item replace every following vector $v_m, \quad i < m \leq f$
whose rightmost $1$ is the $j$-th component, by the sum $v_i \xor v_m$;
\item whenever $v_m$ is replaced by $v_i \xor v_m$, replace also the
associated history vector $h_m$ with $h_i \xor h_m$;
\end{enumerate}
\item Reduce $j$ by $1$. If $j \geq 0$, return to (ii); otherwise stop.
\end{enumerate}
Algorithm \ref{alg:dixon:kernel} formalizes the concepts discussed so far, by
presenting a function \texttt{ker}, discovering linear dependencies in any
rectangular matrix $\mathcal{M} \in (\mathbb{F}_2)^{(f \times r)}$
and storing dependencies into a \emph{history matrix} $\mathcal{H}$.
\begin{remark}
We proceed from right to left in order to conform with
\cite{morrison-brillhart}; their choice, however, rested on optimization
reasons which no longer apply to a modern calculator.
\end{remark}
\begin{algorithm}
\caption{Reduction Procedure \label{alg:dixon:kernel}}
\begin{algorithmic}[1]
\Function{Ker}{$\mathcal{M}$}
\State $\mathcal{H} \gets \texttt{Id}(f \times f)$
\Comment the initial $\mathcal{H}$ is the identity matrix
\For{$j = r \strong{ downto } 0$}
\Comment reduce
\For{$i=0 \strong{ to } f$}
\If{$\mathcal{M}_{i, j} = 1$}
\For{$i' = i+1 \strong{ to } f$}
\If{$\mathcal{M}_{i', j} = 1$}
\State $\mathcal{M}_{i'} = \mathcal{M}_i \xor \mathcal{M}_{i'}$
\State $\mathcal{H}_{i'} = \mathcal{H}_i \xor \mathcal{H}_{i'}$
\EndIf
\EndFor
\State \strong{break}
\EndIf
\EndFor
\EndFor
\For{$i = 0 \strong{ to } f$}
\Comment yield linear dependencies
\If{$\mathcal{M}_i = (0, \ldots, 0)$}
\strong{yield} $\{\mu \mid \mathcal{H}_{i,\mu} = 1\}$
\EndIf
\EndFor
\EndFunction
\end{algorithmic}
\end{algorithm}
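The reduction procedure transcribes almost line by line into Python; \texttt{reduce\_rows} is my name for this sketch, which yields, for every row reduced to zero, the set of original row indices whose sum over $\mathbb{F}_2$ vanishes:

```python
def reduce_rows(M):
    """Right-to-left forward elimination over F2 with companion history
    vectors. M is a list of f bit-rows; yields one index set per row
    that is reduced to the zero vector (a linear dependency)."""
    f = len(M)
    M = [row[:] for row in M]                    # work on a copy
    H = [[int(i == m) for m in range(f)] for i in range(f)]   # identity
    for j in range(len(M[0]) - 1, -1, -1):       # (i)   right to left
        for i in range(f):                       # (ii)  find the pivot row
            if M[i][j] == 1:
                for m in range(i + 1, f):        # (iii) clear rows below
                    if M[m][j] == 1:
                        M[m] = [a ^ b for a, b in zip(M[i], M[m])]
                        H[m] = [a ^ b for a, b in zip(H[i], H[m])]
                break                            # (iv)  next column
    for i in range(f):                           # yield dependencies
        if not any(M[i]):
            yield {m for m in range(f) if H[i][m] == 1}

# Rows 0, 1 and 2 sum to zero mod 2, and the history matrix records it:
print(list(reduce_rows([[1, 1, 0], [0, 1, 1], [1, 0, 1]])))  # [{0, 1, 2}]
```

The history rows maintain the invariant that each working row equals the XOR of the original rows flagged in its history vector, which is why a zero row certifies a genuine dependency.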
\section{An Implementation Perspective}
Before gluing everything together, we need one last building brick for
Dixon's factorization algorithm: a \texttt{smooth}($x$) function. In our
specific case, we need a function that, given as input a number $x$, returns
\strong{nil} if $x^2 -N$ is not $\factorBase$-smooth. Otherwise, it returns a
vector $v = (\alpha_0, \ldots, \alpha_r)$ such that each $\alpha_j$ is
defined just as in \ref{eq:dixon:alphas}. Once we have established
$\factorBase$, its implementation is straightforward.
\paragraph{How do we choose $\factorBase$?}
It's not easy to answer: if we choose $\factorBase$ small, we will rarely find
$x^2 -N$ \emph{smooth}. If we choose it large, attempting to factorize $x^2 -N$
with $\factorBase$ will pay the price of iterating through a large set.
\cite{Crandall} \S 6.1 finds a solution to this employing complex analytic
number theory. As a result, the ideal value for $|\factorBase|$ is
$e^{\sqrt{\ln N \ln \ln N}}$.
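As a sketch, the factor base can be built by sizing it with the heuristic just quoted and filling it with the smallest primes; the function name is mine, and a real implementation would further keep only odd primes $p$ with $N$ a quadratic residue modulo $p$, since no other prime can divide $x^2 - N$:

```python
from math import exp, log

def factor_base(N):
    """Toy factor base for N: size it as e^sqrt(ln N ln ln N) and fill it
    with the smallest primes, found by trial division against the primes
    collected so far. (No quadratic-residue filtering here.)"""
    size = int(exp((log(N) * log(log(N))) ** 0.5))
    base, p = [], 2
    while len(base) < size:
        if all(p % q for q in base):   # no smaller prime divides p
            base.append(p)
        p += 1
    return base
```

For a toy $N \approx 10^6$ this yields a base of a few hundred primes; for cryptographic sizes the same formula explains why the matrix step dominates memory.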
\begin{algorithm}
\caption{Discovering Smoothness}
\begin{algorithmic}[1]
\Require $\factorBase$, the factor base
\Procedure{smooth}{$x$}
\State $v \gets (\alpha_0 = 0, \ldots, \alpha_{|\factorBase|} = 0)$
\If{$x < 0$} $\alpha_0 \gets 1$; $\; x \gets -x$ \EndIf
\For{$i = 1 \strong{ to } |\factorBase|$}
\While{$\factorBase_i \mid x$}
\State $x \gets x// \factorBase_i$
\State $\alpha_i \gets \alpha_i \xor 1$
\EndWhile
\EndFor
\If{$x = 1$}
\State \Return $v$
\Else
\State \Return \strong{nil}
\EndIf
\EndProcedure
\end{algorithmic}
\end{algorithm}
\begin{algorithm}
\caption{Dixon}
\begin{algorithmic}[1]
\Require $\factorBase$, the factor base
\Function{dixon}{\PKArg}
\State $i \gets 0$
\State $f \gets |\factorBase| + 5$
\Comment finding linearity requires redundancy
\While{$i < f$}
\Comment search for suitable pairs
\State $x_i \getsRandom \naturalN_{< N}$
\State $y_i \gets x_i^2 - N$
\State $v_i \gets \texttt{smooth}(y_i)$
\If{$v_i \neq \strong{nil}$} $i \gets i+1$ \EndIf
\EndWhile
\State $\mathcal{M} \gets \texttt{matrix}(v_0, \ldots, v_{f-1})$
\For{$\lambda = \{\mu_0, \ldots, \mu_k\}
\strong{ in } \texttt{ker}(\mathcal{M})$}
\Comment get relations
\State $x \gets \prod\limits_{\mu \in \lambda} x_\mu \pmod{N}$
\State $y \gets \dsqrt{\prod\limits_{\mu \in \lambda} y_\mu} \pmod{N}$
\If{$\gcd(x+y, N) > 1$}
\State $p \gets \gcd(x+y, N)$
\State $q \gets \gcd(x-y, N)$
\State \Return $p, q$
\EndIf
\EndFor
\EndFunction
\end{algorithmic}
\end{algorithm}
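The pieces assemble into a self-contained toy run. This sketch deliberately deviates from the pseudocode in two ways, both for determinism and robustness rather than fidelity: $x$ scans upward from $\lceil\sqrt{N}\rceil$ instead of being drawn at random, and the dependency search is a plain Gauss-Jordan over $\mathbb{F}_2$ with each pivot row used once, instead of the right-to-left pass; all names and the toy inputs are mine:

```python
from math import gcd, isqrt
from itertools import count

def dixon(N, fb_primes):
    """Toy Dixon-style factorization: collect smooth x^2 - N relations,
    find F2 dependencies, and try gcd(x +- y, N)."""
    fb = [-1] + list(fb_primes)

    def smooth(y):                      # exponent vector of y mod 2, or None
        v = [0] * len(fb)
        if y < 0:
            v[0], y = 1, -y
        for i, p in enumerate(fb[1:], start=1):
            while y % p == 0:
                v[i] ^= 1
                y //= p
        return v if y == 1 else None

    xs, ys, rows = [], [], []
    for x in count(isqrt(N) + 1):       # deterministic scan, not random
        v = smooth(x * x - N)
        if v is not None:
            xs.append(x); ys.append(x * x - N); rows.append(v)
            if len(rows) == len(fb) + 5:    # redundancy guarantees deps
                break

    f = len(rows)
    H = [[int(i == m) for m in range(f)] for i in range(f)]
    used = set()
    for j in range(len(fb)):            # Gauss-Jordan over F2
        pivot = next((i for i in range(f)
                      if i not in used and rows[i][j]), None)
        if pivot is None:
            continue
        used.add(pivot)
        for m in range(f):
            if m != pivot and rows[m][j]:
                rows[m] = [a ^ b for a, b in zip(rows[pivot], rows[m])]
                H[m] = [a ^ b for a, b in zip(H[pivot], H[m])]

    for i in range(f):
        if any(rows[i]):
            continue                    # not a dependency
        deps = [m for m in range(f) if H[i][m]]
        x, prod = 1, 1
        for m in deps:
            x = x * xs[m] % N
            prod *= ys[m]
        y = isqrt(prod) % N             # prod is a square over the integers
        for d in (gcd(x + y, N), gcd(x - y, N)):
            if 1 < d < N:
                return d, N // d
    return None
```

Note the order of operations on the second-to-last step: the square root is taken over the integers, where the product is an exact square, and only then reduced modulo $N$.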
\paragraph{Parallelization}
Dixon's factorization is ideally suited to parallel implementation. Similarly to
other methods like ECM and MPQS, treated in \cite{brent:parallel} \S 6.1,
we can \emph{linearly} improve the running time by distributing the discovery
of $\factorBase$-smooth numbers across many nodes.
Depending on the granularity we desire, and on the number of nodes available,
we can even act on the \texttt{ker} function, though less easily.
This idea boils down to the same structure we discussed with Wiener's attack:
one node - the \emph{producer} - discovers linear dependencies, while the others
- the \emph{consumers} - attempt to factorize $N$.
This is why we introduced the \texttt{yield} statement in line
$12$ of algorithm \ref{alg:dixon:kernel}: the two jobs can be performed
asynchronously.
Finally, due to the probabilistic nature of this algorithm, we can even think
about running multiple instances of the same program. This solution is fairly
effective in proportion to the development cost.
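A single-machine analogue of the smoothness-farming idea can be sketched with a process pool standing in for the nodes; the toy $N$, factor base, and function name are all illustrative assumptions, and a real deployment would hand each worker its own disjoint slice of candidates:

```python
from math import isqrt
from multiprocessing import Pool

N = 1649                      # toy modulus
FB = [2, 3, 5, 7, 11, 13]     # toy factor base

def smooth_relation(x):
    """Worker: return (x, x^2 - N) if x^2 - N is FB-smooth, else None."""
    y = r = x * x - N
    for p in FB:
        while r % p == 0:
            r //= p
    return (x, y) if r == 1 else None

if __name__ == "__main__":
    start = isqrt(N) + 1
    with Pool(2) as pool:     # each worker screens its own x's
        hits = pool.map(smooth_relation, range(start, start + 30))
    relations = [h for h in hits if h is not None]
    print(relations[:3])      # [(41, 32), (43, 200), (47, 560)]
```

Since the relations are independent of one another, the speedup is essentially linear in the number of workers, exactly as claimed above for the multi-node case.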
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "question_authority"
%%% End: