\chapter{Dixon {\texttt{\small{[insert random buzzword here]}}} method\label{chap:dixon}}
\cite{dixon} describes a class of ``probabilistic algorithms'' for finding a
factor of any composite number at a sub-exponential cost. They basically
consist in taking random integers $r$ in $\{1, \ldots, N\}$ and looking for
those where $r^2 \mod{N}$ is \emph{smooth}. If enough are found, then those
integers can be assembled, and so a factorization of $N$ attempted.
%% that's not really academic to be stated officially, but I would have never
%% understood this section without Firas (thanks).
%% <http://blog.fkraiem.org/2013/12/08/factoring-integers-dixons-algorithm/>
%% I kept the voila` phrase, that was so lovely.
\section{A little bit of History \label{sec:dixon:history}}
During the last century there has been a huge effort to approach the problem
formulated by Fermat~\ref{eq:fermat_problem} from different perspectives. This
led to an entire family of algorithms, like \emph{Quadratic Sieve},
\emph{Dixon}, \ldots
The core idea is still to find a pair of perfect squares whose difference can
factorize $N$, but maybe Fermat's hypothesis can be made weaker.
\paragraph{Kraitchick} was the first to popularize the idea that, instead of
looking for integers $\angular{x, y}$ such that $x^2 - y^2 = N$, it is
sufficient to look for \emph{multiples} of $N$:
\begin{align}
x^2 - y^2 \equiv 0 \pmod{N}
\end{align}
and, once found, claim that $\gcd(N, x \pm y)$ are non-trivial divisors of $N$,
just as we did in \ref{sec:fermat:implementation}.
Kraitchick did not stop here: instead of requiring $x^2 \equiv y^2 \pmod{N}$
directly, he kept the values of previous attempts and tried to find \emph{a
product} of such values which is a square. So we have a sequence
\begin{align}
\label{eq:dixon:x_sequence}
\angular{x_0, \ldots, x_k} \mid \prod_{i \leq k} (x_i^2 - N)
\; \text{ is a perfect square}
\end{align}
and hence
\begin{align*}
\prod_i (x_i^2 - N) = y^2
\end{align*}
which $\mod{N}$ is equivalent to:
\begin{align}
\label{eq:dixon:fermat_revisited}
y^2 \equiv \prod_i (x_i^2 - N) \equiv \big( \prod_i x_i \big) ^2 \pmod{N}
\end{align}
and voil\`a, our congruence of squares. For what concerns the generation of
$x_i$ with the property \ref{eq:dixon:x_sequence}, they can simply be taken at
random and tested using trial division.
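Kraitchick's observation can be sketched in a few lines of Python. The toy
function below (the brute-force search, the bounds, and the example modulus are
all our own assumptions, not part of any reference implementation) hunts for an
$x$ whose square reduces modulo $N$ to a perfect square $y^2$ with
$x \not\equiv \pm y \pmod{N}$, then reads the factors off $\gcd(N, x \pm y)$:

```python
from math import gcd, isqrt

def congruence_of_squares(N):
    """Brute-force search for x, y with x^2 = y^2 (mod N) and x != +/-y,
    then recover factors as gcd(N, x - y) and gcd(N, x + y)."""
    for x in range(isqrt(N) + 1, N):
        y = isqrt(x * x % N)
        # accept only if x^2 mod N really is a perfect square,
        # and the congruence is non-trivial (x != +/-y mod N)
        if y * y == x * x % N and x % N not in (y % N, (-y) % N):
            return gcd(N, x - y), gcd(N, x + y)
    return None

print(congruence_of_squares(77))   # -> (7, 11)
```

The exhaustive search is of course exponential; the whole point of the rest of
the chapter is replacing it with something smarter.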
\paragraph{Brillhart and Morrison} later proposed (\cite{morrison-brillhart}
p.~187) a better approach than trial division to find such $x$. Their idea aims
to ease the enormous effort required by trial division. In order to achieve
this, they introduce a \emph{factor base} $\factorBase$ and generate random $x$
such that $x^2 - N$ is $\factorBase$-smooth. Recalling what we anticipated in
~\ref{chap:preq}, $\factorBase$ is a precomputed set of primes
$p_i \in \naturalPrime$.
This way the complexity of generating a new $x$ is dominated by
\bigO{|\factorBase|}. Now that the right side of \ref{eq:dixon:fermat_revisited}
has been satisfied, we have to select a subset of those $x$ so that their
product can be seen as a square. Consider an \emph{exponent vector}
$v_i = (\alpha_0, \alpha_1, \ldots, \alpha_r)$ associated with each $x_i$, where
\begin{align*}
\alpha_j = \begin{cases}
1 \quad \text{if $p_j$ divides $x_i^2 - N$ to an odd power} \\
0 \quad \text{otherwise}
\end{cases}
\end{align*}
for each $1 \leq j \leq r$. There is no need to restrict ourselves to positive
values of $x^2 - N$, so we are going to use $\alpha_0$ to indicate the sign.
This benefit has a negligible cost: we have to add the non-prime $-1$ to our
factor base $\factorBase$.
Let now $\mathcal{M}$ be the rectangular matrix having as its $i$-th row the
$v_i$ associated to $x_i$: this way each element $m_{ij}$ will be $v_i$'s
$\alpha_j$. We are interested in finding the subsequences of $x_i$
whose product always has even powers (\ref{eq:dixon:fermat_revisited}).
It turns out that this is equivalent to looking for the set of vectors
$\{ w \mid w\mathcal{M} = 0 \} = \ker(\mathcal{M})$ by definition of matrix
multiplication in $\mathbb{F}_2$.
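A tiny numeric illustration may help. With a toy factor base (our own choice
for the example) the exponent vectors live in $\mathbb{F}_2$, and a subset
whose vectors xor to zero multiplies to a perfect square:

```python
# exponent-parity vectors over F_2 for a toy factor base (sign bit omitted)
factor_base = [2, 3, 5, 7]

def parity_vector(n, base):
    """Exponent vector mod 2 of n over the base; None if n is not smooth."""
    v = [0] * len(base)
    for i, p in enumerate(base):
        while n % p == 0:
            n //= p
            v[i] ^= 1          # track only the parity of the exponent
    return v if n == 1 else None

# 10 = 2*5, 24 = 2^3*3, 15 = 3*5; their vectors xor to the zero vector,
# so the product 10 * 24 * 15 = 3600 = 60^2 is a perfect square.
vecs = [parity_vector(n, factor_base) for n in (10, 24, 15)]
xor = [a ^ b ^ c for a, b, c in zip(*vecs)]
print(xor)   # -> [0, 0, 0, 0]
```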
\paragraph{Dixon} Morrison and Brillhart's ideas of \cite{morrison-brillhart}
were actually used for a slightly different factorization method, employing
continued fractions instead of the square difference polynomial. Dixon simply
ported these to the square problem, achieving a probabilistic factorization
method working at a computational cost asymptotically better than all others
previously described: \bigO{e^{\beta (\log N \log \log N)^{\rfrac{1}{2}}}} for
some constant $\beta > 0$ \cite{dixon}.
\section{Reduction Procedure}
The following reduction procedure, extracted from ~\cite{morrison-brillhart}, is
the forward part of the Gauss-Jordan elimination algorithm (carried out from
right to left), and can be used to determine whether the set of exponent
vectors is linearly dependent.
For each $v_i$ described as above, associate a \emph{companion history vector}
$h_i = (\beta_0, \beta_1, \ldots, \beta_f)$, where for $0 \leq m \leq f$:
\begin{align*}
\beta_m = \begin{cases}
1 \quad \text{ if $m = i$} \\
0 \quad \text{ otherwise}
\end{cases}
\end{align*}
At this point, we have all the data structures needed:
\bigskip
\begin{center}
\emph{Reduction Procedure}
\end{center}
\begin{enumerate}[(i)]
\item Set $j=r$;
\item find the ``pivot vector'', i.e. the first vector
$e_i, \quad 0 \leq i \leq f$, such that $\alpha_j = 1$. If none is found, go
to (iv);
\item
\begin{enumerate}[(a)]
\item replace every following vector $e_m, \quad i < m \leq f$,
whose rightmost $1$ is the $j$-th component, by the sum $e_i \xor e_m$;
\item whenever $e_m$ is replaced by $e_i \xor e_m$, also replace the
associated history vector $h_m$ with $h_i \xor h_m$;
\end{enumerate}
\item Reduce $j$ by $1$. If $j \geq 0$, return to (ii); otherwise stop.
\end{enumerate}
Algorithm \ref{alg:dixon:kernel} formalizes the concepts discussed so far, by
presenting a function \texttt{ker}, discovering linear dependencies in any
rectangular matrix $\mathcal{M} \in (\mathbb{F}_2)^{(f \times r)}$
and storing dependencies into a \emph{history matrix} $\mathcal{H}$.
\begin{remark}
We are proceeding from right to left in order to conform with
\cite{morrison-brillhart}. Their choice, however, was driven by optimization
concerns which no longer apply to a modern computer.
\end{remark}
\begin{algorithm}
\caption{Reduction Procedure \label{alg:dixon:kernel}}
\begin{algorithmic}[1]
\Procedure{Ker}{$\mathcal{M}$}
\State $\mathcal{H} \gets \texttt{Id}(f)$
\Comment The initial $\mathcal{H}$ is the identity matrix
\For{$j = r \ldots 0$}
\Comment Reduce
\For{$i=0 \ldots f$}
\If{$\mathcal{M}_{i, j} = 1$}
\For{$i' = i+1 \ldots f$}
\If{$\mathcal{M}_{i', j} = 1$}
\State $\mathcal{M}_{i'} = \mathcal{M}_i \xor \mathcal{M}_{i'}$
\State $\mathcal{H}_{i'} = \mathcal{H}_i \xor \mathcal{H}_{i'}$
\EndIf
\EndFor
\State \strong{break}
\EndIf
\EndFor
\EndFor
\For{$i = 0 \ldots f$}
\Comment Yield linear dependencies
\If{$\mathcal{M}_i = (0, \ldots, 0)$} \strong{yield} $\mathcal{H}_i$ \EndIf
\EndFor
\EndProcedure
\end{algorithmic}
\end{algorithm}
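The reduction procedure translates almost mechanically into Python. The sketch
below is our own rendering, under one assumption worth stating: following
\cite{morrison-brillhart}, the pivot for column $j$ is taken as the first row
whose \emph{rightmost} $1$ sits in column $j$, which is the invariant the
replacement rule in step (iii)(a) preserves. Zero rows at the end yield their
history vectors, i.e. the linear dependencies:

```python
def rightmost_one(row):
    """Index of the rightmost 1 in the row, or -1 for the zero row."""
    for j in reversed(range(len(row))):
        if row[j]:
            return j
    return -1

def ker(M):
    """Right-to-left forward elimination over F_2 (Reduction Procedure).
    M is a list of 0/1 rows; yields one history vector per zero row."""
    f = len(M)
    M = [row[:] for row in M]                               # work on a copy
    H = [[int(c == i) for c in range(f)] for i in range(f)] # identity matrix
    r = len(M[0])
    for j in reversed(range(r)):                            # columns, right to left
        pivot = next((i for i in range(f) if rightmost_one(M[i]) == j), None)
        if pivot is None:
            continue
        for m in range(pivot + 1, f):                       # only following rows
            if rightmost_one(M[m]) == j:
                M[m] = [a ^ b for a, b in zip(M[pivot], M[m])]
                H[m] = [a ^ b for a, b in zip(H[pivot], H[m])]
    for i in range(f):
        if not any(M[i]):                                   # zero row: dependency
            yield H[i]
```

On the three toy vectors of the previous example the only dependency is the
whole set, so `ker` yields the all-ones history vector.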
\section{Gluing the shit together}
Before gluing it all together, we need one last building brick necessary for
Dixon's factorization algorithm: a \texttt{smooth}($x$) function. In our
specific case, we need a function that, given as input a number $x$, returns the
empty set $\emptyset$ if $x^2 - N$ is not $\factorBase$-smooth. Otherwise, it
returns the pair $\angular{y, v}$ where $y = x^2 - N$ and
$v = (\alpha_0, \ldots, \alpha_r)$ is the exponent vector we described in
section \ref{sec:dixon:history}. Once we have established $\factorBase$, its
implementation is fairly straightforward:
\begin{algorithm}
\caption{Discovering Smoothness}
\begin{algorithmic}[1]
\Procedure{smooth}{$x$}
\State $y \gets x^2 - N$
\State $t \gets |y|$
\State $v \gets (\alpha_0 = 0, \ldots, \alpha_r = 0)$
\If{$y < 0$} \State $\alpha_0 \gets 1$ \EndIf
\For{$i = 1 \ldots |\factorBase|$}
\While{$\factorBase_i \mid t$}
\State $t \gets t / \factorBase_i$
\State $\alpha_i \gets \alpha_i \xor 1$
\EndWhile
\EndFor
\If{$t = 1$} \State \Return $y, v$
\Else \State \Return $y, \emptyset$
\EndIf
\EndProcedure
\end{algorithmic}
\end{algorithm}
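A direct Python rendering of \texttt{smooth} follows; the modulus and the
factor base are toy values chosen only for the example (note the leading $-1$
entry, whose slot $\alpha_0$ records the sign of $x^2 - N$):

```python
N = 1829                                  # toy modulus, assumption for the example
factor_base = [-1, 2, 3, 5, 7, 11, 13]    # -1 handles negative residues

def smooth(x):
    """Return (y, v) with y = x^2 - N and v its exponent-parity vector
    over factor_base, or (y, None) when y is not factor_base-smooth."""
    y = x * x - N
    t = abs(y)
    v = [0] * len(factor_base)
    if y < 0:
        v[0] = 1                          # sign bit alpha_0
    for i, p in enumerate(factor_base[1:], start=1):
        while t % p == 0:                 # strip p completely, tracking parity
            t //= p
            v[i] ^= 1
    return (y, v) if t == 1 else (y, None)

print(smooth(43))   # 43^2 - 1829 = 20 = 2^2 * 5
```

We return `None` rather than a literal empty set for the non-smooth case; any
sentinel works, as long as the caller can tell the two outcomes apart.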
\paragraph{How do we choose $\factorBase$?}
It is not easy to answer: if we choose $\factorBase$ small, we will rarely find
$x^2 - N$ \emph{smooth}. If we choose it large, attempting to factorize
$x^2 - N$ with $\factorBase$ will pay the price of iterating through a large
set. \cite{Crandall} \S 6.1 finds a solution for this employing complex
analytic number theory. As a result, the ideal smoothness bound is
approximately $e^{\rfrac{1}{2}\sqrt{\ln N \ln \ln N}}$.
\begin{algorithm}
\caption{Dixon}
\begin{algorithmic}
\State $i \gets 0$
\State $f \gets |\factorBase| + 5$
\Comment finding linear dependencies requires redundancy
\While{$i < f$}
\Comment Search for suitable pairs
\State $x_i \gets \texttt{random}(\{0, \ldots, N\})$
\State $y_i, v_i \gets \texttt{smooth}(x_i)$
\If{$v_i \neq \emptyset$} \State $i \gets i + 1$ \EndIf
\EndWhile
\State $\mathcal{M} \gets \texttt{matrix}(v_0, \ldots, v_{f-1})$
\For{$\angular{\lambda_0, \ldots, \lambda_k}
\text{ in } \texttt{ker}(\mathcal{M})$}
\Comment{Get relations}
\State $x \gets \prod_\lambda x_\lambda \pmod{N}$
\State $y \gets \dsqrt{\prod_\lambda y_\lambda} \pmod{N}$
\If{$1 < \gcd(x+y, N) < N$}
\State $p \gets \gcd(x+y, N)$
\State $q \gets \gcd(x-y, N)$
\State \Return $p, q$
\EndIf
\EndFor
\end{algorithmic}
\end{algorithm}
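The whole pipeline fits in one short Python sketch. Everything here is a toy
of our own making: bitmasks stand in for exponent and history vectors, $x$ is
drawn above $\dsqrt{N}$ so residues stay positive and no $-1$ entry is needed,
and the elimination is a plain Gauss-Jordan pass rather than the right-to-left
variant:

```python
from math import gcd, isqrt
from random import randrange

def dixon(N, fb):
    """Toy Dixon factorization.  fb is a small list of primes; x is drawn
    above isqrt(N) so x^2 - N is positive (no sign slot needed)."""
    r = len(fb)
    xs, ys, rows = [], [], []
    while len(xs) < r + 5:                  # a few redundant relations
        x = randrange(isqrt(N) + 1, N)
        y = x * x - N
        t, mask = y, 0
        for i, p in enumerate(fb):          # exponent parities as a bitmask
            while t % p == 0:
                t //= p
                mask ^= 1 << i
        if t == 1:                          # y is fb-smooth: keep the relation
            xs.append(x); ys.append(y); rows.append(mask)
    n = len(rows)
    hist = [1 << i for i in range(n)]       # history vectors as bitmasks
    used = set()
    for j in range(r):                      # Gauss-Jordan over F_2
        pivot = next((i for i in range(n)
                      if i not in used and rows[i] >> j & 1), None)
        if pivot is None:
            continue
        used.add(pivot)
        for m in range(n):
            if m != pivot and rows[m] >> j & 1:
                rows[m] ^= rows[pivot]
                hist[m] ^= hist[pivot]
    for i in range(n):                      # zero rows are linear dependencies
        if rows[i] == 0:
            X, Y = 1, 1
            for k in range(n):
                if hist[i] >> k & 1:
                    X = X * xs[k] % N
                    Y *= ys[k]
            y = isqrt(Y) % N                # Y is a perfect square by construction
            p = gcd(X + y, N)
            if 1 < p < N:                   # skip trivial congruences
                return p, N // p
    return None
```

Since the $x_i$ are random, a relation set may yield only trivial congruences;
callers should simply retry on `None`, which is also why the algorithm collects
a handful of redundant relations.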
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "question_authority"
%%% End: