\chapter{Dixon's factorization method\label{chap:dixon}}

Dixon~\cite{dixon} describes a class of ``probabilistic algorithms'' for finding a factor of any composite number, at a sub-exponential cost. The idea is to take random integers $r$ in $\{1, \ldots, N\}$ and look for those such that $r^2 \bmod N$ is \emph{smooth}. If enough of them are found, those integers can be combined, and a factorization of $N$ attempted.
%% that's not really academic to be stated officially, but I would have never
%% understood this section without Firas (thanks).
%%
%% I kept the voila` phrase, that was so lovely.

\section{Interlude \label{sec:dixon:history}}

During the last century there has been a huge effort to approach the problem formulated by Fermat~\ref{eq:fermat_problem} from different perspectives. This led to an entire family of algorithms, such as the \emph{Quadratic Sieve}, \emph{Dixon}'s, \ldots. The core idea is still to find a pair of perfect squares whose difference can factorize $N$, but perhaps Fermat's hypothesis can be weakened.

\paragraph{Kraitchik} was the first to popularize the idea that, instead of looking for integers $\angular{x, y}$ such that $x^2 - y^2 = N$, it is sufficient to look for \emph{multiples} of $N$:
\begin{align}
  x^2 - y^2 \equiv 0 \pmod{N}
\end{align}
and, once such a pair is found, claim that $\gcd(N, x \pm y)$ are possibly non-trivial divisors of $N$, just as we did in \ref{sec:fermat:implementation}.
Kraitchik did not stop here: instead of requiring each attempt to satisfy $x^2 \equiv y^2 \pmod{N}$ on its own, he kept the values of previous attempts, trying to find \emph{a product} of such values which is also a square. So we have a sequence
\begin{align}
  \label{eq:dixon:x_sequence}
  \angular{x_0, \ldots, x_k} \mid \prod_{i \leq k} (x_i^2 - N) \; \text{ is a perfect square}
\end{align}
that is,
\begin{align*}
  \prod_i (x_i^2 - N) = y^2
\end{align*}
which, reduced $\bmod\, N$, is equivalent to:
\begin{align}
  \label{eq:dixon:fermat_revisited}
  y^2 \equiv \prod_i (x_i^2 - N) \equiv \big( \prod_i x_i \big) ^2 \pmod{N}
\end{align}
and voil\`a our congruence of squares (\cite{discretelogs} \S 4). As for the generation of $x_i$ satisfying property \ref{eq:dixon:x_sequence}, candidates can simply be taken at random, factoring each $x_i^2 - N$ by trial division.

\paragraph{Brillhart and Morrison} later proposed (\cite{morrison-brillhart} p.~187) a better approach than trial division to find such $x$. Their idea aims to ease the enormous effort required by the trial division: to achieve this, they introduce a \emph{factor base} $\factorBase$ and generate random $x$ such that $x^2 - N$ is $\factorBase$-smooth. Recalling what we anticipated in~\ref{chap:preq}, $\factorBase$ is a precomputed set of primes $p_i \in \naturalPrime$. This way the cost of testing a new $x$ is dominated by \bigO{|\factorBase|}.
Now that the right side of \ref{eq:dixon:fermat_revisited} can be satisfied, we have to select a subset of those $x$ so that their product forms a perfect square. Consider an \emph{exponent vector} $v_i = (\alpha_0, \alpha_1, \ldots, \alpha_{r-1})$ with $r = |\factorBase| + 1$ associated with each $x_i$, where
\begin{align}
  \label{eq:dixon:alphas}
  \alpha_j = \begin{cases}
    1 \quad \text{if $p_j$ divides $x_i^2 - N$ to an odd power} \\
    0 \quad \text{otherwise}
  \end{cases}
\end{align}
for each $1 \leq j < r$. There is no need to restrict ourselves to positive values of $x^2 - N$, so we are going to use $\alpha_0$ to indicate the sign: $1$ if negative, $0$ otherwise. This benefit has a negligible cost: we have to add the non-prime $-1$ to our factor base $\factorBase$.
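To make the machinery concrete, here is a small numerical illustration; the numbers below form a classic textbook instance and are unrelated to the rest of this work. Take $N = 2041$ and $\factorBase = \{2, 3, 5, 7\}$; the values $x \in \{46, 47, 49, 51\}$ all give smooth $x^2 - N$:
\begin{align*}
  46^2 - N &= 75  = 3 \cdot 5^2            && v = (0, 0, 1, 0, 0) \\
  47^2 - N &= 168 = 2^3 \cdot 3 \cdot 7    && v = (0, 1, 1, 0, 1) \\
  49^2 - N &= 360 = 2^3 \cdot 3^2 \cdot 5  && v = (0, 1, 0, 1, 0) \\
  51^2 - N &= 560 = 2^4 \cdot 5 \cdot 7    && v = (0, 0, 0, 1, 1)
\end{align*}
The four exponent vectors sum to zero over $\mathbb{F}_2$, so the product $75 \cdot 168 \cdot 360 \cdot 560 = (2^5 \cdot 3^2 \cdot 5^2 \cdot 7)^2 = 50400^2$ is a perfect square. With $x = 46 \cdot 47 \cdot 49 \cdot 51 \equiv 311$ and $y = 50400 \equiv 1416 \pmod{N}$, we get $\gcd(311 + 1416, 2041) = 157$, and indeed $2041 = 157 \cdot 13$.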
Now let $M \in \mathbb{F}_2^{(f \times r)}$, for some $f > r$, be the rectangular matrix whose $i$-th row is the $v_i$ associated with $x_i$: this way each matrix element $m_{ij}$ is the $j$-th component of $v_i$. We are interested in finding the subsequences of the $x_i$ whose product contains only even prime powers (\ref{eq:dixon:fermat_revisited}). It turns out that this is equivalent to looking for the set of vectors $\{ w \mid wM = 0 \} = \ker(M)$, by definition of matrix multiplication over $\mathbb{F}_2$.

\paragraph{Dixon} Morrison and Brillhart's ideas in \cite{morrison-brillhart} were actually applied to a slightly different factorization method, employing continued fractions instead of the square difference polynomial. Dixon simply ported them to the square problem, achieving a probabilistic factorization method whose computational cost is asymptotically better than all those previously described: \bigO{\exp \{\beta(\log N \log \log N )^{\rfrac{1}{2}}\}} for some constant $\beta > 0$ \cite{dixon}.

\section{Breaching the kernel}

The following reduction procedure, extracted from~\cite{morrison-brillhart}, is the forward phase of the Gauss-Jordan elimination algorithm (carried out from right to left), and can be used to discover linear dependencies among the exponent vectors. For each $v_i$ described as above, associate a \emph{companion history vector} $h_i = (\beta_0, \beta_1, \ldots, \beta_{f-1})$, where for $0 \leq m < f$:
\begin{align*}
  \beta_m = \begin{cases}
    1 \quad \text{ if $m = i$} \\
    0 \quad \text{ otherwise}
  \end{cases}
\end{align*}
At this point, we have all the data structures we need:

\begin{center}
  \emph{Reduction Procedure}
\end{center}
\begin{enumerate}[(i)]
\item Set $j = r-1$;
\item find the ``pivot vector'', i.e.~the first vector $v_i, \quad 0 \leq i < f$, whose rightmost $1$ is its $j$-th component. If none is found, go to (iv);
\item
  \begin{enumerate}[(a)]
  \item replace every following vector $v_m, \quad i < m < f$, whose rightmost $1$ is also its $j$-th component, by the sum $v_i \xor v_m$;
  \item whenever $v_m$ is replaced by $v_i \xor v_m$, replace also the associated history vector $h_m$ with $h_i \xor h_m$;
  \end{enumerate}
\item reduce $j$ by $1$. If $j \geq 0$, return to (ii); otherwise stop.
\end{enumerate}

Algorithm \ref{alg:dixon:kernel} formalizes the concepts discussed so far, presenting a function \texttt{ker} that discovers linear dependencies in any rectangular matrix $M \in \mathbb{F}_2^{(f \times r)}$ and stores them in a \emph{history matrix} $H$.
\begin{remark}
  We proceed from right to left in order to conform with \cite{morrison-brillhart}. Their choice, though, was dictated by optimization concerns which no longer apply to a modern computer.
\end{remark}
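As a concrete companion to the pseudocode, here is a minimal Python sketch of the same procedure, on a matrix represented as a list of bit rows; the helper name \texttt{rightmost\_one} and the representation are our own choices, not part of \cite{morrison-brillhart}:
\begin{verbatim}
def rightmost_one(row):
    # Index of the rightmost 1 in a list of bits, or -1 for a zero row.
    for j in reversed(range(len(row))):
        if row[j]:
            return j
    return -1

def ker(M):
    # Yield linear dependencies among the rows of M (f rows of r bits),
    # scanning columns right to left as in the Reduction Procedure.
    f = len(M)
    H = [[int(i == j) for j in range(f)] for i in range(f)]  # identity
    for j in reversed(range(len(M[0]))):
        for i in range(f):                 # (ii) find the pivot vector
            if rightmost_one(M[i]) == j:
                for m in range(i + 1, f):  # (iii) reduce what follows
                    if rightmost_one(M[m]) == j:
                        M[m] = [a ^ b for a, b in zip(M[m], M[i])]
                        H[m] = [a ^ b for a, b in zip(H[m], H[i])]
                break
    for i in range(f):
        if not any(M[i]):                  # zero row: a dependency
            yield {mu for mu, bit in enumerate(H[i]) if bit}
\end{verbatim}
Each yielded set indexes a subset of rows of the original matrix summing to zero, i.e.~a candidate relation for \ref{eq:dixon:fermat_revisited}.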
\begin{algorithm}
  \caption{Reduction Procedure \label{alg:dixon:kernel}}
  \begin{algorithmic}[1]
    \Function{Ker}{$M$}
    \State $H \gets \texttt{Id}(f \times f)$ \Comment the initial $H$ is the identity matrix
    \For{$j = r-1 \strong{ downto } 0$} \Comment reduce
      \For{$i=0 \strong{ to } f-1$}
        \If{the rightmost $1$ of $M_i$ is $M_{i, j}$}
          \For{$i' = i+1 \strong{ to } f-1$}
            \If{the rightmost $1$ of $M_{i'}$ is $M_{i', j}$}
              \State $M_{i'} \gets M_i \xor M_{i'}$
              \State $H_{i'} \gets H_i \xor H_{i'}$
            \EndIf
          \EndFor
          \State \strong{break}
        \EndIf
      \EndFor
    \EndFor
    \For{$i = 0 \strong{ to } f-1$} \Comment yield linear dependencies
      \If{$M_i = (0, \ldots, 0)$} \strong{yield} $\{\mu \mid H_{i,\mu} = 1\}$
      \EndIf
    \EndFor
    \EndFunction
  \end{algorithmic}
\end{algorithm}
\begin{remark}
  The \texttt{yield} statement in line $12$ of algorithm \ref{alg:dixon:kernel} has the same semantics as in the Python programming language. It is intended to underline the fact that each $\{\mu \mid H_{i,\mu} = 1\}$ can lead to a solution for \ref{eq:dixon:x_sequence}, and therefore the generation of new dependencies can be performed asynchronously.
\end{remark}

\section{An Implementation Perspective}

Before gluing it all together, we need one last building brick for Dixon's factorization algorithm: a \texttt{smooth}($y$) function. In our specific case, we need a function that, given as input a number $y$ (here $y = x^2 - N$), returns \strong{nil} if $y$ is not $\factorBase$-smooth; otherwise, it returns a vector $v = (\alpha_0, \ldots, \alpha_{r-1})$ whose components are defined just as in \ref{eq:dixon:alphas}. Once $\factorBase$ has been established, its implementation is straightforward.

\paragraph{How do we choose $\factorBase$?} The question is not easy to answer: if we choose $\factorBase$ small, we will rarely find $x^2 - N$ \emph{smooth}; if we choose it large, attempting to factorize $x^2 - N$ over $\factorBase$ will pay the price of iterating through a large set. \cite{Crandall} \S 6.1 finds a solution for this problem by employing complex analytic number theory. As a result, the ideal value for $|\factorBase|$ is $e^{\sqrt{\ln N \ln \ln N}}$.
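The algorithms below take $\factorBase$ as given. As a complement, here is a minimal Python sketch of one way to build it; the exact bound \texttt{B}, shaped after the estimate above, and the \texttt{scale} knob are illustrative assumptions of ours, not a prescription from \cite{Crandall}:
\begin{verbatim}
from math import exp, log

def factor_base(N, scale=1.0):
    # Sieve of Eratosthenes up to a smoothness bound B shaped after
    # the exp(sqrt(ln N ln ln N)) estimate; `scale` is a tuning knob.
    B = max(10, int(scale * exp((log(N) * log(log(N))) ** 0.5)))
    is_prime = bytearray([1]) * (B + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, int(B ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, B + 1, p):
                is_prime[q] = 0
    return [p for p in range(2, B + 1) if is_prime[p]]
\end{verbatim}
A further refinement, not pursued here, would discard the primes $p$ for which $N$ is not a quadratic residue, since such $p$ can never divide $x^2 - N$.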
\begin{algorithm}
  \caption{Discovering Smoothness}
  \begin{algorithmic}[1]
    \Require $\factorBase$, the factor base
    \Function{smooth}{$y$}
    \State $v \gets (\alpha_0 = 0, \ldots, \alpha_{|\factorBase|} = 0)$
    \If{$y < 0$}
      \State $\alpha_0 \gets 1$
      \State $y \gets -y$
    \EndIf
    \For{$i = 1 \strong{ to } |\factorBase|$}
      \While{$\factorBase_i \mid y$}
        \State $y \gets y // \factorBase_i$
        \State $\alpha_i \gets \alpha_i \xor 1$
      \EndWhile
    \EndFor
    \If{$y = 1$}
      \State \Return $v$
    \Else
      \State \Return \strong{nil}
    \EndIf
    \EndFunction
  \end{algorithmic}
\end{algorithm}

\begin{algorithm}
  \caption{Dixon}
  \begin{algorithmic}[1]
    \Require $\factorBase$, the factor base
    \Function{dixon}{\PKArg}
    \State $i \gets 0$
    \State $f \getsRandom \naturalN_{> |\factorBase| + 1}$ \Comment finding a linear dependency requires redundancy: $f > r$
    \While{$i < f$} \Comment search for suitable pairs
      \State $x_i \getsRandom \naturalN_{< N}$
      \State $y_i \gets x_i^2 - N$
      \State $v_i \gets \textsc{smooth}(y_i)$
      \If{$v_i \neq \strong{nil}$} $i \gets i+1$ \EndIf
    \EndWhile
    \State $M \gets \texttt{matrix}(v_0, \ldots, v_{f-1})$
    \For{$\lambda = \{\mu_0, \ldots, \mu_k\} \strong{ in } \textsc{ker}(M)$} \Comment get relations
      \State $x \gets \prod_{\mu \in \lambda} x_\mu \pmod{N}$
      \State $y, r \gets \dsqrt{\prod_{\mu \in \lambda} y_\mu}$ \Comment $r = 0$: the product is a perfect square
      \State $g \gets \gcd(x+y, N)$
      \If{$1 < g < N$}
        \State $p \gets g$
        \State $q \gets N // p$
        \State \Return $p, q$
      \EndIf
    \EndFor
    \EndFunction
  \end{algorithmic}
\end{algorithm}

\paragraph{Parallelism}
Dixon's factorization is ideally suited to parallel implementation. As with other methods like ECM and MPQS, treated in \cite{brent:parallel} \S 6.1, we can \emph{linearly} improve the running time by distributing the discovery of $\factorBase$-smooth numbers across many nodes. Depending on the granularity we desire, and on the number of nodes available, we can even act on the \texttt{ker} function, though less easily. This idea boils down to the same structure we discussed for Wiener's attack: one node, the \emph{producer}, discovers linear dependencies, while the others, the \emph{consumers}, attempt to factorize $N$. Finally, due to the probabilistic nature of this algorithm, we can even think of running multiple instances of the same program. This last solution is fairly effective in proportion to its development cost.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "question_authority"
%%% End: