|
@@ -1,15 +1,21 @@
|
|
|
\chapter{Pollard's $\rho$ factorization method \label{chap:pollardrho}}
|
|
|
|
|
|
-Pollard's $\rho$ factorization method is based on the statistical idea behind
|
|
|
-the birthday paradox. It consists into indentifying a periodically recurrent
|
|
|
-sequence of integers in the ring of remainders with respect to the public
|
|
|
-modulus $N$, and claim that the period $\psi$ is one of the two primes
|
|
|
-factorizing $N$.
|
|
|
-
|
|
|
-\paragraph{Origins of the name} The $\rho$ name is devoted to the graphical
|
|
|
-representation of the algorithm: as we can see in figure ~\ref{fig:pollardrho},
|
|
|
-if we graphically represent the lookup over a graphic
|
|
|
+The \emph{Monte Carlo} factorization method, published by J. M. Pollard in
|
|
|
+~\cite{pollardMC}, consists into identifying a periodically recurrent sequence
|
|
|
+of integers within a random walk $\pmod{N}$ that could leak one of the two
|
|
|
+factors.
|
|
|
|
|
|
+Consider a function $f$ from $\mathcal{S}$ to $\mathcal{S}$, where
|
|
|
+$\mathcal{S} = \{0, 1, \ldots, q-1\}$. Let $s$ be a random element in
|
|
|
+$\mathcal{S}$, and consider the sequence
|
|
|
+\begin{align*}
|
|
|
+ s,\ f(s),\ f(f(s)),\ \ldots
|
|
|
+\end{align*}
|
|
|
+Since $f$ acts over a finite set, it is clear that this sequence must
|
|
|
+eventually repeat, and become cyclic.
|
|
|
+We might diagram it with the letter $\rho$, where the tail represent the
|
|
|
+cyclic part, or \emph{epacts}, and the oval the cyclic part, or
|
|
|
+\emph{period}.
|
|
|
\begin{center}
|
|
|
\begin{tikzpicture}[scale=0.7, thick]
|
|
|
\tikzstyle{every node}=[draw,circle,fill=white,minimum size=4pt,
|
|
@@ -33,31 +39,203 @@ if we graphically represent the lookup over a graphic
|
|
|
%%\draw [decorate,decoration={brace, raise=1.5cm}] (1) -- (3)
|
|
|
%%node[draw=no] at (-1.5, 4) {tail};
|
|
|
\draw [decorate,decoration={brace, raise=3cm}] (5) -- (7)
|
|
|
- node[draw=none] at (13, 7) {\footnotesize {periodic sequence}};
|
|
|
+ node[draw=none] at (13, 7) {\footnotesize {period $\pmod{q}$}};
|
|
|
|
|
|
\end{tikzpicture}
|
|
|
\end{center}
|
|
|
|
|
|
+Now, consider $N=pq$.
|
|
|
+Let $F(x)$ be any function generating pseudorandom numbers $\angular{x_1, x_2, \ldots}$,
|
|
|
+and let $f(x) = F(x) \pmod{q}$.
|
|
|
+As we said above, without any luck, there will be a pair $\angular{x_i, x_j}$
|
|
|
+generated by $F$ such that $x_i \equiv x_j \pmod{q}$, but $x_i \neq x_j$.
|
|
|
+
|
|
|
+Therefore, in order to factorize $N$, we proceed as follows: starting from a
|
|
|
+random $s$, we iteratively apply $F$ reduced modulo $N$. Whenever we find a
|
|
|
+period, if $\gcd(x_i - x_j, N) > 1$ then we found a non-trivial
|
|
|
+factor of $N$.
|
|
|
+
|
|
|
+\paragraph{Choosing the function} Ideally, $F$ should be easily computable, but
|
|
|
+at the same time random enough to reduce as much as possible the epacts
|
|
|
+~\cite{Crandall} \S 5.2.1. Any quadratic function $F(x) = x^2 + b$ should be
|
|
|
+enough, provided that $b \in \naturalN \setminus \{0, 2\}$ \footnote{
|
|
|
+ Note that this has been only empirically verified, and so far not been proved
|
|
|
+ (~\cite{riesel}, p. 177)}.
|
|
|
+For example, ~\cite{pollardMC} uses $x^2 -1$, meanwhile we are going to choose
|
|
|
+$F(x) = x^2 + 1$.
|
|
|
+
|
|
|
+\paragraph{Finding the period} In \cite{AOCPv2} \S 3.1, Knuth gives a simple and
|
|
|
+elegant algorithm, attributed to Floyd, for finding a multiple of the
|
|
|
+period. This algorithm is the same one finally adopted by Pollard in
|
|
|
+~\cite{pollardMC}.
|
|
|
+
|
|
|
+\begin{theorem*}
|
|
|
+Given an \emph{ultimately periodic} sequence, in the sense that there exists
|
|
|
+numbers $\lambda$ and $\mu$ for which the values:
|
|
|
+\begin{align*}
|
|
|
+ X_0, X_1, \ldots, X_{\mu}, \ldots, X_{\mu + \lambda - 1}
|
|
|
+\end{align*}
|
|
|
+are distinct, but $X_{n+\lambda} = X_n$ when $n \geq \mu$,
|
|
|
+then there exists an
|
|
|
+$\mu < n < \mu + \lambda$ such that $X_n = X_{2n}$.
|
|
|
+\end{theorem*}
|
|
|
|
|
|
-\paragraph{A more rigourous description}
|
|
|
\begin{proof}
|
|
|
+ First, if $X_n = X_{2n}$, then the sequence is obviously periodic from
|
|
|
+ $X_{2n}$ onward, possibly even earlier.
|
|
|
+ Conversely, $X_n = X_m \quad (n \geq \mu)$ for
|
|
|
+ $m = n + k\lambda, \quad k \in \naturalN$. Hence, there will eventually
|
|
|
+ be an $n$ such that $X_n = X_{2n}$ if and only if $n - \mu$ is a multiple of
|
|
|
+ $\lambda$.
|
|
|
+ The first such value happens for $n = (\lambda + 1)\floor{\rfrac{\mu}{\lambda}}$.
|
|
|
+\end{proof}
|
|
|
+
|
|
|
+The immediate consequence of this is that we can find the period $q$ simply by
|
|
|
+checking $\gcd(x_{2i} - x_i, N)$ for incremental $i$-s.
|
|
|
+
|
|
|
+\paragraph{Brent's Improvement} In 1979, Brent discovered an entire family of
|
|
|
+cycle-finding algorithms whose optimal version resulted to be 36\% faster than
|
|
|
+Floyd's one \cite{pollard-brent}.
|
|
|
+Instead of looking for the period of the sequence using $x_{2i} - x_i$, Brent
|
|
|
+considers
|
|
|
+$\abs{x_j - x_{2^k}}$ for $ 3 \cdot 2^{k-1} < j \leq 2^{k+1}$, resulting in
|
|
|
+fewer operations required by the algorithm. Pragmatically, this boils down to
|
|
|
+confronting:
|
|
|
+
|
|
|
+\medskip
|
|
|
+\begin{tabular}{l@{\hskip 40pt} l@{\hskip 50pt} l}
|
|
|
+ $k = 0$ & $j \in \{1+1 \ldots 2\}$ & $\abs{x_1 - x_2}$ \\
|
|
|
+ $k = 1$ & $j \in \{3+1 \ldots 4\}$ & $\abs{x_2 - x_4}$ \\
|
|
|
+ $k = 2$ & $j \in \{6+1 \ldots 8\}$ & $\abs{x_4 - x_7}$, $\abs{x_4 - x_8}$ \\
|
|
|
+ $k = 3$ & $j \in \{12+1, \ldots 16\}$ &
|
|
|
+ $\abs{x_8 - x_{13}}$, $\abs{x_8 -x_{14}}$, $\ldots$, $|x_8 - x_{16}|$\\
|
|
|
+ $k = 4$ & $j \in \{24+1, \ldots 32 \}$ &
|
|
|
+ $\abs{x_{16} - x_{25}}$, $\ldots$, $\abs{x_{16} - x_{32}}$ \\[2pt]
|
|
|
+ $\quad \vdots$ & $\quad \vdots$ & $\quad \quad \vdots$ \\
|
|
|
+ %\multicolumn{1}{c}{$\vdots$} &
|
|
|
+ %\multicolumn{1}{c}{$\vdots$} &
|
|
|
+ %\multicolumn{1}{c}{$\vdots$} \\
|
|
|
+\end{tabular}
|
|
|
+
|
|
|
+
|
|
|
+A Pollard's $\rho$ variant that implements Brent's cycle-finding algorithm
|
|
|
+instead of Floyd's one runs around 25\% faster on average
|
|
|
+~\cite{pollard-brent}.
|
|
|
+
|
|
|
+\section{Complexity}
|
|
|
+\cite{riesel} presents a nice demonstration of the \emph{average} complexity of
|
|
|
+this algorithm, based on the birthday paradox.
|
|
|
+\newtheorem*{birthday}{The Birthday Paradox}
|
|
|
+\begin{birthday}
|
|
|
+ How many persons needs to be selected at random in order that the probability
|
|
|
+ of at least two of them having the same birthday exceeds $\rfrac{1}{2}$?
|
|
|
+\end{birthday}
|
|
|
+
|
|
|
+\begin{proof}[Solution]
|
|
|
+ The probability that $\epsilon$ different persons have different birthdays is:
|
|
|
+ \begin{align*}
|
|
|
+ \Big(1 - \frac{1}{365}\Big)
|
|
|
+ \Big(1 - \frac{2}{365}\Big)
|
|
|
+ \Big(1 - \frac{3}{365}\Big)
|
|
|
+ \cdots
|
|
|
+ \Big(1 - \frac{\epsilon -1}{365}\Big)
|
|
|
+ =
|
|
|
+ \frac{365!}{365^\epsilon (365-\epsilon)!}
|
|
|
+ \end{align*}
|
|
|
+ This expression becomes $< \rfrac{1}{2}$ for $\epsilon \geq 23$.
|
|
|
\end{proof}
|
|
|
|
|
|
+We can obviously substitute the $365$ with any set cardinality $\zeta$
|
|
|
+to express the probability that a random function from $\integerZ_{|\epsilon}$
|
|
|
+to $\integerZ_{|\zeta}$ is injective. Back to our particular case,
|
|
|
+we want to answer the question:
|
|
|
+
|
|
|
+\emph{
|
|
|
+ How many random numbers do we have to run through before finding at least
|
|
|
+ two integers equivalent $\mod{q}$?
|
|
|
+}
|
|
|
+
|
|
|
+
|
|
|
+Using the same reasoning presented above over the previously defined function
|
|
|
+$f(x): \mathcal{S} \to \mathcal{S}$, we will discover that after
|
|
|
+$\approx 1.18 \sqrt{q}$ steps the probability to have fallen inside the period
|
|
|
+is $\rfrac{1}{2}$. %% is it clear that q is either one of the two primes, and
|
|
|
+ %% that here we want to examinate only a portion of the
|
|
|
+ %% domain?
|
|
|
+Since any of the two primes factoring $N$ is bounded above by $\sqrt{N}$, we
|
|
|
+will find a periodic sequence, and thus a factor, in time \bigO{\sqrt[4]{N}}.
|
|
|
+
|
|
|
+
|
|
|
\section{A Computer program for Pollard's $\rho$ method}
|
|
|
|
|
|
-Using the same trick we saw in section ~\ref{sec:pollard-1:implementing}, we
|
|
|
-chose to apply occasionally Euclid's algorithm by computing the accumulated
|
|
|
-product; algorithm ~\ref{alg:pollardrho} outlines what we have so far discussed,
|
|
|
-considering also the pascal transcript present in ~\cite{riesel} \S 5.
|
|
|
+The initial algorithm described by Pollard \cite{pollardMC} and consultable
|
|
|
+immediately below, looks for the pair $\angular{x_i, x_{2i}}$ such that
|
|
|
+$\gcd(x_{2i} - x_i, N) > 1$. This is achieved by keeping two variables $x, y$
|
|
|
+and respectively updating them via $x \gets f(x)$ and $y \gets f(f(y))$.
|
|
|
|
|
|
\begin{algorithm}
|
|
|
- \caption{Pollard's $\rho$ factorization \label{alg:pollardrho}}
|
|
|
+ \caption{Pollard's $\rho$ factorization}
|
|
|
\begin{algorithmic}[1]
|
|
|
- \State $a \getsRandom \naturalN \setminus \{0, 2\}$
|
|
|
\State $x \getsRandom \naturalN$
|
|
|
\State $y \gets x$
|
|
|
+ \State $g \gets 1$
|
|
|
+ \While{$g = 1$}
|
|
|
+ \State $x \gets x^2 + 1 \pmod{N}$
|
|
|
+ \State $y \gets y^4 + y^2 \ll 1 + 2 \pmod{N}$
|
|
|
+ \State $g \gets gcd(x, y)$
|
|
|
+ \EndWhile
|
|
|
+ \Return $g$
|
|
|
\end{algorithmic}
|
|
|
\end{algorithm}
|
|
|
+
|
|
|
+\begin{remark}
|
|
|
+ It is intresting to see how in its basic version, Pollard's $\rho$
|
|
|
+ method just needs 3 variables are to preserve the
|
|
|
+ state. This places it among the most parsimonious factorization algorithms in
|
|
|
+ terms of memory footprint.
|
|
|
+\end{remark}
|
|
|
+
|
|
|
+An immediate improvement of this algorithm would be to occasionally compute Euclid's
|
|
|
+algorithm over the accumulated product to save some computation cycles, just as
|
|
|
+we saw in section~\ref{sec:pollard-1:implementing}. The next code fragment
|
|
|
+adopts this trick together with Brent's cycle-finding variant:
|
|
|
+
|
|
|
+\begin{algorithm}
|
|
|
+ \caption{Pollard-Brent's factorization \label{alg:pollardrho}}
|
|
|
+ \begin{algorithmic}[1]
|
|
|
+ \State $s \gets 100$
|
|
|
+ \Comment steps to check for $\gcd$
|
|
|
+ \State $i \gets 1; \quad j' \gets j \gets 1$
|
|
|
+ \Comment indices for our $x$s
|
|
|
+ \State $x' \gets x \getsRandom \naturalN$
|
|
|
+ \Comment The $x_i$ discussed above
|
|
|
+ \State $y' \gets y \getsRandom x^2 + 1$
|
|
|
+ \Comment The $x_j$ discussed above
|
|
|
+ \State $k \gets 0; \quad q \gets \abs{x-y}$
|
|
|
+ \While{$g = 1$}
|
|
|
+ \State $x \gets y$ \Comment $x_i = 2^k$
|
|
|
+ \State $j \gets 3 \cdot 2^{(k++)} + 1$
|
|
|
+ \While{$j++ \leq 2^k$}
|
|
|
+ \State $y \gets y^2 + 1 \pmod{N}$
|
|
|
+ \State $q \gets q \cdot \abs{x - y}$
|
|
|
+ \If{$i++ \mid s$} \Comment Time to compute $\gcd$?
|
|
|
+ \State $g \gets \gcd(q, N)$
|
|
|
+ \If{$g > 1$} \Return g \EndIf
|
|
|
+ \If{$g = 0$}
|
|
|
+ \Comment Too far: fall back to latest epoch
|
|
|
+ \State $s \gets 1; \quad g \gets 1$
|
|
|
+ \State $j \gets j'; \quad x \gets x'; \quad y \gets y'$
|
|
|
+ \Else
|
|
|
+ \Comment Save current state
|
|
|
+ \State $x' \gets x; \quad y' \gets y$
|
|
|
+ \State $j' \gets j$
|
|
|
+ \EndIf
|
|
|
+ \EndIf
|
|
|
+ \EndWhile
|
|
|
+ \EndWhile
|
|
|
+ \end{algorithmic}
|
|
|
+\end{algorithm}
|
|
|
+
|
|
|
%%% Local Variables:
|
|
|
%%% mode: latex
|
|
|
%%% TeX-master: "question_authority"
|