
Ultimating pollard-ρ, and writing. Now waiting for review.

Michele Orrù 11 years ago
parent
commit
023bdd68f1
3 changed files with 237 additions and 22 deletions
  1. 37 5
      book/library.bib
  2. 195 17
      book/pollardrho.tex
  3. 5 0
      book/question_authority.tex

+ 37 - 5
book/library.bib

@@ -36,8 +36,8 @@
 }
 
 @misc{rfc4158,
-  title = {Certification Path Building},
-  author = {Cooper et al.},
+  title = {RFC 4158: Certification Path Building},
+  author = {M. Cooper and Y. Dzambasow and P. Hesse and S. Joseph and R. Nicholas},
   publisher = {RFC Editor},
   url = {http://tools.ietf.org/html/rfc4158}
 }
@@ -89,9 +89,12 @@
 }
 
 @book{Crandall,
-    author = {Richard Crandall and Carl Pomerance and Richard Crandall and Carl Pomerance},
-    title = {Prime numbers: a computational perspective. Second Edition},
-    year = {2005}
+  author = {Richard Crandall and Carl Pomerance},
+  title = {Prime numbers: a computational perspective. Second Edition},
+  year = {2005},
+  isbn = {0-8176-3291-3},
+  publisher = {Birkh{\"a}user Boston Inc.},
+  address = {Cambridge, MA, USA}
 }
 
 @article{wiener,
@@ -120,6 +123,20 @@
  url = {http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=2074504}
 }
 
+@article{pollardMC,
+  year={1975},
+  issn={0006-3835},
+  journal={BIT Numerical Mathematics},
+  volume={15},
+  number={3},
+  doi={10.1007/BF01933667},
+  title={A {M}onte {C}arlo method for factorization},
+  url={http://dx.doi.org/10.1007/BF01933667},
+  publisher={Kluwer Academic Publishers},
+  author={Pollard, J.M.},
+  pages={331-334},
+  language={English}
+}
 
 @article{Williams:p+1,
   title = {A $p + 1$ Method of Factoring},
@@ -148,6 +165,21 @@
   year = 1981
 }
 
+
+@article{pollard-brent,
+  title = {An improved Monte Carlo Factorization algorithm},
+  author = {Richard P. Brent},
+  year=1980,
+  issn={0006-3835},
+  journal={BIT Numerical Mathematics},
+  volume=20,
+  number=2,
+  url={http://dx.doi.org/10.1007/BF01933190},
+  publisher={Kluwer Academic Publishers},
+  pages={176-184},
+  language={English}
+}
+
 @article{rsa,
  author = {Rivest, R. L. and Shamir, A. and Adleman, L.},
  title = {A Method for Obtaining Digital Signatures and Public-key Cryptosystems},

+ 195 - 17
book/pollardrho.tex

@@ -1,15 +1,21 @@
 \chapter{Pollard's $\rho$ factorization method \label{chap:pollardrho}}
 
-Pollard's $\rho$ factorization method is based on the statistical idea behind
-the birthday paradox. It consists into indentifying a periodically recurrent
-sequence of integers in the ring of remainders with respect to the public
-modulus $N$, and claim that the period $\psi$ is one of the two primes
-factorizing $N$.
-
-\paragraph{Origins of the name} The $\rho$ name is devoted to the graphical
-representation of the algorithm: as we can see in figure ~\ref{fig:pollardrho},
-if we graphically represent the lookup over a graphic
+The \emph{Monte Carlo} factorization method, published by J. M. Pollard
+in~\cite{pollardMC}, consists in identifying a periodically recurrent sequence
+of integers within a random walk $\pmod{N}$ that could leak one of the two
+factors.
 
+Consider a function $f$ from $\mathcal{S}$ to $\mathcal{S}$, where
+$\mathcal{S} = \{0, 1, \ldots, q-1\}$. Let $s$ be a random element in
+$\mathcal{S}$, and consider the sequence
+\begin{align*}
+  s,\ f(s),\ f(f(s)),\ \ldots
+\end{align*}
+Since $f$ acts over a finite set, it is clear that this sequence must
+eventually repeat, and become cyclic.
+We might diagram it with the letter $\rho$, where the tail represents the
+aperiodic part, or \emph{epact}, and the oval the cyclic part, or
+\emph{period}.
 \begin{center}
   \begin{tikzpicture}[scale=0.7, thick]
     \tikzstyle{every node}=[draw,circle,fill=white,minimum size=4pt,
@@ -33,31 +39,203 @@ if we graphically represent the lookup over a graphic
     %%\draw [decorate,decoration={brace, raise=1.5cm}] (1) -- (3)
     %%node[draw=no] at (-1.5, 4) {tail};
     \draw [decorate,decoration={brace, raise=3cm}] (5) -- (7)
-    node[draw=none] at (13, 7) {\footnotesize {periodic sequence}};
+    node[draw=none] at (13, 7) {\footnotesize {period $\pmod{q}$}};
 
 \end{tikzpicture}
 \end{center}
 
+Now, consider $N=pq$.
+Let $F(x)$ be any function generating pseudorandom numbers $\angular{x_1, x_2, \ldots}$,
+and let $f(x) = F(x) \pmod{q}$.
+As we said above, sooner or later there will be a pair $\angular{x_i, x_j}$
+generated by $F$ such that $x_i \equiv x_j \pmod{q}$ and, with any luck, $x_i \neq x_j$.
+
+Therefore, in order to factorize $N$, we proceed as follows: starting from a
+random $s$, we iteratively apply $F$ reduced modulo $N$. Whenever we find a
+pair with $1 < \gcd(x_i - x_j, N) < N$, we have found a non-trivial
+factor of $N$.
+
+\paragraph{Choosing the function} Ideally, $F$ should be easily computable, but
+at the same time random enough to keep the epact as short as possible
+(\cite{Crandall}, \S 5.2.1). Any quadratic function $F(x) = x^2 + b$ should be
+enough, provided that $b \in \integerZ \setminus \{0, -2\}$\footnote{
+  Note that this has only been verified empirically, and has so far not been
+  proved (\cite{riesel}, p. 177).}.
+For example, \cite{pollardMC} uses $x^2 - 1$, whereas we are going to choose
+$F(x) = x^2 + 1$.
+
+\paragraph{Finding the period} In \cite{AOCPv2} \S 3.1, Knuth gives a simple and
+elegant algorithm, attributed to Floyd, for finding a multiple of the
+period. This algorithm is the same one finally adopted by Pollard in
+~\cite{pollardMC}.
+
+\begin{theorem*}
+Given an \emph{ultimately periodic} sequence, in the sense that there exist
+numbers $\lambda$ and $\mu$ for which the values:
+\begin{align*}
+  X_0, X_1, \ldots, X_{\mu}, \ldots, X_{\mu + \lambda - 1}
+\end{align*}
+are distinct, but $X_{n+\lambda} = X_n$ when $n \geq \mu$,
+then there exists an $n > 0$ with
+$\mu \leq n \leq \mu + \lambda$ such that $X_n = X_{2n}$.
+\end{theorem*}
 
-\paragraph{A more rigourous description}
 \begin{proof}
+  First, if $X_n = X_{2n}$ for some $n > 0$, then the sequence is obviously
+  periodic from $X_n$ onward, possibly even earlier.
+  Conversely, $X_n = X_m \quad (n \geq \mu)$ for
+  $m = n + k\lambda, \quad k \in \naturalN$. Hence, $X_n = X_{2n}$ holds
+  if and only if $n \geq \mu$ and $n$ is a multiple of
+  $\lambda$.
+  The first such value is $n = \lambda \max(1, \lceil \rfrac{\mu}{\lambda} \rceil)$.
+\end{proof}
+
+The immediate consequence of this is that we can find the factor $q$ simply by
+checking $\gcd(x_{2i} - x_i, N)$ for increasing values of $i$.
+
+\paragraph{Brent's Improvement} In 1979, Brent discovered an entire family of
+cycle-finding algorithms whose optimal version turned out to be about 36\%
+faster than Floyd's \cite{pollard-brent}.
+Instead of looking for the period of the sequence using $x_{2i} - x_i$, Brent
+considers
+$\abs{x_j - x_{2^k}}$ for $3 \cdot 2^{k-1} < j \leq 2^{k+1}$, resulting in
+fewer operations required by the algorithm. In practice, this boils down to
+comparing:
+
+\medskip
+\begin{tabular}{l@{\hskip 40pt} l@{\hskip 50pt} l}
+  $k = 0$ & $j \in \{1+1, \ldots, 2\}$ & $\abs{x_1 - x_2}$ \\
+  $k = 1$ & $j \in \{3+1, \ldots, 4\}$ & $\abs{x_2 - x_4}$ \\
+  $k = 2$ & $j \in \{6+1, \ldots, 8\}$ & $\abs{x_4 - x_7}$, $\abs{x_4 - x_8}$ \\
+  $k = 3$ & $j \in \{12+1, \ldots, 16\}$ &
+            $\abs{x_8 - x_{13}}$, $\abs{x_8 - x_{14}}$, $\ldots$, $\abs{x_8 - x_{16}}$ \\
+  $k = 4$ & $j \in \{24+1, \ldots, 32\}$ &
+            $\abs{x_{16} - x_{25}}$, $\ldots$, $\abs{x_{16} - x_{32}}$ \\[2pt]
+  $\quad \vdots$ & $\quad \vdots$ & $\quad \quad \vdots$ \\
+  %\multicolumn{1}{c}{$\vdots$} &
+  %\multicolumn{1}{c}{$\vdots$} &
+  %\multicolumn{1}{c}{$\vdots$} \\
+\end{tabular}
+
+
+A Pollard's $\rho$ variant that implements Brent's cycle-finding algorithm
+instead of Floyd's runs about 24\% faster on average
+\cite{pollard-brent}.
+
+\section{Complexity}
+\cite{riesel} presents a nice derivation of the \emph{average} complexity of
+this algorithm, based on the birthday paradox.
+\newtheorem*{birthday}{The Birthday Paradox}
+\begin{birthday}
+  How many people need to be selected at random in order that the probability
+  of at least two of them having the same birthday exceeds $\rfrac{1}{2}$?
+\end{birthday}
+
+\begin{proof}[Solution]
+  The probability that $\epsilon$ different persons have different birthdays is:
+  \begin{align*}
+    \Big(1 - \frac{1}{365}\Big)
+    \Big(1 - \frac{2}{365}\Big)
+    \Big(1 - \frac{3}{365}\Big)
+    \cdots
+    \Big(1 - \frac{\epsilon -1}{365}\Big)
+    =
+    \frac{365!}{365^\epsilon (365-\epsilon)!}
+  \end{align*}
+  This expression becomes $< \rfrac{1}{2}$ for $\epsilon \geq 23$.
 \end{proof}
 
+We can obviously substitute the $365$ with any set cardinality $\zeta$
+to express the probability that a random function from $\integerZ_{|\epsilon}$
+to $\integerZ_{|\zeta}$ is injective. Back to our particular case,
+we want to answer the question:
+
+\emph{
+  How many random numbers do we have to run through before finding at least
+  two integers congruent $\pmod{q}$?
+}
+
+
+Using the same reasoning presented above over the previously defined function
+$f(x): \mathcal{S} \to \mathcal{S}$, we will discover that after
+$\approx 1.18 \sqrt{q}$ steps the probability to have fallen inside the period
+is $\rfrac{1}{2}$. %% is it clear that q is either one of the two primes, and
+                   %% that here we want to examinate only a portion of the
+                   %% domain?
+Since the smaller of the two primes factoring $N$ is bounded above by $\sqrt{N}$,
+we expect to find a periodic sequence, and thus a factor, in time \bigO{\sqrt[4]{N}}.
+
+
 \section{A Computer program for Pollard's $\rho$ method}
 
-Using the same trick we saw in section ~\ref{sec:pollard-1:implementing},  we
-chose to apply occasionally Euclid's algorithm by computing the accumulated
-product; algorithm ~\ref{alg:pollardrho} outlines what we have so far discussed,
-considering also the pascal transcript present in ~\cite{riesel} \S 5.
+The original algorithm described by Pollard \cite{pollardMC}, shown
+immediately below, looks for the pair $\angular{x_i, x_{2i}}$ such that
+$\gcd(x_{2i} - x_i, N) > 1$.  This is achieved by keeping two variables $x, y$
+and respectively updating them via $x \gets F(x)$ and $y \gets F(F(y))$, both reduced modulo $N$.
 
 \begin{algorithm}
-  \caption{Pollard's $\rho$ factorization \label{alg:pollardrho}}
+  \caption{Pollard's $\rho$ factorization}
   \begin{algorithmic}[1]
-    \State $a \getsRandom \naturalN \setminus \{0, 2\}$
     \State $x \getsRandom \naturalN$
     \State $y \gets x$
+    \State $g \gets 1$
+    \While{$g = 1$}
+      \State $x \gets x^2 + 1 \pmod{N}$
+      \State $y \gets (y^2 + 1)^2 + 1 \pmod{N}$
+      \State $g \gets \gcd(\abs{x - y}, N)$
+    \EndWhile
+    \State \Return $g$
   \end{algorithmic}
 \end{algorithm}
+
+\begin{remark}
+  It is interesting to see how, in its basic version, Pollard's $\rho$
+  method needs just 3 variables to preserve its
+  state. This places it among the most parsimonious factorization algorithms in
+  terms of memory footprint.
+\end{remark}
+
+An immediate improvement of this algorithm is to run Euclid's algorithm only
+occasionally, over the accumulated product, to save some computation cycles, just as
+we saw in section~\ref{sec:pollard-1:implementing}. The next code fragment
+adopts this trick together with Brent's cycle-finding variant:
+
+\begin{algorithm}
+  \caption{Pollard-Brent's factorization \label{alg:pollardrho}}
+  \begin{algorithmic}[1]
+    \State $s \gets 100$
+    \Comment steps between two $\gcd$ computations
+    \State $y \getsRandom \naturalN; \quad y \gets y^2 + 1 \pmod{N}$
+    \Comment $y$ holds the $x_j$ discussed above, starting from $x_1$
+    \State $x \gets y; \quad j \gets 1; \quad k \gets 0$
+    \Comment $x$ holds $x_{2^k}$
+    \State $x' \gets x; \quad y' \gets y; \quad j' \gets j; \quad k' \gets k$
+    \Comment latest saved epoch
+    \State $q \gets 1$
+    \Loop
+      \If{$j = 2^{k+1}$}
+        \State $k \gets k + 1; \quad x \gets y$ \Comment $x = x_{2^k}$
+      \EndIf
+      \State $j \gets j + 1; \quad y \gets y^2 + 1 \pmod{N}$
+      \If{$j > 3 \cdot 2^{k-1}$}
+        \State $q \gets q \cdot \abs{x - y} \pmod{N}$
+      \EndIf
+      \If{$s \mid j$} \Comment Time to compute $\gcd$?
+        \State $g \gets \gcd(q, N)$
+        \If{$1 < g < N$} \Return $g$ \EndIf
+        \If{$g = N$ \textbf{and} $s = 1$} \Return failure \EndIf
+        \If{$g = N$}
+          \Comment Too far: fall back to latest epoch
+          \State $s \gets 1; \quad q \gets 1$
+          \State $j \gets j'; \quad k \gets k'; \quad x \gets x'; \quad y \gets y'$
+        \Else
+          \Comment Save current state
+          \State $x' \gets x; \quad y' \gets y$
+          \State $j' \gets j; \quad k' \gets k$
+        \EndIf
+      \EndIf
+    \EndLoop
+  \end{algorithmic}
+\end{algorithm}
+
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: "question_authority"
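
The two algorithms added to book/pollardrho.tex above can be sketched in Python as follows. This is a minimal sketch, not code from the commit: the function names `rho_floyd` and `rho_brent`, the parameters `seed`, `c` and `batch`, and the convention of returning `None` on failure are all assumptions made for illustration. The walk is $F(x) = x^2 + c \pmod{N}$ with $c = 1$ as chosen in the chapter, and the Brent variant compares $x_{2^k}$ with $x_j$ for $3 \cdot 2^{k-1} < j \leq 2^{k+1}$, taking one gcd per batch of accumulated differences.

```python
from math import gcd


def rho_floyd(n, seed=2, c=1):
    """Floyd cycle-finding: x moves one step per round, y moves two."""
    x = y = seed % n
    g = 1
    while g == 1:
        x = (x * x + c) % n
        y = (y * y + c) % n
        y = (y * y + c) % n
        g = gcd(abs(x - y), n)
    # g == n means the walk closed modulo n itself: no factor found.
    return g if g < n else None


def rho_brent(n, seed=2, c=1, batch=100):
    """Brent cycle-finding with the gcd taken over an accumulated product."""
    y = (seed * seed + c) % n      # y holds x_j, starting from x_1
    r = 1                          # r = 2^k
    while True:
        x = y                      # x holds x_{2^k}
        for _ in range(r // 2):    # skip indices 2^k < j <= 3 * 2^(k-1)
            y = (y * y + c) % n
        remaining = r - r // 2     # compare for 3 * 2^(k-1) < j <= 2^(k+1)
        while remaining > 0:
            saved, steps, q = y, min(batch, remaining), 1
            for _ in range(steps):
                y = (y * y + c) % n
                q = q * abs(x - y) % n
            g = gcd(q, n)
            if g == n:
                # Too far: replay this batch one step at a time.
                y = saved
                for _ in range(steps):
                    y = (y * y + c) % n
                    g = gcd(abs(x - y), n)
                    if g > 1:
                        break
            if g > 1:
                return g if g < n else None
            remaining -= steps
        r *= 2
```

For instance, with $N = 8051 = 83 \cdot 97$ and the default parameters, both `rho_floyd(8051)` and `rho_brent(8051)` return 97. When `None` comes back, a caller would typically retry with a different `seed` or `c`.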

+ 5 - 0
book/question_authority.tex

@@ -56,6 +56,11 @@
 \newcommand{\rfrac}[2]{{}^{#1}\!/_{#2}}
 \newcommand{\getsRandom}{\xleftarrow{r}}
 
+
+\theoremstyle{plain}
+\newtheorem*{theorem*}{Theorem}
+\newtheorem*{definition*}{Definition}
+
 \makeindex
 \let\origdoublepage\cleardoublepage
 \newcommand{\clearemptydoublepage}{%