123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364 |
- %% in homage to sir Isacc Newton
- \chapter{Mathematical Principles\label{chap:preq}}
- In this chapter we formalize the notation used in the rest of the thesis, and
- furthermore attempt to discuss and study the elementary functions on which the
- whole project has been grounded.
- \\
- The $\ll$ and $\gg$ are respectively used with the meaning of left and right
- bitwise shift, as usual in computer science.
- \\
- The $\dsqrt$ function will be defined in section \ref{sec:preq:sqrt}, with the
- acceptation of discrete square root.
- \\
- The logarithmic $\log$ function is assumed to be in base two, i.e. $\log_2$.
- \\
- The $\idiv$ symbol is the integer division over $\naturalN$, i.e.
- $a \idiv b = \floor{\frac{a}{b}}$, as usual in the python language.
- \\
- $\naturalPrime \subset \naturalN$ is the set containing all prime intgers.
- \\
- The binary operator $\getsRandom$, always written as $x \getsRandom S$, has the
- meaning of ``pick a uniformly distributed random element $x$ from the set $S$''.
- % XXX. following Dan Boneh notation
- \\
- The summation in $\mathbb{F}_2$ is always expressed with the circled plus,
- i.e. $a \xor b$.
- %% Since it is equivalent to the bitwise xor, we are going to use
- %% it as well in the pseudocode with the latter meaning.
- %%\section{Number Theory}
- %%What follows here is the definition and the formalization of some intuictive
- %%concepts that later are going to be taken as granted:
- %%the infinite cardinality of $\naturalPrime$,
- %%the definition of \emph{smoothness}, and
- %%the distribution of prime numbers in $\naturalN$.
- \begin{definition*}[Smoothness]
- A number $n$ is said to be $\factorBase$-smooth if and only if all its prime
- factors are contained in $\factorBase$.
- \end{definition*}
- \begin{definition*}[Quadratic Residue]
- An integer $a$ is said to be a \emph{quadratic residue} $\mod n$ if it is
- congruent to a perfect square $\!\mod n$:
- \begin{equation*}
- x^2 \equiv a \pmod{n}
- \end{equation*}
- \end{definition*}
- \begin{definition*}[Legendre Symbol]
- The \emph{Legendre Symbol}, often contracted as $\legendre{a}{p}$ is a
- function of two integers $a$ and $p$ defined as follows:
- \begin{equation*}
- \legendre{a}{p} = \begin{cases}
- 0 & \text{if $a \equiv 0 \pmod{p}$} \\
- 1 & \text{if $a$ is a quadratic residue modulo $p$} \\
- -1 & \text{if $a$ is a non-residue modulo $p$} \\
- \end{cases}
- \end{equation*}
- \end{definition*}
- \vfill
- \section{Algorithmic Complexity Notation}
- The notation used to describe asymptotic complexity follows the $\mathcal{O}$-notation,
- abused under the conventions and limits of MIT's Introduction to Algorithms
- \cite{MITalg}.
- Let \bigO{g} be the asymptotic upper bound of g:
- $$
- \bigO{g(n)} = \{ f(n) : \exists n_0, c \in \naturalN \mid 0 \leq f(n) \leq cg(n)
- \ \forall n > n_0 \}
- $$
- With $f(n) = \bigO{g(n)}$ we actually mean
- $f(n) \in \bigO{g(n)}$.
- Moreover, since the the expression ``running time'' has achieved a certain
- vogue, we shall sometimes use this term as interchangeable with ``complexity'',
- even though imprecise (\cite{Crandall} \S 1.1.4).
- \section{Euclid's Greatest Common Divisor \label{sec:preq:gcd}}
- Being the greatest common divisor a foundamental algebraic operation in the TLS
- protocol, \openssl implemented it with the following signature:
- \begin{minted}[fontsize=\small]{c}
- int BN_gcd(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
- \end{minted}
- The computation proceeds under the well-known Euclidean algorithm, specifically
- the binary variant developed by Josef Stein in 1961 \cite{AOCPv2}. This variant
- exploits some interesting properties of $gcd(a, b)$:
- \begin{enumerate}[(a)]
- \setlength{\itemsep}{1pt}
- \item if $a,\ b$ are even, then $gcd(a, b) = 2gcd(a/2, b/2)$;
- \item if $a$ is even and $b$ is odd, then $gcd(a, b) = gcd(a/2, b)$;
- \item $gcd(a, b) = gcd(a-b, b)$, as in the standard Euclid algorithm;
- \item the sum of two odd numbers is always even.
- \end{enumerate}
- % Donald Knuth, TAOCP, "a binary method", p. 388 VOL 2
- Both \cite{AOCPv2} and \cite{MITalg} analyze the running time of the
- algorithm; \cite{MITalg}'s proof is fairly simpler and proceeds %elegantly
- by induction.
- Anyway, both show that algorithm ~\ref{alg:gcd} belongs to the class
- \bigO{\log b}.
- \begin{algorithm}[H]
- \caption{\openssl's GCD \label{alg:gcd}}
- \begin{algorithmic}[1]
- \Function{gcd}{$a, b$}
- \State $k \gets 0$
- \While{$b \neq 0$}
- \If{$a$ is odd}
- \If{$b$ is odd}
- \Comment by property (c) and (d)
- \State $a \gets (a-b) \gg 1$
- \Else
- \Comment by property (b)
- \State $b \gets b \gg 1$
- \EndIf
- \If{$a < b$} $a, b \gets b, a$ \EndIf
- \Else
- \If{$b$ is odd}
- \Comment by property (b)
- \State $a \gets a \gg 1$
- \If{$a < b$} $a, b \gets b, a$ \EndIf
- \Else
- \Comment by property (a)
- \State $k \gets k+1$
- \State $a, b \gets a \gg 1, b \gg 1$
- \EndIf
- \EndIf
- \EndWhile
- \State \Return $a \ll k$
- \EndFunction
- \end{algorithmic}
- \end{algorithm}
- Unfortunately, there is yet no known parallel solution that significantly improves
- Euclid's \textsc{gcd}.
- \section{Square Root \label{sec:preq:sqrt}}
- Computing the square root is another important building block of the project,
- though not available in \openssl\!.
- Apparently,
- % \openssl is a great pile of crap, as phk states
- \openssl does only provide the discrete square root implementation using the
- Tonelli/Shanks algorithm, which specifically solves in $x$ the equation
- $x^2 = a \pmod{p}$, with $p \in \naturalPrime$:
- \begin{minted}{c}
- BIGNUM* BN_mod_sqrt(BIGNUM* x, const BIGNUM* a, const BIGNUM* p,
- const BN_CTX* ctx);
- \end{minted}
- Instead, we are interested in finding the the pair
- $\angular{x, r} \in \naturalN^2 $ such that $ x^2 + r = n$, that is, the integer
- part of the square root of a natural number and its rest.
- Hence, we did come out with our specific implementation, first using Bombelli's
- algorithm, and later with the one of Dijkstra. Both are going to be discussed
- below.
- Unless otherwise specified, in the later pages we use $\sqrt{n}$ with the
- usual meaning ``the half power of $n$'', while with $x, r = \dsqrt{n}$ we mean
- the pair just defined.
- \paragraph{Bombelli's Algorithm \label{par:preq:sqrt:bombelli}} dates back to
- the XVI century, and approaches the problem of finding the square root by using
- continued fractions. Unfortunately, we weren't able to fully assert the
- correctness of the algorithm, since the original document
- ~\cite{bombelli:algebra} presents a difficult, inconvenient notation. Though,
- for completeness' sake, we report in table
- ~\ref{alg:sqrt:bombelli} the pseudocode adopted and tested for its correctness.
- \begin{algorithm}[H]
- \caption{Square Root: Bombelli's algorithm}
- \label{alg:sqrt:bombelli}
- \begin{algorithmic}[1]
- \Function{sqrt}{$n$}
- \State $i \gets 0; \quad g \gets \{\}$
- \While{$n > 0$}
- \Comment take pairs of digits and store them in $g$
- \State $g_i \gets n \pmod{100}$
- \State $n \gets n // 100$
- \State $i \gets i + 1$
- \EndWhile
- \State $x \gets 0; \quad r \gets 0$
- \For{$j = i-1 \strong{ downto } 0$}
- \State $r = 100r + g_i$
- \Comment take next pair
- \For{$d = 0 \strong{ to } 9$}
- \Comment find gratest multiplier $d$
- \State $y' \gets d(20x + d)$
- \If{$y' > r$} \textbf{break}
- \Else \ \ $y \gets y'$
- \EndIf
- \EndFor
- \State $r \gets r - y$
- \State $x \gets 10x + d - 1$
- \Comment $d$ is the next digit
- \EndFor
- \State \Return $x, r$
- \EndFunction
- \end{algorithmic}
- \end{algorithm}
- For each digit of the result, we perform a subtraction, and a limited number of
- multiplications. This means that the complexity of this solutions belongs to
- \bigO{\log n \log n} = \bigO{\log^2 n}.
- \begin{remark}
- Note that Bombelli actually has found a solution in $x$ for a slightly
- different equation than the one we initially formulated. Specifically, he
- found the pair $\angular{x, r}$ such that $(x+r)^2=a$, where $x$ is the mantissa,
- while $r$ is the decimal part. For our purpose this change is irrelevant: we
- just need to be able to distinguish perfect squares, and thus assert that $r$
- is zero.
- \end{remark}
- \paragraph{Dijkstra's Algorithm \label{par:preq:sqrt:dijkstra}} can be found in
- \cite{Dijkstra:adop}, \S 8, p.61. There, Dijkstra presents an elightning
- process for the computation of the square root, making only use of binary shift
- and algebraic additions.
- Specifically, the problem attempts to find, given a natual $n$, the integer $a$
- that establishes:
- \begin{align}
- \label{eq:preq:dijkstra_problem}
- a^2 \leq n \: \land \: (a+1)^2 > n
- \end{align}
- Take now the pair $\angular{a=0, b=n+1}$, and consider the inverval
- $[a, b[$. We would like to reduce the distance between the upper bound $b$ and
- the lower bound $a$, while holding the guard \ref{eq:preq:dijkstra_problem}:
- \begin{align*}
- a^2 \leq n \: \land \: b > n
- \end{align*}
- %% XXX. I am not so sure about this, pure fantasy.
- The speed of convergence is determined by the choice of the distance $d$, which
- analougously to the dicotomic search problem, is optimal when
- $d = (b-a) \idiv 2$.
- \begin{algorithm}[H]
- \caption{Square Root: an intuitive, na\"ive implementation}
- \label{alg:sqrt:dijkstra_naif}
- \begin{algorithmic}[1]
- \Function{sqrt}{$n$}
- \State $a \gets 0; \quad b \gets n+1$
- \While{$a+1 \neq b$}
- \State $d \gets (b-a) \idiv 2$
- \If{$(a+d)^2 \leq n$} $a \gets a+d$
- \Comment increment left bound
- \ElsIf{$(b-d)^2 > n$} $b \gets b-d$
- \Comment decrement right bound
- \EndIf
- \EndWhile
- \State \Return $a, a^2-n$
- \EndFunction
- \end{algorithmic}
- \end{algorithm}
- % heh, there's not much to explain here, that's almost the same in Dijkstra's
- % book, excluding the inspirative familiar portrait that led to the insight of
- % this change of varaibles.
- Now optimization proceeds with the following change of variables:
- \begin{enumerate}[a)]
- \setlength{\itemsep}{1pt}
- \setlength{\parskip}{0pt}
- \setlength{\parsep}{0pt}
- \item $c = b-a$,
- \item $p = ac$,
- \item $q = c^2$,
- \item $r = n-a^2$;
- \end{enumerate}
- resulting into algorithm \ref{alg:sqrt:dijkstra}.
- For any further details, the reference is still \cite{Dijkstra:adop}.
- \begin{algorithm}[H]
- \caption{Square Root: final version}
- \label{alg:sqrt:dijkstra}
- \begin{algorithmic}[1]
- \Function{sqrt}{$n$}
- \State $p \gets 0; \quad q \gets 1; \quad r \gets n$
- \While{$q \leq n$} $q \gets q \ll 2$ \EndWhile
- \While{$q \neq 1$}
- \State $q \gets q \gg 2$
- \State $h \gets p+q$
- \State $p \gets q \ll 1$
- \State $h \gets 2p + q$
- \If{$r \geq h$}
- \State $p \gets p+q$
- \State $r \gets r-h$ \EndIf
- \EndWhile
- \State \Return $p, r$
- \EndFunction
- \end{algorithmic}
- \end{algorithm}
- A fair approximation of the magnitude of the Dijkstra algorithm can be studied
- by looking at the pseudocode in ~\ref{alg:sqrt:dijkstra_naif}. Exactly as in
- the dicotomic search case, we split the interval $[a, b]$ in half on each step,
- and choose whether to take the leftmost or the rightmost part. This results in
- $log(n+1)$ steps. During each iteration, instead, as we have seen in
- ~\ref{alg:sqrt:dijkstra} we just apply summations and binary shifts, which are
- upper bounded by \bigO{\log{n}/2}. Thus, the order of magnitude belongs to
- \bigO{\log^2{n}}.
- \paragraph{}
- Even if both algorithms presented have \emph{asymptotically} the same
- complexity, we believe that adopting the one of Dijkstra has lead to a
- pragmatic, substantial performance improvement.
- \section{The RSA Cipher \label{sec:preq:rsa}}
- The RSA cryptosystem, invented by Ron Rivest, Adi Shamir, and Len Adleman
- ~\cite{rsa}, was first published in August 1977's issue of
- \emph{Scientific American}. In its basic version, this \emph{asymmetric} cipher
- works as follows:
- \begin{itemize}
- \item choose a pair $\angular{p, q}$ of \emph{random} \emph{prime} numbers;
- let $N$ be the product of the two, $N=pq$, and call it \emph{public modulus};
- \item choose a pair $\angular{e, d}$ of \emph{random} numbers, both in
- $\integerZ^*_{\varphi(N)}$, such that one is the multiplicative inverse of the
- other, $ed \equiv 1 \pmod{\varphi(N)}$ and $\varphi(N)$ is Euler's totient
- function;
- \end{itemize}
- Now, call $\angular{N, e}$ \emph{public key}, and $\angular{N, d}$
- \emph{private key}, and let the encryption function $E(m)$ be the $e$-th power of
- the message $m$:
- \begin{align}
- \label{eq:rsa:encrypt}
- E(m) = m^e \pmod{N}
- \end{align}
- while the decryption function $D(c)$ is the $d$-th power of the ciphertext $c$:
- \begin{align}
- \label{eq:rsa:decrypt}
- D(c) = c^d \equiv E(m)^d \equiv m^{ed} \equiv m \pmod{N}
- \end{align}
- that, due to Fermat's little theorem, is the inverse of $E$.
- \paragraph{}
- %% less unless <https://www.youtube.com/watch?v=XnbnuY7Kxhc>
- From now on, unless otherwise specified, the variable $N=pq$ will always refer
- to the public modulus of a generic RSA keypair, with
- $p, q$ being the two primes factorizing it, such that $p > q$.
- Again, $e, d$ will respectively refer to the public
- exponent and the private exponent.
- %%% Local Variables:
- %%% mode: latex
- %%% TeX-master: "question_authority"
- %%% End:
|