math_prequisites.tex 9.9 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275
  1. \chapter{Mathematical prequisites \label{chap:preq}}
  2. This chapter formalizes the notation used in the rest of the thesis, and
  3. moreover attempts to discuss and study the elementary functions on which the
  4. project has been grounded. For example, the $\ll$ and $\gg$ are respectively
  5. used with the meaning of left and right binary shift, as usual in the computer
  6. science field; meanwhile, the $\dsqrt$ function will be defined in section
  7. \ref{sec:preq:sqrt}, with the acceptation of discrete square root.
  8. %% XXX. where to put these?
  9. The logarithmic $\log$ function is assumed to be in base two, i.e. $\log_2$.
  10. The $\idiv$ symbol is the integer division over $\naturalN$, i.e.
  11. $a \idiv b = \floor{\frac{a}{b}}$.
  12. \section{Euclid's Greatest Common Divisor}
  13. Being the greatest common divisor a foundamental algebraic operation in the ssl
  14. protocol, \openssl implemented it with the following signature:
  15. \begin{minted}[fontsize=\small]{c}
  16. int BN_gcd(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
  17. \end{minted}
  18. The computation proceeds under the well-known Euclidean algorithm, specifically
  19. the binary variant developed by Josef Stein in 1961 \cite{AOCPv2}. This variant
  20. exploits some interesting properties of $gcd(a, b)$
  21. \begin{itemize}
  22. \setlength{\itemsep}{1pt}
  23. \setlength{\parskip}{0pt}
  24. \setlength{\parsep}{0pt}
  25. \item if $a,\ b$ are even, then $gcd(a, b) = 2gcd(a/2, b/2)$
  26. \item if $a$ is even and $b$ is odd, then $gcd(a, b) = gcd(a/2, b)$
  27. \item $gcd(a, b) = gcd(a-b, b)$, as in the standard Euclid's algorithm
  28. \item the sum of two odd numbers is always even
  29. \end{itemize}
  30. % Donald Knuth, TAOCP, "a binary method", p. 388 VOL 2
  31. Both \cite{AOCPv2} and \cite{MITalg} analyze the running time for the algorithm,
  32. even if \cite{MITalg}'s demonstration is fairly simpler and proceeds %elegantly
  33. by induction.
  34. Anyway, both show that algorithm ~\ref{alg:gcd} belongs to the class
  35. \bigO{\log b}.
  36. \begin{algorithm}[H]
  37. \caption{\openssl's GCD \label{alg:gcd}}
  38. \begin{algorithmic}[1]
  39. \State $k \gets 0$
  40. \While{$b \neq 0$}
  41. \If{$a$ is odd}
  42. \If{$b$ is odd}
  43. \State $a \gets (a-b) \gg 1$
  44. \Else
  45. \State $b \gets b \gg 1$
  46. \EndIf
  47. \If{$a < b$} $a, b \gets b, a$ \EndIf
  48. \Else
  49. \If{$b$ is odd}
  50. \State $a = a \gg 1$
  51. \If{$a < b$} $a, b = b, a$ \EndIf
  52. \Else
  53. \State $k \gets k+1$
  54. \State $a, b \gets a \gg 1, b \gg 1$
  55. \EndIf
  56. \EndIf
  57. \EndWhile
  58. \State \Return $a \ll k$
  59. \end{algorithmic}
  60. \end{algorithm}
  61. Unfortunately, there is yet no known parallel solution that significantly improves
  62. Euclid's \textsc{gcd}.
  63. \section{RSA Cipher \label{sec:preq:rsa}}
  64. The RSA cryptosystem, invented by Ron Rivesst, Adi Shamir, and Len Adleman
  65. ~\cite{rsa}, was first published in August 1977's issue of
  66. \emph{Scientific American}. In its basic version, this \emph{asymmetric} cipher
  67. works as follows:
  68. \begin{itemize}
  69. \item choose a pair $\angular{p, q}$ of \emph{random} \emph{prime} numbers;
  70. let $N$ be the product of the two, $N=pq$, and call it ``Public Modulus''
  71. \item choose a pair $\angular{e, d}$ of \emph{random} numbers, both in
  72. $\integerZ^*_{\varphi(N)}$, such that one is the multiplicative inverse of the
  73. other, $ed \equiv 1 \pmod{\varphi(N)}$ and $\varphi(N)$ is Euler's totient
  74. function;
  75. \end{itemize}
  76. Now, call $\angular{N, e}$ \emph{public key}, and $\angular{N, d}$
  77. \emph{private key}, and let the encryption function $E(m)$ be the $e$-th power of
  78. the message $m$:
  79. \begin{align}
  80. \label{eq:rsa:encrypt}
  81. E(m) = m^e \pmod{N}
  82. \end{align}
  83. while the decryption function $D(c)$ is the $d$-th power of the ciphertext $c$:
  84. \begin{align}
  85. \label{eq:rsa:decrypt}
  86. D(c) = c^d \equiv E(m)^d \equiv m^{ed} \equiv m \pmod{N}
  87. \end{align}
  88. that, due to Fermat's little theorem, is the inverse of $E$.
  89. \paragraph{}
  90. %% less unless <https://www.youtube.com/watch?v=XnbnuY7Kxhc>
  91. From now on, unless otherwise specified, the variable $N=pq$ will always refer
  92. to the public modulus of a generis RSA keypair, with $p, q\ .\ p > q$ being the
  93. two primes factorizing it. Again, $e, d$ will respectively refer to the public
  94. exponent and the private exponent.
  95. \section{Algorithmic Complexity Notation}
  96. The notation used to describe asymptotic complexity follows the $O$-notation,
  97. abused under the conventions and limits of MIT's Introduction to Algorithms.
  98. Let \bigO{g} be the asymptotic upper bound of g:`
  99. $$
  100. O(g(n)) = \{ f(n) : \exists n_0, c \in \naturalN \mid 0 \leq f(n) \leq cg(n)
  101. \ \forall n > n_0 \}
  102. $$
  103. With the writing $f(n) = O(g(n))$ we will actually interpret
  104. $f(n) \in O(g(n))$.
  105. \section{Square Root \label{sec:preq:sqrt}}
  106. Computing the square root has been another foundamental requirement of the
  107. project, though not satisfied by \openssl. Apparently,
  108. % \openssl is a great pile of crap, as phk states
  109. \openssl does only provide the discrete quare root implementation using the
  110. Tonelli/Shanks algorithm, which specifically solves $x$ in $x^2 = a \pmod{p}$,
  111. with $p \in \naturalPrime$.
  112. \begin{minted}{c}
  113. BIGNUM* BN_mod_sqrt(BIGNUM* x, const BIGNUM* a, const BIGNUM* p,
  114. const BN_CTX* ctx);
  115. \end{minted}
  116. Instead, we are interested in finding the square root in $\naturalN$, hence we
  117. did come out with our specific implementation, first using Bombelli's algorithm
  118. described in ~\ref{par:preq:sqrt:bombelli},
  119. and later with Dijkstra one discussed in ~\ref{par:preq:sqrt:dijkstra}.
  120. Unless otherwise specified, in the later pages we will use $\sqrt{n}$ with the
  121. usual meaning ``the half power of $n$'', while with $x, r = \dsqrt{n}$ we will
  122. intend the pair $\angular{x, r} \in \naturalN^2 \mid x^2 + r = n$.
  123. \paragraph{Bombelli's Algorithm \label{par:preq:sqrt:bombelli}} dates back to
  124. the XVI century, and approaches the problem of finding the square root by using
  125. continued fractions. Unfortunately, we weren't able to fully assert the
  126. correctness of the algorithm, since the original document
  127. ~\cite{bombelli:algebra} is definitely unreadable and presents a difficult,
  128. inconvenient notation. Though, for completeness' sake, we report in table
  129. ~\ref{alg:sqrt:bombelli} the pseudocode adopted and tested for its correctness.
  130. \begin{algorithm}[H]
  131. \caption{Square Root: Bombelli's algorithm}
  132. \label{alg:sqrt:bombelli}
  133. \begin{algorithmic}[1]
  134. \Procedure{sqrt}{$n$}
  135. \State $i, g \gets 0, \{\}$
  136. \While{$n > 0$}
  137. \State $g_i \gets n \pmod{100}$
  138. \State $n \gets n // 100$
  139. \State $i++$
  140. \EndWhile
  141. \State $x, r \gets 0, 0$
  142. \For{$j \in \; [i-1..0]$}
  143. \State $r = 100r + g_i$
  144. \For{$d \in \; [0, 9]$}
  145. \State $y' \gets d(20x + d)$
  146. \If{$y' > r$} \textbf{break}
  147. \Else \ \ $y \gets y'$
  148. \EndIf
  149. \EndFor
  150. \State $r \gets r - y$
  151. \State $x \gets 10x + d - 1$
  152. \EndFor
  153. \EndProcedure
  154. \end{algorithmic}
  155. \end{algorithm}
  156. For each digit of the result, we perform a subtraction, and a limited number of
  157. multiplications. This means that the complexity of this solutions belongs to
  158. \bigO{\log n \log n} = \bigO{\log^2 n}
  159. \paragraph{Dijkstra's Algorithm \label{par:preq:sqrt:dijkstra}} can be found in
  160. \cite{Dijkstra:adop}, \S 8, p.61. There, Dijkstra presents an elightning
  161. process for the computation of the square root, making only use of binary shift
  162. and algebraic additions.
  163. Specifically, the problem attempts to find, given a natual $n$, the $a$ that
  164. establishes:
  165. \begin{align}
  166. \label{eq:preq:dijkstra_problem}
  167. a^2 \leq n \: \land \: (a+1)^2 > n
  168. \end{align}
  169. Take now the pair $\angular{a=0, b=n+1}$, and think about the problem as dicotomic
  170. search: we want to diminuish the dinstance between the upper bound $b$ and the
  171. lower bound $a$ while holding the guard \ref{eq:preq:dijkstra_problem}:
  172. \begin{align*}
  173. a^2 \leq n \: \land \: b > n
  174. \end{align*}
  175. %% XXX. I am not so sure about this, pure fantasy.
  176. The speed of convergence is determined by the choice of dinstance $d$, which is optimal when
  177. $d = (b-a) \idiv 2$.
  178. \begin{algorithm}[H]
  179. \caption{Square Root: an intuitive, na\"ive implementation}
  180. \label{alg:sqrt:dijkstra_naif}
  181. \begin{algorithmic}[1]
  182. \State $a, b \gets 0, n+1$
  183. \While{$a+1 \neq b$}
  184. \State $d \gets (b-a) \idiv 2$
  185. \If{$(a+d)^2 \leq n$}
  186. $a \gets a+d$
  187. \ElsIf{$(b-d)^2 > n$}
  188. $b \gets b-d$
  189. \EndIf
  190. \EndWhile
  191. \State \Return a
  192. \end{algorithmic}
  193. \end{algorithm}
  194. % heh, there's not much to explain here, that's almost the same in Dijkstra's
  195. % book, excluding the inspirative familiar portrait that led to the insight of
  196. % this change of varaibles.
  197. Now optimization proceeds by abstration: the variables $a, b$ are sobstituded
  198. introducing:
  199. $c = b-a$,
  200. $p = ac$,
  201. $q = c^2$,
  202. $r = n-a^2$
  203. and finally $h$ as local optimization. For any further details and
  204. explainations, the reference is still \cite{Dijkstra:adop}.
  205. \begin{algorithm}[H]
  206. \caption{Square Root: final version}
  207. \label{alg:sqrt:dijkstra}
  208. \begin{algorithmic}
  209. \State $p, q, r \gets 0, 1, n$
  210. \While{$q \leq n$} $q \gets q \gg 2$ \EndWhile
  211. \While{$q \neq 1$}
  212. \State $q \gets q \ll 2$
  213. \State $h \gets p+q$
  214. \State $p \gets q \ll 1$
  215. \State $h \gets 2p + q$
  216. \If{$r \geq h$} $p, r \gets p+q, r-h$ \EndIf
  217. \EndWhile
  218. \State \Return p
  219. \end{algorithmic}
  220. \end{algorithm}
  221. A fair approxidmation of the magnitude of the Dijkstra algorithm can be studied
  222. by looking at the pseudocode in ~\ref{alg:sqrt:dijkstra_naif}. Exactely as with
  223. the dicotomic search case, we split the interval $[a, b]$ in half on each step,
  224. and choose wether to take the leftmost or the rightmost part. This results in
  225. $log(n+1)$ steps. During each iteration, instead, as we have seen in
  226. ~\ref{alg:sqrt:dijkstra} we just apply summations and binary shifts, which are
  227. upper bounded by \bigO{\log{n}/2}. Thus, the order of magnitude belongs to
  228. \bigO{\log^2{n}}.
  229. \paragraph{}
  230. Even if both algorithms presented have \emph{asymptotically} the same
  231. complexity, we believe that adopting Dijkstra's one has lead to a pragmatic,
  232. substantial performance improvement.
  233. %%% Local Variables:
  234. %%% mode: latex
  235. %%% TeX-master: "question_authority"
  236. %%% End: