math_prequisites.tex 9.9 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285
  1. \chapter{Mathematical prequisites \label{chap:preq}}
  2. In this chapter we formalize the notation used in the rest of the thesis, and
  3. furthermore attempt to discuss and study the elementary functions on which the
  4. project has been grounded.
  5. \\
  6. The $\ll$ and $\gg$ are respectively used with the meaning of left and right
  7. bitwise shift, as usual in computer science.
  8. \\
  9. The $\dsqrt$ function will be defined in section \ref{sec:preq:sqrt}, with the
  10. acceptation of discrete square root.
  11. \\
  12. The logarithmic $\log$ function is assumed to be in base two, i.e. $\log_2$.
  13. \\
  14. The $\idiv$ symbol is the integer division over $\naturalN$, i.e.
  15. $a \idiv b = \floor{\frac{a}{b}}$.
  16. \section{Algorithmic Complexity Notation}
  17. The notation used to describe asymptotic complexity follows the $O$-notation,
  18. abused under the conventions and limits of MIT's Introduction to Algorithms.
  19. Let \bigO{g} be the asymptotic upper bound of g:
  20. $$
  21. \bigO{g(n)} = \{ f(n) : \exists n_0, c \in \naturalN \mid 0 \leq f(n) \leq cg(n)
  22. \ \forall n > n_0 \}
  23. $$
  24. With $f(n) = \bigO{g(n)}$ we actually mean
  25. $f(n) \in \bigO{g(n)}$.
  26. \section{Euclid's Greatest Common Divisor}
  27. Being the greatest common divisor a foundamental algebraic operation in the TLS
  28. protocol, \openssl implemented it with the following signature:
  29. \begin{minted}[fontsize=\small]{c}
  30. int BN_gcd(BIGNUM *r, BIGNUM *a, BIGNUM *b, BN_CTX *ctx);
  31. \end{minted}
  32. The computation proceeds under the well-known Euclidean algorithm, specifically
  33. the binary variant developed by Josef Stein in 1961 \cite{AOCPv2}. This variant
  34. exploits some interesting properties of $gcd(a, b)$:
  35. \begin{itemize}
  36. \setlength{\itemsep}{1pt}
  37. \setlength{\parskip}{0pt}
  38. \setlength{\parsep}{0pt}
  39. \item if $a,\ b$ are even, then $gcd(a, b) = 2gcd(a/2, b/2)$;
  40. \item if $a$ is even and $b$ is odd, then $gcd(a, b) = gcd(a/2, b)$;
  41. \item $gcd(a, b) = gcd(a-b, b)$, as in the standard Euclid algorithm;
  42. \item the sum of two odd numbers is always even.
  43. \end{itemize}
  44. % Donald Knuth, TAOCP, "a binary method", p. 388 VOL 2
  45. Both \cite{AOCPv2} and \cite{MITalg} analyze the running time of the
  46. algorithm; \cite{MITalg}'s proof is fairly simpler and proceeds %elegantly
  47. by induction.
  48. Anyway, both show that algorithm ~\ref{alg:gcd} belongs to the class
  49. \bigO{\log b}.
  50. \begin{algorithm}[H]
  51. \caption{\openssl's GCD \label{alg:gcd}}
  52. \begin{algorithmic}[1]
  53. \State $k \gets 0$
  54. \While{$b \neq 0$}
  55. \If{$a$ is odd}
  56. \If{$b$ is odd}
  57. \State $a \gets (a-b) \gg 1$
  58. \Else
  59. \State $b \gets b \gg 1$
  60. \EndIf
  61. \If{$a < b$} $a, b \gets b, a$ \EndIf
  62. \Else
  63. \If{$b$ is odd}
  64. \State $a = a \gg 1$
  65. \If{$a < b$} $a, b = b, a$ \EndIf
  66. \Else
  67. \State $k \gets k+1$
  68. \State $a, b \gets a \gg 1, b \gg 1$
  69. \EndIf
  70. \EndIf
  71. \EndWhile
  72. \State \Return $a \ll k$
  73. \end{algorithmic}
  74. \end{algorithm}
  75. Unfortunately, there is yet no known parallel solution that significantly improves
  76. Euclid's \textsc{gcd}.
  77. \section{Square Root \label{sec:preq:sqrt}}
  78. Computing the square root is another important building block of the project,
  79. though not available in \openssl\!.
  80. Apparently,
  81. % \openssl is a great pile of crap, as phk states
  82. \openssl does only provide the discrete square root implementation using the
  83. Tonelli/Shanks algorithm, which specifically solves in $x$ the equation
  84. $x^2 = a \pmod{p}$, with $p \in \naturalPrime$:
  85. \begin{minted}{c}
  86. BIGNUM* BN_mod_sqrt(BIGNUM* x, const BIGNUM* a, const BIGNUM* p,
  87. const BN_CTX* ctx);
  88. \end{minted}
  89. Instead, we are interested in finding the the pair
  90. $\angular{x, r} \in \naturalN^2 \mid x^2 + r = n$, that is, the integer part of
  91. the square root of a natural number and its rest.
  92. Hence, we did come out with our specific implementation, first using Bombelli's
  93. algorithm, and later with the one of Dijkstra. Both are going to be discussed
  94. below.
  95. Unless otherwise specified, in the later pages we use $\sqrt{n}$ with the
  96. usual meaning ``the half power of $n$'', while with $x, r = \dsqrt{n}$ we mean
  97. the pair just defined.
  98. \paragraph{Bombelli's Algorithm \label{par:preq:sqrt:bombelli}} dates back to
  99. the XVI century, and approaches the problem of finding the square root by using
  100. continued fractions. Unfortunately, we weren't able to fully assert the
  101. correctness of the algorithm, since the original document
  102. ~\cite{bombelli:algebra} presents a difficult, inconvenient notation. Though,
  103. for completeness' sake, we report in table
  104. ~\ref{alg:sqrt:bombelli} the pseudocode adopted and tested for its correctness.
  105. \begin{algorithm}[H]
  106. \caption{Square Root: Bombelli's algorithm}
  107. \label{alg:sqrt:bombelli}
  108. \begin{algorithmic}[1]
  109. \Procedure{sqrt}{$n$}
  110. \State $i, g \gets 0, \{\}$
  111. \While{$n > 0$}
  112. \State $g_i \gets n \pmod{100}$
  113. \State $n \gets n // 100$
  114. \State $i++$
  115. \EndWhile
  116. \State $x, r \gets 0, 0$
  117. \For{$j \in \; [i-1..0]$}
  118. \State $r = 100r + g_i$
  119. \For{$d \in \; [0, 9]$}
  120. \State $y' \gets d(20x + d)$
  121. \If{$y' > r$} \textbf{break}
  122. \Else \ \ $y \gets y'$
  123. \EndIf
  124. \EndFor
  125. \State $r \gets r - y$
  126. \State $x \gets 10x + d - 1$
  127. \EndFor
  128. \State \Return $x, r$
  129. \EndProcedure
  130. \end{algorithmic}
  131. \end{algorithm}
  132. For each digit of the result, we perform a subtraction, and a limited number of
  133. multiplications. This means that the complexity of this solutions belongs to
  134. \bigO{\log n \log n} = \bigO{\log^2 n}
  135. \paragraph{Dijkstra's Algorithm \label{par:preq:sqrt:dijkstra}} can be found in
  136. \cite{Dijkstra:adop}, \S 8, p.61. There, Dijkstra presents an elightning
  137. process for the computation of the square root, making only use of binary shift
  138. and algebraic additions.
  139. Specifically, the problem attempts to find, given a natual $n$, the integer $a$
  140. that establishes:
  141. \begin{align}
  142. \label{eq:preq:dijkstra_problem}
  143. a^2 \leq n \: \land \: (a+1)^2 > n
  144. \end{align}
  145. Take now the pair $\angular{a=0, b=n+1}$, and consider the inverval
  146. $[a, b[$. We would like to reduce the distance between the upper bound $b$ and
  147. the lower bound $a$, while holding the guard \ref{eq:preq:dijkstra_problem}:
  148. \begin{align*}
  149. a^2 \leq n \: \land \: b > n
  150. \end{align*}
  151. %% XXX. I am not so sure about this, pure fantasy.
  152. The speed of convergence is determined by the choice of the distance $d$, which
  153. analougously to the dicotomic search problem, is optimal when
  154. $d = (b-a) \idiv 2$.
  155. \begin{algorithm}[H]
  156. \caption{Square Root: an intuitive, na\"ive implementation}
  157. \label{alg:sqrt:dijkstra_naif}
  158. \begin{algorithmic}[1]
  159. \State $a, b \gets 0, n+1$
  160. \While{$a+1 \neq b$}
  161. \State $d \gets (b-a) \idiv 2$
  162. \If{$(a+d)^2 \leq n$}
  163. $a \gets a+d$
  164. \ElsIf{$(b-d)^2 > n$}
  165. $b \gets b-d$
  166. \EndIf
  167. \EndWhile
  168. \State \Return a
  169. \end{algorithmic}
  170. \end{algorithm}
  171. % heh, there's not much to explain here, that's almost the same in Dijkstra's
  172. % book, excluding the inspirative familiar portrait that led to the insight of
  173. % this change of varaibles.
  174. Now optimization proceeds with the following change of variables:
  175. $c = b-a$,
  176. $p = ac$,
  177. $q = c^2$,
  178. $r = n-a^2$;
  179. For any further details and explainations, the reference is still
  180. \cite{Dijkstra:adop}.
  181. \begin{algorithm}[H]
  182. \caption{Square Root: final version}
  183. \label{alg:sqrt:dijkstra}
  184. \begin{algorithmic}
  185. \State $p, q, r \gets 0, 1, n$
  186. \While{$q \leq n$} $q \gets q \gg 2$ \EndWhile
  187. \While{$q \neq 1$}
  188. \State $q \gets q \ll 2$
  189. \State $h \gets p+q$
  190. \State $p \gets q \ll 1$
  191. \State $h \gets 2p + q$
  192. \If{$r \geq h$} $p, r \gets p+q, r-h$ \EndIf
  193. \EndWhile
  194. \State \Return p
  195. \end{algorithmic}
  196. \end{algorithm}
  197. A fair approximation of the magnitude of the Dijkstra algorithm can be studied
  198. by looking at the pseudocode in ~\ref{alg:sqrt:dijkstra_naif}. Exactly as in
  199. the dicotomic search case, we split the interval $[a, b]$ in half on each step,
  200. and choose whether to take the leftmost or the rightmost part. This results in
  201. $log(n+1)$ steps. During each iteration, instead, as we have seen in
  202. ~\ref{alg:sqrt:dijkstra} we just apply summations and binary shifts, which are
  203. upper bounded by \bigO{\log{n}/2}. Thus, the order of magnitude belongs to
  204. \bigO{\log^2{n}}.
  205. \paragraph{}
  206. Even if both algorithms presented have \emph{asymptotically} the same
  207. complexity, we believe that adopting the one of Dijkstra has lead to a
  208. pragmatic, substantial performance improvement.
  209. \section{RSA Cipher \label{sec:preq:rsa}}
  210. The RSA cryptosystem, invented by Ron Rivest, Adi Shamir, and Len Adleman
  211. ~\cite{rsa}, was first published in August 1977's issue of
  212. \emph{Scientific American}. In its basic version, this \emph{asymmetric} cipher
  213. works as follows:
  214. \begin{itemize}
  215. \item choose a pair $\angular{p, q}$ of \emph{random} \emph{prime} numbers;
  216. let $N$ be the product of the two, $N=pq$, and call it \emph{public modulus};
  217. \item choose a pair $\angular{e, d}$ of \emph{random} numbers, both in
  218. $\integerZ^*_{\varphi(N)}$, such that one is the multiplicative inverse of the
  219. other, $ed \equiv 1 \pmod{\varphi(N)}$ and $\varphi(N)$ is Euler's totient
  220. function;
  221. \end{itemize}
  222. Now, call $\angular{N, e}$ \emph{public key}, and $\angular{N, d}$
  223. \emph{private key}, and let the encryption function $E(m)$ be the $e$-th power of
  224. the message $m$:
  225. \begin{align}
  226. \label{eq:rsa:encrypt}
  227. E(m) = m^e \pmod{N}
  228. \end{align}
  229. while the decryption function $D(c)$ is the $d$-th power of the ciphertext $c$:
  230. \begin{align}
  231. \label{eq:rsa:decrypt}
  232. D(c) = c^d \equiv E(m)^d \equiv m^{ed} \equiv m \pmod{N}
  233. \end{align}
  234. that, due to Fermat's little theorem, is the inverse of $E$.
  235. \paragraph{}
  236. %% less unless <https://www.youtube.com/watch?v=XnbnuY7Kxhc>
  237. From now on, unless otherwise specified, the variable $N=pq$ will always refer
  238. to the public modulus of a generic RSA keypair, with
  239. $p, q$ being the two primes factorizing it, such that $p > q$.
  240. Again, $e, d$ will respectively refer to the public
  241. exponent and the private exponent.
  242. %%% Local Variables:
  243. %%% mode: latex
  244. %%% TeX-master: "question_authority"
  245. %%% End: