comparison bn.tex @ 19:e1037a1e12e7 libtommath-orig

0.30 release of LibTomMath
author Matt Johnston <matt@ucc.asn.au>
date Tue, 15 Jun 2004 14:42:57 +0000
parents
children d29b64170cf0
comparison
equal deleted inserted replaced
2:86e0b50a9b58 19:e1037a1e12e7
1 \documentclass[b5paper]{book}
2 \usepackage{hyperref}
3 \usepackage{makeidx}
4 \usepackage{amssymb}
5 \usepackage{color}
6 \usepackage{alltt}
7 \usepackage{graphicx}
8 \usepackage{layout}
9 \def\union{\cup}
10 \def\intersect{\cap}
11 \def\getsrandom{\stackrel{\rm R}{\gets}}
12 \def\cross{\times}
13 \def\cat{\hspace{0.5em} \| \hspace{0.5em}}
14 \def\catn{$\|$}
15 \def\divides{\hspace{0.3em} | \hspace{0.3em}}
16 \def\nequiv{\not\equiv}
17 \def\approx{\raisebox{0.2ex}{\mbox{\small $\sim$}}}
18 \def\lcm{{\rm lcm}}
19 \def\gcd{{\rm gcd}}
20 \def\log{{\rm log}}
21 \def\ord{{\rm ord}}
22 \def\abs{{\mathit abs}}
23 \def\rep{{\mathit rep}}
24 \def\mod{{\mathit\ mod\ }}
25 \renewcommand{\pmod}[1]{\ ({\rm mod\ }{#1})}
26 \newcommand{\floor}[1]{\left\lfloor{#1}\right\rfloor}
27 \newcommand{\ceil}[1]{\left\lceil{#1}\right\rceil}
28 \def\Or{{\rm\ or\ }}
29 \def\And{{\rm\ and\ }}
30 \def\iff{\hspace{1em}\Longleftrightarrow\hspace{1em}}
31 \def\implies{\Rightarrow}
32 \def\undefined{{\rm ``undefined"}}
33 \def\Proof{\vspace{1ex}\noindent {\bf Proof:}\hspace{1em}}
34 \let\oldphi\phi
35 \def\phi{\varphi}
36 \def\Pr{{\rm Pr}}
37 \newcommand{\str}[1]{{\mathbf{#1}}}
38 \def\F{{\mathbb F}}
39 \def\N{{\mathbb N}}
40 \def\Z{{\mathbb Z}}
41 \def\R{{\mathbb R}}
42 \def\C{{\mathbb C}}
43 \def\Q{{\mathbb Q}}
44 \definecolor{DGray}{gray}{0.5}
45 \newcommand{\emailaddr}[1]{\mbox{$<${#1}$>$}}
46 \def\twiddle{\raisebox{0.3ex}{\mbox{\tiny $\sim$}}}
47 \def\gap{\vspace{0.5ex}}
48 \makeindex
49 \begin{document}
50 \frontmatter
51 \pagestyle{empty}
52 \title{LibTomMath User Manual \\ v0.30}
53 \author{Tom St Denis \\ [email protected]}
54 \maketitle
55 This text, the library and the accompanying textbook are all hereby placed in the public domain. This book has been
56 formatted for B5 [176x250] paper using the \LaTeX{} {\em book} macro package.
57
58 \vspace{10cm}
59
60 \begin{flushright}Open Source. Open Academia. Open Minds.
61
62 \mbox{ }
63
64 Tom St Denis,
65
66 Ontario, Canada
67 \end{flushright}
68
69 \tableofcontents
70 \listoffigures
71 \mainmatter
72 \pagestyle{headings}
73 \chapter{Introduction}
74 \section{What is LibTomMath?}
75 LibTomMath is a library of source code which provides a series of efficient and carefully written functions for manipulating
76 large integer numbers. It was written in portable ISO C source code so that it will build on any platform with a conforming
77 C compiler.
78
79 In a nutshell the library was written from scratch with verbose comments to help instruct computer science students how
80 to implement ``bignum'' math. However, the resulting code has proven to be very useful. It has been used by numerous
81 universities, commercial and open source software developers. It has been used on a variety of platforms ranging from
82 Linux and Windows based x86 to ARM based Gameboys and PPC based MacOS machines.
83
84 \section{License}
85 As of the v0.25 the library source code has been placed in the public domain with every new release. As of the v0.28
86 release the textbook ``Implementing Multiple Precision Arithmetic'' has been placed in the public domain with every new
87 release as well. This textbook is meant to compliment the project by providing a more solid walkthrough of the development
88 algorithms used in the library.
89
90 Since both\footnote{Note that the MPI files under mtest/ are copyrighted by Michael Fromberger. They are not required to use LibTomMath.} are in the
91 public domain everyone is entitled to do with them as they see fit.
92
93 \section{Building LibTomMath}
94
95 LibTomMath is meant to be very ``GCC friendly'' as it comes with a makefile well suited for GCC. However, the library will
96 also build in MSVC, Borland C out of the box. For any other ISO C compiler a makefile will have to be made by the end
97 developer.
98
99 To build the library for GCC simply issue the
100
101 \begin{alltt}
102 make
103 \end{alltt}
104
105 command. This will build the library and archive the object files in ``libtommath.a''. Now you simply link against that
106 and include ``tommath.h'' within your programs.
107
108 Alternatively to build with MSVC type
109
110 \begin{alltt}
111 nmake -f makefile.msvc
112 \end{alltt}
113
114 This will build the library and archive the object files in ``tommath.lib''. This has been tested with MSVC version 6.00
115 with service pack 5.
116
117 There is limited support for making a ``DLL'' in windows via the ``makefile.cygwin\_dll'' makefile. It requires Cygwin
118 to work with since it requires the auto-export/import functionality. The resulting DLL and imprt library ``libtomcrypt.dll.a''
119 can be used to link LibTomMath dynamically to any Windows program using Cygwin.
120
121 \subsection{Testing}
122 To build the library and the test harness type
123
124 \begin{alltt}
125 make test
126 \end{alltt}
127
128 This will build the library, ``test'' and ``mtest/mtest''. The ``test'' program will accept test vectors and verify the
129 results. ``mtest/mtest'' will generate test vectors using the MPI library by Michael Fromberger\footnote{A copy of MPI
130 is included in the package}. Simply pipe mtest into test using
131
132 \begin{alltt}
133 mtest/mtest | test
134 \end{alltt}
135
136 If you do not have a ``/dev/urandom'' style RNG source you will have to write your own PRNG and simply pipe that into
137 mtest. For example, if your PRNG program is called ``myprng'' simply invoke
138
139 \begin{alltt}
140 myprng | mtest/mtest | test
141 \end{alltt}
142
143 This will output a row of numbers that are increasing. Each column is a different test (such as addition, multiplication, etc)
144 that is being performed. The numbers represent how many times the test was invoked. If an error is detected the program
145 will exit with a dump of the relevent numbers it was working with.
146
147 \section{Purpose of LibTomMath}
148 Unlike GNU MP (GMP) Library, LIP, OpenSSL or various other commercial kits (Miracl), LibTomMath was not written with
149 bleeding edge performance in mind. First and foremost LibTomMath was written to be entirely open. Not only is the
150 source code public domain (unlike various other GPL/etc licensed code), not only is the code freely downloadable but the
151 source code is also accessible for computer science students attempting to learn ``BigNum'' or multiple precision
152 arithmetic techniques.
153
154 LibTomMath was written to be an instructive collection of source code. This is why there are many comments, only one
155 function per source file and often I use a ``middle-road'' approach where I don't cut corners for an extra 2\% speed
156 increase.
157
158 Source code alone cannot really teach how the algorithms work which is why I also wrote a textbook that accompanies
159 the library (beat that!).
160
161 So you may be thinking ``should I use LibTomMath?'' and the answer is a definite maybe. Let me tabulate what I think
162 are the pros and cons of LibTomMath by comparing it to the math routines from GnuPG\footnote{GnuPG v1.2.3 versus LibTomMath v0.28}.
163
164 \newpage\begin{figure}[here]
165 \begin{small}
166 \begin{center}
167 \begin{tabular}{|l|c|c|l|}
168 \hline \textbf{Criteria} & \textbf{Pro} & \textbf{Con} & \textbf{Notes} \\
169 \hline Few lines of code per file & X & & GnuPG $ = 300.9$, LibTomMath $ = 76.04$ \\
170 \hline Commented function prototypes & X && GnuPG function names are cryptic. \\
171 \hline Speed && X & LibTomMath is slower. \\
172 \hline Totally free & X & & GPL has unfavourable restrictions.\\
173 \hline Large function base & X & & GnuPG is barebones. \\
174 \hline Four modular reduction algorithms & X & & Faster modular exponentiation. \\
175 \hline Portable & X & & GnuPG requires configuration to build. \\
176 \hline
177 \end{tabular}
178 \end{center}
179 \end{small}
180 \caption{LibTomMath Valuation}
181 \end{figure}
182
183 It may seem odd to compare LibTomMath to GnuPG since the math in GnuPG is only a small portion of the entire application.
184 However, LibTomMath was written with cryptography in mind. It provides essentially all of the functions a cryptosystem
185 would require when working with large integers.
186
187 So it may feel tempting to just rip the math code out of GnuPG (or GnuMP where it was taken from originally) in your
188 own application but I think there are reasons not to. While LibTomMath is slower than libraries such as GnuMP it is
189 not normally significantly slower. On x86 machines the difference is normally a factor of two when performing modular
190 exponentiations.
191
192 Essentially the only time you wouldn't use LibTomMath is when blazing speed is the primary concern.
193
194 \chapter{Getting Started with LibTomMath}
195 \section{Building Programs}
196 In order to use LibTomMath you must include ``tommath.h'' and link against the appropriate library file (typically
197 libtommath.a). There is no library initialization required and the entire library is thread safe.
198
199 \section{Return Codes}
200 There are three possible return codes a function may return.
201
202 \index{MP\_OKAY}\index{MP\_YES}\index{MP\_NO}\index{MP\_VAL}\index{MP\_MEM}
203 \begin{figure}[here!]
204 \begin{center}
205 \begin{small}
206 \begin{tabular}{|l|l|}
207 \hline \textbf{Code} & \textbf{Meaning} \\
208 \hline MP\_OKAY & The function succeeded. \\
209 \hline MP\_VAL & The function input was invalid. \\
210 \hline MP\_MEM & Heap memory exhausted. \\
211 \hline &\\
212 \hline MP\_YES & Response is yes. \\
213 \hline MP\_NO & Response is no. \\
214 \hline
215 \end{tabular}
216 \end{small}
217 \end{center}
218 \caption{Return Codes}
219 \end{figure}
220
221 The last two codes listed are not actually ``return'ed'' by a function. They are placed in an integer (the caller must
222 provide the address of an integer it can store to) which the caller can access. To convert one of the three return codes
223 to a string use the following function.
224
225 \index{mp\_error\_to\_string}
226 \begin{alltt}
227 char *mp_error_to_string(int code);
228 \end{alltt}
229
230 This will return a pointer to a string which describes the given error code. It will not work for the return codes
231 MP\_YES and MP\_NO.
232
233 \section{Data Types}
234 The basic ``multiple precision integer'' type is known as the ``mp\_int'' within LibTomMath. This data type is used to
235 organize all of the data required to manipulate the integer it represents. Within LibTomMath it has been prototyped
236 as the following.
237
238 \index{mp\_int}
239 \begin{alltt}
240 typedef struct \{
241 int used, alloc, sign;
242 mp_digit *dp;
243 \} mp_int;
244 \end{alltt}
245
246 Where ``mp\_digit'' is a data type that represents individual digits of the integer. By default, an mp\_digit is the
247 ISO C ``unsigned long'' data type and each digit is $28-$bits long. The mp\_digit type can be configured to suit other
248 platforms by defining the appropriate macros.
249
250 All LTM functions that use the mp\_int type will expect a pointer to mp\_int structure. You must allocate memory to
251 hold the structure itself by yourself (whether off stack or heap it doesn't matter). The very first thing that must be
252 done to use an mp\_int is that it must be initialized.
253
254 \section{Function Organization}
255
256 The arithmetic functions of the library are all organized to have the same style prototype. That is source operands
257 are passed on the left and the destination is on the right. For instance,
258
259 \begin{alltt}
260 mp_add(&a, &b, &c); /* c = a + b */
261 mp_mul(&a, &a, &c); /* c = a * a */
262 mp_div(&a, &b, &c, &d); /* c = [a/b], d = a mod b */
263 \end{alltt}
264
265 Another feature of the way the functions have been implemented is that source operands can be destination operands as well.
266 For instance,
267
268 \begin{alltt}
269 mp_add(&a, &b, &b); /* b = a + b */
270 mp_div(&a, &b, &a, &c); /* a = [a/b], c = a mod b */
271 \end{alltt}
272
273 This allows operands to be re-used which can make programming simpler.
274
275 \section{Initialization}
276 \subsection{Single Initialization}
277 A single mp\_int can be initialized with the ``mp\_init'' function.
278
279 \index{mp\_init}
280 \begin{alltt}
281 int mp_init (mp_int * a);
282 \end{alltt}
283
284 This function expects a pointer to an mp\_int structure and will initialize the members of the structure so the mp\_int
285 represents the default integer which is zero. If the functions returns MP\_OKAY then the mp\_int is ready to be used
286 by the other LibTomMath functions.
287
288 \begin{small} \begin{alltt}
289 int main(void)
290 \{
291 mp_int number;
292 int result;
293
294 if ((result = mp_init(&number)) != MP_OKAY) \{
295 printf("Error initializing the number. \%s",
296 mp_error_to_string(result));
297 return EXIT_FAILURE;
298 \}
299
300 /* use the number */
301
302 return EXIT_SUCCESS;
303 \}
304 \end{alltt} \end{small}
305
306 \subsection{Single Free}
307 When you are finished with an mp\_int it is ideal to return the heap it used back to the system. The following function
308 provides this functionality.
309
310 \index{mp\_clear}
311 \begin{alltt}
312 void mp_clear (mp_int * a);
313 \end{alltt}
314
315 The function expects a pointer to a previously initialized mp\_int structure and frees the heap it uses. It sets the
316 pointer\footnote{The ``dp'' member.} within the mp\_int to \textbf{NULL} which is used to prevent double free situations.
317 Is is legal to call mp\_clear() twice on the same mp\_int in a row.
318
319 \begin{small} \begin{alltt}
320 int main(void)
321 \{
322 mp_int number;
323 int result;
324
325 if ((result = mp_init(&number)) != MP_OKAY) \{
326 printf("Error initializing the number. \%s",
327 mp_error_to_string(result));
328 return EXIT_FAILURE;
329 \}
330
331 /* use the number */
332
333 /* We're done with it. */
334 mp_clear(&number);
335
336 return EXIT_SUCCESS;
337 \}
338 \end{alltt} \end{small}
339
340 \subsection{Multiple Initializations}
341 Certain algorithms require more than one large integer. In these instances it is ideal to initialize all of the mp\_int
342 variables in an ``all or nothing'' fashion. That is, they are either all initialized successfully or they are all
343 not initialized.
344
345 The mp\_init\_multi() function provides this functionality.
346
347 \index{mp\_init\_multi} \index{mp\_clear\_multi}
348 \begin{alltt}
349 int mp_init_multi(mp_int *mp, ...);
350 \end{alltt}
351
352 It accepts a \textbf{NULL} terminated list of pointers to mp\_int structures. It will attempt to initialize them all
353 at once. If the function returns MP\_OKAY then all of the mp\_int variables are ready to use, otherwise none of them
354 are available for use. A complementary mp\_clear\_multi() function allows multiple mp\_int variables to be free'd
355 from the heap at the same time.
356
357 \begin{small} \begin{alltt}
358 int main(void)
359 \{
360 mp_int num1, num2, num3;
361 int result;
362
363 if ((result = mp_init_multi(&num1,
364 &num2,
365 &num3, NULL)) != MP\_OKAY) \{
366 printf("Error initializing the numbers. \%s",
367 mp_error_to_string(result));
368 return EXIT_FAILURE;
369 \}
370
371 /* use the numbers */
372
373 /* We're done with them. */
374 mp_clear_multi(&num1, &num2, &num3, NULL);
375
376 return EXIT_SUCCESS;
377 \}
378 \end{alltt} \end{small}
379
380 \subsection{Other Initializers}
381 To initialized and make a copy of an mp\_int the mp\_init\_copy() function has been provided.
382
383 \index{mp\_init\_copy}
384 \begin{alltt}
385 int mp_init_copy (mp_int * a, mp_int * b);
386 \end{alltt}
387
388 This function will initialize $a$ and make it a copy of $b$ if all goes well.
389
390 \begin{small} \begin{alltt}
391 int main(void)
392 \{
393 mp_int num1, num2;
394 int result;
395
396 /* initialize and do work on num1 ... */
397
398 /* We want a copy of num1 in num2 now */
399 if ((result = mp_init_copy(&num2, &num1)) != MP_OKAY) \{
400 printf("Error initializing the copy. \%s",
401 mp_error_to_string(result));
402 return EXIT_FAILURE;
403 \}
404
405 /* now num2 is ready and contains a copy of num1 */
406
407 /* We're done with them. */
408 mp_clear_multi(&num1, &num2, NULL);
409
410 return EXIT_SUCCESS;
411 \}
412 \end{alltt} \end{small}
413
414 Another less common initializer is mp\_init\_size() which allows the user to initialize an mp\_int with a given
415 default number of digits. By default, all initializers allocate \textbf{MP\_PREC} digits. This function lets
416 you override this behaviour.
417
418 \index{mp\_init\_size}
419 \begin{alltt}
420 int mp_init_size (mp_int * a, int size);
421 \end{alltt}
422
423 The $size$ parameter must be greater than zero. If the function succeeds the mp\_int $a$ will be initialized
424 to have $size$ digits (which are all initially zero).
425
426 \begin{small} \begin{alltt}
427 int main(void)
428 \{
429 mp_int number;
430 int result;
431
432 /* we need a 60-digit number */
433 if ((result = mp_init_size(&number, 60)) != MP_OKAY) \{
434 printf("Error initializing the number. \%s",
435 mp_error_to_string(result));
436 return EXIT_FAILURE;
437 \}
438
439 /* use the number */
440
441 return EXIT_SUCCESS;
442 \}
443 \end{alltt} \end{small}
444
445 \section{Maintenance Functions}
446
447 \subsection{Reducing Memory Usage}
448 When an mp\_int is in a state where it won't be changed again\footnote{A Diffie-Hellman modulus for instance.} excess
449 digits can be removed to return memory to the heap with the mp\_shrink() function.
450
451 \index{mp\_shrink}
452 \begin{alltt}
453 int mp_shrink (mp_int * a);
454 \end{alltt}
455
456 This will remove excess digits of the mp\_int $a$. If the operation fails the mp\_int should be intact without the
457 excess digits being removed. Note that you can use a shrunk mp\_int in further computations, however, such operations
458 will require heap operations which can be slow. It is not ideal to shrink mp\_int variables that you will further
459 modify in the system (unless you are seriously low on memory).
460
461 \begin{small} \begin{alltt}
462 int main(void)
463 \{
464 mp_int number;
465 int result;
466
467 if ((result = mp_init(&number)) != MP_OKAY) \{
468 printf("Error initializing the number. \%s",
469 mp_error_to_string(result));
470 return EXIT_FAILURE;
471 \}
472
473 /* use the number [e.g. pre-computation] */
474
475 /* We're done with it for now. */
476 if ((result = mp_shrink(&number)) != MP_OKAY) \{
477 printf("Error shrinking the number. \%s",
478 mp_error_to_string(result));
479 return EXIT_FAILURE;
480 \}
481
482 /* use it .... */
483
484
485 /* we're done with it. */
486 mp_clear(&number);
487
488 return EXIT_SUCCESS;
489 \}
490 \end{alltt} \end{small}
491
492 \subsection{Adding additional digits}
493
494 Within the mp\_int structure are two parameters which control the limitations of the array of digits that represent
495 the integer the mp\_int is meant to equal. The \textit{used} parameter dictates how many digits are significant, that is,
496 contribute to the value of the mp\_int. The \textit{alloc} parameter dictates how many digits are currently available in
497 the array. If you need to perform an operation that requires more digits you will have to mp\_grow() the mp\_int to
498 your desired size.
499
500 \index{mp\_grow}
501 \begin{alltt}
502 int mp_grow (mp_int * a, int size);
503 \end{alltt}
504
505 This will grow the array of digits of $a$ to $size$. If the \textit{alloc} parameter is already bigger than
506 $size$ the function will not do anything.
507
508 \begin{small} \begin{alltt}
509 int main(void)
510 \{
511 mp_int number;
512 int result;
513
514 if ((result = mp_init(&number)) != MP_OKAY) \{
515 printf("Error initializing the number. \%s",
516 mp_error_to_string(result));
517 return EXIT_FAILURE;
518 \}
519
520 /* use the number */
521
522 /* We need to add 20 digits to the number */
523 if ((result = mp_grow(&number, number.alloc + 20)) != MP_OKAY) \{
524 printf("Error growing the number. \%s",
525 mp_error_to_string(result));
526 return EXIT_FAILURE;
527 \}
528
529
530 /* use the number */
531
532 /* we're done with it. */
533 mp_clear(&number);
534
535 return EXIT_SUCCESS;
536 \}
537 \end{alltt} \end{small}
538
539 \chapter{Basic Operations}
540 \section{Small Constants}
541 Setting mp\_ints to small constants is a relatively common operation. To accomodate these instances there are two
542 small constant assignment functions. The first function is used to set a single digit constant while the second sets
543 an ISO C style ``unsigned long'' constant. The reason for both functions is efficiency. Setting a single digit is quick but the
544 domain of a digit can change (it's always at least $0 \ldots 127$).
545
546 \subsection{Single Digit}
547
548 Setting a single digit can be accomplished with the following function.
549
550 \index{mp\_set}
551 \begin{alltt}
552 void mp_set (mp_int * a, mp_digit b);
553 \end{alltt}
554
555 This will zero the contents of $a$ and make it represent an integer equal to the value of $b$. Note that this
556 function has a return type of \textbf{void}. It cannot cause an error so it is safe to assume the function
557 succeeded.
558
559 \begin{small} \begin{alltt}
560 int main(void)
561 \{
562 mp_int number;
563 int result;
564
565 if ((result = mp_init(&number)) != MP_OKAY) \{
566 printf("Error initializing the number. \%s",
567 mp_error_to_string(result));
568 return EXIT_FAILURE;
569 \}
570
571 /* set the number to 5 */
572 mp_set(&number, 5);
573
574 /* we're done with it. */
575 mp_clear(&number);
576
577 return EXIT_SUCCESS;
578 \}
579 \end{alltt} \end{small}
580
581 \subsection{Long Constants}
582
583 To set a constant that is the size of an ISO C ``unsigned long'' and larger than a single digit the following function
584 can be used.
585
586 \index{mp\_set\_int}
587 \begin{alltt}
588 int mp_set_int (mp_int * a, unsigned long b);
589 \end{alltt}
590
591 This will assign the value of the 32-bit variable $b$ to the mp\_int $a$. Unlike mp\_set() this function will always
592 accept a 32-bit input regardless of the size of a single digit. However, since the value may span several digits
593 this function can fail if it runs out of heap memory.
594
595 To get the ``unsigned long'' copy of an mp\_int the following function can be used.
596
597 \index{mp\_get\_int}
598 \begin{alltt}
599 unsigned long mp_get_int (mp_int * a);
600 \end{alltt}
601
602 This will return the 32 least significant bits of the mp\_int $a$.
603
604 \begin{small} \begin{alltt}
605 int main(void)
606 \{
607 mp_int number;
608 int result;
609
610 if ((result = mp_init(&number)) != MP_OKAY) \{
611 printf("Error initializing the number. \%s",
612 mp_error_to_string(result));
613 return EXIT_FAILURE;
614 \}
615
616 /* set the number to 654321 (note this is bigger than 127) */
617 if ((result = mp_set_int(&number, 654321)) != MP_OKAY) \{
618 printf("Error setting the value of the number. \%s",
619 mp_error_to_string(result));
620 return EXIT_FAILURE;
621 \}
622
623 printf("number == \%lu", mp_get_int(&number));
624
625 /* we're done with it. */
626 mp_clear(&number);
627
628 return EXIT_SUCCESS;
629 \}
630 \end{alltt} \end{small}
631
632 This should output the following if the program succeeds.
633
634 \begin{alltt}
635 number == 654321
636 \end{alltt}
637
638 \subsection{Initialize and Setting Constants}
639 To both initialize and set small constants the following two functions are available.
640 \index{mp\_init\_set} \index{mp\_init\_set\_int}
641 \begin{alltt}
642 int mp_init_set (mp_int * a, mp_digit b);
643 int mp_init_set_int (mp_int * a, unsigned long b);
644 \end{alltt}
645
646 Both functions work like the previous counterparts except they first mp\_init $a$ before setting the values.
647
648 \begin{alltt}
649 int main(void)
650 \{
651 mp_int number1, number2;
652 int result;
653
654 /* initialize and set a single digit */
655 if ((result = mp_init_set(&number1, 100)) != MP_OKAY) \{
656 printf("Error setting number1: \%s",
657 mp_error_to_string(result));
658 return EXIT_FAILURE;
659 \}
660
661 /* initialize and set a long */
662 if ((result = mp_init_set_int(&number2, 1023)) != MP_OKAY) \{
663 printf("Error setting number2: \%s",
664 mp_error_to_string(result));
665 return EXIT_FAILURE;
666 \}
667
668 /* display */
669 printf("Number1, Number2 == \%lu, \%lu",
670 mp_get_int(&number1), mp_get_int(&number2));
671
672 /* clear */
673 mp_clear_multi(&number1, &number2, NULL);
674
675 return EXIT_SUCCESS;
676 \}
677 \end{alltt}
678
679 If this program succeeds it shall output.
680 \begin{alltt}
681 Number1, Number2 == 100, 1023
682 \end{alltt}
683
684 \section{Comparisons}
685
686 Comparisons in LibTomMath are always performed in a ``left to right'' fashion. There are three possible return codes
687 for any comparison.
688
689 \index{MP\_GT} \index{MP\_EQ} \index{MP\_LT}
690 \begin{figure}[here]
691 \begin{center}
692 \begin{tabular}{|c|c|}
693 \hline \textbf{Result Code} & \textbf{Meaning} \\
694 \hline MP\_GT & $a > b$ \\
695 \hline MP\_EQ & $a = b$ \\
696 \hline MP\_LT & $a < b$ \\
697 \hline
698 \end{tabular}
699 \end{center}
700 \caption{Comparison Codes for $a, b$}
701 \label{fig:CMP}
702 \end{figure}
703
704 In figure \ref{fig:CMP} two integers $a$ and $b$ are being compared. In this case $a$ is said to be ``to the left'' of
705 $b$.
706
707 \subsection{Unsigned comparison}
708
709 An unsigned comparison considers only the digits themselves and not the associated \textit{sign} flag of the
710 mp\_int structures. This is analogous to an absolute comparison. The function mp\_cmp\_mag() will compare two
711 mp\_int variables based on their digits only.
712
713 \index{mp\_cmp\_mag}
714 \begin{alltt}
715 int mp_cmp(mp_int * a, mp_int * b);
716 \end{alltt}
717 This will compare $a$ to $b$ placing $a$ to the left of $b$. This function cannot fail and will return one of the
718 three compare codes listed in figure \ref{fig:CMP}.
719
720 \begin{small} \begin{alltt}
721 int main(void)
722 \{
723 mp_int number1, number2;
724 int result;
725
726 if ((result = mp_init_multi(&number1, &number2, NULL)) != MP_OKAY) \{
727 printf("Error initializing the numbers. \%s",
728 mp_error_to_string(result));
729 return EXIT_FAILURE;
730 \}
731
732 /* set the number1 to 5 */
733 mp_set(&number1, 5);
734
735 /* set the number2 to -6 */
736 mp_set(&number2, 6);
737 if ((result = mp_neg(&number2, &number2)) != MP_OKAY) \{
738 printf("Error negating number2. \%s",
739 mp_error_to_string(result));
740 return EXIT_FAILURE;
741 \}
742
743 switch(mp_cmp_mag(&number1, &number2)) \{
744 case MP_GT: printf("|number1| > |number2|"); break;
745 case MP_EQ: printf("|number1| = |number2|"); break;
746 case MP_LT: printf("|number1| < |number2|"); break;
747 \}
748
749 /* we're done with it. */
750 mp_clear_multi(&number1, &number2, NULL);
751
752 return EXIT_SUCCESS;
753 \}
754 \end{alltt} \end{small}
755
756 If this program\footnote{This function uses the mp\_neg() function which is discussed in section \ref{sec:NEG}.} completes
757 successfully it should print the following.
758
759 \begin{alltt}
760 |number1| < |number2|
761 \end{alltt}
762
763 This is because $\vert -6 \vert = 6$ and obviously $5 < 6$.
764
765 \subsection{Signed comparison}
766
767 To compare two mp\_int variables based on their signed value the mp\_cmp() function is provided.
768
769 \index{mp\_cmp}
770 \begin{alltt}
771 int mp_cmp(mp_int * a, mp_int * b);
772 \end{alltt}
773
774 This will compare $a$ to the left of $b$. It will first compare the signs of the two mp\_int variables. If they
775 differ it will return immediately based on their signs. If the signs are equal then it will compare the digits
776 individually. This function will return one of the compare conditions codes listed in figure \ref{fig:CMP}.
777
778 \begin{small} \begin{alltt}
779 int main(void)
780 \{
781 mp_int number1, number2;
782 int result;
783
784 if ((result = mp_init_multi(&number1, &number2, NULL)) != MP_OKAY) \{
785 printf("Error initializing the numbers. \%s",
786 mp_error_to_string(result));
787 return EXIT_FAILURE;
788 \}
789
790 /* set the number1 to 5 */
791 mp_set(&number1, 5);
792
793 /* set the number2 to -6 */
794 mp_set(&number2, 6);
795 if ((result = mp_neg(&number2, &number2)) != MP_OKAY) \{
796 printf("Error negating number2. \%s",
797 mp_error_to_string(result));
798 return EXIT_FAILURE;
799 \}
800
801 switch(mp_cmp(&number1, &number2)) \{
802 case MP_GT: printf("number1 > number2"); break;
803 case MP_EQ: printf("number1 = number2"); break;
804 case MP_LT: printf("number1 < number2"); break;
805 \}
806
807 /* we're done with it. */
808 mp_clear_multi(&number1, &number2, NULL);
809
810 return EXIT_SUCCESS;
811 \}
812 \end{alltt} \end{small}
813
814 If this program\footnote{This function uses the mp\_neg() function which is discussed in section \ref{sec:NEG}.} completes
815 successfully it should print the following.
816
817 \begin{alltt}
818 number1 > number2
819 \end{alltt}
820
821 \subsection{Single Digit}
822
823 To compare a single digit against an mp\_int the following function has been provided.
824
825 \index{mp\_cmp\_d}
826 \begin{alltt}
827 int mp_cmp_d(mp_int * a, mp_digit b);
828 \end{alltt}
829
830 This will compare $a$ to the left of $b$ using a signed comparison. Note that it will always treat $b$ as
831 positive. This function is rather handy when you have to compare against small values such as $1$ (which often
832 comes up in cryptography). The function cannot fail and will return one of the tree compare condition codes
833 listed in figure \ref{fig:CMP}.
834
835
836 \begin{small} \begin{alltt}
837 int main(void)
838 \{
839 mp_int number;
840 int result;
841
842 if ((result = mp_init(&number)) != MP_OKAY) \{
843 printf("Error initializing the number. \%s",
844 mp_error_to_string(result));
845 return EXIT_FAILURE;
846 \}
847
848 /* set the number to 5 */
849 mp_set(&number, 5);
850
851 switch(mp_cmp_d(&number, 7)) \{
852 case MP_GT: printf("number > 7"); break;
853 case MP_EQ: printf("number = 7"); break;
854 case MP_LT: printf("number < 7"); break;
855 \}
856
857 /* we're done with it. */
858 mp_clear(&number);
859
860 return EXIT_SUCCESS;
861 \}
862 \end{alltt} \end{small}
863
864 If this program functions properly it will print out the following.
865
866 \begin{alltt}
867 number < 7
868 \end{alltt}
869
870 \section{Logical Operations}
871
872 Logical operations are operations that can be performed either with simple shifts or boolean operators such as
873 AND, XOR and OR directly. These operations are very quick.
874
875 \subsection{Multiplication by two}
876
877 Multiplications and divisions by any power of two can be performed with quick logical shifts either left or
878 right depending on the operation.
879
880 When multiplying or dividing by two a special case routine can be used which are as follows.
881 \index{mp\_mul\_2} \index{mp\_div\_2}
882 \begin{alltt}
883 int mp_mul_2(mp_int * a, mp_int * b);
884 int mp_div_2(mp_int * a, mp_int * b);
885 \end{alltt}
886
887 The former will assign twice $a$ to $b$ while the latter will assign half $a$ to $b$. These functions are fast
888 since the shift counts and maskes are hardcoded into the routines.
889
890 \begin{small} \begin{alltt}
891 int main(void)
892 \{
893 mp_int number;
894 int result;
895
896 if ((result = mp_init(&number)) != MP_OKAY) \{
897 printf("Error initializing the number. \%s",
898 mp_error_to_string(result));
899 return EXIT_FAILURE;
900 \}
901
902 /* set the number to 5 */
903 mp_set(&number, 5);
904
905 /* multiply by two */
906 if ((result = mp\_mul\_2(&number, &number)) != MP_OKAY) \{
907 printf("Error multiplying the number. \%s",
908 mp_error_to_string(result));
909 return EXIT_FAILURE;
910 \}
911 switch(mp_cmp_d(&number, 7)) \{
912 case MP_GT: printf("2*number > 7"); break;
913 case MP_EQ: printf("2*number = 7"); break;
914 case MP_LT: printf("2*number < 7"); break;
915 \}
916
917 /* now divide by two */
918 if ((result = mp\_div\_2(&number, &number)) != MP_OKAY) \{
919 printf("Error dividing the number. \%s",
920 mp_error_to_string(result));
921 return EXIT_FAILURE;
922 \}
923 switch(mp_cmp_d(&number, 7)) \{
924 case MP_GT: printf("2*number/2 > 7"); break;
925 case MP_EQ: printf("2*number/2 = 7"); break;
926 case MP_LT: printf("2*number/2 < 7"); break;
927 \}
928
929 /* we're done with it. */
930 mp_clear(&number);
931
932 return EXIT_SUCCESS;
933 \}
934 \end{alltt} \end{small}
935
936 If this program is successful it will print out the following text.
937
938 \begin{alltt}
939 2*number > 7
940 2*number/2 < 7
941 \end{alltt}
942
943 Since $10 > 7$ and $5 < 7$. To multiply by a power of two the following function can be used.
944
945 \index{mp\_mul\_2d}
946 \begin{alltt}
947 int mp_mul_2d(mp_int * a, int b, mp_int * c);
948 \end{alltt}
949
950 This will multiply $a$ by $2^b$ and store the result in ``c''. If the value of $b$ is less than or equal to
951 zero the function will copy $a$ to ``c'' without performing any further actions.
952
953 To divide by a power of two use the following.
954
955 \index{mp\_div\_2d}
956 \begin{alltt}
957 int mp_div_2d (mp_int * a, int b, mp_int * c, mp_int * d);
958 \end{alltt}
959 Which will divide $a$ by $2^b$, store the quotient in ``c'' and the remainder in ``d'. If $b \le 0$ then the
960 function simply copies $a$ over to ``c'' and zeroes $d$. The variable $d$ may be passed as a \textbf{NULL}
961 value to signal that the remainder is not desired.
962
963 \subsection{Polynomial Basis Operations}
964
965 Strictly speaking the organization of the integers within the mp\_int structures is what is known as a
966 ``polynomial basis''. This simply means a field element is stored by divisions of a radix. For example, if
967 $f(x) = \sum_{i=0}^{k} y_ix^k$ for any vector $\vec y$ then the array of digits in $\vec y$ are said to be
968 the polynomial basis representation of $z$ if $f(\beta) = z$ for a given radix $\beta$.
969
970 To multiply by the polynomial $g(x) = x$ all you have todo is shift the digits of the basis left one place. The
971 following function provides this operation.
972
973 \index{mp\_lshd}
974 \begin{alltt}
975 int mp_lshd (mp_int * a, int b);
976 \end{alltt}
977
978 This will multiply $a$ in place by $x^b$ which is equivalent to shifting the digits left $b$ places and inserting zeroes
979 in the least significant digits. Similarly to divide by a power of $x$ the following function is provided.
980
981 \index{mp\_rshd}
982 \begin{alltt}
983 void mp_rshd (mp_int * a, int b)
984 \end{alltt}
985 This will divide $a$ in place by $x^b$ and discard the remainder. This function cannot fail as it performs the operations
986 in place and no new digits are required to complete it.
987
988 \subsection{AND, OR and XOR Operations}
989
990 While AND, OR and XOR operations are not typical ``bignum functions'' they can be useful in several instances. The
991 three functions are prototyped as follows.
992
993 \index{mp\_or} \index{mp\_and} \index{mp\_xor}
994 \begin{alltt}
995 int mp_or (mp_int * a, mp_int * b, mp_int * c);
996 int mp_and (mp_int * a, mp_int * b, mp_int * c);
997 int mp_xor (mp_int * a, mp_int * b, mp_int * c);
998 \end{alltt}
999
1000 Which compute $c = a \odot b$ where $\odot$ is one of OR, AND or XOR.
1001
1002 \section{Addition and Subtraction}
1003
1004 To compute an addition or subtraction the following two functions can be used.
1005
1006 \index{mp\_add} \index{mp\_sub}
1007 \begin{alltt}
1008 int mp_add (mp_int * a, mp_int * b, mp_int * c);
1009 int mp_sub (mp_int * a, mp_int * b, mp_int * c)
1010 \end{alltt}
1011
1012 Which perform $c = a \odot b$ where $\odot$ is one of signed addition or subtraction. The operations are fully sign
1013 aware.
1014
1015 \section{Sign Manipulation}
1016 \subsection{Negation}
1017 \label{sec:NEG}
1018 Simple integer negation can be performed with the following.
1019
1020 \index{mp\_neg}
1021 \begin{alltt}
1022 int mp_neg (mp_int * a, mp_int * b);
1023 \end{alltt}
1024
1025 Which assigns $-a$ to $b$.
1026
1027 \subsection{Absolute}
1028 Simple integer absolutes can be performed with the following.
1029
1030 \index{mp\_neg}
1031 \begin{alltt}
1032 int mp_abs (mp_int * a, mp_int * b);
1033 \end{alltt}
1034
1035 Which assigns $\vert a \vert$ to $b$.
1036
1037 \section{Integer Division and Remainder}
1038 To perform a complete and general integer division with remainder use the following function.
1039
1040 \index{mp\_div}
1041 \begin{alltt}
1042 int mp_div (mp_int * a, mp_int * b, mp_int * c, mp_int * d);
1043 \end{alltt}
1044
1045 This divides $a$ by $b$ and stores the quotient in $c$ and $d$. The signed quotient is computed such that
1046 $bc + d = a$. Note that either of $c$ or $d$ can be set to \textbf{NULL} if their value is not required. If
1047 $b$ is zero the function returns \textbf{MP\_VAL}.
1048
1049
1050 \chapter{Multiplication and Squaring}
1051 \section{Multiplication}
1052 A full signed integer multiplication can be performed with the following.
1053 \index{mp\_mul}
1054 \begin{alltt}
1055 int mp_mul (mp_int * a, mp_int * b, mp_int * c);
1056 \end{alltt}
1057 Which assigns the full signed product $ab$ to $c$. This function actually breaks into one of four cases which are
1058 specific multiplication routines optimized for given parameters. First there are the Toom-Cook multiplications which
1059 should only be used with very large inputs. This is followed by the Karatsuba multiplications which are for moderate
1060 sized inputs. Then followed by the Comba and baseline multipliers.
1061
1062 Fortunately for the developer you don't really need to know this unless you really want to fine tune the system. mp\_mul()
1063 will determine on its own\footnote{Some tweaking may be required.} what routine to use automatically when it is called.
1064
1065 \begin{alltt}
1066 int main(void)
1067 \{
1068 mp_int number1, number2;
1069 int result;
1070
1071 /* Initialize the numbers */
1072 if ((result = mp_init_multi(&number1,
1073 &number2, NULL)) != MP_OKAY) \{
1074 printf("Error initializing the numbers. \%s",
1075 mp_error_to_string(result));
1076 return EXIT_FAILURE;
1077 \}
1078
1079 /* set the terms */
1080 if ((result = mp_set_int(&number, 257)) != MP_OKAY) \{
1081 printf("Error setting number1. \%s",
1082 mp_error_to_string(result));
1083 return EXIT_FAILURE;
1084 \}
1085
1086 if ((result = mp_set_int(&number2, 1023)) != MP_OKAY) \{
1087 printf("Error setting number2. \%s",
1088 mp_error_to_string(result));
1089 return EXIT_FAILURE;
1090 \}
1091
1092 /* multiply them */
1093 if ((result = mp_mul(&number1, &number2,
1094 &number1)) != MP_OKAY) \{
1095 printf("Error multiplying terms. \%s",
1096 mp_error_to_string(result));
1097 return EXIT_FAILURE;
1098 \}
1099
1100 /* display */
1101 printf("number1 * number2 == \%lu", mp_get_int(&number1));
1102
1103 /* free terms and return */
1104 mp_clear_multi(&number1, &number2, NULL);
1105
1106 return EXIT_SUCCESS;
1107 \}
1108 \end{alltt}
1109
1110 If this program succeeds it shall output the following.
1111
1112 \begin{alltt}
1113 number1 * number2 == 262911
1114 \end{alltt}
1115
1116 \section{Squaring}
1117 Since squaring can be performed faster than multiplication it is performed it's own function instead of just using
1118 mp\_mul().
1119
1120 \index{mp\_sqr}
1121 \begin{alltt}
1122 int mp_sqr (mp_int * a, mp_int * b);
1123 \end{alltt}
1124
1125 Will square $a$ and store it in $b$. Like the case of multiplication there are four different squaring
1126 algorithms all which can be called from mp\_sqr(). It is ideal to use mp\_sqr over mp\_mul when squaring terms.
1127
1128 \section{Tuning Polynomial Basis Routines}
1129
1130 Both of the Toom-Cook and Karatsuba multiplication algorithms are faster than the traditional $O(n^2)$ approach that
1131 the Comba and baseline algorithms use. At $O(n^{1.464973})$ and $O(n^{1.584962})$ running times respectfully they require
1132 considerably less work. For example, a 10000-digit multiplication would take roughly 724,000 single precision
1133 multiplications with Toom-Cook or 100,000,000 single precision multiplications with the standard Comba (a factor
1134 of 138).
1135
1136 So why not always use Karatsuba or Toom-Cook? The simple answer is that they have so much overhead that they're not
1137 actually faster than Comba until you hit distinct ``cutoff'' points. For Karatsuba with the default configuration,
1138 GCC 3.3.1 and an Athlon XP processor the cutoff point is roughly 110 digits (about 70 for the Intel P4). That is, at
1139 110 digits Karatsuba and Comba multiplications just about break even and for 110+ digits Karatsuba is faster.
1140
1141 Toom-Cook has incredible overhead and is probably only useful for very large inputs. So far no known cutoff points
1142 exist and for the most part I just set the cutoff points very high to make sure they're not called.
1143
1144 A demo program in the ``etc/'' directory of the project called ``tune.c'' can be used to find the cutoff points. This
1145 can be built with GCC as follows
1146
1147 \begin{alltt}
1148 make XXX
1149 \end{alltt}
1150 Where ``XXX'' is one of the following entries from the table \ref{fig:tuning}.
1151
1152 \begin{figure}[here]
1153 \begin{center}
1154 \begin{small}
1155 \begin{tabular}{|l|l|}
1156 \hline \textbf{Value of XXX} & \textbf{Meaning} \\
1157 \hline tune & Builds portable tuning application \\
1158 \hline tune86 & Builds x86 (pentium and up) program for COFF \\
1159 \hline tune86c & Builds x86 program for Cygwin \\
1160 \hline tune86l & Builds x86 program for Linux (ELF format) \\
1161 \hline
1162 \end{tabular}
1163 \end{small}
1164 \end{center}
1165 \caption{Build Names for Tuning Programs}
1166 \label{fig:tuning}
1167 \end{figure}
1168
1169 When the program is running it will output a series of measurements for different cutoff points. It will first find
1170 good Karatsuba squaring and multiplication points. Then it proceeds to find Toom-Cook points. Note that the Toom-Cook
1171 tuning takes a very long time as the cutoff points are likely to be very high.
1172
1173 \chapter{Modular Reduction}
1174
1175 Modular reduction is process of taking the remainder of one quantity divided by another. Expressed
1176 as (\ref{eqn:mod}) the modular reduction is equivalent to the remainder of $b$ divided by $c$.
1177
1178 \begin{equation}
1179 a \equiv b \mbox{ (mod }c\mbox{)}
1180 \label{eqn:mod}
1181 \end{equation}
1182
1183 Of particular interest to cryptography are reductions where $b$ is limited to the range $0 \le b < c^2$ since particularly
1184 fast reduction algorithms can be written for the limited range.
1185
1186 Note that one of the four optimized reduction algorithms are automatically chosen in the modular exponentiation
1187 algorithm mp\_exptmod when an appropriate modulus is detected.
1188
1189 \section{Straight Division}
1190 In order to effect an arbitrary modular reduction the following algorithm is provided.
1191
1192 \index{mp\_mod}
1193 \begin{alltt}
1194 int mp_mod(mp_int *a, mp_int *b, mp_int *c);
1195 \end{alltt}
1196
1197 This reduces $a$ modulo $b$ and stores the result in $c$. The sign of $c$ shall agree with the sign
1198 of $b$. This algorithm accepts an input $a$ of any range and is not limited by $0 \le a < b^2$.
1199
1200 \section{Barrett Reduction}
1201
1202 Barrett reduction is a generic optimized reduction algorithm that requires pre--computation to achieve
1203 a decent speedup over straight division. First a $mu$ value must be precomputed with the following function.
1204
1205 \index{mp\_reduce\_setup}
1206 \begin{alltt}
1207 int mp_reduce_setup(mp_int *a, mp_int *b);
1208 \end{alltt}
1209
1210 Given a modulus in $b$ this produces the required $mu$ value in $a$. For any given modulus this only has to
1211 be computed once. Modular reduction can now be performed with the following.
1212
1213 \index{mp\_reduce}
1214 \begin{alltt}
1215 int mp_reduce(mp_int *a, mp_int *b, mp_int *c);
1216 \end{alltt}
1217
1218 This will reduce $a$ in place modulo $b$ with the precomputed $mu$ value in $c$. $a$ must be in the range
1219 $0 \le a < b^2$.
1220
1221 \begin{alltt}
1222 int main(void)
1223 \{
1224 mp_int a, b, c, mu;
1225 int result;
1226
1227 /* initialize a,b to desired values, mp_init mu,
1228 * c and set c to 1...we want to compute a^3 mod b
1229 */
1230
1231 /* get mu value */
1232 if ((result = mp_reduce_setup(&mu, b)) != MP_OKAY) \{
1233 printf("Error getting mu. \%s",
1234 mp_error_to_string(result));
1235 return EXIT_FAILURE;
1236 \}
1237
1238 /* square a to get c = a^2 */
1239 if ((result = mp_sqr(&a, &c)) != MP_OKAY) \{
1240 printf("Error squaring. \%s",
1241 mp_error_to_string(result));
1242 return EXIT_FAILURE;
1243 \}
1244
1245 /* now reduce `c' modulo b */
1246 if ((result = mp_reduce(&c, &b, &mu)) != MP_OKAY) \{
1247 printf("Error reducing. \%s",
1248 mp_error_to_string(result));
1249 return EXIT_FAILURE;
1250 \}
1251
1252 /* multiply a to get c = a^3 */
1253 if ((result = mp_mul(&a, &c, &c)) != MP_OKAY) \{
1254 printf("Error reducing. \%s",
1255 mp_error_to_string(result));
1256 return EXIT_FAILURE;
1257 \}
1258
1259 /* now reduce `c' modulo b */
1260 if ((result = mp_reduce(&c, &b, &mu)) != MP_OKAY) \{
1261 printf("Error reducing. \%s",
1262 mp_error_to_string(result));
1263 return EXIT_FAILURE;
1264 \}
1265
1266 /* c now equals a^3 mod b */
1267
1268 return EXIT_SUCCESS;
1269 \}
1270 \end{alltt}
1271
1272 This program will calculate $a^3 \mbox{ mod }b$ if all the functions succeed.
1273
1274 \section{Montgomery Reduction}
1275
1276 Montgomery is a specialized reduction algorithm for any odd moduli. Like Barrett reduction a pre--computation
1277 step is required. This is accomplished with the following.
1278
1279 \index{mp\_montgomery\_setup}
1280 \begin{alltt}
1281 int mp_montgomery_setup(mp_int *a, mp_digit *mp);
1282 \end{alltt}
1283
1284 For the given odd moduli $a$ the precomputation value is placed in $mp$. The reduction is computed with the
1285 following.
1286
1287 \index{mp\_montgomery\_reduce}
1288 \begin{alltt}
1289 int mp_montgomery_reduce(mp_int *a, mp_int *m, mp_digit mp);
1290 \end{alltt}
1291 This reduces $a$ in place modulo $m$ with the pre--computed value $mp$. $a$ must be in the range
1292 $0 \le a < b^2$.
1293
1294 Montgomery reduction is faster than Barrett reduction for moduli smaller than the ``comba'' limit. With the default
1295 setup for instance, the limit is $127$ digits ($3556$--bits). Note that this function is not limited to
1296 $127$ digits just that it falls back to a baseline algorithm after that point.
1297
1298 An important observation is that this reduction does not return $a \mbox{ mod }m$ but $aR^{-1} \mbox{ mod }m$
1299 where $R = \beta^n$, $n$ is the n number of digits in $m$ and $\beta$ is radix used (default is $2^{28}$).
1300
1301 To quickly calculate $R$ the following function was provided.
1302
1303 \index{mp\_montgomery\_calc\_normalization}
1304 \begin{alltt}
1305 int mp_montgomery_calc_normalization(mp_int *a, mp_int *b);
1306 \end{alltt}
1307 Which calculates $a = R$ for the odd moduli $b$ without using multiplication or division.
1308
1309 The normal modus operandi for Montgomery reductions is to normalize the integers before entering the system. For
1310 example, to calculate $a^3 \mbox { mod }b$ using Montgomery reduction the value of $a$ can be normalized by
1311 multiplying it by $R$. Consider the following code snippet.
1312
1313 \begin{alltt}
1314 int main(void)
1315 \{
1316 mp_int a, b, c, R;
1317 mp_digit mp;
1318 int result;
1319
1320 /* initialize a,b to desired values,
1321 * mp_init R, c and set c to 1....
1322 */
1323
1324 /* get normalization */
1325 if ((result = mp_montgomery_calc_normalization(&R, b)) != MP_OKAY) \{
1326 printf("Error getting norm. \%s",
1327 mp_error_to_string(result));
1328 return EXIT_FAILURE;
1329 \}
1330
1331 /* get mp value */
1332 if ((result = mp_montgomery_setup(&c, &mp)) != MP_OKAY) \{
1333 printf("Error setting up montgomery. \%s",
1334 mp_error_to_string(result));
1335 return EXIT_FAILURE;
1336 \}
1337
1338 /* normalize `a' so now a is equal to aR */
1339 if ((result = mp_mulmod(&a, &R, &b, &a)) != MP_OKAY) \{
1340 printf("Error computing aR. \%s",
1341 mp_error_to_string(result));
1342 return EXIT_FAILURE;
1343 \}
1344
1345 /* square a to get c = a^2R^2 */
1346 if ((result = mp_sqr(&a, &c)) != MP_OKAY) \{
1347 printf("Error squaring. \%s",
1348 mp_error_to_string(result));
1349 return EXIT_FAILURE;
1350 \}
1351
1352 /* now reduce `c' back down to c = a^2R^2 * R^-1 == a^2R */
1353 if ((result = mp_montgomery_reduce(&c, &b, mp)) != MP_OKAY) \{
1354 printf("Error reducing. \%s",
1355 mp_error_to_string(result));
1356 return EXIT_FAILURE;
1357 \}
1358
1359 /* multiply a to get c = a^3R^2 */
1360 if ((result = mp_mul(&a, &c, &c)) != MP_OKAY) \{
1361 printf("Error reducing. \%s",
1362 mp_error_to_string(result));
1363 return EXIT_FAILURE;
1364 \}
1365
1366 /* now reduce `c' back down to c = a^3R^2 * R^-1 == a^3R */
1367 if ((result = mp_montgomery_reduce(&c, &b, mp)) != MP_OKAY) \{
1368 printf("Error reducing. \%s",
1369 mp_error_to_string(result));
1370 return EXIT_FAILURE;
1371 \}
1372
1373 /* now reduce (again) `c' back down to c = a^3R * R^-1 == a^3 */
1374 if ((result = mp_montgomery_reduce(&c, &b, mp)) != MP_OKAY) \{
1375 printf("Error reducing. \%s",
1376 mp_error_to_string(result));
1377 return EXIT_FAILURE;
1378 \}
1379
1380 /* c now equals a^3 mod b */
1381
1382 return EXIT_SUCCESS;
1383 \}
1384 \end{alltt}
1385
1386 This particular example does not look too efficient but it demonstrates the point of the algorithm. By
1387 normalizing the inputs the reduced results are always of the form $aR$ for some variable $a$. This allows
1388 a single final reduction to correct for the normalization and the fast reduction used within the algorithm.
1389
1390 For more details consider examining the file \textit{bn\_mp\_exptmod\_fast.c}.
1391
1392 \section{Restricted Dimminished Radix}
1393
1394 ``Dimminished Radix'' reduction refers to reduction with respect to moduli that are ameniable to simple
1395 digit shifting and small multiplications. In this case the ``restricted'' variant refers to moduli of the
1396 form $\beta^k - p$ for some $k \ge 0$ and $0 < p < \beta$ where $\beta$ is the radix (default to $2^{28}$).
1397
1398 As in the case of Montgomery reduction there is a pre--computation phase required for a given modulus.
1399
1400 \index{mp\_dr\_setup}
1401 \begin{alltt}
1402 void mp_dr_setup(mp_int *a, mp_digit *d);
1403 \end{alltt}
1404
1405 This computes the value required for the modulus $a$ and stores it in $d$. This function cannot fail
1406 and does not return any error codes. After the pre--computation a reduction can be performed with the
1407 following.
1408
1409 \index{mp\_dr\_reduce}
1410 \begin{alltt}
1411 int mp_dr_reduce(mp_int *a, mp_int *b, mp_digit mp);
1412 \end{alltt}
1413
1414 This reduces $a$ in place modulo $b$ with the pre--computed value $mp$. $b$ must be of a restricted
1415 dimminished radix form and $a$ must be in the range $0 \le a < b^2$. Dimminished radix reductions are
1416 much faster than both Barrett and Montgomery reductions as they have a much lower asymtotic running time.
1417
1418 Since the moduli are restricted this algorithm is not particularly useful for something like Rabin, RSA or
1419 BBS cryptographic purposes. This reduction algorithm is useful for Diffie-Hellman and ECC where fixed
1420 primes are acceptable.
1421
1422 Note that unlike Montgomery reduction there is no normalization process. The result of this function is
1423 equal to the correct residue.
1424
1425 \section{Unrestricted Dimminshed Radix}
1426
1427 Unrestricted reductions work much like the restricted counterparts except in this case the moduli is of the
1428 form $2^k - p$ for $0 < p < \beta$. In this sense the unrestricted reductions are more flexible as they
1429 can be applied to a wider range of numbers.
1430
1431 \index{mp\_reduce\_2k\_setup}
1432 \begin{alltt}
1433 int mp_reduce_2k_setup(mp_int *a, mp_digit *d);
1434 \end{alltt}
1435
1436 This will compute the required $d$ value for the given moduli $a$.
1437
1438 \index{mp\_reduce\_2k}
1439 \begin{alltt}
1440 int mp_reduce_2k(mp_int *a, mp_int *n, mp_digit d);
1441 \end{alltt}
1442
1443 This will reduce $a$ in place modulo $n$ with the pre--computed value $d$. From my experience this routine is
1444 slower than mp\_dr\_reduce but faster for most moduli sizes than the Montgomery reduction.
1445
1446 \chapter{Exponentiation}
1447 \section{Single Digit Exponentiation}
1448 \index{mp\_expt\_d}
1449 \begin{alltt}
1450 int mp_expt_d (mp_int * a, mp_digit b, mp_int * c)
1451 \end{alltt}
1452 This computes $c = a^b$ using a simple binary left-to-right algorithm. It is faster than repeated multiplications by
1453 $a$ for all values of $b$ greater than three.
1454
1455 \section{Modular Exponentiation}
1456 \index{mp\_exptmod}
1457 \begin{alltt}
1458 int mp_exptmod (mp_int * G, mp_int * X, mp_int * P, mp_int * Y)
1459 \end{alltt}
1460 This computes $Y \equiv G^X \mbox{ (mod }P\mbox{)}$ using a variable width sliding window algorithm. This function
1461 will automatically detect the fastest modular reduction technique to use during the operation. For negative values of
1462 $X$ the operation is performed as $Y \equiv (G^{-1} \mbox{ mod }P)^{\vert X \vert} \mbox{ (mod }P\mbox{)}$ provided that
1463 $gcd(G, P) = 1$.
1464
1465 This function is actually a shell around the two internal exponentiation functions. This routine will automatically
1466 detect when Barrett, Montgomery, Restricted and Unrestricted Dimminished Radix based exponentiation can be used. Generally
1467 moduli of the a ``restricted dimminished radix'' form lead to the fastest modular exponentiations. Followed by Montgomery
1468 and the other two algorithms.
1469
1470 \section{Root Finding}
1471 \index{mp\_n\_root}
1472 \begin{alltt}
1473 int mp_n_root (mp_int * a, mp_digit b, mp_int * c)
1474 \end{alltt}
1475 This computes $c = a^{1/b}$ such that $c^b \le a$ and $(c+1)^b > a$. The implementation of this function is not
1476 ideal for values of $b$ greater than three. It will work but become very slow. So unless you are working with very small
1477 numbers (less than 1000 bits) I'd avoid $b > 3$ situations. Will return a positive root only for even roots and return
1478 a root with the sign of the input for odd roots. For example, performing $4^{1/2}$ will return $2$ whereas $(-8)^{1/3}$
1479 will return $-2$.
1480
1481 This algorithm uses the ``Newton Approximation'' method and will converge on the correct root fairly quickly. Since
1482 the algorithm requires raising $a$ to the power of $b$ it is not ideal to attempt to find roots for large
1483 values of $b$. If particularly large roots are required then a factor method could be used instead. For example,
1484 $a^{1/16}$ is equivalent to $\left (a^{1/4} \right)^{1/4}$.
1485
1486 \chapter{Prime Numbers}
1487 \section{Trial Division}
1488 \index{mp\_prime\_is\_divisible}
1489 \begin{alltt}
1490 int mp_prime_is_divisible (mp_int * a, int *result)
1491 \end{alltt}
1492 This will attempt to evenly divide $a$ by a list of primes\footnote{Default is the first 256 primes.} and store the
1493 outcome in ``result''. That is if $result = 0$ then $a$ is not divisible by the primes, otherwise it is. Note that
1494 if the function does not return \textbf{MP\_OKAY} the value in ``result'' should be considered undefined\footnote{Currently
1495 the default is to set it to zero first.}.
1496
1497 \section{Fermat Test}
1498 \index{mp\_prime\_fermat}
1499 \begin{alltt}
1500 int mp_prime_fermat (mp_int * a, mp_int * b, int *result)
1501 \end{alltt}
1502 Performs a Fermat primality test to the base $b$. That is it computes $b^a \mbox{ mod }a$ and tests whether the value is
1503 equal to $b$ or not. If the values are equal then $a$ is probably prime and $result$ is set to one. Otherwise $result$
1504 is set to zero.
1505
1506 \section{Miller-Rabin Test}
1507 \index{mp\_prime\_miller\_rabin}
1508 \begin{alltt}
1509 int mp_prime_miller_rabin (mp_int * a, mp_int * b, int *result)
1510 \end{alltt}
1511 Performs a Miller-Rabin test to the base $b$ of $a$. This test is much stronger than the Fermat test and is very hard to
1512 fool (besides with Carmichael numbers). If $a$ passes the test (therefore is probably prime) $result$ is set to one.
1513 Otherwise $result$ is set to zero.
1514
1515 Note that is suggested that you use the Miller-Rabin test instead of the Fermat test since all of the failures of
1516 Miller-Rabin are a subset of the failures of the Fermat test.
1517
1518 \subsection{Required Number of Tests}
1519 Generally to ensure a number is very likely to be prime you have to perform the Miller-Rabin with at least a half-dozen
1520 or so unique bases. However, it has been proven that the probability of failure goes down as the size of the input goes up.
1521 This is why a simple function has been provided to help out.
1522
1523 \index{mp\_prime\_rabin\_miller\_trials}
1524 \begin{alltt}
1525 int mp_prime_rabin_miller_trials(int size)
1526 \end{alltt}
1527 This returns the number of trials required for a $2^{-96}$ (or lower) probability of failure for a given ``size'' expressed
1528 in bits. This comes in handy specially since larger numbers are slower to test. For example, a 512-bit number would
1529 require ten tests whereas a 1024-bit number would only require four tests.
1530
1531 You should always still perform a trial division before a Miller-Rabin test though.
1532
1533 \section{Primality Testing}
1534 \index{mp\_prime\_is\_prime}
1535 \begin{alltt}
1536 int mp_prime_is_prime (mp_int * a, int t, int *result)
1537 \end{alltt}
1538 This will perform a trial division followed by $t$ rounds of Miller-Rabin tests on $a$ and store the result in $result$.
1539 If $a$ passes all of the tests $result$ is set to one, otherwise it is set to zero. Note that $t$ is bounded by
1540 $1 \le t < PRIME\_SIZE$ where $PRIME\_SIZE$ is the number of primes in the prime number table (by default this is $256$).
1541
1542 \section{Next Prime}
1543 \index{mp\_prime\_next\_prime}
1544 \begin{alltt}
1545 int mp_prime_next_prime(mp_int *a, int t, int bbs_style)
1546 \end{alltt}
1547 This finds the next prime after $a$ that passes mp\_prime\_is\_prime() with $t$ tests. Set $bbs\_style$ to one if you
1548 want only the next prime congruent to $3 \mbox{ mod } 4$, otherwise set it to zero to find any next prime.
1549
1550 \section{Random Primes}
1551 \index{mp\_prime\_random}
1552 \begin{alltt}
1553 int mp_prime_random(mp_int *a, int t, int size, int bbs,
1554 ltm_prime_callback cb, void *dat)
1555 \end{alltt}
1556 This will find a prime greater than $256^{size}$ which can be ``bbs\_style'' or not depending on $bbs$ and must pass
1557 $t$ rounds of tests. The ``ltm\_prime\_callback'' is a typedef for
1558
1559 \begin{alltt}
1560 typedef int ltm_prime_callback(unsigned char *dst, int len, void *dat);
1561 \end{alltt}
1562
1563 Which is a function that must read $len$ bytes (and return the amount stored) into $dst$. The $dat$ variable is simply
1564 copied from the original input. It can be used to pass RNG context data to the callback. The function
1565 mp\_prime\_random() is more suitable for generating primes which must be secret (as in the case of RSA) since there
1566 is no skew on the least significant bits.
1567
1568 \textit{Note:} As of v0.30 of the LibTomMath library this function has been deprecated. It is still available
1569 but users are encouraged to use the new mp\_prime\_random\_ex() function instead.
1570
1571 \subsection{Extended Generation}
1572 \index{mp\_prime\_random\_ex}
1573 \begin{alltt}
1574 int mp_prime_random_ex(mp_int *a, int t,
1575 int size, int flags,
1576 ltm_prime_callback cb, void *dat);
1577 \end{alltt}
1578 This will generate a prime in $a$ using $t$ tests of the primality testing algorithms. The variable $size$
1579 specifies the bit length of the prime desired. The variable $flags$ specifies one of several options available
1580 (see fig. \ref{fig:primeopts}) which can be OR'ed together. The callback parameters are used as in
1581 mp\_prime\_random().
1582
1583 \begin{figure}[here]
1584 \begin{center}
1585 \begin{small}
1586 \begin{tabular}{|r|l|}
1587 \hline \textbf{Flag} & \textbf{Meaning} \\
1588 \hline LTM\_PRIME\_BBS & Make the prime congruent to $3$ modulo $4$ \\
1589 \hline LTM\_PRIME\_SAFE & Make a prime $p$ such that $(p - 1)/2$ is also prime. \\
1590 & This option implies LTM\_PRIME\_BBS as well. \\
1591 \hline LTM\_PRIME\_2MSB\_OFF & Makes sure that the bit adjacent to the most significant bit \\
1592 & Is forced to zero. \\
1593 \hline LTM\_PRIME\_2MSB\_ON & Makes sure that the bit adjacent to the most significant bit \\
1594 & Is forced to one. \\
1595 \hline
1596 \end{tabular}
1597 \end{small}
1598 \end{center}
1599 \caption{Primality Generation Options}
1600 \label{fig:primeopts}
1601 \end{figure}
1602
1603 \chapter{Input and Output}
1604 \section{ASCII Conversions}
1605 \subsection{To ASCII}
1606 \index{mp\_toradix}
1607 \begin{alltt}
1608 int mp_toradix (mp_int * a, char *str, int radix);
1609 \end{alltt}
1610 This still store $a$ in ``str'' as a base-``radix'' string of ASCII chars. This function appends a NUL character
1611 to terminate the string. Valid values of ``radix'' line in the range $[2, 64]$. To determine the size (exact) required
1612 by the conversion before storing any data use the following function.
1613
1614 \index{mp\_radix\_size}
1615 \begin{alltt}
1616 int mp_radix_size (mp_int * a, int radix, int *size)
1617 \end{alltt}
1618 This stores in ``size'' the number of characters (including space for the NUL terminator) required. Upon error this
1619 function returns an error code and ``size'' will be zero.
1620
1621 \subsection{From ASCII}
1622 \index{mp\_read\_radix}
1623 \begin{alltt}
1624 int mp_read_radix (mp_int * a, char *str, int radix);
1625 \end{alltt}
1626 This will read the base-``radix'' NUL terminated string from ``str'' into $a$. It will stop reading when it reads a
1627 character it does not recognize (which happens to include th NUL char... imagine that...). A single leading $-$ sign
1628 can be used to denote a negative number.
1629
1630 \section{Binary Conversions}
1631
1632 Converting an mp\_int to and from binary is another keen idea.
1633
1634 \index{mp\_unsigned\_bin\_size}
1635 \begin{alltt}
1636 int mp_unsigned_bin_size(mp_int *a);
1637 \end{alltt}
1638
1639 This will return the number of bytes (octets) required to store the unsigned copy of the integer $a$.
1640
1641 \index{mp\_to\_unsigned\_bin}
1642 \begin{alltt}
1643 int mp_to_unsigned_bin(mp_int *a, unsigned char *b);
1644 \end{alltt}
1645 This will store $a$ into the buffer $b$ in big--endian format. Fortunately this is exactly what DER (or is it ASN?)
1646 requires. It does not store the sign of the integer.
1647
1648 \index{mp\_read\_unsigned\_bin}
1649 \begin{alltt}
1650 int mp_read_unsigned_bin(mp_int *a, unsigned char *b, int c);
1651 \end{alltt}
1652 This will read in an unsigned big--endian array of bytes (octets) from $b$ of length $c$ into $a$. The resulting
1653 integer $a$ will always be positive.
1654
1655 For those who acknowledge the existence of negative numbers (heretic!) there are ``signed'' versions of the
1656 previous functions.
1657
1658 \begin{alltt}
1659 int mp_signed_bin_size(mp_int *a);
1660 int mp_read_signed_bin(mp_int *a, unsigned char *b, int c);
1661 int mp_to_signed_bin(mp_int *a, unsigned char *b);
1662 \end{alltt}
1663 They operate essentially the same as the unsigned copies except they prefix the data with zero or non--zero
1664 byte depending on the sign. If the sign is zpos (e.g. not negative) the prefix is zero, otherwise the prefix
1665 is non--zero.
1666
1667 \chapter{Algebraic Functions}
1668 \section{Extended Euclidean Algorithm}
1669 \index{mp\_exteuclid}
1670 \begin{alltt}
1671 int mp_exteuclid(mp_int *a, mp_int *b,
1672 mp_int *U1, mp_int *U2, mp_int *U3);
1673 \end{alltt}
1674
1675 This finds the triple U1/U2/U3 using the Extended Euclidean algorithm such that the following equation holds.
1676
1677 \begin{equation}
1678 a \cdot U1 + b \cdot U2 = U3
1679 \end{equation}
1680
1681 Any of the U1/U2/U3 paramters can be set to \textbf{NULL} if they are not desired.
1682
1683 \section{Greatest Common Divisor}
1684 \index{mp\_gcd}
1685 \begin{alltt}
1686 int mp_gcd (mp_int * a, mp_int * b, mp_int * c)
1687 \end{alltt}
1688 This will compute the greatest common divisor of $a$ and $b$ and store it in $c$.
1689
1690 \section{Least Common Multiple}
1691 \index{mp\_lcm}
1692 \begin{alltt}
1693 int mp_lcm (mp_int * a, mp_int * b, mp_int * c)
1694 \end{alltt}
1695 This will compute the least common multiple of $a$ and $b$ and store it in $c$.
1696
1697 \section{Jacobi Symbol}
1698 \index{mp\_jacobi}
1699 \begin{alltt}
1700 int mp_jacobi (mp_int * a, mp_int * p, int *c)
1701 \end{alltt}
1702 This will compute the Jacobi symbol for $a$ with respect to $p$. If $p$ is prime this essentially computes the Legendre
1703 symbol. The result is stored in $c$ and can take on one of three values $\lbrace -1, 0, 1 \rbrace$. If $p$ is prime
1704 then the result will be $-1$ when $a$ is not a quadratic residue modulo $p$. The result will be $0$ if $a$ divides $p$
1705 and the result will be $1$ if $a$ is a quadratic residue modulo $p$.
1706
1707 \section{Modular Inverse}
1708 \index{mp\_invmod}
1709 \begin{alltt}
1710 int mp_invmod (mp_int * a, mp_int * b, mp_int * c)
1711 \end{alltt}
1712 Computes the multiplicative inverse of $a$ modulo $b$ and stores the result in $c$ such that $ac \equiv 1 \mbox{ (mod }b\mbox{)}$.
1713
1714 \section{Single Digit Functions}
1715
1716 For those using small numbers (\textit{snicker snicker}) there are several ``helper'' functions
1717
1718 \index{mp\_add\_d} \index{mp\_sub\_d} \index{mp\_mul\_d} \index{mp\_div\_d} \index{mp\_mod\_d}
1719 \begin{alltt}
1720 int mp_add_d(mp_int *a, mp_digit b, mp_int *c);
1721 int mp_sub_d(mp_int *a, mp_digit b, mp_int *c);
1722 int mp_mul_d(mp_int *a, mp_digit b, mp_int *c);
1723 int mp_div_d(mp_int *a, mp_digit b, mp_int *c, mp_digit *d);
1724 int mp_mod_d(mp_int *a, mp_digit b, mp_digit *c);
1725 \end{alltt}
1726
1727 These work like the full mp\_int capable variants except the second parameter $b$ is a mp\_digit. These
1728 functions fairly handy if you have to work with relatively small numbers since you will not have to allocate
1729 an entire mp\_int to store a number like $1$ or $2$.
1730
1731 \input{bn.ind}
1732
1733 \end{document}