Carrying out integer arithmetic as described will quickly lead to an integer overflow, i.e., the numbers arising in the analysis will soon be larger than the largest numbers that can be expressed in Java. So instead the arithmetic is carried out not on the number themselves but on their residuals with respect to large prime numbers. The largest such prime number is
p0=231-1 = 2,147,483,647
and the top 32 are listed in the box nearby. (If you really want to know look here for the top 1000.)
For two integers A and N we denote by
A mod N
the remainder of A divided by N. We call the number A mod N the residual of A with respect to (or modulo) N. For example,
38 mod 6 =2 since 38 divided by 6 equals 6 with a remainder 2. Thus the residual of 38 with respect to 6 equals 2. We also say the residual of 38 modulo 6 equals 2.
Taking remainders distributes over addition, subtraction, and multiplication.
i.e.,
(x + y) mod N = ((x mod N) + (y mod N)) mod N
(x - y) mod N = ((x mod N) - (y mod N)) mod N
(x * y) mod N = ((x mod N) * (y mod N)) mod N
(It does not make sense for division since the result of dividing two integers may not be an integer.)
When analyzing minimal determining sets and the dependence of some Bézier ordinates on others it only matters whether certain entries in a certain matrix are zero or non-zero. The utility of using remainders instead of the integers themselves rests in the following simple observation:
If A=0 then A mod N = 0 for all integers N.
Thus if we find that the residual of a number Nmodulo a large prime number is non-zero then N cannot be zero. It almost works the other way too: If the residual of N modulo a large prime number is zero then it is very likely (but not certain) that N is zero.
Why should we use prime numbers? Prime numbers become significant when we use several to increase our confidence in treating numbers as zero when all its residuals are zero.
If the numbers with respect to which which we compute residuals have common factors then some combinations of residuals are impossible and the range of numbers that we can distinguish is decreased.
This is best explained with an example.
Suppose we consider residuals with respect to N1=4 and N2=6. The following table lists the smallest number that gives rise to a combination of residuals. The columns correspond to the remainder modulo 6, the rows to the remainders modulo 4.
0 | 1 | 2 | 3 | 4 | 5 | |
0 | 0 | 8 | 4 | |||
1 | 1 | 9 | 5 | |||
2 | 6 | 2 | 10 | |||
3 | 7 | 3 | 11 |
For example,
9 mod 6 = 3 and 9 mod 4 = 1
and there is no number at all that has residual 1 mod 4 and residual 2 mod 6.
Compare this with the choice N1=3 and N5=6
The relevant table is now given by
0 | 1 | 2 | 3 | 4 | |
0 | 0 | 6 | 12 | 3 | 9 |
1 | 10 | 1 | 7 | 13 | 4 |
2 | 5 | 11 | 2 | 8 | 14 |
Thus all combinations of residuals are possible and the range of numbers contained in the table is larger, even though the table itself is smaller than that for the previous choice N1=4 and N2=6.
These simple observations are made more precise in the celebrated
Chinese Remainder Theorem. Suppose N1, N2, ..., Nn are relatively prime. Let
N=N1 * N2 * ... * Nn.
Then for any set of residuals n1, n2, ..., nn there exists a unique number
0 <= x
< N such that x mod Ni = n_i, i = 1,...,n.
Note that a corresponding result does not hold for our choice N1=4 and N2=6. Indeed, since 12 has a zero residual both modulo 4 and modulo 6, adding 12 to any number in the first table above gives another number between 0 and 23 that has the same residuals.
A set of numbers is relatively prime if their greatest common factor is 1. Instead of the largest primes available in Java we could thus use the largest available numbers that are relatively prime. However, using prime numbers (which of course are relatively prime) entails only a small loss in efficiency. The reason for using prime numbers, instead of relatively prime numbers, is that prime numbers are widely known and independent of each other. Also note that p0 = 231 -1 is not only the largest prime number expressible as an integer in Java, it is also the largest integer itself.
Java does allow for a data type long which allocates 64 bits (instead of 32) to every integer. However, that data type is used for computing the residuals of products, by first computing the product of two integers as a long integer. Using long as the basic data type would be possible, but it would require additional programming to compute residuals of products and sums that overflow the long data type and presumably would run more slowly. An effect similar to using the long data type can be obtained by doubling the number of primes used by the applet.
The residual of A modulo a positive integer N is defined more precisely to be the unique non-negative integer R less than N such that
A = mN + R.
It is thus a number that ranges from 0 to N-1. This is the notion of a residual that is required in applications like ours.
Unfortunately, Java, and other programming languages, botch this notion when computing the residuals of negative numbers.
For example,
-10 = 2 mod 6 since -10 = -2*6 + 2.
On the other hand, Java returns -4 as the residual of -10 mod 6.
In general, when computing the residual of a negative number with
respect to N one has to add N to the residual to get
the correct value. In this example, adding 6 to -4 gives 2, which is
the correct residual.
Dave Eyre has written
a program Goliath that employs residual arithmetic to analyze
large sparse rational linear systems. The program has been used
extensively in my spline research. For details examine items 33, 35,
38 in my bibliography.
The FORTRAN code of Goliath is available (as ACM Collected Algorithm
701) from Netlib, an online
software depository. Specifically, it is here.
The following Table shows results computed on the generic double
Clough-Tocher split. The columns contain:
Even for the largest realistic problems that can be handled by the
code, the likelihood of an accidentally zero residual is small. For
the largest possible problem the likelihood is still much better than even.
Even then an accidentally zero residual would be unlikely to cause the
computed dimension to be wrong (since it would effect the computation
only if it affects a pivot in the Gaussian elimination). It would not
follow that the computed dimension is wrong. Moreover,
the likelihood of failure can be further reduced, even to the
certainty of success, by using several prime numbers.
It is instructive to consider the performance of the code when
using small prime numbers since then failure may occur more readily.
The following table illustrates the performance of residual arithmetic.
The dimension of the spline space on the generic double Clough-Tocher
split, for r=1 and d=4 is 39. It was computed
using three consecutive prime numbers. The table gives the top one of
those prime numbers, and the color indicates the result, as follows:
Goliath
Dependability of Residual Arithmetic
There are two groups of examples: the first illustrating a case as it
might occur in actual computations, i.e., d=3r+1. The second
illustrates linear systems that are as dense as possible, ie.,
d=r. Memory limits the size of the linear system that can be
handled, and the denser the system the larger is the number of
residuals and the likelihood of accidental zeros.
r d m n rank dim
N
p 1 4 48 76 40 39 15,837 99.999% 2 7 156 223 133 99 364,335 99.983% 3 10 324 448 282 184 2,679,441 99.875% 4 13 552 751 484 297 12,042,693 99.440% 5 16 840 1132 741 436 42,066,300 98.600% 6 19 1188 1591 1051 603 114,682,971 94.799% 7 22 Out of Memory 1 1 12 7 4 3 648 99.999% 2 2 24 22 16 6 3372 99.999% 3 3 72 46 36 10 63,084 99.997% 4 4 120 79 64 15 259,350 99.987% 5 5 180 121 100 21 805,167 99.962% 6 6 252 172 144 28 2,075,643 99.903% 7 7 336 232 196 36 4,670,199 99.782% 8 8 432 301 256 45 9,503,046 99.558% 9 9 540 379 324 55 18,232,272 99.154% 10 10 660 466 400 66 32,025,258 98.519% 11 11 792 562 484 78 54,780,693 97.481% 12 12 936 667 576 91 88,723,074 95.952% 13 13 1092 781 676 105 139,291,431 93.719% 14 13 1260 904 784 120 211,128,900 90.636% 15 15 1440 1036 900 136 309,403,122 86.582% 16 16 1632 1177 1024 153 446,048,784 81.238% 17 17 1836 1327 1156 171 629,224,290 74.601% 18 18 Out of Memory Residual Arithmetic for Small Prime Numbers
All
entries in the linear system have at least one of their residuals
equal to zero. This makes Gaussian Elimination impossible and the
matrix is considered to have rank 0.
Red:
The computed dimension is too high. Some non-zero numbers in the linear
system are treated as being zero.
Purple:
The dimension is computed correctly. However, some entries in the
linear systems have a mixture of zero and non-zero residuals. The
program recognizes this as a potential problem. Numbers with mixed
residuals are considered non-zero, but they cannot serve as pivots.
Cyan:
The dimension is computed correctly and all non-zero entries in the linear system have three non-zero residuals. This is the expected and desired situation.
The smallest triple of primes having these properties are 83, 89, 97.
Green:
5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79
83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181
191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293
307 311 313 317 331 337 347 349 353 359 367 373 379 383 389 397 401 409 419 421
431 433 439 443 449 457 461 463 467 479 487 491 499 503 509 521 523 541 547 557
563 569 571 577 587 593 599 601 607 613 617 619 631 641 643 647 653 659 661 673
677 683 691 701 709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821
823 827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941 947 953
Notes
[29-Dec-2002]