I asked this on physicsforums.com (which has a math section), but gotten no answers so far, so I'll ask here as well. It's a bit long, so bear with me.
The Karatsuba multiplication algorithm is a faster-than-O(n
2) (approximately O(n
1.58)) multiplication method of two large numbers. I have been working on a small project where I have implemented it (among other things), and I noticed something curious about it that I'm uncertain how to prove or disprove.
In order to present the question, I first need to explain how the algorithm works (because the question pertains to one particular part of it).
Let's represent the "digits" of a number (in whatever base we are using) with capital letters. Let's assume that we want to multiply, for example, two 10-digit numbers, let's call them
AAAAABBBBB and
CCCCCDDDDD. In order to calculate the result, we first calculate these partial results:
z0 = DDDDD * BBBBB = EEEEEEEEEE
z2 = CCCCC * AAAAA = GGGGGGGGGG
z1 = (AAAAA+BBBBB) * (CCCCC+DDDDD) - z2 - z0 = FFFFFFFFFFFF
(Those three multiplications can be calculated by calling this algorithm recursively, which is the idea with it, and where the time saving comes from, as the normal four multiplications that would be required with a normal long multiplication algorithm are reduced to three.)
Note that
z1 above is 2 digits longer than the others (this is always so regardless of the sizes of the input values). This is because the result of the two additions are 6 digits long (the most-significant digit may be 1) and thus the result of the multiplication is 12 digits long. (I believe, although I'm not certain, those two most-significant digits of the result are always zero after the two subtractions. However, I'm leaving them there because they are relevant to my actual question.)
In order to get the final result we have to add those three partial results, using an offset of 5 digits as needed (ie. in practice "multiplied by the base to the power of 5). This can be most clearly visualized like this:
AAAAABBBBB
* CCCCCDDDDD
--------------------
EEEEEEEEEE = z0
+ FFFFFFFFFFFF = z1
+ GGGGGGGGGG = z2
--------------------
RRRRRRRRRRRRRRRRRRRR = result
From a programming implementation point of view a slight space optimization can be done by noting that
z0 and
z2 do not overlap in the result. This means that they can be calculated directly into the result (thus removing the need for one temporary object, which may be quite large, to hold the value of
z2). In other words, the operation can be done like this:
AAAAABBBBB
* CCCCCDDDDD
--------------------
GGGGGGGGGGEEEEEEEEEE = z2,z0
+ 000FFFFFFFFFFFF = z1
--------------------
RRRRRRRRRRRRRRRRRRRR = result
The zeros at the beginning of that
z1 summand are there to indicate that, from a programmatical implementation point of view, when
z1 is added to the result in this manner, after adding its most-significant digit, if there is any carry left, it needs to be added to the result (in practice with an "add zero with carry" operation). In other words, as long as there is a carry, it needs to be added to the next digit of the result, and so on, until there is no carry left.
Except, and this is where my question comes in... if I leave out those "add zero with carry" operations completely, the result still seems to be correct. In other words, there never seems to be a carry left after adding the last (ie. most-significant) digit of
z1.
Note that since
z1 is two digits longer than necessary, it's in practice already doing two "add zero with carry" operations. And, in fact, if I leave the most-significant digit of z1 out, it still gives the correct result. (If I leave the two most-significant digits out, then it sometimes gives an incorrect result, so at least one carry needs to be added.)
I have tested with tens of millions of random input values to multiply, and it always gives the correct result, without those extra "add zero with carry" operations. However, I have the hunch that this is just coincidence. I have the hunch that it's simply the case that the situations where more than one "add zero with carry" is needed are extraordinarily rare, but they exist, and that if you don't do it, then you would get an incorrect result in those cases.
I have tried to come up with input values that would cause an erroneous result, but I can't come up with anything. But it may just be my lack of figuring out how this works precisely.
So I would like to present a conjecture: When
z1 above (of a size that's two digits larger than
z0 and
z2) is added to the result, no more carries are left, with any input values.
I don't know how to prove that. Disproving the conjecture would be relatively simple, though: Two values that multiplied in this manner give the wrong result, if those "add zero with carry" operations are not done. (I would be very grateful if someone could provide me with such numbers. Even better, a pattern of such numbers, so that I can generate them for any sizes. This would be a great addition to my testing code.)