mirror of https://github.com/python/cpython.git
k_mul() comments: In honor of Dijkstra, made the proof that "t3 fits"
rigorous instead of hoping for testing not to turn up counterexamples. Call me heretical, but despite that I'm wholly confident in the proof, and have done it two different ways now, I still put more faith in testing ...
This commit is contained in:
parent
9973d74b2d
commit
ab86c2be24
|
@ -1757,40 +1757,43 @@ k_mul(PyLongObject *a, PyLongObject *b)
|
||||||
|
|
||||||
/* (*) Why adding t3 can't "run out of room" above.
|
/* (*) Why adding t3 can't "run out of room" above.
|
||||||
|
|
||||||
We allocated space for asize + bsize result digits. We're adding t3 at an
|
Let f(x) mean the floor of x and c(x) mean the ceiling of x. Some facts
|
||||||
offset of shift digits, so there are asize + bsize - shift allocated digits
|
to start with:
|
||||||
remaining. Because degenerate shifts of "a" were weeded out, asize is at
|
|
||||||
least shift + 1. If bsize is odd then bsize == 2*shift + 1, else bsize ==
|
|
||||||
2*shift. Therefore there are at least shift+1 + 2*shift - shift =
|
|
||||||
|
|
||||||
2*shift+1 allocated digits remaining when bsize is even, or at least
|
1. For any integer i, i = c(i/2) + f(i/2). In particular,
|
||||||
2*shift+2 allocated digits remaining when bsize is odd.
|
bsize = c(bsize/2) + f(bsize/2).
|
||||||
|
2. shift = f(bsize/2)
|
||||||
|
3. asize <= bsize
|
||||||
|
4. Since we call k_lopsided_mul if asize*2 <= bsize, asize*2 > bsize in this
|
||||||
|
routine, so asize > bsize/2 >= f(bsize/2) in this routine.
|
||||||
|
|
||||||
Now in bh+bl, if bsize is even bh has at most shift digits, while if bsize
|
We allocated asize + bsize result digits, and add t3 into them at an offset
|
||||||
is odd bh has at most shift+1 digits. The sum bh+bl has at most
|
of shift. This leaves asize+bsize-shift allocated digit positions for t3
|
||||||
|
to fit into, = (by #1 and #2) asize + f(bsize/2) + c(bsize/2) - f(bsize/2) =
|
||||||
|
asize + c(bsize/2) available digit positions.
|
||||||
|
|
||||||
shift digits plus 1 bit when bsize is even
|
bh has c(bsize/2) digits, and bl at most f(size/2) digits. So bh+hl has
|
||||||
shift+1 digits plus 1 bit when bsize is odd
|
at most c(bsize/2) digits + 1 bit.
|
||||||
|
|
||||||
The same is true of ah+al, so (ah+al)(bh+bl) has at most
|
If asize == bsize, ah has c(bsize/2) digits, else ah has at most f(bsize/2)
|
||||||
|
digits, and al has at most f(bsize/2) digits in any case. So ah+al has at
|
||||||
|
most (asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 1 bit.
|
||||||
|
|
||||||
2*shift digits + 2 bits when bsize is even
|
The product (ah+al)*(bh+bl) therefore has at most
|
||||||
2*shift+2 digits + 2 bits when bsize is odd
|
|
||||||
|
|
||||||
If bsize is even, we have at most 2*shift digits + 2 bits to fit into at
|
c(bsize/2) + (asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 2 bits
|
||||||
least 2*shift+1 digits. Since a digit has SHIFT bits, and SHIFT >= 2,
|
|
||||||
there's always enough room to fit the 2 bits into the "spare" digit.
|
|
||||||
|
|
||||||
If bsize is odd, we have at most 2*shift+2 digits + 2 bits to fit into at
|
and we have asize + c(bsize/2) available digit positions. We need to show
|
||||||
least 2*shift+2 digits, and there's not obviously enough room for the
|
this is always enough. An instance of c(bsize/2) cancels out in both, so
|
||||||
extra two bits. We need a sharper analysis in this case. The major
|
the question reduces to whether asize digits is enough to hold
|
||||||
laziness was in the "the same is true of ah+al" clause: ah+al can't actually
|
(asize == bsize ? c(bsize/2) : f(bsize/2)) digits + 2 bits. If asize < bsize,
|
||||||
have shift+1 digits + 1 bit unless bsize is odd and asize == bsize. In that
|
then we're asking whether asize digits >= f(bsize/2) digits + 2 bits. By #4,
|
||||||
case, we actually have (2*shift+1)*2 - shift = 3*shift+2 allocated digits
|
asize is at least f(bsize/2)+1 digits, so this in turn reduces to whether 1
|
||||||
remaining, and that's obviously plenty to hold 2*shift+2 digits + 2 bits.
|
digit is enough to hold 2 bits. This is so since SHIFT=15 >= 2. If
|
||||||
Else (bsize is odd and asize < bsize) ah and al each have at most shift digits,
|
asize == bsize, then we're asking whether bsize digits is enough to hold
|
||||||
so ah+al has at most shift digits + 1 bit, and (ah+al)*(bh+bl) has at most
|
f(bsize/2) digits + 2 bits, or equivalently (by #1) whether c(bsize/2) digits
|
||||||
2*shift+1 digits + 2 bits, and again 2*shift+2 digits is enough to hold it.
|
is enough to hold 2 bits. This is so if bsize >= 1, which holds because
|
||||||
|
bsize >= KARATSUBA_CUTOFF >= 1.
|
||||||
|
|
||||||
Note that since there's always enough room for (ah+al)*(bh+bl), and that's
|
Note that since there's always enough room for (ah+al)*(bh+bl), and that's
|
||||||
clearly >= each of ah*bh and al*bl, there's always enough room to subtract
|
clearly >= each of ah*bh and al*bl, there's always enough room to subtract
|
||||||
|
|
Loading…
Reference in New Issue