with a single entry point, named exit points at the bottom, more self-evident
refcount adjustments, and a comment describing why the pre-increment was
necessary at all.
The real benefit of the unicode specialized function comes from
bypassing the overhead of PyObject_RichCompareBool() and not
from being in-lined (especially since there was almost no shared
data between the caller and callee). Also, the in-lining was
having a negative effect on code generation for the callee.
Simplifies the code a little bit and does the resize check
only when a new key is added (giving a small speed up in
the case where the key already exists).
Fixes possible bug in set_merge() where the set_insert_key()
call relies on a big resize at the start to make enough room
for the keys but is vulnerable to a comparision callback that
could cause the table to shrink in the middle of the merge.
Also, changed the resize threshold from two-thirds of the
mask+1 to just two-thirds. The plus one offset gave no
real benefit (afterall, the two-thirds mark is just a
heuristic and isn't a precise cut-off).
Elsewhere in the setobject.c code we do a bitwise-and with the mask
instead of using a conditional to reset to zero on wrap-around.
Using that same technique here use gives cleaner, faster, and more
consistent code.
* Move the test for an exact key match to after a hash match
* Use "used" as a loop counter instead of "fill"
* Minor improvements to variable names and code consistency
The setobject freelist was consuming memory but not providing much value.
Even when a freelisted setobject was available, most of the setobject
fields still needed to be initialized and the small table still required
a memset(). This meant that the custom freelisting scheme for sets was
providing almost no incremental benefit over the default Python freelist
scheme used by _PyObject_Malloc() in Objects/obmalloc.c.
Modern processors tend to make consecutive memory accesses cheaper than
random probes into memory.
Small sets can fit into L1 cache, so they get less benefit. But they do
come out ahead because the consecutive probes don't probe the same key
more than once and because the randomization step occurs less frequently
(or not at all).
For the open addressing step, putting the perturb shift before the index
calculation gets the upper bits into play sooner.
The Gdb prettyprint plugin depended on the dummy object being displayable.
Other solutions besides a unicode object are possible. For now, get it
back up and running.
The identity checks in lookkey() need to be there to prevent the dummy
object from leaking through Py_RichCompareBool() into user code in the
rare circumstance where the dummy's hash value exactly matches the hash
value of the actual key being looked up.
Letting the compiler decide how to optimize the multiply by five
gives it the freedom to make better choices for the best technique
for a given target machine.
For example, GCC on x86_64 produces a little bit better code:
Old-way (3 steps with a data dependency between each step):
shrq $5, %r13
leaq 1(%rbx,%r13), %rax
leaq (%rax,%rbx,4), %rbx
New-way (3 steps with no dependency between the first two steps
which can be run in parallel):
leaq (%rbx,%rbx,4), %rax # i*5
shrq $5, %r13 # perturb >>= PERTURB_SHIFT
leaq 1(%r13,%rax), %rbx # 1 + perturb + i*5
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
The fix was already committed to 3.2, but I merged two small changes
recommended by Raymond while I was working on the 2.7 patch to ease
future merges.
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r78541 | ezio.melotti | 2010-03-01 06:08:34 +0200 (Mon, 01 Mar 2010) | 17 lines
Merged revisions 78515-78516,78522 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78515 | georg.brandl | 2010-02-28 20:19:17 +0200 (Sun, 28 Feb 2010) | 1 line
#8030: make builtin type docstrings more consistent: use "iterable" instead of "seq(uence)", use "new" to show that set() always returns a new object.
........
r78516 | georg.brandl | 2010-02-28 20:26:37 +0200 (Sun, 28 Feb 2010) | 1 line
The set types can also be called without arguments.
........
r78522 | ezio.melotti | 2010-03-01 01:59:00 +0200 (Mon, 01 Mar 2010) | 1 line
#8030: more docstring fix for builtin types.
........
................
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78515 | georg.brandl | 2010-02-28 20:19:17 +0200 (Sun, 28 Feb 2010) | 1 line
#8030: make builtin type docstrings more consistent: use "iterable" instead of "seq(uence)", use "new" to show that set() always returns a new object.
........
r78516 | georg.brandl | 2010-02-28 20:26:37 +0200 (Sun, 28 Feb 2010) | 1 line
The set types can also be called without arguments.
........
r78522 | ezio.melotti | 2010-03-01 01:59:00 +0200 (Mon, 01 Mar 2010) | 1 line
#8030: more docstring fix for builtin types.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r68128 | antoine.pitrou | 2009-01-01 15:11:22 +0100 (jeu., 01 janv. 2009) | 3 lines
Issue #3680: Reference cycles created through a dict, set or deque iterator did not get collected.
........
PyUnicode_AsStringAndSize -> _PyUnicode_AsStringAndSize to mark
them for interpreter internal use only.
We'll have to rework these APIs or create new ones for the
purpose of accessing the UTF-8 representation of Unicode objects
for 3.1.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r64089 | armin.ronacher | 2008-06-10 22:37:02 +0200 (mar., 10 juin 2008) | 3 lines
Fix a formatting error in the ast documentation.
........
r64098 | raymond.hettinger | 2008-06-11 02:25:29 +0200 (mer., 11 juin 2008) | 6 lines
Mini-PEP: Simplifying numbers.py
* Convert binary methods in Integral to mixin methods
* Remove three-arg __pow__ as a required method
* Make __int__ the root method instead of __long__.
........
r64100 | raymond.hettinger | 2008-06-11 02:28:51 +0200 (mer., 11 juin 2008) | 1 line
Update numbers doc for the Integral simplification.
........
r64101 | raymond.hettinger | 2008-06-11 02:44:47 +0200 (mer., 11 juin 2008) | 3 lines
Handle the case with zero arguments.
........
r64102 | benjamin.peterson | 2008-06-11 03:31:28 +0200 (mer., 11 juin 2008) | 4 lines
convert test_struct to a unittest thanks to Giampaolo Rodola
I had to disable one test because it was functioning incorrectly, see #1530559
I also removed the debugging prints
........
r64113 | thomas.heller | 2008-06-11 09:10:43 +0200 (mer., 11 juin 2008) | 2 lines
Fix markup.
Document the new 'offset' parameter for the 'ctypes.byref' function.
........
r64115 | raymond.hettinger | 2008-06-11 12:30:54 +0200 (mer., 11 juin 2008) | 1 line
Multi-arg form for set.difference() and set.difference_update().
........
r64116 | raymond.hettinger | 2008-06-11 14:06:49 +0200 (mer., 11 juin 2008) | 1 line
Issue 3051: Let heapq work with either __lt__ or __le__.
........
r64118 | raymond.hettinger | 2008-06-11 14:39:09 +0200 (mer., 11 juin 2008) | 1 line
Optimize previous checkin for heapq.
........
r64120 | raymond.hettinger | 2008-06-11 15:14:50 +0200 (mer., 11 juin 2008) | 1 line
Add test for heapq using both __lt__ and __le__.
........
r64132 | gregory.p.smith | 2008-06-11 20:00:52 +0200 (mer., 11 juin 2008) | 3 lines
Correct an incorrect comment about our #include of stddef.h.
(see Doug Evans' comment on python-dev 2008-06-10)
........
r64342 | guido.van.rossum | 2008-06-17 19:38:02 +0200 (mar., 17 juin 2008) | 3 lines
Roll back Raymond's -r64098 while we think of something better.
(See issue 3056 -- we're close to a resolution but need unittests.)
........