cpython

Commit Graph

Author	SHA1	Message	Date
Thomas Wouters	4d70c3d9dd	Partially merge trunk into p3yk. The removal of Mac/Tools is confusing svn merge in bad ways, so I'll have to merge that extra-carefully (probably manually.) Merged revisions 46495-46605 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r46495 | tim.peters | 2006-05-28 03:52:38 +0200 (Sun, 28 May 2006) | 2 lines Added missing svn:eol-style property to text files. ........ r46497 | tim.peters | 2006-05-28 12:41:29 +0200 (Sun, 28 May 2006) | 3 lines PyErr_Display(), PyErr_WriteUnraisable(): Coverity found a cut-and-paste bug in both: `className` was referenced before being checked for NULL. ........ r46499 | fredrik.lundh | 2006-05-28 14:06:46 +0200 (Sun, 28 May 2006) | 5 lines needforspeed: added Py_MEMCPY macro (currently tuned for Visual C only), and use it for string copy operations. this gives a 20% speedup on some string benchmarks. ........ r46501 | michael.hudson | 2006-05-28 17:51:40 +0200 (Sun, 28 May 2006) | 26 lines Quality control, meet exceptions.c. Fix a number of problems with the need for speed code: One is doing this sort of thing: Py_DECREF(self->field); self->field = newval; Py_INCREF(self->field); without being very sure that self->field doesn't start with a value that has a __del__, because that almost certainly can lead to segfaults. As self->args is constrained to be an exact tuple we may as well exploit this fact consistently. This leads to quite a lot of simplification (and, hey, probably better performance). Add some error checking in places lacking it. Fix some rather strange indentation in the Unicode code. Delete some trailing whitespace. More to come, I haven't fixed all the reference leaks yet... ........ r46502 | george.yoshida | 2006-05-28 18:39:09 +0200 (Sun, 28 May 2006) | 3 lines Patch #1080727: add "encoding" parameter to doctest.DocFileSuite Contributed by Bjorn Tillenius. ........
r46503 | martin.v.loewis | 2006-05-28 18:57:38 +0200 (Sun, 28 May 2006) | 4 lines Rest of patch #1490384: Commit icon source, remove claim that Erik von Blokland is the author of the installer picture. ........ r46504 | michael.hudson | 2006-05-28 19:40:29 +0200 (Sun, 28 May 2006) | 16 lines Quality control, meet exceptions.c, round two. Make some functions that should have been static static. Fix a bunch of refleaks by fixing the definition of MiddlingExtendsException. Remove all the __new__ implementations apart from BaseException_new. Rewrite most code that needs it to cope with NULL fields (such code could get exercised anyway, the __new__-removal just makes it more likely). This involved editing the code for WindowsError, which I can't test. This fixes all the refleaks in at least the start of a regrtest -R :: run. ........ r46505 | marc-andre.lemburg | 2006-05-28 19:46:58 +0200 (Sun, 28 May 2006) | 10 lines Initial version of systimes - a module to provide platform dependent performance measurements. The module is currently just a proof-of-concept implementation, but will be integrated into pybench once it is stable enough. License: pybench license. Author: Marc-Andre Lemburg. ........ r46507 | armin.rigo | 2006-05-28 21:13:17 +0200 (Sun, 28 May 2006) | 15 lines ("Forward-port" of r46506) Remove various dependencies on dictionary order in the standard library tests, and one (clearly an oversight, potentially critical) in the standard library itself - base64.py. Remaining open issues: * test_extcall is an output test, messy to make robust * tarfile.py has a potential bug here, but I'm not familiar enough with this code. Filed in as SF bug #1496501. * urllib2.HTTPPasswordMgr() returns a random result if there is more than one matching root path. I'm asking python-dev for clarification... ........ r46508 | georg.brandl | 2006-05-28 22:11:45 +0200 (Sun, 28 May 2006) | 4 lines The empty string is a valid import path. (fixes #1496539) ........
r46509 | georg.brandl | 2006-05-28 22:23:12 +0200 (Sun, 28 May 2006) | 3 lines Patch #1496206: urllib2 PasswordMgr ./. default ports ........ r46510 | georg.brandl | 2006-05-28 22:57:09 +0200 (Sun, 28 May 2006) | 3 lines Fix refleaks in UnicodeError get and set methods. ........ r46511 | michael.hudson | 2006-05-28 23:19:03 +0200 (Sun, 28 May 2006) | 3 lines use the UnicodeError traversal and clearing functions in UnicodeError subclasses. ........ r46512 | thomas.wouters | 2006-05-28 23:32:12 +0200 (Sun, 28 May 2006) | 4 lines Make last patch valid C89 so Windows compilers can deal with it. ........ r46513 | georg.brandl | 2006-05-28 23:42:54 +0200 (Sun, 28 May 2006) | 3 lines Fix ref-antileak in _struct.c which eventually led to deallocating None. ........ r46514 | georg.brandl | 2006-05-28 23:57:35 +0200 (Sun, 28 May 2006) | 4 lines Correct None refcount issue in Mac modules. (Are they still used?) ........ r46515 | armin.rigo | 2006-05-29 00:07:08 +0200 (Mon, 29 May 2006) | 3 lines A clearer error message when passing -R to regrtest.py with release builds of Python. ........ r46516 | georg.brandl | 2006-05-29 00:14:04 +0200 (Mon, 29 May 2006) | 3 lines Fix C function calling conventions in _sre module. ........ r46517 | georg.brandl | 2006-05-29 00:34:51 +0200 (Mon, 29 May 2006) | 3 lines Convert audioop over to METH_VARARGS. ........ r46518 | georg.brandl | 2006-05-29 00:38:57 +0200 (Mon, 29 May 2006) | 3 lines METH_NOARGS functions do get called with two args. ........ r46519 | georg.brandl | 2006-05-29 11:46:51 +0200 (Mon, 29 May 2006) | 4 lines Fix refleak in socketmodule. Replace bogus Py_BuildValue calls. Fix refleak in exceptions. ........
r46520 | nick.coghlan | 2006-05-29 14:43:05 +0200 (Mon, 29 May 2006) | 7 lines Apply modified version of Collin Winter's patch #1478788 Renames functional extension module to _functools and adds a Python functools module so that utility functions like update_wrapper can be added easily. ........ r46522 | georg.brandl | 2006-05-29 15:53:16 +0200 (Mon, 29 May 2006) | 3 lines Convert fmmodule to METH_VARARGS. ........ r46523 | georg.brandl | 2006-05-29 16:13:21 +0200 (Mon, 29 May 2006) | 3 lines Fix #1494605. ........ r46524 | georg.brandl | 2006-05-29 16:28:05 +0200 (Mon, 29 May 2006) | 3 lines Handle PyMem_Malloc failure in pystrtod.c. Closes #1494671. ........ r46525 | georg.brandl | 2006-05-29 16:33:55 +0200 (Mon, 29 May 2006) | 3 lines Fix compiler warning. ........ r46526 | georg.brandl | 2006-05-29 16:39:00 +0200 (Mon, 29 May 2006) | 3 lines Fix #1494787 (pyclbr counts whitespace as superclass name) ........ r46527 | bob.ippolito | 2006-05-29 17:47:29 +0200 (Mon, 29 May 2006) | 1 line simplify the struct code a bit (no functional changes) ........ r46528 | armin.rigo | 2006-05-29 19:59:47 +0200 (Mon, 29 May 2006) | 2 lines Silence a warning. ........ r46529 | georg.brandl | 2006-05-29 21:39:45 +0200 (Mon, 29 May 2006) | 3 lines Correct some value converting strangenesses. ........ r46530 | nick.coghlan | 2006-05-29 22:27:44 +0200 (Mon, 29 May 2006) | 1 line When adding a module like functools, it helps to let SVN know about the file. ........ r46531 | georg.brandl | 2006-05-29 22:52:54 +0200 (Mon, 29 May 2006) | 4 lines Patches #1497027 and #972322: try HTTP digest auth first, and watch out for handler name collisions. ........ r46532 | georg.brandl | 2006-05-29 22:57:01 +0200 (Mon, 29 May 2006) | 3 lines Add News entry for last commit. ........ r46533 | georg.brandl | 2006-05-29 23:04:52 +0200 (Mon, 29 May 2006) | 4 lines Make use of METH_O and METH_NOARGS where possible.
Use PyArg_UnpackTuple instead of PyArg_ParseTuple where possible. ........ r46534 | georg.brandl | 2006-05-29 23:58:42 +0200 (Mon, 29 May 2006) | 3 lines Convert more modules to METH_VARARGS. ........ r46535 | georg.brandl | 2006-05-30 00:00:30 +0200 (Tue, 30 May 2006) | 3 lines Whoops. ........ r46536 | fredrik.lundh | 2006-05-30 00:42:07 +0200 (Tue, 30 May 2006) | 4 lines fixed "abc".count("", 100) == -96 error (hopefully, nobody's relying on the current behaviour ;-) ........ r46537 | bob.ippolito | 2006-05-30 00:55:48 +0200 (Tue, 30 May 2006) | 1 line struct: modulo math plus warning on all endian-explicit formats for compatibility with older struct usage (ugly) ........ r46539 | bob.ippolito | 2006-05-30 02:26:01 +0200 (Tue, 30 May 2006) | 1 line Add a length check to aifc to ensure it doesn't write a bogus file ........ r46540 | tim.peters | 2006-05-30 04:25:25 +0200 (Tue, 30 May 2006) | 10 lines deprecated_err(): Stop bizarre warning messages when the tests are run in the order: test_genexps (or any other doctest-based test) test_struct test_doctest The `warnings` module needs an advertised way to save/restore its internal filter list. ........ r46541 | tim.peters | 2006-05-30 04:26:46 +0200 (Tue, 30 May 2006) | 2 lines Whitespace normalization. ........ r46542 | tim.peters | 2006-05-30 04:30:30 +0200 (Tue, 30 May 2006) | 2 lines Set a binary svn:mime-type property on this UTF-8 encoded file. ........ r46543 | neal.norwitz | 2006-05-30 05:18:50 +0200 (Tue, 30 May 2006) | 1 line Simplify further by using AddStringConstant ........ r46544 | tim.peters | 2006-05-30 06:16:25 +0200 (Tue, 30 May 2006) | 6 lines Convert relevant dict internals to Py_ssize_t. I don't have a box with nearly enough RAM, or an OS, that could get close to tickling this, though (requires a dict w/ at least 2**31 entries). ........ r46545 | neal.norwitz | 2006-05-30 06:19:21 +0200 (Tue, 30 May 2006) | 1 line Remove stray | in comment ........
r46546 | neal.norwitz | 2006-05-30 06:25:05 +0200 (Tue, 30 May 2006) | 1 line Use Py_SAFE_DOWNCAST for safety. Fix format strings. Remove 2 more stray | in comment ........ r46547 | neal.norwitz | 2006-05-30 06:43:23 +0200 (Tue, 30 May 2006) | 1 line No DOWNCAST is required since sizeof(Py_ssize_t) >= sizeof(int) and Py_ReprEnter returns an int ........ r46548 | tim.peters | 2006-05-30 07:04:59 +0200 (Tue, 30 May 2006) | 3 lines dict_print(): Explicitly narrow the return value from a (possibly) wider variable. ........ r46549 | tim.peters | 2006-05-30 07:23:59 +0200 (Tue, 30 May 2006) | 5 lines dict_print(): So that Neal & I don't spend the rest of our lives taking turns rewriting code that works ;-), get rid of casting illusions by declaring a new variable with the obvious type. ........ r46550 | georg.brandl | 2006-05-30 09:04:55 +0200 (Tue, 30 May 2006) | 3 lines Restore exception pickle support. #1497319. ........ r46551 | georg.brandl | 2006-05-30 09:13:29 +0200 (Tue, 30 May 2006) | 3 lines Add a test case for exception pickling. args is never NULL. ........ r46552 | neal.norwitz | 2006-05-30 09:21:10 +0200 (Tue, 30 May 2006) | 1 line Don't fail if the (sub)pkgname already exists. ........ r46553 | georg.brandl | 2006-05-30 09:34:45 +0200 (Tue, 30 May 2006) | 3 lines Disallow keyword args for exceptions. ........ r46554 | neal.norwitz | 2006-05-30 09:36:54 +0200 (Tue, 30 May 2006) | 5 lines I'm impatient. I think this will fix a few more problems with the buildbots. I'm not sure this is the best approach, but I can't think of anything better. If this creates problems, feel free to revert, but I think it's safe and should make things a little better. ........ r46555 | georg.brandl | 2006-05-30 10:17:00 +0200 (Tue, 30 May 2006) | 4 lines Do the check for no keyword arguments in __init__ so that subclasses of Exception can be supplied keyword args ........
r46556 | georg.brandl | 2006-05-30 10:47:19 +0200 (Tue, 30 May 2006) | 3 lines Convert test_exceptions to unittest. ........ r46557 | andrew.kuchling | 2006-05-30 14:52:01 +0200 (Tue, 30 May 2006) | 1 line Add SoC name, and reorganize this section a bit ........ r46559 | tim.peters | 2006-05-30 17:53:34 +0200 (Tue, 30 May 2006) | 11 lines PyLong_FromString(): Continued fraction analysis (explained in a new comment) suggests there are almost certainly large input integers in all non-binary input bases for which one Python digit too few is initially allocated to hold the final result. Instead of assert-failing when that happens, allocate more space. Alas, I estimate it would take a few days to find a specific such case, so this isn't backed up by a new test (not to mention that such a case may take hours to run, since conversion time is quadratic in the number of digits, and preliminary attempts suggested that the smallest such inputs contain at least a million digits). ........ r46560 | fredrik.lundh | 2006-05-30 19:11:48 +0200 (Tue, 30 May 2006) | 3 lines changed find/rfind to return -1 for matches outside the source string ........ r46561 | bob.ippolito | 2006-05-30 19:37:54 +0200 (Tue, 30 May 2006) | 1 line Change wrapping terminology to overflow masking ........ r46562 | fredrik.lundh | 2006-05-30 19:39:58 +0200 (Tue, 30 May 2006) | 3 lines changed count to return 0 for slices outside the source string ........ r46568 | tim.peters | 2006-05-31 01:28:02 +0200 (Wed, 31 May 2006) | 2 lines Whitespace normalization. ........ r46569 | brett.cannon | 2006-05-31 04:19:54 +0200 (Wed, 31 May 2006) | 5 lines Clarify wording on default values for strptime(); defaults are used when better values cannot be inferred. Closes bug #1496315. ........ r46572 | neal.norwitz | 2006-05-31 09:43:27 +0200 (Wed, 31 May 2006) | 1 line Calculate smallest properly (it was off by one) and use proper ssize_t types for Win64 ........
r46573 | neal.norwitz | 2006-05-31 10:01:08 +0200 (Wed, 31 May 2006) | 1 line Revert last checkin, it is better to do make distclean ........ r46574 | neal.norwitz | 2006-05-31 11:02:44 +0200 (Wed, 31 May 2006) | 3 lines On 64-bit platforms running test_struct after test_tarfile would fail since the deprecation warning wouldn't be raised. ........ r46575 | thomas.heller | 2006-05-31 13:37:58 +0200 (Wed, 31 May 2006) | 3 lines PyTuple_Pack is not available in Python 2.3, but ctypes must stay compatible with that. ........ r46576 | andrew.kuchling | 2006-05-31 15:18:56 +0200 (Wed, 31 May 2006) | 1 line 'functional' module was renamed to 'functools' ........ r46577 | kristjan.jonsson | 2006-05-31 15:35:41 +0200 (Wed, 31 May 2006) | 1 line Fixup the PCBuild8 project directory. exceptions.c has moved to Objects, and the functionalmodule.c has been replaced with _functoolsmodule.c. Other minor changes to .vcproj files and .sln to fix compilation ........ r46578 | andrew.kuchling | 2006-05-31 16:08:48 +0200 (Wed, 31 May 2006) | 15 lines [Bug #1473048] SimpleXMLRPCServer and DocXMLRPCServer don't look at the path of the HTTP request at all; you can POST or GET from / or /RPC2 or /blahblahblah with the same results. Security scanners that look for /cgi-bin/phf will therefore report lots of vulnerabilities. Fix: add a .rpc_paths attribute to the SimpleXMLRPCServer class, and report a 404 error if the path isn't on the allowed list. Possibly-controversial aspect of this change: the default makes only '/' and '/RPC2' legal. Maybe this will break people's applications (though I doubt it). We could just set the default to an empty tuple, which would exactly match the current behaviour. ........ r46579 | andrew.kuchling | 2006-05-31 16:12:47 +0200 (Wed, 31 May 2006) | 1 line Mention SimpleXMLRPCServer change ........ r46580 | tim.peters | 2006-05-31 16:28:07 +0200 (Wed, 31 May 2006) | 2 lines Trimmed trailing whitespace. ........
r46581 | tim.peters | 2006-05-31 17:33:22 +0200 (Wed, 31 May 2006) | 4 lines _range_error(): Speed and simplify (there's no real need for loops here). Assert that size_t is actually big enough, and that f->size is at least one. Wrap a long line. ........ r46582 | tim.peters | 2006-05-31 17:34:37 +0200 (Wed, 31 May 2006) | 2 lines Repaired error in new comment. ........ r46584 | neal.norwitz | 2006-06-01 07:32:49 +0200 (Thu, 01 Jun 2006) | 4 lines Remove ; at end of macro. There was a compiler recently that warned about extra semi-colons. It may have been the HP C compiler. This file will trigger a bunch of those warnings now. ........ r46585 | georg.brandl | 2006-06-01 08:39:19 +0200 (Thu, 01 Jun 2006) | 3 lines Correctly unpickle 2.4 exceptions via __setstate__ (patch #1498571) ........ r46586 | georg.brandl | 2006-06-01 10:27:32 +0200 (Thu, 01 Jun 2006) | 3 lines Correctly allocate complex types with tp_alloc. (bug #1498638) ........ r46587 | georg.brandl | 2006-06-01 14:30:46 +0200 (Thu, 01 Jun 2006) | 2 lines Correctly dispatch Faults in loads (patch #1498627) ........ r46588 | georg.brandl | 2006-06-01 15:00:49 +0200 (Thu, 01 Jun 2006) | 3 lines Some code style tweaks, and remove apply. ........ r46589 | armin.rigo | 2006-06-01 15:19:12 +0200 (Thu, 01 Jun 2006) | 5 lines [ 1497053 ] Let dicts propagate the exceptions in user __eq__(). [ 1456209 ] dictresize() vulnerability ( <- backport candidate ). ........ r46590 | tim.peters | 2006-06-01 15:41:46 +0200 (Thu, 01 Jun 2006) | 2 lines Whitespace normalization. ........ r46591 | tim.peters | 2006-06-01 15:49:23 +0200 (Thu, 01 Jun 2006) | 2 lines Record bugs 1275608 and 1456209 as being fixed. ........ r46592 | tim.peters | 2006-06-01 15:56:26 +0200 (Thu, 01 Jun 2006) | 5 lines Re-enable a new empty-string test added during the NFS sprint, but disabled then because str and unicode strings gave different results.
The implementations were repaired later during the sprint, but the new test remained disabled. ........ r46594 | tim.peters | 2006-06-01 17:50:44 +0200 (Thu, 01 Jun 2006) | 7 lines Armin committed his patch while I was reviewing it (I'm sure he didn't know this), so merged in some changes I made during review. Nothing material apart from changing a new `mask` local from int to Py_ssize_t. Mostly this is repairing comments that were made incorrect, and adding new comments. Also a few minor code rewrites for clarity or helpful succinctness. ........ r46599 | neal.norwitz | 2006-06-02 06:45:53 +0200 (Fri, 02 Jun 2006) | 1 line Convert docstrings to comments so regrtest -v prints method names ........ r46600 | neal.norwitz | 2006-06-02 06:50:49 +0200 (Fri, 02 Jun 2006) | 2 lines Fix memory leak found by valgrind. ........ r46601 | neal.norwitz | 2006-06-02 06:54:52 +0200 (Fri, 02 Jun 2006) | 1 line More memory leaks from valgrind ........ r46602 | neal.norwitz | 2006-06-02 08:23:00 +0200 (Fri, 02 Jun 2006) | 11 lines Patch #1357836: Prevent an invalid memory read from test_coding in case the done flag is set. In that case, the loop isn't entered. I wonder if rather than setting the done flag in the cases before the loop, if they should just exit early. This code looks like it should be refactored. Backport candidate (also the early break above if decoding_fgets fails) ........ r46603 | martin.blais | 2006-06-02 15:03:43 +0200 (Fri, 02 Jun 2006) | 1 line Fixed struct test to not use unittest. ........ r46605 | tim.peters | 2006-06-03 01:22:51 +0200 (Sat, 03 Jun 2006) | 10 lines pprint functions used to sort a dict (by key) if and only if the output required more than one line. "Small" dicts got displayed in seemingly random order (the hash-induced order produced by dict.__repr__). None of this was documented. Now pprint functions always sort dicts by key, and the docs promise it.
This was proposed and agreed to during the PyCon 2006 core sprint -- I just didn't have time for it before now. ........	2006-06-08 14:42:34 +00:00
Thomas Wouters	49fd7fa443	Merge p3yk branch with the trunk up to revision 45595. This breaks a fair number of tests, all because of the codecs/_multibytecodecs issue described here (it's not a Py3K issue, just something Py3K discovers): http://mail.python.org/pipermail/python-dev/2006-April/064051.html Hye-Shik Chang promised to look for a fix, so no need to fix it here. The tests that are expected to break are: test_codecencodings_cn test_codecencodings_hk test_codecencodings_jp test_codecencodings_kr test_codecencodings_tw test_codecs test_multibytecodec This merge fixes an actual test failure (test_weakref) in this branch, though, so I believe merging is the right thing to do anyway.	2006-04-21 10:40:58 +00:00
Guido van Rossum	4b92a82504	Oops. Fix syntax for C89 compilers.	2006-02-25 23:32:30 +00:00
Guido van Rossum	1968ad32cd	- Patch 1433928: - The copy module now "copies" function objects (as atomic objects). - dict.__getitem__ now looks for a __missing__ hook before raising KeyError. - Added a new type, defaultdict, to the collections module. This uses the new __missing__ hook behavior added to dict (see above).	2006-02-25 22:38:04 +00:00
Martin v. Löwis	e0e89f7920	Revert 42400.	2006-02-16 06:59:22 +00:00
Martin v. Löwis	2c95cc6d72	Support %zd in PyErr_Format and PyString_FromFormat.	2006-02-16 06:54:25 +00:00
Neal Norwitz	26efe402c2	Get rid of compiler warnings (gcc 3.3.4 on x86)	2006-02-16 06:21:57 +00:00
Martin v. Löwis	18e165558b	Merge ssize_t branch.	2006-02-15 17:27:45 +00:00
Armin Rigo	f5b3e36493	Renamed _length_cue() to __length_hint__(). See: http://mail.python.org/pipermail/python-dev/2006-February/060524.html	2006-02-11 21:32:43 +00:00
Tim Peters	60b29961dc	Fixed English in a comment; trimmed trailing whitespace; no code changes.	2006-01-01 01:19:23 +00:00
Raymond Hettinger	6b27cda643	Convert iterator __len__() methods to a private API.	2005-09-24 21:23:05 +00:00
Raymond Hettinger	f81e45023e	Fix nits.	2005-08-17 02:19:36 +00:00
Raymond Hettinger	186e739d29	SF patch #1200051 : Small optimization for PyDict_Merge() (Contributed by Barry Warsaw and Matt Messier.)	2005-05-14 18:08:25 +00:00
Raymond Hettinger	1356f785c1	SF bug #1183742 : PyDict_Copy() can return non-NULL value on error	2005-04-15 15:58:42 +00:00
Raymond Hettinger	07ead17318	Code simplification -- eliminate lookup when value is known in advance.	2005-02-05 23:42:57 +00:00
Nicholas Bastin	9ba301e589	Moved SunPro warning suppression into pyport.h and out of individual modules and objects.	2004-07-15 15:54:05 +00:00
Nicholas Bastin	9e1bfe7dd9	Disabling end-of-loop code not reached warning on SunPro	2004-06-18 19:57:13 +00:00
Walter Dörwald	d70ad8a9d9	Update docstring for dict.update() to match the new realities.	2004-05-28 20:59:21 +00:00
Raymond Hettinger	7892b1c651	* Add unittests for iterators that report their length * Document the differences between them * Fix corner cases covered by the unittests * Use Py_RETURN_NONE where possible for dictionaries	2004-04-12 18:10:01 +00:00
Guido van Rossum	09240f65f8	GCC was complaining that 'value' in dictiter_iternextvalue() wasn't necessarily always set before used. Between Tim, Armin & me we couldn't prove GCC wrong, so we decided to fix the algorithm. This version is Armin's.	2004-03-20 19:11:58 +00:00
Raymond Hettinger	0690512a7d	Factor out a double lookup.	2004-03-19 10:30:00 +00:00
Raymond Hettinger	0ce6dc8530	Make the new dictionary iterators transparent with respect to length. This gives another 30% speedup for operations such as map(func, d.iteritems()) or list(d.iteritems()) which can both take advantage of length information when provided.	2004-03-18 08:38:00 +00:00
Raymond Hettinger	019a148c72	Optimize dictionary iterators. * Split into three separate types that share everything except the code for iternext. Saves run time decision making and allows each iternext function to be specialized. * Inlined PyDict_Next(). In addition to saving a function call, this allows a redundant test to be eliminated and further specialization of the code for the unique needs of each iterator type. * Created a reusable result tuple for iteritems(). Saves the malloc time for tuples when the previous result was not kept by client code (this is the typical use case for iteritems). If the client code does keep the reference, then a new tuple is created. Results in a 20% to 30% speedup depending on the size and sparsity of the dictionary.	2004-03-18 02:41:19 +00:00
Raymond Hettinger	4344278250	Dictionary optimizations: * Factored constant structure references out of the inner loops for PyDict_Next(), dict_keys(), dict_values(), and dict_items(). Gave measurable speedups to each (the improvement varies depending on the sparseness of the dictionary being measured). * Added a freelist scheme styled after that for tuples. Saves around 80% of the calls to malloc and free. About 10% of the time, the previous dictionary was completely empty; in those cases, the dictionary initialization with memset() can be skipped.	2004-03-17 21:55:03 +00:00
Raymond Hettinger	ebedb2f773	Factor out code common to PyDict_Copy() and PyDict_Merge().	2004-03-08 04:19:01 +00:00
Raymond Hettinger	31017aed36	SF #904720 : dict.update should take a 2-tuple sequence like dict.__init__ (Championed by Bob Ippolito.) The update() method for mappings now accepts all the same argument forms as the dict() constructor. This includes item lists and/or keyword arguments.	2004-03-04 08:25:44 +00:00
Jeremy Hylton	7083bb744a	Oops. Return -1 to distinguish error from empty dict. This change probably isn't worth a bug fix. It's unlikely that anyone was calling this method without passing it a real dict.	2004-02-17 20:10:11 +00:00
Raymond Hettinger	0c66967e3d	Simplify previous checkin -- a new function was not needed.	2003-12-13 13:31:55 +00:00
Raymond Hettinger	8f5cdaa784	* Added a new method flag, METH_COEXIST. * Used the flag to optimize set.__contains__(), dict.__contains__(), dict.__getitem__(), and list.__getitem__().	2003-12-13 11:26:12 +00:00
Raymond Hettinger	bc0f2ab9bb	Expose dict_contains() and PyDict_Contains(), which is about 10% faster than PySequence_Contains() and more clearly applicable to dicts. Apply the new function in setobject.c where __contains__ checking is ubiquitous.	2003-11-25 21:12:14 +00:00
Raymond Hettinger	574aa32578	SF patch #798467 : Update docstring of has_key for bool changes (Contributed by George Yoshida.)	2003-09-01 22:12:08 +00:00
Raymond Hettinger	c8d2290c8c	SF patch #729395 : Dictionary tuning Adjust resize argument for dict.update() and dict.copy(). Extends the previous change to dict.__setitem__().	2003-05-07 00:49:40 +00:00
Raymond Hettinger	3539f6b895	SF patch #729395 : Dictionary tuning * Increase dictionary growth rate resulting in more sparse dictionaries, fewer lookup collisions, increased memory use, and better cache performance. For dicts with over 50k entries, keep the current growth rate in case an application is suffering from tight memory constraints. * Set the most common case (no resize) to fall-through the test.	2003-05-05 22:22:10 +00:00
Raymond Hettinger	930427b892	Add a reference to dictnotes.txt. It does no good if you don't know it's there or where to find it.	2003-05-03 06:51:59 +00:00
Raymond Hettinger	1da1dbf458	Renamed PyObject_GenericGetIter to PyObject_SelfIter to more accurately describe what the function does. Suggested by Thomas Wouters.	2003-03-17 19:46:11 +00:00
Raymond Hettinger	0153826964	Created PyObject_GenericGetIter(). Factors out the common case of returning self.	2003-03-17 08:24:35 +00:00
Raymond Hettinger	a3e1e4cd79	SF patch #693753 : fix for bug 639806: default for dict.pop (contributed by Michael Stone.)	2003-03-06 23:54:28 +00:00
Neal Norwitz	0732301738	Add closing ) in comment	2003-02-15 14:45:12 +00:00
Tim Peters	080c88b912	cPickle.c, load_build(): Taught cPickle how to pick apart the optional proto 2 slot state. pickle.py, load_build(): CAUTION: Noted that cPickle's load_build and pickle's load_build really don't do the same things with the state, and didn't before this patch either. cPickle never tries to do .update(), and has no backoff if instance.__dict__ can't be retrieved. There are no tests that can tell the difference, and part of what cPickle's load_build() did looked accidental to me, so I don't know what the true intent is here. pickletester.py, test_pickle.py: Got rid of the hack for exempting cPickle from running some of the proto 2 tests. dictobject.c, PyDict_Next(): documented intended use.	2003-02-15 03:01:11 +00:00
Raymond Hettinger	ea3fdf44a2	SF patch #659536 : Use PyArg_UnpackTuple where possible. Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.	2002-12-29 16:33:45 +00:00
Martin v. Löwis	32b4a1ba62	Constify char* API. Fixes #651363 . 2.2 candidate.	2002-12-11 13:21:12 +00:00
Tim Peters	bca1cbc6f8	SF 548651: Fix the METH_CLASS implementation. Most of these patches are from Thomas Heller, with long lines folded by Tim. The change to test_descr.py is from Guido. See the bug report. Not a bugfix candidate -- METH_CLASS is new in 2.3.	2002-12-09 22:56:13 +00:00
Raymond Hettinger	e03e5b1f91	Remove assumption that cls is a subclass of dict. Simplifies the code and gets Just van Rossum's example to work.	2002-12-07 08:10:51 +00:00
Raymond Hettinger	b02bb5ed0a	Replace BadInternalCall with TypeError. Add a test case. Fix whitespace. Just van Rossum showed a weird, but clever way for pure python code to trigger the BadInternalCall. The C code had assumed that calling a class constructor would return an instance of that class; however, classes that abuse __new__ can invalidate that assumption.	2002-12-04 07:32:25 +00:00
Neal Norwitz	ef786ae1a5	Add missing decref	2002-11-27 19:38:00 +00:00
Raymond Hettinger	e33d3df030	SF Patch 643443. Added dict.fromkeys(iterable, value=None), a class method for constructing new dictionaries from sequences of keys.	2002-11-27 07:29:33 +00:00
Just van Rossum	a797d8150d	Patch #642500 with slight modifications: allow keyword arguments in dict() constructor. Example: >>> dict(a=1, b=2) {'a': 1, 'b': 2} >>>	2002-11-23 09:45:04 +00:00
Guido van Rossum	efae8862fe	In doc strings, use 'k in D' rather than D.has_key(k).	2002-09-04 11:29:45 +00:00
Guido van Rossum	45ec02aed1	SF patch 576101, by Oren Tirosh: alternative implementation of interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.	2002-08-19 21:43:18 +00:00
Jeremy Hylton	938ace69a0	staticforward bites the dust. The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) that botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.	2002-07-17 16:30:39 +00:00
Guido van Rossum	2147df748f	Make StopIteration a sink state. This is done by clearing out the di_dict field when the end of the list is reached. Also make the error ("dictionary changed size during iteration") a sticky state. Also remove the next() method -- one is supplied automatically by PyType_Ready() because the tp_iternext slot is set. That's a good thing, because the implementation given here was buggy (it never raised StopIteration).	2002-07-16 20:30:22 +00:00
Martin v. Löwis	14f8b4cfcb	Patch #568124 : Add doc string macros.	2002-06-13 20:33:02 +00:00
Guido van Rossum	e027d9818f	Add Raymond Hettinger's d.pop(). See SF patch 539949.	2002-04-12 15:11:59 +00:00
Neil Schemenauer	6189b89cc5	PyObject_GC_Del and PyObject_Del can now be used as a function designators. Remove PyMalloc_New.	2002-04-12 02:43:00 +00:00
Guido van Rossum	77f6a65eb0	Add the 'bool' type and its values 'False' and 'True', as described in PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected.) Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.	2002-04-03 22:41:51 +00:00
Tim Peters	1f7df3595a	Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols.	2002-03-29 03:29:08 +00:00
Guido van Rossum	ff413af605	This is Neil's fix for SF bug 535905 (Evil Trashcan and GC interaction). The fix makes it possible to call PyObject_GC_UnTrack() more than once on the same object, and then move the PyObject_GC_UnTrack() call to before the trashcan code is invoked. BUGFIX CANDIDATE!	2002-03-28 20:34:59 +00:00
Neil Schemenauer	dcc819a5c9	Use pymalloc if it's enabled.	2002-03-22 15:33:15 +00:00
Tim Peters	f582b82fe9	SF bug #491415 PyDict_UpdateFromSeq2() unused PyDict_UpdateFromSeq2(): removed it. PyDict_MergeFromSeq2(): made it public and documented it. PyDict_Merge() docs: updated to reveal <wink> that the second argument can be any mapping object.	2001-12-11 18:51:08 +00:00
Guido van Rossum	dbb53d9918	Fix of SF bug #475877 (Mutable subtype instances are hashable). Rather than tweaking the inheritance of type object slots (which turns out to be too messy to try), this fix adds a __hash__ to the list and dict types (the only mutable types I'm aware of) that explicitly raises an error. This has the advantage that list.__hash__([]) also raises an error (previously, this would invoke object.__hash__([]), returning the argument's address); ditto for dict.__hash__. The disadvantage for this fix is that 3rd party mutable types aren't automatically fixed. This should be added to the rules for creating subclassable extension types: if you don't want your object to be hashable, add a tp_hash function that raises an exception. Also, it's possible that I've forgotten about other mutable types for which this should be done.	2001-12-03 16:32:18 +00:00
Tim Peters	a427a2b8d0	Rename "dictionary" (type and constructor) to "dict".	2001-10-29 22:25:45 +00:00
Tim Peters	4d85953fe6	dictionary() constructor: + Change keyword arg name from "x" to "items". People passing a mapping object can stretch their imaginations <wink>. + Simplify the docstring text.	2001-10-27 18:27:48 +00:00
Tim Peters	1fc240e851	Generalize dictionary() to accept a sequence of 2-sequences. At the outer level, the iterator protocol is used for memory-efficiency (the outer sequence may be very large if fully materialized); at the inner level, PySequence_Fast() is used for time-efficiency (these should always be sequences of length 2). dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2- sequences argument instead of a mapping object. For now, I left these functions file static, so no corresponding doc changes. It's tempting to change dict.update() to allow a sequence-of-2-seqs argument too. Also changed the name of dictionary's keyword argument from "mapping" to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't attractive, although more so than "mosop" <wink>. abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function, much faster than going thru the all-purpose PySequence_Size. libfuncs.tex: - Document dictionary(). - Fiddle tuple() and list() to admit that their argument is optional. - The long-winded repetitions of "a sequence, a container that supports iteration, or an iterator object" is getting to be a PITA. Many months ago I suggested factoring this out into "iterable object", where the definition of that could include being explicit about generators too (as is, I'm not sure a reader outside of PythonLabs could guess that "an iterator object" includes a generator call). - Please check my curly braces -- I'm going blind <0.9 wink>. abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave its error msg alone now (the msg it produces has improved since PySequence_Tuple was generalized to accept iterable objects, and PySequence_Tuple was also stomping on the msg in cases it shouldn't have even before PyObject_GetIter grew a better msg).	2001-10-26 05:06:50 +00:00
Guido van Rossum	9475a2310d	Enable GC for new-style instances. This touches lots of files, since many types were subclassable but had a xxx_dealloc function that called PyObject_DEL(self) directly instead of deferring to self->ob_type->tp_free(self). It is permissible to set tp_free in the type object directly to _PyObject_Del, for non-GC types, or to _PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster, so I'm fearing that our pystone rating is going down again. I'm not sure if doing something like void xxx_dealloc(PyObject *self) { if (PyXxxCheckExact(self)) PyObject_DEL(self); else self->ob_type->tp_free(self); } is any faster than always calling the else branch, so I haven't attempted that -- however those types whose own dealloc is fancier (int, float, unicode) do use this pattern.	2001-10-05 20:51:39 +00:00
Tim Peters	0ab085c4cb	Changed the dict implementation to take "string shortcuts" only when keys are true strings -- no subclasses need apply. This may be debatable. The problem is that a str subclass may very well want to override __eq__ and/or __hash__ (see the new example of case-insensitive strings in test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts (and subclass overrides aren't even looked for then). Another go-fast reason for the change is that PyCheck_StringExact() is a quicker test than PyCheck_String(), and we make such a test on virtually every access to every dict. OTOH, a str subclass may also be perfectly happy using the base str eq and hash, and this change slows them a lot. But those cases are still hypothetical, while Python's own reliance on true-string dicts is not.	2001-09-14 00:25:33 +00:00
Tim Peters	b95ec09a44	Repair typo in comment.	2001-09-02 18:35:54 +00:00
Tim Peters	25786c0851	Make dictionary() a real constructor. Accepts at most one argument, "a mapping object", in the same sense dict.update(x) requires of x (that x has a keys() method and a getitem). Questionable: The other type constructors accept a keyword argument, so I did that here too (e.g., dictionary(mapping={1:2}) works). But type_call doesn't pass the keyword args to the tp_new slot (it passes NULL), it only passes them to the tp_init slot, so getting at them required adding a tp_init slot to dicts. Looks like that makes the normal case (i.e., no args at all) a little slower (the time it takes to call dict.tp_init and have it figure out there's nothing to do).	2001-09-02 08:22:48 +00:00
Neil Schemenauer	e83c00efd0	Use new GC API.	2001-08-29 23:54:21 +00:00
Martin v. Löwis	e3eb1f2b23	Patch #427190 : Implement and use METH_NOARGS and METH_O.	2001-08-16 13:15:00 +00:00
Guido van Rossum	05ac6de2d5	Add PyDict_Merge(a, b, override): PyDict_Merge(a, b, 1) is the same as PyDict_Update(a, b). PyDict_Merge(a, b, 0) does something similar but leaves existing items unchanged.	2001-08-10 20:28:28 +00:00
Tim Peters	6d6c1a35e0	Merge of descr-branch back into trunk.	2001-08-02 04:15:00 +00:00
Barry Warsaw	66a0d1d9b9	dict_update(): Generalize this method so {}.update() accepts any "mapping" object, specifically one that supports PyMapping_Keys() and PyObject_GetItem(). This allows you to say e.g. {}.update(UserDict()) We keep the special case for concrete dict objects, although that seems moderately questionable. OTOH, the code exists and works, so why change that? .update()'s docstring already claims that D.update(E) implies calling E.keys() so it's appropriate not to transform AttributeErrors in PyMapping_Keys() to TypeErrors. Patch eyeballed by Tim.	2001-06-26 20:08:32 +00:00
Tim Peters	c605784174	dict_repr: Reuse one of the int vars (minor code simplification).	2001-06-16 07:52:53 +00:00
Tim Peters	a7259597f1	SF bug 433228: repr(list) woes when len(list) big. Gave Python linear-time repr() implementations for dicts, lists, strings. This means, e.g., that repr(range(50000)) is no longer 50x slower than pprint.pprint() in 2.2 <wink>. I don't consider this a bugfix candidate, as it's a performance boost. Added _PyString_Join() to the internal string API. If we want that in the public API, fine, but then it requires runtime error checks instead of asserts.	2001-06-16 05:11:17 +00:00
Tim Peters	afb6ae8452	Store the mask instead of the size in dictobjects. The mask is more frequently used, and in particular this allows to drop the last remaining obvious time-waster in the crucial lookdict() and lookdict_string() functions. Other changes consist mostly of changing "i < ma_size" to "i <= ma_mask" everywhere.	2001-06-04 21:00:21 +00:00
Tim Peters	453163d842	lookdict: stop more insane core-dump mutating comparison cases. Should be possible to provoke unbounded recursion now, but leaving that to someone else to provoke and repair. Bugfix candidate -- although this is getting harder to backstitch, and the cases it's protecting against are mondo contrived.	2001-06-03 04:54:32 +00:00
Tim Peters	7b5d0afb1e	lookdict: Reduce obfuscating code duplication with a judicious goto. This code is likely to get even hairier to squash core dumps due to mutating comparisons, and it's hard enough to follow without that.	2001-06-03 04:14:43 +00:00
Tim Peters	19b77cfc4b	Finish the dict->string coredump fix. Need sleep. Bugfix candidate.	2001-06-02 08:27:39 +00:00
Tim Peters	23cf6be23c	Coredumpers from Michael Hudson, mutating dicts while printing or converting to string. Critical bugfix candidate -- if you take this seriously <wink>.	2001-06-02 08:02:56 +00:00
Tim Peters	f4b33f61fb	dict_popitem(): Repaired last-second 2.1 comment, which misidentified the true reason for allocating the tuple before checking the dict size.	2001-06-02 05:42:29 +00:00
Tim Peters	eb28ef209e	New collision resolution scheme: no polynomials, simpler, faster, less code, less memory. Tests have uncovered no drawbacks. Christian and Vladimir are the other two people who have burned many brain cells on the dict code in recent years, and they like the approach too, so I'm checking it in without further ado.	2001-06-02 05:27:19 +00:00
Tim Peters	15d4929ae4	Implement an old idea of Christian Tismer's: use polynomial division instead of multiplication to generate the probe sequence. The idea is recorded in Python-Dev for Dec 2000, but that version is prone to rare infinite loops. The value is in getting all the bits of the hash code to participate; and, e.g., this speeds up querying every key in a dict with keys [i << 16 for i in range(20000)] by a factor of 500. Should be equally valuable in any bad case where the high-order hash bits were getting ignored. Also wrote up some of the motivations behind Python's ever-more-subtle hash table strategy.	2001-05-27 07:39:22 +00:00
Martin v. Löwis	cd35306a25	Patch #424335 : Implement string_richcompare, remove string_compare. Use new _PyString_Eq in lookdict_string.	2001-05-24 16:56:35 +00:00
Tim Peters	f8a548c23c	dictresize(): Rebuild small tables if there are any dummies, not just if they're entirely full. Not a question of correctness, but of temporarily misplaced common sense.	2001-05-24 16:26:40 +00:00
Tim Peters	0c6010be75	Jack Jansen hit a bug in the new dict code, reported on python-dev. dictresize() was too aggressive about never ever resizing small dicts. If a small dict is entirely full, it needs to rebuild it despite that it won't actually resize it, in order to purge old dummy entries thus creating at least one virgin slot (lookdict assumes at least one such exists). Also took the opportunity to add some high-level comments to dictresize.	2001-05-23 23:33:57 +00:00
Fred Drake	0c23231f6e	Remove unused variable.	2001-05-22 22:36:52 +00:00
Tim Peters	dea48ec581	SF patch #425242 : Patch which "inlines" small dictionaries. The idea is Marc-Andre Lemburg's, the implementation is Tim's. Add a new ma_smalltable member to dictobjects, an embedded vector of MINSIZE (8) dictentry structs. Short course is that this lets us avoid additional malloc(s) for dicts with no more than 5 entries. The changes are widespread but mostly small. Long course: WRT speed, all scalar operations (getitem, setitem, delitem) on non-empty dicts benefit from no longer needing NULL-pointer checks (ma_table is never NULL anymore). Bulk operations (copy, update, resize, clearing slots during dealloc) benefit in some cases from now looping on the ma_fill count rather than on ma_size, but that was an unexpected benefit: the original reason to loop on ma_fill was to let bulk operations on empty dicts end quickly (since the NULL-pointer checks went away, empty dicts aren't special-cased any more). Special considerations: For dicts that remain empty, this change is a lose on two counts: the dict object contains 8 new dictentry slots now that weren't needed before, and dict object creation also spends time memset'ing these doomed-to-be-unused slots to NULLs. For dicts with one or two entries that never get larger than 2, it's a mix: a malloc()/free() pair is no longer needed, and the 2-entry case gets to use 8 slots (instead of 4) thus decreasing the chance of collision. Against that, dict object creation spends time memset'ing 4 slots that aren't strictly needed in this case. For dicts with 3 through 5 entries that never get larger than 5, it's a pure win: the dict is created with all the space they need, and they never need to resize. Before they suffered two malloc()/free() calls, plus 1 dict resize, to get enough space. In addition, the 8-slot table they ended with consumed more memory overall, because of the hidden overhead due to the additional malloc. For dicts with 6 or more entries, the ma_smalltable member is wasted space, but then these are large(r) dicts so 8 slots more or less doesn't make much difference. They still benefit all the time from removing ubiquitous dynamic null-pointer checks, and get a small benefit (but relatively smaller the larger the dict) from not having to do two mallocs, two frees, and a resize on the way to getting their sixth entry. All in all it appears a small but definite general win, with larger benefits in specific cases. It's especially nice that it allowed to get rid of several branches, gotos and labels, and overall made the code smaller.	2001-05-22 20:40:22 +00:00
Tim Peters	91a364df17	Bugfix candidate. Two exceedingly unlikely errors in dictresize(): 1. The loop for finding the new size had an off-by-one error at the end (could over-index the polys[] vector). 2. The polys[] vector ended with a 0, apparently intended as a sentinel value but never used as such; i.e., it was never checked, so 0 could have been used as a polynomial. Neither bug could trigger unless a dict grew to 2**30 slots; since that would consume at least 12GB of memory just to hold the dict pointers, I'm betting it's not the cause of the bug Fred's tracking down <wink>.	2001-05-19 07:04:38 +00:00
Tim Peters	1928314ef4	Speed dictresize by collapsing its two passes into one; the reason given in the comments for using two passes was bogus, as the only object that can get decref'ed due to the copy is the dummy key, and decref'ing dummy can't have side effects (for one thing, dummy is immortal! for another, it's a string object, not a potentially dangerous user-defined object).	2001-05-17 22:25:34 +00:00
Tim Peters	342c65e19a	Aggressive reordering of dict comparisons. In case of collision, it stands to reason that me_key is much more likely to match the key we're looking for than to match dummy, and if the key is absent me_key is much more likely to be NULL than dummy: most dicts don't even have a dummy entry. Running instrumented dict code over the test suite and some apps confirmed that matching dummy was 200-300x less frequent than matching key in practice. So this reorders the tests to try the common case first. It can lose if a large dict with many collisions is mostly deleted, not resized, and then frequently searched, but that's hardly a case we should be favoring.	2001-05-13 06:43:53 +00:00
Tim Peters	2f228e75e4	Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask". The comment following used to say: /* We use ~hash instead of hash, as degenerate hash functions, such as for ints <sigh>, can have lots of leading zeros. It's not really a performance risk, but better safe than sorry. 12-Dec-00 tim: so ~hash produces lots of leading ones instead -- what's the gain? */ That is, there was never a good reason for doing it. And to the contrary, as explained on Python-Dev last December, it tended to make the *sum* (i + incr) & mask (which is the first table index examined in case of collision) the same "too often" across distinct hashes. Changing to the simpler "i = hash & mask" reduced the number of string-dict collisions (== # number of times we go around the lookup for-loop) from about 6 million to 5 million during a full run of the test suite (these are approximate because the test suite does some random stuff from run to run). The number of collisions in non-string dicts also decreased, but not as dramatically. Note that this may, for a given dict, change the order (wrt previous releases) of entries exposed by .keys(), .values() and .items(). A number of std tests suffered bogus failures as a result. For dicts keyed by small ints, or (less so) by characters, the order is much more likely to be in increasing order of key now; e.g., >>> d = {} >>> for i in range(10): ... d[i] = i ... >>> d {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9} >>> Unfortunately, people may latch on to that in small examples and draw a bogus conclusion. test_support.py Moved test_extcall's sortdict() into test_support, made it stronger, and imported sortdict into other std tests that needed it. test_unicode.py Excluded cp875 from the "roundtrip over range(128)" test, because cp875 doesn't have a well-defined inverse for unicode("?", "cp875"). See Python-Dev for excruciating details. Cookie.py Changed various output functions to sort dicts before building strings from them. test_extcall Fiddled the expected-result file. This remains sensitive to native dict ordering, because, e.g., if there are multiple errors in a keyword-arg dict (and test_extcall sets up many cases like that), the specific error Python complains about first depends on native dict ordering.	2001-05-13 00:19:31 +00:00
Tim Peters	4fa58bfac2	Restore dicts' tp_compare slot, and change dict_richcompare to say it doesn't know how to do LE, LT, GE, GT. dict_richcompare can't do the latter any faster than dict_compare can. More importantly, for cmp(dict1, dict2), Python first tries rich compares with EQ, LT, and GT one at a time, even if the tp_compare slot is defined, and dict_richcompare called dict_compare for the latter two because it couldn't do them itself. The result was a lot of wasted calls to dict_compare. Now dict_richcompare gives up at once the times Python calls it with LT and GT from try_rich_to_3way_compare(), and dict_compare is called only once (when Python gets around to trying the tp_compare slot). Continued mystery: despite that this cut the number of calls to dict_compare approximately in half in test_mutants.py, the latter still runs amazingly slowly. Running under the debugger doesn't show excessive activity in the dict comparison code anymore, so I'm guessing the culprit is somewhere else -- but where? Perhaps in the element (key/value) comparison code? We clearly spend a lot of time figuring out how to compare things.	2001-05-10 21:45:19 +00:00
Tim Peters	3918fb2549	Repair typo in comment.	2001-05-10 18:58:31 +00:00
Tim Peters	95bf9390a4	SF bug #422121 Insecurities in dict comparison. Fixed a half dozen ways in which general dict comparison could crash Python (even cause Win98SE to reboot) in the presence of key and/or value comparison routines that mutate the dict during dict comparison. Bugfix candidate.	2001-05-10 08:32:44 +00:00
Tim Peters	e63415ead8	SF patch #421922 : Implement rich comparison for dicts. d1 == d2 and d1 != d2 now work even if the keys and values in d1 and d2 don't support comparisons other than ==, and testing dicts for equality is faster now (especially when inequality obtains).	2001-05-08 04:38:29 +00:00
Guido van Rossum	b1f35bffe5	Michael Hudson pointed out that the code for detecting changes in dictionary size was comparing ma_size, the hash table size, which is always a power of two, rather than ma_used, which changes on each insertion or deletion. Fixed this.	2001-05-02 15:13:44 +00:00
Guido van Rossum	09e563abb4	Add experimental iterkeys(), itervalues(), iteritems() to dict objects. Tests show that iteritems() is 5-10% faster than iterating over the dict and extracting the value with dict[key].	2001-05-01 12:10:21 +00:00
Guido van Rossum	213c7a6aa5	Mondo changes to the iterator stuff, without changing how Python code sees it (test_iter.py is unchanged). - Added a tp_iternext slot, which calls the iterator's next() method; this is much faster for built-in iterators over built-in types such as lists and dicts, speeding up pybench's ForLoop with about 25% compared to Python 2.1. (Now there's a good argument for iterators. ;-) - Renamed the built-in sequence iterator SeqIter, affecting the C API functions for it. (This frees up the PyIter prefix for generic iterator operations.) - Added PyIter_Check(obj), which checks that obj's type has a tp_iternext slot and that the proper feature flag is set. - Added PyIter_Next(obj) which calls the tp_iternext slot. It has a somewhat complex return condition due to the need for speed: when it returns NULL, it may not have set an exception condition, meaning the iterator is exhausted; when the exception StopIteration is set (or a derived exception class), it means the same thing; any other exception means some other error occurred.	2001-04-23 14:08:49 +00:00
Guido van Rossum	59d1d2b434	Iterators phase 1. This comprises: - new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER - new C API PyObject_GetIter(), calls tp_iter - new builtin iter(), with two forms: iter(obj), and iter(function, sentinel) - new internal object types iterobject and calliterobject - new exception StopIteration - new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py) - new magic number for .pyc files - new special method for instances: __iter__() returns an iterator - iteration over dictionaries: "for x in dict" iterates over the keys - iteration over files: "for x in file" iterates over lines TODO: - documentation - test suite - decide whether to use a different way to spell iter(function, sentinel) - decide whether "for key in dict" is a good idea - use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?) - speed tuning (make next() a slot tp_next???)	2001-04-20 19:13:02 +00:00
Guido van Rossum	55ad67d74d	Oops. Removed dictiter_new decl that wasn't supposed to go in yet.	2001-04-20 16:52:06 +00:00
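
Commit 05ac6de2d5 in the list above describes `PyDict_Merge(a, b, override)`: with `override` set to 1 it behaves like `PyDict_Update(a, b)`, and with 0 it leaves existing items unchanged. The following is a minimal Python sketch of that behaviour, not the C implementation. The name `dict_merge` and its return value are assumptions made for illustration only (the C function reports success through an int status).

```python
def dict_merge(a, b, override):
    """Merge mapping b into dict a, loosely mirroring PyDict_Merge(a, b, override).

    Hypothetical helper for illustration; returns a for convenience.
    """
    for key in b.keys():
        # With override false, keys already present in a keep their old value.
        if override or key not in a:
            a[key] = b[key]
    return a


updated = dict_merge({"x": 1, "y": 2}, {"y": 20, "z": 30}, override=True)
kept = dict_merge({"x": 1, "y": 2}, {"y": 20, "z": 30}, override=False)
print(updated)  # {'x': 1, 'y': 20, 'z': 30}
print(kept)     # {'x': 1, 'y': 2, 'z': 30}
```

Only `keys()` and item lookup are used on `b`, which matches the "any mapping object" wording in commits 66a0d1d9b9 and f582b82fe9.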


227 Commits
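
Commit 2f228e75e4 replaces the initial probe index `i = (~hash) & mask` with `i = hash & mask` and notes that dicts keyed by small ints then tend to come out in increasing key order. The sketch below only computes that first index for a table of 8 slots. It ignores collisions, resizing, and the rest of the probe sequence, and it relies on CPython hashing small non-negative ints to themselves. The function names are hypothetical.

```python
def first_slot(key, table_size=8):
    """Initial probe index using hash & mask (table_size must be a power of two)."""
    mask = table_size - 1
    return hash(key) & mask


def first_slot_inverted(key, table_size=8):
    """Initial probe index using the older (~hash) & mask form."""
    mask = table_size - 1
    return (~hash(key)) & mask


slots = [first_slot(i) for i in range(8)]
inverted = [first_slot_inverted(i) for i in range(8)]
print(slots)     # [0, 1, 2, 3, 4, 5, 6, 7]
print(inverted)  # [7, 6, 5, 4, 3, 2, 1, 0]
```

With the plain mask, key `i` lands in slot `i`, so walking the table front to back yields the keys in increasing order, as in the `range(10)` example from the commit message. Current Python dicts preserve insertion order, so this slot layout no longer determines iteration order.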