cpython

Commit Graph

Author	SHA1	Message	Date
Victor Stinner	f2192855dd	Merge 3.5	2016-03-01 22:07:53 +01:00
Victor Stinner	337986740f	Issue #26464: Fix unicode_fast_translate() again Initialize i variable if the string is non-ASCII.	2016-03-01 21:59:58 +01:00
Victor Stinner	3d9d77a3dc	Merge 3.5	2016-03-01 21:30:50 +01:00
Victor Stinner	6c9aa8f2bf	Fix str.translate() Issue #26464: Fix str.translate() when string is ASCII and first replacements removes character, but next replacement uses a non-ASCII character or a string longer than 1 character. Regression introduced in Python 3.5.0.	2016-03-01 21:30:30 +01:00
Victor Stinner	5b96f17b1c	Merge 3.5	2016-01-27 17:01:13 +01:00
Victor Stinner	5bc03a6d4d	Fix resize_compact() Issue #26217: resize_compact() must set wstr_length to 0 after freeing the wstr string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().	2016-01-27 16:56:53 +01:00
Serhiy Storchaka	726fc139a5	Issue #20440: More use of Py_SETREF. This patch is manually crafted and contains changes that couldn't be handled automatically.	2015-12-27 15:44:33 +02:00
Serhiy Storchaka	191321d11b	Issue #20440: More use of Py_SETREF. This patch is manually crafted and contains changes that couldn't be handled automatically.	2015-12-27 15:41:34 +02:00
Serhiy Storchaka	ef1585eb9a	Issue #25923: Added more const qualifiers to signatures of static and private functions.	2015-12-25 20:01:53 +02:00
Serhiy Storchaka	2d06e84455	Issue #25923: Added the const qualifier to static constant arrays.	2015-12-25 19:53:18 +02:00
Serhiy Storchaka	f006940351	Issue #20440: Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:39:57 +02:00
Serhiy Storchaka	5a57ade58e	Issue #20440: Massive replacing unsafe attribute setting code with special macro Py_SETREF.	2015-12-24 10:35:59 +02:00
Serhiy Storchaka	9b3a2eec1c	Issues #25890, #25891, #25892: Removed unused variables in Windows code. Reported by Alexander Riccio.	2015-12-18 10:03:13 +02:00
Serhiy Storchaka	7c088a9b5c	Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:05:52 +02:00
Serhiy Storchaka	6648bf5661	Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:04:37 +02:00
Serhiy Storchaka	7aa690860e	Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache.	2015-12-03 01:02:03 +02:00
Benjamin Peterson	d798dc1034	merge 3.5 (#25630)	2015-11-15 21:57:50 -08:00
Benjamin Peterson	a4d33b3428	make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL (closes #25630)	2015-11-15 21:57:39 -08:00
Serhiy Storchaka	413fdcea21	Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it on STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independedly without special preconditions.	2015-11-14 15:42:17 +02:00
Serhiy Storchaka	4a7c03aab4	Issue #25523: Merge a-to-an corrections from 3.5.	2015-11-02 14:44:29 +02:00
Serhiy Storchaka	a84f6c3dd3	Issue #25523: Merge a-to-an corrections from 3.4.	2015-11-02 14:39:05 +02:00
Serhiy Storchaka	d65c9496da	Issue #25523: Further a-to-an corrections.	2015-11-02 14:10:23 +02:00
Victor Stinner	358af13526	Issue #25353: Optimize unicode escape and raw unicode escape encoders to use the new _PyBytesWriter API.	2015-10-12 22:36:57 +02:00
Victor Stinner	6c2cdae9e6	Writer APIs: use empty string singletons Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the empty bytes/Unicode string if the string is empty.	2015-10-12 13:29:43 +02:00
Victor Stinner	6bd525b656	Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path	2015-10-09 13:10:05 +02:00
Victor Stinner	ce179bf6ba	Add _PyBytesWriter_WriteBytes() to factorize the code	2015-10-09 12:57:22 +02:00
Victor Stinner	ad7715891e	_PyBytesWriter: simplify code to avoid "prealloc" parameters Substract preallocate bytes from min_size before calling _PyBytesWriter_Prepare().	2015-10-09 12:38:53 +02:00
Victor Stinner	3fa36ff5e4	Issue #25318: Fix backslashreplace() Fix code to estimate the needed space.	2015-10-09 03:37:11 +02:00
Victor Stinner	797485e101	Issue #25318: Avoid sprintf() in backslashreplace() Rewrite backslashreplace() to be closer to PyCodec_BackslashReplaceErrors(). Add also unit tests for non-BMP characters.	2015-10-09 03:17:30 +02:00
Victor Stinner	0016507c16	Issue #25318: Move _PyBytesWriter to bytesobject.c Declare also the private API in bytesobject.h.	2015-10-09 01:53:21 +02:00
Victor Stinner	e7bf86cd7d	Optimize backslashreplace error handler Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. It avoids to create an exception and call the slow implementation of the error handler.	2015-10-09 01:39:28 +02:00
Victor Stinner	fdfbf78114	Issue #25318: Add _PyBytesWriter API Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.	2015-10-09 00:33:49 +02:00
Victor Stinner	74e8fac3c8	Issue #25301: Fix compatibility with ISO C90	2015-10-05 13:49:26 +02:00
Victor Stinner	1d65d9192d	Issue #25301: The UTF-8 decoder is now up to 15 times as fast for error handlers: ``ignore``, ``replace`` and ``surrogateescape``.	2015-10-05 13:43:50 +02:00
Victor Stinner	eb36fdaad8	Fix _PyUnicodeWriter_PrepareKind() Initialize kind to 0 (PyUnicode_WCHAR_KIND) to ensure that _PyUnicodeWriter_PrepareKind() handles correctly read-only buffer: copy the buffer.	2015-10-03 01:55:51 +02:00
Serhiy Storchaka	29e68edbf4	Issue #24848: Fixed bugs in UTF-7 decoding of misformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate. 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).	2015-10-02 13:14:03 +03:00
Serhiy Storchaka	58c8f2bb6d	Issue #24848: Fixed bugs in UTF-7 decoding of misformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate. 3. In some circumstances the '\xfd' character was produced instead of the replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).	2015-10-02 13:13:14 +03:00
Serhiy Storchaka	28b21e50c8	Issue #24848: Fixed bugs in UTF-7 decoding of misformed data: 1. Non-ASCII bytes were accepted after shift sequence. 2. A low surrogate could be emitted in case of error in high surrogate.	2015-10-02 13:07:28 +03:00
Victor Stinner	3222da26fe	Make _PyUnicode_TranslateCharmap() symbol private unicodeobject.h exposes PyUnicode_TranslateCharmap() and PyUnicode_Translate().	2015-10-01 22:07:32 +02:00
Victor Stinner	01ada3996b	Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.	2015-10-01 21:54:51 +02:00
Victor Stinner	c3713e9706	Optimize ascii/latin1+surrogateescape encoders Issue #25227: Optimize ASCII and latin1 encoders with the ``surrogateescape`` error handler: the encoders are now up to 3 times as fast. Initial patch written by Serhiy Storchaka.	2015-09-29 12:32:13 +02:00
Victor Stinner	0030cd52da	Issue #25227: Cleanup unicode_encode_ucs1() error handler * Change limit type from unsigned int to Py_UCS4, to use the same type than the "ch" variable (an Unicode character). * Reuse ch variable for _Py_ERROR_XMLCHARREFREPLACE * Add some newlines for readability	2015-09-24 14:45:00 +02:00
Victor Stinner	54385b206d	Issue #24870: revert unwanted change Sorry, I pushed the patch on the UTF-8 decoder by mistake :-(	2015-09-22 10:46:52 +02:00
Victor Stinner	5ebae87628	Issue #25207, #14626: Fix my commit. It doesn't work to use "#define XXX defined(YYY)" and then "#ifdef XXX" to check YYY.	2015-09-22 01:29:33 +02:00
Victor Stinner	6174474bea	_PyUnicodeWriter_PrepareInternal(): make the assertion more strict	2015-09-22 01:01:17 +02:00
Victor Stinner	ca9381ea01	Issue #24870: Add _PyUnicodeWriter_PrepareKind() macro Add a macro which ensures that the writer has at least the requested kind.	2015-09-22 00:58:32 +02:00
Victor Stinner	5014920cb7	Issue #24870: Reuse the new _Py_error_handler enum Factorize code with the new get_error_handler() function. Add some empty lines for readability.	2015-09-22 00:26:54 +02:00
Victor Stinner	f96418de05	Issue #24870: Optimize the ASCII decoder for error handlers: surrogateescape, ignore and replace. Initial patch written by Naoki Inada. The decoder is now up to 60 times as fast for these error handlers. Add also unit tests for the ASCII decoder.	2015-09-21 23:06:27 +02:00
Zachary Ware	070bd62cfa	Closes #21279: Merge with 3.5	2015-08-06 00:05:13 -05:00
Zachary Ware	d987a81d29	Issue #21279: Merge with 3.4	2015-08-06 00:04:23 -05:00

1265 Commits