The zlib and hex codecs throw custom exception types with
weakref support if the input type is valid, but the data
fails validation. Make sure the exception chaining in the
codec infrastructure can wrap those as well.
The utf-16* and utf-32* encoders no longer allow surrogate code points
(U+D800-U+DFFF) to be encoded.
The utf-32* decoders no longer decode byte sequences that correspond to
surrogate code points.
The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
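
A minimal sketch of the behaviour described above, assuming a Python 3 version that includes these changes. The helper name check_lone_surrogate is made up for illustration.

```python
def check_lone_surrogate(codec):
    """Return (strict_encode_ok, surrogatepass_roundtrip_ok) for one codec."""
    text = "\ud800"  # a lone high surrogate
    try:
        text.encode(codec)
        strict_ok = True
    except UnicodeEncodeError:
        strict_ok = False
    # surrogatepass lets the surrogate through in both directions
    raw = text.encode(codec, "surrogatepass")
    return strict_ok, raw.decode(codec, "surrogatepass") == text


results = {
    codec: check_lone_surrogate(codec)
    for codec in ("utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")
}

# Strict UTF-32 decoding of bytes that spell a surrogate is expected to fail.
try:
    b"\x00\xd8\x00\x00".decode("utf-32-le")
    utf32_decode_rejected = False
except UnicodeDecodeError:
    utf32_decode_rejected = True
```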
- output type errors now redirect users to the type-neutral
convenience functions in the codecs module
- stateless errors that occur during encoding and decoding
will now be automatically wrapped in exceptions that give
the name of the codec involved
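
The two cases above can be observed roughly as follows. This is a sketch: the helper describe_failure is hypothetical, and the exact message text (and whether the codec name arrives in a wrapping exception or by other means) may differ between Python versions, so only the exception type is relied on.

```python
import codecs


def describe_failure(func, *args):
    """Call func(*args) and return (exception type name, message), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


# str.encode() only accepts text encodings; "hex" is a bytes-to-bytes codec,
# so the error points the caller at the codecs module instead.
wrong_type = describe_failure("abc".encode, "hex")

# Valid input type, invalid data (odd number of hex digits).
bad_data = describe_failure(codecs.decode, b"abc", "hex")
```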
This mostly affected the encodebytes and decodebytes functions
(which are used by base64_codec)
Also added a test to ensure all bytes-bytes codecs can handle
memoryview input and tests for handling of multidimensional
and non-bytes format input in the modern base64 API.
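
As an illustration of the memoryview handling mentioned above, a short sketch assuming a Python 3 version with this change. The multidimensional case is expected to raise TypeError, but that part is not relied on here.

```python
import base64
import codecs

payload = memoryview(b"hi")

# The modern base64 API and the bytes-to-bytes codec both accept a memoryview.
encoded = base64.encodebytes(payload)
via_codec = codecs.encode(payload, "base64")
roundtrip = base64.decodebytes(memoryview(encoded))

# A 2x2 view of the same kind of data is not one-dimensional bytes-like input.
grid = memoryview(b"abcd").cast("B", (2, 2))
try:
    base64.encodebytes(grid)
    grid_rejected = False
except TypeError:
    grid_rejected = True
```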
Test the following functions:
* codecs.raw_unicode_escape_decode()
* PyUnicode_FromWideChar()
* PyUnicode_FromUnicode()
* "unicode_internal" and "unicode_escape" decoders
open() function instead of using StreamReaderWriter. Deprecate StreamReader,
StreamWriter, StreamReaderWriter, StreamRecoder and EncodedFile() of the codec
module. Use the builtin open() function or io.TextIOWrapper instead."
"It has not been approved !" wrote Marc-Andre Lemburg.
StreamReaderWriter. Deprecate StreamReader, StreamWriter, StreamReaderWriter,
StreamRecoder and EncodedFile() of the codec module. Use the builtin open()
function or io.TextIOWrapper instead.
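
For reference, the io.TextIOWrapper alternative named above looks roughly like this. It is a sketch using an in-memory io.BytesIO in place of a real file; it only shows the io-based API and takes no position on the deprecation itself.

```python
import io

# Writing: wrap a binary stream and let TextIOWrapper do the encoding.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
writer.write("caf\u00e9\n")
writer.flush()  # push the encoded bytes down to the underlying stream
data = raw.getvalue()

# Reading: the same wrapper decodes on the way back in.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8")
line = reader.readline()
```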
'latin-1' and 'utf-8'.
These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.
Also see issue11303.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81499 | georg.brandl | 2010-05-24 16:29:07 -0500 (Mon, 24 May 2010) | 1 line
#8016: add the CP858 codec (approved by Benjamin). (Also add CP720 to the tests, it was missing there.)
........
r81506 | benjamin.peterson | 2010-05-24 17:04:53 -0500 (Mon, 24 May 2010) | 1 line
set svn:eol-style
........
mode raises unicode errors. The encoder only supports "strict" and "replace"
error handlers, the decoder only supports "strict" and "ignore" error handlers.
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (Fri, 11 Jun 2010) | 5 lines
Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
the interpreter with characters outside the Basic Multilingual Plane
(higher than 0x10000).
........
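
The kind of input involved in #8941 can be sketched as below. The code point U+1F600 is an arbitrary choice for illustration, and the snippet assumes a current Python 3, where a non-BMP character is a single code point.

```python
# A character outside the Basic Multilingual Plane, stored as big endian UTF-32.
code_point = 0x1F600
data = code_point.to_bytes(4, "big")

char = data.decode("utf-32-be")
```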
svn+ssh://pythondev@svn.python.org/python/branches/py3k
................
r81474 | victor.stinner | 2010-05-22 18:59:09 +0200 (Sat, 22 May 2010) | 20 lines
Merged revisions 81471-81472 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (Sat, 22 May 2010) | 7 lines
Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32
* Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
* Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
* test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
Solaris or Windows, but does it really exist? I found it in the issue.
........
r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (Sat, 22 May 2010) | 4 lines
Fix my last commit (r81471) about codecs
Remember: don't touch the code just before a commit
........
................
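
The BOM handling from issue #6268 can be sketched with a StreamWriter over an in-memory buffer. This uses codecs.getwriter() and io.BytesIO rather than codecs.open() on a real file, which is an assumption made to keep the example self-contained.

```python
import codecs
import io

buf = io.BytesIO()
writer = codecs.getwriter("utf-16")(buf)

# Two consecutive writes: the BOM should only precede the first one.
writer.write("ab")
writer.write("cd")
first_pass = buf.getvalue()

# seek(0) resets the writer, so rewriting from the start emits the BOM once
# again at offset 0 instead of leaving the stream without one.
writer.seek(0)
writer.write("abcd")
second_pass = buf.getvalue()
```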
svn+ssh://pythondev@svn.python.org/python/trunk
........
r78461 | florent.xicluna | 2010-02-26 11:40:58 +0100 (Fri, 26 Feb 2010) | 2 lines
#691291: codecs.open() should not convert end of lines on reading and writing.
........
svn+ssh://svn.python.org/python/branches/py3k
................
r74871 | georg.brandl | 2009-09-17 13:41:24 +0200 (Thu, 17 Sep 2009) | 12 lines
Merged revisions 74869 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk
(Only the new tests, the code had already been corrected due to an API change
in unicode_decode_call_errorhandler.)
........
r74869 | georg.brandl | 2009-09-17 13:28:09 +0200 (Thu, 17 Sep 2009) | 4 lines
Issue #6922: Fix an infinite loop when trying to decode an invalid
UTF-32 stream with a non-raising error handler like "replace" or "ignore".
........
................
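
A small sketch of the non-raising handlers from issue #6922 on an invalid (truncated) UTF-32 stream. The one-byte input is chosen only for illustration.

```python
# A single byte can never be a complete UTF-32 code unit.
truncated = b"\x01"

# Both calls should return promptly instead of looping.
replaced = truncated.decode("utf-32", "replace")
ignored = truncated.decode("utf-32", "ignore")

try:
    truncated.decode("utf-32")  # strict is the default handler
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True
```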
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
r73698 | amaury.forgeotdarc | 2009-06-30 00:36:49 +0200 (Tue, 30 Jun 2009) | 7 lines
#6373: SystemError in str.encode('latin1', 'surrogateescape')
if the string contains unpaired surrogates.
(In debug build, crash in assert())
This can happen with normal processing, if python starts with utf-8,
then calls sys.setfilesystemencoding('latin-1')
........
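
The encode call from #6373 can be sketched as below, assuming a current Python 3. The expectation is an ordinary result or a UnicodeEncodeError, never a SystemError.

```python
# U+DC80 is what surrogateescape produces for the undecodable byte 0x80,
# so encoding it back is expected to yield that byte.
escaped = "\udc80".encode("latin-1", "surrogateescape")

# A surrogate outside U+DC80..U+DCFF has no byte to map back to.
try:
    "\ud800".encode("latin-1", "surrogateescape")
    outcome = "encoded"
except UnicodeEncodeError:
    outcome = "UnicodeEncodeError"
```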
svn+ssh://pythondev@svn.python.org/python/trunk
........
r72404 | walter.doerwald | 2009-05-06 16:28:24 +0200 (Wed, 06 May 2009) | 3 lines
Issue 3739: The unicode-internal encoder now reports the number of *characters*
consumed like any other encoder (instead of the number of bytes).
........
r72406 | walter.doerwald | 2009-05-06 16:32:35 +0200 (Wed, 06 May 2009) | 2 lines
Add NEWS entry about issue #3739.
........
svn+ssh://pythondev@svn.python.org/python/trunk
........
r59064 | christian.heimes | 2007-11-20 02:48:48 +0100 (Tue, 20 Nov 2007) | 1 line
Fixed bug #1470
........
r59066 | martin.v.loewis | 2007-11-20 03:46:02 +0100 (Tue, 20 Nov 2007) | 2 lines
Patch #1468: Package Lib/test/*.pem.
........
r59068 | christian.heimes | 2007-11-20 04:21:02 +0100 (Tue, 20 Nov 2007) | 1 line
Another fix for test_shutil. Martin pointed out that it breaks some build bots
........
r59073 | nick.coghlan | 2007-11-20 15:55:57 +0100 (Tue, 20 Nov 2007) | 1 line
Backport some main.c cleanup from the py3k branch
........
r59076 | amaury.forgeotdarc | 2007-11-21 00:31:27 +0100 (Wed, 21 Nov 2007) | 6 lines
The incremental decoder for utf-7 must preserve its state between calls.
Solves issue1460.
Might not be a backport candidate: a new API function was added,
and some code may rely on details in utf-7.py.
........
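
A sketch of what "preserve its state between calls" means for the utf-7 incremental decoder. The sample text and the split position are arbitrary choices for illustration.

```python
import codecs

text = "caf\u00e9"
encoded = text.encode("utf-7")

# Split inside the "+...-" shifted sequence that carries the non-ASCII char.
split_at = encoded.index(b"+") + 2
chunks = [encoded[:split_at], encoded[split_at:]]

decoder = codecs.getincrementaldecoder("utf-7")()
pieces = [decoder.decode(chunk) for chunk in chunks]
pieces.append(decoder.decode(b"", final=True))  # flush at end of input
decoded = "".join(pieces)
```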
svn+ssh://pythondev@svn.python.org/python/trunk
........
r59044 | neal.norwitz | 2007-11-18 17:46:20 -0800 (Sun, 18 Nov 2007) | 1 line
Use a slightly more recent version than 1.5.2b2.
........
r59047 | walter.doerwald | 2007-11-19 04:14:05 -0800 (Mon, 19 Nov 2007) | 2 lines
Fix typo in comment.
........
r59049 | walter.doerwald | 2007-11-19 04:41:10 -0800 (Mon, 19 Nov 2007) | 4 lines
Fix for #1444: utf_8_sig.StreamReader was (indirectly through decode())
calling codecs.utf_8_decode() with final==True, which failed with incomplete
byte sequences. Fix and test by James G. Sack.
........
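
The effect of the final flag mentioned in #1444 can be sketched directly with codecs.utf_8_decode(). The input is the first byte of a two-byte sequence, picked for illustration.

```python
import codecs

incomplete = b"\xc3"  # first half of a two-byte UTF-8 sequence

# final=False: nothing is decoded and nothing is consumed, so a stream
# reader can keep the byte and retry once more data arrives.
partial = codecs.utf_8_decode(incomplete, "strict", False)

# final=True: the same input is treated as truncated data.
try:
    codecs.utf_8_decode(incomplete, "strict", True)
    final_raised = False
except UnicodeDecodeError:
    final_raised = True

# Once the second byte is available, the sequence decodes normally.
complete = codecs.utf_8_decode(incomplete + b"\xa9", "strict", False)
```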
r59051 | nick.coghlan | 2007-11-19 05:56:27 -0800 (Mon, 19 Nov 2007) | 1 line
Enable some test_cmd_line_script debugging output to investigate failure on Mac OSX buildbot
........
r59053 | facundo.batista | 2007-11-19 08:30:24 -0800 (Mon, 19 Nov 2007) | 3 lines
Fixed detail in add_type() explanation (issue 1463).
........
r59054 | guido.van.rossum | 2007-11-19 09:35:24 -0800 (Mon, 19 Nov 2007) | 2 lines
Make this work stand-alone, too.
........
r59055 | guido.van.rossum | 2007-11-19 09:50:22 -0800 (Mon, 19 Nov 2007) | 3 lines
Fix the OSX failures in this test -- they were due to /tmp being a symlink
to /private/tmp. Adding a call to os.path.realpath() to temp_dir() fixed it.
........