Commit Graph

194 Commits

Author SHA1 Message Date
Serhiy Storchaka db6add7d71 Clean up escape-decode decoder tests. 2013-01-29 11:07:27 +02:00
Serhiy Storchaka 077cb347a9 Clean up escape-decode decoder tests. 2013-01-29 11:06:53 +02:00
Serhiy Storchaka 8fe5a9f9c3 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:37:39 +02:00
Serhiy Storchaka 24193debd4 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:28:07 +02:00
Serhiy Storchaka d679377be7 Issue #16979: Fix error handling bugs in the unicode-escape-decode decoder. 2013-01-29 10:20:44 +02:00
Serhiy Storchaka f584aba3a5 Issue #16975: Fix error handling bug in the escape-decode bytes decoder. 2013-01-25 23:33:22 +02:00
Serhiy Storchaka e58785b200 Issue #16975: Fix error handling bug in the escape-decode bytes decoder. 2013-01-25 23:32:41 +02:00
Serhiy Storchaka ace3ad3bf7 Issue #16975: Fix error handling bug in the escape-decode bytes decoder. 2013-01-25 23:31:43 +02:00
Serhiy Storchaka 55e2cb497b Issue #14850: Now a charmap decoder treats U+FFFE as "undefined mapping"
in any mapping, not only in a unicode string.
2013-01-15 15:30:04 +02:00
Serhiy Storchaka 45d16d9924 Issue #14850: Now a charmap decoder treats U+FFFE as "undefined mapping"
in any mapping, not only in a unicode string.
2013-01-15 15:01:20 +02:00
Serhiy Storchaka 4fb8caee87 Issue #14850: Now a charmap decoder treats U+FFFE as "undefined mapping"
in any mapping, not only in a unicode string.
2013-01-15 14:43:21 +02:00
Ezio Melotti aabd0b0312 #16918: merge with 3.3. 2013-01-11 06:05:51 +02:00
Ezio Melotti 5d3dba0d27 #16918: test_codecs now works with unittest test discovery. Patch by Zachary Ware. 2013-01-11 06:02:07 +02:00
Ezio Melotti e0b87edd7f Merge fix for broken/disabled test. 2013-01-11 05:57:58 +02:00
Ezio Melotti 26ed234052 Enable a broken test and fix it. 2013-01-11 05:54:57 +02:00
Serhiy Storchaka 24a3ef6999 Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:41:55 +02:00
Serhiy Storchaka ae3b32ad6b Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:40:52 +02:00
Serhiy Storchaka 48e188e573 Issue #11461: Fix the incremental UTF-16 decoder. Original patch by
Amaury Forgeot d'Arc. Added tests for partial decoding of non-BMP
characters.
2013-01-08 23:14:24 +02:00
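The partial-decoding case that the Issue #11461 commits mention can be sketched in current Python 3 as follows. This is only an illustration, not code from the patch; the sample character and the byte-at-a-time feeding are assumptions chosen to split a surrogate pair across calls.

```python
import codecs

# A non-BMP character is encoded as a surrogate pair (4 bytes) in UTF-16.
char = "\U00010000"
data = char.encode("utf-16-le")

# Feed the decoder one byte at a time so the pair is split across calls.
decoder = codecs.getincrementaldecoder("utf-16-le")()
pieces = [decoder.decode(data[i:i + 1]) for i in range(len(data))]
pieces.append(decoder.decode(b"", final=True))

# The incremental decoder buffers incomplete input and emits the
# character only once all four bytes have arrived.
result = "".join(pieces)
assert result == char
```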
Andrew Svetlov 2606a6f197 Issue #16719: Get rid of WindowsError. Use OSError instead
Patch by Serhiy Storchaka.
2012-12-19 14:33:35 +02:00
Ezio Melotti a0b5c46fa2 #16336: merge with 3.2. 2012-11-03 23:04:41 +02:00
Ezio Melotti 540da76115 #16336: fix input checking in the surrogatepass error handler. Patch by Serhiy Storchaka. 2012-11-03 23:03:39 +02:00
Philip Jenvey 5f9459fbed merge with 3.2 2012-10-26 17:05:09 -07:00
Philip Jenvey 45c41494bf bounds check for bad data (thanks amaury) 2012-10-26 17:01:53 -07:00
Antoine Pitrou a1f7655fa7 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 20:00:04 +02:00
Antoine Pitrou 6f80f5d444 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 19:55:21 +02:00
Antoine Pitrou 5e36edbaba Port additional tests from #14579 (the issue is already fixed). 2012-07-21 00:47:48 +02:00
Antoine Pitrou b4bbee25b1 Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling.
Patch by Serhiy Storchaka.
2012-07-21 00:45:14 +02:00
Victor Stinner e3b47152a4 Write tests for invalid characters (U+00110000)
Test the following functions:

 * codecs.raw_unicode_escape_decode()
 * PyUnicode_FromWideChar()
 * PyUnicode_FromUnicode()
 * "unicode_internal" and "unicode_escape" decoders
2011-12-09 20:49:49 +01:00
Ezio Melotti adc417ce36 #13406: fix more deprecation warnings and move the deprecation of unicode-internal earlier in the code. 2011-11-17 12:23:34 +02:00
Ezio Melotti 345379a7f8 #13406: correct the error message in check_warnings too. 2011-11-16 09:54:19 +02:00
Ezio Melotti 11060a4a48 #13406: silence deprecation warnings in test_codecs. 2011-11-16 09:39:10 +02:00
Victor Stinner 040e16e3e8 "unicode_internal" codec has been deprecated: fix related tests 2011-11-15 22:44:05 +01:00
Victor Stinner 76a31a6bff Cleanup decode_code_page_stateful() and encode_code_page()
* Fix decode_code_page_errors() result
 * Inline decode_code_page() and encode_code_page_chunk()
 * Replace the PyUnicodeObject type by PyObject
2011-11-04 00:05:13 +01:00
Victor Stinner 2f3ca9f20e Close #13247: Add cp65001 codec, the Windows UTF-8 (CP_UTF8) 2011-10-27 01:38:56 +02:00
Victor Stinner 9e92188f53 Issue #12281: Fix test_codecs.test_cp932() on Windows XP
Cool! Decoding b'\x81\x00abc' from cp932 with replace error handler is now
giving the same result on all Windows versions.
2011-10-18 21:55:25 +02:00
Victor Stinner 62be4fb21f Issue #12281: Skip code page tests on non-Windows platforms 2011-10-18 21:46:37 +02:00
Victor Stinner 3a50e7056e Issue #12281: Rewrite the MBCS codec to handle correctly replace and ignore
error handlers on all Windows versions. The MBCS codec is now supporting all
error handlers, instead of only replace to encode and ignore to decode.
2011-10-18 21:21:00 +02:00
Antoine Pitrou 00b2c86d09 Fix test failures when ctypes is not available
(followup to Victor's 85d11cf67aa8 and 7a50e549bd11)
2011-10-05 13:01:41 +02:00
Victor Stinner 182d90d9ee Fix test_codecs for Windows: check size of wchar_t, not sys.maxunicode 2011-09-29 19:53:55 +02:00
Martin v. Löwis d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Antoine Pitrou 2a20f9be70 Backport 0398f07d4827 (fix for weird buildbot failures) 2011-07-27 01:06:07 +02:00
Antoine Pitrou d05066d1ee Try to fix weird buildbot failures 2011-07-26 23:55:33 +02:00
Antoine Pitrou 5a24d82941 Add a test for issue #1813: getlocale() failing under a Turkish locale
(not a problem under 3.x)
2011-07-24 02:41:54 +02:00
Antoine Pitrou cf9d3c08c8 Issue #1813: Fix codec lookup under Turkish locales. 2011-07-24 02:27:04 +02:00
Victor Stinner 0501070669 Revert my commit 3555cf6f9c98: "Issue #8796: codecs.open() calls the builtin
open() function instead of using StreamReaderWriter. Deprecate StreamReader,
StreamWriter, StreamReaderWriter, StreamRecoder and EncodedFile() of the codec
module. Use the builtin open() function or io.TextIOWrapper instead."

"It has not been approved !" wrote Marc-Andre Lemburg.
2011-05-27 16:50:40 +02:00
Victor Stinner 98fe1a0c3b Issue #8796: codecs.open() calls the builtin open() function instead of using
StreamReaderWriter. Deprecate StreamReader, StreamWriter, StreamReaderWriter,
StreamRecoder and EncodedFile() of the codec module. Use the builtin open()
function or io.TextIOWrapper instead.
2011-05-27 01:51:18 +02:00
Victor Stinner d6881701fb Merge 3.2 2011-05-23 14:58:07 +02:00
Victor Stinner b43dd4b8ca Merge 3.1 2011-05-23 14:57:05 +02:00
Victor Stinner 2cca057284 test_codecs now removes the temporary file (created by the test) 2011-05-23 14:51:42 +02:00
Marc-André Lemburg 8f36af7a4c Normalize the encoding names for Latin-1 and UTF-8 to
'latin-1' and 'utf-8'.

These are optimized in the Python Unicode implementation
to result in more direct processing, bypassing the codec
registry.

Also see issue11303.
2011-02-25 15:42:01 +00:00
Benjamin Peterson 28a4dce6a8 remove (un)transform methods 2010-12-12 01:33:04 +00:00
Victor Stinner 53a9dd776e Issue #10546: UTF-16-LE and UTF-16-BE *do* support non-BMP characters
Fix the doc and add tests.
2010-12-08 22:25:45 +00:00
Georg Brandl 02524629f3 #7475: add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2. 2010-12-02 18:06:51 +00:00
Ezio Melotti 19f2aeba67 Merged revisions 86596 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r86596 | ezio.melotti | 2010-11-20 21:04:17 +0200 (Sat, 20 Nov 2010) | 1 line

  #9424: Replace deprecated assert* methods in the Python test suite.
........
2010-11-21 01:30:29 +00:00
Ezio Melotti b3aedd4862 #9424: Replace deprecated assert* methods in the Python test suite. 2010-11-20 19:04:17 +00:00
Benjamin Peterson 5a6214afe2 Merged revisions 81499,81506 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81499 | georg.brandl | 2010-05-24 16:29:07 -0500 (Mon, 24 May 2010) | 1 line

  #8016: add the CP858 codec (approved by Benjamin).  (Also add CP720 to the tests, it was missing there.)
........
  r81506 | benjamin.peterson | 2010-05-24 17:04:53 -0500 (Mon, 24 May 2010) | 1 line

  set svn:eol-style
........
2010-06-27 22:41:29 +00:00
Victor Stinner 554f3f0081 Issue #850997: mbcs encoding (Windows only) handles errors argument: strict
mode raises unicode errors. The encoder only supports "strict" and "replace"
error handlers, the decoder only supports "strict" and "ignore" error handlers.
2010-06-16 23:33:54 +00:00
Antoine Pitrou 6107a688ee Merged revisions 81908 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r81908 | antoine.pitrou | 2010-06-11 23:46:32 +0200 (ven., 11 juin 2010) | 11 lines

  Merged revisions 81907 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (ven., 11 juin 2010) | 5 lines

    Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
    the interpreter with characters outside the Basic Multilingual Plane
    (higher than 0x10000).
  ........
................
2010-06-11 21:48:34 +00:00
Antoine Pitrou cc0cfd3576 Merged revisions 81907 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (ven., 11 juin 2010) | 5 lines

  Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
  the interpreter with characters outside the Basic Multilingual Plane
  (higher than 0x10000).
........
2010-06-11 21:46:32 +00:00
Philip Jenvey ddf0d0383c Merged revisions 79780 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r79780 | philip.jenvey | 2010-04-04 20:05:24 -0700 (Sun, 04 Apr 2010) | 9 lines

  Merged revisions 79779 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r79779 | philip.jenvey | 2010-04-04 19:51:51 -0700 (Sun, 04 Apr 2010) | 2 lines

    fix escape_encode to return the correct consumed size
  ........
................
2010-06-09 17:56:11 +00:00
Victor Stinner 3dcb5acdb0 Issue #8838, #8339: Remove codecs.charbuffer_encode() and "t#" parsing format
Remove last references to the "char buffer" of the buffer protocol from
Python3.
2010-06-08 22:54:19 +00:00
Victor Stinner b64d0eba50 Merged revisions 81474 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r81474 | victor.stinner | 2010-05-22 18:59:09 +0200 (sam., 22 mai 2010) | 20 lines

  Merged revisions 81471-81472 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (sam., 22 mai 2010) | 7 lines

    Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32

     * Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
     * Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
     * test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
       Solaris or Windows, but does it really exist? I found it in the issue.
  ........
    r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines

    Fix my last commit (r81471) about codecs

    Remember: don't touch the code just before a commit
  ........
................
2010-05-22 17:01:13 +00:00
Victor Stinner a92ad7ee2c Merged revisions 81471-81472 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81471 | victor.stinner | 2010-05-22 15:37:56 +0200 (sam., 22 mai 2010) | 7 lines

  Issue #6268: More bugfixes about BOM, UTF-16 and UTF-32

   * Fix seek() method of codecs.open(), don't write the BOM twice after seek(0)
   * Fix reset() method of codecs, UTF-16, UTF-32 and StreamWriter classes
   * test_codecs: use "w+" mode instead of "wt+". "t" mode is not supported by
     Solaris or Windows, but does it really exist? I found it in the issue.
........
  r81472 | victor.stinner | 2010-05-22 15:44:25 +0200 (sam., 22 mai 2010) | 4 lines

  Fix my last commit (r81471) about codecs

  Remember: don't touch the code just before a commit
........
2010-05-22 16:59:09 +00:00
Victor Stinner 37b8200608 Merged revisions 81461 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r81461 | victor.stinner | 2010-05-22 04:16:27 +0200 (sam., 22 mai 2010) | 10 lines

  Merged revisions 81459 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r81459 | victor.stinner | 2010-05-22 04:11:07 +0200 (sam., 22 mai 2010) | 3 lines

    Issue #6268: Fix seek() method of codecs.open(), don't read the BOM twice
    after seek(0)
  ........
................
2010-05-22 02:17:42 +00:00
Victor Stinner 3fed0870a6 Merged revisions 81459 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81459 | victor.stinner | 2010-05-22 04:11:07 +0200 (sam., 22 mai 2010) | 3 lines

  Issue #6268: Fix seek() method of codecs.open(), don't read the BOM twice
  after seek(0)
........
2010-05-22 02:16:27 +00:00
Victor Stinner 158701d886 Merged revisions 80382 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r80382 | victor.stinner | 2010-04-22 21:38:16 +0200 (jeu., 22 avril 2010) | 3 lines

  Issue #8092: Fix PyUnicode_EncodeUTF8() to support error handler producing
  unicode string (eg. backslashreplace)
........
2010-04-22 19:41:01 +00:00
Victor Stinner 31be90b0c7 Issue #8092: Fix PyUnicode_EncodeUTF8() to support error handler producing
unicode string (eg. backslashreplace)
2010-04-22 19:38:16 +00:00
Philip Jenvey 66a1bd5568 Merged revisions 79779 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r79779 | philip.jenvey | 2010-04-04 19:51:51 -0700 (Sun, 04 Apr 2010) | 2 lines

  fix escape_encode to return the correct consumed size
........
2010-04-05 03:05:24 +00:00
Florent Xicluna e36b2c693c Recorded merge of revisions 78462,78484 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

................
  r78462 | florent.xicluna | 2010-02-26 12:12:33 +0100 (ven, 26 fév 2010) | 9 lines

  Merged revisions 78461 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r78461 | florent.xicluna | 2010-02-26 11:40:58 +0100 (ven, 26 fév 2010) | 2 lines

    #691291: codecs.open() should not convert end of lines on reading and writing.
  ........
................
  r78484 | florent.xicluna | 2010-02-27 12:31:21 +0100 (sam, 27 fév 2010) | 9 lines

  Merged revisions 78482 via svnmerge from
  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
    r78482 | florent.xicluna | 2010-02-27 12:19:18 +0100 (sam, 27 fév 2010) | 2 lines

    Add entry for issue #691291.
  ........
................
2010-02-27 11:38:27 +00:00
Florent Xicluna c1c415f304 Merged revisions 78461 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r78461 | florent.xicluna | 2010-02-26 11:40:58 +0100 (ven, 26 fév 2010) | 2 lines

  #691291: codecs.open() should not convert end of lines on reading and writing.
........
2010-02-26 11:12:33 +00:00
Ezio Melotti e96159335f Merged revisions 77727 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r77727 | ezio.melotti | 2010-01-24 18:58:36 +0200 (Sun, 24 Jan 2010) | 1 line

  use assert[Not]IsInstance where appropriate
........
2010-01-24 19:26:24 +00:00
Georg Brandl 7b10c9f301 Merged revisions 74871 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

................
  r74871 | georg.brandl | 2009-09-17 13:41:24 +0200 (Do, 17 Sep 2009) | 12 lines

  Merged revisions 74869 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk

  (Only the new tests, the code had already been corrected due to an API change
  in unicode_decode_call_errorhandler.)

  ........
    r74869 | georg.brandl | 2009-09-17 13:28:09 +0200 (Do, 17 Sep 2009) | 4 lines

    Issue #6922: Fix an infinite loop when trying to decode an invalid
    UTF-32 stream with a non-raising error handler like "replace" or "ignore".
  ........
................
2009-09-17 11:46:23 +00:00
Georg Brandl 791f4e15db Merged revisions 74869 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk
(Only the new tests, the code had already been corrected due to an API change
in unicode_decode_call_errorhandler.)

........
  r74869 | georg.brandl | 2009-09-17 13:28:09 +0200 (Do, 17 Sep 2009) | 4 lines

  Issue #6922: Fix an infinite loop when trying to decode an invalid
  UTF-32 stream with a non-raising error handler like "replace" or "ignore".
........
2009-09-17 11:41:24 +00:00
Georg Brandl ab91fdef1f Merged revisions 73715 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r73715 | benjamin.peterson | 2009-07-01 01:06:06 +0200 (Mi, 01 Jul 2009) | 1 line

  convert old fail* assertions to assert*
........
2009-08-13 08:51:18 +00:00
Benjamin Peterson c9c0f201fe convert old fail* assertions to assert* 2009-06-30 23:06:06 +00:00
Amaury Forgeot d'Arc e5344d6c45 Merged revisions 73698 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r73698 | amaury.forgeotdarc | 2009-06-30 00:36:49 +0200 (mar., 30 juin 2009) | 7 lines

  #6373: SystemError in str.encode('latin1', 'surrogateescape')
  if the string contains unpaired surrogates.
  (In debug build, crash in assert())

  This can happen with normal processing, if python starts with utf-8,
  then calls sys.setfilesystemencoding('latin-1')
........
2009-06-29 22:38:54 +00:00
Amaury Forgeot d'Arc 84ec8d9314 #6373: SystemError in str.encode('latin1', 'surrogateescape')
if the string contains unpaired surrogates.
(In debug build, crash in assert())

This can happen with normal processing, if python starts with utf-8,
then calls sys.setfilesystemencoding('latin-1')
2009-06-29 22:36:49 +00:00
Martin v. Löwis 43c57785d3 Rename utf8b error handler to surrogateescape. 2009-05-10 08:15:24 +00:00
Martin v. Löwis e0a2b72e61 Rename the surrogates error handler to surrogatepass. 2009-05-10 08:08:56 +00:00
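For orientation, the two error handlers named in these renames behave roughly as follows in current Python 3. This is a minimal sketch; the sample byte strings are assumptions picked for illustration and do not come from the commits.

```python
# surrogateescape: bytes that cannot be decoded are mapped to lone
# surrogates (U+DC80..U+DCFF) and restored unchanged when encoding.
raw = b"abc\xff"
text = raw.decode("utf-8", "surrogateescape")
assert text == "abc\udcff"
assert text.encode("utf-8", "surrogateescape") == raw

# surrogatepass: allows the UTF-8 codec to encode and decode surrogate
# code points, which it rejects by default.
lone = "\ud800"
encoded = lone.encode("utf-8", "surrogatepass")
assert encoded == b"\xed\xa0\x80"
assert encoded.decode("utf-8", "surrogatepass") == lone
```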
Walter Dörwald 8dc33d56f5 Merged revisions 72404-72406 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r72404 | walter.doerwald | 2009-05-06 16:28:24 +0200 (Mi, 06 Mai 2009) | 3 lines

  Issue 3739: The unicode-internal encoder now reports the number of *characters*
  consumed like any other encoder (instead of the number of bytes).
........
  r72406 | walter.doerwald | 2009-05-06 16:32:35 +0200 (Mi, 06 Mai 2009) | 2 lines

  Add NEWS entry about issue #3739.
........
2009-05-06 14:41:26 +00:00
Martin v. Löwis 011e842033 Issue #5915: Implement PEP 383, Non-decodable Bytes in
System Character Interfaces.
2009-05-05 04:43:17 +00:00
Martin v. Löwis db12d454e6 Issue #3672: Reject surrogates in utf-8 codec; add surrogates error
handler.
2009-05-02 18:52:14 +00:00
Antoine Pitrou 81fabdb437 Issue #4874: Most builtin decoders now reject unicode input. 2009-01-22 10:11:36 +00:00
Antoine Pitrou 616d28566b Issue #2394: implement more of the memoryview API. 2008-08-19 22:09:34 +00:00
Benjamin Peterson ee8712cda4 #2621 rename test.test_support to test.support 2008-05-20 21:35:26 +00:00
Christian Heimes 5d14c2b8f8 Merged revisions 59056-59076 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r59064 | christian.heimes | 2007-11-20 02:48:48 +0100 (Tue, 20 Nov 2007) | 1 line

  Fixed bug #1470
........
  r59066 | martin.v.loewis | 2007-11-20 03:46:02 +0100 (Tue, 20 Nov 2007) | 2 lines

  Patch #1468: Package Lib/test/*.pem.
........
  r59068 | christian.heimes | 2007-11-20 04:21:02 +0100 (Tue, 20 Nov 2007) | 1 line

  Another fix for test_shutil. Martin pointed out that it breaks some build bots
........
  r59073 | nick.coghlan | 2007-11-20 15:55:57 +0100 (Tue, 20 Nov 2007) | 1 line

  Backport some main.c cleanup from the py3k branch
........
  r59076 | amaury.forgeotdarc | 2007-11-21 00:31:27 +0100 (Wed, 21 Nov 2007) | 6 lines

  The incremental decoder for utf-7 must preserve its state between calls.
  Solves issue1460.

  Might not be a backport candidate: a new API function was added,
  and some code may rely on details in utf-7.py.
........
2007-11-20 23:38:09 +00:00
Guido van Rossum 87c0f1d1c9 Merged revisions 59041-59055 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r59044 | neal.norwitz | 2007-11-18 17:46:20 -0800 (Sun, 18 Nov 2007) | 1 line

  Use a slightly more recent version than 1.5.2b2.
........
  r59047 | walter.doerwald | 2007-11-19 04:14:05 -0800 (Mon, 19 Nov 2007) | 2 lines

  Fix typo in comment.
........
  r59049 | walter.doerwald | 2007-11-19 04:41:10 -0800 (Mon, 19 Nov 2007) | 4 lines

  Fix for #1444: utf_8_sig.StreamReader was (indirectly through decode())
  calling codecs.utf_8_decode() with final==True, which failed with incomplete
  byte sequences. Fix and test by James G. Sack.
........
  r59051 | nick.coghlan | 2007-11-19 05:56:27 -0800 (Mon, 19 Nov 2007) | 1 line

  Enable some test_cmd_line_script debugging output to investigate failure on Mac OSX buildbot
........
  r59053 | facundo.batista | 2007-11-19 08:30:24 -0800 (Mon, 19 Nov 2007) | 3 lines


  Fixed detail in add_type() explanation (issue 1463).
........
  r59054 | guido.van.rossum | 2007-11-19 09:35:24 -0800 (Mon, 19 Nov 2007) | 2 lines

  Make this work stand-alone, too.
........
  r59055 | guido.van.rossum | 2007-11-19 09:50:22 -0800 (Mon, 19 Nov 2007) | 3 lines

  Fix the OSX failures in this test -- they were due to /tmp being a symlink
  to /private/tmp.  Adding a call to os.path.realpath() to temp_dir() fixed it.
........
2007-11-19 18:03:44 +00:00
Guido van Rossum 98297ee781 Merging the py3k-pep3137 branch back into the py3k branch.
No detailed change log; just check out the change log for the py3k-pep3137
branch.  The most obvious changes:

  - str8 renamed to bytes (PyString at the C level);
  - bytes renamed to buffer (PyBytes at the C level);
  - PyString and PyUnicode are no longer compatible.

I.e. we now have an immutable bytes type and a mutable bytes type.

The behavior of PyString was modified quite a bit, to make it more
bytes-like.  Some changes are still on the to-do list.
2007-11-06 21:34:58 +00:00
Guido van Rossum 3172c5d263 Patch #1258 by Christian Heimes: kill basestring.
I like this because it makes the code shorter! :-)
2007-10-16 18:12:55 +00:00
Guido van Rossum 04c70ad971 Fix the one failing test (can't decode twice). 2007-08-29 14:04:40 +00:00
Guido van Rossum 09549f4407 Changes in anticipation of stricter str vs. bytes enforcement. 2007-08-27 20:40:10 +00:00
Walter Dörwald 41980caf64 Apply SF patch #1775604: This adds three new codecs (utf-32, utf-32-le and
utf-32-be). On narrow builds the codecs combine surrogate pairs in the unicode
object into one codepoint on encoding and create surrogate pairs for
codepoints outside the BMP on decoding. Lone surrogates are passed through
unchanged in all cases.

Backport to the trunk will follow.
2007-08-16 21:55:45 +00:00
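The three codecs added by this patch can be exercised as below. This is a sketch against current Python 3, where narrow builds no longer exist, so it shows only the byte layout and not the surrogate-pair handling described in the message; the sample character is an assumption.

```python
import codecs

ch = "\U0001F600"  # a code point outside the BMP

# The fixed-endian variants emit exactly four bytes per code point, no BOM.
le = ch.encode("utf-32-le")
be = ch.encode("utf-32-be")
assert le == b"\x00\xf6\x01\x00"
assert be == b"\x00\x01\xf6\x00"

# The plain "utf-32" codec prepends a BOM in the platform's byte order
# and uses it to pick the byte order again when decoding.
with_bom = ch.encode("utf-32")
assert with_bom[:4] in (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)
assert with_bom.decode("utf-32") == ch
```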
Walter Dörwald 2233d27a3f Change readbuffer_encode() and charbuffer_encode() to
return bytes objects.
2007-06-22 12:17:08 +00:00
Walter Dörwald 32a4c71419 Patch by Ron Adam: Don't use u prefix in unicode error messages
and remove u prefix from some comments in test_codecs.py.
2007-06-20 09:25:34 +00:00
Walter Dörwald 42748a8d6d Rip out all codecs that can't work in a unicode/bytes world:
base64, uu, zlib, rot_13, hex, quopri, bz2, string_escape.

However codecs.escape_encode() and codecs.escape_decode()
still exist, as they are used for pickling str8 objects
(so those two functions can go, when the str8 type is removed).
2007-06-12 16:40:17 +00:00
Walter Dörwald 092a225a4d Fix tests for unicode-internal codec. 2007-06-07 11:26:16 +00:00
Guido van Rossum f4cfc8f6bb Make test_codecs work. The CJK codecs now use bytes instead of str8 for
their encoded input/output.
2007-05-17 21:52:23 +00:00
Walter Dörwald 583118a535 Fix tests for string encodings. 2007-05-17 18:35:58 +00:00
Walter Dörwald 9d2ac22721 Fix io.StringIO: Strings are stored encoded (using "unicode-internal" as the
encoding) which makes the buffer mutable. Strings are encoded on the way in
and decoded on the way out.

Use io.StringIO in test_codecs.py.

Fix the base64_codec test in test_codecs.py.
2007-05-16 12:47:53 +00:00
Walter Dörwald 0ac30f82fe Enhance the punycode decoder so that it can decode
unicode objects.

Fix the idna codec and the tests.
2007-05-11 10:32:57 +00:00