760 Commits

Author SHA1 Message Date
Victor Stinner 77bb47b312 Simplify unicode_resizable(): singletons reference count is at least 2 2011-10-03 20:06:05 +02:00
Victor Stinner 85041a54bd _PyUnicode_CheckConsistency() checks utf8 field consistency 2011-10-03 14:42:39 +02:00
Victor Stinner 3cf4637e4e unicode_subtype_new() copies also the ascii flag 2011-10-03 14:42:15 +02:00
Victor Stinner 42dfd71333 unicode_kind_name() doesn't check consistency anymore
It is called from _PyUnicode_Dump() and so must not fail.
2011-10-03 14:41:45 +02:00
Victor Stinner a3b334da6d PyUnicode_Ready() now sets ascii=1 if maxchar < 128
ascii=1 is no longer reserved for PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner 1b4f9ceca7 Create _PyUnicode_READY_REPLACE() to reuse singleton
Only use _PyUnicode_READY_REPLACE() on just created strings.
2011-10-03 13:28:14 +02:00
Victor Stinner c379ead9af Fix resize_compact() and resize_inplace(); reenable full resize optimizations
* resize_compact() updates also wstr_len for non-ascii strings sharing wstr
 * resize_inplace() updates also utf8_len/wstr_len for strings sharing
   utf8/wstr
2011-10-03 12:52:27 +02:00
Victor Stinner 34411e17b0 resize_inplace() has been fixed: reenable this optimization 2011-10-03 12:21:33 +02:00
Victor Stinner a849a4b6b4 _PyUnicode_Dump() indicates if wstr and/or utf8 are shared 2011-10-03 12:12:11 +02:00
Victor Stinner 1c8d0c76a1 Fix resize_inplace(): update shared utf8 pointer 2011-10-03 12:11:00 +02:00
Victor Stinner ca4f7a4298 Disable unicode_resize() optimization on Windows (16-bit wchar_t) 2011-10-03 04:18:04 +02:00
Victor Stinner 126c559d05 _PyUnicode_Ready() for 16-bit wchar_t 2011-10-03 04:17:10 +02:00
Victor Stinner 2fd82278cb Fix compilation error on Windows
Fix also a compiler warning.
2011-10-03 04:06:05 +02:00
Victor Stinner a3be613a56 Use PyUnicode_WCHAR_KIND to check if a string is a wstr string
Simplify the test in wstr pointer in unicode_sizeof().
2011-10-03 02:16:37 +02:00
Victor Stinner 910337b42e Add _PyUnicode_CheckConsistency() macro to help debugging
* Document Unicode string states
 * Use _PyUnicode_CheckConsistency() to ensure that objects are always
   consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner 4fae54cb0e In release mode, PyUnicode_InternInPlace() does nothing if the input is NULL or
not a unicode, instead of failing with a fatal error.

Use assertions in debug mode (provide better error messages).
2011-10-03 02:01:52 +02:00
Victor Stinner 23e5668214 PyUnicode_Append() now works in-place when it's possible 2011-10-03 03:54:37 +02:00
Victor Stinner fe226c0d37 Rewrite PyUnicode_Resize()
* Rename _PyUnicode_Resize() to unicode_resize()
 * unicode_resize() creates a copy if the string cannot be resized instead
   of failing
 * Optimize resize_copy() for wstr strings
 * Temporarily disable resize_inplace()
2011-10-03 03:52:20 +02:00
Victor Stinner 829c0adca9 Add _PyUnicode_HAS_UTF8_MEMORY() macro 2011-10-03 01:08:02 +02:00
Victor Stinner fe0c155c4f Write _PyUnicode_Dump() to help debugging 2011-10-03 02:59:31 +02:00
Victor Stinner f42dc448e0 PyUnicode_CopyCharacters() fails when copying latin1 into ascii 2011-10-02 23:33:16 +02:00
Victor Stinner c53be96c54 unicode_convert_wchar_to_ucs4() cannot fail 2011-10-02 21:33:54 +02:00
Victor Stinner c3c7415639 Add _PyUnicode_DATA_ANY(op) private macro 2011-10-02 20:39:55 +02:00
Victor Stinner a464fc141d unicode_empty and unicode_latin1 are PyObject* objects, not PyUnicodeObject* 2011-10-02 20:39:30 +02:00
Victor Stinner 267aa24365 PyUnicode_FindChar() raises an IndexError on invalid index 2011-10-02 01:08:37 +02:00
Victor Stinner bc603d12b7 Optimize _PyUnicode_AsKind() for UCS1->UCS4 and UCS2->UCS4
* Ensure that the input string is ready
 * Raise a ValueError instead of a fatal error
2011-10-02 01:00:40 +02:00
Victor Stinner 5a706cf8c0 Fix usage of PyUnicode_READY() in PyUnicode_GetLength() 2011-10-02 00:36:53 +02:00
Victor Stinner cd9950fd09 PyUnicode_WriteChar() raises IndexError on invalid index
PyUnicode_WriteChar() raises also a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner 2fe5ced752 PyUnicode_ReadChar() raises an IndexError if the index is invalid
unicode_getitem() reuses PyUnicode_ReadChar()
2011-10-02 00:25:40 +02:00
Victor Stinner 202b62bd90 PyUnicode_FromKindAndData() raises a ValueError if the kind is unknown 2011-10-01 23:48:37 +02:00
Victor Stinner 07ac3ebd7b Optimize unicode_subtype_new(): don't encode to wchar_t and decode from wchar_t
Rewrite unicode_subtype_new(): allocate directly the right type.
2011-10-01 16:16:43 +02:00
Victor Stinner e90fe6a8f4 Add _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() macros
* Rename existing _PyUnicode_UTF8() macro to PyUnicode_UTF8()
 * Rename existing _PyUnicode_UTF8_LENGTH() macro to PyUnicode_UTF8_LENGTH()
 * PyUnicode_UTF8() and PyUnicode_UTF8_LENGTH() are more strict
2011-10-01 16:48:13 +02:00
Martin v. Löwis 0b1d348990 Issue 13085: Fix some memory leaks. Patch by Stefan Krah. 2011-10-01 16:35:40 +02:00
Benjamin Peterson 5c0fb00ad8 merge heads 2011-10-01 00:12:20 -04:00
Benjamin Peterson 31616ea2ff remove reference to non-existent file 2011-10-01 00:11:09 -04:00
Victor Stinner de636f3c34 PyUnicode_Substring() now accepts end bigger than string length
Fix also a bug: call PyUnicode_READY() before reading string length.
2011-10-01 03:55:54 +02:00
Victor Stinner c759f3e7ec Ooops, avoid a division by zero in unicode_repeat() 2011-10-01 03:09:58 +02:00
Victor Stinner d3a83d5eb3 PyUnicode_FromObject() ensures that its output is a ready string 2011-10-01 03:09:33 +02:00
Victor Stinner 67ca64ce54 I want a super fast 'a' * n!
* Optimize unicode_repeat() for a special case with memset()
 * Simplify integer overflow checking; remove the second check because
   PyUnicode_New() already does it and uses a smaller limit (Py_ssize_t vs
   size_t)
2011-10-01 02:47:29 +02:00
Victor Stinner e9a2935c1f Fix usage of PyUnicode_READY in unicodeobject.c 2011-10-01 02:14:59 +02:00
Victor Stinner 12bab6dace Remove private substring() function, reuse public PyUnicode_Substring()
* PyUnicode_Substring() now fails if start or end is invalid
 * PyUnicode_Substring() reuses PyUnicode_Copy() for non-exact strings
2011-10-01 01:53:49 +02:00
Victor Stinner c841e7db1f Optimize PyUnicode_Copy(): don't recompute maximum character 2011-10-01 01:34:32 +02:00
Victor Stinner 2219e0a37e PyUnicode_FromObject() reuses PyUnicode_Copy()
* PyUnicode_Copy() is faster than substring()
 * Fix also a compiler warning
2011-10-01 01:16:59 +02:00
Victor Stinner 034f6cf10c Add PyUnicode_Copy() function, include it to the public API 2011-09-30 02:26:44 +02:00
Victor Stinner b153615008 PyUnicode_CopyCharacters() uses exceptions instead of assertions
Call PyErr_BadInternalCall() if inputs are not unicode strings.
2011-09-30 02:26:10 +02:00
Victor Stinner d8f6510acc _PyUnicode_Ready() cannot be used on ready strings anymore
* Change its prototype: PyObject* instead of PyUnicodeObject*.
 * Remove an old assertion, the result of PyUnicode_READY (_PyUnicode_Ready)
   must be checked instead
2011-09-29 19:43:17 +02:00
Victor Stinner bc8b81bc4e Move _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() outside unicodeobject.h
Move these macros to unicodeobject.c
2011-09-29 19:31:34 +02:00
Victor Stinner a0702ab1fe Add a note in PyUnicode_CopyCharacters() doc: it doesn't write null character
Cleanup also the code (avoid the goto).
2011-09-29 14:14:38 +02:00
Victor Stinner 639418812f Use the new Py_ARRAY_LENGTH macro 2011-09-29 00:42:28 +02:00
Victor Stinner b9dcffb51e Fix 'c' format of PyUnicode_Format()
formatbuf is now an array of Py_UCS4, not of Py_UNICODE
2011-09-29 00:39:24 +02:00
Victor Stinner c17f540b7a Oops, fix my previous commit: unicode => to 2011-09-29 00:16:58 +02:00
Victor Stinner b15d4d899c PyUnicode_CopyCharacters() marks the string as dirty (reset the hash) 2011-09-28 23:59:20 +02:00
Victor Stinner f5ca1a21a5 PyUnicode_CopyCharacters() fails if 'to' has more than 1 reference 2011-09-28 23:54:59 +02:00
Ezio Melotti 2aa2b3b4d5 Clean up a few tabs that went in with PEP393. 2011-09-29 00:58:57 +03:00
Ezio Melotti 48a2f8fd97 #13054: sys.maxunicode is now always 0x10FFFF. 2011-09-29 00:18:19 +03:00
Victor Stinner 506f592769 Check size of wchar_t using the preprocessor 2011-09-28 22:34:18 +02:00
Victor Stinner 73f01c65c8 PyUnicode_CopyCharacters() initializes overflow 2011-09-28 22:28:04 +02:00
Victor Stinner e57b1c0da1 Mark PyUnicode_FromUCS[124] as private 2011-09-28 22:20:48 +02:00
Victor Stinner ff9e50fd04 Oops, fix Py_MIN/Py_MAX case 2011-09-28 22:17:19 +02:00
Victor Stinner 17222160e7 Mark _PyUnicode_FindMaxCharAndNumSurrogatePairs() as private 2011-09-28 22:15:37 +02:00
Victor Stinner 157f83fcfc Strip trailing spaces in unicodeobject.[ch] 2011-09-28 21:41:31 +02:00
Victor Stinner 6c7a52a46f Check for PyUnicode_CopyCharacters() failure 2011-09-28 21:39:17 +02:00
Victor Stinner be78eaf2de PyUnicode_CopyCharacters() checks for buffer and character overflow
It now returns the number of written characters on success.
2011-09-28 21:37:03 +02:00
Victor Stinner fb5f5f2420 Mark PyUnicode_CONVERT_BYTES as private 2011-09-28 21:39:49 +02:00
Georg Brandl 4cb0de246c Rename new macros to conform to naming rules (function macros have "Py" prefix, not "PY"). 2011-09-28 21:49:49 +02:00
Benjamin Peterson 9c6e6a0c7f don't check that the first character is XID_Continue
Currently, XID_Continue is a superset of XID_Start, but that may change someday.
2011-09-28 08:09:05 -04:00
Martin v. Löwis d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Mark Dickinson 57e683e53e Issue #1621: Fix undefined behaviour in bytes.__hash__, str.__hash__, tuple.__hash__, frozenset.__hash__ and set indexing operations. 2011-09-24 18:18:40 +01:00
Mark Dickinson 0d5f6adbb3 Issue #13012: Allow 'keepends' to be passed as a keyword argument in str.splitlines, bytes.splitlines and bytearray.splitlines. 2011-09-24 09:14:39 +01:00
Victor Stinner f955eb210f Merge 3.2: Fix PyUnicode_AsWideCharString() doc
- Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null
   character
 - Fix spelling of the null character
2011-09-06 02:01:29 +02:00
Victor Stinner d88d9836c5 Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character
Fix also spelling of the null character.
2011-09-06 02:00:05 +02:00
Ezio Melotti 6f2a683a0c #9200: merge with 3.2. 2011-08-22 20:31:11 +03:00
Ezio Melotti 93e7afc5d9 #9200: The str.is* methods now work with strings that contain non-BMP characters even in narrow Unicode builds. 2011-08-22 14:08:38 +03:00
Benjamin Peterson e518d4c18a merge 3.2 2011-08-18 13:52:19 -05:00
Benjamin Peterson 7a6b44ab62 the name of the character is actually NUL 2011-08-18 13:51:47 -05:00
Benjamin Peterson 020340f284 merge 3.2 2011-08-18 10:49:16 -05:00
Benjamin Peterson 5ad517a7d9 NUL -> NULL 2011-08-18 10:48:50 -05:00
Ezio Melotti 269e3ee3db #12266: merge with 3.2. 2011-08-15 09:26:28 +03:00
Ezio Melotti ee8d998ecf #12266: Fix str.capitalize() to correctly uppercase/lowercase titlecased and cased non-letter characters. 2011-08-15 09:09:57 +03:00
Benjamin Peterson f8e7543df9 merge 3.2 (#12732) 2011-08-12 22:18:19 -05:00
Benjamin Peterson f413b80806 in narrow builds, make sure to test codepoints as identifier characters (closes #12732)
This fixes the use of Unicode identifiers outside the BMP in narrow builds.
2011-08-12 22:17:18 -05:00
Brian Curtin dfc80e3d97 Replace Py_NotImplemented returns with the macro form Py_RETURN_NOTIMPLEMENTED.
The macro was introduced in #12724.
2011-08-10 20:28:54 -05:00
Senthil Kumaran fcdaaa9011 merge from 3.2 - Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearray/unicodeobject. 2011-07-27 23:34:29 +08:00
Senthil Kumaran 53516a82df Fix closes Issue12621 - Fix docstrings of find and rfind methods of bytes/bytearray/unicodeobject. 2011-07-27 23:33:54 +08:00
Victor Stinner 99b9538636 Issue #9642: Uniformize the tests on the availability of the mbcs codec
Add a new HAVE_MBCS define.
2011-07-04 14:23:54 +02:00
Senthil Kumaran bc9d8f838b merge from 3.2 2011-07-03 21:05:25 -07:00
Senthil Kumaran 9ebe08d2f6 Fix closes issue12471 - wrong TypeError message when '%i' format spec was used. 2011-07-03 21:03:16 -07:00
Victor Stinner 3cbf14bfb1 Issue #10914: Initialize correctly the filesystem codec when creating a new
subinterpreter to fix a bootstrap issue with codecs implemented in Python, as
the ISO-8859-15 codec.

Add fscodec_initialized attribute to the PyInterpreterState structure.
2011-04-27 00:24:21 +02:00
Victor Stinner 793b531756 Issue #10914: Initialize correctly the filesystem codec when creating a new
subinterpreter to fix a bootstrap issue with codecs implemented in Python, as
the ISO-8859-15 codec.

Add fscodec_initialized attribute to the PyInterpreterState structure.
2011-04-27 00:24:21 +02:00
Ezio Melotti bf1253b25a #6780: merge with 3.2. 2011-04-26 06:45:24 +03:00
Ezio Melotti f2b3f780a1 #6780: merge with 3.1. 2011-04-26 06:40:59 +03:00
Ezio Melotti ba42fd5801 #6780: fix starts/endswith error message to mention that tuples are accepted too. 2011-04-26 06:09:45 +03:00
Jesus Cea c1ceb64e41 MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) 2011-04-20 17:59:29 +02:00
Jesus Cea 6159ee3cf5 MERGE: startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) 2011-04-20 17:42:50 +02:00
Jesus Cea ac4515063c startswith and endswith don't accept None as slice index. Patch by Torsten Becker. (closes #11828) 2011-04-20 17:09:23 +02:00
Benjamin Peterson 5fd4bd3796 avoid casting with this nice macro 2011-03-06 09:06:34 -06:00
Victor Stinner 2f283c2c19 Fix my previous commit (r88709) for str.encode(errors=...) 2011-03-02 01:21:46 +00:00
Victor Stinner a5c68c3cb7 Issue #8923: cache str.encode() result
When a string is encoded to UTF-8 in strict mode, the result is cached into the
object. Examples: str.encode(), str.encode('utf-8'), PyUnicode_AsUTF8String()
and PyUnicode_AsEncodedString(unicode, "utf-8", NULL).
2011-03-02 01:03:14 +00:00
Victor Stinner f3fd733f92 Remove useless argument of _PyUnicode_AsDefaultEncodedString() 2011-03-02 01:03:11 +00:00
Victor Stinner 6d970f4713 Issue #10831: PyUnicode_FromFormat() supports %li, %lli and %zi formats 2011-03-02 00:04:25 +00:00
Victor Stinner e7faec1aa9 Fix my previous commit (r88702): initialize size_tflag in parse_format_flags() 2011-03-02 00:01:53 +00:00
Victor Stinner 968654515f Issue #10829: Refactor PyUnicode_FromFormat()
* Use the same function to parse the format string in the 3 steps
 * Fix crashes on invalid format strings
2011-03-01 23:44:09 +00:00
Victor Stinner 2b574a2332 Merged revisions 88697 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88697 | victor.stinner | 2011-03-01 23:46:52 +0100 (Tue, 01 Mar 2011) | 4 lines

  Issue #11246: Fix PyUnicode_FromFormat("%V")

  Decode the byte string from UTF-8 (with replace error handler) instead of
  ISO-8859-1 (in strict mode). Patch written by Ray Allen.
........
2011-03-01 22:48:49 +00:00
Victor Stinner 2512a8b62e Issue #11246: Fix PyUnicode_FromFormat("%V")
Decode the byte string from UTF-8 (with replace error handler) instead of
ISO-8859-1 (in strict mode). Patch written by Ray Allen.
2011-03-01 22:46:52 +00:00
Alexander Belopolsky 4001847a98 PEP 7 conformance changes (whitespace only). 2011-02-26 01:02:56 +00:00
Alexander Belopolsky 1d52146a25 Issue #11303: Added shortcuts for utf8 and latin1 encodings.
Documented the list of optimized encodings as CPython implementation
detail.
2011-02-25 19:19:57 +00:00
Victor Stinner 659eb84457 Merged revisions 88481 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88481 | victor.stinner | 2011-02-21 22:13:44 +0100 (Mon, 21 Feb 2011) | 4 lines

  Fix PyUnicode_FromFormatV("%c") for non-BMP char

  Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
  narrow build.
........
2011-02-23 12:14:22 +00:00
Brett Cannon b94767ff44 Issue #8914: fix various warnings from the Clang static analyzer v254. 2011-02-22 20:15:44 +00:00
Victor Stinner 5ed8b2c737 Fix PyUnicode_FromFormatV("%c") for non-BMP char
Issue #10830: Fix PyUnicode_FromFormatV("%c") for non-BMP characters on
narrow build.
2011-02-21 21:13:44 +00:00
Victor Stinner fd34b3788f Remove bootstrap code of PyUnicode_AsEncodedString()
Issue #11187: Remove bootstrap code (use ASCII) of
PyUnicode_AsEncodedString(), it was replaced by a better fallback (use
the locale encoding) in PyUnicode_EncodeFSDefault().

Prepare also empty sections in NEWS.
2011-02-21 20:51:28 +00:00
Alexander Belopolsky b9cc00caab Removed unneeded #include 2010-12-22 02:35:20 +00:00
Benjamin Peterson 28a4dce6a8 remove (un)transform methods 2010-12-12 01:33:04 +00:00
Alexander Belopolsky 942af5a9a4 Issue #10557: Fixed error messages from float() and other numeric
types.  Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis 4d0d471a80 Merge branches/pep-0384. 2010-12-03 20:14:31 +00:00
Georg Brandl 3b9406b08a Remove redundant check for PyBytes in unicode_encode. 2010-12-03 07:54:09 +00:00
Georg Brandl 02524629f3 #7475: add (un)transform method to bytes/bytearray and str, add back codecs that can be used with them from Python 2. 2010-12-02 18:06:51 +00:00
Georg Brandl e5b99f0fb3 Remove redundant includes of headers that are already included by Python.h. 2010-11-30 09:41:01 +00:00
Victor Stinner d5af0a5df0 PyUnicode_DecodeFSDefaultAndSize() raises MemoryError if _Py_char2wchar() fails 2010-11-08 23:34:29 +00:00
Victor Stinner 2f02a51135 PyUnicode_EncodeFS() raises an exception if _Py_wchar2char() fails
* Add error_pos optional argument to _Py_wchar2char()
 * PyUnicode_EncodeFS() raises a UnicodeEncodeError or MemoryError if
   _Py_wchar2char() fails
2010-11-08 22:43:46 +00:00
Victor Stinner c911bbfd5d str, bytes, bytearray docstring: remove unnecessary [...] 2010-11-07 19:04:46 +00:00
Victor Stinner e14e212221 Fix encode/decode method doc of str, bytes, bytearray types
* Specify the default encoding: write 'utf-8' instead of
   sys.getdefaultencoding(), because the default encoding is now constant
 * Specify the default errors value
2010-11-07 18:41:46 +00:00
Eric Smith 16562f41b0 Merged revisions 86277 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r86277 | eric.smith | 2010-11-06 15:27:37 -0400 (Sat, 06 Nov 2010) | 1 line

  Added more to docstrings for str.format, format_map, and __format__.
........
2010-11-06 19:29:45 +00:00
Eric Smith 51d2fd983b Added more to docstrings for str.format, format_map, and __format__. 2010-11-06 19:27:37 +00:00
David Malcolm 9696088b6d Issue #10288: The deprecated family of "char"-handling macros
(ISLOWER()/ISUPPER()/etc) have now been removed: use Py_ISLOWER() etc
instead.
2010-11-05 17:23:41 +00:00
Eric Smith 27bbca6f79 Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict. 2010-11-04 17:06:58 +00:00
Victor Stinner ad15872854 Simplify PyUnicode_Encode/DecodeFSDefault on Windows/Mac OS X
* Windows always uses mbcs
 * Mac OS X always uses utf-8
2010-10-27 00:25:46 +00:00
Victor Stinner f933e1ab6f Issue #4388: On Mac OS X, decode command line arguments from UTF-8, instead of
the locale encoding. If the LANG (and LC_ALL and LC_CTYPE) environment variable
is not set, the locale encoding is ISO-8859-1, whereas most programs (including
Python) expect UTF-8. Python already uses UTF-8 for the filesystem encoding and
to encode command line arguments on this OS.
2010-10-20 22:58:25 +00:00
Victor Stinner 9a90900da5 PyUnicode_FromFormatV(): Fix %A format
It was not completely implemented. Add a test.
2010-10-18 20:59:24 +00:00
Benjamin Peterson 8f67d0893f make hashes always the size of pointers; introduce Py_hash_t #9778 2010-10-17 20:54:53 +00:00
Georg Brandl ded5acf34a Merged revisions 81936 via svnmerge from
svn+ssh://svn.python.org/python/branches/py3k

........
  r81936 | mark.dickinson | 2010-06-12 11:10:14 +0200 (Sat, 12 Jun 2010) | 2 lines

  Silence 'unused variable' gcc warning.  Patch by Éric Araujo.
........
2010-10-17 11:48:07 +00:00
Victor Stinner 168e117e0a Add an optional size argument to _Py_char2wchar()
_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.
2010-10-16 23:16:16 +00:00
Victor Stinner f3170ccef8 Use locale encoding if Py_FileSystemDefaultEncoding is not set
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
   Py_FileSystemDefaultEncoding is NULL
 * redecode_filenames() functions and _Py_code_object_list (issue #9630)
   are no longer needed: remove them
2010-10-15 12:04:23 +00:00
Georg Brandl 66c221e993 #9418: first step of moving private string methods to _string module. 2010-10-14 07:04:07 +00:00
Victor Stinner beb4135b8c PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
All unicode functions use PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner 5593d8aeb4 Issue #8670: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace
UTF-16 surrogate pairs by single non-BMP characters for 16 bits Py_UNICODE
and 32 bits wchar_t (eg. Linux in narrow build).
2010-10-02 11:11:27 +00:00
Victor Stinner 1c24bd0252 Issue #8870: PyUnicode_AsWideCharString() doesn't count the trailing nul character
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
2010-10-02 11:03:13 +00:00
Victor Stinner 71e91a358b Fix PyUnicode_AsWideCharString(): set *size if size is not NULL 2010-09-29 17:55:12 +00:00
Victor Stinner c39211f51e Issue #9630: Redecode filenames when setting the filesystem encoding
Redecode the filenames of:

 - all modules: __file__ and __path__ attributes
 - all code objects: co_filename attribute
 - sys.path
 - sys.meta_path
 - sys.executable
 - sys.path_importer_cache (keys)

Keep weak references to all code objects until initfsencoding() is called, to
be able to redecode co_filename attribute of all code objects.
2010-09-29 16:35:47 +00:00
Victor Stinner 137c34c027 Issue #9979: Create function PyUnicode_AsWideCharString(). 2010-09-29 10:25:54 +00:00
Benjamin Peterson d4ac96a336 use return NULL; it's just as correct 2010-09-12 16:40:53 +00:00
Victor Stinner 4c7db315df Issue #9738, #9836: Fix refleak introduced by r84704 2010-09-12 07:51:18 +00:00
Benjamin Peterson 9be0b2e312 detect non-ascii characters much earlier (plugs ref leak) 2010-09-12 03:40:54 +00:00
Victor Stinner 1205f2774e Issue #9738: PyUnicode_FromFormat() and PyErr_Format() raise an error on
a non-ASCII byte in the format string.

Document also the encoding.
2010-09-11 00:54:47 +00:00
Victor Stinner 46408606d8 Rename PyUnicode_strdup() to PyUnicode_AsUnicodeCopy() 2010-09-03 16:18:00 +00:00
Victor Stinner 71133ff368 Create PyUnicode_strdup() function 2010-09-01 23:43:53 +00:00
Victor Stinner c4eb765fc1 Create Py_UNICODE_strcat() function 2010-09-01 23:43:50 +00:00
Victor Stinner 42cb462682 Remove unicode_default_encoding constant
Inline its value in PyUnicode_GetDefaultEncoding(). The comment is now outdated
(we will not change its value anymore).
2010-09-01 19:39:01 +00:00
Antoine Pitrou fce7fd6426 Issue #9549: sys.setdefaultencoding() and PyUnicode_SetDefaultEncoding()
are now removed, since their effect was nonexistent in 3.x (the default
encoding is hardcoded to utf-8 and cannot be changed).
2010-09-01 18:54:56 +00:00
Antoine Pitrou a2983c6734 Merged revisions 84394 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r84394 | antoine.pitrou | 2010-09-01 17:10:12 +0200 (Wed, 01 Sep 2010) | 4 lines

  Issue #7415: PyUnicode_FromEncodedObject() now uses the new buffer API
  properly.  Patch by Stefan Behnel.
........
2010-09-01 15:16:41 +00:00
Antoine Pitrou b0fa831d1e Issue #7415: PyUnicode_FromEncodedObject() now uses the new buffer API
properly.  Patch by Stefan Behnel.
2010-09-01 15:10:12 +00:00