mirror of https://github.com/python/cpython.git
Clarify concatenation behaviour of immutable strings, and remove explicit
mention of the CPython optimization hack.
commit e333d00d3a
@@ -989,6 +989,32 @@ What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
 See the :ref:`unicode-howto`.
 
 
+What is the most efficient way to concatenate many strings together?
+--------------------------------------------------------------------
+
+:class:`str` and :class:`bytes` objects are immutable, therefore concatenating
+many strings together is inefficient as each concatenation creates a new
+object. In the general case, the total runtime cost is quadratic in the
+total string length.
+
+To accumulate many :class:`str` objects, the recommended idiom is to place
+them into a list and call :meth:`str.join` at the end::
+
+   chunks = []
+   for s in my_strings:
+       chunks.append(s)
+   result = ''.join(chunks)
+
+(another reasonably efficient idiom is to use :class:`io.StringIO`)
+
+To accumulate many :class:`bytes` objects, the recommended idiom is to extend
+a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::
+
+   result = bytearray()
+   for b in my_bytes_objects:
+       result += b
+
+
 Sequences (Tuples/Lists)
 ========================
 
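The new FAQ entry mentions :class:`io.StringIO` only in passing, without code. A minimal sketch of that idiom next to the list-and-join idiom might look like the following. The function names and the sample data are hypothetical and chosen only for illustration; they are not part of the commit.

```python
import io


def join_with_list(strings):
    """Accumulate the pieces in a list and join once at the end."""
    chunks = []
    for s in strings:
        chunks.append(s)
    return ''.join(chunks)


def join_with_stringio(strings):
    """Accumulate the pieces by writing them to an in-memory text buffer."""
    buf = io.StringIO()
    for s in strings:
        buf.write(s)
    return buf.getvalue()


# Hypothetical sample data, not taken from the commit.
my_strings = ['spam', ' ', 'eggs', ' ', 'ham']
assert join_with_list(my_strings) == join_with_stringio(my_strings)
```

Both helpers return the same string. The list version is the idiom the FAQ text recommends, and the buffer version is the alternative it describes as reasonably efficient.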
@@ -968,15 +968,18 @@ Notes:
    If *k* is ``None``, it is treated like ``1``.
 
 (6)
-   .. impl-detail::
-
-      If *s* and *t* are both strings, some Python implementations such as
-      CPython can usually perform an in-place optimization for assignments of
-      the form ``s = s + t`` or ``s += t``. When applicable, this optimization
-      makes quadratic run-time much less likely. This optimization is both
-      version and implementation dependent. For performance sensitive code, it
-      is preferable to use the :meth:`str.join` method which assures consistent
-      linear concatenation performance across versions and implementations.
+   Concatenating immutable strings always results in a new object. This means
+   that building up a string by repeated concatenation will have a quadratic
+   runtime cost in the total string length. To get a linear runtime cost,
+   you must switch to one of the alternatives below:
+
+   * if concatenating :class:`str` objects, you can build a list and use
+     :meth:`str.join` at the end;
+
+   * if concatenating :class:`bytes` objects, you can similarly use
+     :meth:`bytes.join`, or you can do in-place concatenation with a
+     :class:`bytearray` object. :class:`bytearray` objects are mutable and
+     have an efficient overallocation mechanism.
 
 
 .. _string-methods:
 
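The two :class:`bytes` alternatives named in the rewritten note (6), and the quadratic cost it describes, can be sketched as follows. This is an illustration under stated assumptions, not code from the commit: the helper names are hypothetical, and `modeled_copy_cost` is a deliberately simple model that assumes every concatenation copies the whole accumulated value into a new object, ignoring allocator behaviour and any implementation-specific optimization.

```python
def concat_with_join(pieces):
    """Join all bytes objects in one pass with bytes.join."""
    return b''.join(pieces)


def concat_with_bytearray(pieces):
    """Extend a mutable bytearray in place, then freeze it as bytes."""
    result = bytearray()
    for b in pieces:
        result += b
    return bytes(result)


def modeled_copy_cost(lengths):
    """Model of repeated immutable concatenation (an assumption, not a
    measurement): each step copies everything accumulated so far."""
    copied = 0
    total = 0
    for n in lengths:
        total += n        # size of the new object created by this step
        copied += total   # the whole new object is written once
    return copied


# Hypothetical sample data, not taken from the commit.
my_bytes_objects = [b'ab', b'cd', b'ef']
assert concat_with_join(my_bytes_objects) == concat_with_bytearray(my_bytes_objects)

# 1000 pieces of 4 bytes: 4000 bytes of output, but the model counts
# 4 * (1 + 2 + ... + 1000) = 2002000 bytes copied.
assert modeled_copy_cost([4] * 1000) == 2002000
```

Under this model, n pieces of length k cost k·n(n+1)/2 copied bytes, which is the quadratic growth the note warns about. The join and bytearray versions write each piece into the result roughly once.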