mirror of https://gitee.com/openkylin/wcwidth.git
Import Upstream version 0.2.5
commit
3cb230e352
|
@ -0,0 +1,27 @@
The MIT License (MIT)

Copyright (c) 2014 Jeff Quast <contact@jeffquast.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Markus Kuhn -- 2007-05-26 (Unicode 5.0)

Permission to use, copy, modify, and distribute this software
for any purpose and without fee is hereby granted. The author
disclaims all warranties with regard to this software.
@ -0,0 +1,2 @@
include LICENSE *.rst
recursive-include tests *.py
|
@ -0,0 +1,306 @@
|
||||||
|
Metadata-Version: 1.1
|
||||||
|
Name: wcwidth
|
||||||
|
Version: 0.2.5
|
||||||
|
Summary: Measures the displayed width of unicode strings in a terminal
|
||||||
|
Home-page: https://github.com/jquast/wcwidth
|
||||||
|
Author: Jeff Quast
|
||||||
|
Author-email: contact@jeffquast.com
|
||||||
|
License: MIT
|
||||||
|
Description: |pypi_downloads| |codecov| |license|
|
||||||
|
|
||||||
|
============
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
This library is mainly for CLI programs that carefully produce output for
|
||||||
|
terminals, or that act as a terminal emulator.
|
||||||
|
|
||||||
|
**Problem Statement**: The printable length of *most* strings is equal to the
|
||||||
|
number of cells they occupy on the screen, ``1 character : 1 cell``. However,
|
||||||
|
there are categories of characters that *occupy 2 cells* (full-width), and
|
||||||
|
others that *occupy 0* cells (zero-width).
|
||||||
|
|
||||||
|
**Solution**: POSIX.1-2001 and POSIX.1-2008 conforming systems provide
|
||||||
|
`wcwidth(3)`_ and `wcswidth(3)`_ C functions, which this Python module's
|
||||||
|
functions precisely copy. *These functions return the number of cells a
|
||||||
|
unicode string is expected to occupy.*
|
||||||
|
|
||||||
|
Installation
|
||||||
|
------------
|
||||||
|
|
||||||
|
The stable version of this package is maintained on PyPI; install it using pip::
|
||||||
|
|
||||||
|
pip install wcwidth
|
||||||
|
|
||||||
|
Example
|
||||||
|
-------
|
||||||
|
|
||||||
|
**Problem**: given the following phrase (Japanese),
|
||||||
|
|
||||||
|
>>> text = u'コンニチハ'
|
||||||
|
|
||||||
|
Python **incorrectly** uses the *string length* of 5 codepoints rather than the
|
||||||
|
*printable length* of 10 cells, so that when using the `rjust` function, the
|
||||||
|
output length is wrong::
|
||||||
|
|
||||||
|
>>> print(len('コンニチハ'))
|
||||||
|
5
|
||||||
|
|
||||||
|
>>> print('コンニチハ'.rjust(20, '_'))
|
||||||
|
_______________コンニチハ
|
||||||
|
|
||||||
|
By defining our own "rjust" function that uses wcwidth, we can correct this::
|
||||||
|
|
||||||
|
>>> def wc_rjust(text, length, padding=' '):
...     from wcwidth import wcswidth
...     return padding * max(0, (length - wcswidth(text))) + text
...
|
||||||
|
|
||||||
|
Our **Solution** uses wcswidth to determine the string length correctly::
|
||||||
|
|
||||||
|
>>> from wcwidth import wcswidth
|
||||||
|
>>> print(wcswidth('コンニチハ'))
|
||||||
|
10
|
||||||
|
|
||||||
|
>>> print(wc_rjust('コンニチハ', 20, '_'))
|
||||||
|
__________コンニチハ
|
||||||
|
|
||||||
|
|
||||||
|
Choosing a Version
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Export an environment variable, ``UNICODE_VERSION``. This should be done by
|
||||||
|
*terminal emulators* or those developers experimenting with authoring one of
|
||||||
|
their own, from shell::
|
||||||
|
|
||||||
|
$ export UNICODE_VERSION=13.0
|
||||||
|
|
||||||
|
If unspecified, the latest version is used. If your Terminal Emulator does not
|
||||||
|
export this variable, you can use the `jquast/ucs-detect`_ utility to
|
||||||
|
automatically detect and export it to your shell.
|
||||||
|
|
||||||
|
wcwidth, wcswidth
|
||||||
|
-----------------
|
||||||
|
Use function ``wcwidth()`` to determine the length of a *single unicode
|
||||||
|
character*, and ``wcswidth()`` to determine the length of many, a *string
|
||||||
|
of unicode characters*.
|
||||||
|
|
||||||
|
Briefly, return values of function ``wcwidth()`` are:
|
||||||
|
|
||||||
|
``-1``
|
||||||
|
Indeterminate (not printable).
|
||||||
|
|
||||||
|
``0``
|
||||||
|
Does not advance the cursor, such as NULL or Combining.
|
||||||
|
|
||||||
|
``2``
|
||||||
|
Characters of category East Asian Wide (W) or East Asian
|
||||||
|
Full-width (F) which are displayed using two terminal cells.
|
||||||
|
|
||||||
|
``1``
|
||||||
|
All others.
|
||||||
|
|
||||||
|
Function ``wcswidth()`` simply returns the sum of all values for each character
|
||||||
|
along a string, or ``-1`` when it occurs anywhere along a string.
|
||||||
|
|
||||||
|
Full API Documentation at http://wcwidth.readthedocs.org
|
||||||
|
|
||||||
|
==========
|
||||||
|
Developing
|
||||||
|
==========
|
||||||
|
|
||||||
|
Install wcwidth in editable mode::
|
||||||
|
|
||||||
|
pip install -e .
|
||||||
|
|
||||||
|
Execute unit tests using tox_::
|
||||||
|
|
||||||
|
tox
|
||||||
|
|
||||||
|
Regenerate python code tables from latest Unicode Specification data files::
|
||||||
|
|
||||||
|
tox -eupdate
|
||||||
|
|
||||||
|
Supplementary tools for browsing and testing terminals for wide unicode
|
||||||
|
characters are found in the `bin/`_ of this project's source code. Just ensure
|
||||||
|
to first ``pip install -r requirements-develop.txt`` from this project's main
|
||||||
|
folder. For example, an interactive browser for testing::
|
||||||
|
|
||||||
|
./bin/wcwidth-browser.py
|
||||||
|
|
||||||
|
Uses
|
||||||
|
----
|
||||||
|
|
||||||
|
This library is used in:
|
||||||
|
|
||||||
|
- `jquast/blessed`_: a thin, practical wrapper around terminal capabilities in
|
||||||
|
Python.
|
||||||
|
|
||||||
|
- `jonathanslenders/python-prompt-toolkit`_: a Library for building powerful
|
||||||
|
interactive command lines in Python.
|
||||||
|
|
||||||
|
- `dbcli/pgcli`_: Postgres CLI with autocompletion and syntax highlighting.
|
||||||
|
|
||||||
|
- `thomasballinger/curtsies`_: a Curses-like terminal wrapper with a display
|
||||||
|
based on compositing 2d arrays of text.
|
||||||
|
|
||||||
|
- `selectel/pyte`_: Simple VTXXX-compatible linux terminal emulator.
|
||||||
|
|
||||||
|
- `astanin/python-tabulate`_: Pretty-print tabular data in Python, a library
|
||||||
|
and a command-line utility.
|
||||||
|
|
||||||
|
- `LuminosoInsight/python-ftfy`_: Fixes mojibake and other glitches in Unicode
|
||||||
|
text.
|
||||||
|
|
||||||
|
- `nbedos/termtosvg`_: Terminal recorder that renders sessions as SVG
|
||||||
|
animations.
|
||||||
|
|
||||||
|
- `peterbrittain/asciimatics`_: Package to help people create full-screen text
|
||||||
|
UIs.
|
||||||
|
|
||||||
|
Other Languages
|
||||||
|
---------------
|
||||||
|
|
||||||
|
- `timoxley/wcwidth`_: JavaScript
|
||||||
|
- `janlelis/unicode-display_width`_: Ruby
|
||||||
|
- `alecrabbit/php-wcwidth`_: PHP
|
||||||
|
- `Text::CharWidth`_: Perl
|
||||||
|
- `bluebear94/Terminal-WCWidth`_: Perl 6
|
||||||
|
- `mattn/go-runewidth`_: Go
|
||||||
|
- `emugel/wcwidth`_: Haxe
|
||||||
|
- `aperezdc/lua-wcwidth`_: Lua
|
||||||
|
- `joachimschmidt557/zig-wcwidth`: Zig
|
||||||
|
- `fumiyas/wcwidth-cjk`_: `LD_PRELOAD` override
|
||||||
|
- `joshuarubin/wcwidth9`: Unicode version 9 in C
|
||||||
|
|
||||||
|
History
|
||||||
|
-------
|
||||||
|
|
||||||
|
0.2.0 *2020-06-01*
|
||||||
|
* **Enhancement**: Unicode version may be selected by exporting the
|
||||||
|
Environment variable ``UNICODE_VERSION``, such as ``13.0``, or ``6.3.0``.
|
||||||
|
See the `jquast/ucs-detect`_ CLI utility for automatic detection.
|
||||||
|
* **Enhancement**:
|
||||||
|
API Documentation is published to readthedocs.org.
|
||||||
|
* **Updated** tables for *all* Unicode Specifications with files
|
||||||
|
published in a programmatically consumable format, versions 4.1.0
|
||||||
|
through 13.0.
|
||||||
|
|
||||||
|
0.1.9 *2020-03-22*
|
||||||
|
* **Performance** optimization by `Avram Lubkin`_, `PR #35`_.
|
||||||
|
* **Updated** tables to Unicode Specification 13.0.0.
|
||||||
|
|
||||||
|
0.1.8 *2020-01-01*
|
||||||
|
* **Updated** tables to Unicode Specification 12.0.0. (`PR #30`_).
|
||||||
|
|
||||||
|
0.1.7 *2016-07-01*
|
||||||
|
* **Updated** tables to Unicode Specification 9.0.0. (`PR #18`_).
|
||||||
|
|
||||||
|
0.1.6 *2016-01-08 Production/Stable*
|
||||||
|
* ``LICENSE`` file now included with distribution.
|
||||||
|
|
||||||
|
0.1.5 *2015-09-13 Alpha*
|
||||||
|
* **Bugfix**:
|
||||||
|
Resolution of "combining_ character width" issue, most especially
|
||||||
|
those that previously returned -1 now often (correctly) return 0.
|
||||||
|
Resolved by `Philip Craig`_ via `PR #11`_.
|
||||||
|
* **Deprecated**:
|
||||||
|
The module path ``wcwidth.table_comb`` is no longer available,
|
||||||
|
it has been superseded by module path ``wcwidth.table_zero``.
|
||||||
|
|
||||||
|
0.1.4 *2014-11-20 Pre-Alpha*
|
||||||
|
* **Feature**: ``wcswidth()`` now determines printable length
|
||||||
|
for (most) combining_ characters. The developer's tool
|
||||||
|
`bin/wcwidth-browser.py`_ is improved to display combining_
|
||||||
|
characters when provided the ``--combining`` option
|
||||||
|
(`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_).
|
||||||
|
* **Feature**: added static analysis (prospector_) to testing
|
||||||
|
framework.
|
||||||
|
|
||||||
|
0.1.3 *2014-10-29 Pre-Alpha*
|
||||||
|
* **Bugfix**: 2nd parameter of wcswidth was not honored.
|
||||||
|
(`Thomas Ballinger`_, `PR #4`_).
|
||||||
|
|
||||||
|
0.1.2 *2014-10-28 Pre-Alpha*
|
||||||
|
* **Updated** tables to Unicode Specification 7.0.0.
|
||||||
|
(`Thomas Ballinger`_, `PR #3`_).
|
||||||
|
|
||||||
|
0.1.1 *2014-05-14 Pre-Alpha*
|
||||||
|
* Initial release to PyPI, based on Unicode Specification 6.3.0.
|
||||||
|
|
||||||
|
This code was originally derived directly from C code of the same name,
|
||||||
|
whose latest version is available at
|
||||||
|
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c::
|
||||||
|
|
||||||
|
* Markus Kuhn -- 2007-05-26 (Unicode 5.0)
|
||||||
|
*
|
||||||
|
* Permission to use, copy, modify, and distribute this software
|
||||||
|
* for any purpose and without fee is hereby granted. The author
|
||||||
|
* disclaims all warranties with regard to this software.
|
||||||
|
|
||||||
|
.. _`tox`: https://testrun.org/tox/latest/install.html
|
||||||
|
.. _`prospector`: https://github.com/landscapeio/prospector
|
||||||
|
.. _`combining`: https://en.wikipedia.org/wiki/Combining_character
|
||||||
|
.. _`bin/`: https://github.com/jquast/wcwidth/tree/master/bin
|
||||||
|
.. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
|
||||||
|
.. _`Thomas Ballinger`: https://github.com/thomasballinger
|
||||||
|
.. _`Leta Montopoli`: https://github.com/lmontopo
|
||||||
|
.. _`Philip Craig`: https://github.com/philipc
|
||||||
|
.. _`PR #3`: https://github.com/jquast/wcwidth/pull/3
|
||||||
|
.. _`PR #4`: https://github.com/jquast/wcwidth/pull/4
|
||||||
|
.. _`PR #5`: https://github.com/jquast/wcwidth/pull/5
|
||||||
|
.. _`PR #11`: https://github.com/jquast/wcwidth/pull/11
|
||||||
|
.. _`PR #18`: https://github.com/jquast/wcwidth/pull/18
|
||||||
|
.. _`PR #30`: https://github.com/jquast/wcwidth/pull/30
|
||||||
|
.. _`PR #35`: https://github.com/jquast/wcwidth/pull/35
|
||||||
|
.. _`jquast/blessed`: https://github.com/jquast/blessed
|
||||||
|
.. _`selectel/pyte`: https://github.com/selectel/pyte
|
||||||
|
.. _`thomasballinger/curtsies`: https://github.com/thomasballinger/curtsies
|
||||||
|
.. _`dbcli/pgcli`: https://github.com/dbcli/pgcli
|
||||||
|
.. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit
|
||||||
|
.. _`timoxley/wcwidth`: https://github.com/timoxley/wcwidth
|
||||||
|
.. _`wcwidth(3)`: http://man7.org/linux/man-pages/man3/wcwidth.3.html
|
||||||
|
.. _`wcswidth(3)`: http://man7.org/linux/man-pages/man3/wcswidth.3.html
|
||||||
|
.. _`astanin/python-tabulate`: https://github.com/astanin/python-tabulate
|
||||||
|
.. _`janlelis/unicode-display_width`: https://github.com/janlelis/unicode-display_width
|
||||||
|
.. _`LuminosoInsight/python-ftfy`: https://github.com/LuminosoInsight/python-ftfy
|
||||||
|
.. _`alecrabbit/php-wcwidth`: https://github.com/alecrabbit/php-wcwidth
|
||||||
|
.. _`Text::CharWidth`: https://metacpan.org/pod/Text::CharWidth
|
||||||
|
.. _`bluebear94/Terminal-WCWidth`: https://github.com/bluebear94/Terminal-WCWidth
|
||||||
|
.. _`mattn/go-runewidth`: https://github.com/mattn/go-runewidth
|
||||||
|
.. _`emugel/wcwidth`: https://github.com/emugel/wcwidth
|
||||||
|
.. _`jquast/ucs-detect`: https://github.com/jquast/ucs-detect
|
||||||
|
.. _`Avram Lubkin`: https://github.com/avylove
|
||||||
|
.. _`nbedos/termtosvg`: https://github.com/nbedos/termtosvg
|
||||||
|
.. _`peterbrittain/asciimatics`: https://github.com/peterbrittain/asciimatics
|
||||||
|
.. _`aperezdc/lua-wcwidth`: https://github.com/aperezdc/lua-wcwidth
|
||||||
|
.. _`fumiyas/wcwidth-cjk`: https://github.com/fumiyas/wcwidth-cjk
|
||||||
|
.. |pypi_downloads| image:: https://img.shields.io/pypi/dm/wcwidth.svg?logo=pypi
|
||||||
|
:alt: Downloads
|
||||||
|
:target: https://pypi.org/project/wcwidth/
|
||||||
|
.. |codecov| image:: https://codecov.io/gh/jquast/wcwidth/branch/master/graph/badge.svg
|
||||||
|
:alt: codecov.io Code Coverage
|
||||||
|
:target: https://codecov.io/gh/jquast/wcwidth/
|
||||||
|
.. |license| image:: https://img.shields.io/github/license/jquast/wcwidth.svg
|
||||||
|
:target: https://pypi.python.org/pypi/wcwidth/
|
||||||
|
:alt: MIT License
|
||||||
|
|
||||||
|
Keywords: cjk,combining,console,eastasian,emoji,emulator,terminal,unicode,wcswidth,wcwidth,xterm
|
||||||
|
Platform: UNKNOWN
|
||||||
|
Classifier: Intended Audience :: Developers
|
||||||
|
Classifier: Natural Language :: English
|
||||||
|
Classifier: Development Status :: 5 - Production/Stable
|
||||||
|
Classifier: Environment :: Console
|
||||||
|
Classifier: License :: OSI Approved :: MIT License
|
||||||
|
Classifier: Operating System :: POSIX
|
||||||
|
Classifier: Programming Language :: Python :: 2.7
|
||||||
|
Classifier: Programming Language :: Python :: 3.5
|
||||||
|
Classifier: Programming Language :: Python :: 3.6
|
||||||
|
Classifier: Programming Language :: Python :: 3.7
|
||||||
|
Classifier: Programming Language :: Python :: 3.8
|
||||||
|
Classifier: Topic :: Software Development :: Libraries
|
||||||
|
Classifier: Topic :: Software Development :: Localization
|
||||||
|
Classifier: Topic :: Software Development :: Internationalization
|
||||||
|
Classifier: Topic :: Terminals
|
|
@ -0,0 +1,280 @@
|
||||||
|
|pypi_downloads| |codecov| |license|
|
||||||
|
|
||||||
|
============
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
This library is mainly for CLI programs that carefully produce output for
|
||||||
|
terminals, or that act as a terminal emulator.
|
||||||
|
|
||||||
|
**Problem Statement**: The printable length of *most* strings is equal to the
|
||||||
|
number of cells they occupy on the screen, ``1 character : 1 cell``. However,
|
||||||
|
there are categories of characters that *occupy 2 cells* (full-width), and
|
||||||
|
others that *occupy 0* cells (zero-width).
|
||||||
|
|
||||||
|
**Solution**: POSIX.1-2001 and POSIX.1-2008 conforming systems provide
|
||||||
|
`wcwidth(3)`_ and `wcswidth(3)`_ C functions, which this Python module's
|
||||||
|
functions precisely copy. *These functions return the number of cells a
|
||||||
|
unicode string is expected to occupy.*
|
||||||
|
|
||||||
|
Installation
|
||||||
|
------------
|
||||||
|
|
||||||
|
The stable version of this package is maintained on PyPI; install it using pip::
|
||||||
|
|
||||||
|
pip install wcwidth
|
||||||
|
|
||||||
|
Example
|
||||||
|
-------
|
||||||
|
|
||||||
|
**Problem**: given the following phrase (Japanese),
|
||||||
|
|
||||||
|
>>> text = u'コンニチハ'
|
||||||
|
|
||||||
|
Python **incorrectly** uses the *string length* of 5 codepoints rather than the
|
||||||
|
*printable length* of 10 cells, so that when using the `rjust` function, the
|
||||||
|
output length is wrong::
|
||||||
|
|
||||||
|
>>> print(len('コンニチハ'))
|
||||||
|
5
|
||||||
|
|
||||||
|
>>> print('コンニチハ'.rjust(20, '_'))
|
||||||
|
_______________コンニチハ
|
||||||
|
|
||||||
|
By defining our own "rjust" function that uses wcwidth, we can correct this::
|
||||||
|
|
||||||
|
>>> def wc_rjust(text, length, padding=' '):
...     from wcwidth import wcswidth
...     return padding * max(0, (length - wcswidth(text))) + text
...
|
||||||
|
|
||||||
|
Our **Solution** uses wcswidth to determine the string length correctly::
|
||||||
|
|
||||||
|
>>> from wcwidth import wcswidth
|
||||||
|
>>> print(wcswidth('コンニチハ'))
|
||||||
|
10
|
||||||
|
|
||||||
|
>>> print(wc_rjust('コンニチハ', 20, '_'))
|
||||||
|
__________コンニチハ
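The same idea extends to left-justifying and centering. The following is a minimal sketch, not part of wcwidth: ``wc_ljust``, ``wc_center`` and ``cell_width`` are hypothetical names, and ``cell_width`` is a standard-library stand-in that only counts East Asian Wide and Fullwidth characters as two cells. In real code, ``wcwidth.wcswidth`` could be passed as ``measure`` instead.

```python
import unicodedata


def cell_width(text):
    """Approximate cell count: 2 for East Asian Wide/Fullwidth, else 1.

    Assumption: a stand-in for wcwidth.wcswidth(); it ignores zero-width
    and control characters entirely.
    """
    return sum(2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
               for ch in text)


def wc_ljust(text, length, padding=' ', measure=cell_width):
    """Left-justify text in a field that is `length` cells wide."""
    return text + padding * max(0, length - measure(text))


def wc_center(text, length, padding=' ', measure=cell_width):
    """Center text in a field of `length` cells; an odd cell goes right."""
    total = max(0, length - measure(text))
    left = total // 2
    return padding * left + text + padding * (total - left)


print(wc_ljust('コンニチハ', 20, '_'))
print(wc_center('コンニチハ', 15, '_'))
```

Passing the measuring function as a parameter keeps the padding logic independent of how widths are computed.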
|
||||||
|
|
||||||
|
|
||||||
|
Choosing a Version
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Export an environment variable, ``UNICODE_VERSION``. This should be done by
|
||||||
|
*terminal emulators* or those developers experimenting with authoring one of
|
||||||
|
their own, from shell::
|
||||||
|
|
||||||
|
$ export UNICODE_VERSION=13.0
|
||||||
|
|
||||||
|
If unspecified, the latest version is used. If your Terminal Emulator does not
|
||||||
|
export this variable, you can use the `jquast/ucs-detect`_ utility to
|
||||||
|
automatically detect and export it to your shell.
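As an illustration of how a program might honor this variable, here is a minimal sketch. The function ``choose_unicode_version`` and the ``supported`` list are hypothetical, and the fallback policy (use the latest version when the requested one is unknown) is an assumption of this sketch rather than the library's documented behavior.

```python
import os


def choose_unicode_version(supported, environ=os.environ):
    """Return the version named by UNICODE_VERSION, else the latest.

    `supported` is a sequence of version strings ordered oldest first.
    Assumption: unknown or missing values fall back to the latest version.
    """
    latest = supported[-1]
    requested = environ.get('UNICODE_VERSION', 'latest')
    if requested == 'latest':
        return latest
    for version in supported:
        # Accept a short form such as '13.0' for '13.0.0'.
        if version == requested or version.startswith(requested + '.'):
            return version
    return latest


supported = ['4.1.0', '5.0.0', '6.3.0', '13.0.0']
print(choose_unicode_version(supported, {'UNICODE_VERSION': '13.0'}))
```

Taking the environment as a parameter makes the selection easy to exercise without changing the real process environment.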
|
||||||
|
|
||||||
|
wcwidth, wcswidth
|
||||||
|
-----------------
|
||||||
|
Use function ``wcwidth()`` to determine the length of a *single unicode
|
||||||
|
character*, and ``wcswidth()`` to determine the length of many, a *string
|
||||||
|
of unicode characters*.
|
||||||
|
|
||||||
|
Briefly, return values of function ``wcwidth()`` are:
|
||||||
|
|
||||||
|
``-1``
|
||||||
|
Indeterminate (not printable).
|
||||||
|
|
||||||
|
``0``
|
||||||
|
Does not advance the cursor, such as NULL or Combining.
|
||||||
|
|
||||||
|
``2``
|
||||||
|
Characters of category East Asian Wide (W) or East Asian
|
||||||
|
Full-width (F) which are displayed using two terminal cells.
|
||||||
|
|
||||||
|
``1``
|
||||||
|
All others.
|
||||||
|
|
||||||
|
Function ``wcswidth()`` simply returns the sum of all values for each character
|
||||||
|
along a string, or ``-1`` when it occurs anywhere along a string.
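The return-value rules above can be sketched with the standard library alone. This is only an approximation under stated assumptions: ``approx_wcwidth`` and ``approx_wcswidth`` are hypothetical names, and they rely on Python's ``unicodedata`` properties rather than the version-specific tables that wcwidth ships, so results may differ for some characters.

```python
import unicodedata


def approx_wcwidth(char):
    """Approximate the width of one character: -1, 0, 1 or 2."""
    code = ord(char)
    if code == 0:
        return 0        # NULL does not advance the cursor
    if code < 32 or 0x7F <= code < 0xA0:
        return -1       # C0/C1 control characters: not printable
    if unicodedata.combining(char) or unicodedata.category(char) in ('Mn', 'Me'):
        return 0        # combining marks occupy no cell of their own
    if unicodedata.east_asian_width(char) in ('W', 'F'):
        return 2        # East Asian Wide or Fullwidth
    return 1            # all others


def approx_wcswidth(text):
    """Sum the per-character widths, or return -1 if any character is -1."""
    total = 0
    for char in text:
        width = approx_wcwidth(char)
        if width < 0:
            return -1
        total += width
    return total


print(approx_wcswidth('コンニチハ'))
print(approx_wcswidth('e\u0301'))
```

Note that a single non-printable character makes the whole string's width ``-1``, so callers typically need to handle that case before using the result for padding.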
|
||||||
|
|
||||||
|
Full API Documentation at http://wcwidth.readthedocs.org
|
||||||
|
|
||||||
|
==========
|
||||||
|
Developing
|
||||||
|
==========
|
||||||
|
|
||||||
|
Install wcwidth in editable mode::
|
||||||
|
|
||||||
|
pip install -e .
|
||||||
|
|
||||||
|
Execute unit tests using tox_::
|
||||||
|
|
||||||
|
tox
|
||||||
|
|
||||||
|
Regenerate python code tables from latest Unicode Specification data files::
|
||||||
|
|
||||||
|
tox -eupdate
|
||||||
|
|
||||||
|
Supplementary tools for browsing and testing terminals for wide unicode
|
||||||
|
characters are found in the `bin/`_ of this project's source code. Just ensure
|
||||||
|
to first ``pip install -r requirements-develop.txt`` from this project's main
|
||||||
|
folder. For example, an interactive browser for testing::
|
||||||
|
|
||||||
|
./bin/wcwidth-browser.py
|
||||||
|
|
||||||
|
Uses
|
||||||
|
----
|
||||||
|
|
||||||
|
This library is used in:
|
||||||
|
|
||||||
|
- `jquast/blessed`_: a thin, practical wrapper around terminal capabilities in
|
||||||
|
Python.
|
||||||
|
|
||||||
|
- `jonathanslenders/python-prompt-toolkit`_: a Library for building powerful
|
||||||
|
interactive command lines in Python.
|
||||||
|
|
||||||
|
- `dbcli/pgcli`_: Postgres CLI with autocompletion and syntax highlighting.
|
||||||
|
|
||||||
|
- `thomasballinger/curtsies`_: a Curses-like terminal wrapper with a display
|
||||||
|
based on compositing 2d arrays of text.
|
||||||
|
|
||||||
|
- `selectel/pyte`_: Simple VTXXX-compatible linux terminal emulator.
|
||||||
|
|
||||||
|
- `astanin/python-tabulate`_: Pretty-print tabular data in Python, a library
|
||||||
|
and a command-line utility.
|
||||||
|
|
||||||
|
- `LuminosoInsight/python-ftfy`_: Fixes mojibake and other glitches in Unicode
|
||||||
|
text.
|
||||||
|
|
||||||
|
- `nbedos/termtosvg`_: Terminal recorder that renders sessions as SVG
|
||||||
|
animations.
|
||||||
|
|
||||||
|
- `peterbrittain/asciimatics`_: Package to help people create full-screen text
|
||||||
|
UIs.
|
||||||
|
|
||||||
|
Other Languages
|
||||||
|
---------------
|
||||||
|
|
||||||
|
- `timoxley/wcwidth`_: JavaScript
|
||||||
|
- `janlelis/unicode-display_width`_: Ruby
|
||||||
|
- `alecrabbit/php-wcwidth`_: PHP
|
||||||
|
- `Text::CharWidth`_: Perl
|
||||||
|
- `bluebear94/Terminal-WCWidth`_: Perl 6
|
||||||
|
- `mattn/go-runewidth`_: Go
|
||||||
|
- `emugel/wcwidth`_: Haxe
|
||||||
|
- `aperezdc/lua-wcwidth`_: Lua
|
||||||
|
- `joachimschmidt557/zig-wcwidth`: Zig
|
||||||
|
- `fumiyas/wcwidth-cjk`_: `LD_PRELOAD` override
|
||||||
|
- `joshuarubin/wcwidth9`: Unicode version 9 in C
|
||||||
|
|
||||||
|
History
|
||||||
|
-------
|
||||||
|
|
||||||
|
0.2.0 *2020-06-01*
|
||||||
|
* **Enhancement**: Unicode version may be selected by exporting the
|
||||||
|
Environment variable ``UNICODE_VERSION``, such as ``13.0``, or ``6.3.0``.
|
||||||
|
See the `jquast/ucs-detect`_ CLI utility for automatic detection.
|
||||||
|
* **Enhancement**:
|
||||||
|
API Documentation is published to readthedocs.org.
|
||||||
|
* **Updated** tables for *all* Unicode Specifications with files
|
||||||
|
published in a programmatically consumable format, versions 4.1.0
|
||||||
|
through 13.0.
|
||||||
|
|
||||||
|
0.1.9 *2020-03-22*
|
||||||
|
* **Performance** optimization by `Avram Lubkin`_, `PR #35`_.
|
||||||
|
* **Updated** tables to Unicode Specification 13.0.0.
|
||||||
|
|
||||||
|
0.1.8 *2020-01-01*
|
||||||
|
* **Updated** tables to Unicode Specification 12.0.0. (`PR #30`_).
|
||||||
|
|
||||||
|
0.1.7 *2016-07-01*
|
||||||
|
* **Updated** tables to Unicode Specification 9.0.0. (`PR #18`_).
|
||||||
|
|
||||||
|
0.1.6 *2016-01-08 Production/Stable*
|
||||||
|
* ``LICENSE`` file now included with distribution.
|
||||||
|
|
||||||
|
0.1.5 *2015-09-13 Alpha*
|
||||||
|
* **Bugfix**:
|
||||||
|
Resolution of "combining_ character width" issue, most especially
|
||||||
|
those that previously returned -1 now often (correctly) return 0.
|
||||||
|
Resolved by `Philip Craig`_ via `PR #11`_.
|
||||||
|
* **Deprecated**:
|
||||||
|
The module path ``wcwidth.table_comb`` is no longer available,
|
||||||
|
it has been superseded by module path ``wcwidth.table_zero``.
|
||||||
|
|
||||||
|
0.1.4 *2014-11-20 Pre-Alpha*
|
||||||
|
* **Feature**: ``wcswidth()`` now determines printable length
|
||||||
|
for (most) combining_ characters. The developer's tool
|
||||||
|
`bin/wcwidth-browser.py`_ is improved to display combining_
|
||||||
|
characters when provided the ``--combining`` option
|
||||||
|
(`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_).
|
||||||
|
* **Feature**: added static analysis (prospector_) to testing
|
||||||
|
framework.
|
||||||
|
|
||||||
|
0.1.3 *2014-10-29 Pre-Alpha*
|
||||||
|
* **Bugfix**: 2nd parameter of wcswidth was not honored.
|
||||||
|
(`Thomas Ballinger`_, `PR #4`_).
|
||||||
|
|
||||||
|
0.1.2 *2014-10-28 Pre-Alpha*
|
||||||
|
* **Updated** tables to Unicode Specification 7.0.0.
|
||||||
|
(`Thomas Ballinger`_, `PR #3`_).
|
||||||
|
|
||||||
|
0.1.1 *2014-05-14 Pre-Alpha*
|
||||||
|
* Initial release to PyPI, based on Unicode Specification 6.3.0.
|
||||||
|
|
||||||
|
This code was originally derived directly from C code of the same name,
|
||||||
|
whose latest version is available at
|
||||||
|
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c::
|
||||||
|
|
||||||
|
* Markus Kuhn -- 2007-05-26 (Unicode 5.0)
|
||||||
|
*
|
||||||
|
* Permission to use, copy, modify, and distribute this software
|
||||||
|
* for any purpose and without fee is hereby granted. The author
|
||||||
|
* disclaims all warranties with regard to this software.
|
||||||
|
|
||||||
|
.. _`tox`: https://testrun.org/tox/latest/install.html
|
||||||
|
.. _`prospector`: https://github.com/landscapeio/prospector
|
||||||
|
.. _`combining`: https://en.wikipedia.org/wiki/Combining_character
|
||||||
|
.. _`bin/`: https://github.com/jquast/wcwidth/tree/master/bin
|
||||||
|
.. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
|
||||||
|
.. _`Thomas Ballinger`: https://github.com/thomasballinger
|
||||||
|
.. _`Leta Montopoli`: https://github.com/lmontopo
|
||||||
|
.. _`Philip Craig`: https://github.com/philipc
|
||||||
|
.. _`PR #3`: https://github.com/jquast/wcwidth/pull/3
|
||||||
|
.. _`PR #4`: https://github.com/jquast/wcwidth/pull/4
|
||||||
|
.. _`PR #5`: https://github.com/jquast/wcwidth/pull/5
|
||||||
|
.. _`PR #11`: https://github.com/jquast/wcwidth/pull/11
|
||||||
|
.. _`PR #18`: https://github.com/jquast/wcwidth/pull/18
|
||||||
|
.. _`PR #30`: https://github.com/jquast/wcwidth/pull/30
|
||||||
|
.. _`PR #35`: https://github.com/jquast/wcwidth/pull/35
|
||||||
|
.. _`jquast/blessed`: https://github.com/jquast/blessed
|
||||||
|
.. _`selectel/pyte`: https://github.com/selectel/pyte
|
||||||
|
.. _`thomasballinger/curtsies`: https://github.com/thomasballinger/curtsies
|
||||||
|
.. _`dbcli/pgcli`: https://github.com/dbcli/pgcli
|
||||||
|
.. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit
|
||||||
|
.. _`timoxley/wcwidth`: https://github.com/timoxley/wcwidth
|
||||||
|
.. _`wcwidth(3)`: http://man7.org/linux/man-pages/man3/wcwidth.3.html
|
||||||
|
.. _`wcswidth(3)`: http://man7.org/linux/man-pages/man3/wcswidth.3.html
|
||||||
|
.. _`astanin/python-tabulate`: https://github.com/astanin/python-tabulate
|
||||||
|
.. _`janlelis/unicode-display_width`: https://github.com/janlelis/unicode-display_width
|
||||||
|
.. _`LuminosoInsight/python-ftfy`: https://github.com/LuminosoInsight/python-ftfy
|
||||||
|
.. _`alecrabbit/php-wcwidth`: https://github.com/alecrabbit/php-wcwidth
|
||||||
|
.. _`Text::CharWidth`: https://metacpan.org/pod/Text::CharWidth
|
||||||
|
.. _`bluebear94/Terminal-WCWidth`: https://github.com/bluebear94/Terminal-WCWidth
|
||||||
|
.. _`mattn/go-runewidth`: https://github.com/mattn/go-runewidth
|
||||||
|
.. _`emugel/wcwidth`: https://github.com/emugel/wcwidth
|
||||||
|
.. _`jquast/ucs-detect`: https://github.com/jquast/ucs-detect
|
||||||
|
.. _`Avram Lubkin`: https://github.com/avylove
|
||||||
|
.. _`nbedos/termtosvg`: https://github.com/nbedos/termtosvg
|
||||||
|
.. _`peterbrittain/asciimatics`: https://github.com/peterbrittain/asciimatics
|
||||||
|
.. _`aperezdc/lua-wcwidth`: https://github.com/aperezdc/lua-wcwidth
|
||||||
|
.. _`fumiyas/wcwidth-cjk`: https://github.com/fumiyas/wcwidth-cjk
|
||||||
|
.. |pypi_downloads| image:: https://img.shields.io/pypi/dm/wcwidth.svg?logo=pypi
|
||||||
|
:alt: Downloads
|
||||||
|
:target: https://pypi.org/project/wcwidth/
|
||||||
|
.. |codecov| image:: https://codecov.io/gh/jquast/wcwidth/branch/master/graph/badge.svg
|
||||||
|
:alt: codecov.io Code Coverage
|
||||||
|
:target: https://codecov.io/gh/jquast/wcwidth/
|
||||||
|
.. |license| image:: https://img.shields.io/github/license/jquast/wcwidth.svg
|
||||||
|
:target: https://pypi.python.org/pypi/wcwidth/
|
||||||
|
:alt: MIT License
|
|
@ -0,0 +1,47 @@
#!/usr/bin/env python3
"""
Display new wide unicode point values, by version.

For example::

    "5.0.0": [
        12752,
        12753,
        12754,
        ...

Means that chr(12752) through chr(12754) are new WIDE values
for Unicode version 5.0.0, and were not WIDE values for the
previous version (4.1.0).
"""
# std imports
import sys
import json


# List new WIDE characters at each unicode version.
#
def main():
    from wcwidth import WIDE_EASTASIAN, _bisearch
    versions = list(WIDE_EASTASIAN.keys())
    results = {}
    for version in versions:
        prev_idx = versions.index(version) - 1
        if prev_idx == -1:
            continue
        previous_version = versions[prev_idx]
        previous_table = WIDE_EASTASIAN[previous_version]
        for value_pair in WIDE_EASTASIAN[version]:
            for value in range(*value_pair):
                if not _bisearch(value, previous_table):
                    results[version] = results.get(version, []) + [value]
                    if '--debug' in sys.argv:
                        print(f'version {version} has unicode character '
                              f'0x{value:05x} ({chr(value)}) but previous '
                              f'version, {previous_version} does not.',
                              file=sys.stderr)
    print(json.dumps(results, indent=4))


if __name__ == '__main__':
    main()
|
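The script above tests each code point against the previous version's table with a binary search. The following is a minimal, self-contained sketch of that lookup, assuming each table is a sorted, non-overlapping sequence of ``(low, high)`` pairs with inclusive ends. The names ``in_ranges`` and ``new_values`` are hypothetical and are not the library's ``_bisearch``.

```python
def in_ranges(value, table):
    """Return True if `value` lies inside any (low, high) pair of `table`.

    Assumption: `table` is sorted, non-overlapping, and inclusive at both ends.
    """
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        low, high = table[mid]
        if value < low:
            hi = mid - 1        # look in the lower half
        elif value > high:
            lo = mid + 1        # look in the upper half
        else:
            return True
    return False


def new_values(current, previous):
    """List code points covered by `current` but not by `previous`."""
    return [value
            for low, high in current
            for value in range(low, high + 1)
            if not in_ranges(value, previous)]


previous = [(0x1100, 0x115F)]
current = [(0x1100, 0x115F), (0x31D0, 0x31D2)]
print(new_values(current, previous))
```

Because the tables are sorted ranges, each lookup costs a number of steps logarithmic in the number of ranges rather than in the number of code points.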
|
@ -0,0 +1,42 @@
"""Workaround for https://github.com/codecov/codecov-python/issues/158."""

# std imports
import sys
import time

# 3rd party
import codecov

RETRIES = 5
TIMEOUT = 2


def main():
    """Run codecov up to RETRIES times. On the final attempt, let it exit normally."""

    # Make a copy of argv and make sure --required is in it
    args = sys.argv[1:]
    if '--required' not in args:
        args.append('--required')

    for num in range(1, RETRIES + 1):

        print('Running codecov attempt %d: ' % num)
        # On the last, let codecov handle the exit
        if num == RETRIES:
            codecov.main()

        try:
            codecov.main(*args)
        except SystemExit as err:
            # If there's no exit code, it was successful
            if err.code:
                time.sleep(TIMEOUT)
            else:
                sys.exit(err.code)
        else:
            break


if __name__ == '__main__':
    main()
|
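The retry pattern used by this script can be isolated into a small helper. The sketch below is illustrative only: ``retry`` is a hypothetical function, not part of codecov or of this repository, and it assumes the wrapped callable signals failure by raising ``SystemExit`` with a non-zero code.

```python
import time


def retry(func, retries=5, delay=2, sleep=time.sleep):
    """Call func() until it exits cleanly or `retries` attempts are used.

    Returns the number of the attempt that succeeded. On the final
    attempt a failing SystemExit is re-raised to the caller.
    """
    for attempt in range(1, retries + 1):
        try:
            func()
        except SystemExit as err:
            if not err.code:
                return attempt      # exit code 0 or None counts as success
            if attempt == retries:
                raise               # out of attempts: propagate the failure
            sleep(delay)            # wait before trying again
        else:
            return attempt


calls = []


def flaky():
    """Fail twice with exit code 1, then succeed."""
    calls.append(1)
    if len(calls) < 3:
        raise SystemExit(1)


print(retry(flaky, delay=0, sleep=lambda seconds: None))
```

Injecting ``sleep`` keeps the helper quick to exercise, since a test can pass a no-op instead of waiting.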
|
@ -0,0 +1,331 @@
#!/usr/bin/env python
"""
Update the python Unicode tables for wcwidth.

https://github.com/jquast/wcwidth
"""
from __future__ import print_function

# std imports
import os
import re
import glob
import json
import codecs
import string
import urllib
import datetime
import collections
import unicodedata

try:
    # py2
    from urllib2 import urlopen
except ImportError:
    # py3
    from urllib.request import urlopen

URL_UNICODE_DERIVED_AGE = 'file:///usr/share/unicode/DerivedAge.txt'
EXCLUDE_VERSIONS = ['2.0.0', '2.1.2', '3.0.0', '3.1.0', '3.2.0', '4.0.0']
PATH_UP = os.path.relpath(
    os.path.join(
        os.path.dirname(__file__),
        os.path.pardir))
PATH_DOCS = os.path.join(PATH_UP, 'docs')
PATH_DATA = os.path.join(PATH_UP, 'data')
PATH_CODE = os.path.join(PATH_UP, 'wcwidth')
FILE_RST = os.path.join(PATH_DOCS, 'unicode_version.rst')
FILE_PATCH_FROM = "release files:"
FILE_PATCH_TO = "======="


# use chr() for py3.x,
# unichr() for py2.x
try:
    _ = unichr(0)
except NameError as err:
    if err.args[0] == "name 'unichr' is not defined":
        # pylint: disable=C0103,W0622
        # Invalid constant name "unichr" (col 8)
        # Redefining built-in 'unichr' (col 8)
        unichr = chr
    else:
        raise


TableDef = collections.namedtuple('table', ['version', 'date', 'values'])


def main():
    """Update east-asian, combining and zero width tables."""
    versions = get_unicode_versions()
    do_east_asian(versions)
    do_zero_width(versions)
    do_rst_file_update()
    do_unicode_versions(versions)


def get_unicode_versions():
    """Fetch, determine, and return Unicode Versions for processing."""
|
||||||
|
fname = os.path.join(PATH_DATA, 'DerivedAge.txt')
|
||||||
|
do_retrieve(url=URL_UNICODE_DERIVED_AGE, fname=fname)
|
||||||
|
pattern = re.compile(r'#.*assigned in Unicode ([0-9.]+)')
|
||||||
|
versions = []
|
||||||
|
for line in open(fname, 'r'):
|
||||||
|
if match := re.match(pattern, line):
|
||||||
|
version = match.group(1)
|
||||||
|
if version not in EXCLUDE_VERSIONS:
|
||||||
|
versions.append(version)
|
||||||
|
versions.sort(key=lambda ver: list(map(int, ver.split('.'))))
|
||||||
|
return versions
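The sort key in `get_unicode_versions()` splits each dotted version into integers, which matters because a plain string sort would place `10.0.0` before `4.1.0`. A small illustration, using an arbitrary sample list and a hypothetical helper name `version_key`:

```python
def version_key(ver):
    """Turn '12.1.0' into [12, 1, 0] so versions compare numerically."""
    return [int(part) for part in ver.split('.')]


versions = ['10.0.0', '9.0.0', '4.1.0', '12.1.0', '5.2.0']

# Lexical order: '10.0.0' and '12.1.0' sort ahead of '4.1.0'.
print(sorted(versions))

# Numeric order, as used by the table generator.
print(sorted(versions, key=version_key))
```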
|
||||||
|
|
||||||
|
|
||||||
|
def do_rst_file_update():
    """Patch unicode_version.rst to reflect the data files used in release."""
|
||||||
|
|
||||||
|
# read in,
|
||||||
|
data_in = codecs.open(FILE_RST, 'r', 'utf8').read()
|
||||||
|
|
||||||
|
# search for beginning and end positions,
|
||||||
|
pos_begin = data_in.find(FILE_PATCH_FROM)
|
||||||
|
assert pos_begin != -1, (pos_begin, FILE_PATCH_FROM)
|
||||||
|
pos_begin += len(FILE_PATCH_FROM)
|
||||||
|
data_out = data_in[:pos_begin] + '\n\n'
|
||||||
|
|
||||||
|
# find all filenames with a version number in it,
|
||||||
|
# sort filenames by name, then dotted number, ascending
|
||||||
|
glob_pattern = os.path.join(PATH_DATA, '*[0-9]*.txt')
|
||||||
|
filenames = glob.glob(glob_pattern)
|
||||||
|
filenames.sort(key=lambda ver: [ver.split(
|
||||||
|
'-')[0]] + list(map(int, ver.split('-')[-1][:-4].split('.'))))
|
||||||
|
|
||||||
|
# copy file description as-is, formatted
|
||||||
|
for fpath in filenames:
|
||||||
|
if description := describe_file_header(fpath):
|
||||||
|
data_out += f'\n{description}'
|
||||||
|
|
||||||
|
# write.
|
||||||
|
print(f"patching {FILE_RST} ..")
|
||||||
|
codecs.open(
|
||||||
|
FILE_RST, 'w', 'utf8').write(data_out)
|
||||||
|
|
||||||
|
|
||||||
|
def do_east_asian(versions):
|
||||||
|
"""Fetch and update east-asian tables."""
|
||||||
|
table = {}
|
||||||
|
for version in versions:
|
||||||
|
fin = os.path.join(PATH_DATA, 'EastAsianWidth-{version}.txt')
|
||||||
|
fout = os.path.join(PATH_CODE, 'table_wide.py')
|
||||||
|
url = ('file:///usr/share/unicode/EastAsianWidth.txt')
|
||||||
|
try:
|
||||||
|
do_retrieve(url=url.format(version=version),
|
||||||
|
fname=fin.format(version=version))
|
||||||
|
except urllib.error.HTTPError as err:
|
||||||
|
if err.code != 404:
|
||||||
|
raise
|
||||||
|
else:
|
||||||
|
table[version] = parse_east_asian(
|
||||||
|
fname=fin.format(version=version),
|
||||||
|
properties=(u'W', u'F',))
|
||||||
|
do_write_table(fname=fout, variable='WIDE_EASTASIAN', table=table)
|
||||||
|
|
||||||
|
|
||||||
|
def do_zero_width(versions):
|
||||||
|
"""Fetch and update zero width tables."""
|
||||||
|
table = {}
|
||||||
|
fout = os.path.join(PATH_CODE, 'table_zero.py')
|
||||||
|
for version in versions:
|
||||||
|
fin = os.path.join(PATH_DATA, 'DerivedGeneralCategory-{version}.txt')
|
||||||
|
url = ('file:///usr/share/unicode/extracted/DerivedGeneralCategory.txt')
|
||||||
|
try:
|
||||||
|
do_retrieve(url=url.format(version=version),
|
||||||
|
fname=fin.format(version=version))
|
||||||
|
except urllib.error.HTTPError as err:
|
||||||
|
if err.code != 404:
|
||||||
|
raise
|
||||||
|
else:
|
||||||
|
table[version] = parse_category(
|
||||||
|
fname=fin.format(version=version),
|
||||||
|
categories=('Me', 'Mn',))
|
||||||
|
do_write_table(fname=fout, variable='ZERO_WIDTH', table=table)
|
||||||
|
|
||||||
|
|
||||||
|
def make_table(values):
|
||||||
|
"""Return a tuple of lookup tables for given values."""
|
||||||
|
table = collections.deque()
|
||||||
|
start, end = values[0], values[0]
|
||||||
|
for num, value in enumerate(values):
|
||||||
|
if num == 0:
|
||||||
|
table.append((value, value,))
|
||||||
|
continue
|
||||||
|
start, end = table.pop()
|
||||||
|
if end == value - 1:
|
||||||
|
table.append((start, value,))
|
||||||
|
else:
|
||||||
|
table.append((start, end,))
|
||||||
|
table.append((value, value,))
|
||||||
|
return tuple(table)
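`make_table()` collapses a sorted sequence of code points into inclusive `(start, end)` ranges. The same idea can be written without a deque. The sketch below is an equivalent reformulation under the assumption that the input is sorted and free of duplicates; `collapse_ranges` is a hypothetical name, and unlike `make_table()` it also accepts an empty sequence.

```python
def collapse_ranges(values):
    """Collapse sorted integers into inclusive (start, end) tuples."""
    ranges = []
    for value in values:
        if ranges and ranges[-1][1] == value - 1:
            # value extends the most recent range by one
            ranges[-1] = (ranges[-1][0], value)
        else:
            # gap found (or first value): open a new range
            ranges.append((value, value))
    return tuple(ranges)


print(collapse_ranges([1, 2, 3, 7, 9, 10]))
```

Storing ranges instead of individual code points keeps the generated tables compact and makes them suitable for binary search.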
|
||||||
|
|
||||||
|
|
||||||
|
def do_retrieve(url, fname):
|
||||||
|
"""Retrieve given url to target filepath fname."""
|
||||||
|
folder = os.path.dirname(fname)
|
||||||
|
if not os.path.exists(folder):
|
||||||
|
os.makedirs(folder)
|
||||||
|
print(f"{folder}{os.path.sep} created.")
|
||||||
|
if not os.path.exists(fname):
|
||||||
|
try:
|
||||||
|
with open(fname, 'wb') as fout:
|
||||||
|
print(f"retrieving {url}: ", end='', flush=True)
|
||||||
|
resp = urlopen(url)
|
||||||
|
fout.write(resp.read())
|
||||||
|
except BaseException:
|
||||||
|
print('failed')
|
||||||
|
os.unlink(fname)
|
||||||
|
raise
|
||||||
|
print(f"{fname} saved.")
|
||||||
|
return fname
|
||||||
|
|
||||||
|
|
||||||
|
def describe_file_header(fpath):
    """Return the first two header lines of ``fpath``, formatted for the RST file list."""
    header_2 = [line.lstrip('# ').rstrip() for line in
                codecs.open(fpath, 'r', 'utf8').readlines()[:2]]
|
||||||
|
# fmt:
|
||||||
|
#
|
||||||
|
# ``EastAsianWidth-8.0.0.txt``
|
||||||
|
# *2015-02-10, 21:00:00 GMT [KW, LI]*
|
||||||
|
fmt = '``{0}``\n *{1}*\n'
|
||||||
|
if len(header_2) == 0:
|
||||||
|
return ''
|
||||||
|
assert len(header_2) == 2, (fpath, header_2)
|
||||||
|
return fmt.format(*header_2)
|
||||||
|
|
||||||
|
|
||||||
|
def parse_east_asian(fname, properties=(u'W', u'F',)):
|
||||||
|
"""Parse unicode east-asian width tables."""
|
||||||
|
print(f'parsing {fname}: ', end='', flush=True)
|
||||||
|
version, date, values = None, None, []
|
||||||
|
for line in open(fname, 'rb'):
|
||||||
|
uline = line.decode('utf-8')
|
||||||
|
if version is None:
|
||||||
|
version = uline.split(None, 1)[1].rstrip()
|
||||||
|
continue
|
||||||
|
if date is None:
|
||||||
|
date = uline.split(':', 1)[1].rstrip()
|
||||||
|
continue
|
||||||
|
if uline.startswith('#') or not uline.lstrip():
|
||||||
|
continue
|
||||||
|
addrs, details = uline.split(';', 1)
|
||||||
|
if any(details.startswith(property)
|
||||||
|
for property in properties):
|
||||||
|
start, stop = addrs, addrs
|
||||||
|
if '..' in addrs:
|
||||||
|
start, stop = addrs.split('..')
|
||||||
|
values.extend(range(int(start, 16), int(stop, 16) + 1))
|
||||||
|
print('ok')
|
||||||
|
return TableDef(version, date, values)
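The per-line logic of `parse_east_asian()` can be illustrated on individual data lines. The sketch below is a simplified stand-in, not the function above: `parse_width_line` is a hypothetical helper, it matches the property field exactly rather than with `startswith`, and it strips the trailing comment before splitting. The sample lines follow the `code;property # comment` layout of `EastAsianWidth.txt`.

```python
def parse_width_line(line, properties=('W', 'F')):
    """Return the code points named by one data line, or [] if it does not match."""
    data = line.split('#', 1)[0].strip()     # drop trailing comment
    if not data:
        return []                            # blank or comment-only line
    addrs, prop = (field.strip() for field in data.split(';', 1))
    if prop not in properties:
        return []
    start, _, stop = addrs.partition('..')   # 'AAAA..BBBB' or single 'AAAA'
    stop = stop or start
    return list(range(int(start, 16), int(stop, 16) + 1))


wide = parse_width_line('1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER')
print(len(wide), hex(wide[0]), hex(wide[-1]))
print(parse_width_line('0020;Na          # Zs         SPACE'))
```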
|
||||||
|
|
||||||
|
|
||||||
|
def parse_category(fname, categories):
|
||||||
|
"""Parse unicode category tables."""
|
||||||
|
print(f'parsing {fname}: ', end='', flush=True)
|
||||||
|
version, date, values = None, None, []
|
||||||
|
for line in open(fname, 'rb'):
|
||||||
|
uline = line.decode('utf-8')
|
||||||
|
if version is None:
|
||||||
|
version = uline.split(None, 1)[1].rstrip()
|
||||||
|
continue
|
||||||
|
if date is None:
|
||||||
|
date = uline.split(':', 1)[1].rstrip()
|
||||||
|
continue
|
||||||
|
if uline.startswith('#') or not uline.lstrip():
|
||||||
|
continue
|
||||||
|
addrs, details = uline.split(';', 1)
|
||||||
|
addrs, details = addrs.rstrip(), details.lstrip()
|
||||||
|
if any(details.startswith(f'{value} #')
|
||||||
|
for value in categories):
|
||||||
|
start, stop = addrs, addrs
|
||||||
|
if '..' in addrs:
|
||||||
|
start, stop = addrs.split('..')
|
||||||
|
values.extend(range(int(start, 16), int(stop, 16) + 1))
|
||||||
|
print('ok')
|
||||||
|
return TableDef(version, date, sorted(values))
|
||||||
|
|
||||||
|
|
||||||
|
def do_write_table(fname, variable, table):
|
||||||
|
"""Write combining tables to filesystem as python code."""
|
||||||
|
# pylint: disable=R0914
|
||||||
|
# Too many local variables (19/15) (col 4)
|
||||||
|
utc_now = datetime.datetime.utcnow()
|
||||||
|
indent = ' ' * 8
|
||||||
|
with open(fname, 'w') as fout:
|
||||||
|
print(f"writing {fname} ... ", end='')
|
||||||
|
fout.write(
|
||||||
|
f'"""{variable.title()} table, created by bin/update-tables.py."""\n'
|
||||||
|
f"{variable} = {{\n")
|
||||||
|
|
||||||
|
for version_key, version_table in table.items():
|
||||||
|
if not version_table.values:
|
||||||
|
continue
|
||||||
|
fout.write(
|
||||||
|
f"{indent[:-4]}'{version_key}': (\n"
|
||||||
|
f"{indent}# Source: {version_table.version}\n"
|
||||||
|
f"{indent}# Date: {version_table.date}\n"
|
||||||
|
f"{indent}#")
|
||||||
|
|
||||||
|
for start, end in make_table(version_table.values):
|
||||||
|
ucs_start, ucs_end = unichr(start), unichr(end)
|
||||||
|
hex_start, hex_end = (f'0x{start:05x}', f'0x{end:05x}')
|
||||||
|
try:
|
||||||
|
name_start = string.capwords(unicodedata.name(ucs_start))
|
||||||
|
except ValueError:
|
||||||
|
name_start = u'(nil)'
|
||||||
|
try:
|
||||||
|
name_end = string.capwords(unicodedata.name(ucs_end))
|
||||||
|
except ValueError:
|
||||||
|
name_end = u'(nil)'
|
||||||
|
fout.write(f'\n{indent}')
|
||||||
|
comment_startpart = name_start[:24].rstrip()
|
||||||
|
comment_endpart = name_end[:24].rstrip()
|
||||||
|
fout.write(f'({hex_start}, {hex_end},),')
|
||||||
|
fout.write(f' # {comment_startpart:24s}..{comment_endpart}')
|
||||||
|
fout.write(f'\n{indent[:-4]}),\n')
|
||||||
|
fout.write('}\n')
|
||||||
|
print("complete.")
|
||||||
|
|
||||||
|
|
||||||
|
def do_unicode_versions(versions):
|
||||||
|
"""Write unicode_versions.py function list_versions()."""
|
||||||
|
fname = os.path.join(PATH_CODE, 'unicode_versions.py')
|
||||||
|
print(f"writing {fname} ... ", end='')
|
||||||
|
|
||||||
|
utc_now = datetime.datetime.utcnow()
|
||||||
|
version_tuples_str = '\n '.join(
|
||||||
|
f'"{ver}",' for ver in versions)
|
||||||
|
with open(fname, 'w') as fp:
|
||||||
|
fp.write(f"""\"\"\"
|
||||||
|
Exports function list_versions() for unicode version level support.
|
||||||
|
|
||||||
|
This code was generated by {__file__} on {utc_now}.
|
||||||
|
\"\"\"
|
||||||
|
|
||||||
|
|
||||||
|
def list_versions():
|
||||||
|
\"\"\"
|
||||||
|
Return Unicode version levels supported by this module release.
|
||||||
|
|
||||||
|
Any of the version strings returned may be used as keyword argument
|
||||||
|
``unicode_version`` to the ``wcwidth()`` family of functions.
|
||||||
|
|
||||||
|
:returns: Supported Unicode version numbers in ascending sorted order.
|
||||||
|
:rtype: tuple[str]
|
||||||
|
\"\"\"
|
||||||
|
return (
|
||||||
|
{version_tuples_str}
|
||||||
|
)
|
||||||
|
""")
|
||||||
|
print('done.')
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
|
|
@ -0,0 +1,706 @@
|
||||||
|
#!/usr/bin/env python
"""
A terminal browser, similar to less(1) for testing printable width of unicode.

This displays the full range of unicode points for 1 or 2-character wide
ideograms, with pipes ('|') that should always align for any terminal that
supports utf-8.

Usage:
  ./bin/wcwidth-browser.py [--wide=<n>]
                           [--alignment=<str>]
                           [--combining]
                           [--help]

Options:
  --wide=<int>        Browse 1 or 2 character-wide cells.
  --alignment=<str>   Choose left or right alignment. [default: left]
  --combining         Use combining character generator. [default: 2]
  --help              Display usage
"""
|
||||||
|
# pylint: disable=C0103,W0622
|
||||||
|
# Invalid constant name "echo"
|
||||||
|
# Invalid constant name "flushout" (col 4)
|
||||||
|
# Invalid module name "wcwidth-browser"
|
||||||
|
from __future__ import division, print_function
|
||||||
|
|
||||||
|
# std imports
|
||||||
|
import sys
|
||||||
|
import signal
|
||||||
|
import string
|
||||||
|
import functools
|
||||||
|
import unicodedata
|
||||||
|
|
||||||
|
# 3rd party
|
||||||
|
import docopt
|
||||||
|
import blessed
|
||||||
|
|
||||||
|
# local
|
||||||
|
from wcwidth import ZERO_WIDTH, wcwidth, list_versions, _wcmatch_version
|
||||||
|
|
||||||
|
#: print function alias, does not end with line terminator.
|
||||||
|
echo = functools.partial(print, end='')
|
||||||
|
flushout = functools.partial(print, end='', flush=True)
|
||||||
|
|
||||||
|
#: printable length of highest unicode character description
|
||||||
|
LIMIT_UCS = 0x3fffd
|
||||||
|
UCS_PRINTLEN = len('{value:0x}'.format(value=LIMIT_UCS))
|
||||||
|
|
||||||
|
|
||||||
|
def readline(term, width):
|
||||||
|
"""A rudimentary readline implementation."""
|
||||||
|
text = ''
|
||||||
|
while True:
|
||||||
|
inp = term.inkey()
|
||||||
|
if inp.code == term.KEY_ENTER:
|
||||||
|
break
|
||||||
|
if inp.code == term.KEY_ESCAPE or inp == chr(3):
|
||||||
|
text = None
|
||||||
|
break
|
||||||
|
if not inp.is_sequence and len(text) < width:
|
||||||
|
text += inp
|
||||||
|
echo(inp)
|
||||||
|
flushout()
|
||||||
|
elif inp.code in (term.KEY_BACKSPACE, term.KEY_DELETE):
|
||||||
|
if text:
|
||||||
|
text = text[:-1]
|
||||||
|
echo('\b \b')
|
||||||
|
flushout()
|
||||||
|
return text
|
||||||
|
|
||||||
|
|
||||||
|
class WcWideCharacterGenerator(object):
|
||||||
|
"""Generator yields unicode characters of the given ``width``."""
|
||||||
|
|
||||||
|
# pylint: disable=R0903
|
||||||
|
# Too few public methods (0/2)
|
||||||
|
def __init__(self, width=2, unicode_version='auto'):
|
||||||
|
"""
|
||||||
|
Class constructor.
|
||||||
|
|
||||||
|
:param width: generate characters of given width.
|
||||||
|
:param str unicode_version: Unicode Version for render.
|
||||||
|
:type width: int
|
||||||
|
"""
|
||||||
|
self.characters = (
|
||||||
|
chr(idx) for idx in range(LIMIT_UCS)
|
||||||
|
if wcwidth(chr(idx), unicode_version=unicode_version) == width)
|
||||||
|
|
||||||
|
def __iter__(self):
|
||||||
|
"""Special method called by iter()."""
|
||||||
|
return self
|
||||||
|
|
||||||
|
def __next__(self):
|
||||||
|
"""Special method called by next()."""
|
||||||
|
while True:
|
||||||
|
ucs = next(self.characters)
|
||||||
|
try:
|
||||||
|
name = string.capwords(unicodedata.name(ucs))
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
return (ucs, name)
|
||||||
|
|
||||||
|
|
||||||
|
class WcCombinedCharacterGenerator(object):
|
||||||
|
"""Generator yields unicode characters with combining."""
|
||||||
|
|
||||||
|
# pylint: disable=R0903
|
||||||
|
# Too few public methods (0/2)
|
||||||
|
|
||||||
|
def __init__(self, width=1):
|
||||||
|
"""
|
||||||
|
Class constructor.
|
||||||
|
|
||||||
|
:param int width: generate characters of given width.
|
||||||
|
:param str unicode_version: Unicode version.
|
||||||
|
"""
|
||||||
|
self.characters = []
|
||||||
|
letters_o = ('o' * width)
|
||||||
|
last_version = list_versions()[-1]
|
||||||
|
for (begin, end) in ZERO_WIDTH[last_version]:
|
||||||
|
for val in [_val for _val in
|
||||||
|
range(begin, end + 1)
|
||||||
|
if _val <= LIMIT_UCS]:
|
||||||
|
self.characters.append(
|
||||||
|
letters_o[:1] +
|
||||||
|
chr(val) +
|
||||||
|
letters_o[wcwidth(chr(val)) + 1:])
|
||||||
|
self.characters.reverse()
|
||||||
|
|
||||||
|
def __iter__(self):
|
||||||
|
"""Special method called by iter()."""
|
||||||
|
return self
|
||||||
|
|
||||||
|
def __next__(self):
|
||||||
|
"""
|
||||||
|
Special method called by next().
|
||||||
|
|
||||||
|
:return: unicode character and name, as tuple.
|
||||||
|
:rtype: tuple[unicode, unicode]
|
||||||
|
:raises StopIteration: no more characters
|
||||||
|
"""
|
||||||
|
while True:
|
||||||
|
if not self.characters:
|
||||||
|
raise StopIteration
|
||||||
|
ucs = self.characters.pop()
|
||||||
|
try:
|
||||||
|
name = string.capwords(unicodedata.name(ucs[1]))
|
||||||
|
except ValueError:
|
||||||
|
continue
|
||||||
|
return (ucs, name)
|
||||||
|
|
||||||
|
# python 2.6 - 3.3 compatibility
|
||||||
|
next = __next__
|
||||||
|
|
||||||
|
|
||||||
|
class Style(object):
|
||||||
|
"""Styling decorator class instance for terminal output."""
|
||||||
|
|
||||||
|
# pylint: disable=R0903
|
||||||
|
# Too few public methods (0/2)
|
||||||
|
@staticmethod
|
||||||
|
def attr_major(text):
|
||||||
|
"""non-stylized callable for "major" text, for non-ttys."""
|
||||||
|
return text
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def attr_minor(text):
|
||||||
|
"""non-stylized callable for "minor" text, for non-ttys."""
|
||||||
|
return text
|
||||||
|
|
||||||
|
delimiter = '|'
|
||||||
|
continuation = ' $'
|
||||||
|
header_hint = '-'
|
||||||
|
header_fill = '='
|
||||||
|
name_len = 10
|
||||||
|
alignment = 'right'
|
||||||
|
|
||||||
|
def __init__(self, **kwargs):
|
||||||
|
"""
|
||||||
|
Class constructor.
|
||||||
|
|
||||||
|
Any given keyword arguments are assigned to the instance attribute of the same name.
|
||||||
|
"""
|
||||||
|
for key, val in kwargs.items():
|
||||||
|
setattr(self, key, val)
|
||||||
|
|
||||||
|
|
||||||
|
class Screen(object):
|
||||||
|
"""Represents terminal style, data dimensions, and drawables."""
|
||||||
|
|
||||||
|
intro_msg_fmt = ('Delimiters ({delim}) should align, '
|
||||||
|
'unicode version is {version}.')
|
||||||
|
|
||||||
|
def __init__(self, term, style, wide=2):
|
||||||
|
"""Class constructor."""
|
||||||
|
self.term = term
|
||||||
|
self.style = style
|
||||||
|
self.wide = wide
|
||||||
|
|
||||||
|
@property
|
||||||
|
def header(self):
|
||||||
|
"""Text of joined segments producing full heading."""
|
||||||
|
return self.head_item * self.num_columns
|
||||||
|
|
||||||
|
@property
|
||||||
|
def hint_width(self):
|
||||||
|
"""Width of a column segment."""
|
||||||
|
return sum((len(self.style.delimiter),
|
||||||
|
self.wide,
|
||||||
|
len(self.style.delimiter),
|
||||||
|
len(' '),
|
||||||
|
UCS_PRINTLEN + 2,
|
||||||
|
len(' '),
|
||||||
|
self.style.name_len,))
|
||||||
|
|
||||||
|
@property
|
||||||
|
def head_item(self):
|
||||||
|
"""Text of a single column heading."""
|
||||||
|
delimiter = self.style.attr_minor(self.style.delimiter)
|
||||||
|
hint = self.style.header_hint * self.wide
|
||||||
|
heading = ('{delimiter}{hint}{delimiter}'
|
||||||
|
.format(delimiter=delimiter, hint=hint))
|
||||||
|
|
||||||
|
def alignment(*args):
|
||||||
|
if self.style.alignment == 'right':
|
||||||
|
return self.term.rjust(*args)
|
||||||
|
return self.term.ljust(*args)
|
||||||
|
|
||||||
|
txt = alignment(heading, self.hint_width, self.style.header_fill)
|
||||||
|
return self.style.attr_major(txt)
|
||||||
|
|
||||||
|
    def msg_intro(self, version):
        """Introductory message displayed above heading."""
|
||||||
|
return self.term.center(self.intro_msg_fmt.format(
|
||||||
|
delim=self.style.attr_minor(self.style.delimiter),
|
||||||
|
version=self.style.attr_minor(version))).rstrip()
|
||||||
|
|
||||||
|
@property
|
||||||
|
def row_ends(self):
|
||||||
|
"""Bottom of page."""
|
||||||
|
return self.term.height - 1
|
||||||
|
|
||||||
|
@property
|
||||||
|
def num_columns(self):
|
||||||
|
"""Number of columns displayed."""
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
return self.term.width // self.hint_width
|
||||||
|
return 1
|
||||||
|
|
||||||
|
@property
|
||||||
|
def num_rows(self):
|
||||||
|
"""Number of rows displayed."""
|
||||||
|
return self.row_ends - self.row_begins - 1
|
||||||
|
|
||||||
|
@property
|
||||||
|
def row_begins(self):
|
||||||
|
"""Top row displayed for content."""
|
||||||
|
# pylint: disable=R0201
|
||||||
|
# Method could be a function (col 4)
|
||||||
|
return 2
|
||||||
|
|
||||||
|
    @property
    def page_size(self):
        """Number of unicode characters displayed per page."""
|
||||||
|
return self.num_rows * self.num_columns
|
||||||
|
|
||||||
|
|
||||||
|
class Pager(object):
|
||||||
|
"""A less(1)-like browser for browsing unicode characters."""
|
||||||
|
# pylint: disable=too-many-instance-attributes
|
||||||
|
|
||||||
|
#: screen state for next draw method(s).
|
||||||
|
STATE_CLEAN, STATE_DIRTY, STATE_REFRESH = 0, 1, 2
|
||||||
|
|
||||||
|
def __init__(self, term, screen, character_factory):
|
||||||
|
"""
|
||||||
|
Class constructor.
|
||||||
|
|
||||||
|
:param term: blessed Terminal class instance.
|
||||||
|
:type term: blessed.Terminal
|
||||||
|
:param screen: Screen class instance.
|
||||||
|
:type screen: Screen
|
||||||
|
:param character_factory: Character factory generator.
|
||||||
|
:type character_factory: callable returning iterable.
|
||||||
|
"""
|
||||||
|
self.term = term
|
||||||
|
self.screen = screen
|
||||||
|
self.character_factory = character_factory
|
||||||
|
self.unicode_version = 'auto'
|
||||||
|
self.dirty = self.STATE_REFRESH
|
||||||
|
self.last_page = 0
|
||||||
|
self._page_data = list()
|
||||||
|
|
||||||
|
def on_resize(self, *args):
|
||||||
|
"""Signal handler callback for SIGWINCH."""
|
||||||
|
# pylint: disable=W0613
|
||||||
|
# Unused argument 'args'
|
||||||
|
assert self.term.width >= self.screen.hint_width, (
|
||||||
|
'Screen too small {}, must be at least {}'.format(
|
||||||
|
self.term.width, self.screen.hint_width))
|
||||||
|
self._set_lastpage()
|
||||||
|
self.dirty = self.STATE_REFRESH
|
||||||
|
|
||||||
|
def _set_lastpage(self):
|
||||||
|
"""Calculate value of class attribute ``last_page``."""
|
||||||
|
self.last_page = (len(self._page_data) - 1) // self.screen.page_size
|
||||||
|
|
||||||
|
def display_initialize(self):
|
||||||
|
"""Display 'please wait' message, and narrow build warning."""
|
||||||
|
echo(self.term.home + self.term.clear)
|
||||||
|
echo(self.term.move_y(self.term.height // 2))
|
||||||
|
echo(self.term.center('Initializing page data ...').rstrip())
|
||||||
|
flushout()
|
||||||
|
|
||||||
|
def initialize_page_data(self):
|
||||||
|
"""Initialize the page data for the given screen."""
|
||||||
|
# pylint: disable=attribute-defined-outside-init
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
self.display_initialize()
|
||||||
|
self.character_generator = self.character_factory(
|
||||||
|
self.screen.wide)
|
||||||
|
self._page_data = list()
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
self._page_data.append(next(self.character_generator))
|
||||||
|
except StopIteration:
|
||||||
|
break
|
||||||
|
self._set_lastpage()
|
||||||
|
|
||||||
|
def page_data(self, idx, offset):
|
||||||
|
"""
|
||||||
|
Return character data for page of given index and offset.
|
||||||
|
|
||||||
|
:param idx: page index.
|
||||||
|
:type idx: int
|
||||||
|
:param offset: scrolling region offset of current page.
|
||||||
|
:type offset: int
|
||||||
|
:returns: list of tuples in form of ``(ucs, name)``
|
||||||
|
:rtype: list[(unicode, unicode)]
|
||||||
|
"""
|
||||||
|
size = self.screen.page_size
|
||||||
|
|
||||||
|
while offset < 0 and idx:
|
||||||
|
offset += size
|
||||||
|
idx -= 1
|
||||||
|
offset = max(0, offset)
|
||||||
|
|
||||||
|
while offset >= size:
|
||||||
|
offset -= size
|
||||||
|
idx += 1
|
||||||
|
|
||||||
|
if idx == self.last_page:
|
||||||
|
offset = 0
|
||||||
|
idx = min(max(0, idx), self.last_page)
|
||||||
|
|
||||||
|
start = (idx * self.screen.page_size) + offset
|
||||||
|
end = start + self.screen.page_size
|
||||||
|
return (idx, offset), self._page_data[start:end]
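The index arithmetic in `page_data()` can be followed more easily in isolation. The sketch below copies the normalization steps into a free function, with `normalize`, `size`, and `last_page` as stand-in names for the method's state, and prints the result for a few inputs.

```python
def normalize(idx, offset, size, last_page):
    """Fold an out-of-range scroll offset back into a (page index, offset) pair."""
    # A negative offset borrows whole pages from the index.
    while offset < 0 and idx:
        offset += size
        idx -= 1
    offset = max(0, offset)

    # An offset of a page or more carries into the index.
    while offset >= size:
        offset -= size
        idx += 1

    # The last page is always shown from its top.
    if idx == last_page:
        offset = 0
    idx = min(max(0, idx), last_page)
    return idx, offset


print(normalize(2, -5, size=20, last_page=9))   # scrolled up past the page top
print(normalize(3, 25, size=20, last_page=9))   # scrolled down past the page end
print(normalize(0, -5, size=20, last_page=9))   # already at the very top
```

The slice returned by `page_data()` then starts at `idx * size + offset` and spans one page.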
|
||||||
|
|
||||||
|
def _run_notty(self, writer):
|
||||||
|
"""Pager run method for terminals that are not a tty."""
|
||||||
|
page_idx = page_offset = 0
|
||||||
|
while True:
|
||||||
|
npage_idx, _ = self.draw(writer, page_idx + 1, page_offset)
|
||||||
|
if npage_idx == self.last_page:
|
||||||
|
# page displayed was last page, quit.
|
||||||
|
break
|
||||||
|
page_idx = npage_idx
|
||||||
|
self.dirty = self.STATE_DIRTY
|
||||||
|
|
||||||
|
def _run_tty(self, writer, reader):
|
||||||
|
"""Pager run method for terminals that are a tty."""
|
||||||
|
# allow window-change signal to reflow screen
|
||||||
|
signal.signal(signal.SIGWINCH, self.on_resize)
|
||||||
|
|
||||||
|
page_idx = page_offset = 0
|
||||||
|
while True:
|
||||||
|
if self.dirty:
|
||||||
|
page_idx, page_offset = self.draw(writer,
|
||||||
|
page_idx,
|
||||||
|
page_offset)
|
||||||
|
self.dirty = self.STATE_CLEAN
|
||||||
|
inp = reader(timeout=0.25)
|
||||||
|
if inp is not None:
|
||||||
|
nxt, noff = self.process_keystroke(inp,
|
||||||
|
page_idx,
|
||||||
|
page_offset)
|
||||||
|
if self.dirty:
|
||||||
|
continue
|
||||||
|
if not self.dirty:
|
||||||
|
self.dirty = nxt != page_idx or noff != page_offset
|
||||||
|
page_idx, page_offset = nxt, noff
|
||||||
|
if page_idx == -1:
|
||||||
|
return
|
||||||
|
|
||||||
|
def run(self, writer, reader):
|
||||||
|
"""
|
||||||
|
Pager entry point.
|
||||||
|
|
||||||
|
In interactive mode (terminal is a tty), run until
|
||||||
|
``process_keystroke()`` detects quit keystroke ('q'). In
|
||||||
|
non-interactive mode, exit after displaying all unicode points.
|
||||||
|
|
||||||
|
:param writer: callable writes to output stream, receiving unicode.
|
||||||
|
:type writer: callable
|
||||||
|
:param reader: callable reads keystrokes from input stream, sending
|
||||||
|
instance of blessed.keyboard.Keystroke.
|
||||||
|
:type reader: callable
|
||||||
|
"""
|
||||||
|
self.initialize_page_data()
|
||||||
|
if not self.term.is_a_tty:
|
||||||
|
self._run_notty(writer)
|
||||||
|
else:
|
||||||
|
self._run_tty(writer, reader)
|
||||||
|
|
||||||
|
def process_keystroke(self, inp, idx, offset):
|
||||||
|
"""
|
||||||
|
Process keystroke ``inp``, adjusting screen parameters.
|
||||||
|
|
||||||
|
:param inp: return value of blessed.Terminal.inkey().
|
||||||
|
:type inp: blessed.keyboard.Keystroke
|
||||||
|
:param idx: page index.
|
||||||
|
:type idx: int
|
||||||
|
:param offset: scrolling region offset of current page.
|
||||||
|
:type offset: int
|
||||||
|
:returns: tuple of next (idx, offset).
|
||||||
|
:rtype: (int, int)
|
||||||
|
"""
|
||||||
|
if inp.lower() == 'q':
|
||||||
|
# exit
|
||||||
|
return (-1, -1)
|
||||||
|
self._process_keystroke_commands(inp)
|
||||||
|
idx, offset = self._process_keystroke_movement(inp, idx, offset)
|
||||||
|
return idx, offset
|
||||||
|
|
||||||
|
def _process_keystroke_commands(self, inp):
|
||||||
|
"""Process keystrokes that issue commands (side effects)."""
|
||||||
|
if inp in ('1', '2') and self.screen.wide != int(inp):
|
||||||
|
# change between 1 or 2-character wide mode.
|
||||||
|
self.screen.wide = int(inp)
|
||||||
|
self.initialize_page_data()
|
||||||
|
self.on_resize(None, None)
|
||||||
|
elif inp == 'c':
|
||||||
|
# switch on/off combining characters
|
||||||
|
self.character_factory = (
|
||||||
|
WcWideCharacterGenerator
|
||||||
|
if self.character_factory != WcWideCharacterGenerator
|
||||||
|
else WcCombinedCharacterGenerator)
|
||||||
|
self.initialize_page_data()
|
||||||
|
self.on_resize(None, None)
|
||||||
|
elif inp in ('_', '-'):
|
||||||
|
# adjust name length -2
|
||||||
|
nlen = max(1, self.screen.style.name_len - 2)
|
||||||
|
if nlen != self.screen.style.name_len:
|
||||||
|
self.screen.style.name_len = nlen
|
||||||
|
self.on_resize(None, None)
|
||||||
|
elif inp in ('+', '='):
|
||||||
|
# adjust name length +2
|
||||||
|
nlen = min(self.term.width - 8, self.screen.style.name_len + 2)
|
||||||
|
if nlen != self.screen.style.name_len:
|
||||||
|
self.screen.style.name_len = nlen
|
||||||
|
self.on_resize(None, None)
|
||||||
|
elif inp == 'v':
|
||||||
|
with self.term.location(x=0, y=self.term.height - 2):
|
||||||
|
print(self.term.clear_eos())
|
||||||
|
input_selection_msg = (
|
||||||
|
"--> Enter unicode version [{versions}] ("
|
||||||
|
"current: {self.unicode_version}):".format(
|
||||||
|
versions=', '.join(list_versions()),
|
||||||
|
self=self))
|
||||||
|
echo('\n'.join(self.term.wrap(input_selection_msg,
|
||||||
|
subsequent_indent=' ')))
|
||||||
|
echo(' ')
|
||||||
|
flushout()
|
||||||
|
inp = readline(self.term, width=max(map(len, list_versions())))
|
||||||
|
if inp and inp.strip() and inp != self.unicode_version:
|
||||||
|
# set new unicode version -- page data must be
|
||||||
|
# re-initialized. Any version is legal, underlying
|
||||||
|
# library performs best-match (with warnings)
|
||||||
|
self.unicode_version = _wcmatch_version(inp)
|
||||||
|
self.initialize_page_data()
|
||||||
|
self.on_resize(None, None)
|
||||||
|
|
||||||
|
def _process_keystroke_movement(self, inp, idx, offset):
|
||||||
|
"""Process keystrokes that adjust index and offset."""
|
||||||
|
term = self.term
|
||||||
|
# a little vi-inspired.
|
||||||
|
if inp in ('y', 'k') or inp.code in (term.KEY_UP,):
|
||||||
|
# scroll backward 1 line
|
||||||
|
offset -= self.screen.num_columns
|
||||||
|
elif inp in ('e', 'j') or inp.code in (term.KEY_ENTER,
|
||||||
|
term.KEY_DOWN,):
|
||||||
|
# scroll forward 1 line
|
||||||
|
offset = offset + self.screen.num_columns
|
||||||
|
elif inp in ('f', ' ') or inp.code in (term.KEY_PGDOWN,):
|
||||||
|
# scroll forward 1 page
|
||||||
|
idx += 1
|
||||||
|
elif inp == 'b' or inp.code in (term.KEY_PGUP,):
|
||||||
|
# scroll backward 1 page
|
||||||
|
idx = max(0, idx - 1)
|
||||||
|
elif inp == 'F' or inp.code in (term.KEY_SDOWN,):
|
||||||
|
# scroll forward 10 pages
|
||||||
|
idx = max(0, idx + 10)
|
||||||
|
elif inp == 'B' or inp.code in (term.KEY_SUP,):
|
||||||
|
# scroll backward 10 pages
|
||||||
|
idx = max(0, idx - 10)
|
||||||
|
elif inp.code == term.KEY_HOME:
|
||||||
|
# top
|
||||||
|
idx, offset = (0, 0)
|
||||||
|
elif inp == 'G' or inp.code == term.KEY_END:
|
||||||
|
# bottom
|
||||||
|
idx, offset = (self.last_page, 0)
|
||||||
|
elif inp == '\x0c':
|
||||||
|
self.dirty = True
|
||||||
|
return idx, offset
|
||||||
|
|
||||||
|
def draw(self, writer, idx, offset):
|
||||||
|
"""
|
||||||
|
Draw the current page view to ``writer``.
|
||||||
|
|
||||||
|
:param callable writer: callable writes to output stream, receiving unicode.
|
||||||
|
:param int idx: current page index.
|
||||||
|
:param int offset: scrolling region offset of current page.
|
||||||
|
:returns: tuple of next (idx, offset).
|
||||||
|
:rtype: (int, int)
|
||||||
|
"""
|
||||||
|
# as our screen can be resized while we're mid-calculation,
|
||||||
|
# our self.dirty flag can become re-toggled; because we are
|
||||||
|
# not re-flowing our pagination, we must begin over again.
|
||||||
|
while self.dirty:
|
||||||
|
self.draw_heading(writer)
|
||||||
|
self.dirty = self.STATE_CLEAN
|
||||||
|
(idx, offset), data = self.page_data(idx, offset)
|
||||||
|
for txt in self.page_view(data):
|
||||||
|
writer(txt)
|
||||||
|
self.draw_status(writer, idx)
|
||||||
|
flushout()
|
||||||
|
return idx, offset
|
||||||
|
|
||||||
|
def draw_heading(self, writer):
|
||||||
|
"""
|
||||||
|
Conditionally redraw screen when ``dirty`` attribute is valued REFRESH.
|
||||||
|
|
||||||
|
When Pager attribute ``dirty`` is ``STATE_REFRESH``, cursor is moved
|
||||||
|
to (0,0), screen is cleared, and heading is displayed.
|
||||||
|
|
||||||
|
:param callable writer: callable writes to output stream, receiving unicode.
|
||||||
|
:return: True if class attribute ``dirty`` is ``STATE_REFRESH``.
|
||||||
|
:rtype: bool
|
||||||
|
"""
|
||||||
|
if self.dirty == self.STATE_REFRESH:
|
||||||
|
writer(''.join(
|
||||||
|
(self.term.home, self.term.clear,
|
||||||
|
self.screen.msg_intro(version=self.unicode_version), '\n',
|
||||||
|
self.screen.header, '\n',)))
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
def draw_status(self, writer, idx):
|
||||||
|
"""
|
||||||
|
Conditionally draw status bar when output terminal is a tty.
|
||||||
|
|
||||||
|
:param callable writer: callable writes to output stream, receiving unicode.
|
||||||
|
:param int idx: current page position index.
|
||||||
|
:type idx: int
|
||||||
|
"""
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
writer(self.term.hide_cursor())
|
||||||
|
style = self.screen.style
|
||||||
|
writer(self.term.move(self.term.height - 1))
|
||||||
|
if idx == self.last_page:
|
||||||
|
last_end = '(END)'
|
||||||
|
else:
|
||||||
|
last_end = '/{0}'.format(self.last_page)
|
||||||
|
txt = ('Page {idx}{last_end} - '
|
||||||
|
'{q} to quit, [keys: {keyset}]'
|
||||||
|
.format(idx=style.attr_minor('{0}'.format(idx)),
|
||||||
|
last_end=style.attr_major(last_end),
|
||||||
|
keyset=style.attr_major('kjfbvc12-='),
|
||||||
|
q=style.attr_minor('q')))
|
||||||
|
writer(self.term.center(txt).rstrip())
|
||||||
|
|
||||||
|
def page_view(self, data):
|
||||||
|
"""
|
||||||
|
Generator yields text to be displayed for the current unicode pageview.
|
||||||
|
|
||||||
|
:param list[(unicode, unicode)] data: The current page's data as tuple
|
||||||
|
of ``(ucs, name)``.
|
||||||
|
:returns: generator for full-page text for display
|
||||||
|
"""
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
yield self.term.move(self.screen.row_begins, 0)
|
||||||
|
# sequence clears to end-of-line
|
||||||
|
clear_eol = self.term.clear_eol
|
||||||
|
# sequence clears to end-of-screen
|
||||||
|
clear_eos = self.term.clear_eos
|
||||||
|
|
||||||
|
# track our current column and row, where column is
|
||||||
|
# the whole segment of unicode value text, and draw
|
||||||
|
# only self.screen.num_columns before end-of-line.
|
||||||
|
#
|
||||||
|
# use clear_eol at end of each row to erase over any
|
||||||
|
# "ghosted" text, and clear_eos at end of screen to
|
||||||
|
# clear the same, especially for the final page which
|
||||||
|
# is often short.
|
||||||
|
col = 0
|
||||||
|
for ucs, name in data:
|
||||||
|
val = self.text_entry(ucs, name)
|
||||||
|
col += 1
|
||||||
|
if col == self.screen.num_columns:
|
||||||
|
col = 0
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
val = ''.join((val, clear_eol, '\n'))
|
||||||
|
else:
|
||||||
|
val = ''.join((val.rstrip(), '\n'))
|
||||||
|
yield val
|
||||||
|
|
||||||
|
if self.term.is_a_tty:
|
||||||
|
yield ''.join((clear_eol, '\n', clear_eos))
|
||||||
|
|
||||||
|
def text_entry(self, ucs, name):
|
||||||
|
"""
|
||||||
|
Display a single column segment row describing ``(ucs, name)``.
|
||||||
|
|
||||||
|
:param str ucs: target unicode point character string.
|
||||||
|
:param str name: name of unicode point.
|
||||||
|
:return: formatted text for display.
|
||||||
|
:rtype: unicode
|
||||||
|
"""
|
||||||
|
style = self.screen.style
|
||||||
|
if len(name) > style.name_len:
|
||||||
|
idx = max(0, style.name_len - len(style.continuation))
|
||||||
|
name = ''.join((name[:idx], style.continuation if idx else ''))
|
||||||
|
if style.alignment == 'right':
|
||||||
|
fmt = ' '.join(('0x{val:0>{ucs_printlen}x}',
|
||||||
|
'{name:<{name_len}s}',
|
||||||
|
'{delimiter}{ucs}{delimiter}'
|
||||||
|
))
|
||||||
|
else:
|
||||||
|
fmt = ' '.join(('{delimiter}{ucs}{delimiter}',
|
||||||
|
'0x{val:0>{ucs_printlen}x}',
|
||||||
|
'{name:<{name_len}s}'))
|
||||||
|
delimiter = style.attr_minor(style.delimiter)
|
||||||
|
if len(ucs) != 1:
|
||||||
|
# determine display of combining characters
|
||||||
|
val = ord(ucs[1])
|
||||||
|
# a combining character displayed of any fg color
|
||||||
|
# will reset the foreground character of the cell
|
||||||
|
# combined with (iTerm2, OSX).
|
||||||
|
disp_ucs = style.attr_major(ucs[0:2])
|
||||||
|
if len(ucs) > 2:
|
||||||
|
disp_ucs += ucs[2]
|
||||||
|
else:
|
||||||
|
# non-combining
|
||||||
|
val = ord(ucs)
|
||||||
|
disp_ucs = style.attr_major(ucs)
|
||||||
|
|
||||||
|
return fmt.format(name_len=style.name_len,
|
||||||
|
ucs_printlen=UCS_PRINTLEN,
|
||||||
|
delimiter=delimiter,
|
||||||
|
name=name,
|
||||||
|
ucs=disp_ucs,
|
||||||
|
val=val)
|
||||||
|
|
||||||
|
|
||||||
|
def validate_args(opts):
|
||||||
|
"""Validate and return options provided by docopt parsing."""
|
||||||
|
if opts['--wide'] is None:
|
||||||
|
opts['--wide'] = 2
|
||||||
|
else:
|
||||||
|
assert opts['--wide'] in ("1", "2"), opts['--wide']
|
||||||
|
if opts['--alignment'] is None:
|
||||||
|
opts['--alignment'] = 'left'
|
||||||
|
else:
|
||||||
|
assert opts['--alignment'] in ('left', 'right'), opts['--alignment']
|
||||||
|
opts['--wide'] = int(opts['--wide'])
|
||||||
|
opts['character_factory'] = WcWideCharacterGenerator
|
||||||
|
if opts['--combining']:
|
||||||
|
opts['character_factory'] = WcCombinedCharacterGenerator
|
||||||
|
return opts
|
||||||
|
|
||||||
|
|
||||||
|
def main(opts):
|
||||||
|
"""Program entry point."""
|
||||||
|
term = blessed.Terminal()
|
||||||
|
style = Style()
|
||||||
|
|
||||||
|
# if the terminal supports colors, use a Style instance with some
|
||||||
|
# standout colors (magenta, cyan).
|
||||||
|
if term.number_of_colors:
|
||||||
|
style = Style(attr_major=term.magenta,
|
||||||
|
attr_minor=term.bright_cyan,
|
||||||
|
alignment=opts['--alignment'])
|
||||||
|
style.name_len = 10
|
||||||
|
|
||||||
|
screen = Screen(term, style, wide=opts['--wide'])
|
||||||
|
pager = Pager(term, screen, opts['character_factory'])
|
||||||
|
|
||||||
|
with term.location(), term.cbreak(), \
|
||||||
|
term.fullscreen(), term.hidden_cursor():
|
||||||
|
pager.run(writer=echo, reader=term.inkey)
|
||||||
|
return 0
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
sys.exit(main(validate_args(docopt.docopt(__doc__))))
|
|
@ -0,0 +1,138 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
# coding: utf-8
|
||||||
|
"""
|
||||||
|
Manual tests comparing wcwidth.py to libc's wcwidth(3) and wcswidth(3).
|
||||||
|
|
||||||
|
https://github.com/jquast/wcwidth
|
||||||
|
|
||||||
|
This suite of tests compares the libc return values with the pure-python return
|
||||||
|
values. Although wcwidth(3) is POSIX, its actual implementation may differ,
|
||||||
|
so these tests are not guaranteed to be successful on all platforms, especially
|
||||||
|
where wcwidth(3)/wcswidth(3) is out of date. This is especially true for many
|
||||||
|
platforms -- usually conforming only to unicode specification 1.0 or 2.0.
|
||||||
|
|
||||||
|
This program accepts one optional command-line argument, the unicode version
|
||||||
|
level for our library to use when comparing to libc.
|
||||||
|
"""
|
||||||
|
# pylint: disable=C0103
|
||||||
|
# Invalid module name "wcwidth-libc-comparator"
|
||||||
|
|
||||||
|
# standard imports
|
||||||
|
from __future__ import print_function
|
||||||
|
|
||||||
|
# std imports
|
||||||
|
import sys
|
||||||
|
import locale
|
||||||
|
import warnings
|
||||||
|
import ctypes.util
|
||||||
|
import unicodedata
|
||||||
|
|
||||||
|
# local
|
||||||
|
# local imports
|
||||||
|
import wcwidth
|
||||||
|
|
||||||
|
|
||||||
|
def is_named(ucs):
|
||||||
|
"""
|
||||||
|
Whether the unicode point ``ucs`` has a name.
|
||||||
|
|
||||||
|
:rtype: bool
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
return bool(unicodedata.name(ucs))
|
||||||
|
except ValueError:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def is_not_combining(ucs):
|
||||||
|
return not unicodedata.combining(ucs)
|
||||||
|
|
||||||
|
|
||||||
|
def report_ucs_msg(ucs, wcwidth_libc, wcwidth_local):
|
||||||
|
"""
|
||||||
|
Return string report of combining character differences.
|
||||||
|
|
||||||
|
:param ucs: unicode point.
|
||||||
|
:type ucs: unicode
|
||||||
|
:param wcwidth_libc: libc-wcwidth's reported character length.
|
||||||
|
:type wcwidth_libc: int
|
||||||
|
:param wcwidth_local: wcwidth's reported character length.
|
||||||
|
:type wcwidth_local: int
|
||||||
|
:rtype: unicode
|
||||||
|
"""
|
||||||
|
ucp = (ucs.encode('unicode_escape')[2:]
|
||||||
|
.decode('ascii')
|
||||||
|
.upper()
|
||||||
|
.lstrip('0'))
|
||||||
|
url = "http://codepoints.net/U+{}".format(ucp)
|
||||||
|
name = unicodedata.name(ucs)
|
||||||
|
return (u"libc,ours={},{} [--o{}o--] name={} val={} {}"
|
||||||
|
" ".format(wcwidth_libc, wcwidth_local, ucs, name, ord(ucs), url))
|
||||||
|
|
||||||
|
|
||||||
|
# use chr() for py3.x,
|
||||||
|
# unichr() for py2.x
|
||||||
|
try:
|
||||||
|
_ = unichr(0)
|
||||||
|
except NameError as err:
|
||||||
|
if err.args[0] == "name 'unichr' is not defined":
|
||||||
|
# pylint: disable=W0622
|
||||||
|
# Redefining built-in 'unichr' (col 8)
|
||||||
|
|
||||||
|
unichr = chr
|
||||||
|
else:
|
||||||
|
raise
|
||||||
|
|
||||||
|
if sys.maxunicode < 1114111:
|
||||||
|
warnings.warn('narrow Python build, only a small subset of '
|
||||||
|
'characters may be tested.')
|
||||||
|
|
||||||
|
|
||||||
|
def _is_equal_wcwidth(libc, ucs, unicode_version):
|
||||||
|
w_libc = libc.wcwidth(ucs)
|
||||||
|
w_local = wcwidth.wcwidth(ucs, unicode_version)
|
||||||
|
assert w_libc == w_local, report_ucs_msg(ucs, w_libc, w_local)
|
||||||
|
|
||||||
|
|
||||||
|
def main(using_locale=('en_US', 'UTF-8',)):
|
||||||
|
"""
|
||||||
|
Program entry point.
|
||||||
|
|
||||||
|
Load the entire Unicode table into memory, excluding those that:
|
||||||
|
|
||||||
|
- are not named (func unicodedata.name raises ValueError),
|
||||||
|
- are combining characters.
|
||||||
|
|
||||||
|
Using ``locale``, for each unicode character string compare libc's
|
||||||
|
wcwidth with local wcwidth.wcwidth() function; when they differ,
|
||||||
|
report a detailed AssertionError to stdout.
|
||||||
|
"""
|
||||||
|
all_ucs = (ucs for ucs in
|
||||||
|
[unichr(val) for val in range(sys.maxunicode)]
|
||||||
|
if is_named(ucs) and is_not_combining(ucs))
|
||||||
|
|
||||||
|
libc_name = ctypes.util.find_library('c')
|
||||||
|
if not libc_name:
|
||||||
|
raise ImportError("Can't find C library.")
|
||||||
|
|
||||||
|
libc = ctypes.cdll.LoadLibrary(libc_name)
|
||||||
|
libc.wcwidth.argtypes = [ctypes.c_wchar, ]
|
||||||
|
libc.wcwidth.restype = ctypes.c_int
|
||||||
|
|
||||||
|
assert getattr(libc, 'wcwidth', None) is not None
|
||||||
|
assert getattr(libc, 'wcswidth', None) is not None
|
||||||
|
|
||||||
|
locale.setlocale(locale.LC_ALL, using_locale)
|
||||||
|
unicode_version = 'latest'
|
||||||
|
if len(sys.argv) > 1:
|
||||||
|
unicode_version = sys.argv[1]
|
||||||
|
|
||||||
|
for ucs in all_ucs:
|
||||||
|
try:
|
||||||
|
_is_equal_wcwidth(libc, ucs, unicode_version)
|
||||||
|
except AssertionError as err:
|
||||||
|
print(err)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
|
|
@ -0,0 +1,38 @@
|
||||||
|
==========
|
||||||
|
Public API
|
||||||
|
==========
|
||||||
|
|
||||||
|
This package follows SEMVER_ rules for versioning; therefore, for all of the
|
||||||
|
given function signatures, at example version 1.1.1, you may use version
|
||||||
|
dependency ``>=1.1.1,<2.0`` for forward compatibility of future wcwidth
|
||||||
|
versions.
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth.wcwidth
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth.wcswidth
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth.list_versions
|
||||||
|
|
||||||
|
.. _SEMVER: https://semver.org
|
||||||
|
|
||||||
|
===========
|
||||||
|
Private API
|
||||||
|
===========
|
||||||
|
|
||||||
|
These functions should only be used for wcwidth development, and not used by
|
||||||
|
dependent packages except with care and by use of frozen version dependency,
|
||||||
|
as these functions may change names, signatures, or disappear entirely at any
|
||||||
|
time in the future, and such changes are not reflected by SEMVER rules.
|
||||||
|
|
||||||
|
If a stable public API for any of the given functions is needed, please suggest a
|
||||||
|
Pull Request!
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth._bisearch
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth._wcversion_value
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth._wcmatch_version
|
||||||
|
|
||||||
|
.. autofunction:: wcwidth._get_package_version
|
||||||
|
|
|
@ -0,0 +1,178 @@
|
||||||
|
#!/usr/bin/env python3
|
||||||
|
# -*- coding: utf-8 -*-
|
||||||
|
#
|
||||||
|
# wcwidth documentation build configuration file, created by
|
||||||
|
# sphinx-quickstart on Fri Oct 20 15:18:02 2017.
|
||||||
|
#
|
||||||
|
# This file is execfile()d with the current directory set to its
|
||||||
|
# containing dir.
|
||||||
|
#
|
||||||
|
# Note that not all possible configuration values are present in this
|
||||||
|
# autogenerated file.
|
||||||
|
#
|
||||||
|
# All configuration values have a default; values that are commented out
|
||||||
|
# serve to show the default.
|
||||||
|
|
||||||
|
# If extensions (or modules to document with autodoc) are in another directory,
|
||||||
|
# add these directories to sys.path here. If the directory is relative to the
|
||||||
|
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||||
|
#
|
||||||
|
# import os
|
||||||
|
# import sys
|
||||||
|
# sys.path.insert(0, os.path.abspath('.'))
|
||||||
|
|
||||||
|
# local
|
||||||
|
# 3rd-party imports
|
||||||
|
import wcwidth
|
||||||
|
|
||||||
|
# -- General configuration ------------------------------------------------
|
||||||
|
|
||||||
|
# If your documentation needs a minimal Sphinx version, state it here.
|
||||||
|
#
|
||||||
|
# needs_sphinx = '1.0'
|
||||||
|
|
||||||
|
# Add any Sphinx extension module names here, as strings. They can be
|
||||||
|
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||||
|
# ones.
|
||||||
|
extensions = ['sphinx.ext.autodoc',
|
||||||
|
'sphinx.ext.doctest',
|
||||||
|
'sphinx.ext.intersphinx',
|
||||||
|
'sphinx.ext.coverage',
|
||||||
|
'sphinx.ext.viewcode']
|
||||||
|
|
||||||
|
# Add any paths that contain templates here, relative to this directory.
|
||||||
|
templates_path = ['_templates']
|
||||||
|
|
||||||
|
# The suffix(es) of source filenames.
|
||||||
|
# You can specify multiple suffixes as a list of strings:
|
||||||
|
#
|
||||||
|
# source_suffix = ['.rst', '.md']
|
||||||
|
source_suffix = '.rst'
|
||||||
|
|
||||||
|
# The master toctree document.
|
||||||
|
master_doc = 'index'
|
||||||
|
|
||||||
|
# General information about the project.
|
||||||
|
project = 'wcwidth'
|
||||||
|
copyright = '2017, Jeff Quast'
|
||||||
|
author = 'Jeff Quast'
|
||||||
|
|
||||||
|
# The version info for the project you're documenting, acts as replacement for
|
||||||
|
# |version| and |release|, also used in various other places throughout the
|
||||||
|
# built documents.
|
||||||
|
#
|
||||||
|
# The short X.Y version,
|
||||||
|
# The full version, including alpha/beta/rc tags.
|
||||||
|
release = version = wcwidth.__version__
|
||||||
|
|
||||||
|
# The language for content autogenerated by Sphinx. Refer to documentation
|
||||||
|
# for a list of supported languages.
|
||||||
|
#
|
||||||
|
# This is also used if you do content translation via gettext catalogs.
|
||||||
|
# Usually you set "language" from the command line for these cases.
|
||||||
|
language = None
|
||||||
|
|
||||||
|
# List of patterns, relative to source directory, that match files and
|
||||||
|
# directories to ignore when looking for source files.
|
||||||
|
# These patterns also affect html_static_path and html_extra_path
|
||||||
|
exclude_patterns = []
|
||||||
|
|
||||||
|
# The name of the Pygments (syntax highlighting) style to use.
|
||||||
|
pygments_style = 'sphinx'
|
||||||
|
|
||||||
|
# If true, `todo` and `todoList` produce output, else they produce nothing.
|
||||||
|
todo_include_todos = False
|
||||||
|
|
||||||
|
|
||||||
|
# -- Options for HTML output ----------------------------------------------
|
||||||
|
|
||||||
|
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||||
|
# a list of builtin themes.
|
||||||
|
#
|
||||||
|
html_theme = 'alabaster'
|
||||||
|
|
||||||
|
# Theme options are theme-specific and customize the look and feel of a theme
|
||||||
|
# further. For a list of options available for each theme, see the
|
||||||
|
# documentation.
|
||||||
|
#
|
||||||
|
# html_theme_options = {}
|
||||||
|
|
||||||
|
# Add any paths that contain custom static files (such as style sheets) here,
|
||||||
|
# relative to this directory. They are copied after the builtin static files,
|
||||||
|
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||||
|
html_static_path = ['_static']
|
||||||
|
|
||||||
|
# Custom sidebar templates, must be a dictionary that maps document names
|
||||||
|
# to template names.
|
||||||
|
#
|
||||||
|
# This is required for the alabaster theme
|
||||||
|
# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
|
||||||
|
# html_sidebars = {
|
||||||
|
# '**': [
|
||||||
|
# 'about.html',
|
||||||
|
# 'navigation.html',
|
||||||
|
# 'relations.html', # needs 'show_related': True theme option to display
|
||||||
|
# 'searchbox.html',
|
||||||
|
# 'donate.html',
|
||||||
|
# ]
|
||||||
|
# }
|
||||||
|
|
||||||
|
|
||||||
|
# -- Options for HTMLHelp output ------------------------------------------
|
||||||
|
|
||||||
|
# Output file base name for HTML help builder.
|
||||||
|
htmlhelp_basename = 'wcwidthdoc'
|
||||||
|
|
||||||
|
|
||||||
|
# -- Options for LaTeX output ---------------------------------------------
|
||||||
|
|
||||||
|
latex_elements = {
|
||||||
|
# The paper size ('letterpaper' or 'a4paper').
|
||||||
|
#
|
||||||
|
# 'papersize': 'letterpaper',
|
||||||
|
|
||||||
|
# The font size ('10pt', '11pt' or '12pt').
|
||||||
|
#
|
||||||
|
# 'pointsize': '10pt',
|
||||||
|
|
||||||
|
# Additional stuff for the LaTeX preamble.
|
||||||
|
#
|
||||||
|
# 'preamble': '',
|
||||||
|
|
||||||
|
# Latex figure (float) alignment
|
||||||
|
#
|
||||||
|
# 'figure_align': 'htbp',
|
||||||
|
}
|
||||||
|
|
||||||
|
# Grouping the document tree into LaTeX files. List of tuples
|
||||||
|
# (source start file, target name, title,
|
||||||
|
# author, documentclass [howto, manual, or own class]).
|
||||||
|
latex_documents = [
|
||||||
|
(master_doc, 'wcwidth.tex', 'wcwidth Documentation',
|
||||||
|
'Jeff Quast', 'manual'),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
# -- Options for manual page output ---------------------------------------
|
||||||
|
|
||||||
|
# One entry per manual page. List of tuples
|
||||||
|
# (source start file, name, description, authors, manual section).
|
||||||
|
man_pages = [
|
||||||
|
(master_doc, 'wcwidth', 'wcwidth Documentation',
|
||||||
|
[author], 1)
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
# -- Options for Texinfo output -------------------------------------------
|
||||||
|
|
||||||
|
# Grouping the document tree into Texinfo files. List of tuples
|
||||||
|
# (source start file, target name, title, author,
|
||||||
|
# dir menu entry, description, category)
|
||||||
|
texinfo_documents = [
|
||||||
|
(master_doc, 'wcwidth', 'wcwidth Documentation',
|
||||||
|
author, 'wcwidth', 'One line description of project.',
|
||||||
|
'Miscellaneous'),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
intersphinx_mapping = {'python': ('https://docs.python.org/3', None)}
|
|
@ -0,0 +1,15 @@
|
||||||
|
wcwidth
|
||||||
|
=======
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
|
||||||
|
intro
|
||||||
|
unicode_version
|
||||||
|
api
|
||||||
|
|
||||||
|
Indices and tables
|
||||||
|
------------------
|
||||||
|
|
||||||
|
* :ref:`genindex`
|
||||||
|
* :ref:`modindex`
|
||||||
|
* :ref:`search`
|
|
@ -0,0 +1,280 @@
|
||||||
|
|pypi_downloads| |codecov| |license|
|
||||||
|
|
||||||
|
============
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
This library is mainly for CLI programs that carefully produce output for
|
||||||
|
terminals, or that pretend to be a terminal emulator.
|
||||||
|
|
||||||
|
**Problem Statement**: The printable length of *most* strings is equal to the
|
||||||
|
number of cells they occupy on the screen, ``1 character : 1 cell``. However,
|
||||||
|
there are categories of characters that *occupy 2 cells* (full-width), and
|
||||||
|
others that *occupy 0* cells (zero-width).
|
||||||
|
|
||||||
|
**Solution**: POSIX.1-2001 and POSIX.1-2008 conforming systems provide
|
||||||
|
`wcwidth(3)`_ and `wcswidth(3)`_ C functions, which this Python module's
|
||||||
|
functions precisely copy. *These functions return the number of cells a
|
||||||
|
unicode string is expected to occupy.*
|
||||||
|
|
||||||
|
Installation
|
||||||
|
------------
|
||||||
|
|
||||||
|
The stable version of this package is maintained on PyPI; install it using pip::
|
||||||
|
|
||||||
|
pip install wcwidth
|
||||||
|
|
||||||
|
Example
|
||||||
|
-------
|
||||||
|
|
||||||
|
**Problem**: given the following phrase (Japanese),
|
||||||
|
|
||||||
|
>>> text = u'コンニチハ'
|
||||||
|
|
||||||
|
Python **incorrectly** uses the *string length* of 5 codepoints rather than the
|
||||||
|
*printable length* of 10 cells, so that when using the `rjust` function, the
|
||||||
|
output length is wrong::
|
||||||
|
|
||||||
|
>>> print(len('コンニチハ'))
|
||||||
|
5
|
||||||
|
|
||||||
|
>>> print('コンニチハ'.rjust(20, '_'))
|
||||||
|
_______________コンニチハ
|
||||||
|
|
||||||
|
By defining our own "rjust" function that uses wcwidth, we can correct this::
|
||||||
|
|
||||||
|
>>> def wc_rjust(text, length, padding=' '):
|
||||||
|
... from wcwidth import wcswidth
|
||||||
|
... return padding * max(0, (length - wcswidth(text))) + text
|
||||||
|
...
|
||||||
|
|
||||||
|
Our **Solution** uses wcswidth to determine the string length correctly::
|
||||||
|
|
||||||
|
>>> from wcwidth import wcswidth
|
||||||
|
>>> print(wcswidth('コンニチハ'))
|
||||||
|
10
|
||||||
|
|
||||||
|
>>> print(wc_rjust('コンニチハ', 20, '_'))
|
||||||
|
__________コンニチハ
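One caveat with the ``wc_rjust`` definition above: ``wcswidth()`` returns ``-1`` when a string contains a non-printable character, in which case ``length - wcswidth(text)`` produces one padding cell too many. A minimal sketch of a stricter variant follows. The name ``wc_just`` and its ``align`` and ``width_fn`` parameters are assumptions made for illustration and are not part of this library; ``width_fn`` only exists so that the width function can be swapped out, and ``two_cells_each`` is a stand-in that is valid only for all-wide text.

```python
def wc_just(text, length, padding=' ', align='right', width_fn=None):
    """Pad ``text`` to ``length`` terminal cells (hypothetical helper)."""
    if align not in ('left', 'right'):
        raise ValueError('align must be "left" or "right"')
    if width_fn is None:
        # Deferred import: only needed when no width function is injected.
        from wcwidth import wcswidth as width_fn
    cells = width_fn(text)
    if cells < 0:
        # wcswidth() reports -1 when any character is not printable.
        raise ValueError('text contains a non-printable character')
    fill = padding * max(0, length - cells)
    return fill + text if align == 'right' else text + fill


def two_cells_each(text):
    # Stand-in for wcswidth(), valid only for all-wide text such as katakana.
    return 2 * len(text)


print(wc_just('コンニチハ', 20, '_', width_fn=two_cells_each))
print(wc_just('コンニチハ', 20, '_', align='left', width_fn=two_cells_each))
```

Raising an error is only one possible policy; a caller might instead strip or replace non-printable characters before measuring.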
|
||||||
|
|
||||||
|
|
||||||
|
Choosing a Version
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Export an environment variable, ``UNICODE_VERSION``. This should be done by
|
||||||
|
*terminal emulators* or those developers experimenting with authoring one of
|
||||||
|
their own, from shell::
|
||||||
|
|
||||||
|
$ export UNICODE_VERSION=13.0
|
||||||
|
|
||||||
|
If unspecified, the latest version is used. If your Terminal Emulator does not
|
||||||
|
export this variable, you can use the `jquast/ucs-detect`_ utility to
|
||||||
|
automatically detect and export it to your shell.
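A program that wants to honor the same variable might read it as sketched below. This assumes only what is described above, namely that ``UNICODE_VERSION`` holds a dotted version string such as ``13.0`` and that the latest version applies when it is unset. The helper names ``chosen_unicode_version`` and ``version_tuple`` are hypothetical and are not part of this library's API.

```python
import os


def chosen_unicode_version(environ=None):
    """Return the UNICODE_VERSION value, or 'latest' when unset or empty."""
    if environ is None:
        environ = os.environ
    return environ.get('UNICODE_VERSION') or 'latest'


def version_tuple(version):
    """Turn a dotted version such as '6.3.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


print(chosen_unicode_version({'UNICODE_VERSION': '13.0'}))
print(chosen_unicode_version({}))
print(version_tuple('13.0') > version_tuple('6.3.0'))
```

Comparing tuples of integers rather than raw strings avoids ordering ``13.0`` before ``6.3.0``, which a plain string comparison would do.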
|
||||||
|
|
||||||
|
wcwidth, wcswidth
|
||||||
|
-----------------
|
||||||
|
Use function ``wcwidth()`` to determine the length of a *single unicode
|
||||||
|
character*, and ``wcswidth()`` to determine the length of many, a *string
|
||||||
|
of unicode characters*.
|
||||||
|
|
||||||
|
Briefly, return values of function ``wcwidth()`` are:
|
||||||
|
|
||||||
|
``-1``
|
||||||
|
Indeterminate (not printable).
|
||||||
|
|
||||||
|
``0``
|
||||||
|
Does not advance the cursor, such as NULL or Combining.
|
||||||
|
|
||||||
|
``2``
|
||||||
|
Characters of category East Asian Wide (W) or East Asian
|
||||||
|
Full-width (F) which are displayed using two terminal cells.
|
||||||
|
|
||||||
|
``1``
|
||||||
|
All others.
|
||||||
|
|
||||||
|
Function ``wcswidth()`` simply returns the sum of all values for each character
|
||||||
|
along a string, or ``-1`` if any character in the string returns ``-1``.
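The return-value rules above can be approximated with the standard library alone. The following is a minimal sketch and not this library's implementation: the names ``sketch_wcwidth`` and ``sketch_wcswidth`` are hypothetical, and the classification relies on whichever Unicode version the running Python's ``unicodedata`` module ships, so results may differ from ``wcwidth.wcwidth()`` for some characters.

```python
import unicodedata


def sketch_wcwidth(char):
    """Approximate the cell width of one character (hypothetical helper)."""
    if char == '\x00':
        return 0        # NULL does not advance the cursor
    category = unicodedata.category(char)
    if category == 'Cc':
        return -1       # other control characters: not printable
    if category in ('Mn', 'Me'):
        return 0        # combining marks occupy no cell of their own
    if unicodedata.east_asian_width(char) in ('W', 'F'):
        return 2        # East Asian Wide or Full-width
    return 1            # all others


def sketch_wcswidth(text):
    """Sum per-character widths, or return -1 if any character is -1."""
    total = 0
    for char in text:
        width = sketch_wcwidth(char)
        if width < 0:
            return -1
        total += width
    return total


print(sketch_wcswidth('コンニチハ'))     # five wide characters
print(sketch_wcswidth('cafe\u0301'))    # combining acute accent adds no cell
print(sketch_wcswidth('bell\x07'))      # contains a control character
```

The three calls print ``10``, ``4`` and ``-1`` respectively, matching the categories listed above.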
|
||||||
|
|
||||||
|
Full API Documentation at http://wcwidth.readthedocs.org
|
||||||
|
|
||||||
|
==========
|
||||||
|
Developing
|
||||||
|
==========
|
||||||
|
|
||||||
|
Install wcwidth in editable mode::
|
||||||
|
|
||||||
|
pip install -e .
|
||||||
|
|
||||||
|
Execute unit tests using tox_::
|
||||||
|
|
||||||
|
tox
|
||||||
|
|
||||||
|
Regenerate python code tables from latest Unicode Specification data files::
|
||||||
|
|
||||||
|
tox -eupdate
|
||||||
|
|
||||||
|
Supplementary tools for browsing and testing terminals for wide unicode
|
||||||
|
characters are found in the `bin/`_ of this project's source code. Just ensure
|
||||||
|
to first ``pip install -r requirements-develop.txt`` from this project's main
|
||||||
|
folder. For example, an interactive browser for testing::
|
||||||
|
|
||||||
|
./bin/wcwidth-browser.py
|
||||||
|
|
||||||
|
Uses
|
||||||
|
----
|
||||||
|
|
||||||
|
This library is used in:
|
||||||
|
|
||||||
|
- `jquast/blessed`_: a thin, practical wrapper around terminal capabilities in
|
||||||
|
Python.
|
||||||
|
|
||||||
|
- `jonathanslenders/python-prompt-toolkit`_: a Library for building powerful
|
||||||
|
interactive command lines in Python.
|
||||||
|
|
||||||
|
- `dbcli/pgcli`_: Postgres CLI with autocompletion and syntax highlighting.
|
||||||
|
|
||||||
|
- `thomasballinger/curtsies`_: a Curses-like terminal wrapper with a display
|
||||||
|
based on compositing 2d arrays of text.
|
||||||
|
|
||||||
|
- `selectel/pyte`_: Simple VTXXX-compatible linux terminal emulator.
|
||||||
|
|
||||||
|
- `astanin/python-tabulate`_: Pretty-print tabular data in Python, a library
|
||||||
|
and a command-line utility.
|
||||||
|
|
||||||
|
- `LuminosoInsight/python-ftfy`_: Fixes mojibake and other glitches in Unicode
|
||||||
|
text.
|
||||||
|
|
||||||
|
- `nbedos/termtosvg`_: Terminal recorder that renders sessions as SVG
|
||||||
|
animations.
|
||||||
|
|
||||||
|
- `peterbrittain/asciimatics`_: Package to help people create full-screen text
|
||||||
|
UIs.
|
||||||
|
|
||||||
|
Other Languages
|
||||||
|
---------------
|
||||||
|
|
||||||
|
- `timoxley/wcwidth`_: JavaScript
|
||||||
|
- `janlelis/unicode-display_width`_: Ruby
|
||||||
|
- `alecrabbit/php-wcwidth`_: PHP
|
||||||
|
- `Text::CharWidth`_: Perl
|
||||||
|
- `bluebear94/Terminal-WCWidth`_: Perl 6
|
||||||
|
- `mattn/go-runewidth`_: Go
|
||||||
|
- `emugel/wcwidth`_: Haxe
|
||||||
|
- `aperezdc/lua-wcwidth`_: Lua
|
||||||
|
- `joachimschmidt557/zig-wcwidth`: Zig
|
||||||
|
- `fumiyas/wcwidth-cjk`_: `LD_PRELOAD` override
|
||||||
|
- `joshuarubin/wcwidth9`: Unicode version 9 in C
|
||||||
|
|
||||||
|
History
|
||||||
|
-------
|
||||||
|
|
||||||
|
0.2.0 *2020-06-01*
|
||||||
|
* **Enhancement**: Unicode version may be selected by exporting the
|
||||||
|
Environment variable ``UNICODE_VERSION``, such as ``13.0``, or ``6.3.0``.
|
||||||
|
See the `jquast/ucs-detect`_ CLI utility for automatic detection.
|
||||||
|
* **Enhancement**:
|
||||||
|
API Documentation is published to readthedocs.org.
|
||||||
|
* **Updated** tables for *all* Unicode Specifications with files
|
||||||
|
published in a programmatically consumable format, versions 4.1.0
|
||||||
|
through 13.0.
|
||||||
|
|
||||||
|
0.1.9 *2020-03-22*
|
||||||
|
* **Performance** optimization by `Avram Lubkin`_, `PR #35`_.
|
||||||
|
* **Updated** tables to Unicode Specification 13.0.0.
|
||||||
|
|
||||||
|
0.1.8 *2020-01-01*
|
||||||
|
* **Updated** tables to Unicode Specification 12.0.0. (`PR #30`_).
|
||||||
|
|
||||||
|
0.1.7 *2016-07-01*
|
||||||
|
* **Updated** tables to Unicode Specification 9.0.0. (`PR #18`_).
|
||||||
|
|
||||||
|
0.1.6 *2016-01-08 Production/Stable*
|
||||||
|
* ``LICENSE`` file now included with distribution.
|
||||||
|
|
||||||
|
0.1.5 *2015-09-13 Alpha*
|
||||||
|
* **Bugfix**:
|
||||||
|
Resolution of "combining_ character width" issue, most especially
|
||||||
|
those that previously returned -1 now often (correctly) return 0.
|
||||||
|
Resolved by `Philip Craig`_ via `PR #11`_.
|
||||||
|
* **Deprecated**:
|
||||||
|
The module path ``wcwidth.table_comb`` is no longer available,
|
||||||
|
it has been superseded by module path ``wcwidth.table_zero``.
|
||||||
|
|
||||||
|
0.1.4 *2014-11-20 Pre-Alpha*
|
||||||
|
* **Feature**: ``wcswidth()`` now determines printable length
|
||||||
|
for (most) combining_ characters. The developer's tool
|
||||||
|
`bin/wcwidth-browser.py`_ is improved to display combining_
|
||||||
|
characters when provided the ``--combining`` option
|
||||||
|
(`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_).
|
||||||
|
* **Feature**: added static analysis (prospector_) to testing
|
||||||
|
framework.
|
||||||
|
|
||||||
|
0.1.3 *2014-10-29 Pre-Alpha*
|
||||||
|
* **Bugfix**: 2nd parameter of wcswidth was not honored.
|
||||||
|
(`Thomas Ballinger`_, `PR #4`_).
|
||||||
|
|
||||||
|
0.1.2 *2014-10-28 Pre-Alpha*
|
||||||
|
* **Updated** tables to Unicode Specification 7.0.0.
|
||||||
|
(`Thomas Ballinger`_, `PR #3`_).
|
||||||
|
|
||||||
|
0.1.1 *2014-05-14 Pre-Alpha*
|
||||||
|
* Initial release to pypi, based on Unicode Specification 6.3.0
|
||||||
|
|
||||||
|
This code was originally derived directly from C code of the same name,
|
||||||
|
whose latest version is available at
|
||||||
|
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c::
|
||||||
|
|
||||||
|
* Markus Kuhn -- 2007-05-26 (Unicode 5.0)
|
||||||
|
*
|
||||||
|
* Permission to use, copy, modify, and distribute this software
|
||||||
|
* for any purpose and without fee is hereby granted. The author
|
||||||
|
* disclaims all warranties with regard to this software.
|
||||||
|
|
||||||
|
.. _`tox`: https://testrun.org/tox/latest/install.html
|
||||||
|
.. _`prospector`: https://github.com/landscapeio/prospector
|
||||||
|
.. _`combining`: https://en.wikipedia.org/wiki/Combining_character
|
||||||
|
.. _`bin/`: https://github.com/jquast/wcwidth/tree/master/bin
|
||||||
|
.. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
|
||||||
|
.. _`Thomas Ballinger`: https://github.com/thomasballinger
|
||||||
|
.. _`Leta Montopoli`: https://github.com/lmontopo
|
||||||
|
.. _`Philip Craig`: https://github.com/philipc
|
||||||
|
.. _`PR #3`: https://github.com/jquast/wcwidth/pull/3
|
||||||
|
.. _`PR #4`: https://github.com/jquast/wcwidth/pull/4
|
||||||
|
.. _`PR #5`: https://github.com/jquast/wcwidth/pull/5
|
||||||
|
.. _`PR #11`: https://github.com/jquast/wcwidth/pull/11
|
||||||
|
.. _`PR #18`: https://github.com/jquast/wcwidth/pull/18
|
||||||
|
.. _`PR #30`: https://github.com/jquast/wcwidth/pull/30
|
||||||
|
.. _`PR #35`: https://github.com/jquast/wcwidth/pull/35
|
||||||
|
.. _`jquast/blessed`: https://github.com/jquast/blessed
|
||||||
|
.. _`selectel/pyte`: https://github.com/selectel/pyte
|
||||||
|
.. _`thomasballinger/curtsies`: https://github.com/thomasballinger/curtsies
|
||||||
|
.. _`dbcli/pgcli`: https://github.com/dbcli/pgcli
|
||||||
|
.. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit
|
||||||
|
.. _`timoxley/wcwidth`: https://github.com/timoxley/wcwidth
|
||||||
|
.. _`wcwidth(3)`: http://man7.org/linux/man-pages/man3/wcwidth.3.html
|
||||||
|
.. _`wcswidth(3)`: http://man7.org/linux/man-pages/man3/wcswidth.3.html
|
||||||
|
.. _`astanin/python-tabulate`: https://github.com/astanin/python-tabulate
|
||||||
|
.. _`janlelis/unicode-display_width`: https://github.com/janlelis/unicode-display_width
|
||||||
|
.. _`LuminosoInsight/python-ftfy`: https://github.com/LuminosoInsight/python-ftfy
|
||||||
|
.. _`alecrabbit/php-wcwidth`: https://github.com/alecrabbit/php-wcwidth
|
||||||
|
.. _`Text::CharWidth`: https://metacpan.org/pod/Text::CharWidth
|
||||||
|
.. _`bluebear94/Terminal-WCWidth`: https://github.com/bluebear94/Terminal-WCWidth
|
||||||
|
.. _`mattn/go-runewidth`: https://github.com/mattn/go-runewidth
|
||||||
|
.. _`emugel/wcwidth`: https://github.com/emugel/wcwidth
|
||||||
|
.. _`jquast/ucs-detect`: https://github.com/jquast/ucs-detect
|
||||||
|
.. _`Avram Lubkin`: https://github.com/avylove
|
||||||
|
.. _`nbedos/termtosvg`: https://github.com/nbedos/termtosvg
|
||||||
|
.. _`peterbrittain/asciimatics`: https://github.com/peterbrittain/asciimatics
|
||||||
|
.. _`aperezdc/lua-wcwidth`: https://github.com/aperezdc/lua-wcwidth
|
||||||
|
.. _`fumiyas/wcwidth-cjk`: https://github.com/fumiyas/wcwidth-cjk
|
||||||
|
.. |pypi_downloads| image:: https://img.shields.io/pypi/dm/wcwidth.svg?logo=pypi
|
||||||
|
:alt: Downloads
|
||||||
|
:target: https://pypi.org/project/wcwidth/
|
||||||
|
.. |codecov| image:: https://codecov.io/gh/jquast/wcwidth/branch/master/graph/badge.svg
|
||||||
|
:alt: codecov.io Code Coverage
|
||||||
|
:target: https://codecov.io/gh/jquast/wcwidth/
|
||||||
|
.. |license| image:: https://img.shields.io/github/license/jquast/wcwidth.svg
|
||||||
|
:target: https://pypi.python.org/pypi/wcwidth/
|
||||||
|
:alt: MIT License
|
|
@ -0,0 +1,4 @@
|
||||||
|
Sphinx
|
||||||
|
sphinx-paramlinks
|
||||||
|
sphinx_rtd_theme
|
||||||
|
sphinxcontrib-manpage
|
|
@ -0,0 +1,104 @@
|
||||||
|
=====================
|
||||||
|
Unicode release files
|
||||||
|
=====================
|
||||||
|
|
||||||
|
This library aims to be forward-looking, portable, and most correct.
|
||||||
|
The most current release of this API is based on the Unicode Standard
|
||||||
|
release files:
|
||||||
|
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-4.1.0.txt``
|
||||||
|
*Date: 2005-02-26, 02:35:50 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-5.0.0.txt``
|
||||||
|
*Date: 2006-02-27, 23:41:27 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-5.1.0.txt``
|
||||||
|
*Date: 2008-03-20, 17:54:57 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-5.2.0.txt``
|
||||||
|
*Date: 2009-08-22, 04:58:21 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-6.0.0.txt``
|
||||||
|
*Date: 2010-08-19, 00:48:09 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-6.1.0.txt``
|
||||||
|
*Date: 2011-11-27, 05:10:22 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-6.2.0.txt``
|
||||||
|
*Date: 2012-05-20, 00:42:34 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-6.3.0.txt``
|
||||||
|
*Date: 2013-07-05, 14:08:45 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-7.0.0.txt``
|
||||||
|
*Date: 2014-02-07, 18:42:12 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-8.0.0.txt``
|
||||||
|
*Date: 2015-02-13, 13:47:11 GMT [MD]*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-9.0.0.txt``
|
||||||
|
*Date: 2016-06-01, 10:34:26 GMT*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-10.0.0.txt``
|
||||||
|
*Date: 2017-03-08, 08:41:49 GMT*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-11.0.0.txt``
|
||||||
|
*Date: 2018-02-21, 05:34:04 GMT*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-12.0.0.txt``
|
||||||
|
*Date: 2019-01-22, 08:18:28 GMT*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-12.1.0.txt``
|
||||||
|
*Date: 2019-03-10, 10:53:08 GMT*
|
||||||
|
|
||||||
|
``DerivedGeneralCategory-13.0.0.txt``
|
||||||
|
*Date: 2019-10-21, 14:30:32 GMT*
|
||||||
|
|
||||||
|
``EastAsianWidth-4.1.0.txt``
|
||||||
|
*Date: 2005-03-17, 15:21:00 PST [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-5.0.0.txt``
|
||||||
|
*Date: 2006-02-15, 14:39:00 PST [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-5.1.0.txt``
|
||||||
|
*Date: 2008-03-20, 17:42:00 PDT [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-5.2.0.txt``
|
||||||
|
*Date: 2009-06-09, 17:47:00 PDT [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-6.0.0.txt``
|
||||||
|
*Date: 2010-08-17, 12:17:00 PDT [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-6.1.0.txt``
|
||||||
|
*Date: 2011-09-19, 18:46:00 GMT [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-6.2.0.txt``
|
||||||
|
*Date: 2012-05-15, 18:30:00 GMT [KW]*
|
||||||
|
|
||||||
|
``EastAsianWidth-6.3.0.txt``
|
||||||
|
*Date: 2013-02-05, 20:09:00 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-7.0.0.txt``
|
||||||
|
*Date: 2014-02-28, 23:15:00 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-8.0.0.txt``
|
||||||
|
*Date: 2015-02-10, 21:00:00 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-9.0.0.txt``
|
||||||
|
*Date: 2016-05-27, 17:00:00 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-10.0.0.txt``
|
||||||
|
*Date: 2017-03-08, 02:00:00 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-11.0.0.txt``
|
||||||
|
*Date: 2018-05-14, 09:41:59 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-12.0.0.txt``
|
||||||
|
*Date: 2019-01-21, 14:12:58 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-12.1.0.txt``
|
||||||
|
*Date: 2019-03-31, 22:01:58 GMT [KW, LI]*
|
||||||
|
|
||||||
|
``EastAsianWidth-13.0.0.txt``
|
||||||
|
*Date: 2029-01-21, 18:14:00 GMT [KW, LI]*
|
|
@ -0,0 +1,10 @@
|
||||||
|
[bdist_wheel]
|
||||||
|
universal = 1
|
||||||
|
|
||||||
|
[metadata]
|
||||||
|
license_file = LICENSE
|
||||||
|
|
||||||
|
[egg_info]
|
||||||
|
tag_build =
|
||||||
|
tag_date = 0
|
||||||
|
|
|
@ -0,0 +1,99 @@
|
||||||
|
#!/usr/bin/env python
|
||||||
|
"""
|
||||||
|
Setup.py distribution file for wcwidth.
|
||||||
|
|
||||||
|
https://github.com/jquast/wcwidth
|
||||||
|
"""
|
||||||
|
# std imports
|
||||||
|
import os
|
||||||
|
import codecs
|
||||||
|
|
||||||
|
# 3rd party
|
||||||
|
import setuptools
|
||||||
|
|
||||||
|
|
||||||
|
def _get_here(fname):
|
||||||
|
return os.path.join(os.path.dirname(__file__), fname)
|
||||||
|
|
||||||
|
|
||||||
|
class _SetupUpdate(setuptools.Command):
|
||||||
|
# This is a compatibility shim; some downstream distributions might
|
||||||
|
# still call "setup.py update".
|
||||||
|
#
|
||||||
|
# New entry point is tox, 'tox -eupdate'.
|
||||||
|
description = "Fetch and update unicode code tables"
|
||||||
|
user_options = []
|
||||||
|
|
||||||
|
def initialize_options(self):
|
||||||
|
pass
|
||||||
|
|
||||||
|
def finalize_options(self):
|
||||||
|
pass
|
||||||
|
|
||||||
|
def run(self):
|
||||||
|
import sys
|
||||||
|
import subprocess
|
||||||
|
retcode = subprocess.Popen([
|
||||||
|
sys.executable,
|
||||||
|
_get_here(os.path.join('bin', 'update-tables.py'))]).wait()
|
||||||
|
assert retcode == 0, ('non-zero exit code', retcode)
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
"""Setup.py entry point."""
|
||||||
|
setuptools.setup(
|
||||||
|
name='wcwidth',
|
||||||
|
# NOTE: manually manage __version__ in wcwidth/__init__.py !
|
||||||
|
version='0.2.5',
|
||||||
|
description=(
|
||||||
|
"Measures the displayed width of unicode strings in a terminal"),
|
||||||
|
long_description=codecs.open(
|
||||||
|
_get_here('README.rst'), 'rb', 'utf8').read(),
|
||||||
|
author='Jeff Quast',
|
||||||
|
author_email='contact@jeffquast.com',
|
||||||
|
install_requires=('backports.functools-lru-cache>=1.2.1;'
|
||||||
|
'python_version < "3.2"'),
|
||||||
|
license='MIT',
|
||||||
|
packages=['wcwidth'],
|
||||||
|
url='https://github.com/jquast/wcwidth',
|
||||||
|
package_data={
|
||||||
|
'wcwidth': ['*.json'],
|
||||||
|
'': ['LICENSE', '*.rst'],
|
||||||
|
},
|
||||||
|
zip_safe=True,
|
||||||
|
classifiers=[
|
||||||
|
'Intended Audience :: Developers',
|
||||||
|
'Natural Language :: English',
|
||||||
|
'Development Status :: 5 - Production/Stable',
|
||||||
|
'Environment :: Console',
|
||||||
|
'License :: OSI Approved :: MIT License',
|
||||||
|
'Operating System :: POSIX',
|
||||||
|
'Programming Language :: Python :: 2.7',
|
||||||
|
'Programming Language :: Python :: 3.5',
|
||||||
|
'Programming Language :: Python :: 3.6',
|
||||||
|
'Programming Language :: Python :: 3.7',
|
||||||
|
'Programming Language :: Python :: 3.8',
|
||||||
|
'Topic :: Software Development :: Libraries',
|
||||||
|
'Topic :: Software Development :: Localization',
|
||||||
|
'Topic :: Software Development :: Internationalization',
|
||||||
|
'Topic :: Terminals'
|
||||||
|
],
|
||||||
|
keywords=[
|
||||||
|
'cjk',
|
||||||
|
'combining',
|
||||||
|
'console',
|
||||||
|
'eastasian',
|
||||||
|
'emoji',
|
||||||
|
'emulator',
|
||||||
|
'terminal',
|
||||||
|
'unicode',
|
||||||
|
'wcswidth',
|
||||||
|
'wcwidth',
|
||||||
|
'xterm',
|
||||||
|
],
|
||||||
|
cmdclass={'update': _SetupUpdate},
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
main()
|
|
@ -0,0 +1 @@
|
||||||
|
"""This file intentionally left blank."""
|
|
@ -0,0 +1,154 @@
|
||||||
|
# coding: utf-8
|
||||||
|
"""Core tests for wcwidth module."""
|
||||||
|
# 3rd party
|
||||||
|
import pkg_resources
|
||||||
|
|
||||||
|
# local
|
||||||
|
import wcwidth
|
||||||
|
|
||||||
|
|
||||||
|
def test_package_version():
|
||||||
|
"""wcwidth.__version__ is expected value."""
|
||||||
|
# given,
|
||||||
|
expected = pkg_resources.get_distribution('wcwidth').version
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
result = wcwidth.__version__
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_hello_jp():
|
||||||
|
u"""
|
||||||
|
Width of Japanese phrase: コンニチハ, セカイ!
|
||||||
|
|
||||||
|
Given a phrase of 5 and 3 Katakana characters, joined with
|
||||||
|
3 English-ASCII punctuation characters, totaling 11, this
|
||||||
|
phrase consumes 19 cells of a terminal emulator.
|
||||||
|
"""
|
||||||
|
# given,
|
||||||
|
phrase = u'コンニチハ, セカイ!'
|
||||||
|
expect_length_each = (2, 2, 2, 2, 2, 1, 1, 2, 2, 2, 1)
|
||||||
|
expect_length_phrase = sum(expect_length_each)
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_wcswidth_substr():
|
||||||
|
"""
|
||||||
|
Test wcswidth() optional 2nd parameter, ``n``.
|
||||||
|
|
||||||
|
``n`` determines at which position of the string
|
||||||
|
to stop counting length.
|
||||||
|
"""
|
||||||
|
# given,
|
||||||
|
phrase = u'コンニチハ, セカイ!'
|
||||||
|
end = 7
|
||||||
|
expect_length_each = (2, 2, 2, 2, 2, 1, 1,)
|
||||||
|
expect_length_phrase = sum(expect_length_each)
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, end)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_null_width_0():
|
||||||
|
"""NULL (0) reports width 0."""
|
||||||
|
# given,
|
||||||
|
phrase = u'abc\x00def'
|
||||||
|
expect_length_each = (1, 1, 1, 0, 1, 1, 1)
|
||||||
|
expect_length_phrase = sum(expect_length_each)
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_control_c0_width_negative_1():
|
||||||
|
"""CSI (Control sequence initiate) reports width -1 for ESC."""
|
||||||
|
# given,
|
||||||
|
phrase = u'\x1b[0m'
|
||||||
|
expect_length_each = (-1, 1, 1, 1)
|
||||||
|
expect_length_phrase = -1
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_combining_width():
|
||||||
|
"""Simple test combining reports total width of 4."""
|
||||||
|
# given,
|
||||||
|
phrase = u'--\u05bf--'
|
||||||
|
expect_length_each = (1, 1, 0, 1, 1)
|
||||||
|
expect_length_phrase = 4
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_combining_cafe():
|
||||||
|
u"""Phrase cafe + COMBINING ACUTE ACCENT is café of length 4."""
|
||||||
|
phrase = u"cafe\u0301"
|
||||||
|
expect_length_each = (1, 1, 1, 1, 0)
|
||||||
|
expect_length_phrase = 4
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_combining_enclosing():
|
||||||
|
u"""CYRILLIC CAPITAL LETTER A + COMBINING CYRILLIC HUNDRED THOUSANDS SIGN is А҈ of length 1."""
|
||||||
|
phrase = u"\u0410\u0488"
|
||||||
|
expect_length_each = (1, 0)
|
||||||
|
expect_length_phrase = 1
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
||||||
|
|
||||||
|
|
||||||
|
def test_combining_spacing():
|
||||||
|
u"""Balinese kapal (ship) is ᬓᬨᬮ᭄ of length 4."""
|
||||||
|
phrase = u"\u1B13\u1B28\u1B2E\u1B44"
|
||||||
|
expect_length_each = (1, 1, 1, 1)
|
||||||
|
expect_length_phrase = 4
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
length_each = tuple(map(wcwidth.wcwidth, phrase))
|
||||||
|
length_phrase = wcwidth.wcswidth(phrase, len(phrase))
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert length_each == expect_length_each
|
||||||
|
assert length_phrase == expect_length_phrase
|
|
@ -0,0 +1,184 @@
|
||||||
|
# coding: utf-8
|
||||||
|
"""Unicode version level tests for wcwidth."""
|
||||||
|
# std imports
|
||||||
|
import json
|
||||||
|
import warnings
|
||||||
|
|
||||||
|
# 3rd party
|
||||||
|
import pytest
|
||||||
|
import pkg_resources
|
||||||
|
|
||||||
|
# local
|
||||||
|
import wcwidth
|
||||||
|
|
||||||
|
|
||||||
|
def test_latest():
|
||||||
|
"""wcwidth._wcmatch_version('latest') returns tail item."""
|
||||||
|
# given,
|
||||||
|
expected = wcwidth.list_versions()[-1]
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
result = wcwidth._wcmatch_version('latest')
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_exact_410_str():
|
||||||
|
"""wcwidth._wcmatch_version('4.1.0') returns equal value (str)."""
|
||||||
|
# given,
|
||||||
|
given = expected = '4.1.0'
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_exact_410_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'4.1.0') returns equal value (unicode)."""
|
||||||
|
# given,
|
||||||
|
given = expected = u'4.1.0'
|
||||||
|
|
||||||
|
# exercise,
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_505_str():
|
||||||
|
"""wcwidth._wcmatch_version('5.0.5') returns nearest '5.0.0'. (str)"""
|
||||||
|
# given
|
||||||
|
given, expected = '5.0.5', '5.0.0'
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_505_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'5.0.5') returns nearest u'5.0.0'. (unicode)"""
|
||||||
|
# given
|
||||||
|
given, expected = u'5.0.5', u'5.0.0'
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_lowint40_str():
|
||||||
|
"""wcwidth._wcmatch_version('4.0') returns nearest '4.1.0'."""
|
||||||
|
# given
|
||||||
|
given, expected = '4.0', '4.1.0'
|
||||||
|
warnings.resetwarnings()
|
||||||
|
wcwidth._wcmatch_version.cache_clear()
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
with pytest.warns(UserWarning):
|
||||||
|
# warns that given version is lower than any available
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_lowint40_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'4.0') returns nearest u'4.1.0'."""
|
||||||
|
# given
|
||||||
|
given, expected = u'4.0', u'4.1.0'
|
||||||
|
warnings.resetwarnings()
|
||||||
|
wcwidth._wcmatch_version.cache_clear()
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
with pytest.warns(UserWarning):
|
||||||
|
# warns that given version is lower than any available
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_800_str():
|
||||||
|
"""wcwidth._wcmatch_version('8') returns nearest '8.0.0'."""
|
||||||
|
# given
|
||||||
|
given, expected = '8', '8.0.0'
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_800_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'8') returns nearest u'8.0.0'."""
|
||||||
|
# given
|
||||||
|
given, expected = u'8', u'8.0.0'
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_999_str():
|
||||||
|
"""wcwidth._wcmatch_version('999.0') returns nearest (latest)."""
|
||||||
|
# given
|
||||||
|
given, expected = '999.0', wcwidth.list_versions()[-1]
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nearest_999_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'999.0') returns nearest (latest)."""
|
||||||
|
# given
|
||||||
|
given, expected = u'999.0', wcwidth.list_versions()[-1]
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nonint_unicode():
|
||||||
|
"""wcwidth._wcmatch_version(u'x.y.z') returns latest (unicode)."""
|
||||||
|
# given
|
||||||
|
given, expected = u'x.y.z', wcwidth.list_versions()[-1]
|
||||||
|
warnings.resetwarnings()
|
||||||
|
wcwidth._wcmatch_version.cache_clear()
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
with pytest.warns(UserWarning):
|
||||||
|
# warns that given version is not valid
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
||||||
|
|
||||||
|
|
||||||
|
def test_nonint_str():
|
||||||
|
"""wcwidth._wcmatch_version('x.y.z') returns latest (str)."""
|
||||||
|
# given
|
||||||
|
given, expected = 'x.y.z', wcwidth.list_versions()[-1]
|
||||||
|
warnings.resetwarnings()
|
||||||
|
wcwidth._wcmatch_version.cache_clear()
|
||||||
|
|
||||||
|
# exercise
|
||||||
|
with pytest.warns(UserWarning):
|
||||||
|
# warns that given version is not valid
|
||||||
|
result = wcwidth._wcmatch_version(given)
|
||||||
|
|
||||||
|
# verify.
|
||||||
|
assert result == expected
|
|
@ -0,0 +1,306 @@
|
||||||
|
Metadata-Version: 1.1
|
||||||
|
Name: wcwidth
|
||||||
|
Version: 0.2.5
|
||||||
|
Summary: Measures the displayed width of unicode strings in a terminal
|
||||||
|
Home-page: https://github.com/jquast/wcwidth
|
||||||
|
Author: Jeff Quast
|
||||||
|
Author-email: contact@jeffquast.com
|
||||||
|
License: MIT
|
||||||
|
Description: |pypi_downloads| |codecov| |license|
|
||||||
|
|
||||||
|
============
|
||||||
|
Introduction
|
||||||
|
============
|
||||||
|
|
||||||
|
This library is mainly for CLI programs that carefully produce output for
|
||||||
|
Terminals, or pretend to be an emulator.
|
||||||
|
|
||||||
|
**Problem Statement**: The printable length of *most* strings is equal to the
|
||||||
|
number of cells they occupy on the screen ``1 character : 1 cell``. However,
|
||||||
|
there are categories of characters that *occupy 2 cells* (full-width), and
|
||||||
|
others that *occupy 0* cells (zero-width).
|
||||||
|
|
||||||
|
**Solution**: POSIX.1-2001 and POSIX.1-2008 conforming systems provide
|
||||||
|
`wcwidth(3)`_ and `wcswidth(3)`_ C functions, which this Python module's
|
||||||
|
functions precisely copy. *These functions return the number of cells a
|
||||||
|
unicode string is expected to occupy.*
|
||||||
|
|
||||||
|
Installation
|
||||||
|
------------
|
||||||
|
|
||||||
|
The stable version of this package is maintained on PyPI; install it using pip::
|
||||||
|
|
||||||
|
pip install wcwidth
|
||||||
|
|
||||||
|
Example
|
||||||
|
-------
|
||||||
|
|
||||||
|
**Problem**: given the following phrase (Japanese),
|
||||||
|
|
||||||
|
>>> text = u'コンニチハ'
|
||||||
|
|
||||||
|
Python **incorrectly** uses the *string length* of 5 codepoints rather than the
|
||||||
|
*printable length* of 10 cells, so that when using the `rjust` function, the
|
||||||
|
output length is wrong::
|
||||||
|
|
||||||
|
>>> print(len('コンニチハ'))
|
||||||
|
5
|
||||||
|
|
||||||
|
>>> print('コンニチハ'.rjust(20, '_'))
|
||||||
|
_______________コンニチハ
|
||||||
|
|
||||||
|
By defining our own "rjust" function that uses wcwidth, we can correct this::
|
||||||
|
|
||||||
|
>>> def wc_rjust(text, length, padding=' '):
|
||||||
|
... from wcwidth import wcswidth
|
||||||
|
... return padding * max(0, (length - wcswidth(text))) + text
|
||||||
|
...
|
||||||
|
|
||||||
|
Our **Solution** uses wcswidth to determine the string length correctly::
|
||||||
|
|
||||||
|
>>> from wcwidth import wcswidth
|
||||||
|
>>> print(wcswidth('コンニチハ'))
|
||||||
|
10
|
||||||
|
|
||||||
|
>>> print(wc_rjust('コンニチハ', 20, '_'))
|
||||||
|
__________コンニチハ
|
||||||
|
|
||||||
|
|
||||||
|
Choosing a Version
|
||||||
|
------------------
|
||||||
|
|
||||||
|
Export an environment variable, ``UNICODE_VERSION``. This should be done by
|
||||||
|
*terminal emulators* or those developers experimenting with authoring one of
|
||||||
|
their own, from shell::
|
||||||
|
|
||||||
|
$ export UNICODE_VERSION=13.0
|
||||||
|
|
||||||
|
If unspecified, the latest version is used. If your Terminal Emulator does not
|
||||||
|
export this variable, you can use the `jquast/ucs-detect`_ utility to
|
||||||
|
automatically detect and export it to your shell.
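
As a rough illustration of the selection rule described above, the sketch below resolves a requested version string against a list of supported versions. It is not this library's implementation: ``match_version``, ``_as_tuple`` and the abbreviated ``AVAILABLE`` list are hypothetical names chosen for the example, and the fallback behaviour (nearest lower version, lowest version with a warning, latest on unparsable input) is assumed from the cases exercised in the project's tests.

```python
import os
import warnings

# Hypothetical, abbreviated list of supported versions, oldest first.
AVAILABLE = ('4.1.0', '5.0.0', '8.0.0', '12.1.0', '13.0.0')


def _as_tuple(version):
    """Parse '13.0' into (13, 0, 0); raises ValueError if not numeric."""
    parts = [int(part) for part in version.split('.')]
    return tuple(parts + [0] * (3 - len(parts)))


def match_version(given='auto', available=AVAILABLE):
    """Return the supported version nearest to, and not above, ``given``."""
    if given == 'auto':
        # An unset variable is treated as a request for the latest tables.
        given = os.environ.get('UNICODE_VERSION', 'latest')
    if given == 'latest':
        return available[-1]
    try:
        wanted = _as_tuple(given)
    except ValueError:
        warnings.warn('invalid UNICODE_VERSION %r, using latest' % (given,))
        return available[-1]
    # Keep only versions that are not newer than the requested one.
    not_newer = [v for v in available if _as_tuple(v) <= wanted]
    if not not_newer:
        warnings.warn('UNICODE_VERSION %r is older than any supported'
                      % (given,))
        return available[0]
    return not_newer[-1]
```

Under these assumptions a short form such as ``'8'`` is padded to ``8.0.0`` before comparison, so it matches the ``8.0.0`` tables rather than falling back to an older release.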
|
||||||
|
|
||||||
|
wcwidth, wcswidth
|
||||||
|
-----------------
|
||||||
|
Use function ``wcwidth()`` to determine the length of a *single unicode
|
||||||
|
character*, and ``wcswidth()`` to determine the length of many, a *string
|
||||||
|
of unicode characters*.
|
||||||
|
|
||||||
|
Briefly, return values of function ``wcwidth()`` are:
|
||||||
|
|
||||||
|
``-1``
|
||||||
|
Indeterminate (not printable).
|
||||||
|
|
||||||
|
``0``
|
||||||
|
Does not advance the cursor, such as NULL or Combining.
|
||||||
|
|
||||||
|
``2``
|
||||||
|
Characters of category East Asian Wide (W) or East Asian
|
||||||
|
Full-width (F) which are displayed using two terminal cells.
|
||||||
|
|
||||||
|
``1``
|
||||||
|
All others.
|
||||||
|
|
||||||
|
Function ``wcswidth()`` simply returns the sum of all values for each character
|
||||||
|
along a string, or ``-1`` if any character in the string has width ``-1``.
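
The return values listed above can be approximated with the standard library alone. The sketch below is only an approximation for illustration: ``approx_wcwidth`` and ``approx_wcswidth`` are hypothetical names, and they consult the interpreter's own ``unicodedata`` database rather than this library's versioned tables, so results may differ for some characters.

```python
import unicodedata


def approx_wcwidth(char):
    """Approximate the number of terminal cells one character occupies."""
    codepoint = ord(char)
    if codepoint == 0:
        return 0    # NULL does not advance the cursor
    if codepoint < 0x20 or 0x7F <= codepoint < 0xA0:
        return -1   # C0/C1 control characters are not printable
    if unicodedata.category(char) in ('Mn', 'Me'):
        return 0    # non-spacing and enclosing combining marks
    if unicodedata.east_asian_width(char) in ('W', 'F'):
        return 2    # East Asian Wide or Fullwidth
    return 1        # all others


def approx_wcswidth(text, n=None):
    """Sum the widths of the first ``n`` characters, or return -1."""
    total = 0
    for char in text[:n]:
        width = approx_wcwidth(char)
        if width < 0:
            return -1   # a single non-printable character poisons the sum
        total += width
    return total
```

The early return in ``approx_wcswidth`` is what makes a string containing an escape character report ``-1`` instead of a partial sum.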
|
||||||
|
|
||||||
|
Full API Documentation at http://wcwidth.readthedocs.org
|
||||||
|
|
||||||
|
==========
|
||||||
|
Developing
|
||||||
|
==========
|
||||||
|
|
||||||
|
Install wcwidth in editable mode::
|
||||||
|
|
||||||
|
pip install -e .
|
||||||
|
|
||||||
|
Execute unit tests using tox_::
|
||||||
|
|
||||||
|
tox
|
||||||
|
|
||||||
|
Regenerate python code tables from latest Unicode Specification data files::
|
||||||
|
|
||||||
|
tox -eupdate
|
||||||
|
|
||||||
|
Supplementary tools for browsing and testing terminals for wide unicode
|
||||||
|
characters are found in the `bin/`_ of this project's source code. Just ensure
|
||||||
|
to first ``pip install -r requirements-develop.txt`` from this project's main
|
||||||
|
folder. For example, an interactive browser for testing::
|
||||||
|
|
||||||
|
./bin/wcwidth-browser.py
|
||||||
|
|
||||||
|
Uses
|
||||||
|
----
|
||||||
|
|
||||||
|
This library is used in:
|
||||||
|
|
||||||
|
- `jquast/blessed`_: a thin, practical wrapper around terminal capabilities in
|
||||||
|
Python.
|
||||||
|
|
||||||
|
- `jonathanslenders/python-prompt-toolkit`_: a Library for building powerful
|
||||||
|
interactive command lines in Python.
|
||||||
|
|
||||||
|
- `dbcli/pgcli`_: Postgres CLI with autocompletion and syntax highlighting.
|
||||||
|
|
||||||
|
- `thomasballinger/curtsies`_: a Curses-like terminal wrapper with a display
|
||||||
|
based on compositing 2d arrays of text.
|
||||||
|
|
||||||
|
- `selectel/pyte`_: Simple VTXXX-compatible linux terminal emulator.
|
||||||
|
|
||||||
|
- `astanin/python-tabulate`_: Pretty-print tabular data in Python, a library
|
||||||
|
and a command-line utility.
|
||||||
|
|
||||||
|
- `LuminosoInsight/python-ftfy`_: Fixes mojibake and other glitches in Unicode
|
||||||
|
text.
|
||||||
|
|
||||||
|
- `nbedos/termtosvg`_: Terminal recorder that renders sessions as SVG
|
||||||
|
animations.
|
||||||
|
|
||||||
|
- `peterbrittain/asciimatics`_: Package to help people create full-screen text
|
||||||
|
UIs.
|
||||||
|
|
||||||
|
Other Languages
|
||||||
|
---------------
|
||||||
|
|
||||||
|
- `timoxley/wcwidth`_: JavaScript
|
||||||
|
- `janlelis/unicode-display_width`_: Ruby
|
||||||
|
- `alecrabbit/php-wcwidth`_: PHP
|
||||||
|
- `Text::CharWidth`_: Perl
|
||||||
|
- `bluebear94/Terminal-WCWidth`_: Perl 6
|
||||||
|
- `mattn/go-runewidth`_: Go
|
||||||
|
- `emugel/wcwidth`_: Haxe
|
||||||
|
- `aperezdc/lua-wcwidth`_: Lua
|
||||||
|
- `joachimschmidt557/zig-wcwidth`: Zig
|
||||||
|
- `fumiyas/wcwidth-cjk`_: ``LD_PRELOAD`` override
|
||||||
|
- `joshuarubin/wcwidth9`: Unicode version 9 in C
|
||||||
|
|
||||||
|
History
|
||||||
|
-------
|
||||||
|
|
||||||
|
0.2.0 *2020-06-01*
|
||||||
|
* **Enhancement**: Unicode version may be selected by exporting the
|
||||||
|
Environment variable ``UNICODE_VERSION``, such as ``13.0``, or ``6.3.0``.
|
||||||
|
See the `jquast/ucs-detect`_ CLI utility for automatic detection.
|
||||||
|
* **Enhancement**:
|
||||||
|
API Documentation is published to readthedocs.org.
|
||||||
|
* **Updated** tables for *all* Unicode Specifications with files
|
||||||
|
published in a programmatically consumable format, versions 4.1.0
|
||||||
|
through 13.0.
|
||||||
|
|
||||||
|
0.1.9 *2020-03-22*
|
||||||
|
* **Performance** optimization by `Avram Lubkin`_, `PR #35`_.
|
||||||
|
* **Updated** tables to Unicode Specification 13.0.0.
|
||||||
|
|
||||||
|
0.1.8 *2020-01-01*
|
||||||
|
* **Updated** tables to Unicode Specification 12.0.0. (`PR #30`_).
|
||||||
|
|
||||||
|
0.1.7 *2016-07-01*
|
||||||
|
* **Updated** tables to Unicode Specification 9.0.0. (`PR #18`_).
|
||||||
|
|
||||||
|
0.1.6 *2016-01-08 Production/Stable*
|
||||||
|
* ``LICENSE`` file now included with distribution.
|
||||||
|
|
||||||
|
0.1.5 *2015-09-13 Alpha*
|
||||||
|
* **Bugfix**:
|
||||||
|
Resolution of "combining_ character width" issue, most especially
|
||||||
|
those that previously returned -1 now often (correctly) return 0.
|
||||||
|
Resolved by `Philip Craig`_ via `PR #11`_.
|
||||||
|
* **Deprecated**:
|
||||||
|
The module path ``wcwidth.table_comb`` is no longer available,
|
||||||
|
it has been superseded by module path ``wcwidth.table_zero``.
|
||||||
|
|
||||||
|
0.1.4 *2014-11-20 Pre-Alpha*
|
||||||
|
* **Feature**: ``wcswidth()`` now determines printable length
|
||||||
|
for (most) combining_ characters. The developer's tool
|
||||||
|
`bin/wcwidth-browser.py`_ is improved to display combining_
|
||||||
|
characters when provided the ``--combining`` option
|
||||||
|
(`Thomas Ballinger`_ and `Leta Montopoli`_ `PR #5`_).
|
||||||
|
* **Feature**: added static analysis (prospector_) to testing
|
||||||
|
framework.
|
||||||
|
|
||||||
|
0.1.3 *2014-10-29 Pre-Alpha*
|
||||||
|
* **Bugfix**: 2nd parameter of wcswidth was not honored.
|
||||||
|
(`Thomas Ballinger`_, `PR #4`_).
|
||||||
|
|
||||||
|
0.1.2 *2014-10-28 Pre-Alpha*
|
||||||
|
* **Updated** tables to Unicode Specification 7.0.0.
|
||||||
|
(`Thomas Ballinger`_, `PR #3`_).
|
||||||
|
|
||||||
|
0.1.1 *2014-05-14 Pre-Alpha*
|
||||||
|
* Initial release to PyPI, based on Unicode Specification 6.3.0.
|
||||||
|
|
||||||
|
This code was originally derived directly from C code of the same name,
|
||||||
|
whose latest version is available at
|
||||||
|
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c::
|
||||||
|
|
||||||
|
* Markus Kuhn -- 2007-05-26 (Unicode 5.0)
|
||||||
|
*
|
||||||
|
* Permission to use, copy, modify, and distribute this software
|
||||||
|
* for any purpose and without fee is hereby granted. The author
|
||||||
|
* disclaims all warranties with regard to this software.
|
||||||
|
|
||||||
|
.. _`tox`: https://testrun.org/tox/latest/install.html
|
||||||
|
.. _`prospector`: https://github.com/landscapeio/prospector
|
||||||
|
.. _`combining`: https://en.wikipedia.org/wiki/Combining_character
|
||||||
|
.. _`bin/`: https://github.com/jquast/wcwidth/tree/master/bin
|
||||||
|
.. _`bin/wcwidth-browser.py`: https://github.com/jquast/wcwidth/tree/master/bin/wcwidth-browser.py
|
||||||
|
.. _`Thomas Ballinger`: https://github.com/thomasballinger
|
||||||
|
.. _`Leta Montopoli`: https://github.com/lmontopo
|
||||||
|
.. _`Philip Craig`: https://github.com/philipc
|
||||||
|
.. _`PR #3`: https://github.com/jquast/wcwidth/pull/3
|
||||||
|
.. _`PR #4`: https://github.com/jquast/wcwidth/pull/4
|
||||||
|
.. _`PR #5`: https://github.com/jquast/wcwidth/pull/5
|
||||||
|
.. _`PR #11`: https://github.com/jquast/wcwidth/pull/11
|
||||||
|
.. _`PR #18`: https://github.com/jquast/wcwidth/pull/18
|
||||||
|
.. _`PR #30`: https://github.com/jquast/wcwidth/pull/30
|
||||||
|
.. _`PR #35`: https://github.com/jquast/wcwidth/pull/35
|
||||||
|
.. _`jquast/blessed`: https://github.com/jquast/blessed
|
||||||
|
.. _`selectel/pyte`: https://github.com/selectel/pyte
|
||||||
|
.. _`thomasballinger/curtsies`: https://github.com/thomasballinger/curtsies
|
||||||
|
.. _`dbcli/pgcli`: https://github.com/dbcli/pgcli
|
||||||
|
.. _`jonathanslenders/python-prompt-toolkit`: https://github.com/jonathanslenders/python-prompt-toolkit
|
||||||
|
.. _`timoxley/wcwidth`: https://github.com/timoxley/wcwidth
|
||||||
|
.. _`wcwidth(3)`: http://man7.org/linux/man-pages/man3/wcwidth.3.html
|
||||||
|
.. _`wcswidth(3)`: http://man7.org/linux/man-pages/man3/wcswidth.3.html
|
||||||
|
.. _`astanin/python-tabulate`: https://github.com/astanin/python-tabulate
|
||||||
|
.. _`janlelis/unicode-display_width`: https://github.com/janlelis/unicode-display_width
|
||||||
|
.. _`LuminosoInsight/python-ftfy`: https://github.com/LuminosoInsight/python-ftfy
|
||||||
|
.. _`alecrabbit/php-wcwidth`: https://github.com/alecrabbit/php-wcwidth
|
||||||
|
.. _`Text::CharWidth`: https://metacpan.org/pod/Text::CharWidth
|
||||||
|
.. _`bluebear94/Terminal-WCWidth`: https://github.com/bluebear94/Terminal-WCWidth
|
||||||
|
.. _`mattn/go-runewidth`: https://github.com/mattn/go-runewidth
|
||||||
|
.. _`emugel/wcwidth`: https://github.com/emugel/wcwidth
|
||||||
|
.. _`jquast/ucs-detect`: https://github.com/jquast/ucs-detect
|
||||||
|
.. _`Avram Lubkin`: https://github.com/avylove
|
||||||
|
.. _`nbedos/termtosvg`: https://github.com/nbedos/termtosvg
|
||||||
|
.. _`peterbrittain/asciimatics`: https://github.com/peterbrittain/asciimatics
|
||||||
|
.. _`aperezdc/lua-wcwidth`: https://github.com/aperezdc/lua-wcwidth
|
||||||
|
.. _`fumiyas/wcwidth-cjk`: https://github.com/fumiyas/wcwidth-cjk
|
||||||
|
.. |pypi_downloads| image:: https://img.shields.io/pypi/dm/wcwidth.svg?logo=pypi
|
||||||
|
:alt: Downloads
|
||||||
|
:target: https://pypi.org/project/wcwidth/
|
||||||
|
.. |codecov| image:: https://codecov.io/gh/jquast/wcwidth/branch/master/graph/badge.svg
|
||||||
|
:alt: codecov.io Code Coverage
|
||||||
|
:target: https://codecov.io/gh/jquast/wcwidth/
|
||||||
|
.. |license| image:: https://img.shields.io/github/license/jquast/wcwidth.svg
|
||||||
|
:target: https://pypi.python.org/pypi/wcwidth/
|
||||||
|
:alt: MIT License
|
||||||
|
|
||||||
|
Keywords: cjk,combining,console,eastasian,emoji,emulator,terminal,unicode,wcswidth,wcwidth,xterm
|
||||||
|
Platform: UNKNOWN
|
||||||
|
Classifier: Intended Audience :: Developers
|
||||||
|
Classifier: Natural Language :: English
|
||||||
|
Classifier: Development Status :: 5 - Production/Stable
|
||||||
|
Classifier: Environment :: Console
|
||||||
|
Classifier: License :: OSI Approved :: MIT License
|
||||||
|
Classifier: Operating System :: POSIX
|
||||||
|
Classifier: Programming Language :: Python :: 2.7
|
||||||
|
Classifier: Programming Language :: Python :: 3.5
|
||||||
|
Classifier: Programming Language :: Python :: 3.6
|
||||||
|
Classifier: Programming Language :: Python :: 3.7
|
||||||
|
Classifier: Programming Language :: Python :: 3.8
|
||||||
|
Classifier: Topic :: Software Development :: Libraries
|
||||||
|
Classifier: Topic :: Software Development :: Localization
|
||||||
|
Classifier: Topic :: Software Development :: Internationalization
|
||||||
|
Classifier: Topic :: Terminals
|
|
@ -0,0 +1,19 @@
|
||||||
|
LICENSE
|
||||||
|
MANIFEST.in
|
||||||
|
README.rst
|
||||||
|
setup.cfg
|
||||||
|
setup.py
|
||||||
|
tests/__init__.py
|
||||||
|
tests/test_core.py
|
||||||
|
tests/test_ucslevel.py
|
||||||
|
wcwidth/__init__.py
|
||||||
|
wcwidth/table_wide.py
|
||||||
|
wcwidth/table_zero.py
|
||||||
|
wcwidth/unicode_versions.py
|
||||||
|
wcwidth/wcwidth.py
|
||||||
|
wcwidth.egg-info/PKG-INFO
|
||||||
|
wcwidth.egg-info/SOURCES.txt
|
||||||
|
wcwidth.egg-info/dependency_links.txt
|
||||||
|
wcwidth.egg-info/requires.txt
|
||||||
|
wcwidth.egg-info/top_level.txt
|
||||||
|
wcwidth.egg-info/zip-safe
|
|
@ -0,0 +1 @@
|
||||||
|
|
|
@ -0,0 +1,3 @@
|
||||||
|
|
||||||
|
[:python_version < "3.2"]
|
||||||
|
backports.functools-lru-cache>=1.2.1
|
|
@ -0,0 +1 @@
|
||||||
|
wcwidth
|
|
@ -0,0 +1 @@
|
||||||
|
|
|
@ -0,0 +1,37 @@
|
||||||
|
"""
|
||||||
|
wcwidth module.
|
||||||
|
|
||||||
|
https://github.com/jquast/wcwidth
|
||||||
|
"""
|
||||||
|
# re-export all functions & definitions, even private ones, from top-level
|
||||||
|
# module path, to allow for 'from wcwidth import _private_func'. Of course,
|
||||||
|
# user beware that any _private function may disappear or change signature at
|
||||||
|
# any future version.
|
||||||
|
|
||||||
|
# local
|
||||||
|
from .wcwidth import ZERO_WIDTH # noqa
|
||||||
|
from .wcwidth import (WIDE_EASTASIAN,
|
||||||
|
wcwidth,
|
||||||
|
wcswidth,
|
||||||
|
_bisearch,
|
||||||
|
list_versions,
|
||||||
|
_wcmatch_version,
|
||||||
|
_wcversion_value)
|
||||||
|
|
||||||
|
# The __all__ attribute defines the items exported by the statement,
|
||||||
|
# 'from wcwidth import *', but also to say, "This is the public API".
|
||||||
|
__all__ = ('wcwidth', 'wcswidth', 'list_versions')
|
||||||
|
|
||||||
|
# I used to use a _get_package_version() function to use the `pkg_resources'
|
||||||
|
# module to parse the package version from our version.json file, but this blew
|
||||||
|
# some folks up, or more particularly, just the `xonsh' shell.
|
||||||
|
#
|
||||||
|
# Yikes! I always wanted to like xonsh and tried it many times but issues like
|
||||||
|
# these always bit me, too, so I can sympathize -- this version is now manually
|
||||||
|
# kept in sync with version.json to help them out. Shucks, this variable is just
|
||||||
|
# for legacy, from the days before 'pip freeze' was a thing.
|
||||||
|
#
|
||||||
|
# We also used pkg_resources to load unicode version tables from version.json,
|
||||||
|
# generated by bin/update-tables.py, but some environments are unable to
|
||||||
|
# import pkg_resources for one reason or another, yikes!
|
||||||
|
__version__ = '0.2.5'
|
File diff suppressed because it is too large
File diff suppressed because it is too large
|
@ -0,0 +1,35 @@
|
||||||
|
"""
|
||||||
|
Exports function list_versions() for unicode version level support.
|
||||||
|
|
||||||
|
This code generated by bin/update-tables.py on 2020-06-23 16:03:21.350604.
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
def list_versions():
|
||||||
|
"""
|
||||||
|
Return Unicode version levels supported by this module release.
|
||||||
|
|
||||||
|
Any of the version strings returned may be used as keyword argument
|
||||||
|
``unicode_version`` to the ``wcwidth()`` family of functions.
|
||||||
|
|
||||||
|
:returns: Supported Unicode version numbers in ascending sorted order.
|
||||||
|
:rtype: tuple[str]
|
||||||
|
"""
|
||||||
|
return (
|
||||||
|
"4.1.0",
|
||||||
|
"5.0.0",
|
||||||
|
"5.1.0",
|
||||||
|
"5.2.0",
|
||||||
|
"6.0.0",
|
||||||
|
"6.1.0",
|
||||||
|
"6.2.0",
|
||||||
|
"6.3.0",
|
||||||
|
"7.0.0",
|
||||||
|
"8.0.0",
|
||||||
|
"9.0.0",
|
||||||
|
"10.0.0",
|
||||||
|
"11.0.0",
|
||||||
|
"12.0.0",
|
||||||
|
"12.1.0",
|
||||||
|
"13.0.0",
|
||||||
|
)
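The docstring promises ascending order, and that order only holds numerically: compared as plain strings, ``"10.0.0"`` sorts before ``"9.0.0"``. A minimal sketch of the difference, assuming a hypothetical helper ``version_key`` (not part of this module) and a shortened, shuffled list of levels:

```python
def version_key(ver_string):
    """Map a dotted version string to a tuple of ints for numeric comparison."""
    return tuple(int(part) for part in ver_string.split('.'))


# hypothetical subset of the supported levels, deliberately out of order
versions = ["10.0.0", "4.1.0", "9.0.0", "12.1.0"]

lexical = sorted(versions)                   # character-by-character comparison
numeric = sorted(versions, key=version_key)  # component-by-component comparison

print(lexical)
print(numeric)
```

Because the returned sequence is numerically ascending, its last element can be treated as the latest supported level.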
|
|
@ -0,0 +1,375 @@
|
||||||
|
"""
|
||||||
|
This is a python implementation of wcwidth() and wcswidth().
|
||||||
|
|
||||||
|
https://github.com/jquast/wcwidth
|
||||||
|
|
||||||
|
from Markus Kuhn's C code, retrieved from:
|
||||||
|
|
||||||
|
http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
|
||||||
|
|
||||||
|
This is an implementation of wcwidth() and wcswidth() (defined in
|
||||||
|
IEEE Std 1003.1-2001) for Unicode.
|
||||||
|
|
||||||
|
http://www.opengroup.org/onlinepubs/007904975/functions/wcwidth.html
|
||||||
|
http://www.opengroup.org/onlinepubs/007904975/functions/wcswidth.html
|
||||||
|
|
||||||
|
In fixed-width output devices, Latin characters all occupy a single
|
||||||
|
"cell" position of equal width, whereas ideographic CJK characters
|
||||||
|
occupy two such cells. Interoperability between terminal-line
|
||||||
|
applications and (teletype-style) character terminals using the
|
||||||
|
UTF-8 encoding requires agreement on which character should advance
|
||||||
|
the cursor by how many cell positions. No established formal
|
||||||
|
standards exist at present on which Unicode character shall occupy
|
||||||
|
how many cell positions on character terminals. These routines are
|
||||||
|
a first attempt at defining such behavior based on simple rules
|
||||||
|
applied to data provided by the Unicode Consortium.
|
||||||
|
|
||||||
|
For some graphical characters, the Unicode standard explicitly
|
||||||
|
defines a character-cell width via the definition of the East Asian
|
||||||
|
FullWidth (F), Wide (W), Half-width (H), and Narrow (Na) classes.
|
||||||
|
In all these cases, there is no ambiguity about which width a
|
||||||
|
terminal shall use. For characters in the East Asian Ambiguous (A)
|
||||||
|
class, the width choice depends purely on a preference of backward
|
||||||
|
compatibility with either historic CJK or Western practice.
|
||||||
|
Choosing single-width for these characters is easy to justify as
|
||||||
|
the appropriate long-term solution, as the CJK practice of
|
||||||
|
displaying these characters as double-width comes from historic
|
||||||
|
implementation simplicity (8-bit encoded characters were displayed
|
||||||
|
single-width and 16-bit ones double-width, even for Greek,
|
||||||
|
Cyrillic, etc.) and not any typographic considerations.
|
||||||
|
|
||||||
|
Much less clear is the choice of width for the Not East Asian
|
||||||
|
(Neutral) class. Existing practice does not dictate a width for any
|
||||||
|
of these characters. It would nevertheless make sense
|
||||||
|
typographically to allocate two character cells to characters such
|
||||||
|
as for instance EM SPACE or VOLUME INTEGRAL, which cannot be
|
||||||
|
represented adequately with a single-width glyph. The following
|
||||||
|
routines at present merely assign a single-cell width to all
|
||||||
|
neutral characters, in the interest of simplicity. This is not
|
||||||
|
entirely satisfactory and should be reconsidered before
|
||||||
|
establishing a formal standard in this area. At the moment, the
|
||||||
|
decision which Not East Asian (Neutral) characters should be
|
||||||
|
represented by double-width glyphs cannot yet be answered by
|
||||||
|
applying a simple rule from the Unicode database content. Setting
|
||||||
|
up a proper standard for the behavior of UTF-8 character terminals
|
||||||
|
will require a careful analysis not only of each Unicode character,
|
||||||
|
but also of each presentation form, something the author of these
|
||||||
|
routines has avoided doing so far.
|
||||||
|
|
||||||
|
http://www.unicode.org/unicode/reports/tr11/
|
||||||
|
|
||||||
|
Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
|
||||||
|
"""
|
||||||
|
from __future__ import division
|
||||||
|
|
||||||
|
# std imports
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import warnings
|
||||||
|
|
||||||
|
# local
|
||||||
|
from .table_wide import WIDE_EASTASIAN
|
||||||
|
from .table_zero import ZERO_WIDTH
|
||||||
|
from .unicode_versions import list_versions
|
||||||
|
|
||||||
|
try:
|
||||||
|
from functools import lru_cache
|
||||||
|
except ImportError:
|
||||||
|
# lru_cache was added in Python 3.2
|
||||||
|
from backports.functools_lru_cache import lru_cache
|
||||||
|
|
||||||
|
# global cache
|
||||||
|
_UNICODE_CMPTABLE = None
|
||||||
|
_PY3 = (sys.version_info[0] >= 3)
|
||||||
|
|
||||||
|
|
||||||
|
# NOTE: created by hand, there isn't anything identifiable other than
|
||||||
|
# general Cf category code to identify these, and some characters in Cf
|
||||||
|
# category code are of non-zero width.
|
||||||
|
# Also includes some Cc, Mn, Zl, and Zp characters
|
||||||
|
ZERO_WIDTH_CF = set([
|
||||||
|
0, # Null (Cc)
|
||||||
|
0x034F, # Combining grapheme joiner (Mn)
|
||||||
|
0x200B, # Zero width space
|
||||||
|
0x200C, # Zero width non-joiner
|
||||||
|
0x200D, # Zero width joiner
|
||||||
|
0x200E, # Left-to-right mark
|
||||||
|
0x200F, # Right-to-left mark
|
||||||
|
0x2028, # Line separator (Zl)
|
||||||
|
0x2029, # Paragraph separator (Zp)
|
||||||
|
0x202A, # Left-to-right embedding
|
||||||
|
0x202B, # Right-to-left embedding
|
||||||
|
0x202C, # Pop directional formatting
|
||||||
|
0x202D, # Left-to-right override
|
||||||
|
0x202E, # Right-to-left override
|
||||||
|
0x2060, # Word joiner
|
||||||
|
0x2061, # Function application
|
||||||
|
0x2062, # Invisible times
|
||||||
|
0x2063, # Invisible separator
|
||||||
|
])
|
||||||
|
|
||||||
|
|
||||||
|
def _bisearch(ucs, table):
|
||||||
|
"""
|
||||||
|
Auxiliary function for binary search in interval table.
|
||||||
|
|
||||||
|
:arg int ucs: Ordinal value of unicode character.
|
||||||
|
:arg list table: List of starting and ending ranges of ordinal values,
|
||||||
|
in form of ``[(start, end), ...]``.
|
||||||
|
:rtype: int
|
||||||
|
:returns: 1 if ordinal value ucs is found within lookup table, else 0.
|
||||||
|
"""
|
||||||
|
lbound = 0
|
||||||
|
ubound = len(table) - 1
|
||||||
|
|
||||||
|
if ucs < table[0][0] or ucs > table[ubound][1]:
|
||||||
|
return 0
|
||||||
|
while ubound >= lbound:
|
||||||
|
mid = (lbound + ubound) // 2
|
||||||
|
if ucs > table[mid][1]:
|
||||||
|
lbound = mid + 1
|
||||||
|
elif ucs < table[mid][0]:
|
||||||
|
ubound = mid - 1
|
||||||
|
else:
|
||||||
|
return 1
|
||||||
|
|
||||||
|
return 0
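The lookup above relies on the table being sorted and on its ranges being inclusive and non-overlapping. A minimal standalone sketch of the same search, assuming a hypothetical three-row table and a hypothetical function name ``in_intervals`` (the early range check of the original is omitted for brevity):

```python
def in_intervals(ucs, table):
    """Return True if ucs falls inside any (start, end) pair of a sorted table."""
    lo, hi = 0, len(table) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end = table[mid]
        if ucs > end:
            lo = mid + 1      # target lies to the right of this interval
        elif ucs < start:
            hi = mid - 1      # target lies to the left of this interval
        else:
            return True       # start <= ucs <= end
    return False


# hypothetical table: sorted, non-overlapping, inclusive ranges
SAMPLE_TABLE = (
    (0x0300, 0x036F),   # Combining Diacritical Marks block
    (0x1100, 0x115F),   # Hangul Jamo initial consonants
    (0x3041, 0x3096),   # Hiragana letters
)

print(in_intervals(0x0301, SAMPLE_TABLE))   # inside the first range
print(in_intervals(0x0041, SAMPLE_TABLE))   # 'A', below every range
```

Each probe halves the remaining rows, so a table of a few hundred ranges needs under ten comparisons per character.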
|
||||||
|
|
||||||
|
|
||||||
|
@lru_cache(maxsize=1000)
|
||||||
|
def wcwidth(wc, unicode_version='auto'):
|
||||||
|
r"""
|
||||||
|
Given one Unicode character, return its printable length on a terminal.
|
||||||
|
|
||||||
|
:param str wc: A single Unicode character.
|
||||||
|
:param str unicode_version: A Unicode version number, such as
|
||||||
|
``'6.0.0'``, the list of available version levels may be
|
||||||
|
listed by pairing function :func:`list_versions`.
|
||||||
|
|
||||||
|
Any version string may be specified without error -- the nearest
|
||||||
|
matching version is selected. When ``auto`` (default), environment variable ``UNICODE_VERSION`` is used if set; otherwise, or when ``latest``, the
|
||||||
|
highest Unicode version level is used.
|
||||||
|
:return: The width, in cells, necessary to display the
|
||||||
|
Unicode string character, ``wc``. Returns 0 if the ``wc`` argument has
|
||||||
|
no printable effect on a terminal (such as NUL '\0'), -1 if ``wc`` is
|
||||||
|
not printable, or has an indeterminate effect on the terminal, such as
|
||||||
|
a control character. Otherwise, the number of column positions the
|
||||||
|
character occupies on a graphic terminal (1 or 2) is returned.
|
||||||
|
:rtype: int
|
||||||
|
|
||||||
|
The following have a column width of -1:
|
||||||
|
|
||||||
|
- C0 control characters (U+001 through U+01F).
|
||||||
|
|
||||||
|
- C1 control characters and DEL (U+07F through U+09F).
|
||||||
|
|
||||||
|
The following have a column width of 0:
|
||||||
|
|
||||||
|
- Non-spacing and enclosing combining characters (general
|
||||||
|
category code Mn or Me in the Unicode database).
|
||||||
|
|
||||||
|
- NULL (``U+0000``).
|
||||||
|
|
||||||
|
- COMBINING GRAPHEME JOINER (``U+034F``).
|
||||||
|
|
||||||
|
- ZERO WIDTH SPACE (``U+200B``) *through*
|
||||||
|
RIGHT-TO-LEFT MARK (``U+200F``).
|
||||||
|
|
||||||
|
- LINE SEPARATOR (``U+2028``) *and*
|
||||||
|
PARAGRAPH SEPARATOR (``U+2029``).
|
||||||
|
|
||||||
|
- LEFT-TO-RIGHT EMBEDDING (``U+202A``) *through*
|
||||||
|
RIGHT-TO-LEFT OVERRIDE (``U+202E``).
|
||||||
|
|
||||||
|
- WORD JOINER (``U+2060``) *through*
|
||||||
|
INVISIBLE SEPARATOR (``U+2063``).
|
||||||
|
|
||||||
|
The following have a column width of 1:
|
||||||
|
|
||||||
|
- SOFT HYPHEN (``U+00AD``).
|
||||||
|
|
||||||
|
- All remaining characters, including all printable ISO 8859-1
|
||||||
|
and WGL4 characters, Unicode control characters, etc.
|
||||||
|
|
||||||
|
The following have a column width of 2:
|
||||||
|
|
||||||
|
- Spacing characters in the East Asian Wide (W) or East Asian
|
||||||
|
Full-width (F) category as defined in Unicode Technical
|
||||||
|
Report #11.
|
||||||
|
|
||||||
|
- Some kinds of Emoji or symbols.
|
||||||
|
"""
|
||||||
|
# NOTE: created by hand, there isn't anything identifiable other than
|
||||||
|
# general Cf category code to identify these, and some characters in Cf
|
||||||
|
# category code are of non-zero width.
|
||||||
|
ucs = ord(wc)
|
||||||
|
if ucs in ZERO_WIDTH_CF:
|
||||||
|
return 0
|
||||||
|
|
||||||
|
# C0/C1 control characters
|
||||||
|
if ucs < 32 or 0x07F <= ucs < 0x0A0:
|
||||||
|
return -1
|
||||||
|
|
||||||
|
_unicode_version = _wcmatch_version(unicode_version)
|
||||||
|
|
||||||
|
# combining characters with zero width
|
||||||
|
if _bisearch(ucs, ZERO_WIDTH[_unicode_version]):
|
||||||
|
return 0
|
||||||
|
|
||||||
|
return 1 + _bisearch(ucs, WIDE_EASTASIAN[_unicode_version])
|
||||||
|
|
||||||
|
|
||||||
|
def wcswidth(pwcs, n=None, unicode_version='auto'):
|
||||||
|
"""
|
||||||
|
Given a unicode string, return its printable length on a terminal.
|
||||||
|
|
||||||
|
:param str pwcs: Measure width of given unicode string.
|
||||||
|
:param int n: When ``n`` is None (default), return the length of the
|
||||||
|
entire string, otherwise the width of the first ``n`` characters.
|
||||||
|
:param str unicode_version: An explicit definition of the unicode version
|
||||||
|
level to use for determination, may be ``auto`` (default), which uses
|
||||||
|
the Environment Variable, ``UNICODE_VERSION`` if defined, or the latest
|
||||||
|
available unicode version, otherwise.
|
||||||
|
:rtype: int
|
||||||
|
:returns: The width, in cells, necessary to display the first ``n``
|
||||||
|
characters of the unicode string ``pwcs``. Returns ``-1`` if
|
||||||
|
a non-printable character is encountered.
|
||||||
|
"""
|
||||||
|
# pylint: disable=C0103
|
||||||
|
# Invalid argument name "n"
|
||||||
|
|
||||||
|
end = len(pwcs) if n is None else n
|
||||||
|
idx = slice(0, end)
|
||||||
|
width = 0
|
||||||
|
for char in pwcs[idx]:
|
||||||
|
wcw = wcwidth(char, unicode_version)
|
||||||
|
if wcw < 0:
|
||||||
|
return -1
|
||||||
|
width += wcw
|
||||||
|
return width
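The loop above sums per-character widths and gives up with ``-1`` at the first non-printable character. A rough, self-contained sketch of the same shape, assuming the standard library's ``unicodedata`` as a stand-in for the generated tables; ``approx_wcwidth`` and ``approx_wcswidth`` are hypothetical names, and results may differ from this module's (no version pinning, no hand-made zero-width set):

```python
import unicodedata


def approx_wcwidth(char):
    """Approximate cell width of one character from stdlib Unicode data."""
    ucs = ord(char)
    if ucs == 0:
        return 0                                  # NUL has no printable effect
    if ucs < 32 or 0x7F <= ucs < 0xA0:
        return -1                                 # C0/C1 controls and DEL
    if unicodedata.category(char) in ('Mn', 'Me'):
        return 0                                  # non-spacing / enclosing marks
    if unicodedata.east_asian_width(char) in ('W', 'F'):
        return 2                                  # East Asian Wide or Fullwidth
    return 1


def approx_wcswidth(text, n=None):
    """Sum widths of the first n characters, or return -1 on a control character."""
    width = 0
    for char in text[:n]:                         # text[:None] is the whole string
        char_width = approx_wcwidth(char)
        if char_width < 0:
            return -1
        width += char_width
    return width


print(approx_wcswidth(u'コンニチハ'))
print(approx_wcswidth(u'cafe\u0301'))
```

Returning ``-1`` for the whole string mirrors the behavior documented above: callers that need a best-effort width have to filter or replace control characters first.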
|
||||||
|
|
||||||
|
|
||||||
|
@lru_cache(maxsize=128)
|
||||||
|
def _wcversion_value(ver_string):
|
||||||
|
"""
|
||||||
|
Integer-mapped value of given dotted version string.
|
||||||
|
|
||||||
|
:param str ver_string: Unicode version string, of form ``n.n.n``.
|
||||||
|
:rtype: tuple(int)
|
||||||
|
:returns: tuple of integers, one per dotted component, ``tuple(int, [...])``.
|
||||||
|
"""
|
||||||
|
retval = tuple(map(int, (ver_string.split('.'))))
|
||||||
|
return retval
|
||||||
|
|
||||||
|
|
||||||
|
@lru_cache(maxsize=8)
|
||||||
|
def _wcmatch_version(given_version):
|
||||||
|
"""
|
||||||
|
Return nearest matching supported Unicode version level.
|
||||||
|
|
||||||
|
If an exact match is not determined, the nearest lowest version level is
|
||||||
|
returned after a warning is emitted. For example, given supported levels
|
||||||
|
``4.1.0`` and ``5.0.0``, and a version string of ``4.9.9``, then ``4.1.0``
|
||||||
|
is selected and returned:
|
||||||
|
|
||||||
|
>>> _wcmatch_version('4.9.9')
|
||||||
|
'4.1.0'
|
||||||
|
>>> _wcmatch_version('8.0')
|
||||||
|
'8.0.0'
|
||||||
|
>>> _wcmatch_version('1')
|
||||||
|
'4.1.0'
|
||||||
|
|
||||||
|
:param str given_version: given version for compare, may be ``auto``
|
||||||
|
(default), to select Unicode Version from Environment Variable,
|
||||||
|
``UNICODE_VERSION``. If the environment variable is not set, then the
|
||||||
|
latest is used.
|
||||||
|
:rtype: str
|
||||||
|
:returns: unicode string, or non-unicode ``str`` type for python 2
|
||||||
|
when ``given_version`` is also type ``str``.
|
||||||
|
"""
|
||||||
|
# Design note: the choice to return the same type that is given certainly
|
||||||
|
# complicates it for python 2 str-type, but allows us to define an api that
|
||||||
|
# uses 'string-type' for unicode version level definitions, so all of our
|
||||||
|
# example code works with all versions of python. That, along with the
|
||||||
|
# string-to-numeric and comparisons of earliest, latest, matching, or
|
||||||
|
# nearest, greatly complicates this function.
|
||||||
|
_return_str = not _PY3 and isinstance(given_version, str)
|
||||||
|
|
||||||
|
if _return_str:
|
||||||
|
unicode_versions = [ucs.encode() for ucs in list_versions()]
|
||||||
|
else:
|
||||||
|
unicode_versions = list_versions()
|
||||||
|
latest_version = unicode_versions[-1]
|
||||||
|
|
||||||
|
if given_version in (u'auto', 'auto'):
|
||||||
|
given_version = os.environ.get(
|
||||||
|
'UNICODE_VERSION',
|
||||||
|
'latest' if not _return_str else latest_version.encode())
|
||||||
|
|
||||||
|
if given_version in (u'latest', 'latest'):
|
||||||
|
# default match, when given as 'latest', use the latest unicode
|
||||||
|
# version specification level supported.
|
||||||
|
return latest_version if not _return_str else latest_version.encode()
|
||||||
|
|
||||||
|
if given_version in unicode_versions:
|
||||||
|
# exact match, downstream has specified an explicit matching version
|
||||||
|
# matching any value of list_versions().
|
||||||
|
return given_version if not _return_str else given_version.encode()
|
||||||
|
|
||||||
|
# The user's version is not supported by ours. We return the newest unicode
|
||||||
|
# version level that we support below their given value.
|
||||||
|
try:
|
||||||
|
cmp_given = _wcversion_value(given_version)
|
||||||
|
|
||||||
|
except ValueError:
|
||||||
|
# submitted value raises ValueError in int(), warn and use latest.
|
||||||
|
warnings.warn("UNICODE_VERSION value, {given_version!r}, is invalid. "
|
||||||
|
"Value should be in form of `integer[.]+', the latest "
|
||||||
|
"supported unicode version {latest_version!r} has been "
|
||||||
|
"inferred.".format(given_version=given_version,
|
||||||
|
latest_version=latest_version))
|
||||||
|
return latest_version if not _return_str else latest_version.encode()
|
||||||
|
|
||||||
|
# given version is less than any available version, return earliest
|
||||||
|
# version.
|
||||||
|
earliest_version = unicode_versions[0]
|
||||||
|
cmp_earliest_version = _wcversion_value(earliest_version)
|
||||||
|
|
||||||
|
if cmp_given <= cmp_earliest_version:
|
||||||
|
# this probably isn't what you wanted, the oldest wcwidth.c you will
|
||||||
|
# find in the wild is likely version 5 or 6, both of which we support,
|
||||||
|
# but it's better than not saying anything at all.
|
||||||
|
warnings.warn("UNICODE_VERSION value, {given_version!r}, is lower "
|
||||||
|
"than any available unicode version. Returning lowest "
|
||||||
|
"version level, {earliest_version!r}".format(
|
||||||
|
given_version=given_version,
|
||||||
|
earliest_version=earliest_version))
|
||||||
|
return earliest_version if not _return_str else earliest_version.encode()
|
||||||
|
|
||||||
|
# create list of versions which are less than or equal to given version,
|
||||||
|
# and return the tail value, which is the highest level we may support,
|
||||||
|
# or the latest value we support, when completely unmatched or higher
|
||||||
|
# than any supported version.
|
||||||
|
#
|
||||||
|
# the loop never runs to completion; it always returns from within.
|
||||||
|
for idx, unicode_version in enumerate(unicode_versions):
|
||||||
|
# look ahead to next value
|
||||||
|
try:
|
||||||
|
cmp_next_version = _wcversion_value(unicode_versions[idx + 1])
|
||||||
|
except IndexError:
|
||||||
|
# at end of list, return latest version
|
||||||
|
return latest_version if not _return_str else latest_version.encode()
|
||||||
|
|
||||||
|
# Maybe our given version has less parts, as in tuple(8, 0), than the
|
||||||
|
# next compare version tuple(8, 0, 0). Test for an exact match by
|
||||||
|
# comparison of only the leading dotted piece(s): (8, 0) == (8, 0).
|
||||||
|
if cmp_given == cmp_next_version[:len(cmp_given)]:
|
||||||
|
return unicode_versions[idx + 1]
|
||||||
|
|
||||||
|
# Or, if any next value is greater than our given support level
|
||||||
|
# version, return the current value in index. Even though it must
|
||||||
|
# be less than the given value, it's our closest possible match. That
|
||||||
|
# is, 4.1 is returned for given 4.9.9, where 4.1 and 5.0 are available.
|
||||||
|
if cmp_next_version > cmp_given:
|
||||||
|
return unicode_version
|
||||||
|
assert False, ("Code path unreachable", given_version, unicode_versions)
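The matching rules above (exact match, nearest lower level, earliest level as a floor, latest level as a ceiling) can also be expressed with a sorted-key lookup. The following is a simplified sketch, not this module's algorithm: it assumes a hypothetical subset of levels, pads short versions with zeros instead of prefix-matching, emits no warnings, and does not handle ``auto``, ``latest``, or invalid input. ``nearest_supported`` and ``version_key`` are hypothetical names.

```python
from bisect import bisect_right

# hypothetical subset of supported levels, in ascending order
SUPPORTED = ("4.1.0", "5.0.0", "8.0.0", "13.0.0")


def version_key(ver_string):
    """Map a dotted version string to a tuple of ints."""
    return tuple(int(part) for part in ver_string.split('.'))


def nearest_supported(given, supported=SUPPORTED):
    """Return the highest supported level that is not above the given version."""
    keys = [version_key(ver) for ver in supported]
    given_key = version_key(given)
    # pad to three components so that '8.0' compares equal to '8.0.0'
    given_key += (0,) * (3 - len(given_key))
    # index of the first key strictly greater than the given version
    idx = bisect_right(keys, given_key)
    # step back one row; clamp to the earliest level when nothing is lower
    return supported[max(idx - 1, 0)]


print(nearest_supported('4.9.9'))
print(nearest_supported('8.0'))
print(nearest_supported('1'))
```

Zero-padding and prefix-matching agree whenever an ``n.0.0`` level exists for the requested major version, but they can diverge otherwise, so this is best read as an illustration of the ordering rather than a drop-in replacement.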
|