@ -0,0 +1,154 @@
@ -0,0 +1,675 @@
@ -0,0 +1,502 @@
@ -0,0 +1,165 @@
@ -0,0 +1,48 @@
# Copyright © 2019 Christian Persch
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
# General Public License for more details.
# You should have received a copy of the GNU Lesser General Public License
# along with this library. If not, see <https://www.gnu.org/licenses/>.
vte_gtk3_api_version = @vte_gtk3_api_version@
vte_gtk4_api_version = @vte_gtk4_api_version@
NINJA = ninja $(NJOBS)
$(NINJA) clean
$(NINJA) coverage
$(NINJA) vte-$(vte_gtk3_api_version)-doc
$(NINJA) install
$(NINJA) uninstall
@ -0,0 +1,61 @@
Virtual TErminal
VTE provides a virtual terminal widget for GTK applications.
$ git clone https://gitlab.gnome.org/GNOME/vte # Get the source code of VTE
$ cd vte # Change to the toplevel directory
$ meson _build # Run the configure script
$ ninja -C _build # Build VTE
[ Optional ]
$ ninja -C _build install # Install VTE to default `/usr/local`
* By default, VTE will install under `/usr/local`. You can customize the
prefix directory with the `--prefix` option; e.g. if you want to install VTE under
`~/foobar`, you should run `meson _build --prefix=~/foobar`. If you have already
run the configure script before, you should also pass the `--reconfigure` option to it.
* You may need to execute `ninja -C _build install` as root
(i.e. `sudo ninja -C _build install`) if installing to system directories.
* If you wish to test VTE before installing it, you may execute it directly from
its build directory. With `_build` as the build directory, it should be `_build/src/app/vte-[version]`.
* You can pass the `-Ddebugg=true` option to meson if you wish to enable the debug functions.
After building VTE with the `-Ddebugg=true` flag, you can use the `VTE_DEBUG` environment variable to make
VTE print out debug information:
# Change vte-2.91 to the version you built
$ VTE_DEBUG=selection ./_build/src/app/vte-2.91
# Or, you can combine multiple logging flags
$ VTE_DEBUG=selection,draw,cell ./_build/src/app/vte-2.91
# Or, you can use `all` to print out all logging messages
$ VTE_DEBUG=all ./_build/src/app/vte-2.91
For logging level information, please refer to enum [VteDebugFlags](src/debug.h).
Bugs should be filed here: https://gitlab.gnome.org/GNOME/vte/issues/
Please note that this is *not a support forum*; if you are an end user,
always file bugs in your distribution's bug tracker, or use their
support forums.
If you want to provide a patch, please attach it to an issue in GNOME
GitLab, in the format output by the git format-patch command.
@ -0,0 +1,38 @@
# Copyright © 2018, 2019 Iñigo Martínez
# Copyright © 2019 Christian Persch
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
# General Public License for more details.
# You should have received a copy of the GNU Lesser General Public License
# along with this library. If not, see <https://www.gnu.org/licenses/>.
gir_dep = dependency('gobject-introspection-1.0', version: '>= 0.9.0')
if get_option('gtk3')
libvte_gtk3_gir_includes = [
libvte_gtk3_gir = gnome.generate_gir(
sources: libvte_gtk3_public_headers + libvte_common_doc_sources,
includes: libvte_gtk3_gir_includes,
dependencies: libvte_gtk3_dep,
extra_args: '-DVTE_COMPILATION',
nsversion: vte_gtk3_api_version,
namespace: 'Vte',
export_packages: vte_gtk3_api_name,
header: 'vte' / 'vte.h',
install: true,
@ -0,0 +1,23 @@
# Copyright © 2018, 2019 Iñigo Martínez
# Copyright © 2019 Christian Persch
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
# General Public License for more details.
# You should have received a copy of the GNU Lesser General Public License
# along with this library. If not, see <https://www.gnu.org/licenses/>.
if get_option('gir') and (get_option('gtk3') or get_option('gtk4'))
if get_option('vapi') and get_option('gtk3')
@ -0,0 +1 @@
Terminal.spawn_async skip = false
@ -0,0 +1,23 @@
<?xml version="1.0" encoding="UTF-8"?>
Copyright © 2014 Christian Persch
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
<gresource prefix="/org/gnome/vte/test/app">
<file alias="ui/window.ui" compressed="true" preprocess="xml-stripblanks">app.ui</file>
<file alias="ui/search-popover.ui" compressed="true" preprocess="xml-stripblanks">search-popover.ui</file>
@ -0,0 +1,152 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated with glade 3.19.0 -->
<requires lib="gtk+" version="3.10"/>
<template class="TestWindow" parent="GtkApplicationWindow">
<property name="can_focus">False</property>
<property name="role">vte-terminal</property>
<property name="icon_name">utilities-terminal</property>
<object class="GtkBox" id="terminal_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
<object class="GtkScrollbar" id="scrollbar">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="orientation">vertical</property>
<property name="restrict_to_fill_level">False</property>
<property name="fill_level">0</property>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="pack_type">end</property>
<property name="position">1</property>
<child type="titlebar">
<object class="GtkHeaderBar" id="headerbar1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="show_close_button">True</property>
<property name="decoration_layout">:close</property>
<object class="GtkButton" id="copy_button">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="receives_default">True</property>
<property name="tooltip_text" translatable="yes">Copy</property>
<property name="action_name">win.copy</property>
<property name="action_target">"text"</property>
<property name="focus_on_click">False</property>
<object class="GtkImage" id="image2">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="icon_name">edit-copy-symbolic</property>
<property name="use_fallback">True</property>
<property name="position">1</property>
<object class="GtkButton" id="paste_button">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="receives_default">True</property>
<property name="tooltip_text" translatable="yes">Paste</property>
<property name="action_name">win.paste</property>
<property name="focus_on_click">False</property>
<object class="GtkImage" id="image3">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="icon_name">edit-paste-symbolic</property>
<property name="use_fallback">True</property>
<property name="position">2</property>
<object class="GtkToggleButton" id="find_button">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="receives_default">True</property>
<property name="focus_on_click">False</property>
<object class="GtkImage" id="image5">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="icon_name">edit-find-symbolic</property>
<property name="use_fallback">True</property>
<property name="position">4</property>
<child type="title">
<object class="GtkMenuButton" id="gear_button">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="receives_default">True</property>
<property name="focus_on_click">False</property>
<object class="GtkImage" id="image1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="icon_name">open-menu-symbolic</property>
<property name="use_fallback">True</property>
<property name="pack_type">end</property>
<property name="position">3</property>
<object class="GtkBox" id="notifications_box">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="orientation">vertical</property>
<property name="spacing">6</property>
<object class="GtkImage" id="readonly_emblem">
<property name="can_focus">False</property>
<property name="tooltip_text" translatable="yes">Read-only</property>
<property name="icon_name">emblem-readonly</property>
<property name="use_fallback">True</property>
<property name="expand">False</property>
<property name="fill">True</property>
<property name="position">0</property>
<property name="pack_type">end</property>
<property name="position">4</property>
@ -0,0 +1,5 @@
[CCode (cprefix = "", lower_case_cprefix = "", cheader_filename = "config.h")]
namespace Config
public const string VERSION;
@ -0,0 +1,100 @@
# Copyright © 2018, 2019 Iñigo Martínez
# Copyright © 2019 Christian Persch
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
# General Public License for more details.
# You should have received a copy of the GNU Lesser General Public License
# along with this library. If not, see <https://www.gnu.org/licenses/>.
assert(get_option('gir'), 'gir is required for vala support')
assert(get_option('gtk3'), 'vala support only available for gtk3')
add_languages('vala', required: true)
valac = meson.get_compiler('vala')
assert(valac.version().version_compare('>= 0.24.0'), 'vala >= 0.24 required')
posix_dep = valac.find_library('posix')
libvte_gtk3_vapi_deps = [
libvte_gtk3_vapi_dep = gnome.generate_vapi(
sources: libvte_gtk3_gir[0],
packages: libvte_gtk3_vapi_deps,
install: true,
# Vala test application
vapp_resource_data = files(
vapp_resource_sources = gnome.compile_resources(
c_name: 'app',
dependencies: vapp_resource_data,
export: true,
vapp_sources = vapp_resource_sources + files(
vapp_cflags = [
vapp_valaflags = [
if valac.version().version_compare('>= 0.31.1')
vapp_valaflags += '--disable-since-check'
if gtk3_dep.version().version_compare('>= 3.16')
vapp_valaflags += '--define=GTK_3_16'
vapp_incs = [
vapp_deps = [
vapp = executable(
sources: vapp_sources,
include_directories: vapp_incs,
dependencies: vapp_deps,
c_args: vapp_cflags,
vala_args: vapp_valaflags,
install: false,
@ -0,0 +1,249 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated with glade 3.19.0 -->
<requires lib="gtk+" version="3.16"/>
<template class="TestSearchPopover" parent="GtkPopover">
<property name="can_focus">False</property>
<property name="transitions_enabled">False</property>
<object class="GtkBox" id="box1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="margin_left">12</property>
<property name="margin_right">12</property>
<property name="margin_top">12</property>
<property name="margin_bottom">12</property>
<property name="orientation">vertical</property>
<object class="GtkBox" id="box2">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="spacing">18</property>
<object class="GtkBox" id="box4">
<property name="visible">True</property>
<property name="can_focus">False</property>
<object class="GtkSearchEntry" id="search_entry">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="activates_default">True</property>
<property name="width_chars">30</property>
<property name="primary_icon_name">edit-find-symbolic</property>
<property name="primary_icon_activatable">False</property>
<property name="primary_icon_sensitive">False</property>
<property name="placeholder_text" translatable="yes">Search</property>
<property name="expand">True</property>
<property name="fill">True</property>
@ -0,0 +1,50 @@
Unicode defines width information for characters. Conventionally this
describes the number of columns a character is expected to occupy when
printed or drawn using a monospaced font.
There are five width classes with which we concern ourselves. Four of
these are narrow, wide, half-width, and full-width. For practical
purposes, narrow and half-width can be grouped together as
"single-width" (occupying one column), and wide and full-width can be
grouped together as "double-width" (occupying two columns).
The last class we're concerned with is those of ambiguous width. These
are characters which have the same meaning and graphical representation
everywhere, but which are either single-width or double-width based on
the context in which they appear.
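As a rough illustration of the grouping described above, here is a minimal
sketch that maps a character to one of the three practical classes. It relies
on Python's unicodedata.east_asian_width(); the function name width_class is
hypothetical, and treating Unicode's "neutral" property value as single-width
is an assumption of this sketch, not something this document specifies.
Combining (zero-width) characters are ignored.

```python
import unicodedata

# Property values returned by unicodedata.east_asian_width():
# 'Na' narrow, 'H' half-width, 'W' wide, 'F' full-width,
# 'A' ambiguous, 'N' neutral (assumed single-width in this sketch).
DOUBLE_WIDTH = {"W", "F"}


def width_class(ch):
    """Return 'single', 'double' or 'ambiguous' for one character."""
    eaw = unicodedata.east_asian_width(ch)
    if eaw in DOUBLE_WIDTH:
        return "double"
    if eaw == "A":
        return "ambiguous"
    return "single"
```

For example, an ASCII letter is reported as single, a CJK ideograph as double,
and a box-drawing character such as U+2502 as ambiguous.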
Width information is crucial for terminal-based applications which need
to address the screen: if the application draws five characters and
expects the cursor to have moved six columns to the right, and the
terminal moves the cursor seven (or five, or any number other than six),
display bugs manifest.
Ambiguously-wide characters pose an implementation problem for terminals
which may not be running in the same locale as an application which is
running inside the terminal. In these cases, the terminal cannot depend
on the libc wcwidth() function because wcwidth() typically makes use of
locale information.
There are basically four approaches to solving this problem:
A) Force characters with ambiguous width to be single-width.
B) Force characters with ambiguous width to be double-width.
C) Force characters with ambiguous width to have a width value based
on the locale's region.
D) Force characters with ambiguous width to have a width value based
on the locale's encoding.
Methods A and B will produce display bugs, because they don't take into
account any context information. Method C fails on glibc-based systems
because glibc uses method D and the two methods produce different
results for the same wchar_t values.
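The four approaches can be compared with a small sketch. Everything here is
illustrative: the function name ambiguous_columns, the region list and the
encoding list are assumptions made for the example, and they are not taken
from VTE or glibc.

```python
# Hypothetical sets, chosen only to make the example concrete.
CJK_REGIONS = {"CN", "HK", "JP", "KR", "TW"}
CJK_ENCODINGS = {"big5", "euc-jp", "euc-kr", "gb2312", "gbk", "shift_jis"}


def ambiguous_columns(method, region="", encoding=""):
    """Return the column count given to an ambiguous-width character."""
    if method == "A":  # always single-width
        return 1
    if method == "B":  # always double-width
        return 2
    if method == "C":  # decided by the locale's region
        return 2 if region.upper() in CJK_REGIONS else 1
    if method == "D":  # decided by the locale's encoding
        return 2 if encoding.lower() in CJK_ENCODINGS else 1
    raise ValueError("unknown method: %r" % (method,))
```

With a locale such as ja_JP.UTF-8, method C yields 2 and method D yields 1
under these assumptions, which is the kind of disagreement between C and D
described above.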
So the VteTerminal widget uses approach D. Depending on the context in
which a character was received (a combination of the terminal's encoding
and whether or not the character was received as an ISO-2022 sequence),
a character is internally assigned a width when it is received from the
Text which is not received from the terminal (input method preedit data)
is processed using method C, although now that I think about it, the
fact that it's UTF-8 text suggests that these characters should be
treated as single-width.
@ -0,0 +1,341 @@
Single width, hollow.
┌─┐ )0lqk
│ │ )0x x
└─┘ )0mqj
│ │
Single width, single fill.
┌┬┐ )0lwk
├┼┤ )0tnu
└┴┘ )0mvj
Double width, hollow.
┏━┓ )0
┃ ┃ )0
┗━┛ )0
║ ║
Double width, double fill.
┏┳┓ )0
┣╋┫ )0
┗┻┛ )0
Double width, single fill.
┏┯┓ )0
┠┼┨ )0 n
┗┷┛ )0
Single width, double fill.
┌┰┐ )0l k
┝╋┥ )0
└┸┘ )0m j
Single width, mixed fill (double horizontal, single vertical).
┌┬┐ )0lwk
┝┿┥ )0
└┴┘ )0mvj
Double width, mixed fill (double vertical, single horizontal).
┏┳┓ )0
┠╂┨ )0
┗┻┛ )0
Double horizontal, single vertical.
Double vertical, single horizontal.
Single width, double, triple and quadruple dash.
┌╌╌┐ ┌┄┄┐ ┌┈┈┐
╎ ╎ ┆ ┆ ┊ ┊
╎ ╎ ┆ ┆ ┊ ┊
└╌╌┘ └┄┄┘ └┈┈┘
Double width, double, triple and quadruple dash.
┏╍╍┓ ┏┅┅┓ ┏┉┉┓
╏ ╏ ┇ ┇ ┋ ┋
╏ ╏ ┇ ┇ ┋ ┋
┗╍╍┛ ┗┅┅┛ ┗┉┉┛
One single, two double lines meet.
┢┪ ┲┱
┡┩ ┺┹
One double, two single lines meet.
┞┦ ┭┮
┟┧ ┵┶
One single, three double lines meet.
╇ ╉╊
One double, three single lines meet.
╁ ┾┽
Two double, two single lines meet.
Mixed width, starting, ending and changing width mid-character.
╷ ╻ ╶╼╸
╽ ╿ ╺╾╴
╹ ╵
Single line with vertical lines crossing
║ ┃ │ │ │ ┃ ║
║ ┃ │ │ │ ┃ ║
│ │
╲ ╱ ╲ ╱ ╳ ╳ ╲ ╱ ╲ ╱ ╲ ╱ ╳ ╳ ╳ ╳ ╳╳╳╳╳╳╳
╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳╳╳╳╳╳╳
╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╱ ╲ ╱ ╲ ╱ ╲ ╳ ╳ ╳ ╳ ╳╳╳╳╳╳╳
╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳╳╳╳╳╳╳
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╲ ╱ ╲ ╱ ╲ ╱ ╳ ╳ ╳ ╳ ╳╳╳╳╳╳╳
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╳ ╳ ╳ ╳ ╳ ╳ ╳ ╳
╱ ╲ ╱ ╲ ╳ ╳ ╱ ╲ ╱ ╲ ╱ ╲ ╳ ╳ ╳ ╳
╲ ╱ ╲ ╱ ╱╲ ╱╲ ╱╲ ╱╲ ╱╲ ╲╱╲╱╲╱╲╱
╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲╱ ╲╱ ╲╱ ╱╲╱╲╱╲╱╲
╲╱ ╲╱ ╲╱ ╲╱ ╲╱ ╱╲ ╱╲ ╱╲ ╲╱╲╱╲╱╲╱
╱╲ ╱╲ ╱╲ ╱╲ ╱╲ ╲╱ ╲╱ ╲╱ ╱╲╱╲╱╲╱╲
╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱ ╲ ╱╲ ╱╲ ╱╲ ╲╱╲╱╲╱╲╱
╱ ╲ ╱ ╲ ╲╱ ╲╱ ╲╱ ╲╱ ╲╱ ╱╲╱╲╱╲╱╲
Block elements.
█▏ 🭽▔🭶🭷🭸🭹🭺🭻▁🮀🮁🮀▁🭻🭺🭹🭸🭷🭶▔🭾
🮋▎ ▏ ▕
🭰🭰 🭰 🭵
🮋▎ 🭱 ▐ ▌ ▛▀#▀▜ 🭴
🮊▍ 🭲 ▄▞▀ ▗▄▀▘ ▌▗▄▖▐ 🭳
🭱🭱 🭳 ▌ ▐ #▐#▌# 🭲
🮊▍ 🭴 ▀▚▄ ▝▀▄▖ ▌▝▀▘▐ 🭱
🮉▌ 🭵 ▐ ▌ ▙▄#▄▟ 🭰
🭲🭲 ▕ ▏
🮉▌ ▕ ▁▂▃▄▅▆▇█ ▖# ▗# ▏
▐▋ ▕ ▕ ▉ ▌# ▐# ▏
🭳🭳 ▕ 🮇 ▊ ▐# ▌# ▏
▐▋ ▕ 🮈 ░ ▋ ▝# ▘# ▏
🮈▊ 🭵 ▐ ▒░ ▌ 🭰
🭴🭴 🭴 🮉 ▓▒░ ▍ ▌# ▐# 🭱
🮈▊ 🭳 🮊 █▓▒░ ▎ ▚# ▞# 🭲
🮇▉ 🭲 🮋 ▏ ▐# ▌# 🭳
🭵🭵 🭱 █🮆🮅🮄▀🮃🮂▔ 🭴
🮇▉ 🭰 🭵
▕█ ▏ ▕
▕▕ 🭼▁🭻🭺🭹🭸🭷🭶▔🮀🮁🮀▔🭶🭷🭸🭹🭺🭻▁🭿
░░░░░ ▒▒▒▒▒ ▓▓▓▓▓ ██▓▓▓▓▓█████▒▒▒▒▒█████░░░░░█████
░ ░░░░░ ▒ ▒▒▒▒▒ ▓ ▓▓▓▓▓ ██▓▓▓▓▓██▓██▒▒▒▒▒██▒██░░░░░██░██
░░░░░ ▒▒▒▒▒ ▓▓▓▓▓ ██▓▓▓▓▓█████▒▒▒▒▒█████░░░░░█████
Hatchings and Checkerboards
🮘🮘🮘🮘 🮙🮙🮙🮙 🮘🮙🮘🮙 🮕🮕🮕🮕 🮖🮖🮖🮖 🮕🮖🮕🮖
🮘🮘🮘🮘 🮙🮙🮙🮙 🮙🮘🮙🮘 🮕🮕🮕🮕 🮖🮖🮖🮖 🮖🮕🮖🮕
🮘🮘🮘🮘 🮙🮙🮙🮙 🮘🮙🮘🮙 🮕🮕🮕🮕 🮖🮖🮖🮖 🮕🮖🮕🮖
🮘🮘🮘🮘 🮙🮙🮙🮙 🮙🮘🮙🮘 🮕🮕🮕🮕 🮖🮖🮖🮖 🮖🮕🮖🮕
🬇🬋🬃 🬦🬹🬓 🬞🬭🬏 🬠🬰🬐 🬁🬂🬀 🬉🬎🬄 🬇🬋🬃
🬭🬞🬏 🬹🬦🬓
█▐▌ █▐▌
🬂🬁🬀 🬎🬉🬄
🬭🬭🬭 🬚🬋🬩 🬕🬂🬨 🬹🬹🬹 🬝🬎🬬 🬴🬰🬸 🬛🬋🬫
▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐
🬂🬂🬂 🬌🬋🬍 🬲🬭🬷 🬎🬎🬎 🬺🬹🬻 🬴🬰🬸 🬛🬋🬫
🬞🬭🬏 🬦🬋🬓 ▐🬂▌ 🬦🬹🬓 ▐🬎▌ ▐🬰▌ ▐🬋▌
▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌ ▐ ▌
🬁🬂🬀 🬉🬋🬄 ▐🬭▌ 🬉🬎🬄 ▐🬹▌ ▐🬰▌ ▐🬋▌
🬞 🬏
🬖🬏🬇🬗 🬈🬀🬁🬅 🬤🬃🬞🬢
🬠 🬞🬢 🬔🬓🬦🬧 🬖🬏 🬐
🬣🬄🬁🬅 🬁 🬀 🬈🬀🬉🬘
🬥 🬙 🬆 🬊 🬒 🬡 🬑 🬟
🬇 🬃 🬐 🬠 🬃 🬇 🬃 🬇
🬳 🬶 🬱 🬵 🬮 🬯 🬟 🬑
🬞🬻🬺🬏 🬞🬜🬪🬏 🬞🬅🬈🬏
🬵🬝🬀🬁🬬🬱 🬵🬆 🬊🬱 🬖🬀 🬁🬢
🬻🬆 🬊🬺 🬜🬀 🬁🬪 🬔 🬧
🬬🬱 🬵🬝 🬪🬏 🬞🬜 🬣 🬘
🬊🬺🬏🬞🬻🬆 🬊🬱 🬵🬆 🬈🬏 🬞🬅
🬁🬬🬝🬀 🬁🬪🬜🬀 🬁🬢🬖🬀
Slope 1/3.
🭈🭆🭂🭍🭑🬽 🭈🬭🭆🬹🭂█🭍🬹🭑🬭🬽
🭣🭧🭓🭞🭜🭘 █#########█
Slope 2/3.
🭇🬼 🬞🬏
🭇🭄🭏🬼 🭊🭁🭌🬿 🭇🬭🭄█🭏🬭🬼 🭊🬹🭁🭌🬹🬿
🭢🭕🭠🭗 🭥🭒🭝🭚 ▐#####▌ █####█
🭢🭗 🭢🬂🭕█🭠🬂🭗 🭥🬎🭒🭝🬎🭚
Slope 1.
◢◣ 🮞🮟
◥◤ 🮝🮜
Slope 4/3.
🭉🬾 ▐#▌
🭃🭎 🬞🭃#🭎🬏
🭔🭟 🬁🭔#🭟🬀
🭤🭙 ▐#▌
Slope 2.
🭋🭀 ▐#▌
🭅🭐 🭅#🭐
🭖🭡 🭖#🭡
🭦🭛 ▐#▌
Diagonal quarters.
🭯 🭯 🭯 🭯
🭯 🭫 🭮🭫🭬 🮞🭫🮟 ◢🭫◣ 🭯🭯🭯🭯
🭫 🭯 🭯 🮞🮜 🮝🮟 ◢◤ ◥◣ 🭮🮛🮛🭬🮚🮚🮚🮚
🭮🭪 🭨🭬 🭮🭪 🭨🭬 🭮🭪 🭨🭬 🭮🭪 🭨🭬 🭮🭪 🭨🭬 🭮🮛🮛🭬🮚🮚🮚🮚
🭩 🭭 🭭 🮝🮟 🮞🮜 ◥◣ ◢◤ 🭮🮛🮛🭬🮚🮚🮚🮚
🭭 🭩 🭮🭩🭬 🮝🭩🮜 ◥🭩◤ 🭭🭭🭭🭭
🭭 🭭 🭭 🭭
🭯 🭯 🭯
◢◣◢◣◢◣ ◢🭫🭩🭫🭩🭫◣ ◢🭩🭩🭩◣ 🭯🭯🭯
◢◤◥◤◥◤◥◣ 🭮🭪 🭭 🭭 🭨🭬 ◢◤🭭🭭🭭◥◣ ◢🭫🭫🭫◣
◥◣ ◢◤ 🭨🭬 🭮🭪 🭨🭬 🭮🭪 🭮🭪 🭨🭬
◢◤ ◥◣ 🭮🭪 🭨🭬 🭨🭬 🭮🭪 🭮🭪 🭨🭬
◥◣ ◢◤ 🭨🭬 🭮🭪 🭨🭬 🭮🭪 🭮🭪 🭨🭬
◢◤ ◥◣ 🭮🭪 🭯 🭯 🭨🭬 ◥◣🭯🭯🭯◢◤ ◥🭩🭩🭩◤
◥◣◢◣◢◣◢◤ ◥🭩🭫🭩🭫🭩◤ ◥🭫🭫🭫◤ 🭭🭭🭭
◥◤◥◤◥◤ 🭭 🭭 🭭
🮞◣🮞◣🮞◣ ◢🮟◢🮟◢🮟 ╱🮟╱🮟╱🮟 🮞╲🮞╲🮞╲
🮞🮜◥🮜◥🮜◥◣ ◢◤🮝◤🮝◤🮝🮟 ╱╱🮝╱🮝╱🮝🮟 🮞🮜╲🮜╲🮜╲╲
◥◣ 🮞🮜 🮝🮟 ◢◤ 🮝🮟 ╱╱ ╲╲ 🮞🮜
🮞🮜 ◥◣ ◢◤ 🮝🮟 ╱╱ 🮝🮟 🮞🮜 ╲╲
◥◣ 🮞🮜 🮝🮟 ◢◤ 🮝🮟 ╱╱ ╲╲ 🮞🮜
🮞🮜 ◥◣ ◢◤ 🮝🮟 ╱╱ 🮝🮟 🮞🮜 ╲╲
◥◣🮞◣🮞◣🮞🮜 🮝🮟◢🮟◢🮟◢◤ 🮝🮟╱🮟╱🮟╱╱ ╲╲🮞╲🮞╲🮞🮜
◥🮜◥🮜◥🮜 🮝◤🮝◤🮝◤ 🮝╱🮝╱🮝╱ ╲🮜╲🮜╲🮜
╱◣╱◣╱◣ ◢╲◢╲◢╲ ╱╲╱╲╱╲ 🮣🮧🮧🮧🮧🮢
╱╱◥╱◥╱◥◣ ◢◤╲◤╲◤╲╲ ╱╱╲╱╲╱╲╲ 🮣🮨🮧🮧🮧🮧🮩🮢
◥◣ ╱╱ ╲╲ ◢◤ ╲╲ ╱╱ 🮤🮤 🮥🮥
╱╱ ◥◣ ◢◤ ╲╲ ╱╱ ╲╲ 🮤🮤 🮥🮥
◥◣ ╱╱ ╲╲ ◢◤ ╲╲ ╱╱ 🮤🮤 🮥🮥
╱╱ ◥◣ ◢◤ ╲╲ ╱╱ ╲╲ 🮤🮤 🮥🮥
◥◣╱◣╱◣╱╱ ╲╲◢╲◢╲◢◤ ╲╲╱╲╱╲╱╱ 🮡🮩🮦🮦🮦🮦🮨🮠
◥╱◥╱◥╱ ╲◤╲◤╲◤ ╲╱╲╱╲╱ 🮡🮦🮦🮦🮦🮠
╷ ╷ 🮣─🮢 🮣─🮦─🮢
🮣─🮢 ┌🮧┐ ╶🮭─🮬╴ │ │ │ │ │
│ │ 🮤 🮥 │ │ 🮣─🮨─🮩─🮢 🮥─🮮─🮤
🮡─🮠 └🮦┘ ╶🮫─🮪╴ │ │ │ │ │ │ │
╵ ╵ 🮡─🮠 🮡─🮠 🮡─🮧─🮠
▗▘ ▝▖
🮔 🮏
█ 🮍▒🮌
🮔 🮎
▝▖ ▗▘
🬤🬤🬤🬤⅓█ █ 🬗🬗🬗🬗
🬗🬗🬗🬗█ █ █🬤🬤🬤🬤
🬤🬤🬤🬤 █ █ 🬗🬗🬗🬗
¼ 🬗🬗🬗🬗█ █ █🬤🬤🬤🬤
▒▒▒▒🮖🮖🮖🮖▞▞▞▞½█ █ ▚▚▚▚🮕🮕🮕🮕🮐🮐🮐🮐 ▎ 🮇 ▎ 🮇
▒▒▒▒🮖🮖🮖🮖▞▞▞▞█ █ █▚▚▚▚🮕🮕🮕🮕🮐🮐🮐🮐 🮂🮕🮗🮖🮂 🮂🮖🮗🮕🮂
▒▒▒▒🮖🮖🮖🮖▞▞▞▞ █ █ ▚▚▚▚🮕🮕🮕🮕🮐🮐🮐🮐 ▂🮕🮗🮖▂ ▂🮖🮗🮕▂
▒▒▒▒🮖🮖🮖🮖▞▞▞▞█ █ █▚▚▚▚🮕🮕🮕🮕🮐🮐🮐🮐 ▎ 🮇 ▎ 🮇
🬘🬘🬘🬘⅔█ █ 🬣🬣🬣🬣
🬧🬧🬧🬧█ █ █🬔🬔🬔🬔
🬣🬣🬣🬣 █ █ 🬘🬘🬘🬘
🬔🬔🬔🬔█ █ █🬧🬧🬧🬧
🮣🮢 🮣🮢 🮣🮢🮣🮢
🮣🮠🮡🮢🮣🮨🮩🮢 🮭🮬 🮡🮩🮨🮠 🮨🮨🮨🮩🮩🮩 🮭🮭🮭🮬🮬🮬 🮮🮮🮮🮮🮮🮮
🮡🮢🮣🮠🮡🮩🮨🮠 🮫🮪 🮣🮨🮩🮢 🮨🮨🮨🮩🮩🮩 🮭🮭🮭🮬🮬🮬 🮮🮮🮮🮮🮮🮮
🮡🮠 🮡🮠 🮡🮠🮡🮠 🮨🮨🮨🮩🮩🮩 🮭🮭🮭🮬🮬🮬 🮮🮮🮮🮮🮮🮮
🮣🮧🮢 🮣🮧🮢 🮣🮦🮢 🮭🮦🮬 🮩🮩🮩🮨🮨🮨 🮫🮫🮫🮪🮪🮪 🮮🮮🮮🮮🮮🮮
🮤 🮥 🮤🮮🮥 🮥 🮤 🮥 🮤 🮩🮩🮩🮨🮨🮨 🮫🮫🮫🮪🮪🮪 🮮🮮🮮🮮🮮🮮
🮡🮦🮠 🮡🮦🮠 🮡🮧🮠 🮫🮧🮪 🮩🮩🮩🮨🮨🮨 🮫🮫🮫🮪🮪🮪 🮮🮮🮮🮮🮮🮮
◤◤◤◥◥◥ 🮜🮜🮜🮝🮝🮝
◤◤◤◥◥◥ 🮜🮜🮜🮝🮝🮝
◤◤◤◥◥◥ 🮜🮜🮜🮝🮝🮝
◣◣◣◢◢◢ 🮟🮟🮟🮞🮞🮞
◣◣◣◢◢◢ 🮟🮟🮟🮞🮞🮞
◣◣◣◢◢◢ 🮟🮟🮟🮞🮞🮞
VT-102: http://vt100.net/docs/vt102-ug/table5-13.html
Unicode: http://www.unicode.org/charts/PDF/U2500.pdf
@ -0,0 +1,108 @@
<title>VTE Reference Manual</title>
Documentation for VTE version &version;.
The latest version of this documentation can be found on-line at the
<ulink role="online-location" url="http://library.gnome.org/devel/vte/">GNOME Library</ulink>.
<holder>Christian Persch</holder>
Permission is granted to copy, distribute and/or modify this document
under the terms of the <citetitle>GNU Lesser General Public Licence</citetitle>, Version 2.1
or (at your option) any later version published by the Free Software Foundation.
You may obtain a copy of the <citetitle>GNU Lesser General Public Licence</citetitle>
from the Free Software Foundation at
<ulink type="http" url="http://www.gnu.org/licences/">GNU Licences web site</ulink>
or by writing to:
The Free Software Foundation, Inc.,
<street>51 Franklin St</street> – Fifth Floor,
<city>Boston</city>, <state>MA</state> <postcode>02110-1301</postcode>,
<title>API Reference</title>
<xi:include href="xml/vte-terminal.xml"/>
<xi:include href="xml/vte-regex.xml"/>
<xi:include href="xml/vte-pty.xml"/>
<xi:include href="xml/vte-version.xml"/>
<chapter id="object-hierarchy">
<title>Object Hierarchy</title>
<xi:include href="xml/tree_index.sgml"/>
<index id="api-index-full">
<title id="index-all">Index</title>
<xi:include href="xml/api-index-full.xml"><xi:fallback /></xi:include>
<index id="api-index-deprecated" role="deprecated">
<title>Index of deprecated symbols</title>
<xi:include href="xml/api-index-deprecated.xml"><xi:fallback /></xi:include>
<index id="api-index-0-40" role="0.40">
<title>Index of new symbols in 0.40</title>
<xi:include href="xml/api-index-0.40.xml"><xi:fallback /></xi:include>
<index id="api-index-0-44" role="0.44">
<title>Index of new symbols in 0.44</title>
<xi:include href="xml/api-index-0.44.xml"><xi:fallback /></xi:include>
<index id="api-index-0-46" role="0.46">
<title>Index of new symbols in 0.46</title>
<xi:include href="xml/api-index-0.46.xml"><xi:fallback /></xi:include>
<index id="api-index-0-48" role="0.48">
<title>Index of new symbols in 0.48</title>
<xi:include href="xml/api-index-0.48.xml"><xi:fallback /></xi:include>
<index id="api-index-0-50" role="0.50">
<title>Index of new symbols in 0.50</title>
<xi:include href="xml/api-index-0.50.xml"><xi:fallback /></xi:include>
<index id="api-index-0-52" role="0.52">
<title>Index of new symbols in 0.52</title>
<xi:include href="xml/api-index-0.52.xml"><xi:fallback /></xi:include>
<index id="api-index-0-54" role="0.54">
<title>Index of new symbols in 0.54</title>
<xi:include href="xml/api-index-0.54.xml"><xi:fallback /></xi:include>
<index id="api-index-0-56" role="0.56">
<title>Index of new symbols in 0.56</title>
<xi:include href="xml/api-index-0.56.xml"><xi:fallback /></xi:include>
<index id="api-index-0-58" role="0.58">
<title>Index of new symbols in 0.58</title>
<xi:include href="xml/api-index-0.58.xml"><xi:fallback /></xi:include>
<xi:include href="xml/annotation-glossary.xml"><xi:fallback /></xi:include>
@ -0,0 +1,240 @@
<SUBSECTION Binding Accessors>
<SUBSECTION Deprecated>
<SUBSECTION Deprecated>
<TITLE>Version Information</TITLE>
@ -0,0 +1,448 @@
VTE rewrapping
as per the feature request and discussions at
by Egmont Koblinger and Behdad Esfahbod
It is a really cool feature if the terminal rewraps long lines when the window
is resized.
In order to implement this, we need to remember for each line whether we
advanced to the next because a newline (a.k.a. linefeed) was printed, or
because the end of line was reached. VTE and most other terminals already
remember this (even if they don't support rewrap) for copy-paste purposes.
Let's use the following terminology:
A "line" or "row" (these two words are used interchangeably in this document)
refers to a physical line of the terminal.
A line is "hard wrapped" if it was terminated by an explicit newline. By
contrast, a line is "soft wrapped" if the text overflowed to the next line.
It's not clear by this definition whether the last line should be defined as
hard or soft wrapped. It should be irrelevant. The definition also gets
unclear as soon as we start printing escape codes that move the cursor. E.g.
should positioning the cursor to the beginning of a previous line and printing
something there affect the soft or hard wrapped state of the preceding line?
A "paragraph" is one or more lines enclosed between two hard line breaks. That
is, the line preceding the paragraph is hard wrapped (or we're at the
beginning of the buffer), all lines of the paragraph except the last are soft
wrapped, and the last line is hard wrapped (or we're at the end of the buffer,
in which case it can also be soft wrapped).
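To make the terminology concrete, here is a minimal sketch that groups rows
into paragraphs. The representation (a list of (text, soft_wrapped) tuples)
and the name split_paragraphs are assumptions made for this example; they are
not VTE's internal data structures.

```python
def split_paragraphs(rows):
    """Join rows into paragraphs.

    rows is a list of (text, soft_wrapped) tuples, top to bottom.
    A paragraph ends at each hard wrapped row, or at the end of the
    buffer even if the last row is soft wrapped.
    """
    paragraphs = []
    current = []
    for text, soft_wrapped in rows:
        current.append(text)
        if not soft_wrapped:
            paragraphs.append("".join(current))
            current = []
    if current:  # the buffer ended on a soft wrapped row
        paragraphs.append("".join(current))
    return paragraphs
```

For instance, a soft wrapped row "hello " followed by a hard wrapped row
"world" forms the single paragraph "hello world".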
Content after rewrapping
The basic goal is that if an application prints some continuous stream of text
(with no cursor positioning escape codes) then after resizing the terminal the
text should look just as if it was originally printed at the new terminal
width.
Rewrapping paragraphs containing single width and combining characters only
should be obvious.
Double width (CJK) characters should not be cut in half. If they don't fit at
the end of the row, they should overflow to the next, leaving one empty cell
at the end of the previous line. That empty cell should not be considered when
copy-pasting the text, nor when rewrapping the text again. This is the same as
when the CJK text is originally printed.
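The double width rule can be sketched as follows. This is a simplified
illustration and not VTE's implementation: the names cell_width and rewrap are
hypothetical, combining characters and tabs are ignored, and a character is
assumed to be double width exactly when its East Asian Width property is W or
F. The empty cell left at the end of a row is implicit: it is never stored, so
it cannot leak into a later rewrap.

```python
import unicodedata


def cell_width(ch):
    """Number of cells a character occupies (1 or 2) in this sketch."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def rewrap(paragraph, columns):
    """Split one paragraph into rows that are at most `columns` cells wide.

    A double width character that does not fit at the end of a row is
    moved to the next row as a whole.
    """
    if columns < 2:
        raise ValueError("need at least 2 columns for double width text")
    rows = []
    current = []
    used = 0
    for ch in paragraph:
        width = cell_width(ch)
        if used + width > columns:
            rows.append("".join(current))
            current = []
            used = 0
        current.append(ch)
        used += width
    rows.append("".join(current))
    return rows
```

Rewrapping "blabla12345" followed by one CJK character to 13 columns keeps
everything on one row, while 12 columns moves the CJK character to a second
row and leaves the twelfth cell of the first row empty.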
TAB characters are a nightmare. Even without rewrapping, their behavior is
weird. You can print an arbitrary number of tabs and the cursor doesn't advance from
the last column. Then you can print a letter, and the cursor stays just beyond
the last cell and yet again you can print arbitrary amounts of tabs which do
nothing. Then the next letter wraps to the next line. So, even without
rewrapping, copy-pasting tabs around EOL doesn't reproduce the exact same text
that was printed by the application, tab characters can get dropped. In order
to "fix" this, we'd need to remember two numbers per line (number of tabs at
EOL before the last character, and number of tabs at EOL after the last
character). It's definitely not worth it. Furthermore, there's dynamic tab
stop positions, and the very last thing we'd want to do is to remember for
each tab character where the tab stops were when it was printed. So when
rewrapping, we don't try to rewrap to the state exactly as if the application
originally printed the text at the new width. If we do anything that's not
obviously horribly broken then we're okay. (In other words, in this respect
we're safe to say that tab is a cursor positioning code rather than a
printable character.)
Other generic expectations
Window managers can be configured to resize applications (and hence the VTE
widget) only once for the final size, or to resize them continuously. It's
expected that these two should lead to the same result (as much as possible).
Some terminal emulators scroll to the bottom on resize. VTE has traditionally
been cleverer: it kept the scroll position. I believe it's a nice feature and
we should try to keep it the same.
It is expected that a small difference in the way you resize the terminal
shouldn't lead to a big difference in behavior. This is very hard to lay out in
exact specifications; these are rather "common sense" expectations, but I try
to demonstrate them via a couple of examples. If you change the width but all
paragraphs were and still are shorter than the width, rewrapping shouldn't
change the scroll offset. If there was only 1 paragraph that needed to be
rewrapped from one line to two lines, the content shouldn't scroll by more
than 1 line anywhere on the screen. If you change the height only, the
behavior would be the same as with old non-rewrapping VTE. In this case the
rewrapping code is actually skipped (because it's an expensive operation), but
even if it was executed, the behavior should remain the same.
Normal vs alternate screen
The normal screen should always be resized and rewrapped, even if the
alternate screen is visible (bug 415277). This can occur immediately on each
resize, or once when returning from the alternate screen. Probably resizing
immediately gives a better user experience (main bug comment 34), since
resizing is a heavyweight user-initiated event, while returning from the
alternate screen is not where the user would expect the terminal to hang for
some time.
The alternate screen should not be rewrapped. It is used by applications that
have full control over the entire area and they will repaint it themselves.
Rewrapping by vte would cause ugly artifacts after vte rewraps but before the
application catches up, e.g. characters aligned below each other would become
arranged diagonally for a short while. (Moreover, with current VTE design,
rewrapping the alternate screen would require many new fds to be used: main
bug comment 60).
Cursor position after rewrapping
Both the active cursor and the saved cursor should be updated when rewrapping.
(The saved cursor might be important e.g. when returning from the alternate
screen.)
The cursor should ideally stay over the same character (whenever possible), or
as "close" to that as possible. If it is over the second cell of a CJK, or in
the middle of a Tab, it should remain so.
If rewrapping is disabled, the cursor can be anywhere to the right, even
beyond the right end of the screen. This can occur easily when the window is
narrowed. But even with rewrapping enabled, there is 1 more valid position
than the number of columns. E.g. with 80 columns, the cursor can be over the
1st character, ..., over the 80th character, or beyond the 80th character,
which are 81 valid horizontal positions; in the latter case the cursor is not
over a character. We need to distinguish all these positions and keep them
during rewrap whenever possible.
Let's assume the cursor's old position is not above a character, but at EOL or
beyond. After rewrapping, we should try to maintain this position, so we
should walk to the right from the corresponding character if possible.
However, we should not walk into text that got joined with this line during
rewrapping a paragraph, nor should we wrap to the next line.
Here are a couple of examples. Imagine the cursor stands in the underlined
cell (although it's technically an "upper one eighth block" character in the
cell below in this document). The text printed by applications doesn't contain
space characters in these examples.
- The cursor is far to the right in a hard wrapped line. Keep that position,
no matter if visible or not:
▏width 13 ▏ ▏width 20 ▏
paragraphend. <-> paragraphend.
Newparagraph ▔ Newparagraph ▔
- The cursor is far to the right in a soft wrapped line. That position cannot
be maintained, so jump to a character:
▏width 11 ▏ ▏width 10 ▏ ▏width 12 ▏
blabla12345 -> blabla1234 or blabla123456
67890 ▔ 567890 7890 ▔
- The cursor is far to the right in a soft wrapped line. That position can be
maintained because the next CJK doesn't fit:
▏width 11 ▏ ▏width 12 ▏
blabla12345 <-> blabla12345
伀 ▔ 伀 ▔
- Wrapping a CJK leaves an empty cell. Also, keep the cursor under the second
▏width 13 ▏ ▏width 12 ▏
blabla12345伀 <-> blabla12345
▔ 伀
Shell prompt
If you resize the terminal to be narrower than your shell prompt (plus the
command you're entering) while the shell is waiting for your command, you see
weird behavior there. This is not a bug in rewrapping: it's because the shell
redisplays its prompt (and command line) on every resize. There's not much VTE
could do here.
As a long term goal, maybe readline could have an option where it knows that
the terminal rewraps its contents so that it doesn't redisplay the prompt and
the command line, just expects the terminal to do this correctly. It's a bit
risky, since probably all terminals that support rewrapping do this a little
bit differently.
Scroll position, cutting lines from the bottom
A very tricky question is to figure out the scroll position after a resize.
First, let's ignore bug 708213's requirements.
Normally the scrollbar is at the bottom. If this is the case, it should remain
there.
How to position the scroll offset if the scrollbar is somewhere at the middle?
Playing with various possibilities suggested that probably the best behavior
is if we try to keep the bottom visible paragraph at the bottom. (After all,
in terminals the bottom is far more important than the top.) It's not yet
exactly specified what happens if the bottom of the viewport cuts a paragraph
in two, but even then we try to keep it approximately there.
The exact implemented behavior is: we look at the character at the cell just
under the viewport's bottom left corner, keep track where this character moves
during rewrapping, and position the scrollbar so that this character is again
just under the viewport.
As an exception, I personally found a "snap to top" feature useful: if the
scrollbar was all the way at the top, it should stay there.
Now let's address bug 708213.
This breaks the expectation that changing the terminal height back and forth
should be a no-op. To match XTerm's behavior, when the window height is
reduced and there are lines under the cursor then those lines should be
dropped for good.
It is very hard to figure out the desired behavior when this is combined with
rewrapping. E.g. in one step you decrease the height and would expect lines to
be dropped from the bottom, but in the very same step you increase the width
which causes some previously wrapped paragraphs to fit in a single line (this
could be above or below the cursor or just in the cursor's line, or all of
these) which makes room for previously undisplayed lines. What to do then?
The total number of rows, the number of rows above the cursor, and the number
of rows below the cursor can all increase/decrease/stay pretty much
independently of each other; almost all combinations are possible when
resizing diagonally with rewrapping enabled. The behavior should also be sane
when the cursor's paragraph starts wrapping.
As an additional requirement, I had the aforementioned shell prompt feature in
mind. One of the most typical use cases when the cursor is not in the bottom
row is when you edit a multiline shell command and move the cursor back. In
this case, shrinking the terminal shouldn't cut lines from the bottom.
My best idea which reasonably covers all the possible cases is that we drop
the lines (if necessary) after rewrapping, but before computing the new
scrollbar offsets, and we drop the highest number of lines that satisfies all
these three conditions:
- drop1: We shouldn't drop more lines than necessary to fit the content
without scrollbars.
- drop2: We should only drop data that's below the cursor's paragraph. (We
don't drop data that is under the cursor's row, but belongs to the same
paragraph.)
- drop3: We track the character cell that immediately follows the cursor's
paragraph (that is, the line after this paragraph, first column), and see
how much it would get closer to the top of the window (assuming viewport is
scrolled to the bottom). The original bug asks that the cursor shouldn't get
closer to the top; with rewrapping I found that it's probably not the cursor
but the end of the cursor's paragraph that makes sense to
track. We shouldn't drop more lines than the amount by which this point
would get closer to the top.
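The three conditions can be read as three upper bounds, with the number of
dropped lines being the smallest of them and never negative. A minimal Python
sketch of that reading follows; the function and parameter names are
hypothetical and are not taken from the VTE sources.

```python
def lines_to_drop(excess_rows, rows_below_paragraph, marker_rise):
    """Return how many rows to drop from the bottom after rewrapping.

    All three parameters are hypothetical names for the bounds above:
      excess_rows          -- drop1: rows beyond what fits without a scrollbar
      rows_below_paragraph -- drop2: rows below the cursor's paragraph
      marker_rise          -- drop3: rows by which the cell following the
                              cursor's paragraph would move towards the top
    """
    drop = min(excess_rows, rows_below_paragraph, marker_rise)
    # Any bound being zero or negative means nothing may be dropped.
    return max(drop, 0)
```

For instance, when the cursor's paragraph is the last one, the drop2 bound is
zero and nothing is dropped, which matches the multiline shell command case.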
Storing lines
Vte's ring was designed with rewrapping in mind, nevertheless it operates with
rows. Changing it to work on paragraphs would require heavy refactoring, and
would cause all sorts of troubles with overlong paragraphs. As the main
features of terminals (showing content, scrolling etc.) are all built around
rows, such a change for rewrapping only doesn't sound feasible. It's even
unclear which approach would be better for a terminal built from scratch. So
we decided to keep Vte operating on rows. Rewrapping is an expensive operation
that builds up the notion of paragraphs from rows, and then cuts them into rows
again.
The scrollback buffer also remains defined in terms of lines, rather than
paragraphs or memory. This also guarantees that the scrollbar's length cannot
The ring contains some of the bottom rows in thawed state, while most of the
scrollback buffer is frozen. Rewrapping is very complicated so we don't want
the code to be duplicated. It is also computationally heavy and we should try
to be as fast as possible. Hence we work on the frozen data structure, in which
most of the data lies, and we freeze all the rows for this purpose.
The frozen text is stored in UTF-8. Care should be taken that the number of
visual cells, number of Unicode characters, and number of bytes are three
different values.
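To make the difference between the three counts concrete, here is a small
Python sketch. The width function is only an approximation based on the
Unicode East Asian Width property and is not VTE's own width logic; the
function names are made up for this illustration.

```python
import unicodedata


def cell_width(ch):
    """Approximate cell width: 0 for combining marks, 2 for wide, else 1."""
    if unicodedata.combining(ch):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def measure(text):
    """Return (cells, characters, bytes) for a piece of text."""
    cells = sum(cell_width(ch) for ch in text)
    return cells, len(text), len(text.encode("utf-8"))


# The line from the CJK wrapping example: 11 ASCII characters plus one CJK.
print(measure("blabla12345伀"))  # (13, 12, 14)
```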
The buffer is stored in three streams: text_stream contains the raw text
encoded in UTF-8, with '\n' characters at paragraph boundaries; attr_stream
contains records for each continuous run of identical attributes (same colors,
character width, etc.) of text_stream (with the exception of '\n' where the
attribute is ignored, e.g. it can be even embedded in a continuous run of
double-width CJK characters); and row_stream consists of pointers into
attr_stream and text_stream for every row. Out of these three, only row_stream
needs to be regenerated.
We start building up the new row stream beginning at new row number 0. We
could make it any other arbitrary number, but we wouldn't be able to keep any
of the old numbers unchanged (neither ring->start because lines can be dropped
from the scrollback's top when narrowing the window, nor ring->end because we
have no clue at the beginning how many rows we'll have), so there's no point
even trying.
For higher performance, for each row we store whether it consists of ASCII
32..126 characters only (excluding tabs too). (The flag can err in the safe
way: it can be false even if the paragraph is ASCII only.) If a paragraph
consists solely of such rows, we can rewrap it without looking at text_stream,
since we know that all characters are stored as a single byte and all occupy a
single cell.
If it's not the case, we need to look at text_stream to be able to wrap the
paragraph.
Other than this, rewrapping is long, boring, but straightforward code without
any further tricks.
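The ASCII-only fast path described above can be sketched in a few lines of
Python. Because every character is one byte and one cell, the row boundaries
follow from arithmetic on offsets alone. The function and parameter names are
hypothetical; this is an illustration of the idea, not the code used in VTE.

```python
def rewrap_ascii_paragraph(start, length, width):
    """Return the text offsets at which the rows of one paragraph start.

    Assumes one byte and one cell per character, which is what the
    ASCII-only flag guarantees. start is the paragraph's offset in the
    text, length its size in bytes, width the new terminal width.
    """
    if length == 0:
        return [start]  # an empty paragraph still occupies one row
    return list(range(start, start + length, width))
```

With the earlier example "blabla1234567890" (16 characters) at width 11,
rewrap_ascii_paragraph(0, 16, 11) gives [0, 11], that is, "blabla12345"
followed by "67890".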
There are some cell positions (I call them markers) that we need to keep track
of, and tell where they moved during rewrapping. Such markers are the cursor,
the saved cursor, the cell under the viewport's bottom left corner (for
computing the new scrollbar offset), the cell under the bottom left corner of
the cursor's paragraph (for computing the number of lines to get dropped), and
the boundaries of the highlighted region.
A marker is a (row, column) pair where the row is either within the ring's
range or in a further row, and the column is arbitrary.
Before rewrapping, if the row is within the ring's range, the (row, column)
pair is converted to a VteCellTextOffset which contains the text offset,
fragment_cells denoting how many cells to walk from the first cell of a
multicell character (i.e. 1 for the right half of a CJK), and eol_cells
containing -1 if the cursor is over a character, 0 if the cursor is just after
the last character, or more if the cursor is farther to the right. Example:
▏width 24 ▏
Line 0 overflowing to LI
NE 1 ▔
If the cursor is over 'I' then text_offset is 23, eol_cells is -1.
If the cursor is just after the 'I' (as shown) then text_offset is 24,
eol_cells is 0.
If the cursor is n more cells further to the right then text_offset is 24,
eol_cells is n.
If the cursor is over 'N' then text_offset is 24 and eol_cells is -1.
If the cursor is over 'E' then text_offset is 25 and eol_cells is -1.
If the row is beyond the range covered by the ring, then text_offset will be
text_stream's head for the immediate next row, one bigger for the row after
that, and so on; eol_cells will be set to the desired column, and
fragment_cells is 0.
Pretty much as if the ring continued with empty hard wrapped lines.
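The conversion for a row inside the ring can be sketched as follows. This
Python version is restricted to ASCII-only rows of a single paragraph (one
byte and one cell per character, so fragment_cells is always 0), and the
class and function names are stand-ins chosen for this illustration rather
than the actual VTE definitions.

```python
from dataclasses import dataclass


@dataclass
class CellTextOffset:
    text_offset: int     # offset into the text
    fragment_cells: int  # cells past the first cell of a wide character
    eol_cells: int       # -1 over a character, else cells past end of line


def to_offset(rows, row, col):
    """Convert (row, col) to a CellTextOffset.

    rows is a list of ASCII-only strings, the soft-wrapped rows of one
    paragraph starting at text offset 0 (an assumption of this sketch).
    """
    start = sum(len(r) for r in rows[:row])
    length = len(rows[row])
    if col < length:
        return CellTextOffset(start + col, 0, -1)
    return CellTextOffset(start + length, 0, col - length)


rows = ["Line 0 overflowing to LI", "NE 1"]
print(to_offset(rows, 0, 23))  # over 'I'
print(to_offset(rows, 0, 24))  # just after 'I'
print(to_offset(rows, 1, 0))   # over 'N'
```

The three calls reproduce the values listed in the example: (23, -1),
(24, 0) and (24, -1) for text_offset and eol_cells respectively.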
After rewrapping, VteCellTextOffset is converted back to (row, column)
according to the new width and new row numbering. This could be done solely
based on VteCellTextOffset, but instead we update the row during rewrapping,
and only compute the column afterwards. This is because we don't have a fast
way of mapping text_offset to row number (this would require a binary search);
it's much easier to remember this data when we're there anyway while
rewrapping.
Further optimization
In row_stream and attr_stream, along with the text offset we could similarly
store the character offset (a counter that is increased by 1 on every Unicode
character, in other words what the value of the text offset would be if we
stored the text in UCS-4 rather than UTF-8).
This, along with the fact that a cell's attribute contains the character
width, and hence there is an attr change at every boundary where the character
width changes, would enable us to compute the number of lines for each
paragraph without looking at text_stream. This could be a huge win, since
text_stream is by far the biggest of the three streams.
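As a rough illustration of this idea, the Python sketch below counts the rows
of a paragraph from attribute runs alone. It assumes each run is given as a
(character count, cell width) pair with a cell width of 1 or 2, and that the
terminal is at least 2 cells wide; this input shape and the function name are
assumptions made for the example, not the layout of attr_stream.

```python
def count_rows(runs, width):
    """Count the rows a paragraph occupies at the given terminal width.

    runs is a list of (char_count, cell_width) pairs, one per attribute
    run, with cell_width 1 or 2. No text is inspected.
    """
    rows, col = 1, 0
    for count, w in runs:
        fit = (width - col) // w  # characters that still fit in this row
        if count <= fit:
            col += count * w
            continue
        count -= fit
        per_row = width // w              # characters per full row
        new_rows = -(-count // per_row)   # ceiling division
        rows += new_rows
        # Cells used in the last row started by this run.
        col = (count - (new_rows - 1) * per_row) * w
    return rows
```

For the CJK example above, count_rows([(11, 1), (1, 2)], 12) is 2 because the
wide character does not fit in the last cell, while at width 13 the result
is 1.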
The trick is however that we'd only know the number of lines for the
paragraph, but not the text offsets for the inner lines. These would have to
remain in a special uninitialized state in the new row_stream, and be computed
lazily on demand. For storing that, streams would need to be writable at
arbitrary positions, rather than just allowing appending of new data.
Care should be taken that this "on demand" includes the case when they are
being scrolled out from the scrollback buffer for good, because we'd still
need to be able to tell the text offset for the remaining lines of the
paragraph.
With the current design, the top of the scrollback buffer can easily contain a
partial paragraph. After a subsequent resize, this might lead to the topmost
row missing its first part. E.g. after executing "ls -l /bin" at width 40 and
then widening the terminal, the first 40 characters of bash's paragraph can be
cut off like this, because that used to form a row that got scrolled out:
012 bash
-rwxr-xr-x 3 root root 31152 Aug 3 2012 bunzip2
-rwxr-xr-x 1 root root 1999912 Mar 13 2013 busybox
With the current design I can't see any easy and clean workaround for this
that wouldn't introduce other side effects or terribly complicated code. I'd
say this is a small glitch we can easily live with.
With extremely large scrollback buffers (let's not forget: VTE supports
infinite scrollback) rewrapping might become slow. On my computer (average
laptop with Intel(R) Core(TM) i3 CPU, old-fashioned HDD) resizing 1 million
lines takes about 0.2 seconds of wall clock time, which is close to the boundary
of okay-ish speed. For this reason, rewrapping can be disabled with the
vte_terminal_set_rewrap_on_resize() API call.
Developers writing Vte-based multi-tab terminal emulators are encouraged to
resize only the visible Vte; the hidden ones should be resized when they
become visible. This keeps the time it takes to rewrap the buffer from being
multiplied by the number of tabs, which would block the user for a long
uninterrupted time when they resize the window. Developers are also encouraged
to implement a user friendly way of disabling rewrapping if they allow a giant
scrollback buffer.
@ -0,0 +1,517 @@
@ -0,0 +1,100 @@
# Copyright © 2018, 2019 Iñigo Martínez
# This library is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at your
# option) any later version.
# This library is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# General Public License for more details.
# You should have received a copy of the GNU Lesser General Public License
# along with this library. If not, see <https://www.gnu.org/licenses/>.
# This option allows you to disable -Bsymbolic-functions if your linker
# doesn't support it.
type: 'boolean',
value: true,
description: 'Use -Bsymbolic-functions',
type: 'boolean',
value: true,
description: 'Enable a11y',
'debugg', # for some reason, 'debug' is "reserved"
type: 'boolean',
value: false,
description: 'Enable extra debugging functionality',
type: 'boolean',
value: false,
description: 'Enable documentation',
type: 'boolean',
value: true,
description: 'Enable GObject Introspection',
type: 'boolean',
value: true,
description: 'Enable FriBidi support',
type: 'boolean',
value: true,
description: 'Enable GNUTLS support',
type: 'boolean',
value: true,
description: 'Enable GTK+ 3.0 widget',
type: 'boolean',
value: false,
description: 'Enable GTK+ 4.0 widget',
type: 'boolean',
value: true,
description: 'Enable legacy charset support using ICU',
type: 'boolean',
value: true,
description: 'Enable systemd support',
'vapi', # would use 'vala' but that name is reserved
type: 'boolean',
value: true,
description: 'Enable Vala bindings',
@ -0,0 +1,13 @@
echo ' 0 1 2 3 4 5 6 7 8 9 A B C D E F'
for y in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do
  echo -n "$y "
  for x in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do
    echo -ne "\e[43m\U1fb$x$y\e[49m "
  done
  echo
done
@ -0,0 +1,89 @@
#!/usr/bin/env bash
# Test 256 color support along with bold and dim attributes.
# Copyright (C) 2014 Egmont Koblinger
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
# The official (colon) format is the default.
sep=":"
if [ "$1" = "-colon" -o "$1" = "-official" -o "$1" = "-dejure" ]; then
  sep=":"
  shift
elif [ "$1" = "-semicolon" -o "$1" = "-common" -o "$1" = "-defacto" ]; then
  sep=";"
  shift
fi
if [ $# != 0 ]; then
  echo 'Usage: 256test.sh [-format]' >&2
  echo >&2
  echo ' -colon|-official|-dejure: Official format (default) \e[38:5:INDEXm' >&2
  echo ' -semicolon|-common|-defacto: Commonly used format \e[38;5;INDEXm' >&2
  exit 1
fi
format_number() {
  local c=$'\u254F'
  if [ $1 -lt 10 ]; then
    printf "$c %d" $1
  else
    printf "$c%02d" $(($1%100))
  fi
}
somecolors() {
  local from="$1"
  local to="$2"
  local prefix="$3"
  local line
  local i
  for line in \
      "\e[2mdim " \
      "normal " \
      "\e[1mbold " \
      "\e[1;2mbold+dim "; do
    echo -ne "$line"
    i=$from
    while [ $i -le $to ]; do
      echo -ne "\e[$prefix${i}m"
      format_number $i
      i=$((i+1))
    done
    echo $'\e[0m\e[K'
  done
}
allcolors() {
echo "-- 8 standard colors: SGR ${1}0..${1}7 --"
somecolors 0 7 "$1"
echo "-- 8 bright colors: SGR ${2}0..${2}7 --"
somecolors 0 7 "$2"
echo "-- 256 colors: SGR ${1}8${sep}5${sep}0..255 --"
somecolors 0 15 "${1}8${sep}5${sep}"
somecolors 16 51 "${1}8${sep}5${sep}"
somecolors 52 87 "${1}8${sep}5${sep}"
somecolors 88 123 "${1}8${sep}5${sep}"
somecolors 124 159 "${1}8${sep}5${sep}"
somecolors 160 195 "${1}8${sep}5${sep}"
somecolors 196 231 "${1}8${sep}5${sep}"
somecolors 232 255 "${1}8${sep}5${sep}"
}
allcolors 3 9
allcolors 4 10
@ -0,0 +1,212 @@
UTF-8 encoded sample plain-text file
Markus Kuhn [ˈmaʳkʊs kuːn] <http://www.cl.cam.ac.uk/~mgk25/> — 2002-07-25
The ASCII compatible UTF-8 encoding used in this plain-text file
is defined in Unicode, ISO 10646-1, and RFC 2279.
Using Unicode/UTF-8, you can write in emails and source code things such as
Mathematics and sciences:
∮ E⋅da = Q, n → ∞, ∑ f(i) = ∏ g(i), ⎧⎡⎛┌─────┐⎞⎤⎫
⎪⎢⎜│a²+b³ ⎟⎥⎪
∀x∈ℝ: ⌈x⌉ = −⌊−x⌋, α ∧ ¬β = ¬(¬α ∨ β), ⎪⎢⎜│───── ⎟⎥⎪
⎪⎢⎜⎷ c₈ ⎟⎥⎪
ℕ ⊆ ℕ₀ ⊂ ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ, ⎨⎢⎜ ⎟⎥⎬
⎪⎢⎜ ∞ ⎟⎥⎪
⊥ < a ≠ b ≡ c ≤ d ≪ ⊤ ⇒ (⟦A⟧ ⇔ ⟪B⟫), ⎪⎢⎜ ⎲ ⎟⎥⎪
⎪⎢⎜ ⎳aⁱ-bⁱ⎟⎥⎪
2H₂ + O₂ ⇌ 2H₂O, R = 4.7 kΩ, ⌀ 200 mm ⎩⎣⎝i=1 ⎠⎦⎭
Linguistics and dictionaries:
ði ıntəˈnæʃənəl fəˈnɛtık əsoʊsiˈeıʃn
Y [ˈʏpsilɔn], Yen [jɛn], Yoga [ˈjoːgɑ]
((V⍳V)=⍳⍴V)/V←,V ⌷←⍳→⍴∆∇⊃‾⍎⍕⌈
Nicer typography in plain text files:
║ ║
║ • ‘single’ and “double” quotes ║
║ ║
║ • Curly apostrophes: “We’ve been here” ║
║ ║
║ • Latin-1 apostrophe and accents: '´` ║
║ ║
║ • ‚deutsche‘ „Anführungszeichen“ ║
║ ║
║ • †, ‡, ‰, •, 3–4, —, −5/+5, ™, … ║
║ ║
║ • ASCII safety test: 1lI|, 0OD, 8B ║
║ ╭─────────╮ ║
║ • the euro symbol: │ 14.95 € │ ║
║ ╰─────────╯ ║
Combining characters:
STARGΛ̊TE SG-1, a = v̇ = r̈, a⃑ ⊥ b⃑
Greek (in Polytonic):
The Greek anthem:
Σὲ γνωρίζω ἀπὸ τὴν κόψη
τοῦ σπαθιοῦ τὴν τρομερή,
σὲ γνωρίζω ἀπὸ τὴν ὄψη
ποὺ μὲ βία μετράει τὴ γῆ.
᾿Απ᾿ τὰ κόκκαλα βγαλμένη
τῶν ῾Ελλήνων τὰ ἱερά
καὶ σὰν πρῶτα ἀνδρειωμένη
χαῖρε, ὦ χαῖρε, ᾿Ελευθεριά!
From a speech of Demosthenes in the 4th century BC:
Οὐχὶ ταὐτὰ παρίσταταί μοι γιγνώσκειν, ὦ ἄνδρες ᾿Αθηναῖοι,
ὅταν τ᾿ εἰς τὰ πράγματα ἀποβλέψω καὶ ὅταν πρὸς τοὺς
λόγους οὓς ἀκούω· τοὺς μὲν γὰρ λόγους περὶ τοῦ
τιμωρήσασθαι Φίλιππον ὁρῶ γιγνομένους, τὰ δὲ πράγματ᾿
εἰς τοῦτο προήκοντα, ὥσθ᾿ ὅπως μὴ πεισόμεθ᾿ αὐτοὶ
πρότερον κακῶς σκέψασθαι δέον. οὐδέν οὖν ἄλλο μοι δοκοῦσιν
οἱ τὰ τοιαῦτα λέγοντες ἢ τὴν ὑπόθεσιν, περὶ ἧς βουλεύεσθαι,
οὐχὶ τὴν οὖσαν παριστάντες ὑμῖν ἁμαρτάνειν. ἐγὼ δέ, ὅτι μέν
ποτ᾿ ἐξῆν τῇ πόλει καὶ τὰ αὑτῆς ἔχειν ἀσφαλῶς καὶ Φίλιππον
τιμωρήσασθαι, καὶ μάλ᾿ ἀκριβῶς οἶδα· ἐπ᾿ ἐμοῦ γάρ, οὐ πάλαι
γέγονεν ταῦτ᾿ ἀμφότερα· νῦν μέντοι πέπεισμαι τοῦθ᾿ ἱκανὸν
προλαβεῖν ἡμῖν εἶναι τὴν πρώτην, ὅπως τοὺς συμμάχους
σώσομεν. ἐὰν γὰρ τοῦτο βεβαίως ὑπάρξῃ, τότε καὶ περὶ τοῦ
τίνα τιμωρήσεταί τις καὶ ὃν τρόπον ἐξέσται σκοπεῖν· πρὶν δὲ
τὴν ἀρχὴν ὀρθῶς ὑποθέσθαι, μάταιον ἡγοῦμαι περὶ τῆς
τελευτῆς ὁντινοῦν ποιεῖσθαι λόγον.
Δημοσθένους, Γ´ ᾿Ολυνθιακὸς
From a Unicode conference invitation:
გთხოვთ ახლავე გაიაროთ რეგისტრაცია Unicode-ის მეათე საერთაშორისო
კონფერენციაზე დასასწრებად, რომელიც გაიმართება 10-12 მარტს,
ქ. მაინცში, გერმანიაში. კონფერენცია შეჰკრებს ერთად მსოფლიოს
ექსპერტებს ისეთ დარგებში როგორიცაა ინტერნეტი და Unicode-ი,
ინტერნაციონალიზაცია და ლოკალიზაცია, Unicode-ის გამოყენება
ოპერაციულ სისტემებსა, და გამოყენებით პროგრამებში, შრიფტებში,
ტექსტების დამუშავებასა და მრავალენოვან კომპიუტერულ სისტემებში.
From a Unicode conference invitation:
Зарегистрируйтесь сейчас на Десятую Международную Конференцию по
Unicode, которая состоится 10-12 марта 1997 года в Майнце в Германии.
Конференция соберет широкий круг экспертов по вопросам глобального
Интернета и Unicode, локализации и интернационализации, воплощению и
применению Unicode в различных операционных системах и программных
приложениях, шрифтах, верстке и многоязычных компьютерных системах.
Thai (UCS Level 2):
Excerpt from a poetry on The Romance of The Three Kingdoms (a Chinese
classic 'San Gua'):
๏ แผ่นดินฮั่นเสื่อมโทรมแสนสังเวช พระปกเกศกองบู๊กู้ขึ้นใหม่
สิบสองกษัตริย์ก่อนหน้าแลถัดไป สององค์ไซร้โง่เขลาเบาปัญญา
ทรงนับถือขันทีเป็นที่พึ่ง บ้านเมืองจึงวิปริตเป็นนักหนา
โฮจิ๋นเรียกทัพทั่วหัวเมืองมา หมายจะฆ่ามดชั่วตัวสำคัญ
เหมือนขับไสไล่เสือจากเคหา รับหมาป่าเข้ามาเลยอาสัญ
ฝ่ายอ้องอุ้นยุแยกให้แตกกัน ใช้สาวนั้นเป็นชนวนชื่นชวนใจ
พลันลิฉุยกุยกีกลับก่อเหตุ ช่างอาเพศจริงหนาฟ้าร้องไห้
ต้องรบราฆ่าฟันจนบรรลัย ฤๅหาใครค้ำชูกู้บรรลังก์ ฯ
(The above is a two-column text. If combining characters are handled
correctly, the lines of the second column should be aligned with the
| character above.)
Proverbs in the Amharic language:
ሰማይ አይታረስ ንጉሥ አይከሰስ።
ብላ ካለኝ እንደአባቴ በቆመጠኝ።
ጌጥ ያለቤቱ ቁምጥና ነው።
ደሀ በሕልሙ ቅቤ ባይጠጣ ንጣት በገደለው።
የአፍ ወለምታ በቅቤ አይታሽም።
አይጥ በበላ ዳዋ ተመታ።
ሲተረጉሙ ይደረግሙ።
ቀስ በቀስ፥ ዕንቁላል በእግሩ ይሄዳል።
ድር ቢያብር አንበሳ ያስር።
ሰው እንደቤቱ እንጅ እንደ ጉረቤቱ አይተዳደርም።
እግዜር የከፈተውን ጉሮሮ ሳይዘጋው አይድርም።
የጎረቤት ሌባ፥ ቢያዩት ይስቅ ባያዩት ያጠልቅ።
ሥራ ከመፍታት ልጄን ላፋታት።
ዓባይ ማደሪያ የለው፥ ግንድ ይዞ ይዞራል።
የእስላም አገሩ መካ የአሞራ አገሩ ዋርካ።
ተንጋሎ ቢተፉ ተመልሶ ባፉ።
ወዳጅህ ማር ቢሆን ጨርስህ አትላሰው።
እግርህን በፍራሽህ ልክ ዘርጋ።
ᚻᛖ ᚳᚹᚫᚦ ᚦᚫᛏ ᚻᛖ ᛒᚢᛞᛖ ᚩᚾ ᚦᚫᛗ ᛚᚪᚾᛞᛖ ᚾᚩᚱᚦᚹᛖᚪᚱᛞᚢᛗ ᚹᛁᚦ ᚦᚪ ᚹᛖᛥᚫ
(Old English, which transcribed into Latin reads 'He cwaeth that he
bude thaem lande northweardum with tha Westsae.' and means 'He said
that he lived in the northern land near the Western Sea.')
⡌⠁⠧⠑ ⠼⠁⠒ ⡍⠜⠇⠑⠹⠰⠎ ⡣⠕⠌
⡍⠜⠇⠑⠹ ⠺⠁⠎ ⠙⠑⠁⠙⠒ ⠞⠕ ⠃⠑⠛⠔ ⠺⠊⠹⠲ ⡹⠻⠑ ⠊⠎ ⠝⠕ ⠙⠳⠃⠞
⠱⠁⠞⠑⠧⠻ ⠁⠃⠳⠞ ⠹⠁⠞⠲ ⡹⠑ ⠗⠑⠛⠊⠌⠻ ⠕⠋ ⠙⠊⠎ ⠃⠥⠗⠊⠁⠇ ⠺⠁⠎
⠎⠊⠛⠝⠫ ⠃⠹ ⠹⠑ ⠊⠇⠻⠛⠹⠍⠁⠝⠂ ⠹⠑ ⠊⠇⠻⠅⠂ ⠹⠑ ⠥⠝⠙⠻⠞⠁⠅⠻⠂
⠁⠝⠙ ⠹⠑ ⠡⠊⠑⠋ ⠍⠳⠗⠝⠻⠲ ⡎⠊⠗⠕⠕⠛⠑ ⠎⠊⠛⠝⠫ ⠊⠞⠲ ⡁⠝⠙
⡎⠊⠗⠕⠕⠛⠑⠰⠎ ⠝⠁⠍⠑ ⠺⠁⠎ ⠛⠕⠕⠙ ⠥⠏⠕⠝ ⠰⡡⠁⠝⠛⠑⠂ ⠋⠕⠗ ⠁⠝⠹⠹⠔⠛ ⠙⠑
⠡⠕⠎⠑ ⠞⠕ ⠏⠥⠞ ⠙⠊⠎ ⠙⠁⠝⠙ ⠞⠕⠲
⡕⠇⠙ ⡍⠜⠇⠑⠹ ⠺⠁⠎ ⠁⠎ ⠙⠑⠁⠙ ⠁⠎ ⠁ ⠙⠕⠕⠗⠤⠝⠁⠊⠇⠲
⡍⠔⠙⠖ ⡊ ⠙⠕⠝⠰⠞ ⠍⠑⠁⠝ ⠞⠕ ⠎⠁⠹ ⠹⠁⠞ ⡊ ⠅⠝⠪⠂ ⠕⠋ ⠍⠹
⠪⠝ ⠅⠝⠪⠇⠫⠛⠑⠂ ⠱⠁⠞ ⠹⠻⠑ ⠊⠎ ⠏⠜⠞⠊⠊⠥⠇⠜⠇⠹ ⠙⠑⠁⠙ ⠁⠃⠳⠞
⠁ ⠙⠕⠕⠗⠤⠝⠁⠊⠇⠲ ⡊ ⠍⠊⠣⠞ ⠙⠁⠧⠑ ⠃⠑⠲ ⠔⠊⠇⠔⠫⠂ ⠍⠹⠎⠑⠇⠋⠂ ⠞⠕
⠗⠑⠛⠜⠙ ⠁ ⠊⠕⠋⠋⠔⠤⠝⠁⠊⠇ ⠁⠎ ⠹⠑ ⠙⠑⠁⠙⠑⠌ ⠏⠊⠑⠊⠑ ⠕⠋ ⠊⠗⠕⠝⠍⠕⠝⠛⠻⠹
⠔ ⠹⠑ ⠞⠗⠁⠙⠑⠲ ⡃⠥⠞ ⠹⠑ ⠺⠊⠎⠙⠕⠍ ⠕⠋ ⠳⠗ ⠁⠝⠊⠑⠌⠕⠗⠎
⠊⠎ ⠔ ⠹⠑ ⠎⠊⠍⠊⠇⠑⠆ ⠁⠝⠙ ⠍⠹ ⠥⠝⠙⠁⠇⠇⠪⠫ ⠙⠁⠝⠙⠎
⠩⠁⠇⠇ ⠝⠕⠞ ⠙⠊⠌⠥⠗⠃ ⠊⠞⠂ ⠕⠗ ⠹⠑ ⡊⠳⠝⠞⠗⠹⠰⠎ ⠙⠕⠝⠑ ⠋⠕⠗⠲ ⡹⠳
⠺⠊⠇⠇ ⠹⠻⠑⠋⠕⠗⠑ ⠏⠻⠍⠊⠞ ⠍⠑ ⠞⠕ ⠗⠑⠏⠑⠁⠞⠂ ⠑⠍⠏⠙⠁⠞⠊⠊⠁⠇⠇⠹⠂ ⠹⠁⠞
⡍⠜⠇⠑⠹ ⠺⠁⠎ ⠁⠎ ⠙⠑⠁⠙ ⠁⠎ ⠁ ⠙⠕⠕⠗⠤⠝⠁⠊⠇⠲
(The first couple of paragraphs of "A Christmas Carol" by Dickens)
Compact font selection example text:
abcdefghijklmnopqrstuvwxyz £©µÀÆÖÞßéöÿ
–—‘“”„†•…‰™œŠŸž€ ΑΒΓΔΩαβγδω АБВГДабвгд
∀∂∈ℝ∧∪≡∞ ↑↗↨↻⇣ ┐┼╔╘░►☺♀ fi<>⑀₂ἠḂӥẄɐː⍎אԱა
Greetings in various languages:
Hello world, Καλημέρα κόσμε, コンニチハ
Box drawing alignment tests: █
╔══╦══╗ ┌──┬──┐ ╭──┬──╮ ╭──┬──╮ ┏━━┳━━┓ ┎┒┏┑ ╷ ╻ ┏┯┓ ┌┰┐ ▊ ╱╲╱╲╳╳╳
║┌─╨─┐║ │╔═╧═╗│ │╒═╪═╕│ │╓─╁─╖│ ┃┌─╂─┐┃ ┗╃╄┙ ╶┼╴╺╋╸┠┼┨ ┝╋┥ ▋ ╲╱╲╱╳╳╳
║│╲ ╱│║ │║ ║│ ││ │ ││ │║ ┃ ║│ ┃│ ╿ │┃ ┍╅╆┓ ╵ ╹ ┗┷┛ └┸┘ ▌ ╱╲╱╲╳╳╳
╠╡ ╳ ╞╣ ├╢ ╟┤ ├┼─┼─┼┤ ├╫─╂─╫┤ ┣┿╾┼╼┿┫ ┕┛┖┚ ┌┄┄┐ ╎ ┏┅┅┓ ┋ ▍ ╲╱╲╱╳╳╳
║│╱ ╲│║ │║ ║│ ││ │ ││ │║ ┃ ║│ ┃│ ╽ │┃ ░░▒▒▓▓██ ┊ ┆ ╎ ╏ ┇ ┋ ▎
║└─╥─┘║ │╚═╤═╝│ │╘═╪═╛│ │╙─╀─╜│ ┃└─╂─┘┃ ░░▒▒▓▓██ ┊ ┆ ╎ ╏ ┇ ┋ ▏
╚══╩══╝ └──┴──┘ ╰──┴──╯ ╰──┴──╯ ┗━━┻━━┛ ▗▄▖▛▀▜ └╌╌┘ ╎ ┗╍╍┛ ┋ ▁▂▃▄▅▆▇█
@ -0,0 +1,308 @@
UTF-8 decoder capability and stress test
Markus Kuhn <http://www.cl.cam.ac.uk/~mgk25/> - 2015-08-28 - CC BY 4.0
[Note: This file has been slightly modified from its upstream original,
by changing the right margin so that when it is cat(1)ed in a terminal
emulator whose UTF-8 decoder implements the whatwg Encoding Spec
[https://encoding.spec.whatwg.org/], the right margin is correctly
aligned. No other changes were made, not even to correct the text
explaining what you should see; so in some instance that text is now
incorrect. -- @chpe]
This test file can help you examine, how your UTF-8 decoder handles
various types of correct, malformed, or otherwise interesting UTF-8
sequences. This file is not meant to be a conformance test. It does
not prescribe any particular outcome. Therefore, there is no way to
"pass" or "fail" this test file, even though the text does suggest a
preferable decoder behaviour at some places. Its aim is, instead, to
help you think about, and test, the behaviour of your UTF-8 decoder on a
systematic collection of unusual inputs. Experience so far suggests
that most first-time authors of UTF-8 decoders find at least one
serious problem in their decoder using this file.
The test lines below cover boundary conditions, malformed UTF-8
sequences, as well as correctly encoded UTF-8 sequences of Unicode code
points that should never occur in a correct UTF-8 file.
According to ISO 10646-1:2000, sections D.7 and 2.3c, a device
receiving UTF-8 shall interpret a "malformed sequence in the same way
that it interprets a character that is outside the adopted subset" and
"characters that are not within the adopted subset shall be indicated
to the user" by a receiving device. One commonly used approach in
UTF-8 decoders is to replace any malformed UTF-8 sequence by a
replacement character (U+FFFD), which looks a bit like an inverted
question mark, or a similar symbol. It might be a good idea to
visually distinguish a malformed UTF-8 sequence from a correctly
encoded Unicode character that is just not available in the current
font but otherwise fully legal, even though ISO 10646-1 doesn't
mandate this. In any case, just ignoring malformed sequences or
unavailable characters does not conform to ISO 10646, will make
debugging more difficult, and can lead to user confusion.
Please check, whether a malformed UTF-8 sequence is (1) represented at
all, (2) represented by exactly one single replacement character (or
equivalent signal), and (3) the following quotation mark after an
illegal UTF-8 sequence is correctly displayed, i.e. proper
resynchronization takes place immediately after any malformed
sequence. This file says "THE END" in the last line, so if you don't
see that, your decoder crashed somehow before, which should always be
cause for concern.
All lines in this file are exactly 79 characters long (plus the line
feed). In addition, all lines end with "|", except for the two test
lines 2.1.1 and 2.2.1, which contain non-printable ASCII controls
U+0000 and U+007F. If you display this file with a fixed-width font,
these "|" characters should all line up in column 79 (right margin).
This allows you to test quickly, whether your UTF-8 decoder finds the
correct number of characters in every line, that is whether each
malformed sequences is replaced by a single replacement character.
Note that, as an alternative to the notion of malformed sequence used
here, it is also a perfectly acceptable (and in some situations even
preferable) solution to represent each individual byte of a malformed
sequence with a replacement character. If you follow this strategy in
your decoder, then please ignore the "|" column.
Here come the tests: |
1 Some correct UTF-8 text |
You should see the Greek word 'kosme': "κόσμε" |
2 Boundary condition test cases |
2.1 First possible sequence of a certain length |
2.1.1 1 byte (U-00000000): " |