mdds/doc/trie_map.rst

768 lines
29 KiB
ReStructuredText
Raw Normal View History

2022-11-07 15:52:08 +08:00
.. highlight:: cpp
Trie Maps
=========
Examples
--------
Populating Trie Map
^^^^^^^^^^^^^^^^^^^
This section illustrates how to use :cpp:class:`~mdds::trie_map` to build a
database of city populations and perform prefix searches. In this example,
we will use the 2013 populations of cities in North Carolina, and use the city
names as keys.
Let's define the type first::
using trie_map_type = mdds::trie_map<mdds::trie::std_string_trait, int>;
The first template argument specifies the trait of the key. In this example,
we are using a pre-defined trait for std::string, which is defined in
:cpp:type:`~mdds::trie::std_string_trait`. The second template argument
specifies the value type, which in this example is simply an ``int``.
Once the type is defined, the next step is instantiation::
trie_map_type nc_cities;
It's pretty simple as you don't need to pass any arguments to the constructor.
Now, let's populate this data structure with some population data::
// Insert key-value pairs.
nc_cities.insert("Charlotte", 792862);
nc_cities.insert("Raleigh", 431746);
nc_cities.insert("Greensboro", 279639);
nc_cities.insert("Durham", 245475);
nc_cities.insert("Winston-Salem", 236441);
nc_cities.insert("Fayetteville", 204408);
nc_cities.insert("Cary", 151088);
nc_cities.insert("Wilmington", 112067);
nc_cities.insert("High Point", 107741);
nc_cities.insert("Greenville", 89130);
nc_cities.insert("Asheville", 87236);
nc_cities.insert("Concord", 83506);
nc_cities.insert("Gastonia", 73209);
nc_cities.insert("Jacksonville", 69079);
nc_cities.insert("Chapel Hill", 59635);
nc_cities.insert("Rocky Mount", 56954);
nc_cities.insert("Burlington", 51510);
nc_cities.insert("Huntersville", 50458);
nc_cities.insert("Wilson", 49628);
nc_cities.insert("Kannapolis", 44359);
nc_cities.insert("Apex", 42214);
nc_cities.insert("Hickory", 40361);
nc_cities.insert("Goldsboro", 36306);
It's pretty straight-forward. Each :cpp:func:`~mdds::trie_map::insert` call
expects a pair of string key and an integer value. You can insert your data
in any order regardless of key's sort order.
Now that the data is in, let's perform prefix search to query all cities whose
name begins with "Cha"::
cout << "Cities that start with 'Cha' and their populations:" << endl;
auto results = nc_cities.prefix_search("Cha");
for (const auto& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
You can perform prefix search via :cpp:func:`~mdds::trie_map::prefix_search`
method, which returns a results object that can be iterated over using a range-based
for loop. Running this code will produce the following output:
.. code-block:: none
Cities that start with 'Cha' and their populations:
Chapel Hill: 59635
Charlotte: 792862
Let's perform another prefix search, this time with a prefix of "W"::
cout << "Cities that start with 'W' and their populations:" << endl;
results = nc_cities.prefix_search("W");
for (const auto& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
You'll see the following output when running this code:
.. code-block:: none
Cities that start with 'W' and their populations:
Wilmington: 112067
Wilson: 49628
Winston-Salem: 236441
Note that the results are sorted in key's ascending order.
.. note::
Results from the prefix search are sorted in key's ascending order.
Creating Packed Trie Map from Trie Map
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There is also another variant of trie called :cpp:class:`~mdds::packed_trie_map`
which is designed to store all its data in contiguous memory region. Unlike
:cpp:class:`~mdds::trie_map` which is mutable, :cpp:class:`~mdds::packed_trie_map`
is immutable; once populated, you can only perform queries and it is no longer
possible to add new entries into the container.
One way to create an instance of :cpp:class:`~mdds::packed_trie_map` is from
:cpp:class:`~mdds::trie_map` by calling its :cpp:func:`~mdds::trie_map::pack`
method::
auto packed = nc_cities.pack();
The query methods of :cpp:class:`~mdds::packed_trie_map` are identical to those
of :cpp:class:`~mdds::trie_map`. For instance, performing prefix search to find
all entries whose key begins with "C" can be done as follows::
cout << "Cities that start with 'C' and their populations:" << endl;
auto packed_results = packed.prefix_search("C");
for (const auto& kv : packed_results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
Running this code will generate the following output:
.. code-block:: none
Cities that start with 'C' and their populations:
Cary: 151088
Chapel Hill: 59635
Charlotte: 792862
Concord: 83506
You can also perform an exact-match query via :cpp:func:`~mdds::packed_trie_map::find`
method which returns an iterator associated with the key-value pair entry::
// Individual search.
auto it = packed.find("Wilmington");
cout << "Population of Wilmington: " << it->second << endl;
You'll see the following output with this code:
.. code-block:: none
Population of Wilmington: 112067
What if you performed an exact-match query with a key that doesn't exist in the
container? You will basically get the end iterator position as its return value.
Thus, running this code::
// You get an end position iterator when the container doesn't have the
// specified key.
it = packed.find("Asheboro");
cout << "Population of Asheboro: ";
if (it == packed.end())
cout << "not found";
else
cout << it->second;
cout << endl;
will generate the following output:
.. code-block:: none
Population of Asheboro: not found
The complete source code for the examples in these two sections is available
`here <https://gitlab.com/mdds/mdds/-/blob/master/example/trie_map.cpp>`__.
Using Packed Trie Map directly
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the previous example, we showed a way to create an instance of :cpp:class:`~mdds::packed_trie_map`
from a populated instance of :cpp:class:`~mdds::trie_map`. There is also a way
to instantiate and populate an instance of :cpp:class:`~mdds::packed_trie_map`
directly, and that is what we will cover in this section.
First, declare the type::
using trie_map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, int>;
Once again, we are using the pre-defined trait for std::string as its key, and int
as its value type. The next step is to prepare its entries ahead of time::
trie_map_type::entry entries[] =
{
{ MDDS_ASCII("Apex"), 42214 },
{ MDDS_ASCII("Asheville"), 87236 },
{ MDDS_ASCII("Burlington"), 51510 },
{ MDDS_ASCII("Cary"), 151088 },
{ MDDS_ASCII("Chapel Hill"), 59635 },
{ MDDS_ASCII("Charlotte"), 792862 },
{ MDDS_ASCII("Concord"), 83506 },
{ MDDS_ASCII("Durham"), 245475 },
{ MDDS_ASCII("Fayetteville"), 204408 },
{ MDDS_ASCII("Gastonia"), 73209 },
{ MDDS_ASCII("Goldsboro"), 36306 },
{ MDDS_ASCII("Greensboro"), 279639 },
{ MDDS_ASCII("Greenville"), 89130 },
{ MDDS_ASCII("Hickory"), 40361 },
{ MDDS_ASCII("High Point"), 107741 },
{ MDDS_ASCII("Huntersville"), 50458 },
{ MDDS_ASCII("Jacksonville"), 69079 },
{ MDDS_ASCII("Kannapolis"), 44359 },
{ MDDS_ASCII("Raleigh"), 431746 },
{ MDDS_ASCII("Rocky Mount"), 56954 },
{ MDDS_ASCII("Wilmington"), 112067 },
{ MDDS_ASCII("Wilson"), 49628 },
{ MDDS_ASCII("Winston-Salem"), 236441 },
};
We need to do this since :cpp:class:`~mdds::packed_trie_map` is immutable, and
the only time we can populate its content is at instantiation time. Here, we
are using the :c:macro:`MDDS_ASCII` macro to expand a string literal to its
pointer value and size. Note that you need to ensure that the entries are sorted
by the key in ascending order.
.. warning::
When instantiating :cpp:class:`~mdds::packed_trie_map` directly with a static
set of entries, the entries must be sorted by the key in ascending order.
You can then pass this list of entries to construct the instance::
trie_map_type nc_cities(entries, MDDS_N_ELEMENTS(entries));
The :c:macro:`MDDS_N_ELEMENTS` macro will infer the size of a fixed-size array
from its static definition. Once it's instantiated, the rest of the example
for performing searches will be the same as in the previous section, which we
will not repeat here.
The complete source code for the example in this section is available
`here <https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_map.cpp>`__.
Saving and loading Packed Trie Map instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
There are times when you need to save the state of a :cpp:class:`~mdds::packed_trie_map`
instance to a file, or an in-memory buffer, and load it back later. Doing that
is now possible by using the :cpp:func:`~mdds::packed_trie_map::save_state` and
:cpp:func:`~mdds::packed_trie_map::load_state` member methods of the
:cpp:class:`~mdds::packed_trie_map` class.
First, let's define the type of use::
using map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, int>;
As with the previous examples, we will use ``std::string`` as the key type and
``int`` as the value type. In this example, we are going to use `the world's
largest cities and their 2018 populations
<https://en.wikipedia.org/wiki/List_of_largest_cities>`__ as the data to store
in the container.
The following code defines the entries::
std::vector<map_type::entry> entries =
{
{ MDDS_ASCII("Ahmedabad"), 7681000 },
{ MDDS_ASCII("Alexandria"), 5086000 },
{ MDDS_ASCII("Atlanta"), 5572000 },
{ MDDS_ASCII("Baghdad"), 6812000 },
{ MDDS_ASCII("Bangalore"), 11440000 },
{ MDDS_ASCII("Bangkok"), 10156000 },
{ MDDS_ASCII("Barcelona"), 5494000 },
{ MDDS_ASCII("Beijing"), 19618000 },
{ MDDS_ASCII("Belo Horizonte"), 5972000 },
{ MDDS_ASCII("Bogota"), 10574000 },
{ MDDS_ASCII("Buenos Aires"), 14967000 },
{ MDDS_ASCII("Cairo"), 20076000 },
{ MDDS_ASCII("Chengdu"), 8813000 },
{ MDDS_ASCII("Chennai"), 10456000 },
{ MDDS_ASCII("Chicago"), 8864000 },
{ MDDS_ASCII("Chongqing"), 14838000 },
{ MDDS_ASCII("Dalian"), 5300000 },
{ MDDS_ASCII("Dallas"), 6099000 },
{ MDDS_ASCII("Dar es Salaam"), 6048000 },
{ MDDS_ASCII("Delhi"), 28514000 },
{ MDDS_ASCII("Dhaka"), 19578000 },
{ MDDS_ASCII("Dongguan"), 7360000 },
{ MDDS_ASCII("Foshan"), 7236000 },
{ MDDS_ASCII("Fukuoka"), 5551000 },
{ MDDS_ASCII("Guadalajara"), 5023000 },
{ MDDS_ASCII("Guangzhou"), 12638000 },
{ MDDS_ASCII("Hangzhou"), 7236000 },
{ MDDS_ASCII("Harbin"), 6115000 },
{ MDDS_ASCII("Ho Chi Minh City"), 8145000 },
{ MDDS_ASCII("Hong Kong"), 7429000 },
{ MDDS_ASCII("Houston"), 6115000 },
{ MDDS_ASCII("Hyderabad"), 9482000 },
{ MDDS_ASCII("Istanbul"), 14751000 },
{ MDDS_ASCII("Jakarta"), 10517000 },
{ MDDS_ASCII("Jinan"), 5052000 },
{ MDDS_ASCII("Johannesburg"), 5486000 },
{ MDDS_ASCII("Karachi"), 15400000 },
{ MDDS_ASCII("Khartoum"), 5534000 },
{ MDDS_ASCII("Kinshasa"), 13171000 },
{ MDDS_ASCII("Kolkata"), 14681000 },
{ MDDS_ASCII("Kuala Lumpur"), 7564000 },
{ MDDS_ASCII("Lagos"), 13463000 },
{ MDDS_ASCII("Lahore"), 11738000 },
{ MDDS_ASCII("Lima"), 10391000 },
{ MDDS_ASCII("London"), 9046000 },
{ MDDS_ASCII("Los Angeles"), 12458000 },
{ MDDS_ASCII("Luanda"), 7774000 },
{ MDDS_ASCII("Madrid"), 6497000 },
{ MDDS_ASCII("Manila"), 13482000 },
{ MDDS_ASCII("Mexico City"), 21581000 },
{ MDDS_ASCII("Miami"), 6036000 },
{ MDDS_ASCII("Moscow"), 12410000 },
{ MDDS_ASCII("Mumbai"), 19980000 },
{ MDDS_ASCII("Nagoya"), 9507000 },
{ MDDS_ASCII("Nanjing"), 8245000 },
{ MDDS_ASCII("New York City"), 18819000 },
{ MDDS_ASCII("Osaka"), 19281000 },
{ MDDS_ASCII("Paris"), 10901000 },
{ MDDS_ASCII("Philadelphia"), 5695000 },
{ MDDS_ASCII("Pune"), 6276000 },
{ MDDS_ASCII("Qingdao"), 5381000 },
{ MDDS_ASCII("Rio de Janeiro"), 13293000 },
{ MDDS_ASCII("Riyadh"), 6907000 },
{ MDDS_ASCII("Saint Petersburg"), 5383000 },
{ MDDS_ASCII("Santiago"), 6680000 },
{ MDDS_ASCII("Sao Paulo"), 21650000 },
{ MDDS_ASCII("Seoul"), 9963000 },
{ MDDS_ASCII("Shanghai"), 25582000 },
{ MDDS_ASCII("Shenyang"), 6921000 },
{ MDDS_ASCII("Shenzhen"), 11908000 },
{ MDDS_ASCII("Singapore"), 5792000 },
{ MDDS_ASCII("Surat"), 6564000 },
{ MDDS_ASCII("Suzhou"), 6339000 },
{ MDDS_ASCII("Tehran"), 8896000 },
{ MDDS_ASCII("Tianjin"), 13215000 },
{ MDDS_ASCII("Tokyo"), 37400068 },
{ MDDS_ASCII("Toronto"), 6082000 },
{ MDDS_ASCII("Washington, D.C."), 5207000 },
{ MDDS_ASCII("Wuhan"), 8176000 },
{ MDDS_ASCII("Xi'an"), 7444000 },
{ MDDS_ASCII("Yangon"), 5157000 },
};
It's a bit long as it contains entries for 81 cities. We are then going to
create an instance of the :cpp:class:`~mdds::packed_trie_map` class directly::
map_type cities(entries.data(), entries.size());
Let's print the size of the container to make sure the container has been
successfully populated::
cout << "Number of cities: " << cities.size() << endl;
You will see the following output:
.. code-block:: none
Number of cities: 81
if the container has been successfully populated. Now, let's run a prefix
search on names beginning with an 'S'::
cout << "Cities that begin with 'S':" << endl;
auto results = cities.prefix_search("S");
for (const auto& city : results)
cout << " * " << city.first << ": " << city.second << endl;
to make sure you get the following ten cities and their populations as the
output:
.. code-block:: none
Cities that begin with 'S':
* Saint Petersburg: 5383000
* Santiago: 6680000
* Sao Paulo: 21650000
* Seoul: 9963000
* Shanghai: 25582000
* Shenyang: 6921000
* Shenzhen: 11908000
* Singapore: 5792000
* Surat: 6564000
* Suzhou: 6339000
So far so good. Next, we will use the :cpp:func:`~mdds::packed_trie_map::save_state`
method to dump the internal state of this container to a file named **cities.bin**::
std::ofstream outfile("cities.bin", std::ios::binary);
cities.save_state(outfile);
This will create a file named **cities.bin** which contains a binary blob
representing the content of this container in the current working directory.
Run the ``ls -l cities.bin`` command to make sure the file has been created:
.. code-block:: none
-rw-r--r-- 1 kohei kohei 17713 Jun 20 12:49 cities.bin
Now that the state of the container has been fully serialized to a file, let's
work on restoring its content in another, brand-new instance of
:cpp:class:`~mdds::packed_trie_map`.
::
map_type cities_loaded;
std::ifstream infile("cities.bin", std::ios::binary);
cities_loaded.load_state(infile);
Here, we used the :cpp:func:`~mdds::packed_trie_map::load_state` method to
restore the state from the file we have previously created. Let's make sure
that this new instance has content equivalent to that of the original::
cout << "Equal to the original? " << std::boolalpha << (cities == cities_loaded) << endl;
If you see the following output:
.. code-block:: none
Equal to the original? true
then this new instance has equivalent contant as the original one. Let's also
make sure that it contains the same number of entries as the original::
cout << "Number of cities: " << cities_loaded.size() << endl;
Hopefully you will see the following output:
.. code-block:: none
Number of cities: 81
Lastly, let's run on this new instance the same prefix search we did on the
original instance, to make sure we still get the same results::
cout << "Cities that begin with 'S':" << endl;
auto results = cities_loaded.prefix_search("S");
for (const auto& city : results)
cout << " * " << city.first << ": " << city.second << endl;
You should see the following output:
.. code-block:: none
Cities that begin with 'S':
* Saint Petersburg: 5383000
* Santiago: 6680000
* Sao Paulo: 21650000
* Seoul: 9963000
* Shanghai: 25582000
* Shenyang: 6921000
* Shenzhen: 11908000
* Singapore: 5792000
* Surat: 6564000
* Suzhou: 6339000
which is the same output we saw in the first prefix search.
The complete source code for this example is found
`here <https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_state_int.cpp>`__.
Saving Packed Trie Map with custom value type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the previos example, you didn't have to explicitly specify the serializer type
to the :cpp:func:`~mdds::packed_trie_map::save_state` and
:cpp:func:`~mdds::packed_trie_map::load_state` methods, even though these two
methods require the serializer type as their template arguments. That's because
the library provides default serializer types for
* numeric value types i.e. integers, float and double,
* ``std::string``, and
* the standard sequence types, such as ``std::vector``, whose elements are of
numeric value types,
and the previous example used ``int`` as the value type.
In this section, we are going to illustrate how you can write your own custom
serializer to allow serialization of your own custom value type. In this example,
we are going to use `the list of presidents of the United States
<https://en.wikipedia.org/wiki/List_of_presidents_of_the_United_States>`__,
with the names of the presidents as the keys, and their years of inauguration
and political affiliations as the values.
We will use the following structure to store the values::
enum affiliated_party_t : uint8_t
{
unaffiliated = 0,
federalist,
democratic_republican,
democratic,
whig,
republican,
national_union,
republican_national_union,
};
struct us_president
{
uint16_t year;
affiliated_party_t party;
};
Each entry stores the year as a 16-bit integer and the affiliated party as an enum
value of 8-bit width.
Next, let's define the container type::
using map_type = mdds::packed_trie_map<mdds::trie::std_string_trait, us_president>;
As with the previous example, the first step is to define the entries that are
sorted by the keys, which in this case are the president's names::
std::vector<map_type::entry> entries =
{
{ MDDS_ASCII("Abraham Lincoln"), { 1861, republican_national_union } },
{ MDDS_ASCII("Andrew Jackson"), { 1829, democratic } },
{ MDDS_ASCII("Andrew Johnson"), { 1865, national_union } },
{ MDDS_ASCII("Barack Obama"), { 2009, democratic } },
{ MDDS_ASCII("Benjamin Harrison"), { 1889, republican } },
{ MDDS_ASCII("Bill Clinton"), { 1993, democratic } },
{ MDDS_ASCII("Calvin Coolidge"), { 1923, republican } },
{ MDDS_ASCII("Chester A. Arthur"), { 1881, republican } },
{ MDDS_ASCII("Donald Trump"), { 2017, republican } },
{ MDDS_ASCII("Dwight D. Eisenhower"), { 1953, republican } },
{ MDDS_ASCII("Franklin D. Roosevelt"), { 1933, democratic } },
{ MDDS_ASCII("Franklin Pierce"), { 1853, democratic } },
{ MDDS_ASCII("George H. W. Bush"), { 1989, republican } },
{ MDDS_ASCII("George W. Bush"), { 2001, republican } },
{ MDDS_ASCII("George Washington"), { 1789, unaffiliated } },
{ MDDS_ASCII("Gerald Ford"), { 1974, republican } },
{ MDDS_ASCII("Grover Cleveland 1"), { 1885, democratic } },
{ MDDS_ASCII("Grover Cleveland 2"), { 1893, democratic } },
{ MDDS_ASCII("Harry S. Truman"), { 1945, democratic } },
{ MDDS_ASCII("Herbert Hoover"), { 1929, republican } },
{ MDDS_ASCII("James A. Garfield"), { 1881, republican } },
{ MDDS_ASCII("James Buchanan"), { 1857, democratic } },
{ MDDS_ASCII("James K. Polk"), { 1845, democratic } },
{ MDDS_ASCII("James Madison"), { 1809, democratic_republican } },
{ MDDS_ASCII("James Monroe"), { 1817, democratic_republican } },
{ MDDS_ASCII("Jimmy Carter"), { 1977, democratic } },
{ MDDS_ASCII("John Adams"), { 1797, federalist } },
{ MDDS_ASCII("John F. Kennedy"), { 1961, democratic } },
{ MDDS_ASCII("John Quincy Adams"), { 1825, democratic_republican } },
{ MDDS_ASCII("John Tyler"), { 1841, whig } },
{ MDDS_ASCII("Lyndon B. Johnson"), { 1963, democratic } },
{ MDDS_ASCII("Martin Van Buren"), { 1837, democratic } },
{ MDDS_ASCII("Millard Fillmore"), { 1850, whig } },
{ MDDS_ASCII("Richard Nixon"), { 1969, republican } },
{ MDDS_ASCII("Ronald Reagan"), { 1981, republican } },
{ MDDS_ASCII("Rutherford B. Hayes"), { 1877, republican } },
{ MDDS_ASCII("Theodore Roosevelt"), { 1901, republican } },
{ MDDS_ASCII("Thomas Jefferson"), { 1801, democratic_republican } },
{ MDDS_ASCII("Ulysses S. Grant"), { 1869, republican } },
{ MDDS_ASCII("Warren G. Harding"), { 1921, republican } },
{ MDDS_ASCII("William Henry Harrison"), { 1841, whig } },
{ MDDS_ASCII("William Howard Taft"), { 1909, republican } },
{ MDDS_ASCII("William McKinley"), { 1897, republican } },
{ MDDS_ASCII("Woodrow Wilson"), { 1913, democratic } },
{ MDDS_ASCII("Zachary Taylor"), { 1849, whig } },
};
Note that we need to add numeric suffixes to the entries for Grover Cleveland,
who became president twice in two separate periods, in order to make the keys
for his entries unique.
Now, proceed to create an instance of :cpp:class:`~mdds::packed_trie_map`::
map_type us_presidents(entries.data(), entries.size());
and inspect its size to make sure it is instantiated properly::
cout << "Number of entries: " << us_presidents.size() << endl;
You should see the following output:
.. code-block:: none
Number of entries: 45
Before we proceed to save the state of this instance, let's define the custom
serializer type first::
struct us_president_serializer
{
union bin_buffer
{
char buffer[2];
uint16_t i16;
affiliated_party_t party;
};
static constexpr bool variable_size = false;
static constexpr size_t value_size = 3;
static void write(std::ostream& os, const us_president& v)
{
bin_buffer buf;
// Write the year value first.
buf.i16 = v.year;
os.write(buf.buffer, 2);
// Write the affiliated party value.
buf.party = v.party;
os.write(buf.buffer, 1);
}
static void read(std::istream& is, size_t n, us_president& v)
{
// For a fixed-size value type, this should equal the defined value size.
assert(n == 3);
bin_buffer buf;
// Read the year value.
is.read(buf.buffer, 2);
v.year = buf.i16;
// Read the affiliated party value.
is.read(buf.buffer, 1);
v.party = buf.party;
}
};
A custom value type can be either variable-size or fixed-size. For a variable-size
value type, each value segment is preceded by the byte length of that segment.
For a fixed-size value type, the byte length of all of the value segments
is written only once up-front, followed by one or more value segments of equal
byte length.
Since the value type in this example is fixed-size, we set the value of the
``variable_size`` static constant to false, and define the size of the value to 3 (bytes)
as the ``value_size`` static constant. Keep in mind that you need to define
the ``value_size`` constant *only* for fixed-size value types; if your value
type is variable-size, you can leave it out.
Additionally, you need to define two static methods - one for writing to the
output stream, and one for reading from the input stream. The write method
must have the following signature::
static void write(std::ostream& os, const T& v);
where the ``T`` is the value type. In the body of this method you write to the
output stream the bytes that represent the value. The length of the bytes you
write must match the size specified by the ``value_size`` constant.
The read method must have the following signature::
static void read(std::istream& is, size_t n, T& v);
where the ``T`` is the value type, and the ``n`` specifies the length of the
bytes you need to read for the value. For a fixed-size value type, the value
of ``n`` should equal the ``value_size`` constant. Your job is to read the
bytes off of the input stream for the length specified by the ``n``, and
populate the value instance passed to the method as the third argument.
Now that we have defined the custom serializer type, let's proceed to save the
state to a file::
std::ofstream outfile("us-presidents.bin", std::ios::binary);
us_presidents.save_state<us_president_serializer>(outfile);
This time around, we are specifying the serializer type explicitly as the template
argument to the :cpp:func:`~mdds::packed_trie_map::save_state` method. Otherwise
it is no different than what we did in the previous example.
Let's create another instance of :cpp:class:`~mdds::packed_trie_map` and restore
the state back from the file we just created::
map_type us_presidents_loaded;
std::ifstream infile("us-presidents.bin", std::ios::binary);
us_presidents_loaded.load_state<us_president_serializer>(infile);
Once again, aside from explicitly specifying the serializer type as the template
argument to the :cpp:func:`~mdds::packed_trie_map::load_state` method, it is
identical to the way we did in the previous example.
Let's compare the new instance with the old one to see if the two are equal::
cout << "Equal to the original? " << std::boolalpha << (us_presidents == us_presidents_loaded) << endl;
The output says:
.. code-block:: none
Equal to the original? true
They are. While we are at it, let's run a simple prefix search to find out
all the US presidents whose first name is 'John'::
cout << "Presidents whose first name is 'John':" << endl;
auto results = us_presidents_loaded.prefix_search("John");
for (const auto& entry : results)
cout << " * " << entry.first << " (" << entry.second.year << "; " << entry.second.party << ")" << endl;
Here is the output:
.. code-block:: none
Presidents whose first name is 'John':
* John Adams (1797; Federalist)
* John F. Kennedy (1961; Democratic)
* John Quincy Adams (1825; Democratic Republican)
* John Tyler (1841; Whig)
This looks like the correct results!
You can find the complete source code for this example `here
<https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_state_custom.cpp>`__.
API Reference
-------------
Trie Map
^^^^^^^^
.. doxygenclass:: mdds::trie_map
:members:
Packed Trie Map
^^^^^^^^^^^^^^^
.. doxygenclass:: mdds::packed_trie_map
:members:
Traits
^^^^^^
.. doxygenstruct:: mdds::trie::std_container_trait
:members:
.. doxygentypedef:: mdds::trie::std_string_trait
Value Serializers
^^^^^^^^^^^^^^^^^
.. doxygenstruct:: mdds::trie::value_serializer
:members:
.. doxygenstruct:: mdds::trie::numeric_value_serializer
:members:
.. doxygenstruct:: mdds::trie::variable_value_serializer
:members:
.. doxygenstruct:: mdds::trie::numeric_sequence_value_serializer
:members: