Repo:
https://salsa.debian.org/debian/apt-xapian-index.git
Idea:
- Many data sources each maintain their own data, but provide a plugin to
index that data in a single, big Xapian database.
- Every data source has its own db (possibly easy-to-read plain text) in
/var/lib/somewhere
- Every data source has a tool to keep its own db up to date (e.g. by
downloading new data from the net)
- Every data source installs a plugin in /usr/share/apt-xapian/index/plugins
that adds information from the data source into Xapian documents during
indexing
Technicalities:
- There is a central update procedure, but it is fed enough data to do
differential updates when Xapian makes that faster than a full rebuild
- Next to the database there is a README file with information about how the
index is built and how it can be queried. Every indexing plugin adds
information to this README
- Xapian values are looked up by index. Indexes are given names using a
configuration file in the style of /etc/services, located at
/etc/apt/xapian-index-values.conf
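
As a rough illustration of the name-to-index lookup, the sketch below parses
text in the style of /etc/services. The exact column layout (name, number,
optional aliases, "#" comments) and the sample entries are assumptions made
for the example; check the real file for the format actually in use.

```python
def parse_values_conf(text):
    """Map each value name (and alias) to its numeric Xapian value index."""
    slots = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip malformed lines instead of failing
        name, slot, aliases = fields[0], int(fields[1]), fields[2:]
        for key in [name] + aliases:
            slots[key] = slot
    return slots


# Hypothetical sample content, not copied from the real configuration file
SAMPLE = """\
# name          index   aliases
installedsize   1       # installed size
packagesize     2       size
"""

slots = parse_values_conf(SAMPLE)
print(slots["installedsize"], slots["size"])
```

With the resulting dictionary, code can ask for a value by name and pass the
number to Xapian, instead of hard-coding the number.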
Writing a plugin:
- Take a look at plugins/template.py: it contains all the methods and full
documentation of what they should do.
- Take a look at the other plugins for examples: there are many of them.
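
To give a feel for the shape of a plugin, here is a minimal sketch. The
method names (info, doc, init, indexDeb822), the module-level init() function,
the "XS" term prefix and the SectionPlugin class are assumptions made for
illustration; plugins/template.py remains the authoritative reference.
FakeDocument stands in for xapian.Document so the sketch runs without Xapian.

```python
class SectionPlugin:
    """Hypothetical plugin that indexes the archive section of a package."""

    def info(self):
        # Describe the data source; a timestamp lets the updater judge
        # whether the data changed since the last indexing run (assumed)
        return dict(timestamp=0)

    def doc(self):
        # Text this plugin would contribute to the README next to the index
        return dict(
            name="Sections",
            shortDesc="package sections",
            fullDoc="Adds one XS-prefixed term with the package section.",
        )

    def init(self, info, progress):
        # Nothing to prepare for this simple data source
        pass

    def indexDeb822(self, document, pkg):
        # pkg is treated as a mapping of control fields (assumed)
        section = pkg.get("Section")
        if section:
            document.add_term("XS" + section.lower())


def init():
    # Assumed entry point: return the plugin object
    return SectionPlugin()


class FakeDocument:
    """Stand-in for xapian.Document that only records added terms."""

    def __init__(self):
        self.terms = []

    def add_term(self, term):
        self.terms.append(term)


doc = FakeDocument()
init().indexDeb822(doc, {"Package": "hello", "Section": "Devel"})
print(doc.terms)
```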
Packages with Xapian bindings that can be used for querying the database:
- C++: libxapian-dev
- Perl: libsearch-xapian-perl
- Ruby: libxapian-ruby1.8
- Python: python-xapian
- Tcl: tclxapian
- PHP: php5-xapian
The C++ API documentation is in the package xapian-doc. The documentation of
the other languages is in the same package as the bindings.
Examples can be found in xapian-examples, as well as in apt-xapian-index.
Some low-level tools to access the database can be found in xapian-tools.
For more information, see http://www.xapian.org and http://www.xapian.org/docs/
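
A query from Python might look like the sketch below. The database path
/var/lib/apt-xapian-index/index, the "XT" prefix for Debtags tags and the
assumption that each document's data holds the package name are not stated in
this file; the README next to the database describes what a given index
really contains. build_terms and search are hypothetical helper names.

```python
def build_terms(words, tags=()):
    """Turn keywords and Debtags tags into index terms."""
    terms = [w.lower() for w in words]
    terms += ["XT" + t for t in tags]  # assumed prefix for tag terms
    return terms


def search(words, tags=(), db_path="/var/lib/apt-xapian-index/index", limit=10):
    """Return (percent, package name) pairs matching all the given terms."""
    # Imported here so build_terms can be used without the bindings installed
    import xapian

    db = xapian.Database(db_path)
    enquire = xapian.Enquire(db)
    # AND together all terms: every keyword and tag must match
    enquire.set_query(xapian.Query(xapian.Query.OP_AND, build_terms(words, tags)))
    results = []
    for match in enquire.get_mset(0, limit):
        data = match.document.get_data()
        # Newer bindings return bytes, older ones a str
        name = data.decode("utf-8") if isinstance(data, bytes) else data
        results.append((match.percent, name))
    return results


# Example call, to be run on a system with the index and bindings installed:
#   for percent, name in search(["image", "editor"], tags=["role::program"]):
#       print(percent, name)
```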
To do:
- Example queries
- Example scripts
- Document libept transition
- Libept transition
- Move the debtags plugin into the debtags package
- Popcon data source
- Iterating data source