'\"! tbl | nroff \-man
'\" t macro stdmacro

.de SAMPLE
.br
.RS 0
.nf
.nh
..
.de ESAMPLE
.hy
.fi
.RE
..

.TH DEBUGINFOD 8
.SH NAME
debuginfod \- debuginfo-related http file-server daemon

.SH SYNOPSIS
.B debuginfod
[\fIOPTION\fP]... [\fIPATH\fP]...

.SH DESCRIPTION
\fBdebuginfod\fP serves debuginfo-related artifacts over HTTP. It
periodically scans a set of directories for ELF/DWARF files and their
associated source code, as well as archive files containing the above, to
build an index by their buildid. This index is used when remote
clients use the HTTP webapi, to fetch these files by the same buildid.

If a debuginfod cannot service a given buildid artifact request
itself, and it is configured with information about upstream
debuginfod servers, it queries them for the same information, just as
\fBdebuginfod-find\fP would. If successful, it locally caches then
relays the file content to the original requester.

Indexing the given PATHs proceeds using multiple threads. One thread
periodically traverses all the given PATHs logically or physically
(see the \fB\-L\fP option). Duplicate PATHs are ignored. You may use
a file name for a PATH, but source code indexing may be incomplete;
prefer using a directory that contains the binaries. The traversal
thread enumerates all matching files (see the \fB\-I\fP and \fB\-X\fP
options) into a work queue. A collection of scanner threads (see the
\fB\-c\fP option) wait at the work queue to analyze files in parallel.

If the \fB\-F\fP option is given, each file is scanned as an ELF/DWARF
file. Source files are matched with DWARF files based on the
AT_comp_dir (compilation directory) attributes inside it. Caution:
source files listed in the DWARF may be a path \fIanywhere\fP in the
file system, and debuginfod will readily serve their content on
demand. (Imagine a doctored DWARF file that lists \fI/etc/passwd\fP
as a source file.) If this is a concern, audit your binaries with
tools such as:

.SAMPLE
% eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
or
% eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
or even use debuginfod itself:
% debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
^C
.ESAMPLE

If any of the \fB\-R\fP, \fB\-U\fP, or \fB\-Z\fP options is given, each
file is scanned as an archive file that may contain ELF/DWARF/source
files. Archive files are recognized by extension. If \-R is given,
".rpm" files are scanned; if \-U is given, ".deb" and ".ddeb" files
are scanned; if \-Z is given, the listed extensions are scanned.
Because of complications such as DWZ-compressed debuginfo, debuginfod
may require \fItwo\fP traversal passes to identify all source code.
Source files for RPMs are only served from other RPMs, so the caution
for \-F does not apply. Note that due to Debian/Ubuntu packaging
policies and mechanisms, debuginfod cannot resolve source files for
DEB/DDEB at all.

If no PATH is listed, or none of the scanning options is given, then
\fBdebuginfod\fP will simply serve content that it accumulated into
its index in all previous runs, and federate to any upstream
debuginfod servers.
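.PP
As a rough illustration of the modes described above, the following
sketch indexes unpackaged binaries and RPMs in one run. The directory
names \fI/opt/build\fP and \fI/srv/rpms\fP and the database location
are placeholders chosen for this example, not defaults:

.SAMPLE
% debuginfod -d /var/tmp/debuginfod.sqlite -F -R /opt/build /srv/rpms
.ESAMPLE

Started later with no scanning options, the same database is simply
served as it stands:

.SAMPLE
% debuginfod -d /var/tmp/debuginfod.sqlite
.ESAMPLE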

.SH OPTIONS

.TP
.B "\-F"
Activate ELF/DWARF file scanning. The default is off.

.TP
.B "\-Z EXT" "\-Z EXT=CMD"
Activate an additional pattern in archive scanning. Files with name
extension EXT (include the dot) will be processed. If CMD is given,
it is invoked with the file name added to its argument list, and
should produce a common archive on its standard output. Otherwise,
the file is read as if CMD were "cat". Since debuginfod internally
uses \fBlibarchive\fP to read archive files, it can accept a wide
range of archive formats and compression modes. The default is no
additional patterns. This option may be repeated.

.TP
.B "\-R"
Activate RPM patterns in archive scanning. The default is off.
Equivalent to \fB\%\-Z\~.rpm=cat\fP, since libarchive can natively
process RPM archives. If your version of libarchive is much older
than 2020, be aware that some distributions have switched to an
incompatible zstd compression for their payload. You may experiment
with \fB\%\-Z\ .rpm='(rpm2cpio|zstdcat)<'\fP instead of \fB\-R\fP.

.TP
.B "\-U"
Activate DEB/DDEB patterns in archive scanning. The default is off.
Equivalent to \fB\%\-Z\ .deb='dpkg-deb\ \-\-fsys\-tarfile'\fP
\fB\%\-Z\ .ddeb='dpkg-deb\ \-\-fsys\-tarfile'\fP.

.TP
.B "\-d FILE" "\-\-database=FILE"
Set the path of the sqlite database used to store the index. This
file is disposable in the sense that a later rescan will repopulate
data. It will contain absolute file path names, so it may not be
portable across machines. It may be frequently read/written, so it
should be on a fast filesystem. It should not be shared across
machines or users, to maximize sqlite locking performance. The
default database file is \%$HOME/.debuginfod.sqlite.

.TP
.B "\-D SQL" "\-\-ddl=SQL"
Execute given sqlite statement after the database is opened and
initialized as extra DDL (SQL data definition language). This may be
useful to tune performance-related pragmas or indexes. May be
repeated. The default is nothing extra.

.TP
.B "\-p NUM" "\-\-port=NUM"
Set the TCP port number (0 < NUM < 65536) on which debuginfod should
listen, to service HTTP requests. Both IPv4 and IPv6 sockets are
opened, if possible. The webapi is documented below. The default
port number is 8002.

.TP
.B "\-I REGEX" "\-\-include=REGEX" "\-X REGEX" "\-\-exclude=REGEX"
Govern the inclusion and exclusion of file names under the search
paths. The regular expressions are interpreted as unanchored POSIX
extended REs, thus may include alternation. They are evaluated
against the full path of each file, based on its \fBrealpath(3)\fP
canonicalization. By default, all files are included and none are
excluded. A file that matches both include and exclude REGEX is
excluded. (The \fIcontents\fP of archive files are not subject to
inclusion or exclusion filtering: they are all processed.) Only the
last of each type of regular expression given is used.

.TP
.B "\-t SECONDS" "\-\-rescan\-time=SECONDS"
Set the rescan time for the file and archive directories. This is the
amount of time the traversal thread will wait after finishing a scan,
before doing it again. A rescan for unchanged files is fast (because
the index also stores the file mtimes). A time of zero is acceptable,
and means that only one initial scan should be performed. The default
rescan time is 300 seconds. Receiving a SIGUSR1 signal triggers a new
scan, independent of the rescan time (including if it was zero),
interrupting a groom pass (if any).

.TP
.B "\-g SECONDS" "\-\-groom\-time=SECONDS"
Set the groom time for the index database. This is the amount of time
the grooming thread will wait after finishing a grooming pass before
doing it again. A groom operation quickly rescans all previously
scanned files, only to see if they are still present and current, so
it can deindex obsolete files. See also the \fIDATA MANAGEMENT\fP
section. The default groom time is 86400 seconds (1 day). A time of
zero is acceptable, and means that only one initial groom should be
performed. Receiving a SIGUSR2 signal triggers a new grooming pass,
independent of the groom time (including if it was zero), interrupting
a rescan pass (if any).

.TP
.B "\-G"
Run an extraordinary maximal-grooming pass at debuginfod startup.
This pass can take considerable time, because it tries to remove any
debuginfo-unrelated content from the archive-related parts of the index.
It should not be run if any recent archive-related indexing operations
were aborted early. It can take considerable space, because it
finishes up with an sqlite "vacuum" operation, which repacks the
database file by triplicating it temporarily. The default is not to
do maximal-grooming. See also the \fIDATA MANAGEMENT\fP section.

.TP
.B "\-c NUM" "\-\-concurrency=NUM"
Set the concurrency limit for the scanning queue threads, which work
together to process archives and files located by the traversal thread.
This is important for controlling CPU-intensive operations like parsing
an ELF file and especially decompressing archives. The default is the
number of processors on the system; the minimum is 1.

.TP
.B "\-L"
Traverse symbolic links encountered during traversal of the PATHs,
including across devices - as in \fIfind\ -L\fP. The default is to
traverse the physical directory structure only, stay on the same
device, and ignore symlinks - as in \fIfind\ -P\ -xdev\fP. Caution: a
loop in the symbolic directory tree might lead to \fIinfinite
traversal\fP.

.TP
.B "\-\-fdcache\-fds=NUM" "\-\-fdcache\-mbs=MB" "\-\-fdcache\-prefetch=NUM2"
Configure limits on a cache that keeps recently extracted files from
archives. Up to NUM requested files and up to a total of MB megabytes
will be kept extracted, in order to avoid having to decompress their
archives over and over again. In addition, up to NUM2 other files
from an archive may be prefetched into the cache before they are even
requested. The default NUM, NUM2, and MB values depend on the
concurrency of the system, and on the available disk space on the
$TMPDIR or \fB/tmp\fP filesystem. This is because that is where the
most recently used extracted files are kept. Grooming cleans this
cache.

.TP
.B "\-\-fdcache\-mintmp=NUM"
Configure a disk space threshold for emergency flushing of the cache.
The filesystem holding the cache is checked periodically. If the
available space falls below the given percentage, the cache is
flushed, and the fdcache will stay disabled until the next groom
cycle. This mechanism, along with a few associated /metrics on the
webapi, is intended to give an operator notice about storage scarcity -
which can translate to RAM scarcity if the disk happens to be on a RAM
virtual disk. The default threshold is 25%.

.TP
.B "\-v"
Increase verbosity of logging to the standard error file descriptor.
May be repeated to increase details. The default verbosity is 0.
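.PP
The rescan and groom controls above can be combined for on-demand
operation. The following is only a sketch: it assumes a single
running debuginfod process, a placeholder archive directory
\fI/srv/rpms\fP, and a system that provides \fBpkill\fP(1). With both
times set to zero, only the initial scan and groom passes run on their
own; later passes are requested by signal.

.SAMPLE
% debuginfod -R -t 0 -g 0 /srv/rpms &
% pkill -USR1 -x debuginfod
% pkill -USR2 -x debuginfod
.ESAMPLE

Here SIGUSR1 requests a rescan and SIGUSR2 requests a grooming pass.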

.SH WEBAPI

.\" Much of the following text is duplicated with debuginfod-find.1

debuginfod's webapi resembles ordinary file service, where a GET
request with a path containing a known buildid results in a file.
Unknown buildid / request combinations result in HTTP error codes.
This file service resemblance is intentional, so that an installation
can take advantage of standard HTTP management infrastructure.

There are three requests. In each case, the buildid is encoded as a
lowercase hexadecimal string. For example, for a program \fI/bin/ls\fP,
look at the ELF note GNU_BUILD_ID:

.SAMPLE
% readelf -n /bin/ls | grep -A4 build.id
Note section [ 4] '.note.gnu.build-id' of 36 bytes at offset 0x340:
  Owner          Data size  Type
  GNU                   20  GNU_BUILD_ID
    Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d
.ESAMPLE

Then the hexadecimal BUILDID is simply:

.SAMPLE
8713b9c3fb8a720137a4a08b325905c7aaf8429d
.ESAMPLE

.SS /buildid/\fIBUILDID\fP/debuginfo

If the given buildid is known to the server, this request will result
in a binary object that contains the customary \fB.*debug_*\fP
sections. This may be a split debuginfo file as created by
\fBstrip\fP, or it may be an original unstripped executable.

.SS /buildid/\fIBUILDID\fP/executable

If the given buildid is known to the server, this request will result
in a binary object that contains the normal executable segments. This
may be an executable stripped by \fBstrip\fP, or it may be an original
unstripped executable. \fBET_DYN\fP shared libraries are considered
to be a type of executable.

.SS /buildid/\fIBUILDID\fP/source\fI/SOURCE/FILE\fP

If the given buildid is known to the server, this request will result
in a binary object that contains the source file mentioned. The path
should be absolute. Relative path names commonly appear in the DWARF
file's source directory, but these paths are relative to
individual compilation unit AT_comp_dir paths, and yet an executable
is made up of multiple CUs. Therefore, to disambiguate, debuginfod
expects source queries to prefix relative path names with the CU
compilation-directory, followed by a mandatory "/".

Note: the caller may or may not elide \fB../\fP or \fB/./\fP or extraneous
\fB///\fP sorts of path components in the directory names. debuginfod
accepts both forms. Specifically, debuginfod canonicalizes path names
according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing
any \fB//\fP to \fB/\fP in the path.

For example:
.TS
l l.
#include <stdio.h>	/buildid/BUILDID/source/usr/include/stdio.h
/path/to/foo.c	/buildid/BUILDID/source/path/to/foo.c
\../bar/foo.c AT_comp_dir=/zoo/	/buildid/BUILDID/source/zoo//../bar/foo.c
.TE
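.PP
As a hedged illustration of the three requests, the commands below use
\fBcurl\fP(1) against a server assumed to be listening on localhost at
the default port 8002, with the example buildid of \fI/bin/ls\fP shown
earlier. The output file names are arbitrary, and the source path is
only an example of a file the binary might reference:

.SAMPLE
% curl -f -o ls.debug http://localhost:8002/buildid/8713b9c3fb8a720137a4a08b325905c7aaf8429d/debuginfo
% curl -f -o ls.exe http://localhost:8002/buildid/8713b9c3fb8a720137a4a08b325905c7aaf8429d/executable
% curl -f http://localhost:8002/buildid/8713b9c3fb8a720137a4a08b325905c7aaf8429d/source/usr/include/stdio.h
.ESAMPLE

The curl \-f option makes the command exit with a failure status when
the server answers with an HTTP error code, as it does for an unknown
buildid.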

.SS /metrics

This endpoint returns a Prometheus formatted text/plain dump of a
variety of statistics about the operation of the debuginfod server.
The exact set of metrics and their meanings may change in future
versions. Caution: configuration information (path names, versions)
may be disclosed.
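.PP
For a quick manual look at the dump, a sketch such as the following may
do. It again assumes a server on localhost at the default port; the
sort step is merely a convenience for reading:

.SAMPLE
% curl -s http://localhost:8002/metrics | sort
.ESAMPLE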

.SH DATA MANAGEMENT

debuginfod stores its index in an sqlite database in a densely packed
set of interlinked tables. While the representation is as efficient
as we have been able to make it, it still takes a considerable amount
of data to record all debuginfo-related data of potentially a great
many files. This section offers some advice about the implications.

As a general explanation for size, consider that when debuginfod
indexes ELF/DWARF files, it stores their names, their referenced
source file names, and their buildids. When indexing archives, it stores
every file name \fIof or in\fP an archive, every buildid, plus every
source file name referenced from a DWARF file. (Indexing archives
takes more space because the source files often reside in separate
subpackages that may not be indexed at the same pass, so extra
metadata has to be kept.)

Getting down to numbers, in the case of Fedora RPMs (essentially,
gzip-compressed cpio files), the sqlite index database tends to be
from 0.5% to 3% of their size. It's larger for binaries that are
assembled out of a great many source files, or packages that carry
much debuginfo-unrelated content. It may be even larger during the
indexing phase due to temporary sqlite write-ahead-logging files;
these are checkpointed (cleaned out and removed) at shutdown. It may
be helpful to apply tight \-I or \-X regular-expression constraints to
exclude files from scanning that you know have no debuginfo-relevant
content.

As debuginfod runs, it periodically rescans its target directories,
and any new content found is added to the database. Old content, such
as data for files that have disappeared or that have been replaced
with newer versions, is removed at a periodic \fIgrooming\fP pass.
This means that the sqlite files grow fast during initial indexing,
slowly during index rescans, and periodically shrink during grooming.
There is also an optional one-shot \fImaximal grooming\fP pass
available. It removes debuginfo-unrelated data from the
archive content index, such as file names found in archives ("archive
sdef" records) that are not referred to as source files from any
binaries found in archives ("archive sref" records). This can save
considerable disk space. However, it is slow and temporarily requires
up to twice the database size as free space. Worse: it may result in
missing source-code info if the archive traversals were interrupted,
so that not all source file references were known. Use it rarely to
polish a complete index.

You should ensure that ample disk space remains available. (The flood
of error messages on -ENOSPC is ugly and nagging. But, like for most
other errors, debuginfod will resume when resources permit.) If
necessary, debuginfod can be stopped, the database file moved or
removed, and debuginfod restarted.

sqlite offers several performance-related options in the form of
pragmas. Some may be useful to fine-tune the defaults plus the
debuginfod extras. The \-D option may be useful to tell debuginfod to
execute the given bits of SQL after the basic schema creation
commands. For example, the "synchronous", "cache_size",
"auto_vacuum", "threads", "journal_mode" pragmas may be fun to tweak
via \-D, if you're searching for peak performance. The "optimize",
"wal_checkpoint" pragmas may be useful to run periodically, outside
debuginfod. The default settings are performance- rather than
reliability-oriented, so a hardware crash might corrupt the database.
In these cases, it may be necessary to manually delete the sqlite
database and start over.
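.PP
As one possible starting point, and not a recommendation, pragmas can
be passed with repeated \-D options. The pragma values and the archive
directory \fI/srv/rpms\fP below are illustrative assumptions; consult
the sqlite documentation before relying on them:

.SAMPLE
% debuginfod -D 'pragma synchronous=normal;' -D 'pragma cache_size=-65536;' -R /srv/rpms
.ESAMPLE

The maintenance pragmas can be run from outside debuginfod with
\fBsqlite3\fP(1), shown here against the default database file:

.SAMPLE
% sqlite3 $HOME/.debuginfod.sqlite 'pragma wal_checkpoint(truncate); pragma optimize;'
.ESAMPLE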
.PP
As debuginfod changes in the future, we may have no choice but to
change the database schema in an incompatible manner. If this
happens, new versions of debuginfod will issue SQL statements to
\fIdrop\fP all prior schema & data, and start over. So, disk space
will not be wasted for retaining a no-longer-useable dataset.

In summary, if your system can bear a 0.5%-3% index-to-archive-dataset
size ratio, and slow growth afterwards, you should not need to
worry about disk space. If a system crash corrupts the database,
or you want to force debuginfod to reset and start over, simply
erase the sqlite file before restarting debuginfod.

.SH SECURITY

debuginfod \fBdoes not\fP include any particular security features.
While it is robust with respect to inputs, some abuse is possible. It
forks a new thread for each incoming HTTP request, which could lead to
a denial-of-service in terms of RAM, CPU, disk I/O, or network I/O.
If this is a problem, users are advised to install debuginfod with an
HTTPS reverse-proxy front-end that enforces site policies for
firewalling, authentication, integrity, authorization, and load
control. The \fI/metrics\fP webapi endpoint is probably not
appropriate for disclosure to the public.

When relaying queries to upstream debuginfods, debuginfod \fBdoes not\fP
include any particular security features. It trusts that the binaries
returned by the debuginfods are accurate. Therefore, the list of
servers should include only trustworthy ones. If accessed across HTTP
rather than HTTPS, the network should be trustworthy. Authentication
information through the internal \fIlibcurl\fP library is not currently
enabled.

.SH "ENVIRONMENT VARIABLES"

.TP
.B TMPDIR
This environment variable points to a file system to be used for
temporary files. The default is /tmp.

.TP
.B DEBUGINFOD_URLS
This environment variable contains a list of URL prefixes for trusted
debuginfod instances. Alternate URL prefixes are separated by space.
Avoid referential loops that cause a server to contact itself, directly
or indirectly - the results would be hilarious.

.TP
.B DEBUGINFOD_TIMEOUT
This environment variable governs the timeout for each debuginfod HTTP
connection. A server that fails to provide at least 100K of data
within this many seconds is skipped. The default is 90 seconds. (Zero
or negative means "no timeout".)

.TP
.B DEBUGINFOD_CACHE_PATH
This environment variable governs the location of the cache where
downloaded files are kept. It is cleaned periodically as this
program is reexecuted. If XDG_CACHE_HOME is set then
$XDG_CACHE_HOME/debuginfod_client is the default location, otherwise
$HOME/.cache/debuginfod_client is used. For more information regarding
the client cache see \fIdebuginfod_find_debuginfo(3)\fP.
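.PP
To sketch how federation might be configured, the following starts a
server that relays requests it cannot service itself to two upstream
servers. Both URLs and the directory \fI/opt/build\fP are hypothetical
placeholders, and the 30 second timeout is an arbitrary choice:

.SAMPLE
% env DEBUGINFOD_URLS="https://debuginfod.example.org/ http://10.0.0.5:8002/" DEBUGINFOD_TIMEOUT=30 debuginfod -F /opt/build
.ESAMPLE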

.SH FILES
.LP
.PD .1v
.TP 20
.B $HOME/.debuginfod.sqlite
Default database file.
.PD

.TP 20
.B $XDG_CACHE_HOME/debuginfod_client
Default cache directory for content from upstream debuginfods.
If XDG_CACHE_HOME is not set then \fB$HOME/.cache/debuginfod_client\fP
is used.
.PD


.SH "SEE ALSO"
.I "debuginfod-find(1)"
.I "sqlite3(1)"
.I \%https://prometheus.io/docs/instrumenting/exporters/