3396 lines
137 KiB
ReStructuredText
3396 lines
137 KiB
ReStructuredText
![]() |
============================
|
|||
|
Clang Compiler User's Manual
|
|||
|
============================
|
|||
|
|
|||
|
.. include:: <isonum.txt>
|
|||
|
|
|||
|
.. contents::
|
|||
|
:local:
|
|||
|
|
|||
|
Introduction
|
|||
|
============
|
|||
|
|
|||
|
The Clang Compiler is an open-source compiler for the C family of
|
|||
|
programming languages, aiming to be the best in class implementation of
|
|||
|
these languages. Clang builds on the LLVM optimizer and code generator,
|
|||
|
allowing it to provide high-quality optimization and code generation
|
|||
|
support for many targets. For more general information, please see the
|
|||
|
`Clang Web Site <https://clang.llvm.org>`_ or the `LLVM Web
|
|||
|
Site <https://llvm.org>`_.
|
|||
|
|
|||
|
This document describes important notes about using Clang as a compiler
|
|||
|
for an end-user, documenting the supported features, command line
|
|||
|
options, etc. If you are interested in using Clang to build a tool that
|
|||
|
processes code, please see :doc:`InternalsManual`. If you are interested in the
|
|||
|
`Clang Static Analyzer <https://clang-analyzer.llvm.org>`_, please see its web
|
|||
|
page.
|
|||
|
|
|||
|
Clang is one component in a complete toolchain for C family languages.
|
|||
|
A separate document describes the other pieces necessary to
|
|||
|
:doc:`assemble a complete toolchain <Toolchain>`.
|
|||
|
|
|||
|
Clang is designed to support the C family of programming languages,
|
|||
|
which includes :ref:`C <c>`, :ref:`Objective-C <objc>`, :ref:`C++ <cxx>`, and
|
|||
|
:ref:`Objective-C++ <objcxx>` as well as many dialects of those. For
|
|||
|
language-specific information, please see the corresponding language
|
|||
|
specific section:
|
|||
|
|
|||
|
- :ref:`C Language <c>`: K&R C, ANSI C89, ISO C90, ISO C94 (C89+AMD1), ISO
|
|||
|
C99 (+TC1, TC2, TC3).
|
|||
|
- :ref:`Objective-C Language <objc>`: ObjC 1, ObjC 2, ObjC 2.1, plus
|
|||
|
variants depending on base language.
|
|||
|
- :ref:`C++ Language <cxx>`
|
|||
|
- :ref:`Objective C++ Language <objcxx>`
|
|||
|
- :ref:`OpenCL C Language <opencl>`: v1.0, v1.1, v1.2, v2.0.
|
|||
|
|
|||
|
In addition to these base languages and their dialects, Clang supports a
|
|||
|
broad variety of language extensions, which are documented in the
|
|||
|
corresponding language section. These extensions are provided to be
|
|||
|
compatible with the GCC, Microsoft, and other popular compilers as well
|
|||
|
as to improve functionality through Clang-specific features. The Clang
|
|||
|
driver and language features are intentionally designed to be as
|
|||
|
compatible with the GNU GCC compiler as reasonably possible, easing
|
|||
|
migration from GCC to Clang. In most cases, code "just works".
|
|||
|
Clang also provides an alternative driver, :ref:`clang-cl`, that is designed
|
|||
|
to be compatible with the Visual C++ compiler, cl.exe.
|
|||
|
|
|||
|
In addition to language specific features, Clang has a variety of
|
|||
|
features that depend on what CPU architecture or operating system is
|
|||
|
being compiled for. Please see the :ref:`Target-Specific Features and
|
|||
|
Limitations <target_features>` section for more details.
|
|||
|
|
|||
|
The rest of the introduction introduces some basic :ref:`compiler
|
|||
|
terminology <terminology>` that is used throughout this manual and
|
|||
|
contains a basic :ref:`introduction to using Clang <basicusage>` as a
|
|||
|
command line compiler.
|
|||
|
|
|||
|
.. _terminology:
|
|||
|
|
|||
|
Terminology
|
|||
|
-----------
|
|||
|
|
|||
|
Front end, parser, backend, preprocessor, undefined behavior,
|
|||
|
diagnostic, optimizer
|
|||
|
|
|||
|
.. _basicusage:
|
|||
|
|
|||
|
Basic Usage
|
|||
|
-----------
|
|||
|
|
|||
|
Intro to how to use a C compiler for newbies.
|
|||
|
|
|||
|
compile + link compile then link debug info enabling optimizations
|
|||
|
picking a language to use, defaults to C11 by default. Autosenses based
|
|||
|
on extension. using a makefile
|
|||
|
|
|||
|
Command Line Options
|
|||
|
====================
|
|||
|
|
|||
|
This section is generally an index into other sections. It does not go
|
|||
|
into depth on the ones that are covered by other sections. However, the
|
|||
|
first part introduces the language selection and other high level
|
|||
|
options like :option:`-c`, :option:`-g`, etc.
|
|||
|
|
|||
|
Options to Control Error and Warning Messages
|
|||
|
---------------------------------------------
|
|||
|
|
|||
|
.. option:: -Werror
|
|||
|
|
|||
|
Turn warnings into errors.
|
|||
|
|
|||
|
.. This is in plain monospaced font because it generates the same label as
|
|||
|
.. -Werror, and Sphinx complains.
|
|||
|
|
|||
|
``-Werror=foo``
|
|||
|
|
|||
|
Turn warning "foo" into an error.
|
|||
|
|
|||
|
.. option:: -Wno-error=foo
|
|||
|
|
|||
|
Turn warning "foo" into a warning even if :option:`-Werror` is specified.
|
|||
|
|
|||
|
.. option:: -Wfoo
|
|||
|
|
|||
|
Enable warning "foo".
|
|||
|
See the :doc:`diagnostics reference <DiagnosticsReference>` for a complete
|
|||
|
list of the warning flags that can be specified in this way.
|
|||
|
|
|||
|
.. option:: -Wno-foo
|
|||
|
|
|||
|
Disable warning "foo".
|
|||
|
|
|||
|
.. option:: -w
|
|||
|
|
|||
|
Disable all diagnostics.
|
|||
|
|
|||
|
.. option:: -Weverything
|
|||
|
|
|||
|
:ref:`Enable all diagnostics. <diagnostics_enable_everything>`
|
|||
|
|
|||
|
.. option:: -pedantic
|
|||
|
|
|||
|
Warn on language extensions.
|
|||
|
|
|||
|
.. option:: -pedantic-errors
|
|||
|
|
|||
|
Error on language extensions.
|
|||
|
|
|||
|
.. option:: -Wsystem-headers
|
|||
|
|
|||
|
Enable warnings from system headers.
|
|||
|
|
|||
|
.. option:: -ferror-limit=123
|
|||
|
|
|||
|
Stop emitting diagnostics after 123 errors have been produced. The default is
|
|||
|
20, and the error limit can be disabled with `-ferror-limit=0`.
|
|||
|
|
|||
|
.. option:: -ftemplate-backtrace-limit=123
|
|||
|
|
|||
|
Only emit up to 123 template instantiation notes within the template
|
|||
|
instantiation backtrace for a single warning or error. The default is 10, and
|
|||
|
the limit can be disabled with `-ftemplate-backtrace-limit=0`.
|
|||
|
|
|||
|
.. _cl_diag_formatting:
|
|||
|
|
|||
|
Formatting of Diagnostics
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Clang aims to produce beautiful diagnostics by default, particularly for
|
|||
|
new users that first come to Clang. However, different people have
|
|||
|
different preferences, and sometimes Clang is driven not by a human,
|
|||
|
but by a program that wants consistent and easily parsable output. For
|
|||
|
these cases, Clang provides a wide range of options to control the exact
|
|||
|
output format of the diagnostics that it generates.
|
|||
|
|
|||
|
.. _opt_fshow-column:
|
|||
|
|
|||
|
**-f[no-]show-column**
|
|||
|
Print column number in diagnostic.
|
|||
|
|
|||
|
This option, which defaults to on, controls whether or not Clang
|
|||
|
prints the column number of a diagnostic. For example, when this is
|
|||
|
enabled, Clang will print something like:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
When this is disabled, Clang will print "test.c:28: warning..." with
|
|||
|
no column number.
|
|||
|
|
|||
|
The printed column numbers count bytes from the beginning of the
|
|||
|
line; take care if your source contains multibyte characters.
|
|||
|
|
|||
|
.. _opt_fshow-source-location:
|
|||
|
|
|||
|
**-f[no-]show-source-location**
|
|||
|
Print source file/line/column information in diagnostic.
|
|||
|
|
|||
|
This option, which defaults to on, controls whether or not Clang
|
|||
|
prints the filename, line number and column number of a diagnostic.
|
|||
|
For example, when this is enabled, Clang will print something like:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
When this is disabled, Clang will not print the "test.c:28:8: "
|
|||
|
part.
|
|||
|
|
|||
|
.. _opt_fcaret-diagnostics:
|
|||
|
|
|||
|
**-f[no-]caret-diagnostics**
|
|||
|
Print source line and ranges from source code in diagnostic.
|
|||
|
This option, which defaults to on, controls whether or not Clang
|
|||
|
prints the source line, source ranges, and caret when emitting a
|
|||
|
diagnostic. For example, when this is enabled, Clang will print
|
|||
|
something like:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
**-f[no-]color-diagnostics**
|
|||
|
This option, which defaults to on when a color-capable terminal is
|
|||
|
detected, controls whether or not Clang prints diagnostics in color.
|
|||
|
|
|||
|
When this option is enabled, Clang will use colors to highlight
|
|||
|
specific parts of the diagnostic, e.g.,
|
|||
|
|
|||
|
.. nasty hack to not lose our dignity
|
|||
|
|
|||
|
.. raw:: html
|
|||
|
|
|||
|
<pre>
|
|||
|
<b><span style="color:black">test.c:28:8: <span style="color:magenta">warning</span>: extra tokens at end of #endif directive [-Wextra-tokens]</span></b>
|
|||
|
#endif bad
|
|||
|
<span style="color:green">^</span>
|
|||
|
<span style="color:green">//</span>
|
|||
|
</pre>
|
|||
|
|
|||
|
When this is disabled, Clang will just print:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:2:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
**-fansi-escape-codes**
|
|||
|
Controls whether ANSI escape codes are used instead of the Windows Console
|
|||
|
API to output colored diagnostics. This option is only used on Windows and
|
|||
|
defaults to off.
|
|||
|
|
|||
|
.. option:: -fdiagnostics-format=clang/msvc/vi
|
|||
|
|
|||
|
Changes diagnostic output format to better match IDEs and command line tools.
|
|||
|
|
|||
|
This option controls the output format of the filename, line number,
|
|||
|
and column printed in diagnostic messages. The options, and their
|
|||
|
affect on formatting a simple conversion diagnostic, follow:
|
|||
|
|
|||
|
**clang** (default)
|
|||
|
::
|
|||
|
|
|||
|
t.c:3:11: warning: conversion specifies type 'char *' but the argument has type 'int'
|
|||
|
|
|||
|
**msvc**
|
|||
|
::
|
|||
|
|
|||
|
t.c(3,11) : warning: conversion specifies type 'char *' but the argument has type 'int'
|
|||
|
|
|||
|
**vi**
|
|||
|
::
|
|||
|
|
|||
|
t.c +3:11: warning: conversion specifies type 'char *' but the argument has type 'int'
|
|||
|
|
|||
|
.. _opt_fdiagnostics-show-option:
|
|||
|
|
|||
|
**-f[no-]diagnostics-show-option**
|
|||
|
Enable ``[-Woption]`` information in diagnostic line.
|
|||
|
|
|||
|
This option, which defaults to on, controls whether or not Clang
|
|||
|
prints the associated :ref:`warning group <cl_diag_warning_groups>`
|
|||
|
option name when outputting a warning diagnostic. For example, in
|
|||
|
this output:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
Passing **-fno-diagnostics-show-option** will prevent Clang from
|
|||
|
printing the [:ref:`-Wextra-tokens <opt_Wextra-tokens>`] information in
|
|||
|
the diagnostic. This information tells you the flag needed to enable
|
|||
|
or disable the diagnostic, either from the command line or through
|
|||
|
:ref:`#pragma GCC diagnostic <pragma_GCC_diagnostic>`.
|
|||
|
|
|||
|
.. _opt_fdiagnostics-show-category:
|
|||
|
|
|||
|
.. option:: -fdiagnostics-show-category=none/id/name
|
|||
|
|
|||
|
Enable printing category information in diagnostic line.
|
|||
|
|
|||
|
This option, which defaults to "none", controls whether or not Clang
|
|||
|
prints the category associated with a diagnostic when emitting it.
|
|||
|
Each diagnostic may or many not have an associated category, if it
|
|||
|
has one, it is listed in the diagnostic categorization field of the
|
|||
|
diagnostic line (in the []'s).
|
|||
|
|
|||
|
For example, a format string warning will produce these three
|
|||
|
renditions based on the setting of this option:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
t.c:3:11: warning: conversion specifies type 'char *' but the argument has type 'int' [-Wformat]
|
|||
|
t.c:3:11: warning: conversion specifies type 'char *' but the argument has type 'int' [-Wformat,1]
|
|||
|
t.c:3:11: warning: conversion specifies type 'char *' but the argument has type 'int' [-Wformat,Format String]
|
|||
|
|
|||
|
This category can be used by clients that want to group diagnostics
|
|||
|
by category, so it should be a high level category. We want dozens
|
|||
|
of these, not hundreds or thousands of them.
|
|||
|
|
|||
|
.. _opt_fsave-optimization-record:
|
|||
|
|
|||
|
.. option:: -fsave-optimization-record[=<format>]
|
|||
|
|
|||
|
Write optimization remarks to a separate file.
|
|||
|
|
|||
|
This option, which defaults to off, controls whether Clang writes
|
|||
|
optimization reports to a separate file. By recording diagnostics in a file,
|
|||
|
users can parse or sort the remarks in a convenient way.
|
|||
|
|
|||
|
By default, the serialization format is YAML.
|
|||
|
|
|||
|
The supported serialization formats are:
|
|||
|
|
|||
|
- .. _opt_fsave_optimization_record_yaml:
|
|||
|
|
|||
|
``-fsave-optimization-record=yaml``: A structured YAML format.
|
|||
|
|
|||
|
.. _opt_foptimization-record-file:
|
|||
|
|
|||
|
**-foptimization-record-file**
|
|||
|
Control the file to which optimization reports are written.
|
|||
|
|
|||
|
When optimization reports are being output (see
|
|||
|
:ref:`-fsave-optimization-record <opt_fsave-optimization-record>`), this
|
|||
|
option controls the file to which those reports are written.
|
|||
|
|
|||
|
If this option is not used, optimization records are output to a file named
|
|||
|
after the primary file being compiled. If that's "foo.c", for example,
|
|||
|
optimization records are output to "foo.opt.yaml". If a specific
|
|||
|
serialization format is specified, the file will be named
|
|||
|
"foo.opt.<format>".
|
|||
|
|
|||
|
.. _opt_foptimization-record-passes:
|
|||
|
|
|||
|
**-foptimization-record-passes**
|
|||
|
Only include passes which match a specified regular expression.
|
|||
|
|
|||
|
When optimization reports are being output (see
|
|||
|
:ref:`-fsave-optimization-record <opt_fsave-optimization-record>`), this
|
|||
|
option controls the passes that will be included in the final report.
|
|||
|
|
|||
|
If this option is not used, all the passes are included in the optimization
|
|||
|
record.
|
|||
|
|
|||
|
.. _opt_fdiagnostics-show-hotness:
|
|||
|
|
|||
|
**-f[no-]diagnostics-show-hotness**
|
|||
|
Enable profile hotness information in diagnostic line.
|
|||
|
|
|||
|
This option controls whether Clang prints the profile hotness associated
|
|||
|
with diagnostics in the presence of profile-guided optimization information.
|
|||
|
This is currently supported with optimization remarks (see
|
|||
|
:ref:`Options to Emit Optimization Reports <rpass>`). The hotness information
|
|||
|
allows users to focus on the hot optimization remarks that are likely to be
|
|||
|
more relevant for run-time performance.
|
|||
|
|
|||
|
For example, in this output, the block containing the callsite of `foo` was
|
|||
|
executed 3000 times according to the profile data:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
s.c:7:10: remark: foo inlined into bar (hotness: 3000) [-Rpass-analysis=inline]
|
|||
|
sum += foo(x, x - 2);
|
|||
|
^
|
|||
|
|
|||
|
This option is implied when
|
|||
|
:ref:`-fsave-optimization-record <opt_fsave-optimization-record>` is used.
|
|||
|
Otherwise, it defaults to off.
|
|||
|
|
|||
|
.. _opt_fdiagnostics-hotness-threshold:
|
|||
|
|
|||
|
**-fdiagnostics-hotness-threshold**
|
|||
|
Prevent optimization remarks from being output if they do not have at least
|
|||
|
this hotness value.
|
|||
|
|
|||
|
This option, which defaults to zero, controls the minimum hotness an
|
|||
|
optimization remark would need in order to be output by Clang. This is
|
|||
|
currently supported with optimization remarks (see :ref:`Options to Emit
|
|||
|
Optimization Reports <rpass>`) when profile hotness information in
|
|||
|
diagnostics is enabled (see
|
|||
|
:ref:`-fdiagnostics-show-hotness <opt_fdiagnostics-show-hotness>`).
|
|||
|
|
|||
|
.. _opt_fdiagnostics-fixit-info:
|
|||
|
|
|||
|
**-f[no-]diagnostics-fixit-info**
|
|||
|
Enable "FixIt" information in the diagnostics output.
|
|||
|
|
|||
|
This option, which defaults to on, controls whether or not Clang
|
|||
|
prints the information on how to fix a specific diagnostic
|
|||
|
underneath it when it knows. For example, in this output:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
//
|
|||
|
|
|||
|
Passing **-fno-diagnostics-fixit-info** will prevent Clang from
|
|||
|
printing the "//" line at the end of the message. This information
|
|||
|
is useful for users who may not understand what is wrong, but can be
|
|||
|
confusing for machine parsing.
|
|||
|
|
|||
|
.. _opt_fdiagnostics-print-source-range-info:
|
|||
|
|
|||
|
**-fdiagnostics-print-source-range-info**
|
|||
|
Print machine parsable information about source ranges.
|
|||
|
This option makes Clang print information about source ranges in a machine
|
|||
|
parsable format after the file/line/column number information. The
|
|||
|
information is a simple sequence of brace enclosed ranges, where each range
|
|||
|
lists the start and end line/column locations. For example, in this output:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
exprs.c:47:15:{47:8-47:14}{47:17-47:24}: error: invalid operands to binary expression ('int *' and '_Complex float')
|
|||
|
P = (P-42) + Gamma*4;
|
|||
|
~~~~~~ ^ ~~~~~~~
|
|||
|
|
|||
|
The {}'s are generated by -fdiagnostics-print-source-range-info.
|
|||
|
|
|||
|
The printed column numbers count bytes from the beginning of the
|
|||
|
line; take care if your source contains multibyte characters.
|
|||
|
|
|||
|
.. option:: -fdiagnostics-parseable-fixits
|
|||
|
|
|||
|
Print Fix-Its in a machine parseable form.
|
|||
|
|
|||
|
This option makes Clang print available Fix-Its in a machine
|
|||
|
parseable format at the end of diagnostics. The following example
|
|||
|
illustrates the format:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
fix-it:"t.cpp":{7:25-7:29}:"Gamma"
|
|||
|
|
|||
|
The range printed is a half-open range, so in this example the
|
|||
|
characters at column 25 up to but not including column 29 on line 7
|
|||
|
in t.cpp should be replaced with the string "Gamma". Either the
|
|||
|
range or the replacement string may be empty (representing strict
|
|||
|
insertions and strict erasures, respectively). Both the file name
|
|||
|
and the insertion string escape backslash (as "\\\\"), tabs (as
|
|||
|
"\\t"), newlines (as "\\n"), double quotes(as "\\"") and
|
|||
|
non-printable characters (as octal "\\xxx").
|
|||
|
|
|||
|
The printed column numbers count bytes from the beginning of the
|
|||
|
line; take care if your source contains multibyte characters.
|
|||
|
|
|||
|
.. option:: -fno-elide-type
|
|||
|
|
|||
|
Turns off elision in template type printing.
|
|||
|
|
|||
|
The default for template type printing is to elide as many template
|
|||
|
arguments as possible, removing those which are the same in both
|
|||
|
template types, leaving only the differences. Adding this flag will
|
|||
|
print all the template arguments. If supported by the terminal,
|
|||
|
highlighting will still appear on differing arguments.
|
|||
|
|
|||
|
Default:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
t.cc:4:5: note: candidate function not viable: no known conversion from 'vector<map<[...], map<float, [...]>>>' to 'vector<map<[...], map<double, [...]>>>' for 1st argument;
|
|||
|
|
|||
|
-fno-elide-type:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
t.cc:4:5: note: candidate function not viable: no known conversion from 'vector<map<int, map<float, int>>>' to 'vector<map<int, map<double, int>>>' for 1st argument;
|
|||
|
|
|||
|
.. option:: -fdiagnostics-show-template-tree
|
|||
|
|
|||
|
Template type diffing prints a text tree.
|
|||
|
|
|||
|
For diffing large templated types, this option will cause Clang to
|
|||
|
display the templates as an indented text tree, one argument per
|
|||
|
line, with differences marked inline. This is compatible with
|
|||
|
-fno-elide-type.
|
|||
|
|
|||
|
Default:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
t.cc:4:5: note: candidate function not viable: no known conversion from 'vector<map<[...], map<float, [...]>>>' to 'vector<map<[...], map<double, [...]>>>' for 1st argument;
|
|||
|
|
|||
|
With :option:`-fdiagnostics-show-template-tree`:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
t.cc:4:5: note: candidate function not viable: no known conversion for 1st argument;
|
|||
|
vector<
|
|||
|
map<
|
|||
|
[...],
|
|||
|
map<
|
|||
|
[float != double],
|
|||
|
[...]>>>
|
|||
|
|
|||
|
.. _cl_diag_warning_groups:
|
|||
|
|
|||
|
Individual Warning Groups
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
TODO: Generate this from tblgen. Define one anchor per warning group.
|
|||
|
|
|||
|
.. _opt_wextra-tokens:
|
|||
|
|
|||
|
.. option:: -Wextra-tokens
|
|||
|
|
|||
|
Warn about excess tokens at the end of a preprocessor directive.
|
|||
|
|
|||
|
This option, which defaults to on, enables warnings about extra
|
|||
|
tokens at the end of preprocessor directives. For example:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
test.c:28:8: warning: extra tokens at end of #endif directive [-Wextra-tokens]
|
|||
|
#endif bad
|
|||
|
^
|
|||
|
|
|||
|
These extra tokens are not strictly conforming, and are usually best
|
|||
|
handled by commenting them out.
|
|||
|
|
|||
|
.. option:: -Wambiguous-member-template
|
|||
|
|
|||
|
Warn about unqualified uses of a member template whose name resolves to
|
|||
|
another template at the location of the use.
|
|||
|
|
|||
|
This option, which defaults to on, enables a warning in the
|
|||
|
following code:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
template<typename T> struct set{};
|
|||
|
template<typename T> struct trait { typedef const T& type; };
|
|||
|
struct Value {
|
|||
|
template<typename T> void set(typename trait<T>::type value) {}
|
|||
|
};
|
|||
|
void foo() {
|
|||
|
Value v;
|
|||
|
v.set<double>(3.2);
|
|||
|
}
|
|||
|
|
|||
|
C++ [basic.lookup.classref] requires this to be an error, but,
|
|||
|
because it's hard to work around, Clang downgrades it to a warning
|
|||
|
as an extension.
|
|||
|
|
|||
|
.. option:: -Wbind-to-temporary-copy
|
|||
|
|
|||
|
Warn about an unusable copy constructor when binding a reference to a
|
|||
|
temporary.
|
|||
|
|
|||
|
This option enables warnings about binding a
|
|||
|
reference to a temporary when the temporary doesn't have a usable
|
|||
|
copy constructor. For example:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
struct NonCopyable {
|
|||
|
NonCopyable();
|
|||
|
private:
|
|||
|
NonCopyable(const NonCopyable&);
|
|||
|
};
|
|||
|
void foo(const NonCopyable&);
|
|||
|
void bar() {
|
|||
|
foo(NonCopyable()); // Disallowed in C++98; allowed in C++11.
|
|||
|
}
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
struct NonCopyable2 {
|
|||
|
NonCopyable2();
|
|||
|
NonCopyable2(NonCopyable2&);
|
|||
|
};
|
|||
|
void foo(const NonCopyable2&);
|
|||
|
void bar() {
|
|||
|
foo(NonCopyable2()); // Disallowed in C++98; allowed in C++11.
|
|||
|
}
|
|||
|
|
|||
|
Note that if ``NonCopyable2::NonCopyable2()`` has a default argument
|
|||
|
whose instantiation produces a compile error, that error will still
|
|||
|
be a hard error in C++98 mode even if this warning is turned off.
|
|||
|
|
|||
|
Options to Control Clang Crash Diagnostics
|
|||
|
------------------------------------------
|
|||
|
|
|||
|
As unbelievable as it may sound, Clang does crash from time to time.
|
|||
|
Generally, this only occurs to those living on the `bleeding
|
|||
|
edge <https://llvm.org/releases/download.html#svn>`_. Clang goes to great
|
|||
|
lengths to assist you in filing a bug report. Specifically, Clang
|
|||
|
generates preprocessed source file(s) and associated run script(s) upon
|
|||
|
a crash. These files should be attached to a bug report to ease
|
|||
|
reproducibility of the failure. Below are the command line options to
|
|||
|
control the crash diagnostics.
|
|||
|
|
|||
|
.. option:: -fno-crash-diagnostics
|
|||
|
|
|||
|
Disable auto-generation of preprocessed source files during a clang crash.
|
|||
|
|
|||
|
The -fno-crash-diagnostics flag can be helpful for speeding the process
|
|||
|
of generating a delta reduced test case.
|
|||
|
|
|||
|
Clang is also capable of generating preprocessed source file(s) and associated
|
|||
|
run script(s) even without a crash. This is specially useful when trying to
|
|||
|
generate a reproducer for warnings or errors while using modules.
|
|||
|
|
|||
|
.. option:: -gen-reproducer
|
|||
|
|
|||
|
Generates preprocessed source files, a reproducer script and if relevant, a
|
|||
|
cache containing: built module pcm's and all headers needed to rebuilt the
|
|||
|
same modules.
|
|||
|
|
|||
|
.. _rpass:
|
|||
|
|
|||
|
Options to Emit Optimization Reports
|
|||
|
------------------------------------
|
|||
|
|
|||
|
Optimization reports trace, at a high-level, all the major decisions
|
|||
|
done by compiler transformations. For instance, when the inliner
|
|||
|
decides to inline function ``foo()`` into ``bar()``, or the loop unroller
|
|||
|
decides to unroll a loop N times, or the vectorizer decides to
|
|||
|
vectorize a loop body.
|
|||
|
|
|||
|
Clang offers a family of flags which the optimizers can use to emit
|
|||
|
a diagnostic in three cases:
|
|||
|
|
|||
|
1. When the pass makes a transformation (`-Rpass`).
|
|||
|
|
|||
|
2. When the pass fails to make a transformation (`-Rpass-missed`).
|
|||
|
|
|||
|
3. When the pass determines whether or not to make a transformation
|
|||
|
(`-Rpass-analysis`).
|
|||
|
|
|||
|
NOTE: Although the discussion below focuses on `-Rpass`, the exact
|
|||
|
same options apply to `-Rpass-missed` and `-Rpass-analysis`.
|
|||
|
|
|||
|
Since there are dozens of passes inside the compiler, each of these flags
|
|||
|
take a regular expression that identifies the name of the pass which should
|
|||
|
emit the associated diagnostic. For example, to get a report from the inliner,
|
|||
|
compile the code with:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -O2 -Rpass=inline code.cc -o code
|
|||
|
code.cc:4:25: remark: foo inlined into bar [-Rpass=inline]
|
|||
|
int bar(int j) { return foo(j, j - 2); }
|
|||
|
^
|
|||
|
|
|||
|
Note that remarks from the inliner are identified with `[-Rpass=inline]`.
|
|||
|
To request a report from every optimization pass, you should use
|
|||
|
`-Rpass=.*` (in fact, you can use any valid POSIX regular
|
|||
|
expression). However, do not expect a report from every transformation
|
|||
|
made by the compiler. Optimization remarks do not really make sense
|
|||
|
outside of the major transformations (e.g., inlining, vectorization,
|
|||
|
loop optimizations) and not every optimization pass supports this
|
|||
|
feature.
|
|||
|
|
|||
|
Note that when using profile-guided optimization information, profile hotness
|
|||
|
information can be included in the remarks (see
|
|||
|
:ref:`-fdiagnostics-show-hotness <opt_fdiagnostics-show-hotness>`).
|
|||
|
|
|||
|
Current limitations
|
|||
|
^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
1. Optimization remarks that refer to function names will display the
|
|||
|
mangled name of the function. Since these remarks are emitted by the
|
|||
|
back end of the compiler, it does not know anything about the input
|
|||
|
language, nor its mangling rules.
|
|||
|
|
|||
|
2. Some source locations are not displayed correctly. The front end has
|
|||
|
a more detailed source location tracking than the locations included
|
|||
|
in the debug info (e.g., the front end can locate code inside macro
|
|||
|
expansions). However, the locations used by `-Rpass` are
|
|||
|
translated from debug annotations. That translation can be lossy,
|
|||
|
which results in some remarks having no location information.
|
|||
|
|
|||
|
Other Options
|
|||
|
-------------
|
|||
|
Clang options that don't fit neatly into other categories.
|
|||
|
|
|||
|
.. option:: -MV
|
|||
|
|
|||
|
When emitting a dependency file, use formatting conventions appropriate
|
|||
|
for NMake or Jom. Ignored unless another option causes Clang to emit a
|
|||
|
dependency file.
|
|||
|
|
|||
|
When Clang emits a dependency file (e.g., you supplied the -M option)
|
|||
|
most filenames can be written to the file without any special formatting.
|
|||
|
Different Make tools will treat different sets of characters as "special"
|
|||
|
and use different conventions for telling the Make tool that the character
|
|||
|
is actually part of the filename. Normally Clang uses backslash to "escape"
|
|||
|
a special character, which is the convention used by GNU Make. The -MV
|
|||
|
option tells Clang to put double-quotes around the entire filename, which
|
|||
|
is the convention used by NMake and Jom.
|
|||
|
|
|||
|
Configuration files
|
|||
|
-------------------
|
|||
|
|
|||
|
Configuration files group command-line options and allow all of them to be
|
|||
|
specified just by referencing the configuration file. They may be used, for
|
|||
|
example, to collect options required to tune compilation for particular
|
|||
|
target, such as -L, -I, -l, --sysroot, codegen options, etc.
|
|||
|
|
|||
|
The command line option `--config` can be used to specify configuration
|
|||
|
file in a Clang invocation. For example:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
clang --config /home/user/cfgs/testing.txt
|
|||
|
clang --config debug.cfg
|
|||
|
|
|||
|
If the provided argument contains a directory separator, it is considered as
|
|||
|
a file path, and options are read from that file. Otherwise the argument is
|
|||
|
treated as a file name and is searched for sequentially in the directories:
|
|||
|
|
|||
|
- user directory,
|
|||
|
- system directory,
|
|||
|
- the directory where Clang executable resides.
|
|||
|
|
|||
|
Both user and system directories for configuration files are specified during
|
|||
|
clang build using CMake parameters, CLANG_CONFIG_FILE_USER_DIR and
|
|||
|
CLANG_CONFIG_FILE_SYSTEM_DIR respectively. The first file found is used. It is
|
|||
|
an error if the required file cannot be found.
|
|||
|
|
|||
|
Another way to specify a configuration file is to encode it in executable name.
|
|||
|
For example, if the Clang executable is named `armv7l-clang` (it may be a
|
|||
|
symbolic link to `clang`), then Clang will search for file `armv7l.cfg` in the
|
|||
|
directory where Clang resides.
|
|||
|
|
|||
|
If a driver mode is specified in invocation, Clang tries to find a file specific
|
|||
|
for the specified mode. For example, if the executable file is named
|
|||
|
`x86_64-clang-cl`, Clang first looks for `x86_64-cl.cfg` and if it is not found,
|
|||
|
looks for `x86_64.cfg`.
|
|||
|
|
|||
|
If the command line contains options that effectively change target architecture
|
|||
|
(these are -m32, -EL, and some others) and the configuration file starts with an
|
|||
|
architecture name, Clang tries to load the configuration file for the effective
|
|||
|
architecture. For example, invocation:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
x86_64-clang -m32 abc.c
|
|||
|
|
|||
|
causes Clang search for a file `i368.cfg` first, and if no such file is found,
|
|||
|
Clang looks for the file `x86_64.cfg`.
|
|||
|
|
|||
|
The configuration file consists of command-line options specified on one or
|
|||
|
more lines. Lines composed of whitespace characters only are ignored as well as
|
|||
|
lines in which the first non-blank character is `#`. Long options may be split
|
|||
|
between several lines by a trailing backslash. Here is example of a
|
|||
|
configuration file:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
# Several options on line
|
|||
|
-c --target=x86_64-unknown-linux-gnu
|
|||
|
|
|||
|
# Long option split between lines
|
|||
|
-I/usr/lib/gcc/x86_64-linux-gnu/5.4.0/../../../../\
|
|||
|
include/c++/5.4.0
|
|||
|
|
|||
|
# other config files may be included
|
|||
|
@linux.options
|
|||
|
|
|||
|
Files included by `@file` directives in configuration files are resolved
|
|||
|
relative to the including file. For example, if a configuration file
|
|||
|
`~/.llvm/target.cfg` contains the directive `@os/linux.opts`, the file
|
|||
|
`linux.opts` is searched for in the directory `~/.llvm/os`.
|
|||
|
|
|||
|
Language and Target-Independent Features
|
|||
|
========================================
|
|||
|
|
|||
|
Controlling Errors and Warnings
|
|||
|
-------------------------------
|
|||
|
|
|||
|
Clang provides a number of ways to control which code constructs cause
|
|||
|
it to emit errors and warning messages, and how they are displayed to
|
|||
|
the console.
|
|||
|
|
|||
|
Controlling How Clang Displays Diagnostics
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
When Clang emits a diagnostic, it includes rich information in the
|
|||
|
output, and gives you fine-grain control over which information is
|
|||
|
printed. Clang has the ability to print this information, and these are
|
|||
|
the options that control it:
|
|||
|
|
|||
|
#. A file/line/column indicator that shows exactly where the diagnostic
|
|||
|
occurs in your code [:ref:`-fshow-column <opt_fshow-column>`,
|
|||
|
:ref:`-fshow-source-location <opt_fshow-source-location>`].
|
|||
|
#. A categorization of the diagnostic as a note, warning, error, or
|
|||
|
fatal error.
|
|||
|
#. A text string that describes what the problem is.
|
|||
|
#. An option that indicates how to control the diagnostic (for
|
|||
|
diagnostics that support it)
|
|||
|
[:ref:`-fdiagnostics-show-option <opt_fdiagnostics-show-option>`].
|
|||
|
#. A :ref:`high-level category <diagnostics_categories>` for the diagnostic
|
|||
|
for clients that want to group diagnostics by class (for diagnostics
|
|||
|
that support it)
|
|||
|
[:ref:`-fdiagnostics-show-category <opt_fdiagnostics-show-category>`].
|
|||
|
#. The line of source code that the issue occurs on, along with a caret
|
|||
|
and ranges that indicate the important locations
|
|||
|
[:ref:`-fcaret-diagnostics <opt_fcaret-diagnostics>`].
|
|||
|
#. "FixIt" information, which is a concise explanation of how to fix the
|
|||
|
problem (when Clang is certain it knows)
|
|||
|
[:ref:`-fdiagnostics-fixit-info <opt_fdiagnostics-fixit-info>`].
|
|||
|
#. A machine-parsable representation of the ranges involved (off by
|
|||
|
default)
|
|||
|
[:ref:`-fdiagnostics-print-source-range-info <opt_fdiagnostics-print-source-range-info>`].
|
|||
|
|
|||
|
For more information please see :ref:`Formatting of
|
|||
|
Diagnostics <cl_diag_formatting>`.
|
|||
|
|
|||
|
Diagnostic Mappings
|
|||
|
^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
All diagnostics are mapped into one of these 6 classes:
|
|||
|
|
|||
|
- Ignored
|
|||
|
- Note
|
|||
|
- Remark
|
|||
|
- Warning
|
|||
|
- Error
|
|||
|
- Fatal
|
|||
|
|
|||
|
.. _diagnostics_categories:
|
|||
|
|
|||
|
Diagnostic Categories
|
|||
|
^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Though not shown by default, diagnostics may each be associated with a
|
|||
|
high-level category. This category is intended to make it possible to
|
|||
|
triage builds that produce a large number of errors or warnings in a
|
|||
|
grouped way.
|
|||
|
|
|||
|
Categories are not shown by default, but they can be turned on with the
|
|||
|
:ref:`-fdiagnostics-show-category <opt_fdiagnostics-show-category>` option.
|
|||
|
When set to "``name``", the category is printed textually in the
|
|||
|
diagnostic output. When it is set to "``id``", a category number is
|
|||
|
printed. The mapping of category names to category id's can be obtained
|
|||
|
by running '``clang --print-diagnostic-categories``'.
|
|||
|
|
|||
|
Controlling Diagnostics via Command Line Flags
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
TODO: -W flags, -pedantic, etc
|
|||
|
|
|||
|
.. _pragma_gcc_diagnostic:
|
|||
|
|
|||
|
Controlling Diagnostics via Pragmas
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Clang can also control what diagnostics are enabled through the use of
|
|||
|
pragmas in the source code. This is useful for turning off specific
|
|||
|
warnings in a section of source code. Clang supports GCC's pragma for
|
|||
|
compatibility with existing source code, as well as several extensions.
|
|||
|
|
|||
|
The pragma may control any warning that can be used from the command
|
|||
|
line. Warnings may be set to ignored, warning, error, or fatal. The
|
|||
|
following example code will tell Clang or GCC to ignore the -Wall
|
|||
|
warnings:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#pragma GCC diagnostic ignored "-Wall"
|
|||
|
|
|||
|
In addition to all of the functionality provided by GCC's pragma, Clang
|
|||
|
also allows you to push and pop the current warning state. This is
|
|||
|
particularly useful when writing a header file that will be compiled by
|
|||
|
other people, because you don't know what warning flags they build with.
|
|||
|
|
|||
|
In the below example :option:`-Wextra-tokens` is ignored for only a single line
|
|||
|
of code, after which the diagnostics return to whatever state had previously
|
|||
|
existed.
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#if foo
|
|||
|
#endif foo // warning: extra tokens at end of #endif directive
|
|||
|
|
|||
|
#pragma clang diagnostic push
|
|||
|
#pragma clang diagnostic ignored "-Wextra-tokens"
|
|||
|
|
|||
|
#if foo
|
|||
|
#endif foo // no warning
|
|||
|
|
|||
|
#pragma clang diagnostic pop
|
|||
|
|
|||
|
The push and pop pragmas will save and restore the full diagnostic state
|
|||
|
of the compiler, regardless of how it was set. That means that it is
|
|||
|
possible to use push and pop around GCC compatible diagnostics and Clang
|
|||
|
will push and pop them appropriately, while GCC will ignore the pushes
|
|||
|
and pops as unknown pragmas. It should be noted that while Clang
|
|||
|
supports the GCC pragma, Clang and GCC do not support the exact same set
|
|||
|
of warnings, so even when using GCC compatible #pragmas there is no
|
|||
|
guarantee that they will have identical behaviour on both compilers.
|
|||
|
|
|||
|
In addition to controlling warnings and errors generated by the compiler, it is
|
|||
|
possible to generate custom warning and error messages through the following
|
|||
|
pragmas:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
// The following will produce warning messages
|
|||
|
#pragma message "some diagnostic message"
|
|||
|
#pragma GCC warning "TODO: replace deprecated feature"
|
|||
|
|
|||
|
// The following will produce an error message
|
|||
|
#pragma GCC error "Not supported"
|
|||
|
|
|||
|
These pragmas operate similarly to the ``#warning`` and ``#error`` preprocessor
|
|||
|
directives, except that they may also be embedded into preprocessor macros via
|
|||
|
the C99 ``_Pragma`` operator, for example:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#define STR(X) #X
|
|||
|
#define DEFER(M,...) M(__VA_ARGS__)
|
|||
|
#define CUSTOM_ERROR(X) _Pragma(STR(GCC error(X " at line " DEFER(STR,__LINE__))))
|
|||
|
|
|||
|
CUSTOM_ERROR("Feature not available");
|
|||
|
|
|||
|
Controlling Diagnostics in System Headers
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Warnings are suppressed when they occur in system headers. By default,
|
|||
|
an included file is treated as a system header if it is found in an
|
|||
|
include path specified by ``-isystem``, but this can be overridden in
|
|||
|
several ways.
|
|||
|
|
|||
|
The ``system_header`` pragma can be used to mark the current file as
|
|||
|
being a system header. No warnings will be produced from the location of
|
|||
|
the pragma onwards within the same file.
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#if foo
|
|||
|
#endif foo // warning: extra tokens at end of #endif directive
|
|||
|
|
|||
|
#pragma clang system_header
|
|||
|
|
|||
|
#if foo
|
|||
|
#endif foo // no warning
|
|||
|
|
|||
|
The `--system-header-prefix=` and `--no-system-header-prefix=`
|
|||
|
command-line arguments can be used to override whether subsets of an include
|
|||
|
path are treated as system headers. When the name in a ``#include`` directive
|
|||
|
is found within a header search path and starts with a system prefix, the
|
|||
|
header is treated as a system header. The last prefix on the
|
|||
|
command-line which matches the specified header name takes precedence.
|
|||
|
For instance:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -Ifoo -isystem bar --system-header-prefix=x/ \
|
|||
|
--no-system-header-prefix=x/y/
|
|||
|
|
|||
|
Here, ``#include "x/a.h"`` is treated as including a system header, even
|
|||
|
if the header is found in ``foo``, and ``#include "x/y/b.h"`` is treated
|
|||
|
as not including a system header, even if the header is found in
|
|||
|
``bar``.
|
|||
|
|
|||
|
A ``#include`` directive which finds a file relative to the current
|
|||
|
directory is treated as including a system header if the including file
|
|||
|
is treated as a system header.
|
|||
|
|
|||
|
.. _diagnostics_enable_everything:
|
|||
|
|
|||
|
Enabling All Diagnostics
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
In addition to the traditional ``-W`` flags, one can enable **all**
|
|||
|
diagnostics by passing :option:`-Weverything`. This works as expected
|
|||
|
with
|
|||
|
:option:`-Werror`, and also includes the warnings from :option:`-pedantic`.
|
|||
|
|
|||
|
Note that when combined with :option:`-w` (which disables all warnings), that
|
|||
|
flag wins.
|
|||
|
|
|||
|
Controlling Static Analyzer Diagnostics
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
While not strictly part of the compiler, the diagnostics from Clang's
|
|||
|
`static analyzer <https://clang-analyzer.llvm.org>`_ can also be
|
|||
|
influenced by the user via changes to the source code. See the available
|
|||
|
`annotations <https://clang-analyzer.llvm.org/annotations.html>`_ and the
|
|||
|
analyzer's `FAQ
|
|||
|
page <https://clang-analyzer.llvm.org/faq.html#exclude_code>`_ for more
|
|||
|
information.
|
|||
|
|
|||
|
.. _usersmanual-precompiled-headers:
|
|||
|
|
|||
|
Precompiled Headers
|
|||
|
-------------------
|
|||
|
|
|||
|
`Precompiled headers <https://en.wikipedia.org/wiki/Precompiled_header>`_
|
|||
|
are a general approach employed by many compilers to reduce compilation
|
|||
|
time. The underlying motivation of the approach is that it is common for
|
|||
|
the same (and often large) header files to be included by multiple
|
|||
|
source files. Consequently, compile times can often be greatly improved
|
|||
|
by caching some of the (redundant) work done by a compiler to process
|
|||
|
headers. Precompiled header files, which represent one of many ways to
|
|||
|
implement this optimization, are literally files that represent an
|
|||
|
on-disk cache that contains the vital information necessary to reduce
|
|||
|
some of the work needed to process a corresponding header file. While
|
|||
|
details of precompiled headers vary between compilers, precompiled
|
|||
|
headers have been shown to be highly effective at speeding up program
|
|||
|
compilation on systems with very large system headers (e.g., macOS).
|
|||
|
|
|||
|
Generating a PCH File
|
|||
|
^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
To generate a PCH file using Clang, one invokes Clang with the
|
|||
|
`-x <language>-header` option. This mirrors the interface in GCC
|
|||
|
for generating PCH files:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ gcc -x c-header test.h -o test.h.gch
|
|||
|
$ clang -x c-header test.h -o test.h.pch
|
|||
|
|
|||
|
Using a PCH File
|
|||
|
^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
A PCH file can then be used as a prefix header when a :option:`-include`
|
|||
|
option is passed to ``clang``:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -include test.h test.c -o test
|
|||
|
|
|||
|
The ``clang`` driver will first check if a PCH file for ``test.h`` is
|
|||
|
available; if so, the contents of ``test.h`` (and the files it includes)
|
|||
|
will be processed from the PCH file. Otherwise, Clang falls back to
|
|||
|
directly processing the content of ``test.h``. This mirrors the behavior
|
|||
|
of GCC.
|
|||
|
|
|||
|
.. note::
|
|||
|
|
|||
|
Clang does *not* automatically use PCH files for headers that are directly
|
|||
|
included within a source file. For example:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -x c-header test.h -o test.h.pch
|
|||
|
$ cat test.c
|
|||
|
#include "test.h"
|
|||
|
$ clang test.c -o test
|
|||
|
|
|||
|
In this example, ``clang`` will not automatically use the PCH file for
|
|||
|
``test.h`` since ``test.h`` was included directly in the source file and not
|
|||
|
specified on the command line using :option:`-include`.
|
|||
|
|
|||
|
Relocatable PCH Files
|
|||
|
^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
It is sometimes necessary to build a precompiled header from headers
|
|||
|
that are not yet in their final, installed locations. For example, one
|
|||
|
might build a precompiled header within the build tree that is then
|
|||
|
meant to be installed alongside the headers. Clang permits the creation
|
|||
|
of "relocatable" precompiled headers, which are built with a given path
|
|||
|
(into the build directory) and can later be used from an installed
|
|||
|
location.
|
|||
|
|
|||
|
To build a relocatable precompiled header, place your headers into a
|
|||
|
subdirectory whose structure mimics the installed location. For example,
|
|||
|
if you want to build a precompiled header for the header ``mylib.h``
|
|||
|
that will be installed into ``/usr/include``, create a subdirectory
|
|||
|
``build/usr/include`` and place the header ``mylib.h`` into that
|
|||
|
subdirectory. If ``mylib.h`` depends on other headers, then they can be
|
|||
|
stored within ``build/usr/include`` in a way that mimics the installed
|
|||
|
location.
|
|||
|
|
|||
|
Building a relocatable precompiled header requires two additional
|
|||
|
arguments. First, pass the ``--relocatable-pch`` flag to indicate that
|
|||
|
the resulting PCH file should be relocatable. Second, pass
|
|||
|
``-isysroot /path/to/build``, which makes all includes for your library
|
|||
|
relative to the build directory. For example:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
# clang -x c-header --relocatable-pch -isysroot /path/to/build /path/to/build/mylib.h mylib.h.pch
|
|||
|
|
|||
|
When loading the relocatable PCH file, the various headers used in the
|
|||
|
PCH file are found from the system header root. For example, ``mylib.h``
|
|||
|
can be found in ``/usr/include/mylib.h``. If the headers are installed
|
|||
|
in some other system root, the ``-isysroot`` option can be used provide
|
|||
|
a different system root from which the headers will be based. For
|
|||
|
example, ``-isysroot /Developer/SDKs/MacOSX10.4u.sdk`` will look for
|
|||
|
``mylib.h`` in ``/Developer/SDKs/MacOSX10.4u.sdk/usr/include/mylib.h``.
|
|||
|
|
|||
|
Relocatable precompiled headers are intended to be used in a limited
|
|||
|
number of cases where the compilation environment is tightly controlled
|
|||
|
and the precompiled header cannot be generated after headers have been
|
|||
|
installed.
|
|||
|
|
|||
|
.. _controlling-code-generation:
|
|||
|
|
|||
|
Controlling Code Generation
|
|||
|
---------------------------
|
|||
|
|
|||
|
Clang provides a number of ways to control code generation. The options
|
|||
|
are listed below.
|
|||
|
|
|||
|
**-f[no-]sanitize=check1,check2,...**
|
|||
|
Turn on runtime checks for various forms of undefined or suspicious
|
|||
|
behavior.
|
|||
|
|
|||
|
This option controls whether Clang adds runtime checks for various
|
|||
|
forms of undefined or suspicious behavior, and is disabled by
|
|||
|
default. If a check fails, a diagnostic message is produced at
|
|||
|
runtime explaining the problem. The main checks are:
|
|||
|
|
|||
|
- .. _opt_fsanitize_address:
|
|||
|
|
|||
|
``-fsanitize=address``:
|
|||
|
:doc:`AddressSanitizer`, a memory error
|
|||
|
detector.
|
|||
|
- .. _opt_fsanitize_thread:
|
|||
|
|
|||
|
``-fsanitize=thread``: :doc:`ThreadSanitizer`, a data race detector.
|
|||
|
- .. _opt_fsanitize_memory:
|
|||
|
|
|||
|
``-fsanitize=memory``: :doc:`MemorySanitizer`,
|
|||
|
a detector of uninitialized reads. Requires instrumentation of all
|
|||
|
program code.
|
|||
|
- .. _opt_fsanitize_undefined:
|
|||
|
|
|||
|
``-fsanitize=undefined``: :doc:`UndefinedBehaviorSanitizer`,
|
|||
|
a fast and compatible undefined behavior checker.
|
|||
|
|
|||
|
- ``-fsanitize=dataflow``: :doc:`DataFlowSanitizer`, a general data
|
|||
|
flow analysis.
|
|||
|
- ``-fsanitize=cfi``: :doc:`control flow integrity <ControlFlowIntegrity>`
|
|||
|
checks. Requires ``-flto``.
|
|||
|
- ``-fsanitize=safe-stack``: :doc:`safe stack <SafeStack>`
|
|||
|
protection against stack-based memory corruption errors.
|
|||
|
|
|||
|
There are more fine-grained checks available: see
|
|||
|
the :ref:`list <ubsan-checks>` of specific kinds of
|
|||
|
undefined behavior that can be detected and the :ref:`list <cfi-schemes>`
|
|||
|
of control flow integrity schemes.
|
|||
|
|
|||
|
The ``-fsanitize=`` argument must also be provided when linking, in
|
|||
|
order to link to the appropriate runtime library.
|
|||
|
|
|||
|
It is not possible to combine more than one of the ``-fsanitize=address``,
|
|||
|
``-fsanitize=thread``, and ``-fsanitize=memory`` checkers in the same
|
|||
|
program.
|
|||
|
|
|||
|
**-f[no-]sanitize-recover=check1,check2,...**
|
|||
|
|
|||
|
**-f[no-]sanitize-recover=all**
|
|||
|
|
|||
|
Controls which checks enabled by ``-fsanitize=`` flag are non-fatal.
|
|||
|
If the check is fatal, program will halt after the first error
|
|||
|
of this kind is detected and error report is printed.
|
|||
|
|
|||
|
By default, non-fatal checks are those enabled by
|
|||
|
:doc:`UndefinedBehaviorSanitizer`,
|
|||
|
except for ``-fsanitize=return`` and ``-fsanitize=unreachable``. Some
|
|||
|
sanitizers may not support recovery (or not support it by default
|
|||
|
e.g. :doc:`AddressSanitizer`), and always crash the program after the issue
|
|||
|
is detected.
|
|||
|
|
|||
|
Note that the ``-fsanitize-trap`` flag has precedence over this flag.
|
|||
|
This means that if a check has been configured to trap elsewhere on the
|
|||
|
command line, or if the check traps by default, this flag will not have
|
|||
|
any effect unless that sanitizer's trapping behavior is disabled with
|
|||
|
``-fno-sanitize-trap``.
|
|||
|
|
|||
|
For example, if a command line contains the flags ``-fsanitize=undefined
|
|||
|
-fsanitize-trap=undefined``, the flag ``-fsanitize-recover=alignment``
|
|||
|
will have no effect on its own; it will need to be accompanied by
|
|||
|
``-fno-sanitize-trap=alignment``.
|
|||
|
|
|||
|
**-f[no-]sanitize-trap=check1,check2,...**
|
|||
|
|
|||
|
Controls which checks enabled by the ``-fsanitize=`` flag trap. This
|
|||
|
option is intended for use in cases where the sanitizer runtime cannot
|
|||
|
be used (for instance, when building libc or a kernel module), or where
|
|||
|
the binary size increase caused by the sanitizer runtime is a concern.
|
|||
|
|
|||
|
This flag is only compatible with :doc:`control flow integrity
|
|||
|
<ControlFlowIntegrity>` schemes and :doc:`UndefinedBehaviorSanitizer`
|
|||
|
checks other than ``vptr``. If this flag
|
|||
|
is supplied together with ``-fsanitize=undefined``, the ``vptr`` sanitizer
|
|||
|
will be implicitly disabled.
|
|||
|
|
|||
|
This flag is enabled by default for sanitizers in the ``cfi`` group.
|
|||
|
|
|||
|
.. option:: -fsanitize-blacklist=/path/to/blacklist/file
|
|||
|
|
|||
|
Disable or modify sanitizer checks for objects (source files, functions,
|
|||
|
variables, types) listed in the file. See
|
|||
|
:doc:`SanitizerSpecialCaseList` for file format description.
|
|||
|
|
|||
|
.. option:: -fno-sanitize-blacklist
|
|||
|
|
|||
|
Don't use blacklist file, if it was specified earlier in the command line.
|
|||
|
|
|||
|
**-f[no-]sanitize-coverage=[type,features,...]**
|
|||
|
|
|||
|
Enable simple code coverage in addition to certain sanitizers.
|
|||
|
See :doc:`SanitizerCoverage` for more details.
|
|||
|
|
|||
|
**-f[no-]sanitize-stats**
|
|||
|
|
|||
|
Enable simple statistics gathering for the enabled sanitizers.
|
|||
|
See :doc:`SanitizerStats` for more details.
|
|||
|
|
|||
|
.. option:: -fsanitize-undefined-trap-on-error
|
|||
|
|
|||
|
Deprecated alias for ``-fsanitize-trap=undefined``.
|
|||
|
|
|||
|
.. option:: -fsanitize-cfi-cross-dso
|
|||
|
|
|||
|
Enable cross-DSO control flow integrity checks. This flag modifies
|
|||
|
the behavior of sanitizers in the ``cfi`` group to allow checking
|
|||
|
of cross-DSO virtual and indirect calls.
|
|||
|
|
|||
|
.. option:: -fsanitize-cfi-icall-generalize-pointers
|
|||
|
|
|||
|
Generalize pointers in return and argument types in function type signatures
|
|||
|
checked by Control Flow Integrity indirect call checking. See
|
|||
|
:doc:`ControlFlowIntegrity` for more details.
|
|||
|
|
|||
|
.. option:: -fstrict-vtable-pointers
|
|||
|
|
|||
|
Enable optimizations based on the strict rules for overwriting polymorphic
|
|||
|
C++ objects, i.e. the vptr is invariant during an object's lifetime.
|
|||
|
This enables better devirtualization. Turned off by default, because it is
|
|||
|
still experimental.
|
|||
|
|
|||
|
.. option:: -ffast-math
|
|||
|
|
|||
|
Enable fast-math mode. This defines the ``__FAST_MATH__`` preprocessor
|
|||
|
macro, and lets the compiler make aggressive, potentially-lossy assumptions
|
|||
|
about floating-point math. These include:
|
|||
|
|
|||
|
* Floating-point math obeys regular algebraic rules for real numbers (e.g.
|
|||
|
``+`` and ``*`` are associative, ``x/y == x * (1/y)``, and
|
|||
|
``(a + b) * c == a * c + b * c``),
|
|||
|
* operands to floating-point operations are not equal to ``NaN`` and
|
|||
|
``Inf``, and
|
|||
|
* ``+0`` and ``-0`` are interchangeable.
|
|||
|
|
|||
|
.. option:: -fdenormal-fp-math=[values]
|
|||
|
|
|||
|
Select which denormal numbers the code is permitted to require.
|
|||
|
|
|||
|
Valid values are: ``ieee``, ``preserve-sign``, and ``positive-zero``,
|
|||
|
which correspond to IEEE 754 denormal numbers, the sign of a
|
|||
|
flushed-to-zero number is preserved in the sign of 0, denormals are
|
|||
|
flushed to positive zero, respectively.
|
|||
|
|
|||
|
.. option:: -f[no-]strict-float-cast-overflow
|
|||
|
|
|||
|
When a floating-point value is not representable in a destination integer
|
|||
|
type, the code has undefined behavior according to the language standard.
|
|||
|
By default, Clang will not guarantee any particular result in that case.
|
|||
|
With the 'no-strict' option, Clang attempts to match the overflowing behavior
|
|||
|
of the target's native float-to-int conversion instructions.
|
|||
|
|
|||
|
.. option:: -fwhole-program-vtables
|
|||
|
|
|||
|
Enable whole-program vtable optimizations, such as single-implementation
|
|||
|
devirtualization and virtual constant propagation, for classes with
|
|||
|
:doc:`hidden LTO visibility <LTOVisibility>`. Requires ``-flto``.
|
|||
|
|
|||
|
.. option:: -fforce-emit-vtables
|
|||
|
|
|||
|
In order to improve devirtualization, forces emitting of vtables even in
|
|||
|
modules where it isn't necessary. It causes more inline virtual functions
|
|||
|
to be emitted.
|
|||
|
|
|||
|
.. option:: -fno-assume-sane-operator-new
|
|||
|
|
|||
|
Don't assume that the C++'s new operator is sane.
|
|||
|
|
|||
|
This option tells the compiler to do not assume that C++'s global
|
|||
|
new operator will always return a pointer that does not alias any
|
|||
|
other pointer when the function returns.
|
|||
|
|
|||
|
.. option:: -ftrap-function=[name]
|
|||
|
|
|||
|
Instruct code generator to emit a function call to the specified
|
|||
|
function name for ``__builtin_trap()``.
|
|||
|
|
|||
|
LLVM code generator translates ``__builtin_trap()`` to a trap
|
|||
|
instruction if it is supported by the target ISA. Otherwise, the
|
|||
|
builtin is translated into a call to ``abort``. If this option is
|
|||
|
set, then the code generator will always lower the builtin to a call
|
|||
|
to the specified function regardless of whether the target ISA has a
|
|||
|
trap instruction. This option is useful for environments (e.g.
|
|||
|
deeply embedded) where a trap cannot be properly handled, or when
|
|||
|
some custom behavior is desired.
|
|||
|
|
|||
|
.. option:: -ftls-model=[model]
|
|||
|
|
|||
|
Select which TLS model to use.
|
|||
|
|
|||
|
Valid values are: ``global-dynamic``, ``local-dynamic``,
|
|||
|
``initial-exec`` and ``local-exec``. The default value is
|
|||
|
``global-dynamic``. The compiler may use a different model if the
|
|||
|
selected model is not supported by the target, or if a more
|
|||
|
efficient model can be used. The TLS model can be overridden per
|
|||
|
variable using the ``tls_model`` attribute.
|
|||
|
|
|||
|
.. option:: -femulated-tls
|
|||
|
|
|||
|
Select emulated TLS model, which overrides all -ftls-model choices.
|
|||
|
|
|||
|
In emulated TLS mode, all access to TLS variables are converted to
|
|||
|
calls to __emutls_get_address in the runtime library.
|
|||
|
|
|||
|
.. option:: -mhwdiv=[values]
|
|||
|
|
|||
|
Select the ARM modes (arm or thumb) that support hardware division
|
|||
|
instructions.
|
|||
|
|
|||
|
Valid values are: ``arm``, ``thumb`` and ``arm,thumb``.
|
|||
|
This option is used to indicate which mode (arm or thumb) supports
|
|||
|
hardware division instructions. This only applies to the ARM
|
|||
|
architecture.
|
|||
|
|
|||
|
.. option:: -m[no-]crc
|
|||
|
|
|||
|
Enable or disable CRC instructions.
|
|||
|
|
|||
|
This option is used to indicate whether CRC instructions are to
|
|||
|
be generated. This only applies to the ARM architecture.
|
|||
|
|
|||
|
CRC instructions are enabled by default on ARMv8.
|
|||
|
|
|||
|
.. option:: -mgeneral-regs-only
|
|||
|
|
|||
|
Generate code which only uses the general purpose registers.
|
|||
|
|
|||
|
This option restricts the generated code to use general registers
|
|||
|
only. This only applies to the AArch64 architecture.
|
|||
|
|
|||
|
.. option:: -mcompact-branches=[values]
|
|||
|
|
|||
|
Control the usage of compact branches for MIPSR6.
|
|||
|
|
|||
|
Valid values are: ``never``, ``optimal`` and ``always``.
|
|||
|
The default value is ``optimal`` which generates compact branches
|
|||
|
when a delay slot cannot be filled. ``never`` disables the usage of
|
|||
|
compact branches and ``always`` generates compact branches whenever
|
|||
|
possible.
|
|||
|
|
|||
|
**-f[no-]max-type-align=[number]**
|
|||
|
Instruct the code generator to not enforce a higher alignment than the given
|
|||
|
number (of bytes) when accessing memory via an opaque pointer or reference.
|
|||
|
This cap is ignored when directly accessing a variable or when the pointee
|
|||
|
type has an explicit “aligned” attribute.
|
|||
|
|
|||
|
The value should usually be determined by the properties of the system allocator.
|
|||
|
Some builtin types, especially vector types, have very high natural alignments;
|
|||
|
when working with values of those types, Clang usually wants to use instructions
|
|||
|
that take advantage of that alignment. However, many system allocators do
|
|||
|
not promise to return memory that is more than 8-byte or 16-byte-aligned. Use
|
|||
|
this option to limit the alignment that the compiler can assume for an arbitrary
|
|||
|
pointer, which may point onto the heap.
|
|||
|
|
|||
|
This option does not affect the ABI alignment of types; the layout of structs and
|
|||
|
unions and the value returned by the alignof operator remain the same.
|
|||
|
|
|||
|
This option can be overridden on a case-by-case basis by putting an explicit
|
|||
|
“aligned” alignment on a struct, union, or typedef. For example:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
#include <immintrin.h>
|
|||
|
// Make an aligned typedef of the AVX-512 16-int vector type.
|
|||
|
typedef __v16si __aligned_v16si __attribute__((aligned(64)));
|
|||
|
|
|||
|
void initialize_vector(__aligned_v16si *v) {
|
|||
|
// The compiler may assume that ‘v’ is 64-byte aligned, regardless of the
|
|||
|
// value of -fmax-type-align.
|
|||
|
}
|
|||
|
|
|||
|
.. option:: -faddrsig, -fno-addrsig
|
|||
|
|
|||
|
Controls whether Clang emits an address-significance table into the object
|
|||
|
file. Address-significance tables allow linkers to implement `safe ICF
|
|||
|
<https://research.google.com/pubs/archive/36912.pdf>`_ without the false
|
|||
|
positives that can result from other implementation techniques such as
|
|||
|
relocation scanning. Address-significance tables are enabled by default
|
|||
|
on ELF targets when using the integrated assembler. This flag currently
|
|||
|
only has an effect on ELF targets.
|
|||
|
|
|||
|
Profile Guided Optimization
|
|||
|
---------------------------
|
|||
|
|
|||
|
Profile information enables better optimization. For example, knowing that a
|
|||
|
branch is taken very frequently helps the compiler make better decisions when
|
|||
|
ordering basic blocks. Knowing that a function ``foo`` is called more
|
|||
|
frequently than another function ``bar`` helps the inliner. Optimization
|
|||
|
levels ``-O2`` and above are recommended for use of profile guided optimization.
|
|||
|
|
|||
|
Clang supports profile guided optimization with two different kinds of
|
|||
|
profiling. A sampling profiler can generate a profile with very low runtime
|
|||
|
overhead, or you can build an instrumented version of the code that collects
|
|||
|
more detailed profile information. Both kinds of profiles can provide execution
|
|||
|
counts for instructions in the code and information on branches taken and
|
|||
|
function invocation.
|
|||
|
|
|||
|
Regardless of which kind of profiling you use, be careful to collect profiles
|
|||
|
by running your code with inputs that are representative of the typical
|
|||
|
behavior. Code that is not exercised in the profile will be optimized as if it
|
|||
|
is unimportant, and the compiler may make poor optimization choices for code
|
|||
|
that is disproportionately used while profiling.
|
|||
|
|
|||
|
Differences Between Sampling and Instrumentation
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Although both techniques are used for similar purposes, there are important
|
|||
|
differences between the two:
|
|||
|
|
|||
|
1. Profile data generated with one cannot be used by the other, and there is no
|
|||
|
conversion tool that can convert one to the other. So, a profile generated
|
|||
|
via ``-fprofile-instr-generate`` must be used with ``-fprofile-instr-use``.
|
|||
|
Similarly, sampling profiles generated by external profilers must be
|
|||
|
converted and used with ``-fprofile-sample-use``.
|
|||
|
|
|||
|
2. Instrumentation profile data can be used for code coverage analysis and
|
|||
|
optimization.
|
|||
|
|
|||
|
3. Sampling profiles can only be used for optimization. They cannot be used for
|
|||
|
code coverage analysis. Although it would be technically possible to use
|
|||
|
sampling profiles for code coverage, sample-based profiles are too
|
|||
|
coarse-grained for code coverage purposes; it would yield poor results.
|
|||
|
|
|||
|
4. Sampling profiles must be generated by an external tool. The profile
|
|||
|
generated by that tool must then be converted into a format that can be read
|
|||
|
by LLVM. The section on sampling profilers describes one of the supported
|
|||
|
sampling profile formats.
|
|||
|
|
|||
|
|
|||
|
Using Sampling Profilers
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Sampling profilers are used to collect runtime information, such as
|
|||
|
hardware counters, while your application executes. They are typically
|
|||
|
very efficient and do not incur a large runtime overhead. The
|
|||
|
sample data collected by the profiler can be used during compilation
|
|||
|
to determine what the most executed areas of the code are.
|
|||
|
|
|||
|
Using the data from a sample profiler requires some changes in the way
|
|||
|
a program is built. Before the compiler can use profiling information,
|
|||
|
the code needs to execute under the profiler. The following is the
|
|||
|
usual build cycle when using sample profilers for optimization:
|
|||
|
|
|||
|
1. Build the code with source line table information. You can use all the
|
|||
|
usual build flags that you always build your application with. The only
|
|||
|
requirement is that you add ``-gline-tables-only`` or ``-g`` to the
|
|||
|
command line. This is important for the profiler to be able to map
|
|||
|
instructions back to source line locations.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -gline-tables-only code.cc -o code
|
|||
|
|
|||
|
2. Run the executable under a sampling profiler. The specific profiler
|
|||
|
you use does not really matter, as long as its output can be converted
|
|||
|
into the format that the LLVM optimizer understands. Currently, there
|
|||
|
exists a conversion tool for the Linux Perf profiler
|
|||
|
(https://perf.wiki.kernel.org/), so these examples assume that you
|
|||
|
are using Linux Perf to profile your code.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ perf record -b ./code
|
|||
|
|
|||
|
Note the use of the ``-b`` flag. This tells Perf to use the Last Branch
|
|||
|
Record (LBR) to record call chains. While this is not strictly required,
|
|||
|
it provides better call information, which improves the accuracy of
|
|||
|
the profile data.
|
|||
|
|
|||
|
3. Convert the collected profile data to LLVM's sample profile format.
|
|||
|
This is currently supported via the AutoFDO converter ``create_llvm_prof``.
|
|||
|
It is available at https://github.com/google/autofdo. Once built and
|
|||
|
installed, you can convert the ``perf.data`` file to LLVM using
|
|||
|
the command:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ create_llvm_prof --binary=./code --out=code.prof
|
|||
|
|
|||
|
This will read ``perf.data`` and the binary file ``./code`` and emit
|
|||
|
the profile data in ``code.prof``. Note that if you ran ``perf``
|
|||
|
without the ``-b`` flag, you need to use ``--use_lbr=false`` when
|
|||
|
calling ``create_llvm_prof``.
|
|||
|
|
|||
|
4. Build the code again using the collected profile. This step feeds
|
|||
|
the profile back to the optimizers. This should result in a binary
|
|||
|
that executes faster than the original one. Note that you are not
|
|||
|
required to build the code with the exact same arguments that you
|
|||
|
used in the first step. The only requirement is that you build the code
|
|||
|
with ``-gline-tables-only`` and ``-fprofile-sample-use``.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -gline-tables-only -fprofile-sample-use=code.prof code.cc -o code
|
|||
|
|
|||
|
|
|||
|
Sample Profile Formats
|
|||
|
""""""""""""""""""""""
|
|||
|
|
|||
|
Since external profilers generate profile data in a variety of custom formats,
|
|||
|
the data generated by the profiler must be converted into a format that can be
|
|||
|
read by the backend. LLVM supports three different sample profile formats:
|
|||
|
|
|||
|
1. ASCII text. This is the easiest one to generate. The file is divided into
|
|||
|
sections, which correspond to each of the functions with profile
|
|||
|
information. The format is described below. It can also be generated from
|
|||
|
the binary or gcov formats using the ``llvm-profdata`` tool.
|
|||
|
|
|||
|
2. Binary encoding. This uses a more efficient encoding that yields smaller
|
|||
|
profile files. This is the format generated by the ``create_llvm_prof`` tool
|
|||
|
in https://github.com/google/autofdo.
|
|||
|
|
|||
|
3. GCC encoding. This is based on the gcov format, which is accepted by GCC. It
|
|||
|
is only interesting in environments where GCC and Clang co-exist. This
|
|||
|
encoding is only generated by the ``create_gcov`` tool in
|
|||
|
https://github.com/google/autofdo. It can be read by LLVM and
|
|||
|
``llvm-profdata``, but it cannot be generated by either.
|
|||
|
|
|||
|
If you are using Linux Perf to generate sampling profiles, you can use the
|
|||
|
conversion tool ``create_llvm_prof`` described in the previous section.
|
|||
|
Otherwise, you will need to write a conversion tool that converts your
|
|||
|
profiler's native format into one of these three.
|
|||
|
|
|||
|
|
|||
|
Sample Profile Text Format
|
|||
|
""""""""""""""""""""""""""
|
|||
|
|
|||
|
This section describes the ASCII text format for sampling profiles. It is,
|
|||
|
arguably, the easiest one to generate. If you are interested in generating any
|
|||
|
of the other two, consult the ``ProfileData`` library in LLVM's source tree
|
|||
|
(specifically, ``include/llvm/ProfileData/SampleProfReader.h``).
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
function1:total_samples:total_head_samples
|
|||
|
offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
|
|||
|
offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
|
|||
|
...
|
|||
|
offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
|
|||
|
offsetA[.discriminator]: fnA:num_of_total_samples
|
|||
|
offsetA1[.discriminator]: number_of_samples [fn7:num fn8:num ... ]
|
|||
|
offsetA1[.discriminator]: number_of_samples [fn9:num fn10:num ... ]
|
|||
|
offsetB[.discriminator]: fnB:num_of_total_samples
|
|||
|
offsetB1[.discriminator]: number_of_samples [fn11:num fn12:num ... ]
|
|||
|
|
|||
|
This is a nested tree in which the indentation represents the nesting level
|
|||
|
of the inline stack. There are no blank lines in the file. And the spacing
|
|||
|
within a single line is fixed. Additional spaces will result in an error
|
|||
|
while reading the file.
|
|||
|
|
|||
|
Any line starting with the '#' character is completely ignored.
|
|||
|
|
|||
|
Inlined calls are represented with indentation. The Inline stack is a
|
|||
|
stack of source locations in which the top of the stack represents the
|
|||
|
leaf function, and the bottom of the stack represents the actual
|
|||
|
symbol to which the instruction belongs.
|
|||
|
|
|||
|
Function names must be mangled in order for the profile loader to
|
|||
|
match them in the current translation unit. The two numbers in the
|
|||
|
function header specify how many total samples were accumulated in the
|
|||
|
function (first number), and the total number of samples accumulated
|
|||
|
in the prologue of the function (second number). This head sample
|
|||
|
count provides an indicator of how frequently the function is invoked.
|
|||
|
|
|||
|
There are two types of lines in the function body.
|
|||
|
|
|||
|
- Sampled line represents the profile information of a source location.
|
|||
|
``offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]``
|
|||
|
|
|||
|
- Callsite line represents the profile information of an inlined callsite.
|
|||
|
``offsetA[.discriminator]: fnA:num_of_total_samples``
|
|||
|
|
|||
|
Each sampled line may contain several items. Some are optional (marked
|
|||
|
below):
|
|||
|
|
|||
|
a. Source line offset. This number represents the line number
|
|||
|
in the function where the sample was collected. The line number is
|
|||
|
always relative to the line where symbol of the function is
|
|||
|
defined. So, if the function has its header at line 280, the offset
|
|||
|
13 is at line 293 in the file.
|
|||
|
|
|||
|
Note that this offset should never be a negative number. This could
|
|||
|
happen in cases like macros. The debug machinery will register the
|
|||
|
line number at the point of macro expansion. So, if the macro was
|
|||
|
expanded in a line before the start of the function, the profile
|
|||
|
converter should emit a 0 as the offset (this means that the optimizers
|
|||
|
will not be able to associate a meaningful weight to the instructions
|
|||
|
in the macro).
|
|||
|
|
|||
|
b. [OPTIONAL] Discriminator. This is used if the sampled program
|
|||
|
was compiled with DWARF discriminator support
|
|||
|
(http://wiki.dwarfstd.org/index.php?title=Path_Discriminators).
|
|||
|
DWARF discriminators are unsigned integer values that allow the
|
|||
|
compiler to distinguish between multiple execution paths on the
|
|||
|
same source line location.
|
|||
|
|
|||
|
For example, consider the line of code ``if (cond) foo(); else bar();``.
|
|||
|
If the predicate ``cond`` is true 80% of the time, then the edge
|
|||
|
into function ``foo`` should be considered to be taken most of the
|
|||
|
time. But both calls to ``foo`` and ``bar`` are at the same source
|
|||
|
line, so a sample count at that line is not sufficient. The
|
|||
|
compiler needs to know which part of that line is taken more
|
|||
|
frequently.
|
|||
|
|
|||
|
This is what discriminators provide. In this case, the calls to
|
|||
|
``foo`` and ``bar`` will be at the same line, but will have
|
|||
|
different discriminator values. This allows the compiler to correctly
|
|||
|
set edge weights into ``foo`` and ``bar``.
|
|||
|
|
|||
|
c. Number of samples. This is an integer quantity representing the
|
|||
|
number of samples collected by the profiler at this source
|
|||
|
location.
|
|||
|
|
|||
|
d. [OPTIONAL] Potential call targets and samples. If present, this
|
|||
|
line contains a call instruction. This models both direct and
|
|||
|
number of samples. For example,
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
130: 7 foo:3 bar:2 baz:7
|
|||
|
|
|||
|
The above means that at relative line offset 130 there is a call
|
|||
|
instruction that calls one of ``foo()``, ``bar()`` and ``baz()``,
|
|||
|
with ``baz()`` being the relatively more frequently called target.
|
|||
|
|
|||
|
As an example, consider a program with the call chain ``main -> foo -> bar``.
|
|||
|
When built with optimizations enabled, the compiler may inline the
|
|||
|
calls to ``bar`` and ``foo`` inside ``main``. The generated profile
|
|||
|
could then be something like this:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
main:35504:0
|
|||
|
1: _Z3foov:35504
|
|||
|
2: _Z32bari:31977
|
|||
|
1.1: 31977
|
|||
|
2: 0
|
|||
|
|
|||
|
This profile indicates that there were a total of 35,504 samples
|
|||
|
collected in main. All of those were at line 1 (the call to ``foo``).
|
|||
|
Of those, 31,977 were spent inside the body of ``bar``. The last line
|
|||
|
of the profile (``2: 0``) corresponds to line 2 inside ``main``. No
|
|||
|
samples were collected there.
|
|||
|
|
|||
|
Profiling with Instrumentation
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Clang also supports profiling via instrumentation. This requires building a
|
|||
|
special instrumented version of the code and has some runtime
|
|||
|
overhead during the profiling, but it provides more detailed results than a
|
|||
|
sampling profiler. It also provides reproducible results, at least to the
|
|||
|
extent that the code behaves consistently across runs.
|
|||
|
|
|||
|
Here are the steps for using profile guided optimization with
|
|||
|
instrumentation:
|
|||
|
|
|||
|
1. Build an instrumented version of the code by compiling and linking with the
|
|||
|
``-fprofile-instr-generate`` option.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-instr-generate code.cc -o code
|
|||
|
|
|||
|
2. Run the instrumented executable with inputs that reflect the typical usage.
|
|||
|
By default, the profile data will be written to a ``default.profraw`` file
|
|||
|
in the current directory. You can override that default by using option
|
|||
|
``-fprofile-instr-generate=`` or by setting the ``LLVM_PROFILE_FILE``
|
|||
|
environment variable to specify an alternate file. If non-default file name
|
|||
|
is specified by both the environment variable and the command line option,
|
|||
|
the environment variable takes precedence. The file name pattern specified
|
|||
|
can include different modifiers: ``%p``, ``%h``, and ``%m``.
|
|||
|
|
|||
|
Any instance of ``%p`` in that file name will be replaced by the process
|
|||
|
ID, so that you can easily distinguish the profile output from multiple
|
|||
|
runs.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ LLVM_PROFILE_FILE="code-%p.profraw" ./code
|
|||
|
|
|||
|
The modifier ``%h`` can be used in scenarios where the same instrumented
|
|||
|
binary is run in multiple different host machines dumping profile data
|
|||
|
to a shared network based storage. The ``%h`` specifier will be substituted
|
|||
|
with the hostname so that profiles collected from different hosts do not
|
|||
|
clobber each other.
|
|||
|
|
|||
|
While the use of ``%p`` specifier can reduce the likelihood for the profiles
|
|||
|
dumped from different processes to clobber each other, such clobbering can still
|
|||
|
happen because of the ``pid`` re-use by the OS. Another side-effect of using
|
|||
|
``%p`` is that the storage requirement for raw profile data files is greatly
|
|||
|
increased. To avoid issues like this, the ``%m`` specifier can used in the profile
|
|||
|
name. When this specifier is used, the profiler runtime will substitute ``%m``
|
|||
|
with a unique integer identifier associated with the instrumented binary. Additionally,
|
|||
|
multiple raw profiles dumped from different processes that share a file system (can be
|
|||
|
on different hosts) will be automatically merged by the profiler runtime during the
|
|||
|
dumping. If the program links in multiple instrumented shared libraries, each library
|
|||
|
will dump the profile data into its own profile data file (with its unique integer
|
|||
|
id embedded in the profile name). Note that the merging enabled by ``%m`` is for raw
|
|||
|
profile data generated by profiler runtime. The resulting merged "raw" profile data
|
|||
|
file still needs to be converted to a different format expected by the compiler (
|
|||
|
see step 3 below).
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ LLVM_PROFILE_FILE="code-%m.profraw" ./code
|
|||
|
|
|||
|
|
|||
|
3. Combine profiles from multiple runs and convert the "raw" profile format to
|
|||
|
the input expected by clang. Use the ``merge`` command of the
|
|||
|
``llvm-profdata`` tool to do this.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ llvm-profdata merge -output=code.profdata code-*.profraw
|
|||
|
|
|||
|
Note that this step is necessary even when there is only one "raw" profile,
|
|||
|
since the merge operation also changes the file format.
|
|||
|
|
|||
|
4. Build the code again using the ``-fprofile-instr-use`` option to specify the
|
|||
|
collected profile data.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-instr-use=code.profdata code.cc -o code
|
|||
|
|
|||
|
You can repeat step 4 as often as you like without regenerating the
|
|||
|
profile. As you make changes to your code, clang may no longer be able to
|
|||
|
use the profile data. It will warn you when this happens.
|
|||
|
|
|||
|
Profile generation using an alternative instrumentation method can be
|
|||
|
controlled by the GCC-compatible flags ``-fprofile-generate`` and
|
|||
|
``-fprofile-use``. Although these flags are semantically equivalent to
|
|||
|
their GCC counterparts, they *do not* handle GCC-compatible profiles.
|
|||
|
They are only meant to implement GCC's semantics with respect to
|
|||
|
profile creation and use. Flag ``-fcs-profile-generate`` also instruments
|
|||
|
programs using the same instrumentation method as ``-fprofile-generate``.
|
|||
|
|
|||
|
.. option:: -fprofile-generate[=<dirname>]
|
|||
|
|
|||
|
The ``-fprofile-generate`` and ``-fprofile-generate=`` flags will use
|
|||
|
an alternative instrumentation method for profile generation. When
|
|||
|
given a directory name, it generates the profile file
|
|||
|
``default_%m.profraw`` in the directory named ``dirname`` if specified.
|
|||
|
If ``dirname`` does not exist, it will be created at runtime. ``%m`` specifier
|
|||
|
will be substituted with a unique id documented in step 2 above. In other words,
|
|||
|
with ``-fprofile-generate[=<dirname>]`` option, the "raw" profile data automatic
|
|||
|
merging is turned on by default, so there will no longer any risk of profile
|
|||
|
clobbering from different running processes. For example,
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-generate=yyy/zzz code.cc -o code
|
|||
|
|
|||
|
When ``code`` is executed, the profile will be written to the file
|
|||
|
``yyy/zzz/default_xxxx.profraw``.
|
|||
|
|
|||
|
To generate the profile data file with the compiler readable format, the
|
|||
|
``llvm-profdata`` tool can be used with the profile directory as the input:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ llvm-profdata merge -output=code.profdata yyy/zzz/
|
|||
|
|
|||
|
If the user wants to turn off the auto-merging feature, or simply override the
|
|||
|
the profile dumping path specified at command line, the environment variable
|
|||
|
``LLVM_PROFILE_FILE`` can still be used to override
|
|||
|
the directory and filename for the profile file at runtime.
|
|||
|
|
|||
|
.. option:: -fcs-profile-generate[=<dirname>]
|
|||
|
|
|||
|
The ``-fcs-profile-generate`` and ``-fcs-profile-generate=`` flags will use
|
|||
|
the same instrumentation method, and generate the same profile as in the
|
|||
|
``-fprofile-generate`` and ``-fprofile-generate=`` flags. The difference is
|
|||
|
that the instrumentation is performed after inlining so that the resulted
|
|||
|
profile has a better context sensitive information. They cannot be used
|
|||
|
together with ``-fprofile-generate`` and ``-fprofile-generate=`` flags.
|
|||
|
They are typically used in conjunction with ``-fprofile-use`` flag.
|
|||
|
The profile generated by ``-fcs-profile-generate`` and ``-fprofile-generate``
|
|||
|
can be merged by llvm-profdata. A use example:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-generate=yyy/zzz code.cc -o code
|
|||
|
$ ./code
|
|||
|
$ llvm-profdata merge -output=code.profdata yyy/zzz/
|
|||
|
|
|||
|
The first few steps are the same as that in ``-fprofile-generate``
|
|||
|
compilation. Then perform a second round of instrumentation.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-use=code.profdata -fcs-profile-generate=sss/ttt \
|
|||
|
-o cs_code
|
|||
|
$ ./cs_code
|
|||
|
$ llvm-profdata merge -output=cs_code.profdata sss/ttt code.profdata
|
|||
|
|
|||
|
The resulted ``cs_code.prodata`` combines ``code.profdata`` and the profile
|
|||
|
generated from binary ``cs_code``. Profile ``cs_code.profata`` can be used by
|
|||
|
``-fprofile-use`` compilaton.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang++ -O2 -fprofile-use=cs_code.profdata
|
|||
|
|
|||
|
The above command will read both profiles to the compiler at the identical
|
|||
|
point of instrumenations.
|
|||
|
|
|||
|
.. option:: -fprofile-use[=<pathname>]
|
|||
|
|
|||
|
Without any other arguments, ``-fprofile-use`` behaves identically to
|
|||
|
``-fprofile-instr-use``. Otherwise, if ``pathname`` is the full path to a
|
|||
|
profile file, it reads from that file. If ``pathname`` is a directory name,
|
|||
|
it reads from ``pathname/default.profdata``.
|
|||
|
|
|||
|
Disabling Instrumentation
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
In certain situations, it may be useful to disable profile generation or use
|
|||
|
for specific files in a build, without affecting the main compilation flags
|
|||
|
used for the other files in the project.
|
|||
|
|
|||
|
In these cases, you can use the flag ``-fno-profile-instr-generate`` (or
|
|||
|
``-fno-profile-generate``) to disable profile generation, and
|
|||
|
``-fno-profile-instr-use`` (or ``-fno-profile-use``) to disable profile use.
|
|||
|
|
|||
|
Note that these flags should appear after the corresponding profile
|
|||
|
flags to have an effect.
|
|||
|
|
|||
|
Profile remapping
|
|||
|
^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
When the program is compiled after a change that affects many symbol names,
|
|||
|
pre-existing profile data may no longer match the program. For example:
|
|||
|
|
|||
|
* switching from libstdc++ to libc++ will result in the mangled names of all
|
|||
|
functions taking standard library types to change
|
|||
|
* renaming a widely-used type in C++ will result in the mangled names of all
|
|||
|
functions that have parameters involving that type to change
|
|||
|
* moving from a 32-bit compilation to a 64-bit compilation may change the
|
|||
|
underlying type of ``size_t`` and similar types, resulting in changes to
|
|||
|
manglings
|
|||
|
|
|||
|
Clang allows use of a profile remapping file to specify that such differences
|
|||
|
in mangled names should be ignored when matching the profile data against the
|
|||
|
program.
|
|||
|
|
|||
|
.. option:: -fprofile-remapping-file=<file>
|
|||
|
|
|||
|
Specifies a file containing profile remapping information, that will be
|
|||
|
used to match mangled names in the profile data to mangled names in the
|
|||
|
program.
|
|||
|
|
|||
|
The profile remapping file is a text file containing lines of the form
|
|||
|
|
|||
|
.. code-block:: text
|
|||
|
|
|||
|
fragmentkind fragment1 fragment2
|
|||
|
|
|||
|
where ``fragmentkind`` is one of ``name``, ``type``, or ``encoding``,
|
|||
|
indicating whether the following mangled name fragments are
|
|||
|
<`name <https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.name>`_>s,
|
|||
|
<`type <https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.type>`_>s, or
|
|||
|
<`encoding <https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangle.encoding>`_>s,
|
|||
|
respectively.
|
|||
|
Blank lines and lines starting with ``#`` are ignored.
|
|||
|
|
|||
|
For convenience, built-in <substitution>s such as ``St`` and ``Ss``
|
|||
|
are accepted as <name>s (even though they technically are not <name>s).
|
|||
|
|
|||
|
For example, to specify that ``absl::string_view`` and ``std::string_view``
|
|||
|
should be treated as equivalent when matching profile data, the following
|
|||
|
remapping file could be used:
|
|||
|
|
|||
|
.. code-block:: text
|
|||
|
|
|||
|
# absl::string_view is considered equivalent to std::string_view
|
|||
|
type N4absl11string_viewE St17basic_string_viewIcSt11char_traitsIcEE
|
|||
|
|
|||
|
# std:: might be std::__1:: in libc++ or std::__cxx11:: in libstdc++
|
|||
|
name 3std St3__1
|
|||
|
name 3std St7__cxx11
|
|||
|
|
|||
|
Matching profile data using a profile remapping file is supported on a
|
|||
|
best-effort basis. For example, information regarding indirect call targets is
|
|||
|
currently not remapped. For best results, you are encouraged to generate new
|
|||
|
profile data matching the updated program, or to remap the profile data
|
|||
|
using the ``llvm-cxxmap`` and ``llvm-profdata merge`` tools.
|
|||
|
|
|||
|
.. note::
|
|||
|
|
|||
|
Profile data remapping support is currently only implemented for LLVM's
|
|||
|
new pass manager, which can be enabled with
|
|||
|
``-fexperimental-new-pass-manager``.
|
|||
|
|
|||
|
.. note::
|
|||
|
|
|||
|
Profile data remapping is currently only supported for C++ mangled names
|
|||
|
following the Itanium C++ ABI mangling scheme. This covers all C++ targets
|
|||
|
supported by Clang other than Windows.
|
|||
|
|
|||
|
GCOV-based Profiling
|
|||
|
--------------------
|
|||
|
|
|||
|
GCOV is a test coverage program, it helps to know how often a line of code
|
|||
|
is executed. When instrumenting the code with ``--coverage`` option, some
|
|||
|
counters are added for each edge linking basic blocks.
|
|||
|
|
|||
|
At compile time, gcno files are generated containing information about
|
|||
|
blocks and edges between them. At runtime the counters are incremented and at
|
|||
|
exit the counters are dumped in gcda files.
|
|||
|
|
|||
|
The tool ``llvm-cov gcov`` will parse gcno, gcda and source files to generate
|
|||
|
a report ``.c.gcov``.
|
|||
|
|
|||
|
.. option:: -fprofile-filter-files=[regexes]
|
|||
|
|
|||
|
Define a list of regexes separated by a semi-colon.
|
|||
|
If a file name matches any of the regexes then the file is instrumented.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang --coverage -fprofile-filter-files=".*\.c$" foo.c
|
|||
|
|
|||
|
For example, this will only instrument files finishing with ``.c``, skipping ``.h`` files.
|
|||
|
|
|||
|
.. option:: -fprofile-exclude-files=[regexes]
|
|||
|
|
|||
|
Define a list of regexes separated by a semi-colon.
|
|||
|
If a file name doesn't match all the regexes then the file is instrumented.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang --coverage -fprofile-exclude-files="^/usr/include/.*$" foo.c
|
|||
|
|
|||
|
For example, this will instrument all the files except the ones in ``/usr/include``.
|
|||
|
|
|||
|
If both options are used then a file is instrumented if its name matches any
|
|||
|
of the regexes from ``-fprofile-filter-list`` and doesn't match all the regexes
|
|||
|
from ``-fprofile-exclude-list``.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang --coverage -fprofile-exclude-files="^/usr/include/.*$" \
|
|||
|
-fprofile-filter-files="^/usr/.*$"
|
|||
|
|
|||
|
In that case ``/usr/foo/oof.h`` is instrumented since it matches the filter regex and
|
|||
|
doesn't match the exclude regex, but ``/usr/include/foo.h`` doesn't since it matches
|
|||
|
the exclude regex.
|
|||
|
|
|||
|
Controlling Debug Information
|
|||
|
-----------------------------
|
|||
|
|
|||
|
Controlling Size of Debug Information
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Debug info kind generated by Clang can be set by one of the flags listed
|
|||
|
below. If multiple flags are present, the last one is used.
|
|||
|
|
|||
|
.. option:: -g0
|
|||
|
|
|||
|
Don't generate any debug info (default).
|
|||
|
|
|||
|
.. option:: -gline-tables-only
|
|||
|
|
|||
|
Generate line number tables only.
|
|||
|
|
|||
|
This kind of debug info allows to obtain stack traces with function names,
|
|||
|
file names and line numbers (by such tools as ``gdb`` or ``addr2line``). It
|
|||
|
doesn't contain any other data (e.g. description of local variables or
|
|||
|
function parameters).
|
|||
|
|
|||
|
.. option:: -fstandalone-debug
|
|||
|
|
|||
|
Clang supports a number of optimizations to reduce the size of debug
|
|||
|
information in the binary. They work based on the assumption that
|
|||
|
the debug type information can be spread out over multiple
|
|||
|
compilation units. For instance, Clang will not emit type
|
|||
|
definitions for types that are not needed by a module and could be
|
|||
|
replaced with a forward declaration. Further, Clang will only emit
|
|||
|
type info for a dynamic C++ class in the module that contains the
|
|||
|
vtable for the class.
|
|||
|
|
|||
|
The **-fstandalone-debug** option turns off these optimizations.
|
|||
|
This is useful when working with 3rd-party libraries that don't come
|
|||
|
with debug information. Note that Clang will never emit type
|
|||
|
information for types that are not referenced at all by the program.
|
|||
|
|
|||
|
.. option:: -fno-standalone-debug
|
|||
|
|
|||
|
On Darwin **-fstandalone-debug** is enabled by default. The
|
|||
|
**-fno-standalone-debug** option can be used to get to turn on the
|
|||
|
vtable-based optimization described above.
|
|||
|
|
|||
|
.. option:: -g
|
|||
|
|
|||
|
Generate complete debug info.
|
|||
|
|
|||
|
Controlling Macro Debug Info Generation
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Debug info for C preprocessor macros increases the size of debug information in
|
|||
|
the binary. Macro debug info generated by Clang can be controlled by the flags
|
|||
|
listed below.
|
|||
|
|
|||
|
.. option:: -fdebug-macro
|
|||
|
|
|||
|
Generate debug info for preprocessor macros. This flag is discarded when
|
|||
|
**-g0** is enabled.
|
|||
|
|
|||
|
.. option:: -fno-debug-macro
|
|||
|
|
|||
|
Do not generate debug info for preprocessor macros (default).
|
|||
|
|
|||
|
Controlling Debugger "Tuning"
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
While Clang generally emits standard DWARF debug info (http://dwarfstd.org),
|
|||
|
different debuggers may know how to take advantage of different specific DWARF
|
|||
|
features. You can "tune" the debug info for one of several different debuggers.
|
|||
|
|
|||
|
.. option:: -ggdb, -glldb, -gsce
|
|||
|
|
|||
|
Tune the debug info for the ``gdb``, ``lldb``, or Sony PlayStation\ |reg|
|
|||
|
debugger, respectively. Each of these options implies **-g**. (Therefore, if
|
|||
|
you want both **-gline-tables-only** and debugger tuning, the tuning option
|
|||
|
must come first.)
|
|||
|
|
|||
|
|
|||
|
Controlling LLVM IR Output
|
|||
|
--------------------------
|
|||
|
|
|||
|
Controlling Value Names in LLVM IR
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Emitting value names in LLVM IR increases the size and verbosity of the IR.
|
|||
|
By default, value names are only emitted in assertion-enabled builds of Clang.
|
|||
|
However, when reading IR it can be useful to re-enable the emission of value
|
|||
|
names to improve readability.
|
|||
|
|
|||
|
.. option:: -fdiscard-value-names
|
|||
|
|
|||
|
Discard value names when generating LLVM IR.
|
|||
|
|
|||
|
.. option:: -fno-discard-value-names
|
|||
|
|
|||
|
Do not discard value names when generating LLVM IR. This option can be used
|
|||
|
to re-enable names for release builds of Clang.
|
|||
|
|
|||
|
|
|||
|
Comment Parsing Options
|
|||
|
-----------------------
|
|||
|
|
|||
|
Clang parses Doxygen and non-Doxygen style documentation comments and attaches
|
|||
|
them to the appropriate declaration nodes. By default, it only parses
|
|||
|
Doxygen-style comments and ignores ordinary comments starting with ``//`` and
|
|||
|
``/*``.
|
|||
|
|
|||
|
.. option:: -Wdocumentation
|
|||
|
|
|||
|
Emit warnings about use of documentation comments. This warning group is off
|
|||
|
by default.
|
|||
|
|
|||
|
This includes checking that ``\param`` commands name parameters that actually
|
|||
|
present in the function signature, checking that ``\returns`` is used only on
|
|||
|
functions that actually return a value etc.
|
|||
|
|
|||
|
.. option:: -Wno-documentation-unknown-command
|
|||
|
|
|||
|
Don't warn when encountering an unknown Doxygen command.
|
|||
|
|
|||
|
.. option:: -fparse-all-comments
|
|||
|
|
|||
|
Parse all comments as documentation comments (including ordinary comments
|
|||
|
starting with ``//`` and ``/*``).
|
|||
|
|
|||
|
.. option:: -fcomment-block-commands=[commands]
|
|||
|
|
|||
|
Define custom documentation commands as block commands. This allows Clang to
|
|||
|
construct the correct AST for these custom commands, and silences warnings
|
|||
|
about unknown commands. Several commands must be separated by a comma
|
|||
|
*without trailing space*; e.g. ``-fcomment-block-commands=foo,bar`` defines
|
|||
|
custom commands ``\foo`` and ``\bar``.
|
|||
|
|
|||
|
It is also possible to use ``-fcomment-block-commands`` several times; e.g.
|
|||
|
``-fcomment-block-commands=foo -fcomment-block-commands=bar`` does the same
|
|||
|
as above.
|
|||
|
|
|||
|
.. _c:
|
|||
|
|
|||
|
C Language Features
|
|||
|
===================
|
|||
|
|
|||
|
The support for standard C in clang is feature-complete except for the
|
|||
|
C99 floating-point pragmas.
|
|||
|
|
|||
|
Extensions supported by clang
|
|||
|
-----------------------------
|
|||
|
|
|||
|
See :doc:`LanguageExtensions`.
|
|||
|
|
|||
|
Differences between various standard modes
|
|||
|
------------------------------------------
|
|||
|
|
|||
|
clang supports the -std option, which changes what language mode clang
|
|||
|
uses. The supported modes for C are c89, gnu89, c99, gnu99, c11, gnu11,
|
|||
|
c17, gnu17, and various aliases for those modes. If no -std option is
|
|||
|
specified, clang defaults to gnu11 mode. Many C99 and C11 features are
|
|||
|
supported in earlier modes as a conforming extension, with a warning. Use
|
|||
|
``-pedantic-errors`` to request an error if a feature from a later standard
|
|||
|
revision is used in an earlier mode.
|
|||
|
|
|||
|
Differences between all ``c*`` and ``gnu*`` modes:
|
|||
|
|
|||
|
- ``c*`` modes define "``__STRICT_ANSI__``".
|
|||
|
- Target-specific defines not prefixed by underscores, like "linux",
|
|||
|
are defined in ``gnu*`` modes.
|
|||
|
- Trigraphs default to being off in ``gnu*`` modes; they can be enabled by
|
|||
|
the -trigraphs option.
|
|||
|
- The parser recognizes "asm" and "typeof" as keywords in ``gnu*`` modes;
|
|||
|
the variants "``__asm__``" and "``__typeof__``" are recognized in all
|
|||
|
modes.
|
|||
|
- The Apple "blocks" extension is recognized by default in ``gnu*`` modes
|
|||
|
on some platforms; it can be enabled in any mode with the "-fblocks"
|
|||
|
option.
|
|||
|
- Arrays that are VLA's according to the standard, but which can be
|
|||
|
constant folded by the frontend are treated as fixed size arrays.
|
|||
|
This occurs for things like "int X[(1, 2)];", which is technically a
|
|||
|
VLA. ``c*`` modes are strictly compliant and treat these as VLAs.
|
|||
|
|
|||
|
Differences between ``*89`` and ``*99`` modes:
|
|||
|
|
|||
|
- The ``*99`` modes default to implementing "inline" as specified in C99,
|
|||
|
while the ``*89`` modes implement the GNU version. This can be
|
|||
|
overridden for individual functions with the ``__gnu_inline__``
|
|||
|
attribute.
|
|||
|
- Digraphs are not recognized in c89 mode.
|
|||
|
- The scope of names defined inside a "for", "if", "switch", "while",
|
|||
|
or "do" statement is different. (example: "``if ((struct x {int
|
|||
|
x;}*)0) {}``".)
|
|||
|
- ``__STDC_VERSION__`` is not defined in ``*89`` modes.
|
|||
|
- "inline" is not recognized as a keyword in c89 mode.
|
|||
|
- "restrict" is not recognized as a keyword in ``*89`` modes.
|
|||
|
- Commas are allowed in integer constant expressions in ``*99`` modes.
|
|||
|
- Arrays which are not lvalues are not implicitly promoted to pointers
|
|||
|
in ``*89`` modes.
|
|||
|
- Some warnings are different.
|
|||
|
|
|||
|
Differences between ``*99`` and ``*11`` modes:
|
|||
|
|
|||
|
- Warnings for use of C11 features are disabled.
|
|||
|
- ``__STDC_VERSION__`` is defined to ``201112L`` rather than ``199901L``.
|
|||
|
|
|||
|
Differences between ``*11`` and ``*17`` modes:
|
|||
|
|
|||
|
- ``__STDC_VERSION__`` is defined to ``201710L`` rather than ``201112L``.
|
|||
|
|
|||
|
GCC extensions not implemented yet
|
|||
|
----------------------------------
|
|||
|
|
|||
|
clang tries to be compatible with gcc as much as possible, but some gcc
|
|||
|
extensions are not implemented yet:
|
|||
|
|
|||
|
- clang does not support decimal floating point types (``_Decimal32`` and
|
|||
|
friends) or fixed-point types (``_Fract`` and friends); nobody has
|
|||
|
expressed interest in these features yet, so it's hard to say when
|
|||
|
they will be implemented.
|
|||
|
- clang does not support nested functions; this is a complex feature
|
|||
|
which is infrequently used, so it is unlikely to be implemented
|
|||
|
anytime soon. In C++11 it can be emulated by assigning lambda
|
|||
|
functions to local variables, e.g:
|
|||
|
|
|||
|
.. code-block:: cpp
|
|||
|
|
|||
|
auto const local_function = [&](int parameter) {
|
|||
|
// Do something
|
|||
|
};
|
|||
|
...
|
|||
|
local_function(1);
|
|||
|
|
|||
|
- clang only supports global register variables when the register specified
|
|||
|
is non-allocatable (e.g. the stack pointer). Support for general global
|
|||
|
register variables is unlikely to be implemented soon because it requires
|
|||
|
additional LLVM backend support.
|
|||
|
- clang does not support static initialization of flexible array
|
|||
|
members. This appears to be a rarely used extension, but could be
|
|||
|
implemented pending user demand.
|
|||
|
- clang does not support
|
|||
|
``__builtin_va_arg_pack``/``__builtin_va_arg_pack_len``. This is
|
|||
|
used rarely, but in some potentially interesting places, like the
|
|||
|
glibc headers, so it may be implemented pending user demand. Note
|
|||
|
that because clang pretends to be like GCC 4.2, and this extension
|
|||
|
was introduced in 4.3, the glibc headers will not try to use this
|
|||
|
extension with clang at the moment.
|
|||
|
- clang does not support the gcc extension for forward-declaring
|
|||
|
function parameters; this has not shown up in any real-world code
|
|||
|
yet, though, so it might never be implemented.
|
|||
|
|
|||
|
This is not a complete list; if you find an unsupported extension
|
|||
|
missing from this list, please send an e-mail to cfe-dev. This list
|
|||
|
currently excludes C++; see :ref:`C++ Language Features <cxx>`. Also, this
|
|||
|
list does not include bugs in mostly-implemented features; please see
|
|||
|
the `bug
|
|||
|
tracker <https://bugs.llvm.org/buglist.cgi?quicksearch=product%3Aclang+component%3A-New%2BBugs%2CAST%2CBasic%2CDriver%2CHeaders%2CLLVM%2BCodeGen%2Cparser%2Cpreprocessor%2CSemantic%2BAnalyzer>`_
|
|||
|
for known existing bugs (FIXME: Is there a section for bug-reporting
|
|||
|
guidelines somewhere?).
|
|||
|
|
|||
|
Intentionally unsupported GCC extensions
|
|||
|
----------------------------------------
|
|||
|
|
|||
|
- clang does not support the gcc extension that allows variable-length
|
|||
|
arrays in structures. This is for a few reasons: one, it is tricky to
|
|||
|
implement, two, the extension is completely undocumented, and three,
|
|||
|
the extension appears to be rarely used. Note that clang *does*
|
|||
|
support flexible array members (arrays with a zero or unspecified
|
|||
|
size at the end of a structure).
|
|||
|
- clang does not have an equivalent to gcc's "fold"; this means that
|
|||
|
clang doesn't accept some constructs gcc might accept in contexts
|
|||
|
where a constant expression is required, like "x-x" where x is a
|
|||
|
variable.
|
|||
|
- clang does not support ``__builtin_apply`` and friends; this extension
|
|||
|
is extremely obscure and difficult to implement reliably.
|
|||
|
|
|||
|
.. _c_ms:
|
|||
|
|
|||
|
Microsoft extensions
|
|||
|
--------------------
|
|||
|
|
|||
|
clang has support for many extensions from Microsoft Visual C++. To enable these
|
|||
|
extensions, use the ``-fms-extensions`` command-line option. This is the default
|
|||
|
for Windows targets. Clang does not implement every pragma or declspec provided
|
|||
|
by MSVC, but the popular ones, such as ``__declspec(dllexport)`` and ``#pragma
|
|||
|
comment(lib)`` are well supported.
|
|||
|
|
|||
|
clang has a ``-fms-compatibility`` flag that makes clang accept enough
|
|||
|
invalid C++ to be able to parse most Microsoft headers. For example, it
|
|||
|
allows `unqualified lookup of dependent base class members
|
|||
|
<https://clang.llvm.org/compatibility.html#dep_lookup_bases>`_, which is
|
|||
|
a common compatibility issue with clang. This flag is enabled by default
|
|||
|
for Windows targets.
|
|||
|
|
|||
|
``-fdelayed-template-parsing`` lets clang delay parsing of function template
|
|||
|
definitions until the end of a translation unit. This flag is enabled by
|
|||
|
default for Windows targets.
|
|||
|
|
|||
|
For compatibility with existing code that compiles with MSVC, clang defines the
|
|||
|
``_MSC_VER`` and ``_MSC_FULL_VER`` macros. These default to the values of 1800
|
|||
|
and 180000000 respectively, making clang look like an early release of Visual
|
|||
|
C++ 2013. The ``-fms-compatibility-version=`` flag overrides these values. It
|
|||
|
accepts a dotted version tuple, such as 19.00.23506. Changing the MSVC
|
|||
|
compatibility version makes clang behave more like that version of MSVC. For
|
|||
|
example, ``-fms-compatibility-version=19`` will enable C++14 features and define
|
|||
|
``char16_t`` and ``char32_t`` as builtin types.
|
|||
|
|
|||
|
.. _cxx:
|
|||
|
|
|||
|
C++ Language Features
|
|||
|
=====================
|
|||
|
|
|||
|
clang fully implements all of standard C++98 except for exported
|
|||
|
templates (which were removed in C++11), and all of standard C++11
|
|||
|
and the current draft standard for C++1y.
|
|||
|
|
|||
|
Controlling implementation limits
|
|||
|
---------------------------------
|
|||
|
|
|||
|
.. option:: -fbracket-depth=N
|
|||
|
|
|||
|
Sets the limit for nested parentheses, brackets, and braces to N. The
|
|||
|
default is 256.
|
|||
|
|
|||
|
.. option:: -fconstexpr-depth=N
|
|||
|
|
|||
|
Sets the limit for recursive constexpr function invocations to N. The
|
|||
|
default is 512.
|
|||
|
|
|||
|
.. option:: -fconstexpr-steps=N
|
|||
|
|
|||
|
Sets the limit for the number of full-expressions evaluated in a single
|
|||
|
constant expression evaluation. The default is 1048576.
|
|||
|
|
|||
|
.. option:: -ftemplate-depth=N
|
|||
|
|
|||
|
Sets the limit for recursively nested template instantiations to N. The
|
|||
|
default is 1024.
|
|||
|
|
|||
|
.. option:: -foperator-arrow-depth=N
|
|||
|
|
|||
|
Sets the limit for iterative calls to 'operator->' functions to N. The
|
|||
|
default is 256.
|
|||
|
|
|||
|
.. _objc:
|
|||
|
|
|||
|
Objective-C Language Features
|
|||
|
=============================
|
|||
|
|
|||
|
.. _objcxx:
|
|||
|
|
|||
|
Objective-C++ Language Features
|
|||
|
===============================
|
|||
|
|
|||
|
.. _openmp:
|
|||
|
|
|||
|
OpenMP Features
|
|||
|
===============
|
|||
|
|
|||
|
Clang supports all OpenMP 4.5 directives and clauses. See :doc:`OpenMPSupport`
|
|||
|
for additional details.
|
|||
|
|
|||
|
Use `-fopenmp` to enable OpenMP. Support for OpenMP can be disabled with
|
|||
|
`-fno-openmp`.
|
|||
|
|
|||
|
Use `-fopenmp-simd` to enable OpenMP simd features only, without linking
|
|||
|
the runtime library; for combined constructs
|
|||
|
(e.g. ``#pragma omp parallel for simd``) the non-simd directives and clauses
|
|||
|
will be ignored. This can be disabled with `-fno-openmp-simd`.
|
|||
|
|
|||
|
Controlling implementation limits
|
|||
|
---------------------------------
|
|||
|
|
|||
|
.. option:: -fopenmp-use-tls
|
|||
|
|
|||
|
Controls code generation for OpenMP threadprivate variables. In presence of
|
|||
|
this option all threadprivate variables are generated the same way as thread
|
|||
|
local variables, using TLS support. If `-fno-openmp-use-tls`
|
|||
|
is provided or target does not support TLS, code generation for threadprivate
|
|||
|
variables relies on OpenMP runtime library.
|
|||
|
|
|||
|
.. _opencl:
|
|||
|
|
|||
|
OpenCL Features
|
|||
|
===============
|
|||
|
|
|||
|
Clang can be used to compile OpenCL kernels for execution on a device
|
|||
|
(e.g. GPU). It is possible to compile the kernel into a binary (e.g. for AMD or
|
|||
|
Nvidia targets) that can be uploaded to run directly on a device (e.g. using
|
|||
|
`clCreateProgramWithBinary
|
|||
|
<https://www.khronos.org/registry/OpenCL/specs/opencl-1.1.pdf#111>`_) or
|
|||
|
into generic bitcode files loadable into other toolchains.
|
|||
|
|
|||
|
Compiling to a binary using the default target from the installation can be done
|
|||
|
as follows:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ echo "kernel void k(){}" > test.cl
|
|||
|
$ clang test.cl
|
|||
|
|
|||
|
Compiling for a specific target can be done by specifying the triple corresponding
|
|||
|
to the target, for example:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -target nvptx64-unknown-unknown test.cl
|
|||
|
$ clang -target amdgcn-amd-amdhsa -mcpu=gfx900 test.cl
|
|||
|
|
|||
|
Compiling to bitcode can be done as follows:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -c -emit-llvm test.cl
|
|||
|
|
|||
|
This will produce a generic test.bc file that can be used in vendor toolchains
|
|||
|
to perform machine code generation.
|
|||
|
|
|||
|
Clang currently supports OpenCL C language standards up to v2.0. Starting from
|
|||
|
clang 9 a C++ mode is available for OpenCL (see :ref:`C++ for OpenCL <opencl_cpp>`).
|
|||
|
|
|||
|
OpenCL Specific Options
|
|||
|
-----------------------
|
|||
|
|
|||
|
Most of the OpenCL build options from `the specification v2.0 section 5.8.4
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0.pdf#200>`_ are available.
|
|||
|
|
|||
|
Examples:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -cl-std=CL2.0 -cl-single-precision-constant test.cl
|
|||
|
|
|||
|
Some extra options are available to support special OpenCL features.
|
|||
|
|
|||
|
.. option:: -finclude-default-header
|
|||
|
|
|||
|
Loads standard includes during compilations. By default OpenCL headers are not
|
|||
|
loaded and therefore standard library includes are not available. To load them
|
|||
|
automatically a flag has been added to the frontend (see also :ref:`the section
|
|||
|
on the OpenCL Header <opencl_header>`):
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -Xclang -finclude-default-header test.cl
|
|||
|
|
|||
|
Alternatively ``-include`` or ``-I`` followed by the path to the header location
|
|||
|
can be given manually.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -I<path to clang>/lib/Headers/opencl-c.h test.cl
|
|||
|
|
|||
|
In this case the kernel code should contain ``#include <opencl-c.h>`` just as a
|
|||
|
regular C include.
|
|||
|
|
|||
|
.. _opencl_cl_ext:
|
|||
|
|
|||
|
.. option:: -cl-ext
|
|||
|
|
|||
|
Disables support of OpenCL extensions. All OpenCL targets provide a list
|
|||
|
of extensions that they support. Clang allows to amend this using the ``-cl-ext``
|
|||
|
flag with a comma-separated list of extensions prefixed with ``'+'`` or ``'-'``.
|
|||
|
The syntax: ``-cl-ext=<(['-'|'+']<extension>[,])+>``, where extensions
|
|||
|
can be either one of `the OpenCL specification extensions
|
|||
|
<https://www.khronos.org/registry/cl/sdk/2.0/docs/man/xhtml/EXTENSION.html>`_
|
|||
|
or any known vendor extension. Alternatively, ``'all'`` can be used to enable
|
|||
|
or disable all known extensions.
|
|||
|
Example disabling double support for the 64-bit SPIR target:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -cc1 -triple spir64-unknown-unknown -cl-ext=-cl_khr_fp64 test.cl
|
|||
|
|
|||
|
Enabling all extensions except double support in R600 AMD GPU can be done using:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -cc1 -triple r600-unknown-unknown -cl-ext=-all,+cl_khr_fp16 test.cl
|
|||
|
|
|||
|
.. _opencl_fake_address_space_map:
|
|||
|
|
|||
|
.. option:: -ffake-address-space-map
|
|||
|
|
|||
|
Overrides the target address space map with a fake map.
|
|||
|
This allows adding explicit address space IDs to the bitcode for non-segmented
|
|||
|
memory architectures that don't have separate IDs for each of the OpenCL
|
|||
|
logical address spaces by default. Passing ``-ffake-address-space-map`` will
|
|||
|
add/override address spaces of the target compiled for with the following values:
|
|||
|
``1-global``, ``2-constant``, ``3-local``, ``4-generic``. The private address
|
|||
|
space is represented by the absence of an address space attribute in the IR (see
|
|||
|
also :ref:`the section on the address space attribute <opencl_addrsp>`).
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -ffake-address-space-map test.cl
|
|||
|
|
|||
|
Some other flags used for the compilation for C can also be passed while
|
|||
|
compiling for OpenCL, examples: ``-c``, ``-O<1-4|s>``, ``-o``, ``-emit-llvm``, etc.
|
|||
|
|
|||
|
OpenCL Targets
|
|||
|
--------------
|
|||
|
|
|||
|
OpenCL targets are derived from the regular Clang target classes. The OpenCL
|
|||
|
specific parts of the target representation provide address space mapping as
|
|||
|
well as a set of supported extensions.
|
|||
|
|
|||
|
Specific Targets
|
|||
|
^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
There is a set of concrete HW architectures that OpenCL can be compiled for.
|
|||
|
|
|||
|
- For AMD target:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -target amdgcn-amd-amdhsa -mcpu=gfx900 test.cl
|
|||
|
|
|||
|
- For Nvidia architectures:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -target nvptx64-unknown-unknown test.cl
|
|||
|
|
|||
|
|
|||
|
Generic Targets
|
|||
|
^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
- SPIR is available as a generic target to allow portable bitcode to be produced
|
|||
|
that can be used across GPU toolchains. The implementation follows `the SPIR
|
|||
|
specification <https://www.khronos.org/spir>`_. There are two flavors
|
|||
|
available for 32 and 64 bits.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -target spir-unknown-unknown test.cl
|
|||
|
$ clang -target spir64-unknown-unknown test.cl
|
|||
|
|
|||
|
All known OpenCL extensions are supported in the SPIR targets. Clang will
|
|||
|
generate SPIR v1.2 compatible IR for OpenCL versions up to 2.0 and SPIR v2.0
|
|||
|
for OpenCL v2.0.
|
|||
|
|
|||
|
- x86 is used by some implementations that are x86 compatible and currently
|
|||
|
remains for backwards compatibility (with older implementations prior to
|
|||
|
SPIR target support). For "non-SPMD" targets which cannot spawn multiple
|
|||
|
work-items on the fly using hardware, which covers practically all non-GPU
|
|||
|
devices such as CPUs and DSPs, additional processing is needed for the kernels
|
|||
|
to support multiple work-item execution. For this, a 3rd party toolchain,
|
|||
|
such as for example `POCL <http://portablecl.org/>`_, can be used.
|
|||
|
|
|||
|
This target does not support multiple memory segments and, therefore, the fake
|
|||
|
address space map can be added using the :ref:`-ffake-address-space-map
|
|||
|
<opencl_fake_address_space_map>` flag.
|
|||
|
|
|||
|
.. _opencl_header:
|
|||
|
|
|||
|
OpenCL Header
|
|||
|
-------------
|
|||
|
|
|||
|
By default Clang will not include standard headers and therefore OpenCL builtin
|
|||
|
functions and some types (i.e. vectors) are unknown. The default CL header is,
|
|||
|
however, provided in the Clang installation and can be enabled by passing the
|
|||
|
``-finclude-default-header`` flag to the Clang frontend.
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ echo "bool is_wg_uniform(int i){return get_enqueued_local_size(i)==get_local_size(i);}" > test.cl
|
|||
|
$ clang -Xclang -finclude-default-header -cl-std=CL2.0 test.cl
|
|||
|
|
|||
|
Because the header is very large and long to parse, PCH (:doc:`PCHInternals`)
|
|||
|
and modules (:doc:`Modules`) are used internally to improve the compilation
|
|||
|
speed.
|
|||
|
|
|||
|
To enable modules for OpenCL:
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
$ clang -target spir-unknown-unknown -c -emit-llvm -Xclang -finclude-default-header -fmodules -fimplicit-module-maps -fmodules-cache-path=<path to the generated module> test.cl
|
|||
|
|
|||
|
OpenCL Extensions
|
|||
|
-----------------
|
|||
|
|
|||
|
All of the ``cl_khr_*`` extensions from `the official OpenCL specification
|
|||
|
<https://www.khronos.org/registry/OpenCL/sdk/2.0/docs/man/xhtml/EXTENSION.html>`_
|
|||
|
up to and including version 2.0 are available and set per target depending on the
|
|||
|
support available in the specific architecture.
|
|||
|
|
|||
|
It is possible to alter the default extensions setting per target using
|
|||
|
``-cl-ext`` flag. (See :ref:`flags description <opencl_cl_ext>` for more details).
|
|||
|
|
|||
|
Vendor extensions can be added flexibly by declaring the list of types and
|
|||
|
functions associated with each extensions enclosed within the following
|
|||
|
compiler pragma directives:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#pragma OPENCL EXTENSION the_new_extension_name : begin
|
|||
|
// declare types and functions associated with the extension here
|
|||
|
#pragma OPENCL EXTENSION the_new_extension_name : end
|
|||
|
|
|||
|
For example, parsing the following code adds ``my_t`` type and ``my_func``
|
|||
|
function to the custom ``my_ext`` extension.
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
#pragma OPENCL EXTENSION my_ext : begin
|
|||
|
typedef struct{
|
|||
|
int a;
|
|||
|
}my_t;
|
|||
|
void my_func(my_t);
|
|||
|
#pragma OPENCL EXTENSION my_ext : end
|
|||
|
|
|||
|
Declaring the same types in different vendor extensions is disallowed.
|
|||
|
|
|||
|
OpenCL Metadata
|
|||
|
---------------
|
|||
|
|
|||
|
Clang uses metadata to provide additional OpenCL semantics in IR needed for
|
|||
|
backends and OpenCL runtime.
|
|||
|
|
|||
|
Each kernel will have function metadata attached to it, specifying the arguments.
|
|||
|
Kernel argument metadata is used to provide source level information for querying
|
|||
|
at runtime, for example using the `clGetKernelArgInfo
|
|||
|
<https://www.khronos.org/registry/OpenCL/specs/opencl-1.2.pdf#167>`_
|
|||
|
call.
|
|||
|
|
|||
|
Note that ``-cl-kernel-arg-info`` enables more information about the original CL
|
|||
|
code to be added e.g. kernel parameter names will appear in the OpenCL metadata
|
|||
|
along with other information.
|
|||
|
|
|||
|
The IDs used to encode the OpenCL's logical address spaces in the argument info
|
|||
|
metadata follows the SPIR address space mapping as defined in the SPIR
|
|||
|
specification `section 2.2
|
|||
|
<https://www.khronos.org/registry/spir/specs/spir_spec-2.0.pdf#18>`_
|
|||
|
|
|||
|
OpenCL-Specific Attributes
|
|||
|
--------------------------
|
|||
|
|
|||
|
OpenCL support in Clang contains a set of attribute taken directly from the
|
|||
|
specification as well as additional attributes.
|
|||
|
|
|||
|
See also :doc:`AttributeReference`.
|
|||
|
|
|||
|
nosvm
|
|||
|
^^^^^
|
|||
|
|
|||
|
Clang supports this attribute to comply to OpenCL v2.0 conformance, but it
|
|||
|
does not have any effect on the IR. For more details reffer to the specification
|
|||
|
`section 6.7.2
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#49>`_
|
|||
|
|
|||
|
|
|||
|
opencl_unroll_hint
|
|||
|
^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
The implementation of this feature mirrors the unroll hint for C.
|
|||
|
More details on the syntax can be found in the specification
|
|||
|
`section 6.11.5
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#61>`_
|
|||
|
|
|||
|
convergent
|
|||
|
^^^^^^^^^^
|
|||
|
|
|||
|
To make sure no invalid optimizations occur for single program multiple data
|
|||
|
(SPMD) / single instruction multiple thread (SIMT) Clang provides attributes that
|
|||
|
can be used for special functions that have cross work item semantics.
|
|||
|
An example is the subgroup operations such as `intel_sub_group_shuffle
|
|||
|
<https://www.khronos.org/registry/cl/extensions/intel/cl_intel_subgroups.txt>`_
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
// Define custom my_sub_group_shuffle(data, c)
|
|||
|
// that makes use of intel_sub_group_shuffle
|
|||
|
r1 = ...
|
|||
|
if (r0) r1 = computeA();
|
|||
|
// Shuffle data from r1 into r3
|
|||
|
// of threads id r2.
|
|||
|
r3 = my_sub_group_shuffle(r1, r2);
|
|||
|
if (r0) r3 = computeB();
|
|||
|
|
|||
|
with non-SPMD semantics this is optimized to the following equivalent code:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
r1 = ...
|
|||
|
if (!r0)
|
|||
|
// Incorrect functionality! The data in r1
|
|||
|
// have not been computed by all threads yet.
|
|||
|
r3 = my_sub_group_shuffle(r1, r2);
|
|||
|
else {
|
|||
|
r1 = computeA();
|
|||
|
r3 = my_sub_group_shuffle(r1, r2);
|
|||
|
r3 = computeB();
|
|||
|
}
|
|||
|
|
|||
|
Declaring the function ``my_sub_group_shuffle`` with the convergent attribute
|
|||
|
would prevent this:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
my_sub_group_shuffle() __attribute__((convergent));
|
|||
|
|
|||
|
Using ``convergent`` guarantees correct execution by keeping CFG equivalence
|
|||
|
wrt operations marked as ``convergent``. CFG ``G´`` is equivalent to ``G`` wrt
|
|||
|
node ``Ni`` : ``iff ∀ Nj (i≠j)`` domination and post-domination relations with
|
|||
|
respect to ``Ni`` remain the same in both ``G`` and ``G´``.
|
|||
|
|
|||
|
noduplicate
|
|||
|
^^^^^^^^^^^
|
|||
|
|
|||
|
``noduplicate`` is more restrictive with respect to optimizations than
|
|||
|
``convergent`` because a convergent function only preserves CFG equivalence.
|
|||
|
This allows some optimizations to happen as long as the control flow remains
|
|||
|
unmodified.
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
for (int i=0; i<4; i++)
|
|||
|
my_sub_group_shuffle()
|
|||
|
|
|||
|
can be modified to:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
my_sub_group_shuffle();
|
|||
|
my_sub_group_shuffle();
|
|||
|
my_sub_group_shuffle();
|
|||
|
my_sub_group_shuffle();
|
|||
|
|
|||
|
while using ``noduplicate`` would disallow this. Also ``noduplicate`` doesn't
|
|||
|
have the same safe semantics of CFG as ``convergent`` and can cause changes in
|
|||
|
CFG that modify semantics of the original program.
|
|||
|
|
|||
|
``noduplicate`` is kept for backwards compatibility only and it considered to be
|
|||
|
deprecated for future uses.
|
|||
|
|
|||
|
.. _opencl_addrsp:
|
|||
|
|
|||
|
address_space
|
|||
|
^^^^^^^^^^^^^
|
|||
|
|
|||
|
Clang has arbitrary address space support using the ``address_space(N)``
|
|||
|
attribute, where ``N`` is an integer number in the range ``0`` to ``16777215``
|
|||
|
(``0xffffffu``).
|
|||
|
|
|||
|
An OpenCL implementation provides a list of standard address spaces using
|
|||
|
keywords: ``private``, ``local``, ``global``, and ``generic``. In the AST and
|
|||
|
in the IR local, global, or generic will be represented by the address space
|
|||
|
attribute with the corresponding unique number. Note that private does not have
|
|||
|
any corresponding attribute added and, therefore, is represented by the absence
|
|||
|
of an address space number. The specific IDs for an address space do not have to
|
|||
|
match between the AST and the IR. Typically in the AST address space numbers
|
|||
|
represent logical segments while in the IR they represent physical segments.
|
|||
|
Therefore, machines with flat memory segments can map all AST address space
|
|||
|
numbers to the same physical segment ID or skip address space attribute
|
|||
|
completely while generating the IR. However, if the address space information
|
|||
|
is needed by the IR passes e.g. to improve alias analysis, it is recommended
|
|||
|
to keep it and only lower to reflect physical memory segments in the late
|
|||
|
machine passes.
|
|||
|
|
|||
|
OpenCL builtins
|
|||
|
---------------
|
|||
|
|
|||
|
There are some standard OpenCL functions that are implemented as Clang builtins:
|
|||
|
|
|||
|
- All pipe functions from `section 6.13.16.2/6.13.16.3
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#160>`_ of
|
|||
|
the OpenCL v2.0 kernel language specification. `
|
|||
|
|
|||
|
- Address space qualifier conversion functions ``to_global``/``to_local``/``to_private``
|
|||
|
from `section 6.13.9
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#101>`_.
|
|||
|
|
|||
|
- All the ``enqueue_kernel`` functions from `section 6.13.17.1
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#164>`_ and
|
|||
|
enqueue query functions from `section 6.13.17.5
|
|||
|
<https://www.khronos.org/registry/cl/specs/opencl-2.0-openclc.pdf#171>`_.
|
|||
|
|
|||
|
.. _opencl_cpp:
|
|||
|
|
|||
|
C++ for OpenCL
|
|||
|
--------------
|
|||
|
|
|||
|
Starting from clang 9 kernel code can contain C++17 features: classes, templates,
|
|||
|
function overloading, type deduction, etc. Please note that this is not an
|
|||
|
implementation of `OpenCL C++
|
|||
|
<https://www.khronos.org/registry/OpenCL/specs/2.2/pdf/OpenCL_Cxx.pdf>`_ and
|
|||
|
there is no plan to support it in clang in any new releases in the near future.
|
|||
|
|
|||
|
For detailed information about restrictions to allowed C++ features please
|
|||
|
refer to :doc:`LanguageExtensions`.
|
|||
|
|
|||
|
Since C++ features are to be used on top of OpenCL C functionality, all existing
|
|||
|
restrictions from OpenCL C v2.0 will inherently apply. All OpenCL C builtin types
|
|||
|
and function libraries are supported and can be used in this mode.
|
|||
|
|
|||
|
To enable the C++ for OpenCL mode, pass one of following command line options when
|
|||
|
compiling ``.cl`` file ``-cl-std=clc++``, ``-cl-std=CLC++``, ``-std=clc++`` or
|
|||
|
``-std=CLC++``.
|
|||
|
|
|||
|
.. code-block:: c++
|
|||
|
|
|||
|
template<class T> T add( T x, T y )
|
|||
|
{
|
|||
|
return x + y;
|
|||
|
}
|
|||
|
|
|||
|
__kernel void test( __global float* a, __global float* b)
|
|||
|
{
|
|||
|
auto index = get_global_id(0);
|
|||
|
a[index] = add(b[index], b[index+1]);
|
|||
|
}
|
|||
|
|
|||
|
|
|||
|
.. code-block:: console
|
|||
|
|
|||
|
clang -cl-std=clc++ test.cl
|
|||
|
|
|||
|
.. _target_features:
|
|||
|
|
|||
|
Target-Specific Features and Limitations
|
|||
|
========================================
|
|||
|
|
|||
|
CPU Architectures Features and Limitations
|
|||
|
------------------------------------------
|
|||
|
|
|||
|
X86
|
|||
|
^^^
|
|||
|
|
|||
|
The support for X86 (both 32-bit and 64-bit) is considered stable on
|
|||
|
Darwin (macOS), Linux, FreeBSD, and Dragonfly BSD: it has been tested
|
|||
|
to correctly compile many large C, C++, Objective-C, and Objective-C++
|
|||
|
codebases.
|
|||
|
|
|||
|
On ``x86_64-mingw32``, passing i128(by value) is incompatible with the
|
|||
|
Microsoft x64 calling convention. You might need to tweak
|
|||
|
``WinX86_64ABIInfo::classify()`` in lib/CodeGen/TargetInfo.cpp.
|
|||
|
|
|||
|
For the X86 target, clang supports the `-m16` command line
|
|||
|
argument which enables 16-bit code output. This is broadly similar to
|
|||
|
using ``asm(".code16gcc")`` with the GNU toolchain. The generated code
|
|||
|
and the ABI remains 32-bit but the assembler emits instructions
|
|||
|
appropriate for a CPU running in 16-bit mode, with address-size and
|
|||
|
operand-size prefixes to enable 32-bit addressing and operations.
|
|||
|
|
|||
|
ARM
|
|||
|
^^^
|
|||
|
|
|||
|
The support for ARM (specifically ARMv6 and ARMv7) is considered stable
|
|||
|
on Darwin (iOS): it has been tested to correctly compile many large C,
|
|||
|
C++, Objective-C, and Objective-C++ codebases. Clang only supports a
|
|||
|
limited number of ARM architectures. It does not yet fully support
|
|||
|
ARMv5, for example.
|
|||
|
|
|||
|
PowerPC
|
|||
|
^^^^^^^
|
|||
|
|
|||
|
The support for PowerPC (especially PowerPC64) is considered stable
|
|||
|
on Linux and FreeBSD: it has been tested to correctly compile many
|
|||
|
large C and C++ codebases. PowerPC (32bit) is still missing certain
|
|||
|
features (e.g. PIC code on ELF platforms).
|
|||
|
|
|||
|
Other platforms
|
|||
|
^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
clang currently contains some support for other architectures (e.g. Sparc);
|
|||
|
however, significant pieces of code generation are still missing, and they
|
|||
|
haven't undergone significant testing.
|
|||
|
|
|||
|
clang contains limited support for the MSP430 embedded processor, but
|
|||
|
both the clang support and the LLVM backend support are highly
|
|||
|
experimental.
|
|||
|
|
|||
|
Other platforms are completely unsupported at the moment. Adding the
|
|||
|
minimal support needed for parsing and semantic analysis on a new
|
|||
|
platform is quite easy; see ``lib/Basic/Targets.cpp`` in the clang source
|
|||
|
tree. This level of support is also sufficient for conversion to LLVM IR
|
|||
|
for simple programs. Proper support for conversion to LLVM IR requires
|
|||
|
adding code to ``lib/CodeGen/CGCall.cpp`` at the moment; this is likely to
|
|||
|
change soon, though. Generating assembly requires a suitable LLVM
|
|||
|
backend.
|
|||
|
|
|||
|
Operating System Features and Limitations
|
|||
|
-----------------------------------------
|
|||
|
|
|||
|
Darwin (macOS)
|
|||
|
^^^^^^^^^^^^^^
|
|||
|
|
|||
|
Thread Sanitizer is not supported.
|
|||
|
|
|||
|
Windows
|
|||
|
^^^^^^^
|
|||
|
|
|||
|
Clang has experimental support for targeting "Cygming" (Cygwin / MinGW)
|
|||
|
platforms.
|
|||
|
|
|||
|
See also :ref:`Microsoft Extensions <c_ms>`.
|
|||
|
|
|||
|
Cygwin
|
|||
|
""""""
|
|||
|
|
|||
|
Clang works on Cygwin-1.7.
|
|||
|
|
|||
|
MinGW32
|
|||
|
"""""""
|
|||
|
|
|||
|
Clang works on some mingw32 distributions. Clang assumes directories as
|
|||
|
below;
|
|||
|
|
|||
|
- ``C:/mingw/include``
|
|||
|
- ``C:/mingw/lib``
|
|||
|
- ``C:/mingw/lib/gcc/mingw32/4.[3-5].0/include/c++``
|
|||
|
|
|||
|
On MSYS, a few tests might fail.
|
|||
|
|
|||
|
MinGW-w64
|
|||
|
"""""""""
|
|||
|
|
|||
|
For 32-bit (i686-w64-mingw32), and 64-bit (x86\_64-w64-mingw32), Clang
|
|||
|
assumes as below;
|
|||
|
|
|||
|
- ``GCC versions 4.5.0 to 4.5.3, 4.6.0 to 4.6.2, or 4.7.0 (for the C++ header search path)``
|
|||
|
- ``some_directory/bin/gcc.exe``
|
|||
|
- ``some_directory/bin/clang.exe``
|
|||
|
- ``some_directory/bin/clang++.exe``
|
|||
|
- ``some_directory/bin/../include/c++/GCC_version``
|
|||
|
- ``some_directory/bin/../include/c++/GCC_version/x86_64-w64-mingw32``
|
|||
|
- ``some_directory/bin/../include/c++/GCC_version/i686-w64-mingw32``
|
|||
|
- ``some_directory/bin/../include/c++/GCC_version/backward``
|
|||
|
- ``some_directory/bin/../x86_64-w64-mingw32/include``
|
|||
|
- ``some_directory/bin/../i686-w64-mingw32/include``
|
|||
|
- ``some_directory/bin/../include``
|
|||
|
|
|||
|
This directory layout is standard for any toolchain you will find on the
|
|||
|
official `MinGW-w64 website <http://mingw-w64.sourceforge.net>`_.
|
|||
|
|
|||
|
Clang expects the GCC executable "gcc.exe" compiled for
|
|||
|
``i686-w64-mingw32`` (or ``x86_64-w64-mingw32``) to be present on PATH.
|
|||
|
|
|||
|
`Some tests might fail <https://bugs.llvm.org/show_bug.cgi?id=9072>`_ on
|
|||
|
``x86_64-w64-mingw32``.
|
|||
|
|
|||
|
.. _clang-cl:
|
|||
|
|
|||
|
clang-cl
|
|||
|
========
|
|||
|
|
|||
|
clang-cl is an alternative command-line interface to Clang, designed for
|
|||
|
compatibility with the Visual C++ compiler, cl.exe.
|
|||
|
|
|||
|
To enable clang-cl to find system headers, libraries, and the linker when run
|
|||
|
from the command-line, it should be executed inside a Visual Studio Native Tools
|
|||
|
Command Prompt or a regular Command Prompt where the environment has been set
|
|||
|
up using e.g. `vcvarsall.bat <https://msdn.microsoft.com/en-us/library/f2ccy3wt.aspx>`_.
|
|||
|
|
|||
|
clang-cl can also be used from inside Visual Studio by selecting the LLVM
|
|||
|
Platform Toolset. The toolset is not part of the installer, but may be installed
|
|||
|
separately from the
|
|||
|
`Visual Studio Marketplace <https://marketplace.visualstudio.com/items?itemName=LLVMExtensions.llvm-toolchain>`_.
|
|||
|
To use the toolset, select a project in Solution Explorer, open its Property
|
|||
|
Page (Alt+F7), and in the "General" section of "Configuration Properties"
|
|||
|
change "Platform Toolset" to LLVM. Doing so enables an additional Property
|
|||
|
Page for selecting the clang-cl executable to use for builds.
|
|||
|
|
|||
|
To use the toolset with MSBuild directly, invoke it with e.g.
|
|||
|
``/p:PlatformToolset=LLVM``. This allows trying out the clang-cl toolchain
|
|||
|
without modifying your project files.
|
|||
|
|
|||
|
It's also possible to point MSBuild at clang-cl without changing toolset by
|
|||
|
passing ``/p:CLToolPath=c:\llvm\bin /p:CLToolExe=clang-cl.exe``.
|
|||
|
|
|||
|
When using CMake and the Visual Studio generators, the toolset can be set with the ``-T`` flag:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
cmake -G"Visual Studio 15 2017" -T LLVM ..
|
|||
|
|
|||
|
When using CMake with the Ninja generator, set the ``CMAKE_C_COMPILER`` and
|
|||
|
``CMAKE_CXX_COMPILER`` variables to clang-cl:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
cmake -GNinja -DCMAKE_C_COMPILER="c:/Program Files (x86)/LLVM/bin/clang-cl.exe"
|
|||
|
-DCMAKE_CXX_COMPILER="c:/Program Files (x86)/LLVM/bin/clang-cl.exe" ..
|
|||
|
|
|||
|
|
|||
|
Command-Line Options
|
|||
|
--------------------
|
|||
|
|
|||
|
To be compatible with cl.exe, clang-cl supports most of the same command-line
|
|||
|
options. Those options can start with either ``/`` or ``-``. It also supports
|
|||
|
some of Clang's core options, such as the ``-W`` options.
|
|||
|
|
|||
|
Options that are known to clang-cl, but not currently supported, are ignored
|
|||
|
with a warning. For example:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
clang-cl.exe: warning: argument unused during compilation: '/AI'
|
|||
|
|
|||
|
To suppress warnings about unused arguments, use the ``-Qunused-arguments`` option.
|
|||
|
|
|||
|
Options that are not known to clang-cl will be ignored by default. Use the
|
|||
|
``-Werror=unknown-argument`` option in order to treat them as errors. If these
|
|||
|
options are spelled with a leading ``/``, they will be mistaken for a filename:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
clang-cl.exe: error: no such file or directory: '/foobar'
|
|||
|
|
|||
|
Please `file a bug <https://bugs.llvm.org/enter_bug.cgi?product=clang&component=Driver>`_
|
|||
|
for any valid cl.exe flags that clang-cl does not understand.
|
|||
|
|
|||
|
Execute ``clang-cl /?`` to see a list of supported options:
|
|||
|
|
|||
|
::
|
|||
|
|
|||
|
CL.EXE COMPATIBILITY OPTIONS:
|
|||
|
/? Display available options
|
|||
|
/arch:<value> Set architecture for code generation
|
|||
|
/Brepro- Write current time into COFF output (default)
|
|||
|
/Brepro Do not write current time into COFF output (breaks link.exe /incremental)
|
|||
|
/clang:<arg> Pass <arg> to the clang driver
|
|||
|
/C Do not discard comments when preprocessing
|
|||
|
/c Compile only
|
|||
|
/d1PP Retain macro definitions in /E mode
|
|||
|
/d1reportAllClassLayout Dump record layout information
|
|||
|
/diagnostics:caret Enable caret and column diagnostics (default)
|
|||
|
/diagnostics:classic Disable column and caret diagnostics
|
|||
|
/diagnostics:column Disable caret diagnostics but keep column info
|
|||
|
/D <macro[=value]> Define macro
|
|||
|
/EH<value> Set exception handling model
|
|||
|
/EP Disable linemarker output and preprocess to stdout
|
|||
|
/execution-charset:<value>
|
|||
|
Set runtime encoding, supports only UTF-8
|
|||
|
/E Preprocess to stdout
|
|||
|
/fallback Fall back to cl.exe if clang-cl fails to compile
|
|||
|
/FA Output assembly code file during compilation
|
|||
|
/Fa<file or dir/> Set assembly output file name (with /FA)
|
|||
|
/Fe<file or dir/> Set output executable file name
|
|||
|
/FI <value> Include file before parsing
|
|||
|
/Fi<file> Set preprocess output file name (with /P)
|
|||
|
/Fo<file or dir/> Set output object file (with /c)
|
|||
|
/fp:except-
|
|||
|
/fp:except
|
|||
|
/fp:fast
|
|||
|
/fp:precise
|
|||
|
/fp:strict
|
|||
|
/Fp<file> Set pch file name (with /Yc and /Yu)
|
|||
|
/GA Assume thread-local variables are defined in the executable
|
|||
|
/Gd Set __cdecl as a default calling convention
|
|||
|
/GF- Disable string pooling
|
|||
|
/GF Enable string pooling (default)
|
|||
|
/GR- Do not emit RTTI data
|
|||
|
/Gregcall Set __regcall as a default calling convention
|
|||
|
/GR Emit RTTI data (default)
|
|||
|
/Gr Set __fastcall as a default calling convention
|
|||
|
/GS- Disable buffer security check
|
|||
|
/GS Enable buffer security check (default)
|
|||
|
/Gs Use stack probes (default)
|
|||
|
/Gs<value> Set stack probe size (default 4096)
|
|||
|
/guard:<value> Enable Control Flow Guard with /guard:cf,
|
|||
|
or only the table with /guard:cf,nochecks
|
|||
|
/Gv Set __vectorcall as a default calling convention
|
|||
|
/Gw- Do not put each data item in its own section (default)
|
|||
|
/Gw Put each data item in its own section
|
|||
|
/GX- Deprecated (like not passing /EH)
|
|||
|
/GX Deprecated; use /EHsc
|
|||
|
/Gy- Do not put each function in its own section (default)
|
|||
|
/Gy Put each function in its own section
|
|||
|
/Gz Set __stdcall as a default calling convention
|
|||
|
/help Display available options
|
|||
|
/imsvc <dir> Add <dir> to system include search path, as if in %INCLUDE%
|
|||
|
/I <dir> Add directory to include search path
|
|||
|
/J Make char type unsigned
|
|||
|
/LDd Create debug DLL
|
|||
|
/LD Create DLL
|
|||
|
/link <options> Forward options to the linker
|
|||
|
/MDd Use DLL debug run-time
|
|||
|
/MD Use DLL run-time
|
|||
|
/MTd Use static debug run-time
|
|||
|
/MT Use static run-time
|
|||
|
/O1 Optimize for size (like /Og /Os /Oy /Ob2 /GF /Gy)
|
|||
|
/O2 Optimize for speed (like /Og /Oi /Ot /Oy /Ob2 /GF /Gy)
|
|||
|
/Ob0 Disable function inlining
|
|||
|
/Ob1 Only inline functions explicitly or implicitly marked inline
|
|||
|
/Ob2 Inline functions as deemed beneficial by the compiler
|
|||
|
/Od Disable optimization
|
|||
|
/Og No effect
|
|||
|
/Oi- Disable use of builtin functions
|
|||
|
/Oi Enable use of builtin functions
|
|||
|
/Os Optimize for size
|
|||
|
/Ot Optimize for speed
|
|||
|
/Ox Deprecated (like /Og /Oi /Ot /Oy /Ob2); use /O2
|
|||
|
/Oy- Disable frame pointer omission (x86 only, default)
|
|||
|
/Oy Enable frame pointer omission (x86 only)
|
|||
|
/O<flags> Set multiple /O flags at once; e.g. '/O2y-' for '/O2 /Oy-'
|
|||
|
/o <file or dir/> Deprecated (set output file name); use /Fe or /Fe
|
|||
|
/P Preprocess to file
|
|||
|
/Qvec- Disable the loop vectorization passes
|
|||
|
/Qvec Enable the loop vectorization passes
|
|||
|
/showFilenames- Do not print the name of each compiled file (default)
|
|||
|
/showFilenames Print the name of each compiled file
|
|||
|
/showIncludes Print info about included files to stderr
|
|||
|
/source-charset:<value> Set source encoding, supports only UTF-8
|
|||
|
/std:<value> Set C++ version (c++14,c++17,c++latest)
|
|||
|
/TC Treat all source files as C
|
|||
|
/Tc <file> Treat <file> as C source file
|
|||
|
/TP Treat all source files as C++
|
|||
|
/Tp <file> Treat <file> as C++ source file
|
|||
|
/utf-8 Set source and runtime encoding to UTF-8 (default)
|
|||
|
/U <macro> Undefine macro
|
|||
|
/vd<value> Control vtordisp placement
|
|||
|
/vmb Use a best-case representation method for member pointers
|
|||
|
/vmg Use a most-general representation for member pointers
|
|||
|
/vmm Set the default most-general representation to multiple inheritance
|
|||
|
/vms Set the default most-general representation to single inheritance
|
|||
|
/vmv Set the default most-general representation to virtual inheritance
|
|||
|
/volatile:iso Volatile loads and stores have standard semantics
|
|||
|
/volatile:ms Volatile loads and stores have acquire and release semantics
|
|||
|
/W0 Disable all warnings
|
|||
|
/W1 Enable -Wall
|
|||
|
/W2 Enable -Wall
|
|||
|
/W3 Enable -Wall
|
|||
|
/W4 Enable -Wall and -Wextra
|
|||
|
/Wall Enable -Weverything
|
|||
|
/WX- Do not treat warnings as errors (default)
|
|||
|
/WX Treat warnings as errors
|
|||
|
/w Disable all warnings
|
|||
|
/X Do not add %INCLUDE% to include search path
|
|||
|
/Y- Disable precompiled headers, overrides /Yc and /Yu
|
|||
|
/Yc<filename> Generate a pch file for all code up to and including <filename>
|
|||
|
/Yu<filename> Load a pch file and use it instead of all code up to and including <filename>
|
|||
|
/Z7 Enable CodeView debug information in object files
|
|||
|
/Zc:alignedNew- Disable C++17 aligned allocation functions
|
|||
|
/Zc:alignedNew Enable C++17 aligned allocation functions
|
|||
|
/Zc:char8_t- Disable char8_t from c++2a
|
|||
|
/Zc:char8_t Enable char8_t from C++2a
|
|||
|
/Zc:dllexportInlines- Do not dllexport/dllimport inline member functions of dllexport/import classes
|
|||
|
/Zc:dllexportInlines dllexport/dllimport inline member functions of dllexport/import classes (default)
|
|||
|
/Zc:sizedDealloc- Disable C++14 sized global deallocation functions
|
|||
|
/Zc:sizedDealloc Enable C++14 sized global deallocation functions
|
|||
|
/Zc:strictStrings Treat string literals as const
|
|||
|
/Zc:threadSafeInit- Disable thread-safe initialization of static variables
|
|||
|
/Zc:threadSafeInit Enable thread-safe initialization of static variables
|
|||
|
/Zc:trigraphs- Disable trigraphs (default)
|
|||
|
/Zc:trigraphs Enable trigraphs
|
|||
|
/Zc:twoPhase- Disable two-phase name lookup in templates (default)
|
|||
|
/Zc:twoPhase Enable two-phase name lookup in templates
|
|||
|
/Zd Emit debug line number tables only
|
|||
|
/Zi Like /Z7
|
|||
|
/Zl Do not let object file auto-link default libraries
|
|||
|
/Zp Set default maximum struct packing alignment to 1
|
|||
|
/Zp<value> Set default maximum struct packing alignment
|
|||
|
/Zs Syntax-check only
|
|||
|
|
|||
|
OPTIONS:
|
|||
|
-### Print (but do not run) the commands to run for this compilation
|
|||
|
--analyze Run the static analyzer
|
|||
|
-faddrsig Emit an address-significance table
|
|||
|
-fansi-escape-codes Use ANSI escape codes for diagnostics
|
|||
|
-fblocks Enable the 'blocks' language feature
|
|||
|
-fcf-protection=<value> Instrument control-flow architecture protection. Options: return, branch, full, none.
|
|||
|
-fcf-protection Enable cf-protection in 'full' mode
|
|||
|
-fcolor-diagnostics Use colors in diagnostics
|
|||
|
-fcomplete-member-pointers
|
|||
|
Require member pointer base types to be complete if they would be significant under the Microsoft ABI
|
|||
|
-fcoverage-mapping Generate coverage mapping to enable code coverage analysis
|
|||
|
-fcs-profile-generate=<directory>
|
|||
|
Generate instrumented code to collect context sensitive
|
|||
|
execution counts into <directory>/default.profraw
|
|||
|
(overridden by LLVM_PROFILE_FILE env var)
|
|||
|
-fcs-profile-generate Generate instrumented code to collect context sensitive
|
|||
|
execution counts into default.profraw
|
|||
|
(overridden by LLVM_PROFILE_FILE env var)
|
|||
|
-fdebug-compilation-dir <value>
|
|||
|
The compilation directory to embed in the debug info.
|
|||
|
-fdebug-macro Emit macro debug information
|
|||
|
-fdelayed-template-parsing
|
|||
|
Parse templated function definitions at the end of the translation unit
|
|||
|
-fdiagnostics-absolute-paths
|
|||
|
Print absolute paths in diagnostics
|
|||
|
-fdiagnostics-parseable-fixits
|
|||
|
Print fix-its in machine parseable form
|
|||
|
-flto=<value> Set LTO mode to either 'full' or 'thin'
|
|||
|
-flto Enable LTO in 'full' mode
|
|||
|
-fmerge-all-constants Allow merging of constants
|
|||
|
-fms-compatibility-version=<value>
|
|||
|
Dot-separated value representing the Microsoft compiler version
|
|||
|
number to report in _MSC_VER (0 = don't define it (default))
|
|||
|
-fms-compatibility Enable full Microsoft Visual C++ compatibility
|
|||
|
-fms-extensions Accept some non-standard constructs supported by the Microsoft compiler
|
|||
|
-fmsc-version=<value> Microsoft compiler version number to report in _MSC_VER
|
|||
|
(0 = don't define it (default))
|
|||
|
-fno-addrsig Don't emit an address-significance table
|
|||
|
-fno-builtin-<value> Disable implicit builtin knowledge of a specific function
|
|||
|
-fno-builtin Disable implicit builtin knowledge of functions
|
|||
|
-fno-complete-member-pointers
|
|||
|
Do not require member pointer base types to be complete if they would be significant under the Microsoft ABI
|
|||
|
-fno-coverage-mapping Disable code coverage analysis
|
|||
|
-fno-crash-diagnostics Disable auto-generation of preprocessed source files and a script for reproduction during a clang crash
|
|||
|
-fno-debug-macro Do not emit macro debug information
|
|||
|
-fno-delayed-template-parsing
|
|||
|
Disable delayed template parsing
|
|||
|
-fno-profile-generate Disable generation of profile instrumentation.
|
|||
|
-fno-profile-instr-generate
|
|||
|
Disable generation of profile instrumentation.
|
|||
|
-fno-profile-instr-use Disable using instrumentation data for profile-guided optimization
|
|||
|
-fno-sanitize-address-poison-custom-array-cookie
|
|||
|
Disable poisoning array cookies when using custom operator new[] in AddressSanitizer
|
|||
|
-fno-sanitize-address-use-after-scope
|
|||
|
Disable use-after-scope detection in AddressSanitizer
|
|||
|
-fno-sanitize-address-use-odr-indicator
|
|||
|
Disable ODR indicator globals
|
|||
|
-fno-sanitize-blacklist Don't use blacklist file for sanitizers
|
|||
|
-fno-sanitize-cfi-cross-dso
|
|||
|
Disable control flow integrity (CFI) checks for cross-DSO calls.
|
|||
|
-fno-sanitize-coverage=<value>
|
|||
|
Disable specified features of coverage instrumentation for Sanitizers
|
|||
|
-fno-sanitize-memory-track-origins
|
|||
|
Disable origins tracking in MemorySanitizer
|
|||
|
-fno-sanitize-memory-use-after-dtor
|
|||
|
Disable use-after-destroy detection in MemorySanitizer
|
|||
|
-fno-sanitize-recover=<value>
|
|||
|
Disable recovery for specified sanitizers
|
|||
|
-fno-sanitize-stats Disable sanitizer statistics gathering.
|
|||
|
-fno-sanitize-thread-atomics
|
|||
|
Disable atomic operations instrumentation in ThreadSanitizer
|
|||
|
-fno-sanitize-thread-func-entry-exit
|
|||
|
Disable function entry/exit instrumentation in ThreadSanitizer
|
|||
|
-fno-sanitize-thread-memory-access
|
|||
|
Disable memory access instrumentation in ThreadSanitizer
|
|||
|
-fno-sanitize-trap=<value>
|
|||
|
Disable trapping for specified sanitizers
|
|||
|
-fno-standalone-debug Limit debug information produced to reduce size of debug binary
|
|||
|
-fobjc-runtime=<value> Specify the target Objective-C runtime kind and version
|
|||
|
-forder-file-instrumentation
|
|||
|
Generate instrumented code to collect order file into default.profraw
|
|||
|
file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)
|
|||
|
-fprofile-exclude-files=<value>
|
|||
|
Instrument only functions from files where names don't match all the regexes separated by a semi-colon
|
|||
|
-fprofile-filter-files=<value>
|
|||
|
Instrument only functions from files where names match any regex separated by a semi-colon
|
|||
|
-fprofile-generate=<directory>
|
|||
|
Generate instrumented code to collect execution counts into
|
|||
|
<directory>/default.profraw (overridden by LLVM_PROFILE_FILE env var)
|
|||
|
-fprofile-generate Generate instrumented code to collect execution counts into
|
|||
|
default.profraw (overridden by LLVM_PROFILE_FILE env var)
|
|||
|
-fprofile-instr-generate=<file>
|
|||
|
Generate instrumented code to collect execution counts into <file>
|
|||
|
(overridden by LLVM_PROFILE_FILE env var)
|
|||
|
-fprofile-instr-generate
|
|||
|
Generate instrumented code to collect execution counts into default.profraw file
|
|||
|
(overridden by '=' form of option or LLVM_PROFILE_FILE env var)
|
|||
|
-fprofile-instr-use=<value>
|
|||
|
Use instrumentation data for profile-guided optimization
|
|||
|
-fprofile-remapping-file=<file>
|
|||
|
Use the remappings described in <file> to match the profile data against names in the program
|
|||
|
-fsanitize-address-field-padding=<value>
|
|||
|
Level of field padding for AddressSanitizer
|
|||
|
-fsanitize-address-globals-dead-stripping
|
|||
|
Enable linker dead stripping of globals in AddressSanitizer
|
|||
|
-fsanitize-address-poison-custom-array-cookie
|
|||
|
Enable poisoning array cookies when using custom operator new[] in AddressSanitizer
|
|||
|
-fsanitize-address-use-after-scope
|
|||
|
Enable use-after-scope detection in AddressSanitizer
|
|||
|
-fsanitize-address-use-odr-indicator
|
|||
|
Enable ODR indicator globals to avoid false ODR violation reports in partially sanitized programs at the cost of an increase in binary size
|
|||
|
-fsanitize-blacklist=<value>
|
|||
|
Path to blacklist file for sanitizers
|
|||
|
-fsanitize-cfi-cross-dso
|
|||
|
Enable control flow integrity (CFI) checks for cross-DSO calls.
|
|||
|
-fsanitize-cfi-icall-generalize-pointers
|
|||
|
Generalize pointers in CFI indirect call type signature checks
|
|||
|
-fsanitize-coverage=<value>
|
|||
|
Specify the type of coverage instrumentation for Sanitizers
|
|||
|
-fsanitize-hwaddress-abi=<value>
|
|||
|
Select the HWAddressSanitizer ABI to target (interceptor or platform, default interceptor)
|
|||
|
-fsanitize-memory-track-origins=<value>
|
|||
|
Enable origins tracking in MemorySanitizer
|
|||
|
-fsanitize-memory-track-origins
|
|||
|
Enable origins tracking in MemorySanitizer
|
|||
|
-fsanitize-memory-use-after-dtor
|
|||
|
Enable use-after-destroy detection in MemorySanitizer
|
|||
|
-fsanitize-recover=<value>
|
|||
|
Enable recovery for specified sanitizers
|
|||
|
-fsanitize-stats Enable sanitizer statistics gathering.
|
|||
|
-fsanitize-thread-atomics
|
|||
|
Enable atomic operations instrumentation in ThreadSanitizer (default)
|
|||
|
-fsanitize-thread-func-entry-exit
|
|||
|
Enable function entry/exit instrumentation in ThreadSanitizer (default)
|
|||
|
-fsanitize-thread-memory-access
|
|||
|
Enable memory access instrumentation in ThreadSanitizer (default)
|
|||
|
-fsanitize-trap=<value> Enable trapping for specified sanitizers
|
|||
|
-fsanitize-undefined-strip-path-components=<number>
|
|||
|
Strip (or keep only, if negative) a given number of path components when emitting check metadata.
|
|||
|
-fsanitize=<check> Turn on runtime checks for various forms of undefined or suspicious behavior. See user manual for available checks
|
|||
|
-fsplit-lto-unit Enables splitting of the LTO unit.
|
|||
|
-fstandalone-debug Emit full debug info for all types used by the program
|
|||
|
-fthinlto-index=<value> Perform ThinLTO importing using provided function summary index
|
|||
|
-fwhole-program-vtables Enables whole-program vtable optimization. Requires -flto
|
|||
|
-gcodeview-ghash Emit type record hashes in a .debug$H section
|
|||
|
-gcodeview Generate CodeView debug information
|
|||
|
-gline-directives-only Emit debug line info directives only
|
|||
|
-gline-tables-only Emit debug line number tables only
|
|||
|
-miamcu Use Intel MCU ABI
|
|||
|
-mllvm <value> Additional arguments to forward to LLVM's option processing
|
|||
|
-nobuiltininc Disable builtin #include directories
|
|||
|
-print-supported-cpus Print supported cpu models for the given target (if target is not specified, it will print the supported cpus for the default target)
|
|||
|
-Qunused-arguments Don't emit warning for unused driver arguments
|
|||
|
-R<remark> Enable the specified remark
|
|||
|
--target=<value> Generate code for the given target
|
|||
|
--version Print version information
|
|||
|
-v Show commands to run and use verbose output
|
|||
|
-W<warning> Enable the specified warning
|
|||
|
-Xclang <arg> Pass <arg> to the clang compiler
|
|||
|
|
|||
|
The /clang: Option
|
|||
|
^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
When clang-cl is run with a set of ``/clang:<arg>`` options, it will gather all
|
|||
|
of the ``<arg>`` arguments and process them as if they were passed to the clang
|
|||
|
driver. This mechanism allows you to pass flags that are not exposed in the
|
|||
|
clang-cl options or flags that have a different meaning when passed to the clang
|
|||
|
driver. Regardless of where they appear in the command line, the ``/clang:``
|
|||
|
arguments are treated as if they were passed at the end of the clang-cl command
|
|||
|
line.
|
|||
|
|
|||
|
The /Zc:dllexportInlines- Option
|
|||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
This causes the class-level `dllexport` and `dllimport` attributes to not apply
|
|||
|
to inline member functions, as they otherwise would. For example, in the code
|
|||
|
below `S::foo()` would normally be defined and exported by the DLL, but when
|
|||
|
using the ``/Zc:dllexportInlines-`` flag it is not:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
struct __declspec(dllexport) S {
|
|||
|
void foo() {}
|
|||
|
}
|
|||
|
|
|||
|
This has the benefit that the compiler doesn't need to emit a definition of
|
|||
|
`S::foo()` in every translation unit where the declaration is included, as it
|
|||
|
would otherwise do to ensure there's a definition in the DLL even if it's not
|
|||
|
used there. If the declaration occurs in a header file that's widely used, this
|
|||
|
can save significant compilation time and output size. It also reduces the
|
|||
|
number of functions exported by the DLL similarly to what
|
|||
|
``-fvisibility-inlines-hidden`` does for shared objects on ELF and Mach-O.
|
|||
|
Since the function declaration comes with an inline definition, users of the
|
|||
|
library can use that definition directly instead of importing it from the DLL.
|
|||
|
|
|||
|
Note that the Microsoft Visual C++ compiler does not support this option, and
|
|||
|
if code in a DLL is compiled with ``/Zc:dllexportInlines-``, the code using the
|
|||
|
DLL must be compiled in the same way so that it doesn't attempt to dllimport
|
|||
|
the inline member functions. The reverse scenario should generally work though:
|
|||
|
a DLL compiled without this flag (such as a system library compiled with Visual
|
|||
|
C++) can be referenced from code compiled using the flag, meaning that the
|
|||
|
referencing code will use the inline definitions instead of importing them from
|
|||
|
the DLL.
|
|||
|
|
|||
|
Also note that like when using ``-fvisibility-inlines-hidden``, the address of
|
|||
|
`S::foo()` will be different inside and outside the DLL, breaking the C/C++
|
|||
|
standard requirement that functions have a unique address.
|
|||
|
|
|||
|
The flag does not apply to explicit class template instantiation definitions or
|
|||
|
declarations, as those are typically used to explicitly provide a single
|
|||
|
definition in a DLL, (dllexported instantiation definition) or to signal that
|
|||
|
the definition is available elsewhere (dllimport instantiation declaration). It
|
|||
|
also doesn't apply to inline members with static local variables, to ensure
|
|||
|
that the same instance of the variable is used inside and outside the DLL.
|
|||
|
|
|||
|
Using this flag can cause problems when inline functions that would otherwise
|
|||
|
be dllexported refer to internal symbols of a DLL. For example:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
void internal();
|
|||
|
|
|||
|
struct __declspec(dllimport) S {
|
|||
|
void foo() { internal(); }
|
|||
|
}
|
|||
|
|
|||
|
Normally, references to `S::foo()` would use the definition in the DLL from
|
|||
|
which it was exported, and which presumably also has the definition of
|
|||
|
`internal()`. However, when using ``/Zc:dllexportInlines-``, the inline
|
|||
|
definition of `S::foo()` is used directly, resulting in a link error since
|
|||
|
`internal()` is not available. Even worse, if there is an inline definition of
|
|||
|
`internal()` containing a static local variable, we will now refer to a
|
|||
|
different instance of that variable than in the DLL:
|
|||
|
|
|||
|
.. code-block:: c
|
|||
|
|
|||
|
inline int internal() { static int x; return x++; }
|
|||
|
|
|||
|
struct __declspec(dllimport) S {
|
|||
|
int foo() { return internal(); }
|
|||
|
}
|
|||
|
|
|||
|
This could lead to very subtle bugs. Using ``-fvisibility-inlines-hidden`` can
|
|||
|
lead to the same issue. To avoid it in this case, make `S::foo()` or
|
|||
|
`internal()` non-inline, or mark them `dllimport/dllexport` explicitly.
|
|||
|
|
|||
|
The /fallback Option
|
|||
|
^^^^^^^^^^^^^^^^^^^^
|
|||
|
|
|||
|
When clang-cl is run with the ``/fallback`` option, it will first try to
|
|||
|
compile files itself. For any file that it fails to compile, it will fall back
|
|||
|
and try to compile the file by invoking cl.exe.
|
|||
|
|
|||
|
This option is intended to be used as a temporary means to build projects where
|
|||
|
clang-cl cannot successfully compile all the files. clang-cl may fail to compile
|
|||
|
a file either because it cannot generate code for some C++ feature, or because
|
|||
|
it cannot parse some Microsoft language extension.
|