License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that were:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
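The decision rules above can be sketched in a few lines of Python. This is only an illustration, not the tooling that was actually used: the function name `conclude_license`, the `MANUAL_REVIEW` marker, and the representation of a scanner result as an optional string are all assumptions made for the example, and the extra uapi handling for files that already carry license text is left out.

```python
from typing import Optional

DEFAULT_LICENSE = "GPL-2.0"  # top level COPYING file license
UAPI_DEFAULT = "GPL-2.0 WITH Linux-syscall-note"
MANUAL_REVIEW = "MANUAL-REVIEW"  # hypothetical marker, not an SPDX identifier


def conclude_license(scan_a: Optional[str],
                     scan_b: Optional[str],
                     is_uapi: bool) -> str:
    """Combine two scanner results (None means nothing was detected)."""
    # Neither scanner found license traces: the kernel default applies.
    if scan_a is None and scan_b is None:
        return UAPI_DEFAULT if is_uapi else DEFAULT_LICENSE
    # Both scanners agree: that becomes the concluded license.
    if scan_a == scan_b:
        return scan_a
    # One scanner detected nothing, or they detected different licenses.
    return MANUAL_REVIEW


print(conclude_license(None, None, is_uapi=False))
print(conclude_license("GPL-2.0+", "GPL-2.0", is_uapi=False))
```

Anything that comes back as `MANUAL_REVIEW` corresponds to the manual inspection step, and to the lawyers when that inspection was not conclusive.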
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
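A minimal sketch of the comment-style selection described above, assuming the kernel convention that .c files take a `//` comment, headers take a `/* */` comment, and Makefiles take a `#` comment. The helper `spdx_line` is hypothetical and is not the script that Thomas and Greg wrote; it only shows why the file type has to be detected before the tag can be added.

```python
import os


def spdx_line(path: str, license_id: str) -> str:
    """Return the SPDX tag line in the comment style that `path` expects."""
    tag = f"SPDX-License-Identifier: {license_id}"
    name = os.path.basename(path)
    if name.endswith(".c"):
        return f"// {tag}"       # C source files
    if name.endswith(".h"):
        return f"/* {tag} */"    # C header files
    if name.startswith("Makefile"):
        return f"# {tag}"        # Makefiles
    # Other file types are deliberately left out of this sketch.
    raise ValueError(f"no comment style known for {path}")


print(spdx_line("arch/x86/Makefile", "GPL-2.0"))
```

For `arch/x86/Makefile` this yields the same first line that this patch adds to the file.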
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
# SPDX-License-Identifier: GPL-2.0
2007-10-26 01:42:04 +08:00
# Unified Makefile for i386 and x86_64
2007-10-26 02:31:19 +08:00
# select defconfig based on actual architecture
2007-11-13 03:14:19 +08:00
ifeq ($(ARCH),x86)
2012-12-21 05:51:55 +08:00
  ifeq ($(shell uname -m),x86_64)
        KBUILD_DEFCONFIG := x86_64_defconfig
  else
2007-11-13 03:14:19 +08:00
        KBUILD_DEFCONFIG := i386_defconfig
2012-12-21 05:51:55 +08:00
  endif
2007-11-13 03:14:19 +08:00
else
        KBUILD_DEFCONFIG := $(ARCH)_defconfig
endif
2007-10-26 02:31:19 +08:00
2017-06-22 07:28:05 +08:00
# For gcc stack alignment is specified with -mpreferred-stack-boundary,
# clang has the option -mstack-alignment for that purpose.
ifneq ($(call cc-option, -mpreferred-stack-boundary=4),)
2017-08-17 08:47:40 +08:00
      cc_stack_align4 := -mpreferred-stack-boundary=2
      cc_stack_align8 := -mpreferred-stack-boundary=3
else ifneq ($(call cc-option, -mstack-alignment=16),)
      cc_stack_align4 := -mstack-alignment=4
      cc_stack_align8 := -mstack-alignment=8
2017-06-22 07:28:05 +08:00
endif
2014-01-08 19:21:20 +08:00
# How to compile the 16-bit code. Note we always compile for -march=i386;
# that way we can complain to the user if the CPU is insufficient.
2014-01-29 20:16:47 +08:00
#
# The -m16 option is supported by GCC >= 4.9 and clang >= 3.5. For
2014-06-05 04:16:48 +08:00
# older versions of GCC, include an *assembly* header to make sure that
# gcc doesn't play any games behind our back.
CODE16GCC_CFLAGS := -m32 -Wa,$(srctree)/arch/x86/boot/code16gcc.h
2014-01-29 20:16:47 +08:00
M16_CFLAGS := $(call cc-option, -m16, $(CODE16GCC_CFLAGS))
2018-03-16 16:49:44 +08:00
REALMODE_CFLAGS := $(M16_CFLAGS) -g -Os -DDISABLE_BRANCH_PROFILING \
2014-01-08 19:21:20 +08:00
                   -Wall -Wstrict-prototypes -march=i386 -mregparm=3 \
                   -fno-strict-aliasing -fomit-frame-pointer -fno-pic \
2017-06-22 07:28:04 +08:00
                   -mno-mmx -mno-sse
REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), -ffreestanding)
REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), -fno-stack-protector)
2017-08-18 02:20:47 +08:00
REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), $(cc_stack_align4))
2014-01-08 19:21:20 +08:00
export REALMODE_CFLAGS
2008-01-30 20:32:20 +08:00
# BITS is used as extension for files which are available in a 32 bit
# and a 64 bit version to simplify shared Makefiles.
# e.g.: obj-y += foo_$(BITS).o
export BITS
2007-10-26 01:42:04 +08:00
2013-07-09 00:15:17 +08:00
ifdef CONFIG_X86_NEED_RELOCS
        LDFLAGS_vmlinux := --emit-relocs
endif
2015-07-22 00:27:18 +08:00
#
# Prevent GCC from generating any FP code by mistake.
#
# This must happen before we try the -mpreferred-stack-boundary, see:
#
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
#
KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
2007-11-13 03:14:19 +08:00
ifeq ($(CONFIG_X86_32),y)
2008-01-30 20:32:20 +08:00
BITS := 32
2007-11-20 06:58:57 +08:00
UTS_MACHINE := i386
2008-01-30 20:32:23 +08:00
CHECKFLAGS += -D__i386__
2008-01-30 20:32:20 +08:00
2008-01-30 20:32:23 +08:00
biarch := $(call cc-option,-m32)
KBUILD_AFLAGS += $(biarch)
KBUILD_CFLAGS += $(biarch)
2008-01-30 20:32:20 +08:00
KBUILD_CFLAGS += -msoft-float -mregparm=3 -freg-struct-return
2012-08-11 02:49:06 +08:00
# Never want PIC in a 32-bit kernel, prevent breakage with GCC built
# with nonstandard options
KBUILD_CFLAGS += -fno-pic
2017-06-22 07:28:05 +08:00
# Align the stack to the register width instead of using the default
# alignment of 16 bytes. This reduces stack usage and the number of
# alignment instructions.
2017-08-18 02:20:47 +08:00
KBUILD_CFLAGS += $(call cc-option,$(cc_stack_align4))
2008-01-30 20:32:20 +08:00
# CPU-specific tuning. Anything which can be shared with UML should go here.
2015-03-27 19:43:36 +08:00
include arch/x86/Makefile_32.cpu
2008-01-30 20:32:20 +08:00
KBUILD_CFLAGS += $(cflags-y)
# temporary until string.h is fixed
KBUILD_CFLAGS += -ffreestanding
2007-10-26 01:42:04 +08:00
else
2008-01-30 20:32:20 +08:00
BITS := 64
2007-11-20 06:58:57 +08:00
UTS_MACHINE := x86_64
2018-05-31 04:48:38 +08:00
CHECKFLAGS += -D__x86_64__
2008-01-30 20:32:20 +08:00
2014-05-08 05:05:52 +08:00
biarch := -m64
2008-01-30 20:32:20 +08:00
KBUILD_AFLAGS += -m64
KBUILD_CFLAGS += -m64
x86: Align jump targets to 1-byte boundaries
The following NOP in a hot function caught my attention:
> 5a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
That's a dead NOP that bloats the function a bit, added for the
default 16-byte alignment that GCC applies for jump targets.
I realize that x86 CPU manufacturers recommend 16-byte jump
target alignments (it's in the Intel optimization manual),
to help their relatively narrow decoder prefetch alignment
and uop cache constraints, but the cost of that is very
significant:
text        data     bss      dec       filename
12566391    1617840  1089536  15273767  vmlinux.align.16-byte
12224951    1617840  1089536  14932327  vmlinux.align.1-byte
By using 1-byte jump target alignment (i.e. no alignment at all)
we get an almost 3% reduction in kernel size (!) - and a
probably similar reduction in I$ footprint.
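The "almost 3%" figure can be rechecked from the text column of the size table. The snippet below is only a worked calculation; the helper name `text_reduction_percent` is made up for the example.

```python
def text_reduction_percent(before: int, after: int) -> float:
    """Relative shrink of the .text size, in percent of the original."""
    return 100.0 * (before - after) / before


# .text sizes from the table above: 16-byte vs. 1-byte jump alignment
saved_bytes = 12566391 - 12224951
percent = text_reduction_percent(12566391, 12224951)
print(saved_bytes, round(percent, 2))
```

That is 341440 bytes of text saved, or roughly 2.7% of the 16-byte-aligned build.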
Now, the usual justification for jump target alignment is the
following:
- modern decoders tend to have 16-byte (effective) decoder
prefetch windows. (AMD documents it higher but measurements
suggest the effective prefetch window on current uarchs is
still around 16 bytes)
- on Intel there's also the uop-cache with cachelines that have
16-byte granularity and limited associativity.
- older x86 uarchs had a penalty for decoder fetches that crossed
16-byte boundaries. These limits are mostly gone from recent
uarchs.
So if a forward jump target is aligned to cacheline boundary then
prefetches will start from a new prefetch-cacheline and there's
higher chance for decoding in fewer steps and packing tightly.
But I think that argument is flawed for typical optimized kernel
code flows: forward jumps often go to 'cold' (uncommon) pieces
of code, and aligning cold code to cache lines does not bring a
lot of advantages (they are uncommon), while it causes
collateral damage:
- their alignment 'spreads out' the cache footprint, it shifts
followup hot code further out
- plus it slows down even 'cold' code that immediately follows 'hot'
code (like in the above case), which could have benefited from the
partial cacheline that comes off the end of hot code.
But even in the cache-hot case the 16 byte alignment brings
disadvantages:
- it spreads out the cache footprint, possibly making the code
fall out of the L1 I$.
- On Intel CPUs, recent microarchitectures have plenty of
uop cache (typically doubling every 3 years) - while the
size of the L1 cache grows much less aggressively. So
workloads are rarely uop cache limited.
The only situation where alignment might matter are tight
loops that could fit into a single 16 byte chunk - but those
are pretty rare in the kernel: if they exist they tend
to be pointer chasing or generic memory ops, which both tend
to be cache miss (or cache allocation) intensive and are not
decoder bandwidth limited.
So the balance of arguments strongly favors packing kernel
instructions tightly versus maximizing for decoder bandwidth:
this patch changes the jump target alignment from 16 bytes
to 1 byte (tightly packed, unaligned).
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-10 20:08:46 +08:00
# Align jump targets to 1 byte, not the default 16 bytes:
2017-04-14 01:26:09 +08:00
KBUILD_CFLAGS += $(call cc-option,-falign-jumps=1)
2015-05-17 13:56:54 +08:00
# Pack loops tightly as well:
2017-04-14 01:26:09 +08:00
KBUILD_CFLAGS += $(call cc-option,-falign-loops=1)
2015-05-17 13:56:54 +08:00
2014-09-11 00:05:39 +08:00
# Don't autogenerate traditional x87 instructions
2014-04-22 13:40:27 +08:00
KBUILD_CFLAGS += $(call cc-option,-mno-80387)
KBUILD_CFLAGS += $(call cc-option,-mno-fp-ret-in-387)
2013-11-21 05:31:49 +08:00
2017-06-22 07:28:05 +08:00
# By default gcc and clang use a stack alignment of 16 bytes for x86.
# However the standard kernel entry on x86-64 leaves the stack on an
# 8-byte boundary. If the compiler isn't informed about the actual
# alignment it will generate extra alignment instructions for the
# default alignment which keep the stack *mis*aligned.
# Furthermore an alignment to the register width reduces stack usage
# and the number of alignment instructions.
2017-08-18 02:20:47 +08:00
KBUILD_CFLAGS += $(call cc-option,$(cc_stack_align8))
2012-05-30 05:31:23 +08:00
2014-12-18 10:05:29 +08:00
# Use -mskip-rax-setup if supported.
KBUILD_CFLAGS += $(call cc-option,-mskip-rax-setup)
2008-01-30 20:32:20 +08:00
# FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
cflags-$(CONFIG_MCORE2) += \
        $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
2009-08-22 05:06:23 +08:00
cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
        $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
2008-01-30 20:32:20 +08:00
cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
KBUILD_CFLAGS += $(cflags-y)
KBUILD_CFLAGS += -mno-red-zone
KBUILD_CFLAGS += -mcmodel=kernel
# -funit-at-a-time shrinks the kernel .text considerably
# unfortunately it makes reading oopses harder.
KBUILD_CFLAGS += $(call cc-option,-funit-at-a-time)
2009-02-09 21:17:39 +08:00
endif
2008-01-30 20:32:20 +08:00
2012-02-28 06:09:10 +08:00
ifdef CONFIG_X86_X32
        x32_ld_ok := $(call try-run,\
                        /bin/echo -e '1: .quad 1b' | \
2012-10-02 22:42:36 +08:00
                        $(CC) $(KBUILD_AFLAGS) -c -x assembler -o "$$TMP" - && \
2012-02-28 06:09:10 +08:00
                        $(OBJCOPY) -O elf32-x86-64 "$$TMP" "$$TMPO" && \
                        $(LD) -m elf32_x86_64 "$$TMPO" -o "$$TMP",y,n)
2012-02-28 17:35:06 +08:00
        ifeq ($(x32_ld_ok),y)
                CONFIG_X86_X32_ABI := y
                KBUILD_AFLAGS += -DCONFIG_X86_X32_ABI
                KBUILD_CFLAGS += -DCONFIG_X86_X32_ABI
        else
                $(warning CONFIG_X86_X32 enabled but no binutils support)
        endif
2012-02-28 06:09:10 +08:00
endif
export CONFIG_X86_X32_ABI
2017-03-17 03:31:33 +08:00
#
# If the function graph tracer is used with mcount instead of fentry,
# '-maccumulate-outgoing-args' is needed to prevent a GCC bug
# (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109)
#
ifdef CONFIG_FUNCTION_GRAPH_TRACER
  ifndef CONFIG_HAVE_FENTRY
        ACCUMULATE_OUTGOING_ARGS := 1
  else
    ifeq ($(call cc-option-yn, -mfentry), n)
        ACCUMULATE_OUTGOING_ARGS := 1
2017-04-19 05:44:29 +08:00
        # GCC ignores '-maccumulate-outgoing-args' when used with '-Os'.
        # If '-Os' is enabled, disable it and print a warning.
        ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
          undefine CONFIG_CC_OPTIMIZE_FOR_SIZE
2017-05-24 00:27:54 +08:00
          $(warning Disabling CONFIG_CC_OPTIMIZE_FOR_SIZE. Your compiler does not have -mfentry so you cannot optimize for size with CONFIG_FUNCTION_GRAPH_TRACER.)
2017-04-19 05:44:29 +08:00
        endif
2017-03-17 03:31:33 +08:00
    endif
  endif
endif
ifeq ($(ACCUMULATE_OUTGOING_ARGS), 1)
2017-05-09 11:29:46 +08:00
        # This compiler flag is not supported by Clang:
        KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args,)
2017-03-17 03:31:33 +08:00
endif
2008-01-30 20:32:20 +08:00
# The stack pointer is addressed differently for 32 bit and 64 bit x86
sp-$(CONFIG_X86_32) := esp
sp-$(CONFIG_X86_64) := rsp
2015-10-06 08:47:57 +08:00
# do binutils support CFI?
cfi := $(call as-instr,.cfi_startproc\n.cfi_rel_offset $(sp-y)$(comma)0\n.cfi_endproc,-DCONFIG_AS_CFI=1)
# is .cfi_signal_frame supported too?
cfi-sigframe := $(call as-instr,.cfi_startproc\n.cfi_signal_frame\n.cfi_endproc,-DCONFIG_AS_CFI_SIGNAL_FRAME=1)
cfi-sections := $(call as-instr,.cfi_sections .debug_frame,-DCONFIG_AS_CFI_SECTIONS=1)
2010-10-14 07:00:29 +08:00
# does binutils support specific instructions?
asinstr := $(call as-instr,fxsaveq (%rax),-DCONFIG_AS_FXSAVEQ=1)
2015-01-23 16:29:50 +08:00
asinstr += $(call as-instr,pshufb %xmm0$(comma)%xmm0,-DCONFIG_AS_SSSE3=1)
2012-05-22 11:54:04 +08:00
avx_instr := $(call as-instr,vxorps %ymm0$(comma)%ymm1$(comma)%ymm2,-DCONFIG_AS_AVX=1)
2012-11-09 05:47:44 +08:00
avx2_instr := $(call as-instr,vpbroadcastb %xmm0$(comma)%ymm1,-DCONFIG_AS_AVX2=1)
2016-08-13 09:03:19 +08:00
avx512_instr := $(call as-instr,vpmovm2b %k1$(comma)%zmm5,-DCONFIG_AS_AVX512=1)
2015-09-11 06:27:26 +08:00
sha1_ni_instr := $(call as-instr,sha1msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA1_NI=1)
sha256_ni_instr := $(call as-instr,sha256msg1 %xmm0$(comma)%xmm1,-DCONFIG_AS_SHA256_NI=1)
2010-10-14 07:00:29 +08:00
2016-08-13 09:03:19 +08:00
KBUILD_AFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr)
KBUILD_CFLAGS += $(cfi) $(cfi-sigframe) $(cfi-sections) $(asinstr) $(avx_instr) $(avx2_instr) $(avx512_instr) $(sha1_ni_instr) $(sha256_ni_instr)
2008-01-30 20:32:20 +08:00
2018-08-24 07:20:39 +08:00
KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE)
2008-01-30 20:32:21 +08:00
2018-03-20 04:57:46 +08:00
#
# The 64-bit kernel must be aligned to 2MB. Pass -z max-page-size=0x200000 to
# the linker to force 2MB page size regardless of the default page size used
# by the linker.
#
ifdef CONFIG_X86_64
2018-08-24 07:20:39 +08:00
KBUILD_LDFLAGS += $(call ld-option, -z max-page-size=0x200000)
2018-03-20 04:57:46 +08:00
endif
2008-01-30 20:32:21 +08:00
# Workaround for a gcc prerelease that unfortunately was shipped in a suse release
KBUILD_CFLAGS += -Wno-sign-compare
#
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
2008-01-30 20:32:20 +08:00
x86/retpoline: Add initial retpoline support
Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
the corresponding thunks. Provide assembler macros for invoking the thunks
in the same way that GCC does, from native and inline assembler.
This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
some circumstances, IBRS microcode features may be used instead, and the
retpoline can be disabled.
On AMD CPUs if lfence is serialising, the retpoline can be dramatically
simplified to a simple "lfence; jmp *\reg". A future patch, after it has
been verified that lfence really is serialising in all circumstances, can
enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
to X86_FEATURE_RETPOLINE.
Do not align the retpoline in the altinstr section, because there is no
guarantee that it stays aligned when it's copied over the oldinstr during
alternative patching.
[ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
[ tglx: Put actual function CALL/JMP in front of the macros, convert to
symbolic labels ]
[ dwmw2: Convert back to numeric labels, merge objtool fixes ]
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: thomas.lendacky@amd.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
2018-01-12 05:46:25 +08:00
# Avoid indirect branches in kernel to deal with Spectre
ifdef CONFIG_RETPOLINE
2018-11-02 16:45:41 +08:00
KBUILD_CFLAGS += $(RETPOLINE_CFLAGS)
endif
2012-10-16 03:16:56 +08:00
archscripts: scripts_basic
2012-05-09 02:22:24 +08:00
	$(Q)$(MAKE) $(build)=arch/x86/tools relocs
2011-11-12 08:07:41 +08:00
###
# Syscall table generation
archheaders:
2015-06-04 00:36:41 +08:00
	$(Q)$(MAKE) $(build)=arch/x86/entry/syscalls all
2011-11-12 08:07:41 +08:00
kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs
Using macros in inline assembly allows us to work around bugs
in GCC's inlining decisions.
Compile macros.S and use it to assemble all C files.
Currently only x86 will use it.
Background:
The inlining pass of GCC doesn't include an assembler, so it's not aware
of basic properties of the generated code, such as its size in bytes,
or that there are such things as discontiguous blocks of code and data
due to the newfangled linker feature called 'sections' ...
Instead GCC uses a lazy and fragile heuristic: it does a linear count of
certain syntactic and whitespace elements in inlined assembly block source
code, such as a count of new-lines and semicolons (!), as a poor substitute
for "code size and complexity".
Unsurprisingly this heuristic falls over and breaks its neck with certain
common types of kernel code that use inline assembly, such as the frequent
practice of putting useful information into alternative sections.
As a result of this fresh, 20+ years old GCC bug, GCC's inlining decisions
are effectively disabled for inlined functions that make use of such asm()
blocks, because GCC thinks those sections of code are "large" - when in
reality they often result in just a very low number of machine
instructions.
This absolute lack of inlining prowess when GCC comes across such asm()
blocks both increases generated kernel code size and causes performance
overhead, which is particularly noticeable on paravirt kernels, which make
frequent use of these inlining facilities in an attempt to stay out of the
way when running on baremetal hardware.
Instead of fixing the compiler we use a workaround: we set an assembly macro
and call it from the inlined assembly block. As a result GCC considers the
inline assembly block as a single instruction. (Which it often isn't but I digress.)
This uglifies and bloats the source code - for example just the refcount
related changes have this impact:
Makefile                 | 9 +++++++--
arch/x86/Makefile        | 7 +++++++
arch/x86/kernel/macros.S | 7 +++++++
scripts/Kbuild.include   | 4 +++-
scripts/mod/Makefile     | 2 ++
5 files changed, 26 insertions(+), 3 deletions(-)
Yay readability and maintainability, it's not like assembly code is hard to read
and maintain ...
We also hope that GCC will eventually get fixed, but we are not holding
our breath for that. Yet we are optimistic, it might still happen, any decade now.
[ mingo: Wrote new changelog describing the background. ]
Tested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kbuild@vger.kernel.org
Link: http://lkml.kernel.org/r/20181003213100.189959-3-namit@vmware.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-10-04 05:30:52 +08:00
archmacros:
	$(Q)$(MAKE) $(build)=arch/x86/kernel arch/x86/kernel/macros.s
2018-10-24 07:11:25 +08:00
ASM_MACRO_FLAGS = -Wa,arch/x86/kernel/macros.s
2018-10-04 05:30:52 +08:00
export ASM_MACRO_FLAGS
KBUILD_CFLAGS += $(ASM_MACRO_FLAGS)
2008-01-30 20:32:20 +08:00
###
# Kernel objects
2008-02-23 16:58:20 +08:00
head-y := arch/x86/kernel/head_$(BITS).o
head-y += arch/x86/kernel/head$(BITS).o
2016-04-14 08:04:43 +08:00
head-y += arch/x86/kernel/ebda.o
2016-04-14 08:04:34 +08:00
head-y += arch/x86/kernel/platform-quirks.o
2008-01-30 20:32:20 +08:00
libs-y += arch/x86/lib/
2009-04-16 03:34:55 +08:00
# See arch/x86/Kbuild for content of core part of the kernel
core-y += arch/x86/
2008-01-30 20:32:20 +08:00
# drivers-y are linked after core-y
drivers-$(CONFIG_MATH_EMULATION) += arch/x86/math-emu/
drivers-$(CONFIG_PCI) += arch/x86/pci/
# must be linked after kernel/
drivers-$(CONFIG_OPROFILE) += arch/x86/oprofile/
2008-02-10 06:24:09 +08:00
# suspend and hibernation support
2008-01-30 20:32:20 +08:00
drivers-$(CONFIG_PM) += arch/x86/power/
2008-02-10 06:24:09 +08:00
2008-01-30 20:32:20 +08:00
drivers-$(CONFIG_FB) += arch/x86/video/
####
# boot loader support. Several targets are kept for legacy purposes
boot := arch/x86/boot
2009-04-18 01:46:37 +08:00
BOOT_TARGETS = bzlilo bzdisk fdimage fdimage144 fdimage288 isoimage
2009-03-13 03:50:33 +08:00
PHONY += bzImage $(BOOT_TARGETS)
2008-01-30 20:32:20 +08:00
# Default kernel to build
all: bzImage
# KBUILD_IMAGE specify target image being built
2009-03-13 03:50:33 +08:00
KBUILD_IMAGE := $(boot)/bzImage
2008-01-30 20:32:20 +08:00
2009-03-13 03:50:33 +08:00
bzImage: vmlinux
2009-08-14 04:34:21 +08:00
ifeq ($(CONFIG_X86_DECODER_SELFTEST),y)
	$(Q)$(MAKE) $(build)=arch/x86/tools posttest
endif
2008-01-30 20:32:20 +08:00
	$(Q)$(MAKE) $(build)=$(boot) $(KBUILD_IMAGE)
	$(Q)mkdir -p $(objtree)/arch/$(UTS_MACHINE)/boot
2008-04-22 23:29:26 +08:00
	$(Q)ln -fsn ../../x86/boot/bzImage $(objtree)/arch/$(UTS_MACHINE)/boot/$@
2008-01-30 20:32:20 +08:00
2009-03-13 03:50:33 +08:00
$(BOOT_TARGETS): vmlinux
	$(Q)$(MAKE) $(build)=$(boot) $@
2008-01-30 20:32:20 +08:00
2009-04-18 01:46:37 +08:00
PHONY += install
install:
	$(Q)$(MAKE) $(build)=$(boot) $@
2008-01-30 20:32:20 +08:00
PHONY += vdso_install
vdso_install:
2015-06-04 00:05:44 +08:00
	$(Q)$(MAKE) $(build)=arch/x86/entry/vdso $@
2008-01-30 20:32:20 +08:00
2018-08-30 03:43:17 +08:00
archprepare: checkbin
checkbin:
ifndef CC_HAVE_ASM_GOTO
	@echo Compiler lacks asm-goto support.
	@exit 1
endif
2018-12-05 14:27:19 +08:00
ifdef CONFIG_RETPOLINE
ifeq ($(RETPOLINE_CFLAGS),)
	@echo "You are building kernel with non-retpoline compiler." >&2
	@echo "Please update your compiler." >&2
	@false
endif
endif
2018-08-30 03:43:17 +08:00
2008-01-30 20:32:20 +08:00
archclean:
	$(Q)rm -rf $(objtree)/arch/i386
	$(Q)rm -rf $(objtree)/arch/x86_64
	$(Q)$(MAKE) $(clean)=$(boot)
2012-05-22 01:51:24 +08:00
	$(Q)$(MAKE) $(clean)=arch/x86/tools
2008-01-30 20:32:20 +08:00
define archhelp
2008-01-30 20:32:49 +08:00
echo '* bzImage - Compressed kernel image (arch/x86/boot/bzImage)'
echo ' install - Install kernel using'
2009-07-21 03:37:11 +08:00
echo ' (your) ~/bin/$(INSTALLKERNEL) or'
echo ' (distribution) /sbin/$(INSTALLKERNEL) or'
2008-01-30 20:32:49 +08:00
echo ' install to $$(INSTALL_PATH) and run lilo'
echo ' fdimage - Create 1.4MB boot floppy image (arch/x86/boot/fdimage)'
echo ' fdimage144 - Create 1.4MB boot floppy image (arch/x86/boot/fdimage)'
echo ' fdimage288 - Create 2.8MB boot floppy image (arch/x86/boot/fdimage)'
echo ' isoimage - Create a boot CD-ROM image (arch/x86/boot/image.iso)'
echo ' bzdisk/fdimage*/isoimage also accept:'
echo ' FDARGS="..." arguments for the booted kernel'
echo ' FDINITRD=file initrd for the booted kernel'
2008-01-30 20:32:20 +08:00
endef