Pull x86 timer updates from Thomas Gleixner:
"These updates are related to TSC handling:
- Support platforms which have synchronized TSCs but the boot CPU has
a non zero TSC_ADJUST value, which is considered a firmware bug on
normal systems.
This applies to HPE/SGI UV platforms where the platform firmware
uses TSC_ADJUST to ensure TSC synchronization across a huge number
of sockets, but due to power on timings the boot CPU cannot be
guaranteed to have a zero TSC_ADJUST register value.
- Fix the ordering of udelay calibration and kvmclock_init()
- Cleanup the udelay and calibration code"
* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Mark cyc2ns_init() and detect_art() __init
x86/platform/UV: Mark tsc_check_sync as an init function
x86/tsc: Make CONFIG_X86_TSC=n build work again
x86/platform/UV: Add check of TSC state set by UV BIOS
x86/tsc: Provide a means to disable TSC ART
x86/tsc: Drastically reduce the number of firmware bug warnings
x86/tsc: Skip TSC test and error messages if already unstable
x86/tsc: Add option that TSC on Socket 0 being non-zero is valid
x86/timers: Move simple_udelay_calibration() past kvmclock_init()
x86/timers: Make recalibrate_cpu_khz() void
x86/timers: Move the simple udelay calibration to tsc.h
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Insert a check early in UV system startup that checks whether BIOS was
able to obtain satisfactory TSC Sync stability. If not, it usually
is caused by an error in the external TSC clock generation source.
In this case the best fallback is to use the builtin hardware RTC
as the kernel will not be able to set an accurate TSC sync either.
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Reviewed-by: Russ Anderson <russ.anderson@hpe.com>
Reviewed-by: Andrew Banman <andrew.abanman@hpe.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Bin Gao <bin.gao@linux.intel.com>
Link: https://lkml.kernel.org/r/20171012163202.406294490@stormcage.americas.sgi.com
Rather than passing all the contents of flush_tlb_info to
flush_tlb_others(), pass a pointer to the structure directly. For
consistency, this also removes the unnecessary cpu parameter from
uv_flush_tlb_others() to make its signature match the other
*flush_tlb_others() functions.
This serves two purposes:
- It will dramatically simplify future patches that change struct
flush_tlb_info, which I'm planning to do.
- struct flush_tlb_info is an adequate description of what to do
for a local flush, too, so by reusing it we can remove duplicated
code between local and remove flushes in a future patch.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
[ Fix build warning. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 platform updates from Ingo Molnar:
"Most of the commits are continued SGI UV4 hardware-enablement changes,
plus there's also new Bluetooth support for the Intel Edison platform"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/intel-mid: Enable Bluetooth support on Intel Edison
x86/platform/uv/BAU: Implement uv4_wait_completion with read_status
x86/platform/uv/BAU: Add wait_completion to bau_operations
x86/platform/uv/BAU: Add status mmr location fields to bau_control
x86/platform/uv/BAU: Cleanup bau_operations declaration and instances
x86/platform/uv/BAU: Add payload descriptor qualifier
x86/platform/uv/BAU: Add uv_bau_version enumerated constants
The calculation of the global physical address (GPA) on UV4 is
incorrect. The gnode_extra/upper global offset should only be
applied for fixed address space systems (UV1..3).
Tested-by: John Estabrook <john.estabrook@hpe.com>
Signed-off-by: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: John Estabrook <estabrook@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170321231646.667689538@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove the present wait_completion routine and add a function pointer by
the same name to the bau_operations struct. Rather than switching on the
UV hub version during message processing, set the architecture-specific
uv*_wait_completion during initialization.
The uv123_bau_ops struct must be split into uv1 and uv2_3 versions to
accommodate the corresponding wait_completion routines.
Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: sivanich@hpe.com
Cc: rja@hpe.com
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1489077734-111753-6-git-send-email-abanman@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The location of the ERROR and BUSY status bits depends on the descriptor
index, i.e. the CPU, of the message. Since this index does not change,
there is no need to calculate the mmr and index location during message
processing. The less work we do in the hot path the better.
Add status_mmr and status_index fields to bau_control and compute their
values during initialization. Add kerneldoc descriptions for the new
fields. Update uv*_wait_completion to use these fields rather than
receiving the information as parameters.
Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: sivanich@hpe.com
Cc: rja@hpe.com
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1489077734-111753-5-git-send-email-abanman@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the bau_operations declaration after bau struct declarations so the
bau structs can be referenced when adding new functions to
bau_operations. That way we avoid forward declarations of the bau
structs.
Likewise, move uv*_bau_ops structs down to avoid forward declarations of
new functions defined in the same file. Declare these structs __initconst
since they are only used during initialization. Similarly, declare the
bau_operations ops instance __ro_after_init as it is read-only after
initialization.
This is a preparatory patch for adding wait_completion to bau_operations.
Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: sivanich@hpe.com
Cc: rja@hpe.com
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1489077734-111753-4-git-send-email-abanman@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
On UV4, the destination agent verifies each message by checking the
descriptor qualifier field of the message payload. Messages without this
field set to 0x534749 will cause a hub error to assert. Split
bau_message_payload into uv1_2_3 and uv4 versions to account for the
different payload formats.
Enforce the size of each field by using the appropriate u** integer type.
Replace extraneous comments with KernelDoc comment.
Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: sivanich@hpe.com
Cc: rja@hpe.com
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/1489077734-111753-3-git-send-email-abanman@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Add the UV4-specific function definitions and define an operations struct
to implement them in the BAU driver.
Many BAU MMRs, although functionally the same, have new addresses on UV4
due to hardware changes. Each MMR requires new read/write functions, but
their implementation in the driver does not change. Thus, it is enough to
enumerate them in the operations struct for the changes to take effect.
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-11-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Many BAU functions have different implementations depending on the UV
version. Rather than switching on the uvhub_version throughout the driver,
we can define a set of operations for each version. This is especially
beneficial for UV4, which will require many new MMR read/write functions.
Currently, the set of abstracted functions are the same for UV1, UV2, and
UV3. The functions were chosen because each one will have a different
implementation for UV4. Other functions will be added as needed to handle
new implementations or to cleanup the existing differences between UV1,
UV2, and UV3, i.e. read_status and wait_completion.
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-6-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The BAU driver should use the functions provided by uv_hub.h rather than
its own implementations. uv_physnodeaddr converts vaddrs to paddrs for
BAU MMR fields, but this is done better by uv_gpa_to_offset.
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-5-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The payload queue first MMR requires the physical memory address and hub
GNODE of where the payload queue resides in memory, but the associated
variables are named as if the PNODE were used. Rename gnode-related
variables and clarify the definitions of the payload queue head, last, and
tail pointers.
Signed-off-by: Andrew Banman <abanman@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <travis@sgi.com>
Acked-by: Dimitri Sivanich <sivanich@sgi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akpm@linux-foundation.org
Cc: rja@sgi.com
Link: http://lkml.kernel.org/r/1474474161-265604-4-git-send-email-abanman@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are some circumstances where the UV4 BIOS cannot provide the
correct Proximity Node values to associate with specific Sockets and
Physical Nodes. The decision was made to remove these values from BIOS
and for the kernel to get these values from the standard ACPI tables.
Tested-by: Frank Ramsay <framsay@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Nathan Zimmer <nzimmer@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160801184050.414210079@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Use no-op messages in place of cross-partition interrupts when nacking a
put message in the GRU. This allows us to remove MMR's as a destination
from the GRU driver.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215406.012228480@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch builds support for the new conversions of physical addresses
to and from sockets, pnodes and nodes in UV4. It is designed to be as
efficient as possible as lookups are done inside an interrupt context
in some cases. It will be further optimized when physical hardware is
available to measure execution time.
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215405.841051741@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
An aspect of the UV4 system architecture changes involve changing the
way sockets, nodes, and pnodes are translated between one another.
Decode the information from the BIOS provided EFI system table to build
the needed conversion tables.
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215405.673495324@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the UV4 system architecture addressing changes, BIOS now provides
this information via an EFI system table. This is the initial decoding
of that system table. It also collects the sizing information for
later allocation of dynamic conversion tables.
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215405.503022681@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
UV4 uses a GAM (globally addressed memory) architecture that supports
variable sized memory per node. This replaces the old "M" value (number
of address bits per node) with a range table for conversions between
addresses and physical node (pnode) id's. This table is obtained from UV
BIOS via the EFI UVsystab table. Support for older EFI UVsystab tables
is maintained.
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215405.329827545@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Migrate references from the blade info structs to the per node hub info
structs. This phases out the allocation of the list of per blade info
structs on node 0, in favor of a per node hub info struct allocated on
the node's local memory.
There are also some minor cosemetic changes in the comments and whitespace
to clean things up a bit.
Tested-by: Dimitri Sivanich <sivanich@sgi.com>
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215404.987204515@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Allocate and setup per node hub info structs. CPU 0/Node 0 hub info
is statically allocated to be accessible early in system startup. The
remaining hub info structs are allocated on the node's local memory,
and shared among the CPU's on that node. This leaves the small amount
of info unique to each CPU in the per CPU info struct.
Memory is saved by combining the common per node info fields to common
node local structs. In addtion, since the info is read only only after
setup, it should stay in the L3 cache of the local processor socket.
This should therefore improve the cache hit rate when a group of cpus
on a node are all interrupted for a common task.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215404.813051625@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move references to blade local processor ID to the new per cpu info
structs. Create an access function that makes this move, and other
potential moves opaque to callers of this function. Define a flag
that indicates to callers in external GPL modules that this function
replaces any local definition. This allows calling source code to be
built for both pre-UV4 kernels as well as post-UV4 kernels.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215404.644173122@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Change the references to the SCIR fields to the new per cpu info structs.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215404.452538234@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The major portion of the hub info is common to all cpus on that hub.
This is step one of moving the per cpu hub info to a per node hub info
struct. This patch creates the small per cpu info struct that will
contain only information specific to each CPU.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215404.282265563@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Clean up any redundancies caused by new UV4 MMR definitions superseding
any previously definitions local to functions.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215403.934728974@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This adds the MMR definitions for UV4 via an automated script that uses
the output from a hardware verilog code to symbol converter. The large
number of insertions is caused by the UV4 design changing many similarly
named fields in MMR's that are named the same. This prompted the extra
production of architecture dependent field defines.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215403.580158916@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cleanup patch to rearrange code and modify some defines so the next
patch, the new UV4 MMR definitions can be merged cleanly.
* Clean up the M/N related address constants (M is # of address bits per
blade, N is the # of blade selection bits per SSI/partition).
* Fix the lookup of the alias overlay addresses and NMI definitions to
allow for flexibility in newer UV architecture types.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215403.401604203@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add defines to control which UV architectures are supported, and modify the
'if (is_uvX_*)' functions to return constant 0 for those not supported.
This will help optimize code paths when support for specific UV arches
is removed.
Tested-by: John Estabrook <estabrook@sgi.com>
Tested-by: Gary Kroening <gfk@sgi.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160429215402.897143440@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For several years, the common practice has been to boot UVs with the
"nobau" parameter on the command line, to disable the BAU. We've
decided that it makes more sense to just disable the BAU by default in
the kernel, and provide the option to turn it on, if desired.
For now, having the on/off switch doesn't buy us any more than just
reversing the logic would, but we're working towards having the BAU
enabled by default on UV4. When those changes are in place, having the
on/off switch will make more sense than an enable flag, since the
default behavior will be different depending on the system version.
I've also added a bit of documentation for the new parameter to
Documentation/kernel-parameters.txt.
Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1459451909-121845-1-git-send-email-athorlton@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
UV: NMI: insert this_cpu_read accessor function on uv_hub_nmi.
On SGI UV systems a 'power nmi' command from the CMC causes
all processors to drop into uv_handle_nmi(). With the 4.0
kernel this results in
BUG: unable to handle kernel paging request
The bug is caused by the current code trying to use the PER_CPU
variable uv_cpu_nmi.hub without an appropriate accessor
function. That oversight occurred in
commit e16321709c ("uv: Replace __get_cpu_var")
Author: Christoph Lameter <cl@linux.com>
Date: Sun Aug 17 12:30:41 2014 -0500
This patch inserts this_cpu_read() in the uv_hub_nmi macro
restoring the intended functionality.
Signed-off-by: George Beshers <gbeshers@sgi.com>
Acked-by: Mike Travis <travis@sgi.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russ Anderson <rja@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We have encountered hardware with 18 cores/socket that gives 36 CPUs/socket
with hyperthreading enabled. This exceeds the current MAX_CPUS_PER_SOCKET
causing a failure in get_cpu_topology. Increase MAX_CPUS_PER_SOCKET to 64
and MAX_CPUS_PER_UVHUB to 128.
Signed-off-by: James Custer <jcuster@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Link: http://lkml.kernel.org/r/1414952199-185319-1-git-send-email-jcuster@sgi.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x). This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.
Other use cases are for storing and retrieving data from the current
processors percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and less registers
are used when code is generated.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu
variable.
DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
__this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
__this_cpu_inc(y)
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Update of TLB shootdown code for UV3.
Kernel function native_flush_tlb_others() calls
uv_flush_tlb_others() on UV to invalidate tlb page definitions
on remote cpus. The UV systems have a hardware 'broadcast assist
unit' which can be used to broadcast shootdown messages to all
cpu's of selected nodes.
The behavior of the BAU has changed only slightly with UV3:
- UV3 is recognized with is_uv3_hub().
- UV2 functions and structures (uv2_xxx) are in most cases
simply renamed to uv2_3_xxx.
- Some UV2 error workarounds are not needed for UV3.
(see uv_bau_message_interrupt and enable_timeouts)
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Link: http://lkml.kernel.org/r/E1WkgWh-0001yJ-3K@eag09.americas.sgi.com
[ Removed a few linebreak uglies. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The value of n_lshift for UV is currently set based on the
socket m_val.
For UV3, set the n_lshift value based on the GAM_GR_CONFIG MMR.
This will allow bios to control the n_lshift value independent
of the socket m_val. Then n_lshift can be assigned a fixed value
across a multi-partition system, allowing for a fixed common
global physical address format that is independent of socket
m_val.
Cleanup unneeded macros.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Link: http://lkml.kernel.org/r/20140331143700.GB29916@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Make uv_register_nmi_notifier() and uv_handle_nmi_ping() static
to address sparse warnings.
Fix problem where uv_nmi_kexec_failed is unused when
CONFIG_KEXEC is not defined.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20140114162551.480872353@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 8eba18428a.
uv_trace() is not used by anything, nor is uv_trace_nmi_func, nor
uv_trace_func.
That's not how we do instrumentation code in the kernel: we add
tracepoints, printk()s, etc. so that everyone not just those with
magic kernel modules can debug a system.
So remove this unused (and misguied) piece of code.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Mike Travis <travis@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/n/tip-tumfBffmr4jmnt8Gyxanoblg@git.kernel.org
This patch adds support for the uvtrace module by providing a
skeleton call to the registered trace function. It also
provides another separate 'NMI' tracer that is triggered by the
system wide 'power nmi' command.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20130923212501.185052551@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current UV NMI handler has not been updated for the changes
in the system NMI handler and the perf operations. The UV NMI
handler reads an MMR in the UV Hub to check to see if the NMI
event was caused by the external 'system NMI' that the operator
can initiate on the System Mgmt Controller.
The problem arises when the perf tools are running, causing
millions of perf events per second on very large CPU count
systems. Previously this was okay because the perf NMI handler
ran at a higher priority on the NMI call chain and if the NMI
was a perf event, it would stop calling other NMI handlers
remaining on the NMI call chain.
Now the system NMI handler calls all the handlers on the NMI
call chain including the UV NMI handler. This causes the UV NMI
handler to read the MMRs at the same millions per second rate.
This can lead to significant performance loss and possible
system failures. It also can cause thousands of 'Dazed and
Confused' messages being sent to the system console. This
effectively makes perf tools unusable on UV systems.
To avoid this excessive overhead when perf tools are running,
this code has been optimized to minimize reading of the MMRs as
much as possible, by moving to the NMI_UNKNOWN notifier chain.
This chain is called only when all the users on the standard
NMI_LOCAL call chain have been called and none of them have
claimed this NMI.
There is an exception where the NMI_LOCAL notifier chain is
used. When the perf tools are in use, it's possible that the UV
NMI was captured by some other NMI handler and then either
ignored or mistakenly processed as a perf event. We set a
per_cpu ('ping') flag for those CPUs that ignored the initial
NMI, and then send them an IPI NMI signal. The NMI_LOCAL
handler on each cpu does not need to read the MMR, but instead
checks the in memory flag indicating it was pinged. There are
two module variables, 'ping_count' indicating how many requested
NMI events occurred, and 'ping_misses' indicating how many stray
NMI events. These most likely are perf events so it shows the
overhead of the perf NMI interrupts and how many MMR reads were avoided.
This patch also minimizes the reads of the MMRs by having the
first cpu entering the NMI handler on each node set a per HUB
in-memory atomic value. (Having a per HUB value avoids sending
lock traffic over NumaLink.) Both types of UV NMIs from the SMI
layer are supported.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20130923212500.353547733@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch moves the UV NMI support from the x2apic file to a
new separate uv_nmi.c file in preparation for the next sequence
of patches. It prevents upcoming bloat of the x2apic file, and
has the added benefit of putting the upcoming /sys/module
parameters under the name 'uv_nmi' instead of 'x2apic_uv_x',
which was obscure.
Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jason Wessel <jason.wessel@windriver.com>
Link: http://lkml.kernel.org/r/20130923212500.183295611@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[Purpose of this patch]
As Vaibhav explained in the thread below, tracepoints for irq vectors
are useful.
http://www.spinics.net/lists/mm-commits/msg85707.html
<snip>
The current interrupt traces from irq_handler_entry and irq_handler_exit
provide when an interrupt is handled. They provide good data about when
the system has switched to kernel space and how it affects the currently
running processes.
There are some IRQ vectors which trigger the system into kernel space,
which are not handled in generic IRQ handlers. Tracing such events gives
us the information about IRQ interaction with other system events.
The trace also tells where the system is spending its time. We want to
know which cores are handling interrupts and how they are affecting other
processes in the system. Also, the trace provides information about when
the cores are idle and which interrupts are changing that state.
<snip>
On the other hand, my usecase is tracing just local timer event and
getting a value of instruction pointer.
I suggested to add an argument local timer event to get instruction pointer before.
But there is another way to get it with external module like systemtap.
So, I don't need to add any argument to irq vector tracepoints now.
[Patch Description]
Vaibhav's patch shared a trace point ,irq_vector_entry/irq_vector_exit, in all events.
But there is an above use case to trace specific irq_vector rather than tracing all events.
In this case, we are concerned about overhead due to unwanted events.
So, add following tracepoints instead of introducing irq_vector_entry/exit.
so that we can enable them independently.
- local_timer_vector
- reschedule_vector
- call_function_vector
- call_function_single_vector
- irq_work_entry_vector
- error_apic_vector
- thermal_apic_vector
- threshold_apic_vector
- spurious_apic_vector
- x86_platform_ipi_vector
Also, introduce a logic switching IDT at enabling/disabling time so that a time penalty
makes a zero when tracepoints are disabled. Detailed explanations are as follows.
- Create trace irq handlers with entering_irq()/exiting_irq().
- Create a new IDT, trace_idt_table, at boot time by adding a logic to
_set_gate(). It is just a copy of original idt table.
- Register the new handlers for tracpoints to the new IDT by introducing
macros to alloc_intr_gate() called at registering time of irq_vector handlers.
- Add checking, whether irq vector tracing is on/off, into load_current_idt().
This has to be done below debug checking for these reasons.
- Switching to debug IDT may be kicked while tracing is enabled.
- On the other hands, switching to trace IDT is kicked only when debugging
is disabled.
In addition, the new IDT is created only when CONFIG_TRACING is enabled to avoid being
used for other purposes.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/51C323ED.5050708@hds.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
This patch trims the MMR register definitions after the updates for the
SGI UV3 system have been applied. Note that because these definitions
are automatically generated from the RTL we cannot control the length
of the names. Therefore there are lines that exceed 80 characters.
Signed-off-by: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20130211194509.173026880@gulag1.americas.sgi.com
Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch updates the UV HUB info for UV3. The "is_uv3_hub" and
"is_uvx_hub" (UV2 or UV3) functions are added as well as the addresses
and sizes of the MMR regions for UV3.
Signed-off-by: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20130211194508.610723192@gulag1.americas.sgi.com
Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch updates the MMR register definitions for the SGI UV3 system.
Note that because these definitions are automatically generated from
the RTL we cannot control the length of the names. Therefore there are
lines that exceed 80 characters.
All the new MMR definitions are added in this patch. The patches that
follow then update the references. The last patch is a "trim" patch
which reduces the size of the MMR definitions file by about a third.
This keeps "bi-sectability" in place as the intermediate patches would
not compile correctly if the trimmed MMR defines were done first.
Signed-off-by: Mike Travis <travis@sgi.com>
Link: http://lkml.kernel.org/r/20130211194508.326204556@gulag1.americas.sgi.com
Acked-by: Russ Anderson <rja@sgi.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The flush tlb optimization code has logical issue on UV
platform. It doesn't flush the full range at all, since it
simply ignores its 'end' parameter (and hence also the "all"
indicator) in uv_flush_tlb_others() function.
Cliff's notes:
| I tested the patch on a UV. It has the effect of either
| clearing 1 or all TLBs in a cpu. I added some debugging to
| test for the cases when clearing all TLBs is overkill, and in
| practice it happens very seldom.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Tested-by: Cliff Wickman <cpw@sgi.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>