Use the unused probe routine in the apic driver to finalize the
apic model selection. This cleans up the
default_setup_apic_routing() and this probe routine in future
can also be used for doing any apic model specific
initialisation.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: steiner@sgi.com
Cc: yinghai@kernel.org
Link: http://lkml.kernel.org/r/20110519234637.247458931@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Code flow for enabling interrupt-remapping has its own routines
for saving and restoring io-apic RTE's. ioapic suspend/resume
code flow also has similar routines. Remove the duplicate code.
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110518233157.673130611@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Code flow for enabling interrupt-remapping was
allocating/freeing buffers for saving/restoring io-apic RTE's.
ioapic suspend/resume code uses boot time allocated
ioapic_saved_data that is a perfect match for reuse here.
This will remove the unnecessary allocation/free of the
temporary buffers during suspend/resume of interrupt-remapping
enabled platforms aswell as paving the way for further code
consolidation.
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110518233157.574469296@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows re-using this buffer for enabling
interrupt-remapping during boot and resume. And thus allow for
consolidating the code between ioapic suspend/resume and
interrupt-remapping.
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110518233157.481404505@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a potential deadlock when resuming; here the calling
function has disabled interrupts, so we cannot sleep.
Change the memory allocation flag from GFP_KERNEL to GFP_ATOMIC.
TODO: We can do away with this memory allocation during resume
by reusing the ioapic suspend/resume code that uses boot time
allocated buffers, but we want to keep this -stable patch
simple.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org> # v2.6.38/39
Link: http://lkml.kernel.org/r/20110518233157.385970138@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
x86, mm: Allow ZONE_DMA to be configurable
x86, NUMA: Trim numa meminfo with max_pfn in a separate loop
x86, NUMA: Rename setup_node_bootmem() to setup_node_data()
x86, NUMA: Enable emulation on 32bit too
x86, NUMA: Enable CONFIG_AMD_NUMA on 32bit too
x86, NUMA: Rename amdtopology_64.c to amdtopology.c
x86, NUMA: Make numa_init_array() static
x86, NUMA: Make 32bit use common NUMA init path
x86, NUMA: Initialize and use remap allocator from setup_node_bootmem()
x86-32, NUMA: Add @start and @end to init_alloc_remap()
x86, NUMA: Remove long 64bit assumption from numa.c
x86, NUMA: Enable build of generic NUMA init code on 32bit
x86, NUMA: Move NUMA init logic from numa_64.c to numa.c
x86-32, NUMA: Update numaq to use new NUMA init protocol
x86-32, NUMA: Replace srat_32.c with srat.c
x86-32, NUMA: implement temporary NUMA init shims
x86, NUMA: Move numa_nodes_parsed to numa.[hc]
x86-32, NUMA: Move get_memcfg_numa() into numa_32.c
x86, NUMA: make srat.c 32bit safe
x86, NUMA: rename srat_64.c to srat.c
...
This fixes problems seen on UV systems handling NMIs from the
node controller.
I isolated the "dazed..." messages that I saw earlier to a bug in
the BMC on our platform. It was sending NMIs w/o properly setting
a register that indicated the source of NMI.
So rather than _assuming_ any unhandled NMI came from the UV system
maintenance console (SMC), add a check to verify that the SMC actually
sent the NMI.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: gorcunov@gmail.com
Cc: dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Update numaq such that it calls numa_add_memblk() and sets
numa_nodes_parsed instead of directly diddling with NUMA states. The
original get_memcfg_numaq() is renamed to numaq_numa_init() and new
get_memcfg_numaq() is created in numa_32.c.
The shim numa_add_memblk() implementation handles node_start/end_pfn[]
and node_set_online() for nodes with memory. The new
get_memcfg_numaq() exactly the same with get_memcfg_from_srat() other
than calling the numaq init function. Things get_memcfgs_numaq() do
are not strictly necessary for numaq but added for consistency and to
help unifying NUMA init handling.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Instead of calling memory_present() for each region from NUMA init,
call sparse_memory_present_with_active_regions() from paging_init()
similarly to x86-64.
For flat and numaq, this results in exactly the same memory_present()
calls. For srat, if there are multiple memory chunks for a node,
after this change, memory_present() will be called separately for each
chunk instead of being called once to encompass the whole range, which
doesn't cause any harm and actually is the better behavior.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
NUMAQ is the only meaningful user of this callback and
setup_local_APIC() the only callsite. Stop torturing everyone else by
making the callback optional and removing all the boilerplate
implementations and assignments.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Some x86-32 NUMA implementations (NUMAQ) don't initialize apicid ->
node mapping using set_apicid_to_node() during NUMA init but implement
custom apic->x86_32_numa_cpu_node() instead.
This patch automatically initializes the default apic -> node mapping
table from apic->x86_32_numa_cpu_node() from setup_local_APIC() such
that the mapping table is in sync with the actual mapping.
As the table isn't used by custom implementations, this doesn't make
any difference at this point. This is in preparation of unifying
numa_cpu_node() between x86-32 and 64.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Merge reason: Pick up the following two fix commits.
2be19102b7: x86, NUMA: Fix empty memblk detection in numa_cleanup_meminfo()
765af22da8: x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
Scheduled NUMA init 32/64bit unification changes depend on these.
Signed-off-by: Tejun Heo <tj@kernel.org>
We use io_apic_setup_irq_pin() in order to configure pin's interrupt
number polarity and type. This is done on every irq_create_of_mapping()
which happens for instance during pci enable calls. Level typed
interrupts are masked by default, edge are unmasked.
On the first ->xlate() call the level interrupt is configured and
masked. The driver calls request_irq() and the line is unmasked. Lets
assume the interrupt line is shared with another device and we call
pci_enable_device() for this device. The ->xlate() configures the pin
again and it is masked. request_irq() does not unmask the line because
it _is_ already unmasked according to its internal state. So the
interrupt will never be unmasked again.
This patch is based on an earlier work by Torben Hohn and solves the
problem by configuring the pin only once. Since all devices must agree
on the same type and polarity there is no point in configuring the pin
more than once.
[ tglx: Split out the ce4100 part into a separate patch ]
Cc: Torben Hohn <torbenh@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/%3C20110427143052.GA15211%40linutronix.de%3E
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
End users worry about the error interrupt printout we generate
currently:
pr_debug("APIC error on CPU%d: %02x(%02x)\n",
smp_processor_id(), v , v1);
... and would like to know the reason why error interrupts are generated.
This patch prints out more detailed debug information.
Another practical problem is that dynamic debug is not initialized yet
when the APIC initializes, so the pr_debug() will not output the error
interrupt debug information on bootup. In this patch, we use
apic_printk(APIC_DEBUG, ...), so the apic=debug boot option will print
verbose error interupts during bootup.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Cc: Joe Perches <joe@perches.com>
Cc: hpa@linux.intel.com
Cc: suresh.b.siddha@intel.com
Cc: yong.y.wang@linux.intel.com
Cc: jbaron@redhat.com
Cc: trenn@suse.de
Cc: kent.liu@intel.com
Cc: chaohong.guo@intel.com
Link: http://lkml.kernel.org/r/1302762968-24380-2-git-send-email-youquan.song@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Only pgdat and memmap use remap area and there isn't much benefit in
allowing per-node override. In addition, the use of node_remap_size[]
is confusing in that it contains number of bytes before remap
initialization and then number of pages afterwards.
Move remap size calculation for memap from specific NUMA config
implementations to init_alloc_remap() and make node_remap_size[]
static.
The only behavior difference is that, before this patch, numaq_32
didn't consider max_pfn when calculating the memmap size but it's
enforced after this patch, which is the right thing to do.
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1301955840-7246-8-git-send-email-tj@kernel.org
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
After a crash dump on an SGI Altix UV system the crash kernel
fails to cause a reboot. EFI mode is disabled in the kdump
kernel, so only the reboot_type of BOOT_ACPI works.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: rja@sgi.com
LKML-Reference: <E1Q5Iuo-00013b-UK@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add this_cpu_has() which determines if the current cpu has a certain
ability using a segment prefix and a bit test operation.
For that we need to add bit operations to x86s percpu.h.
Many uses of cpu_has use a pointer passed to a function to determine
the current flags. That is no longer necessary after this patch.
However, this patch only converts the straightforward cases where
cpu_has is used with this_cpu_ptr. The rest is work for later.
-tj: Rolled up patch to add x86_ prefix and use percpu_read() instead
of percpu_read_stable().
Signed-off-by: Christoph Lameter <cl@linux.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Stop including <linux/delay.h> in x86 header files which don't
need it. This will let the compiler complain when this header is
not included by source files when it should, so that
contributors can fix the problem before building on other
architectures starts to fail.
Credits go to Geert for the idea.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: James E.J. Bottomley <James.Bottomley@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20110325152014.297890ec@endymion.delvare>
[ this also fixes an upstream build bug in drivers/media/rc/ite-cir.c ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some subsystems in the x86 tree need to carry out suspend/resume and
shutdown operations with one CPU on-line and interrupts disabled and
they define sysdev classes and sysdevs or sysdev drivers for this
purpose. This leads to unnecessarily complicated code and excessive
memory usage, so switch them to using struct syscore_ops objects for
this purpose instead.
Generally, there are three categories of subsystems that use
sysdevs for implementing PM operations: (1) subsystems whose
suspend/resume callbacks ignore their arguments entirely (the
majority), (2) subsystems whose suspend/resume callbacks use their
struct sys_device argument, but don't really need to do that,
because they can be implemented differently in an arguably simpler
way (io_apic.c), and (3) subsystems whose suspend/resume callbacks
use their struct sys_device argument, but the value of that argument
is always the same and could be ignored (microcode_core.c). In all
of these cases the subsystems in question may be readily converted to
using struct syscore_ops objects for power management and shutdown.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Flush TLB if PGD entry is changed in i386 PAE mode
x86, dumpstack: Correct stack dump info when frame pointer is available
x86: Clean up csum-copy_64.S a bit
x86: Fix common misspellings
x86: Fix misspelling and align params
x86: Use PentiumPro-optimized partial_csum() on VIA C7
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
Update cpuset info & webiste for cgroups
dcdbas: force SMI to happen when expected
arch/arm/Kconfig: remove one to many l's in the word.
asm-generic/user.h: Fix spelling in comment
drm: fix printk typo 'sracth'
Remove one to many n's in a word
Documentation/filesystems/romfs.txt: fixing link to genromfs
drivers:scsi Change printk typo initate -> initiate
serial, pch uart: Remove duplicate inclusion of linux/pci.h header
fs/eventpoll.c: fix spelling
mm: Fix out-of-date comments which refers non-existent functions
drm: Fix printk typo 'failled'
coh901318.c: Change initate to initiate.
mbox-db5500.c Change initate to initiate.
edac: correct i82975x error-info reported
edac: correct i82975x mci initialisation
edac: correct commented info
fs: update comments to point correct document
target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
...
Trivial conflict in fs/eventpoll.c (spelling vs addition)
They were generated by 'codespell' and then manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
x86: Clean up apic.c and apic.h
x86: Remove superflous goal definition of tsc_sync
x86: dt: Correct local apic documentation in device tree bindings
x86: dt: Cleanup local apic setup
x86: dt: Fix OLPC=y/INTEL_CE=n build
rtc: cmos: Add OF bindings
x86: ce4100: Use OF to setup devices
x86: ioapic: Add OF bindings for IO_APIC
x86: dtb: Add generic bus probe
x86: dtb: Add support for PCI devices backed by dtb nodes
x86: dtb: Add device tree support for HPET
x86: dtb: Add early parsing of IO_APIC
x86: dtb: Add irq domain abstraction
x86: dtb: Add a device tree for CE4100
x86: Add device tree support
x86: e820: Remove conditional early mapping in parse_e820_ext
x86: OLPC: Make OLPC=n build again
x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
x86: OLPC: Cleanup config maze completely
x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
...
Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (93 commits)
x86, tlb, UV: Do small micro-optimization for native_flush_tlb_others()
x86-64, NUMA: Don't call numa_set_distanc() for all possible node combinations during emulation
x86-64, NUMA: Don't assume phys node 0 is always online in numa_emulation()
x86-64, NUMA: Clean up initmem_init()
x86-64, NUMA: Fix numa_emulation code with node0 without RAM
x86-64, NUMA: Revert NUMA affine page table allocation
x86: Work around old gas bug
x86-64, NUMA: Better explain numa_distance handling
x86-64, NUMA: Fix distance table handling
mm: Move early_node_map[] reverse scan helpers under HAVE_MEMBLOCK
x86-64, NUMA: Fix size of numa_distance array
x86: Rename e820_table_* to pgt_buf_*
bootmem: Move __alloc_memory_core_early() to nobootmem.c
bootmem: Move contig_page_data definition to bootmem.c/nobootmem.c
bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c
x86-64, NUMA: Seperate out numa_alloc_distance() from numa_set_distance()
x86-64, NUMA: Add proper function comments to global functions
x86-64, NUMA: Move NUMA emulation into numa_emulation.c
x86-64, NUMA: Prepare numa_emulation() for moving NUMA emulation into a separate file
x86-64, NUMA: Do not scan two times for setup_node_bootmem()
...
Fix up conflicts in arch/x86/kernel/smpboot.c
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (116 commits)
x86: Enable forced interrupt threading support
x86: Mark low level interrupts IRQF_NO_THREAD
x86: Use generic show_interrupts
x86: ioapic: Avoid redundant lookup of irq_cfg
x86: ioapic: Use new move_irq functions
x86: Use the proper accessors in fixup_irqs()
x86: ioapic: Use irq_data->state
x86: ioapic: Simplify irq chip and handler setup
x86: Cleanup the genirq name space
genirq: Add chip flag to force mask on suspend
genirq: Add desc->irq_data accessor
genirq: Add comments to Kconfig switches
genirq: Fixup fasteoi handler for oneshot mode
genirq: Provide forced interrupt threading
sched: Switch wait_task_inactive to schedule_hrtimeout()
genirq: Add IRQF_NO_THREAD
genirq: Allow shared oneshot interrupts
genirq: Prepare the handling of shared oneshot interrupts
genirq: Make warning in handle_percpu_event useful
x86: ioapic: Move trigger defines to io_apic.h
...
Fix up trivial(?) conflicts in arch/x86/pci/xen.c due to genirq name
space changes clashing with the Xen cleanups. The set_irq_msi() had
moved to xen_bind_pirq_msi_to_irq().
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Combine printk()s in show_regs_common()
x86: Don't call dump_stack() from arch_trigger_all_cpu_backtrace_handler()
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix and clean up generic_processor_info()
x86: Don't copy per_cpu cpuinfo for BSP two times
x86: Move llc_shared_map out of cpu_info
The caller of ioapic_register_intr() has a pointer to the irq_cfg for
the irq already. Hand it in to avoid a full lookup.
In msi_compose_msg() the pointer to irq_cfg is already available. No
need to look it up again.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the functions which take irq_data. We already have a pointer to
irq_data. That avoids a sparse irq lookup in move_*_irq.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use the state information in irq_data. That avoids a radix-tree lookup
from apic_ack_level() and simplifies setup_ioapic_dest().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
genirq is switching to a consistent name space for the irq related
functions. Convert x86. Conversion was done with coccinelle.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch moves some functions and variables into init
sections, makes a function static and removes some lines of
cruft.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <1299826956-8607-2-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Up to now we force enable the local apic in the devicetree setup
uncoditionally and set smp_found_config unconditionally to 1 when a
devicetree blob is available. This breaks, when local apic is disabled
in the Kconfig.
Make it consistent by initializing device tree explicitely before
smp_get_config() so a non lapic configuration could be used as well.
To be functional that would require to implement PIT as an interrupt
host, but the only user of this code until now is ce4100 which
requires apics to be available. So we leave this up to those who need
it.
Tested-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
io_apic_set_pci_routing() and mp_save_irq() check the pin_programmed
bit before calling io_apic_setup_irq_pin() and set the bit when the
pin was setup.
Move that duplicated code into a separate function and use it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There is no point to have irq_trigger() and irq_polarity() as wrappers
around the MPBIOS_* camel case functions. Get rid of both the inlines
and the ugly camel case.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The only difference here is that we did not call
__add_pin_to_irq_node() for the legacy irqs, but that's not worth 30
lines of extra code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove the duplicated code and call the function. It does not matter
whether we allocated the cfg before calling setup_local_APIC() and we
can set the irq chip and handler after that as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There are about four places in the ioapic code which do exactly the
same setup sequence. Also the OF based ioapic setup needs that
function to avoid putting the OF specific code into ioapic.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Two consecutive
for(...)
for(...)
lines to avoid an extra indentation are just horrible to read. I had
to look more than once to figure out what the code is doing.
Split out the inner loop into a separate function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is debug code and it does not matter at all whether we print each
not connected pin in an extra line or try to be extra clever.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds IOAPIC dummy functions for compilation
with local APIC, but without IOAPIC.
The local variable ioapic_entries in enable_IR_x2apic()
does not need initialization anymore, since the dummy
returns NULL.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
LKML-Reference: <1298385487-4708-4-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently arch_disable_smp_support() on x86 disables only the
support for the IOAPIC and is also compiled in if SMP-support is
not.
Therefore this function is renamed to disable_ioapic_support(),
which meets its purpose and is only compiled in the kernel
when IOAPIC support is also.
A new arch_disable_smp_support() is created in smpboot.c,
which calls disable_ioapic_support() and gets only compiled
in the kernel when SMP support is also.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
LKML-Reference: <1298385487-4708-3-git-send-email-henne@nachtwindheim.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
show_regs() already prints two(!) stack traces, no need for a third one.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D5D512902000078000326EE@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
mp_find_ioapic() prints errors like:
ERROR: Unable to locate IOAPIC for GSI 13
if it can't find the IOAPIC that manages that specific GSI. I
see errors like that at every boot of a laptop that apparently
doesn't have any IOAPICs.
But if there are no IOAPICs it doesn't seem to be an error that
none can be found. A solution that gets rid of this message is
to directly return if nr_ioapics (still) is zero. (But keep
returning -1 in that case, so nothing breaks from this change.)
The call chain that generates this error is:
pnpacpi_allocated_resource()
case ACPI_RESOURCE_TYPE_IRQ:
pnpacpi_parse_allocated_irqresource()
acpi_get_override_irq()
mp_find_ioapic()
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
One of the error printouts in generic_processor_info() prints out
the APIC version instead of the cpu index the warning text describes.
Move version validation down, after we get the right cpu index.
-v2: add comments about reason why we can have cpu=0 there.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D5240A9.4080703@kernel.org>
[ Cleaned up and made the BIOS bug printouts more consistent ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Additionally doing things conditionally upon smp_processor_id()
being zero is generally a bad idea, as this means CPU 0 cannot
be offlined and brought back online later again.
While there may be other places where this is done, I think adding
more of those should be avoided so that some day SMP can really
become "symmetrical".
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <4D525C7E0200007800030EE1@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit 4c321ff8 (x86: Replace cpu_2_logical_apicid[] with early
percpu variable) and following changes introduced and used
x86_cpu_to_logical_apicid percpu variable. It was declared and
defined inside CONFIG_SMP && CONFIG_X86_32 but if
CONFIG_X86_UP_APIC is set UP configuration makes use of it and
build fails.
Fix it by declaring and defining it inside CONFIG_X86_LOCAL_APIC
&& CONFIG_X86_32.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <20110128162248.GA25746@htj.dyndns.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The mapping between cpu/apicid and node is done via
apicid_to_node[] on 64bit and apicid_2_node[] +
apic->x86_32_numa_cpu_node() on 32bit. This difference makes it
difficult to further unify 32 and 64bit NUMA handling.
This patch unifies it by replacing both apicid_to_node[] and
apicid_2_node[] with __apicid_to_node[] array, which is accessed
by two accessors - set_apicid_to_node() and numa_cpu_node(). On
64bit, numa_cpu_node() always consults __apicid_to_node[]
directly while 32bit goes through apic->numa_cpu_node() method
to allow apic implementations to override it.
srat_detect_node() for amd cpus contains workaround for broken
NUMA configuration which assumes relationship between APIC ID,
HT node ID and NUMA topology. Leave it to access
__apicid_to_node[] directly as mapping through CPU might result
in undesirable behavior change. The comment is reformatted and
updated to note the ugliness.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-14-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: David Rientjes <rientjes@google.com>
apic->apicid_to_node() is 32bit specific apic operation which
determines NUMA node for a CPU. Depending on the APIC
implementation, it can be easier to determine NUMA node from
either physical or logical apicid. Currently,
->apicid_to_node() takes @logical_apicid and calls
hard_smp_processor_id() if the physical apicid is needed.
This prevents NUMA mapping from being queried from a different
CPU, which in turn makes it impossible to initialize NUMA
mapping before SMP bringup.
This patch replaces apic->apicid_to_node() with
->x86_32_numa_cpu_node() which takes @cpu, from which both
logical and physical apicids can easily be determined. While at
it, drop duplicate implementations from bigsmp_32 and summit_32,
and use the default one.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-13-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On x86_32, the mapping between cpu and logical apic ID differs
depending on the specific apic implementation in use. The
mapping is initialized while bringing up CPUs; however, this
makes early inits ignore memory topology.
Add a x86_32 specific apic->x86_32_early_logical_apicid() which
is called early during boot to query the mapping. The mapping
is later verified against the result of init_apic_ldr(). The
method is allowed to return BAD_APICID if it can't be determined
early.
noop variant which always returns BAD_APICID is implemented and
added to all x86_32 apic implementations.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-8-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After the previous patch, apic->cpu_to_logical_apicid() is no
longer used. Kill it.
For apic types with custom cpu_to_logical_apicid() which is also
used for other purposes, remove the function and modify its
users to do the mapping directly.
#ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored
during conversion as they are not used for UP kernels.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-7-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently, cpu -> logical apic id translation is done by
apic->cpu_to_logical_apicid() callback which may or may not use
x86_cpu_to_logical_apicid. This is unnecessary as it should
always equal logical_smp_processor_id() which is known early
during CPU bring up.
Initialize x86_cpu_to_logical_apicid after apic->init_apic_ldr()
in setup_local_APIC() and always use x86_cpu_to_logical_apicid
for cpu -> logical apic id mapping.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-6-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Unlike x86_64, on x86_32, the mapping from cpu to logical apicid
may vary depending on apic in use. cpu_2_logical_apicid[] array
is used for this mapping. Replace it with early percpu variable
x86_cpu_to_logical_apicid to make it better aligned with other
mappings.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-5-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Both functions are used only in 32bit. Put them inside
CONFIG_X86_32. This is to prepare for logical apicid handling
update.
- Cyrill Gorcunov spotted that I forgot to move declarations in
ipi.h under CONFIG_X86_32. Fixed.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: eric.dumazet@gmail.com
Cc: brgerst@gmail.com
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-4-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix Moorestown VRTC fixmap placement
x86/gpio: Implement x86 gpio_to_irq convert function
x86, UV: Fix APICID shift for Westmere processors
x86: Use PCI method for enabling AMD extended config space before MSR method
x86: tsc: Prevent delayed init if initial tsc calibration failed
x86, lapic-timer: Increase the max_delta to 31 bits
x86: Fix sparse non-ANSI function warnings in smpboot.c
x86, numa: Fix CONFIG_DEBUG_PER_CPU_MAPS without NUMA emulation
x86, AMD, PCI: Add AMD northbridge PCI device id for CPU families 12h and 14h
x86, numa: Fix cpu to node mapping for sparse node ids
x86, numa: Fake node-to-cpumask for NUMA emulation
x86, numa: Fake apicid and pxm mappings for NUMA emulation
x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU
x86, numa: Reduce minimum fake node size to 32M
Fix up trivial conflict in arch/x86/kernel/apic/x2apic_uv_x.c
Westmere processors use a different algorithm for
assigning APICIDs on SGI UV systems. The location of the
node number within the apicid is now a function of the
processor type.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20110110195210.GA18737@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'stable/generic' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: HVM X2APIC support
apic: Move hypervisor detection of x2apic to hypervisor.h
Latest atom socs(penwell) does not have hpet timer.
As their local APIC timer is clocked at 400KHZ, and the current
code limit their Initial Counter register to 23 bits, they
cannot sleep more than 1.34 seconds which leads to ~2 spurious
wakeup per second (1 per thread)
These SOCs support 32bit timer so we change the max_delta to at
least 31bits. So we can at least sleep for 300 seconds.
We could not find any previous chip errata where lapic would
only have 23 bit precision As powertop is suggesting to activate
HPET to "sleep longer", this could mean this problem is already
known.
Problem is here since very first implementation of lapic timer
as a clock event e9e2cdb [PATCH] clockevents: i386 drivers.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Pierre Tardy <pierre.tardy@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andi Kleen <ak@suse.de>
LKML-Reference: <1294327409-19426-1-git-send-email-pierre.tardy@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (30 commits)
gameport: use this_cpu_read instead of lookup
x86: udelay: Use this_cpu_read to avoid address calculation
x86: Use this_cpu_inc_return for nmi counter
x86: Replace uses of current_cpu_data with this_cpu ops
x86: Use this_cpu_ops to optimize code
vmstat: User per cpu atomics to avoid interrupt disable / enable
irq_work: Use per cpu atomics instead of regular atomics
cpuops: Use cmpxchg for xchg to avoid lock semantics
x86: this_cpu_cmpxchg and this_cpu_xchg operations
percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
percpu,x86: relocate this_cpu_add_return() and friends
connector: Use this_cpu operations
xen: Use this_cpu_inc_return
taskstats: Use this_cpu_ops
random: Use this_cpu_inc_return
fs: Use this_cpu_inc_return in buffer.c
highmem: Use this_cpu_xx_return() operations
vmstat: Use this_cpu_inc_return for vm statistics
x86: Support for this_cpu_add, sub, dec, inc_return
percpu: Generic support for this_cpu_add, sub, dec, inc_return
...
Fixed up conflicts: in arch/x86/kernel/{apic/nmi.c, apic/x2apic_uv_x.c, process.c}
as per Tejun.
Then we can reuse it for Xen later.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Avi Kivity <avi@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
With priorities in place and no one really understanding the difference between
DIE_NMI and DIE_NMI_IPI, just remove DIE_NMI_IPI and convert everyone to DIE_NMI.
This also simplifies default_do_nmi() a little bit. Instead of calling the
die_notifier in both the if and else part, just pull it out and call it before
the if-statement. This has the side benefit of avoiding a call to the ioport
to see if there is an external NMI sitting around until after the (more frequent)
internal NMIs are dealt with.
Patch-Inspired-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-5-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In order to consolidate the NMI die_chain events, we need to setup the priorities
for the die notifiers.
I started by defining a bunch of common priorities that can be used by the
notifier blocks. Then I modified the notifier blocks to use the newly created
priorities.
Now that the priorities are straightened out, it should be easier to remove the
event DIE_NMI_IPI.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
They are a handful of places in the code that register a die_notifier
as a catch all in case no claims the NMI. Unfortunately, they trigger
on events like DIE_NMI and DIE_NMI_IPI, which depending on when they
registered may collide with other handlers that have the ability to
determine if the NMI is theirs or not.
The function unknown_nmi_error() makes one last effort to walk the
die_chain when no one else has claimed the NMI before spitting out
messages that the NMI is unknown.
This is a better spot for these devices to execute any code without
colliding with the other handlers.
The two drivers modified are only compiled on x86 arches I believe, so
they shouldn't be affected by other arches that may not have
DIE_NMIUNKNOWN defined.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Cc: dann frazier <dannf@hp.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294348732-15030-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
arch/x86/include/asm/io_apic.h
Merge reason: Resolve the conflict, update to a more recent -rc base
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, UV, BAU: Extend for more than 16 cpus per socket
x86, UV: Fix the effect of extra bits in the hub nodeid register
x86, UV: Add common uv_early_read_mmr() function for reading MMRs
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix APIC ID sizing bug on larger systems, clean up MAX_APICS confusion
x86, acpi: Parse all SRAT cpu entries even above the cpu number limitation
x86, acpi: Add MAX_LOCAL_APIC for 32bit
x86: io_apic: Split setup_ioapic_ids_from_mpc()
x86: io_apic: Fix CONFIG_X86_IO_APIC=n breakage
x86: apic: Move probe_nr_irqs_gsi() into ioapic_init_mappings()
x86: Allow platforms to force enable apic
The spin_lock_debug/rcu_cpu_stall detector uses
trigger_all_cpu_backtrace() to dump cpu backtrace.
Therefore it is possible that trigger_all_cpu_backtrace()
could be called at the same time on different CPUs, which
triggers and 'unknown reason NMI' warning. The following case
illustrates the problem:
CPU1 CPU2 ... CPU N
trigger_all_cpu_backtrace()
set "backtrace_mask" to cpu mask
|
generate NMI interrupts generate NMI interrupts ...
\ | /
\ | /
The "backtrace_mask" will be cleaned by the first NMI interrupt
at nmi_watchdog_tick(), then the following NMI interrupts
generated by other cpus's arch_trigger_all_cpu_backtrace() will
be taken as unknown reason NMI interrupts.
This patch uses a test_and_set to avoid the problem, and stop
the arch_trigger_all_cpu_backtrace() from calling to avoid
dumping a double cpu backtrace info when there is already a
trigger_all_cpu_backtrace() in progress.
Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Cc: fweisbec@gmail.com
LKML-Reference: <1294198689-15447-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Don Zickus <dzickus@redhat.com>
There are some paths that walk the die_chain with preemption on.
Make sure we are in an NMI call before we start doing anything.
This was triggered by do_general_protection calling notify_die
with DIE_GPF.
Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Don Zickus <dzickus@redhat.com>
LKML-Reference: <1294198689-15447-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Found one x2apic pre-enabled system, x2apic_mode suddenly get
corrupted after register some cpus, when compiled
CONFIG_NR_CPUS=255 instead of 512.
It turns out that generic_processor_info() ==> phyid_set(apicid,
phys_cpu_present_map) causes the problem.
phys_cpu_present_map is sized by MAX_APICS bits, and pre-enabled
system some cpus have an apic id > 255.
The variable after phys_cpu_present_map may get corrupted
silently:
ffffffff828e8420 B phys_cpu_present_map
ffffffff828e8440 B apic_verbosity
ffffffff828e8444 B local_apic_timer_c2_ok
ffffffff828e8448 B disable_apic
ffffffff828e844c B x2apic_mode
ffffffff828e8450 B x2apic_disabled
ffffffff828e8454 B num_processors
...
Actually phys_cpu_present_map is referenced via apic id, instead
index. We should use MAX_LOCAL_APIC instead MAX_APICS.
For 64-bit it will be 32768 in all cases. BSS will increase by 4k bytes
on 64-bit:
text data bss dec filename
21696943 4193748 12787712 38678403 vmlinux.before
21696943 4193748 12791808 38682499 vmlinux.after
No change on 32bit.
Finally we can remove MAX_APCIS that was rather confusing.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4D23BD9C.3070102@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Replace all uses of current_cpu_data with this_cpu operations on the
per cpu structure cpu_info. The scala accesses are replaced with the
matching this_cpu ops which results in smaller and more efficient
code.
In the long run, it might be a good idea to remove cpu_data() macro
too and use per_cpu macro directly.
tj: updated description
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Go through x86 code and replace __get_cpu_var and get_cpu_var
instances that refer to a scalar and are not used for address
determinations.
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We should use MAX_LOCAL_APIC for max apic ids and MAX_APICS as number
of local apics.
Also apic_version[] array should use MAX_LOCAL_APICs.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4D0AD464.2020408@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The x86 arch has shifted its use of the nmi_watchdog from a
local implementation to the global one provide by
kernel/watchdog.c. This shift has caused a whole bunch of
compile problems under different config options. I attempt to
simplify things with the patch below.
In order to simplify things, I had to come to terms with the
meaning of two terms ARCH_HAS_NMI_WATCHDOG and
CONFIG_HARDLOCKUP_DETECTOR. Basically they mean the same thing,
the former on a local level and the latter on a global level.
With the old x86 nmi watchdog gone, there is no need to rely on
defining the ARCH_HAS_NMI_WATCHDOG variable because it doesn't
make sense any more. x86 will now use the global
implementation.
The changes below do a few things. First it changes the few
places that relied on ARCH_HAS_NMI_WATCHDOG to use
CONFIG_X86_LOCAL_APIC (the former was an alias for the latter
anyway, so nothing unusual here). Those pieces of code were
relying more on local apic functionality the nmi watchdog
functionality, so the change should make sense.
Second, I removed the x86 implementation of
touch_nmi_watchdog(). It isn't need now, instead x86 will rely
on kernel/watchdog.c's implementation.
Third, I removed the #define ARCH_HAS_NMI_WATCHDOG itself from
x86. And tweaked the include/linux/nmi.h file to tell users to
look for an externally defined touch_nmi_watchdog in the case of
ARCH_HAS_NMI_WATCHDOG _or_ CONFIG_HARDLOCKUP_DETECTOR. This
changes removes some of the ugliness in that file.
Finally, I added a Kconfig dependency for
CONFIG_HARDLOCKUP_DETECTOR that said you can't have
ARCH_HAS_NMI_WATCHDOG _and_ CONFIG_HARDLOCKUP_DETECTOR. You can
only have one nmi_watchdog.
Tested with
ARCH=i386: allnoconfig, defconfig, allyesconfig, (various broken
configs) ARCH=x86_64: allnoconfig, defconfig, allyesconfig,
(various broken configs)
Hopefully, after this patch I won't get any more compile broken
emails. :-)
v3:
changed a couple of 'linux/nmi.h' -> 'asm/nmi.h' to pick-up correct function
prototypes when CONFIG_HARDLOCKUP_DETECTOR is not set.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1293044403-14117-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
UV systems can be partitioned into multiple independent SSIs.
Large partitioned systems may have extra bits in the node_id
register. These bits are used when the total memory on all SSIs
exceeds 16TB. These extra bits need to be ignored when
calculating x2apic_extra_bits.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20101130195926.972776133@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Early in boot, reading MMRs from the UV hub controller require
calls to early_ioremap()/early_iounmap(). Rather than
duplicating code, add a common function to do the
map/read/unmap.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20101130195926.834804371@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Interrupt-remapping gets enabled very early in the boot, as it determines the
apic mode that the processor can use. And the current code enables the vt-d
fault handling before the setup_local_APIC(). And hence the APIC LDR registers
and data structure in the memory may not be initialized. So the vt-d fault
handling in logical xapic/x2apic modes were broken.
Fix this by enabling the vt-d fault handling in the end_local_APIC_setup()
A cleaner fix of enabling fault handling while enabling intr-remapping
will be addressed for v2.6.38. [ Enabling intr-remapping determines the
usage of x2apic mode and the apic mode determines the fault-handling
configuration. ]
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <20101201062244.541996375@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
In x2apic mode, we need to set the upper address register of the fault
handling interrupt register of the vt-d hardware. Without this
irq migration of the vt-d fault handling interrupt is broken.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <1291225233.2648.39.camel@sbsiddha-MOBL3>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
When adjusting the code to handle removing the old nmi watchdog,
I forgot to consider the compile case when the local apic is not
enabled.
This change fixes the following build error:
arch/x86/kernel/apic/hw_nmi.c:28:6: error: redefinition of ‘touch_nmi_watchdog’
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Rakib Mullick <rakib.mullick@gmail.com>
LKML-Reference: <20101213153719.GD18577@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
setup_local_APIC() is used to setup local APIC early during CPU
initialization and already assumes that preemption is disabled on
entry. However, The function unnecessarily disables and enables
preemption and uses smp_processor_id() multiple times in and out of
the nested preemption disabled section. This gives the wrong
impression that the function might be able to handle being called with
preemption enabled and/or migrated to another processor in the middle.
Make it clear that the function is always called with preemption
disabled, drop the confusing preemption disable block and call
smp_processor_id() once at the beginning of the function.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: brgerst@gmail.com
LKML-Reference: <4D00B3B9.7060702@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Originally adapted from Huang Ying's patch which moved the
unknown_nmi_panic to the traps.c file. Because the old nmi
watchdog was deleted before this change happened, the
unknown_nmi_panic sysctl was lost. This re-adds it.
Also, the nmi_watchdog sysctl was re-implemented and its
documentation updated accordingly.
Patch-inspired-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Cc: fweisbec@gmail.com
LKML-Reference: <1291068437-5331-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
My patch that removed the old x86 nmi watchdog broke other
arches. This change reverts a piece of that patch and puts the
change in the correct spot.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: fweisbec@gmail.com
Cc: yinghai@kernel.org
LKML-Reference: <1291068437-5331-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
assign_to_mp_irq() is copying the struct mpc_intsrc members one by
one. That's silly. Use memcpy() and let the compiler figure it out.
Same for the identical function assign_to_mpc_intsrc()
mp_irq_mpc_intsrc_cmp() is comparing the struct members one by one,
but no caller ever checks the different return codes. Use memcmp()
instead.
Remove the extra printk in MP_ioapic_info()
Signed-off-by: Feng Tang <feng.tang@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: "Alan Cox <alan@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
LKML-Reference: <20101208151857.212f0018@feng-i7>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
There are 3 places defining similar functions of saving IRQ vector
info into mp_irqs[] array: mmparse/acpi/mrst.
Replace the redundant code by a common function in io_apic.c as it's
only called when CONFIG_X86_IO_APIC=y
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Alan Cox <alan@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20101207133204.4d913c5a@feng-i7>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
For 32bit mptable path, setup_ids_from_mpc() always writes the io_apic
id register, even there is no change needed.
Skip the write, when readout and mptable match.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
LKML-Reference: <4CFDF785.7010401@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
If x2apic is preenabled and used by the kernel, we don't need to map
the lapic address. That mapping will never be used.
So just skip that in register_lapic_address()
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF69C.9070501@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Remove the printk as well, we don't want to print when nothing
changed. We print in register_lapic_address() already.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF68A.7020902@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It is almost the same as smp_register_lapic_addr(). We just need to
let smp_read_mpc() call smp_register_lapic_addr() when early==1.
Add the apic_printk to smp_register_lapic_address()
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
LKML-Reference: <4CFDF681.3030509@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
They are the same, move the common function to apic.c to allow
further cleanups.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Len Brown <lenb@kernel.org>
LKML-Reference: <4CFDF675.4060305@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reason: apic cleanup series depends on x86/apic, x86/amd-nb x86/platform
Conflicts:
arch/x86/include/asm/io_apic.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/apic/io_apic.c: In function 'ack_apic_level':
arch/x86/kernel/apic/io_apic.c:2433: warning: unused variable 'desc'
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <201010272107.o9RL7rse018212@imap1.linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/apic/hw_nmi.c:29: warning: backtrace_mask defined but not used
commit 0e2af2a9(x86, hw_nmi: Move backtrace_mask declaration under
ARCH_HAS_NMI_WATCHDOG) addressed this warning, but it was reintroduced
by commit 5f2b0ba4(x86, nmi_watchdog: Remove the old nmi_watchdog).
Move backtrace_mask into the #ifdef arch_trigger_all_cpu_backtrace
section again.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <AANLkTi=rcc38QzoKa6LFy4m++-p_9=Zt4_kDQE=GeKxf@mail.gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sodaville needs to setup the IO_APIC ids as the boot loader leaves
them uninitialized. Split out the setter function so it can be called
unconditionally from the sodaville board code.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <20101126165020.GA26361@www.tglx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
dmar, x86: Use function stubs when CONFIG_INTR_REMAP is disabled
x86-64: Fix and clean up AMD Fam10 MMCONF enabling
x86: UV: Address interrupt/IO port operation conflict
x86: Use online node real index in calulate_tbl_offset()
x86, asm: Fix binutils 2.15 build failure
This patch for SGI UV systems addresses a problem whereby
interrupt transactions being looped back from a local IOH,
through the hub to a local CPU can (erroneously) conflict with
IO port operations and other transactions.
To workaound this we set a high bit in the APIC IDs used for
interrupts. This bit appears to be ignored by the sockets, but
it avoids the conflict in the hub.
Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
LKML-Reference: <20101116222352.GA8155@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
___
arch/x86/include/asm/uv/uv_hub.h | 4 ++++
arch/x86/include/asm/uv/uv_mmrs.h | 19 ++++++++++++++++++-
arch/x86/kernel/apic/x2apic_uv_x.c | 25 +++++++++++++++++++++++--
arch/x86/platform/uv/tlb_uv.c | 2 +-
arch/x86/platform/uv/uv_time.c | 4 +++-
5 files changed, 49 insertions(+), 5 deletions(-)
backtrace_mask has been used under the code context of
ARCH_HAS_NMI_WATCHDOG. So put it into that context.
We were warned by the following warning:
arch/x86/kernel/apic/hw_nmi.c:21: warning: ‘backtrace_mask’ defined but not used
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
LKML-Reference: <1289573455-3410-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that the bulk of the old nmi_watchdog is gone, remove all
the stub variables and hooks associated with it.
This touches lots of files mainly because of how the io_apic
nmi_watchdog was implemented. Now that the io_apic nmi_watchdog
is forever gone, remove all its fingers.
Most of this code was not being exercised by virtue of
nmi_watchdog != NMI_IO_APIC, so there shouldn't be anything to
risky here.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-3-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that we have a new nmi_watchdog that is more generic and
sits on top of the perf subsystem, we really do not need the old
nmi_watchdog any more.
In addition, the old nmi_watchdog doesn't really work if you are
using the default clocksource, hpet. The old nmi_watchdog code
relied on local apic interrupts to determine if the cpu is still
alive. With hpet as the clocksource, these interrupts don't
increment any more and the old nmi_watchdog triggers false
postives.
This piece removes the old nmi_watchdog code and stubs out any
variables and functions calls. The stubs are the same ones used
by the new nmi_watchdog code, so it should be well tested.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: gorcunov@openvz.org
LKML-Reference: <1289578944-28564-2-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A new version of the SGI UV hub node controller is being
developed. A few of the MMRs (control registers) that exist on
the current hub no longer exist on the new hub. Fortunately,
there are alternate MMRs that are are functionally equivalent
and that exist on both hubs.
This patch changes the UV code to use MMRs that exist in BOTH
versions of the hub node controller.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20101106204056.GA27584@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Russ Anderson reported:
| There is a regression that is causing a NULL pointer dereference
| in free_irte when shutting down xpc. git bisect narrowed it down
| to git commit d585d06(intr_remap: Simplify the code further), which
| changed free_irte(). Reverse applying the patch fixes the problem.
We need to use irq_remapped() for each irq instead of checking only
intr_remapping_enabled as there might be non remapped irqs even when
remapping is enabled.
[ tglx: use cfg instead of retrieving it again. Massaged changelog ]
Reported-bisected-and-tested-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <4CCBD511.40607@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, alternative: Call stop_machine_text_poke() on all cpus
x86-32: Restore irq stacks NUMA-aware allocations
x86, memblock: Fix early_node_mem with big reserved region.
* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, uv: More Westmere support on SGI UV
x86, uv: Enable Westmere support on SGI UV
Enable Westmere support for all APIC modes on SGI UV.
Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20101028224132.GB15804@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
and branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
xen: register xen pci notifier
xen: initialize cpu masks for pv guests in xen_smp_init
xen: add a missing #include to arch/x86/pci/xen.c
xen: mask the MTRR feature from the cpuid
xen: make hvc_xen console work for dom0.
xen: add the direct mapping area for ISA bus access
xen: Initialize xenbus for dom0.
xen: use vcpu_ops to setup cpu masks
xen: map a dummy page for local apic and ioapic in xen_set_fixmap
xen: remap MSIs into pirqs when running as initial domain
xen: remap GSIs as pirqs when running as initial domain
xen: introduce XEN_DOM0 as a silent option
xen: map MSIs into pirqs
xen: support GSI -> pirq remapping in PV on HVM guests
xen: add xen hvm acpi_register_gsi variant
acpi: use indirect call to register gsi in different modes
xen: implement xen_hvm_register_pirq
xen: get the maximum number of pirqs from xen
xen: support pirq != irq
* 'stable/xen-pcifront-0.8.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (27 commits)
X86/PCI: Remove the dependency on isapnp_disable.
xen: Update Makefile with CONFIG_BLOCK dependency for biomerge.c
MAINTAINERS: Add myself to the Xen Hypervisor Interface and remove Chris Wright.
x86: xen: Sanitse irq handling (part two)
swiotlb-xen: On x86-32 builts, select SWIOTLB instead of depending on it.
MAINTAINERS: Add myself for Xen PCI and Xen SWIOTLB maintainer.
xen/pci: Request ACS when Xen-SWIOTLB is activated.
xen-pcifront: Xen PCI frontend driver.
xenbus: prevent warnings on unhandled enumeration values
xenbus: Xen paravirtualised PCI hotplug support.
xen/x86/PCI: Add support for the Xen PCI subsystem
x86: Introduce x86_msi_ops
msi: Introduce default_[teardown|setup]_msi_irqs with fallback.
x86/PCI: Export pci_walk_bus function.
x86/PCI: make sure _PAGE_IOMAP it set on pci mappings
x86/PCI: Clean up pci_cache_line_size
xen: fix shared irq device passthrough
xen: Provide a variant of xen_poll_irq with timeout.
xen: Find an unbound irq number in reverse order (high to low).
xen: statically initialize cpu_evtchn_mask_p
...
Fix up trivial conflicts in drivers/pci/Makefile
Enable Westmere support on SGI UV. The UV initialization code is dependent on
the APICID bits. Westmere-EX uses different APIC bit mapping than Nehalem-EX.
This code reads the apic shift value from a UV MMR to do the proper bit
decoding to determint the pnode.
Signed-off-by: Russ Anderson <rja@sgi.com>
LKML-Reference: <20101026212728.GB15071@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This improves error messages in case the BIOS was setting up
wrong LVT offsets.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1288015419-29543-6-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
probe_br_irqs_gsi() is called right after ioapic_init_mappings() and
there are no other users. Move it into ioapic_init_mappings() so the
declaration can disappear and the function can become static.
Rename ioapic_init_mappings() to ioapic_and_gsi_init() to reflect that
change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1287510389-8388-2-git-send-email-dirk.brandewie@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Some embedded x86 platforms don't setup the APIC in the
BIOS/bootloader and would be forced to add "lapic" on the kernel
command line. That's a bit akward.
Split out the force enable code from detect_init_APIC() and allow
platform code to call it from the platform setup. That avoids the
command line parameter and possible replication of the MSR dance in
the force enable code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1287510389-8388-1-git-send-email-dirk.brandewie@gmail.com>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
* 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
xen: Cope with unmapped pages when initializing kernel pagetable
memblock, bootmem: Round pfn properly for memory and reserved regions
memblock: Annotate memblock functions with __init_memblock
memblock: Allow memblock_init to be called early
memblock/arm: Fix memblock_region_is_memory() typo
x86, memblock: Remove __memblock_x86_find_in_range_size()
memblock: Fix wraparound in find_region()
x86-32, memblock: Make add_highpages honor early reserved ranges
x86, memblock: Fix crashkernel allocation
arm, memblock: Fix the sparsemem build
memblock: Fix section mismatch warnings
powerpc, memblock: Fix memblock API change fallout
memblock, microblaze: Fix memblock API change fallout
x86: Remove old bootmem code
x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
x86: Remove not used early_res code
x86, memblock: Replace e820_/_early string with memblock_
x86: Use memblock to replace early_res
x86, memblock: Use memblock_debug to control debug message print out
...
Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
We want the BIOS to setup the EILVT APIC registers. The offsets
were hardcoded and BIOS settings were overwritten by the OS.
Now, the subsystems for MCE threshold and IBS determine the LVT
offset from the registers the BIOS has setup. If the BIOS setup
is buggy on a family 10h system, a workaround enables IBS. If
the OS determines an invalid register setup, a "[Firmware Bug]:
" error message is reported.
We need this change also for upcomming cpu families.
Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1286360874-1471-3-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements checks for the availability of LVT entries
(APIC500-530) and reserves it if used. The check becomes
necessary since we want to let the BIOS provide the LVT offsets.
The offsets should be determined by the subsystems using it
like those for MCE threshold or IBS. On K8 only offset 0
(APIC500) and MCE interrupts are supported. Beginning with
family 10h at least 4 offsets are available.
Since offsets must be consistent for all cores, we keep track of
the LVT offsets in software and reserve the offset for the same
vector also to be used on other cores. An offset is freed by
setting the entry to APIC_EILVT_MASKED.
If the BIOS is right, there should be no conflicts. Otherwise a
"[Firmware Bug]: ..." error message is generated. However, if
software does not properly determines the offsets, it is not
necessarily a BIOS bug.
Signed-off-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <1286360874-1471-2-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On a system that support intr-rempping when booting with "intremap=off"
[ 177.895501] BUG: unable to handle kernel NULL pointer dereference at 00000000000000f8
[ 177.913316] IP: [<ffffffff8145fc18>] free_irte+0x47/0xc0
...
[ 178.173326] Call Trace:
[ 178.173574] [<ffffffff810515b4>] destroy_irq+0x3a/0x75
[ 178.192934] [<ffffffff81051834>] arch_teardown_msi_irq+0xe/0x10
[ 178.193418] [<ffffffff81458dc3>] arch_teardown_msi_irqs+0x56/0x7f
[ 178.213021] [<ffffffff81458e79>] free_msi_irqs+0x8d/0xeb
Call free_irte only when interrupt remapping is enabled.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CBCB274.7010108@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Introduce an x86 specific indirect mechanism to setup MSIs.
The MSI setup functions become function pointers in an x86_msi_ops
struct, that defaults to the implementation in io_apic.c and msi.c.
[v2: Use HAVE_DEFAULT_* knobs]
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Impact: new interface to get max GSI
Add get_nr_irqs_gsi() to return nr_irqs_gsi. Xen will use this to
determine how many irqs it needs to reserve for hardware irqs.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Instead of looping through all interrupts, use the bitmap lookup to
find the next.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
irq_2_iommu is in struct irq_cfg, so we can do the irq_remapped check
based on irq_cfg instead of going through a lookup function. That's
especially interesting in the eoi_ioapic_irq() hotpath.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Switch over to the new allocator and remove all the magic which was
caused by the unability to destroy irq descriptors. Get rid of the
create_irq_nr() loop for sparse and non sparse irq.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
The sparseirq rework triggered a warning in the iommu code, which was
caused by setting up ioapic for ACPI irq 9 twice. This function is
solely to handle interrupts which are on a secondary ioapic and
outside the legacy irq range.
Replace the sparse irq_to_desc check with a non ifdeffed version.
[ tglx: Moved it before the ioapic sparse conversion and simplified
the inverse logic ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4CB00122.3030301@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Rename the grossly misnamed get_one_free_irq_cfg() to alloc_irq_cfg().
Add a (not yet used) irq number argument to free_irq_cfg()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Implement new allocator functions which make use of the core changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>