linux

Commit Graph

Author	SHA1	Message	Date
Boris Ostrovsky	65d0cf0be7	xen/PMU: Initialization code for Xen PMU Map shared data structure that will hold CPU registers, VPMU context, V/PCPU IDs of the CPU interrupted by PMU interrupt. Hypervisor fills this information in its handler and passes it to the guest for further processing. Set up PMU VIRQ. Now that perf infrastructure will assume that PMU is available on a PV guest we need to be careful and make sure that accesses via RDPMC instruction don't cause fatal traps by the hypervisor. Provide a nop RDPMC handler. For the same reason avoid issuing a warning on a write to APIC's LVTPC. Both of these will be made functional in later patches. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:25:20 +01:00
Boris Ostrovsky	5f14154882	xen/PMU: Sysfs interface for setting Xen PMU mode Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:26 +01:00
Juergen Gross	cb3eb85013	xen: remove no longer needed p2m.h Cleanup by removing arch/x86/xen/p2m.h as it isn't needed any more. Most definitions in this file are used in p2m.c only. Move those into p2m.c. set_phys_range_identity() is already declared in arch/x86/include/asm/xen/page.h, add __init annotation there. MAX_REMAP_RANGES isn't used at all, just delete it. The only define left is P2M_PER_PAGE which is moved to page.h as well. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:25 +01:00
Juergen Gross	c70727a5bc	xen: allow more than 512 GB of RAM for 64 bit pv-domains 64 bit pv-domains under Xen are limited to 512 GB of RAM today. The main reason has been the 3 level p2m tree, which was replaced by the virtual mapped linear p2m list. Parallel to the p2m list which is being used by the kernel itself there is a 3 level mfn tree for usage by the Xen tools and eventually for crash dump analysis. For this tree the linear p2m list can serve as a replacement, too. As the kernel can't know whether the tools are capable of dealing with the p2m list instead of the mfn tree, the limit of 512 GB can't be dropped in all cases. This patch replaces the hard limit by a kernel parameter which tells the kernel to obey the 512 GB limit or not. The default is selected by a configuration parameter which specifies whether the 512 GB limit should be active per default for domUs (domain save/restore/migration and crash dump analysis are affected). Memory above the domain limit is returned to the hypervisor instead of being identity mapped, which was wrong anyway. The kernel configuration parameter to specify the maximum size of a domain can be deleted, as it is not relevant any more. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:24 +01:00
Juergen Gross	70e6119955	xen: move p2m list if conflicting with e820 map Check whether the hypervisor supplied p2m list is placed at a location which is conflicting with the target E820 map. If this is the case relocate it to a new area unused up to now and compliant to the E820 map. As the p2m list might by huge (up to several GB) and is required to be mapped virtually, set up a temporary mapping for the copied list. For pvh domains just delete the p2m related information from start info instead of reserving the p2m memory, as we don't need it at all. For 32 bit kernels adjust the memblock_reserve() parameters in order to cover the page tables only. This requires to memblock_reserve() the start_info page on it's own. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:24 +01:00
Juergen Gross	6c2681c863	xen: add explicit memblock_reserve() calls for special pages Some special pages containing interfaces to xen are being reserved implicitly only today. The memblock_reserve() call to reserve them is meant to reserve the p2m list supplied by xen. It is just reserving not only the p2m list itself, but some more pages up to the start of the xen built page tables. To be able to move the p2m list to another pfn range, which is needed for support of huge RAM, this memblock_reserve() must be split up to cover all affected reserved pages explicitly. The affected pages are: - start_info page - xenstore ring (might be missing, mfn is 0 in this case) - console ring (not for initial domain) Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:23 +01:00
Juergen Gross	4b9c15377f	xen: check for initrd conflicting with e820 map Check whether the initrd is placed at a location which is conflicting with the target E820 map. If this is the case relocate it to a new area unused up to now and compliant to the E820 map. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:22 +01:00
Juergen Gross	04414baab5	xen: check pre-allocated page tables for conflict with memory map Check whether the page tables built by the domain builder are at memory addresses which are in conflict with the target memory map. If this is the case just panic instead of running into problems later. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:21 +01:00
Juergen Gross	808fdb7193	xen: check for kernel memory conflicting with memory layout Checks whether the pre-allocated memory of the loaded kernel is in conflict with the target memory map. If this is the case, just panic instead of run into problems later, as there is nothing we can do to repair this situation. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:21 +01:00
Juergen Gross	9ddac5b724	xen: find unused contiguous memory area For being able to relocate pre-allocated data areas like initrd or p2m list it is mandatory to find a contiguous memory area which is not yet in use and doesn't conflict with the memory map we want to be in effect. In case such an area is found reserve it at once as this will be required to be done in any case. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:20 +01:00
Juergen Gross	e612b4a7db	xen: check memory area against e820 map Provide a service routine to check a physical memory area against the E820 map. The routine will return false if the complete area is RAM according to the E820 map and true otherwise. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:20 +01:00
Juergen Gross	5097cdf6ce	xen: split counting of extra memory pages from remapping Memory pages in the initial memory setup done by the Xen hypervisor conflicting with the target E820 map are remapped. In order to do this those pages are counted and remapped in xen_set_identity_and_remap(). Split the counting from the remapping operation to be able to setup the needed memory sizes in time but doing the remap operation at a later time. This enables us to simplify the interface to xen_set_identity_and_remap() as the number of remapped and released pages is no longer needed here. Finally move the remapping further down to prepare relocating conflicting memory contents before the memory might be clobbered by xen_set_identity_and_remap(). This requires to not destroy the Xen E820 map when the one for the system is being constructed. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:19 +01:00
Juergen Gross	69632ecfcd	xen: move static e820 map to global scope Instead of using a function local static e820 map in xen_memory_setup() and calling various functions in the same source with the map as a parameter use a map directly accessible by all functions in the source. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:18 +01:00
Juergen Gross	8f5b0c6398	xen: eliminate scalability issues from initial mapping setup Direct Xen to place the initial P->M table outside of the initial mapping, as otherwise the 1G (implementation) / 2G (theoretical) restriction on the size of the initial mapping limits the amount of memory a domain can be handed initially. As the initial P->M table is copied rather early during boot to domain private memory and it's initial virtual mapping is dropped, the easiest way to avoid virtual address conflicts with other addresses in the kernel is to use a user address area for the virtual address of the initial P->M table. This allows us to just throw away the page tables of the initial mapping after the copy without having to care about address invalidation. It should be noted that this patch won't enable a pv-domain to USE more than 512 GB of RAM. It just enables it to be started with a P->M table covering more memory. This is especially important for being able to boot a Dom0 on a system with more than 512 GB memory. Signed-off-by: Juergen Gross <jgross@suse.com> Based-on-patch-by: Jan Beulich <jbeulich@suse.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:18 +01:00
Juergen Gross	d51e8b3e85	xen: don't build mfn tree if tools don't need it In case the Xen tools indicate they don't need the p2m 3 level tree as they support the virtual mapped linear p2m list, just omit building the tree. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:17 +01:00
Juergen Gross	4b9c9a1180	xen: save linear p2m list address in shared info structure The virtual address of the linear p2m list should be stored in the shared info structure read by the Xen tools to be able to support 64 bit pv-domains larger than 512 GB. Additionally the linear p2m list interface includes a generation count which is changed prior to and after each mapping change of the p2m list. Reading the generation count the Xen tools can detect changes of the mappings and re-read the p2m list eventually. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:17 +01:00
Colin Ian King	772f95e3b9	x86/xen: fix non-ANSI declaration of xen_has_pv_devices() xen_has_pv_devices() has no parameters, so use the normal void parameter convention to make it match the prototype in the header file include/xen/platform_pci.h. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:13 +01:00
David Vrabel	87ffd2b9bb	x86/xen: make CONFIG_XEN depend on CONFIG_X86_LOCAL_APIC Since commit `feb44f1f7a` (x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs) Xen guests need a full APIC driver and thus should depend on X86_LOCAL_APIC. This fixes an i386 build failure with !SMP && !CONFIG_X86_UP_APIC by disabling Xen support in this configuration. Users needing Xen support in a non-SMP i386 kernel will need to enable CONFIG_X86_UP_APIC. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: <stable@vger.kernel.org>	2015-08-20 11:45:43 +01:00
Ingo Molnar	a5dd192496	Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes Conflicts: arch/x86/entry/entry_64_compat.S arch/x86/math-emu/get_address.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-18 09:39:47 +02:00
Linus Torvalds	6b476e1140	xen: bug fixes for 4.2-rc6 - Revert a fix from 4.2-rc5 that was causing lots of WARNING spam. - Fix a memory leak affecting backends in HVM guests. - Fix PV domU hang with certain configurations. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVzHsAAAoJEFxbo/MsZsTR9Y0H/2j1PHt29RPcNdgGQ84AH0Wh tw1emL8rMcdhWQnsO7bNmywNNvRNQnU3ZJ8dzoq+5GPikNsbfQzYc7U2pIL4A+gB AAJsNDNzecuq4srk8vNxcmZ7ySvm9w6dccDUex2ge3sNWaq6gzSQvz6FSWiL0Sxg k3JcnemEg6JrYOTWdxKInAORMcRO6rgx9eIsdPUPOpgC5XLg6/mZOqBAWXIksDvs V9uCMqQicaUgBgKFIOSllqH6fcCNooRu3aDwNNj/2mMcJmEvMeBkHmNlQgEm2j5L ubdDyrC5y48TUPJm8i3+W2/AY+kgWzhThcqyVy6LRAAj5RItJFxMf0nMXzIEqlQ= =UgMy -----END PGP SIGNATURE----- Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - revert a fix from 4.2-rc5 that was causing lots of WARNING spam. - fix a memory leak affecting backends in HVM guests. - fix PV domU hang with certain configurations. * tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: Don't leak memory when unmapping the ring on HVM backend Revert "xen/events/fifo: Handle linked events when closing a port" x86/xen: build "Xen PV" APIC driver for domU as well	2015-08-13 13:36:22 -07:00
Jason A. Donenfeld	fc5fee86bd	x86/xen: build "Xen PV" APIC driver for domU as well It turns out that a PV domU also requires the "Xen PV" APIC driver. Otherwise, the flat driver is used and we get stuck in busy loops that never exit, such as in this stack trace: (gdb) target remote localhost:9999 Remote debugging using localhost:9999 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 56 while (native_apic_mem_read(APIC_ICR) & APIC_ICR_BUSY) (gdb) bt #0 __xapic_wait_icr_idle () at ./arch/x86/include/asm/ipi.h:56 #1 __default_send_IPI_shortcut (shortcut=<optimized out>, dest=<optimized out>, vector=<optimized out>) at ./arch/x86/include/asm/ipi.h:75 #2 apic_send_IPI_self (vector=246) at arch/x86/kernel/apic/probe_64.c:54 #3 0xffffffff81011336 in arch_irq_work_raise () at arch/x86/kernel/irq_work.c:47 #4 0xffffffff8114990c in irq_work_queue (work=0xffff88000fc0e400) at kernel/irq_work.c:100 #5 0xffffffff8110c29d in wake_up_klogd () at kernel/printk/printk.c:2633 #6 0xffffffff8110ca60 in vprintk_emit (facility=0, level=<optimized out>, dict=0x0 <irq_stack_union>, dictlen=<optimized out>, fmt=<optimized out>, args=<optimized out>) at kernel/printk/printk.c:1778 #7 0xffffffff816010c8 in printk (fmt=<optimized out>) at kernel/printk/printk.c:1868 #8 0xffffffffc00013ea in ?? () #9 0x0000000000000000 in ?? () Mailing-list-thread: https://lkml.org/lkml/2015/8/4/755 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-10 15:33:10 +01:00
Ingo Molnar	5b929bd11d	Merge branch 'x86/urgent' into x86/asm, before applying dependent patches Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-31 10:23:35 +02:00
Andy Lutomirski	aa1acff356	x86/xen: Probe target addresses in set_aliased_prot() before the hypercall The update_va_mapping hypercall can fail if the VA isn't present in the guest's page tables. Under certain loads, this can result in an OOPS when the target address is in unpopulated vmap space. While we're at it, add comments to help explain what's going on. This isn't a great long-term fix. This code should probably be changed to use something like set_memory_ro. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Vrabel <dvrabel@cantab.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org <security@kernel.org> Cc: <stable@vger.kernel.org> Cc: xen-devel <xen-devel@lists.xen.org> Link: http://lkml.kernel.org/r/0b0e55b995cda11e7829f140b833ef932fcabe3a.1438291540.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-31 10:23:22 +02:00
Viresh Kumar	955381dd65	x86/xen/time: Migrate to new set-state interface Migrate xen driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. Callbacks aren't implemented for modes where we weren't doing anything. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: linaro-kernel@lists.linaro.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: xen-devel@lists.xenproject.org (moderated list:XEN HYPERVISOR INTERFACE) Link: http://lkml.kernel.org/r/881eea6e1a3d483cd33e044cd34827cce26a57fd.1437042675.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2015-07-30 21:25:38 +02:00
Andy Lutomirski	9261e050b6	x86/asm/tsc, x86/paravirt: Remove read_tsc() and read_tscp() paravirt hooks We've had ->read_tsc() and ->read_tscp() paravirt hooks since the very beginning of paravirt, i.e., `d3561b7fa0` ("[PATCH] paravirt: header and stubs for paravirtualisation"). AFAICT, the only paravirt guest implementation that ever replaced these calls was vmware, and it's gone. Arguably even vmware shouldn't have hooked RDTSC -- we fully support systems that don't have a TSC at all, so there's no point for a paravirt implementation to pretend that we have a TSC but to replace it. I also doubt that these hooks actually worked. Calls to rdtscl() and rdtscll(), which respected the hooks, were used seemingly interchangeably with native_read_tsc(), which did not. Just remove them. If anyone ever needs them again, they can try to make a case for why they need them. Before, on a paravirt config: text data bss dec hex filename 12618257 1816384 `1093632` 15528273 ecf151 vmlinux After: text data bss dec hex filename 12617207 1816384 `1093632` 15527223 eced37 vmlinux Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Huang Rui <ray.huang@amd.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: virtualization@lists.linux-foundation.org Link: http://lkml.kernel.org/r/d08a2600fb298af163681e5efd8e599d889a5b97.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-07-06 15:23:26 +02:00
Linus Torvalds	d70b3ef54c	Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Ingo Molnar: "There were so many changes in the x86/asm, x86/apic and x86/mm topics in this cycle that the topical separation of -tip broke down somewhat - so the result is a more traditional architecture pull request, collected into the 'x86/core' topic. The topics were still maintained separately as far as possible, so bisectability and conceptual separation should still be pretty good - but there were a handful of merge points to avoid excessive dependencies (and conflicts) that would have been poorly tested in the end. The next cycle will hopefully be much more quiet (or at least will have fewer dependencies). The main changes in this cycle were: * x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas Gleixner) - This is the second and most intrusive part of changes to the x86 interrupt handling - full conversion to hierarchical interrupt domains: [IOAPIC domain] ----- \| [MSI domain] --------[Remapping domain] ----- [ Vector domain ] \| (optional) \| [HPET MSI domain] ----- \| \| [DMAR domain] ----------------------------- \| [Legacy domain] ----------------------------- This now reflects the actual hardware and allowed us to distangle the domain specific code from the underlying parent domain, which can be optional in the case of interrupt remapping. It's a clear separation of functionality and removes quite some duct tape constructs which plugged the remap code between ioapic/msi/hpet and the vector management. - Intel IOMMU IRQ remapping enhancements, to allow direct interrupt injection into guests (Feng Wu) * x86/asm changes: - Tons of cleanups and small speedups, micro-optimizations. This is in preparation to move a good chunk of the low level entry code from assembly to C code (Denys Vlasenko, Andy Lutomirski, Brian Gerst) - Moved all system entry related code to a new home under arch/x86/entry/ (Ingo Molnar) - Removal of the fragile and ugly CFI dwarf debuginfo annotations. Conversion to C will reintroduce many of them - but meanwhile they are only getting in the way, and the upstream kernel does not rely on them (Ingo Molnar) - NOP handling refinements. (Borislav Petkov) * x86/mm changes: - Big PAT and MTRR rework: making the code more robust and preparing to phase out exposing direct MTRR interfaces to drivers - in favor of using PAT driven interfaces (Toshi Kani, Luis R Rodriguez, Borislav Petkov) - New ioremap_wt()/set_memory_wt() interfaces to support Write-Through cached memory mappings. This is especially important for good performance on NVDIMM hardware (Toshi Kani) * x86/ras changes: - Add support for deferred errors on AMD (Aravind Gopalakrishnan) This is an important RAS feature which adds hardware support for poisoned data. That means roughly that the hardware marks data which it has detected as corrupted but wasn't able to correct, as poisoned data and raises an APIC interrupt to signal that in the form of a deferred error. It is the OS's responsibility then to take proper recovery action and thus prolonge system lifetime as far as possible. - Add support for Intel "Local MCE"s: upcoming CPUs will support CPU-local MCE interrupts, as opposed to the traditional system- wide broadcasted MCE interrupts (Ashok Raj) - Misc cleanups (Borislav Petkov) * x86/platform changes: - Intel Atom SoC updates ... and lots of other cleanups, fixlets and other changes - see the shortlog and the Git log for details" * 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits) x86/hpet: Use proper hpet device number for MSI allocation x86/hpet: Check for irq==0 when allocating hpet MSI interrupts x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail genirq: Prevent crash in irq_move_irq() genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain iommu, x86: Properly handle posted interrupts for IOMMU hotplug iommu, x86: Provide irq_remapping_cap() interface iommu, x86: Setup Posted-Interrupts capability for Intel iommu iommu, x86: Add cap_pi_support() to detect VT-d PI capability iommu, x86: Avoid migrating VT-d posted interrupts iommu, x86: Save the mode (posted or remapped) of an IRTE iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip iommu: dmar: Provide helper to copy shared irte fields iommu: dmar: Extend struct irte for VT-d Posted-Interrupts iommu: Add new member capability to struct irq_remap_ops x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry() ...	2015-06-22 17:59:09 -07:00
Linus Torvalds	e75c73ad64	Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Ingo Molnar: "This tree contains two main changes: - The big FPU code rewrite: wide reaching cleanups and reorganization that pulls all the FPU code together into a clean base in arch/x86/fpu/. The resulting code is leaner and faster, and much easier to understand. This enables future work to further simplify the FPU code (such as removing lazy FPU restores). By its nature these changes have a substantial regression risk: FPU code related bugs are long lived, because races are often subtle and bugs mask as user-space failures that are difficult to track back to kernel side backs. I'm aware of no unfixed (or even suspected) FPU related regression so far. - MPX support rework/fixes. As this is still not a released CPU feature, there were some buglets in the code - should be much more robust now (Dave Hansen)" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (250 commits) x86/fpu: Fix double-increment in setup_xstate_features() x86/mpx: Allow 32-bit binaries on 64-bit kernels again x86/mpx: Do not count MPX VMAs as neighbors when unmapping x86/mpx: Rewrite the unmap code x86/mpx: Support 32-bit binaries on 64-bit kernels x86/mpx: Use 32-bit-only cmpxchg() for 32-bit apps x86/mpx: Introduce new 'directory entry' to 'addr' helper function x86/mpx: Add temporary variable to reduce masking x86: Make is_64bit_mm() widely available x86/mpx: Trace allocation of new bounds tables x86/mpx: Trace the attempts to find bounds tables x86/mpx: Trace entry to bounds exception paths x86/mpx: Trace #BR exceptions x86/mpx: Introduce a boot-time disable flag x86/mpx: Restrict the mmap() size check to bounds tables x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK x86/mpx: Clean up the code by not passing a task pointer around when unnecessary x86/mpx: Use the new get_xsave_field_ptr()API x86/fpu/xstate: Wrap get_xsave_addr() to make it safer x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions ...	2015-06-22 17:16:11 -07:00
Ingo Molnar	7ef3d7d58d	Merge branches 'x86/apic', 'x86/asm', 'x86/mm' and 'x86/platform' into x86/core, to merge last updates Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-22 09:15:03 +02:00
Ingo Molnar	9dda1658a9	Merge branch 'x86/asm' into x86/core, to prepare for new patch Collect all changes to arch/x86/entry/entry_64.S, before applying patch that changes most of the file. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-08 20:48:20 +02:00
Ingo Molnar	b2502b418e	x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 The 'system_call' entry points differ starkly between native 32-bit and 64-bit kernels: on 32-bit kernels it defines the INT 0x80 entry point, while on 64-bit it's the SYSCALL entry point. This is pretty confusing when looking at generic code, and it also obscures the nature of the entry point at the assembly level. So unangle this by splitting the name into its two uses: system_call (32) -> entry_INT80_32 system_call (64) -> entry_SYSCALL_64 As per the generic naming scheme for x86 system call entry points: entry_MNEMONIC_qualifier where 'qualifier' is one of _32, _64 or _compat. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-08 09:14:21 +02:00
Ingo Molnar	4c8cd0c50d	x86/asm/entry: Untangle 'ia32_sysenter_target' into two entry points: entry_SYSENTER_32 and entry_SYSENTER_compat So the SYSENTER instruction is pretty quirky and it has different behavior depending on bitness and CPU maker. Yet we create a false sense of coherency by naming it 'ia32_sysenter_target' in both of the cases. Split the name into its two uses: ia32_sysenter_target (32) -> entry_SYSENTER_32 ia32_sysenter_target (64) -> entry_SYSENTER_compat As per the generic naming scheme for x86 system call entry points: entry_MNEMONIC_qualifier where 'qualifier' is one of _32, _64 or _compat. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-08 08:47:46 +02:00
Ingo Molnar	2cd23553b4	x86/asm/entry: Rename compat syscall entry points Rename the following system call entry points: ia32_cstar_target -> entry_SYSCALL_compat ia32_syscall -> entry_INT80_compat The generic naming scheme for x86 system call entry points is: entry_MNEMONIC_qualifier where 'qualifier' is one of _32, _64 or _compat. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-08 08:47:36 +02:00
Borislav Petkov	9cd25aac1f	x86/mm/pat: Emulate PAT when it is disabled In the case when PAT is disabled on the command line with "nopat" or when virtualization doesn't support PAT (correctly) - see `9d34cfdf47` ("x86: Don't rely on VMWare emulating PAT MSR correctly"). we emulate it using the PWT and PCD cache attribute bits. Get rid of boot_pat_state while at it. Based on a conglomerate patch from Toshi Kani. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Toshi Kani <toshi.kani@hp.com> Acked-by: Juergen Gross <jgross@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Elliott@hp.com Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: hch@lst.de Cc: hmh@hmh.eng.br Cc: konrad.wilk@oracle.com Cc: linux-mm <linux-mm@kvack.org> Cc: linux-nvdimm@lists.01.org Cc: stefan.bader@canonical.com Cc: yigal@plexistor.com Link: http://lkml.kernel.org/r/1433436928-31903-3-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-07 15:28:52 +02:00
Stephen Rothwell	d6472302f2	x86/mm: Decouple <linux/vmalloc.h> from <asm/io.h> Nothing in <asm/io.h> uses anything from <linux/vmalloc.h>, so remove it from there and fix up the resulting build problems triggered on x86 {64\|32}-bit {def\|allmod\|allno}configs. The breakages were triggering in places where x86 builds relied on vmalloc() facilities but did not include <linux/vmalloc.h> explicitly and relied on the implicit inclusion via <asm/io.h>. Also add: - <linux/init.h> to <linux/io.h> - <asm/pgtable_types> to <asm/io.h> ... which were two other implicit header file dependencies. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> [ Tidied up the changelog. ] Acked-by: David Miller <davem@davemloft.net> Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Vinod Koul <vinod.koul@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Colin Cross <ccross@android.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: James E.J. Bottomley <JBottomley@odin.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Kristen Carlson Accardi <kristen@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Suma Ramars <sramars@cisco.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-03 12:02:00 +02:00
Ingo Molnar	71966f3a0b	Merge branch 'locking/core' into x86/core, to prepare for dependent patch Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-06-03 10:07:35 +02:00
Ingo Molnar	21c4cd108a	x86/fpu: Simplify fpu__cpu_init() After the latest round of cleanups, fpu__cpu_init() has become a simple call to fpu__init_cpu(). Rename fpu__init_cpu() to fpu__cpu_init() and remove the extra layer. Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-19 15:47:44 +02:00
Ingo Molnar	3a9c4b0d7e	x86/fpu: Rename fpu_init() to fpu__cpu_init() fpu_init() is a bit of a misnomer in that it (falsely) creates the impression that it's related to the (old) fpu_finit() function, which initializes FPU ctx state. Rename it to fpu__cpu_init() to make its boot time initialization clear, and to move it to the fpu__*() namespace. Also fix and extend its comment block to point out that it's called not only on the boot CPU, but on secondary CPUs as well. Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-19 15:47:14 +02:00
Ingo Molnar	62c7a1e9ae	locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS Valentin Rothberg reported that we use CONFIG_QUEUED_SPINLOCKS in arch/x86/kernel/paravirt_patch_32.c, while the symbol is called CONFIG_QUEUED_SPINLOCK. (Note the extra 'S') But the typo was natural: the proper English term for such a generic object would be 'queued spinlocks' - so rename this and related symbols accordingly to the plural form. Reported-by: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <Waiman.Long@hp.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-11 09:52:09 +02:00
Denys Vlasenko	3a23208e69	x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code 32-bit code has PER_CPU_VAR(cpu_current_top_of_stack). 64-bit code uses somewhat more obscure: PER_CPU_VAR(cpu_tss + TSS_sp0). Define the 'cpu_current_top_of_stack' macro on CONFIG_X86_64 as well so that the PER_CPU_VAR(cpu_current_top_of_stack) expression can be used in both 32-bit and 64-bit code. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1429889495-27850-3-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-08 13:50:02 +02:00
Denys Vlasenko	63332a8455	x86/entry: Stop using PER_CPU_VAR(kernel_stack) PER_CPU_VAR(kernel_stack) is redundant: - On the 64-bit build, we can use PER_CPU_VAR(cpu_tss + TSS_sp0). - On the 32-bit build, we can use PER_CPU_VAR(cpu_current_top_of_stack). PER_CPU_VAR(kernel_stack) will be deleted by a separate change. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1429889495-27850-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-08 13:43:52 +02:00
Ingo Molnar	7ae383be81	Merge branch 'linus' into x86/asm, before applying dependent patch Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-08 13:33:33 +02:00
David Vrabel	e95e6f176c	locking/pvqspinlock, x86: Enable PV qspinlock for Xen This patch adds the necessary Xen specific code to allow Xen to support the CPU halting and kicking operations needed by the queue spinlock PV code. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Daniel J Blueman <daniel@numascale.com> Cc: Douglas Hatch <doug.hatch@hp.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paolo Bonzini <paolo.bonzini@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Cc: Scott J Norton <scott.norton@hp.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1429901803-29771-12-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-05-08 12:37:18 +02:00
Boris Ostrovsky	a71dbdaa8c	hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests Commit `61f01dd941` ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue") makes AMD processors set SS to __KERNEL_DS in __switch_to() to deal with cases when SS is NULL. This breaks Xen PV guests who do not want to load SS with__KERNEL_DS. Since the problem that the commit is trying to address would have to be fixed in the hypervisor (if it in fact exists under Xen) there is no reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here. This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features op which will clear this flag. (And since this structure is no longer HVM-specific we should do some renaming). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-05-05 18:27:43 +01:00
Boris Ostrovsky	2b953a5e99	xen: Suspend ticks on all CPUs during suspend Commit `77e32c89a7` ("clockevents: Manage device's state separately for the core") decouples clockevent device's modes from states. With this change when a Xen guest tries to resume, it won't be calling its set_mode op which needs to be done on each VCPU in order to make the hypervisor aware that we are in oneshot mode. This happens because clockevents_tick_resume() (which is an intermediate step of resuming ticks on a processor) doesn't call clockevents_set_state() anymore and because during suspend clockevent devices on all VCPUs (except for the one doing the suspend) are left in ONESHOT state. As result, during resume the clockevents state machine will assume that device is already where it should be and doesn't need to be updated. To avoid this problem we should suspend ticks on all VCPUs during suspend. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-04-29 17:10:05 +01:00
Andy Lutomirski	aac82d3191	x86, paravirt, xen: Remove the 64-bit ->irq_enable_sysexit() pvop We don't use irq_enable_sysexit on 64-bit kernels any more. Remove all the paravirt and Xen machinery to support it on 64-bit kernels. Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Denys Vlasenko <vda.linux@googlemail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8a03355698fe5b94194e9e7360f19f91c1b2cf1f.1428100853.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-22 08:07:45 +02:00
Linus Torvalds	497a5df7bf	xen: features and fixes for 4.1-rc0 - Use a single source list of hypercalls, generating other tables etc. at build time. - Add a "Xen PV" APIC driver to support >255 VCPUs in PV guests. - Significant performance improve to guest save/restore/migration. - scsiback/front save/restore support. - Infrastructure for multi-page xenbus rings. - Misc fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJVL6OZAAoJEFxbo/MsZsTRs3YH/2AycBHs129DNnc2OLmOklBz AdD43k+FOfZlv0YU80WmPVmOpGGHGB5Pqkix2KtnvPYmtx3pb/5ikhDwSTWZpqBl Qq6/RgsRjYZ8VMKqrMTkJMrJWHQYbg8lgsP5810nsFBn/Qdbxms+WBqpMkFVo3b2 rvUZj8QijMJPS3qr55DklVaOlXV4+sTAytTdCiubVnaB/agM2jjRflp/lnJrhtTg yc4NTrIlD1RsMV/lNh92upBP/pCm6Bs0zQ2H1v3hkdhBBmaO0IVXpSheYhfDOHfo 9v209n137N7X86CGWImFk6m2b+EfiFnLFir07zKSA+iZwkYKn75znSdPfj0KCc0= =bxTm -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - use a single source list of hypercalls, generating other tables etc. at build time. - add a "Xen PV" APIC driver to support >255 VCPUs in PV guests. - significant performance improve to guest save/restore/migration. - scsiback/front save/restore support. - infrastructure for multi-page xenbus rings. - misc fixes. * tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pci: Try harder to get PXM information for Xen xenbus_client: Extend interface to support multi-page ring xen-pciback: also support disabling of bus-mastering and memory-write-invalidate xen: support suspend/resume in pvscsi frontend xen: scsiback: add LUN of restored domain xen-scsiback: define a pr_fmt macro with xen-pvscsi xen/mce: fix up xen_late_init_mcelog() error handling xen/privcmd: improve performance of MMAPBATCH_V2 xen: unify foreign GFN map/unmap for auto-xlated physmap guests x86/xen/apic: WARN with details. x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs xen/pciback: Don't print scary messages when unsupported by hypervisor. xen: use generated hypercall symbols in arch/x86/xen/xen-head.S xen: use generated hypervisor symbols in arch/x86/xen/trace.c xen: synchronize include/xen/interface/xen.h with xen xen: build infrastructure for generating hypercall depending symbols xen: balloon: Use static attribute groups for sysfs entries xen: pcpu: Use static attribute groups for sysfs entry	2015-04-16 14:01:03 -05:00
Linus Torvalds	1dcf58d6e6	Merge branch 'akpm' (patches from Andrew) Merge first patchbomb from Andrew Morton: - arch/sh updates - ocfs2 updates - kernel/watchdog feature - about half of mm/ * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (122 commits) Documentation: update arch list in the 'memtest' entry Kconfig: memtest: update number of test patterns up to 17 arm: add support for memtest arm64: add support for memtest memtest: use phys_addr_t for physical addresses mm: move memtest under mm mm, hugetlb: abort __get_user_pages if current has been oom killed mm, mempool: do not allow atomic resizing memcg: print cgroup information when system panics due to panic_on_oom mm: numa: remove migrate_ratelimited mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE mm: split ET_DYN ASLR from mmap ASLR s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE mm: expose arch_mmap_rnd when available s390: standardize mmap_rnd() usage powerpc: standardize mmap_rnd() usage mips: extract logic for mmap_rnd() arm64: standardize mmap_rnd() usage x86: standardize mmap_rnd() usage arm: factor out mmap ASLR into mmap_rnd ...	2015-04-14 16:49:17 -07:00
Kirill A. Shutemov	9823336833	x86: expose number of page table levels on Kconfig level We would want to use number of page table level to define mm_struct. Let's expose it as CONFIG_PGTABLE_LEVELS. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-04-14 16:49:02 -07:00
Linus Torvalds	078838d565	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "The main changes in this cycle were: - changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - add in-kernel API to enable and disable expediting of normal RCU grace periods. - improve RCU's handling of (hotplug-) outgoing CPUs. - NO_HZ_FULL_SYSIDLE fixes. - tiny-RCU updates to make it more tiny. - documentation updates. - miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) cpu: Provide smpboot_thread_init() on !CONFIG_SMP kernels as well cpu: Defer smpboot kthread unparking until CPU known to scheduler rcu: Associate quiescent-state reports with grace period rcu: Yet another fix for preemption and CPU hotplug rcu: Add diagnostics to grace-period cleanup rcutorture: Default to grace-period-initialization delays rcu: Handle outgoing CPUs on exit from idle loop cpu: Make CPU-offline idle-loop transition point more precise rcu: Eliminate ->onoff_mutex from rcu_node structure rcu: Process offlining and onlining only at grace-period start rcu: Move rcu_report_unblock_qs_rnp() to common code rcu: Rework preemptible expedited bitmask handling rcu: Remove event tracing from rcu_cpu_notify(), used by offline CPUs rcutorture: Enable slow grace-period initializations rcu: Provide diagnostic option to slow down grace-period initialization rcu: Detect stalls caused by failure to propagate up rcu_node tree rcu: Eliminate empty HOTPLUG_CPU ifdef rcu: Simplify sync_rcu_preempt_exp_init() rcu: Put all orphan-callback-related code under same comment rcu: Consolidate offline-CPU callback initialization ...	2015-04-14 13:36:04 -07:00
Linus Torvalds	60f898eeaa	Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm changes from Ingo Molnar: "There were lots of changes in this development cycle: - over 100 separate cleanups, restructuring changes, speedups and fixes in the x86 system call, irq, trap and other entry code, part of a heroic effort to deobfuscate a decade old spaghetti asm code and its C code dependencies (Denys Vlasenko, Andy Lutomirski) - alternatives code fixes and enhancements (Borislav Petkov) - simplifications and cleanups to the compat code (Brian Gerst) - signal handling fixes and new x86 testcases (Andy Lutomirski) - various other fixes and cleanups By their nature many of these changes are risky - we tried to test them well on many different x86 systems (there are no known regressions), and they are split up finely to help bisection - but there's still a fair bit of residual risk left so caveat emptor" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (148 commits) perf/x86/64: Report regs_user->ax too in get_regs_user() perf/x86/64: Simplify regs_user->abi setting code in get_regs_user() perf/x86/64: Do report user_regs->cx while we are in syscall, in get_regs_user() perf/x86/64: Do not guess user_regs->cs, ss, sp in get_regs_user() x86/asm/entry/32: Tidy up JNZ instructions after TESTs x86/asm/entry/64: Reduce padding in execve stubs x86/asm/entry/64: Remove GET_THREAD_INFO() in ret_from_fork x86/asm/entry/64: Simplify jumps in ret_from_fork x86/asm/entry/64: Remove a redundant jump x86/asm/entry/64: Optimize [v]fork/clone stubs x86/asm/entry: Zero EXTRA_REGS for stub32_execve() too x86/asm/entry/64: Move stub_x32_execvecloser() to stub_execveat() x86/asm/entry/64: Use common code for rt_sigreturn() epilogue x86/asm/entry/64: Add forgotten CFI annotation x86/asm/entry/irq: Simplify interrupt dispatch table (IDT) layout x86/asm/entry/64: Move opportunistic sysret code to syscall code path x86, selftests: Add sigreturn selftest x86/alternatives: Guard NOPs optimization x86/asm/entry: Clear EXTRA_REGS for all executable formats x86/signal: Remove pax argument from restore_sigcontext ...	2015-04-13 13:16:36 -07:00
Linus Torvalds	7fd56474db	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Ingo Molnar: "The main changes in this cycle were: - clockevents state machine cleanups and enhancements (Viresh Kumar) - clockevents broadcast notifier horror to state machine conversion and related cleanups (Thomas Gleixner, Rafael J Wysocki) - clocksource and timekeeping core updates (John Stultz) - clocksource driver updates and fixes (Ben Dooks, Dmitry Osipenko, Hans de Goede, Laurent Pinchart, Maxime Ripard, Xunlei Pang) - y2038 fixes (Xunlei Pang, John Stultz) - NMI-safe ktime_get_raw_fast() and general refactoring of the clock code, in preparation to perf's per event clock ID support (Peter Zijlstra) - generic sched/clock fixes, optimizations and cleanups (Daniel Thompson) - clockevents cpu_down() race fix (Preeti U Murthy)" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits) timers/PM: Drop unnecessary braces from tick_freeze() timers/PM: Fix up tick_unfreeze() timekeeping: Get rid of stale comment clockevents: Cleanup dead cpu explicitely clockevents: Make tick handover explicit clockevents: Remove broadcast oneshot control leftovers sched/idle: Use explicit broadcast oneshot control function ARM: Tegra: Use explicit broadcast oneshot control function ARM: OMAP: Use explicit broadcast oneshot control function intel_idle: Use explicit broadcast oneshot control function ACPI/idle: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast oneshot control function x86/amd/idle, clockevents: Use explicit broadcast oneshot control functions clockevents: Provide explicit broadcast oneshot control functions clockevents: Remove the broadcast control leftovers ARM: OMAP: Use explicit broadcast control function intel_idle: Use explicit broadcast control function cpuidle: Use explicit broadcast control function ACPI/processor: Use explicit broadcast control function ACPI/PAD: Use explicit broadcast control function ...	2015-04-13 11:08:28 -07:00
Ingo Molnar	4bcc7827b0	Linux 4.0-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVIws/AAoJEHm+PkMAQRiGwEcH/1GCBqrBzXaKwDdCPMRcYVUb MYkXmGkCGRYWe5MXI8QNAaa/CdG6mAFMHWN6CaMMpLTxnM1m87uBg01fQMsh73BO mRVLKE/soiJDnR1gYzBBDBYV/AUvytN5PhgeNaA95YIJvU3T1f3iTnV8vs30Dp0L YpxSqwr3C0k7C9IE0VcgfzvWJPCnQ9IWHuX3jn5s1XjGKVNbBYHMt6FusHdyXMfT dp8ksuGHwm30mTFI5xJpKOrRzfi+P5EsEUrsnFRPRM/iFTVrM5R7eaUhsRZb2+Wo YApnbYhUYz7om1AuQ+UZ/+S6y7ZLlGWegI1lWI754GIsczG5vPHEYhhgkzMhTsc= =kR1V -----END PGP SIGNATURE----- Merge tag 'v4.0-rc7' into x86/asm, to resolve conflicts Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-08 09:01:54 +02:00
Boris Ostrovsky	3f85483bd8	x86/cpu: Factor out common CPU initialization code, fix 32-bit Xen PV guests Some of x86 bare-metal and Xen CPU initialization code is common between the two and therefore can be factored out to avoid code duplication. As a side effect, doing so will also extend the fix provided by commit `a7fcf28d43` ("x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() to x86_32") to 32-bit Xen PV guests. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: konrad.wilk@oracle.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1427897534-5086-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-02 12:06:41 +02:00
Thomas Gleixner	f46481d0a7	tick/xen: Provide and use tick_suspend_local() and tick_resume_local() Xen calls on every cpu into tick_resume() which is just wrong. tick_resume() is for the syscore global suspend/resume invocation. What XEN really wants is a per cpu local resume function. Provide a tick_resume_local() function and use it in XEN. Also provide a complementary tick_suspend_local() and modify tick_unfreeze() and tick_freeze(), respectively, to use the new local tick resume/suspend functions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Combined two patches, rebased, modified subject/changelog. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1698741.eezk9tnXtG@vostro.rjw.lan [ Merged to latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 14:23:00 +02:00
Thomas Gleixner	4ffee521f3	clockevents: Make suspend/resume calls explicit clockevents_notify() is a leftover from the early design of the clockevents facility. It's really not a notification mechanism, it's a multiplex call. We are way better off to have explicit calls instead of this monstrosity. Split out the suspend/resume() calls and invoke them directly from the call sites. No locking required at this point because these calls happen with interrupts disabled and a single cpu online. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ Rebased on top of 4.0-rc5. ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/713674030.jVm1qaHuPf@vostro.rjw.lan [ Rebased on top of latest timers/core. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-04-01 14:22:59 +02:00
Ingo Molnar	4bfe186dbe	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Changes permitting use of call_rcu() and friends very early in boot, for example, before rcu_init() is invoked. - Miscellaneous fixes. - Add in-kernel API to enable and disable expediting of normal RCU grace periods. - Improve RCU's handling of (hotplug-) outgoing CPUs. Note: ARM support is lagging a bit here, and these improved diagnostics might generate (harmless) splats. - NO_HZ_FULL_SYSIDLE fixes. - Tiny RCU updates to make it more tiny. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-27 10:04:06 +01:00
Denys Vlasenko	ef593260f0	x86/asm/entry: Get rid of KERNEL_STACK_OFFSET PER_CPU_VAR(kernel_stack) was set up in a way where it points five stack slots below the top of stack. Presumably, it was done to avoid one "sub $5*8,%rsp" in syscall/sysenter code paths, where iret frame needs to be created by hand. Ironically, none of them benefits from this optimization, since all of them need to allocate additional data on stack (struct pt_regs), so they still have to perform subtraction. This patch eliminates KERNEL_STACK_OFFSET. PER_CPU_VAR(kernel_stack) now points directly to top of stack. pt_regs allocations are adjusted to allocate iret frame as well. Hopefully we can merge it later with 32-bit specific PER_CPU_VAR(cpu_current_top_of_stack) variable... Net result in generated code is that constants in several insns are changed. This change is necessary for changing struct pt_regs creation in SYSCALL64 code path from MOV to PUSH instructions. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1426785469-15125-2-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-24 19:42:38 +01:00
Juergen Gross	633d6f17cd	x86/xen: prepare p2m list for memory hotplug Commit `054954eb05` ("xen: switch to linear virtual mapped sparse p2m list") introduced a regression regarding to memory hotplug for a pv-domain: as the virtual space for the p2m list is allocated for the to be expected memory size of the domain only, hotplugged memory above that size will not be usable by the domain. Correct this by using a configurable size for the p2m list in case of memory hotplug enabled (default supported memory size is 512 GB for 64 bit domains and 4 GB for 32 bit domains). Signed-off-by: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> # 3.19+ Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-03-23 15:14:47 +00:00
Ingo Molnar	e4518ab90f	Linux 4.0-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVD1VGAAoJEHm+PkMAQRiG7yoH/juKOQ1zbxi5M+mleDEEJtA0 RxQSojqEMWIKrWi8PNZxjENn1OZB6XOLIXOhlyAZBrmgsjO34p1DyXlZMznr/R8W kQ2Xxs061hRtB3OuruMIqOApUrjuqsaCwgbgUS1qWmqZcoyZN4oELyZMP6OOlqv5 UUBZm8MfyXGyxrCcg39mjct3VEOhiuEcvL6SUxOC380CdSVAnyqHFPcz0JVqMUn9 9RUBs0T9cMdhb0mZ2bfXzt6AKArj63G2nXOum+VzFcvspSm2U+MPIDCuoE+ZbTPS jqIAgG0rj1ezRyb5oeJrvlU0Yy3u/cXoMPs9+kORvpladooYNLti8ovh6qllm0I= =d/ye -----END PGP SIGNATURE----- Merge tag 'v4.0-rc5' into x86/asm, to resolve conflicts Conflicts: arch/x86/kernel/entry_64.S Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-23 11:13:15 +01:00
Ingo Molnar	c38e503804	x86/asm/entry/64: Rename 'old_rsp' to 'rsp_scratch' Make clear that the usage of PER_CPU(old_rsp) is purely temporary, by renaming it to 'rsp_scratch'. Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Will Drewry <wad@chromium.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-17 16:01:42 +01:00
David Vrabel	4e8c0c8c4b	xen/privcmd: improve performance of MMAPBATCH_V2 Make the IOCTL_PRIVCMD_MMAPBATCH_V2 (and older V1 version) map multiple frames at a time rather than one at a time, despite the pages being non-consecutive GFNs. xen_remap_foreign_mfn_array() is added which maps an array of GFNs (instead of a consecutive range of GFNs). Since per-frame errors are returned in an array, privcmd must set the MMAPBATCH_V1 error bits as part of the "report errors" phase, after all the frames are mapped. Migrate times are significantly improved (when using a PV toolstack domain). For example, for an idle 12 GiB PV guest: Before After real 0m38.179s 0m26.868s user 0m15.096s 0m13.652s sys 0m28.988s 0m18.732s Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2015-03-16 14:49:15 +00:00
David Vrabel	628c28eefd	xen: unify foreign GFN map/unmap for auto-xlated physmap guests Auto-translated physmap guests (arm, arm64 and x86 PVHVM/PVH) map and unmap foreign GFNs using the same method (updating the physmap). Unify the two arm and x86 implementations into one commont one. Note that on arm and arm64, the correct error code will be returned (instead of always -EFAULT) and map/unmap failure warnings are no longer printed. These changes are required if the foreign domain is paging (-ENOENT failures are expected and must be propagated up to the caller). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2015-03-16 14:49:15 +00:00
Konrad Rzeszutek Wilk	b3b06c7eb7	x86/xen/apic: WARN with details. We should not be writting to the APIC registers under Xen PV. But if we do instead of just giving an blanket warning - include some details to help troubleshoot. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-03-16 14:49:14 +00:00
Konrad Rzeszutek Wilk	feb44f1f7a	x86/xen: Provide a "Xen PV" APIC driver to support >255 VCPUs Instead of mangling the default APIC driver, provide a Xen PV guest specific one that explicitly provides appropriate methods. This allows use to report that all APIC IDs are valid, allowing dom0 to boot with more than 255 VCPUs. Since the probe order of APIC drivers is link dependent, we add in an late probe function to change to the Xen PV if it hadn't been done during bootup. Suggested-by: David Vrabel <david.vrabel@citrix.com> Reported-by: Cathy Avery <cathy.avery@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-03-16 14:49:14 +00:00
Juergen Gross	526abeaed4	xen: use generated hypercall symbols in arch/x86/xen/xen-head.S Instead of manually list each hypercall in arch/x86/xen/xen-head.S use the auto generated symbol list. This also corrects the wrong address of xen_hypercall_mca which was located 32 bytes higher than it should. Symbol addresses have been verified to match the correct ones via objdump output. Based-on-patch-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-03-16 14:49:14 +00:00
Juergen Gross	fc903f8736	xen: use generated hypervisor symbols in arch/x86/xen/trace.c Instead of manually list all hypervisor calls in arch/x86/xen/trace.c use the auto generated list. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-03-16 14:49:13 +00:00
Juergen Gross	16b12d6057	xen: synchronize include/xen/interface/xen.h with xen The header include/xen/interface/xen.h doesn't contain all definitions from Xen's version of that header. Update it accordingly. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-03-16 14:49:13 +00:00
Paul E. McKenney	2a442c9c64	x86: Use common outgoing-CPU-notification code This commit removes the open-coded CPU-offline notification with new common code. Among other things, this change avoids calling scheduler code using RCU from an offline CPU that RCU is ignoring. It also allows Xen to notice at online time that the CPU did not go offline correctly. Note that Xen has the surviving CPU carry out some cleanup operations, so if the surviving CPU times out, these cleanup operations might have been carried out while the outgoing CPU was still running. It might therefore be unwise to bring this CPU back online, and this commit avoids doing so. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: <x86@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: <xen-devel@lists.xenproject.org>	2015-03-11 13:22:35 -07:00
Andy Lutomirski	8ef46a672a	x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu We currently store references to the top of the kernel stack in multiple places: kernel_stack (with an offset) and init_tss.x86_tss.sp0 (no offset). The latter is defined by hardware and is a clean canonical way to find the top of the stack. Add an accessor so we can start using it. This needs minor paravirt tweaks. On native, sp0 defines the top of the kernel stack and is therefore always correct. On Xen and lguest, the hypervisor tracks the top of the stack, but we want to start reading sp0 in the kernel. Fixing this is simple: just update our local copy of sp0 as well as the hypervisor's copy on task switches. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/8d675581859712bee09a055ed8f785d80dac1eca.1425611534.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-03-06 08:32:57 +01:00
Juergen Gross	b8f05c8803	x86/xen: correct bug in p2m list initialization Commit `054954eb05` ("xen: switch to linear virtual mapped sparse p2m list") introduced an error. During initialization of the p2m list a p2m identity area mapped by a complete identity pmd entry has to be split up into smaller chunks sometimes, if a non-identity pfn is introduced in this area. If this non-identity pfn is not at index 0 of a p2m page the new p2m page needed is initialized with wrong identity entries, as the identity pfns don't start with the value corresponding to index 0, but with the initial non-identity pfn. This results in weird wrong mappings. Correct the wrong initialization by starting with the correct pfn. Cc: stable@vger.kernel.org # 3.19 Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-02-27 14:53:19 +00:00
Boris Ostrovsky	5054daa285	x86/xen: Initialize cr4 shadow for 64-bit PV(H) guests Commit `1e02ce4ccc` ("x86: Store a per-cpu shadow copy of CR4") introduced CR4 shadows. These shadows are initialized in early boot code. The commit missed initialization for 64-bit PV(H) guests that this patch adds. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-02-23 16:30:26 +00:00
Boris Ostrovsky	31795b470b	x86/xen: Make sure X2APIC_ENABLE bit of MSR_IA32_APICBASE is not set Commit `d524165cb8` ("x86/apic: Check x2apic early") tests X2APIC_ENABLE bit of MSR_IA32_APICBASE when CONFIG_X86_X2APIC is off and panics the kernel when this bit is set. Xen's PV guests will pass this MSR read to the hypervisor which will return its version of the MSR, where this bit might be set. Make sure we clear it before returning MSR value to the caller. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-02-23 16:30:23 +00:00
Linus Torvalds	10436cf881	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two fixes: the paravirt spin_unlock() corruption/crash fix, and an rtmutex NULL dereference crash fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/spinlocks/paravirt: Fix memory corruption on unlock locking/rtmutex: Avoid a NULL pointer dereference on deadlock	2015-02-21 10:45:03 -08:00
Raghavendra K T	d6abfdb202	x86/spinlocks/paravirt: Fix memory corruption on unlock Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does: prev = lock; add_smp(&lock->tickets.head, TICKET_LOCK_INC); / add_smp() is a full mb() / if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) __ticket_unlock_slowpath(lock, prev); which is exactly* the kind of things you cannot do with spinlocks, because after you've done the "add_smp()" and released the spinlock for the fast-path, you can't access the spinlock any more. Exactly because a fast-path lock might come in, and release the whole data structure. Linus suggested that we should not do any writes to lock after unlock(), and we can move slowpath clearing to fastpath lock. So this patch implements the fix with: 1. Moving slowpath flag to head (Oleg): Unlocked locks don't care about the slowpath flag; therefore we can keep it set after the last unlock, and clear it again on the first (try)lock. -- this removes the write after unlock. note that keeping slowpath flag would result in unnecessary kicks. By moving the slowpath flag from the tail to the head ticket we also avoid the need to access both the head and tail tickets on unlock. 2. use xadd to avoid read/write after unlock that checks the need for unlock_kick (Linus): We further avoid the need for a read-after-release by using xadd; the prev head value will include the slowpath flag and indicate if we need to do PV kicking of suspended spinners -- on modern chips xadd isn't (much) more expensive than an add + load. Result: setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) benchmark overcommit %improve kernbench 1x -0.13 kernbench 2x 0.02 dbench 1x -1.77 dbench 2x -0.63 [Jeremy: Hinted missing TICKET_LOCK_INC for kick] [Oleg: Moved slowpath flag to head, ticket_equals idea] [PeterZ: Added detailed changelog] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Sasha Levin <sasha.levin@oracle.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jones <davej@redhat.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Fernando Luis Vázquez Cao <fernando_b1@lab.ntt.co.jp> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ulrich Obergfell <uobergfe@redhat.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: a.ryabinin@samsung.com Cc: dave@stgolabs.net Cc: hpa@zytor.com Cc: jasowang@redhat.com Cc: jeremy@goop.org Cc: paul.gortmaker@windriver.com Cc: riel@redhat.com Cc: tglx@linutronix.de Cc: waiman.long@hp.com Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20150215173043.GA7471@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-02-18 14:53:49 +01:00
Linus Torvalds	37507717de	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf updates from Ingo Molnar: "This series tightens up RDPMC permissions: currently even highly sandboxed x86 execution environments (such as seccomp) have permission to execute RDPMC, which may leak various perf events / PMU state such as timing information and other CPU execution details. This 'all is allowed' RDPMC mode is still preserved as the (non-default) /sys/devices/cpu/rdpmc=2 setting. The new default is that RDPMC access is only allowed if a perf event is mmap-ed (which is needed to correctly interpret RDPMC counter values in any case). As a side effect of these changes CR4 handling is cleaned up in the x86 code and a shadow copy of the CR4 value is added. The extra CR4 manipulation adds ~ <50ns to the context switch cost between rdpmc-capable and rdpmc-non-capable mms" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Add /sys/devices/cpu/rdpmc=2 to allow rdpmc for all tasks perf/x86: Only allow rdpmc if a perf_event is mapped perf: Pass the event to arch_perf_update_userpage() perf: Add pmu callbacks to track event mapping and unmapping x86: Add a comment clarifying LDT context switching x86: Store a per-cpu shadow copy of CR4 x86: Clean up cr4 manipulation	2015-02-16 14:58:12 -08:00
Linus Torvalds	c833e17e27	Tighten rules for ACCESS_ONCE This series tightens the rules for ACCESS_ONCE to only work on scalar types. It also contains the necessary fixups as indicated by build bots of linux-next. Now everything is in place to prevent new non-scalar users of ACCESS_ONCE and we can continue to convert code to READ_ONCE/WRITE_ONCE. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJU2H5MAAoJEBF7vIC1phx8Jm4QALPqKOMDSUBCrqJFWJeujtv2 ILxJKsnjrAlt3dxnlVI3q6e5wi896hSce75PcvZ/vs/K3GdgMxOjrakBJGTJ2Qjg 5njW9aGJDDr/SYFX33MLWfqy222TLtpxgSz379UgXjEzB0ymMWbJJ3FnGjVqQJdp RXDutpncRySc/rGHh9UPREIRR5GvimONsWE2zxgXjUzB8vIr2fCGvHTXfIb6RKbQ yaFoihzn0m+eisc5Gy4tQ1qhhnaYyWEGrINjHTjMFTQOWTlH80BZAyQeLdbyj2K5 qloBPS/VhBTr/5TxV5onM+nVhu0LiblVNrdMHVeb7jyST4LeFOCaWK98lB3axSB5 v/2D1YKNb3g1U1x3In/oNGQvs36zGiO1uEdMF1l8ZFXgCvHmATSFSTWBtqUhb5Ew JA3YyqMTG6dpRTMSnmu3/frr4wDqnxlB/ktQC1pf3tDp87mr1ZYEy/dQld+tltjh 9Z5GSdrw0nf91wNI3DJf+26ZDdz5B+EpDnPnOKG8anI1lc/mQneI21/K/xUteFXw UZ1XGPLV2vbv9/a13u44SdjenHvQs1egsGeebMxVPoj6WmDLVmcIqinyS6NawYzn IlDGy/b3bSnXWMBP0ZVBX94KWLxqDDc4a/ayxsmxsP1tPZ+jDXjVDa7E3zskcHxG Uj5ULCPyU087t8Sl76mv =Dj70 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux Pull ACCESS_ONCE() rule tightening from Christian Borntraeger: "Tighten rules for ACCESS_ONCE This series tightens the rules for ACCESS_ONCE to only work on scalar types. It also contains the necessary fixups as indicated by build bots of linux-next. Now everything is in place to prevent new non-scalar users of ACCESS_ONCE and we can continue to convert code to READ_ONCE/WRITE_ONCE" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux: kernel: Fix sparse warning for ACCESS_ONCE next: sh: Fix compile error kernel: tighten rules for ACCESS ONCE mm/gup: Replace ACCESS_ONCE with READ_ONCE x86/spinlock: Leftover conversion ACCESS_ONCE->READ_ONCE x86/xen/p2m: Replace ACCESS_ONCE with READ_ONCE ppc/hugetlbfs: Replace ACCESS_ONCE with READ_ONCE ppc/kvm: Replace ACCESS_ONCE with READ_ONCE	2015-02-14 10:54:28 -08:00
Andy Lutomirski	375074cc73	x86: Clean up cr4 manipulation CR4 manipulation was split, seemingly at random, between direct (write_cr4) and using a helper (set/clear_in_cr4). Unfortunately, the set_in_cr4 and clear_in_cr4 helpers also poke at the boot code, which only a small subset of users actually wanted. This patch replaces all cr4 access in functions that don't leave cr4 exactly the way they found it with new helpers cr4_set_bits, cr4_clear_bits, and cr4_set_bits_and_update_boot. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Vince Weaver <vince@deater.net> Cc: "hillf.zj" <hillf.zj@alibaba-inc.com> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/495a10bdc9e67016b8fd3945700d46cfd5c12c2f.1414190806.git.luto@amacapital.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-02-04 12:10:41 +01:00
Jennifer Herbert	8da7633f16	xen: mark grant mapped pages as foreign Use the "foreign" page flag to mark pages that have a grant map. Use page->private to store information of the grant (the granting domain and the grant reference). Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 14:03:12 +00:00
Jennifer Herbert	0ae65f49af	x86/xen: require ballooned pages for grant maps Ballooned pages are always used for grant maps which means the original frame does not need to be saved in page->index nor restored after the grant unmap. This allows the workaround in netback for the conflicting use of the (unionized) page->index and page->pfmemalloc to be removed. Signed-off-by: Jennifer Herbert <jennifer.herbert@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 14:03:11 +00:00
David Vrabel	0bb599fd30	xen: remove scratch frames for ballooned pages and m2p override The scratch frame mappings for ballooned pages and the m2p override are broken. Remove them in preparation for replacing them with simpler mechanisms that works. The scratch pages did not ensure that the page was not in use. In particular, the foreign page could still be in use by hardware. If the guest reused the frame the hardware could read or write that frame. The m2p override did not handle the same frame being granted by two different grant references. Trying an M2P override lookup in this case is impossible. With the m2p override removed, the grant map/unmap for the kernel mappings (for x86 PV) can be easily batched in set_foreign_p2m_mapping() and clear_foreign_p2m_mapping(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2015-01-28 14:03:10 +00:00
David Vrabel	853d028934	xen/grant-table: pre-populate kernel unmap ops for xen_gnttab_unmap_refs() When unmapping grants, instead of converting the kernel map ops to unmap ops on the fly, pre-populate the set of unmap ops. This allows the grant unmap for the kernel mappings to be trivially batched in the future. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2015-01-28 14:03:10 +00:00
Juergen Gross	270b79338e	x86/xen: cleanup arch/x86/xen/mmu.c Remove a nested ifdef. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 10:01:11 +00:00
Juergen Gross	bf9d834a9b	x86/xen: add some __init annotations in arch/x86/xen/mmu.c The file arch/x86/xen/mmu.c has some functions that can be annotated with "__init". Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 10:00:51 +00:00
Juergen Gross	a3f5239650	x86/xen: add some __init and static annotations in arch/x86/xen/setup.c Some more functions in arch/x86/xen/setup.c can be made "__init". xen_ignore_unusable() can be made "static". Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 10:00:36 +00:00
Juergen Gross	3ba5c867ca	x86/xen: use correct types for addresses in arch/x86/xen/setup.c In many places in arch/x86/xen/setup.c wrong types are used for physical addresses (u64 or unsigned long long). Use phys_addr_t instead. Use macros already defined instead of open coding them. Correct some other type mismatches. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 10:00:10 +00:00
Juergen Gross	f0feed10aa	x86/xen: cleanup arch/x86/xen/setup.c Remove extern declarations in arch/x86/xen/setup.c which are either not used or redundant. Move needed other extern declarations to xen-ops.h Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-28 09:59:46 +00:00
Davidlohr Bueso	57b6b99bac	x86,xen: use current->state helpers Call __set_current_state() instead of assigning the new state directly. These interfaces also aid CONFIG_DEBUG_ATOMIC_SLEEP environments, keeping track of who changed the state. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-26 10:21:26 +00:00
Palik, Imre	94dd85f6a0	x86/xen: prefer TSC over xen clocksource for dom0 In Dom0's the use of the TSC clocksource (whenever it is stable enough to be used) instead of the Xen clocksource should not cause any issues, as Dom0 VMs never live-migrated. The TSC clocksource is somewhat more efficient than the Xen paravirtualised clocksource, thus it should have higher rating. This patch decreases the rating of the Xen clocksource in Dom0s to 275. Which is half-way between the rating of the TSC clocksource (300) and the hpet clocksource (250). Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: Imre Palik <imrep@amazon.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-20 18:44:24 +00:00
Christian Borntraeger	1760f1eb7e	x86/xen/p2m: Replace ACCESS_ONCE with READ_ONCE ACCESS_ONCE does not work reliably on non-scalar types. For example gcc 4.6 and 4.7 might remove the volatile tag for such accesses during the SRA (scalar replacement of aggregates) step (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145) Change the p2m code to replace ACCESS_ONCE with READ_ONCE. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com>	2015-01-19 14:14:20 +01:00
Linus Torvalds	613d4cefbb	xen: bug fixes for 3.19-rc4 - Several critical linear p2m fixes that prevented some hosts from booting. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJUtQHNAAoJEFxbo/MsZsTR/qgH/iiW4k2T8dBGZ7TPyzt88iyT 4caWjuujp2OUaRqhBQdY7z05uai6XxgJLwDyqiO+qHaRUj+ZWCrjh/ZFPU1+09hK GdwPMWU7xMRs/7F2ANO03jJ/ktvsYXtazcVrV89Q3t+ZZJIQ/THovDkaoa+dF2lh W8d5H7N2UNCJLe9w2fm5iOq4SKoTsJOq6pVQ6gUBqJcgkSDWavd6bowXnTlcepZN tNaSMZsOt4CAvYQIa0nKPJo6Q4QN3buRQMWEOAOmGVT/RkVi68wirwk59uNzcS7E HjhqxFjhXYamNTuwHYZlchBrZutdbymSlucVucb1wAoxRAX+Wd1jk5EPl6zLv4w= =kFSE -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: "Several critical linear p2m fixes that prevented some hosts from booting" * tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: properly retrieve NMI reason xen: check for zero sized area when invalidating memory xen: use correct type for physical addresses xen: correct race in alloc_p2m_pmd() xen: correct error for building p2m list on 32 bits x86/xen: avoid freeing static 'name' when kasprintf() fails x86/xen: add extra memory for remapped frames during setup x86/xen: don't count how many PFNs are identity mapped x86/xen: Free bootmem in free_p2m_page() during early boot x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer()	2015-01-14 08:07:42 +13:00
Jan Beulich	f221b04fe0	x86/xen: properly retrieve NMI reason Using the native code here can't work properly, as the hypervisor would normally have cleared the two reason bits by the time Dom0 gets to see the NMI (if passed to it at all). There's a shared info field for this, and there's an existing hook to use - just fit the two together. This is particularly relevant so that NMIs intended to be handled by APEI / GHES actually make it to the respective handler. Note that the hook can (and should) be used irrespective of whether being in Dom0, as accessing port 0x61 in a DomU would be even worse, while the shared info field would just hold zero all the time. Note further that hardware NMI handling for PVH doesn't currently work anyway due to missing code in the hypervisor (but it is expected to work the native rather than the PV way). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-13 09:39:50 +00:00
Juergen Gross	9a17ad7f3d	xen: check for zero sized area when invalidating memory With the introduction of the linear mapped p2m list setting memory areas to "invalid" had to be delayed. When doing the invalidation make sure no zero sized areas are processed. Signed-off-by: Juegren Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:55 +00:00
Juergen Gross	e86f949667	xen: use correct type for physical addresses When converting a pfn to a physical address be sure to use 64 bit wide types or convert the physical address to a pfn if possible. Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:48 +00:00
Juergen Gross	f241b0b891	xen: correct race in alloc_p2m_pmd() When allocating a new pmd for the linear mapped p2m list a check is done for not introducing another pmd when this just happened on another cpu. In this case the old pte pointer was returned which points to the p2m_missing or p2m_identity page. The correct value would be the pointer to the found new page. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:40 +00:00
Juergen Gross	82c92ed135	xen: correct error for building p2m list on 32 bits In xen_rebuild_p2m_list() for large areas of invalid or identity mapped memory the pmd entries on 32 bit systems are initialized wrong. Correct this error. Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-12 10:09:34 +00:00
Vitaly Kuznetsov	7be0772d19	x86/xen: avoid freeing static 'name' when kasprintf() fails In case kasprintf() fails in xen_setup_timer() we assign name to the static string "<timer kasprintf failed>". We, however, don't check that fact before issuing kfree() in xen_teardown_timer(), kernel is supposed to crash with 'kernel BUG at mm/slub.c:3341!' Solve the issue by making name a fixed length string inside struct xen_clock_event_device. 16 bytes should be enough. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-08 13:55:25 +00:00
David Vrabel	a97dae1a2e	x86/xen: add extra memory for remapped frames during setup If the non-RAM regions in the e820 memory map are larger than the size of the initial balloon, a BUG was triggered as the frames are remaped beyond the limit of the linear p2m. The frames are remapped into the initial balloon area (xen_extra_mem) but not enough of this is available. Ensure enough extra memory regions are added for these remapped frames. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>	2015-01-08 13:52:37 +00:00
David Vrabel	bc7142cf79	x86/xen: don't count how many PFNs are identity mapped This accounting is just used to print a diagnostic message that isn't very useful. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>	2015-01-08 13:52:36 +00:00
Boris Ostrovsky	701a261ad6	x86/xen: Free bootmem in free_p2m_page() during early boot With recent changes in p2m we now have legitimate cases when p2m memory needs to be freed during early boot (i.e. before slab is initialized). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-01-08 13:52:36 +00:00
Boris Ostrovsky	8b8cd8a367	x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer() There is no reason for having it and, with commit `250a1ac685` ("x86, smpboot: Remove pointless preempt_disable() in native_smp_prepare_cpus()"), it prevents HVM guests from booting. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2014-12-23 10:31:55 +00:00

1 2 3 4 5 ...

1235 Commits