Add support for initial write protection of VM memslots. This patch
series assumes that huge PUDs will not be used in 2nd stage tables, which is
always valid on ARMv7
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
This patch adds ARMv7 architecture TLB Flush function.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
We now have a generic function that does most of the work of
kvm_vm_ioctl_get_dirty_log, now use it.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
Currently the trace printk talks about "wfi" only, though the trace
point triggers both on wfi and wfe traps.
Add a parameter to differentiate between the two.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Wei Huang <wei@redhat.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
arm64 uses its own copy of exit handler (arm64/kvm/handle_exit.c).
Currently this file doesn't hook up with any trace points. As a result
users might not see certain events (e.g. HVC & WFI) while using ftrace
with arm64 KVM. This patch fixes this issue by adding a new trace file
and defining two trace events (one of which is shared by wfi and wfe)
for arm64. The new trace points are then linked with related functions
in handle_exit.c.
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Since the advent of VGIC dynamic initialization, this latter is
initialized quite late on the first vcpu run or "on-demand", when
injecting an IRQ or when the guest sets its registers.
This initialization could be initiated explicitly much earlier
by the users-space, as soon as it has provided the requested
dimensioning parameters.
This patch adds a new entry to the VGIC KVM device that allows
the user to manually request the VGIC init:
- a new KVM_DEV_ARM_VGIC_GRP_CTRL group is introduced.
- Its first attribute is KVM_DEV_ARM_VGIC_CTRL_INIT
The rationale behind introducing a group is to be able to add other
controls later on, if needed.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Adds a function kvm_vcpu_set_pending_timer instead of calling
kvm_make_request in lapic.c.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When access to descriptor in LDT/GDT wraparound outside long-mode, the address
of the descriptor should be truncated to 32-bit. Citing Intel SDM 2.1.1.1
"Global and Local Descriptor Tables in IA-32e Mode": "GDTR and LDTR registers
are expanded to 64-bits wide in both IA-32e sub-modes (64-bit mode and
compatibility mode)."
So in other cases, we need to truncate. Creating new function to return a
pointer to descriptor table to avoid too much code duplication.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
[Wrap 64-bit check with #ifdef CONFIG_X86_64, to avoid a "right shift count
>= width of type" warning and consequent undefined behavior. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When segment is loaded, the segment access bit is set unconditionally. In
fact, it should be set conditionally, based on whether the segment had the
accessed bit set before. In addition, it can improve performance.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to Intel SDM: "If the ESP register is used as a base register for
addressing a destination operand in memory, the POP instruction computes the
effective address of the operand after it increments the ESP register."
The current emulation does not behave so. The fix required to waste another
of the precious instruction flags and to check the flag in decode_modrm.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, if em_call_far fails it returns success instead of the resulting
error-code. Fix it.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The KVM emulator does not emulate JMP and CALL that target a call gate or a
task gate. This patch does not try to implement these scenario as they are
presumably rare; yet it returns X86EMUL_UNHANDLEABLE error in such cases
instead of generating an exception.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the operand size of fnstcw and fnstsw is updated during the execution,
the emulation may cause spurious exceptions as it reads the memory beforehand.
Marking these instructions as Mov (since the previous value is ignored) and
DstMem16 to simplify the setting of operand size.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Although pop sreg updates RSP according to the operand size, only 2 bytes are
read. The current behavior may result in incorrect #GP or #PF exceptions.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because ASSERT is just a printk, these would oops right away.
The assertion thus hardly adds anything.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The initialization function in mmu.c can always use walk_mmu, which
is known to be vcpu->arch.mmu. Only init_kvm_nested_mmu is used to
initialize vcpu->arch.nested_mmu.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add tracepoint to wait_lapic_expire.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
[Remind reader if early or late. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For the hrtimer which emulates the tscdeadline timer in the guest,
add an option to advance expiration, and busy spin on VM-entry waiting
for the actual expiration time to elapse.
This allows achieving low latencies in cyclictest (or any scenario
which requires strict timing regarding timer expiration).
Reduces average cyclictest latency from 12us to 8us
on Core i5 desktop.
Note: this option requires tuning to find the appropriate value
for a particular hardware/guest combination. One method is to measure the
average delay between apic_timer_fn and VM-entry.
Another method is to start with 1000ns, and increase the value
in say 500ns increments until avg cyclictest numbers stop decreasing.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_x86_ops->test_posted_interrupt() returns true/false depending
whether 'vector' is set.
Next patch makes use of this interface.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In most cases calling hwapic_isr_update(), we always check if
kvm_apic_vid_enabled() == 1, but actually,
kvm_apic_vid_enabled()
-> kvm_x86_ops->vm_has_apicv()
-> vmx_vm_has_apicv() or '0' in svm case
-> return enable_apicv && irqchip_in_kernel(kvm)
So its a little cost to recall vmx_vm_has_apicv() inside
hwapic_isr_update(), here just NULL out hwapic_isr_update() in
case of !enable_apicv inside hardware_setup() then make all
related stuffs follow this. Note we don't check this under that
condition of irqchip_in_kernel() since we should make sure
definitely any caller don't work without in-kernel irqchip.
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove FIXME comments about needing fault addresses to be returned. These
are propaagated from walk_addr_generic to gva_to_gpa and from there to
ops->read_std and ops->write_std.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When generating #PF VM-exit, check equality:
(PFEC & PFEC_MASK) == PFEC_MATCH
If there is equality, the 14 bit of exception bitmap is used to take decision
about generating #PF VM-exit. If there is inequality, inverted 14 bit is used.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch improve checks required by Intel Software Developer Manual.
- SMM MSRs are not allowed.
- microcode MSRs are not allowed.
- check x2apic MSRs only when LAPIC is in x2apic mode.
- MSR switch areas must be aligned to 16 bytes.
- address of first and last byte in MSR switch areas should not set any bits
beyond the processor's physical-address width.
Also it adds warning messages on failures during MSR switch. These messages
are useful for people who debug their VMMs in nVMX.
Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Several hypervisors need MSR auto load/restore feature.
We read MSRs from VM-entry MSR load area which specified by L1,
and load them via kvm_set_msr in the nested entry.
When nested exit occurs, we get MSRs via kvm_get_msr, writing
them to L1`s MSR store area. After this, we read MSRs from VM-exit
MSR load area, and load them via kvm_set_msr.
Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Wire up sys_execveat(). Tested on 32 & 64 bit.
Fix for kdump on LE systems with cpus hot unplugged.
Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this broke other
platforms, we'll do a proper fix for 3.20.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUqillAAoJEFHr6jzI4aWA+aYP/0zMNyRbQehtpF9OLKw2TGJW
LeSdQMN+g6JWKzG2hsNIiKga5kquuCi0zgtu3dW/rxlzbXY0gxwAY7HFkjzgE7Em
qlt/iairlg8ifE436DRCkH2SXTCDN2AnlRLO35QZ2zKoo8osJuA6YjDiQRN3Y//p
ELvWraYoVdhbfGPg+gYJn94U/OAmFu0E0VrwEmpWie1YrfTNbbfTkS/BieqhqJTu
UEX53tW6FXbmit3ReQGGsPzcoVWbXfIg09gRy14Bn8dtA1EipHKr1m5o1JCjGu9J
moJVB8cMkj5o4OcQrPtJZI76FkzaVueNUVTUdFtIt1q6NCyvX8tZe16OKPYY0TEN
j8snpJfsk+oZij5caYSKqTeMAPrh6JbIjbkl8cIwc7ClRkRf/l5sW0zaFPS+GDWl
c9OFu1CdKO59g+apQ16BmMRw8b/pV22WkxSpqm9GD3I7Y7ABoUN+1l07d5H+aGRq
khD7M2rUWh4CwSRvg+ealpyAP++4Sj6mn/z4pe/gem4Wn/ynxGi6CSsYYijB6PWP
bm47t+QqyomRiH65Yo41vKGNYu/JfenvAWyJp8AnIajWqxwCRLVNSgySjtylUESs
f02qdvLsBTd3/CorZxigWgThM1wWeKyCDO6NBbjVUC/L9O8Ck9U0WbJUVQFDgG0b
+52Wv5FRb/xBF9XdpWsw
=1yze
-----END PGP SIGNATURE-----
Merge tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
- Wire up sys_execveat(). Tested on 32 & 64 bit.
- Fix for kdump on LE systems with cpus hot unplugged.
- Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this
broke other platforms, we'll do a proper fix for 3.20.
* tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online"
powerpc/kdump: Ignore failure in enabling big endian exception during crash
powerpc: Wire up sys_execveat() syscall
Pull UML fixes from Richard Weinberger:
"Two fixes for UML regressions. Nothing exciting"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
x86, um: actually mark system call tables readonly
um: Skip futex_atomic_cmpxchg_inatomic() test
Commit a074335a37 ("x86, um: Mark system call tables readonly") was
supposed to mark the sys_call_table in UML as RO by adding the const,
but it doesn't have the desired effect as it's nevertheless being placed
into the data section since __cacheline_aligned enforces sys_call_table
being placed into .data..cacheline_aligned instead. We need to use
the ____cacheline_aligned version instead to fix this issue.
Before:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev
0000000000000000 D sys_call_table
0000000000000000 D syscall_table_size
After:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev
0000000000000000 R sys_call_table
0000000000000000 D syscall_table_size
Fixes: a074335a37 ("x86, um: Mark system call tables readonly")
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
futex_atomic_cmpxchg_inatomic() does not work on UML because
it triggers a copy_from_user() in kernel context.
On UML copy_from_user() can only be used if the kernel was called
by a real user space process such that UML can use ptrace()
to fetch the value.
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Tested-by: Daniel Walter <d.walter@0x90.at>
Follow aa0d532605 ("ia64: Use preempt_schedule_irq") and use
preempt_schedule_irq instead of enabling/disabling interrupts and
messing around with PREEMPT_ACTIVE in the nios2 low-level preemption
code ourselves. Also get rid of the now needless re-check for
TIF_NEED_RESCHED, preempt_schedule_irq will already take care of
rescheduling.
This also fixes the following build error when building with
CONFIG_PREEMPT:
arch/nios2/kernel/built-in.o: In function `need_resched':
arch/nios2/kernel/entry.S:374: undefined reference to `PREEMPT_ACTIVE'
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Ley Foon Tan <lftan@altera.com>
This patch initializes the mmu field of the cpuinfo structure to the
value supplied by the devicetree.
Signed-off-by: Walter Goossens <waltergoossens@home.nl>
Acked-by: Ley Foon Tan <lftan@altera.com>
A very small set of fixes for 3.19, as everyone was out. The clocksource
patch was something I missed for the merge window after the change that
broke arm64 was merged through arm-soc. The other two patches are
a fix for an undetected merge problem in mvebu and a defconfig change
to make some exynos boards work with the normal multi_v7_defconfig.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAVKLKEGCrR//JCVInAQLWSg//TLZdOcMiNUcKa8SjeRggILdslSOJKuVG
wLyzsAOyQ+XZFvp5KdOhBwLZwaitFxXLya0NUMGuD8c9lFylTHcg4+Lh8yOPua5b
u319jV7YoYPRVfrwdEoPKA6R+sLLJZEj8lE5+vuEGXEig2wDb7xAzfSS3uNwBVtM
kVVur8lYgphyz/5Ruzxexn9ORlmgULG2SFfJnOlJHsAxgaGOo51cMwR616V+UBHl
AeX5ZcHWmGc89o04FYL3fDF0j+ReCwhSflPIOCf2KwJNHRgE/DkOCd1n4X45Hozl
2FWgj3bD/KjZEOzapt08X4oiFZXaxl+ruGe+n7Dm7hmOj6TzLxHrWIg+E5hgbTNV
IffuGU1SJrBH+aZl7pE/aSJ3VcFA3EiSmKiq+0tPjWrX+vWlv+e6Jr95r5VQ3Qh0
xksTWCf9KRa6Peu/zxYBDeUkJcf96XC/GjWF+uQrew89G5B2JFz/emd5Gmtxrr0O
Q0eOszqafSBQcKf4Z7tU+DmRKVCcVhnTdTgYZtRDizFMkpjI6gtjrCQq7g0Q8iqg
o9gl5xyiJFVOgL1HsUUtROsBrGRXuqdAUasg2etDonXV/XMbfq6E21bxCxG+5YcT
tes1E9w0onvXFw98M7i8q3cVcnWnvuvHZKqsvmbUGOCx0dz3ah7LbzeSZJcCNWWa
F8e06SqZfQY=
=JNdC
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"A very small set of fixes for 3.19, as everyone was out.
The clocksource patch was something I missed for the merge window
after the change that broke arm64 was merged through arm-soc. The
other two patches are a fix for an undetected merge problem in mvebu
and a defconfig change to make some exynos boards work with the normal
multi_v7_defconfig"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
Add USB_EHCI_EXYNOS to multi_v7_defconfig
ARM: mvebu: Fix pinctrl configuration for Armada 370 DB
clocksource: arch_timer: Only use the virtual counter (CNTVCT) on arm64
Currently we enable Exynos devices in the multi v7 defconfig, however, when
testing on my ODROID-U3, I noticed that USB was not working. Enabling this
option causes USB to work, which enables networking support as well since the
ODROID-U3 has networking on the USB bus.
[arnd] Support for odroid-u3 was added in 3.10, so it would be nice to
backport this fix at least that far.
Signed-off-by: Steev Klimaszewski <steev@gentoo.org>
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This reverts commit 7c5c92ed56.
Although this did fix the bug it was aimed at, it also broke secondary
startup on platforms that use give/take_timebase(). Unfortunately we
didn't detect that while it was in next.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In LE kernel, we currently have a hack for kexec that resets the exception
endian before starting a new kernel as the kernel that is loaded could be a
big endian or a little endian kernel. In kdump case, resetting exception
endian fails when one or more cpus is disabled. But we can ignore the failure
and still go ahead, as in most cases crashkernel will be of same endianess
as primary kernel and reseting endianess is not even needed in those cases.
This patch adds a new inline function to say if this is kdump path. This
function is used at places where such a check is needed.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
[mpe: Rename to kdump_in_progress(), use bool, and edit comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Wire up sys_execveat(). This passes the selftests for the system call.
Check success of execveat(3, '../execveat', 0)... [OK]
Check success of execveat(5, 'execveat', 0)... [OK]
Check success of execveat(6, 'execveat', 0)... [OK]
Check success of execveat(-100, '/home/pranith/linux/...ftests/exec/execveat', 0)... [OK]
Check success of execveat(99, '/home/pranith/linux/...ftests/exec/execveat', 0)... [OK]
Check success of execveat(8, '', 4096)... [OK]
Check success of execveat(17, '', 4096)... [OK]
Check success of execveat(9, '', 4096)... [OK]
Check success of execveat(14, '', 4096)... [OK]
Check success of execveat(14, '', 4096)... [OK]
Check success of execveat(15, '', 4096)... [OK]
Check failure of execveat(8, '', 0) with ENOENT... [OK]
Check failure of execveat(8, '(null)', 4096) with EFAULT... [OK]
Check success of execveat(5, 'execveat.symlink', 0)... [OK]
Check success of execveat(6, 'execveat.symlink', 0)... [OK]
Check success of execveat(-100, '/home/pranith/linux/...xec/execveat.symlink', 0)... [OK]
Check success of execveat(10, '', 4096)... [OK]
Check success of execveat(10, '', 4352)... [OK]
Check failure of execveat(5, 'execveat.symlink', 256) with ELOOP... [OK]
Check failure of execveat(6, 'execveat.symlink', 256) with ELOOP... [OK]
Check failure of execveat(-100, '/home/pranith/linux/tools/testing/selftests/exec/execveat.symlink', 256) with ELOOP... [OK]
Check success of execveat(3, '../script', 0)... [OK]
Check success of execveat(5, 'script', 0)... [OK]
Check success of execveat(6, 'script', 0)... [OK]
Check success of execveat(-100, '/home/pranith/linux/...elftests/exec/script', 0)... [OK]
Check success of execveat(13, '', 4096)... [OK]
Check success of execveat(13, '', 4352)... [OK]
Check failure of execveat(18, '', 4096) with ENOENT... [OK]
Check failure of execveat(7, 'script', 0) with ENOENT... [OK]
Check success of execveat(16, '', 4096)... [OK]
Check success of execveat(16, '', 4096)... [OK]
Check success of execveat(4, '../script', 0)... [OK]
Check success of execveat(4, 'script', 0)... [OK]
Check success of execveat(4, '../script', 0)... [OK]
Check failure of execveat(4, 'script', 0) with ENOENT... [OK]
Check failure of execveat(5, 'execveat', 65535) with EINVAL... [OK]
Check failure of execveat(5, 'no-such-file', 0) with ENOENT... [OK]
Check failure of execveat(6, 'no-such-file', 0) with ENOENT... [OK]
Check failure of execveat(-100, 'no-such-file', 0) with ENOENT... [OK]
Check failure of execveat(5, '', 4096) with EACCES... [OK]
Check failure of execveat(5, 'Makefile', 0) with EACCES... [OK]
Check failure of execveat(11, '', 4096) with EACCES... [OK]
Check failure of execveat(12, '', 4096) with EACCES... [OK]
Check failure of execveat(99, '', 4096) with EBADF... [OK]
Check failure of execveat(99, 'execveat', 0) with EBADF... [OK]
Check failure of execveat(8, 'execveat', 0) with ENOTDIR... [OK]
Invoke copy of 'execveat' via filename of length 4093:
Check success of execveat(19, '', 4096)... [OK]
Check success of execveat(5, 'xxxxxxxxxxxxxxxxxxxx...yyyyyyyyyyyyyyyyyyyy', 0)... [OK]
Invoke copy of 'script' via filename of length 4093:
Check success of execveat(20, '', 4096)... [OK]
/bin/sh: 0: Can't open /dev/fd/5/xxxxxxx(... a long line of x's and y's, 0)... [OK]
Check success of execveat(5, 'xxxxxxxxxxxxxxxxxxxx...yyyyyyyyyyyyyyyyyyyy', 0)... [OK]
Tested on a 32-bit powerpc system.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
On top of this, add a couple of WARN_ONs and stop spamming dmesg on
pretty much every boot of a virtual machine.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJUn8hXAAoJEL/70l94x66Dme4H/R/HA+Aswgzse8nx3pNiqStv
e0BBeUHVJtxlOfnOlJGCWc1ef7uzKdvVWuqCmJwMDJDoLd/I8kF84E3AQS+zTJ/u
Dlb+yjwjoFPbQwr8xfclcvYXZxJgleKQJcyBWKBxgMTnFdjgRfX7U0MzXZJ/gFzH
mdHhLlNBU/On0l3A+dsKVgjtiuHZIQD0FraYs4qa2QajRGgDoHypzTmwh20XBmdx
3l/zFnSFSbaCTckbKb0xYv22pZTMd/5qrxer05sl98nzrrrXIDhVSo0hbrNVqorv
pDr+908XGvTOgVR1cvgkFn74INudiYjNyICGsue/ksmUPh9jz6hWic7sNeqYfcI=
=ehkB
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"The important fixes are for two bugs introduced by the merge window.
On top of this, add a couple of WARN_ONs and stop spamming dmesg on
pretty much every boot of a virtual machine"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: warn on more invariant breakage
kvm: fix sorting of memslots with base_gfn == 0
kvm: x86: drop severity of "generation wraparound" message
kvm: x86: vmx: reorder some msr writing
Since most virtual machines raise this message once, it is a bit annoying.
Make it KERN_DEBUG severity.
Cc: stable@vger.kernel.org
Fixes: 7a2e8aaf0f
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The commit 34a1cd60d1, "x86: vmx: move some vmx setting from
vmx_init() to hardware_setup()", tried to refactor some codes
specific to vmx hardware setting into hardware_setup(), but some
msr writing should depend on our previous setting condition like
enable_apicv, enable_ept and so on.
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull parisc build fix from Helge Deller:
"This unbreaks the kernel compilation on parisc with gcc-4.9"
* 'parisc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix out-of-register compiler error in ldcw inline assembler function
The __ldcw macro has a problem when its argument needs to be reloaded from
memory. The output memory operand and the input register operand both need to
be reloaded using a register in class R1_REGS when generating 64-bit code.
This fails because there's only a single register in the class. Instead, use a
memory clobber. This also makes the __ldcw macro a compiler memory barrier.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> [3.13+]
Signed-off-by: Helge Deller <deller@gmx.de>
This patch adds pgd_page definition in order to keep supporting
HAVE_GENERIC_RCU_GUP configuration. In addition, it changes pud_page
expression to align with pmd_page for readability.
An introduction of pgd_page resolves the following build breakage
under 4KB + 4Level memory management combo.
mm/gup.c: In function 'gup_huge_pgd':
mm/gup.c:889:2: error: implicit declaration of function 'pgd_page' [-Werror=implicit-function-declaration]
head = pgd_page(orig);
^
mm/gup.c:889:7: warning: assignment makes pointer from integer without a cast
head = pgd_page(orig);
Cc: Will Deacon <will.deacon@arm.com>
Cc: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
[catalin.marinas@arm.com: remove duplicate pmd_page definition]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The usual defconfig tweaks, this time:
- FHANDLE and AUTOFS4_FS to keep systemd happy
- PID_NS, QUOTA and KEYS to keep LTP happy
- Disable DEBUG_PREEMPT, as this *really* hurts performance
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
On arm64 the TTBR0_EL1 register is set to either the reserved TTBR0
page tables on boot or to the active_mm mappings belonging to user space
processes, it must never be set to swapper_pg_dir page tables mappings.
When a CPU is booted its active_mm is set to init_mm even though its
TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies
that when __cpu_suspend is triggered the active_mm can point at
init_mm even if the current TTBR0_EL1 register contains the reserved
TTBR0_EL1 mappings.
Therefore, the mm save and restore executed in __cpu_suspend might
turn out to be erroneous in that, if the current->active_mm corresponds
to init_mm, on resume from low power it ends up restoring in the
TTBR0_EL1 the init_mm mappings that are global and can cause speculation
of TLB entries which end up being propagated to user space.
This patch fixes the issue by checking the active_mm pointer before
restoring the TTBR0 mappings. If the current active_mm == &init_mm,
the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of
switching back to the active_mm, which is the expected behaviour
corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered.
Fixes: 95322526ef ("arm64: kernel: cpu_{suspend/resume} implementation")
Cc: <stable@vger.kernel.org> # 3.14+: 18ab7db
Cc: <stable@vger.kernel.org> # 3.14+: 714f599
Cc: <stable@vger.kernel.org> # 3.14+: c3684fb
Cc: <stable@vger.kernel.org> # 3.14+
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit a3a60f81ee (dma-mapping: replace set_arch_dma_coherent_ops with
arch_setup_dma_ops) changes the of_dma_configure() arch dma_ops callback
to arch_setup_dma_ops but only the arch/arm code is updated. Subsequent
commit 97890ba928 (dma-mapping: detect and configure IOMMU in
of_dma_configure) changes the arch_setup_dma_ops() prototype further to
handle iommu. The patch makes the corresponding arm64 changes.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Will Deacon <will.deacon@arm.com>
The commit b4607572ef (ARM: mvebu: remove conflicting muxing on
Armada 370 DB) removes the hog pins muxing. As it is explained in the
commit log it solves a warning a boot time, but more important it also
allows using the Giga port 0 of the board.
Unfortunately in the same time the commit 4904a82a93 (arm: mvebu:
move Armada 370/XP pinctrl node definition armada-370-xp.dtsi) was
merged and it introduced again the hog pins muxing. Because of it, the
Giga port 0 of the board is no more usable.
This commit remove again the conflicting muxing (hopefully for the
last time).
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
[andrew@lunn.ch: Correct commit IDs]
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 4904a82a93 ("arm: mvebu: move Armada 370/XP pinctrl node definition armada-370-xp.dtsi")