smp_mb on vcpu destroy isn't paired with anything, violating pairing
rules, and seems to be useless.
Drop it.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <1452010811-25486-1-git-send-email-mst@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This is a third approach to workaround long standing issue with LPSS on
BayTrail. First one [1] was reverted since it didn't resolve the issue
comprehensively. Second one [2] was rejected by internal review.
The LPSS DMA controller does not have neither _PS0 nor _PS3 method. Moreover it
can be powered off automatically whenever the last LPSS device goes down. In
case of no power any access to the DMA controller will hang the system. The
behaviour is reproduced on some HP laptops based on Intel BayTrail [3,4] as
well as on ASuS T100TA transformer.
Power on the LPSS island through the registers accessible in a specific way.
[1] http://www.spinics.net/lists/linux-acpi/msg53963.html
[2] https://bugzilla.redhat.com/attachment.cgi?id=1066779&action=diff
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1184273
[4] http://www.spinics.net/lists/dmaengine/msg01514.html
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
While setting the KVM PIT counters in 'kvm_pit_load_count', if
'hpet_legacy_start' is set, the function disables the timer on
channel[0], instead of the respective index 'channel'. This is
because channels 1-3 are not linked to the HPET. Fix the caller
to only activate the special HPET processing for channel 0.
Reported-by: P J P <pjp@fedoraproject.org>
Fixes: 0185604c2d
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The comment had the meaning of mmu.gva_to_gpa and nested_mmu.gva_to_gpa
swapped. Fix that, and also add some details describing how each translation
works.
Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Complete rewrite of the arm64 world switch in C, hopefully
paving the way for more sharing with the 32bit code, better
maintainability and easier integration of new features.
Also smaller and slightly faster in some cases...
- Support for 16bit VM identifiers
- Various cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWe80IAAoJECPQ0LrRPXpDdt8P/ittxzklIT7rsZxdOiIY6vLQ
i2hWGo1KdZR+8rsNyQEeGyg2Ocdja0Vld9ciBKgXKeKtc6x6AHfq0x6eyGRbF2jJ
Wdkd2a9lLJVJJIf1LBhOQuwjNiEvAgvqO5nwXL77s/rqx3Ur5OlyohSvRFBy7Pqp
8cdnCV/43I/y7k0iGhitFVrEC9TL0cfeJmM7YhXkt8IcpkcpCDfgdI7wgIb0ntvv
dqvrRmfp+Q3/hJ6SVRsy6uzOrjFjRE8hIIG6TiqfRd/FgI5x2xvGkSAE3Wx2YdRM
myPDiAuY2wOyALZpn9zD7qFMOfI2wX8kaX5S/ctnbvLelkmQblI39/zfYuxJ36xC
Mo2yMKcvT37AIMz2fxx3mGnIR7NZBNXVQGJmv/1p9vMQ8RbUXUhT0w6hP5SH9S7m
RDoOkfd37wQugQ7bgI7cqg4hodMRlybGPq8QaKp80y0Ej3cPblM+y0fbR153SSbS
6nCwYceFLdWJEV+tTFpKD5cvxOGeYfoC/8LcVRYRcWg6nlr4+qo61MHevyCe/Qxw
Pw+z9wFpoVKumRT72KmzFUxFQqRnTshE3KJqNJdsqMPM8ZuW5TJ/MtUa+JdAWmSH
dEAqd7Hy5W1CqgGEk+los+QlKw+5uFepcZ7SOuQ2ehME/Ae3xI8s9+UE6yqeHx6Y
PV2s5lyJRCPfm3qOu7TS
=wuir
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next
KVM/ARM changes for Linux v4.5
- Complete rewrite of the arm64 world switch in C, hopefully
paving the way for more sharing with the 32bit code, better
maintainability and easier integration of new features.
Also smaller and slightly faster in some cases...
- Support for 16bit VM identifiers
- Various cleanups
Commit 0976c946a6
"arm/versatile: Fix versatile irq specifications"
has an off-by-one error on the Versatile AB that has
been regressing the Versatile AB hardware for some time.
However it seems like the interrupt assignments have
never been correct and I have now adjusted them according
to the specification. The masks for the valid interrupts
made it impossible to assign the right SIC interrupt
for the MMCI, so I went in and fixed these to correspond
to the specifications, and added references if anyone
wants to double-check.
Due to the Versatile PB including the Versatile AB
as a base DTS file, we need to override and correct
some values to correspond to the actual changes in the
hardware.
For the Versatile PB I don't think the IRQ line
assignment for MMCI has ever been correct for either of
the two MMCI blocks. It would be nice if someone with the
physical PB board could test this.
Patch tested on the Versatile AB, QEMU for Versatile AB
and QEMU for Versatile PB.
Cc: Rob Herring <robh@kernel.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: stable@vger.kernel.org
Fixes: 0976c946a6 ("arm/versatile: Fix versatile irq specifications")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
The Nomadik has sporadic crashes because of these latencies, setting
them to max makes the platform work nicely, so use this values for
now.
These latencies were set to 2 since the Nomadik platform was merged,
but I suspect they never took effect until the right size and
associativity for the cache was specified in the device tree and
that is why the crash comes now.
Cc: stable@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
Pull networking fixes from David Miller:
"As usual, there are a couple straggler bug fixes:
1) qlcnic_alloc_mbx_args() error returns are not checked in qlcnic
driver. Fix from Insu Yun.
2) SKB refcounting bug in connector, from Florian Westphal.
3) vrf_get_saddr() has to propagate fib_lookup() errors to it's
callers, from David Ahern.
4) Fix AF_UNIX splice/bind deadlock, from Rainer Weikusat.
5) qdisc_rcu_free() fails to free the per-cpu qstats. Fix from John
Fastabend.
6) vmxnet3 driver passes wrong page to dma_map_page(), fix from
Shrikrishna Khare.
7) Don't allow zero cwnd in tcp_cwnd_reduction(), from Yuchung Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tcp: fix zero cwnd in tcp_cwnd_reduction
Driver: Vmxnet3: Fix regression caused by 5738a09
net: qmi_wwan: Add WeTelecom-WPD600N
mkiss: fix scribble on freed memory
net: possible use after free in dst_release
net: sched: fix missing free per cpu on qstats
ARM: net: bpf: fix zero right shift
6pack: fix free memory scribbles
net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
bridge: Only call /sbin/bridge-stp for the initial network namespace
af_unix: Fix splice-bind deadlock
net: Propagate lookup failure in l3mdev_get_saddr to caller
r8152: add reset_resume function
connector: bump skb->users before callback invocation
cxgb4: correctly handling failed allocation
qlcnic: correctly handle qlcnic_alloc_mbx_args
A repeating pattern in drivers has become to use OF node information
and, if not found, platform specific host information to extract the
ethernet address for a given device.
Currently this is done with a call to of_get_mac_address() and then
some ifdef'd stuff for SPARC.
Consolidate this into a portable routine, and provide the
arch_get_platform_mac_address() weak function hook for all
architectures to implement if they want.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 63aa945b10 ("memory: omap-gpmc: Add Kconfig option for debug")
unified the GPMC debug for the SoCs with GPMC. The commit also left out
the option for HWMOD_INIT_NO_RESET as we now require proper timings for
GPMC to be able to remap GPMC devices out of address 0.
Unfortunately on Nokia N900, onenand now only partially works with the
device tree provided timings. It works enough to get detected but the
clock rate supported by the onenand chip gets misdetected. This in turn
causes the GPMC timings to be miscalculated and this leads into file
system corruption on N900.
Looks like onenand needs CS_CONFIG1 bit 27 WRITETYPE set for for sync
write. This is needed also for async timings when we write to onenand
with omap2_onenand_set_async_mode(). Without sync write bit set, the
async read for the onenand ONENAND_REG_VERSION_ID will return 0xfff.
Let's exit with an error if onenand rate is not detected. And let's
remove the extra call to omap2_onenand_set_async_mode() as we only need
to do this once at the end of omap2_onenand_setup_async().
Fixes: 63aa945b10 ("memory: omap-gpmc: Add Kconfig option for debug")
Cc: stable@vger.kernel.org # v4.2+
Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Currently we use an open-coded memzero to clear the BSS. As it is a
trivial implementation, it is sub-optimal.
Our optimised memset doesn't use the stack, is position-independent, and
for the memzero case can use of DC ZVA to clear large blocks
efficiently. In __mmap_switched the MMU is on and there are no live
caller-saved registers, so we can safely call an uninstrumented memset.
This patch changes __mmap_switched to use memset when clearing the BSS.
We use the __pi_memset alias so as to avoid any instrumentation in all
kernel configurations.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In work_pending, we may skip work if the stacked SPSR value represents
anything other than an EL0 context. We then immediately invoke the
kernel_exit 0 macro as part of ret_to_user, assuming a return to EL0.
This is somewhat confusing.
We use work_pending as part of the ret_to_user/ret_fast_syscall state
machine. We only use ret_fast_syscall in the return from an SVC issued
from EL0. We use ret_to_user for return from EL0 exception handlers and
also for return from ret_from_fork in the case the task was not a kernel
thread (i.e. it is a user task).
Thus in all cases the stacked SPSR value must represent an EL0 context,
and the check is redundant. This patch removes it, along with the now
unused no_work_pending label.
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This is a long standing bug with the l1-dcache-stores generic event on
AMD machines. My perf_event testsuite has been complaining about this
for years and I'm finally getting around to trying to get it fixed.
The data_cache_refills:system event does not make sense for l1-dcache-stores.
Maybe this was a typo and it was meant to be for l1-dcache-store-misses?
In any case, the values returned are nowhere near correct for l1-dcache-stores
and in fact the umask values for the event have completely changed with
fam15h so it makes even less sense than ever. So just remove it.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1512091134350.24311@vincent-weaver-1.umelst.maine.edu
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Knights Landing uncore performance monitoring (perfmon) is derived from
Haswell-EP uncore perfmon with several differences. One notable difference
is in PCI device IDs. Knights Landing uses common PCI device ID for
multiple instances of an uncore PMU device type. In Haswell-EP, each
instance of a PMU device type has a unique device ID.
Knights Landing uncore components that have performance monitoring units
are UBOX, CHA, EDC, MC, M2PCIe, IRP and PCU. Perfmon registers in EDC, MC,
IRP, and M2PCIe reside in the PCIe configuration space. Perfmon registers
in UBOX, CHA and PCU are accessed via the MSR interface.
For more details, please refer to the public document:
https://software.intel.com/sites/default/files/managed/15/8d/IntelXeonPhi%E2%84%A2x200ProcessorPerformanceMonitoringReferenceManual_Volume1_Registers_v0%206.pdf
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/8ac513981264c3eb10343a3f523f19cc5a2d12fe.1449470704.git.harish.chegondi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Call uncore_pci_box_ctl() function to get the PMON box control MSR offset
instead of hard coding the offset. This would allow us to use this
snbep_uncore_pci_init_box() function for other PCI PMON devices whose box
control MSR offset is different from SNBEP_PCI_PMON_BOX_CTL.
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/872e8ef16cfc38e5ff3b45fac1094e6f1722e4ad.1449470704.git.harish.chegondi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Knights Landing core is based on Silvermont core with several differences.
Like Silvermont, Knights Landing has 8 pairs of LBR MSRs. However, the
LBR MSRs addresses match those of the Xeon cores' first 8 pairs of LBR MSRs
Unlike Silvermont, Knights Landing supports hyperthreading. Knights Landing
offcore response events config register mask is different from that of the
Silvermont.
This patch was developed based on a patch from Andi Kleen.
For more details, please refer to the public document:
https://software.intel.com/sites/default/files/managed/15/8d/IntelXeonPhi%E2%84%A2x200ProcessorPerformanceMonitoringReferenceManual_Volume1_Registers_v0%206.pdf
Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Harish Chegondi <harish.chegondi@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/d14593c7311f78c93c9cf6b006be843777c5ad5c.1449517401.git.harish.chegondi@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The uncore subsystem for Broadwell-EP is similar to Haswell-EP.
There are some differences in pci device IDs, box number and
constraints. This patch extends the Broadwell-DE codes to support
Broadwell-EP.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1449176411-9499-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch updates the PEBS support for Intel Atom to provide
an alias for the cycles:pp event used by perf record/top by default
nowadays.
On Atom, only INST_RETIRED:ANY supports PEBS, so we use this event
instead with a large cmask to count cycles. Given that Core2 has
the same issue, we use the intel_pebs_aliases_core2() function for Atom
as well.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1449172990-30183-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes broken PEBS support on Intel Atom and Core2
due to wrong pointer arithmetic in intel_pmu_drain_pebs_core().
The get_next_pebs_record_by_bit() was called on PEBS format fmt0
which does not use the pebs_record_nhm layout.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 21509084f9 ("perf/x86/intel: Handle multiple records in the PEBS buffer")
Link: http://lkml.kernel.org/r/1449182000-31524-3-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patches fixes the LBR kernel crashes on Intel Atom.
The kernel was assuming that if the CPU supports 64-bit format
LBR, then it has an LBR_SELECT MSR. Atom uses 64-bit LBR format
but does not have LBR_SELECT. That was causing NULL pointer
dereferences in a couple of places.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Fixes: 96f3eda67f ("perf/x86/intel: Fix static checker warning in lbr enable")
Link: http://lkml.kernel.org/r/1449182000-31524-2-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes a bug in the filter_events() function.
The patch fixes the bug whereby if some mappings did not
exist, e.g., STALLED_CYCLES_FRONTEND, then any event after it
in the attrs array would disappear from the published list of
events in /sys/devices/cpu/events. This could be verified
easily on any system post SNB (which do not publish
STALLED_CYCLES_FRONTEND):
$ ./perf stat -e cycles,ref-cycles true
Performance counter stats for 'true':
1,217,348 cycles
<not supported> ref-cycles
The problem is that in filter_events() there is an assumption
that the argument (attrs) is organized in increasing continuous
event indexes related to the event_map(). But if we remove the
non-supported events by shifing the position in the array, then
the lookup x86_pmu.event_map() needs to compensate for it, otherwise
we are looking up the wrong index. This patch corrects this problem
by compensating for the deleted events and with that ref-cycles
reappears (here shown on Haswell):
$ perf stat -e ref-cycles,cycles true
Performance counter stats for 'true':
4,525,910 ref-cycles
1,064,920 cycles
0.002943888 seconds time elapsed
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: jolsa@kernel.org
Cc: kan.liang@intel.com
Fixes: 8300daa267 ("perf/x86: Filter out undefined events from sysfs events attribute")
Link: http://lkml.kernel.org/r/1449516805-6637-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a new 'three-p' precise level, that uses INST_RETIRED.PREC_DIST as
base. The basic mechanism of abusing the inverse cmask to get all
cycles works the same as before.
PREC_DIST is available on Sandy Bridge or later. It had some problems
on Sandy Bridge, so we only use it on IvyBridge and later. I tested it
on Broadwell and Skylake.
PREC_DIST has special support for avoiding shadow effects, which can
give better results compare to UOPS_RETIRED. The drawback is that
PREC_DIST can only schedule on counter 1, but that is ok for cycle
sampling, as there is normally no need to do multiple cycle sampling
runs in parallel. It is still possible to run perf top in parallel, as
that doesn't use precise mode. Also of course the multiplexing can
still allow parallel operation.
:pp stays with the previous event.
Example:
Sample a loop with 10 sqrt with old cycles:pp
0.14 │10: sqrtps %xmm1,%xmm0 <--------------
9.13 │ sqrtps %xmm1,%xmm0
11.58 │ sqrtps %xmm1,%xmm0
11.51 │ sqrtps %xmm1,%xmm0
6.27 │ sqrtps %xmm1,%xmm0
10.38 │ sqrtps %xmm1,%xmm0
12.20 │ sqrtps %xmm1,%xmm0
12.74 │ sqrtps %xmm1,%xmm0
5.40 │ sqrtps %xmm1,%xmm0
10.14 │ sqrtps %xmm1,%xmm0
10.51 │ ↑ jmp 10
We expect all 10 sqrt to get roughly the sample number of samples.
But you can see that the instruction directly after the JMP is
systematically underestimated in the result, due to sampling shadow
effects.
With the new PREC_DIST based sampling this problem is gone and all
instructions show up roughly evenly:
9.51 │10: sqrtps %xmm1,%xmm0
11.74 │ sqrtps %xmm1,%xmm0
11.84 │ sqrtps %xmm1,%xmm0
6.05 │ sqrtps %xmm1,%xmm0
10.46 │ sqrtps %xmm1,%xmm0
12.25 │ sqrtps %xmm1,%xmm0
12.18 │ sqrtps %xmm1,%xmm0
5.26 │ sqrtps %xmm1,%xmm0
10.13 │ sqrtps %xmm1,%xmm0
10.43 │ sqrtps %xmm1,%xmm0
0.16 │ ↑ jmp 10
Even with PREC_DIST there is still sampling skid and the result is not
completely even, but systematic shadow effects are significantly
reduced.
The improvements are mainly expected to make a difference in high IPC
code. With low IPC it should be similar.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1448929689-13771-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I added UOPS_RETIRED.ALL by mistake to the Skylake PEBS event list for
cycles:pp. But the event is not documented for Skylake, and has some
issues.
The recommended replacement for cycles:pp is to use
INST_RETIRED.ANY+pebs as a base, similar to what CPUs before Sandy
Bridge did. This new event is called INST_RETIRED.TOTAL_CYCLES_PS. The
event is not really new, but has been already used by perf before
Sandy Bridge for the original cycles:p
Note the SDM doesn't document that event either, but it's being
documented in the latest version of the event list on:
https://download.01.org/perfmon/SKL
This patch does:
- Remove UOPS_RETIRED.ALL from the Skylake PEBS event list
- Add INST_RETIRED.ANY to the Skylake PEBS event list, and an table entry to
allow cmask=16,inv=1 for cycles:pp
- We don't need an extra entry for the base INST_RETIRED event,
because it is already covered by the catch-all PEBS table entry.
- Switch Skylake to use the Core2 PEBS alias (which is
INST_RETIRED.TOTAL_CYCLES_PS)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1448929689-13771-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Normally we drop PEBS events with a zero status field. But when
there is only a single PEBS event active we can assume the
PEBS record is for that event. The PEBS buffer is always flushed
when PEBS events are disabled, so there is no risk of mishandling
state PEBS records this way.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1449177740-5422-2-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The recent commit:
75f80859b1 ("perf/x86/intel/pebs: Robustify PEBS buffer drain")
causes lots of warnings on different CPUs before Skylake
when running PEBS intensive workloads.
They can have a zero status field in the PEBS record when
PEBS is racing with clearing of GLOBAl_STATUS.
This also can cause hangs (it seems there are still
problems with printk in NMI).
Disable the warning, but still ignore the record.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1449177740-5422-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The CHECK_MEMBER_AT_END_OF(TYPE, MEMBER) checks whether MEMBER
is last member of TYPE by evaluating:
offsetof(TYPE::MEMBER) + sizeof(TYPE::MEMBER) == sizeof(TYPE)
and ensuring TYPE::MEMBER is the last member of the TYPE.
This condition breaks on structs that are padded to be
aligned. This patch ensures the TYPE alignment is taken
into account.
This bug was revealed after adding cacheline alignment into
struct sched_entity, which broke task_struct::thread check:
CHECK_MEMBER_AT_END_OF(struct task_struct, thread);
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1450707930-3445-1-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
arch/x86/built-in.o: In function `arch_setup_additional_pages':
(.text+0x587): undefined reference to `pvclock_pvti_cpu0_va'
KVM_GUEST selects PARAVIRT_CLOCK, so we can make pvclock_pvti_cpu0_va depend
on KVM_GUEST.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/444d38a9bcba832685740ea1401b569861d09a72.1451446564.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The LSR instruction cannot be used to perform a zero right shift since a
0 as the immediate value (imm5) in the LSR instruction encoding means
that a shift of 32 is perfomed. See DecodeIMMShift() in the ARM ARM.
Make the JIT skip generation of the LSR if a zero-shift is requested.
This was found using american fuzzy lop.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value. All the BPF JITs fail to clear A if this is used as
the first instruction in a filter. This was found using american fuzzy
lop.
Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs. Except for ARM, the
rest have only been compile-tested.
Fixes: 3480593131 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent <rabin@rab.in>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* fix IBSS which got broken over time
* new USB id for bcm43242 dongle
* arp offload configuration through inet notifier
ath9k
* add random number generator support (CONFIG_ATH9K_HWRNG)
iwlwifi
* Make scan parameters low latency aware
* Fix in the NL80211_FEATURE_FULL_AP_CLIENT_STATE state case
* Fix enable injection mode (Chaya Rachel)
* Various cleanups (Dan / Julia / myself)
* Allow to stay more time on popular channels (David Spinadel)
* Bug fixes for D0i3 (Eliad / Luca)
* Fixes for GO uAPSD (myself)
* Start of TSO support (myself)
* Rate control bug fixes (Eyal / Gregory)
* Start the work on 9000 devices (Johannes / Sara / Oren)
* Start the work on a new Tx queue allocation model (Liad)
* Debug infrastructure enhancements (Golan)
mwifiex
* add a debugfs file for chip reset
* advertise SMS4 cipher suite
* increase ap and station interface limit to 3
* enable MSI support on newer pcie devices (8897 onwards)
rtlwifi
* fix lots of module parameter usage
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJWi5RxAAoJEG4XJFUm622bujQH/Rti4fRBnJcRt5EZAv/pvdIx
AGPcYQQfFbFRjzxjrcY30AYCC6OoID9n7BNtELIWabA6dQHWGJd2aebZGH85O4Eh
xgAty5CRf9E4+9MNiv+OqkL64/E6/YmSeOYiNSWl6SgRE4VYmexLXttpppImmoER
zowmtqzLSN/4l4yX5p2BVy3NAXN0R1WjYKNUXAJvJj59814upXf+XrOKCghzHIs4
2NE/aFoscscKWCSS3+MIIW4q9QWw7f8YvAdrO6k47LLkjm4+C/I5BeAlHUOEQUS7
xyoB/7KoHXlzRen8LmoL9SsqopTL2tKEjE/1f0QgLKOHXqpPRb4iKBFjsR94HJg=
=aeCM
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2016-01-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
brcfmac
* fix IBSS which got broken over time
* new USB id for bcm43242 dongle
* arp offload configuration through inet notifier
ath9k
* add random number generator support (CONFIG_ATH9K_HWRNG)
iwlwifi
* Make scan parameters low latency aware
* Fix in the NL80211_FEATURE_FULL_AP_CLIENT_STATE state case
* Fix enable injection mode (Chaya Rachel)
* Various cleanups (Dan / Julia / myself)
* Allow to stay more time on popular channels (David Spinadel)
* Bug fixes for D0i3 (Eliad / Luca)
* Fixes for GO uAPSD (myself)
* Start of TSO support (myself)
* Rate control bug fixes (Eyal / Gregory)
* Start the work on 9000 devices (Johannes / Sara / Oren)
* Start the work on a new Tx queue allocation model (Liad)
* Debug infrastructure enhancements (Golan)
mwifiex
* add a debugfs file for chip reset
* advertise SMS4 cipher suite
* increase ap and station interface limit to 3
* enable MSI support on newer pcie devices (8897 onwards)
rtlwifi
* fix lots of module parameter usage
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull tile bugfix from Chris Metcalf:
"This fixes a bug that Sudip's buildbot found for tilepro allmodconfig.
I've tagged it for stable only back to 3.19, which was when most of
the other affected architectures added their support for working
around this issue"
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: provide CONFIG_PAGE_SIZE_64KB etc for tilepro
Initialising the suppport for EFI runtime services requires us to
allocate a pgd off the back of an early_initcall. On systems where the
PGD_SIZE is smaller than PAGE_SIZE (e.g. 64k pages and 48-bit VA), the
pgd_cache isn't initialised at this stage, and we panic with a NULL
dereference during boot:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
__create_mapping.isra.5+0x84/0x350
create_pgd_mapping+0x20/0x28
efi_create_mapping+0x5c/0x6c
arm_enable_runtime_services+0x154/0x1e4
do_one_initcall+0x8c/0x190
kernel_init_freeable+0x84/0x1ec
kernel_init+0x10/0xe0
ret_from_fork+0x10/0x50
This patch fixes the problem by initialising the pgd_cache earlier, in
the pgtable_cache_init callback, which sounds suspiciously like what it
was intended for.
Reported-by: Dennis Chen <dennis.chen@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This allows the build system to know that it can't attempt to
configure the Lustre virtual block device, for example, when tilepro
is using 64KB pages (as it does by default). The tilegx build
already provided those symbols.
Previously we required that the tilepro hypervisor be rebuilt with
a different hardcoded page size in its headers, and then Linux be
rebuilt using the updated hypervisor header. Now we allow each of
the hypervisor and Linux to be built independently. We still check
at boot time to ensure that the page size provided by the hypervisor
matches what Linux expects.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: stable@vger.kernel.org [3.19+]
Compilers may engage the improbability drive when encountering shifts
by a distance that is a multiple of the size of the operand type. Since
the required bounds check is very simple here, we can get rid of all the
fuzzy masking, shifting and comparing, and use the documented bounds
directly.
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The test whether a movz instruction with a signed immediate should be
turned into a movn instruction (i.e., when the immediate is negative)
is flawed, since the value of imm is always positive. Also, the
subsequent bounds check is incorrect since the limit update never
executes, due to the fact that the imm_type comparison will always be
false for negative signed immediates.
Let's fix this by performing the sign test on sval directly, and
replacing the bounds check with a simple comparison against U16_MAX.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: tidied up use of sval, renamed MOVK enum value to MOVKZ]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Using mremap() to shrink the map size of a VM_PFNMAP range causes
the following error message, and leaves the pfn range allocated.
x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff]
This is because rbt_memtype_erase(), called from free_memtype()
with spin_lock held, only supports to free a whole memtype node in
memtype_rbroot. Therefore, this patch changes rbt_memtype_erase()
to support a request that shrinks the size of a memtype node for
mremap().
memtype_rb_exact_match() is renamed to memtype_rb_match(), and
is enhanced to support EXACT_MATCH and END_MATCH in @match_type.
Since the memtype_rbroot tree allows overlapping ranges,
rbt_memtype_erase() checks with EXACT_MATCH first, i.e. free
a whole node for the munmap case. If no such entry is found,
it then checks with END_MATCH, i.e. shrink the size of a node
from the end for the mremap case.
On the mremap case, rbt_memtype_erase() proceeds in two steps,
1) remove the node, and then 2) insert the updated node. This
allows proper update of augmented values, subtree_max_end, in
the tree.
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: stsp@list.ru
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1450832064-10093-3-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
mremap() with MREMAP_FIXED on a VM_PFNMAP range causes the following
WARN_ON_ONCE() message in untrack_pfn().
WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0()
Call Trace:
[<ffffffff817729ea>] dump_stack+0x45/0x57
[<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0
[<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20
[<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0
[<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860
[<ffffffff811d3725>] unmap_vmas+0x55/0xb0
[<ffffffff811d916c>] unmap_region+0xac/0x120
[<ffffffff811db86a>] do_munmap+0x28a/0x460
[<ffffffff811dec33>] move_vma+0x1b3/0x2e0
[<ffffffff811df113>] SyS_mremap+0x3b3/0x510
[<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71
MREMAP_FIXED moves a pfnmap from old vma to new vma. untrack_pfn() is
called with the old vma after its pfnmap page table has been removed,
which causes follow_phys() to fail. The new vma has a new pfnmap to
the same pfn & cache type with VM_PAT set. Therefore, we only need to
clear VM_PAT from the old vma in this case.
Add untrack_pfn_moved(), which clears VM_PAT from a given old vma.
move_vma() is changed to call this function with the old vma when
VM_PFNMAP is set. move_vma() then calls do_munmap(), and untrack_pfn()
is a no-op since VM_PAT is cleared.
Reported-by: Stas Sergeev <stsp@list.ru>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1450832064-10093-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Expose socket options for setting a classic or extended BPF program
for use when selecting sockets in an SO_REUSEPORT group. These options
can be used on the first socket to belong to a group before bind or
on any socket in the group after bind.
This change includes refactoring of the existing sk_filter code to
allow reuse of the existing BPF filter validation checks.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.
Link: http://lkml.kernel.org/r/1449367378-29430-6-git-send-email-huawei.libin@huawei.com
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Switch to use a generic interface for issuing SMC/HVC based on ARM SMC
Calling Convention. Removes now the now unused psci-call.S.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adds implementation for arm-smccc and enables CONFIG_HAVE_SMCCC.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adds implementation for arm-smccc and enables CONFIG_HAVE_SMCCC for
architectures that may support arm-smccc. It's the responsibility of the
caller to know if the SMC instruction is supported by the platform.
Reviewed-by: Lars Persson <lars.persson@axis.com>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A _lot_ of ->write() instances were open-coding it; some are
converted to memdup_user_nul(), a lot more remain...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
On a cancelled suspend the vcpu_info location does not change (it's
still in the per-cpu area registered by xen_vcpu_setup()). So do not
call xen_hvm_init_shared_info() which would make the kernel think its
back in the shared info. With the wrong vcpu_info, events cannot be
received and the domain will hang after a cancelled suspend.
Signed-off-by: Charles Ouyang <ouyangzhaowei@huawei.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
The VMSA field of MMFR0 (bottom 4 bits) is incremented for each
added feature. PXN is supported if the value is >= 4 and LPAE
is supported if it is >= 5.
In case a kernel with CONFIG_ARM_LPAE disabled is used on a
processor that supports LPAE, we can still use PXN in short
descriptors. So check for >= 4 not == 4.
Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This fixes a regression with device tree based booting compared to legacy booting for n900 to make the n900 legacy user space to also work with device tree based booting
Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
So it can be used by code outside arch/arm/kernel/. Fix save_atags()
declaration to match its definition while at it.
Signed-off-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This implements UEFI kernel support for 32-bit ARM, based on the existing
arm64 support and existing generic early ioremap support. It is based on
commit f7d9248942 ("arm64/efi: refactor EFI init and runtime code for
reuse by 32-bit ARM"), which was pulled from the arm64 repo [1] as branch
'aarch64/efi'
[1] git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
The PJ4 inline asm sequence to write to cp15 cannot be built in Thumb-2
mode, due to the way it performs arithmetic on the program counter, so it
is built in ARM mode instead. However, building C files in ARM mode under
CONFIG_THUMB2_KERNEL is problematic, since the instrumentation performed
by subsystems like ftrace does not expect having to deal with interworking
branches.
Since the sequence in question is simply a poor man's ISB instruction,
let's use a straight 'isb' instead when building in Thumb2 mode. Thumb2
implies V7, so 'isb' should always be supported in that case.
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The ld-version.sh script doesn't handle versions with large (>= 10) 3rd
version components, because the 2nd component is only multiplied by 10
times that of the 3rd component.
For example the following version string:
GNU ld (Codescape GNU Tools 2015.06-05 for MIPS MTI Linux) 2.24.90
gives a bogus version number:
20000000
+ 2400000
+ 900000 = 23300000
Breakage, confusion and mole-whacking ensues.
Increase the multipliers of the first two version components by a factor
of 10 to give space for a 3rd components of up to 99, and update the
sole user of ld-ifversion (MIPS VDSO) accordingly.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11931/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
As we want gpio_chip .get() calls to be able to return negative
error codes and propagate to drivers, we need to go over all
drivers and make sure their return values are clamped to [0,1].
We do this by using the ret = !!(val) design pattern.
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Here's another device id for cp210x.
Signed-off-by: Johan Hovold <johan@kernel.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJWhQCUAAoJEEEN5E/e4bSVNZAQAI1guXglsP9KHpcMBlxHw6j+
QjCeX07sbArIsJqDFITzSrs9WJbVyXPvOTLtmuuSIeV2qhFPZeRuxztjTng7Q+xK
Ym5ioyGVbGv7UqzRGI2wX+bNNSS8dW3pFnw78xfZ78kbNM6XVKNI+qecUUE7swqb
BNPeIc/vMtiJpvrKdhCQsel48vlGcCQJFV0ELj7serOi1dO7xrCSCcu0Afj04qbQ
I4ZAukYih89vVzGo4rC6u1SqnKr9GVUd56/FEbIRLc9Sq3dUkwmFwKjiqquoL8vb
g3a/rNgCOPYUAcuz3NIr1anYLlW9pwcWKzfl7brz0hxCT1cfcbSYGsGKa4coKqXd
tJoQ2W5AJZ07NaK/YEZClWv6/qQr203mUhs0KRH6fQDk7Gz6h0HGZljY9utru0GT
KyDuCVKomOTVUqUH7pOKcY7WHss9P/AD6+Gw7cf8dArhUJ3lReOu54wZ44FvUl8c
5TPl8hht0LUb9FNWGlq8LvoLwuMx75vUCRVm5doF+rnH2FZAOCORb1HDi3LG7Pe3
cg224WNyqxKIIAC7BkzYK/y9qeU6RtkbjFYv43B2R9Gdn5iNbFkqIxZu7LIcR7ph
cfijHzXvwLdl2ucFlE8xNrs1doN0IICulc1kjwdKegGLv6pInn5iRx8ZEFJi/8Qh
Z0CB5oVOj0nTHHUW959q
=jeCr
-----END PGP SIGNATURE-----
Merge tag 'usb-serial-4.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-next
Johan writes:
USB-serial fixes for v4.4-rc8
Here's another device id for cp210x.
Signed-off-by: Johan Hovold <johan@kernel.org>
Pull MIPS build fix from Ralf Baechle:
"Fix a makefile issue resulting in build breakage with older binutils.
This has sat in -next for a few days, testers and buildbot are happy
with it, too though if you are going for another -rc that'd certainly
help ironing out a few more issues"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: VDSO: Fix build error with binutils 2.24 and earlier
Pull sparc fixes from David Miller:
"Just some missing syscall wire ups"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Wire up mlock2 system call.
sparc: Add all necessary direct socket system calls.
The GLIBC folks would like to eliminate socketcall support
eventually, and this makes sense regardless so wire them
all up.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 69fb4dcada ("power: Add an axp20x-usb-power driver") introduced a new
driver for the USB power supply used on various Allwinner based SBCs. However,
the driver was not added to sunxi_defconfig which breaks USB support for some
boards (e.g. LeMaker BananaPi) as the kernel will now turn off the USB power
supply during boot by default if the driver isn't present. (This was not the
case in linux 4.3 or lower where the USB power was always left on.)
Hence, add the driver to sunxi_defconfig in order to keep USB support working
on those boards that require it.
Signed-off-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Reported-by: David Tulloh <david@tulloh.id.au>
Tested-by: David Tulloh <david@tulloh.id.au>
Tested-by: Timo Sigurdsson <public_timo.s@silentcreek.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
The MMCFG PCI accessors weren't being setup for NumacConnect2
correctly due to over-early assignment; this would create the
potential for the wrong PCI domain to be accessed.
Fix this by using the correct arch-specific PCI init function.
Signed-off-by: Daniel J Blueman <daniel@numascale.com>
Acked-by: Steffen Persvold <sp@numascale.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1451498807-15920-1-git-send-email-daniel@numascale.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Use CONFIG_TOPOLOGY which selects CONFIG_SCHED_* all over the place to
reduce the random usage of the previous config options.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge misc fixes from Andrew Morton:
"9 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/vmstat: fix overflow in mod_zone_page_state()
ocfs2/dlm: clear migration_pending when migration target goes down
mm/memory_hotplug.c: check for missing sections in test_pages_in_a_zone()
ocfs2: fix flock panic issue
m32r: add io*_rep helpers
m32r: fix build failure
arch/x86/xen/suspend.c: include xen/xen.h
mm: memcontrol: fix possible memcg leak due to interrupted reclaim
ocfs2: fix BUG when calculate new backup super
m32r allmodconfig was failing with the error:
error: implicit declaration of function 'read'
On checking io.h it turned out that 'read' is not defined but 'readb' is
defined and 'ioread8' will then obviously mean 'readb'.
At the same time some of the helper functions ioreadN_rep() and
iowriteN_rep() were missing which also led to the build failure.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
m32r allmodconfig is failing with:
In file included from ../include/linux/kvm_para.h:4:0,
from ../kernel/watchdog.c:26:
../include/uapi/linux/kvm_para.h:30:26: fatal error: asm/kvm_para.h: No such file or directory
kvm_para.h was not included in the build.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the build warning:
arch/x86/xen/suspend.c: In function 'xen_arch_pre_suspend':
arch/x86/xen/suspend.c:70:9: error: implicit declaration of function 'xen_pv_domain' [-Werror=implicit-function-declaration]
if (xen_pv_domain())
^
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 2a037f310b ("MIPS: VDSO: Fix build error") tries to fix a build
error seen with binutils 2.24 and earlier. However, the fix does not work,
and again results in the already known build errors if the kernel is built
with an earlier version of binutils.
CC arch/mips/vdso/gettimeofday.o
/tmp/ccnOVbHT.s: Assembler messages:
/tmp/ccnOVbHT.s:50: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
/tmp/ccnOVbHT.s:374: Error: can't resolve `_start' {*UND* section} - `L0 {.text section}
scripts/Makefile.build:258: recipe for target 'arch/mips/vdso/gettimeofday.o' failed
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
Fixes: 2a037f310b ("MIPS: VDSO: Fix build error")
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11926/
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
ioremapping multiple BARs produces a warning with a message "Your kernel is
fine". This message mostly serves to comfort kernel developers. Users do
not read the message, they only see the big scary warning which means
something must be horribly broken with their system. Less dramatically, the
warn also sets the taint flag which makes it difficult to differentiate
problems. If the kernel is actually fine as the warning claims it doesn't
make sense for it to be tainted. Change the WARN_ONCE to a pr_warn with the
caller of the ioremap.
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Link: http://lkml.kernel.org/r/1450728074-31029-1-git-send-email-labbott@fedoraproject.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This was meant to print base address and entry count; make it do so
again.
Fixes: 37868fe113 "x86/ldt: Make modify_ldt synchronous"
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: http://lkml.kernel.org/r/56797D8402000078000C24F0@prv-mh.provo.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The related warning from gcc 6.0:
arch/x86/kernel/ptrace.c:127:18: warning: ‘arg_offs_table’ defined but not used [-Wunused-const-variable]
static const int arg_offs_table[] = {
^~~~~~~~~~~~~~
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Link: http://lkml.kernel.org/r/1451137798-28701-1-git-send-email-chengang@emindsoft.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is a new device driver for a high performance SR-IOV assisted virtual
network for IBM System p and IBM System i systems. The SR-IOV VF will be
attached to the VIOS partition and mapped to the Linux client via the
hypervisor's VNIC protocol that this driver implements.
This driver is able to perform basic tx and rx, new features
and improvements will be added as they are being developed and tested.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS fixes from Ralf Baechle:
- Fix bitrot in __get_user_unaligned()
- EVA userspace accessor bug fixes.
- Fix for build issues with certain toolchains.
- Fix build error for VDSO with particular toolchain versions.
- Fix build error due to a variable that should have been removed by an
earlier patch
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Fix bitrot in __get_user_unaligned()
MIPS: Fix build error due to unused variables.
MIPS: VDSO: Fix build error
MIPS: CPS: drop .set mips64r2 directives
MIPS: uaccess: Take EVA into account in [__]clear_user
MIPS: uaccess: Take EVA into account in __copy_from_user()
MIPS: uaccess: Fix strlen_user with EVA
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
- A clock rate and a PHY setup fix for i.MX6Q/DL
- A couple of fixes for the reduced serial bus (sunxi-rsb) on Allwinner
- UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
- Suspend fix for Tegra124 Chromebooks
- Fix for missing implicit include that's different between ARM/ARM64
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWgDAQAAoJEIwa5zzehBx3muAP/RJD+oERDw/XiBSCl4k4kvTZ
M4ep2ExO/GPXewKXyK7rlbNWMitBwttdT5OmXuuxT7BA/ODAGk90uvZSebIHRxkh
bqnt+3njeR3M6FrTp6Np+yCh3I0hLYJoRInV3Vh8XQcml0LobIIq2BbdahIg/2JO
sSDLAzJSR15CfnnKOSexvfRFoslLaKG0ffbxmR32HiMgnbq8ZfkRr7fANsrXTzoX
oHxEV0uQXBY0d1TQ71rT/4KdTKN9QsdJI6xAjUGH95MYu+ZE0DEnW/wodd/f4e+p
dyV+qoS2LHJnhYgJho3r19icM0iyNRm0Yt0GqP7ERN/y5GGa41slE9eThMG7WQ4h
Ot5DEF2GmP7tSYhp4pqEeLqHnfou9+WzxxNP6wxGceqlg9EPuPRtzdCPStZYW4rD
6+SQCvFs0vHZlEbbAfQLhRgDpvr9enGjCcWm6ntUcgwxFy0CPmRV9g5gIeipZOwi
QfJM233tudUkDQxJYrCEgKxHy3a7T3K9LgYMM0VO1JRAEpaPIBmxZ0A+Y84WJbTE
7r+r3eDC8RFF3bLbNRb/Ogt5OHOQUdElhRXcyz/BYA/X4HOT4wyVNM12aBlmW9uN
aDB4HWE2lKV/h2pxWRR1Hem1NHYRUpcdVzkwRvncDHnxCAmb82BE2x1Ub3+fqa0v
f0eO7GjUDRdPsQXn0zZn
=kNRV
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smallish set of fixes that we've been sitting on for a while now,
flushing the queue here so they go in. Summary:
A handful of fixes for OMAP, i.MX, Allwinner and Tegra:
- A clock rate and a PHY setup fix for i.MX6Q/DL
- A couple of fixes for the reduced serial bus (sunxi-rsb) on
Allwinner
- UART wakeirq fix for an OMAP4 board, timer config fixes for AM43XX.
- Suspend fix for Tegra124 Chromebooks
- Fix for missing implicit include that's different between
ARM/ARM64"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: tegra: Fix suspend hang on Tegra124 Chromebooks
bus: sunxi-rsb: Fix peripheral IC mapping runtime address
bus: sunxi-rsb: Fix primary PMIC mapping hardware address
ARM: dts: Fix UART wakeirq for omap4 duovero parlor
ARM: OMAP2+: AM43xx: select ARM TWD timer
ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
fsl-ifc: add missing include on ARM64
ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
bus: sunxi-rsb: unlock on error in sunxi_rsb_read()
ARM: dts: sunxi: sun6i-a31s-primo81.dts: add touchscreen axis swapping property
- Unwinder rework (A revert followed by better fix)
- Build errors: MMUv2, modules with -Os
- highmem section mismatch build splat
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWfjpNAAoJEGnX8d3iisJeC08P+wRkkvtK0R5iZPqfTdSUaxFz
CYoVM8mAMF+non+b7a4bWHChDJ+HlHxqsQWbq2A/EsET/TOXmnr28KquC/qfplwd
p2I38xxsCSHK0hU1+x47VdxZdtO5VNdp+fsq7CQbHLUwg9TMCK4t8OpslrA3IbOG
W+YlHQ+7Wwgu1GbHhWOHDTPb/yqX0WByX211HomURlo9Ppi3Xv3W6xK7uGn0wUxS
DKp0pEWhLs8UQh6zGAf3zFJVjuuWpJHwAoXmQgfEeXYkA/ZOBCEh8sf+/fFZMxuK
RAItN1ZTV3HwIYh3ykJyap877MBksb7nsCE5R0l+LQW0nV485omfJgWvo2hFUL5D
Nrr93bcu8msn/mKfYtBaaQdohwlGsNY5GFaRGEv6TgKZWifFX+uUOBnZnbr2RLvn
3/cDeeK/24sYvxaHGvBs3tjnJkwWPpofr9ZHCnQT+X8MgxjHQhcBesDq/8lb90m6
G2RRqXQomdKkMBacejpKyvnulTEwMrU9itpxyIT7wfxCVGKxfdWHNHLW8KNtXsDn
02Ae7XUyuciV6RIF9lXyGa+I73T/4/ut02b4Qq7PrdfLQucwZKjIQ9nj1Om9+taV
1nSdm+VttO/Nqu2PoMKluxNHBpS1kxzEVFZWa/mXfFs+qmwRMLyKqjz0Tog+hBCv
O7951Yzp3J88HTQIwUMD
=h2pT
-----END PGP SIGNATURE-----
Merge tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Sorry for this late pull request, but these are all important fixes
for code introduced/updated in this release which we will otherwise
end up back porting.
- Unwinder rework (A revert followed by better fix)
- Build errors: MMUv2, modules with -Os
- highmem section mismatch build splat"
* tag 'arc-4.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: dw2 unwind: Catch Dwarf SNAFUs early
ARC: dw2 unwind: Don't bail for CIE.version != 1
Revert "ARC: dw2 unwind: Ignore CIE version !=1 gracefully instead of bailing"
ARC: Fix linking errors with CONFIG_MODULE + CONFIG_CC_OPTIMIZE_FOR_SIZE
ARC: mm: fix building for MMU v2
ARC: mm: HIGHMEM: Fix section mismatch splat
Pull parisc system call restart fix from Helge Deller:
"The architectural design of parisc always uses two instructions to
call kernel syscalls (delayed branch feature). This means that the
instruction following the branch (located in the delay slot of the
branch instruction) is executed before control passes to the branch
destination.
Depending on which assembler instruction and how it is used in
usersapce in the delay slot, this sometimes made restarted syscalls
like futex() and poll() failing with -ENOSYS"
* 'parisc-4.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix syscall restarts
Pull sparc fixes from David Miller:
1) Finally make perf stack backtraces stable on sparc, several problems
(mostly due to the context in which the user copies from the stack
are done) contributed to this.
From Rob Gardner.
2) Export ADI capability if the cpu supports it.
3) Hook up userfaultfd system call.
4) When faults happen during user copies we really have to clean up and
restore the FPU state fully. Also from Rob Gardner
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
tty/serial: Skip 'NULL' char after console break when sysrq enabled
sparc64: fix FP corruption in user copy functions
sparc64: Perf should save/restore fault info
sparc64: Ensure perf can access user stacks
sparc64: Don't set %pil in rtrap_nmi too early
sparc64: Add ADI capability to cpu capabilities
tty: serial: constify sunhv_ops structs
sparc: Hook up userfaultfd system call
Short story: Exception handlers used by some copy_to_user() and
copy_from_user() functions do not diligently clean up floating point
register usage, and this can result in a user process seeing invalid
values in floating point registers. This sometimes makes the process
fail.
Long story: Several cpu-specific (NG4, NG2, U1, U3) memcpy functions
use floating point registers and VIS alignaddr/faligndata to
accelerate data copying when source and dest addresses don't align
well. Linux uses a lazy scheme for saving floating point registers; It
is not done upon entering the kernel since it's a very expensive
operation. Rather, it is done only when needed. If the kernel ends up
not using FP regs during the course of some trap or system call, then
it can return to user space without saving or restoring them.
The various memcpy functions begin their FP code with VISEntry (or a
variation thereof), which saves the FP regs. They conclude their FP
code with VISExit (or a variation) which essentially marks the FP regs
"clean", ie, they contain no unsaved values. fprs.FPRS_FEF is turned
off so that a lazy restore will be triggered when/if the user process
accesses floating point regs again.
The bug is that the user copy variants of memcpy, copy_from_user() and
copy_to_user(), employ an exception handling mechanism to detect faults
when accessing user space addresses, and when this handler is invoked,
an immediate return from the function is forced, and VISExit is not
executed, thus leaving the fprs register in an indeterminate state,
but often with fprs.FPRS_FEF set and one or more dirty bits. This
results in a return to user space with invalid values in the FP regs,
and since fprs.FPRS_FEF is on, no lazy restore occurs.
This bug affects copy_to_user() and copy_from_user() for NG4, NG2,
U3, and U1. All are fixed by using a new exception handler for those
loads and stores that are done during the time between VISEnter and
VISExit.
n.b. In NG4memcpy, the problematic code can be triggered by a copy
size greater than 128 bytes and an unaligned source address. This bug
is known to be the cause of random user process memory corruptions
while perf is running with the callgraph option (ie, perf record -g).
This occurs because perf uses copy_from_user() to read user stacks,
and may fault when it follows a stack frame pointer off to an
invalid page. Validation checks on the stack address just obscure
the underlying problem.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There have been several reports of random processes being killed with
a bus error or segfault during userspace stack walking in perf. One
of the root causes of this problem is an asynchronous modification to
thread_info fault_address and fault_code, which stems from a perf
counter interrupt arriving during kernel processing of a "benign"
fault, such as a TSB miss. Since perf_callchain_user() invokes
copy_from_user() to read user stacks, a fault is not only possible,
but probable. Validity checks on the stack address merely cover up the
problem and reduce its frequency.
The solution here is to save and restore fault_address and fault_code
in perf_callchain_user() so that the benign fault handler is not
disturbed by a perf interrupt.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an interrupt (such as a perf counter interrupt) is delivered
while executing in user space, the trap entry code puts ASI_AIUS in
%asi so that copy_from_user() and copy_to_user() will access the
correct memory. But if a perf counter interrupt is delivered while the
cpu is already executing in kernel space, then the trap entry code
will put ASI_P in %asi, and this will prevent copy_from_user() from
reading any useful stack data in either of the perf_callchain_user_X
functions, and thus no user callgraph data will be collected for this
sample period. An additional problem is that a fault is guaranteed
to occur, and though it will be silently covered up, it wastes time
and could perturb state.
In perf_callchain_user(), we ensure that %asi contains ASI_AIUS
because we know for a fact that the subsequent calls to
copy_from_user() are intended to read the user's stack.
[ Use get_fs()/set_fs() -DaveM ]
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 28a1f53 delays setting %pil to avoid potential
hardirq stack overflow in the common rtrap_irq path.
Setting %pil also needs to be delayed in the rtrap_nmi
path for the same reason.
Signed-off-by: Rob Gardner <rob.gardner@oracle.com>
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ADI (Application Data Integrity) capability to cpu capabilities list.
ADI capability allows virtual addresses to be encoded with a tag in
bits 63-60. This tag serves as an access control key for the regions
of virtual address with ADI enabled and a key set on them. Hypervisor
encodes this capability as "adp" in "hwcap-list" property in machine
description.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After hooking up system call, userfaultfd selftest was successful for
both 32 and 64 bit version of test.
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.
Link: http://lkml.kernel.org/r/1449367378-29430-3-git-send-email-huawei.libin@huawei.com
Cc: linux-metag@vger.kernel.org
Acked-by: James Hogan <james.hogan@imgtec.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.
Link: http://lkml.kernel.org/r/1449367378-29430-5-git-send-email-huawei.libin@huawei.com
Cc: linux-sh@vger.kernel.org
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
There is no need to worry about module and __init text disappearing
case, because that ftrace has a module notifier that is called when
a module is being unloaded and before the text goes away and this
code grabs the ftrace_lock mutex and removes the module functions
from the ftrace list, such that it will no longer do any
modifications to that module's text, the update to make functions
be traced or not is done under the ftrace_lock mutex as well.
And by now, __init section codes should not been modified
by ftrace, because it is black listed in recordmcount.c and
ignored by ftrace.
Link: http://lkml.kernel.org/r/1449214067-12177-2-git-send-email-huawei.libin@huawei.com
Cc: linux-ia64@vger.kernel.org
Acked-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This reverts commit 677a73a9aa. This patch was not meant to be merged and
has issues. Revert it.
Requested-by: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
- A series of fixes to the MTRR emulation, tested in the BZ by several users
so they should be safe this late
- A fix for a division by zero
- Two very simple ARM and PPC fixes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJWeV/6AAoJEL/70l94x66DqzsH/05YnLi2GsX5WeZHMfIUgzgT
S/GoIkA7A4E2eXVoGg824MWppSViUzZkWgYFQTG4+KY9WPXzm9z2ij7DIlUHCD6n
QfevgQx1kIu1obyhm6bYM2xUdM3f7NCsQgw9bXZObB0ay+b/+GjR9/RbCbx60EO5
K1P+kveK6PFlS9/hc0PLztu6WkPV9BCO1RJUbeAEdnrMbpuQfHC+coR7MHRCiv2V
iy8f1CqrGaO5YPm9/3GbdH1xMKew4OZShOxTXwtvUThdrLkks2c8sk6FoLzqkznH
LMHVIpkm4mrIgThZG7VqZMXOrWvBtsCt04Vr9MzCM6QetB02b/Uz0xKvMYx2kZQ=
=pmYz
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- A series of fixes to the MTRR emulation, tested in the BZ by several
users so they should be safe this late
- A fix for a division by zero
- Two very simple ARM and PPC fixes
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Reload pit counters for all channels when restoring state
KVM: MTRR: treat memory as writeback if MTRR is disabled in guest CPUID
KVM: MTRR: observe maxphyaddr from guest CPUID, not host
KVM: MTRR: fix fixed MTRR segment look up
KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
KVM: arm/arm64: vgic: Fix kvm_vgic_map_is_active's dist check
kvm: x86: move tracepoints outside extended quiescent state
KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR
Pull s390 fixes from Martin Schwidefsky:
"Two late bug fixes for kernel 4.4.
Merry Christmas"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/dis: Fix handling of format specifiers
s390/zcrypt: Fix AP queue handling if queue is full
Enabling CPUFreq support for Tegra124 Chromebooks is causing the Tegra124
to hang when resuming from suspend.
When CPUFreq is enabled, the CPU clock is changed from the PLLX clock to
the DFLL clock during kernel boot. When resuming from suspend the CPU
clock is temporarily changed back to the PLLX clock before switching back
to the DFLL. If the DFLL is operating at a much lower frequency than the
PLLX when we enter suspend, and so the CPU voltage rail is at a voltage
too low for the CPUs to operate at the PLLX frequency, then the device
will hang.
Please note that the PLLX is used in the resume sequence to switch the CPU
clock from the very slow 32K clock to a faster clock during early resume
to speed up the resume sequence before the DFLL is resumed.
Ideally, we should fix this by setting the suspend frequency so that it
matches the PLLX frequency, however, that would be a bigger change. For
now simply disable CPUFreq support for Tegra124 Chromebooks to avoid the
hang when resuming from suspend.
Fixes: 9a0baee960 ("ARM: tegra: Enable CPUFreq support for Tegra124
Chromebooks")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Fix a pointer cast typo introduced in v4.4-rc5 especially visible for
the i386 subarchitecture where it results in a kernel crash.
[ Also removed pointless cast as per Al Viro - Linus ]
Fixes: 8090bfd2bb ("um: Fix fpstate handling")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Cc: Jeff Dike <jdike@addtoit.com>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CPU_IDLE and ARM TWD timer. This is probably a common configuration setup
for people making products with these SoCs so let's make sure it works.
Also a wakeirq fix for duovero parlor making my life a bit easier as that
allows me to run basic PM regression tests on it.
It would be nice to have these in v4.4, but if it gets too late for that
because of the holidays, it is not super critical if these get merged for
v4.5.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWeGJhAAoJEBvUPslcq6VzU4wQAJqyWnzl5+frGf8TeTJTwAD2
yDXbiTDYAZAZ4ajFY5aocrAvQfd/z9zU4ar6liylfAMzOM3S4HRDTaUYc2jbr42s
+uNNRY3EGMRzbLuRfyZZRgui0aImBpIDXhBhK1njQoYWt9wWIVxNCv2I9pImWm+n
kyZBDK7xIunQTndgMIMTk8MPVQ5tbPrFbkOWUAZT2bIOmhhhjf/fdMpVL47tbssc
5XuS5YmyglT3spdaYzrAfZbxf19+CMVmTh2KD9FZOCv4OZPXvPOJ3JkqrJeARvPI
Idt6DjitznShJ7wQuIuRH+uS7N2fEDviZR3Of1kZBawBxt7ig58O7TizZLRUtTk+
Uj2QuXfccw8qu122chdpCI1eV5oOok157V3Yhsm7Oyi12H/eu5WMee2boJh6cz9R
KIih+5/FX6yUCgfHow7XR6vpQ2A50BJoU55+EB3yb81hsy/t4eHxbFaJfUcs7DNg
yLjtAxL1tO96VNFF+Kdgps69AOaHbo7/rNYgRvRZ4uxdSkE+YJhnHfcu6m59lFSR
k7Ybw+CCvZ+jH3IE+aGLumPguL10ntytv16G2UDwCzuvCaq59fUPUOCdxuBf+RNB
3Bd1NCkcsi8xPSypf7YXa3AfNM2XkNprHVZJadM57EBE3bVZ19+1OiBtf2gn+YAE
72gzwGGPixUNr8OjZNX0
=tGjX
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v4.4/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Few fixes for omaps to allow am437x only builds to boot properly with
CPU_IDLE and ARM TWD timer. This is probably a common configuration setup
for people making products with these SoCs so let's make sure it works.
Also a wakeirq fix for duovero parlor making my life a bit easier as that
allows me to run basic PM regression tests on it.
It would be nice to have these in v4.4, but if it gets too late for that
because of the holidays, it is not super critical if these get merged for
v4.5.
* tag 'omap-for-v4.4/fixes-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: Fix UART wakeirq for omap4 duovero parlor
ARM: OMAP2+: AM43xx: select ARM TWD timer
ARM: OMAP2+: am43xx: enable GENERIC_CLOCKEVENTS_BROADCAST
Signed-off-by: Olof Johansson <olof@lixom.net>
- Fix Ethernet PHY mode on i.MX6 Ventana boards, which can result in
a non-functional Ethernet when Marvell phy driver rather than generic
phy driver is selected.
- Fix an assigned-clock configuration bug on imx6qdl-sabreauto board
which was introduced by commit ed339363de ("ARM: dts:
imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously").
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWcnPzAAoJEFBXWFqHsHzOvfAIALZo+lVpA3c89ZuDJn0bKfsI
hsV2wfMW1N5BhLN0AGjNIk/nk/YWhZ57gZyQ8miMdz2VeQRSgcjRaFFfEGRGOGop
OrrKpgmRKmUNxMrvztLlMmb1hGxpHOO7z/CXO3UrzDRIiw9lzxYymt0Re2UHiE1u
+tekMe7xBdimfMVmcM8RX/FOZC81tUjH3W8fyPbbzTTgrAnLPmXcfjtHkXy2VPw0
5ikWCJhFWB9ODnmKHbdA7YsNh52JAe5BieWiLXzLj33XJNDBbiFBbBltbo+BJZnp
ZaIek/fGHbq7ouCSLdRq7SEQn6ZbjpHhlic/okqNX0ux29QpY2GKTbiq8fLmcLU=
=wMKP
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
The i.MX fixes for 4.4, 3rd round:
- Fix Ethernet PHY mode on i.MX6 Ventana boards, which can result in
a non-functional Ethernet when Marvell phy driver rather than generic
phy driver is selected.
- Fix an assigned-clock configuration bug on imx6qdl-sabreauto board
which was introduced by commit ed339363de ("ARM: dts:
imx6qdl-sabreauto: Allow HDMI and LVDS to work simultaneously").
* tag 'imx-fixes-4.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx6: Fix Ethernet PHY mode on Ventana boards
ARM: dts: imx: Fix the assigned-clock mismatch issue on imx6q/dl
Cortex-A72 has a PMUv3 implementation that is compatible with the PMU
implemented by Cortex-A57.
This patch hooks up the new compatible string so that the Cortex-A57
event mappings are used.
Signed-off-by: Will Deacon <will.deacon@arm.com>
It's all very well providing an events directory to userspace that
details our events in terms of "event=0xNN", but if we don't define how
to encode the "event" field in the perf attr.config, then it's a waste
of time.
This patch adds a single format entry to describe that the event field
occupies the bottom 10 bits of our config field on ARMv8 (PMUv3).
Signed-off-by: Will Deacon <will.deacon@arm.com>
It's all very well providing an events directory to userspace that
details our events in terms of "event=0xNN", but if we don't define how
to encode the "event" field in the perf attr.config, then it's a waste
of time.
This patch adds a single format entry to describe that the event field
occupies the bottom 8 bits of our config field on ARMv7.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently if userspace restores the pit counters with a count of 0
on channels 1 or 2 and the guest attempts to read the count on those
channels, then KVM will perform a mod of 0 and crash. This will ensure
that 0 values are converted to 65536 as per the spec.
This is CVE-2015-7513.
Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Virtual machines can be run with CPUID such that there are no MTRRs.
In that case, the firmware will never enable MTRRs and it is obviously
undesirable to run the guest entirely with UC memory. Check out guest
CPUID, and use WB memory if MTRR do not exist.
Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Conversion of MTRRs to ranges used the maxphyaddr from the boot CPU.
This is wrong, because var_mtrr_range's mask variable then is discontiguous
(like FF00FFFF000, where the first run of 0s corresponds to the bits
between host and guest maxphyaddr). Instead always set up the masks
to be full 64-bit values---we know that the reserved bits at the top
are zero, and we can restore them when reading the MSR. This way
var_mtrr_range gets a mask that just works.
Fixes: a13842dc66
Cc: qemu-stable@nongnu.org
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes the slow-down of VM running with pci-passthrough, since some MTRR
range changed from MTRR_TYPE_WRBACK to MTRR_TYPE_UNCACHABLE. Memory in the
0K-640K range was incorrectly treated as uncacheable.
Fixes: f7bfb57b3e
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107561
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexis Dambricourt <alexis.dambricourt@gmail.com>
[Use correct BZ for "Fixes" annotation. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
c861519fcf ("MIPS: Fix delay loops which may
be removed by GCC.") which made it upstream was an outdated version of the
patch and is lacking some the removal of two variables that became unused
thus resulting in further warnings and build breakage. The commit
from ae878615d7cee5d7346946cf1ae1b60e427013c2 was correct however.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
According to commit 2503a5ecd8
"ARM: 6201/1: RealView: Do not use outer_sync() on ARM11MPCore
boards with L220" Some PB11MPCore RealView core tiles have broken
outer_sync.
We got rid of the custom barriers from the machine by disabling
outer sync, but that was just for the boardfile case. We have
to be able to do the same in the device tree case.
Since __l2c_init() is cloning and copying the L2C vtable,
we pass an argument to this function to optionally numb
the outer sync operation if desired, before initializing
the cache.
After this we can set up the cache correctly on the RealView
PB11MPCore. This was tested on a PB11MPCore known to have the
issue. Before this, spurious crashes would occur if we try to
set up the cache properly, after this it boots rock solid.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Having IPI_CPU_BACKTRACE as SGI15 may not work if the kernel is
running in non-secure mode and that the secure firmware has
decided to follow ARM's recommendations that SGI8-15 should
be reserved for secure purpose.
Now that we are "only" using SGI0-6, change IPI_CPU_BACKTRACE
to use SGI7, which makes it more likely to work.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since 9a46ad6d6d ("smp: make smp_call_function_many() use logic
similar to smp_call_function_single()"), the core IPI handling
has been simplified, and generic_smp_call_function_interrupt is
now the same as generic_smp_call_function_single_interrupt.
This means that one of IPI_CALL_FUNC and IPI_CALL_FUNC_SINGLE has
become redundant. We can then safely drop IPI_CALL_FUNC_SINGLE,
and use only IPI_CALL_FUNC.
This has the advantage of reducing the number of SGI IDs we're using
(a fairly scarse resource).
Tested on a dual A7 board.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The suspend() hook in the cpuidle_ops struct is always called on
the cpu entering idle, which means that the cpu parameter passed
to the suspend hook always corresponds to the local cpu, making
it somewhat redundant.
This patch removes the logical cpu parameter from the ARM
cpuidle_ops.suspend hook and updates all the existing kernel
implementations to reflect this change.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lina Iyer <lina.iyer@linaro.org>
Tested-by: Lina Iyer <lina.iyer@linaro.org>
Tested-by: Jisheng Zhang <jszhang@marvell.com> [psci]
Cc: Lina Iyer <lina.iyer@linaro.org>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit ebb5e78cc6 ("MIPS: Initial implementation of a VDSO") introduced a
build error.
For MIPS VDSO to be compiled it requires binutils version 2.25 or above but
the check in the Makefile had inverted logic causing it to be compiled in if
binutils is below 2.25.
This fixes the following compilation error:
CC arch/mips/vdso/gettimeofday.o
/tmp/ccsExcUd.s: Assembler messages:
/tmp/ccsExcUd.s:62: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
/tmp/ccsExcUd.s:467: Error: can't resolve `_start' {*UND* section} - `L0' {.text section}
make[2]: *** [arch/mips/vdso/gettimeofday.o] Error 1
make[1]: *** [arch/mips/vdso] Error 2
make: *** [arch/mips] Error 2
[ralf@linux-mips: Fixed Sergei's complaint on the formatting of the
cited commit and generally reformatted the log message.]
Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Cc: alex@alex-smith.me.uk
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/11745/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 977e043d5e ("MIPS: kernel: cps-vec: Replace mips32r2 ISA level
with mips64r2") leads to .set mips64r2 directives being present in 32
bit (ie. CONFIG_32BIT=y) kernels. This is incorrect & leads to MIPS64
instructions being emitted by the assembler when expanding
pseudo-instructions. For example the "move" instruction can legitimately
be expanded to a "daddu". This causes problems when the kernel is run on
a MIPS32 CPU, as CONFIG_32BIT kernels of course often are...
Fix this by dropping the .set <ISA> directives entirely now that Kconfig
should be ensuring that kernels including this code are built with a
suitable -march= compiler flag.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: <stable@vger.kernel.org> # 3.16+
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/10869/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
__clear_user() (and clear_user() which uses it), always access the user
mode address space, which results in EVA store instructions when EVA is
enabled even if the current user address limit is KERNEL_DS.
Fix this by adding a new symbol __bzero_kernel for the normal kernel
address space bzero in EVA mode, and call that from __clear_user() if
eva_kernel_access().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10844/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When EVA is in use, __copy_from_user() was unconditionally using the EVA
instructions to read the user address space, however this can also be
used for kernel access. If the address isn't a valid user address it
will cause an address error or TLB exception, and if it is then user
memory may be read instead of kernel memory.
For example in the following stack trace from Linux v3.10 (changes since
then will prevent this particular one still happening) kernel_sendmsg()
set the user address limit to KERNEL_DS, and tcp_sendmsg() goes on to
use __copy_from_user() with a kernel address in KSeg0.
[<8002d434>] __copy_fromuser_common+0x10c/0x254
[<805710e0>] tcp_sendmsg+0x5f4/0xf00
[<804e8e3c>] sock_sendmsg+0x78/0xa0
[<804e8f28>] kernel_sendmsg+0x24/0x38
[<804ee0f8>] sock_no_sendpage+0x70/0x7c
[<8017c820>] pipe_to_sendpage+0x80/0x98
[<8017c6b0>] splice_from_pipe_feed+0xa8/0x198
[<8017cc54>] __splice_from_pipe+0x4c/0x8c
[<8017e844>] splice_from_pipe+0x58/0x78
[<8017e884>] generic_splice_sendpage+0x20/0x2c
[<8017d690>] do_splice_from+0xb4/0x110
[<8017d710>] direct_splice_actor+0x24/0x30
[<8017d394>] splice_direct_to_actor+0xd8/0x208
[<8017d51c>] do_splice_direct+0x58/0x7c
[<8014eaf4>] do_sendfile+0x1dc/0x39c
[<8014f82c>] SyS_sendfile+0x90/0xf8
Add the eva_kernel_access() check in __copy_from_user() like the one in
copy_from_user().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10843/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The strlen_user() function calls __strlen_kernel_asm in both branches of
the eva_kernel_access() conditional. For EVA it should be calling
__strlen_user_eva for user accesses, otherwise it will load from the
kernel address space instead of the user address space, and the access
checking will likely be ineffective at preventing it due to EVA's
overlapping user and kernel address spaces.
This was found after extending the test_user_copy module to cover user
string access functions, which gave the following error with EVA:
test_user_copy: illegal strlen_user passed
Fortunately the use of strlen_user() has been all but eradicated from
the mainline kernel, so only out of tree modules could be affected.
Fixes: e3a9b07a9c ("MIPS: asm: uaccess: Add EVA support for str*_user operations")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 3.15.x-
Patchwork: https://patchwork.linux-mips.org/patch/10842/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We want the USB and PHY fixes in here as well to make things easier for
testing and development.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit ac7b406c1a ("arm64: Use pr_* instead of printk") was a fairly
mindless s/printk/pr_*/ change driven by a complaint from checkpatch.
As is usual with such changes, this has led to some odd behaviour on
arm64:
* syslog now picks up the "pr_emerg" line from dump_backtrace, but not
the actual trace, which leads to a bunch of "kernel:Call trace:"
lines in the log
* __{pte,pmd,pgd}_error print at KERN_CRIT, as opposed to KERN_ERR
which is used by other architectures.
This patch restores the original printk behaviour for dump_backtrace
and downgrade the pgtable error macros to KERN_ERR.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function return. This will result in many useless entries
(return_to_handler) showing up in
a) a stack tracer's output
b) perf call graph (with perf record -g)
c) dump_backtrace (at panic et al.)
For example, in case of a),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo 1 > /proc/sys/kernel/stack_trace_enabled
$ cat /sys/kernel/debug/tracing/stack_trace
Depth Size Location (54 entries)
----- ---- --------
0) 4504 16 gic_raise_softirq+0x28/0x150
1) 4488 80 smp_cross_call+0x38/0xb8
2) 4408 48 return_to_handler+0x0/0x40
3) 4360 32 return_to_handler+0x0/0x40
...
In case of b),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ perf record -e mem:XXX:x -ag -- sleep 10
$ perf report
...
| | |--0.22%-- 0x550f8
| | | 0x10888
| | | el0_svc_naked
| | | sys_openat
| | | return_to_handler
| | | return_to_handler
...
In case of c),
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
$ echo c > /proc/sysrq-trigger
...
Call trace:
[<ffffffc00044d3ac>] sysrq_handle_crash+0x24/0x30
[<ffffffc000092250>] return_to_handler+0x0/0x40
[<ffffffc000092250>] return_to_handler+0x0/0x40
...
This patch replaces such entries with real addresses preserved in
current->ret_stack[] at unwind_frame(). This way, we can cover all
the cases.
Reviewed-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[will: fixed minor context changes conflicting with irq stack bits]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Function graph tracer modifies a return address (LR) in a stack frame
to hook a function's return. This will result in many useless entries
(return_to_handler) showing up in a call stack list.
We will fix this problem in a later patch ("arm64: ftrace: fix a stack
tracer's output under function graph tracer"). But since real return
addresses are saved in ret_stack[] array in struct task_struct,
unwind functions need to be notified of, in addition to a stack pointer
address, which task is being traced in order to find out real return
addresses.
This patch extends unwind functions' interfaces by adding an extra
argument of a pointer to task_struct.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Function graph tracer modifies a return address (LR) in a stack frame by
calling ftrace_prepare_return() in a traced function's function prologue.
The current code does this modification before preserving an original
address at ftrace_push_return_trace() and there is always a small window
of inconsistency when an interrupt occurs.
This doesn't matter, as far as an interrupt stack is introduced, because
stack tracer won't be invoked in an interrupt context. But it would be
better to proactively minimize such a window by moving the LR modification
after ftrace_push_return_trace().
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
irq_stack implementation wrongly assumed this would only ever happen
via the softirq path, allowing it to update irq_count late, in
do_softirq_own_stack().
This means if an irq occurs in sysrq_handle_reboot(), during
emergency_restart() the stack will be corrupted, as irq_count wasn't
updated.
Lose the optimisation, and instead of moving the adding/subtracting of
irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
sp_el0 (struct thread_info) with sp & ~(THREAD_SIZE - 1). This tells us
if we are on a task stack, if so, we can safely switch to the irq stack.
Finally, remove do_softirq_own_stack(), we don't need it anymore.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: use get_thread_info macro]
Signed-off-by: Will Deacon <will.deacon@arm.com>
The arm64 MMU supports a Contiguous bit which is a hint that the TTE
is one of a set of contiguous entries which can be cached in a single
TLB entry. Supporting this bit adds new intermediate huge page sizes.
The set of huge page sizes available depends on the base page size.
Without using contiguous pages the huge page sizes are as follows.
4KB: 2MB 1GB
64KB: 512MB
With a 4KB granule, the contiguous bit groups together sets of 16 pages
and with a 64KB granule it groups sets of 32 pages. This enables two new
huge page sizes in each case, so that the full set of available sizes
is as follows.
4KB: 64KB 2MB 32MB 1GB
64KB: 2MB 512MB 16GB
If a 16KB granule is used then the contiguous bit groups 128 pages
at the PTE level and 32 pages at the PMD level.
If the base page size is set to 64KB then 2MB pages are enabled by
default. It is possible in the future to make 2MB the default huge
page size for both 4KB and 64KB granules.
Reviewed-by: Chris Metcalf <cmetcalf@ezchip.com>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: David Woods <dwoods@ezchip.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
It turns out that some Android versions hardcode the SYSENTER
calling convention. This is buggy and will cause problems no
matter what the kernel does. Nonetheless, we should try to
support it.
Credit goes to Linus for pointing out a clean way to handle
the SYSENTER/SYSCALL clobber differences while preserving
straightforward DWARF annotations.
I believe that the original offending Android commit was:
https://android.googlesource.com/platform%2Fbionic/+/7dc3684d7a2587e43e6d2a8e0e3f39bf759bd535
Reported-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-and-tested-by: Borislav Petkov <bp@alien8.de>
Cc: <mark.gross@intel.com>
Cc: Su Tao <tao.su@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: <frank.wang@intel.com>
Cc: <borun.fu@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Mingwei Shi <mingwei.shi@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The pmuserenr_el0 register value is architecturally UNKNOWN on reset.
Current kernel code resets that register value iff the core pmu device is
correctly probed in the kernel. On platforms with missing DT pmu nodes (or
disabled perf events in the kernel), the pmu is not probed, therefore the
pmuserenr_el0 register is not reset in the kernel, which means that its
value retains the reset value that is architecturally UNKNOWN (system
may run with eg pmuserenr_el0 == 0x1, which means that PMU counters access
is available at EL0, which must be disallowed).
This patch adds code that resets pmuserenr_el0 on cold boot and restores
it on core resume from shutdown, so that the pmuserenr_el0 setup is
always enforced in the kernel.
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Try XENPF_settime64 first, if it is not available fall back to
XENPF_settime32.
No need to call __current_kernel_time() when all the info needed are
already passed via the struct timekeeper * argument.
Return NOTIFY_BAD in case of errors.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
If Linux is running as dom0, call XENPF_settime64 to update the system
time in Xen on pvclock_gtod notifications.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Read the wallclock from the shared info page at boot time.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
The hypervisor actually exposes an additional field to struct
pvclock_wall_clock, with the high 32 bit seconds.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Rename the current XENPF_settime hypercall and related struct to
XENPF_settime32.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
The dom0_op hypercall has been renamed to platform_op since Xen 3.2,
which is ancient, and modern upstream Linux kernels cannot run as dom0
and it anymore anyway.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Register the runstate_memory_area with the hypervisor.
Use pv_time_ops.steal_clock to account for stolen ticks.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM64.
Necessary duplication of paravirt.h and paravirt.c with ARM.
The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching needed.
This allows us to make use of steal_account_process_tick for stolen
ticks accounting.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Introduce CONFIG_PARAVIRT and PARAVIRT_TIME_ACCOUNTING on ARM.
The only paravirt interface supported is pv_time_ops.steal_clock, so no
runtime pvops patching needed.
This allows us to make use of steal_account_process_tick for stolen
ticks accounting.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Christopher Covington <cov@codeaurora.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Russell King <linux@arm.linux.org.uk>
On parisc syscalls which are interrupted by signals sometimes failed to
restart and instead returned -ENOSYS which in the worst case lead to
userspace crashes.
A similiar problem existed on MIPS and was fixed by commit e967ef02
("MIPS: Fix restart of indirect syscalls").
On parisc the current syscall restart code assumes that all syscall
callers load the syscall number in the delay slot of the ble
instruction. That's how it is e.g. done in the unistd.h header file:
ble 0x100(%sr2, %r0)
ldi #syscall_nr, %r20
Because of that assumption the current code never restored %r20 before
returning to userspace.
This assumption is at least not true for code which uses the glibc
syscall() function, which instead uses this syntax:
ble 0x100(%sr2, %r0)
copy regX, %r20
where regX depend on how the compiler optimizes the code and register
usage.
This patch fixes this problem by adding code to analyze how the syscall
number is loaded in the delay branch and - if needed - copy the syscall
number to regX prior returning to userspace for the syscall restart.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Blingly ignoring CIE.version != 1 was a bad idea.
It still leaves "desirability" when running perf with callgraphing where libgcc
symbols might show in hotspot.
More importantly, basic CIE.version == 3 support already exists in code:
|
| retAddrReg = state.version <= 1 ? *ptr++ : get_uleb128(&ptr, end);
|
Next commit with simply add continue-not-bail for CIE.version != 1
This reverts commit 323f41f9e7.
At -Os, ARC gcc generates millicode thunk for function prologue/epilogue,
which are served by libgcc.
Modules historically are NOT linked with libgcc to avoid code bloat, reducing
runtime relocation fixups etc. I even once tried doing that but got lost
in makefile intricacies.
This means modules at -Os don't get the millicode thunks, causing build
failures below:
| MODPOST 5 modules
| ERROR: "__ld_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r18_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r18" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r17_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r17" [crypto/sha256_generic.ko] undefined!
| ERROR: "__ld_r13_to_r16_ret" [crypto/sha256_generic.ko] undefined!
| ERROR: "__st_r13_to_r16" [crypto/sha256_generic.ko] undefined!
|....
|....
Workaround that by inhibiting millicode thunks for loadable modules
Fixes STAR 9000641864:
("Linux built with optimizations for size emits errors for modules")
Reported-by: Anton Kolesov <akolesov@synosys.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC700 cores with MMU v2 don't have IC_PTAG AUX register and so we only
define ARC_REG_IC_PTAG for MMU versions >= 3.
But current implementation of cache_line_loop_vX() routines assumes
availability of all of them (v2, v3 and v4) simultaneously.
And given undefined ARC_REG_IC_PTAG if CONFIG_MMU_VER=2 we're seeing
compilation problem:
---------------------------------->8-------------------------------
CC arch/arc/mm/cache.o
arch/arc/mm/cache.c: In function '__cache_line_loop_v3':
arch/arc/mm/cache.c:270:13: error: 'ARC_REG_IC_PTAG' undeclared (first use in this function)
aux_tag = ARC_REG_IC_PTAG;
^
arch/arc/mm/cache.c:270:13: note: each undeclared identifier is reported only once for each function it appears in
scripts/Makefile.build:258: recipe for target 'arch/arc/mm/cache.o' failed
---------------------------------->8-------------------------------
The simples fix is to have ARC_REG_IC_PTAG defined regardless MMU
version being used.
We don't use it in cache_line_loop_v2() anyways so who cares.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
| WARNING: vmlinux.o(.text+0xd6c2): Section mismatch in reference from the function alloc_kmap_pgtable() to the function
| .init.text:__alloc_bootmem_low()
The function alloc_kmap_pgtable() references the function __init __alloc_bootmem_low().
This is often because alloc_kmap_pgtable lacks a __init annotation or the annotation of __alloc_bootmem_low is wrong.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>