Install the Hyper-V specific interrupt handler only when needed. This would
permit us to get rid of the Xen check. Note that when the vmbus drivers invokes
the call to register its handler, we are sure to be running on Hyper-V.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Link: http://lkml.kernel.org/r/1366299886-6399-1-git-send-email-kys@microsoft.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
fixed the following compile error when use avr32 atstk1006_defconfig:
drivers/mtd/nand/atmel_nand.c: In function 'pmecc_err_location':
drivers/mtd/nand/atmel_nand.c:639: error: implicit declaration of function 'writel_relaxed'
which was introduced by commit 1c7b874d33 ("mtd: at91: atmel_nand: add
Programmable Multibit ECC controller support"). The PMECC for nand
flash code uses writel_relaxed(). But in avr32, there is no macro
"writel_relaxed" defined.
This patch add writex_relaxed macro definitions.
Signed-off-by: Josh Wu <josh.wu@atmel.com>
Acked-by: Havard Skinnemoen <havard@skinnemoen.net>
Acked-by: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: David Woodhouse <David.Woodhouse@intel.com>
Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the very unlikely event where a guest would be foolish enough to
*read* from a write-only cache maintainance register, we end up
with preemption disabled, due to a misplaced get_cpu().
Just move the "is_write" test outside of the critical section.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Per hpa, use crashkernel=X,high crashkernel=Y,low instead of
crashkernel_hign=X crashkernel_low=Y. As that could be extensible.
-v2: according to Vivek, change delimiter to ;
-v3: let hign and low only handle simple form and it conforms to
description in kernel-parameters.txt
still keep crashkernel=X override any crashkernel=X,high
crashkernel=Y,low
-v4: update get_last_crashkernel returning and add more strict
checking in parse_crashkernel_simple() found by HATAYAMA.
-v5: Change delimiter back to , according to HPA.
also separate parse_suffix from parse_simper according to vivek.
so we can avoid @pos in that path.
-v6: Tight the checking about crashkernel=X,highblahblah,high
found by HTYAYAMA.
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-5-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Vivek found old kexec-tools does not work new kernel anymore.
So change back crashkernel= back to old behavoir, and add crashkernel_high=
to let user decide if buffer could be above 4G, and also new kexec-tools will
be needed.
-v2: let crashkernel=X override crashkernel_high=
update description about _high will be ignored by crashkernel=X
-v3: update description about kernel-parameters.txt according to Vivek.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-4-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Chao said that kdump does does work well on his system on 3.8
without extra parameter, even iommu does not work with kdump.
And now have to append crashkernel_low=Y in first kernel to make
kdump work.
We have now modified crashkernel=X to allocate memory beyong 4G (if
available) and do not allocate low range for crashkernel if the user
does not specify that with crashkernel_low=Y. This causes regression
if iommu is not enabled. Without iommu, swiotlb needs to be setup in
first 4G and there is no low memory available to second kernel.
Set crashkernel_low automatically if the user does not specify that.
For system that does support IOMMU with kdump properly, user could
specify crashkernel_low=0 to save that 72M low ram.
-v3: add swiotlb_size() according to Konrad.
-v4: add comments what 8M is for according to hpa.
also update more crashkernel_low= in kernel-parameters.txt
-v5: update changelog according to Vivek.
-v6: Change description about swiotlb referring according to HATAYAMA.
Reported-by: WANG Chao <chaowang@redhat.com>
Tested-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-2-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Events may be created with attr->disabled == 1 and attr->enable_on_exec
== 1, which confuses the group validation code because events with the
PERF_EVENT_STATE_OFF are not considered candidates for scheduling, which
may lead to failure at group scheduling time.
This patch fixes the validation check for ARM, so that events in the
OFF state are still considered when enable_on_exec is true.
Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Reported-by: Sudeep KarkadaNagesha <Sudeep.KarkadaNagesha@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We must not declare dbg_cpu_pm_nb as __cpuinitdata as we need it after
system initialization for Suspend and CPUIdle.
This was done in commit 9a6eb310ea ("ARM: hw_breakpoint: Debug powerdown
support for self-hosted debug").
Cc: stable@vger.kernel.org
Cc: Dietmar Eggemann <Dietmar.Eggemann@arm.com>
Signed-off-by: Bastian Hecht <hechtb+renesas@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Feroceon the L2 cache becomes non-coherent with the CPU
when the L1 caches are disabled. Thus the L2 needs to be invalidated
after both L1 caches are disabled.
On kexec before the starting the code for relocation the kernel,
the L1 caches are disabled in cpu_froc_fin (cpu_v7_proc_fin for Feroceon),
but after L2 cache is never invalidated, because inv_all is not set
in cache-feroceon-l2.c.
So kernel relocation and decompression may has (and usually has) errors.
Setting the function enables L2 invalidation and fixes the issue.
Cc: <stable@vger.kernel.org>
Signed-off-by: Illia Ragozin <illia.ragozin@grapecom.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
tcm_init() call iotable_init() and it use early_alloc variants which
do memblock allocation. Directly using memblock allocation after
initializing bootmem should not permitted, because bootmem can't know
where are additinally reserved.
So move tcm_init() to a safe place before initalizing bootmem.
(On the U300)
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit b4cbb197c7 ("vm: add vm_iomap_memory() helper function") added
a helper function wrapper around io_remap_pfn_range(), and every other
architecture defined it in <asm/pgtable.h>.
The s390 choice of <asm/io.h> may make sense, but is not very convenient
for this case, and gratuitous differences like that cause unexpected errors like this:
mm/memory.c: In function 'vm_iomap_memory':
mm/memory.c:2439:2: error: implicit declaration of function 'io_remap_pfn_range' [-Werror=implicit-function-declaration]
Glory be the kbuild test robot who noticed this, bisected it, and
reported it to the guilty parties (ie me).
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using this parameter one can disable the storage_size/2 check if
he is really sure that the UEFI does sane gc and fulfills the spec.
This parameter is useful if a devices uses more than 50% of the
storage by default.
The Intel DQSW67 desktop board is such a sucker for exmaple.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
pci_disable_device is called by a driver after it stops using the pci
function - e.g. during the removal of the driver. The current
implementation removes the architecture specific information of this
function such that even after a call to pci_enable_device the pci
function is no longer usable. Just remove pcibios_disable_device.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Disable pci on s390. Enable with pci=on.
Suggested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Access to pci config space via pci_ops should not fail silently.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a pci load instruction fails the content of the register where the
data is stored is possibly unchanged. Fix the inline assembly wrapper
__pcilg to not return stale data. Additionally fix the callers of this
function who access uninitialized variables.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't let pci_load and friends crash the kernel when called with
e.g. an invalid offset. Return -ENXIO instead.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use distinct (and hopefully sane) names for the pci instruction
wrappers.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Uninline pci related instruction wrappers to de-bloat the code:
add/remove: 15/0 grow/shrink: 2/24 up/down: 1326/-12628 (-11302)
This is especially useful for the inlined pci read and write functions
which are used all over the kernel. Also remove the unused __stpcifc
while at it.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use pcibios_add_device to do arch specific device initialization.
This function will be called during pci_bus_add_device.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Don't modify function handles to get a disabled handle - call
clp_disable_fh. With this change we also do no longer deconfigure
enabled functions.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use the debugfs to keep track of a pci function's status changes.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The hash used for mapping irq numbers to msi descriptors does not
utilize all buckets that were allocated. Fix this by using the same
value (computed by the number of bits used for the hash function) at
relevant places.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds the last breaking event address as parameter
for 31 bit compat program signal handlers as it is already
done for 64 bit programs.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
force_console is used to wake up the CCW based console device to
print a panic message in case something goes wrong in a suspend
or resume cycle. Stop using the static console_subchannel and add
a parameter to this function to specify which ccw device we have
to wake up.
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
wait_cons_dev is used to busy wait for an interrupt on the console
ccw device. Stop using the static console_subchannel and add a
parameter to this function to specify on which ccw device/subchannel
we have to do the polling.
While at it rename the function to ccw_device_wait_idle and
move it to device.c
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Since commit 5f954c34 ([S390] hibernation: fix lowcore handling)
the absolute zero lowcore is lost during suspend/resume.
For example, this leads to the problem that the re-IPL device
for kdump is no longer set after resume.
With this patch during suspend a buffer is allocated in the new PM
notifier "suspend_pm_cb" and then the absolute zero lowcore is saved
to that buffer. The resume code then copies back this buffer to
absolute zero and afterwards the PM notifier releases the memory.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove unused __BITOPS_ALIGN, and replace __BITOPS_WORDSIZE with
BITS_PER_LONG.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use sske with multiple block control to initialize storage keys within
a 1 MB frame at once.
It turned out that the sske with mb=1 is an order of magnitude faster
than pfmf. This is only an issue for very large systems (several 100GB)
where storage key initialization could last more than a minute.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
dumpstack() did not always print a sane callchain when being called.
The reason is that show_trace() accessed register 15 directly to get
the current stack pointer and passed that pointer to __show_trace()
which expects a valid stack frame pointer as argument.
However due to tail call optimization the stack frame may not exist
anymore when __show_trace() gets called and therefore an invalid
stack frame pointer gets passed.
To prevent that disable tail call optimization for call chain walking
functions.
So move all the show_* functions to a dumpstack.c file like other
architectures have it already and add a -fno-optimize-sibling-calls
compile flag to both dumpstack.c and stacktrace.c to prevent tail
call optimization.
Fixes callchains that looked e.g. like this:
[ 12.868258] Call Trace:
[ 12.868262] ([<0000000000008000>] 0x8000)
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Used PTR_RET function instead of IS_ERR and PTR_ERR.
Patch found using coccinelle.
Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Rewrote conditional statement and eliminated the out_kthread label.
Signed-off-by: Alexandru Gheorghiu <gheorghiuandru@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pass buffer length in extra parameter.
Signed-off-by: Stefan Raspl <raspl@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Using kmem_cache_zalloc() instead of kmem_cache_alloc() and memset().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To avoid cache synonyms on System zEC12 32 independent zero pages are
required, one for each combination for bits 2**12 to 2**16 of the virtual
address. To avoid wasting too much memory on small virtual systems the
number of zero pages is limited to 4 if the memory size is less or equal
to 64MB.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Protection exception usually are suppressing and the fault handler
needs to rewind the PSW by the instruction length to get the correct
fault address. Except for protection exceptions while the CPU is in
the middle of a transaction. The CPU stores the transaction abort
PSW at the start of the transaction, if the transaction is aborted
the PSW is already correct and may not be modified by the fault
handler.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull ARM fix from Russell King:
"A build fix for an incomplete change to the ARM cpu suspend code"
* branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Do 15e0d9e37c (ARM: pm: let platforms select cpu_suspend support) properly
Pull kvm fixes from Marcelo Tosatti:
"PPC and ARM KVM fixes"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
ARM: KVM: fix L_PTE_S2_RDWR to actually be Read/Write
ARM: KVM: fix KVM_CAP_ARM_SET_DEVICE_ADDR reporting
kvm/ppc/e500: eliminate tlb_refs
kvm/ppc/e500: g2h_tlb1_map: clear old bit before setting new bit
kvm/ppc/e500: h2g_tlb1_rmap: esel 0 is valid
kvm/powerpc/e500mc: fix tlb invalidation on cpu migration
this merge window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJRbQJxAAoJEIn5HApB1cB6wPAP/Rb/Nm9byySw12YvXz2tdPKp
DFJ3J+yMhbh0Tjx5+K+iVO4lCqNk6EYXtOwCuvF9dfpp7IWtx0QwhoTM9U5VeFDT
y6+wgKY3CaG9X2IQx+8mUybEiSugsWpMtzOMIOWZLM1iDUtCWtRXybbc1bYWQrOq
YOdJICB8EefffghHhABMdtb9g9F/qzati4q+4jd53zJGF5jC9OVDKRCx1TiVyWHF
VD0stZoQ8Tb693gC0rLKkSSmg4G2Nj9tuL28YIRW1Un+xJ3eGY2RJ4yz6Pt49tKL
czvtU5zHJfIDfVAjMTScentTSBzh+5QxwnPwPYE7zirKKF6+SMSktfF36qTMDkXr
nI4x8FMr1iUNOZhK7Y5AAE1sWVtpUWTrR292xGIZeK80lhJDFFuYGd2ToH6ZMwJ/
wDe8Lp2wCbmMExOoI3yAhNRcQia7AyqYBwTfnDCMlmG9OG6f+erCYt8RFBy+hAhG
r+/IGrpQEldx+URWBnpc1SqjPCegM9ldw6K+xx0bg0l/dxgd7jnLSFiaOGfa5MRv
tdKt+cNv7jf9AM85nJ0fnm6iJwdubbyBRHVkuJXDd5z5twuBqJV9XYxrebh3Td9G
tkk5fprBbn7/mD0tr9eBju9eN+/vDDBUE3ENPufIP7j1wOcg2+1UR5BDkEBgJah0
P98LmOINkt+19poWm2P4
=I7AW
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes
Pull powerpc fixes from Stephen Rothwell:
"Three regresions in the PowerPC code. One from v3.7 the others from
this merge window."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes:
powerpc: add a missing label in resume_kernel
powerpc: Fix audit crash due to save/restore PPR changes
powerpc: fix compiling CONFIG_PPC_TRANSACTIONAL_MEM when CONFIG_ALTIVEC=n
Looks like our L_PTE_S2_RDWR definition is slightly wrong,
and is actually write only (see ARM ARM Table B3-9, Stage 2 control
of access permissions). Didn't make a difference for normal pages,
as we OR the flags together, but I'm still wondering how it worked
for Stage-2 mapped devices, such as the GIC.
Brown paper bag time, again.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Commit 3401d54696 (KVM: ARM: Introduce KVM_ARM_SET_DEVICE_ADDR
ioctl) added support for the KVM_CAP_ARM_SET_DEVICE_ADDR capability,
but failed to add a break in the relevant case statement, returning
the number of CPUs instead.
Luckilly enough, the CONFIG_NR_CPUS=0 patch hasn't been merged yet
(https://lkml.org/lkml/diff/2012/3/31/131/1), so the bug wasn't
noticed.
Just give it a break!
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
There is no need to use the PV version of the IRQ_WORKER mechanism
as under PVHVM we are using the native version. The native
version is using the SMP API.
They just sit around unused:
69: 0 0 xen-percpu-ipi irqwork0
83: 0 0 xen-percpu-ipi irqwork1
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
See git commit f10cd522c5
(xen: disable PV spinlocks on HVM) for details.
But we did not disable it everywhere - which means that when
we boot as PVHVM we end up allocating per-CPU irq line for
spinlock. This fixes that.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
The default (uninitialized) value of the IRQ line is -1.
Check if we already have allocated an spinlock interrupt line
and if somebody is trying to do it again. Also set it to -1
when we offline the CPU.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
If the timer interrupt has been de-init or is just now being
initialized, the default value of -1 should be preset as
interrupt line. Check for that and if something is odd
WARN us.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When we online the CPU, we get this splat:
smpboot: Booting Node 0 Processor 1 APIC 0x2
installing Xen timer for CPU 1
BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179
in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1
Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1
Call Trace:
[<ffffffff810c1fea>] __might_sleep+0xda/0x100
[<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff813036eb>] kvasprintf+0x5b/0x90
[<ffffffff81303758>] kasprintf+0x38/0x40
[<ffffffff81044510>] xen_setup_timer+0x30/0xb0
[<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30
[<ffffffff81666d0a>] start_secondary+0x19c/0x1a8
The solution to that is use kasprintf in the CPU hotplug path
that 'online's the CPU. That is, do it in in xen_hvm_cpu_notify,
and remove the call to in xen_hvm_setup_cpu_clockevents.
Unfortunatly the later is not a good idea as the bootup path
does not use xen_hvm_cpu_notify so we would end up never allocating
timer%d interrupt lines when booting. As such add the check for
atomic() to continue.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
While we don't use the spinlock interrupt line (see for details
commit f10cd522c5 -
xen: disable PV spinlocks on HVM) - we should still do the proper
init / deinit sequence. We did not do that correctly and for the
CPU init for PVHVM guest we would allocate an interrupt line - but
failed to deallocate the old interrupt line.
This resulted in leakage of an irq_desc but more importantly this splat
as we online an offlined CPU:
genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1)
Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1
Call Trace:
[<ffffffff811156de>] __setup_irq+0x23e/0x4a0
[<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250
[<ffffffff811161bb>] request_threaded_irq+0xfb/0x160
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160
[<ffffffff81303758>] ? kasprintf+0x38/0x40
[<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20
[<ffffffff810cad35>] ? update_max_interval+0x15/0x40
[<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78
[<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33
[<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70
[<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10
[<ffffffff8109402b>] __cpu_notify+0x1b/0x30
[<ffffffff8166834a>] _cpu_up+0xa0/0x14b
[<ffffffff816684ce>] cpu_up+0xd9/0xec
[<ffffffff8165f754>] store_online+0x94/0xd0
[<ffffffff8141d15b>] dev_attr_store+0x1b/0x20
[<ffffffff81218f44>] sysfs_write_file+0xf4/0x170
[<ffffffff811a2864>] vfs_write+0xb4/0x130
[<ffffffff811a302a>] sys_write+0x5a/0xa0
[<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b
cpu 1 spinlock event irq -16
smpboot: Booting Node 0 Processor 1 APIC 0x2
And if one looks at the /proc/interrupts right after
offlining (CPU1):
70: 0 0 xen-percpu-ipi spinlock0
71: 0 0 xen-percpu-ipi spinlock1
77: 0 0 xen-percpu-ipi spinlock2
There is the oddity of the 'spinlock1' still being present.
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
In the PVHVM path when we do CPU online/offline path we would
leak the timer%d IRQ line everytime we do a offline event. The
online path (xen_hvm_setup_cpu_clockevents via
x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt
line for the timer%d.
But we would still use the old interrupt line leading to:
kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270
.. snip..
<IRQ>
[<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0
[<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0
[<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240
[<ffffffff811175b9>] handle_percpu_irq+0x49/0x70
[<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0
[<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40
[<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80
<EOI>
[<ffffffff81666d01>] ? start_secondary+0x193/0x1a8
[<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8
There is also the oddity (timer1) in the /proc/interrupts after
offlining CPU1:
64: 1121 0 xen-percpu-virq timer0
78: 0 0 xen-percpu-virq timer1
84: 0 2483 xen-percpu-virq timer2
This patch fixes it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org
For quite a few Xen versions, this wasn't the IRQ vector anymore
anyway, and it is not being used by the kernel for anything. Hence
drop the field from struct irq_info, and respective function
parameters.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
During early setup of a dom0 kernel, populate boot_params with the
Enhanced Disk Drive (EDD) and MBR signature data. This makes
information on the BIOS boot device available in /sys/firmware/edd/.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* pci/jiang-subdrivers:
PCI/ACPI: Remove support of ACPI PCI subdrivers
PCI: acpiphp: Protect acpiphp data structures from concurrent updates
PCI: acpiphp: Use normal list to simplify implementation
PCI: acpiphp: Do not use ACPI PCI subdriver mechanism
PCI: acpiphp: Convert acpiphp to be builtin only, not modular
PCI/ACPI: Handle PCI slot devices when creating/destroying PCI buses
x86/PCI: Implement pcibios_{add|remove}_bus() hooks
ia64/PCI: Implement pcibios_{add|remove}_bus() hooks
PCI/ACPI: Prepare stub functions to handle ACPI PCI (hotplug) slots
PCI: Add pcibios hooks for adding and removing PCI buses
PCI: acpiphp: Replace local macros with standard ACPI macros
PCI: acpiphp: Remove all functions even if function 0 doesn't exist
PCI: acpiphp: Use list_for_each_entry_safe() in acpiphp_sanitize_bus()
PCI: Clean up usages of pci_bus->is_added
PCI: When removing bus, always remove legacy files & unregister
Fixes build with CONFIG_EFI_VARS=m which was broken after the commit
"x86, efivars: firmware bug workarounds should be in platform code".
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The commit "efi: Distinguish between "remaining space" and actually used
space" added usage of ucs2_*() functions to arch/x86/platform/efi/efi.c,
but the only thing which selected UCS2_STRING was EFI_VARS, which is
technically optional and can be built as a module.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
The valid mask for both offcore_response_0 and
offcore_response_1 was wrong for SNB/SNB-EP,
IVB/IVB-EP. It was possible to write to
reserved bit and cause a GP fault crashing
the kernel.
This patch fixes the problem by correctly marking the
reserved bits in the valid mask for all the processors
mentioned above.
A distinction between desktop and server parts is introduced
because bits 24-30 are only available on the server parts.
This version of the patch is just a rebase to perf/urgent tree
and should apply to older kernels as well.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: jolsa@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: security@kernel.org
Cc: ak@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
EFI implementations distinguish between space that is actively used by a
variable and space that merely hasn't been garbage collected yet. Space
that hasn't yet been garbage collected isn't available for use and so isn't
counted in the remaining_space field returned by QueryVariableInfo().
Combined with commit 68d9298 this can cause problems. Some implementations
don't garbage collect until the remaining space is smaller than the maximum
variable size, and as a result check_var_size() will always fail once more
than 50% of the variable store has been used even if most of that space is
marked as available for garbage collection. The user is unable to create
new variables, and deleting variables doesn't increase the remaining space.
The problem that 68d9298 was attempting to avoid was one where certain
platforms fail if the actively used space is greater than 50% of the
available storage space. We should be able to calculate that by simply
summing the size of each available variable and subtracting that from
the total storage space. With luck this will fix the problem described in
https://bugzilla.kernel.org/show_bug.cgi?id=55471 without permitting
damage to occur to the machines 68d9298 was attempting to fix.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
EFI variables can be flagged as being accessible only within boot services.
This makes it awkward for us to figure out how much space they use at
runtime. In theory we could figure this out by simply comparing the results
from QueryVariableInfo() to the space used by all of our variables, but
that fails if the platform doesn't garbage collect on every boot. Thankfully,
calling QueryVariableInfo() while still inside boot services gives a more
reliable answer. This patch passes that information from the EFI boot stub
up to the efi platform code.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
For s390 the page table mapping for the crashkernel memory is removed to
protect the pre-loaded kdump kernel and ramdisk. Because the crashkernel
memory is not included in the page tables for suspend/resume it is not
included in the suspend image. Therefore after resume the resumed system
does no longer contain the pre-loaded kdump kernel and when kdump is
triggered it fails.
This patch adds a PM notifier that creates the page tables before suspend
is done and removes them for resume. This ensures that the kdump kernel
is included in the suspend image.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
A label 0 was missed in the patch a9c4e541 (powerpc/kprobe: Complete
kprobe and migrate exception frame). This will cause the kernel
branch to an undetermined address if there really has a conflict when
updating the thread flags.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Cc: stable@vger.kernel.org
Acked-By: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
The current mainline crashes when hitting userspace with the following:
kernel BUG at kernel/auditsc.c:1769!
cpu 0x1: Vector: 700 (Program Check) at [c000000023883a60]
pc: c0000000001047a8: .__audit_syscall_entry+0x38/0x130
lr: c00000000000ed64: .do_syscall_trace_enter+0xc4/0x270
sp: c000000023883ce0
msr: 8000000000029032
current = 0xc000000023800000
paca = 0xc00000000f080380 softe: 0 irq_happened: 0x01
pid = 1629, comm = start_udev
kernel BUG at kernel/auditsc.c:1769!
enter ? for help
[c000000023883d80] c00000000000ed64 .do_syscall_trace_enter+0xc4/0x270
[c000000023883e30] c000000000009b08 syscall_dotrace+0xc/0x38
--- Exception: c00 (System Call) at 0000008010ec50dc
Bisecting found the following patch caused it:
commit 44e9309f1f
Author: Haren Myneni <haren@linux.vnet.ibm.com>
powerpc: Implement PPR save/restore
It was found this patch corrupted r9 when calling
SET_DEFAULT_THREAD_PPR()
Using r10 as a scratch register instead of r9 solved the problem.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Pull x86 fixes from Ingo Molnar:
"Misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set
x86/mm/cpa/selftest: Fix false positive in CPA self test
x86/mm/cpa: Convert noop to functional fix
x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal
x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
Pull m68knommu fix from Greg Ungerer:
"This contains only a single compilation fix for ColdFire m68k targets
that use local non-GPIOLIB support."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: define a local gpio_request_one() function
This patch attempts to fix:
https://bugzilla.kernel.org/show_bug.cgi?id=56461
The symptom is a crash and messages like this:
chrome: Corrupted page table at address 34a03000
*pdpt = 0000000000000000 *pde = 0000000000000000
Bad pagetable: 000f [#1] PREEMPT SMP
Ingo guesses this got introduced by commit 611ae8e3f5 ("x86/tlb:
enable tlb flush range support for x86") since that code started to free
unused pagetables.
On x86-32 PAE kernels, that new code has the potential to free an entire
PMD page and will clear one of the four page-directory-pointer-table
(aka pgd_t entries).
The hardware aggressively "caches" these top-level entries and invlpg
does not actually affect the CPU's copy. If we clear one we *HAVE* to
do a full TLB flush, otherwise we might continue using a freed pmd page.
(note, we do this properly on the population side in pud_populate()).
This patch tracks whenever we clear one of these entries in the 'struct
mmu_gather', and ensures that we follow up with a full tlb flush.
BTW, I disassembled and checked that:
if (tlb->fullmm == 0)
and
if (!tlb->fullmm && !tlb->need_flush_all)
generate essentially the same code, so there should be zero impact there
to the !PAE case.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Artem S Tashkinov <t.artem@mailcity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When CONFIG_DEBUG_PAGEALLOC is set page table updates made by
kernel_map_pages() are not made visible (via TLB flush)
immediately if lazy MMU is on. In environments that support lazy
MMU (e.g. Xen) this may lead to fatal page faults, for example,
when zap_pte_range() needs to allocate pages in
__tlb_remove_page() -> tlb_next_batch().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: konrad.wilk@oracle.com
Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If the pmd is not present, _PAGE_PSE will not be set anymore.
Fix the false positive.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1365687369-30802-1-git-send-email-aarcange@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some important bug fixes that came in over the last 10 days,
mostly mvebu and imx:
- Multiple regressions on i.mx following the conversion of
the clock code, hopefully the last we are seeing of those.
- a regression in the mvebu irq handling code
- An incorrect register offset in the rewritten s3c24xx irq code.
- Two bugs in setting up the iomega_ix2_200 machine
- Turning on an extra bus clock on imx
- A MAINTAINERS file entry for Roland Stigge
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIVAwUAUWbTVmCrR//JCVInAQKnAA/6AnxY2FRgtR/XuDTBkaJ/VUObbYBbsnRW
uvO9DRYzACIy9f4bAOtg8SIB29Zt/tqJCnqj6nG4tbTn3dqQZXSaKktwQB8e9Q3j
+ttiq4an9CFfl4/9hxTfwadg1G9coUND1774Y4qv3csHYbyd5jBLLx7MEfevQ1ry
lmzDIQRnf4j4tp3q1d2kZ3sh9O09f+V2Hxnle0JySOsXt3NrfMOQdR6PYt3ZbsyO
mhuXLsSmHk/5J4rrZD6OuThlZzDEat22gjK/HbBTa/OyVqdYyHMsOWf4O2HR0MGs
FBqg2dcTap8s/lHQ0jId0Zvt31e4OgLs00ehD7V2ZYeQjG/d2PYrvDaGuFY6kIir
eNUhAozWTF5FmOe/LJ46waUC7HoYsHqrSkFWcvGWLjLXvQT6G5jie5/lpp9yfq4G
Ou2RTG+tJ+jiPnsk7E3+Z4/YQXyDhIxvo94xF+kSR3zWgJIKjFjgQ41pSql97tMa
5AQ7eJgWsmPfsVvIIxnRYAMV3/4jfJq2gM4BviA6OjZ8nHUWNXukLIHPtcADCYYQ
J0jSbFBX7qCd6UtDDo9t/YvAcLM34FUFsqGP4BjBLZe4XAkI6aVkz6jAE5amEUVm
1HgTfE5U1QI5u0TgXbaEYii6lbgcLM9Lsdl0wFsXjvXI9rdIfzJYByeV4Mx5J1dB
V4D42xojJWE=
=J404
-----END PGP SIGNATURE-----
Merge tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC bug fixes from Arnd Bergmann:
"A little later during the week than the last few pull requests, since
there was very little that came in before 3.9-rc6. At least things
have calmed down again here.
Some important bug fixes that came in over the last 10 days, mostly
mvebu and imx:
- Multiple regressions on i.mx following the conversion of the clock
code, hopefully the last we are seeing of those.
- a regression in the mvebu irq handling code
- An incorrect register offset in the rewritten s3c24xx irq code.
- Two bugs in setting up the iomega_ix2_200 machine
- Turning on an extra bus clock on imx
- A MAINTAINERS file entry for Roland Stigge"
* tag 'arm-soc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: S3C24XX: Fix interrupt pending register offset of the EINT controller
ARM: S3C24XX: Correct NR_IRQS definition for s3c2440
ARM i.MX6: Fix ldb_di clock selection
ARM: imx: provide twd clock lookup from device tree
ARM: imx35 Bugfix admux clock
ARM: clk-imx35: Bugfix iomux clock
ARM: mxs: Slow down the I2C clock speed
MAINTAINERS: Add maintainer for LPC32xx
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
The definitions have moved to include/linux/usb/samsung-usb-phy.h,
and plat/usb-phy.h is unavailable from drivers in a multiplatform
configuration.
Also fix up the plat/usb-phy.h header file to use the definitions
from the new header instead of providing a separate copy.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The registers for the Samsung S3C serial port are currently defined in
the platform specific arch/arm/plat-samsung/include/plat/regs-serial.h
file, which is not visible to multiplatform capable drivers.
Unfortunately, it is not possible to move the file into a more local
place as we should normally try to, because the same registers
may be used in one of four places:
* In the driver itself
* In platform-independent ARM code for early debug output
* In platform_data definitions
* In the Samsung platform power management code
I have also found no way to logically split out a platform_data
file, other than possibly move everything into
include/linux/platform_data, which also felt wrong. The only
part of this file that makes sense to keep specific to the s3c24xx
platform are the virtual and physical addresses defined here,
which are needed in no other location.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Kirkwood
- a couple of small fixes for the Iomega ix2-200 board (ether and led)
- mvebu
- allow GPIO button to work on Mirabox when running SMP
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRZZxwAAoJEAi3KVZQDZAerwcIAKR1xRKuagtXvfij+xfVVRQ6
1G645hTIthFXmNiEeo3Y/mswHhVooZvu8SWh3o3J2Wsbnzeh+ch6jTsl+g6Tnb3E
KmRFU0mcalRmsyYsh4PH9nDi8/oWyP+i3ZytgfcMlsSPNiE+Ek35NdTTWMLCTT8V
MkGysVc8MWoOmMf47mNboy5UUUTXRdnSUJSjv1ubrsTK33LT7Ii9Ce+eoNvpvF+5
+6RenfRMzRSwkZf9AaCRPAhXISQKbMAwz6lKGo2GGAW+73Z+JclXXmiCfQ8pWie2
pfyqiEYigZFqe6Ly5BUtGoVfDjmOLDs+ATTUDOlOj0uaEc7pOwwKoAtpVRdck24=
=yK1f
-----END PGP SIGNATURE-----
Merge tag 'mvebu_fixes_for_v3.9_round3' of git://git.infradead.org/users/jcooper/linux into fixes
From Jason Cooper <jason@lakedaemon.net>:
mvebu fixes for v3.9 round 3
- Kirkwood
- a couple of small fixes for the Iomega ix2-200 board (ether and led)
- mvebu
- allow GPIO button to work on Mirabox when running SMP
* tag 'mvebu_fixes_for_v3.9_round3' of git://git.infradead.org/users/jcooper/linux:
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: Kirkwood: Fix typo in the definition of ix2-200 rebuild LED
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Some EFI implementations return always a MaximumVariableSize of 0,
check against max_size only if it is non-zero.
My Intel DQ67SW desktop board has such an implementation.
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Commit 523f0e5421 ("KVM: PPC: E500:
Explicitly mark shadow maps invalid") began using E500_TLB_VALID
for guest TLB1 entries, and skipping invalidations if it's not set.
However, when E500_TLB_VALID was set for such entries, it was on a
fake local ref, and so the invalidations never happen. gtlb_privs
is documented as being only for guest TLB0, though we already violate
that with E500_TLB_BITMAP.
Now that we have MMU notifiers, and thus don't need to actually
retain a reference to the mapped pages, get rid of tlb_refs, and
use gtlb_privs for E500_TLB_VALID in TLB1.
Since we can have more than one host TLB entry for a given tlbe_ref,
be careful not to clear existing flags that are relevant to other
host TLB entries when preparing a new host TLB entry.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
It's possible that we're using the same host TLB1 slot to map (a
presumably different portion of) the same guest TLB1 entry. Clear
the bit in the map before setting it, so that if the esels are the same
the bit will remain set.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add one to esel values in h2g_tlb1_rmap, so that "no mapping" can be
distinguished from "esel 0". Note that we're not saved by the fact
that host esel 0 is reserved for non-KVM use, because KVM host esel
numbering is not the raw host numbering (see to_htlb1_esel).
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The s3c-fb driver requires header files from the samsung platforms
to find its platform_data definition, but this no longer works on
multiplatform kernels, so let's move the data into a new header
file under include/linux/platform_data.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-fbdev@vger.kernel.org
Acked-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Commit:
a8aed3e075 ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge")
introduced a valid fix but one location that didn't trigger the bug that
lead to finding those (small) problems, wasn't updated using the
right variable.
The wrong variable was also initialized for no good reason, that
may have been the source of the confusion. Remove the noop
initialization accordingly.
Commit a8aed3e075 also erroneously removed one canon_pgprot pass meant
to clear pmd bitflags not supported in hardware by older CPUs, that
automatically gets corrected by this patch too by applying it to the right
variable in the new location.
Reported-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Mel Gorman <mgorman@suse.de>
Link: http://lkml.kernel.org/r/1365600505-19314-1-git-send-email-aarcange@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Early bootup issue found on DL380 machines
- Fix for the timer interrupt not being processed right away.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRZbq0AAoJEFjIrFwIi8fJTnsIAIWYw7g9j0T9gijc/t5wEZrK
KpPBITlGFAeM7liEaUh5X5M2B86tBoI77uV5EGCvDDwth+FD5WsgeMesxV9KlMdj
vbWLGubJpmd8zy6Q1f/T3LsxGHGCjz8jASeN7YTPRdBqITOQDqXjj2VC/4n7AQCh
Le3ml3A/NZTZMiz2PK8lDzjpzY2lDgrIloevahVoYLe8Jxg2aW5JTaZQg7oPA6ir
lqC1Sgju6RDKR0kmPmM8wl5TOIMCkrygriP62B+Ww9wl9HlS+X5/JlK3zVj0vJXo
oxNyDKEK96M54oO5t/v7qzdfX2Xj+S/6JPZOlegCWGClS9rQoK3uDBZupGLiAh8=
=jaNW
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
"Two bug-fixes:
- Early bootup issue found on DL380 machines
- Fix for the timer interrupt not being processed right awaym leading
to quite delayed time skew on certain workloads"
* tag 'stable/for-linus-3.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/mmu: On early bootup, flush the TLB when changing RO->RW bits Xen provided pagetables.
xen/events: Handle VIRQ_TIMER before any other hardirq in event loop.
The existing check handles the case where we've migrated to a different
core than we last ran on, but it doesn't handle the case where we're
still on the same cpu we last ran on, but some other vcpu has run on
this cpu in the meantime.
Without this, guest segfaults (and other misbehavior) have been seen in
smp guests.
Cc: stable@vger.kernel.org # 3.8.x
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit 92702df357 ("ARM: OMAP4: PM: fix PM regression introduced by recent
clock cleanup") makes the 'ocp2scp_usb_phy_phy_48m' as optional
functional clock causing regression in MUSB. But this 48MHz clock is a
mandatory clock for usb phy attached to ocp2scp and hence made as the main
clock for ocp2scp.
Cc: Keerthy <j-keerthy@ti.com>
Cc: Benoît Cousson <b-cousson@ti.com>
Cc: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[paul@pwsan.com: add comment to the hwmod data to try to prevent any
future mistakes here]
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Invoking arch_flush_lazy_mmu_mode() results in calls to
preempt_enable()/disable() which may have performance impact.
Since lazy MMU is not used on bare metal we can patch away
arch_flush_lazy_mmu_mode() so that it is never called in such
environment.
[ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU
updates" may cause a minor performance regression on
bare metal. This patch resolves that performance regression. It is
somewhat unclear to me if this is a good -stable candidate. ]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com
Tested-by: Josh Boyer <jwboyer@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> SEE NOTE ABOVE
In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
when lazy MMU updates are enabled, because set_pgd effects are being
deferred.
One instance of this problem is during process mm cleanup with memory
cgroups enabled. The chain of events is as follows:
- zap_pte_range enables lazy MMU updates
- zap_pte_range eventually calls mem_cgroup_charge_statistics,
which accesses the vmalloc'd mem_cgroup per-cpu stat area
- vmalloc_fault is triggered which tries to sync the corresponding
PGD entry with set_pgd, but the update is deferred
- vmalloc_fault oopses due to a mismatch in the PUD entries
The OOPs usually looks as so:
------------[ cut here ]------------
kernel BUG at arch/x86/mm/fault.c:396!
invalid opcode: 0000 [#1] SMP
.. snip ..
CPU 1
Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208
.. snip ..
Call Trace:
[<ffffffff81627759>] do_page_fault+0x399/0x4b0
[<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110
[<ffffffff81624065>] page_fault+0x25/0x30
[<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
[<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350
[<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60
[<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150
[<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80
[<ffffffff81153e61>] unmap_single_vma+0x531/0x870
[<ffffffff81154962>] unmap_vmas+0x52/0xa0
[<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100
[<ffffffff8115c8f8>] exit_mmap+0x98/0x170
[<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
[<ffffffff81059ce3>] mmput+0x83/0xf0
[<ffffffff810624c4>] exit_mm+0x104/0x130
[<ffffffff8106264a>] do_exit+0x15a/0x8c0
[<ffffffff810630ff>] do_group_exit+0x3f/0xa0
[<ffffffff81063177>] sys_exit_group+0x17/0x20
[<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b
Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
changes visible to the consistency checks.
Cc: <stable@vger.kernel.org>
RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
Tested-by: Josh Boyer <jwboyer@redhat.com>
Reported-and-Tested-by: Krishna Raman <kraman@redhat.com>
Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com>
Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
This patch fix the regression introduced by the commit 3202bf0157
"arm: mvebu: Improve the SMP support of the interrupt controller":
GPIO IRQ were no longer delivered to the CPUs.
To be delivered to a CPU an interrupt must be enabled at CPU level and
at interrupt source level. Before the offending patch, all the
interrupts were enabled at source level during map() function. Mask()
and unmask() was done by handling the per-CPU part. It was fine when
running in UP with only one CPU.
The offending patch added support for SMP, in this case mask() and
unmask() was done by handling the interrupt source level part. The
per-CPU level part was handled by the affinity API to select the CPU
which will receive the interrupt. (Due to some hardware limitation
only one CPU at a time can received a given interrupt).
For "normal" interrupt __setup_irq() was called when an irq was
registered. irq_set_affinity() is called from this function, which
enabled the interrupt on one of the CPUs. Whereas for GPIO IRQ which
were chained interrupts, the irq_set_affinity() was never called and
none of the CPUs was selected to receive the interrupt.
With this patch all the interrupt are enable on the current CPU during
map() function. Enabling the interrupts on a CPU doesn't depend
anymore on irq_set_affinity() and then the chained irq are not anymore
a special case. However the CPU which will receive the irq can still
be modify later using irq_set_affinity().
Tested with Mirabox (A370) and Openblocks AX3 (AXP), rootfs mounted
over NFS, compiled with CONFIG_SMP=y/N.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reported-by: Ryan Press <ryan@presslab.us>
Investigated-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Tested-by: Ryan Press <ryan@presslab.us>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
The of_node field of the device assigned to a
PCI bus is used during scanning of the PCI bus.
However on MIPS, the of_node field is assigned
only after the bus has been scanned.
Implement the architecture specific version of
'pcibios_get_phb_of_node'. Which ensures that the
PCI driver core will initialize the of_node field
before starting the scan.
Also remove the local assignment of bus->dev.of_node,
it is not needed after the patch.
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Just like the OHCI counter part we just can remove the architecture
specific symbols which prevent these configuration symbols from being
selected by platforms/architectures requiring it. The original
implementation did not scale at all since it required each and every
single architecture to be added for these configuration symbols to be
selected. Now it is up to the EHCI driver and/or platform to select
these configuration symbols accordingly.
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We can't compile a kernel with CONFIG_ALTIVEC=n when
CONFIG_PPC_TRANSACTIONAL_MEM=y. We currently get:
arch/powerpc/kernel/tm.S:320: Error: unsupported relocation against THREAD_VSCR
arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0
arch/powerpc/kernel/tm.S:323: Error: unsupported relocation against THREAD_VR0
etc.
The below fixes this with a sprinkling of #ifdefs.
This was found by mpe with kisskb:
http://kisskb.ellerman.id.au/kisskb/buildresult/8539442/
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
arch_local_irq_save() and friends are required to act as compiler
memory barriers. This patch adds a "memory" clobber to the inline
asm code in arch_local_irq_restore() which is used as the building
block for other functions needing to set/clear the interrupt enable
in the CSR register.
Signed-off-by: Mark Salter <msalter@redhat.com>
Pull vfs fixes from Al Viro:
"A nasty bug in fs/namespace.c caught by Andrey + a couple of less
serious unpleasantness - ecryptfs misc device playing hopeless games
with try_module_get() and palinfo procfs support being... not quite
correctly done, to be polite."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
mnt: release locks on error path in do_loopback
palinfo fixes
procfs: add proc_remove_subtree()
ecryptfs: close rmmod race
* check for proc_mkdir() failures
* fix buffer overrun - sizeof(format string) is *not* enough to
hold sprintf() result.
* use proc_remove_subtree(); life's much easier with it
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The arch_local_irq_save(), etc., routines are required to function
as compiler barriers. They do, but it's subtle and requires knowing
that the gcc builtin __insn_mtspr() is marked as a memory clobber.
Provide a comment explaining the assumption.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
[ This came about from me wondering about the synchronization rules of
__insn_mtspr() - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The external pending interrupt register address (EINTPEND) offset is
0xa8, not 0x08. Without this patch the external interrupts are not
properly acknowledged, which may lead to an interrupt storm and the
system hang as soon as any external interrupt is requested.
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Due to NR_IRQS being incorrectly defined not all IRQ domains can
be registered for S3C2440. It causes following errors on a s3c2440
SoC based board:
NR_IRQS:89
S3C2440: IRQ Support
irq: clearing pending status 00000002
------------[ cut here ]------------
WARNING: at kernel/irq/irqdomain.c:234 0xc0056ed0()
...
irq: could not create irq-domain
...
s3c2410-wdt s3c2410-wdt: failed to install irq (-22)
s3c2410-wdt: probe of s3c2410-wdt failed with error -22
...
samsung-uart s3c2440-uart.0: cannot get irq 74
Fix this by increasing NR_IRQS to at least (IRQ_S3C2443_AC97 + 1)
if CPU_S3C2440 is selected, so the subintc IRQ domain gets properly
registered.
Signed-off-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
* A couple imx35 clock fixes for regressions caused by common clock
framework conversion. The admux and iomux get disabled by common
clock framework late initcall, and hence causes problems.
* Add missing twd clock lookup in device tree. This becomes required
since commit bd60345 (ARM: use device tree to get smp_twd clock)
forces all DT boot to find lookup from device tree.
* Fix imx6q ldb_di clock parents mismatch per reference manual.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJRZAZBAAoJEFBXWFqHsHzOy60H/0RIxFO1TSQVVPTuJRtHr9qk
BtszcLOwiRbWipuG1OiosEKXMeOXjAPtOG8P9tReC/oJN4WL5CmGW3PrPQ/9DEbZ
OTYVPhgmGKDB/2n5BVvvTDISTovvlQRju4ht+70j1BnAlpbzGLbKMG4o6zHOROoh
d15dqg/Ny1ovsCvva2ODbwRtBkBaBqDC+RGqNPrhTN7WSk+nv6yITYZCREI1mjGH
RG7B62hn+1aD6tMdigFu9xS4vTkHXjFC0AVEixBEi3iWqofSz3Cb6Zq4MpRGNXGZ
o/tfc4/2q6uF/UEpTQJDhYbHjwesySsIwdu4vsvI2lh9EGqUgyC3YE+VbDwOeGE=
=tztW
-----END PGP SIGNATURE-----
Merge tag 'imx-fixes-3.9-5' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
From Shawn Guo <shawn.guo@linaro.org>:
The imx fixes for 3.9, take 5:
* A couple imx35 clock fixes for regressions caused by common clock
framework conversion. The admux and iomux get disabled by common
clock framework late initcall, and hence causes problems.
* Add missing twd clock lookup in device tree. This becomes required
since commit bd60345 (ARM: use device tree to get smp_twd clock)
forces all DT boot to find lookup from device tree.
* Fix imx6q ldb_di clock parents mismatch per reference manual.
* tag 'imx-fixes-3.9-5' of git://git.linaro.org/people/shawnguo/linux-2.6: (217 commits)
ARM i.MX6: Fix ldb_di clock selection
ARM: imx: provide twd clock lookup from device tree
ARM: imx35 Bugfix admux clock
ARM: clk-imx35: Bugfix iomux clock
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
According to the recent i.MX6 Quad technical reference manual, mode 0x4 (100b)
of the CCM_CS2DCR register (address 0x020C402C) bits [11-9] and [14-12] select
the PLL3 clock, and not the PLL3 PFD1 540M clock. In our code, the PLL3 root
clock is named 'pll3_usb_otg', select this instead of the 540M clock.
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
While booting from device tree, imx6q used to provide twd clock lookup
by calling clk_register_clkdev() in clock driver. However, the commit
bd60345 (ARM: use device tree to get smp_twd clock) forces DT boot to
look up the clock from device tree. It causes the failure below when
twd driver tries to get the clock, and hence kernel has to calibrate the
local timer frequency.
smp_twd: clock not found -2
...
Calibrating local timer... 396.13MHz.
Fix the regression by providing twd clock lookup from device tree, and
remove the unused twd clk_register_clkdev() call from clock driver.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
The admux clock seems to be the audmux clock as tests show. audmux does
not work without this clock enabled. Currently imx35 does not register a
clock device for audmux. This patch adds this registration. imx-audmux
driver already handles a clock device, so no changes are necessary
there.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
This patch enables iomuxc_gate clock. It is necessary to be able to
reconfigure iomux pads. Without this clock enabled, the
clk_disable_unused function will disable this clock and the iomux pads
are not configurable anymore. This happens at every boot. After a reboot
(watchdog system reset) the clock is not enabled again, so all iomux pad
reconfigurations in boot code are without effect.
The iomux pads should be always configurable, so this patch always
enables it.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Let's not burden ia64 with checks in the common efivars code that we're not
writing too much data to the variable store. That kind of thing is an x86
firmware bug, plain and simple.
efi_query_variable_store() provides platforms with a wrapper in which they can
perform checks and workarounds for EFI variable storage bugs.
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Compiling for linux-3.9-rc1 and later fails with:
drivers/gpio/devres.c: In function 'devm_gpio_request_one':
drivers/gpio/devres.c:90:2: error: implicit declaration of function 'gpio_request_one' [-Werror=implicit-function-declaration]
So provide a local gpio_request_one() function. Code largely borrowed from
blackfin's local gpio_request_one() function.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
certain hardware installed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQIcBAABCAAGBQJRYlR1AAoJEIn5HApB1cB6k5cQAI2khS3adFF9BTVZbETwRQMX
eQpYl6qpV3TC+7dUQQkKQMNOUQ9RtzpvyQjFNKd+6ZPy7nNfSGlH6idQ1QqOVDLX
9f1f29cyGcnymVyyRArupwGA/LmPG9ptSYE8TISMfuYuA10Ce6meLcvtCMuSnADr
3PaSL8M0HQuSjphZUdnJ30YmvhZ3RH4rZfuXFh4kK705zlx/kVg00qXSACLZYBeo
okHhpwFR9swzm9NlUEjJJNL0JM1Oi1/8mQotSUyAyKrfQHqYotqKNbqbS1+TtvTQ
pPdo8iI+rjkgEIe8ZDDbUfqqPG2pVZrZIWTSRMZ2/QiVqCeJl34qfDFdtng/79kl
PnPjOc+5fCpKWPOZXdMwCRfzXGKihxjjzL0hWOshexKr0DCSIx7+OykEqVk58tnD
bnq4yqwVMY9YE0SdjSKw6yzrd5FJu2fuRlfNY5PvEL9E9pcq4TLQCp3YvAo/RdrX
AVQ4YSCLfMcCOEXP/qoBIGcMpw3hKdffJ0kc/mylsaHFlL2FdQpXUhejWhh57SKp
U4n+70+ZPqhLZMJdo+ll3yV1qw0z2dOyRSmN44X0HcgmzdNLFPSaKiNdLcYvBdib
eCCbXumYNVzH8bTb+HjdDhmaprQ4ZonUuYghmJm/LQDHduEw6fVl3iG54DmawdEq
hdYf2jkl7DO6Gc95RwTA
=UiUm
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes
Pull powerpc bugfix from Stephen Rothwell:
"A single BUG_ON fix for a condition that could happen for machines
with certain hardware installed."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes:
powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being performed before the ANDCOND test
ARC irqsave/restore macros were missing the compiler barrier, causing a
stale load in irq-enabled region be used in irq-safe region, despite
being changed, because the register holding the value was still live.
The problem manifested as random crashes in timer code when stress
testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT
Here's the exact sequence which caused this:
(0). tv1[x] <----> t1 <---> t2
(1). mod_timer(t1) interrupted after it calls timer_pending()
(2). mod_timer(t2) completes
(3). mod_timer(t1) resumes but messes up the list
(4). __runt_timers( ) uses bogus timer_list entry / crashes in
timer->function
Essentially mod_timer() was racing against itself and while the spinlock
serialized the tv1[] timer link list, timer_pending() called outside the
spinlock, cached timer link list element in a register.
With low register pressure (and a deep register file), lack of barrier
in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT
version), there was nothing to force gcc to reload across the spinlock,
causing a stale value in reg be used for link list manipulation - ensuing
a corruption.
ARcompact disassembly which shows the culprit generated code:
mod_timer:
push_s blink
mov_s r13,r0 # timer, timer
..
###### timer_pending( )
ld_s r3,[r13] # <------ <variable>.entry.next LOADED
brne r3, 0, @.L163
.L163:
..
###### spin_lock_irq( )
lr r5, [status32] # flags
bic r4, r5, 6 # temp, flags,
and.f 0, r5, 6 # flags,
flag.nz r4
###### detach_if_pending( ) begins
tst_s r3,r3 <--------------
# timer_pending( ) checks timer->entry.next
# r3 is NOT reloaded by gcc, using stale value
beq.d @.L169
mov.eq r0,0
##### detach_timer( ): __list_del( )
ld r4,[r13,4] # <variable>.entry.prev, D.31439
st r4,[r3,4] # <variable>.prev, D.31439
st r3,[r4] # <variable>.next, D.30246
We initially tried to fix this by adding barrier() to preempt_* macros
for !PREEMPT_COUNT but Linus clarified that it was anything but wrong.
http://www.spinics.net/lists/kernel/msg1512709.html
[vgupta: updated commitlog]
Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Cc: Christian Ruppert <christian.ruppert@abilis.com>
Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- A couple mxs boards that run I2C at 400 kHz experience some unstable
issue occasionally. Slow down the clock speed to have I2C work
reliably.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJRXCXBAAoJEFBXWFqHsHzOqg4H/irUyWQkIfch7us/vnMQnntr
y67TFoq1ucfHDA4/Okd5YxeUWjh0vaFJTVu5JLKfuT7HGTzSeqG4sFCp/A6/i9F9
ZGbcmTUSgG3erjpzN0R+PLxWO5iqjT8Mg3/cyK83SguKtf4u/MVJq8jpTyc0JJ9R
UNj4OaKN5n7tvPGAkj8ypWDpQwlczRRie6v+Nt1qWP1K7T7Ez9ZMkhOvd7WLUi4T
H7HcWo5IOUBzLvf7JmBAQwTCTdl7BJglBO5FjTn7ao0uaMEHLJOwzPWdXs08Agzh
qMVJcWqVCcQQmB5nuDDb08mlXXEYrt/92qL3LUejJaX76bPH+lzfCnvrDGUIuvo=
=Towm
-----END PGP SIGNATURE-----
Merge tag 'mxs-fixes-3.9-4' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes
From Shawn Guo <shawn.guo@linaro.org>:
The mxs fixes for 3.9, take 4:
- A couple mxs boards that run I2C at 400 kHz experience some unstable
issue occasionally. Slow down the clock speed to have I2C work
reliably.
* tag 'mxs-fixes-3.9-4' of git://git.linaro.org/people/shawnguo/linux-2.6:
ARM: mxs: Slow down the I2C clock speed
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Let's do the changes properly and fix the same problem everywhere, not
just for one case.
Cc: <stable@vger.kernel.org> # kernels containing 15e0d9e37c or equivalent
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some versions of pHyp will perform the adjunct partition test before the
ANDCOND test. The result of this is that H_RESOURCE can be returned and
cause the BUG_ON condition to occur. The HPTE is not removed. So add a
check for H_RESOURCE, it is ok if this HPTE is not removed as
pSeries_lpar_hpte_remove is looking for an HPTE to remove and not a
specific HPTE to remove. So it is ok to just move on to the next slot
and try again.
Cc: stable@vger.kernel.org
Signed-off-by: Michael Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Pull KVM fix from Gleb Natapov:
"Bugfix for the regression introduced by commit c300aa64ddf5"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Allow cross page reads and writes from cached translations.
Pull x86 fixes from Peter Anvin:
"Two quite small fixes: one a build problem, and the other fixes
seccomp filters on x32."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Fix rebuild with EFI_STUB enabled
x86: remove the x32 syscall bitmask from syscall_get_nr()
Interrupt handlers are always invoked with interrupts disabled, so
remove all uses of the deprecated IRQF_DISABLED flag.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linux has expected that interrupt handlers are executed with local
interrupts disabled for a while now, so ensure that this is the case on
Alpha even for non-device interrupts such as IPIs.
Without this patch, secondary boot results in the following backtrace:
warning: at kernel/softirq.c:139 __local_bh_enable+0xb8/0xd0()
trace:
__local_bh_enable+0xb8/0xd0
irq_enter+0x74/0xa0
scheduler_ipi+0x50/0x100
handle_ipi+0x84/0x260
do_entint+0x1ac/0x2e0
irq_exit+0x60/0xa0
handle_irq+0x98/0x100
do_entint+0x2c8/0x2e0
ret_from_sys_call+0x0/0x10
load_balance+0x3e4/0x870
cpu_idle+0x24/0x80
rcu_eqs_enter_common.isra.38+0x0/0x120
cpu_idle+0x40/0x80
rest_init+0xc0/0xe0
_stext+0x1c/0x20
A similar dump occurs if you try to reboot using magic-sysrq.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Due to all of the goodness being packed into today's kernels, the
resulting image isn't as slim as it once was.
In light of this, don't pass -msmall-data to gcc, which otherwise results
in link failures due to impossible relocations when compiling anything but
the most trivial configurations.
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes a NULL pointer dereference at boot on UP1500.
Cc: stable@vger.kernel.org
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jay Estabrook <jay.estabrook@gmail.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for kvm_gfn_to_hva_cache_init functions for
reads and writes that will cross a page. If the range falls within
the same memslot, then this will be a fast operation. If the range
is split between two memslots, then the slower kvm_read_guest and
kvm_write_guest are used.
Tested: Test against kvm_clock unit tests.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Here is the big Gadget & PHY pull request. Many of us have
been really busy lately getting multiple drivers to a better
position.
Since this pull request is so large, I will divide it in sections
so it's easier to grasp what's included.
- cleanups:
. UDC drivers no longer touch gadget->dev, that's now udc-core
responsibility
. Many more UDC drivers converted to usb_gadget_map/unmap_request()
. UDC drivers no longer initialize DMA-related fields from gadget's
device structure
. UDC drivers don't touch gadget.dev.driver directly
. UDC drivers don't assign gadget.dev.release directly
. Removal of some unused DMA_ADDR_INVALID
. Introduction of CONFIG_USB_PHY
. All phy drivers have been moved to drivers/usb/phy and renamed to
a common naming scheme
. Fix PHY layer so it never returns a NULL pointer, also fix all
callers to avoid using IS_ERR_OR_NULL()
. Sparse fixes all over the place
. drivers/usb/otg/ has been deleted
. Marvel drivers (mv_udc, ehci-mv, mv_otg and mv_u3d) improved clock
usage
- new features:
. UDC core now provides a generic way for tracking and reporting
UDC's state (not attached, resuming, suspended, addressed,
default, etc)
. twl4030-usb learned that it shouldn't be enabled during init
. Full DT support for DWC3 has been implemented
. ab8500-usb learned about pinctrl framework
. nop PHY learned about DeviceTree and regulators
. DWC3 learned about suspend/resume
. DWC3 can now be compiled in host-only and gadget-only (as well as
DRD) configurations
. UVC now enables streaming endpoint based on negotiated speed
. isp1301 now implements the PHY API properly
. configfs-based interface for gadget drivers which will lead to
the removal of all code which just combines functions together
to build functional gadget drivers.
. f_serial and f_obex were converted to new configfs interface while
maintaining old interface around.
- non-critical fixes:
. UVC gadget driver got fixes for Endpoint usage and stream calculation
. ab8500-usb fixed unbalanced clock and regulator API usage
. twl4030-usb got a fix for when OMAP3 is booted with cable connected
. fusb300_udc got a fix for DMA usage
. UVC got fixes for two assertions of the USB Video Class Compliance
specification revision 1.1
. build warning issues caused by recent addition of __must_check to
regulator API
These are all changes which deserve a mention, all other changes are related
to these one or minor spelling fixes and other similar tasks.
Signed-of-by: Felipe Balbi <balbi@ti.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRXG8GAAoJEIaOsuA1yqREJzYQAKMLW3J/TKTWS1DDuf7qMtMz
Ug6qChKZXgT1/QrNjsq2tx4jYIkNdSMtRKiUx0BnIptlUx6gM22gcsN8mXX/UJjC
FYAiWl+tYe85e9uayqqt+qVCZjTZCc7St4wQalugDHefvA7yCbiZpSaJRGlJMK+x
mePJ7MfrulDsYBXr0u+m2LOJ0qxMDi40k3/UN3aUu5yzrmBiRpVq1mySruvLwGFp
Pr6vBnprEc6bW5sRdUR4SICKLvLk5sHwHpvpkzDLYBIb/jXQwbfQri/HKeh4VMk8
trbsvHZyB7H8uuFsCHiBc6VtjcbZ4mxPUK+1PCq8hG077avdkm3ox0BERk9aRTeC
jg4mdpyWjgovwi882woPEQXNZoaAXpVDyI8tBRx92a+rGJjXSHhLQI+4Ffi4ZvzV
d+q1ZzrHxuzwa/BwcPETY76umXQqXWXg+ap1bHDY0RZFoPLdXMpl583NXGSn3gOD
dUlD0UlgYwb75333tRIPNQn3qOx0HVd6MlYPMNzl9t9c9qqfX78AYRny6ZucupRg
t9VZ6FO3D2yre9W7u3U3q2c9H7uSAKr/8xaNfvdsIWPncgvvYVIyE8MnR7AiHoPv
ZYESs/Gs6w9vUsHa9K4J16Ape7D3AMcGpXUoPUxTBHrwBexzt4j27VWtcL4ns/9r
0kcltUJ4Zq+PIjc7xgxe
=aF4r
-----END PGP SIGNATURE-----
Merge tag 'usb-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next
Felipe writes:
usb: patches for v3.10 merge window
Here is the big Gadget & PHY pull request. Many of us have
been really busy lately getting multiple drivers to a better
position.
Since this pull request is so large, I will divide it in sections
so it's easier to grasp what's included.
- cleanups:
. UDC drivers no longer touch gadget->dev, that's now udc-core
responsibility
. Many more UDC drivers converted to usb_gadget_map/unmap_request()
. UDC drivers no longer initialize DMA-related fields from gadget's
device structure
. UDC drivers don't touch gadget.dev.driver directly
. UDC drivers don't assign gadget.dev.release directly
. Removal of some unused DMA_ADDR_INVALID
. Introduction of CONFIG_USB_PHY
. All phy drivers have been moved to drivers/usb/phy and renamed to
a common naming scheme
. Fix PHY layer so it never returns a NULL pointer, also fix all
callers to avoid using IS_ERR_OR_NULL()
. Sparse fixes all over the place
. drivers/usb/otg/ has been deleted
. Marvel drivers (mv_udc, ehci-mv, mv_otg and mv_u3d) improved clock
usage
- new features:
. UDC core now provides a generic way for tracking and reporting
UDC's state (not attached, resuming, suspended, addressed,
default, etc)
. twl4030-usb learned that it shouldn't be enabled during init
. Full DT support for DWC3 has been implemented
. ab8500-usb learned about pinctrl framework
. nop PHY learned about DeviceTree and regulators
. DWC3 learned about suspend/resume
. DWC3 can now be compiled in host-only and gadget-only (as well as
DRD) configurations
. UVC now enables streaming endpoint based on negotiated speed
. isp1301 now implements the PHY API properly
. configfs-based interface for gadget drivers which will lead to
the removal of all code which just combines functions together
to build functional gadget drivers.
. f_serial and f_obex were converted to new configfs interface while
maintaining old interface around.
- non-critical fixes:
. UVC gadget driver got fixes for Endpoint usage and stream calculation
. ab8500-usb fixed unbalanced clock and regulator API usage
. twl4030-usb got a fix for when OMAP3 is booted with cable connected
. fusb300_udc got a fix for DMA usage
. UVC got fixes for two assertions of the USB Video Class Compliance
specification revision 1.1
. build warning issues caused by recent addition of __must_check to
regulator API
These are all changes which deserve a mention, all other changes are related
to these one or minor spelling fixes and other similar tasks.
Signed-of-by: Felipe Balbi <balbi@ti.com>
eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence
their .cmd files don't get included by the build machinery, leading to
the files always getting rebuilt.
Rather than adding the two files individually, take the opportunity and
add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment
at the top of the file to be shrunk quite a bit.
At the same time, remove a pointless flags override line - the variable
assigned to was misspelled anyway, and the options added are
meaningless for assembly sources.
[ hpa: the patch is not minimal, but I am taking it for -urgent anyway
since the excess impact of the patch seems to be small enough. ]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/515C5D2502000078000CA6AD@nat28.tlf.novell.com
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Pull MIPS fixes from Ralf Baechle:
"Fixes for a number of small glitches in various corners of the MIPS
tree. No particular areas is standing out.
With this applied all MIPS defconfigs are building fine. No merge
conflicts are expected."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Delete definition of SA_RESTORER.
MIPS: Fix ISA level which causes secondary cache init bypassing and more
MIPS: Fix build error cavium-octeon without CONFIG_SMP
MIPS: Kconfig: Rename SNIPROM too
MIPS: Alchemy: Fix typo "CONFIG_DEBUG_PCI"
MIPS: Unbreak function tracer for 64-bit kernel.
SA_RESTORER used to be defined as 0x04000000 but only the O32 ABI ever
supported its use and no libc was using it, so the entire sa-restorer
functionality was removed with lmo commit 39bffc12c3580ab [Zap sa_restorer.]
for 2.5.48 retaining only the SA_RESTORER definition as a reminder to avoid
accidental reuse of the mask bit.
Upstream cdef9602fbf1871a43f0f1b5cea10dd0f275167d [signal: always clear
sa_restorer on execve] adds code that assumes sa_sigaction has an
sa_restorer field, if SA_RESTORER is defined which would break MIPS.
So remove the SA_RESTORER definition before the v3.8.4 merge.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit 17da8d63add23830892ac4dc2cbb3b5d4ffb79a8)
The commit a96102be70 introduced set_isa() where compatible ISA info is
also set aside from the one gets passed in. It means, for example, 1004K
will have MIPS_CPU_ISA_M32R2/M32R1/II/I flags. This leads to things like
the following inappropriate:
if (c->isa_level == MIPS_CPU_ISA_M32R1 ||
c->isa_level == MIPS_CPU_ISA_M32R2 ||
c->isa_level == MIPS_CPU_ISA_M64R1 ||
c->isa_level == MIPS_CPU_ISA_M64R2)
This patch fixes it.
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 7517de3486 ("MIPS: Alchemy: Redo
PCI as platform driver") added a reference to CONFIG_DEBUG_PCI. Change
it to CONFIG_PCI_DEBUG, as that is a valid Kconfig macro.
Also add a newline to a debugging printk that this fix enables.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 58b69401c7 [MIPS: Function tracer: Fix broken function tracing]
completely broke the function tracer for 64-bit kernels. The symptom is
a system hang very early in the boot process.
The fix: Remove/fix $sp adjustments for 64-bit case.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Al Cooper <alcooperx@gmail.com>
Cc: viric@viric.name
Cc: stable@vger.kernel.org # 3.8.x
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Now that a display timing binding is available, convert our almost identical
binding to use the standard binding.
This patch converts the vt8500 and wm8505 framebuffer drivers and
associated dts/dtsi files to use the standard binding as defined in
bindings/video/display-timing.txt.
There are two side-effects of making this conversion:
1) The fb node should now be in the board file, rather than the soc file as
the display-timing node is a child of the fb node.
2) We still require a bits per pixel property to initialize the framebuffer
for the different lcd panels. Rather than including this as part of the
display timing, it is moved into the framebuffer node.
I have also taken the opportunity to alphabetise the includes of each
driver to avoid double-ups.
Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Pull ARM fixes from Russell King:
"Another round of ARM fixes, which include:
- Fixing a problem with LPAE mapping sections
- Reporting of some hwcaps on Krait CPUs
- Avoiding repetitive warnings in the breakpoint code
- Fixing a build error noticed on Dove platforms with PJ4 CPUs
- Fix masking of level 2 cache revision.
- Fixing timer-based udelay()
- A larger fix for an erratum causing people major grief with Cortex
A15 CPUs"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7690/1: mm: fix CONFIG_LPAE typos
ARM: 7689/1: add unwind annotations to ftrace asm
ARM: 7685/1: delay: use private ticks_per_jiffy field for timer-based delay ops
ARM: 7684/1: errata: Workaround for Cortex-A15 erratum 798181 (TLBI/DSB operations)
ARM: 7682/1: cache-l2x0: fix masking of RTL revision numbering and set_debug init
ARM: iWMMXt: always enable iWMMXt support with PJ4 CPUs
ARM: 7681/1: hw_breakpoint: use warn_once to avoid spam from reset_ctrl_regs()
ARM: 7678/1: Work around faulty ISAR0 register in some Krait CPUs
ARM: 7680/1: Detect support for SDIV/UDIV from ISAR0 register
ARM: 7679/1: Clear IDIVT hwcap if CONFIG_ARM_THUMB=n
ARM: 7677/1: LPAE: Fix mapping in alloc_init_section for unaligned addresses
ARM: KVM: vgic: take distributor lock on sync_hwstate path
ARM: KVM: vgic: force EOIed LRs to the empty state
Pull s390 fixes from Martin Schwidefsky:
"Just a bunch of bugfixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: provide emtpy check_pgt_cache() function
s390/uaccess: fix page table walk
s390/3270: fix minor_start issue
s390/uaccess: fix clear_user_pt()
s390/scm_blk: fix error return code in scm_blk_init()
s390/scm_block: fix printk format string
drivers/Kconfig: add several missing GENERIC_HARDIRQS dependencies
CONFIG_LPAE doesn't exist: the correct option is CONFIG_ARM_LPAE, so fix
up the two typos under arch/arm/.
The fix to head.S is slightly scary, but this is just for setting up
an early io-mapping for the serial port when running on a big-endian,
LPAE system. Since these systems don't exist in the wild (at least, I
have no access to one outside of kvmtool, which doesn't provide a serial
port suitable for earlyprintk), then we can revisit the code later if it
causes any problems.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add unwind annotations to the ftrace assembly code so that the function
tracer's stacktracing options (func_stack_trace, etc.) work when
CONFIG_ARM_UNWIND is enabled.
Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 70264367a2 ("ARM: 7653/2: do not scale loops_per_jiffy when
using a constant delay clock") fixed a problem with our timer-based
delay loop, where loops_per_jiffy is scaled by cpufreq yet used directly
by the timer delay ops.
This patch fixes the problem in a more elegant way by keeping a private
ticks_per_jiffy field in the delay ops, independent of loops_per_jiffy
and therefore not subject to scaling. The loop-based delay continues to
use loops_per_jiffy directly, as it should.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On Cortex-A15 (r0p0..r3p2) the TLBI/DSB are not adequately shooting down
all use of the old entries. This patch implements the erratum workaround
which consists of:
1. Dummy TLBIMVAIS and DSB on the CPU doing the TLBI operation.
2. Send IPI to the CPUs that are running the same mm (and ASID) as the
one being invalidated (or all the online CPUs for global pages).
3. CPU receiving the IPI executes a DMB and CLREX (part of the exception
return code already).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit b8db6b8 (ARM: 7547/4: cache-l2x0: add support for Aurora L2 cache
ctrl) moved the masking of the part ID which caused the RTL version to be
lost. Commit 6248d06 (ARM: 7545/1: cache-l2x0: make outer_cache_fns a
field of l2x0_of_data) changed how .set_debug is initialized. Both commits
break commit 74ddcdb (ARM: 7608/1: l2x0: Only set .set_debug
on PL310 r3p0 and earlier) which uses the RTL version to conditionally set
.set_debug function pointer. Commit b8db6b8 also caused the printed cache
ID to be missing the version information.
Fix this by reverting how the part number is masked so the RTL version
info is maintained. The cache-id-part DT property does not set the RTL
bits so masking them should have no effect. Also, re-arrange the order
of the function pointer init so the .set_debug function can be overridden.
Reported-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>
Cc: Yehuda Yitschak <yehuday@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Jason Cooper reports these build errors:
arch/arm/kernel/built-in.o: In function `iwmmxt_do':
/.../arch/arm/kernel/pj4-cp0.c:36: undefined reference to `iwmmxt_task_release'
/.../arch/arm/kernel/pj4-cp0.c:40: undefined reference to `iwmmxt_task_switch'
make: *** [vmlinux] Error 1
This is caused because the PJ4 code explicitly references the iWMMXt
code, but doesn't require it to be built. Fix this by ensuring that
iWMMXt is always enabled with PJ4.
Reported-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull MIPS fixes from Ralf Baechle:
"A collection of fixes pretty much across the MIPS code. Even the
change to include/linux/signal.h by David Howells' 2a1486981c ("Fix
breakage in MIPS siginfo handling") should be considered MIPS-specific
as it touches an ifdefed segment that is only relevant to MIPS and
which unfortunately can't be made to go away entirely."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
Fix breakage in MIPS siginfo handling
Revert "MIPS: BCM63XX: Call board_register_device from device_initcall()"
MIPS: BCM63XX: Make nvram checksum failure non fatal
MIPS: Fix code generation for non-DSP capable CPUs
MIPS: Fix inconsistent formatting inside /proc/cpuinfo
MIPS: SEAD3: Enable LL/SC.
MIPS: Get rid of CONFIG_CPU_HAS_LLSC again
MIPS: Add dependencies for HAVE_ARCH_TRANSPARENT_HUGEPAGE
MIPS: VR4133: Fix probe for LL/SC.
MIPS: Fix logic errors in bitops.c
MIPS: Use CONFIG_CPU_MIPSR2 in csum_partial.S
MIPS: compat: Return same error ENOSYS as native for invalid operation.
Slow down the I2C clock speed on M28 and SPS1 as it turns out the
I2C block in i.MX28 can not operate stable enough with the bus
running at 400kHz. Note that the driver used by Freescale runs
the bus at 250kHz when 400kHz speed is selected, but the mainline
Linux kernel runs the bus at actual 400kHz and that's where it is
occasionally unstable. Play safe and run the bus at 100kHz.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Commit fca460f95e simplified the x32
implementation by creating a syscall bitmask, equal to 0x40000000, that
could be applied to x32 syscalls such that the masked syscall number
would be the same as a x86_64 syscall. While that patch was a nice
way to simplify the code, it went a bit too far by adding the mask to
syscall_get_nr(); returning the masked syscall numbers can cause
confusion with callers that expect syscall numbers matching the x32
ABI, e.g. unmasked syscall numbers.
This patch fixes this by simply removing the mask from syscall_get_nr()
while preserving the other changes from the original commit. While
there are several syscall_get_nr() callers in the kernel, most simply
check that the syscall number is greater than zero, in this case this
patch will have no effect. Of those remaining callers, they appear
to be few, seccomp and ftrace, and from my testing of seccomp without
this patch the original commit definitely breaks things; the seccomp
filter does not correctly filter the syscalls due to the difference in
syscall numbers in the BPF filter and the value from syscall_get_nr().
Applying this patch restores the seccomp BPF filter functionality on
x32.
I've tested this patch with the seccomp BPF filters as well as ftrace
and everything looks reasonable to me; needless to say general usage
seemed fine as well.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost
Cc: <stable@vger.kernel.org>
Cc: Will Drewry <wad@chromium.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Occassionaly on a DL380 G4 the guest would crash quite early with this:
(XEN) d244:v0: unhandled page fault (ec=0003)
(XEN) Pagetable walk from ffffffff84dc7000:
(XEN) L4[0x1ff] = 00000000c3f18067 0000000000001789
(XEN) L3[0x1fe] = 00000000c3f14067 000000000000178d
(XEN) L2[0x026] = 00000000dc8b2067 0000000000004def
(XEN) L1[0x1c7] = 00100000dc8da067 0000000000004dc7
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 244 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-4.1.3OVM x86_64 debug=n Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e033:[<ffffffff81263f22>]
(XEN) RFLAGS: 0000000000000216 EM: 1 CONTEXT: pv guest
(XEN) rax: 0000000000000000 rbx: ffffffff81785f88 rcx: 000000000000003f
(XEN) rdx: 0000000000000000 rsi: 00000000dc8da063 rdi: ffffffff84dc7000
The offending code shows it to be a loop writting the value zero
(%rax) in the %rdi (the L4 provided by Xen) register:
0: 44 00 00 add %r8b,(%rax)
3: 31 c0 xor %eax,%eax
5: b9 40 00 00 00 mov $0x40,%ecx
a: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
11: 00 00
13: ff c9 dec %ecx
15:* 48 89 07 mov %rax,(%rdi) <-- trapping instruction
18: 48 89 47 08 mov %rax,0x8(%rdi)
1c: 48 89 47 10 mov %rax,0x10(%rdi)
which fails. xen_setup_kernel_pagetable recycles some of the Xen's
page-table entries when it has switched over to its Linux page-tables.
Right before try to clear the page, we make a hypercall to change
it from _RO to _RW and that works (otherwise we would hit an BUG()).
And the _RW flag is set for that page:
(XEN) L1[0x1c7] = 001000004885f067 0000000000004dc7
The error code is 3, so PFEC_page_present and PFEC_write_access, so page is
present (correct), and we tried to write to the page, but a violation
occurred. The one theory is that the the page entries in hardware
(which are cached) are not up to date with what we just set. Especially
as we have just done an CR3 write and flushed the multicalls.
This patch does solve the problem by flusing out the TLB page
entry after changing it from _RO to _RW and we don't hit this
issue anymore.
Fixed-Oracle-Bug: 16243091 [ON OCCASIONS VM START GOES INTO
'CRASH' STATE: CLEAR_PAGE+0X12 ON HP DL380 G4]
Reported-and-Tested-by: Saar Maoz <Saar.Maoz@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
numa_clear_node() function is not implemented under IA64,
it will be called in unmap_cpu_on_node() in mm/memory_hotplug.c.
This cause build error under IA64, this patch adds numa_clear_node()
in IA64 to fix this problem.
[Added __cpuinit notation to numa_clear_node() to keep linker happy -Tony]
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Back 2010 during a revamp of the irq code some initializations
were moved from ia64_mca_init() to ia64_mca_late_init() in
commit c75f2aa13f
Cannot use register_percpu_irq() from ia64_mca_init()
But this was hideously wrong. First of all these initializations
are now down far too late. Specifically after all the other cpus
have been brought up and initialized their own CMC vectors from
smp_callin(). Also ia64_mca_late_init() may be called from any cpu
so the line:
ia64_mca_cmc_vector_setup(); /* Setup vector on BSP */
is generally not executed on the BSP, and so the CMC vector isn't
setup at all on that processor.
Make use of the arch_early_irq_init() hook to get this code executed
at just the right moment: not too early, not too late.
Reported-by: Fred Hartnett <fred.hartnett@hp.com>
Tested-by: Fred Hartnett <fred.hartnett@hp.com>
Cc: stable@kernel.org # v2.6.37+
Signed-off-by: Tony Luck <tony.luck@intel.com>