Pull x86 fixes from Peter Anvin:
"A somewhat unpleasantly large collection of small fixes. The big ones
are the __visible tree sweep and a fix for 'earlyprintk=efi,keep'. It
was using __init functions with predictably suboptimal results.
Another key fix is a build fix which would produce output that simply
would not decompress correctly in some configuration, due to the
existing Makefiles picking up an unfortunate local label and mistaking
it for the global symbol _end.
Additional fixes include the handling of 64-bit numbers when setting
the vdso data page (a latent bug which became manifest when i386
started exporting a vdso with time functions), a fix to the new MSR
manipulation accessors which would cause features to not get properly
unblocked, a build fix for 32-bit userland, and a few new platform
quirks"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall()
x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro
x86: Fix typo preventing msr_set/clear_bit from having an effect
x86/intel: Add quirk to disable HPET for the Baytrail platform
x86/hpet: Make boot_hpet_disable extern
x86-64, build: Fix stack protector Makefile breakage with 32-bit userland
x86/reboot: Add reboot quirk for Certec BPC600
asmlinkage: Add explicit __visible to drivers/*, lib/*, kernel/*
asmlinkage, x86: Add explicit __visible to arch/x86/*
asmlinkage: Revert "lto: Make asmlinkage __visible"
x86, build: Don't get confused by local symbols
x86/efi: earlyprintk=efi,keep fix
With tk->wall_to_monotonic.tv_nsec being a 32-bit value on 32-bit
systems, (tk->wall_to_monotonic.tv_nsec << tk->shift) in update_vsyscall()
may lose upper bits or, worse, add them since compiler will do this:
(u64)(tk->wall_to_monotonic.tv_nsec << tk->shift)
instead of
((u64)tk->wall_to_monotonic.tv_nsec << tk->shift)
So if, for example, tv_nsec is 0x800000 and shift is 8 we will end up
with 0xffffffff80000000 instead of 0x80000000. And then we are stuck in
the subsequent 'while' loop.
We need an explicit cast.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1399648287-15178-1-git-send-email-boris.ostrovsky@oracle.com
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Due to a typo the msr accessor function introduced in
22085a66c2 didn't have any lasting
effects because they accidentally wrote the old value back.
After c0a639ad0b this at the very least
this causes cpuid limits not to be lifted on some cpus leading to
missing capabilities for those.
Signed-off-by: Andres Freund <andres@anarazel.de>
Link: http://lkml.kernel.org/r/1399598957-7011-2-git-send-email-andres@anarazel.de
Cc: Borislav Petkov <bp@suse.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
An additional testcase found an issue with the last
series of patches applied: the fallback solution may
not save the iv value after operation. This very small
fix just makes sure the iv is copied back to the
walk/desc struct.
Cc: <stable@vger.kernel.org> # 3.14+
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HPET on current Baytrail platform has accuracy problem to be
used as reliable clocksource/clockevent, so add a early quirk to
disable it.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1398327498-13163-2-git-send-email-feng.tang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
HPET on some platform has accuracy problem. Making
"boot_hpet_disable" extern so that we can runtime disable
the HPET timer by using quirk to check the platform.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1398327498-13163-1-git-send-email-feng.tang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
If you are using a 64-bit kernel with 32-bit userland, then
scripts/gcc-x86_64-has-stack-protector.sh invokes 32-bit gcc
with -mcmodel=kernel, which produces:
<stdin>:1:0: error: code model 'kernel' not supported in the 32 bit mode
and trips the "broken compiler" test at arch/x86/Makefile:120.
There are several places a fix is possible, but the following seems
cleanest. (But it's minimal; it would also be possible to factor
out a bunch of stuff from the two branches of the if.)
Signed-off-by: George Spelvin <linux@horizon.com>
Link: http://lkml.kernel.org/r/20140507210552.7581.qmail@ns.horizon.com
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Pull sparc fixes from David Miller:
"I've been auditing the THP support on sparc64 and found several bugs,
hopefully most of which are fixed completely here.
Also an RT kernel locking fix from Kirill Tkhai"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Give more detailed information in {pgd,pmd}_ERROR() and kill pte_ERROR().
sparc64: Add basic validations to {pud,pmd}_bad().
sparc64: Use 'ILOG2_4MB' instead of constant '22'.
sparc64: Fix range check in kern_addr_valid().
sparc64: Fix top-level fault handling bugs.
sparc64: Handle 32-bit tasks properly in compute_effective_address().
sparc64: Don't use _PAGE_PRESENT in pte_modify() mask.
sparc64: Fix hex values in comment above pte_modify().
sparc64: Fix bugs in get_user_pages_fast() wrt. THP.
sparc64: Fix huge PMD invalidation.
sparc64: Fix executable bit testing in set_pmd_at() paths.
sparc64: Normalize NMI watchdog logging and behavior.
sparc64: Make itc_sync_lock raw
sparc64: Fix argument sign extension for compat_sys_futex().
As requested by Linus add explicit __visible to the asmlinkage users.
This marks all functions visible to assembler.
Tree sweep for arch/x86/*
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1398984278-29319-3-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
arch/x86/crypto/sha1_avx2_x86_64_asm.S introduced _end as a local
symbol, which broke the build under certain circumstances. Although
the wisdom of _end as a local symbol can definitely be questioned, the
build should not break for that reason.
Thus, filter the output of nm to only get global symbols of
appropriate type.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Cc: Chandramouli Narayanan <mouli@linux.intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-uxm3j3w3odglcwhafwq5tjqu@git.kernel.org
Build console support only when CONFIG_TTY is selected.
This restores ISS as the default platform for allnoconfig builds.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Chris Zankel <chris@zankel.net>
- mvebu
- fix NOR bus width on Armada XP boards
- use qsgmii on Armada XP GP board
- add i2c bus freq for Armada 370 DB board
- add SATA interface for Armada 375 DB
- kirkwood
- fix double probe of audio codec for T5325
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTX5qGAAoJEP45WPkGe8Zn7iMP/1PxcEqs0Eb+vfOg7O5W/j72
DTNswSUN/L/mG6rDIAy2vrpx8qlkVknu0UpL/wb6pWXrSg7db6nTtfmHU77d4lTD
XhatamU4hquBkAIcXP7r/spIgL5FfCwlRMFfK+NmpAmYDq/KTofErBgR8llTYfRK
HUrsH9RKdTe/Omh6Wv5ogwoqMe2S6dx5R4U0yR2tP4mcgYrDu59hzDi+Ca8EXD9r
D+BqLJWg1JQOwY1PBZuzdSTNXbascleg004xW2nyV6MFWegqzgrgYhxPseiJIzwI
D7nQDlMuo4fOAC4mbyAcjaDYJAaTU1EGazhV7XQXoAKDgrCtN8hPJ7sBv/MfrVjH
KZhtJNEwaDh7YRCctF2eqWveEiFe7hcjViOBda/8EGXqHtPe4jL+IjVyruYPMUmh
nKaJooxOVNFzo2QimobRxMtOhQfr57UVjHngh+9u0fvnr81uFhMig2+ddNWIn7HP
ZELGcyyJr5kXYt8e9j13lnGUdHKUtdSD2ILng2ETTC8pnWRP0TPkfjtXbyZ1BqWN
fLR3jd5nAepRILynS4Ct55FVy2FjsNtCYYekeJrTmIgNps8NaOjyI7Jx+MsGSOpm
QOWz4iejohwFDc6vVMwEzh5i4JW4qBSBFPksX9Q5BWevi14uVXWujGnGEzZs34h+
Z/Pxh9rydne+wyYakHcr
=6yRE
-----END PGP SIGNATURE-----
Merge tag 'mvebu-dt-fixes-3.15' of git://git.infradead.org/linux-mvebu into fixes
From Jason Cooper:
mvebu DT fixes for v3.15
- mvebu
- fix NOR bus width on Armada XP boards
- use qsgmii on Armada XP GP board
- add i2c bus freq for Armada 370 DB board
- add SATA interface for Armada 375 DB
- kirkwood
- fix double probe of audio codec for T5325
* tag 'mvebu-dt-fixes-3.15' of git://git.infradead.org/linux-mvebu:
ARM: Kirkwood: T5325: Fix double probe of Codec
ARM: mvebu: enable the SATA interface on Armada 375 DB
ARM: mvebu: specify I2C bus frequency on Armada 370 DB
ARM: mvebu: use qsgmii phy-mode for Armada XP GP interfaces
ARM: mvebu: fix NOR bus-width in Armada XP OpenBlocks AX3 Device Tree
ARM: mvebu: fix NOR bus-width in Armada XP DB Device Tree
ARM: mvebu: fix NOR bus-width in Armada XP GP Device Tree
Signed-off-by: Olof Johansson <olof@lixom.net>
- devbus: fix bus-width conversion
- orion5x: fix target ID for crypto SRAM window
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTX5gJAAoJEP45WPkGe8ZnXLQP/1EpsDvA8bdCm+kspEBMScpN
Ji3N1F6XTjYRI/YbpwvJra1UZyZDx2rNaA1ofGDLLTZiELO0pOMeE8gGouG6lZaA
OZf1Hq5T0pKkq8XkqUhKqgZ3QNUmCjaWL2FCkAIuiFDaCFpops2aCRy1YO3nht6o
ymhJ/MQwdvCG+pMVfaVNMmnxroA9uiu7OPLtfwru8+7kghGWfa/LNCM5H4+TOEG9
vV5onWrBzzTQCq1X12+a5qgOsOmlRdo69EZ2j0Rq/aHg5c9+Bxa2u5OncBe0txOa
blC42x9gTVtOY3sUTkFcyF1HMIC/PajZhWHkJKIqFJfVF11GVFc38LwWJewIsgaH
OxSwMZQOL4CcRTGl2nwxF4L2TVefYimJgNXsjEe0i7FNahtT/pUS4gz56XDE1iK5
sIZE//GevJ0eFDk9kCKEZYJkqiSTgrw05wHb5CVlHIy5fi5jG9prYzBqDam57Ohi
JOpVyC9+sodY7VkLMgqZc1XW492LB0JXdSOn473k8E4j5dK2JVzLd7GljiHXFc8P
MM/drQGmXa4O85lliD2RLTW1ca40AFbtgimHkQjBIAm1YCaUcZhEp5TUpA40BoBN
IA+GP6TtN2ha39gk1AbMx773aaRcK28p1B+epfkF2CPuTYuaMYnp51UcVK2Xugvt
QiD1tZJB3FpyIEqRqecS
=k1Ts
-----END PGP SIGNATURE-----
Merge tag 'mvebu-fixes-3.15' of git://git.infradead.org/linux-mvebu into fixes
From Jason Cooper:
mvebu fixes for v3.15
- devbus: fix bus-width conversion
- orion5x: fix target ID for crypto SRAM window
* tag 'mvebu-fixes-3.15' of git://git.infradead.org/linux-mvebu:
ARM: orion5x: fix target ID for crypto SRAM window
memory: mvebu-devbus: fix the conversion of the bus width
Signed-off-by: Olof Johansson <olof@lixom.net>
timings for smc911x LAN9220 (and potentially LAN9221) devices
that were noted on a cm-t3730 system. Also fix THUMB mode
for SMP, and mailbox related warnings when booted with device
tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTWpj8AAoJEBvUPslcq6Vzp5wQANywiCcNZxcoRc6ZW+f/4UUX
xndCEdCbqIa2feHJFkbFAgVh3flgoFRMu4mqM4hsxz+DdnSjprErJkAiyVV4rVqt
F8H/EWASDxuW+x/j1BlUr67SgauVsnYuqYQEb8ki3QD1ZA9g3puPzH4cvvevdUuk
GGXTc/VtH3CTWXPMGwngeIFMmVZg5IbZw9bc/F2irIM8GjQWI1RG7ZJ0MYIt167l
RDj3/Y++3qFzuWXoX59QovUOr94ixaCAtmJXWyLE9MHpT7NzNpYXTuLXQcWXXcg5
hp+DIXJuQrRcdYA156Ixfl9CuO20ODzaBqz1aJxj4hRhdtsyshRIQfBwvUwHTpQv
IKj0SSjiMxQF+mZjDs1bMrw8KCgavV1o6P/CS2NjjQLkuHs07TBD/q4IfhCmLSiW
e0f/fXBBQoB+0ZA9GFvsVaL+9VZoFP6JTMLxPk66erJhkgYn9RthuP+4oNk8kOaC
jdaoUhMjGQWohNenecPpkaUiLT6d3phL9ZPnL2Z31FUyYvbw/jrPAtC3e88cXDFu
CPlIhLmM7mhIKx+yt1BIZ2IY3LoPDYt8YxqXyVOORJcqNMq5juRXGKi+iUB8+mGa
inViKIptO26FKqQcKWqGABuaQtuUA6RZzc0MNTBSNhhCtGj12ZgD+pg3qcSN+T61
wLK1ZFlt1YZ2jp2mbt7x
=8NVy
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v3.15/fixes-gpmc' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Merge fixes from Tony Lindgren:
Mostly fixes for occasional memory corruption caused by bad
timings for smc911x LAN9220 (and potentially LAN9221) devices
that were noted on a cm-t3730 system. Also fix THUMB mode
for SMP, and mailbox related warnings when booted with device
tree.
* tag 'omap-for-v3.15/fixes-gpmc' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: AM3517: Disable absent IPs inherited from OMAP3
ARM: dts: OMAP2: Fix interrupts for OMAP2420 mailbox
ARM: dts: OMAP5: Add mailbox dt node to fix boot warning
ARM: OMAP5: Switch to THUMB mode if needed on secondary CPU
ARM: dts: am437x-gp-evm: Do not reset gpio5
ARM: dts: omap3-igep0020: use SMSC9221 timings
ARM: dts: Fix GPMC timings for LAN9220
ARM: dts: Fix GPMC Ethernet timings for omap cm-t sbc-t boards for device tree
ARM: dts: Fix bad OTG muxing for cm-t boards
Signed-off-by: Olof Johansson <olof@lixom.net>
Commit 54397d8534
("ARM: kirkwood: Relocate PCIe device tree nodes")
moved the pcie-controller nodes for the Kirkwood SoCs to the mbus
bus node. For some reason, two boards were not properly converted
and have their pci-controller nodes still in the ocp bus node.
As the corresponding SoC pcie-controller does not exist anymore,
it is likely that pcie is broken on those boards since above commit.
Fix it by moving the pcie related nodes to the correct location.
Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Fixes: 54397d8534 ("ARM: kirkwood: Relocate PCIe device tree nodes")
Cc: <stable@vger.kernel.org> # v3.12+
Acked-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lkml.kernel.org/r/1398862602-29595-2-git-send-email-sebastian.hesselbarth@gmail.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
arm64 move of of_clk_init() call in a previous commit
- Default DMA ops changed to non-coherent to preserve compatibility with
32-bit ARM DT files. The "dma-coherent" property can be used to
explicitly mark a device coherent. The Applied Micro DT file has been
updated to avoid DMA cache maintenance for the X-Gene SATA controller
(the only arm64 related driver with such assumption in -rc mainline)
- Fixmap correction for earlyprintk
- kern_addr_valid() fix for huge pages
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iQIcBAABAgAGBQJTZhiPAAoJEGvWsS0AyF7xGBUQAIthlCZGjq3yFh+P3YbZBbfh
8HEg3xQIEunaUTMLxrZ9c32rHdOwWMivmaStb7XfIzYc6XIGGnFwk0VFnxlBtOS/
yOw6khNy3d5b+R2yVVXJdOwGDvUJ7ZlZ4G35RbpFXqmHVOiT2JP5Pv/8hp/Ct3UE
eBoLjLYkvrnBgZyjBafTjc+ExjtViMdACNUCZ+fPfvWVF2pWesB72P9/+QT4DZ4Q
g+QXmtTviysFJPzi2LqVukPL5HzxrOcJql9F0lPEdCVypRHDQtNZfMf7aftZVRue
8z6IaqgwQuOkHko50RFcrPF1AbEnQWbbA//Mfm1YaJLtlaUwgEXS8jryP4MVGM/s
wjJD42tY80ysTFFiWjlqYx6wumtSjkZzLQIo7K+MjvleGaciRMsM5u2OyQJ6o8sR
GMLButOfZj1GOFPE56Xn6R27MzONS1eiCFR99dsnPPwNlqGuY7KEacAHGYRfEe75
g0Qwzj1sM6d+RHQKidWFRvvMQg5bxAENt1rpFJJ1cCge/jL2QqgbPhVPzMCM4nrW
xGQzSKO+5L1CLtH4gRd7Jdyg7tUrRBFzC8HXk/o6moO+lOebKzCpq4tNiW/MOwPG
sGCzmr2TpN6ImEjOhjYUByqa+XGUsz1n7d53Itkz8+pxsXhYHvd8iC1hOpNwakVM
h/0rfXwD782k1N3S++MH
=kRLA
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"These are mostly arm64 fixes with an additional arm(64) platform fix
for the initialisation of vexpress clocks (the latter only affecting
arm64; the arch/arm64 code is SoC agnostic and does not rely on early
SoC-specific calls)
- vexpress platform clocks initialisation moved earlier following the
arm64 move of of_clk_init() call in a previous commit
- Default DMA ops changed to non-coherent to preserve compatibility
with 32-bit ARM DT files. The "dma-coherent" property can be used
to explicitly mark a device coherent. The Applied Micro DT file
has been updated to avoid DMA cache maintenance for the X-Gene SATA
controller (the only arm64 related driver with such assumption in
-rc mainline)
- Fixmap correction for earlyprintk
- kern_addr_valid() fix for huge pages"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
vexpress: Initialise the sysregs before setting up the clocks
arm64: Mark the Applied Micro X-Gene SATA controller as DMA coherent
arm64: Use bus notifiers to set per-device coherent DMA ops
arm64: Make default dma_ops to be noncoherent
arm64: fixmap: fix missing sub-page offset for earlyprintk
arm64: Fix for the arm64 kern_addr_valid() function
of the framebuffer when early_ioremap() is no longer available and
dropping __init from functions that may be invoked after
free_initmem() - Dave Young
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTZIL0AAoJEC84WcCNIz1Vr9gP/RCHnmo9+w88ujYMjXtoq+/b
qDX/Fl8/as/gJ8cKhOVlQpC/t4VbC28mRkxV3J8NS/AklY0mU2R8TatprIyUoKAI
oPZwdSbuEIS8ehCr/D+6aAIGLtFYaLD8VK27niNHEHVytZytPqQGpDKARgphin5l
AqtEUv9NNfLaN/aHUuMV33xlD4r25BoWlj3RD2h+Rpnu2/vBXs14NTBN1r+SrLFh
r8htTDsbm3NjDCvboYyPJjnFZvlYqxtLCBC2vVD8fBvaXcBmj/vLP6WmFd3sxbTZ
4CLmRMShaqh87JH9gdg0m/xJ5sEgRqqvMiqjcaAuJzAew0eE6gUZjE9+fawWYHwT
XU0kcsM9wn/014f9fUdqaqM38o/XbnVcW+D5iSrwcx6hhNHzf7nFGnSndN2tednQ
k3z3tpX/GB9u5l0064Clru6GbSnV2cSfayaoIc4sULDrp7KBmyrlwBtsQ67C/JfV
0gJ4ridzbFllHBiw3Cyw8vzLDPgQ6t2DGw6RkzUpbMwLZG5YMRcyNODWewcTuH7g
VcMMaDKVw7uCrItFyTscMuUe1nVnbZANdLu9znF8TejgX1MzwwmdetqAE/WPR+3V
vZoYGNE5zAwGhqF34BLSof9BHoeOjucx1qgaV3QYhrdtgtTXaGf++TvwOhpCVNOC
vhUguxcrMLOM68He6o5H
=BzhM
-----END PGP SIGNATURE-----
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent
Pull EFI fix from Matt Fleming:
" * Fix earlyprintk=efi,keep support by switching to an ioremap() mapping
of the framebuffer when early_ioremap() is no longer available and
dropping __init from functions that may be invoked after
free_initmem() - Dave Young "
Signed-off-by: Ingo Molnar <mingo@kernel.org>
pte_ERROR() is not used anywhere, delete it.
For pgd_ERROR() and pmd_ERROR(), output something similar to x86, giving the address
of the pgd/pmd as well as it's value.
Also provide the caller, since these macros are invoked from pgd_clear_bad() and
pmd_clear_bad() which provides little context as to what high level operation was
occuring when the BAD state was detected.
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of returning false we should at least check the most basic
things, otherwise page table corruptions will be very difficult to
debug.
PMD and PTE tables are of size PAGE_SIZE, so none of the sub-PAGE_SIZE
bits should be set.
We also complement this with a check that the physical address the
pud/pmd points to is valid memory.
PowerPC was used as a guide while implementating this.
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit b2d4383480 ("sparc64: Make
PAGE_OFFSET variable."), the MAX_PHYS_ADDRESS_BITS value was increased
(to 47).
This constant reference to '41UL' was missed.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make get_user_insn() able to cope with huge PMDs.
Next, make do_fault_siginfo() more robust when get_user_insn() can't
actually fetch the instruction. In particular, use the MMU announced
fault address when that happens, instead of calling
compute_effective_address() and computing garbage.
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have a 32-bit task we must chop off the top 32-bits of the
64-bit value just as the cpu would.
Signed-off-by: David S. Miller <davem@davemloft.net>
The large PMD path needs to check _PAGE_VALID not _PAGE_PRESENT, to
decide if it needs to bail and return 0.
pmd_large() should therefore just check _PAGE_PMD_HUGE.
Calls to gup_huge_pmd() are guarded with a check of pmd_large(), so we
just need to add a valid bit check.
Signed-off-by: David S. Miller <davem@davemloft.net>
On sparc64 "present" and "valid" are seperate PTE bits, this allows us to
naturally distinguish between the user explicitly asking for PROT_NONE
with mprotect() and other situations.
However we weren't handling this properly in the huge PMD paths.
First of all, the page table walker in the TSB miss path only checks
for _PAGE_PMD_HUGE. So the generic pmdp_invalidate() would clear
_PAGE_PRESENT but the TLB miss paths would still load it into the TLB
as a valid huge PMD.
Fix this by clearing the valid bit in pmdp_invalidate(), and also
checking the valid bit in USER_PGTABLE_CHECK_PMD_HUGE using "brgez"
since _PAGE_VALID is bit 63 in both the sun4u and sun4v pte layouts.
Signed-off-by: David S. Miller <davem@davemloft.net>
This code was mistakenly using the exec bit from the PMD in all
cases, even when the PMD isn't a huge PMD.
If it's not a huge PMD, test the exec bit in the individual ptes down
in tlb_batch_pmd_scan().
Signed-off-by: David S. Miller <davem@davemloft.net>
Bring this code in line with the perf based generic NMI watchdog
in kernel/watchdog.c (which we should convert over to at some
point).
In particular, don't do anything super fancy when the watchdog
triggers, and specifically don't do a do_exit() which only makes
things worse.
Either panic(), or WARN(). The latter of which will do all of
the actions such as give us a stack backtrace.
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the default DMA ops for arm64 are non-coherent, mark the X-Gene
controller explicitly as dma-coherent to avoid additional cache
maintenance.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Loc Ho <lho@apm.com>
Recently, the default DMA ops have been changed to non-coherent for
alignment with 32-bit ARM platforms (and DT files). This patch adds bus
notifiers to be able to set the coherent DMA ops (with no cache
maintenance) for devices explicitly marked as coherent via the
"dma-coherent" DT property.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Currently arm64 dma_ops is by default made coherent which makes it
opposite in default policy from arm.
Make default dma_ops to be noncoherent (same as arm), as currently there
aren't any dma-capable drivers which assumes coherent ops
Signed-off-by: Ritesh Harjani <ritesh.harjani@gmail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Commit d57c33c5da (add generic fixmap.h) added (among other
similar things) set_fixmap_io to deal with early ioremap of devices.
More recently, commit bf4b558eba (arm64: add early_ioremap support)
converted the arm64 earlyprintk to use set_fixmap_io. A side effect of
this conversion is that my virtual machines have stopped booting when
I pass "earlyprintk=uart8250-8bit,0x3f8" to the guest kernel.
Turns out that the new earlyprintk code doesn't care at all about
sub-page offsets, and just assumes that the earlyprintk device will
be page-aligned. Obviously, that doesn't play well with the above example.
Further investigation shows that set_fixmap_io uses __set_fixmap instead
of __set_fixmap_offset. A fix is to introduce a set_fixmap_offset_io that
uses the latter, and to remove the superflous call to fix_to_virt
(which only returns the value that set_fixmap_io has already given us).
With this applied, my VMs are back in business. Tested on a Cortex-A57
platform with kvmtool as platform emulation.
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fix for the arm64 kern_addr_valid() function to recognize
virtual addresses in the kernel logical memory map. The
function fails as written because it does not check whether
the addresses in that region are mapped at the pmd level to
2MB or 512MB pages, continues the page table walk to the
pte level, and issues a garbage value to pfn_valid().
Tested on 4K-page and 64K-page kernels.
Signed-off-by: Dave Anderson <anderson@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull irq fixes from Thomas Gleixner:
"This udpate delivers:
- A fix for dynamic interrupt allocation on x86 which is required to
exclude the GSI interrupts from the dynamic allocatable range.
This was detected with the newfangled tablet SoCs which have GPIOs
and therefor allocate a range of interrupts. The MSI allocations
already excluded the GSI range, so we never noticed before.
- The last missing set_irq_affinity() repair, which was delayed due
to testing issues
- A few bug fixes for the armada SoC interrupt controller
- A memory allocation fix for the TI crossbar interrupt controller
- A trivial kernel-doc warning fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: irq-crossbar: Not allocating enough memory
irqchip: armanda: Sanitize set_irq_affinity()
genirq: x86: Ensure that dynamic irq allocation does not conflict
linux/interrupt.h: fix new kernel-doc warnings
irqchip: armada-370-xp: Fix releasing of MSIs
irqchip: armada-370-xp: implement the ->check_device() msi_chip operation
irqchip: armada-370-xp: fix invalid cast of signed value into unsigned variable
earlyprintk=efi,keep will cause kernel hangs while freeing initmem like
below:
VFS: Mounted root (ext4 filesystem) readonly on device 254:2.
devtmpfs: mounted
Freeing unused kernel memory: 880K (ffffffff817d4000 - ffffffff818b0000)
It is caused by efi earlyprintk use __init function which will be freed
later. Such as early_efi_write is marked as __init, also it will use
early_ioremap which is init function as well.
To fix this issue, I added early initcall early_efi_map_fb which maps
the whole efi fb for later use. OTOH, adding a wrapper function
early_efi_map which calls early_ioremap before ioremap is available.
With this patch applied efi boot ok with earlyprintk=efi,keep console=efi
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Pull x86 fixes from Peter Anvin:
"Two very small changes: one fix for the vSMP Foundation platform, and
one to help LLVM not choke on options it doesn't understand (although
it probably should)"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vsmp: Fix irq routing
x86: LLVMLinux: Wrap -mno-80387 with cc-option
the merge window.
- A fix from Oleg to async page faults.
- A bunch of small ARM changes.
- A trivial patch to use the new MSI-X API introduced during the merge
window.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJTY5anAAoJEBvWZb6bTYby6EQQAIWbOJCrLO3NQxwE9M7d8YvN
oviaLFv7vJh1vaXVo7SNjBRXTq4pzWrhg9rwWlHBg1KnxJ0/sc9Tn07Fe+0bxWDh
1XIFXNEkO9+Bpl43VnKGC7sbkYE9m3+jpGCWjF01vBCh+BY73wUOsPD0Zw9YQojN
TKBtiQEjb8avuoTUR0JSOTwLZw4DlDRmRLHkNwlqqvbPdvuIWI/LG2wFUvY7/eq8
dWxIPBjLKaIv2aUs9wGNNiz4Kb92uyH5L6bI6SK8VxphRA+51BOjMcBbzdY+Q1XL
c4CTaL9ybAyUi4SRv41qWnM09YbI1FayUW93k9xz/vEplXOHp5R/lyUdZETd/d83
GxaooTLcy9nOYeZ75buiH/0EG5HxI7On/QfUBEE3qIf8KfGgxb479HbRw6RnX4bf
EhQzf7eyZvvk43Xk3OYwq8Ux1SOiXQEo+8TpCSaM/KN57cJbjGB4GCUK6JX8qJCx
7MfXBdrhkAdw5V4lEBQMYKp4pdUdgYKRXavhLevm0qFjX1Swl6LIHxLtjFTKyX9S
Xfxi09J7EUs7SsI35pdlMtPQkklEUXE96S/W3RCEpR+OfgbVMkYkcQI8TGb7ib3l
xLNJrSgFDSlP5F3rN5SYIItAqboXb7iLp7SiF2ByXV43yexIrzTH0bwdwPwpZHhk
2ziVieX5WXEX4tgzZkRj
=+bLo
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
- Fix for a Haswell regression in nested virtualization, introduced
during the merge window.
- A fix from Oleg to async page faults.
- A bunch of small ARM changes.
- A trivial patch to use the new MSI-X API introduced during the merge
window.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: ARM: vgic: Fix the overlap check action about setting the GICD & GICC base address.
KVM: arm/arm64: vgic: fix GICD_ICFGR register accesses
KVM: async_pf: mm->mm_users can not pin apf->mm
KVM: ARM: vgic: Fix sgi dispatch problem
MAINTAINERS: co-maintainance of KVM/{arm,arm64}
arm: KVM: fix possible misalignment of PGDs and bounce page
KVM: x86: Check for host supported fields in shadow vmcs
kvm: Use pci_enable_msix_exact() instead of pci_enable_msix()
ARM: KVM: disable KVM in Kconfig on big-endian systems
Pull s390 fixes from Martin Schwidefsky:
"Two bug fixes, one to fix a potential information leak in the BPF jit
and common-io-layer fix for old firmware levels"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH
s390/chsc: fix SEI usage on old FW levels
One more place where we must not be able
to be preempted or to be interrupted in RT.
Always actually disable interrupts during
synchronization cycle.
Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the support of the GMAC has been merged, we're using it as the ethernet
controller on the A20 devices.
However, sunxi_defconfig wasn't selecting it hence breaking the NFS boot.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Pull parisc fixes from Helge Deller:
"Drop the architecture-specifc value for_STK_LIM_MAX to fix stack
related problems with GNU make"
* 'parisc-3.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Use generic uapi/asm/resource.h file
parisc: remove _STK_LIM_MAX override
There are only a couple of architectures that override _STK_LIM_MAX to
a non-infinity value. This changes the stack allocation semantics in
subtle ways. For example, GNU make changes its stack allocation to the
hard maximum defined by _STK_LIM_MAX. As a results, threads executed
by processes running under make are allocated a stack size of
_STK_LIM_MAX rather than a sensible default value. This causes various
thread stress tests to fail when they can't muster more than about 50
threads.
The attached change implements the default behavior used by the
majority of architectures.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Reviewed-by: Carlos O'Donell <carlos@systemhalted.org>
Cc: stable@vger.kernel.org # 3.14
Signed-off-by: Helge Deller <deller@gmx.de>
Commit 93ea02bb84 ("arch: Clean up asm/barrier.h implementations")
wired generic barrier.h for hexagon, but failed to delete the existing
file.
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Compile-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, plus an Intel RAPL PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tests x86: Fix stack map lookup in dwarf unwind test
perf x86: Fix perf to use non-executable stack, again
perf tools: Remove extra '/' character in events file path
perf machine: Search for modules in %s/lib/modules/%s
perf tests: Add static build make test
perf tools: Fix bfd dependency libraries detection
perf tools: Use LDFLAGS instead of ALL_LDFLAGS
perf/x86: Fix RAPL rdmsrl_safe() usage
tools lib traceevent: Fix memory leak in pretty_print()
tools lib traceevent: Fix backward compatibility macros for pevent filter enums
perf tools: Disable libdw unwind for all but x86 arch
perf tests x86: Fix memory leak in sample_ustack()
Includes vgic fixes, a possible kernel corruption bug due to
misalignment of pages and disabling of KVM in KConfig on big-endian
systems, because the last one breaks the build.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJTYNiqAAoJEEtpOizt6ddy4NwH/3ZVN7sgC5vKiKEf0n5wNdN2
zCMNOnjKfaZN7dUval3eT3qF6h0emDqW5pOFstHwoFvuMAFauLMWPQCbU1m+bl3K
gD745kVniLKGHyE4rEwOiUNEiYGbiP44DeC1oGlirSiGNptMQjeAi3dhEtJpedES
xtn3jY26bWrIdOZ75/pvFix2qE8CXmRJU2oEvsZ0B5gGkqsblrlcY+ascot4Rm8t
M88SAhGs6pzMWpjfOOm55E2BXISQw18KMzETRWZgmmYgYQOaR2sH0USwQuI/Uhvx
1UZBZSYz3KYEx3kxKnXyS7qZyWQOY8p+y487Ty9VTlzuat2gxXH9TMMA39ZIGak=
=gpor
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-3.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
First round of KVM/ARM Fixes for 3.15
Includes vgic fixes, a possible kernel corruption bug due to
misalignment of pages and disabling of KVM in KConfig on big-endian
systems, because the last one breaks the build.
There was a very small race window where resume to kernel mode from a
Exception Path (or pure kernel mode which is true for most of ARC
exceptions anyways), was not disabling interrupts in restore_regs,
clobbering the exception regs
Anton found the culprit call flow (after many sleepless nights)
| 1. we got a Trap from user land
| 2. started to service it.
| 3. While doing some stuff on user-land memory (I think it is padzero()),
| we got a DataTlbMiss
| 4. On return from it we are taking "resume_kernel_mode" path
| 5. NEED_RESHED is not set, so we go to "return from exception" path in
| restore regs.
| 6. there seems to be IRQ happening
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: <stable@vger.kernel.org> #3.10, 3.12, 3.13, 3.14
Cc: Anton Kolesov <Anton.Kolesov@synopsys.com>
Cc: Francois Bedard <Francois.Bedard@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is another great example of trainwreck engineering:
commit 2646a0e529 (ARM: edma: Add EDMA crossbar event mux support)
added support for using EDMA on peripherals which have no direct EDMA
event mapping.
The code compiles and does not explode in your face, but that's it.
1) Reading an u16 array from an u32 device tree array simply does not
work. Even if the function is named "edma_of_read_u32_to_s16_array".
It merily calls of_property_read_u16_array. So the resulting 16bit
array will have every other entry = 0.
2) The DT entry for the xbar registers related to xbar has length 0x10
instead of the real length: 0xfd0 - 0xf90 = 0x40.
Not a real problem as it does not cross a page boundary, but
wrong nevertheless.
3) But none of this matters as the mapping never happens:
After reading nonsense edma_of_read_u32_to_s16_array() invalidates
the first array entry pair, so nobody can ever notice the
braindamage by immediate explosion.
Seems the QA criteria for this code was solely not to explode when
someone adds edma-xbar-event-map entries to the DT. Goal achieved,
congratulations!
Not really helpful if someone wants to use edma on a device which
requires a xbar mapping.
Fix the issues by:
- annotating the device tree entry with "/bits/ 16" as documented in
the of_property_read_u16_array kernel doc
- make the size of the xbar register mapping correct
- invalidating the end of the array and not the start
This convoluted mess wants to be completely rewritten as there is no
point to keep the xbar_chan array memory and the iomapping of the xbar
regs around forever. Marking the xbar mapped channels as used should
be done right there.
But that's a different issue and this patch is small enough to make it
work and allows a simple backport for stable.
Cc: stable@vger.kernel.org # v3.12+
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that is
not available at device creation time. This is a problem causing
mainline to fail on a number of ARM platforms.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTXr8WAAoJEMWQL496c2LNhzgP/RhhIS9ySzJkdPEMksnBaQbN
JKG+wBwlzvtJSPw2xX5EpWweOd1PDqHxLjjWu7q5WIrQQyOGRAELkYg1H+5ksFo1
obCAFOjTC7VXPGc+BX0dLN6Eq4UHP8k3Ui8zlMSZQNt+gBvYo1gX2ulR8sQgS8rF
xw+8iNDA2GIFusQbci35yOAhp6yo/ble3KIeR5dFKeMpiVSh9rqdLS+H9HgnG5Sw
G2LUrAeRY2yQbow3c1001HiKrJieZZSTCq8oiZDgJZY5+Qk8zamyj2xxgMoJUA8Q
2Kcb3379vNvDgjYX61NiwyrhIbJEiao8IXq6WMDNbmH3+FUwogrnh/A/g5POtX0N
JV5tGkc/2lfm5oBCFSzcw661XjLysPsm3nPLhnxRL/dlBVkl0IZzTVYPTqUtoKlK
NfwKFUDgg+QVmhFQT0BDCyOGAcRsz4rNEPPB6M12OmEKgxGpetfUaFdBcJqSNaxf
Ac6AAZCXrkLlRNOVhK333ILPxv1JRC++JAeRKNwHr9e6j2aJjUgilSuzpHJ5Lkyo
sj6kp+n1n+giYlgc9GjM7b33tm8XEd4rbcgwCWwaypEsl4FIwPs9xMys7md8SOCE
AAnVPVUvLGC9O2GhldeQlOgtGWzILW3zJXb2mrMIIGyLBGBx1qr3VJIZpxoddTc0
UtIJn0X37mKY/R4/hWs4
=B2JT
-----END PGP SIGNATURE-----
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree bug fixes from Grant Likely:
"These are some important bug fixes that need to get into v3.15.
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common
usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that
is not available at device creation time. This is a problem
causing mainline to fail on a number of ARM platforms"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
of/irq: do irq resolution in platform_get_irq
of: selftest: add deferred probe interrupt test
dt: Fix binding typos in clock-names and interrupt-names
Pull powerpc fixes from Ben Herrenschmidt:
"Here is a bunch of post-merge window fixes that have been accumulating
in patchwork while I was on vacation or buried under other stuff last
week.
We have the now usual batch of LE fixes from Anton (sadly some new
stuff that went into this merge window had endian issues, we'll try to
make sure we do better next time)
Some fixes and cleanups to the new 24x7 performance monitoring stuff
(mostly typos and cleaning up printk's)
A series of fixes for an issue with our runlatch bit, which wasn't set
properly for offlined threads/cores and under KVM, causing potentially
some counters to misbehave along with possible power management
issues.
A fix for kexec nasty race where the new kernel wouldn't "see" the
secondary processors having reached back into firmware in time.
And finally a few other misc (and pretty simple) bug fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (33 commits)
powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
ppc/kvm: Clear the runlatch bit of a vcpu before napping
ppc/kvm: Set the runlatch bit of a CPU just before starting guest
ppc/powernv: Set the runlatch bits correctly for offline cpus
powerpc/pseries: Protect remove_memory() with device hotplug lock
powerpc: Fix error return in rtas_flash module init
powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
powerpc: Bump COMMAND_LINE_SIZE to 2048
powerpc: Rename duplicate COMMAND_LINE_SIZE define
powerpc/perf/hv-24x7: Catalog version number is be64, not be32
powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
powerpc/perf/hv-gpci: Make device attr static
powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
powerpc/powernv: Fix little endian issues in OPAL dump code
powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
powerpc/powernv: Fix little endian issues in OPAL error log code
powerpc/powernv: Fix little endian issues with opal_do_notifier calls
...
For some reason, the base address of the fifth I2C adapter in the A20 was
incorrect. Change this to the actual base address.
Reported-by: Marcus Cooper <codekipper@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
The kvm/mmu code shared by arm and arm64 uses kalloc() to allocate
a bounce page (if hypervisor init code crosses page boundary) and
hypervisor PGDs. The problem is that kalloc() does not guarantee
the proper alignment. In the case of the bounce page, the page sized
buffer allocated may also cross a page boundary negating the purpose
and leading to a hang during kvm initialization. Likewise the PGDs
allocated may not meet the minimum alignment requirements of the
underlying MMU. This patch uses __get_free_page() to guarantee the
worst case alignment needs of the bounce page and PGDs on both arm
and arm64.
Cc: <stable@vger.kernel.org> # 3.10+
Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
On x86 the allocation of irq descriptors may allocate interrupts which
are in the range of the GSI interrupts. That's wrong as those
interrupts are hardwired and we don't have the irq domain translation
like PPC. So one of these interrupts can be hooked up later to one of
the devices which are hard wired to it and the io_apic init code for
that particular interrupt line happily reuses that descriptor with a
completely different configuration so hell breaks lose.
Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs,
except for a few usage sites which have not yet blown up in our face
for whatever reason. But for drivers which need an irq range, like the
GPIO drivers, we have no limit in place and we don't want to expose
such a detail to a driver.
To cure this introduce a function which an architecture can implement
to impose a lower bound on the dynamic interrupt allocations.
Implement it for x86 and set the lower bound to nr_gsi_irqs, which is
the end of the hardwired interrupt space, so all dynamic allocations
happen above.
That not only allows the GPIO driver to work sanely, it also protects
the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and
htirq code. They need to be cleaned up as well, but that's a separate
issue.
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Krogerus Heikki <heikki.krogerus@intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
virt_to_pfn has been defined in asm/memory.h by the commit e26a9e0 "ARM: Better
virt_to_page() handling"
This will result of a compilation warning when CONFIG_XEN is enabled.
arch/arm/include/asm/xen/page.h:80:0: warning: "virt_to_pfn" redefined [enabled by default]
#define virt_to_pfn(v) (PFN_DOWN(__pa(v)))
^
In file included from arch/arm/include/asm/page.h:163:0,
from arch/arm/include/asm/xen/page.h:4,
from include/xen/page.h:4,
from arch/arm/xen/grant-table.c:33:
The definition in memory.h is nearly the same (it directly expand PFN_DOWN),
so we can safely drop virt_to_pfn in xen include.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
We track shadow vmcs fields through two static lists,
one for read only and another for r/w fields. However, with
addition of new vmcs fields, not all fields may be supported on
all hosts. If so, copy_vmcs12_to_shadow() trying to vmwrite on
unsupported hosts will result in a vmwrite error. For example, commit
36be0b9deb introduced GUEST_BNDCFGS, which is not supported
by all processors. Filter out host unsupported fields before
letting guests use shadow vmcs
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Correct IRQ routing in case a vSMP box is detected
but the Interrupt Routing Comply (IRC) value is set to
"comply", which leads to incorrect IRQ routing.
Before the patch:
When a vSMP box was detected and IRC was set to "comply",
users (and the kernel) couldn't effectively set the
destination of the IRQs. This is because the hook inside
vsmp_64.c always setup all CPUs as the IRQ destination using
cpumask_setall() as the return value for IRQ allocation mask.
Later, this "overrided" mask caused the kernel to set the IRQ
destination to the lowest online CPU in the mask (CPU0 usually).
After the patch:
When the IRC is set to "comply", users (and the kernel) can control
the destination of the IRQs as we will not be changing the
default "apic->vector_allocation_domain".
Signed-off-by: Oren Twaig <oren@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Link: http://lkml.kernel.org/r/1398669697-2123-1-git-send-email-oren@scalemp.com
[ Minor readability edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes this section mismatch:
WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from
the function apm821xx_pciex_init_port_hw() to the function
.init.text:ppc4xx_pciex_wait_on_sdr.isra.9()
The function apm821xx_pciex_init_port_hw() references the function
__init ppc4xx_pciex_wait_on_sdr.isra.9(). This is often because
apm821xx_pciex_init_port_hw lacks a __init annotation or the
annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong.
apm821xx_pciex_init_port_hw is only referenced by a struct in
__initdata, so it should be safe to add __init to
apm821xx_pciex_init_port_hw.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When the guest cedes the vcpu or the vcpu has no guest to
run it naps. Clear the runlatch bit of the vcpu before
napping to indicate an idle cpu.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The secondary threads in the core are kept offline before launching guests
in kvm on powerpc: "371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use
SMT processor modes."
Hence their runlatch bits are cleared. When the secondary threads are called
in to start a guest, their runlatch bits need to be set to indicate that they
are busy. The primary thread has its runlatch bit set though, but there is no
harm in setting this bit once again. Hence set the runlatch bit for all
threads before they start guest.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Up until now we have been setting the runlatch bits for a busy CPU and
clearing it when a CPU enters idle state. The runlatch bit has thus
been consistent with the utilization of a CPU as long as the CPU is online.
However when a CPU is hotplugged out the runlatch bit is not cleared. It
needs to be cleared to indicate an unused CPU. Hence this patch has the
runlatch bit cleared for an offline CPU just before entering an idle state
and sets it immediately after it exits the idle state.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
While testing memory hot-remove, I found following dead lock:
Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
wait_event(root->deactivate_waitq,
atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);
Process #1120 is trying to online memory499 by
echo 1 > memory499/online
In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. While itself is blocked later
when trying to acquire memory_hotplug_mutex, which is held by process
The backtrace of both processes are shown below:
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():
* NOTE: The caller must call lock_device_hotplug() to serialize hotplug
* and online/offline operations before this call, as required by
* try_offline_node().
*/
void __ref remove_memory(int nid, u64 start, u64 size)
With this lock held, the other process(#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally find No such device error.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
module_init should return 0 or a negative errno.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the
kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I've had a report that the current limit is too small for
an automated network based installer. Bump it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have two definitions of COMMAND_LINE_SIZE, one for the kernel
and one for the boot wrapper. I assume this is so the boot
wrapper can be self sufficient and not rely on kernel headers.
Having two defines with the same name is confusing, I just
updated the wrong one when trying to bump it.
Make the boot wrapper define unique by calling it
BOOT_COMMAND_LINE_SIZE.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The catalog version number was changed from a be32 (with proceeding
32bits of padding) to a be64, update the code to treat it as a be64
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
fixup for "powerpc/perf: Add support for the hv gpci (get performance
counter info) interface".
Makes the "not enabled" message less awful (and hidden unless
debugging).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
fixup for "powerpc/perf: Add support for the hv 24x7 interface"
Makes the "not enabled" message less awful (and hides it in most cases).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The if condition check was based on a draft ISA doc. Remove the same.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.
The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix little endian issues with the OPAL error log code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The bitmap in opal_poll_events and opal_handle_interrupt is
big endian, so we need to byteswap it on little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We had some duplication of the internal OPAL functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.
This patch release the unnecessary refcount.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With this patch I was able to update firmware on an LE kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have a subtle race when sending CPUs back to OPAL on kexec.
We mark them as "in real mode" right before we send them down. Once
we've booted the new kernel, it might try to call opal_reinit_cpus()
to change endianness, and that requires all CPUs to be spinning inside
OPAL.
However there is no synchronization here and we've observed cases
where the returning CPUs hadn't established their new state inside
OPAL before opal_reinit_cpus() is called, causing it to fail.
The proper fix is to actually wait for them to go down all the way
from the kexec'ing kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The size of the sysparam sysfs files is determined from the device tree
at boot. However the buffer is hard coded to 64 bytes. If we encounter a
parameter that is larger than 64, or miss-parse the device tree, the
buffer will overflow when reading or writing to the parameter.
Check it at discovery time, and if the parameter is too large, do not
create a sysfs entry for it.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The sysparam code currently uses the userspace supplied number of
bytes when memcpy()ing in to a local 64-byte buffer.
Limit the maximum number of bytes by the size of the buffer.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The OPAL calls are returning int64_t values, which the sysparam code
stores in an int, and the sysfs callback returns ssize_t. Make code a
easier to read by consistently using ssize_t.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When a sysparam query in OPAL returned a negative value (error code),
sysfs would spew out a decent chunk of memory; almost 64K more than
expected. This was traced to a sign/unsigned mix up in the OPAL sysparam
sysfs code at sys_param_show.
The return value of sys_param_show is a ssize_t, calculated using
return ret ? ret : attr->param_size;
Alan Modra explains:
"attr->param_size" is an unsigned int, "ret" an int, so the overall
expression has type unsigned int. Result is that ret is cast to
unsigned int before being cast to ssize_t.
Instead of using the ternary operator, set ret to the param_size if an
error is not detected. The same bug exists in the sysfs write callback;
this patch fixes it in the same way.
A note on debugging this next time: on my system gcc will warn about
this if compiled with -Wsign-compare, which is not enabled by -Wall,
only -Wextra.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit 41dd03a9 may cause Oops in rtas_stop_self().
The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.
So the patch moves rtas_args away from stack by adding static before
it.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit aac416fc38 (lkdtm: flush icache and report actions) calls
flush_icache_range from a module. It's exported on most architectures
that implement it, but not on powerpc. This patch exports it to fix
the module link failure.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This merges the patch to fix possible loss of dirty bit on munmap() or
madvice(DONTNEED). If there are concurrent writers on other CPU's that
have the unmapped/unneeded page in their TLBs, their writes to the page
could possibly get lost if a third CPU raced with the TLB flush and did
a page_mkclean() before the page was fully written.
Admittedly, if you unmap() or madvice(DONTNEED) an area _while_ another
thread is still busy writing to it, you deserve all the lost writes you
could get. But we kernel people hold ourselves to higher quality
standards than "crazy people deserve to lose", because, well, we've seen
people do all kinds of crazy things.
So let's get it right, just because we can, and we don't have to worry
about it.
* safe-dirty-tlb-flush:
mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts
Pull arm fixes from Russell King:
"A number of fixes for the PJ4/iwmmxt changes which arm-soc forced me
to take during the merge window. This stuff should have been better
tested and sorted out *before* the merge window"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8042/1: iwmmxt: allow to build iWMMXt on Marvell PJ4B
ARM: 8041/1: pj4: fix cpu_is_pj4 check
ARM: 8040/1: pj4: properly detect existence of iWMMXt coprocessor
ARM: 8039/1: pj4: enable iWMMXt only if CONFIG_IWMMXT is set
ARM: 8038/1: iwmmxt: explicitly check for supported architectures
Pull irq fixes from Thomas Gleixner:
"A slighlty large fix for a subtle issue in the CPU hotplug code of
certain ARM SoCs, where the not yet online cpu needs to setup the cpu
local timer and needs to set the interrupt affinity to itself.
Setting interrupt affinity to a not online cpu is prohibited and
therefor the timer interrupt ends up on the wrong cpu, which leads to
nasty complications.
The SoC folks tried to hack around that in the SoC code in some more
than nasty ways. The proper solution is to have a way to enforce the
affinity setting to a not online cpu. The core patch to the genirq
code provides that facility and the follow up patches make use of it
in the GIC interrupt controller and the exynos timer driver.
The change to the core code has no implications to existing users,
except for the rename of the locked function and therefor the
necessary fixup in mips/cavium. Aside of that, no runtime impact is
possible, as none of the existing interrupt chips implements anything
which depends on the force argument of the irq_set_affinity()
callback"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Exynos_mct: Register clock event after request_irq()
clocksource: Exynos_mct: Use irq_force_affinity() in cpu bringup
irqchip: Gic: Support forced affinity setting
genirq: Allow forcing cpu affinity of interrupts