Add the new helper filter_chain(). Currently it is only a placeholder;
the comment explains what it should do. We will change it later to
consult every consumer to decide whether we need to install the swbp.
Until then it works as if any consumer returns true, which matches the
current behavior.
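A minimal sketch of such a placeholder, assuming the final version will
walk uprobe->consumers (the body here is an illustration, not the patch
itself); until filtering is implemented it simply reports whether any
consumer is registered:

	static bool filter_chain(struct uprobe *uprobe)
	{
		/* TODO: consult every consumer to decide about the swbp */
		return uprobe->consumers != NULL;
	}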
Change install_breakpoint() to call filter_chain() instead of checking
uprobe->consumers != NULL. We obviously need this, and it also closes
the race with _unregister().
Change remove_breakpoint() to call this helper too. Currently this is
pointless because remove_breakpoint() is only called when the last
consumer goes away, but we will change this.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
uprobe_consumer->filter() is pointless in its current form, kill it.
We will add it back, but with a different signature/semantics. Perhaps
we will even re-introduce the callsite in handler_chain(), but not to
just skip uc->handler().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
register/unregister verifies that inode/uc != NULL. For what?
This really looks like "hide the potential problem"; the caller
should pass valid data.
register() also checks uc->next == NULL, probably to prevent
double-registration, but the caller can do other stupid/wrong things.
If we do this check, then we should document that uc->next should
be cleared before register() and add a BUG_ON().
Also add a small comment about the i_size_read() check.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cosmetic. __set_bit(UPROBE_SKIP_SSTEP) is part of initialization;
it is not clear why it is set in insert_uprobe().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
We have received multiple reports of mmap failures when running with a
2:2 vm split. These manifest as either -EINVAL with a non-page-aligned
address (ending 0xaaa) or a SEGV, depending on the application. The
issue is commonly observed in children of make, which appears to use
bottom-up mmap (presumably because it changes the stack rlimit).
Further investigation reveals that this regression was triggered by
394ef6403a ("mm: use vm_unmapped_area() on arm architecture"), whereby
TASK_UNMAPPED_BASE is no longer page-aligned for bottom-up mmap, causing
get_unmapped_area to choke on misaligned addresses.
This patch fixes the problem by defining TASK_UNMAPPED_BASE in terms of
TASK_SIZE and explicitly aligning the result to 16M, matching the other
end of the heap.
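A sketch of the described fix, assuming the usual ALIGN()/SZ_16M
helpers (the exact expression is an illustration):

	/* derive from TASK_SIZE and align to 16MiB, like the other end of the heap */
	#define TASK_UNMAPPED_BASE	ALIGN(TASK_SIZE / 3, SZ_16M)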
Acked-by: Nicolas Pitre <nico@linaro.org>
Reported-by: Steve Capper <steve.capper@arm.com>
Reported-by: Jean-Francois Moine <moinejf@free.fr>
Reported-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Realview fails to boot with this warning:
BUG: spinlock lockup suspected on CPU#0, init/1
lock: 0xcf8bde10, .magic: dead4ead, .owner: init/1, .owner_cpu: 0
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:cf8bde10 r5:cf83d1c0 r4:cf8bde10 r3:cf83d1c0
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c018926c>] (spin_dump+0x84/0x98)
[<c01891e8>] (spin_dump+0x0/0x98) from [<c0189460>] (do_raw_spin_lock+0x100/0x198)
[<c0189360>] (do_raw_spin_lock+0x0/0x198) from [<c032cbac>] (_raw_spin_lock+0x3c/0x44)
[<c032cb70>] (_raw_spin_lock+0x0/0x44) from [<c01c9224>] (pl011_console_write+0xe8/0x11c)
[<c01c913c>] (pl011_console_write+0x0/0x11c) from [<c002aea8>] (call_console_drivers.clone.7+0xdc/0x104)
[<c002adcc>] (call_console_drivers.clone.7+0x0/0x104) from [<c002b320>] (console_unlock+0x2e8/0x454)
[<c002b038>] (console_unlock+0x0/0x454) from [<c002b8b4>] (vprintk_emit+0x2d8/0x594)
[<c002b5dc>] (vprintk_emit+0x0/0x594) from [<c0329718>] (printk+0x3c/0x44)
[<c03296dc>] (printk+0x0/0x44) from [<c002929c>] (warn_slowpath_common+0x28/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0070ab0>] (lockdep_trace_alloc+0xd8/0xf0)
[<c00709d8>] (lockdep_trace_alloc+0x0/0xf0) from [<c00c0850>] (kmem_cache_alloc+0x24/0x11c)
[<c00c082c>] (kmem_cache_alloc+0x0/0x11c) from [<c00bb044>] (__get_vm_area_node.clone.24+0x7c/0x16c)
[<c00bafc8>] (__get_vm_area_node.clone.24+0x0/0x16c) from [<c00bb7b8>] (get_vm_area_caller+0x48/0x54)
[<c00bb770>] (get_vm_area_caller+0x0/0x54) from [<c0020064>] (__alloc_remap_buffer.clone.15+0x38/0xb8)
[<c002002c>] (__alloc_remap_buffer.clone.15+0x0/0xb8) from [<c0020244>] (__dma_alloc+0x160/0x2c8)
[<c00200e4>] (__dma_alloc+0x0/0x2c8) from [<c00204d8>] (arm_dma_alloc+0x88/0xa0)
[<c0020450>] (arm_dma_alloc+0x0/0xa0) from [<c00beb00>] (dma_pool_alloc+0xcc/0x1a8)
[<c00bea34>] (dma_pool_alloc+0x0/0x1a8) from [<c01a9d14>] (pl08x_fill_llis_for_desc+0x28/0x568)
[<c01a9cec>] (pl08x_fill_llis_for_desc+0x0/0x568) from [<c01aab8c>] (pl08x_prep_slave_sg+0x258/0x3b0)
[<c01aa934>] (pl08x_prep_slave_sg+0x0/0x3b0) from [<c01c9f74>] (pl011_dma_tx_refill+0x140/0x288)
[<c01c9e34>] (pl011_dma_tx_refill+0x0/0x288) from [<c01ca748>] (pl011_start_tx+0xe4/0x120)
[<c01ca664>] (pl011_start_tx+0x0/0x120) from [<c01c54a4>] (__uart_start+0x48/0x4c)
[<c01c545c>] (__uart_start+0x0/0x4c) from [<c01c632c>] (uart_start+0x2c/0x3c)
[<c01c6300>] (uart_start+0x0/0x3c) from [<c01c795c>] (uart_write+0xcc/0xf4)
[<c01c7890>] (uart_write+0x0/0xf4) from [<c01b0384>] (n_tty_write+0x1c0/0x3e4)
[<c01b01c4>] (n_tty_write+0x0/0x3e4) from [<c01acfe8>] (tty_write+0x144/0x240)
[<c01acea4>] (tty_write+0x0/0x240) from [<c01ad17c>] (redirected_tty_write+0x98/0xac)
[<c01ad0e4>] (redirected_tty_write+0x0/0xac) from [<c00c371c>] (vfs_write+0xbc/0x150)
[<c00c3660>] (vfs_write+0x0/0x150) from [<c00c39c0>] (sys_write+0x4c/0x78)
[<c00c3974>] (sys_write+0x0/0x78) from [<c0014460>] (ret_fast_syscall+0x0/0x3c)
This happens because the DMA allocation code is not respecting atomic
allocations correctly.
GFP flags should not be tested for GFP_ATOMIC to determine if an
atomic allocation is being requested. GFP_ATOMIC is not a flag but
a value. The GFP bitmask flags are all prefixed with __GFP_.
The rest of the kernel tests for __GFP_WAIT not being set to indicate
an atomic allocation. We need to do the same.
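For illustration, the test the rest of the kernel uses
(is_atomic_alloc() is a made-up name for this sketch):

	static inline bool is_atomic_alloc(gfp_t gfp)
	{
		/* GFP_ATOMIC is a value, not a single bit: never test gfp == GFP_ATOMIC */
		return !(gfp & __GFP_WAIT);
	}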
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Realview EB with a rev B MPcore tile results in lots of warnings at
boot because it can't allocate enough IRQs. Fix this by increasing
the number of available IRQs.
WARNING: at /home/rmk/git/linux-rmk/arch/arm/common/gic.c:757 gic_init_bases+0x12c/0x2ec()
Cannot allocate irq_descs @ IRQ96, assuming pre-allocated
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000002f5 r5:c042c62c r4:c044ff40 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029384>] (warn_slowpath_fmt+0x38/0x40)
[<c002934c>] (warn_slowpath_fmt+0x0/0x40) from [<c042c62c>] (gic_init_bases+0x12c/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1c ]---
------------[ cut here ]------------
WARNING: at /home/rmk/git/linux-rmk/kernel/irq/irqdomain.c:234 irq_domain_add_legacy+0x80/0x140()
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000000ea r5:c0081a38 r4:00000000 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c0081a38>] (irq_domain_add_legacy+0x80/0x140)
[<c00819b8>] (irq_domain_add_legacy+0x0/0x140) from [<c042c64c>] (gic_init_bases+0x14c/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1d ]---
------------[ cut here ]------------
WARNING: at /home/rmk/git/linux-rmk/arch/arm/common/gic.c:762 gic_init_bases+0x170/0x2ec()
Modules linked in:
Backtrace:
[<c00185d8>] (dump_backtrace+0x0/0x10c) from [<c03294e8>] (dump_stack+0x18/0x1c) r6:000002fa r5:c042c670 r4:00000000 r3:c045f240
[<c03294d0>] (dump_stack+0x0/0x1c) from [<c00292c8>] (warn_slowpath_common+0x54/0x6c)
[<c0029274>] (warn_slowpath_common+0x0/0x6c) from [<c0029304>] (warn_slowpath_null+0x24/0x2c)
[<c00292e0>] (warn_slowpath_null+0x0/0x2c) from [<c042c670>] (gic_init_bases+0x170/0x2ec)
[<c042c500>] (gic_init_bases+0x0/0x2ec) from [<c042cdc8>] (gic_init_irq+0x8c/0xd8)
[<c042cd3c>] (gic_init_irq+0x0/0xd8) from [<c042827c>] (init_IRQ+0x1c/0x24)
[<c0428260>] (init_IRQ+0x0/0x24) from [<c04256c8>] (start_kernel+0x1a4/0x300)
[<c0425524>] (start_kernel+0x0/0x300) from [<70008070>] (0x70008070)
---[ end trace 1b75b31a2719ed1e ]---
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Punit Agrawal reports:
> I was trying to boot 3.8-rc5 on Realview EB 11MPCore using
> realview-smp_defconfig as a starting point but the kernel failed to
> progress past the log below (config attached).
>
> Pawel suggested I try reverting 384a290283 - "ARM: gic: use a private
> mapping for CPU target interfaces" that you've authored. With this
> commit reverted the kernel boots.
>
> I am not quite sure why the commit breaks 11MPCore but Pawel (cc'd)
> might be able to shed light on that.
Some early GIC implementations return zero for the first distributor
CPU routing register. This means we can't rely on that telling us
which CPU interface we're connected to. We know that these platforms
implement PPIs for IRQs 29-31 - but we shouldn't assume that these
will always be populated.
So, instead, scan for a non-zero CPU routing register in the first
32 IRQs and use that as our CPU mask.
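A sketch of that scan (helper and register names follow the GIC driver
of this era and should be treated as assumptions): each GIC_DIST_TARGET
word covers four interrupts, so reading byte offsets 0..28 covers IRQs
0-31, and the first non-zero routing byte is our CPU mask.

	static u8 gic_get_cpumask(struct gic_chip_data *gic)
	{
		void __iomem *base = gic_data_dist_base(gic);
		u32 i, mask = 0;

		for (i = 0; i < 32; i += 4) {
			mask = readl_relaxed(base + GIC_DIST_TARGET + i);
			mask |= mask >> 16;	/* fold the four target bytes... */
			mask |= mask >> 8;	/* ...down into the low byte */
			if (mask)
				break;
		}
		return mask;
	}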
Reported-by: Punit Agrawal <punit.agrawal@arm.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pull drm regression fix from Dave Airlie:
"This one fixes a sleep while locked regression that was introduced
earlier in 3.8."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/ttm: fix fence locking in ttm_buffer_object_transfer, 2nd try
In commit 6509141f9c ("usbnet: add new flag FLAG_NOARP for usb net
devices"), the newly added flag NOARP was using an already defined
value, which broke drivers using the MULTI_PACKET flag.
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Savchenko reported a DNS failure and we diagnosed that
some UDP sockets were unable to send more packets because their
sk_wmem_alloc was corrupted after a while (tx_queue column in
following trace)
$ cat /proc/net/udp
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
...
459: 00000000:0270 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4507 2 ffff88003d612380 0
466: 00000000:0277 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4802 2 ffff88003d613180 0
470: 076A070A:007B 00000000:0000 07 FFFF4600:00000000 00:00000000 00000000 123 0 5552 2 ffff880039974380 0
470: 010213AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4986 2 ffff88003dbd3180 0
470: 010013AC:007B 00000000:0000 07 00000000:00000000 00:00000000 00000000 0 0 4985 2 ffff88003dbd2e00 0
470: 00FCA8C0:007B 00000000:0000 07 FFFFFB00:00000000 00:00000000 00000000 0 0 4984 2 ffff88003dbd2a80 0
...
Playing with skb->truesize is tricky, especially when
the skb is attached to a socket, as we can fool memory charging.
Just remove this code; it's not worth trying to be ultra-precise
in the xmit path.
Reported-by: Andrew Savchenko <bircoph@gmail.com>
Tested-by: Andrew Savchenko <bircoph@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For sensitive data like keying material, it is common practice to zero
out keys before returning the memory back to the allocator. Thus, use
kzfree instead of kfree.
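For example (key is a hypothetical buffer holding keying material):

	/* before: the key bytes linger in freed memory until it is reused */
	kfree(key);

	/* after: kzfree() zeroes the buffer before returning it to the allocator */
	kzfree(key);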
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Gross says:
====================
One bug fix for net/3.8 for a long standing problem that was reported a few
times recently.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ian Campbell says:
====================
The Xen netback implementation contains a couple of flaws which can
allow a guest to cause a DoS in the backend domain, potentially
affecting other domains in the system.
CVE-2013-0216 is a failure to sanity check the ring producer/consumer
pointers which can allow a guest to cause netback to loop for an
extended period preventing other work from occurring.
CVE-2013-0217 is a memory leak on an error path which is guest
triggerable.
The following series contains the fixes for these issues, as previously
included in Xen Security Advisory 39:
http://lists.xen.org/archives/html/xen-announce/2013-02/msg00001.html
Changes in v2:
- Typo and block comment format fixes
- Added stable Cc
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthew Daley <mattjd@gmail.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A buggy or malicious frontend should not be able to confuse netback.
If we spot anything which is not as it should be then shut down the
device and don't try to continue with the ring in a potentially
hostile state. Well behaved and non-hostile frontends will not be
penalised.
As well as making the existing checks for such errors fatal, also add a
new check that ensures that there isn't an insane number of requests
on the ring (i.e. more than would fit in the ring). If the ring
contains garbage then previously it was possible to loop over this
insane number, getting an error each time and therefore not generating
any more pending requests, and therefore not exiting the loop in
xen_netbk_tx_build_gops for an extended period.
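A sketch of such a check, with field names assumed rather than taken
from the patch: the producer/consumer delta of a sane ring can never
exceed the ring size, so anything larger is treated as a fatal error.

	RING_IDX prod = netbk->tx.sring->req_prod;

	if (prod - netbk->tx.req_cons > XEN_NETIF_TX_RING_SIZE) {
		netdev_err(vif->dev, "impossible number of requests on ring\n");
		/* shut the device down instead of looping over garbage */
		netbk_fatal_tx_err(vif);
	}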
Also turn various netdev_dbg calls which now precipitate a fatal error
into netdev_err; they are rate limited because the device is shut down
afterwards.
This fixes at least one known DoS/softlockup of the backend domain.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull IB regression fixes from Roland Dreier:
- Fix mlx4 VFs not working on old guests because of 64B CQE changes
- Fix ill-considered sparse fix for qib
- Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/qib: Fix for broken sparse warning fix
mlx4_core: Fix advertisement of wrong PF context behaviour
IPoIB: Fix crash due to skb double destruct
defined(@array) is deprecated in Perl and gives off a warning.
Restructure the code to remove that warning.
[ hpa: it would be interesting to revert to the timeconst.bc script.
It appears that the failures reported by akpm during testing of
that script were due to a known broken version of make, not a problem
with bc. The Makefile rules could probably be restructured to avoid
the make bug, or it is probably old enough that it doesn't matter. ]
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Pull btrfs fixes from Chris Mason:
"We've got corner cases for updating i_size that ceph was hitting,
error handling for quotas when we run out of space, a very subtle
snapshot deletion race, a crash while removing devices, and one
deadlock between subvolume creation and the sb_internal code (thanks
lockdep)."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: move d_instantiate outside the transaction during mksubvol
Btrfs: fix EDQUOT handling in btrfs_delalloc_reserve_metadata
Btrfs: fix possible stale data exposure
Btrfs: fix missing i_size update
Btrfs: fix race between snapshot deletion and getting inode
Btrfs: fix missing release of the space/qgroup reservation in start_transaction()
Btrfs: fix wrong sync_writers decrement in btrfs_file_aio_write()
Btrfs: do not merge logged extents if we've removed them from the tree
btrfs: don't try to notify udev about missing devices
Merge tag 'pinctrl-for-v3.8-late' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull late pinctrl fixes from Linus Walleij:
"Two patches appeared as of late, one was completely news to me, the
other one was rotated in -next for the next merge window but turned
out to be a showstopper.
- Exynos Kconfig fixup
- SIRF DT translation bug"
* tag 'pinctrl-for-v3.8-late' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sirf: replace of_gpio_simple_xlate by sirf specific of_xlate
pinctrl: exynos: change PINCTRL_EXYNOS option
Merge tag 'stable/for-linus-3.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen fixes from Konrad Rzeszutek Wilk:
"This has two fixes. One is a security fix wherein we would spam the
kernel printk buffer if one of the guests was misbehaving. The other
is much tamer and it was us only checking for one type of error from
the IRQ subsystem (when allocating new IRQs) instead of for all of
them.
- Fix an IRQ allocation where we only check for a specific error (-1).
- CVE-2013-0231 / XSA-43. Make xen-pciback rate limit error messages
from xen_pcibk_enable_msi{,x}()"
* tag 'stable/for-linus-3.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: fix error handling path if xen_allocate_irq_dynamic fails
xen-pciback: rate limit error messages from xen_pcibk_enable_msi{,x}()
Merge tag 'regulator-v3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Mostly driver specific fixes here, though one of them uncovered the
issue Stephen Warren fixed with multiple OF matches getting upset due
to a lack of cleanup."
* tag 'regulator-v3.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: s2mps11: fix incorrect register for buck10
regulator: clear state each invocation of of_regulator_match
regulator: max8997: Fix using wrong dev argument at various places
regulator: max77686: Fix using wrong dev argument at various places
regulator: max8907: Fix using wrong dev argument for calling of_regulator_match
regulator: max8998: fix incorrect min_uV value for ldo10
regulator: tps65910: Fix using wrong dev argument for calling of_regulator_match
regulator: tps65217: Fix using wrong dev argument for calling of_regulator_match
This fixes up
commit e8e89622ed
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Dec 18 22:25:11 2012 +0100
drm/ttm: fix fence locking in ttm_buffer_object_transfer
which leaves behind a might_sleep in atomic context, since the
fence_lock spinlock is held over a kmalloc(GFP_KERNEL) call. The fix
is to revert the above commit and only take the lock where we need it,
around the call to ->sync_obj_ref.
v2: Fixup things noticed by Maarten Lankhorst:
- Brown paper bag locking bug.
- No need for kzalloc if we clear the entire thing on the next line.
- check for bo->sync_obj (totally unlikely race, but still someone
else could have snuck in) and clear fbo->sync_obj if it's cleared
already.
Reported-by: Dave Airlie <airlied@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The old SRCU implementation loads sp->completed within an
RCU-sched section, courtesy of preempt_disable(). This was required
due to the use of synchronize_sched() in the old implementation's
synchronize_srcu(). However, the new implementation does not rely
on synchronize_sched(), so it in turn does not require the load of
sp->completed and the ->c[] counter to be in a single preempt-disabled
region of code. This commit therefore moves the sp->completed access
outside of the preempt-disabled region and applies ACCESS_ONCE().
The resulting code is almost the same as before, but it removes the
now-misleading rcu_dereference_index_check() call.
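A sketch of the resulting read-lock sequence (assuming the surrounding
__srcu_read_lock(); details are illustrative):

	/* the load no longer needs to sit inside the preempt-disabled region */
	idx = ACCESS_ONCE(sp->completed) & 0x1;
	preempt_disable();
	/* ... increment the ->c[idx] counter as before ... */
	preempt_enable();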
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because synchronize_srcu_expedited() no longer uses
synchronize_rcu_sched_expedited(), synchronize_srcu_expedited() no longer
indirectly acquires any CPU-hotplug-related locks. This commit therefore
updates the comments accordingly.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The core of SRCU is changed, but synchronize_srcu()'s comments describe
the old algorithm. This commit therefore updates them to match the
new algorithm.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of ff195cb6 (rcu: Warn when srcu_read_lock() is used in an extended
quiescent state).
It also makes the code match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
SRCU has its own statemachine and no longer relies on normal RCU.
Its read-side critical section can now be used by an offline CPU, so this
commit removes the check and the comments, reverting the SRCU portion
of c0d6d01b (rcu: Check for illegal use of RCU from offlined CPUs).
It also makes the code match the comments in whatisRCU.txt:
g. Do you need read-side critical sections that are respected
even though they are in the middle of the idle loop, during
user-mode execution, or on an offlined CPU? If so, SRCU is the
only choice that will work for you.
[ paulmck: There is at least one remaining issue, namely use of lockdep
with tracing enabled. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pack six lines of code into two lines.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Although synchronize_srcu() can sleep, it will not sleep if the fast
path succeeds, which means that illegal use of synchronize_srcu()
might go unnoticed. This commit therefore adds might_sleep(), which
unconditionally catches illegal use of synchronize_srcu() from atomic
context.
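Illustration of the annotation (the body is elided):

	void synchronize_srcu(struct srcu_struct *sp)
	{
		/* complain about atomic-context callers even when the
		   fast path would succeed without sleeping */
		might_sleep();
		/* ... existing grace-period machinery ... */
	}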
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
This commit replaces disabling of preemption and decrement of a per-CPU
variable with this_cpu_dec(), which avoids preemption disabling on x86
and shortens the code on all platforms.
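A before/after sketch (the per-CPU counter name is an assumption):

	/* before: an explicit preemption round trip around the decrement */
	preempt_disable();
	__this_cpu_dec(sp->per_cpu_ref->c[idx]);
	preempt_enable();

	/* after: one operation; x86 does it without disabling preemption */
	this_cpu_dec(sp->per_cpu_ref->c[idx]);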
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Currently, __queue_work() chooses the pool to queue a work item to and
then determines the cwq from the target wq and the chosen pool. This is
a bit backwards in that we can determine the cwq first and simply use
cwq->pool. This way, we can skip get_std_worker_pool() in the queueing
path, which would be a hurdle when implementing custom worker pools.
Update __queue_work() such that it chooses the target cwq and then uses
cwq->pool instead of the other way around. While at it, add the missing
{} in an if statement.
This patch doesn't introduce any functional changes.
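A sketch of the reordering (helper names as in workqueue.c of this
period; treat the exact lines as an illustration):

	/* before: pick the pool, then look up the matching cwq */
	pool = get_std_worker_pool(cpu, highpri);
	cwq = get_cwq(pool->cpu, wq);

	/* after: pick the cwq, the pool simply falls out of it */
	cwq = get_cwq(cpu, wq);
	pool = cwq->pool;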
tj: The original patch had two get_cwq() calls - the first to
determine the pool by doing get_cwq(cpu, wq)->pool and the second
to determine the matching cwq from get_cwq(pool->cpu, wq).
Updated the function such that it chooses cwq instead of pool and
removed the second call. Rewrote the description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
get_work_pool_id() currently first obtains the pool using
get_work_pool() and then returns pool->id. For an off-queue work item,
this involves obtaining the pool ID from work->data, performing
idr_find() to find the matching pool and then returning its pool->id,
which of course is the same as the one which went into idr_find().
Just open code WORK_STRUCT_CWQ case and directly return pool ID from
work->data.
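A sketch of the open-coded result (the mask and shift names follow
workqueue.c of this period and are assumptions here):

	static int get_work_pool_id(struct work_struct *work)
	{
		unsigned long data = atomic_long_read(&work->data);

		if (data & WORK_STRUCT_CWQ)
			return ((struct cpu_workqueue_struct *)
				(data & WORK_STRUCT_WQ_DATA_MASK))->pool->id;

		return data >> WORK_OFFQ_POOL_SHIFT;
	}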
tj: The original patch dropped on-queue work item handling and renamed
the function to offq_work_pool_id(). There isn't much benefit in
doing so. Handling it only requires a single if() and we need at
least BUG_ON(), which is also a branch, even if we drop on-queue
handling. Open code WORK_STRUCT_CWQ case and keep the function in
line with get_work_pool(). Rewrote the description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
As nr_running is likely to be accessed from other CPUs during
try_to_wake_up(), it was kept outside worker_pool; however, while less
frequent, other fields in worker_pool are accessed from other CPUs
for, e.g., the non-reentrancy check. Also, with recent pool-related
changes, accessing nr_running matching the worker_pool isn't as simple
as it used to be.
Move nr_running inside worker_pool. Keep it aligned to cacheline and
define CPU pools using DEFINE_PER_CPU_SHARED_ALIGNED(). This should
give at least the same cacheline behavior.
get_pool_nr_running() is replaced with direct pool->nr_running
accesses.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Joonsoo Kim <js1304@gmail.com>
My commit f2d9d270c1
("mac80211: support VHT association") introduced a
very stupid bug: the loop to downgrade the channel
width never attempted to actually use it again so
it would downgrade all the way to 20_NOHT. Fix it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Move rt scheduler definitions out of include/linux/sched.h into
new file include/linux/sched/rt.h
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094707.7b9f825f@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add a /proc/sys/kernel scheduler knob named
sched_rr_timeslice_ms that allows global changing of the
SCHED_RR timeslice value. The user-visible value is in milliseconds
but is stored as jiffies. Setting it to 0 (zero) resets to the
default (currently 100ms).
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094704.13751796@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move the sysctl-related bits from include/linux/sched.h into
a new file: include/linux/sched/sysctl.h. Then update source
files requiring access to those bits by including the new
header file.
Signed-off-by: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20130207094659.06dced96@riff.lan
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Without this patch, it is trivial to determine kernel page
mappings by examining the error code reported to dmesg[1].
Instead, declare the entire kernel memory space as a violation
of a present page.
Additionally, since show_unhandled_signals is enabled by
default, switch branch hinting to the more realistic
expectation, and unobfuscate the setting of the PF_PROT bit to
improve readability.
[1] http://vulnfactory.org/blog/2013/02/06/a-linux-memory-trick/
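A sketch of the idea (constant names are assumptions): report any fault
on a kernel address as a protection violation, so the printed error
code no longer distinguishes mapped from unmapped kernel pages.

	if (address >= TASK_SIZE_MAX)
		error_code |= PF_PROT;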
Reported-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130207174413.GA12485@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
RFC 6296 specifies that address bits that are not part of the prefix
have to be zeroed.
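A sketch of such masking (illustrative, not the patch itself): for each
32-bit word of the address, keep only the bits covered by the prefix.

	for (i = 0; i < 4; i++) {
		int k = prefix_len - i * 32;	/* prefix bits in this word */
		u32 mask = (k >= 32) ? 0xffffffff :
			   (k <= 0)  ? 0 : htonl(~0u << (32 - k));

		addr->s6_addr32[i] &= mask;
	}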
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Make sure only the bits that are part of the prefix are mangled.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cast __wsum from/to __sum16 is wrong. Instead, apply appropriate
conversion function: csum_unfold() or csum_fold().
[ The original patch has been modified to undo the final ~ that
csum_fold returns. We only need to fold the 32-bit word that
results from the checksum calculation into a 16-bit value to ensure
that the original subnet is restored appropriately. Spotted by
Ulrich Weber. ]
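For illustration, the proper conversions instead of raw casts (variable
names are made up):

	__sum16 folded = csum_fold(wsum);	/* 32-bit __wsum -> 16-bit __sum16 */
	__wsum unfolded = csum_unfold(sum16);	/* 16-bit __sum16 -> 32-bit __wsum */

	/* and, per the note above, undoing csum_fold()'s final inversion: */
	__sum16 v = ~csum_fold(wsum);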
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
With the recent is-work-queued-here test simplification, the nested
if() in try_to_grab_pending() can be collapsed. Collapse it.
This patch is purely cosmetic.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Currently, determining whether a work item is queued on a locked pool
involves somewhat convoluted memory barrier dancing. It goes like the
following.
* When a work item is queued on a pool, work->data is updated before
work->entry is linked to the pending list, with a wmb() in between.
* When trying to determine whether a work item is currently queued on
a pool pointed to by work->data, it locks the pool and looks at
work->entry. If work->entry is linked, we then do rmb() and then
check whether work->data points to the current pool.
This works because work->data can only point to a pool if the work
item currently is or was on that pool, and:
* If it currently is on the pool, the tests would obviously succeed.
* If it left the pool, its work->entry was cleared under pool->lock,
so if we're seeing non-empty work->entry, it has to be from the work
item being linked on another pool. Because work->data is updated
before work->entry is linked with wmb() in between, a work->data update
from another pool is guaranteed to be visible if we do rmb() after
seeing non-empty work->entry. So, we either see empty work->entry
or we see updated work->data pointing to another pool.
While this works, it's convoluted, to put it mildly. With recent
updates, it's now guaranteed that work->data points to cwq only while
the work item is queued and that updating work->data to point to cwq
or back to pool is done under pool->lock, so we can simply test
whether work->data points to cwq which is associated with the
currently locked pool instead of the convoluted memory barrier
dancing.
This patch replaces the memory-barrier-based "are you still here,
really?" test with a much simpler "does work->data point to me?" test:
if work->data points to a cwq which is associated with the currently
locked pool, the work item is guaranteed to be queued on the pool as
work->data can start and stop pointing to such cwq only under
pool->lock and the start and stop coincide with queue and dequeue.
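A sketch of the simpler test (helper names assumed), done while holding
pool->lock:

	struct cpu_workqueue_struct *cwq = get_work_cwq(work);

	if (cwq && cwq->pool == pool) {
		/* work->data points at this pool's cwq, so the item is
		   guaranteed to be queued here: safe to steal it */
	}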
tj: Rewrote the comments and description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We plan to use work->data pointing to cwq as the synchronization
invariant when determining whether a given work item is on a locked
pool or not, which requires work->data pointing to cwq only while the
work item is queued on the associated pool.
With delayed_work updated not to overload work->data for target
workqueue recording, the only case where we still have off-queue
work->data pointing to cwq is try_to_grab_pending() which doesn't
update work->data after stealing a queued work item. There's no
reason for try_to_grab_pending() to not update work->data to point to
the pool instead of cwq, like the normal execution does.
This patch adds set_work_pool_and_keep_pending() which makes
work->data point to pool instead of cwq but keeps the pending bit
unlike set_work_pool_and_clear_pending() (surprise!).
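A sketch mirroring set_work_pool_and_clear_pending() (the
set_work_data() signature and flag names are assumptions):

	static void set_work_pool_and_keep_pending(struct work_struct *work,
						   int pool_id)
	{
		/* record the pool but leave WORK_STRUCT_PENDING set */
		set_work_data(work, (unsigned long)pool_id << WORK_OFFQ_POOL_SHIFT,
			      WORK_STRUCT_PENDING);
	}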
After this patch, it's guaranteed that only queued work items point to
cwqs.
This patch doesn't introduce any visible behavior change.
tj: Renamed the new helper function to match
set_work_pool_and_clear_pending() and rewrote the description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
To avoid executing the same work item from multiple CPUs concurrently,
a work_struct records the last pool it was on in its ->data so that,
on the next queueing, the pool can be queried to determine whether the
work item is still executing or not.
A delayed_work goes through timer before actually being queued on the
target workqueue and the timer needs to know the target workqueue and
CPU. This is currently achieved by modifying delayed_work->work.data
such that it points to the cwq which points to the target workqueue
and the last CPU the work item was on. __queue_delayed_work()
extracts the last CPU from delayed_work->work.data and then combines
it with the target workqueue to create new work.data.
The only thing this rather ugly hack achieves is encoding the target
workqueue into delayed_work->work.data without using a separate field,
which could be a trade-off one can make; unfortunately, this entangles
work->data management between regular workqueue and delayed_work code
by setting cwq pointer before the work item is actually queued and
becomes a hindrance for further improvements of work->data handling.
This can be easily made sane by adding a target workqueue field to
delayed_work. While delayed_work is used widely in the kernel and
this does make it a bit larger (<5%), I think this is the right
trade-off, especially given the prospect of much saner handling of
work->data, which currently involves quite tricky memory-barrier
dancing, and I don't expect to see any measurable effect.
Add delayed_work->wq and drop the delayed_work->work.data overloading.
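A sketch of the resulting structure (the cpu field shown alongside is
an assumption):

	struct delayed_work {
		struct work_struct work;
		struct timer_list timer;

		/* target workqueue and CPU that ->timer uses to queue ->work */
		struct workqueue_struct *wq;
		int cpu;
	};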
tj: Rewrote the description.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>