Commit Graph

588483 Commits

Author SHA1 Message Date
Alexandre Bounine e8de370188 rapidio: add mport char device driver
Add mport character device driver to provide user space interface to
basic RapidIO subsystem operations.

See included Documentation/rapidio/mport_cdev.txt for more details.

[akpm@linux-foundation.org: fix printk warning on i386]
[dan.carpenter@oracle.com: mport_cdev: fix some error codes]
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 458bdf6e39 rapidio/tsi721_dma: fix hardware error handling
Add DMA channel re-initialization after an error to avoid termination of
all pending transfer requests.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Barry Wood <barry.wood@idt.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine e680b672a2 rapidio/tsi721_dma: fix synchronization issues
Fix synchronization issues found during testing using multiple DMA
transfer requests to the same channel:

 - lost MSI-X interrupt notifications
 - non-synchronized attempts to start DMA channel HW resulting in error
   message from the driver
 - cookie tracking/update race conditions resulting in incorrect DMA
   transfer status report

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Barry Wood <barry.wood@idt.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 8347245750 rapidio/tsi721_dma: update error reporting from prep_sg callback
Switch to returning error-valued pointer instead of simple NULL pointer.
This allows to properly identify situation when request queue is full
and therefore gives to upper layer an option to retry operation later.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 72d8a0d230 rapidio/tsi721: add filtered debug output
Replace "all-or-nothing" debug output with controlled debug output using
functional block masks.  This allows run time control of debug messages
through 'dbg_level' module parameter.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 1679e8dabf rapidio/tsi721: add outbound windows mapping support
Add device-specific callback functions to support outbound windows
mapping and release.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 93bdaca501 rapidio: add outbound window support
Add RapidIO controller (mport) outbound window configuration operations.

This patch is a part of the original patch submitted by Li Yang:

   https://lists.ozlabs.org/pipermail/linuxppc-dev/2009-April/071210.html

For some reason the original part was not applied to mainline code
tree.  The inbound window mapping part has been applied later during
tsi721 mport driver submission.  Now goes the second part with
corresponding HW support.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 2ece1caf66 rapidio/tsi721: fix locking in OB_MSG processing
- Add spinlock protection into outbound message queuing routine.

- Change outbound message interrupt handler to avoid deadlock when
  calling registered callback routine.

- Allow infinite retries for outbound messages to avoid retry threshold
  error signaling in systems with nodes that have slow message receive
  queue processing.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 9a0b062742 rapidio: add global inbound port write interfaces
Add new Port Write handler registration interfaces that attach PW
handlers to local mport device objects.  This is different from old
interface that attaches PW callback to individual RapidIO device.  The
new interfaces are intended for use for common event handling (e.g.
hot-plug notifications) while the old interface is available for
individual device drivers.

This patch is based on patch proposed by Andre van Herk but preserves
existing per-device interface and adds lock protection for list
handling.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine b6cb95e8eb rapidio: move rio_pw_enable into core code
Make rio_pw_enable() routine available to other RapidIO drivers.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 5024622f58 rapidio: move rio_local_set_device_id function to the common core
Make function rio_local_set_device_id() common for all components of
RapidIO subsystem.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine a7b4c636d8 rapidio: add lock protection for doorbell list
Add lock protection around doorbell list handling to prevent list
corruption on SMP platforms.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine b7dfca8bd4 rapidio/rionet: add mport removal handling
Add handling of a local mport device removal.

RIONET driver registers itself as class interface that supports only
removal notification, 'add_device' callback is not provided because
RIONET network device can be initialized only after enumeration is
completed and the existing method (using remote peer addition) satisfies
this condition.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 34ed2ebb65 rapidio/rionet: add locking into add/remove device
Add spinlock protection when handling list of connected peers and
ability to handle new peer device addition after the RIONET device was
open.  Before his update RIONET was sending JOIN requests only when it
have been opened, peer devices added later have been missing from this
process.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine dd64f4fe6f powerpc/fsl_rio: changes to mport registration
Change mport object initialization/registration sequence to match
reworked version of rio_register_mport() in the core code.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 748353cc2d rapidio/tsi721: add HW specific mport removal
Add hardware-specific device removal support for Tsi721 PCIe-to-RapidIO
bridge.  To avoid excessive data type conversions, parameters passed to
some internal functions have been revised.  Dynamic memory allocations
of rio_mport and rio_ops have been replaced to reduce references between
data structures.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine b77a2030df rapidio: add core mport removal support
Add common mport removal support functions into the RapidIO subsystem
core.

Changes to the existing mport registration process have been made to
avoid race conditions with active subsystem interfaces immediately after
mport device registration: part of initialization code from
rio_register_mport() have been moved into separate function
rio_mport_initialize() to allow to perform mport registration as the
final step of setup process.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine e6b585ca6e rapidio: move net allocation into core code
Make net allocation/release routines available to all components of
RapidIO subsystem by moving code from rio-scan enumerator.

Make destination ID allocation method private to existing enumerator
because other enumeration methods can use their own algorithm.

Setup net device object as a parent of all RapidIO devices residing in
it and register net as a child of active mport device.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine b74ec56e8a rapidio: rework common RIO device add/delete routines
This patch moves per-net device list handling from rio-scan to common
RapidIO core and adds a matching device deletion routine.  This makes
device object creation/removal available to other implementations of
enumeration/discovery process.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine f41e2472ba rapidio/rionet: add shutdown event handling
Add shutdown notification handler which terminates active connections
with remote RapidIO nodes.  This prevents remote nodes from sending
packets to the powered off node and eliminates hardware error events on
remote nodes.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine e3dd8cd477 rapidio/tsi721: add shutdown notification callback
Add device driver specific shutdown notification callback.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 83dc2cbc11 rapidio: add shutdown notification for RapidIO devices
Add bus-specific callback to stop RapidIO devices during a system
shutdown.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine dbe74afe6a rapidio/tsi721: add query_mport callback
Add device-specific implementation of query_mport callback function.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 8b189fdbc5 rapidio: add query_mport operation
Add mport query operation to report master port RapidIO capabilities and
run time configuration to upper level drivers.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine d2a321f37e rapidio/tsi721_dma: fix pending transaction queue handling
Fix pending DMA request queue handling to avoid broken ordering during
concurrent request submissions.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 9673b883c2 rapidio/tsi721: add option to configure direct mapping of IB window
Add an option to configure mapping of Inbound Window without RIO-to-PCIe
address translation.

If a local memory buffer is not properly aligned to meet HW requirements
for RapidIO address mapping with address translation, caller can request
an inbound window with matching RapidIO address assigned to it.  This
implementation selects RapidIO base address and size for inbound window
that are capable to accommodate the local memory buffer.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine ba5d141b55 rapidio/tsi721: add check for overlapped IB window mappings
Add check for attempts to request mapping of inbound RapidIO address
space that overlaps with existing active mapping windows.

Tsi721 device does not support overlapped inbound windows and SRIO
address decoding behavior is not defined in such cases.

This patch is applicable to kernel versions starting from v3.7.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Alexandre Bounine 174f1a71cc rapidio/tsi721: fix hardcoded MRRS setting
Remove use of hardcoded setting for Maximum Read Request Size (MRRS)
value and use one set by PCIe bus driver.

Using hardcoded value can cause PCIe bus errors on platforms that have
tsi721 device on PCIe path that allows only smaller read request sizes.

This fix is applicable to kernel versions starting from v3.2.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Aurelien Jacquiot 92444bb366 rapidio/rionet: add capability to change MTU
These patches are the result of extensive collaboration within the
RapidIO.org Software Task Group between Texas Instruments, Freescale,
Prodrive Technologies, Nokia Networks, BAE and IDT.  Additional input
was received from other members of RapidIO.org.  The objective was to
create a character mode driver interface which exposes the capabilities
of RapidIO devices directly to applications, in a manner that allows the
numerous and varied RapidIO implementations to interoperate.

The Software Task Group has also developed fabric management, Remote
Memory Access, and sockets applications which make use of these
interfaces in user space.  Intensive testing with these applications
prompted the RapidIO subsystem updates provided within this set of
patches.

This patch (of 29):

Replace default Ethernet-specific routine by the custom one to allow
setting of larger MTU supported by RapidIO messaging (max RIO packet
size is 4096 bytes).

Signed-off-by: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Aurelien Jacquiot 36915976ec rapidio/rionet: fix deadlock on SMP
Fix deadlocking during concurrent receive and transmit operations on SMP
platforms caused by the use of incorrect lock: on transmit 'tx_lock'
spinlock should be used instead of 'lock' which is used for receive
operation.

This fix is applicable to kernel versions starting from v2.15.

Signed-off-by: Aurelien Jacquiot <a-jacquiot@ti.com>
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Eric Biggers 95f27356e4 cpumask: remove incorrect information from comment
Since commit cdfdef75e7 ("cpumask: only allocate nr_cpumask_bits."),
this comment above cpumask_size() is no longer relevant.

Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Jann Horn 378c6520e7 fs/coredump: prevent fsuid=0 dumps into user-controlled directories
This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:

 - The fs.suid_dumpable sysctl is set to 2.
 - The kernel.core_pattern sysctl's value starts with "/". (Systems
   where kernel.core_pattern starts with "|/" are not affected.)
 - Unprivileged user namespace creation is permitted. (This is
   true on Linux >=3.8, but some distributions disallow it by
   default using a distro patch.)

Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.

To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Oleg Nesterov 1333ab0315 ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock
This test-case (simplified version of generated by syzkaller)

	#include <unistd.h>
	#include <sys/ptrace.h>
	#include <sys/wait.h>

	void test(void)
	{
		for (;;) {
			if (fork()) {
				wait(NULL);
				continue;
			}

			ptrace(PTRACE_SEIZE, getppid(), 0, 0);
			ptrace(PTRACE_INTERRUPT, getppid(), 0, 0);
			_exit(0);
		}
	}

	int main(void)
	{
		int np;

		for (np = 0; np < 8; ++np)
			if (!fork())
				test();

		while (wait(NULL) > 0)
			;
		return 0;
	}

triggers the 2nd WARN_ON_ONCE(!signr) warning in do_jobctl_trap().  The
problem is that __ptrace_unlink() clears task->jobctl under siglock but
task->ptrace is cleared without this lock held; this fools the "else"
branch which assumes that !PT_SEIZED means PT_PTRACED.

Note also that most of other PTRACE_SEIZE checks can race with detach
from the exiting tracer too.  Say, the callers of ptrace_trap_notify()
assume that SEIZED can't go away after it was checked.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Maciej S. Szmigiero 3873938068 fat: add config option to set UTF-8 mount option by default
FAT has long supported its own default file name encoding config
setting, separate from CONFIG_NLS_DEFAULT.

However, if UTF-8 encoded file names are desired FAT character set
should not be set to utf8 since this would make file names case
sensitive even if case insensitive matching is requested.  Instead,
"utf8" mount options should be provided to enable UTF-8 file names in
FAT file system.

Unfortunately, there was no possibility to set the default value of this
option so on UTF-8 system "utf8" mount option had to be added manually
to most FAT mounts.

This patch adds config option to set such default value.

Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski f970165bee x86/compat: remove is_compat_task()
x86's is_compat_task always checked the current syscall type, not the
task type.  It has no non-arch users any more, so just remove it to
avoid confusion.

On x86, nothing should really be checking the task ABI.  There are
legitimate users for the syscall ABI and for the mm ABI.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 7365abbade drivers/hid/uhid.c: check write() bitness using in_compat_syscall
uhid changes the format expected in write() depending on bitness.  It
should check the syscall bitness directly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: David Herrmann <dh.herrmann@googlemail.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski f4056b5284 input: redefine INPUT_COMPAT_TEST as in_compat_syscall()
The input compat code should work like all other compat code: for 32-bit
syscalls, use the 32-bit ABI and for 64-bit syscalls, use the 64-bit
ABI.  We have a helper for that (in_compat_syscall()): just use it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 10f1685fd5 drivers/gpu/drm/amd/amdkfd: use in_compat_syscall to check open() caller type
amdkfd wants to know syscall type, not task type.  Check directly.

Unfortunately, amdkfd is making nasty assumptions that a process'
bitness is a well-defined constant thing.  This isn't the case on x86.
I don't know how much this matters, but this patch has no effect on
generated code on x86, so amdkfd is equally broken with and without this
patch.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 4f01ed221e drivers/firmware/efi/efivars.c: use in_compat_syscall() to check for compat callers
This should make no difference on any architecture, as x86's historical
is_compat_task behavior really did check whether the calling syscall was
a compat syscall.  x86's is_compat_task is going away, though.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski a25045ff32 firewire: use in_compat_syscall to check ioctl compatness
Firewire was using is_compat_task to check whether it was in a compat
ioctl or a non-compat ioctl.  Use is_compat_syscall instead so it works
properly on all architectures.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 2bf8c47626 net/xfrm_user: use in_compat_syscall to deny compat syscalls
The code wants to prevent compat code from receiving messages.  Use
in_compat_syscall for this.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 96c0e0a908 net/sctp: use in_compat_syscall for sctp_getsockopt_connectx3
SCTP unfortunately has a different ABI for SCTP_SOCKOPT_CONNECTX3 for
32-bit and 64-bit callers.  Use in_compat_syscall to correctly
distinguish them on all architectures.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 121cef8f17 ext4: in ext4_dir_llseek, check syscall bitness directly
ext4 treats directory offsets differently for 32-bit and 64-bit callers.
Check the caller type using in_compat_syscall, not is_compat_task.  This
changes behavior on SPARC slightly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 6d8bedff92 staging/lustre: switch from is_compat_task to in_compat_syscall
AFAICT, lustre is trying to determine syscall bitness.  Use the new
accessor.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski efbc0fbf34 auditsc: for seccomp events, log syscall compat state using in_compat_syscall
Except on SPARC, this is what the code always did.  SPARC compat seccomp
was buggy, although the impact of the bug was limited because SPARC
32-bit and 64-bit syscall numbers are the same.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 5c465217a9 ptrace: in PEEK_SIGINFO, check syscall bitness, not task bitness
Users of the 32-bit ptrace() ABI expect the full 32-bit ABI.  siginfo
translation should check ptrace() ABI, not caller task ABI.

This is an ABI change on SPARC.  Let's hope that no one relied on the
old buggy ABI.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 5c38065e02 seccomp: check in_compat_syscall, not is_compat_task, in strict mode
Seccomp wants to know the syscall bitness, not the caller task bitness,
when it selects the syscall whitelist.

As far as I know, this makes no difference on any architecture, so it's
not a security problem.  (It generates identical code everywhere except
sparc, and, on sparc, the syscall numbering is the same for both ABIs.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 203f79078f sparc/syscall: fix syscall_get_arch
Sparc's syscall_get_arch was buggy: it returned the task arch, not the
syscall arch.  This could confuse seccomp and audit.

I don't think this is as bad for seccomp as it looks: sparc's 32-bit and
64-bit syscalls are numbered the same.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 069923d87e sparc/compat: provide an accurate in_compat_syscall implementation
On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit
syscalls.  is_compat_task returns true in 32-bit tasks, which does not
necessarily imply that the current syscall is 32-bit.

Provide an in_compat_syscall implementation that checks whether the
current syscall is compat.

As far as I know, sparc is the only architecture on which is_compat_task
checks the compat status of the task and on which the compat status of a
syscall can differ from the compat status of the task.  On x86,
is_compat_task checks the syscall type, not the task type.

[akpm@linux-foundation.org: add comment, per Sam]
[akpm@linux-foundation.org: update comment, per Andy]
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00
Andy Lutomirski 5180e3e24f compat: add in_compat_syscall to ask whether we're in a compat syscall
A lot of code currently abuses is_compat_task to determine this.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: David Herrmann <dh.herrmann@googlemail.com>
Cc: David Miller <davem@davemloft.net>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 15:36:02 -07:00