linux_old1

Commit Graph

Author	SHA1	Message	Date
Sean Hefty	42849b2697	RDMA/uverbs: Export ib_open_qp() capability to user space Allow processes that share the same XRC domain to open an existing shareable QP. This permits those processes to receive events on the shared QP and transfer ownership, so that any process may modify the QP. The latter allows the creating process to exit, while a remaining process can still transition it for path migration purposes. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:50:56 -07:00
Sean Hefty	0e0ec7e063	RDMA/core: Export ib_open_qp() to share XRC TGT QPs XRC TGT QPs are shared resources among multiple processes. Since the creating process may exit, allow other processes which share the same XRC domain to open an existing QP. This allows us to transfer ownership of an XRC TGT QP to another process. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:49:51 -07:00
Sean Hefty	2622e18ef4	IB/cm: Do not automatically disconnect XRC TGT QPs Because an XRC TGT QP can end up being shared among multiple processes, don't have the ib_cm automatically send a DREQ when the userspace process that owns the ib_cm_id exits. Disconnect can be initiated by the user directly; otherwise, the owner of the XRC INI QP controls the connection. Note that as a result of the process exiting, the ib_cm will stop tracking the XRC connection on the target side. For the purposes of disconnecting, this isn't a big deal. The ib_cm will respond to the DREQ appropriately. For other messages, mainly LAP, the CM will reject the request, since there's no one available to route the request to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:42:06 -07:00
Sean Hefty	18c441a6c3	RDMA/cma: Support XRC QPs Allow users to connect XRC QPs through the rdma_cm. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:41:36 -07:00
Sean Hefty	638ef7a6c6	RDMA/ucm: Allow user to specify QP type when creating id Allow the user to indicate the QP type separately from the port space when allocating an rdma_cm_id. With RDMA_PS_IB, there is no longer a 1:1 relationship between the QP type and port space, so we need to switch on the QP type to select between UD and connected QPs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:40:36 -07:00
Sean Hefty	2d2e941529	RDMA/cm: Define new RDMA port space specific to IB Add RDMA_PS_IB. XRC QP types will use the IB port space when operating over the RDMA CM. For the 'IP protocol' field value, we select 0x3F, which is listed as being for 'any local network'. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:39:52 -07:00
Sean Hefty	ef70044647	IB/cm: Update XRC support based on XRC annex errata The XRC annex was updated to have XRC behave more like RD. Specifically, the XRC TGT QPN moves from the local QPN to local EECN field. Lookup of SRQN is done using the REQ/REP protocol. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:38:35 -07:00
Sean Hefty	d26a360b77	IB/cm: Update protocol to support XRC Update the REQ and REP messages to support XRC connection setup according to the XRC Annex. Several existing fields must be set to 0 or 1 when connecting XRC QPs, and a reserved field is changed to an extended transport type. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:37:55 -07:00
Sean Hefty	b93f3c1872	RDMA/uverbs: Export XRC TGT QPs to user space Allow user space to operate on XRC TGT QPs the same way as other types of QPs, with one notable exception: since XRC TGT QPs may be shared among multiple processes, the XRC TGT QP is allowed to exist beyond the lifetime of the creating process. The process that creates the QP is allowed to destroy it, but if the process exits without destroying the QP, then the QP will be left bound to the lifetime of the XRCD. TGT QPs are not associated with CQs or a PD. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:37:07 -07:00
Sean Hefty	9977f4f64b	RDMA/uverbs: Export XRC INI QPs to userspace XRC INI QPs are similar to send only RC QPs. Allow user space to create INI QPs. Note that INI QPs do not require receive CQs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:32:27 -07:00
Sean Hefty	8541f8de05	RDMA/uverbs: Export XRC SRQs to user space We require additional information to create XRC SRQs than we can exchange using the existing create SRQ ABI. Provide an enhanced create ABI for extended SRQ types. Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il> and Roland Dreier <roland@purestorage.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:29:18 -07:00
Sean Hefty	53d0bd1e7f	RDMA/uverbs: Export XRC domains to user space Allow user space to create XRC domains. Because XRCDs are expected to be shared among multiple processes, we use inodes to identify an XRCD. Based on patches by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:21:24 -07:00
Sean Hefty	d3d72d909e	RDMA/verbs: Cleanup XRC TGT QPs when destroying XRCD XRC TGT QPs are intended to be shared among multiple users and processes. Allow the destruction of an XRC TGT QP to be done explicitly through ib_destroy_qp() or when the XRCD is destroyed. To support destroying an XRC TGT QP, we need to track TGT QPs with the XRCD. When the XRCD is destroyed, all tracked XRC TGT QPs are also cleaned up. To avoid stale reference issues, if a user is holding a reference on a TGT QP, we increment a reference count on the QP. The user releases the reference by calling ib_release_qp. This releases any access to the QP from a user above verbs, but allows the QP to continue to exist until destroyed by the XRCD. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:20:27 -07:00
Sean Hefty	b42b63cf0d	RDMA/core: Add XRC QPs XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). XRC communication is between an initiator (INI) QP and a target (TGT) QP. Target QPs are associated with SRQs through an XRCD. An XRC TGT QP behaves like a receive-only RD QP. XRC INI QPs behave similarly to RC QPs, except that work requests posted to an XRC INI QP must specify the remote SRQ that is the target of the work request. We define two new QP types for XRC, to distinguish between INI and TGT QPs, and update the core layer to support XRC QPs. This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:16:19 -07:00
Sean Hefty	418d51307d	RDMA/core: Add XRC SRQ type XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). XRC defines SRQs that are specifically used by XRC connections. Expand the SRQ code to support XRC SRQs. An XRC SRQ is currently restricted to only XRC use according to the IB XRC Annex. Portions of this patch were derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:14:31 -07:00
Sean Hefty	96104eda01	RDMA/core: Add SRQ type field Currently, there is only a single ("basic") type of SRQ, but with XRC support we will add a second. Prepare for this by defining an SRQ type and setting all current users to IB_SRQT_BASIC. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-13 09:13:26 -07:00
Sean Hefty	59991f94eb	RDMA/core: Add XRC domain support XRC ("eXtended reliable connected") is an IB transport that provides better scalability by allowing senders to specify which shared receive queue (SRQ) should be used to receive a message, which essentially allows one transport context (QP connection) to serve multiple destinations (as long as they share an adapter, of course). A few new concepts are introduced to support this. This patch adds: - A new device capability flag, IB_DEVICE_XRC, which low-level drivers set to indicate that a device supports XRC. - A new object type, XRC domains (struct ib_xrcd), and new verbs ib_alloc_xrcd()/ib_dealloc_xrcd(). XRCDs are used to limit which XRC SRQs an incoming message can target. This patch is derived from work by Jack Morgenstein <jackm@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-12 10:32:26 -07:00
Marcel Apfelbaum	71eeba161d	IB: Add new InfiniBand link speeds Introduce support for the following extended speeds: FDR-10: a Mellanox proprietary link speed which is 10.3125 Gbps with 64b/66b encoding rather than 8b/10b encoding. FDR: IBA extended speed 14.0625 Gbps. EDR: IBA extended speed 25.78125 Gbps. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-11 11:53:47 -07:00
Kumar Sanghvi	3ebeebc38b	RDMA/iwcm: Propagate ird/ord values upwards Update struct iw_cm_event to support propagating the ird/ord values upwards to the application. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-06 09:37:52 -07:00
Hefty, Sean	caf6e3f221	RDMA/ucm: Removed checks for unsigned value < 0 cmd is unsigned, no need to check for < 0. Found by code inspection. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-06 09:33:05 -07:00
Hefty, Sean	b7ab0b19a8	IB/mad: Verify mgmt class in received MADs If a received MAD contains an invalid or reserved mgmt class, we will attempt to access method_table outside of its range. Add a check to ensure that mgmt class falls within the handled range. Found by code inspection. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-06 09:33:05 -07:00
Hefty, Sean	f45ee80eb0	RDMA/cma: Check for NULL conn_param in rdma_accept Check that conn_param is not null before dereferencing it when processing rdma_accept(). This is necessary to prevent a possible system crash, which can be caused by user space. Problem found by code inspection. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-06 09:33:04 -07:00
Hefty, Sean	9595480c5d	RDMA/cma: Fix crash in cma_req_handler The RDMA CM uses the local qp_type to determine how to process an incoming request. This can result in an incoming REQ being treated as a SIDR REQ and vice versa. Fix this by switching off the event type instead, and for good measure verify that the listener supports the incoming connection request. This problem showed up when a user space application mismatched the QP types between a client and server app. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-10-06 09:32:33 -07:00
Linus Torvalds	ece236ce2f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits) IB/qib: Defer HCA error events to tasklet mlx4_core: Bump the driver version to 1.0 RDMA/cxgb4: Use printk_ratelimited() instead of printk_ratelimit() IB/mlx4: Support PMA counters for IBoE IB/mlx4: Use flow counters on IBoE ports IB/pma: Add include file for IBA performance counters definitions mlx4_core: Add network flow counters mlx4_core: Fix location of counter index in QP context struct mlx4_core: Read extended capabilities into the flags field mlx4_core: Extend capability flags to 64 bits IB/mlx4: Generate GID change events in IBoE code IB/core: Add GID change event RDMA/cma: Don't allow IPoIB port space for IBoE RDMA: Allow for NULL .modify_device() and .modify_port() methods IB/qib: Update active link width IB/qib: Fix potential deadlock with link down interrupt IB/qib: Add sysfs interface to read free contexts IB/mthca: Remove unnecessary read of PCI_CAP_ID_EXP IB/qib: Remove double define IB/qib: Remove unnecessary read of PCI_CAP_ID_EXP ...	2011-07-22 14:50:12 -07:00
Roland Dreier	4460207561	Merge branches 'cma', 'cxgb4', 'ipath', 'misc', 'mlx4', 'mthca', 'qib' and 'srp' into for-next	2011-07-22 11:56:11 -07:00
Or Gerlitz	761d90ed4c	IB/core: Add GID change event Add IB GID change event type. This is needed for IBoE when the HW driver updates the GID (e.g when new VLANs are added/deleted) table and the change should be reflected to the IB core cache. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-07-18 21:04:30 -07:00
Moni Shoua	2efdd6a038	RDMA/cma: Don't allow IPoIB port space for IBoE This patch fixes a kernel crash in cma_set_qkey(). When the link layer is Ethernet, it is wrong to use IPoIB port space since no IPoIB interface is available. Specifically, setting the Q_Key when port space is RDMA_PS_IPOIB requires MGID calculation and an SA query, which doesn't make sense over Ethernet. Signed-off-by: Moni Shoua <monis@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-07-18 16:50:25 -07:00
Bart Van Assche	10e1b54bbb	RDMA: Allow for NULL .modify_device() and .modify_port() methods These methods don't make sense for iWARP devices, so rather than forcing them to implement stubs, just return -ENOSYS in the core if the hardware driver doesn't set .modify_device and/or .modify_port. Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-07-18 16:44:30 -07:00
Jack Morgenstein	0c9361fcde	RDMA/cma: Avoid assigning an IS_ERR value to cm_id pointer in CMA id object Avoid assigning an IS_ERR value to the cm_id pointer. This fixes a few anomalies in the error flow due to confusion about checking for NULL vs IS_ERR, and eliminates the need to test for the IS_ERR value every time we wish to determine if the cma_id object has a cm device associated with it. Also, eliminate the now-unnecessary procedure cma_has_cm_dev (we can check directly for the existence of the device pointer -- for a non-NULL check, makes no difference if it is the iwarp or the ib pointer). Finally, make a few code changes here to improve coding consistency. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-07-18 09:44:55 -07:00
David S. Miller	69cce1d140	net: Abstract dst->neighbour accesses behind helpers. dst_{get,set}_neighbour() Signed-off-by: David S. Miller <davem@davemloft.net>	2011-07-17 23:11:35 -07:00
David S. Miller	6a7ebdf2fd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/bluetooth/l2cap_core.c	2011-07-14 07:56:40 -07:00
Goldwyn Rodrigues	b2bc478219	RDMA: Check for NULL mode in .devnode methods Commits `71c29bd5c2` ("IB/uverbs: Add devnode method to set path/mode") and `c3af0980ce` ("IB: Add devnode methods to cm_class and umad_class") added devnode methods that set the mode. However, these methods don't check for a NULL mode, and so we get a crash when unloading modules because devtmpfs_delete_node() calls device_get_devnode() with mode == NULL. Add the missing checks. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.de> [ Also fix cm.c. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-04 15:53:28 -07:00
Greg Rose	c7ac8679be	rtnetlink: Compute and store minimum ifinfo dump size The message size allocated for rtnl ifinfo dumps was limited to a single page. This is not enough for additional interface info available with devices that support SR-IOV and caused a bug in which VF info would not be displayed if more than approximately 40 VFs were created per interface. Implement a new function pointer for the rtnl_register service that will calculate the amount of data required for the ifinfo dump and allocate enough data to satisfy the request. Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2011-06-09 20:38:07 -07:00
Linus Torvalds	4c171acc20	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cma: Save PID of ID's owner RDMA/cma: Add support for netlink statistics export RDMA/cma: Pass QP type into rdma_create_id() RDMA: Update exported headers list RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h> RDMA/nes: Add a check for strict_strtoul() RDMA/cxgb3: Don't post zero-byte read if endpoint is going away RDMA/cxgb4: Use completion objects for event blocking IB/srp: Fix integer -> pointer cast warnings IB: Add devnode methods to cm_class and umad_class IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required IB/uverbs: Add devnode method to set path/mode RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node RDMA: Add netlink infrastructure RDMA: Add error handling to ib_core_init()	2011-05-26 12:13:57 -07:00
Roland Dreier	8dc4abdf4c	Merge branches 'cma', 'cxgb3', 'cxgb4', 'misc', 'nes', 'netlink', 'srp' and 'uverbs' into for-next	2011-05-25 13:47:20 -07:00
Nir Muchtar	83e9502d8d	RDMA/cma: Save PID of ID's owner Save the PID associated with an RDMA CM ID for reporting via netlink. Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-25 13:46:23 -07:00
Nir Muchtar	753f618ae0	RDMA/cma: Add support for netlink statistics export Add callbacks and data types for statistics export of all current devices/ids. The schema for RDMA CM is a series of netlink messages. Each one contains an rdma_cm_stat struct. Additionally, two netlink attributes are created for the addresses for each message (if applicable). Their types used are: RDMA_NL_RDMA_CM_ATTR_SRC_ADDR (The source address for this ID) RDMA_NL_RDMA_CM_ATTR_DST_ADDR (The destination address for this ID) sockaddr_* structs are encapsulated within these attributes. In other words, every transaction contains a series of messages like: -------message 1------- struct rdma_cm_id_stats { __u32 qp_num; __u32 bound_dev_if; __u32 port_space; __s32 pid; __u8 cm_state; __u8 node_type; __u8 port_num; __u8 reserved; } RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute - contains the source address RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute - contains the destination address -------end 1------- -------message 2------- struct rdma_cm_id_stats RDMA_NL_RDMA_CM_ATTR_SRC_ADDR attribute RDMA_NL_RDMA_CM_ATTR_DST_ADDR attribute -------end 2------- Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-25 13:46:23 -07:00
Sean Hefty	b26f9b9949	RDMA/cma: Pass QP type into rdma_create_id() The RDMA CM currently infers the QP type from the port space selected by the user. In the future (eg with RDMA_PS_IB or XRC), there may not be a 1-1 correspondence between port space and QP type. For netlink export of RDMA CM state, we want to export the QP type to userspace, so it is cleaner to explicitly associate a QP type to an ID. Modify rdma_create_id() to allow the user to specify the QP type, and use it to make our selections of datagram versus connected mode. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-25 13:46:23 -07:00
Nir Muchtar	550e5ca77e	RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h> Move cma.c's internal definition of enum cma_state to enum rdma_cm_state in an exported header so that it can be exported via RDMA netlink. Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-25 13:46:22 -07:00
Roland Dreier	c3af0980ce	IB: Add devnode methods to cm_class and umad_class We want the ucmX, umadX and issmX device nodes to show up under /dev/infiniband, and additionally ucmX should have mode 0666. Add appropriate devnode methods to their class structs for this. Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-23 11:24:28 -07:00
Ira Weiny	c8367c4cd9	IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required We had a script which was looping through the devices returned from ibstat and attempted to register a SMI agent on an ethernet device. This caused a kernel panic for IBoE devices that don't have QP0. Fix this by checking if the QP exists before using it. Signed-off-by: Ira Weiny <weiny2@llnl.gov> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-23 11:15:48 -07:00
Roland Dreier	71c29bd5c2	IB/uverbs: Add devnode method to set path/mode We want udev to create a device node under /dev/infiniband with permission 0666 for uverbsX devices, so add a devnode method to set the appropriate info. Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-23 11:10:05 -07:00
Roland Dreier	04ea2f8197	RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node We want udev to create a device node under /dev/infiniband with permission 0666 for rdma_cm, so add that info to our struct miscdevice. Signed-off-by: Roland Dreier <roland@purestorage.com> Acked-by: Sean Hefty <sean.hefty@intel.com>	2011-05-23 10:48:43 -07:00
Linus Torvalds	06f4e926d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits) macvlan: fix panic if lowerdev in a bond tg3: Add braces around 5906 workaround. tg3: Fix NETIF_F_LOOPBACK error macvlan: remove one synchronize_rcu() call networking: NET_CLS_ROUTE4 depends on INET irda: Fix error propagation in ircomm_lmp_connect_response() irda: Kill set but unused variable 'bytes' in irlan_check_command_param() irda: Kill set but unused variable 'clen' in ircomm_connect_indication() rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport() be2net: Kill set but unused variable 'req' in lancer_fw_download() irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication() atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined. rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer(). rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler() rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection() rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window() pkt_sched: Kill set but unused variable 'protocol' in tc_classify() isdn: capi: Use pr_debug() instead of ifdefs. tg3: Update version to 3.119 tg3: Apply rx_discards fix to 5719/5720 ... Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c as per Davem.	2011-05-20 13:43:21 -07:00
Roland Dreier	b2cbae2c24	RDMA: Add netlink infrastructure Add basic RDMA netlink infrastructure that allows for registration of RDMA clients for which data is to be exported and supplies message construction callbacks. Signed-off-by: Nir Muchtar <nirm@voltaire.com> [ Reorganize a few things, add CONFIG_NET dependency. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-20 11:46:11 -07:00
Nir Muchtar	fd75c789ab	RDMA: Add error handling to ib_core_init() Fail RDMA midlayer initialization if sysfs setup fails. Signed-off-by: Nir Muchtar <nirm@voltaire.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-20 11:46:10 -07:00
David S. Miller	5fc3590c81	infiniband: Remove rt->rt_src usage in addr4_resolve() Use an explicit flow key and fetch it from there. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-10 13:32:48 -07:00
Roland Dreier	d0c49bf391	RDMA/iwcm: Get rid of enum iw_cm_event_status The IW_CM_EVENT_STATUS_xxx values were used in only a couple of places; cma.c uses -Exxx values instead, and so do the amso1100, cxgb3 and cxgb4 drivers -- only nes was using the enum values (with the mild consequence that all nes connection failures were treated as generic errors rather than reported as timeouts or rejections). We can fix this confusion by getting rid of enum iw_cm_event_status and using a plain int for struct iw_cm_event.status, and converting nes to use -Exxx as the other iWARP drivers do. This also gets rid of the warning drivers/infiniband/core/cma.c: In function 'cma_iw_handler': drivers/infiniband/core/cma.c:1333:3: warning: case value '4294967185' not in enumerated type 'enum iw_cm_event_status' drivers/infiniband/core/cma.c:1336:3: warning: case value '4294967186' not in enumerated type 'enum iw_cm_event_status' drivers/infiniband/core/cma.c:1332:3: warning: case value '4294967192' not in enumerated type 'enum iw_cm_event_status' Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Faisal Latif <faisal.latif@intel.com>	2011-05-09 22:23:57 -07:00
Hefty, Sean	a9bb79128a	RDMA/cma: Add an ID_REUSEADDR option Lustre requires that clients bind to a privileged port number before connecting to a remote server. On larger clusters (typically more than about 1000 nodes), the number of privileged ports is exhausted, resulting in lustre being unusable. To handle this, we add support for reusable addresses to the rdma_cm. This mimics the behavior of the socket option SO_REUSEADDR. A user may set an rdma_cm_id to reuse an address before calling rdma_bind_addr() (explicitly or implicitly). If set, other rdma_cm_id's may be bound to the same address, provided that they all have reuse enabled, and there are no active listens. If rdma_listen() is called on an rdma_cm_id that has reuse enabled, it will only succeed if there are no other id's bound to that same address. The reuse option is exported to user space. The behavior of the kernel reuse implementation was verified against that given by sockets. This patch is derived from a path by Ira Weiny <weiny2@llnl.gov> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-09 22:06:10 -07:00
Hefty, Sean	43b752daae	RDMA/cma: Fix handling of IPv6 addressing in cma_use_port cma_use_port() assumes that the sockaddr is an IPv4 address. Since IPv6 addressing is supported (and also to support other address families) make the code more generic in its address handling. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-05-09 22:06:10 -07:00
Michael Heinz	1eba843dd7	IB/mad: Improve an error message so error code is included Signed-off-by: Michael Heinz <michael.heinz@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-18 09:42:20 -07:00
Sean Hefty	1bdd6384c2	RDMA/addr: Fix return of uninitialized ret value Commit `b23dd4fe42` ("ipv4: Make output route lookup return rtable directly") resulted in leaving ret uninitialized, where it may later be returned. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-17 17:00:19 -07:00
Linus Torvalds	7a6362800c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1480 commits) bonding: enable netpoll without checking link status xfrm: Refcount destination entry on xfrm_lookup net: introduce rx_handler results and logic around that bonding: get rid of IFF_SLAVE_INACTIVE netdev->priv_flag bonding: wrap slave state work net: get rid of multiple bond-related netdevice->priv_flags bonding: register slave pointer for rx_handler be2net: Bump up the version number be2net: Copyright notice change. Update to Emulex instead of ServerEngines e1000e: fix kconfig for crc32 dependency netfilter ebtables: fix xt_AUDIT to work with ebtables xen network backend driver bonding: Improve syslog message at device creation time bonding: Call netif_carrier_off after register_netdevice bonding: Incorrect TX queue offset net_sched: fix ip_tos2prio xfrm: fix __xfrm_route_forward() be2net: Fix UDP packet detected status in RX compl Phonet: fix aligned-mode pipe socket buffer header reserve netxen: support for GbE port settings ... Fix up conflicts in drivers/staging/brcm80211/brcmsmac/wl_mac80211.c with the staging updates.	2011-03-16 16:29:25 -07:00
Sean Hefty	a396d43a35	RDMA/cma: Replace global lock in rdma_destroy_id() with id-specific one rdma_destroy_id currently uses the global rdma cm 'lock' to test if an rdma_cm_id has been bound to a device. This prevents an active address resolution callback handler from assigning a device to the rdma_cm_id after rdma_destroy_id checks for one. Instead, we can replace the use of the global lock around the check to the rdma_cm_id device pointer by setting the id state to destroying, then flushing all active callbacks. The latter is accomplished by acquiring and releasing the handler_mutex. Any active handler will complete first, and any newly scheduled handlers will find the rdma_cm_id in an invalid state. In addition to optimizing the current locking scheme, the use of the rdma_cm_id mutex is a more intuitive synchronization mechanism than that of the global lock. These changes are based on feedback from Doug Ledford <dledford@redhat.com> while he was trying to debug a crash in the rdma cm destroy path. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-15 10:57:34 -07:00
Sean Hefty	8d8ac86564	IB/cm: Cancel pending LAP message when exiting IB_CM_ESTABLISH state This problem was reported by Moni Shoua <monis@mellanox.com> and Amir Vadai <amirv@mellanox.com>: When destroying a cm_id from a context of a work queue and if the lap_state of this cm_id is IB_CM_LAP_SENT, we need to release the reference of this id that was taken upon the send of the LAP message. Otherwise, if the expected APR message gets lost, it is only after a long time that the reference will be released, while during that the work handler thread is not available to process other things. It turns out that we need to cancel any pending LAP messages whenever we transition out of the IB_CM_ESTABLISH state. This occurs when disconnecting - either sending or receiving a DREQ. It can also happen in a corner case where we receive a REJ message after sending an RTU, followed by a LAP. Add checks and cancel any outstanding LAP messages in these three cases. Canceling the LAP when sending a DREQ fixes the destroy problem reported by Moni. When a cm_id is destroyed in the IB_CM_ESTABLISHED state, it sends a DREQ to the remote side to notify the peer that the connection is going away. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-15 10:56:12 -07:00
Sean Hefty	29963437a4	IB/cm: Bump reference count on cm_id before invoking callback When processing a SIDR REQ, the ib_cm allocates a new cm_id. The refcount of the cm_id is initialized to 1. However, cm_process_work will decrement the refcount after invoking all callbacks. The result is that the cm_id will end up with refcount set to 0 by the end of the sidr req handler. If a user tries to destroy the cm_id, the destruction will proceed, under the incorrect assumption that no other threads are referencing the cm_id. This can lead to a crash when the cm callback thread tries to access the cm_id. This problem was noticed as part of a larger investigation with kernel crashes in the rdma_cm when running on a real time OS. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Doug Ledford <dledford@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-15 10:56:12 -07:00
Sean Hefty	25ae21a101	RDMA/cma: Fix crash in request handlers Doug Ledford and Red Hat reported a crash when running the rdma_cm on a real-time OS. The crash has the following call trace: cm_process_work cma_req_handler cma_disable_callback rdma_create_id kzalloc init_completion cma_get_net_info cma_save_net_info cma_any_addr cma_zero_addr rdma_translate_ip rdma_copy_addr cma_acquire_dev rdma_addr_get_sgid ib_find_cached_gid cma_attach_to_dev ucma_event_handler kzalloc ib_copy_ah_attr_to_user cma_comp [ preempted ] cma_write copy_from_user ucma_destroy_id copy_from_user _ucma_find_context ucma_put_ctx ucma_free_ctx rdma_destroy_id cma_exch cma_cancel_operation rdma_node_get_transport rt_mutex_slowunlock bad_area_nosemaphore oops_enter They were able to reproduce the crash multiple times with the following details: Crash seems to always happen on the: mutex_unlock(&conn_id->handler_mutex); as conn_id looks to have been freed during this code path. An examination of the code shows that a race exists in the request handlers. When a new connection request is received, the rdma_cm allocates a new connection identifier. This identifier has a single reference count on it. If a user calls rdma_destroy_id() from another thread after receiving a callback, rdma_destroy_id will proceed to destroy the id and free the associated memory. However, the request handlers may still be in the process of running. When control returns to the request handlers, they can attempt to access the newly created identifiers. Fix this by holding a reference on the newly created rdma_cm_id until the request handler is through accessing it. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Doug Ledford <dledford@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-03-15 10:00:28 -07:00
David S. Miller	4c9483b2fb	ipv6: Convert to use flowi6 where applicable. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:54 -08:00
David S. Miller	1d28f42c1b	net: Put flowi_* prefix on AF independent members of struct flowi I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:44 -08:00
David S. Miller	78fbfd8a65	ipv4: Create and use route lookup helpers. The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-12 15:08:42 -08:00
David S. Miller	b23dd4fe42	ipv4: Make output route lookup return rtable directly. Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-03-02 14:31:35 -08:00
Roland Dreier	e51c7b1ab0	Merge branches 'amso1100', 'cma', 'cxgb4', 'misc', 'mlx4' and 'qib' into for-next	2011-01-29 20:45:04 -08:00
Tejun Heo	96e61fa55e	RDMA: Update missed conversion of flush_scheduled_work() Commit `f06267104d` ("RDMA: Update workqueue usage") introduced ib_wq and removed the use of flush_scheduled_work(); however, during the merge process one chunk was lost in ib_sa_remove_one(). Fix it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-01-28 16:39:08 -08:00
Steve Wise	e86f8b06f5	RDMA/ucma: Copy iWARP route information on queries For iWARP rdma_cm ids, the "route" information is the L2 src and next hop addresses. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2011-01-28 16:34:05 -08:00
Tejun Heo	f06267104d	RDMA: Update workqueue usage * ib_wq is added, which is used as the common workqueue for infiniband instead of the system workqueue. All system workqueue usages including flush_scheduled_work() callers are converted to use and flush ib_wq. * cancel_delayed_work() + flush_scheduled_work() converted to cancel_delayed_work_sync(). * qib_wq is removed and ib_wq is used instead. This is to prepare for deprecation of flush_scheduled_work(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2011-01-16 21:16:31 -08:00
David S. Miller	17f7f4d9fc	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/fib_frontend.c	2010-12-26 22:37:05 -08:00
Dan Carpenter	7182afea8d	IB/uverbs: Handle large number of entries in poll CQ In ib_uverbs_poll_cq() code there is a potential integer overflow if userspace passes in a large cmd.ne. The calls to kmalloc() would allocate smaller buffers than intended, leading to memory corruption. There iss also an information leak if resp wasn't all used. Unprivileged userspace may call this function, although only if an RDMA device that uses this function is present. Fix this by copying CQ entries one at a time, which avoids the allocation entirely, and also by moving this copying into a function that makes sure to initialize all memory copied to userspace. Special thanks to Jason Gunthorpe <jgunthorpe@obsidianresearch.com> for his help and advice. Cc: <stable@kernel.org> Signed-off-by: Dan Carpenter <error27@gmail.com> [ Monkey around with things a bit to avoid bad code generation by gcc when designated initializers are used. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-08 15:23:49 -08:00
Vasiliy Kulikov	91a4d157d0	IB: Fix information leak in marshalling code ib_ucm_init_qp_attr() and ucma_init_qp_attr() pass struct ib_uverbs_qp_attr with reserved, qp_state, {ah_attr,alt_ah_attr}{reserved,->grh.reserved} fields uninitialized to copy_to_user(). This leads to leaking of contents of kernel stack memory to userspace. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-12-01 16:33:18 -08:00
Or Gerlitz	f55864a4f4	IB/pack: Remove some unused code added by the IBoE patches Remove unused functions added by commit `ff7f5aab35` ("IB/pack: IBoE UD packet packing support"). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>	2010-12-01 16:30:18 -08:00
Eric Dumazet	22f4fbd9bd	infiniband: remove dev_base_lock use dev_base_lock is the legacy way to lock the device list, and is planned to disappear. (writers hold RTNL, readers hold RCU lock) Convert rdma_translate_ip() and update_ipv6_gids() to RCU locking. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-24 11:41:56 -08:00
Eric Dumazet	72cdd1d971	net: get rid of rtable->idev It seems idev field in struct rtable has no special purpose, but adding extra atomic ops. We hold refcounts on the device itself (using percpu data, so pretty cheap in current kernel). infiniband case is solved using dst.dev instead of idev->dev Removal of this field means routing without route cache is now using shared data, percpu data, and only potential contention is a pair of atomic ops on struct neighbour per forwarded packet. About 5% speedup on routing test. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-11-11 10:29:40 -08:00
Roland Dreier	116e9535fe	Merge branches 'amso1100', 'cma', 'cxgb3', 'cxgb4', 'ehca', 'iboe', 'ipoib', 'misc', 'mlx4', 'nes', 'qib' and 'srp' into for-next	2010-10-26 16:09:11 -07:00
Eli Cohen	8ad330a002	IB/core: Add link layer type information to sysfs Since an IB transport port may use either IB or Ethernet as its link layer, add the file /sys/class/infiniband/<device>/ports/<port_num>/link_layer to show the link layer for the port. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	af7bd46376	IB/core: Add VLAN support for IBoE Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Eli Cohen	2420b60b1d	IB/uverbs: Return link layer type to userspace for query port operation Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-25 10:20:39 -07:00
Steve Wise	97cb7e40c6	RDMA/ucma: Allow tuning the max listen backlog For iWARP connections, the connect request is carried in a TCP payload on an already established TCP connection. So if the ucma's backlog is full, the connection request is transmitted and acked at the TCP level by the time the connect request gets dropped in the ucma. The end result is the connection gets rejected by the iWARP provider. Further, a 32 node 256NP OpenMPI job will generate > 128 connect requests on some ranks. This patch increases the default max backlog to 1024, and adds a sysctl variable so the backlog can be adjusted at run time. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-23 13:41:40 -07:00
Eli Cohen	ff7f5aab35	IB/pack: IBoE UD packet packing support Add support for packing IBoE packet headers. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Clean up and fix ib_ud_header_init() a bit. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-14 12:41:29 -07:00
Eli Cohen	3c86aa70bf	RDMA/cm: Add RDMA CM support for IBoE devices Add support for IBoE device binding and IP --> GID resolution. Path resolving and multicast joining are implemented within cma.c by filling in the responses and running callbacks in the CMA work queue. IP --> GID resolution always yields IPv6 link local addresses; remote GIDs are derived from the destination MAC address of the remote port. Multicast GIDs are always mapped to multicast MACs as is done in IPv6. (IPv4 multicast is enabled by translating IPv4 multicast addresses to IPv6 multicast as described in <http://www.mail-archive.com/ipng@sunroof.eng.sun.com/msg02134.html>.) Some helper functions are added to ib_addr.h. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-13 15:46:43 -07:00
Eli Cohen	fac70d5191	IB/mad: IBoE supports only QP1 (no QP0) Since IBoE is using Ethernet as its link layer, there is no central management entity so there is need for QP0. QP1 is still needed since it handles communications between CM agents. This patch will skip QP0 and create only QP1 for IBoE ports. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-13 09:38:11 -07:00
Animesh K Trivedi	26012f0750	RDMA/iwcm: Fix hang in uninterruptible wait on cm_id destroy A process can get stuck in an uninterruptible wait in the kernel while destroying a cm_id when iw_cm_connect() fails: For example, When creation of a PD fails but the user continues with an attempt to connect to the server without checking the return value, in iw_cm_connect() a NULL qp is found so the call fails. However the IWCM_F_CONNECT_WAIT bit is not cleared. destroy_cm_id() then waits forever for IWCM_F_CONNECT_WAIT to be cleared. The same problem exists on the passive side with the accept call. Fix this by clearing the bit and waking up any waiters in the appropriate spots. Signed-off-by: Animesh Trivedi <atr@zurich.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-10-11 20:24:04 -07:00
Thomas Gleixner	557d0540b9	IB/umad: Make user_mad semaphore a real one Get rid of init_MUTEX[_LOCKED]() and use sema_init() instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-28 20:52:21 -07:00
Eli Cohen	a3f5adaf49	IB/core: Add link layer property to ports This patch allows ports to have different link layers: IB_LINK_LAYER_INFINIBAND or IB_LINK_LAYER_ETHERNET. This is required for adding IBoE (InfiniBand-over-Ethernet, aka RoCE) support. For devices that do not provide an implementation for querying the link layer property of a port, we return a default value based on the transport: RMA_TRANSPORT_IB nodes will return IB_LINK_LAYER_INFINIBAND and RDMA_TRANSPORT_IWARP nodes will return IB_LINK_LAYER_ETHERNET. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-09-27 17:51:10 -07:00
Linus Torvalds	3cc08fc35d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (42 commits) IB/qib: Add missing <linux/slab.h> include IB/ehca: Drop unnecessary NULL test RDMA/nes: Fix confusing if statement indentation IB/ehca: Init irq tasklet before irq can happen RDMA/nes: Fix misindented code RDMA/nes: Fix showing wqm_quanta RDMA/nes: Get rid of "set but not used" variables RDMA/nes: Read firmware version from correct place IB/srp: Export req_lim via sysfs IB/srp: Make receive buffer handling more robust IB/srp: Use print_hex_dump() IB: Rename RAW_ETY to RAW_ETHERTYPE RDMA/nes: Fix two sparse warnings RDMA/cxgb3: Make needlessly global iwch_l2t_send() static IB/iser: Make needlessly global iser_alloc_rx_descriptors() static RDMA/cxgb4: Add timeouts when waiting for FW responses IB/qib: Fix race between qib_error_qp() and receive packet processing IB/qib: Limit the number of packets processed per interrupt IB/qib: Allow writes to the diag_counters to be able to clear them IB/qib: Set cfgctxts to number of CPUs by default ...	2010-08-07 17:08:02 -07:00
Aleksey Senin	a2ebf07ae5	IB: Rename RAW_ETY to RAW_ETHERTYPE Change abbreviated IB_QPT_RAW_ETY to IB_QPT_RAW_ETHERTYPE to make the special QP type easier to understand. cf http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg04530.html Signed-off-by: Aleksey Senin <alekseys@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-08-04 10:44:19 -07:00
Sean Hefty	50a025c69e	IB/cm: Check LAP state before sending an MRA NULL pointer dereferences in ib_cm_init_qp_attr() were seen by some users. From a crash dump, I determined that we died in cm_init_qp_rts_attr() (it's inlined, so it doesn't show up in the traceback) on the line labeled below: static int cm_init_qp_rts_attr(struct cm_id_private cm_id_priv, struct ib_qp_attr qp_attr, int qp_attr_mask) { ........ if (cm_id_priv->id.lap_state == IB_CM_LAP_UNINIT) { ..... } else { qp_attr_mask = IB_QP_ALT_PATH \| IB_QP_PATH_MIG_STATE; qp_attr->alt_port_num = cm_id_priv->alt_av.port->port_num; <-die The problem is that the rdma_cm can call ib_send_cm_mra() after a connection has been established. The ib_cm incorrectly assumes that the MRA is in response to a LAP (load alternate path) message, even though no LAP message has been received. The ib_cm needs to check the lap_state before sending an MRA if the cm_id state is established. Reported-by: Arthur Kepner <akepner@sgi.com> Reported-by: Josh England <jjengla@gmail.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-28 15:18:24 -07:00
Roland Dreier	f400e5b38a	IB/umad: Remove unused-but-set variable 'already_dead' Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-07-14 13:25:04 -07:00
Changli Gao	d8d1f30b95	net-next: remove useless union keyword remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-06-10 23:31:35 -07:00
Julia Lawall	e642df6a0b	IB/ucm: Use memdup_user() Use memdup_user when user data is immediately copied into the allocated region. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; position p; identifier l1,l2; @@ - to = $kmalloc@p\\|kzalloc@p$(size,flag); + to = memdup_user(from,size); if ( - to==NULL + IS_ERR(to) \|\| ...) { <+... when != goto l1; - -ENOMEM + PTR_ERR(to) ...+> } - if (copy_from_user(to, from, size) != 0) { - <+... when != goto l2; - -EFAULT - ...+> - } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk>	2010-05-25 21:10:57 -07:00
Roland Dreier	acdc30b56a	Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'qib' into for-next	2010-05-25 09:54:03 -07:00
Roland Dreier	1693395511	IB/mad: Make needlessly global mad_sendq_size/mad_recvq_size static Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-23 21:39:31 -07:00
Ralph Campbell	9a6edb60ec	IB/core: Allow device-specific per-port sysfs files Add a new parameter to ib_register_device() so that low-level device drivers can pass in a pointer to a callback function that will be called for each port that is registered in sysfs. This allows low-level device drivers to create files in /sys/class/infiniband/<hca>/ports/<N>/ without having to poke through the internals of the RDMA sysfs handling. There is no need for an unregister function since the kobject reference will go to zero when ib_unregister_device() is called. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-21 10:34:44 -07:00
Roland Dreier	ffebedb7ab	Merge branches 'amso1100', 'bkl', 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'iser', 'masked-atomics', 'misc', 'mthca' and 'nes' into for-next	2010-05-15 20:06:01 -07:00
Julia Lawall	9893e742a0	IB/core: Use kmemdup() instead of kmalloc()+memcpy() Use kmemdup when some other buffer is immediately copied into the allocated region. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression from,to,size,flag; statement S; @@ - to = $kmalloc\\|kzalloc$(size,flag); + to = kmemdup(from,size,flag); if (to==NULL \|\| ...) S - memcpy(to, from, size); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-05-15 20:05:07 -07:00
Tetsuo Handa	5d7220e8dc	RDMA/cma: Randomize local port allocation Randomize local port allocation in the way sctp_get_port_local() does. Update rover at the end of loop since we're likely to pick a valid port on the first try. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-04-21 16:18:40 -07:00
Roland Dreier	bc1db9af73	IB: Explicitly rule out llseek to avoid BKL in default_llseek() Several RDMA user-access drivers have file_operations structures with no .llseek method set. None of the drivers actually do anything with f_pos, so this means llseek is essentially a NOP, instead of returning an error as leaving other file_operations methods unimplemented would do. This is mostly harmless, except that a NULL .llseek means that default_llseek() is used, and this function grabs the BKL, which we would like to avoid. Since llseek does nothing useful on these files, we would like it to return an error to userspace instead of silently grabbing the BKL and succeeding. For nearly all of the file types, we take the belt-and-suspenders approach of setting the .llseek method to no_llseek and also calling nonseekable_open(); the exception is the uverbs_event files, which are created with anon_inode_getfile(), which already sets f_mode the same way as nonseekable_open() would. This work is motivated by Arnd Bergmann's bkl-removal tree. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-04-21 12:17:38 -07:00
Linus Torvalds	0eddb519b9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Check correct variable for allocation failure RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp() RDMA/cm: Set num_paths when manually assigning path records IB/cm: Fix device_create() return value check	2010-04-09 11:53:06 -07:00
Roland Dreier	5091b35388	Merge branches 'cma', 'misc', 'mlx4' and 'nes' into for-linus	2010-04-09 09:14:21 -07:00
Sean Hefty	ae2d9293d7	RDMA/cm: Set num_paths when manually assigning path records When manually assigning the path records to use for a connection, save the number of paths that were set. Otherwise, checks against num_path will show 0, even though path record data is available. This was discovered by manually setting the path records from user space, then querying the kernel to see if the correct path records were assigned, only to discover that the kernel returned 0 path records to the query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-04-07 14:13:22 -07:00
Jani Nikula	3e340c05c0	IB/cm: Fix device_create() return value check Use IS_ERR() instead of comparing to NULL. Signed-off-by: Jani Nikula <ext-jani.1.nikula@nokia.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-03-31 14:26:52 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Greg Kroah-Hartman	21e3bde964	sysfs: fix sysfs lockdep warning in infiniband code This fixes a sysfs lockdep warning in the infiniband code. Cc: Yinghai Lu <yinghai@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-19 07:12:13 -07:00
Linus Torvalds	122ce878dc	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/nes: Fix CX4 link problem in back-to-back configuration RDMA/nes: Clear stall bit before destroying NIC QP RDMA/nes: Set assume_aligned_header bit RDMA/cxgb3: Wait at least one schedule cycle during device removal IB/mad: Ignore iWARP devices on device removal IPoIB: Include return code in trace message for ib_post_send() failures IPoIB: Fix TX queue lockup with mixed UD/CM traffic	2010-03-13 14:38:31 -08:00
Steve Wise	070e140c4c	IB/mad: Ignore iWARP devices on device removal When an iWARP device is unloaded, the ib_mad module logs errors. It should be ignoring iWARP devices on device removal just like it does on device add. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-03-11 14:00:08 -08:00
Emese Revfy	52cf25d0ab	Driver core: Constify struct sysfs_ops in struct kobj_type Constify struct sysfs_ops. This is part of the ops structure constification effort started by Arjan van de Ven et al. Benefits of this constification: * prevents modification of data that is shared (referenced) by many other structure instances at runtime * detects/prevents accidental (but not intentional) modification attempts on archs that enforce read-only kernel data at runtime * potentially better optimized code as the compiler can assume that the const data cannot be changed * the compiler/linker move const data into .rodata and therefore exclude them from false sharing Signed-off-by: Emese Revfy <re.emese@gmail.com> Acked-by: David Teigland <teigland@redhat.com> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com> Acked-by: Hans J. Koch <hjk@linutronix.de> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:49 -08:00
Andi Kleen	0933e2d98d	driver core: Convert some drivers to CLASS_ATTR_STRING Convert some drivers who export a single string as class attribute to the new class_attr_string functions. This removes redundant code all over. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:48 -08:00
Andi Kleen	28812fe11a	driver-core: Add attribute argument to class_attribute show/store Passing the attribute to the low level IO functions allows all kinds of cleanups, by sharing low level IO code without requiring an own function for every piece of data. Also drivers can extend the attributes with own data fields and use that in the low level function. This makes the class attributes the same as sysdev_class attributes and plain attributes. This will allow further cleanups in drivers. Full tree sweep converting all users. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:48 -08:00
Akinobu Mita	19b629f581	infiniband: use for_each_set_bit() Replace open-coded loop with for_each_set_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:23 -08:00
Linus Torvalds	0f2cc4ecd8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) init: Open /dev/console from rootfs mqueue: fix typo "failues" -> "failures" mqueue: only set error codes if they are really necessary mqueue: simplify do_open() error handling mqueue: apply mathematics distributivity on mq_bytes calculation mqueue: remove unneeded info->messages initialization mqueue: fix mq_open() file descriptor leak on user-space processes fix race in d_splice_alias() set S_DEAD on unlink() and non-directory rename() victims vfs: add NOFOLLOW flag to umount(2) get rid of ->mnt_parent in tomoyo/realpath hppfs can use existing proc_mnt, no need for do_kern_mount() in there Mirror MS_KERNMOUNT in ->mnt_flags get rid of useless vfsmount_lock use in put_mnt_ns() Take vfsmount_lock to fs/internal.h get rid of insanity with namespace roots in tomoyo take check for new events in namespace (guts of mounts_poll()) to namespace.c Don't mess with generic_permission() under ->d_lock in hpfs sanitize const/signedness for udf nilfs: sanitize const/signedness in dealing with ->d_name.name ... Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c	2010-03-04 08:15:33 -08:00
Al Viro	b1e4594ba0	switch infiniband uverbs to anon_inodes Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:07:27 -05:00
Roland Dreier	fe8875e5a4	Merge branch 'misc' into for-next Conflicts: drivers/infiniband/core/uverbs_main.c	2010-03-01 23:52:31 -08:00
Roland Dreier	a265e5587f	IB/uverbs: Use anon_inodes instead of private infinibandeventfs The anon_inodes interface has been split to allow creating a bare (non-installed) file pointer and also extended to allow specifying O_RDONLY in the flags. This makes it a suitable replacement for the private "infinibandeventfs" pseudo-filesystem used by uverbs, and this replacement saves a small chunk of boilerplate code. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 16:51:20 -08:00
Eli Cohen	920d706c89	IB/core: Fix and clean up ib_ud_header_init() ib_ud_header_init() first clears header and then fills up the various fields. Later on, it tests header->immediate_present, which it has already cleared, so the condition is always false. Fix this by adding an immediate_present parameter and setting header->immediate_present as is done with grh_present. Also remove unused calculation of header_len. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 14:54:10 -08:00
Alexander Chiang	cdb8e43889	IB/ucm: Clean whitespace errors As shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:48 -08:00
Alexander Chiang	daa913580e	IB/ucm: Increase maximum devices supported Some large systems may support more than IB_UCM_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. the first IB_UCM_MAX_DEVICES keep the same major/minor device numbers they've always had. If there are more than IB_UCM_MAX_DEVICES, then we dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:48 -08:00
Alexander Chiang	31d14b6e10	IB/ucm: Use stack variable 'base' in ib_ucm_add_one This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UCM_MAX_DEVICES. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:47 -08:00
Alexander Chiang	dd08f702dd	IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UCM_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:47 -08:00
Alexander Chiang	d3f2c67f2d	IB/umad: Clean whitespace Clean errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:46 -08:00
Alexander Chiang	8698d3fecc	IB/umad: Increase maximum devices supported Some large systems may support more than IB_UMAD_MAX_PORTS (currently 64). This change allows us to support more ports in a backwards-compatible manner. The first IB_UMAD_MAX_PORTS keep the same major/minor device numbers they've always had. If there are more than IB_UMAD_MAX_PORTS, we then dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:45 -08:00
Alexander Chiang	dc2ed5e3c9	IB/umad: Use stack variable 'base' in ib_umad_init_port This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UMAD_MAX_PORTS in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:45 -08:00
Alexander Chiang	d451b8df9f	IB/umad: Use stack variable 'devnum' in ib_umad_init_port This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UMAD_MAX_PORTS in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:44 -08:00
Alexander Chiang	6aa2a86ec4	IB/umad: Remove port_table[] We no longer need this data structure, as it was used to associate an inode back to a struct ib_umad_port during ->open(). But now that we're embedding a struct cdev in struct ib_umad_port, we can use the container_of() macro to go from the inode back to the device instead. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:44 -08:00
Alexander Chiang	2b937afcab	IB/umad: Convert *cdev to cdev in struct ib_umad_port Instead of storing pointers to cdev and sm_cdev, embed the full structures instead. This change allows us to use the container_of() macro in ib_umad_open() and ib_umad_sm_open() in a future patch. This change increases the size of struct ib_umad_port to 320 bytes from 128. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:43 -08:00
Alexander Chiang	9afed76d59	IB/uverbs: Whitespace cleanup Clean up the errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:42 -08:00
Alexander Chiang	830a387138	IB/uverbs: Pack struct ib_uverbs_event_file tighter Eliminate some padding in the structure by rearranging the members. sizeof(struct ib_uverbs_event_file) is now 72 bytes (from 80) and more members now fit in the first cacheline. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:42 -08:00
Alexander Chiang	6d6a0e71ee	IB/uverbs: Increase maximum devices supported Some large systems may support more than IB_UVERBS_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. The first IB_UVERBS_MAX_DEVICES keep the same major/minor device numbers that they've always had. If there are more than IB_UVERBS_MAX_DEVICES, we then dynamically request a new major device number (new minors start at 0). This change increases the maximum number of HCAs to 64 (from 32). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:41 -08:00
Alexander Chiang	ddbd688301	IB/uverbs: use stack variable 'base' in ib_uverbs_add_one This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UVERBS_MAX_DEVICES in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:40 -08:00
Alexander Chiang	38707980c4	IB/uverbs: Use stack variable 'devnum' in ib_uverbs_add_one This change is not useful by itself, but it sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UVERBS_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:40 -08:00
Alexander Chiang	2a72f21226	IB/uverbs: Remove dev_table dev_table's raison d'etre was to associate an inode back to a struct ib_uverbs_device. However, now that we've converted ib_uverbs_device to contain an embedded cdev (instead of a *cdev), we can use the container_of() macro and cast back to the containing device. There's no longer any need for dev_table, so get rid of it. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:39 -08:00
Alexander Chiang	055422ddbb	IB/uverbs: Convert *cdev to cdev in struct ib_uverbs_device Instead of storing a pointer to a cdev, embed the entire struct cdev. This change allows us to use the container_of() macro in ib_uverbs_open() in a future patch. This change increases the size of struct ib_uverbs_device to 168 bytes across 3 cachelines from 80 bytes in 2 cachelines. However, we rearrange the members so that everything fits into the first cacheline except for the struct cdev. Finally, we don't touch the cdev in any fastpaths, so this change shouldn't negatively affect performance. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-24 10:23:39 -08:00
Jiri Slaby	ccbe9f0b11	RDMA: Use rlimit helpers Make sure compiler won't do weird things with limits by using the rlimit helpers added in `3e10e716` ("resource: add helpers for fetching rlimits"). E.g. fetching them twice may return 2 different values after writable limits are implemented. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-11 15:40:48 -08:00
Sean Hefty	8523c04809	RDMA/cm: Revert association of an RDMA device when binding to loopback Revert the following change from commit `6f8372b6` ("RDMA/cm: fix loopback address support") The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non- zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdma_bind_addr, no device is associated with the rdma_cm_id. Fix this. It turns out that important apps such as Open MPI depend on rdma_bind_addr() NOT associating any RDMA device when binding to a loopback address. Open MPI is being updated to deal with this, but at least until a new Open MPI release is available, maintain the previous behavior: allow rdma_bind_addr() to succeed, but do not bind to a device. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-02-10 12:00:48 -08:00
Robert P. J. Day	fd4582a399	IB/addr: Correct CONFIG_IPv6 to CONFIG_IPV6 Correct misspelled "CONFIG_IPv6" that was introduced in commit `d14714df` ("IB/addr: Fix IPv6 routing lookup"). The config variable should be all uppercase. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> [ This was my fault when I munged the original patch. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2010-01-06 13:16:30 -08:00
Linus Torvalds	bac5e54c29	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (38 commits) direct I/O fallback sync simplification ocfs: stop using do_sync_mapping_range cleanup blockdev_direct_IO locking make generic_acl slightly more generic sanitize xattr handler prototypes libfs: move EXPORT_SYMBOL for d_alloc_name vfs: force reval of target when following LAST_BIND symlinks (try #7) ima: limit imbalance msg Untangling ima mess, part 3: kill dead code in ima Untangling ima mess, part 2: deal with counters Untangling ima mess, part 1: alloc_file() O_TRUNC open shouldn't fail after file truncation ima: call ima_inode_free ima_inode_free IMA: clean up the IMA counts updating code ima: only insert at inode creation time ima: valid return code from ima_inode_alloc fs: move get_empty_filp() deffinition to internal.h Sanitize exec_permission_lite() Kill cached_lookup() and real_lookup() Kill path_lookup_open() ... Trivial conflicts in fs/direct-io.c	2009-12-16 12:04:02 -08:00
Al Viro	2c48b9c455	switch alloc_file() to passing struct path ... and have the caller grab both mnt and dentry; kill leak in infiniband, while we are at it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-16 12:16:42 -05:00
Roland Dreier	14f369d1d6	Merge branches 'amso1100', 'cma', 'cxgb3', 'ehca', 'ipath', 'ipoib', 'iser', 'misc', 'mlx4' and 'nes' into for-next	2009-12-15 23:39:25 -08:00
Roel Kluin	df42245a3c	IB/uverbs: Fix return of PTR_ERR() of wrong pointer in ib_uverbs_get_context() Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-12-09 14:30:44 -08:00
Sean Hefty	d14714df61	IB/addr: Fix IPv6 routing lookup Include link scope as part of address resolution. Combine local and remote address resolution into a single, simpler code path. Fix error checking in the IPv6 routing lookups. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fix up cma_check_linklocal() for !IPV6 case. - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 16:46:25 -08:00
Sean Hefty	923c100ef0	IB/addr: Simplify resolving IPv4 addresses Merge resolve local/remote address resolution into a single data flow to ensure consistent access and use of the local routing tables. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 13:26:51 -08:00
Sean Hefty	6f8372b69c	RDMA/cm: fix loopback address support The RDMA CM is intended to support the use of a loopback address when establishing a connection; however, the behavior of the CM when loopback addresses are used is confusing and does not always work, depending on whether loopback was specified by the server, the client, or both. The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non- zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdam_bind_addr, no device is associated with the rdma_cm_id. Fix this. If a loopback address is specified by the client as the destination address for a connection, it will fail to establish a connection. This is true even if the server is listing across all addresses or on the loopback address itself. The issue is that the server tries to translate the IP address carried in the REQ message to a local net_device address, which fails. The translation is not needed in this case, since the REQ carries the actual HW address that should be used. Finally, cleanup loopback support to be more transport neutral. Replace separate calls to get/set the sgid and dgid from the device address to a single call that behaves correctly depending on the format of the device address. And support both IPv4 and IPv6 address formats. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fixed RDS build by s/ib_addr_get/rdma_addr_get/ - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 13:26:06 -08:00
Sean Hefty	c4315d85f9	IB/addr: Store net_device type instead of translating to RDMA transport The struct rdma_dev_addr stores net_device address information: the source device address, destination hardware address, and broadcast address. For consistency, store the net_device type rather than converting it to the rdma_node_type. The type indicates the format of the various hardware addresses, which is what we're concerned with, and not the RDMA node type that the address may map to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:57:18 -08:00
Sean Hefty	d2e0886245	IB/addr: Verify source and destination address families match If a source address is provided, verify that the address family matches that of the destination address. If the source is not specified, use the same address family as the destination. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Sean Hefty	6266ed6e41	RDMA/cma: Replace net_device pointer with index Provide the device interface when resolving route information to ensure that the correct outbound device is used. This will also simplify processing of sin6_scope_id for IPv6 support. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthrope@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Jason Gunthorpe	e2e626972e	RDMA/cma: Fix AF_INET6 support in multicast joining If joining to an AF_INET6 address, we need to map the address to a MGID in the same way as the IP stack. The old code would just fall through to the IPv4 case and generate garbage. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:22 -08:00
Jason Gunthorpe	1c9b281997	RDMA/cma: Correct detection of SA Created MGID RDMA CM treats AF_INET6 addresses that are either 0 or prefixed with FF1x:A01B::/32 as MGIDs, but the detection for the prefix was buggy; fix it up. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-19 12:55:21 -08:00
Eric Dumazet	0f9ea5d2ab	RDMA/addr: Use appropriate locking with for_each_netdev() for_each_netdev() should be used with RTNL or dev_base_lock held, or else we risk a crash. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-18 14:24:34 -08:00
Sean Hefty	a7ca1f00ed	RDMA/ucma: Add option to manually set IB path Export rdma_set_ib_paths to user space to allow applications to manually set the IB path used for connections. This allows alternative ways for a user space application or library to obtain path record information, including retrieving path information from cached data, avoiding direct interaction with the IB SA. The IB SA is a single, centralized entity that can limit scaling on large clusters running MPI applications. Future changes to the rdma cm can expand on this framework to support the full range of features allowed by the IB CM, such as separate forward and reverse paths and APM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Reviewed-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-11-16 09:30:33 -08:00
Alexey Dobriyan	d43c36dc6b	headers: remove sched.h from interrupt.h After m68k's task_thread_info() doesn't refer to current, it's possible to remove sched.h from interrupt.h and not break m68k! Many thanks to Heiko Carstens for allowing this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>	2009-10-11 11:20:58 -07:00
David J. Wilder	85f20b39fd	RDMA/addr: Fix resolution of local IPv6 addresses This patch allows a local IPv6 address to be resolved by rdma_cm. To reproduce the problem: $ rping -s -v -a ::0 & $ rping -c -v -a <IPv6 address local to this system> rdma_resolve_addr error -1 Local IPv6 address was obtained with "ip addr show ib0" Addresses: https://bugs.openfabrics.org/show_bug.cgi?id=1759 Signed-off-by: David Wilder <dwilder@us.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-10-07 16:03:18 -07:00
Steve Wise	54e05f15cc	RDMA/iwcm: Don't call provider reject func with irqs disabled In commit `cb58160e` ("RDMA/iwcm: Reject the connection when the cm_id is destroyed") a call to the provider's reject handler was added to destroy_cm_id() to fix a provider endpoint leak. This call needs to be done with interrupts enabled. So unlock and relock around this call. This is safe because: 1) the provider will do nothing with this endpoint until the iwcm either accepts or rejects. 2) the lock is only released after the iwcm state is changed, so an errant iwcm app that is destroying -and- rejecting the connection concurrently will get a failure on one of the calls. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2009-10-07 15:38:12 -07:00
Alexey Dobriyan	a99bbaf5ee	headers: remove sched.h from poll.h Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-04 15:05:10 -07:00

1 2 3 4 5 ...

721 Commits