linux

Commit Graph

Author	SHA1	Message	Date
Vipul Pandya	04236df2a5	RDMA/cxgb4: Always log async errors Log AEs even if the QP isn't in RTS. It is useful information. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-14 15:51:56 -08:00
Vipul Pandya	325abead6c	RDMA/cxgb4: Keep QP referenced until TID released The driver is currently releasing the last ref on the QP too early. This can cause bus errors due to HW still fetching WRs from the HW queue. The fix is to keep a qp ref until we release the HW TID. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-14 15:51:56 -08:00
Vipul Pandya	1557967bf9	RDMA/cxgb4: Display streaming mode error only if detected in RTS With later firmware, the chances of getting streaming mode data after we exit RTS is likely, so we don't need to warn for it. The only real case where we don't expect it is when the QP is in RTS. Move QP to ERROR when streaming mode data received. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-14 15:51:55 -08:00
Vipul Pandya	91e9c07195	RDMA/cxgb4: Abort connections when moving to ERROR state If a FINI operation fails, then we need to ABORT instead of CLOSE. Also, if we ABORT due to unexpected STREAMING data, then wake up anybody blocked in FINI... Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-14 15:51:55 -08:00
Vipul Pandya	55abf8df0a	RDMA/cxgb4: Abort connections that receive unexpected streaming mode data This error means the RDMA connection was knocked out of RDMA mode, probably due to an error on the connection. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-14 15:51:55 -08:00
David S. Miller	fd5023111c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-08 18:02:14 -05:00
Roland Dreier	cbdba97a0f	Merge branches 'ipoib', 'mlx4' and 'qib' into for-next	2013-02-05 09:45:25 -08:00
Mike Marciniszyn	d359f35430	IB/qib: Fix for broken sparse warning fix Commit `1fb9fed6d4` ("IB/qib: Fix QP RCU sparse warning") broke QP hash list deletion in qp_remove() badly. This patch restores the former for loop behavior, while still fixing the sparse warnings. Cc: <stable@vger.kernel.org> Reviewed-by: Gary Leshner <gary.s.leshner@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-05 09:43:09 -08:00
Shlomo Pongratz	7e5a90c25f	IPoIB: Fix crash due to skb double destruct After commit `b13912bbb4` ("IPoIB: Call skb_dst_drop() once skb is enqueued for sending"), using connected mode and running multithreaded iperf for long time, ie iperf -c <IP> -P 16 -t 3600 results in a crash. After the above-mentioned patch, the driver is calling skb_orphan() and skb_dst_drop() after calling post_send() in ipoib_cm.c::ipoib_cm_send() (also in ipoib_ib.c::ipoib_send()) The problem with this is, as is written in a comment in both routines, "it's entirely possible that the completion handler will run before we execute anything after the post_send()." This leads to running the skb cleanup routines simultaneously in two different contexts. The solution is to always perform the skb_orphan() and skb_dst_drop() before queueing the send work request. If an error occurs, then it will be no different than the regular case where dev_free_skb_any() in the completion path, which is assumed to be after these two routines. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-05 09:35:06 -08:00
Jesper Juhl	fe194f19da	IB: cxgb3: delay freeing mem untill entirely done with it Sure, it's just the pointer value we use, but the coverity checker complains about a use-after-free bug and it really does seem cleaner to delay freeing until we are entirely done with the memory. So, rearrange the code to move the kfree() later untill we are completely done. Trivial and harmless, but nice IMHO. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-01-29 10:52:20 +01:00
David S. Miller	4b87f92259	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: Documentation/networking/ip-sysctl.txt drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c Both conflicts were simply overlapping context. A build fix for qlcnic is in here too, simply removing the added devinit annotations which no longer exist. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-15 15:05:59 -05:00
Jiri Pirko	aaeb6cdfa5	remove init of dev->perm_addr in drivers perm_addr is initialized correctly in register_netdevice() so to init it in drivers is no longer needed. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-08 18:00:48 -08:00
Jiri Pirko	7826d43f2d	ethtool: fix drvinfo strings set in drivers Use strlcpy where possible to ensure the string is \0 terminated. Use always sizeof(string) instead of 32, ETHTOOL_BUSINFO_LEN and custom defines. Use snprintf instead of sprint. Remove unnecessary inits of ->fw_version Remove unnecessary inits of drvinfo struct. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-06 21:06:31 -08:00
Jiri Pirko	7f6e7101df	nes: remove usage of dev->master Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-04 13:31:50 -08:00
Greg Kroah-Hartman	1e6d9abea7	Drivers: infinband: remove __dev* attributes. CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, __devexit_p, __devinitdata, and __devexit from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Bill Pemberton <wfp5p@virginia.edu> Cc: Tom Tucker <tom@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Cc: Mike Marciniszyn <infinipath@intel.com> Cc: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-03 15:57:15 -08:00
Linus Torvalds	184e251661	Second batch of InfiniBand/RDMA changes for 3.8: - cxgb4 changes to fix lookup engine hash collisions - mlx4 changes to make flow steering usable - fix to IPoIB to avoid pinning dst reference for too long -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJQ1NdhAAoJEENa44ZhAt0hjb0P/i0tL4Ux+PvqG/Phh2gaZaQR evoi3bw6tCYFzWEJPfur2AJ63svKjyrSrfpvgEZjhthDdYORjIv2Dw2Je1qkOSrf 5tJtSp3r9D0y35SE/DxNnzlgua9heBqphPlOGpjcKdN83KP7XIyVG2SGyJiEeLza owefPx/48jZr8hsw7LB2DlZmNUbVWK00o+pXa/VsUQX/dlIU5hyihAzBjtkwJyT2 xtiyu9oqXGuv1JW/a16ooPGDaETDLJ1G50NndadUZYWFWj36VrAwW6hOAK3oOikf Qa4z3gJVzpSdaC1kiuxERj7GxlRpVUJY0IgHEoMTVrexOz1IsFEP8KEfGLkAYwzB jjuXh+Z2+QU5OOO3un0nINRGxKZUSD8Scoa222GBwGnWuCCq68APx2UGTkVkhWon FyjhF13WJRbElg2oXzI1cg9lJNv2pf10hXhiy2qdO6tDElVXVQk+KRiDdcNtxS0H FYYh3og/DjFwp18j+FVLA9r5AiPuVV5DjNnlwBNjTTMc9RwlOrX/6oCK2kZN24rZ l+Nr0gv+h6MAjTPBPYLUP2bsY6wYt5n566mfLea/lir9YeI+Q3PL2sDdGZ7C6xHH S4pRW95leP4pEFpHqYg8z3QKPswYqzokgbUSHxg2TVYrV01RC75axXCh0Q90q3RE oLP8GDYod2okxhAU2AyN =lDXt -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull more infiniband changes from Roland Dreier: "Second batch of InfiniBand/RDMA changes for 3.8: - cxgb4 changes to fix lookup engine hash collisions - mlx4 changes to make flow steering usable - fix to IPoIB to avoid pinning dst reference for too long" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cxgb4: Fix bug for active and passive LE hash collision path RDMA/cxgb4: Fix LE hash collision bug for passive open connection RDMA/cxgb4: Fix LE hash collision bug for active open connection mlx4_core: Allow choosing flow steering mode mlx4_core: Adjustments to Flow Steering activation logic for SR-IOV mlx4_core: Fix error flow in the flow steering wrapper mlx4_core: Add QPN enforcement for flow steering rules set by VFs cxgb4: Add LE hash collision bug fix path in LLD driver cxgb4: Add T4 filter support IPoIB: Call skb_dst_drop() once skb is enqueued for sending	2012-12-21 16:40:26 -08:00
Roland Dreier	d72623b665	Merge branches 'cxgb4', 'ipoib' and 'mlx4' into for-next	2012-12-19 23:03:43 -08:00
Vipul Pandya	793dad94e7	RDMA/cxgb4: Fix bug for active and passive LE hash collision path Retries active opens for INUSE errors. Logs any active ofld_connect_wr error replies. Sends ofld_connect_wr on same ctrlq. It needs to go on the same control txq as regular CPL active/passive messages. Retries on active open replies with EADDRINUSE. Uses active open fw wr only if active filter region is set. Adds stat for ofld_connect_wr failures. This patch also adds debugfs file to show endpoints. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 23:03:12 -08:00
Vipul Pandya	1cab775c3e	RDMA/cxgb4: Fix LE hash collision bug for passive open connection It establishes passive open connection through firmware work request. Passive open connection will go through this path as now instead of listening server we create a server filter which will redirect the incoming SYN packet to the offload queue. After this driver tries to establish the connection using firmware work request. Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 23:03:11 -08:00
Vipul Pandya	5be78ee924	RDMA/cxgb4: Fix LE hash collision bug for active open connection It enables establishing active open connection using fw_ofld_connection work request when cpl_act_open_rpl says TCAM full error which may be because of LE hash collision. Current support is only for IPv4 active open connections. Sets ntuple bits in active open requests. For T4 firmware greater than 1.4.10.0 ntuple bits are required to be set. Adds nocong and enable_ecn module parameter options. Signed-off-by: Vipul Pandya <vipul@chelsio.com> [ Move all FW return values to t4fw_api.h. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 23:02:43 -08:00
Roland Dreier	b13912bbb4	IPoIB: Call skb_dst_drop() once skb is enqueued for sending Currently, IPoIB delays collecting send completions for TX packets in order to batch work more efficiently. It does skb_orphan() right after queuing the packets so that destructors run early, to avoid problems like holding socket send buffers for too long (since we might not collect a send completion until a long time after the packet is actually sent). However, IPoIB clears IFF_XMIT_DST_RELEASE because it actually looks at skb_dst() to update the PMTU when it gets a too-long packet. This means that the packets sitting in the TX ring with uncollected send completions are holding a reference on the dst. We've seen this lead to pathological behavior with respect to route and neighbour GC. The easy fix for this is to call skb_dst_drop() when we call skb_orphan(). Also, give packets sent via connected mode (CM) the same skb_orphan() / skb_dst_drop() treatment that packets sent via datagram mode get. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 09:16:43 -08:00
Linus Torvalds	16e024f30c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc update from Benjamin Herrenschmidt: "The main highlight is probably some base POWER8 support. There's more to come such as transactional memory support but that will wait for the next one. Overall it's pretty quiet, or rather I've been pretty poor at picking things up from patchwork and reviewing them this time around and Kumar no better on the FSL side it seems..." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (73 commits) powerpc+of: Rename and fix OF reconfig notifier error inject module powerpc: mpc5200: Add a3m071 board support powerpc/512x: don't compile any platform DIU code if the DIU is not enabled powerpc/mpc52xx: use module_platform_driver macro powerpc+of: Export of_reconfig_notifier_[register,unregister] powerpc/dma/raidengine: add raidengine device powerpc/iommu/fsl: Add PAMU bypass enable register to ccsr_guts struct powerpc/mpc85xx: Change spin table to cached memory powerpc/fsl-pci: Add PCI controller ATMU PM support powerpc/86xx: fsl_pcibios_fixup_bus requires CONFIG_PCI drivers/virt: the Freescale hypervisor driver doesn't need to check MSR[GS] powerpc/85xx: p1022ds: Use NULL instead of 0 for pointers powerpc: Disable relocation on exceptions when kexecing powerpc: Enable relocation on during exceptions at boot powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function powerpc: Add wrappers to enable/disable relocation on exceptions powerpc: Add set_mode hcall powerpc: Setup relocation on exceptions for bare metal systems powerpc: Move initial mfspr LPCR out of __init_LPCR powerpc: Add relocation on exception vector handlers ...	2012-12-18 09:58:09 -08:00
Linus Torvalds	5bd665f28d	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull target updates from Nicholas Bellinger: "It has been a very busy development cycle this time around in target land, with the highlights including: - Kill struct se_subsystem_dev, in favor of direct se_device usage (hch) - Simplify reservations code by combining SPC-3 + SCSI-2 support for virtual backends only (hch) - Simplify ALUA code for virtual only backends, and remove left over abstractions (hch) - Pass sense_reason_t as return value for I/O submission path (hch) - Refactor MODE_SENSE emulation to allow for easier addition of new mode pages. (roland) - Add emulation of MODE_SELECT (roland) - Fix bug in handling of ExpStatSN wrap-around (steve) - Fix bug in TMR ABORT_TASK lookup in qla2xxx target (steve) - Add WRITE_SAME w/ UNMAP=0 support for IBLOCK backends (nab) - Convert ib_srpt to use modern target_submit_cmd caller + drop legacy ioctx->kref usage (nab) - Convert ib_srpt to use modern target_submit_tmr caller (nab) - Add link_magic for fabric allow_link destination target_items for symlinks within target_core_fabric_configfs.c code (nab) - Allocate pointers in instead of full structs for config_group->default_groups (sebastian) - Fix 32-bit highmem breakage for FILEIO (sebastian) All told, hch was able to shave off another ~1K LOC by killing the se_subsystem_dev abstraction, along with a number of PR + ALUA simplifications. Also, a nice patch by Roland is the refactoring of MODE_SENSE handling, along with the addition of initial MODE_SELECT emulation support for virtual backends. Sebastian found a long-standing issue wrt to allocation of full config_group instead of pointers for config_group->default_group[] setup in a number of areas, which ends up saving memory with big configurations. He also managed to fix another long-standing BUG wrt to broken 32-bit highmem support within the FILEIO backend driver. Thank you again to everyone who contributed this round!" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits) target/iscsi_target: Add NodeACL tags for initiator group support target/tcm_fc: fix the lockdep warning due to inconsistent lock state sbp-target: fix error path in sbp_make_tpg() sbp-target: use simple assignment in tgt_agent_rw_agent_state() iscsi-target: use kstrdup() for iscsi_param target/file: merge fd_do_readv() and fd_do_writev() target/file: Fix 32-bit highmem breakage for SGL -> iovec mapping target: Add link_magic for fabric allow_link destination target_items ib_srpt: Convert TMR path to target_submit_tmr ib_srpt: Convert I/O path to target_submit_cmd + drop legacy ioctx->kref target: Make spc_get_write_same_sectors return sector_t target/configfs: use kmalloc() instead of kzalloc() for default groups target/configfs: allocate only 6 slots for dev_cg->default_groups target/configfs: allocate pointers instead of full struct for default_groups target: update error handling for sbc_setup_write_same() iscsit: use GFP_ATOMIC under spin lock iscsi_target: Remove redundant null check before kfree target/iblock: Forward declare bio helpers target: Clean up flow in transport_check_aborted_status() target: Clean up logic in transport_put_cmd() ...	2012-12-15 14:25:10 -08:00
Roland Dreier	01e0336598	Merge branch 'nes' into for-next	2012-12-08 20:45:48 -08:00
Tatyana Nikolova	7d9c199a55	RDMA/nes: Fix for crash when registering zero length MR for CQ Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-08 00:31:03 -08:00
Tatyana Nikolova	7bfcfa51c3	RDMA/nes: Fix for terminate timer crash The terminate timer needs to be initialized just once. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-08 00:31:02 -08:00
Tatyana Nikolova	00ad255d17	RDMA/nes: Fix for BUG_ON due to adding already-pending timer To avoid nes tcp_timer crash for SMP architectures, add_timer is replaced with mod_timer. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-08 00:31:02 -08:00
Roland Dreier	fb57e1dbbd	Merge branch 'srp' into for-next	2012-11-30 17:41:04 -08:00
Bart Van Assche	dc1bdbd9b8	IB/srp: Allow SRP disconnect through sysfs Make it possible to disconnect the IB RC connection used by the SRP protocol to communicate with a target. Have the SRP transport layer create a sysfs "delete" attribute for initiator drivers that support this functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:33 -08:00
Vu Pham	55d93898a1	IB/srp: send disconnect request without waiting for CM timewait exit Now that SRP recreates the CM ID, QP, and CQ for each connection, there is no need to wait for the timewait state to complete. Signed-off-by: Vu Pham <vu@mellanox.com> Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:32 -08:00
Ishai Rabinovitz	73aa89ed9e	IB/srp: destroy and recreate QP and CQs when reconnecting HW QP FATAL errors persist over a reset operation, but we can recover from that by recreating the QP and associated CQs for each connection. Creating a new QP/CQ also completely forecloses any possibility of getting stale completions or packets on the new connection. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> [ updated to current code from OFED, cleaned up commit message ] Signed-off-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:32 -08:00
Bart Van Assche	ef6c49d87c	IB/srp: Eliminate state SRP_TARGET_DEAD Only queue removal work after having changed the target state into SRP_TARGET_REMOVED and not if that state was already equal to SRP_TARGET_REMOVED. That allows us to remove the state SRP_TARGET_DEAD. Add a call to srp_disconnect_target() in srp_remove_target() -- due to previous changes it is now safe to invoke that function even if the IB connection has already been disconnected. This change allows us to replace the target removal code in srp_remove_one() by an (indirect) call to srp_remove_target(). Rename srp_target_port.work into srp_target_port.remove_work to reflect its usage. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:31 -08:00
Bart Van Assche	ee12d6a80c	IB/srp: Introduce the helper function srp_remove_target() Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:31 -08:00
Bart Van Assche	294c875a65	IB/srp: Suppress superfluous error messages Keep track of the connection state. Only report QP errors while connected. Only invoke ib_send_cm_dreq() when connected so that invoking srp_disconnect_target() after having received a DREQ does not cause an error message to be printed. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:31 -08:00
Bart Van Assche	4f0af69799	IB/srp: Process all error completions If the RDMA RC connection is closed, tell the SCSI mid-layer to terminate all pending commands instead of only the first. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:31 -08:00
Bart Van Assche	948d1e889e	IB/srp: Introduce srp_handle_qp_err() Introduce the function srp_handle_qp_err(), change the type of qp_in_error from int into bool and move the initialization of that variable from srp_reconnect_target() to srp_connect_target(). Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:30 -08:00
Bart Van Assche	224db15743	IB/srp: Simplify SCSI error handling Since scsi_remove_host() has been modified so that SCSI error handling functions will no longer be invoked after scsi_remove_host() returns, the test at the start of srp_send_tsk_mgmt() is now superfluous. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:30 -08:00
Bart Van Assche	f371823120	IB/srp: Keep processing commands during host removal Some SCSI upper layer drivers, e.g. sd, issue SCSI commands from inside scsi_remove_host() (see the sd_shutdown() call in sd_remove()). Make sure that these commands have a chance to reach the SCSI device. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:30 -08:00
Bart Van Assche	09be70a238	IB/srp: Eliminate state SRP_TARGET_CONNECTING Block the SCSI host while reconnecting instead of representing the reconnection activity as a distinct SRP target state. This allows us to eliminate the target state SRP_TARGET_CONNECTING. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:29 -08:00
Bart Van Assche	c9b03c1ae5	IB/srp: Increase block layer timeout Increase the block layer timeout for disks so that it is above the InfiniBand transport layer timeout. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-30 17:40:29 -08:00
Roland Dreier	78c90247af	Merge branches 'cma' and 'mlx4' into for-next	2012-11-29 12:16:43 -08:00
shefty	63f05be2c0	RDMA/cm: Change return value from find_gid_port() Problem reported by Dan Carpenter <dan.carpenter@oracle.com>: The patch 3c86aa70bf67: "RDMA/cm: Add RDMA CM support for IBoE devices" from Oct 13, 2010, leads to the following warning: net/sunrpc/xprtrdma/svc_rdma_transport.c:722 svc_rdma_create() error: passing non neg 1 to ERR_PTR This bug would result in a NULL dereference. svc_rdma_create() is supposed to return ERR_PTRs or valid pointers, but instead it returns ERR_PTRs, valid pointers and 1. The call tree is: svc_rdma_create() => rdma_bind_addr() => cma_acquire_dev() => find_gid_port() rdma_bind_addr() should return a valid errno. Fix this by having find_gid_port() also return a valid errno. If we can't find the specified GID on a given port, return -EADDRNOTAVAIL, rather than -EAGAIN, to better indicate the error. We also drop using the special return value of '1' and instead pass through the error returned by the underlying verbs call. On such errors, rather than aborting the search, we simply continue to check the next device/port. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-29 12:16:29 -08:00
Jack Morgenstein	ceb7decb36	IB/mlx4: Fix spinlock order to avoid lockdep warnings lockdep warns about taking a hard-irq-unsafe lock (sriov->id_map_lock) inside a hard-irq-safe lock (sriov->going_down_lock). Since id_map_lock is never taken in the interrupt context, we can simply reverse the order of taking the two spinlocks, thus avoiding the warning and the depencency. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-29 12:14:45 -08:00
Nicholas Bellinger	3e4f574857	ib_srpt: Convert TMR path to target_submit_tmr This patch converts the TMR path in srpt_handle_tsk_mgmt() to use target_submit_tmr() with TARGET_SCF_ACK_KREF flag usage. v2: Drop ununused res in target_submit_tmr (Fengguang.Wu) Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Roland Dreier <roland@kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2012-11-28 11:20:28 -08:00
Nicholas Bellinger	9474b04313	ib_srpt: Convert I/O path to target_submit_cmd + drop legacy ioctx->kref This patch converts the main srpt_handle_cmd() I/O path to use modern target_submit_cmd() with TARGET_SCF_ACK_KREF flag usage. This includes dropping the original internal ioctx->kref + srpt_put_send_ioctx() usage in favor of target_put_sess_cmd() w/ se_cmd_t->cmd_kref within ib_srpt response callbacks. It also updates srpt_abort_cmd() to call target_put_sess_cmd() for completion of aborted commands, and adds target_wait_for_sess_cmds() into srpt_release_channel_work() to allow outstanding I/O to complete during session shutdown. Also, go ahead and update srpt_handle_tsk_mgmt() to make the remaining transport_init_se_cmd() to setup the ioctx->cmd with se_tmr_req. Cc: Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Roland Dreier <roland@kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2012-11-28 01:10:59 -08:00
Roland Dreier	d0951d2133	Merge branches 'cxgb4', 'misc', 'mlx4', 'nes' and 'uapi' into for-next	2012-11-26 11:08:41 -08:00
Julia Lawall	5107c2a3d1	RDMA/cxgb3: use WARN Use WARN rather than printk followed by WARN_ON(1), for conciseness. A simplified version of the semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression list es; @@ -printk( +WARN(1, es); -WARN_ON(1); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-26 11:08:16 -08:00
Julia Lawall	76f267b7ad	RDMA/cxgb4: use WARN Use WARN rather than printk followed by WARN_ON(1), for conciseness. A simplified version of the semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression list es; @@ -printk( +WARN(1, es); -WARN_ON(1); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-26 11:07:18 -08:00
Or Gerlitz	08ff32352d	mlx4: 64-byte CQE/EQE support ConnectX-3 devices can use either 64- or 32-byte completion queue entries (CQEs) and event queue entries (EQEs). Using 64-byte EQEs/CQEs performs better because each entry is aligned to a complete cacheline. This patch queries the HCA's capabilities, and if it supports 64-byte CQEs and EQES the driver will configure the HW to work in 64-byte mode. The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ. Since this mode is global, userspace (libmlx4) must be updated to work with the configured CQE size, and guests using SR-IOV virtual functions need to know both EQE and CQE size. In case one of the 64-byte CQE/EQE capabilities is activated, the patch makes sure that older guest drivers that use the QUERY_DEV_FUNC command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that they need an update to be able to work with the PPF. This is done by changing the returned pf_context_behaviour not to be zero any more. In case none of these capabilities is activated that value remains zero and older guest drivers can run OK. The SRIOV related flow is as follows 1. the PPF does the detection of the new capabilities using QUERY_DEV_CAP command. 2. the PPF activates the new capabilities using INIT_HCA. 3. the VF detects if the PPF activated the capabilities using QUERY_HCA, and if this is the case activates them for itself too. Note that the VF detects that it must be aware to the new PF behaviour using QUERY_FUNC_CAP. Steps 1 and 2 apply also for native mode. User space notification is done through a new field introduced in struct mlx4_ib_ucontext which holds device capabilities for which user space must take action. This changes the binary interface so the ABI towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only when needed i.e. only when the driver does use 64-byte CQEs or future device capabilities which must be in sync by user space. This practice allows to work with unmodified libmlx4 on older devices (e.g A0, B0) which don't support 64-byte CQEs. In order to keep existing systems functional when they update to a newer kernel that contains these changes in VF and userspace ABI, a module parameter enable_64b_cqe_eqe must be set to enable 64-byte mode; the default is currently false. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-26 10:19:17 -08:00
Benjamin Herrenschmidt	2a859ab07b	Merge branch 'merge' into next Merge my own merge branch to get various fixes from there and upstream, especially the hvc console tty refcouting fixes which which testing is quite a bit harder...	2012-11-26 09:23:57 +11:00
Alan Cox	c9795bd708	RDMA/amsol1100: Fix missing break Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:52:55 -08:00
Alan Cox	5390f86796	IB/ipath: Remove unreachable code Signed-off-by: Alan Cox <alan@linux.intel.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:50:51 -08:00
Julia Lawall	079abea6a3	RDMA/nes: Use WARN() Use WARN() rather than printk() followed by WARN_ON(1), for conciseness. A simplified version of the semantic patch that makes this transformation is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression list es; @@ -printk( +WARN(1, es); -WARN_ON(1); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> [ Remove extra KERN_ERR from WARN() format. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:49:15 -08:00
Tatyana Nikolova	cecdcd5f24	RDMA/nes: Fix for incorrect multicast address in the perfect filter table Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:49:14 -08:00
Tatyana Nikolova	fc8d7547b1	RDMA/nes: Fix for sending fpdus in order to hardware Locking fix to prevent race conditions. Fpdus (per qp) need to be forwarded to hardware in the order of their sequence numbers. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Donald Wood <Donald.E.Wood@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:49:14 -08:00
Tatyana Nikolova	bff3976bef	RDMA/nes: Fix for unlinking skbs from empty list Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:49:13 -08:00
Tatyana Nikolova	3127e4ea54	RDMA/nes: Fix incorrect address of IP header Fix for incorrect ip header address when forwarding fpdus to hardware. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-22 00:49:13 -08:00
Ian Munsie	cca55d9ddf	powerpc: Move get_longbusy_msecs into hvcall.h and remove duplicate function I am going to use this in the next patch, better to have this code in one place rather than three. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-11-15 15:08:07 +11:00
Christoph Hellwig	de103c93af	target: pass sense_reason as a return value Pass the sense reason as an explicit return value from the I/O submission path instead of storing it in struct se_cmd and using negative return values. This cleans up a lot of the code pathes, and with the sparse annotations for the new sense_reason_t type allows for much better error checking. (nab: Convert spc_emulate_modesense + spc_emulate_modeselect to use sense_reason_t with Roland's MODE SELECT changes) Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2012-11-06 20:55:46 -08:00
Roland Dreier	1e3474d1de	Merge branches 'cxgb4' and 'mlx4' into for-next	2012-10-23 09:03:49 -07:00
Thadeu Lima de Souza Cascardo	32c631f9f2	RDMA/cxgb4: Don't free chunk that we have failed to allocate In the error path of registering memory when there's a failure to allocate a chunk from the memory pool, we try to free the same chunk we just failed to allocate, which will BUG(). Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-22 11:05:00 -07:00
Eli Cohen	bef83ed92c	IB/mlx4: Synchronize cleanup of MCGs in MCG paravirtualization A client re-register event invokes cleanup of all MCGs. This is required to protect against misbehaved guests leading to corruption of join/leave database. However, since cleaning up the MCGs is a heavy operation, it is pushed to a work queue for further processing. Client re-register is also propagated to ULPs (e.g IPoIB). However, since the cleanup is performed in a workqueue, the ULP could leave and re-join groups before the cleanup occurs. In this case, when the cleanup takes place, it prunes the (newly-joined) MCGs and the ULP is left without actual MCGs while believing it joined them. Fix this by setting the flushing flag before invoking the cleanup task and clearing it after flushing is complete. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-18 10:29:02 -07:00
Jack Morgenstein	2c75d2ccb6	IB/mlx4: Fix QP1 P_Key processing in the Primary Physical Function (PPF) In the MAD paravirtualization code, one of the checks performed when forwarding QP1 (GSI) packets from wire to slave was a P_Key check: the P_Key received in the MAD must be present in the guest's paravirtualized P_Key table, and at least one of the (packet P_Key, guest P_Key) must be a full-membership P_Key. However, if everyone involved has only limited membership in the default P_Key, then packets sent by full-member remote hosts arrive at the PPF but are not passed on to the VFs with the current P_Key1 check. Fix this as follows: 1. Don't care if P_Key received over wire is full or not. If it successfully passed HW checks on the real QP1, then simply pass it to guest regardless of whether the guest has full or limited membership in its P_Key table. 2. If the guest (including paravirtualized master) has both full and limited P_Key forms in its table, preferentially pass the paravirtualized P_Key index of the full P_Key form in the tunnel header. 3. In the multicast join flow (mlx4/mcg.c), use the index for the default P_Key (wherever it is located) in replies generated from within the mcg module (previously, P_Key index 0 was used in all cases). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-18 10:29:02 -07:00
Doug Ledford	8a095030f7	IB/mlx4: Fix build error on platforms where UL is not 64 bits Line 110 uses UL as a compiler cast for the 0x constant, but it's not large enough to hold a 64-bit value on a 32-bit arch. Signed-off-by: Doug Ledford <dledford@redhat.com> [ Use "-1" instead of "FFFFFFFFFFFFFFFFULL". - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-18 10:29:01 -07:00
Linus Torvalds	a188e7e93a	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull scsi target updates from Nicholas Bellinger: "Things have been calm for the most part with no new fabric drivers in flight for v3.7 (we're up to eight now !), so this update is primarily focused on addressing a few long-standing items within target-core and iscsi-target fabric code. The highlights include: - target: Simplify fabric sense data length handling (roland) - qla2xxx: Fix endianness of task management response code (roland) - target: fix truncation of mode data, support zero allocation length (paolo) - target: Properly support zero-length commands in normal processing path (paolo) - iscsi-target: Correctly set 0xffffffff field within ISCSI_OP_REJECT PDU (ronnie + nab) - iscsi-target: Add explicit set of cache_dynamic_acls=1 for TPG demo-mode (ronnie + nab) - target/file: Re-enable optional fd_buffered_io=1 operation (nab + hch) - iscsi-target: Add MaxXmitDataSegmenthLength forr target -> initiator MDRSL declaration (nab) - target: Add target_submit_cmd_map_sgls for SGL fabric memory passthrough (nab + hch) - tcm_loop: Convert I/O path to use target_submit_cmd_map_sgls (hch + nab) - tcm_vhost: Convert I/O path to use target_submit_cmd_map_sgls (nab + hch) The last series for adding a new target_submit_cmd_map_sgls() fabric caller (as requested by hch) that accepts pre-allocated SGL memory (using existing logic), along with converting tcm_loop + tcm_vhost has only been in -next for the last days, but has gotten enough review +testing and is clear enough a mechanical change that I think it's reasonable to merge for -rc1 code. Thanks again to everyone who contributed this round! Extra special thanks to Roland (PureStorage) for tracking down the qla2xxx target TMR response code endian issue, and to Paolo (Redhat) for resolving the long standing zero-length CDB issues within target-core between virtual and pSCSI backends." * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (44 commits) iscsi-target: Bump defaults for nopin_timeout + nopin_response_timeout values iscsit: proper endianess conversions iscsit: use the itt_t abstract type iscsit: add missing endianess conversion in iscsit_check_inaddr_any iscsit: remove incorrect unlock in iscsit_build_sendtargets_resp iscsit: mark various functions static target/iscsi: precedence bug in iscsit_set_dataout_sequence_values() target/usb-gadget: strlen() doesn't count the terminator target/usb-gadget: remove duplicate initialization tcm_vhost: Convert I/O path to use target_submit_cmd_map_sgls target: Add control CDB READ payload zero work-around tcm_loop: Convert I/O path to use target_submit_cmd_map_sgls target: Add target_submit_cmd_map_sgls for SGL fabric memory passthrough iscsi-target: Add explicit set of cache_dynamic_acls=1 for TPG demo-mode iscsi-target: Change iscsi_target_seq_pdu_list.c to honor MaxXmitDataSegmentLength iscsi-target: Add MaxXmitDataSegmentLength connection recovery check iscsi-target: Convert incoming PDU payload checks to MaxXmitDataSegmentLength iscsi-target: Enable MaxXmitDataSegmentLength operation in login path iscsi-target: Add base MaxXmitDataSegmentLength code target/file: Re-enable optional fd_buffered_io=1 operation ...	2012-10-10 19:52:19 +09:00
David S. Miller	8dd9117cc7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux Pulled mainline in order to get the UAPI infrastructure already merged before I pull in David Howells's UAPI trees for networking. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-09 13:14:32 -04:00
Konstantin Khlebnikov	314e51b985	mm: kill vma flag VM_RESERVED and mm->reserved_vm counter A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA, currently it lost original meaning but still has some effects: \| effect \| alternative flags -+------------------------+--------------------------------------------- 1\| account as reserved_vm \| VM_IO 2\| skip in core dump \| VM_IO, VM_DONTDUMP 3\| do not merge or expand \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP 4\| do not mlock \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP This patch removes reserved_vm counter from mm_struct. Seems like nobody cares about it, it does not exported into userspace directly, it only reduces total_vm showed in proc. Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND \| VM_DONTDUMP. remap_pfn_range() and io_remap_pfn_range() set VM_IO\|VM_DONTEXPAND\|VM_DONTDUMP. remap_vmalloc_range() set VM_DONTEXPAND \| VM_DONTDUMP. [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:19 +09:00
Linus Torvalds	ca4da6948b	Second batch of changes for the 3.7 merge window: - Late-breaking fix for IPoIB on mlx4 SR-IOV VFs. - Fix for IPoIB build breakage with CONFIG_INFINIBAND_IPOIB_CM=n (new netlink config changes are to blame). - Make sure retry count values are in range in RDMA CM. - A few nes hardware driver fixes and cleanups. - Have iSER initiator use >1 interrupt vectors if available. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQbkIkAAoJEENa44ZhAt0hrLEP/3jNpwR+6/rbVxcPkmVVITsn yUz9S3V7loJqQqlHt7ktphgcoUpar+xs2C8DbCyVupyee4K3zd7Lw+wOVO2OJ16Y +gE8PZB6mSYOVaI0JW26Qqe2gcQ5DLZ4ic6E8gEiuyjNo3ig3MZl17ZffTI+5lXi igR+Qnsy2bm18JI6aumSeKYa17f7EmOK63jaIp0jc95vrtWke3jRRs6+bmmDLV7q 4ZxCP7ckBTRFVkisXlWZ95MmUbg/lhWosZGdJwjRYs+LW3tzgY18OQZgbw6FJKpJ 0OouAO51UMKcizfg7ikwUIw6/apaCFXXv/pH3rIeq18qiJeKcDsIXseY7sFRboHR fIPpfhHENokcxtlOReaw7hnlyMuwrLiYbF6TjPN9t8dWTNbwe+4W0K9G60fu1wmk qjaUTBCCuDF4KtKqCxP01cZzXyjJrworw5p+Nc1wxds+JeyKRCpVRrfYep8vOgft z6KJjQGDF16MvXmhF1dD9VBpmaMCOA4NDhfa0IcgHPTOLc/7Nbe5NTD95InlzM3Q /C0CPj+4J4kzEMnTBHVJvv1NeD3G1RQD80bce6ViTbgOUg9pmeGHkLWe420h7xT4 u/LBnwM1dcqY7BDOnn3ycYoJgr5OCQZOpLcNWjTXv4t51mBOYtM6AJ6IBIvkOmSz QjEyiDEEvT2K7c/rhYOz =9Jv4 -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband changes from Roland Dreier: "Second batch of changes for the 3.7 merge window: - Late-breaking fix for IPoIB on mlx4 SR-IOV VFs. - Fix for IPoIB build breakage with CONFIG_INFINIBAND_IPOIB_CM=n (new netlink config changes are to blame). - Make sure retry count values are in range in RDMA CM. - A few nes hardware driver fixes and cleanups. - Have iSER initiator use >1 interrupt vectors if available." * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cma: Check that retry count values are in range IB/iser: Add more RX CQs to scale out processing of SCSI responses RDMA/nes: Bump the version number of nes driver RDMA/nes: Remove unused module parameter "send_first" RDMA/nes: Remove unnecessary if-else statement RDMA/nes: Add missing break to switch. mlx4_core: Adjust flow steering attach wrapper so that IB works on SR-IOV VFs IPoIB: Fix build with CONFIG_INFINIBAND_IPOIB_CM=n	2012-10-07 17:19:49 +09:00
Gao feng	809d5fc9bf	infiniband: pass rdma_cm module to netlink_dump_start set netlink_dump_control.module to avoid panic. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-07 00:30:56 -04:00
Linus Torvalds	5f3d2f2e1a	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "Some highlights in addition to the usual batch of fixes: - 64TB address space support for 64-bit processes by Aneesh Kumar - Gavin Shan did a major cleanup & re-organization of our EEH support code (IBM fancy PCI error handling & recovery infrastructure) which paves the way for supporting different platform backends, along with some rework of the PCIe code for the PowerNV platform in order to remove home made resource allocations and instead use the generic code (which is possible after some small improvements to it done by Gavin). - Uprobes support by Ananth N Mavinakayanahalli - A pile of embedded updates from Freescale folks, including new SoC and board supports, more KVM stuff including preparing for 64-bit BookE KVM support, ePAPR 1.1 updates, etc..." Fixup trivial conflicts in drivers/scsi/ipr.c * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (146 commits) powerpc/iommu: Fix multiple issues with IOMMU pools code powerpc: Fix VMX fix for memcpy case driver/mtd:IFC NAND:Initialise internal SRAM before any write powerpc/fsl-pci: use 'Header Type' to identify PCIE mode powerpc/eeh: Don't release eeh_mutex in eeh_phb_pe_get powerpc: Remove tlb batching hack for nighthawk powerpc: Set paca->data_offset = 0 for boot cpu powerpc/perf: Sample only if SIAR-Valid bit is set in P7+ powerpc/fsl-pci: fix warning when CONFIG_SWIOTLB is disabled powerpc/mpc85xx: Update interrupt handling for IFC controller powerpc/85xx: Enable USB support in p1023rds_defconfig powerpc/smp: Do not disable IPI interrupts during suspend powerpc/eeh: Fix crash on converting OF node to edev powerpc/eeh: Lock module while handling EEH event powerpc/kprobe: Don't emulate store when kprobe stwu r1 powerpc/kprobe: Complete kprobe and migrate exception frame powerpc/kprobe: Introduce a new thread flag powerpc: Remove unused __get_user64() and __put_user64() powerpc/eeh: Global mutex to protect PE tree powerpc/eeh: Remove EEH PE for normal PCI hotplug ...	2012-10-06 03:16:12 +09:00
Fengguang Wu	125c4c706b	idr: rename MAX_LEVEL to MAX_IDR_LEVEL To avoid name conflicts: drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined While at it, also make the other names more consistent and add parentheses. [akpm@linux-foundation.org: repair fallout] [sfr@canb.auug.org.au: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change] Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: walter harms <wharms@bfs.de> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:04:56 +09:00
Roland Dreier	56040f07b2	Merge branches 'cma', 'ipoib', 'iser', 'mlx4' and 'nes' into for-next	2012-10-04 19:12:22 -07:00
Sean Hefty	4ede178a5e	RDMA/cma: Check that retry count values are in range The retry_count and rnr_retry_count connection parameters are both 3-bit values. Check that the values are in range and reduce if they're not. This fixes a problem reported by Doug Ledford <dledford@redhat.com> that resulted in the userspace rping test (part of the librdmacm samples) failing to run over Intel IB HCAs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Use min_t() to avoid warnings about type mismatch. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-04 19:11:54 -07:00
Alex Tabachnik	5a33a66949	IB/iser: Add more RX CQs to scale out processing of SCSI responses RX/TX CQs will now be selected from a per HCA pool. For the RX flow this has the effect of using different interrupt vectors when using low level drivers (such as mlx4) that map the "vector" param provided by the ULP on CQ creation to a dedicated IRQ/MSI-X vector. This allows the RX flow processing of IO responses to be distributed across multiple CPUs. QPs (--> iSER sessions) are assigned to CQs in round robin order using the CQ with the minimum number of sessions attached to it. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Alex Tabachnik <alext@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 21:26:49 -07:00
Tatyana Nikolova	cf9fd75cbc	RDMA/nes: Bump the version number of nes driver Bumpthe version of the nes driver to reflect recent fixes. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 14:27:59 -07:00
Tatyana Nikolova	a378e3a3fb	RDMA/nes: Remove unused module parameter "send_first" Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 14:27:59 -07:00
Tatyana Nikolova	a67b078cf6	RDMA/nes: Remove unnecessary if-else statement Remove unnecessary if-else statement -- we do the same thing either way. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 14:27:58 -07:00
Tatyana Nikolova	d5fb476a10	RDMA/nes: Add missing break to switch. Static code analyzer cppcheck points out a missing break. Reported-by: David Binderman <dcb314@hotmail.com> Addresses: <https://bugzilla.kernel.org/show_bug.cgi?id=47671> Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 14:27:58 -07:00
Roland Dreier	71d9c5f9e6	IPoIB: Fix build with CONFIG_INFINIBAND_IPOIB_CM=n With the new netlink support in commit `862096a8bb` ("IB/ipoib: Add more rtnl_link_ops callbacks") we need ipoib_set_mode() to be available even if connected mode isn't built. Move the function from ipoib_cm.c to ipoib_main.c (and make a few CM-related macros available unconditonally). This fixes the build error drivers/built-in.o: In function 'ipoib_changelink': ipoib_netlink.c:(.text+0x6a5fc9): undefined reference to 'ipoib_set_mode' ipoib_netlink.c:(.text+0x6a5fe3): undefined reference to 'ipoib_set_mode' when CONFIG_INFINIBAND_IPOIB_CM isn't set. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Reported-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-02 21:33:41 -07:00
Linus Torvalds	aab174f0df	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...	2012-10-02 20:25:04 -07:00
Linus Torvalds	7a9a2970b5	First batch of InfiniBand/RDMA changes for the 3.7 merge window: - mlx4 IB support for SR-IOV - A couple of SRP initiator fixes - Batch of nes hardware driver fixes - Fix for long-standing use-after-free crash in IPoIB - Other miscellaneous fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQav4jAAoJEENa44ZhAt0hmL0QAJTuMdSOzYFd/NB38owJCNM2 kz/N1GlBm3z98fIlGo8u+lzgV2qxqZSAzsJsouMeK38KiAX3CL8HKe44A1QvTM6v dXTNL4JFX24/YF+nlmMY8Av518I9Mkte3BZCnpYkBjVFBWe0ePwoRC/btfBXPDIV 0snq4OtjoBAn00dOOyuZ5PoyY9xf0z4UB0Gple9sM4mzEb8wVWdNDDPOiuPJc6fA L+gk6HLkZDg54+QswafdKYwpeTq45wIKLmCdS3oUNmppMLVhZY8rECOwzSa+KiTr /Yo19n+zl+IBlvjQHhmUqGHvdD17PaGlr+TckAsQqmVfXUH5qqpEnkF8FoEK59c5 YA3lVU8Sj4BPhJ7qX54CuN3767mZizakkxCr9iPRzABFTgzWVgcSgCrE8jjx4i0h Pam+L5bmANFStgmGR8PmXiNgcrCUcEqYHsOWDDAnHa5ekb2nyv1JL1c18hlY9hC3 Xb1YTMZFwvofGza89hBu7oHrMbLOUc5kW2lBpvUn2nlyf3i0F8ISlVbVbNjFA54p 60/jHa2VOQ2CcJUJKnJOk4ajOOEfHnPtMn2q96XJ69Dp8+eSYEO/G+0i1OlChq4h ClnG0Yp+NkT1o8WXMd7guDR+RsXt+DXIij5TiUWRIqnIlopIsMTRhNH28tMu4jQL fgN5n987wru91ewdX4gW =PAcy -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "First batch of InfiniBand/RDMA changes for the 3.7 merge window: - mlx4 IB support for SR-IOV - A couple of SRP initiator fixes - Batch of nes hardware driver fixes - Fix for long-standing use-after-free crash in IPoIB - Other miscellaneous fixes" This merge also removes a new use of __cancel_delayed_work(), and replaces it with the regular cancel_delayed_work() that is now irq-safe thanks to the workqueue updates. That said, I suspect the sequence in question should probably use "mod_delayed_work()". I just did the minimal "don't use deprecated functions" fixup, though. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits) IB/qib: Fix local access validation for user MRs mlx4_core: Disable SENSE_PORT for multifunction devices mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs mlx4_core: Stash PCI ID driver_data in mlx4_priv structure IB/srp: Avoid having aborted requests hang IB/srp: Fix use-after-free in srp_reset_req() IB/qib: Add a qib driver version RDMA/nes: Fix compilation error when nes_debug is enabled RDMA/nes: Print hardware resource type RDMA/nes: Fix for crash when TX checksum offload is off RDMA/nes: Cosmetic changes RDMA/nes: Fix for incorrect MSS when TSO is on RDMA/nes: Fix incorrect resolving of the loopback MAC address mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem mlx4_core: Trivial cleanups to driver log messages mlx4_core: Trivial readability fix: "0X30" -> "0x30" IB/mlx4: Create paravirt contexts for VFs when master IB driver initializes mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations mlx4: Paravirtualize Node Guids for slaves mlx4: Activate SR-IOV mode for IB ...	2012-10-02 17:20:40 -07:00
Linus Torvalds	aecdc33e11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...	2012-10-02 13:38:27 -07:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	033d9959ed	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...	2012-10-02 09:54:49 -07:00
Roland Dreier	d172f5a4ab	Merge branches 'cma', 'cxgb4', 'ipoib', 'mlx4', 'mlx4-sriov', 'nes', 'qib' and 'srp' into for-linus	2012-10-02 07:43:59 -07:00
Or Gerlitz	862096a8bb	IB/ipoib: Add more rtnl_link_ops callbacks Add the rtnl_link_ops changelink and fill_info callbacks, through which the admin can now set/get the driver mode, etc policies. Maintain the proprietary sysfs entries only for legacy childs. For child devices, set dev->iflink to point to the parent device ifindex, such that user space tools can now correctly show the uplink relation as done for vlan, macvlan, etc devices. Pointed out by Patrick McHardy <kaber@trash.net> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:12:22 -04:00
Linus Torvalds	fdb2f9c2eb	PCI changes for the 3.7 merge window: Host bridge hotplug - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi) - Clear host bridge resource info to avoid issue when releasing (Yinghai Lu) - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu) - Use standard list ops for acpi_pci_drivers (Jiang Liu) Device hotplug - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang Liu) - Remove fakephp driver (Bjorn Helgaas) - Fix VGA ref count in hotplug remove path (Yinghai Lu) - Allow acpiphp to handle PCIe ports without native hotplug (Jiang Liu) - Implement resume regardless of pciehp_force param (Oliver Neukum) - Make pci_fixup_irqs() work after init (Thierry Reding) Miscellaneous - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang) - Factor out PCI Express Capability accessors (Jiang Liu) - Add pcibios_window_alignment() so powerpc EEH can use generic resource assignment (Gavin Shan) - Make pci_error_handlers const (Stephen Hemminger) - Cleanup drivers/pci/remove.c (Bjorn Helgaas) - Improve Vendor-Specific Extended Capability support (Bjorn Helgaas) - Use standard list ops for bus->devices (Bjorn Helgaas) - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang) - Reassign invalid bus number ranges (Intel DP43BF workaround) (Yinghai Lu) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJQac4hAAoJEPGMOI97Hn6zjZYP/iaqU9kjmgTsBbSyzB4oApv/ RRxo3I+ad9GF6XlMQfVAtyx1pgCD1gdGAtoDgGSCTqgdYD3CO10AxKU+yleAk1wo dbMxLifJNTrT3G1mZ/NL16yEGhCwvhfwzRtB1VoZmCT4lSApO/7cJkXl2DzHfA/i pmltOOiQCN8kbUcJbVPtUyTVPi2zl/8bsyCyTkS7YG0VXeGRM+ZUvPWZJ7MnWYYB 5qoCdrw5ENCCiDQ9yw5SAfgL23b+0p6OI/x3Lkex0QQOWwSqGSiaHt4b7eitrC5b 2eAJg32f/AzZke1YbKLMfdsL0VJP3GAswhDVHlgmo63rZkOZChm+97dgZ35Mcv5v kEXkWyBb1xJ3t8rZir6Qer9Iv2wOB+MkZ5qtU/Vf+l0wLQLYTrRVsKngrEDREONk dXbokp6iVSPeA1sTSdH9MmHlTUIj82ZLSGcxcjTsN8NWZjxx6g3rNx1uay+5MYOW 4ET9zNu5snrAqN6N4Tb81gvtG8qYfxzdvVfrA9AaGKI6xxB7jkqgFJRp55JiEcFc x4cmWkhvdlhVsG2TQwFxYNfswOqD+7NCs6V4kSVZX6ezpDrH7I5VvcnnhstF7C8l KZul0EV7OW+kDK23pNe24lVP2xtOv6G8eK/3PmeKIXWl1V83nqre/oLufRzTfs+Z SxkILwY/MFpuCFteKE1t =haBu -----END PGP SIGNATURE----- Merge tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Host bridge hotplug - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi) - Clear host bridge resource info to avoid issue when releasing (Yinghai Lu) - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu) - Use standard list ops for acpi_pci_drivers (Jiang Liu) Device hotplug - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang Liu) - Remove fakephp driver (Bjorn Helgaas) - Fix VGA ref count in hotplug remove path (Yinghai Lu) - Allow acpiphp to handle PCIe ports without native hotplug (Jiang Liu) - Implement resume regardless of pciehp_force param (Oliver Neukum) - Make pci_fixup_irqs() work after init (Thierry Reding) Miscellaneous - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang) - Factor out PCI Express Capability accessors (Jiang Liu) - Add pcibios_window_alignment() so powerpc EEH can use generic resource assignment (Gavin Shan) - Make pci_error_handlers const (Stephen Hemminger) - Cleanup drivers/pci/remove.c (Bjorn Helgaas) - Improve Vendor-Specific Extended Capability support (Bjorn Helgaas) - Use standard list ops for bus->devices (Bjorn Helgaas) - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang) - Reassign invalid bus number ranges (Intel DP43BF workaround) (Yinghai Lu)" * tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (102 commits) PCI: acpiphp: Handle PCIe ports without native hotplug capability PCI/ACPI: Use acpi_driver_data() rather than searching acpi_pci_roots PCI/ACPI: Protect acpi_pci_roots list with mutex PCI/ACPI: Use acpi_pci_root info rather than looking it up again PCI/ACPI: Pass acpi_pci_root to acpi_pci_drivers' add/remove interface PCI/ACPI: Protect acpi_pci_drivers list with mutex PCI/ACPI: Notify acpi_pci_drivers when hot-plugging PCI root bridges PCI/ACPI: Use normal list for struct acpi_pci_driver PCI/ACPI: Use DEVICE_ACPI_HANDLE rather than searching acpi_pci_roots PCI: Fix default vga ref_count ia64/PCI: Clear host bridge aperture struct resource x86/PCI: Clear host bridge aperture struct resource PCI: Stop all children first, before removing all children Revert "PCI: Use hotplug-safe pci_get_domain_bus_and_slot()" PCI: Provide a default pcibios_update_irq() PCI: Discard __init annotations for pci_fixup_irqs() and related functions PCI: Use correct type when freeing bus resource list PCI: Check P2P bridge for invalid secondary/subordinate range PCI: Convert "new_id"/"remove_id" into generic pci_bus driver attributes xen-pcifront: Use hotplug-safe pci_get_domain_bus_and_slot() ...	2012-10-01 12:05:36 -07:00
Mike Marciniszyn	c00aaa1a02	IB/qib: Fix local access validation for user MRs Commit `8aac4cc3a9` ("IB/qib: RCU locking for MR validation") introduced a bug that broke user post sends. The proper validation of the MR was lost in the patch. This patch corrects that validation. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-01 09:59:26 -07:00
Bart Van Assche	d853667091	IB/srp: Avoid having aborted requests hang We need to call scsi_done() for commands after we abort them. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:36:48 -07:00
Bart Van Assche	9b796d06d5	IB/srp: Fix use-after-free in srp_reset_req() srp_free_req() uses the scsi_cmnd structure contents to unmap buffers, so we must invoke srp_free_req() before we release ownership of that structure. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Dillow <dillowda@ornl.gov> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:36:47 -07:00
Dean Luick	e20d583818	IB/qib: Add a qib driver version Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:36:29 -07:00
Tatyana Nikolova	bca1935ccd	RDMA/nes: Fix compilation error when nes_debug is enabled Removing old variables caused a compile error from nes_debug(). Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:55 -07:00
Tatyana Nikolova	818216442b	RDMA/nes: Print hardware resource type Hardware resource types are added and when a resource isn't available, its type is printed. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:55 -07:00
Tatyana Nikolova	fc4ba7291b	RDMA/nes: Fix for crash when TX checksum offload is off When TX checksum offload is disabled for an iWarp connection, skb->ip_summed needs to be set to CHECKSUM_NONE. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:54 -07:00
Tatyana Nikolova	48a9956362	RDMA/nes: Cosmetic changes - Remove unnecessary statement "if (1)" - Refactor a statement (wqe_misc \|= NES_NIC_SQ_WQE_COMPLETION) out of if/else statement, because it is independant of the flow. - Define netdev->features in one line for clarity. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:53 -07:00
Tatyana Nikolova	6ad1be814b	RDMA/nes: Fix for incorrect MSS when TSO is on In TSO handling code, skb_shared_info() is used to get the MSS instead of the bool function skb_is_gso() (which always returns 1). Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:52 -07:00
Tatyana Nikolova	ef3d0c4a5e	RDMA/nes: Fix incorrect resolving of the loopback MAC address Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:34:52 -07:00
Jack Morgenstein	3806d08cf6	IB/mlx4: Create paravirt contexts for VFs when master IB driver initializes When we have VFs and PFs on same host, the VFs are activated within the mlx4_core module before the mlx4_ib kernel module is loaded. When the mlx4_ib module initializes the PF (master), it now creates MAD paravirtualization contexts for any VFs that already active. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:44 -07:00
Jack Morgenstein	47605df953	mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations Previously, the structure of a guest's proxy QPs followed the structure of the PPF special qps (qp0 port 1, qp0 port 2, qp1 port 1, qp1 port 2, ...). The guest then did offset calculations on the sqp_base qp number that the PPF passed to it in QUERY_FUNC_CAP(). This is now changed so that the guest does no offset calculations regarding proxy or tunnel QPs to use. This change frees the PPF from needing to adhere to a specific order in allocating proxy and tunnel QPs. Now QUERY_FUNC_CAP provides each port individually with its proxy qp0, proxy qp1, tunnel qp0, and tunnel qp1 QP numbers, and these are used directly where required (with no offset calculations). To accomplish this change, several fields were added to the phys_caps structure for use by the PPF and by non-SR-IOV mode: base_sqpn -- in non-sriov mode, this was formerly sqp_start. base_proxy_sqpn -- the first physical proxy qp number -- used by PPF base_tunnel_sqpn -- the first physical tunnel qp number -- used by PPF. The current code in the PPF still adheres to the previous layout of sqps, proxy-sqps and tunnel-sqps. However, the PPF can change this layout without affecting VF or (paravirtualized) PF code. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:43 -07:00
Jack Morgenstein	afa8fd1db9	mlx4: Paravirtualize Node Guids for slaves This is necessary in order to support > 1 VF/PF in a VM for software that uses the node guid as a discriminator, such as librdmacm. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:43 -07:00
Jack Morgenstein	026149cbaa	mlx4: Activate SR-IOV mode for IB Remove the error returns for IB ports from mlx4_ib_add, mlx4_INIT_PORT_wrapper, and mlx4_CLOSE_PORT_wrapper. Currently, SRIOV is supported only for devices for which the link layer is IB on all ports; RoCE support will be added later. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:42 -07:00
Jack Morgenstein	992e8e6e87	IB/mlx4: Miscellaneous adjustments for SR-IOV IB support 1. Allow only master to change node description. 2. Prevent AH leakage in send mads. 3. Take device part number from PCI structure, so that guests see the VF part number (and not the PF part number). 4. Place the device revision ID into caps structure at startup. 5. SET_PORT in update_gids_task needs to go through wrapper on master. 6. In mlx4_ib_event(), PORT_MGMT_EVENT needs be handled in a work queue on the master, since it propagates events to slaves using GEN_EQE. 7. Do not support FMR on slaves. 8. Add spinlock to slave_event(), since it is called both in interrupt context and in process context (due to 6 above, and also if smp_snoop is used). This fix was found and implemented by Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:41 -07:00
Jack Morgenstein	c1e7e46612	IB/mlx4: Add iov directory in sysfs under the ib device This directory is added only for the master -- slaves do not have it. The sysfs iov directory is used to manage and examine the port P_Key and guid paravirtualization. Under iov/ports, the administrator may examine the gid and P_Key tables as they are present in the device (and as are seen in the "network view" presented to the SM). Under the iov/<pci slot number> directories, the admin may map the index numbers in the physical tables (as under iov/ports) to the paravirtualized index numbers that guests see. For example, if the administrator, for port 1 on guest 2 maps physical pkey index 10 to virtual index 1, then that guest, whenever it uses its pkey index 1, will actually be using the real pkey index 10. Based on patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:39 -07:00
Jack Morgenstein	2a4fae148c	IB/mlx4: Propagate P_Key and guid change port management events to slaves P_Key change and guid change events are not of interest to all slaves, but only to those slaves which "see" the table slots whose contents have change. For example, if the guid at port 1, index 5 has changed in the PPF, we wish to propagate the gid-change event only to the function which has that guid index mapped to its port/guid table (in this case it is slave #5). Other functions should not get the event, since the event does not affect them. Similarly with P_Keys -- P_Key change events are forwarded only to slaves which have that P_Key index mapped to their virtual P_Key table. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:38 -07:00
Jack Morgenstein	a0c64a17ab	mlx4: Add alias_guid mechanism For IB ports, we paravirtualize the GUID at index 0 on slaves. The GUID at index 0 seen by a slave is the actual GUID occupying the GUID table at the slave-id index. The driver, by default, requests at startup time that subnet manager populate its entire guid table with GUIDs. These guids are then mapped (paravirtualized) to the slaves, and appear for each slave as its GUID at index 0. Until each slave has such a guid, its port status is DOWN. The guid table is cached to support special QP paravirtualization, and event propagation to slaves on guid change (we test to see if the guid really changed before propagating an event to the slave). To support this caching, add capability to __mlx4_ib_query_gid() to obtain the network view (i.e., physical view) gid at index X, not just the host (paravirtualized) view. Based on a patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:37 -07:00
Amir Vadai	3cf69cc8db	IB/mlx4: Add CM paravirtualization In CM para-virtualization: 1. Incoming requests are steered to the correct vHCA according to the embedded GID. 2. Communication IDs on outgoing requests are replaced by a globally unique ID, generated by the PPF, since there is no synchronization of ID generation between guests (and so these IDs are not guaranteed to be globally unique). The guest's comm ID is stored, and is returned to the response MAD when it arrives. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:36 -07:00
Oren Duer	b9c5d6a643	IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV MCG paravirtualization support includes: - Creating multicast groups by VFs, and keeping accounting of them - Leaving multicast groups by VFs - Updating SM only with real changes in the overall picture of MCGs status - Creation of MGID=0 groups (let SM choose MGID) Note that the MCG module maintains its own internal MCG object reference counts. The reason for this is that the IB core is used to track only the multicast groups joins generated by the PF it runs over. The PF IB core layer is unaware of slaves, so it cannot be used to keep track of MCG joins they generate. Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:35 -07:00
Jack Morgenstein	0a9a01884d	mlx4: MAD_IFC paravirtualization The MAD_IFC firmware command fulfills two functions. First, it is used in the QP0/QP1 MAD-handling flow to obtain information from the FW (for answering queries), and for setting variables in the HCA (MAD SET packets). For this, MAD_IFC should provide the FW (physical) view of the data. This is the view that OpenSM needs. We call this the "network view". In the second case, MAD_IFC is used by various verbs to obtain data regarding the local HCA (e.g., ib_query_device()). We call this the "host view". This data needs to be paravirtualized. MAD_IFC therefore needs a wrapper function, and also needs another flag indicating whether it should provide the network view (when it is called by ib_process_mad in special-qp packet handling), or the host view (when it is called while implementing a verb). There are currently 2 flag parameters in mlx4_MAD_IFC already: ignore_bkey and ignore_mkey. These two parameters are replaced by a single "mad_ifc_flags" parameter, with different bits set for each flag. A third flag is added: "network-view/host-view". Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:34 -07:00
Jack Morgenstein	37bfc7c1e8	IB/mlx4: SR-IOV multiplex and demultiplex MADs Special QPs are paravirtualized. vHCAs are not given direct access to QP0/1. Rather, these QPs are operated by a special context hosted by the PF, which mediates access to/from vHCAs. This is done by opening a "tunnel" per vHCA port per QP0/1. A tunnel comprises a pair of UD QPs: a "Tunnel QP" in the PF-context and a "Proxy QP" in the vHCA. All vHCA MAD traffic must pass through the corresponding tunnel. vHCA QPs cannot be assigned to VL15 and are denied of the well-known QKey. Outgoing messages are "de-multiplexed" (i.e., directed to the wire via the real special QP). Incoming messages are "multiplexed" (i.e. steered by the PPF to the correct VF or to the PF) QP0 access is restricted to the PF vHCA. VF vHCAs also have (virtual) QP0s, but they never receive any SMPs and all SMPs sent are discarded. QP1 traffic is allowed for all vHCAs, but special care is required to bridge the gap between the host and network views. Specifically: - Transaction IDs are mapped to guarantee uniqueness among vHCAs - CM para-virtualization o Incoming requests are steered to the correct vHCA according to the embedded GID o Local communication IDs are mapped to ensure uniqueness among vHCAs (see the patch that adds CM paravirtualization.) - Multicast para-virtualization o The PF context aggregates membership state from all vHCAs o The SA is contacted only when the aggregate membership changes o If the aggregate does not change, the PF context will provide the requesting vHCA with the proper response. (see the patch that adds multicast group paravirtualization) Incoming MADs are steered according to: - the DGID If a GRH is present - the mapped transaction ID for response MADs - the embedded GID in CM requests - the remote communication ID in other CM messages Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:34 -07:00
Jack Morgenstein	54679e1482	mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop This requires: 1. Replacing the paravirtualized P_Key index (inserted by the guest) with the real P_Key index. 2. For UD QPs, placing the guest's true source GID index in the address path structure mgid field, and setting the ud_force_mgid bit so that the mgid is taken from the QP context and not from the WQE when posting sends. 3. For UC and RC QPs, placing the guest's true source GID index in the address path structure mgid field. 4. For tunnel and proxy QPs, setting the Q_Key value reserved for that proxy/tunnel pair. Since not all the above adjustments occur in all the QP transitions, the QP transitions require separate wrapper functions. Secondly, initialize the P_Key virtualization table to its default values: Master virtualized table is 1-1 with the real P_Key table, guest virtualized table has P_Key index 0 mapped to the real P_Key index 0, and all the other P_Key indices mapped to the reserved (invalid) P_Key at index 127. Finally, add logic in smp_snoop for maintaining the phys_P_Key_cache. and generating events on the master only if a P_Key actually changed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:33 -07:00
Jack Morgenstein	fc06573dfa	IB/mlx4: Initialize SR-IOV IB support for slaves in master context Allocate SR-IOV paravirtualization resources and MAD demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time in mlx4_ib_init_sriov(). The second part is to actually create the tunneling resources required on the master to support a slave. This is performed the master detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event generated when a slave initializes its comm channel). For the master, there is no such startup event, so it creates its own tunneling resources when it starts up. In addition, the master also creates the real special QPs. The ib_core layer on the master causes creation of proxy special QPs, since the master is also paravirtualized at the ib_core layer. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:32 -07:00
Jack Morgenstein	1ffeb2eb8b	IB/mlx4: SR-IOV IB context objects and proxy/tunnel SQP support 1. Introduce the basic SR-IOV parvirtualization context objects for multiplexing and demultiplexing MADs. 2. Introduce support for the new proxy and tunnel QP types. This patch introduces the objects required by the master for managing QP paravirtualization for guests. struct mlx4_ib_sriov is created by the master only. It is a container for the following: 1. All the info required by the PPF to multiplex and de-multiplex MADs (including those from the PF). (struct mlx4_ib_demux_ctx demux) 2. All the info required to manage alias GUIDs (i.e., the GUID at index 0 that each guest perceives. In fact, this is not the GUID which is actually at index 0, but is, in fact, the GUID which is at index[<VF number>] in the physical table. 3. structures which are used to manage CM paravirtualization 4. structures for managing the real special QPs when running in SR-IOV mode. The real SQPs are controlled by the PPF in this case. All SQPs created and controlled by the ib core layer are proxy SQP. struct mlx4_ib_demux_ctx contains the information per port needed to manage paravirtualization: 1. All multicast paravirt info 2. All tunnel-qp paravirt info for the port. 3. GUID-table and GUID-prefix for the port 4. work queues. struct mlx4_ib_demux_pv_ctx contains all the info for managing the paravirtualized QPs for one slave/port. struct mlx4_ib_demux_pv_qp contains the info need to run an individual QP (either tunnel qp or real SQP). Note: We made use of the 2 most significant bits in enum mlx4_ib_qp_flags (based on enum ib_qp_create_flags in ib_verbs.h). We need these bits in the low-level driver for internal purposes. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:30 -07:00
Jack Morgenstein	73aaa7418f	IB/core: Add ib_find_exact_cached_pkey() When P_Key tables potentially contain both full and partial membership copies for the same P_Key, we need a function to find the index for an exact (16-bit) P_Key. This is necessary when the master forwards QP1 MADs sent by guests. If the guest has sent the MAD with a limited membership P_Key, we need to to forward the MAD using the same limited membership P_Key. Since the master may have both the limited and the full member P_Keys in its table, we must make sure to retrieve the limited membership P_Key in this case. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:30 -07:00
Jack Morgenstein	ff7166c447	IB/core: Handle table with full and partial membership for the same P_Key Extend the cached and non-cached P_Key table lookups to handle limited and full membership of the same P_Key to co-exist in the P_Key table. This is necessary for SR-IOV, to allow for some guests would to have the full membership P_Key in their virtual P_Key table, while other guests on the same physical HCA would have the limited one. To support this, we need both the limited and full membership P_Keys to be present in the master's (hypervisor physical port) P_Key table. The algorithm for handling P_Key tables which contain both the limited and the full membership versions of the same P_Key works as follows: When scanning the P_Key table for a 15-bit P_Key: A. If there is a full member version of that P_Key anywhere in the table, return its index (even if a limited-member version of the P_Key exists earlier in the table). B. If the full member version is not in the table, but the limited-member version is in the table, return the index of the limited P_Key. Signed-off-by: Liran Liss <liranl@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:29 -07:00
Dotan Barak	46db567deb	IB/mlx4: Fill in sq_sig_type in query QP Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:08 -07:00
Patrick McHardy	bea1e22df4	IPoIB: Fix use-after-free of multicast object Fix a crash in ipoib_mcast_join_task(). (with help from Or Gerlitz) Commit `c8c2afe360` ("IPoIB: Use rtnl lock/unlock when changing device flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which is run from the ipoib_workqueue, and hence the workqueue can't be flushed from the context of ipoib_stop(). In the current code, ipoib_stop() (which doesn't flush the workqueue) calls ipoib_mcast_dev_flush(), which goes and deletes all the multicast entries. This takes place without any synchronization with a possible running instance of ipoib_mcast_join_task() for the same ipoib device, leading to a crash due to NULL pointer dereference. Fix this by making sure that the workqueue is flushed before ipoib_mcast_dev_flush() is called. To make that possible, we move the RTNL-lock wrapped code to ipoib_mcast_join_finish(). Signed-off-by: Patrick McHardy <kaber@trash.net> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:32:33 -07:00
Emil Goode	c079c28714	RDMA/cxgb4: Fix error handling in create_qp() The variable ret is assigned return values in a couple of places, but its value is never returned. This patch makes use of the ret variable so that the caller get correct error codes returned. The following changes are also introduced: - The alloc_oc_sq function can return -ENOSYS or -ENOMEM so we want to get the return value from it. - Change the label names to improve readability. Signed-off-by: Emil Goode <emilgoode@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:32:14 -07:00
Dotan Barak	2a22fb8c69	RDMA/cma: Use consistent component mask for IPoIB port space multicast joins CMA multicast joins for the IPoIB port space need to use the same component mask used by the ipoib driver. Otherwise, it's possible for the CMA to create a group to which a join made by ipoib will fail, or vise-versa. Some of the component mask fields set by ipoib weren't set by the CMA, fix that. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:31:47 -07:00
Dotan Barak	d8c9166961	IB/core: Remove unused variables in ucm/ucma Remove unused wait objects from ucm/ucma events flow. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:31:46 -07:00
David S. Miller	6a06e5e1bb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/team/team.c drivers/net/usb/qmi_wwan.c net/batman-adv/bat_iv_ogm.c net/ipv4/fib_frontend.c net/ipv4/route.c net/l2tp/l2tp_netlink.c The team, fib_frontend, route, and l2tp_netlink conflicts were simply overlapping changes. qmi_wwan and bat_iv_ogm were of the "use HEAD" variety. With help from Antonio Quartulli. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-28 14:40:49 -04:00
Al Viro	2903ff019b	switch simple cases of fget_light to fdget Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 22:20:08 -04:00
Al Viro	88b428d6e1	switch infinibarf users of fget() to fget_light() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-09-26 21:10:10 -04:00
Paul E. McKenney	593d1006cd	Merge remote-tracking branch 'tip/core/rcu' into next.2012.09.25b Resolved conflict in kernel/sched/core.c using Peter Zijlstra's approach from https://lkml.org/lkml/2012/9/5/585.	2012-09-25 10:03:56 -07:00
Paul E. McKenney	5217192b85	Merge remote-tracking branch 'tip/smp/hotplug' into next.2012.09.25b The conflicts between kernel/rcutree.h and kernel/rcutree_plugin.h were due to adjacent insertions and deletions, which were resolved by simply accepting the changes on both branches.	2012-09-25 10:01:45 -07:00
Eric W. Biederman	d03ca5820d	userns: Convert ipathfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GID Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:19 -07:00
Or Gerlitz	9baa0b0364	IB/ipoib: Add rtnl_link_ops support Add rtnl_link_ops to IPoIB, with the first usage being child device create/delete through them. Childs devices are now either legacy ones, created/deleted through the ipoib sysfs entries, or RTNL ones. Adding support for RTNL childs involved refactoring of ipoib_vlan_add which is now used by both the sysfs and the link_ops code. Also, added ndo_uninit entry to support calling unregister_netdevice_queue from the rtnl dellink entry. This required removal of calls to ipoib_dev_cleanup from the driver in flows which use unregister_netdevice, since the networking core will invoke ipoib_uninit which does exactly that. Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-20 16:49:17 -04:00
Roland Dreier	9c58b7ddd7	target: Simplify fabric sense data length handling Every fabric driver has to supply a se_tfo->set_fabric_sense_len() method, just so iSCSI can return an offset of 2. However, every fabric driver is already allocating a sense buffer and passing it into the target core, either via transport_init_se_cmd() or target_submit_cmd(). So instead of having iSCSI pass the start of its sense buffer into the core and then later tell the core to skip the first 2 bytes, it seems easier for iSCSI just to do the offset of 2 when it passes the sense buffer into the core. Then we can drop the se_tfo->set_fabric_sense_len() everywhere, and just add a couple of lines of code to iSCSI to set the sense data length to the beginning of the buffer right before it sends it over the network. (nab: Remove .set_fabric_sense_len usage from tcm_qla2xxx_npiv_ops + change transport_get_sense_buffer to follow v3.6-rc6 code w/o ->set_fabric_sense_len usage) Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2012-09-17 17:12:58 -07:00
Roland Dreier	2ed772b7b9	target: Remove unused target_core_fabric_ops.get_fabric_sense_len method There are no callers of se_tfo->get_fabric_sense_len(), so we should stop having every fabric driver implement it. Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2012-09-17 16:15:47 -07:00
Benjamin Herrenschmidt	eda485f06d	Merge remote-tracking branch 'pci/pci/gavin-window-alignment' into next Merge Gavin patches from the PCI tree as subsequent powerpc patches are going to depend on them Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-09-17 16:07:43 +10:00
Roland Dreier	2116fe4e8b	Merge branches 'cxgb4', 'ipoib', 'mlx4', 'ocrdma' and 'qib' into for-next	2012-09-14 10:42:52 -07:00
Mike Marciniszyn	4c3550057b	IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum Commit `3236b2d469` ("IB/qib: MADs with misset M_Keys should return failure") introduced a return code assignment that unfortunately introduced an unconditional exit for the routine due to missing braces. This patch adds the braces to correct the original patch. Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-14 10:42:32 -07:00
Parav Pandit	ae3bca90e9	RDMA/ocrdma: Fix CQE expansion of unsignaled WQE Fix CQE expansion of unsignaled WQE -- don't expand the CQE when the WQE index of the completed CQE matches with last pending WQE (tail) in the queue. Signed-off-by: Parav Pandit <parav.pandit@emulex.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-14 10:40:58 -07:00
Bjorn Helgaas	9a5d5bd848	Merge commit 'v3.6-rc5' into pci/gavin-window-alignment * commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec\|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c	2012-09-13 15:54:57 -06:00
Bjorn Helgaas	78890b5989	Merge commit 'v3.6-rc5' into next * commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec\|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c	2012-09-13 08:41:01 -06:00
Bjorn Helgaas	1959ec5f82	Merge branch 'pci/stephen-const' into next * pci/stephen-const: make drivers with pci error handlers const scsi: make pci error handlers const netdev: make pci_error_handlers const PCI: Make pci_error_handlers const	2012-09-12 13:54:10 -06:00
Shlomo Pongratz	b5120a6e11	IPoIB: Fix AB-BA deadlock when deleting neighbours Lockdep points out a circular locking dependency betwwen the ipoib device priv spinlock (priv->lock) and the neighbour table rwlock (ntbl->rwlock). In the normal path, ie neigbour garbage collection task, the neigh table rwlock is taken first and then if the neighbour needs to be deleted, priv->lock is taken. However in some error paths, such as in ipoib_cm_handle_tx_wc(), priv->lock is taken first and then ipoib_neigh_free routine is called which in turn takes the neighbour table ntbl->rwlock. The solution is to get rid the neigh table rwlock completely and use only priv->lock. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-12 09:21:45 -07:00
Shlomo Pongratz	66172c0993	IPoIB: Fix memory leak in the neigh table deletion flow If the neighbours hash table is empty when unloading the module, then ipoib_flush_neighs(), the cleanup routine, isn't called and the memory used for the hash table itself leaked. To fix this, ipoib_flush_neighs() is allways called, and another completion object is added to signal when the table is freed. Once invoked, ipoib_flush_neighs() flushes all the neighbours (if there are any), calls the the hash table RCU free routine, which now signals completion of the deletion process, and waits for the last neighbour to be freed. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-12 09:05:03 -07:00
Pablo Neira Ayuso	9f00d9776b	netlink: hide struct module parameter in netlink_kernel_create This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-08 18:46:30 -04:00
Wei Yongjun	92dd6c3d4d	RDMA/cxgb4: Move dereference below NULL test spatch with a semantic match is used to found this. (http://coccinelle.lip6.fr/) Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-07 16:19:03 -07:00
Stephen Hemminger	1d3520357d	make drivers with pci error handlers const Covers the rest of the uses of pci error handler. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-09-07 16:35:00 -06:00
Benjamin Herrenschmidt	fff34b3412	Merge branch 'merge' into next Brings in various bug fixes from 3.6-rcX	2012-09-07 09:48:59 +10:00
Vipul Pandya	e5619c120d	RDMA/cxgb4: Update RDMA/cxgb4 due to macro definition removal in cxgb4 driver cxgb4 driver has duplicate definitions of registers which will be removed. This patch updates the RDMA/cxgb4 driver accordingly. Signed-off-by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Reviewed-by: Sivakumar Subramani <sivasu@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-05 17:41:40 -04:00
Michael Ellerman	b4b8d1e48e	IB/ehca: Remove uses of virt_to_abs() and abs_to_virt() abs_to_virt() simply calls __va() and we'd like to get rid of it, so replace all abs_to_virt() uses with __va(). __va() returns void *, so when assigning to a pointer there's no need to cast. Similarly virt_to_abs() just calls __pa(). Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-09-05 15:18:45 +10:00
Michael Ellerman	aa97c32c1e	IB/ehca: Don't use phys_to_abs(), it's a nop Since commit `f5339277` phys_to_abs() is 100% a nop, so just remove it. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-09-05 15:18:43 +10:00
Jiang Liu	0921caf326	IB/qib: Use PCI Express Capability accessors Use PCI Express Capability access functions to simplify qib driver. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com>	2012-08-23 10:11:15 -06:00
Jiang Liu	3c55569b08	IB/mthca: Use PCI Express Capability accessors Use PCI Express Capability access functions to simplify mthca driver. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Roland Dreier <roland@purestorage.com>	2012-08-23 10:11:15 -06:00
Tejun Heo	136b5721d7	workqueue: deprecate __cancel_delayed_work() Now that cancel_delayed_work() can be safely called from IRQ handlers, there's no reason to use __cancel_delayed_work(). Use cancel_delayed_work() instead of __cancel_delayed_work() and mark the latter deprecated. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Roland Dreier <roland@kernel.org> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>	2012-08-21 13:18:24 -07:00
Tejun Heo	e7c2f96744	workqueue: use mod_delayed_work() instead of __cancel + queue Now that mod_delayed_work() is safe to call from IRQ handlers, __cancel_delayed_work() followed by queue_delayed_work() can be replaced with mod_delayed_work(). Most conversions are straight-forward except for the following. * net/core/link_watch.c: linkwatch_schedule_work() was doing a quite elaborate dancing around its delayed_work. Collapse it such that linkwatch_work is queued for immediate execution if LW_URGENT and existing timer is kept otherwise. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>	2012-08-21 13:18:24 -07:00
Paul E. McKenney	bff4a39479	infiniband: ehca: Fix compiler warnings Fix comp_task() to return void to match smp_hotplug_thread's thread_fn member, and adjust indirection on the ->cpu_comp_threads per-CPU member of the ehca_comp_pool structure. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>	2012-08-20 08:11:04 -07:00
Paul E. McKenney	c0e12e51b0	infiniband: ehca: Fix while->do-while conversion typo This commit just adds a needed semicolon. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>	2012-08-20 08:00:23 -07:00

1 2 3 4 5 ...

3337 Commits