Commit Graph

120 Commits

Author SHA1 Message Date
Steve Wise f8b0dfd152 RDMA/cxgb3: Support peer-2-peer connection setup
Open MPI, Intel MPI and other applications don't respect the iWARP
requirement that the client (active) side of the connection send the
first RDMA message.  This class of application connection setup is
called peer-to-peer.  Typically once the connection is set up, _both_
sides want to send data.

This patch enables supporting peer-to-peer over the chelsio RNIC by
enforcing this iWARP requirement in the driver itself as part of RDMA
connection setup.

Connection setup is extended, when the peer2peer module option is 1,
such that the MPA initiator will send a 0B Read (the RTR) just after
connection setup.  The MPA responder will suspend SQ processing until
the RTR message is received and replied to.

In the longer term, this will be handled in a standardized way by
enhancing the MPA negotiation so peers can indicate whether they
want/need the RTR and what type of RTR (0B read, 0B write, or 0B send)
should be sent.  This will be done by standardizing a few bits of the
private data in order to negotiate all this.  However this patch
enables peer-to-peer applications now and allows most of the required
firmware and driver changes to be done and tested now.

Design:

 - Add a module option, peer2peer, to enable this mode.

 - New firmware support for peer-to-peer mode:

	- a new bit in the rdma_init WR to tell it to do peer-2-peer
	  and what form of RTR message to send or expect.

	- process _all_ preposted recvs before moving the connection
	  into rdma mode.

	- passive side: defer completing the rdma_init WR until all
	  pre-posted recvs are processed.  Suspend SQ processing until
	  the RTR is received.

	- active side: expect and process the 0B read WR on offload TX
	  queue. Defer completing the rdma_init WR until all
	  pre-posted recvs are processed.  Suspend SQ processing until
	  the 0B read WR is processed from the offload TX queue.

 - If peer2peer is set, driver posts 0B read request on offload TX
   queue just after posting the rdma_init WR to the offload TX queue.

 - Add CQ poll logic to ignore unsolicited read responses.
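
The responder-side behaviour in the design above (hold back SQ work until the RTR arrives) can be sketched as a small userspace model. This is only an illustration under stated assumptions: the struct, enum, and function names are hypothetical and are not taken from the cxgb3 driver or its firmware interface.

```c
#include <stdbool.h>

/* Hypothetical model of the MPA responder gate; all names are illustrative. */
enum p2p_event {
    EV_POST_SEND,     /* application posts a send on the SQ */
    EV_RTR_RECEIVED   /* the 0B read (the RTR) arrives from the initiator */
};

struct p2p_qp {
    bool peer2peer;   /* mirrors the peer2peer module option */
    bool rtr_seen;    /* true once the RTR has been received */
    unsigned pending; /* sends queued while the SQ is suspended */
    unsigned sent;    /* sends that were actually processed */
};

static void p2p_qp_init(struct p2p_qp *qp, bool peer2peer)
{
    qp->peer2peer = peer2peer;
    qp->rtr_seen = false;
    qp->pending = 0;
    qp->sent = 0;
}

/* The SQ is suspended only in peer2peer mode and only until the RTR shows up. */
static bool p2p_sq_suspended(const struct p2p_qp *qp)
{
    return qp->peer2peer && !qp->rtr_seen;
}

static void p2p_handle(struct p2p_qp *qp, enum p2p_event ev)
{
    switch (ev) {
    case EV_POST_SEND:
        if (p2p_sq_suspended(qp))
            qp->pending++;   /* defer until the RTR is received */
        else
            qp->sent++;
        break;
    case EV_RTR_RECEIVED:
        qp->rtr_seen = true;
        qp->sent += qp->pending;  /* resume SQ processing, flush deferred work */
        qp->pending = 0;
        break;
    }
}
```

With peer2peer disabled the gate is never closed, so sends are processed immediately, which corresponds to the behaviour without the module option.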

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-04-29 13:46:52 -07:00
Matthew Wilcox 5f090dcb4d net: Remove unnecessary inclusions of asm/semaphore.h
None of these files use any of the functionality promised by
asm/semaphore.h.  It's possible that they rely on it dragging in some
unrelated header file, but I can't build all these files, so we'll have
to fix any build failures as they come up.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-04-18 22:15:50 -04:00
Dan Noe d96a51f6b8 cxgb3: Fix __must_check warning with dev_dbg.
Fix the warning:
drivers/net/cxgb3/cxgb3_main.c: In function ‘offload_open’:
drivers/net/cxgb3/cxgb3_main.c:936: warning: ignoring return value of
 ‘sysfs_create_group’, declared with attribute warn_unused_result

Now the return value is checked; if sysfs_create_group() returns failure,
a warning is printed using dev_dbg, and the code continues as before.  Use
of dev_dbg ensures printk is not needlessly included unless desired for
debugging.

Signed-off-by: Dan Noe <dpn@isomerica.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-04-17 15:31:32 -04:00
David S. Miller 8e8e43843b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/usb/rndis_host.c
	drivers/net/wireless/b43/dma.c
	net/ipv6/ndisc.c
2008-03-27 18:48:56 -07:00
Al Viro fa3a6cb4a6 annotate cxgb3 (ab)uses of skb->priority/skb->csum
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-26 00:18:46 -04:00
Roland Dreier b1186dee3e cxgb3: Fix lockdep problems with sge.reg_lock
Using iWARP with a Chelsio T3 NIC generates the following lockdep warning:

    =================================
    [ INFO: inconsistent lock state ]
    2.6.25-rc6 #50
    ---------------------------------
    inconsistent {softirq-on-W} -> {in-softirq-W} usage.
    swapper/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
     (&adap->sge.reg_lock){-+..}, at: [<ffffffff880e5ee2>] cxgb_offload_ctl+0x3af/0x507 [cxgb3]

The problem is that reg_lock is used with plain spin_lock() in
drivers/net/cxgb3/sge.c but is used with spin_lock_irqsave() in
drivers/net/cxgb3/cxgb3_offload.c.  This is technically a false
positive, since the uses in sge.c are only in the initialization and
cleanup paths and cannot overlap with any use in interrupt context.

The best fix is probably just to use spin_lock_irq() with reg_lock in
sge.c.  Even though it's not strictly required for correctness, it
avoids triggering lockdep and the extra overhead of disabling
interrupts is not important at all in the initialization and cleanup
slow paths.
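
The inconsistency that the warning reports can be illustrated with a toy usage tracker. This is a deliberately simplified sketch, not lockdep's actual algorithm; the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Toy record of how one lock class has been used so far. */
struct lock_usage {
    bool held_with_softirq_on; /* taken outside softirq with irqs left enabled */
    bool taken_in_softirq;     /* taken while running in softirq context */
};

/*
 * Record one acquisition. Returns false once the history is inconsistent,
 * i.e. the lock has been taken in softirq context and has also been held
 * somewhere with softirqs still able to run.
 */
static bool record_acquire(struct lock_usage *u, bool in_softirq,
                           bool irqs_disabled)
{
    if (in_softirq)
        u->taken_in_softirq = true;
    else if (!irqs_disabled)
        u->held_with_softirq_on = true;

    return !(u->taken_in_softirq && u->held_with_softirq_on);
}
```

In this model a plain spin_lock() in process context corresponds to `irqs_disabled == false`, while spin_lock_irq() corresponds to `irqs_disabled == true`, so switching the sge.c call sites keeps the recorded history consistent.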

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-25 23:42:05 -04:00
David S. Miller 577f99c1d0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	drivers/net/wireless/rt2x00/rt2x00dev.c
	net/8021q/vlan_dev.c
2008-03-18 00:37:55 -07:00
Divy Le Ray cd7e903440 cxgb3: Fix transmit queue stop mechanism
The last change in the Tx queue stop mechanism opens a window
where the Tx queue might be stopped after pending credits
are returned.

Tx credits are returned via a control message generated by the HW.
It returns tx credits on demand, triggered by a completion bit
set in selective transmit packet headers.

The current code can lead to the Tx queue being stopped
with all pending credits returned, and the current frame
not triggering a credit return. The Tx queue will then never be
awakened.

The driver could alternatively request a completion for packets
that stop the queue. It's however safer at this point to go back
to the pre-existing behaviour.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-03-17 08:07:01 -04:00
YOSHIFUJI Hideaki 8082c37cdc [NET] NEIGHBOUR: Remove unpopular neigh_is_connected().
neigh_is_connected() is not popular at all, and its only user,
drivers/net/cxgb3/l2t.c:t3_l2t_update(), also has a raw (expanded) expression.
Let's expand it and remove the inline function.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-04 15:18:23 +09:00
Steve Wise 4eb61e0231 cxgb3: Handle ARP completions that mark neighbors stale.
When ARP completes due to a request rather than a reply the neighbor is
marked NUD_STALE instead of reachable (see arp_process()).  The handler
for the resulting netevent needs to check also for NUD_STALE.

Failure to use the arp entry can cause RDMA connection failures.
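
The extra state check can be sketched as a bitmask test. The NUD_* values below are redefined locally so the example stands alone; they are believed to match the kernel's neighbour state flags but should be treated as assumptions, and the helper name is hypothetical rather than the driver's actual function.

```c
#include <stdbool.h>

/* Local copies of neighbour state flags (assumed values, see note above). */
#define NUD_REACHABLE 0x02
#define NUD_STALE     0x04
#define NUD_FAILED    0x20
#define NUD_NOARP     0x40
#define NUD_PERMANENT 0x80
#define NUD_CONNECTED (NUD_PERMANENT | NUD_NOARP | NUD_REACHABLE)

/*
 * Hypothetical helper: decide whether a neighbour entry carries a usable
 * link-layer address. Accepting NUD_STALE covers the case where ARP was
 * completed by a request rather than a reply.
 */
static bool neigh_usable(unsigned char nud_state)
{
    return (nud_state & (NUD_CONNECTED | NUD_STALE)) != 0;
}
```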

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-11 11:09:17 -05:00
Krishna Kumar a8cc21f646 Optimize cxgb3 xmit path (a bit)
1. Add common code for stopping queue.
2. No need to call netif_stop_queue followed by netif_wake_queue (and
   in fact a netif_start_queue could have been used instead); instead
   call stop_queue if required, and remove code under USE_GTS macro.
3. There is no need to check for netif_queue_stopped, as the network
   core guarantees that for us (I am sure every driver could remove
   that check, e.g. e1000 - I have tested that path a few billion times
   with about a few hundred thousand qstops but the condition never
   hit even once).

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-11 10:44:28 -05:00
Christoph Lameter 9e2779fa28 is_vmalloc_addr(): Check if an address is within the vmalloc boundaries
Checking if an address is a vmalloc address is done in a couple of places.
Define a common version in mm.h and replace the other checks.

Again the include structures suck.  The definition of VMALLOC_START and
VMALLOC_END is not available in vmalloc.h since highmem.c cannot be included
there.
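
A common helper of this kind amounts to a half-open range check. The sketch below is a userspace stand-in: the DEMO_ bounds are made-up placeholders for VMALLOC_START and VMALLOC_END, and the function name is prefixed to make clear it is not the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bounds for illustration only; real values are per-architecture. */
#define DEMO_VMALLOC_START ((uintptr_t)0x1000)
#define DEMO_VMALLOC_END   ((uintptr_t)0x2000)

/* True if x lies in [DEMO_VMALLOC_START, DEMO_VMALLOC_END). */
static inline bool demo_is_vmalloc_addr(const void *x)
{
    uintptr_t addr = (uintptr_t)x;

    return addr >= DEMO_VMALLOC_START && addr < DEMO_VMALLOC_END;
}
```

Centralising the comparison in one inline function means callers no longer repeat the two-sided check, which is the point of the change described above.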

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05 09:44:14 -08:00
Roland Dreier 7b9b09436b cxgb3: Remove incorrect __devinit annotations
When PCI error recovery was added to cxgb3, a function t3_io_slot_reset()
was added.  This function can call back into t3_prep_adapter() at any
time, so t3_prep_adapter() can no longer be marked __devinit.
This patch removes the __devinit annotation from t3_prep_adapter() and
all the functions that it calls, which fixes

    WARNING: drivers/net/cxgb3/built-in.o(.text+0x2427): Section mismatch in reference from the function t3_io_slot_reset() to the function .devinit.text:t3_prep_adapter()

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-03 04:28:35 -08:00
Al Viro 05e5c11653 annotate cxgb3
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:10:30 -08:00
Patrick McHardy 9dfebcc647 [VLAN]: Turn VLAN_DEV_INFO into inline function
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28 15:08:32 -08:00
Divy Le Ray bc4b6b5269 cxgb3 - Fix EEH, missing softirq blocking
set_pci_drvdata() stores a pointer to the adapter,
not the net device.
Add missing softirq blocking in t3_mgmt_tx.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:07:22 -08:00
Divy Le Ray b881955b7d cxgb3 - parity initialization for T3C adapters.
Add parity initialization for T3C adapters.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:07:22 -08:00
Jeff Garzik 2eab17ab88 drivers/net/cxgb3: trim trailing whitespace
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-01-28 15:04:13 -08:00
Divy Le Ray afefce66a5 cxgb3 - Fix I/O synchronization
Synchronize memory access before ringing
the Tx door bell.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:12 -08:00
Divy Le Ray a2604be548 cxgb3 - HW set up updates
Disable PEX errors. The HW generates false positives.
Update RSS hash function to a symmetric algorithm.
Update T3C HW support

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:11 -08:00
Divy Le Ray 3e5192eec8 cxgb3 - sysfs methods clean up
Remove unused argument in sysfs methods

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:10 -08:00
Divy Le Ray 23561c9447 cxgb3 - fix interaction with pktgen
Do not use skb->cb to stash unmap info,
save the info to the descriptor state.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:09 -08:00
Divy Le Ray 273fa9042c cxgb3 - FW upgrade
Bump up FW version to 5.0.
Do not downgrade FW within the same major version range.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:09 -08:00
Divy Le Ray 91a6b50cf6 cxgb3 - Add EEH support
Add PCI recovery support

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:08 -08:00
Divy Le Ray 67d92ab765 cxgb3 - Fix resources release.
Remove sysfs entries before unregistering the net devices.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:07 -08:00
Divy Le Ray 678771d6f5 cxgb3 - Use wild card for PCI subdevice ID match
Subdevice ID is not necessarily set to 1.
Use wild card for PCI device matching

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:07 -08:00
Divy Le Ray 42256f57d8 cxgb3 - fix MSI-X failure path
Return error code when msi-x settings fail.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:04:06 -08:00
Joe Perches f07b2e403b drivers/net/cxgb3: Add missing "space"
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-01-28 15:03:55 -08:00
Divy Le Ray 75758e8aa4 cxgb3 - T3C support update
Update GPIO mapping for T3C.
Update xgmac for T3C support.
Fix typo in mtu table.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-12-07 15:00:36 -05:00
Jeff Garzik 7c2399756a [SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate
Rather than hand-rolling our own prototype, make the code more
future-proof by using the standard irq_handler_t typedef.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2007-10-23 19:53:17 -04:00
Jiri Slaby 1977f03272 remove asm/bitops.h includes
including asm/bitops directly may cause compile errors. don't include it
and include linux/bitops instead. next patch will deny including asm header
directly.

Cc: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19 11:53:41 -07:00
Stephen Hemminger 9265fabf0d cxgb3 sparse warning fixes
Fix warnings from sparse related to shadowed variables and routines
that should be declared static.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:55:29 -07:00
Adrian Bunk 0da18e3883 drivers/net/cxgb3/xgmac.c: remove dead code
This patch removes dead code ("tx_xcnt" can never be != 0 at this place)
spotted by the Coverity checker.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:53:51 -07:00
Al Viro fb8e4444cc cxgb3: trivial endianness annotations
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:52:06 -07:00
Jeff Garzik b9f2c0440d [netdrvr] Stop using legacy hooks ->self_test_count, ->get_stats_count
These have been superseded by the new ->get_sset_count() hook.

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:45 -07:00
Jeff Garzik 88d3aafdae [ETHTOOL] Provide default behaviors for a few ethtool sub-ioctls
For the operations
	get-tx-csum
	get-sg
	get-tso
	get-ufo
the default ethtool_op_xxx behavior is fine for all drivers, so we
permit op==NULL to imply the default behavior.
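
The "NULL means default" dispatch can be sketched as follows. The demo_ types and functions are hypothetical stand-ins, not the kernel's struct ethtool_ops or its helpers; only the fallback pattern is the point.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_F_SG 0x1u /* illustrative feature bit for scatter-gather */

struct demo_netdev {
    uint32_t features;
};

struct demo_ethtool_ops {
    /* Optional hook; a driver may leave this NULL. */
    uint32_t (*get_sg)(const struct demo_netdev *dev);
};

/* Default behaviour: report the feature bit straight from the device. */
static uint32_t demo_op_get_sg(const struct demo_netdev *dev)
{
    return (dev->features & DEMO_F_SG) != 0;
}

/* Core dispatch: use the driver's hook if present, else the default. */
static uint32_t demo_get_sg(const struct demo_ethtool_ops *ops,
                            const struct demo_netdev *dev)
{
    uint32_t (*fn)(const struct demo_netdev *) =
        ops->get_sg ? ops->get_sg : demo_op_get_sg;

    return fn(dev);
}
```

A driver with `get_sg` left NULL therefore answers the query instead of failing it, which is the uniform behaviour the message describes.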

This provides a more uniform behavior across all drivers, eliminating
ethtool(8) "ioctl not supported" errors on older drivers that had
not been updated for the latest sub-ioctls.

The ethtool_op_xxx() functions are left exported, in case anyone
wishes to call them directly from a driver-private implementation --
a not-uncommon case.  Should an ethtool_op_xxx() helper remain unused
for a while, except by net/core/ethtool.c, we can un-export it at a
later date.

[ Resolved conflicts with set/get value ethtool patch... -DaveM ]

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:17 -07:00
Ralf Baechle 10d024c1b2 [NET]: Nuke SET_MODULE_OWNER macro.
It's been a useless no-op for long enough in 2.6 so I figured it's time to
remove it.  The number of people that could object because they're
maintaining unified 2.4 and 2.6 drivers is probably rather small.

[ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-10-10 16:51:13 -07:00
Divy Le Ray dc67369573 cxgb3 - Update engine microcode version
The new microcode engine version is set to 1.1.0

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:06 -07:00
Divy Le Ray 1aafee2657 cxgb3 - Add T3C rev
Add driver recognition for the T3C rev board.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:06 -07:00
Divy Le Ray bb9366af7b cxgb3 - CQ context operations time out too soon.
Currently, the driver only tries up to 5 times (5us) to get the results
of a CQ context operation.  Testing has shown the chip can take as much
as 50us to return the response on SG_CONTEXT_CMD operations.  So we up
the retry count to 100 to cover high loads.
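
The effect of the larger retry count can be shown with a bounded polling loop against a simulated device. Everything here is a hypothetical model: the demo_ names are not driver symbols, and the "device" simply stays busy for a fixed number of polls, each standing in for roughly 1us.

```c
#include <stdbool.h>

#define DEMO_CTX_RETRIES 100 /* retry budget after the change */

/* Simulated hardware: reports busy for the next busy_polls polls. */
struct demo_dev {
    unsigned busy_polls;
};

static bool demo_ctx_busy(struct demo_dev *d)
{
    if (d->busy_polls == 0)
        return false;
    d->busy_polls--;
    return true;
}

/* Poll up to 'attempts' times; 0 on completion, -1 if still busy. */
static int demo_wait_ctx_op(struct demo_dev *d, unsigned attempts)
{
    while (attempts--) {
        if (!demo_ctx_busy(d))
            return 0;
        /* a real driver would delay for about 1us between polls */
    }
    return -1;
}
```

A device that needs 50 polls times out with a budget of 5 but completes with a budget of 100, matching the timings quoted above.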

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:05 -07:00
Divy Le Ray 1c17ae8af9 cxgb3 - Set the CQ_ERR bit in CQ contexts.
The cxgb3 driver is incorrectly configuring the HW CQ context for CQ's
that use overflow-avoidance.  Namely the RDMA control CQ.  This results
in a bad DMA from the device to bus address 0.  The solution is to set
the CQ_ERR bit in the context for these types of CQs.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:05 -07:00
Divy Le Ray b4687ff753 cxgb3 - remove false positive in xgmac workaround
Qualify toggling of xgmac tx enable with not getting pause frames:
we might not make forward progress because the peer is sending
lots of pause frames.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:04 -07:00
Divy Le Ray 3eea3337a0 cxgb3 - log and clear PEX errors
Clear pciE PEX errors late at module load time.
Log details when PEX errors occur.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:03 -07:00
Divy Le Ray a5a3b4601b cxgb3 - Firmware update
Update firmware version.
Allow the driver to be up and running with older FW image

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:51:03 -07:00
Divy Le Ray 3f61e4278c cxgb3 - Update internal memory management
Set PM1 internal memory to round robin mode.
This balances access to this internal memory for multiport adapters.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:51 -07:00
Divy Le Ray 167cdf5fbc cxgb3 - log adapter serial number
Log HW serial number when cxgb3 module is loaded.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:50 -07:00
Divy Le Ray c64c2eaeaa cxgb3 - Fatal error update
Stop the MAC when a fatal error is detected.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:50 -07:00
Divy Le Ray c9a6ce500d cxgb3 - tighten checks on TID values
Enforce validity checks on connection ids

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:49 -07:00
Divy Le Ray e22bb45d77 cxgb3 - Expose HW memory page info
A HW issue requires limiting the receive window size
to 23 pages of internal memory.
These pages can be configured to different sizes,
thus the RDMA driver needs to know the
page size to enforce the upper limit.

Also assign explicit enum values.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:49 -07:00
Divy Le Ray 27186dc325 cxgb3 - use immediate data for offload Tx
Send small TX_DATA work requests as immediate data even when
there are fragments. This avoids doing multiple DMAs for
small fragmented packets.
The driver already implements this optimization for small
contiguous packets.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2007-10-10 16:50:48 -07:00