Commit Graph

98 Commits

Author SHA1 Message Date
Krishna Kumar b1424ed910 cxgb3: Fix panic in free_tx_desc()
I got a few of these panics (on 2.6.36-rc7) when running a high
number of netperf sessions:

BUG: unable to handle kernel paging request at 0000100000000000
IP: [<ffffffff813125f0>] skb_release_data+0xa0/0xd0
Oops: 0000 [#1] SMP
Pid: 2155, comm: vhost-2115 Not tainted 2.6.36-rc7-ORG #1 49Y6512     /System x3650 M2 -[7947AC1]-
RIP: 0010:[<ffffffff813125f0>]  [<ffffffff813125f0>] skb_release_data+0xa0/0xd0
RSP: 0018:ffff880001803738  EFLAGS: 00010206
RAX: ffff880179b0fc00 RBX: ffff880178b441c0 RCX: 0000000000000000
RDX: ffff880179b0fd40 RSI: 0000000000000000 RDI: 0000100000000000
RBP: ffff880001803748 R08: 0000000000000001 R09: ffff88017f117000
R10: ffff88017b990608 R11: ffff88017f117090 R12: ffff880178b441c0
R13: ffff88017f117090 R14: 0000000000000000 R15: ffff880178b441c0
FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000100000000000 CR3: 000000017ea64000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process vhost-2115 (pid: 2155, threadinfo ffff88017d872000, task ffff88017e954680)
Stack:
ffff880178b441c0 0000000000000007 ffff880001803768 ffffffff81312119
<0> 0000000000000000 0000000000000002 ffff880001803778 ffffffff813121f9
<0> ffff880001803818 ffffffffa012d14c ffffffffa02de076 ffff880001803700
Call Trace:
<IRQ>
[<ffffffff81312119>] __kfree_skb+0x19/0xa0
[<ffffffff813121f9>] kfree_skb+0x19/0x40
[<ffffffffa012d14c>] free_tx_desc+0x2fc/0x350 [cxgb3]
[<ffffffffa02de076>] ? vhost_poll_wakeup+0x16/0x20 [vhost_net]
[<ffffffffa01323db>] t3_eth_xmit+0x28b/0x380 [cxgb3]
[<ffffffff8131ce47>] dev_hard_start_xmit+0x377/0x5a0
[<ffffffff81335a4a>] sch_direct_xmit+0xfa/0x1d0
[<ffffffff8131d1a9>] dev_queue_xmit+0x139/0x450
[<ffffffff81326225>] neigh_resolve_output+0x125/0x340
[<ffffffff8135a77c>] ip_finish_output+0x14c/0x320
[<ffffffff8135a9fe>] ip_output+0xae/0xc0
[<ffffffff8135620f>] ip_forward_finish+0x3f/0x50
[<ffffffff8135641f>] ip_forward+0x1ff/0x400
[<ffffffff81354789>] ip_rcv_finish+0x119/0x3e0
[<ffffffff81354c7d>] ip_rcv+0x22d/0x300
[<ffffffff8131a95b>] __netif_receive_skb+0x29b/0x570
[<ffffffff8131ba70>] ? netif_receive_skb+0x0/0x80
[<ffffffff8131bae8>] netif_receive_skb+0x78/0x80
[<ffffffffa02a96d8>] br_handle_frame_finish+0x198/0x260 [bridge]
[<ffffffffa02aebc8>] br_nf_pre_routing_finish+0x238/0x380 [bridge]
[<ffffffff813424bc>] ? nf_hook_slow+0x6c/0x100
[<ffffffffa02ae990>] ? br_nf_pre_routing_finish+0x0/0x380 [bridge]
[<ffffffffa02afb08>] br_nf_pre_routing+0x698/0x7a0 [bridge]
[<ffffffff81342414>] nf_iterate+0x64/0xa0
[<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge]
[<ffffffff813424bc>] nf_hook_slow+0x6c/0x100
[<ffffffffa02a9540>] ? br_handle_frame_finish+0x0/0x260 [bridge]
[<ffffffffa02a9931>] br_handle_frame+0x191/0x240 [bridge]
[<ffffffffa02a97a0>] ? br_handle_frame+0x0/0x240 [bridge]
[<ffffffff8131a863>] __netif_receive_skb+0x1a3/0x570
[<ffffffff812ef3f6>] ? dma_issue_pending_all+0x76/0xa0
[<ffffffff8131ad32>] process_backlog+0x102/0x200
[<ffffffff8131c2d0>] net_rx_action+0x100/0x220
[<ffffffff810548ef>] __do_softirq+0xaf/0x140
[<ffffffff8100bcdc>] call_softirq+0x1c/0x30
[<ffffffff8100dfc5>] ? do_softirq+0x65/0xa0
[<ffffffff8131c6b8>] netif_rx_ni+0x28/0x30
[<ffffffffa02c305d>] tun_sendmsg+0x2cd/0x4b0 [tun]
[<ffffffffa02e01af>] handle_tx+0x1df/0x340 [vhost_net]
[<ffffffffa02e0340>] handle_tx_kick+0x10/0x20 [vhost_net]
[<ffffffffa02de29b>] vhost_worker+0xbb/0x130 [vhost_net]
[<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net]
[<ffffffffa02de1e0>] ? vhost_worker+0x0/0x130 [vhost_net]
[<ffffffff81069686>] kthread+0x96/0xa0
[<ffffffff8100bbe4>] kernel_thread_helper+0x4/0x10
[<ffffffff810695f0>] ? kthread+0x0/0xa0
[<ffffffff8100bbe0>] ? kernel_thread_helper+0x0/0x10
Code: 8b 94 24 d0 00 00 00 49 8b 84 24 d8 00 00 00 48 8d 14 10 0f b7 0a 39 d9 7f d1 48 8b 7a 10 48 85 ff 74 20 48 c7 42 10 00 00 00 00 <48> 8b 1f e8 e8 fb ff ff 48 85 db 48 89 df 75 f0 49 8b 84 24 d8

Patch below fixes the panic. cxgb4 and cxgb4vf already have this fix.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-28 10:27:02 -07:00
stephen hemminger a5190b4eea cxgb3: function namespace cleanup
Make local functions static. Remove functions that are
defined and never used. Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 07:19:02 -07:00
Jesse Gross eab6d18d20 vlan: Don't check for vlan group before vlan_tx_tag_present.
Many (but not all) drivers check to see whether there is a vlan
group configured before using a tag stored in the skb.  There's
not much point in this check since it just throws away data that
should only be present in the expected circumstances.  However,
it will soon be legal and expected to get a vlan tag when no
vlan group is configured, so remove this check from all drivers
to avoid dropping the tags.

Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-21 01:26:52 -07:00
Eric Dumazet bc8acf2c8c drivers/net: avoid some skb->ip_summed initializations
Fresh skbs have ip_summed set to CHECKSUM_NONE (0).

We can avoid setting skb->ip_summed to CHECKSUM_NONE again in drivers.

Introduce skb_checksum_none_assert() helper so that we keep this
assertion documented in driver sources.

Replace most occurrences of:

skb->ip_summed = CHECKSUM_NONE;

with:

skb_checksum_none_assert(skb);

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-02 19:06:22 -07:00
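The idea behind the helper in bc8acf2c8c can be pictured with a small userspace sketch. This is not the kernel implementation: `struct fake_skb`, `fake_alloc_skb()` and the `SKB_DEBUG` switch are stand-ins invented for illustration; only the helper name and the fact that `CHECKSUM_NONE` is 0 come from the commit message.

```c
#include <assert.h>
#include <stdlib.h>

#define CHECKSUM_NONE 0 /* value stated in the commit message */

/* Hypothetical stand-in for struct sk_buff; only the field we need. */
struct fake_skb {
    unsigned char ip_summed;
};

/* Hypothetical allocator: calloc() zeroes the struct, so a fresh
 * buffer already has ip_summed == CHECKSUM_NONE. */
static struct fake_skb *fake_alloc_skb(void)
{
    return calloc(1, sizeof(struct fake_skb));
}

/* Documents the expectation at the call site instead of storing the
 * value again. The check is only compiled in when SKB_DEBUG (an
 * assumed debug switch) is defined. */
static void skb_checksum_none_assert(const struct fake_skb *skb)
{
#ifdef SKB_DEBUG
    assert(skb->ip_summed == CHECKSUM_NONE);
#else
    (void)skb;
#endif
}
```

A driver-style receive path would call `skb_checksum_none_assert(skb)` where it previously wrote `skb->ip_summed = CHECKSUM_NONE;`, which keeps the intent visible without the redundant store.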
FUJITA Tomonori 122e28ebac cxgb3: simplify need_skb_unmap
We can use CONFIG_NEED_DMA_MAP_STATE to see if a platform does real
DMA unmapping.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-08 23:12:27 -07:00
FUJITA Tomonori 56e3b9df13 cxgb3: use the DMA state API instead of the pci equivalents
This replaces the PCI DMA state API (include/linux/pci-dma.h) with the
DMA equivalents since the PCI DMA state API will become obsolete.

No functional change.

For further information about the background:

http://marc.info/?l=linux-netdev&m=127037540020276&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-04-13 02:54:18 -07:00
Tejun Heo 5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability.  As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and tries to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have a fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   widely available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jiri Kosina 318ae2edc3 Merge branch 'for-next' into for-linus
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
2010-03-08 16:55:37 +01:00
Linus Torvalds 3ff1562ea4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits)
  IB/srp: Clean up error path in srp_create_target_ib()
  IB/srp: Split send and recieve CQs to reduce number of interrupts
  RDMA/nes: Add support for KR device id 0x0110
  IB/uverbs: Use anon_inodes instead of private infinibandeventfs
  IB/core: Fix and clean up ib_ud_header_init()
  RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing
  RDMA/cxgb3: Don't allocate the SW queue for user mode CQs
  RDMA/cxgb3: Increase the max CQ depth
  RDMA/cxgb3: Doorbell overflow avoidance and recovery
  IB/core: Pack struct ib_device a little tighter
  IB/ucm: Clean whitespace errors
  IB/ucm: Increase maximum devices supported
  IB/ucm: Use stack variable 'base' in ib_ucm_add_one
  IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one
  IB/umad: Clean whitespace
  IB/umad: Increase maximum devices supported
  IB/umad: Use stack variable 'base' in ib_umad_init_port
  IB/umad: Use stack variable 'devnum' in ib_umad_init_port
  IB/umad: Remove port_table[]
  IB/umad: Convert *cdev to cdev in struct ib_umad_port
  ...
2010-03-03 07:33:17 -08:00
Steve Wise e998f245c4 RDMA/cxgb3: Doorbell overflow avoidance and recovery
T3 hardware doorbell FIFO overflows can cause application stalls due
to lost doorbell ring events.  This has been seen when running large
NP IMB alltoall MPI jobs.  The T3 hardware supports an xon/xoff-type
flow control mechanism to help avoid overflowing the HW doorbell FIFO.

This patch uses these interrupts to disable RDMA QP doorbell rings
when we near an overflow condition, and then turn them back on (and
ring all the active QP doorbells) when the doorbell FIFO empties
out.  In addition, if a doorbell ring is dropped by the hardware, the
code will now recover.

Design:

cxgb3:
- enable these DB interrupts
- in the interrupt handler, schedule work tasks to call the ULPs event
  handlers with the new events.
- ring all the qset txqs when an overflow is detected.

iw_cxgb3:
- disable db ringing on all active qps when we get the DB_FULL event
- enable db ringing on all active qps and ring all active dbs when we get
  the DB_EMPTY event
- On DB_DROP event:
       - disable db rings in the event handler
       - delay-schedule a work task which rings and enables the dbs on
         all active qps.
- in post_send and post_recv logic, don't ring the db if it's disabled.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-02-24 10:40:28 -08:00
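The iw_cxgb3 side of the design in e998f245c4 can be sketched as a small state model. Everything here is an illustrative assumption rather than driver code: `struct qp`, `handle_db_event()`, `db_drop_work()` and `post_work()` are hypothetical names, the doorbell write is replaced by a counter, and the delayed work task is modelled as a plain function the caller invokes later. Only the three event names come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

enum db_event { DB_FULL, DB_EMPTY, DB_DROP };

struct qp {
    int db_enabled;     /* may this QP ring its doorbell? */
    unsigned int rings; /* rings issued; stands in for the hardware write */
};

static void set_db_enabled(struct qp *qps, size_t n, int on)
{
    for (size_t i = 0; i < n; i++)
        qps[i].db_enabled = on;
}

static void ring_all(struct qp *qps, size_t n)
{
    for (size_t i = 0; i < n; i++)
        qps[i].rings++;
}

/* Deferred part of DB_DROP handling: ring every doorbell, then
 * allow normal ringing again. */
static void db_drop_work(struct qp *qps, size_t n)
{
    ring_all(qps, n);
    set_db_enabled(qps, n, 1);
}

static void handle_db_event(struct qp *qps, size_t n, enum db_event ev)
{
    assert(qps != NULL);
    switch (ev) {
    case DB_FULL:  /* FIFO nearly full: stop ringing */
        set_db_enabled(qps, n, 0);
        break;
    case DB_EMPTY: /* FIFO drained: resume and ring everyone */
        set_db_enabled(qps, n, 1);
        ring_all(qps, n);
        break;
    case DB_DROP:  /* a ring was lost: stop now, recover later */
        set_db_enabled(qps, n, 0);
        /* the caller schedules db_drop_work() after a delay */
        break;
    }
}

/* post_send/post_recv path: skip the doorbell while it is disabled.
 * Returns 1 if the doorbell was rung, 0 otherwise. */
static int post_work(struct qp *qp)
{
    if (!qp->db_enabled)
        return 0;
    qp->rings++;
    return 1;
}
```

Work posted while doorbells are disabled is not lost in this model: the later `ring_all()` call on DB_EMPTY or in `db_drop_work()` covers every active QP.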
David S. Miller b1109bf085 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-02-09 11:44:44 -08:00
Divy Le Ray 2d171886b1 cxgb3: fix GRO checksum check
Verify the HW checksum state for frames handed to GRO processing.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-08 22:37:24 -08:00
Stefan Weil 947af29435 Fix spelling of 'platform' in comments and doc
Replace platfrom -> platform.

This is a frequent spelling bug.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-02-05 12:22:34 +01:00
Divy Le Ray 2e02644abc cxgb3: add memory barriers
Add memory barriers to fix crashes observed on newest PowerPC platforms.
The HW and driver state of the receive rings were getting out of sync.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-02-03 18:37:10 -08:00
Linus Torvalds 4ef58d4e2a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
  tree-wide: fix misspelling of "definition" in comments
  reiserfs: fix misspelling of "journaled"
  doc: Fix a typo in slub.txt.
  inotify: remove superfluous return code check
  hdlc: spelling fix in find_pvc() comment
  doc: fix regulator docs cut-and-pasteism
  mtd: Fix comment in Kconfig
  doc: Fix IRQ chip docs
  tree-wide: fix assorted typos all over the place
  drivers/ata/libata-sff.c: comment spelling fixes
  fix typos/grammos in Documentation/edac.txt
  sysctl: add missing comments
  fs/debugfs/inode.c: fix comment typos
  sgivwfb: Make use of ARRAY_SIZE.
  sky2: fix sky2_link_down copy/paste comment error
  tree-wide: fix typos "couter" -> "counter"
  tree-wide: fix typos "offest" -> "offset"
  fix kerneldoc for set_irq_msi()
  spidev: fix double "of of" in comment
  comment typo fix: sybsystem -> subsystem
  ...
2009-12-09 19:43:33 -08:00
Jiri Kosina d014d04386 Merge branch 'for-next' into for-linus
Conflicts:

	kernel/irq/chip.c
2009-12-07 18:36:35 +01:00
André Goddard Rosa af901ca181 tree-wide: fix assorted typos all over the place
That is "success", "unknown", "through", "performance", "[re|un]mapping"
, "access", "default", "reasonable", "[con]currently", "temperature"
, "channel", "[un]used", "application", "example","hierarchy", "therefore"
, "[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-12-04 15:39:55 +01:00
David S. Miller 3505d1a9fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/sfc/sfe4001.c
	drivers/net/wireless/libertas/cmd.c
	drivers/staging/Kconfig
	drivers/staging/Makefile
	drivers/staging/rtl8187se/Kconfig
	drivers/staging/rtl8192e/Kconfig
2009-11-18 22:19:03 -08:00
Divy Le Ray 70e3bb504c cxgb3: fix premature page unmap
Unmap the Rx page only when it is guaranteed that this page won't be
used anymore to allocate Rx page chunks.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-11-18 05:11:14 -08:00
Krishna Kumar 10e85f7f08 cxgb3: Set the rxq
Set the rxq# for LRO when processing the last fragment of a
frame. This helps in fast txq selection for routing workloads.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-27 01:02:32 -07:00
Krishna Kumar 0d9a40de60 cxgb3: No need to wake queue in xmit handler
The xmit handler doesn't need to wake the queue after stopping
it temporarily (some other drivers are doing the same).

Patch on net-next-2.6, multiple netperf sessions tested.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-17 23:57:25 -07:00
Karen Xie f14d42f314 cxgb3: Added private MAC address and provisioning packet handler for iSCSI
This patch added support of private MAC address per port and provisioning
packet handler for iSCSI traffic only.

The above changes are isolated to the cxgb3 driver, independent of any scsi or iscsi driver changes.

Acked-by: Karen Xie <kxie@chelsio.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Rakesh Ranjan <rakesh@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-13 03:44:06 -07:00
Stephen Hemminger 61357325f3 netdev: convert bulk of drivers to netdev_tx_t
In a couple of cases collapse some extra code like:
   int retval = NETDEV_TX_OK;
   ...
   return retval;
into
   return NETDEV_TX_OK;

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-01 01:14:07 -07:00
David S. Miller b2f8f7525c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/forcedeth.c
2009-06-03 02:43:41 -07:00
Divy Le Ray c3a8c5b644 cxgb3: move away from LLTX
cxgb3 no longer advertises LLTX.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 15:55:03 -07:00
Divy Le Ray 10b6d95612 cxgb3: fix dma mapping regression
Commit 5e68b772e6
  cxgb3: map entire Rx page, feed map+offset to Rx ring.

introduced a regression on platforms defining DECLARE_PCI_UNMAP_ADDR()
and related macros as no-ops.

Rx descriptors are fed with a page buffer bus address + page chunk offset.
The page buffer bus address is set and retrieved through
pci_unmap_addr_set(), pci_unmap_addr().
These functions are meaningless on x86 (if CONFIG_DMA_API_DEBUG is not set),
so the HW ends up with a bogus bus address.

This patch saves the page buffer bus address for all platforms.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 01:54:37 -07:00
Eric Dumazet 28679751a9 net: dont update dev->trans_start in 10GB drivers
Followup of commits 9d21493b4b
and 08baf56108
(net: tx scalability works : trans_start)
(net: txq_trans_update() helper)

Now that core network takes care of trans_start updates, don't do it
in drivers themselves, if possible. Multi queue drivers can
avoid one cache miss (on dev->trans_start) in their start_xmit()
handler.

Exceptions are NETIF_F_LLTX drivers (vxge & tehuti)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-29 01:46:26 -07:00
Herbert Xu 76620aafd6 gro: New frags interface to avoid copying shinfo
It turns out that copying a 16-byte area at ~800k times a second
can be really expensive :) This patch redesigns the frags GRO
interface to avoid copying that area twice.

The two disciples of the frags interface have been converted.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-16 02:02:07 -07:00
Divy Le Ray 5e68b772e6 cxgb3: map entire Rx page, feed map+offset to Rx ring.
DMA mapping can be expensive in the presence of iommus.
Reduce the Rx iommu activity by mapping an entire page, and provide the H/W
the mapped address + offset of the current page chunk.
Reserve bits at the end of the page to track mapping references, so the page
can be unmapped.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-27 00:46:59 -07:00
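The scheme described in 5e68b772e6 (one mapping per page, address plus offset per chunk, a reference tracker in the reserved tail of the page) can be sketched in userspace as follows. All names and sizes are assumptions made for illustration: `struct rx_page`, `chunk_get()`, `chunk_put()`, the 64-byte reserved area and the 2016-byte chunk are hypothetical, the tracker is modelled as a simple counter, and the DMA mapping is reduced to a flag.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PG_SIZE    4096u
#define PG_RSVD    64u   /* hypothetical: bytes reserved at the end of the page */
#define CHUNK_SIZE 2016u /* hypothetical: (PG_SIZE - PG_RSVD) / 2 */
#define CHUNKS_PER_PAGE ((PG_SIZE - PG_RSVD) / CHUNK_SIZE)

struct rx_page {
    unsigned char *va; /* CPU address of the page */
    uint64_t bus_addr; /* bus address from the single whole-page mapping */
    int mapped;        /* stand-in for the real DMA mapping state */
};

/* The reference counter lives in the reserved tail of the page itself. */
static uint32_t refs_load(const struct rx_page *pg)
{
    uint32_t refs;
    memcpy(&refs, pg->va + PG_SIZE - PG_RSVD, sizeof(refs));
    return refs;
}

static void refs_store(struct rx_page *pg, uint32_t refs)
{
    memcpy(pg->va + PG_SIZE - PG_RSVD, &refs, sizeof(refs));
}

/* Hand chunk idx to the Rx ring: mapped page address + chunk offset.
 * No new mapping is created; only the reference count grows. */
static uint64_t chunk_get(struct rx_page *pg, unsigned int idx)
{
    assert(idx < CHUNKS_PER_PAGE);
    refs_store(pg, refs_load(pg) + 1);
    return pg->bus_addr + (uint64_t)idx * CHUNK_SIZE;
}

/* Drop one chunk reference; the page is unmapped once none remain. */
static void chunk_put(struct rx_page *pg)
{
    uint32_t refs = refs_load(pg) - 1;
    refs_store(pg, refs);
    if (refs == 0)
        pg->mapped = 0; /* a real driver would unmap the page here */
}
```

The point of the design is that the IOMMU is touched once per page rather than once per chunk; the per-chunk cost is an addition and a counter update.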
Divy Le Ray 3fa58c883d cxgb3: sge setup fixes
Enable timestamps and update the delayed ack threshold for iSCSI/iWARP traffic.
Remove the len flag in Tx requests. It might corrupt offload trace packets.
Update SGE context setup to avoid potential H/W misprogramming.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-27 00:46:57 -07:00
Divy Le Ray 3156378993 cxgb3: start qset timers when setup succeeded
Start queue set reclaim timers after the queue sets have been
allocated successfully.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-27 00:46:56 -07:00
Divy Le Ray fc88219601 cxgb3: disable high freq non-data interrupts
Under RX pressure, the HW might generate a high load of interrupts
to signal MAC FIFO or free list overflows.
Disable the interrupts, and poll the relevant status bits
to maintain stats.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:45 -07:00
Divy Le Ray 42c8ea17e8 cxgb3: separate TX and RX reclaim handlers
Separate TX and RX reclaim handlers
Don't disable interrupts in RX reclaim handler.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:45 -07:00
Divy Le Ray b2b964f064 cxgb3: prefetch buffer access in GRO mode
Eliminate a cache miss when accessing the CPL header within
the first aggregated buffer.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:44 -07:00
Divy Le Ray 8f4358044d cxgb3: fix skb truesize in jumbo mode
Update skb truesize correctly for the 2nd buffer from a Jumbo frame

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:44 -07:00
Divy Le Ray 9bb2b31e6f cxgb3: release page ref on mapping error
Release page chunk reference in case we fail to map it.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:43 -07:00
Divy Le Ray 26b3871d2c cxgb3: ring rx door bell less frequently
Ring free lists door bell less frequently,
specifically every quarter of the active FL
size.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-13 11:30:43 -07:00
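The batching rule in 26b3871d2c (ring once a quarter of the active free list has been refilled) can be sketched as below. The structure and function names are hypothetical, and the doorbell write is replaced by a counter so the behaviour is observable.

```c
#include <assert.h>

/* Hypothetical free-list bookkeeping, not the driver's structure. */
struct free_list {
    unsigned int size;      /* number of active entries in the list */
    unsigned int pend_cred; /* buffers added since the last doorbell */
    unsigned int doorbells; /* rings issued; stands in for the HW write */
};

/* Account for n newly added buffers and ring the doorbell only once
 * a quarter of the list is pending. Returns 1 if the doorbell rang. */
static int fl_refill(struct free_list *fl, unsigned int n)
{
    assert(fl->size >= 4); /* keeps size / 4 non-zero in this sketch */
    fl->pend_cred += n;
    if (fl->pend_cred < fl->size / 4)
        return 0;
    fl->pend_cred = 0;
    fl->doorbells++;
    return 1;
}
```

With a 64-entry list, refilling one buffer at a time produces one doorbell per 16 buffers instead of one per buffer.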
David S. Miller 7870389478 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2009-02-04 16:52:41 -08:00
Divy Le Ray 65ab8385b6 cxgb3: Fix lro switch
The LRO switch is always set to 1 in the rx processing loop.
It breaks the accelerated iSCSI receive traffic.
Fix its computation.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-04 16:31:39 -08:00
David S. Miller 0c8dfc830a net: Add skb_record_rx_queue() calls to multiqueue capable drivers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-27 16:22:32 -08:00
Herbert Xu 7be2df451f cxgb3: Replace LRO with GRO
This patch makes cxgb3 invoke the GRO hooks instead of LRO.  As
GRO has a compatible external interface to LRO this is a very
straightforward replacement.

I've kept the ioctl controls for per-queue LRO switches.  However,
we should not encourage anyone to use these.

Because of that, I've also kept the skb construction code in
cxgb3.  Hopefully we can phase out those per-queue switches
and then kill this too.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-21 14:39:13 -08:00
Divy Le Ray eed087e367 cxgb3: Fix LRO misalignment
The lro manager's frag_align_pad setting was missing,
leading to misaligned access to the skb passed up
to the stack.

Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-19 16:20:16 -08:00
Roland Dreier 47fd23fe8e cxgb3: Keep LRO off if disabled when interface is down
I have a system with a Chelsio adapter (driven by cxgb3) whose ports are
part of a Linux bridge.  Recently I updated the kernel and discovered
that things stopped working because cxgb3 was doing LRO on packets that
were passed into the bridge code for forwarding.  (Incidentally, this
problem manifested itself in a strange way that made debugging a bit
interesting -- for some reason, the skb_warn_if_lro() check in bridge
didn't trigger and these LROed packets were forwarded out a forcedeth
interface, and caused the forcedeth transmit path to get stuck)

This is because cxgb3 has no way of keeping state for the LRO flag until
the interface is brought up, so if the bridging code disables LRO while
the interface is down, then cxgb3_up() will just reenable LRO, and on my
Debian system at least, the init scripts add interfaces to a bridge
before bringing the interfaces up.

Fix this by keeping track of each interface's LRO state in cxgb3 so that
when bridge disables LRO, it stays disabled in cxgb3_up() when the
interface is brought up.  I did this by changing the rx_csum_offload
flag into a pair of bit flags; the effect of this on the rx_eth() fast
path is minuscule enough that it should be fine (eg on x86, a cmpb
instruction becomes a testb instruction).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-11 00:19:36 -08:00
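The fix in 47fd23fe8e, where a single checksum-offload flag becomes a pair of bit flags so the LRO setting survives until the interface comes up, can be sketched like this. The bit names, `struct port_info` layout and helper functions are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit names for the per-port receive offload state. */
#define RX_CSUM_FLAG 0x1u
#define RX_LRO_FLAG  0x2u

struct port_info {
    unsigned char rx_offload; /* bit flags instead of a single boolean */
};

/* Record the requested LRO state; this works while the port is down. */
static void port_set_lro(struct port_info *pi, int on)
{
    assert(pi != NULL);
    if (on)
        pi->rx_offload |= RX_LRO_FLAG;
    else
        pi->rx_offload &= (unsigned char)~RX_LRO_FLAG;
}

/* Bring-up path: apply the stored LRO state instead of forcing it on.
 * Returns the LRO setting that would be programmed into the queues. */
static int port_up(const struct port_info *pi)
{
    return (pi->rx_offload & RX_LRO_FLAG) != 0;
}

/* Fast-path check: a bit test rather than a compare against 1. */
static int rx_csum_enabled(const struct port_info *pi)
{
    return (pi->rx_offload & RX_CSUM_FLAG) != 0;
}
```

If the bridge code clears LRO while the port is down, `port_up()` in this model reports it as still disabled, which is the behaviour the commit message asks for.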
Karen Xie a109a5b916 cxgb3: manage private iSCSI IP address
The accelerated iSCSI traffic could use a private IP address unknown to the OS:
- The IP address is required in both drivers to manage ARP requests and connection set up.
- Added a control call to retrieve the IP address.
- Reply to ARP requests dedicated to the private IP address.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-18 22:56:20 -08:00
Divy Le Ray 82ad332974 cxgb3: Add multiple Tx queue support.
Implement NIC Tx multiqueue.
Bump up driver version.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-12-16 01:09:39 -08:00
Roland Dreier c5419e6f05 cxgb3: Fix sparse warning and micro-optimize is_pure_response()
The function is_pure_response() does "ntohl(var) & const" and then
essentially just tests whether the result is 0 or not; this can be done
more efficiently by computing "var & htonl(const)" instead and doing the
byte swap at compile time instead of run time.

This change slightly shrinks the compiled code; eg on x86-64 we save a
couple of bswapl instructions:

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-8 (-8)
function                                     old     new   delta
t3_sge_intr_msix_napi                        544     536      -8

and this also has the pleasant side effect of fixing a sparse warning:

    drivers/net/cxgb3/sge.c:2313:15: warning: restricted degrades to integer

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-28 21:55:42 -08:00
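The micro-optimization in c5419e6f05 can be illustrated in userspace. The mask value and function names below are hypothetical (the driver uses its own response-flag constants); the sketch only shows why swapping the constant instead of the variable gives the same zero/non-zero answer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask in host byte order. */
#define RSP_FLAGS_MASK 0x0000ff00u

/* Before: byte-swap the variable at run time, then mask. */
static int has_flags_slow(uint32_t flags_be)
{
    return (ntohl(flags_be) & RSP_FLAGS_MASK) != 0;
}

/* After: byte-swap the constant instead. A byte swap only moves bits
 * around, so the masked result is zero in exactly the same cases, and
 * the swap of a constant can be folded by the compiler. */
static int has_flags_fast(uint32_t flags_be)
{
    return (flags_be & htonl(RSP_FLAGS_MASK)) != 0;
}

/* Sanity helper: both forms must agree for any input. */
static void check_same(uint32_t flags_be)
{
    assert(has_flags_slow(flags_be) == has_flags_fast(flags_be));
}
```

This rewrite is only valid because the result is tested against zero; if the masked value itself were needed in host order, the run-time swap would still be required.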
Divy Le Ray 5256554489 cxgb3: avoid potential memory leak.
Add consistency in alloc_ring() parameter checking
to avoid potential memory leaks.
alloc_ring() callers are correct so far.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-26 15:35:59 -08:00
David S. Miller babcda74e9 drivers/net: Kill now superfluous ->last_rx stores.
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.

Drivers need not do it any more.

Some cases had to be skipped over because the drivers
were making use of the ->last_rx value themselves.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-03 21:11:17 -08:00
Divy Le Ray a02d44a02b cxgb3: extend copyrights to 2008
Update copyright banner to 2008.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-13 18:47:30 -07:00
Divy Le Ray 20d3fc1150 cxgb3: reset the adapter on fatal error
When a fatal error occurs, bring ports down, reset the chip,
and bring ports back up.

Factorize code used for both EEH and fatal error recovery.
Fix timer usage when bringing up/resetting sge queue sets.

Signed-off-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-08 17:36:03 -07:00