Add multi-slice/MSI-X support. By default, a single slice
(and the normal firmware) are used. To enable msi-x, multi-slice
mode, one must load the driver with myri10ge_max_slices set to
either -1, or something larger than 1.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add several routines that multislices support will use.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This patch makes the needlessly global
myri10ge_get_firmware_capabilities() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix a long-standing bug/misunderstanding between the
driver and the firmware. The size of the interrupt
queue must be set to the number of rx slots (big + small),
and it should never have been a tunable.
Setting it too small results in chaos.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add myri10ge_get_firmware_capabilities() to retrieve TSO6 and
interrupt slots capabilities from the firmware.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
To prepare and simplify multislice rx support, add a single slice
structure and move some fields in there.
No functional change yet.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix another potential for an infinite loop while looking for the
root port in myri10ge_enable_ecrc().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add some blank lines to uniformize the code and match
the upstream code.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add a barrier() in the usleep() loop in myri10ge_send_cmd().
Without the barrier, some mips machine never notices that the
firmware has DMA'ed the response.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Make ethtool report FIBER for XFP based NIC's port type.
Don't bother to poke around and try to find out what is in
the XFP cage, since Linux does not have separate media types
for -SR -LR, etc.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Properly align scratch buffers when making boot commands.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Increase the handoff timeout to 512ms so as to give the aeluros based
NICs sufficient time to handoff without relying on the msleep() being
sloppy, and accidentally sleeping way longer than the 20ms we specified
in 20 separate 1ms sleeps.
Fix typo in the handoff sleep delay, which made it additive, not
exponential.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Remove useless linebreaks at the end of MODULE_PARM_DESC
and fix the description of myri10ge_lro_max_pkts.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Some drivers have duplicated unlikely() macros. IS_ERR() already has
unlikely() in itself.
This patch cleans up such pointless code.
Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using ARRAY_SIZE() on arrays of the form array[][K] makes it unnecessary
to know the value of K when checking its size.
Signed-off-by: Alejandro Martinez Ruiz <alex@flawedcode.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Drivers do this to try to break out of the ->poll()'ing loop
when the device is being brought administratively down.
Now that we have a napi_disable() "pending" state we are going
to solve that problem generically.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a field to the lro_mgr struct so that drivers can specify how much
padding is required to align layer 3 headers when a packet is copied
into a freshly allocated skb by inet_lro.c:lro_gen_skb(). Without
padding, skbs generated by LRO will cause alignment warnings on
architectures which require strict alignment (seen on sparc64).
Myri10GE is updated to use this field.
Signed-off-by: Andrew Gallatin <gallatin@myri.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When testing the myri10ge driver with 2.6.24-rc1, I found
that the machine crashed under heavy load:
Unable to handle kernel paging request at 0000000000100108 RIP:
[<ffffffff803cc8dd>] net_rx_action+0x11b/0x184
The address corresponds to the list_move_tail() in
netif_rx_complete():
if (unlikely(work == weight))
list_move_tail(&n->poll_list, list);
Eventually, I traced the crashes to calling netif_rx_complete() with
work_done == budget. From looking at other drivers, it appears that
one should only call netif_rx_complete() when work_done < budget.
To fix it, I changed the test in myri10ge_poll() so that it refers
to to work_done rather than looking at the rx ring status. If
work_done is < budget, then that implies we have no more packets to
process. Any races will be resolved by the NIC when the write to
irq_claim is made.
In myri10ge_clean_rx_done(), if we ever exceeded our budget, it would
report a work_done one larger than was acutally done. This is because
the increment was done in the conditional, so work_done would be
incremented regardless of whether or not the test passed or failed.
This would lead to the WARN_ON_ONCE(work > weight); warning in
net_rx_action triggering. I've moved the increment of work_done
inside the loop. Note that this would only be a problem when we had
exceeded our budget.
Signed off by: Andrew Gallatin <gallatin@myri.com>
Andrew Gallatin Myricom Inc
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Found these while looking at printk uses.
Add missing newlines to dev_<level> uses
Add missing KERN_<level> prefixes to multiline dev_<level>s
Fixed a wierd->weird spelling typo
Added a newline to a printk
Signed-off-by: Joe Perches <joe@perches.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mark M. Hoffman <mhoffman@lightlink.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: Tilman Schmidt <tilman@imap.cc>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: Greg KH <greg@kroah.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Smart <James.Smart@Emulex.Com>
Cc: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix one comment in myri10ge.c and update indendation and white spaces
to match the code generated by indent from upstream CVS.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
These have been superceded by the new ->get_sset_count() hook.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the operations
get-tx-csum
get-sg
get-tso
get-ufo
the default ethtool_op_xxx behavior is fine for all drivers, so we
permit op==NULL to imply the default behavior.
This provides a more uniform behavior across all drivers, eliminating
ethtool(8) "ioctl not supported" errors on older drivers that had
not been updated for the latest sub-ioctls.
The ethtool_op_xxx() functions are left exported, in case anyone
wishes to call them directly from a driver-private implementation --
a not-uncommon case. Should an ethtool_op_xxx() helper remain unused
for a while, except by net/core/ethtool.c, we can un-export it at a
later date.
[ Resolved conflicts with set/get value ethtool patch... -DaveM ]
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch covers something like this:
dev = alloc_*dev(...
...
priv = netdev_priv(dev);
memset(priv, 0, sizeof(*priv));
The memset() here is superfluous. alloc_netdev() uses kzalloc()
to allocate needed memory so there is no need to zero the priv region
twice.
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.
In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.
The signature of the ->poll() call back goes from:
int foo_poll(struct net_device *dev, int *budget)
to
int foo_poll(struct napi_struct *napi, int budget)
The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.
The napi_struct is to be embedded in the device driver private data
structures.
Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.
With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on a patch from Peter Oruba, convert myri10ge to use pcie_get_readrq()
and pcie_set_readrq() instead of our own PCI calls and arithmetics.
These driver changes incorporate the proposed PCI-X / PCI-Express read byte
count interface. Reading and setting those values doesn't take place
"manually", instead wrapping functions are called to allow quirks for some
PCI bridges.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off by: Peter Oruba <peter.oruba@amd.com>
Based on work by Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Use the pause counter to avoid a needless device reset, and
print a message telling the admin that our link partner is
flow controlling us down to 0 pkts/sec.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove nonsensical limit in the tx done routine. Specifically,
the loop will always terminate after processing <= 1 rings worth
of frames, as the mcp index is not refetched, so the removed
conditional could never be true.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
SET_NETDEV_DEV() in myri10ge to create the "/sys/class/net/<if>/device"
symlink.
Signed-off-by: Maik Hampel <m.hampel@gmx.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Since Myri-10G boards may also run in Myrinet mode instead of Ethernet,
add a message when we detect that the link partner is not running in the
right mode.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Limit the number of recoveries from a NIC hw watchdog reset to 1 by default.
It enables detection of defective NICs immediately since these memory parity
errors are expected to happen very rarely (less than once per century*NIC).
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove the aligned-completion whitelist, and replace it by using the 1.4.16
firmware's auto-detection features to choose which firmware to load.
The driver now loads the aligned firmware, performs a MXGEFW_CMD_UNALIGNED_TEST,
and falls back to using the unaligned firmware if:
- The firmware is too old (ie, MXGEFW_CMD_UNALIGNED_TEST is an unknown command).
- The MXGEFW_CMD_UNALIGNED_TEST returns MXGEFW_CMD_ERROR_UNALIGNED, meaning
that it has seen an unaligned completion during the DMA test.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Don't count on whatever implementation artifact preserves the
multicast list across a reset cmd, and setup multicast filtering
as part of our reset routine.
The setting of allmulti when adopting firmware with the rx-filter
broadcast bug is also moved into the multicast setup routine where
it belongs.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add dropped_pause, dropped_bad_phy, dropped_bad_crc32,
dropped_unicast_filtered to the set of ethtool counters.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
The ip_hdrlen() buddy, created to reduce the number of skb->h.th-> uses and to
avoid the longer, open coded equivalent.
Ditched a no-op in bnx2 in the process.
I wonder if we should have a BUG_ON(skb->h.th->doff < 5) in tcp_optlen()...
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the quite common 'skb->h.raw - skb->data' sequence.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One less thing for drivers writers to worry about.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the Intel 5000 southbridge (aka Intel 6310/6311/6321ESB) PCIe ports
and the Intel E30x0 chipsets to the whitelist of aligned PCIe completion.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Simpler way of dealing with the firmware 4KB boundary crossing
restriction for rx buffers. This fixes a variety of memory
corruption issues when using an "uncommon" MTU with a 16KB
page size.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Correctly detect when TSO should be used on transmit by looking at the
skb->gso_size rather than seeing if the frame was larger than our MTU.
The old method causes problems when a host with a large (jumbo) MTU is
sending to a host with a small (standard) MTU.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix management of allocated physical pages when the architecture
page size is not 4kB since the firmware cannot cross 4K boundary.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add a wc_enabled flag in the myri10ge_priv instead of relying
on mtrr >= 0.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Do not use 4k rdma request on SGI TIOCE chipset since this
bridge does not support it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Allocate a specific page and use pci_map_page for dma test instead
of relying on another existing buffer.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix a missing error check in myri10ge_allocate_rings() and set status
to -ENOMEM before all actual allocations so that the error path returns
what it should.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix copyright and license ("regents" should not have ever been used).
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Work around a bug which occurs when adopting firmware versions
1.4.4 though 1.4.11 where broadcasts are filtered as if they
were multicasts.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Remove the NETIF_F_TSO #ifdef-ery in drivers/net; this was
for old-old-2.4 compat (even current 2.4 has NETIF_F_TSO)
but it's time to get rid of it by now.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Now that IRQ allocation is done in myri10ge_open(), we want to still
check when loading the driver that IRQ allocation could succeed later.
Additionaly, we fix the initialization and printing of netdev->irq.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Under some circumstances, using WC without the WC fifo is faster.
So we make it possible to tune wc_fifo with a module parameter.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
On suspend, handle pci_set_power_state errors, and on resume
handle failures in pci_resume_state().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The PCI MSI and express state are already saved and restored by the
current versions of pci_save_state/pci_restore_state.
Therefore it is no longer necessary for the driver to do it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Now that IRQ are requested is called on open() and freed on close(),
we can safely switch from/to MSI without unloading the module.
We are guaranteed to correctly free IRQ even if the sysfs file got
written in the meantime since the MSI initialization is stored in
mgp->msi_enabled.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Request IRQ in myri10ge_open() and free in close() instead of probe()
and remove() to eliminate potential race between the watchdog and the
interrupt handler. Additionaly, the interrupt handler won't get called
on shared irq anymore when the interface is down.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Since pci_save_state() pushes MSI and PCIe states on a kind of stack,
myri10ge saving the state in advance for parity recovery will push the
state again on the stack on suspend. This leads to some memory leak.
We add a couple additional calls to save_state and restore_state so
that we don't leak anymore.
For the future, we are thinking of a better way to recover from parity
error without using pci_save_state().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix sizing of big_bytes in the case of vlan frames. The 4
VLAN_HLEN bytes were omitted, leading to sizing the big buffer
4 bytes smaller than it should be. Due to how rx buffers are
carved from pages, this was harmless for the common (9000, 1500)
byte MTUs, but could lead to data corruption for some MTUs.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Receive full vlan frames into smalls when running with a jumbo MTU.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Drop the old routines that used the physically contigous skb now
that we use the physical pages. And rename myri10ge_page_rx_done()
to myri10ge_rx_done() as it was previously.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Switch to physical page skb, by calling the new page-based
allocation routines and using myri10ge_page_rx_done().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add physical page skb allocation routines and page based rx_done,
to be used by upcoming patches.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Indentation cleanups to synchronize to our tree which is automatically
indent'ed.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In the myri10ge_submit_8rx() routine, write the 64 byte request block as
2 32-byte blocks so that it is handled by the hardware pio write handler
if write-combining is enabled.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
... into anonymous union of __wsum and __u32 (csum and csum_offset resp.)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
No need to keep defining PCI_DEVICE_ID_SERVERWORKS_HT2000_PCIE
in the driver code since it is now defined in pci_ids.h.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (217 commits)
net/ieee80211: fix more crypto-related build breakage
[PATCH] Spidernet: add ethtool -S (show statistics)
[NET] GT96100: Delete bitrotting ethernet driver
[PATCH] mv643xx_eth: restrict to 32-bit PPC_MULTIPLATFORM
[PATCH] Cirrus Logic ep93xx ethernet driver
r8169: the MMIO region of the 8167 stands behin BAR#1
e1000, ixgb: Remove pointless wrappers
[PATCH] Remove powerpc specific parts of 3c509 driver
[PATCH] s2io: Switch to pci_get_device
[PATCH] gt96100: move to pci_get_device API
[PATCH] ehea: bugfix for register access functions
[PATCH] e1000 disable device on PCI error
drivers/net/phy/fixed: #if 0 some incomplete code
drivers/net: const-ify ethtool_ops declarations
[PATCH] ethtool: allow const ethtool_ops
[PATCH] sky2: big endian
[PATCH] sky2: fiber support
[PATCH] sky2: tx pause bug fix
drivers/net: Trim trailing whitespace
[PATCH] ehea: IBM eHEA Ethernet Device Driver
...
Manually resolved conflicts in drivers/net/ixgb/ixgb_main.c and
drivers/net/sky2.c related to CHECKSUM_HW/CHECKSUM_PARTIAL changes by
commit 84fa7933a3 that just happened to be
next to unrelated changes in this update.
Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose
checksum still needs to be completed) and CHECKSUM_COMPLETE (for
incoming packets, device supplied full checksum).
Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improve the firmware selection by adding 2 cases where we should use the
optimized firmware:
* when the actual PCIe link width is lower than 8x.
* when the board is plugged to one of the new Intel PCIe chipsets that
are known to provide aligned PCIe completions.
The patch actually raises two concerns:
* We might want to add a generic PCI function to get the PCIe link width since
some other drivers (at least ipath) do the same. But we probably do not want
to add a new function for every PCIe capability. I will probably look at it
and discuss it on linux-pci in the future.
* As requested during the submission, the PCI ids of chipsets that are known to
provided aligned completion are defined in the myri10ge code. If we keep adding
new ones, it might become better to move them to pciids.h.
But, this sort of quirk to detect these chipsets are very specific to our NIC,
I don't think it is worth moving it to the PCI core until somebody else really
needs it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some recent myri10ge firmwares support multicast filtering as well
as an extended mcp_irq_data structure (64 instead of 40 bytes).
The new command MXGEFW_CMD_SET_STATS_DMA_V2 is used to check
whether the firmware support those. mgp->fw_multicast_support
is defined accordingly.
When fw_multicast_support is set, some new multicast filtering
commands is passed to the board in myri10ge_set_multicast_list().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add msg_enable and use netif_msg_link to enable/disable reporting
of link status changes since some Ethernet switches seem to generate
a lot of status changes under some circumstances and some people
want to avoid useless flooding in the logs.
Also add a counter for link status changes to statistics.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Convert the myri10ge driver to use netdev_alloc_skb instead of dev_alloc_skb,
which requires to propagate the net_device across several functions.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Dummy RDMA are always enabled on device startup since commit
9a71db721a (to work around buggy
PCIe chipsets which do not implement resending properly). But,
we also need to always re-enable them when resuming the device.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When writing the firmware to the NIC, the FIFO is 256-bytes long,
so we use 256-bytes chunks and a read to wait until the previous
write is done.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Always do a dummy RDMA after loading the firmware to work around
buggy PCIe chipsets which do not implement resending properly.
This is so cheap as to be almost free, and should never have been
conditional on the tx boundary != 4096.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Andrew Morton wrote:
> All these functions return error codes, and we're not checking them. We
> should. So there's a patch which marks all these things as __must_check,
> which causes around 1,500 new warnings.
>
The following patch fixes such a warning in myri10ge.
Check pci_enable_device() return value in myri10ge_resume().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch adds the wrapper function skb_is_gso which can be used instead
of directly testing skb_shinfo(skb)->gso_size. This makes things a little
nicer and allows us to change the primary key for indicating whether an skb
is GSO (if we ever want to do that).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the IRQ line, the tx_boundary, and whether Write-combining and MSI
are enabled to the list of parameters that are exported to ethtool.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Displaying the interface name when listing the device parameters
at the end of myri10ge_probe is not a good idea since udev might
rename the interface soon afterwards.
Print the bus id instead, using dev_info().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The workaround for the AER capability of the nVidia chipset has been
removed, we don't need this PCI id anymore. Drop it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The pm_state field in the myri10ge_priv structure is unused. Drop it.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP). So
let's merge them.
They were used to tell the protocol of a packet. This function has been
subsumed by the new gso_type field. This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb. As such it's easy to tell whether a given device can process a GSO
skb: you just have to and the gso_type field and the netdev's features
field.
I've made gso_type a conjunction. The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, they would
declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
First of all it is unnecessary to allocate a new skb in skb_pad since
the existing one is not shared. More importantly, our hard_start_xmit
interface does not allow a new skb to be allocated since that breaks
requeueing.
This patch uses pskb_expand_head to expand the existing skb and linearize
it if needed. Actually, someone should sift through every instance of
skb_pad on a non-linear skb as they do not fit the reasons why this was
originally created.
Incidentally, this fixes a minor bug when the skb is cloned (tcpdump,
TCP, etc.). As it is skb_pad will simply write over a cloned skb. Because
of the position of the write it is unlikely to cause problems but still
it's best if we don't do it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't need to restore the state right after saving it for later recovery
since commit 99dc804d9b (PCI: disable msi mode
in pci_disable_device) now prevents pci_save_state() from disabling MSI.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
We don't need to hardcode the AER capability of the nVidia CK804 chipset
anymore since commit cf34a8e07f (PCI: nVidia
quirk to make AER PCI-E extended capability visible) now makes sure that
this cap will be available to pci_find_ext_capability().
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The following patch updates the myri10ge to 1.0.0, with the following changes:
* Switch to dma_alloc_coherent API.
* Avoid PCI burst when writing the firmware on chipset with unaligned completions.
* Use ethtool_op_set_tx_hw_csum instead of ethtool_op_set_tx_csum.
* Include linux/dma-mapping.h to bring DMA_32/64BIT_MASK on all architectures
(was missing at least on alpha).
* Some typo and warning fixes.
Signed-off-by: Brice Goglin <brice@myri.com>
Signed-off-by: Andrew J. Gallatin <gallatin@myri.com>
drivers/net/myri10ge/myri10ge.c | 57 +++++++++++++++++++-----------
1 file changed, 37 insertions(+), 20 deletions(-)
Signed-off-by: Jeff Garzik <jeff@garzik.org>