Commit Graph

26 Commits

Author SHA1 Message Date
Amir Vadai c8c40b7f32 net/mlx4_en: loopbacked packets are dropped when SMAC=DMAC
Should NOT check SMAC=DMAC when:
1. loopback is turned on
2. validate_loopback is true.

Fixed it accordingly.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-08-03 16:49:02 -07:00
Thadeu Lima de Souza Cascardo 4cce66cdd1 mlx4_en: map entire pages to increase throughput
In its receive path, mlx4_en driver maps each page chunk that it pushes
to the hardware and unmaps it when pushing it up the stack. This limits
throughput to about 3Gbps on a Power7 8-core machine.

One solution is to map the entire allocated page at once. However, this
requires that we keep track of every page fragment we give to a
descriptor. We also need to work with the discipline that all fragments will
be released (in the sense that it will not be reused by the driver
anymore) in the order they are allocated to the driver.

This requires that we don't reuse any fragments, every single one of
them must be reallocated. We do that by releasing all the fragments that
are processed and only after finished processing the descriptors, we
start the refill.

We also must somehow guarantee that we either refill all fragments in a
descriptor or none at all, without resorting to giving up a page
fragment that we would have already given. Otherwise, we would break the
discipline of only releasing the fragments in the order they were
allocated.

This has passed page allocation fault injections (restricted to the
driver by using required-start and required-end) and device hotplug
while 16 TCP streams were able to deliver more than 9Gbps.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-19 10:53:13 -07:00
Amir Vadai 1eb8c695bd net/mlx4_en: Add accelerated RFS support
Use RFS infrastructure and flow steering in HW to keep CPU
affinity of rx interrupts and application per TCP stream.

A flow steering filter is added to the HW whenever the RFS
ndo callback is invoked by core networking code.

Because the invocation takes place in interrupt context, the
actual setup of HW is done using workqueue. Whenever new filter
is added, the driver checks for expiry of existing filters.

Since there's window in time between the point where the core
RFS code invoked the ndo callback, to the point where the HW
is configured from the workqueue context, the 2nd, 3rd etc
packets from that stream will cause the net core to invoke
the callback again and again.

To prevent inefficient/double configuration of the HW, the filters
are kept in a database which is indexed using hash function to enable
fast access.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-19 08:34:37 -07:00
Hadar Hen Zion cabdc8ee37 net/mlx4_en: Add support for drop action through ethtool
The drop action is implemented by allocating a QP and keeping it in a reset state
such that the HW drops any packets which are steered to that QP. When a drop action
is requested, we attach the relevant flow to that QP.

Sign-off-by: Hadar Hen Zion <hadarh@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-07 16:23:06 -07:00
Amir Vadai 0e98b523c4 net/mlx4_en: Force user priority by QP attribute
Instead of relying on HW to change schedule queue by UP, schedule
queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect.  This
resolves two issues with untagged traffic:
1. untagged traffic has no UP in packet which is needed for QoS. The change
   above allows setting the schedule queue (and by that the UP) of such a stream.
2. BlueFlame uses the same field used by vlan tag. So forcing UP from QPC
   allows using BF for untagged but prioritized traffic.

In old firmware that force UP is not supported, untagged traffic will not subject to
QoS.

Because UP is set by QP, need to always have a tx ring per UP, even if pfcrx
module paramter is false.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05 05:08:03 -04:00
Yevgeny Petrilin 39b2c4ebb4 net/mlx4: fix sparse warnings on wrong type for RSS keys
The keys used for the hardware RSS topelitz hash are of type __be32
where the values provided by the driver are from array of u32,
this triggered sparse warning on incorrect type in assignment as of different base types.
Since these values are picked randomly,
the fix is to transform the key to __be32 by executing cpu_to_be_32()

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-06 15:19:17 -05:00
Yevgeny Petrilin ebf8c9aa03 net/mlx4_en: Saving mem access on data path
Localized the pdev->dev, and using dma_map instead of pci_map
There are multiple map/unmap operations on data path,
optimizing those by saving redundant pointer access.
Those places were identified as hot-spots when running kernel profiling
during some benchmarks.
The fixes had most impact when testing packet rate with small packets,
reducing several % from CPU load, and in some case being the difference
between reaching wire speed or being CPU bound.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-03-06 15:19:17 -05:00
David S. Miller d5ef8a4d87 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/infiniband/hw/nes/nes_cm.c

Simple whitespace conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-10 23:32:28 -05:00
Thadeu Lima de Souza Cascardo 7e2eb99cc6 mlx4: fix DMA mapping leak when allocation fails
mlx4_en_prepare_rx_desc does not correctly clean up after it finds an
allocation failure. It should unmap a page before calling put_page, but
it only calls the later.

This bug would prevent a device removal using hotplug after setting the
device MTU to 9000 and opening the network interface. After the fix, we
still see the allocation failure with MTU 9000, but we are able to
remove the device.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-06 14:42:28 -05:00
Thadeu Lima de Souza Cascardo 68355f7113 mlx4: allow device removal by fixing dma unmap size
After opening the network interface, Mellanox ConnectX device cannot be
removed by hotplug because it has not properly unmapped all DMA memory.

It happens that mlx4_en_activate_rx_rings overrides the variable that
keeps the size of the memory mapped.

This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is
given to mlx4_en_create_rx_ring.

After applying this patch, hot unplugging the device works after opening
the interface.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-06 14:42:28 -05:00
Pradeep A Dalvi c056b734e5 netdev: ethernet dev_alloc_skb to netdev_alloc_skb
Replaced deprecating dev_alloc_skb with netdev_alloc_skb in drivers/net/ethernet
  - Removed extra skb->dev = dev after netdev_alloc_skb

Signed-off-by: Pradeep A Dalvi <netdev@pradeepdalvi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-06 11:52:27 -05:00
Joe Perches e404decb0f drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages
alloc failures use dump_stack so emitting an additional
out-of-memory message is an unnecessary duplication.

Remove the allocation failure messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-31 16:20:21 -05:00
Yevgeny Petrilin 93d3e3678f mlx4_en: set number of rx rings used by RSS using ethtool
Value must be a power of 2 due to HW limitation.
Driver supports only 'equal' mode in ethtool and can't be set by using weights.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-01-18 16:07:34 -05:00
Yevgeny Petrilin 89efea25cd mlx4_en: FIX: Setting default_qpn before using it
When UDP RSS is enabled, we use same QPN for TCP and UDP ranges
The bug is that the default_qpn was used base UDP qpn before it
was set.
Fixes bug introduced in commit: 1202d460b1

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-20 13:31:36 -05:00
Eugenia Emantayev 5b4c4d3686 mlx4_en: Allow communication between functions on same host
To enable internal loopback, always fill DMAC in control segment
when transmitting the packet, once this is done, the packet is subject
for loopback for if the DMAC mathces one of the multicast/unicast addresses
registered on the physical port.
In receive path if source MAC is our own MAC and we are not in selftest,
or not in force LB mode - drop this packet.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13 13:56:07 -05:00
Or Gerlitz 1202d460b1 net/mlx4: fix UDP RSS related settings
Using RSS which takes into account UDP headers is controlled by
a module param, fix the setting of the HW RSS context to align
with that scheme. So far it was uncoditionally allowing hashing
on the UDP headers.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27 17:17:03 -05:00
Or Gerlitz 876f6e67d1 net/mlx4: move RSS related definitions to be global
Towards adding RSS support for IB drivers/application who use
the mlx4 HW, make the RSS related definitions global and change
the mlx4_en driver to use them.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-27 17:17:03 -05:00
Yevgeny Petrilin 4a5f4dd859 mlx4_en: Remove FCS bytes from packet length.
When HW doesn't remove FCS bytes they are reported in the completion
byte count, we don't need to take them to skb.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-14 14:25:36 -05:00
Ian Campbell 311761c8a5 mlx4: convert to SKB paged frag API.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-21 02:52:52 -04:00
Eric Dumazet 90278c9ffb mlx4_en: fix skb truesize underestimation
skb->truesize must account for allocated memory, not the used part of
it. Doing this work is important to avoid unexpected OOM situations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-20 04:55:27 -04:00
Yevgeny Petrilin ad86107f7b mlx4_en: Adding rxhash support
Moving to Toeplitz function in RSS calculation.
Reporting rxhash in skb.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:27 -04:00
Yevgeny Petrilin 3b61008d88 mlx4_en: Recording rx queue for gro packets
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:27 -04:00
Yevgeny Petrilin ad04378cec mlx4_en: Checksum counters per ring
Not updating common counters from data path.
The checksum counters are per ring, summarizing them when collecting statistics.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:27 -04:00
Yevgeny Petrilin f3a9d1f25d mlx4_en: Controlling FCS header removal
Canceling FCS removal where FW allows for better alignment
of incoming data.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:42:26 -04:00
Eric Dumazet 9e903e0852 net: add skb frag size accessors
To ease skb->truesize sanitization, its better to be able to localize
all references to skb frags size.

Define accessors : skb_frag_size() to fetch frag size, and
skb_frag_size_{set|add|sub}() to manipulate it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-10-19 03:10:46 -04:00
Jeff Kirsher 5a2cc190eb mlx4: Move the Mellanox driver
Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and
make the necessary Kconfig and Makefile changes.

CC: Roland Dreier <roland@kernel.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-08-11 02:41:35 -07:00