linux

Commit Graph

Author	SHA1	Message	Date
Gal Pressman	8ab7e2ae15	net/mlx5e: Count LRO packets correctly RX packets statistics ('rx_packets' counter) used to count LRO packets as one, even though it contains multiple segments. This patch will increment the counter by the number of segments, and align the driver with the behavior of other drivers in the stack. Note that no information is lost in this patch due to 'rx_lro_packets' counter existence. Before, ethtool showed: $ ethtool -S ens6 \| egrep "rx_packets\|rx_lro_packets" rx_packets: 435277 rx_lro_packets: 35847 rx_packets_phy: 1935066 Now, we will see the more logical statistics: $ ethtool -S ens6 \| egrep "rx_packets\|rx_lro_packets" rx_packets: 1935066 rx_lro_packets: 35847 rx_packets_phy: 1935066 Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Gal Pressman	d3a4e4da54	net/mlx5e: Count GSO packets correctly TX packets statistics ('tx_packets' counter) used to count GSO packets as one, even though it contains multiple segments. This patch will increment the counter by the number of segments, and align the driver with the behavior of other drivers in the stack. Note that no information is lost in this patch due to 'tx_tso_packets' counter existence. Before, ethtool showed: $ ethtool -S ens6 \| egrep "tx_packets\|tx_tso_packets" tx_packets: 61340 tx_tso_packets: 60954 tx_packets_phy: 2451115 Now, we will see the more logical statistics: $ ethtool -S ens6 \| egrep "tx_packets\|tx_tso_packets" tx_packets: 2451115 tx_tso_packets: 60954 tx_packets_phy: 2451115 Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Maor Gottlieb	5f40b4ed97	net/mlx5: Increase number of max QPs in default profile With ConnectX-4 sharing SRQs from the same space as QPs, we hit a limit preventing some applications to allocate needed QPs amount. Double the size to 256K. Fixes: `e126ba97db` ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Paul Blakey	1ad9a00ae0	net/mlx5e: Avoid supporting udp tunnel port ndo for VF reps This was added to allow the TC offloading code to identify offloading encap/decap vxlan rules. The VF reps are effectively related to the same mlx5 PCI device as the PF. Since the kernel invokes the (say) delete ndo for each netdev, the FW erred on multiple vxlan dst port deletes when the port was deleted from the system. We fix that by keeping the registration to be carried out only by the PF. Since the PF serves as the uplink device, the VF reps will look up a port there and realize if they are ok to offload that. Tested: <SETUP VFS> <SETUP switchdev mode to have representors> ip link add vxlan1 type vxlan id 44 dev ens5f0 dstport 9999 ip link set vxlan1 up ip link del dev vxlan1 Fixes: `4a25730eb2` ('net/mlx5e: Add ndo_udp_tunnel_add to VF representors') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Or Gerlitz	09c91ddf2c	net/mlx5e: Use the proper UAPI values when offloading TC vlan actions Currently we use the non UAPI values and we miss erring on the modify action which is not supported, fix that. Fixes: `8b32580df1` ('net/mlx5e: Add TC vlan action for SRIOV offloads') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:13 -07:00
Roi Dayan	375f51e2b5	net/mlx5: E-Switch, Don't allow changing inline mode when flows are configured Changing the eswitch inline mode can potentially cause already configured flows not to match the policy. E.g. set policy L4, add some L4 rules, set policy to L2 --> bad! Hence we disallow it. Keep track of how many offloaded rules are now set and refuse inline mode changes if this isn't zero. Fixes: `bffaa91658` ("net/mlx5: E-Switch, Add control for inline mode") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Or Gerlitz	d85cdccbb3	net/mlx5e: Change the TC offload rule add/del code path to be per NIC or E-Switch Refactor the code to deal with add/del TC rules to have handler per NIC/E-switch offloading use case, and push the latter into the e-switch code. This provides better separation and is to be used in down-stream patch for applying a fix. Fixes: `bffaa91658` ("net/mlx5: E-Switch, Add control for inline mode") Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Or Gerlitz	1f30a86c58	net/mlx5: Add missing entries for set/query rate limit commands The switch cases for the rate limit set and query commands were missing, which could get us wrong under fw error or driver reset flow, fix that. Fixes: `1466cc5b23` ('net/mlx5: Rate limit tables support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-22 12:11:12 -07:00
Jack Morgenstein	4cbe4dac82	net/mlx4_core: Avoid delays during VF driver device shutdown Some Hypervisors detach VFs from VMs by instantly causing an FLR event to be generated for a VF. In the mlx4 case, this will cause that VF's comm channel to be disabled before the VM has an opportunity to invoke the VF device's "shutdown" method. For such Hypervisors, there is a race condition between the VF's shutdown method and its internal-error detection/reset thread. The internal-error detection/reset thread (which runs every 5 seconds) also detects a disabled comm channel. If the internal-error detection/reset flow wins the race, we still get delays (while that flow tries repeatedly to detect comm-channel recovery). The cited commit fixed the command timeout problem when the internal-error detection/reset flow loses the race. This commit avoids the unneeded delays when the internal-error detection/reset flow wins. Fixes: `d585df1c5c` ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Simon Xiao <sixiao@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-16 20:14:51 -07:00
Jiri Pirko	e9093b1183	mlxsw: reg: Fix SPVMLR max record count The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans learning not being enabled for wider ranges than 255. Fixes: `a4feea74cd` ("mlxsw: reg: Add Switch Port VLAN MAC Learning register definition") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-14 11:35:11 -07:00
Jiri Pirko	f004ec065b	mlxsw: reg: Fix SPVM max record count The num_rec field is 8 bit, so the maximal count number is 255. This fixes vlans not being enabled for wider ranges than 255. Fixes: `b2e345f9a4` ("mlxsw: reg: Add Switch Port VID and Switch Port VLAN Membership registers definitions") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-14 11:35:11 -07:00
Eugenia Emantayev	ea29bd304d	net/mlx5e: Fix loopback selftest Change packet type handler to ETH_P_IP instead of ETH_P_ALL since we are already expecting an IP packet. Also, using ETH_P_ALL will cause the loopback test packet type handler to be called on all outgoing packets, especially our own self loopback test SKB, which will be validated on xmit as well, and we don't want that. Tested with: ethtool -t ethX validated that the loopback test passes. Fixes: `0952da791c` ('net/mlx5e: Add support for loopback selftest') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Or Gerlitz	65ba8fb7d5	net/mlx5e: Avoid wrong identification of rules on deletion When deleting offloaded TC flows, we must correctly identify E-switch rules. The current check could get us wrong w.r.t to rules set on the PF. Since it's possible to set NIC rules on the PF, switch to SRIOV offloads mode and then attempt to delete a NIC rule. To solve that, we add a flags field to offloaded rules, set it on creation time and use that over the code where currently needed. Fixes: `8b32580df1` ('net/mlx5e: Add TC vlan action for SRIOV offloads') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Huy Nguyen	33e21c5952	net/mlx5e: remove IEEE/CEE mode check when setting DCBX mode Currently, the function setdcbx fails if the request dcbx mode is either IEEE or CEE. We remove the IEEE/CEE mode check because we support both IEEE and CEE interfaces. Fixes: `3a6a931dfb` ("net/mlx5e: Support DCBX CEE API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Daniel Jurgens	5d47f6c89d	net/mlx5: Don't save PCI state when PCI error is detected When a PCI error is detected the PCI state could be corrupt, don't save it in that flow. Save the state after initialization. After restoring the PCI state during slot reset save it again, restoring the state destroys the previously saved state info. Fixes: `05ac2c0b74` ('net/mlx5: Fix race between PCI error handlers and health work') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Paul Blakey	af36370569	net/mlx5: Fix create autogroup prev initializer The autogroups list is a list of non overlapping group boundaries sorted by their start index. If the autogroups list wasn't empty and an empty group slot was found at the start of the list, the new group was added to the end of the list instead of the beginning, as the prev initializer was incorrect. When this was repeated, it caused multiple groups to have overlapping boundaries. Fixed that by correctly initializing the prev pointer to the start of the list. Fixes: `eccec8da3b` ('net/mlx5: Keep autogroups list ordered') Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-10 10:03:26 -08:00
Jiri Pirko	713c43b315	mlxsw: spectrum_flower: Remove bogus warns in mlxsw_sp_flower_destroy This warnings may be hit even in case they should not - in case user puts a TC-flower rule which failed to be offloaded. So just remove them. Reported-by: Petr Machata <petrm@mellanox.com> Reported-by: Ido Schimmel <idosch@mellanox.com> Fixes: commit `7aa0f5aa90` ("mlxsw: spectrum: Implement TC flower offload") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-08 23:15:58 -08:00
Arnd Bergmann	0253f2681f	net/mlx5e: add IPV6 dependency The ethernet support now calls directly into the ipv6 core code, which fails if IPV6 is a loadable module but mlx5 is built-in: drivers/net/ethernet/mellanox/mlx5/core/en_tc.o: In function `mlx5e_create_encap_header_ipv6': en_tc.c:(.text.mlx5e_create_encap_header_ipv6+0x110): undefined reference to `ip6_route_output_flags' This adds a dependency to ensure that MLX5_CORE_EN can only be built if we are able link the kernel successfully. The downside is that the ethernet option can be hidden. Alternatively we could make MLX5_CORE depend on "IPV6 \|\| !IPV6", which would force MLX5_CORE to be a module when IPV6 is, including in configurations where we don't use the ethernet support at all. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-07 12:25:24 -08:00
Ido Schimmel	f7df4923fa	mlxsw: spectrum_router: Avoid potential packets loss When the structure of the LPM tree changes (f.e., due to the addition of a new prefix), we unbind the old tree and then bind the new one. This may result in temporary packet loss. Instead, overwrite the old binding with the new one. Fixes: `6b75c4807d` ("mlxsw: spectrum_router: Add virtual router management") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-01 09:50:58 -08:00
Eric Dumazet	47d3a07528	net/mlx4_en: fix overflow in mlx4_en_init_timestamp() The cited commit makes a great job of finding optimal shift/multiplier values assuming a 10 seconds wrap around, but forgot to change the overflow_period computation. It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms, which is silly. Lets simply use 5 seconds, no need to recompute this, given how it is supposed to work. Later, we will use a timer instead of a work queue, since the new RX allocation schem will no longer need mlx4_en_recover_from_oom() and the service_task firing every 250 ms. Fixes: `31c128b66e` ("net/mlx4_en: Choose time-stamping shift value according to HW frequency") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-26 15:39:43 -05:00
Linus Torvalds	190c3ee06a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Some 'const'ing in qlogic networking drivers, from Bhumika Goyal. 2) Fix scheduling while atomic in l2tp network namespace exit by deferring the work to the workqueue. From Ridge Kennedy. 3) Fix use after free in dccp timewait handling, from Andrey Ryabinin. 4) mlx5e CQE compression engine not initialized properly, from Tariq Toukan. 5) Some UAPI header fixes from Dmitry V. Levin. 6) Don't overwrite module parameter value in mlx4 driver, from Majd Dibbiny. 7) Fix divide by zero in xt_hashlimit netfilter module, from Alban Browaeys. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) bpf: Fix bpf_xdp_event_output net/mlx4_en: Use __skb_fill_page_desc() net/mlx4_core: Use cq quota in SRIOV when creating completion EQs net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs net/mlx4: Spoofcheck and zero MAC can't coexist net/mlx4: Change ENOTSUPP to EOPNOTSUPP uapi: fix linux/rds.h userspace compilation errors uapi: fix linux/seg6.h and linux/seg6_iptunnel.h userspace compilation errors lib: Remove string from parman config selection forcedeth: Remove return from a void function bpf: fix spelling mistake: "proccessed" -> "processed" uapi: fix linux/llc.h userspace compilation error uapi: fix linux/ip6_tunnel.h userspace compilation errors net/mlx5e: Fix wrong CQE decompression net/mlx5e: Update MPWQE stride size when modifying CQE compress state net/mlx5e: Fix broken CQE compression initialization net/mlx5e: Do not reduce LRO WQE size when not using build_skb net/mlx5e: Register/unregister vport representors on interface attach/detach net/mlx5e: s390 system compilation fix tcp: account for ts offset only if tsecr not zero ...	2017-02-23 11:36:06 -08:00
Linus Torvalds	af17fe7a63	Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYrx+WAAoJELgmozMOVy/du70P/1kpW2xY9Le04c3K7na2XOYl AUVIDrW/8Go63tpOaM7jBT3k4GlwVFr3IOmBpS24KbW/THxjhyUeP5L5+z2x+go+ jkQOgtPWWEHr5zP3MzsNyB8fDx1YQOnJwEXxybQRW/cbw4CLjnhP+ezd6FdV/3Yy pPEqDVlAErzvNweG+n2r1pjcUbR8uneC3inyMLnyzUBz4CHKmC8fgD3/qJIM+DNb gtFT5xHFIXKCigWdQ/EwsTDcHub43V8OXlI5sO7loG6vToOUATMkjI4oOUNhDmYS X7XLN3yRK9QHEfb5kutXIZEWzTGh7LiFtUYGaNNYqqzDfSiMRc9NC5kTOfplEXDV Uo+AGb6Fh1zYIOzNk7o+tazIv3LaLv6+Fcm+9bbe0VUIqasaylsePqaTwMuIzx/I xP5nitmd5lbYo8WdlasVdG6mH1DlJEUbU30v4DpmTpxCP6jGpog7lexyGyF3TgzS NhnG0IiIClWh3WQ2/GdsFK/obIdFkpLeASli1hwD81vzPfly9zc2YpgqydZI3WCr q6hTXYnANcP6+eciCpQPO7giRdXdiKey08Uoq/2jxb7Qbm4daG6UwopjvH9/lm1F m6UDaDvzNYm+Rx+bL/+KSx9JO9+fJB1L51yCmvLGpWi6yJI4ZTfanHNMBsCua46N Kev/DSpIAzX1WOBkte+a =rspQ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull Mellanox rdma updates from Doug Ledford: "Mellanox specific updates for 4.11 merge window Because the Mellanox code required being based on a net-next tree, I keept it separate from the remainder of the RDMA stack submission that is based on 4.10-rc3. This branch contains: - Various mlx4 and mlx5 fixes and minor changes - Support for adding a tag match rule to flow specs - Support for cvlan offload operation for raw ethernet QPs - A change to the core IB code to recognize raw eth capabilities and enumerate them (touches non-Mellanox code) - Implicit On-Demand Paging memory registration support" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (40 commits) IB/mlx5: Fix configuration of port capabilities IB/mlx4: Take source GID by index from HW GID table IB/mlx5: Fix blue flame buffer size calculation IB/mlx4: Remove unused variable from function declaration IB: Query ports via the core instead of direct into the driver IB: Add protocol for USNIC IB/mlx4: Support raw packet protocol IB/mlx5: Support raw packet protocol IB/core: Add raw packet protocol IB/mlx5: Add implicit MR support IB/mlx5: Expose MR cache for mlx5_ib IB/mlx5: Add null_mkey access IB/umem: Indicate that process is being terminated IB/umem: Update on demand page (ODP) support IB/core: Add implicit MR flag IB/mlx5: Support creation of a WQ with scatter FCS offload IB/mlx5: Enable QP creation with cvlan offload IB/mlx5: Enable WQ creation and modification with cvlan offload IB/mlx5: Expose vlan offloads capabilities IB/uverbs: Enable QP creation with cvlan offload ...	2017-02-23 11:27:49 -08:00
Eric Dumazet	7f0137e2ef	net/mlx4_en: Use __skb_fill_page_desc() Or we might miss the fact that a page was allocated from memory reserves. Fixes: `dceeab0e52` ("mlx4: support __GFP_MEMALLOC for rx") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Jack Morgenstein	6ed63d845e	net/mlx4_core: Use cq quota in SRIOV when creating completion EQs When creating EQs to handle CQ completion events for the PF or for VFs, we create enough EQE entries to handle completions for the max number of CQs that can use that EQ. When SRIOV is activated, the max number of CQs a VF (or the PF) can obtain is its CQ quota (determined by the Hypervisor resource tracker). Therefore, when creating an EQ, the number of EQE entries that the VF should request for that EQ is the CQ quota value (and not the total number of CQs available in the FW). Under SRIOV, the PF, also must use its CQ quota, because the resource tracker also controls how many CQs the PF can obtain. Using the FW total CQs instead of the CQ quota when creating EQs resulted wasting MTT entries, due to allocating more EQEs than were needed. Fixes: `5a0d0a6161` ("mlx4: Structures and init/teardown for VF resource quotas") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Majd Dibbiny	95f1ba9a24	net/mlx4_core: Fix VF overwrite of module param which disables DMFS on new probed PFs In the VF driver, module parameter mlx4_log_num_mgm_entry_size was mistakenly overwritten -- and in a manner which overrode the device-managed flow steering option encoded in the parameter. log_num_mgm_entry_size is a global module parameter which affects all ConnectX-3 PFs installed on that host. If a VF changes log_num_mgm_entry_size, this will affect all PFs which are probed subsequent to the change (by disabling DMFS for those PFs). Fixes: `3c439b5586` ("mlx4_core: Allow choosing flow steering mode") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:57 -05:00
Eugenia Emantayev	745d8ae462	net/mlx4: Spoofcheck and zero MAC can't coexist Spoofcheck can't be enabled if VF MAC is zero. Vice versa, can't zero MAC if spoofcheck is on. Fixes: `8f7ba3ca12` ('net/mlx4: Add set VF mac address support') Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:56 -05:00
Or Gerlitz	423b3aecf2	net/mlx4: Change ENOTSUPP to EOPNOTSUPP As ENOTSUPP is specific to NFS, change the return error value to EOPNOTSUPP in various places in the mlx4 driver. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Suggested-by: Yotam Gigi <yotamg@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:57:56 -05:00
Tariq Toukan	36154be40a	net/mlx5e: Fix wrong CQE decompression In cqe compression with striding RQ, the decompression of the CQE field wqe_counter was done with a wrong wraparound value. This caused handling cqes with a wrong pointer to wqe (rx descriptor) and creating SKBs with wrong data, pointing to wrong (and already consumed) strides/pages. The meaning of the CQE field wqe_counter in striding RQ holds the stride index instead of the WQE index. Hence, when decompressing a CQE, wqe_counter should have wrapped-around the number of strides in a single multi-packet WQE. We dropped this wrap-around mask at all in CQE decompression of striding RQ. It is not needed as in such cases the CQE compression session would break because of different value of wqe_id field, starting a new compression session. Tested: ethtool -K ethxx lro off/on ethtool --set-priv-flags ethxx rx_cqe_compress on super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D verified no csum errors and no page refcount issues. Fixes: `7219ab34f1` ("net/mlx5e: CQE compression") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:10 -05:00
Saeed Mahameed	6dc4b54e77	net/mlx5e: Update MPWQE stride size when modifying CQE compress state When the admin enables/disables cqe compression, updating mpwqe stride size is required: CQE compress ON ==> stride size = 256B CQE compress OFF ==> stride size = 64B This is already done on driver load via mlx5e_set_rq_type_params, all we need is just to call it on arbitrary admin changes of cqe compression state via priv flags or when changing timestamping state (as it is mutually exclusive with cqe compression). This bug introduces no functional damage, it only makes cqe compression occur less often, since in ConnectX4-LX CQE compression is performed only on packets smaller than stride size. Tested: ethtool --set-priv-flags ethxx rx_cqe_compress on pktgen with 64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6) verify `ethtool -S ethxx \| grep compress` are advancing more often (rapidly) Fixes: `7219ab34f1` ("net/mlx5e: CQE compression") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:10 -05:00
Tariq Toukan	b0d4660b4c	net/mlx5e: Fix broken CQE compression initialization Some of RQ type parameters are derived from CQE compression state flag, CQE compression flag was initialized only after RQ type parameters setup. This leads to load RQ with stride size smaller than what we want for when CQE compression is on. This bug introduces no functional damage, it only makes CQE compression occur less often, since in ConnectX4-LX CQE compression is performed only on packets smaller than stride size. Fix this by marking default status of CQE compression in PFLAG prior to calling mlx5e_set_rq_priv_params(), as it inits some fields based on it. Tested: load driver on systems where rx CQE compress will be on (MH) pktgen with 64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6) verify `ethtool -S ethxx \| grep compress` are advancing more often (rapidly) Fixes: `2fc4bfb725` ("net/mlx5e: Dynamic RQ type infrastructure") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Tariq Toukan	4078e637c1	net/mlx5e: Do not reduce LRO WQE size when not using build_skb When rq_type is Striding RQ, no room of SKB_RESERVE is needed as SKB allocation is not done via build_skb. Fixes: `e4b8550807` ("net/mlx5e: Slightly reduce hardware LRO size") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Saeed Mahameed	6f08a22c5f	net/mlx5e: Register/unregister vport representors on interface attach/detach Currently vport representors are added only on driver load and removed on driver unload. Apparently we forgot to handle them when we added the seamless reset flow feature. This caused to leave the representors netdevs alive and active with open HW resources on pci shutdown and on error reset flows. To overcome this we move their handling to interface attach/detach, so they would be cleaned up on shutdown and recreated on reset flows. Fixes: `26e59d8077` ("net/mlx5e: Implement mlx5e interface attach/detach callbacks") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Mohamad Haj Yahia	18bcf742fb	net/mlx5e: s390 system compilation fix Add necessary headers include for s390 arch compilation. Fixes: `e586b3b0ba` ("net/mlx5: Ethernet Datapath files") Fixes: `d605d6686d` ("net/mlx5e: Add support for ethtool self..") Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-23 10:43:09 -05:00
Eric Dumazet	3608b13ccc	mlx4: reduce OOM risk on arches with large pages Since mlx4 NIC are used on PowerPC with 64K pages, we need to adapt MLX4_EN_ALLOC_PREFER_ORDER definition. Otherwise, a fragment sitting in an out of order TCP queue can hold 0.5 Mbytes and it is a serious OOM risk. Fixes: `51151a16a6` ("mlx4: allow order-0 memory allocations in RX path") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-20 10:31:50 -05:00
Eric Dumazet	f5a5772337	mlx4: fix potential divide by 0 in mlx4_en_auto_moderation() 1) In the case where rate == priv->pkt_rate_low == priv->pkt_rate_high, mlx4_en_auto_moderation() does a divide by zero. 2) We want to properly change the moderation parameters if rx_frames was changed (like in ethtool -C eth0 rx-frames 16) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-19 18:15:23 -05:00
Eric Dumazet	01f0f42534	mlx4: do not fire tasklet unless necessary All rx and rx netdev interrupts are handled by respectively by mlx4_en_rx_irq() and mlx4_en_tx_irq() which simply schedule a NAPI. But mlx4_eq_int() also fires a tasklet to service all items that were queued via mlx4_add_cq_to_tasklet(), but this handler was not called unless user cqe was handled. This is very confusing, as "mpstat -I SCPU ..." show huge number of tasklet invocations. This patch saves this overhead, by carefully firing the tasklet directly from mlx4_add_cq_to_tasklet(), removing four atomic operations per IRQ. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-17 10:55:22 -05:00
David S. Miller	3f64116a83	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-02-16 19:34:01 -05:00
Jiri Pirko	0c921a894c	mlxsw: acl: Use PBS type for forward action Current behaviour of "mirred redirect" action (forward) offload is a bit odd. For matched packets the action forwards them to the desired destination, but it also lets the packet duplicates to go the original way down (bridge, router, etc). That is more like "mirred mirror". Fix this by using PBS type which behaves exactly like "mirred redirect". Note that PBS does not support loopback mode. Fixes: `4cda7d8d70` ("mlxsw: core: Introduce flexible actions support") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-15 13:23:20 -05:00
Eric Dumazet	99f5711e7c	mlx4: do not use rwlock in fast path Using a reader-writer lock in fast path is silly, when we can instead use RCU or a seqlock. For mlx4 hwstamp clock, a seqlock is the way to go, removing two atomic operations and false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-15 12:06:18 -05:00
Nogah Frankel	1ca6270b8d	mlxsw: spectrum: Change ipv6 unregistered mc table Point back the unregister IPv6 mc table to the bc table. It is done since IPv6 mcast snooping is not supported for Spectrum yet. Reported-by: Jiri Pirko <jiri@mellanox.com> Fixes: `71c365bdc4` ("mlxsw: spectrum: Separate bc and mc floods") Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Tested-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-14 13:15:46 -05:00
Or Gerlitz	fed06ee89b	net/mlx5e: Disable preemption when doing TC statistics upcall When called by HW offloading drivers, the TC action (e.g net/sched/act_mirred.c) code uses this_cpu logic, e.g _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets) per the kernel documention, preemption should be disabled, add that. Before the fix, when running with CONFIG_PREEMPT set, we get a BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793 asserion from the TC action (mirred) stats_update callback. Fixes: `aad7e08d39` ('net/mlx5e: Hardware offloaded flower filter statistics support') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-14 11:56:01 -05:00
Maor Gottlieb	e04a018377	net/mlx5: Consolidate flow rules regardless their flow tag Flow rules with same match criteria and value should be mapped to the same flow table entry regardless the flow tag identifier. Flow tag is part of flow table entry context and not of the destination, therefore we should return error when we try to add destination to flow table entry with different flow tag. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-14 10:21:01 -05:00
Jiri Pirko	dc371700d4	spectrum: flower: Treat ETH_P_ALL as a special case and translate for HW HW does not understand ETH_P_ALL. So treat this special case differently and translate to 0/0 key/mask. That will allow HW to match all ethertypes. Fixes: `7aa0f5aa90` ("mlxsw: spectrum: Implement TC flower offload") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 13:42:50 -05:00
Nogah Frankel	90e0f0c1b4	mlxsw: spectrum: Update mc_disabled flag by switchdev attr Add a function to update mc_disabled from switchdev attr SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:41 -05:00
Nogah Frankel	1e5d94327d	mlxsw: spectrum: Extend port_orig_get for bridge devices The function mlxsw_sp_port_orig_get returns the vport from the physical port if needed, based on the original device. This patch addresses the case where the original device is a bridge. If it is vlan unaware bridge, it returns the matching vport. If it is vlan aware bridge, there is no matching vport, and it returns the original port. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:41 -05:00
Nogah Frankel	8ecd4591e7	mlxsw: spectrum: Add an option to flood mc by mc_router_port The decision whether to flood a multicast packet to a port dependent on three flags: mc_disabled, mc_router_port, mc_flood. If mc_disabled is on, the port will be flooded according to mc_flood, otherwise, according to mc_router_port. To accomplish that, add those flags into the mlxsw_sp_port struct and update the mc flood table accordingly. Update mc_router_port by switchdev attribute SWITCHDEV_ATTR_ID_PORT_MC_ROUTER_PORT. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	71c365bdc4	mlxsw: spectrum: Separate bc and mc floods Break the bm (broadcast-multicast) into two tables, one for broadcast (and link local multicast that behaves like bc) and one for unknown multicasts. Add a bool into mlxsw_sp_port named mc_flood that reflect the value this port should have in the mc flood table (currently, always 1); Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	63fe813c60	mlxsw: spectrum: Change max vfid A user that wants many bridges will use 1.Q bridge which are scalable. One can have as many 1.Q bridges as vfids. This patch sets their number to 1k, which is a reasonably large number. This change is done here because the next patches will add a new flood table, and without it, it will increase the overall size of the flood tables dramatically. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:40 -05:00
Nogah Frankel	69be01f374	mlxsw: spectrum: Make port flood update more generic Currently, there is a per port flood update function only for the UC table. Make the function more generic by changing the table type to be an input. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:39 -05:00
Nogah Frankel	eaa7df3c5a	mlxsw: spectrum: Break flood set func to be per table Currently, the flood set function can't operate on only one table, but sets both uc_flood and mb_flood together. This patch creates a function that sets the flood state per table. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Yotam Gigi <yotamg@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-10 11:46:39 -05:00

1 2 3 4 5 ...

2359 Commits