Commit Graph

4051 Commits

Author SHA1 Message Date
David S. Miller a736e07468 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Overlapping changes in RXRPC, changing to ktime_get_seconds() whilst
adding some tracepoints.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 11:52:36 -07:00
Petr Machata 88cc318ebd mlxsw: spectrum: Expose counter for all 16 TCs
Before MC-aware mode was enabled in commit 7b81953066 ("mlxsw:
spectrum: Configure MC-aware mode on mlxsw ports"), only 8 traffic
classes were used. Under MC-aware regime, however, besides using TCs
0-7 for UC traffic, it additionally uses TCs 8-15 for BUM traffic. It
is therefore desirable to show counters for these TCs as well.

Update ethtool stats pool length, mlxsw_sp_port_get_strings() and
mlxsw_sp_port_get_stats() to include artifacts for all 16 TCs. For
consistency and simplicity, expose tc_no_buffer_discard_uc_tc for BUM
TCs as well, even though it ought to stay at 0 all the time.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:11 -07:00
Petr Machata 9897dce2e3 mlxsw: spectrum: Include RFC-2819 counters in stats length
The function mlxsw_sp_port_get_sset_count() is supposed to return the
total number of ethtool strings that mlxsw supports. Specifically for
names of statistic counters (the only string type that mlxsw supports
as of now), that number is stored in MLXSW_SP_PORT_ETHTOOL_STATS_LEN.
However, when adding RFC-2819 counters, that define wasn't updated to
include the new counters. As a result, ethtool snips out the counters
towards the end of the list, which contains per-TC counters, and only
the first three traffic classes end up being reported.

Fix by adding MLXSW_SP_PORT_HW_RFC_2819_STATS_LEN as appropriate.

Fixes: 1222d15a01 ("mlxsw: spectrum: Expose counters for various packet sizes")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
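The truncation described in 9897dce2e3 above can be illustrated with a small model. This is only a sketch: the group names and sizes below are hypothetical and do not reflect the real mlxsw counter lists. It shows how a declared length that omits one group cuts entries off the tail of the list, where the per-TC counters live.

```python
# Hypothetical counter groups; names and sizes are illustrative only.
BASE_STATS = [f"base_{i}" for i in range(4)]
RFC_2819_STATS = [f"rfc2819_{i}" for i in range(3)]
PER_TC_STATS = [f"tc_{tc}_counter" for tc in range(8)]


def reported_names(declared_len):
    """Return the names a consumer sees when it trusts declared_len."""
    all_names = BASE_STATS + RFC_2819_STATS + PER_TC_STATS
    return all_names[:declared_len]


# Buggy length: the RFC 2819 group was added to the list but not to the total.
buggy_len = len(BASE_STATS) + len(PER_TC_STATS)
# Fixed length: every group is accounted for.
fixed_len = buggy_len + len(RFC_2819_STATS)

# Counters that silently disappear with the buggy length.
missing = sorted(set(reported_names(fixed_len)) - set(reported_names(buggy_len)))
print(missing)
```

In this model the number of lost per-TC entries equals the size of the forgotten group, which is why the loss shows up at the end of the list rather than where the new counters were added.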
Jiri Pirko 9948a0641a mlxsw: Replace license text with SPDX identifiers and adjust copyrights
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
Jiri Pirko c86d62cc41 mlxsw: spectrum: Reset FW after flash
Recent FW fixes a bug and allows a newly flashed FW image to be loaded
after reset. So make sure the reset happens after flash. Indicate the
need down to the PCI layer by -EAGAIN.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
Nir Dotan a716d55e4d mlxsw: spectrum: Update the supported firmware to version 13.1702.6
This new firmware contains:
        - Support for new types of cables
        - Support for flashing future firmware without reboot
        - Support for Router ARP BC and UC traps

Signed-off-by: Nir Dotan <nird@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
Nir Dotan 903fcf734f mlxsw: spectrum_flower: Disallow usage of vlan_id key on egress
As recent spectrum FW imposes a limitation on using vlan_id key for
egress ACL, disallow the usage of that key accordingly and return a
proper extack message.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
Eli Cohen 269d26f47f net/mlx5: Reduce command polling interval
Use cond_resched() instead of usleep_range() to decrease the time
between polling attempts thus reducing overall driver load time.

Below is a comparison before and after the change, of loading eight
virtual functions.

Before:
real    0m8.785s
user    0m0.093s
sys     0m0.090s

After:
real    0m5.730s
user    0m0.097s
sys     0m0.087s

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:55 -07:00
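The before/after timings quoted in 269d26f47f above work out as follows. This is plain arithmetic on the two "real" values from the commit message; the helper name is made up for the example.

```python
def percent_reduction(before, after):
    """Relative reduction of a duration, in percent of the original."""
    return (before - after) / before * 100.0


real_before = 8.785  # seconds, "real" time before the change
real_after = 5.730   # seconds, "real" time after the change

saved_seconds = real_before - real_after
reduction = percent_reduction(real_before, real_after)
print(f"saved {saved_seconds:.3f} s ({reduction:.1f}% less wall-clock time)")
```

The user and sys times barely move between the two runs, so the gain is in time spent waiting between polls rather than in CPU work.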
Eli Cohen d1fd79f34f net/mlx5: Unexport functions that need not be exported
mlx5_query_vport_state() and mlx5_modify_vport_admin_state() are used
only from within mlx5_core, so unexport them.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:55 -07:00
Eli Cohen 29d8ebd44d net/mlx5: Remove unused mlx5_query_vport_admin_state
mlx5_query_vport_admin_state() is not used anywhere. Remove it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:55 -07:00
Eli Cohen 8e3debc08b net/mlx5: E-Switch, Remove unused argument when creating legacy FDB
Remove unused nvports argument.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:55 -07:00
Eran Ben Elisha cc9c82a866 net/mlx5: Rename modify/query_vport state related enums
Modify and query vport state commands share the same admin_state and
op_mod values, rename the enums to fit them both.

In addition, remove the esw prefix from the admin state enum as this
also applies to vnic.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:54 -07:00
Denis Drozdov 342ac8448f net/mlx5: Use max_num_eqs for calculation of required MSIX vectors
New firmware has defined new HCA capability field called "max_num_eqs",
that is the number of available EQs after subtracting reserved FW EQs.

Before this capability the FW reported the EQ number in "log_max_eqs".
The reported value also included FW reserved EQs, so the driver could
fail to load on systems with 320 CPUs, because the FW reserved EQs were
not available to the driver.

Now the driver has to obtain max_num_eqs value from new FW to get real
number of EQs available.

Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:34:54 -07:00
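The capability selection described in 342ac8448f above might look roughly like the sketch below. It is a model, not the driver code: the dictionary keys reuse the field names from the commit message, and the fallback to the log2 value when "max_num_eqs" is absent or zero is an assumption about how older firmware would be handled.

```python
def available_eqs(caps):
    """Pick the number of EQs the driver may use.

    caps is a hypothetical dict of HCA capability fields.
    """
    max_num_eqs = caps.get("max_num_eqs", 0)
    if max_num_eqs:
        # New FW: value already excludes FW reserved EQs.
        return max_num_eqs
    # Assumed fallback for older FW: log2 value that still includes reserved EQs.
    return 1 << caps["log_max_eqs"]


old_fw = {"log_max_eqs": 7}
new_fw = {"log_max_eqs": 7, "max_num_eqs": 120}
print(available_eqs(old_fw), available_eqs(new_fw))
```

The point of the new field is that the second number is one the driver can rely on when sizing its MSI-X vector request.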
Huy Nguyen f280c6a1e5 net/mlx5e: Cleanup of dcbnl related fields
Remove unused netdev_registered_init/remove in en.h
Return ENOSUPPORT if the check MLX5_DSCP_SUPPORTED fails.
Remove extra white space

Fixes: 2a5e7a1344 ("net/mlx5e: Add dcbnl dscp to priority support")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Cc: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:07:37 -07:00
Or Gerlitz 816f670623 net/mlx5e: Properly check if hairpin is possible between two functions
The current check relies on function BDF addresses and can give
wrong results, e.g. when two VFs are assigned to a VM and the PCI
v-address is set by the hypervisor.

Fixes: 5c65c564c9 ('net/mlx5e: Support offloading TC NIC hairpin flows')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Alaa Hleihel <alaa@mellanox.com>
Tested-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-08 19:07:37 -07:00
Gustavo A. R. Silva e77f02b812 net/mlx5e: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114808 ("Missing break in switch")
Addresses-Coverity-ID: 114802 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva c8581f2bb5 net/mlx4/en_rx: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114794 ("Missing break in switch")
Addresses-Coverity-ID: 114795 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Gustavo A. R. Silva 49a9776fe5 net/mlx4/mcg: Mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 114792 ("Missing break in switch")
Addresses-Coverity-ID: 114793 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-07 17:54:20 -07:00
Al Viro be1459de2e mellanox: fix the dport endianness in call of __inet6_lookup_established()
__inet6_lookup_established() expects th->dport passed in host-endian,
not net-endian.  The reason is microoptimization in __inet6_lookup(),
but if you use the lower-level helpers, you have to play by their
rules...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-06 10:30:50 -07:00
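The byte-order point in be1459de2e above can be shown outside the kernel. The sketch below reads the destination port from a raw TCP header the way a 16-bit load would see it, then converts it to host order with ntohs. The function name is hypothetical; only the Python standard library is used.

```python
import socket
import struct


def host_order_dport(tcp_header):
    """Return the TCP destination port in host byte order.

    Bytes 2..3 of a TCP header hold the destination port in network order.
    """
    # "=H" reads the two bytes as a native 16-bit value, like a raw field load.
    (dport_wire,) = struct.unpack_from("=H", tcp_header, 2)
    # ntohs() turns the network-order value into a host-order integer.
    return socket.ntohs(dport_wire)


# Build the first four header bytes: source port 12345, destination port 443.
header = struct.pack("!HH", 12345, 443)
print(host_order_dport(header))
```

On a little-endian host, skipping the ntohs step would yield a byte-swapped port number, which is the class of mistake the commit corrects.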
Petr Machata 7b81953066 mlxsw: spectrum: Configure MC-aware mode on mlxsw ports
In order to give unicast traffic precedence over BUM traffic, configure
multicast-aware mode on all ports.

Under multicast-aware regime, when assigning traffic class to a packet,
the switch doesn't merely take the value prescribed by the QTCT
register. For BUM traffic, it instead assigns that value plus 8.

ETS elements for TCs 8..15 thus need to be configured as well. Extend
mlxsw_sp_port_ets_init() so that it maps each of them to the same
subgroup as their corresponding TC from the range 0..7, such that TCs X
and X+8 map to the same subgroup.

The existing code configures TCs with strict priority. So far this was
immaterial, because each TC had its own subgroup. Now that two TCs share
a subgroup it becomes important. TCs are prioritized in order of 7, 6,
..., 0, 15, 14, ..., 8: the higher TCs used for BUM traffic end up being
deprioritized. Since that's what's needed, keep that configuration as it
is, and configure the new TCs likewise.

Finally in mlxsw_sp_port_create(), invoke configuration of QTCTM to
enable MC-aware mode on each port.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 17:28:21 -07:00
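The traffic class arithmetic described in 7b81953066 above can be sketched as follows. This is a model of the rules stated in the commit message (BUM traffic gets the QTCT value plus 8, TCs X and X+8 share a subgroup, strict priority runs 7..0 then 15..8); the function names are made up for illustration.

```python
BUM_OFFSET = 8  # BUM traffic is assigned the QTCT value plus 8


def effective_tc(qtct_tc, is_bum):
    """Traffic class actually assigned to a packet in MC-aware mode."""
    if not 0 <= qtct_tc < BUM_OFFSET:
        raise ValueError("QTCT value must be in the range 0..7")
    return qtct_tc + BUM_OFFSET if is_bum else qtct_tc


def subgroup_for_tc(tc):
    """TCs X and X+8 map to the same subgroup."""
    return tc % BUM_OFFSET


def strict_priority_order():
    """Order in which TCs are served: 7, 6, ..., 0, 15, 14, ..., 8."""
    return list(range(7, -1, -1)) + list(range(15, 7, -1))


print(effective_tc(3, is_bum=True), subgroup_for_tc(11))
print(strict_priority_order())
```

Every unicast TC comes before every BUM TC in the resulting order, which is the precedence the commit sets out to achieve.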
Petr Machata d0a07d6ada mlxsw: spectrum: Fix a typo
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 17:28:21 -07:00
Petr Machata 671ae8af05 mlxsw: reg: Add QoS Switch Traffic Class Table is Multicast-Aware Register
This register configures if the Switch Priority to Traffic Class mapping
is based on Multicast packet indication. If so, then multicast packets
will get a Traffic Class that is plus (cap_max_tclass_data/2) the value
configured by QTCT.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 17:28:21 -07:00
David S. Miller c1c8626fce Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Lots of overlapping changes, mostly trivial in nature.

The mlxsw conflict was resolved using the example
resolution at:

https://github.com/jpirko/linux_mlxsw/blob/combined_queue/drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_actions.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-05 13:04:31 -07:00
Nir Dotan caebd1b389 mlxsw: core_acl_flex_actions: Remove redundant mirror resource destruction
In the previous patch mlxsw_afa_resource_del() was added to avoid a duplicate
resource destruction scenario.
For mirror actions, such duplicate destruction leads to a crash as in:

 # tc qdisc add dev swp49 ingress
 # tc filter add dev swp49 parent ffff: \
   protocol ip chain 100 pref 10 \
   flower skip_sw dst_ip 192.168.101.1 action drop
 # tc filter add dev swp49 parent ffff: \
   protocol ip pref 10 \
   flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
   action mirred egress mirror dev swp4

Therefore add a call to mlxsw_afa_resource_del() in
mlxsw_afa_mirror_destroy() in order to clear that resource
from rule's resources.

Fixes: d0d13c1858 ("mlxsw: spectrum_acl: Add support for mirror action")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03 12:28:01 -07:00
Nir Dotan 7cc6169493 mlxsw: core_acl_flex_actions: Remove redundant counter destruction
Each tc flower rule uses a hidden count action. As counter resource may
not be available due to limited HW resources, update _counter_create()
and _counter_destroy() pair to follow previously introduced symmetric
error condition handling, add a call to mlxsw_afa_resource_del() as part
of the counter resource destruction.

Fixes: c18c1e186b ("mlxsw: core: Make counter index allocated inside the action append")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03 12:28:01 -07:00
Nir Dotan dda0a3a3fb mlxsw: core_acl_flex_actions: Remove redundant resource destruction
Some ACL actions require the allocation of a separate resource
prior to applying the action itself. When facing an error condition
during the setup phase of the action, resource should be destroyed.
For such actions the destruction was done twice, which is dangerous
and leads to a potential crash.
The destruction took place first upon error on action setup phase
and then as the rule was destroyed.

The following sequence generated a crash:

 # tc qdisc add dev swp49 ingress
 # tc filter add dev swp49 parent ffff: \
   protocol ip chain 100 pref 10 \
   flower skip_sw dst_ip 192.168.101.1 action drop
 # tc filter add dev swp49 parent ffff: \
   protocol ip pref 10 \
   flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
   action mirred egress mirror dev swp4

Therefore add mlxsw_afa_resource_del() as a complement of
mlxsw_afa_resource_add() to add symmetry to resource_list membership
handling. Call this from mlxsw_afa_fwd_entry_ref_destroy() to make the
_fwd_entry_ref_create() and _fwd_entry_ref_destroy() pair of calls a
NOP.

Fixes: 140ce42121 ("mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03 12:28:01 -07:00
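The double destruction fixed by the three core_acl_flex_actions commits above can be modelled in a few lines. This is a simplified sketch with hypothetical names, not the mlxsw data structures: a resource is put on a per-block list, the error path destroys it, and the later rule teardown walks the same list.

```python
class Block:
    """Toy model of a block that tracks resources on a list."""

    def __init__(self):
        self.resources = []
        self.destroy_calls = 0

    def resource_add(self, res):
        self.resources.append(res)

    def resource_del(self, res):
        self.resources.remove(res)

    def destroy_resource(self, res, unlink):
        if unlink:
            # Symmetric counterpart of resource_add().
            self.resource_del(res)
        self.destroy_calls += 1

    def destroy_all(self):
        # Rule teardown: destroy whatever is still on the list.
        for res in list(self.resources):
            self.destroy_resource(res, unlink=True)


def failed_setup_then_teardown(unlink_on_error):
    block = Block()
    res = object()
    block.resource_add(res)
    # Action setup fails, so the error path destroys the resource.
    block.destroy_resource(res, unlink=unlink_on_error)
    # Later the rule itself is destroyed.
    block.destroy_all()
    return block.destroy_calls


print(failed_setup_then_teardown(False), failed_setup_then_teardown(True))
```

Without the unlink step the resource stays on the list after the error path has destroyed it, so teardown destroys it a second time.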
Nir Dotan 3757b255bf mlxsw: core_acl_flex_actions: Return error for conflicting actions
Spectrum switch ACL action set is built in groups of three actions
which may point to additional actions. A group holds a single record
which can be set as goto record for pointing at a following group
or can be set to mark the termination of the lookup. This is perfectly
adequate for handling a series of actions to be executed on a packet.
While the SW model allows configuration of conflicting actions
where it is clear that some actions will never execute, the mlxsw
driver must block such configurations as it creates a conflict
over the single terminate/goto record value.

For a conflicting actions configuration such as:

 # tc filter add dev swp49 parent ffff: \
   protocol ip pref 10 \
   flower skip_sw dst_ip 192.168.101.1 \
   action goto chain 100 \
   action mirred egress mirror dev swp4

Where it is clear that the last action will never execute, the
mlxsw driver was issuing a warning instead of returning an error.
Therefore replace that warning with an error for this specific
case.

Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03 12:28:01 -07:00
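A check in the spirit of 3757b255bf above could look like the sketch below. It is an assumption-laden model: the action labels are hypothetical strings, and -EINVAL is chosen only for illustration since the commit message does not name the error code.

```python
import errno

# Hypothetical labels for actions that occupy the single goto/terminate record.
RECORD_ACTIONS = {"goto", "terminate"}


def validate_actions(actions):
    """Return 0 if the list is acceptable, a negative errno otherwise."""
    for index, action in enumerate(actions):
        is_last = index == len(actions) - 1
        if action in RECORD_ACTIONS and not is_last:
            # Anything after a goto/terminate would never execute.
            return -errno.EINVAL
    return 0


print(validate_actions(["mirror", "goto"]))
print(validate_actions(["goto", "mirror"]))
```

Returning an error instead of logging a warning means the conflicting filter from the example is rejected at configuration time.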
David S. Miller 89b1698c93 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
The BTF conflicts were simple overlapping changes.

The virtio_net conflict was an overlap of a fix of statistics counter,
happening alongside a move over to a bona fide statistics structure
rather than counting value on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02 10:55:32 -07:00
Petr Machata 6495342365 mlxsw: spectrum_router: Handle sysctl_ip_fwd_update_priority
This sysctl setting controls whether packet priority should be updated
after forwarding. Configure RGCR.usp accordingly so that the device is
in sync with the kernel handling.

Note that RGCR doesn't allow changing arbitrary parameters
mid-operation, however "usp" is exempt and can be reconfigured.

Also react to NETEVENT_IPV4_FWD_UPDATE_PRIORITY_UPDATE notifications
that signify change in this configuration.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:52:30 -07:00
Petr Machata 1f65a33fc7 mlxsw: spectrum: Extract work-scheduling into a new function
The boilerplate to schedule NETEVENT_IPV4_MPATH_HASH_UPDATE and
NETEVENT_IPV6_MPATH_HASH_UPDATE handling is almost equivalent to that of
NETEVENT_IPV4_FWD_UPDATE_PRIORITY_UPDATE that's coming in the next
patch. The only difference is which actual worker function should be
called. Extract this boilerplate into a named function in order to allow
reuse.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:52:30 -07:00
Gustavo A. R. Silva 96d395020e net/mlx5e: Fix uninitialized variable
There is a potential execution path in which variable *err* is returned
without being properly initialized previously.

Fix this by initializing variable *err* to 0.

Addresses-Coverity-ID: 1472116 ("Uninitialized scalar variable")
Fixes: 0ec13877ce ("net/mlx5e: Gather all XDP pre-requisite checks in a single function")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01 09:38:05 -07:00
Feras Daoud 8e1d162d8e net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow
After introduction of the cited commit, mlx5e_build_nic_params
receives the netdevice mtu in order to set the sw_mtu of mlx5e_params.
For enhanced IPoIB, the netdevice mtu is not set in this stage,
therefore, the initial sw_mtu equals zero. As a result, the hw_mtu
of the receive queue will be calculated incorrectly causing traffic
issues.

To fix this issue, query for port mtu before building the nic params.

Fixes: 472a1e44b3 ("net/mlx5e: Save MTU in channels params")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:45 -07:00
Adi Nissim eacecf2760 net/mlx5e: Fix null pointer access when setting MTU of vport representor
MTU helper function is used by both conventional mlx5e
instances (PF/VF) and the eswitch representors. The representor
shouldn't change the nic vport context MTU, the VF is responsible for
that. Therefore set_mtu_cb has a null value when changing the
representor MTU.

Fixes: 250a42b6a7 ("net/mlx5e: Support configurable MTU for vport representors")
Signed-off-by: Adi Nissim <adin@mellanox.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:44 -07:00
Or Gerlitz 2e8e70d249 net/mlx5e: Set port trust mode to PCP as default
The hairpin offload code has dependency on the trust mode being PCP.

Hence we should set PCP as the default for handling cases where we are
disallowed to read the trust mode from the FW, or failed to initialize it.

Fixes: 106be53b6b ('net/mlx5e: Set per priority hairpin pairs')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:43 -07:00
Eli Cohen 5f5991f36d net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager
Execute mlx5_eswitch_init() only if we have MLX5_ESWITCH_MANAGER
capabilities.
Do the same for mlx5_eswitch_cleanup().

Fixes: a9f7705ffd ("net/mlx5: Unify vport manager capability check")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-31 12:58:43 -07:00
Jesper Dangaard Brouer 39c64d8c87 mlx5: handle DMA mapping error case for XDP redirect
Commit 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
forgot to return/free the xdp_frame in case the DMA mapping failed, correct this.

Also DMA unmap the frame in case mlx5e_xmit_xdp_frame() fails.

Fixes: 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 09:44:41 -07:00
David S. Miller ebe023a424 mlx5e-updates-2018-07-27 (Vxlan updates)
Merge tag 'mlx5e-updates-2018-07-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-07-27 (Vxlan updates)

This series from Gal and Saeed provides updates to mlx5 vxlan implementation.

Gal started with three cleanups to reflect the actual hardware vxlan state
- reflect 4789 UDP port default addition to software database
- check maximum number of vxlan UDP ports
- cleanup an unused member in vxlan work

Then Gal provides performance optimization by replacing the
vxlan radix tree with a hash table.

Measuring mlx5e_vxlan_lookup_port execution time:

                      Radix Tree   Hash Table
     --------------- ------------ ------------
      Single Stream   161 ns       79  ns (51% improvement)
      Multi Stream    259 ns       136 ns (47% improvement)

    Measuring UDP stream packet rate, single fully utilized TX core:
    Radix Tree: 498,300 PPS
    Hash Table: 555,468 PPS (11% improvement)

Next, from Saeed, vxlan refactoring to allow sharing the vxlan table
between different mlx5 netdevice instances like PF and VF representors.
This is done by making the mlx5 vxlan interface more generic and decoupling
it from PF netdevice structures and logic, then moving it into mlx5 core
as a low level interface so it can be used by VF representors, which is
illustrated in the last patch of the series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-29 13:08:42 -07:00
Saeed Mahameed a3e673660b net/mlx5e: Issue direct lookup on vxlan ports by vport representors
Remove uplink representor netdevice private structure lookup, and use
mlx5 core handle directly from representor private structure to lookup
vxlan ports.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:13 -07:00
Saeed Mahameed 358aa5ce28 net/mlx5e: Vxlan, move vxlan logic to core driver
Move vxlan logic and objects to the mlx5 core driver, since they are
going to be used from different mlx5 interfaces,
e.g. mlx5e PF NIC netdev and mlx5e E-Switch representors.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:13 -07:00
Saeed Mahameed aec4eab9af net/mlx5e: Vxlan, add sync lock for add/del vxlan port
Vxlan API can and will be called from different mlx5 modules, we should
not count on mlx5e private state lock only, hence we introduce a vxlan
private mutex to sync between add/del vxlan port operations.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:46:11 -07:00
Saeed Mahameed 1b318a92f3 net/mlx5e: Vxlan, return values for add/del port
For a better API mlx5_vxlan_{add/del}_port can fail, make them return
error values.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:43:06 -07:00
Saeed Mahameed a3c785d73c net/mlx5e: Vxlan, rename from mlx5e to mlx5
Rename vxlan functions from mlx5e_vxlan_* to mlx5_vxlan_*.
Rename mlx5e_vxlan_db to mlx5_vxlan and move it from en.h to vxlan.c
since it is not related to mlx5e anymore.

Allocate mlx5_vxlan structure dynamically in order to make it easier to
move later to core driver and to make it private in vxlan.c.

This is in preparation to move vxlan API to mlx5 core.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:43:04 -07:00
Saeed Mahameed 5006eb221e net/mlx5e: Vxlan, rename struct mlx5e_vxlan to mlx5_vxlan_port
The name mlx5e_vxlan will be used in downstream patch to describe
mlx5 vxlan structure that will replace mlx5e_vxlan_db.

Hence we rename struct mlx5e_vxlan to mlx5_vxlan_port which describes a
mlx5 vxlan port.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:36:51 -07:00
Saeed Mahameed dccea6bf38 net/mlx5e: Vxlan, move netdev only logic to en_main.c
Create a direct vxlan API to add and delete vxlan ports from HW.
+void mlx5e_vxlan_add_port(struct mlx5e_priv *priv, u16 port);
+void mlx5e_vxlan_del_port(struct mlx5e_priv *priv, u16 port);

And move vxlan_add/del_work to en_main.c since they are netdev only
logic.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:35:16 -07:00
Saeed Mahameed 0f647bfcd0 net/mlx5e: Vxlan, add direct delete function
Add direct vxlan delete function to be called from vxlan_delete_work.
Needed in downstream patch.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
2018-07-27 15:35:14 -07:00
Gal Pressman 278d7f3dc0 net/mlx5e: Vxlan, cleanup an unused member in vxlan work
Clean up the sa_family member of the vxlan work; it is not used or
needed anywhere in the code.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 15:30:32 -07:00
Gal Pressman d30d8cde19 net/mlx5e: Vxlan, replace ports radix-tree with hash table
The VXLAN database is accessed in the data path for each VXLAN TX skb in
order to check whether the UDP port is being offloaded or not.
The number of elements in the database is relatively small, so we can
simplify the radix-tree to a hash table and speed up the lookup process.

Measuring mlx5e_vxlan_lookup_port execution time:

                  Radix Tree   Hash Table
 --------------- ------------ ------------
  Single Stream   161 ns       79  ns (51% improvement)
  Multi Stream    259 ns       136 ns (47% improvement)

Measuring UDP stream packet rate, single fully utilized TX core:
Radix Tree: 498,300 PPS
Hash Table: 555,468 PPS (11% improvement)

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
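The improvement figures quoted in d30d8cde19 above can be checked with a few lines of arithmetic. The helper below is written for this example only; lookup time is better when lower, packet rate when higher.

```python
def improvement(old, new, higher_is_better=False):
    """Relative improvement over the old value, in percent."""
    change = (new - old) if higher_is_better else (old - new)
    return change / old * 100.0


single_stream = improvement(161, 79)    # lookup time in ns
multi_stream = improvement(259, 136)    # lookup time in ns
packet_rate = improvement(498_300, 555_468, higher_is_better=True)  # PPS

print(f"{single_stream:.1f}% {multi_stream:.1f}% {packet_rate:.1f}%")
```

Rounded to whole percents these match the 51%, 47% and 11% stated in the commit message.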
Gal Pressman 22a65aa8b1 net/mlx5e: Vxlan, check maximum number of UDP ports
The NIC has a limited number of offloaded VXLAN UDP ports (usually 4).
Instead of letting the firmware fail when trying to add more ports than
it can handle, let the driver check it on its own.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
Gal Pressman a082c4f4f0 net/mlx5e: Vxlan, reflect 4789 UDP port default addition to software database
The hardware offloads 4789 UDP port (default VXLAN port) automatically.
Add it to the software database as well in order to reflect the hardware
state appropriately.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-07-27 13:56:44 -07:00
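The two commits above (22a65aa8b1 and a082c4f4f0) describe a port database that starts with the default port and refuses additions beyond the hardware limit. A minimal sketch under stated assumptions: the class and method names are hypothetical, the error codes are illustrative, and counting the default port against the limit is an assumption the commit messages do not confirm.

```python
import errno
import threading

DEFAULT_VXLAN_PORT = 4789  # offloaded by the hardware automatically


class VxlanPortTable:
    """Toy software database of offloaded VXLAN UDP ports."""

    def __init__(self, max_ports=4):
        self._lock = threading.Lock()
        self._max_ports = max_ports
        # Mirror the hardware state: the default port is present from the start.
        self._ports = {DEFAULT_VXLAN_PORT}

    def add_port(self, port):
        with self._lock:
            if port in self._ports:
                return 0  # assumption: re-adding a known port is harmless
            if len(self._ports) >= self._max_ports:
                return -errno.ENOSPC  # checked in software, before asking FW
            self._ports.add(port)
            return 0

    def del_port(self, port):
        with self._lock:
            if port not in self._ports:
                return -errno.ENOENT
            self._ports.remove(port)
            return 0

    def lookup(self, port):
        # Hash-based membership test.
        return port in self._ports


table = VxlanPortTable(max_ports=4)
results = [table.add_port(p) for p in (4790, 4791, 4792, 4793)]
print(results, table.lookup(DEFAULT_VXLAN_PORT))
```

The lock and the integer return values correspond to the later patches in the same series that add a private mutex and make add/del report errors.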
Petr Machata b2b1dab688 mlxsw: spectrum: Support ieee_setapp, ieee_delapp
The APP TLVs are used for communicating priority-to-protocol ID maps for
a given netdevice. Support the following APP TLVs:

- DSCP (selector 5) to configure priority-to-DSCP code point maps. Use
  these maps to configure packet priority on ingress, and DSCP code
  point rewrite on egress.

- Default priority (selector 1, PID 0) to configure priority for the
  DSCP code points that don't have one assigned by the DSCP selector. In
  future this could also be used for assigning default port priority
  when a packet arrives without DSCP tagging.

Besides setting up the maps themselves, also configure port trust level
and rewrite bits.

Port trust level determines whether, for a packet arriving through a
certain port, the priority should be determined based on PCP or DSCP
header fields. So far, mlxsw kept the device default of trust-PCP. Now,
as soon as the first DSCP APP TLV is configured, switch to trust-DSCP.
Only when all DSCP APP TLVs are removed, switch back to trust-PCP again.
Note that the default priority APP TLV doesn't impact the trust level
configuration.

Rewrite bits determine whether DSCP and PCP fields of egressing packets
should be updated according to switch priority. When port trust is
switched to DSCP, enable rewrite of DSCP field.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-27 13:17:50 -07:00
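The trust level rule described in b2b1dab688 above can be condensed into a small sketch. The tuple layout and function names are hypothetical; the selector values (5 for DSCP, 1 with PID 0 for default priority) come from the commit message.

```python
DSCP_SELECTOR = 5       # APP TLV selector for DSCP entries
DEFAULT_PRIO_ENTRY = (1, 0)  # selector 1, PID 0: default priority


def trust_level(app_entries):
    """Derive the port trust level from the configured APP entries.

    app_entries is an iterable of (selector, protocol_id, priority) tuples.
    """
    has_dscp = any(selector == DSCP_SELECTOR for selector, _pid, _prio in app_entries)
    # Default priority entries never influence the trust level.
    return "trust-DSCP" if has_dscp else "trust-PCP"


def rewrite_dscp_enabled(app_entries):
    """DSCP rewrite is enabled when the port trusts DSCP."""
    return trust_level(app_entries) == "trust-DSCP"


entries = [(1, 0, 3)]                 # only a default priority entry
print(trust_level(entries))
entries.append((5, 10, 2))            # first DSCP entry: DSCP 10 -> priority 2
print(trust_level(entries), rewrite_dscp_enabled(entries))
```

Removing the last DSCP entry from the list brings the result back to trust-PCP, matching the behaviour the commit describes.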