Commit Graph

1859 Commits

Author SHA1 Message Date
Ido Schimmel 3dc266896d mlxsw: reg: Add Router Interface Table Register
Add the Router Interface Table Register (RITR), which allows us to
create and configure router interfaces (RIFs).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:18 -04:00
Ido Schimmel d82d8c060f mlxsw: reg: Add FDB action to forward to router
Incoming packets are directed to the router when they match an FDB
entry with action forward to IP router.

Add this action, which was mistakenly named "TRAP".

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel fa3054f5a8 mlxsw: spectrum: Add router interface struct
When enabling the router in the device we will represent L3 netdevs
using router interfaces (RIFs). These will be specified whenever
programming routes or neighbours on the netdev.

Introduce the basic RIF infrastructure which allows one to lookup a RIF
by its netdev. Later patches in the series will extend this, but the
basic routines are needed now in order to direct traffic to CPU.

Pointers to the RIF structs are stored in an array indexed by the RIF's
number. This will allow us to efficiently update the kernel's neighbour
table when regularly dumping the device's table.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 464dce1884 mlxsw: spectrum_router: Add basic ipv4 router initialization
Create a skeleton router file and do basic HW initialization of router.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel bbf2a4757b mlxsw: spectrum: Initialize ports at the end of init sequence
During ports initialization a net device is registered for each
available port, which implies the port is usable. However, a port is
only usable after the different parts of the device (e.g. flooding,
buffers) are initialized. This is especially important now, when we must
initialize the router before the ports, as otherwise the device can't be
initialized.

Solve that by initializing the switch ports at the end of init sequence.

Also, remove an unnecessary warning about port up/down events, which
would otherwise be invoked whenever removing the driver, as ports are
removed before unregistering the listener for these events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 69c407aaf9 mlxsw: reg: Add Router General Configuration Register
Add the Router General Configuration Register (RGCR), which allows us to
enable the router in the device and configure its various parameters.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 11943ff442 mlxsw: spectrum: Remove RIF from PVID vPort when joining / leaving LAG
We are going to assign router interfaces (RIFs) to netdevs if an IPv4
address was assigned to them. If one was assigned to a port netdev, this
will translate to the PVID vPort being member in a RIF.

While it's possible for a LAG slave to have an IP address, we can't have
a vPort being member in two FIDs (assuming the LAG device will be
put in bridge / assigned an IP address).

Solve that by making the PVID vPort leave any FID it might be a member
in when joining / leaving LAG.

Note that the PVID vPort is the only vPort that can be present on the
port when it's put under LAG.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 86bf95b334 mlxsw: spectrum: Sync PVID vPort LAG status
When VLAN devices are created on top of LAG, their underlying vPorts are
configured correctly with LAG membership.

However, the PVID vPort is implicit and already present when the port
netdev is put under LAG, so its LAG membership is never set. Set it
correctly when joining / leaving LAG.

This didn't matter until now, but we are going to introduce support for
router interfaces (RIFs), which need to take into account LAG membership.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 32d863fb93 mlxsw: spectrum: Remove VLANs configuration via SELF flag
When port isn't bridged it is still possible to invoke switchdev ops and
configure the device's VLAN filters.

However, this will require us to use different Router InterFaces (RIFs)
for the same netdev, instead of one per-netdev as with any other
configuration.

Taking the above into account and the fact that this functionality is
questionable with regards to the device's normal use-case, remove it and
instead return an error.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Ido Schimmel 52697a9ede mlxsw: spectrum: Send untagged packets through a port netdev
Port netdevs (e.g. swXpY) that are not bridged are represented in the
device using a vPort with VID=PVID=1 (the PVID vPort), as untagged
packets entering the switch are internally tagged with the PVID VLAN.
When these packets are routed through a different port netdev they
should egress untagged.

This wasn't a problem until now, as non-bridged traffic only originated
from the CPU, which transmits packets out of the port as-is.

When a vPort is created with VID 1 mark it as egress untagged.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 15:21:17 -04:00
Hadar Hen Zion cb67b83292 net/mlx5e: Introduce SRIOV VF representors
Implement the relevant profile functions to create mlx5e driver instance
serving as VF representor. When SRIOV offloads mode is enabled, each VF
will have a representor netdevice instance on the host.

To do that, we also export set of shared service functions from en_main.c,
such that they can be used by both NIC and repsresentors netdevs.

The newly created representor netdevice has a basic set of net_device_ops
which are the same ndo functions as the NIC netdevice and an ndo of it's
own for phys port name.

The profiling infrastructure allow sharing code between the NIC and the
vport representor even though the representor has only a subset of the
NIC functionality.

The VF reps and the PF which is used in that mode to represent the uplink,
expose switchdev ops. Currently the only op supposed is attr get for the
port parent ID which here serves to identify net-devices belonging to the
same HW E-Switch. Other than that, no offloading is implemented and hence
switching functionality is achieved if one sets SW switching rules, e.g
using tc, bridge or ovs.

Port phys name (ndo_get_phys_port_name) is implemented to allow exporting
to user-space the VF vport number and along with the switchdev port parent
id (phys_switch_id) enable a udev base consistent naming scheme:

SUBSYSTEM=="net", ACTION=="add", ATTR{phys_switch_id}=="<phys_switch_id>", \
        ATTR{phys_port_name}!="", NAME="$PF_NIC$attr{phys_port_name}"

where phys_switch_id is exposed by the PF (and VF reps) and $PF_NIC is
the name of the PF netdevice.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:41 -04:00
Hadar Hen Zion 127ea380ac net/mlx5: Add Representors registration API
Introduce E-Switch registration/unregister representors functions.

Those functions are called by the mlx5e driver when the PF NIC is
created upon pci probe action regardless of the E-Switch mode (NONE,
LEGACY or OFFLOADS).

Adding basic E-Switch database that will hold the vport represntors
upon creation.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:41 -04:00
Hadar Hen Zion 6bfd390ba5 net/mlx5e: Add support for multiple profiles
To allow support in representor netdevices where we create more than one
netdevice per NIC, add profiles to the mlx5e driver. The profiling
allows for creation of mlx5e instances with different characteristics.

Each profile implements its own behavior using set of function pointers
defined in struct mlx5e_profile. This is done to allow for avoiding complex
per profix branching in the code.

Currently only the profile for the conventional NIC is implemented,
which is of use when a netdev is created upon pci probe.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:41 -04:00
Hadar Hen Zion 398f33511e net/mlx5e: Mark enabled RQTs instances explicitly
In the current driver implementation two types of receive queue
tables (RQTs) are in use - direct and indirect.

Change the driver to mark each new created RQT (direct or indirect)
as "enabled". This behaviour is needed for introducing new mlx5e
instances which serve to represent SRIOV VFs.

The VF representors will have only one type of RQTs (direct).

An "enabled" flag is added to each RQT to allow better handling
and code sharing between the representors and the nic netdevices.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Hadar Hen Zion 724b2aa151 net/mlx5e: TIRs management refactoring
The current refresh tirs self loopback mechanism, refreshes all the tirs
belonging to the same mlx5e instance to prevent self loopback by packets
sent over any ring of that instance. This mechanism relies on all the
tirs/tises of an instance to be created with the same transport domain
number (tdn).

Change the driver to refresh all the tirs created under the same tdn
regardless of which mlx5e netdev instance they belong to.

This behaviour is needed for introducing new mlx5e instances which serve
to represent SRIOV VFs. The representors and the PF share vport used for
E-Switch management, and we want to avoid NIC level HW loopback between
them, e.g when sending broadcast packets. To achieve that, both the
representors and the PF NIC will share the tdn.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Hadar Hen Zion b50d292b43 net/mlx5e: Create NIC global resources only once
To allow creating more than one netdev over the same PCI function, we
change the driver such that global NIC resources are created once and
later be shared amongst all the mlx5e netdevs running over that port.

Move the CQ UAR, PD (pdn), Transport Domain (tdn), MKey resources from
being kept in the mlx5e priv part to a new resources structure
(mlx5e_resources) placed under the mlx5_core device.

This patch doesn't add any new functionality.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz c930a3ad74 net/mlx5e: Add devlink based SRIOV mode changes
Implement handlers for the devlink commands to get and set the SRIOV
E-Switch mode.

When turning to the switchdev/offloads mode, we disable the e-switch
and enable it again in the new mode, create the NIC offloads table
and create VF reps.

When turning to legacy mode, we remove the VF reps and the offloads
table, and re-initiate the e-switch in it's legacy mode.

The actual creation/removal of the VF reps is done in downstream patches.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz feae908744 net/mlx5: Add devlink interface
The devlink interface is initially used to set/get the mode of the SRIOV e-switch.

Currently, these are only stubs for get/set, down-stream patch will actually
fill them out.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz fed9ce22bf net/mlx5: E-Switch, Add API to create vport rx rules
Add the API to create vport rx rules of the form

	packet meta-data :: vport == $VPORT --> $TIR

where the TIR is opened by this VF representor.

This logic will by used for packets that didn't match any rule in the
e-switch datapath and should be received into the host OS through the
netdevice that represents the VF they were sent from.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz c116c6eec6 net/mlx5: E-Switch, Add offloads table
Belongs to the NIC offloads name-space, and to be used as part of the
SRIOV offloads logic to steer packets that hit the e-switch miss rule
to the TIR of the relevant VF representor.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:40 -04:00
Or Gerlitz acbc2004d7 net/mlx5: Introduce offloads steering namespace
Add a new namespace (MLX5_FLOW_NAMESPACE_OFFLOADS) to be populated
with flow steering rules that deal with rules that have have to
be executed before the EN NIC steering rules are matched.

The namespace is located after the bypass name-space and before the
kernel name-space. Therefore, it precedes the HW processing done for
rules set for the kernel NIC name-space.

Under SRIOV, it would allow us to match on e-switch missed packet
and forward them to the relevant VF representor TIR.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Or Gerlitz ab22be9ba3 net/mlx5: E-Switch, Add API to create send-to-vport rules
Add the API to create send-to-vport e-switch rules of the form

 packet meta-data :: send-queue-number == $SQN and source-vport == 0 --> $VPORT

These rules are to be used for a send-to-vport logic which conceptually bypasses
the "normal" steering rules currently present at the e-switch datapath.

Such rule should apply only for packets that originate in the e-switch manager
vport (0) and are sent for a given SQN which is used by a given VF representor
device, and hence the matching logic.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Or Gerlitz 3aa335724f net/mlx5: E-Switch, Add miss rule for offloads mode
In the sriov offloads mode, packets that are not matched by any other
rule should be sent towards the e-switch manager for further processing.

Add such "miss" rule which matches ANY packet as the last rule in the
e-switch FDB and programs the HW to send the packet to vport 0 where
the e-switch manager runs.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Or Gerlitz 69697b6e20 net/mlx5: E-Switch, Add support for the sriov offloads mode
Unlike the legacy mode, here, forwarding rules are not learned by the
driver per events on macs set by VFs/VMs into their vports, but rather
should be programmed by higher-level SW entities.

Saying that, still, in the offloads mode (SRIOV_OFFLOADS), two flow
groups are created by the driver for management (slow path) purposes:

The first group will be used for sending packets over e-switch vports
from the host OS where the e-switch management code runs, to be
received by VFs.

The second group will be used by a miss rule which forwards packets toward
the e-switch manager. Further logic will trap these packets such that
the receiving net-device as seen by the networking stack is the representor
of the vport that sent the packet over the e-switch data-path.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Or Gerlitz 6ab36e35f1 net/mlx5: E-Switch, Add operational mode to the SRIOV e-Switch
Define three modes for the SRIOV e-switch operation, none (SRIOV_NONE,
none of the VF vports are enabled), legacy (SRIOV_LEGACY, the current mode)
and sriov offloads (SRIOV_OFFLOADS). Currently, when in SRIOV, only the
legacy mode is supported, where steering rules are of the form:

        destination mac --> VF vport

This patch does not change any functionality.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-02 14:40:39 -04:00
Dave Airlie 542d972221 Linux 4.7-rc5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXcHi9AAoJEHm+PkMAQRiGSJ0H/2o4t9VWYmhyPC1sdIHoCExJ
 P4tBrcZYBmKcsOmIfnJDa5g/+IdhouEUM0v0fHPogS2UUWT9eRuJWYD3sY+HpEQ+
 heKTli8X73gsFB25odeIbIt0jAoSiiMYWDrWqLNsuUV1tjEYVA8rH0SM94FiOC/5
 7WVWXLTuH+Rm7JHP18BnKxmMMbzrTFmwisLMqFKyfZRRSlS+/ix7iLUNO9AFa39B
 YHxNPihLrZ0oONyCOAQoHTIXXrw0cQbxV2utg3vnMcCZdme2xOn+iXMntTSKfZ39
 iC9/T0vsO3R6OrRo2aDZAnCPUAniXnMEIhrKG37WMyXpj6cucZ/2QiNXcXviGV4=
 =iLte
 -----END PGP SIGNATURE-----

Back-merge tag 'v4.7-rc5' into drm-next

Linux 4.7-rc5

The fsl-dcu pull needs -rc3 so go to -rc5 for now.
2016-07-02 15:56:01 +10:00
Shaker Daibes 87424ad52d net/mlx5e: Log link state changes
Add Link UP/Down prints to kernel log when link state changes

Signed-off-by: Shaker Daibes <shakerd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
Rana Shahout cdcf11212b net/mlx5e: Validate BW weight values of ETS
Valid weight assigned to ETS TClass values are 1-100

Fixes: 08fb1dacdd ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
Rana Shahout 7ccdd0841b net/mlx5e: Fix select queue callback
The default fallback function used by mlx5e select queue can return
any TX queues in range [0..dev->num_real_tx_queues).

The current implementation assumes that the fallback function returns
a number in the range [0.. number of channels).  Actually
dev->num_real_tx_queues = (number of channels) * dev->num_tc;
which is more than the expected range if num_tc is configured and could
lead to crashes.

To fix this we test if num_tc is not configured we can safely return the
fallback suggestion, if not we will reciprocal_scale the fallback
result and normalize it to the desired range.

Fixes: 08fb1dacdd ('net/mlx5e: Support DCBNL IEEE ETS')
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
Matthew Finlay e3a19b53cb net/mlx5e: Copy all L2 headers into inline segment
ConnectX4-Lx uses an inline wqe mode that currently defaults to
requiring the entire L2 header be included in the wqe.
This patch fixes mlx5e_get_inline_hdr_size() to account for
all L2 headers (VLAN, QinQ, etc) using skb_network_offset(skb).

Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:04 -04:00
Daniel Jurgens 6cd392a082 net/mlx5e: Handle RQ flush in error cases
Add a timeout to avoid an infinite loop waiting for RQ's to flush. This
occurs during AER/EEH and will also happen if the device stops posting
completions due to internal error or reset, or if moving the RQ to the
error state fails. Also cleanup posted receive resources when closing
the RQ.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Daniel Jurgens 3947ca1859 net/mlx5e: Implement ndo_tx_timeout callback
Add callback to handle TX timeouts.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Daniel Jurgens 29429f3300 net/mlx5e: Timeout if SQ doesn't flush during close
Avoid an infinite loop by timing out waiting for the SQ to flush. Also
clean up the TX descriptors if that happens.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia 65ee670845 net/mlx5: Add timeout handle to commands with callback
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get fw response.
Add delayed callback timeout work before posting the command to fw.
In case of real fw command completion we will cancel the delayed work.
In case of fw command timeout the callback timeout handler will be
called and it will simulate fw completion with timeout error.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia 9cba4ebcf3 net/mlx5: Fix potential deadlock in command mode change
Call command completion handler in case of timeout when working in
interrupts mode.
Avoid flushing the commands workqueue after acquiring the semaphores to
prevent a potential deadlock.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Daniel Jurgens d57847dc41 net/mlx5: Fix wait_vital for VFs and remove fixed sleep
The device ID for VFs is in a different location than PFs. This results
in the poll always timing out for VFs. There's no good way to read the
VF device ID without using the PF's configuration space.  Switch to waiting
for the health poll to start incrementing. Also remove the 1s sleep
at the beginning.

fixes: 89d44f0a6c ('net/mlx5_core: Add pci error handlers to mlx5_core
driver')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Daniel Jurgens 5adff6a088 net/mlx5: Fix incorrect page count when in internal error
Change page cleanup flow when in internal error to properly decrement
the page counts when reclaiming pages.  The prevents timing out waiting
for extra pages that were actually cleaned up previously.

fixes: 89d44f0a6c ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia c1d4d2e92a net/mlx5: Avoid calling sleeping function by the health poll thread
In internal error state the health poll thread will eventually call
synchronize_irq() (to safely trigger command completions) which might
sleep, so we are calling sleeping function from atomic context which is
invalid.
Here we move trigger_cmd_completions(dev) to enter error state which is
the earliest stage in error state handling.
This way we won't need to wait for next health poll to trigger command
completions and will solve the scheduling while atomic issue.
mlx5_enter_error_state can be called from two contexts, protect it with
dev->intf_state_lock

Fixes: 89d44f0a6c ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:03 -04:00
Mohamad Haj Yahia 0d834442cc net/mlx5: Fix teardown errors that happen in pci error handler
In case of internal error state we will simulate the commands status
through the return value translation function, but we need to simulate
all the teardown fw commands as successful so we will not have fw
command failure prints.
This also fix memory leaks that happen because we skip teardown stages
due to failed fw commands.

Fixes: 89d44f0a6c ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-01 06:12:02 -04:00
David S. Miller ee58b57100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-30 05:03:36 -04:00
Gal Pressman bfe6d8d1d4 net/mlx5e: Reorganize ethtool statistics
Categorize and reorganize ethtool statistics counters by renaming to
"rx_*" and "tx_*" and removing redundant and duplicated counters, this
way they are easier to grasp and more user friendly.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:47 -04:00
Gal Pressman ed80ec4c17 net/mlx5e: Fix number of PFC counters reported to ethtool
Number of PFC counters used to count only number of priorities with PFC
enabled, but each priority has more than one counter, hence the need to
multiply it by the number of PFC counters per priority.

Fixes: cf678570d5 ('net/mlx5e: Add per priority group to PPort counters')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Matthew Finlay 9ceec359e4 net/mlx5e: Prevent adding the same vxlan port
Do not allow the same vxlan udp port to be added to the device more than
once.

Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Gal Pressman fd4782c213 net/mlx5e: Check for BlueFlame capability before allocating SQ uar
Previous to this patch mapping was always set to write combining without
checking whether BlueFlame is supported in the device.

Fixes: 0ba422410b ('net/mlx5: Fix global UAR mapping')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen e0f46eb9f6 net/mlx5e: Change enum to better reflect usage
Change MLX5E_STATE_ASYNC_EVENTS_ENABLE to
MLX5E_STATE_ASYNC_EVENTS_ENABLED since it represent a state and not an
operation.

Fixes: acff797cd1 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Majd Dibbiny 7092fe8669 net/mlx5: Add ConnectX-5 PCIe 4.0 to list of supported devices
Add the upcoming ConnectX-5 PCIe 4.0 device to the list of
supported devices by the mlx5 driver.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Eli Cohen 5be1ea899d net/mlx5: Update command strings
Add command string for MODIFY_FLOW_TABLE which is used by the driver.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-29 04:28:46 -04:00
Wang Sheng-Hui f299a02d5f net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_create
Commit 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on
reader NUMA node") introduced mlx5_*_alloc_node() but missed changing
some calling and warn messages. This patch introduces 2 changes:
	* Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in
	  mlx5_wq_ll_create()
	* Update the failure warn messages with _node postfix for
	  mlx5_*_alloc function names

Fixes: 311c7c71c9 ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node")
Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Acked-By: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-28 05:17:38 -04:00
Gal Pressman 52244d9607 net/mlx5e: Report correct auto negotiation and allow toggling
Previous to this patch auto negotiation was reported off although it was
on by default in hardware. This patch reports the correct information to
ethtool and allows the user to toggle it on/off.

Added another parameter to set port proto function in order to pass
the auto negotiation field to the hardware.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Gal Pressman 665bc53969 net/mlx5e: Use new ethtool get/set link ksettings API
Use new get/set link ksettings and remove get/set settings legacy
callbacks.
This allows us to use bitmasks longer than 32 bit for supported and
advertised link modes and use modes that were previously not supported.

Signed-off-by: Gal Pressman <galp@mellanox.com>
CC: Ben Hutchings <bwh@kernel.org>
CC: David Decotigny <decot@googlers.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Gal Pressman 4a50e35b04 net/mlx5e: Add missing 50G baseSR2 link mode
Add MLX5E_50GBASE_SR2 as ETHTOOL_LINK_MODE_50000baseSR2_Full_BIT.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: Ben Hutchings <bwh@kernel.org>
Cc: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Gal Pressman 667daedaec net/mlx5e: Toggle link only after modifying port parameters
Add a dedicated function to toggle port link. It should be called only
after setting a port register.
Toggle will set port link to down and bring it back up in case that it's
admin status was up.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Gil Rockah cb3c7fd4f8 net/mlx5e: Support adaptive RX coalescing
Striving for high message rate and low interrupt rate.

Usage:
        ethtool -C <interface> adaptive-rx on/off

Signed-off-by: Gil Rockah <gilr@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
CC: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Tariq Toukan 9908aa2929 net/mlx5e: CQE based moderation
In this mode the moderation timer will restart upon
new completion (CQE) generation rather than upon interrupt
generation.

The outcome is that for bursty traffic the period timer will never
expire and thus only the moderation frames counter will dictate
interrupt generation, thus the interrupt rate will be relative
to the incoming packets size.
If the burst seizes for "moderation period" time then an interrupt
will be issued immediately.

CQE based moderation is off by default and can be controlled
via ethtool set_priv_flags.

Performance tested on ConnectX4-Lx 50G.

Less packet loss in netperf UDP and TCP tests, with no bw degradation,
for both single and multi streams, with message sizes of
64, 1024, 1472 and 32768 byte.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Gil Rockah <gilr@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:41 -04:00
Gal Pressman 4e59e28881 net/mlx5e: Introduce net device priv flags infrastructure
Introduce an infrastructure for getting/setting private net device
flags.

Currently a 'nop' priv flag is added, following patches will override
the flag will actual feature specific flags.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:40 -04:00
Yevgeny Petrilin 507f0c817f net/mlx5e: Add TXQ set max rate support
Implement set_maxrate ndo.
Use the rate index from the hardware table to attach to channel SQ/TXQ.
In case of failure to configure new rate, the queue remains with
unlimited rate.

We save the configuration on priv structure and apply it each time
Send Queues are being reinitialized (after open/close) operations.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:40 -04:00
Yevgeny Petrilin 1466cc5b23 net/mlx5: Rate limit tables support
Configuring and managing HW rate limit tables.
The HW holds a table of rate limits, each rate is
associated with an index in that table.
Later a Send Queue uses this index to set the rate limit.
Multiple Send Queues can have the same rate limit, which is
represented by a single entry in this table.
Even though a rate can be shared, each queue is being rate
limited independently of others.

The SW shadow of this table holds the rate itself,
the index in the HW table and the refcount (number of queues)
working with this rate.

The exported functions are mlx5_rl_add_rate and mlx5_rl_remove_rate.
Number of different rates and their values are derived
from HW capabilities.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-27 04:10:40 -04:00
Rana Shahout af7d518526 net/mlx4_en: Add DCB PFC support through CEE netlink commands
This patch adds support for reading and updating priority flow
control (PFC) attributes in the driver via netlink.

Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-23 15:18:50 -04:00
Artemy Kovalyov af1ba291c5 {net, IB}/mlx5: Refactor internal SRQ API
Currently, the SRQ API uses the obsolete mlx5_*_srq_mbox_{in,out}
structs which limit the ability to pass the SRQ attributes between
net and IB parts of the driver.

This patch changes the SRQ API so as to use auto-generated structs
and provides a better way to pass attributes which will be in use by
coming features.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:20:07 -04:00
Yishai Hadas 16bd020147 net/mlx5: Export required core functions to support RSS
In order to support RSS QPs, we need to create Ethernet based objects.
This is done by create_rq, destroy_rq, create_rqt and destroy_rqt
mlx5_core functions. We export these functions.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Eran Ben Elisha 9d76931180 net/mlx4_en: Avoid unregister_netdev at shutdown flow
This allows a clean shutdown, even if some netdev clients do not
release their reference from this netdev. It is enough to release
the HW resources only as the kernel is shutting down.

Fixes: 2ba5fbd62b ('net/mlx4_core: Handle AER flow properly')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-22 16:38:11 -04:00
Kamal Heib 93c098af09 net/mlx4_en: Fix the return value of a failure in VLAN VID add/kill
Modify mlx4_en_vlan_rx_[add/kill]_vid to return error value in case of
failure.

Fixes: 8e586137e6 ('net: make vlan ndo_vlan_rx_[add/kill]_vid return error value')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-22 16:38:11 -04:00
Ido Schimmel 223053783b mlxsw: spectrum: Add debug prints
For debug purposes, it's useful to know the order in which the driver
responds to changes in the topology of its upper devices.

Add debug prints to signal these events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:51 -04:00
Ido Schimmel 1c80075907 mlxsw: spectrum: Free resources upon vPort destruction
There are situations in which a vPort is destroyed while still holding
references to device's resources such as FIDs and FDB records. This can
happen, for example, when a VLAN device is deleted while still being
bridged.

Instead of trying to make sure vPort destruction is invoked when it no
longer uses device's resources, just free them upon destruction. This
simplifies the code, as we no longer need to take different situations
into account when events are received - cleanup is taken care of in one
place.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:51 -04:00
Ido Schimmel fe3f6d144a mlxsw: spectrum: Refactor FDB flushing logic
FDB entries are learned using {Port / LAG ID, FID} and therefore should
be flushed whenever a port (vPort) leaves its FID (vFID).

However, when the bridge port is a LAG device (or a VLAN device on top),
then FDB flushing is conditional. Ports removed from such LAG
configurations must not trigger flushing, as other ports might still be
members in the LAG and therefore the bridge port is still active.

The decision whether to flush or not was previously computed in the
netdevice notification block, but in order to flush the entries when a
port leaves its FID this decision should be computed there.

Strip the notification block from this logic and instead move it to one
FDB flushing function that is invoked from both the FID / vFID leave
functions.

When port isn't member in LAG, FDB flushing should always occur.
Otherwise, it should occur only when the last port (vPort) member in the
LAG leaves the FID (vFID).

This will allow us - in the next patch - to simplify the cleanup code
paths that are hit whenever the topology above the port netdevs changes.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:51 -04:00
Ido Schimmel 56918b6b0a mlxsw: spectrum: Don't count on FID being present
Not all vPorts will have FIDs assigned to them, so make sure functions
first test for FID presence.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 41b996cc94 mlxsw: spectrum: Add FID get / set functions
As previously explained, not all vPorts will be assigned FIDs, so instead
of returning the FID index of a vPort, return a pointer to its FID
struct. This will allow us to know whether it's legal to access the
vPort's FID parameters such as index and device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 6381b3a85f mlxsw: spectrum: Check if port is vPort using its VID
When L3 interfaces will be introduced a vPort won't necessarily have a
FID assigned to it. This can happen if it's not member in a bridge (in
which case it's assigned a vFID) or doesn't have an IP address (in which
case it's assigned an rFID).

Therefore, instead check the VID parameter to test whether a port is a
vPort or not.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 14d39461b3 mlxsw: spectrum: Use per-FID struct for the VLAN-aware bridge
In a very similar way to the vFIDs, make the first 4K FIDs - used in the
VLAN-aware bridge - use the new FID struct.

Upon first use of the FID by any of the ports do the following:

1) Create the FID
2) Setup a matching flooding entry
3) Create a mapping for the FID

Unlike vFIDs, upon creation of a FID we always create a global
VID-to-FID mapping, so that ports without upper vPorts can use it
instead of creating an explicit {Port, VID} to FID mapping.

When a port leaves a FID the reverse is performed. Whenever the FID's
reference count reaches zero the FID is deleted along with the global
mapping.

The per-FID struct will later allow us to configure L3 interfaces on top
of the VLAN-aware bridge.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 37286d2571 mlxsw: spectrum: Remove unused function argument
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 0355b59fbb mlxsw: spectrum: Use join / leave functions for vFID operations
When a vPort is created or when it joins a bridge we always do the same
set of operations:

1) Create the vFID, if not already created
2) Setup flooding for the vFID
3) Map the {Port, VID} to the vFID

When a vPort is destroyed or when it leaves a bridge the reverse is
performed.

Encapsulate the above in join / leave functions and simplify the code.
FIDs and rFIDs will use a similar set of functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel d0ec875a2f mlxsw: spectrum: Make vFID struct generic
Up until now we had a dedicated struct only for vFIDs, but before
introducing support for L3 interfaces we need to make it generic and
use it for all three types of FIDs:

1) FIDs - 0..4K-1, used for the VLAN-aware bridge
2) vFIDs - 4K..15K-1, used for VLAN-unaware bridges
3) rFIDs - 15K..16K-1, used to direct traffic to / from the router in
the device. Will be introduced later in the series.

The three types of L3 interfaces - Router InterFaces, RIFs - that will
be introduced correspond to the three types of FIDs and are configured
using them. Therefore, we'll need to store the links between them as
well as a reference count on the underlying FID, so that the
corresponding RIF will be destroyed when it reaches zero.

Note that the lower 0.5K vFIDs are currently used for for non-bridged
netdevs, so that traffic could be flooded to the CPU port. However, when
rFIDs will be introduced we'll no longer need these and they too will be
used for VLAN-unaware bridges.

Make the vFID struct generic by renaming it and some of its fields. FIDs
will be converted to use it later in the series.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel e606002721 mlxsw: spectrum: Use FID instead of vFID to setup flooding
Use a FID index instead of vFID and ease the transition towards a
generic FID struct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:50 -04:00
Ido Schimmel 9c4d442314 mlxsw: spectrum: Create a function to map vPort's FID
A FID used by a vPort (vFID, but also rFID later in the series) is
always mapped using {Port, VID} and not only VID as with the 4K FIDs of
the VLAN-aware bridge.

Instead of specifying all the arguments each time, just wrap this
operation using a dedicated function and simplify the code.

As before, the function takes FID as its argument in preparation for a
generic FID struct.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel c7e920b5be mlxsw: spectrum: Use only one function to create vFIDs
Simplify the code and use only one function for vFID creation /
destruction.

Unlike before, the function receives a FID index as its argument and not
a vFID index. Instead of passing 0, now one would need to pass 4K, which
is the first vFID.

This is the first step in creating a generic FID struct that will be
used for all three types of FIDs.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 47a0a9e6c3 mlxsw: spectrum: Remove redundant function argument
In all call sites 'only_uc' is set to false, so strip it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel d8651fd886 mlxsw: spectrum: Use DECLARE_BITMAP() macro
There is a macro to do this kind of declarations, so use it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 7117a570b9 mlxsw: spectrum: Centralize VLAN-aware bridge ref counting
We hold a reference count on the number of ports member in the
VLAN-aware bridge, as we only support one.

Instead of always incrementing / decrementing the reference count after
joining / leaving the bridge, simply do this accounting in the join /
leave functions.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 279438952b mlxsw: spectrum: Remove unnecessary function argument
The argument 'br_dev' is never used, so remove it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 82e6db034b mlxsw: spectrum: Make unlinking functions return void
When responding to unlinking CHANGEUPPER notifications we shouldn't
return any value, as it's not checked by upper layers.

In addition, there's nothing the driver can do in case of failure, so it
should simply continue and try to free as much resources as possible and
not stop on first error.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 423b937e7d mlxsw: spectrum: Use WARN_ON() return value
Instead of checking for a condition and then issue the warning, just do
it in one go and simplify the code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel ddbe993dbe mlxsw: spectrum: Remove unnecessary checks from event processing
When upper device of a VLAN device changes we already made sure it's
a bridge device in PRECHANGEUPPER, so no need to check it's a master
device in CHANGEUPPER.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:49 -04:00
Ido Schimmel 6ec439043b mlxsw: spectrum: Forbid LAG slave from having VLAN uppers
When a port netdev is put under LAG it cannot have VLAN upper devices,
so forbid that. The LAG device itself can have VLAN upper devices.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:48 -04:00
Ido Schimmel 59fe9b3f84 mlxsw: spectrum: Sanitize port netdev upper devices
We currently only support the following upper devices for port netdevs:
1) Bridge
2) LAG (bond / team)
3) VLAN

Any other device is forbidden, so return an error.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:48 -04:00
Ido Schimmel 80bedf1a62 mlxsw: spectrum: Use notifier_from_errno() in notifier block
Instead of checking the error value and returning NOTIFY_BAD, just use
notifier_from_errno() and simplify the code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-21 05:02:48 -04:00
Nogah Frankel 4e239fac7c mlxsw: switchx2: Don't count internal TX header bytes to stats
Stop the SW TX counter from counting the TX header bytes
since they are not being sent out.

Fixes: e577516b9d ("mlxsw: Fix use-after-free bug in mlxsw_sx_port_xmit")
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17 21:57:53 -07:00
Nogah Frankel 63dcdd35c1 mlxsw: spectrum: Don't count internal TX header bytes to stats
Stop the SW TX counter from counting the TX header bytes
since they are not being sent out.

Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17 21:57:53 -07:00
Alexander Duyck 974c3f3000 mlx5_en: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port
This change replaces the network device operations for adding or removing a
VXLAN port with operations that are more generically defined to be used for
any UDP offload port but provide a type.  As such by just adding a line to
verify that the offload type is VXLAN we can maintain the same
functionality.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17 20:23:31 -07:00
Alexander Duyck a831274a13 mlx4_en: Replace ndo_add/del_vxlan_port with ndo_add/del_udp_enc_port
This change replaces the network device operations for adding or removing a
VXLAN port with operations that are more generically defined to be used for
any UDP offload port but provide a type.  As such by just adding a line to
verify that the offload type is VXLAN we can maintain the same
functionality.

In addition I updated the socket address family check so that instead of
excluding IPv6 we instead abort of type is not IPv4.  This makes much more
sense as we should only be supporting IPv4 outer addresses on this
hardware.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-17 20:23:31 -07:00
Alexander Duyck a547224dce mlx4e: Do not attempt to offload VXLAN ports that are unrecognized
The mlx4e driver does not support more than one port for VXLAN offload.  As
such expecting the hardware to offload other ports is invalid since it
appears the parsing logic is used to perform Tx checksum and segmentation
offloads.  Use the vxlan_port number to determine in which cases we can
apply the offload and in which cases we can not.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-16 14:24:59 -07:00
Eric Dumazet 0c5ddb51e8 net/mlx4_en: initialize cmd.context_lock spinlock earlier
Maciej Żenczykowski reported lockdep warning a spinlock
was not registered before being held in mlx4_cmd_wake_completions()

cmd.context_lock initialization is not at the right place.

1) mlx4_cmd_use_events() can be called multiple times.
   Calling spin_lock_init() on a live spinlock can lead
   to hangs.

2) mlx4_cmd_wake_completions() can be called while lock
   has not been initialized.
   Lockdep complains, and current logic is not race prone.

It seems better to move the initialization earlier in
mlx4_load_one()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-15 12:16:30 -07:00
David S. Miller 1578b0a5e9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/sched/act_police.c
	net/sched/sch_drr.c
	net/sched/sch_hfsc.c
	net/sched/sch_prio.c
	net/sched/sch_red.c
	net/sched/sch_tbf.c

In net-next the drop methods of the packet schedulers got removed, so
the bug fixes to them in 'net' are irrelevant.

A packet action unload crash fix conflicts with the addition of the
new firstuse timestamp.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-10 11:52:24 -07:00
Bhaktipriya Shridhar 3d5479e920 mlxsw: core: Remove deprecated create_workqueue
alloc_workqueue replaces deprecated create_workqueue().

A dedicated workqueue has been used since the workqueue
mlxsw_wq is used for FDB notif. processing with workitems that are
involved in normal device operation && because it's a network device
which can be depended upon during memory reclaim.

Workitems &trans->timeout_dw and &mlxsw_sp->fdb_notify.dw,
map to mlxsw_sp_fdb_notify_work (processes FDB notifications from the
underlying device and resolves the netdev to which the entry points to
and notifies the bridge using the switchdev notifier) and
mlxsw_emad_trans_timeout_work (provides async EMAD register access)
respectively. They require forward progress under memory pressure and
hence, WQ_MEM_RECLAIM has been set.

Since there are only a fixed number of work items, explicit concurrency
limit is unnecessary here.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 23:49:43 -07:00
Eric Dumazet f7d3c1cbe3 net/mlx4_en: fix ethtool -x
mlx4 RSS is limited to spread incoming packets to a power of two number
of queues.

An uniformly distibuted traffic would be split on queues 0 to N-1, N
being a power of two, each queue having a 1/N weight.

If number of RX queues is not a power of two, upper RX queues do not
receive traffic.

ethtool -x is lying, because it pretends some queues have higher weight.

Before patch:

lpaa24:~# ethtool -L eth1 rx 24
lpaa24:~# ethtool -x eth1
RX flow hash indirection table for eth1 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:      0     1     2     3     4     5     6     7
RSS hash key:
e0:7c:3a:89:07:55:b6:58:69:cc:f4:e5:24:62:e3:25:88:6c:42:5b:d2:cb:9a:d2:e0:06:e1:dc:f9:09:a1:89:0f:a0:30:43:73:6f:0c:b6

If this information was correct, user space tools could expect queues 0
to 7 to receive twice more traffic than queues 8 to 15

After patch :

lpaa24:~# ethtool -L eth1 rx 24
lpaa24:~# ethtool -x eth1
RX flow hash indirection table for eth1 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
RSS hash key:
da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce
lpaa24:~# ethtool -X eth1 equal 8
lpaa24:~# ethtool -x eth1
RX flow hash indirection table for eth1 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      0     1     2     3     4     5     6     7
RSS hash key:
da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Maciej Żenczykowski <maze@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 23:39:46 -07:00
Eric Dumazet 7d71e994cd net/mlx4_en: mlx4_en_netpoll() should schedule TX, not RX
I am not sure mlx4_en_netpoll() is doing anything useful right now.

mlx4 has different NAPI structures for RX and TX, and netpoll only wants
to drain TX queues.

Lets schedule NAPI polls on TX, not RX.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:24:16 -07:00
Eli Cohen 0ca00fc1f8 net/mlx5e: Fix blue flame quota logic
Blue flame is a latency enhancement feature that allows the driver to
write the packet data directly to the NIC's registers thus making the
read of the packet data from host memory redundant.

We maintain a quota for the blue flame which is reloaded whenever we
identify that the hardware is processing send requests and processes
them fast enough so by the time we post the next send request it was
able to process all the pending ones. This indicates that the hardware
is capable of processing more blue flame requests efficiently. The blue
flame quota is decremented whenever we send using blue flame.

The current code erroneously clears the budget if we did not use blue
flame for the current post send operation and we fix it here.

Fixes: 88a85f99e5 ('net/mlx5e: TX latency optimization to save DMA reads')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:27 -07:00
Eran Ben Elisha 811afeaa37 net/mlx5e: Use ndo_stop explicitly at shutdown flow
The current implementation copies the flow of ndo_stop instead of
calling it explicitly, Fixed it.

Fixes: 5fc7197d3a ("net/mlx5: Add pci shutdown callback")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:27 -07:00
Mohamad Haj Yahia 62e3c24ac4 net/mlx5: E-Switch, always set mc_promisc for allmulti vports
Set the mc_promisc flag also in the case of adding new mc address to
existing allmulti vport.

Fixes: a35f71f27a ('net/mlx5: E-Switch, Implement promiscuous rx modes vf request handling')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Noa Osherovich 23898c763f net/mlx5: E-Switch, Modify node guid on vf set MAC
In RoCE, the RDMA-CM needs the node guid to establish connection
between nodes.
Today, the node guid exposed to mlx5 Ethernet VFs is zero, therefore
RDMA-CM on the VF is broken.

Whenever the administrator sets a MAC for a VF, derive the node guid
from it and set it as well in the following way:
MAC: e4:1d:2d:b3:f4:01 -> node_guid: e4:1d:2d:ff:fe:b3:f4:01

Fixes: 77256579c6 ('net/mlx5: E-Switch, Introduce Vport...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Mohamad Haj Yahia 25fff58cb2 net/mlx5: E-Switch, Fix vport enable flow
Reorder vport enable flow to mark the vport as enabled before calling
the vport change handler which was modified to handle the case for
when vport is not enabled.

This fixes the case for when the PF netdev is open before sriov is
enabled, once sriov is enabled at esw_enable_vport,
esw_vport_change_handle_locked didn't read the PF context since it
thought the PF vport was not enabled.

When we enable the vport, arming for events is not required anymore,
since it's done on the vport change handle

Fixes: 586cfa7f1d ('net/mlx5: E-Switch, Use vport event handler for vport cleanup')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Or Gerlitz 3f42ac6648 net/mlx5: E-Switch, Use the correct error check on returned pointers
The mlx5 flow-steering API (mlx5_create_flow_table/group/rule) never
returns null pointer on error. Even if it was doing that, checking
for IS_ERR_OR_NULL(p) and then returning PTR_ERR(p) would have cause
bugs, since PTR_ERR(NULL) --> success, crash.

To make things more robust and protect against related future bugs,
convert all IS_ERR_OR_NULL checks on returned values to IS_ERR.

Fixes: 5742df0f7d ('net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs')
Fixes: 86d722ad2c ('net/mlx5: Use flow steering infrastructure for mlx5_en')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Or Gerlitz 3fe3d819d5 net/mlx5: E-Switch, Use the correct free() function
We must use kvfree() for something that could have been allocated with vzalloc(),
do that.

Fixes: 5742df0f7d ('net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs')
Fixes: 86d722ad2c ('net/mlx5: Use flow steering infrastructure for mlx5_en')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Maor Gottlieb bd02ef8eec net/mlx5: Fix E-Switch flow steering capabilities check
Add missing capabilities check for E-Switch FDB and ACLs flow
tables before creating their namespace in flow steering.

Fixes: efdc810ba3 ('net/mlx5: Flow steering, Add vport ACL support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:26 -07:00
Maor Gottlieb 876d634d19 net/mlx5: Fix flow steering NIC capabilities check
Flow steering infrastructure is currently used only on link layer
ethernet, therefore the driver should initialize the flow steering
when the device link layer is ethernet.

In addition, add missing capability check before initializing the
namespace of NIC RX flow tables.

Fixes: 2530236303 ('net/mlx5_core: Flow steering tree initialization')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:25 -07:00
Maor Gottlieb 2fee37a47c net/mlx5: Fix root flow table update
When we destroy the last flow table we need to update
the root_ft to NULL.

It fixes an issue for when the last flow table is destroyed
and recreated again, root_ft pointer will not be updated,
as a result traffic will be dropped.

Fixes: 2cc43b494a ('net/mlx5_core: Managing root flow table')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:25 -07:00
Majd Dibbiny 9cd3411c42 net/mlx5: Fix masking of reserved bits in XRCD number
Mask the reserved bits when reading the number of newly
created XRCD.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 22:06:25 -07:00
Ido Schimmel d664b41e2a mlxsw: spectrum: Don't sleep during ndo_get_phys_port_name()
When rtnl_fill_ifinfo() is called for a certain netdevice it queries its
various parameters such as switch id and physical port name. The
function might get called in an atomic context, which means the
underlying driver must not sleep during the query operation.

Don't query the device and sleep during ndo_get_phys_port_name(), but
instead store the needed parameters in port creation time.

Fixes: 2bf9a58675 ("mlxsw: spectrum: Add support for physical port names")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 11:20:05 -07:00
Ido Schimmel be94535f95 mlxsw: spectrum: Make split flow match firmware requirements
When a port is created following a split / unsplit we need to map it to
the correct module and lane, enable it and then continue to initialize
its various parameters such as MTU and VLAN filters.

Under certain conditions, such as trying to split ports at the bottom
row of the front panel by four, we get firmware errors.

After evaluating this with the firmware team it was decided to alter the
split / unsplit flow, so that first all the affected ports are mapped,
then enabled and finally each is initialized separately.

Fix the split / unsplit flow by first mapping and enabling all the
affected ports. Newer firmware versions will support both flows.

Fixes: 18f1e70c41 ("mlxsw: spectrum: Introduce port splitting")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-06-09 11:20:05 -07:00
Dave Airlie fa625c1956 Linux 4.7-rc2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXVJo5AAoJEHm+PkMAQRiGjioH/3H8Q26FP83Bprhrh3y5EYmz
 QSWbq5f5XHGVp4sqc45qdozklh11PrMPkNJQPT2HCTgmtzgeTX5EW9TFaQM/Ubrm
 NeYnBW0tTcLP2PhfOyOIJOjE3R1oEYIRQ7zzdOZleViYaqAW2g0frYK0l8AY/KAv
 z9mJ8s7xieqSzNTyKNfDPIciT/mO2KMbEHu92EiAkjeCfhuVxUO6PR4wLeZb/85H
 I/cOFupU1166xeOZUR0lltvPMz8JznOUB0pg2Ma67TqsNGsPPfsmxEPFlALTTJk/
 0V/hVyL4BgvO22LXwMoIxnUcwq/F3ZtlDJRhubeX7m5aS9x2Pb5mQ/g1ko0CXJA=
 =5hXk
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.7-rc2' into drm-next

Daniel has a pull request that relies on stuff in fixes that are in rc2.
2016-06-09 11:01:49 +10:00
Dave Airlie 66fd7a66e8 Merge branch 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel into drm-next
drm-intel-next-2016-05-22:
- cmd-parser support for direct reg->reg loads (Ken Graunke)
- better handle DP++ smart dongles (Ville)
- bxt guc fw loading support (Nick Hoathe)
- remove a bunch of struct typedefs from dpll code (Ander)
- tons of small work all over to avoid casting between drm_device and the i915
  dev struct (Tvrtko&Chris)
- untangle request retiring from other operations, also fixes reset stat corner
  cases (Chris)
- skl atomic watermark support from Matt Roper, yay!
- various wm handling bugfixes from Ville
- big pile of cdclck rework for bxt/skl (Ville)
- CABC (Content Adaptive Brigthness Control) for dsi panels (Jani&Deepak M)
- nonblocking atomic commits for plane-only updates (Maarten Lankhorst)
- bunch of PSR fixes&improvements
- untangle our map/pin/sg_iter code a bit (Dave Gordon)
drm-intel-next-2016-05-08:
- refactor stolen quirks to share code between early quirks and i915 (Joonas)
- refactor gem BO/vma funcstion (Tvrtko&Dave)
- backlight over DPCD support (Yetunde Abedisi)
- more dsi panel sequence support (Jani)
- lots of refactoring around handling iomaps, vma, ring access and related
  topics culmulating in removing the duplicated request tracking in the execlist
  code (Chris & Tvrtko) includes a small patch for core iomapping code
- hw state readout for bxt dsi (Ramalingam C)
- cdclk cleanups (Ville)
- dedupe chv pll code a bit (Ander)
- enable semaphores on gen8+ for legacy submission, to be able to have a direct
  comparison against execlist on the same platform (Chris) Not meant to be used
  for anything else but performance tuning
- lvds border bit hw state checker fix (Jani)
- rpm vs. shrinker/oom-notifier fixes (Praveen Paneri)
- l3 tuning (Imre)
- revert mst dp audio, it's totally non-functional and crash-y (Lyude)
- first official dmc for kbl (Rodrigo)
- and tons of small things all over as usual

* 'drm-intel-next' of git://anongit.freedesktop.org/drm-intel: (194 commits)
  drm/i915: Revert async unpin and nonblocking atomic commit
  drm/i915: Update DRIVER_DATE to 20160522
  drm/i915: Inline sg_next() for the optimised SGL iterator
  drm/i915: Introduce & use new lightweight SGL iterators
  drm/i915: optimise i915_gem_object_map() for small objects
  drm/i915: refactor i915_gem_object_pin_map()
  drm/i915/psr: Implement PSR2 w/a for gen9
  drm/i915/psr: Use ->get_aux_send_ctl functions
  drm/i915/psr: Order DP aux transactions correctly
  drm/i915/psr: Make idle_frames sensible again
  drm/i915/psr: Try to program link training times correctly
  drm/i915/userptr: Convert to drm_i915_private
  drm/i915: Allow nonblocking update of pageflips.
  drm/i915: Check for unpin correctness.
  Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
  drm/i915: Make unpin async.
  drm/i915: Prepare connectors for nonblocking checks.
  drm/i915: Pass atomic states to fbc update functions.
  drm/i915: Remove reset_counter from intel_crtc.
  drm/i915: Remove queue_flip pointer.
  ...
2016-06-02 07:58:36 +10:00
Eric Dumazet f73a6f439f net/mlx4_en: get rid of private net_device_stats
We simply can use the standard net_device stats.

We do not need to clear fields that are already 0.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-25 22:15:50 -07:00
Eric Dumazet 9ed17db17f net/mlx4_en: get rid of ret_stats
mlx4 uses a private struct net_device_stats in a vain attempt
to avoid races.

This is buggy because multiple cpus could call mlx4_en_get_stats()
at the same time, so ret_stats can not guarantee stable results.

To fix this, we need to switch to ndo_get_stats64() as this
method provides per-thread storage.

This allows to reduce mlx4_en_priv bloat.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-25 22:15:50 -07:00
Eric Dumazet 45acbac609 net/mlx4_en: clear some TX ring stats in mlx4_en_clear_stats()
mlx4_en_clear_stats() clears about everything but few TX ring
fields are missing :
- queue_stopped, wake_queue, tso_packets, xmit_more

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-25 22:15:50 -07:00
Eric Dumazet 63a664b7e9 net/mlx4_en: fix tx_dropped bug
1) mlx4_en_xmit() can increment priv->stats.tx_dropped, but this variable
is overwritten in mlx4_en_DUMP_ETH_STATS().

2) This increment was not SMP safe, as a port might have many TX queues.

Add a per TX ring tx_dropped to fix these issues.

This is u32 as mlx4_en_DUMP_ETH_STATS() will add a 32bit field.

So lets avoid bugs with SNMP agents having to cope with partial
overwraps. (One of these agents being bond_fold_stats())

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Cc: Eugenia Emantayev <eugenia@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-25 22:15:49 -07:00
Linus Torvalds 76b584d312 Primary 4.7 merge window changes
- Updates to the new Intel X722 iWARP driver
 - Updates to the hfi1 driver
 - Fixes for the iw_cxgb4 driver
 - Misc core fixes
 - Generic RDMA READ/WRITE API addition
 - SRP updates
 - Misc ipoib updates
 - Minor mlx5 updates
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPIAnAAoJELgmozMOVy/dEYUP/0A6NH5ptzUrwPuLrWoz8h+e
 KBfJE7H09mfKBx0Rq8YmnU+pz4lk8vrMLLaqpbGN57mwO0a1lK9bgc3E6KUhQPhc
 dpGEX/NG1+aILomD7M4l1yAKkG17kxFLD75cLCeaxhO76jBRWsunukqk5mT/u0EG
 fUYZs1fRb9t2LDtTNhPfSFR1+dgP5S17xLhpl9ttn87hTmIuiGWR6ig2nTC7azZD
 G0d7RVjohfY2sDD28YgQiUEJ+q+1ymp3XTaCZhPCVl9VCRPweEdtLKcbNWZIvClx
 ewuTCgADXg8tAL/6zu6bEKqZlC17UrmVJee3csKQLT09PJkSEICeFR/ld6ONUKAF
 nDhi3ySa75Xxg9VDcCnuRkXKK+/zi7oDelZuh9mvMG0JJqPK9rTZDD29j2kBf7C0
 bdx4R5cI4KJWQ/GlCyi/nLiuYkmAiCugzcGnRho4ub+EJ0yX1w6n8KVYr37kFsFu
 q6MCnEfArEgDpbq1wo0+9MWtqBYrnOI/XtG81Zd+6X2MW975qU85wUdUSjg6OOb1
 v1osyAmFDy9A0Y80yY+l1HHrSVIvI0IAWZDfxsbCLQY8O03ZNcvxE2RsrzWd5CKL
 iZsX24tjV0WR9+lORHLfAKB3DL9CcfHv/tHo7q+5iAHmIuWZGrEN22ELkwS/4X7x
 d/V0XDjzs6lgQeTJ7R4B
 =e+Zv
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma updates from Doug Ledford:
 "Primary 4.7 merge window changes

   - Updates to the new Intel X722 iWARP driver
   - Updates to the hfi1 driver
   - Fixes for the iw_cxgb4 driver
   - Misc core fixes
   - Generic RDMA READ/WRITE API addition
   - SRP updates
   - Misc ipoib updates
   - Minor mlx5 updates"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (148 commits)
  IB/mlx5: Fire the CQ completion handler from tasklet
  net/mlx5_core: Use tasklet for user-space CQ completion events
  IB/core: Do not require CAP_NET_ADMIN for packet sniffing
  IB/mlx4: Fix unaligned access in send_reply_to_slave
  IB/mlx5: Report Scatter FCS device capability when supported
  IB/mlx5: Add Scatter FCS support for Raw Packet QP
  IB/core: Add Scatter FCS create flag
  IB/core: Add Raw Scatter FCS device capability
  IB/core: Add extended device capability flags
  i40iw: pass hw_stats by reference rather than by value
  i40iw: Remove unnecessary synchronize_irq() before free_irq()
  i40iw: constify i40iw_vf_cqp_ops structure
  IB/mlx5: Add UARs write-combining and non-cached mapping
  IB/mlx5: Allow mapping the free running counter on PROT_EXEC
  IB/mlx4: Use list_for_each_entry_safe
  IB/SA: Use correct free function
  IB/core: Fix a potential array overrun in CMA and SA agent
  IB/core: Remove unnecessary check in ibnl_rcv_msg
  IB/IWPM: Fix a potential skb leak
  RDMA/nes: replace custom print_hex_dump()
  ...
2016-05-20 14:35:07 -07:00
Joonsoo Kim 0139aa7b7f mm: rename _count, field of the struct page, to _refcount
Many developers already know that field for reference count of the
struct page is _count and atomic type.  They would try to handle it
directly and this could break the purpose of page reference count
tracepoint.  To prevent direct _count modification, this patch rename it
to _refcount and add warning message on the code.  After that, developer
who need to handle reference count will find that field should not be
accessed directly.

[akpm@linux-foundation.org: fix comments, per Vlastimil]
[akpm@linux-foundation.org: Documentation/vm/transhuge.txt too]
[sfr@canb.auug.org.au: sync ethernet driver changes]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Manish Chopra <manish.chopra@qlogic.com>
Cc: Yuval Mintz <yuval.mintz@qlogic.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-19 19:12:14 -07:00
Matan Barak 94c6825e0f net/mlx5_core: Use tasklet for user-space CQ completion events
Previously, we've fired all our completion callbacks straight from
our ISR.

Some of those callbacks were lightweight (for example, mlx5 Ethernet
napi callbacks), but some of them did more work (for example,
the user-space RDMA stack uverbs' completion handler). Besides that,
doing more than the minimal work in ISR is generally considered wrong,
it could even lead to a hard lockup of the system. Since when a lot
of completion events are generated by the hardware, the loop over
those events could be so long, that we'll get into a hard lockup by
the system watchdog.

In order to avoid that, add a new way of invoking completion events
callbacks. In the interrupt itself, we add the CQs which receive
completion event to a per-EQ list and schedule a tasklet. In the
tasklet context we loop over all the CQs in the list and invoke the
user callback.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18 10:45:49 -04:00
Linus Torvalds 16bf834805 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits)
  gitignore: fix wording
  mfd: ab8500-debugfs: fix "between" in printk
  memstick: trivial fix of spelling mistake on management
  cpupowerutils: bench: fix "average"
  treewide: Fix typos in printk
  IB/mlx4: printk fix
  pinctrl: sirf/atlas7: fix printk spelling
  serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/
  w1: comment spelling s/minmum/minimum/
  Blackfin: comment spelling s/divsor/divisor/
  metag: Fix misspellings in comments.
  ia64: Fix misspellings in comments.
  hexagon: Fix misspellings in comments.
  tools/perf: Fix misspellings in comments.
  cris: Fix misspellings in comments.
  c6x: Fix misspellings in comments.
  blackfin: Fix misspelling of 'register' in comment.
  avr32: Fix misspelling of 'definitions' in comment.
  treewide: Fix typos in printk
  Doc: treewide : Fix typos in DocBook/filesystem.xml
  ...
2016-05-17 17:05:30 -07:00
Daniel Vetter 9a652cc01e Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge request by Jani to get at

commit 249c4f538b
Author: Deepak M <m.deepak@intel.com>
Date:   Wed Mar 30 17:03:39 2016 +0300

    drm: Add new DCS commands in the enum list

Some simple conflicts in intel_dp.c.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-05-17 12:15:49 +02:00
Tariq Toukan 2bb07e155b net/mlx4_core: Fix access to uninitialized index
Prevent using uninitialized or negative index when handling
steering entries.

Fixes: b12d93d63c ('mlx4: Add support for promiscuous mode in the new steering model.')
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:58:01 -04:00
Amir Vadai aad7e08d39 net/mlx5e: Hardware offloaded flower filter statistics support
Introduce support in updating statistics of offloaded TC flower
classifiers. Currently only the DROP action is supported.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai 43a335e055 net/mlx5_core: Flow counters infrastructure
If a counter has the aging flag set when created, it is added to a list
of counters that will be queried periodically from a workqueue.  query
result and last use timestamp are cached.
add/del counter must be very efficient since thousands of such
operations might be issued in a second.
There is only a single reference to counters without aging, therefore
no need for locks.
But, counters with aging enabled are stored in a list. In order to make
code as lockless as possible, all the list manipulation and access to
hardware is done from a single context - the periodic counters query
thread.

The hardware supports multiple counters per FTE, however currently we
are using one counter for each FTE.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai bd5251dbf1 net/mlx5_core: Introduce flow steering destination of type counter
When adding a flow steering rule with a counter, need to supply a
destination of type MLX5_FLOW_DESTINATION_TYPE_COUNTER, with a pointer
to a struct mlx5_fc.
Also, MLX5_FLOW_CONTEXT_ACTION_COUNT bit should be set in the action.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai 9dc0b289c4 net/mlx5_core: Firmware commands to support flow counters
Getting packet/byte statistics on flows is done through flow counters.
Implement the firmware commands to alloc, free and query flow counters.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Amir Vadai 42ca502e17 net/mlx5_core: Use a macro in mlx5_command_str()
Use a macro instead of copying the OP name.

Signed-off-by: Amir Vadai <amirva@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-16 13:43:51 -04:00
Saeed Mahameed b797a684b0 net/mlx5e: Enable CQE compression when PCI is slower than link
We turn the feature ON, only for servers with PCI BW < MAX LINK BW, as it
helps reducing PCI pressure on weak PCI slots, but it adds some software
overhead.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-11 19:42:39 -04:00
Tariq Toukan d9d9f156f3 net/mlx5e: Expand WQE stride when CQE compression is enabled
Make the MPWQE/Striding RQ default configuration dynamic and not
statically set at compile time.  Now at driver load we set
stride size and num strides dynamically.

By default we use same values as before, but when CQE compression
is enabled, we set larger stride size to benefit from CQE
compression for larger packets.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-11 19:42:39 -04:00
Tariq Toukan 7219ab34f1 net/mlx5e: CQE compression
CQE compression feature is meant to save PCIe bandwidth by
compressing few CQEs into smaller amount of bytes on PCIe.
CQE compression can be selectively enabled per CQ.  By default
is disabled for now and will be enabled later on.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-11 19:42:39 -04:00
David S. Miller 1de1d449c6 mlx5: Fix merge errors.
I accidently let Arnd's VXLAN dependency changes slip into net-next,
they are only appropriate for net.

Also the flow steering structural changes to mlx5e_priv got scrambled
during the merge resolution as well.

Fix that all up.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 22:05:13 -04:00
David S. Miller e800072c18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
In netdevice.h we removed the structure in net-next that is being
changes in 'net'.  In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.

The mlx5 conflicts have to do with vxlan support dependencies.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 15:59:24 -04:00
Arnd Bergmann 7dbb29172d net/mlx5e: make VXLAN support conditional
VXLAN can be disabled at compile-time or it can be a loadable
module while mlx5 is built-in, which leads to a link error:

drivers/net/built-in.o: In function `mlx5e_create_netdev':
ntb_netdev.c:(.text+0x106de4): undefined reference to `vxlan_get_rx_port'

This avoids the link error and makes the vxlan code optional,
like the other ethernet drivers do as well.

Link: https://patchwork.ozlabs.org/patch/589296/
Fixes: b3f63c3d5e ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 00:21:07 -04:00
Arnd Bergmann 7fd7406d9c Revert "net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue"
This reverts commit 69976fb104.

We cannot select VXLAN when IPv4 support is disabled, that just gives
us additional build errors, including:

warning: (MLX5_CORE_EN) selects VXLAN which has unmet direct dependencies (NETDEVICES && NET_CORE && INET)
In file included from ../drivers/net/vxlan.c:36:0:
include/net/udp_tunnel.h: In function 'udp_tunnel_handle_offloads':
include/net/udp_tunnel.h:112:9: error: implicit declaration of function 'iptunnel_handle_offloads' [-Werror=implicit-function-declaration]
  return iptunnel_handle_offloads(skb, type);
         ^~~~~~~~~~~~~~~~~~~~~~~~

I'm sending a proper fix for the original bug in a separate patch.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-09 00:21:07 -04:00
Jiri Pirko 5113bfdbc6 mlxsw: spectrum: Fix ordering in mlxsw_sp_fini
Fixes: 0f433fa0ec ("mlxsw: spectrum_buffers: Implement shared buffer configuration")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 19:00:00 -04:00
Ido Schimmel 2889286585 mlxsw: spectrum: Add missing rollback in flood configuration
When we fail to set the flooding configuration for the broadcast and
unregistered multicast traffic, we should revert the flooding
configuration of the unknown unicast traffic.

Fixes: 0293038e0c ("mlxsw: spectrum: Add support for flood control")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 18:27:43 -04:00
Ido Schimmel 51554db2d2 mlxsw: spectrum: Fix rollback order in LAG join failure
Make the leave procedure in the error path symmetric to the join
procedure and first remove the port from the collector before
potentially destroying the LAG.

Fixes: 0d65fc1304 ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-06 18:27:42 -04:00
Daniel Jurgens 82d69203df net/mlx4_en: Fix endianness bug in IPV6 csum calculation
Use htons instead of unconditionally byte swapping nexthdr.  On a little
endian systems shifting the byte is correct behavior, but it results in
incorrect csums on big endian architectures.

Fixes: f8c6455bb0 ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Carol Soto <clsoto@us.ibm.com>
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:26:14 -04:00
Haggai Abramovsky 73898db043 net/mlx4: Avoid wrong virtual mappings
The dma_alloc_coherent() function returns a virtual address which can
be used for coherent access to the underlying memory.  On some
architectures, like arm64, undefined behavior results if this memory is
also accessed via virtual mappings that are not coherent.  Because of
their undefined nature, operations like virt_to_page() return garbage
when passed virtual addresses obtained from dma_alloc_coherent().  Any
subsequent mappings via vmap() of the garbage page values are unusable
and result in bad things like bus errors (synchronous aborts in ARM64
speak).

The mlx4 driver contains code that does the equivalent of:
vmap(virt_to_page(dma_alloc_coherent)), this results in an OOPs when the
device is opened.

Prevent Ethernet driver to run this problematic code by forcing it to
allocate contiguous memory. As for the Infiniband driver, at first we
are trying to allocate contiguous memory, but in case of failure roll
back to work with fragmented memory.

Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reported-by: David Daney <david.daney@cavium.com>
Tested-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-05 23:23:05 -04:00
Mohamad Haj Yahia 1edc57e2b3 net/mlx5: E-Switch, Implement trust vf ndo
- Add support to configure trusted vf attribute through trust_vf_ndo.

- Upon VF trust setting change we update vport context to refresh
 allmulti/promisc or any trusted vf attributes that we didn't trust the
 VF for before.

- Lock the eswitch state lock on vport event in order to synchronise the
 vport context updates , this will prevent contention with vport trust
 setting change which will trigger vport mac list update.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:49 -04:00
Mohamad Haj Yahia a35f71f27a net/mlx5: E-Switch, Implement promiscuous rx modes vf request handling
Add promisc_change as a trigger to vport context change event.
Add set vport promisc/allmulti functions to add vport to promiscuous
flowtable rules.
Upon promisc/allmulti rx mode vf request add the vport to
the relevant promiscuous group (Allmulti/Promisc group) so the relevant
traffic will be forwarded to it.
Upon allmulti vf request add the vport to each existing multicast fdb
rule.
Upon adding/removing mcast address from a vport, update all other
allmulti vports.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:48 -04:00
Mohamad Haj Yahia 78a9199b71 net/mlx5: E-Switch, Add promiscuous and allmulti FDB flowtable groups
Add promiscuous and allmulti steering groups in FDB table.
Besides the full match L2 steering rules group, we added
two more groups to catch the "miss" rules traffic:
* Allmulti group: One rule that forwards any mcast traffic coming from
either uplink or VFs/PF vports
* Promisc group: One rule that forwards all unmatched traffic coming
from uplink.

Needed for downstream privileged VF promisc and allmulti support.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:48 -04:00
Mohamad Haj Yahia 586cfa7f1d net/mlx5: E-Switch, Use vport event handler for vport cleanup
Remove the usage of explicit cleanup function and use existing vport
change handler. Calling vport change handler while vport
is disabled will cleanup the vport resources.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:48 -04:00
Mohamad Haj Yahia 01f51f2247 net/mlx5: E-Switch, Enable/disable ACL tables on demand
Enable ingress/egress ACL tables only when we need to configure ACL
rules.
Disable ingress/egress ACL tables once all ACL rules are removed.

All VF outgoing/incoming traffic need to go through the ingress/egress ACL
tables.
Adding/Removing these tables on demand will save unnecessary hops in the
flow steering when the ACL tables are empty.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:48 -04:00
Mohamad Haj Yahia f942380c12 net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk
Configure ingress and egress vport ACL rules according to spoofchk
admin parameters.

Ingress ACL flow table rules:
if (!spoofchk && !vst) allow all traffic.
else :
1) one of the following rules :
* if (spoofchk && vst) allow only untagged traffic with smac=original
mac sent from the VF.
* if (spoofchk && !vst) allow only traffic with smac=original mac sent
from the VF.
* if (!spoofchk && vst) allow only untagged traffic.
2) drop all traffic that didn't hit #1.

Add support for set vf spoofchk ndo.

Add non zero mac validation in case of spoofchk to set mac ndo:
when setting new mac we need to validate that the new mac is
not zero while the spoofchk is on because it is illegal
combination.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:47 -04:00
Mohamad Haj Yahia dfcb1ed3c3 net/mlx5: E-Switch, Vport ingress/egress ACLs rules for VST mode
Configure ingress and egress vport ACL rules according to
vlan and qos admin parameters.

Ingress ACL flow table rules:
1) drop any tagged packet sent from the VF
2) allow other traffic (default behavior)

Egress ACL flow table rules:
1) allow only tagged traffic with vlan_tag=vst_vid.
2) drop other traffic.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:47 -04:00
Mohamad Haj Yahia 5742df0f7d net/mlx5: E-Switch, Introduce VST vport ingress/egress ACLs
Create egress/ingress ACLs per VF vport at vport enable.

Ingress ACL:
	- one flow group to drop all tagged traffic in VST mode.

Egress ACL:
	- one flow group that allows only untagged traffic with
          smac that is equals to the original mac (anti-spoofing).
        - one flow group that allows only untagged traffic.
        - one flow group that allows only  smac that is equals
          to the original mac (anti-spoofing).
        (note: only one of the above group has active rule)
	- star rule will be used to drop all other traffic.

By default no rules are generated, unless VST is explicitly requested.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:47 -04:00
Mohamad Haj Yahia 761e205b55 net/mlx5: E-Switch, Fix error flow memory leak
Fix memory leak in case query nic vport command failed.

Fixes: 81848731ff ('net/mlx5: E-Switch, Add SR-IOV (FDB) support')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:47 -04:00
Mohamad Haj Yahia 831cae1dae net/mlx5: E-Switch, Replace vport spin lock with synchronize_irq()
Vport spin lock can be replaced with synchronize_irq() in the right
place, this will remove the need of locking inside irq context.
Locking in esw_enable_vport is not required since vport events are yet
to be enabled, and at esw_disable_vport it is sufficient to
synchronize_irq() to guarantee no further vport events handlers will be
scheduled.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:47 -04:00
Mohamad Haj Yahia efdc810ba3 net/mlx5: Flow steering, Add vport ACL support
Update the relevant flow steering device structs and commands to
support vport.
Update the flow steering core API to receive vport number.
Add ingress and egress ACL flow table name spaces.
Add ACL flow table support:
* ACL (Access Control List) flow table is a table that contains
only allow/drop steering rules.

* We have two types of ACL flow tables - ingress and egress.

* ACLs handle traffic sent from/to E-Switch FDB table, Ingress refers to
traffic sent from Vport to E-Switch and Egress refers to traffic sent
from E-Switch to vport.

* Ingress ACL flow table allow/drop rules is checked against traffic
sent from VF.

* Egress ACL flow table allow/drop rules is checked against traffic sent
to VF.

Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:46 -04:00
Maor Gottlieb fbc4a69b56 net/mlx5e: Fix aRFS compilation dependency
en_arfs.o should be compiled only if both CONFIG_MLX5_CORE_EN
and CONFIG_RFS_ACCEL are enabled. en_arfs calls to rps_may_expire_flow
which is compiled only if CONFIG_RFS_ACCEL is defined.

Move en_arfs.o compilation dependency to be under CONFIG_MLX5_CORE_EN
and wrap the en_arfs.c content with ifdef of CONFIG_RFS_ACCEL.

Fixes: 1cabe6b096 ('net/mlx5e: Create aRFS flow tables')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 14:04:46 -04:00
Alexander Duyck f3ed653cd4 net/mlx5e: Fix IPv6 tunnel checksum offload
The mlx5 driver exposes support for TSO6 but not IPv6 csum for hardware
encapsulated tunnels.  This leads to issues as it triggers warnings in
skb_checksum_help as it ends up being called as we report supporting the
segmentation but not the checksumming for IPv6 frames.

This patch corrects that and drops 2 features that don't actually need to
be supported in hw_enc_features since they are Rx features and don't
actually impact anything by being present in hw_enc_features.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 13:32:28 -04:00
Alexander Duyck b49663c8fb net/mlx5e: Add support for UDP tunnel segmentation with outer checksum offload
This patch assumes that the mlx5 hardware will ignore existing IPv4/v6
header fields for length and checksum as well as the length and checksum
fields for outer UDP headers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 13:32:28 -04:00
Alexander Duyck 09067122db net/mlx4_en: Add support for inner IPv6 checksum offloads and TSO
>From what I can tell the ConnectX-3 will support an inner IPv6 checksum and
segmentation offload, however it cannot support outer IPv6 headers.  This
assumption is based on the fact that I could see the checksum being
offloaded for inner header on IPv4 tunnels, but not on IPv6 tunnels.

For this reason I am adding the feature to the hw_enc_features and adding
an extra check to the features_check call that will disable GSO and
checksum offload in the case that the encapsulated frame has an outer IP
version of that is not 4.  The check in mlx4_en_features_check could be
removed if at some point in the future a fix is found that allows the
hardware to offload segmentation/checksum on tunnels with an outer IPv6
header.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 13:32:27 -04:00
Alexander Duyck 3c9346b240 net/mlx4_en: Add support for UDP tunnel segmentation with outer checksum offload
This patch assumes that the mlx4 hardware will ignore existing IPv4/v6
header fields for length and checksum as well as the length and checksum
fields for outer UDP headers.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 13:32:27 -04:00
David S. Miller cba6532100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/ip_gre.c

Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 00:52:29 -04:00
Matthew Finlay d8cf2dda3d net/mlx5e: Use workqueue for vxlan ops
The vxlan add/delete port NDOs are called under rcu lock.
The current mlx5e implementation can potentially block in these
calls, which is not allowed.  Move to using the mlx5e workqueue
to handle these NDOs.

Fixes: b3f63c3d5e ('net/mlx5e: Add netdev support for VXLAN tunneling')
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 13:37:26 -04:00
Matthew Finlay 7bb2975599 net/mlx5e: Implement a mlx5e workqueue
Implement a mlx5e workqueue to handle all mlx5e specific tasks.  Move
all tasks currently using the system workqueue to the new workqueue.
This is in preparation for vxlan using the mlx5e workqueue in order to
schedule port add/remove operations.

Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 13:37:26 -04:00
Matthew Finlay 69976fb104 net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
When MLX5_EN=y MLX5_CORE=y and VXLAN=m there is a linker error for
vxlan_get_rx_port() due to the fact that VXLAN is a module. Change Kconfig
to select VXLAN when MLX5_CORE=y. When MLX5_CORE=m there is no dependency
on the value of VXLAN.

Fixes: b3f63c3d5e ('net/mlx5e: Add netdev support for VXLAN tunneling')
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 13:37:26 -04:00
Gal Pressman 5f8a02a441 net/mlx5: Unmap only the relevant IO memory mapping
When freeing UAR the driver tries to unmap uar->map and uar->bf_map
which are mutually exclusive thus always unmapping a NULL pointer.
Make sure we only call iounmap() once, for the actual mapping.

Fixes: 0ba422410b ('net/mlx5: Fix global UAR mapping')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reported-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-03 13:37:25 -04:00
Maor Gottlieb 45bf454ae8 net/mlx5e: Enabling aRFS mechanism
Accelerated RFS requires that ntuple filtering is enabled via
ethtool and driver supports ndo_rx_flow_steer.
When the ntuple filtering is enabled, we modify the l3_l4 ttc
rules to point on the aRFS flow tables and when the filtering
is disabled, we modify the l3_l4 ttc rules to point on the RSS
TIRs.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:12 -04:00
Maor Gottlieb 18c908e477 net/mlx5e: Add accelerated RFS support
Implement ndo_rx_flow_steer ndo.
A new flow steering rule will be composed from the
skb 4-tuple and added to the hardware aRFS flow table.

Each rule is stored in an internal hash table, if such
skb 4-tuple rule already exists we update the corresponding
hardware steering rule with the new destination.

For garbage collection rps_may_expire_flow will be
invoked for a limited amount of old rules upon any
ndo_rx_flow_steer invocation.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:11 -04:00
Maor Gottlieb 1cabe6b096 net/mlx5e: Create aRFS flow tables
Create the following four flow tables for aRFS usage:
1. IPv4 TCP - filtering 4-tuple of IPv4 TCP packets.
2. IPv6 TCP - filtering 4-tuple of IPv6 TCP packets.
3. IPv4 UDP - filtering 4-tuple of IPv4 UDP packets.
4. IPv6 UDP - filtering 4-tuple of IPv6 UDP packets.

Each flow table has two flow groups: one for the 4-tuple
filtering (full match)  and the other contains * rule for miss rule.

Full match rule means a hit for aRFS and packet will be forwarded
to the dedicated RQ/Core, miss rule packets will be forwarded to
default RSS hashing.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:11 -04:00
Maor Gottlieb 5a7b27eb9c net/mlx5: Initializing CPU reverse mapping
Allocating CPU rmap and add entry for each IRQ.
CPU rmap is used in aRFS to get the RX queue number
of the RX completion interrupts.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:11 -04:00
Maor Gottlieb 33cfaaa8f3 net/mlx5e: Split the main flow steering table
Currently, the main flow table is used for two purposes:
One is to do mac filtering and the other is to classify
the packet l3-l4 header in order to steer the packet to
the right RSS TIR.

This design is very complex, for each configured mac address we
have to add eleven rules (rule for each traffic type), the same if the
device is put to promiscuous/allmulti mode.
This scheme isn't scalable for future features like aRFS.

In order to simplify it, the main flow table is split to two flow
tables:
1. l2 table - filter the packet dmac address, if there is a match
we forward to the ttc flow table.

2. TTC (Traffic Type Classifier) table - classify the traffic
type of the packet and steer the packet to the right TIR.

In this new design, when new mac address is added, the driver adds
only one flow rule instead of eleven.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:11 -04:00
Maor Gottlieb acff797cd1 net/mlx5e: Refactor mlx5e flow steering structs
Slightly refactor and re-order the flow steering structs,
tables and data-bases for better self-containment and
flexibility to add more future steering phases
(tables/rules/data bases) e.g: aRFS.

Changes:
1. Move the vlan DB and address DB into their table structs.
2. Rename steering table structs to unique format: mlx5e_*_table,
e.g: mlx5e_vlan_table.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:10 -04:00
Maor Gottlieb 13de6c106c net/mlx5: Support different attributes for priorities in namespace
Currently, namespace could be initialized only
with priorities with the same attributes.
Add support to initialize namespace with priorities
with different attributes(e.g. different number of levels).

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:10 -04:00
Maor Gottlieb d63cd28608 net/mlx5: Add user chosen levels when allocating flow tables
Currently, consumers of the flow steering infrastructure can't
choose their own flow table levels and are limited to one
flow table per level. This just waste levels.
Instead, we introduce here the possibility to use multiple
flow tables in a level. The user is free to connect these
flow tables, while following the rule (FTEs in FT of level x
could only point to FTs of level y where y > x).

In addition this patch switch the order of the create/destroy
flow tables of the NIC(vlan and main).

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:09 -04:00
Maor Gottlieb a257b94a18 net/mlx5: Set number of allowed levels in priority
Refactors the flow steering namespace creation,
by changing the name num_fts to num_levels.
When new flow table is created, the driver assign new level
to this flow table therefore the meaning is equivalent.
Since downstream patches will introduce the ability to create more
than one flow table per level, the name num_fts is no
longer accurate.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:09 -04:00
Maor Gottlieb d745098ced net/mlx5: Introduce modify flow rule destination
This API is used for modifying the flow rule destination.
This is needed for modifying the pointed flow table by the
traffic type classifier rules to point on the aRFS tables.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:08 -04:00
Tariq Toukan 1da366964e net/mlx5e: Direct TIR per RQ
Introduce new TIRs for direct access per RQ.
Now we have 2 available kinds of TIRs:
	- indirect TIR per traffic type, each points to one RQT (RSS RQT)
          same as before.
	- New direct TIR per RQ, each points to RQT with a size of one
          that forwards packets to that RQ only.

Driver will open max channels (num cores) direct TIRs by default,
they will be filled with the actual RQs once channels are allocated.

Needed for downstream aRFS and ethtool direct steering functionalities.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:08 -04:00
Matthew Finlay 01a14098d3 net/mlx5e: Call vxlan_get_rx_port() with rtnl lock
Hold the rtnl lock when calling vxlan_get_rx_port().

Fixes: b7aade1548 ("vxlan: break dependency with netdev drivers")
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:08 -04:00
Arnd Bergmann 6b87663fbe net/mlx5e: avoid stack overflow in mlx5e_open_channels
struct mlx5e_channel_param is a large structure that is allocated
on the stack of mlx5e_open_channels, and with a recent change
it has grown beyond the warning size for the maximum stack
that a single function should use:

mellanox/mlx5/core/en_main.c: In function 'mlx5e_open_channels':
mellanox/mlx5/core/en_main.c:1325:1: error: the frame size of 1072 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

The function is already using dynamic allocation and is not in
a fast path, so the easiest workaround is to use another kzalloc
for allocating the channel parameters.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: d3c9bc2743 ("net/mlx5e: Added ICO SQs")
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-28 16:46:59 -04:00
Chris Wilson d8dab00de9 io-mapping: Specify mapping size for io_mapping_map_wc()
The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-3-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Masanari Iida c01e01597c treewide: Fix typos in printk
This patch fix spelling typos in printk from various part
of the codes.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-28 10:52:28 +02:00
David S. Miller c0cc53162a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor overlapping changes in the conflicts.

In the macsec case, the change of the default ID macro
name overlapped with the 64-bit netlink attribute alignment
fixes in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-27 15:43:10 -04:00
Saeed Mahameed 1b223dd391 net/mlx5e: Fix checksum handling for non-stripped vlan packets
Now as rx-vlan offload can be disabled, packets can be received
with vlan tag not stripped, which means is_first_ethertype_ip will
return false, for that we need to check if the hardware reported
csum OK so we will report CHECKSUM_UNNECESSARY for those packets.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:03 -04:00
Gal Pressman 363501145e net/mlx5e: Add ethtool support for rxvlan-offload (vlan stripping)
Use ethtool -K <interface> rxvlan <on/off> to enable/disable
C-TAG vlan stripping by hardware.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:02 -04:00
Gal Pressman bb64143eee net/mlx5e: Add ethtool support for dump module EEPROM
Add query MCIA, PMLP registers infrastructure and commands.
Add ethtool support for get_module_info() and get_module_eeprom()
callbacks.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:02 -04:00
Gal Pressman da54d24ec3 net/mlx5e: Add ethtool support for interface identify (LED blinking)
Add the needed hardware command and mlx5_ifc structs for managing LED
control.
Add set_phys_id ethtool callback to support ethtool -p flag.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:02 -04:00
Eran Ben Elisha 94cb1ebbaf net/mlx5e: Add support for RXALL netdev feature
Introduce new access register named Ports Check Mask Register (PCMR) to
control all HW checks on port. With this register, the driver can
enable/disable Hardware FCS validation.

When RXALL is enabled/disabled using ndo_set_features, enable/disable
fcs check at HW.
User can change HW configuration using rx-all flag at ethtool.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:02 -04:00
Gal Pressman 0e405443e8 net/mlx5e: Improve set features ndo resiliency
In current mlx5e ndo_set_features implementation, setting some features
can success while others can fail. Today, we return one error code which
doesn't reflect the current features status of the netdev at the end of
the ndo callback.

Set netdev->features with features which were successfully set in order
to keep the current status in case of failure. For this purpose, define
new Macro to set/unset specific feature in netdev->features.

This patch introduces a mechanism that uses feature handlers for each
feature.
Set features will call a generic handler, which will then call a specific
handler in his turn and update netdev->features according to it's return
value. Each specific handler is responsible to perform driver specific
actions, and updating params if needed.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:01 -04:00
Gal Pressman 121fcdc84d net/mlx5e: Add link down events counter
Expose link_down_events counter through ethtool -S.
This counter is read from PPort statistics, then proccessed and stored as
a special handling software counter.
This counter is stored along software counters since it is the only PPort
counter that it's size is not 64 bits.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:01 -04:00
Gal Pressman cf678570d5 net/mlx5e: Add per priority group to PPort counters
Expose counters providing information for each priority level (PCP) through
ethtool -S option and DCBNL.
This includes rx/tx bytes, frames, and pause counters.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:01 -04:00
Gal Pressman 8075cb7238 net/mlx5e: Rename VPort counters
VPort and software counters names are confusing and may be unclear, all
VPort counters now have a prefix of rx/tx_vport_*.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:01 -04:00
Gal Pressman 9218b44dcc net/mlx5e: Statistics handling refactoring
Redesign ethtool statistics handling and reporting in the driver:
1. Move counters to a separate file (en_stats.h).
2. Remove unnecessary dependencies between stats and strings.
3. Use counter descriptors which hold a name and offset for each counter,
   and will be used to decide which counters will be exposed.

For example when adding a new software counter to ethtool, instead of:
1. Add to stats struct.
2. Add to strings struct in the same order.
3. Change macro defining number of software counters.
The only thing needed is to link the new counter to a counter descriptor.

VPort counters are a set of hardware traffic counters created automatically
for each virtual port opened.
PPort counters are a set of counters describing per physical port
performance statistics.
These counters are gathered from hardware register and divided to groups
according to different protocols.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:01 -04:00
Gal Pressman 269e6b3af3 net/mlx5e: Report additional error statistics in get stats ndo
Provide rtnl_link_stats64 with information regarding physical errors to be
seen in ifconfig and ip tool.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 15:58:00 -04:00
Eric Dumazet fc96256c90 net/mlx4_en: fix spurious timestamping callbacks
When multiple skb are TX-completed in a row, we might incorrectly keep
a timestamp of a prior skb and cause extra work.

Fixes: ec693d4701 ("net/mlx4_en: Add HW timestamping (TS) support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-26 01:13:18 -04:00
Majd Dibbiny 5fc7197d3a net/mlx5: Add pci shutdown callback
This patch introduces kexec support for mlx5.
When switching kernels, kexec() calls shutdown, which unloads
the driver and cleans its resources.

In addition, remove unregister netdev from shutdown flow. This will
allow a clean shutdown, even if some netdev clients did not release their
reference from this netdev. Releasing The HW resources only is enough as
the kernel is shutting down

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Eli Cohen 78228cbdeb net/mlx5_core: Remove static from local variable
The static is not required and breaks re-entrancy if it will be required.

Fixes: 2530236303 ("net/mlx5_core: Flow steering tree initialization")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Saeed Mahameed cd255efff9 net/mlx5e: Use vport MTU rather than physical port MTU
Set and report vport MTU rather than physical MTU,
Driver will set both vport and physical port mtu and will
rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Saeed Mahameed d8edd2469a net/mlx5e: Fix minimum MTU
Minimum MTU that can be set in Connectx4 device is 68.

This fixes the case where a user wants to set invalid MTU,
the driver will fail to satisfy this request and the interface
will stay down.

It is better to report an error and continue working with old
mtu.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:39 -04:00
Saeed Mahameed 046339eaab net/mlx5e: Device's mtu field is u16 and not int
For set/query MTU port firmware commands the MTU field
is 16 bits, here I changed all the "int mtu" parameters
of the functions wrapping those firmware commands to be u16.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
Majd Dibbiny 64dbbdfef2 net/mlx5_core: Add ConnectX-5 to list of supported devices
Add the upcoming ConnectX-5 devices (PF and VF) to the list of
supported devices by the mlx5 driver.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
Rana Shahout 6e4c218946 net/mlx5e: Fix MLX5E_100BASE_T define
Bit 25 of eth_proto_capability in PTYS register is
1000Base-TT and not 100Base-T.

Fixes: f62b8bb8f2 ('net/mlx5: Extend mlx5_core to
support ConnectX-4 Ethernet functionality')
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
Maor Gottlieb c3f9bf628b net/mlx5_core: Fix soft lockup in steering error flow
In the error flow of adding flow rule to auto-grouped flow
table, we call to tree_remove_node.

tree_remove_node locks the node's parent, however the node's parent
is already locked by mlx5_add_flow_rule and this causes a deadlock.
After this patch, if we failed to add the flow rule, we unlock the
flow table before calling to tree_remove_node.

fixes: f0d22d1874 ('net/mlx5_core: Introduce flow steering autogrouped
flow table')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reported-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
David S. Miller 1602f49b58 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were two cases of simple overlapping changes,
nothing serious.

In the UDP case, we need to add a hlist_add_tail_rcu()
to linux/rculist.h, because we've moved UDP socket handling
away from using nulls lists.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23 18:51:33 -04:00
Hannes Frederic Sowa 0c5c3252c4 mlx4: protect mlx4_en_start_port in mlx4_en_restart with rtnl_lock
mlx4_en_start_port requires rtnl_lock to be held.

Cc: Eugenia Emantayev <eugenia@mellanox.com>
Cc: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:35:43 -04:00
Tariq Toukan 5498440756 net/mlx5e: Add ethtool counter for RX buffer allocation failures
Counts the number of RX buffer allocation failures and shows it
in ethtool statistics.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:06 -04:00
Saeed Mahameed e20a0db304 net/mlx5e: Delay skb->data access
Move mlx5e_handle_csum and eth_type_trans to the end of
mlx5e_build_rx_skb to gain some more time before accessing
skb->data, to reduce cache misses.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:06 -04:00
Tariq Toukan 1bfec31627 net/mlx5e: Remove redundant barrier
The bit-op operation one line before is an explicit barrier
by itself.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan c5adb96f6c net/mlx5e: Use napi_alloc_skb for RX SKB allocations
Instead of netdev_alloc_skb, we use the napi_alloc_skb function
which is designated to allocate skbuff's for RX in a
channel-specific NAPI instance, and implies the IP packet alignment.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan bc77b240b3 net/mlx5e: Add fragmented memory support for RX multi packet WQE
If the allocation of a linear (physically continuous) MPWQE fails,
we allocate a fragmented MPWQE.

This is implemented via device's UMR (User Memory Registration)
which allows to register multiple memory fragments into ConnectX
hardware as a continuous buffer.
UMR registration is an asynchronous operation and is done via
ICO SQs.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan d3c9bc2743 net/mlx5e: Added ICO SQs
Added ICO (Internal Control Operations) SQ per channel to be used
for driver internal operations such as memory registration for
fragmented memory and nop requests upon ifconfig up.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan 461017cb00 net/mlx5e: Support RX multi-packet WQE (Striding RQ)
Introduce the feature of multi-packet WQE (RX Work Queue Element)
referred to as (MPWQE or Striding RQ), in which WQEs are larger
and serve multiple packets each.

Every WQE consists of many strides of the same size, every received
packet is aligned to a beginning of a stride and is written to
consecutive strides within a WQE.

In the regular approach, each regular WQE is big enough to be capable
of serving one received packet of any size up to MTU or 64K in case of
device LRO is enabled, making it very wasteful when dealing with
small packets or device LRO is enabled.

For its flexibility, MPWQE allows a better memory utilization
(implying improvements in CPU utilization and packet rate) as packets
consume strides according to their size, preserving the rest of
the WQE to be available for other packets.

MPWQE default configuration:
	Num of WQEs	= 16
	Strides Per WQE = 2048
	Stride Size	= 64 byte

The default WQEs memory footprint went from 1024*mtu (~1.5MB) to
16 * 2048 * 64 = 2MB per ring.
However, HW LRO can now be supported at no additional cost in memory
footprint, and hence we turn it on by default and get an even better
performance.

Performance tested on ConnectX4-Lx 50G.
To isolate the feature under test, the numbers below were measured with
HW LRO turned off. We verified that the performance just improves when
LRO is turned back on.

* Netperf single TCP stream:
- BW raised by 10-15% for representative packet sizes:
  default, 64B, 1024B, 1478B, 65536B.

* Netperf multi TCP stream:
- No degradation, line rate reached.

* Pktgen: packet rate raised by 2-10% for traffic of different message
sizes: 64B, 128B, 256B, 1024B, and 1500B.

* Pktgen: packet loss in bursts of small messages (64byte),
single stream:
- | num packets | packets loss before | packets loss after
  |     2K      |       ~ 1K          |       0
  |     8K      |       ~ 6K          |       0
  |     16K     |       ~13K          |       0
  |     32K     |       ~28K          |       0
  |     64K     |       ~57K          |     ~24K

As expected as the driver can receive as many small packets (<=64B) as
the number of total strides in the ring (default = 2048 * 16) vs. 1024
(default ring size regardless of packets size) before this feature.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:05 -04:00
Tariq Toukan 2f48af128d net/mlx5e: Use function pointers for RX data path handling
In preparation for Striding RQ feature, which will need its own
RX handlers.
This patch does not change any functionality.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:04 -04:00
Tariq Toukan d8c9660dac net/mlx5e: Use only close NUMA node for default RSS
Distribute default RSS table uniformly over the rings of the
close NUMA node, instead of all available channels.
This way we enforce the preference of close rings over far ones.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:04 -04:00
Rana Shahout 593cf33829 net/mlx5e: Allocate set of queue counters per netdev
Connect all netdev RQs to this set of queue counters.
Also, add an "rx_out_of_buffer" counter to ethtool,
which indicates RX packet drops due to lack of receive
buffers.

Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:04 -04:00
Tariq Toukan 237cd21809 net/mlx5: Introduce device queue counters
A queue counter can collect several statistics for one or more
hardware queues (QPs, RQs, etc ..) that the counter is attached to.

For Ethernet it will provide an "out of buffer" counter which
collects the number of all packets that are dropped due to lack
of software buffers.

Here we add device commands to alloc/query/dealloc queue counters.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:09:04 -04:00
Eran Ben Elisha d21ed3a311 net/mlx4_en: Split SW RX dropped counter per RX ring
Count SW packet drops per RX ring instead of a global counter. This
will allow monitoring the number of rx drops per ring.

In addition, SW rx_dropped counter was overwritten by HW rx_dropped
counter, sum both of them instead to show the accurate value.

Fixes: a3333b35da ('net/mlx4_en: Moderate ethtool callback to [...] ')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:02:40 -04:00
Eugenia Emantayev 2a500090a4 net/mlx4_core: Don't allow to VF change global pause settings
Currently changing global pause settings is done via SET_PORT
command with input modifier GENERAL. This command is allowed
for each VF since MTU setting is done via the same command.

Change the above to the following scheme: before passing the
request to the FW, the PF will check whether it was issued
by a slave. If yes, don't change global pause and warn,
otherwise change to the requested value and store for
further reference.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:02:40 -04:00
Daniel Jurgens 4bfd2e6e53 net/mlx4_core: Avoid repeated calls to pci enable/disable
Maintain the PCI status and provide wrappers for enabling and disabling
the PCI device.  Performing the actions more than once without doing
its opposite results in warning logs.

This occurred when EEH hotplugged the device causing a warning for
disabling an already disabled device.

Fixes: 2ba5fbd62b ('net/mlx4_core: Handle AER flow properly')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:02:40 -04:00
Daniel Jurgens c12833acff net/mlx4_core: Implement pci_resume callback
Move resume related activities to a new pci_resume function instead of
performing them in mlx4_pci_slot_reset.  This change is needed to avoid
a hotplug during EEH recovery due to commit f2da4ccf8b ("powerpc/eeh:
More relaxed hotplug criterion").

Fixes: 2ba5fbd62b ('net/mlx4_core: Handle AER flow properly')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-21 15:02:39 -04:00
Konstantin Khlebnikov 851b10d608 net/mlx4_en: do batched put_page using atomic_sub
This patch fixes couple error paths after allocation failures.
Atomic set of page reference counter is safe only if it is zero,
otherwise set can race with any speculative get_page_unless_zero.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-19 20:04:24 -04:00
Konstantin Khlebnikov 04aeb56a17 net/mlx4_en: allocate non 0-order pages for RX ring with __GFP_NOMEMALLOC
High order pages are optional here since commit 51151a16a6 ("mlx4: allow
order-0 memory allocations in RX path"), so here is no reason for depleting
reserves. Generic __netdev_alloc_frag() implements the same logic.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-19 20:04:24 -04:00
Masanari Iida c19ca6cb4c treewide: Fix typos in printk
This patch fix spelling typos found in printk
within various part of the kernel sources.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2016-04-18 11:23:24 +02:00
Jiri Pirko b94cdabbf1 mlxsw: spectrum_buffers: Use MLXSW_SP_PB_UNUSED define for unused pb
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 13:02:43 -04:00
Jiri Pirko ce78f02042 mlxsw: spectrum_buffers: Use designated initializers for mlxsw_sp_pbs
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 13:02:42 -04:00
Jiri Pirko 2d0ed39fbd mlxsw: spectrum_buffers: Implement occupancy monitoring
Implement occupancy API introduced in devlink and mlxsw core. This is
done by accessing SBPM register for Port-Pool and SBSR for Port-TC
current and max occupancy values. Max clear is implemented using the
same registers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:06 -04:00
Jiri Pirko caf7297e7a mlxsw: core: Introduce support for asynchronous EMAD register access
So far it was possible to have one EMAD register access at a time,
locked by mutex. This patch extends this interface to allow multiple
EMAD register accesses to be in fly at once. That allows faster
processing on firmware side avoiding unused time in between EMADs.
Measured speedup is ~30% for shared occupancy snapshot operation.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:06 -04:00
Jiri Pirko dd9bdb04d2 mlxsw: core: Add mlxsw specific workqueue and use it for FDB notif. processing
Follow-up patch is going to need to use delayed work as well and
frequently. The FDB notification processing is already using that and
also quite frequently. It makes sense to create separate workqueue just
for mlxsw driver in this case and do not pollute system_wq.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:06 -04:00
Jiri Pirko 42a7f1d774 mlxsw: reg: Extend SBPM register for occupancy control
Since it is not possible to get and clear Port-Pool occupancy data using
SBSR register, there's a need to implement that using SBPM.
Extend pack helper and add unpack helper to get occupancy values.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:06 -04:00
Jiri Pirko 26176def3c mlxsw: reg: Add Shared Buffer Status register definition
This register allows to query HW for current and maximal buffer usage.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:05 -04:00
Jiri Pirko 1ceecc88d2 mlxsw: core: Add devlink shared buffer occupancy callbacks
Add middle layer in mlxsw core code to forward shared buffer occupancy
calls into specific ASIC drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:05 -04:00
Jiri Pirko 0f433fa0ec mlxsw: spectrum_buffers: Implement shared buffer configuration
Implement previously introduced mlxsw core shared buffer API.
For Spectrum, that is done utilizing registers SBPR, SBCM and SBPM.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:05 -04:00
Jiri Pirko 325f2f197d mlxsw: core: Add mlxsw_core_port_driver_priv helper
Needed in following patch.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:05 -04:00
Jiri Pirko c30a53c7de mlxsw: spectrum_buffers: Get max_buff defaults into limits exposed to user
Although the device supports max_buff magic values 0 and 0xff, these are
not exposed to the user via devlink.
Therefore, adjust the default values to be within configurable range.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:05 -04:00
Jiri Pirko bc872506f5 mlxsw: spectrum_buffers: Change initialization of PG 9
As explained in commit ff6551ec0c ("mlxsw: spectrum: Correctly
configure headroom size") control packets are directed to priority group
buffer 9 (PG9) in the ports' headroom buffers.

Since we don't want to drop control packets in case they can't be
admitted to the switch's shared buffer we bind PG9 to a different
ingress pool from the one used by all other PGs.

Unlike other PGs, we currently don't expose the binding between PG9 to a
pool and leave it fixed.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:04 -04:00
Jiri Pirko 5408f7cba3 mlxsw: spectrum_buffers: Remove eg pool 3 default init and CPU port TC binding to it
Since there is no congestion control for CPU port traffic, we can change
the CPU port TC binding to pool 0 with min_buff and max_buff zeroed.
Remove initialization for pool egress pool 3 since it is no longer used
by dafault.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:04 -04:00
Jiri Pirko 078f9c7132 mlxsw: spectrum_buffers: Cache shared buffer configuration
In order to achieve faster dumping of current setting and also in order
to provide possibility to get pool mode without a need to query hardware,
do cache the configuration in driver.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:04 -04:00
Jiri Pirko aa99bc70ba mlxsw: spectrum_buffers: Rename "pool" to "pr" in initialization
Be consintent with rest of the registers (pm, cm) and use "pr" here.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:04 -04:00
Jiri Pirko b11c3b4018 mlxsw: spectrum_buffers: Push out indexes and direction out of SB structs
Structs are in arrays so use array index as pool/tc/prio index. With
that, there is need to maintain separate arrays for ingress and egress.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:04 -04:00
Jiri Pirko 94266e3278 mlxsw: spectrum_buffers: Push out shared buffer register writes
Pushed them into helper functions.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:03 -04:00
Jiri Pirko a6179bf0d1 mlxsw: core: Add devlink shared buffer callbacks
Add middle layer in mlxsw core code to forward shared buffer calls
into specific ASIC drivers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:22:03 -04:00
Jiri Pirko 9efc8f655c mlxsw: reg: Fix SBPM register name
Fix copy&paste error and state the name of SBPM register correctly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:43 -04:00
Jiri Pirko 497e8592c6 mlxsw: reg: Share direction enum between SBPR, SBCM, SBPM
Same field, same values, so share the same enum.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:43 -04:00
Jiri Pirko b2f10571b9 mlxsw: Do not pass around driver_priv directly
Instead of that, pass mlxsw_core and use a helper to get driver priv
from driver code. Looks much cleaner that way.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:42 -04:00
Jiri Pirko 307c2431ab mlxsw: Pass mlxsw_core as a param of mlxsw_core_skb_transmit*
Instead of passing around driver priv, pass struct mlxsw_core *
directly.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:42 -04:00
Jiri Pirko 932762b69a mlxsw: Move devlink port registration into common core code
Remove devlink port reg/unreg from spectrum and switchx2 code and rather
do the common work in core. That also ensures code separation where
devlink is only used in core.c.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-08 15:38:42 -04:00
Ido Schimmel d81a6bdb87 mlxsw: spectrum: Add IEEE 802.1Qbb PFC support
Implement the appropriate DCB ops and allow a user to configure certain
traffic classes as lossless.

The operation configures PFC for both the egress (respecting PFC frames)
and ingress (sending PFC frames) parts of the port.

At egress, when a PFC frame is received for a PFC enabled priority, then
all the priorities mapped to the same TC are stopped.

At ingress, the priority group (PG) buffers to which the enabled PFC
priorities are mapped are configured to be lossless. PFC frames will be
transmitted when the Xoff threshold is crossed.

The user-supplied delay parameter is used to determine the PG's size
according to the following formula:

PG_SIZE = PG_SIZE_LOSSY + delay * CELL_FACTOR + MTU

In the worst case scenario the delay will be made up of packets that
are all of size CELL_SIZE + 1, which means each packet will require
almost twice its true size when buffered in the switch. We therefore
multiply this value by the "cell factor", which is close to 2.

Another MTU is added in case the transmitting host already started
transmitting a maximum length frame when the PFC packet was received.

As with PAUSE enabled ports, when the port's MTU is changed both the
PGs' size and threshold are adjusted accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:20 -04:00
Ido Schimmel 34dba0a59d mlxsw: reg: Introduce per priority counters
We are going to add support for PFC as part of DCB ops, which requires us
to report the number of PFC frames sent and received per priority.

Add per priority counters in order to report number of PFC frames sent
and received per priority.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:20 -04:00
Ido Schimmel 9f7ec052b7 mlxsw: spectrum: Add support for PAUSE frames
When a packet ingress the switch it's placed in its assigned priority
group (PG) buffer in the port's headroom buffer while it goes through
the switch's pipeline. After going through the pipeline - which
determines its egress port(s) and traffic class - it's moved to the
switch's shared buffer awaiting transmission.

However, some packets are not eligible to enter the shared buffer due to
exceeded quotas or insufficient space. Marking their associated PGs as
lossless will cause the packets to accumulate in the PG buffer. Another
reason for packets accumulation are complicated pipelines (e.g.
involving a lot of ACLs).

To prevent packets from being dropped a user can enable PAUSE frames on
the port. This will mark all the active PGs as lossless and set their
size according to the maximum delay, as it's not configured by user.

                         +----------------+   +
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   | Delay
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
                         |                |   |
    Xon/Xoff threshold   +----------------+   +
                         |                |   |
                         |                |   | 2 * MTU
                         |                |   |
                         +----------------+   +

The delay (612 [Cells]) was calculated according to worst-case scenario
involving maximum MTU and 100m cables.

After marking the PGs as lossless the device is configured to respect
incoming PAUSE frames (Rx PAUSE) and generate PAUSE frames (Tx PAUSE)
according to user's settings.

Whenever the port's headroom configuration changes we take into account
the PAUSE configuration, so that we correctly set the PG's type (lossy /
lossless), size and threshold. This can happen when:

a) The port's MTU changes, as it directly affects the PG's size.

b) A PG is created following user configuration, by binding a priority
to it.

Note that the relevant SUPPORTED flags were already mistakenly set by
the driver before this commit.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:19 -04:00
Ido Schimmel 155f9de2e0 mlxsw: reg: Add lossless settings for PBMC register
When configuring PAUSE frames and PFC we'll need to configure the
Xon/Xoff threshold for the priority group (PG) buffers.

Add the Xon/Xoff threshold fields to the PBMC register so that we can
configure these when needed.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:19 -04:00
Ido Schimmel 6f253d8381 mlxsw: reg: Add Port Flow Control Configuration register
Add the Port Flow Control Configuration (PFCC) register, which
configures both flow control and Priority-based Flow Control (PFC).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:19 -04:00
Ido Schimmel cc7cf51758 mlxsw: spectrum: Allow setting maximum rate for a TC
Allow a user to set maximum rate for a particular TC using DCB ops.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:19 -04:00
Ido Schimmel 8e8dfe9fdf mlxsw: spectrum: Add IEEE 802.1Qaz ETS support
Implement the appropriate DCB ops and allow a user to configure:
	* Priority to traffic class (TC) mapping with a total of 8
	  supported TCs
	* Transmission selection algorithm (TSA) for each TC and the
	  corresponding weights in case of weighted round robin (WRR)

As previously explained, we treat the priority group (PG) buffer in the
port's headroom as the ingress counterpart of the egress TC. Therefore,
when a certain priority to TC mapping is configured, we also configure
the port's headroom buffer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:18 -04:00
Ido Schimmel f00817df2b mlxsw: spectrum: Introduce support for Data Center Bridging (DCB)
Introduce basic infrastructure for DCB and add the missing ops in
following patches.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:18 -04:00
Ido Schimmel 90183b980d mlxsw: spectrum: Initialize egress scheduling
Before introducing support for DCB ops we should first make sure we
initialize the relevant parts in the device correctly. Specifically, the
egress scheduling.

The device supports a superset of the 802.1Qaz standard with 4 hierarchy
levels that can be linked to each other in multiple ways and with
different transmission selection algorithms (TSA) employed between them.

However, since we only intend to support the 802.1Qaz standard we
flatten the hierarchies and let the user configure via DCB ops the TSA
and max rate shaper at the subgroup hierarchy (see figure below) and the
mapping between switch priority to traffic class. By default, all switch
priorities are mapped to traffic class 0, strict priority is employed
and max shaper is disabled.

Default configuration:

         switch priority 0      ...         switch priority 7
                 +                                  +
                 |                                  |
                 +----------------------------------+
                 |
              +--v--+                          +-----+
Traffic Class |     |                          |     |
  Hierarchy   | TC0 |           ...            | TC7 |
              |     |                          |     |
              +--+--+                          +--+--+
                 |                                |
              +--v--+                          +--v--+
  Subgroup    | SG0 |                          | SG7 |
  Hierarchy   |     |                          |     |
              +-----+                          +-----+
              | TSA |                          | TSA |
              +-----+           ...            +-----+
              | MAX |                          | MAX |
              +--+--+                          +--+--+
                 |                                |
                 +---------------+----------------+
                                 |
                              +--v--+
                      Group   |     |
                    Hierarchy | GR0 |
                              |     |
                              +--+--+
                                 |
                              +--v--+
                      Port    |     |
                    Hierarchy | PR0 |
                              |     |
                              +-----+

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:18 -04:00
Ido Schimmel 2c63a555e8 mlxsw: reg: Add QoS Switch Traffic Class Table register
As part of DCB ops we'll have to configure the priority to traffic class
mapping of a port.

Add the QoS Switch Traffic Class Table (QTCT) register, which configures
the mapping between the packet switch priority and traffic class on the
transmit port.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:18 -04:00
Ido Schimmel b9b7cee405 mlxsw: reg: Add QoS ETS Element Configuration register
We are going to introduce support for DCB, so we need to be able to
configure the traffic selection algorithm (TSA) used by each traffic
class (TC), as well as the bandwidth percentage allocated to each TC in
case of ETS.

Add the QoS ETS Element Configuration register, which controls the
above parameters.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:17 -04:00
Ido Schimmel d6b7c13b01 mlxsw: spectrum: Set port's shared buffer size to 0
In addition to the priority group (PG) buffers in the headroom, the
device enables the allocation of headroom shared buffer, which can
be shared between different PGs.

However, we are not going to use the headroom shared buffer and instead
allow the user to use its size for PGs or the switch's shared buffer.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:17 -04:00
Ido Schimmel 7ad7cd6113 mlxsw: reg: Use correct PBMC register length
The last field of the PBMC register is at offset 0x64 and its size is
0x8, so the correct register's length is 0x6C bytes.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-06 17:24:17 -04:00