RGMII_RXID and RGMII_TX_ID share the same GMAC CTRL setting as RGMII
or RGMII_ID.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_free_rx_skbs(), which is called to free up all RX buffers during
shutdown, we need to unmap the page if we are running in XDP mode.
Fixes: c61fb99cae ("bnxt_en: Add RX page mode support.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Net device reset can fail when the h/w or f/w is in a bad state.
Subsequent netdevice open fails in bnxt_hwrm_stat_ctx_alloc().
The cleanup invokes bnxt_hwrm_resource_free() which inturn
calls bnxt_disable_int(). In this routine, the code segment
if (ring->fw_ring_id != INVALID_HW_RING_ID)
BNXT_CP_DB(cpr->cp_doorbell, cpr->cp_raw_cons);
results in NULL pointer dereference as cpr->cp_doorbell is not yet
initialized, and fw_ring_id is zero.
The fix is to initialize cpr fw_ring_id to INVALID_HW_RING_ID before
bnxt_init_chip() is invoked.
Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz says:
This series adds support for offloading modifications of packet headers using
ConnectX-5 HW header re-write as an action applied during packet steering.
The offloaded SW mechanism is TC's pedit action. The offloading is
supported for E-Switch steering of VF traffic in the SRIOV
switchdev mode and for NIC (non eswitch) RX.
One use-case for this offload on virtual networks, is when the hypervisor
implements flow based router such as Open-Stack's DVR, where L2 headers
of guest packets re-written with routers' MAC addresses and the IP TTL
is decremented.
Another use case (which can be applied in parallel with routing) is
stateless NAT where guest L3/L4 headers are re-written.
The series is built as follows: the 1st six patches are preperations which
don't yet add new functionality, patches 7-8 add the FW APIs (data-structures
and commands) for header re-write, and patch nine allows offloading driver
to access pedit keys.
The 10th patch is somehow the core of the series, where we translate from
the pedit way to represent set of header modification elements to the FW
API for that same matter.
Once a set of HW modification is established, we register it with the FW
and get a modify header ID. When this ID is used with an action during
packet steering, the HW applies the header modification on the packet.
Patches 11 and 12 implement the above logic as an offload for pedit action
for the NIC and E-Switch use-cases.
I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing
and helping me testing this functionality on HW simulator, before it could
be done with FW.
- Or.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJY2lwEAAoJEEg/ir3gV/o+rFIH+wdwGawEjoDhpihLqJHoRtwo
Wvy88Lczj++Pfzt9E0kgwgmOdnj7j+GVOh6ALjneE3PDBJEFWG/GWY5aRYonlhhf
zibafMTYf+8Dmm9qHW/C4OvhQowSrkG1RDucM2eyjXJfnAShZCh7dV4CDD7paxhu
N2rlDdSEl0Im4aPCNHzyrdGg06Fy3A0DQkDvVLIQhKV0cLPIoC0U/i+ymVtsCUY/
sSEEuSohvwdD5Ga5ZZdKicCo61lIRSi2rX5v4sK0exhAO3S8xyrKnwbiN7nVAQqg
eVZ/ekbBiksD8MRMKctt/zGxd0X4PDaQ8J9XyF9CL6pRC5VipsDy+P/GEhj/x8U=
=l2Qo
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-pedit' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Or Gerlitz says:
====================
mlx5e-pedit 2017-03-28
This series adds support for offloading modifications of packet headers using
ConnectX-5 HW header re-write as an action applied during packet steering.
The offloaded SW mechanism is TC's pedit action. The offloading is
supported for E-Switch steering of VF traffic in the SRIOV
switchdev mode and for NIC (non eswitch) RX.
One use-case for this offload on virtual networks, is when the hypervisor
implements flow based router such as Open-Stack's DVR, where L2 headers
of guest packets re-written with routers' MAC addresses and the IP TTL
is decremented.
Another use case (which can be applied in parallel with routing) is
stateless NAT where guest L3/L4 headers are re-written.
The series is built as follows: the 1st six patches are preperations which
don't yet add new functionality, patches 7-8 add the FW APIs (data-structures
and commands) for header re-write, and patch nine allows offloading driver
to access pedit keys.
The 10th patch is somehow the core of the series, where we translate from
the pedit way to represent set of header modification elements to the FW
API for that same matter.
Once a set of HW modification is established, we register it with the FW
and get a modify header ID. When this ID is used with an action during
packet steering, the HW applies the header modification on the packet.
Patches 11 and 12 implement the above logic as an offload for pedit action
for the NIC and E-Switch use-cases.
I'd like to thanks Elijah Shakkour <elijahs@mellanox.com> for implementing
and helping me testing this functionality on HW simulator, before it could
be done with FW.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a delay to Rx queue disables to accommodate HW needs.
v2: Added missing check for disable only, additional details on the
need for the ugly delay and fixed spacing on comment.
Change-ID: I2864ca667ce5dcc2cc44f8718113b719742a46a1
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes the way we handle the maximum frame size for the Rx
path. Previously we were rounding up to 2K for a 1500 MTU and then brining
the max frame size down to MTU plus a fixed amount. With this patch
applied what we now do is limit the maximum frame to 1.5K minus the value
for NET_IP_ALIGN for standard MTU, and for any MTU greater than 1500 we
allow up to the maximum frame size. This makes the behavior more
consistent with the other drivers such as igb which had similar logic. In
addition it reduces the test matrix for MTU since we only have two max
frame sizes that are handled for Rx now.
Change-ID: I23a9d3c857e7df04b0ef28c64df63e659c013f3f
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a control which will allow us to toggle into and out of the
legacy Rx mode. The legacy Rx mode is what we currently do when performing
Rx. As I make further changes what should happen is that the driver will
fall back to the behavior for Rx as of this patch should the "legacy-rx"
flag be set to on.
Change-ID: I0342998849bbb31351cce05f6e182c99174e7751
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is meant to clean up the code in preparation for us adding
support for build_skb. Specifically we deconstruct i40e_fetch_buffer into
several functions so that those functions can later be reused when we add a
path for build_skb.
Specifically with this change we split out the code for adding a page to an
exiting skb.
Change-ID: Iab1efbab6b8b97cb60ab9fdd0be1d37a056a154d
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch pulls out the code responsible for handling buffer recycling and
page counting and distributes it through several functions. This allows us
to commonize the bits that handle either freeing or recycling the buffers.
As far as the page count tracking one change to the logic is that
pagecnt_bias is decremented as soon as we call i40e_get_rx_buffer. It is
then the responsibility of the function that pulls the data to either
increment the pagecnt_bias if the buffer can be recycled as-is, or to
update page_offset so that we are pointing at the correct location for
placement of the next buffer.
Change-ID: Ibac576360cb7f0b1627f2a993d13c1a8a2bf60af
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch pulls the code responsible for fetching the Rx buffer and
synchronizing DMA into a function, specifically called i40e_get_rx_buffer.
The general idea is to allow for better code reuse by pulling this out of
i40e_fetch_rx_buffer. We dropped a couple of prefetches since the time
between the prefetch being called and the data being accessed was too small
to be useful.
Change-ID: I4885fce4b2637dbedc8e16431169d23d3d7e79b9
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change makes it so that we use the length of the packet instead of the
DD status bit to determine if a new descriptor is ready to be processed.
The obvious advantage is that it cuts down on reads as we don't really even
need the DD bit if going from a 0 to a non-zero value on size is enough to
inform us that the packet has been completed.
Change-ID: Iebdf9cdb36c454ef092df27199b92ad09c374231
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This flag hasn't been used since commit 1e1be8f622 ("i40e: ATR policy
change to flush the table to clean stale ATR rules").
Lets simplify things and just remove it.
Change-ID: I76279d84db8a2fd96f445b96aa413059f9256879
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The goto found here for when in MFP mode is pointless. It jumps to the
end of a series of if blocks. However, right after this statement is
a closing '}' for this if block, which will result in the program flow
going to the exact same location as the goto statement indicates. Thus,
regardless of whether we are in MFP mode, the program flow will resume
from the same location.
This arose due to various refactoring which did not notice that this
goto became essentially a no-op.
To properly understand this diff you will need to view a larger context
than is given by default.
Change-ID: I088f73c3831aa5c4e2281380c7a3ce605594300c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix a case where we miss an arq element if a new one is added before we
enable interrupts and exit the arq subtask loop. This occurs frequently
with RDMA running on Windows VF and causes long delays that prevent SMB
from establishing connections.
Change-ID: I3e1c8b2b960c12857d9b8275bea2c1563674392e
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The XL722 doesn't support the AQ command to read/write the control
register so enable it to bypass the check and use the direct read/write
method.
Change-ID: Iefecc737b57207485c90845af5989d5af518bf16
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up and addresses several issues in the way that i40e
handles private flags. Previously the code was choosing fixed bits and
trying to match them up with strings in a somewhat haphazard way. This
resulted in the possibility for adding a new bit and causing a mismatch as
the private flags are linear bits starting at 0, and the private flags in
the driver were split up over a group specific to the PF and a group that
was global.
What this change does is define an array of structs used to represent the
private flags. Contained within the structs are the bits necessary to know
which flags to set and/or clear depending on the state of the bit. By
doing this we can add new bits in the future with minimal overhead and
avoid creating possible mis-matches should we need to remove a flag based
on compile options.
Change-ID: Ia3214ab04f0ab2f70354ac0997a135f1d01b0acd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current driver mode is to use a write-back mechanism for the head
register which indicates transmit completions. The VF driver needs to be
able to work on hardware that exclusively uses descriptor write-back, so
change the default driver mode of operation to descriptor write-back for
VF. In our analysis, performance wasn't significantly different with
either write-back method.
Change-ID: Ia92e4ec77c2df8dc4515c71d53746d57d77759af
Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is an include loop between netdevice.h, dsa.h, devlink.h because
of NETDEV_ALIGN, making it impossible to use devlink structures in
dsa.h.
Break this loop by taking dsa.h out of netdevice.h, add a forward
declaration of dsa_switch_tree and netdev_set_default_ethtool_ops()
function, which is what netdevice.h requires.
No longer having dsa.h in netdevice.h means the includes in dsa.h no
longer get included. This breaks a few other files which depend on
these includes. Add these directly in the affected file.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor interrupt moderation code for flexibility because parameters are
different for 10G and 25G cards. Currently parameters (for 10G only) come
from macros compiled-in to the PF and VF drivers; fix it so that parameters
suitable for the card (10G or 25G) come from the NIC firmware via response
to a command.
Also bump up driver version to 1.5.1 to match newer NIC firmware version.
Signed-off-by: Prasad Kanneganti <prasad.kanneganti@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gcc-7 reports a warning that earlier versions did not have:
drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_port_stp_update':
arch/x86/include/asm/string_32.h:79:22: error: '*((void *)&prev_ctrls+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
*((short *)to + 2) = *((short *)from + 2);
~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/rocker/rocker_ofdpa.c:2218:7: note: '*((void *)&prev_ctrls+4)' was declared here
This is clearly a variation of the warning about 'prev_state' that
was shut up using uninitialized_var().
We can slightly simplify the code and get rid of the warning by unconditionally
saving the prev_state and prev_ctrls variables. The inlined memcpy is not
particularly expensive here, as it just has to read five bytes from one or
two cache lines.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
In NETDEV_CHANGEUPPER event the upper_info field is valid
only when linking is true. Otherwise it should be ignored.
Fixes: 7907f23adc (net/mlx5: Implement RoCE LAG feature)
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Until now, qed used some port-defined value as BDQ index for both iSCSI
and FCoE.
As management firmware now treats BDQ as a resource and tells each PF
its BDQ-range, start using a valure from that range instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as an arbiter between the various PFs
in matters of resources, but some of the resources that need to
be divided are dependent on the non-management firmware used,
so management firmware first needs to be told how many resources
there are before trying to divide them.
As part of the initialization sequence, driver would first inform
the management firmware of the available resources under
a dedicated resource lock, and afterwards request for various
resources which might be based on the previous set values.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Global locking can't properly be used to synchronize between different
PFs in all scenarios, as those instances might reside in different
logical partitions [e.g., when a PF is assigned via PDA to some VM].
The management firmware provides a generic infrastructure for
device locks. For each 'resource', it's guaranteed it could be acquired
by at most a single PF at any given time [or by management firmware].
This patch adds the necessary logic in qed for utilizing said
infrastructure, implementing lock/unlock internal APIs.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During HW initialization, driver would set various registers to their
needed values - but it assumes all registers start at their reset-value,
so there's no need to re-configure a register's default value.
This assumption might be incorrect, e.g., in case of preboot driver
running and initializing the driver prior to our driver.
To overcome this, we now ask management firmware to initiate a PF-flr
early during the initialization sequence. That would return everything
in the PF's scope back to default and prevent previous configurations
from still being applied.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as an arbiter between the various PFs
in regard to loading - it causes the various PFs to load/unload
sequentially and informs each of its appropriate rule in the init.
But the existing flow is too weak to handle some scenarios where
PFs aren't properly cleaned prior to loading.
The significant scenarios falling under this criteria:
a. Preboot drivers in some environment can't properly unload.
b. Unexpected driver replacement [kdump, PDA].
Modern management firmware supports a more intricate loading flow,
where the driver has the ability to overcome previous limitations.
This moves qed into using this newer scheme.
Notice new scheme is backward compatible, so new drivers would
still be able to load properly on top of older management firmwares
and vice versa.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We'll soon need additional information, so start by changing
the infrastructure to receive the initializing variables
via a parameter struct.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware is used as arbiter between different PFs
which are loading/unloading, but in order to use the synchronization
it offers the contending configurations need to be applied either
between their LOAD_REQ <-> LOAD_DONE or UNLOAD_REQ <-> UNLOAD_DONE
management firmware commands.
Existing HW stop flow utilizes 2 different functions: qed_hw_stop() and
qed_hw_reset() which don't abide this requirement; Most of the closure
is doing outside the scope of the unload request.
This patch removes qed_hw_reset() and places the relevant stop
functionality underneath the management firmware protection.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
moxart_mac_start_xmit() doesn't care where tx_tail is, tx_head can
catch and pass tx_tail, which is bad because moxart_tx_finished()
isn't guaranteed to catch up on freeing resources from tx_tail.
Add a check in moxart_mac_start_xmit() stopping the queue at the
end of the circular buffer. Also add a check in moxart_tx_finished()
waking the queue if the buffer has TX_WAKE_THRESHOLD or more
free descriptors.
While we're at it, move spin_lock_irq() to happen before our
descriptor pointer is assigned in moxart_mac_start_xmit().
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=99451
Signed-off-by: Jonas Jensen <jonas.jensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A driver must not access the two fields directly but should instead use
the helper functions to set the values and keep a consistent internal
state:
ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_dvr_probe':
ethernet/stmicro/stmmac/stmmac_main.c:4083:8: error: 'struct net_device' has no member named 'real_num_rx_queues'; did you mean 'real_num_tx_queues'?
Fixes: a8f5102af2 ("net: stmmac: TX and RX queue priority configuration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement dpipe's table ops for erif table which provide:
1. Getting the entries in the table with the associate values.
- match on "mlxsw_meta:erif_index"
- action on "mlxsw_meta:forwared_out"
2. Synchronize the hardware in case of enabling/disabling counters which
mean removing erif counters from all interfaces.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add rif helper function to access the rif index and rif devices ifindex.
This functions will be used by dpipe in order to dump the rif table.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for counter allocation on router interfaces. The allocation
depends on the counter state of relevant table. In case the counting is
disabled or no counters left the counter index will be set as invalid.
Also a counter pool for router allocation is added.
Signed-off-by: Arakdi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RICNT register retrieves per port performance counter. It will be
used to query the router interfaces statistics.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add definition for egress router interface table. This table describes
the final part in the routing pipeline. This table matches the egress
interface index (rif index, which is set by the previous stages and
determine the out port) and makes the decision of forwarding the packet
towards the L2 logic or dropping it.
The metadata header is added to represent this internal information.
The rif index field is mapped logically to netdevice ifindex.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add placeholder for dpipe. Support for specific tables and headers will
be introduced in following patches. The headers are shared between all
mlxsw_sp instances.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update RITR for counter support. This allows adding counters for
ASIC's router ports.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This includes calling the parsing code that translates from pedit
speak to the HW API, allocation (deallocation) of a modify header
context and setting the modify header id associated with this
context to the FTE of that flow.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parse/translate a set of TC pedit actions to be formed in the HW API format.
User-space provides set of keys where each one of them is made of: command (add or
set), header-type, byte offset within that header along with a 32 bit mask and value.
The mask dictates what bits in the 32 bit word that starts on the offset we should
be dealing with, but under negative polarity (unset bits are to be modified).
We do a 1st pass over the set of keys while using the header-type and offset to
fill the masks and the values into a data-structure containting all the
supported network headers.
We then do a 2nd pass over the set of fields to re-write supported by the HW,
where for each such candidate field, we use the masks filled on the 1st pass to
realize if we should offloading re-write it.
In case offloading is required, we fill a HW descriptor with the following:
(1) the header field to modify
(2) the bit offset within the field from where to modify (set command only)
(3) the value to set/add
(4) the length in bits 1...32 to modify (set command only)
Note that it's possible for a given pedit mask to dictate modifying the
same header field multiple times or to modify multiple header fields.
Currently such combinations are not supported for offloading, hence, for set
commands, the offset within the field is always zero, and the length to modify
is the field size.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Implement the low-level commands to support packet header re-write.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add the definitions related to creation/deletion of a modify header
context and the modify header steering action which are used for HW
packet header modify (re-write) as part of steering. Add as well the
modify header id into two intermediate structs and set it to the FTE.
Note that as the push/pop vlan steering actions are emulated by the
ewitch management code, we're not breaking any compatibility while
changing their values to make room for the modify header action which
is not emulated and whose value is part of the FW API. The new bit
values for the emulated actions are at the end of the possible range.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Move the commands related to scheduling elements and vport qos to
a suitable location (according to the MLX5_CMD_OP enum values) in
the command string and internal error helpers.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
There are bunch of places in the code where the intermediate struct
that keeps the elements related to flow actions is initialized with
the same default values. Put that into a small DECLARE type helper.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The code for adding tc fdb flows leaves things half set when it fails
in the middle. Currently we are not leaking things (e.g eswitch
vlan reference, encap reference and HW resources) since the main
code to add flower rules does a cleanup by calling mlx5e_tc_del_flow().
This cleanup further works just b/c we're checking there if the HW rule
for the flow we are attempting to delete is valid before touching it, and
since under the current possible combinations of supported actions it's okay
to go and blidnly deref or delete all the action related resources (encap, vlan).
Instead, do things properly, namely make sure that if add flow fails we
clean all what was allocated or referenced. Now, the flow delete code can
blindly deref/deallocate both the rule and the actions related resources and
when more action combinations are introduced (such as the upcoming header
re-write) we are fine with clear and robust code.
While here, align all of nic/fdb parse actions/add flow functions to get
mlx5e_tc_flow struct param and pick the attributes or whatever else needed
from there.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add intermediate structure to store attributes parsed from TC filter
matching/actions parts which are soon to be configured into the HW.
Currently put there the flow matching spec after being parsed. More
content to be added in down-stream patch.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add structure that contains the attributes related to offloaded
NIC flows. Currently it has the actions and flow tag.
While here, do xmas tree cleanup of the TC configure function.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add esw_ prefix to the flow attributes attached to offloaded e-switch
TC flows. This is a pre-step to add attributes to offloaded NIC TC flows.
Also, save one pointer space by using gcc's zero size array, this would
be beneficial for environments where 100Ks (or Ms) of flows are offloaded.
This patch doesn't change any functionality.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This series provides a fail-safe mechanism to allow safely re-configuring
mlx5e netdevice and provides a resiliency against sporadic
configuration failures.
To enable this we do some refactoring and code reorganizing to allow
breaking the drivers open/close flows to stages:
open -> activate -> deactivate -> close.
In addition we need to allow creating fresh HW ring resources
(mlx5e_channels) with their own "new" set of parameters, while keeping
the current ones running and active until the new channels are
successfully created with the new configuration, and only then we can
safly replace (switch) old channels with new ones.
For that we introduce mlx5e_channels object and an API to manage it:
- channels = open_channels(new_params):
open fresh TX/RX channels
- activate_channels(channels):
redirect traffic to them and attach them to the netdev
- deactivate_channes(channels)
stop traffic and detach from netdev
- close(channels)
Free the TX/RX HW resources of those channels
With the above strategy it is straightforward to achieve the desired
behavior of fail-safe configuration. In pseudo code:
make_new_config(new_params)
{
old_channels = current_active_channels;
new_channels = create_channels(new_params);
if (!new_channels)
return "Failed, but current channels are still active :)"
deactivate_channels(old_channels); /* Can't fail */
set_hw_new_state(); /* If needed */
activate_channels(new_channels); /* Can't fail */
close_channels(old_channels);
current_active_channels = new_channels;
return "SUCCESS";
}
At the top of this series, we change the following flows to be fail-safe:
ethtool:
- ring parameters
- coalesce parameters
- tx copy break parameters
- cqe compressing/moderation mode setting (priv flags)
ndos:
- tc setup
- set features: LRO
- change mtu
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJY2XfKAAoJEEg/ir3gV/o+6fAIAKBsqf+EYhbHA0JoTnV1sm3G
PSGjj5VCMNTZPyDlTWLEpY2S5TIDRPvICC04i5jWFjo5SOmsRMR6ZV0llHukKC4k
SAkAYU4A78Ds7UhmWzokebwzWa8VA48eqLRxXV60EAhJ0BOgzZnG09KIpzdplE7A
pco+F/c/qzJa0NP1KQBBrYIcXbGMrCFcYM8d6lJ8TRfVDdZZpeTB/wvxRixKfe1L
Ji6+k5tbDynDD3+HWkWq+chAkw4yldN7q8fC8FaN2r0mtWYsYbVSPuP+BlL0XN4R
oluZEJjnyaCePaqUMW+ZYVb1hCGP7pOoJkBb901XdOnX5M2fU9vK3VufWErYF/s=
=r6Qw
-----END PGP SIGNATURE-----
Merge tag 'mlx5e-failsafe' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5e-failsafe 27-03-2017
This series provides a fail-safe mechanism to allow safely re-configuring
mlx5e netdevice and provides a resiliency against sporadic
configuration failures.
To enable this we do some refactoring and code reorganizing to allow
breaking the drivers open/close flows to stages:
open -> activate -> deactivate -> close.
In addition we need to allow creating fresh HW ring resources
(mlx5e_channels) with their own "new" set of parameters, while keeping
the current ones running and active until the new channels are
successfully created with the new configuration, and only then we can
safly replace (switch) old channels with new ones.
For that we introduce mlx5e_channels object and an API to manage it:
- channels = open_channels(new_params):
open fresh TX/RX channels
- activate_channels(channels):
redirect traffic to them and attach them to the netdev
- deactivate_channes(channels)
stop traffic and detach from netdev
- close(channels)
Free the TX/RX HW resources of those channels
With the above strategy it is straightforward to achieve the desired
behavior of fail-safe configuration. In pseudo code:
make_new_config(new_params)
{
old_channels = current_active_channels;
new_channels = create_channels(new_params);
if (!new_channels)
return "Failed, but current channels are still active :)"
deactivate_channels(old_channels); /* Can't fail */
set_hw_new_state(); /* If needed */
activate_channels(new_channels); /* Can't fail */
close_channels(old_channels);
current_active_channels = new_channels;
return "SUCCESS";
}
At the top of this series, we change the following flows to be fail-safe:
ethtool:
- ring parameters
- coalesce parameters
- tx copy break parameters
- cqe compressing/moderation mode setting (priv flags)
ndos:
- tc setup
- set features: LRO
- change mtu
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Probably due to some mis-merging fix a bug associated with commits
d7ce6422d6 ("i40e: don't check params until after checking for client
instance", 2017-02-09) and 3140aa9a78c9 ("i40e: KISS the client
interface", 2017-03-14)
The first commit tried to move the initialization of the params
structure so that we didn't bother doing this if we didn't have a client
interface. You can already see that it looks fishy because of the
indentation. The second commit refactors a bunch of the interface, and
incorrectly drops the params initialization.
I believe what occurred is that internally the two patches were
re-ordered, and the merge conflicts as a result were performed
incorrectly.
Fix the use of an uninitialized variable by correctly initializing the
params variable via i40e_client_get_params().
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VSI is being dereferenced before the VSI null check; if VSI is
null we end up with a null pointer dereference. Fix this by
performing VSI deference after the VSI null check. Also remove
the need for using adapter by using vsi->back->cinst.
Detected by CoverityScan, CID#1419696, CID#1419697
("Dereference before null check")
Fixes: ed0e894de7 ("i40evf: add client interface")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since FCoE isn't supported by the i40e products there isn't much point in
carrying around code that will always evaluate to false. This patch goes
through and strips out the code in several spots so that we don't go around
carrying variables and/or code that is always going to evaluate to false or
0.
Change-ID: I39d1d779c66c638b75525839db2b6208fdc809d7
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Looking over the code for FCoE it looks like the Rx path has been broken at
least since the last major Rx refactor almost a year ago. It seems like
FCoE isn't supported for any of the Fortville/Fortpark hardware so there
isn't much point in carrying the code around, especially if it is broken
and untested.
Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is a minor clean-up to make the i40e/i40evf process_skb_fields
function look a little more like what we have in igb. The Rx checksum
function called out a need for skb->protocol but I can't see where it
actually needs it. I am assuming this is something that was likely
refactored out some time ago as the Rx checksum code has gone through a few
rewrites.
Change-ID: I0b4668a34d90b61b66ded7c7c26e19a3e2d06251
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Removed no longer needed delays. At preproduction stage those delays were
needed but now these delays are not needed.
Signed-off-by: Bimmy Pujari <bimmy.pujari@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
First, this patch eliminates IOMMU DMAR Faults caused by VF hardware.
This is done by enabling VF hardware only after VSI resources are
freed. Otherwise, hardware could DMA into memory that is (or just has
been) being freed.
Then, the VF driver is activated only after VSI resources have been
reallocated. That's because the VF driver can request resources
immediately after it's activated. So they need to be ready at that
point.
The second race condition happens when the OS initiates a VF reset,
and then before it's finished modifies VF's settings by changing its
MAC, VLAN ID, bandwidth allocation, anti-spoof checking, etc. These
functions needed to be blocked while VF is undergoing reset. Otherwise,
they could operate on data structures that had just been freed or not
yet fully initialized.
Change-ID: I43ba5a7ae2c9a1cce3911611ffc4598ae33ae3ff
Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We need to reset skb back to NULL when we have freed it in the Rx cleanup
path. I found one spot where this wasn't occurring so this patch fixes it.
Change-ID: Iaca68934200732cd4a63eb0bd83b539c95f8c4dd
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There exists a bug in the driver where the calculation of the
RSS size was not taking into account the number of traffic classes
enabled. This patch factors in the traffic classes both in
the initial configuration of the table as well as reconfiguration.
Change-ID: I34dcd345ce52faf1d6b9614bea28d450cfd5f621
Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Update the driver code so that we do bulk updates of the page reference
count instead of just incrementing it by one reference at a time. The
advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet.
In addition if we eventually move this over to using build_skb the gains
will be more noticeable.
I also found and fixed a store forwarding stall from where we were
assigning "*new_buff = *old_buff". By breaking it up into individual
copies we can avoid this and as a result the performance is slightly
improved.
Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ibmvnic driver keeps its statistics in net_device->stats, so the
net_stats member in struct ibmvnic_adapter is unused. Remove it.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ibmveth driver keeps its statistics in net_device->stats, so the
stats member in struct ibmveth_adapter is unused. Remove it.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bfin_mac driver keeps its statistics in net_device->stats, so the
stats member in struct bfin_mac_local is unused. Remove it, as well as
the accompanying comment.
Cc: adi-buildroot-devel@lists.sourceforge.net
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ethtool api {get|set}_settings is deprecated.
We move this driver to new api {get|set}_link_ksettings.
As I don't have the hardware, I'd be very pleased if
someone may test this patch.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the new fail-safe channels switch mechanism to set new
netdev mtu and lro settings.
MTU and lro settings demand some HW configuration changes after new
channels are created and ready for action. In order to unify switch
channels routine for LRO and MTU changes, and maybe future configuration
features, we now pass to it a modify HW function pointer to be
invoked directly after old channels are de-activated and before new
channels are activated.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Use the new fail-safe channels switch mechanism to set up new
tc parameters.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Use the new fail-safe channels switch mechanism to set new
CQE compressing and CQE moderation mode settings.
We also move RX CQE compression modify function out of en_rx file to
a more appropriate place.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Use the new fail-safe channels switch mechanism to set new ethtool
settings:
- ring parameters
- coalesce parameters
- tx copy break parameters
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
A fail safe helper functions that allows switching to new channels on the
fly, In simple words:
make_new_config(new_params)
{
new_channels = open_channels(new_params);
if (!new_channels)
return "Failed, but current channels are still active :)"
switch_channels(new_channels);
return "SUCCESS";
}
Demonstrate mlx5e_switch_priv_channels usage in set channels ethtool
callback and make it fail-safe using the new switch channels mechanism.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
mlx5e_redirect_rqts_to_{channels,drop} and mlx5e_{add,del}_sqs_fwd_rules
and Set real num tx/rx queues belong to
mlx5e_{activate,deactivate}_priv_channels, for that we move those functions
and minimize mlx5e_open/close flows.
This will be needed in downstream patches to replace old channels with new
ones without the need to call mlx5e_close/open.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Remove mlx5e_priv pointer from CQ and RQ structs,
it was needed only to access mdev pointer from priv pointer.
Instead we now pass mdev where needed.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
In order to have a clean separation between channels resources creation
flows and current active mlx5e netdev parameters, make sure each
resource creation function do not access priv->params, and only works
with on a new fresh set of parameters.
For this we add "new" mlx5e_params field to mlx5e_channels structure
and use it down the road to mlx5e_open_{cq,rq,sq} and so on.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
As a foundation for safe config flow, a simple clear API such as
(Open then Activate) where the "Open" handles the heavy unsafe
creation operation and the "activate" will be fast and fail safe,
to enable the newly created channels.
For this we split the RQs/TXQ SQs and channels open/close flows to
open => activate, deactivate => close.
This will simplify the ability to have fail safe configuration changes
in downstream patches as follows:
make_new_config(new_params)
{
old_channels = current_active_channels;
new_channels = create_channels(new_params);
if (!new_channels)
return "Failed, but current channels still active :)"
deactivate_channels(old_channels); /* Can't fail */
activate_channels(new_channels); /* Can't fail */
close_channels(old_channels);
current_active_channels = new_channels;
return "SUCCESS";
}
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Rename mlx5e_refresh_tirs_self_loopback to mlx5e_refresh_tirs,
as it will be used in downstream (Safe config flow) patches, and make it
fail safe on mlx5e_open.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
RQ Tables are always created once (on netdev creation) pointing to drop RQ
and at that stage, RQ tables (indirection tables) are always directed to
drop RQ.
We don't need to use mlx5e_fill_{direct,indir}_rqt_rqns to fill the drop
RQ in create RQT procedure.
Instead of having separate flows to redirect direct and indirect RQ Tables
to the current active channels Receive Queues (RQs), we unify the two
flows by introducing mlx5e_redirect_rqt function and redirect_rqt_param
struct. Combined, they provide one generic logic to fill the RQ table RQ
numbers regardless of the RQ table purpose (direct/indirect).
Demonstrated the usage with mlx5e_redirect_rqts_to_channels which will
be called on mlx5e_open and with mlx5e_redirect_rqts_to_drop which will
be called on mlx5e_close.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Have a dedicated "channels" handler that will serve as channels
(RQs/SQs/etc..) holder to help with separating channels/parameters
operations, for the downstream fail-safe configuration flow, where we will
create a new instance of mlx5e_channels with the new requested parameters
and switch to the new channels on the fly.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
To simplify mlx5e_open_locked flow we set netdev->rx_cpu_rmap on netdev
creation rather on netdev open, it is redundant to set it every time on
mlx5e_open_locked.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Instead of iterating over the channel SQs to set their max rate, do it
on SQ creation per TXQ SQ.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
hns_dsaf_set_mac_key() calls dsaf_set_field() on an uninitialized field,
which will then change only a few of its bits, causing a warning with
the latest gcc:
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
(origin) &= (~(mask)); \
^~
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_add_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_rm_mac_addr':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_del_mac_mc_port':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_uc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_mac_mc_entry':
hisilicon/hns/hns_dsaf_reg.h:1046:12: error: 'mac_key.low.bits.port_vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The code is actually correct since we always set all 16 bits of the
port_vlan field, but gcc correctly points out that the first
access does contain uninitialized data.
This initializes the field to zero first before setting the
individual bits.
Fixes: 5483bfcb16 ("net: hns: modify tcam table and set mac key")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dev_dbg() is enabled, we print uninitialized data, as gcc-7.0.1
now points out:
ethernet/hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_set_promisc_tcam':
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.low.val' may be used uninitialized in this function [-Werror=maybe-uninitialized]
ethernet/hisilicon/hns/hns_dsaf_main.c:2947:75: error: 'tbl_tcam_data.high.val' may be used uninitialized in this function [-Werror=maybe-uninitialized]
We also pass the data into hns_dsaf_tcam_mc_cfg(), which might later
use it (not sure about that), so it seems safer to just always initialize
the tbl_tcam_data structure.
Fixes: 1f5fa2dd1c ("net: hns: fix for promisc mode in HNS driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the return allocated index and err value are multiplexed.
This patch changes the API to decouple the ret value from the allocated
index.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When testing the epoll w/ busy poll code I found that I could get into a
state where the i40e driver had q_vectors w/ active NAPI that had no rings.
This was resulting in a divide by zero error. To correct it I am updating
the driver code so that we only support NAPI on q_vectors that have 1 or
more rings allocated to them.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Different SQ types (tx, xdp, ico) are growing apart, we separate them
and remove unwanted parts in each one of them, to simplify data path and
utilize data cache.
Remove DB union from SQ structures since it is not needed anymore as we
now have different SQ data type for each SQ.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the next patches we will introduce different SQ types,
and we would want to reuse those functions, in this patch we make them
agnostic to SQ type (txq, xdp, ico).
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename mlx5e_{create,destroy}_{sq,rq,cq} to
mlx5e_{alloc,free}_{sq,rq,cq}.
Rename mlx5e_{enable,disable}_{sq,rq,cq} to
mlx5e_{create,destroy}_{sq,rq,cq}.
mlx5e_{enable,disable}_{sq,rq,cq} used to actually create/destroy the SQ
in FW, so we rename them to align the functions names with FW semantics.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the next patches we will introduce different SQ types, for that we here
generalize some TX helper functions to work with more basic SQ parameters,
in order to re-use them for the different SQ types.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XDP SQ has a fixed size WQE (MLX5E_XDP_TX_WQEBBS = 1) and only posts
one kind of WQE (MLX5_OPCODE_SEND),
Also we initialize SQ descriptors static fields once on open_xdpsq,
rather than every time on critical path.
Optimize the code in light of those facts and add a prefetch of the TX
descriptor first thing in the xdp xmit function.
Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Test case Before Now improvement
---------------------------------------------------------------
XDP TX (1 core) 13Mpps 13.7Mpps 5%
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle XDP TX completions before handling RX packets, to make sure more
free space is available for XDP TX packets a moment before handling
RX packets.
Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Test case Before Now improvement
---------------------------------------------------------------
XDP Drop (1 core) 16.9Mpps 16.9Mpps No change
XDP TX (1 core) 12Mpps 13Mpps 8%
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To save many rq->channel->sq dereferences in fast-path.
And rename it to xdpsq.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move struct mlx5e_rq and friends to appear after mlx5e_sq declaration in
en.h.
We will need this for next patch to move the mlx5e_sq instance into
mlx5e_rq struct for XDP SQs.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
XDP code belongs to RX path, move mlx5e_poll_xdp_tx_cq and
mlx5e_free_xdp_tx_descs to en_rx.c.
Rename them to mlx5e_poll_xdpsq_cq and mlx5e_free_xdpsq_descs.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One is sufficient since Blue Flame is not supported anymore.
This will also come in handy for switchdev mode to save resources, since
VF representors will use same single UAR as well for their own SQs.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5e netdev Blue Flame (write combining) support demands a lot of
overhead for a little latency gain for some special cases, this overhead
is hurting the common case.
Here we remove xmit Blue Flame support by creating all bfregs with no
write combining for all SQs, and we remove a lot of BF logic and
conditions from xmit data path.
Simplify mlx5e_tx_notify_hw (doorbell function) by removing BF related
code and by removing one memory barrier needed for WC mapped SQ doorbell
buffers, which no longer exist.
Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Test case Before Now improvement
---------------------------------------------------------------
TX packets (24 threads) 50Mpps 54Mpps 8%
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use dma_rmb in mlx5e_get_cqe rather than aggressive rmb (at least on
some architectures), this should help improve the performance on such
CPU archs where dma_rmb is optimized.
Performance improvement:
System: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
Test case Baseline Now improvement
---------------------------------------------------------------
TX packets (24 threads) 45Mpps 50Mpps 11%
TC stack Drop (1 core) 3.45Mpps 3.6Mpps 5%
XDP Drop (1 core) 14Mpps 16.9Mpps 20%
XDP TX (1 core) 10.4Mpps 12Mpps 15%
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As explained in the previous patch, the cell size may change in future
devices, so query it from the firmware instead of hard coding it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sizes and thresholds of the priority group (PG) buffers are
configured in cells, which represent a specific amount of bytes.
The cell size can vary in different devices, so it's better to query it
from the firmware than hard coding it.
Refactor the code dealing with this value into different functions, so
that it will be easier to make the conversion in the next patch.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of hard coding the size of the shared buffer in the driver,
query it from the firmware, as it may change in future devices.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently hard code the maximum number of ports in the driver, but
this may change in future devices, so query it from the firmware
instead.
Fallback to a maximum of 64 ports in case this number can't be queried.
This should only happen in SwitchX-2 for which this number is correct.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of hard coding the number of LPM trees in the driver, query it
from the firmware, as it may change in future devices.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-03-23
This series contains updates to i40e and i40e.txt documentation.
Jake provides all the changes in the series which are centered around
ntuple filter fixes and additional support. Fixed the current
implementation of .set_rxnfc, where we were not reading the mask field
for filter entries which was resulting in filters not behaving as
expected and not working correctly. When cleaning up after disabling
flow director support, ensure that the default input set is correctly
reprogrammed. Since the hardware only supports a single input set for
all flows of that type, the driver shall only allow the input set to
change if there are no other configured filters for that flow type, so
add support to detect when we can update the input set for each flow
type. Align the driver to other drivers to partition the ring_cookie
value into 8bits of VF index, along with 32bits of queue number instead
of using the user-def field. Added support to parse the user-def field
into a data structure format to allow future extensions of the user-def
filed by keeping all the code that read/writes the field into a single
location. Added support for flexible payloads passed via ethtool
user-def field. We support a single flexible word (2byte) value per
protocol type, and we handle the FLX_PIT register using a list of
flexible entries so that each flow type may be configured separately.
Enabled flow director filters for SCTPv4 packets using the ethtool
ntuple interface to enable filters. Updated the documentation on the
i40e driver to include the newly added support to ntuple filters.
Reduced complexity of a if-continue-else-break section of code by
taking advantage of using hlist_for_each_entry_continue() instead.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 7e54d9d063.
After additional regression testing, several users are experiencing
kernel panics during shutdown on e1000e devices. Reverting this
change resolves the issue.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF driver is incorrectly resetting Octeon when the module parameter
"fw_type=none" is there. "fw_type=none" means the PF should not load any
firmware to the NIC because Octeon is already running preloaded firmware.
Fix it by putting an if (fw_type != none) around the reset code.
Because the Octeon reset is now conditionally gone, when unloading the
driver, conditionally send the RESET_PF command to the firmware who will
then free up PF-related data structures.
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to c298ede2fe ("net: bcmgenet: simplify circular pointer
arithmetic") we don't need to complex arthimetic since we always have a
ring size that is a power of 2.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do something similar to commit d5810ca325 ("net: bcmgenet: clear
status to reduce spurious interrupts") and clear interrupts right before
servicing them. This reduces the number of interrupts by 10K
interrupts/sec for a TX TCP session 1Gbits/sec.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcm_sysport_tx_reclaim_one() is currently summing TX bytes/packets in a
way that is not SMP friendly, mutliples CPUs could run
bcm_sysport_tx_reclaim_one() independently and still update
stats->tx_bytes and stats->tx_packets, cloberring the other CPUs
statistics.
Fix this by tracking per TX rings the number of bytes, packets,
dropped and errors statistics, and provide a bcm_sysport_get_nstats()
function which aggregates everything and returns a consistent output.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The latest gcc-7 snapshot warns about bfa_ioc_send_enable/bfa_ioc_send_disable
writing undefined values into the hardware registers:
drivers/net/ethernet/brocade/bna/bfa_ioc.c: In function 'bfa_iocpf_sm_disabling_entry':
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+4)' is used uninitialized in this function [-Werror=uninitialized]
arch/arm/include/asm/io.h:109:22: error: '*((void *)&disable_req+8)' is used uninitialized in this function [-Werror=uninitialized]
The two functions look like they should do the same thing, but only one
of them initializes the time stamp and clscode field. The fact that we
only get a warning for one of the two functions seems to be arbitrary,
based on the inlining decisions in the compiler.
To address this, I'm making both functions do the same thing:
- set the clscode from the ioc structure in both
- set the time stamp from ktime_get_real_seconds (which also
avoids the signed-integer overflow in 2038 and extends the
well-defined behavior until 2106).
- zero-fill the reserved field
Fixes: 8b230ed8ec ("bna: Brocade 10Gb Ethernet device driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two different set_mac functions exists but stmmac_dwmac4_set_mac() is
only used for enabling and never for disabling.
So on dwmac4, the MAC RX/TX is never disabled.
This patch add a generic function pointer set_mac() to stmmac_ops and
replace all call to stmmac_set_mac/stmmac_dwmac4_set_mac by a call to
this pointer.
Since dwmac4_ops is const, set_mac cannot be modified after, and so dwmac4_ops
is duplioacted like dwmac4_dma_ops.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to reset is_gso flag when EOP reached (entire LSO packet processed).
Fixes: bab6de8fd1 ("net: ethernet: aquantia:
Atlantic A0 and B0 specific functions.")
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix Context Command bit: L3 type = "0" for IPv4, "1" for IPv6.
Fixes: bab6de8fd1 ("net: ethernet: aquantia:
Atlantic A0 and B0 specific functions.")
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix for missing initialization aq_ring header.lock spinlock.
Fixes: 018423e90b ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order for the checksum offloads to work correctly we need to set the
packet type bit (TCP/UDP) in the TX context buffer.
Fixes: 97bde5c4f9 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Tested-by: David Arcari <darcari@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Closing/opening the adapter is not needed at all.
The new MTU settings take effect immediately.
Fixes: 97bde5c4f9 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't use it during development and we can't extend it either, so
remove it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace a complex if->continue->else->break construction in
i40e_next_filter. We can simply use hlist_for_each_entry_continue
instead. This drops a lot of confusing code. The resulting code is much
easier to understand the intention, and follows the more normal pattern
for using hlist loops. We could have also used a break with a "return
next" at the end of the function, instead of return NULL, but the
current implementation is explicitly clear that when you reach the end
of the loop you get a NULL value. The alternative construction is less
clear since the reader would have to know that next is NULL at the end
of the loop.
Change-Id: Ife74ca451dd79d7f0d93c672bd42092d324d4a03
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Enable FDir filters for SCTPv4 packets using the ethtool ntuple
interface to enable filters. The ethtool API does not allow masking on
the verification tag.
Change-Id: I093e88a8143994c7e6f4b7b17a0bd5cf861d18e4
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for flexible payloads passed via ethtool user-def field.
This support is somewhat limited due to hardware design. The input set
can only be programmed once per filter type, and the flexible offset is
part of this filter input set. This means that the user cannot program
both a regular and a flexible filter at the same time for a given flow
type. Additionally, the user may not program two flexible filters of the
same flow type with different offsets, although they are allowed to
configure different values at that offset location.
We support a single flexible word (2byte) value per protocol type, and
we handle the FLX_PIT register using a list of flexible entries so that
each flow type may be configured separately.
Due to hardware implementation, the flexible data is offset from the
start of the packet payload, and thus may not be in part of the header
data. For this reason, the offset provided by the user defined data is
interpreted as a byte offset from the start of the matching payload.
Previous implementations have tried to represent the offset as from the
start of the frame, but this is not feasible because header sizes may
change due to options.
Change-Id: 36ed27995e97de63f9aea5ade5778ff038d6f811
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add code to parse the user-def field into a data structure format. This
code is intended to allow future extensions of the user-def field by
keeping all code that actually reads and writes the field into a single
location. This ensures that we do not litter the driver with references
to the user-def field and minimizes the amount of bitwise operations we
need to do on the data.
Add code which parses the lower 32bits into a flexible word and its
offset. This will be used in a future patch to enable flexible filters
which can match on some arbitrary data in the packet payload. For now,
we just return -EOPNOTSUPP when this is used.
Add code to fill in the user-def field when reporting the filter back,
even though we don't actually implement any user-def fields yet.
Additionally, ensure that we mask the extended FLOW_EXT bit from the
flow_type now that we will be accepting filters which have the FLOW_EXT
bit set (and thus make use of the user-def field).
Change-Id: I238845035c179380a347baa8db8223304f5f6dd7
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Do not use the user-def field for determining the VF target. Instead,
similar to ixgbe, partition the ring_cookie value into 8bits of VF
index, along with 32bits of queue number. This is better than using the
user-def field, because it leaves the field open for extension in
a future patch which will enable flexible data. Also, this matches with
convention used by ixgbe and other drivers.
Change-Id: Ie36745186d817216b12f0313b99ec95cb8a9130c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support to detect when we can update the input set for each flow
type.
Because the hardware only supports a single input set for all flows of
that matching type, the driver shall only allow the input set to change
if there are no other configured filters for that flow type.
Thus, the first filter added for each flow type is allowed to change the
input set, and all future filters must match the same input set. Display
a diagnostic message whenever the filter input set changes, and
a warning whenever a filter cannot be accepted because it does not match
the configured input set.
Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ensure that the default input set is correctly reprogrammed when
cleaning up after disabling flow director support. This ensures that the
programmed value will be in a clean state.
Although we do not yet have support for SCTPv4 filters, a future patch
will add support for this protocol, so we will correctly restore the
SCTPv4 input set here as well. Note that strictly speaking the default
hardware value for SCTP includes matching the verification tag. However,
the ethtool API does not have support for specifying this value, so
there is no reason to keep the verification field enabled.
This patch is the next step on the way to enabling partial tuple filters
which will be implemented in a following patch.
Change-Id: Ic22e1c267ae37518bb036aca4a5694681449f283
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Do not assume that hardware has been programmed with the default mask,
but instead read the input set registers to determine what is currently
programmed. This ensures that all programmed filters match exactly how
the hardware will interpret them, avoiding confusion regarding filter
behavior.
This sets the initial ground-work for allowing custom input sets where
some fields are disabled. A future patch will fully implement this
feature.
Instead of using bitwise negation, we'll just explicitly check for the
correct value. The use of htonl and htons are used to silence sparse
warnings. The compiler should be able to handle the constant value and
avoid actually performing a byteswap.
Change-Id: I3d8db46cb28ea0afdaac8c5b31a2bfb90e3a4102
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The current implementation of .set_rxnfc does not properly read the mask
field for filter entries. This results in incorrect driver behavior, as
we do not reject filters which have masks set to ignore some fields. The
current implementation simply assumes that every part of the tuple or
"input set" is specified. This results in filters not behaving as
expected, and not working correctly.
As a first step in supporting some partial filters, add code which
checks the mask fields and rejects any filters which do not have an
acceptable mask. For now, we just assume that all fields must be set.
This will get the driver one step towards allowing some partial filters.
At a minimum, the ethtool commands which previously installed filters
that would not function will now return a non-zero exit code indicating
failure instead.
We should now be meeting the minimum requirements of the .set_rxnfc API,
by ensuring that all filters we program have a valid mask value for each
field.
Finally, add code to report the mask correctly so that the ethtool
command properly reports the mask to the user.
Note that the typecast to (__be16) when checking source and destination
port masks is required because the ~ bitwise negation operator does not
correctly handle variables other than integer size.
Change-Id: Ia020149e07c87aa3fcec7b2283621b887ef0546f
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver "dwc-xlgmac" is dual-licensed.
Declare the dual license with MODULE_LICENSE().
Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver "dwc-xlgmac" is dual-licensed. This patch adds
declaration of dual license in file headers.
Signed-off-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/broadcom/genet/bcmmii.c
drivers/net/hyperv/netvsc.c
kernel/bpf/hashtab.c
Almost entirely overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Align the driver feature distribution with the flow utilized
by the management firmware - first reserve L2 queues for
VFs and use all the remaining for the PF.
The current distribution might lead to PFs with an enormous
amount of queues, but at the same time leave us with insufficient
resources for starting all VFs.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When RoCE is enabled on a given L2 interface, the interrupt lines
are divided equally between L2 and RoCE -
But in case number of lines needed for RoCE is limited by number
of available CNQs, we can utilize the additional lines for L2.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management firmware and driver are meant to be both backward and forward
compatibile with each other.
If a new mangement firmware would work with an older driver,
it's possible that driver would receive indications which are meaningless
to it. That's perfectly acceptible from the firmware part - so no need to
log such messages at default verbosity; That would only serve to confuse
users.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The management firmware is running on a Big Endian processor,
and when running on LE platform HW is configured to swap access
to memory shared between management firmware and driver on
32-bit granulariy.
As a result, for matters of simplicity most of the APIs between
driver and management firmware are based on 32-bit variables.
MAC settings are one exception, as driver needs to fill a byte
array when indicating to management firmware that primary MAC
has changed.
Due to the swap, driver must make sure that the mac that was
provided in byte-order would be translated into native order,
otherwise after the swap the management firmware would read
it swapped.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver interaction with management firmware involves a union
of all the data-members relating to the commands the driver prepares.
Current interface assumes the caller always passes such a union -
but thats cumbersome as well as risky [chancing a stack corruption
in case caller accidentally passes a smaller member instead of union].
Change implementation so that caller could pass a pointer to any
of the members instead of the union.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Interaction of driver -> management firmware is based
on a one-pending mailbox [per interface], and various
mailbox commands need to be synchronized.
Current scheme is messy, and there's a difficulty extending
it as it deals differently with various commands as well as
making assumption on the required behavior for load/unload
requests.
Drop the current scheme into a completion-list-based approach;
Each flow would try sending the command when possible,
allowing one flow to complete another flow's completion and
relieve the mailbox before sending its own command.
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since AQC-100/107/108 chips supports hardware checksums for RX we should indicate this
via NETIF_F_RXCSUM flag.
v1->v2: 'Signed-off-by' tag added.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ECC bit positions that describe whether the ECC interrupt is for
Tx, Rx or descriptor memory and whether the it is a single correctable
or double detected error were defined in incorrectly (reversed order).
Fix the bit position definitions for these settings so that the proper
ECC handling is performed.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If there are no egress packets pending, then don't look for tx completions
in napi poll. Also, fix broken tx queue wakeup logic.
Signed-off-by: VSR Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add workqueue that is periodically run to try to allocate RX buffers in OOM
conditions in PF and VF.
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Presumably if there is an "add" function, there is also a "del"
function. But it causes a static checker warning because it looks like
a common cut and paste bug.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed review comments from the previous patch-set.
- changed return value check of platform_get_irq() to < 0
- replaced devm_request(free)_irq() calls by request(free)_irq() since
they are called from open() and close()
- changed sizeof(struct mystruct) to sizeof(*mystruct)
- reduced indentation on tx_timeout()
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixed port reset sequence by adding ECC init.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added phy management support by using phy abstraction layer APIs.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dma operation mode configuration routine was wrongly moved to a
function (stmmac_mtl_configuration) that is only executed if the
core version is >= 4.00.
Fixes: 6deee2221e ("net: stmmac: prepare dma op mode config for multiple queues")
Reported-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we no longer need to keep the FW enabled for .ndo_close()
to work we can always stop FW after reconfiguration failure.
This seems to make most FWs more resilient to faults (at least
in error injection scenarios).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device open and close handlers check if the device is already
in the desired state. Thanks to our reconfig infrastructure
this should not be necessary, there doesn't seem to be any
code in the driver which depends on it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of ring full or DMA mapping error remember to flush xmit_more
delayed kicks.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP6000 doesn't use queue pointers/doorbells for RX, it uses
'done' bit in descriptors. Remove the pointers from data structures.
Since we are saving space in rx_ring structure make fields we
previously compressed to 16bits word size again.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix warning which was using netdev_warn() instead of dev_warn()
to early.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When acquiring an area fails we can't call function doing both
release and free.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Core should detect when someone is trying to request an access
window which is too large for a given type of access. Otherwise
the requester will be put on a wait queue for ever without any
error message.
Add const qualifiers to clarify that we are only looking at read-
-only members in relevant functions.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When signal interrupts waiting for an area to become available
we assume success. Pay attention to the return code. Unpack
the code a little bit to make it more readable.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
msleep_interruptible() returns time left to wait, not error
code. Return ERESTARTSYS when interrupted.
While at it correct a comment and make the polling a bit
more aggressive.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We shouldn't access area_cache_list without its lock even
to check if it's empty.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Document which fields of nfp_cpp are protected by which locks.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After mutex cache removal we can put the mutex code in a separate
source file. This makes it clear it doesn't play with internals
of struct nfp_cpp any more.
No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CPP mutex cache was introduced to work around the fact that the
same host could successfully acquire a lock multiple times. It
used to collapse multiple users to the same struct nfp_cpp_mutex
and track use count. Unfortunately it's racy. Since we now force
all nfp_mutex_lock() callers within the host to actually succeed
at acquiring the lock we no longer need the cache, let's remove it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The global device lock is acquired to search the resource table.
The lock is actually itself part of the table (entry 0).
Therefore if someone asks for resource 0 we would deadlock since
double locking is no longer allowed.
Currently the driver doesn't try to lock that resource so let's
simply make sure we fail graciously and not add special handling
of this case until really need. Hide the relevant defines in
the source file.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFP can be connected to multiple machines via PCI or other buses.
Access to hardware resources is arbitrated using locks residing
in device memory. Currently nfpcore only respects the mutexes
when it comes to inter-host locking, but if we try to acquire
the same lock again, on one host - it will simply return success
because owner of the lock is already set to that host.
This makes the locks useless for arbitration within one host
and unfair because whichever host grabbed the lock will have
a chance to reacquire it without others getting a shot.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset")
removed the bcmgenet_mii_reset() function from bcmgenet_power_up() and
bcmgenet_internal_phy_setup() functions. In so doing it broke the reset
of the internal PHY devices used by the GENETv1-GENETv3 which required
this reset before the UniMAC was enabled. It also broke the internal
GPHY devices used by the GENETv4 because the config_init that installed
the AFE workaround was no longer occurring after the reset of the GPHY
performed by bcmgenet_phy_power_set() in bcmgenet_internal_phy_setup().
In addition the code in bcmgenet_internal_phy_setup() related to the
"enable APD" comment goes with the bcmgenet_mii_reset() so it should
have also been removed.
Commit bd4060a610 ("net: bcmgenet: Power on integrated GPHY in
bcmgenet_power_up()") moved the bcmgenet_phy_power_set() call to the
bcmgenet_power_up() function, but failed to remove it from the
bcmgenet_internal_phy_setup() function. Had it done so, the
bcmgenet_internal_phy_setup() function would have been empty and could
have been removed at that time.
Commit 5dbebbb44a ("net: bcmgenet: Software reset EPHY after power on")
was submitted to correct the functional problems introduced by
commit 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset"). It
was included in v4.4 and made available on 4.3-stable. Unfortunately,
it didn't fully revert the commit because this bcmgenet_mii_reset()
doesn't apply the soft reset to the internal GPHY used by GENETv4 like
the previous one did. This prevents the restoration of the AFE work-
arounds for internal GPHY devices after the bcmgenet_phy_power_set() in
bcmgenet_internal_phy_setup().
This commit takes the alternate approach of removing the unnecessary
bcmgenet_internal_phy_setup() function which shouldn't have been in v4.3
so that when bcmgenet_mii_reset() was restored it should have only gone
into bcmgenet_power_up(). This will avoid the problems while also
removing the redundancy (and hopefully some of the confusion).
Fixes: 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dma_mapping_error() returns true if there is an error but we want
to return -ENOMEM and not 1.
Fixes: 65e0ace2c5 ("net: dwc-xlgmac: Initial driver for DesignWare Enterprise Ethernet")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes to support multiple queues in the device tree bindings
resulted in the number of RX and TX queues to be initialized to zero for
device trees not adhering to the new bindings.
Restore backwards-compatibility with those device trees by falling back
to a single RX and TX queues each.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Tested-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MAC RX queues always need to be enabled in order to receive network
packets. Remove the condition that this only needs to be done for multi-
queue configurations.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RX packets statistics ('rx_packets' counter) used to count LRO packets
as one, even though it contains multiple segments.
This patch will increment the counter by the number of segments, and
align the driver with the behavior of other drivers in the stack.
Note that no information is lost in this patch due to 'rx_lro_packets'
counter existence.
Before, ethtool showed:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
rx_packets: 435277
rx_lro_packets: 35847
rx_packets_phy: 1935066
Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
rx_packets: 1935066
rx_lro_packets: 35847
rx_packets_phy: 1935066
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
TX packets statistics ('tx_packets' counter) used to count GSO packets
as one, even though it contains multiple segments.
This patch will increment the counter by the number of segments, and
align the driver with the behavior of other drivers in the stack.
Note that no information is lost in this patch due to 'tx_tso_packets'
counter existence.
Before, ethtool showed:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
tx_packets: 61340
tx_tso_packets: 60954
tx_packets_phy: 2451115
Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
tx_packets: 2451115
tx_tso_packets: 60954
tx_packets_phy: 2451115
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
With ConnectX-4 sharing SRQs from the same space as QPs, we hit a
limit preventing some applications to allocate needed QPs amount.
Double the size to 256K.
Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was added to allow the TC offloading code to identify offloading
encap/decap vxlan rules.
The VF reps are effectively related to the same mlx5 PCI device as the
PF. Since the kernel invokes the (say) delete ndo for each netdev, the
FW erred on multiple vxlan dst port deletes when the port was deleted
from the system.
We fix that by keeping the registration to be carried out only by the
PF. Since the PF serves as the uplink device, the VF reps will look
up a port there and realize if they are ok to offload that.
Tested:
<SETUP VFS>
<SETUP switchdev mode to have representors>
ip link add vxlan1 type vxlan id 44 dev ens5f0 dstport 9999
ip link set vxlan1 up
ip link del dev vxlan1
Fixes: 4a25730eb2 ('net/mlx5e: Add ndo_udp_tunnel_add to VF representors')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we use the non UAPI values and we miss erring on
the modify action which is not supported, fix that.
Fixes: 8b32580df1 ('net/mlx5e: Add TC vlan action for SRIOV offloads')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Changing the eswitch inline mode can potentially cause already configured
flows not to match the policy. E.g. set policy L4, add some L4 rules,
set policy to L2 --> bad! Hence we disallow it.
Keep track of how many offloaded rules are now set and refuse
inline mode changes if this isn't zero.
Fixes: bffaa91658 ("net/mlx5: E-Switch, Add control for inline mode")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor the code to deal with add/del TC rules to have handler per NIC/E-switch
offloading use case, and push the latter into the e-switch code. This provides
better separation and is to be used in down-stream patch for applying a fix.
Fixes: bffaa91658 ("net/mlx5: E-Switch, Add control for inline mode")
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch cases for the rate limit set and query commands were
missing, which could get us wrong under fw error or driver reset
flow, fix that.
Fixes: 1466cc5b23 ('net/mlx5: Rate limit tables support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not open code getting the MAC address exclusively from the
"local-mac-address" property, but instead use of_get_mac_address() which
looks up the MAC address using the 3 typical property names.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix Coverity scan errors by not dereferencing lio->glists_dma_base pointer
if it's NULL.
See http://marc.info/?l=linux-netdev&m=149002294305614&w=2
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: VSR Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The arguments packets and bytes to call mlxsw_sp_acl_rule_get_stats are
in the wrong order. Fix this by swapping them.
Detected by CoverityScan, CID#1419705 ("Arguments in wrong order")
Fixes: 7c1b8eb175 ("mlxsw: spectrum: Add support for TC flower offload statistics")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With posix timers having become optional, we get a build error with
the cpts time sync option of the CPSW driver:
drivers/net/ethernet/ti/cpts.c: In function 'cpts_find_ts':
drivers/net/ethernet/ti/cpts.c:291:23: error: implicit declaration of function 'ptp_classify_raw';did you mean 'ptp_classifier_init'? [-Werror=implicit-function-declaration]
This adds a hard dependency on PTP_CLOCK to avoid the problem, as
building it without PTP support makes no sense anyway.
Fixes: baa73d9e47 ("posix-timers: Make them configurable")
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dependency is reversed: cpsw and netcp call into cpts,
but cpts depends on the other two in Kconfig. This can lead
to cpts being a loadable module and its callers built-in:
drivers/net/ethernet/ti/cpsw.o: In function `cpsw_remove':
cpsw.c:(.text.cpsw_remove+0xd0): undefined reference to `cpts_release'
drivers/net/ethernet/ti/cpsw.o: In function `cpsw_rx_handler':
cpsw.c:(.text.cpsw_rx_handler+0x2dc): undefined reference to `cpts_rx_timestamp'
drivers/net/ethernet/ti/cpsw.o: In function `cpsw_tx_handler':
cpsw.c:(.text.cpsw_tx_handler+0x7c): undefined reference to `cpts_tx_timestamp'
drivers/net/ethernet/ti/cpsw.o: In function `cpsw_ndo_stop':
As a workaround, I'm introducing another Kconfig symbol to
control the compilation of cpts, while making the actual
module controlled by a silent symbol that is =y when necessary.
Fixes: 6246168b4a ("net: ethernet: ti: netcp: add support of cpts")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are using the smallest padding boundary (8 bytes), which isn't
smaller than the Memory Controller Read/Write Size
We get best performance in 100G when the Packing Boundary is a multiple
of the Maximum Payload Size. Its related to inefficient chopping of DMA
packets by PCIe, that causes more overhead on bus. So driver is helping
by making the starting address alignment to be MPS size.
We will try to determine PCIE MaxPayloadSize capabiltiy and set
IngPackBoundary based on this value. If cache line size is greater than
MPS or determinig MPS fails, we will use cache line size to determine
IngPackBoundary(as before).
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When building the driver as a module, we get a warning about the
lack of a license:
WARNING: modpost: missing MODULE_LICENSE() in drivers/net/ethernet/synopsys/dwc-xlgmac.o
see include/linux/module.h for more information
Curiously the text in the .c files only mentions GPLv2+, while the license
tag in the PCI driver contains both GPL and BSD. I picked the license text
as the more definite reference here and put a GPL tag in there.
Fixes: 65e0ace2c5 ("net: dwc-xlgmac: Initial driver for DesignWare Enterprise Ethernet")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this header, we can run into a build error:
drivers/net/ethernet/synopsys/dwc-xlgmac-hw.c: In function 'xlgmac_config_queue_mapping':
drivers/net/ethernet/synopsys/dwc-xlgmac-hw.c:1548:36: error: 'IEEE_8021QAZ_MAX_TCS' undeclared (first use in this function)
prio_queues = min_t(unsigned int, IEEE_8021QAZ_MAX_TCS,
Fixes: 65e0ace2c5 ("net: dwc-xlgmac: Initial driver for DesignWare Enterprise Ethernet")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jie Deng <jiedeng@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2017-03-21
This series contains updates to e1000, e1000e, igb, igbvf and ixgb.
This finishes up the work Philippe Reynes did to update the Intel drivers
to the new API for ethtool (get|set)_link_ksettings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The link information exists only on the leading hwfn,
but some of its derivatives [e.g., min/max rate] need to
be configured for each hwfn.
When re-basing the VF link view, use the leading hwfn
information as basis for all existing hwfns to allow
said configurations to stick.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Malicious VF existance should be interesting enough for the
hyperuser. Change the PF indication that one of its child VF
became malicious to appear by default.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF<->VF interface allows for the VF to request
multiple queues closure via a single message, but this has
never been used by any official driver.
We now deprecate this option, forcing each queue close
to arrive via a different command; This would be required
for future TLVs that are going to extend the queue TLVs with
additional information on a per-queue basis.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PF needs to validate the status of VF queues before asking firmware
to configure anything for them, but that validation is done in various
different forms - sometimes inadequate.
Add auxillary functions that can be used for testing of the queue
state and convert the various flows to use those instead of current
existing flows; Also, add missing validations where needed.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When starting the VF's vport, the PF would first configure
the status blocks of the VF and then reset them.
That would cause some of the configured information to be lost -
specifically it would mean that all the VFs queues would use
the Rx coalescing state-machine of the status block.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PF responds to the VF requests it also cleans the HW-channel
indication in firmware to allow further VF messages to arrive,
but the order currently applied is wrong -
The PF is copying by DMAE the response the VF is polling on for
completion, and only afterwards sets the HW-channel to ready state.
This creates a race condition where the VF would be able to send
an additional message to the PF before the channel would get ready
again, causing the firmware to consider the VF as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a VF is considered malicious, driver handling of the VF
FLR flow would clean said indication - but not if the FLR is
part of an sriov-disable flow.
That leads to further issues, as PF wouldn't re-enable the
previously malicious VF when sriov is re-enabled.
No reason for that - simply clean malicious indications in
the sriov-disable flow as well.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VFs are currently logging errors when communicating
with their PFs in a too-low verbosity that wouldn't
be shown by default. As timeouts and failed commands
are crucial for VF operability, make them appear by
default.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change to support host<->firmware command return value.
Fix for vf mac addr state command.
1. Added support for firmware commands to return a value:
- previously, the returned code overlapped with host codes, thus
commands were only returning 0 (success) or -1 (interpreted as
timeout)
- per 'response_manager.h', the error codes are split into two fields
(major/minor) now, firmware commands are grouped into their own
'major' group, separate from the host's 'major' group, which allow f/w
commands to return any 16-bit value
2. The command to set vf mac addr was logging a success message even if
command failed. Now command uses a callback function to log the status
message.
3. The command to set vf mac addr was not logging a message when set via
the host 'ip' command. Now, the callback function will log an
appropriate message.
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When closing the ibmvnic device we need to release the resources used
in communicating to the virtual I/O server. These need to be
re-negotiated with the server at open time.
This patch moves the releasing of resources a separate routine
and updates the open and close handlers to release all resources at
close and re-negotiate and allocate these resources at open.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The intialization of the ibmvnic driver with respect to the virtual
server it connects to should be moved to its own routine. This will
alolow the driver to initiate this process from places outside of
the drivers probe routine.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the code that handles login and renegotiation of ibmvnic
capabilities to its own routine.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VNIC server expects LINK_STATE_UP to be sent within 30s of the login. If we
exceed the timeout, VNIC server will attempt to fail over. Since time
between probe and open of the device is indeterminate, move login and queue
negotiation into ibmvnic open so we can guarantee that login and sending
LINK_STATE_UP occur within the 30s window.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function gem_begin_auto_negotiation dereference
the pointer ep before testing if it's null. This
patch add a check on ep before dereferencing it.
Fixes: 92552fdda5 ("net: sun: sungem: use new api
ethtool_{get|set}_link_ksettings")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We could allocate less memory than intended because we do:
bnad->regdata = kzalloc(len << 2, GFP_KERNEL);
The shift can overflow leading to a crash. This is debugfs code so the
impact is very small.
Fixes: 7afc5dbde0 ("bna: Add debugfs interface.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add timeout error message in lio_process_ordered_list(). Add host failure
status in existing error message in if_cfg_callback().
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove code duplicated in PF and VF; define that code once only in a common
header file included by PF and VF.
Signed-off-by: Satanand Burla <satananda.burla@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the configuration of RX queues' routing.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the configuration of RX and TX queues' priority.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch creates 2 new structures (stmmac_tx_queue and stmmac_rx_queue)
in include/linux/stmmac.h, enabling that each RX and TX queue has its
own buffers and data.
Signed-off-by: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use ether_addr_copy() instead of memcpy() to set netdev->dev_addr (which
is 2-byte aligned).
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Reviewed-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Align the default case for matchall offload with what's there
for flower.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>