This patch fixes the lost of Ethernet port on low memory system,
when driver frees its resources and fails to allocate new resources.
Issue could happen while changing number of channels, rings size or
changing the timestamp configuration.
This fix is necessary because of removing vmap use in the code.
When vmap was in use driver could allocate non-contiguous memory
and make it contiguous with vmap. Now it could fail to allocate
a large chunk of contiguous memory and lose the port.
Current code tries to allocate new resources and then upon success
frees the old resources.
Fixes: 73898db043 ('net/mlx4: Avoid wrong virtual mappings')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Filters cleanup should be done once before destroying net device,
since filters list is contained in the private data.
Fixes: 1eb8c695bd ('net/mlx4_en: Add accelerated RFS support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bridge device in NS2 has the same device ID as the ethernet controller.
Add check to avoid probing the bridge device.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate special vnic for dropping packets not matching the RX filters.
First vnic is for normal RX packets and the driver will drop all
packets on the 2nd vnic.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allocate napi for special vnic, packets arriving on this
napi will simply be dropped and the buffers will be replenished back
to the HW.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware is unable to drop rx packets not matching the RX filters. To
workaround it, we create a special VNIC and configure the hardware to
direct all packets not matching the filters to it. We then setup the
driver to drop packets received on this VNIC.
This patch creates the infrastructure for this VNIC, reserves a
completion ring, and rx rings. Only shared completion ring mode is
supported. The next 2 patches add a NAPI to handle packets from this
VNIC and the setup of the VNIC.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nitro A0 has a hardware bug in the rx path. The workaround is to create
a special COS context as a path for non-RSS (non-IP) packets. Without this
workaround, the chip may stall when receiving RSS and non-RSS packets.
Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC. Allocate
and configure the CoS context for Nitro A0.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nitro is the embedded version of the ethernet controller in the North
Star 2 SoC. Add basic code to recognize the chip ID and disable
the features (ntuple, TPA, ring and port statistics) not supported on
Nitro A0.
Signed-off-by: Prashant Sreedharan <prashant.sreedharan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently 'ifconfig' for the Ethernet devices handled by this driver shows
"DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
a value to 'net_device::dma' causes 'ifconfig' to correctly not report a
DMA channel.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently 'ifconfig' for the Ethernet devices handled by this driver shows
"DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
a value to 'net_device::dma' causes 'ifconfig' to correctly not report a
DMA channel.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In 'cpmac_open', 'dma_alloc_coherent' has been used to allocate some
resources, so we need to free them using 'dma_free_coherent' instead
of 'kfree'.
Also, we don't need to free these resources if the allocation has failed.
So I have slighly modified the goto label in this case.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
There was a check on CAP_NET_ADMIN in bfin_mac_ethtool_setsettings,
but this check is already done in dev_ethtool, so no need to repeat
it before calling the generic function.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_workqueue replaces deprecated create_singlethread_workqueue().
A dedicated workqueue has been used since the workitem viz
lp->txtimeout_reinit is involved in reinitialization if a TX timeout
occurs, which is necessary to guarantee forward progress in packet
processing. As a network device can be used during memory reclaim, the
workqueue needs forward progress guarantee under memory pressure.
WQ_MEM_RECLAIM has been set to ensure this.
Since there is only a single work item, explicit concurrency limit is
unnecessary here.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The label lio_xmit_failed is used 3 times through liquidio_xmit() but it
always makes a call to dma_unmap_single() using potentially
uninitialized variables from "ndata" variable. Out of the 3 gotos, 2 run
after ndata has been initialized, and had a prior dma_map_single() call.
Fix this by adding a new error label: lio_xmit_dma_failed which does
this dma_unmap_single() and then processed with the lio_xmit_failed
fallthrough.
Fixes: f21fb3ed36 ("Add support of Cavium Liquidio ethernet adapters")
Reported-by: coverity (CID 1309740)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case nb8800_receive() fails to allocate a fragment, we would leak the
SKB freshly allocated and just return, instead, free it.
Reported-by: coverity (CID 1341750)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Mans Rullgard <mans@mansr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should be using a logical check here instead of a bitwise operation
to check if the device is closed already in et131x_tx_timeout().
Reported-by: coverity (CID 146498)
Fixes: 38df6492eb ("et131x: Add PCIe gigabit ethernet driver et131x to drivers/net")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the Hisilicon Fast Ethernet MAC(FEMAC) driver.
The FEMAC supports max speed 100Mbps and has been used in many
Hisilicon SoC.
Signed-off-by: Dongpo Li <lidongpo@hisilicon.com>
Reviewed-by: Jiancheng Xue <xuejiancheng@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TI_CPSW_PHY_SEL depended on TI_CPSW and was selected by the latter. So
there is no reason to have this symbol visible.
A further optimisation would be to put the code for both symbols into a
single module which would allow to not export at least cpsw_phy_sel()
and simplify the module load process.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
There was a check on CAP_NET_ADMIN in cpmac_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
There was a check on CAP_NET_ADMIN in au1000_set_settings, but this
check is already done in dev_ethtool, so no need to repeat it before
calling the generic function.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phydev in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nothing is decrementing the index "i" while we are cleaning up the
fragments we could not successful transmit.
Fixes: 9cde94506e ("bgmac: implement scatter/gather support")
Reported-by: coverity (CID 1352048)
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets entering the switch are mapped to a Switch Priority (SP)
according to their PCP value (untagged frames are mapped to SP 0).
The packets are classified to a priority group (PG) buffer in the port's
headroom according to their SP.
The switch maintains another mapping (SP to IEEE priority), which is
used to generate PFC frames for lossless PGs. This mapping is
initialized to IEEE = SP % 8.
Therefore, when mapping SP 'x' to PG 'y' we create a situation in which
an IEEE priority is mapped to two different PGs:
IEEE 'x' ---> SP 'x' ---> PG 'y'
IEEE 'x' ---> SP 'x + 8' ---> PG '0' (default)
Which is invalid, as a flow can use only one PG buffer.
Fix this by mapping both SP 'x' and 'x + 8' to the same PG buffer.
Fixes: 8e8dfe9fdf ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The number of supported traffic classes that can have ETS and PFC
simultaneously enabled is not subject to user configuration, so make
sure we always initialize them to the correct values following a set
operation.
Fixes: 8e8dfe9fdf ("mlxsw: spectrum: Add IEEE 802.1Qaz ETS support")
Fixes: d81a6bdb87 ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can't have PAUSE frames and PFC both enabled on the same port, but
the fact that ieee_setpfc() was called doesn't necessarily mean PFC is
enabled.
Only emit errors when PAUSE frames and PFC are enabled simultaneously.
Fixes: d81a6bdb87 ("mlxsw: spectrum: Add IEEE 802.1Qbb PFC support")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device supports link autonegotiation, so let the user know about it
by indicating support via ethtool ops.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting a new speed we need to disable and enable the port for the
changes to take effect. We currently only do that if the operational
state of the port is up. However, setting a new speed following link
training failure will require us to explicitly set the port down and then
up.
Instead, disable and enable the port based on its administrative state.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the q_vector initialization routine sets the affinity_mask
of a q_vector based on v_idx value. Meaning a loop iterates on v_idx,
which is an incremental value, and the cpumask is created based on
this value.
This is a problem in systems with multiple logical CPUs per core (like in
SMT scenarios). If we disable some logical CPUs, by turning SMT off for
example, we will end up with a sparse cpu_online_mask, i.e., only the first
CPU in a core is online, and incremental filling in q_vector cpumask might
lead to multiple offline CPUs being assigned to q_vectors.
Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.
In general case, when SMT is off the cpu_online_mask has only C bits set:
0, 1*N, 2*N, ..., C*(N-1) where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.
This patch changes the way q_vector's affinity_mask is created: it iterates
on v_idx, but consumes the CPU index from the cpu_online_mask instead of
just using the v_idx incremental value.
No functional changes were introduced.
Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the function ixgbe_poll() returns 0 when it clean completely
the rx rings, but this foul budget accounting in core code.
Fix this returning the actual work done, capped to weight - 1, since
the core doesn't allow to return the full budget when the driver modifies
the napi status
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch sets VSI broadcast promiscuous mode during VSI add sequence
and prevents adding MAC filter if specified MAC address is broadcast.
Change-ID: Ia62251fca095bc449d0497fc44bec3a5a0136773
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There are a couple of issues I found in i40e_rx_checksum while doing some
recent testing. As a result I have found the Rx checksum logic is pretty
much broken and returning that the checksum is valid for tunnels in cases
where it is not.
First the inner types are not the correct values to use to test for if a
tunnel is present or not. In addition the inner protocol types are not a
bitmask as such performing an OR of the values doesn't make sense. I have
instead changed the code so that the inner protocol types are used to
determine if we report CHECKSUM_UNNECESSARY or not. For anything that does
not end in UDP, TCP, or SCTP it doesn't make much sense to report a
checksum offload since it won't contain a checksum anyway.
This leaves us with the need to set the csum_level based on some value.
For that purpose I am using the tunnel_type field. If the tunnel type is
GRENAT or greater then this means we have a GRE or UDP tunnel with an inner
header. In the case of GRE or UDP we will have a possible checksum present
so for this reason it should be safe to set the csum_level to 1 to indicate
that we are reporting the state of the inner header.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
'vr' should be a valid pointer here, so returning 'PTR_ERR(vr)' is wrong.
Return an explicit error code (-ENOENT) instead.
Fixes: 61c503f976 ("mlxsw: spectrum_router: Implement fib4 add/del switchdev obj ops")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two generics functions phy_ethtool_{get|set}_link_ksettings,
so we can use them instead of defining the same code in the driver.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The private structure contain a pointer to phydev, but the structure
net_device already contain such pointer. So we can remove the pointer
phy in the private structure, and update the driver to use the
one contained in struct net_device.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The VF representors support only TC filter/action offloads
(not mqprio) and this is enabled for them by default.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enhance the TC offload code such that when the eswitch exists and it's
mode being SRIOV offloads, we do TC actions parsing and setup targeted
for eswitch. Next, we add the offloaded flow to the HW e-switch (fdb).
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the setup code that parses the TC actions needed to support offloading drop
and mirred/redirect for SRIOV e-switch. We can redirect between two devices if
they belong to the same HW switch, compare the switchdev HW ID attribute to
enforce that.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Towards reusing the TC offloads code for an SRIOV use-case, change some of the
helper functions to have _nic in their names so it's clear what's NIC unique
and what's general. Also group together the NIC related helpers so we can easily
branch per the use-case in downstream patch.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows for upper levels in the driver, e.g the TC offload code to add
e-switch offloaded steering rules. The caller provides the rule spec for
matching, action, source and destination vports.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the offloads mode, some slow path rules are added by the driver (e.g
send-to-vport), while offloaded rules are to be added from upper layers.
The slow path rules have lower priority and we don't want matching on
offloaded rules to suffer from extra steering hops related to the slow
path rules.
We use two priorities, one for offloaded rules (fast path), and one for
the control rules (slow path). To allow for that, we enable two priorities
for the FDB namespace in the FS core code.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currenly, the code that programs the flow actions into the firmware
doesn't check if was actually asked to offload the statistics, fix that.
Fixes: aad7e08d39 ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit utilize the ability of ConnectX-4 to bulk read flow counters.
Few bulk counter queries could be done instead of issuing thousands of
firmware commands per second to get statistics of all flows set to HW,
such as those programmed when we offload tc filters.
Counters are stored sorted by hardware id, and queried in blocks (id +
number of counters).
Due to hardware requirement, start of block and number of counters in a
block must be four aligned.
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use bulk counters, we need to have counters sorted by id.
Signed-off-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add inline function that checks if there is a pending tx packet.
Signed-off-by: Elad Kanfi <eladkan@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Start all tx queues (including inactive ones) when opening the netdev.
Stop all tx queues (including inactive ones) when closing the netdev.
This is a workaround for the tx timeout watchdog false alarm issue in
which the netdev watchdog is polling all the tx queues which may include
inactive queues and thus once lowering the real tx queues number
(ethtool -L) it will generate tx timeout watchdog false alarms.
Fixes: 3947ca1859 ('net/mlx5e: Implement ndo_tx_timeout callback')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change netif_tx_queue_stopped to netif_xmit_stopped. This will show
when queues are stopped due to byte queue limits.
Fixes: 3947ca1859 ('net/mlx5e: Implement ndo_tx_timeout callback')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even though the hardware can be doing zero padding, we want the SKB to
be going out on the wire with the appropriate size. This fixes packet
truncations observed with e.g: ARP packets.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case any operation fails before we can successfully go the point
where we would register a MDIO bus, we would be going to an error label
which involves unregistering then freeing this yet to be created MDIO
bus. Update all error paths to go to label free which is the only one
valid until either the clock is enabled, or the MDIO bus is allocated
and registered. This fixes kernel oops observed while trying to
dereference the MDIO bus structure which is not yet allocated.
Fixes: a170285772 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trace EMAD messages going down to HW and up from HW. Devlink needs to be
registered before EMAD init so the trace function can be called
with valid devlink handle.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
v1->v2:
- Use trace_devlink_hwmsg directly
Signed-off-by: David S. Miller <davem@davemloft.net>
During commit b54b8c2d6e
("net: ezchip: adapt driver to little endian architecture")
adapting to little endian architecture,
zeroing of controller was left out.
Signed-off-by: Elad Kanfi <eladkan@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix missing clk_disable_unprepare() call before return
from dwceqos_probe() in the error handling case of invalid
fixed-link.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warnings:
drivers/net/ethernet/mediatek/mtk_eth_soc.c:79:5: warning:
symbol '_mtk_mdio_write' was not declared. Should it be static?
drivers/net/ethernet/mediatek/mtk_eth_soc.c:98:5: warning:
symbol '_mtk_mdio_read' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
PTR_ERR should access the value just tested by IS_ERR, otherwise
the wrong error code will be returned.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of error, the function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
idx can be returned as -ENOSPC, so we should check for this first
before using it as an index into nn->vxlan_usecnt[] to avoid an
out of bounds array offset read.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable lan91x adapters in some ARM machines and models
when booted with an ACPI kernel.
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some comments weren't updated to reflect the renaming of ndo's and the
change of arguments.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 4386f5662e ("net: ethernet: bcmgenet: use
phy_ethtool_{get|set}_link_ksettings")
This patch is wrong, the function phy_ethtool_{get|set}_link_ksettings
don't check if the device is running, but the driver bcmgenet need this
check.
The function {get|set}_settings need to access the mdio bus, and this
bus may only be used when the device is running. Otherwise, the clock
is disable and a mdio access will fail.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rc is not initialized so it can contain garbage if it is not
set by the call to bnxt_read_sfp_module_eeprom_info. Ensure
garbage is not returned by initializing rc to 0.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for TSE PCS that uses SGMII adapter when the phy-mode of
the dwmac is set to sgmii.
Signed-off-by: Tien Hock Loh <thloh@altera.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bcma portion of the driver has been split off into a bcma specific
driver. This has been mirrored for the platform driver. The last
references to the bcma core struct have been changed into a generic
function call. These function calls are wrappers to either the original
bcma code or new platform functions that access the same areas via MMIO.
This necessitated adding function pointers for both platform and bcma to
hide which backend is being used from the generic bgmac code.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bgmac driver is using the bcma provides device ID and revision, as
well as the SoC ID and package, to determine which features are
necessary to enable, reset, etc in the driver. In anticipation of
removing the bcma requirement for this driver, these must be changed to
not reference that struct. In place of that, each "feature" has been
given a flag, and the flags are enabled for their respective device and
SoC.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the BCMA MDIO phy into a separate file, as it is very tightly
coupled with the BCMA bus. This will help with the upcoming BCMA
removal from the bgmac driver. Optimally, this should be moved into
phy drivers, but it is too tightly coupled with the bgmac driver to
effectively move it without more changes to the driver.
Note: the phy_reset was intentionally removed, as the mdio phy subsystem
automatically resets the phy if a reset function pointer is present. In
addition to the moving of the driver, this reset function is added.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dma buffer allocation, etc references a dma_dev device pointer from
the bcma core. In anticipation of removing the bcma requirement for
this driver, these must be changed to not reference that struct. Add a
dma_dev device pointer to the bgmac stuct and reference that instead.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bgmac_* print wrappers call dev_* prints with the dev pointer from
the bcma core. In anticipation of removing the bcma requirement for
this driver, these must be changed to not reference that struct. So,
simply change all of the bgmac_* prints to their dev_* counterparts. In
some cases netdev_* prints are more appropriate, so change those as
well.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases, if there is no VNIC server available during the driver
probe, the driver should wait until it receives an initialization
request from the VNIC Server to start the login process. Recent testing
has show that this is incorrectly handled in the current driver.
The proposed solution handles this initialization request by scheduling
a task in the shared workqueue that completes the login process and
registers the net device.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch creates a function that handles sub-CRQ IRQ creation
separately from sub-CRQ initialization. Another function is then needed
to release sub-CRQ resources prior to sub-CRQ IRQ creation.
These additions allow the driver probe function to be simplified,
specifically during the VNIC Server login process. A timeout is also
included while waiting for completion of the login process in case
the VNIC Server is not available or some other error occurs.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IRQ mappings were not being properly disposed when releasing sub-CRQ's.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since ibmvnic uses multiple tx queues, start and stop all queues when
opening and closing devices.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code generates as static checker warning because htons(ETH_P_IPV6)
is always true. From the context it looks like the && was intended to
be !=.
Fixes: 94758f8de0 ('bnxt_en: Add GRO logic for BCM5731X chips.')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit aebea2ba0f ("net: mvneta: fix Tx interrupt delay") intended to
set coalescing threshold to a value guaranteeing interrupt generation
per each sent packet, so that buffers can be released with no delay.
In fact setting threshold to '1' was wrong, because it causes interrupt
every two packets. According to the documentation a reason behind it is
following - interrupt occurs once sent buffers counter reaches a value,
which is higher than one specified in MVNETA_TXQ_SIZE_REG(q). This
behavior was confirmed during tests. Also when testing the SoC working
as a NAS device, better performance was observed with int-per-packet,
as it strongly depends on the fact that all transmitted packets are
released immediately.
This commit enables NETA controller work in interrupt per sent packet mode
by setting coalescing threshold to 0.
Signed-off-by: Dmitri Epshtein <dima@marvell.com>
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Cc: <stable@vger.kernel.org> # v3.10+
Fixes aebea2ba0f ("net: mvneta: fix Tx interrupt delay")
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pinned timers must carry the pinned attribute in the timer structure
itself, so convert the code to the new API.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Chris Mason <clm@fb.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094341.376394205@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove .owner field since calls to module_platform_driver() will
set it automatically.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/mellanox/mlx5/core/en.h
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
drivers/net/usb/r8152.c
All three conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Change t4fw_version.h to update latest firmware version number
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC complains on unused-but-set-variable, clean this up.
Fixes: 23898c763f ('net/mlx5: E-Switch, Modify node guid on vf set MAC')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of error, function devm_ioremap_resource() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now, the driver sends arp probes for all unresolved neighbours that are
currently a nexthop for some route on the system. The job is set
periodically every 5 seconds.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For nexthop neighbours we need to make kernel to think there is a traffic
flowing to them preventing it from going to stale state. Otherwise
kernel would stale it and eventually the neigh would be removed from HW
and nexthop as well. That would reduce ECMP group in HW.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement next-hop routing offload including ECMP. To make it possible,
introduce next-hop group entity. This entity keeps track of resolved
neighbours and updates HW adjacency table accordingly. Note that HW
next-hops are stored in this adjacency table, in form of MAC.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RALEU register is used to mass update remote action adjacency index
and ecmp size.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RATR register is used to configure the Router Adjacency (next-hop)
Table.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a very simple manager for KVD linear area. Currently, the
allocator will either allocate a single entry from pre-defined sub-area,
or in case more than one entry is needed, it will allocate 32-entry chunk
in other pre-defined sub-area.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Override the defaults and define the area sizes ourselves.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now we only used hash-based tables in the device, but we are
going to use the linear table for remote routes adjacency lists.
Add the configuration fields that control the size of the linear table.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Listen to any NEIGH_UPDATE events sent and program the device
accordingly. If NUD state is VALID and neighbour isn't yet offloaded,
then program it into the device's table. Otherwise, just edit its
parameters.
If NUD state machine transitioned neighbour out of VALID state and it's
present in the device's table, then remove it.
Note that the device is programmed in delayed work, as the netevent
notification chain is atomic and prevents us from going to sleep.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As previously explained, the driver should periodically poll the device
for neighbours activity according to the configured DELAY_PROBE_TIME.
This will prevent active neighbours from staying in STALE state for long
periods of time.
During init configure the polling interval according to the
DELAY_PROBE_TIME used in the default table. In addition, register a
netevent notification block, so that the interval is updated whenever
DELAY_PROBE_TIME changes.
Using the computed interval schedule a delayed work, which will update
the kernel via neigh_event_send() on any active neighbour since the last
delayed work.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RAUHTD register allows dumping entries from the Router Unicast Host
Table.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The RAUHT register is used to configure and query the Unicast Host Table
in devices that implement the Algorithmic LPM. In other words, it is
used to configure neighbour entries in the device.
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to hold some private data for every neigh entry. It would be
possible to do it using neigh_priv_len/ndo_neigh_construct/
ndo_neigh_destroy however only for the port device itself. That would not
work for stacked devices like bridge/team/bond. So introduce a private
neigh table. Hook onto ndos neigh_construct/destroy and add/remove
table entry according to that.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As the following patch will allow upper devices to follow the call down
lower devices, we need to add dev here and not rely on n->dev.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bump version to 0.28 and date to 4th of July 2016.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update my email address in the driver and MAINTAINERS file.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We maintain how much work we did in NAPI context, so provide that with
napi_complete_done().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are already in hard IRQ context, so we can use
__napi_schedule_irqoff() to save a few operations.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kick the transmission only if this is the last SKB to transmit or the
queue is not already stopped.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of taking one interrupt per packet transmitted, re-use the same
NAPI context to free transmitted buffers. Since we are no longer in hard
IRQ context replace dev_kfree_skb_irq() by dev_kfree_skb().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pad the SKB to the minimum length of ETH_ZLEN by using skb_put_padto()
and take this operation out of the critical section since there is no
need to check any HW resources before doing that.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
r6040_xmit() is increasing transmit statistics during transmission while
this may still fail, do this in r6040_tx() where we complete transmitted
buffers instead.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of open coding our own version utilize the library provided
function.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove including <linux/version.h> that don't need it.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just like per prio counters, the global flow counters are queried from
per priority counters register.
Global flow control counters are stored in priority 0 PFC counters.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the needed descriptors to expose RoCE RDMA counters.
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enhance the existing get_rxnfc callback:
1. Get flow rule of specific ID.
2. Get all flow rules.
3. Get number of rules.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to add flow steering rules with ethtool
of L3/L4 flow types (ip4/tcp4/udp4).
Those rules will be in higher priority than l2 flow rules, in order
to prefer more specific rules.
Mask is not supported for l3/l4 flow types.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement etrhtool set_rxnfc callback to support ethtool flow spec
direct steering. This patch adds only the support of ether flow type
spec. L3/L4 flow specs support will be added in downstream patches.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of explicitly cleaning up the well known parts of the steering
tree, we use the generic tree structure to traverse for cleanup.
No functional changes.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of having all steering private name spaces and
steering module fields flat in mlx5_core_priv, we wrap
them in mlx5_flow_steering for better modularity and
API exposure.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce the set of arguments passed to mlx5_add_flow_rule
by introducing flow_spec structure.
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 8067302973 ("net-next: mediatek: add support for IRQ grouping")
adds handling for irq 1 and 2 to the uninit function but did not remove
irq 0 which is not used since irq grouping was introduced. Fix this by
removing the superfluous call to free_irq().
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: John Crispin <john@phrozen.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit f786f3564c ("net: ethernet: lpc_eth: use phydev
from struct net_device") the 'pldat' variable became unused, so
just remove it.
Reported-by: Olof's autobuilder <build@lixom.net>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As with the previously introduced L3 interfaces, listen to 'inetaddr'
notifications sent for bridges devices configured on top of the port
netdevs and create / destroy router interfaces (RIFs) accordingly.
This also includes VLAN devices configured on top of the VLAN-aware
bridge.
The RIFs will be destroyed either when the last IP address is removed or
when the underlying FID is is destroyed.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before introducing support for L3 interfaces on top of the VLAN-aware
bridge we need to add some missing infrastructure.
Such an interface can either be the bridge device itself or a VLAN
device on top of it. In the first case the router interface (RIF) is
associated with FID 1, which is created whenever the first port netdev
joins the bridge. We currently assume the default PVID is 1 and that
it's already created, as it seems reasonable. This can be extended in
the future.
However, in the second case it's entirely possible we've yet to create a
matching FID. This can happen if the VLAN device was configured before
making any bridge port member in the VLAN.
Prevent such ordering problems by using the VLAN device's CHANGEUPPER
event to configure the FID. Make the VLAN device hold a reference to the
FID and prevent it from being destroyed even if none of the port netdevs
is using it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previous commit deprecated the vFIDs used to get traffic to the CPU
('port_vfids'). Thus, we now use the vFIDs as god intended and the
artificial split is no longer needed.
Rename functions and variables to reflect that.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now we only supported bridged interfaces. Packets ingressing
through the switch ports were either classified to FIDs (in the case of
the VLAN-aware bridge) or vFIDs (in the case of VLAN-unaware bridges).
The packets were then forwarded according to the FDB. Routing was done
entirely in slowpath, by splitting the vFID range in two and using the
lower 0.5K vFIDs as dummy bridges that simply flooded all incoming
traffic to the CPU.
Instead, allow packets to be routed in the device by creating router
interfaces (RIFs) that will direct them to the router block.
Specifically, the RIFs introduced here are Sub-port RIFs used for VLAN
devices and port netdevs. Packets ingressing from the {Port / LAG ID, VID}
with which the RIF was programmed with will be assigned to a special
kind of FIDs called rFIDs and from there directed to the router.
Create a RIF whenever the first IPv4 address was programmed on a VLAN /
LAG / port netdev. Destroy it upon removal of the last IPv4 address.
Receive these notifications by registering for the 'inetaddr'
notification chain. A non-zero (10) priority is used for the
notification block, so that RIFs will be created before routes are
offloaded via FIB code.
Note that another trigger for RIF destruction are CHANGEUPPER
notifications causing the underlying FID's reference count to go down to
zero. This can happen, for example, when a VLAN netdev with an IP address
is put under bridge. While this configuration doesn't make sense it does
cause the device and the kernel to get out of sync when the netdev is
unbridged. We intend to address this in the future, hopefully in current
cycle.
Finally, Remove the lower 0.5K vFIDs, as they are deprecated by the RIFs,
which will trap packets according to their DIP.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are just about to introduce router interfaces (RIFs), but before that
we need to be able update the device with the correct RIF attributes
whenever they change for the netdev the RIF is backing. Two such
attributes are MTU and MAC.
The MAC is used both to set the source MAC of packets egressing from the
RIF and also to program an FDB rule that will direct packets to the
router block.
Use the existing netdevice notification block and respond to CHANGEADDR
and CHANGEMTU accordingly. Store both attributes in the RIF struct
in case we need to revert to old attributes following a failed update.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions that iterate over lower devices and find port device.
As a dependency add netdev_for_each_all_lower_dev and
netdev_for_each_all_lower_dev_rcu macro with
netdev_all_lower_get_next and netdev_all_lower_get_next_rcu shelpers.
Also, add functions to return mlxsw struct according to lower device
found and mlxsw_port struct with a reference to lower device.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement ipv4 FIB entries addition and removal. Initially, we support
local and broadcast routes using "ip2me" trap action.
Also, unicast routes without nexthop are supported using "local" action.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serves for adding, updating and removing fib entries.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual router is a construct used inside HW. In this implementation
we map kernel tables to virtual routers one to one. Introduce management
logic to create virtual routers when needed and destroy in case they are
no longer in use. According to that, call into LPM tree management.
Each virtual router is always bound to one LPM tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce basic LPM tree management allowing to share the trees in
between tables if the used prefixes in the tables are the same.
Build the tree structure according to the used prefixes. Although it is
not optimal for many use cases, this initial implementation does only
simple linear left-tree. More advanced structures will be introduced
later on, possibly including mechanisms to change trees on the fly.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used to bind virtual router and protocol to an
allocated LPM tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Serves to build LPM tree structure.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register serves for allocation and deallocation of LPM search tree.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shadow FIB is needed in order to hold additional information for FIB
entries and keep track of used prefixes. That is needed for the LPM tree
construction to be introduced later on in this set.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit a788a4a040.
This patch is wrong, the type returned doesn't fit
what the error pointer macros expect.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is likely that checking 'fman->fifo_offset' instead of
'fman->cam_offset' is expected here.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes response header to be able to communicate
with new firmware interface.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes redundant file includes and conditions.
Provides some meaningful comments and code alignment.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes redudant droq num validation.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the MTU between 68 bytes and 16000 bytes.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the issue of proper freeing of queue
memory resources during free device. It also has fix for
correct pcie error reporting.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the dependency of number of iq/oq's on
number of cpus.
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>