Add ethtool private flag 'xdp_tx_mpwqe' to control the feature
from userspace.
Feature is set ON by default, if supported.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add support for the HW feature of multi-packet WQE in XDP
xmit flow.
The conventional TX descriptor (WQE, Work Queue Element) serves
a single packet. Our HW has support for multi-packet WQE (MPWQE)
in which a single descriptor serves multiple TX packets.
This reduces both the PCI overhead and the CPU cycles wasted on
writing them.
In this patch we add support for the HW feature, which is supported
starting from ConnectX-5.
Performance:
Tested packet rate for UDP 64Byte multi-stream over ConnectX-5 NICs.
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
XDP_TX:
We see a huge gain on single port ConnectX-5, and reach the 100 Mpps
milestone.
* Single-port HCA:
Before: 70 Mpps
After: 100 Mpps (+42.8%)
* Dual-port HCA:
Before: 51.7 Mpps
After: 57.3 Mpps (+10.8%)
* In both cases we tested traffic on one port and for now On Dual-port HCAs
we see only small gain, we are working to overcome this bottleneck, but
for the moment only with experimental firmware on dual port HCAs we can
reach the wanted numbers as seen on Single-port HCAs.
XDP_REDIRECT:
Redirect from (A) ConnectX-5 to (B) ConnectX-5.
Due to a setup limitation, (A) and (B) are on different NUMA nodes,
so absolute performance numbers are not optimal.
Note:
Below is the transmit rate of (B), not the redirect rate of (A)
which is in some cases higher.
* (B) is single-port:
Before: 77 Mpps
After: 90 Mpps (+16.8%)
* (B) is dual-port:
Before: 61 Mpps
After: 72 Mpps (+18%)
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Each xdp_wqe_info instance describes the number of data-segments
and WQEBBs of the WQE.
This is useful for a downstream patch that adds support for
Multi-Packet TX WQE feature.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
This provides infrastructure to have multiple xdp_info instances
for the same consumer index.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Instead of calculating the control segment to be used upon an
XDP xmit doorbell, save it in SQ structure.
Nullify when no pending doorbell.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Do not ignore the CQE opcode.
This helps expose issues and debug them.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Do not maintain an SQ state bit to indicate whether an
XDP SQ serves redirect operations.
Instead, rely on the fact that such an XDP SQ doesn't reside
in an RQ instance, while the others do.
This info is not known to the XDP SQ functions themselves,
and they rely on their callers to distinguish between the cases.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
At the end of the RQ polling loop, some XDP-related operations
might be required. Before checking them one by one, check if
an XDP program is even loaded.
Combine all the checks and operations in a single function in xdp files.
This saves unnecessary checks for non-XDP flows.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The opcode indicates about the error reason.
Printing it helps in debug.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Virtio-net devices negotiate LRO support with the host.
Display the initially negotiated state with ethtool -k.
Also allow configuring it with ethtool -K, reusing the existing
virtnet_set_guest_offloads helper that configures LRO for XDP.
This is conditional on VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
Virtio-net negotiates TSO4 and TSO6 separately, but ethtool does not
distinguish between the two. Display LRO as on only if any offload
is active.
RTNL is held while calling virtnet_set_features, same as on the path
from virtnet_xdp_set.
Changes v1 -> v2
- allow ethtool config (-K) only if VIRTIO_NET_F_CTRL_GUEST_OFFLOADS
- show LRO as enabled if any LRO variant is enabled
- do not allow configuration while XDP is active
- differentiate current features from the capable set, to restore
on XDP down only those features that were active on XDP up
- move test out of VIRTIO_NET_F_CSUM/TSO branch, which is tx only
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-12-21
The following pull-request contains BPF updates for your *net-next* tree.
There is a merge conflict in test_verifier.c. Result looks as follows:
[...]
},
{
"calls: cross frame pruning",
.insns = {
[...]
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "function calls to other bpf functions are allowed for root only",
.result_unpriv = REJECT,
.errstr = "!read_ok",
.result = REJECT,
},
{
"jset: functional",
.insns = {
[...]
{
"jset: unknown const compare not taken",
.insns = {
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
BPF_FUNC_get_prandom_u32),
BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "!read_ok",
.result_unpriv = REJECT,
.errstr = "!read_ok",
.result = REJECT,
},
[...]
{
"jset: range",
.insns = {
[...]
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.result_unpriv = ACCEPT,
.result = ACCEPT,
},
The main changes are:
1) Various BTF related improvements in order to get line info
working. Meaning, verifier will now annotate the corresponding
BPF C code to the error log, from Martin and Yonghong.
2) Implement support for raw BPF tracepoints in modules, from Matt.
3) Add several improvements to verifier state logic, namely speeding
up stacksafe check, optimizations for stack state equivalence
test and safety checks for liveness analysis, from Alexei.
4) Teach verifier to make use of BPF_JSET instruction, add several
test cases to kselftests and remove nfp specific JSET optimization
now that verifier has awareness, from Jakub.
5) Improve BPF verifier's slot_type marking logic in order to
allow more stack slot sharing, from Jiong.
6) Add sk_msg->size member for context access and add set of fixes
and improvements to make sock_map with kTLS usable with openssl
based applications, from John.
7) Several cleanups and documentation updates in bpftool as well as
auto-mount of tracefs for "bpftool prog tracelog" command,
from Quentin.
8) Include sub-program tags from now on in bpf_prog_info in order to
have a reliable way for user space to get all tags of the program
e.g. needed for kallsyms correlation, from Song.
9) Add BTF annotations for cgroup_local_storage BPF maps and
implement bpf fs pretty print support, from Roman.
10) Fix bpftool in order to allow for cross-compilation, from Ivan.
11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order
to be compatible with libbfd and allow for Debian packaging,
from Jakub.
12) Remove an obsolete prog->aux sanitation in dump and get rid of
version check for prog load, from Daniel.
13) Fix a memory leak in libbpf's line info handling, from Prashant.
14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info
does not get unaligned, from Jesper.
15) Fix test_progs kselftest to work with older compilers which are less
smart in optimizing (and thus throwing build error), from Stanislav.
16) Cleanup and simplify AF_XDP socket teardown, from Björn.
17) Fix sk lookup in BPF kselftest's test_sock_addr with regards
to netns_id argument, from Andrey.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Extract "Protocol" field decompression code from transport protocols to
PPP generic layer, where it actually belongs. As a consequence, this
patch fixes incorrect place of PFC decompression in L2TP driver (when
it's not PPPOX_BOUND) and also enables this decompression for other
protocols, like PPPoE.
Protocol field decompression also happens in PPP Multilink Protocol
code and in PPP compression protocols implementations (bsd, deflate,
mppe). It looks like there is no easy way to get rid of that, so it was
decided to leave it as is, but provide those cases with appropriate
comments instead.
Changes in v2:
- Fix the order of checking skb data room and proto decompression
- Remove "inline" keyword from ppp_decompress_proto()
- Don't split line before function name
- Prefix ppp_decompress_proto() function with "__"
- Add ppp_decompress_proto() function with skb data room checks
- Add description for introduced functions
- Fix comments (as per review on mailing list)
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Last set of patches for 4.21. mt76 is still in very active development
and having some refactoring as well as new features. But also other
drivers got few new features and fixes.
Major changes:
ath10k
* add amsdu support for QCA6174 monitor mode
* report tx rate using the new ieee80211_tx_rate_update() API
* wcn3990 support is not experimental anymore
iwlwifi
* support for FW version 43 for 9000 and 22000 series
brcmfmac
* add support for CYW43012 SDIO chipset
* add the raw 4354 PCIe device ID for unprogrammed Cypress boards
mwifiex
* add NL80211_STA_INFO_RX_BITRATE support
mt76
* use the same firmware for mt76x2e and mt76x2u
* mt76x0e survey support
* more unification between mt76x2 and mt76x0
* mt76x0e AP mode support
* mt76x0e DFS support
* rework and fix tx status handling for mt76x0 and mt76x2
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJcG9TEAAoJEG4XJFUm622byW8H/1vMVJhXwgIZbHeoUKNa47Yp
Z7Jv5vW8IGXu+lp7DyoedDCbq4+lskNSlDV1DmysNChLgDnApU/3oCd/jH8EiGPV
JAFUHb85HuVLTTpPpNHtnYz3IzL7r098TNVxOU0VD+xILM0Mf0aCeXztgmFWpGaY
/rfHkId8oKUezIjdu6Dc96mqITrT6WRNtnOMfjr6dZPjClRTS44Hyz3Ga3rXABBL
/n8BCkl0GpKGrL3mBy2CCR5mVY8zfxMB4Aj2zx7bccZ8i2i2QjrGlXCHyB6ImNrR
lv4L1fUVXZWVdeOe8EbpftY7zEsPrX+XNm6h1kckdB7UyuBROpQLsVb+yxlLh9g=
=mhAw
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2018-12-20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.21
Last set of patches for 4.21. mt76 is still in very active development
and having some refactoring as well as new features. But also other
drivers got few new features and fixes.
Major changes:
ath10k
* add amsdu support for QCA6174 monitor mode
* report tx rate using the new ieee80211_tx_rate_update() API
* wcn3990 support is not experimental anymore
iwlwifi
* support for FW version 43 for 9000 and 22000 series
brcmfmac
* add support for CYW43012 SDIO chipset
* add the raw 4354 PCIe device ID for unprogrammed Cypress boards
mwifiex
* add NL80211_STA_INFO_RX_BITRATE support
mt76
* use the same firmware for mt76x2e and mt76x2u
* mt76x0e survey support
* more unification between mt76x2 and mt76x0
* mt76x0e AP mode support
* mt76x0e DFS support
* rework and fix tx status handling for mt76x0 and mt76x2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing indirect access in the Ocelot chip, a command is setup,
issued and then we need to poll until the result is ready. The polling
timeout is specified in milliseconds in the datasheet and not in
register access attempts.
It is not a bug on the currently supported platform, but we observed
that the code does not work properly on other platforms that we want to
support as the timing requirements there are different.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Port partitioning is done by enabling UNICAST_VLAN_BOUNDARY and changing
the default port membership of 0x7f to other values such that there is
no communication between ports. In KSZ9477 the member for port 1 is
0x41; port 2, 0x42; port 3, 0x44; port 4, 0x48; port 5, 0x50; and port 7,
0x60. Port 6 is the host port.
Setting a zero value can be used to stop port from receiving.
However, when UNICAST_VLAN_BOUNDARY is disabled and the unicast addresses
are already learned in the dynamic MAC table, setting zero still allows
devices connected to those ports to communicate. This does not apply to
multicast and broadcast addresses though. To prevent these leaks and
make the function of port membership consistent UNICAST_VLAN_BOUNDARY
should never be disabled.
Note that UNICAST_VLAN_BOUNDARY is enabled by default in KSZ9477.
Fixes: b987e98e50 ("dsa: add DSA switch driver for Microchip KSZ9477")
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When resolving the conflict wrt. the vxlan_fdb_update call
in vxlan_changelink() I made the last argument false instead
of true.
Fix this.
Signed-off-by: David S. Miller <davem@davemloft.net>
This series adds some misc updates and the support for tunnels over VLAN
tc offloads.
From Miroslav Lichvar, patches #1,2
1) Update timecounter at least twice per counter overflow
2) Extend PTP gettime function to read system clock
From Gavi Teitz, patch #3
3) Increase VF representors' SQ size to 128
From Eli Britstein and Or Gerlitz, patches #4-10
4) Adds the capability to support tunnels over VLAN device.
Patch 4 avoids crash for TC flow with egress upper devices
Patch 5 refactors tunnel routing devs into a helper function
Patch 6 avoids crash for TC encap flows with vlan on underlay
Patches 7-8 refactor encap tunnel header preparing code.
Patch 9 adds support for building VLAN tagged ETH header.
Patch 10 adds support for tunnel routing to VLAN device.
From Aviv, patches 11,12 to fix earlier VF lag series
5) Fix query_nic_sys_image_guid() error during init
6) Fix LAG requirement when CONFIG_MLX5_ESWITCH is off
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJcG5O8AAoJEEg/ir3gV/o+v0AH/ja1bJ6PkAlw2bCRMIuI6HK/
1vh9n8D74tIOAlvUi6QiLOgJ7CLdAgiAVFppDWzJSBUsY2XNAtRdYvLDDt9hGafO
QGysBWNVcX2aUVp+pLDCCVEYBWyyIzW416CWHx2IUgdAg9S6cvJK6P/81wd+l4Zp
8YlPstEUANP/JKZxHHjSelMcnY+Bj0JrDzuyyyaQdwmcHo5I7Ht0tNex14yFfFbl
gd7YvfPr1PtovPX2w9hMt3y3ml4mommB+jtxc0+59D+A7680hBOpAnH5pONms9rv
OKKKV8KqpxL1m32/TGbd7XSG+llLJwsSM6RtD4oUdiPA4iCiVDVdreP5vac52C4=
=0QQ5
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-12-19
This series adds some misc updates and the support for tunnels over VLAN
tc offloads.
From Miroslav Lichvar, patches #1,2
1) Update timecounter at least twice per counter overflow
2) Extend PTP gettime function to read system clock
From Gavi Teitz, patch #3
3) Increase VF representors' SQ size to 128
From Eli Britstein and Or Gerlitz, patches #4-10
4) Adds the capability to support tunnels over VLAN device.
Patch 4 avoids crash for TC flow with egress upper devices
Patch 5 refactors tunnel routing devs into a helper function
Patch 6 avoids crash for TC encap flows with vlan on underlay
Patches 7-8 refactor encap tunnel header preparing code.
Patch 9 adds support for building VLAN tagged ETH header.
Patch 10 adds support for tunnel routing to VLAN device.
From Aviv, patches 11,12 to fix earlier VF lag series
5) Fix query_nic_sys_image_guid() error during init
6) Fix LAG requirement when CONFIG_MLX5_ESWITCH is off
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
VID 1 is not reserved anymore, so remove the check that prevented the
creation of VLAN devices with this VID over mlxsw ports.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to abuse VID 1 anymore and we can instead use VID 4095
as the default VLAN, which will be configured on the port throughout its
lifetime.
The OVS join / leave functions are changed to enable VIDs 1-4094
(inclusive) instead of 2-4095. This because VID 4095 is now the default
VLAN instead of 1.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN entries on a port can be associated with either a bridge VLAN or a
router port. Before the VLAN entry is destroyed these associations need
to be cleaned up.
Currently, this is always invoked from the function which destroys the
VLAN entry, but next patch is going to skip the destruction of the
default entry when a port in unlinked from a LAG.
The above does not mean that the associations should not be cleaned up,
so add a helper that will be invoked from both call sites.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches will need to access the default port VLAN. Since this
VLAN will exist throughout the lifetime of the port, simply store it in
the port's struct.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function allows flushing all the existing VLAN entries on a port. It
is invoked when a port is destroyed and when it is unlinked from a LAG.
In the latter case, when moving to the new default VLAN, there will not
be a need to destroy the default VLAN entry.
Therefore, add an argument that allows to control whether the default
port VLAN should be destroyed or not. Currently it is always set to
'true'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver does not set the port's PVID when initializing a
new port. This is because the driver is using VID 1 as PVID which is the
firmware default.
Subsequent patches are going to change the PVID the driver is setting
when initializing a new port.
Prepare for that by explicitly setting the port's PVID.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches are going to replace the current default VID (1) with
VLAN_N_VID - 1 (4095).
Prepare for this conversion by replacing the hard-coded '1' with a
define.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In symmetric routing, the only two members in the VLAN corresponding to
the L3 VNI are the router port and the VXLAN tunnel.
In case the VXLAN device is already enslaved to the bridge and only
later the VLAN interface is configured, the tunnel will not be
offloaded.
The reason for this is that when the router interface (RIF)
corresponding to the VLAN interface is configured, it calls the core
fid_get() API which does not check if NVE should be enabled on the FID.
Instead, call into the bridge code which will check if NVE should be
enabled on the FID.
This effectively means that the same code path is used to retrieve a FID
when either a local port or a router port joins the FID.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2018-12-20
This series contains updates to e100, igb, ixgbe, i40e and ice drivers.
I replaced spinlocks for mutex locks to reduce the latency on CPU0 for
igb when updating the statistics. This work was based off a patch
provided by Jan Jablonsky, which was against an older version of the igb
driver.
Jesus adjusts the receive packet buffer size from 32K to 30K when
running in QAV mode, to stay within 60K for total packet buffer size for
igb.
Vinicius adds igb kernel documentation regarding the CBS algorithm and
its implementation in the i210 family of NICs.
YueHaibing from Huawei fixed the e100 driver that was potentially
passing a NULL pointer, so use the kernel macro IS_ERR_OR_NULL()
instead.
Konstantin Khorenko fixes i40e where we were not setting up the
neigh_priv_len in our net_device, which caused the driver to read beyond
the neighbor entry allocated memory.
Miroslav Lichvar extends the PTP gettime() to read the system clock by
adding support for PTP_SYS_OFFSET_EXTENDED ioctl in i40e.
Young Xiao fixed the ice driver to only enable NAPI on q_vectors that
actually have transmit and receive rings.
Kai-Heng Feng fixes an igb issue that when placed in suspend mode, the
NIC does not wake up when a cable is plugged in. This was due to the
driver not setting PME during runtime suspend.
Stephen Douthit enables the ixgbe driver allow DSA devices to use the
MII interface to talk to switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the mii_bus callbacks to address the entire clause 22/45 address
space. Enables userspace to poke switch registers instead of a single
PHY address.
The ixgbe firmware may be polling PHYs in a way that is not protected by
the mii_bus lock. This isn't new behavior, but as Andrew Lunn pointed
out there are more addresses available for conflicts.
Signed-off-by: Stephen Douthit <stephend@silicom-usa.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Most dsa devices expect a 'struct mii_bus' pointer to talk to switches
via the MII interface.
While this works for dsa devices, it will not work safely with Linux
PHYs in all configurations since the firmware of the ixgbe device may
be polling some PHY addresses in the background.
Signed-off-by: Stephen Douthit <stephend@silicom-usa.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
I210 ethernet card doesn't wakeup when a cable gets plugged. It's
because its PME is not set.
Since commit 42eca23021 ("PCI: Don't touch card regs after runtime
suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops
calling pci_finish_runtime_suspend(), which enables the PCI PME.
To fix the issue, let's not to save PCI states when it's runtime
suspend, to let the PCI subsystem enables PME.
Fixes: 42eca23021 ("PCI: Don't touch card regs after runtime suspend D3")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
If ice driver has q_vectors w/ active NAPI that has no rings,
then this will result in a divide by zero error. To correct it
I am updating the driver code so that we only support NAPI on
q_vectors that have 1 or more rings allocated to them.
See commit 13a8cd191a ("i40e: Do not enable NAPI on q_vectors
that have no rings") for detail.
Signed-off-by: Young Xiao <YangX92@hotmail.com>
Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Out of bound read reported by KASan.
i40iw_net_event() reads unconditionally 16 bytes from
neigh->primary_key while the memory allocated for
"neighbour" struct is evaluated in neigh_alloc() as
tbl->entry_size + dev->neigh_priv_len
where "dev" is a net_device.
But the driver does not setup dev->neigh_priv_len and
we read beyond the neigh entry allocated memory,
so the patch in the next mail fixes this.
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix a static code checker warning:
drivers/net/ethernet/intel/e100.c:1349
e100_load_ucode_wait() warn: passing zero to 'PTR_ERR'
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.
Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.
Signed-off-by: David S. Miller <davem@davemloft.net>
Section 4.5.9 of the datasheet says that the total size of all packet
buffers combined (TxPB 0 + 1 + 2 + 3 + RxPB + BMC2OS + OS2BMC) must not
exceed 60KB. Today we are configuring a total of 62KB, so reduce the
RxPB from 32KB to 30KB in order to respect that.
The choice of changing RxPBSIZE here is mainly because it seems more
correct to give more priority to the transmit packet buffers over the
receiver ones when running in Qav mode. Also, the BMC2OS and OS2BMC
sizes are already too short.
Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is based off of the work and suggestion of Jan Jablonsky
<jan.jablonsky@thalesgroup.com>.
The Watchdog workqueue in igb driver is scheduled every 2s for each
network interface. That includes updating a statistics protected by
spinlock. Function igb_update_stats in this case will be protected
against preemption. According to number of a statistics registers
(cca 60), processing this function might cause additional cpu load
on CPU0.
In case of statistics spinlock may be replaced with mutex, which
reduce latency on CPU0.
CC: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com>
CC: Jan Jablonsky <jan.jablonsky@thalesgroup.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ath.git patches for 4.21. Major changes:
ath10k
* add amsdu support for QCA6174 monitor mode
* report tx rate using the new ieee80211_tx_rate_update() API
* wcn3990 support is not experimental anymore
Add wmi configuration cmd to configure base band(BB) power amplifier(PA)
off timing values in hardware. The default PA off timings were fine tuned
to make proper DFS radar detection in QCA reference design. If ODM uses
different PA in their design, then the same default PA off timing values
cannot be used, it requires different settling time to detect radar pulses
very sooner and avoid radar detection problems. In that case it provides
provision to select proper PA off timing values based on the PA hardware used.
The PA component is part of FEM hardware and new device tree entry
"ext-fem-name" is used to indentify the FEM hardware. And this wmi configuration
cmd is enabled via wmi service flag "WMI_SERVICE_BB_TIMING_CONFIG_SUPPORT".
Other way is to apply these values through calibration data, but recalibration
of all boards out there might not be feasible.
This change tested on firmware ver 10.2.4-1.0-00042 in QCA988X chipset.
Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Memory of tx_stats was allocated when a STA was added. But it's not freed
if the STA failed to be added to driver. This issue could be seen in MDK3
attack case when STA number reached the limit.
Tested: QCA9984 with firmware ver 10.4-3.9.0.1-00005
Signed-off-by: Zhi Chen <zhichen@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
There was a race condition in SMP that an ath10k_peer was created but its
member sta was null. Following are procedures of ath10k_peer creation and
member sta access in peer statistics path.
1. Peer creation:
ath10k_peer_create()
=>ath10k_wmi_peer_create()
=>ath10k_wait_for_peer_created()
...
# another kernel path, RX from firmware
ath10k_htt_t2h_msg_handler()
=>ath10k_peer_map_event()
=>wake_up()
# ar->peer_map[id] = peer //add peer to map
#wake up original path from waiting
...
# peer->sta = sta //sta assignment
2. RX path of statistics
ath10k_htt_t2h_msg_handler()
=>ath10k_update_per_peer_tx_stats()
=>ath10k_htt_fetch_peer_stats()
# peer->sta //sta accessing
Any access of peer->sta after peer was added to peer_map but before sta was
assigned could cause a null pointer issue. And because these two steps are
asynchronous, no proper lock can protect them. So both peer and sta need to
be checked before access.
Tested: QCA9984 with firmware ver 10.4-3.9.0.1-00005
Signed-off-by: Zhi Chen <zhichen@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The "survey" pointer is the address of an array element. We know that
it can't be NULL so this check can be removed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
During driver load below warn logs are printed in the console.
Since driver may not implement all wmi events sent by fw and
all of them are non-fatal, move this log to debug level to
remove un-necessary warn message on console.
[ 361.887230] ath10k_snoc a000000.wifi: Unknown eventid: 16393
[ 361.907037] ath10k_snoc a000000.wifi: Unknown eventid: 237569
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The devm_memremap() function doesn't return NULLs, it returns error
pointers.
Fixes: ba94c753cc ("ath10k: add QMI message handshake for wcn3990 client")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
All the necessary patches to make wifi running (over SNOC)
are merged and tested on SDM845/QCS404 platform with WCN3990
wifi module, hence remove work in progress debug from snoc
driver and Kconfig.
Signed-off-by: Govind Singh <govinds@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Some hardwares variants (QCA99x0) are limiting msdu deaggregation with
some threshold value(default limit in QCA99x0 is 64 msdus), it was introduced to
avoid excessive MSDU-deaggregation in error cases. When number of sub frames
exceeds the limit, target hardware will send all msdus starting from present
msdu in RAW format as a single msdu packet and it will be indicated with
error status bit "RX_MSDU_END_INFO0_MSDU_LIMIT_ERR" set in rx descriptor.
This msdu frame is a partial raw MSDU and does't have first msdu and ieee80211
header. It caused below warning message.
[ 320.151332] ------------[ cut here ]------------
[ 320.155006] WARNING: CPU: 0 PID: 3 at drivers/net/wireless/ath/ath10k/htt_rx.c:1188
In our issue case, MSDU limit error happened due to FCS error and generated
this warning message.
This fixes the warning by handling the MSDU limit error. If msdu limit error
happens, driver adds first MSDU's ieee80211 header and sets A-MSDU present bit
in QOS header so that upper layer processes this frame if it is valid or drop it
if FCS error set. And removed the warning message, hence partial msdus without
first msdu is expected in msdu limit error cases.
Tested on QCA9984, Firmware 10.4-3.6-00104
Signed-off-by: Bhagavathi Perumal S <bperumal@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Currently in 10.4 FW, all the received 4addr frames are processed for
source port learning which is enabled by default. This learning can't be
disabled by default in FW since it breaks backward compatibility.
Since ath10k uses mac80211 based 4addr mode, source port learning done in
10.4 FW is redundant and also causes issues when 3addr frames are
transmitted/received for a 4addr station.
One such visible functional impact is when GTK rekey frame from
hostapd based AP to 4addr STA is dropped in AP's 10.4 FW. This is since
GTK rekey EAPOL frame is 3addr frame on AP interface and STA enabled
with 4addr is already allowed for receiving 3addr EAPOL frames.
Source port learning implementation in 10.4 FW drops this 3addr GTK rekey
frame in AP destinated for 4addr STA causing disassociation and
re-association for every GTK rekey session. GTK rekey issue is not seen
when learning is disabled in FW.
To prevent such issues without breaking backward compatibility, FW
advertises new service bit making the source port learning configurable and
this learning is being currently disabled during ath10k vdev creation.
* Tested HW: QCA9984
* Tested FW: 10.4-3.6.0.1-00004
Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Mesh path metric needs tx rate information from ieee80211_tx_status()
call but in ath10k there is no mechanism to report tx rate information
via ieee80211_tx_status(), the tx rate is only accessible via
sta_statiscs() op.
Per peer tx stats has tx rate info available, Tx rate is available
to ath10k driver after every 4 PPDU sent in the air. For each PPDU,
ath10k driver updates rate informattion to mac80211 using
ieee80211_tx_rate_update().
Per peer txrate information is updated through per peer statistics
and is available for QCA9888/QCA9984/QCA4019/QCA998X only
Tested on QCA9984 with firmware-5.bin_10.4-3.5.3-00053
Tested on QCA998X with firmware-5.bin_10.2.4-1.0-00036
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When processing HTT_T2H_MSG_TYPE_RX_IN_ORD_PADDR_IND, if the length of a msdu
is larger than the tailroom of the rx skb, skb_over_panic issue will happen
when calling skb_put. In monitor mode, amsdu will be handled in this path, and
msdu_len of the first msdu_desc is the length of the entire amsdu, which might
be larger than the maximum length of a skb, in such case, it will hit the issue
upon.
To fix this issue, process msdu list separately for monitor mode.
Successfully tested with:
QCA6174 (FW version: RM.4.4.1.c2-00057-QCARMSWP-1).
Signed-off-by: Yu Wang <yyuwang@codeaurora.org>
[kvalo@codeaurora.org: cosmetic cleanup]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This issue arise in a race condition between ath10k_sta_state() and
ath10k_htt_fetch_peer_stats(), explained in below scenario
Steps:
1. In ath10k_sta_state(), arsta->tx_stats get deallocated before peer deletion
when the station moves from IEEE80211_STA_NONE to IEEE80211_STA_NOTEXIST
state.
2. Meanwhile ath10k receive HTT_T2H_MSG_TYPE_PEER_STATS message.
In ath10k_htt_fetch_peer_stats(), arsta->tx_stats get accessed after
the peer validation check.
Since arsta->tx_stats get freed before the peer deletion [1].
ath10k_htt_fetch_peer_stats() ended up in "use after free" situation.
Fixed this issue by moving the arsta->tx_stats free handling after the
peer deletion. so that ath10k_htt_fetch_peer_stats() will not end up in
"use after free" situation.
Kernel Panic:
Unable to handle kernel NULL pointer dereference at virtual address 00000286
pgd = d8754000
[00000286] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
...
CPU: 0 PID: 6245 Comm: hostapd Not tainted
task: dc44cac0 ti: d4a38000 task.ti: d4a38000
PC is at kmem_cache_alloc+0x7c/0x114
LR is at ath10k_sta_state+0x190/0xd58 [ath10k_core]
pc : [<c02bdc50>] lr : [<bf916b78>] psr: 20000013
sp : d4a39b88 ip : 00000000 fp : 00000001
r10: 00000000 r9 : 1d3bc000 r8 : 00000dc0
r7 : 000080d0 r6 : d4a38000 r5 : dd401b00 r4 : 00000286
r3 : 00000000 r2 : d4a39ba0 r1 : 000080d0 r0 : dd401b00
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c5787d Table: 5a75406a DAC: 00000015
Process hostapd (pid: 6245, stack limit = 0xd4a38238)
Stack: (0xd4a39b88 to 0xd4a3a000)
...
[<c02bdc50>] (kmem_cache_alloc) from [<bf916b78>] (ath10k_sta_state+0x190/0xd58 [ath10k_core])
[<bf916b78>] (ath10k_sta_state [ath10k_core]) from [<bf870d4c>] (sta_info_insert_rcu+0x418/0x61c [mac80211])
[<bf870d4c>] (sta_info_insert_rcu [mac80211]) from [<bf88634c>] (ieee80211_add_station+0xf0/0x134 [mac80211])
[<bf88634c>] (ieee80211_add_station [mac80211]) from [<bf83f3c4>] (nl80211_new_station+0x330/0x36c [cfg80211])
[<bf83f3c4>] (nl80211_new_station [cfg80211]) from [<bf6c4040>] (extack_doit+0x2c/0x74 [compat])
[<bf6c4040>] (extack_doit [compat]) from [<c05c285c>] (genl_rcv_msg+0x274/0x30c)
[<c05c285c>] (genl_rcv_msg) from [<c05c1d98>] (netlink_rcv_skb+0x58/0xac)
[<c05c1d98>] (netlink_rcv_skb) from [<c05c25d4>] (genl_rcv+0x20/0x34)
[<c05c25d4>] (genl_rcv) from [<c05c1750>] (netlink_unicast+0x11c/0x204)
[<c05c1750>] (netlink_unicast) from [<c05c1be0>] (netlink_sendmsg+0x30c/0x370)
[<c05c1be0>] (netlink_sendmsg) from [<c0587e90>] (sock_sendmsg+0x70/0x84)
[<c0587e90>] (sock_sendmsg) from [<c058970c>] (___sys_sendmsg.part.3+0x188/0x228)
[<c058970c>] (___sys_sendmsg.part.3) from [<c058a594>] (__sys_sendmsg+0x4c/0x70)
[<c058a594>] (__sys_sendmsg) from [<c0208c80>] (ret_fast_syscall+0x0/0x44)
Code: ebfffec1 e1a04000 ea00001b e5953014 (e7940003)
ath10k_pci 0000:01:00.0: SWBA overrun on vdev 0, skipped old beacon
Hardware tested: QCA9984
Firmware tested: 10.4-3.6.0.1-00004
Fixes: a904417fc ("ath10k: add extended per sta tx statistics support")
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_sta_state':
drivers/net/wireless/ath/ath10k/mac.c:6238:7: warning:
variable 'num_tdls_vifs' set but not used [-Wunused-but-set-variable]
'num_tdls_vifs' not used any more after
9a993cc1ea ("ath10k: fix the logic of limiting tdls peer counts")
Also, remove the single called function ath10k_mac_tdls_vifs_count
and ath10k_mac_tdls_vifs_count_iter.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>