All the members: base, idm_base and nicpm_base should be annotated with
__iomem since they are pointers to register space. This fixes a bunch of
sparse reported warnings.
Fixes: f6a95a2495 ("net: ethernet: bgmac: Add platform device support")
Fixes: dd5c5d037f ("net: ethernet: bgmac: add NS2 support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mdio-bitbang mentioned 10 for both read and write.
However mdio read opcode is 10 and write opcode is 01
Fixed comment.
Signed-off-by: Frans Meulenbroeks <fransmeulenbroeks@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sky2 ethernet stops working after system resume from suspend:
[ 582.852065] sky2 0000:04:00.0: Refused to change power state, currently in D3
The current 150ms delay is not enough, change it to 200ms can solve the
issue.
BugLink: https://bugs.launchpad.net/bugs/1758507
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default TX moderation mode was mistakenly set to CQE based. The
intention was to add a control ability in order to improve some specific
use-cases. In general, we prefer to use EQE based moderation as it gives
much better numbers for the common cases.
CQE based causes a degradation in the common case since it resets the
moderation timer on CQE generation. This causes an issue when TSO is
well utilized (large TSO sessions). The timer is set to 16us so traffic
of ~64KB TSO sessions per second would mean timer reset (CQE per TSO
session -> long time between CQEs). In this case we quickly reach the
tcp_limit_output_bytes (256KB by default) and cause a halt in TX traffic.
By setting EQE based moderation we make sure timer would expire after
16us regardless of the packet rate.
This fixes an up to 40% packet rate and up to 23% bandwidth degradtions.
Fixes: 0088cbbc4b ("net/mlx5e: Enable CQE based moderation on TX CQ")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the driver is closed, all the associated irqs are disabled. In the
event that the driver exits a reset in the closed state, we should be
consistent with the state we are in directly after a close. So before we
exit the reset routine, all irqs should be disabled as well. This will
prevent the irqs from being enabled twice in this case and reporting a
number of noisy warning traces.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:
1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE
2) In 'net-next' params->log_rq_size is renamed to be
params->log_rq_mtu_frames.
3) In 'net-next' params->hard_mtu is added.
Signed-off-by: David S. Miller <davem@davemloft.net>
Iff TSU registers exist on a given [G]Ether controller, they always include
the CAM entry table registers (TSU_ADR{H|L}<n>), thus the check for invalid
TSU_ADRH0 offset in __sh_eth_get_regs() is useless...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 6ded286555 ("sh_eth: Fix RX recovery on R-Car in case of RX
ring underrun") added a check for an bad RDFAR offset in sh_eth_rx(), so
that the code could work on the R-Car Ether controllers which don't have
this register (and TDFAR), then the commit 3365711df0 ("sh_eth: WARN on
access to a register not implemented in a particular chip") replaced
offset 0 with 0xffff. Adding/checking the 'no_xdfar' bit field in the
'struct sh_eth_cpu_data' instead results in less object code...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refine the RX check summing handling to propagate the
hardware provided checksum so that we do not have to
compute it later in software.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 145307460b ("devlink: Remove top_hierarchy arg to
devlink_resource_register"), the "top_hierarchy" parameter to
devlink_resource_register() was removed in favor of using the parameter
"parent_resource_id" exclusively to determine who the parent is. The
root node's resource ID for this purpose is
DEVLINK_RESOURCE_ID_PARENT_TOP with the value 0. It is therefore
problematic that the resource MLXSW_SP_RESOURCE_KVD has also ID of 0.
Fix this by numbering driver-specific resources from 1.
Fixes: 145307460b ("devlink: Remove top_hierarchy arg to devlink_resource_register")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass struct mlxsw_core instead of devlink since it is nicer within mlxsw
code and we need both structs in mlxsw_sp_kvdl_resources_register()
anyway.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As struct mlxsw_config_profile is mapped to the payload of the FW
command of the same name, resources_query_enable flag does not belong
there. Move it to struct mlxsw_driver.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check should be done directly in mlxsw_pci_config_profile, as for
other profile items. Also, be consistent in naming with the rest and
rename to "used_kvd_sizes".
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
First arg of these helpers should be "mlxsw_core".
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This should not be part of the struct, as the struct fields
are tightly coupled with the FW command payload of the same name.
Just use the "granularity" define directly, as in other places.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The parts info is array. The parts copy this info array, yet they are a
list. So make the indexing according to the id and change the list of
parts into array of parts. This helps to eliminate lookups and
constructs like mlxsw_sp_kvdl_part_update() (took me some non-trivial
time to figure out what is going on there).
Alongside with that, introduce a helper macro to define the parts infos.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
devlink_resource_ops should be const as the arg of register function is
also const.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current code uses global variables, adjusts them and passes pointer down
to devlink. With every other mlxsw_core instance, the previously passed
pointer values are rewritten. Fix this by de-globalize the variables.
Fixes: 7f47b19bd7 ("mlxsw: spectrum_kvdl: Add support for per part occupancy")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Read the Inline TLS capability from firmware.
Determine the area reserved for storing the keys
Dump the Inline TLS tx and rx records count.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Reviewed-by: Michael Werner <werner@chelsio.com>
Reviewed-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Key area size in hw-config file. CPL struct for TLS request
and response. Work request for Inline TLS.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Reviewed-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf-next 2018-03-31
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Add raw BPF tracepoint API in order to have a BPF program type that
can access kernel internal arguments of the tracepoints in their
raw form similar to kprobes based BPF programs. This infrastructure
also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which
returns an anon-inode backed fd for the tracepoint object that allows
for automatic detach of the BPF program resp. unregistering of the
tracepoint probe on fd release, from Alexei.
2) Add new BPF cgroup hooks at bind() and connect() entry in order to
allow BPF programs to reject, inspect or modify user space passed
struct sockaddr, and as well a hook at post bind time once the port
has been allocated. They are used in FB's container management engine
for implementing policy, replacing fragile LD_PRELOAD wrapper
intercepting bind() and connect() calls that only works in limited
scenarios like glibc based apps but not for other runtimes in
containerized applications, from Andrey.
3) BPF_F_INGRESS flag support has been added to sockmap programs for
their redirect helper call bringing it in line with cls_bpf based
programs. Support is added for both variants of sockmap programs,
meaning for tx ULP hooks as well as recv skb hooks, from John.
4) Various improvements on BPF side for the nfp driver, besides others
this work adds BPF map update and delete helper call support from
the datapath, JITing of 32 and 64 bit XADD instructions as well as
offload support of bpf_get_prandom_u32() call. Initial implementation
of nfp packet cache has been tackled that optimizes memory access
(see merge commit for further details), from Jakub and Jiong.
5) Removal of struct bpf_verifier_env argument from the print_bpf_insn()
API has been done in order to prepare to use print_bpf_insn() soon
out of perf tool directly. This makes the print_bpf_insn() API more
generic and pushes the env into private data. bpftool is adjusted
as well with the print_bpf_insn() argument removal, from Jiri.
6) Couple of cleanups and prep work for the upcoming BTF (BPF Type
Format). The latter will reuse the current BPF verifier log as
well, thus bpf_verifier_log() is further generalized, from Martin.
7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read
and write support has been added in similar fashion to existing
IPv6 IPV6_TCLASS socket option we already have, from Nikita.
8) Fixes in recent sockmap scatterlist API usage, which did not use
sg_init_table() for initialization thus triggering a BUG_ON() in
scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and
uses a small helper sg_init_marker() to properly handle the affected
cases, from Prashant.
9) Let the BPF core follow IDR code convention and therefore use the
idr_preload() and idr_preload_end() helpers, which would also help
idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory
pressure, from Shaohua.
10) Last but not least, a spelling fix in an error message for the
BPF cookie UID helper under BPF sample code, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When the driver needs to re-initailize the IRQ vectors, we make the
new ulp_irq_stop() call to tell the RDMA driver to disable and free
the IRQ vectors. After IRQ vectors have been re-initailized, we
make the ulp_irq_restart() call to tell the RDMA driver that
IRQs can be restarted.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add additional logic to reserve completion rings for the bnxt_re driver
when it requests MSIX vectors. The function bnxt_cp_rings_in_use()
will return the total number of completion rings used by both drivers
that need to be reserved. If the network interface in up, we will
close and open the NIC to reserve the new set of completion rings and
re-initialize the vectors.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor bnxt_need_reserve_rings() slightly so that __bnxt_reserve_rings()
can call it and remove some duplicated code.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add remapping logic so that bnxt_en can use any arbitrary MSIX vectors.
This will allow the driver to reserve one range of MSIX vectors to be
used by both bnxt_en and bnxt_re. bnxt_en can now skip over the MSIX
vectors used by bnxt_re.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current code, the range of MSIX vectors allocated for the RDMA
driver is disjoint from the network driver. This creates a problem
for the new firmware ring reservation scheme. The new scheme requires
the reserved completion rings/MSIX vectors to be in a contiguous
range.
Change the logic to allocate RDMA MSIX vectors to be contiguous with
the vectors used by bnxt_en on new firmware using the new scheme.
The new function bnxt_get_num_msix() calculates the exact number of
vectors needed by both drivers.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the driver code makes some assumptions about the group index
and the map index of rings. This makes the code more difficult to
understand and less flexible.
Improve it by adding the grp_idx and map_idx fields explicitly to the
bnxt_ring_struct as a union. The grp_idx is initialized for each tx ring
and rx agg ring during init. time. We do the same for the map_idx for
each cmpl ring.
The grp_idx ties the tx ring to the ring group. The map_idx is the
doorbell index of the ring. With this new infrastructure, we can change
the ring index mapping scheme easily in the future.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When firmware sends a DMA response to the driver, the last byte of the
message will be set to 1 to indicate that the whole response is valid.
The driver waits for the message to be valid before reading the message.
The firmware spec allows these response messages to increase in
length by adding new fields to the end of these messages. The
older spec's valid location may become a new field in a newer
spec. To guarantee compatibility, the driver should zero the valid
byte before interpreting the entire message so that any new fields not
implemented by the older spec will be read as zero.
For messages that are forwarded to VFs, we need to set the length
and re-instate the valid bit so the VF will see the valid response.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VFs are created, the current code subtracts the maximum VF
resources from the PF's pool. This under-estimates the resources
remaining in the PF pool. Instead, we should subtract the minimum
VF resources. The VF minimum resources are guaranteed to the VFs
and only these should be subtracted from the PF's pool.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When checking for the maximum pre-set TX channels for ethtool -l, we
need to check the current max_tx_scheduler_inputs parameter from firmware.
This parameter specifies the max input for the internal QoS nodes currently
available to this function. The function's TX rings will be capped by this
parameter. By adding this logic, we provide a more accurate pre-set max
TX channels to the user.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gather periodic extended port statistics, if the device is PF and
link is up.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Include additional hardware port statistics in ethtool -S, which
are useful for debugging.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trusted VFs are allowed to modify MAC address, even when PF
has assigned one.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clear flags when reset command processed successfully for components
specified.
Fixes: 6502ad5963 ("bnxt_en: Add ETH_RESET_AP support")
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the RDMA driver is registered, use a new VNIC mode that allows
RDMA traffic to be seen on the netdev in promiscuous mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the default ring logic to select default number of rings to be up to
8 per port if the default rings x NIC ports <= total CPUs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor changes, such as new extended port statistics.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series contains updates to mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for striding RQ path,
introduced by Tariq.
First Four patches are trivial misc cleanups.
- Spelling mistake fix
- Dead code removal
- Warning messages
RX optimizations for striding RQ:
1) RX refactoring, cleanups and micro optimizations
- MTU calculation simplifications, obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.
- Do not busy-wait a pending UMR completion.
- post the new values of UMR WQE inline, instead of using a data pointer.
- use pre-initialized structures to save calculations in datapath.
2) Use linear SKB in Striding RQ "build_skb", (Using linear SKB has many advantages):
- Saves a memcpy of the headers.
- No page-boundary checks in datapath.
- No filler CQEs.
- Significantly smaller CQ.
- SKB data continuously resides in linear part, and not split to
small amount (linear part) and large amount (fragment).
This saves datapath cycles in driver and improves utilization
of SKB fragments in GRO.
- The fragments of a resulting GRO SKB follow the IP forwarding
assumption of equal-size fragments.
implementation details:
HW writes the packets to the beginning of a stride,
i.e. does not keep headroom. To overcome this we make sure we can
extend backwards and use the last bytes of stride i-1.
Extra care is needed for stride 0 as it has no preceding stride.
We make sure headroom bytes are available by shifting the buffer
pointer passed to HW by headroom bytes.
This configuration now becomes default, whenever capable.
Of course, this implies turning LRO off.
Performance testing:
ConnectX-5, single core, single RX ring, default MTU.
UDP packet rate, early drop in TC layer:
--------------------------------------------
| pkt size | before | after | ratio |
--------------------------------------------
| 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
| 500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
| 64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
--------------------------------------------
TCP streams: ~20% gain
3) Support XDP over Striding RQ:
Now that linear SKB is supported over Striding RQ,
we can support XDP by setting stride size to PAGE_SIZE
and headroom to XDP_PACKET_HEADROOM.
Striding RQ is capable of a higher packet-rate than
conventional RQ.
Performance testing:
ConnectX-5, 24 rings, default MTU.
CQE compression ON (to reduce completions BW in PCI).
XDP_DROP packet rate:
--------------------------------------------------
| pkt size | XDP rate | 100GbE linerate | pct% |
--------------------------------------------------
| 64byte | 126.2 Mpps | 148.0 Mpps | 85% |
| 128byte | 80.0 Mpps | 84.8 Mpps | 94% |
| 256byte | 42.7 Mpps | 42.7 Mpps | 100% |
| 512byte | 23.4 Mpps | 23.4 Mpps | 100% |
--------------------------------------------------
4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed.
Without this bulking, we have:
- no atomic ops on WQE allocation or free
- one atomic op per SKB
- In the default MTU configuration (1500, stride size is 2K),
the non-bulking method execute 2 atomic ops as before
- For larger MTUs with stride size of 4K, non-bulking method
executes only a single op.
- For XDP (stride size of 4K, no SKBs), non-bulking have no atomic ops per packet at all.
Performance testing:
ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Single core packet rate (64 bytes).
Early drop in TC: no degradation.
XDP_DROP:
before: 14,270,188 pps
after: 20,503,603 pps, 43% improvement.
Thanks,
saeed.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJavs5fAAoJEEg/ir3gV/o+iXQIAJQ4jcYb5V3AEPqUeiTNOH2h
e2yyXj2zXNTCl2cekmJriWfQjsA5YizaTNipHb1xR8pznAiIMGmiK5nr8idRY1Qh
M/awuoxJszj8a+z3SxrL/ilgf4HF/89YEt+5MnU/2ihBC3EGG0UbJ6TAC0cXMzmG
Xghi5omlCfsqQQkWooxPVSdRXERLsgzo5kjZ2Zpln/GJa0vVPmIV7ojoQQQzFCMf
eEQzqqEeOk4rk8Z2/5fdsWYjwa2XLnvtUtRBKX/hxCd2zYrFpGUxkzsT/Mikeu+Z
AZAJA4yfHs3dKS3T4CaKDBqhUVxdAsuecT9JlqzgLVEbmw7YypacrPT0TBBsJcI=
=qnbR
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-03-30
This series contains updates to mlx5 core and mlx5e netdev drivers.
The main highlight of this series is the RX optimizations for striding RQ path,
introduced by Tariq.
First Four patches are trivial misc cleanups.
- Spelling mistake fix
- Dead code removal
- Warning messages
RX optimizations for striding RQ:
1) RX refactoring, cleanups and micro optimizations
- MTU calculation simplifications, obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.
- Do not busy-wait a pending UMR completion.
- post the new values of UMR WQE inline, instead of using a data pointer.
- use pre-initialized structures to save calculations in datapath.
2) Use linear SKB in Striding RQ "build_skb", (Using linear SKB has many advantages):
- Saves a memcpy of the headers.
- No page-boundary checks in datapath.
- No filler CQEs.
- Significantly smaller CQ.
- SKB data continuously resides in linear part, and not split to
small amount (linear part) and large amount (fragment).
This saves datapath cycles in driver and improves utilization
of SKB fragments in GRO.
- The fragments of a resulting GRO SKB follow the IP forwarding
assumption of equal-size fragments.
implementation details:
HW writes the packets to the beginning of a stride,
i.e. does not keep headroom. To overcome this we make sure we can
extend backwards and use the last bytes of stride i-1.
Extra care is needed for stride 0 as it has no preceding stride.
We make sure headroom bytes are available by shifting the buffer
pointer passed to HW by headroom bytes.
This configuration now becomes default, whenever capable.
Of course, this implies turning LRO off.
Performance testing:
ConnectX-5, single core, single RX ring, default MTU.
UDP packet rate, early drop in TC layer:
--------------------------------------------
| pkt size | before | after | ratio |
--------------------------------------------
| 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
| 500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
| 64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
--------------------------------------------
TCP streams: ~20% gain
3) Support XDP over Striding RQ:
Now that linear SKB is supported over Striding RQ,
we can support XDP by setting stride size to PAGE_SIZE
and headroom to XDP_PACKET_HEADROOM.
Striding RQ is capable of a higher packet-rate than
conventional RQ.
Performance testing:
ConnectX-5, 24 rings, default MTU.
CQE compression ON (to reduce completions BW in PCI).
XDP_DROP packet rate:
--------------------------------------------------
| pkt size | XDP rate | 100GbE linerate | pct% |
--------------------------------------------------
| 64byte | 126.2 Mpps | 148.0 Mpps | 85% |
| 128byte | 80.0 Mpps | 84.8 Mpps | 94% |
| 256byte | 42.7 Mpps | 42.7 Mpps | 100% |
| 512byte | 23.4 Mpps | 23.4 Mpps | 100% |
--------------------------------------------------
4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed.
Without this bulking, we have:
- no atomic ops on WQE allocation or free
- one atomic op per SKB
- In the default MTU configuration (1500, stride size is 2K),
the non-bulking method execute 2 atomic ops as before
- For larger MTUs with stride size of 4K, non-bulking method
executes only a single op.
- For XDP (stride size of 4K, no SKBs), non-bulking have no atomic ops per packet at all.
Performance testing:
ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Single core packet rate (64 bytes).
Early drop in TC: no degradation.
XDP_DROP:
before: 14,270,188 pps
after: 20,503,603 pps, 43% improvement.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The variables, msg and data, have the same value. This patch removes
the extra one.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than use an on-stack array to copy a broadcast address, use
the generic eth_broadcast_addr function to save a trivial amount of
object code.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function calls call_netdevice_notifier(), which also
may take net_rwsem. So, we can't use net_rwsem here.
This patch makes callers of this functions take pernet_ops_rwsem,
like register_netdevice_notifier() does. This will protect
the modifications of net_namespace_list, and allows notifiers
to take it (they won't have to care about context).
Since __rtnl_link_unregister() is used on module load
and unload (which are not frequent operations), this looks
for me better, than make all call_netdevice_notifier()
always executing in "protected net_namespace_list" context.
Also, this fixes the problem we had a deal in 328fbe747a
"Close race between {un, }register_netdevice_notifier and ...",
and guarantees __rtnl_link_unregister() does not skip
exitting net.
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for explicit calls of devm_kfree(), as the allocated
memory will be freed during driver's detach.
The driver core clears the driver data to NULL after device_release.
Thus, it is not needed to manually clear the device driver data to NULL.
So remove the unnecessary pci_set_drvdata() and devm_kfree().
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change nsim_devlink_setup to return any error back to the caller and
update nsim_init to handle it.
Requested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ndo_set_rx_mode() is called from atomic context which causes
messages response timeouts while VF to PF communication via MSIx.
To get rid of that we're copy passed mc list, parse flags and queue
handling of kernel request to ordered workqueue.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel calls ndo_set_rx_mode() callback from atomic context which
causes messaging timeouts between VF and PF (as they’re implemented via
MSIx). So in order to handle ndo_set_rx_mode() we need to get rid of it.
This commit implements necessary workqueue related structures to let VF
queue kernel request processing in non-atomic context later.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit is to add message handling for ndo_set_rx_mode()
callback at PF side.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel calls ndo_set_rx_mode() callback supplying it will all necessary
info, such as device state flags, multicast mac addresses list and so on.
Since we have only 128 bits to communicate with PF we need to initiate
several requests to PF with small/short operation each based on input data.
So this commit implements following PF messages codes along with new
data structures for them:
NIC_MBOX_MSG_RESET_XCAST to flush all filters configured for this
particular network interface (VF)
NIC_MBOX_MSG_ADD_MCAST to add new MAC address to DMAC filter registers
for this particular network interface (VF)
NIC_MBOX_MSG_SET_XCAST to apply filtering configuration to filter control
register
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ThunderX NIC could be partitioned to up to 128 VFs and thus
represented to system. Each VF is mapped to pair BGX:LMAC, and each of VF
is configured by kernel individually. Eventually the bunch of VFs could be
mapped onto same pair BGX:LMAC and thus could cause several multicast
filtering configuration requests to LMAC with the same MAC addresses.
This commit is to add ThunderX NIC BGX filtering manipulation routines.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ThunderX NIC has two Ethernet Interfaces (BGX) each of them could has
up to four Logical MACs configured. Each of BGX has 32 filters to be
configured for filtering ingress packets. The number of filters available
to particular LMAC is from 8 (if we have four LMACs configured per BGX)
up to 32 (in case of only one LMAC is configured per BGX).
At the same time the NIC could present up to 128 VFs to OS as network
interfaces, each of them kernel will configure with set of MAC addresses
for filtering. So to prevent dupes in BGX filter registers from different
network interfaces it is required to cache and track all filter
configuration requests prior to applying them onto BGX filter registers.
This commit is to update LMAC structures with control fields to
allocate/releasing filters tracking list along with implementing
dmac array allocate/release per LMAC.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ThunderX NIC has set of registers which allows to configure
filter policy for ingress packets. There are three possible regimes
of filtering multicasts, broadcasts and unicasts: accept all, reject all
and accept filter allowed only.
Current implementation has enum with all of them and two generic macro
for enabling filtering et all (CAM_ACCEPT) and enabling/disabling
broadcast packets, which also should be corrected in order to represent
register bits properly. All these values are private for driver and
there is no need to ‘publish’ them via header file.
This commit is to move filtering register manipulation values from
header file into source with explicit assignment of exact register
values to them to be used while register configuring.
Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Meson8m2 SoC uses a similar (potentially even identical) register
layout as the Meson8b and GXBB SoCs for the dwmac glue.
Add a new compatible string and update the module description to
indicate support for these SoCs.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon a new UMR post, check if the WQE buffer contains
a previous UMR WQE. If so, modify the dynamic fields
instead of a whole WQE overwrite. This saves a memcpy.
In current setting, after 2 WQ cycles (12 UMR posts),
this will always be the case.
No degradation sensed.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
All UMR WQEs of an RQ share many common fields. We use
pre-initialized structures to save calculations in datapath.
One field (xlt_offset) was the only reason we saved a pre-initialized
copy per WQE index.
Here we remove its initialization (move its calculation to datapath),
and reduce the number of copies to one-per-RQ.
A very small datapath calculation is added, it occurs once per a MPWQE
(i.e. once every 256KB), but reduces memory consumption and gives
better cache utilization.
Performance testing:
Tested packet rate, no degradation sensed.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When many packets reside on the same page, the bulking of
page_ref modifications reduces the total number of atomic
operations executed.
Besides the necessary 2 operations on page alloc/free, we
have the following extra ops per page:
- one on WQE allocation (bump refcnt to maximum possible),
- zero ops for SKBs,
- one on WQE free,
a constant of two operations in total, no matter how many
packets/SKBs actually populate the page.
Without this bulking, we have:
- no ops on WQE allocation or free,
- one op per SKB,
Comparing the two methods when PAGE_SIZE is 4K:
- As mentioned above, bulking method always executes 2 operations,
not more, but not less.
- In the default MTU configuration (1500, stride size is 2K),
the non-bulking method execute 2 ops as well.
- For larger MTUs with stride size of 4K, non-bulking method
executes only a single op.
- For XDP (stride size of 4K, no SKBs), non-bulking method
executes no ops at all!
Hence, to optimize the flows with linear SKB and XDP over Striding RQ,
we here remove the page_ref bulking method.
Performance testing:
ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz.
Single core packet rate (64 bytes).
Early drop in TC: no degradation.
XDP_DROP:
before: 14,270,188 pps
after: 20,503,603 pps, 43% improvement.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Add XDP support over Striding RQ.
Now that linear SKB is supported over Striding RQ,
we can support XDP by setting stride size to PAGE_SIZE
and headroom to XDP_PACKET_HEADROOM.
Upon a MPWQE free, do not release pages that are being
XDP xmit, they will be released upon completions.
Striding RQ is capable of a higher packet-rate than
conventional RQ.
A performance gain is expected for all cases that had
a HW packet-rate bottleneck. This is the case whenever
using many flows that distribute to many cores.
Performance testing:
ConnectX-5, 24 rings, default MTU.
CQE compression ON (to reduce completions BW in PCI).
XDP_DROP packet rate:
--------------------------------------------------
| pkt size | XDP rate | 100GbE linerate | pct% |
--------------------------------------------------
| 64byte | 126.2 Mpps | 148.0 Mpps | 85% |
| 128byte | 80.0 Mpps | 84.8 Mpps | 94% |
| 256byte | 42.7 Mpps | 42.7 Mpps | 100% |
| 512byte | 23.4 Mpps | 23.4 Mpps | 100% |
--------------------------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Make the xdp_xmit indication available for Striding RQ
by taking it out of the type-specific union.
This refactor is a preparation for a downstream patch that
adds XDP support over Striding RQ.
In addition, use a bitmap instead of a boolean for possible
future flags.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Current Striding RQ HW feature utilizes the RX buffers so that
there is no wasted room between the strides. This maximises
the memory utilization.
This prevents the use of build_skb() (which requires headroom
and tailroom), and demands to memcpy the packets headers into
the skb linear part.
In this patch, whenever a set of conditions holds, we apply
an RQ configuration that allows combining the use of linear SKB
on top of a Striding RQ.
To use build_skb() with Striding RQ, the following must hold:
1. packet does not cross a page boundary.
2. there is enough headroom and tailroom surrounding the packet.
We can satisfy 1 and 2 by configuring:
stride size = MTU + headroom + tailoom.
This is possible only when:
a. (MTU - headroom - tailoom) does not exceed PAGE_SIZE.
b. HW LRO is turned off.
Using linear SKB has many advantages:
- Saves a memcpy of the headers.
- No page-boundary checks in datapath.
- No filler CQEs.
- Significantly smaller CQ.
- SKB data continuously resides in linear part, and not split to
small amount (linear part) and large amount (fragment).
This saves datapath cycles in driver and improves utilization
of SKB fragments in GRO.
- The fragments of a resulting GRO SKB follow the IP forwarding
assumption of equal-size fragments.
Some implementation details:
HW writes the packets to the beginning of a stride,
i.e. does not keep headroom. To overcome this we make sure we can
extend backwards and use the last bytes of stride i-1.
Extra care is needed for stride 0 as it has no preceding stride.
We make sure headroom bytes are available by shifting the buffer
pointer passed to HW by headroom bytes.
This configuration now becomes default, whenever capable.
Of course, this implies turning LRO off.
Performance testing:
ConnectX-5, single core, single RX ring, default MTU.
UDP packet rate, early drop in TC layer:
--------------------------------------------
| pkt size | before | after | ratio |
--------------------------------------------
| 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x |
| 500byte | 5.23 Mpps | 5.97 Mpps | 1.14x |
| 64byte | 5.94 Mpps | 5.96 Mpps | 1.00x |
--------------------------------------------
TCP streams: ~20% gain
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When modifying the page mapping of a HW memory region
(via a UMR post), post the new values inlined in WQE,
instead of using a data pointer.
This is a micro-optimization, inline UMR WQEs of different
rings scale better in HW.
In addition, this obsoletes a few control flows and helps
delete ~50 LOC.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Do not busy-wait a pending UMR completion. Under high HW load,
busy-waiting a delayed completion would fully utilize the CPU core
and mistakenly indicate a SW bottleneck.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Gets the process of a UMR WQE post in one function,
in preparation for a downstream patch that inlines
the WQE data.
No functional change here.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In Striding RQ, each WQE serves multiple packets
(hence called Multi-Packet WQE, MPWQE).
The size of a MPWQE is constant (currently 256KB).
Upon a ringparam set operation, we calculate the number of
MPWQEs per RQ. For this, first it is needed to determine the
number of packets that can reside within a single MPWQE.
In this patch we use the actual MTU size instead of ETH_DATA_LEN
for this calculation.
This implies that a change in MTU might require a change
in Striding RQ ring size.
In addition, this obsoletes some WQEs-to-packets translation
functions and helps delete ~60 LOC.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Knowing the MTU is required for RQ creation flow.
By our design, channels creation flow is totally isolated
from priv/netdev, and can be completed with access to
channels params and mdev.
Adding the MTU to the channels params helps preserving that.
In addition, we save it in RQ to make its access faster in
datapath checks.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
With ConnectX-4, we expect the force teardown to fail in case that
DC was enabled, therefore change the message from error to warning.
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
1. This function is not used anywhere in mlx5 driver
2. It has a memcpy statement that makes no sense and produces build
warning with gcc8
drivers/net/ethernet/mellanox/mlx5/core/transobj.c: In function 'mlx5_core_query_xsrq':
drivers/net/ethernet/mellanox/mlx5/core/transobj.c:347:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
Fixes: 01949d0109 ("net/mlx5_core: Enable XRCs and SRQs when using ISSI > 0")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Instead of looking for the EQ of the CQ, remove that redundant code and
use the eq pointer stored in the cq struct.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In mvneta_port_up() we enable relevant RX and TX port queues by write
queues bit map to an appropriate register.
q_map must be ZERO in the beginning of this process.
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miguel reported an skb use after free / double free in vrf_finish_output
when neigh_output returns an error. The vrf driver should return after
the call to neigh_output as it takes over the skb on error path as well.
Patch is a simplified version of Miguel's patch which was written for 4.9,
and updated to top of tree.
Fixes: 8f58336d3f ("net: Add ethernet header for pass through VRF device")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit has fix for RX traffic issues when we stress test the driver
with continuous ifconfig up/down under very high traffic conditions.
Reason for the issue is that, in existing liquidio_stop function NAPI is
disabled even before actual FW/HW interface is brought down via
send_rx_ctrl_cmd(lio, 0). Between time frame of NAPI disable and actual
interface down in firmware, firmware continuously enqueues rx traffic to
host. When interrupt happens for new packets, host irq handler fails in
scheduling NAPI as the NAPI is already disabled.
After "ifconfig <iface> up", Host re-enables NAPI but cannot schedule it
until it receives another Rx interrupt. Host never receives Rx interrupt as
it never cleared the Rx interrupt it received during interface down
operation. NIC Rx interrupt gets cleared only when Host processes queue and
clears the queue counts. Above anomaly leads to other issues like packet
overflow in FW/HW queues, backpressure.
Fix:
This commit fixes this issue by disabling NAPI only after informing
firmware to stop queueing packets to host via send_rx_ctrl_cmd(lio, 0).
send_rx_ctrl_cmd is not visible in the patch as it is already there in the
code. The DOWN command also waits for any pending packets to be processed
by NAPI so that the deadlock will not occur.
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Acked-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt says:
====================
pull-request: ieee802154-next 2018-03-29
An update from ieee802154 for *net-next*
Colin fixed a unused variable in the new mcr20a driver.
Harry fixed an unitialised data read in the debugfs interface of the
ca8210 driver.
If there are any issues or you think these are to late for -rc1 (both can also
go into -rc2 as they are simple fixes) let me know.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds initial suport for DWMAC5 and implements the Automotive Safety
Package which is available from core version 5.10.
The Automotive Safety Pacakge (also called Safety Features) offers us
with error protection in the core by implementing ECC Protection in
memories, on-chip data path parity protection, FSM parity and timeout
protection and Application/CSR interface timeout protection.
In case of an uncorrectable error we call stmmac_global_err() and
reconfigure the whole core.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently TX Timeout handler does not behaves as expected and leads to
an unrecoverable state. Rework current implementation of TX Timeout
handling to actually perform a complete reset of the driver state and IP.
We use deferred work to init a task which will be responsible for
resetting the system.
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The style of the rx/tx queue's *_coal member assignment is:
static void foo_coal_set(...)
{
set the coal in hw;
update queue's foo_coal member; [1]
}
In other place, we call foo_coal_set(pp, queue->foo_coal), so the above [1]
is duplicated and could be removed.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call of_get_nvmem_mac_address() to fetch the MAC address from an nvmem
cell, if one is provided in the device tree. This allows the address to
be stored in an I2C EEPROM device for example.
Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trigger a port mod message to request an MTU change on the NIC when any
physical port representor is assigned a new MTU value. The driver waits
10 msec for an ack that the FW has set the MTU. If no ack is received the
request is rejected and an appropriate warning flagged.
Rather than maintain an MTU queue per repr, one is maintained per app.
Because the MTU ndo is protected by the rtnl lock, there can never be
contention here. Portmod messages from the NIC are also protected by
rtnl so we first check if the portmod is an ack and, if so, handle outside
rtnl and the cmsg work queue.
Acks are detected by the marking of a bit in a portmod response. They are
then verfied by checking the port number and MTU value expected by the
app. If the expected MTU is 0 then no acks are currently expected.
Also, ensure that the packet headroom reserved by the flower firmware is
considered when accepting an MTU change on any repr.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the 'change_mtu' app callback to 'check_mtu'. This is called
whenever an MTU change is requested on a netdev. It can reject the
change but is not responsible for implementing it.
Introduce a new 'repr_change_mtu' app callback that is hit when the MTU
of a repr is to be changed. This is responsible for performing the MTU
change and verifying it.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide a pointer to the SFP bus in struct net_device, so that the
ethtool module EEPROM methods can access the SFP directly, rather
than needing every user to provide a hook for it.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for having DSA transition entirely to PHYLINK, we need to pass a
PHY interface type to the mac_link_{up,down} callbacks because we may have to
make decisions on that (e.g: turn on/off RGMII interfaces etc.). We do not pass
an entire phylink_link_state because not all parameters (pause, duplex etc.) are
defined when the link is down, only link and interface are.
Update mvneta accordingly since it currently implements phylink_mac_ops.
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were a number of issues with setting the RX coalescing parameters:
- we would not be preserving values that would have been configured
across close/open calls, instead we would always reset to no timeout
and 1 interrupt per packet, this would also prevent DIM from setting its
default usec/pkts values
- when adaptive RX would be turned on, we woud not be fetching the
default parameters, we would stay with no timeout/1 packet per interrupt
until the estimator kicks in and changes that
- finally disabling adaptive RX coalescing while providing parameters
would not be honored, and we would stay with whatever DIM had previously
determined instead of the user requested parameters
Fixes: 9f4ca05827 ("net: bcmgenet: Add support for adaptive RX coalescing")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There were a number of issues with setting the RX coalescing parameters:
- we would not be preserving values that would have been configured
across close/open calls, instead we would always reset to no timeout
and 1 interrupt per packet, this would also prevent DIM from setting its
default usec/pkts values
- when adaptive RX would be turned on, we woud not be fetching the
default parameters, we would stay with no timeout/1 packet per
interrupt until the estimator kicks in and changes that
- finally disabling adaptive RX coalescing while providing parameters
would not be honored, and we would stay with whatever DIM had
previously determined instead of the user requested parameters
Fixes: b6e0e87542 ("net: systemport: Implement adaptive interrupt coalescing")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adaptive TX coalescing is not currently giving us any advantages and
ends up making the CPU spin more frequently until TX completion. Deny
and disable adaptive TX coalescing for now and rely on static
configuration, we can always add it back later.
Reviewed-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return a negative error code from the hash filter init error
handling case instead of 0, as done elsewhere in this function.
Fixes: 5c31254e35 ("cxgb4: initialize hash-filter configuration")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkman says:
====================
pull-request: bpf 2018-03-29
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix nfp to properly check max insn count while emitting
instructions in the JIT which was wrongly comparing bytes
against number of instructions before, from Jakub.
2) Fix for bpftool to avoid usage of hex numbers in JSON
output since JSON doesn't accept hex numbers with 0x
prefix, also from Jakub.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Smaller new features to various drivers but nothing really out of
ordinary.
Major changes:
ath10k
* enable chip temperature measurement for QCA6174/QCA9377
* add firmware memory dump for QCA9984
* enable buffer STA on TDLS link for QCA6174
* support different beacon internals in multiple interface scenario
for QCA988X/QCA99X0/QCA9984/QCA4019
iwlwifi
* support for new PCI IDs for the 9000 family
* support for a new firmware API version
* support for advanced dwell and Optimized Connectivity Experience
(OCE) in scanning
btrsi
* fix kconfig dependencies
wil6210
* support multiple virtual interfaces
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJavOUVAAoJEG4XJFUm622bf2cH/i/we50/PR64rxq5RV8HWQPt
9P/bn4I1pEDRDJZw00f/bqe20w5ps1UuHs1hgOiJzOsZNfxVv+B17MMY/dN/+5Ob
hBYeZQFZ0MJ1cEF6J6gzebH1BKVEApVPI49MmAUzdtATO/6YgCnxovb61bl51RYg
vtMNcrXBQDplPTh5W7YHXrUz4sV1d3daCk/6Coea+SoF6eL1tn+qndo6GhNIMboE
HA0vV27Wf2RCzlxcQ4WGD4cre6n9sRaX4lcFzMEtbM7ziM0E0SES7eEThB5ST+A6
3K0+zSuyEfOu40QRJF1g9Wl8FXvp+9SRVC7s1rXjDYUv9sGl+a5fhb7VPgrYn/g=
=AgFb
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2018-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.17
Smaller new features to various drivers but nothing really out of
ordinary.
Major changes:
ath10k
* enable chip temperature measurement for QCA6174/QCA9377
* add firmware memory dump for QCA9984
* enable buffer STA on TDLS link for QCA6174
* support different beacon internals in multiple interface scenario
for QCA988X/QCA99X0/QCA9984/QCA4019
iwlwifi
* support for new PCI IDs for the 9000 family
* support for a new firmware API version
* support for advanced dwell and Optimized Connectivity Experience
(OCE) in scanning
btrsi
* fix kconfig dependencies
wil6210
* support multiple virtual interfaces
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
first bullet here:
* EAPoL-over-nl80211 from Denis - this will let us fix
some long-standing issues with bridging, races with
encryption and more
* DFS offload support from the qtnfmac folks
* regulatory database changes for the new ETSI adaptivity
requirements
* various other fixes and small enhancements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlq85QQACgkQB8qZga/f
l8Sdbg//bc8C/4F1TUdJqZWGK1j9Lwd6nZpP0iFqH5Ees3MMnti5XdV2dCL31ivz
c8DOuRAU8ZG/RLPgSTHVZHwh+f7S5/TxSKg8WOBvrYk7a0C1uvVVhe5XZQEmqE7g
eqM0+UQ5DyzUYnu0kSUrFKPV7BqDa2YzVDdK8e8iozqZmAnvGN9k8H7EDEeUxtxk
LEl+bEcmhDSfIssU2Iaksl+9qoZP6BkoVGAOmDzIL654WV4XVKorxRX6vndqSQGu
0cCz2Occ+/0hfvszONBRR4M/gtI/Yyn3u+D1Q0YD3X40Q9gJE11fcodmMT61l5C7
rGcu94RIGilvRvjZScK3giiU2L7DD+VETUa+YGnjd8gLpmrYd6cURxlm4yuHbw8C
UScLCApAUuY+skmPLeuyHW9mnzHaC336vzVjk8OhdNRhX23/rB8nk1yIywgqETVW
g/iub8/Xp6TRfdyh76I18wlfqCp1It2JAeICgKH5NPlwUA6U0xFR0/ddSR8FuAcK
ZLY8mgsc2kIH6r4x5sjeH+Yb6tGi/Z3HMZM2hna+t4vSpn6Q5+GPsA6yuHuBUhJb
8QswMiLDSux8I4guKgQyROiHaCzE3zOigJ4o1z9XITKsgluZVxnKr+ETKdr88WFp
II8U0qH/kejXIfxUjbv5Wla70J9wi/hjxR6vOfSkEtYNvIApdfc=
=hI0F
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2018-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have a fair number of patches, but many of them are from the
first bullet here:
* EAPoL-over-nl80211 from Denis - this will let us fix
some long-standing issues with bridging, races with
encryption and more
* DFS offload support from the qtnfmac folks
* regulatory database changes for the new ETSI adaptivity
requirements
* various other fixes and small enhancements
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
VTU miss violations can happen under normal conditions. Don't spam the
kernel log, downgrade the output to debug level only. The statistics
counter will indicate it is happening, if anybody not debugging is
interested.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Count the numbers of various ATU and VTU violation statistics and
return them as part of the ethtool -S statistics.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the module_pci_driver() macro to make the code simpler
by eliminating module_init and module_exit calls.
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes the following sparse warning:
drivers/net/ethernet/broadcom/genet/bcmgenet.c:1351:16: warning:
Using plain integer as NULL pointer
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Today, driver drops received packets which are indicated as
invalid checksum by the device. Instead of dropping such packets,
pass them to the stack with CHECKSUM_NONE indication in skb.
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cotsworks modules fail the checksums - it appears that Cotsworks
reprograms the EEPROM at the end of production with the final product
information (serial, date code, and exact part number for module
options) and fails to update the checksum.
Work around this by detecting the Cotsworks name in the manufacturer
field, and reducing the checksum failures to warnings rather than a
hard error.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds ethtool callback implementation for flash update.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the required driver support for updating the flash or
non volatile memory of the adapter. At highlevel, flash upgrade comprises
of reading the flash images from the input file, validating the images and
writing them to the respective paritions.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds APIs for flash access.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for populating the flash image attributes.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This FW contains several fixes and features
RDMA Features
- SRQ support
- XRC support
- Memory window support
- RDMA low latency queue support
- RDMA bonding support
RDMA bug fixes
- RDMA remote invalidate during retransmit fix
- iWARP MPA connect interop issue with RTR fix
- iWARP Legacy DPM support
- Fix MPA reject flow
- iWARP error handling
- RQ WQE validation checks
MISC
- Fix some HSI types endianity
- New Restriction: vlan insertion in core_tx_bd_data can't be set
for LB packets
ETH
- HW QoS offload support
- Fix vlan, dcb and sriov flow of VF sending a packet with
inband VLAN tag instead of default VLAN
- Allow GRE version 1 offloads in RX flow
- Allow VXLAN steering
iSCSI / FcoE
- Fix bd availability checking flow
- Support 256th sge proerly in iscsi/fcoe retransmit
- Performance improvement
- Fix handle iSCSI command arrival with AHS and with immediate
- Fix ipv6 traffic class configuration
DEBUG
- Update debug utilities
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During heavy tx traffic, control messages (sent by liquidio driver to NIC
firmware) sometimes do not get processed in a timely manner. Reason is:
the low-level metadata of control messages and that of egress network
packets indicate that they have the same priority.
Fix it by setting a higher priority for control messages through the new
ctrl_qpg field in the oct_txpciq struct. It is the NIC firmware that does
the actual setting of priority by writing to the new ctrl_qpg field; the
host driver treats that value as opaque and just assigns it to pki_ih3->qpg
Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add devlink support to netdevsim and use it to implement a simple,
profile based resource controller. Only one controller is needed
per namespace, so the first netdevsim netdevice in a namespace
registers with devlink. If that device is deleted, the resource
settings are deleted.
The resource controller allows a user to limit the number of IPv4 and
IPv6 FIB entries and FIB rules. The resource paths are:
/IPv4
/IPv4/fib
/IPv4/fib-rules
/IPv6
/IPv6/fib
/IPv6/fib-rules
The IPv4 and IPv6 top level resources are unlimited in size and can not
be changed. From there, the number of FIB entries and FIB rule entries
are unlimited by default. A user can specify a limit for the fib and
fib-rules resources:
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib size 96
$ devlink resource set netdevsim/netdevsim0 path /IPv4/fib-rules size 16
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib size 64
$ devlink resource set netdevsim/netdevsim0 path /IPv6/fib-rules size 16
$ devlink dev reload netdevsim/netdevsim0
such that the number of rules or routes is limited (96 ipv4 routes in the
example above):
$ for n in $(seq 1 32); do ip ro add 10.99.$n.0/24 dev eth1; done
Error: netdevsim: Exceeded number of supported fib entries.
$ devlink resource show netdevsim/netdevsim0
netdevsim/netdevsim0:
name IPv4 size unlimited unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables non
resources:
name fib size 96 occ 96 unit entry size_min 0 size_max unlimited size_gran 1 dpipe_tables
...
With this template in place for resource management, it is fairly trivial
to extend and shows one way to implement a simple counter based resource
controller typical of network profiles.
Currently, devlink only supports initial namespace. Code is in place to
adapt netdevsim to a per namespace controller once the network namespace
issues are resolved.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This series contains Misc updates and cleanups for mlx5e rx path
and SQ recovery feature for tx path.
From Tariq: (RX updates)
- Disable Striding RQ when PCI devices, striding RQ limits the use
of CQE compression feature, which is very critical for slow PCI
devices performance, in this change we will prefer CQE compression
over Striding RQ only on specific "slow" PCIe links.
- RX path cleanups
- Private flag to enable/disable striding RQ
From Eran: (TX fast recovery)
- TX timeout logic improvements, fast SQ recovery and TX error reporting
if a HW error occurs while transmitting on a specific SQ, the driver will
ignore such error and will wait for TX timeout to occur and reset all
the rings. Instead, the current series improves the resiliency for such
HW errors by detecting TX completions with errors, which will report them
and perform a fast recover for the specific faulty SQ even before a TX
timeout is detected.
Thanks,
Saeed.
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJauuK/AAoJEEg/ir3gV/o+Ro4IAIBcQl4S/6jPttGDOPwElU4b
Q8MuQbnB9k36Kf6x3GrhBqwtoS3OYBLvfyOeZsghD8uyzklhnqR/psrCt1YhLXq7
aI5iFbuQucCUwIYBscuZegXBs+PTJYALoil05hOwSkwhEBZR53ZvdRQ9Jg0G0QiY
sYEYAl41A1DWQOLET5lwleUpVQnVnNvIcXqNHKBKbS4f8rmLpfsZ6YAxJiiIjeDJ
gLbsaxUyY3p77594okBk6DuFBWZsEAKwumspdSc92OEMrZSYE5ugNL+UgVEE7I8F
JgK/4OUarw3K7xwhAg//ZhTM1iRRdJv0Sz2GPhhWRzYB0jWACc3ZG2/M/WmxFIg=
=Uslu
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-03-27 (Misc updates & SQ recovery)
This series contains Misc updates and cleanups for mlx5e rx path
and SQ recovery feature for tx path.
From Tariq: (RX updates)
- Disable Striding RQ when PCI devices, striding RQ limits the use
of CQE compression feature, which is very critical for slow PCI
devices performance, in this change we will prefer CQE compression
over Striding RQ only on specific "slow" PCIe links.
- RX path cleanups
- Private flag to enable/disable striding RQ
From Eran: (TX fast recovery)
- TX timeout logic improvements, fast SQ recovery and TX error reporting
if a HW error occurs while transmitting on a specific SQ, the driver will
ignore such error and will wait for TX timeout to occur and reset all
the rings. Instead, the current series improves the resiliency for such
HW errors by detecting TX completions with errors, which will report them
and perform a fast recover for the specific faulty SQ even before a TX
timeout is detected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We can have interrupts left enabled form e.g: the bootloader which used
the network device for network boot. Make sure we have those disabled as
early as possible to avoid spurious interrupts.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the system contains several BGMAC adapters, it is nice to be able
to tell which one is which by looking at /proc/interrupts. Use the
network device name as a name to request_irq() with.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the National Instruments XGE 1/10G network device.
It uses the EEPROM on the board via NVMEM.
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
My recent change to netvsc drive in how receive flags are handled
broke multicast. The Hyper-v/Azure virtual interface there is not a
multicast filter list, filtering is only all or none. The driver must
enable all multicast if any multicast address is present.
Fixes: 009f766ca2 ("hv_netvsc: filter multicast/broadcast")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Description:
Crash was reported with syzkaller pointing to lan78xx_write_reg routine.
Root-cause:
Proper cleanup of workqueues and init/setup routines was not happening
in failure conditions.
Fix:
Handled the error conditions by cleaning up the queues and init/setup
routines.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ca8210_test_int_user_write() a user can request the transfer of a
frame with a length field (command.length) that is longer than the
actual buffer provided (len). In this scenario the driver will copy
the buffer contents into the uninitialised command[] buffer, then
transfer <data.length> bytes over the SPI even though only <len> bytes
had been populated, potentially leaking sensitive kernel memory.
Also the first 6 bytes of the command buffer must be initialised in case
a malformed, short packet is written and the uninitialised bytes are
read in ca8210_test_check_upstream.
Reported-by: Domen Puncer Kugler <domen.puncer@samsung.com>
Signed-off-by: Harry Morris <h.morris@cascoda.com>
Tested-by: Harry Morris <h.morris@cascoda.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
ath.git patches for 4.17. Major changes:
ath10k
* enable chip temperature measurement for QCA6174/QCA9377
* add firmware memory dump for QCA9984
* enable buffer STA on TDLS link for QCA6174
* support different beacon internals in multiple interface scenario
for QCA988X/QCA99X0/QCA9984/QCA4019
Remove the static array and use the generic routine to set the
Ethernet broadcast address.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Currently vdev stats displayed in fw_stats are applicable
only for TLV based firmware and fix it for 10.4 firmware
as of now. The vdev stats in 10.4 firmware is split into two
parts (vdev_stats, vdev_stats_extended). The actual stats
are captured only in extended vdev stats. In order to enable
vdev stats, appropriate feature bit will be set on extended
resource config. As FTM related counters are available only on
newer 10.4 based firmware, these counters will be displayed
only on valid data.
Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The commit "cfg80211: make RATE_INFO_BW_20 the default" changed
the index of RATE_INFO_BW_20, but the updates to ath10k missed
the special bandwidth calculation case in
ath10k_update_per_peer_tx_stats().
This will fix below warning,
WARNING: CPU: 0 PID: 609 at net/wireless/util.c:1254
cfg80211_calculate_bitrate+0x174/0x220
invalid rate bw=1, mcs=9, nss=2
(unwind_backtrace) from
(cfg80211_calculate_bitrate+0x174/0x220)
(cfg80211_calculate_bitrate) from
(nl80211_put_sta_rate+0x44/0x1dc)from
(nl80211_put_sta_rate) from
(nl80211_send_station+0x388/0xaf0)
(nl80211_get_station+0xa8/0xec)
[ end trace da8257d6a850e91a ]
Fixes: 842be75c77 ("cfg80211: make RATE_INFO_BW_20 the default")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This patch fixes regression caused by 0c317a02ca
("cfg80211: support virtual interfaces with different beacon intervals"),
with this change cfg80211 expects the driver to advertize
'beacon_int_min_gcd' to support different beacon intervals in multivap
scenario. This support is added for, QCA988X/QCA99X0/QCA9984/QCA4019.
Verifed AP + mesh bring up on QCA9984 with beacon interval 100msec and
1000msec respectively.
Frimware: firmware-5.bin_10.4-3.5.3-00053
Fixes: 0c317a02ca ("cfg80211: support virtual interfaces with different beacon intervals")
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
For WPA encryption, QCA6174 firmware(version: WLAN.RM.4.4) will unblock
data when M4 was sent successfully. For other encryption which didn't need
4-way handshake firmware will unblock the data when peer authorized. Since
TDLS is 3-way handshake host need send authorize cmd to firmware to unblock
data.
Signed-off-by: Yingying Tang <yintang@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
TDLS peer do not need WEP key. Setting WEP key will lead
to TDLS setup failure. Add fix to avoid setting WEP key
for TDLS peer.
Signed-off-by: Yingying Tang <yintang@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Enable TDLS peer inactivity detetion feature.
QCA6174 firmware(version: WLAN.RM.4.4) support TDLS link inactivity detecting.
Set related parameters in TDLS WMI command to enable this feature.
Signed-off-by: Yingying Tang <yintang@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Enable TDLS peer buffer STA feature.
QCA6174 firmware(version: WLAN.RM.4.4) support TDLS peer buffer STA,
it reports this capability through wmi service map in wmi service ready
event. Set related parameter in TDLS WMI command to enable this feature.
Signed-off-by: Yingying Tang <yintang@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In case wcn36xx_smd_rsp_process() is called more than once before
hal_ind_work was dispatched, the messages will end up in hal_ind_queue,
but wcn36xx_ind_smd_work() will only look at the first message in that
list.
Fix this by dequeing the messages from the list in a loop, and only stop
when it's empty.
This issue was found during a review of the driver. In my tests, that
race never actually occured.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
wcn36xx_start_tx function retrieves the buffer descriptor from the
channel control queue to start filling tx buffer information. However,
nothing prevents this same buffer to be concurrently accessed in a
concurent tx call, leading to potential buffer coruption and firmware
crash (observed during iperf test). The channel control queue should
only be accessed and updated with the channel lock.
Fix this issue by using a local buffer descriptor which will be copied
in the thread-safe wcn36xx_dxe_tx_frame.
Note that buffer descriptor size is few bytes so the introduced copy
overhead is insignificant. Moreover, this allows to keep the locked
section minimal.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Ramon Fried <rfried@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
It appears that the WCN36xx firmware doesn't actually respond to
probe requests. Until it's resolved, switch the probe response
responsibility to the 802.11 layer to allow creation of
hidden SSID AP's.
Signed-off-by: Ramon Fried <rfried@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
QCA9984/QCA99X0/QCA4019 chipsets have 8 memory regions, dump all of them to the
firmware coredump file. Some of the regions need to be read using ioread() so
add new region types for them.
Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org>
[kvalo: refactoring etc]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
As QCA9984 needs two region types refactor the code to make it easier add the
new types. No functional changes.
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
FW has Smart Logging feature enabled by default for detecting failures
and processing FATAL_CONDITION_EVENTID (36925 - 0x903D) back to host.
Since ath10k doesn't implement the Smart Logging and FATAL CONDITION
EVENT processing yet, suppressing the unknown event ID warning by moving
this under ATH10K_DBG_WMI.
Simulated the same issue by having associated STA powered off when
ping flood was running from AP backbone. This triggerd STA KICKOUT
in AP followed by FATAL CONDITION event 36925.
Issue was reproduced and verified in below DUT
------------------------------------------------
AP mode of OpenWRT QCA9984 running 6.0.8 with FW ver 10.4-3.5.3-00053
Signed-off-by: Sathishkumar Muruganandam <murugana@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Firmware WLAN.TF.2.1-00014-QCARMSWP-1 now supports reading the board ID
information and also required 9 IRAM bank, which older ath10k version
don't have the support will fail to be enabled, so in order to maintain
the backward compatibility, we need to update the FW API to 6.
Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The firmware of QCA6174/QCA9377 already support the feature, just enable
it to be able to handle the get_temperature command and process the event.
You can read the temperature by using the hwmon interface,
cat /sys/class/ieee80211/phy*/device/hwmon/hwmon2/temp1_input
Verified with the following hardware and software combination,
QCA6174, only firmware-4.bin doesn't support this, otherwise all support.
QCA9377, all the firmwares upstreamed support this command
Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
some userspace programs (e.g. hostapd) need to set the regulatory domain
before selecting the operating channel. Synchronize DFS detector regardless of
the value of ah->curchan, to avoid situations where wireless scan can't be done
on some 5GHz sub-bands, because dfs_region is constantly UNSET.
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When FW responds with a message of wrong size or type make sure
the type is checked first and included in the wrong size message.
This makes it easier to figure out which FW command failed.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
NFP has a prng register, which we can read to obtain a u32 worth
of pseudo random data. Generate code for it.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Allow atomic add to be used even when the value is not guaranteed
to fit into a 16 bit immediate. This requires the value to be pulled
as data, and therefore use of a transfer register and a context swap.
Track the information about possible lengths of the value, if it's
guaranteed to be larger than 16bits don't generate the code for the
optimized case at all.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Allow callers to control the delay slots of commands, instead of
giving them just a wait/nowait choice.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Implement atomic add operation for 32 and 64 bit values. Depend
on the verifier to ensure alignment. Values have to be kept in
big endian and swapped upon read/write. For now only support
atomic add of a constant.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Support calling map_delete_elem() FW helper from the datapath
programs. For JIT checks and code are basically equivalent
to map lookups. Similarly to other map helper key must be on
the stack. Different pointer types are left for future extension.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Support calling map_update_elem() from the datapath programs
by calling into FW-provided helper. Value pointer is passed
in LM pointer #2. Keeping track of old state for arg3 is not
necessary, since LM pointer #2 will be always loaded in this
case, the trivial optimization for value at the bottom of the
stack can't be done here.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a verifier helper for performing the basic state checks
before a call to a map helper.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Our implementation has restriction on stack pointers for function
calls. Move the common checks into a helper for reuse. The state
has to be encapsulated into a structure to support parameters
other than BPF_REG_2.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We will reuse most of map call code gen for other map calls.
Rename the lookup gen function and use meta->func_id instead
of hard-coding lookup.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch is the front end of this optimisation, it detects and marks
those packet reads that could be cached. Then the optimisation "backend"
will be activated automatically.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch add the support for unaligned read offset, i.e. the read offset
to the start of packet cache area is not aligned to REG_WIDTH. In this
case, the read area might across maximum three transfer-in registers.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This patch assumes there is a packet data cache, and would try to read
packet data from the cache instead of from memory.
This patch only implements the optimisation "backend", it doesn't build
the packet data cache, so this optimisation is not enabled.
This patch has only enabled aligned packet data read, i.e. when the read
offset to the start of cache is REG_WIDTH aligned.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
fix iwlwifi_dev_ucode_error tracepoint to pass pointer to a table
instead of all 17 arguments by value.
dvm/main.c and mvm/utils.c have 'struct iwl_error_event_table'
defined with very similar yet subtly different fields and offsets.
tracepoint is still common and using definition of 'struct iwl_error_event_table'
from dvm/commands.h while copying fields.
Long term this tracepoint probably should be split into two.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
two trace events defined with the same name and both unused.
They conflict in allyesconfig build. Rename one of them.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
We can set triggers that cause a debug data collection when something
of interest happens (e.g. when too many probes are lost conscutively).
Normally, this triggers don't cause the FW to be restarted, but in
some cases that may be desired, so we recover from the problem. To
support this, add a flag that indicates that the FW should be
restarted when the trigger fires.
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Currently we have a boolean variable for each cause.
This costs space, and requires to check each separately
when determining low latency.
Since we have another cause incoming, convert it to an enum.
While at it, move the retrieval of the prev value and the
assignment of the new value to be inside iwl_mvm_update_low_latency
and save the need for each caller to do it separately.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Some geographic profiles require specific handling. For example ETSI
profile requires special channel access handling. Add geographic
profile information to MCC_UPDATE response to allow it.
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
A lot of new PCI IDs were added for the 9000 series. Add them to the
list of supported PCI IDs.
Cc: stable@vger.kernel.org # 4.13+
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Remove fragmented_dwell_time and add num_of_fragments to support
the new API version.
Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The FW does not allocate quota air time for the binding of a station
MAC before iwlmvm indicates that it is associated. Currently iwlmvm
indicates that the MAC is associated only after hearing a beacon from
the AP. In case a deauthentication frame is sent before the MAC is
associated, the frame might not be sent as the corresponding binding
is not scheduled.
To handle such cases, set IEEE80211_HW_DEAUTH_NEED_MGD_TX_PREP in the
HW flags, requesting mac80211 to call the mgd_prepare_tx() callback
before transmitting a deauthentication frame if associated but no
beacon was heard from the AP.
In addition, do not warn in iwl_mvm_mac_mgd_prepare_tx() when already
associated as now the callback can be called also when associated.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Add support for Optimized Connectivity Experience (OCE). Get
capabilities from the fw, expose them with nl80211, and enable them in
UMAC scan if the relevant nl80211 flags are set by the userspace.
Signed-off-by: Roee Zamir <roee.zamir@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>