Add support for BCM957504 with device ID 1751
Signed-off-by: Erik Burrows <erik.burrows@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware error recover is the major change in this spec.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit b639583f9e.
As per discussion with Jakub Kicinski and Michal Kubecek,
this will be better addressed by soon-too-come ethtool netlink
API with additional indication that given configuration request
is supposed to be persisted.
Also, remove the parameter support from bnxt_en driver.
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, user can do dump or get of param values right after the
devlink params are registered. However the driver may not be initialized
which is an issue. The same problem happens during notification
upon param registration. Allow driver to publish devlink params
whenever it is ready to handle get() ops. Note that this cannot
be resolved by init reordering, as the "driverinit" params have
to be available before the driver is initialized (it needs the param
values there).
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.
Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-enabling RDMA driver support on 57500 chips. Removing the forced error
code for 57500 chip.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
BNXT only supports SWITCHDEV_ATTR_ID_PORT_PARENT_ID, which makes it a
great candidate to be converted to use the ndo_get_port_parent_id() NDO
instead of implementing switchdev_port_attr_get(). The conversion is
straight forward here since the PF and VF code use the same getter.
Since bnxt makes uses of switchdev_port_same_parent_id() convert it to
use netdev_port_same_parent_id().
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates drivers to use the new flow action infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides the flow_stats structure that acts as container for
tc_cls_flower_offload, then we can use to restore the statistics on the
existing TC actions. Hence, tcf_exts_stats_update() is not used from
drivers anymore.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch wraps the dissector key and mask - that flower uses to
represent the matching side - around the flow_match structure.
To avoid a follow up patch that would edit the same LoCs in the drivers,
this patch also wraps this new flow match structure around the flow rule
object. This new structure will also contain the flow actions in follow
up patches.
This introduces two new interfaces:
bool flow_rule_match_key(rule, dissector_id)
that returns true if a given matching key is set on, and:
flow_rule_match_XYZ(rule, &match);
To fetch the matching side XYZ into the match container structure, to
retrieve the key and the mask with one single call.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling firmware to allocate a CP ring or NQ, an interrupt associated
with that ring may be generated immediately before the doorbell is even
setup after the firmware call returns. When servicing the interrupt, the
driver may crash when trying to access the doorbell.
Fix it by disabling interrupt on that vector until the doorbell is
set up.
Fixes: 697197e5a1 ("bnxt_en: Re-structure doorbells.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Register devlink_port with devlink and create initial port params
table for bnxt_en. The table consists of a generic parameter:
wake_on_lan: Enables Wake on Lan for this port when magic packet
is received with this port's MAC address using ACPI pattern.
If enabled, the controller asserts a wake pin upon reception of
WoL packet. ACPI (Advanced Configuration and Power Interface) is
an industry specification for the efficient handling of power
consumption in desktop and mobile computers.
v2->v3:
- Modify bnxt_dl_wol_validate(), to throw error message when user gives
value other than DEVLINK_PARAM_WAKE_MAGIC ot to disable WOL.
- Use netdev_err() instead of netdev_warn(), when devlink_port_register()
and devlink_port_params_register() returns error. Also, don't log rc
in this message.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Fix regression in multi-SKB responses to RTM_GETADDR, from Arthur
Gautier.
2) Fix ipv6 frag parsing in openvswitch, from Yi-Hung Wei.
3) Unbounded recursion in ipv4 and ipv6 GUE tunnels, from Stefano
Brivio.
4) Use after free in hns driver, from Yonglong Liu.
5) icmp6_send() needs to handle the case of NULL skb, from Eric
Dumazet.
6) Missing rcu read lock in __inet6_bind() when operating on mapped
addresses, from David Ahern.
7) Memory leak in tipc-nl_compat_publ_dump(), from Gustavo A. R. Silva.
8) Fix PHY vs r8169 module loading ordering issues, from Heiner
Kallweit.
9) Fix bridge vlan memory leak, from Ido Schimmel.
10) Dev refcount leak in AF_PACKET, from Jason Gunthorpe.
11) Infoleak in ipv6_local_error(), flow label isn't completely
initialized. From Eric Dumazet.
12) Handle mv88e6390 errata, from Andrew Lunn.
13) Making vhost/vsock CID hashing consistent, from Zha Bin.
14) Fix lack of UMH cleanup when it unexpectedly exits, from Taehee Yoo.
15) Bridge forwarding must clear skb->tstamp, from Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
bnxt_en: Fix context memory allocation.
bnxt_en: Fix ring checking logic on 57500 chips.
mISDN: hfcsusb: Use struct_size() in kzalloc()
net: clear skb->tstamp in bridge forwarding path
net: bpfilter: disallow to remove bpfilter module while being used
net: bpfilter: restart bpfilter_umh when error occurred
net: bpfilter: use cleanup callback to release umh_info
umh: add exit routine for UMH process
isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
vhost/vsock: fix vhost vsock cid hashing inconsistent
net: stmmac: Prevent RX starvation in stmmac_napi_poll()
net: stmmac: Fix the logic of checking if RX Watchdog must be enabled
net: stmmac: Check if CBS is supported before configuring
net: stmmac: dwxgmac2: Only clear interrupts that are active
net: stmmac: Fix PCI module removal leak
tools/bpf: fix bpftool map dump with bitfields
tools/bpf: test btf bitfield with >=256 struct member offset
bpf: fix bpffs bitfield pretty print
net: ethernet: mediatek: fix warning in phy_start_aneg
tcp: change txhash on SYN-data timeout
...
When allocating memory pages for context memory, if the last page table
should be fully populated, the current code will set nr_pages to 0 when
calling bnxt_alloc_ctx_mem_blk(). This will cause the last page table
to be completely blank and causing some RDMA failures.
Fix it by setting the last page table's nr_pages to the remainder only
if it is non-zero.
Fixes: 08fe9d1816 ("bnxt_en: Add Level 2 context memory paging support.")
Reported-by: Eric Davis <eric.davis@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bnxt_hwrm_check_pf_rings(), add the proper flag to test the NQ
resources. Without the proper flag, the firmware will change
the NQ resource allocation and remap the IRQ, causing missing
IRQs. This issue shows up when adding MQPRIO TX queues, for example.
Fixes: 36d65be9a8 ("bnxt_en: Disable MSIX before re-reserving NQs/CMPL rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.
This change was generated with the following Coccinelle SmPL patch:
@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@
-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Lots of conflicts, by happily all cases of overlapping
changes, parallel adds, things of that nature.
Thanks to Stephen Rothwell, Saeed Mahameed, and others
for their guidance in these resolutions.
Signed-off-by: David S. Miller <davem@davemloft.net>
For a little better performance on faster machines and faster link
speeds.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Older firmware only supports 16-bit flow handle, because of which the
number of flows that can be offloaded can’t scale beyond a point.
Newer firmware supports 64-bit flow handle enabling the host to scale
upto millions of flows. With the new 64-bit flow handle support, driver
has to query flow stats in a different way compared to the older approach.
This patch adds support for 64-bit flow handle and new way to query
flow stats.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If RDMA is supported on the 57500 chip, increase context memory
allocations for the resources used by RDMA.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the new functions bnxt_alloc_ctx_pg_tbls()/bnxt_free_ctx_pg_tbls()
to allocate and free pages for context memory. The new functions
will handle the different levels of paging support and allocate/free
the pages accordingly using the existing functions.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To support level 2 context page memory structures, enhance the
bnxt_ring_mem_info structure with a "depth" field to specify the page
level and add a flag to specify using full pages for L1 and L2 page
tables. This is needed to support RDMA functionality on 57500 chips
since RDMA requires more context memory.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Earlier, some of the firmware commands (ex: CFA_FLOW_*) which are processed
by KONG processor were sent to the CHIMP processor from the host. This
approach was taken as there was no direct message channel to KONG.
CHIMP in turn used to send them to KONG. Newer firmware supports a new
message channel which the host can send messages directly to the KONG
processor.
This patch adds support for required changes needed in the driver
to support direct KONG message channel. This speeds up flow related
messages sent to the firmware for CLS_FLOWER offload.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These routines will be enhanced in the subsequent patch to
return the 2nd firmware comm. channel's hwrm response address &
sequence id respectively.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Typecast hwrm_cmd_resp_addr to (u8 *) from (void *) before doing
arithmetic.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for adding a 2nd communication channel to firmware.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set hwrm_intr_seq_id value to its inverted value instead of
HWRM_SEQ_INVALID, when an hwrm completion of type
CMPL_BASE_TYPE_HWRM_DONE is received. This will enable us to use
the complete 16-bit sequence ID space.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The major changes are in the flow offload firmware APIs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code has 2 problems. It assumes that the RX ring for
the loopback packet is combined with the TX ring. This is not
true if the ethtool channels are set to non-combined mode. The
second problem is that it won't work on 57500 chips without
adjusting the logic to get the proper completion ring (cpr) pointer.
Fix both issues by locating the proper cpr pointer through the RX
ring.
Fixes: e44758b78a ("bnxt_en: Use bnxt_cp_ring_info struct pointer as parameter for RX path.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With autoneg enabled, PHY loopback test fails. To disable autoneg,
driver needs to send a valid forced speed to FW. FW is not sending
async event for invalid speeds. To fix this, query forced speeds
and send the correct speed when disabling autoneg mode.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Port statistics which include RDMA counters are useful even when the
netdevice is down. Do not free the port statistics DMA buffers
when the netdevice is down. This is keep the snapshot of the port
statistics and counters will just continue counting when the
netdevice goes back up.
Split the bnxt_free_stats() function into 2 functions. The port
statistics buffers will only be freed when the netdevice is
removed.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the current driver, the statistics reported by .ndo_get_stats64()
are reset when the device goes down. Store a snapshot of the
rtnl_link_stats64 before shutdown. This snapshot is added to the
current counters in .ndo_get_stats64() so that the counters will not
get reset when the device is down.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently firmware specific errors are returned directly in flash_device
and reset ethtool hooks. Modify it to return linux standard errors
to userspace when flashing operations fail.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the code allows ETS bandwidth weight 0 to be set on unused TCs.
We should not set any DCB parameters on unused TCs at all.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Display the CoS counters as additional priority counters by looking up
the priority to CoS queue mapping. If the TX extended port statistics
block size returned by firmware is big enough to cover the CoS counters,
then we will display the new priority counters. We call firmware to get
the up-to-date pri2cos mapping to convert the CoS counters to
priority counters.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are some minor differences when assigning VF resources on the
new chips. The MSIX (NQ) resource has to be assigned and ring group
is not needed on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When bringing up a device, the code checks to see if the number of
MSIX has changed. pci_disable_msix() should be called first before
changing the number of reserved NQs/CMPL rings. This ensures that
the MSIX vectors associated with the NQs/CMPL rings are still
properly mapped when pci_disable_msix() masks the vectors.
This patch will prevent errors when RDMA support is added for the new
57500 chips. When the RDMA driver shuts down, the number of NQs is
decreased and we must use the new sequence to prevent MSIX errors.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en requires same number of stat_ctxs as CP rings but RDMA
requires only 1 stat_ctx. Also add a new parameter resv_stat_ctxs
to better keep track of stat_ctxs reserved including resources used
by RDMA. Add a stat_ctxs parameter to all the relevant resource
reservation functions so we can reserve the correct number of
stat_ctxs.
Prior to this patch, we were not reserving the extra stat_ctx for
RDMA and RDMA would not work on the new 57500 chips.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling bnxt_set_max_func_stat_ctxs() to modify max stat_ctxs requested
or freed by the RDMA driver is wrong. After introducing reservation of
resources recently, the driver has to keep track of all stat_ctxs
including the ones used by the RDMA driver. This will provide a better
foundation for accurate accounting of the stat_ctxs.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For bnxt_en driver, stat_ctxs created will always be same as
cp_nr_rings. Remove extra variable that duplicates the value.
Also introduce bnxt_get_avail_stat_ctxs_for_en() helper to get
available stat_ctxs and bnxt_get_ulp_stat_ctxs() helper to return
number of stat_ctxs used by RDMA.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The available CP rings are calculated differently on the new 57500
chips, so add this helper to do this calculation correctly. The
VFs will be assigned these available CP rings.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF has a pool of NQs and MSIX vectors assigned to it based on
NVRAM configurations. The number of usable MSIX vectors on the PF
is the minimum of the NQs and MSIX vectors. Any excess NQs without
associated MSIX may be used for the VFs, so we need to store this
max_nqs value. max_nqs minus the NQs used by the PF will be the
available NQs for the VFs.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drivers may not be able to implement a VLAN addition or reconfiguration.
In those cases it's desirable to explain to the user that it was
rejected (and why).
To that end, add extack argument to ndo_bridge_setlink. Adapt all users
to that change.
Following patches will use the new argument in the bridge driver.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en 0000:19:00.0 (unregistered net_device) (uninitialized): hwrm
req_type 0x190 seq id 0x6 error 0xffff
The message above is commonly seen when a newer driver is used on
hardware with older firmware. The issue is this message means nothing to
anyone except Broadcom. Remove the message to not confuse users as this
message is really not very informative.
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CP rings are accounted differently on the new 57500 chips. There
must be enough CP rings for the sum of RX and TX rings on the new
chips. The current logic may be over-estimating the RX and TX rings.
The output parameter max_cp should be the maximum NQs capped by
MSIX vectors available for networking in the context of 57500 chips.
The existing code which uses CMPL rings capped by the MSIX vectors
works most of the time but is not always correct.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new 57500 chips have introduced the NQ structure in addition to
the existing CP rings in all chips. We need to introduce a new
bnxt_nq_rings_in_use(). On legacy chips, the 2 functions are the
same and one will just call the other. On the new chips, they
refer to the 2 separate ring structures. The new function is now
called to determine the resource (NQ or CP rings) associated with
MSIX that are in use.
On 57500 chips, the RDMA driver does not use the CP rings so
we don't need to do the subtraction adjustment.
Fixes: 41e8d79837 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new 57500 chips use 1 NQ per MSIX vector, whereas legacy chips use
1 CP ring per MSIX vector. To better unify this, add a resv_irqs
field to struct bnxt_hw_resc. On legacy chips, we initialize resv_irqs
with resv_cp_rings. On new chips, we initialize it with the allocated
MSIX resources.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes to support the 57500 devices have created this
regression. The bnxt_hwrm_queue_qportcfg() call was moved to be
called earlier before the RDMA support was determined, causing
the CoS queues configuration to be set before knowing whether RDMA
was supported or not. Fix it by moving it to the right place right
after RDMA support is determined.
Fixes: 98f04cf0f1 ("bnxt_en: Check context memory requirements from firmware.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the year and month offset while storing it in
bnxt_fill_coredump_record().
Fixes: 6c5657d085 ("bnxt_en: Add support for ethtool get dump.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To keep track of the number of times the workaround code for 57500 A0
has been triggered. This is a per NQ counter.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hardware can sometimes not generate NQ MSIX with a single pending
CP ring entry. This seems to always happen at the last entry of
the CP ring before it wraps. Add logic to check all the CP rings for
pending entries without the CP ring consumer index advancing. Calling
HWRM_DBG_RING_INFO_GET to read the context of the CP ring will flush
out the NQ entry and MSIX.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no RDMA support on 57500 chips yet, so prevent bnxt_re from
registering on these chips. There is intermittent failure if bnxt_re
is allowed to register and proceed with RDMA operations.
Fixes: 1ab968d2f1 ("bnxt_en: Add PCI ID for BCM57508 device.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The software counter structure is defined in both the CP ring's structure
and the NQ ring's structure on the new devices. The legacy code adds the
counter to the CP ring's structure and the counter won't get displayed
since the ethtool code is looking at the NQ ring's structure.
Since all other counters are contained in the NQ ring's structure, it
makes more sense to count rx_l4_csum_errors in the NQ.
Fixes: 50e3ab7836 ("bnxt_en: Allocate completion ring structures for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent commit has added the reservation of RSS context. This requires
bnxt_hwrm_vnic_qcaps() to be called before allocating any RSS contexts.
The bnxt_hwrm_vnic_qcaps() call sets up proper flags that will
determine how many RSS contexts to allocate to support NTUPLE.
This causes a regression that too many RSS contexts are being reserved
and causing resource shortage when enabling many VFs. Fix it by calling
bnxt_hwrm_vnic_qcaps() earlier.
Fixes: 41e8d79837 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlvPV7IUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vyaUg//WnCaRIu2oKOp8c/bplZJDW5eT10d
oYAN9qeyptU9RYrg4KBNbZL9UKGFTk3AoN5AUjrk8njxc/dY2ra/79esOvZyyYQy
qLXBvrXKg3yZnlNlnyBneGSnUVwv/kl2hZS+kmYby2YOa8AH/mhU0FIFvsnfRK2I
XvwABFm2ZYvXCqh3e5HXaHhOsR88NQ9In0AXVC7zHGqv1r/bMVn2YzPZHL/zzMrF
mS79tdBTH+shSvchH9zvfgIs+UEKvvjEJsG2liwMkcQaV41i5dZjSKTdJ3EaD/Y2
BreLxXRnRYGUkBqfcon16Yx+P6VCefDRLa+RhwYO3dxFF2N4ZpblbkIdBATwKLjL
npiGc6R8yFjTmZU0/7olMyMCm7igIBmDvWPcsKEE8R4PezwoQv6YKHBMwEaflIbl
Rv4IUqjJzmQPaA0KkRoAVgAKHxldaNqno/6G1FR2gwz+fr68p5WSYFlQ3axhvTjc
bBMJpB/fbp9WmpGJieTt6iMOI6V1pnCVjibM5ZON59WCFfytHGGpbYW05gtZEod4
d/3yRuU53JRSj3jQAQuF1B6qYhyxvv5YEtAQqIFeHaPZ67nL6agw09hE+TlXjWbE
rTQRShflQ+ydnzIfKicFgy6/53D5hq7iH2l7HwJVXbXRQ104T5DB/XHUUTr+UWQn
/Nkhov32/n6GjxQ=
=58I4
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- Fix ASPM link_state teardown on removal (Lukas Wunner)
- Fix misleading _OSC ASPM message (Sinan Kaya)
- Make _OSC optional for PCI (Sinan Kaya)
- Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set
(Patrick Talbert)
- Remove x86 and arm64 node-local allocation for host bridge structures
(Punit Agrawal)
- Pay attention to device-specific _PXM node values (Jonathan Cameron)
- Support new Immediate Readiness bit (Felipe Balbi)
- Differentiate between pciehp surprise and safe removal (Lukas Wunner)
- Remove unnecessary pciehp includes (Lukas Wunner)
- Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)
- Tolerate PCIe Slot Presence Detect being hardwired to zero to
workaround broken hardware, e.g., the Wilocity switch/wireless device
(Lukas Wunner)
- Unify pciehp controller & slot structs (Lukas Wunner)
- Constify hotplug_slot_ops (Lukas Wunner)
- Drop hotplug_slot_info (Lukas Wunner)
- Embed hotplug_slot struct into users instead of allocating it
separately (Lukas Wunner)
- Initialize PCIe port service drivers directly instead of relying on
initcall ordering (Keith Busch)
- Restore PCI config state after a slot reset (Keith Busch)
- Save/restore DPC config state along with other PCI config state
(Keith Busch)
- Reference count devices during AER handling to avoid race issue with
concurrent hot removal (Keith Busch)
- If an Upstream Port reports ERR_FATAL, don't try to read the Port's
config space because it is probably unreachable (Keith Busch)
- During error handling, use slot-specific reset instead of secondary
bus reset to avoid link up/down issues on hotplug ports (Keith Busch)
- Restore previous AER/DPC handling that does not remove and
re-enumerate devices on ERR_FATAL (Keith Busch)
- Notify all drivers that may be affected by error recovery resets
(Keith Busch)
- Always generate error recovery uevents, even if a driver doesn't have
error callbacks (Keith Busch)
- Make PCIe link active reporting detection generic (Keith Busch)
- Support D3cold in PCIe hierarchies during system sleep and runtime,
including hotplug and Thunderbolt ports (Mika Westerberg)
- Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
are empty or occupied (Jon Derrick)
- Remove duplicated include from pci/pcie/err.c and unused variable
from cpqphp (YueHaibing)
- Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
Pawandeep)
- Uninline PCI bus accessors for better ftracing (Keith Busch)
- Remove unused AER Root Port .error_resume method (Keith Busch)
- Use kfifo in AER instead of a local version (Keith Busch)
- Use threaded IRQ in AER bottom half (Keith Busch)
- Use managed resources in AER core (Keith Busch)
- Reuse pcie_port_find_device() for AER injection (Keith Busch)
- Abstract AER interrupt handling to disconnect error injection (Keith
Busch)
- Refactor AER injection callbacks to simplify future improvments
(Keith Busch)
- Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski)
- Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko)
- Add switch fall-through annotations (Gustavo A. R. Silva)
- Remove unused Switchtec quirk variable (Joshua Abraham)
- Fix pci.c kernel-doc warning (Randy Dunlap)
- Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig)
- Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng)
- Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid
useless dmesg errors (Logan Gunthorpe)
- Update Switchtec NTB documentation (Wesley Yung)
- Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz)
- Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang)
- Add PCI support for peer-to-peer DMA (Logan Gunthorpe)
- Add sysfs group for PCI peer-to-peer memory statistics (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
Gunthorpe)
- Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA driver writer's documentation (Logan
Gunthorpe)
- Add block layer flag to indicate driver support for PCI peer-to-peer
DMA (Logan Gunthorpe)
- Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
memory (Logan Gunthorpe)
- Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
Gunthorpe)
- Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
Gunthorpe)
- Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
Christoph Hellwig, Logan Gunthorpe)
- Cache VF config space size to optimize enumeration of many VFs
(KarimAllah Ahmed)
- Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)
- Fix VMD AERSID quirk Device ID matching (Jon Derrick)
- Fix Cadence PHY handling during probe (Alan Douglas)
- Signal Cadence Endpoint interrupts via AXI region 0 instead of last
region (Alan Douglas)
- Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan
Douglas)
- Remove redundant controller tests for "device_type == pci" (Rob
Herring)
- Document R-Car E3 (R8A77990) bindings (Tho Vu)
- Add device tree support for R-Car r8a7744 (Biju Das)
- Drop unused mvebu PCIe capability code (Thomas Petazzoni)
- Add shared PCI bridge emulation code (Thomas Petazzoni)
- Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni)
- Add aardvark Root Port emulation (Thomas Petazzoni)
- Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach)
- Add initial power management for i.MX7 (Leonard Crestez)
- Add PME_Turn_Off support for i.MX7 (Leonard Crestez)
- Fix qcom runtime power management error handling (Bjorn Andersson)
- Update TI dra7xx unaligned access errata workaround for host mode as
well as endpoint mode (Vignesh R)
- Fix kirin section mismatch warning (Nathan Chancellor)
- Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare)
- Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I)
- Update Keystone to use MRRS quirk for host bridge instead of open
coding (Kishon Vijay Abraham I)
- Refactor Keystone link establishment (Kishon Vijay Abraham I)
- Simplify and speed up Keystone link training (Kishon Vijay Abraham I)
- Remove unused Keystone host_init argument (Kishon Vijay Abraham I)
- Merge Keystone driver files into one (Kishon Vijay Abraham I)
- Remove redundant Keystone platform_set_drvdata() (Kishon Vijay
Abraham I)
- Rename Keystone functions for uniformity (Kishon Vijay Abraham I)
- Add Keystone device control module DT binding (Kishon Vijay Abraham
I)
- Use SYSCON API to get Keystone control module device IDs (Kishon
Vijay Abraham I)
- Clean up Keystone PHY handling (Kishon Vijay Abraham I)
- Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I)
- Clean up Keystone config space access checks (Kishon Vijay Abraham I)
- Get Keystone outbound window count from DT (Kishon Vijay Abraham I)
- Clean up Keystone outbound window configuration (Kishon Vijay Abraham
I)
- Clean up Keystone DBI setup (Kishon Vijay Abraham I)
- Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I)
- Fix Keystone IRQ status checking (Kishon Vijay Abraham I)
- Add debug messages for all Keystone errors (Kishon Vijay Abraham I)
- Clean up Keystone includes and macros (Kishon Vijay Abraham I)
- Fix Mediatek unchecked return value from devm_pci_remap_iospace()
(Gustavo A. R. Silva)
- Fix Mediatek endpoint/port matching logic (Honghui Zhang)
- Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui
Zhang)
- Remove redundant Mediatek PM domain check (Honghui Zhang)
- Convert Mediatek to pci_host_probe() (Honghui Zhang)
- Fix Mediatek MSI enablement (Honghui Zhang)
- Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang)
- Add Mediatek loadable module support (Honghui Zhang)
- Detach VMD resources after stopping root bus to prevent orphan
resources (Jon Derrick)
- Convert pcitest build process to that used by other tools (iio, perf,
etc) (Gustavo Pimentel)
* tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI/AER: Refactor error injection fallbacks
PCI/AER: Abstract AER interrupt handling
PCI/AER: Reuse existing pcie_port_find_device() interface
PCI/AER: Use managed resource allocations
PCI: pcie: Remove redundant 'default n' from Kconfig
PCI: aardvark: Implement emulated root PCI bridge config space
PCI: mvebu: Convert to PCI emulated bridge config space
PCI: mvebu: Drop unused PCI express capability code
PCI: Introduce PCI bridge emulated config space common logic
PCI: vmd: Detach resources after stopping root bus
nvmet: Optionally use PCI P2P memory
nvmet: Introduce helper functions to allocate and free request SGLs
nvme-pci: Add support for P2P memory in requests
nvme-pci: Use PCI p2pmem subsystem to manage the CMB
IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
block: Add PCI P2P flag for request queue
PCI/P2PDMA: Add P2P DMA driver writer's documentation
docs-rst: Add a new directory for PCI documentation
PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
...
The struct type was copied from the line before but it should be "tx"
instead of "rx". I have reviewed the code and I can't immediately see
that this bug causes a runtime issue.
Fixes: 36e53349b6 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new poll function that polls for NQ events. If the NQ event is
a CQ notification, we locate the CP ring from the cq_handle and call
__bnxt_poll_work() to handle RX/TX events on the CP ring.
Add a new has_more_work field in struct bnxt_cp_ring_info to indicate
budget has been reached. __bnxt_poll_cqs_done() is called to update or
ARM the CP rings if budget has not been reached or not. If budget
has been reached, the next bnxt_poll_p5() call will continue to poll
from the CQ rings directly. Otherwise, the NQ will be ARMed for the
next IRQ.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the CP ring polling logic in bnxt_poll_work() into 2 separate
functions __bnxt_poll_work() and __bnxt_poll_work_done(). Since the logic
is separated, we need to add tx_pkts and events fields to struct bnxt_napi
to keep track of the events to handle between the 2 functions. We also
add had_work_done field to struct bnxt_cp_ring_info to indicate whether
some work was performed on the CP ring.
This is needed to better support the 57500 chips. We need to poll up to
2 separate CP rings before we update or ARM the CP rings on the 57500 chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On legacy chips, the CP ring may be shared between RX and TX and so only
setup the RX coalescing parameters in such a case. On 57500 chips, we
always have a dedicated CP ring for TX so we can always set up the
TX coalescing parameters in bnxt_hwrm_set_coal().
Also, the min_timer coalescing parameter applies to the NQ on the new
chips and a separate firmware call needs to be made to set it up.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the RX code path, we current use the bnxt_napi struct pointer to
identify the associated RX/CP rings. Change it to use the struct
bnxt_cp_ring_info pointer instead since there are now up to 2
CP rings per MSIX.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RSS context allocation and RSS indirection table setup are very different
on the new chip. Refactor bnxt_setup_vnic() to call 2 different functions
to set up RSS for the vnic based on chip type. On the new chip, the
number of RSS contexts and the indirection table size depends on the
number of RX rings. Each indirection table entry is also different
on the new chip since ring groups are no longer used.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, we need to allocate one RSS context for every
64 RX rings. In previous chips, only one RSS context per vnic is
required regardless of the number of RX rings. So increase the max
RSS context array count to 8.
Hardware ring groups are not used on the new chips. Note that the
software ring group structure is still maintained in the driver to
keep track of the rings associated with the vnic.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, we allocate/free one CP ring for each RX ring or
TX ring separately. Using separate CP rings for RX/TX is an improvement
as TX events will no longer be stuck behind RX events.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Firmware ring allocation semantics are slightly different for most
ring types on 57500 chips. Allocation/deallocation for NQ rings are
also added for the new chips.
A CP ring handle is also added so that from the NQ interrupt event,
we can locate the CP ring.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the new 57500 chips, getting the associated CP ring ID associated with
an RX ring or TX ring is different than before. On the legacy chips,
we find the associated ring group and look up the CP ring ID. On the
57500 chips, each RX ring and TX ring has a dedicated CP ring even if
they share the MSIX. Use these helper functions at appropriate places
to get the CP ring ID.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On 57500 chips, the original bnxt_cp_ring_info struct now refers to the
NQ. bp->cp_nr_rings refer to the number of NQs on 57500 chips. There
are now 2 pointers for the CP rings associated with RX and TX rings.
Modify bnxt_alloc_cp_rings() and bnxt_free_cp_rings() accordingly.
With multiple CP rings per NAPI, we need to add a pointer in
bnxt_cp_ring_info struct to point back to the bnxt_napi struct.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ring reservation functions have to be modified for P5 chips in the
following ways:
- bnxt_cp_ring_info structs map to internal NQs as well as CP rings.
- Ring groups are not used.
- 1 CP ring must be available for each RX or TX ring.
- number of RSS contexts to reserve is multiples of 64 RX rings.
- RFS currently not supported.
Also, RX AGG rings are only used for jumbo frames, so we need to
unconditionally call bnxt_reserve_rings() in __bnxt_open_nic()
to see if we need to reserve AGG rings in case MTU has changed.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the maximum MSIX capability in PCIe config. space earlier. When
we call firmware to query capability, we need to compare the PCIe
MSIX max count with the firmware count and use the smaller one as
the MSIX count for 57500 (P5) chips.
The new chips don't use ring groups. But previous chips do and
the existing logic limits the available rings based on resource
calculations including ring groups. Setting the max ring groups to
the max rx rings will work on the new chips without changing the
existing logic.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 57500 series chips have a new 64-bit doorbell format. Use a new
bnxt_db_info structure to unify the new and the old 32-bit doorbells.
Add a new bnxt_set_db() function to set up the doorbell addreses and
doorbell keys ahead of time. Modify and introduce new doorbell
helpers to help abstract and unify the old and new doorbells.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
57500 series is a new chip class (P5) that requires some driver changes
in the next several patches. This adds basic chip ID, doorbells, and
the notification queue (NQ) structures. Each MSIX is associated with an
NQ instead of a CP ring in legacy chips. Each NQ has up to 2 associated
CP rings for RX and TX. The same bnxt_cp_ring_info struct will be used
for the NQ.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call firmware to configure the DMA addresses of all context memory
pages on new devices requiring context memory.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New device requires host context memory as a backing store. Call
firmware to check for context memory requirements and store the
parameters. Allocate host pages accordingly.
We also need to move the call bnxt_hwrm_queue_qportcfg() earlier
so that all the supported hardware queues and the IDs are known
before checking and allocating context memory.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer chips require the PTU_PTE_VALID bit to be set for every page
table entry for context memory and rings. Additional bits are also
required for page table entries for all rings. Add a flags field to
bnxt_ring_mem_info struct to specify these additional bits to be used
when setting up the pages tables as needed.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the DMA page table and vmem fields in bnxt_ring_struct to a new
bnxt_ring_mem_info struct. This will allow context memory management
for a new device to re-use some of the existing infrastructure.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
New firmware spec. allows interrupt coalescing parameters, such as
maximums, timer units, supported features to be queried. Update
the driver to make use of the new call to query these parameters
and provide the legacy defaults if the call is not available.
Replace the hard-coded values with these parameters.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Support the max_ext_req_len field from the HWRM_VER_GET_RESPONSE.
If this field is valid and greater than the mailbox size, use the
short command format to send firmware messages greater than the
mailbox size. Newer devices use this method to send larger messages
to the firmware.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Latest firmware spec. has some additional rx extended port stats and new
tx extended port stats added. We now need to check the size of the
returned rx and tx extended stats and determine how many counters are
valid. New counters added include CoS byte and packet counts for rx
and tx.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Among the new changes are trusted VF support, 200Gbps support, and new
API to dump ring information on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need to compare *val.vu32* with < 0 because
such variable is of type u32 (32 bits, unsigned), making it
impossible to hold a negative value. Fix this by removing
such comparison.
Also, initialize variable *max_val* to -1, just in case
it is not initialized to either BNXT_MSIX_VEC_MAX or
BNXT_MSIX_VEC_MIN_MAX before using it in a comparison
with val.vu32 at line 159:
if (val.vu32 > max_val)
Addresses-Coverity-ID: 1473915 ("Unsigned compared against 0")
Addresses-Coverity-ID: 1473920 ("Uninitialized scalar variable")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When getting the max rings supported, get the reduced max_irqs
by the ones used by RDMA.
If the number MSIX is the limiting factor, this bug may cause the
max ring count to be higher than it should be when RDMA driver is
loaded and may result in ring allocation failures.
Fixes: 30f529473e ("bnxt_en: Do not modify max IRQ count after RDMA driver requests/frees IRQs.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the driver probe fails, all the resources that were allocated prior
to the failure must be freed. However, hwrm dma response memory is not
getting freed.
This patch fixes the problem described above.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In HWRM_QUEUE_COS2BW_CFG request, enables field should have the bits
set only for the queue ids which are having the valid parameters.
This causes firmware to return error when the TC to hardware CoS queue
mapping is not 1:1 during DCBNL ETS setup.
Fixes: 2e8ef77ee0 ("bnxt_en: Add TC to hardware QoS queue mapping logic.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The enables bit for VNIC was set wrong when calling the HWRM_FUNC_CFG
firmware call to reserve VNICs. This has the effect that the firmware
will keep a large number of VNICs for the PF, and having very few for
VFs. DPDK driver running on the VFs, which requires more VNICs, may not
work properly as a result.
Fixes: 674f50a5b0 ("bnxt_en: Implement new method to reserve rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds following driver-specific permanent mode boolean
parameter.
gre_ver_check - Generic Routing Encapsulation(GRE) version check
will be enabled in the device. If disabled, device skips version
checking for GRE packets.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for following generic permanent mode
devlink parameters. They can be modified using devlink param
commands.
msix_vec_per_pf_max - This param sets the number of MSIX vectors
that the device requests from the host on driver initialization.
This value is set in the device which limits MSIX vectors per PF.
msix_vec_per_pf_min - This param sets the number of minimal MSIX
vectors required for the device initialization. Value 0 indicates
a default value is selected. This value is set in the device which
limits MSIX vectors per PF.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return proper error code when Firmware returns
HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED for HWRM_NVM_GET/SET_VARIABLE
commands.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for ignore_ari generic permanent mode
devlink parameter. This parameter is disabled by default. It can be
enabled using devlink param commands.
ignore_ari - If enabled, device ignores ARI(Alternate Routing ID)
capability, even when platforms has the support and creates same number
of partitions when platform does not support ARI capability.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5 core driver and ethernet netdev updates, please note there is a small
devlink releated update to allow extack argument to eswitch operations.
From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures
From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures
From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbtU45AAoJEEg/ir3gV/o+/C4H/RHA4KImrb476EdB3VNYMqAN
dgXb+bmh6sZP+jHWqQ4c3aVeh6/T8qm4gwiSn2nVTtHEnxtCdIYljzDC1Nswczeg
pSjD1eOP7M1LpAOmBb8xdnJcX7yM7r1bTklnp2sN853WShbsDRYgZBHsBwTzx25U
ZdzL4QTLuohlG/aLrbGXMntIy45ya2fVQrnK54s18nFlgsdFjEs0mi0xaUKNBC6+
P8CTohHAxuuxmL5b+6MIYLZCdgd8cLNQFdtqbckEVw7SvcRTxfraRlyqJ0YOgTGB
TdSWnqZz2JYH29wSFbpFG8qX6GCv8FoiZ+fKzldbolHk442rrktHv3+Y7qQuZVs=
=NVks
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2018-10-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-10-03
mlx5 core driver and ethernet netdev updates, please note there is a small
devlink releated update to allow extack argument to eswitch operations.
From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures
From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures
From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add extack argument to the eswitch related operations.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
After bfcb79fca1 ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.
Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The current netpoll implementation in the bnxt_en driver has problems
that may miss TX completion events. bnxt_poll_work() in effect is
only handling at most 1 TX packet before exiting. In addition,
there may be in flight TX completions that ->poll() may miss even
after we fix bnxt_poll_work() to handle all visible TX completions.
netpoll may not call ->poll() again and HW may not generate IRQ
because the driver does not ARM the IRQ when the budget (0 for netpoll)
is reached.
We fix it by handling all TX completions and to always ARM the IRQ
when we exit ->poll() with 0 budget.
Also, the logic to ACK the completion ring in case it is almost filled
with TX completions need to be adjusted to take care of the 0 budget
case, as discussed with Eric Dumazet <edumazet@google.com>
Reported-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Version bump conflict in batman-adv, take what's in net-next.
iavf conflict, adjustment of netdev_ops in net-next conflicting
with poll controller method removal in net.
Signed-off-by: David S. Miller <davem@davemloft.net>
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.
bnxt uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt offload code currently supports only 'push' and 'pop' operation: let
.ndo_setup_tc() return -EOPNOTSUPP if VLAN 'modify' action is configured.
Fixes: 2ae7408fed ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the following compile warning:
drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c:49:5: warning: nvm_param.dir_type may be used uninitialized in this function [-Wmaybe-uninitialized]
if (nvm_param.dir_type == BNXT_NVM_PORT_CFG)
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two new tls tests added in parallel in both net and net-next.
Used Stephen Rothwell's linux-next resolution.
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent commit to always forward the VF MAC address to the PF for
approval may not work if the PF driver or the firmware is older. This
will cause the VF driver to fail during probe:
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf seq id 0x5 error 0xffff
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): VF MAC address 00:00:17:02:05:d0 not approved by the PF
bnxt_en 0000:00:03.0: Unable to initialize mac address.
bnxt_en: probe of 0000:00:03.0 failed with error -99
We fix it by treating the error as fatal only if the VF MAC address is
locally generated by the VF.
Fixes: 707e7e9660 ("bnxt_en: Always forward VF MAC address to the PF.")
Reported-by: Seth Forshee <seth.forshee@canonical.com>
Reported-by: Siwei Liu <loseweigh@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>