This patch adds GSO support to the sunvnet driver.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for sender-side checksum offloading.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds scatter/gather support to the sunvnet driver.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for VIO v1.7 (extended descriptor format)
and v1.8 (receive-side checksumming) to the sunvnet driver.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the name of vnet_port_alloc_tx_bufs to
vnet_port_alloc_tx_ring, since there are no buffer allocations after
transmit zero copy support was added. This patch also moves the ring
allocation to after VIO version negotiation to allow for
different-sized descriptors in later VIO versions.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ConnectX HW is capable of using one of the following hash functions:
Toeplitz and an XOR hash function. This patch extends the implementation
of the mlx4_en driver set/get_rxfh callbacks to support getting and
setting the RSS hash function used by the device.
Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends the set/get_rxfh ethtool-options for getting or
setting the RSS hash function.
It modifies drivers implementation of set/get_rxfh accordingly.
This change also delegates the responsibility of checking whether a
modification to a certain RX flow hash parameter is supported to the
driver implementation of set_rxfh.
User-kernel API is done through the new hfunc bitmask field in the
ethtool_rxfh struct. A bit set in the hfunc field is corresponding to an
index in the new string-set ETH_SS_RSS_HASH_FUNCS.
Got approval from most of the relevant driver maintainers that their
driver is using Toeplitz, and for the few that didn't answered, also
assumed it is Toeplitz.
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Cc: Prashant Sreedharan <prashant@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
Cc: Carolyn Wyborny <carolyn.wyborny@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Cc: Greg Rose <gregory.v.rose@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Mitch Williams <mitch.a.williams@intel.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Shradha Shah <sshah@solarflare.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Cc: "VMware, Inc." <pv-drivers@vmware.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mvneta_tx() dereferences skb to get skb->len too late,
as hardware might have completed the transmit and TX completion
could have freed the skb from another cpu.
Fixes: 71f6d1b31f ("net: mvneta: replace Tx timer with a real interrupt")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-12-06
This series contains updates to i40e and i40evf.
Shannon provides several patches to cleanup and fix i40e. First removes
an unneeded break statement in i40e_vsi_link_event(). Then removes
some debug messages that really do not give any useful information and
ends up getting printed every service_task loop, which fills the logfile
with noise when AQ tracing is enabled. Updates the aq_cmd arguments to
use %i which is much more forgiving and user friendly than the more
restrictive %x, or %d. Fixes the netdev_stat macro, where the old
xxx_NETDEV_STAT() macro was defined long before the newer
rtnl_link_stats64 came into being, and just never got updated.
Getting the pf_id from the function number had an issue when
when the PF was setup in passthru mode, the PCI bus/device/function
was virtualized and the number in the VM is different from the number in
the bare metal. This caused HW configuration issues when the wrong pf_id
was used to set up the HMC and other structures. The PF_FUNC_RID register
has the real bus/device/function information as configured by the BIOS,
so use that for a better number.
Carolyn adds additional text description for the base pf0 and flow
director generated interrupts, since these interrupts are difficult
to distinguish per port on a multi-function device.
Jacob resolves an issue related to images with multiple PFs per
physical port. We cannot fully support 1588 PTP features, since only
one port should control (i.e. write) the registers at a time. Doing
so can cause interference of functionality.
Anjali provides several updates to i40e, first adds the Virtual Channel
OP event opcode for CONFIG_RSS, so that the Virtual Channel state
machine can properly decipher status change events. Then updates the
driver to add (and use) i40e_is_vf macro for future expansion when new
VF MAC types get added. Adds new update VSI flow to accommodate a
firmware dix with VSI loopback mode. All VSIs on a VEB should either
have loopback enabled or disabled, a mixed mode is not supported for a
VEB. Since our driver supports multiple VSIs per PF that need to talk to
each other make sure to enable Loopback for the PF and FDIR VSI as well.
Mitch provides a couple of i40e and i40evf patches. First updates
i40evf init code more adept at handling when multiple VFs attempt
to initialize simultaneously.
Joe Perches provides a i40e patch which resolves a compile warning
about about frame size being larger than 2048 bytes by reducing the
stack use by using kmemdup and not using a very large struct on the
stack.
v2:
- Dropped patch 13 & 14 while Mitch reworks the patches based on
feedback from Ben Hutchings, probably the tryptophan in the turkey
is to blame for the delay...
- Added Joe Perches patch which resolves a compile warning about frame
size
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace rtl_skb_pad with eth_skb_pad since they do the same thing.
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update myri10ge to use eth_skb_pad helper. This also corrects a minor
issue as the driver was updating length without updating the tail pointer.
Cc: Hyong-Youb Kim <hykim@myri.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the standard layout for padding an ethernet frame with the
eth_skb_pad call.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the Intel Ethernet drivers to use eth_skb_pad() and skb_put_padto
instead of doing their own implementations of the function.
Also this cleans up two other spots where skb_pad was called but the length
and tail pointers were being manipulated directly instead of just having
the padding length added via __skb_put.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add ConnectX-4LX to the list of supported devices as well as their virtual
functions.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The outbox should be cleared before executing the command.
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Command queue descriptor page size is 4KB and not the page size used by the
kernel.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mlx5 requires at least one interrupt vector for completions so fix the minvec
argument to pci_enable_msix_range() accordingly.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call request module on mlx5_ib so it will be available for applications
requiring it, such as installers that require boot over IB.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cmac engine is the bridge between driver and dash firmware.
Other os may not disable cmac when leave. And r8169 did not allocate any
resources for cmac engine. Disable it to prevent abnormal system behavior.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For RTL8168G/GU/H/EP and RTL8411B remove enable tx/rx from its own hw_start
function. This will prevent enable tx/rx before complete hardware tx/rx
setting.
Tx/Rx will be enabled in the end of function rtl_hw_start_8168.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mvneta driver sets the amount of Tx coalesce packets to 16 by
default. Normally that does not cause any trouble since the driver
uses a much larger Tx ring size (532 packets). But some sockets
might run with very small buffers, much smaller than the equivalent
of 16 packets. This is what ping is doing for example, by setting
SNDBUF to 324 bytes rounded up to 2kB by the kernel.
The problem is that there is no documented method to force a specific
packet to emit an interrupt (eg: the last of the ring) nor is it
possible to make the NIC emit an interrupt after a given delay.
In this case, it causes trouble, because when ping sends packets over
its raw socket, the few first packets leave the system, and the first
15 packets will be emitted without an IRQ being generated, so without
the skbs being freed. And since the socket's buffer is small, there's
no way to reach that amount of packets, and the ping ends up with
"send: no buffer available" after sending 6 packets. Running with 3
instances of ping in parallel is enough to hide the problem, because
with 6 packets per instance, that's 18 packets total, which is enough
to grant a Tx interrupt before all are sent.
The original driver in the LSP kernel worked around this design flaw
by using a software timer to clean up the Tx descriptors. This timer
was slow and caused terrible network performance on some Tx-bound
workloads (such as routing) but was enough to make tools like ping
work correctly.
Instead here, we simply set the packet counts before interrupt to 1.
This ensures that each packet sent will produce an interrupt. NAPI
takes care of coalescing interrupts since the interrupt is disabled
once generated.
No measurable performance impact nor CPU usage were observed on small
nor large packets, including when saturating the link on Tx, and this
fixes tools like ping which rely on too small a send buffer. If one
wants to increase this value for certain workloads where it is safe
to do so, "ethtool -C $dev tx-frames" will override this default
setting.
This fix needs to be applied to stable kernels starting with 3.10.
Tested-By: Maggie Mae Roxas <maggie.mae.roxas@gmail.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify bcmgenet driver so that it can be used on Broadcom 7xxx
MIPS-based STB platforms without a device tree.
Signed-off-by: Petri Gynther <pgynther@google.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reduce stack use by using kmemdup and not using a very
large struct on stack.
In function ‘i40e_dbg_dump_desc’:
warning: the frame size of 8192 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Getting the pf_id from the function number was a good place to start,
but when the PF was setup in passthru mode, the PCI bus/device/function
was virtualized and the number in the VM is different from the number in
the bare metal. This caused HW configuration issues when the wrong pf_id
was used to set up the HMC and other structures. The PF_FUNC_RID register
has the real bus/device/function information as configured by the BIOS,
so use that for a better number. This works in NPAR mode as well.
Change-ID: I65e3dd6c97594890c2bad566b83cc670b1dae534
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Acked-by: Kevin Scott <kevin.c.scott@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ARQ needs to have at least as many entries as VFs, or the VFs will
get errors from the FW when they send messages to the PF. Since we don't
know how many VFs we'll end up with, just set up 128 descriptors.
Change-ID: I04ae3d1c7faf09110eb782214e9c05aeb62a6c59
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There is an order in which this should happen. It turns out that FW will
not let you change the Loopback setting of the VSI with update VSI prior
to the VEB creation.
Change-ID: I7614ddff8b4c37702930c02f16f8c346aaa64bd1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
All VSIs on a VEB should either have loopback enabled or disabled, a
mixed mode is not supported for a VEB. Since our driver supports multiple
VSIs per PF that need to talk to each other make sure to enable Loopback
for the PF and FDIR VSI as well.
Also, we now have to explicitly enable Loopback mode otherwise we fail
VSI creation for VMDq and VF VSIs.
Change-ID: Ib68c3ea4aeb730ac9468f930610de456efbe5b20
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Increase reset delay to ensure all internal caches are properly flushed
in worst case scenario.
Change-ID: I6f059a9e024fbf9ef1debd32497eed21369957fc
Signed-off-by: Kevin Scott <kevin.c.scott@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When multiple VFs attempt to initialize simultaneously, the firmware may
delay or drop messages. Make the init code more adept at handling these
situations by a) reinitializing the admin queue if the firmware fails to
process a request, and b) resending a request if the PF doesn't answer.
Once the request has been sent again, the PF might end up getting both
requests and send the configuration information to the driver twice.
This will cause the VF to complain about receiving an unexpected message
from the PF. Since this is not fatal, reduce the warning level of the
log messages that are generated in response to this event.
Change-ID: I9370a1a2fde2ad3934fa25ccfd0545edfbbb4805
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The old xxx_NETDEV_STAT() macro was defined long before the newer
rtnl_link_stats64 came into being, and just never got updated. Since we're
using rtnl_link_stats64 in other parts of the driver, we should use it
here as well. We've just been lucky that the field definitions are the
same sizes.
Change-ID: I19fc71619905700235dcdf0d3c8153aec81d36de
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is useful for future expansion when new VF MAC types get
added. It helps with cleaning up VF driver flow.
Change-ID: Ibe1eeb71262a3a40f24a1c5409436bdc3411da7f
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add the Virtual Channel OP event opcode for CONFIG_RSS, so that the
Virtual Channel state machine can properly decipher status change events.
Change-ID: I09939c7aa380147f60c49fd01ef2e27d0dc1c299
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Resolve an issue related to images with multiple PFs per physical
port. We cannot fully support 1588 PTP features, since only one port
should control (ie: write) the registers at a time. Doing so can cause
interference of functionality.
It may be possible to partially implement the API for only those
features without side effects. However, this at minimum means non
controlling PFs lose Tx timestamps, frequency atunement, and possibly
SYSTIME adjustment. There may be further impact I did not discover.
Since the API in the kernel expects these features to work, it is
simpler and less dangerous to just disable PTP features on all PFs not
identified as the controlling PF in PRTTSYN_CTL0.PF_ID.
This change also removes the warning printed when hwtstaml IOCTL is
called on the wrong PF. This is actually meaningless now, since only one
PF per port will support it. In addition, the ethtool get_ts_info IOCTL
was updated so that only the controlling port will even indicate support
(so as not to confuse users).
The overall downside is complete loss of functionality on non
controlling PF, vs the possible gain of partial support. The biggest
factor for choosing this approach is simplicity and ensuring that the
main PF will work. There could easily be other portions of the 1588
logic with side effects I am not aware, and the reduced functionality
that might be made available is significantly less useful. In addition,
the API does not allow for proper indication of why particular features
are not supported. These reasons are enough to decide for the simpler
approach to resolving this issue.
Change-ID: If4696bae686fc18aef6552b67dd417213d987c16
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds additional text description for base pf0 and flow director
generated interrupts. Without this patch, these interrupts are difficult
to distinguish per port on a multi-function device.
Change-ID: I4662e1b38840757765a3fe63d90219d28e76bfab
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use the 'i' rather than the more restrictive 'x' or 'd' in the aq_cmd
arguments. This makes the user interface much more forgiving and user
friendly.
Change-ID: I5dcd57b9befc047e06b74cf1152a25a3fa9e1309
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This message really doesn't give any useful information and ends up
getting printed every service_task loop in the Linux driver, filling the
logfile with noise when AQ tracing is enabled. This patch simply removes
the noise.
Change-ID: I30ad51e6b03c7ad12a7d9c102def0087db622df3
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This case statement is empty and the fall through just breaks out
so remove the break and let it fall through to break out.
Change-ID: I1b5ba9870d5245ca80bfca6e7f5f089e2eb8ccb0
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In sky2_change_mtu setting B0_IMSK to 0 may be delayed due to PCI write posting
which could result in irqs being still active when synchronize_irq is called.
Since we are not prepared to handle any further irqs after synchronize_irq
(our resources are freed after that) force the write by a consecutive read from
the same register.
Similar situation in sky2_all_down: Here we disabled irqs by a write to B0_IMSK
but did not ensure that this write took place before synchronize_irq. Fix that
too.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of a spurious interrupt dont forget to reenable the interrupts that
have been masked by reading the interrupt source register.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Acked-by: Mirko Lindner <mlindner@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In pxa168_eth_open() the irqs are enabled before napi. This opens a tiny time
window in which the irq handler is processed, disables irqs but then is not able
to schedule the not yet activated napi, leaving irqs disabled forever (since
irqs are reenabled in napi poll function).
Fix this race by activating napi before irqs are activated.
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pci_dev_put() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call
is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vfree() function performs also input parameter validation.
Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of using global variables we are going to use dynamically allocated
memory. It allows to append a support of more than one ethernet adapter which
might have different settings simultaniously.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch resolves couple of issues in ixgbevf_probe/remove():
1. Fix a case where adapter->state is tested after free_netdev() this is
same as the patch for ixgbe from Daniel Borkmann <dborkman@redhat.com>:
commit b5b2ffc057 ("ixgbe: fix use after free adapter->state test in ixgbe_remove/ixgbe_probe")
2. Move pci_set_drvdata() after all the error checks in ixgbevf_probe() and
then add a check in ixgbevf_probe() to avoid running the cleanup functions
twice in cases where probe failed.
CC: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds initial support for VFs on a new mac - X550.
The patch adds the basic structures and device IDs for the X550 VFs
that would allow the driver to load and pass traffic.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The driver has logic to free up used data in case any of the checks in
ixgbe_probe() fail, however there is a similar set of cleanups that can
occur on driver unload in ixgbe_remove() which can cause the rmmod command
to crash.
This patch aims to fix the logic by moving pci_set_drvdata() after all error
checks and then adds a check in ixgbe_remove() to skip it altogether if
adapter comes up empty.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since we now support X550 mac's bump the version number to reflect this.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch extends the function pointer structure to include the new
X550 class MAC types. This creates a new file ixgbe_x550.c that contains
all of the new methods. Because of similarities to the X540 part in
some cases we just use it's methods where they can be used without any
modification. These exported functions are now defined in the new
ixgbe_x540.h file.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently the shared code checksum calculation function only
returns a u16 and cannot return an error code. Unfortunately
a variety of errors can happen that completely prevent the
calculation of a checksum. So, change the function return value
from a u16 to an s32 and return a negative value on error, or the
positive checksum value when there is no error.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Some X550 procedures will be using CS4227 PHY and need to
perform combined read and write operations. This patch
adds those methods.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The X550 hardware will use more bits in the mask, so change
the prototypes to match. This larger mask will require changes
in callers which use the higher bits. Likewise since X550 will
use different semaphore mask values and will use the lan_id
value. So save these values in the ixgbe_phy_info struct.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since on X550 we use host interface commands to read,write and erase
some commands require more time to complete. So this adds a timeout
parameter to ixgbe_host_interface_command as wells as a return_data
parameter allowing us to return with any data.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The new X550 family of MAC's will have a larger RSS hash (16 -> 64).
It will also support individual VF to have their own independent RSS
hash key. This patch will enable this functionality
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Accessing the CIAA/D register can block access to the PCI config space.
This patch removes the read/write operations to the CIAA/D registers
and makes use of standard kernel functions for accessing the PCI config
space.
In addition it moves ixgbevf_check_for_bad_vf() into the watchdog subtask
which reduces the frequency of the checks.
CC: Alex Williamson <alex.williamson@redhat.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Attempt to look up the MAC address in Open Firmware on systems that
support it. On SPARC resort to using the IDPROM if no OF address is
found.
Signed-off-by: Martin K Petersen <martin.petersen@oracle.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change cleans up the tail writes for the ixgbe descriptor queues. The
current implementation had me confused as I wasn't sure if it was still
making use of the surprise remove logic or not.
It also adds the mmiowb which is needed on ia64, mips, and a couple other
architectures in order to synchronize the MMIO writes with the Tx queue
_xmit_lock spinlock.
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up the page reuse code getting it into a state where all
the workarounds needed are in place as well as cleaning up a few minor
oversights such as using __free_pages instead of put_page to drop a locally
allocated page.
It also cleans up how we clear the descriptor status bits. Previously they
were zeroed as a part of clearing the hdr_addr. However the hdr_addr is a
64 bit field and 64 bit writes can be a bit more expensive on on 32 bit
systems. Since we are no longer using the header split feature the upper
32 bits of the address no longer need to be cleared. As a result we can
just clear the status bits and leave the length and VLAN fields as-is which
should provide more information in debugging.
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After commit b2b49ccbdd (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so #ifdef blocks
depending on CONFIG_PM_RUNTIME within #ifdef blocks depending on
CONFIG_PM may be dropped now.
Do that in the e1000e and igb network drivers.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Wake On Lan was not working on laptop DELL Vostro 1500.
If WOL was turned on, BCM4401 was powered up in suspend mode. LEDs blinked.
But the laptop could not be woken up with the Magic Packet. The reason for
that was that PCIE was not enabled as a system wakeup source and
therefore the host PCI bridge was not powered up in suspend mode.
PCIE was not enabled in suspend by PM because no child devices were
registered as wakeup source during suspend process.
On laptop BCM4401 is connected through the SSB bus, that is connected to the
PCI-Express bus. SSB and B44 did not use standard PM wakeup functions
and did not forward wakeup settings to their parents.
To fix that B44 driver enables PM wakeup and registers new wakeup source
using device_set_wakeup_enable(). Wakeup is automatically reported to the parent SSB
bus via power.wakeup_path. SSB bus enables wakeup for the parent PCI bridge, if there is any
child devices with enabled wakeup functionality. All other steps are
done by PM core code.
Signed-off-by: Andrey Skvortsov <Andrej.Skvortzov@gmail.com>
Signed-off-by: Michael Buesch <m@bues.ch>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Silences various sparse warnings
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rocker ports will use new "swdev" hwmode for bridge port offload policy.
Current supported policy settings are BR_LEARNING and BR_LEARNING_SYNC.
User can turn on/off device port FDB learning and syncing to bridge.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add L2 bridge offloading support to rocker driver. Here, the Linux bridge
driver is used to collect swdev ports into a tagged (or untagged) VLAN
bridge. The switchdev will offload from the bridge driver the following L2
bridging functions:
- Learning of neighbor MAC addresses on VLAN X Learned mac/vlan is
installed in bridge FDB. (And removed when device unlearns mac/vlan).
Learning must be turned off on each bridge port to disable the feature in
the bridge driver.
- Flooding of multicast/broadcast and unknown unicast pkts to (STP)
active ports in bridge. The bridge driver is unaware of the flooding happening
at the device level. Flooding must be turned off on each bridge port to
disable the feature on the bridge driver.
- STP port state is pushed down to driver/device. The bridge still processes
STP BDPUs and maintains port STP state (for all VLANs in bridge), but
the driver/device must be notified of port STP state change to program
the device.
Multiple (VLAN) bridges are supported. The device (implemented per
the OF-DPA spec) must use a portion of the VLAN namespace for
internal VLANs. Right now, the upper 255 VLANs (0xf00 to 0xffe) are
used as internal VLAN IDs for untagged traffic and are not available
as port VLANs.
The driver uses the following interfaces:
1. To track VLAN add/del on ports in bridge:
.ndo_vlan_rx_add_vid
.ndo_vlan_rx_kill_vid
2. To track port add/del membership in bridge:
NETDEV_CHANGEUPPER netdevice notifier
3. To catch static FDB entries installed on bridge/vlan by user using netlink:
.ndo_fdb_add
.ndo_fdb_del
4. To be notified on port STP state change:
.ndo_switch_port_stp_update
5. To notify bridge driver on learned/forgotten mac/vlans on bridge port:
br_fdb_external_learn_add
br_fdb_external_learn_del
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rocker driver maintains 4 hash tables: flows, groups, FDB, and VLANs.
Flow and group tables track the entries installed to OF-DPA tables,
per the OF-DPA spec. See OF-DPA spec for full description of fields
in each flow and group table. New table entries are pushed to the
device with ADD cmd. Updated entries are pushed to the device with
MOD cmd. For flow table entries, a crc32 key is made from fields of
the particular field. For group table entries, the group_id is used
as the key.
The FDB table tracks fdb entries learned by the device or manually
pushed to the bridge by the user. A crc32 key is made from the
port/mac/vlan tuple for the fdb entry.
The VLAN table tracks the ifindex-to-internal-vlan mapping for
untagged pkts. On ingress, an untagged pkt is inserted with an
internal VLAN ID based on the input port's current internal VLAN ID.
The input port's internal VLAN will either be referenced by the port's
ifindex, if not bridged, or the containing bridge's ifindex, if
bridged. Since the ifindex space isn't within a fixed range, uses a
hash table (with ifindex as key) to track internal VLAN ID for a given
ifindex. The internal VLAN ID range is fixed and currently uses the
upper 255 VLAN IDs, starting at 0xf00.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces the first driver to benefit from the switchdev
infrastructure and to implement newly introduced switch ndos. This is a
driver for emulated switch chip implemented in qemu:
https://github.com/sfeldma/qemu-rocker/
This patch is a result of joint work with Scott Feldman.
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allow brport device to return current brport flags set on port. Add
returned flags to nested IFLA_PROTINFO netlink msg built in dflt getlink.
With this change, netlink msg returned for bridge_getlink contains the port's
offloaded flag settings (the port's SELF settings).
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So this can be reused for identification of other "items" as well.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Thomas Graf <tgraf@suug.ch>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do the work of parsing NDA_VLAN directly in rtnetlink code, pass simple
u16 vid to drivers from there.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Original code only check/alloc plat_dat for the CONFIG_OF case, this
patch check/alloc it earlier and unconditionally to avoid kernel build
warnings:
drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:275
stmmac_pltfr_probe() warn: variable dereferenced before check 'plat_dat'
V2: Fix coding style.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current driver, allocation size of skb does not care the alignment
adjust after allocation.
And also, in the current implementation, buffer alignment method by
sh_eth_set_receive_align function has a bug that this function displace
buffer start address forcedly when the alignment is corrected.
In the result, tail of the skb will exceed allocated area and kernel panic
will be occurred.
This patch fix this issue.
Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ndo_bridge_setlink() is currently only called on the slave if
IFLA_AF_SPEC is set but this is a very fragile assumption and may
change in the future.
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Payload is currently accessed blindly and may exceed valid message
boundaries.
Fixes: a77dcb8c8 ("be2net: set and query VEB/VEPA mode of the PF interface")
Fixes: 815cccbf1 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If sky2->tx_le = pci_alloc_consistent() or sky2->tx_ring = kcalloc() in
sky2_alloc_buffers() fails, sky2->rx_ring = kcalloc() will never be called.
In this error case handling, sky2_rx_clean() is called from within
sky2_free_buffers().
In sky2_rx_clean() we find the following:
...
memset(sky2->rx_le, 0, RX_LE_BYTES);
...
This results in a memset using a NULL pointer and will crash the system.
Signed-off-by: Mirko Lindner <mlindner@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hook a nway_reset ethtool callback to allow restarting the
auto-negotiation process when asked to. We defer to the PHY library call
to do the heavy lifting.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow enabling and disabling EEE using the designated ethtool getters
and setters. GENET allows controlling EEE at the UniMAC, RBUF and TBUF
levels. We also take care of restoring EEE after a suspend/resume cycle
if it was enabled prior to suspending.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add register definitions to control EEE in the UniMAC, RBUF and TBUF
register ranges.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 3b57de958e brought the support for a different amount of
the filter bins, but didn't update the platform driver that without
CONFIG_OF.
Fixes: 3b57de958e (net: stmmac: Support devicetree configs for mcast
and ucast filter entries)
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some VF drivers use the upper byte of "param1" (the qp count field)
in mlx4_qp_reserve_range() to pass flags which are used to optimize
the range allocation.
Under the current code, if any of these flags are set, the 32-bit
count field yields a count greater than 2^24, which is out of range,
and this VF fails.
As these flags represent a "best-effort" allocation hint anyway, they may
safely be ignored. Therefore, the PF driver may simply mask out the bits.
Fixes: c82e9aa0a8 "mlx4_core: resource tracking for HCA resources used by guests"
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The precise selection is useless, so we simply remove these dependencies.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
If TX channels are set to 4 and RX channels are set to less than 4,
using ethtool -L, the driver will try to initialize more RX channels
than it has allocated, causing an oops.
This fix only initializes the RX ring if it has been allocated.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new file t4_pci_id_tbl.h that contains T4/T5 PCI ID Table so that for all
drivers that uses T4/T5 PCI functions changes can be done in one place.
checkpatch.pl script reports following error, which if tried to fix ends up in
compilation error.
ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END \
+ { 0, } \
+ }
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
new file mode 100644
ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_ID_TABLE_FENTRY(devid) \
+ CH_PCI_ID_TABLE_ENTRY((devid) | \
+ ((CH_PCI_DEVICE_ID_FUNCTION) << 8)), \
+ CH_PCI_ID_TABLE_ENTRY((devid) | \
+ ((CH_PCI_DEVICE_ID_FUNCTION2) << 8))
ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END { 0, } }
ERROR: Macros with complex values should be enclosed in parentheses
+#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END { 0, } }
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add minimal runtime PM support (enable on probe, disable on remove), to
ensure proper operation with a parent device that uses runtime PM.
This is needed on systems where the external bus controller module of
the SoC is contained in a PM domain and/or has a gateable functional
clock. In such cases, before accessing any device connected to the
external bus, the PM domain must be powered up, and/or the functional
clock must be enabled, which is typically handled through runtime PM by
the bus controller driver.
An example of this is the kzm9g development board, where an smsc9220
Ethernet controller is connected to the Bus State Controller (BSC) of a
Renesas sh73a0 SoC.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed out by Ben Hutchings drivers that allow using VLAN have to
provide enough headroom for the VLAN tags.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
i.MX6SX fec support three rx ring1, the current driver lost to init
ring1 and ring2 maximum receive buffer size, that cause receving
frame date length error. The driver reports "rcv is not +last" error
log in user case.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key might increase attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christian Benvenuti <benve@cisco.com>
Cc: Govindarajulu Varadarajan <_govind@gmx.com>
Cc: Sujith Sankar <ssujith@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All the access to wq has been moved out of hardirq context. We no longer need to
use spin_lock_irqsave.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
enic_isr_legacy(), enic_isr_msix() & enic_isr_msi() run from hard
interrupt context.
They can use napi_schedule_irqoff() instead of napi_schedule()
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds some checks in order to prevent panic's on surprise
removal of devices during S0, S3, S4. Without this patch, Thunderbolt
type device removal will panic the system.
Signed-off-by: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While working on a different issue, I noticed an annoying use
after free bug on my machine when unloading the ixgbe driver:
[ 8642.318797] ixgbe 0000:02:00.1: removed PHC on p2p2
[ 8642.742716] ixgbe 0000:02:00.1: complete
[ 8642.743784] BUG: unable to handle kernel paging request at ffff8807d3740a90
[ 8642.744828] IP: [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[ 8642.745886] PGD 20c6067 PUD 81c1f6067 PMD 81c15a067 PTE 80000007d3740060
[ 8642.746956] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[ 8642.748039] Modules linked in: [...]
[ 8642.752929] CPU: 1 PID: 1225 Comm: rmmod Not tainted 3.18.0-rc2+ #49
[ 8642.754203] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 1.1b 11/01/2013
[ 8642.755505] task: ffff8807e34d3fe0 ti: ffff8807b7204000 task.ti: ffff8807b7204000
[ 8642.756831] RIP: 0010:[<ffffffffa01c77dc>] [<ffffffffa01c77dc>] ixgbe_remove+0xfc/0x1b0 [ixgbe]
[...]
[ 8642.774335] Stack:
[ 8642.775805] ffff8807ee824098 ffff8807ee824098 ffffffffa01f3000 ffff8807ee824000
[ 8642.777326] ffff8807b7207e18 ffffffff8137720f ffff8807ee824098 ffff8807ee824098
[ 8642.778848] ffffffffa01f3068 ffff8807ee8240f8 ffff8807b7207e38 ffffffff8144180f
[ 8642.780365] Call Trace:
[ 8642.781869] [<ffffffff8137720f>] pci_device_remove+0x3f/0xc0
[ 8642.783395] [<ffffffff8144180f>] __device_release_driver+0x7f/0xf0
[ 8642.784876] [<ffffffff814421f8>] driver_detach+0xb8/0xc0
[ 8642.786352] [<ffffffff814414a9>] bus_remove_driver+0x59/0xe0
[ 8642.787783] [<ffffffff814429d0>] driver_unregister+0x30/0x70
[ 8642.789202] [<ffffffff81375c65>] pci_unregister_driver+0x25/0xa0
[ 8642.790657] [<ffffffffa01eb38e>] ixgbe_exit_module+0x1c/0xc8e [ixgbe]
[ 8642.792064] [<ffffffff810f93a2>] SyS_delete_module+0x132/0x1c0
[ 8642.793450] [<ffffffff81012c61>] ? do_notify_resume+0x61/0xa0
[ 8642.794837] [<ffffffff816d2029>] system_call_fastpath+0x12/0x17
The issue is that test_and_set_bit() done on adapter->state is being
performed *after* the netdevice has been freed via free_netdev().
When netdev is being allocated on initialization time, it allocates
a private area, here struct ixgbe_adapter, that resides after the
net_device structure. In ixgbe_probe(), the device init routine,
we set up the adapter after alloc_etherdev_mq() on the private area
and add a reference for the pci_dev as well via pci_set_drvdata().
Both in the error path of ixgbe_probe(), but also on module unload
when ixgbe_remove() is being called, commit 41c62843eb ("ixgbe:
Fix rcu warnings induced by LER") accesses adapter after free_netdev().
The patch stores the result in a bool and thus fixes above oops on my
side.
Fixes: 41c62843eb ("ixgbe: Fix rcu warnings induced by LER")
Cc: stable <stable@vger.kernel.org>
Cc: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IXGBE adapter seems to require that VLAN filtering be enabled if
VMDQ or SRIOV are enabled. When those functions are disabled,
VLAN filtering may be disabled in promiscuous mode.
Prior to commit a9b8943ee1 ("ixgbe: remove vlan_filter_disable
and enable functions")
The logic was correct. However, after the commit the logic
got reversed and VLAN filtered in now turned on when VMDQ/SRIOV
is disabled.
This patch changes the condition to enable hw vlan filtered
when VMDQ or SRIOV is enabled.
Fixes: a9b8943ee1 ("ixgbe: remove vlan_filter_disable and enable functions")
Cc: stable <stable@vger.kernel.org>
CC: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanups all PCIE, RSS & FW related macros/register defines that are
defined in t4fw_api.h and the affected files.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanups all port and VI related macros/register defines that are
defined in t4fw_api.h and the affected files.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanups all queue related macros/register defines that are defined
in t4fw_api.h and the affected files.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanups PF/VF and LDST related macros/register defines that are
defined in t4fw_api.h and the affected files.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch cleanups all filter related macros/register defines that are defined
in t4fw_api.h and the affected files.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original FDB code submission wasn't correct and the code
wasn't enabled. This removes some dead code (can use the common kernel
code for fdb_del and fdb_dump) and correctly enables the fdb_add
function pointer.
The fdb_add functionality is important to i40e because it is needed
for a workaround to allow bridges to work correctly on the i40e
hardware.
Reported-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ieee802154/fakehard.c
A bug fix went into 'net' for ieee802154/fakehard.c, which is removed
in 'net-next'.
Add build fix into the merge from Stephen Rothwell in openvswitch, the
logging macros take a new initial 'log' argument, a new call was added
in 'net' so when we merge that in here we have to explicitly add the
new 'log' arg to it else the build fails.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-11-20
This series contains updates to ixgbevf, i40e and i40evf.
Emil updates ixgbevf with much of the work that Alex Duyck did while at
Intel. First updates the driver to clear the status bits on allocation
instead of in the cleanup routine, this way we can leave the recieve
descriptor rings as a read only memory block until we actually have
buffers to give back to the hardware. Clean up ixgbevf_clean_rx_irq()
by creating ixgbevf_process_skb_field() to merge several similar
operations into this new function. Cleanup temporary variables within
the receive hot-path and reducing the scope of variables that do not
need to exist outside the main loop. Save on stack space by just
storing our updated values back in next_to_clean instead of using
a stack variable, which also collapses the size the function. Improve
performace on IOMMU enabled systems and reduce cache misses by changing
the basic receive patch for ixgbevf so that instead of receiving the
data into an skb, it is received into a double buffered page. Add
netpoll support by creating ixgbevf_netpoll(), which is a callback for
.ndo_poll_controller to allow for the VF interface to be used with
netconsole.
Mitch provides several cleanups and trivial fixes for i40e and i40evf.
First is a fix the overloading of the msg_size field in the
arq_event_info struct by splitting the field into two and renaming to
indicate the actual function of each field. Updates code comments
to match the actual function. Cleanup several checkpatch.pl warnings
by adding or removing blank lines, aligning function parameters, and
correcting over-long lines (which makes the code more readable).
Shannon provides a patch for i40e to write the extra bits that will
turn off the ITR wait for the interrupt, since we want the SW INT to
go off as soon as possible.
v2: updated patch 07 based on feedback from Alex Duyck by
- adding pfmemalloc check to a new function for reusable page
- moved atomic_inc outside of #if/else in ixgbevf_add_rx_frag()
- reverted the removal of the API check in ixgbevf_change_mtu()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support to delay telling the hardware about data that is ready to
be transmitted if the skb->xmit_more flag is set.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current form of Tx coalescing works on a descriptor basis instead
of on a packet basis and doesn't take into account TSO packets. Update
the Tx coalescing support to work on a packet basis, taking into
account the number of packets associated with a TSO transmit. Also,
only activate the Tx timer if a timer value is set.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tso_header variable in the xgbe_tx_ring_data structure is not used,
remove it.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call the appropriate BQL functions to track the number of bytes queued
during Tx processing and to track the number of packets and bytes
that have been transmitted during Tx complete processing.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the Tx and Rx related fields within the xgbe_ring_data struct into
their own structs in order to more easily see what fields are used for
each operation.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Smatch tool indicated that one of the if statements in xgbe-dev.c
could be rewritten to remove a redundant check for the 'err' variable
in an if statement.
Change the statement as suggested and add a comment to help clarify.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the Tx engine is told to stop while it is actively processing Tx
descriptors it is possible that the Tx descriptor(s) will not be closed
out properly. When the Tx engine is restarted this could result in the
driver being stuck on the improperly closed descriptor.
Update the driver to wait for the Tx engine to be in a stopped or
suspended state before issuing the stop command.
This has not been an issue to date, but it's a good safe-guard to have.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a read memory barrier to the Tx and Rx paths where the ownership
bit is checked to be sure that all descriptor fields are read after
having read the ownership bit for the descriptor.
This has not been an issue to date, but it's a good safe-guard to have.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The functions kfree() and of_node_put() test whether their argument is NULL
and then return immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The of_dev_put() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix ethtool set settings to not check AUTONEG_ENABLE
mlx4_en_set_settings should not check if cmd->autoneg == AUTONEG_ENABLE,
cmd->autoneg can be enabled by default and this check will fail other settings requests.
mlx4_en driver doesn't support changing autoneg value, but shouldn't fail the request
in case cmd->autoneg was set.
Fixes: d48b3ab ("net/mlx4_en: Use PTYS register to set ethtool settings (Speed)")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To help troubleshoot heavy memory pressure conditions, add a bunch of
statistics counter to log RX buffer allocation and RX/TX DMA mapping
failures. These are reported like any other counters through the ethtool
stats interface.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To help troubleshoot heavy memory pressure conditions, add a bunch of
statistics counter to log RX buffer allocation and RX/TX DMA mapping
failures. These are reported like any other counters through the ethtool
stats interface.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Name fits better. Plus there's going to be introduced
__vlan_insert_tag later on.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch to a random RSS key rather than a fixed one.
Using netdev_rss_key_fill helper also ensures that all ports share
a common key.
See also commit 960fb622f8.
Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check and update posted_index only when skb->xmit_more is 0 or tx queue is full.
v2:
use txq_map instead of skb_get_queue_mapping(skb)
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peer priority groups were being reversed, but this was missed in the previous
fix sent out for this issue.
v2 : Previous patch was doing extra unnecessary work, result is the same.
Please ignore previous patch
Fixes : ee7bc3cdc2 ('cxgb4 : dcb open-lldp interop fixes')
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we want the SW INT to go off as soon as possible, write the
extra bits that will turn off the ITR wait for the interrupt.
Change-ID: I6d5382ba60840fa32abb7dea17c839eb4b5f68f7
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the if part of this statement contains a break, there's no reason
for the else. Clean up the code and make it more obvious that the delay
happens each time through the loop.
Change-ID: I9292eaf7dd687688bdc401b8bd8d1d14f6944460
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Most of the null-checking in this driver is of the style if (!foo),
except these few. Make these checks consistent with the rest of the
code.
Change-ID: I991924f34072fa607a1b626a8b3f1fa5195d43e9
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch is the result of running checkpatch on the i40evf driver with
the --strict option. The vast majority of changes are adding/removing
blank lines, aligning function parameters, and correcting over-long
lines.
The only possible functional change is changing the flags member of the
adapter structure to be non-volatile. However, according to the kernel
documentation, this is not necessary and the volatile should be removed.
Change-ID: Ie8c6414800924f529bef831e8845292b970fe2ed
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
No code changes. Update comments to match actual function declarations.
Change-ID: Ib830d2f154ee917a104955c0914267fc98f3d2c8
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Overloading the msg_size field in the arq_event_info struct is just a
bad idea. It leads to repeated bugs when the structure is used in a
loop, since the input value (buffer size) is overwritten by the output
value (actual message length).
Fix this by splitting the field into two and renaming to indicate the
actual function of each field.
Since the arq_event struct has now changed, we need to change the drivers
to support this. Note that we no longer need to initialize the buffer size
each time we go through a loop as this value is no longer destroyed by
arq processing.
In the process, we also fix a bug in i40evf_verify_api_ver where the
buffer size was not correctly reinitialized each time through the loop.
Change-ID: Ic7f9633cdd6f871f93e698dfb095e29c696f5581
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Acked-by: Ashish Shah <ashish.n.shah@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds ixgbevf_netpoll() a callback for .ndo_poll_controller to
allow for the VF interface to be used with netconsole.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
total_rx_packets is the number of packets we had cleaned, and budget is
the total number of packets that we could clean per poll. Instead of
altering both of these values we can save ourselves one write to memory by
just comparing total_rx_packets to the budget and as long as we are less
than budget we continue cleaning.
Also change the do{}while logic to while{} in order to avoid processing
packets when budget is 0.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes the basic receive path for ixgbevf so that instead of
receiving the data into an skb it is received into a double buffered page.
The main change is that the receives will be done in pages only and then
pull the header out of the page and copy it into the sk_buff data.
This has the advantages of reduced cache misses and improved performance on
IOMMU enabled systems.
v2:
- added pfmemalloc check to a new function for reusable page
- moved atomic_inc outside of #if/else in ixgbevf_add_rx_frag()
- reverted the removal of the api check in ixgbevf_change_mtu()
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Since the next_to_clean value is only accessed by the Rx interrupt handler
we can save on stack space by just storing our updated values back in
next_to_clean instead of using the stack variable i. This should help to
reduce stack space and we can further collapse the size of the function.
Also removed non_eop_descs counter as it was never shown in the stats.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change allows us to go from a loop based on the descriptor to one
primarily based on the budget. The advantage to this is that we can avoid
carrying too many values from one iteration to the next.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to help cleanup the usage of temporary variables
within the Rx hot-path by removing unnecessary variables and reducing
the scope of variables that do not need to exist outside the main loop.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch cleans up ixgbevf_clean_rx_irq() by merging several similar
operations into a new function - ixgbevf_process_skb_fields().
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of keeping a local copy of the status bits from the descriptor
we can just read them directly - this is accomplished with the addition
of ixgbevf_test_staterr().
In addition instead of doing a byteswap on the status bits value, we
can byteswap the constant values we are testing since that can be done
at compile time which should help to improve performance on big-endian
systems.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of clearing the status bits in the cleanup it makes more sense to
just clear the status bits on allocation. This way we can leave the Rx
descriptor rings as a read only memory block until we actually have buffers
to give back to the hardware.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is currently missing, which results in a crash when one attempts
to set VXLAN tunnel over the mlx4_en when acting as PF.
[ 2408.785472] BUG: unable to handle kernel NULL pointer dereference at (null)
[...]
[ 2408.994104] Call Trace:
[ 2408.996584] [<ffffffffa021f7f5>] ? vxlan_get_rx_port+0xd6/0x103 [vxlan]
[ 2409.003316] [<ffffffffa021f71f>] ? vxlan_lowerdev_event+0xf2/0xf2 [vxlan]
[ 2409.010225] [<ffffffffa0630358>] mlx4_en_start_port+0x862/0x96a [mlx4_en]
[ 2409.017132] [<ffffffffa063070f>] mlx4_en_open+0x17f/0x1b8 [mlx4_en]
While here, make sure to invoke vxlan_get_rx_port() only when VXLAN
offloads are actually enabled and not when they are only supported.
Reported-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When run ./scripts/kernel-doc several warnings are reported
so this patch fix them.
Also it reviews many comments and adds new ones.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds some useful comments inside the common header
file to provide information about the APIs exposed by the driver.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of swap_buffer() is not used by any caller, thus
remove it.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate the DIV_ROUND_UP() and change the loop counter increment to
4 instead. This results in saving 6 instructions in the functions
assembly code.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
when swap_buffer() is being called, we know for sure, that we need to
byte swap the data. Furthermore, this function is called for swapping
data in both directions. Thus cpu_to_be32() is semantically not
correct for all use cases. Use swab32s() to reflect this.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
fep->bufdesc_ex is treated as a boolean value, thus declare it as
such.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to DCBX configuration change if the VSI needs to use more than 1 TC;
it needs to disable the XPS maps that were set when operating in 1 TC mode.
Without disabling XPS the netdev layer will select queues based on those
settings and not use the TC queue mapping to make the queue selection.
This patch allows the driver to enable/disable the XPS based on the number
of TCs being enabled for the given VSI.
Change-ID: Idc4dec47a672d2a509f6d7fe11ed1ee65b4f0e08
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When PFC is enabled we should not proceed with setting the link flow control
parameters. Also, always report the link flow Tx/Rx settings as off when
PFC is enabled.
Change-ID: Ib09ec58afdf0b2e587ac9d8851a5c80ad58206c4
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
FCoE VSI Tx queue disable times out when reconfiguring as a result of
DCB TC configuration change event.
The hardware allows us to skip disabling and enabling of Tx queues for
VSIs with single TC enabled. As FCoE VSI is configured to have only
single TC we skip it from disable/enable flow.
Change-ID: Ia73ff3df8785ba2aa3db91e6f2c9005e61ebaec2
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When DCB TC configuration changes the firmware suspends the port's Tx.
Now, as DCB TCs may have changed the PF driver tries to reconfigure the
TC configuration of the VSIs it manages. As part of this process it disables
the VSI queues but the Tx queue disable will not complete as the port's
Tx has been suspended. So, waiting for Tx queues to go to disable state
in this flow may lead to detection of Tx queue disable timeout errors.
Hence, this patch adds a new PF state so that if a port's Tx is in
suspended state the Tx queue disable flow would just put the request for
the queue to be disabled and return without waiting for the queue to be
actually disabled.
Once the VSI(s) TC reconfiguration has been done and driver has called
firmware AQC "Resume PF Traffic" the driver checks the Tx queues requested
to be disabled are actually disabled before re-enabling them again.
Change-ID: If3e03ce4813a4e342dbd5a1eb1d2861e952b7544
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the port TC configuration changes as a result of DCBx the driver
modifies the enabled TCs for the VEBs it manages. But, in the process
it did not update the enabled_tc value that it caches on a per VEB basis.
So, when the next reconfiguration event occurs where the number of TC
value is same as the value cached in enabled_tc for a given VEB; driver
does not modify it's TC configuration by calling appropriate AQ command
believing it is running with the same configuration as requested.
Now, as the VEB is not actually enabled for the TCs that are there any
TC configuration command for VSI attached to that VEB with TCs that are
not enabled for the VEB fails.
This patch fixes this issue.
Change-ID: Ife5694469b05494228e0d850429ea1734738cf29
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a check whether LLDP Agent's default AdminStatus is
enabled or disabled on a given port. If it is disabled then it sets
the DCBX status to disabled as well; and would not query firmware for
any DCBX configuration data.
Change-ID: I73c0b9f0adbf4cae177d14914b20a48c9a8f50fd
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch allows i40e driver to query and use DCB configuration from
firmware when firmware DCBX agent is in CEE mode.
Change-ID: I30f92a67eb890f0f024f35339696e6e83d49a274
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When there are DCB configuration changes based on DCBX the firmware suspends
the port's Tx and generates an event to the PF. The PF is then responsible
to reconfigure the PF VSIs and switching topology as per the updated DCB
configuration and then resume the port's Tx by calling the "Resume Port Tx"
AQ command.
This patch adds this call to the flow that handles DCB re-configuration in
the PF.
Change-ID: I5b860ad48abfbf379b003143c4d3453e2ed5cc1c
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bumping minor version as this will be the second SW release and it
should be 1.
Change-ID: If0bd102095d2f059ae0c9b7f4ad625535ffbbdee
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VF interrupt processing takes a looooong time, and it's possible that we
could lose a VFLR event if it happens while we're processing a VFLR on
another VF. This would leave the VF in a semi-permanent reset state,
which would not be cleared until yet another VF experiences a VFLR.
To correct this situation, we enable the VFLR interrupt cause before we
begin processing any pending resets. This means that any VFLR that
occurs during reset processing will generate another interrupt and this
routine will get called again.
This change may cause a spurious interrupt when multiple VFLRs occur
very close together in time. If this happens, then this routine will be
called again and it will detect no outstanding VFLR events and do
nothing. No harm, no foul.
Change-ID: Id0451f3e6e73a2cf6db1668296c71e129b59dc19
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Only warn once that PTP is not supported when linked at 100Mbit.
Yes, using a static this way means that this once-only message is not
port specific, but once only for the life of the driver, regardless of
the number of ports. That should be plenty.
Change-ID: Ie6476530056df408452e195ef06afd4f57caa4b2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Also provide ethtool -x support to fetch RSS key
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Rename rss_hkey local variable to rss_key to have consistent name among
drivers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Prashant Sreedharan <prashant@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lendacky, Thomas <Thomas.Lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TX_IN_SEL offset for the CPSW_PORT/TX_IN_CTL register was
incorrect. This caused the Dual MAC mode to never get set when
it should. It also caused possible unintentional setting of a
bit in the CPSW_PORT/TX_BLKS_REM register.
The purpose of setting the Dual MAC mode for this register is to:
"... allow packets from both ethernet ports to be written into
the FIFO without one port starving the other port."
- AM335x ARM TRM
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Sathya Perla <sperla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/chelsio/cxgb4vf/sge.c
drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.
ixgbe_phy.c was a set of overlapping whitespace changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
We now allow up to 126 VFs. Note though that certain firmware
versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs.
In these cases, we limit the maximum number of VFs to 64.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, the driver queried the firmware in order to get the number
of supported EQs. Under SRIOV, since this was done before the driver
notified the firmware how many VFs it actually needs, the firmware had
to take into account a worst case scenario and always allocated four EQs
per VF, where one was used for events while the others were used for completions.
Now, when the firmware supports the asymmetric allocation scheme, denoted
by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the
QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we
can get more EQs and MSI-X vectors per function.
Moreover, when running in the new firmware/driver mode, the limitation
that the number of EQs should be a power of two is lifted.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
QUERY_FUNC firmware command could be used in order to query the
number of EQs, reserved EQs, etc for a specific function.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor mlx4_load_one, as a preparation step for a new and
more complicated load function. The goal is to support both
newer firmware that required init_hca to be done before
enable_sriov and legacy firmwares that requires things to
be done the other way around.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactoring mlx4_cmd_init and mlx4_cmd_cleanup such that partial init
and cleanup are possible. After this refactoring, calling mlx4_cmd_init
several times is safe.
This is necessary in the VF init flow when mlx4_init_hca returns -EACCESS,
we need to issue cleanup and re-attempt to call it with the slave flag.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We've used an incorrect type for the loop counter and the
mlx4_QUERY_FUNC_CAP function. The current input modifier
is either a port or a boolean.
Since the number of ports is always a positive value < 255,
we should use u8 instead of an integer with casting.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We mistakenly read the reserved_eqs field as a standard
numeric value rather than a log2 value.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit be9dad1f9f ("net: phy: suspend phydev when going
to HALTED"), the PHY device will be put in a low-power mode using
BMCR_PDOWN if the the interface is set down. The smsc911x driver does
a software_reset opening the device driver (ndo_open). In such case,
the PHY must be powered-up before access to any register and before
calling the software_reset function. Otherwise, as the PHY is powered
down the software reset fails and the interface can not be enabled
again.
This patch fixes this scenario that is easy to reproduce setting down
the network interface and setting up again.
$ ifconfig eth0 down
$ ifconfig eth0 up
ifconfig: SIOCSIFFLAGS: Input/output error
Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device tree probing for R-Car M2N (r8a7793) is added.
Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using RMMI mode, it is necessary to change in probe.
Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a missing break statement so we set everything to
PKT_HASH_TYPE_L3 even when we intended to use PKT_HASH_TYPE_L4.
Fixes: 5b9dfe299e ('amd-xgbe: Provide support for receive side scaling')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increased delay in the smsc911x_phy_disable_energy_detect (from 1ms to 2ms).
Dropped delays in the smsc911x_phy_enable_energy_detect (100ms and 1ms).
The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).
I saw problems with soft reset due to wrong udelay timings.
After I fixed udelay, I measured the time needed to bring integrated PHY
from power-down to operational mode (the time beetween clearing EDPWRDOWN
bit and soft reset complete event). I got 1ms (measured using ktime_get).
The value is equal to the current value (1ms) used in the
smsc911x_phy_disable_energy_detect. It is near the upper bound and in order
to avoid rare soft reset faults it is doubled (2ms).
I don't know official timing for bringing up integrated PHY as specs doesn't
clarify this (or may be I didn't found).
It looks safe to drop delays before and after setting EDPWRDOWN bit
(enable PHY power-down mode). I didn't saw any regressions with the patch.
The patch was reviewed by Steve Glendinning and Microchip Team.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).
It is possible that PHY could enter power-down mode (ENERGYON clear),
between ENERGYON bit check in smsc911x_phy_disable_energy_detect and SRST
bit set in smsc911x_soft_reset. This could happen, for example, if someone
disconnect ethernet cable between the checks. The PHY in a power-down mode
would prevent the MAC portion of chip to be software reseted.
Initially found by code review, confirmed later using test case.
This is low probability issue, and in order to reproduce it you have to
run the script:
while true; do
ifconfig eth0 down
ifconfig eth0 up || break
done
While the script is running you have to plug/unplug ethernet cable many
times (using gpio controlled ethernet switch, for example) until get:
[ 4516.477783] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.512207] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.524658] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.559082] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.571990] ADDRCONF(NETDEV_UP): eth0: link is not ready
ifconfig: SIOCSIFFLAGS: Input/output error
The patch was reviewed by Steve Glendinning and Microchip Team.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactored all macros used in cxgb4i as part of previously started cxgb4 macro
names cleanup. Makes them more uniform and avoids namespace collision.
Minor changes in other drivers where required as some of these macros are used
by multiple drivers, affected drivers are iw_cxgb4, cxgb4(vf) & csiostor
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit d75b1ade56 ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. bcm_sysport_tx_poll()
always returns 0 whether or not we completed the poll quantum.
Fix this by returning either 0 when we did complete the TX ring reclaim,
or budget to trigger a repoll.
Fixes: d75b1ade56 ("net: less interrupt masking in NAPI")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the commit d75b1ade56 ("net: less interrupt masking in NAPI") napi repoll
is done only when work_done == budget. In tx napi poll we always return 0.
So tx napi is not called again and we do not clean up the tx ring.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the types of the descriptor entries in the xgbe_ring_desc struct
from u32 to __le32 to fix endian warnings issued by sparse.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is to ensure the net_device's dev.parent is set before we used it
in dma_zalloc_coherent() from init_hash_table().
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ae5c6c6d "ptp: Classify ptp over ip over vlan packets" changed the
code in two drivers that matches time stamps with PTP frames, with the goal
of allowing VLAN tagged PTP packets to receive hardware time stamps.
However, that commit failed to account for the VLAN header when parsing
IPv4 packets. This patch fixes those two drivers to correctly match VLAN
tagged IPv4/UDP PTP messages with their time stamps.
This patch should also be applied to v3.17.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix static checker warning that got introduced in commit e2ac962895
("cxgb4: Cleanup macros so they follow the same style and look consistent, part
2") due to accidental checkin of bogus line.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* In LLD_MANAGED mode, traffic classes were being returned in reverse order to
lldp agent.
* Priotype of strict is no longer the default returned.
* Change behaviour of getdcbx() based on discussions on lldp-devel
These were missed as there was no working fetch interface for open-lldp when
running in LLD_MANAGED mode till now.
Fixes: 76bcb31efc ("cxgb4 : Add DCBx support codebase and dcbnl_ops")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a NULL pointer dereference when __tx_port_find() doesn't
find a matching port.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Intel drivers were pretty much just using the plain vanilla GFP flags
in their calls to __skb_alloc_page so this change makes it so that they use
dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the bloated use of __skb_alloc_page and replace it with
__dev_alloc_page. In addition update the one other spot that is
allocating a page so that it allocates with the correct flags.
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case an interface has been brought down before entering S3, and then
brought up out of S3, all the initialization done during
bcmgenet_probe() by bcmgenet_mii_init() calling bcmgenet_mii_config() is
just lost since register contents are restored to their reset values.
Re-apply this configuration anytime we call bcmgenet_open() to make sure
our port multiplexer is properly configured to match the PHY interface.
Since we are now calling bcmgenet_mii_config() everytime bcmgenet_open()
is called, make sure we only print the message during initialization
time not to pollute the console.
Fixes: b6e978e504 ("net: bcmgenet: add suspend/resume callbacks")
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_disconnect() is the only way to guarantee that we are not going to
schedule more work on the PHY state machine workqueue for that
particular PHY device.
This fixes an issue where a network interface was suspended prior to a
system suspend/resume cycle and would then be resumed as part of
mdio_bus_resume(), since the GENET interface clocks would have been
disabled, this basically resulted in bus errors to appear since we are
invoking the GENET driver adjust_link() callback.
Fixes: b6e978e504 ("net: bcmgenet: add suspend/resume callbacks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the dependency of the VENDOR entry and fixes
the QCA7000 one.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Status variable is never initialized, can carry an arbitrary value
on the stack and thus may let the function fail.
Fixes: e90dd26456 ("ixgbe: Make return values more direct")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-11-11
This series contains updates to i40e, i40evf and ixgbe.
Kamil updated the i40e and i40evf driver to poll the firmware slower
since we were polling faster than the firmware could respond.
Shannon updates i40e to add a check to keep the service_task from
running the periodic tasks more than once per second, while still
allowing quick action to service the events.
Jesse cleans up the throttle rate code by fixing the minimum interrupt
throttle rate and removing some unused defines.
Mitch makes the early init admin queue message receive code more robust
by handling messages in a loop and ignoring those that we are not
interested in. This also gets rid of some scary log messages that
really do not indicate a problem.
Don provides several ixgbe patches, first fixes an issue with x540
completion timeout where on topologies including few levels of PCIe
switching for x540 can run into an unexpected completion error. Cleans
up the functionality in ixgbe_ndo_set_vf_vlan() in preparation for
future work. Adds support for x550 MAC's to the driver.
v2:
- Remove code comment in patch 01 of the series, based on feedback from
David Liaght
- Updated the "goto" to "break" statements in patch 06 of the series,
based on feedback from Sergei Shtylyov
- Initialized the variable err due to the possibility of use before
being assigned a value in patch 07 of the series
- Added patch "ixgbe: add helper function for setting RSS key in
preparation of X550" since it is needed for the addition of X550 MAC
support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of registering the platform and PCI drivers in one module let's move
necessary bits to where it belongs. During this procedure we convert the module
registration part to use module_*_driver() macros which makes code simplier.
>From now on the driver consists three parts: core library, PCI, and platform
drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currenly we only support Large-Send and TX checksum offloads for
encapsulated traffic of type VXLAN. We must make sure to advertize
these offloads up to the stack only when VXLAN tunnel is set.
Failing to do so, would mislead the the networking stack to assume
that the driver can offload the internal TX checksum for GRE packets
and other buggy schemes.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).
Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.
In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can call napi_gro_frags for all the received traffic regardless
of the checksum status. Specifically, received packets whose status
is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
are eligible for napi_gro_frags as well.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split off the setting of the RSS key into its own function. This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver. New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move setting of drop enable to support function. This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.
Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.
Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.
Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.
Change some strings in the function to be a bit less wrappy, and
express the correct limits.
Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task. However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware. This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.
Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.
The maximum delay is still 100ms.
Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After commit 1a28817282 ("mlx4: use napi_complete_done()") we ended up
calling napi_complete_done() in the case NAPI poll consumed all its
budget.
This added extra interrupt pressure, this patch restores proper
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1a28817282 ("mlx4: use napi_complete_done()")
Signed-off-by: David S. Miller <davem@davemloft.net>
The out_dropped label will only do rcu_read_unlock for non-null port.
So add the missing rcu_read_unlock() when bailing due to non-null port.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per comments in vnet_start_xmit, for the edge case
when outgoing vnet_start_xmit() data and an incoming STOPPED
ACK cross each other in flight, we may need to send the missed
START trigger from maybe_tx_wakeup() after checking for a
false value of start_cons
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vnet_start_xmit() is concurrent with vnet_ack(), we may
have a race that looks like:
thread 1 thread 2
vnet_start_xmit vnet_event_napi -> vnet_rx
__vnet_tx_trigger for some desc X
at this point dr->prod == X
peer sends back a stopped ack for X
we process X, but X == dr->prod
so we bail out in vnet_ack with
!idx_is_pending
update dr->prod
As a result of the fact that we never processed the stopped ack for X,
the Tx path is led to incorrectly believe that the peer is still
"started" and reading, but the peer has stopped reading, which will
ultimately end in flow-control assertions.
The fix is to synchronize the above 2 paths on the netif_tx_lock.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver allows MTU up to 1518 bytes which is not enought to run
batman-adv. Simply raise the maximum packet size up to the maximum
allowed by the transmit descriptor, 1792 bytes, giving a maximum MTU
of 1774 bytes.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the default ndo_change_mtu callback with one that allow
setting MTU that the driver can handle.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike CEE, IEEE has a bespoke app delete call and does not rely on priority
for app deletion
Fixes : 2376c879b8 ('cxgb4 : Improve handling of DCB negotiation or loss
thereof')
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free List Starvation Threshold needs to be larger than the SGE's Egress
Congestion Threshold or we'll end up in a mutual stall where the driver waits
for Ingress Packets to drive replacing Free List Pointers and the SGE waits for
Free List Pointers before pushing Ingress Packets to the host.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
T5 introduces the ability to have separate Packing and Padding Boundaries
for SGE DMA transfers from the chip to Host Memory. This change set takes
advantage of that to set up a smaller Padding Boundary to conserve PCI Link
and Memory Bandwidth with T5.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move fl_starv_thres into adapter->sge data structure since it
_could_ be different from adapter to adapter. Also move other per-adapter
SGE values which had been treated as driver globals into adapter->sge.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This ways drivers like cxgb4 don't need to do ugly relative includes.
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various patches have ended up changing the style of the symbolic macros/register
defines to different style.
As a result, the current kernel.org files are a mix of different macro styles.
Since this macro/register defines is used by different drivers a
few patch series have ended up adding duplicate macro/register define entries
with different styles. This makes these register define/macro files a complete
mess and we want to make them clean and consistent. This patch cleans up a part
of it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various patches have ended up changing the style of the symbolic macros/register
to different style.
As a result, the current kernel.org files are a mix of different macro styles.
Since this macro/register defines is used by different drivers a
few patch series have ended up adding duplicate macro/register define entries
with different styles. This makes these register define/macro files a complete
mess and we want to make them clean and consistent. This patch cleans up a part
of it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To enable gro_flush_timeout, a driver has to use napi_complete_done()
instead of napi_complete().
Tested:
Ran 200 netperf TCP_STREAM from A to B (10Gbe mlx4 link, 8 RX queues)
Without this feature, we send back about 305,000 ACK per second.
GRO aggregation ratio is low (811/305 = 2.65 segments per GRO packet)
Setting a timer of 2000 nsec is enough to increase GRO packet sizes
and reduce number of ACK packets. (811/19.2 = 42)
Receiver performs less calls to upper stacks, less wakes up.
This also reduces cpu usage on the sender, as it receives less ACK
packets.
Note that reducing number of wakes up increases cpu efficiency, but can
decrease QPS, as applications wont have the chance to warmup cpu caches
doing a partial read of RPC requests/answers if they fit in one skb.
B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average: eth0 811269.80 305732.30 1199462.57 19705.72 0.00
0.00 0.50
B:~# echo 2000 >/sys/class/net/eth0/gro_flush_timeout
B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average: eth0 811577.30 19230.80 1199916.51 1239.80 0.00
0.00 0.50
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following sparse warnings. One is fixed by casting return
value to a return type of the function. The others by creating a specific
stmmac_platform.h which provides the bits related to the platform driver.
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: warning: incorrect type in return expression (different address spaces)
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: expected void *
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: got void [noderef] <asn:2>*reg
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:64:29: warning: symbol 'meson6_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:354:29: warning: symbol 'stih4xx_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:361:29: warning: symbol 'stid127_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sunxi.c:133:29: warning: symbol 'sun7i_gmac_data' was not declared. Should it be static?
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a kernel helper to dump buffers in a hexdecimal format. This patch
substitutes the open coded function by calling that helper.
The output is slightly changed:
- no lead space
- ASCII part will be printed along with the dump
- offset is longer than 3 characters (now 8)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 1b7bde6d65 ("net: fec: implement rx_copybreak to improve rx performance")
introduced a regression for i.MX28. The swap_buffer() function doing
the endian conversion of the received data on i.MX28 may access memory
beyond the actual packet size in the DMA buffer. fec_enet_copybreak()
does not copy those bytes, so that the last bytes of a packet may be
filled with invalid data after swapping.
This will likely lead to checksum errors on received packets.
E.g. when trying to mount an NFS rootfs:
UDP: bad checksum. From 192.168.1.225:111 to 192.168.100.73:44662 ulen 36
Do the byte swapping and copying to the new skb in one go if
necessary.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the skb allocation fails during receive processing, the driver would
continue reading descriptors without first determining if there were
any more descriptors for the current packet. Update the code to check
whether more descriptors are associated with the current packet or
whether to move on to the next descriptor as a new packet.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The channel structure is freed before freeing the per channel
interrupts resulting in a kernel oops. Move the call to free
the channel structure to after the freeing of the per channel
interrupts.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Poll for the link events only if firmware doesn't have capability
to notify the driver for the link events.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When events arrive at driver load, the event handler gets called even before
the spinlock and list are initialized. Fix this by moving the initialization
before EQs creation.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>