Instead of keeping a local copy of the status bits from the descriptor
we can just read them directly - this is accomplished with the addition
of ixgbevf_test_staterr().
In addition instead of doing a byteswap on the status bits value, we
can byteswap the constant values we are testing since that can be done
at compile time which should help to improve performance on big-endian
systems.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Instead of clearing the status bits in the cleanup it makes more sense to
just clear the status bits on allocation. This way we can leave the Rx
descriptor rings as a read only memory block until we actually have buffers
to give back to the hardware.
CC: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This is currently missing, which results in a crash when one attempts
to set VXLAN tunnel over the mlx4_en when acting as PF.
[ 2408.785472] BUG: unable to handle kernel NULL pointer dereference at (null)
[...]
[ 2408.994104] Call Trace:
[ 2408.996584] [<ffffffffa021f7f5>] ? vxlan_get_rx_port+0xd6/0x103 [vxlan]
[ 2409.003316] [<ffffffffa021f71f>] ? vxlan_lowerdev_event+0xf2/0xf2 [vxlan]
[ 2409.010225] [<ffffffffa0630358>] mlx4_en_start_port+0x862/0x96a [mlx4_en]
[ 2409.017132] [<ffffffffa063070f>] mlx4_en_open+0x17f/0x1b8 [mlx4_en]
While here, make sure to invoke vxlan_get_rx_port() only when VXLAN
offloads are actually enabled and not when they are only supported.
Reported-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When run ./scripts/kernel-doc several warnings are reported
so this patch fix them.
Also it reviews many comments and adds new ones.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds some useful comments inside the common header
file to provide information about the APIs exposed by the driver.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of swap_buffer() is not used by any caller, thus
remove it.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eliminate the DIV_ROUND_UP() and change the loop counter increment to
4 instead. This results in saving 6 instructions in the functions
assembly code.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
when swap_buffer() is being called, we know for sure, that we need to
byte swap the data. Furthermore, this function is called for swapping
data in both directions. Thus cpu_to_be32() is semantically not
correct for all use cases. Use swab32s() to reflect this.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
fep->bufdesc_ex is treated as a boolean value, thus declare it as
such.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to DCBX configuration change if the VSI needs to use more than 1 TC;
it needs to disable the XPS maps that were set when operating in 1 TC mode.
Without disabling XPS the netdev layer will select queues based on those
settings and not use the TC queue mapping to make the queue selection.
This patch allows the driver to enable/disable the XPS based on the number
of TCs being enabled for the given VSI.
Change-ID: Idc4dec47a672d2a509f6d7fe11ed1ee65b4f0e08
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When PFC is enabled we should not proceed with setting the link flow control
parameters. Also, always report the link flow Tx/Rx settings as off when
PFC is enabled.
Change-ID: Ib09ec58afdf0b2e587ac9d8851a5c80ad58206c4
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
FCoE VSI Tx queue disable times out when reconfiguring as a result of
DCB TC configuration change event.
The hardware allows us to skip disabling and enabling of Tx queues for
VSIs with single TC enabled. As FCoE VSI is configured to have only
single TC we skip it from disable/enable flow.
Change-ID: Ia73ff3df8785ba2aa3db91e6f2c9005e61ebaec2
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When DCB TC configuration changes the firmware suspends the port's Tx.
Now, as DCB TCs may have changed the PF driver tries to reconfigure the
TC configuration of the VSIs it manages. As part of this process it disables
the VSI queues but the Tx queue disable will not complete as the port's
Tx has been suspended. So, waiting for Tx queues to go to disable state
in this flow may lead to detection of Tx queue disable timeout errors.
Hence, this patch adds a new PF state so that if a port's Tx is in
suspended state the Tx queue disable flow would just put the request for
the queue to be disabled and return without waiting for the queue to be
actually disabled.
Once the VSI(s) TC reconfiguration has been done and driver has called
firmware AQC "Resume PF Traffic" the driver checks the Tx queues requested
to be disabled are actually disabled before re-enabling them again.
Change-ID: If3e03ce4813a4e342dbd5a1eb1d2861e952b7544
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the port TC configuration changes as a result of DCBx the driver
modifies the enabled TCs for the VEBs it manages. But, in the process
it did not update the enabled_tc value that it caches on a per VEB basis.
So, when the next reconfiguration event occurs where the number of TC
value is same as the value cached in enabled_tc for a given VEB; driver
does not modify it's TC configuration by calling appropriate AQ command
believing it is running with the same configuration as requested.
Now, as the VEB is not actually enabled for the TCs that are there any
TC configuration command for VSI attached to that VEB with TCs that are
not enabled for the VEB fails.
This patch fixes this issue.
Change-ID: Ife5694469b05494228e0d850429ea1734738cf29
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds a check whether LLDP Agent's default AdminStatus is
enabled or disabled on a given port. If it is disabled then it sets
the DCBX status to disabled as well; and would not query firmware for
any DCBX configuration data.
Change-ID: I73c0b9f0adbf4cae177d14914b20a48c9a8f50fd
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch allows i40e driver to query and use DCB configuration from
firmware when firmware DCBX agent is in CEE mode.
Change-ID: I30f92a67eb890f0f024f35339696e6e83d49a274
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When there are DCB configuration changes based on DCBX the firmware suspends
the port's Tx and generates an event to the PF. The PF is then responsible
to reconfigure the PF VSIs and switching topology as per the updated DCB
configuration and then resume the port's Tx by calling the "Resume Port Tx"
AQ command.
This patch adds this call to the flow that handles DCB re-configuration in
the PF.
Change-ID: I5b860ad48abfbf379b003143c4d3453e2ed5cc1c
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-By: Jack Morgan <jack.morgan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Bumping minor version as this will be the second SW release and it
should be 1.
Change-ID: If0bd102095d2f059ae0c9b7f4ad625535ffbbdee
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
VF interrupt processing takes a looooong time, and it's possible that we
could lose a VFLR event if it happens while we're processing a VFLR on
another VF. This would leave the VF in a semi-permanent reset state,
which would not be cleared until yet another VF experiences a VFLR.
To correct this situation, we enable the VFLR interrupt cause before we
begin processing any pending resets. This means that any VFLR that
occurs during reset processing will generate another interrupt and this
routine will get called again.
This change may cause a spurious interrupt when multiple VFLRs occur
very close together in time. If this happens, then this routine will be
called again and it will detect no outstanding VFLR events and do
nothing. No harm, no foul.
Change-ID: Id0451f3e6e73a2cf6db1668296c71e129b59dc19
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Only warn once that PTP is not supported when linked at 100Mbit.
Yes, using a static this way means that this once-only message is not
port specific, but once only for the life of the driver, regardless of
the number of ports. That should be plenty.
Change-ID: Ie6476530056df408452e195ef06afd4f57caa4b2
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Also provide ethtool -x support to fetch RSS key
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Rename rss_hkey local variable to rss_key to have consistent name among
drivers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Sathya Perla <sathya.perla@emulex.com>
Cc: Subbu Seetharaman <subbu.seetharaman@emulex.com>
Cc: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use of well known RSS key increases attack surface.
Switch to a random one, using generic helper so that all
ports share a common key.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Prashant Sreedharan <prashant@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netdev_rss_key_fill() helper, as it provides better support for some
bonding setups.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lendacky, Thomas <Thomas.Lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TX_IN_SEL offset for the CPSW_PORT/TX_IN_CTL register was
incorrect. This caused the Dual MAC mode to never get set when
it should. It also caused possible unintentional setting of a
bit in the CPSW_PORT/TX_BLKS_REM register.
The purpose of setting the Dual MAC mode for this register is to:
"... allow packets from both ethernet ports to be written into
the FIFO without one port starving the other port."
- AM335x ARM TRM
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Reviewed-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Shahed Shaikh <shahed.shaikh@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use vxlan_gso_check() to advertise offload support for this NIC.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Sathya Perla <sperla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/chelsio/cxgb4vf/sge.c
drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
sge.c was overlapping two changes, one to use the new
__dev_alloc_page() in net-next, and one to use s->fl_pg_order in net.
ixgbe_phy.c was a set of overlapping whitespace changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
We now allow up to 126 VFs. Note though that certain firmware
versions only allow up to 80 VFs. Moreover, old HCAs only support 64 VFs.
In these cases, we limit the maximum number of VFs to 64.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, the driver queried the firmware in order to get the number
of supported EQs. Under SRIOV, since this was done before the driver
notified the firmware how many VFs it actually needs, the firmware had
to take into account a worst case scenario and always allocated four EQs
per VF, where one was used for events while the others were used for completions.
Now, when the firmware supports the asymmetric allocation scheme, denoted
by exposing num_sys_eqs > 0 (--> MLX4_DEV_CAP_FLAG2_SYS_EQS), we use the
QUERY_FUNC command to query the firmware before enabling SRIOV. Thus we
can get more EQs and MSI-X vectors per function.
Moreover, when running in the new firmware/driver mode, the limitation
that the number of EQs should be a power of two is lifted.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
QUERY_FUNC firmware command could be used in order to query the
number of EQs, reserved EQs, etc for a specific function.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactor mlx4_load_one, as a preparation step for a new and
more complicated load function. The goal is to support both
newer firmware that required init_hca to be done before
enable_sriov and legacy firmwares that requires things to
be done the other way around.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactoring mlx4_cmd_init and mlx4_cmd_cleanup such that partial init
and cleanup are possible. After this refactoring, calling mlx4_cmd_init
several times is safe.
This is necessary in the VF init flow when mlx4_init_hca returns -EACCESS,
we need to issue cleanup and re-attempt to call it with the slave flag.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We've used an incorrect type for the loop counter and the
mlx4_QUERY_FUNC_CAP function. The current input modifier
is either a port or a boolean.
Since the number of ports is always a positive value < 255,
we should use u8 instead of an integer with casting.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We mistakenly read the reserved_eqs field as a standard
numeric value rather than a log2 value.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit be9dad1f9f ("net: phy: suspend phydev when going
to HALTED"), the PHY device will be put in a low-power mode using
BMCR_PDOWN if the the interface is set down. The smsc911x driver does
a software_reset opening the device driver (ndo_open). In such case,
the PHY must be powered-up before access to any register and before
calling the software_reset function. Otherwise, as the PHY is powered
down the software reset fails and the interface can not be enabled
again.
This patch fixes this scenario that is easy to reproduce setting down
the network interface and setting up again.
$ ifconfig eth0 down
$ ifconfig eth0 up
ifconfig: SIOCSIFFLAGS: Input/output error
Signed-off-by: Enric Balletbo i Serra <eballetbo@iseebcn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The device tree probing for R-Car M2N (r8a7793) is added.
Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When using RMMI mode, it is necessary to change in probe.
Signed-off-by: Hisashi Nakamura <hisashi.nakamura.ak@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There was a missing break statement so we set everything to
PKT_HASH_TYPE_L3 even when we intended to use PKT_HASH_TYPE_L4.
Fixes: 5b9dfe299e ('amd-xgbe: Provide support for receive side scaling')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Increased delay in the smsc911x_phy_disable_energy_detect (from 1ms to 2ms).
Dropped delays in the smsc911x_phy_enable_energy_detect (100ms and 1ms).
The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).
I saw problems with soft reset due to wrong udelay timings.
After I fixed udelay, I measured the time needed to bring integrated PHY
from power-down to operational mode (the time beetween clearing EDPWRDOWN
bit and soft reset complete event). I got 1ms (measured using ktime_get).
The value is equal to the current value (1ms) used in the
smsc911x_phy_disable_energy_detect. It is near the upper bound and in order
to avoid rare soft reset faults it is doubled (2ms).
I don't know official timing for bringing up integrated PHY as specs doesn't
clarify this (or may be I didn't found).
It looks safe to drop delays before and after setting EDPWRDOWN bit
(enable PHY power-down mode). I didn't saw any regressions with the patch.
The patch was reviewed by Steve Glendinning and Microchip Team.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch affect SMSC LAN generation 4 chips with integrated PHY (LAN9221).
It is possible that PHY could enter power-down mode (ENERGYON clear),
between ENERGYON bit check in smsc911x_phy_disable_energy_detect and SRST
bit set in smsc911x_soft_reset. This could happen, for example, if someone
disconnect ethernet cable between the checks. The PHY in a power-down mode
would prevent the MAC portion of chip to be software reseted.
Initially found by code review, confirmed later using test case.
This is low probability issue, and in order to reproduce it you have to
run the script:
while true; do
ifconfig eth0 down
ifconfig eth0 up || break
done
While the script is running you have to plug/unplug ethernet cable many
times (using gpio controlled ethernet switch, for example) until get:
[ 4516.477783] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.512207] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.524658] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 4516.559082] smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xce006000, IRQ: 336
[ 4516.571990] ADDRCONF(NETDEV_UP): eth0: link is not ready
ifconfig: SIOCSIFFLAGS: Input/output error
The patch was reviewed by Steve Glendinning and Microchip Team.
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Acked-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Refactored all macros used in cxgb4i as part of previously started cxgb4 macro
names cleanup. Makes them more uniform and avoids namespace collision.
Minor changes in other drivers where required as some of these macros are used
by multiple drivers, affected drivers are iw_cxgb4, cxgb4(vf) & csiostor
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With commit d75b1ade56 ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. bcm_sysport_tx_poll()
always returns 0 whether or not we completed the poll quantum.
Fix this by returning either 0 when we did complete the TX ring reclaim,
or budget to trigger a repoll.
Fixes: d75b1ade56 ("net: less interrupt masking in NAPI")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the commit d75b1ade56 ("net: less interrupt masking in NAPI") napi repoll
is done only when work_done == budget. In tx napi poll we always return 0.
So tx napi is not called again and we do not clean up the tx ring.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the types of the descriptor entries in the xgbe_ring_desc struct
from u32 to __le32 to fix endian warnings issued by sparse.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is to ensure the net_device's dev.parent is set before we used it
in dma_zalloc_coherent() from init_hash_table().
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Acked-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ae5c6c6d "ptp: Classify ptp over ip over vlan packets" changed the
code in two drivers that matches time stamps with PTP frames, with the goal
of allowing VLAN tagged PTP packets to receive hardware time stamps.
However, that commit failed to account for the VLAN header when parsing
IPv4 packets. This patch fixes those two drivers to correctly match VLAN
tagged IPv4/UDP PTP messages with their time stamps.
This patch should also be applied to v3.17.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix static checker warning that got introduced in commit e2ac962895
("cxgb4: Cleanup macros so they follow the same style and look consistent, part
2") due to accidental checkin of bogus line.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* In LLD_MANAGED mode, traffic classes were being returned in reverse order to
lldp agent.
* Priotype of strict is no longer the default returned.
* Change behaviour of getdcbx() based on discussions on lldp-devel
These were missed as there was no working fetch interface for open-lldp when
running in LLD_MANAGED mode till now.
Fixes: 76bcb31efc ("cxgb4 : Add DCBx support codebase and dcbnl_ops")
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a NULL pointer dereference when __tx_port_find() doesn't
find a matching port.
Signed-off-by: David L Stevens <david.stevens@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Intel drivers were pretty much just using the plain vanilla GFP flags
in their calls to __skb_alloc_page so this change makes it so that they use
dev_alloc_page which just uses GFP_ATOMIC for the gfp_flags value.
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Matthew Vick <matthew.vick@intel.com>
Cc: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the bloated use of __skb_alloc_page and replace it with
__dev_alloc_page. In addition update the one other spot that is
allocating a page so that it allocates with the correct flags.
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case an interface has been brought down before entering S3, and then
brought up out of S3, all the initialization done during
bcmgenet_probe() by bcmgenet_mii_init() calling bcmgenet_mii_config() is
just lost since register contents are restored to their reset values.
Re-apply this configuration anytime we call bcmgenet_open() to make sure
our port multiplexer is properly configured to match the PHY interface.
Since we are now calling bcmgenet_mii_config() everytime bcmgenet_open()
is called, make sure we only print the message during initialization
time not to pollute the console.
Fixes: b6e978e504 ("net: bcmgenet: add suspend/resume callbacks")
Fixes: 1c1008c793 ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
phy_disconnect() is the only way to guarantee that we are not going to
schedule more work on the PHY state machine workqueue for that
particular PHY device.
This fixes an issue where a network interface was suspended prior to a
system suspend/resume cycle and would then be resumed as part of
mdio_bus_resume(), since the GENET interface clocks would have been
disabled, this basically resulted in bus errors to appear since we are
invoking the GENET driver adjust_link() callback.
Fixes: b6e978e504 ("net: bcmgenet: add suspend/resume callbacks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the dependency of the VENDOR entry and fixes
the QCA7000 one.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Status variable is never initialized, can carry an arbitrary value
on the stack and thus may let the function fail.
Fixes: e90dd26456 ("ixgbe: Make return values more direct")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-11-11
This series contains updates to i40e, i40evf and ixgbe.
Kamil updated the i40e and i40evf driver to poll the firmware slower
since we were polling faster than the firmware could respond.
Shannon updates i40e to add a check to keep the service_task from
running the periodic tasks more than once per second, while still
allowing quick action to service the events.
Jesse cleans up the throttle rate code by fixing the minimum interrupt
throttle rate and removing some unused defines.
Mitch makes the early init admin queue message receive code more robust
by handling messages in a loop and ignoring those that we are not
interested in. This also gets rid of some scary log messages that
really do not indicate a problem.
Don provides several ixgbe patches, first fixes an issue with x540
completion timeout where on topologies including few levels of PCIe
switching for x540 can run into an unexpected completion error. Cleans
up the functionality in ixgbe_ndo_set_vf_vlan() in preparation for
future work. Adds support for x550 MAC's to the driver.
v2:
- Remove code comment in patch 01 of the series, based on feedback from
David Liaght
- Updated the "goto" to "break" statements in patch 06 of the series,
based on feedback from Sergei Shtylyov
- Initialized the variable err due to the possibility of use before
being assigned a value in patch 07 of the series
- Added patch "ixgbe: add helper function for setting RSS key in
preparation of X550" since it is needed for the addition of X550 MAC
support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of registering the platform and PCI drivers in one module let's move
necessary bits to where it belongs. During this procedure we convert the module
registration part to use module_*_driver() macros which makes code simplier.
>From now on the driver consists three parts: core library, PCI, and platform
drivers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currenly we only support Large-Send and TX checksum offloads for
encapsulated traffic of type VXLAN. We must make sure to advertize
these offloads up to the stack only when VXLAN tunnel is set.
Failing to do so, would mislead the the networking stack to assume
that the driver can offload the internal TX checksum for GRE packets
and other buggy schemes.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When processing received traffic, pass CHECKSUM_COMPLETE status to the
stack, with calculated checksum for non TCP/UDP packets (such
as GRE or ICMP).
Although the stack expects checksum which doesn't include the pseudo
header, the HW adds it. To address that, we are subtracting the pseudo
header checksum from the checksum value provided by the HW.
In the IPv6 case, we also compute/add the IP header checksum which
is not added by the HW for such packets.
Cc: Jerry Chu <hkchu@google.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We can call napi_gro_frags for all the received traffic regardless
of the checksum status. Specifically, received packets whose status
is CHECKSUM_NONE (and soon to be added CHECKSUM_COMPLETE)
are eligible for napi_gro_frags as well.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split off the setting of the RSS key into its own function. This
will help when we add support for X550 which can have different
RSS keys per pool.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver. New functionality and enablement support will
be added in following patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Move setting of drop enable to support function. This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error. We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent. We then
poll the pending transaction bit to ensure we received the completion. Only
then do we go on to clear the buffers.
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.
Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.
Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.
Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.
Change some strings in the function to be a bit less wrappy, and
express the correct limits.
Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task. However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware. This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.
Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.
The maximum delay is still 100ms.
Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
After commit 1a28817282 ("mlx4: use napi_complete_done()") we ended up
calling napi_complete_done() in the case NAPI poll consumed all its
budget.
This added extra interrupt pressure, this patch restores proper
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 1a28817282 ("mlx4: use napi_complete_done()")
Signed-off-by: David S. Miller <davem@davemloft.net>
The out_dropped label will only do rcu_read_unlock for non-null port.
So add the missing rcu_read_unlock() when bailing due to non-null port.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per comments in vnet_start_xmit, for the edge case
when outgoing vnet_start_xmit() data and an incoming STOPPED
ACK cross each other in flight, we may need to send the missed
START trigger from maybe_tx_wakeup() after checking for a
false value of start_cons
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When vnet_start_xmit() is concurrent with vnet_ack(), we may
have a race that looks like:
thread 1 thread 2
vnet_start_xmit vnet_event_napi -> vnet_rx
__vnet_tx_trigger for some desc X
at this point dr->prod == X
peer sends back a stopped ack for X
we process X, but X == dr->prod
so we bail out in vnet_ack with
!idx_is_pending
update dr->prod
As a result of the fact that we never processed the stopped ack for X,
the Tx path is led to incorrectly believe that the peer is still
"started" and reading, but the peer has stopped reading, which will
ultimately end in flow-control assertions.
The fix is to synchronize the above 2 paths on the netif_tx_lock.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver allows MTU up to 1518 bytes which is not enought to run
batman-adv. Simply raise the maximum packet size up to the maximum
allowed by the transmit descriptor, 1792 bytes, giving a maximum MTU
of 1774 bytes.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the default ndo_change_mtu callback with one that allow
setting MTU that the driver can handle.
Signed-off-by: Alban Bedel <albeu@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike CEE, IEEE has a bespoke app delete call and does not rely on priority
for app deletion
Fixes : 2376c879b8 ('cxgb4 : Improve handling of DCB negotiation or loss
thereof')
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Free List Starvation Threshold needs to be larger than the SGE's Egress
Congestion Threshold or we'll end up in a mutual stall where the driver waits
for Ingress Packets to drive replacing Free List Pointers and the SGE waits for
Free List Pointers before pushing Ingress Packets to the host.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
T5 introduces the ability to have separate Packing and Padding Boundaries
for SGE DMA transfers from the chip to Host Memory. This change set takes
advantage of that to set up a smaller Padding Boundary to conserve PCI Link
and Memory Bandwidth with T5.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move fl_starv_thres into adapter->sge data structure since it
_could_ be different from adapter to adapter. Also move other per-adapter
SGE values which had been treated as driver globals into adapter->sge.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This ways drivers like cxgb4 don't need to do ugly relative includes.
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various patches have ended up changing the style of the symbolic macros/register
defines to different style.
As a result, the current kernel.org files are a mix of different macro styles.
Since this macro/register defines is used by different drivers a
few patch series have ended up adding duplicate macro/register define entries
with different styles. This makes these register define/macro files a complete
mess and we want to make them clean and consistent. This patch cleans up a part
of it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Various patches have ended up changing the style of the symbolic macros/register
to different style.
As a result, the current kernel.org files are a mix of different macro styles.
Since this macro/register defines is used by different drivers a
few patch series have ended up adding duplicate macro/register define entries
with different styles. This makes these register define/macro files a complete
mess and we want to make them clean and consistent. This patch cleans up a part
of it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To enable gro_flush_timeout, a driver has to use napi_complete_done()
instead of napi_complete().
Tested:
Ran 200 netperf TCP_STREAM from A to B (10Gbe mlx4 link, 8 RX queues)
Without this feature, we send back about 305,000 ACK per second.
GRO aggregation ratio is low (811/305 = 2.65 segments per GRO packet)
Setting a timer of 2000 nsec is enough to increase GRO packet sizes
and reduce number of ACK packets. (811/19.2 = 42)
Receiver performs less calls to upper stacks, less wakes up.
This also reduces cpu usage on the sender, as it receives less ACK
packets.
Note that reducing number of wakes up increases cpu efficiency, but can
decrease QPS, as applications wont have the chance to warmup cpu caches
doing a partial read of RPC requests/answers if they fit in one skb.
B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average: eth0 811269.80 305732.30 1199462.57 19705.72 0.00
0.00 0.50
B:~# echo 2000 >/sys/class/net/eth0/gro_flush_timeout
B:~# sar -n DEV 1 10 | grep eth0 | tail -1
Average: eth0 811577.30 19230.80 1199916.51 1239.80 0.00
0.00 0.50
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following sparse warnings. One is fixed by casting return
value to a return type of the function. The others by creating a specific
stmmac_platform.h which provides the bits related to the platform driver.
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: warning: incorrect type in return expression (different address spaces)
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: expected void *
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:59:29: got void [noderef] <asn:2>*reg
drivers/net/ethernet/stmicro/stmmac/dwmac-meson.c:64:29: warning: symbol 'meson6_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:354:29: warning: symbol 'stih4xx_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c:361:29: warning: symbol 'stid127_dwmac_data' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/dwmac-sunxi.c:133:29: warning: symbol 'sun7i_gmac_data' was not declared. Should it be static?
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a kernel helper to dump buffers in a hexdecimal format. This patch
substitutes the open coded function by calling that helper.
The output is slightly changed:
- no lead space
- ASCII part will be printed along with the dump
- offset is longer than 3 characters (now 8)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 1b7bde6d65 ("net: fec: implement rx_copybreak to improve rx performance")
introduced a regression for i.MX28. The swap_buffer() function doing
the endian conversion of the received data on i.MX28 may access memory
beyond the actual packet size in the DMA buffer. fec_enet_copybreak()
does not copy those bytes, so that the last bytes of a packet may be
filled with invalid data after swapping.
This will likely lead to checksum errors on received packets.
E.g. when trying to mount an NFS rootfs:
UDP: bad checksum. From 192.168.1.225:111 to 192.168.100.73:44662 ulen 36
Do the byte swapping and copying to the new skb in one go if
necessary.
Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the skb allocation fails during receive processing, the driver would
continue reading descriptors without first determining if there were
any more descriptors for the current packet. Update the code to check
whether more descriptors are associated with the current packet or
whether to move on to the next descriptor as a new packet.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The channel structure is freed before freeing the per channel
interrupts resulting in a kernel oops. Move the call to free
the channel structure to after the freeing of the per channel
interrupts.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
o Poll for the link events only if firmware doesn't have capability
to notify the driver for the link events.
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When events arrive at driver load, the event handler gets called even before
the spinlock and list are initialized. Fix this by moving the initialization
before EQs creation.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the EQ is created, it can possibly generate interrupts and the interrupt
handler is referencing eq->dev. It is therefore required to set eq->dev before
calling request_irq() so if an event is generated before request_irq() returns,
we will have a valid eq->dev field.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
vnet_event_napi() may be called as part of the NAPI ->poll,
to resume reading descriptor rings. When no data is available,
descriptor ring state (e.g., rcv_nxt) needs to be reset
carefully to stay in lock-step with ldc_read(). In the interest
of simplicity, the best way to do this is to return from
vnet_event_napi() when there are no more packets to read.
The next trip through ldc_rx will correctly set up the dring state.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: David Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
remove redundant tab.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
when cpsw is build as modulea and simple insert and removal of module
creates a deadlock, due to delete timer. the timer is created and destroyed
in cpsw_ale_start and cpsw_ale_stop which are from device open and close.
root@am437x-evm:~# modprobe -r ti_cpsw
[ 158.505333] INFO: trying to register non-static key.
[ 158.510623] the code is fine but needs lockdep annotation.
[ 158.516448] turning off the locking correctness validator.
[ 158.522282] CPU: 0 PID: 1339 Comm: modprobe Not tainted 3.14.23-00445-gd41c88f #44
[ 158.530359] [<c0015380>] (unwind_backtrace) from [<c0012088>] (show_stack+0x10/0x14)
[ 158.538603] [<c0012088>] (show_stack) from [<c054ad70>] (dump_stack+0x78/0x94)
[ 158.546295] [<c054ad70>] (dump_stack) from [<c0088008>] (__lock_acquire+0x176c/0x1b74)
[ 158.554711] [<c0088008>] (__lock_acquire) from [<c0088944>] (lock_acquire+0x9c/0x104)
[ 158.563043] [<c0088944>] (lock_acquire) from [<c004e520>] (del_timer_sync+0x44/0xd8)
[ 158.571289] [<c004e520>] (del_timer_sync) from [<bf2eac1c>] (cpsw_ale_destroy+0x10/0x3c [ti_cpsw])
[ 158.580821] [<bf2eac1c>] (cpsw_ale_destroy [ti_cpsw]) from [<bf2eb268>] (cpsw_remove+0x30/0xa0 [ti_cpsw])
[ 158.591000] [<bf2eb268>] (cpsw_remove [ti_cpsw]) from [<c035ef44>] (platform_drv_remove+0x18/0x1c)
[ 158.600527] [<c035ef44>] (platform_drv_remove) from [<c035d8bc>] (__device_release_driver+0x70/0xc8)
[ 158.610236] [<c035d8bc>] (__device_release_driver) from [<c035e0d4>] (driver_detach+0xb4/0xb8)
[ 158.619386] [<c035e0d4>] (driver_detach) from [<c035d6e4>] (bus_remove_driver+0x4c/0x90)
[ 158.627988] [<c035d6e4>] (bus_remove_driver) from [<c00af2a8>] (SyS_delete_module+0x10c/0x198)
[ 158.637144] [<c00af2a8>] (SyS_delete_module) from [<c000e580>] (ret_fast_syscall+0x0/0x48)
[ 179.524727] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 0, t=2102 jiffies, g=1487, c=1486, q=6)
[ 179.535741] INFO: Stall ended before state dump start
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ATM, txq_reclaim will dequeue and free an skb for each tx desc released
by the hw that has TX_LAST_DESC set. However, in case of TSO, each
hw desc embedding the last part of a segment has TX_LAST_DESC set,
losing the one-to-one 'last skb frag'/'TX_LAST_DESC set' correspondance,
which causes data corruption.
Fix this by checking TX_ENABLE_INTERRUPT instead of TX_LAST_DESC, and
warn when trying to dequeue from an empty txq (which can be symptomatic
of releasing skbs prematurely).
Fixes: 3ae8f4e0b9 ('net: mv643xx_eth: Implement software TSO')
Reported-by: Slawomir Gajzner <slawomir.gajzner@gmail.com>
Reported-by: Julien D'Ascenzio <jdascenzio@yahoo.fr>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Ian Campbell <ijc@hellion.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also add dummy functions where required to avoid NULL pointer dereference.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch in preparation for the upcoming EF10 sriov support.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch series provides a base and cleanup for the
upcoming EF10 SRIOV support.
This patch moves the VF state into siena_nic_data as a basis to
save the VF state based on nic type.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of pr_* macros let's use dev_* macros which provide device name.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Migrate pci driver to managed resources to reduce boilerplate error handling
code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert system PM callbacks to use dev_pm_ops. In addition remove the PCI calls
related to a power state since the bus code cares about this already.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The last standard PCI resource is defined as PCI_STD_RESOURCE_END. Thus, we
could use it instead of plain integer.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following sparse warnings.
drivers/net/ethernet/stmicro/stmmac/enh_desc.c:381:30: warning: symbol 'enh_desc_ops' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/norm_desc.c:253:30: warning: symbol 'ndesc_ops' was not declared. Should it be static?
drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:141:33: warning: symbol 'stmmac_ptp' was not declared. Should it be static?
There is no functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The amd-xgbe driver needs to perform ioremap calls, so add HAS_IOMEM
to its build dependency.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the spelling of the word "descriptor" in a couple
of locations.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for ethtool receive side scaling (RSS) commands.
Support is added to get/set the RSS hash key and the RSS lookup table.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides support for receive side scaling (RSS). RSS allows
for spreading incoming network packets across the Rx queues. When used
in conjunction with the per DMA channel interrupt support, this allows
the receive processing to be spread across multiple processors.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides support for interrupts that are generated by the
Tx/Rx DMA channel pairs of the device. This allows for Tx and Rx
processing to run across multiple processsors.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide support for splitting IP packets so that the header and
payload can be sent to different DMA addresses. This will allow
the IP header to be put into the linear part of the skb while the
payload can be added as frags.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use page allocations for Rx buffers instead of pre-allocating skbs
of a set size.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Tx and Rx descriptors are unsigned 32 bit values. Use the u32
type, rather than unsigned int, to map these descriptors.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pre_xmit function name implies that it performs operations prior
to transmitting the packet when in fact it is responsible for setting
up the descriptors and initiating the transmit. Rename this to
function from pre_xmit to dev_xmit, which is consistent with the name
used during receive processing - dev_read.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the channel and ring tracking structures allocation to device
open. This will allow for future support to vary the number of Tx/Rx
queues without unloading the module.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to fix the atomicity when suspend and resume the
driver. The clk api have been changed (as reported by Hao Liang)
and the skb allocation is done out of the hw setup function and
taking care about the GFP flags.
Reported-by: Hao Liang <hliang1025@gmail.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Hao Liang <hliang1025@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch aims to fix the concurrency in eee initialization
inside the stmmac driver and related warnings when enable
DEBUG_ATOMIC_SLEEP.
Prior this patch, the stmmac_eee_init could be called in several places
as shown below:
stmmac_open stmmac_resume PHY Layer
| | |
stmmac_hw_setup stmmac_adjust_link
| | stmmac ethtool
|__________________________|______________|
|
stmmac_eee_init
The patch removes the stmmac_eee_init call inside the stmmac_hw_setup
that is unnecessary. It is sufficient to call it in the adjust_link to
always guarantee that EEE is always configured at mac level too.
Fixing the lock protection now it is covered another case (not
considered before). The stmmac_eee_init could be called by the ethtool
so critical sections must be protected inside this function too.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing spin_unlock when tx frames gets dropped.
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac_tx_avail() may lie if used unprotected. It's using cur_tx
and dirty_tx index. These index may be already in use by tx_clean
when entering xmit routine. So, this should be called locked.
This can cause transmit queue to be stuck, with following message:
NETDEV WATCHDOG: eth0 (stmmaceth): transmit queue 0 timed out
Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a very old and often unused option to configure
a bit in a register inside the DMA. This support should
not stay under Koption and should be extended for new chips too.
This will be do later maybe via device-tree parameters.
Also no performance impact when remove this setting on STi platforms.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
the STMMAC_DEBUG_FS Koption is now removed from the
driver configuration and this support will be built
by default when DEBUG_FS is present. This can also be
useful on building driver verification.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes all the Koptions added to build the glue-logic files
for all different architectures: DWMAC_MESON, DWMAC_SUNXI, DWMAC_STI ...
Nowadays the stmmac needs to be compiled on several platforms; in some
case it very convenient to guarantee that its build is always completed
with success on all the branches where the driver is present.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Microblaze is a fpga soft core, it can be customized easily, which may
cause many various hardware version strings.
So the original fix patch based on hard-coded compatible version strings
is not a good idea (although it is correct for current issue). For it,
there will be a new solving way soon (which based on the device tree).
The original issue is related with qemu, so can only change the hardware
version string in qemu for it, then keep the original driver no touch (
qemu is for virtualization which has much easier life than real world).
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following kernel crash during SGMII based 1GbE probe.
BUG: Bad page state in process swapper/0 pfn:40fe6ad
page:ffffffbee37a75d8 count:-1 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: nonzero _count
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0+ #7
Call trace:
[<ffffffc000087fa0>] dump_backtrace+0x0/0x12c
[<ffffffc0000880dc>] show_stack+0x10/0x1c
[<ffffffc0004d981c>] dump_stack+0x74/0xc4
[<ffffffc00012fe70>] bad_page+0xd8/0x128
[<ffffffc000133000>] get_page_from_freelist+0x4b8/0x640
[<ffffffc000133260>] __alloc_pages_nodemask+0xd8/0x834
[<ffffffc0004194f8>] __netdev_alloc_frag+0x124/0x1b8
[<ffffffc00041bfdc>] __netdev_alloc_skb+0x90/0x10c
[<ffffffc00039ff30>] xgene_enet_refill_bufpool+0x11c/0x280
[<ffffffc0003a11a4>] xgene_enet_process_ring+0x168/0x340
[<ffffffc0003a1498>] xgene_enet_napi+0x1c/0x50
[<ffffffc00042b454>] net_rx_action+0xc8/0x18c
[<ffffffc0000b0880>] __do_softirq+0x114/0x24c
[<ffffffc0000b0c34>] irq_exit+0x94/0xc8
[<ffffffc0000e68a0>] __handle_domain_irq+0x8c/0xf4
[<ffffffc000081288>] gic_handle_irq+0x30/0x7c
This was due to hardware resource sharing conflict with the firmware. This
patch fixes this crash by using resources (descriptor ring, prefetch buffer)
that are not shared.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support when used with older firmware (<= 1.13.28).
- Added xgene_ring_mgr_init() to check whether ring manager is initialized
- Calling xgene_ring_mgr_init() from xgene_port_ops.reset()
- To handle errors, changed the return type of xgene_port_ops.reset()
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-11-03
This series contains updates to i40e and i40evf.
Akeem adds a check for i40e so that flow director flush and reinit are
not done when flow director is not enabled.
Mitch fixes the i40evf driver to properly handle multiple admin queue
messages, by reinit the msg_size field each time we go through the loop.
Without this, we may receive truncated messages due to the firmware
thinking we have insufficient buffer size. Also fixes the link checking
logic to only check the carrier state if the interface is actually
open, which allows link changes to be reported correctly without spamming
the VFs. Updates i40e to inset the VSI ID in the QTX_CTL register
when configuring queues for VMDq VSIs.
Paul adds support for 10G-base-T in i40evf.
Jesse fixes i40e where the call to irq_dynamic_disable() was turning off
the interrupt completely when trying to set ITR to 0 (for lowest
moderation).
Shannon removes debugfs dump stats function, since it was not being
kept up-to-date and was redundant with the ethtool output. Also, scales
back the LAN MSIx usage to force queue/vector sharing and leave some
vectors for Flow Director, VMDq, etc. when there are more cores than
vectors available to the PF. Cleans up the error reporting for
get_lump() resource tracking errors. Also adds a check for the
debug module parameter earlier to be able to catch the early configuration
phase admin queue messages.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
efx_ef10_probe() was BUGging out if the BAR2 size was 0. This is
unnecessarily violent; instead we should just fail to probe the device.
Kept a WARN_ON as this problem indicates a broken or misconfigured NIC.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add code to issue CONFIG_DEV "get" firmware command.
This command is used in order to obtain certain parameters used for
supporting various RX checksumming options and vxlan UDP port.
The GET operation is allowed for VFs too.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Needed in order to get cache cold pages (L3 flushed) for HW scatter.
Otherwise memory may flush those entries when the packet comes from
PCI, causing back pressure resulting in BW decrease.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When IP_ALIGN has a non zero value, hardware will write to a non aligned
address. The only reader from this address is when copying the header
from the first frag into the linear buffer (further access to the IP
address will be from the linear buffer, in which the headers are
aligned). Since the penalty of non align access by the hardware is
greater than the software memcpy, changing the frag_align to always be 0.
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>