linux/drivers
Brett Creeley c585ea42ec ice: Fix tx_timeout in PF driver
Prior to this commit the driver was running into tx_timeouts when a
queue was stressed enough. This was happening because the HW tail
and SW tail (NTU) were incorrectly out of sync. Consequently this was
causing the HW head to collide with the HW tail, which to the hardware
means that all descriptors posted for Tx have been processed.

Due to the Tx logic used in the driver, SW tail and HW tail are allowed
to be out of sync. This is done as an optimization because it allows the
driver to write HW tail as infrequently as possible, while still
updating the SW tail index to keep track. However, there are situations
where this results in the tail never getting updated, resulting in Tx
timeouts.

Tx HW tail write condition:
	if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more)
		writel(sw_tail, tx_ring->tail);

An issue was found in the Tx logic that was causing the aforementioned
condition for updating HW tail to never happen, causing tx_timeouts.

In ice_xmit_frame_ring we calculate how many descriptors we need for the
Tx transaction based on the skb the kernel hands us. This is then passed
into ice_maybe_stop_tx along with some extra padding to determine if we
have enough descriptors available for this transaction. If we don't then
we return NETDEV_TX_BUSY to the stack, otherwise we move on and
eventually prepare the Tx descriptors accordingly in ice_tx_map and set
next_to_watch. In ice_tx_map we make another call to ice_maybe_stop_tx
with a value of MAX_SKB_FRAGS + 4. The key here is that this value is
possibly less than the value we sent in the first call to
ice_maybe_stop_tx in ice_xmit_frame_ring. Now, if the number of unused
descriptors is between MAX_SKB_FRAGS + 4 and the value used in the first
call, then we do not update the HW tail because of the "Tx HW tail write
condition" above. This is because the second call to ice_maybe_stop_tx
returns success instead of calling __ice_maybe_stop_tx, which would call
netif_stop_subqueue and set the __QUEUE_STATE_DEV_XOFF bit. That bit is
what netif_xmit_stopped checks in the "Tx HW tail write condition", so
with the bit clear (and skb->xmit_more set) HW tail is not written.
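
The window between the two thresholds can be illustrated with a small
standalone model. This is only a sketch, not driver code: struct tx_model,
the model_* helpers and the value 17 for MODEL_MAX_SKB_FRAGS are
assumptions made for illustration, and the re-check that the real
__ice_maybe_stop_tx may perform after stopping the queue is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; every name here is illustrative, not from the driver. */
#define MODEL_MAX_SKB_FRAGS 17	/* assumed value; the kernel's may differ */

struct tx_model {
	unsigned int unused;	/* free descriptors left in the ring */
	bool queue_stopped;	/* stands in for __QUEUE_STATE_DEV_XOFF */
};

/* Same shape as the check described above: stop the queue only when fewer
 * than 'needed' descriptors are free. Returns true if the queue was stopped.
 */
static bool model_maybe_stop_tx(struct tx_model *m, unsigned int needed)
{
	assert(m);
	if (m->unused >= needed)
		return false;
	m->queue_stopped = true;
	return true;
}

/* The "Tx HW tail write condition": the tail is written only when the
 * queue is stopped or no further frames are pending.
 */
static bool model_tail_written(struct tx_model *m, unsigned int second_needed,
			       bool xmit_more)
{
	model_maybe_stop_tx(m, second_needed);
	return m->queue_stopped || !xmit_more;
}
```

With 25 free descriptors, a first-call requirement of 30 and xmit_more
set, a second threshold of MODEL_MAX_SKB_FRAGS + 4 (21 here) leaves the
queue running and the tail unwritten, while a second threshold of 30
stops the queue and forces the tail write.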

In ice_clean_tx_irq, if next_to_watch is not NULL, we clean the
descriptors that HW has set the DD bit on, as long as we have budget.
Because HW tail was not updated as described in the paragraph above, the
HW head will eventually run into the HW tail.

The next time through ice_xmit_frame_ring we make the initial call to
ice_maybe_stop_tx with another skb from the stack. This time we do not
have enough descriptors available and we return NETDEV_TX_BUSY to the
stack and end up setting next_to_watch to NULL.

This is where we are stuck. In ice_clean_tx_irq we never clean anything
because next_to_watch is always NULL, and in ice_xmit_frame_ring we never
update HW tail because we have already returned NETDEV_TX_BUSY to the
stack. Eventually we hit a tx_timeout.

This issue was fixed by making sure that the second call to
ice_maybe_stop_tx in ice_tx_map is passed a value that is >= the value
that was used on the initial call to ice_maybe_stop_tx in
ice_xmit_frame_ring. This was done by adding the following defines to
make the logic more clear and to reduce the chance of mucking this up
again:

ICE_CACHE_LINE_BYTES		64
ICE_DESCS_PER_CACHE_LINE	(ICE_CACHE_LINE_BYTES / \
				 sizeof(struct ice_tx_desc))
ICE_DESCS_FOR_CTX_DESC		1
ICE_DESCS_FOR_SKB_DATA_PTR	1
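
As a rough sketch of how defines like these could keep the two calls
consistent, the standalone model below builds both requirements from the
same named terms. The MODEL_* names, the two-word descriptor layout (16
bytes, so 4 descriptors per 64-byte cache line), the value 17 for
MODEL_MAX_SKB_FRAGS and the two helper formulas are assumptions for
illustration, not the driver's actual expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the Tx descriptor; two 64-bit words is an assumption. */
struct model_tx_desc {
	uint64_t buf_addr;
	uint64_t cmd_type_offset_bsz;
};

#define MODEL_CACHE_LINE_BYTES		64
#define MODEL_DESCS_PER_CACHE_LINE \
	((unsigned int)(MODEL_CACHE_LINE_BYTES / sizeof(struct model_tx_desc)))
#define MODEL_DESCS_FOR_CTX_DESC	1
#define MODEL_DESCS_FOR_SKB_DATA_PTR	1
#define MODEL_MAX_SKB_FRAGS		17	/* assumed value */

/* The padding only makes sense if descriptors tile a cache line exactly. */
static_assert(MODEL_CACHE_LINE_BYTES % sizeof(struct model_tx_desc) == 0,
	      "descriptor size must divide the cache line size");

/* Hypothetical first-call requirement: data descriptors plus padding. */
static unsigned int model_first_needed(unsigned int data_descs)
{
	return data_descs + MODEL_DESCS_PER_CACHE_LINE +
	       MODEL_DESCS_FOR_CTX_DESC;
}

/* Hypothetical second-call requirement: the same padding applied to the
 * largest data descriptor count this model allows (one per fragment plus
 * one for the skb data pointer).
 */
static unsigned int model_second_needed(void)
{
	return MODEL_MAX_SKB_FRAGS + MODEL_DESCS_FOR_SKB_DATA_PTR +
	       MODEL_DESCS_PER_CACHE_LINE + MODEL_DESCS_FOR_CTX_DESC;
}
```

Because both helpers share the padding terms, the second value cannot
fall below the first for any data descriptor count up to
MODEL_MAX_SKB_FRAGS + MODEL_DESCS_FOR_SKB_DATA_PTR in this model.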

ICE_CACHE_LINE_BYTES being 64 is an assumption made so we don't have to
figure this out on every pass through the Tx path. Instead I added a
sanity check in ice_probe that reads the cache line size from the
GLPCI_CNF2 register and prints a message if it's not 64 Bytes. This will
make it easier to file issues if they are seen when the cache line size
is not 64 Bytes.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-06 12:46:47 -08:00
..
accessibility
acpi pwm: Changes for v4.20-rc1 2018-11-02 11:22:45 -07:00
amba
android
ata libata: Apply NOLPM quirk for SAMSUNG MZ7TD256HAFV-000L9 2018-10-26 08:21:04 -06:00
atm atm: zatm: Fix empty body Clang warnings 2018-10-18 15:39:10 -07:00
auxdisplay The Compiler Attributes series 2018-11-01 18:34:46 -07:00
base mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock 2018-10-31 08:54:17 -07:00
bcma
block for-linus-20181102 2018-11-02 11:25:48 -07:00
bluetooth Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-10-24 14:43:41 +01:00
bus ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
cdrom gdrom: fix mistake in assignment of error 2018-10-25 11:17:40 -06:00
char RTC for 4.20 2018-10-27 09:24:24 -07:00
clk This time it looks like a quieter release cycle in the clk tree. I guess that's 2018-10-31 11:08:30 -07:00
clocksource Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-04 08:15:15 -08:00
connector
cpufreq cpufreq: remove unused arm_big_little_dt driver 2018-10-25 18:39:02 +02:00
cpuidle More power management updates for 4.20-rc1 2018-10-30 09:08:07 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2018-10-25 16:43:35 -07:00
dax
dca
devfreq
dio
dma pci-v4.20-changes 2018-10-25 06:50:48 -07:00
dma-buf
edac * skx_edac: Address translation for NVDIMMs (Tony Luck and Qiuxu Zhuo) 2018-11-02 11:17:22 -07:00
eisa
extcon
firewire
firmware Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-11-03 18:25:17 -07:00
fmc
fpga fpga: add devm_fpga_region_create 2018-10-16 11:13:50 +02:00
fsi iov_iter: Separate type from direction and use accessor functions 2018-10-24 00:41:07 +01:00
gnss
gpio pci-v4.20-changes 2018-10-25 06:50:48 -07:00
gpu drm, i915, amdgpu, bridge + core quirk 2018-11-02 10:58:20 -07:00
hid platform-drivers-x86 for v4.20-1 2018-11-01 08:42:21 -07:00
hsi
hv hv_balloon: Replace spin_is_locked() with lockdep 2018-10-15 20:54:17 +02:00
hwmon Lots of small changes to the IPMI driver. Most of the changes 2018-10-23 09:42:05 +01:00
hwspinlock
hwtracing
i2c i2c: Clear client->irq in i2c_device_remove 2018-10-31 23:33:34 +00:00
ide
idle Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
iio Staging/IIO patches for 4.20-rc1 2018-10-29 10:38:10 -07:00
infiniband Revert "mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks" 2018-10-26 16:25:19 -07:00
input Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
iommu mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
ipack
irqchip irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function 2018-11-01 12:38:48 +01:00
isdn Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
leds leds: gpio: set led_dat->gpiod pointer for OF defined GPIO leds 2018-10-26 20:51:36 +02:00
lightnvm
macintosh memblock: stop using implicit alignment to SMP_CACHE_BYTES 2018-10-31 08:54:16 -07:00
mailbox - Convert print users to use the %pOFn format specifier 2018-10-29 10:30:44 -07:00
mcb
md for-linus-20181102 2018-11-02 11:25:48 -07:00
media media updates for v4.20-rc1 2018-10-31 10:53:29 -07:00
memory
memstick
message
mfd chrome-platform for v4.20 2018-10-31 16:47:55 -07:00
misc Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
mmc Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
mtd This pull request contains updates for UBIFS: 2018-11-04 14:46:04 -08:00
mux This is the bulk of GPIO changes for the v4.20 series: 2018-10-23 08:45:05 +01:00
net ice: Fix tx_timeout in PF driver 2018-11-06 12:46:47 -08:00
nfc NFC: nfcmrvl_uart: fix OF child-node lookup 2018-10-23 13:28:53 -05:00
ntb ntb: idt: Alter the driver info comments 2018-11-01 10:33:12 -04:00
nubus
nvdimm libnvdimm for 4.20 2018-10-25 06:31:56 -07:00
nvme for-linus-20181102 2018-11-02 11:25:48 -07:00
nvmem nvmem: hide unused nvmem_find_cell_by_index function 2018-10-15 15:56:15 +02:00
of Devicetree fixes for v4.20-rc1: 2018-11-01 14:45:38 -07:00
opp
oprofile
parisc parisc: Add alternative coding infrastructure 2018-10-17 17:22:26 +02:00
parport
pci Merge branch 'xarray' of git://git.infradead.org/users/willy/linux-dax 2018-10-28 11:35:40 -07:00
pcmcia powerpc updates for 4.20 2018-10-26 14:36:21 -07:00
perf arm64 updates for 4.20: 2018-10-22 17:30:06 +01:00
phy USB/PHY patches for 4.20-rc1 2018-10-26 08:14:13 -07:00
pinctrl This is the bulk of GPIO changes for the v4.20 series: 2018-10-23 08:45:05 +01:00
platform platform-drivers-x86 for v4.20-1 2018-11-01 08:42:21 -07:00
pnp
power Devicetree updates for 4.20: 2018-10-26 12:09:58 -07:00
powercap Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-10-23 13:32:18 +01:00
pps
ps3
ptp ptp: drop redundant kasprintf() to create worker name 2018-10-28 19:20:06 -07:00
pwm pwm: lpss: Only set update bit if we are actually changing the settings 2018-10-16 13:16:15 +02:00
rapidio
ras
regulator regulator: Regulator updates for next release 2018-10-23 01:54:44 +01:00
remoteproc remoteproc: qcom: q6v5-mss: Register segments/dumpfn for coredump 2018-10-19 12:54:03 -07:00
reset ARM: SoC driver updates for 4.17 2018-10-29 15:16:01 -07:00
rpmsg
rtc rtc: sc27xx: Always read normal alarm when registering RTC device 2018-10-25 02:35:42 +02:00
s390 s390/qeth: report 25Gbit link speed 2018-11-03 10:44:06 -07:00
sbus
scsi Kbuild updates for v4.20 (2nd) 2018-11-03 10:47:33 -07:00
sfi mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
sh
siox
slimbus
sn
soc soc: ti: QMSS: Fix usage of irq_set_affinity_hint 2018-11-02 11:22:09 -07:00
soundwire
spi - New Drivers 2018-10-25 06:19:15 -07:00
spmi
ssb
staging media updates for v4.20-rc1 2018-10-31 10:53:29 -07:00
target SCSI misc on 20181102 2018-11-03 10:34:03 -07:00
tc
tee
thermal Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux 2018-10-31 11:28:12 -07:00
thunderbolt
tty mm: remove include/linux/bootmem.h 2018-10-31 08:54:16 -07:00
uio
usb Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
uwb
vfio VFIO updates for v4.20 2018-10-31 11:01:38 -07:00
vhost virtio, vhost: fixes, tweaks 2018-11-01 14:42:49 -07:00
video fbdev changes for v4.20: 2018-10-31 11:41:37 -07:00
virt
virtio virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON 2018-10-24 20:57:55 -04:00
visorbus
vlynq
vme
w1 w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size). 2018-10-15 20:50:32 +02:00
watchdog watchdog: ts4800: release syscon device node in ts4800_wdt_probe() 2018-10-22 10:16:28 +02:00
xen Merge branch 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2018-11-01 19:58:52 -07:00
zorro
Kconfig
Makefile