linux/drivers
Lyude 82922da391 drm/dp_helper: Retry aux transactions on all errors
This is part of a patch series to migrate all of the workarounds for
commonly seen behavior from bad sinks in intel_dp_dpcd_read_wake() to
drm's DP helper.

We cannot rely on sinks NACKing or deferring when they can't receive
transactions, nor can we rely on any other sort of consistent error to
know when we should stop retrying. As such, we need to just retry
unconditionally on errors. We also make sure here to return the error we
encountered during the first transaction, since it's possible that
retrying the transaction might return a different error than we had
originally.
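
As a rough illustration of the retry policy described above (this is not
the code from this patch), a userspace sketch might look like the
following. The names aux_transfer_retry(), aux_transfer_fn, fake_sink and
fake_transfer(), the retry budget of 32, and the treatment of any
non-negative return as success are all assumptions made up for the
example. The optional delay hook stands in for a short wait between
attempts.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: returns bytes transferred, or a negative errno. */
typedef int (*aux_transfer_fn)(void *ctx, void *buf, size_t len);

/* Assumed retry budget, chosen only for illustration. */
#define AUX_MAX_RETRIES 32

/*
 * Retry the transfer on *any* error, since the sink cannot be trusted to
 * NACK or defer consistently. If every attempt fails, report the error
 * from the first attempt, because later attempts may fail differently.
 */
static int aux_transfer_retry(aux_transfer_fn transfer, void *ctx,
			      void *buf, size_t len, void (*delay)(void))
{
	int first_err = 0;
	int i;

	for (i = 0; i < AUX_MAX_RETRIES; i++) {
		int ret = transfer(ctx, buf, len);

		if (ret >= 0)
			return ret;	/* success: stop retrying */

		if (first_err == 0)
			first_err = ret;	/* remember only the first error */

		if (delay)
			delay();	/* give the sink time to recover */
	}

	return first_err;
}

/* Hypothetical sink that fails its first `failures` transfers. */
struct fake_sink {
	int calls;
	int failures;
};

static int fake_transfer(void *ctx, void *buf, size_t len)
{
	struct fake_sink *sink = ctx;

	(void)buf;
	sink->calls++;
	if (sink->calls <= sink->failures)
		/* First failure times out, later ones report a different error. */
		return sink->calls == 1 ? -ETIMEDOUT : -EIO;

	return (int)len;
}
```

With a fake_sink whose failures field is 3, the fourth call succeeds and
the helper returns the transfer length. With a sink that never recovers,
the helper gives up after AUX_MAX_RETRIES attempts and returns
-ETIMEDOUT, the first error, rather than the -EIO seen on later attempts.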

This, along with the previous patch, works around a weird bug with the
ThinkPad T560 and its dock. When resuming the laptop, it appears that
there's a short period of time where we're unable to complete any aux
transactions, as they all immediately time out. The only machine I'm able
to reproduce this on is the T560 as other production Skylake models seem
to be fine. The period during which AUX transactions fail appears to be
around 22ms long. AFAIK, the dock for the T560 never actually turns off;
the only difference is that it's in SST mode at the start of the resume
process, so it's unclear as to why it would need so much time to come
back up.

There's been a discussion of this issue going on for a while on the
intel-gfx mailing list that, in addition to developers from Intel, has
also included one of Intel's hardware engineers:

http://www.spinics.net/lists/intel-gfx/msg88831.html
http://www.spinics.net/lists/intel-gfx/msg88410.html

We've already looked into a couple of possible explanations for the
problem:

- Calling intel_dp_mst_resume() before
  intel_runtime_pm_enable_interrupts(). This was the first fix I tried,
  and while it worked it definitely wasn't the right fix. This worked
  because DP aux transactions don't actually require interrupts to work:

	static uint32_t
	intel_dp_aux_wait_done(struct intel_dp *intel_dp, bool has_aux_irq)
	{
		struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
		struct drm_device *dev = intel_dig_port->base.base.dev;
		struct drm_i915_private *dev_priv = dev->dev_private;
		i915_reg_t ch_ctl = intel_dp->aux_ch_ctl_reg;
		uint32_t status;
		bool done;

	#define C (((status = I915_READ_NOTRACE(ch_ctl)) & DP_AUX_CH_CTL_SEND_BUSY) == 0)
		if (has_aux_irq)
			done = wait_event_timeout(dev_priv->gmbus_wait_queue, C,
						  msecs_to_jiffies_timeout(10));
		else
			done = wait_for_atomic(C, 10) == 0;
		if (!done)
			DRM_ERROR("dp aux hw did not signal timeout (has irq: %i)!\n",
				  has_aux_irq);
	#undef C

		return status;
	}

  When there are no interrupts enabled, we end up timing out on the
  wait_event_timeout() call, which causes us to check the DP status
  register once to see if the transaction was successful or not. Since
  this adds a 10ms delay to each aux transaction, it ends up adding a
  long enough delay to the resume process for aux transactions to become
  functional again. This gave us the illusion that enabling interrupts
  had something to do with making things work again, and put me on the
  wrong track for a while.

- Interrupts occurring when we try to perform the aux transactions
  required to put the dock back into MST mode. This isn't the problem,
  as the only interrupts I've observed that come during this timeout
  period are from the snd_hda_intel driver, and disabling that driver
  doesn't appear to change the behavior at all.

- Skylake's PSR block causing issues by performing aux transactions
  while we try to bring the dock out of MST mode. Disabling PSR through
  i915's command line options doesn't seem to change the behavior
  either, nor does preventing the DMC firmware from being loaded.

Since this investigation went on for about 2 weeks, we decided it would
be better for the time being to just work around this issue by making
sure AUX transactions wait a short period of time before retrying.

Signed-off-by: Lyude <cpaul@redhat.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460559513-32280-3-git-send-email-cpaul@redhat.com
2016-04-22 18:51:54 +02:00
..
accessibility
acpi Merge branch 'acpi-processor' 2016-04-02 01:17:36 +02:00
amba
android
ata Merge branch 'for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2016-03-18 20:06:46 -07:00
atm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-03-17 21:38:27 -07:00
auxdisplay
base PM / wakeirq: fix wakeirq setting after wakup re-configuration from sysfs 2016-04-07 22:23:47 +02:00
bcma
block Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client 2016-04-07 16:34:26 -07:00
bluetooth Bluetooth: btmrvl_sdio: fix firmware activation failure 2016-03-10 19:51:29 +01:00
bus arm[64] perf updates for 4.6: 2016-03-21 13:14:16 -07:00
cdrom
char Revert "ppdev: use new parport device model" 2016-03-25 09:02:13 -07:00
clk clk: qcom: ipq4019: add some fixed clocks for ddrppl and fepll 2016-03-29 16:31:16 -07:00
clocksource Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-24 10:32:42 -07:00
connector
cpufreq Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'acpi-cppc' 2016-04-08 21:46:05 +02:00
cpuidle cpuidle: menu: Fall back to polling if next timer event is near 2016-03-21 15:50:28 +01:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-03-23 06:12:39 -07:00
dca
devfreq PM / devfreq: Spelling s/frequnecy/frequency/ 2016-03-17 02:30:16 +01:00
dio
dma asm-generic changes for 4.6 2016-03-24 23:13:48 -07:00
dma-buf dma-buf: Update docs for SYNC ioctl 2016-03-21 09:26:45 +01:00
edac EDAC queue for 4.6 2016-03-16 08:36:55 -07:00
eisa
extcon
firewire IEEE 1394 subsystem patch: 2016-03-25 08:52:25 -07:00
firmware firmware: qemu_fw_cfg.c: hold ACPI global lock during device access 2016-04-07 15:16:40 +03:00
fmc
fpga
gpio gpio: pca953x: Use correct u16 value for register word write 2016-04-08 11:49:47 +02:00
gpu drm/dp_helper: Retry aux transactions on all errors 2016-04-22 18:51:54 +02:00
hid drivers/hid/uhid.c: check write() bitness using in_compat_syscall 2016-03-22 15:36:02 -07:00
hsi
hv Char/Misc patches for 4.6-rc1 2016-03-17 13:47:50 -07:00
hwmon hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated 2016-03-27 10:37:48 -07:00
hwspinlock
hwtracing
i2c i2c: jz4780: really prevent potential division by zero 2016-04-09 08:36:44 +02:00
ide ide: palm_bk3710: test clock rate to avoid division by 0 2016-03-20 16:59:27 -04:00
idle intel_idle: Add KBL support 2016-04-07 22:11:08 +02:00
iio Second set of IIO fixes for the 4.6 cycle. 2016-04-04 13:45:10 -07:00
infiniband Revert "ib_srpt: Convert to percpu_ida tag allocation" 2016-04-07 18:16:20 -07:00
input Merge branch 'akpm' (patches from Andrew) 2016-03-25 16:59:11 -07:00
iommu iommu/vt-d: Silence an uninitialized variable warning 2016-04-07 14:51:47 +02:00
ipack
irqchip irqchip/mbigen: Make CONFIG_HISILICON_IRQ_MBIGEN a hidden option 2016-03-23 12:02:29 +01:00
isdn Drivers: isdn: hisax: isac.c: Fix assignment and check into one expression. 2016-03-27 22:38:12 -04:00
leds platform-drivers-x86 for 4.6-1 2016-03-23 17:20:59 -07:00
lguest
lightnvm lightnvm: do not load L2P table if not supported 2016-03-18 18:10:38 -07:00
macintosh
mailbox Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'acpi-cppc' 2016-04-08 21:46:05 +02:00
mcb
md Merge tag 'md/4.6-rc2-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2016-04-09 11:23:27 -07:00
media media fixes for v4.6-rc2 2016-04-05 06:47:50 -07:00
memory MTD updates for v4.6 2016-03-24 19:57:15 -07:00
memstick drivers/memstick/host/r592.c: avoid gcc-6 warning 2016-03-25 16:37:42 -07:00
message
mfd - New Drivers 2016-03-18 10:15:11 -07:00
misc mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
mmc MMC host: 2016-04-10 17:38:55 -07:00
mtd mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
net tuntap: restore default qdisc 2016-04-08 15:52:45 -04:00
nfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
ntb NTB: Remove _addr functions from ntb_hw_amd 2016-03-26 11:44:33 -04:00
nubus
nvdimm Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2016-04-09 14:05:45 -07:00
nvme nvme: avoid cqe corruption when update at the same time as read 2016-03-22 10:27:29 -06:00
nvmem
of DeviceTree updates for 4.6: 2016-03-19 15:15:07 -07:00
oprofile mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
parisc PCI changes for the v4.6 merge window: 2016-03-16 14:45:55 -07:00
parport
pci Revert "PCI: dra7xx: Mark driver as broken" 2016-03-22 07:50:11 -05:00
pcmcia pcmcia: db1xxx_ss: fix last irq_to_gpio user 2016-03-29 22:48:53 +02:00
perf drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree 2016-03-21 11:36:17 +00:00
phy
pinctrl Revert "Revert "pinctrl: lantiq: Implement gpio_chip.to_irq"" 2016-04-01 15:21:27 +02:00
platform Convert straggling drivers to new six-argument get_user_pages() 2016-04-02 18:35:05 -05:00
pnp PNP / ACPI: add ACPI_RESOURCE_TYPE_SERIAL_BUS as a valid type 2016-03-09 23:50:55 +01:00
power Power management and ACPI material for v4.6-rc1, part 2 2016-03-25 16:55:37 -07:00
powercap powercap: intel_rapl: Add missing Haswell model 2016-04-05 03:44:48 +02:00
pps
ps3
ptp Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-15 12:13:56 -07:00
pwm pwm: omap-dmtimer: Add debug message for effective period and duty cycle 2016-03-23 17:11:48 +01:00
rapidio Convert straggling drivers to new six-argument get_user_pages() 2016-04-02 18:35:05 -05:00
ras
regulator - New Drivers 2016-03-18 10:15:11 -07:00
remoteproc remoteproc: st: fix check of syscon_regmap_lookup_by_phandle() return value 2016-03-28 16:19:00 -07:00
reset
rpmsg
rtc RTC for 4.6 #2 2016-03-24 22:49:08 -07:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2016-04-01 07:15:54 -05:00
sbus
scsi SCSI fixes on 20160408 2016-04-09 12:00:42 -07:00
sfi
sh
sn
soc ARM: SoC driver updates for v4.6 2016-03-20 15:40:32 -07:00
spi Merge remote-tracking branches 'spi/fix/omap2' and 'spi/fix/rockchip' into spi-linus 2016-04-04 10:05:49 -07:00
spmi
ssb
staging Staging / IIO driver fixes for 4.6-rc3 2016-04-09 12:09:37 -07:00
target target: add a new add_wwn_groups fabrics method 2016-03-30 20:06:44 -07:00
tc
thermal Thermal: Ignore invalid trip points 2016-03-18 14:10:57 +08:00
thunderbolt
tty tty: Fix merge of "tty: Refactor tty_open()" 2016-03-31 20:49:39 -07:00
uio
usb USB fixes for 4.6-rc3 2016-04-09 12:23:02 -07:00
uwb
vfio VFIO updates for v4.6-rc1 2016-03-17 13:05:09 -07:00
vhost Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-03-22 12:41:14 -07:00
video mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
virt
virtio virtio: virtio 1.0 cs04 spec compliance for reset 2016-04-07 15:16:39 +03:00
vlynq
vme
w1
watchdog hpwdt: use nmi_panic() when kernel panics in NMI handler 2016-03-22 15:36:02 -07:00
xen xen/events: Mask a moving irq 2016-04-04 11:18:00 +01:00
zorro
Kconfig
Makefile