linux/drivers/gpu/drm
Chris Wilson 2b49e7210e drm/i915: Disable per-engine reset for Broxton
Triggering a GPU reset for one engine affects another, notably by
corrupting the context status buffer (CSB), which effectively loses
track of inflight requests.

Adding a few printks:
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ad41836fa5e5..a969456bc0fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1953,6 +1953,7 @@ int i915_reset_engine(struct intel_engine_cs *engine)
                goto out;
        }

+       pr_err("Resetting %s\n", engine->name);
        ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine));
        if (ret) {
                /* If we fail here, we expect to fallback to a global reset */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 716e5c9ea222..a72bc35d0870 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -355,6 +355,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
                                execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
                        port_set(&port[n], port_pack(rq, count));
                        desc = execlists_update_context(rq);
+                       pr_err("%s: in (rq=%x) ctx=%d\n", engine->name, rq->global_seqno, upper_32_bits(desc));
                        GEM_DEBUG_EXEC(port[n].context_id = upper_32_bits(desc));
                } else {
                        GEM_BUG_ON(!n);
@@ -594,9 +595,23 @@ static void intel_lrc_irq_handler(unsigned long data)
                        if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
                                continue;

+                       pr_err("%s: out CSB (%x head=%d, tail=%d), ctx=%d, rq=%d\n",
+                                       engine->name,
+                                       readl(csb_mmio),
+                                       head, tail,
+                                       readl(buf+2*head+1),
+                                       port->context_id);
+
                        /* Check the context/desc id for this event matches */
-                       GEM_DEBUG_BUG_ON(readl(buf + 2 * head + 1) !=
-                                        port->context_id);
+                       if (readl(buf + 2 * head + 1) != port->context_id) {
+                               pr_err("%s: BUG CSB (%x head=%d, tail=%d), ctx=%d, rq=%d\n",
+                                               engine->name,
+                                               readl(csb_mmio),
+                                               head, tail,
+                                               readl(buf+2*head+1),
+                                               port->context_id);
+                               BUG();
+                       }

                        rq = port_unpack(port, &count);
                        GEM_BUG_ON(count == 0);

Results in:

[ 6423.006602] Resetting rcs0
[ 6423.009080] rcs0: in (rq=fffffe70) ctx=1
[ 6423.009216] rcs0: in (rq=fffffe6f) ctx=3
[ 6423.009542] rcs0: out CSB (2 head=1, tail=2), ctx=3, rq=3
[ 6423.009619] Resetting bcs0
[ 6423.009980] rcs0: BUG CSB (0 head=1, tail=2), ctx=0, rq=3

Note that this bug may affect all machines and not just Broxton;
Broxton is just the first machine on which I have confirmed it.

Fixes: 142bc7d99b ("drm/i915: Modify error handler for per engine hang recovery")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-13-chris@chris-wilson.co.uk

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:48 +02:00
amd Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
arc drm: Convert atomic drivers from CRTC .disable() to .atomic_disable() 2017-06-30 14:53:15 +02:00
arm drm/mali: Use new atomic iterator macros 2017-07-13 09:54:12 +02:00
armada drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
ast drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
atmel-hlcdc drm/atmel-hlcdc: Handle drm_atomic_helper_swap_state failure 2017-07-26 13:22:41 +02:00
bochs drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
bridge Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
cirrus drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
etnaviv main drm pull for v4.13 2017-07-09 18:48:37 -07:00
exynos drm/atomic: implement drm_atomic_helper_commit_tail for runtime_pm users 2017-07-26 13:45:08 +02:00
fsl-dcu drm: Add old state pointer to CRTC .enable() helper function 2017-06-30 14:53:14 +02:00
gma500 drm/gma500: remove an unneeded NULL check 2017-06-28 19:17:38 +02:00
hisilicon drm/hisilicon: fix build error without fbdev emulation 2017-07-26 13:45:09 +02:00
i2c drm: handle HDMI 2.0 VICs in AVI info-frames 2017-07-14 21:23:54 +03:00
i810 drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
i915 drm/i915: Disable per-engine reset for Broxton 2017-07-27 09:38:48 +02:00
imx Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
lib
mediatek drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
meson drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
mga Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
mgag200 drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
msm drm/msm: Handle drm_atomic_helper_swap_state failure 2017-07-26 13:22:42 +02:00
mxsfb drm/mxsfb: Use gem_free_object_unlocked 2017-07-18 08:40:54 +02:00
nouveau Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
omapdrm drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
panel drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
pl111 drm/pl111: Use gem_free_object_unlocked 2017-07-18 08:40:54 +02:00
qxl drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
r128 drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
radeon Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
rcar-du drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
rockchip drm/atomic: implement drm_atomic_helper_commit_tail for runtime_pm users 2017-07-26 13:45:08 +02:00
savage drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
selftests
shmobile drm/shmob: Drop drm_vblank_cleanup 2017-06-22 08:41:15 +02:00
sis drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
sti drm: handle HDMI 2.0 VICs in AVI info-frames 2017-07-14 21:23:54 +03:00
stm drm/stm: ltdc: Add panel-bridge support 2017-07-18 12:06:42 +05:30
sun4i drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
tdfx drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
tegra drm/tegra: Handle drm_atomic_helper_swap_state failure 2017-07-26 13:22:42 +02:00
tilcdc drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
tinydrm drm/tinydrm: Add RePaper e-ink driver 2017-07-14 19:30:08 +02:00
ttm drm/ttm: Fix use-after-free in ttm_bo_clean_mm 2017-07-03 16:25:43 -04:00
udl drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
vc4 Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
vgem drm/vgem: add compat_ioctl support 2017-07-17 21:08:31 +02:00
via drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
virtio drm/<drivers>: Drop fbdev info flags 2017-07-26 13:22:40 +02:00
vmwgfx Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
zte drm/zte: Use gem_free_object_unlocked 2017-07-18 08:40:54 +02:00
Kconfig
Makefile Merge tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-next 2017-06-16 09:33:43 +10:00
ati_pcigart.c
drm_agpsupport.c
drm_atomic.c drm: rename, adjust and export drm_atomic_replace_property_blob 2017-07-14 15:53:06 +02:00
drm_atomic_helper.c Merge airlied/drm-next into drm-intel-next-queued 2017-07-27 09:33:49 +02:00
drm_auth.c
drm_blend.c
drm_bridge.c drm: Introduce drm_bridge_mode_valid() 2017-05-30 08:37:50 +02:00
drm_bufs.c switch compat_drm_mapbufs() to drm_ioctl_kernel() 2017-07-04 13:16:26 -04:00
drm_cache.c
drm_color_mgmt.c drm: More links for gamma support helpers 2017-06-20 12:13:11 +02:00
drm_connector.c Linux 4.12-rc7 2017-06-27 08:28:30 +10:00
drm_context.c
drm_crtc.c
drm_crtc_helper.c
drm_crtc_helper_internal.h drm: Add drm_{crtc/encoder/connector}_mode_valid() 2017-05-30 08:37:24 +02:00
drm_crtc_internal.h
drm_debugfs.c
drm_debugfs_crc.c drm/crc: Only open CRC on atomic drivers when the CRTC is active. 2017-07-17 16:34:51 +02:00
drm_dma.c
drm_dp_aux_dev.c drm_dp_aux_dev: switch to read_iter/write_iter 2017-07-08 20:51:46 -04:00
drm_dp_dual_mode_helper.c
drm_dp_helper.c drm/dp: start a DPCD based DP sink/branch device quirk database 2017-05-29 13:43:26 +03:00
drm_dp_mst_topology.c Linux 4.13-rc2 2017-07-27 08:15:43 +10:00
drm_drv.c drm: inhibit drm drivers register to uninitialized drm core 2017-07-11 12:03:11 +02:00
drm_dumb_buffers.c
drm_edid.c drm/edid: parse ycbcr 420 deep color information 2017-07-14 21:23:54 +03:00
drm_edid_load.c
drm_encoder.c
drm_encoder_slave.c
drm_fb_cma_helper.c drm: Convert CMA fbdev console suspend helpers to use bool 2017-06-20 16:23:40 +02:00
drm_fb_helper.c drm/fb-helper: Support deferred setup 2017-07-26 13:45:07 +02:00
drm_file.c Merge remote-tracking branch 'airlied/drm-next' into drm-misc-next 2017-06-27 09:18:17 -04:00
drm_flip_work.c
drm_fourcc.c
drm_framebuffer.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_gem.c drm: Don't complain too much about struct_mutex. 2017-07-18 09:17:22 +02:00
drm_gem_cma_helper.c drm: Update docs around gem_free_object 2017-07-26 13:22:39 +02:00
drm_global.c
drm_hashtab.c
drm_info.c
drm_internal.h Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_ioc32.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_ioctl.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_irq.c drm/doc: Polish irq helper documentation 2017-06-01 08:02:14 +02:00
drm_kms_helper_common.c
drm_legacy.h switch compat_drm_mapbufs() to drm_ioctl_kernel() 2017-07-04 13:16:26 -04:00
drm_lock.c
drm_memory.c
drm_mipi_dsi.c drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
drm_mm.c
drm_mode_config.c
drm_mode_object.c
drm_modes.c drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
drm_modeset_helper.c
drm_modeset_lock.c drm: Improve kerneldoc for drm_modeset_lock 2017-07-26 13:45:08 +02:00
drm_of.c drm: Convert to using %pOF instead of full_name 2017-07-26 13:45:06 +02:00
drm_panel.c
drm_pci.c drm/pci: Deprecate drm_pci_init/exit completely 2017-06-20 10:41:03 +02:00
drm_plane.c
drm_plane_helper.c
drm_prime.c
drm_print.c
drm_probe_helper.c drm: add helper to validate YCBCR420 modes 2017-07-14 21:23:54 +03:00
drm_property.c drm: rename, adjust and export drm_atomic_replace_property_blob 2017-07-14 15:53:06 +02:00
drm_rect.c
drm_scatter.c
drm_scdc_helper.c
drm_simple_kms_helper.c drm/simple-kms-helper: Fix the check for the mismatch between plane and CRTC enabled. 2017-07-13 09:44:51 +02:00
drm_syncobj.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_sysfs.c
drm_trace.h
drm_trace_points.c
drm_vblank.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
drm_vm.c
drm_vma_manager.c