Right after runtime resume we know that we can re-enable DC5, since we
just disabled DC9 and power well 2 is disabled. So enable DC5 explicitly
instead of delaying this until the next time we disable power well 2.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-5-git-send-email-imre.deak@intel.com
After suspend-to-ram or -disk we don't know what power state the display
HW will be, DC0 or DC9 are both possible states, so reset the software
DC state tracking in these cases. This gets rid of 'DC state mismatch'
error messages during resuming from ram or disk where we expected to be
in DC9 (as set by the suspend handler) but we are in DC0.
v2:
- Remove extra WS in gen9_sanitize_dc_state() (Bob)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-4-git-send-email-imre.deak@intel.com
Initially we thought that the platform specific suspend/resume sequences
can be shared between the runtime and system suspend/resume handlers.
This turned out to be not true, we have quite a few differences on most
of the platforms. This was realized already earlier by Paulo who
inlined the platform specific resume_prepare handlers. We have the
same problem with the corresponding suspend_complete handlers, there are
platform differences that make it unfeasible to share the code between
the runtime and system suspend paths. Also now we call functions that
need to be paired like hsw_enable_pc8()/hsw_disable_pc8() from different
levels of the call stack, which is confusing. Fix this by inlining the
suspend_complete handlers too.
This is also needed by the next patch that removes a redundant
uninit/init call during system suspend/resume on BXT.
No functional change.
CC: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
[s/uninline/inline in the commit message]
Link: http://patchwork.freedesktop.org/patch/msgid/1461173277-16090-2-git-send-email-imre.deak@intel.com
In commit 5f304c8736 ("drm/i915/kbl: Reset secondary power well requests
left on by DMC/KVMR") I forgot about the fact that SKL==KBL most of the
time and that a secondary MISC IO power well request left on by the DMC is
"expected". Tune down the corresponding WARN to be a debug message. This
was caught by CI suspend tests.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461060036-19043-1-git-send-email-imre.deak@intel.com
It was noticed on bug #94087 that module parameter
i915.edp_vswing=2 that should override the VBT setting
to use default voltage swing (400 mV) was not applied
for Broadwell.
This patch provides a fix for this by checking if default
i.e. higher voltage swing is requested to be used and
applies the DDI translations table for DP instead of eDP
(low vswing) table.
v2: Combine two if statements into one (Jani)
v3: Change dev_priv->edp_low_vswing to use dev_priv->vbt.edp.low_vswing
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94087
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461155942-7749-1-git-send-email-mika.kahola@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The newly-introduced function i915_gem_object_pin_map() returns an
ERR_PTR (not NULL) if the pin-and-map opertaion fails, so that's what we
must check for. And it's nicer not to assign such a pointer-or-error to
a structure being filled in until after it's been validated, so we
should keep it local and avoid exporting a bogus pointer. Also, for
clarity and symmetry, we should clear 'virtual_start' along with 'vma'
when unmapping a ringbuffer.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Now that we keep the GuC client process descriptor permanently mapped,
we don't really need to keep a local copy of the GuC's work-queue-head.
So we can simplify the code a little by not doing this.
Signed-off-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Don't use kmap_atomic() for doorbell & process descriptor access.
This patch fixes the BUG shown below, where the thread could sleep
while holding a kmap_atomic mapping. In order not to need to call
kmap_atomic() in this code path, we now set up a permanent kernel
mapping of the shared doorbell and process-descriptor page, and
use that in all doorbell and process-descriptor related code.
BUG: scheduling while atomic: gem_close_race/1941/0x00000002
Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii
i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt
sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid
hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1
e1000e ptp psmouse pps_core ahci libahci
CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G U 4.4.0-160121+ #123
Hardware name: Intel Corporation Skylake Client platform/Skylake AIO
DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
Call Trace:
[<ffffffff81280d02>] dump_stack+0x4b/0x79
[<ffffffff810c203a>] __schedule_bug+0x41/0x4f
[<ffffffff814ec808>] __schedule+0x5a8/0x690
[<ffffffff814ec927>] schedule+0x37/0x80
[<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
[<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
[<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
[<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
[<ffffffff814eef9b>] usleep_range+0x3b/0x40
[<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
[<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
[<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
[<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
[<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
[<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
[<ffffffff8111eac2>] ? __fget+0x72/0xb0
[<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
[<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
------------[ cut here ]------------
v4:
Only tear down doorbell & kunmap() client object if we actually
succeeded in allocating a client object (Tvrtko Ursulin)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93847
Original-version-by: Alex Dai <yu.dai@intel.com>
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvtrko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Since we can only swap out shmemfs objects, those are the only ones that
can influence the ability of the shrinker to free pages. Currently, all
non-shmemfs objects have a raised pages_pin_count to protect them from
the shrinker, so this just makes the logic for can_release_pages()
clearer (and safer in future so that we don't over estimate our ability
to free up pages from future non-swappable objects).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Inside the shrinker we call can_release_pages() to indicate whether or
not we can make forward progress in freeing up memory by unbinding that
object. When adding our report to oom, we should be using the same
logic.
Whilst here, change the reporting from bytes to pages so that it looks
smaller to the user!, is consistent with the neighbouring oom report
itself which displays counts in pages, and makes the unsigned long
overflow less likely.
v2: Split oversized format string into two lines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When iterating over the bound list, we expect all objects there to have
their pages pinned (by the bound VMA). So only report those objects with
additional pin count on their pages as "pinned". These should be those
objects used for display and hardware access.
Reported-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461150592-27818-1-git-send-email-chris@chris-wilson.co.uk
Looks like DPF was not implemented for gen8+ but the IER and IMR
are still enabled on initialization.
Since there is no code to handle this interrupt, gate the irq
enablement behind HAS_L3_DPF in case the feature gets enabled
in the future.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Since commit 30d9aa4265 ("drm/i915: Read sink_count dpcd always"),
the status of a DP connector depends on its sink count value.
However, some eDP panels don't set that value appropriately,
causing them to be reported as disconnected.
Fix this by ignoring sink count for eDP.
v2: Rephrased commit message. (Ander)
In case of eDP, returning status as connected if DPCD
read succeeds to avoid any further operations.
Fixes: 30d9aa4265 ("drm/i915: Read sink_count dpcd always")
Cc: Ander Conselvan De Oliveira <conselvan2@gmail.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460444034-22320-1-git-send-email-shubhangi.shrivastava@intel.com
In commit 7d23e3c37b ("drm/i915: Cleaning up intel_dp_hpd_pulse") some
much needed clean-up was done, but unfortunately part of the change
broke DP MST. The real issue was setting the connector state to
disconnected in the MST case, which is good, but the code then (after
a goto) checks if the connector state is not connected and shuts down
MST if this is the case, which is bad. With this change both SST and
MST seem to be happy.
v2: Add removed check further up in the function to be sure that MST
is shut down when we lose the link. (Ander)
Fixes: commit 7d23e3c37b ("drm/i915: Cleaning up intel_dp_hpd_pulse")
cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
cc: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
cc: Nathan D Ciobanu <nathan.d.ciobanu@intel.com>
Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Lyude <cpaul@redhat.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460394684-7036-1-git-send-email-jim.bride@linux.intel.com
Do not use magic numbers, do not prefix stuff with "PCI_", do not
declare registers in implementation files. Also move the PCI
registers under correct comment in i915_reg.h.
v2:
- Consistently use BSM (not BDSM or other variants from PRM) (Chris)
- Also include register address to help identify the register (Chris)
v3:
- Refer to register value as *_val instead of *_reg (Chris)
v4:
- Make style checker happy
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We need to kunmap pt_vaddr and not pt itself, otherwise we end up
mapping a bunch of pages without ever unmapping them.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: d1c54acd67 ("drm/i915/gtt: Introduce kmap|kunmap for dma page")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460476663-24890-4-git-send-email-matthew.auld@intel.com
The power cycle delay starts _after_ turning off the panel power. Do the
msleep after frobbing the pmic panel power gpio.
Also toss in a FIXME about optimizing away needless waits.
Cc: Shobhit Kumar <shobhit.kumar@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: fc45e82199 ("drm/i915: Use the CRC gpio for panel enable/disable")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460996271-29795-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Shobhit Kumar <shobhit.kumar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Currently we're trying to define HSW/BDW power wells by what's not
included. Let's do it the other way around, so that you can actually
tell when the power well would get enabled. This will also allow us to
add new power domains without accidentally adding it to the HSW/BDW
display power domains.
The current set of domains looks rather buggy even:
- POWER_DOMAIN_MODESET is included in the display power well needlessly
- DDI-B to DDI-E were not part of the display power well when they
should be
So let's fix that up while at it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460977348-32260-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Currently we're using POWER_DOMAIN_MASK as the power domains for the
display power well on VLV/CHV. That includes all power domains even
though the disp2d/pipe-a power well is not needed for a lot of things.
Let's reduce these to what we actually need.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460977348-32260-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
While we disable runtime PM and with that display power well support if
the DMC firmware isn't loaded, we still want to disable power wells
during system suspend and driver unload. So drop/reacquire the
corresponding power refcount during suspend/resume and driver unloading.
This also means we have to check if DMC is not loaded and skip enabling
DC states in the power well code.
v2:
- Reuse intel_csr_ucode_suspend() in intel_csr_ucode_fini() instead of
opencoding the former. (Chris)
- Add docbook comment to the public resume and suspend functions.
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460980101-14713-1-git-send-email-imre.deak@intel.com
The driver's VDD on/off logic assumes that whenever the VDD is on we
also hold an AUX power domain reference. Since BIOS can leave the VDD on
during booting and resuming and on DDI platforms we won't take a
corresponding power reference, the above assumption won't hold on those
platforms and an eventual delayed VDD off work will do an extraneous AUX
power domain put resulting in a refcount underflow. Fix this the same
way we did this for non-DDI DP encoders:
commit 6d93c0c417 ("drm/i915: fix VDD state tracking after system
resume")
At the same time call the DP encoder suspend handler the same way as the
non-DDI DP encoders do to flush any pending VDD off work. Leaving the
work running may cause a HW access where we don't expect this (at a point
where power domains are suspended already).
While at it remove an unnecessary function call indirection.
This fixed for me AUX refcount underflow problems on BXT during
suspend/resume.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460963062-13211-4-git-send-email-imre.deak@intel.com
During system resume we depended on pci_enable_device() also putting the
device into PCI D0 state. This won't work if the PCI device was already
enabled but still in D3 state. This is because pci_enable_device() is
refcounted and will not change the HW state if called with a non-zero
refcount. Leaving the device in D3 will make all subsequent device
accesses fail.
This didn't cause a problem most of the time, since we resumed with an
enable refcount of 0. But it fails at least after module reload because
after that we also happen to leak a PCI device enable reference: During
probing we call drm_get_pci_dev() which will enable the PCI device, but
during device removal drm_put_dev() won't disable it. This is a bug of
its own in DRM core, but without much harm as it only leaves the PCI
device enabled. Fixing it is also a bit more involved, due to DRM
mid-layering and because it affects non-i915 drivers too. The fix in
this patch is valid regardless of the problem in DRM core.
v2:
- Add a code comment about the relation of this fix to the freeze/thaw
vs. the suspend/resume phases. (Ville)
- Add a code comment about the inconsistent ordering of set power state
and device enable calls. (Chris)
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
CC: stable@vger.kernel.org
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460979954-14503-1-git-send-email-imre.deak@intel.com
The workaround added in
commit c6782b76d3 ("drm/i915/gen9: Reset secondary power well
requests left on by DMC/KVMR")
needs to be applied on Kabylake too as shown by the corresponding
timeout errors about power well 1 and MISC IO power well disabling in
the latest CI run.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460748778-4484-1-git-send-email-imre.deak@intel.com
When a vblank wait times out in intel_atomic_wait_for_vblanks() we just
get a cryptic 'WARN_ON(!ret)' backtrace in dmesg. Repace it with
something that tells you what actually happened.
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460978973-24945-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
The legacy cursor ioctl expects to be asynchronous with respect to other
screen updates, in particular page flips. As X updates the cursor from a
signal context, if the cursor blocks then it will stall both the input
and output chains causing bad stuttering and horrible UX.
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94980
Fixes: 5008e874ed ("drm/i915: Make wait_for_flips interruptible.")
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/1460922166-20292-1-git-send-email-chris@chris-wilson.co.uk
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Due to "some hardware limitation" the DPI enable bit in port C control
register does not get set on VLV. As a workaround we check the status in
pipe B conf register instead. The workaround was added in
commit c0beefd29f
Author: Gaurav K Singh <gaurav.k.singh@intel.com>
Date: Tue Dec 9 10:59:20 2014 +0530
drm/i915: Software workaround for getting the HW status of DSI Port C on BYT
Empirical evidence (on Surface 3 with DSI on port C per VBT) shows that
this is the case also on CHV, so extend the workaround to CHV. We still
have the device ready register check in place, so this should not get
confused with e.g. HDMI on pipe B.
This fixes a number of state checker warnings on CHV DSI port C.
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460724451-13810-1-git-send-email-jani.nikula@intel.com
Show a total and purgeable number of pin mapped objects
and their total and purgeable size.
Example output (new stat prefixed with a star):
# cat i915_gem_objects
19920 objects, 289243136 bytes
19920 [18466] objects, 288714752 [267911168] bytes in gtt
0 [0] active objects, 0 [0] bytes
19917 [18466] inactive objects, 288714752 [267911168] bytes
0 unbound objects, 0 bytes
0 purgeable objects, 0 bytes
1 pinned mappable objects, 3145728 bytes
0 fault mappable objects, 0 bytes
* 19914 [0] pin mapped objects, 285560832 [0] bytes [purgeable]
4294967296 [268435456] gtt total
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460716493-27826-1-git-send-email-tvrtko.ursulin@linux.intel.com
Reflect the status of obj->mapping as added with the
i915_gem_object_pin_map API.
'M' was chosen to designate the pin mapped status.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We don't have a LVDS_BORDER_ENABLE type of bit for either eDP or DSI,
and just trying to frob the display timings to include borders results
in a corrupted picture. So reject the 'Center' scaling mode on GMCH
platforms for eDP and DSI.
TODO: Should really filter out the unsupported modes from the prop,
but that would be fairly invasive since the prop is now created and
stored by drm core. So leave it for a rainy day.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-6-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Add the scaling mode property to DSI connectors, handle changes in the
property value, and compute the panel fitter state during
.compute_config().
v2: Handle BXT as well
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-5-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Compute the DSI PLL parameters during .compute_config() rather than
.pre_pll_enable() so that we can fail gracefully if we can't find
suitable parameters.
In order to do that we need to store the DSI PLL parameters in
pipe_config.
v2: Handle BXT too
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-3-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Jani Nikula <jani.nikula@intel.com>
Set up DPLL and DPLL_MD even when driving DSI output on VLV/CHV. While
the DPLL isn't used to provide the clock we still need the refclock, and
it appears that the pixel repeat factor also has an effect on DSI
output. So set up eveyrhing in DPLL and DPLL_MD as we would do for
DP/HDMI/VGA, but don't actually enable the DPLL or configure the
dividers via DPIO.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460488478-18311-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
This patch is to correct one thing in this commit:
commit 25a5670533
Author: Dongwon Kim <dongwon.kim@intel.com>
Date: Wed Mar 16 18:06:13 2016 -0700
drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
This reversed bit polarity is actually common
for all BXT and APL SoCs. Therefore, revision checking
in the original commit should be removed to make
the bit set regardless of revision ID of GFX block.
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460673463-14453-1-git-send-email-dongwon.kim@intel.com
I caught a few errors in our current PHY/CDCLK programming by sanity
checking the actual programmed state, so I thought it would be also
useful for the future. In addition to verifying the state after
programming it also verify it after exiting DC5, to make sure DMC
restored/kept intact everything related.
v2:
- Inlining __phy_reg_verify_state() doesn't make sense and also
incorrect, so don't do it (PW/CI gcc)
v3:
- Rebase on latest -nightly
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459780030-15781-1-git-send-email-imre.deak@intel.com
If BIOS has already programmed and enabled a PHY, don't reprogram it as
that may interfere with the currently active outputs. A follow-up patch
will add state verification, so we can catch any misconfiguration on
BIOS's behalf.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-14-git-send-email-imre.deak@intel.com
When determining whether CDCLK is enabled by BIOS and so we should skip
reprogramming it, we didn't check the related DBUF power request and
state. In theory BIOS could enable one without the other so check for
this case and reprogram things if something is amiss.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-13-git-send-email-imre.deak@intel.com
Power well 1 is managed by the DMC firmware so don't toggle it on-demand
from the driver. This means we need to follow the BSpec display
initialization sequence during driver loading and resuming (both system
and runtime) and enable power well 1 only once there. Afterwards DMC
will toggle power well 1 whenever entering/exiting DC5.
For this to work we also need to do away getting the PLL power domain,
since that just kept runtime PM disabled for good.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-12-git-send-email-imre.deak@intel.com
The power-down step logically belongs to the individual PHY uninit
sequence so move it there. The only functional change is that we will
power down now PHY 1 separately before PHY 0 and preserve the other bits
in the register which are defined as reserved.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-11-git-send-email-imre.deak@intel.com
For internal APIs passing dev_priv is preferred to reduce indirections,
so convert over a few DDI PHY, CDCLK helpers.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-10-git-send-email-imre.deak@intel.com
On Broxton we need to enable/disable power well 1 during the init/unit
display sequence similarly to Skylake/Kabylake. The code for this will
be added in a follow-up patch, but to prepare for that unexport
skl_pw1_misc_io_init(). It's a simple function called only from a single
place and having it inlined in the Skylake display core init/unit
functions will make it easier to compare it with its Broxton
counterpart.
This also flips the order of Misc IO and power well 1 disabling which
matches the enabling order. The specification doesn't prescribe the
disabling order, so this should be fine.
v2:
- Fix incorrect enable vs. disable power well call in
skl_display_core_uninit() (Patrik)
- Add commit comment about chaning the order of PW1 and Misc IO power
well disabling (Patrik)
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459773777-10701-1-git-send-email-imre.deak@intel.com
On SKL/KBL suspend-to-idle (aka freeze/s0ix) is performed with DMC
firmware assistance where the target display power state is DC6. On
Broxton on the other hand we don't use the firmware for this, but rely
instead on a manual DC9 flow. For this we have to uninitialize the
display following the BSpec display uninit sequence, just as during
S3/S4, so make sure we follow this sequence.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-8-git-send-email-imre.deak@intel.com
The display power well support and DC state management doesn't depend on
runtime PM support, so remove the incorrect asserts about this.
Also Broxton does support DC5, so the related assert in
assert_can_enable_dc5() is incorrect. There is a more generic and
correct assert for this already in gen9_set_dc_state(), so we can remove
all the other ones.
At the same time convert WARNs to WARN_ONCE for consistency with the
other DC state asserts.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-7-git-send-email-imre.deak@intel.com
So far we only power well enabling was synchronous not disabling. Since
we don't exactly know how the firmware (both DMC and PCU) synchronizes
against the actual power well state during DC transitions, make the
disabling also synchronous.
CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-6-git-send-email-imre.deak@intel.com
DMC forces on power well 1 and the misc IO power well by setting the
corresponding request bits both in the BIOS and the DEBUG power well
request registers. This is somewhat unexpected since the firmware should
really just save and restore state but not alter it. We also depend on
being able to disable power well 1, and the misc IO power well before
entering S3/S4 on BXT and SKL or entering DC9 on BXT. To fix this make
sure these request bits are cleared whenever we want to disable the
given power wells.
On SKL there is another twist where the firmware also clears the power
well 1 request bit in HSW_POWER_WELL_DRIVER (but not that of the misc IO
power well). This happens to not cause a problem due to the forced-on
request bits in the other request registers.
I've filed a bug about all this, but fixing that may take a while and
having this sanity check in place makes sense even for future firmware
versions.
At the same time also check the KVMR request bits. I haven't seen this
being altered, but we don't expect any request bits in here either, so
sanitize this register as well.
v2:
- Apply the workaround on SKL as well. I noticed the related failure
from the CI report, later Patrik also reported seeing it on his
machine.
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459851965-6137-1-git-send-email-imre.deak@intel.com
This register is read-only, so we have never actually set
OCL2_LDOFUSE_PWR_DIS in it as specified by the specification. Add a code
comment about this. I filed a specification update request to clarify
this there.
CC: Arthur J Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-4-git-send-email-imre.deak@intel.com
This has been corrected in BSpec quite some time ago, but we missed it
somehow. The wrong field definitions resulted in configuring PHY0 with
an incorrect GRC value.
v2:
- Remove the FIXME comment, we left in the code exactly about this
issue. (Ville)
CC: Arthur J Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-3-git-send-email-imre.deak@intel.com
DMC version 1.06 has a known bug, where the firmware polls forever for a
port PLL to lock, if the PLL was disabled when entering DC5, which locks
up the machine. Version 1.07 fixes this, so make that the minimum
required version on BXT.
CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459515767-29228-2-git-send-email-imre.deak@intel.com
Split the VLV/CHV hoplug irq handling to ack and handler phases. This
way we can move the actual irq handling outside the section where
we have disabled the interrupt sources.
For now, we leave things as is for pre-VLV GMCH platforms, but
eventually they could get the same treatment.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On VLV/CHV the master interrupt enable bit only affects GT/PM
interrupts. Display interrupts are not affected by the master
irq control.
Also it seems that the CPU interrupt will only be generated when
the combined result of all GT/PM/display interrupts has a 0->1
edge. We already use the master interrupt enable bit to make sure
GT/PM interrupt can generate such an edge if we don't end up clearing
all IIR bits. We must do the same for display interrupts, and for
that we can simply clear out VLV_IER, and restore after we've acked
all the interrupts we are about to process.
So with both master interrupt enable and VLV_IER cleared out, we will
guarantee that there will be a 0->1 edge if any IIR bits are still set
at the end, and thus another CPU interrupt will be generated.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 579de73b04 ("drm/i915: Exit cherryview_irq_handler() after one pass")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On VLV/CHV VLV_IIR is not double double buffered, and it doesn't detect
edges from PIPESTAT & co. like it does on gen4. Instead it just
directly latches the level from PIPESTAT & co. That means we must clear
VLV_IIR after PIPESTAT & co. or else we'll get a spurious bit in VLV_IIR
every single time.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Use GEN8_MASTER_IRQ_CONTROL instead of DE_MASTER_IRQ_CONTROL or
MASTER_INTERRUPT_ENABLE with the GEN8_MASTER_IRQ register. They're
all bit 31 so there's no actual bug here, but let's be consistent
which name we use for the bit.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
On CHV GTFIFODBG has some read-only bits to indicate the number
of free FIFO entries. Ignore these when checking to see if any
of the sticky error bits are set.
This gets rid of these during device resume:
[drm:cherryview_enable_rps] GT fifo had a previous error 1080000
While at it, move the assignments out of the if().
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460570970-14073-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Allow for the MOCS to be programmed for all engines.
Currently we program the MOCS when the first render batch
goes through. This works on most platforms but fails on
platforms that do not run a render batch early,
i.e. headless servers. The patch now programs all initialised engines
on init and the RCS is programmed again within the initial batch. This
is done for predictable consistency with regards to the hardware
context.
Hardware context loading sets the values of the MOCS for RCS
and L3CC. Programming them from within the batch makes sure that
the render context is valid, no matter what the previous state of
the saved-context was.
v2: posted correct version to the mailing list.
v3: moved programming to within engine->init_hw() (Chris Wilson)
v4: code formatting and white-space changes. (Chris Wilson)
Testcase: igt/gem_mocs_settings
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460556205-6644-1-git-send-email-chris@chris-wilson.co.uk
Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.
The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.
All that remains of i915_gem_request_cancel() users are just a couple of
extremely unlikely allocation failures, so remove the API entirely.
A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-16-git-send-email-chris@chris-wilson.co.uk
After mi_set_context() succeeds, we need to update the state of the
engine's last_context. This ensures that we hold a pin on the context
whilst the hardware may write to it. However, since we didn't complete
the post-switch setup of the context, we need to force the subsequent
use of the same context to complete the setup (which means updating
should_skip_switch()).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-15-git-send-email-chris@chris-wilson.co.uk
Having the !RCS legacy context switch threaded through the RCS switching
code makes it much harder to follow and understand. In the next patch, I
want to fix a bug handling the incomplete switch, this is made much
simpler if we segregate the two paths now.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-14-git-send-email-chris@chris-wilson.co.uk
As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses from userspace
and guarantees serialisation of the code against concurrent access just
by calling i915_gem_release_mmap().
v2: Reduce the mb() into a wmb() following the revoke.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-13-git-send-email-chris@chris-wilson.co.uk
For reasons unknown Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms, seqno writes
stop just before the wrap and the ringbuffer at address 0. It is also
timing crucial, but adding various delays hasn't helped pinpoint where
the window lies.
Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.
References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-12-git-send-email-chris@chris-wilson.co.uk
Two concurrent writes into the same register cacheline has the chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser. The MI_SET_CONTEXT itself serves as
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault. However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.
This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko
The issue was introduced with
commit 2c55018347 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 16 10:02:27 2014 +0000
drm/i915: Disable PSMI sleep messages on all rings around context switches
Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-11-git-send-email-chris@chris-wilson.co.uk
If we do not have lowlevel support for reseting the GPU, or if the user
has explicitly disabled reseting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expected failure as a DEBUG message.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-10-git-send-email-chris@chris-wilson.co.uk
Reporting -EIO from i915_wait_request() has proven very troublematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case of where a reset occurs and the code wasn't expecting such
an error.
If the we reset the GPU or have detected a hang and wish to reset the
GPU, the request is forcibly complete and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, which in all but one
instance it is safe - completely eliminating the source of all spurious
-EIO.
Of note, we only have two API entry points where we expect that
userspace can observe an EIO. First is when submitting an execbuf, if
the GPU is terminally wedged, then the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting using HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, also report -EIO.
v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
Now that the reset_counter is stored on the request, we can rearrange
the code to handle reading the counter versus waiting during the atomic
modesetting for readibility (by deleting the hairiest of codes).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-8-git-send-email-chris@chris-wilson.co.uk
As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.
PS: With per-engine resets, we obviously cannot assume a global reset
epoch for the requests - a per-engine epoch makes the most sense. The
challenge then is how to handle checking in the waiter for when to break
the wait, as the fine-grained reset may also want to requeue the
request (i.e. the assumption that just because the epoch changes the
request is completed may be broken - or we just avoid breaking that
assumption with the fine-grained resets).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter thereby incrementing the gobal
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.
The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.
PS: Fun is in store, as in the future we want to move from a global
reset epoch to a per-engine reset engine with request recovery.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk
If we, when we store the reset_counter for the operation, we ensure that
it is not in a wedged or in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can just
compare the operation's reset epoch against the current counter to see
if we need to abort the operation (to handle the hang).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-5-git-send-email-chris@chris-wilson.co.uk
This is principally a little bit of syntatic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.
v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk
Currently there is a #define to enable extra BUG_ON for debugging
requests and associated activities. I want to expand its use to cover
all of GEM internals (so that we can saturate the code with asserts).
We can add a Kconfig option to make it easier to enable - with the usual
caveats of not enabling unless explicitly requested.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk
Separate out the layers of includes (linux, drm, intel, i915) so that it
is a little easier to order our definitions between our multiple
reentrant headers. A couple of headers needed fixes to make them more
standalone (forgotten includes, forward declarations etc).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-2-git-send-email-chris@chris-wilson.co.uk
Our driver compiles clean (nowadays thanks to 0day) but for me, at least,
it would be beneficial if the compiler threw an error rather than a
warning when it found a piece of suspect code. (I use this to
compile-check patch series and want to break on the first compiler error
in order to fix the patch.)
v2: Kick off a new "Debugging" submenu for i915.ko
At this point, we applied it to the kernel and promptly kicked it out
again as it broke buildbots (due to a compiler warning on 32bits):
commit 908d759b21
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue May 26 07:46:21 2015 +0200
Revert "drm/i915: Force clean compilation with -Werror"
v3: Avoid enabling -Werror for allyesconfig/allmodconfig builds, using
COMPILE_TEST as a suitable proxy suggested by Andrew Morton. (Damien)
Only make the option available for EXPERT to reinforce that the option
should not be casually enabled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk
With gen9+ the edram capabilities are defined so
that we can calculate the edram (ellc) size accordingly.
Note that there are undefined combinations for some subset of
edram capability bits. Return the closest size for undefined indexes.
Even if we get it wrong with beginning of future gen enabling, the size
information is currently only used for boot message and in debugfs entry.
v2: Use function instead of hard to read macro (Daniel)
v3: s/INTEL_INFO/INTEL_GEN (Matthew)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460557604-7126-2-git-send-email-mika.kuoppala@intel.com
Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.
v2: return bytes for edram size (Chris)
s/eLLC/eDRAM in output if we are gen > 8
v3: rebase, INTEL_GEN (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
For gen9 onwards, eDRAM is a true memory side cache. So
there is no need to program idi hash mask as it is for eLLC
only.
v2: INTEL_GEN (Chris), s/has/hash (Matthew)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
We started to use PIPE_CONTROL to write render ring seqno in order to
combat seqno write vs interrupt generation problems. This was introduced
by commit 7c17d37737 ("drm/i915: Use ordered seqno write interrupt
generation on gen8+ execlists").
On gen8+ size of PIPE_CONTROL with Post Sync Operation should be
6 dwords. When we're using older 5-dword variant it's possible to
observe inconsistent values written by PIPE_CONTROL with Post
Sync Operation from user batches, resulting in rendering corruptions.
v2: Fix BAT failures
v3: Comments on alignment and thrashing high dword of seqno (Chris)
v4: Updated commit msg (Mika)
Testcase: igt/gem_pipe_control_store_loop/*-qword-write
Issue: VIZ-7393
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460469115-26002-1-git-send-email-michal.winiarski@intel.com
Experiments with heaven 4.0 benchmark and skylake gt3e (rev 0xa)
suggest that WaForceContextSaveRestoreNonCoherent is needed for all
revs. Extending this to all revs cures a gpu hang with rev 0xa when
running heaven4.0 gpu benchmark.
We have been here before, with problems enabling gt4e and extending
up to revision F0 instead of false claims of bspec of E0 only. See
commit <e238659ddd88> ("drm/i915/skl: Default to noncoherent access
up to F0"). In retrospect we should have covered this with this big
blanket back then already, as E0 vs F0 discrepancy was suspicious
enough.
Previously the WaForceEnableNonCoherent has been tied to
context non-coherence, atleast in relevant hsds. So keep this tie
and extended this alongside.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: stable@vger.kernel.org
Reported-by: Mike Lothian <mike@fireburn.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=93491
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-2-git-send-email-mika.kuoppala@intel.com
For all gt3 and gt4 skylake variants, extend the usage of
WaRsDisableCoarsePowerGating for all revisions. Without this
gt3 and gt4 skylakes up to atleast rev 0xa can gpu hang or
system hang.
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Timo Aaltonen <tjaalton@ubuntu.com>
Cc: <stable@vger.kernel.org>
Reported-by: Mikael Djurfeldt <mikael@djurfeldt.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=94161
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Timo Aaltonen <tjaalton@ubuntu.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459860977-27751-1-git-send-email-mika.kuoppala@intel.com
We can use the new pin/lazy unpin API for simplicity
and more performance in the execlist submission paths.
v2:
* Fix error handling and convert more users.
* Compact some names for readability.
v3:
* intel_lr_context_free was not unpinning.
* Special case for GPU reset which otherwise unbalances
the HWS object pages pin count by running the engine
initialization only (not destructors).
v4:
* Rebased on top of hws setup/init split.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460472042-1998-1-git-send-email-tvrtko.ursulin@linux.intel.com
[tursulin: renames: s/hwd/hws/, s/obj_addr/vaddr/]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Split the hardware status page into setup and initialisation,
where setup means setting up the driver state to support the
engine, and initialization means programming the hardware
with the before set up state.
This way the design matches the design of the engine setup/init
code which is split in the same fashion and it enables the
stages to be used in a balanced fashion (engine setup - hws
setup, engine init - hws init).
This will enable the upcoming improvements to slot in without
any kludges on the GPU reset path.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Check whether the DPLL is even enabled before readoing out the dividers
and trying to derive port_clock on CHV. We already did this on VLV.
Also remove the comment "MIPI" comment from the VLV code since we call
this function whenever the pipe is enabled.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
On VLV at least, the BIOS may leave the DSI PLL enabled in some wonky
state where it just refuses to lock. Simply disabling the PLL before
reconfiguring it is not enough to fix it, but power gating the PLL
prior to reconfiguring does work.
This happens on BYT FFRD8 when booting with HDMI connected so the DSI
display will not be lit up by the BIOS.
Also we can remove the code for BXT that disables the PLL before
enabling it again.
v2: s/vlv/intel/ since BXT made thing generic
v3: Remove the BXT disable PLL before enable trick
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458052809-23426-11-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
The registers frobbed by vlv_init_display_clock_gating() libve inside
the disp2d power well, so frobbing them while the power well is down
results in unclaimed register access warning (and of course the values
won't stick). Let's do this setup after we know the power well is
enabled.
It's also worth noting that DSPCLK_GATE_D and CBR1_VLV lose their state
when the power well goes down, but fortunately the values we've been
writing are actually the reset defaults.
MI_ARB_VLV actually retains its value even if the power well was turned
off, we just can't access it while the power well is down.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
For a bit of extra paranoia make sure the display irqs are all cleared
before we enabled them when turning on the power well. This should
really be the case already since the power well was off which resets
everything.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
During runtime PM we'll be reinitializing interrupt support from the
ground up. However since the display power well will be off at that
time, well end up with a ton of unclaimed register accesses from the
display irq setup. Since we turned off the power well already before
runtime suspend, we've flagged display irqs as disabled during runtime
PM transitions. So we can just check that flag to see if we should do
skip display irqs during irq setup.
During driver load display irqs will be flagged as enabled since we've
turned on the power well already, however the power well code will have
skipped the display irq setup since irq support as a whole wasn't yet
enabled when the power well was enabled. So we'll want to do the display
irq setup in that case.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The vlv/chv display irq setup was a bit of mess after I ran out of steam
when working on it last. Fix it up so that we just have a _reset() and
_postinstall() hooks for the display irqs, and use those consistently.
v2: Clear out pipestat_irq_mask[] and PIPE_FIFO_UNDERRUN_STATUS in
vlv_display_irq_reset() (Imre)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476574-1921-1-git-send-email-ville.syrjala@linux.intel.com
The underruns we were seeing when enabling eDP port A on ILK seem to
have been caused by prematurely clearing the LP1+ watermark values when
disabling LP1+ watermarks. Now that the watermarks are handled
properly, we can rip out the underrun suppression around the port A
enable.
We still need to worry about the underruns on FDI when enabling
the eDP PLL. But as Bspec tells us, we can avoid that by a vblank
wait on the pipe driving FDI just prior to enabling the eDP PLL.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Once again ILK is unhappy if we clear out the LP1+ watermark levels
outright, and instead we must disable the levels we don't want while
still leaving the actual programmed watermark levels intact.
Fixes underruns on the already enabled pipe when programming watermarks
while enabling the second pipe.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Matt Roper <matthew.d.roper@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93787
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Take a bigger hammer to the underrun suppression on ILK. Instead of
trying to suppress them at specific points in the modeset sequence just
silence them across the entire sequence. This gets rid of some underruns
at least on my ILK. Note that this changes SNB and IVB to follow the
same approach just to keep the code less convoluted. The difference is
that on those platforms we won't suppress CPU underruns for port A since
it doesn't seem to be necessary.
My ILK has port A eDP and two PCH HDMI ports, so I can't be sure this is
as effective on other PCH port types. Perhaps we still need some of
Daniel's extra vblank waits [2]?
I've still been able to trigger an underrun on the other pipe, but
fixing that perhaps needs the LP1+ disable trick I implemented here [1]
which never got merged.
A few details which hamper stress testing on my ILK are that sometimes
the PCH transcoder gets messed up and refuses to shut down, and sometimes
even the panel power sequencer apparently gets stuck on the always on
position.
[1] https://lists.freedesktop.org/archives/intel-gfx/2014-March/041317.html
[2] https://lists.freedesktop.org/archives/intel-gfx/2016-January/086397.html
v2: Add a note that we also get underruns when enabling PCH ports
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1459536799-18109-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Rather than blindly waking up all forcewake domains on command
submission, we can teach each engine what is (or are) the correct
one to take.
On platforms with multiple forcewake domains like VLV, CHV, SKL
and BXT, this has the potential of lowering the GPU and CPU
power use and submission latency.
To implement it we add a function named
intel_uncore_forcewake_for_reg whose purpose is to query which
forcewake domains need to be taken to read or write a specific
register with raw mmio accessors.
These enables the execlists engine setup to query which
forcewake domains are relevant per engine on the currently
running platform.
v2:
* Kerneldoc.
* Split from intel_uncore.c macro extraction, WARN_ON,
no warns on old platforms. (Chris Wilson)
v3:
* Single domain per engine, mention all registers,
bi-directional function and a new name, fix handling
of gen6 and gen7 writes. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com
Chris Wilson points out that we can remove them from the array
since they are always written to with raw accessors.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Knowledge of which register per platform belonds in which
forcewake domain was embedded in the MMIO accessors themselves.
Extract it into standalone macros so they can be used from
new code in the following patches.
This causes GCC to compile some of the MMIO accessors slightly
differently and grows the code a tiny amount. But none of the
growth is on the fast-path so it does not matter hugely.
Affected sizes before:
00000000000026f0 00000000000001a5 t gen6_read16
0000000000002390 00000000000001a5 t gen6_read32
00000000000028a0 00000000000001a5 t gen6_read64
00000000000061d0 000000000000019e t gen8_write16
0000000000006510 000000000000019d t gen8_write32
0000000000006370 000000000000019d t gen8_write64
00000000000021f0 000000000000019d t gen8_write8
Affected sizes after:
0000000000002840 00000000000001aa t gen6_read16
00000000000024e0 00000000000001a9 t gen6_read32
00000000000029f0 00000000000001a9 t gen6_read64
0000000000004f20 00000000000001b5 t gen8_write16
0000000000004ba0 00000000000001b4 t gen8_write32
00000000000050e0 00000000000001b4 t gen8_write64
0000000000004d60 00000000000001b4 t gen8_write8
Other MMIO accessors are not affected in size.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.
For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.
Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.
v2: Improve grammar in the commit message and rename the
iterator to for_each_fw_domain_masked for consistency.
(Dave Gordon)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Because it is based on jiffies, current implementation releases the
forcewake at any time between straight away and between 1ms and 10ms,
depending on the kernel configuration (CONFIG_HZ).
This is probably not what has been desired, since the dynamics of keeping
parts of the GPU awake should not be correlated with this kernel
configuration parameter.
Change the auto-release mechanism to use hrtimers and set the timeout to
1ms with a 1ms of slack. This should make the GPU power consistent
across kernel configs, and timer slack should enable some timer coalescing
where multiple force-wake domains exist, or with unrelated timers.
For GlBench/T-Rex this decreases the number of forcewake releases from
~480 to ~300 per second, and for a heavy combined OGL/OCL test from
~670 to ~360 (HZ=1000 kernel).
Even though this reduction can be attributed to the average release period
extending from 0-1ms to 1-2ms, as discussed above, it will make the
forcewake timeout consistent for different CONFIG_HZ values.
Real life measurements with the above workload has shown that, with this
patch, both manage to auto-release the forcewake between 2-4 times per
10ms, even though the number of forcewake gets is dramatically different.
T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each
10ms period, while the OGL/OCL test requests 250 and 380 times in the same
period.
The two data points together suggest that the nature of the forwake
accesses is bursty and that further changes and potential timeout
extensions, or moving the start of timeout from the first to the last
automatic forcewake grab, should be carefully measured for power and
performance effects.
v2:
* Commit spelling. (Dave Gordon)
* More discussion on numbers in the commit. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
We've had problems on several occasions with using the panel type
from the VBT block 40. Usually it seems to be 2, which often
doesn't give us the correct timings for the panel. After some
more digging I found a way to get a panel type via the OpRegion
SWSCI GBDA "Get Panel Details" method. Let's try to use it.
The spec has this to say about the output:
"Bits [15:8] - Panel Type
Bits contain the panel type user setting from CMOS
00h = Not Valid, use default Panel Type & Timings from VBT
01h - 0Fh = Panel Number"
Another version of the spec lists the valid range as 1-16, which makes
more sense since VBT supports 16 panels. Based on actual results
from Rob's G45, 1-16 is what we need to accept.
The other bits in the output don't look relevant for the problem at
hand.
The input is specified as:
"Bits [31:4] - Reserved
Reserved (must be zero)
Bits [3:0] - Panel Number
These bits contain the sequential index of Panel, starting at 0 and
counting upwards from the first integrated Internal Flat-Panel Display
Encoder present, and then from the first external Display Encoder
(e.g., S/DVO-B then S/DVO-C) which supports Internal Flat-Panels.
0h - 0Fh = Panel number"
For now I've just hardcoded the input panel number as 0. That would seem
like a decent choise for LVDS. Not so sure about eDP when port != A.
v2: Accept values 1-16
Filter out bogus results in opregion code (Jani)
Add debug logging for all the different branches (Jani)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94825
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460359431-11003-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Rob Kramer <rob@solution-space.com>
Store the extracted panel_type under dev_priv.vbt instead of keeping
around a static variable for it.
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
VBT can only contain 16 panel entries, indexed with the panel_type.
To play it safe we should reject panel_type > 0xf, so that we don't
read past the valid data.
v2: Add debug logging (Jani)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460359329-10817-1-git-send-email-ville.syrjala@linux.intel.com
When the GMBUS based i2c transfer times out, we try to fall back to
bit-banging and retry the operation that way. However if the bit-banging
attempt also fails, we should probably go back to the GMBUS method for
the next attempt. Maybe there simply wasn't anyone one the bus at this
time.
There's also a bit of a mess going on with the force_bit handling.
It's supposed to be a ref count actually, and it is as far as
intel_gmbus_force_bit() is concerned. But it's treated as just a
flag by the timeout based bit-banging fallback. I suppose that's
fine since we should never end up in the timeout fallback case
if force_bit was already non-zero. However now that we want to restore
things back to where they were after the bit-banging attempt failed,
we're going to have to do things a bit differently to avoid clobbering
the force_bit count as set up by intel_gmbus_force_bit(). So let's
dedicate the high bit as a flag for the low level timeout based fallback
and treat the rest of the bits as a ref count just as before.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Extend the protection of gmbus_mutex around the force_bit
RMW in intel_gmbus_force_bit(), in case someone gets the
idea of calling it from a separate thread while there's
other stuff happening on the same bus.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Since we only ever use the drm_i915_private from the stored
i915_mm_struct->dev, save some electrons by storing the right
backpointer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-3-git-send-email-chris@chris-wilson.co.uk
Holding a reference to the containing task_struct is not sufficient to
prevent the mm_struct from being reaped under memory pressure. If this
happens whilst we are calling get_user_pages(), explosions erupt -
sometimes an immediate GPF, sometimes page flag corruption. To prevent
the target mm from being reaped as we are reading from it, acquire a
reference before we begin.
Testcase: igt/gem_shrink/*userptr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
In order to ensure that all invalidations are completed before the
operation returns to userspace (i.e. before the munmap() syscall returns)
we need to wait upon the outstanding operations.
We are allowed to block inside the invalidate_range_start callback, and
as struct_mutex is the inner lock with mmap_sem we can wait upon the
struct_mutex without provoking lockdep into warning about a deadlock.
However, we don't actually want to wait upon outstanding rendering
whilst holding the struct_mutex if we can help it otherwise we also
block other processes from submitting work to the GPU. So first we do a
wait without the lock and then when we reacquire the lock, we double
check that everything is ready for removing the invalidated pages.
Finally to wait upon the outstanding unpinning tasks, we create a
private workqueue as a means to conveniently wait upon all at once. The
drawback is that this workqueue is per-mm, so any threads concurrently
invalidating objects will wait upon each other. The advantage of using
the workqueue is that we can wait in parallel for completion of
rendering and unpinning of several objects (of particular importance if
the process terminates with a whole mm full of objects).
v2: Apply a cup of tea to the changelog.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94699
Testcase: igt/gem_userptr_blits/sync-unmap-cycles
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa
6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V
VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR
fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+
atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz
LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY=
=bri6
-----END PGP SIGNATURE-----
Merge tag 'v4.6-rc3' into drm-intel-next-queued
Linux 4.6-rc3
Backmerge requested by Chris Wilson to make his patches apply cleanly.
Tiny conflict in vmalloc.c with the (properly acked and all) patch in
drm-intel-next:
commit 4da56b99d9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Apr 4 14:46:42 2016 +0100
mm/vmap: Add a notifier for when we run out of vmap address space
and Linus' tree.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
If we want a contiguous mapping of a single page sized object, we can
forgo using vmap() and just use a regular kmap(). Note that this is only
suitable if the desired pgprot_t is compatible.
v2: Use is_vmalloc_addr()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().
So refactor my usage into drm_malloc_gfp().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.
v2: Start using is_vmalloc_addr()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the mapping into the
obj->pages lifetime, then we can reuse an obj->mapping for both and at
the same time couple it into the shrinker. There is a third vmapping
routine in the cmdparser that maps only a range within the object, for
the time being that is left alone, but will eventually use these routines
in order to cache the mapping between invocations.
v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
v3: Call unpin_vmap from the right dmabuf unmapper
v4: Rename vmap to map as we don't wish to imply the type of mapping
involved, just that it contiguously maps the object into kernel space.
Add kerneldoc and lockdep annotations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
After we pin the ringbuffer into the GGTT, all error paths need to unpin
it again. Move this common step into one block, and make the unable to
iomap error code consistent (i.e. treat it as out of memory to avoid
confusing it with a invalid argument).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-3-git-send-email-chris@chris-wilson.co.uk
We only need the struct_mutex to manipulate the pages_pin_count on the
object, we do not need to hold our BKL when freeing the exported
scatterlist.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-2-git-send-email-chris@chris-wilson.co.uk
This is to fix a GPU hang seen with mid thread pre-emption
and pooled EUs.
v2. Use IS_BXT_REVID instead of IS_BROXTON and INTEL_REVID
v3. And use correct type for register addresses
Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458571049-854-1-git-send-email-tim.gore@intel.com
For BXT, description of polarities of PORT_PLL_REF_SEL
has been reversed for newer Gen9LP steppings according to the
recent update in Bspec. This bit now should be set for
"Non-SSC" mode for all Gen9LP starting from B0 stepping.
v2: Only B0 and newer stepping should be affected by this
change.
Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94866
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458176773-26925-1-git-send-email-dongwon.kim@intel.com
Check functions are used by atomic to see if the new state will
be allowed. There's also a hw state checker which checks afterwards
that the committed state is correct. Rename it to hw state verifier
to reduce some confusion.
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56FB8785.8020506@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
The modeset state verifier no longer has full access to the hardware,
instead it should only verify affected crtc's.
Looking for disabled stuff can be verified immediately after all crtc
disables have completed, while each enabled crtc can be verified right
after being enabled.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: check -> verify]
This will make it easier to keep the crtc checker when atomic
commit is reworked for asynchronous commits. This prevents checking
crtc's that were not part of the state. It's safe to verify disabled
encoders, connectors and dpll's that are not part of the state,
because during modeset connection_mutex is held.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: Extend commit message and rename check to verify.]
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-5-git-send-email-chris@chris-wilson.co.uk
Rather than call a function to compute the matching cachelines and
clflush them, just call the clflush *instruction* directly. We also know
that we can use the unpatched plain clflush rather than the clflushopt
alternative.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-4-git-send-email-chris@chris-wilson.co.uk
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.
v2: Clear the stuck interrupt marker between successful batches
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-3-git-send-email-chris@chris-wilson.co.uk
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.
v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).
Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]
v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, it is only the memory timing that is important for the
purposes of the delay, we do not need nor desire the extra forcewake.
v3: Update commentary
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk