Commit Graph

26368 Commits

Author SHA1 Message Date
Tvrtko Ursulin 4e1176dd61 drm/i915: Do not serialize forcewake acquire across domains
On platforms with multiple forcewake domains it seems more efficient
to request all desired ones and then to wait for acks to avoid
needlessly serializing on each domain.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1460045074-1006-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin 33c582c10a drm/i915: Simplify for_each_fw_domain iterators
As the vast majority of users do not use the domain id variable,
we can eliminate it from the iterator and also change the latter
using the same principle as was recently done for for_each_engine.

For a couple of callers which do need the domain mask, store it
in the domain array (which already has the domain id), then both
can be retrieved thence.

Result is clearer code and smaller generated binary, especially
in the tight fw get/put loops. Also, relationship between domain
id and mask is no longer assumed in the macro.

v2: Improve grammar in the commit message and rename the
    iterator to for_each_fw_domain_masked for consistency.
    (Dave Gordon)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
2016-04-12 14:30:41 +01:00
Tvrtko Ursulin a57a4a67e5 drm/i915: Use consistent forcewake auto-release timeout across kernel configs
Because it is based on jiffies, current implementation releases the
forcewake at any time between straight away and between 1ms and 10ms,
depending on the kernel configuration (CONFIG_HZ).

This is probably not what has been desired, since the dynamics of keeping
parts of the GPU awake should not be correlated with this kernel
configuration parameter.

Change the auto-release mechanism to use hrtimers and set the timeout to
1ms with a 1ms of slack. This should make the GPU power consistent
across kernel configs, and timer slack should enable some timer coalescing
where multiple force-wake domains exist, or with unrelated timers.

For GlBench/T-Rex this decreases the number of forcewake releases from
~480 to ~300 per second, and for a heavy combined OGL/OCL test from
~670 to ~360 (HZ=1000 kernel).

Even though this reduction can be attributed to the average release period
extending from 0-1ms to 1-2ms, as discussed above, it will make the
forcewake timeout consistent for different CONFIG_HZ values.

Real life measurements with the above workload has shown that, with this
patch, both manage to auto-release the forcewake between 2-4 times per
10ms, even though the number of forcewake gets is dramatically different.

T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each
10ms period, while the OGL/OCL test requests 250 and 380 times in the same
period.

The two data points together suggest that the nature of the forwake
accesses is bursty and that further changes and potential timeout
extensions, or moving the start of timeout from the first to the last
automatic forcewake grab, should be carefully measured for power and
performance effects.

v2:
  * Commit spelling. (Dave Gordon)
  * More discussion on numbers in the commit. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 14:30:41 +01:00
Ville Syrjälä a05628195a drm/i915: Get panel_type from OpRegion panel details
We've had problems on several occasions with using the panel type
from the VBT block 40. Usually it seems to be 2, which often
doesn't give us the correct timings for the panel. After some
more digging I found a way to get a panel type via the OpRegion
SWSCI GBDA "Get Panel Details" method. Let's try to use it.

The spec has this to say about the output:
"Bits [15:8] - Panel Type
 Bits contain the panel type user setting from CMOS
 00h = Not Valid, use default Panel Type & Timings from VBT
 01h - 0Fh = Panel Number"

Another version of the spec lists the valid range as 1-16, which makes
more sense since VBT supports 16 panels. Based on actual results
from Rob's G45, 1-16 is what we need to accept.

The other bits in the output don't look relevant for the problem at
hand.

The input is specified as:
"Bits [31:4] - Reserved
 Reserved (must be zero)
 Bits [3:0] - Panel Number
 These bits contain the sequential index of Panel, starting at 0 and
 counting upwards from the first integrated Internal Flat-Panel Display
 Encoder present, and then from the first external Display Encoder
 (e.g., S/DVO-B then S/DVO-C) which supports Internal Flat-Panels.
 0h - 0Fh = Panel number"

For now I've just hardcoded the input panel number as 0. That would seem
like a decent choise for LVDS. Not so sure about eDP when port != A.

v2: Accept values 1-16
    Filter out bogus results in opregion code (Jani)
    Add debug logging for all the different branches (Jani)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94825
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460359431-11003-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Rob Kramer <rob@solution-space.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä 3e845c7a40 drm/i915: Replace the static panel_type variable with dev_priv->vbt.panel_type
Store the extracted panel_type under dev_priv.vbt instead of keeping
around a static variable for it.

Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:23:43 +03:00
Ville Syrjälä eeeebea6cb drm/i915: Reject panel_type > 0xf from VBT
VBT can only contain 16 panel entries, indexed with the panel_type.
To play it safe we should reject panel_type > 0xf, so that we don't
read past the valid data.

v2: Add debug logging (Jani)

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rob Kramer <rob@solution-space.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460359329-10817-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 13:23:42 +03:00
Ville Syrjälä 706778013b drm/i915: Make GMBUS timeout message DRM_DEBUG_KMS
There's no real reason the user should care that we're about to fall
back to bitbanging, so let's change the message from DRM_INFO to
DRM_DEBUG_KMS.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-5-git-send-email-ville.syrjala@linux.intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94890
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-12 13:23:17 +03:00
Ville Syrjälä 3e4d44e0fa drm/i915: Restore GMBUS operation after a failed bit-banging fallback
When the GMBUS based i2c transfer times out, we try to fall back to
bit-banging and retry the operation that way. However if the bit-banging
attempt also fails, we should probably go back to the GMBUS method for
the next attempt. Maybe there simply wasn't anyone one the bus at this
time.

There's also a bit of a mess going on with the force_bit handling.
It's supposed to be a ref count actually, and it is as far as
intel_gmbus_force_bit() is concerned. But it's treated as just a
flag by the timeout based bit-banging fallback. I suppose that's
fine since we should never end up in the timeout fallback case
if force_bit was already non-zero. However now that we want to restore
things back to where they were after the bit-banging attempt failed,
we're going to have to do things a bit differently to avoid clobbering
the force_bit count as set up by intel_gmbus_force_bit(). So let's
dedicate the high bit as a flag for the low level timeout based fallback
and treat the rest of the bits as a ref count just as before.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:20:58 +03:00
Ville Syrjälä ade754ec11 drm/i915: Protect force_bit with gmbus_mutex
Extend the protection of gmbus_mutex around the force_bit
RMW in intel_gmbus_force_bit(), in case someone gets the
idea of calling it from a separate thread while there's
other stuff happening on the same bus.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457366220-29409-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-04-12 13:20:58 +03:00
Chris Wilson f470b19095 drm/i915/userptr: Store i915 backpointer for i915_mm_struct
Since we only ever use the drm_i915_private from the stored
i915_mm_struct->dev, save some electrons by storing the right
backpointer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-3-git-send-email-chris@chris-wilson.co.uk
2016-04-11 20:39:16 +01:00
Chris Wilson 40313f0cd0 drm/i915/userptr: Hold mmref whilst calling get-user-pages
Holding a reference to the containing task_struct is not sufficient to
prevent the mm_struct from being reaped under memory pressure. If this
happens whilst we are calling get_user_pages(), explosions erupt -
sometimes an immediate GPF, sometimes page flag corruption. To prevent
the target mm from being reaped as we are reading from it, acquire a
reference before we begin.

Testcase: igt/gem_shrink/*userptr
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-2-git-send-email-chris@chris-wilson.co.uk
2016-04-11 20:39:01 +01:00
Chris Wilson 393afc2c3f drm/i915/userptr: Flush cancellations before mmu-notifier invalidate returns
In order to ensure that all invalidations are completed before the
operation returns to userspace (i.e. before the munmap() syscall returns)
we need to wait upon the outstanding operations.

We are allowed to block inside the invalidate_range_start callback, and
as struct_mutex is the inner lock with mmap_sem we can wait upon the
struct_mutex without provoking lockdep into warning about a deadlock.
However, we don't actually want to wait upon outstanding rendering
whilst holding the struct_mutex if we can help it otherwise we also
block other processes from submitting work to the GPU. So first we do a
wait without the lock and then when we reacquire the lock, we double
check that everything is ready for removing the invalidated pages.

Finally to wait upon the outstanding unpinning tasks, we create a
private workqueue as a means to conveniently wait upon all at once. The
drawback is that this workqueue is per-mm, so any threads concurrently
invalidating objects will wait upon each other. The advantage of using
the workqueue is that we can wait in parallel for completion of
rendering and unpinning of several objects (of particular importance if
the process terminates with a whole mm full of objects).

v2: Apply a cup of tea to the changelog.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94699
Testcase: igt/gem_userptr_blits/sync-unmap-cycles
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459864801-28606-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-04-11 20:38:33 +01:00
Daniel Vetter ba3150ac38 drm/i915: Update DRIVER_DATE to 20160411
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-04-11 20:20:18 +02:00
Daniel Vetter 3970285319 Linux 4.6-rc3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa
 6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V
 VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR
 fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+
 atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz
 LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY=
 =bri6
 -----END PGP SIGNATURE-----

Merge tag 'v4.6-rc3' into drm-intel-next-queued

Linux 4.6-rc3

Backmerge requested by Chris Wilson to make his patches apply cleanly.
Tiny conflict in vmalloc.c with the (properly acked and all) patch in
drm-intel-next:

commit 4da56b99d9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 4 14:46:42 2016 +0100

    mm/vmap: Add a notifier for when we run out of vmap address space

and Linus' tree.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-04-11 19:25:13 +02:00
Chris Wilson fb8621d3be drm/i915: Avoid allocating a vmap arena for a single page
If we want a contiguous mapping of a single page sized object, we can
forgo using vmap() and just use a regular kmap(). Note that this is only
suitable if the desired pgprot_t is compatible.

v2: Use is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-7-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:13:36 +01:00
Chris Wilson f2a85e1975 drm,i915: Introduce drm_malloc_gfp()
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:13:10 +01:00
Chris Wilson eae2c43b12 drm/i915/shrinker: Restrict vmap purge to objects with vmaps
When called because we have run out of vmap address space, we only need
to recover objects that have vmappings and not all.

v2: Start using is_vmalloc_addr()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:12:16 +01:00
Chris Wilson 0a798eb92e drm/i915: Refactor duplicate object vmap functions
We now have two implementations for vmapping a whole object, one for
dma-buf and one for the ringbuffer. If we couple the mapping into the
obj->pages lifetime, then we can reuse an obj->mapping for both and at
the same time couple it into the shrinker. There is a third vmapping
routine in the cmdparser that maps only a range within the object, for
the time being that is left alone, but will eventually use these routines
in order to cache the mapping between invocations.

v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala)
v3: Call unpin_vmap from the right dmabuf unmapper

v4: Rename vmap to map as we don't wish to imply the type of mapping
involved, just that it contiguously maps the object into kernel space.
Add kerneldoc and lockdep annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2016-04-11 17:11:44 +01:00
Chris Wilson d2cad5358b drm/i915: Consolidate common error handling in intel_pin_and_map_ringbuffer_obj
After we pin the ringbuffer into the GGTT, all error paths need to unpin
it again. Move this common step into one block, and make the unable to
iomap error code consistent (i.e. treat it as out of memory to avoid
confusing it with a invalid argument).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-3-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:11:08 +01:00
Chris Wilson 6d19245f18 drm/i915/dmabuf: Tighten struct_mutex for unmap_dma_buf
We only need the struct_mutex to manipulate the pages_pin_count on the
object, we do not need to hold our BKL when freeing the exported
scatterlist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-2-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:10:56 +01:00
Tim Gore b1e429fe3b drm/i915: implement WaClearTdlStateAckDirtyBits
This is to fix a GPU hang seen with mid thread pre-emption
and pooled EUs.

v2. Use IS_BXT_REVID instead of IS_BROXTON and INTEL_REVID

v3. And use correct type for register addresses

Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458571049-854-1-git-send-email-tim.gore@intel.com
2016-04-11 11:59:43 +01:00
Dongwon Kim 25a5670533 drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
For BXT, description of polarities of PORT_PLL_REF_SEL
has been reversed for newer Gen9LP steppings according to the
recent update in Bspec. This bit now should be set for
"Non-SSC" mode for all Gen9LP starting from B0 stepping.

v2: Only B0 and newer stepping should be affected by this
change.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94866
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458176773-26925-1-git-send-email-dongwon.kim@intel.com
2016-04-11 13:02:23 +03:00
Maarten Lankhorst c0ead7039a drm/i915: Rename hw state checker to hw state verifier.
Check functions are used by atomic to see if the new state will
be allowed. There's also a hw state checker which checks afterwards
that the committed state is correct. Rename it to hw state verifier
to reduce some confusion.

Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56FB8785.8020506@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2016-04-11 10:34:17 +02:00
Maarten Lankhorst f6d1973db2 drm/i915: Move modeset state verifier calls.
The modeset state verifier no longer has full access to the hardware,
instead it should only verify affected crtc's.

Looking for disabled stuff can be verified immediately after all crtc
disables have completed, while each enabled crtc can be verified right
after being enabled.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: check -> verify]
2016-04-11 10:33:30 +02:00
Maarten Lankhorst e7c8454475 drm/i915: Make modeset state verifier take crtc as argument.
This will make it easier to keep the crtc checker when atomic
commit is reworked for asynchronous commits. This prevents checking
crtc's that were not part of the state. It's safe to verify disabled
encoders, connectors and dpll's that are not part of the state,
because during modeset connection_mutex is held.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458741487-23801-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[mlankhorst: Extend commit message and rename check to verify.]
2016-04-11 10:32:56 +02:00
Chris Wilson 5dd8e50c27 drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
When reading from the HWS page, we use barrier() to prevent the compiler
optimising away the read from the volatile (may be updated by the GPU)
memory address. This is more suited to READ_ONCE(); make it so.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-5-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:59 +01:00
Chris Wilson 0d317ce99e drm/i915: Use simplest form for flushing the single cacheline in the HWS
Rather than call a function to compute the matching cachelines and
clflush them, just call the clflush *instruction* directly. We also know
that we can use the unpatched plain clflush rather than the clflushopt
alternative.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-4-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:45 +01:00
Chris Wilson 12471ba87a drm/i915: Harden detection of missed interrupts
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.

v2: Clear the stuck interrupt marker between successful batches

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-3-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:29 +01:00
Chris Wilson c04e0f3b4e drm/i915: Separate out the seqno-barrier from engine->get_seqno
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:05 +01:00
Chris Wilson 9b9ed30936 drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
In order to ensure seqno/irq coherency, we currently read a ring register.
The mmio transaction following the interrupt delays the inspection of
the seqno long enough for the MI_STORE_DWORD_IMM to update the CPU
cache. However, it is only the memory timing that is important for the
purposes of the delay, we do not need nor desire the extra forcewake.

v3: Update commentary

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-1-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:08:53 +01:00
Akash Goel 782f6bc0ab drm/i915: Fixup the free space logic in ring_prepare
Currently for the case where there is enough space at the end of Ring
buffer for accommodating only the base request, the wrapround is done
immediately and as a result the base request gets added at the start
of Ring buffer. But there may not be enough free space at the beginning
to accommodate the base request, as before the wraparound, the wait was
effectively done for the reserved_size free space from the start of
Ring buffer. In such a case there is a potential of Ring buffer overflow,
the instructions at the head of Ring (ACTHD) can get overwritten.

Since the base request can fit in the remaining space, there is no need
to wraparound immediately. The wraparound will anyway happen later when
the reserved part starts getting used.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1457688402-10411-1-git-send-email-akash.goel@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
2016-04-09 11:37:48 +01:00
Chris Wilson cffa781e59 drm/i915: Simplify check for idleness in hangcheck
Having fixed the tracking of the engine's last_submitted_seqno, we can
now rely on it for detecting when the engine is idle (and not have to
touch the requests pointer).

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:45:05 +01:00
Chris Wilson 7c90b7de73 drm/i915: Apply a mb between emitting the request and hangcheck
Seal the request and mark it as pending execution before we submit it to
hardware. We assume that the actual submission cannot fail (that
guarantee is provided by preallocating space in the request for the
submission). As we may inspect this state without holding any locks
during hangcheck we should apply a barrier to ensure that we do
not see a more recent value in the HWS than we are tracking.

Based on a patch by Mika Kuoppala.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-8-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:44:48 +01:00
Chris Wilson 01347126f4 drm/i915: Reset engine->last_submitted_seqno
When we change the current seqno, we also need to remember to reset the
last_submitted_seqno for the engine.

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-7-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:44:22 +01:00
Chris Wilson a058d93482 drm/i915: Reset semaphore page for gen8
An oversight is that when we wrap the seqno, we need to reset the hw
semaphore counters to 0. We did this for gen6 and gen7 and forgot to do
so for the new implementation required for gen8 (legacy).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-6-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:59 +01:00
Chris Wilson 8c12672ee2 drm/i915: Refactor gen8 semaphore offset calculation
We reuse the same calculation into two macros, and I want to add a third
user. Time to refactor.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-5-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:48 +01:00
Chris Wilson 29dcb57026 drm/i915: Move the hw semaphore initialisation from GEM to the engine
Since we are setting engine local values that are tied to the hardware,
move it out of i915_gem_init_seqno() into the intel_ring_init_seqno()
backend, next to where the other hw semaphore registers are written.

v2: Make the explanatory comment about always resetting the semaphores to
0 irrespective of the value of the reset seqno.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-4-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:38 +01:00
Chris Wilson d04bce48a5 drm/i915: Remove unneeded drm_device pointer from intel_ring_init_seqno()
We only use drm_i915_private within the function, so delete the unneeded
drm_device local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-3-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:25 +01:00
Chris Wilson 2ed53a94d8 drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno
After the GPU reset and we discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and ready for fresh work). The impact of this is
negligible, as all the requests will be considered completed by this
point, it just brings the HWS into line with expectations for external
viewers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-2-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:43:11 +01:00
Chris Wilson 14fd0d6d0b drm/i915: Include engine->last_submitted_seqno in GPU error state
It's useful to look at the last seqno submitted on a particular engine
and compare it against the HWS value to check for irregularities.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-1-git-send-email-chris@chris-wilson.co.uk
2016-04-08 11:42:50 +01:00
Ramalingam C 6f0e7535e7 drm/i915/BXT: Get pipe conf from the port registers
At BXT DSI, PIPE registers are inactive. So we can't get the
PIPE's mode parameters from them. The possible option is
retriving them from the PORT registers.

The required changes are added for BXT in intel_dsi_get_config
(encoder->get_config).

v2: Addressed the Jani's comments
    -removed the redundant call to encoder->get_config
    -read bpp from port register
    -removed retrival of src_size from encoder->get_config

v3: pipe_config->pipe_bpp is fixed
    Jani's review comments addressed:
	Few horizontal timing parameters dropped from the patch to make
	progress, as there seems to be some disagreement on
	best/feasible/possible options.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>

Previously Reviewed at: https://lists.freedesktop.org/archives/intel-gfx/2016-April/091737.html
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460019967-26501-2-git-send-email-ramalingam.c@intel.com
2016-04-07 16:46:09 +03:00
Ramalingam C 43367ec962 drm/i915: Sharing the pixel_format_from_vbt to whole i915
Shared the function pixel_format_from_vbt for whole display module.
Function declaration is added to intel_dsi.h.

V2: Moved the function to intel_dsi.c and renamed as per the purpose
	of the function. Suggested by Jani.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>

Previously reviewed at https://lists.freedesktop.org/archives/intel-gfx/2016-April/091736.html
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460019967-26501-1-git-send-email-ramalingam.c@intel.com
2016-04-07 16:46:06 +03:00
Jani Nikula b13d8e2888 drm/i915/dsi: use a temp variable for referencing the gpio table
The shorthand is easier. Also change the struct name. No functional
changes.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/6572c108424a67b02367ea69cbbe00a03af9b958.1459884518.git.jani.nikula@intel.com
2016-04-07 16:37:03 +03:00
Jani Nikula 515d07dedc drm/i915/dsi: abstract VLV gpio element execution to a separate function
Prepare for future. No functional changes.

v2: Move earlier in the series. Use bool for gpio value.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[Jani: restored fixme comment while applying.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/ee791fed271d7f31c34163de6c6be37d1b704ef3.1459884518.git.jani.nikula@intel.com
2016-04-07 16:35:52 +03:00
Jani Nikula b0c91cd0a6 drm/i915/dsi: clean up vlv gpio table and definitions
Define and store the pad base offset in the array, and reference the
pconf0 and padval registers through macros. Add VLV prefixes to
macros. Use spec nomenclature for pconf0 and padval.

v2: Address Ville's review comments, squash another patch here.

v3: Use the names Ville dug up in the specs.

Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/34932140b78a3de7f825c78380a08c930694651b.1459884518.git.jani.nikula@intel.com
2016-04-07 16:35:07 +03:00
Joonas Lahtinen 2d1fe07340 drm/i915: Do not use {HAS_*, IS_*, INTEL_INFO}(dev_priv->dev)
dev_priv is what the macro works hard to extract, pass it directly.

> sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g'

v2:
- Include all wrapper macros too (Chris)

v3:
- Include sed cmdline (Chris)

v4:
- Break long line
- Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-04-07 14:50:26 +03:00
Joonas Lahtinen 9458f4ba7d drm/i915: Do not WARN_ON in i915_vm_to_ppgtt
According to Chris, use of i915_vm_to_ppgtt is visible in benchmark
unless WARN_ON is removed, so lets get rid of it.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-04-07 14:49:42 +03:00
Joonas Lahtinen e5716f5575 drm/i915: Use i915_vm_to_ppgtt instead of manual container_of
Looks much better without container_of everywhere.

v2:
- In i915_gem_restore_gtt_mappings too (Chris)

v3:
- Do not cause WARN by calling on non PPGTT object (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-07 14:49:42 +03:00
Dave Airlie fd8c61ebd4 Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Lots of misc bug fixes for radeon and amdgpu and one for ttm.
- fix vram info fetching on Fiji and unposted boards
- additional vblank fixes from the conversion to drm_vblank_on/off
- UVD dGPU suspend and resume fixes
- lots of powerplay fixes
- fix a fence leak in the pageflip code
- ttm fix for platforms where CPU is 32 bit, but physical addresses are >32bits

* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux: (21 commits)
  drm/amdgpu: total vram size also reduces pin size
  drm/amd/powerplay: add uvd/vce dpm enabling flag default.
  drm/amd/powerplay: fix issue that resume back, dpm can't work on FIJI.
  drm/amdgpu: save and restore the firwmware cache part when suspend resume
  drm/amdgpu: save and restore UVD context with suspend and resume
  drm/ttm: use phys_addr_t for ttm_bus_placement
  drm/radeon: Only call drm_vblank_on/off between drm_vblank_init/cleanup
  drm/amdgpu: fence wait old rcu slot
  drm/amdgpu: fix leaking fence in the pageflip code
  drm/amdgpu: print vram type rather than just DDR
  drm/amdgpu/gmc: use proper register for vram type on Fiji
  drm/amdgpu/gmc: move vram type fetching into sw_init
  drm/amdgpu: Set vblank_disable_allowed = true
  drm/radeon: Set vblank_disable_allowed = true
  drm/amd/powerplay: Need to change boot to performance state in resume.
  drm/amd/powerplay: add new Fiji function for not setting same ps.
  drm/amdgpu: check dpm state before pm system fs initialized.
  drm/amd/powerplay: notify amdgpu whether dpm is enabled or not.
  drm/amdgpu: Not support disable dpm in powerplay.
  drm/amdgpu: add an cgs interface to notify amdgpu the dpm state.
  ...
2016-04-07 07:08:46 +10:00
Matt Roper 281c114f8e drm/i915/bxt: Set max cdclk frequency properly
intel_update_max_cdclk() doesn't have a switch case for Broxton, so
dev_priv->max_cdclk_freq gets set to whatever clock frequency we're
currently running at (e.g., 144 MHz) rather than the true maximum.  This
causes our max dotclock to also be set too low and in turn leads mode
verification to reject perfectly valid modes while loading EDID firmware
blobs.

Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459892239-14041-1-git-send-email-matthew.d.roper@intel.com
2016-04-06 11:01:02 -07:00