This looked like an odd regression from
commit ec5cc0f9b0
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Jun 12 10:28:55 2014 +0100
drm/i915: Restrict GPU boost to the RCS engine
but in reality it uncovered a much older coherency bug. The issue that
boosting the GPU frequency on the BCS ring was masking was that we could
wake the CPU up after completion of a BCS batch and inspect memory prior
to the write cache being fully evicted. In order to serialise the
breadcrumb interrupt (and so ensure that the CPU's view of memory is
coherent) we need to perform a post-sync operation in the MI_FLUSH_DW.
v2: Fix all the MI_FLUSH_DW (bsd plus the duplication in execlists).
Also fix the invalidate_domains mask in gen8_emit_flush() for ring !=
VCS.
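As a rough sketch (not the verbatim diff), the MI_FLUSH_DW emission gains
a post-sync write; the macro names below (MI_FLUSH_DW_OP_STOREDW,
MI_FLUSH_DW_USE_GTT, I915_GEM_HWS_SCRATCH_ADDR) are assumptions for
illustration:

    cmd = MI_FLUSH_DW;
    if (invalidate & I915_GEM_GPU_DOMAINS)
            cmd |= MI_INVALIDATE_TLB;
    /*
     * Post-sync op: force the flush (and the write cache eviction) to
     * complete before the breadcrumb interrupt fires, so the CPU sees
     * coherent memory when it wakes up.
     */
    cmd |= MI_FLUSH_DW_OP_STOREDW;
    intel_ring_emit(ring, cmd);
    intel_ring_emit(ring, I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT);
    intel_ring_emit(ring, 0); /* post-sync write value */
    intel_ring_emit(ring, MI_NOOP);
    intel_ring_advance(ring);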
Testcase: gpuX-rcs-gpu-read-after-write
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I ran a few tests with xonotic and synmark2 trying out the
different WIZ hashing modes on CHV. The results seem to match the
results I got with IVB/HSW when I did similar tests on them
in the past. That is, 16x4 is generally the fastest mode, 8x8 comes
next and finally 8x4. On CHV the difference between the modes is
at most ~1% in most tests. IIRC on IVB/HSW the difference was a little
bigger, but as there doesn't seem to be any real downside to 16x4
let's use it by default.
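For reference, selecting the mode boils down to a single masked-field
write along these lines (a sketch; the WA_SET_FIELD_MASKED helper and
the field macros are assumed from the existing workaround code):

    /* default to 16x4 WIZ hashing, the fastest mode in these tests */
    WA_SET_FIELD_MASKED(GEN7_GT_MODE,
                        GEN6_WIZ_HASHING_MASK,
                        GEN6_WIZ_HASHING_16x4);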
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Wa4x4STCOptimizationDisable was only implemented for BDW, but according
to the w/a database CHV needs it too, so add it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have multiple forcewake domains now on recent gens. Change the
function naming to reflect this.
v2: More verbose names (Chris)
v3: Rebase
v4: Rebase
v5: Add documentation for forcewake_get/put
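Call sites then look roughly like this (a sketch; the domain
identifiers are assumptions for illustration):

    /* wake only the render well around a burst of register accesses */
    intel_uncore_forcewake_get(dev_priv, FORCEWAKE_RENDER);
    /* ... MMIO reads/writes ... */
    intel_uncore_forcewake_put(dev_priv, FORCEWAKE_RENDER);

    /* or grab every domain at once */
    intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
    intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);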
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Where there were duplicate variables for the tail, context and ring (engine)
in the gem request and the execlist queue item, use the one from the request
and remove the duplicate from the execlist queue item.
Issue: VIZ-4274
v1: Rebase
v2: Fixed build issues. Keep separate postfix & tail pointers as these are
used in different ways. Reinserted missing full tail pointer update.
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is an important optimization for avoiding read-after-write (RAW)
stalls in the HiZ buffer. Certain workloads would run very slowly with
HiZ enabled, but run much faster with the "hiz=false" driconf option.
With this patch, they run at full speed even with HiZ.
Increases performance in OglVSInstancing by about 2.7x on Braswell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is an important optimization for avoiding read-after-write (RAW)
stalls in the HiZ buffer. Certain workloads would run very slowly with
HiZ enabled, but run much faster with the "hiz=false" driconf option.
With this patch, they run at full speed even with HiZ.
Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e
(Iris Pro 6200).
Thanks to Jesse Barnes and Ben Widawsky for their help in tracking this
down. Thanks to Chris Wilson for showing me the new workarounds system.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Found by reading the HIZ_CHICKEN documentation.
Improves performance in a HiZ microbenchmark by around 50%.
Improves performance in OglZBuffer by around 18%.
Thanks to Chris Wilson for helping me figure out where to put this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Conflicts:
drivers/gpu/drm/i915/intel_runtime_pm.c
Separate branch so that Takashi can also pull just this refactoring
into sound-next.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
In order to act as a full command barrier by itself, we need to tell the
pipecontrol to actually stall the command streamer while the flush runs.
We require the full command barrier before operations like
MI_SET_CONTEXT, which currently rely on a prior invalidate flush.
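As a sketch (not the verbatim diff), the invalidate path of the gen7
render flush gains the stall bit:

    if (invalidate_domains) {
            /* existing TLB/cache invalidate bits stay as they are */
            flags |= PIPE_CONTROL_TLB_INVALIDATE;
            /*
             * Stall the command streamer until the flush completes, so
             * this PIPE_CONTROL is a full command barrier on its own
             * (needed before e.g. MI_SET_CONTEXT).
             */
            flags |= PIPE_CONTROL_CS_STALL;
    }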
References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In the gen7 pipe control there is an extra bit to flush the media
caches, so let's set it during cache invalidation flushes.
v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec.
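The change itself is a one-bit addition to the invalidate path, roughly:

    if (invalidate_domains) {
            flags |= PIPE_CONTROL_TLB_INVALIDATE;
            /* gen7 has an extra bit to also flush the media caches */
            flags |= PIPE_CONTROL_MEDIA_STATE_CLEAR;
            /* ... remaining invalidate bits unchanged ... */
    }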
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Otherwise, new platforms without workarounds will hit this warning for
every new context created.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already implement this workaround, but it was missing its name.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We may be hiding bugs by doing that, so let's remove it and have the
actual mask value shine through, for better or worse.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
While trying to unify the order of those arguments throughout the
driver, Daniel noticed that we were inverting them in this part of the
code.
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I was playing with clang and oh surprise! a warning triggered by
-Wshift-overflow (gcc doesn't have this one):
WA_SET_BIT_MASKED(GEN7_GT_MODE,
GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result
(0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits
[-Wshift-overflow]
WA_SET_BIT_MASKED(GEN7_GT_MODE,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro
'WA_SET_BIT_MASKED'
WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff)
Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were
trying to shift it a bit more.
The other thing is that this isn't the usual case of setting WA bits here;
we need to have a separate mask and value.
To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the
(unshifted) mask and the desired value and the rest of the patch ripples
through from it.
This bug was introduced when reworking the WA emission in:
Commit 7225342ab5
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Oct 7 17:21:26 2014 +0300
drm/i915: Build workaround list in ring initialization
v2: Invert the order of the mask and value arguments (Daniel Vetter)
Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with
_MASKED_FIELD() (Jani Nikula)
Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon)
Add check to ensure the value is within the mask boundaries (Chris Wilson)
v3: Ensure the value and mask are 16 bits (Dave Gordon)
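A sketch of the resulting helpers, reflecting the v2/v3 notes (the
checks in the actual patch may be phrased slightly differently):

    #define _MASKED_FIELD(mask, value)  ({                                     \
            if (__builtin_constant_p(mask))                                    \
                    BUILD_BUG_ON_MSG((mask) & 0xffff0000, "Incorrect mask");   \
            if (__builtin_constant_p(value))                                   \
                    BUILD_BUG_ON_MSG((value) & 0xffff0000, "Incorrect value"); \
            if (__builtin_constant_p(mask) && __builtin_constant_p(value))     \
                    BUILD_BUG_ON_MSG((value) & ~(mask),                        \
                                     "Incorrect value for mask");              \
            (mask) << 16 | (value); })
    /* evaluate 'a' only once */
    #define _MASKED_BIT_ENABLE(a)   ({ typeof(a) _a = (a); _MASKED_FIELD(_a, _a); })
    #define _MASKED_BIT_DISABLE(a)  (_MASKED_FIELD((a), 0))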
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Similar to a patch from Thomas Daniel for lrc contexts. This keeps
both sides somewhat in sync and should make Dave Gordon happy.
Note that both the wa and the golden context init code suffer a bit
from an insufficient split into driver load and hw init code, which
means we have a bunch of tests all over the place to check whether the
one-time initialization has been done already or not.
All that one-time code should be moved into the one-time ring setup
code, but that's work for later.
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For debugging purposes, it is useful to be able to uniquely identify a given
request structure as it works its way through the system. This becomes
especially tricky once the seqno value is lazily allocated as then the request
has nothing but its pointer to identify it for much of its life.
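A minimal sketch of the idea (the field and counter names here are
assumptions for illustration):

    struct drm_i915_gem_request {
            /* ... */
            /*
             * Global, monotonically increasing id assigned at allocation;
             * valid even before a seqno has been lazily allocated.
             */
            uint32_t uniq;
    };

    /* at request creation */
    request->uniq = dev_priv->request_uniq++;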
Change-Id: Ie76b2268b940467f4cdf5a4ba6f5a54cbb96445d
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There is a general theory that kzalloc is better/safer than kmalloc, especially
for interesting data structures. This change updates the request structure
allocation to be zero filled.
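In practice the change is a one-liner in the request allocation path,
roughly:

    -       request = kmalloc(sizeof(*request), GFP_KERNEL);
    +       request = kzalloc(sizeof(*request), GFP_KERNEL);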
This also fixes crashes in the reset code. Quoting Mika's patch:
"Clean the request structure on alloc. Otherwise we might end up
referencing uninitialized fields. This is apparent when we try to
cleanup the preallocated request on ring reset, before any request has
been submitted to the ring. The request->ctx is foobar and we end up
freeing the foobarness."
Note that this fixes a regression introduced in
commit 9eba5d4a1d
Author: John Harrison <John.C.Harrison@Intel.com>
Date: Mon Nov 24 18:49:23 2014 +0000
drm/i915: Ensure OLS & PLR are always in sync
References: https://bugs.freedesktop.org/show_bug.cgi?id=86959
References: https://bugs.freedesktop.org/show_bug.cgi?id=86962
References: https://bugs.freedesktop.org/show_bug.cgi?id=86992
Change-Id: I68715ef758025fab8db763941ef63bf60d7031e2
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already have it for chv, but it was missing for bdw.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that sanity prevails and we have the clean split between software
init and starting the engines we can drop all the "have we allocated
this struct already?" nonsense.
Execlist code could benefit quite a bit more still, but that's for
another patch.
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
We can do this.
And now there's finally the clean split between software setup and
hardware setup I kinda wanted since multi-ring support was merged
aeons ago. It only took almost 5 years.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With this all the ->init_hw hooks really only set up hw state needed
to start the ring, all the software state setup and memory/buffer
allocations happen beforehand.
v2: We need to call intel_init_pipe_control after the ring init since
otherwise engine->dev is NULL and it falls over. Currently that's
now after the hw ring is enabled but a) we'll be fine as long as no
one submits a batch b) this will change soon.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is (mostly, some exceptions that need fixing) the hw setup
function which starts the ring. And not the function which allocates
all the resources.
Make this clear by giving it a better name.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are numerous places in the code where the driver's idea of
how much space is left in a ring is updated using the driver's
latest notions of the positions of 'head' and 'tail' for the ring.
Among them are some that update one or both of these values before
(re)doing the calculation. In particular, there are four different
places in the code where 'last_retired_head' is copied to 'head'
and then set to -1; and two of these do not have a guard to check
that it has actually been updated since last time it was consumed,
leaving the possibility that the dummy -1 can be transferred from
'last_retired_head' to 'head', causing the space calculation to
produce 'impossible' results (previously seen on Android/VLV).
This code therefore consolidates all the calculation and updating of
these values, such that there is only one place where the ring space
is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if
(and ONLY if) it has been updated since the last call.
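A sketch of the single consolidated update (helper names are
illustrative):

    static void intel_ring_update_space(struct intel_ringbuffer *ringbuf)
    {
            /*
             * Consume last_retired_head if (and only if) the retire code
             * has refreshed it since the previous call; never re-use the
             * -1 sentinel as a head position.
             */
            if (ringbuf->last_retired_head != -1) {
                    ringbuf->head = ringbuf->last_retired_head;
                    ringbuf->last_retired_head = -1;
            }

            ringbuf->space = __intel_ring_space(ringbuf->head & HEAD_ADDR,
                                                ringbuf->tail, ringbuf->size);
    }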
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The used space in a ring is given by the cyclic distance from the
consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size);
conversely, the available space in a ring is the cyclic distance
from the producer to the consumer, MINUS the amount reserved for a
"gap" that is supposed to guarantee that the producer never catches
up with or overruns the consumer. Note that some GEN h/w requires
that TAIL never approaches within one cacheline of HEAD, so the gap
is usually set to twice the cacheline size to ensure this.
While the existing code gives the correct answer for correct inputs,
if the producer HAS overrun into the reserved space, the result can
be a value larger than the maximum valid value (size-reserved). We
can improve this by reorganising the calculation, so that in the
event of overrun the result will be negative rather than over-large.
This means that the commonly-used test (available >= required)
will then reject further writes into the ring after an overrun,
giving some chance that we can recover from or at least diagnose
the original problem; whereas allowing more writes would likely both
confuse the h/w and destroy the evidence of what went wrong.
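The reorganised calculation, as a sketch:

    /*
     * Free space is the cyclic distance from producer (tail) to consumer
     * (head), minus the reserved gap. If the producer has overrun into
     * the gap the result goes negative, so the usual
     * "available >= required" check rejects further writes.
     */
    int __intel_ring_space(int head, int tail, int size)
    {
            int space = head - tail;

            if (space <= 0)
                    space += size;

            return space - I915_RING_FREE_SPACE;
    }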
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It makes a lot more sense (and makes future seqno -> request conversion patches
simpler) to fill in the 'ring' field of the request structure at the point of
creation rather than submission. Given that the request structure is assigned by
ring specific code and thus is locked to a ring from the start, there really is
no reason to defer this assignment.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There is no longer any need to retrieve a seqno value from an i915_add_request()
call. The calling code already knows which request structure is being processed
(it can only be ring->OLR). And as the request itself is now used in preference
to the basic seqno value, the latter is now redundant in this situation.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Updated i915_wait_seqno() to take a request structure instead of a seqno value
and renamed it accordingly. Internally, it just pulls the seqno out of the
request and calls on to __wait_seqno() as before. However, all the code further
up the stack is now simplified as it can just pass the request object straight
through without having to peek inside.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Squash in hunk from an earlier patch which was rebased
wrongly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The OLS value is now obsolete. Exactly the same value is guaranteed to always be
available as PLR->seqno. Thus it is safe to remove the OLS completely. And also
to rename the PLR to OLR to keep the 'outstanding lazy ...' naming convention
valid.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The plan is to use request structures everywhere that seqno values were
previously used. This means saving pointers to structures in places that used to
be simple integers. In turn, that means that the target structure now needs much
more stringent lifetime tracking. That is, it must not be freed while some other
random object still holds a pointer to it.
To achieve this tracking, a reference count needs to be added. Whenever a
pointer to the structure is saved away, the count must be incremented and the
free must only occur when all references have been released.
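A minimal sketch of the reference counting, built on a kref (helper
names are assumptions for illustration):

    void i915_gem_request_free(struct kref *req_ref)
    {
            struct drm_i915_gem_request *req =
                    container_of(req_ref, typeof(*req), ref);

            /* drop ctx/ring references here, then */
            kfree(req);
    }

    static inline void
    i915_gem_request_reference(struct drm_i915_gem_request *req)
    {
            kref_get(&req->ref);
    }

    static inline void
    i915_gem_request_unreference(struct drm_i915_gem_request *req)
    {
            kref_put(&req->ref, i915_gem_request_free);
    }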
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The aim is to replace seqno values with request structures. A step along the way
is to switch to using the PLR in preference to the OLS. That requires the PLR to
only be valid when and only when the OLS is also valid. I.e., the two must be
kept in lock step. Then, code which was using the OLS can be safely switched
over to using the PLR instead.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the deprecation of UMS, and by association DRI1, we have a tough
choice when updating the ring access routines. We either rewrite the
DRI1 routines blindly without testing (so likely to be broken) or take
the liberty of declaring them no longer supported and remove them
entirely. This takes the latter approach.
v2: Also remove the DRI1 sarea updates
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Fix rebase conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Same as with the context, pinning to GGTT regardless is harmful (it
badly fragments the GGTT and can even exhaust it).
Unfortunately, this case is also more complex than the previous one
because we need to map and access the ringbuffer in several places
along the execbuffer path (and we cannot make do by leaving the
default ringbuffer pinned, as before). Also, the context object
itself contains a pointer to the ringbuffer address that we have to
keep updated if we are going to allow the ringbuffer to move around.
v2: Same as with the context pinning, we cannot really do it during
an interrupt. Also, pin the default ringbuffers objects regardless
(makes error capture a lot easier).
v3: Rebased. Take a pin reference of the ringbuffer for each item
in the execlist request queue because the hardware may still be using
the ringbuffer after the MI_USER_INTERRUPT to notify the seqno update
is executed. The ringbuffer must remain pinned until the context save
is complete. No longer pin and unpin ringbuffer in
populate_lr_context() - this transient address is meaningless and the
pinning can cause a sleep while atomic.
v4: Moved ringbuffer pin and unpin into the lr_context_pin functions.
Downgraded pinning check BUG_ONs to WARN_ONs.
v5: Reinstated WARN_ONs for unexpected execlist states. Removed unused
variable.
Issue: VIZ-4277
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Following the legacy ring submission example, update the
ring->init_context() hook to support the execlist submission mode.
v2: update to use the new workaround macros and cleanup unused code.
This takes care of both bdw and chv workarounds.
v2.1: Add missing call to init_context() during deferred context creation.
v3: Split init_context (emit) in legacy/lrc modes. For lrc, get the ringbuf
from the context (Mika/Daniel).
v4: Merge init_context interfaces back, the legacy mode only needs the ring,
but the lrc mode needs the ring and context (Mika).
Issue: VIZ-4092
Issue: GMIN-3475
Change-Id: Ie3d093b2542ab0e2a44b90460533e2f979788d6c
Cc: Deepak S <deepak.s@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Align function paramater lists properly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
WaDisablePartialInstShootdown:chv and
WaDisableThreadStallDopClockGating:chv are related to the same
register so combine them.
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-WaDisableDopClockGating:chv
-WaDisableSamplerPowerBypass:chv
-WaDisableGunitClockGating:chv
-WaDisableFfDopClockGating:chv
-WaDisableDopClockGating:chv
v2: Remove pre-production WA instead of restricting them
based on revision id (Ville)
For: VIZ-4090
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If a ring failed to initialise for any reason then the error path would try to
clean up all rings including those that had not yet been allocated. The ring
clean up code did a check that the ring was valid before starting its work.
Unfortunately, that was after it had already dereferenced the ring to obtain a
dev_private pointer.
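The fix is simply to defer the dereference until after the check,
roughly:

    void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
    {
            struct drm_i915_private *dev_priv;

            /* bail out before touching ring->dev on never-allocated rings */
            if (!intel_ring_initialized(ring))
                    return;

            dev_priv = to_i915(ring->dev);
            /* ... rest of the cleanup unchanged ... */
    }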
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- suspend/resume/freeze/thaw unification from Imre
- wa list improvements from Mika&Arun
- display pll precomputation from Ander Conselvan, this removed the last
->mode_set callbacks, a big step towards implementing atomic modesets
- more kerneldoc for the interrupt code
- 180 rotation for cursors (Ville&Sonika)
- ULT/ULX feature check macros cleaned up thanks to Damien
- piles and piles of fixes all over, bug team seems to work!
* tag 'drm-intel-next-2014-10-24' of git://anongit.freedesktop.org/drm-intel: (61 commits)
drm/i915: Update DRIVER_DATE to 20141024
drm/i915: add comments on what stage a given PM handler is called
drm/i915: unify switcheroo and legacy suspend/resume handlers
drm/i915: add poweroff_late handler
drm/i915: sanitize suspend/resume helper function names
drm/i915: unify S3 and S4 suspend/resume handlers
drm/i915: disable/re-enable PCI device around S4 freeze/thaw
drm/i915: enable output polling during S4 thaw
drm/i915: check for GT faults in all resume handlers and driver load time
drm/i915: remove unused restore_gtt_mappings optimization during suspend
drm/i915: fix S4 suspend while switcheroo state is off
drm/i915: vlv: fix switcheroo/legacy suspend/resume
drm/i915: propagate error from legacy resume handler
drm/i915: unify legacy S3 suspend and S4 freeze handlers
drm/i915: factor out i915_drm_suspend_late
drm/i915: Emit even number of dwords when emitting LRIs
drm/i915: Add rotation support for cursor plane (v5)
drm/i915: Correctly reject invalid flags for wait_ioctl
drm/i915: use macros to assign mmio access functions
drm/i915: only run hsw_power_well_post_enable when really needed
...
Ok, new attempt, this time around with full ppgtt disabled again.
drm-intel-next-2014-10-03:
- first batch of skl stage 1 enabling
- fixes from Rodrigo to the PSR, fbc and sink crc code
- kerneldoc for the frontbuffer tracking code, runtime pm code and the basic
interrupt enable/disable functions
- smaller stuff all over
drm-intel-next-2014-09-19:
- bunch more i830M fixes from Ville
- full ppgtt now again enabled by default
- more ppgtt fixes from Michel Thierry and Chris Wilson
- plane config work from Gustavo Padovan
- spinlock clarifications
- piles of smaller improvements all over, as usual
* tag 'drm-intel-next-2014-10-03-no-ppgtt' of git://anongit.freedesktop.org/drm-intel: (114 commits)
Revert "drm/i915: Enable full PPGTT on gen7"
drm/i915: Update DRIVER_DATE to 20141003
drm/i915: Remove the duplicated logic between the two shrink phases
drm/i915: kerneldoc for interrupt enable/disable functions
drm/i915: Use dev_priv instead of dev in irq setup functions
drm/i915: s/pm._irqs_disabled/pm.irqs_enabled/
drm/i915: Clear TX FIFO reset master override bits on chv
drm/i915: Make sure hardware uses the correct swing margin/deemph bits on chv
drm/i915: make sink_crc return -EIO on aux read/write failure
drm/i915: Constify send buffer for intel_dp_aux_ch
drm/i915: De-magic the PSR AUX message
drm/i915: Reinstate error level message for non-simulated gpu hangs
drm/i915: Kerneldoc for intel_runtime_pm.c
drm/i915: Call runtime_pm_disable directly
drm/i915: Move intel_display_set_init_power to intel_runtime_pm.c
drm/i915: Bikeshed rpm functions name a bit.
drm/i915: Extract intel_runtime_pm.c
drm/i915: Remove intel_modeset_suspend_hw
drm/i915: spelling fixes for frontbuffer tracking kerneldoc
drm/i915: Tighting frontbuffer tracking around flips
...
The number of DWords should be even when doing ring emits as
command sequences require QWord alignment.
There was some discussion about the maximum length of the MI_LRI
command. Quoting Mika
"I did some test with bdw:
"The maximum is 128 writes, resulting the 8 bit length
field of the command being 0xff, thus following the spec.
The 128'th write went through.
"Perhaps the max command length is then less in older gens?
"Perhaps WARN_ON(x > 128) in MI_LOAD_REGISTER_IMM would be in place
but one needs minor tweak to command parser a bit also then.
#define I915_MAX_WA_REGS 16
keeps us safe for now atleast."
Ville commented that on pre-gen6 the length field seems to be
restricted to 0x3f though. So for all cases we should be ok.
v2: use the LRI variant that can write multiple regs in one go (Damien).
We can simply insert one NOP at the end instead of one per register write.
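A sketch of the resulting emission: one LRI covering all workaround
registers, padded with a single MI_NOOP so the total dword count stays
even:

    ret = intel_ring_begin(ring, w->count * 2 + 2);
    if (ret)
            return ret;

    intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(w->count));
    for (i = 0; i < w->count; i++) {
            intel_ring_emit(ring, w->reg[i].addr);
            intel_ring_emit(ring, w->reg[i].value);
    }
    intel_ring_emit(ring, MI_NOOP); /* 1 + 2*count + 1 dwords: always even */

    intel_ring_advance(ring);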
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Add a summary of the MI_LRI length discussion.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we build the workaround list in ring initialization
and decouple it from the actual writing of values, we
gain the ability to decide where and how we want to apply
the values.
The advantage of this will become more clear when
we need to initialize workarounds on older gens where
it is not possible to write all the registers through ring
LRIs.
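A sketch of the decoupled list (field layout as described in this
series; treat the details as illustrative):

    struct i915_wa_reg {
            u32 addr;
            u32 value;
            /* bitmask of the bits in 'value' that are actually meaningful */
            u32 mask;
    };

    #define I915_MAX_WA_REGS 16

    struct i915_workarounds {
            struct i915_wa_reg reg[I915_MAX_WA_REGS];
            u32 count;
    };

    /* built once during ring init ... */
    wa_add(dev_priv, GEN7_GT_MODE, mask, value);
    /* ... and applied later, e.g. via ring LRIs from init_context() */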
v2: rebase on newest bdw workarounds
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
[danvet: Resolve tiny conflict in comments and ocd alignments a bit.]
[danvet2: Remove bogus force_wake_get call spotted by Paulo and QA.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Let's clean this up a bit.
v2: Rebase after Mika's other patch that removed some BDW production workarounds.
v3: Removed stepping info.
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So I've sent the first pull request to Dave and I expect his request
for a merge tree any second now ;-)
More seriously I have some pending patches for 3.19 that depend upon
both trees, hence backmerge. Conflicts are all trivial.
Conflicts:
drivers/gpu/drm/i915/i915_irq.c
drivers/gpu/drm/i915/intel_display.c
v2: Of course I've forgotten the fixup script for the silent conflict.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Pull drm updates from Dave Airlie:
"This is the main git pull for the drm,
I pretty much froze major pulls at -rc5/6 time, and haven't had much
fallout, so will probably continue doing that.
Lots of changes all over, big internal header cleanup to make it clear
drm features are legacy things and what are things that modern KMS
drivers should be using. Also big move to use the new generic fences
in all the TTM drivers.
core:
atomic prep work,
vblank rework changes, allows immediate vblank disables
major header reworking and cleanups to better delineate legacy
interfaces from what KMS drivers should be using.
cursor planes locking fixes
ttm:
move to generic fences (affects all TTM drivers)
ppc64 caching fixes
radeon:
userptr support,
uvd for old asics,
reset rework for fence changes
better buffer placement changes,
dpm feature enablement
hdmi audio support fixes
intel:
Cherryview work,
180 degree rotation,
skylake prep work,
execlist command submission
full ppgtt prep work
cursor improvements
edid caching,
vdd handling improvements
nouveau:
fence reworking
kepler memory clock work
gt21x clock work
fan control improvements
hdmi infoframe fixes
DP audio
ast:
ppc64 fixes
caching fix
rcar:
rcar-du DT support
ipuv3:
prep work for capture support
msm:
LVDS support for mdp4, new panel, gpu refactoring
exynos:
exynos3250 SoC support, drop bad mmap interface,
mipi dsi changes, and component match support"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (640 commits)
drm/mst: rework payload table allocation to conform better.
drm/ast: Fix HW cursor image
drm/radeon/kv: add uvd/vce info to dpm debugfs output
drm/radeon/ci: add uvd/vce info to dpm debugfs output
drm/radeon: export reservation_object from dmabuf to ttm
drm/radeon: cope with foreign fences inside the reservation object
drm/radeon: cope with foreign fences inside display
drm/core: use helper to check driver features
drm/radeon/cik: write gfx ucode version to ucode addr reg
drm/radeon/si: print full CS when we hit a packet 0
drm/radeon: remove unecessary includes
drm/radeon/combios: declare legacy_connector_convert as static
drm/radeon/atombios: declare connector convert tables as static
drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table
drm/radeon/dpm: drop clk/voltage dependency filters for BTC
drm/radeon/dpm: drop clk/voltage dependency filters for CI
drm/radeon/dpm: drop clk/voltage dependency filters for SI
drm/radeon/dpm: drop clk/voltage dependency filters for NI
drm/radeon: disable audio when we disable hdmi (v2)
drm/radeon: split audio enable between eg and r600 (v2)
...
SKL stage 1 patches still need polish so will likely miss the 3.18
merge window. We've decided to postpone to 3.19 so let's pull this in
to make patch merging and conflict handling easier.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
The sw cache clean on BDW is a temporary workaround because we cannot
set the cache clean on the blt ring without risking hangs, so we are doing
the cache clean in sw. However we are doing much more than needed, and not
only when using the blt ring. So, with this extra w/a we minimize the amount
of cache cleans and call it only in the same cases in which it was being
called on gen7.
The traditional FBC cache clean happens via LRI on the BLT ring when there is
a frontbuffer touch: frontbuffer tracking sets the fbc_dirty variable
to tell the BLT flush that it must clean the FBC cache.
fbc.need_sw_cache_clean works in the opposite direction to
ring->fbc_dirty, telling the frontbuffer tracking software to perform
the cache clean on the sw side.
v2: Clean it a little bit and fully check for Broadwell instead of gen8.
v3: Rebase after frontbuffer organization.
v4: Wiggle confused me. So fixing v3!
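A rough sketch of the flow (field and function names are assumptions
for illustration):

    /* fbc side: on BDW the LRI cache clean on the BLT ring is not safe,
     * so ask the frontbuffer tracking code to do it in sw instead */
    if (IS_BROADWELL(dev))
            dev_priv->fbc.need_sw_cache_clean = true;

    /* frontbuffer flush side */
    if (dev_priv->fbc.need_sw_cache_clean) {
            dev_priv->fbc.need_sw_cache_clean = false;
            gen8_fbc_sw_flush(dev, FBC_REND_CACHE_CLEAN);
    }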
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The following sets the AsyncFlip performance mode for everything above
Gen6:
commit 4790cb36b3eede8fb0cca529dc1d31b9936fa24b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Jan 20 16:11:20 2013 +0000
drm/i915: Disable AsyncFlip performance optimisations
Starting from Gen9 the MI_MODE register layout changes and doesn't
include the above bit.
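So the write is simply gated on the gen, roughly:

    /* MI_MODE lost the AsyncFlip performance-mode bit on gen9+ */
    if (INTEL_INFO(dev)->gen >= 6 && INTEL_INFO(dev)->gen < 9)
            I915_WRITE(MI_MODE,
                       _MASKED_BIT_ENABLE(ASYNC_FLIP_PERF_DISABLE));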
Reviewed-by: Thomas Wood <thomas.wood@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>