linux

Commit Graph

Author	SHA1	Message	Date
Daniel Vetter	7cd512f152	drm/i915: Fix irq checks in ring->irq_get/put functions Yet another place that wasn't properly transformed when implementing SOix. While at it convert the checks to WARN_ON on gen5+ (since we don't have UMS potentially doing stupid things on those platforms). And also add the corresponding checks to the put functions (again with a WARN_ON) for gen5+. v2: Drop the WARNINGS in the irq_put functions (including the existing one for vebox), Chris convinced me that they're not that terribly useful. v3: Don't forget about execlist code. Cc: Imre Deak <imre.deak@intel.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "Volkin, Bradley D" <bradley.d.volkin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2014-09-19 14:43:13 +02:00
Chris Wilson	7707225856	drm/i915: HSW always use GGTT selector for secure batches gen6 and earlier conflate address space selection (ppgtt vs ggtt) with the security bit (i.e. only privileged batches were allowed to run from ggtt). From Haswell only, you are able to select the security bit separate from the address space - and we always requested to use ppgtt. This breaks the golden render state batch execution with full-ppgtt as that is only present in the global GTT and more generally any secure batch that is not colocated in the ppgtt and ggtt. So we need to disable the use of the ppgtt selector bit for secure batches, or else we hang immediately upon boot and thence after every GPU reset... v2: Only HSW differentiates between secure dispatch and ggtt, so simply ignore the differentiation and always use secure==ggtt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Rectify commit message as noted by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-19 14:41:18 +02:00
Dave Airlie	40d201af0b	Merge tag 'drm-intel-next-2014-09-05' of git://anongit.freedesktop.org/drm-intel into drm-next - final bits (again) for the rotation support (Sonika Jindal) - support bl_power in the intel backlight (Jani) - vdd handling improvements from Ville - i830M fixes from Ville - piles of prep work all over to make skl enabling just plug in (Damien, Sonika) - rename DP training defines to reflect latest edp standards, this touches all drm drivers supporting DP (Sonika Jindal) - cache edids during single detect cycle to avoid re-reading it for e.g. audio, from Chris - move w/a for registers which are stored in the hw context to the context init code (Arun&Damien) - edp panel power sequencer fixes, helps chv a lot (Ville) - piles of other chv fixes all over - much more paranoid pageflip handling with stall detection and better recovery from Chris - small things all over, as usual * tag 'drm-intel-next-2014-09-05' of git://anongit.freedesktop.org/drm-intel: (114 commits) drm/i915: Update DRIVER_DATE to 20140905 drm/i915: Decouple the stuck pageflip on modeset drm/i915: Check for a stalled page flip after each vblank drm/i915: Introduce a for_each_plane() macro drm/i915: Rewrite ABS_DIFF() in a safer manner drm/i915: Add comments explaining the vdd on/off functions drm/i915: Move DP port disable to post_disable for pch platforms drm/i915: Enable DP port earlier drm/i915: Turn on panel power before doing aux transfers drm/i915: Be more careful when picking the initial power sequencer pipe drm/i915: Reset power sequencer pipe tracking when disp2d is off drm/i915: Track which port is using which pipe's power sequencer drm/i915: Fix edp vdd locking drm/i915: Reset the HEAD pointer for the ring after writing START drm/i915: Fix unsafe vma iteration in i915_drop_caches drm/i915: init sprites with univeral plane init function drm/i915: Check of !HAS_PCH_SPLIT() in PCH transcoder funcs drm/i915: Use HAS_GMCH_DISPLAY un underrun reporting code drm/i915: Use IS_BROADWELL() instead of IS_GEN8() in forcewake code drm/i915: Don't call gen8_fbc_sw_flush() on chv ...	2014-09-16 16:02:09 +10:00
Dave Airlie	b2efb3f0a1	Linux 3.17-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJUFjfVAAoJEHm+PkMAQRiGANkIAIU3PNrAz9dIItq8a/rEAhnx l2shHoOyEmyNR2apholM3BPUNX50cbsc/HGdi7lZKLkA/ifAj6B9nFD2NzVsIChD 1QWVcvdkKlVuxXCDd26qbijlfmbTOAWrLw9ntvM+J6ZtECM6zCAZF4MAV/FwogPq ETGKD76AxJtVIhBMS99troAiC1YxmQ7DKgEr8CraTOR1qwXEonnPCmN/IZA6x2/G EXiihOuQB5me1X7k4PI0V8CDscQOn+3B2CQHIrjRB+KiTF+iKIuI8n6ORC6bpFh+ U8UZP9wLlIG1BrUHG83pIndglIHotqPcjmtfl1WGrRr2hn7abzVSfV+g5Syo3Vg= =Ep+s -----END PGP SIGNATURE----- drm: backmerge tag 'v3.17-rc5' into drm-next This is requested to get the fixes for intel and radeon into the same tree for future development work. i915_display.c: fix missing dev_priv conflict.	2014-09-16 11:38:04 +10:00
Chris Wilson	611a7a4fd8	drm/i915: Fix SRC_COPY width on 830/845g One small change I forgot to make in commit `c4d69da167` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Sep 8 14:25:41 2014 +0100 drm/i915: Evict CS TLBs between batches was to update the copy width for the compact BLT copy instruction. Reported-by: Thomas Richter <thor@math.tu-berlin.de> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Richter <thor@math.tu-berlin.de> Cc: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Tested-by: Thomas Richter <thor@math.tu-berlin.de> Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-09-15 09:55:52 +03:00
Chris Wilson	c4d69da167	drm/i915: Evict CS TLBs between batches Running igt, I was encountering the invalid TLB bug on my 845g, despite that it was using the CS workaround. Examining the w/a buffer in the error state, showed that the copy from the user batch into the workaround itself was suffering from the invalid TLB bug (the first cacheline was broken with the first two words reversed). Time to try a fresh approach. This extends the workaround to write into each page of our scratch buffer in order to overflow the TLB and evict the invalid entries. This could be refined to only do so after we update the GTT, but for simplicity, we do it before each batch. I suspect this supersedes our current workaround, but for safety keep doing both. v2: The magic number shall be 2. This doesn't conclusively prove that it is the mythical TLB bug we've been trying to workaround for so long, that it requires touching a number of pages to prevent the corruption indicates to me that it is TLB related, but the corruption (the reversed cacheline) is more subtle than a TLB bug, where we would expect it to read the wrong page entirely. Oh well, it prevents a reliable hang for me and so probably for others as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2014-09-08 16:45:03 +03:00
Chris Wilson	95468892fd	drm/i915: Reset the HEAD pointer for the ring after writing START Ville found an old w/a documented for g4x that suggested that we need to reset the HEAD after writing START. This is a useful fixup for some of the g4x ring initialisation woes, but as usual, not all. v2: Do the rewrite unconditionally anyway References: https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-04 11:26:17 +02:00
Damien Lespiau	b07ba1dc78	drm/i915: Remove unneeded brackets Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 12:39:10 +02:00
Damien Lespiau	04ad2dc711	drm/i915: Don't silently discard workarounds If we happen to emit more than I915_MAX_WA_REGS workarounds, we will currently discard them, not even emit the LRI. Not really what we want, so warn loudly. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 12:39:03 +02:00
Damien Lespiau	55820e1e84	drm/i915: Don't overrun the intel_wa_regs array When entering intel_ring_emit_wa() with num_wa_regs equal to I915_MAX_WA_REGS, we end up indexing the intel_wa_regs array beyond its allocation. Fix the check then. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 12:38:54 +02:00
Ville Syrjälä	00e1e623e6	drm/i915: Init some CHV workarounds via LRIs in ring->init_context() Follow the BDW example and apply the workarounds touching registers which are saved in the context image through LRIs in the new ring->init_context() hook. This makes Mesa much happier and eg. glxgears doesn't hang after the first frame. Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Add missing wa table initialization to avoid a functional conflict with Arun's wa table debugfs support.] Reviewed-by: "Barbalho, Rafael" <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:59 +02:00
Arun Siluvery	888b59951e	drm/i915/bdw: Export workaround data to debugfs The workarounds that are applied are exported to a debugfs file; this is used to verify their state after the test case (reset or suspend/resume etc). This patch is only required to support i-g-t. Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:44 +02:00
Arun Siluvery	86d7f23842	drm/i915/bdw: Apply workarounds in render ring init function For BDW workarounds are currently initialized in init_clock_gating() but they are lost during reset, suspend/resume etc; this patch moves the WAs that are part of register state context to render ring init fn otherwise default context ends up with incorrect values as they don't get initialized until init_clock_gating fn. v2: Add workarounds to golden render state This method has its own issues, first of all this is different for each gen and it is generated using a tool so adding new workaround and mainitaining them across gens is not a straightforward process. v3: Use LRIs to emit these workarounds (Ville) Instead of modifying the golden render state the same LRIs are emitted from within the driver. v4: Use abstract name when exporting gen specific routines (Chris) For: VIZ-4092 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:42 +02:00
Rodrigo Vivi	c5ad011d7d	drm/i915: FBC flush nuke for BDW According to spec FBC on BDW and HSW are identical without any gaps. So let's copy the nuke and let FBC really start compressing stuff. Without this patch we can verify with false color that nothing is being compressed. With the nuke in place and false color it is possible to see false color debugs. Unfortunatelly on some rings like BCS on BDW we have to avoid Bits 22:18 on LRIs due to a high risk of hung. So, when using Blt ring for frontbuffer rend cache would never been cleaned and FBC would stop compressing buffer. One alternative is to cache clean on software frontbuffer tracking. v2: Fix rebase conflict. v3: Do not clean cache on BCS ring. Instead use sw frontbuffer tracking. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-09-03 11:04:40 +02:00
Oscar Mateo	cc9130be80	drm/i915/bdw: Make sure gpu reset still works with Execlists If we reset a ring after a hang, we have to make sure that we clear out all queued Execlists requests. v2: The ring is, at this point, already being correctly re-programmed for Execlists, and the hangcheck counters cleared. v3: Daniel suggests to drop the "if (execlists)" because the Execlists queue should be empty in legacy mode (which is true, if we do the INIT_LIST_HEAD). v4: Do the pending intel_runtime_pm_put Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-20 17:17:48 +02:00
Daniel Vetter	896ab1a5d5	drm/i915: Fix up checks for aliasing ppgtt A subsequent patch will no longer initialize the aliasing ppgtt if we have full ppgtt enabled, since we simply don't need that any more. Unfortunately a few places check for the aliasing ppgtt instead of checking for ppgtt in general. Fix them up. One special case are the gtt offset and size macros, which have some code to remap the aliasing ppgtt to the global gtt. The aliasing ppgtt is _not_ a logical address space, so passing that in as the vm is plain and simple a bug. So just WARN about it and carry on - we have a gracefully fall-through anyway if we can't find the vma. Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-13 14:23:31 +02:00
Oscar Mateo	4712274c36	drm/i915/bdw: GEN-specific logical ring emit flush Same as the legacy-style ring->flush. v2: The BSD invalidate bit still exists in GEN8! Add it for the VCS rings (but still consolidate the blt and bsd ring flushes into one). This was noticed by Brad Volkin. v3: The command for BSD and for other rings is slightly different: get it exactly the same as in gen6_ring_flush + gen6_bsd_ring_flush Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 22:44:37 +02:00
Oscar Mateo	82e104cc26	drm/i915/bdw: New logical ring submission mechanism Well, new-ish: if all this code looks familiar, that's because it's a clone of the existing submission mechanism (with some modifications here and there to adapt it to LRCs and Execlists). And why did we do this instead of reusing code, one might wonder? Well, there are some fears that the differences are big enough that they will end up breaking all platforms. Also, Execlists offer several advantages, like control over when the GPU is done with a given workload, that can help simplify the submission mechanism, no doubt. I am interested in getting Execlists to work first and foremost, but in the future this parallel submission mechanism will help us to fine tune the mechanism without affecting old gens. v2: Pass the ringbuffer only (whenever possible). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Appease checkpatch. Again. And drop the legacy sarea gunk that somehow crept in.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 22:42:36 +02:00
Oscar Mateo	9b1136d505	drm/i915/bdw: GEN-specific logical ring init Logical rings do not need most of the initialization their legacy ringbuffer counterparts do: we just need the pipe control object for the render ring, enable Execlists on the hardware and a few workarounds. v2: Squash with: "drm/i915: Extract pipe control fini & make init outside accesible". Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Make checkpatch happy.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 17:03:28 +02:00
Oscar Mateo	48d823878d	drm/i915/bdw: Generic logical ring init and cleanup Allocate and populate the default LRC for every ring, call gen-specific init/cleanup, init/fini the command parser and set the status page (now inside the LRC object). These are things all engines/rings have in common. Stopping the ring before cleanup and initializing the seqnos is left as a TODO task (we need more infrastructure in place before we can achieve this). v2: Check the ringbuffer backing obj for ring_is_initialized, instead of the context backing obj (similar, but not exactly the same). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:55:17 +02:00
Daniel Vetter	0c7dd53b84	drm/i915/bdw: Add a context and an engine pointers to the ringbuffer Any given ringbuffer is unequivocally tied to one context and one engine. By setting the appropriate pointers to them, the ringbuffer struct holds all the infromation you might need to submit a workload for processing, Execlists style. v2: Drop ring->ctx since that looks terribly ill-defined for legacy ringbuffer submission. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> (v1) Acked-by: Damien Lespiau <damien.lespiau@intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:18:17 +02:00
Oscar Mateo	84c2377fce	drm/i915/bdw: Allocate ringbuffers for Logical Ring Contexts As we have said a couple of times by now, logical ring contexts have their own ringbuffers: not only the backing pages, but the whole management struct. In a previous version of the series, this was achieved with two separate patches: drm/i915/bdw: Allocate ringbuffer backing objects for default global LRC drm/i915/bdw: Allocate ringbuffer for user-created LRCs Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 16:10:58 +02:00
Chris Wilson	9bec9b1334	drm/i915: Double check ring is idle before declaring the GPU wedged During ring initialisation, sometimes we observe, though not in production hardware, that the idle flag is not set even though the ring is empty. Double check before giving up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-11 13:33:49 +02:00
Daniel Vetter	403bdd10c8	drm/i915: No busy-loop wait_for in the ring init code Doing a 1s wait (tops) with the cpu is a bit excessive. Tune it down like everything else in that code. v2: Also insert the missing space Chris spotted. Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-08 17:44:01 +02:00
Jiri Kosina	ece4a17d23	drm/i915: read HEAD register back in init_ring_common() to enforce ordering Withtout this, ring initialization fails reliabily during resume with [drm:init_ring_common] ERROR render ring initialization failed ctl 0001f001 head ffffff8804 tail 00000000 start 000e4000 This is not a complete fix, but it is verified to make the ring initialization failures during resume much less likely. We were not able to root-cause this bug (likely HW-specific to Gen4 chips) yet. This is therefore used as a ducttape before problem is fully understood and proper fix created, so that people don't suffer from completely unusable systems in the meantime. The discussion and debugging is happening at https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Jiri Kosina <jkosina@suse.cz> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-08 16:22:18 +02:00
Kenneth Graunke	02c9f7e3cf	drm/i915: Add the WaCsStallBeforeStateCacheInvalidate:bdw workaround. On Broadwell, any PIPE_CONTROL with the "State Cache Invalidate" bit set must be preceded by a PIPE_CONTROL with the "CS Stall" bit set. Documented on the BSpec 3D workarounds page. Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [vsyrjala: add chv w/a note too] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-07 14:04:07 +02:00
Kenneth Graunke	884ceacee3	drm/i915: Refactor Broadwell PIPE_CONTROL emission into a helper. We'll want to reuse this for a workaround. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Rmove now unused int.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-08-07 14:01:12 +02:00
Daniel Vetter	480c803386	drm/i915: Use genX_ prefix for gt irq enable/disable functions Traditionally we use genX_ for GT/render stuff and the codenames for display stuff. But the gt and pm interrupt handling functions on gen5/6+ stuck out as exceptions, so convert them. Looking at the diff this nicely realigns our ducks since almost all the callers are already platform-specific functions following the genX_ pattern. Spotted while reviewing some internal rps patches. No function change in this patch. Acked-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-23 07:05:30 +02:00
Chris Wilson	1f767e02d6	drm/i915: HWS must be in the mappable region for g33 On g33, the documentation states "HWS_PGA: Format = Bits 28:12 of graphics memory address (bits 31:29 MBZ)." which translates to that the address of the HWS must be below 256MiB, which is conveniently the mappable aperture. This also appears to be true (but not documented as so) for gen4 and gen5. To generalise we force it into the low mappable region for all non-LLC platforms. If we locate the HWS at the top of the GTT the machine will hard hang during boot (fails on pnv, gm45, ilk and byt, but works on snb, ivb, hsw). v2: Add comments to explain why use PIN_MAPPABLE even though we have no intention of mapping the object. (Ville) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 21:07:17 +02:00
Oscar Mateo	64c58f2c48	drm/i915: Generalize ring_space to take a ringbuf It's simple enough that it doesn't need to know anything about the engine. Trivial change. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:57 +02:00
Oscar Mateo	2919d2913c	drm/i915: Extract ringbuffer destroy & generalize alloc to take a ringbuf More prep work: with Execlists, we are going to start creating a lot of extra ringbuffers soon, so these functions are handy. No functional changes. v2: rename allocate/destroy_ring_buffer to alloc/destroy_ringbuffer_obj because the name is more meaningful and to mirror a similar function in the context world: i915_gem_alloc_context_obj(). Change suggested by Brad Volkin. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-08 12:30:53 +02:00
Ben Widawsky	bae4fcd2c7	drm/i915/bdw: poll semaphores As Ville points out, it's possible/probable we don't actually need this. Potentially, this validates the letter of the spec, and not the spirit. Ville: > I discussed this on irc w/ Ben, and I was suggesting we don't need to > poll. Polling apparently can be used as a workaround for certain > hardware issues, but it looks like those issues shouldn't affect us, > for the momemnt at least. So my suggestion was to try w/o polling > first (since there could be some power cost to polling) and add the > poll bit if problems arise. Rodrigo: Spec suggests this as an W/A for GT3. However semaphores didn't worked in my BDW GT2 on Signal Mode. So pool mode is definitely needed. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 23:16:55 +02:00
Ben Widawsky	5ee426ca13	drm/i915/bdw: implement semaphore wait Semaphore waits use a new instruction, MI_SEMAPHORE_WAIT. The seqno to wait on is all well defined by the table in the previous patch. There is nothing else different from previous GEN's semaphore synchronization code. v2: Update macros to not require the other ring's ring->id (Chris) v3: Add missing VCS2 gen8_ring_wait init besides s/ring_buffer/engine_cs (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 22:22:58 +02:00
Ben Widawsky	3e78998a58	drm/i915/bdw: implement semaphore signal Semaphore signalling works similarly to previous GENs with the exception that the per ring mailboxes no longer exist. Instead you must define your own space, somewhere in the GTT. The comments in the code define the layout I've opted for, which should be fairly future proof. Ie. I tried to define offsets in abstract terms (NUM_RINGS, seqno size, etc). NOTE: If one wanted to move this to the HWSP they could. I've decided one 4k object would be easier to deal with, and provide potential wins with cache locality, but that's all speculative. v2: Update the macro to not need the other ring's ring->id (Chris) Update the comment to use the correct formula (Chris) v3: Move the macros the ringbuffer.h to prevent churn in next patch (Ville) v4: Fixed compilation rebase conflict commit `1ec9e26dda` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Feb 14 14:01:11 2014 +0100 drm/i915: Consolidate binding parameters into flags v5: VCS2 rebase Replace hweight_long with hweight32 v6 (Rodrigo): * Add missed VC2 gen8 ring signal init * fixing conflicst on rebase * minor fixes on address table * remove WARN_ON Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [danvet: s/BUG_ON/WARN_ON/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 22:16:23 +02:00
Ben Widawsky	a1444b79fe	drm/i915: Make semaphore updates more precise With the ring mask we now have an easy way to know the number of rings in the system, and therefore can accurately predict the number of dwords to emit for semaphore signalling. This was not possible (easily) previously. There should be no functional impact, simply fewer instructions emitted. While we're here, simply do the round up to 2 instead of the fancier rounding we did before, which rounding up per mbox, ie 4. This also allows us to drop the unnecessary MI_NOOP, so not really 4, 3. v2: Use 3 dwords instead of 4 (Ville) Do the proper calculation to get the number of dwords to emit (Ville) Conditionally set .sync_to when semaphores are enabled (Ville) v3: Rebased on VCS2 Replace hweight_long with hweight32 (Ville) v4: Pull out the accidentally squashed hunk from the next patch after rebase (Daniel). v5: Fix conflict after rebase (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 22:10:10 +02:00
Ben Widawsky	707d9cf993	drm/i915: gen specific ring init Gen8 has already had some differentiation with how it handles rings. Semaphores bring yet more differences, and now is as good a time as any to do the split. Also, since gen8 doesn't actually use semaphores up until this point, put the proper "NULL" values in for the mbox info. v2: v1 had a stale commit message v3: Move everything in the is_semaphore_enabled() check v4: VCS2 rebase Remove double assignment of signal in render ring (Ville) v5: Adding missed VCS2 signal init on gen8+ (Rodrigo) Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 22:08:29 +02:00
Rodrigo Vivi	f7b6423685	drm/i915: Fix VCS2's ring name. It just fix a typo. v2: removing underscore to let this like all other ring names (Oscar) Cc: Oscar Mateo <oscar.mateo@intel.com> Reviewed-by (v1): Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-07-07 21:57:59 +02:00
Konrad Zapalowicz	9c33baa6b3	drivers/i915: Fix unnoticed failure of init_ring_common() This commit add check for return value of init_ring_common() in the init_render_ring(). Now, when failure is detected the error code is propagated to the caller instead of being ignored. Signed-off-by: Konrad Zapalowicz <bergo.torino@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-19 20:59:52 +02:00
Akash Goel	24f3a8cf77	drm/i915: Added write-enable pte bit supportt This adds support for a write-enable bit in the entry of GTT. This is handled via a read-only flag in the GEM buffer object which is then used to see how to set the bit when writing the GTT entries. Currently by default the Batch buffer & Ring buffers are marked as read only. v2: Moved the pte override code for read-only bit to 'byt_pte_encode'. (Chris) Fixed the issue of leaving 'gt_old_ro' as unused. (Chris) v3: Removed the 'gt_old_ro' field, now setting RO bit only for Ring Buffers(Daniel). v4: Added a new 'flags' parameter to all the pte(gen6) encode & insert_entries functions, in lieu of overloading the cache_level enum (Daniel). v5: Removed the superfluous VLV check & changed the definition location of PTE_READ_ONLY flag (Imre) Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-17 09:21:47 +02:00
Oscar Mateo	3b2cc8ab64	drm/i915/bdw: Do not write the Semaphore Sync Registers in GEN8+ These do not exist anymore. Spotted while reading through intel_ringbuffer.c Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-13 17:45:18 +02:00
Ville Syrjälä	de8f0a5016	drm/i915: Don't WARN about ring idle bit on gen2 Gen2 doesn't have the ring idle/stop bits in the SCPD/MI_MODE register, so don't go spewing warnings about the state of those bits. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-06-05 08:52:41 +02:00
Oscar Mateo	93b0a4e0b2	drm/i915: Split the ringbuffers from the rings (3/3) Manual cleanup after the previous Coccinelle script. Yes, I could write another Coccinelle script to do this but I don't want labor-replacing robots making an honest programmer's work obsolete (also, I'm lazy). Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:30:18 +02:00
Oscar Mateo	ee1b1e5ef3	drm/i915: Split the ringbuffers from the rings (2/3) This refactoring has been performed using the following Coccinelle semantic script: @@ struct intel_engine_cs r; @@ ( - (r).obj + r.buffer->obj \| - (r).virtual_start + r.buffer->virtual_start \| - (r).head + r.buffer->head \| - (r).tail + r.buffer->tail \| - (r).space + r.buffer->space \| - (r).size + r.buffer->size \| - (r).effective_size + r.buffer->effective_size \| - (r).last_retired_head + r.buffer->last_retired_head ) @@ struct intel_engine_cs *r; @@ ( - (r)->obj + r->buffer->obj \| - (r)->virtual_start + r->buffer->virtual_start \| - (r)->head + r->buffer->head \| - (r)->tail + r->buffer->tail \| - (r)->space + r->buffer->space \| - (r)->size + r->buffer->size \| - (r)->effective_size + r->buffer->effective_size \| - (r)->last_retired_head + r->buffer->last_retired_head ) @@ expression E; @@ ( - LP_RING(E)->obj + LP_RING(E)->buffer->obj \| - LP_RING(E)->virtual_start + LP_RING(E)->buffer->virtual_start \| - LP_RING(E)->head + LP_RING(E)->buffer->head \| - LP_RING(E)->tail + LP_RING(E)->buffer->tail \| - LP_RING(E)->space + LP_RING(E)->buffer->space \| - LP_RING(E)->size + LP_RING(E)->buffer->size \| - LP_RING(E)->effective_size + LP_RING(E)->buffer->effective_size \| - LP_RING(E)->last_retired_head + LP_RING(E)->buffer->last_retired_head ) Note: On top of this this patch also removes the now unused ringbuffer fields in intel_engine_cs. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> [danvet: Add note about fixup patch included here.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:27:25 +02:00
Oscar Mateo	8ee149756e	drm/i915: Split the ringbuffers from the rings (1/3) As advanced by the previous patch, the ringbuffers and the engine command streamers belong in different structs. This is so because, while they used to be tightly coupled together, the new Logical Ring Contexts (LRC for short) have a ringbuffer each. In legacy code, we will use the buffer* pointer inside each ring to get to the pertaining ringbuffer (the actual switch will be done in the next patch). In the new Execlists code, this pointer will be NULL and we will use instead the one inside the context instead. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:02:16 +02:00
Oscar Mateo	a4872ba6d0	drm/i915: s/intel_ring_buffer/intel_engine_cs In the upcoming patches we plan to break the correlation between engine command streamers (a.k.a. rings) and ringbuffers, so it makes sense to refactor the code and make the change obvious. No functional changes. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-22 23:01:05 +02:00
Ville Syrjälä	b3f797ac49	drm/i915/chv: Add some workaround notes We implement the following workarounds: * WaDisableAsyncFlipPerfMode:chv * WaProgramMiArbOnOffAroundMiSetContext:chv v2: Drop WaDisableSemaphoreAndSyncFlipWait note Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-20 15:20:05 +02:00
Mika Kuoppala	6e450ab24d	drm/i915: Bail out early on gen6_signal if no semaphores If we dont have semaphores enabled, we allocate 4 dwords for signalling. But end up emitting more regardless. Fix this by bailing out early if semaphores are not enabled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78274 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78283 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-15 23:14:13 +02:00
Oscar Mateo	d153337958	drm/i915: Ringbuffer signal func for the second BSD ring This is missing in: commit `78325f2d27` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Apr 29 14:52:29 2014 -0700 drm/i915: Virtualize the ringbuffer signal func Looks to me like a rebase side-effect... Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-13 14:13:24 +02:00
Brad Volkin	44e895a8a2	drm/i915: Use hash tables for the command parser For clients that submit large batch buffers the command parser has a substantial impact on performance. On my HSW ULT system performance drops as much as ~20% on some tests. Most of the time is spent in the command lookup code. Converting that from the current naive search to a hash table lookup reduces the performance drop to ~10%. The choice of value for I915_CMD_HASH_ORDER allows all commands currently used in the parser tables to hash to their own bucket (except for one collision on the render ring). The tradeoff is that it wastes memory. Because the opcodes for the commands in the tables are not particularly well distributed, reducing the order still leaves many buckets empty. The increased collisions don't seem to have a huge impact on the performance gain, but for now anyhow, the parser trades memory for performance. NB: Ville noticed that the error paths through the ring init code will leak memory. I've not addressed that here. We can do a follow up pass to handle all of the leaks. v2: improved comment describing selection of hash key mask (Damien) replace a BUG_ON() with an error return (Tvrtko, Ville) commit message improvements Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-12 19:15:51 +02:00
Chris Wilson	1cf0ba1474	drm/i915: Flush request queue when waiting for ring space During the review of commit `1f70999f90` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Jan 27 22:43:07 2014 +0000 drm/i915: Prevent recursion by retiring requests when the ring is full Ville raised the point that our interaction with request->tail was likely to foul up other uses elsewhere (such as hang check comparing ACTHD against requests). However, we also need to restore the implicit retire requests that certain test cases depend upon (e.g. igt/gem_exec_lut_handle), this raises the spectre that the ppgtt will randomly call i915_gpu_idle() and recurse back into intel_ring_begin(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78023 Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Remove now unused 'tail' variable as spotted by Brad.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-08 01:23:34 +02:00
Chris Wilson	dcfe050659	drm/i915: Improve fallback ring waiting A few improvements to the fallback method for waiting upon ring space: 1. Fix the start/end wait tracepoints to always be paired. 2. Increase responsiveness of checking 3. Mark the process as waiting upon io 4. Check for signal interruptions Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> [danvet: Drop the s/msleep/io_schedule_timeout/ change again since the latter isn't exported.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-08 01:22:34 +02:00
Ben Widawsky	9bcb144c83	drm/i915: Support 64b execbuf Previously, our code only had a 32b offset value for where the batchbuffer starts. With full PPGTT, and 64b canonical GPU address space, that is an insufficient value. The code to expand is pretty straight forward, and only one platform needs to do anything with the extra bits. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 16:01:58 +02:00
Ben Widawsky	024a43e12c	drm/i915: Move ring_begin to signal() Add_request has always contained both the semaphore mailbox updates as well as the breadcrumb writes. Since the semaphore signal is the one which actually knows about the number of dwords it needs to emit to the ring, we move the ring_begin to that function. This allows us to remove the hideously shared #define On a related not, gen8 will use a different number of dwords for semaphores, but not for add request. v2: Make number of dwords an explicit part of signalling (via function argument). (Chris) v3: very slight comment change Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 10:56:53 +02:00
Ben Widawsky	78325f2d27	drm/i915: Virtualize the ringbuffer signal func This abstraction again is in preparation for gen8. Gen8 will bring new semantics for doing this operation. While here, make the writes of MI_NOOPs explicit for non-existent rings. This should have been implicit before. NOTE: This is going to be removed in a few patches. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 10:56:53 +02:00
Ben Widawsky	ebc348b2ad	drm/i915: Move semaphore specific ring members to struct This will be helpful in abstracting some of the code in preparation for gen8 semaphores. v2: Move mbox stuff to a separate struct v3: Rebased over VCS2 work Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 10:56:52 +02:00
Zhao Yakui	77fe2ff3db	drm/i915:Add the VCS2 switch in Intel_ring_setup_status_page The Gen7 doesn't have the second BSD ring. But it will complain the switch check warning message during compilation. So just add it to remove the switch check warning. V1->V2: Follow Daniel's comment to update the comment Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:48 +02:00
Zhao Yakui	845f74a701	drm/i915:Initialize the second BSD ring on BDW GT3 machine Based on the hardware spec, the BDW GT3 machine has two independent BSD ring that can be used to dispatch the video commands. So just initialize it. V3->V4: Follow Imre's comment to do some minor updates. For example: more comments are added to describe the semaphore between ring. Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> [danvet: Fix up checkpatch error.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:46 +02:00
Chris Wilson	48e48a0b8c	drm/i915: Include a little more information about why ring init fails If we include the expected values for the failing ring register checks, it makes it marginally easier to see which is the culprit. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:40 +02:00
Chris Wilson	e3efda49e7	drm/i915: Preserve ring buffers objects across resume Tearing down the ring buffers across resume is overkill, risks unnecessary failure and increases fragmentation. After failure, since the device is still active we may end up trying to write into the dangling iomapping and trigger an oops. v2: stop_ringbuffers() was meant to call stop(ring) not cleanup(ring) during resume! Reported-by: Jae-hyeon Park <jhyeon@gmail.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=72351 References: https://bugs.freedesktop.org/show_bug.cgi?id=76554 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> [danvet: s/ring->obj == NULL/!intel_ring_initialized(ring)/ as suggested by Oscar.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:37 +02:00
Chris Wilson	18393f6322	drm/i915: Replace hardcoded cacheline size with macro For readibility and guess at the meaning behind the constants. v2: Claim only the meagerest connections with reality. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-05 09:08:36 +02:00
Mika Kuoppala	88b4aa8770	drm/i915: add flags to i915_ring_stop Piglit runner and QA are both looking at the dmesg for DRM_ERRORs with test cases. Add a flag to control those when we they are expected from related test cases. Also add flag to control if contexts should be banned that introduced the hang. Hangcheck is timer based and preventing bans by adding sleeps to testcases makes testing slower. v2: intel_ring_stopped(), readable comment (Chris) v3: keep compatibility (Daniel) References: https://bugs.freedesktop.org/show_bug.cgi?id=75876 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-09 15:07:42 +02:00
Chris Wilson	9991ae787a	drm/i915: Move all ring resets before setting the HWS page In commit `a51435a313` Author: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Date: Wed Mar 12 16:39:40 2014 +0530 drm/i915: disable rings before HW status page setup we reordered stopping the rings to do so before we set the HWS register. However, there is an extra workaround for g45 to reset the rings twice, and for consistency we should apply that workaround before setting the HWS to be sure that the rings are truly stopped. Cc: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-03 17:16:45 +02:00
Ben Widawsky	057f6a8ad7	drm/i915: Invariably invalidate before ctx switch We have been setting the bit which was originally BIOS dependent since: commit `f05bb0c7b6` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sun Jan 20 16:33:32 2013 +0000 drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits Therefore, we do not need to try to figure it out dynamically and we can just always invalidate the TLBs. It's a partial revert of: commit `12b0286f49` Author: Ben Widawsky <ben@bwidawsk.net> Date: Mon Jun 4 14:42:50 2012 -0700 drm/i915: possibly invalidate TLB before context switch The original commit attempted to only invalidate when necessary (very much a relic from the old days). Now, we can just always invalidate. I guess the old TODO still exists. Since we seem to have abandoned ILK contexts however, there isn't much point in even remembering. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-03 11:41:39 +02:00
Akash Goel	01fa03021f	drm/i915: Enabling the TLB invalidate bit in GFX Mode register This patch Enables the bit for TLB invalidate in GFX Mode register for Gen7. According to bspec, When enabled this bit limits the invalidation of the TLB only to batch buffer boundaries, to pipe_control commands which have the TLB invalidation bit set and sync flushes. If disabled, the TLB caches are flushed for every full flush of the pipeline. Tested only on vlv platform. Chris has tested on ivb and hsw platforms. v2: Adding the explicit enabling of this bit for all Gen7 platforms instead of only vlv (Chris) Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #ivb, hsw -Chris Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Add w/a markers as suggested by Ville.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-02 13:54:02 +02:00
Chris Wilson	aa83e30d8f	drm/i915: Rename GFX_TLB_INVALIDATE_ALWAYS The documentation calls this GFX_MODE bit "Flush TLB invalidate Mode". However, that is not a good name for an enable bit as it doesn't make it clear what is enabled. An even worse name is GFX_TLB_INVALIDATE_ALWAYS as enabling that bit actually prevents the TLB from being invalidated at every flush. This leads to great confusion when reading code and proposed patches. To get around this try to bake in what is enabled by setting the bit and call it GFX_TLB_INVALIDATE_EXPLICIT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: "Gupta, Sourab" <sourab.gupta@intel.com> Acked-by: "Gupta, Sourab" <sourab.gupta@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-01 22:58:04 +02:00
Jani Nikula	4640c4ff08	drm/i915/ringbuffer: prefer struct drm_i915_private to drm_i915_private_t No functional changes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-31 15:32:56 +02:00
Chris Wilson	508774452d	drm/i915: Broadwell expands ACTHD to 64bit As Broadwell has an increased virtual address size, it requires more than 32 bits to store offsets into its address space. This includes the debug registers to track the current HEAD of the individual rings, which may be anywhere within the per-process address spaces. In order to find the full location, we need to read the high bits from a second register. We then also need to expand our storage to keep track of the larger address. v2: Carefully read the two registers to catch wraparound between the reads. v3: Use a WARN_ON rather than loop indefinitely on an unstable register read. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Timo Aaltonen <tjaalton@ubuntu.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Drop spurious hunk which conflicted.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-28 18:33:14 +01:00
Akash Goel	61a563a2a6	drm/i915: Remove the enabling of VS_TIMER_DISPATCH bit in MI MODE reg This patch Removes the VS_TIMER_DISPATCH bit enable in MI MODE reg for platforms > Gen6. VS_TIMER_DISPATCH bit enable was earlier required as a part of WA 'WaTimedSingleVertexDispatch', which is now applicable only to platforms < Gen7. v2: Enhancing the scope of the patch to full Gen7 (Chris) v3: Modifying the WA condition to the cover the applicable platforms, and adding the WA name in comments. (Ville) Signed-off-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> # ivb, hsw -Chris Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-28 18:31:59 +01:00
Damien Lespiau	dc616b89db	drm/i915/bdw: The TLB invalidation mechanism has been removed from INSTPM While wandering in the spec, I noticed that BDW removes those 2 bits from INSTPM. I couldn't find any direct way to invalidate the TLB (ie without the ring working already). Maybe someone will be more lucky. At least, we now know we may be a problem. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-13 03:52:31 +01:00
Naresh Kumar Kachhi	02f6a1e750	drm/i915: warn if ring is active before sync flush Based on Bspec the command parser must be stopped prior to issuing sync flush. This should be done by the caller of intel_ring_setup_status_page. Patch adds a warning if it is not done. v2: rebased based on new patch (wait for ring to become idle) Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-12 16:10:55 +01:00
Naresh Kumar Kachhi	e9fea5747d	drm/i915: wait for rings to become idle once disabled make sure we wait for rings to become idle once they are disabled. In case of timeout print an error message Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> [danvet: Frob patch as suggested by Chris.] Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-12 16:04:19 +01:00
Naresh Kumar Kachhi	a51435a313	drm/i915: disable rings before HW status page setup Rings should be idle before issuing sync_flush (in intel_ring_setup_status_page). This patch moves the ring disabling before doing the HW status page setup. Signed-off-by: Naresh Kumar Kachhi <naresh.kumar.kachhi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-12 16:00:41 +01:00
Daniel Vetter	e8e6e6012d	Linux 3.14-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTHSaRAAoJEHm+PkMAQRiG7G8IAJHElwFDNSQE7Y9MmbicrAMG kfjhBtBpTaVrJKQXegCNUwDaLLyC4oLIxDheW84oPXbrEGDLqPtBov/hrcFkHVr4 lh/ZYk02nYtcfpN0JnL/Yj2oKHVmBWs0vFlM7StSFsJCj10DoCVQQdmAJ8XODTPo CXMapk+UikTX1TlIO8+B5toyl3R1OqPmW211UV1vQVLKy66hu+MKVN/V+/EyopL0 1jO81EDpaRaeIJh1/okcyUoIq9pqLkAWNpeQ7uyXZ+Sfivt9RXwLYKmAB3lP20Hc ZMIIoHSCyYRFjxLlQvt02bA9nY4wTY7YN5kZ2kk65y7TFfhcGsCw1Sc69iyCoKs= =CJcA -----END PGP SIGNATURE----- Merge tag 'v3.14-rc6' into drm-intel-next-queued Linux 3.14-rc6 I need the hdmi/dvi-dual link fixes in 3.14 to avoid ugly conflicts when merging Ville's new hdmi cloning support into my -next tree Conflicts: drivers/gpu/drm/i915/Makefile drivers/gpu/drm/i915/intel_dp.c Makefile cleanup conflicts with an acpi build fix, intel_dp.c is trivial. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-10 21:43:46 +01:00
Brad Volkin	351e3db2b3	drm/i915: Implement command buffer parsing logic The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-07 22:37:00 +01:00
Ville Syrjälä	8285222c48	drm/i915: We implement WaDisableAsyncFlipPerfMode:bdw Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:34 +01:00
Daniel Vetter	e01f69295b	drm/i915: Handle set_cache_level errors in the status page setup Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:00 +01:00
Daniel Vetter	9a6bbb6216	drm/i915: Don't pin the status page as mappable We access it through the cpu window. No functional difference expected atm since we default to a bottom-up allocation scheme. But that might eventually change so that we prefer the unmappable range for buffers that don't need cpu gtt access. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:48 +01:00
Daniel Vetter	be1fa129f5	drm/i915: Don't set PIN_MAPPABLE for legacy ringbuffers Tighter code since legacy gem has only mappable anyway. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:42 +01:00
Daniel Vetter	a9cc726c85	drm/i915: Handle set_cache_level errors in the pipe control scratch setup Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:35 +01:00
Daniel Vetter	1ec9e26dda	drm/i915: Consolidate binding parameters into flags Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:16:58 +01:00
Daniel Vetter	fb19e2ac7c	drm/i915: protect ringbuffer sarea update behind !MODESET Avoids surprises when userspace races open/closes against this. Cc: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 23:44:34 +01:00
Ville Syrjälä	753b1ad4a2	drm/i915: Add intel_ring_cachline_align() intel_ring_cachline_align() emits MI_NOOPs until the ring tail is aligned to a cacheline boundary. Cc: Bjoern C <lkml@call-home.ch> Cc: Alexandru DAMIAN <alexandru.damian@intel.com> Cc: Enrico Tagliavini <enrico.tagliavini@gmail.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org (prereq for the next patch) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-11 23:00:19 +01:00
Chris Wilson	1f70999f90	drm/i915: Prevent recursion by retiring requests when the ring is full As the VM do not track activity of objects and instead use a large hammer to forcibly idle and evict all of their associated objects when one is released, it is possible for that to cause a recursion when we need to wait for free space on a ring and call retire requests. (intel_ring_begin -> intel_ring_wait_request -> i915_gem_retire_requests_ring -> i915_gem_context_free -> i915_gem_evict_vm -> i915_gpu_idle -> intel_ring_begin etc) In order to remove the requirement for calling retire-requests from intel_ring_wait_request, we have to inline a couple of steps from retiring requests, notably we have to record the position of the request we wait for and use that to update the available ring space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-06 17:43:13 +01:00
Daniel Vetter	0e5539b923	Merge branch 'topic/ppgtt' into drm-intel-next-queued Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 21:14:57 +01:00
Ben Widawsky	f0a9f74c0d	drm/i915: Fix disabled semaphores The ring will emit too many if semaphores are disabled since we do not add the correct number to num_dwords anymore. This was introduced: commit `52ed23253b` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Mon Dec 16 20:50:38 2013 -0800 drm/i915: Don't emit mbox updates without semaphores FWIW, the bug was fixed later in the series. /me hangs head in shame. Daniel: Also note that we should have merged the read-only semaphore modparam before this patch. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 09:58:26 +01:00
Chris Wilson	304d695c3d	drm/i915: Flush outstanding requests before allocating new seqno In very rare cases (such as a memory failure stress test) it is possible to fill the entire ring without emitting a request. Under this circumstance, the outstanding request is flushed and waited upon. After space on the ring is cleared, we return to emitting the new command - except that we just cleared the seqno allocated for this operation and trigger the sanity check that a request is only ever emitted with a valid seqno. The fix is to rearrange the code to make sure the allocation of the seqno for this operation is after any required flushes of outstanding operations. The bug exists since the preallocation was introduced in commit `9d7730914f` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Nov 27 16:22:52 2012 +0000 drm/i915: Preallocate next seqno before touching the ring Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-07 12:33:41 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Ben Widawsky	52ed23253b	drm/i915: Don't emit mbox updates without semaphores Aside from the fact that it leaves confusing dumps on error capture, it is entirely unnecessary, and potentially harmful in cases like BDW, where the instruction has changed. In reality (seemingly), this will have no behavioral impact. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-17 09:56:29 +01:00
Deepak S	c8d9a5905e	drm/i915: Add power well arguments to force wake routines. Added power well arguments to all the force wake routines to help us individually control power well based on the scenario. Signed-off-by: Deepak S <deepak.s@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Resolve conflict with the removed forcewake hack and drop one spurious hunk Jesse noticed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-28 08:31:22 +01:00
Chris Wilson	67e5871be8	drm/i915: Drop forcewake w/a for missed interrupts/seqno on Sandybridge I believe, and an evening of i-g-t, that our original workaround for the missed interrupts on Sandybridge, that of holding forcewake whilst we wait for an interrupts, is no longer required. This leaves us dependent on the second workaround of forcing an UC read of the ACTHD before reading back the seqno from the snooped HWS. Dropping the forcewake should allow us to conserve a little power, not much as the GPU is meant to be busy whilst we wait for it! Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-26 10:27:55 +01:00
Ville Syrjälä	37c1d94fa8	drm/i915: Emit SRM after the MSG_FBC_REND_STATE LRI The spec tells us that we need to emit an SRM after the LRI to MSG_FBC_REND_STATE. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-21 09:06:31 +01:00
Ville Syrjälä	9688ecadd2	drm/i915: Limit FBC flush to post batch flush Don't issue the FBC nuke/cache clean command when invalidate_domains!=0. That would indicate that we're not being called for the post-batch flush. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-21 09:06:09 +01:00
Ben Widawsky	eb0d4b75d5	drm/i915/bdw: Add comment about gen8 HWS PGA This confused me some many times that I think it is appropriate to add a small comment to instruct the reader of the code that it is indeed doing what it is supposed to do. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-14 09:33:11 +01:00
Daniel Vetter	40c499f93f	drm/i915/bdw: Take render error interrupt out of the mask The handling of the error interrupts isn't wired up at all. And it hasn't been ever since ilk happened, so don't bother. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:10:10 +01:00
Ben Widawsky	780f18c84c	drm/i915/bdw: BSD init for gen8 also This was an oversight and should have been in a previous series somewhere. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:50 +01:00
Ben Widawsky	a5f3d68e2e	drm/i915/bdw: Render ring flushing PIPE_CONTROL added the high address dword. I'm not sure how the simulator let me get away with this. I've explicitly left out all the workarounds from Gen7 because in the minimal digging that I did, most don't seem necessary, and the simulator doesn't complain without them Note that BLT and BSD ring commands had already been updated previously. Just render/pipe_control should have been broken. v2: Squash in a fixup from Ville to follow the recent IVB PIPE_CONTROL updates: "BDW uses the IVB PIPE_CONTROL style for specifying GTT vs. PPGTT for the PIPE_CONTROL QW/DW write." v3: Rebase on top of Chris' cleanup to have an explicit ring->scratch buffer object instead of an opaque ring->private where everyone stores the same stuff inside. Reported-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for the fixup) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:49 +01:00
Ben Widawsky	28cf541543	drm/i915/bdw: unleash PPGTT v2: Squash in fix from Ben: Set PPGTT batches as necessary This fixes the regression in the last couple of days when we enabled PPGTT. v3: Squash in fixup to still use GTT for secure batches from Ville: BDW doesn't have a separate secure vs. non-secure bit in MI_BATCH_BUFFER_START. So for secure batches we have to simply leave the PPGTT bit unset. Fortunately older generations (except HSW) had similar limitations so execbuffer already creates a GTT mapping for all secure batches. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:48 +01:00
Ben Widawsky	075b3bbaae	drm/i915/bdw: Update MI_FLUSH_DW The code is more verbose than necessary for the reader's sake, hopefully the compiler optimizes away the if. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:42 +01:00
Ben Widawsky	1c7a0623c7	drm/i915/bdw: dispatch updates (64b related) The command to emit batch buffers has changed to address 48b addresses. It seemed reasonable that we could still use the old instruction where emitting 0 for length would do the right thing, but it seems to bother the simulator when the code does that. Now the second dword in the command has the upper 16b of the address of the batchbuffer. v2: Remove duplicated vfun assignment. v3: Squash in VECS support changes from Zhao Yakui <yakui.zhao@intel.com> v4: Make checkpatch happy. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:41 +01:00
Ben Widawsky	abd58f0175	drm/i915/bdw: Implement interrupt changes The interrupt handling implementation remains the same as previous generations with the 4 types of registers, status, identity, mask, and enable. However the layout of where the bits go have changed entirely. To address these changes, all of the interrupt vfuncs needed special gen8 code. The way it works is there is a top level status register now which informs the interrupt service routine which unit caused the interrupt, and therefore which interrupt registers to read to process the interrupt. For display the division is quite logical, a set of interrupt registers for each pipe, and in addition to those, a set each for "misc" and port. For GT the things get a bit hairy, as seen by the code. Each of the GT units has it's own bits defined. They all look very similar and resides in 16 bits of a GT register. As an example, RCS and BCS share register 0. To compact the code a bit, at a slight expense to complexity, this is exactly how the code works as well. 2 structures are added to the ring buffer so that our ring buffer interrupt handling code knows which ring shares the interrupt registers, and a shift value (ie. the top or bottom 16 bits of the register). The above allows us to kept the interrupt register caching scheme, the per interrupt enables, and the code to mask and unmask interrupts relatively clean (again at the cost of some more complexity). Most of the GT units mentioned above are command streamers, and so the symmetry should work quite well for even the yet to be implemented rings which Broadwell adds. v2: Fixes up a couple of bugs, and is more verbose about errors in the Broadwell interrupt handler. v3: fix DE_MISC IER offset v4: Simplify interrupts: I totally misread the docs the first time I implemented interrupts, and so this should greatly simplify the mess. Unlike GEN6, we never touch the regular mask registers in irq_get/put. v5: Rebased on to of recent pch hotplug setup changes. v6: Fixup on top of moving num_pipes to intel_info. v7: Rebased on top of Egbert Eich's hpd irq handling rework. Also wired up ibx_hpd_irq_setup for gen8. v8: Rebase on top of Jani's asle handling rework. v9: Rebase on top of Ben's VECS enabling for Haswell, where he unfortunately went OCD on the gt irq #defines. Not that they're still not yet fully consistent: - Used the GT_RENDER_ #defines + bdw shifts. - Dropped the shift from the L3_PARITY stuff, seemed clearer. - s/irq_refcount/irq_refcount.gt/ v10: Squash in VECS enabling patches and the gen8_gt_irq_handler refactoring from Zhao Yakui <yakui.zhao@intel.com> v11: Rebase on top of the interrupt cleanups in upstream. v12: Rebase on top of Ben's DPF changes in upstream. v13: Drop bdw from the HAS_L3_DPF feature flag for now, it's unclear what exactly needs to be done. Requested by Ben. v14: Fix the patch. - Drop the mask of reserved bits and assorted logic, it doesn't match the spec. - Do the posting read inconditionally instead of commenting it out. - Add a GEN8_MASTER_IRQ_CONTROL definition and use it. - Fix up the GEN8_PIPE interrupt defines and give the GEN8_ prefixes - we actually will need to use them. - Enclose macros in do {} while (0) (checkpatch). - Clear DE_MISC interrupt bits only after having processed them. - Fix whitespace fail (checkpatch). - Fix overtly long lines where appropriate (checkpatch). - Don't use typedef'ed private_t (maintainer-scripts). - Align the function parameter list correctly. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v4) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> bikeshed	2013-11-08 18:09:39 +01:00
Ben Widawsky	3d57e5bd12	drm/i915: Do a fuller init after reset I had this lying around from he original PPGTT series, and thought we might try to get it in by itself. It's convenient to just call i915_gem_init_hw at reset because we'll be adding new things to that function, and having just one function to call instead of reimplementing it in two places is nice. In order to accommodate we cleanup ringbuffers in order to bring them back up cleanly. Optionally, we could also teardown/re initialize the default context but this was causing some problems on reset which I wasn't able to fully debug, and is unnecessary with the previous context init/enable split. This essentially reverts: commit `8e88a2bd59` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jun 19 18:40:00 2012 +0200 drm/i915: don't call modeset_init_hw in i915_reset It seems to work for me on ILK now. Perhaps it's due to: commit `8a5c2ae753` Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Mar 28 13:57:19 2013 -0700 drm/i915: fix ILK GPU reset for render Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:08 +02:00
Ben Widawsky	ab484f8fd6	drm/i915: Remove gen specific checks in MMIO Now that MMIO has been split up into gen specific functions it is obvious when HAS_FPGA_DBG_UNCLAIMED, HAS_FORCE_WAKE are needed. As such, we can remove this extraneous condition. As a result of this, as well as previously existing function pointers for forcewake, we no longer need the has_force_wake member in the device specific data structure. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:10 +02:00
Ben Widawsky	040d2baa62	drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:41:00 +02:00
Ben Widawsky	35a85ac606	drm/i915: Add second slice l3 remapping Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:37:04 +02:00
Chris Wilson	092467327c	drm/i915: Write RING_TAIL once per-request Ignoring the legacy DRI1 code, and a couple of special cases (to be discussed later), all access to the ring is mediated through requests. The first write to a ring will grab a seqno and mark the ring as having an outstanding_lazy_request. Either through explicitly adding a request after an execbuffer or through an implicit wait (either by the CPU or by a semaphore), that sequence of writes will be terminated with a request. So we can ellide all the intervening writes to the tail register and send the entire command stream to the GPU at once. This will reduce the number of serialising writes to the tail register by a factor or 3-5 times (depending upon architecture and number of workarounds, context switches, etc involved). This becomes even more noticeable when the register write is overloaded with a number of debugging tools. The astute reader will wonder if it is then possible to overflow the ring with a single command. It is not. When we start a command sequence to the ring, we check for available space and issue a wait in case we have not. The ring wait will in this case be forced to flush the outstanding register write and then poll the ACTHD for sufficient space to continue. The exception to the rule where everything is inside a request are a few initialisation cases where we may want to write GPU commands via the CS before userspace wakes up and page flips. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-10 15:35:58 +02:00
Chris Wilson	3c0e234c84	drm/i915; Preallocate the lazy request It is possible for us to be forced to perform an allocation for the lazy request whilst running the shrinker. This allocation may fail, leaving us unable to reclaim any memory leading to premature OOM. A neat solution to the problem is to preallocate the request at the same time as acquiring the seqno for the ring transaction. This means that we can report ENOMEM prior to touching the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:53 +02:00
Chris Wilson	1823521d2b	drm/i915: Rename ring->outstanding_lazy_request Prior to preallocating an request for lazy emission, rename the existing field to make way (and differentiate the seqno from the request struct). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:12 +02:00
Chris Wilson	0d1aacac36	drm/i915: Embed the ring->private within the struct intel_ring_buffer We now have more devices using ring->private than not, and they all want the same structure. Worse, I would like to use a scratch page from outside of intel_ringbuffer.c and so for convenience would like to reuse ring->private. Embed the object into the struct intel_ringbuffer so that we can keep the code clean. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-03 19:17:55 +02:00
Dave Airlie	9c725e5bcd	Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next Alex writes: This is the radeon drm-next request. Big changes include: - support for dpm on CIK parts - support for ASPM on CIK parts - support for berlin GPUs - major ring handling cleanup - remove the old 3D blit code for bo moves in favor of CP DMA or sDMA - lots of bug fixes [airlied: fix up a bunch of conflicts from drm_order removal] * 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux: (898 commits) drm/radeon/dpm: make sure dc performance level limits are valid (CI) drm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2) drm/radeon: gcc fixes for extended dpm tables drm/radeon: gcc fixes for kb/kv dpm drm/radeon: gcc fixes for ci dpm drm/radeon: gcc fixes for si dpm drm/radeon: gcc fixes for ni dpm drm/radeon: gcc fixes for trinity dpm drm/radeon: gcc fixes for sumo dpm drm/radeonn: gcc fixes for rv7xx/eg/btc dpm drm/radeon: gcc fixes for rv6xx dpm drm/radeon: gcc fixes for radeon_atombios.c drm/radeon: enable UVD interrupts on CIK drm/radeon: fix init ordering for r600+ drm/radeon/dpm: only need to reprogram uvd if uvd pg is enabled drm/radeon: check the return value of uvd_v1_0_start in uvd_v1_0_init drm/radeon: split out radeon_uvd_resume from uvd_v4_2_resume radeon kms: fix uninitialised hotplug work usage in r100_irq_process() drm/radeon/audio: set up the sads on DCE3.2 asics drm/radeon: fix handling of variable sized arrays for router objects ... Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem_dmabuf.c drivers/gpu/drm/i915/intel_pm.c drivers/gpu/drm/radeon/cik.c drivers/gpu/drm/radeon/ni.c drivers/gpu/drm/radeon/r600.c	2013-09-02 09:31:40 +10:00
Paulo Zanoni	edbfdb4560	drm/i915: wrap GEN6_PMIMR changes Just like we're doing with the other IMR changes. One of the functional changes is that not every caller was doing the POSTING_READ. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:26 +02:00
Paulo Zanoni	43eaea1318	drm/i915: wrap GTIMR changes Just like the functions that touch DEIMR and SDEIMR, but for GTIMR. The new functions contain a POSTING_READ(GTIMR) which was not present at the 2 callers inside i915_irq.c. The implementation is based on ibx_display_interrupt_update. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:26 +02:00
Ben Widawsky	5020150b3b	drm/i915: Initialize seqno for VECS too We require n-1 mailboxes for proper semaphore synchronization. All semaphore synchronization code relies on proper values in these mailboxes. The fact that we failed to touch the vebox ring by itself was unlikely to be an issue since the HW should be initializing the values to 0. However the error framework for testing seqno wrap introduced by Mika, in addition to the hangcheck via seqno, and i915_error_first_batchbuffer() combined caused a nice explosion. The problem is caused by seqno wrap because the wrap condition is not properly setup. The wrap code attempts to set the sync mailboxes all to 0, and then set the current seqno to one less than 0. In all cases, the vebox mailbox wasn't properly being initialized. This caused a wrap to not occur. When hangcheck kicks in with the bogus seqno values, the rest just doesn't work. It makes me wonder if we shouldn't consider a dumber version of hangcheck... How we messed this up: VECS support was written before the aforementioned other features. Upon VECS being rebased, these facts were missed. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:42 +02:00
Chris Wilson	884020bf3d	drm/i915: Invalidate TLBs for the rings after a reset After any "soft gfx reset" we must manually invalidate the TLBs associated with each ring. Empirically, it seems that a suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is that the hardware would fail to note the new address for its status page, and so it would continue to write the shadow registers and breadcrumbs into the old physical address (now used by something completely different, scary). Whereas the driver would read the new status page and never see any progress, it would appear that the GPU hung immediately upon resume. Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com> Reported-by: Thiago Macieira <thiago@kde.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Thiago Macieira <thiago@kde.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-18 19:37:41 +02:00
Ben Widawsky	c37e220461	drm/i915: Add VM to pin To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Daniel Vetter	cb54b53ada	Merge commit 'Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux' This backmerges Linus' merge commit of the latest drm-fixes pull: commit `549f3a1218` Merge: `42577ca` `058ca4a` Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Tue Jul 23 15:47:08 2013 -0700 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux We've accrued a few too many conflicts, but the real reason is that I want to merge the 100% solution for Haswell concurrent registers writes into drm-intel-next. But that depends upon the 90% bandaid merged into -fixes: commit `a7cd1b8fea` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 19 20:36:51 2013 +0100 drm/i915: Serialize almost all register access Also, we can roll up on accrued conflicts. Usually I'd backmerge a tagged -rc, but I want to get this done before heading off to vacations next week ;-) Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem.c v2: For added hilarity we have a init sequence conflict around the gt_lock, so need to move that one, too. Spotted by Jani Nikula. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-25 15:18:41 +02:00
Daniel Vetter	c0d6a3dd61	drm/i915: don't enable PM_VEBOX_CS_ERROR_INTERRUPT The code to handle it is broken - there's simply no code to clear CS parser errors on gen5+. And behold, for all the other rings we also don't enable it! Leave the handling code itself in place just to be consistent with the existing mess though. And in case someone feels like fixing it all up. This has been errornously enabled in commit `12638c57f3` Author: Ben Widawsky <ben@bwidawsk.net> Date: Tue May 28 19:22:31 2013 -0700 drm/i915: Enable vebox interrupts Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:37:00 +02:00
Daniel Vetter	c7113cc35f	drm/i915: unify ring irq refcounts (again) With the simplified locking there's no reason any more to keep the refcounts seperate. v2: Readd the lost comment that ring->irq_refcount is protected by dev_priv->irq_lock. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:36:49 +02:00
Daniel Vetter	59cdb63d52	drm/i915: kill dev_priv->rps.lock Now that the rps interrupt locking isn't clearly separated (at elast conceptually) from all the other interrupt locking having a different lock stopped making sense: It protects much more than just the rps workqueue it started out with. But with the addition of VECS the separation started to blurr and resulted in some more complex locking for the ring interrupt refcount. With this we can (again) unifiy the ringbuffer irq refcounts without causing a massive confusion, but that's for the next patch. v2: Explain better why the rps.lock once made sense and why no longer, requested by Ben. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:36:43 +02:00
Daniel Vetter	aaf8a51672	drm/i915: fix up ring cleanup for the i830/i845 CS tlb w/a It's not a good idea to also run the pipe_control cleanup. This regression has been introduced whith the original cs tlb w/a in commit `b45305fce5` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Dec 17 16:21:27 2012 +0100 drm/i915: Implement workaround for broken CS tlb on i830/845 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64610 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-09 16:31:49 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Daniel Vetter	035dc1e0f9	drm/i915: reinit status page registers after gpu reset This fixes gpu reset on my gm45 - without this patch the bsd thing is forever stuck since the seqno updates never reach the status page. Tbh I have no idea how this ever worked without rewriting the hws registers after a gpu reset. To satisfy my OCD also give the functions a bit more consistent names: - Use status_page everywhere, also for the physical addressed one. - Use init for the allocation part and setup for the register setup part consistently. Long term I'd really like to share the hw init parts completely between gpu reset, resume and driver load, i.e. to call i915_gem_init_hw instead of the individual pieces we might need. v2: Add the missing paragraph to the commit message about what bug exactly this patch here fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65495 Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-04 11:36:36 +02:00
Mika Kuoppala	0025c0772d	drm/i915: change i915_add_request to macro Only execbuffer needed all the parameters on i915_add_request(). By putting __i915_add_request behind macro, all current callsites become cleaner. Following patch will introduce a new parameter for __i915_add_request. With this patch, only the relevant callsite will reflect the change making commit smaller and easier to understand. v2: _i915_add_request as function name (Chris Wilson) v3: change name __i915_add_request and fix ordering of params (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Chris Wilson	50f018dff1	drm/i915: Initialize ring->hangcheck upon ring init When we reset and restart a ring, we also want to clear any existing hangcheck. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-11 11:40:58 +02:00
Rodrigo Vivi	fd3da6c95b	drm/i915: WA: FBC Render Nuke. WaFbcNukeOn3DBlt for IVB, HSW. According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC. Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that follows each render submission." v2: Chris noticed that flush_domains check was missing here and also suggested to do LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the module parameter check. v3: Adding Wa name as Damien suggested. v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec. v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one. v6: Check for flush domain on blt (by Ville). Check for scanout dirty (by Chris). v7: Apply proper fbc_dirty implemented by Chris. v8: remove unused variables. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-07 17:56:55 +02:00
Ben Widawsky	12638c57f3	drm/i915: Enable vebox interrupts Similar to a patch originally written by: v2: Reversed the meanings of masked and enabled (Haihao) Made non-destructive writes in case enable/disabler rps runs first (Haihao) v3: Reword error message (Damien) Modify postinstall to do the right thing based on previous fixup. (Ben) CC: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:20 +02:00
Ben Widawsky	a19d2933cb	drm/i915: vebox interrupt get/put v2: Use the correct lock to protect PM interrupt regs, this was accidentally lost from earlier (Haihao) Fix return types (Ben) Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:19 +02:00
Ben Widawsky	cc609d5da5	drm/i915: consolidate interrupt naming scheme The motivation here is we're going to add some new interrupt definitions and handling outside of the GT interrupts which is all we've managed so far (with some RPS exceptions). By consolidating the names in the future we can make thing a bit cleaner as we don't need to define register names twice, and we can leverage pretty decent overlap in HW registers since ILK. To explain briefly what is in the comments: there are two sets of interrupt masking/enabling registers. At least so far, the definitions of the two sets overlap. The old code setup distinct names for interrupts in each set, ie. one for global, and one for ring. This made things confusing when using the wrong defines in the wrong places. rebase: Modified VLV bits v2: Renamed GT_RENDER_MASTER to GT_RENDER_CS_MASTER (Damien) Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:18 +02:00
Ben Widawsky	aeb0659338	drm/i915: Convert irq_refounct to struct It's overkill on older gens, but it's useful for newer gens. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:18 +02:00
Ben Widawsky	9a8a2213a7	drm/i915: Vebox ringbuffer init v2: Add set_seqno which didn't exist before rebase (Haihao) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:12 +02:00
Ben Widawsky	ea251324ca	drm/i915: Rename ring flush functions Historically we considered the render ring to have special flush semantics and everything else to fall under a more general umbrella. Probably by coincidence more than anything we decided to make the bsd ring have the default other flush. As the new vebox ring exposes, the bsd ring is actually the weird one. Doing this allows us to call gen6_ring_flush for the vebox because calling blt_ring_flush would be weird... This patch should have no functional change. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:10 +02:00
Ben Widawsky	1950de14fd	drm/i915: Add VECS semaphore bits Like the other rings, the VECS supports semaphores. The semaphore stuff is a bit wonky so this patch on it's own should be nice for review. This patch should have no functional impact. v2: Fix the English parts of clarification (again, register names were right, text was reversed) (Damien) Restore the still valid invariant. (Damien) The bsd semaphore register should be MI_SEMAPHORE_SYNC_VVE (Damien) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:10 +02:00
Ben Widawsky	4a3dd19d94	drm/i915: Introduce VECS: the 4th ring The video enhancement command streamer is a new ring on HSW which does what it sounds like it does. This patch provides the most minimal inception of the ring. In order to support a new ring, we need to bump the number. The patch may look trivial to the untrained eye, but bumping the number of rings is a bit scary. As such the patch is not terribly useful by itself, but a pretty nice place to find issues during a bisection. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:09 +02:00
Ben Widawsky	ad776f8b09	drm/i915: Semaphore MBOX update generalization This replaces the existing MBOX update code with a more generalized calculation for emitting mbox updates. We also create a sentinel for doing the updates so we can more abstractly deal with the rings. When doing MBOX updates the code must be aware of the /other/ rings. Until now the platforms which supported semaphores had a fixed number of rings and so it made sense for the code to be very specialized (hardcoded). The patch does contain a functional change, but should have no behavioral changes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:08 +02:00
Ben Widawsky	5586181fce	drm/i915: Comments for semaphore clarification Semaphores are tied very closely to the rings in the GPU. Trivial patch adds comments to the existing code so that when we add new rings we can include comments there as well. It also helps distinguish the ring to semaphore mailbox interactions by using the ringname in the semaphore data structures. This patch should have no functional impact. v2: The English parts (as opposed to register names) of the comments were reversed. (Damien) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:08 +02:00
Wei Yongjun	56b085a01b	drm/i915: fix error return code in init_pipe_control() Fix to return -ENOMEM in the kmap() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:04 +02:00
Mika Kuoppala	92cab73451	drm/i915: track ring progression using seqnos Instead of relying in acthd, track ring seqno progression to detect if ring has hung. v2: put hangcheck stuff inside struct (Chris Wilson) v3: initialize hangcheck.seqno (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:53:54 +02:00
Damien Lespiau	8693a82487	drm/i915: Add references to some workaround we implement We did not mention the workaround name when implementing those. This should help us track what we already implement. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-10 21:56:34 +02:00
Ville Syrjälä	b9e1faa763	drm/i915: Fix PIPE_CONTROL DW/QW write through global GTT on IVB+ The bit controlling whether PIPE_CONTROL DW/QW write targets the global GTT or PPGTT moved moved from DW 2 bit 2 to DW 1 bit 24 on IVB. I verified on IVB that the fix is in fact effective. Without the fix none of the scratch writes actually landed in the pipe control page. With the fix the writes show up correctly. v2: move PIPE_CONTROL_GLOBAL_GTT_IVB setup to where other flags are set Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:47 +01:00
Ville Syrjälä	2b1086cc58	drm/i915: Print the pipe control page GTT address We already print the HWS addresses during init, so do the same for the pipe control page. Reduces guesswork when looking at hex addresses later. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:40 +01:00
Dave Airlie	6dc1c49da6	Merge branch 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux into drm-next This pulls in most of Linus tree up to -rc6, this fixes the worst lockdep reported issues and re-enables fbcon lockdep. (not the fbcon maintainer) * 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux: (529 commits) Revert "Revert "console: implement lockdep support for console_lock"" fbcon: fix locking harder fb: Yet another band-aid for fixing lockdep mess fb: rework locking to fix lock ordering on takeover	2013-02-08 12:10:18 +10:00
Chris Wilson	f05bb0c7b6	drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits On SNB, if bit 13 of GFX_MODE, Flush TLB Invalidate Mode, is not set to 1, the hardware can not program the scanline values. Those scanline values then control when the signal is sent from the display engine to the render ring for MI_WAIT_FOR_EVENTs. Note setting this bit means that TLB invalidations must be performed explicitly through the appropriate bits being set in PIPE_CONTROL. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-23 00:58:23 +01:00
Chris Wilson	1c8c38c588	drm/i915: Disable AsyncFlip performance optimisations This is a required workarounds for all products, especially on gen6+ where it causes the command streamer to fail to parse instructions following a WAIT_FOR_EVENT. We use WAIT_FOR_EVENT for synchronising between the GPU and the display engines, and so this bit being unset may cause hangs. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-23 00:58:22 +01:00
Mika Kuoppala	9943393195	drm/i915: use gem_set_seqno() on hardware init When machine was rebooted or module was reloaded, gem_hw_init() set last_seqno to be identical to next_seqno. This lead to situation that waits for first ever request always passed immediately regardless if it was actually executed. Use gem_set_seqno() to be consistent how hw is initialized on init, wrap and on resume. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-22 13:52:26 +01:00
Daniel Vetter	33196dedda	drm/i915: move wedged to the other gpu error handling stuff And to make Ben Widawsky happier, use the gpu_error instead of the entire device as the argument in some functions. Drop the outdated comment on ->wedged for now, a follow-up patch will change the semantics and add a proper comment again. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	99584db33b	drm/i915: extract hangcheck/reset/error_state state into substruct This has been sprinkled all over the place in dev_priv. I think it'd be good to also move all the code into a separate file like i915_gem_error.c, but that's for another patch. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:14 +01:00
Ben Widawsky	dabb7a91ae	drm/i915: Remove use on gma_bus_addr on gen6+ We have enough info to not use the intel_gtt bridge stuff. v2: Move setup of mappable_base above the legacy init stuff because we still need that on older platforms. (Daniel) v3: Remove the dev_priv hunk which was rebased in by accident Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:47:03 +01:00
Dave Airlie	b5cc6c0387	Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-01-17 20:34:08 +10:00
Mika Kuoppala	f7e98ad4d4	drm/i915: Initialize hardware semaphore state on ring init Hardware status page needs to have proper seqno set as our initial seqno can be arbitrary. If initial seqno is close to wrap boundary on init and i915_seqno_passed() (31bit space) refers to hw status page which contains zero, errorneous result will be returned. v2: clear mboxes and set hws page directly instead of going through rings. Suggested by Chris Wilson. v3: hws needs to be updated for all gens. Noticed by Chris Wilson. References: https://bugs.freedesktop.org/show_bug.cgi?id=58230 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:01 +01:00
Mika Kuoppala	b70ec5bf43	drm/i915: Introduce ring set_seqno In preparation for setting per ring initial seqno values add ring::set_seqno(). Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:16:18 +01:00
Daniel Vetter	b45305fce5	drm/i915: Implement workaround for broken CS tlb on i830/845 Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 17:27:02 +01:00
Mika Kuoppala	f72b3435c1	drm/i915: Don't emit semaphore wait if wrap happened If wrap just happened we need to prevent emitting waits for pre wrap values. Detect this and emit no-ops instead. v2: Use olr > seqno to detect wrap instead of *seqno == 0 as suggested by Chris Wilson. v3: Use last used seqno to detect the wraparound. From Chris Wilson v4: Fixed unnecessary last_seqno assigment References: https://bugs.freedesktop.org/show_bug.cgi?id=57967 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 13:32:26 +01:00
Mika Kuoppala	498d2ac15c	drm/i915: Add intel_ring_handle_seqno wrap If there are pre-wrap values in semaphore-mbox registers after wrap, syncing against some after-wrap request will complete immediately. Fix this by emitting ring commands to set mbox registers to zero when the wrap happens. v2: Use __intel_ring_begin to emit ring commands, from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment to handle_seqno_wrap.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Mika Kuoppala	cbcc80dff3	drm/i915: Split intel_ring_begin In preparation for handling ring seqno wrapping, split intel_ring_begin into helper part which doesn't allocate seqno. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Ville Syrjälä	633cf8f505	drm/i915: Don't allow ring tail to reach the same cacheline as head From BSpec: "If the Ring Buffer Head Pointer and the Tail Pointer are on the same cacheline, the Head Pointer must not be greater than the Tail Pointer." The easiest way to enforce this is to reduce the reported ring space. References: Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use" Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use" Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use" v2: Include the exact BSpec references in the description v3: s/64/I915_RING_FREE_SPACE, and add the BSpec information to the code Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 18:31:20 +01:00
Chris Wilson	ebc052e0c6	drm/i915: Allocate ringbuffers from stolen memory Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:41:52 +01:00
Chris Wilson	3e9605018a	drm/i915: Rearrange code to only have a single method for waiting upon the ring Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	9d7730914f	drm/i915: Preallocate next seqno before touching the ring Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:52 +01:00
Chris Wilson	1c8b46fc8c	drm/i915: Use LRI to update the semaphore registers The bspec was recently updated to remove the ability to update the semaphore using the MI_SEMAPHORE_BOX command, the ability to wait upon the semaphore value remained. Instead the advice is to update the register using the MI_LOAD_REGISTER_IMM command. In cursory testing, semaphores continue to function - the question is whether this fixes some of the deadlocks where the semaphore registers contained stale values? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel J Blueman <daniel@quora.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:00 +01:00
Chris Wilson	6b8294a4d3	drm/i915: Restore physical HWS_PGA after resume By always setting up the HWS register for both physical and virtual address variations during render ring we can reduce the number of different special cases that get set up at varying different times during module load. Fixes regression from commit `c630119f43` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Oct 17 11:32:57 2012 +0200 drm/i915: don't save/restore HWS_PGA reg for kms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-16 13:47:40 +01:00
Daniel Vetter	b3fcabb15b	drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush This has been introduced in "drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op". Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:42 +01:00
Jesse Barnes	3ac7831314	drm/i915: PIPE_CONTROL TLB invalidate requires CS stall "If ENABLED, PIPE_CONTROL command will flush the in flight data written out by render engine to Global Observation point on flush done. Also Requires stall bit ([20] of DW1) set." So set the stall bit to ensure proper invalidation. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:37 +01:00
Jesse Barnes	9a28977181	drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3 So store into the scratch space of the HWS to make sure the invalidate occurs. v2: use GTT address space for store, clean up #defines (Chris) v3: use correct #define in blt ring flush (Chris) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1063252 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:36 +01:00
Mika Kuoppala	17f10fdc01	drm/i915/ringbuffer: exclude last 2 cachelines on 845g on all callpaths Make intel_render_ring_init_dri and intel_init_ring_buffer symmetrical with regards of workaround introduced by: commit `27c1cbd06a` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Apr 9 13:59:46 2012 +0100 drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:09 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
Chris Wilson	d7d4eeddb8	drm/i915: Allow DRM_ROOT_ONLY\|DRM_MASTER to submit privileged batchbuffers With the introduction of per-process GTT space, the hardware designers thought it wise to also limit the ability to write to MMIO space to only a "secure" batch buffer. The ability to rewrite registers is the only way to program the hardware to perform certain operations like scanline waits (required for tear-free windowed updates). So we either have a choice of adding an interface to perform those synchronized updates inside the kernel, or we permit certain processes the ability to write to the "safe" registers from within its command stream. This patch exposes the ability to submit a SECURE batch buffer to DRM_ROOT_ONLY\|DRM_MASTER processes. v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security bit (bit 13, accidentally not set). Also add a comment explaining why secure batches need a global gtt binding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) [danvet: added hsw fixup.] Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-17 21:06:59 +02:00
Linus Torvalds	612a9aab56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...	2012-10-03 23:29:23 -07:00
David Howells	760285e7e7	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/ Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:07 +01:00
David Howells	4126d5d61f	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:05 +01:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Paulo Zanoni	f39876317a	drm/i915: add workarounds to gen7_render_ring_flush From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section detailing the various workarounds: "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit set must be issued before a pipe-control command that has the State Cache Invalidate bit set." Note that public Bspec has different numbering, it's Vol2Part1, Section 1.10.4.1 "PIPE_CONTROL" there. There's also a second workaround for the PIPE_CONTROL command itself: "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set, must have a CS_STALL bit set" For simplicity we simply set the CS_STALL bit on every pipe_control on gen7+ Note that this massively helps on some hsw machines, together with the following patch to unconditionally set the CS_STALL bit on every pipe_control it prevents a gpu hang every few seconds. This is a regression that has been introduced in the pipe_control cleanup: commit `6c6cf5aa9c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 It looks like the massive snb pipe_control workaround also papered over any issues on ivb and hsw. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [danvet: squashed both workarounds together, pimped commit message with Bsepc citations, regression commit citation and changed the comment in the code a bit to clarify that we unconditionally set CS_STALL to avoid being hurt by trying to be clever.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:26 +02:00
Paulo Zanoni	b311150927	drm/i915: add workarounds directly to gen6_render_ring_flush Since gen 7+ now run the new gen7_render_ring_flush function. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:26 +02:00
Paulo Zanoni	4772eaebcd	drm/i915: add gen7_render_ring_flush For now, just a copy of gen6_render_ring_flush. Different gens have different workarounds, so we want different functions. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:25 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Daniel Vetter	a22ddff8be	Linux 3.6-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQLWtvAAoJEHm+PkMAQRiG/DYH+wd0FqfEuYkYk4KPyAPuhKpX zX7HYfLvyJE/ZYIdrhjq1E6Xm2KNr7gtX7/Rdzi2W38M9sjbYzwG1UGIw51qnxWy yZJH9BGkfyQgQPeuDGohfB6DkDy2JWr2eqMDvakjOwgBsIzji0PQD/f3UvndhtUa c+tTj/kjavHE1Yr2Wy6OnRZz3Uc0hIMn/Q0JqtbCs3LUgEV1KA4OEAe56XNz4Ku4 WE+FFaGFPvtriQsQON+ohPS5IC8jzQGK/0vbrJ4lWjFnZy4gvZXnborTOwD0WSQG fbsNuxp1AaM2/pqfMwXm1w0ADvwOITHNiwwXf9id6DoK81QwTFpUdvKpn6yB6gQ= =rurr -----END PGP SIGNATURE----- Merge tag 'v3.6-rc2' into drm-intel-next Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put even more madness on top: - drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in -fixes, that has been changed in a variable-rename in -next, too. - drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr (since all their users have been extracted in another fucntion), -fixes added another user for a hw workaroudn. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:01:08 +02:00
Chris Wilson	7d54a90428	drm/i915: Apply post-sync write for pipe control invalidates When invalidating the TLBs it is documentated as requiring a post-sync write. Failure to do so seems to result in a GPU hang. Exposure to this hang on IVB seems to be a result of removing the extra stalls required for SNB pipecontrol workarounds: commit `6c6cf5aa9c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 Note: Manually switch the pipe_control cmd to 4 dwords to avoid a (silent) functional conflict with -next. This way will get a loud (but conflict with next (since the scratch_addr has been deleted there). Reported-and-tested-by: yex.tian@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322 Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: added note about merge conflict with -next.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-14 09:47:45 +02:00
Chris Wilson	b2eadbc85b	drm/i915: Lazily apply the SNB+ seqno w/a Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-10 11:11:32 +02:00
Daniel Vetter	0d8957c8a9	drm/i915: correctly order the ring init sequence We may only start to set up the new register values after having confirmed that the ring is truely off. Otherwise the hw might lose the newly written register values. This is caught later on in the init sequence, when we check whether the register writes have stuck. Cc: stable@vger.kernel.org Reviewed-by: Jani Nikula <jani.nikula@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Tested-by: Yang Guang <guang.a.yang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-08 10:23:35 +02:00
Chris Wilson	6c6cf5aa9c	drm/i915: Only apply the SNB pipe control w/a to gen6 The requirements for the sync flush to be emitted prior to the render cache flush is only true for SandyBridge. On IvyBridge and friends we can just emit the flushes with an inline CS stall. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-08 09:34:32 +02:00
Ben Widawsky	e1ef7cc299	drm/i915: Macro to determine DPF support Originally I had a macro specifically for DPF support, and Daniel, with good reason asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on currrent hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Ben Widawsky	2e6c21ed63	drm/i915: missing error case in init status page Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:40 +02:00
Chris Wilson	12f55818ba	drm/i915: Add comments to explain the BSD tail write workaround Having had to dive into the bspec to understand what each stage of the workaround meant, and how that the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aide the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:37 +02:00
Daniel Vetter	de2b998552	drm/i915: don't return a spurious -EIO from intel_ring_begin The issue with this check is that it results in userspace receiving an -EIO while the gpu reset hasn't completed, resulting in fallback to sw rendering or worse. Now there's also a stern comment in intel_ring_wait_seqno saying that intel_ring_begin should not return -EAGAIN, ever, because some callers can't handle that. But after an audit of the callsites I don't see any issues. I guess the last problematic spot disappeared with the removal of the pipelined fencing code. So do the right thing and call check_wedge, which should properly decide whether an -EAGAIN or -EIO is appropriate if wedged is set. Note that the early check for a wedged gpu before touching the ring is rather important (and it took me quite some time of acting like the densest doofus to figure that out): If we don't do that and the gpu died for good, not having been resurrect by the reset code, userspace can merrily fill up the entire ring until it notices that something is amiss. Allowing userspace to emit more render, despite that we know that it will fail can't lead to anything good (and by experience can lead to all sorts of havoc, including angering the OOM gods and hard-hanging the hw for good). v2: Fix EAGAIN mispell, noticed by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:03:45 +02:00
Daniel Vetter	d6b2c790a4	drm/i915: non-interruptible sleeps can't handle -EAGAIN So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit `b4aca0106c` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:01:14 +02:00
Daniel Vetter	97f209bcfc	drm/i915: "Flush Me Harder" required on gen6+ The prep to remove the flushing list in commit `cc889e0f6c` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list causes quite some decent regressions. We can fix this by setting the CS_STALL bit to ensure that the following seqno write happens only after the cache flush has completed. But only do that when the caller actually wants the flush (and not also when we invalidate caches before starting the next batch). I've looked through all our ancient scrolls about gen6+ pipe control workarounds, and this seems to be indeed a legal combination: We're allowed to set the CS_STALL bit when we flush the render cache (which we do). While yelling at this code, also pass back the return value from intel_emit_post_sync_nonzero_flush properly. v2: Instead of emitting more pipe controls, set the CS_STALL bit on the write flush as suggested by Chris Wilson. It seems to work, too. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua <huax.lu@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-28 21:06:25 +02:00
Daniel Vetter	7b0cfee1a2	Linux 3.5-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJP53AxAAoJEHm+PkMAQRiGs2QH/RaqkXz96fwjhDcyiKpDqA3c kGuS5mz5cOhnqKSmR88HFm6pwuhLux/qSJzeAmoQy1MC8a0ACx7AnANW0lfN3/qe /HGYz8h60yCL/fhn8/bUYtdt9xsoDqoDcq/ooFl9mcsJGWbC6WeMSZU5dAUYqviE qFrp5zjY07FG53CRGT0hFpezQNwNL+VLH30CF9LD+fJLPVEYum2zBNGXWM42rcw5 fxzGL/6SO8YqA/Upic1ht6HAd6s5LOrlST7qvnyXUMvRXN5z/Y92ueYJZefkS1Om ohuLIKM2bv9/dJS67H8N2baSKGCzBdfSe5/5WaHdLYW9MiVju0wRl6HPJtAMrkk= =H8t8 -----END PGP SIGNATURE----- Merge tag 'v3.5-rc4' into drm-intel-next-queued I want to merge the "no more fake agp on gen6+" patches into drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also adds a new use of dev->agp. Hence the backmarge to sort this out, for otherwise drm-intel-next merged into Linus' tree would conflict in the relevant code, things would compile but nicely OOPS at driver load :( Conflicts in this merge are just simple cases of "both branches changed/added lines at the same place". The only tricky part is to keep the order correct wrt the unwind code in case of errors in intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h together, obviously). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ringbuffer.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-25 19:10:36 +02:00
Ben Widawsky	12b0286f49	drm/i915: possibly invalidate TLB before context switch From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's responsibility to invalidate the TLBs at least once after the previous context switch after any GTT mappings changed (including new GTT entries). This can be done by a pipelined PIPE_CONTROL with TLB inv bit set immediately before MI_SET_CONTEXT. On GEN7 the invalidation mode is explicitly set, but this appears to be lacking for GEN6. Since I don't know the history on this, I've decided to dynamically read the value at ring init time, and use that value throughout. v2: better comment (daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:19 +02:00
Ben Widawsky	cc0f639822	drm/i915: PIPE_CONTROL_TLB_INVALIDATE This has showed up in several other patches. It's required for the next context workaround. I tested this one on its own and saw no differences in basic tests (performance or otherwise). This patch is relatively likely to cause regressions, hence why it's split out. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:18 +02:00
Daniel Vetter	dd2757f8b5	drm/i915: stop using dev->agp->base For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:18:06 +02:00
Daniel Vetter	b7884eb45e	drm/i915: hold forcewake around ring hw init Empirical evidence suggests that we need to: On at least one ivb machine when running the hangman i-g-t test, the rings don't properly initialize properly - the RING_START registers seems to be stuck at all zeros. Holding forcewake around this register init sequences makes chip reset reliable again. Note that this is not the first such issue: commit `f01db988ef` Author: Sean Paul <seanpaul@chromium.org> Date: Fri Mar 16 12:43:22 2012 -0400 drm/i915: Add wait_for in init_ring_common added delay loops to make RING_START and RING_CTL initialization reliable on the blt ring at boot-up. So I guess it won't hurt if we do this unconditionally for all force_wake needing gpus. To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new intel_info bit for that. v2: Fixup missing commas in static struct and properly handling the error case in init_ring_common, both noticed by Jani Nikula. Cc: stable@vger.kernel.org Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-04 20:25:29 +02:00
Chris Wilson	3eef8918ff	drm/i915: Mark the ringbuffers as being in the GTT domain By correctly describing the rinbuffers as being in the GTT domain, it appears that we are more careful with the management of the CPU cache upon resume and so prevent some coherency issue when submitting commands to the GPU later. A secondary effect is that the debug logs are then consistent with the actual usage (i.e. they no longer describe the ringbuffers as being in the CPU write domain when we are accessing them through an wc iomapping.) Reported-and-tested-by: Daniel Gnoutcheff <daniel@gnoutcheff.name> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41092 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-04 20:16:40 +02:00
Ben Widawsky	15b9f80e00	drm/i915: enable parity error interrupts The previous patch put all the code, and handlers in place. It should now be safe to enable the parity error interrupt. The parity error must be unmasked in both the GTIMR, and the CS IMR. Unfortunately, the docs aren't clear about this; nevertheless it's the truth. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-31 12:07:13 +02:00
Chris Wilson	c3b2003792	drm/i915: Reset last_retired_head when resetting ring When we reset the ring control registers, including the HEAD and TAIL of the ring, we also need to reset associated state. In this instance, we were failing to reset the cached value of ring->last_retired_head and so upon the first request for more space following a resume would potentially (depending on a narrow race window) believe that the HEAD had advanced much further than reality. This is a regression from: commit `a71d8d9452` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 15 11:25:36 2012 +0000 drm/i915: Record the tail at each request and use it to estimate the head Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org # 3.4 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-29 20:06:58 +02:00
Ben Widawsky	199b2bc25b	drm/i915: s/i915_wait_request/i915_wait_seqno/g Wait request is poorly named IMO. After working with these functions for some time, I feel it's much clearer to name the functions more appropriately. Of course we must update the callers to use the new name as well. This leaves room within our namespace for a real wait request function at some point. Note to maintainer: this patch is optional. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 14:18:42 +02:00
Daniel Vetter	5e13a0c5ec	Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued Backmerge of drm-next to resolve a few ugly conflicts and to get a few fixes from 3.4-rc6 (which drm-next has already merged). Note that this merge also restricts the stencil cache lra evict policy workaround to snb (as it should) - I had to frob the code anyway because the CM0_MASK_SHIFT define died in the masked bit cleanups. We need the backmerge to get Paulo Zanoni's infoframe regression fix for gm45 - further bugfixes from him touch the same area and would needlessly conflict. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-08 13:39:59 +02:00
Daniel Vetter	dc257cf154	Linux 3.4-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S 5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp 6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3 KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa 1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU= =HkMd -----END PGP SIGNATURE----- Merge tag 'v3.4-rc6' into drm-intel-next Conflicts: drivers/gpu/drm/i915/intel_display.c Ok, this is a fun story of git totally messing things up. There /shouldn't/ be any conflict in here, because the fixes in -rc6 do only touch functions that have not been changed in -next. The offending commits in drm-next are 14415745b2..1fa611065 which simply move a few functions from intel_display.c to intel_pm.c. The problem seems to be that git diff gets completely confused: $ git diff 14415745b2..1fa611065 is a nice mess in intel_display.c, and the diff leaks into totally unrelated functions, whereas $git diff --minimal 14415745b2..1fa611065 is exactly what we want. Unfortunately there seems to be no way to teach similar smarts to the merge diff and conflict generation code, because with the minimal diff there really shouldn't be any conflicts. For added hilarity, every time something in that area changes the + and - lines in the diff move around like crazy, again resulting in new conflicts. So I fear this mess will stay with us for a little longer (and might result in another backmerge down the road). Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-07 14:02:14 +02:00
Daniel Vetter	e5eb3d63c6	drm/i915: add interface to simulate gpu hangs gpu reset is a very important piece of our infrastructure. Unfortunately we only really it test by actually hanging the gpu, which often has bad side-effects for the entire system. And the gpu hang handling code is one of the rather complicated pieces of code we have, consisting of - hang detection - error capture - actual gpu reset - reset of all the gem bookkeeping - reinitialition of the entire gpu This patch adds a debugfs to selectively stopping rings by ceasing to update the hw tail pointer, which will result in the gpu no longer updating it's head pointer and eventually to the hangcheck firing. This way we can exercise the gpu hang code under controlled conditions without a dying gpu taking down the entire systems. Patch motivated by me forgetting to properly reinitialize ppgtt after a gpu reset. Usage: echo $((1 << $ringnum)) > i915_ring_stop # stops one ring echo 0xffffffff > i915_ring_stop # stops all, future-proof version then run whatever testload is desired. i915_ring_stop automatically resets after a gpu hang is detected to avoid hanging the gpu to fast and declaring it wedged. v2: Incorporate feedback from Chris Wilson. v3: Add the missing cleanup. v4: Fix up inconsistent size of ring_stop_read vs _write, noticed by Eugeni Dodonov. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-05 19:45:00 +02:00
Daniel Vetter	4225d0f219	drm/i915: fixup __iomem mixups in ringbuffer.c Two things: - ring->virtual start is an __iomem pointer, treat it accordingly. - dev_priv->status_page.page_addr is now always a cpu addr, no pointer casting needed for that. Take the opportunity to remove the unnecessary drm indirection when setting up the ringbuffer iomapping. v2: Add a compiler barrier before reading the hw status page. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:31 +02:00
Daniel Vetter	627965ad3e	drm/i915: kill pointless clearing of dev_priv->hws_map We kzalloc dev_priv, and we never use hws_map in intel_ringbuffer.c. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:30 +02:00

... 2 3 4 5 6 ...

504 Commits