linux

Commit Graph

Author	SHA1	Message	Date
Brad Volkin	351e3db2b3	drm/i915: Implement command buffer parsing logic The command parser scans batch buffers submitted via execbuffer ioctls before the driver submits them to hardware. At a high level, it looks for several things: 1) Commands which are explicitly defined as privileged or which should only be used by the kernel driver. The parser generally rejects such commands, with the provision that it may allow some from the drm master process. 2) Commands which access registers. To support correct/enhanced userspace functionality, particularly certain OpenGL extensions, the parser provides a whitelist of registers which userspace may safely access (for both normal and drm master processes). 3) Commands which access privileged memory (i.e. GGTT, HWS page, etc). The parser always rejects such commands. See the overview comment in the source for more details. This patch only implements the logic. Subsequent patches will build the tables that drive the parser. v2: Don't set the secure bit if the parser succeeds Fail harder during init Makefile cleanup Kerneldoc cleanup Clarify module param description Convert ints to bools in a few places Move client/subclient defs to i915_reg.h Remove the bits_count field OTC-Tracker: AXIA-4631 Change-Id: I50b98c71c6655893291c78a2d1b8954577b37a30 Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Appease checkpatch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-07 22:37:00 +01:00
Ville Syrjälä	8285222c48	drm/i915: We implement WaDisableAsyncFlipPerfMode:bdw Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-03-05 21:30:34 +01:00
Daniel Vetter	e01f69295b	drm/i915: Handle set_cache_level errors in the status page setup Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:18:00 +01:00
Daniel Vetter	9a6bbb6216	drm/i915: Don't pin the status page as mappable We access it through the cpu window. No functional difference expected atm since we default to a bottom-up allocation scheme. But that might eventually change so that we prefer the unmappable range for buffers that don't need cpu gtt access. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:48 +01:00
Daniel Vetter	be1fa129f5	drm/i915: Don't set PIN_MAPPABLE for legacy ringbuffers Tighter code since legacy gem has only mappable anyway. Split out from Chris vma-bind rework. Note that this is only possible due to the split-up of the mappable pin flag into PIN_GLOBAL and PIN_MAPPABLE. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:42 +01:00
Daniel Vetter	a9cc726c85	drm/i915: Handle set_cache_level errors in the pipe control scratch setup Split out from Chris vma-bind rework. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:17:35 +01:00
Daniel Vetter	1ec9e26dda	drm/i915: Consolidate binding parameters into flags Anything more than just one bool parameter is just a pain to read, symbolic constants are much better. Split out from Chris' vma-binding rework patch. v2: Undo the behaviour change in object_pin that Chris spotted. v3: Split out misplaced hunk to handle set_cache_level errors, spotted by Jani. v4: Keep the current over-zealous binding logic in the execbuffer code working with a quick hack while the overall binding code gets shuffled around. v5: Reorder the PIN_ flags for more natural patch splitup. v6: Pull out the PIN_GLOBAL split-up again. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-14 14:16:58 +01:00
Daniel Vetter	fb19e2ac7c	drm/i915: protect ringbuffer sarea update behind !MODESET Avoids surprises when userspace races open/closes against this. Cc: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 23:44:34 +01:00
Ville Syrjälä	753b1ad4a2	drm/i915: Add intel_ring_cachline_align() intel_ring_cachline_align() emits MI_NOOPs until the ring tail is aligned to a cacheline boundary. Cc: Bjoern C <lkml@call-home.ch> Cc: Alexandru DAMIAN <alexandru.damian@intel.com> Cc: Enrico Tagliavini <enrico.tagliavini@gmail.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org (prereq for the next patch) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-11 23:00:19 +01:00
Chris Wilson	1f70999f90	drm/i915: Prevent recursion by retiring requests when the ring is full As the VM do not track activity of objects and instead use a large hammer to forcibly idle and evict all of their associated objects when one is released, it is possible for that to cause a recursion when we need to wait for free space on a ring and call retire requests. (intel_ring_begin -> intel_ring_wait_request -> i915_gem_retire_requests_ring -> i915_gem_context_free -> i915_gem_evict_vm -> i915_gpu_idle -> intel_ring_begin etc) In order to remove the requirement for calling retire-requests from intel_ring_wait_request, we have to inline a couple of steps from retiring requests, notably we have to record the position of the request we wait for and use that to update the available ring space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-06 17:43:13 +01:00
Daniel Vetter	0e5539b923	Merge branch 'topic/ppgtt' into drm-intel-next-queued Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 21:14:57 +01:00
Ben Widawsky	f0a9f74c0d	drm/i915: Fix disabled semaphores The ring will emit too many if semaphores are disabled since we do not add the correct number to num_dwords anymore. This was introduced: commit `52ed23253b` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Mon Dec 16 20:50:38 2013 -0800 drm/i915: Don't emit mbox updates without semaphores FWIW, the bug was fixed later in the series. /me hangs head in shame. Daniel: Also note that we should have merged the read-only semaphore modparam before this patch. Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 09:58:26 +01:00
Chris Wilson	304d695c3d	drm/i915: Flush outstanding requests before allocating new seqno In very rare cases (such as a memory failure stress test) it is possible to fill the entire ring without emitting a request. Under this circumstance, the outstanding request is flushed and waited upon. After space on the ring is cleared, we return to emitting the new command - except that we just cleared the seqno allocated for this operation and trigger the sanity check that a request is only ever emitted with a valid seqno. The fix is to rearrange the code to make sure the allocation of the seqno for this operation is after any required flushes of outstanding operations. The bug exists since the preallocation was introduced in commit `9d7730914f` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Nov 27 16:22:52 2012 +0000 drm/i915: Preallocate next seqno before touching the ring Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-07 12:33:41 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Ben Widawsky	52ed23253b	drm/i915: Don't emit mbox updates without semaphores Aside from the fact that it leaves confusing dumps on error capture, it is entirely unnecessary, and potentially harmful in cases like BDW, where the instruction has changed. In reality (seemingly), this will have no behavioral impact. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-17 09:56:29 +01:00
Deepak S	c8d9a5905e	drm/i915: Add power well arguments to force wake routines. Added power well arguments to all the force wake routines to help us individually control power well based on the scenario. Signed-off-by: Deepak S <deepak.s@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: Resolve conflict with the removed forcewake hack and drop one spurious hunk Jesse noticed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-28 08:31:22 +01:00
Chris Wilson	67e5871be8	drm/i915: Drop forcewake w/a for missed interrupts/seqno on Sandybridge I believe, and an evening of i-g-t, that our original workaround for the missed interrupts on Sandybridge, that of holding forcewake whilst we wait for an interrupts, is no longer required. This leaves us dependent on the second workaround of forcing an UC read of the ACTHD before reading back the seqno from the snooped HWS. Dropping the forcewake should allow us to conserve a little power, not much as the GPU is meant to be busy whilst we wait for it! Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-26 10:27:55 +01:00
Ville Syrjälä	37c1d94fa8	drm/i915: Emit SRM after the MSG_FBC_REND_STATE LRI The spec tells us that we need to emit an SRM after the LRI to MSG_FBC_REND_STATE. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-21 09:06:31 +01:00
Ville Syrjälä	9688ecadd2	drm/i915: Limit FBC flush to post batch flush Don't issue the FBC nuke/cache clean command when invalidate_domains!=0. That would indicate that we're not being called for the post-batch flush. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-21 09:06:09 +01:00
Ben Widawsky	eb0d4b75d5	drm/i915/bdw: Add comment about gen8 HWS PGA This confused me some many times that I think it is appropriate to add a small comment to instruct the reader of the code that it is indeed doing what it is supposed to do. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-14 09:33:11 +01:00
Daniel Vetter	40c499f93f	drm/i915/bdw: Take render error interrupt out of the mask The handling of the error interrupts isn't wired up at all. And it hasn't been ever since ilk happened, so don't bother. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:10:10 +01:00
Ben Widawsky	780f18c84c	drm/i915/bdw: BSD init for gen8 also This was an oversight and should have been in a previous series somewhere. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:50 +01:00
Ben Widawsky	a5f3d68e2e	drm/i915/bdw: Render ring flushing PIPE_CONTROL added the high address dword. I'm not sure how the simulator let me get away with this. I've explicitly left out all the workarounds from Gen7 because in the minimal digging that I did, most don't seem necessary, and the simulator doesn't complain without them Note that BLT and BSD ring commands had already been updated previously. Just render/pipe_control should have been broken. v2: Squash in a fixup from Ville to follow the recent IVB PIPE_CONTROL updates: "BDW uses the IVB PIPE_CONTROL style for specifying GTT vs. PPGTT for the PIPE_CONTROL QW/DW write." v3: Rebase on top of Chris' cleanup to have an explicit ring->scratch buffer object instead of an opaque ring->private where everyone stores the same stuff inside. Reported-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (for the fixup) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:49 +01:00
Ben Widawsky	28cf541543	drm/i915/bdw: unleash PPGTT v2: Squash in fix from Ben: Set PPGTT batches as necessary This fixes the regression in the last couple of days when we enabled PPGTT. v3: Squash in fixup to still use GTT for secure batches from Ville: BDW doesn't have a separate secure vs. non-secure bit in MI_BATCH_BUFFER_START. So for secure batches we have to simply leave the PPGTT bit unset. Fortunately older generations (except HSW) had similar limitations so execbuffer already creates a GTT mapping for all secure batches. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:48 +01:00
Ben Widawsky	075b3bbaae	drm/i915/bdw: Update MI_FLUSH_DW The code is more verbose than necessary for the reader's sake, hopefully the compiler optimizes away the if. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:42 +01:00
Ben Widawsky	1c7a0623c7	drm/i915/bdw: dispatch updates (64b related) The command to emit batch buffers has changed to address 48b addresses. It seemed reasonable that we could still use the old instruction where emitting 0 for length would do the right thing, but it seems to bother the simulator when the code does that. Now the second dword in the command has the upper 16b of the address of the batchbuffer. v2: Remove duplicated vfun assignment. v3: Squash in VECS support changes from Zhao Yakui <yakui.zhao@intel.com> v4: Make checkpatch happy. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:41 +01:00
Ben Widawsky	abd58f0175	drm/i915/bdw: Implement interrupt changes The interrupt handling implementation remains the same as previous generations with the 4 types of registers, status, identity, mask, and enable. However the layout of where the bits go have changed entirely. To address these changes, all of the interrupt vfuncs needed special gen8 code. The way it works is there is a top level status register now which informs the interrupt service routine which unit caused the interrupt, and therefore which interrupt registers to read to process the interrupt. For display the division is quite logical, a set of interrupt registers for each pipe, and in addition to those, a set each for "misc" and port. For GT the things get a bit hairy, as seen by the code. Each of the GT units has it's own bits defined. They all look very similar and resides in 16 bits of a GT register. As an example, RCS and BCS share register 0. To compact the code a bit, at a slight expense to complexity, this is exactly how the code works as well. 2 structures are added to the ring buffer so that our ring buffer interrupt handling code knows which ring shares the interrupt registers, and a shift value (ie. the top or bottom 16 bits of the register). The above allows us to kept the interrupt register caching scheme, the per interrupt enables, and the code to mask and unmask interrupts relatively clean (again at the cost of some more complexity). Most of the GT units mentioned above are command streamers, and so the symmetry should work quite well for even the yet to be implemented rings which Broadwell adds. v2: Fixes up a couple of bugs, and is more verbose about errors in the Broadwell interrupt handler. v3: fix DE_MISC IER offset v4: Simplify interrupts: I totally misread the docs the first time I implemented interrupts, and so this should greatly simplify the mess. Unlike GEN6, we never touch the regular mask registers in irq_get/put. v5: Rebased on to of recent pch hotplug setup changes. v6: Fixup on top of moving num_pipes to intel_info. v7: Rebased on top of Egbert Eich's hpd irq handling rework. Also wired up ibx_hpd_irq_setup for gen8. v8: Rebase on top of Jani's asle handling rework. v9: Rebase on top of Ben's VECS enabling for Haswell, where he unfortunately went OCD on the gt irq #defines. Not that they're still not yet fully consistent: - Used the GT_RENDER_ #defines + bdw shifts. - Dropped the shift from the L3_PARITY stuff, seemed clearer. - s/irq_refcount/irq_refcount.gt/ v10: Squash in VECS enabling patches and the gen8_gt_irq_handler refactoring from Zhao Yakui <yakui.zhao@intel.com> v11: Rebase on top of the interrupt cleanups in upstream. v12: Rebase on top of Ben's DPF changes in upstream. v13: Drop bdw from the HAS_L3_DPF feature flag for now, it's unclear what exactly needs to be done. Requested by Ben. v14: Fix the patch. - Drop the mask of reserved bits and assorted logic, it doesn't match the spec. - Do the posting read inconditionally instead of commenting it out. - Add a GEN8_MASTER_IRQ_CONTROL definition and use it. - Fix up the GEN8_PIPE interrupt defines and give the GEN8_ prefixes - we actually will need to use them. - Enclose macros in do {} while (0) (checkpatch). - Clear DE_MISC interrupt bits only after having processed them. - Fix whitespace fail (checkpatch). - Fix overtly long lines where appropriate (checkpatch). - Don't use typedef'ed private_t (maintainer-scripts). - Align the function parameter list correctly. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v4) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> bikeshed	2013-11-08 18:09:39 +01:00
Ben Widawsky	3d57e5bd12	drm/i915: Do a fuller init after reset I had this lying around from he original PPGTT series, and thought we might try to get it in by itself. It's convenient to just call i915_gem_init_hw at reset because we'll be adding new things to that function, and having just one function to call instead of reimplementing it in two places is nice. In order to accommodate we cleanup ringbuffers in order to bring them back up cleanly. Optionally, we could also teardown/re initialize the default context but this was causing some problems on reset which I wasn't able to fully debug, and is unnecessary with the previous context init/enable split. This essentially reverts: commit `8e88a2bd59` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jun 19 18:40:00 2012 +0200 drm/i915: don't call modeset_init_hw in i915_reset It seems to work for me on ILK now. Perhaps it's due to: commit `8a5c2ae753` Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Mar 28 13:57:19 2013 -0700 drm/i915: fix ILK GPU reset for render Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:08 +02:00
Ben Widawsky	ab484f8fd6	drm/i915: Remove gen specific checks in MMIO Now that MMIO has been split up into gen specific functions it is obvious when HAS_FPGA_DBG_UNCLAIMED, HAS_FORCE_WAKE are needed. As such, we can remove this extraneous condition. As a result of this, as well as previously existing function pointers for forcewake, we no longer need the has_force_wake member in the device specific data structure. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:10 +02:00
Ben Widawsky	040d2baa62	drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:41:00 +02:00
Ben Widawsky	35a85ac606	drm/i915: Add second slice l3 remapping Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:37:04 +02:00
Chris Wilson	092467327c	drm/i915: Write RING_TAIL once per-request Ignoring the legacy DRI1 code, and a couple of special cases (to be discussed later), all access to the ring is mediated through requests. The first write to a ring will grab a seqno and mark the ring as having an outstanding_lazy_request. Either through explicitly adding a request after an execbuffer or through an implicit wait (either by the CPU or by a semaphore), that sequence of writes will be terminated with a request. So we can ellide all the intervening writes to the tail register and send the entire command stream to the GPU at once. This will reduce the number of serialising writes to the tail register by a factor or 3-5 times (depending upon architecture and number of workarounds, context switches, etc involved). This becomes even more noticeable when the register write is overloaded with a number of debugging tools. The astute reader will wonder if it is then possible to overflow the ring with a single command. It is not. When we start a command sequence to the ring, we check for available space and issue a wait in case we have not. The ring wait will in this case be forced to flush the outstanding register write and then poll the ACTHD for sufficient space to continue. The exception to the rule where everything is inside a request are a few initialisation cases where we may want to write GPU commands via the CS before userspace wakes up and page flips. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-10 15:35:58 +02:00
Chris Wilson	3c0e234c84	drm/i915; Preallocate the lazy request It is possible for us to be forced to perform an allocation for the lazy request whilst running the shrinker. This allocation may fail, leaving us unable to reclaim any memory leading to premature OOM. A neat solution to the problem is to preallocate the request at the same time as acquiring the seqno for the ring transaction. This means that we can report ENOMEM prior to touching the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:53 +02:00
Chris Wilson	1823521d2b	drm/i915: Rename ring->outstanding_lazy_request Prior to preallocating an request for lazy emission, rename the existing field to make way (and differentiate the seqno from the request struct). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:12 +02:00
Chris Wilson	0d1aacac36	drm/i915: Embed the ring->private within the struct intel_ring_buffer We now have more devices using ring->private than not, and they all want the same structure. Worse, I would like to use a scratch page from outside of intel_ringbuffer.c and so for convenience would like to reuse ring->private. Embed the object into the struct intel_ringbuffer so that we can keep the code clean. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-03 19:17:55 +02:00
Dave Airlie	9c725e5bcd	Merge branch 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux into drm-next Alex writes: This is the radeon drm-next request. Big changes include: - support for dpm on CIK parts - support for ASPM on CIK parts - support for berlin GPUs - major ring handling cleanup - remove the old 3D blit code for bo moves in favor of CP DMA or sDMA - lots of bug fixes [airlied: fix up a bunch of conflicts from drm_order removal] * 'drm-next-3.12' of git://people.freedesktop.org/~agd5f/linux: (898 commits) drm/radeon/dpm: make sure dc performance level limits are valid (CI) drm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2) drm/radeon: gcc fixes for extended dpm tables drm/radeon: gcc fixes for kb/kv dpm drm/radeon: gcc fixes for ci dpm drm/radeon: gcc fixes for si dpm drm/radeon: gcc fixes for ni dpm drm/radeon: gcc fixes for trinity dpm drm/radeon: gcc fixes for sumo dpm drm/radeonn: gcc fixes for rv7xx/eg/btc dpm drm/radeon: gcc fixes for rv6xx dpm drm/radeon: gcc fixes for radeon_atombios.c drm/radeon: enable UVD interrupts on CIK drm/radeon: fix init ordering for r600+ drm/radeon/dpm: only need to reprogram uvd if uvd pg is enabled drm/radeon: check the return value of uvd_v1_0_start in uvd_v1_0_init drm/radeon: split out radeon_uvd_resume from uvd_v4_2_resume radeon kms: fix uninitialised hotplug work usage in r100_irq_process() drm/radeon/audio: set up the sads on DCE3.2 asics drm/radeon: fix handling of variable sized arrays for router objects ... Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem_dmabuf.c drivers/gpu/drm/i915/intel_pm.c drivers/gpu/drm/radeon/cik.c drivers/gpu/drm/radeon/ni.c drivers/gpu/drm/radeon/r600.c	2013-09-02 09:31:40 +10:00
Paulo Zanoni	edbfdb4560	drm/i915: wrap GEN6_PMIMR changes Just like we're doing with the other IMR changes. One of the functional changes is that not every caller was doing the POSTING_READ. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:26 +02:00
Paulo Zanoni	43eaea1318	drm/i915: wrap GTIMR changes Just like the functions that touch DEIMR and SDEIMR, but for GTIMR. The new functions contain a POSTING_READ(GTIMR) which was not present at the 2 callers inside i915_irq.c. The implementation is based on ibx_display_interrupt_update. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:26 +02:00
Ben Widawsky	5020150b3b	drm/i915: Initialize seqno for VECS too We require n-1 mailboxes for proper semaphore synchronization. All semaphore synchronization code relies on proper values in these mailboxes. The fact that we failed to touch the vebox ring by itself was unlikely to be an issue since the HW should be initializing the values to 0. However the error framework for testing seqno wrap introduced by Mika, in addition to the hangcheck via seqno, and i915_error_first_batchbuffer() combined caused a nice explosion. The problem is caused by seqno wrap because the wrap condition is not properly setup. The wrap code attempts to set the sync mailboxes all to 0, and then set the current seqno to one less than 0. In all cases, the vebox mailbox wasn't properly being initialized. This caused a wrap to not occur. When hangcheck kicks in with the bogus seqno values, the rest just doesn't work. It makes me wonder if we shouldn't consider a dumber version of hangcheck... How we messed this up: VECS support was written before the aforementioned other features. Upon VECS being rebased, these facts were missed. Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:42 +02:00
Chris Wilson	884020bf3d	drm/i915: Invalidate TLBs for the rings after a reset After any "soft gfx reset" we must manually invalidate the TLBs associated with each ring. Empirically, it seems that a suspend/resume or D3-D0 cycle count as a "soft reset". The symptom is that the hardware would fail to note the new address for its status page, and so it would continue to write the shadow registers and breadcrumbs into the old physical address (now used by something completely different, scary). Whereas the driver would read the new status page and never see any progress, it would appear that the GPU hung immediately upon resume. Based on a patch by naresh kumar kachhi <naresh.kumar.kacchi@intel.com> Reported-by: Thiago Macieira <thiago@kde.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64725 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Thiago Macieira <thiago@kde.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-18 19:37:41 +02:00
Ben Widawsky	c37e220461	drm/i915: Add VM to pin To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Daniel Vetter	cb54b53ada	Merge commit 'Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux' This backmerges Linus' merge commit of the latest drm-fixes pull: commit `549f3a1218` Merge: `42577ca` `058ca4a` Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Tue Jul 23 15:47:08 2013 -0700 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux We've accrued a few too many conflicts, but the real reason is that I want to merge the 100% solution for Haswell concurrent registers writes into drm-intel-next. But that depends upon the 90% bandaid merged into -fixes: commit `a7cd1b8fea` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 19 20:36:51 2013 +0100 drm/i915: Serialize almost all register access Also, we can roll up on accrued conflicts. Usually I'd backmerge a tagged -rc, but I want to get this done before heading off to vacations next week ;-) Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem.c v2: For added hilarity we have a init sequence conflict around the gt_lock, so need to move that one, too. Spotted by Jani Nikula. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-25 15:18:41 +02:00
Daniel Vetter	c0d6a3dd61	drm/i915: don't enable PM_VEBOX_CS_ERROR_INTERRUPT The code to handle it is broken - there's simply no code to clear CS parser errors on gen5+. And behold, for all the other rings we also don't enable it! Leave the handling code itself in place just to be consistent with the existing mess though. And in case someone feels like fixing it all up. This has been errornously enabled in commit `12638c57f3` Author: Ben Widawsky <ben@bwidawsk.net> Date: Tue May 28 19:22:31 2013 -0700 drm/i915: Enable vebox interrupts Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:37:00 +02:00
Daniel Vetter	c7113cc35f	drm/i915: unify ring irq refcounts (again) With the simplified locking there's no reason any more to keep the refcounts seperate. v2: Readd the lost comment that ring->irq_refcount is protected by dev_priv->irq_lock. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:36:49 +02:00
Daniel Vetter	59cdb63d52	drm/i915: kill dev_priv->rps.lock Now that the rps interrupt locking isn't clearly separated (at elast conceptually) from all the other interrupt locking having a different lock stopped making sense: It protects much more than just the rps workqueue it started out with. But with the addition of VECS the separation started to blurr and resulted in some more complex locking for the ring interrupt refcount. With this we can (again) unifiy the ringbuffer irq refcounts without causing a massive confusion, but that's for the next patch. v2: Explain better why the rps.lock once made sense and why no longer, requested by Ben. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-11 14:36:43 +02:00
Daniel Vetter	aaf8a51672	drm/i915: fix up ring cleanup for the i830/i845 CS tlb w/a It's not a good idea to also run the pipe_control cleanup. This regression has been introduced whith the original cs tlb w/a in commit `b45305fce5` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Dec 17 16:21:27 2012 +0100 drm/i915: Implement workaround for broken CS tlb on i830/845 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64610 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-09 16:31:49 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Daniel Vetter	035dc1e0f9	drm/i915: reinit status page registers after gpu reset This fixes gpu reset on my gm45 - without this patch the bsd thing is forever stuck since the seqno updates never reach the status page. Tbh I have no idea how this ever worked without rewriting the hws registers after a gpu reset. To satisfy my OCD also give the functions a bit more consistent names: - Use status_page everywhere, also for the physical addressed one. - Use init for the allocation part and setup for the register setup part consistently. Long term I'd really like to share the hw init parts completely between gpu reset, resume and driver load, i.e. to call i915_gem_init_hw instead of the individual pieces we might need. v2: Add the missing paragraph to the commit message about what bug exactly this patch here fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65495 Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: lu hua <huax.lu@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-04 11:36:36 +02:00
Mika Kuoppala	0025c0772d	drm/i915: change i915_add_request to macro Only execbuffer needed all the parameters on i915_add_request(). By putting __i915_add_request behind macro, all current callsites become cleaner. Following patch will introduce a new parameter for __i915_add_request. With this patch, only the relevant callsite will reflect the change making commit smaller and easier to understand. v2: _i915_add_request as function name (Chris Wilson) v3: change name __i915_add_request and fix ordering of params (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Chris Wilson	50f018dff1	drm/i915: Initialize ring->hangcheck upon ring init When we reset and restart a ring, we also want to clear any existing hangcheck. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-11 11:40:58 +02:00
Rodrigo Vivi	fd3da6c95b	drm/i915: WA: FBC Render Nuke. WaFbcNukeOn3DBlt for IVB, HSW. According BSPec: "Workaround: Do not enable Render Command Streamer tracking for FBC. Instead insert a LRI to address 0x50380 with data 0x00000004 after the PIPE_CONTROL that follows each render submission." v2: Chris noticed that flush_domains check was missing here and also suggested to do LRI only when fbc is enabled. To avoid do a I915_READ on every flush lets use the module parameter check. v3: Adding Wa name as Damien suggested. v4: Ville noticed VLV doesn't support fbc at all and comment came wrong from spec. v5: Ville noticed than on blt a Cache Clean LRI should be used instead the Nuke one. v6: Check for flush domain on blt (by Ville). Check for scanout dirty (by Chris). v7: Apply proper fbc_dirty implemented by Chris. v8: remove unused variables. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-07 17:56:55 +02:00
Ben Widawsky	12638c57f3	drm/i915: Enable vebox interrupts Similar to a patch originally written by: v2: Reversed the meanings of masked and enabled (Haihao) Made non-destructive writes in case enable/disabler rps runs first (Haihao) v3: Reword error message (Damien) Modify postinstall to do the right thing based on previous fixup. (Ben) CC: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:20 +02:00
Ben Widawsky	a19d2933cb	drm/i915: vebox interrupt get/put v2: Use the correct lock to protect PM interrupt regs, this was accidentally lost from earlier (Haihao) Fix return types (Ben) Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:19 +02:00
Ben Widawsky	cc609d5da5	drm/i915: consolidate interrupt naming scheme The motivation here is we're going to add some new interrupt definitions and handling outside of the GT interrupts which is all we've managed so far (with some RPS exceptions). By consolidating the names in the future we can make thing a bit cleaner as we don't need to define register names twice, and we can leverage pretty decent overlap in HW registers since ILK. To explain briefly what is in the comments: there are two sets of interrupt masking/enabling registers. At least so far, the definitions of the two sets overlap. The old code setup distinct names for interrupts in each set, ie. one for global, and one for ring. This made things confusing when using the wrong defines in the wrong places. rebase: Modified VLV bits v2: Renamed GT_RENDER_MASTER to GT_RENDER_CS_MASTER (Damien) Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:18 +02:00
Ben Widawsky	aeb0659338	drm/i915: Convert irq_refounct to struct It's overkill on older gens, but it's useful for newer gens. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:18 +02:00
Ben Widawsky	9a8a2213a7	drm/i915: Vebox ringbuffer init v2: Add set_seqno which didn't exist before rebase (Haihao) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:12 +02:00
Ben Widawsky	ea251324ca	drm/i915: Rename ring flush functions Historically we considered the render ring to have special flush semantics and everything else to fall under a more general umbrella. Probably by coincidence more than anything we decided to make the bsd ring have the default other flush. As the new vebox ring exposes, the bsd ring is actually the weird one. Doing this allows us to call gen6_ring_flush for the vebox because calling blt_ring_flush would be weird... This patch should have no functional change. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:10 +02:00
Ben Widawsky	1950de14fd	drm/i915: Add VECS semaphore bits Like the other rings, the VECS supports semaphores. The semaphore stuff is a bit wonky so this patch on it's own should be nice for review. This patch should have no functional impact. v2: Fix the English parts of clarification (again, register names were right, text was reversed) (Damien) Restore the still valid invariant. (Damien) The bsd semaphore register should be MI_SEMAPHORE_SYNC_VVE (Damien) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:10 +02:00
Ben Widawsky	4a3dd19d94	drm/i915: Introduce VECS: the 4th ring The video enhancement command streamer is a new ring on HSW which does what it sounds like it does. This patch provides the most minimal inception of the ring. In order to support a new ring, we need to bump the number. The patch may look trivial to the untrained eye, but bumping the number of rings is a bit scary. As such the patch is not terribly useful by itself, but a pretty nice place to find issues during a bisection. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:09 +02:00
Ben Widawsky	ad776f8b09	drm/i915: Semaphore MBOX update generalization This replaces the existing MBOX update code with a more generalized calculation for emitting mbox updates. We also create a sentinel for doing the updates so we can more abstractly deal with the rings. When doing MBOX updates the code must be aware of the /other/ rings. Until now the platforms which supported semaphores had a fixed number of rings and so it made sense for the code to be very specialized (hardcoded). The patch does contain a functional change, but should have no behavioral changes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:08 +02:00
Ben Widawsky	5586181fce	drm/i915: Comments for semaphore clarification Semaphores are tied very closely to the rings in the GPU. Trivial patch adds comments to the existing code so that when we add new rings we can include comments there as well. It also helps distinguish the ring to semaphore mailbox interactions by using the ringname in the semaphore data structures. This patch should have no functional impact. v2: The English parts (as opposed to register names) of the comments were reversed. (Damien) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:08 +02:00
Wei Yongjun	56b085a01b	drm/i915: fix error return code in init_pipe_control() Fix to return -ENOMEM in the kmap() error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:04 +02:00
Mika Kuoppala	92cab73451	drm/i915: track ring progression using seqnos Instead of relying in acthd, track ring seqno progression to detect if ring has hung. v2: put hangcheck stuff inside struct (Chris Wilson) v3: initialize hangcheck.seqno (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:53:54 +02:00
Damien Lespiau	8693a82487	drm/i915: Add references to some workaround we implement We did not mention the workaround name when implementing those. This should help us track what we already implement. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-10 21:56:34 +02:00
Ville Syrjälä	b9e1faa763	drm/i915: Fix PIPE_CONTROL DW/QW write through global GTT on IVB+ The bit controlling whether PIPE_CONTROL DW/QW write targets the global GTT or PPGTT moved moved from DW 2 bit 2 to DW 1 bit 24 on IVB. I verified on IVB that the fix is in fact effective. Without the fix none of the scratch writes actually landed in the pipe control page. With the fix the writes show up correctly. v2: move PIPE_CONTROL_GLOBAL_GTT_IVB setup to where other flags are set Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:47 +01:00
Ville Syrjälä	2b1086cc58	drm/i915: Print the pipe control page GTT address We already print the HWS addresses during init, so do the same for the pipe control page. Reduces guesswork when looking at hex addresses later. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:40 +01:00
Dave Airlie	6dc1c49da6	Merge branch 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux into drm-next This pulls in most of Linus tree up to -rc6, this fixes the worst lockdep reported issues and re-enables fbcon lockdep. (not the fbcon maintainer) * 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux: (529 commits) Revert "Revert "console: implement lockdep support for console_lock"" fbcon: fix locking harder fb: Yet another band-aid for fixing lockdep mess fb: rework locking to fix lock ordering on takeover	2013-02-08 12:10:18 +10:00
Chris Wilson	f05bb0c7b6	drm/i915: GFX_MODE Flush TLB Invalidate Mode must be '1' for scanline waits On SNB, if bit 13 of GFX_MODE, Flush TLB Invalidate Mode, is not set to 1, the hardware can not program the scanline values. Those scanline values then control when the signal is sent from the display engine to the render ring for MI_WAIT_FOR_EVENTs. Note setting this bit means that TLB invalidations must be performed explicitly through the appropriate bits being set in PIPE_CONTROL. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-23 00:58:23 +01:00
Chris Wilson	1c8c38c588	drm/i915: Disable AsyncFlip performance optimisations This is a required workarounds for all products, especially on gen6+ where it causes the command streamer to fail to parse instructions following a WAIT_FOR_EVENT. We use WAIT_FOR_EVENT for synchronising between the GPU and the display engines, and so this bit being unset may cause hangs. References: https://bugzilla.kernel.org/show_bug.cgi?id=52311 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-23 00:58:22 +01:00
Mika Kuoppala	9943393195	drm/i915: use gem_set_seqno() on hardware init When machine was rebooted or module was reloaded, gem_hw_init() set last_seqno to be identical to next_seqno. This lead to situation that waits for first ever request always passed immediately regardless if it was actually executed. Use gem_set_seqno() to be consistent how hw is initialized on init, wrap and on resume. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-22 13:52:26 +01:00
Daniel Vetter	33196dedda	drm/i915: move wedged to the other gpu error handling stuff And to make Ben Widawsky happier, use the gpu_error instead of the entire device as the argument in some functions. Drop the outdated comment on ->wedged for now, a follow-up patch will change the semantics and add a proper comment again. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	99584db33b	drm/i915: extract hangcheck/reset/error_state state into substruct This has been sprinkled all over the place in dev_priv. I think it'd be good to also move all the code into a separate file like i915_gem_error.c, but that's for another patch. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:14 +01:00
Ben Widawsky	dabb7a91ae	drm/i915: Remove use on gma_bus_addr on gen6+ We have enough info to not use the intel_gtt bridge stuff. v2: Move setup of mappable_base above the legacy init stuff because we still need that on older platforms. (Daniel) v3: Remove the dev_priv hunk which was rebased in by accident Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:47:03 +01:00
Dave Airlie	b5cc6c0387	Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-01-17 20:34:08 +10:00
Mika Kuoppala	f7e98ad4d4	drm/i915: Initialize hardware semaphore state on ring init Hardware status page needs to have proper seqno set as our initial seqno can be arbitrary. If initial seqno is close to wrap boundary on init and i915_seqno_passed() (31bit space) refers to hw status page which contains zero, errorneous result will be returned. v2: clear mboxes and set hws page directly instead of going through rings. Suggested by Chris Wilson. v3: hws needs to be updated for all gens. Noticed by Chris Wilson. References: https://bugs.freedesktop.org/show_bug.cgi?id=58230 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:01 +01:00
Mika Kuoppala	b70ec5bf43	drm/i915: Introduce ring set_seqno In preparation for setting per ring initial seqno values add ring::set_seqno(). Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:16:18 +01:00
Daniel Vetter	b45305fce5	drm/i915: Implement workaround for broken CS tlb on i830/845 Now that Chris Wilson demonstrated that the key for stability on early gen 2 is to simple _never_ exchange the physical backing storage of batch buffers I've tried a stab at a kernel solution. Doesn't look too nefarious imho, now that I don't try to be too clever for my own good any more. v2: After discussing the various techniques, we've decided to always blit batches on the suspect devices, but allow userspace to opt out of the kernel workaround assume full responsibility for providing coherent batches. The principal reason is that avoiding the blit does improve performance in a few key microbenchmarks and also in cairo-trace replays. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - Drop the hunk which uses HAS_BROKEN_CS_TLB to implement the ring wrap w/a. Suggested by Chris Wilson. - Also add the ACTHD check from Chris Wilson for the error state dumping, so that we still catch batches when userspace opts out of the w/a.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 17:27:02 +01:00
Mika Kuoppala	f72b3435c1	drm/i915: Don't emit semaphore wait if wrap happened If wrap just happened we need to prevent emitting waits for pre wrap values. Detect this and emit no-ops instead. v2: Use olr > seqno to detect wrap instead of *seqno == 0 as suggested by Chris Wilson. v3: Use last used seqno to detect the wraparound. From Chris Wilson v4: Fixed unnecessary last_seqno assigment References: https://bugs.freedesktop.org/show_bug.cgi?id=57967 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 13:32:26 +01:00
Mika Kuoppala	498d2ac15c	drm/i915: Add intel_ring_handle_seqno wrap If there are pre-wrap values in semaphore-mbox registers after wrap, syncing against some after-wrap request will complete immediately. Fix this by emitting ring commands to set mbox registers to zero when the wrap happens. v2: Use __intel_ring_begin to emit ring commands, from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment to handle_seqno_wrap.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Mika Kuoppala	cbcc80dff3	drm/i915: Split intel_ring_begin In preparation for handling ring seqno wrapping, split intel_ring_begin into helper part which doesn't allocate seqno. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Ville Syrjälä	633cf8f505	drm/i915: Don't allow ring tail to reach the same cacheline as head From BSpec: "If the Ring Buffer Head Pointer and the Tail Pointer are on the same cacheline, the Head Pointer must not be greater than the Tail Pointer." The easiest way to enforce this is to reduce the reported ring space. References: Gen2 BSpec "1. Programming Environment" / 1.4.4.6 "Ring Buffer Use" Gen3 BSpec "vol1c Memory Interface Functions" / 2.3.4.5 "Ring Buffer Use" Gen4+ BSpec "vol1c Memory Interface and Command Stream" / 5.3.4.5 "Ring Buffer Use" v2: Include the exact BSpec references in the description v3: s/64/I915_RING_FREE_SPACE, and add the BSpec information to the code Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 18:31:20 +01:00
Chris Wilson	ebc052e0c6	drm/i915: Allocate ringbuffers from stolen memory Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:41:52 +01:00
Chris Wilson	3e9605018a	drm/i915: Rearrange code to only have a single method for waiting upon the ring Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	9d7730914f	drm/i915: Preallocate next seqno before touching the ring Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:52 +01:00
Chris Wilson	1c8b46fc8c	drm/i915: Use LRI to update the semaphore registers The bspec was recently updated to remove the ability to update the semaphore using the MI_SEMAPHORE_BOX command, the ability to wait upon the semaphore value remained. Instead the advice is to update the register using the MI_LOAD_REGISTER_IMM command. In cursory testing, semaphores continue to function - the question is whether this fixes some of the deadlocks where the semaphore registers contained stale values? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel J Blueman <daniel@quora.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:00 +01:00
Chris Wilson	6b8294a4d3	drm/i915: Restore physical HWS_PGA after resume By always setting up the HWS register for both physical and virtual address variations during render ring we can reduce the number of different special cases that get set up at varying different times during module load. Fixes regression from commit `c630119f43` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Oct 17 11:32:57 2012 +0200 drm/i915: don't save/restore HWS_PGA reg for kms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-16 13:47:40 +01:00
Daniel Vetter	b3fcabb15b	drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush This has been introduced in "drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op". Reported-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:42 +01:00
Jesse Barnes	3ac7831314	drm/i915: PIPE_CONTROL TLB invalidate requires CS stall "If ENABLED, PIPE_CONTROL command will flush the in flight data written out by render engine to Global Observation point on flush done. Also Requires stall bit ([20] of DW1) set." So set the stall bit to ensure proper invalidation. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:37 +01:00
Jesse Barnes	9a28977181	drm/i915: TLB invalidation with MI_FLUSH_DW requires a post-sync op v3 So store into the scratch space of the HWS to make sure the invalidate occurs. v2: use GTT address space for store, clean up #defines (Chris) v3: use correct #define in blt ring flush (Chris) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> References: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/1063252 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:36 +01:00
Mika Kuoppala	17f10fdc01	drm/i915/ringbuffer: exclude last 2 cachelines on 845g on all callpaths Make intel_render_ring_init_dri and intel_init_ring_buffer symmetrical with regards of workaround introduced by: commit `27c1cbd06a` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Apr 9 13:59:46 2012 +0100 drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:09 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
Chris Wilson	d7d4eeddb8	drm/i915: Allow DRM_ROOT_ONLY\|DRM_MASTER to submit privileged batchbuffers With the introduction of per-process GTT space, the hardware designers thought it wise to also limit the ability to write to MMIO space to only a "secure" batch buffer. The ability to rewrite registers is the only way to program the hardware to perform certain operations like scanline waits (required for tear-free windowed updates). So we either have a choice of adding an interface to perform those synchronized updates inside the kernel, or we permit certain processes the ability to write to the "safe" registers from within its command stream. This patch exposes the ability to submit a SECURE batch buffer to DRM_ROOT_ONLY\|DRM_MASTER processes. v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security bit (bit 13, accidentally not set). Also add a comment explaining why secure batches need a global gtt binding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) [danvet: added hsw fixup.] Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-17 21:06:59 +02:00
Linus Torvalds	612a9aab56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...	2012-10-03 23:29:23 -07:00
David Howells	760285e7e7	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/ Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:07 +01:00
David Howells	4126d5d61f	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:05 +01:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Paulo Zanoni	f39876317a	drm/i915: add workarounds to gen7_render_ring_flush From Bspec, Vol 2a, Section 1.9.3.4 "PIPE_CONTROL", intro section detailing the various workarounds: "[DevIVB {W/A}, DevHSW {W/A}]: Pipe_control with CS-stall bit set must be issued before a pipe-control command that has the State Cache Invalidate bit set." Note that public Bspec has different numbering, it's Vol2Part1, Section 1.10.4.1 "PIPE_CONTROL" there. There's also a second workaround for the PIPE_CONTROL command itself: "[DevIVB, DevVLV, DevHSW] {WA}: Every 4th PIPE_CONTROL command, not counting the PIPE_CONTROL with only read-cache-invalidate bit(s) set, must have a CS_STALL bit set" For simplicity we simply set the CS_STALL bit on every pipe_control on gen7+ Note that this massively helps on some hsw machines, together with the following patch to unconditionally set the CS_STALL bit on every pipe_control it prevents a gpu hang every few seconds. This is a regression that has been introduced in the pipe_control cleanup: commit `6c6cf5aa9c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 It looks like the massive snb pipe_control workaround also papered over any issues on ivb and hsw. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> [danvet: squashed both workarounds together, pimped commit message with Bsepc citations, regression commit citation and changed the comment in the code a bit to clarify that we unconditionally set CS_STALL to avoid being hurt by trying to be clever.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:26 +02:00
Paulo Zanoni	b311150927	drm/i915: add workarounds directly to gen6_render_ring_flush Since gen 7+ now run the new gen7_render_ring_flush function. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:26 +02:00
Paulo Zanoni	4772eaebcd	drm/i915: add gen7_render_ring_flush For now, just a copy of gen6_render_ring_flush. Different gens have different workarounds, so we want different functions. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-03 10:09:25 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Daniel Vetter	a22ddff8be	Linux 3.6-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQLWtvAAoJEHm+PkMAQRiG/DYH+wd0FqfEuYkYk4KPyAPuhKpX zX7HYfLvyJE/ZYIdrhjq1E6Xm2KNr7gtX7/Rdzi2W38M9sjbYzwG1UGIw51qnxWy yZJH9BGkfyQgQPeuDGohfB6DkDy2JWr2eqMDvakjOwgBsIzji0PQD/f3UvndhtUa c+tTj/kjavHE1Yr2Wy6OnRZz3Uc0hIMn/Q0JqtbCs3LUgEV1KA4OEAe56XNz4Ku4 WE+FFaGFPvtriQsQON+ohPS5IC8jzQGK/0vbrJ4lWjFnZy4gvZXnborTOwD0WSQG fbsNuxp1AaM2/pqfMwXm1w0ADvwOITHNiwwXf9id6DoK81QwTFpUdvKpn6yB6gQ= =rurr -----END PGP SIGNATURE----- Merge tag 'v3.6-rc2' into drm-intel-next Backmerge Linux 3.6-rc2 to resolve a few funny conflicts before we put even more madness on top: - drivers/gpu/drm/i915/i915_irq.c: Just a spurious WARN removed in -fixes, that has been changed in a variable-rename in -next, too. - drivers/gpu/drm/i915/intel_ringbuffer.c: -next remove scratch_addr (since all their users have been extracted in another fucntion), -fixes added another user for a hw workaroudn. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:01:08 +02:00
Chris Wilson	7d54a90428	drm/i915: Apply post-sync write for pipe control invalidates When invalidating the TLBs it is documentated as requiring a post-sync write. Failure to do so seems to result in a GPU hang. Exposure to this hang on IVB seems to be a result of removing the extra stalls required for SNB pipecontrol workarounds: commit `6c6cf5aa9c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 18:02:28 2012 +0100 drm/i915: Only apply the SNB pipe control w/a to gen6 Note: Manually switch the pipe_control cmd to 4 dwords to avoid a (silent) functional conflict with -next. This way will get a loud (but conflict with next (since the scratch_addr has been deleted there). Reported-and-tested-by: yex.tian@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53322 Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: added note about merge conflict with -next.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-14 09:47:45 +02:00
Chris Wilson	b2eadbc85b	drm/i915: Lazily apply the SNB+ seqno w/a Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-10 11:11:32 +02:00
Daniel Vetter	0d8957c8a9	drm/i915: correctly order the ring init sequence We may only start to set up the new register values after having confirmed that the ring is truely off. Otherwise the hw might lose the newly written register values. This is caught later on in the init sequence, when we check whether the register writes have stuck. Cc: stable@vger.kernel.org Reviewed-by: Jani Nikula <jani.nikula@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Tested-by: Yang Guang <guang.a.yang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-08 10:23:35 +02:00
Chris Wilson	6c6cf5aa9c	drm/i915: Only apply the SNB pipe control w/a to gen6 The requirements for the sync flush to be emitted prior to the render cache flush is only true for SandyBridge. On IvyBridge and friends we can just emit the flushes with an inline CS stall. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-08 09:34:32 +02:00
Ben Widawsky	e1ef7cc299	drm/i915: Macro to determine DPF support Originally I had a macro specifically for DPF support, and Daniel, with good reason asked me to change it to this. It's not the way I would have gone (and indeed I didn't), but for now there is no distinction as all platforms with L3 also have DPF. Note: The good reasons are that dpf is a l3$ feature (at least on currrent hw), hence I don't expect one to go without the other. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: added note] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Ben Widawsky	2e6c21ed63	drm/i915: missing error case in init status page Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:40 +02:00
Chris Wilson	12f55818ba	drm/i915: Add comments to explain the BSD tail write workaround Having had to dive into the bspec to understand what each stage of the workaround meant, and how that the ring broadcasting IDLE corresponded with the GT powering down the ring (i.e. rc6) add comments to aide the next reader. And since the register "is used to control all aspects of PSMI and power saving functions" that makes it quite interesting to inspect with regards to RC6 hangs, so add it to the error-state. v2: Rediscover the piece of magic, set the RNCID to 0 before waiting for the ring to wake up. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:37 +02:00
Daniel Vetter	de2b998552	drm/i915: don't return a spurious -EIO from intel_ring_begin The issue with this check is that it results in userspace receiving an -EIO while the gpu reset hasn't completed, resulting in fallback to sw rendering or worse. Now there's also a stern comment in intel_ring_wait_seqno saying that intel_ring_begin should not return -EAGAIN, ever, because some callers can't handle that. But after an audit of the callsites I don't see any issues. I guess the last problematic spot disappeared with the removal of the pipelined fencing code. So do the right thing and call check_wedge, which should properly decide whether an -EAGAIN or -EIO is appropriate if wedged is set. Note that the early check for a wedged gpu before touching the ring is rather important (and it took me quite some time of acting like the densest doofus to figure that out): If we don't do that and the gpu died for good, not having been resurrect by the reset code, userspace can merrily fill up the entire ring until it notices that something is amiss. Allowing userspace to emit more render, despite that we know that it will fail can't lead to anything good (and by experience can lead to all sorts of havoc, including angering the OOM gods and hard-hanging the hw for good). v2: Fix EAGAIN mispell, noticed by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:03:45 +02:00
Daniel Vetter	d6b2c790a4	drm/i915: non-interruptible sleeps can't handle -EAGAIN So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit `b4aca0106c` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:01:14 +02:00
Daniel Vetter	97f209bcfc	drm/i915: "Flush Me Harder" required on gen6+ The prep to remove the flushing list in commit `cc889e0f6c` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list causes quite some decent regressions. We can fix this by setting the CS_STALL bit to ensure that the following seqno write happens only after the cache flush has completed. But only do that when the caller actually wants the flush (and not also when we invalidate caches before starting the next batch). I've looked through all our ancient scrolls about gen6+ pipe control workarounds, and this seems to be indeed a legal combination: We're allowed to set the CS_STALL bit when we flush the render cache (which we do). While yelling at this code, also pass back the return value from intel_emit_post_sync_nonzero_flush properly. v2: Instead of emitting more pipe controls, set the CS_STALL bit on the write flush as suggested by Chris Wilson. It seems to work, too. Cc: Eric Anholt <eric@anholt.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua <huax.lu@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-28 21:06:25 +02:00
Daniel Vetter	7b0cfee1a2	Linux 3.5-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJP53AxAAoJEHm+PkMAQRiGs2QH/RaqkXz96fwjhDcyiKpDqA3c kGuS5mz5cOhnqKSmR88HFm6pwuhLux/qSJzeAmoQy1MC8a0ACx7AnANW0lfN3/qe /HGYz8h60yCL/fhn8/bUYtdt9xsoDqoDcq/ooFl9mcsJGWbC6WeMSZU5dAUYqviE qFrp5zjY07FG53CRGT0hFpezQNwNL+VLH30CF9LD+fJLPVEYum2zBNGXWM42rcw5 fxzGL/6SO8YqA/Upic1ht6HAd6s5LOrlST7qvnyXUMvRXN5z/Y92ueYJZefkS1Om ohuLIKM2bv9/dJS67H8N2baSKGCzBdfSe5/5WaHdLYW9MiVju0wRl6HPJtAMrkk= =H8t8 -----END PGP SIGNATURE----- Merge tag 'v3.5-rc4' into drm-intel-next-queued I want to merge the "no more fake agp on gen6+" patches into drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also adds a new use of dev->agp. Hence the backmarge to sort this out, for otherwise drm-intel-next merged into Linus' tree would conflict in the relevant code, things would compile but nicely OOPS at driver load :( Conflicts in this merge are just simple cases of "both branches changed/added lines at the same place". The only tricky part is to keep the order correct wrt the unwind code in case of errors in intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h together, obviously). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ringbuffer.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-25 19:10:36 +02:00
Ben Widawsky	12b0286f49	drm/i915: possibly invalidate TLB before context switch From http://intellinuxgraphics.org/documentation/SNB/IHD_OS_Vol1_Part3.pdf [DevSNB] If Flush TLB invalidation Mode is enabled it's the driver's responsibility to invalidate the TLBs at least once after the previous context switch after any GTT mappings changed (including new GTT entries). This can be done by a pipelined PIPE_CONTROL with TLB inv bit set immediately before MI_SET_CONTEXT. On GEN7 the invalidation mode is explicitly set, but this appears to be lacking for GEN6. Since I don't know the history on this, I've decided to dynamically read the value at ring init time, and use that value throughout. v2: better comment (daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:19 +02:00
Ben Widawsky	cc0f639822	drm/i915: PIPE_CONTROL_TLB_INVALIDATE This has showed up in several other patches. It's required for the next context workaround. I tested this one on its own and saw no differences in basic tests (performance or otherwise). This patch is relatively likely to cause regressions, hence why it's split out. Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:18 +02:00
Daniel Vetter	dd2757f8b5	drm/i915: stop using dev->agp->base For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:18:06 +02:00
Daniel Vetter	b7884eb45e	drm/i915: hold forcewake around ring hw init Empirical evidence suggests that we need to: On at least one ivb machine when running the hangman i-g-t test, the rings don't properly initialize properly - the RING_START registers seems to be stuck at all zeros. Holding forcewake around this register init sequences makes chip reset reliable again. Note that this is not the first such issue: commit `f01db988ef` Author: Sean Paul <seanpaul@chromium.org> Date: Fri Mar 16 12:43:22 2012 -0400 drm/i915: Add wait_for in init_ring_common added delay loops to make RING_START and RING_CTL initialization reliable on the blt ring at boot-up. So I guess it won't hurt if we do this unconditionally for all force_wake needing gpus. To avoid copy&pasting of the HAS_FORCE_WAKE check I've added a new intel_info bit for that. v2: Fixup missing commas in static struct and properly handling the error case in init_ring_common, both noticed by Jani Nikula. Cc: stable@vger.kernel.org Reported-and-tested-by: Yang Guang <guang.a.yang@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50522 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-04 20:25:29 +02:00
Chris Wilson	3eef8918ff	drm/i915: Mark the ringbuffers as being in the GTT domain By correctly describing the rinbuffers as being in the GTT domain, it appears that we are more careful with the management of the CPU cache upon resume and so prevent some coherency issue when submitting commands to the GPU later. A secondary effect is that the debug logs are then consistent with the actual usage (i.e. they no longer describe the ringbuffers as being in the CPU write domain when we are accessing them through an wc iomapping.) Reported-and-tested-by: Daniel Gnoutcheff <daniel@gnoutcheff.name> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41092 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-04 20:16:40 +02:00
Ben Widawsky	15b9f80e00	drm/i915: enable parity error interrupts The previous patch put all the code, and handlers in place. It should now be safe to enable the parity error interrupt. The parity error must be unmasked in both the GTIMR, and the CS IMR. Unfortunately, the docs aren't clear about this; nevertheless it's the truth. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-31 12:07:13 +02:00
Chris Wilson	c3b2003792	drm/i915: Reset last_retired_head when resetting ring When we reset the ring control registers, including the HEAD and TAIL of the ring, we also need to reset associated state. In this instance, we were failing to reset the cached value of ring->last_retired_head and so upon the first request for more space following a resume would potentially (depending on a narrow race window) believe that the HEAD had advanced much further than reality. This is a regression from: commit `a71d8d9452` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 15 11:25:36 2012 +0000 drm/i915: Record the tail at each request and use it to estimate the head Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org # 3.4 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-29 20:06:58 +02:00
Ben Widawsky	199b2bc25b	drm/i915: s/i915_wait_request/i915_wait_seqno/g Wait request is poorly named IMO. After working with these functions for some time, I feel it's much clearer to name the functions more appropriately. Of course we must update the callers to use the new name as well. This leaves room within our namespace for a real wait request function at some point. Note to maintainer: this patch is optional. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 14:18:42 +02:00
Daniel Vetter	5e13a0c5ec	Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued Backmerge of drm-next to resolve a few ugly conflicts and to get a few fixes from 3.4-rc6 (which drm-next has already merged). Note that this merge also restricts the stencil cache lra evict policy workaround to snb (as it should) - I had to frob the code anyway because the CM0_MASK_SHIFT define died in the masked bit cleanups. We need the backmerge to get Paulo Zanoni's infoframe regression fix for gm45 - further bugfixes from him touch the same area and would needlessly conflict. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-08 13:39:59 +02:00
Daniel Vetter	dc257cf154	Linux 3.4-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S 5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp 6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3 KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa 1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU= =HkMd -----END PGP SIGNATURE----- Merge tag 'v3.4-rc6' into drm-intel-next Conflicts: drivers/gpu/drm/i915/intel_display.c Ok, this is a fun story of git totally messing things up. There /shouldn't/ be any conflict in here, because the fixes in -rc6 do only touch functions that have not been changed in -next. The offending commits in drm-next are 14415745b2..1fa611065 which simply move a few functions from intel_display.c to intel_pm.c. The problem seems to be that git diff gets completely confused: $ git diff 14415745b2..1fa611065 is a nice mess in intel_display.c, and the diff leaks into totally unrelated functions, whereas $git diff --minimal 14415745b2..1fa611065 is exactly what we want. Unfortunately there seems to be no way to teach similar smarts to the merge diff and conflict generation code, because with the minimal diff there really shouldn't be any conflicts. For added hilarity, every time something in that area changes the + and - lines in the diff move around like crazy, again resulting in new conflicts. So I fear this mess will stay with us for a little longer (and might result in another backmerge down the road). Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-07 14:02:14 +02:00
Daniel Vetter	e5eb3d63c6	drm/i915: add interface to simulate gpu hangs gpu reset is a very important piece of our infrastructure. Unfortunately we only really it test by actually hanging the gpu, which often has bad side-effects for the entire system. And the gpu hang handling code is one of the rather complicated pieces of code we have, consisting of - hang detection - error capture - actual gpu reset - reset of all the gem bookkeeping - reinitialition of the entire gpu This patch adds a debugfs to selectively stopping rings by ceasing to update the hw tail pointer, which will result in the gpu no longer updating it's head pointer and eventually to the hangcheck firing. This way we can exercise the gpu hang code under controlled conditions without a dying gpu taking down the entire systems. Patch motivated by me forgetting to properly reinitialize ppgtt after a gpu reset. Usage: echo $((1 << $ringnum)) > i915_ring_stop # stops one ring echo 0xffffffff > i915_ring_stop # stops all, future-proof version then run whatever testload is desired. i915_ring_stop automatically resets after a gpu hang is detected to avoid hanging the gpu to fast and declaring it wedged. v2: Incorporate feedback from Chris Wilson. v3: Add the missing cleanup. v4: Fix up inconsistent size of ring_stop_read vs _write, noticed by Eugeni Dodonov. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-05 19:45:00 +02:00
Daniel Vetter	4225d0f219	drm/i915: fixup __iomem mixups in ringbuffer.c Two things: - ring->virtual start is an __iomem pointer, treat it accordingly. - dev_priv->status_page.page_addr is now always a cpu addr, no pointer casting needed for that. Take the opportunity to remove the unnecessary drm indirection when setting up the ringbuffer iomapping. v2: Add a compiler barrier before reading the hw status page. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:31 +02:00
Daniel Vetter	627965ad3e	drm/i915: kill pointless clearing of dev_priv->hws_map We kzalloc dev_priv, and we never use hws_map in intel_ringbuffer.c. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:30 +02:00
Ben Widawsky	b2da9fe5d5	drm/i915: remove do_retire from i915_wait_request This originates from a hack by me to quickly fix a bug in an earlier patch where we needed control over whether or not waiting on a seqno actually did any retire list processing. Since the two operations aren't clearly related, we should pull the parameter out of the wait function, and make the caller responsible for retiring if the action is desired. The only function call site which did not get an explicit retire_request call (on purpose) is i915_gem_inactive_shrink(). That code was already calling retire_request a second time. v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:20 +02:00
Daniel Vetter	63ed2cb2d1	drm/i915: rip out GEM drm feature checks We always set it so there's no point in checking. We could instead add a bit that tells us whether gem is actually initialized (i.e. either kms or gem_init_ioctl called), but that's imho not worth it. So just rip it out. There's a little change in the wait_ring timeout, but we've never run with anything else than the 60 second timeout, even on dri1 userspace. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:14 +02:00
Chris Wilson	7338aefa5c	drm/i915: Use a global lock for modifying global irq flags We were attempting to use a per-ring spinlock whilst modifying global IRQ flags. A recipe for rare missed interrupts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:09 +02:00
Daniel Vetter	6b26c86d61	drm/i915: create macros to handle masked bits ... and put them to so good use. Note that there's functional change in vlv clock gating code, we now no longer spuriously read back the current value of the bit. According to Bspec the high bits should always read zero, so ORing this in should have no effect. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:08 +02:00
Chris Wilson	c2798b19ba	drm/i915: i8xx interrupt handler gen2 hardware has some significant differences from the other interrupt routines that were glossed over and then forgotten about in the transition to KMS. Such as - 16bit IIR - PendingFlip status bit This patch reintroduces a handler specifically for gen2 for the purpose of handling pageflips correctly, simplifying code in the process. v2: Also fixup ring get/put irq to only access 16bit registers (Daniel) Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=24202 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41793 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: use posting_read16 in intel_ringbuffer.c and kill _driver from the function names.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:07 +02:00
Kenneth Graunke	3a69ddd6f8	drm/i915: Set the Stencil Cache eviction policy to non-LRA mode. Clearing bit 5 of CACHE_MODE_0 is necessary to prevent GPU hangs in OpenGL programs such as Google MapsGL, Google Earth, and gzdoom when using separate stencil buffers. Without it, the GPU tries to use the LRA eviction policy, which isn't supported. This was supposed to be off by default, but seems to be on for many machines. This cannot be done in gen6_init_clock_gating with most of the other workaround bits; the render ring needs to exist. Otherwise, the register write gets dropped on the floor (one printk will show it changed, but a second printk immediately following shows the value reverts to the old one). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47535 Cc: stable@vger.kernel.org Cc: Rob Castle <futuredub@gmail.com> Cc: Eric Appleman <erappleman@gmail.com> Cc: aaron667@gmx.net Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>	2012-04-28 08:05:15 +01:00
Daniel Vetter	31b14c9fc5	drm/i915: invalidate render cache on gen2 It looks like we also need to flush the render cache when we just invalidate it. This fixes a regression in i-g-t/gem_tiled_blits on my i855gm. I guess the render cache there is virtually indexed, so we need to clean it when changing gtt mappings. This regression has been introduce in commit `46f0f8d120` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Apr 18 11:12:11 2012 +0100 drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-20 09:28:06 +02:00
Chris Wilson	46f0f8d120	drm/i915: Don't set a MBZ bit in gen2/3 MI_FLUSH On gen2 MI_EXE_FLUSH is actually an AGP flush bit and on gen3 marked as reserved. On both it is documented as being must-be-zero. So obey the documentation, and separate the gen2 flush into its own little routine and share with gen3. This means that we can rename the existing render_ring_flush() to reflect the generation from which it first applies and remove the code for handling earlier generations from it. v2: Applies to gen3 as well v3: Make it compile and improve the commit message. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 12:39:57 +02:00
Chris Wilson	65f5687603	drm/i915: Replace open coded MI_BATCH_GTT The (2<<6) virtual memory space selector harks back to gen3 and is mandatory given our use of GTT space for batchbuffers. On gen4+, use of the GTT became mandatory and bit6 marked reserved. However the code must now explicitly set (1<<7), which conveniently is also (2<<6). To clarify the meaning for future readers, replace the open coded (2<<6) with MI_BATCH_GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 11:11:14 +02:00
Ben Widawsky	c43b563403	drm/i915: [sparse] trivial sparse fixes This should contain all the changes which require no thought to make sparse happy. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 10:34:49 +02:00
Daniel Vetter	767878908e	Linux 3.4-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEbBAABAgAGBQJPi3XOAAoJEHm+PkMAQRiGnsUH9RjHwH4YFVyuP/DKtKa6zs74 wqkpT15yITQ5WWMog4JaJFFg5rJCUd8QZr7AS/HSn0ijDyZX5VU7Rcs9cMudDzNR H/5K/AscS4fjb0HwWVqoltTWHRb9QGSwVN3+E3VCDLt9P89YJ0o3QztkkuEX5dkZ jc7reVXTfRnCcILEa9jleOzrn+OLM3j/jAjQ2hGunl8EDLzD4b17HHPoli4jEZ/5 5ibpSVsPD+AqzN+glbXvYjVItl12D0IQos/JdOwfuZriCVWLxysSSwHZTbPCyvBZ LHH4HR5T+XLSXbjJeNkUFHLzqU+d5gVRadIoWtJCxqxFjKbOs2YtzJ5Ai0nDiw== =kTkC -----END PGP SIGNATURE----- Merge tag 'v3.4-rc3' into drm-intel-next-queued Backmerge Linux 3.4-rc3 into drm-intel-next to resolve a few things that conflict/depend upon patches in -rc3: - Second part of the Sandybridge workaround series - it changes some of the same registers. - Preparation for Chris Wilson's fencing cleanup - we need the fix from -rc3 merged before we can move around all that code. - Resolve the gmbus conflict - gmbus has been disabled in 3.4 again, but should be enabled on all generations in 3.5. Conflicts: drivers/gpu/drm/i915/intel_i2c.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-17 11:16:20 +02:00
Daniel Vetter	f637fde434	drm/i915: inline enable/disable_irq into ring->get/put_irq Now that these are properly refactored this additional indirection doesn't really buy us anything but confusion. Hence inline them. This duplicates the ironlake gt enable/disable code snippet, but we've already separate ilk from gen6+ gt irq in i915_irq.c, so I think this makes more sense. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:53:44 +02:00
Daniel Vetter	28f0cbf71f	drm/i915: don't set up gem ring functions on gen5 for !kms We already disallow initialition of gem in this case in the corresponding ioctl, so don't bother setting up the gem support ring functions in the legacy dri render ring init. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:53:28 +02:00
Daniel Vetter	8620a3a908	drm/i915: consolidate ring->add_request a bit They're indentical, so just kill one. Also give the other a prefix to distinguish it from the gen6+ functions - this add_request function is not really generic code. v2: Fixup commit message as noted by Ben Widawsky. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:52:17 +02:00
Daniel Vetter	fb3256da8d	drm/i915: split up ring->dispatch_execbuffer functions Now that we can, we should split them up in a way that makes some sense and banishes the IS_ checks into init code. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:52:02 +02:00
Daniel Vetter	0fd2c20148	drm/i915: don't enable the gen6 bsd ring tail write enable on gen7 HW engineers have fixed this issue for ivb. Again, a nice cleanup possible thanks to the more flexible ring initialization. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:51:49 +02:00
Daniel Vetter	e48d86347c	drm/i915: split out the gen5 ring irq get/put functions Now that we have sensibly split up, we can nicely get rid of that ugly is_gen5 check. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:51:37 +02:00
Daniel Vetter	e367031966	drm/i915: abstract away ring-specific irq_get/put Inspired by Ben Widawsky's patch for gen6+. Now after restructuring how we set up the ring vtables and parameters, we can do this right. This kills the bsd specific get/put_irq functions, they're now the same. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:51:07 +02:00
Daniel Vetter	686cb5f9f5	drm/i915: consolidate ring->sync-to functions The waiter is always the ring itself (otherwise we'd have a decent snafu in a callsite), so we can unify this easily. Also give it the usual gen6_ prefix, in case anyone is foolish enough to implement hw semaphores for gen5. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:50:41 +02:00
Daniel Vetter	b4178f8aaf	drm/i915: don't set up rings on gen6+ for non-kms It's not supported, and with the patch to refuse loading on gen6+ without kms enabled, there's also no way we can hit this. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:42:35 +02:00
Daniel Vetter	3535d9dd5a	drm/i915: dynamically set up blt ring functions and parameters Just for consistency. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:42:30 +02:00
Daniel Vetter	58fa383587	drm/i915: dynamically set up bsd ring functions and params The same treatment for the bsd ring. Again, this will be split up further by the irq rework. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:42:25 +02:00
Daniel Vetter	59465b5f78	drm/i915: dynamically set up the render ring functions and params Our hw is simply not well-designed enough that it neatly fits into boxes. Everywhere else we set up vtables and similar things dynamically using switch statements - it's simply much more flexible. This is prep work to rework the pre-gen6 ring irq stuff - it'll add a few more differences. With the current const struct templates, that would be a mess. This leads to some unfortunate duplication with the old dri1 code, but we can reap that again because gen6 isn't actually supported there. But that's for a separate patch. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:41:36 +02:00
Daniel Vetter	dfc9ef2fb0	drm/i915: set ring->size in common ring setup code Eventually we want to scale the ring size depending upon available gtt space. For now just consolidate this instead of replicating it over all ringbuffer templates. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:41:22 +02:00
Daniel Vetter	6a848ccb80	drm/i915: rip out ring->irq_mask We only ever enable/disable one interrupt (namely user_interrupts and pipe_notify), so we don't need to track the interrupt masking state. Also rename irq_enable to irq_enable_mask, now that it won't collide - beforehand both a irq_mask and irq_enable_mask would have looked a bit strange. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-13 12:40:57 +02:00
Ben Widawsky	1500f7ea06	drm/i915: hide (seqno-1) in ringbuffer code Waiting for seqno-1 in our object synchronization code is an implementation detail given how we've decided to do the waits within the rest of our code. Requested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:14 +02:00
Dave Airlie	effbc4fd8e	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next Daniel Vetter wrote First pull request for 3.5-next, slightly large than usual because new things kept coming in since the last pull for 3.4. Highlights: - first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci ids are not yet added, and there's still quite a few patches to merge (mostly modesetting). To make QA easier I've decided to merge this stuff in pieces. - loads of cleanups and prep patches spurred by the above. Especially vlv is a real frankenstein chip, but also hsw is stretching our driver's code design. Expect more to come in this area for 3.5. - more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again, there are more patches needed (and some already queued up), but I wanted to split this a bit for better testing. - pwrite/pread rework and retuning. This series has been in the works for a few months already and a lot of i-g-t tests have been created for it. Now it's finally ready to be merged. Note that one patch in this series touches include/pagemap.h, that patch is acked-by akpm. - reduce mappable pressure and relocation throughput improvements from Chris. - mmap offset exhaustion mitigation by Chris Wilson. - a start at figuring out which codepaths in our messy dri1/ums+gem/kms driver we actually need to support by bailing out of unsupported case. The driver now refuses to load without kms on gen6+ and disallows a few ioctls that userspace never used in certain cases. More of this will definitely come. - More decoupling of global gtt and ppgtt. - Improved dual-link lvds detection by Takashi Iwai. - Shut up the compiler + plus fix the fallout (Ben) - Inverted panel brightness handling (mostly Acer manages to break things in this way). - Small fixlets and adjustements and some minor things to help debugging. Regression-wise QA reported quite a few issues on ivb, but all of them turned out to be hw stability issues which are already fixed in drm-intel-fixes (QA runs the nightly regression tests on -next alone, without -fixes automatically merged in). There's still one issue open on snb, it looks like occlusion query writes are not quite as cache coherent as we've expected. With some of the pwrite adjustements we can now reliably hit this. Kernel workaround for it is in the works." * 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits) drm/i915: VCS is not the last ring drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2 drm/i915: make quirks more verbose drm/i915: dump the DMA fetch addr register on pre-gen6 drm/i915/sdvo: Include YRPB as an additional TV output type drm/i915: disallow gem init ioctl on ilk drm/i915: refuse to load on gen6+ without kms drm/i915: extract gt interrupt handler drm/i915: use render gen to switch ring irq functions drm/i915: rip out old HWSTAM missed irq WA for vlv drm/i915: open code gen6+ ring irqs drm/i915: ring irq cleanups drm/i915: add SFUSE_STRAP registers for digital port detection drm/i915: add WM_LINETIME registers drm/i915: add WRPLL clocks drm/i915: add LCPLL control registers drm/i915: add SSC offsets for SBI access drm/i915: add port clock selection support for HSW drm/i915: add S PLL control drm/i915: add PIXCLK_GATE register ... Conflicts: drivers/char/agp/intel-agp.h drivers/char/agp/intel-gtt.c drivers/gpu/drm/i915/i915_debugfs.c	2012-04-12 10:27:01 +01:00
Chris Wilson	27c1cbd06a	drm/i915/ringbuffer: Exclude last 2 cachlines of ring on 845g The 845g shares the errata with i830 whereby executing a command within 2 cachelines of the end of the ringbuffer may cause a GPU hang. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-11 12:14:24 +02:00
Daniel Vetter	901781b997	drm/i915: use render gen to switch ring irq functions Top-level interrupt bits are usually found in the display block. It therefore makes sense to use HAS_PCH_SPLIT in i915_irq.c But the irq stuff in intel_ring.c only concerns itself with render core/gt-level interrupt sources. It therefore makes more sense to switch based on gpu gen. Kills a vlv special case. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-09 18:04:07 +02:00
Ben Widawsky	25c063004a	drm/i915: open code gen6+ ring irqs We can now open-code the get/put irq functions as they were just abstracting single register definitions. It would be nice to merge this in with the IRQ handling code... but that is too much work for me at present. In addition I could probably collapse this in to a lot of the Ironlake stuff, but I don't think it's worth the potential regressions. This patch itself should not effect functionality. CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-09 18:04:06 +02:00
Ben Widawsky	e2a1e2f024	drm/i915: ring irq cleanups - gen6 put/get only need one argument rflags and gflags are always the same (see above explanation) - remove a couple redundantly defined IRQs - reordered some lines to make things go in descending order Every ring has its own interrupts, enables, masks, and status bits that are fed into the main interrupt enable/mask/status registers. At one point in time it seemed like a good idea to make our functions support the notion that each interrupt may have a different bit position in the corresponding register (blitter parser error may be bit n in IMR, but bit m in blitter IMR). It turned out though that the HW designers did us a solid on Gen6+ and this unfortunate situation has been avoided. This allows our interrupt code to be cleaned up a bit. I jammed this into one commit because there should be no functional change with this commit, and staging it into multiple commits was unnecessarily artificial IMO. CC: Chris Wilson <chris@chris-wilson.co.uk> CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: - fixed up merged conflict with vlv changes. - added GEN6 to GT blitter bit, we only use it on gen6+. - added a comment to both ring irq bits and GT irq bits that on gen6+ these alias. - added comment that GT_BSD_USER_INTERRUPT is ilk-only. - I've got confused a bit that we still use GT_USER_INTERRUPT on ivb for the render ring - but this goes back to ilk where we have only gt interrupt bits and so we be equally confusing if changed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-09 18:04:05 +02:00
Daniel Vetter	1c7eaac737	drm/i915: apply CS reg readback trick against missed IRQ on snb Ben Widawsky reported missed IRQ issues and this patch here helps. We have one other missed IRQ report still left on snb, reported by QA: https://bugs.freedesktop.org/show_bug.cgi?id=46145 This is _not_ a regression due to the forcewake voodoo though, it started showing up before that was applied and has been on-and-off for the past few weeks. According to QA this patch does not help. But the missed IRQ is always from the blt ring (despite running piglit, so also render activity expected), so I'm hopefully that this is an issue with the blt ring itself. Tested-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-01 12:30:24 +02:00
Jesse Barnes	7e231dbe0c	drm/i915: ValleyView IRQ support ValleyView has a new interrupt architecture; best to put it in a new set of functions. Also make sure the ring mask functions handle ValleyView. FIXME: fix flipping; need to enable interrupts and call prepare/finish Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-29 00:11:22 +02:00
Sean Paul	f01db988ef	drm/i915: Add wait_for in init_ring_common I have seen a number of "blt ring initialization failed" messages where the ctl or start registers are not the correct value. Upon further inspection, if the code just waited a little bit, it would read the correct value. Adding the wait_for to these reads should eliminate the issue. Signed-off-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-18 19:10:06 +01:00
Dave Airlie	8229c885fe	drm: Merge tag 'v3.3-rc7' into drm-core-next Merge the fixes so far into core-next, needed to test intel driver. Conflicts: drivers/gpu/drm/i915/intel_ringbuffer.c	2012-03-15 10:24:32 +00:00
Chris Wilson	5d031e5b63	drm/i915: Remove use of the autoreported ringbuffer HEAD position This is a revert of `6aa56062ea`. This was originally introduced to workaround reads of the ringbuffer registers returning 0 on SandyBridge causing hangs due to ringbuffer overflow. The root cause here was reads through the GT powerwell require the forcewake dance, something we only learnt of later. Now it appears that reading the reported head position from the HWS is returning garbage, leading once again to hangs. For example, on q35 the autoreported head reports: [ 217.975608] head now 00010000, actual 00010000 [ 436.725613] head now 00200000, actual 00200000 [ 462.956033] head now 00210000, actual 00210010 [ 485.501409] head now 00400000, actual 00400020 [ 508.064280] head now 00410000, actual 00410000 [ 530.576078] head now 00600000, actual 00600020 [ 553.273489] head now 00610000, actual 00610018 which appears reasonably sane. In contrast, if we look at snb: [ 141.970680] head now 00e10000, actual 00008238 [ 141.974062] head now 02734000, actual 000083c8 [ 141.974425] head now 00e10000, actual 00008488 [ 141.980374] head now 032b5000, actual 000088b8 [ 141.980885] head now 03271000, actual 00008950 [ 142.040628] head now 02101000, actual 00008b40 [ 142.180173] head now 02734000, actual 00009050 [ 142.181090] head now 00000000, actual 00000ae0 [ 142.183737] head now 02734000, actual 00009050 In addition, the automatic reporting of the head position is scheduled to be defeatured in the future. It has no more utility, remove it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45492 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2012-02-27 08:49:56 -08:00
Chris Wilson	a71d8d9452	drm/i915: Record the tail at each request and use it to estimate the head By recording the location of every request in the ringbuffer, we know that in order to retire the request the GPU must have finished reading it and so the GPU head is now beyond the tail of the request. We can therefore provide a conservative estimate of where the GPU is reading from in order to avoid having to read back the ring buffer registers when polling for space upon starting a new write into the ringbuffer. A secondary effect is that this allows us to convert intel_ring_buffer_wait() to use i915_wait_request() and so consolidate upon the single function to handle the complicated task of waiting upon the GPU. A necessary precaution is that we need to make that wait uninterruptible to match the existing conditions as all the callers of intel_ring_begin() have not been audited to handle ERESTARTSYS correctly. By using a conservative estimate for the head, and always processing all outstanding requests first, we prevent a race condition between using the estimate and direct reads of I915_RING_HEAD which could result in the value of the head going backwards, and the tail overflowing once again. We are also careful to mark any request that we skip over in order to free space in ring as consumed which provides a self-consistency check. Given sufficient abuse, such as a set of unthrottled GPU bound cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on Sandy Bridge (i5-2520m): firefox-paintball 18927ms -> 15646ms: 1.21x speedup firefox-fishtank 12563ms -> 11278ms: 1.11x speedup which is a mild consolation for the performance those traces achieved from exploiting the buggy autoreported head. v2: Add a few more comments and make request->tail a conservative estimate as suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: resolve conflicts with retirement defering and the lack of the autoreport head removal (that will go in through -fixes).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-15 14:26:03 +01:00
Daniel Vetter	99ffa1629d	drm/i915: enable forcewake voodoo also for gen6 We still have reports of missed irqs even on Sandybridge with the HWSTAM workaround in place. Testing by the bug reporter gets rid of them with the forcewake voodoo and no HWSTAM writes. Because I've slightly botched the rebasing I've left out the ACTHD readback which is also required to get IVB working. Seems to still work on the tester's machine, so I think we should go with the more minmal approach on SNB. Especially since I've only found weak evidence for holding forcewake while waiting for an interrupt to arrive, but none for the ACTHD readback. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:57:07 +01:00
Daniel Vetter	53d227f282	drm/i915: fixup seqno allocation logic for lazy_request Currently we reserve seqnos only when we emit the request to the ring (by bumping dev_priv->next_seqno), but start using it much earlier for ring->oustanding_lazy_request. When 2 threads compete for the gpu and run on two different rings (e.g. ddx on blitter vs. compositor) hilarity ensued, especially when we get constantly interrupted while reserving buffers. Breakage seems to have been introduced in commit `6f392d5486` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Aug 7 11:01:22 2010 +0100 drm/i915: Use a common seqno for all rings. This patch fixes up the seqno reservation logic by moving it into i915_gem_next_request_seqno. The ring->add_request functions now superflously still return the new seqno through a pointer, that will be refactored in the next patch. Note that with this change we now unconditionally allocate a seqno, even when ->add_request might fail because the rings are full and the gpu died. But this does not open up a new can of worms because we can already leave behind an outstanding_request_seqno if e.g. the caller gets interrupted with a signal while stalling for the gpu in the eviciton paths. And with the bugfix we only ever have one seqno allocated per ring (and only that ring), so there are no ordering issues with multiple outstanding seqnos on the same ring. v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it clear that we only have one seqno counter for all rings. Suggested by Chris Wilson. v3: As suggested by Chris Wilson use i915_gem_next_request_seqno instead of ring->oustanding_lazy_request to make the follow-up refactoring more clearly correct. Also improve the commit message with issues discussed on irc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:57 +01:00
Daniel Vetter	9edd576d89	Merge remote-tracking branch 'airlied/drm-fixes' into drm-intel-next-queued Back-merge from drm-fixes into drm-intel-next to sort out two things: - interlaced support: -fixes contains a bugfix to correctly clear interlaced configuration bits in case the bios sets up an interlaced mode and we want to set up the progressive mode (current kernels don't support interlaced). The actual feature work to support interlaced depends upon (and conflicts with) this bugfix. - forcewake voodoo to workaround missed IRQ issues: -fixes only enabled this for ivybridge, but some recent bug reports indicate that we need this on Sandybridge, too. But in a slightly different flavour and with other fixes and reworks on top. Additionally there are some forcewake cleanup patches heading to -next that would conflict with currrent -fixes. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-10 17:14:49 +01:00
Daniel Vetter	6a233c7887	drm/i915/ringbuffer: kill snb blt workaround This was just to facilitate product enablement with pre-production hw. Allows us to kill quite a bit of cruft. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 17:50:40 +01:00
Daniel Vetter	96154f2fab	drm/i915: switch ring->id to be a real id ... and add a helpr function for the places where we want a flag. This way we can use ring->id to index into arrays. v2: Resurrect the missing beautification-space Chris Wilson noted. I'm moving this space around because I'll reuse ring_str in the next patch. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 17:32:58 +01:00
Eric Anholt	8d79c3490a	drm/i915: Remove the MI_FLUSH_ENABLE setting. We have always been using the wrong bit -- it's bit 12. However, the bit also doesn't do anything -- hardware has always accepted the MI_FLUSH command even when it was specced not to. Given that there is only one MI_FLUSH emitted in all of the driver stack on gen6+ (in i965_video.c of the 2d driver, and it should be using other code to do its flush instead), just remove the MI_FLUSH enable instead of trying to fix it. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-25 09:32:21 +01:00
Keith Packard	8f0fc977f5	Revert "drm/i915: Work around gen7 BLT ring synchronization issues." This reverts commit `42ff6572e5`. New forcewake voodoo makes this no longer necessary. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-20 10:20:44 -08:00
Daniel Vetter	4cd53c0c8b	drm/i915: paper over missed irq issues with force wake voodoo Two things seem to do the trick on my ivb machine here: - prevent the gt from powering down while waiting for seqno notification interrupts by grabbing the force_wake in get_irq (and dropping it in put_irq again). - ordering writes from the ring's CS by reading a CS register, ACTHD seems to work. Only the blt&bsd ring on ivb seem to be massively affected by this, but for paranoia do this dance also on the render ring and on snb (i.e. all gpus with forcewake). Tested with Eric's glCopyPixels loop which without this patch scores a missed irq every few seconds. This patch needs my forcewake rework to use a spinlock instead of dev->struct_mutex. After crawling through docs a lot I've found the following nugget: Internal doc "SNB GT PM Programming Guide", Section 4.3.1: "GT does not generate interrupts while in RC6 (by design)" So it looks like rc6 and irq generation are indeed related. v2: Improve the comment per Eugeni Dodonov's suggestion. v3: Add the documentation snipped. Also restrict the w/a to ivb only for -fixes, as suggested by Keith Packard. Cc: stable@kernel.org Cc: Eric Anholt <eric@anholt.net> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Eugeni Dodonov <eugeni.dodonov@intel.com> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-19 12:28:57 -08:00
Daniel Vetter	e6bfaf8542	drm/i915: don't bail out of intel_wait_ring_buffer too early In the pre-gem days with non-existing hangcheck and gpu reset code, this timeout of 3 seconds was pretty important to avoid stuck processes. But now we have the hangcheck code in gem that goes to great length to ensure that the gpu is really dead before declaring it wedged. So there's no need for this timeout anymore. Actually it's even harmful because we can bail out too early (e.g. with xscreensaver slip) when running giant batchbuffers. And our code isn't robust enough to properly unroll any state-changes, we pretty much rely on the gpu reset code cleaning up the mess (like cache tracking, fencing state, active list/request tracking, ...). With this change intel_begin_ring can only fail when the gpu is wedged, and it will return -EAGAIN (like wait_request in case the gpu reset is still outstanding). v2: Chris Wilson noted that on resume timers aren't running and hence we won't ever get kicked out of this loop by the hangcheck code. Use an insanely large timeout instead for the HAS_GEM case to prevent resume bugs from totally hanging the machine. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 10:26:31 -08:00
Eric Anholt	42ff6572e5	drm/i915: Work around gen7 BLT ring synchronization issues. Previous to this commit, testing easily reproduced a failure where the seqno would apparently arrive after the IRQ associated with it, with test programs as simple as: for (;;) { glCopyPixels(0, 0, 1, 1); glFinish(); } Various workarounds we've seen for previous generations didn't work to fix this issue, so until new information comes in, replace the IRQ waits on the BLT ring with polling. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:15 -08:00
Ben Widawsky	84f9f938be	drm/i915: Force sync command ordering (Gen6+) The docs say this is required for Gen7, and since the bit was added for Gen6, we are also setting it there pit pf paranoia. Particularly as Chris points out, if PIPE_CONTROL counts as a 3d state packet. This was found through doc inspection by Ken and applies to Gen6+; Reported-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:09:44 -08:00
Jesse Barnes	8d31528703	drm/i915: Use PIPE_CONTROL for flushing on gen6+. v2 by danvet: Use a new flag to flush the render target cache on gen6+ (hw reuses the old write flush bit), as suggested by Ben Widawsdy. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: this seems to fix cairo-perf-trace hangs on my snb] Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:41 -07:00
Kenneth Graunke	9d971b3753	drm/i915: Rename PIPE_CONTROL bit defines to be less terse. "STALL_AT_SCOREBOARD" is much clearer than "STALL_EN" now that there are several different kinds of stalls. Also, "INSTRUCTION_CACHE_INVALIDATE" is a lot easier to understand at a glance than the terse "IS_FLUSH." Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: use INVALIDATE for ro cache flags for more consistency] Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:40 -07:00
Kenneth Graunke	fcbc34e4dc	drm/i915: Remove implied length of 2 from GFX_OP_PIPE_CONTROL #define. Not all PIPE_CONTROLs have a length of 2, so remove it from the #define and make each invocation specify the desired length. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> [danvet: implement style suggestion from Ben Widawsdy] Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:40 -07:00
Ben Widawsky	c8c99b0f0d	drm/i915: Dumb down the semaphore logic While I think the previous code is correct, it was hard to follow and hard to debug. Since we already have a ring abstraction, might as well use it to handle the semaphore updates and compares. I don't expect this code to make semaphores better or worse, but you never know... v2: Remove magic per Keith's suggestions. Ran Daniel's gem_ring_sync_loop test on this. v3: Ignored one of Keith's suggestions. v4: Removed some bloat per Daniel's recommendation. Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-09-21 14:52:41 -07:00
Akshay Joshi	0206e353a0	Drivers: i915: Fix all space related issues. Various issues involved with the space character were generating warnings in the checkpatch.pl file. This patch removes most of those warnings. Signed-off-by: Akshay Joshi <me@akshayjoshi.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-09-19 18:01:47 -07:00
Jesse Barnes	b095cd0a0c	drm/i915: set GFX_MODE to pre-Ivybridge default value even on Ivybridge Prior to Ivybridge, the GFX_MODE would default to 0x800, meaning that MI_FLUSH would flush the TLBs in addition to the rest of the caches indicated in the MI_FLUSH command. However starting with Ivybridge, the register defaults to 0x2800 out of reset, meaning that to invalidate the TLB we need to use PIPE_CONTROL. Since we're not doing that yet, go back to the old default so things work. v2: don't forget to actually clear the new bit Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2011-08-19 11:57:12 -07:00
Keith Packard	df7976797f	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-07-22 13:40:42 -07:00
Keith Packard	f3234706a7	drm/i915: Initialize RCS ring status page address in intel_render_ring_init_dri Physically-addressed hardware status pages are initialized early in the driver load process by i915_init_phys_hws. For UMS environments, the ring structure is not initialized until the X server starts. At that point, the entire ring structure is re-initialized with all new values. Any values set in the ring structure (including ring->status_page.page_addr) will be lost when the ring is re-initialized. This patch moves the initialization of the status_page.page_addr value to intel_render_ring_init_dri. Signed-off-by: Keith Packard <keithp@keithp.com> Cc: stable@kernel.org	2011-07-22 13:36:52 -07:00
Chris Wilson	e4ffd173a1	drm/i915: Add an interface to dynamically change the cache level [anholt v2: Don't forget that when going from cached to uncached, we haven't been tracking the write domain from the CPU perspective, since we haven't needed it for GPU coherency.] [ickle v3: We also need to make sure we relinquish any fences on older chipsets and clear the GTT for sane domain tracking.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:16 -07:00
Feng, Boqun	8547920fc6	drm/i915: clean up unused ring_get_irq/ring_put_irq functions This patch depends on patch "drm/i915: fix user irq miss in BSD ring on g4x". Once the previous patch apply, ring_get_irq/ring_put_irq become unused. So simply remove them. Signed-off-by: Feng, Boqun <boqun.feng@intel.com> Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-16 12:54:16 -07:00
Feng, Boqun	5bfa1063a7	drm/i915: fix user irq miss in BSD ring on g4x On g4x, user interrupt in BSD ring is missed. This is because though g4x and ironlake share the same bsd_ring, their interrupt control interfaces have _two_ differences. 1.different irq enable/disable functions: On g4x are i915_enable_irq and i915_disable_irq. On ironlake are ironlake_enable_irq and ironlake_disable_irq. 2.different irq flag: On g4x user interrupt flag in BSD ring on is I915_BSD_USER_INTERRUPT. On ironlake is GT_BSD_USER_INTERRUPT Old bsd_ring_get/put_irq call ring_get_irq and ring_get_irq. ring_get_irq and ring_put_irq only call ironlake_enable/disable_irq. So comes the irq miss on g4x. To fix this, as other rings' code do, conditionally call different functions(i915_enable/disable_irq and ironlake_enable/disable_irq) and use different interrupt flags in bsd_ring_get/put_irq. Signed-off-by: Feng, Boqun <boqun.feng@intel.com> Reviewed-by: Xiang, Haihao <haihao.xiang@intel.com> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-16 12:54:05 -07:00
Eric Anholt	4593010b68	drm/i915: Update the location of the ringbuffers' HWS_PGA registers for IVB. They have been moved from the ringbuffer groups to their own group it looks like. Fixes GPU hangs on gnome startup. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 18:12:52 -07:00
Jesse Barnes	65d3eb1e06	drm/i915: ring support for Ivy Bridge Use Sandy Bridge paths in a few places. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 17:10:33 -07:00
Chris Wilson	93dfb40cd8	drm/i915: Rename agp_type to cache_level ... to clarify just how we use it inside the driver and remove the confusion of the poorly matching agp_type names. We still need to translate through agp_type for interface into the fake AGP driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:56:43 -07:00
Ben Widawsky	96f298aa9c	drm/1915: ringbuffer wait for idle function Added a new function which waits for the ringbuffer space to be equal to (total - 8). This is the empty condition of the ringbuffer, and equivalent to head==tail. Also modified two users of this functionality elsewhere in the code. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-05-10 13:56:40 -07:00
Chris Wilson	b259f6730c	drm/i915: Move the irq wait queue initialisation into the ring init Required so that we don't obliterate the queue if initialising the rings after the global IRQ handler is installed. [Jesse, you recently looked at refactoring the IRQ installation routines, does moving the initialisation of ring buffer data structures away from that routine make sense in your grand scheme?] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:19:12 -07:00
Chris Wilson	36d527dead	drm/i915: Restore missing command flush before interrupt on BLT ring We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so insuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notable when using lots of command buffers and so potentially rewritting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired). As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring(). Fixes gem_linear_blit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: mengmeng.meng@intel.com	2011-03-23 09:17:01 +00:00
Chris Wilson	9035a97a32	Merge branch 'drm-intel-fixes' into drm-intel-next Grab the latest stabilisation bits from -fixes and some suspend and resume fixes from linus. Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_irq.c	2011-02-16 09:44:30 +00:00
Chris Wilson	db53a30261	drm/i915: Refine tracepoints A lot of minor tweaks to fix the tracepoints, improve the outputting for ftrace, and to generally make the tracepoints useful again. It is a start and enough to begin identifying performance issues and gaps in our coverage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 14:59:18 +00:00
Chris Wilson	71a77e07d0	drm/i915: Invalidate TLB caches on SNB BLT/BSD rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2011-02-02 15:52:38 +00:00
Chris Wilson	21dd373486	drm/i915: Defer reporting EIO until we try to use the GPU Instead of reporting EIO upfront in the entrance of an ioctl that may or may not attempt to use the GPU, defer the actual detection of an invalid ioctl to when we issue a GPU instruction. This allows us to continue to use bo in video memory (via pread/pwrite and mmap) after the GPU has hung. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-27 11:06:07 +00:00
Chris Wilson	29ee399131	drm/i915: Silence a few -Wunused-but-set-variable Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-25 10:33:11 +00:00
Chris Wilson	bdd92c9ad2	Merge branch 'drm-intel-fixes' into drm-intel-next Merge important suspend and resume regression fixes and resolve the small conflict. Conflicts: drivers/gpu/drm/i915/i915_dma.c	2011-01-24 23:45:32 +00:00
Chris Wilson	c7dca47bd6	drm/i915/ringbuffer: Fix use of stale HEAD position whilst polling for space During suspend, Linus found that his machine would hang for 3 seconds, and identified that intel_ring_buffer_wait() was the culprit: "Because from looking at the code, I get the notion that "intel_read_status_page()" may not be exact. But what happens if that inexact value matches our cached ring->actual_head, so we never even try to read the exact case? Does it _stay_ inexact for arbitrarily long times? If so, we might wait for the ring to empty forever (well, until the timeout - the behavior I see), even though the ring really _is_ empty." As the reported HEAD position is only updated every time it crosses a 64k boundary, whilst draining the ring it is indeed likely to remain one value. If that value matches the last known HEAD position, we never read the true value from the register and so trigger a timeout. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-20 17:26:57 +00:00
Chris Wilson	c0c06bd244	drm/i915/ringbuffer: Kill an annoyingly frequent debug message This is better handled through the tracepoints and just clutters the debug logs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-20 11:21:40 +00:00

... 2 3 4 5 6 ...

431 Commits