linux

Commit Graph

Author	SHA1	Message	Date
Chris Wilson	6e4930f6ee	drm/i915: Flush GPU rendering with a lockless wait during a pagefault Arjan van de Ven reported that on his test machine that he was seeing stalls of greater than 1 frame greatly impacting the user experience. He tracked this down to being the locked flush during a pagefault as being the culprit hogging the struct_mutex and so blocking any other user from proceeding. Stalling on a pagefault is bad behaviour on userspace's part, for one it means that they are ignoring the coherency rules on pointer access through the GTT, but fortunately we can apply the same trick as the set-to-domain ioctl to do a lightweight, nonblocking flush of outstanding rendering first. "Prior to the patch it looks like this (this one testrun does not show the 20ms+ I've seen occasionally) 4.99 ms 2.36 ms 31360 __wait_seqno i915_wait_seqno i915_gem_object_wait_rendering i915_gem_object_set_to_gtt_domain i915_gem_fault __do_fault handle_ +pte_fault handle_mm_fault __do_page_fault do_page_fault page_fault 4.99 ms 2.75 ms 107751 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.99 ms 1.63 ms 1666 i915_mutex_lock_interruptible i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_fault do_page_fault page_fa +ult 4.93 ms 2.45 ms 980 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 4.89 ms 2.20 ms 3283 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.34 ms 1.66 ms 1715 i915_mutex_lock_interruptible i915_gem_pwrite_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.73 ms 3.73 ms 49 i915_mutex_lock_interruptible i915_gem_set_domain_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.17 ms 0.33 ms 931 i915_mutex_lock_interruptible i915_gem_madvise_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.97 ms 0.43 ms 1029 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.55 ms 0.51 ms 735 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret After the patch it looks like this: 4.99 ms 2.14 ms 22212 __wait_seqno i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 4.86 ms 0.99 ms 14170 __wait_seqno i915_gem_object_wait_rendering__nonblocking i915_gem_fault __do_fault handle_pte_fault handle_mm_fault __do_page_ +fault do_page_fault page_fault 3.59 ms 1.31 ms 325 i915_gem_get_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 3.37 ms 3.37 ms 65 i915_mutex_lock_interruptible i915_gem_wait_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 2.58 ms 2.58 ms 65 i915_mutex_lock_interruptible i915_gem_do_execbuffer.isra.23 i915_gem_execbuffer2 drm_ioctl i915_compat_ioctl compat_sys_ioctl +ia32_sysret 2.19 ms 2.19 ms 65 i915_mutex_lock_interruptible intel_crtc_page_flip drm_mode_page_flip_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_ +sysret 2.18 ms 2.18 ms 65 i915_mutex_lock_interruptible i915_gem_busy_ioctl drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret 1.66 ms 1.66 ms 65 i915_gem_set_tiling drm_ioctl i915_compat_ioctl compat_sys_ioctl ia32_sysret It may not look like it, but this is quite a large difference, and I've been unable to reproduce > 5 msec delays at all, while before they do happen (just not in the trace above)." gem_gtt_hog on an old Pineview (GMA3150), before: 4969.119ms after: 4122.749ms Reported-by: Arjan van de Ven <arjan.van.de.ven@intel.com> Testcase: igt/gem_gtt_hog Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:59 +01:00
Chris Wilson	bd9b6a4ec5	drm/i915: Downgrade ERROR message for invalid user input When we detect that the user passed along an invalid handle or object, we emit a warning as an aide for debugging. Since these are indeed only for debugging user triggerable errors (and the errors are reported back to userspace by the errno), the messages should only be at the debug level and not claiming that there is a catastrophic error in the driver/hardware. References: https://bugs.freedesktop.org/show_bug.cgi?id=74704 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:54 +01:00
Damien Lespiau	3d13ef2e2d	drm/i915: Always use INTEL_INFO() to access the device_info structure If we make sure that all the dev_priv->info usages are wrapped by INTEL_INFO(), we can easily modify the ->info field to be structure and not a pointer while keeping the const protection in the INTEL_INFO() macro. v2: Rebased onto latest drm-nightly Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-12 18:52:50 +01:00
Chris Wilson	8c99e57d39	drm/i915: Treat using a purged buffer as a source of EFAULT Since a purged buffer is one without any associated pages, attempting to use it should generate EFAULT rather than EINVAL, as it is not strictly an invalid parameter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 17:03:32 +01:00
Chris Wilson	45d678173a	drm/i915: Convert EFAULT into a silent SIGBUS EFAULT will be a possible return code where backing storage is transient, such after it is purged by madvise. As such it is to be expected and so should not trigger a WARN inside i915_gem_fault() but be converted silently to SIGBUS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 17:03:27 +01:00
Mika Kuoppala	e38486943e	drm/i915: release mutex in i915_gem_init()'s error path Found with smatch. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 12:10:45 +01:00
Mika Kuoppala	939fd76208	drm/i915: Get rid of acthd based guilty batch search As we seek the guilty batch using request and hangcheck score, this code is not needed anymore. v2: Rebase. Passing dev_priv instead of getting it from last_ring Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 11:57:29 +01:00
Mika Kuoppala	b6b0fac04d	drm/i915: Use hangcheck score to find guilty context With full ppgtt using acthd is not enough to find guilty batch buffer. We get multiple false positives as acthd is per vm. Instead of scanning which vm was running on a ring, to find corressponding context, use a different, simpler, strategy of finding batches that caused gpu hang: If hangcheck has declared ring to be hung, find first non complete request on that ring and claim it was guilty. v2: Rebase Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73652 Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-02-04 11:57:24 +01:00
Mika Kuoppala	3fac8978f5	drm/i915: Tune down debug output when context is banned If we have stopped rings then we know that test is running so no need for spam. In addition, only spam when default context gets banned. v2: - make sure default context ban gets shown (Chris) - use helper for checking for default context, everywhere (Chris) v3: - dont be quiet when debug is set (Ben, Daniel) Reference: https://bugs.freedesktop.org/show_bug.cgi?id=73652 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-30 17:25:38 +01:00
Mika Kuoppala	44e2c0705a	drm/i915: Use i915_hw_context to set reset stats With full ppgtt support drm_i915_file_private gained knowledge about the default context. Also reset stats are now inside i915_hw_context so we can use proper abstraction. v2: Move BUG_ON and WARN_ON to more proper locations (Ben) v3: Pass dev directly to i915_context_is_banned to avoid the need to dereference ctx->last_ring. Spotted by Mika when checking my s/BUG/WARN/ change, I've missed this ->last_ring dereference. Suggested-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v2) Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v2) [danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-30 17:24:36 +01:00
Daniel Vetter	6ba844b090	drm/i915: GEN7_MSG_CONTROL is ivb-only At least I couldn't find it in the Haswell Bspec any more and we've tried to test-boot a Haswell machine with num_pipes forced to 0 (i.e. hit the PCH_NOP path) and the unclaimed register logic complained. So restrict this dance to just ivb platforms. v2: Art pointed out that the bits simply moved on hsw+ v3: Buy code terseneness with a notch of sublety as suggested by Chris. v4: Frob the right bit, spotted by Art. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arthur Ranyan <arthur.j.runyan@intel.com> Cc: Dave Airlie <airlied@gmail.com> Reviewed-by: Art Runyan <arthur.j.runyan@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:47 +01:00
Jani Nikula	d330a9530c	drm/i915: move module parameters into a struct, in a new file With 20+ module parameters, I think referring to them via a struct improves clarity over just having a bunch of globals. While at it, move the parameter initialization and definitions into a new file i915_params.c to reduce clutter in i915_drv.c. Apart from the ill-named i915_enable_rc6, i915_enable_fbc and i915_enable_ppgtt parameters, for which we lose the "i915_" prefix internally, the module parameters now look the same both on the kernel command line and in code. For example, "i915.modeset". The downsides of the change are losing static on a couple of variables and not having the initialization and module_param_named() right next to each other. On the other hand, all module parameters are now defined in one place at i915_params.c. Plus you can do this to find all module parameter references: $ git grep "i915\." -- drivers/gpu/drm/i915 v2: - move the definitions into a new file - s/i915_params/i915/ - make i915_try_reset i915.reset, for consistency Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-27 17:16:45 +01:00
Daniel Vetter	0e5539b923	Merge branch 'topic/ppgtt' into drm-intel-next-queued Because whatever.* * This should contain a fairly long list of issues and still unresolved resgressions, but I didn't really get a vote. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 21:14:57 +01:00
Chris Wilson	f72d21eddf	drm/i915: Place the Global GTT VM first in the list of VM This is useful for debugging as we then know that the first entry is always the global GTT, and all later entries the per-process GTT VM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-25 20:07:15 +01:00
Chris Wilson	5dce5b9387	drm/i915: Wait for completion of pending flips when starved of fences On older generations (gen2, gen3) the GPU requires fences for many operations, such as blits. The display hardware also requires fences for scanouts and this leads to a situation where an arbitrary number of fences may be pinned by old scanouts following a pageflip but before we have executed the unpin workqueue. This is unpredictable by userspace and leads to random EDEADLK when submitting an otherwise benign execbuffer. However, we can detect when we have an outstanding flip and so cause userspace to wait upon their completion before finally declaring that the system is starved of fences. This is really no worse than forcing the GPU to stall waiting for older execbuffer to retire and release their fences before we can reallocate them for the next execbuffer. v2: move the test for a pending fb unpin to a common routine for later reuse during eviction Reported-and-tested-by: dimon@gmx.net Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73696 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:40 +01:00
Ben Widawsky	1d62beeaeb	drm/i915/ppgtt: Defer request freeing on reset We need to defer the free request until the object/vma is capable of being freed - or else we have a problem when we try to destroy the context. The exact same issue is described and fixed here: commit `e20780439b` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:22 2013 -0800 drm/i915: Defer request freeing I had this fix previously, but decided not to keep it for some reason I can no longer remember. gem_reset_stats is a really good test at hitting the problem. For the inquisitive: [ 170.516392] ------------[ cut here ]------------ [ 170.517227] WARNING: CPU: 1 PID: 105 at drivers/gpu/drm/drm_mm.c:578 drm_mm_takedown+0x2e/0x30 [drm]() [ 170.518064] Memory manager not clean during takedown. [ 170.518941] CPU: 1 PID: 105 Comm: kworker/1:1 Not tainted 3.13.0-rc4-BEN+ #28 [ 170.519787] Hardware name: Hewlett-Packard HP EliteBook 8470p/179B, BIOS 68ICF Ver. F.02 04/27/2012 [ 170.520662] Call Trace: [ 170.521517] [<ffffffff814f0589>] dump_stack+0x4e/0x7a [ 170.522373] [<ffffffff81049e6d>] warn_slowpath_common+0x7d/0xa0 [ 170.523227] [<ffffffff81049edc>] warn_slowpath_fmt+0x4c/0x50 [ 170.524079] [<ffffffffa06c414e>] drm_mm_takedown+0x2e/0x30 [drm] [ 170.524934] [<ffffffffa07213f3>] gen6_ppgtt_cleanup+0x23/0x110 [i915] [ 170.525777] [<ffffffffa07837ed>] ppgtt_release.part.5+0x24/0x29 [i915] [ 170.526603] [<ffffffffa071aaa5>] i915_gem_context_free+0x195/0x1a0 [i915] [ 170.527423] [<ffffffffa071189d>] i915_gem_free_request+0x9d/0xb0 [i915] [ 170.528247] [<ffffffffa0718af9>] i915_gem_reset+0x1f9/0x3f0 [i915] [ 170.529065] [<ffffffffa0700cce>] i915_reset+0x4e/0x180 [i915] [ 170.529870] [<ffffffffa070829d>] i915_error_work_func+0xcd/0x120 [i915] [ 170.530666] [<ffffffff8106c13a>] process_one_work+0x1fa/0x6d0 [ 170.531453] [<ffffffff8106c0d8>] ? process_one_work+0x198/0x6d0 [ 170.532230] [<ffffffff8106c72b>] worker_thread+0x11b/0x3a0 [ 170.532996] [<ffffffff8106c610>] ? process_one_work+0x6d0/0x6d0 [ 170.533771] [<ffffffff810743ef>] kthread+0xff/0x120 [ 170.534548] [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80 [ 170.535322] [<ffffffff814f97ac>] ret_from_fork+0x7c/0xb0 [ 170.536089] [<ffffffff810742f0>] ? insert_kthread_work+0x80/0x80 [ 170.536847] ---[ end trace 3d4c12892e42d58f ]--- v2: Whitespace fix. (Chris) Note: This is a bug that only hits the ppgtt topic branch but I've figured that doing the request cleanup in this order is generally the right thing to do. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Add a code comment to clarify what's actually going on since the lifetime rules aroung ppgtt cleanup are ... fuzzy a best atm. Also add a note about why we need this.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:37 +01:00
Daniel Vetter	8664850087	drm/i915: Tune down reset_stat output from ERROR to debug This is user-triggerable and hence we should not allow it to spam dmesg. Also, it upsets the nice dmesg tracking piglit does. Note that this is just extra debugging information, mostly unwanted, in case of a hang and that there is a separate message to the user giving instructions on how to report a bug for a GPU hang. v2: Add note as suggests in Chris' reply. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72740 Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-22 10:34:35 +01:00
Daniel Vetter	0d9d349d87	Merge commit origin/master into drm-intel-next Conflicts are getting out of hand, and now we have to shuffle even more in -next which was also shuffled in -fixes (the call for drm_mode_config_reset needs to move yet again). So do a proper backmerge. I wanted to wait with this for the 3.13 relaese, but alas let's just do this now. Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_ddi.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_pm.c Besides the conflict around the forcewake get/put (where we chaged the called function in -fixes and added a new parameter in -next) code all the current conflicts are of the adjacent lines changed type. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-16 22:06:30 +01:00
Chris Wilson	e910303802	drm/i915: Free requests after object release when retiring requests Freeing a request triggers the destruction of the context. This needs to occur after all objects are themselves unbound from the context, and so the free request needs to occur after the object release during retire. This tidies up commit `e20780439b` Author: Ben Widawsky <ben@bwidawsk.net> Date: Fri Dec 6 14:11:22 2013 -0800 drm/i915: Defer request freeing by simply swapping the order of operations rather than introducing further complexity - as noted during review. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-01-10 08:21:52 +01:00
Daniel Vetter	bfca05275a	Revert "drm/i915: Do not allow buffers at offset 0" This reverts commit `4fe9adbc36`. The patch completely lacks a detailed explanation of what exactly blows up and how, so is insufficiently justified as a band-aid. Otoh the justification as a safety measure against userspace botching up relocations is also fairly weak: If we want real project we need to at least make the gab big enough that the gpu doesn't scribble over more important stuff. With 4k screens that would be 32MB. Also I think this would be much better in conjunction with a (debug) switch to disable our use of the scratch page. Hence revert this. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 17:50:39 +01:00
Daniel Vetter	02f6bcccf7	drm/i915: Reject the pin ioctl on gen6+ Especially with ppgtt this kinda stopped making sense. And if we indeed need this to hack around an issue, we need something that also works for non-root. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:30:22 +01:00
Ben Widawsky	7e0d96bc03	drm/i915: Use multiple VMs -- the point of no return As with processes which run on the CPU, the goal of multiple VMs is to provide process isolation. Specific to GEN, there is also the ability to map more objects per process (2GB each instead of 2Gb-2k total). For the most part, all the pipes have been laid, and all we need to do is remove asserts and actually start changing address spaces with the context switch. Since prior to this we've converted the setting of the page tables to a streamed version, this is quite easy. One important thing to point out (since it'd been hotly contested) is that with this patch, every context created will have it's own address space (provided the HW can do it). v2: Disable BDW on rebase NOTE: I tried to make this commit as small as possible. I needed one place where I could "turn everything on" and that is here. It could be split into finer commits, but I didn't really see much point. Cc: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:24:52 +01:00
Daniel Vetter	3d7f0f9dcc	Merge commit drm-intel-fixes into topic/ppgtt I need the tricky do_switch fix before I can merge the final piece of the ppgtt enabling puzzle. Otherwise the conflict will be a real pain to resolve since the do_switch hunk from -fixes must be placed at the exact right place within a hunk in the next patch. Conflicts: drivers/gpu/drm/i915/i915_gem_context.c drivers/gpu/drm/i915/i915_gem_execbuffer.c drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:23:37 +01:00
Ben Widawsky	4fe9adbc36	drm/i915: Do not allow buffers at offset 0 This is primarily a band aid for an unexplainable error in gem_reloc_vs_gpu/forked-faulting-reloc-thrashing. Essentially as soon as a relocated buffer (which had a non-zero presumed offset) moved to offset 0, something goes bad. Since I have been unable to solve this, and potentially this is a good thing to do anyway, since many things can accidentally write to offset 0, why not? Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 16:15:40 +01:00
Ben Widawsky	e20780439b	drm/i915: Defer request freeing With context destruction, we always want to be able to tear down the underlying address space. This is invoked on the last unreference to the context which could happen before we've moved all objects to the inactive list. To enable a clean tear down the address space, make sure to process the request free lastly. Without this change, we cannot guarantee to we don't still have active objects in the VM. As an example of a failing case: CTX-A is created, count=1 CTX-A is used during execbuf does a context switch count = 2 and add_request count = 3 CTX B runs, switches, CTX-A count = 2 CTX-A is destroyed, count = 1 retire requests is called free_request from CTX-A, count = 0 <--- free context with active object As mentioned above, by doing the free request after processing the active list, we can avoid this case. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:51 +01:00
Ben Widawsky	41bde5535a	drm/i915: Get context early in execbuf We need to have the address space when reserving space for the objects. Since the address space and context are tied together, and reserve occurs before context switch (for good reason), we must lookup our context earlier in the process. This leaves some room for optimizations where we no longer need to use ctx_id in certain places. This will be addressed in a subsequent patch. Important tricky bit: Because slow relocations during execbuffer drop struct_mutex Perhaps it would be best to acquire the reference when we get the context, but I'll save that for another day (note I have written the patch before, and I found the changes required to be uglier than this). Note that since we currently access everything via context id, and not the data structure this is fine, though not desirable. The next change attempts to get the context only once via the context ID idr lookup, and as such, the following can happen: CTX-A is created, refcount = 1 CTX-A execbuf, mutex dropped close IOCTL called on CTX-A, refcount = 0 CTX-A resumes in execbuf. v2: Rebased on top of commit `b6359918b8` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Oct 30 15:44:16 2013 +0200 drm/i915: add i915_get_reset_stats_ioctl v3: Rebased on top of commit `25b3dfc87b` Author: Mika Westerberg <mika.westerberg@linux.intel.com> Date: Tue Nov 12 11:57:30 2013 +0200 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Nov 26 16:14:33 2013 +0200 drm/i915: check context reset stats before relocations Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:52:42 +01:00
Ben Widawsky	c482972a08	drm/i915: Piggy back hangstats off of contexts To simplify the codepaths somewhat, we can simply always create a context. Contexts already keep hangstat information. This prevents us from having to differentiate at other parts in the code. There is allocation overhead, but it should not be measurable. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:51:58 +01:00
Ben Widawsky	bdf4fd7ea0	drm/i915: Do aliasing PPGTT init with contexts We have a default context which suits the aliasing PPGTT well. Tie them together so it looks like any other context/PPGTT pair. This makes the code cleaner as it won't have to special case aliasing as often. The patch has one slightly tricky part in the default context creation function. In the future (and on aliased setup) we create a new VM for a context (potentially). However, if we have aliasing PPGTT, which occurs at this point in time for all platforms GEN6+, we can simply manage the refcounting to allow things to behave as normal. Now is a good time to recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT drm_mm. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:32:14 +01:00
Ben Widawsky	a3d67d2396	drm/i915: PPGTT vfuncs should take a ppgtt argument Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:56 +01:00
Ben Widawsky	2fa48d8d4a	drm/i915: Split context enabling from init We need to do this for exactly 1 reason, because we want to embed a PPGTT into the context, but we don't want to special case the default context. To achieve that, we must be able to initialize contexts after the GTT is setup (so we can allocate and pin the default context's BO), but before the PPGTT and rings are initialized. This is because, currently, context initialization requires ring usage. We don't have rings until after the GTT is setup. If we split the enabling part of context initialization, the part requiring the ringbuffer, we can untangle this, and then later embed the PPGTT Incidentally this allows us to also adhere to the original design of context init/fini in future patches: they were only ever meant to be called at driver load and unload. v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris) v3: BUG_ON after checking for disabled contexts. Or else it blows up pre gen6 (Ben) v4: Forward port Modified enable for each ring, since that patch is earlier in the series Dropped ring arg from create_default_context so it can be used by others Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:55 +01:00
Ben Widawsky	acce9ffa48	drm/i915: Better reset handling for contexts This patch adds to changes for contexts on reset: Sets last context to default - this will prevent the context switch happening after a reset. That switch is not possible because the rings are hung during reset and context switch requires reset. This behavior will need to be reworked in the future, but this is what we want for now. In the future, we'll also want to reset the guilty context to uninitialized. We should wait for ARB_Robustness related code to land for that. This is somewhat for paranoia. Because we really don't know what the GPU was doing when it hung, or the state it was in (mid context write, for example), later restoring the context is a bad idea. By setting the flag to not initialized, the next load of that context will not restore the state, and thus on the subsequent switch away from the context will overwrite the old data. NOTE: This code needs a fixup when we actually have multiple VMs. The issue that can occur is inactive objects in a VM will need to be destroyed before the last context unref. This can now happen via the fake switch introduced in this patch (and it other ways in the future) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:54 +01:00
Ben Widawsky	e422b888eb	drm/i915: Add a context open function We'll be doing a bit more stuff with each file, so having our own open function should make things clean. This also allows us to easily add conditionals for stuff we don't want to do when we don't have HW contexts. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:51 +01:00
Ben Widawsky	6f65e29aca	drm/i915: Create bind/unbind abstraction for VMAs To sum up what goes on here, we abstract the vma binding, similarly to the previous object binding. This helps for distinguishing legacy binding, versus modern binding. To keep the code churn as minimal as possible, I am leaving in insert_entries(). It serves as the per platform pte writing basically. bind_vma and insert_entries do share a lot of similarities, and I did have designs to combine the two, but as mentioned already... too much churn in an already massive patchset. What follows are the 3 commits which existed discretely in the original submissions. Upon rebasing on Broadwell support, it became clear that separation was not good, and only made for more error prone code. Below are the 3 commit messages with all their history. drm/i915: Add bind/unbind object functions to VMA drm/i915: Use the new vm [un]bind functions drm/i915: reduce vm->insert_entries() usage drm/i915: Add bind/unbind object functions to VMA As we plumb the code with more VM information, it has become more obvious that the easiest way to deal with bind and unbind is to simply put the function pointers in the vm, and let those choose the correct way to handle the page table updates. This change allows many places in the code to simply be vm->bind, and not have to worry about distinguishing PPGTT vs GGTT. Notice that this patch has no impact on functionality. I've decided to save the actual change until the next patch because I think it's easier to review that way. I'm happy to squash the two, or let Daniel do it on merge. v2: Make ggtt handle the quirky aliasing ppgtt Add flags to bind object to support above Don't ever call bind/unbind directly for PPGTT until we have real, full PPGTT (use NULLs to assert this) Make sure we rebind the ggtt if there already is a ggtt binding. This happens on set cache levels. Use VMA for bind/unbind (Daniel, Ben) v3: Reorganize ggtt_vma_bind to be more concise and easier to read (Ville). Change logic in unbind to only unbind ggtt when there is a global mapping, and to remove a redundant check if the aliasing ppgtt exists. v4: Make the bind function a bit smarter about the cache levels to avoid unnecessary multiple remaps. "I accept it is a wart, I think unifying the pin_vma / bind_vma could be unified later" (Chris) Removed the git notes, and put version info here. (Daniel) v5: Update the comment to not suck (Chris) v6: Move bind/unbind to the VMA. It makes more sense in the VMA structure (always has, but I was previously lazy). With this change, it will allow us to keep a distinct insert_entries. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: Use the new vm [un]bind functions Building on the last patch which created the new function pointers in the VM for bind/unbind, here we actually put those new function pointers to use. Split out as a separate patch to aid in review. I'm fine with squashing into the previous patch if people request it. v2: Updated to address the smart ggtt which can do aliasing as needed Make sure we bind to global gtt when mappable and fenceable. I thought we could get away without this initialy, but we cannot. v3: Make the global GTT binding explicitly use the ggtt VM for bind_vma(). While at it, use the new ggtt_vma helper (Chris) At this point the original mailing list thread diverges. ie. v4^: use target_obj instead of obj for gen6 relocate_entry vma->bind_vma() can be called safely during pin. So simply do that instead of the complicated conditionals. Don't restore PPGTT bound objects on resume path Bug fix in resume path for globally bound Bos Properly handle secure dispatch Rebased on vma bind/unbind conversion Signed-off-by: Ben Widawsky <ben@bwidawsk.net> drm/i915: reduce vm->insert_entries() usage FKA: drm/i915: eliminate vm->insert_entries() With bind/unbind function pointers in place, we no longer need insert_entries. We could, and want, to remove clear_range, however it's not totally easy at this point. Since it's used in a couple of place still that don't only deal in objects: setup, ppgtt init, and restore gtt mappings. v2: Don't actually remove insert_entries, just limit its usage. It will be useful when we introduce gen8. It will always be called from the vma bind/unbind. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:50 +01:00
Ben Widawsky	d7f46fc4e7	drm/i915: Make pin count per VMA Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:49 +01:00
Ben Widawsky	feb822cfc2	drm/i915: Handle inactivating objects for all VMAs This came from a patch called, "drm/i915: Move active to vma" When moving an object to the inactive list, we do it for all VMs for which the object is bound. The primary difference from that patch is this time around we don't not track 'active' per vma, but rather by object. Therefore, we only need one unref. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:46 +01:00
Ben Widawsky	c39538a88d	drm/i915: Takedown drm_mm on failed gtt setup This was found by code inspection. If the GTT setup fails then we are left without properly tearing down the drm_mm. Hopefully this never happens. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:45 +01:00
Ben Widawsky	6e164c3382	drm/i915: Allow ggtt lookups to not WARN To be able to effectively use the GGTT object lookup function, we don't want to warn when there is no GGTT mapping. Let the caller deal with it instead. Originally, I had intended to have this behavior, and has not introduced the WARN. It was introduced during review with the addition of the follow commit commit `5c2abbeab7` Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Tue Sep 24 09:57:57 2013 -0700 drm/i915: Provide a cheap ggtt vma lookup Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:45 +01:00
Ben Widawsky	6f425321e0	drm/i915: Don't unconditionally try to deref aliasing ppgtt Since the beginning, the functions which try to properly reference the aliasing PPGTT have deferences a potentially null aliasing_ppgtt member. Since the accessors are meant to be global, this will not do. Introduced originally in: commit `a70a3148b0` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 16:59:56 2013 -0700 drm/i915: Make proper functions for VMs Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-18 15:27:44 +01:00
Paulo Zanoni	48018a57a8	drm/i915: release the GTT mmaps when going into D3 So we'll get a fault when someone tries to access the mmap, then we'll wake up from D3. v2: - Rebase v3: - Use gtt active/inactive Testcase: igt/pm_pc8/gem-mmap-gtt Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Add comment + WARN as discussed with Paulo on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-14 15:35:52 +01:00
Mika Kuoppala	168c3f2151	drm/i915: dont call irq_put when irq test is on If test is running, irq_get was not called so we should gain balance by not doing irq_put "So the rule is: if you access unlocked values, you use ACCESS_ONCE(). You don't say "but it can't matter". Because you simply don't know." -- Linus v2: use local variable so it can't change during test (Chris) v3: update commit msg and use ACCESS_ONCE (Ville) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 22:58:44 +01:00
Mika Kuoppala	47e9766df0	drm/i915: Fix timeout with missed interrupts in __wait_seqno Commit `094f9a54e3` ("drm/i915: Fix __wait_seqno to use true infinite timeouts") added support for __wait_seqno to detect missing interrupts and go around them by polling. As there is also timeout detection in __wait_seqno, the polling and timeout detection were done with the same timer. When there has been missed interrupts and polling is needed, the timer is set to trigger in (now + 1) jiffies in future, instead of the caller specified timeout. Now when io_schedule() returns, we calculate the jiffies left to timeout using the timer expiration value. As the current jiffies is now bound to be always equal or greater than the expiration value, the timeout_jiffies will become zero or negative and we return -ETIME to caller even tho the timeout was never reached. Fix this by decoupling timeout calculation from timer expiration. v2: Commit message with some sense in it (Chris Wilson) v3: add parenthesis on timeout_expire calculation v4: don't read jiffies without timeout (Chris Wilson) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 15:27:21 +01:00
Chris Wilson	4db080f9e9	drm/i915: Fix erroneous dereference of batch_obj inside reset_status As the rings may be processed and their requests deallocated in a different order to the natural retirement during a reset, /* Whilst this request exists, batch_obj will be on the * active_list, and so will hold the active reference. Only when this * request is retired will the the batch_obj be moved onto the * inactive_list and lose its active reference. Hence we do not need * to explicitly hold another reference here. */ is violated, and the batch_obj may be dereferenced after it had been freed on another ring. This can be simply avoided by processing the status update prior to deallocating any requests. Fixes regression (a possible OOPS following a GPU hang) from commit `aa60c664e6` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Wed Jun 12 15:13:20 2013 +0300 drm/i915: find guilty batch buffer on ring resets Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: Add the code comment Chris supplied.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-12 10:49:05 +01:00
Paulo Zanoni	f65c916898	drm/i915: add runtime put/get calls at the basic places If I add code to enable runtime PM on my Haswell machine, start a desktop environment, then enable runtime PM, these functions will complain that they're trying to read/write registers while the graphics card is suspended. v2: - Simplify i915_gem_fault changes. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [danvet: Drop the hunk in i915_hangcheck_elapsed, it's the wrong thing to do.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-10 22:47:33 +01:00
Chris Wilson	70903c3ba8	drm/i915: Fix ordering of unbind vs unpin pages It is useful to assert that if the object is bound, then it must have its pages pinned to prevent the shrinker from reaping its backing store. This is even more useful with the introduction of real-ppgtt whereupon we may have the object bound into several vma, with each instance pinning the backing store. This assertion breaks down during unbind where we unpinned the backing store before decoupling the vma binding. This can be fixed with a trivial reording of the unbind sequence, which reinforces the pin pages bind to vma ... unbind from vma unpin pages concept. v2: Bonus comment Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-12-04 12:10:50 +01:00
Ville Syrjälä	0bf2134780	drm/i915: MI_PREDICATE_RESULT_2 is HSW only The MI_PREDICATE_RESULT_2 register exits only on HSW. On other platforms the same offset is either reserved, or contains some other register. So write the register only on HSW. This regression has been introduced in commit `9435373ef8` Author: Rodrigo Vivi <rodrigo.vivi@gmail.com> Date: Wed Aug 28 16:45:46 2013 -0300 drm/i915: Report enabled slices on Haswell GT3 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Add regression notice.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-29 15:00:03 +01:00
Daniel Vetter	c09cd6e969	Merge branch 'backlight-rework' into drm-intel-next-queued Pull in Jani's backlight rework branch. This was merged through a separate branch to be able to sort out the Broadwell conflicts properly before pulling it into the main development branch. Conflicts: drivers/gpu/drm/i915/intel_display.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-15 10:02:39 +01:00
Ben Widawsky	31a5336e1c	drm/i915/bdw: Swizzling support Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:37 +01:00
Ben Widawsky	5ab31333ac	drm/i915/bdw: Fences on gen8 look just like gen7 Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-08 18:09:36 +01:00
Ben Widawsky	8245be3139	drm/i915: Require HW contexts (when possible) v2: Fixed the botched locking on init_hw failure in i915_reset (Ville) Call cleanup_ringbuffer on failed context create in init_hw (Ville) v3: Add dev argument ti clean_ringbuffer Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-11-07 09:35:44 +01:00
Paulo Zanoni	de45eaf7b9	drm/i915: fix open-coded DIV_ROUND_UP Use the nice Kernel macro, it makes the code much more readable. Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-21 10:04:03 +02:00
Daniel Vetter	aa5f802181	drm/i915: Use unsigned long for obj->user_pin_count At least on linux sizeof(long) == sizeof(void*) and the thinking is that you can grab about as many references as there's memory. Doesn't really matter, just a bit of OCD since the fixed size data type in a pure in-kernel datastructure look off. v2: Ville asked for an overflow check since no one prevents userspace from incrementing the pin count forever. v3: s/INT/LONG/, noticed by Chris. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 22:06:39 +02:00
Chris Wilson	45c5f2022c	drm/i915: Disable all GEM timers and work on unload We have two once very similar functions, i915_gpu_idle() and i915_gem_idle(). The former is used as the lower level operation to flush work on the GPU, whereas the latter is the high level interface to flush the GEM bookkeeping in addition to flushing the GPU. As such i915_gem_idle() also clears out the request and activity lists and cancels the delayed work. This is what we need for unloading the driver, unfortunately we called i915_gpu_idle() instead. In the process, make sure that when cancelling the delayed work and timer, which is synchronous, that we do not hold any locks to prevent a deadlock if the work item is already waiting upon the mutex. This requires us to push the mutex down from the caller to i915_gem_idle(). v2: s/i915_gem_idle/i915_gem_suspend/ Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70334 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: xunx.fang@intel.com [danvet: Only set ums.suspended for !kms as discussed earlier. Chris noticed that this slipped through.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 19:42:14 +02:00
Ben Widawsky	3d57e5bd12	drm/i915: Do a fuller init after reset I had this lying around from he original PPGTT series, and thought we might try to get it in by itself. It's convenient to just call i915_gem_init_hw at reset because we'll be adding new things to that function, and having just one function to call instead of reimplementing it in two places is nice. In order to accommodate we cleanup ringbuffers in order to bring them back up cleanly. Optionally, we could also teardown/re initialize the default context but this was causing some problems on reset which I wasn't able to fully debug, and is unnecessary with the previous context init/enable split. This essentially reverts: commit `8e88a2bd59` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jun 19 18:40:00 2012 +0200 drm/i915: don't call modeset_init_hw in i915_reset It seems to work for me on ILK now. Perhaps it's due to: commit `8a5c2ae753` Author: Jesse Barnes <jbarnes@virtuousgeek.org> Date: Thu Mar 28 13:57:19 2013 -0700 drm/i915: fix ILK GPU reset for render Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-16 11:08:08 +02:00
Daniel Vetter	3bbbe706e8	drm/i915: check that the i965g/gm 4G limit is really obeyed In truly crazy circumstances shmem might give us the wrong type of page. So be a bit paranoid and double check this. Reviewer: Damien Lespiau <damien.lespiau@intel.com> Cc: Rob Clark <robdclark@gmail.com> References: http://lkml.org/lkml/2011/7/11/238 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:47:05 +02:00
Chris Wilson	d9973b4356	drm/i915: Fix type mismatch and accounting in i915_gem_shrink The interface uses an unsigned long, and we can use the unsigned counter throughout our code, so do so. In the process, we notice one instance where the shrink count is based on a heuristic rather than the result, and another where we ask for too many pages to be purged. v2: nr_to_scan needs to be promoted to a long as well, so just use sc->nr_to_scan directly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:46:48 +02:00
Chris Wilson	5035c275af	drm/i915: Call io_schedule() whilst whilsting for the GPU Since we are waiting upon IO completion, inform the kernel through use of the io_schedule() call rather than the regular schedule(). This should allow the kernel to make better decisions regarding scheduling and power management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:46:47 +02:00
Daniel Vetter	967ad7f148	Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next The conflict in intel_drv.h tripped me up a bit since a patch in dinq moves all the functions around, but another one in drm-next removes a single function. So I'ev figured backing this into a backmerge would be good. i915_dma.c is just adjacent lines changed, nothing nefarious there. Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/intel_drv.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-10 12:44:43 +02:00
David Herrmann	16eb5f4379	drm: kill ->gem_init_object() and friends All drivers embed gem-objects into their own buffer objects. There is no reason to keep drm_gem_object_alloc(), gem->driver_private and ->gem_init_object() anymore. New drivers are highly encouraged to do the same. There is no benefit in allocating gem-objects separately. Cc: Dave Airlie <airlied@gmail.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Inki Dae <inki.dae@samsung.com> Cc: Ben Skeggs <skeggsb@gmail.com> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-10-09 14:38:02 +10:00
Chris Wilson	b29c19b645	drm/i915: Boost RPS frequency for CPU stalls If we encounter a situation where the CPU blocks waiting for results from the GPU, give the GPU a kick to boost its the frequency. This should work to reduce user interface stalls and to quickly promote mesa to high frequencies - but the cost is that our requested frequency stalls high (as we do not idle for long enough before rc6 to start reducing frequencies, nor are we aggressive at down clocking an underused GPU). However, this should be mitigated by rc6 itself powering off the GPU when idle, and that energy use is dependent upon the workload of the GPU in addition to its frequency (e.g. the math or sampler functions only consume power when used). Still, this is likely to adversely affect light workloads. In particular, this nearly eliminates the highly noticeable wake-up lag in animations from idle. For example, expose or workspace transitions. (However, given the situation where we fail to downclock, our requested frequency is almost always the maximum, except for Baytrail where we manually downclock upon idling. This often masks the latency of upclocking after being idle, so animations are typically smooth - at the cost of increased power consumption.) Stéphane raised the concern that this will punish good applications and reward bad applications - but due to the nature of how mesa performs its client throttling, I believe all mesa applications will be roughly equally affected. To address this concern, and to prevent applications like compositors from permanently boosting the RPS state, we ratelimit the frequency of the wait-boosts each client recieves. Unfortunately, this techinique is ineffective with Ironlake - which also has dynamic render power states and suffers just as dramatically. For Ironlake, the thermal/power headroom is shared with the CPU through Intelligent Power Sharing and the intel-ips module. This leaves us with no GPU boost frequencies available when coming out of idle, and due to hardware limitations we cannot change the arbitration between the CPU and GPU quickly enough to be effective. v2: Limit each client to receiving a single boost for each active period. Tested by QA to only marginally increase power, and to demonstrably increase throughput in games. No latency measurements yet. v3: Cater for front-buffer rendering with manual throttling. v4: Tidy up. v5: Sadly the compositor needs frequent boosts as it may never idle, but due to its picking mechanism (using ReadPixels) may require frequent waits. Those waits, along with the waits for the vrefresh swap, conspire to keep the GPU at low frequencies despite the interactive latency. To overcome this we ditch the one-boost-per-active-period and just ratelimit the number of wait-boosts each client can receive. Reported-and-tested-by: Paul Neumann <paul104x@yahoo.de> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68716 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Stéphane Marchesin <stephane.marchesin@gmail.com> Cc: Owen Taylor <otaylor@redhat.com> Cc: "Meng, Mengmeng" <mengmeng.meng@intel.com> Cc: "Zhuang, Lena" <lena.zhuang@intel.com> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> [danvet: No extern for function prototypes in headers.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:31 +02:00
Chris Wilson	094f9a54e3	drm/i915: Fix __wait_seqno to use true infinite timeouts When we switched to always using a timeout in conjunction with wait_seqno, we lost the ability to detect missed interrupts. Since, we have had issues with interrupts on a number of generations, and they are required to be delivered in a timely fashion for a smooth UX, it is important that we do log errors found in the wild and prevent the display stalling for upwards of 1s every time the seqno interrupt is missed. Rather than continue to fix up the timeouts to work around the interface impedence in wait_event_*(), open code the combination of wait_event[_interruptible][_timeout], and use the exposed timer to poll for seqno should we detect a lost interrupt. v2: In order to satisfy the debug requirement of logging missed interrupts with the real world requirments of making machines work even if interrupts are hosed, we revert to polling after detecting a missed interrupt. v3: Throw in a debugfs interface to simulate broken hw not reporting interrupts. v4: s/EGAIN/EAGAIN/ (Imre) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Don't use the struct typedef in new code.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-03 20:01:30 +02:00
Chris Wilson	b52b89da09	drm/i915: Add a tracepoint for using a semaphore So that we can find the callers who introduce a ring stall. A single ring stall is not too unwelcome, the right issue becomes when they start to interlock and prevent any concurrent work. That, however, is a little tricker to detect with a mere tracepoint! v2: Rebrand it as a ring event, rather than an object event. v3: Include the seqno in the tracepoint for posterity or something. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:24 +02:00
Ben Widawsky	e2d05a8b1e	drm/i915: Convert active API to VMA Even though we track object activity and not VMA, because we have the active_list be based on the VM, it makes the most sense to use VMAs in the APIs. NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for now, leave them be. v2: Remove leftover hunk from the previous patch which didn't keep i915_gem_object_move_to_active. That patch had to rely on the ring to get the dev instead of the obj. (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Ben Widawsky	5c2abbeab7	drm/i915: Provide a cheap ggtt vma lookup "We do fairly often lookup the ggtt vma for an obj." - Chris Wilson. As such, provide a function to offer slightly cheaper access to the vma. Not performance tested. By my quick estimation it saves at least 3 pointer dereferences from the existing mechanism. This patch mostly matches code from Chris in <20130911221430.GB7825@nuc-i3427.alporthouse.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:21 +02:00
Chris Wilson	f740334775	drm/i915: Do not unlock upon error in i915_gem_idle() We never took the lock ourselves and all callers expect the struct_mutex to be locked upon return (be it success or error), thereore dropping the lock along the error paths looks to be a vestigial error from commit `db1b76ca6a` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jul 9 16:51:37 2013 +0200 drm/i915: don't frob mm.suspended when not using ums Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:03 +02:00
Daniel Vetter	b14c5679dd	drm/i915: use pointer = k[cmz...]alloc(sizeof(*pointer), ...) pattern Done while reviewing all our allocations for fubar. Also a few errant cases of lacking () for the sizeof operator - just a bit of OCD. I've left out all the conversions that also should use kcalloc from this patch (it's only 2). Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-10-01 07:45:01 +02:00
Dave Airlie	4821ff14a3	Merge tag 'drm-intel-next-2013-09-21-merged' of git://people.freedesktop.org/~danvet/drm-intel into drm-next drm-intel-next-2013-09-21: - clock state handling rework from Ville - l3 parity handling fixes for hsw from Ben - some more watermark improvements from Ville - ban badly behaved context from Mika - a few vlv improvements from Jesse - VGA power domain handling from Ville drm-intel-next-2013-09-06: - Basic mipi dsi support from Jani. Not yet converted over to drm_bridge since that was too fresh, but the porting is in progress already. - More vma patches from Ben, this time the code to convert the execbuffer code. Now that the shrinker recursion bug is tracked down we can move ahead here again. Yay! - Optimize hw context switching to not generate needless interrupts (Chris Wilson). Also some shuffling for the oustanding request allocation. - Opregion support for SWSCI, although not yet fully wired up (we need a bit of runtime D3 support for that apparently, due to Windows design deficiencies), from Jani Nikula. - A few smaller changes all over. [airlied: merge conflict fix in i9xx_set_pipeconf] * tag 'drm-intel-next-2013-09-21-merged' of git://people.freedesktop.org/~danvet/drm-intel: (119 commits) drm/i915: assume all GM45 Acer laptops use inverted backlight PWM drm/i915: cleanup a min_t() cast drm/i915: Pull intel_init_power_well() out of intel_modeset_init_hw() drm/i915: Add POWER_DOMAIN_VGA drm/i915: Refactor power well refcount inc/dec operations drm/i915: Add intel_display_power_{get, put} to request power for specific domains drm/i915: Change i915_request power well handling drm/i915: POSTING_READ IPS_CTL before waiting for the vblank drm/i915: don't disable ERR_INT on the IRQ handler drm/i915/vlv: disable rc6p and rc6pp residency reporting on BYT drm/i915/vlv: honor i915_enable_rc6 boot param on VLV drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF drm/i915: Do remaps for all contexts drm/i915: Keep a list of all contexts drm/i915: Make l3 remapping use the ring drm/i915: Add second slice l3 remapping drm/i915: Fix HSW parity test drm/i915: dump crtc timings from the pipe config drm/i915: register backlight device also when backlight class is a module drm/i915: write D_COMP using the mailbox ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-10-01 10:00:50 +10:00
Daniel Vetter	d32270460f	drm/i915: Fix up usage of SHRINK_STOP In commit `81e49f8114` Author: Glauber Costa <glommer@openvz.org> Date: Wed Aug 28 10:18:13 2013 +1000 i915: bail out earlier when shrinker cannot acquire mutex SHRINK_STOP was added to tell the core shrinker code to bail out and go to the next shrinker since the i915 shrinker couldn't acquire required locks. But the SHRINK_STOP return code was added to the ->count_objects callback and not the ->scan_objects callback as it should have been, resulting in tons of dmesg noise like shrink_slab: i915_gem_inactive_scan+0x0/0x9c negative objects to delete nr=-xxxxxxxxx Fix discusssed with Dave Chinner. References: http://www.spinics.net/lists/intel-gfx/msg33597.html Reported-by: Knut Petersen <Knut_Petersen@t-online.de> Cc: Knut Petersen <Knut_Petersen@t-online.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Glauber Costa <glommer@openvz.org> Cc: Glauber Costa <glommer@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Acked-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-26 00:31:51 +02:00
Daniel Vetter	b599c89e8c	Linux 3.12-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQEcBAABAgAGBQJSQMORAAoJEHm+PkMAQRiGj14H/1bjhtfNjPdX7MVQAzA+WpwX s7h1IQu2Si9S5S1lBiM2sBTOssVcmfheO9x4yqm7JNOD1RnssWKOM3q+zVOLstwd GD3gluJPeraD5EyYSqEJ9ILPQ3gbxb4wOlT0Z291TW6E8XhLRr0RTOJPksRsgvLH Ckm9uJh6ArS6ZXfXiaDQfd+xHAQJkUfW6nMSA0g9ZO9C6KIDRvcbUmrY3m4HhfIk mK0TXCBs+AXGDIjTEB8JgIQL/5y1Qn0c4R+2uTU/4YWwyLvJTV1e44kGoleukMMT 6Pw/TNlUEN161dbSaqCyF3sfXHDYQ5valycI2PDgitMtPSxbzsU1VDizS8+daRg= =lEmF -----END PGP SIGNATURE----- Merge tag 'v3.12-rc2' into drm-intel-next Backmerge Linux 3.12-rc2 to prep for a bunch of -next patches: - Header cleanup in intel_drv.h, both changed in -fixes and my current -next pile. - Cursor handling cleanup for -next which depends upon the cursor handling fix merged into -rc2. All just trivial conflicts of the "changed adjacent lines" type: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_drv.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-24 09:32:53 +02:00
Linus Torvalds	d8524ae9d6	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: - some small fixes for msm and exynos - a regression revert affecting nouveau users with old userspace - intel pageflip deadlock and gpu hang fixes, hsw modesetting hangs * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (22 commits) Revert "drm: mark context support as a legacy subsystem" drm/i915: Don't enable the cursor on a disable pipe drm/i915: do not update cursor in crtc mode set drm/exynos: fix return value check in lowlevel_buffer_allocate() drm/exynos: Fix address space warnings in exynos_drm_fbdev.c drm/exynos: Fix address space warning in exynos_drm_buf.c drm/exynos: Remove redundant OF dependency drm/msm: drop unnecessary set_need_resched() drm/i915: kill set_need_resched drm/msm: fix potential NULL pointer dereference drm/i915/dvo: set crtc timings again for panel fixed modes drm/i915/sdvo: Robustify the dtd<->drm_mode conversions drm/msm: workaround for missing irq drm/msm: return -EBUSY if bo still active drm/msm: fix return value check in ERR_PTR() drm/msm: fix cmdstream size check drm/msm: hangcheck harder drm/msm: handle read vs write fences drm/i915/sdvo: Fully translate sync flags in the dtd->mode conversion drm/i915: Use proper print format for debug prints ...	2013-09-22 19:51:49 -07:00
Ben Widawsky	040d2baa62	drm/i915: s/HAS_L3_GPU_CACHE/HAS_L3_DPF We'd only ever used this define to denote whether or not we have the dynamic parity feature (DPF) and never to determine whether or not L3 exists. Baytrail is a good example of where L3 exists, and not DPF. This patch provides clarify in the code for future use cases which might want to actually query whether or not L3 exists. v2: Add /* DPF == dynamic parity feature */ Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:41:00 +02:00
Ben Widawsky	a33afea5ff	drm/i915: Keep a list of all contexts I have implemented this patch before without creating a separate list (I'm having trouble finding the links, but the messages ids are: <1364942743-6041-2-git-send-email-ben@bwidawsk.net> <1365118914-15753-9-git-send-email-ben@bwidawsk.net>) However, the code is much simpler to just use a list and it makes the code from the next patch a lot more pretty. As you'll see in the next patch, the reason for this is to be able to specify when a context needs to get L3 remapping. More details there. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:39:43 +02:00
Ben Widawsky	c3787e2eac	drm/i915: Make l3 remapping use the ring Using LRI for setting the remapping registers allows us to stream l3 remapping information. This is necessary to handle per context remaps as we'll see implemented in an upcoming patch. Using the ring also means we don't need to frob the DOP clock gating bits. v2: Add comment about lack of worry for concurrent register access (Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> [danvet: Bikeshed the comment a bit by doing a s/XXX/Note - there's nothing to fix.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:38:00 +02:00
Ben Widawsky	35a85ac606	drm/i915: Add second slice l3 remapping Certain HSW SKUs have a second bank of L3. This L3 remapping has a separate register set, and interrupt from the first "slice". A slice is simply a term to define some subset of the GPU's l3 cache. This patch implements both the interrupt handler, and ability to communicate with userspace about this second slice. v2: Remove redundant check about non-existent slice. Change warning about interrupts of unknown slices to WARN_ON_ONCE Handle the case where we get 2 slice interrupts concurrently, and switch the tracking of interrupts to be non-destructive (all Ville) Don't enable/mask the second slice parity interrupt for ivb/vlv (even though all docs I can find claim it's rsvd) (Ville + Bryan) Keep BYT excluded from L3 parity v3: Fix the slice = ffs to be decremented by one (found by Ville). When I initially did my testing on the series, I was using 1-based slice counting, so this code was correct. Not sure why my simpler tests that I've been running since then didn't pick it up sooner. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-19 20:37:04 +02:00
Linus Torvalds	26935fb06e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile 4 from Al Viro: "list_lru pile, mostly" This came out of Andrew's pile, Al ended up doing the merge work so that Andrew didn't have to. Additionally, a few fixes. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (42 commits) super: fix for destroy lrus list_lru: dynamically adjust node arrays shrinker: Kill old ->shrink API. shrinker: convert remaining shrinkers to count/scan API staging/lustre/libcfs: cleanup linux-mem.h staging/lustre/ptlrpc: convert to new shrinker API staging/lustre/obdclass: convert lu_object shrinker to count/scan API staging/lustre/ldlm: convert to shrinkers to count/scan API hugepage: convert huge zero page shrinker to new shrinker API i915: bail out earlier when shrinker cannot acquire mutex drivers: convert shrinkers to new count/scan API fs: convert fs shrinkers to new scan/count API xfs: fix dquot isolation hang xfs-convert-dquot-cache-lru-to-list_lru-fix xfs: convert dquot cache lru to list_lru xfs: rework buffer dispose list tracking xfs-convert-buftarg-lru-to-generic-code-fix xfs: convert buftarg LRU to generic code fs: convert inode and dentry shrinking to be node aware vmscan: per-node deferred work ...	2013-09-12 15:01:38 -07:00
Daniel Vetter	571c608d06	drm/i915: kill set_need_resched This is just a remnant from the old days when our reset handling was horribly racy, suffered from terribly locking issues and often happily live-locked. Those days are now gone so we can drop the hacks and just rip the reschedule-point out. Reported-by: Peter Zijlstra <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-12 22:40:36 +02:00
Ben Widawsky	23f5448398	drm/i915: Synchronize pread/pwrite with wait_rendering lifted from Daniel: pread/pwrite isn't about the object's domain at all, but purely about synchronizing for outstanding rendering. Replacing the call to set_to_gtt_domain with a wait_rendering would imo improve code readability. Furthermore we could pimp pread to only block for outstanding writes and not for reads. Since you're not the first one to trip over this: Can I volunteer you for a follow-up patch to fix this? v2: Switch the pwrite patch to use \!read_only. This was a typo in the original code. (Chris, Daniel) Recommended-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Fix up the logic fumble - wait_rendering has a bool readonly paramater, set_to_gtt_domain otoh has bool write. Breakage reported by Jani Nikula, I've double-checked that igt/gem_concurrent_blt/prw-* would have caught this.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-12 21:56:52 +02:00
Glauber Costa	81e49f8114	i915: bail out earlier when shrinker cannot acquire mutex The main shrinker driver will keep trying for a while to free objects if the returned value from the shrink scan procedure is 0. That means "no objects now", but a retry could very well succeed. But what we should say here is a different thing: that it is impossible to shrink, and we would better bail out soon. We find this behavior more appropriate for the case where the lock cannot be taken. Specially given the hammer behavior of the i915: if another thread is already shrinking, we are likely not to be able to shrink anything anyway when we finally acquire the mutex. Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Chinner <dchinner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Kent Overstreet <koverstreet@google.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-10 18:56:32 -04:00
Dave Chinner	7dc19d5aff	drivers: convert shrinkers to new count/scan API Convert the driver shrinkers to the new API. Most changes are compile tested only because I either don't have the hardware or it's staging stuff. FWIW, the md and android code is pretty good, but the rest of it makes me want to claw my eyes out. The amount of broken code I just encountered is mind boggling. I've added comments explaining what is broken, but I fear that some of the code would be best dealt with by being dragged behind the bike shed, burying in mud up to it's neck and then run over repeatedly with a blunt lawn mower. Special mention goes to the zcache/zcache2 drivers. They can't co-exist in the build at the same time, they are under different menu options in menuconfig, they only show up when you've got the right set of mm subsystem options configured and so even compile testing is an exercise in pulling teeth. And that doesn't even take into account the horrible, broken code... [glommer@openvz.org: fixes for i915, android lowmem, zcache, bcache] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kent Overstreet <koverstreet@google.com> Cc: John Stultz <john.stultz@linaro.org> Cc: David Rientjes <rientjes@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-09-10 18:56:32 -04:00
Chris Wilson	5a1d5eb020	drm/i915: Remove the double-list iteration from bound_any() The purpose of the function is to find out whether the object is still bound in any address space. This can be easily checked by looking at the vma currently associated with the object, rather than asking if any of the global address spaces have an active vma on the object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-10 16:14:06 +02:00
Mika Kuoppala	be62acb4cc	drm/i915: ban badly behaving contexts Now when we have mechanism in place to track which context was guilty of hanging the gpu, it is possible to punish for bad behaviour. If context has recently submitted a faulty batchbuffers guilty of gpu hang and submits another batch which hangs gpu in quick succession, ban it permanently. If ctx is banned, no more batchbuffers will be queued for execution. There is no need for global wedge machinery anymore and it would be unwise to wedge the whole gpu if we have multiple hanging batches queued for execution. Instead just ban the guilty ones and carry on. v2: Store guilty ban status bool in gpu_error instead of pointers that might become danling before hang is declared. v3: Use return value for banned status instead of stashing state into gpu_error (Chris Wilson) v4: - rebase on top of fixed hang stats api - add define for ban period - rename commit and improve commit msg v5: - rely context banning instead of wedging the gpu - beautification and fix for ban calculation (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-06 17:55:50 +02:00
Chris Wilson	57094f8246	drm/i915: Hold an object reference whilst we shrink it Whilst running the shrinker, we need to hold a reference as we unbind the objects, or else we may end up waiting for and retiring requests, which in turn may result in this object being freed. This is very similar to the eviction code which also has to be very careful to keep a reference to its objects as it retires and unbinds them. Another similarity, that Ben pointed out, is that as we may call retire-requests, the unbound_list is outside of our control. We must only process a single element of that list at a time, that is we can not rely on the "safe" next pointer being valid after a call to i915_vma_unbind(). BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915] PGD 758d3067 PUD ac0d6067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: dm_mod snd_hda_codec_realtek iTCO_wdt iTCO_vendor_support pcspkr snd_hda_intel i2c_i801 snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd lpc_ich mfd_core soundcore battery ac option usb_wwan usbserial uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev i915 video button drm_kms_helper drm acpi_cpufreq mperf freq_table CPU: 1 PID: 16835 Comm: fbo-maxsize Not tainted 3.11.0-rc7_nightlytop_8fdad4_20130902_+ #7977 task: ffff8800712106d0 ti: ffff880028e4a000 task.ti: ffff880028e4a000 RIP: 0010:[<ffffffffa0082892>] [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915] RSP: 0018:ffff880028e4b9e8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880145734000 RCX: ffff880145735328 RDX: ffff8801457353fc RSI: 0000000000000000 RDI: ffff88007597cc00 RBP: ffff88007597cc00 R08: 0000000000000001 R09: ffff88014f257f00 R10: ffffea0001d65f00 R11: 0000000000bba60b R12: ffff880149e5b000 R13: ffff880145734001 R14: ffff88007597ccc8 R15: ffff88007597cc00 FS: 00007ff5bc919740(0000) GS:ffff88014f240000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 0000000028f4c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: 0000000000000000 ffff88007597cc00 ffff8801440d6840 0000000000000000 ffff880145734000 ffffffffa007c854 0000000000000010 ffff88007597c900 0000000000018000 00000000004a1201 ffff88007597cc60 ffffffffa007d183 Call Trace: [<ffffffffa007c854>] ? i915_vma_unbind+0xe2/0x1d1 [i915] [<ffffffffa007d183>] ? __i915_gem_shrink+0xf1/0x162 [i915] [<ffffffffa007d2ee>] ? i915_gem_object_get_pages_gtt+0xfa/0x303 [i915] [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915] [<ffffffffa007cbda>] ? i915_gem_object_pin+0x238/0x5ce [i915] [<ffffffff812cba5f>] ? __sg_page_iter_next+0x2b/0x58 [<ffffffffa0082056>] ? gen6_ppgtt_insert_entries+0xf2/0x114 [i915] [<ffffffffa007fe4b>] ? i915_gem_execbuffer_reserve_vma.isra.13+0x79/0x18d [i915] [<ffffffffa008017c>] ? i915_gem_execbuffer_reserve+0x21d/0x347 [i915] [<ffffffffa0080bfb>] ? i915_gem_do_execbuffer.isra.17+0x4f3/0xe61 [i915] [<ffffffffa00795f4>] ? i915_gem_object_get_pages+0x54/0x89 [i915] [<ffffffffa007e405>] ? i915_gem_pwrite_ioctl+0x743/0x7a5 [i915] [<ffffffffa0081a46>] ? i915_gem_execbuffer2+0x15e/0x1e4 [i915] [<ffffffffa000e20d>] ? drm_ioctl+0x2a5/0x3c4 [drm] [<ffffffffa00818e8>] ? i915_gem_execbuffer+0x37f/0x37f [i915] [<ffffffff816f64c0>] ? __do_page_fault+0x3ab/0x449 [<ffffffff810be3da>] ? do_mmap_pgoff+0x2b2/0x341 [<ffffffff810e49be>] ? vfs_ioctl+0x1e/0x31 [<ffffffff810e5194>] ? do_vfs_ioctl+0x3ad/0x3ef [<ffffffff810e5224>] ? SyS_ioctl+0x4e/0x7e [<ffffffff816f88d2>] ? system_call_fastpath+0x16/0x1b Code: 52 0c a0 48 c7 c6 22 30 0d a0 31 c0 e8 ef 00 f9 ff bf c6 a7 00 00 e8 90 5d 24 e1 f6 85 13 01 00 00 10 75 44 48 8b 85 18 01 00 00 <8b> 50 08 48 8b 30 49 8b 84 24 88 02 00 00 48 89 c7 48 81 c7 98 RIP [<ffffffffa0082892>] i915_gem_gtt_finish_object+0x68/0xbd [i915] RSP <ffff880028e4b9e8> CR2: 0000000000000008 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org [danvet: Bikeshed the comments a bit as discussed with Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 14:47:59 +02:00
Chris Wilson	3c0e234c84	drm/i915; Preallocate the lazy request It is possible for us to be forced to perform an allocation for the lazy request whilst running the shrinker. This allocation may fail, leaving us unable to reclaim any memory leading to premature OOM. A neat solution to the problem is to preallocate the request at the same time as acquiring the seqno for the ring transaction. This means that we can report ENOMEM prior to touching the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:53 +02:00
Chris Wilson	1823521d2b	drm/i915: Rename ring->outstanding_lazy_request Prior to preallocating an request for lazy emission, rename the existing field to make way (and differentiate the seqno from the request struct). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-05 12:03:12 +02:00
Chris Wilson	9a7e0c2a1b	drm/i915: Rearrange the comments in i915_add_request() The comments were a little out-of-sequence with the code, forcing the reader to jump around whilst reading. Whilst moving the comments around, add one to explain the context reference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:54 +02:00
Chris Wilson	c0321e2c5a	drm/i915: Do not add an interrupt for a context switch We use the request to ensure we hold a reference to the context for the duration that it remains in use by the ring. Each request only holds a reference to the current context, hence we emit a request after switching contexts with the final reference to the old context. However, the extra interrupt caused by that request is not useful (no timing critical function will wait for the context object), instead the overhead of servicing the IRQ shows up in some (lightweight) benchmarks. In order to keep the useful property of using the request to manage the context lifetime, we want to add a dummy request that is associated with the interrupt from the subsequent real request following the batch. The extra interrupt was added as a side-effect of using i915_add_request() in commit `112522f678` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu May 2 16:48:07 2013 +0300 drm/i915: put context upon switching v2: Daniel convinced me that the request here was solely for context lifetime tracking and that we have the active ref to keep the object alive whilst the MI_SET_CONTEXT. So the only concern then is which context should get the blame for MI_SET_CONTEXT failing. The old scheme added a request for the old context so that any hang upto and including the switch away would mark the old context as guilty. Now any hang here implicates the new context. However since we have already gone through a complete flush with the last context in its last request, and all that lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we should be safe in not unduly placing blame on the new context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:53 +02:00
Daniel Vetter	0ff501cbb5	drm/i915: Fix list corruption in vma_unbind The saga around the breadcrumb vmas used by execbuf continues ... This time around we've managed to unconditionally move the object to the unbound list on the last vma unbind even though it might never have been on either the bound or unbound list. Hilarity ensued. Chris Wilson tracked this one down but compared to his patches I've simply opted to completely separate the unbound case for not-yet bound vmas. Otherwise we imo end up with semantically hard to parse checks around the list_move_tail(global_list, ...). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68462 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:52 +02:00
Rodrigo Vivi	9435373ef8	drm/i915: Report enabled slices on Haswell GT3 Batchbuffers constructed by userspace can conditionalise their URB allocations through the use of the MI_SET_PREDICATE command. This command can read the MI_PREDICATE_RESULT_2 register to see how many slices are enabled on GT3, and by virtue of the result, scale their memory allocations to fit enabled memory. Of course, this only works if the kernel sets the appropriate bit in the register first. v2: Better commit subject and message by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Credits-to: Yejun Guo <yejun.guo@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:51 +02:00
Daniel Vetter	b93dab6e9d	drm/i915: More vma fixups around unbind/destroy The important bugfix here is that we must not unlink the vma when we keep it around as a placeholder for the execbuf code. Since then we won't find it again when execbuf gets interrupt and restarted and create a 2nd vma. And since the code as-is isn't fit yet to deal with more than one vma, hilarity ensues. Specifically the dma map/unmap of the sg table isn't adjusted for multiple vmas yet and will blow up like this: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] PGD 56bb5067 PUD ad3dd067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: tcp_lp ppdev parport_pc lp parport ipv6 dm_mod dcdbas snd_hda_codec_hdmi pcspkr snd_hda_codec_realtek serio_raw i2c_i801 iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_codec lpc_ich snd_hwdep mfd_core snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915 video button drm_kms_helper drm mperf freq_table CPU: 1 PID: 16650 Comm: fbo-maxsize Not tainted 3.11.0-rc4_nightlytop_d93f59_debug_20130814_+ #6957 Hardware name: Dell Inc. OptiPlex 9010/03JR84, BIOS A01 05/04/2012 task: ffff8800563b3f00 ti: ffff88004bdf4000 task.ti: ffff88004bdf4000 RIP: 0010:[<ffffffffa008fb37>] [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP: 0018:ffff88004bdf5958 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8801135e0000 RCX: ffff8800ad3bf8e0 RDX: ffff8800ad3bf8e0 RSI: 0000000000000000 RDI: ffff8801007ee780 RBP: ffff88004bdf5978 R08: ffff8800ad3bf8e0 R09: 0000000000000000 R10: ffffffff86ca1810 R11: ffff880036a17101 R12: ffff8801007ee780 R13: 0000000000018001 R14: ffff880118c4e000 R15: ffff8801007ee780 FS: 00007f401a0ce740(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000005635c000 CR4: 00000000001407e0 Stack: ffff8801007ee780 ffff88005c253180 0000000000018000 ffff8801135e0000 ffff88004bdf59a8 ffffffffa0088e55 0000000000000011 ffff8801007eec00 0000000000018000 ffff880036a17101 ffff88004bdf5a08 ffffffffa0089026 Call Trace: [<ffffffffa0088e55>] i915_vma_unbind+0xdf/0x1ab [i915] [<ffffffffa0089026>] __i915_gem_shrink+0x105/0x177 [i915] [<ffffffffa0089452>] i915_gem_object_get_pages_gtt+0x108/0x309 [i915] [<ffffffffa0085ba9>] i915_gem_object_get_pages+0x61/0x90 [i915] [<ffffffffa008f22b>] ? gen6_ppgtt_insert_entries+0x103/0x125 [i915] [<ffffffffa008a113>] i915_gem_object_pin+0x1fa/0x5df [i915] [<ffffffffa008cdfe>] i915_gem_execbuffer_reserve_object.isra.6+0x8d/0x1bc [i915] [<ffffffffa008d156>] i915_gem_execbuffer_reserve+0x229/0x367 [i915] [<ffffffffa008dbf6>] i915_gem_do_execbuffer.isra.12+0x4dc/0xf3a [i915] [<ffffffff810fc823>] ? might_fault+0x40/0x90 [<ffffffffa008eb89>] i915_gem_execbuffer2+0x187/0x222 [i915] [<ffffffffa000971c>] drm_ioctl+0x308/0x442 [drm] [<ffffffffa008ea02>] ? i915_gem_execbuffer+0x3ae/0x3ae [i915] [<ffffffff817db156>] ? __do_page_fault+0x3dd/0x481 [<ffffffff8112fdba>] vfs_ioctl+0x26/0x39 [<ffffffff811306a2>] do_vfs_ioctl+0x40e/0x451 [<ffffffff817deda7>] ? sysret_check+0x1b/0x56 [<ffffffff8113073c>] SyS_ioctl+0x57/0x87 [<ffffffff8135bbfe>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff817ded82>] system_call_fastpath+0x16/0x1b Code: 48 c7 c6 84 30 0e a0 31 c0 e8 d0 e9 f7 ff bf c6 a7 00 00 e8 07 af 2c e1 41 f6 84 24 03 01 00 00 10 75 44 49 8b 84 24 08 01 00 00 <8b> 50 08 48 8b 30 49 8b 86 b0 04 00 00 48 89 c7 48 81 c7 98 00 RIP [<ffffffffa008fb37>] i915_gem_gtt_finish_object+0x73/0xc8 [i915] RSP <ffff88004bdf5958> CR2: 0000000000000008 As a consequence we need to change the "only one vma for now" check in vma_unbind - since vma_destroy isn't always called the obj->vma_list might not be empty. Instead check that the vma list is singular at the beginning of vma_unbind. This is also more symmetric with bind_to_vm. This fixes the igt/gem_evict_everything\|alignment testcases. v2: - Add a paranoid WARN to mark_free in the eviction code to make sure we never try to evict a vma used by the execbuf code right now. - Move the check for a temporary execbuf vma into vma_destroy - otherwise the failure path cleanup in bind_to_vm will blow up. Our first attempting at fixing this was commit 1be81a2f2cfd8789a627401d470423358fba2d76 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Aug 20 12:56:40 2013 +0100 drm/i915: Don't destroy the vma placeholder during execbuffer reservation Squash with this when merging! v3: Improvements suggested in Chris' review: - Move the WARN_ON in vma_destroy that checks for vmas with an drm_mm allocation before the early return. - Bail out if we hit the WARN in mark_free to hopefully make the kernel survive for long enough to capture it. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68171 Tested-by: lu hua <huax.lu@intel.com> (v2) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:42 +02:00
Chris Wilson	aaa0566792	drm/i915: Don't destroy the vma placeholder during execbuffer reservation The execbuffer handle and exec_link were moved from the object into the vma. As the vma may be unbound and destroyed whilst attempting to reserve the execbuffer objects (either through a forced unbind to fix up a misalignment or through an evict-everything call) we need to prevent the free of the i915_vma itself. Otherwise not only is the list of objects to reserve corrupt, but we continue to reference stale vma entries. Fixes kernel crash with i-g-t/gem_evict_everything This regression has been introduced in commit 04038a515d6eda6dd0857c0ade0b3950d372f4c0 Author: Ben Widawsky <ben@bwidawsk.net> AuthorDate: Wed Aug 14 11:38:36 2013 +0200 drm/i915: Convert execbuf code to use vmas Reported-by: Dan Carpenter <dan.carpenter@oracle.com> References: http://www.spinics.net/lists/intel-gfx/msg32038.html Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68298 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:42 +02:00
Daniel Vetter	e656a6cba0	drm/i915: inline vma_create into lookup_or_create_vma In the execbuf code we don't clean up any vmas which ended up not getting bound for code simplicity. To make sure that we don't end up creating multiple vma for the same vm kill the somewhat dangerous vma_create function and inline it into lookup_or_create. This is just a safety measure to prevent surprises in the future. Also update the somewhat confused comment in the execbuf code and clarify what kind of magic is going on with a new one. v2: Keep the function separate as requested by Chris. But give it a __ prefix for paranoia and move it tighter together with the other vma stuff. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:41 +02:00
Ben Widawsky	27173f1f95	drm/i915: Convert execbuf code to use vmas In order to transition more of our code over to using a VMA instead of an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up until now, we've only had a VMA when actually binding an object. The previous patch helped handle the distinction on bound vs. unbound. This patch will help us catch leaks, and other issues before we actually shuffle a bunch of stuff around. This attempts to convert all the execbuf code to speak in vmas. Since the execbuf code is very self contained it was a nice isolated conversion. The meat of the code is about turning eb_objects into eb_vma, and then wiring up the rest of the code to use vmas instead of obj, vm pairs. Unfortunately, to do this, we must move the exec_list link from the obj structure. This list is reused in the eviction code, so we must also modify the eviction code to make this work. WARNING: This patch makes an already hotly profiled path slower. The cost is unavoidable. In reply to this mail, I will attach the extra data. v2: Release table lock early, and two a 2 phase vma lookup to avoid having to use a GFP_ATOMIC. (Chris) v3: s/obj_exec_list/obj_exec_link/ Updates to address commit `6d2b888569` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Aug 7 18:30:54 2013 +0100 drm/i915: List objects allocated from stolen memory in debugfs v4: Use obj = vma->obj for neatness in some places (Chris) need_reloc_mappable() should return false if ppgtt (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Split out prep patches. Also remove a FIXME comment which is now taken care of.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-04 17:34:41 +02:00
Damien Lespiau	d2933a5b8f	drm/i915: Don't call sg_free_table() if sg_alloc_table() fails One needs to call __sg_free_table() if __sg_alloc_table() fails, but sg_alloc_table() does that for us already. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewd-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-03 19:18:00 +02:00
Joe Perches	fac15c1082	i915_gem: Convert kmem_cache_alloc(...GFP_ZERO) to kmem_cache_zalloc The helper exists, might as well use it instead of __GFP_ZERO. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-09-03 19:17:56 +02:00
Dave Airlie	efa27f9cec	Merge tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Need to get my stuff out the door ;-) Highlights: - pc8+ support from Paulo - more vma patches from Ben. - Kconfig option to enable preliminary support by default (Josh Triplett) - Optimized cpu cache flush handling and support for write-through caching of display planes on Iris (Chris) - rc6 tuning from Stéphane Marchesin for more stability - VECS seqno wrap/semaphores fix (Ben) - a pile of smaller cleanups and improvements all over Note that I've ditched Ben's execbuf vma conversion for 3.12 since not yet ready. But there's still other vma conversion stuff in here. * tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel: (62 commits) drm/i915: Print seqnos as unsigned in debugfs drm/i915: Fix context size calculation on SNB/IVB/VLV drm/i915: Use POSTING_READ in lcpll code drm/i915: enable Package C8+ by default drm/i915: add i915.pc8_timeout function drm/i915: add i915_pc8_status debugfs file drm/i915: allow package C8+ states on Haswell (disabled) drm/i915: fix SDEIMR assertion when disabling LCPLL drm/i915: grab force_wake when restoring LCPLL drm/i915: drop WaMbcDriverBootEnable workaround drm/i915: Cleaning up the relocate entry function drm/i915: merge HSW and SNB PM irq handlers drm/i915: fix how we mask PMIMR when adding work to the queue drm/i915: don't queue PM events we won't process drm/i915: don't disable/reenable IVB error interrupts when not needed drm/i915: add dev_priv->pm_irq_mask drm/i915: don't update GEN6_PMIMR when it's not needed drm/i915: wrap GEN6_PMIMR changes drm/i915: wrap GTIMR changes drm/i915: add the FCLK case to intel_ddi_get_cdclk_freq ...	2013-08-30 09:47:41 +10:00
Paulo Zanoni	c67a470b1d	drm/i915: allow package C8+ states on Haswell (disabled) This patch allows PC8+ states on Haswell. These states can only be reached when all the display outputs are disabled, and they allow some more power savings. The fact that the graphics device is allowing PC8+ doesn't mean that the machine will actually enter PC8+: all the other devices also need to allow PC8+. For now this option is disabled by default. You need i915.allow_pc8=1 if you want it. This patch adds a big comment inside i915_drv.h explaining how it works and how it tracks things. Read it. v2: (this is not really v2, many previous versions were already sent, but they had different names) - Use the new functions to enable/disable GTIMR and GEN6_PMIMR - Rename almost all variables and functions to names suggested by Chris - More WARNs on the IRQ handling code - Also disable PC8 when there's GPU work to do (thanks to Ben for the help on this), so apps can run caster - Enable PC8 on a delayed work function that is delayed for 5 seconds. This makes sure we only enable PC8+ if we're really idle - Make sure we're not in PC8+ when suspending v3: - WARN if IRQs are disabled on __wait_seqno - Replace some DRM_ERRORs with WARNs - Fix calls to restore GT and PM interrupts - Use intel_mark_busy instead of intel_ring_advance to disable PC8 v4: - Use the force_wake, Luke! v5: - Remove the "IIR is not zero" WARNs - Move the force_wake chunk to its own patch - Only restore what's missing from RC6, not everything Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-23 14:52:33 +02:00
Ben Widawsky	accfef2e5a	drm/i915: prepare bind_to_vm for preallocated vma In the new execbuf code we want to track buffers using the vmas even before they're all properly mapped. Which means that bind_to_vm needs to deal with buffers which have preallocated vmas which aren't yet bound. This patch implements this prep work and adjusts our WARN/BUG checks. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Split out from Ben's big execbuf patch. Also move one BUG back to its original place to deflate the diff a notch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:53 +02:00
Ben Widawsky	82a55ad1a0	drm/i915: Switch eviction code to use vmas The execbuf wants to do relocations usings vmas, so we need a vma->exec_list. The eviction code also uses the old obj execbuf list for it's own book-keeping, but would really prefer to deal in vmas only. So switch it over to the new list. Again this is just a prep patch for the big execbuf vma conversion. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Split out from Ben's big execbuf vma patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:52 +02:00
Ben Widawsky	b25cb2f882	drm/i915: s/obj->exec_list/obj->obj_exec_link in debugfs To convert the execbuf code over to use vmas natively we need to shuffle the exec_list a bit. This patch here just prepares things with the debugfs code, which also uses the old exec_list list_head, newly called obj_exec_link. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Split out from Ben's big patch.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:51 +02:00
Chris Wilson	4b6d846e9a	drm/i915: Drop the overzealous warning from i915_gem_set_cache_level By our earlier reckoning, move from a snooped/llc setting to an uncached setting, leaves the CPU cache in a consistent state irrespective of our domain tracking - so we can forgo the warning about the lack of invalidation. Similarly for any writes posted to the snooped CPU domain, we know will be safely clflushed to the uncached PTEs after forcing the domain change. This WARN started to pop up with commit `d46f1c3f13` Author: Chris Wilson <chris@chris-wilson.co.uk> AuthorDate: Thu Aug 8 14:41:06 2013 +0100 drm/i915: Allow the GPU to cache stolen memory Ville brought up a scenario where the interaction of a set_caching ioctl call from userspace on a scanout buffer (i.e. obj->pin_display is set) resulted in the code getting confused and not properly flushing stale cpu cachelines. Luckily we already prevent this by rejecting caching changes when obj->pin_count is set. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68040 Tested-by: cancan,feng <cancan.feng@intel.com> [danvet: Add buglink, bisect result and explain why Ville's scenario is already taken care of.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:46 +02:00
Daniel Vetter	49987099e2	drm/i915: use vma->node directly and rewrap map&fence in bind Use () to make for neater alignment of the split lines, too. With this we ditch another jump through the obj_gtt_size/offset indirection maze. Cc: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:45 +02:00
Ben Widawsky	4bd561b3e8	drm/i915: cleanup map&fence in bind Cleanup the map and fenceable setting during bind to make more sense, and not check i915_is_ggtt() 2 unnecessary times v2: Move the bools into the if block (Chris) - There are ways to tidy this function (fence calculations for instance) even further, but they are quite invasive, so I am punting on those unless specifically asked. v3: Add newline between variable declaration and logic (Chris) Recommended-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:45 +02:00
Ben Widawsky	433544bd25	drm/i915: Remove node only when allocated VMAs can be created and not bound. One may think of it as lazy cleanup, and safely gloss over the conditions which manufacture it. In either case, when the object backing the i915 vma is destroyed, we must cleanup the vma without stumbling into a bunch of pitfalls that assume the vma is bound. NOTE: I was pretty certain the above condition could only happen when we introduced the use of VMAs being looked up at execbuf, and already existing. Paulo has hit this though, so I must be missing something. As I believe the patch is correct anyway, therefore I won't scratch my head too hard. v2: use goto destroy as a compromise (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:44 +02:00
Chris Wilson	4257d3ba3b	drm/i915: Allow the user to set bo into the DISPLAY cache domain This is primarily for the benefit of the create2 ioctl so that the caller can avoid the later step of rebinding the bo with new PTE bits. After introducing WT (and possibly GFDT) cacheing for display targets, not everything in the display is earmarked as UC, and more importantly what is is controlled by the kernel. Note that set_cache_level/get_cache_level for DISPLAY is not necessarily idempotent; get_cache_level may return UC for architectures that have no special cache domain for the display engine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:39 +02:00
Chris Wilson	651d794fae	drm/i915: Use Write-Through cacheing for the display plane on Iris Haswell GT3e has the unique feature of supporting Write-Through cacheing of objects within the eLLC/LLC. The purpose of this is to enable the display plane to remain coherent whilst objects lie resident in the eLLC/LLC - so that we, in theory, get the best of both worlds, perfect display and fast access. However, we still need to be careful as the CPU does not see the WT when accessing the cache. In particular, this means that we need to flush the cache lines after writing to an object through the CPU, and on transitioning from a cached state to WT. v2: Actually do the clflush on transition to WT, nagging by Ville. v3: Flush the CPU cache after writes into WT objects. v4: Rease onto LLC updates and report WT as "uncached" for get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:38 +02:00
Jani Nikula	f2f4d82faf	drm/i915: give more distinctive names to ring hangcheck action enums The short lowercase names are bound to collide. The default warnings don't even warn about shadowing. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:37 +02:00
Ben Widawsky	7ace7ef2f5	drm/i915: WARN_ON failed map_and_fenceable I just noticed in our code we don't really check the assertion, and given some of the code I am changing in this area, I feel a WARN is very nice to have. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: s/&/&&/ to fix typo on the check.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:35 +02:00
Chris Wilson	000433b67e	drm/i915: Only do a chipset flush after a clflush Now that we skip clflushes more often, return a boolean indicating whether the clflush was actually performed, and only if it was do the chipset flush. (Though on most of the architectures where the clflush will be skipped, the chipset flush is a no-op!) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-22 13:31:34 +02:00
Dave Airlie	9712def2b3	Merge tag 'drm-intel-next-2013-08-09' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: New pile of stuff for -next: - Cleanup of the old crtc helper callbacks, all encoders are now converted to the i915 modeset infrastructure. - Massive amount of wm patches from Ville for ilk, snb, ivb, hsw, this is prep work to eventually get things going for nuclear pageflips where we need to adjust watermarks on the fly. - More vm/vma patches from Ben. This refactoring isn't yet fully rolled out, we miss the execbuf conversion and some of the low-level bind/unbind support code. - Convert our hdmi infoframe code to use the new common helper functions (Damien). This contains some bugfixes for the common infoframe helpers. - Some cruft removal from Damien. - Various smaller bits&pieces all over, as usual. * tag 'drm-intel-next-2013-08-09' of git://people.freedesktop.org/~danvet/drm-intel: (105 commits) drm/i915: Fix FB WM for HSW drm/i915: expose HDMI connectors on port C on BYT drm/i915: fix a limit check in hsw_compute_wm_results() drm/i915: unbreak i915_gem_object_ggtt_unbind() drm/i915: Make intel_set_mode() static drm/i915: Remove intel_modeset_disable() drm/i915: Make intel_encoder_dpms() static drm/i915: Make i915_hangcheck_elapsed() static drm/i915: Fix #endif comment drm/i915: Remove i915_gem_object_check_coherency() drm/i915: Remove stale prototypes drm/i915: List objects allocated from stolen memory in debugfs drm/i915: Always call intel_update_sprite_watermarks() when disabling a plane drm/i915: Pass plane and crtc to intel_update_sprite_watermarks drm/i915: Don't try to disable plane if it's already disabled drm/i915: Pass crtc to our update/disable_plane hooks drm/i915: Split plane watermark parameters into a separate struct drm/i915: Pull some watermarks state into a separate structure drm/i915: Calculate max watermark levels for ILK+ drm/i915: Rename hsw_lp_wm_result to intel_wm_level ...	2013-08-21 12:48:59 +10:00
Chris Wilson	2c22569bba	drm/i915: Update rules for writing through the LLC with the cpu As mentioned in the previous commit, reads and writes from both the CPU and GPU go through the LLC. This gives us coherency between the CPU and GPU irrespective of the attribute settings either device sets. We can use to avoid having to clflush even uncached memory. Except for the scanout. The scanout resides within another functional block that does not use the LLC but reads directly from main memory. So in order to maintain coherency with the scanout, writes to uncached memory must be flushed. In order to optimize writes elsewhere, we start tracking whether an framebuffer is attached to an object. v2: Use pin_display tracking rather than fb_count (to ensure we flush cursors as well etc) and only force the clflush along explicit writes to the scanout paths (i.e. pin_to_display_plane and pwrite into scanout). v3: Force the flush after hitting the slowpath in pwrite, as after dropping the lock the object's cache domain may be invalidated. (Ville) Based on a patch by Ville Syrjälä. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-10 11:20:49 +02:00
Chris Wilson	cc98b413c1	drm/i915: Track when an object is pinned for use by the display engine The display engine has unique coherency rules such that it requires special handling to ensure that all writes to cursors, scanouts and sprites are clflushed. This patch introduces the infrastructure to simply track when an object is being accessed by the display engine. v2: Explain the is_pin_display() magic as the sources for obj->pin_count and their individual rules is not obvious. (Ville) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-10 11:19:51 +02:00
Chris Wilson	c76ce038e3	drm/i915: Update rules for reading cache lines through the LLC The LLC is a fun device. The cache is a distinct functional block within the SA that arbitrates access from both the CPU and GPU cores. As such all writes to memory land first in the LLC before further action is taken. For example, an uncached write from either the CPU or GPU will then proceed to memory and evict the cacheline from the LLC. This means that a read from the LLC always returns the correct information even if the PTE bit in the GPU differs from the PAT bit in the CPU. For the older snooping architecture on non-LLC, the fundamental principle still holds except that some coordination is required between the CPU and GPU to explicitly perform the snooping (which is handled by our request tracking). The upshot of this is that we know that we can issue a read from either LLC devices or snoopable memory and trust the contents of the cache - i.e. we can forgo a clflush before a read in these circumstances. Writing to memory from the CPU is a little more tricky as we have to consider that the scanout does not read from the CPU cache at all, but from main memory. So we have to currently treat all requests to write to uncached memory as having to be flushed to main memory for coherency with all consumers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-10 11:19:50 +02:00
Dan Carpenter	58e73e1570	drm/i915: unbreak i915_gem_object_ggtt_unbind() There is an extra semi-colon here so we just leak and never unbind anything. This regression has been introduced in commit `07fe0b1280` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 31 17:00:10 2013 -0700 drm/i915: plumb VM into bind/unbind code Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-09 12:04:53 +02:00
Ben Widawsky	8b9c2b9411	drm/i915: Add vma to list at creation With the current code there shouldn't be a distinction - however with an upcoming change we intend to allocate a vma much earlier, before it's actually bound anywhere. To do this we have to check node allocation as well for the _bound() check. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: move list_del(&vma->vma_link) from vma_unbind to vma_destroy, again fallout from the loss of "rm/i915: Cleanup more of VMA in destroy".] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> fixup for drm/i915: Add vma to list at creation	2013-08-08 14:10:20 +02:00
Ben Widawsky	ca191b1313	drm/i915: mm_list is per VMA formerly: "drm/i915: Create VMAs (part 5) - move mm_list" The mm_list is used for the active/inactive LRUs. Since those LRUs are per address space, the link should be per VMx . Because we'll only ever have 1 VMA before this point, it's not incorrect to defer this change until this point in the patch series, and doing it here makes the change much easier to understand. Shamelessly manipulated out of Daniel: "active/inactive stuff is used by eviction when we run out of address space, so needs to be per-vma and per-address space. Bound/unbound otoh is used by the shrinker which only cares about the amount of memory used and not one bit about in which address space this memory is all used in. Of course to actual kick out an object we need to unbind it from every address space, but for that we have the per-object list of vmas." v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris) v3: Moved earlier in the series v4: Add dropped message from v3 Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Frob patch to apply and use vma->node.size directly as discused with Ben. Also drop a needles BUG_ON before move_to_inactive, the function itself has the same check.] [danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA in destroy", specifically unlink the vma from the mm_list in vma_unbind (to keep it symmetric with bind_to_vm) instead of vma_destroy.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:06:58 +02:00
Ben Widawsky	5cacaac77c	drm/i915: Fix up map and fenceable for VMA formerly: "drm/i915: Create VMAs (part 3.5) - map and fenceable tracking" The map_and_fenceable tracking is per object. GTT mapping, and fences only apply to global GTT. As such, object operations which are not performed on the global GTT should not effect mappable or fenceable characteristics. Functionally, this commit could very well be squashed in to a previous patch which updated object operations to take a VM argument. This commit is split out because it's a bit tricky (or at least it was for me). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Drop the bogus hunk in i915_vma_unbind as discussed with Ben.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:55 +02:00
Ben Widawsky	9843877d10	drm/i915: turn bound_ggtt checks to bound_any In some places, we want to know if an object is bound in any address space, and not just the global GTT. This often applies when there is a single global resource (object, pages, etc.) function \| reason -------------------------------------------------- i915_gem_object_is_inactive \| global object i915_gem_object_put_pages \| object's pages 915_gem_object_unpin \| global object i915_gem_execbuffer_unreserve_object \| temporary until we plumb vma pread/pwrite \| see the note below Note: set_to_gtt_domain in pwrite/pread is abused as a wait_rendering call - but that once only worked if the object is bound. We really should replace this with a plain wait_rendering call, which would have the upside that in pread it would be clearer that we actually only wait for oustanding gpu writes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Explain the set_to_gtt_domain in pwrite/pread and volunteer Ben to replace those with wait_rendering calls.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:43 +02:00
Ben Widawsky	f6cd1f15d3	drm/i915: Use new bind/unbind in eviction code Eviction code, like the rest of the converted code needs to be aware of the address space for which it is evicting (or the everything case, all addresses). With the updated bind/unbind interfaces of the last patch, we can now safely move the eviction code over. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:43 +02:00
Ben Widawsky	07fe0b1280	drm/i915: plumb VM into bind/unbind code As alluded to in several patches, and it will be reiterated later... A VMA is an abstraction for a GEM BO bound into an address space. Therefore it stands to reason, that the existing bind, and unbind are the ones which will be the most impacted. This patch implements this, and updates all callers which weren't already updated in the series (because it was too messy). This patch represents the bulk of an earlier, larger patch. I've pulled out a bunch of things by the request of Daniel. The history is preserved for posterity with the email convention of ">" One big change from the original patch aside from a bunch of cropping is I've created an i915_vma_unbind() function. That is because we always have the VMA anyway, and doing an extra lookup is useful. There is a caveat, we retain an i915_gem_object_ggtt_unbind, for the global cases which might not talk in VMAs. > drm/i915: plumb VM into object operations > > This patch was formerly known as: > "drm/i915: Create VMAs (part 3) - plumbing" > > This patch adds a VM argument, bind/unbind, and the object > offset/size/color getters/setters. It preserves the old ggtt helper > functions because things still need, and will continue to need them. > > Some code will still need to be ported over after this. > > v2: Fix purge to pick an object and unbind all vmas > This was doable because of the global bound list change. > > v3: With the commit to actually pin/unpin pages in place, there is no > longer a need to check if unbind succeeded before calling put_pages(). > Make put_pages only BUG() after checking pin count. > > v4: Rebased on top of the new hangcheck work by Mika > plumbed eb_destroy also > Many checkpatch related fixes > > v5: Very large rebase > > v6: > Change BUG_ON to WARN_ON (Daniel) > Rename vm to ggtt in preallocate stolen, since it is always ggtt when > dealing with stolen memory. (Daniel) > list_for_each will short-circuit already (Daniel) > remove superflous space (Daniel) > Use per object list of vmas (Daniel) > Make obj_bound_any() use obj_bound for each vm (Ben) > s/bind_to_gtt/bind_to_vm/ (Ben) > > Fixed up the inactive shrinker. As Daniel noticed the code could > potentially count the same object multiple times. While it's not > possible in the current case, since 1 object can only ever be bound into > 1 address space thus far - we may as well try to get something more > future proof in place now. With a prep patch before this to switch over > to using the bound list + inactive check, we're now able to carry that > forward for every address space an object is bound into. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Rebase on top of the loss of "drm/i915: Cleanup more of VMA in destroy".] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:04:20 +02:00
Ben Widawsky	80dcfdbd68	drm/i915: Rework __i915_gem_shrink In order to do this for all VMs, it's convenient to rework the logic a bit. This should have no functional impact. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-08 14:02:41 +02:00
Dave Airlie	32c913e436	Merge tag 'drm-intel-next-2013-07-26-fixed' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Neat that QA (and Ben) keeps on humming along while I'm on vacation, so you already get the next feature pull request: - proper eLLC support for HSW from Ben - more interrupt refactoring - add w/a tags where we implement them already (Damien) - hangcheck fixes (Chris) + hangcheck stats (Mika) - flesh out the new vm structs for ppgtt and ggtt (Ben) - PSR for Haswell, still disabled by default (Rodrigo et al.) - pc8+ refclock sequence code from Paulo - more interrupt refactoring from Paulo, unifying ilk/snb with the ivb/hsw interrupt code - full solution for the Haswell concurrent reg access issues (Chris) - fix racy object accounting, used by some new leak tests - fix sync polarity settings on ch7xxx dvo encoder - random bits&pieces, little fixes and better debug output all over [airlied: fix conflict with drm_mm cleanups] * tag 'drm-intel-next-2013-07-26-fixed' of git://people.freedesktop.org/~danvet/drm-intel: (289 commits) drm/i915: Do not dereference NULL crtc or fb until after checking drm/i915: fix pnv display core clock readout out drm/i915: Replace open-coded offset_in_page() drm/i915: Retry DP aux_ch communications with a different clock after failure drm/i915: Add messages useful for HPD storm detection debugging (v2) drm/i915: dvo_ch7xxx: fix vsync polarity setting drm/i915: fix the racy object accounting drm/i915: Convert the register access tracepoint to be conditional drm/i915: Squash gen lookup through multiple indirections inside GT access drm/i915: Use the common register access functions for NOTRACE variants drm/i915: Use a private interface for register access within GT drm/i915: Colocate all GT access routines in the same file drm/i915: fix reference counting in i915_gem_create drm/i915: Use Graphics Base of Stolen Memory on all gen3+ drm/i915: disable stolen mem for OVERLAY_NEEDS_PHYSICAL drm/i915: add functions to disable and restore LCPLL drm/i915: disable CLKOUT_DP when it's not needed drm/i915: extend lpt_enable_clkout_dp drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt drm/i915: Add some debug breadcrumbs to connector detection ...	2013-08-07 18:11:35 +10:00
David Herrmann	31e5d7c67b	drm/mm: add "best_match" flag to drm_mm_insert_node() Add a "best_match" flag similar to the drm_mm_search_() helpers so we can convert TTM to use them in follow up patches. We can also inline the non-generic helpers and move them into the header to allow compile-time optimizations. To make calls to drm_mm_{search,insert}_node() more readable, this converts the boolean argument to a flagset. There are pending patches that add additional flags for top-down allocators and more. v2: - use flag parameter instead of boolean "best_match" - convert _search_free() helpers to also use flags argument Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-08-07 10:08:58 +10:00
Daniel Vetter	43387b37fa	drm/gem: create drm_gem_dumb_destroy All the gem based kms drivers really want the same function to destroy a dumb framebuffer backing storage object. So give it to them and roll it out in all drivers. This still leaves the option open for kms drivers which don't use GEM for backing storage, but it does decently simplify matters for gem drivers. Acked-by: Inki Dae <inki.dae@samsung.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org> Cc: Ben Skeggs <skeggsb@gmail.com> Reviwed-by: Rob Clark <robdclark@gmail.com> Cc: Alex Deucher <alexdeucher@gmail.com> Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-08-07 09:59:24 +10:00
Ben Widawsky	637efacf8f	drm/i915: eliminate dead domain clearing on reset The code itself is no longer accurate without updating once we have multiple address space since clearing the domains of every object requires scanning the inactive list for all VMs. "This code is dead. Just remove it rather than port it to vma." - Chris Wilson Recommended-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:12:25 +02:00
Ben Widawsky	d1ccbb5d71	drm/i915: make reset&hangcheck code VM aware Hangcheck, and some of the recent reset code for guilty batches need to know which address space the object was in at the time of a hangcheck. This is because we use offsets in the (PP\|G)GTT to determine this information, and those offsets can differ depending on which VM they are bound into. Since we still only have 1 VM ever, this code shouldn't yet have any impact. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:13 +02:00
Ben Widawsky	3e12302705	drm/i915: BUG_ON put_pages later With multiple VMs, the eviction code benefits from being able to blindly put pages without needing to know if there are any entities still holding on to those pages. As such it's preferable to return the -EBUSY before the BUG. Eviction code is the only user for now, but overall it makes sense anyway, IMO. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:13 +02:00
Ben Widawsky	3089c6f239	drm/i915: make caching operate on all address spaces For now, objects will maintain the same cache levels amongst all address spaces. This is to limit the risk of bugs, as playing with cacheability in the different domains can be very error prone. In the future, it may be optimal to allow setting domains per VMA (ie. an object bound into an address space). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:12 +02:00
Ben Widawsky	c37e220461	drm/i915: Add VM to pin To verbalize it, one can say, "pin an object into the given address space." The semantics of pinning remain the same otherwise. Certain objects will always have to be bound into the global GTT. Therefore, global GTT is a special case, and keep a special interface around for it (i915_gem_obj_ggtt_pin). v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Ben Widawsky	fcb4a57805	drm/i915: Use bound list for inactive shrink Do to the move active/inactive lists, it no longer makes sense to use them for shrinking, since shrinking isn't VM specific (such a need may also exist, but doesn't yet). What we can do instead is use the global bound list to find all objects which aren't active. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:09 +02:00
Ben Widawsky	a70a3148b0	drm/i915: Make proper functions for VMs Earlier in the conversion sequence we attempted to quickly wedge in the transitional interface as static inlines. Now that we're sure these interfaces are sane, for easier debug and to decrease code size (since many of these functions may be called quite a bit), make them real functions While at it, kill off the set_color interface. We'll always have the VMA, or easily get to it. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:08 +02:00
Ben Widawsky	fc8c067eee	drm/i915: Create an init vm Move all the similar address space (VM) initialization code to one function. Until we have multiple VMs, there should only ever be 1 VM. The aliasing ppgtt is a special case without it's own VM (since it doesn't need it's own address space management). Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-08-05 19:04:07 +02:00
Daniel Vetter	c20e835586	drm/i915: fix the racy object accounting Just use a spinlock to protect them. v2: Rebase onto the new object create refcount fix patch. v3: Don't kill dev_priv->mm.object_memory as requested by Chris and hence just use a spinlock instead of atomic_t. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67287 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-25 15:30:54 +02:00
Daniel Vetter	cb54b53ada	Merge commit 'Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux' This backmerges Linus' merge commit of the latest drm-fixes pull: commit `549f3a1218` Merge: `42577ca` `058ca4a` Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Tue Jul 23 15:47:08 2013 -0700 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux We've accrued a few too many conflicts, but the real reason is that I want to merge the 100% solution for Haswell concurrent registers writes into drm-intel-next. But that depends upon the 90% bandaid merged into -fixes: commit `a7cd1b8fea` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 19 20:36:51 2013 +0100 drm/i915: Serialize almost all register access Also, we can roll up on accrued conflicts. Usually I'd backmerge a tagged -rc, but I want to get this done before heading off to vacations next week ;-) Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_gem.c v2: For added hilarity we have a init sequence conflict around the gt_lock, so need to move that one, too. Spotted by Jani Nikula. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-25 15:18:41 +02:00
David Herrmann	51335df9f0	drm/vma: provide drm_vma_node_unmap() helper Instead of unmapping the nodes in TTM and GEM users manually, we provide a generic wrapper which does the correct thing for all vma-nodes. v2: remove bdev->dev_mapping test in ttm_bo_unmap_virtual_unlocked() as ttm_mem_io_free_vm() does nothing in that case (io_reserved_vm is 0). v4: Fix docbook comments v5: use drm_vma_node_size() Cc: Dave Airlie <airlied@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2013-07-25 20:47:08 +10:00
David Herrmann	0de23977cf	drm/gem: convert to new unified vma manager Use the new vma manager instead of the old hashtable. Also convert all drivers to use the new convenience helpers. This drops all the (map_list.hash.key << PAGE_SHIFT) non-sense. Locking and access-management is exactly the same as before with an additional lock inside of the vma-manager, which strictly wouldn't be needed for gem. v2: - rebase on drm-next - init nodes via drm_vma_node_reset() in drm_gem.c v3: - fix tegra v4: - remove duplicate if (drm_vma_node_has_offset()) checks - inline now trivial drm_vma_node_offset_addr() calls v5: - skip node-reset on gem-init due to kzalloc() - do not allow mapping gem-objects with offsets (backwards compat) - remove unneccessary casts Cc: Inki Dae <inki.dae@samsung.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2013-07-25 20:47:06 +10:00
Daniel Vetter	d861e33876	drm/i915: fix reference counting in i915_gem_create This function is called without the dev->struct_mutex held, hence we need to use the _unlocked unreference variants. As soon as the object is registered userspace can sneak in here with a gem_close ioctl call, so the object can (and with my new evil tests actually does) get the final unreference in this place. The lack of locking then results in hilarity and some good leakage. To fix this we simply need to revert Chris Wilson <chris@chris-wilson.co.uk> v2: We need to make the trace call _before_ we drop our ref - the object might very well be gone by then already. v3: Just revert the original patch as suggested by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Remove the added white line again to tighten the return block, requested by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-24 23:25:03 +02:00
Daniel Vetter	bc6bc15bd7	drm/i915: fix up error cleanup in i915_gem_object_bind_to_gtt This has been broken in commit `2f63315692` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Jul 17 12:19:03 2013 -0700 drm/i915: Create VMAs which resulted in an OOPS the first time around we've hit -ENOSPC. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67156 Cc: Imre Deak <imre.deak@intel.com> Cc: Ben Widawsky <ben@bwidawsk.net> Tested-by: meng <mengmeng.meng@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-24 10:37:08 +02:00
Xiong Zhang	0b74b508f7	drm/i915: add prefault_disable module option prefault is stll enabled by default which prevent most of pwrite/pread/reloc from running slow path, in order to verify these slow pathes, prefault need to be disabled. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> [danvet: Make checkpatch happy and bikeshed the module option help text a bit.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 09:29:26 +02:00
Dan Carpenter	6286ef9b56	drm/i915: use after free on error path i915_gem_vma_destroy() frees its argument so we have to move the drm_mm_remove_node() call up a few lines. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 08:58:42 +02:00
Dan Carpenter	db473b36d4	drm/i915: checking for NULL instead of IS_ERR() i915_gem_vma_create() returns and ERR_PTR() or a valid pointer, it never returns NULL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 08:58:33 +02:00
Dave Airlie	e13af9a834	Merge tag 'drm-intel-next-2013-07-12' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Highlights: - follow-up refactoring after the shared dpll rework that landed in 3.11 - oddball prep cleanups from Ben for ppgtt - encoder->get_config state tracking infrastructure from Jesse - used by the experimental fastboot support from Jesse (disabled by default) - make the error state file official and add it to our sysfs interface (Mika) - drm_mm prep changes from Ben, prepares to embedd the drm_mm_node (which will be used by the vma rework later on) - interrupt handling rework, follow up cleanups to the VECS enabling, hpd storm handling and fifo underrun reporting. - Big pile of smaller cleanups, code improvements and related stuff. * tag 'drm-intel-next-2013-07-12' of git://people.freedesktop.org/~danvet/drm-intel: (72 commits) drm/i915: clear DPLL reg when disabling i9xx dplls drm/i915: Fix up cpt pixel multiplier enable sequence drm/i915: clean up vlv ->pre_pll_enable and pll enable sequence drm/i915: move error state to own compilation unit drm/i915: Don't attempt to read an unitialized stack value drm/i915: Use for_each_pipe() when possible drm/i915: don't enable PM_VEBOX_CS_ERROR_INTERRUPT drm/i915: unify ring irq refcounts (again) drm/i915: kill dev_priv->rps.lock drm/i915: queue work outside spinlock in hsw_pm_irq_handler drm/i915: streamline hsw_pm_irq_handler drm/i915: irq handlers don't need interrupt-safe spinlocks drm/i915: kill lpt pch transcoder->crtc mapping code for fifo underruns drm/i915: improve GEN7_ERR_INT clearing for fifo underrun reporting drm/i915: improve SERR_INT clearing for fifo underrun reporting drm/i915: extract ibx_display_interrupt_update drm/i915: remove unused members from drm_i915_private drm/i915: don't frob mm.suspended when not using ums drm/i915: Fix VLV DP RBR/HDMI/DAC PLL LPF coefficients drm/i915: WARN if the bios reserved range is bigger than stolen size ... Conflicts: drivers/gpu/drm/i915/i915_gem.c	2013-07-19 12:12:21 +10:00
Daniel Vetter	94a335dba3	drm/i915: correctly restore fences with objects attached To avoid stalls we delay tiling changes and especially hold of committing the new fence state for as long as possible. Synchronization points are in the execbuf code and in our gtt fault handler. Unfortunately we've missed that tricky detail when adding proper fence restore code in commit `19b2dbde57` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jun 12 10:15:12 2013 +0100 drm/i915: Restore fences after resume and GPU resets The result was that we've restored fences for objects with no tiling, since the object<->fence link still existed after resume. Now that wouldn't have been too bad since any subsequent access would have fixed things up, but if we've changed from tiled to untiled real havoc happened: The tiling stride is stored -1 in the fence register, so a stride of 0 resulted in all 1s in the top 32bits, and so a completely bogus fence spanning everything from the start of the object to the top of the GTT. The tell-tale in the register dumps looks like: FENCE START 2: 0x0214d001 FENCE END 2: 0xfffff3ff Bit 11 isn't set since the hw doesn't store it, even when writing all 1s (at least on my snb here). To prevent such a gaffle in the future add a sanity check for fences with an untiled object attached in i915_gem_write_fence. v2: Fix the WARN, spotted by Chris. v3: Trying to reuse get_fences looked ugly and obfuscated the code. Instead reuse update_fence and to make it really dtrt also move the fence dirty state clearing into update_fence. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Stéphane Marchesin <marcheu@chromium.org> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=60530 Cc: stable@vger.kernel.org (for 3.10 only) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Matthew Garrett <matthew.garrett@nebula.com> Tested-by: Björn Bidar <theodorstormgrade@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-19 00:08:16 +02:00
Daniel Vetter	8157ee2115	Linux 3.10 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJR0K2gAAoJEHm+PkMAQRiGWsEH+gMZSN1qRm34hZ82q1Tx7HvL Eb/Gsl3Qw/7G2TlTqgjBUs36IdqV9O2cui/aa3/TfXvdvrx+0GlhRkEwQPc+ygcO Mvoyoke4tT4+4jVFdCg1J8avREsa28/6oaHs0ZZxuVmJBBLTJH7aXaNsGn6eU1q9 9+p798MQis6naIiPC63somlZcCIiBhsuWCPWpEfLMn8G1HWAFTM3xXIbNBqe/brS bmIOfhomlIZ5dcdaXGvjtP3+KJhkNDwhkPC4tVYu8JqqgSlrE+a+EGyEuuGqKk10 U+swiqyuD31uBI9ga54u/2FzSqDiAu6YOcMXevjo/m3g9XLdYbYLvN+nvN8alCQ= =Ob6Z -----END PGP SIGNATURE----- Merge tag 'v3.10' into drm-intel-fixes Backmerge Linux 3.10 to get at commit `19b2dbde57` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jun 12 10:15:12 2013 +0100 drm/i915: Restore fences after resume and GPU resets That commit is not in my current -fixes pile since that's based on my -next queue for 3.11. And the above mentioned fix was merged really late into 3.10 (and blew up, bad me) so was on a diverging branch. Option B would have been to rebase my current pile of fixes onto Dave's drm-fixes branch. But since some of the patches here are a bit tricky I've decided not to void all the testing by moving over the entire merge window. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-18 12:03:29 +02:00
Ben Widawsky	2f63315692	drm/i915: Create VMAs Formerly: "drm/i915: Create VMAs (part 1)" In a previous patch, the notion of a VM was introduced. A VMA describes an area of part of the VM address space. A VMA is similar to the concept in the linux mm. However, instead of representing regular memory, a VMA is backed by a GEM BO. There may be many VMAs for a given object, one for each VM the object is to be used in. This may occur through flink, dma-buf, or a number of other transient states. Currently the code depends on only 1 VMA per object, for the global GTT (and aliasing PPGTT). The following patches will address this and make the rest of the infrastructure more suited v2: s/i915_obj/i915_gem_obj (Chris) v3: Only move an object to the now global unbound list if there are no more VMAs for the object which are bound into a VM (ie. the list is empty). v4: killed obj->gtt_space some reworks due to rebase v5: Free vma on error path (Imre) v6: Another missed vma free in i915_gem_object_bind_to_gtt error path (Imre) Fixed vma freeing in stolen preallocation (Imre) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> [danvet: Squash in fixup from Ben to not deref a non-existing vma in set_cache_level, reported by Chris.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-18 08:46:13 +02:00
Ben Widawsky	5cef07e162	drm/i915: Move active/inactive lists to new mm Shamelessly manipulated out of Daniel :-) "When moving the lists around explain that the active/inactive stuff is used by eviction when we run out of address space, so needs to be per-vma and per-address space. Bound/unbound otoh is used by the shrinker which only cares about the amount of memory used and not one bit about in which address space this memory is all used in. Of course to actual kick out an object we need to unbind it from every address space, but for that we have the per-object list of vmas." v2: Leave the bound list as a global one. (Chris, indirectly) v3: Rebased with no i915_gtt_vm. In most places I added a new *vm local, since it will eventually be replaces by a vm argument. Put comment back inline, since it no longer makes sense to do otherwise. v4: Rebased on hangcheck/error state movement Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-17 22:24:32 +02:00
Ben Widawsky	93bd8649db	drm/i915: Put the mm in the parent address space Every address space should support object allocation. It therefore makes sense to have the allocator be part of the "superclass" which GGTT and PPGTT will derive. Since our maximum address space size is only 2GB we're not yet able to avoid doing allocation/eviction; but we'd hope one day this becomes almost irrelvant. v2: Rebased Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-17 22:23:43 +02:00
Ben Widawsky	853ba5d223	drm/i915: Move gtt and ppgtt under address space umbrella The GTT and PPGTT can be thought of more generally as GPU address spaces. Many of their actions (insert entries), state (LRU lists), and many of their characteristics (size) can be shared. Do that. The change itself doesn't actually impact most of the VMA/VM rework coming up, it just fits in with the grand scheme of abstracting the GPU VM operations. GGTT will usually be a special case where we either know an object must be in the GGTT (dislay engine, workarounds, etc.). The scratch page is left as part of the VM (even though it's currently shared with the ppgtt code) because in the future when we have Full PPGTT, I intend to create a separate scratch page for each. v2: Drop usage of i915_gtt_vm (Daniel) Make cleanup also part of the parent class (Ben) Modified commit msg Rebased v3: Properly share scratch page (Imre) Finish commit message (Daniel, Imre) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-17 22:21:47 +02:00
Dave Airlie	6bd2cab2c1	Merge tag 'drm-intel-fixes-2013-07-11' of git://people.freedesktop.org/~danvet/drm-intel One feature latecomer, I've forgotten to merge the patch to reeanble the Haswell power well feature now that the audio interaction is fixed up. Since that was the only unfixed issue with it I've figured I could throw it in a bit late, and it's trivial to revert in case I'm wrong. Otherwise all bug/regression fixes: - Fix status page reinit after gpu hangs, spotted by more paranoid igt checks. - Fix object list walking fumble regression in the shrinker (only the counting part, the actual shrinking code was correct so no Oops potential), from Xiong Zhang. - Fix DP 1.2 bw limits (Imre). - Restore legacy forcewake on ivb, too many broken biosen out there. We dump a warn though that recent userspace might fall over with that config (Guenter Roeck). - Patch up the gen2 cs tlb w/a. - Improve the fence coherency w/a now that we have a better understanding what's going on. The removed wbinvd+ipi should make -rt folks happy. Big thanks to Jon Bloomfield for figuring this out, patches from Chris. - Fix write-read race when switching ring (Chris). Spotted with code inspection, but now we also have an igt for it. There's an ugly regression we're still working on introduced between 3.10-rc7 and 3.10.0. Unfortunately we can't just revert the offender since that one fixes another regression :( I've asked Steven to include my -fixes branch into linux-next to prevent such fallout in the future, hopefully. * tag 'drm-intel-fixes-2013-07-11' of git://people.freedesktop.org/~danvet/drm-intel: Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs" drm/i915: Fix incoherence with fence updates on Sandybridge+ drm/i915: Fix write-read race with multiple rings Partially revert "drm/i915: unconditionally use mt forcewake on hsw/ivb" drm/i915: fix lane bandwidth capping for DP 1.2 sinks drm/i915: fix up ring cleanup for the i830/i845 CS tlb w/a drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list drm/i915: switch disable_power_well default value to 1 drm/i915: reinit status page registers after gpu reset	2013-07-17 08:40:49 +10:00
Mika Kuoppala	10cd45b6e8	drm/i915: introduce i915_queue_hangcheck To run hangcheck in near future. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 12:44:02 +02:00
Ben Widawsky	59124506ba	drm/i915: store eLLC size The eLLC cannot be determined by PCIID because as far as we know, even machines supporting eLLC may not have it enabled, or fused off or whatever. It's possible this isn't actually true, and at that point we can switch to a DEV_INFO flag instead. I've defined everything where the docs are clear, and left the rest as magic. But we need it before we set the pte_encode function pointers, which happens really early, in gtt_init. The problem with just doing the normal sequence earlier is we don't have the ability to use forcewake until after the pte functions have been set up. Since all solutions are somewhat ugly (barring rewriting all the init ordering), I've opted to do the detection really early, and the enabling later - since the register to detect doesn't require forcewake. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Move dev_priv->ellc_size away from the dri1 dungeon to a nice place right next to the l3 parity stuff. Also squash in the follow-up commit to read out the eLLC size a bit earlier.] Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 08:08:21 +02:00
Ben Widawsky	05e21cc43d	drm/i915: Define some of the eLLC magic The EDRAM present register isn't really defined in the docs. It just says check to see if it's set to 1. So I haven't defined the 1 value not knowing what it actually means. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-16 08:00:52 +02:00
Chris Wilson	46a0b638f3	Revert "drm/i915: Workaround incoherence between fences and LLC across multiple CPUs" This reverts commit `25ff119` and the follow on for Valleyview commit `2dc8aae`. commit `25ff1195f8` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Apr 4 21:31:03 2013 +0100 drm/i915: Workaround incoherence between fences and LLC across multiple CPUs commit `2dc8aae06d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed May 22 17:08:06 2013 +0100 drm/i915: Workaround incoherence with fence updates on Valleyview Jon Bloomfield came up with a plausible explanation and cheap fix (drm/i915: Fix incoherence with fence updates on Sandybridge+) for the race condition, so lets run with it. This is a candidate for stable as the old workaround incurs a significant cost (calling wbinvd on all CPUs before performing the register write) for some workloads as noted by Carsten Emde. Link: http://lists.freedesktop.org/archives/intel-gfx/2013-June/028819.html References: https://www.osadl.org/?id=1543#c7602 References: https://bugs.freedesktop.org/show_bug.cgi?id=63825 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Carsten Emde <C.Emde@osadl.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-10 15:31:12 +02:00
Chris Wilson	d18b961903	drm/i915: Fix incoherence with fence updates on Sandybridge+ This hopefully fixes the root cause behind the workaround added in commit `25ff1195f8` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Apr 4 21:31:03 2013 +0100 drm/i915: Workaround incoherence between fences and LLC across multiple CPUs Thanks to further investigation by Jon Bloomfield, he realised that the 64-bit register might be broken up by the hardware into two 32-bit writes (a problem we have encountered elsewhere). This non-atomicity would then cause an issue where a second thread would see an intermediate register state (new high dword, old low dword), and this register would randomly be used in preference to its own thread register. This would cause the second thread to read from and write into a fairly random tiled location. Breaking the operation into 3 explicit 32-bit updates (first disable the fence, poke the upper bits, then poke the lower bits and enable) ensures that, given proper serialisation between the 32-bit register write and the memory transfer, that the fence value is always consistent. Armed with this knowledge, we can explain how the previous workaround work. The key to the corruption is that a second thread sees an erroneous fence register that conflicts and overrides its own. By serialising the fence update across all CPUs, we have a small window where no GTT access is occurring and so hide the potential corruption. This also leads to the conclusion that the earlier workaround was incomplete. v2: Be overly paranoid about the order in which fence updates become visible to the GPU to make really sure that we turn the fence off before doing the update, and then only switch the fence on afterwards. Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Carsten Emde <C.Emde@osadl.org> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-10 14:41:46 +02:00
Daniel Vetter	db1b76ca6a	drm/i915: don't frob mm.suspended when not using ums In kernel modeset driver mode we're in full control of the chip, always. So there's no need at all to set mm.suspended in i915_gem_idle. Hence move that out into the leavevt ioctl. Since i915_gem_idle doesn't suspend gem any more we can also drop the re-enabling for KMS in the thaw function. Also clean up the handling of mm.suspend at driver load by coalescing all the assignments. Stumbled over while reading through our resume code for unrelated reasons. v2: Shovel mm.suspended into the (newly created) ums dungeon as suggested by Chris Wilson. The plan is that once we've completely stopped relying on the register save/restore code we could shovel even that in there. v3: Improve the locking for the entervt/leavevt ioctls a bit by moving the dev->struct_mutex locking outside of i915_gem_idle. Also don't clear dev_priv->ums.mm_suspended for the kms case, we allocate it with kzalloc. Both suggested by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-10 14:30:25 +02:00
Chris Wilson	02978ff57a	drm/i915: Fix write-read race with multiple rings Daniel noticed a problem where is we wrote to an object with ring A in the middle of a very long running batch, then executed a quick batch on ring B before a batch that reads from the same object, its obj->ring would now point to ring B, but its last_write_seqno would be still relative to ring A. This would allow for the user to read from the object before the GPU had completed the write, as set_domain would only check that ring B had passed the last_write_seqno. To fix this simply (and inelegantly), we bump the last_write_seqno when switching rings so that the last_write_seqno is always relative to the current obj->ring. This fixes igt/tests/gem_write_read_ring_switch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org [danvet: Add note about the newly created igt which exercises this bug.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-10 10:41:55 +02:00
Linus Torvalds	2e17c5a97e	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "Okay this is the big one, I was stalled on the fbdev pull req as I stupidly let fbdev guys merge a patch I required to fix a warning with some patches I had, they ended up merging the patch from the wrong place, but the warning should be fixed. In future I'll just take the patch myself! Outside drm: There are some snd changes for the HDMI audio interactions on haswell, they've been acked for inclusion via my tree. This relies on the wound/wait tree from Ingo which is already merged. Major changes: AMD finally released the dynamic power management code for all their GPUs from r600->present day, this is great, off by default for now but also a huge amount of code, in fact it is most of this pull request. Since it landed there has been a lot of community testing and Alex has sent a lot of fixes for any bugs found so far. I suspect radeon might now be the biggest kernel driver ever :-P p.s. radeon.dpm=1 to enable dynamic powermanagement for anyone. New drivers: Renesas r-car display unit. Other highlights: - core: GEM CMA prime support, use new w/w mutexs for TTM reservations, cursor hotspot, doc updates - dvo chips: chrontel 7010B support - i915: Haswell (fbc, ips, vecs, watermarks, audio powerwell), Valleyview (enabled by default, rc6), lots of pll reworking, 30bpp support (this time for sure) - nouveau: async buffer object deletion, context/register init updates, kernel vp2 engine support, GF117 support, GK110 accel support (with external nvidia ucode), context cleanups. - exynos: memory leak fixes, Add S3C64XX SoC series support, device tree updates, common clock framework support, - qxl: cursor hotspot support, multi-monitor support, suspend/resume support - mgag200: hw cursor support, g200 mode limiting - shmobile: prime support - tegra: fixes mostly I've been banging on this quite a lot due to the size of it, and it seems to okay on everything I've tested it on." * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (811 commits) drm/radeon/dpm: implement vblank_too_short callback for si drm/radeon/dpm: implement vblank_too_short callback for cayman drm/radeon/dpm: implement vblank_too_short callback for btc drm/radeon/dpm: implement vblank_too_short callback for evergreen drm/radeon/dpm: implement vblank_too_short callback for 7xx drm/radeon/dpm: add checks against vblank time drm/radeon/dpm: add helper to calculate vblank time drm/radeon: remove stray line in old pm code drm/radeon/dpm: fix display_gap programming on rv7xx drm/nvc0/gr: fix gpc firmware regression drm/nouveau: fix minor thinko causing bo moves to not be async on kepler drm/radeon/dpm: implement force performance level for TN drm/radeon/dpm: implement force performance level for ON/LN drm/radeon/dpm: implement force performance level for SI drm/radeon/dpm: implement force performance level for cayman drm/radeon/dpm: implement force performance levels for 7xx/eg/btc drm/radeon/dpm: add infrastructure to force performance levels drm/radeon: fix surface setup on r1xx drm/radeon: add support for 3d perf states on older asics drm/radeon: set default clocks for SI when DPM is disabled ...	2013-07-09 16:04:31 -07:00
Xiong Zhang	067556084a	drm/i915: Correct obj->mm_list link to dev_priv->dev_priv->mm.inactive_list obj->mm_list link to dev_priv->mm.inactive_list/active_list obj->global_list link to dev_priv->mm.unbound_list/bound_list This regression has been introduced in commit `93927ca52a` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Jan 10 18:03:00 2013 +0100 drm/i915: Revert shrinker changes from "Track unbound pages" Cc: stable@vger.kernel.org Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> [danvet: Add regression notice.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-09 16:31:48 +02:00
Ben Widawsky	c6cfb32567	drm/i915: Embed drm_mm_node in i915 gem obj Embedding the node in the obj is more natural in the transition to VMAs which will also have embedded nodes. This change also helps transition away from put_block to remove node. Though it's quite an uncommon occurrence, it's somewhat convenient to not fail at bind time because we cannot allocate the node. Though in practice there are other allocations (like the request structure) which would probably make this point not terribly useful. Quoting Daniel: Note that the only difference between put_block and remove_node is that the former fills up the preallocation cache. Which we don't need anyway and hence is just wasted space. v2: Clean up the stolen preallocation code. Rebased on the reserve_node patches renames ggtt_ stuff to gtt_ stuff WARN_ON if the object is already bound (which doesn't mean it's in the bound list, tricky) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:36 +02:00
Ben Widawsky	edd41a870f	drm/i915: Kill obj->gtt_offset With the getters in place from the previous patch this members serves no purpose other than saving one spare pointer chase, which will be killed in the next patch anyway. Moving to VMAs, this members adds unnecessary confusion since an object may exist at different offsets in different VMs. v2: Properly preserve the stolen offset. This code is a bit hacky but it all goes away when we embed the drm_mm_node and removes the need for the incorrect patch I submitted previously: "Use gtt_space->start for stolen reservation" Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:35 +02:00
Ben Widawsky	f343c5f647	drm/i915: Getter/setter for object attributes Soon we want to gut a lot of our existing assumptions how many address spaces an object can live in, and in doing so, embed the drm_mm_node in the object (and later the VMA). It's possible in the future we'll want to add more getter/setter methods, but for now this is enough to enable the VMAs. v2: Reworked commit message (Ben) Added comments to the main functions (Ben) sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/.[ch] sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/.[ch] (Daniel) v3: Rebased on new reserve_node patch Changed DRM_DEBUG_KMS to actually work (will need fixing later) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-08 22:04:34 +02:00
Chris Wilson	d26e3af842	drm/i915: Refactor the wait_rendering completion into a common routine Harmonise the completion logic between the non-blocking and normal wait_rendering paths, and move that logic into a common function. In the process, we note that the last_write_seqno is by definition the earlier of the two read/write seqnos and so all successful waits will have passed the last_write_seqno. Therefore we can unconditionally clear the write seqno and its domains in the completion logic. v2: Add the missing ring parameter, because sometimes it is good to have things compile. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:15:01 +02:00
Chris Wilson	daa13e1ca5	drm/i915: Only clear write-domains after a successful wait-seqno In the introduction of the non-blocking wait, I cut'n'pasted the wait completion code from normal locked path. Unfortunately, this neglected that the normal path returned early if the wait returned early. The result is that read-only waits may return whilst the GPU is still writing to the bo. Fixes regression from commit `3236f57a01` [v3.7] Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Aug 24 09:35:09 2012 +0100 drm/i915: Use a non-blocking wait for set-to-domain ioctl Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66163 Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:15:00 +02:00
Jani Nikula	3765f30486	drm/i915: fix build warning on format specifier mismatch drivers/gpu/drm/i915/i915_gem.c: In function ‘i915_gem_object_bind_to_gtt’: drivers/gpu/drm/i915/i915_gem.c:3002:3: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 5 has type ‘size_t’ [-Wformat] v2: Use %zu instead of %d. Two char patch, and 100% wrong. (Ville) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:14:43 +02:00
Konrad Rzeszutek Wilk	1625e7e549	drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. Git commit `90797e6d1e` ("drm/i915: create compact dma scatter lists for gem objects") makes certain assumptions about the under laying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] ERROR pin & fence failed [drm:intel_crtc_set_config] ERROR failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). That unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is a an entry of 512KB of virtual contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mention git commit the SG entries were of 4KB, and the code introduced by above git commit squashed the CPU contiguous PFNs in one big virtual address provided to DMA API. This patch is a simple semi-revert - were we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the the '.get_pages' and the i915_gem_gtt_prepare_object to retry with smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: Chris Wilson <chris@chris-wilson.co.uk> CC: Imre Deak <imre.deak@intel.com> CC: Daniel Vetter <daniel.vetter@ffwll.ch> CC: David Airlie <airlied@linux.ie> CC: <dri-devel@lists.freedesktop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-07-01 11:14:42 +02:00
Dave Airlie	28419261b0	Merge tag 'drm-intel-next-2013-06-18' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Last 3.11 feature pull. I have a few odds bits and pieces and fixes in my queue, I'll sort them out later on to see what's for 3.11-fixes and what's for 3.12. But nothing to hold this here up imo. Highlights: - more hangcheck work from Mika and Chris to prepare for arb robustness - trickle feed fixes from Ville - first parts of the shared pch pll rework, with some basic hw state readout and cross-checking (this shuts up the confused pch pll refcount WARN that Linus just recently forwarded) - Haswell audio power well support from Wang Xingchao (alsa bits acked by Takashi) - some cleanups and asserts sprinkling around the plane/gamma enabling sequence from Ville - more gtt refactoring from Ben - clear up the adjusted->mode vs. pixel clock vs. port clock confusion - 30bpp support, this time for real hopefully * tag 'drm-intel-next-2013-06-18' of git://people.freedesktop.org/~danvet/drm-intel: (97 commits) drm/i915: remove a superflous semi-colon drm/i915: Kill useless "Enable panel fitter" comments drm/i915: Remove extra "ring" from error message drm/i915: simplify the reduced clock handling for pch plls drm/i915: stop killing pfit on i9xx drm/i915: explicitly set up PIPECONF (and gamma table) on haswell drm/i915: set up PIPECONF explicitly for i9xx/vlv platforms drm/i915: set up PIPECONF explicitly on ilk-ivb drm/i915: find guilty batch buffer on ring resets drm/i915: store ring hangcheck action drm/i915: add batch bo to i915_add_request() drm/i915: change i915_add_request to macro drm/i915: add i915_gem_context_get_hang_stats() drm/i915: add struct i915_ctx_hang_stats drm/i915: Try harder to disable trickle feed on VLV drm/i915: fix up pch pll enabling for pixel multipliers drm/i915: hw state readout and cross-checking for shared dplls drm/i915: WARN on lack of shared dpll drm/i915: split up intel_modeset_check_state drm/i915: extract readout_hw_state from setup_hw_state ... Conflicts: drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_fb.c drivers/gpu/drm/i915/intel_sdvo.c	2013-06-28 09:50:34 +10:00
Dave Airlie	4300a0f8bd	Linux 3.10-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQEbBAABAgAGBQJRxf9cAAoJEHm+PkMAQRiGMWkH911xM4gRmFgE7SqVW4F4AWBm ngcqMqNy9IdqKfibORUUDvVfEa5gjD5ai2quIKpfQiaukbpQJ696H90ijuAkajLn DQBrN243s0pzhhc/quWINnWxsFQ613JjdUMUMaD7e9A1aKjYzWrPGt/tSjrFXGCP tArTupVzc/iOmnEQDKiROI/Nokq44QJ36aTGPM7n08xMtpKmkCXM+9/UosBteB0O HVI33dmjwz7i55fI53XAWyuZCE+gSEnA4z8spJ9LfXso2W14V+roc+GuL6OyeeTI pCn/+4niVPb4B0ROZlpyVmdZjbPPcMMEK5o+BSJI68SH6LHZTQh2iVuqYfpSyA== =uUH5 -----END PGP SIGNATURE----- Merge tag 'v3.10-rc7' into drm-next Linux 3.10-rc7 The sdvo lvds fix in this -fixes pull commit `c3456fb3e4` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Jun 10 09:47:58 2013 +0200 drm/i915: prefer VBT modes for SVDO-LVDS over EDID has a silent functional conflict with commit `990256aec2` Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Fri May 31 12:17:07 2013 +0000 drm: Add probed modes in probe order in drm-next. W simply need to add the vbt modes before edid modes, i.e. the other way round than now. Conflicts: drivers/gpu/drm/drm_prime.c drivers/gpu/drm/i915/intel_sdvo.c	2013-06-27 20:40:44 +10:00
Konrad Rzeszutek Wilk	426729dcc7	drm/i915: make compact dma scatter lists creation work with SWIOTLB backend. Git commit `90797e6d1e` ("drm/i915: create compact dma scatter lists for gem objects") makes certain assumptions about the under laying DMA API that are not always correct. On a ThinkPad X230 with an Intel HD 4000 with Xen during the bootup I see: [drm:intel_pipe_set_base] ERROR pin & fence failed [drm:intel_crtc_set_config] ERROR failed to set mode on [CRTC:3], err = -28 Bit of debugging traced it down to dma_map_sg failing (in i915_gem_gtt_prepare_object) as some of the SG entries were huge (3MB). That unfortunately are sizes that the SWIOTLB is incapable of handling - the maximum it can handle is a an entry of 512KB of virtual contiguous memory for its bounce buffer. (See IO_TLB_SEGSIZE). Previous to the above mention git commit the SG entries were of 4KB, and the code introduced by above git commit squashed the CPU contiguous PFNs in one big virtual address provided to DMA API. This patch is a simple semi-revert - were we emulate the old behavior if we detect that SWIOTLB is online. If it is not online then we continue on with the new compact scatter gather mechanism. An alternative solution would be for the the '.get_pages' and the i915_gem_gtt_prepare_object to retry with smaller max gap of the amount of PFNs that can be combined together - but with this issue discovered during rc7 that might be too risky. Reported-and-Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: Chris Wilson <chris@chris-wilson.co.uk> CC: Imre Deak <imre.deak@intel.com> CC: Daniel Vetter <daniel.vetter@ffwll.ch> CC: David Airlie <airlied@linux.ie> CC: <dri-devel@lists.freedesktop.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-06-25 10:39:57 +10:00
Chris Wilson	19b2dbde57	drm/i915: Restore fences after resume and GPU resets Stéphane Marchesin found that fences for pinned objects (i.e. the scanout) were not being restored upon resume, leading to corruption on the display and reference counting issues. This is due to a bug in commit `312817a39f` [2.6.38] Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Nov 22 11:50:11 2010 +0000 drm/i915: Only save and restore fences for UMS that zapped the pinned fences even though they were in use. Fortuitously, whilst we forced a VT switch during suspend and resume, no fences were ever pinned at the time. However, we now can do switchless S3 transitions and so the old bug finally surfaces. Reported-by: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Stéphane Marchesin <marcheu@chromium.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-16 01:10:45 +02:00
Mika Kuoppala	aa60c664e6	drm/i915: find guilty batch buffer on ring resets After hang check timer has declared gpu to be hung, rings are reset. In ring reset, when clearing request list, do post mortem analysis to find out the guilty batch buffer. Select requests for further analysis by inspecting the completed sequence number which has been updated into the HWS page. If request was completed, it can't be related to the hang. For noncompleted requests mark the batch as guilty if the ring was not waiting and the ring head was stuck inside the buffer object or in the flush region right after the batch. For everything else, mark them as innocents. v2: Fixed a typo in commit message (Ville Syrjälä) v3: - more descriptive function parameters (Chris Wilson) - use masked head address when inspecting if request is in ring - s/hangcheck.last_action/hangcheck.action - added comment about unmasked head hitting batch_obj range Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:17 +02:00
Mika Kuoppala	7d736f4f0b	drm/i915: add batch bo to i915_add_request() In order to track down a batch buffer and context which caused the ring to hang, store reference to bo into the request struct. Request can also cause gpu to hang after the batch in the flush section in the ring. To detect this add start of the flush portion offset into the request. v2: Included comment about request vs batch_obj lifetimes (Chris Wilson) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:16 +02:00
Mika Kuoppala	0025c0772d	drm/i915: change i915_add_request to macro Only execbuffer needed all the parameters on i915_add_request(). By putting __i915_add_request behind macro, all current callsites become cleaner. Following patch will introduce a new parameter for __i915_add_request. With this patch, only the relevant callsite will reflect the change making commit smaller and easier to understand. v2: _i915_add_request as function name (Chris Wilson) v3: change name __i915_add_request and fix ordering of params (Ben Widawsky) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-13 17:42:15 +02:00
Dave Airlie	e6dfcc5303	Merge tag 'drm-intel-next-2013-06-01' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Another round of drm-intel-next for 3.11. Highlights: - Haswell IPS support (Paulo Zanoni) - VECS support on Haswell (Ben Widawsky, Xiang Haihao, ...) - Haswell watermark fixes (Paulo Zanoni) - "Make the gun bigger again" multithread fence fix from Chris. - i915_error_state finnally no longer fails with -ENOMEM! Big thanks to Mika for tackling this. - vlv sideband locking fixes from Jani - Hangcheck prep work for arb_robustness support (Mika&Chris) - edp vs cpu port confusion clean-up from Imre - pile of smaller fixes and cleanups all over. * tag 'drm-intel-next-2013-06-01' of git://people.freedesktop.org/~danvet/drm-intel: (70 commits) drm/i915: add i915_ips_status debugfs entry drm/i915: add enable_ips module option drm/i915: implement IPS feature drm/i915: fix up the edp power well check drm/i915: add I915_PARAM_HAS_VEBOX to i915_getparam drm/i915: add I915_EXEC_VEBOX to i915_gem_do_execbuffer() drm/i915: add VEBOX into debugfs drm/i915: Enable vebox interrupts drm/i915: vebox interrupt get/put drm/i915: consolidate interrupt naming scheme drm/i915: Convert irq_refounct to struct drm/i915: make PM interrupt writes non-destructive drm/i915: Add PM regs to pre/post install drm/i915: Create an ivybridge_irq_preinstall drm/i915: Create a more generic pm handler for hsw+ drm/i915: add support for 5/6 data buffer partitioning on Haswell drm/i915: properly set HSW WM_LP watermarks drm/i915: properly set HSW WM_PIPE registers drm/i915: fix pch_nop support drm/i915: Vebox ringbuffer init ...	2013-06-11 08:38:56 +10:00
Daniel Vetter	7abb690a0e	drm/i915: Fix spurious -EIO/SIGBUS on wedged gpus Chris Wilson noticed that since commit `1f83fee08d` [v3.9] Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Thu Nov 15 17:17:22 2012 +0100 drm/i915: clear up wedged transitions X can again get -EIO when it does not expect it. And even worse score a SIGBUS when accessing gtt mmaps. The established ABI is that we _only_ return an -EIO from execbuf - all other ioctls should just work. And since the reset code moves all bos out of gpu domains and clears out all the last_seqno/ring tracking there really shouldn't be any reason for non-execbuf code to ever touch the hw and see an -EIO. After some extensive discussions we've noticed that these spurios -EIO are caused by i915_gem_wait_for_error: http://www.mail-archive.com/intel-gfx@lists.freedesktop.org/msg20540.html That is easy to fix by returning 0 instead of -EIO, since grabbing the dev->struct_mutex does not yet mean that we actually want to touch the hw. And so there is no reason at all to fail with -EIO. But that's not the entire since, since often (at least it's easily googleable) dmesg indicates that the reset fails and we declare the gpu wedged. Then, quite a bit later X wakes up with the "Timed out waiting for the gpu reset to complete" DRM_ERROR message in wait_for_errror and brings down the desktop with an -EIO/SIGBUS. So clearly we're missing a wakeup somewhere, since the gpu reset just doesn't take 10 seconds to complete. And indeed we're do handle the terminally wedged state wrong. Fix this all up. References: https://bugs.freedesktop.org/show_bug.cgi?id=63921 References: https://bugs.freedesktop.org/show_bug.cgi?id=64073 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Damien Lespiau <damien.lespiau@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-03 14:35:18 +02:00
Ben Widawsky	35c20a60c7	drm/i915: Rename the gtt_list to global_list Since it will be used for the global bound/unbound list with full PPGTT, this helps clarify things for upcoming code rework. Recommended-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-03 10:51:14 +02:00
Ben Widawsky	401c29f607	drm/i915: unpin pages at unbind If we properly keep track of the pages_pin_count, then when we later add multiple address spaces, the put_pages doesn't need any special checks to be able to perform it's job. CC: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Rebased on top of the fix for stolen memory pinning.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-03 10:50:22 +02:00
Ben Widawsky	1d64ae719b	drm/i915: Unpin stolen pages The way the stolen handling works is we take a pin on the backing pages, but we never actually get a reference to the bo. On freeing objects allocated with stolen memory, the final unref will end up freeing the object with pinned pages count left. To enable an assertion to catch bugs in this code path, this patch cleans up that remaining pin. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-06-03 10:49:08 +02:00
Ben Widawsky	9a8a2213a7	drm/i915: Vebox ringbuffer init v2: Add set_seqno which didn't exist before rebase (Haihao) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:54:12 +02:00
Ben Widawsky	0a9ae0d7f8	drm/i915: pre-fixes for checkpatch Since I'll need to modify i915_gem_object_bind_to_gtt(), fix the errors now to get checkpatch to not complain. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Resolve conflict with Chris' improved debug output, and bikeshed the new variable with s/max/gtt_max/ a bit while at it.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-31 20:53:56 +02:00
Dave Airlie	e81f3d81e2	Merge tag 'drm-intel-next-2013-05-20-merged' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Highlights (copy-pasted from my testing cycle mails): - fbc support for Haswell (Rodrigo) - streamlined workaround comments, including an igt tool to grep for them (Damien) - sdvo and TV out cleanups, including a fixup for sdvo multifunction devices - refactor our eDP mess a bit (Imre) - don't register the hdmi connector on haswell when desktop eDP is present - vlv support is no longer preliminary! - more vlv fixes from Jesse for stolen and dpll handling - more flexible power well checking infrastructure from Paulo - a few gtt patches from Ben - a bit of OCD cleanups for transcoder #defines and an assorted pile of smaller things. - fixes for the gmch modeset sequence - a bit of OCD around plane/pipe usage (Ville) - vlv turbo support (Jesse) - tons of vlv modeset fixes (Jesse et al.) - vlv pte write fixes (Kenneth Graunke) - hpd filtering to avoid costly probes on unaffected outputs (Egbert Eich) - intel dev_info cleanups and refactorings (Damien) - vlv rc6 support (Jesse) - random pile of fixes around non-24bpp modes handling - asle/opregion cleanups and locking fixes (Jani) - dp dpll refactoring - improvements for reduced_clock computation on g4x/ilk+ - pfit state refactored to use pipe_config (Jesse) - lots more computed modeset state moved to pipe_config, including readout and cross-check support - fdi auto-dithering for ivb B/C links, using the neat pipe_config improvements - drm_rect helpers plus sprite clipping fixes (Ville) - hw context refcounting (Mika + Ben) * tag 'drm-intel-next-2013-05-20-merged' of git://people.freedesktop.org/~danvet/drm-intel: (155 commits) drm/i915: add support for dvo Chrontel 7010B drm/i915: Use pipe config state to control gmch pfit enable/disable drm/i915: Use pipe_config state to disable ilk+ pfit drm/i915: panel fitter hw state readout&check support drm/i915: implement WADPOClockGatingDisable for LPT drm/i915: Add missing platform tags to FBC workaround comments drm/i915: rip out an unused lvds_reg variable drm/i915: Compute WR PLL dividers dynamically drm/i915: HSW FBC WaFbcDisableDpfcClockGating drm/i915: HSW FBC WaFbcAsynchFlipDisableFbcQueue drm/i915: Enable FBC at Haswell. drm/i915: IVB FBC WaFbcDisableDpfcClockGating drm/i915: IVB FBC WaFbcAsynchFlipDisableFbcQueue drm/i915: Add support for FBC on Ivybridge. drm/i915: Organize VBT stuff inside drm_i915_private drm/i915: make SDVO TV-out work for multifunction devices drm/i915: rip out now unused is_foo tracking from crtc code drm/i915: rip out TV-out lore ... drm/i915: drop TVclock special casing on ilk+ drm/i915: move sdvo TV clock computation to intel_sdvo.c ...	2013-05-31 12:56:05 +10:00
Chris Wilson	2dc8aae06d	drm/i915: Workaround incoherence with fence updates on Valleyview In commit `25ff1195f8` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Apr 4 21:31:03 2013 +0100 drm/i915: Workaround incoherence between fences and LLC across multiple CPUs we introduced an empirical workaround for memory corruption when using fences from multiple CPUs. At the time, we did not have any results for Valleyview, so the presumption was that it was limited to recent generations using LLC. Now we have evidence that Valleyview also suffers incoherence and requires a similar but different workaround. For Valleyview, the wbinvd instruction is insufficient and we require the serialising register write per-CPU. Conversely, that serialising register write is not enough for SNB/IVB/HSW. To compromise and keep the code relatively clean, employ both serialisation techniques in the same workaround. Reported-by: Jon Bloomfield <jon.bloomfield@intel.com> Tested-by: Jon Bloomfield <jon.bloomfield@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-23 12:51:31 +02:00
Chris Wilson	a36689cb77	drm/i915: Be more informative when reporting "too large for aperture" error This should help debugging the truly unexpected cases where it occurs - in particular to see which value is garbage. References: https://bugzilla.kernel.org/show_bug.cgi?id=58511 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: s/%ld/%zd/ as spotted by Wu Fengguang's autobuilder.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-23 12:51:29 +02:00
Imre Deak	e054cc3937	drm/i915: avoid premature timeouts in __wait_seqno() At the moment wait_event_timeout/wait_event_interruptible_timeout may time out 1 jiffy too early, as the calculated expiry time is 1 less than needed. Besides timing out too early this also means that the calculation of the remaining time will be incorrect and we will pass a non-zero remaining time to user space in case of a time out. This is one reason for the following bugzilla report: Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64270 Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-22 13:51:23 +02:00
Daniel Vetter	e1b73cba13	Linux 3.10-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJRmpexAAoJEHm+PkMAQRiGrRIH/1uWFW38RvaCV/PXm/ia6Z+x BfBJfBIvPxGwb4n7aQNQlhU25xkfrPZ6szO4WiBH5/KPH3xYi2I2OZ1AzffkYqMF BWkPmsPK6EsTdp16zsi6JtH2aXArG4SpYA7ZamPvDkmfigHuiZg7GlL/9eHTRPNV P7Q8JToOrcnP8RoGgNj0uFiQeQbc62Kmoq7WuPtUhVlpQCCCknXgOJiYgz9w6Xe9 /i79YFS8WRrzAquExT1NbIOh4ZMqB9MvuroaVWy8JDDLUyz7QUvOCe3tCDNguwgi FdWvU6nfkdQq5SLaWCWXDE9Rp/pL1MvfBn9vCOwFcp42aw0aQ0PgJVIXvsqufd0= =jgDI -----END PGP SIGNATURE----- Merge tag 'v3.10-rc2' into drm-intel-next-queued Backmerge Linux 3.10-rc2 since the various (rather trivial) conflicts grew a bit out of hand. intel_dp.c has the only real functional conflict since the logic changed while dev_priv->edp.bpp was moved around. Also squash in a whitespace fixup from Ben Widawsky for i915_gem_gtt.c, git seems to do something pretty strange in there (which I don't fully understand tbh). Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_dp.c Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-21 09:52:16 +02:00
Mika Kuoppala	0e50e96bf2	drm/i915: add context into request struct Storing context reference into request struct allows us to inspect context and its associated objects when requests are retired. Both ppgtt and arb robustness work will need this. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-05-06 11:21:51 +02:00
Chris Wilson	4f42f4ef0d	drm/i915: Always normalize return timeout for wait_timeout_ioctl As we recompute the remaining timeout after waiting, there is a potential for that timeout to be less than zero and so need sanitizing. The timeout is always returned to userspace and validated, so we should always perform the sanitation. v2 [vsyrjala]: Only normalize the timespec if it's invalid v3: Add a comment to clarify the situation and remove the now useless WARN_ON() (ickle) Cc: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-30 10:50:32 +02:00
Ville Syrjälä	42b5aeabe9	drm/i915: IVB/HSW have 32 fence register Increase the number of fence registers to 32 on IVB/HSW. VLV however only has 16 fence registers according to the docs. Increasing the number of fences was attempted before [1], but there was some uncertainty about the maximum CPU fence number for FBC. Since then BSpec has been updated to state that there are in fact 32 fence registers, and the CPU fence number field in the SNB_DPFC_CTL_SA register is 5 bits, and the CPU fence number field in the ILK_DPFC_CONTROL register must be zero. So now it all makes sense. [1] http://lists.freedesktop.org/archives/intel-gfx/2011-October/012865.html v2: Include some background information based on the previous attempt Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:21 +02:00
Ben Widawsky	b7c36d2546	drm/i915: Allow PPGTT enable to fail I'm really not happy that we have to support this, but this will be the simplest way to handle cases where PPGTT init can fail, which I promise will be coming in the future. v2: Resolve conflicts due to patch series reordering. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:16 +02:00
Ben Widawsky	6197349bde	drm/i915: Abstract PPGTT enabling Since we've already set up a nice vtable to abstract other PPGTT functions, also abstract the actual register programming to enable things. This function will probably need to change a bit as we implement real processes. v2: Resolve conflicts due to patch series reordering. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:15 +02:00
Chris Wilson	25ff1195f8	drm/i915: Workaround incoherence between fences and LLC across multiple CPUs In order to fully serialize access to the fenced region and the update to the fence register we need to take extreme measures on SNB+, and manually flush writes to memory prior to writing the fence register in conjunction with the memory barriers placed around the register write. Fixes i-g-t/gem_fence_thrash v2: Bring a bigger gun v3: Switch the bigger gun for heavier bullets (Arjan van de Ven) v4: Remove changes for working generations. v5: Reduce to a per-cpu wbinvd() call prior to updating the fences. v6: Rewrite comments to ellide forgotten history. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62191 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Tested-by: Jon Bloomfield <jon.bloomfield@intel.com> (v2) Cc: stable@vger.kernel.org Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-18 09:43:10 +02:00
Ben Widawsky	88a2b2a32d	drm/i915: Don't wait for PCH on reset BIOS should be setting this, but in case it doesn't... v2: Define the bits we actually want to clear (Jesse) Make it an RMW op (Jesse) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-04-08 20:53:05 +02:00
Imre Deak	2db76d7c3c	lib/scatterlist: sg_page_iter: support sg lists w/o backing pages The i915 driver uses sg lists for memory without backing 'struct page' pages, similarly to other IO memory regions, setting only the DMA address for these. It does this, so that it can program the HW MMU tables in a uniform way both for sg lists with and without backing pages. Without a valid page pointer we can't call nth_page to get the current page in __sg_page_iter_next, so add a helper that relevant users can call separately. Also add a helper to get the DMA address of the current page (idea from Daniel). Convert all places in i915, to use the new API. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-27 17:13:44 +01:00
Chris Wilson	f9c513e9d6	drm/i915: Always call fence-lost prior to removing the fence There is a minute window for a race between put-fence removing the fence and for a new transaction by an external party on the GTT mmap. That is we must zap the mmap prior to removing the fence and not afterwards. Fixes regression from commit `61050808bb` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 17 15:31:31 2012 +0100 drm/i915: Refactor put_fence() to use the common fence writing routine v2: Remember the fence to remove with a local variable (gcc) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-26 20:16:18 +01:00
Imre Deak	90797e6d1e	drm/i915: create compact dma scatter lists for gem objects So far we created a sparse dma scatter list for gem objects, where each scatter list entry represented only a single page. In the future we'll have to handle compact scatter lists too where each entry can consist of multiple pages, for example for objects imported through PRIME. The previous patches have already fixed up all other places where the i915 driver _walked_ these lists. Here we have the corresponding fix to _create_ compact lists. It's not a performance or memory footprint improvement, but it helps to better exercise the new logic. Reference: http://www.spinics.net/lists/dri-devel/msg33917.html Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-23 12:17:09 +01:00
Imre Deak	67d5a50c04	drm/i915: handle walking compact dma scatter lists So far the assumption was that each dma scatter list entry contains only a single page. This might not hold in the future, when we'll introduce compact scatter lists, so prepare for this everywhere in the i915 code where we walk such a list. We'll fix the place _creating_ these lists separately in the next patch to help the reviewing/bisectability. Reference: http://www.spinics.net/lists/dri-devel/msg33917.html Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-23 12:16:36 +01:00
Daniel Vetter	0d4a42f6bd	Linux 3.9-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJRRkrbAAoJEHm+PkMAQRiGy3oH/jrbHinYs0auurANgx4TdtWT /WNajstKBqLOJJ6cnTR7sOqwOVlptt65EbbTs+qGyZ2Z2W/Lg0BMenHvNHo4ER8C e7UbMdBCSLKBjAMKh1XCoZscGv4Exm8WRH3Vc5yP0Hafj3EzSAVLY1dta9WKKoQi bh7D1ErUlbU1zczA1w5YbPF0LqFKRvyZOwebMCCAKAxv5wWAxmbcPNxVR4sufkjg k6TkQ2ysgWivZAfy3tJYOcxiEu7ahpZVEuYdlZEJQXHRQUfoNljQlOp4BqKsYUai 5A0kaf2VpKay/7pkhvTfBBcF/jFJ68pYP6gQ2ThNdr0b5kOiAfMWj030Xyngnhg= =iO9t -----END PGP SIGNATURE----- Merge tag 'v3.9-rc3' into drm-intel-next-queued Backmerge so that I can merge Imre Deak's coalesced sg entries fixes, which depend upon the new for_each_sg_page introduce in commit `a321e91b6d` Author: Imre Deak <imre.deak@intel.com> Date: Wed Feb 27 17:02:56 2013 -0800 lib/scatterlist: add simple page iterator The merge itself is just two trivial conflicts: Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-19 09:47:30 +01:00
Jesse Barnes	d62b4892f3	drm/i915: allow force wake at init time on VLV v2 We need to set the 'allow force wake' bit to enable forcewake handling later on. v2: split from clock gating patch (Jani) check for allowwakeack (Ville) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-19 09:38:32 +01:00
Ville Syrjälä	2bb4629add	drm/i915: Add to_user_ptr() to_user_ptr() simply casts a pointer passed as u64 from user space to void __user * correctly. Using this lets us get rid of all the tiresome casts. The idea came from Chris Wilson <chris@chris-wilson.co.uk>. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-03-03 19:49:11 +01:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Daniel Vetter	eb32e4584d	drm/i915: Use HAS_L3_GPU_CACHE in i915_gem_l3_remap Yet another remnant ... this might explain why l3 remapping didn't really work on HSW. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57441 Spotted-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:47 +01:00
Imre Deak	769ce4643b	drm/i915: don't clflush gem objects in stolen memory As explained by Chris Wilson gem objects in stolen memory are always coherent with the GPU so we don't need to ever flush the CPU caches for these. This fixes a breakage - at least with the compact sg patches applied - during the resume/restore gtt mappings path, when we tried to clflush an FB object in stolen memory, but since stolen objects don't have backing pages we passed an invalid page pointer to drm_clflush_page(). Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-20 00:21:43 +01:00
Ben Widawsky	4fc7c971c3	drm/i915: Extract ring init from hw_init The ring initialization will differ a bit in upcoming generations, and this split will prepare the code for what's needed. This patch also fixes a bug introduced in: commit `9943393195` Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Jan 22 14:12:17 2013 +0200 drm/i915: use gem_set_seqno() on hardware init After doing the extraction, the bad error handling became obvious. I acknowledge that this should be two patches, but it's a pretty small/trivial patch. If requested, I can certainly do the fix as a distinct patch. v2: Should be cleanup blt, not init blt on failure (Chris) v3: Forgot to git add on v2 Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-02-15 10:30:39 +01:00
Dave Airlie	cd17ef4114	Merge tag 'drm-intel-next-2013-02-01' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: "Probably the last feature pull for 3.9, there's some fixes outstanding thought that I'd like to sneak in. And maybe 3.8 takes a bit longer ... Anyway, highlights of this pull: - Kill the horrible IS_DISPLAYREG hack to handle the mmio offset movements on vlv, big thanks to Ville. - Dynamic power well support for Haswell, shaves away a bit when only using the eDP port on pipe A (Paulo). Plus unclaimed register fixes uncovered by this. - Clarifications of the gpu hang/reset state transitions, hopefully fixing a few spurious -EIO deaths in userspace. - Haswell ELD fixes. - Some more (pp)gtt cleanups from Ben. - A few smaller things all over. Plus all the stuff from the previous rather small pull request: - Broadcast RBG improvements and reduced color range fixes from Ville. - Ben is on a "kill legacy gtt code for good" spree, first pile of patches included. - No-relocs and bo lut improvements for faster execbuf from Chris. - Some refactorings from Imre." * tag 'drm-intel-next-2013-02-01' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits) GPU/i915: Fix acpi_bus_get_device() check in drivers/gpu/drm/i915/intel_opregion.c drm/i915: Set the SR01 "screen off" bit in i915_redisable_vga() too drm/i915: Kill IS_DISPLAYREG() drm/i915: Introduce i915_vgacntrl_reg() drm/i915: gen6_gmch_remove can be static drm/i915: dynamic Haswell display power well support drm/i915: check the power down well on assert_pipe() drm/i915: don't send DP "idle" pattern before "normal" on HSW PORT_A drm/i915: don't run hsw power well code on !hsw drm/i915: kill cargo-culted locking from power well code drm/i915: Only run idle processing from i915_gem_retire_requests_worker drm/i915: Fix CAGF for HSW drm/i915: Reclaim GTT space for failed PPGTT drm/i915: remove intel_gtt structure drm/i915: Add probe and remove to the gtt ops drm/i915: extract hw ppgtt setup/cleanup code drm/i915: pte_encode is gen6+ drm/i915: vfuncs for ppgtt drm/i915: vfuncs for gtt_clear_range/insert_entries drm/i915: Error state should print /sys/kernel/debug ...	2013-02-08 11:08:10 +10:00
Chris Wilson	725a5b5402	drm/i915: Only run idle processing from i915_gem_retire_requests_worker When adding the fb idle detection to mark-inactive, it was forgotten that userspace can drive the processing of retire-requests. We assumed that it would be principally driven by the retire requests worker, running once every second whilst active and so we would get the deferred timer for free. Instead we spend too many CPU cycles reclocking the LVDS preventing real work from being done. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-and-tested-by: Alexander Lam <lambchop468@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58843 Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-31 11:50:09 +01:00
Mika Kuoppala	9943393195	drm/i915: use gem_set_seqno() on hardware init When machine was rebooted or module was reloaded, gem_hw_init() set last_seqno to be identical to next_seqno. This lead to situation that waits for first ever request always passed immediately regardless if it was actually executed. Use gem_set_seqno() to be consistent how hw is initialized on init, wrap and on resume. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-22 13:52:26 +01:00
Daniel Vetter	f69061bedd	drm/i915: create a race-free reset detection With the previous patch the state transition handling of the reset code itself is now (hopefully) race free and solid. But that still leaves out everyone else - with the various lock-free wait paths we have there's the possibility that the reset happens between the point where we read the seqno we should wait on and the actual wait. And if __wait_seqno then never sees the RESET_IN_PROGRESS state, we'll happily wait for a seqno which will in all likelyhood never signal. In practice this is not a big problem since the X server gets constantly interrupted, and can then submit more work (hopefully) to unblock everyone else: As soon as a new seqno write lands, all waiters will unblock. But running the i-g-t reset testcase ZZ_hangman can expose this race, especially on slower hw with fewer cpu cores. Now looking forward to ARB_robustness and friends that's not the best possible behaviour, hence this patch adds a reset_counter to be able to detect any reset, even if a given thread never observed the in-progress state. The important part is to correctly order things: - The write side needs to increment the counter after any seqno gets reset. Hence we need to do that at the end of the reset work, and again wake everyone up. We also need to place a barrier in between any possible seqno changes and the counter increment, since any unlock operations only guarantee that nothing leaks out, but not that at later load operation gets moved ahead. - On the read side we need to ensure that no reset can sneak in and invalidate the seqno. In all cases we can use the one-sided barrier that unlock operations guarantee (of the lock protecting the respective seqno/ring pair) to ensure correct ordering. Hence it is sufficient to place the atomic read before the mutex/spin_unlock and no additional barriers are required. The end-result of all this is that we need to wake up everyone twice in a reset operation: - First, before the reset starts, to get any lockholders of the locks, so that the reset can proceed. - Second, after the reset is completed, to allow waiters to properly and reliably detect the reset condition and bail out. I admit that this entire reset_counter thing smells a bit like overkill, but I think it's justified since it makes it really explicit what the bail-out condition is. And we need a reset counter anyway to implement ARB_robustness, and imo with finer-grained locking on the horizont this is the most resilient scheme I could think of. v2: Drop spurious change in the wait_for_error EXIT_COND - we only need to wait until we leave the reset-in-progress wedged state. v3: Don't play tricks with barriers in the throttle ioctl, the spin_unlock is barrier enough. I've also considered using a little helper to grab the current reset_counter, but then decided that hiding the atomic_read isn't a great idea, since having it explicitly show up in the code is a nice remainder to reviews to check the memory barriers. v4: Add a comment to explain why we need to fall through in __wait_seqno in the end variable assignments. v5: Review from Damien: - s/smb/smp/ in a comment - don't increment the reset counter after we've set it to WEDGED. Now we (again) properly wedge the gpu when the reset fails. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-21 19:53:54 +01:00
Dave Airlie	735dc0d1e2	Merge branch 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel into drm-next The aim of this locking rework is that ioctls which a compositor should be might call for every frame (set_cursor, page_flip, addfb, rmfb and getfb/create_handle) should not be able to block on kms background activities like output detection. And since each EDID read takes about 25ms (in the best case), that always means we'll drop at least one frame. The solution is to add per-crtc locking for these ioctls, and restrict background activities to only use the global lock. Change-the-world type of events (modeset, dpms, ...) need to grab all locks. Two tricky parts arose in the conversion: - A lot of current code assumes that a kms fb object can't disappear while holding the global lock, since the current code serializes fb destruction with it. Hence proper lifetime management using the already created refcounting for fbs need to be instantiated for all ioctls and interfaces/users. - The rmfb ioctl removes the to-be-deleted fb from all active users. But unconditionally taking the global kms lock to do so introduces an unacceptable potential stall point. And obviously changing the userspace abi isn't on the table, either. Hence this conversion opportunistically checks whether the rmfb ioctl holds the very last reference, which guarantees that the fb isn't in active use on any crtc or plane (thanks to the conversion to the new lifetime rules using proper refcounting). Only if this is not the case will the code go through the slowpath and grab all modeset locks. Sane compositors will never hit this path and so avoid the stall, but userspace relying on these semantics will also not break. All these cases are exercised by the newly added subtests for the i-g-t kms_flip, tested on a machine where a full detect cycle takes around 100 ms. It works, and no frames are dropped any more with these patches applied. kms_flip also contains a special case to exercise the above-describe rmfb slowpath. * 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel: (335 commits) drm/fb_helper: check whether fbcon is bound drm/doc: updates for new framebuffer lifetime rules drm: don't hold crtc mutexes for connector ->detect callbacks drm: only grab the crtc lock for pageflips drm: optimize drm_framebuffer_remove drm/vmwgfx: add proper framebuffer refcounting drm/i915: dump refcount into framebuffer debugfs file drm: refcounting for crtc framebuffers drm: refcounting for sprite framebuffers drm: fb refcounting for dirtyfb_ioctl drm: don't take modeset locks in getfb ioctl drm: push modeset_lock_all into ->fb_create driver callbacks drm: nest modeset locks within fpriv->fbs_lock drm: reference framebuffers which are on the idr drm: revamp framebuffer cleanup interfaces drm: create drm_framebuffer_lookup drm: revamp locking around fb creation/destruction drm: only take the crtc lock for ->cursor_move drm: only take the crtc lock for ->cursor_set drm: add per-crtc locks ...	2013-01-21 07:44:58 +10:00
Chris Wilson	97c809fd9c	drm/i915: Only apply the mb() when flushing the GTT domain during a finish Now that we seem to have brought order to the GTT barriers, the last one to review is the terminal barrier before we unbind the buffer from the GTT. This needs to only be performed if the buffer still resides in the GTT domain, and so we can skip some needless barriers otherwise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:17 +01:00
Chris Wilson	d0a57789d5	drm/i915: Only insert the mb() before updating the fence parameter With a fence, we only need to insert a memory barrier around the actual fence alteration for CPU accesses through the GTT. Performing the barrier in flush-fence was inserting unnecessary and expensive barriers for never fenced objects. Note removing the barriers from flush-fence, which was effectively a barrier before every direct access through the GTT, revealed that we where missing a barrier before the first access through the GTT. Lack of that barrier was sufficient to cause GPU hangs. v2: Add a couple more comments to explain the new barriers Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:16 +01:00
Daniel Vetter	1f83fee08d	drm/i915: clear up wedged transitions We have two important transitions of the wedged state in the current code: - 0 -> 1: This means a hang has been detected, and signals to everyone that they please get of any locks, so that the reset work item can do its job. - 1 -> 0: The reset handler has completed. Now the last transition mixes up two states: "Reset completed and successful" and "Reset failed". To distinguish these two we do some tricks with the reset completion, but I simply could not convince myself that this doesn't race under odd circumstances. Hence split this up, and add a new terminal state indicating that the hw is gone for good. Also add explicit #defines for both states, update comments. v2: Split out the reset handling bugfix for the throttle ioctl. v3: s/tmp/wedged/ sugested by Chris Wilson. Also fixup up a rebase error which prevented this patch from actually compiling. v4: To unify the wedged state with the reset counter, keep the reset-in-progress state just as a flag. The terminally-wedged state is now denoted with a big number. v5: Add a comment to the reset_counter special values explaining that WEDGED & RESET_IN_PROGRESS needs to be true for the code to be correct. v6: Fixup logic errors introduced with the wedged+reset_counter unification. Since WEDGED implies reset-in-progress (in a way we're terminally stuck in the dead-but-reset-not-completed state), we need ensure that we check for this everywhere. The specific bug was in wait_for_error, which would simply have timed out. v7: Extract an inline i915_reset_in_progress helper to make the code more readable. Also annote the reset-in-progress case with an unlikely, to help the compiler optimize the fastpath. Do the same for the terminally wedged case with i915_terminally_wedged. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:16 +01:00
Daniel Vetter	308887aad1	drm/i915: fix reset handling in the throttle ioctl While auditing the code I've noticed one place (the throttle ioctl) which does not yet wait for the reset handler to complete and doesn't properly decode the wedge state into -EAGAIN/-EIO. Fix this up by calling the right helpers. This might explain the oddball "my compositor just died in a successfull gpu reset" reports. Or maybe not, since current mesa doesn't use this ioctl to throttle command submission. The throttle ioctl doesn't take the struct_mutex, so to avoid busy-looping with -EAGAIN while a reset is in process, check for errors first and wait for the handler to complete if a reset is pending by calling i915_gem_wait_for_error. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	33196dedda	drm/i915: move wedged to the other gpu error handling stuff And to make Ben Widawsky happier, use the gpu_error instead of the entire device as the argument in some functions. Drop the outdated comment on ->wedged for now, a follow-up patch will change the semantics and add a proper comment again. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	99584db33b	drm/i915: extract hangcheck/reset/error_state state into substruct This has been sprinkled all over the place in dev_priv. I think it'd be good to also move all the code into a separate file like i915_gem_error.c, but that's for another patch. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:14 +01:00
Ben Widawsky	93d187993b	drm/i915: Remove use of gtt_mappable_entries Mappable_end, ie. size is almost always what you want as opposed to the number of entries. Since we already have that information, we can scrap the number of entries and only calculate it when needed. If gtt_start is !0, this will have slightly different behavior. This difference can only occur in DRI1, and exists when we try to kick out the firmware fb. The new code seems like a bugfix to me. The other case where we've changed the behavior is during init we check the mappable region against our current known upper and lower limits (64MB, and 512MB). This now matches the comment, and makes things more convenient after removing gtt_mappable_entries. Also worth noting is the setting of mappable_end is taken out of setup because we do it earlier now in the DRI2 case and therefore need to add that tiny hunk to support the DRI1 IOCTL. v2: Move up mappable end to before legacy AGP init v3: Add the dev_priv inclusion here from previous rebase error in patch 5 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: squash in fix for a printk format flag mismatch warning.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:09:20 +01:00
Ben Widawsky	5d4545aef5	drm/i915: Create a gtt structure The purpose of the gtt structure is to help isolate our gtt specific properties from the rest of the code (in doing so it help us finish the isolation from the AGP connection). The following members are pulled out (and renamed): gtt_start gtt_total gtt_mappable_end gtt_mappable gtt_base_addr gsm The gtt structure will serve as a nice place to put gen specific gtt routines in upcoming patches. As far as what else I feel belongs in this structure: it is meant to encapsulate the GTT's physical properties. This is why I've not added fields which track various drm_mm properties, or things like gtt_mtrr (which is itself a pretty transient field). Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [Ben modified commit messages] Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:33:56 +01:00
Chris Wilson	43e28f092b	drm/i915: Bail if we attempt to allocate pages for a purged object Move the existing checking inside bind_to_gtt() to the more appropriate layer in order to prevent recreation of the pages after they have been explicitly truncated. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:59 +01:00
Chris Wilson	dd624afd53	drm/i915: Add a debug interface to forcibly evict and shrink our object caches As a means to investigate some bad system behaviour related to the purging of the active, inactive and unbound lists, it is useful to be able to manually control when those lists should be cleared. v2: use _safe list iterators as we kick objects from the list as we walk. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment explaining why we don't need to check and wait for gpu resets, acked by Chris on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:57 +01:00
Imre Deak	0fa8779651	drm/i915: use gtt_get_size() instead of open coding it Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:56 +01:00
Imre Deak	56c844e539	drm/i915: merge {i965, sandybridge}_write_fence_reg() The two functions are rather similar, so merge them. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:55 +01:00
Imre Deak	d865110cc2	drm/i915: merge get_gtt_alignment/get_unfenced_gtt_alignment() The two functions are rather similar, so merge them. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:54 +01:00
Dave Airlie	b5cc6c0387	Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-01-17 20:34:08 +10:00
Daniel Vetter	93927ca52a	drm/i915: Revert shrinker changes from "Track unbound pages" This partially reverts commit `6c085a728c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Aug 20 11:40:46 2012 +0200 drm/i915: Track unbound pages Closer inspection of that patch revealed a bunch of unrelated changes in the shrinker: - The shrinker count is now in pages instead of objects. - For counting the shrinkable objects the old code only looked at the inactive list, the new code looks at all bounds objects (including pinned ones). That is obviously in addition to the new unbound list. - The shrinker cound is no longer scaled with sysctl_vfs_cache_pressure. Note though that with the default tuning value of vfs_cache_pressue = 100 this doesn't affect the shrinker behaviour. - When actually shrinking objects, the old code first dropped purgeable objects, then normal (inactive) objects. Only then did it, in a last-ditch effort idle the gpu and evict everything. The new code omits the intermediate step of evicting normal inactive objects. Safe for the first change, which seems benign, and the shrinker count scaling, which is a bit a different story, the endresult of all these changes is that the shrinker is _much_ more likely to fall back to the last-ditch resort of idling the gpu and evicting everything. The old code could only do that if something else evicted lots of objects meanwhile (since without any other changes the nr_to_scan will be smaller than the object count). Reverting the vfs_cache_pressure behaviour itself is a bit bogus: Only dentry/inode object caches should scale their shrinker counts with vfs_cache_pressure. Originally I've had that change reverted, too. But Chris Wilson insisted that it's too bogus and shouldn't again see the light of day. Hence revert all these other changes and restore the old shrinker behaviour, with the minor adjustment that we now first scan the unbound list, then the inactive list for each object category (purgeable or normal). A similar patch has been tested by a few people affected by the gen4/5 hangs which started to appear in 3.7, which some people bisected to the "drm/i915: Track unbound pages" commit. But just disabling the unbound logic alone didn't change things at all. Note that this patch doesn't fix the referenced bugs, it only hides the underlying bug(s) well enough to restore pre-3.7 behaviour. The key to achieve that is to massively reduce the likelyhood of going into a full gpu stall and evicting everything. v2: Reword commit message a bit, taking Chris Wilson's comment into account. v3: On Chris Wilson's insistency, do not reinstate the rather bogus vfs_cache_pressure change. Tested-by: Greg KH <gregkh@linuxfoundation.org> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=55984 References: https://bugs.freedesktop.org/show_bug.cgi?id=57122 References: https://bugs.freedesktop.org/show_bug.cgi?id=56916 References: https://bugs.freedesktop.org/show_bug.cgi?id=57136 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-10 18:02:44 +01:00
Chris Wilson	93be8788e6	drm/i915; Only increment the user-pin-count after successfully pinning the bo As along the error path we do not correct the user pin-count for the failure, we may end up with userspace believing that it has a pinned object at offset 0 (when interrupted by a signal for example). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-07 10:30:53 +01:00
Dave Airlie	8be0e5c427	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Some fixes for 3.8: - Watermark fixups from Chris Wilson (4 pieces). - 2 snb workarounds, seem to be recently added to our internal DB. - workaround for the infamous i830/i845 hang, seems now finally solid! Based on Chris' fix for SNA, now also for UXA/mesa&old SNA. - Some more fixlets for shrinker-pulls-the-rug issues (Chris&me). - Fix dma-buf flags when exporting (you). - Disable the VGA plane if it's enabled on lid open - similar fix in spirit to the one I've sent you last weeek, BIOS' really like to mess with the display when closing the lid (awesome debug work from Krzysztof Mazur). * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: disable shrinker lock stealing for create_mmap_offset drm/i915: optionally disable shrinker lock stealing drm/i915: fix flags in dma buf exporting i915: ensure that VGA plane is disabled drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager drm: Export routines for inserting preallocated nodes into the mm manager drm/i915: don't disable disconnected outputs drm/i915: Implement workaround for broken CS tlb on i830/845 drm/i915: Implement WaSetupGtModeTdRowDispatch drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled drm/i915: Prefer CRTC 'active' rather than 'enabled' during WM computations drm/i915: Clear self-refresh watermarks when disabled drm/i915: Double the cursor self-refresh latency on Valleyview drm/i915: Fixup cursor latency used for IVB lp3 watermarks	2012-12-30 13:54:12 +10:00
Ben Widawsky	d7e5008f7c	drm/i915: Move even more gtt code to i915_gem_gtt This really should have been part of the kill agp series. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 16:27:35 +01:00
Daniel Vetter	da494d7ca5	drm/i915: disable shrinker lock stealing for create_mmap_offset The mmap offset structure is not part of the drm/i915 code, but provided by gem helpers. To avoid leaky abstractions (by either depending upon implementation details of said helper wrt to preallocations, or reimplementing it in our code and so fuzzing around in internal details of that helpr) simply disable the shrinker lock stealing accross calls into the helper functions. This should fix igt/gem_tiled_swapping. v2: Fix cleanup path confusion bemoaned by Chris Wilson. Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 14:57:35 +01:00
Daniel Vetter	677feac291	drm/i915: optionally disable shrinker lock stealing commit `5774506f15` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Nov 21 13:04:04 2012 +0000 drm/i915: Borrow our struct_mutex for the direct reclaim added a nice trick to steal the struct_mutex lock in the shrinker if it's the current task holding it. But this also caused the requirement that every place which allocates memory needs to be careful about the gem state of objects, since the shrinker could have pulled the rug out from under it. We've usually solved this by carefully preallocating things or ensure that buffers are pinned already. But the shrinker also reaps mmap offset, so allocating those needs to be careful, too. Now that code has been factored out into some common helpers, so either we have fragile code depending upon the common helper not doing something we don't want it to do. Or we need to reimplement the mmap offset creation and so also leak implementation details into our code. Since this all results in leaky abstraction, cop out by disabling the lock borrowing trick while calling down into the helpers. That way our craziness is nicely confined to files in drm/i915. v2: Split out the change to create_mmap_offset as request by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 14:56:04 +01:00
Mika Kuoppala	fca26bb453	drm/i915: Introduce i915_gem_set_seqno() This function can be used to set the driver's next_seqno to arbitrary value. i915_gem_set_seqno() will idle the gpu, retire outstanding requests, clear the semaphore mailboxes and set the hardware status page's seqno index. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:25:10 +01:00
Mika Kuoppala	ba1a7067c0	drm/i915: Always clear semaphore mboxes on seqno wrap In preparation for setting the seqno to arbitrary value on init or through debugfs. We need to always clear the semaphores and set the hws page seqno index by calling intel_ring_init_seqno(). v2: rewrote the commit message as suggested by Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:41 +01:00
Mika Kuoppala	f7e98ad4d4	drm/i915: Initialize hardware semaphore state on ring init Hardware status page needs to have proper seqno set as our initial seqno can be arbitrary. If initial seqno is close to wrap boundary on init and i915_seqno_passed() (31bit space) refers to hw status page which contains zero, errorneous result will be returned. v2: clear mboxes and set hws page directly instead of going through rings. Suggested by Chris Wilson. v3: hws needs to be updated for all gens. Noticed by Chris Wilson. References: https://bugs.freedesktop.org/show_bug.cgi?id=58230 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:01 +01:00
Ben Widawsky	8782e26c0c	drm/i915: Bug on unsupported swizzled platforms Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:31:23 +01:00
Ben Widawsky	7dbf9d6e0f	drm/i915: BUG() if fences are used on unsupported platform Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:29:55 +01:00
Chris Wilson	dc9dd7a20f	drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager As we may reap neighbouring objects in order to free up pages for allocations, we need to be careful not to allocate in the middle of the drm_mm manager. To accomplish this, we can simply allocate the drm_mm_node up front and then use the combined search & insert drm_mm routines, reducing our code footprint in the process. Fixes (partially) i-g-t/gem_tiled_swapping Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Again fixup atomic bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:02:29 +01:00
Linus Torvalds	3c2e81ef34	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull DRM updates from Dave Airlie: "This is the one and only next pull for 3.8, we had a regression we found last week, so I was waiting for that to resolve itself, and I ended up with some Intel fixes on top as well. Highlights: - new driver: nvidia tegra 20/30/hdmi support - radeon: add support for previously unused DMA engines, more HDMI regs, eviction speeds ups and fixes - i915: HSW support enable, agp removal on GEN6, seqno wrapping - exynos: IPP subsystem support (image post proc), HDMI - nouveau: display class reworking, nv20->40 z compression - ttm: start of locking fixes, rcu usage for lookups, - core: documentation updates, docbook integration, monotonic clock usage, move from connector to object properties" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (590 commits) drm/exynos: add gsc ipp driver drm/exynos: add rotator ipp driver drm/exynos: add fimc ipp driver drm/exynos: add iommu support for ipp drm/exynos: add ipp subsystem drm/exynos: support device tree for fimd radeon: fix regression with eviction since evict caching changes drm/radeon: add more pedantic checks in the CP DMA checker drm/radeon: bump version for CS ioctl support for async DMA drm/radeon: enable the async DMA rings in the CS ioctl drm/radeon: add VM CS parser support for async DMA on cayman/TN/SI drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2) drm/radeon/kms: add 6xx/7xx CS parser for async DMA (v2) drm/radeon: fix htile buffer size computation for command stream checker drm/radeon: fix fence locking in the pageflip callback drm/radeon: make indirect register access concurrency-safe drm/radeon: add W\|RREG32_IDX for MM_INDEX\|DATA based mmio accesss drm/exynos: support extended screen coordinate of fimd drm/exynos: fix x, y coordinates for right bottom pixel drm/exynos: fix fb offset calculation for plane ...	2012-12-17 08:26:17 -08:00
Chris Wilson	eb119bd612	drm/i915: Access to snooped system memory through the GTT is incoherent We ignore all the user requests to handle flushing to the GTT domain if the user requests such on a snoopable bo, and as such access through the GTT to such pages remains incoherent. The specs even warn that such behaviour is undefined - a strong reason never to do so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 12:28:23 +01:00
Mika Kuoppala	9e8e36879f	drm/i915: Set initial seqno value close to wrap boundary To gain confidence in the wrap handling, make it happen quite soon after the boot. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 14:07:22 +01:00
Chris Wilson	107f27a5df	drm/i915: Open-code i915_gpu_idle() for handling seqno wrapping The complication is that during seqno wrapping we must be extremely careful not to write to any ring as that will require a new seqno, and so would recurse back into the seqno wrap handler. So we cannot call i915_gpu_idle() as that does additional work beyond simply retiring the current set of requests, and instead must do the minimal work ourselves during seqno wrapping. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 14:07:03 +01:00
Mika Kuoppala	f72b3435c1	drm/i915: Don't emit semaphore wait if wrap happened If wrap just happened we need to prevent emitting waits for pre wrap values. Detect this and emit no-ops instead. v2: Use olr > seqno to detect wrap instead of *seqno == 0 as suggested by Chris Wilson. v3: Use last used seqno to detect the wraparound. From Chris Wilson v4: Fixed unnecessary last_seqno assigment References: https://bugs.freedesktop.org/show_bug.cgi?id=57967 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 13:32:26 +01:00
Linus Torvalds	caf491916b	Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage This reverts commits `a50915394f` and `d7c3b937bd`. This is a revert of a revert of a revert. In addition, it reverts the even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the original commits in linux-next. It turns out that the original patch really was bogus, and that the original revert was the correct thing to do after all. We thought we had fixed the problem, and then reverted the revert, but the problem really is fundamental: waking up kswapd simply isn't the right thing to do, and direct reclaim sometimes simply _is_ the right thing to do. When certain allocations fail, we simply should try some direct reclaim, and if that fails, fail the allocation. That's the right thing to do for THP allocations, which can easily fail, and the GPU allocations want to do that too. So starting kswapd is sometimes simply wrong, and removing the flag that said "don't start kswapd" was a mistake. Let's hope we never revisit this mistake again - and certainly not this many times ;) Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-10 11:03:05 -08:00
Chris Wilson	e9b73c6739	drm/i915: Reduce memory pressure during shrinker by preallocating swizzle pages On a machine with bit17 swizzling, we need to store the bit17 of the physical page address in put-pages. This requires a memory allocation, on average less than a page, which may be difficult to satisfy is the request to put-pages is on behalf of the shrinker. We could allow that allocation to pull from the reserved memory pools, but it seems much safer to preallocate the array for tiled objects on affected machines. v2: Export i915_gem_object_needs_bit17_swizzle() for reuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-07 01:16:15 +01:00
Mika Kuoppala	498d2ac15c	drm/i915: Add intel_ring_handle_seqno wrap If there are pre-wrap values in semaphore-mbox registers after wrap, syncing against some after-wrap request will complete immediately. Fix this by emitting ring commands to set mbox registers to zero when the wrap happens. v2: Use __intel_ring_begin to emit ring commands, from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment to handle_seqno_wrap.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Daniel Vetter	1a240d4de2	drm/i915: fixup sparse warnings - __iomem where there is none (I love how we mix these things up). - Use gfp_t instead of an other plain type. - Unconfuse one place about enum pipe vs enum transcoder - for the pch transcoder we actually use the pipe enum. Fixup the other cases where we assign the pipe to the cpu transcoder with explicit casts. - Declare the mch_lock properly in a header. There is still a decent mess in intel_bios.c about __iomem, but heck, this is x86 and we're allowed to do that. Makes-sparse-happy: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Use a space after the cast consistently and fix up the newly-added cast in i915_irq.c to properly use __iomem.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 22:31:04 +01:00
Damien Lespiau	4239ca779d	drm/i915: Fix dieing -> dying typo Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 18:25:02 +01:00
Chris Wilson	a2165e3123	drm/i915: Decouple the object from the unbound list before freeing pages As we may actually allocate in order to save the physical swizzling bits during the free, we have to be careful not to trigger the shrinker on the same object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added a small comment in the code to really drive the scariness of this patch home.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 17:22:16 +01:00
Chris Wilson	42dcedd4f2	drm/i915: Use a slab for object allocation The primary purpose of this was to debug some use-after-free memory corruption that was causing an OOPS inside drm/i915. As it turned out the corruption was being caused elsewhere and i915.ko as a major user of many objects was being hit hardest. Indeed as we do frequent the generic kmalloc caches, dedicating one to ourselves (or at least naming one for us depending upon the core) aids debugging our own slab usage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:44:05 +01:00
Chris Wilson	0104fdbb84	drm/i915: Introduce i915_gem_object_create_stolen() Allow for the creation of GEM objects backed by stolen memory. As these are not backed by ordinary pages, we create a fake dma mapping and store the address in the scatterlist rather than obj->pages. v2: Mark _i915_gem_object_create_stolen() as static, as noticed by Jesse Barnes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:34:16 +01:00
Daniel Vetter	8dcf015eb9	drm/i915: optimize the shmem_pwrite slowpath handling Since we drop dev->struct_mutex when going through the slowpath, the object might have been moved out of the cpu domain. Hence we need to clflush the entire object to ensure that after the ioctl returns, everything is coherent again (interwoven writes are ill-defined anyway). But we only need to do this if we start in the cpu domain and the object requires flushing for coherency. So don't do the flushing if the object is coherent anyway or if we've done in-line clfushing already. v2: i915_gem_clflush_object already checks whether the object is coherent and if so, drops the flushing. Hence we don't need to check that ourselves, simplifying the condition. v3: Reorder the checks for better clarity (and adjust the comment accordingly), suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 13:49:08 +01:00
Daniel Vetter	a39a68054f	drm/i915: simplify shmem pwrite/pread slowpath handling The shmem paths for pwrite/pread used a clever trick to hold onto a single page when dropping the big dev->struct_mutex for the slowpath. But this ran the risk of reinstating (or not completely purging) the backing storage when dropping purgeable objects. Hence the code needed to keep track of whether it ever dropped the lock, and if it did, manually check whether it needs to re-purge the backing storage. But thanks to the pages pin count introduced in commit `a5570178c0` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:54 2012 +0100 drm/i915: Pin backing pages whilst exporting through a dmabuf vmap which allowed us to pin the backing storage and remove that page reference trick from shmem_pwrite/read in commit `f60d7f0c1d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:56 2012 +0100 drm/i915: Pin backing pages for pread and commit `755d22184f` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:55 2012 +0100 drm/i915: Pin backing pages for pwrite we can now abolish this check. The slowpath cleanup completely disappears from pread, and for pwrite we're only left with the domain fixup in case someone moved the object out of the cpu domain from under us. A follow-on patch will optimize that a notch more. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 13:48:34 +01:00
Mika Kuoppala	7b01e260a6	drm/i915: Set sync_seqno properly after seqno wrap i915_gem_handle_seqno_wrap() will zero all sync_seqnos but as the wrap can happen inside ring->sync_to(), pre wrap seqno was carried over and overwrote the zeroed sync_seqno. When wrap is handled, all outstanding requests will be retired and objects moved to inactive queue, causing their last_read_seqno to be zero. Use this to update the sync_seqno correctly. RING_SYNC registers after wrap will contain pre wrap values which are >= seqno. So injecting the semaphore wait into ring completes immediately. Original idea for using last_read_seqno from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:54 +01:00
Chris Wilson	3e9605018a	drm/i915: Rearrange code to only have a single method for waiting upon the ring Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	b662a06632	drm/i915: Simplify flushing activity on the ring As we now always preallocate the seqno before writing to the ring, we can trivially test if we have any pending activity on the ring by inspecting the olr. This makes it then possible to flush operations that are not normally associated with a request, like power-management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	9d7730914f	drm/i915: Preallocate next seqno before touching the ring Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:52 +01:00
Chris Wilson	b5d177946a	drm/i915: Wait upon the last request seqno, rather than a future seqno In commit `69c2fc8913` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 12:41:03 2012 +0100 drm/i915: Remove the per-ring write list the explicit flush was removed from i915_ring_idle(). However, we continued to wait upon the next seqno which now did not correspond to any request (except for the unusual condition of a failure to queue a request after execbuffer) and so would wait indefinitely. This has an important side-effect that i915_gpu_idle() does not cause the seqno to be incremented. This is vital if we are to be able to idle the GPU to handle seqno wraparound, as in subsequent patches. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:51 +01:00
Chris Wilson	5774506f15	drm/i915: Borrow our struct_mutex for the direct reclaim If we have hit oom whilst holding our struct_mutex, then currently we cannot reap our own GPU buffers which likely pin most of memory, making an outright OOM more likely. So if we are running in direct reclaim and already hold the mutex, attempt to free buffers knowing that the original function can not continue until we return. v2: Add a note explaining that the mutex may be stolen due to pre-emption, and that is bad. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:47:14 +01:00
Chris Wilson	8742267af4	drm/i915: Defer assignment of obj->gtt_space until after all possible mallocs As we may invoke the shrinker whilst trying to allocate memory to hold the gtt_space for this object, we need to be careful not to mark the drm_mm_node as activated (by assigning it to this object) before we have finished our sequence of allocations. Note: We also need to move the binding of the object into the actual pagetables down a bit. The best way seems to be to move it out into the callsites. Reported-by: Imre Deak <imre.deak@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added small note to commit message to summarize review discussion.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:47:13 +01:00
Chris Wilson	c9839303d1	drm/i915: Pin the object whilst faulting it in In order to prevent reaping of the object whilst setting it up to handle the pagefault, we need to mark it as pinned. This has the nice side-effect of eliminating some special cases from the pagefault handler as well! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:04 +01:00
Chris Wilson	fbdda6fb5e	drm/i915: Guard pages being reaped by OOM whilst binding-to-GTT In the circumstances that the shrinker is allowed to steal the mutex in order to reap pages, we need to be careful to prevent it operating on the current object and shooting ourselves in the foot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:04 +01:00
Dave Airlie	9fabd4eede	Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Highlights of this -next round: - ivb fdi B/C fixes - hsw sprite/plane offset fixes from Damien - unified dp/hdmi encoder for hsw, finally external dp support on hsw (Paulo) - kill-agp and some other prep work in the gtt code from Ben - some fb handling fixes from Ville - massive pile of patches to align hsw VGA with the spec and make it actually work (Paulo) - pile of workarounds from Jesse, mostly for vlv, but also some other related platforms - start of a dev_priv reorg, that thing grew out of bounds and chaotic - small bits&pieces all over the place, down to better error handling for load-detect on gen2 (Chris, Jani, Mika, Zhenyu, ...) On top of the previous pile (just copypasta): - tons of hsw dp prep patches form Paulo - round scheduled work items and timers to nearest second (Chris) - some hw workarounds (Jesse&Damien) - vlv dp support and related fixups (Vijay et al.) - basic haswell dp support, not yet wired up for external ports (Paulo) - edp support (Paulo) - tons of refactorings to prepare for the above (Paulo) - panel rework, unifiying code between lvds and edp panels (Jani) - panel fitter scaling modes (Jani + Yuly Novikov) - panel power improvements, should now work without the BIOS setting it up - extracting some dp helpers from radeon/i915 and move them to drm_dp_helper.c - randome pile of workarounds (Damien, Ben, ...) - some cleanups for the register restore code for suspend/resume - secure batchbuffer support, should enable tear-free blits on gen6+ Chris) - random smaller fixlets and cleanups. * 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (231 commits) drm/i915: Restore physical HWS_PGA after resume drm/i915: Report amount of usable graphics memory in MiB drm/i915/i2c: Track users of GMBUS force-bit drm/i915: Allocate the proper size for contexts. drm/i915: Update load-detect failure paths for modeset-rework drm/i915: Clear unused fields of mode for framebuffer creation drm/i915: Always calculate 8xx WM values based on a 32-bpp framebuffer drm/i915: Fix sparse warnings in from AGP kill code drm/i915: Missed lock change with rps lock drm/i915: Move the remaining gtt code drm/i915: flush system agent TLBs on SNB drm/i915: Kill off now unused gen6+ AGP code drm/i915: Calculate correct stolen size for GEN7+ drm/i915: Stop using AGP layer for GEN6+ drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush drm/i915: don't rewrite the GTT on resume v4 drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex drm/i915: put ring frequency and turbo setup into a work queue v5 drm/i915: don't block resume on fb console resume v2 drm/i915: extract l3_parity substruct from dev_priv ...	2012-11-20 09:22:35 +10:00
Ben Widawsky	26b1ff35c8	drm/i915: Move the remaining gtt code It's pretty much all consolidated now that we've killed AGP. We can move the one outlier, and defines too. (Kill some unused defines in the process) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:44 +01:00
Ben Widawsky	e76e9aebcd	drm/i915: Stop using AGP layer for GEN6+ As a quick hack we make the old intel_gtt structure mutable so we can fool a bunch of the existing code which depends on elements in that data structure. We can/should try to remove this in a subsequent patch. This should preserve the old gtt init behavior which upon writing these patches seems incorrect. The next patch will fix these things. The one exception is VLV which doesn't have the preserved flush control write behavior. Since we want to do that for all GEN6+ stuff, we'll handle that in a later patch. Mainstream VLV support doesn't actually exist yet anyway. v2: Update the comment to remove the "voodoo" Check that the last pte written matches what we readback v3: actually kill cache_level_to_agp_type since most of the flags will disappear in an upcoming patch v4: v3 was actually not what we wanted (Daniel) Make the ggtt bind assertions better and stricter (Chris) Fix some uncaught errors at gtt init (Chris) Some other random stuff that Chris wanted v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a tad more robust by mapping everything != CACHE_NONE to the cached agp flag - we have a 1:1 uncached mapping, but different modes of cacheable (at least on later generations). Suggested by Chris Wilson.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:42 +01:00
Daniel Vetter	a4da4fa4e5	drm/i915: extract l3_parity substruct from dev_priv Pretty astonishing how far apart these two members landed ... Especially since I've already removed almost 200 lines in between. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:40 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
Dave Airlie	64acba6a7a	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: The big thing is the disabling of the hsw support by default, cc: stable. We've aimed for basic hsw support in 3.6, but due to a few bad happenstances we've screwed up and only 3.8 will have better modeset support than vesa. To avoid yet another round of fallout from such a gaffle on for the next platform we've added a module option to disable early hw support by default. That should also give us more flexibility in bring-up. Otherwise just small fixes: - 3 fixes from Egbert for sdvo corner cases - invert-brightness quirk entry from Egbert - revert a dp link training change, it regresses some setups - and shut up a spurious WARN in our gem fault handler. - regression fix for an oops on bit17 swizzling machines, introduce in 3.7 - another no-lvds quirk * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Initialize obj->pages before use by i915_gem_object_do_bit17_swizzle() drm/i915: Add no-lvds quirk for Supermicro X7SPA-H drm/i915: Insert i915_preliminary_hw_support variable. drm/i915: shut up spurious WARN in the gtt fault handler Revert "drm/i915: Try harder to complete DP training pattern 1" DRM/i915: Restore sdvo_flags after dtd->mode->dtd Roundrtrip. DRM/i915: Don't clone SDVO LVDS with analog. DRM/i915: Add QUIRK_INVERT_BRIGHTNESS for NCR machines. DRM/i915: Don't delete DPLL Multiplier during DAC init.	2012-10-22 09:55:48 +10:00
Chris Wilson	74ce6b6c63	drm/i915: Initialize obj->pages before use by i915_gem_object_do_bit17_swizzle() If we leave obj->pages set to NULL before attempting to deswizzle them, then an OOPS is well deserved. Fixes regression introduced in commit `9da3da660d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jun 1 15:20:22 2012 +0100 drm/i915: Replace the array of pages with a scatterlist Reported-and-tested-by: Krzysztof Kolasa Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-19 21:52:52 +02:00
Daniel Vetter	a7c2e1aad6	drm/i915: shut up spurious WARN in the gtt fault handler -ENOSPC can happen if userspace is being simplistic and tries to map a too big object. To aid further spurious WARN debugging, also print out the error code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56017 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-17 11:56:40 +02:00
Dave Airlie	3459f62047	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: "- some register magic to fix hsw crw (Paulo&Ben) - fix backlight destruction for cpu edp (Jani) - fix gen ch7xxx dvo ->get_hw_state - fixup the plane->pipe fixup code, the broken version massively angers the modeset sanity checks - kill pipe A quirk for i855gm, otherwise I get a black screen with the above patch - fixup for gem_get_page helper (Chris) - fixup guardband clipping w/a (Ken), without this mesa master can erronously drop vertices on snb, mesa 9.0 has the optimization reverted - another pageflip vs. modeset fix - kill bogus BUG_ON which broke ums+gem from Willy Tarreau (gasp, people are still using this!)" * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: fix non-DP-D eDP backlight cleanup and module reload drm/i915: HSW CRW stability magic drm/i915/dvo-ch7xxx: fix get_hw_state drm/i915: fixup the plane->pipe fixup code drm/i915: rip out the pipe A quirk for i855gm drm/i915: disable wc gtt pte mappings on gen2 drm/i915: fixup i915_gem_object_get_page inline helper drm/i915: Disallow preallocation of requests drm/i915: Set guardband clipping workaround bit in the right register. drm/i915: paper over a pipe-enable vs pageflip race drm/i915: remove useless BUG_ON which caused a regression in 3.5.	2012-10-16 10:11:59 +10:00
Rodrigo Vivi	eda2d7f582	drm/i915: HSW CRW stability magic This magic brings stability to HSW CRW machines. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-12 10:59:11 +02:00
Chris Wilson	acb868d3d7	drm/i915: Disallow preallocation of requests The intention was to allow the caller to avoid a failure to queue a request having already written commands to the ring. However, this is a moot point as the i915_add_request() can fail for other reasons than a mere allocation failure and those failure cases are more likely than ENOMEM. So the overlay code already had to handle i915_add_request() failures, and due to commit `3bb73aba1e` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 12:40:59 2012 +0100 drm/i915: Allow late allocation of request for i915_add_request() the error handling code in intel_overlay.c was subject to causing double-frees, as found by coverity. Rather than further complicate i915_add_request() and callers, realise the battle is lost and adapt intel_overlay.c to take advantage of the late allocation of requests. v2: Handle callers passing in a NULL seqno. v3: Ditto. This time for sure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-12 10:59:09 +02:00
Chris Wilson	bcb4508616	drm/i915: Align the retire_requests worker to the nearest second By using round_jiffies() we can align the wakeup of our worker to the nearest second in order to batch wakeups and reduce system load, which is useful for unimportant coarse tasks like our retire_requests. v2: round_jiffies_relative() already returns the relative timeout value, so no need to incorrectly perform the subtraction twice. The timer interface still leaves the possibility for the value of jiffies to change be we program the timer. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-08 18:45:21 +02:00
Chris Wilson	cecc21fea9	drm/i915: Align the hangcheck wakeup to the nearest second round_jiffies() aligns the wakeup time to the nearest second in order to batch wakeups and reduce system load, which is useful for unimportant coarse timers like our hangcheck. v2: round_jiffies_relative() returns the relative jiffie value, whereas we need the absolute value for the timer. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-08 18:44:36 +02:00
Willy Tarreau	c77d7162a7	drm/i915: remove useless BUG_ON which caused a regression in 3.5. starting an old X server causes a kernel BUG since commit 1b50247a8d: ------------[ cut here ]------------ kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3661! invalid opcode: 0000 [#1] SMP Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss uvcvideo +videobuf2_core videodev videobuf2_vmalloc videobuf2_memops uhci_hcd ath9k mac80211 snd_hda_codec_realtek ath9k_common microcode +ath9k_hw psmouse serio_raw sg ath cfg80211 atl1c lpc_ich mfd_core ehci_hcd snd_hda_intel snd_hda_codec snd_hwdep snd_pcm rtc_cmos +snd_timer snd evdev eeepc_laptop snd_page_alloc sparse_keymap Pid: 2866, comm: X Not tainted 3.5.6-rc1-eeepc #1 ASUSTeK Computer INC. 1005HA/1005HA EIP: 0060:[<c12dc291>] EFLAGS: 00013297 CPU: 0 EIP is at i915_gem_entervt_ioctl+0xf1/0x110 EAX: f5941df4 EBX: f5940000 ECX: 00000000 EDX: 00020000 ESI: f5835400 EDI: 00000000 EBP: f51d7e38 ESP: f51d7e20 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: b760e0a0 CR3: 351b6000 CR4: 000007d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process X (pid: 2866, ti=f51d6000 task=f61af8d0 task.ti=f51d6000) Stack: 00000001 00000000 f5835414 f51d7e84 f5835400 f54f85c0 f51d7f10 c12b530b 00000001 c151b139 c14751b6 c152e030 00000b32 00006459 00000059 0000e200 00000001 00000000 00006459 c159ddd0 c12dc1a0 ffffffea 00000000 00000000 Call Trace: [<c12b530b>] drm_ioctl+0x2eb/0x440 [<c12dc1a0>] ? i915_gem_init+0xe0/0xe0 [<c1052b2b>] ? enqueue_hrtimer+0x1b/0x50 [<c1053321>] ? __hrtimer_start_range_ns+0x161/0x330 [<c10530b3>] ? lock_hrtimer_base+0x23/0x50 [<c1053163>] ? hrtimer_try_to_cancel+0x33/0x70 [<c12b5020>] ? drm_version+0x90/0x90 [<c10ca171>] vfs_ioctl+0x31/0x50 [<c10ca2e4>] do_vfs_ioctl+0x64/0x510 [<c10535de>] ? hrtimer_nanosleep+0x8e/0x100 [<c1052c20>] ? update_rmtp+0x80/0x80 [<c10ca7c9>] sys_ioctl+0x39/0x60 [<c1433949>] syscall_call+0x7/0xb Code: 83 c4 0c 5b 5e 5f 5d c3 c7 44 24 04 2c 05 53 c1 c7 04 24 6f ef 47 c1 e8 6e e0 fd ff c7 83 38 1e 00 00 00 00 00 00 e9 3f ff ff +ff <0f> 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 0f 0b eb fe 8d b6 EIP: [<c12dc291>] i915_gem_entervt_ioctl+0xf1/0x110 SS:ESP 0068:f51d7e20 ---[ end trace dd332ec083cbd513 ]--- The crash happens here in i915_gem_entervt_ioctl() : 3659 BUG_ON(!list_empty(&dev_priv->mm.active_list)); 3660 BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); -> 3661 BUG_ON(!list_empty(&dev_priv->mm.inactive_list)); 3662 mutex_unlock(&dev->struct_mutex); Quoting Chris : "That BUG_ON there is silly and can simply be removed. The check is to verify that no batches were submitted to the kernel whilst the UMS/GEM client was suspended - to which the BUG_ONs are a crude approximation. Furthermore, the checks are too late, since it means we attempted to program the hardware whilst it was in an invalid state, the BUG_ONs are the least of your concerns at that point." Note that this regression has been introduced in commit `1b50247a8d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 24 15:47:30 2012 +0100 drm/i915: Remove the list of pinned inactive objects Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Willy Tarreau <w@1wt.eu> [danvet: Added note about the regressing commit and cc: stable.] Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-07 22:57:25 +02:00
Dave Airlie	1f31c69dac	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Bigger -fixes pile, mostly because I've included Ajax' DP dongle stuff, as discussed on irc. Otherwise just small things: - regression fix to finally make 6bpc auto-dither on dp work (Jani) - reinstate an snb ctx w/a that accidentally got lost in a rework (Chris) - fixup the DP train sequence, logic-goof-up uncovered by Coverty (Chris) - fix set_caching locking (Ben) - fix spurious segfault on con-current gtt mmap faulting (Dimitry and Mika) - some pageflip correctness fixes (still hunting down some issues, but these are the worst offenders of confused code that we've tracked down thus far) from Chris and me - fixup swizzling settings on vlv (Jesse) - gt_mode w/a from Ben added, fixes snb gt1 rc6+hw ctx hangs. * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Fix GT_MODE default value drm/i915: don't frob the vblank ts in finish_page_flip drm/i915: call drm_handle_vblank before finish_page_flip drm/i915: print warning if vmi915_gem_fault error is not handled drm/i915: EBUSY status handling added to i915_gem_fault(). drm/i915: Try harder to complete DP training pattern 1 drm/i915: set swizzling to none on VLV drm/dp: Make sink count DP 1.2 aware drm/dp: Document DP spec versions for various DPCD registers drm/i915/dp: Be smarter about connection sense for branch devices drm/i915/dp: Fetch downstream port info if needed during DPCD fetch drm/dp: Update DPCD defines drm: Export drm_probe_ddc() drm/i915: Flush the pending flips on the CRTC before modification drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a drm/i915: Fix set_caching locking drm/i915: use adjusted_mode instead of mode for checking the 6bpc force flag	2012-10-07 21:13:54 +10:00
Mika Kuoppala	4d0f817e74	drm/i915: print warning if vmi915_gem_fault error is not handled Falling into default case in vmi915_gem_fault is a bug. Be more verbose about it. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-04 10:33:42 +02:00
Dmitry Rogozhkin	e79e0fe380	drm/i915: EBUSY status handling added to i915_gem_fault(). Subsequent threads returning EBUSY from vm_insert_pfn() was not handled correctly. As a result concurrent access from new threads to mmapped data caused SIGBUS. Note that this fixes i-g-t/tests/gem_threaded_tiled_access. Tested-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-04 10:33:42 +02:00
Linus Torvalds	612a9aab56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...	2012-10-03 23:29:23 -07:00
David Howells	760285e7e7	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/ Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:07 +01:00
David Howells	4126d5d61f	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:05 +01:00
Ben Widawsky	3bc2913e2c	drm/i915: Fix set_caching locking On the EINVAL case we don't release struct_mutex. It should be safe to grab the lock after checking the parameters, which also resolves the issues. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-27 08:45:11 +02:00
Ben Widawsky	199adf40ae	drm/i915: s/cacheing/caching/ Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-26 09:24:36 +02:00
Daniel Vetter	398b7a1b88	Linux 3.6-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk= =0v2l -----END PGP SIGNATURE----- Merge tag 'v3.6-rc7' into drm-intel-next-queued Manual backmerge of -rc7 to resolve a silent conflict leading to compile failure in drivers/gpu/drm/i915/intel_hdmi.c. This is due to the bugfix in -rc7: commit `b98b601672` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 07:43:22 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug Since this code moved around a lot in -next git put that snippet at the wrong spot. I've tried to fix this by making the conflict explicit by merging a version for next with: commit `3cce574f01` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 11:19:00 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally But that failed to solve the entire problem. To avoid pushing out further -nightly branch to our QA where this is broken, do the backmerge and manually add the stuff git adds to -next from the patch in -fixes. Note that this doesn't show up in git's merge diff (and hence is also not handled by git rerere), which adds to the reasons why I'd like to fix this with a verbose backmerge. The git merge diff only shows a bunch of trivial conflicts of the "code changed in lines next to each another" kind. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-24 18:17:12 +02:00
Chris Wilson	2f745ad3d3	drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops By providing a callback for when we need to bind the pages, and then release them again later, we can shorten the amount of time we hold the foreign pages mapped and pinned, and importantly the dmabuf objects then behave as any other normal object with respect to the shrinker and memory management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:10 +02:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	f60d7f0c1d	drm/i915: Pin backing pages for pread By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear beneath us, and so we can simplify the code for iterating over the pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	755d22184f	drm/i915: Pin backing pages for pwrite By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear jeneath us, and so we can simplify the code for iterating over the pages. Note: The old code had such complicated page refcounting since it used obj->pages as a micro-optimization if it's there, but that could (before this patch) disappear when we drop the dev->struct_mutex. Hence some manual page refcounting was required for the slow path, complicated by the fact that pages returned by shmem_read_mapping_page already have a pageref, which needs to be dropped again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Added note to explain the question Ben raised in review.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	a5570178c0	drm/i915: Pin backing pages whilst exporting through a dmabuf vmap We need to refcount our pages in order to prevent reaping them at inopportune times, such as when they currently vmapped or exported to another driver. However, we also wish to keep the lazy deallocation of our pages so we need to take a pin/unpinned approach rather than a simple refcount. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	37e680a15f	drm/i915: Introduce drm_i915_gem_object_ops In order to specialise functions depending upon the type of object, we can attach vfuncs to each object via a new ->ops pointer. For instance, this will be used in future patches to only bind pages from a dma-buf for the duration that the object is used by the GPU - and so prevent them from pinning those pages for the entire of the object. v2: Bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:55 +02:00
Chris Wilson	7e81a42e34	drm/i915: Reduce a pin-leak BUG into a WARN Pin-leaks persist and we get the perennial bug reports of machine lockups to the BUG_ON(pin_count==MAX). If we instead loudly report that the object cannot be pinned at that time it should prevent the driver from locking up, and hopefully restore a semblance of working whilst still leaving us a OOPS to debug. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-17 10:12:57 +02:00
Dave Airlie	65983bd605	Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: "New stuff for -next. Highlights: - prep patches for the modeset rework. Note that one of those patches touches the fb helper in the common drm code. - hasw hdmi audio support (Wang Xingchao) - improved instdone dumping for gen7 (Ben) - unbound tracking and a few follow-up patches from Chris - dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave) - improve mmio error reporting for hsw - prep patch for WQ_NON_REENTRANT removal (Tejun Heo) " * 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits) drm/i915: Remove __GFP_NO_KSWAPD drm/i915: disable rc6 on ilk when vt-d is enabled drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer drm/i915: Use new INSTDONE registers (Gen7+) drm/i915: Add new INSTDONE registers drm/i915: Extract reading INSTDONE drm/i915: Use a non-blocking wait for set-to-domain ioctl drm/i915: Juggle code order to ease flow of the next patch drm/i915: Use cpu relocations if the object is in the GTT but not mappable drm/i915: Extract general object init routine drm/i915: Protect private gem objects from truncate (such as imported dmabuf) drm/i915: Only pwrite through the GTT if there is space in the aperture i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1 drm/i915: Find unclaimed MMIO writes. drm/i915: Add ERR_INT to gen7 error state drm/i915: Cantiga+ cannot handle a hsync front porch of 0 drm/i915: fix reassignment of variable "intel_dp->DP" drm/i915: Try harder to allocate an mmap_offset drm/i915: Show pin count in debugfs drm/i915: Show (count, size) of purgeable objects in i915_gem_objects ...	2012-09-03 12:05:01 +10:00
Sedat Dilek	d7c3b937bd	drm/i915: Remove __GFP_NO_KSWAPD When I pulled-in today's drm-intel-next into linux-next (next-20120824) I saw this build-breakage: drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_object_get_pages_gtt': drivers/gpu/drm/i915/i915_gem.c:1778:40: error: '__GFP_NO_KSWAPD' undeclared (first use in this function) drivers/gpu/drm/i915/i915_gem.c:1778:40: note: each undeclared identifier is reported only once for each function it appears in This is caused by commit ba099ef165f8 ("mm: remove __GFP_NO_KSWAPD") and commit b6beae2c2014 ("mm: remove __GFP_NO_KSWAPD fixes") in linux-next (next-20120824). Fix this by removing __GFP_NO_KSWAPD from drm/i915 driver. Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-27 17:11:38 +02:00
Dave Airlie	93bb70e0c0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next There was some merge conflicts in -next and they weren't so pretty, so backmerge now to avoid them. Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_modes.c	2012-08-27 16:22:20 +10:00
Chris Wilson	3236f57a01	drm/i915: Use a non-blocking wait for set-to-domain ioctl The principal use for set-to-domain is for userspace to serialise operations with a particular buffer, for example to maintain coherency with a CPU map or to ratelimit its rendering by waiting on all previous operations before continuing. As such we tend to hold the struct_mutex for long periods during the synchronisation and so cause contention issues with other users of the graphics device, even for independent operations as memory management. An example is the contention between compiz and X which causes jitter in the display and a drop in peak throughput. The ultimate solution would be a set of fine grained locks and lockless operations, but an intermediate step is to first attempt the synchronisation for set-to-domain without holding the mutex. This introduces a number of race conditions, so we limit it use to the ioctl periphery where we have no dependent state and can safely complete with a locked synchronisation afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:56 +02:00
Chris Wilson	b361237bcc	drm/i915: Juggle code order to ease flow of the next patch Move the wait-for-rendering logic around in the file so that we can group it together with the subsequent variations. The general goal is to have the lower level routines clustered together and then the higher level logic building upon those low level routines that came before. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:53 +02:00
Chris Wilson	0327d6ba99	drm/i915: Extract general object init routine As we wish to create specialised object constructions in the near future that share the same basic GEM object struct, export the default initializer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:38 +02:00
Chris Wilson	4d6294bf77	drm/i915: Protect private gem objects from truncate (such as imported dmabuf) If the object has no backing shmemfs filp, then we obviously cannot perform a truncation operation upon it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:31 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Chris Wilson	d8cb508669	drm/i915: Try harder to allocate an mmap_offset Given the persistence of an offset for the lifetime of an object, itis easy to contemplate how the mmap space becomes badly fragmented to the point that further allocations fail with ENOSPC. Our only recourse at this point is to try to purge the objects to release some space and reattempt the allocation. References: https://bugs.freedesktop.org/show_bug.cgi?id=39552 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:36 +02:00
Chris Wilson	c4670ad080	drm/i915: Add some sanity checks to unbound tracking A pair of universally true checks that just need to be put in the right place depending on where in the patch sequence you go. Note that i915_gem_object_put_pages_gtt() already gains the BUG_ON(obj->gtt_space), but on reflection that needed to migrate to put_pages(). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:20 +02:00
Chris Wilson	6c085a728c	drm/i915: Track unbound pages When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i916_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superflous and only helps in OOM corner-cases, not fragmented-gtt trashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:11 +02:00
Daniel Vetter	225067eedf	drm/i915: move functions around Prep work to make Chris Wilson's unbound tracking patch a bit easier to read. Alas, I'd have preferred that moving the page allocation retry loop from bind to get_pages would have been a separate patch, too. But that looks like real work ;-) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-20 10:59:41 +02:00
Ben Widawsky	b6c7488df6	drm/i915/contexts: fix list corruption After reset we unconditionally reinitialize lists. If the context switch hasn't yet completed before the suspend, the default context object will end up on lists that are going to go away when we resume. The patch forces the context switch to be synchronous before suspend assuring that the active/inactive tracking is correct at the time of resume. References: https://bugs.freedesktop.org/show_bug.cgi?id=52429 Tested-by: Guang A Yang <guang.a.yang@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:21:34 +02:00
Chris Wilson	b2eadbc85b	drm/i915: Lazily apply the SNB+ seqno w/a Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-10 11:11:32 +02:00

... 4 5 6 7 8 ...

1156 Commits