Added the request structure's 'uniq' identifier to the trace information. Also
renamed the '_complete' trace event to '_notify' as it actually happens in the
IRQ 'notify_ring()' function. The intention is to add a new '_complete' trace
event which occurs when a request structure is actually marked as complete.
However, at the moment the completion status is re-tested every time the query
is made, so there isn't a completion event as such.
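As a rough sketch of the resulting event shape (the field and accessor names
here are illustrative, not the exact driver code):

TRACE_EVENT(i915_gem_request_notify,
	    TP_PROTO(struct drm_i915_gem_request *req),
	    TP_ARGS(req),

	    TP_STRUCT__entry(
			     __field(u32, ring)
			     __field(u32, uniq)
			     __field(u32, seqno)
			     ),

	    TP_fast_assign(
			   __entry->ring = req->ring->id;
			   __entry->uniq = req->uniq;
			   __entry->seqno = req->seqno;
			   ),

	    TP_printk("ring=%u, uniq=%u, seqno=%u",
		      __entry->ring, __entry->uniq, __entry->seqno)
);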
v2: New patch added to series.
v3: Rebased to remove completion caching as that is apparently contentious.
Change-Id: Ic9bcde67d175c6c03b96217cdcb6e4cc4aa45d67
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Updated the trace_irq code to use requests instead of seqnos. This includes
reference counting the request object to ensure it sticks around when required.
Note that getting access to the reference counting functions means moving the
inline i915_trace_irq_get() function from intel_ringbuffer.h to i915_drv.h.
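A minimal sketch of the request-based version, assuming a reference-managing
assign helper (names illustrative):

/* Now in i915_drv.h, where the request reference helpers are visible. */
static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
				      struct drm_i915_gem_request *req)
{
	/* Hold a reference so the request cannot be freed while the
	 * deferred IRQ-disable path still needs it. */
	if (ring->trace_irq_req == NULL && ring->irq_get(ring))
		i915_gem_request_assign(&ring->trace_irq_req, req);
}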
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Resolve conflict due to shuffled merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All the code above is now using requests, not seqnos, so it is possible to
convert the trace functions across. Note that, rather than get into
problematic reference counting issues, the trace code only saves the seqno
and ring values from the request structure, not the structure pointer itself.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
- ppgtt init/release: these tracepoints are useful for observing the
creation and destruction of Full PPGTTs.
- ctx create/free: we can use the ctx_free trace in combination with the
ppgtt_release one to be sure that the ppgtt doesn't stay alive for too
long after the ctx is destroyed. ctx_create is there for symmetry (see
the sketch below)
- switch_mm: important point in the lifetime of the vm
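As an illustration of how the release side can be hooked (the function and
event names here are a sketch, not the exact driver code):

static void ppgtt_release(struct kref *kref)
{
	struct i915_hw_ppgtt *ppgtt =
		container_of(kref, struct i915_hw_ppgtt, ref);

	/* Fired at the last unreference, so the gap to the matching
	 * ctx_free event is directly visible in a trace. */
	trace_i915_ppgtt_release(&ppgtt->base);

	ppgtt->base.cleanup(&ppgtt->base);
	kfree(ppgtt);
}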
v4: add DOC information
v5: pull the DOC in drm.tmpl
v6: clean ppgtt init/release traces + add ctx create/free and switch_mm
tracepoints (Chris)
v7: drop execlist_submit_context tracepoint
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the upcoming patches we plan to break the correlation between
engine command streamers (a.k.a. rings) and ringbuffers, so it
makes sense to refactor the code and make the change obvious.
No functional changes.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add trace points for observing the atomic pipe update mechanism.
v2: Rebased due to earlier changes
v3: Pass intel_crtc instead of drm_crtc (Daniel)
v4: Pass frame counter from the caller to evaded/end since
the caller now always has that ready
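In outline (the helper and tracepoint names are assumptions for
illustration), the update path now looks like:

	u32 frame;

	trace_i915_pipe_update_start(crtc);

	/* ... busy-wait until we are outside the danger zone ... */

	frame = intel_crtc_get_vblank_counter(crtc);	/* assumed helper */
	trace_i915_pipe_update_vblank_evaded(crtc, frame);

	/* ... perform the atomic register updates ... */

	trace_i915_pipe_update_end(crtc, frame);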
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Sourab Gupta <sourabgupta@gmail.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The TP_printk() should never dereference any pointers, because the ring
buffer can be read at some unknown time in the future. If a device no
longer exists by then, a dereference can cause a kernel oops. It also makes
the event useless when saving the ring buffer with userspace tools such
as perf and trace-cmd.
The i915_gem_evict_vm event dereferences the vm pointer, which may no
longer exist when the ring buffer is read sometime in the future.
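The fix follows the usual pattern: dereference the pointer in
TP_fast_assign, while it is still known to be valid, and let TP_printk
format only the stored values. Roughly:

TRACE_EVENT(i915_gem_evict_vm,
	    TP_PROTO(struct i915_address_space *vm),
	    TP_ARGS(vm),

	    TP_STRUCT__entry(
			     __field(u32, dev)
			     __field(struct i915_address_space *, vm)
			     ),

	    TP_fast_assign(
			   /* Safe: vm is valid at trace time. */
			   __entry->dev = vm->dev->primary->index;
			   __entry->vm = vm;
			   ),

	    /* Only stored values here; printing the pointer's value is
	     * fine, following it is not. */
	    TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm)
);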
Link: http://lkml.kernel.org/r/1395095198-20034-3-git-send-email-artagnon@gmail.com
Reported-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: stable@vger.kernel.org # 3.13+
Fixes: bcccff847d "drm/i915: trace vm eviction instead of everything"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[danvet: Try to make it actually compile]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Anything more than just one bool parameter is just a pain to read;
symbolic constants are much better.
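The flavour of the change, as a sketch (flag names and values for
illustration only):

#define PIN_MAPPABLE	(1 << 0)
#define PIN_NONBLOCK	(1 << 1)
#define PIN_GLOBAL	(1 << 2)

int i915_gem_object_pin(struct drm_i915_gem_object *obj,
			struct i915_address_space *vm,
			uint32_t alignment,
			unsigned flags);

/* A call site then reads, e.g.:
 *	i915_gem_object_pin(obj, vm, 4096, PIN_MAPPABLE | PIN_NONBLOCK);
 * rather than:
 *	i915_gem_object_pin(obj, vm, 4096, true, false);
 */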
Split out from Chris' vma-binding rework patch.
v2: Undo the behaviour change in object_pin that Chris spotted.
v3: Split out misplaced hunk to handle set_cache_level errors,
spotted by Jani.
v4: Keep the current over-zealous binding logic in the execbuffer code
working with a quick hack while the overall binding code gets shuffled
around.
v5: Reorder the PIN_ flags for more natural patch splitup.
v6: Pull out the PIN_GLOBAL split-up again.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So that we can find the callers who introduce a ring stall. A single
ring stall is not too unwelcome; the real issue is when they start
to interlock and prevent any concurrent work. That, however, is a little
trickier to detect with a mere tracepoint!
v2: Rebrand it as a ring event, rather than an object event.
v3: Include the seqno in the tracepoint for posterity or something.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We only wish to know the value of the seqno when emitting the tracepoint, so
move the query from a parameter of the macro to inside the conditional
macro body, so that the query is only evaluated when required.
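Sketched (the ring type and get_seqno() hook are as assumed here):

/* Before: every caller paid for the query, traced or not:
 *	trace_i915_gem_request_complete(ring, ring->get_seqno(ring, false));
 * After: pass the ring and query inside the event body.
 */
TRACE_EVENT(i915_gem_request_complete,
	    TP_PROTO(struct intel_ring_buffer *ring),
	    TP_ARGS(ring),

	    TP_STRUCT__entry(
			     __field(u32, ring)
			     __field(u32, seqno)
			     ),

	    TP_fast_assign(
			   __entry->ring = ring->id;
			   /* Only evaluated when the event is enabled. */
			   __entry->seqno = ring->get_seqno(ring, false);
			   ),

	    TP_printk("ring=%u, seqno=%u", __entry->ring, __entry->seqno)
);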
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tracing vm eviction is really the event we care about. For the cases
where we evict everything, we will still get the trace.
v2: Add the drm device to the trace since we might not be the only
device in the system. (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As alluded to in several patches, and it will be reiterated later... A
VMA is an abstraction for a GEM BO bound into an address space.
Therefore it stands to reason that the existing bind and unbind are
the ones which will be the most impacted. This patch implements this,
and updates all callers which weren't already updated in the series
(because it was too messy).
This patch represents the bulk of an earlier, larger patch. I've pulled
out a bunch of things by the request of Daniel. The history is preserved
for posterity with the email convention of ">". One big change from the
original patch, aside from a bunch of cropping, is that I've created an
i915_vma_unbind() function. That is because we always have the VMA
anyway, so an extra lookup is unnecessary. There is a caveat: we
retain an i915_gem_object_ggtt_unbind() for the global cases which might
not talk in VMAs.
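For reference, the rough shape of the abstraction (illustrative fields
only):

/* A VMA: one GEM BO bound into one address space at one offset. */
struct i915_vma {
	struct drm_mm_node node;		/* offset/size within the VM */
	struct drm_i915_gem_object *obj;	/* backing GEM object */
	struct i915_address_space *vm;		/* the address space */
	struct list_head vma_link;		/* entry in obj->vma_list */
};

int i915_vma_unbind(struct i915_vma *vma);

/* GGTT convenience wrapper kept for callers that do not talk in VMAs. */
int i915_gem_object_ggtt_unbind(struct drm_i915_gem_object *obj);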
> drm/i915: plumb VM into object operations
>
> This patch was formerly known as:
> "drm/i915: Create VMAs (part 3) - plumbing"
>
> This patch adds a VM argument, bind/unbind, and the object
> offset/size/color getters/setters. It preserves the old ggtt helper
> functions because things still need, and will continue to need them.
>
> Some code will still need to be ported over after this.
>
> v2: Fix purge to pick an object and unbind all vmas
> This was doable because of the global bound list change.
>
> v3: With the commit to actually pin/unpin pages in place, there is no
> longer a need to check if unbind succeeded before calling put_pages().
> Make put_pages only BUG() after checking pin count.
>
> v4: Rebased on top of the new hangcheck work by Mika
> plumbed eb_destroy also
> Many checkpatch related fixes
>
> v5: Very large rebase
>
> v6:
> Change BUG_ON to WARN_ON (Daniel)
> Rename vm to ggtt in preallocate stolen, since it is always ggtt when
> dealing with stolen memory. (Daniel)
> list_for_each will short-circuit already (Daniel)
> remove superflous space (Daniel)
> Use per object list of vmas (Daniel)
> Make obj_bound_any() use obj_bound for each vm (Ben)
> s/bind_to_gtt/bind_to_vm/ (Ben)
>
> Fixed up the inactive shrinker. As Daniel noticed the code could
> potentially count the same object multiple times. While it's not
> possible in the current case, since 1 object can only ever be bound into
> 1 address space thus far - we may as well try to get something more
> future proof in place now. With a prep patch before this to switch over
> to using the bound list + inactive check, we're now able to carry that
> forward for every address space an object is bound into.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Rebase on top of the loss of "drm/i915: Cleanup more of VMA
in destroy".]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The TRACE_EVENT_CONDITION is supposed to generate more efficient code
than if (cond) trace(), which is what we are currently using inside the
register access functions.
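Sketched against the existing register-access event (fields approximate):

TRACE_EVENT_CONDITION(i915_reg_rw,
	TP_PROTO(bool write, u32 reg, u64 val, int len, bool trace),
	TP_ARGS(write, reg, val, len, trace),

	/* The enabled/condition test is folded into the tracepoint
	 * itself instead of an if (trace) at every call site. */
	TP_CONDITION(trace),

	TP_STRUCT__entry(
		__field(u64, val)
		__field(u32, reg)
		__field(u16, write)
		__field(u16, len)
		),

	TP_fast_assign(
		__entry->val = (u64)val;
		__entry->reg = reg;
		__entry->write = write;
		__entry->len = len;
		),

	TP_printk("%s reg=0x%x, len=%d, val=(0x%llx)",
		  __entry->write ? "write" : "read",
		  __entry->reg, __entry->len,
		  (unsigned long long)__entry->val)
);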
v2: Rebase onto uncore
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Soon we want to gut a lot of our existing assumptions about how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).
It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.
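A sketch of where this leads, assuming the node ends up embedded in the
object (accessor bodies illustrative):

static inline unsigned long
i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *obj)
{
	/* The embedded drm_mm_node holds the GGTT placement. */
	return obj->gtt_space.start;
}

static inline unsigned long
i915_gem_obj_ggtt_size(struct drm_i915_gem_object *obj)
{
	return obj->gtt_space.size;
}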
v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)
v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the introduction of per-process GTT space, the hardware designers
thought it wise to also limit the ability to write to MMIO space to only
a "secure" batch buffer. The ability to rewrite registers is the only
way to program the hardware to perform certain operations like scanline
waits (required for tear-free windowed updates). So we have a choice:
either add an interface to perform those synchronized updates inside
the kernel, or permit certain processes to write to the "safe"
registers from within their command streams. This patch exposes the
ability to submit a SECURE batch buffer to DRM_ROOT_ONLY|DRM_MASTER
processes.
v2: Haswell split up bit8 into a ppgtt bit (still bit8) and a security
bit (bit 13, accidentally not set). Also add a comment explaining why
secure batches need a global gtt binding.
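The execbuffer-side check, roughly (flag and field names as sketched, not
verified against the final interface):

	if (args->flags & I915_EXEC_SECURE) {
		/* Writing MMIO registers from a batch is restricted to
		 * DRM_ROOT_ONLY | DRM_MASTER processes. */
		if (!file->is_master || !capable(CAP_SYS_ADMIN))
			return -EPERM;

		/* Secure batches must execute from the global GTT. */
		flags |= I915_DISPATCH_SECURE;
	}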
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
[danvet: added hsw fixup.]
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We've had and still have too many issues where the gpu turbo doesn't
quite do what it's supposed to do (or what we want it to do).
Adding a tracepoint to track when the desired gpu frequency changes
should help a lot in characterizing and understanding problematic
workloads.
Also, this should be fairly interesting for power tuning (and
especially noticing when the gpu is stuck in high frequencies, as has
happened in the past) and hence for integration into powertop and
similar tools.
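The event itself can be minimal; something along these lines (name and
unit are a guess at the convention):

TRACE_EVENT(intel_gpu_freq_change,
	    TP_PROTO(u32 freq),
	    TP_ARGS(freq),

	    TP_STRUCT__entry(
			     __field(u32, freq)
			     ),

	    TP_fast_assign(
			   /* Newly requested GPU frequency (MHz, by
			    * assumption). */
			   __entry->freq = freq;
			   ),

	    TP_printk("new_freq=%u", __entry->freq)
);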
Cc: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.
To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.
As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.
Note: The patch also contains a few changes to the last-hope
evict_everything logic in i915_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superfluous
and only helps in OOM corner-cases, not fragmented-gtt thrashing
situations).
Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.
v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.
v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The trace event adds whether or not the wait was blocking. Blocking in
this case means holding struct_mutex (i.e. no new work can be submitted
during the wait). The information is inherently racy.
The blocking information is racy since mutex_is_locked doesn't check
that the current thread holds the lock. The only other option would be
to pass a boolean for whether or not the caller was blocking
down through the stack, which is less desirable.
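So the sampling reduces to something like (sketch):

	/* Racy by design: mutex_is_locked() reports only that the lock
	 * is held, not that the current thread is the holder. Good
	 * enough for a trace annotation. */
	trace_i915_gem_request_wait_begin(ring, seqno,
			mutex_is_locked(&dev->struct_mutex));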
v2: Don't do a trace event per loop. (Chris)
Only get blocking/non-blocking info (Chris)
v3: updated comment in code as well as commit msg (Daniel)
Add "(NB)" to trace information to remind us in 6 months (Ben)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Various whitespace issues were generating warnings from checkpatch.pl.
This patch removes most of those warnings.
Signed-off-by: Akshay Joshi <me@akshayjoshi.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
A lot of minor tweaks to fix the tracepoints, improve the outputting for
ftrace, and to generally make the tracepoints useful again. It is a start
and enough to begin identifying performance issues and gaps in our
coverage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As the tracepoint is now decoupled from when the actual register is
assigned and was never complemented by detailing when the object lost
its fence, it has outlived its limited usefulness. Profiling the actual
stalls is a far more profitable venture anyway.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Add two tracepoints at I915_WRITE/READ for tracing all the
register writes and reads.
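Schematically (the real accessors may be functions rather than macros;
this shows the hook point, not the exact code):

#define I915_WRITE(reg, val) do {					\
	trace_i915_reg_rw(true, (reg), (val), sizeof(val));		\
	writel((val), dev_priv->regs + (reg));				\
} while (0)

#define I915_READ(reg)							\
	({								\
		u32 __v = readl(dev_priv->regs + (reg));		\
		trace_i915_reg_rw(false, (reg), __v, sizeof(__v));	\
		__v;							\
	})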
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This way we can make some more educated guesses as to why exactly
we can't use 2G apertures to their full potential ;)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Convert i915_gem_object_clflush to DEFINE_EVENT, and save ~0.5K:
text data bss dec hex filename
13204 2732 12 15948 3e4c i915_trace_points.o.orig
12668 2732 12 15412 3c34 i915_trace_points.o
No change in functionality.
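The pattern, condensed: the class carries the field layout and format
once, and each converted event becomes a cheap DEFINE_EVENT (sketch of
the clflush case):

DECLARE_EVENT_CLASS(i915_gem_object,
	    TP_PROTO(struct drm_i915_gem_object *obj),
	    TP_ARGS(obj),

	    TP_STRUCT__entry(
			     __field(struct drm_i915_gem_object *, obj)
			     ),

	    TP_fast_assign(
			   __entry->obj = obj;
			   ),

	    TP_printk("obj=%p", __entry->obj)
);

DEFINE_EVENT(i915_gem_object, i915_gem_object_clflush,
	     TP_PROTO(struct drm_i915_gem_object *obj),
	     TP_ARGS(obj)
);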
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes up include paths for i915_trace.h by setting additional CFLAGS
for i915_trace_points.c to include the $src directory. The required
TRACE_INCLUDE_PATH is then "."
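For reference, this pairs with the standard define_trace boilerplate at
the end of i915_trace.h; with the extra -I$(src) in the CFLAGS for
i915_trace_points.o, "." resolves to the driver source directory:

#undef TRACE_INCLUDE_PATH
#define TRACE_INCLUDE_PATH .
#define TRACE_INCLUDE_FILE i915_trace

/* This must be outside the multiple-include guard. */
#include <trace/define_trace.h>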
Signed-off-by: Peter Clifton <pcjc2@cam.ac.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
If we trigger a tracepoint for batch buffer submission, it is a reasonable
assumption that we wish to also trace the batch buffer completion. So in
order to capture the completion events, we need to enable irqs... However,
we cannot rely on the completion event to disable the irq later, so we
defer the irq disable to the retire request.
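One way this can be wired up (a sketch; the event and helper names follow
the conventions above, not the exact code) is to take the IRQ reference
inside the event body, so it only happens while the tracepoint is enabled:

TRACE_EVENT(i915_gem_request_submit,
	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
	    TP_ARGS(ring, seqno),

	    TP_STRUCT__entry(
			     __field(u32, ring)
			     __field(u32, seqno)
			     ),

	    TP_fast_assign(
			   __entry->ring = ring->id;
			   __entry->seqno = seqno;
			   /* Enable the completion interrupt; the
			    * reference is dropped when the request is
			    * retired, not from the interrupt itself. */
			   i915_trace_irq_get(ring, seqno);
			   ),

	    TP_printk("ring=%u, seqno=%u", __entry->ring, __entry->seqno)
);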
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By adding tracepoint equivalents for WATCH_BUF/EXEC we are able to monitor
the lifetimes of objects, requests and significant events. These events can
then be probed using the tracing frameworks, such as systemtap and, in
particular, perf.
For example, to record the stack trace for every GPU stall during a run, use
$ perf record -e i915:i915_gem_request_wait_begin -c 1 -g
and
$ perf report
to view the results.
[Updated to fix compilation issues.]
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Ben Gamari <bgamari@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>