Commit Graph

37138 Commits

Author SHA1 Message Date
Ben Skeggs 7df1bb87b8 drm/nouveau/disp/nv50-: avoid creating ORs that aren't present on HW
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-17 11:23:44 +10:00
Rodrigo Vivi 46c26662d2 drm/i915/cfl: Introduce Coffee Lake workarounds.
Coffee Lake inherit most of Kabylake production
workarounds.

v2: Fix typo on commit message and remove
    WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC,
    since as Mika pointed out they shouldn't be here for cfl
    according to BSpec.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497653398-15722-1-git-send-email-rodrigo.vivi@intel.com
2017-06-16 16:14:29 -07:00
Dave Airlie 660e855813 amdgpu: use drm sync objects for shared semaphores (v6)
This creates a new command submission chunk for amdgpu
to add in and out sync objects around the submission.

Sync objects are managed via the drm syncobj ioctls.

The command submission interface is enhanced with two new
chunks, one for syncobj pre submission dependencies,
and one for post submission sync obj signalling,
and just takes a list of handles for each.

This is based on work originally done by David Zhou at AMD,
with input from Christian Konig on what things should look like.

In theory VkFences could be backed with sync objects and
just get passed into the cs as syncobj handles as well.

NOTE: this interface addition needs a version bump to expose
it to userspace.

TODO: update to dep_sync when rebasing onto amdgpu master.
(with this - r-b from Christian)

v1.1: keep file reference on import.
v2: move to using syncobjs
v2.1: change some APIs to just use p pointer.
v3: make more robust against CS failures, we now add the
wait sems but only remove them once the CS job has been
submitted.
v4: rewrite names of API and base on new syncobj code.
v5: move post deps earlier, rename some apis
v6: lookup post deps earlier, and just replace fences
in post deps stage (Christian)

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:32 -04:00
Dave Airlie 6f0308ebc1 amdgpu/cs: split out fence dependency checking (v2)
This just splits out the fence depenency checking into it's
own function to make it easier to add semaphore dependencies.

v2: rebase onto other changes.

v1-Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:31 -04:00
Alex Deucher 64dab074fe drm/amdgpu: don't check the default value for vm size
Avoids printing spurious messages like this:
[    3.102059] amdgpu 0000:01:00.0: VM size (-1) must be a power of 2

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-16 16:58:00 -04:00
Dhinakaran Pandiyan 28e0f4eef6 drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH
Although we use 9 bits of Device ID for identifying PCH, only 8 bits are
stored in dev_priv->pch_id. This makes HAS_PCH_CNP_LP() and
HAS_PCH_SPT_LP() incorrect. Fix this by storing all the 9 bits for the
platforms with LP PCH.

v2: Drop PCH_LPT_LP change (Imre)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Fixes: commit ec7e0bb35f ("drm/i915/cnp: Add PCI ID for Cannonpoint LP PCH")
Reported-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497641774-29104-1-git-send-email-dhinakaran.pandiyan@intel.com
2017-06-16 23:06:20 +03:00
Chris Wilson 95ff7c7dd7 drm/i915: Stash a pointer to the obj's resv in the vma
During execbuf, a mandatory step is that we add this request (this
fence) to each object's reservation_object. Inside execbuf, we track the
vma, and to add the fence to the reservation_object then means having to
first chase the obj, incurring another cache miss. We can reduce the
 number of cache misses by stashing a pointer to the reservation_object
in the vma itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616140525.6394-1-chris@chris-wilson.co.uk
2017-06-16 16:54:05 +01:00
Chris Wilson 7dd4f6729f drm/i915: Async GPU relocation processing
If the user requires patching of their batch or auxiliary buffers, we
currently make the alterations on the cpu. If they are active on the GPU
at the time, we wait under the struct_mutex for them to finish executing
before we rewrite the contents. This happens if shared relocation trees
are used between different contexts with separate address space (and the
buffers then have different addresses in each), the 3D state will need
to be adjusted between execution on each context. However, we don't need
to use the CPU to do the relocation patching, as we could queue commands
to the GPU to perform it and use fences to serialise the operation with
the current activity and future - so the operation on the GPU appears
just as atomic as performing it immediately. Performing the relocation
rewrites on the GPU is not free, in terms of pure throughput, the number
of relocations/s is about halved - but more importantly so is the time
under the struct_mutex.

v2: Break out the request/batch allocation for clearer error flow.
v3: A few asserts to ensure rq ordering is maintained

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 1a71cf2fa6 drm/i915: Allow execbuffer to use the first object as the batch
Currently, the last object in the execlist is the always the batch.
However, when building the batch buffer we often know the batch object
first and if we can use the first slot in the execlist we can emit
relocation instructions relative to it immediately and avoid a separate
pass to adjust the relocations to point to the last execlist slot.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 8a2421bd0d drm/i915: Wait upon userptr get-user-pages within execbuffer
This simply hides the EAGAIN caused by userptr when userspace causes
resource contention. However, it is quite beneficial with highly
contended userptr users as we avoid repeating the setup costs and
kernel-user context switches.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 616d9cee4f drm/i915: First try the previous execbuffer location
When choosing a slot for an execbuffer, we ideally want to use the same
address as last time (so that we don't have to rebind it) and the same
address as expected by the user (so that we don't have to fixup any
relocations pointing to it). If we first try to bind the incoming
execbuffer->offset from the user, or the currently bound offset that
should hopefully achieve the goal of avoiding the rebind cost and the
relocation penalty. However, if the object is not currently bound there
we don't want to arbitrarily unbind an object in our chosen position and
so choose to rebind/relocate the incoming object instead. After we
report the new position back to the user, on the next pass the
relocations should have settled down.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtien@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson dade2a6165 drm/i915: Store a persistent reference for an object in the execbuffer cache
If we take a reference to the object/vma when it is first used in an
execbuf, we can keep that reference until the object's file-local handle
is closed. Thereby saving a frequent ref/unref pair.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 2889caa923 drm/i915: Eliminate lots of iterations over the execobjects array
The major scaling bottleneck in execbuffer is the processing of the
execobjects. Creating an auxiliary list is inefficient when compared to
using the execobject array we already have allocated.

Reservation is then split into phases. As we lookup up the VMA, we
try and bind it back into active location. Only if that fails, do we add
it to the unbound list for phase 2. In phase 2, we try and add all those
objects that could not fit into their previous location, with fallback
to retrying all objects and evicting the VM in case of severe
fragmentation. (This is the same as before, except that phase 1 is now
done inline with looking up the VMA to avoid an iteration over the
execobject array. In the ideal case, we eliminate the separate reservation
phase). During the reservation phase, we only evict from the VM between
passes (rather than currently as we try to fit every new VMA). In
testing with Unreal Engine's Atlantis demo which stresses the eviction
logic on gen7 class hardware, this speed up the framerate by a factor of
2.

The second loop amalgamation is between move_to_gpu and move_to_active.
As we always submit the request, even if incomplete, we can use the
current request to track active VMA as we perform the flushes and
synchronisation required.

The next big advancement is to avoid copying back to the user any
execobjects and relocations that are not changed.

v2: Add a Theory of Operation spiel.
v3: Fall back to slow relocations in preparation for flushing userptrs.
v4: Document struct members, factor out eb_validate_vma(), add a few
more comments to explain some magic and hide other magic behind macros.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 071750e550 drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations
If we write a relocation into the buffer, we require our own implicit
synchronisation added after the start of the execbuf, outside of the
user's control. As we may end up clflushing, or doing the patch itself
on the GPU, asynchronously we need to look at the implicit serialisation
on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this
object.

If the user does trigger a stall for relocations, we make sure the stall
is complete enough so that the batch is not submitted before we complete
those relocations.

Fixes: 77ae995789 ("drm/i915: Enable userspace to opt-out of implicit fencing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:05 +01:00
Chris Wilson 507d977ff9 drm/i915: Pass vma to relocate entry
We can simplify our tracking of pending writes in an execbuf to the
single bit in the vma->exec_entry->flags, but that requires the
relocation function knowing the object's vma. Pass it along.

Note we have only been using a single bit to track flushing since

commit cc889e0f6c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Wed Jun 13 20:45:19 2012 +0200

    drm/i915: disable flushing_list/gpu_write_list

unconditionally flushed all render caches before the breadcrumb and

commit 6ac42f4148
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Sat Jul 21 12:25:01 2012 +0200

    drm/i915: Replace the complex flushing logic with simple invalidate/flush all

did away with the explicit GPU domain tracking. This was then codified
into the ABI with NO_RELOC in

commit ed5982e6ce
Author: Daniel Vetter <daniel.vetter@ffwll.ch> # Oi! Patch stealer!
Date:   Thu Jan 17 22:23:36 2013 +0100

    drm/i915: Allow userspace to hint that the relocations were known

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:04 +01:00
Chris Wilson 4ff4b44cbb drm/i915: Store a direct lookup from object handle to vma
The advent of full-ppgtt lead to an extra indirection between the object
and its binding. That extra indirection has a noticeable impact on how
fast we can convert from the user handles to our internal vma for
execbuffer. In order to bypass the extra indirection, we use a
resizable hashtable to jump from the object to the per-ctx vma.
rhashtable was considered but we don't need the online resizing feature
and the extra complexity proved to undermine its usefulness. Instead, we
simply reallocate the hastable on demand in a background task and
serialize it before iterating.

In non-full-ppgtt modes, multiple files and multiple contexts can share
the same vma. This leads to having multiple possible handle->vma links,
so we only use the first to establish the fast path. The majority of
buffers are not shared and so we should still be able to realise
speedups with multiple clients.

v2: Prettier names, more magic.
v3: Many style tweaks, most notably hiding the misuse of execobj[].rsvd2

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-06-16 16:54:04 +01:00
Chris Wilson 4c9c0d0974 drm/i915: Fix retrieval of hangcheck stats
The default context is always supported (as it contains the global
hangcheck stats) and the contexts for hangcheck are not limited
to any ring.

This was dropped in 2013 because it was supposed to have been included
with Ben's full-ppgtt patch set. It never landed and the bug remains.

References: https://bugs.freedesktop.org/show_bug.cgi?id=65845
Link: http://patchwork.freedesktop.org/patch/msgid/1372175222-27622-1-git-send-email-mika.kuoppala@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616132849.29597-1-chris@chris-wilson.co.uk
2017-06-16 16:22:43 +01:00
Archit Taneja 816fa34c05 drm/msm/hdmi: Fix HDMI pink strip issue seen on 8x96
A 2 pixel wide pink strip was observed on the left end of some HDMI
monitors configured in a HDMI mode.

It turned out that we were missing out on configuring AVI infoframes, and
unlike APQ8064, the 8x96 HDMI H/W seems to be sensitive to that.

Add configuration of AVI infoframes. While at it, make sure that
hdmi_audio_update is only called when we've detected that the monitor
supports HDMI.

Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:09 -04:00
Archit Taneja b474cbbb2b drm/msm/hdmi: 8996 PLL: Populate unprepare
Without doing anything in unprepare, the HDMI driver isn't able to
switch modes successfully. Calling set_rate with a new rate results
in an un-locked PLL.

If we reset the PLL in unprepare, the PLL is able to lock with the
new rate.

Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:08 -04:00
Liviu Dudau ffe8f53f9c drm/msm/hdmi: Use bitwise operators when building register values
Commit c0c0d9eeeb ("drm/msm: hdmi audio support") uses logical
OR operators to build up a value to be written in the
REG_HDMI_AUDIO_INFO0 and REG_HDMI_AUDIO_INFO1 registers when it
should have used bitwise operators.

Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Fixes: c0c0d9eeeb ("drm/msm: hdmi audio support")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:08 -04:00
Rob Clark 52260ae4c4 drm/msm: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:07 -04:00
Rob Clark 8432a903fb drm/msm: remove address-space id
Now that the msm_gem supports an arbitrary number of vma's, we no longer
need to assign an id (index) to each address space.  So rip out the
associated code.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:06 -04:00
Rob Clark 4b85f7f5cf drm/msm: support for an arbitrary number of address spaces
It means we have to do a list traversal where we once had an index into
a table.  But the list will normally have one or two entries.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:06 -04:00
Rob Clark f4839bd512 drm/msm: refactor how we handle vram carveout buffers
Pull some of the logic out into msm_gem_new() (since we don't need to
care about the imported-bo case), and don't defer allocating pages.  The
latter is generally a good idea, since if we are using VRAM carveout to
allocate contiguous buffers (ie. no IOMMU), the allocation is more
likely to fail.  So failing at allocation time is a more sane option.
Plus this simplifies things in the next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:05 -04:00
Rob Clark 8bdcd949bb drm/msm: pass address-space to _get_iova() and friends
No functional change, that will come later.  But this will make it
easier to deal with dynamically created address spaces (ie. per-
process pagetables for gpu).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:04 -04:00
Rob Clark f59f62d592 drm/msm/mdp4+5: move aspace/id to base class
Before we can shift to passing the address-space object to _get_iova(),
we need to fix a few places (dsi+fbdev) that were hard-coding the adress
space id.  That gets somewhat easier if we just move these to the kms
base class.

Prep work for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:03 -04:00
Rob Clark aa7cd24297 drm/msm/mdp5: kill pipe_lock
It serves no purpose, things should be sufficiently synchronized already
by atomic framework.  And it is somewhat awkward to be holding a spinlock
when msm_gem_iova() is going to start needing to grab a mutex.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:02 -04:00
Rob Clark cb1e38181a drm/msm: fix locking inconsistency for gpu->hw_init()
Most, but not all, paths where calling the with struct_mutex held.  The
fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run
the first time) was masking this issue.

So lets just always hold struct_mutex for hw_init().  And sprinkle some
WARN_ON()'s and might_lock() to avoid this sort of problem in the
future.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:01 -04:00
Jordan Crouse 42a105e9cf drm/msm: Remove memptrs->wptr
memptrs->wptr seems to be unused. Remove it to avoid
confusing the upcoming preemption code.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:01 -04:00
Jordan Crouse 5770fc7a56 drm/msm: Add a struct to pass configuration to msm_gpu_init()
The amount of information that we need to pass into msm_gpu_init()
is steadily increasing, so add a new struct to stabilize the function
call and make it easier to add new configuration down the line.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:00 -04:00
Jordan Crouse 49fd08baa3 drm/msm: Add hint to DRM_IOCTL_MSM_GEM_INFO to return an object IOVA
Modify the 'pad' member of struct drm_msm_gem_info to 'flags'. If the
user sets 'flags' to non-zero it means that they want a IOVA for the
GEM object instead of a mmap() offset. Return the iova in the 'offset'
member.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[robclark: s/hint/flags in commit msg]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:15:47 -04:00
Jordan Crouse e895c7bd31 drm/msm: Remove idle function hook
There isn't any generic code that uses ->idle so remove it.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:15:47 -04:00
Jordan Crouse 167b606aa2 drm/msm: Remove DRM_MSM_NUM_IOCTLS
The ioctl array is sparsely populated but the compiler will make sure
that it is sufficiently sized for all the values that we have so we
can safely use ARRAY_SIZE() instead of having a constantly changing
#define in the uapi header.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:15:46 -04:00
Jordan Crouse 7c65817e6d drm/msm: gpu: Enable zap shader for A5XX
The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
only render to buffers that are marked as secure and inaccessible
to the kernel and user through a series of hardware protections. In
practice secure mode is used to draw things like a UI on a secure
video frame.

In order to switch out of secure mode the GPU executes a special
shader that clears out the GMEM and other sensitve registers and
then writes a register. Because the kernel can't be trusted the
shader binary is signed and verified and programmed by the
secure world. To do this we need to read the MDT header and the
segments from the firmware location and put them in memory and
present them for approval.

For targets without secure support there is an out: if the
secure world doesn't support secure then there are no hardware
protections and we can freely write the SECVID_TRUST register from
the CPU. We don't have 100% confidence that we can query the
secure capabilities at run time but we have enough calls that
need to go right to give us some confidence that we're at least doing
something useful.

Of course if we guess wrong you trigger a permissions violation
which usually ends up in a system crash but thats a problem
that shows up immediately.

[v2: use child device per Bjorn]
[v3: use generic MDT loader per Bjorn]
[v4: use managed dma functions and ifdefs for the MDT loader]
[v5: Add depends for QCOM_MDT_LOADER]

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[robclark: fix Kconfig to use select instead of depends + #if IS_ENABLED()]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:15:31 -04:00
Chris Wilson 7fc92e96c3 drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty
For ease of use (i.e. avoiding a few checks and function calls), store
the object's cache coherency next to the cache is dirty bit.

Specifically this patch aims to reduce the frequency of no-op calls to
i915_gem_object_clflush() to counter-act the increase of such calls for
GPU only objects in the previous patch.

v2: Replace cache_dirty & ~cache_coherent with cache_dirty &&
!cache_coherent as gcc generates much better code for the latter
(Tvrtko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Dongwon Kim <dongwon.kim@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616105455.16977-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-06-16 14:52:27 +01:00
Chris Wilson e27ab73d17 drm/i915: Mark CPU cache as dirty on every transition for CPU writes
Currently, we only mark the CPU cache as dirty if we skip a clflush.
This leads to some confusion where we have to ask if the object is in
the write domain or missed a clflush. If we always mark the cache as
dirty, this becomes a much simply question to answer.

The goal remains to do as few clflushes as required and to do them as
late as possible, in the hope of deferring the work to a kthread and not
block the caller (e.g. execbuf, flips).

v2: Always call clflush before GPU execution when the cache_dirty flag
is set. This may cause some extra work on llc systems that migrate dirty
buffers back and forth - but we do try to limit that by only setting
cache_dirty at the end of the gpu sequence.

v3: Always mark the cache as dirty upon a level change, as we need to
invalidate any stale cachelines due to external writes.

Reported-by: Dongwon Kim <dongwon.kim@intel.com>
Fixes: a6a7cc4b7d ("drm/i915: Always flush the dirty CPU cache when pinning the scanout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Dongwon Kim <dongwon.kim@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615123850.26843-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-06-16 14:50:52 +01:00
Chris Wilson b8e5d2ef19 drm/i915: Make i915_vma_destroy() static
i915_vma_destroy() is now not used outside of i915_vma.c so we can
remove the export and make the function static.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170616123508.12673-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
2017-06-16 14:32:12 +01:00
Ville Syrjälä 10223df2c1 drm/i915: Actually attach the tv_format property to the SDVO connector
Attach the tv_format property to the SDVO connector instead of passing
a '0' in place of the pointer to the property. This got broken when
the SDVO connector properties were converted to atomic.

We can thank sparse for catching this:
drivers/gpu/drm/i915/intel_sdvo.c:2742:75: warning: Using plain integer as NULL pointer

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: 630d30a4ee ("drm/i915: Convert intel_sdvo connector properties to atomic.")
Link: http://patchwork.freedesktop.org/patch/msgid/20170615172308.10121-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-16 15:46:57 +03:00
Liviu Dudau e40eda3dda drm/arm: mali-dp: Use CMA helper for plane buffer address calculation
CMA has gained a recent helper function for calculating the start
of a plane buffer's physical address. Use that instead of the
hand rolled version.

Cc: Brian Starkey <brian.starkey@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2017-06-16 11:56:46 +01:00
Liviu Dudau 0df34a8073 drm/mali-dp: Check PM status when sharing interrupt lines
If an instance of Mali DP hardware shares the interrupt line with
another hardware (usually another instance of the Mali DP) its
interrupt handler can get called when the device is suspended.

Check the PM status before making access to the hardware registers
to avoid deadlocks.

Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
2017-06-16 11:56:46 +01:00
Jose Abreu e2113c0367 drm/arm: malidp: Use crtc->mode_valid() callback
Now that we have a callback to check if crtc supports a given mode
we can use it in malidp so that we restrict the number of probbed
modes to the ones we can actually display.

Also, remove the mode_fixup() callback as this is no longer needed
because mode_valid() will be called before.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Carlos Palminha <palminha@synopsys.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Archit Taneja <architt@codeaurora.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Liviu Dudau <liviu@dudau.co.uk>
2017-06-16 11:56:46 +01:00
Jani Nikula 7dfb9ba33f Merge tag 'gvt-next-2017-06-08' of https://github.com/01org/gvt-linux into drm-intel-next-queued
gvt-next-2017-06-08

First gvt-next pull for 4.13:
- optimization for per-VM mmio save/restore (Changbin)
- optimization for mmio hash table (Changbin)
- scheduler optimization with event (Ping)
- vGPU reset refinement (Fred)
- other misc refactor and cleanups, etc.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608093547.bjgs436e3iokrzdm@zhen-hp.sh.intel.com
2017-06-16 10:03:01 +03:00
Arnd Bergmann 5499473c86 drm/nouveau: use proper prototype in nouveau_pmops_runtime() definition
There is a prototype for this function in the header, but the function
itself lacks a 'void' in the argument list, causing a harmless warning
when building with 'make W=1':

drivers/gpu/drm/nouveau/nouveau_drm.c: In function 'nouveau_pmops_runtime':
drivers/gpu/drm/nouveau/nouveau_drm.c:730:1: error: old-style function definition [-Werror=old-style-definition]

Fixes: 321f5c5f2c ("drm/nouveau: replace multiple open-coded runpm support checks with function")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:04 +10:00
Mikko Perttunen 876ea7be6a drm/nouveau: Skip vga_fini on non-PCI device
As with vga_init, this function doesn't make sense on non-PCI devices,
and the Thunderbolt check in it dereferences a NULL pointer in that
case. Add some code to skip this function when the device is not a PCI
device.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:04 +10:00
Mikko Perttunen fcd504e312 drm/nouveau/tegra: Don't leave GPU in reset
On Tegra186 systems with certain firmware revisions, leaving the GPU in
reset can cause a hang. To prevent this, don't leave the GPU in reset.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:04 +10:00
Mikko Perttunen b1df242544 drm/nouveau/tegra: Skip manual unpowergating when not necessary
On Tegra186, powergating is handled by the BPMP power domain provider
and the "legacy" powergating API is not available. Therefore skip
these calls if we are attached to a power domain.

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:03 +10:00
Oscar Salvador 3a93dd2243 drm/nouveau/hwmon: Change permissions to numeric
This patch replaces the symbolic permissions with the numeric ones.

Signed-off-by: Oscar Salvador <osalvador.vilardaga@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:03 +10:00
Oscar Salvador b28d78f187 drm/nouveau/hwmon: expose the auto_point and pwm_min/max attrs
This patch creates a special group attributes for attrs like "*auto_point*".
We check if we have support for them, and if we do, we gather them all in
an attribute_group's structure which is the parameter regarding special groups
of hwmon_device_register_with_info
We also do the same for pwm_min/max attrs.

Signed-off-by: Oscar Salvador <osalvador.vilardaga@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:02 +10:00
Oscar Salvador bfb96e4c34 drm/nouveau/hwmon: Remove old code, add .write/.read operations
This patch removes old code related to the old api and transforms the
functions for the new api. It also adds the .write and .read operations.

Signed-off-by: Oscar Salvador <osalvador.vilardaga@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:02 +10:00
Oscar Salvador dbddaaf083 drm/nouveau/hwmon: Add nouveau_hwmon_ops structure with .is_visible/.read_string
This patch introduces the nouveau_hwmon_ops structure, sets up
.is_visible and .read_string operations and adds all the functions
for these operations.
This is also a preparation for the next patches, where most of the
work is being done.
This code doesn't interacture with the old one.
It's just to make easier the review of all patches.

Signed-off-by: Oscar Salvador <osalvador.vilardaga@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:02 +10:00
Oscar Salvador 02e9722da8 drm/nouveau/hwmon: Add config for all sensors and their settings
This is a preparation for the next patches. It just adds the sensors with
their possible configurable settings and then fills the struct hwmon_channel_info
with all this information.

Signed-off-by: Oscar Salvador <osalvador.vilardaga@gmail.com>
Reviewed-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:01 +10:00
Ben Skeggs 2863204c62 drm/nouveau/disp/gm200-: allow non-identity mapping of SOR <-> macro links
Finally, everything should be in place to handle this.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:01 +10:00
Ben Skeggs 0d93cd92bd drm/nouveau/disp/nv50-: implement a common supervisor 3.0
This makes use of all the additional routing and state added in previous
commits, making it possible to deal with GM20x macro link routing, while
also sharing code between the NV50 and GF119 implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:00 +10:00
Ben Skeggs 8d7ef84d90 drm/nouveau/disp/nv50-: implement a common supervisor 2.2
This makes use of all the additional routing and state added in previous
commits, making it possible to deal with GM20x macro link routing, while
also sharing code between the NV50 and GF119 implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:00 +10:00
Ben Skeggs 1f0c9eaf31 drm/nouveau/disp/nv50-: implement a common supervisor 2.1
This makes use of all the additional routing and state added in previous
commits, making it possible to deal with GM20x macro link routing, while
also sharing code between the NV50 and GF119 implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:05:00 +10:00
Ben Skeggs d52e948c67 drm/nouveau/disp/nv50-: implement a common supervisor 2.0
This makes use of all the additional routing and state added in previous
commits, making it possible to deal with GM20x macro link routing, while
also sharing code between the NV50 and GF119 implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:59 +10:00
Ben Skeggs 327c5581d3 drm/nouveau/disp/nv50-: implement a common supervisor 1.0
This makes use of all the additional routing and state added in previous
commits, making it possible to deal with GM20x macro link routing, while
also sharing code between the NV50 and GF119 implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:59 +10:00
Ben Skeggs 99a845a30f drm/nouveau/disp/nv50-gt21x: remove workaround for dp->tmds hotplug issues
This shouldn't have been needed ever since we started executing the
DisableLT script when shutting down heads.

Testing of the board this was originally written for seems to agree.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:59 +10:00
Ben Skeggs 32a232c514 drm/nouveau/disp/dp: use new devinit script interpreter entry-point
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:58 +10:00
Ben Skeggs 9648da5a71 drm/nouveau/disp/dp: determine link bandwidth requirements from head state
Training/Untraining will be hooked up to the routing logic, which
doesn't allow us to pass in a data rate.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:58 +10:00
Ben Skeggs 6c22ea3747 drm/nouveau/disp: introduce acquire/release display path methods
These exist to give NVKM information on the set of display paths that
the DD needs to be active at any given time.

Previously, the supervisor attempted to determine this solely from OR
state, but there's a few configurations where this information on its
own isn't enough to determine the specific display paths in question:

- ANX9805, where the PIOR protocol for both DP and TMDS is TMDS.
- On a device using DCB Switched Outputs.
- On GM20x and newer, with a crossbar between the SOR and macro links.

After this commit, the DD tells NVKM *exactly* which display path it's
attempting a modeset on.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:57 +10:00
Ben Skeggs 3c66c87dc9 drm/nouveau/disp: remove hw-specific customisation of output paths
All of the necessary hw-specific logic is now handled at the output
resource level, so all of this can go away.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:57 +10:00
Ben Skeggs e8ccc96dd5 drm/nouveau/disp/gf119-: port OR DP VCPI control to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:57 +10:00
Ben Skeggs 409b9e5472 drm/nouveau/disp/gt215-: port HDA ELD controls to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:56 +10:00
Ben Skeggs 7d1fede03c drm/nouveau/disp/g94-: port OR DP drive setting control to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:56 +10:00
Ben Skeggs a1de2b522f drm/nouveau/disp/g94-: port OR DP training pattern control to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:56 +10:00
Ben Skeggs a3e81117ce drm/nouveau/disp/g94-: port OR DP link power control to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:55 +10:00
Ben Skeggs 7dc0bac4aa drm/nouveau/disp/g94-: port OR DP link setup to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:55 +10:00
Ben Skeggs 333781045d drm/nouveau/disp/g94-: port OR DP lane mapping to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:54 +10:00
Ben Skeggs 797b2fb81b drm/nouveau/disp/g84-: port OR HDMI control to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:54 +10:00
Ben Skeggs 0df1824662 drm/nouveau/disp/nv50-: port OR manual sink detection to nvkm_ior
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:54 +10:00
Ben Skeggs 9c5753bc70 drm/nouveau/disp/nv50-: port OR power state control to nvkm_ior
Also removes the user-facing methods to these controls, as they're not
currently utilised by the DD anyway.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:53 +10:00
Ben Skeggs 29c0ca7389 drm/nouveau/disp/nv50-: fetch head/OR state at beginning of supervisor
This data will be used by essentially every part of the supervisor
handling process.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:53 +10:00
Ben Skeggs 3607bfd398 drm/nouveau/disp/nv50-: execute supervisor on its own workqueue
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:52 +10:00
Ben Skeggs 7d0a01a6de drm/nouveau/disp/dp: train link only when actively displaying an image
This essentially (unless the link becomes unstable and needs to be
re-trained) gives us a single entry-point to link training, during
supervisor handling, where we can ensure all routing is up to date.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:52 +10:00
Ben Skeggs 22e008f90d drm/nouveau/disp/dp: only check for re-train when the link is active
An upcoming commit will limit link training to only when the sink is
meant to be displaying an image.

We still need IRQs enabled even when the link isn't trained (for MST
messages), but don't want to train the link unnecessarily.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:52 +10:00
Ben Skeggs 49f2b376df drm/nouveau/disp/dp: determine a failsafe link training rate
The aim here is to protect the OR against locking up when something
unexpected happens (such as the display disappearing during modeset,
or the DD misbehaving).

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:51 +10:00
Ben Skeggs fafa8b5c9f drm/nouveau/disp/dp: use cached link configuration when checking link status
Saves some trips across the aux channel.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:51 +10:00
Ben Skeggs 4423c743ef drm/nouveau/disp/dp: no need for lt_state except during manual link training
This struct doesn't hold link configuration data anymore, so we can
limit its use to internal DP training (anx9805 handles training for
external DP).

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:51 +10:00
Ben Skeggs 75eefe95ee drm/nouveau/disp/dp: store current link configuration in nvkm_ior
We care about this information outside of link training.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:50 +10:00
Ben Skeggs 02d786ccbc drm/nouveau/disp/dp: remove DP_PWR method
This hasn't been used since atomic.

We may want to re-implement "fast" DPMS at some point, but for now,
this just gets in the way.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:50 +10:00
Ben Skeggs 01a976376b drm/nouveau/disp: identity-map display paths to output resources
This essentially replicates our current behaviour in a way that's
compatible with the new model that's emerging, so that we're able
to start porting the hw-specific functions to it.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:49 +10:00
Ben Skeggs b3c9c0226c drm/nouveau/disp: fork off some new hw-specific implementations
Upcoming commits make supervisor handling share code between the NV50
and GF119 implementations.  Because of this, and a few other cleanups,
we need to allow some additional customisation.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:49 +10:00
Ben Skeggs 78f1ad6f65 drm/nouveau/disp: introduce input/output resource abstraction
In order to properly support the SOR -> SOR + pad macro separation
that occurred with GM20x GPUs, we need to separate OR handling out
of the output path code.

This will be used as the base to support ORs (DAC, SOR, PIOR).

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:49 +10:00
Ben Skeggs 57b2d73be2 drm/nouveau/disp: common implementation of scanoutpos method in nvkm_head
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:48 +10:00
Ben Skeggs 14187b007e drm/nouveau/disp: move vblank_{get,put} methods into nvkm_head
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:48 +10:00
Ben Skeggs a1c930789a drm/nouveau/disp: introduce object to track per-head functions/state
Primarily intended as a way to pass per-head state around during
supervisor handling, and share logic between NV50/GF119.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:48 +10:00
Ben Skeggs 4b2b42f8e9 drm/nouveau/disp: delay output path / connector construction until oneinit()
This is to allow hw-specific code to instantiate output resources first,
so we can cull unsupported output paths based on them.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:47 +10:00
Ben Skeggs 981a8162e2 drm/nouveau/disp: s/nvkm_connector/nvkm_conn/
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:47 +10:00
Ben Skeggs f3e70d2991 drm/nouveau/disp: rename nvkm_output_dp to nvkm_dp
Not all users of nvkm_output_dp have been changed here.  The remaining
ones belong to code that's disappearing in upcoming commits.

This also modifies the debug level of some messages.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:46 +10:00
Ben Skeggs d7ce92e273 drm/nouveau/disp: rename nvkm_output to nvkm_outp
This isn't technically "output", but, "display/output path".

Not all users of nvkm_output have been changed here.  The remaining
ones belong to code that's disappearing in upcoming commits.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:46 +10:00
Ben Skeggs af85389c61 drm/nouveau/disp: shuffle functions around
Upcoming changes to split OR from output path drastically change the
placement of various operations.

In order to make the real changes clearer, do the moving around part
ahead of time.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:46 +10:00
Ben Skeggs 639d72e242 drm/nouveau/kms/nv04: use new devinit script interpreter entry-point
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:45 +10:00
Ben Skeggs 4fdc6ba32e drm/nouveau/fb/ram/nv40-: use new devinit script interpreter entry-point
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:45 +10:00
Ben Skeggs 28c62976a8 drm/nouveau/devinit: use new devinit script interpreter entry-point
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:44 +10:00
Ben Skeggs 74bcb2e98a drm/nouveau/bios/init: add a new devinit script interpreter entry-point
This will ensure unspecified args are easily identified.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:44 +10:00
Ben Skeggs b88afa4396 drm/nouveau/bios/init: add or/link args separate from output path
As of DCB 4.1, these are not the same thing.

Compatibility temporarily in place until callers have been updated.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:44 +10:00
Ben Skeggs ca9c2d5b28 drm/nouveau/bios/init: bump script offset to 32-bits
No (known) case yet, but other tables have been moving beyond 16-bits,
so we may as well be prepared.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:43 +10:00
Ben Skeggs 2195a22f6d drm/nouveau/bios/init: rename 'crtc' to 'head'
Compatibility temporarily in place until all callers have been updated.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:43 +10:00
Ben Skeggs 5b0e787ad5 drm/nouveau/bios/init: remove internal use of nvbios_init.bios
We already have a subdev pointer, from which we can locate the device's
BIOS subdev.  No need for a separate pointer.

Structure/callers not updated yet, as I want to batch more changes and
only touch the callers once.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:42 +10:00
Ben Skeggs 4bb4a7466a drm/nouveau/bios/init: rename nvbios_init() to nvbios_devinit()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:42 +10:00
Ben Skeggs 7eaf1198a9 drm/nouveau/tmr: remove nvkm_timer_alarm_cancel()
nvkm_timer_alarm() already handles this as part of protecting against
callers passing in no timeout value.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:42 +10:00
Karol Herbst 4dc33b1222 drm/nouveau/bios/iccsense: rails for power sensors have a mask of 0xf8 for version 0x10
I only saw those values inside the vbios: 0xff, 0xfd, 0xfc, 0xfa for valid
rails.

No idea what the lower value does, but at least we get power readings on
a lot of Fermi GPUs with that.

v2: add missing parentheses

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:19 +10:00
Karol Herbst c0cd04700f drm/nouveau/bios/volt: Parse min and max for Version 0x40
This is according to what we have in nvbios.

Fixes "ERROR: Can't get value of subfeature in0_min: Can't read" errors
in sensors for some GPUs.

Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:19 +10:00
Alastair Bridgewater 0f18d2765a drm/nouveau: Enable stereoscopic 3D output over HDMI
Enable stereoscopic output for HDMI and DisplayPort connectors on
NV50+ (G80+) hardware.  We do not enable stereoscopy on older
hardware in case there is some older board that still has HDMI
output but for which we have no logic for setting the Vendor
InfoFrame.

With this, I get an obvious 3D output when using the "testdisplay"
program from intel-gpu-tools with the "-3" parameter and outputting
to a 3D-capable HDMI display, for all available 3D modes (be they
TB, SBSH, or FP) on all four G80+ DISPs.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
2017-06-16 14:04:19 +10:00
Alastair Bridgewater 37aa2243ff drm/nouveau: Handle frame-packing mode geometry and timing effects
Frame-packing modes add an extra vtotal raster lines to each frame
above and beyond what the basic mode description calls for.
Account for this during scaler configuration (possibly a bit of a
hack), during CRTC configuration (clearly not a hack), and when
checking that a mode is valid for a given connector (cribbed from
the i915 driver).

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater a897074350 drm/nouveau/disp/gk104-: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic"?) InfoFrame.

Also don't enable any InfoFrame that is not provided, and disable
the Vendor InfoFrame when disabling the output.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater 2709b275c5 drm/nouveau/disp/gf119: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic"?) InfoFrame.

Also don't enable any InfoFrame that is not provided, and disable
the Vendor InfoFrame when disabling the output.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater ba32836879 drm/nouveau/disp/gt215: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic") InfoFrame.

Also don't enable any AVI or Vendor InfoFrame that is not provided,
and disable the Vendor InfoFrame when disabling the output.

Ignore the Audio InfoFrame: We don't supply it, and altering HDMI
audio semantics (for better or worse) on this hardware is out of
scope for me at this time.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater a45f7908b3 drm/nouveau/disp/g84-gt200: Use supplied HDMI InfoFrames
Now that we have the InfoFrame data being provided, for the most
part, program the hardware to use it.

While we're here, and since the functionality will come in handy
for supporting 3D stereoscopy, implement setting the Vendor
("generic"?) InfoFrame.

Also don't enable any AVI or Vendor InfoFrame that is not provided,
and disable the Vendor InfoFrame when disabling the output.

Ignore the Audio InfoFrame: We don't supply it, and altering HDMI
audio semantics (for better or worse) on this hardware is out of
scope for me at this time.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater f60213c0ee drm/nouveau/disp: Add mechanism to convert HDMI InfoFrames to hardware format
HDMI InfoFrames are passed to NVKM as bags of bytes, but the
hardware needs them to be packed into words.  Rather than having
four (or more) copies of the packing logic introduce a single copy
now, in a central place.

We currently need these for AVI and Vendor InfoFrames, but we may
also expect to need them for Audio InfoFrames at some point.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater 34fd3e5d8c drm/nouveau: Pass mode-dependent AVI and Vendor HDMI InfoFrames to NVKM
Now that we have mechanism by which to pass mode-dependent HDMI
InfoFrames to the low-level hardware driver, it is incumbent upon
us to do so.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater 31fe2c2002 drm/nouveau/disp/g84-: Extend NVKM HDMI power control method to set InfoFrames
The nouveau driver, in the Linux 3.7 days, used to try and set the
AVI InfoFrame based on the selected display mode.  These days, it
uses a fixed set of InfoFrames.  Start to correct that, by
providing a mechanism whereby InfoFrame data may be passed to the
NVKM functions that do the actual configuration.

At this point, only establish the new parameters and their parsing,
don't actually use the data anywhere yet (since it's not supplied
anywhere).

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Alastair Bridgewater 35dd9874bf drm/nouveau: Clean up nv50_head_atomic_check_mode() and fix blankus calculation
drm_mode_set_crtcinfo() does compensation for interlace and
doublescan timing effects already, so do it first and use the
compensated figures instead of the constant "vscan / ilace" terms
that we had before.

And then it turns out that the hardware model for how the timing
parameters are configured is basically the standard model, but
starting one clock before the sync pulse rather than at the start
of the display area, which lets us drastically simplify the
overall timing calculations (verifying the changes by algebraic
operations is left as an exercise for the reader).

Finally, there were a couple of issues with the computation of
m->v.blankus that are addressed here.  Interlaced modes would
generate a negative intermediate result.  Double scan modes would
generate an overestimate rather than an underestimate.  And when
enabling frame-packing modes, a rather extreme overestimate would
be generated.  Fixed, by using the timings as adjusted for the
CRTC to find the length of the vertical blanking period instead of
mixing adjusted and pre-adjustment timing parameters.

Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2017-06-16 14:04:18 +10:00
Dave Airlie 925344ccc9 Linux 4.12-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh
 biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G
 kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU
 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+
 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E
 W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4=
 =m2Gx
 -----END PGP SIGNATURE-----

BackMerge tag 'v4.12-rc5' into drm-next

Linux 4.12-rc5 for nouveau fixes
2017-06-16 13:58:27 +10:00
Dave Airlie a682169891 imx-drm: cleanups and YUV 4:2:0 memory read/write reduction support
- Remove counter load enable form PRE, which has no effect.
 - Add support for setting the double read/write reduction flag in channel
   parameter memory. This can be used to save some memory bandwidth when
   capturing in YUV 4:2:0 chroma subsampled formats.
 - Allocate DMA channel structures as needed, most of the 64 channels are
   unused or even reserved.
 - Remove unused interrupt busy waiting routine.
 - Set VDIC field order for both AUTO and MAN inputs simultaneously as
   both can't be active at the same time.
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCAA1FiEEBsBxhV1FaKwXuCOBUMKIHHCeYOsFAlk49zAXHHAuemFiZWxA
 cGVuZ3V0cm9uaXguZGUACgkQUMKIHHCeYOtX+g/+M5DPkGhcP8fN9m47MOAEa+nP
 oDciYC76VrQev/Qys/zLM3/6sWF9h82USJ52H+zw41RuKKkYlcOzVWnSPQd2yN6Y
 hIl0fCsFzGxOMnIAhmi6BHFnvJKP1jsfeBdXSHyxI0y5kGoufG7BEHiJ7TTSgy/I
 JhccDKTRV9NzAfwpD37EI3a/Nc53DRpw3jrnHPnAaBJ6hYVPZ9YCSrBYbQQbIrDr
 x6NB8E1Ga3KRGZMTw45bTBiOs4AbZKSunzrqWQFnTRjbE+aTDs9W5n5wcdr7AoWi
 gqnx+b6TkiarNK3taHffjYioYvvn2nbGhuoAtg7hpS0CeUup0gitQquKM0kqsWnk
 yBykkP+Z7udASRgXdK6Gtzo6hzdhPeFPmmMbKmSBdIvT26t0ikf9RN1UlEhE6nY3
 A68jKC4+gNTu8kF6imzWCfwM9KB4pWn0N0qTY5U9Y7/gWFky6IEDn3V5OM3XXqUa
 c/iglYyzO+B7vVu6ZajlH+shemO1mVaxGjVFrfX29syVooZrmo0NVJPdoKwiYx0r
 E08FCOdUIEhzFS6h1/FII+mZ6YzAmNkXVz+l+MWaoOW2tIWWX4xSsFTKBka2hDfq
 daU3CDcKcg8LuSybHg6lzj8Hw+/CWkqgtv6ESVnvHEUYCHoZsvArJTNoQrx/zJbE
 EwgZIM4daufEuv6I5zo=
 =m3BY
 -----END PGP SIGNATURE-----

Merge tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux into drm-next

imx-drm: cleanups and YUV 4:2:0 memory read/write reduction support

- Remove counter load enable form PRE, which has no effect.
- Add support for setting the double read/write reduction flag in channel
  parameter memory. This can be used to save some memory bandwidth when
  capturing in YUV 4:2:0 chroma subsampled formats.
- Allocate DMA channel structures as needed, most of the 64 channels are
  unused or even reserved.
- Remove unused interrupt busy waiting routine.
- Set VDIC field order for both AUTO and MAN inputs simultaneously as
  both can't be active at the same time.

* tag 'imx-drm-next-2017-06-08' of git://git.pengutronix.de/git/pza/linux:
  gpu: ipu-v3: vdic: include AUTO field order bit in ipu_vdi_set_field_order
  gpu: ipu-v3: remove interrupt busy waiting routine
  gpu: ipu-v3: allocate ipuv3_channels as needed
  gpu: ipu-v3: Add support for double read/write reduction
  gpu: ipu-v3: prg: remove counter load enable
2017-06-16 10:05:38 +10:00
Dave Airlie 033fd3256f Merge tag 'drm-fsl-dcu-for-v4.13' of http://git.agner.ch/git/linux-drm-fsl-dcu into drm-next
some fsl-dcu cleanups

* tag 'drm-fsl-dcu-for-v4.13' of http://git.agner.ch/git/linux-drm-fsl-dcu:
  drm/fsl-dcu: use new drm_atomic_helper_shutdown
  drm/fsl-dcu: implement irq_preinstall/uninstall callbacks
  drm/fsl: Drop drm_vblank_cleanup
2017-06-16 10:05:03 +10:00
Dave Airlie 202dfa086d Merge branch 'drm/next/du' of git://linuxtv.org/pinchartl/media into drm-next
The series interleaves DRM and V4L2 patches due to dependencies between the R-
Car DU and VSP drivers. Mauro has acked all the V4L2 patches to go through
your tree, and they don't conflict with anything queued for v4.13 in his tree.
If I need to send any conflicting patches through Mauro's tree for v4.13, I'll
make sure to base them on this branch.

* 'drm/next/du' of git://linuxtv.org/pinchartl/media:
  drm: rcar-du: Map memory through the VSP device
  v4l: vsp1: Add API to map and unmap DRM buffers through the VSP
  v4l: vsp1: Map the DL and video buffers through the proper bus master
  v4l: rcar-fcp: Add an API to retrieve the FCP device
  v4l: rcar-fcp: Don't get/put module reference
  drm: rcar-du: Register a completion callback with VSP1
  v4l: vsp1: Extend VSP1 module API to allow DRM callbacks
  v4l: vsp1: Postpone frame end handling in event of display list race
  drm: rcar-du: Arm the page flip event after queuing the page flip
2017-06-16 10:04:14 +10:00
Dave Airlie 7249e3d64e sun4i-drm changes for 4.13
An unusually big pull request for this merge window, with three notable
 features:
   - V3s display engine support. This is especially notable because it uses
     a different display engine used on the newer Allwinner SoCs (H3, A64
     and the likes) that will be quite easily supported now.
   - HDMI support for the old Allwinner SoCs. This is enabled only on the
     A10s for now, but should be really easy to extend to deal with A10, A20
     and A31
   - Preliminary work to deal with dual-pipeline SoCs (A10, A20, A31, H3,
     etc.). It currently ignores the second pipeline, but we can use the
     dual-pipelines bindings. This will be useful to enable the display
     pipeline while we work on the dual-pipeline.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZQZ40AAoJEBx+YmzsjxAgEDgP+wd29aHXZPIp7EGlCR9L1qPS
 62Ps5lrbgwxGJx8Nf7T51tQLl0SDBHrI9xgrQLla1bdqItAMyIkJG2NQ+INN2ZSt
 Yz2aUZL3Xx9DE3dUhbLdfUejOCQZqopn8pPe2egeMYNXS5hsAhwwLB8AYcX3LLCN
 WFpbC3hcd4pih/teDGyJ4bn4I/teyA3qWleq73f+A/eBBbCI4aRZWvx34/G/pzUL
 KeHryE35evukgfOJmZ9wAYMbj8BWHV3t6xObk9TqYrgQ/Vzh7IkAWl1wkcd2qRSQ
 RVhfCI4s0jkd44EHMQLxqh27LQakSviIqZWsJEf91qvi5g9A4ZyIS/H4/dGusfbP
 59QkREjOdnlVqTSfP5E0QzZbzdcbZhV2YoPp73Rv319lif4HqV/9Z7IhDBmNC2IZ
 g2O4J3DKvDjtP4A2s3CoGRVgxf1ag/K+Ig6AEQZL+ShrcHwPEBrVqfqp/MnV+Qg2
 gRv0YimGMKReHaMYkT8NdDtqlAzaLWUFnEYRO92X0B4m3NZENMeeDZ8wMy26/yxo
 TJxwzxxsNON59mZESS0Ci88q/bMnFkyCYvuWIeHol2GgUK7yzN2ePf9p/eUPOziK
 gDbJaKhSZxH67Lerqg+zCFbuPGDu7ZGyOHWMgfG6Xh69bCJptJAJNK03TEvoBvpX
 CSdBObTExOj5S/m64+SI
 =HtFh
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-drm-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-next

sun4i-drm changes for 4.13

An unusually big pull request for this merge window, with three notable
features:
  - V3s display engine support. This is especially notable because it uses
    a different display engine used on the newer Allwinner SoCs (H3, A64
    and the likes) that will be quite easily supported now.
  - HDMI support for the old Allwinner SoCs. This is enabled only on the
    A10s for now, but should be really easy to extend to deal with A10, A20
    and A31
  - Preliminary work to deal with dual-pipeline SoCs (A10, A20, A31, H3,
    etc.). It currently ignores the second pipeline, but we can use the
    dual-pipelines bindings. This will be useful to enable the display
    pipeline while we work on the dual-pipeline.

* tag 'sunxi-drm-for-4.13' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: (27 commits)
  drm/sun4i: Add compatible for the A10s pipeline
  drm/sun4i: Add HDMI support
  dt-bindings: display: sun4i: Add allwinner,tcon-channel property
  dt-bindings: display: sun4i: Add HDMI display bindings
  drm/sun4i: Ignore the generic connectors for components
  drm/sun4i: tcon: multiply the vtotal when not in interlace
  drm/sun4i: tcon: Change vertical total size computation inconsistency
  drm/sun4i: tcon: Fix tcon channel 1 backporch calculation
  drm/sun4i: tcon: Switch mux on only for composite
  drm/sun4i: tcon: Move the muxing out of the mode set function
  drm/sun4i: tcon: Add channel debug
  drm/sun4i: tcon: add support for V3s TCON
  drm/sun4i: Add compatible string for V3s display engine
  drm/sun4i: add support for Allwinner DE2 mixers
  drm/sun4i: add a Kconfig option for sun4i-backend
  drm/sun4i: abstract a engine type
  drm/sun4i: return only planes for layers created
  dt-bindings: add bindings for DE2 on V3s SoC
  drm/sun4i: backend: Clarify sun4i_backend_layer_enable debug message
  drm/sun4i: Set TCON clock inside sun4i_tconX_mode_set
  ...
2017-06-16 10:02:35 +10:00
Dave Airlie 7119dbdf7c Merge tag 'drm-intel-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
drm/i915 fixes for v4.12-rc6

* tag 'drm-intel-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: Fix GVT-g PVINFO version compatibility check
  drm/i915: Fix SKL+ watermarks for 90/270 rotation
  drm/i915: Fix scaling check for 90/270 degree plane rotation
2017-06-16 10:01:52 +10:00
Dave Airlie 91c0719c69 Merge tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Driver Changes:
- dw-hdmi: Fix compilation error if REGMAP_MMIO not selected (Laurent)
- host1x: Fix incorrect return value (Christophe)
- tegra: Shore up idr API usage in tegra staging code (Dmitry)
- mgag200: Always use HiPri mode for G200e4v2 and limit max bandwidth (Mathieu)
- mxsfb: Ensure display can be lit up without bootloader initialization (Fabio)

Cc: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Mathieu Larouche <mathieu.larouche@matrox.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>

* tag 'drm-misc-fixes-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc:
  drm: mxsfb_crtc: Reset the eLCDIF controller
  drm/mgag200: Fix to always set HiPri for G200e4 V2
  drm/tegra: Correct idr_alloc() minimum id
  drm/tegra: Fix lockup on a use of staging API
  gpu: host1x: Fix error handling
  drm: dw-hdmi: Fix compilation breakage by selecting REGMAP_MMIO
2017-06-16 10:01:04 +10:00
Dave Airlie 04d4fb5fa6 Merge branch 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-next
New radeon and amdgpu features for 4.13:
- Lots of Vega10 bug fixes
- Preliminary Raven support
- KIQ support for compute rings
- MEC queue management rework from Andres
- Audio support for DCE6
- SR-IOV improvements
- Improved module parameters for controlling radeon vs amdgpu support
  for SI and CIK
- Bug fixes
- General code cleanups

[airlied: dropped drmP.h header from one file was needed and build broke]

* 'drm-next-4.13' of git://people.freedesktop.org/~agd5f/linux: (362 commits)
  drm/amdgpu: Fix compiler warnings
  drm/amdgpu: vm_update_ptes remove code duplication
  drm/amd/amdgpu: Port VCN over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
  drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
  drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
  drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
  drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
  drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
  drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
  drm/amd/amdgpu: Add offset variant to SOC15 macros
  drm/amd/powerplay: add avfs control for Vega10
  drm/amdgpu: add virtual display support for raven
  drm/amdgpu/gfx9: fix compute ring doorbell index
  drm/amd/amdgpu: Rename KIQ ring to avoid spaces
  drm/amd/amdgpu: gfx9 tidy ups (v2)
  drm/amdgpu: add contiguous flag in ucode bo create
  drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
  drm/amdgpu: export test ib debugfs interface
  ...
2017-06-16 09:56:53 +10:00
Dave Airlie bfda9aa153 Merge tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc into drm-next
Cross-subsystem Changes:
- dt-bindings: add vendor prefix for NLT Technologies, Ltd. (Lucas)
- dt-bindings: Add support for samsung s6e3hf2 panel (Hoegeun)

Core Changes:
- Add drm_panel_bridge to avoid connector boilerplate in drivers (Eric)
- Trival fixes for dupe forward decl and reduce scope of variable (Dawid)

Driver Changes:
- dw-hdmi: Use mode_valid hook on bridge instead of connector (Jose)
- vc4,atmel-hlcdc: Use drm_panel_bridge where appropriate (Eric)
- panel: Add Innolux P079ZCA panel driver (Chris)
- panel-simple: Add NL12880B20-05, NL192108AC18-02D, P320HVN03 panels (Lucas)
- panel-samsung-s6e3ha2: Add s6e3hf2 panel support (Hoegeun)
- zte,vc4,pl111,panel,mxsfb: Miscellaneous fixes

Cc: Jose Abreu <Jose.Abreu@synopsys.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Chris Zhong <zyw@rock-chips.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Hoegeun Kwon <hoegeun.kwon@samsung.com>
Cc: Dawid Kurek <dawikur@gmail.com>

* tag 'drm-misc-next-2017-06-15' of git://anongit.freedesktop.org/git/drm-misc: (26 commits)
  drm: Reduce scope of 'state' variable
  drm: mxsfb_crtc: Reset the eLCDIF controller
  drm: Remove duplicate forward declaration
  drm/panel: s6e3ha2: Add support for s6e3hf2 panel on TM2e board
  dt-bindings: Add support for samsung s6e3hf2 panel
  drm/panel: add backlight dependency for sitronix-st7789v
  drm/panel: S6E3HA2 needs backlight code
  drm/panel: simple: add support for AUO P320HVN03
  drm/panel: simple: add support for NLT NL192108AC18-02D
  dt-bindings: add vendor prefix for NLT Technologies, Ltd.
  drm/panel: simple: add support for NEC NL12880B20-05
  drm/panel: add Innolux P079ZCA panel driver
  dt-bindings: Add INNOLUX P079ZCA panel bindings
  drm/vc4: Fix resource leak in 'vc4_get_hang_state_ioctl()' in error handling path
  drm/vc4/vc4_bo.c: always set bo->resv
  drm: Add const to name field declaration in struct drm_prop_enum_list
  drm/pl111: Fix offset calculation for the primary plane.
  drm/atmel-hlcdc: Fix panel registration
  drm/bridge: Build the panel wrapper in drm_kms_helper
  drm/atmel-hlcdc: Replace the panel usage with drm_panel_bridge.
  ...
2017-06-16 09:33:43 +10:00
Boris Brezillon 34c8ea400f drm/vc4: Mimic drm_atomic_helper_commit() behavior
The VC4 KMS driver is implementing its own ->atomic_commit() but there
are a few generic helpers we can use instead of open-coding the logic.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1496392332-8722-4-git-send-email-boris.brezillon@free-electrons.com
2017-06-15 16:29:08 -07:00
Eric Anholt 83753117f1 drm/vc4: Add get/set tiling ioctls.
This allows mesa to set the tiling format for a BO and have that
tiling format be respected by mesa on the other side of an
import/export (and by vc4 scanout in the kernel), without defining a
protocol to pass the tiling through userspace.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-2-eric@anholt.net
Acked-by: Dave Airlie <airlied@redhat.com>
2017-06-15 16:02:45 -07:00
Eric Anholt 98830d91da drm/vc4: Add T-format scanout support.
The T tiling format is what V3D uses for textures, with no raster
support at all until later revisions of the hardware (and always at a
large 3D performance penalty).  If we can't scan out V3D's format,
then we often need to do a relayout at some stage of the pipeline,
either right before texturing from the scanout buffer (common in X11
without a compositor) or between a tiled screen buffer right before
scanout (an option I've considered in trying to resolve this
inconsistency, but which means needing to use the dirty fb ioctl and
having some update policy).

T-format scanout lets us avoid either of those shadow copies, for a
massive, obvious performance improvement to X11 window dragging
without a compositor.  Unfortunately, enabling a compositor to work
around the discrepancy has turned out to be too costly in memory
consumption for the Raspbian distribution.

Because the HVS operates a scanline at a time, compositing from T does
increase the memory bandwidth cost of scanout.  On my 1920x1080@32bpp
display on a RPi3, we go from about 15% of system memory bandwidth
with linear to about 20% with tiled.  However, for X11 this still ends
up being a huge performance win in active usage.

This patch doesn't yet handle src_x/src_y offsetting within the tiled
buffer.  However, we fail to do so for untiled buffers already.

Signed-off-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170608001336.12842-1-eric@anholt.net
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-06-15 16:02:45 -07:00
Rodrigo Vivi 9a30a26122 Revert "drm/i915/skl: New ddb allocation algorithm"
This reverts commit bb9d85f6e9.

New ddb allocation algorithm is a show stopper on my SKL system.

Besides not be able to get external DP 4k@60 (through USB type C),
It fully hang my screen when unplugging the USB type C.

Bugzilla: https://patchwork.freedesktop.org/patch/161571/
Fixes: bb9d85f6e9 ("drm/i915/skl: New ddb allocation algorithm")
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497376350-3400-1-git-send-email-rodrigo.vivi@intel.com
2017-06-15 14:01:14 -07:00
Madhav Chauhan 8a1deb329f drm/i915/glk: Add cold boot sequence for GLK DSI
As per BSEPC, if device ready bit is '0' in enable IO sequence
then its a cold boot/reset scenario eg: S3/S4 resume. If cold boot
scenario detected in enable IO, then prepare port immediately.
In normal boot scenario, prepare port after glk_dsi_device_ready().
Without cold boot sequence enabled, features like S3/S4 doesn't work.

Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497340095-5877-2-git-send-email-madhav.chauhan@intel.com
2017-06-15 23:03:57 +03:00
Madhav Chauhan 74e4ce6a78 drm/i915/glk: Split GLK DSI device ready functionality
This patch divides glk_dsi_device_ready() function into
two part. First part will program LP wake and MIPI DSI mode
to MIPI_CTRL reg using newly defined function glk_dsi_enable_io().
glk_dsi_enable_io() will be called from intel_dsi_pre_enable.
Second part will do remaining device ready activities using
the existing function glk_dsi_device_ready().

Signed-off-by: Madhav Chauhan <madhav.chauhan@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497340095-5877-1-git-send-email-madhav.chauhan@intel.com
2017-06-15 22:56:15 +03:00
Fabio Estevam 0f933328f0 drm: mxsfb_crtc: Reset the eLCDIF controller
According to the eLCDIF initialization steps listed in the MX6SX
Reference Manual the eLCDIF block reset is mandatory.

Without performing the eLCDIF reset the display shows garbage content
when the kernel boots.

In earlier tests this issue has not been observed because the bootloader
was previously showing a splash screen and the bootloader display driver
does properly implement the eLCDIF reset.

Add the eLCDIF reset to the driver, so that it can operate correctly
independently of the bootloader.

Tested on a imx6sx-sdb board.

Cc: <stable@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1494007301-14535-1-git-send-email-fabio.estevam@nxp.com
2017-06-15 14:26:24 -04:00
Dawid Kurek ac7c748317 drm: Reduce scope of 'state' variable
Smaller scope reduces visibility of variable and makes usage of
uninitialized variable less possible.

Changes in v2:
	- separate declaration and initialization
Changes in v3:
	- add missing signed-off-by tag

Signed-off-by: Dawid Kurek <dawikur@gmail.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615174556.GA8872@gmail.com
2017-06-15 14:26:02 -04:00
Fabio Estevam ae9d042143 drm: mxsfb_crtc: Reset the eLCDIF controller
According to the eLCDIF initialization steps listed in the MX6SX
Reference Manual the eLCDIF block reset is mandatory.

Without performing the eLCDIF reset the display shows garbage content
when the kernel boots.

In earlier tests this issue has not been observed because the bootloader
was previously showing a splash screen and the bootloader display driver
does properly implement the eLCDIF reset.

Add the eLCDIF reset to the driver, so that it can operate correctly
independently of the bootloader.

Tested on a imx6sx-sdb board.

Cc: <stable@vger.kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1494007301-14535-1-git-send-email-fabio.estevam@nxp.com
2017-06-15 13:30:45 -04:00
Mathieu Larouche 0cbb738108 drm/mgag200: Fix to always set HiPri for G200e4 V2
- Changed the HiPri value for G200e4 to always be 0.
  - Added Bandwith limitation to block resolution above 1920x1200x60Hz

Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Acked-by: Dave Airlie <airlied@redhat.com>
[seanpaul removed some trailing whitespace from the patch]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/ec0f8568d7ec41904dfe593c5deccf3f062d7bd8.1497450944.git.mathieu.larouche@matrox.com
2017-06-15 12:32:58 -04:00
Harish Kasiviswanathan a1924005a2 drm/amdgpu: Fix compiler warnings
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:36 -04:00
Harish Kasiviswanathan 370f092f30 drm/amdgpu: vm_update_ptes remove code duplication
CPU and GPU paths were mostly the same.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:35 -04:00
Tom St Denis 0ad6f0d387 drm/amd/amdgpu: Port VCN over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:35 -04:00
Tom St Denis c5c1effd85 drm/amd/amdgpu: Port PSP v10.0 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:34 -04:00
Tom St Denis 3176810d60 drm/amd/amdgpu: Port PSP v3.1 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:34 -04:00
Tom St Denis ba7d5a22a6 drm/amd/amdgpu: Port NBIO v7.0 driver over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:33 -04:00
Tom St Denis db0c4d26d6 drm/amd/amdgpu: Port NBIO v6.1 driver over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:32 -04:00
Tom St Denis 4ad5751a6c drm/amd/amdgpu: Port UVD 7.0 over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:32 -04:00
Tom St Denis deca8322f1 drm/amd/amdgpu: Port MMHUB over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:31 -04:00
Tom St Denis 805cb75ccb drm/amd/amdgpu: Cleanup gfxhub read-modify-write patterns
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:31 -04:00
Tom St Denis f7047402d1 drm/amd/amdgpu: Port GFXHUB over to new SOC15 macros
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:30 -04:00
Tom St Denis 496828e786 drm/amd/amdgpu: Add offset variant to SOC15 macros
Allows reading/writing via SOC15 macros with offset for
various register banks.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:30 -04:00
Eric Huang 9d90f0bd7c drm/amd/powerplay: add avfs control for Vega10
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:29 -04:00
Alex Deucher d67fed1618 drm/amdgpu: add virtual display support for raven
Same as other asics.  If enabled, exposes a user selectable
number of virtual displays.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:28 -04:00
Alex Deucher 7366af81da drm/amdgpu/gfx9: fix compute ring doorbell index
This got lost when the code was revamped.  Copy/paste bug from
gfx8.

Reported-by: Evan Quan <evan.quan@amd.com>
Fixes: 78c168342 (drm/amdgpu: allow split of queues with kfd at queue granularity v4)
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:28 -04:00
Tom St Denis 2119d0db59 drm/amd/amdgpu: Rename KIQ ring to avoid spaces
Swap space for underscore in ring name.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:27 -04:00
Tom St Denis e5475e16eb drm/amd/amdgpu: gfx9 tidy ups (v2)
A couple of simple tidy ups to register programming.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

(v2): Avoid using 'data' uninitialized

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:26 -04:00
horchen 948edf0951 drm/amdgpu: add contiguous flag in ucode bo create
Under VF environment, the ucode would be settled to the visible VRAM,
As it would be pinned to the visible VRAM, it's better to add
contiguous flag,otherwise it need to move gpu address during the pin
process. This movement is not necessary.

Signed-off-by: horchen <horace.chen@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:26 -04:00
Huang Rui ab4fe3e1f9 drm/amdgpu: fix missed gpu info firmware when cache firmware during S3
gpu_info firmware is released after data is used. But when system enters into
suspend, upper class driver will cache all firmware names. At that time,
gpu_info will be failing to load. It seems an upper class issue, that we should
not release gpu_info firmware until device finished.

[  903.236589] cache_firmware: amdgpu/vega10_sdma1.bin
[  903.236590] fw_set_page_data: fw-amdgpu/vega10_sdma1.bin buf=ffff88041eee10c0 data=ffffc90002561000 size=17408
[  903.236591] cache_firmware: amdgpu/vega10_sdma1.bin ret=0
[  903.464160] __allocate_fw_buf: fw-amdgpu/vega10_gpu_info.bin buf=ffff88041eee2c00
[  903.471815] (NULL device *): loading /lib/firmware/updates/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.482870] (NULL device *): loading /lib/firmware/updates/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.492716] (NULL device *): loading /lib/firmware/4.11.0-custom/amdgpu/vega10_gpu_info.bin failed with error -2
[  903.503156] (NULL device *): direct-loading amdgpu/vega10_gpu_info.bin

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:25 -04:00
Huang Rui 4f0955fcc0 drm/amdgpu: export test ib debugfs interface
As Christian and David's suggestion, submit the test ib ring debug interfaces.
It's useful for debugging with the command submission without VM case.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:25 -04:00
Eric Huang 17d176a5cc drm/amd/powerplay: add GPU power display for vega10
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:24 -04:00
Eric Huang 4b1d63600e drm/amd/powerplay: update vega10_ppsmc.h
Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:24 -04:00
Hawking Zhang 2bbec882c2 drm/amdgpu: avoid to reset wave_front_size to 0
No need to clear it.  The values are set explicitly.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:23 -04:00
Hawking Zhang 51fd037067 drm/amdgpu: add new member in gpu_info fw
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-15 11:50:22 -04:00
Dhinakaran Pandiyan f6262bda46 drm/i915: Don't enable backlight at setup time.
Maarten and Ville noticed that we are enabling backlight via DP aux very
early in the modeset_init path via the intel_dp_aux_setup_backlight()
function, since commit e7156c8339 ("drm/i915: Add Backlight Control using
DPCD for eDP connectors (v9)"). Looks like all we need to do during
_setup_backlight() is read the current brightness state instead of
modifying it.

v2: Rewrote commit message.

Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Yetunde Adebisi <yetundex.adebisi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Fixes: e7156c8339 ("drm/i915: Add Backlight Control using DPCD for eDP connectors (v9)")
Link: http://patchwork.freedesktop.org/patch/msgid/1497384239-2965-1-git-send-email-dhinakaran.pandiyan@intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-15 16:11:08 +03:00
Colin Ian King 29962acaa0 drm/i915/cnl: make function cnl_ddi_dp_set_dpll_hw_state static
The function cnl_ddi_dp_set_dpll_hw_state does not need to be in global
scope, so make it static.

Cleans up sparse warning:
"symbol 'cnl_ddi_dp_set_dpll_hw_state' was not declared. Should it
 be static?"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170613134751.29196-1-colin.king@canonical.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-06-15 15:53:56 +03:00
Ville Syrjälä e56134bc79 drm/i915: Remove pipe A quirk remnants
With 830 the only thing needing pipe quirks, we can just drop the quirk
defines and replace the checks with IS_I830() checks.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-8-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:38:27 +03:00
Ville Syrjälä b82a682d32 drm/i915: Drop pipe A quirk for Thinkapd T60
The pipe A force quirk shouldn't needed except on 830. So let's nuke it
for the IBM Thinkpad T60 945 machines. This quirk pre-dates
KMS so it's usefulness is doubtful at best now.

The original bug report [1] describes the symptoms as "system hang on
closing T60 panel lid", and we already dropped a similar quirk for
another 945 machine in
commit 736a69ca8c ("drm/i915: Drop PIPE-A quirk for 945GSE HP Mini")
so I'm hopeful we can drop this one as well.

The quirk was added into xf86-video-intel in
commit 08903abe4dc0 ("Add pipe a force enable quirk for Lenovo T60")

[1] https://bugs.freedesktop.org/show_bug.cgi?id=16494

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-7-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:37:19 +03:00
Ville Syrjälä dc453e336c drm/i915: Drop pipe A quirk for Toshiba Protege R205-S209
The pipe A force quirk shouldn't needed except on 830. So let's nuke it
for the Toshiba Protege R-205/S-209 945 machines. This quirk pre-dates
KMS so it's usefulness is doubtful at best now.

Unfortunately the original bug report [1] isn't very helpful since it
doesn't describe the symptoms. And the commit message in xf86-video-intel
commit ecdb5963ef68 ("Add pipe A force enable quirk for Toshiba Portege R205-S209")
is not much help either.

However, if we assume the problem was the typical "closing the lid
hangs the box" type of thing, we already nuked the quirk for another
945 machine in
commit 736a69ca8c ("drm/i915: Drop PIPE-A quirk for 945GSE HP Mini")
and so I hope we can drop this one as well.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=14944

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-6-ville.syrjala@linux.intel.com
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:36:45 +03:00
Ville Syrjälä 2ee0da1631 drm/i915: Add i830 "pipes power well"
830 more or less requires both pipes and DPLLs to remain on as long
as either pipe is needed. However, when neither pipe is actually needed,
we can save a bit of power by turning everything off. To do that we add
a new "power well" that turns both pipes and DPLLs on and off in the
right order. Seems to save ~50mW on my Fujitsu-Siemens Lifebook S6010.

This also avoids having to abuse the load detection to force pipe A on
at init time. That was never very robust, and it only worked for one
pipe, whereas 830 really needs both pipes enabled. As a bonus the 830
pipe quirk is now a bit more isolated from the rest of the mode setting
infrastructure, which should mean that it's much less likely someone
will accidentally break it in the future. The extra cost is of course
slight code duplication, but that seems like a worthwile tradeoff here.

v2; s/BIT/BIT_ULL/

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-5-ville.syrjala@linux.intel.com
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:35:38 +03:00
Ville Syrjälä bb408dd2b2 drm/i915: Use a loop for the "three times for luck" DPLL procedure
The magic "enable the  DPLL three times" sequence feels like it
deserves a loop.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:26:10 +03:00
Dmitry Osipenko 43240bbd87 gpu: host1x: At first try a non-blocking allocation for the gather copy
The blocking gather copy allocation is a major performance downside of the
Host1x firewall, it may take hundreds milliseconds which is unacceptable
for the real-time graphics operations. Let's try a non-blocking allocation
first as a least invasive solution, it makes opentegra (Xorg driver)
performance indistinguishable with/without the firewall.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:56 +02:00
Mikko Perttunen 8474b02531 gpu: host1x: Refactor channel allocation code
This is largely a rewrite of the Host1x channel allocation code, bringing
several changes:

- The previous code could deadlock due to an interaction
  between the 'reflock' mutex and CDMA timeout handling.
  This gets rid of the mutex.
- Support for more than 32 channels, required for Tegra186
- General refactoring, including better encapsulation
  of channel ownership handling into channel.c

Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:38 +02:00
Dmitry Osipenko 03f0de770e gpu: host1x: Remove unused host1x_cdma_stop() definition
There is no host1x_cdma_stop() in the code, let's remove its definition
from the header file.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:25:18 +02:00
Dmitry Osipenko 03ebcaa3de gpu: host1x: Remove unused 'struct host1x_cmdbuf'
The struct host1x_cmdbuf is unused, let's remove it.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:24:59 +02:00
Dmitry Osipenko a47ac10e6e gpu: host1x: Check waits in the firewall
Check waits in the firewall in a way it is done for relocations.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:24:41 +02:00
Ville Syrjälä da1d0e2655 drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic()
If intel_crtc_disable_noatomic() were to ever get called during resume
we'd end up deadlocking since resume has its own acqcuire_ctx but
intel_crtc_disable_noatomic() still tries to use the
mode_config.acquire_ctx. Pass down the correct acquire ctx from the top.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:24:00 +03:00
Dmitry Osipenko 0f563a4bf6 gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall
Several channels could be made to write the same unit concurrently via
the SETCLASS opcode, trusting userspace is a bad idea. It should be
possible to drop the per-client channel reservation and add a per-unit
locking by inserting MLOCK's to the command stream to re-allow the
SETCLASS opcode, but it will be much more work. Let's forbid the
unit-unrelated class changes for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:23:50 +02:00
Dmitry Osipenko ef81624994 gpu: host1x: Forbid RESTART opcode in the firewall
The RESTART opcode terminates the gather and restarts the CDMA fetching
from a specified word << 2 relative to the CDMA start address. That
shouldn't be allowed to be done by userspace.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:23:18 +02:00
Dmitry Osipenko 571cbf70c1 gpu: host1x: Forbid relocation address shifting in the firewall
Incorrectly shifted relocation address will cause a lower memory
corruption and likely a hang on a write or a read of an arbitrary data
in case of IOMMU absence. As of now, there is no known use for the
address shifting and adding a proper shifts / sizes validation is a much
more work. Let's forbid shifts in the firewall till a proper validation
is implemented.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:22:32 +02:00
Dmitry Osipenko 47f89c10dd gpu: host1x: Do not leak BO's phys address to userspace
Perform gathers coping before patching them, so that original gathers are
left untouched. That's not as bad as leaking kernel addresses, but still
doesn't feel right.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:22:03 +02:00
Dmitry Osipenko e5855aa3e6 gpu: host1x: Correct host1x_job_pin() error handling
In case of relocations / waitchecks patching failure the jobs pins stay
referenced till DRM file get closed, wasting memory. Add the missed
unpinning.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:21:46 +02:00
Dmitry Osipenko 3833d16f16 gpu: host1x: Initialize firewall class to the job's one
The commands stream is prepended by the jobs class on the CDMA
submission, so that explicitly setting a module class in the commands
stream isn't necessary. The firewall initializes its class to 0 and the
command stream that doesn't explicitly specify the class effectively
bypasses the firewall.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:21:23 +02:00
Dmitry Osipenko 80d3eef16e drm/tegra: dc: Disable plane if it is invisible
On Tegra20 if plane has width or height equal to 0, it will be infinitely
wide or tall. Let's disable the plane if it is invisible on atomic state
committing to fix the issue. The Rockchip DRM driver does the same.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:20:30 +02:00
Dmitry Osipenko 7d2058571a drm/tegra: dc: Apply clipping to the plane
On Tegra20 an overlay plane should be clipped, otherwise its output is
distorted once plane crosses display boundary.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:20:11 +02:00
Ville Syrjälä aecd36b8a1 drm/i915: Fix deadlock witha the pipe A quirk during resume
Pass down the correct acquire context to the pipe A quirk load detect
hack during display resume. Avoids deadlocking the entire thing.

Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: e2c8b8701e ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-2-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2017-06-15 15:19:32 +03:00
Dmitry Osipenko 6ac1571b4c drm/tegra: dc: Avoid reset asserts on Tegra20
Commit 33a8eb8d40 ("drm/tegra: dc: Implement runtime PM") introduced
HW reset control. It causes a hang on Tegra20 if both display
controllers are utilized (RGB panel and HDMI). The TRM suggests that
each display controller has its own reset control, apparently it is not
correct.

Fixes: 33a8eb8d40 ("drm/tegra: dc: Implement runtime PM")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:19:23 +02:00
Dmitry Osipenko e0b2ce0210 drm/tegra: Check syncpoint ID in the 'submit' IOCTL
In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and
none of its users perform a check of the returned pointer later. Let's bail
out until it's too late.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:17:21 +02:00
Dmitry Osipenko d0fbbdff2e drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL
The waitchecks along with multiple syncpoints per submit are not ready
for use yet, let's forbid them for now.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:16:37 +02:00
Dmitry Osipenko 368f622c0d drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL
If commands buffer claims a number of words that is higher than its BO can
fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was
triggered by an opentegra Xorg driver that erroneously pushed too many
commands to the pushbuf.

The CDMA commands buffer address is 4 bytes aligned, so check its
alignment.

The maximum number of the CDMA gather fetches is 16383, add a check for
it.

Add a sanity check for the relocations in a same way.

[   46.829393] Unable to handle kernel paging request at virtual address f09b2000
...
[<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510)
[<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c)
[<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec)
[<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4)
[<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c)
[<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c)

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 14:16:07 +02:00
Dmitry Osipenko d6c153ec85 drm/tegra: Correct idr_alloc() minimum id
The client ID 0 is reserved by the host1x/cdma to mark the timeout timer
work as already been scheduled and context ID is used as the clients one.
This fixes spurious CDMA timeouts.

Fixes: bdd2f9cd10 ("drm/tegra: Don't leak kernel pointer to userspace")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/9c19a44219acd988e678cf9abe21363911184625.1497480754.git.digetx@gmail.com
2017-06-15 14:12:25 +02:00
Dmitry Osipenko 1066a8959d drm/tegra: Fix lockup on a use of staging API
Commit bdd2f9cd10 ("Don't leak kernel pointer to userspace") added a
mutex around staging IOCTL's, some of those mutexes are taken twice.

Fixes: bdd2f9cd10 ("drm/tegra: Don't leak kernel pointer to userspace")
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/7b70a506a9d2355ea6ff19a8c4f4d726b67719b3.1497480754.git.digetx@gmail.com
2017-06-15 14:11:05 +02:00
Christophe JAILLET 59e04bc20d gpu: host1x: Fix error handling
If 'devm_reset_control_get' returns an error, then we erroneously return
success because error code is taken from 'host->clk' instead of
'host->rst'.

Fixes: b386c6b73a ("gpu: host1x: Support module reset")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170410202922.17665-1-christophe.jaillet@wanadoo.fr
2017-06-15 14:06:49 +02:00
Thierry Reding 466749f13e gpu: host1x: Flesh out kerneldoc
Improve kerneldoc for the public parts of the host1x infrastructure in
preparation for adding driver-specific part to the GPU documentation.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>
2017-06-15 13:58:43 +02:00
Chris Wilson 8c45cec48e drm/i915: Split vma exec_link/evict_link
Currently the vma has one link member that is used for both holding its
place in the execbuf reservation list, and in any eviction list. This
dual property is quite tricky and error prone.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-3-chris@chris-wilson.co.uk
2017-06-15 10:53:26 +01:00
Chris Wilson d55495b4dc drm/i915: Use vma->exec_entry as our double-entry placeholder
This has the benefit of not requiring us to manipulate the
vma->exec_link list when tearing down the execbuffer, and is a
marginally cheaper test to detect the user error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk
2017-06-15 10:52:58 +01:00
Chris Wilson 650bc63568 drm/i915: Amalgamate execbuffer parameter structures
Combine the two slightly overlapping parameter structures we pass around
the execbuffer routines into one.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-1-chris@chris-wilson.co.uk
2017-06-15 10:50:35 +01:00
Lionel Landwerlin 28c7ef9ecc drm/i915/perf: add GLK support
Add OA support for Geminilake (pretty much identical to Broxton), and
also add the associated OA configurations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Link: http://patchwork.freedesktop.org/patch/msgid/20170613112309.4088-2-lionel.g.landwerlin@intel.com
2017-06-14 12:31:58 -07:00
Lionel Landwerlin 6c5c1d89af drm/i915/perf: add KBL support
Add OA support for Kabylake (pretty much identical to Skylake), and
also add the associated OA configurations.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:58 -07:00
Lionel Landwerlin 3891589eee drm/i915: add KBL GT2/GT3 check macros
Add macros to detect GT2/GT3 skus so we can apply the proper OA
configuration later.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg 1bef3409f1 drm/i915/perf: remove perf.hook_lock
In earlier iterations of the i915-perf driver we had a number of
callbacks/hooks from other parts of the i915 driver to e.g. notify us
when a legacy context was pinned and these could run asynchronously with
respect to the stream file operations and might also run in atomic
context.

dev_priv->perf.hook_lock had been for serialising access to state needed
within these callbacks, but as the code has evolved some of the hooks
have gone away or are implemented to avoid needing to lock any state.

The remaining use of this lock was actually redundant considering how
the gen7 oacontrol state used to be updated as part of a context pin
hook.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg 155e941f49 drm/i915/perf: per-gen timebase for checking sample freq
An oa_exponent_to_ns() utility and per-gen timebase constants where
recently removed when updating the tail pointer race condition WA, and
this restores those so we can update the _PROP_OA_EXPONENT validation
done in read_properties_unlocked() to not assume we have a 12.5MHz
timebase as we did for Haswell.

Accordingly the oa_sample_rate_hard_limit value that's referenced by
proc_dointvec_minmax defining the absolute limit for the OA sampling
frequency is now initialized to (timestamp_frequency / 2) instead of the
6.25MHz constant for Haswell.

v2:
    Specify frequency of 19.2MHz for BXT (Ville)
    Initialize oa_sample_rate_hard_limit per-gen too (Lionel)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg fc59921178 drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT
These are auto generated from an XML description of metric sets,
currently maintained in gputop, ref:

 https://github.com/rib/gputop
 > gputop-data/oa-*.xml
 > scripts/i915-perf-kernelgen.py

 $ make -C gputop-data -f Makefile.xml

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg 19f81df285 drm/i915/perf: Add OA unit support for Gen 8+
Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all
share (more-or-less) the same OA unit design.

Of particular note in comparison to Haswell: some OA unit HW config
state has become per-context state and as a consequence it is somewhat
more complicated to manage synchronous state changes from the cpu while
there's no guarantee of what context (if any) is currently actively
running on the gpu.

The periodic sampling frequency which can be particularly useful for
system-wide analysis (as opposed to command stream synchronised
MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to
have become per-context save and restored (while the OABUFFER
destination is still a shared, system-wide resource).

This support for gen8+ takes care to consider a number of timing
challenges involved in synchronously updating per-context state
primarily by programming all config state from the cpu and updating all
current and saved contexts synchronously while the OA unit is still
disabled.

The driver intentionally avoids depending on command streamer
programming to update OA state considering the lack of synchronization
between the automatic loading of OACTXCONTROL state (that includes the
periodic sampling state and enable state) on context restore and the
parsing of any general purpose BB the driver can control. I.e. this
implementation is careful to avoid the possibility of a context restore
temporarily enabling any out-of-date periodic sampling state. In
addition to the risk of transiently-out-of-date state being loaded
automatically; there are also internal HW latencies involved in the
loading of MUX configurations which would be difficult to account for
from the command streamer (and we only want to enable the unit when once
the MUX configuration is complete).

Since the Gen8+ OA unit design no longer supports clock gating the unit
off for a single given context (which effectively stopped any progress
of counters while any other context was running) and instead supports
tagging OA reports with a context ID for filtering on the CPU, it means
we can no longer hide the system-wide progress of counters from a
non-privileged application only interested in metrics for its own
context. Although we could theoretically try and subtract the progress
of other contexts before forwarding reports via read() we aren't in a
position to filter reports captured via MI_REPORT_PERF_COUNT commands.
As a result, for Gen8+, we always require the
dev.i915.perf_stream_paranoid to be unset for any access to OA metrics
if not root.

v5: Drain submitted requests when enabling metric set to ensure no
    lite-restore erases the context image we just updated (Lionel)

v6: In addition to drain, switch to kernel context & update all
    context in place (Chris)

v7: Add missing mutex_unlock() if switching to kernel context fails
    (Matthew)

v8: Simplify OA period/flex-eu-counters programming by using the
    batchbuffer instead of modifying ctx-image (Lionel)

v9: Back to updating the context image (due to erroneous testing,
    batchbuffer programming the OA unit doesn't actually work)
    (Lionel)
    Pin context before updating context image (Chris)
    Drop MMIO programming now that we switch to a kernel context with
    right values in initial context image (Chris)

v10: Just pin_map the contexts we want to modify or let the
     configuration happen on first use (Chris)

v11: Update kernel context OA config through the batchbuffer rather
     than on the fly ctx-image update (Lionel)

v12: Rework OA context registers update again by swithing away from
     user contexts and reconfiguring the kernel context through the
     batchbuffer and updating all the other contexts' context image.
     Also take care to lock slice/subslice configuration when OA is
     on. (Lionel)

v13: Request rpcs updates on all engine when updating the OA config
     (Lionel)

v14: Drop any kind of rpcs management now that we monitor sseu
     configuration changes in a later patch (Lionel)
     Remove usleep after programming the NOA configs on Gen8+, this
     doesn't seem to be needed (Lionel)

v15: Respect coding style for block comments (Chris)

v16: Add missing i915_add_request() in case we fail to emit OA
     configuration (Matthew)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg 5182f646c7 drm/i915/perf: Add 'render basic' Gen8+ OA unit configs
Adds a static OA unit, MUX, B Counter + Flex EU configurations for basic
render metrics on Broadwell, Cherryview, Skylake and Broxton. These are
auto generated from an XML description of metric sets, currently
maintained in gputop, ref:

 https://github.com/rib/gputop
 > gputop-data/oa-*.xml
 > scripts/i915-perf-kernelgen.py

 $ make -C gputop-data -f Makefile.xml WHITELIST=RenderBasic

v2: add newlines to debug messages + fix comment (Matthew Auld)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Lionel Landwerlin 3f488d9985 drm/i915/perf: rework mux configurations queries
Gen8+ might have mux configurations per slices/subslices. Depending on
whether slices/subslices have been fused off, only part of the
configuration needs to be applied. This change reworks the mux
configurations query mechanism to allow more than one set of registers
to be programmed.

v2: s/n_mux_regs/n_mux_configs/ (Matthew)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00
Robert Bragg f532023381 drm/i915: expose _SUBSLICE_MASK GETPARM
Assuming a uniform mask across all slices, this enables userspace to
determine the specific sub slices can be enabled. This information is
required, for example, to be able to analyse some OA counter reports
where the counter configuration depends on the HW sub slice
configuration.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2017-06-14 12:31:57 -07:00