Add the support for audio clients to VGA-switcheroo for handling the
HDMI audio controller together with VGA switching. The id of the
audio controller should be given explicitly at registration time
unlike the video controller.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=43155
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This changes the API as a clean-up. Instead of passing multiple
function pointers at each time, introduce a new struct holding the
whole callback functions and pass it to the registration.
The same struct will be used for the upcoming audio client
registration, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Daniel says
Highlights:
- sparse fixes from Ben.
- tons of little cleanups from Chris all over: tiling_changed
clarification, deferred_free list removal, ...
- fix up irq handler on gen2 & gen3 + related cleanups from Chris
- prep work for wait_rendering_timeout from Ben with some nice
refactorings
- first set of infoframe fixes from Paulo for doubleclocked CEA modes
- improve pch pll handling from Jesse and Chris
- gpu hangman, this also contains the reset fix for gen4
- rps sanity check from Chris - this papers over issues when the gpu fails
to clock up on snb/ivb, and it is shockingly easy to hit. The code
prints a big WARN backtrace and restores the hw to a sane state. The
real fix is still in the works.
Atm I'm aware of 2 regressions in -next:
- One of the gmbus patches (not gmbus itself) regressed lvds detection on
a MacbookPro. I've analyzed the bug already and I think I know what's
going on, patch is awaiting test feedback.
- Just today QA reported that DP on ilk regressed. That bug is fresh of
the press and still awaiting detailed logfiles and the bisect result.
The only thing that's clear atm is that -fixes works and -next doesn't.
Previously these functions would assume that vma->vm_file was the
drm_file. Although if in some cases if the drm driver needs to use
something else for the backing file (such as the tmpfs filp) then this
assumption is no longer true. But vma->vm_private_data is still the
GEM object.
With this change, now the drm_device comes from the GEM object rather
than the drm_file so the driver is more free to play with vma->vm_file.
The scenario where this comes up is for mmap'ing of cached dmabuf's
for non-coherent systems, where the driver needs to use fault handling
and PTE shootdown to simulate coherency. We can't use the vma->vm_file
of the dmabuf, which is using anon_inode's address_space. The most
straightforward thing to do is to use the GEM object's obj->filp for
vma->vm_file in all cases, for which we need this patch.
Signed-off-by: Rob Clark <rob@ti.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We don't need to check these - they are always going to be the
same for any PVR based device.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cover all D2xxx/N2xxx chips.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
[Hand applied to upstream driver]
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We have a lot of debug type stuff we don't actually need any more.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All the conditional ugly register selection really wants to be
cleaned up. Use a struct describing each pipe and its registers.
This will also let us hide some of the oddments between platforms
for any future merging of bits together. In particular the way the
DPLL and FP registers randomly wander around.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We have lots of local assignments that can now be eliminated
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This starts the move away from lots of confused unions of per driver stuff
inherited when we merged the drivers together.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We can take advantage that the PCH_IIR is a subordinate register to
reduce one of the required IIR reads, and that we only need to clear
interrupts handled to reduce the writes. And by simply tidying the code
we can reduce the line count and hopefully make it more readable.
v2: Split out the bugfix from the refactoring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently the code re-reads PCH_IIR during the hotplug interrupt
processing. Not only is this a wasted read, but introduces a potential
for handling a spurious interrupt as we then may not clear all the
interrupts processed (since the re-read IIR may contains more interrupts
asserted than we clear using the result of the original read).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: stable@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Inspired by the recent ppgtt regression report, where switching of
dmar only for the gpu seems to fix things completely, I've looked
again at the semaphores+vt-d situation.
Contrary to my earlier testing a few months back my system is now
stable with dmar disabled for the igd, and not only when disabling
dmar completely.
So I'm rather hopeful that all our recent fixes for snb have changed
things for code and it's time to try enabling semaphores again. We've
also had issues with enabling semaphores which are not vt-d related,
but I guess these are all fixed by the autoreport-disabling and lazy
request fix. And there's only one way to find out whether there are
still other issues ...
When I've tried to apply this patch I've noticed that semaphores on
gen6 have already silently been enabled in
commit 2911a35b2e
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Thu Apr 5 14:47:36 2012 -0700
drm/i915: use semaphores for the display plane
Fix this up by only checking whether dmar is enabled on the gfx (not
on the entire system).
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
/ssd/git/drm-core-next/drivers/gpu/drm/radeon/radeon_fence.c: In function ‘radeon_debugfs_fence_info’:
/ssd/git/drm-core-next/drivers/gpu/drm/radeon/radeon_fence.c:606:7: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘long long int’ [-Wformat]
Signed-off-by: Dave Airlie <airlied@redhat.com>
No need to malloc it any more.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we don't store local data into global variables
it isn't necessary to lock anything.
v2: rebased on new SA interface
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It never really belonged there in the first place.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We can now protected the semaphore ram by a
fence, so free it immediately.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It isn't necessary any more and the suballocator seems to perform
even better.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Directly use the suballocator to get small chunks of memory.
It's equally fast and doesn't crash when we encounter a GPU reset.
v2: rebased on new SA interface.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A startover with a new idea for a multiple ring allocator.
Should perform as well as a normal ring allocator as long
as only one ring does somthing, but falls back to a more
complex algorithm if more complex things start to happen.
We store the last allocated bo in last, we always try to allocate
after the last allocated bo. Principle is that in a linear GPU ring
progression was is after last is the oldest bo we allocated and thus
the first one that should no longer be in use by the GPU.
If it's not the case we skip over the bo after last to the closest
done bo if such one exist. If none exist and we are not asked to
block we report failure to allocate.
If we are asked to block we wait on all the oldest fence of all
rings. We just wait for any of those fence to complete.
v2: We need to be able to let hole point to the list_head, otherwise
try free will never free the first allocation of the list. Also
stop calling radeon_fence_signalled more than necessary.
v3: Don't free allocations without considering them as a hole,
otherwise we might lose holes. Also return ENOMEM instead of ENOENT
when running out of fences to wait for. Limit the number of holes
we try for each ring to 3.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use one wait queue for all rings. When one ring progress, other
likely does to and we are not expecting to have a lot of waiter
anyway.
Also add a fence_wait_any that will wait until the first fence
in the fence array (one fence per ring) is signaled. This allow
to wait on all rings.
v2: some minor cleanups and improvements.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Define the interface without modifying the allocation
algorithm in any way.
v2: rebase on top of fence new uint64 patch
v3: add ring to debugfs output
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of offset + size keep start and end offset directly.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Make the suballocator self containing to locking.
v2: split the bugfix into a seperate patch.
v3: remove some unreleated changes.
Sig-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of hacking the calculation multiple times.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some callers illegal called fence_wait_next/empty
while holding the ring emission mutex. So don't
relock the mutex in that cases, and move the actual
locking into the fence code.
v2: Don't try to unlock the mutex if it isn't locked.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Using 64bits fence sequence we can directly compare sequence
number to know if a fence is signaled or not. Thus the fence
list became useless, so does the fence lock that mainly
protected the fence list.
Things like ring.ready are no longer behind a lock, this should
be ok as ring.ready is initialized once and will only change
when facing lockup. Worst case is that we return an -EBUSY just
after a successfull GPU reset, or we go into wait state instead
of returning -EBUSY (thus delaying reporting -EBUSY to fence
wait caller).
v2: Remove left over comment, force using writeback on cayman and
newer, thus not having to suffer from possibly scratch reg
exhaustion
v3: Rebase on top of change to uint64 fence patch
v4: Change DCE5 test to force write back on cayman and newer but
also any APU such as PALM or SUMO family
v5: Rebase on top of new uint64 fence patch
v6: Just break if seq doesn't change any more. Use radeon_fence
prefix for all function names. Even if it's now highly optimized,
try avoiding polling to often.
v7: We should never poll the last_seq from the hardware without
waking the sleeping threads, otherwise we might lose events.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This convert fence to use uint64_t sequence number intention is
to use the fact that uin64_t is big enough that we don't need to
care about wrap around.
Tested with and without writeback using 0xFFFFF000 as initial
fence sequence and thus allowing to test the wrap around from
32bits to 64bits.
v2: Add comment about possible race btw CPU & GPU, add comment
stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET
Read fence sequenc in reverse order of GPU write them so we
mitigate the race btw CPU and GPU.
v3: Drop the need for ring to emit the 64bits fence, and just have
each ring emit the lower 32bits of the fence sequence. We
handle the wrap over 32bits in fence_process.
v4: Just a small optimization: Don't reread the last_seq value
if loop restarts, since we already know its value anyway.
Also start at zero not one for seq value and use pre instead
of post increment in emmit, otherwise wait_empty will deadlock.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A single global mutex for ring submissions seems sufficient.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We need to sync with the GFX ring as ttm might have schedule bo move
on it and new command scheduled for other ring need to wait for bo
data to be in place.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... given that I essentially run the show already, let's make this
official.
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These two functions are actually hw-specific and only valid for gm45
thru gen7. HSW completely changes how this works, so label them
accordingly.
v2: s/gm45/g4x/ like for the previous patch.
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Generally we call stuff with i9xx_ when it's valid for gen3+. But
gen3 and early gen4 only support hdmi with sdvo cards, and writing
infoframes works completely different there.
v2: Use g4x instead of gm45 - it applies to the desktop variant, too.
v3: Properly align the paramters of g4x_write_infoframe again, noticed
by Paulo Zanoni.
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Simplifies things because for all the infoframes we care about,
we always send them on each vblank. Also, this gets rid of one
of the hw specific functions mislabelled with the intel_ prefix -
hsw will completely change how this works!
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The principle of intel_mark_busy() is that we want to spot the
transition of when the display engine is being used in order to bump
powersaving modes and increase display clocks. As such it is only
important when the display is changing, i.e. when rendering to the
scanout or other sprite/plane, and these are characterised by being
pinned.
v2: Mark the whole device as busy on execbuffer and pageflips as well
and rebase against dinq for the minor bug fix to be immediately
applicable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: fix compile fail.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
intel_wait_for_vblank uses PIPESTAT, which does not exist on Ironlake
and newer, so now we use PIPEFRAME.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Ditch the check for disable pipe from the new ilk wait for
vblank function to keep it consisten with existing behaviour.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like Gen 4, IBX has a "Port Select" field on the DIP register,
but the ports are different.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
IBX does not need the workaround used in cpt_write_infoframe that
requires the AVI frame to be enabled while being updated.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The registers are on the PCH, so use the PCH name instead of the CPU
name. Also, the way this function is implemented is really only for
CPT and PPT. For now, both functions have the same implementations:
the next patch will fix ibx_write_infoframe.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>