No need to malloc it any more.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we don't store local data into global variables
it isn't necessary to lock anything.
v2: rebased on new SA interface
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It never really belonged there in the first place.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We can now protected the semaphore ram by a
fence, so free it immediately.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
It isn't necessary any more and the suballocator seems to perform
even better.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Directly use the suballocator to get small chunks of memory.
It's equally fast and doesn't crash when we encounter a GPU reset.
v2: rebased on new SA interface.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A startover with a new idea for a multiple ring allocator.
Should perform as well as a normal ring allocator as long
as only one ring does somthing, but falls back to a more
complex algorithm if more complex things start to happen.
We store the last allocated bo in last, we always try to allocate
after the last allocated bo. Principle is that in a linear GPU ring
progression was is after last is the oldest bo we allocated and thus
the first one that should no longer be in use by the GPU.
If it's not the case we skip over the bo after last to the closest
done bo if such one exist. If none exist and we are not asked to
block we report failure to allocate.
If we are asked to block we wait on all the oldest fence of all
rings. We just wait for any of those fence to complete.
v2: We need to be able to let hole point to the list_head, otherwise
try free will never free the first allocation of the list. Also
stop calling radeon_fence_signalled more than necessary.
v3: Don't free allocations without considering them as a hole,
otherwise we might lose holes. Also return ENOMEM instead of ENOENT
when running out of fences to wait for. Limit the number of holes
we try for each ring to 3.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Use one wait queue for all rings. When one ring progress, other
likely does to and we are not expecting to have a lot of waiter
anyway.
Also add a fence_wait_any that will wait until the first fence
in the fence array (one fence per ring) is signaled. This allow
to wait on all rings.
v2: some minor cleanups and improvements.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Define the interface without modifying the allocation
algorithm in any way.
v2: rebase on top of fence new uint64 patch
v3: add ring to debugfs output
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of offset + size keep start and end offset directly.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Make the suballocator self containing to locking.
v2: split the bugfix into a seperate patch.
v3: remove some unreleated changes.
Sig-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of hacking the calculation multiple times.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some callers illegal called fence_wait_next/empty
while holding the ring emission mutex. So don't
relock the mutex in that cases, and move the actual
locking into the fence code.
v2: Don't try to unlock the mutex if it isn't locked.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Using 64bits fence sequence we can directly compare sequence
number to know if a fence is signaled or not. Thus the fence
list became useless, so does the fence lock that mainly
protected the fence list.
Things like ring.ready are no longer behind a lock, this should
be ok as ring.ready is initialized once and will only change
when facing lockup. Worst case is that we return an -EBUSY just
after a successfull GPU reset, or we go into wait state instead
of returning -EBUSY (thus delaying reporting -EBUSY to fence
wait caller).
v2: Remove left over comment, force using writeback on cayman and
newer, thus not having to suffer from possibly scratch reg
exhaustion
v3: Rebase on top of change to uint64 fence patch
v4: Change DCE5 test to force write back on cayman and newer but
also any APU such as PALM or SUMO family
v5: Rebase on top of new uint64 fence patch
v6: Just break if seq doesn't change any more. Use radeon_fence
prefix for all function names. Even if it's now highly optimized,
try avoiding polling to often.
v7: We should never poll the last_seq from the hardware without
waking the sleeping threads, otherwise we might lose events.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This convert fence to use uint64_t sequence number intention is
to use the fact that uin64_t is big enough that we don't need to
care about wrap around.
Tested with and without writeback using 0xFFFFF000 as initial
fence sequence and thus allowing to test the wrap around from
32bits to 64bits.
v2: Add comment about possible race btw CPU & GPU, add comment
stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET
Read fence sequenc in reverse order of GPU write them so we
mitigate the race btw CPU and GPU.
v3: Drop the need for ring to emit the 64bits fence, and just have
each ring emit the lower 32bits of the fence sequence. We
handle the wrap over 32bits in fence_process.
v4: Just a small optimization: Don't reread the last_seq value
if loop restarts, since we already know its value anyway.
Also start at zero not one for seq value and use pre instead
of post increment in emmit, otherwise wait_empty will deadlock.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A single global mutex for ring submissions seems sufficient.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We need to sync with the GFX ring as ttm might have schedule bo move
on it and new command scheduled for other ring need to wait for bo
data to be in place.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... given that I essentially run the show already, let's make this
official.
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
These two functions are actually hw-specific and only valid for gm45
thru gen7. HSW completely changes how this works, so label them
accordingly.
v2: s/gm45/g4x/ like for the previous patch.
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Generally we call stuff with i9xx_ when it's valid for gen3+. But
gen3 and early gen4 only support hdmi with sdvo cards, and writing
infoframes works completely different there.
v2: Use g4x instead of gm45 - it applies to the desktop variant, too.
v3: Properly align the paramters of g4x_write_infoframe again, noticed
by Paulo Zanoni.
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Simplifies things because for all the infoframes we care about,
we always send them on each vblank. Also, this gets rid of one
of the hw specific functions mislabelled with the intel_ prefix -
hsw will completely change how this works!
Acked-by: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The principle of intel_mark_busy() is that we want to spot the
transition of when the display engine is being used in order to bump
powersaving modes and increase display clocks. As such it is only
important when the display is changing, i.e. when rendering to the
scanout or other sprite/plane, and these are characterised by being
pinned.
v2: Mark the whole device as busy on execbuffer and pageflips as well
and rebase against dinq for the minor bug fix to be immediately
applicable.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: fix compile fail.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
intel_wait_for_vblank uses PIPESTAT, which does not exist on Ironlake
and newer, so now we use PIPEFRAME.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Ditch the check for disable pipe from the new ilk wait for
vblank function to keep it consisten with existing behaviour.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like Gen 4, IBX has a "Port Select" field on the DIP register,
but the ports are different.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
IBX does not need the workaround used in cpt_write_infoframe that
requires the AVI frame to be enabled while being updated.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The registers are on the PCH, so use the PCH name instead of the CPU
name. Also, the way this function is implemented is really only for
CPT and PPT. For now, both functions have the same implementations:
the next patch will fix ibx_write_infoframe.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Better safe than sorry. Currently we never change the frequency and
use the same for every infoframe type, so the only way to reproduce a
bug would be with the BIOS doing something.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
That's what the VIDEO_DIP_CTL documentation says we need to do. Except
when it's the AVI InfoFrame and we're ironlake_write_infoframe.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will allow us to disable an infoframe without changing its
frequency.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Should prevent bugs when changing the port.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make sure we're doing the right thing, just like we do on gen5+.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Don't use intermediate variables, change the value of 'val' as we go
through the function. The new style looks more similar to the rest of
our code. IMHO, it's also easier to read and change.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Gen3+ is 13 bits (12:0), and on gen2 only 12 (11:0). For both the high
bits are marked reserved, read-only so continue to mask them. Bit 31
is not reserved and has a meaning.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Backmerge of drm-next to resolve a few ugly conflicts and to get a few
fixes from 3.4-rc6 (which drm-next has already merged). Note that this
merge also restricts the stencil cache lra evict policy workaround to
snb (as it should) - I had to frob the code anyway because the
CM0_MASK_SHIFT define died in the masked bit cleanups.
We need the backmerge to get Paulo Zanoni's infoframe regression fix
for gm45 - further bugfixes from him touch the same area and would
needlessly conflict.
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S
5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp
6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3
KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm
JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa
1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU=
=HkMd
-----END PGP SIGNATURE-----
Merge tag 'v3.4-rc6' into drm-intel-next
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Ok, this is a fun story of git totally messing things up. There
/shouldn't/ be any conflict in here, because the fixes in -rc6 do only
touch functions that have not been changed in -next.
The offending commits in drm-next are 14415745b2..1fa611065 which
simply move a few functions from intel_display.c to intel_pm.c. The
problem seems to be that git diff gets completely confused:
$ git diff 14415745b2..1fa611065
is a nice mess in intel_display.c, and the diff leaks into totally
unrelated functions, whereas
$git diff --minimal 14415745b2..1fa611065
is exactly what we want.
Unfortunately there seems to be no way to teach similar smarts to the
merge diff and conflict generation code, because with the minimal diff
there really shouldn't be any conflicts. For added hilarity, every
time something in that area changes the + and - lines in the diff move
around like crazy, again resulting in new conflicts. So I fear this
mess will stay with us for a little longer (and might result in
another backmerge down the road).
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
cc1: warning: include/drm: No such file or directory [enabled by default]
It's reproducible if you build with O=/some/obj/dir and W=1.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This was mostly already fixed but this one change is needed to match Kirill's
original submission
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>