This avoids them getting in the way of BOs which might be accessed by
the CPU. They can still go to the CPU accessible part of VRAM though if
there's no space outside of it.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
First round of fixes for 3.18.
- Use gart for DMA ring tests to avoid caching issues with HDP
- SI dpm stability fixes
- Performance stabilization fixes
- misc other things
* 'drm-fixes-3.18' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: reduce sparse false positive warnings
drm/radeon: fix vm page table block size calculation
drm/ttm: Don't evict BOs outside of the requested placement range
drm/ttm: Don't skip fpfn check if lpfn is 0 in ttm_bo_mem_compat
drm/radeon: use gart memory for DMA ring tests
drm/radeon: fix speaker allocation setup
drm/radeon: initialize sadb to NULL in the audio code
Revert "drm/radeon/dpm: drop clk/voltage dependency filters for SI"
Revert "drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table"
Avoids HDP cache flush issues when using vram which can
cause ring test failures on certain boards.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Alexander Fyodorov <halcy@yandex.ru>
Cc: stable@vger.kernel.org
Pull drm updates from Dave Airlie:
"This is the main git pull for the drm,
I pretty much froze major pulls at -rc5/6 time, and haven't had much
fallout, so will probably continue doing that.
Lots of changes all over, big internal header cleanup to make it clear
drm features are legacy things and what are things that modern KMS
drivers should be using. Also big move to use the new generic fences
in all the TTM drivers.
core:
atomic prep work,
vblank rework changes, allows immediate vblank disables
major header reworking and cleanups to better delinate legacy
interfaces from what KMS drivers should be using.
cursor planes locking fixes
ttm:
move to generic fences (affects all TTM drivers)
ppc64 caching fixes
radeon:
userptr support,
uvd for old asics,
reset rework for fence changes
better buffer placement changes,
dpm feature enablement
hdmi audio support fixes
intel:
Cherryview work,
180 degree rotation,
skylake prep work,
execlist command submission
full ppgtt prep work
cursor improvements
edid caching,
vdd handling improvements
nouveau:
fence reworking
kepler memory clock work
gt21x clock work
fan control improvements
hdmi infoframe fixes
DP audio
ast:
ppc64 fixes
caching fix
rcar:
rcar-du DT support
ipuv3:
prep work for capture support
msm:
LVDS support for mdp4, new panel, gpu refactoring
exynos:
exynos3250 SoC support, drop bad mmap interface,
mipi dsi changes, and component match support"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (640 commits)
drm/mst: rework payload table allocation to conform better.
drm/ast: Fix HW cursor image
drm/radeon/kv: add uvd/vce info to dpm debugfs output
drm/radeon/ci: add uvd/vce info to dpm debugfs output
drm/radeon: export reservation_object from dmabuf to ttm
drm/radeon: cope with foreign fences inside the reservation object
drm/radeon: cope with foreign fences inside display
drm/core: use helper to check driver features
drm/radeon/cik: write gfx ucode version to ucode addr reg
drm/radeon/si: print full CS when we hit a packet 0
drm/radeon: remove unecessary includes
drm/radeon/combios: declare legacy_connector_convert as static
drm/radeon/atombios: declare connector convert tables as static
drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table
drm/radeon/dpm: drop clk/voltage dependency filters for BTC
drm/radeon/dpm: drop clk/voltage dependency filters for CI
drm/radeon/dpm: drop clk/voltage dependency filters for SI
drm/radeon/dpm: drop clk/voltage dependency filters for NI
drm/radeon: disable audio when we disable hdmi (v2)
drm/radeon: split audio enable between eg and r600 (v2)
...
Not the whole world is a radeon! :-)
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: Don't forget git add, noticed by David.
Cc: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Acked-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add a module parameter to disable the radeon GPU backlight
controller to override the automatic detection. Some
laptops seems to indicate that they use the integrated
controller, but appear to actually use an external
controller.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=81382
v2: fix module parameter description
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJUFjfVAAoJEHm+PkMAQRiGANkIAIU3PNrAz9dIItq8a/rEAhnx
l2shHoOyEmyNR2apholM3BPUNX50cbsc/HGdi7lZKLkA/ifAj6B9nFD2NzVsIChD
1QWVcvdkKlVuxXCDd26qbijlfmbTOAWrLw9ntvM+J6ZtECM6zCAZF4MAV/FwogPq
ETGKD76AxJtVIhBMS99troAiC1YxmQ7DKgEr8CraTOR1qwXEonnPCmN/IZA6x2/G
EXiihOuQB5me1X7k4PI0V8CDscQOn+3B2CQHIrjRB+KiTF+iKIuI8n6ORC6bpFh+
U8UZP9wLlIG1BrUHG83pIndglIHotqPcjmtfl1WGrRr2hn7abzVSfV+g5Syo3Vg=
=Ep+s
-----END PGP SIGNATURE-----
drm: backmerge tag 'v3.17-rc5' into drm-next
This is requested to get the fixes for intel and radeon into the
same tree for future development work.
i915_display.c: fix missing dev_priv conflict.
This allows us to specify if we want to sync to
the shared fences of a reservation object or not.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
DRM_DEBUG_CODE is currently always set, so distributions enable it. The
only reason to keep support in code is if developers wanted to disable
debug support. Sounds unlikely.
All the DRM_DEBUG() printks are still guarded by a drm_debug read. So if
its cacheline is read once, they're discarded pretty fast.. There should
hardly be any performance penalty, it's even guarded by unlikely().
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Changes since v1:
- Kill the sw interrupt dance, add and use
radeon_irq_kms_sw_irq_get_delayed instead.
- Change custom wait function, lockdep complained about it.
Holding exclusive_lock in the wait function might cause deadlocks.
Instead do all the processing in .enable_signaling, and wait
on the global fence_queue to pick up gpu resets.
- Process all fences in radeon_gpu_reset after reset to close a race
with the trylock in enable_signaling.
Changes since v2:
- Small changes to work with the rewritten lockup recovery patches.
Changes since v3:
- Call radeon_fence_schedule_check when exclusive_lock cannot be
acquired to always cause a wake up.
- Reset irqs from hangup check.
- Drop reading seqno in the callback, use cached value.
- Fix indentation in radeon_fence_default_wait
- Add a radeon_test_signaled function, drop a few test_bit calls.
- Make to_radeon_fence global.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
This improves concurrent stream decoding.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
More radeon changes for drm-next. Highlights:
- UVD support for older asics
- Reset rework in preparation for Maarten's fence patches
I have a few more patches which depend on Christian's ttm changes,
I'll send them out separately once you've merged the ttm changes.
* 'drm-next-3.18' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: drop doing resets in a work item
drm/radeon: drop RADEON_FENCE_SIGNALED_SEQ v2
drm/radeon: add timeout argument to radeon_fence_wait_seq v2
drm/radeon: handle lockup in delayed work, v5
drm/radeon: take exclusive_lock in read mode during ring tests, v5
drm/radeon: force fence completion only on problematic rings (v2)
drm/radeon: wake up all fences on manual reset
drm/radeon: add UVD fw names for older asic
drm/radeon: enable RB_ARB before resetting the VCPU
drm/radeon: 760G/780V/880V don't have UVD
drm/radeon: implement UVD hw workarounds for R6xx v3
drm/radeon: add UVD support for older asics v4
drm/radeon: add set_uvd_clocks callback for r6xx v4
drm/radeon: properly init UVD MC bits on R600
drm/radeon: force UVD buffers into VRAM on RS[78]80 v2
drm/radeon: move the IB test after the AGP fallback
Blocking completely innocent processes with a GPU reset is
a pretty bad idea. Just set needs_reset and let the next
command submission or fence wait do the job.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's causing issues with VMID handling and comparing the
fence value two times actually doesn't make handling faster.
v2: rebased on reset changes
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v5 (chk): complete rework, start when the first fence is emitted,
stop when the last fence is signalled, make it work
correctly with GPU resets, cleanup radeon_fence_wait_seq
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is needed for the next commit, because the lockup detection
will need the read lock to run.
v4 (chk): split out forced fence completion, remove unrelated changes,
add and handle in_reset flag
v5 (agd5f): rebase fix
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of resetting all fence numbers, only reset the
number of the problematic ring. Split out from a patch
from Maarten Lankhorst <maarten.lankhorst@canonical.com>
v2 (agd5f): rebase build fix
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows us to more fine grained specify where to place the buffer object.
v2: rebased on drm-next, add bochs changes as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This fixes a problem with GPU resets and TLB flushes on SI/CIK.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
radeon userptr support.
* 'drm-next-3.18' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: allow userptr write access under certain conditions
drm/radeon: add userptr flag to register MMU notifier v3
drm/radeon: add userptr flag to directly validate the BO to GTT
drm/radeon: add userptr flag to limit it to anonymous memory v2
drm/radeon: add userptr support v8
Conflicts:
drivers/gpu/drm/radeon/radeon_prime.c
It isn't necessary for command streams generated by the kernel (at least
not while we aren't storing ring or indirect buffers in VRAM).
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a module paramter to enable bapm on APUs. It's disabled
by default on certain APUs due to stability issues. This
option makes it easier to test and to enable it on systems that
are stable.
bug:
https://bugzilla.kernel.org/show_bug.cgi?id=81021
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Whenever userspace mapping related to our userptr change
we wait for it to become idle and unmap it from GTT.
v2: rebased, fix mutex unlock in error path
v3: improve commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.
It imposes several restrictions upon the memory being mapped:
1. It must be page aligned (both start/end addresses, i.e ptr and size).
2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).
3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.
4. The BO is only mapped readonly for now, so no write support.
5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.
Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about it's availability in the API definition
v8: drop access_ok check, fix VM mapping bits
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v4)
Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Skip the "manual" pageflip completion checks via polling and
guessing in the vblank handler radeon_crtc_handle_vblank() on
asics which are known to reliably support hw pageflip completion
irqs. Those pflip irqs are a more reliable and race-free method
of handling pageflip completion detection, whereas the "classic"
polling method has some small races in combination with dpm on,
and with the reworked pageflip implementation since Linux 3.16.
On old asics without pflip irqs, the classic method is used.
On asics with known good pflip irqs, only pflip irqs are used
by default, but a new module parameter "use_pflipirqs" allows to
override this in case we encounter asics in the wild with
unreliable or faulty pflip irqs. A module parameter of 0 allows
to use the classic method only in such a case. A parameter of 1
allows to use both classic method and pflip irqs as additional
band-aid to avoid some small races which could happen with the
classic method alone. The setting 1 gives Linux 3.16 behaviour.
Hw pflip irqs are available since R600.
Tested on DCE-4, AMD Cedar - FirePro 2270.
v2: agd5f: only enable pflip interrupts on DCE4+ as they are not
reliable on older asics.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move the decision what to use into the common VM code.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Scales much better than scanning the address range linearly.
v2: store pfn instead of address
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't wait for the BO to be used again, just
update the PT on the next VM use.
v2: remove stray semicolon.
Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Some hawaii boards use a different method for fetching the
voltage information from the vbios.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This ensures the GPU sees all previous CPU writes to VRAM, which makes it
safe:
* For userspace to stream data from CPU to GPU via VRAM instead of GTT
* For IBs to be stored in VRAM instead of GTT
* For ring buffers to be stored in VRAM instead of GTT, if the HPD flush
is performed via MMIO
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
And clean up the function comment a little.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
That didn't worked correctly any more and opened up a security problem.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Unused and unimplemented. Also fix specifying the
kernel flag incorrectly at one occasion.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now that fallback to gtt is fixed for cpu access, we can
remove this limit.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=78717
v2: use new gart_pin_size to accurately track available gtt.
v3: fix comment
v4: clarify comment
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
So we know how large an allocation we can allow.
v2: incorporate Michel's comments
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
v2: fix rebase onto drm-fixes
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Doesn't seem necessary, the GART table memory should be persistent.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was originally un-inlined by Andi Kleen in 2011 citing size concerns.
Indeed, a first attempt at inlining it grew radeon.ko by 7%.
However, 2% of cpu is spent in this function. Simply inlining it gave 1% more fps
in Urban Terror.
v2: We know the minimum MMIO size. Adding it to the if allows the compiler to
optimize the branch out, improving both performance and size.
The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but common sense
says perf is now more than 1% better.
v3: Also change _wreg, make the threshold a define.
Inlining _wreg increased the size a bit compared to v2, so now radeon.ko
is only 1% smaller.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds CIK support for the new ucode format.
v2: add size validation, integrate debug info
v3: add support for MEC2 on KV
v4: fix typos
v4: update to latest format
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds SI support for the new ucode format.
v2: add size validation, integrate debug info
v3: update to latest version
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Calling radeon_vm_bo_find on the IB BO during CS
is illegal and can lead to an crash.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v3: completely rewritten. We now just remember which areas
of the PT to clear and do so on the next command submission.
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=79980
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As well as enabling the vblank interrupt. These shouldn't take any
significant amount of time, but at least pinning the BO has actually been
seen to fail in practice before, in which case we need to let userspace
know about it.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>