This allows us to query certain registers from userspace
for profiling and harvest configuration. E.g., it can
be used by the GALLIUM_HUD for profiling the status of
various gfx blocks.
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow the UMDs to query the current sclk/mclk
for profiling, etc.
Tested-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows pinning of buffers in the non-CPU visible portion of
vram.
v2: incorporate Michel's comments.
v3: rebase on Michel's patch
v4: rebase on Michel's v2 patch
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This flag is a hint that userspace expects the BO to be accessed by the
CPU. We can use that hint to prevent such BOs from ever being stored in
the CPU inaccessible part of VRAM.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
radeon userptr support.
* 'drm-next-3.18' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: allow userptr write access under certain conditions
drm/radeon: add userptr flag to register MMU notifier v3
drm/radeon: add userptr flag to directly validate the BO to GTT
drm/radeon: add userptr flag to limit it to anonymous memory v2
drm/radeon: add userptr support v8
Conflicts:
drivers/gpu/drm/radeon/radeon_prime.c
Instead of hard-coding the value, properly document
that this is a userspace interface.
No intended functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Whenever a userspace mapping related to our userptr changes,
we wait for it to become idle and unmap it from GTT.
v2: rebased, fix mutex unlock in error path
v3: improve commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This way we test userptr availability at BO creation time instead of first use.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoid problems with writeback by limiting userptr to anonymous memory.
v2: add commit and code comments
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds an IOCTL for turning a pointer supplied by
userspace into a buffer object.
It imposes several restrictions upon the memory being mapped:
1. It must be page aligned (both start/end addresses, i.e. ptr and size).
2. It must be normal system memory, not a pointer into another map of IO
space (e.g. it must not be a GTT mmapping of another object).
3. The BO is mapped into GTT, so the maximum amount of memory mapped at
all times is still the GTT limit.
4. The BO is only mapped readonly for now, so no write support.
5. List of backing pages is only acquired once, so they represent a
snapshot of the first use.
Exporting and sharing as well as mapping of buffer objects created by
this function is forbidden and results in an -EPERM.
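
As an illustration of restriction 1, a caller could pre-check its arguments
before issuing the IOCTL. The following is only a minimal sketch: the helper
name userptr_range_is_page_aligned and its parameters are hypothetical and not
part of the radeon interface, and rejecting a zero size is an assumption made
here, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the radeon UAPI): returns true when both
 * the start address and the size are multiples of page_size. If both are
 * aligned, the end address (addr + size) is page aligned as well.
 */
bool userptr_range_is_page_aligned(uint64_t addr, uint64_t size,
                                   uint64_t page_size)
{
    /* The mask test below only works for a non-zero power-of-two page size. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t mask = page_size - 1;

    /* Assumption: an empty range is treated as invalid. */
    return size != 0 && ((addr | size) & mask) == 0;
}
```

In userspace the page size would typically come from sysconf(_SC_PAGESIZE)
rather than being hard-coded.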
v2: squash all previous changes into first public version
v3: fix tabs, map readonly, don't use MM callback any more
v4: set TTM_PAGE_FLAG_SG so that TTM never messes with the pages,
pin/unpin pages on bind/unbind instead of populate/unpopulate
v5: rebased on 3.17-wip, IOCTL renamed to userptr, reject any unknown
flags, better handle READONLY flag, improve permission check
v6: fix ptr cast warning, use set_page_dirty/mark_page_accessed on unpin
v7: add warning about its availability in the API definition
v8: drop access_ok check, fix VM mapping bits
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v4)
Reviewed-by: Jérôme Glisse <jglisse@redhat.com> (v4)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The statistics are:
- VRAM usage in bytes
- GTT usage in bytes
- number of bytes moved by TTM
The last one is actually a counter, so you need to sample it before and after
command submission and take the difference.
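
The before/after sampling described above could look roughly like this on the
userspace side. This is a sketch only: struct counter_sample and both helper
functions are hypothetical names, and how the current counter value is
obtained from the kernel is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the "before" sample of a monotonic 64-bit counter. */
struct counter_sample {
    uint64_t value;
};

/* Record the counter value read just before command submission. */
void counter_sample_begin(struct counter_sample *s, uint64_t current)
{
    assert(s != NULL);
    s->value = current;
}

/*
 * Return how far the counter advanced since counter_sample_begin().
 * Unsigned subtraction is modulo 2^64, so the difference stays correct
 * even if the counter wrapped once between the two samples.
 */
uint64_t counter_sample_end(const struct counter_sample *s, uint64_t current)
{
    assert(s != NULL);
    return current - s->value;
}
```

A caller would read the counter, call counter_sample_begin(), submit its
commands, read the counter again and pass that value to counter_sample_end().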
This is useful for finding performance bottlenecks. Userspace queries are
also added.
v2: use atomic64_t
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
When passing buffers between processes, the receiving process needs to know
the original buffer domain, so that it doesn't accidentally move the buffer.
v2: reserve the buffer
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Only VCE 2.0 support so far.
v2: squashing multiple patches into this one
v3: add IRQ support for CIK, major cleanups,
basic code documentation
v4: remove HAINAN from chipset list
Signed-off-by: Christian König <christian.koenig@amd.com>
This is needed for reporting the max GPU engine clock
in OpenCL. This just reports the max possible engine
clock; it does not take into account current conditions
that may limit that clock.
v2: fix query number for merge with 3.13
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This will allow userspace to correctly program the PA_SC_RASTER_CONFIG
register, so it can be considered a fix.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This is required to properly calculate the tiling parameters
in userspace.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CIK uses a different index for 1D DST surfaces compared to SI. Expose
the new index so libdrm_radeon can use it properly for userspace
drivers.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Also add a new RADEON_INFO query to check that CP DMA packets are
supported on the compute ring.
CP DMA has been supported since the 3.8 kernel, but due to an oversight
we forgot to teach the CS checker that the CP DMA packet was legal for
the compute ring on Southern Islands GPUs.
This patch fixes a bug where the radeon driver will incorrectly reject a legal
CP DMA packet from user space. I would like to have the patch
backported to stable so that we don't have to require Mesa users to use a
bleeding edge kernel in order to take advantage of this feature which
is already present in the stable kernels (3.8 and newer).
v2:
- Don't bump kms version, so this patch can be backported to stable
kernels.
Cc: stable@vger.kernel.org
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allow userspace to query for the tile mode array so userspace can properly
compute surface pitch and alignment requirements depending on tiling.
v2: Make strict aliasing safer by casting to char when copying
v3: merge fix from Christian
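
For readers unfamiliar with the aliasing concern in v2: copying an array of
32-bit words through a byte-wise interface avoids accessing the data through
an incompatible pointer type. The sketch below uses memcpy, which is a
swapped-in equivalent of the char cast mentioned above and not the code from
this patch; the function name copy_u32_words is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy `count` 32-bit words from `src` into an
 * arbitrary destination buffer. memcpy operates on bytes, so it does not
 * violate strict aliasing regardless of the destination's declared type.
 * The caller must ensure `dst` has room for count * sizeof(uint32_t) bytes.
 */
void copy_u32_words(void *dst, const uint32_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, count * sizeof(*src));
}
```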
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add new ioctl option and bump minor version number.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Just everything needed to decode videos using UVD.
v6: just all the bugfixes and support for R7xx-SI merged in one patch
v7: UVD_CGC_GATE is a write only register, lockup detection fix
v8: split out VRAM fallback changes, remove support for RV770,
add support for HEMLOCK, add buffer sizes checks
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch allows the CPU to map the stolen vram segment
directly rather than going through the PCI BAR. This
significantly improves performance for certain workloads with
a properly patched ddx.
Use radeon.fastfb=1 to enable it (disabled by default).
Currently only supported on RS690, but support for RS780/880
and newer APUs may be added eventually.
Signed-off-by: Samuel Li <samuel.li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This enables the functionality added in the previous
patches. Userspace acceleration drivers can use the
CS ioctl to submit command buffers to the async DMA
rings.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add requests to get the number of shader engines (SE) and
the number of SH per SE. These are needed for geometry
and tessellation shaders in the 3D driver as well as setting
up PA_SC_RASTER_CONFIG on SI asics.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
No version bump is required because setting the flag on older DRM has
no effect.
This only reserves the bit and doesn't use it. I assume we will use it
for buffer eviction heuristics.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>