Instead of calculating the size in bytes just to recalculate the number
of pages from it pass the BO directly to the function.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Allows us to gut a BO of it's backing store when the driver says that it
isn't needed any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
this patch actually refactor mailbox implmentations, and
all below changes are needed together to fix all those mailbox
handshake issues exposured by heavey TDR test.
1)refactor all mailbox functions based on byte accessing for mb_control
reason is to avoid touching non-related bits when writing trn/rcv part of
mailbox_control, this way some incorrect INTR sent to hypervisor
side could be avoided, and it fixes couple handshake bug.
2)trans_msg function re-impled: put a invalid
logic before transmitting message to make sure the ACK bit is in
a clear status, otherwise there is chance that ACK asserted already
before transmitting message and lead to fake ACK polling.
(hypervisor side have some tricks to workaround ACK bit being corrupted
by VF FLR which hase an side effects that may make guest side ACK bit
asserted wrongly), and clear TRANS_MSG words after message transferred.
3)for mailbox_flr_work, it is also re-worked: it takes the mutex lock
first if invoked, to block gpu recover's participate too early while
hypervisor side is doing VF FLR. (hypervisor sends FLR_NOTIFY to guest
before doing VF FLR and sentds FLR_COMPLETE after VF FLR done, and
the FLR_NOTIFY will trigger interrupt to guest which lead to
mailbox_flr_work being invoked)
This can avoid the issue that mailbox trans msg being cleared by its VF FLR.
4)for mailbox_rcv_irq IRQ routine, it should only peek msg and schedule
mailbox_flr_work, instead of ACK to hypervisor itself, because FLR_NOTIFY
msg sent from hypervisor side doesn't need VF's ACK (this is because
VF's ACK would lead to hypervisor clear its trans_valid/msg, and this
would cause handshake bug if trans_valid/msg is cleared not due to
correct VF ACK but from a wrong VF ACK like this "FLR_NOTIFY" one)
This fixed handshake bug that sometimes GUEST always couldn't receive
"READY_TO_ACCESS_GPU" msg from hypervisor.
5)seperate polling time limite accordingly:
POLL ACK cost no more than 500ms
POLL MSG cost no more than 12000ms
POLL FLR finish cost no more than 500ms
6) we still need to set adev into in_gpu_reset mode after we received
FLR_NOTIFY from host side, this can prevent innocent app wrongly succesed
to open amdgpu dri device.
FLR_NOFITY is received due to an IDLE hang detected from hypervisor side
which indicating GPU is already die in this VF.
v2:
use MACRO as the offset of mailbox_control register
don't test if NOTIFY_CMPL event in rcv_msg since it won't
recieve that message anymore
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Pixel Ding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
mailbox registers can be accessed with a byte boundry according
to BIF team, so this patch prepares register byte access
and will be used by following patches.
Actually, for mailbox registers once the byte field is touched even not changed,
the mailbox behaves, so we need the byte width accessing to those sort of regs.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Pixel Ding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The vram type for dGPU is stored in umc_info while sys mem type
for APU is stored in integratedsysteminfo
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The commit d296278fd372003fc69588acfd0c0c5edbdf4874 added support for
detecting DDR4 but omitted the label that is printed out in
amdgpu_bo_init() resulting in a KASAN error.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The amdgpu_ucode_fini_bo should be called after gfx_v8_0_hw_fini,
or it will have KCQ disable failed issue.
For Tonga, as it firstly finishes SMC block, and the SMC hw fini
will call amdgpu_ucode_fini, which will lead the amdgpu_ucode_fini_bo
called before gfx_v8_0_hw_fini, this is incorrect.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The amdgpu_pm_sysfs_fini should call before amdgpu_device_ip_fini,
or the adev->pm.dpm_enabled would be set to 0, then the device files
related to pp won't be removed by amdgpu_pm_sysfs_fini when unload
driver.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't need the page array for prime shared BOs, stop allocating it.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows drivers to only allocate dma addresses, but not a page
array.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Let's stop mangling everything in a single header and create one header
per object instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Roger He <Hongbo.He@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch replace instances of drm_framebuffer_unreference with _put()
suffix, because it is shorter and consistent with the kernel use of
*_get/put() suffixes.
This was done with the following Coccinelle script:
@r@
expression e;
@@
(
-drm_framebuffer_reference(e);
+drm_framebuffer_get(e);
|
-drm_framebuffer_unreference(e);
+drm_framebuffer_put(e);
)
Signed-off-by: Haneen Mohammed <hamohammed.sa@gmail.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180311233313.GA19721@Haneen
The rockchip DRM driver is quite careful to disable interrupts
when taking a lock that is also taken in interrupt context,
which is a good thing.
What is a bit over the top is to use spin_lock_irqsave when
already in interrupt context, as you cannot take another
interrupt again, and disabling interrupt is just pure
overhead.
Switching to the non _irqsave version in interrupt context is
more logical, and less heavy handed.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-4-marc.zyngier@arm.com
memcpy is only meant to be used for memory, and only that.
MMIO accessors should be used to access MMIO regions, preferably
the ones that correspond to the size of the register accessed.
Let's convert the bulk register copy to writel/readl_relaxed,
which is the correct API.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-3-marc.zyngier@arm.com
Calling request_irq() followed by disable_irq() is usually a bad idea,
specially if the interrupt can be pending, and you're not yet in a
position to handle it.
This is exactly what happens on my kevin system when rebooting in a
second kernel using kexec: Some interrupt is left pending from
the previous kernel, and we take it too early, before disable_irq()
could do anything.
Let's clear the pending interrupts as we initialize the HW, and move
the interrupt request after that point. This ensures that we're in
a sane state when the interrupt is requested.
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[adapted to recent rockchip-drm changes]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180220130120.5254-2-marc.zyngier@arm.com
Internally Mali DP uses an RGB pipeline so video layers that support
YUV input buffers need to convert the input data to RGB. The YUV
buffers can have various encodings and this patch introduces support
for BT.601, BT.709 and BT.2020 encodings, both limited and full ranges.
This patch adds support for specifying the color encoding of the
input buffers for the planes that are backed by the video layers
and programs the YUV2RGB coefficients into hardware based on the
selected encoding.
Signed-off-by: Mihail Atanassov <mihail.atanassov@arm.com>
[updated to use standard properties]
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
When unbinding the mali-dp driver the drm_vblank_cleanup() function
warns us that the vblanks are still enabled. Fix that by calling
drm_crtc_vblank_off() in the malidp_unbind() function.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
The plane cleanup handler currently calls drm_plane_helper_disable(),
which is a legacy helper function. Replace it with a call to
drm_atomic_helper_shutdown() at removal time.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
The top-level error handler calls drm_mode_config_cleanup() which will
destroy all planes. There's no need to destroy them manually in lower
error handlers.
As plane cleanup is now handled entirely by drm_mode_config_cleanup(),
we must ensure that the plane .destroy() handler frees allocated memory
for the plane object that was freed by malidp_de_planes_destroy(). Do so
by replacing the call to devm_kfree() in the .destroy() handler by
kfree(). devm_kfree() is currently a no-op as the plane memory is
allocated with kzalloc(), not devm_kzalloc().
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali DP hardware has a 'go' bit (config_valid) for making the new scene
parameters active at the next page flip. The problem with the current
code is that the driver first sets this bit and then proceeds to wait
for confirmation from the hardware that the configuration has been
updated before arming the vblank event. As config_valid is actually
asserted by the hardware after the vblank event, during the prefetch
phase, when we get to arming the vblank event we are going to send it
at the next vblank, in effect halving the vblank rate from the userspace
perspective.
Fix it by sending the userspace event from the IRQ handler, when we
handle the config_valid interrupt, which syncs with the time when the
hardware is active with the new parameters.
Reported-by: Alexandru-Cosmin Gheorghe <alexandru-cosmin.gheorghe@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali dp needs to disable pixel alpha blending (use layer alpha blending) to
display color formats that do not contain alpha bits per pixel
This patch depends on:
"[PATCH v2 01/19] drm/fourcc: Add a alpha field to drm_format_info"
Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
In the case, when the user wants to scale and rotate a layer by 90/270
degrees, the scaling engine input dimensions' parameters ie width and
height needs to be swapped with respect to the layer's input dimensions.
This means scaling engine input height should be set to layer's input
width and scaling engine input width should be set to
layer's input height.
Signed-off-by: Ayan Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Currently the scaling engine gets enabled for a plane where the input
size differs from the composition size. As rotation is done natively
by the plane's hardware layer, we don't need the scaling engine to be
enabled.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
We use "mc" without initializing it if scaling is not necessary.
Fixes: 28ce675b74 ("drm: mali-dp: Add plane upscaling support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mihail Atanassov <Mihail.Atanassov@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Mali DP hardware needs pitch line sizes aligned to the bus burst
size for reads, so take that into consideration when allocating dumb
buffers. If the layer is rotated then the stride size requirement is
even larger for some hardware versions, so allocate for the worst case
scenario. Update the ->dumb_create() hook to a driver specific function
that sets the correct pitch size.
Reported-by: Ayan Halder <ayan.halder@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Rotated planes need a pitch size that is aligned to 8 bytes
for older DP500 and DP550 and at least 64 bytes for DP650. Replace
the malidp_hw_pitch_valid() function with one that calculates
the correct pitch alignment to take into account rotation.
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
We currently wait for the panel to mirror our intended PSR state
before continuing on both PSR enter and PSR exit. This is really
only important to do when we're entering PSR, since we want to
be sure the last frame we pushed is being served from the panel's
internal fb before shutting down the soc blocks (vop/analogix).
This patch changes the behavior such that we only wait for the
panel to complete the PSR transition when we're entering PSR, and
to skip verification when we're exiting.
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-7-enric.balletbo@collabora.com
Like many other panel drivers, this one fails to build when backlight
support is disabled:
drivers/gpu/drm/panel/panel-raydium-rm68200.o: In function `rm68200_probe':
panel-raydium-rm68200.c:(.text+0x14a): undefined reference to `devm_of_find_backlight'
This adds the appropriate dependency.
Note that while include/linux/backlight.h provides a stub inline when
backlight support is not enabled, this isn't enough to deal with the
case where backlight support is built as a module but the panel driver
is built-in, in which case linking will still fail as above.
One way to avoid this is to add a dependency such as this:
depends on BACKLIGHT_CLASS_DEVICE || BACKLIGHT_CLASS_DEVICE=n
but that is rather complex and misses the point that the panel support
is mostly useless without backlight support.
Fixes: 2b7ed18bed ("drm/panel: Add support for Raydium RM68200 panel driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[treding@nvidia.com: clarify the need for the dependency]
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180313210015.3344380-1-arnd@arndb.de
Add a lock to vop to avoid disabling the crtc while waiting for a line
flag while enabling psr. If we disable in the middle of waiting for the
line flag, we'll end up timing out or worse.
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-5-enric.balletbo@collabora.com
We would meet a short black screen when exit PSR with the full link
training, In this case, we should use fast link train instead of full
link training.
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
[dropped header reordering]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-6-enric.balletbo@collabora.com
There is a race between AUX CH bring-up and enabling bridge which will
cause link training to fail. To avoid hitting it, don't change psr state
while enabling the bridge.
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
[seanpaul fixed up the commit message a bit and renamed *_supported to *_enabled]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-4-enric.balletbo@collabora.com
Now that the spinlocks and timers are gone, we can remove the psr
worker located in rockchip's analogix driver and do the enable/disable
directly. This should simplify the code and remove races on disable.
Cc: 征增 王 <wzz@rock-chips.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-3-enric.balletbo@collabora.com
Make sure the request PSR state takes effect in analogix_dp_send_psr_spd()
function, or print the sink PSR error state if we failed to apply the
requested PSR setting.
Cc: 征增 王 <wzz@rock-chips.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
[seanpaul changed timeout loop to a readx poll]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Thierry Escande <thierry.escande@collabora.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309222327.18689-2-enric.balletbo@collabora.com
When CONFIG_OMAP2_DSS_DPI is disabled, compilation fails due to:
drivers/gpu/drm/omapdrm/dss/dss.h:388:25: error: conflicting types for ‘port’
struct device_node *port,
^~~~
Fix this by renaming the first parameter correctly.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
When compiling with CONFIG_OMAP2_DSS_DEBUGFS disabled, build fails due
to:
drivers/gpu/drm/omapdrm/dss/dss.c:1474:10: error: ‘dss_debug_dump_clocks’ undeclared (first use in this function); did you mean ‘dispc_dump_clocks’?
dss_debug_dump_clocks, dss);
^~~~~~~~~~~~~~~~~~~~~
dispc_dump_clocks
Fix this by moving the required functions outside #if
defined(CONFIG_OMAP2_DSS_DEBUGFS).
In the long term, we perhaps want to try to get all the debugfs support
left out if debugfs is not enabled.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Handle both positive and negative dclk polarity,
according to bus_flags, taking care of this:
On A20 and similar SoCs, the only way to achieve Positive Edge
(Rising Edge), is setting dclk clock phase to 2/3(240°).
By default TCON works in Negative Edge(Falling Edge), this is why phase
is set to 0 in that case.
Unfortunately there's no way to logically invert dclk through IO_POL
register.
The only acceptable way to work, triple checked with scope,
is using clock phase set to 0° for Negative Edge and set to 240° for
Positive Edge.
On A33 and similar SoCs there would be a 90° phase option, but it divides
also dclk by 2.
This patch is a way to avoid quirks all around TCON and DOTCLOCK drivers
for using A33 90° phase divided by 2 and consequently increase code
complexity.
Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520963677-124239-1-git-send-email-giulio.benetti@micronovasrl.com
mode_valid function must be connected to encoder.
Otherwise it could get not be called by drm in the case there's a
bridge connected to encoder instead of a panel.
Move mode_valid function pointer to encoder helper functions,
changing its prototype according to encoder helper function pointer.
Signed-off-by: Giulio Benetti <giulio.benetti@micronovasrl.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520941017-81177-1-git-send-email-giulio.benetti@micronovasrl.com
UAPI Changes:
- Query uAPI interface (used for GPU topology information currently)
* Mesa: https://patchwork.freedesktop.org/series/38795/
Driver Changes:
- Increase PSR2 size for CNL (DK)
- Avoid retraining LSPCON link unnecessarily (Ville)
- Decrease request signaling latency (Chris)
- GuC error capture fix (Daniele)
* tag 'drm-intel-next-2018-03-08' of git://anongit.freedesktop.org/drm/drm-intel: (127 commits)
drm/i915: Update DRIVER_DATE to 20180308
drm/i915: add schedule out notification of preempted but completed request
drm/i915: expose rcs topology through query uAPI
drm/i915: add query uAPI
drm/i915: add rcs topology to error state
drm/i915/debugfs: add rcs topology entry
drm/i915/debugfs: reuse max slice/subslices already stored in sseu
drm/i915: store all subslice masks
drm/i915/guc: work around gcc-4.4.4 union initializer issue
drm/i915/cnl: Add Wa_2201832410
drm/i915/icl: Gen11 forcewake support
drm/i915/icl: Add Indirect Context Offset for Gen11
drm/i915/icl: Enhanced execution list support
drm/i915/icl: new context descriptor support
drm/i915/icl: Correctly initialize the Gen11 engines
drm/i915: Assert that the request is indeed complete when signaled from irq
drm/i915: Handle changing enable_fbc parameter at runtime better.
drm/i915: Track whether the DP link is trained or not
drm/i915: Nuke intel_dp->channel_eq_status
drm/i915: Move SST DP link retraining into the ->post_hotplug() hook
...
Major points for this pull request:
- Add dGPU support for amdkfd initialization code and queue handling. It's
not complete support since the GPUVM part is missing (the under debate stuff).
- Enable PCIe atomics for dGPU if present
- Various adjustments to the amdgpu<-->amdkfd interface for dGPUs
- Refactor IOMMUv2 code to allow loading amdkfd without IOMMUv2 in the system
- Add HSA process eviction code in case of system memory pressure
- Various fixes and small changes
* tag 'drm-amdkfd-next-2018-03-11' of git://people.freedesktop.org/~gabbayo/linux: (24 commits)
uapi: Fix type used in ioctl parameter structures
drm/amdkfd: Implement KFD process eviction/restore
drm/amdkfd: Add GPUVM virtual address space to PDD
drm/amdkfd: Remove unaligned memory access
drm/amdkfd: Centralize IOMMUv2 code and make it conditional
drm/amdgpu: Add submit IB function for KFD
drm/amdgpu: Add GPUVM memory management functions for KFD
drm/amdgpu: add amdgpu_sync_clone
drm/amdgpu: Update kgd2kfd_shared_resources for dGPU support
drm/amdgpu: Add KFD eviction fence
drm/amdgpu: Remove unused kfd2kgd interface
drm/amdgpu: Fix wrong mask in get_atc_vmid_pasid_mapping_pasid
drm/amdgpu: Fix header file dependencies
drm/amdgpu: Replace kgd_mem with amdgpu_bo for kernel pinned gtt mem
drm/amdgpu: remove useless BUG_ONs
drm/amdgpu: Enable KFD initialization on dGPUs
drm/amdkfd: Add dGPU device IDs and device info
drm/amdkfd: Add dGPU support to kernel_queue_init
drm/amdkfd: Add dGPU support to the MQD manager
drm/amdkfd: Add dGPU support to the device queue manager
...
Commit 5addcf0a5f ("nouveau: add runtime PM support (v0.9)") prevents
runtime suspend of the GPU if its integrated HDA controller is not bound
to a driver. The rationale appears to be that probing the HDA fails if
the GPU is in D3cold.
However we now use a device link to ensure that the GPU is runtime
resumed while the HDA controller is probed, rendering this safety
measure obsolete. Remove it.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/77e0ab74f3377ea9b6acf8fab624acfb4f7dbeca.1520068884.git.lukas@wunner.de
When switching the display on muxed machines, we currently force the HDA
controller into runtime suspend on the previously used GPU and into
runtime active state on the newly used GPU.
That's unnecessary if the GPU uses driver power control, we can just let
the audio device autosuspend or autoresume as it sees fit.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/098ed883460eb4976a899eac6f5192fefc877c0f.1520068884.git.lukas@wunner.de
Back in 2013, runtime PM for GPUs with integrated HDA controller was
introduced with commits 0d69704ae3 ("gpu/vga_switcheroo: add driver
control power feature. (v3)") and 246efa4a07 ("snd/hda: add runtime
suspend/resume on optimus support (v4)").
Briefly, the idea was that the HDA controller is forced on and off in
unison with the GPU.
The original code is mostly still in place even though it was never a
100% perfect solution: E.g. on access to the HDA controller, the GPU
is powered up via vga_switcheroo_runtime_resume_hdmi_audio() but there
are no provisions to keep it resumed until access to the HDA controller
has ceased: The GPU autosuspends after 5 seconds, rendering the HDA
controller inaccessible.
Additionally, a kludge is required when hda_intel.c probes: It has to
check whether the GPU is powered down (check_hdmi_disabled()) and defer
probing if so.
However in the meantime (in v4.10) the driver core has gained a feature
called device links which promises to solve such issues in a clean way:
It allows us to declare a dependency from the HDA controller (consumer)
to the GPU (supplier). The PM core then automagically ensures that the
GPU is runtime resumed as long as the HDA controller's ->probe hook is
executed and whenever the HDA controller is accessed.
By default, the HDA controller has a dependency on its parent, a PCIe
Root Port. Adding a device link creates another dependency on its
sibling:
PCIe Root Port
^ ^
| |
| |
HDA ===> GPU
The device link is not only used for runtime PM, it also guarantees that
on system sleep, the HDA controller suspends before the GPU and resumes
after the GPU, and on system shutdown the HDA controller's ->shutdown
hook is executed before the one of the GPU. It is a complete solution.
Using this functionality is as simple as calling device_link_add(),
which results in a dmesg entry like this:
pci 0000:01:00.1: Linked as a consumer to 0000:01:00.0
The code for the GPU-governed audio power management can thus be removed
(except where it's still needed for legacy manual power control).
The device link is added in a PCI quirk rather than in hda_intel.c.
It is therefore legal for the GPU to runtime suspend to D3cold even if
the HDA controller is not bound to a driver or if CONFIG_SND_HDA_INTEL
is not enabled, for accesses to the HDA controller will cause the GPU to
wake up regardless if they're occurring outside of hda_intel.c (think
config space readout via sysfs).
Contrary to the previous implementation, the HDA controller's power
state is now self-governed, rather than GPU-governed, whereas the GPU's
power state is no longer fully self-governed. (The HDA controller needs
to runtime suspend before the GPU can.)
It is thus crucial that runtime PM is always activated on the HDA
controller even if CONFIG_SND_HDA_POWER_SAVE_DEFAULT is set to 0 (which
is the default), lest the GPU stays awake. This is achieved by setting
the auto_runtime_pm flag on every codec and the AZX_DCAPS_PM_RUNTIME
flag on the HDA controller.
A side effect is that power consumption might be reduced if the GPU is
in use but the HDA controller is not, because the HDA controller is now
allowed to go to D3hot. Before, it was forced to stay in D0 as long as
the GPU was in use. (There is no reduction in power consumption on my
Nvidia GK107, but there might be on other chips.)
The code paths for legacy manual power control are adjusted such that
runtime PM is disabled during power off, thereby preventing the PM core
from resuming the HDA controller.
Note that the device link is not only added on vga_switcheroo capable
systems, but for *any* GPU with integrated HDA controller. The idea is
that the HDA controller streams audio via connectors located on the GPU,
so the GPU needs to be on for the HDA controller to do anything useful.
This commit implicitly fixes an unbalanced runtime PM ref upon unbind of
hda_intel.c: On ->probe, a runtime PM ref was previously released under
the condition "azx_has_pm_runtime(chip) || hda->use_vga_switcheroo", but
on ->remove a runtime PM ref was only acquired under the first of those
conditions. Thus, binding and unbinding the driver twice on a
vga_switcheroo capable system caused the runtime PM refcount to drop
below zero. The issue is resolved because the AZX_DCAPS_PM_RUNTIME flag
is now always set if use_vga_switcheroo is true.
For more information on device links please refer to:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html
Documentation/driver-api/device_link.rst
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/51bd38360ff502a8c42b1ebf4405ee1d3f27118d.1520068884.git.lukas@wunner.de
If DRM drivers use runtime PM, they currently notify vga_switcheroo
whenever they ->runtime_suspend or ->runtime_resume to update
vga_switcheroo's internal power state tracking.
That's essentially a duplication of a functionality performed by the
PM core as it already tracks the GPU's power state and vga_switcheroo
can always query it.
Introduce a new internal helper vga_switcheroo_pwr_state() which does
just that if runtime PM is used, or falls back to vga_switcheroo's
internal power state tracking if manual power control is used.
Drop a redundant power state check in set_audio_state() while at it.
This removes one of the two purposes of the notification mechanism
implemented by vga_switcheroo_set_dynamic_switch(). The other one is
power management of the audio device and we'll remove that next.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/0aa49d735b988aa04524a8dc339582ace33f0f94.1520068884.git.lukas@wunner.de
When cutting power to a GPU and its integrated HDA controller, their
cached current_state should be updated to D3cold to reflect reality.
We currently rely on the DRM and HDA drivers to do that, however:
- The HDA driver updates the current_state in azx_vs_set_state(), which
will no longer be called with driver power control once we migrate to
device links. (It will still be called with manual power control.)
- If the HDA device is not bound, its current_state remains at D0 even
though the GPU driver may decide to go to D3cold.
- The DRM drivers update the current_state using pci_set_power_state()
which can't put the device into a deeper power state than D3hot if the
GPU is not deemed power-manageable by the platform (even though it
*is* power-manageable by some nonstandard means, such as a _DSM).
Centralize updating the current_state of the GPU and HDA controller in
vga_switcheroo's ->runtime_suspend hook to overcome these deficiencies.
The GPU and HDA controller are two functions of the same PCI device
(VGA class device on function 0 and audio device on function 1) and
no other PCI devices reside on the same bus since this is a PCIe
point-to-point link, so we can just walk the bus and update the
current_state of all devices.
On ->runtime_resume, the HDA controller is in D0uninitialized state.
Resume to D0active and then let it autosuspend as it sees fit.
Note that vga_switcheroo_init_domain_pm_ops() is not supposed to be
called by hybrid graphics laptops which power down the GPU via its root
port's _PR3 resources and consequently vga_switcheroo_runtime_suspend()
is not used. On those laptops, the root port is power-manageable by the
platform (instead of by a nonstandard means) and the current_state is
therefore updated by the PCI core through the following call chain:
pci_set_power_state()
__pci_complete_power_transition()
pci_bus_set_current_state()
Resuming to D0active happens through:
pci_set_power_state()
__pci_start_power_transition()
pci_wakeup_bus()
Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk> # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com> # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/8416958482c8c42d6f311ea5c1e5a65ccf21f5db.1520068884.git.lukas@wunner.de
There are PCI devices which are power-manageable by a nonstandard means,
such as a custom ACPI method. One example are discrete GPUs in hybrid
graphics laptops, another are Thunderbolt controllers in Macs.
Such devices can't be put into D3cold with pci_set_power_state() because
pci_platform_power_transition() fails with -ENODEV. Instead they're put
into D3hot by pci_set_power_state() and subsequently into D3cold by
invoking the nonstandard means. However as a consequence the cached
current_state is incorrectly left at D3hot.
What we need to do is walk the hierarchy below such a PCI device on
powerdown and update the current_state to D3cold. On powerup the PCI
device itself and the hierarchy below it is in D0uninitialized, so we
need to walk the hierarchy again and wake all devices, causing them to
be put into D0active and then letting them autosuspend as they see fit.
To this end make pci_wakeup_bus() & pci_bus_set_current_state() public
so PCI drivers don't have to reinvent the wheel.
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/2962443259e7faec577274b4ef8c54aad66f9a94.1520068884.git.lukas@wunner.de
We leave PCI devices not bound to a driver in D0 during runtime suspend.
But they may have a parent which is bound and can be transitioned to
D3cold at runtime. Once the parent goes to D3cold, the unbound child
may go to D3cold as well. When the child goes to D3cold, its internal
state, including configuration of BARs, MSI, ASPM, MPS, etc., is lost.
One example are recent hybrid graphics laptops which cut power to the
discrete GPU when the root port above it goes to ACPI power state D3.
Users may provoke this by unbinding the GPU driver and allowing runtime
PM on the GPU via sysfs: The PM core will then treat the GPU as
"suspended", which in turn allows the root port to runtime suspend,
causing the power resources listed in its _PR3 object to be powered off.
The GPU's BARs will be uninitialized when a driver later probes it.
Another example are hybrid graphics laptops where the GPU itself (rather
than the root port) is capable of runtime suspending to D3cold. If the
GPU's integrated HDA controller is not bound and the GPU's driver
decides to runtime suspend to D3cold, the HDA controller's BARs will be
uninitialized when a driver later probes it.
Fix by saving and restoring config space over a runtime suspend cycle
even if the device is not bound.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Peter Wu <peter@lekensteyn.nl> # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de> # MacBook Pro
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[lukas: add commit message, bikeshed code comments for clarity]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/92fb6e6ae2730915eb733c08e2f76c6a313e3860.1520068884.git.lukas@wunner.de
This patch adds support for DMT display modes over HDMI.
The modes timings configurations are from the Amlogic Vendor linux tree
and tested over multiples monitors.
Previously only a selected number of CEA modes were supported.
Only these following modes are supported with these changes:
- 640x480@60Hz
- 800x600@60Hz
- 1024x768@60Hz
- 1152x864@75Hz
- 1280x1024@60Hz
- 1600x1200@60Hz
- 1920x1080@60Hz
The associated code to handle the clock rates is also added.
Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520935670-14187-1-git-send-email-narmstrong@baylibre.com