linux

Commit Graph

Author	SHA1	Message	Date
Christian König	5918045c4e	drm/scheduler: rework job destruction We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guily job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com	2019-05-02 15:45:48 -05:00
Eric Anholt	dffa9b7a78	drm/v3d: Add missing implicit synchronization. It is the expectation of existing userspace (X11 + Mesa, in particular) that jobs submitted to the kernel against a shared BO will get implicitly synchronized by their submission order. If we want to allow clever userspace to disable implicit synchronization, we should do that under its own submit flag (as amdgpu and lima do). Note that we currently only implicitly sync for the rendering pass, not binning -- if you texture-from-pixmap in the binning vertex shader (vertex coordinate generation), you'll miss out on synchronization. Fixes flickering when multiple clients are running in parallel, particularly GL apps and compositors. v2: Fix a missing refcount on the CSD done fence for L2 cleaning. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:16 -07:00
Eric Anholt	d223f98f02	drm/v3d: Add support for compute shader dispatch. The compute shader dispatch interface is pretty simple -- just pass in the regs that userspace has passed us, with no CLs to run. However, with no CL to run it means that we need to do manual cache flushing of the L2 after the HW execution completes (for SSBO, atomic, and image_load_store writes that are the output of compute shaders). This doesn't yet expose the L2 cache's ability to have a region of the address space not write back to memory (which could be used for shared_var storage). So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing the ES31 tests), and on the kernel side on 7278 (failing atomic compswap tests in a way that doesn't reproduce on simpenrose). v2: Fix excessive allocation for the clean_job (reported by Dan Carpenter). Keep refs on jobs until clean_job is finished, to avoid spurious MMU errors if the output BOs are freed by userspace before L2 cleaning is finished. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:10 -07:00
Eric Anholt	a783a09ee7	drm/v3d: Refactor job management. The CL submission had two jobs embedded in an exec struct. When I added TFU support, I had to replicate some of the exec stuff and some of the job stuff. As I went to add CSD, it became clear that actually what was in exec should just be in the two CL jobs, and it would let us share a lot more code between the 4 queues. v2: Fix missing error path in TFU ioctl's bo[] allocation. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>	2019-04-18 09:54:07 -07:00
Eric Anholt	3f0b646e1a	drm/v3d: Rename the fence signaled from IRQs to "irq_fence". We have another thing called the "done fence" that tracks when the scheduler considers the job done, and having the shared name was confusing. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net Reviewed-by: Dave Emett <david.emett@broadcom.com>	2019-04-01 10:44:34 -07:00
Andrey Grodzovsky	e8074f75f4	drm/v3d: Fix calling drm_sched_resubmit_jobs for same sched. Also stop calling drm_sched_increase_karma multiple times. v2: Fix whitespace in the code we're moving (by anholt) Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1552409822-17230-1-git-send-email-andrey.grodzovsky@amd.com Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: `222b5f0441` ("drm/sched: Refactor ring mirror list handling.")	2019-03-14 09:22:58 -07:00
Andrey Grodzovsky	222b5f0441	drm/sched: Refactor ring mirror list handling. Decauple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stoppping the sched thread and detaching sched fences from non signaled HW fenes wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-25 16:15:36 -05:00
Eric Anholt	1584f16ca9	drm/v3d: Add support for submitting jobs to the TFU. The TFU can copy from raster, UIF, and SAND input images to UIF output images, with optional mipmap generation. This will certainly be useful for media EGL image input, but is also useful immediately for mipmap generation without bogging the V3D core down. For now we only run the queue 1 job deep, and don't have any hang recovery (though I don't think we should need it, with TFU). Queuing multiple jobs in the HW will require synchronizing the YUV coefficient regs updates since they don't get FIFOed with the job. v2: Change the ioctl to IOW instead of IOWR, always set COEF0, explain why TFU is AUTH, clarify the syncing docs, drop the unused TFU interrupt regs (you're expected to use the hub's), don't take &bo->base for NULL bos. v3: Fix a little whitespace alignment (noticed by checkpatch), rebase on drm_sched_job_cleanup() changes. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dave Emett <david.emett@broadcom.com> (v2) Link: https://patchwork.freedesktop.org/patch/264607/	2018-11-30 13:03:32 -08:00
Dave Airlie	61647c77cb	drm-misc-next for v4.21: Core Changes: - Merge drm_info.c into drm_debugfs.c - Complete the fake drm_crtc_commit's hw_done/flip_done sooner. - Remove deprecated drm_obj_ref/unref functions. All drivers use get/put now. - Decrease stack use of drm_gem_prime_mmap. - Improve documentation for dumb callbacks. Driver Changes: - Add edid support to virtio. - Wait on implicit fence in meson and sun4i. - Add support for BGRX8888 to sun4i. - Preparation patches for sun4i driver to start supporting linear and tiled YUV formats. - Add support for HDMI 1.4 4k modes to meson, and support for VIC alternate timings. - Drop custom dumb_map in vkms. - Small fixes and cleanups to v3d. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAlv+YOEACgkQ/lWMcqZw E8OJvw//fc+j9sJLScvrahLDNZVMh4pTvQCOySmxIPVAhjLZIkRjdvR9Ou51tbL+ qSm3tDexHEPniR8xmuTYjPZtJP6o4e4NLqUzWYdZb0U+oK3QMuJKDD3uK+6BwM8P CyAa4VxV9F17oN+d0aFpoPTHheRVt3egyvREqLHoiAJYtp01cm+f/FFKZSe+o3p/ QLi0tJ5unXg6AZFoomYbZirE/jp6t8m+cjRkYOafE57+2qpMEJ8RA5G0D1UxCgP2 imGW6n4N7rmB1bNbtTvFEDGIffE+W9AkVQkJ2YXUfQldtmUKgLA9OG47DIdDb1Xa P7RWVjHJejhvu9URcFmQcrjoCtKURPcPTuLZEQHvae1sUxwwMUvtpKUM7TBGYo2I G/nQLkMLmK9yfJRyo2OHRHTClduU3X7FXzbJhbL3cUMx0beWjCQmRDjM9ywfSJR3 lrJIlnQ3voCp0IZWj86RG0idpd3RIjE8Aaqz/m4bSmqMCqmlepnZzIpZcFB7gXbM k0xiK4LUFO1VbFsMoRaqrP1zXduY+nbLhfiDiIDs34v0ZVqNooJpLYilRI/lvmXt vzApnxgwRePW0vz67Lagqq+ZUXJXptirmGw7bnvfT90cOKlRLi5CDZTRwCOuUNPL 9kUgXj8EoX9+7p9M14TrEx9tV0MZIwbP8nlS9Ty0Kx4s240mbYg= =JNy2 -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2018-11-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v4.21: Core Changes: - Merge drm_info.c into drm_debugfs.c - Complete the fake drm_crtc_commit's hw_done/flip_done sooner. - Remove deprecated drm_obj_ref/unref functions. All drivers use get/put now. - Decrease stack use of drm_gem_prime_mmap. - Improve documentation for dumb callbacks. Driver Changes: - Add edid support to virtio. - Wait on implicit fence in meson and sun4i. - Add support for BGRX8888 to sun4i. - Preparation patches for sun4i driver to start supporting linear and tiled YUV formats. - Add support for HDMI 1.4 4k modes to meson, and support for VIC alternate timings. - Drop custom dumb_map in vkms. - Small fixes and cleanups to v3d. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/151a3270-b1be-ed75-bd58-6b29d741f592@linux.intel.com	2018-11-29 10:28:49 +10:00
Eric Anholt	e90e45f6bd	drm/v3d: Update a comment about what uses v3d_job_dependency(). I merged bin and render's paths in a late refactoring. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20181108161654.19888-3-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>	2018-11-27 12:17:22 -08:00
Sharat Masetty	26efecf955	drm/scheduler: Add drm_sched_job_cleanup This patch adds a new API to clean up the scheduler job resources. This is primarliy needed in cases the job was created but was not queued to the scheduler queue. Additionally with this change, the layer which creates the scheduler job also gets to free up the job's resources and this entails moving the dma_fence_put(finished_fence) to the drivers ops free handler routines. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:27 -05:00
Christian König	19067e522d	drm/sched: make sure timer is restarted Make sure we always restart the timer after a timeout and remove the device specific workarounds. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:01 -05:00
Nayan Deshmukh	6a96243056	drm/scheduler: remove timeout work_struct from drm_sched_job (v3) having a delayed work item per job is redundant as we only need one per scheduler to track the time out the currently executing job. v2: the first element of the ring mirror list is the currently executing job so we don't need a additional variable for it v3: squash in fixes for v3d and etnaviv Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-27 09:55:45 -05:00
Eric Anholt	a65020d0a6	drm/v3d: Fix a grammar nit in the scheduler docs. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-4-eric@anholt.net Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel@ffwll.ch>	2018-07-05 11:42:50 -07:00
Eric Anholt	624bb0c08b	drm/v3d: Delay the scheduler timeout if we're still making progress. GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary submits jobs that take 4 seconds at maximum resolution, but we still want to reset quickly if a job is really hung. Sample the CL's current address and the return address (since we call into tile lists repeatedly) and if either has changed then assume we've made progress. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-1-eric@anholt.net Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel@ffwll.ch>	2018-07-05 11:42:49 -07:00
Dan Carpenter	17e23993f2	drm/v3d: Checking for NULL vs IS_ERR() The v3d_fence_create() only returns error pointers on error. It never returns NULL. Fixes: `57692c94dc` ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180518081041.GC28335@mwanda	2018-05-21 10:44:33 -07:00
Eric Anholt	57692c94dc	drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+ This driver will be used to support Mesa on the Broadcom 7268 and 7278 platforms. V3D 3.3 introduces an MMU, which means we no longer need CMA or vc4's complicated CL/shader validation scheme. This massively changes the GEM behavior, so I've forked off to a new driver. v2: Mark SUBMIT_CL as needing DRM_AUTH. coccinelle fixes from kbuild test robot. Drop personal git link from MAINTAINERS. Don't double-map dma-buf imported BOs. Add kerneldoc about needing MMU eviction. Drop prime vmap/unmap stubs. Delay mmap offset setup to mmap time. Use drm_dev_init instead of _alloc. Use ktime_get() for wait_bo timeouts. Drop drm_can_sleep() usage, since we don't modeset. Switch page tables back to WC (debug change to coherent had slipped in). Switch drm_gem_object_unreference_unlocked() to drm_gem_object_put_unlocked(). Simplify overflow mem handling by not sharing overflow mem between jobs. v3: no changes v4: align submit_cl to 64 bits (review by airlied), check zero flags in other ioctls. Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v4) Acked-by: Dave Airlie <airlied@linux.ie> (v3, requested submit_cl change) Link: https://patchwork.freedesktop.org/patch/msgid/20180430181058.30181-3-eric@anholt.net	2018-05-03 16:26:30 -07:00

17 Commits