linux

Commit Graph

Author	SHA1	Message	Date
Rex Zhu	5d24af846e	drm/amd/pp: Implement get/set_power_profile_mode on smu7 It show what parameters can be configured to tune the behavior of natural dpm for perf/watt on smu7. user can select the mode per workload, but even the default per workload settings are not bulletproof. user can configure custom settings per different use case for better perf or better perf/watt. cat pp_power_profile_mode NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL 0 3D_FULL_SCREEN: 0 100 30 0 100 10 1 POWER_SAVING: 10 0 30 - - - 2 VIDEO: - - - 10 16 31 3 VR: 0 11 50 0 100 10 4 COMPUTE: 0 5 30 - - - 5 CUSTOM: 0 0 0 0 0 0 * CURRENT: 0 100 30 0 100 10 Under manual dpm level, user can echo "0/1/2/3/4">pp_power_profile_mode to select 3D_FULL_SCREEN/POWER_SAVING/VIDEO/VR/COMPUTE mode. echo "5 * * * * * * * ">pp_power_profile_mode to set custom settings. "5 * * * * * * *" mean "CUSTOM enable_sclk SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL enable_mclk MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL" if the parameter enable_sclk/enable_mclk is true, driver will update the following parameters to dpm table. if false, ignore the following parameters. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-06 13:12:13 -05:00
Rex Zhu	6dcd30aa11	drm/amd/pp: Implement update_dpm_settings on CI use SW method to update DPM settings by updating SRAM directly on CI. Reviewed-by: Alex Deucher <alexdeucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-06 13:12:06 -05:00
Rex Zhu	fdd62a40d4	drm/amd/pp: Implement update_dpm_settings on Tonga use SW method to update DPM settings by updating SRAM directly on Tonga. Reviewed-by: Alex Deucher <alexdeucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-06 13:11:58 -05:00
Rex Zhu	59a6642454	drm/amd/pp: Implement update_dpm_settings on Fiji use SW method to update DPM settings by updating SRAM directly on Fiji. Reviewed-by: Alex Deucher <alexdeucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-06 13:11:34 -05:00
Chris Wilson	f41d19becc	drm/i915: Flush waiters on seqno wraparound Previously, we would spin waiting for all waiters to wake up and notice their request had completed before we would reset the seqno upon wraparound. However, we can mark their waits as complete and wake them up directly using the existing machinery for handling the flushing of missed wakeups when idling. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-2-chris@chris-wilson.co.uk	2018-03-06 17:25:56 +00:00
Chris Wilson	93eef7d653	drm/i915: Stop kicking the signaling thread on seqno wraparound Since commit `fd10e2ce99` ("drm/i915/breadcrumbs: Ignore unsubmitted signalers"), we cancel the signaler when retiring the request and so upon wraparound, where we wait for all requests to be retired, we no longer need to spin waiting for the signaling thread to release its references to the in-flight requests, and so we can assert that the signaler is idle. References: `fd10e2ce99` ("drm/i915/breadcrumbs: Ignore unsubmitted signalers") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-1-chris@chris-wilson.co.uk	2018-03-06 17:25:55 +00:00
Christian König	186ca446ae	drm/prime: make the pages array optional for drm_prime_sg_to_page_addr_arrays Most of the time we only need the dma addresses. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180227115000.4105-2-christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20180227115000.4105-3-christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20180227115000.4105-4-christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20180227115000.4105-5-christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/BN6PR12MB18262C0DE9B5F07B9A42EAE7F2C60@BN6PR12MB1826.namprd12.prod.outlook.com	2018-03-06 12:24:52 -05:00
Christian König	681066ec1d	drm/prime: fix potential race in drm_gem_map_detach Unpin the GEM object only after freeing the sg table. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Roger He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180227115000.4105-1-christian.koenig@amd.com	2018-03-06 12:22:42 -05:00
Ville Syrjälä	5ebbb5b4d4	drm: Check property/enum name length Reject requests to add properties/enums with an overly long name. Previously we would have just silently truncated the string and exposed it userspace. v2: drm_property_create() returns a pointer Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302140300.31110-1-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2018-03-06 18:03:02 +02:00
Ville Syrjälä	8e5e83ad82	drm: Don't create properties without names Creating a property that doesn't have a name makes no sense to me. Don't allow it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302132544.12491-1-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2018-03-06 18:01:59 +02:00
Chris Wilson	9792e213a4	drm/i915/breadcrumbs: Assert all missed breadcrumbs were signaled When parking the engines and their breadcrumbs, if we have waiters left then they missed their wakeup. Verify that each waiter's seqno did complete. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-2-chris@chris-wilson.co.uk	2018-03-06 12:12:46 +00:00
Chris Wilson	cd46c545b7	drm/i915/breadcrumbs: Reduce signaler rbtree to a sorted list The goal here is to try and reduce the latency of signaling additional requests following the wakeup from interrupt by reducing the list of to-be-signaled requests from an rbtree to a sorted linked list. The original choice of using an rbtree was to facilitate random insertions of request into the signaler while maintaining a sorted list. However, if we assume that most new requests are added when they are submitted, we see those new requests in execution order making a insertion sort fast, and the reduction in overhead of each signaler iteration significant. Since commit `56299fb7d9` ("drm/i915: Signal first fence from irq handler if complete"), we signal most fences directly from notify_ring() in the interrupt handler greatly reducing the amount of work that actually needs to be done by the signaler kthread. All the thread is then required to do is operate as the bottom-half, cleaning up after the interrupt handler and preparing the next waiter. This includes signaling all later completed fences in a saturated system, but on a mostly idle system we only have to rebuild the wait rbtree in time for the next interrupt. With this de-emphasis of the signaler's role, we want to rejig it's datastructures to reduce the amount of work we require to both setup the signal tree and maintain it on every interrupt. References: `56299fb7d9` ("drm/i915: Signal first fence from irq handler if complete") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk	2018-03-06 12:12:45 +00:00
Neil Armstrong	9c305eb442	drm: bridge: dw-hdmi: Fix overflow workaround for Amlogic Meson GX SoCs The Amlogic Meson GX SoCs, embedded the v2.01a controller, has been also identified needing this workaround. This patch adds the corresponding version to enable a single iteration for this specific version. Fixes: `be41fc55f1` ("drm: bridge: dw-hdmi: Handle overflow workaround based on device version") Acked-by: Archit Taneja <architt@codeaurora.org> [narmstrong: s/identifies/identified and rebased against Jernej's change] Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/1519386277-25902-1-git-send-email-narmstrong@baylibre.com	2018-03-06 11:52:15 +01:00
Daniele Ceraolo Spurio	7cc62d0b8e	drm/i915/error: capture uc_state after gen_state error->device_info.has_guc, which we check in capture_uc_state, is set in capture_gen_state, so the latter needs to be performed first. v2: rebased Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Fixes: `7d41ef3479` (drm/i915: Add Guc/HuC firmware details to error state) Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-3-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-06 09:34:46 +00:00
Daniele Ceraolo Spurio	53b725c7db	drm/i915/error: standardize function style in error capture some of the static functions used from capture() have the "i915_" prefix while other don't; most of them take i915 as a parameter, but one of them derives it internally from error->i915. Let's be consistent by avoiding prefix for static functions and by getting i915 from error->i915. While at it, s/dev_priv/i915 in functions that don't perform register reads. v2: take i915 from error->i915 (Michal), s/dev_priv/i915, update commit message Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-2-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-06 09:34:40 +00:00
Daniele Ceraolo Spurio	618d87d783	drm/i915/error: remove unused gen8_engine_sync_index Leftover from Gen8 ringbuffer support removal Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-1-daniele.ceraolospurio@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-06 09:34:35 +00:00
Baruch Siach	2efa83920c	drm: of: simplify component probe code Use positive logic for better readability. This also eliminates one of_node_put() call, making the code shorter. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Tested-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Archit Taneja <archit@cradlewise.com> Link: https://patchwork.freedesktop.org/patch/msgid/61c56895d44117d80e7d82f04e729e29c60fadbd.1519327370.git.baruch@tkos.co.il	2018-03-06 14:05:00 +05:30
Joe Moriarty	3bd07ccd27	drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem The Parfait (version 2.1.0) static code analysis tool found the following NULL pointer dereference problem. - drivers/gpu/drm/drm_drv.c Any calls to drm_minor_get_slot() could result in the return of a NULL pointer when an invalid DRM device type is encountered. The return of NULL was removed with BUG() from drm_minor_get_slot(). Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180220191157.100960-3-joe.moriarty@oracle.com	2018-03-06 08:14:16 +01:00
Joe Moriarty	4ffb8deeed	drm: NULL pointer dereference [null-pointer-deref] (CWE 476) problem The Parfait (version 2.1.0) static code analysis tool found the following NULL pointer derefernce problem. - drivers/gpu/drm/drm_vblank.c Null pointer checks were added to return values from calls to drm_crtc_from_index(). There is a possibility, however minute, that crtc->index may not be found when trying to find the struct crtc from it's assigned index given in drm_crtc_init_with_planes(). 3 return checks for NULL where added with a call to WARN_ON(!crtc). Signed-off-by: Joe Moriarty <joe.moriarty@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180220191157.100960-2-joe.moriarty@oracle.com	2018-03-06 08:14:16 +01:00
Xiong Zhang	991ecefbdd	drm/i915/gvt: Return error at the failure of finding page_track In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both of them are memory write, ioreq handler couldn't distinguish them. So ioreq handler probe the ppgtt write handler, if it is succuess, this ioreq is ppgtt write, otherwise it is mmio write. So ppgtt write handler should return an error at the failure of finding page track, it is fatal to implement ioreq handler in XenGT. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 14:49:38 +08:00
Xiong Zhang	7e60946feb	drm/i915/gvt: Release gvt->lock at the failure of finding page track page_track_handler take lock at the beginning, the lock should be released at the failure of finding page track. Otherwise deadlock will happen. Fixes: `e502a2af4c` ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page") Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 14:49:24 +08:00
Changbin Du	6846dfeb87	drm/i915/kvmgt: Add kvmgt debugfs entry nr_cache_entries under vgpu Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows the number of entry in dma cache. $ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries 10101 v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>) v2: keep debugfs layout flat. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:28 +08:00
Changbin Du	cf4ee73fd9	drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead The implementation of current kvmgt implicitly setup dma mapping at MPT API gfn_to_mfn. First this design against the API's original purpose. Second, there is no unmap hit in this design. The result is that the dma mapping keep growing larger and larger. For mutl-vm case, they will consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree entries crated in the IOMMU IOVA allocator. Finally, single IOVA allocation can take as long as ~70ms. Such latency is intolerable. To address both above issues, this patch introduced two new MPT API: o dma_map_guest_page - setup dma map for guest page o dma_unmap_guest_page - cancel dma map for guest page The kvmgt implements these 2 API. And to reduce dma setup overhead for duplicated pages (eg. scratch pages), two caches are used: one is for mapping gfn to struct gvt_dma, another is for mapping dma addr to struct gvt_dma. With these 2 new API, the gtt now is able to cancel dma mapping when page table is invalidated. The dma mapping is not in a gradual increase now. v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point. Cc: Hang Yuan <hang.yuan@intel.com> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:27 +08:00
Zhenyu Wang	b52646fd5b	drm/i915/gvt: Fix check error on hws_pga_write() fail message Fix below check error by using proper failure message output. drivers/gpu/drm/i915//gvt/handlers.c:1392 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR() drivers/gpu/drm/i915//gvt/handlers.c:1402 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:26 +08:00
Zhenyu Wang	253fe56ea9	drm/i915/gvt: Fix one indent error Fix below warning: drivers/gpu/drm/i915//gvt/handlers.c:323 gdrst_mmio_write() warn: inconsistent indenting Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:26 +08:00
Zhenyu Wang	c39bca4e04	drm/i915/gvt: Fix check error on fence mmio handler Fix below error with minor code refactor. CHECK drivers/gpu/drm/i915//gvt/handlers.c drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:25 +08:00
Zhenyu Wang	64c066a911	drm/i915/gvt: Fix check error of vgpu create failure message Fix check error at CHECK drivers/gpu/drm/i915//gvt/kvmgt.c drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454) For failed vgpu create, just show error return in failure message. Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:25 +08:00
Zhenyu Wang	9803984581	drm/i915/gvt: Fix vGPU sched timeslice calculation warning Fix below warning by using proper ktime helper to calculate timeslice. CHECK drivers/gpu/drm/i915//gvt/sched_policy.c drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1 drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1 Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:24 +08:00
Zhenyu Wang	0102d0d922	drm/i915/gvt: remove gvt max port definition Remove GVT-g private max port definition but use i915 one. Fix error caused by: drivers/gpu/drm/i915//gvt/handlers.c:871 dp_aux_ch_ctl_mmio_write() error: buffer overflow 'display->ports' 5 <= 5 Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:24 +08:00
Zhenyu Wang	7e534ac985	drm/i915/gvt: Fix one gvt_vgpu_error() use in dmabuf.c Fix below warning with proper usage. CHECK drivers/gpu/drm/i915//gvt/dmabuf.c drivers/gpu/drm/i915//gvt/dmabuf.c:462 intel_vgpu_get_dmabuf() error: 'vgpu' dereferencing possible ERR_PTR() Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:23 +08:00
Weinan Li	cd7e61b93d	drm/i915/gvt: init mmio by lri command in vgpu inhibit context There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g, vgpu can't get the correct default context by updating the registers before inhibit context submission. It always get back the hardware default value unless the inhibit context submission happened before the 1st time forcewake put. With this wrong default context, vgpu will run with incorrect state and meet unknown issues. The solution is initialize these mmios by adding lri command in ring buffer of the inhibit context, then gpu hardware has no chance to go down RC6 when lri commands are right being executed, and then vgpu can get correct default context for further use. v3: - fix code fault, use 'for' to loop through mmio render list(Zhenyu) v4: - save the count of engine mmio need to be restored for inhibit context and refine some comments. (Kevin) v5: - code rebase Cc: Kevin Tian <kevin.tian@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:23 +08:00
Weinan Li	64f46f55bb	drm/i915/gvt: add interface to check if context is inhibit No functional change, just for easy to use. v4: - refine comment (Kevin) Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:22 +08:00
Weinan Li	f9a651c05d	drm/i915/gvt: add define GEN9_MOCS_SIZE No functional change. This defination will also be used in future patchesi. v4: - refine patch description (Kevin) Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:22 +08:00
Changbin Du	420fba78d9	drm/i915/gvt: Define PTE addr mask with GENMASK_ULL Define the masks better. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:21 +08:00
Changbin Du	b6c126a393	drm/i915/gvt: Manage shadow pages with radix tree We don't know how many page tables will be shadowed. It varies considerably corresponding to guest load. Radix tree is a better choice for us. Since Page Frame Number is used as key so most of the bits are common. Here is some performance data (duration in us) of looking up a element: Before: (aka. ppgtt_find_shadow_page) 0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325 After: (aka. intel_vgpu_find_spt_by_mfn) 0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108 This time I didn't get the early data of hash table. The data is measured when desktop is shown. As last change, the overall benchmark almost is not changed, but we get better scalability. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:21 +08:00
Changbin Du	e502a2af4c	drm/i915/gvt: Provide generic page_track infrastructure for write-protected page This patch provide generic page_track infrastructure for write-protected guest page. The old page_track logic gets rewrote and now stays in a new standalone page_track.c. This page track infrastructure can be both used by vGUC and GTT shadowing. The important change is that it uses radix tree instead of hash table. We don't have a predictable number of pages that will be tracked. Here is some performance data (duration in us) of looking up a element: Before: (aka. intel_vgpu_find_tracked_page) 0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291 After: (aka. intel_vgpu_find_page_track) 0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105 The hash table has good performance at beginning, but turns bad with more pages being tracked even no 3D applications are running. As expected, radix tree has stable duration and very quick. The overall benchmark (tested with Heaven Benchmark) marginally improved since this is not the bottleneck. What we benefit more from this change is scalability. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:20 +08:00
Changbin Du	0947572849	drm/i915/gvt: Don't extend page_track to mpt layer Don't extend page_track to mpt layer. Keep MPT simple and clean. Meanwhile remove gtt.n_tracked_guest_page which doesn't make much sense. v2: clean up gtt.n_tracked_guest_page. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:20 +08:00
Changbin Du	f66e5ff706	drm/i915/gvt: Rename mpt api {set, unset}_wp_page to {enable, disable}_page_track The kvmgt's implementation of mpt api {set,unset}_wp_page is not real write-protection - the data get written before invoke this two api. As discussed, change the mpt api to match the real behavior. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:19 +08:00
Changbin Du	d87f5ff35f	drm/i915/gvt: Rename shadow_page to short name spt The target structure of some functions is struct intel_vgpu_ppgtt_spt and their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's use short name 'spt' instead to reduce the length. As well as the hash table name. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:18 +08:00
Changbin Du	44b4673380	drm/i915/gvt: Rework shadow page management code This is a another big one and the GVT shadow page management code is heavily refined. The new code only use struct intel_vgpu_ppgtt_spt to represent a vgpu shadow page table - w/ or wo/ a guest page associated with. A pure shadow page (no guest page associated) will be used to shadow splited 2M huge gtt. In this case, the spt.guest_page.gfn should be a zero. To search a existed shadow page table, we have two new interfaces: - intel_vgpu_find_spt_by_gfn(), find a spt by guest gfn. It must not be a pure spt. - intel_vgpu_find_spt_by_mfn, Find the spt using shadow page mfn in shadowed PTE. The oos_page management is remained as what is was. v2: Split some changes into small standalone patches. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:18 +08:00
Changbin Du	72f03d7ea1	drm/i915/gvt: Refine pte shadowing process Make the shadow PTE population code clear. Later we will add huge gtt support based on this. v2: - rebase to latest code. Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:17 +08:00
Changbin Du	d861ca237d	drm/i915/gvt: Use standard pte bit definition GTT entry has similar format with the CPU PTE. We'd prefer named macro instead of hardcode. Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:17 +08:00
Changbin Du	e6e9c46fd2	drm/i915/gvt: Factor out intel_vgpu_{get, put}_ppgtt_mm interface Factor out these two interfaces so we can kill some duplicated code in scheduler.c. v2: - rename to intel_vgpu_{get,put}_ppgtt_mm - refine handle_g2v_notification Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:16 +08:00
Changbin Du	a143cef7db	drm/i915/gvt: Rename ggtt related functions to be more specific Accurate names help to avoid confusing so improve readability. Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:16 +08:00
Changbin Du	bc37ab5679	drm/i915/gvt: Add verbose gtt shadow logs This add a new macro gvt_vdbg_mm() to print more verbose logs for gtt shadowing. The added verbose logs are very useful for debugging. gvt_vdbg_mm() only comes into effect if VERBOSE_DEBUG is defined by the developer. Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:15 +08:00
Changbin Du	b0c766bf29	drm/i915/gvt: Refine ggtt_set_shadow_entry Less code and use existed helper ggtt_set_host_entry. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:15 +08:00
Changbin Du	3aff351280	drm/i915/gvt: Refine ggtt and ppgtt root entry ops Separate ggtt and ppgtt since they are different. A little more code but straightforward. And move these helpers to gtt.c since that is the only client. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:14 +08:00
Changbin Du	1bc258519d	drm/i915/gvt: Refine the intel_vgpu_mm reference management If we manage an object with a reference count, then its life cycle must flow the reference count operations. Meanwhile, change the operation functions to generic name put and get. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:14 +08:00
Changbin Du	ede9d0cfcb	drm/i915/gvt: Rework shadow graphic memory management code This is a big one and the GVT shadow graphic memory management code is heavily refined. The new code is more straightforward with less code. The struct intel_vgpu_mm is restructured to be clearly defined, use accurate names and some of the original fields are removed which are really redundant. Now we only manage ppgtt mm object with mm->ppgtt_mm.lru_list. No need to mix ppgtt and ggtt together, since one vGPU only has one ggtt object. v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm. v3: Add GVT_RING_CTX_NR_PDPS to avoid confusing about the PDPs. v2: Split some changes into small standalone patches. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-06 13:19:13 +08:00
Alex Deucher	a97fc4e452	drm/amdgpu: fix KV harvesting Always set the graphics values to the max for the asic type. E.g., some 1 RB chips are actually 1 RB chips, others are actually harvested 2 RB chips. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=99353 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2018-03-05 15:39:52 -05:00

... 5 6 7 8 9 ...

738641 Commits All Branches Search

738641 Commits

All Branches