linux

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	7fa1d27b63	Merge tag 'drm-intel-next-fixes-2016-05-25' of git://anongit.freedesktop.org/drm-intel into drm-next I see the main drm pull got merged, here's the first batch of fixes for 4.7 already. Fixes all around, a large portion cc: stable stuff. [airlied: the DP++ stuff is a regression fix]. * tag 'drm-intel-next-fixes-2016-05-25' of git://anongit.freedesktop.org/drm-intel: drm/i915: Stop automatically retiring requests after a GPU hang drm/i915: Unify intel_ring_begin() drm/i915: Ignore stale wm register values on resume on ilk-bdw (v2) drm/i915/psr: Try to program link training times correctly drm/i915/bxt: Adjusting the error in horizontal timings retrieval drm/i915: Don't leave old junk in ilk active watermarks on readout drm/i915: s/DPPL/DPLL/ for SKL DPLLs drm/i915: Fix gen8 semaphores id for legacy mode drm/i915: Set crtc_state->lane_count for HDMI drm/i915/BXT: Retrieving the horizontal timing for DSI drm/i915: Protect gen7 irq_seqno_barrier with uncore lock drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms drm/i915: Determine DP++ type 1 DVI adaptor presence based on VBT drm/i915: Enable/disable TMDS output buffers in DP++ adaptor as needed drm/i915: Respect DP++ adaptor TMDS clock limit drm: Add helper for DP++ adaptors	2016-05-27 16:08:38 +10:00
Linus Torvalds	84787c572d	Merge branch 'akpm' (patches from Andrew) Merge yet more updates from Andrew Morton: - Oleg's "wait/ptrace: assume __WALL if the child is traced". It's a kernel-based workaround for existing userspace issues. - A few hotfixes - befs cleanups - nilfs2 updates - sys_wait() changes - kexec updates - kdump - scripts/gdb updates - the last of the MM queue - a few other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (84 commits) kgdb: depends on VT drm/amdgpu: make amdgpu_mn_get wait for mmap_sem killable drm/radeon: make radeon_mn_get wait for mmap_sem killable drm/i915: make i915_gem_mmap_ioctl wait for mmap_sem killable uprobes: wait for mmap_sem for write killable prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable exec: make exec path waiting for mmap_sem killable aio: make aio_setup_ring killable coredump: make coredump_wait wait for mmap_sem for write killable vdso: make arch_setup_additional_pages wait for mmap_sem for write killable ipc, shm: make shmem attach/detach wait for mmap_sem killable mm, fork: make dup_mmap wait for mmap_sem for write killable mm, proc: make clear_refs killable mm: make vm_brk killable mm, elf: handle vm_brk error mm, aout: handle vm_brk failures mm: make vm_munmap killable mm: make vm_mmap killable mm: make mmap_sem for write waits killable for mm syscalls MAINTAINERS: add co-maintainer for scripts/gdb ...	2016-05-23 19:42:28 -07:00
Michal Hocko	80a89a5e85	drm/i915: make i915_gem_mmap_ioctl wait for mmap_sem killable i915_gem_mmap_ioctl relies on mmap_sem for write. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: David Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-23 17:04:14 -07:00
Linus Torvalds	1d6da87a32	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm updates from Dave Airlie: "Here's the main drm pull request for 4.7, it's been a busy one, and I've been a bit more distracted in real life this merge window. Lots more ARM drivers, not sure if it'll ever end. I think I've at least one more coming the next merge window. But changes are all over the place, support for AMD Polaris GPUs is in here, some missing GM108 support for nouveau (found in some Lenovos), a bunch of MST and skylake fixes. I've also noticed a few fixes from Arnd in my inbox, that I'll try and get in asap, but I didn't think they should hold this up. New drivers: - Hisilicon kirin display driver - Mediatek MT8173 display driver - ARC PGU - bitstreamer on Synopsys ARC SDP boards - Allwinner A13 initial RGB output driver - Analogix driver for DisplayPort IP found in exynos and rockchip DRM Core: - UAPI headers fixes and C++ safety - DRM connector reference counting - DisplayID mode parsing for Dell 5K monitors - Removal of struct_mutex from drivers - Connector registration cleanups - MST robustness fixes - MAINTAINERS updates - Lockless GEM object freeing - Generic fbdev deferred IO support panel: - Support for a bunch of new panels i915: - VBT refactoring - PLL computation cleanups - DSI support for BXT - Color manager support - More atomic patches - GEM improvements - GuC fw loading fixes - DP detection fixes - SKL GPU hang fixes - Lots of BXT fixes radeon/amdgpu: - Initial Polaris support - GPUVM/Scheduler/Clock/Power improvements - ASYNC pageflip support - New mesa feature support nouveau: - GM108 support - Power sensor support improvements - GR init + ucode fixes. - Use GPU provided topology information vmwgfx: - Add host messaging support gma500: - Some cleanups and fixes atmel: - Bridge support - Async atomic commit support fsl-dcu: - Timing controller for LCD support - Pixel clock polarity support rcar-du: - Misc fixes exynos: - Pipeline clock support - Exynoss4533 SoC support - HW trigger mode support - export HDMI_PHY clock - DECON5433 fixes - Use generic prime functions - use DMA mapping APIs rockchip: - Lots of little fixes vc4: - Render node support - Gamma ramp support - DPI output support msm: - Mostly cleanups and fixes - Conversion to generic struct fence etnaviv: - Fix for prime buffer handling - Allow hangcheck to be coalesced with other wakeups tegra: - Gamme table size fix" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1050 commits) drm/edid: add displayid detailed 1 timings to the modelist. (v1.1) drm/edid: move displayid validation to it's own function. drm/displayid: Iterate over all DisplayID blocks drm/edid: move displayid tiled block parsing into separate function. drm: Nuke ->vblank_disable_allowed drm/vmwgfx: Report vmwgfx version to vmware.log drm/vmwgfx: Add VMWare host messaging capability drm/vmwgfx: Kill some lockdep warnings drm/nouveau/gr/gf100-: fix race condition in fecs/gpccs ucode drm/nouveau/core: recognise GM108 chipsets drm/nouveau/gr/gm107-: fix touching non-existent ppcs in attrib cb setup drm/nouveau/gr/gk104-: share implementation of ppc exception init drm/nouveau/gr/gk104-: move rop_active_fbps init to nonctx drm/nouveau/bios/pll: check BIT table version before trying to parse it drm/nouveau/bios/pll: prevent oops when limits table can't be parsed drm/nouveau/volt/gk104: round up in gk104_volt_set drm/nouveau/fb/gm200: setup mmu debug buffer registers at init() drm/nouveau/fb/gk20a,gm20b: setup mmu debug buffer registers at init() drm/nouveau/fb/gf100-: allocate mmu debug buffers drm/nouveau/fb: allow chipset-specific actions for oneinit() ...	2016-05-23 11:48:48 -07:00
Chris Wilson	157d2c7fad	drm/i915: Stop automatically retiring requests after a GPU hang Following a GPU hang, we break out of the request loop in order to unlock the struct_mutex for use by the GPU reset. However, if we retire all the requests at that moment, we cannot identify the guilty request after performing the reset. v2: Not automatically retiring requests forces us to recheck for available ringspace. Fixes: `f4457ae71f` ("drm/i915: Prevent leaking of -EIO from i915_wait_request()") Testcase: igt/gem_reset_stats/ban-* Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Tested-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463137042-9669-4-git-send-email-chris@chris-wilson.co.uk (cherry picked from commit `e075a32f51`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-05-23 16:21:32 +03:00
Ville Syrjälä	5fbd0418ee	drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms Move the intel_enable_gtt() call to happen before we touch the GTT during resume. Right now it's done way too late. Before commit `ebb7c78d35` ("agp/intel-gtt: Only register fake agp driver for gen1") it was actually done earlier on account of also getting called from the resume hook of the fake agp driver. With the fake agp driver no longer getting registered we must move the call up. The symptoms I've seen on my 830 machine include lowmem corruption, other kinds of memory corruption, and straight up hung machine during or just after resume. Not really sure what causes the memory corruption, but so far I've not seen any with this fix. I think we shouldn't really need to call this during init, but we have been doing that so I've decided to keep the call. However moving that call earlier could be prudent as well. Doing it right after the intel-gtt probe seems appropriate. Also tested this on 946gz,elk,ilk and all seemed quite happy with this change. v2: Reorder init_hw vs. enable_hw functions (Chris) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: drm-intel-fixes@lists.freedesktop.org Fixes: `ebb7c78d35` ("agp/intel-gtt: Only register fake agp driver for gen1") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (cherry picked from commit `ac840ae535`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-05-23 11:10:48 +03:00
Linus Torvalds	2f37dd131c	Staging and IIO driver update for 4.7-rc1 Here's the big staging and iio driver update for 4.7-rc1. I think we almost broke even with this release, only adding a few more lines than we removed, which isn't bad overall given that there's a bunch of new iio drivers added. The Lustre developers seem to have woken up from their sleep and have been doing a great job in cleaning up the code and pruning unused or old cruft, the filesystem is almost readable :) Other than that, just a lot of basic coding style cleanups in the churn. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlc/00QACgkQMUfUDdst+ynXYQCdG9oEsw4CCItbjGfQau5YVGbd TOcAnA19tZz+Wcg3sLT8Zsm979dgVvDt =9UG/ -----END PGP SIGNATURE----- Merge tag 'staging-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver updates from Greg KH: "Here's the big staging and iio driver update for 4.7-rc1. I think we almost broke even with this release, only adding a few more lines than we removed, which isn't bad overall given that there's a bunch of new iio drivers added. The Lustre developers seem to have woken up from their sleep and have been doing a great job in cleaning up the code and pruning unused or old cruft, the filesystem is almost readable :) Other than that, just a lot of basic coding style cleanups in the churn. All have been in linux-next for a while with no reported issues" * tag 'staging-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (938 commits) Staging: emxx_udc: emxx_udc: fixed coding style issue staging/gdm724x: fix "alignment should match open parenthesis" issues staging/gdm724x: Fix avoid CamelCase staging: unisys: rename misleading var ii with frag staging: unisys: visorhba: switch success handling to error handling staging: unisys: visorhba: main path needs to flow down the left margin staging: unisys: visorinput: handle_locking_key() simplifications staging: unisys: visorhba: fail gracefully for thread creation failures staging: unisys: visornic: comment restructuring and removing bad diction staging: unisys: fix format string %Lx to %llx for u64 staging: unisys: remove unused struct members staging: unisys: visorchannel: correct variable misspelling staging: unisys: visorhba: replace functionlike macro with function staging: dgnc: Need to check for NULL of ch staging: dgnc: remove redundant condition check staging: dgnc: fix 'line over 80 characters' staging: dgnc: clean up the dgnc_get_modem_info() staging: lustre: lnet: enable configuration per NI interface staging: lustre: o2iblnd: properly set ibr_why staging: lustre: o2iblnd: remove last of kiblnd_tunables_fini ...	2016-05-20 22:20:48 -07:00
Chris Wilson	a8ad0bd84f	drm: Remove unused drm_device from drm_gem_object_lookup() drm_gem_object_lookup() has never required the drm_device for its file local translation of the user handle to the GEM object. Let's remove the unused parameter and save some space. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: dri-devel@lists.freedesktop.org Cc: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: Fixup kerneldoc too.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-05-17 08:47:30 +02:00
Dave Airlie	fffb675106	Merge tag 'drm-intel-next-2016-04-25' of git://anongit.freedesktop.org/drm-intel into drm-next - more userptr cornercase fixes from Chris - clean up and tune forcewake handling (Tvrtko) - more underrun fixes from Ville, mostly for ilk to appeas CI - fix unclaimed register warnings on vlv/chv and enable the debug code to catch them by default (Ville) - skl gpu hang fixes for gt3/4 (Mika Kuoppala) - edram improvements for gen9+ (Mika again) - clean up gpu reset corner cases (Chris) - fix ctx/ring machine deaths on snb/ilk (Chris) - MOCS programming for all engines (Peter Antoine) - robustify/clean up vlv/chv irq handler (Ville) - split gen8+ irq handlers into ack/handle phase (Ville) - tons of bxt rpm fixes (mostly around firmware interactions), from Imre - hook up panel fitting for dsi panels (Ville) - more runtime PM fixes all over from Imre - shrinker polish (Chris) - more guc fixes from Alex Dai and Dave Gordon - tons of bugfixes and small polish all over (but with a big focus on bxt) * tag 'drm-intel-next-2016-04-25' of git://anongit.freedesktop.org/drm-intel: (142 commits) drm/i915: Update DRIVER_DATE to 20160425 drm/i915/bxt: Explicitly clear the Turbo control register drm/i915: Correct the i915_frequency_info debugfs output drm/i915: Macros to convert PM time interval values to microseconds drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW drm/i915: Fake HDMI live status drm/i915/bxt: Force reprogramming a PHY with invalid HW state drm/i915/bxt: Wait for PHY1 GRC done if PHY0 was already enabled drm/i915/bxt: Use PHY0 GRC value for HW state verification drm/i915: use dev_priv directly in gen8_ppgtt_notify_vgt drm/i915/bxt: Enable DC5 during runtime resume drm/i915/bxt: Sanitize DC state tracking during system resume drm/i915/bxt: Don't uninit/init display core twice during system suspend/resume drm/i915: Inline intel_suspend_complete drm/i915/kbl: Don't WARN for expected secondary MISC IO power well request drm/i915: Fix eDP low vswing for Broadwell drm/i915: check for ERR_PTR from i915_gem_object_pin_map() drm/i915/guc: local optimisations and updating comments drm/i915/guc: drop cached copy of 'wq_head' drm/i915/guc: keep GuC doorbell & process descriptor mapped in kernel ...	2016-05-04 17:25:30 +10:00
Gustavo Padovan	3ed605bc8a	kernel.h: add u64_to_user_ptr() This function had copies in 3 different files. Unify them in kernel.h. Cc: Joe Perches <joe@perches.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Rob Clark <robdclark@gmail.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Daniel Vetter <daniel.vetter@intel.com> [drm/i915/] Acked-by: Rob Clark <robdclark@gmail.com> [drm/msm/] Acked-by: Lucas Stach <l.stach@pengutronix.de> [drm/etinav/] Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-29 17:03:49 -07:00
Dave Airlie	605b28c859	Merge tag 'drm-intel-next-2016-04-11' of git://anongit.freedesktop.org/drm-intel into drm-next - make modeset hw state checker atomic aware (Maarten) - close races in gpu stuck detection/seqno reading (Chris) - tons&tons of small improvements from Chris Wilson all over the gem code - more dsi/bxt work from Ramalingam&Jani - macro polish from Joonas - guc fw loading fixes (Arun&Dave) - vmap notifier (acked by Andrew) + i915 support by Chris Wilson - create bottom half for execlist irq processing (Chris Wilson) - vlv/chv pll cleanup (Ville) - rework DP detection, especially sink detection (Shubhangi Shrivastava) - make color manager support fully atomic (Maarten) - avoid livelock on chv in execlist irq handler (Chris) * tag 'drm-intel-next-2016-04-11' of git://anongit.freedesktop.org/drm-intel: (82 commits) drm/i915: Update DRIVER_DATE to 20160411 drm/i915: Avoid allocating a vmap arena for a single page drm,i915: Introduce drm_malloc_gfp() drm/i915/shrinker: Restrict vmap purge to objects with vmaps drm/i915: Refactor duplicate object vmap functions drm/i915: Consolidate common error handling in intel_pin_and_map_ringbuffer_obj drm/i915/dmabuf: Tighten struct_mutex for unmap_dma_buf drm/i915: implement WaClearTdlStateAckDirtyBits drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit drm/i915: Rename hw state checker to hw state verifier. drm/i915: Move modeset state verifier calls. drm/i915: Make modeset state verifier take crtc as argument. drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor drm/i915: Use simplest form for flushing the single cacheline in the HWS drm/i915: Harden detection of missed interrupts drm/i915: Separate out the seqno-barrier from engine->get_seqno drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+ drm/i915: Fixup the free space logic in ring_prepare drm/i915: Simplify check for idleness in hangcheck drm/i915: Apply a mb between emitting the request and hangcheck ...	2016-04-22 09:03:31 +10:00
Dave Airlie	49047962ec	Linux 4.6-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa 6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+ atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY= =bri6 -----END PGP SIGNATURE----- Merge tag 'v4.6-rc3' into drm-next Backmerge 4.6-rc3 for i915. Linux 4.6-rc3	2016-04-22 08:32:51 +10:00
Daniel Vetter	f74418a400	drm/vma_manage: Drop has_offset It's racy, creating mmap offsets is a slowpath, so better to remove it to avoid drivers doing broken things. The only user is i915, and it's ok there because everything (well almost) is protected by dev->struct_mutex in i915-gem. While at it add a note in the create_mmap_offset kerneldoc that drivers must release it again. And then I also noticed that drm_gem_object_release entirely lacks kerneldoc. Cc: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459330852-27668-14-git-send-email-daniel.vetter@ffwll.ch	2016-04-20 12:58:53 +02:00
Tvrtko Ursulin	791bee125b	drm/i915: Remove a couple pointless WARN_ONs Just two WARN_ONs followed by pointer dereference I spotted by accident. v2: Remove some more of the same. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1461080770-14693-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-04-20 09:59:17 +01:00
Ingo Molnar	6666ea558b	Linux 4.6-rc4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXFELfAAoJEHm+PkMAQRiGRYIH+wWsUva7TR9arN1ZrURvI17b KqyQH8Ov9zJBsIaq/rFXOr5KfNgx7BU9BL9h7QkBy693HXTWf+GTZ1czHM4N12C3 0ZdHGrLwTHo2zdisiQaFORZSfhSVTUNGXGHXw13bUMgEqatPgkozXEnsvXXNdt1Z HtlcuJn3pcj+QIY7qDXZgTLTwgn248hi1AgNag+ntFcWiz21IYaMIi7/mCY9QUIi AY+Y3hqFQM7/8cVyThGS5wZPTg1YzdhsLJpoCk0TbS8FvMEnA+ylcTgc15C78bwu AxOwM3OCmH4gMsd7Dd/O+i9lE3K6PFrgzdDisYL3P7eHap+EdiLDvVzPDPPx0xg= =Q7r3 -----END PGP SIGNATURE----- Merge tag 'v4.6-rc4' into x86/asm, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-19 10:38:52 +02:00
Peter Antoine	0ccdacf694	drm/i915/mocs: Program MOCS for all engines on init Allow for the MOCS to be programmed for all engines. Currently we program the MOCS when the first render batch goes through. This works on most platforms but fails on platforms that do not run a render batch early, i.e. headless servers. The patch now programs all initialised engines on init and the RCS is programmed again within the initial batch. This is done for predictable consistency with regards to the hardware context. Hardware context loading sets the values of the MOCS for RCS and L3CC. Programming them from within the batch makes sure that the render context is valid, no matter what the previous state of the saved-context was. v2: posted correct version to the mailing list. v3: moved programming to within engine->init_hw() (Chris Wilson) v4: code formatting and white-space changes. (Chris Wilson) Testcase: igt/gem_mocs_settings Signed-off-by: Peter Antoine <peter.antoine@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1460556205-6644-1-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	aa9b78104f	drm/i915: Late request cancellations are harmful Conceptually, each request is a record of a hardware transaction - we build up a list of pending commands and then either commit them to hardware, or cancel them. However, whilst building up the list of pending commands, we may modify state outside of the request and make references to the pending request. If we do so and then cancel that request, external objects then point to the deleted request leading to both graphical and memory corruption. The easiest example is to consider object/VMA tracking. When we mark an object as active in a request, we store a pointer to this, the most recent request, in the object. Then we want to free that object, we wait for the most recent request to be idle before proceeding (otherwise the hardware will write to pages now owned by the system, or we will attempt to read from those pages before the hardware is finished writing). If the request was cancelled instead, that wait completes immediately. As a result, all requests must be committed and not cancelled if the external state is unknown. All that remains of i915_gem_request_cancel() users are just a couple of extremely unlikely allocation failures, so remove the API entirely. A consequence of committing all incomplete requests is that we generate excess breadcrumbs and fill the ring much more often with dummy work. We have completely undone the outstanding_last_seqno optimisation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-16-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	349f2ccff9	drm/i915: Move the mb() following release-mmap into release-mmap As paranoia, we want to ensure that the CPU's PTEs have been revoked for the object before we return from i915_gem_release_mmap(). This allows us to rely on there being no outstanding memory accesses from userspace and guarantees serialisation of the code against concurrent access just by calling i915_gem_release_mmap(). v2: Reduce the mb() into a wmb() following the revoke. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: "Goel, Akash" <akash.goel@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-13-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	f4457ae71f	drm/i915: Prevent leaking of -EIO from i915_wait_request() Reporting -EIO from i915_wait_request() has proven very troublematic over the years, with numerous hard-to-reproduce bugs cropping up in the corner case of where a reset occurs and the code wasn't expecting such an error. If the we reset the GPU or have detected a hang and wish to reset the GPU, the request is forcibly complete and the wait broken. Currently, we report either -EAGAIN or -EIO in order for the caller to retreat and restart the wait (if appropriate) after dropping and then reacquiring the struct_mutex (essential to allow the GPU reset to proceed). However, if we take the view that the request is complete (no further work will be done on it by the GPU because it is dead and soon to be reset), then we can proceed with the task at hand and then drop the struct_mutex allowing the reset to occur. This transfers the burden of checking whether it is safe to proceed to the caller, which in all but one instance it is safe - completely eliminating the source of all spurious -EIO. Of note, we only have two API entry points where we expect that userspace can observe an EIO. First is when submitting an execbuf, if the GPU is terminally wedged, then the operation cannot succeed and an -EIO is reported. Secondly, existing userspace uses the throttle ioctl to detect an already wedged GPU before starting using HW acceleration (or to confirm that the GPU is wedged after an error condition). So if the GPU is wedged when the user calls throttle, also report -EIO. v2: Split more carefully the change to i915_wait_request() and assorted ABI from the reset handling. v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code so that we don't start to leak EIO there in future (and break our hang resistant modesetting). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-9-git-send-email-chris@chris-wilson.co.uk Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-1-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	299259a3a9	drm/i915: Store the reset counter when constructing a request As the request is only valid during the same global reset epoch, we can record the current reset_counter when constructing the request and reuse it when waiting upon that request in future. This removes a very hairy atomic check serialised by the struct_mutex at the time of waiting and allows us to transfer those waits to a central dispatcher for all waiters and all requests. PS: With per-engine resets, we obviously cannot assume a global reset epoch for the requests - a per-engine epoch makes the most sense. The challenge then is how to handle checking in the waiter for when to break the wait, as the fine-grained reset may also want to requeue the request (i.e. the assumption that just because the epoch changes the request is completed may be broken - or we just avoid breaking that assumption with the fine-grained resets). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-7-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	d98c52cf4f	drm/i915: Tighten reset_counter for reset status In the reset_counter, we use two bits to track a GPU hang and reset. The low bit is a "reset-in-progress" flag that we set to signal when we need to break waiters in order for the recovery task to grab the mutex. As soon as the recovery task has the mutex, we can clear that flag (which we do by incrementing the reset_counter thereby incrementing the gobal reset epoch). By clearing that flag when the recovery task holds the struct_mutex, we can forgo a second flag that simply tells GEM to ignore the "reset-in-progress" flag. The second flag we store in the reset_counter is whether the reset failed and we consider the GPU terminally wedged. Whilst this flag is set, all access to the GPU (at least through GEM rather than direct mmio access) is verboten. PS: Fun is in store, as in the future we want to move from a global reset epoch to a per-engine reset engine with request recovery. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	c19ae989b0	drm/i915: Hide the atomic_read(reset_counter) behind a helper This is principally a little bit of syntatic sugar to hide the atomic_read()s throughout the code to retrieve the current reset_counter. It also provides the other utility functions to check the reset state on the already read reset_counter, so that (in later patches) we can read it once and do multiple tests rather than risk the value changing between tests. v2: Be more strict on converting existing i915_reset_in_progress() over to the more verbose i915_reset_in_progress_or_wedged(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Chris Wilson	d501b1d2b1	drm/i915: Add GEM debugging Kconfig option Currently there is a #define to enable extra BUG_ON for debugging requests and associated activities. I want to expand its use to cover all of GEM internals (so that we can saturate the code with asserts). We can add a Kconfig option to make it easier to enable - with the usual caveats of not enabling unless explicitly requested. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-3-git-send-email-chris@chris-wilson.co.uk	2016-04-14 10:45:40 +01:00
Mika Kuoppala	3accaf7e73	drm/i915: Store and use edram capabilities Store the edram capabilities instead of only the size of edram. This is preparatory patch to allow edram size calculation based on edram capability bits for gen9+. With gen9 the edram is behind llc and is a separate entity. With hsw/bdw it was more of a victim cache for LLC so the name 'eLLC' might be warranted. Regardless, rename all mentions of eLLC to EDRAM to clear the confusion. v2: return bytes for edram size (Chris) s/eLLC/eDRAM in output if we are gen > 8 v3: rebase, INTEL_GEN (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-04-14 12:27:37 +03:00
Mika Kuoppala	666fbcf5c2	drm/i915: Don't program eLLC IDI hash mask for gen9+ For gen9 onwards, eDRAM is a true memory side cache. So there is no need to program idi hash mask as it is for eLLC only. v2: INTEL_GEN (Chris), s/has/hash (Matthew) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2016-04-14 12:27:37 +03:00
Daniel Vetter	3970285319	Linux 4.6-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXCva8AAoJEHm+PkMAQRiGXBoIAIkrjxdbuT2nS9A3tHwkiFXa 6/Th1UjbNaoLuZ+MckQHayAD9NcWY9lVjOUmFsSiSWMCQK/rTWDl8x5ITputrY2V VuhrJCwI7huEtu6GpRaJaUgwtdOjhIHz1Ue2MCdNIbKX3l+LjVyyJ9Vo8rruvZcR fC7kiivH04fYX58oQ+SHymCg54ny3qJEPT8i4+g26686m11hvZLI3UAs2PAn6ut+ atCjxdQ4yLN3DWsbjuA7wYGWhTgFloxL4TIoisuOUc3FXnSi/ivIbXZvu4lUfisz LA2JBhfII3AEMBWG9xfGbXPijJTT4q7yNlTD0oYcnMtAt/Roh2F04asqB1LetEY= =bri6 -----END PGP SIGNATURE----- Merge tag 'v4.6-rc3' into drm-intel-next-queued Linux 4.6-rc3 Backmerge requested by Chris Wilson to make his patches apply cleanly. Tiny conflict in vmalloc.c with the (properly acked and all) patch in drm-intel-next: commit `4da56b99d9` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Apr 4 14:46:42 2016 +0100 mm/vmap: Add a notifier for when we run out of vmap address space and Linus' tree. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2016-04-11 19:25:13 +02:00
Chris Wilson	fb8621d3be	drm/i915: Avoid allocating a vmap arena for a single page If we want a contiguous mapping of a single page sized object, we can forgo using vmap() and just use a regular kmap(). Note that this is only suitable if the desired pgprot_t is compatible. v2: Use is_vmalloc_addr() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-7-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>	2016-04-11 17:13:36 +01:00
Chris Wilson	f2a85e1975	drm,i915: Introduce drm_malloc_gfp() I have instances where I want to use drm_malloc_ab() but with a custom gfp mask. And with those, where I want a temporary allocation, I want to try a high-order kmalloc() before using a vmalloc(). So refactor my usage into drm_malloc_gfp(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: dri-devel@lists.freedesktop.org Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Dave Airlie <airlied@redhat.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk	2016-04-11 17:13:10 +01:00
Chris Wilson	0a798eb92e	drm/i915: Refactor duplicate object vmap functions We now have two implementations for vmapping a whole object, one for dma-buf and one for the ringbuffer. If we couple the mapping into the obj->pages lifetime, then we can reuse an obj->mapping for both and at the same time couple it into the shrinker. There is a third vmapping routine in the cmdparser that maps only a range within the object, for the time being that is left alone, but will eventually use these routines in order to cache the mapping between invocations. v2: Mark the failable kmalloc() as __GFP_NOWARN (vsyrjala) v3: Call unpin_vmap from the right dmabuf unmapper v4: Rename vmap to map as we don't wish to imply the type of mapping involved, just that it contiguously maps the object into kernel space. Add kerneldoc and lockdep annotations Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-4-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>	2016-04-11 17:11:44 +01:00
Chris Wilson	7c90b7de73	drm/i915: Apply a mb between emitting the request and hangcheck Seal the request and mark it as pending execution before we submit it to hardware. We assume that the actual submission cannot fail (that guarantee is provided by preallocating space in the request for the submission). As we may inspect this state without holding any locks during hangcheck we should apply a barrier to ensure that we do not see a more recent value in the HWS than we are tracking. Based on a patch by Mika Kuoppala. Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-8-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2016-04-08 11:44:48 +01:00
Chris Wilson	29dcb57026	drm/i915: Move the hw semaphore initialisation from GEM to the engine Since we are setting engine local values that are tied to the hardware, move it out of i915_gem_init_seqno() into the intel_ring_init_seqno() backend, next to where the other hw semaphore registers are written. v2: Make the explanatory comment about always resetting the semaphores to 0 irrespective of the value of the reset seqno. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-4-git-send-email-chris@chris-wilson.co.uk	2016-04-08 11:43:38 +01:00
Chris Wilson	2ed53a94d8	drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno After the GPU reset and we discard all of the incomplete requests, mark the GPU as having advanced to the last_submitted_seqno (as having completed the requests and ready for fresh work). The impact of this is negligible, as all the requests will be considered completed by this point, it just brings the HWS into line with expectations for external viewers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-2-git-send-email-chris@chris-wilson.co.uk	2016-04-08 11:43:11 +01:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Tvrtko Ursulin	27af5eea54	drm/i915: Move execlists irq handler to a bottom half Doing a lot of work in the interrupt handler introduces huge latencies to the system as a whole. Most dramatic effect can be seen by running an all engine stress test like igt/gem_exec_nop/all where, when the kernel config is lean enough, the whole system can be brought into multi-second periods of complete non-interactivty. That can look for example like this: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u8:3:143] Modules linked in: [redacted for brevity] CPU: 0 PID: 143 Comm: kworker/u8:3 Tainted: G U L 4.5.0-160321+ #183 Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1 Workqueue: i915 gen6_pm_rps_work [i915] task: ffff8800aae88000 ti: ffff8800aae90000 task.ti: ffff8800aae90000 RIP: 0010:[<ffffffff8104a3c2>] [<ffffffff8104a3c2>] __do_softirq+0x72/0x1d0 RSP: 0000:ffff88014f403f38 EFLAGS: 00000206 RAX: ffff8800aae94000 RBX: 0000000000000000 RCX: 00000000000006e0 RDX: 0000000000000020 RSI: 0000000004208060 RDI: 0000000000215d80 RBP: ffff88014f403f80 R08: 0000000b1b42c180 R09: 0000000000000022 R10: 0000000000000004 R11: 00000000ffffffff R12: 000000000000a030 R13: 0000000000000082 R14: ffff8800aa4d0080 R15: 0000000000000082 FS: 0000000000000000(0000) GS:ffff88014f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa53b90c000 CR3: 0000000001a0a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: 042080601b33869f ffff8800aae94000 00000000fffc2678 ffff88010000000a 0000000000000000 000000000000a030 0000000000005302 ffff8800aa4d0080 0000000000000206 ffff88014f403f90 ffffffff8104a716 ffff88014f403fa8 Call Trace: <IRQ> [<ffffffff8104a716>] irq_exit+0x86/0x90 [<ffffffff81031e7d>] smp_apic_timer_interrupt+0x3d/0x50 [<ffffffff814f3eac>] apic_timer_interrupt+0x7c/0x90 <EOI> [<ffffffffa01c5b40>] ? gen8_write64+0x1a0/0x1a0 [i915] [<ffffffff814f2b39>] ? _raw_spin_unlock_irqrestore+0x9/0x20 [<ffffffffa01c5c44>] gen8_write32+0x104/0x1a0 [i915] [<ffffffff8132c6a2>] ? n_tty_receive_buf_common+0x372/0xae0 [<ffffffffa017cc9e>] gen6_set_rps_thresholds+0x1be/0x330 [i915] [<ffffffffa017eaf0>] gen6_set_rps+0x70/0x200 [i915] [<ffffffffa0185375>] intel_set_rps+0x25/0x30 [i915] [<ffffffffa01768fd>] gen6_pm_rps_work+0x10d/0x2e0 [i915] [<ffffffff81063852>] ? finish_task_switch+0x72/0x1c0 [<ffffffff8105ab29>] process_one_work+0x139/0x350 [<ffffffff8105b186>] worker_thread+0x126/0x490 [<ffffffff8105b060>] ? rescuer_thread+0x320/0x320 [<ffffffff8105fa64>] kthread+0xc4/0xe0 [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170 [<ffffffff814f351f>] ret_from_fork+0x3f/0x70 [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170 I could not explain, or find a code path, which would explain a +20 second lockup, but from some instrumentation it was apparent the interrupts off proportion of time was between 10-25% under heavy load which is quite bad. When a interrupt "cliff" is reached, which was >~320k irq/s on my machine, the whole system goes into a terrible state of the above described multi-second lockups. By moving the GT interrupt handling to a tasklet in a most simple way, the problem above disappears completely. Testing the effect on sytem-wide latencies using igt/gem_syslatency shows the following before this patch: gem_syslatency: cycles=1532739, latency mean=416531.829us max=2499237us gem_syslatency: cycles=1839434, latency mean=1458099.157us max=4998944us gem_syslatency: cycles=1432570, latency mean=2688.451us max=1201185us gem_syslatency: cycles=1533543, latency mean=416520.499us max=2498886us This shows that the unrelated process is experiencing huge delays in its wake-up latency. After the patch the results look like this: gem_syslatency: cycles=808907, latency mean=53.133us max=1640us gem_syslatency: cycles=862154, latency mean=62.778us max=2117us gem_syslatency: cycles=856039, latency mean=58.079us max=2123us gem_syslatency: cycles=841683, latency mean=56.914us max=1667us Showing a huge improvement in the unrelated process wake-up latency. It also shows an approximate halving in the number of total empty batches submitted during the test. This may not be worrying since the test puts the driver under a very unrealistic load with ncpu threads doing empty batch submission to all GPU engines each. Another benefit compared to the hard-irq handling is that now work on all engines can be dispatched in parallel since we can have up to number of CPUs active tasklets. (While previously a single hard-irq would serially dispatch on one engine after another.) More interesting scenario with regards to throughput is "gem_latency -n 100" which shows 25% better throughput and CPU usage, and 14% better dispatch latencies. I did not find any gains or regressions with Synmark2 or GLbench under light testing. More benchmarking is certainly required. v2: * execlists_lock should be taken as spin_lock_bh when queuing work from userspace now. (Chris Wilson) * uncore.lock must be taken with spin_lock_irq when submitting requests since that now runs from either softirq or process context. v3: * Expanded commit message with more testing data; * converted missed locking sites to _bh; * added execlist_lock comment. (Chris Wilson) v4: * Mention dispatch parallelism in commit. (Chris Wilson) * Do not hold uncore.lock over MMIO reads since the block is already serialised per-engine via the tasklet itself. (Chris Wilson) * intel_lrc_irq_handler should be static. (Chris Wilson) * Cancel/sync the tasklet on GPU reset. (Chris Wilson) * Document and WARN that tasklet cannot be active/pending on engine cleanup. (Chris Wilson/Imre Deak) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Testcase: igt/gem_exec_nop/all Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94350 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1459768316-6670-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-04-04 14:08:52 +01:00
Joonas Lahtinen	72e96d6450	drm/i915: Refer to GGTT {,VM} consistently Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt", "vm" or indirectly through other variables like "dev_priv->ggtt.base" to avoid confusion with the i915_ggtt object itself and PPGTT VMs. Refer to the GGTT as "ggtt" instead of indirectly through chaining. As a bonus gets rid of the long-standing i915_obj_to_ggtt vs. i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt! v2: - Added some more after grepping sources with Chris v3: - Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm (Chris) v4: - Convert all dev_priv->ggtt->foo accesses to ggtt->foo. v5: - Make patch checker happy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-31 17:55:43 +03:00
Borislav Petkov	568a58e5df	x86/mm/pat, x86/cpufeature: Remove cpu_has_pat Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: intel-gfx@lists.freedesktop.org Link: http://lkml.kernel.org/r/1459266123-21878-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-31 13:32:43 +02:00
Joonas Lahtinen	d85489d314	drm/i915: Rename GGTT init functions Rename and document the GGTT init functions to give a better idea of the context where they are called from. i915_gem_gtt_init => i915_ggtt_init_hw i915_gem_init_global_gtt => i915_gem_init_ggtt i915_global_gtt_cleanup => i915_ggtt_cleanup_hw Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-03-30 13:47:18 +03:00
Matthew Auld	ade7daa164	drm/i915: BUG_ON when ggtt_view is NULL Lets BUG_ON and don't bother with a WARN and returning an error, so we can remove the need to pollute the code with error handling, after all it is a programmer error to provide NULL view. Also while we're here remove redundant NULL ggtt_view check. Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458834860-7898-1-git-send-email-matthew.auld@intel.com	2016-03-30 13:47:18 +03:00
Dave Gordon	b4ac5afc6b	drm/i915: replace for_each_engine() Having provided for_each_engine_id() for cases where the third (id) argument is useful, we can now replace all the remaining instances with a simpler version that takes only two parameters. In many cases, this also allows the elimination of the local variable used in the iterator (usually 'i'). v2: s/dev_priv/(dev_priv__)/ in body of for_each_engine_masked() [Chris Wilson] Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458757194-17783-2-git-send-email-david.s.gordon@intel.com	2016-03-24 14:34:11 +00:00
Joonas Lahtinen	62106b4f6b	drm/i915: Rename dev_priv->gtt to dev_priv->ggtt Refer to Global GTT consistently as GGTT, thus rename dev_priv->gtt to dev_priv->ggtt and struct i915_gtt to struct i915_ggtt. Fix a couple of whitespace problems while at it. v2: - Fix a typo in commit message. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2016-03-18 15:18:15 +02:00
Tvrtko Ursulin	39dabecd99	drm/i915: Use shorter route to dev_private where possible Where we have a request we can use req->i915 directly instead of going through the engine and device. Coccinelle script: @@ function f; identifier r; @@ f(..., struct drm_i915_gem_request r, ...) { ... - engine->dev->dev_private + r->i915 ... } @@ struct drm_i915_gem_request req; @@ ( req-> - engine->dev->dev_private + i915 ) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458219850-21007-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-18 09:50:37 +00:00
Tvrtko Ursulin	a112dbad44	drm/i915: Remove unused variable in i915_gem_request_add_to_client Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-18 09:50:36 +00:00
Imre Deak	40ae4e1661	drm/i915: Move load time gem_load_init earlier The only steps requiring device access is the fence and swizzling initialization, so split these out keeping them in their current place and move the rest of init steps earlier. v2-v3: - unchanged v4: - move call to i915_gem_detect_bit_6_swizzle() to i915_gem_load_init_fences() and preserve the original order of the detection of HW fence capailities wrt. swizzling (Chris) CC: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458132843-21860-1-git-send-email-imre.deak@intel.com	2016-03-17 15:22:05 +02:00
Mika Kuoppala	ee4b6faf96	drm/i915: Modify reset func to handle per engine resets In full gpu reset we prime all engines and reset domains corresponding to each engine. Per engine reset is just a special case of this process wherein only a single engine is reset. This change is aimed to modify relevant functions to achieve this. There are some other steps we carry out in case of engine reset which are addressed in later patches. Reset func now accepts a mask of all engines that need to be reset. Where per engine resets are supported, error handler populates the mask accordingly otherwise all engines are specified. v2: ALL_ENGINES mask fixup, better for_each_ring_masked (Chris) v3: Whitespace fixes (Chris) v4: Rebase due to s/ring/engine Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458143640-20563-1-git-send-email-mika.kuoppala@intel.com	2016-03-17 15:01:15 +02:00
Tvrtko Ursulin	117897f42c	drm/i915: More renaming of rings to engines This time using only sed and a few by hand. v2: Rename also intel_ring_id and intel_ring_initialized. v3: Fixed typo in intel_ring_initialized. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458126040-33105-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-16 15:33:30 +00:00
Tvrtko Ursulin	666796da7a	drm/i915: More intel_engine_cs renaming Some trivial ones, first pass done with Coccinelle: @@ @@ ( - I915_NUM_RINGS + I915_NUM_ENGINES \| - intel_ring_flag + intel_engine_flag \| - for_each_ring + for_each_engine \| - i915_gem_request_get_ring + i915_gem_request_get_engine \| - intel_ring_idle + intel_engine_idle \| - i915_gem_reset_ring_status + i915_gem_reset_engine_status \| - i915_gem_reset_ring_cleanup + i915_gem_reset_engine_cleanup \| - init_ring_lists + init_engine_lists ) But that didn't fully work so I cleaned it up with: for f in .[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done for f in .[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done for f in .[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done for f in .[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done for f in .[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done for f in .[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:24 +00:00
Tvrtko Ursulin	4a570db57c	drm/i915: Rename intel_engine_cs struct members below and a couple manual fixups. @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ struct drm_i915_private d; @@ ( - d->ring + d->engine ) @@ struct i915_execbuffer_params p; @@ ( - p->ring + p->engine ) @@ struct intel_ringbuffer r; @@ ( - r->ring + r->engine ) @@ struct drm_i915_gem_request req; @@ ( - req->ring + req->engine ) v2: Script missed the tracepoint code - fixed up by hand. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:17 +00:00
Tvrtko Ursulin	0bc40be85f	drm/i915: Rename intel_engine_cs function parameters @@ identifier func; @@ func(..., struct intel_engine_cs * - ring + engine , ...) { <... - ring + engine ...> } @@ identifier func; type T; @@ T func(..., struct intel_engine_cs * - ring + engine , ...); Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:10 +00:00
Tvrtko Ursulin	e2f8039147	drm/i915: Rename local struct intel_engine_cs variables Done by the Coccinelle script below plus a manual intervention to GEN8_RING_SEMAPHORE_INIT. @@ expression E; @@ - struct intel_engine_cs ring = E; + struct intel_engine_cs engine = E; <+... - ring + engine ...+> @@ @@ - struct intel_engine_cs ring; + struct intel_engine_cs engine; <+... - ring + engine ...+> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:00 +00:00
Tvrtko Ursulin	ca377809d6	drm/i915: Avoid snooping with userptr where not supported commit `e5756c10d8` Author: Imre Deak <imre.deak@intel.com> Date: Fri Aug 14 18:43:30 2015 +0300 drm/i915/bxt: don't allow cached GEM mappings on A stepping Added an exception of disallowing snooping for Broxton A stepping hardware but userptr was still enabling it regardless. Move the check to HAS_SNOOP now that it is used from multiple call sites and use it. v2: Userptr cannot be supported when it cannot be coherent and generalize the code better. (Chris Wilson) v3: Make has_snoop true only when !has_llc. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1456920631-34302-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-02 13:46:21 +00:00

1 2 3 4 5 ...

1316 Commits