linux

Commit Graph

Author	SHA1	Message	Date
Thomas Hellstrom	0e6d6ec02f	drm/ttm: Work around performance regression with VM_PFNMAP A performance regression was introduced in TTM in linux 3.13 when we started using VM_PFNMAP for shared mappings. In theory this should've been faster due to less page book-keeping but it appears like VM_PFNMAP + x86 PAT + write-combine is a particularly cpu-hungry combination, as seen by largely increased cpu-usage on r200 GL video playback. Until we've sorted out why, revert to always use VM_MIXEDMAP. Reference: freedesktop.org bugzilla bug #75719 Reported-and-tested-by: <smoki00790@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Cc: stable@vger.kernel.org	2014-03-12 14:07:24 +01:00
Dave Airlie	cfd72a4c20	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-next drm-intel-next-2014-01-10: - final bits for runtime D3 on Haswell from Paul (now enabled fully) - parse the backlight modulation freq information in the VBT from Jani (but not yet used) - more watermark improvements from Ville for ilk-ivb and bdw - bugfixes for fastboot from Jesse - watermark fix for i830M (but not yet everything) - vlv vga hotplug w/a (Imre) - piles of other small improvements, cleanups and fixes all over Note that the pull request includes a backmerge of the last drm-fixes pulled into Linus' tree - things where getting a bit too messy. So the shortlog also contains a bunch of patches from Linus tree. Please yell if you want me to frob it for you a bit. * 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (609 commits) drm/i915/bdw: make sure south port interrupts are enabled properly v2 drm/i915: Include more information in disabled hotplug interrupt warning drm/i915: Only complain about a rogue hotplug IRQ after disabling drm/i915: Only WARN about a stuck hotplug irq ONCE drm/i915: s/hotplugt_status_gen4/hotplug_status_g4x/	2014-01-20 10:21:54 +10:00
Thomas Hellstrom	58aa6622d3	drm/ttm: Correctly set page mapping and -index members Needed for some vm operations; most notably unmap_mapping_range() with even_cows = 0. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Brian Paul <brianp@vmware.com>	2014-01-08 10:08:28 +01:00
Thomas Hellstrom	667a50db04	drm/ttm: Refuse to fault (prime-) imported pages This is illegal for at least two reasons: 1) While it may work on some platforms / iommus, obtaining page pointers from mapped sg-lists is illegal, since the DMA API allows page pointer information to be destroyed in the sg mapping process. 2) TTM has no way of determining the linear kernel map caching state of the underlying pages. PTEs with conflicting caching state pointing to the same pfn is not allowed. TTM operations touching pages of imported sg-tables should be redirected through the proper dma-buf operations. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Acked-by: Daniel Vetter <daniel@ffwll.ch>	2014-01-08 09:55:05 +01:00
Thomas Hellstrom	7dfe8b6187	drm/ttm: Use VM_PFNMAP for shared bo maps VM_PFNMAP is faster than VM_MIXEDMAP due to reduced page administration so use it for shared maps where we don't have any Copy-On-Write pages. For private maps, we continue to use VM_MIXEDMAP. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2014-01-08 09:53:12 +01:00
Thomas Hellstrom	d386735588	drm/ttm: Fix accesses through vmas with only partial coverage VMAs covering a bo but that didn't start at the same address space offset as the bo they were mapping were incorrectly generating SEGFAULT errors in the fault handler. Reported-by: Joseph Dolinak <kanilo2@yahoo.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com> Cc: stable@vger.kernel.org	2013-12-16 10:08:35 -08:00
Thomas Hellstrom	c58f009e01	drm/ttm: Remove set_need_resched from the ttm fault handler Addresses "[BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE". In the first occurence it was used to try to be nice while releasing the mmap_sem and retrying the fault to work around a locking inversion. The second occurence was never used. There has been some discussion whether we should change the locking order to mmap_sem -> bo_reserve. This patch doesn't address that issue, and leaves that locking order undefined. The solution that we release the mmap_sem if tryreserve fails and wait for the buffer to become unreserved is something we want in any case, and follows how the core vm system waits for pages to be come unlocked while releasing the mmap_sem. The code also outlines what needs to be changed if we want to establish the locking order as mmap_sem -> bo::reserve. One slight issue that remains with this code is that the fault handler might be prone to starvation if another thread countinously reserves the buffer. IMO that usage pattern is highly unlikely. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>	2013-11-20 03:46:54 -08:00
Thomas Hellstrom	3943875e7b	drm/ttm: Fix vma page_prot bit manipulation Fix a long-standing TTM issue where we manipulated the vma page_prot bits while mmap_sem was taken in read mode only. We now make a local copy of the vma structure which we pass when we set the ptes. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jerome Glisse <jglisse@redhat.com>	2013-11-12 23:55:31 -08:00
Thomas Hellstrom	cbe12e74ee	drm/ttm: Allow vm fault retries Make use of the FAULT_FLAG_ALLOW_RETRY flag to allow dropping the mmap_sem while waiting for bo idle. FAULT_FLAG_ALLOW_RETRY appears to be primarily designed for disk waits but should work just as fine for GPU waits.. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>	2013-11-06 04:14:43 -08:00
Maarten Lankhorst	5c692948d8	drm/ttm: kill unused functions Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-08-19 09:36:12 +10:00
David Herrmann	72525b3f33	drm/ttm: convert to unified vma offset manager Use the new vma-manager infrastructure. This doesn't change any implementation details as the vma-offset-manager is nearly copied 1-to-1 from TTM. The vm_lock is moved into the offset manager so we can drop it from TTM. During lookup, we use the vma locking helpers to take a reference to the found object. In all other scenarios, locking stays the same as before. We always guarantee that drm_vma_offset_remove() is called only during destruction. Hence, helpers like drm_vma_node_offset_addr() are always safe as long as the node has a valid offset. This also drops the addr_space_offset member as it is a copy of vm_start in vma_node objects. Use the accessor functions instead. v4: - remove vm_lock - use drm_vma_offset_lock_lookup() to protect lookup (instead of vm_lock) Cc: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com> Cc: Martin Peres <martin.peres@labri.fr> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2013-07-25 20:47:07 +10:00
Libin	025df77554	drm: use vma_pages() to replace (vm_end - vm_start) >> PAGE_SHIFT (->vm_end - ->vm_start) >> PAGE_SHIFT operation is implemented as a inline funcion vma_pages() in linux/mm.h, so using it. Signed-off-by: Libin <huawei.libin@huawei.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2013-04-16 13:14:00 +10:00
Thomas Hellstrom	82fe50bcc8	drm/ttm: Optimize vm locking using kref_get_unless_zero v3 Removes the need for a write lock each time we call ttm_bo_unref(). v2: Remove an unused variable. v3: Really remove the unused variable. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2012-11-28 18:37:59 +10:00
Konstantin Khlebnikov	314e51b985	mm: kill vma flag VM_RESERVED and mm->reserved_vm counter A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA, currently it lost original meaning but still has some effects: \| effect \| alternative flags -+------------------------+--------------------------------------------- 1\| account as reserved_vm \| VM_IO 2\| skip in core dump \| VM_IO, VM_DONTDUMP 3\| do not merge or expand \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP 4\| do not mlock \| VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP This patch removes reserved_vm counter from mm_struct. Seems like nobody cares about it, it does not exported into userspace directly, it only reduces total_vm showed in proc. Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND \| VM_DONTDUMP. remap_pfn_range() and io_remap_pfn_range() set VM_IO\|VM_DONTEXPAND\|VM_DONTDUMP. remap_vmalloc_range() set VM_DONTEXPAND \| VM_DONTDUMP. [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:19 +09:00
Joe Perches	25d0479a59	drm/ttm: Use pr_fmt and pr_<level> Use the more current logging style. Add pr_fmt and remove the TTM_PFX uses. Coalesce formats and align arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2012-03-20 08:45:35 +00:00
Jerome Glisse	b1e5f17232	drm/ttm: introduce callback for ttm_tt populate & unpopulate V4 Move the page allocation and freeing to driver callback and provide ttm code helper function for those. Most intrusive change, is the fact that we now only fully populate an object this simplify some of code designed around the page fault design. V2 Rebase on top of memory accounting overhaul V3 New rebase on top of more memory accouting changes V4 Rebase on top of no memory account changes (where/when is my delorean when i need it ?) Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>	2011-12-06 10:39:24 +00:00
Thomas Hellstrom	eba67093f5	drm/ttm: Fix up io_mem_reserve / io_mem_free calling This patch attempts to fix up shortcomings with the current calling sequences. 1) There's a fastpath where no locking occurs and only io_mem_reserved is called to obtain needed info for mapping. The fastpath is set per memory type manager. 2) If the fastpath is disabled, io_mem_reserve and io_mem_free will be exactly balanced and not called recursively for the same struct ttm_mem_reg. 3) Optionally the driver can choose to enable a per memory type manager LRU eviction mechanism that, when io_mem_reserve returns -EAGAIN will attempt to kill user-space mappings of memory in that manager to free up needed resources Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-11-22 13:25:22 +10:00
Thomas Hellstrom	702adba224	drm/ttm/radeon/nouveau: Kill the bo lock in favour of a bo device fence_lock The bo lock used only to protect the bo sync object members, and since it is a per bo lock, fencing a buffer list will see a lot of locks and unlocks. Replace it with a per-device lock that protects the sync object members on all bos. Reading and setting these members will always be very quick, so the risc of heavy lock contention is microscopic. Note that waiting for sync objects will always take place outside of this lock. The bo device fence lock will eventually be replaced with a seqlock / rcu mechanism so we can determine that a bo is idle under a rcu / read seqlock. However this change will allow us to batch fencing and unreserving of buffers with a minimal amount of locking. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Jerome Glisse <j.glisse@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-11-22 13:25:18 +10:00
Jerome Glisse	82c5da6bf8	drm/ttm: ttm_fault callback to allow driver to handle bo placement V6 On fault the driver is given the opportunity to perform any operation it sees fit in order to place the buffer into a CPU visible area of memory. This patch doesn't break TTM users, nouveau, vmwgfx and radeon should keep working properly. Future patch will take advantage of this infrastructure and remove the old path from TTM once driver are converted. V2 return VM_FAULT_NOPAGE if callback return -EBUSY or -ERESTARTSYS V3 balance io_mem_reserve and io_mem_free call, fault_reserve_notify is responsible to perform any necessary task for mapping to succeed V4 minor cleanup, atomic_t -> bool as member is protected by reserve mecanism from concurent access V5 the callback is now responsible for iomapping the bo and providing a virtual address this simplify TTM and will allow to get rid of TTM_MEMTYPE_FLAG_NEEDS_IOREMAP V6 use the bus addr data to decide to ioremap or this isn't needed but we don't necesarily need to ioremap in the callback but still allow driver to use static mapping Signed-off-by: Jerome Glisse <jglisse@redhat.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 14:12:05 +10:00
Dave Airlie	b8ff7357da	drm/ttm: fix incorrect logic in ttm_bo_io path This path isn't used by radeon yet, but future drivers will want it, so fix it right. Reported-by: Luca Barbieri <luca@luca-barbieri.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-12-16 15:32:31 +10:00
Thomas Hellstrom	98ffc4158e	drm/ttm: Have the TTM code return -ERESTARTSYS instead of -ERESTART. Return -ERESTARTSYS instead of -ERESTART when interrupted by a signal. The -ERESTARTSYS is converted to an -EINTR by the kernel signal layer before returned to user-space. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-12-10 15:09:03 +10:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
Linus Torvalds	84210aeb4a	Merge branch 'drm-radeon-kms' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-radeon-kms' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (35 commits) drm/radeon: set fb aperture sizes for framebuffer handoff. drm/ttm: fix highuser vs dma32 confusion. drm/radeon: Fix size used for benchmarking BO copies. drm/radeon: Add radeon.test parameter for running BO GPU copy tests. drm/radeon/kms: allow interruptible waits for objects. drm/ttm: powerpc: Fix Highmem cache flushing. x86: Export kmap_atomic_prot() needed for TTM. drm/ttm: Fix ttm in-kernel copying of pages with non-standard caching attributes. drm/ttm: Fix an oops and sync object leak. drm/radeon/kms: vram sizing on certain r100 chips needs workaround. drm/radeon: Pay more attention to object placement requested by userspace. drm/radeon: Fall back to evicting BOs with memcpy if necessary. drm/radeon: Don't unreserve twice on failure to validate. drm/radeon/kms: fix bandwidth computation on avivo hardware drm/radeon/kms: add initial colortiling support. drm/radeon/kms: fix hotspot handling on pre-avivo chips drm/radeon/kms: enable frac fb divs on rs600/rs690/rs740 drm/radeon/kms: add PLL flag to prefer frequencies <= the target freq drm/radeon/kms: block RN50 from using 3D engine. drm/radeon/kms: fix VRAM sizing like DDX does it. ...	2009-07-29 12:31:59 -07:00
Dave Airlie	e024e11070	drm/radeon/kms: add initial colortiling support. This adds new set/get tiling interfaces where the pitch and macro/micro tiling enables can be set. Along with a flag to decide if this object should have a surface when mapped. The only thing we need to allocate with a mapped surface should be the frontbuffer. Note rotate scanout shouldn't require one, and back/depth shouldn't either, though mesa needs some fixes. It fixes the TTM interfaces along Thomas's suggestions, and I've tested the surface stealing code with two X servers and not seen any lockdep issues. I've stopped tiling the fbcon frontbuffer, as I don't see there being any advantage other than testing, I've left the testing commands in there, just flip the fb_tiled to true in radeon_fb.c Open: Can we integrate endian swapping in with this? Future features: texture tiling - need to relocate texture registers TXOFFSET* with tiling info. This also merges Michel's cleanup surfaces regs at init time patch even though it makes sense on its own, this patch really relies on it. Some PowerMac firmwares set up a tiling surface at the beginning of VRAM which messes us up otherwise. that patch is: Signed-off-by: Michel Dänzer <daenzer@vmware.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-07-29 15:42:18 +10:00
Roel Kluin	916635bfca	drm/ttm: fix misplaced parentheses Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-07-15 16:00:37 +10:00
Huang Weiyi	8b169b5f1f	drm: remove unused #include <linux/version.h>'s Remove unused #include <linux/version.h>('s) in drivers/gpu/drm/ttm/ttm_bo_util.c drivers/gpu/drm/ttm/ttm_bo_vm.c drivers/gpu/drm/ttm/ttm_tt.c Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-06-24 16:31:50 +10:00
Thomas Hellstrom	ba4e7d973d	drm: Add the TTM GPU memory manager subsystem. TTM is a GPU memory manager subsystem designed for use with GPU devices with various memory types (On-card VRAM, AGP, PCI apertures etc.). It's essentially a helper library that assists the DRM driver in creating and managing persistent buffer objects. TTM manages placement of data and CPU map setup and teardown on data movement. It can also optionally manage synchronization of data on a per-buffer-object level. TTM takes care to provide an always valid virtual user-space address to a buffer object which makes user-space sub-allocation of big buffer objects feasible. TTM uses a fine-grained per buffer-object locking scheme, taking care to release all relevant locks when waiting for the GPU. Although this implies some locking overhead, it's probably a big win for devices with multiple command submission mechanisms, since the lock contention will be minimal. TTM can be used with whatever user-space interface the driver chooses, including GEM. It's used by the upcoming Radeon KMS DRM driver and is also the GPU memory management core of various new experimental DRM drivers. Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Jerome Glisse <jglisse@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-06-15 09:37:57 +10:00

27 Commits