linux_old1/arch
Mark Hairgrove 7ead15a144 powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates
There are two types of ATSDs issued to the NPU: invalidates targeting a
specific virtual address and invalidates targeting the whole address
space. In both cases prior to this change, the sequence was:

    for each NPU
        - Write the target address to the XTS_ATSD_AVA register
        - EIEIO
        - Write the launch value to issue the ATSD

First, a target address is not required when invalidating the whole
address space, so that write and the EIEIO have been removed. The AP
(size) field in the launch is not needed either.

Second, for per-address invalidates the above sequence is inefficient in
the common case of multiple NPUs because an EIEIO is issued per NPU. This
unnecessarily forces the launches of later ATSDs to be ordered with the
launches of earlier ones. The new sequence only issues a single EIEIO:

    for each NPU
        - Write the target address to the XTS_ATSD_AVA register
    EIEIO
    for each NPU
        - Write the launch value to issue the ATSD
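
The difference between the two sequences can be sketched in C. This is only an illustrative model, not the kernel code: `struct fake_npu`, `fake_eieio()`, `barrier_count` and the three `atsd_*` functions are hypothetical names, plain memory stores stand in for the MMIO writes, and a counter stands in for the real `eieio` instruction so the number of barriers can be observed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one NPU's ATSD registers (not the real layout). */
struct fake_npu {
    uint64_t ava;    /* models the XTS_ATSD_AVA register */
    uint64_t launch; /* models the ATSD launch register */
};

/* Counts barriers; real code would execute the eieio instruction instead. */
unsigned barrier_count;

void fake_eieio(void)
{
    barrier_count++;
}

/* Old per-address sequence: one barrier per NPU. */
void atsd_va_old(struct fake_npu *npus, size_t n, uint64_t va, uint64_t launch)
{
    for (size_t i = 0; i < n; i++) {
        npus[i].ava = va;
        fake_eieio();
        npus[i].launch = launch;
    }
}

/* New per-address sequence: write every address, one barrier, then launch. */
void atsd_va_new(struct fake_npu *npus, size_t n, uint64_t va, uint64_t launch)
{
    for (size_t i = 0; i < n; i++)
        npus[i].ava = va;
    fake_eieio();
    for (size_t i = 0; i < n; i++)
        npus[i].launch = launch;
}

/* New whole-address-space sequence: no address write, so no barrier. */
void atsd_all_new(struct fake_npu *npus, size_t n, uint64_t launch)
{
    for (size_t i = 0; i < n; i++)
        npus[i].launch = launch;
}
```

In this model, two NPUs cost two barriers with `atsd_va_old()`, one with `atsd_va_new()`, and none with `atsd_all_new()`.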

Performance results were gathered using a microbenchmark which creates a
1G allocation then uses mprotect with PROT_NONE to trigger invalidates in
strides across the allocation.

With only a single NPU active (one GPU) the difference is in the noise for
both types of invalidates (+/-1%).

With two NPUs active (on a 6-GPU system) the effect is more noticeable:

         mprotect rate (GB/s)
Stride   Before      After      Speedup
64K         5.9        6.5          10%
1M         31.2       33.4           7%
2M         36.3       38.7           7%
4M        322.6      356.7          11%
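
The Speedup column appears to be the relative gain of the After rate over the Before rate, rounded to a whole percent. A minimal sketch of that arithmetic, assuming this interpretation (`speedup_percent` is a hypothetical helper, not part of the patch or the benchmark):

```c
/* Relative gain of `after` over `before`, rounded to the nearest whole
 * percent. Assumes both rates are positive and after >= before. */
int speedup_percent(double before, double after)
{
    double gain = (after / before - 1.0) * 100.0;
    return (int)(gain + 0.5);
}
```

Applied to the four rows of the table, this gives 10, 7, 7 and 11, matching the reported figures.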

Signed-off-by: Mark Hairgrove <mhairgrove@nvidia.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-10-04 16:55:52 +10:00
alpha Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-08-24 09:25:39 -07:00
arc ARC: don't check for HIGHMEM pages in arch_dma_alloc 2018-09-04 13:21:38 -07:00
arm Fixes for KVM/ARM for Linux v4.19 v2: 2018-09-07 18:38:25 +02:00
arm64 KVM fixes for 4.19-rc3 2018-09-08 15:52:45 -07:00
c6x kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
h8300 Kbuild updates for v4.19 (2nd) 2018-08-25 13:40:38 -07:00
hexagon kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
ia64 ia64: Fix allnoconfig section mismatch for ioc_init/ioc_iommu_info 2018-08-22 14:12:47 -07:00
m68k m68k: fix early memory reservation for ColdFire MMU systems 2018-09-03 10:19:36 +10:00
microblaze kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
mips KVM fixes for 4.19-rc3 2018-09-08 15:52:45 -07:00
nds32 nds32: linker script: GCOV kernel may refers data in __exit 2018-09-05 10:16:26 +08:00
nios2 nios2: kconfig: remove duplicate DEBUG_STACK_USAGE symbol defintions 2018-08-27 09:47:20 +08:00
openrisc OpenRISC updates for 4.19 2018-08-23 14:09:37 -07:00
parisc Merge branch 'parisc-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2018-08-22 14:06:37 -07:00
powerpc powerpc/powernv/npu: Reduce eieio usage when issuing ATSD invalidates 2018-10-04 16:55:52 +10:00
riscv RISC-V: Use a less ugly workaround for unused variable warnings 2018-08-28 12:58:36 -07:00
s390 KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting 2018-09-04 11:40:26 +02:00
sh kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
sparc sparc: set a default 32-bit dma mask for OF devices 2018-09-02 10:02:04 +02:00
um kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
unicore32 mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
x86 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-09-09 07:05:15 -07:00
xtensa Kbuild updates for v4.19 (2nd) 2018-08-25 13:40:38 -07:00
.gitignore
Kconfig Merge branch 'tlb-fixes' 2018-08-23 14:55:01 -07:00