Commit Graph

7407 Commits

Author SHA1 Message Date
Niklas Schnelle c3b2c9064e s390/pci: remove clp_rescan_pci_devices_simple()
clp_rescan_pci_devices_simple() is neither simpler than
clp_scan_pci_devices() nor does it really scan PCI devices, in particular
it will neither add newly discovered devices nor remove those which
disappeared.
Instead it only refreshes PCI function handles and also
has just a single callsite in the same translation unit left which
in fact only refreshes one specific function handle identified by
a FID.

Clarify this by renaming the function and its helper to
clp_refresh_fh() respectvely __clp_refresh_fh() and make it take
a fid directly which saves us dealing with the NULL case which
updated all function handles but is not used anymore.
Furthermore since the only callsite is in the same translation unit
make it static.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:35 +02:00
Niklas Schnelle 809fcfaf92 s390/pci: remove clp_rescan_pci_devices()
there is only one call site of clp_rescan_pci_devices() and
all the function does is call zpci_remove_reserved_devices()
followed by a duplicating clp_scan_pci_devices().
So inline the single call as a call to zpci_remove_reserved_devices()
and clp_scan_pci_devices() and remove the function.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Niklas Schnelle 2bce60b503 s390/pci: remove unused function zpci_rescan()
the only caller of this was removed as part of the suspend/resume
removal so no need to keep this function around.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Niklas Schnelle abb95b7550 s390/pci: consolidate SR-IOV specific code
currently we have multiple #ifdef CONFIG_PCI_IOV blocks spread over
different compliation units and headers, all dealing with SR-IOV
specific behavior.
This violates the style guide which discourages conditionally compiled
code blocks and hinders maintainability by speading SR-IOV functionality
over many files.

Let's move all of this into a conditionally compiled pci_iov.c file and
local header and prefix SR-IOV specific functions with zpci_iov_*.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Heiko Carstens da1694ad9e s390/mm,ptdump: hold cpa mutex while walking for kernel page table dump
This is currently only preventing that outdated information is
provided to user space. A concurrent split of huge/large pages does
modify the kernel page tables, however either the huge/large mapping
is reported or the split area is being walked.

This "fixes" also only a potential future bug, since split pages could
also be merged again if page permissions are the same for larger
memory areas.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Heiko Carstens 36c2733c43 s390/mm,ptdump: hold memory hotplug lock while walking for kernel page table dump
This is the s390 variant of commit bf2b59f60e ("arm64/mm: Hold
memory hotplug lock while walking for kernel page table dump").

Right now this doesn't fix any real bug, however as soon as kvm
patches get merged which make use of memory remove we might end up
dereferencing/accessing freed page tables.

Therefore fix this potential bug already now.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Heiko Carstens 9d719d39aa s390/mm,ptdump: convert to generic page table dumper
Make use of generic ptdump infrastructure.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 11:38:34 +02:00
Julian Wiedmann 180a4c42e5 s390/qdio: always use dev_name() for device name in QIB
Passing a custom name from the device driver is nice - but in practice
it's only zfcp who has been using this. So we might as well hard-code
a naming scheme in the qdio layer, so that qeth also benefits from it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:30:07 +02:00
Niklas Schnelle b02002cc4c s390/pci: Implement ioremap_wc/prot() with MIO
With our current support for the new MIO PCI instructions, write
combining/write back MMIO memory can be obtained via the pci_iomap_wc()
and pci_iomap_wc_range() functions.
This is achieved by using the write back address for a specific bar
as provided in clp_store_query_pci_fn()

These functions are however not widely used and instead drivers often
rely on ioremap_wc() and ioremap_prot(), which on other platforms enable
write combining using a PTE flag set through the pgrprot value.

While we do not have a write combining flag in the low order flag bits
of the PTE like x86_64 does, with MIO support, there is a write back bit
in the physical address (bit 1 on z15) and thus also the PTE.
Which bit is used to toggle write back and whether it is available at
all, is however not fixed in the architecture. Instead we get this
information from the CLP Store Logical Processor Characteristics for PCI
command. When the write back bit is not provided we fall back to the
existing behavior.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:30:07 +02:00
Julian Wiedmann 4d4a3caaf3 s390/qdio: clean up QDR setup
__qdio_allocate_fill_qdr() is meant to set up one specific queue
descriptor in the QDR. But for this simple task, it gets passed a bunch
of global structs and offsets - and then navigates through the structs
to find its actual operands.

Clean up all the complicated pointer chasing & index calculation, and
just pass a descriptor and its associated queue struct.

While at it also add some virt_to_phys() translations, to clarify that
addresses in the QDR are meant to be absolute.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:30:07 +02:00
Sven Schnelle 4bf3ec384e s390: disable branch profiling for vdso
When branch profiling is enabled, if () gets annotated with code to
instrument the hit/miss ratio. This doesn't work for VDSO as we can't
access kernel code. Add -DDISABLE_BRANCH_PROFILING to fix this.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:30:07 +02:00
Janosch Frank cd4d3d5f21 s390: add 3f program exception handler
Program exception 3f (secure storage violation) can only be detected
when the CPU is running in SIE with a format 4 state description,
e.g. running a protected guest. Because of this and because user
space partly controls the guest memory mapping and can trigger this
exception, we want to send a SIGSEGV to the process running the guest
and not panic the kernel.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 5.7
Fixes: 084ea4d611 ("s390/mm: add (non)secure page access exceptions handlers")
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Niklas Schnelle afdf9550e5 s390/pci: fix leak of DMA tables on hard unplug
commit f606b3ef47 ("s390/pci: adapt events for zbus") removed the
zpci_disable_device() call for a zPCI event with PEC 0x0304 because
the device is already deconfigured by the platform.
This however skips the Linux side of the disable in particular it leads
to leaking the DMA tables and bitmaps because zpci_dma_exit_device() is
never called on the device.

If the device transitions to the Reserved state we call zpci_zdev_put()
but zpci_release_device() will not call zpci_disable_device() because
the state of the zPCI function is already ZPCI_FN_STATE_STANDBY.

If the device is put into the Standby state, zpci_disable_device() is
not called and the device is assumed to have been put in Standby through
platform action.
At this point the device may be removed by a subsequent event with PEC
0x0308 or 0x0306 which calls zpci_zdev_put() with the same problem
as above or the device may be configured again in which case
zpci_disable_device() is also not called.

Fix this by calling zpci_disable_device() explicitly for PEC 0x0304 as
before. To make it more clear that zpci_disable_device() may be called,
even if the lower level device has already been disabled by the
platform, add a comment to zpci_disable_device().

Cc: <stable@vger.kernel.org> # 5.8
Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Ilya Leoshkevich fcb2b70cdb s390/init: add missing __init annotations
Add __init to reserve_memory_end, reserve_oldmem and remove_oldmem.
Sometimes these functions are not inlined, and then the build
complains about section mismatch.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Peter Zijlstra ca589ea8d1 s390/idle: fix suspicious RCU usage
After commit eb1f00237a ("lockdep,trace: Expose tracepoints") the
lock tracepoints are visible to lockdep and RCU-lockdep is finding a
bunch more RCU violations that were previously hidden.

Switch the idle->seqcount over to using raw_write_*() to avoid the
lockdep annotation and thus the lock tracepoints.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-14 10:08:07 +02:00
Christoph Hellwig 5e6e9852d6 uaccess: add infrastructure for kernel builds with set_fs()
Add a CONFIG_SET_FS option that is selected by architecturess that
implement set_fs, which is all of them initially.  If the option is not
set stubs for routines related to overriding the address space are
provided so that architectures can start to opt out of providing set_fs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-08 22:21:32 -04:00
Masami Hiramatsu 26a24a6b43 s390: kprobes: Use generic kretprobe trampoline handler
Use the generic kretprobe trampoline handler. Don't use
framepointer verification.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/159870612453.1229682.15950927742606892302.stgit@devnote2
2020-09-08 11:52:34 +02:00
Masahiro Yamada 887af6d7c9 arch: vdso: add vdso linker script to 'targets' instead of extra-y
The vdso linker script is preprocessed on demand.
Adding it to 'targets' is enough to include the .cmd file.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
2020-09-07 21:41:27 +09:00
Nicolin Chen 1e9d90dbed dma-mapping: introduce dma_get_seg_boundary_nr_pages()
We found that callers of dma_get_seg_boundary mostly do an ALIGN
with page mask and then do a page shift to get number of pages:
    ALIGN(boundary + 1, 1 << shift) >> shift

However, the boundary might be as large as ULONG_MAX, which means
that a device has no specific boundary limit. So either "+ 1" or
passing it to ALIGN() would potentially overflow.

According to kernel defines:
    #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
    #define ALIGN(x, a)	ALIGN_MASK(x, (typeof(x))(a) - 1)

We can simplify the logic here into a helper function doing:
  ALIGN(boundary + 1, 1 << shift) >> shift
= ALIGN_MASK(b + 1, (1 << s) - 1) >> s
= {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s
= [b + 1 + (1 << s) - 1] >> s
= [b + (1 << s)] >> s
= (b >> s) + 1

This patch introduces and applies dma_get_seg_boundary_nr_pages()
as an overflow-free helper for the dma_get_seg_boundary() callers
to get numbers of pages. It also takes care of the NULL dev case
for non-DMA API callers.

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-03 18:12:15 +02:00
Heiko Carstens 5c60ed283e s390: update defconfigs
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-02 13:17:05 +02:00
Eric Farman 114b9df419 s390: fix GENERIC_LOCKBREAK dependency typo in Kconfig
Commit fa68645305 ("sched/rt, s390: Use CONFIG_PREEMPTION")
changed a bunch of uses of CONFIG_PREEMPT to _PREEMPTION.
Except in the Kconfig it used two T's. That's the only place
in the system where that spelling exists, so let's fix that.

Fixes: fa68645305 ("sched/rt, s390: Use CONFIG_PREEMPTION")
Cc: <stable@vger.kernel.org> # 5.6
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-02 13:17:05 +02:00
Kees Cook c604abc3f6 vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG
The .comment section doesn't belong in STABS_DEBUG. Split it out into a
new macro named ELF_DETAILS. This will gain other non-debug sections
that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org
2020-09-01 09:50:35 +02:00
Linus Torvalds b69bea8a65 A set of fixes for lockdep, tracing and RCU:
- Prevent recursion by using raw_cpu_* operations
 
   - Fixup the interrupt state in the cpu idle code to be consistent
 
   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections
 
   - Move trace_cpu_idle() into generic code so it's called before RCU goes
     idle.
 
   - Handle raw_local_irq* vs. local_irq* operations correctly
 
   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl9L5qETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoV/NEADG+h02tj2I4gP7IQ3nVodEzS1+odPI
 orabY5ggH0kn4YIhPB4UtOd5zKZjr3FJs9wEhyhQpV6ZhvFfgaIKiYqfg+Q81aMO
 /BXrfh6jBD2Hu7gaPBnVdkKeh1ehl+w0PhTeJhPBHEEvbGeLUYWwyPNlaKz//VQl
 XCWl7e7o/Uw2UyJ469SCx3z+M2DMNqwdMys/zcqvTLiBdLNCwp4TW5ACzEA0rfHh
 Pepu3eIKnMURyt82QanrOATvT2io9pOOaUh59zeKi2WM8ikwKd/Eho2kXYng6GvM
 GzX4Kn13MsNobZXf9BhqEGICdRkaJqLsXlmBNmbJdSTCn5W2lLZqu2wCEp5VZHCc
 XwMbey8ek+BRskJMqAV4oq2GA8Om9KEYWOOdixyOG0UJCiW5qDowuDYBXTLV7FWj
 XhzLGuHpUF9eKLKokJ7ideLaDcpzwYjHr58pFLQrqPwmjVKWguLeYMg5BhhTiEuV
 wNfiLIGdMNsCpYKhnce3o9paV8+hy1ZveWhNy+/4HaDLoEwI2T62i8R7xxbrcWMg
 sgdAiQG+kVLwSJ13bN+Cz79uLYTIbqGaZHtOXmeIT3jSxBjx5RlXfzocwTHSYrNk
 GuLYHd7+QaemN49Rrf4bPR16Db7ifL32QkUtLBTBLcnos9jM+fcl+BWyqYRxhgDv
 xzDS+vfK8DvRiA==
 =Hgt6
 -----END PGP SIGNATURE-----

Merge tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "A set of fixes for lockdep, tracing and RCU:

   - Prevent recursion by using raw_cpu_* operations

   - Fixup the interrupt state in the cpu idle code to be consistent

   - Push rcu_idle_enter/exit() invocations deeper into the idle path so
     that the lock operations are inside the RCU watching sections

   - Move trace_cpu_idle() into generic code so it's called before RCU
     goes idle.

   - Handle raw_local_irq* vs. local_irq* operations correctly

   - Move the tracepoints out from under the lockdep recursion handling
     which turned out to be fragile and inconsistent"

* tag 'locking-urgent-2020-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep,trace: Expose tracepoints
  lockdep: Only trace IRQ edges
  mips: Implement arch_irqs_disabled()
  arm64: Implement arch_irqs_disabled()
  nds32: Implement arch_irqs_disabled()
  locking/lockdep: Cleanup
  x86/entry: Remove unused THUNKs
  cpuidle: Move trace_cpu_idle() into generic code
  cpuidle: Make CPUIDLE_FLAG_TLB_FLUSHED generic
  sched,idle,rcu: Push rcu_idle deeper into the idle path
  cpuidle: Fixup IRQ state
  lockdep: Use raw_cpu_*() for per-cpu variables
2020-08-30 11:43:50 -07:00
Sven Schnelle 4bff8cb545 s390: convert to GENERIC_VDSO
Convert s390 to generic vDSO. There are a few special things on s390:

- vDSO can be called without a stack frame - glibc did this in the past.
  So we need to allocate a stackframe on our own.

- The former assembly code used stcke to get the TOD clock and applied
  time steering to it. We need to do the same in the new code. This is done
  in the architecture specific __arch_get_hw_counter function. The steering
  information is stored in an architecure specific area in the vDSO data.

- CPUCLOCK_VIRT is now handled with a syscall fallback, which might
  be slower/less accurate than the old implementation.

The getcpu() function stays as an assembly function because there is no
generic implementation and the code is just a few lines.

Performance number from my system do 100 mio gettimeofday() calls:

Plain syscall: 8.6s
Generic VDSO:  1.3s
old ASM VDSO:  1s

So it's a bit slower but still much faster than syscalls.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:21 +02:00
Heiko Carstens 98ad45fb58 s390/checksum: coding style changes
Add some coding style changes which hopefully make the code
look a bit less odd.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:20 +02:00
Heiko Carstens 612ad0785d s390/checksum: have consistent calculations
Use "|" instead of "+" within csum_fold() for consistency reasons,
like in the rest of the file.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:20 +02:00
Heiko Carstens 614b4f5d0f s390/checksum: make ip_fast_csum() faster
Convert ip_fast_csum() so it doesn't call csum_partial(), but instead
open code the checksum calculation. The problem with csum_partial() is
that it makes use of the cksm instruction, which has high startup
costs and therefore is only very fast if used on larger memory
regions.

IPv4 headers however are small in size (5-16 32-bit words). The open
coded variant calculates the checksum in ~30% of the time compared to
the old variant (z14, march=z196).

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:20 +02:00
Heiko Carstens bb4644b14a s390/checksum: rewrite csum_tcpudp_nofold()
Rewrite csum_tcpudp_nofold() so that the generated code will not
contain branches. The old implementation was also optimized for
machines which came with "add logical with carry" instructions,
however the compiler doesn't generate them anymore. This is most
likely because those instructions are slower.

However with the old code the compiler generates a lot of branches,
which isn't too helpful usually. Therefore rewrite the code.

In a tight loop this doesn't make any difference since the branch
prediction unit does its job.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:20 +02:00
Heiko Carstens b064904c50 s390/checksum: provide csum_ipv6_magic()
This implementation needs only ~30% of the time to calculate the
checksum compared to the generic variant. In addition the compiler
also generates only ~30% of the instructions compared to the generic
variant (on z14, compiled with march=z196).

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:47:20 +02:00
Vasily Gorbik bffc2f7aa9 s390/vmem: fix vmem_add_range for 4-level paging
The kernel currently crashes if 4-level paging is used. Add missing
p4d_populate for just allocated pud entry.

Fixes: 3e0d3e408e ("s390/vmem: consolidate vmem_add_range() and vmem_remove_range()")
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:07:05 +02:00
Sven Schnelle 1196f12a2c s390: don't trace preemption in percpu macros
Since commit a21ee6055c ("lockdep: Change hardirq{s_enabled,_context}
to per-cpu variables") the lockdep code itself uses percpu variables. This
leads to recursions because the percpu macros are calling preempt_enable()
which might call trace_preempt_on().

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-08-26 18:07:04 +02:00
Peter Zijlstra 9864f5b594 cpuidle: Move trace_cpu_idle() into generic code
Remove trace_cpu_idle() from the arch_cpu_idle() implementations and
put it in the generic code, right before disabling RCU. Gets rid of
more trace_*_rcuidle() users.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.428433395@infradead.org
2020-08-26 12:41:54 +02:00
Al Viro 6e41c585e3 unify generic instances of csum_partial_copy_nocheck()
quite a few architectures have the same csum_partial_copy_nocheck() -
simply memcpy() the data and then return the csum of the copy.

hexagon, parisc, ia64, s390, um: explicitly spelled out that way.

arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
out in lib/checksum.h (with varying amounts of perversions along the way).

everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
sh, sparc, x86, xtensa) have non-generic variants.  For all except c6x
the declaration is in their asm/checksum.h.  c6x uses the wrapper
from asm-generic/checksum.h that would normally lead to the lib/checksum.h
instance, but in case of c6x we end up using an asm function from arch/c6x
instead.

Screw that mess - have architectures with private instances define
_HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
*not* defined.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-20 15:45:14 -04:00
Niklas Schnelle b97bf44f99 s390/pci: fix PF/VF linking on hot plug
Currently there are four places in which a PCI function is scanned
and made available to drivers:
 1. In pci_scan_root_bus() as part of the initial zbus
    creation.
 2. In zpci_bus_add_devices() when registering
    a device in configured state on a zbus that has already been
    scanned.
 3. When a function is already known to zPCI (in reserved/standby state)
    and configuration is triggered through firmware by PEC 0x301.
 4. When a device is already known to zPCI (in standby/reserved state)
    and configuration is triggered from within Linux using
    enable_slot().

The PF/VF linking step and setting of pdev->is_virtfn introduced with
commit e5794cf1a2 ("s390/pci: create links between PFs and VFs") was
only triggered for the second case, which is where VFs created through
sriov_numvfs usually land. However unlike some other platforms but like
POWER VFs can be individually enabled/disabled through
/sys/bus/pci/slots.

Fix this by doing VF setup as part of pcibios_bus_add_device() which is
called in all of the above cases.

Finally to remove the PF/VF links call the common code
pci_iov_remove_virtfn() function to remove linked VFs.
This takes care of the necessary sysfs cleanup.

Fixes: e5794cf1a2 ("s390/pci: create links between PFs and VFs")
Cc: <stable@vger.kernel.org> # 5.8: 2f0230b2f2d5: s390/pci: re-introduce zpci_remove_device()
Cc: <stable@vger.kernel.org> # 5.8
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:34 +02:00
Niklas Schnelle 2f0230b2f2 s390/pci: re-introduce zpci_remove_device()
For fixing the PF to VF link removal we need to perform some action on
every removal of a zdev from the common PCI subsystem.
So in preparation re-introduce zpci_remove_device() and use that instead
of directly calling the common code functions. This  was actually still
declared from earlier code but no longer implemented.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:25 +02:00
Niklas Schnelle 3cddb79afc s390/pci: fix zpci_bus_link_virtfn()
We were missing the pci_dev_put() for candidate PFs.  Furhtermore in
discussion with upstream it turns out that somewhat counterintuitively
some common code, in particular the vfio-pci driver, assumes that
pdev->is_virtfn always implies that pdev->physfn is set, i.e. that VFs
are always linked.
While POWER does seem to set pdev->is_virtfn even for unlinked functions
(see comments in arch/powerpc/kernel/eeh.c:eeh_debugfs_break_device())
for now just be safe and only set pdev->is_virtfn on linking.
Also make sure that we only search for parent PFs if the zbus is
multifunction and we thus know the devfn values supplied by firmware
come from the RID.

Fixes: e5794cf1a2 ("s390/pci: create links between PFs and VFs")
Cc: <stable@vger.kernel.org> # 5.8
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:18 +02:00
Heiko Carstens fd78c59446 s390/ptrace: fix storage key handling
The key member of the runtime instrumentation control block contains
only the access key, not the complete storage key. Therefore the value
must be shifted by four bits. Since existing user space does not
necessarily query and set the access key correctly, just ignore the
user space provided key and use the correct one.
Note: this is only relevant for debugging purposes in case somebody
compiles a kernel with a default storage access key set to a value not
equal to zero.

Fixes: 262832bc5a ("s390/ptrace: add runtime instrumention register get/set")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:14 +02:00
Heiko Carstens 9eaba29c79 s390/runtime_instrumentation: fix storage key handling
The key member of the runtime instrumentation control block contains
only the access key, not the complete storage key. Therefore the value
must be shifted by four bits.
Note: this is only relevant for debugging purposes in case somebody
compiles a kernel with a default storage access key set to a value not
equal to zero.

Fixes: e4b8b3f33f ("s390: add support for runtime instrumentation")
Reported-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:10 +02:00
Niklas Schnelle b76fee1bc5 s390/pci: ignore stale configuration request event
A configuration request event may be stale, that is the event
may reference a zdev which was already configured.
This can happen when a hotplug happens during boot such that
the device is discovered and configured in the initial clp_list_pci(),
then after initialization we enable events and process
the original configuration request which additionally still contains
the old disabled function handle leading to a failure during device
enablement and subsequent I/O lockout.

Fix this by restoring the check that the device to be configured is in
standby which was removed in commit f606b3ef47 ("s390/pci: adapt events
for zbus").

This check does not need serialization as we only enable the events after
zPCI has fully initialized, which includes the initial clp_list_pci(),
rescan only does updates and events are serialized with respect to each
other.

Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Cc: <stable@vger.kernel.org> # 5.8
Reported-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Tested-by: Shalini Chellathurai Saroja <shalini@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-17 13:17:05 +02:00
Xiaoming Ni 88db0aa242 all arch: remove system call sys_sysctl
Since commit 61a47c1ad3 ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.

We have been warning about people using the sysctl system call for years
and believe there are no more users.  Even if there are users of this
interface if they have not complained or fixed their code by now they
probably are not going to, so there is no point in warning them any
longer.

So completely remove sys_sysctl on all architectures.

[nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
 Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com

Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org>		[arm/arm64]
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: chenzefeng <chenzefeng2@huawei.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kars de Jong <jongk@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Paul Burton <paulburton@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sven Schnelle <svens@stackframe.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-14 19:56:56 -07:00
Linus Torvalds 990f227371 - Allow s390 debug feature to handle finally more than 256 CPU numbers, instead
of truncating the most significant bits.
 
 - Improve THP splitting required by qemu processes by making use of
   walk_page_vma() instead of calling follow_page() for every single page
   within each vma.
 
 - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile problems.
 
 - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.
 
 - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
   translates a node distance of 0 to "no NUMA support available".
 
 - Couple of other minor fixes and improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl81bRwACgkQIg7DeRsp
 bsIT3g/7BSIfbI852VTn6hWw+LfdoBLVqXDFcg9xQhUqccbPcG8LG/BYlmuvIn1Z
 X9XujP76xAX7LKnt9x0Z9znwpgGIj0oURmcBxzGBTk+yLBf7YXCOLu9w4MV1AD57
 WRoGG7pPFZy5rMDM29frqBNQBWc7g+/wGoPVqiffzqgj2DcS4Ka8TSTTboMIH7th
 mLK5xaU504FdDGwTgNv6Q/Pt3+YBZ6P/1cIuifzYnfjJ2CsjSBLjS9IFP6/w/c1Q
 VQ2TfajuRbAgkkY01o15tKrlqwPzzWYhnh9ix1LkL0WT6VNql+fcwN665HdsSDFu
 Ctu6Sk3BZmhN1GONK4gIx5didRxBi7neAfUMenSasPT8uMGJcR4EMoNUr70sXNA/
 B1naRayfCH0dE3SVZflAeFJRSh1LnikY14uj65Gg/Wtla5N+90zuHEGDttIBNzrr
 FDzc+39GWN1wJEl7FzSIm+YRC88C/So/BNUQhmlVYNY0sbqyAszUGBOAa+5VS2WQ
 tkWzWPjJ6QA3e3t/dpnc0dvlCKLTJG3o7bdtilb71QNybGiAniP8+79hCNa5nPT3
 iLRhmLX27cQF54IUw1whVA5mkfzHSzwAlB+/laPi8MWFQXTRLTD1XzhmjrIyR57Y
 Q6wM+lxC4qreDz7yGX+eQk7OEs8j8IKTF14HZxkp/DEL09Vbmbs=
 =E0hf
 -----END PGP SIGNATURE-----

Merge tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

 - Allow s390 debug feature to handle finally more than 256 CPU numbers,
   instead of truncating the most significant bits.

 - Improve THP splitting required by qemu processes by making use of
   walk_page_vma() instead of calling follow_page() for every single
   page within each vma.

 - Add missing ZCRYPT dependency to VFIO_AP to fix potential compile
   problems.

 - Remove not required select CLOCKSOURCE_VALIDATE_LAST_CYCLE again.

 - Set node distance to LOCAL_DISTANCE instead of 0, since e.g. libnuma
   translates a node distance of 0 to "no NUMA support available".

 - Couple of other minor fixes and improvements.

* tag 's390-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/numa: move code to arch/s390/kernel
  s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again
  s390/debug: debug feature version 3
  s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
  s390/numa: set node distance to LOCAL_DISTANCE
  s390/pkey: remove redundant variable initialization
  s390/test_unwind: fix possible memleak in test_unwind()
  s390/gmap: improve THP splitting
  s390/atomic: circumvent gcc 10 build regression
2020-08-13 12:38:32 -07:00
Peter Xu 64019a2e46 mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Peter Xu 35e45f3e5a mm/s390: use general page fault accounting
Use the general page fault accounting by passing regs into
handle_mm_fault().  It naturally solve the issue of multiple page fault
accounting when page fault retry happened.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-19-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:03 -07:00
Peter Xu bce617edec mm: do page fault accounting in handle_mm_fault
Patch series "mm: Page fault accounting cleanups", v5.

This is v5 of the pf accounting cleanup series.  It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b98270 ("mm: allow
VM_FAULT_RETRY for multiple times"):

  https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/

What this series did:

  - Correct page fault accounting: we do accounting for a page fault
    (no matter whether it's from #PF handling, or gup, or anything else)
    only with the one that completed the fault.  For example, page fault
    retries should not be counted in page fault counters.  Same to the
    perf events.

  - Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
    event is used in an adhoc way across different archs.

    Case (1): for many archs it's done at the entry of a page fault
    handler, so that it will also cover e.g.  errornous faults.

    Case (2): for some other archs, it is only accounted when the page
    fault is resolved successfully.

    Case (3): there're still quite some archs that have not enabled
    this perf event.

    Since this series will touch merely all the archs, we unify this
    perf event to always follow case (1), which is the one that makes most
    sense.  And since we moved the accounting into handle_mm_fault, the
    other two MAJ/MIN perf events are well taken care of naturally.

  - Unify definition of "major faults": the definition of "major
    fault" is slightly changed when used in accounting (not
    VM_FAULT_MAJOR).  More information in patch 1.

  - Always account the page fault onto the one that triggered the page
    fault.  This does not matter much for #PF handlings, but mostly for
    gup.  More information on this in patch 25.

Patchset layout:

Patch 1:     Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23:  Enable the new accounting for arch #PF handlers one by one.
Patch 24:    Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25:    Cleanup GUP task_struct pointer since it's not needed any more

This patch (of 25):

This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault().  This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events.  To
do this, the pt_regs pointer is passed into handle_mm_fault().

PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.

So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200707225021.200906-1-peterx@redhat.com
Link: http://lkml.kernel.org/r/20200707225021.200906-2-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:02 -07:00
Christoph Hellwig 428e2976a5 uaccess: remove segment_eq
segment_eq is only used to implement uaccess_kernel.  Just open code
uaccess_kernel in the arch uaccess headers and remove one layer of
indirection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:58 -07:00
Alexander Gordeev b450eeb0c9 s390/numa: move code to arch/s390/kernel
Move all code from arch/s390/numa/ to arch/s390/kernel/
since numa.c is the only source file and all others were
deleted with the fake NUMA support removal.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:55 +02:00
Heiko Carstens 12bbf0962a s390/time: remove select CLOCKSOURCE_VALIDATE_LAST_CYCLE again
Sven Schnelle reported that setting CLOCKSOURCE_VALIDATE_LAST_CYCLE
doesn't make sense: even if our tod clock overflows delta calculation
(now - last) with unsigned 64 bit values will still be correct.

Therefore revert commit 555701a714 ("s390/time: select
CLOCKSOURCE_VALIDATE_LAST_CYCLE").

Fixes: 555701a714 ("s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:48 +02:00
Mikhail Zaslonko 0990d836ce s390/debug: debug feature version 3
Change __debug_entry structure in the following way:
 - remove redundant union
 - Field containing cpuid is expanded to 16 bits. 8-bit width was not
   enough since we already support up to 512 cpus.
 - Field containing the timestamp is expanded to 60 bits. The timestamp
   itself is now stored in the absolute Unix time format in microseconds
   taking the Epoch Index into acount.
Adjust default header for debug entries by setting minimum width for cpuid
to 4 digits.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:43 +02:00
Krzysztof Kozlowski 929a343b85 s390/Kconfig: add missing ZCRYPT dependency to VFIO_AP
The VFIO_AP uses ap_driver_register() (and deregister) functions
implemented in ap_bus.c (compiled into ap.o).  However the ap.o will be
built only if CONFIG_ZCRYPT is selected.

This was not visible before commit e93a1695d7 ("iommu: Enable compile
testing for some of drivers") because the CONFIG_VFIO_AP depends on
CONFIG_S390_AP_IOMMU which depends on the missing CONFIG_ZCRYPT.  After
adding COMPILE_TEST, it is possible to select a configuration with
VFIO_AP and S390_AP_IOMMU but without the ZCRYPT.

Add proper dependency to the VFIO_AP to fix build errors:

ERROR: modpost: "ap_driver_register" [drivers/s390/crypto/vfio_ap.ko] undefined!
ERROR: modpost: "ap_driver_unregister" [drivers/s390/crypto/vfio_ap.ko] undefined!

Reported-by: kernel test robot <lkp@intel.com>
Fixes: e93a1695d7 ("iommu: Enable compile testing for some of drivers")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:39 +02:00
Alexander Gordeev 535e4fc623 s390/numa: set node distance to LOCAL_DISTANCE
The node distance is hardcoded to 0, which causes a trouble
for some user-level applications. In particular, "libnuma"
expects the distance of a node to itself as LOCAL_DISTANCE.
This update removes the offending node distance override.

Cc: <stable@vger.kernel.org> # 4.4
Fixes: 3a368f742d ("s390/numa: add core infrastructure")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:35 +02:00
Wang Hai 75d3e7f476 s390/test_unwind: fix possible memleak in test_unwind()
test_unwind() misses to call kfree(bt) in an error path.
Add the missed function call to fix it.

Fixes: 0610154650 ("s390/test_unwind: print verbose unwinding results")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:16 +02:00
Gerald Schaefer ba925fa350 s390/gmap: improve THP splitting
During s390_enable_sie(), we need to take care of splitting all qemu user
process THP mappings. This is currently done with follow_page(FOLL_SPLIT),
by simply iterating over all vma ranges, with PAGE_SIZE increment.

This logic is sub-optimal and can result in a lot of unnecessary overhead,
especially when using qemu and ASAN with large shadow map. Ilya reported
significant system slow-down with one CPU busy for a long time and overall
unresponsiveness.

Fix this by using walk_page_vma() and directly calling split_huge_pmd()
only for present pmds, which greatly reduces overhead.

Cc: <stable@vger.kernel.org> # v5.4+
Reported-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:13 +02:00
Vasily Gorbik f0cbd3b83e s390/atomic: circumvent gcc 10 build regression
Circumvent the following gcc 10 allyesconfig build regression:

  CC      drivers/leds/trigger/ledtrig-cpu.o
In file included from ./arch/s390/include/asm/bitops.h:39,
                 from ./include/linux/bitops.h:29,
                 from ./include/linux/kernel.h:12,
                 from drivers/leds/trigger/ledtrig-cpu.c:18:
./arch/s390/include/asm/atomic_ops.h: In function 'ledtrig_cpu':
./arch/s390/include/asm/atomic_ops.h:46:2: warning: 'asm' operand 1 probably does not match constraints
   46 |  asm volatile(       \
      |  ^~~
./arch/s390/include/asm/atomic_ops.h:53:2: note: in expansion of macro '__ATOMIC_CONST_OP'
   53 |  __ATOMIC_CONST_OP(op_name, op_type, op_string, "\n")  \
      |  ^~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:56:1: note: in expansion of macro '__ATOMIC_CONST_OPS'
   56 | __ATOMIC_CONST_OPS(__atomic_add_const, int, "asi")
      | ^~~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:46:2: error: impossible constraint in 'asm'
   46 |  asm volatile(       \
      |  ^~~
./arch/s390/include/asm/atomic_ops.h:53:2: note: in expansion of macro '__ATOMIC_CONST_OP'
   53 |  __ATOMIC_CONST_OP(op_name, op_type, op_string, "\n")  \
      |  ^~~~~~~~~~~~~~~~~
./arch/s390/include/asm/atomic_ops.h:56:1: note: in expansion of macro '__ATOMIC_CONST_OPS'
   56 | __ATOMIC_CONST_OPS(__atomic_add_const, int, "asi")
      | ^~~~~~~~~~~~~~~~~~
scripts/Makefile.build:280: recipe for target 'drivers/leds/trigger/ledtrig-cpu.o' failed

By swapping conditions as proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549318.html

Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-08-11 18:16:08 +02:00
Linus Torvalds fc80c51fd4 Kbuild updates for v5.9
- run the checker (e.g. sparse) after the compiler
 
  - remove unneeded cc-option tests for old compiler flags
 
  - fix tar-pkg to install dtbs
 
  - introduce ccflags-remove-y and asflags-remove-y syntax
 
  - allow to trace functions in sub-directories of lib/
 
  - introduce hostprogs-always-y and userprogs-always-y syntax
 
  - various Makefile cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl8wJXEVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGMGEP/0jDq/WafbfPN0aU83EqEWLt/sKg
 bluzmf/6HGx3XVRnuAzsHNNqysUx77WJiDsU/jbC/zdH8Iox3Sc1diE2sELLNAfY
 iJmQ8NBPggyU74aYG3OJdpDjz8T9EX/nVaYrjyFlbuXElM+Qvo8Z4Fz6NpWqKWlA
 gU+yGxEPPdX6MLHcSPSIu1hGWx7UT4fgfx3zDFTI2qvbQgQjKtzyTjAH5Cm3o87h
 rfomvHSSoAUg+Fh1LediRh1tJlkdVO+w7c+LNwCswmdBtkZuxecj1bQGUTS8GaLl
 CCWOKYfWp0KsVf1veXNNNaX/ecbp+Y34WErFq3V9Fdq5RmVlp+FPSGMyjDMRiQ/p
 LGvzbJLPpG586MnK8of0dOj6Es6tVPuq6WH2HuvsyTGcZJDpFTTxRcK3HDkE8ig6
 ZtuM3owB/Mep8IzwY2yWQiDrc7TX5Fz8S4hzGPU1zG9cfj4VT6TBqHGAy1Eql/0l
 txj6vJpnbQSdXiIX8MIU3yH35Y7eW3JYWgspTZH5Woj1S/wAWwuG93Fuuxq6mQIJ
 q6LSkMavtOfuCjOA9vJBZewpKXRU6yo0CzWNL/5EZ6z/r/I+DGtfb/qka8oYUDjX
 9H0cecL37AQxDHRPTxCZDQF0TpYiFJ6bmnMftK9NKNuIdvsk9DF7UBa3EdUNIj38
 yKS3rI7Lw55xWuY3
 =bkNQ
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - run the checker (e.g. sparse) after the compiler

 - remove unneeded cc-option tests for old compiler flags

 - fix tar-pkg to install dtbs

 - introduce ccflags-remove-y and asflags-remove-y syntax

 - allow to trace functions in sub-directories of lib/

 - introduce hostprogs-always-y and userprogs-always-y syntax

 - various Makefile cleanups

* tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: stop filtering out $(GCC_PLUGINS_CFLAGS) from cc-option base
  kbuild: include scripts/Makefile.* only when relevant CONFIG is enabled
  kbuild: introduce hostprogs-always-y and userprogs-always-y
  kbuild: sort hostprogs before passing it to ifneq
  kbuild: move host .so build rules to scripts/gcc-plugins/Makefile
  kbuild: Replace HTTP links with HTTPS ones
  kbuild: trace functions in subdirectories of lib/
  kbuild: introduce ccflags-remove-y and asflags-remove-y
  kbuild: do not export LDFLAGS_vmlinux
  kbuild: always create directories of targets
  powerpc/boot: add DTB to 'targets'
  kbuild: buildtar: add dtbs support
  kbuild: remove cc-option test of -ffreestanding
  kbuild: remove cc-option test of -fno-stack-protector
  Revert "kbuild: Create directory for target DTB"
  kbuild: run the checker after the compiler
2020-08-09 14:10:26 -07:00
Linus Torvalds 8d3e09b433 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull regset conversion fix from Al Viro:
 "Fix a regression from an unnoticed bisect hazard in the regset series.

  A bunch of old (aout, originally) primitives used by coredumps became
  dead code after fdpic conversion to regsets. Removal of that dead code
  had been the first commit in the followups to regset series;
  unfortunately, it happened to hide the bisect hazard on sh (extern for
  fpregs_get() had not been updated in the main series when it should
  have been; followup simply made fpregs_get() static). And without that
  followup commit this bisect hazard became breakage in the mainline"

Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  kill unused dump_fpu() instances
2020-08-09 13:33:54 -07:00
Linus Torvalds b79675e15a Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "No common topic whatsoever in those, sorry"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: define inode flags using bit numbers
  iov_iter: Move unnecessary inclusion of crypto/hash.h
  dlmfs: clean up dlmfs_file_{read,write}() a bit
2020-08-07 21:14:30 -07:00
Linus Torvalds 81e11336d9 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few MM hotfixes

 - kthread, tools, scripts, ntfs and ocfs2

 - some of MM

Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
ocfs2 and mm (hofixes, pagealloc, slab-generic, slab, slub, kcsan,
debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
  mm: vmscan: consistent update to pgrefill
  mm/vmscan.c: fix typo
  khugepaged: khugepaged_test_exit() check mmget_still_valid()
  khugepaged: retract_page_tables() remember to test exit
  khugepaged: collapse_pte_mapped_thp() protect the pmd lock
  khugepaged: collapse_pte_mapped_thp() flush the right range
  mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
  mm: thp: replace HTTP links with HTTPS ones
  mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
  mm/page_alloc.c: skip setting nodemask when we are in interrupt
  mm/page_alloc: fallbacks at most has 3 elements
  mm/page_alloc: silence a KASAN false positive
  mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
  mm/page_alloc.c: simplify pageblock bitmap access
  mm/page_alloc.c: extract the common part in pfn_to_bitidx()
  mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
  mm/shuffle: remove dynamic reconfiguration
  mm/memory_hotplug: document why shuffle_zone() is relevant
  mm/page_alloc: remove nr_free_pagecache_pages()
  mm: remove vm_total_pages
  ...
2020-08-07 11:39:33 -07:00
Mike Rapoport c89ab04feb mm/sparse: cleanup the code surrounding memory_present()
After removal of CONFIG_HAVE_MEMBLOCK_NODE_MAP we have two equivalent
functions that call memory_present() for each region in memblock.memory:
sparse_memory_present_with_active_regions() and membocks_present().

Moreover, all architectures have a call to either of these functions
preceding the call to sparse_init() and in the most cases they are called
one after the other.

Mark the regions from memblock.memory as present during sparce_init() by
making sparse_init() call memblocks_present(), make memblocks_present()
and memory_present() functions static and remove redundant
sparse_memory_present_with_active_regions() function.

Also remove no longer required HAVE_MEMORY_PRESENT configuration option.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:27 -07:00
Mike Rapoport ca15ca406f mm: remove unneeded includes of <asm/pgalloc.h>
Patch series "mm: cleanup usage of <asm/pgalloc.h>"

Most architectures have very similar versions of pXd_alloc_one() and
pXd_free_one() for intermediate levels of page table.  These patches add
generic versions of these functions in <asm-generic/pgalloc.h> and enable
use of the generic functions where appropriate.

In addition, functions declared and defined in <asm/pgalloc.h> headers are
used mostly by core mm and early mm initialization in arch and there is no
actual reason to have the <asm/pgalloc.h> included all over the place.
The first patch in this series removes unneeded includes of
<asm/pgalloc.h>

In the end it didn't work out as neatly as I hoped and moving
pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require
unnecessary changes to arches that have custom page table allocations, so
I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local
to mm/.

This patch (of 8):

In most cases <asm/pgalloc.h> header is required only for allocations of
page table memory.  Most of the .c files that include that header do not
use symbols declared in <asm/pgalloc.h> and do not require that header.

As for the other header files that used to include <asm/pgalloc.h>, it is
possible to move that include into the .c file that actually uses symbols
from <asm/pgalloc.h> and drop the include from the header file.

The process was somewhat automated using

	sed -i -E '/[<"]asm\/pgalloc\.h/d' \
                $(grep -L -w -f /tmp/xx \
                        $(git grep -E -l '[<"]asm/pgalloc\.h'))

where /tmp/xx contains all the symbols defined in
arch/*/include/asm/pgalloc.h.

[rppt@linux.ibm.com: fix powerpc warning]

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Link: http://lkml.kernel.org/r/20200627143453.31835-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200627143453.31835-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:26 -07:00
Waiman Long 453431a549 mm, treewide: rename kzfree() to kfree_sensitive()
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:22 -07:00
Linus Torvalds 19b39c38ab Merge branch 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ptrace regset updates from Al Viro:
 "Internal regset API changes:

   - regularize copy_regset_{to,from}_user() callers

   - switch to saner calling conventions for ->get()

   - kill user_regset_copyout()

  The ->put() side of things will have to wait for the next cycle,
  unfortunately.

  The balance is about -1KLoC and replacements for ->get() instances are
  a lot saner"

* 'work.regset' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
  regset: kill user_regset_copyout{,_zero}()
  regset(): kill ->get_size()
  regset: kill ->get()
  csky: switch to ->regset_get()
  xtensa: switch to ->regset_get()
  parisc: switch to ->regset_get()
  nds32: switch to ->regset_get()
  nios2: switch to ->regset_get()
  hexagon: switch to ->regset_get()
  h8300: switch to ->regset_get()
  openrisc: switch to ->regset_get()
  riscv: switch to ->regset_get()
  c6x: switch to ->regset_get()
  ia64: switch to ->regset_get()
  arc: switch to ->regset_get()
  arm: switch to ->regset_get()
  sh: convert to ->regset_get()
  arm64: switch to ->regset_get()
  mips: switch to ->regset_get()
  sparc: switch to ->regset_get()
  ...
2020-08-07 09:29:25 -07:00
Linus Torvalds 921d2597ab s390: implement diag318
x86:
 * Report last CPU for debugging
 * Emulate smaller MAXPHYADDR in the guest than in the host
 * .noinstr and tracing fixes from Thomas
 * nested SVM page table switching optimization and fixes
 
 Generic:
 * Unify shadow MMU cache data structures across architectures
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8pC+oUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNcOwgAjomqtEqQNlp7DdZT7VyyklzbxX1/
 ud7v+oOJ8K4sFlf64lSthjPo3N9rzZCcw+yOXmuyuITngXOGc3tzIwXpCzpLtuQ1
 WO1Ql3B/2dCi3lP5OMmsO1UAZqy9pKLg1dfeYUPk48P5+p7d/NPmk+Em5kIYzKm5
 JsaHfCp2EEXomwmljNJ8PQ1vTjIQSSzlgYUBZxmCkaaX7zbEUMtxAQCStHmt8B84
 33LczwXBm3viSWrzsoBV37I70+tseugiSGsCfUyupXOvq55d6D9FCqtCb45Hn4Vh
 Ik8ggKdalsk/reiGEwNw1/3nr6mRMkHSbl+Mhc4waOIFf9dn0urgQgOaDg==
 =YVx0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - implement diag318

  x86:
   - Report last CPU for debugging
   - Emulate smaller MAXPHYADDR in the guest than in the host
   - .noinstr and tracing fixes from Thomas
   - nested SVM page table switching optimization and fixes

  Generic:
   - Unify shadow MMU cache data structures across architectures"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
  KVM: SVM: Fix sev_pin_memory() error handling
  KVM: LAPIC: Set the TDCR settable bits
  KVM: x86: Specify max TDP level via kvm_configure_mmu()
  KVM: x86/mmu: Rename max_page_level to max_huge_page_level
  KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
  KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
  KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
  KVM: VMX: Make vmx_load_mmu_pgd() static
  KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
  KVM: VMX: Drop a duplicate declaration of construct_eptp()
  KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
  KVM: Using macros instead of magic values
  MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
  KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
  KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
  KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
  KVM: VMX: Add guest physical address check in EPT violation and misconfig
  KVM: VMX: introduce vmx_need_pf_intercept
  KVM: x86: update exception bitmap on CPUID changes
  KVM: x86: rename update_bp_intercept to update_exception_bitmap
  ...
2020-08-06 12:59:31 -07:00
Linus Torvalds 47ec5303d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in code TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Flowian Westphal.

35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub->ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...
2020-08-05 20:13:21 -07:00
Linus Torvalds a754292348 Printk changes for 5.9
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAl8pk84ACgkQUqAMR0iA
 lPIrTxAAhD6fosJx+7LCrDRABIw/ZybeS5MIxTuPsNtdMmGBemigew5Ao1wYY6Ww
 3BFiNC2LpDXPxSOCQpz0Zm5/oCLhShPJmS6ukjLbufDsiw0MezliKCAa2Bfw3W31
 6xntQtf7ps+bmTEQDyuznu8Kfg+I3lmdGUOEBBluHIP4gb7XKQE8ttyUHB6qdiXI
 3eAl53Q8dOMMjtk5eNBXA19JY43g4JmLZRBumrAUc1vsv15KTDmSyWKlV8+tLH9K
 JbQAHe0pNVec4sJUIYLvIwDZXvtsvxjdJyX3tTeZ7xJ/ARcvRLoixVGqWxKhqdth
 j5U/L+YQfCJifyqvEVo03yy4Ti+OraliRpGcRf/bM2HpmFBA2+dISr7/VEqRwkG7
 Sy8HuvBHHyUqdrPjB7izhv8iyRN+LxFfpdT5LMnzsvxMxAJ+QwNjxb13RA4kkeRU
 5SgOhfGWgTsLy71J6qdSeXYB2oPFw4Onp5yAtoUsOJVYqWkN9x0zdl+9HmqIHF7T
 dY+KNriEO6gmpsQrMR4FC/GVMtwYWf8AoqeZen5O5SQULmzuKQ5AkOo0IAMrU92i
 iAdFrSZj35HAQjIJRccPNGZ3FwTd1Z4r5GT7VRvrN+nq2wVopzbbz924/lmsGoAS
 YppAw31sKfXDc5uWE8jP8GP3OJqhORn2PPXq3D5Q3XSVbGgey0Q=
 =ZcMq
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Herbert Xu made printk header file self-contained.

 - Andy Shevchenko and Sergey Senozhatsky cleaned up console->setup()
   error handling.

 - Andy Shevchenko did some cleanups (e.g. sparse warning) in vsprintf
   code.

 - Minor documentation updates.

* tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  lib/vsprintf: Force type of flags value for gfp_t
  lib/vsprintf: Replace custom spec to print decimals with generic one
  lib/vsprintf: Replace hidden BUILD_BUG_ON() with static_assert()
  printk: Make linux/printk.h self-contained
  doc:kmsg: explicitly state the return value in case of SEEK_CUR
  Replace HTTP links with HTTPS ones: vsprintf
  hvc: unify console setup naming
  console: Fix trivia typo 'change' -> 'chance'
  console: Propagate error code from console ->setup()
  tty: hvc: Return proper error code from console ->setup() hook
  serial: sunzilog: Return proper error code from console ->setup() hook
  serial: sunsab: Return proper error code from console ->setup() hook
  mips: Return proper error code from console ->setup() hook
2020-08-04 22:22:25 -07:00
Linus Torvalds 2ed90dbbf7 dma-mapping updates for 5.9
- make support for dma_ops optional
  - move more code out of line
  - add generic support for a dma_ops bypass mode
  - misc cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl8oGscLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYNfEhAAmFwd6BBHGwAhXUchoIue5vdNnuY3GiBFRzUdz67W
 zRYYgZYiPjl+MwflRmwPcoWEnGzmweRa2s6OnyDostiCRauioa8BuQfGqJasf1yZ
 D36dFNVHGW0o6pRDUQkd688k/4A6szwuwpq83qi4e8X2I9QzAITHtW8izjfPM923
 FlJzxEFggbB2TvwfUXOZhmpuG4Dog8S7VZ1Uz4QAg0Z/5FDqIKAAG2aZMqCXBbiX
 01E8tr0AqU/jn2xpc8O+DJGFiYIRhqhyNxQbH6qz1Q3xGFSokcLYm3YqkqVOgpn1
 DLs2UFDxWkly/F+wGnYtju7OD9VGPywzOcW125/LIsApYN5R/rYrtQzK41eq7Mp5
 HY3tqgNTIMdnl4so7QXeU4Vxj+lUdPlI26NZGszcM5AVftdTX8KjGdS+0+PBza6i
 i7trwG7J5/DnwiBCvEKoul7Ul1psUMTSvYwINTXRqsU4mZXhhx/mwyXbtruELnkj
 3agM98u6hoalLNjd2aueh+NjMZi1r+MchTrfRvTcxJ+yQ5BoR5kF+iz7eT/LtZ72
 AqWwimsPGNkLHUa0TrqWql5tv90cdDkBZzWXVbixwxRfgynWYLE6jugeIy8hwjFf
 GjO5XKbBwnWPjdSzFsVMPeuNpmr7ZjVHHewy2Q/jWQAIOyeof0VztEl23LN5yUkx
 pc8=
 =90UK
 -----END PGP SIGNATURE-----

Merge tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - make support for dma_ops optional

 - move more code out of line

 - add generic support for a dma_ops bypass mode

 - misc cleanups

* tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mapping:
  dma-contiguous: cleanup dma_alloc_contiguous
  dma-debug: use named initializers for dir2name
  powerpc: use the generic dma_ops_bypass mode
  dma-mapping: add a dma_ops_bypass flag to struct device
  dma-mapping: make support for dma ops optional
  dma-mapping: inline the fast path dma-direct calls
  dma-mapping: move the remaining DMA API calls out of line
2020-08-04 17:29:57 -07:00
Linus Torvalds 4f30a60aa7 close-range-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXygcpgAKCRCRxhvAZXjc
 ogPeAQDv1ncqtNroFAC4pJ4tQhH7JSjW0OltiMk/AocY/J2SdQD9GJ15luYJ0/om
 697q/Z68sndRynhdoZlMuf3oYuBlHQw=
 =3ZhE
 -----END PGP SIGNATURE-----

Merge tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull close_range() implementation from Christian Brauner:
 "This adds the close_range() syscall. It allows to efficiently close a
  range of file descriptors up to all file descriptors of a calling
  task.

  This is coordinated with the FreeBSD folks which have copied our
  version of this syscall and in the meantime have already merged it in
  April 2019:

    https://reviews.freebsd.org/D21627
    https://svnweb.freebsd.org/base?view=revision&revision=359836

  The syscall originally came up in a discussion around the new mount
  API and making new file descriptor types cloexec by default. During
  this discussion, Al suggested the close_range() syscall.

  First, it helps to close all file descriptors of an exec()ing task.
  This can be done safely via (quoting Al's example from [1] verbatim):

        /* that exec is sensitive */
        unshare(CLONE_FILES);
        /* we don't want anything past stderr here */
        close_range(3, ~0U);
        execve(....);

  The code snippet above is one way of working around the problem that
  file descriptors are not cloexec by default. This is aggravated by the
  fact that we can't just switch them over without massively regressing
  userspace. For a whole class of programs having an in-kernel method of
  closing all file descriptors is very helpful (e.g. demons, service
  managers, programming language standard libraries, container managers
  etc.).

  Second, it allows userspace to avoid implementing closing all file
  descriptors by parsing through /proc/<pid>/fd/* and calling close() on
  each file descriptor and other hacks. From looking at various
  large(ish) userspace code bases this or similar patterns are very
  common in service managers, container runtimes, and programming
  language runtimes/standard libraries such as Python or Rust.

  In addition, the syscall will also work for tasks that do not have
  procfs mounted and on kernels that do not have procfs support compiled
  in. In such situations the only way to make sure that all file
  descriptors are closed is to call close() on each file descriptor up
  to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery.

  Based on Linus' suggestion close_range() also comes with a new flag
  CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping
  right before exec. This would usually be expressed in the sequence:

        unshare(CLONE_FILES);
        close_range(3, ~0U);

  as pointed out by Linus it might be desirable to have this be a part
  of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which
  gets especially handy when we're closing all file descriptors above a
  certain threshold.

  Test-suite as always included"

* tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  tests: add CLOSE_RANGE_UNSHARE tests
  close_range: add CLOSE_RANGE_UNSHARE
  tests: add close_range() tests
  arch: wire-up close_range()
  open: add close_range()
2020-08-04 15:12:02 -07:00
Linus Torvalds 9ba27414f2 fork-v5.9
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXyge/QAKCRCRxhvAZXjc
 oildAQCCWpnTeXm6hrIE3VZ36X5npFtbaEthdBVAUJM7mo0FYwEA8+Wbnubg6jCw
 mztkXCnTfU7tApUdhKtQzcpEws45/Qk=
 =REE/
 -----END PGP SIGNATURE-----

Merge tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull fork cleanups from Christian Brauner:
 "This is cleanup series from when we reworked a chunk of the process
  creation paths in the kernel and switched to struct
  {kernel_}clone_args.

  High-level this does two main things:

   - Remove the double export of both do_fork() and _do_fork() where
     do_fork() used the incosistent legacy clone calling convention.

     Now we only export _do_fork() which is based on struct
     kernel_clone_args.

   - Remove the copy_thread_tls()/copy_thread() split making the
     architecture specific HAVE_COYP_THREAD_TLS config option obsolete.

  This switches all remaining architectures to select
  HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling
  convention. The current split makes the process creation codepaths
  more convoluted than they need to be. Each architecture has their own
  copy_thread() function unless it selects HAVE_COPY_THREAD_TLS then it
  has a copy_thread_tls() function.

  The split is not needed anymore nowadays, all architectures support
  CLONE_SETTLS but quite a few of them never bothered to select
  HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread()
  and use the old calling convention. Removing this split cleans up the
  process creation codepaths and paves the way for implementing clone3()
  on such architectures since it requires the copy_thread_tls() calling
  convention.

  After having made each architectures support copy_thread_tls() this
  series simply renames that function back to copy_thread(). It also
  switches all architectures that call do_fork() directly over to
  _do_fork() and the struct kernel_clone_args calling convention. This
  is a corollary of switching the architectures that did not yet support
  it over to copy_thread_tls() since do_fork() is conditional on not
  supporting copy_thread_tls() (Mostly because it lacks a separate
  argument for tls which is trivial to fix but there's no need for this
  function to exist.).

  The do_fork() removal is in itself already useful as it allows to to
  remove the export of both do_fork() and _do_fork() we currently have
  in favor of only _do_fork(). This has already been discussed back when
  we added clone3(). The legacy clone() calling convention is - as is
  probably well-known - somewhat odd:

    #
    # ABI hall of shame
    #
    config CLONE_BACKWARDS
    config CLONE_BACKWARDS2
    config CLONE_BACKWARDS3

  that is aggravated by the fact that some architectures such as sparc
  follow the CLONE_BACKWARDSx calling convention but don't really select
  the corresponding config option since they call do_fork() directly.

  So do_fork() enforces a somewhat arbitrary calling convention in the
  first place that doesn't really help the individual architectures that
  deviate from it. They can thus simply be switched to _do_fork()
  enforcing a single calling convention. (I really hope that any new
  architectures will __not__ try to implement their own calling
  conventions...)

  Most architectures already have made a similar switch (m68k comes to
  mind).

  Overall this removes more code than it adds even with a good portion
  of added comments. It simplifies a chunk of arch specific assembly
  either by moving the code into C or by simply rewriting the assembly.

  Architectures that have been touched in non-trivial ways have all been
  actually boot and stress tested: sparc and ia64 have been tested with
  Debian 9 images. They are the two architectures which have been
  touched the most. All non-trivial changes to architectures have seen
  acks from the relevant maintainers. nios2 with a custom built
  buildroot image. h8300 I couldn't get something bootable to test on
  but the changes have been fairly automatic and I'm sure we'll hear
  people yell if I broke something there.

  All other architectures that have been touched in trivial ways have
  been compile tested for each single patch of the series via git rebase
  -x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot tested
  even though they have just been trivially touched (removal of the
  HAVE_COPY_THREAD_TLS macro from their Kconfig) because well they are
  basically "core architectures" and since it is trivial to get your
  hands on a useable image"

* tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  arch: rename copy_thread_tls() back to copy_thread()
  arch: remove HAVE_COPY_THREAD_TLS
  unicore: switch to copy_thread_tls()
  sh: switch to copy_thread_tls()
  nds32: switch to copy_thread_tls()
  microblaze: switch to copy_thread_tls()
  hexagon: switch to copy_thread_tls()
  c6x: switch to copy_thread_tls()
  alpha: switch to copy_thread_tls()
  fork: remove do_fork()
  h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args
  sparc: unconditionally enable HAVE_COPY_THREAD_TLS
  sparc: share process creation helpers between sparc and sparc64
  sparc64: enable HAVE_COPY_THREAD_TLS
  fork: fold legacy_clone_args_valid() into _do_fork()
2020-08-04 14:47:45 -07:00
Linus Torvalds 99ea1521a0 Remove uninitialized_var() macro for v5.9-rc1
- Clean up non-trivial uses of uninitialized_var()
 - Update documentation and checkpatch for uninitialized_var() removal
 - Treewide removal of uninitialized_var()
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAl8oYLQWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJsfjEACvf0D3WL3H7sLHtZ2HeMwOgAzq
 il08t6vUscINQwiIIK3Be43ok3uQ1Q+bj8sr2gSYTwunV2IYHFferzgzhyMMno3o
 XBIGd1E+v1E4DGBOiRXJvacBivKrfvrdZ7AWiGlVBKfg2E0fL1aQbe9AYJ6eJSbp
 UGqkBkE207dugS5SQcwrlk1tWKUL089lhDAPd7iy/5RK76OsLRCJFzIerLHF2ZK2
 BwvA+NWXVQI6pNZ0aRtEtbbxwEU4X+2J/uaXH5kJDszMwRrgBT2qoedVu5LXFPi8
 +B84IzM2lii1HAFbrFlRyL/EMueVFzieN40EOB6O8wt60Y4iCy5wOUzAdZwFuSTI
 h0xT3JI8BWtpB3W+ryas9cl9GoOHHtPA8dShuV+Y+Q2bWe1Fs6kTl2Z4m4zKq56z
 63wQCdveFOkqiCLZb8s6FhnS11wKtAX4czvXRXaUPgdVQS1Ibyba851CRHIEY+9I
 AbtogoPN8FXzLsJn7pIxHR4ADz+eZ0dQ18f2hhQpP6/co65bYizNP5H3h+t9hGHG
 k3r2k8T+jpFPaddpZMvRvIVD8O2HvJZQTyY6Vvneuv6pnQWtr2DqPFn2YooRnzoa
 dbBMtpon+vYz6OWokC5QNWLqHWqvY9TmMfcVFUXE4AFse8vh4wJ8jJCNOFVp8On+
 drhmmImUr1YylrtVOw==
 =xHmk
 -----END PGP SIGNATURE-----

Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
 "This is long overdue, and has hidden too many bugs over the years. The
  series has several "by hand" fixes, and then a trivial treewide
  replacement.

   - Clean up non-trivial uses of uninitialized_var()

   - Update documentation and checkpatch for uninitialized_var() removal

   - Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  compiler: Remove uninitialized_var() macro
  treewide: Remove uninitialized_var() usage
  checkpatch: Remove awareness of uninitialized_var() macro
  mm/debug_vm_pgtable: Remove uninitialized_var() usage
  f2fs: Eliminate usage of uninitialized_var() macro
  media: sur40: Remove uninitialized_var() usage
  KVM: PPC: Book3S PR: Remove uninitialized_var() usage
  clk: spear: Remove uninitialized_var() usage
  clk: st: Remove uninitialized_var() usage
  spi: davinci: Remove uninitialized_var() usage
  ide: Remove uninitialized_var() usage
  rtlwifi: rtl8192cu: Remove uninitialized_var() usage
  b43: Remove uninitialized_var() usage
  drbd: Remove uninitialized_var() usage
  x86/mm/numa: Remove uninitialized_var() usage
  docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Linus Torvalds 9ba19ccd2d These were the main changes in this cycle:
- LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops.
 
  - KCSAN updates: the most important change is that GCC 11 now has all fixes in place
                   to support KCSAN, so GCC support can be enabled again. Also more annotations.
 
  - futex updates: minor cleanups and simplifications
 
  - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities.
 
  - lockdep updates:
     - simplify IRQ trace event handling
     - add various new debug checks
     - simplify header dependencies, split out <linux/lockdep_types.h>, decouple
       lockdep from other low level headers some more
     - fix NMI handling
 
  - misc cleanups and smaller fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl8n9/wRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hZFQ//dD+AKw9Nym+WbylovmeD0qxWxPyeN/jG
 vBVDTOJIJLtZTkZf6YHcYOJlPwaMDYUQluqTPQhsaQZy/NoEb5NM2cFAj2R9gjyT
 O8665T1dvhW9Sh353mBpuwviqdrnvCeHTBEcglSlFY7hxToYAflUN0+DXGVtNys8
 PFNf3L9SHT0GLVC8+di/eJzQaRqxiB0Pq7kvh2RvPJM/dcQNA9Ho3CCNO5j6qGoY
 u7OnMT8xJXkgbdjjUO4RO0v9VjMuNthZ2JiONDgvgKtJfIL2wt5YXIv1EYX0GuWp
 WZgIzE4o1G7GJOOzKpFfZFyK8grHu2fWgK1plvodWjlLkBmltJZ1qyOM+wngd/m2
 TgtPo73/YFbxFUbbBpkb0eiIaH2t99kMvfCWd05+GiPCtzn9UL9GfFRWd42vonwc
 sQWjFrHKlnuzifUfNcLmKg7R2nUtF3Dm/SydiTJ+9NtH/QA17YJKWnlE1moulNtQ
 p7H7+8UdcvSQ7F38A74v2IYNIyDsv5qcE8ar4QHdaanBBX/LCyD0UlfgsgxEReXf
 GDKkpx7LFQlI6Y2YB+dZgkCwhNBl3/OQ3v6hC95B37fA67dAIQyPIWHiHbaM+029
 gghqU4GcUcbjSnHPzl9PPL+hi9MyXrMjpb7CBXytg4NI4EE1waHR+0kX14V8ndRj
 MkWQOKPUgB0=
 =3MTT
 -----END PGP SIGNATURE-----

Merge tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:

 - LKMM updates: mostly documentation changes, but also some new litmus
   tests for atomic ops.

 - KCSAN updates: the most important change is that GCC 11 now has all
   fixes in place to support KCSAN, so GCC support can be enabled again.
   Also more annotations.

 - futex updates: minor cleanups and simplifications

 - seqlock updates: merge preparatory changes/cleanups for the
   'associated locks' facilities.

 - lockdep updates:
    - simplify IRQ trace event handling
    - add various new debug checks
    - simplify header dependencies, split out <linux/lockdep_types.h>,
      decouple lockdep from other low level headers some more
    - fix NMI handling

 - misc cleanups and smaller fixes

* tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  kcsan: Improve IRQ state trace reporting
  lockdep: Refactor IRQ trace events fields into struct
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  lockdep: Add preemption enabled/disabled assertion APIs
  seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()
  seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs
  seqlock: Reorder seqcount_t and seqlock_t API definitions
  seqlock: seqcount_t latch: End read sections with read_seqcount_retry()
  seqlock: Properly format kernel-doc code samples
  Documentation: locking: Describe seqlock design and usage
  locking/qspinlock: Do not include atomic.h from qspinlock_types.h
  locking/atomic: Move ATOMIC_INIT into linux/types.h
  lockdep: Move list.h inclusion into lockdep.h
  locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
  futex: Remove unused or redundant includes
  futex: Consistently use fshared as boolean
  futex: Remove needless goto's
  futex: Remove put_futex_key()
  rwsem: fix commas in initialisation
  docs: locking: Replace HTTP links with HTTPS ones
  ...
2020-08-03 14:39:35 -07:00
Linus Torvalds 45365a06aa - Add support for function error injection.
- Add support for custom exception handlers, as required by BPF_PROBE_MEM.
 
 - Add support for BPF_PROBE_MEM.
 
 - Add trace events for idle enter / exit for the s390 specific idle
   implementation.
 
 - Remove unused zcore memmmap device.
 
 - Remove unused "raw view" from s390 debug feature.
 
 - AP bus + zcrypt device driver code refactoring.
 
 - Provide cex4 cca sysfs attributes for cex3 for zcrypt device driver.
 
 - Expose only minimal interface to walk physmem for mm/memblock. This
   is a common code change and it has been agreed on with Mike Rapoport
   and Andrew Morton that this can go upstream via the s390 tree.
 
 - Rework of the s390 vmem/vmmemap code to allow for future memory hot
   remove.
 
 - Get rid of FORCE_MAX_ZONEORDER to finally allow for order-10
   allocations again, instead of only order-8 allocations.
 
 - Various small improvements and fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl8n1eUACgkQIg7DeRsp
 bsJJIhAAsY4IwWHOOh9GRY0yAU8FQvJiBI8H2IuukjnwjKmj8LQA/VkiIWOfWU99
 2cnrnEi7+Op1od0ebjnkAU+oGws3qazpRxp6RaN3qTbnEYYSVMGvNfjTaWH3/Tsd
 jxNgYZ4bV7foSWfYvyoBy4cORcSt1xFdA7by+XQYoacFJMNgjktDoeMFnj9TMCbj
 LFHjAdqN78o98nwgREuzSPV806cQgNhzBc6kYaC2zw1W5Z3NrdmLXVyyqM7YCB/9
 rKTQrEYi550BoyHHpxOY3K9PQQBEZZOH3M/2rA/W/gQaWCs2z3dwmBqjzwM36eZQ
 To+sw4F9x/enuYpU5ylVrh0nuWaJ7wpe3DugHY+UghGZwm71On6ZTnEkWD450jD+
 bVdDdYPturypTLdCiAFr7D0pMDqzgUP+jyTpIPH1uOFAkocfwrfFj6Als3mIjjks
 pptWs+1m4lv1E+7flrSgkNdvPpUhwD6Zf5RZi03GUZShFZzA6Nq4+yVOX7O871M7
 R9rLOQ0ch9/PiDdD4VXihL0Qva9eayo/Bek0npEBp0ZnyjIgHr64Xr77jqx74mMB
 yoT+CSfICqvmF5CV4lPhPeQYEpvzYj8yi9zAxlFNyRpeM75B7L/JkNcqMN9fra4I
 yKxo4Ng/6EEYx7ooCnX2I0BWJZc3b4ZBIJiRAF7OXzX91O9v8nU=
 =H0KX
 -----END PGP SIGNATURE-----

Merge tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Add support for function error injection.

 - Add support for custom exception handlers, as required by
   BPF_PROBE_MEM.

 - Add support for BPF_PROBE_MEM.

 - Add trace events for idle enter / exit for the s390 specific idle
   implementation.

 - Remove unused zcore memmmap device.

 - Remove unused "raw view" from s390 debug feature.

 - AP bus + zcrypt device driver code refactoring.

 - Provide cex4 cca sysfs attributes for cex3 for zcrypt device driver.

 - Expose only minimal interface to walk physmem for mm/memblock. This
   is a common code change and it has been agreed on with Mike Rapoport
   and Andrew Morton that this can go upstream via the s390 tree.

 - Rework of the s390 vmem/vmmemap code to allow for future memory hot
   remove.

 - Get rid of FORCE_MAX_ZONEORDER to finally allow for order-10
   allocations again, instead of only order-8 allocations.

 - Various small improvements and fixes.

* tag 's390-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits)
  s390/vmemmap: coding style updates
  s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive sections
  s390/vmemmap: remember unused sub-pmd ranges
  s390/vmemmap: fallback to PTEs if mapping large PMD fails
  s390/vmem: cleanup empty page tables
  s390/vmemmap: take the vmem_mutex when populating/freeing
  s390/vmemmap: cleanup when vmemmap_populate() fails
  s390/vmemmap: extend modify_pagetable() to handle vmemmap
  s390/vmem: consolidate vmem_add_range() and vmem_remove_range()
  s390/vmem: rename vmem_add_mem() to vmem_add_range()
  s390: enable HAVE_FUNCTION_ERROR_INJECTION
  s390/pci: clarify comment in s390_mmio_read/write
  s390/time: improve comparison for tod steering
  s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE
  s390/time: use CLOCKSOURCE_MASK
  s390/bpf: implement BPF_PROBE_MEM
  s390/kernel: expand exception table logic to allow new handling options
  s390/kernel: unify EX_TABLE* implementations
  s390/mm: allow order 10 allocations
  s390/mm: avoid trimming to MAX_ORDER
  ...
2020-08-03 13:58:10 -07:00
Paolo Bonzini f3633c2683 KVM: s390: Enhancement for 5.9
- implement diagnose 318
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJfIpNzAAoJEBF7vIC1phx8l0cP/AvZ6oT5dlAGeBhtPeM/3rqp
 g7RCukN445LQfxWeWXuzckAYE4AAAtFqMS6PujfKBc+Lf7t+d6Iuod7wFlJTDImP
 wIGcCV1pTSpIHaFiSM1rpqRjnzFGeWrqWg6gBSjm0aSMqB8KAjv+PdyQ1rcfyiIj
 r+sD+Vt9DNGop12TY2YxUlXaxzPccGMAniDXesFgKb9IoTdMLdEt45Evkx9D6UAx
 eetWMwZTwqB8iWJx6xU41LxDA4ERlS+8TsE+SC0r8n6yCmhQ98hgb4i2O1gx9JIl
 K5TqpXMWVBKFyeSbJBw9bXtWa5F/gXDuD6zrzRiMjZR4Og6TXqL2NoXgr9LHN/g7
 WpBlF/eDr7TNxF1VutvSiLvV5XI/t8yjbwSvAt2+QtIIrJK+fPAdTRSH1Q8TRUMj
 cIRdCw2H10neseAPhbdn9nSJhuQ5E/hGrMzubiYQeTXsA3TLfLWniuejfRufMOXB
 kgepl+8H60D8o1l459+81NBV6rM5RdRRzWkWIIYD2/+yWRtclb1K2CF2HrN51saC
 3SQI90Rr7Vx4yjS0p84/aasAAy7WxfumnoLwBsRwIE0X9R4e4plC12igwsmPK8oM
 V/SO4w+LAJnW1bQpXuqRGMPI29gpGDHVEcfOtuerHE1pZya6VRIWTEkSsdXt1eZI
 trxY3c6Xruor8DQSDsjv
 =hr9t
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next-5.6

KVM: s390: Enhancement for 5.9
- implement diagnose 318
2020-08-03 14:19:13 -04:00
Ingo Molnar 28cff52eae Merge branch 'linus' into locking/core, to resolve conflict
Conflicts:
	arch/arm/include/asm/percpu.h

As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:

  a21ee6055c ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")

and this fresh upstream commit:

  aa54ea903a ("ARM: percpu.h: fix build error")

a21ee6055c is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903a and uses the a21ee6055c solution.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-31 12:16:09 +02:00
Peter Zijlstra f05d67179d Merge branch 'locking/header' 2020-07-29 16:14:21 +02:00
Herbert Xu 7ca8cf5347 locking/atomic: Move ATOMIC_INIT into linux/types.h
This patch moves ATOMIC_INIT from asm/atomic.h into linux/types.h.
This allows users of atomic_t to use ATOMIC_INIT without having to
include atomic.h as that way may lead to header loops.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20200729123105.GB7047@gondor.apana.org.au
2020-07-29 16:14:18 +02:00
Al Viro bb1a773d5b kill unused dump_fpu() instances
dump_fpu() is used only on the architectures that support elf
and have neither CORE_DUMP_USE_REGSET nor ELF_CORE_COPY_FPREGS
defined.

Currently that's csky, m68k, microblaze, nds32 and unicore32.  The rest
of the instances are dead code.

NB: THIS MUST GO AFTER ELF_FDPIC CONVERSION

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:33:10 -04:00
Al Viro b69c632052 s390: switch to ->regset_get()
NB: compat NT_S390_LAST_BREAK might be better as compat_long_t
rather than long.  User-visible ABI, again...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-27 14:31:08 -04:00
Herbert Xu b4a461e72b printk: Make linux/printk.h self-contained
As it stands if you include printk.h by itself it will fail to
compile because it requires definitions from ratelimit.h.  However,
simply including ratelimit.h from printk.h does not work due to
inclusion loops involving sched.h and kernel.h.

This patch solves this by moving bits from ratelimit.h into a new
header file which can then be included by printk.h without any
worries about header loops.

The build bot then revealed some intriguing failures arising out
of this patch.  On s390 there is an inclusion loop with asm/bug.h
and linux/kernel.h that triggers a compile failure, because kernel.h
will cause asm-generic/bug.h to be included before s390's own
asm/bug.h has finished processing.  This has been fixed by not
including kernel.h in arch/s390/include/asm/bug.h.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Acked-by: Petr Mladek <pmladek@suse.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Link: https://lore.kernel.org/r/20200721062248.GA18383@gondor.apana.org.au
2020-07-27 17:46:24 +09:00
Heiko Carstens 9a996c67a6 s390/vmemmap: coding style updates
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:34:19 +02:00
David Hildenbrand 2c114df071 s390/vmemmap: avoid memset(PAGE_UNUSED) when adding consecutive sections
Let's avoid memset(PAGE_UNUSED) when adding consecutive sections,
whereby the vmemmap of a single section does not span full PMDs.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-10-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:34:14 +02:00
David Hildenbrand cd5781d63e s390/vmemmap: remember unused sub-pmd ranges
With a memmap size of 56 bytes or 72 bytes per page, the memmap for a
256 MB section won't span full PMDs. As we populate single sections and
depopulate single sections, the depopulation step would not be able to
free all vmemmap pmds anymore.

Do it similarly to x86, marking the unused memmap ranges in a special way
(pad it with 0xFD).

This allows us to add/remove sections, cleaning up all allocated
vmemmap pages even if the memmap size is not multiple of 16 bytes per page.

A 56 byte memmap can, for example, be created with !CONFIG_MEMCG and
!CONFIG_SLUB.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-9-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:34:08 +02:00
David Hildenbrand f2057b4266 s390/vmemmap: fallback to PTEs if mapping large PMD fails
Let's fallback to single pages if short on huge pages. No need to stop
memory hotplug.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-8-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:34:03 +02:00
David Hildenbrand b9ff81003c s390/vmem: cleanup empty page tables
Let's cleanup empty page tables. Consider only page tables that fully
fall into the idendity mapping and the vmemmap range.

As there are no valid accesses to vmem/vmemmap within non-populated ranges,
the single tlb flush at the end should be sufficient.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-7-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:59 +02:00
David Hildenbrand aa18e0e658 s390/vmemmap: take the vmem_mutex when populating/freeing
Let's synchronize all accesses to the 1:1 and vmemmap mappings. This will
be especially relevant when wanting to cleanup empty page tables that could
be shared by both. Avoid races when removing tables that might be just
about to get reused.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-6-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:51 +02:00
David Hildenbrand c00f05a924 s390/vmemmap: cleanup when vmemmap_populate() fails
Cleanup what we partially added in case vmemmap_populate() fails. For
vmem, this is already handled by vmem_add_mapping().

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-5-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:46 +02:00
David Hildenbrand 9ec8fa8dc3 s390/vmemmap: extend modify_pagetable() to handle vmemmap
Extend our shiny new modify_pagetable() to handle !direct (vmemmap)
mappings. Convert vmemmap_populate() and implement vmemmap_free().

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-4-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:41 +02:00
David Hildenbrand 3e0d3e408e s390/vmem: consolidate vmem_add_range() and vmem_remove_range()
We want to have only a single pagetable walker and reuse the same
functionality for vmemmap handling. Let's start by consolidating
vmem_add_range() and vmem_remove_range(), converting it into a
recursive implementation.

A recursive implementation makes it easier to expand individual cases
without harming readability. In addition, we minimize traversing the
whole hierarchy over and over again.

One change is that we don't unmap large PMDs/PUDs when not completely
covered by the request, something that should never happen with direct
mappings, unless one would be removing in other granularity than added,
which would be broken already.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-3-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:36 +02:00
David Hildenbrand 8398b226b8 s390/vmem: rename vmem_add_mem() to vmem_add_range()
Let's match the name to vmem_remove_range().

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200722094558.9828-2-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:32 +02:00
Ilya Leoshkevich 73d6eb48d2 s390: enable HAVE_FUNCTION_ERROR_INJECTION
This kernel feature is required for enabling BPF_KPROBE_OVERRIDE.

Define override_function_with_return() and regs_set_return_value()
functions, and fix compile errors in syscall_wrapper.h.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:28 +02:00
Niklas Schnelle 4631f3ca49 s390/pci: clarify comment in s390_mmio_read/write
The existing comment was talking about reading in the write part
and vice versa. While we are here make it more clear why restricting
the syscalls to MIO capable devices is okay.

Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-27 10:33:24 +02:00
David S. Miller a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Ingo Molnar c84d53051f Linux 5.8-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl8UzA4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGQ7cH/3v+Gv+SmHJCvaT2
 CSu0+7okVnYbY3UTb3hykk7/aOqb6284KjxR03r0CWFzsEsZVhC5pvvruASSiMQg
 Pi04sLqv6CsGLHd1n+pl4AUYEaxq6k4KS3uU3HHSWxrahDDApQoRUx2F8lpOxyj8
 RiwnoO60IMPA7IFJqzcZuFqsgdxqiiYvnzT461KX8Mrw6fyMXeR2KAj2NwMX8dZN
 At21Sf8+LSoh6q2HnugfiUd/jR10XbfxIIx2lXgIinb15GXgWydEQVrDJ7cUV7ix
 Jd0S+dtOtp+lWtFHDoyjjqqsMV7+G8i/rFNZoxSkyZqsUTaKzaR6JD3moSyoYZgG
 0+eXO4A=
 =9EpR
 -----END PGP SIGNATURE-----

Merge tag 'v5.8-rc6' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-07-25 21:49:36 +02:00
David S. Miller dee72f8a0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.

2) Introduce cpumap, from Lorenzo.

3) s390 JIT fixes, from Ilya.

4) teach riscv JIT to emit compressed insns, from Luke.

5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 12:35:33 -07:00
Heiko Carstens 411155820b s390/time: improve comparison for tod steering
It doesn't make sense to add zero shifted by 15. It's still zero.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-22 17:02:13 +02:00
Heiko Carstens 555701a714 s390/time: select CLOCKSOURCE_VALIDATE_LAST_CYCLE
The value returned by read_tod_clock() will overflow on September 17th 2042.
To avoid that system time jumps back select CLOCKSOURCE_VALIDATE_LAST_CYCLE
which enables a sanity check in order to prevent negative "delta" values.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-22 17:02:08 +02:00
Heiko Carstens 58e15716fe s390/time: use CLOCKSOURCE_MASK
Make use of CLOCKSOURCE_MASK instead of open-coding it.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-22 17:02:04 +02:00
Ilya Leoshkevich 94ad428df5 s390/bpf: Use bpf_skip() in bpf_jit_prologue()
Now that we have bpf_skip() for emitting nops, use it in
bpf_jit_prologue() in order to reduce code duplication.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717165326.6786-6-iii@linux.ibm.com
2020-07-21 13:26:25 -07:00
Ilya Leoshkevich 1491b73311 s390/bpf: Tolerate not converging code shrinking
"BPF_MAXINSNS: Maximum possible literals" unnecessarily falls back to
the interpreter because of failing sanity check in bpf_set_addr. The
problem is that there are a lot of branches that can be shrunk, and
doing so opens up the possibility to shrink even more. This process
does not converge after 3 passes, causing code offsets to change during
the codegen pass, which must never happen.

Fix by inserting nops during codegen pass in order to preserve code
offets.

Fixes: 4e9b4a6883 ("s390/bpf: Use relative long branches")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717165326.6786-5-iii@linux.ibm.com
2020-07-21 13:26:25 -07:00
Ilya Leoshkevich 5fa6974471 s390/bpf: Use brcl for jumping to exit_ip if necessary
"BPF_MAXINSNS: Maximum possible literals" test causes panic with
bpf_jit_harden = 2. The reason is that BPF_JMP | BPF_EXIT is always
emitted as brc, however, after removal of JITed image size
limitations, brcl might be required.

Fix by using brcl when necessary.

Fixes: 4e9b4a6883 ("s390/bpf: Use relative long branches")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717165326.6786-4-iii@linux.ibm.com
2020-07-21 13:26:25 -07:00
Ilya Leoshkevich 7477d43be5 s390/bpf: Fix sign extension in branch_ku
Both signed and unsigned variants of BPF_JMP | BPF_K require
sign-extending the immediate. JIT emits cgfi for the signed case,
which is correct, and clgfi for the unsigned case, which is not
correct: clgfi zero-extends the immediate.

s390 does not provide an instruction that does sign-extension and
unsigned comparison at the same time. Therefore, fix by first loading
the sign-extended immediate into work register REG_1 and proceeding
as if it's BPF_X.

Fixes: 4e9b4a6883 ("s390/bpf: Use relative long branches")
Reported-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Link: https://lore.kernel.org/bpf/20200717165326.6786-3-iii@linux.ibm.com
2020-07-21 13:26:24 -07:00
Thomas Richter 3d3af181d3 s390/cpum_cf,perf: change DFLT_CCERROR counter name
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.

Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-21 13:53:56 +02:00
Ilya Leoshkevich 3f161e0ae8 s390/bpf: implement BPF_PROBE_MEM
This is a s390 port of x86 commit 3dec541b2e ("bpf: Add support for BTF
pointers to x86 JIT").

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:59 +02:00
Ilya Leoshkevich 05a68e892e s390/kernel: expand exception table logic to allow new handling options
This is a s390 port of commit 548acf1923 ("x86/mm: Expand the
exception table logic to allow new handling options"), which is needed
for implementing BPF_PROBE_MEM on s390.

The new handler field is made 64-bit in order to allow pointing from
dynamically allocated entries to handlers in kernel text. Unlike on x86,
NULL is used instead of ex_handler_default. This is because exception
tables are used by boot/text_dma.S, and it would be a pain to preserve
ex_handler_default.

The new infrastructure is ignored in early_pgm_check_handler, since
there is no pt_regs.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:50 +02:00
Ilya Leoshkevich 88aa8939c9 s390/kernel: unify EX_TABLE* implementations
Replace three implementations with one using using __stringify_in_c
macro conveniently "borrowed" from powerpc and microblaze.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:45 +02:00
Heiko Carstens 771cf196cc s390/mm: allow order 10 allocations
Get rid of FORCE_MAX_ZONEORDER which limited allocations to order 8 (= 1MB)
and use the default, which allows for order 10 (= 4MB) allocations.

Given that s390 allows less than the default this caused some memory
allocation problems more or less unique to s390 from time to time.

Note: this was originally introduced with commit 684de39bd7 ("[S390]
Fix IPL from NSS.") in order to support Named Saved Segments, which
could start/end at an arbitrary 1 megabyte boundary and also before
support for sparsemem vmemmmap was enabled.

Since NSS support is gone, but sparsemem vmemmap support is available
this limitation can go away.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:40 +02:00
Heiko Carstens 3c5f2eb969 s390/mm: avoid trimming to MAX_ORDER
Trimming to MAX_ORDER was originally done in order to avoid to set
HOLES_IN_ZONE, which in turn would enable a quite expensive
pfn_valid() check. pfn_valid() however only checks if a struct page
exists for a given pfn.

With sparsemen vmemmap there are always struct pages, since memmaps
are allocated for whole sections. Therefore remove the HOLES_IN_ZONE
comment and the trimming.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:32 +02:00
Heiko Carstens 7904aaa8b2 s390/mm: fix typo in comment
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-20 10:55:13 +02:00
Christoph Hellwig 55db9c0e85 net: remove compat_sys_{get,set}sockopt
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Christoph Hellwig 2f9237d4f6 dma-mapping: make support for dma ops optional
Avoid the overhead of the dma ops support for tiny builds that only
use the direct mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2020-07-19 09:29:23 +02:00
Kees Cook 3f649ab728 treewide: Remove uninitialized_var() usage
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
Linus Torvalds e8749d0688 - Update email addresses in MAINTAINERS file and add .mailmap entries
for Gerald Schaefer and Heiko Carstens.
 
 - Fix huge pte soft dirty copying.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAl8IaaoACgkQIg7DeRsp
 bsKmdhAAsGmVSePcWGu+9b4LrOw8W37TJsRRnztMUXjfmgpUzDr87NfcE4CbBzUp
 vBV0cqfSgWH38ExtkdqtPsoQJT/HQqkGAE3w4qQ+9eQC3wOzuYkhLs0v+UacpmOn
 stALI/AmqY8seXo1KI3b6F2aZ+wM49bqmVHj95O8eE0d8rKw4fOBbYeEtmCacelh
 PypHSNDYW2cJ1t26zRXidGnRtqC5FWGkXx/8Mc72FJqqFj3GDHsOlLsVH28qqV4i
 cLRzLZr7zI+Yo/U1X2s6/Fy634lHfxcIqO9oCIhGxJAbKVYtB5EiK6/TWuzpujwL
 a9leoaA95LO2YO9hpDFlY6oWzt75deuFrLwCr1ZJclVETuP/YBM9BXLoEQH2/DZi
 nk8hL/i1Edfy1BoDSJjeBqUkXffmhxBlYfVF7vtmCU79nJ4XliLsodIC1rNjn+Pj
 WYWMMPuh3wuDp/yal6+QV+Qq9yLGWYDqmVSbblKjl3RPFWcqK2dvwETiHeuNsQrj
 F6VUtw0jvvgRbklzl/i780eHwyDjHYLuq1Ua9q3NiIz6eTZnv1peIaD3oc6zU0CN
 fR0tIpxwotMPtWG458ojx7ySwdQ9ZSwEHLMyCEeHFXdEc08enFBwE0zrW3oOXtZp
 covYEJmHHvPZEO9WSbzFvdxT5kasAAMDbG92BAcbBtWjOsjx5gk=
 =vgnJ
 -----END PGP SIGNATURE-----

Merge tag 's390-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:
 "This is mainly due to the fact that Gerald Schaefer's and also my old
  email addresses currently do not work any longer. Therefore we decided
  to switch to new email addresses and reflect that in the MAINTAINERS
  file.

   - Update email addresses in MAINTAINERS file and add .mailmap entries
     for Gerald Schaefer and Heiko Carstens.

   - Fix huge pte soft dirty copying"

* tag 's390-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: update email address for Gerald Schaefer
  MAINTAINERS: update email address for Heiko Carstens
  s390/mm: fix huge pte soft dirty copying
2020-07-10 08:39:33 -07:00
Sven Schnelle 6589c93f99 s390: add trace events for idle enter/exit
Helpful for debugging.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10 15:08:27 +02:00
Christian Borntraeger 7b7735c5be s390: fix comment regarding interrupts in svc
With the removal of the critical section cleanup, we now enter the svc
interrupt handler with interrupts disabled.

Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10 15:08:23 +02:00
David Hildenbrand fa49066fc3 s390/mm: don't set ARCH_KEEP_MEMBLOCK
Commit 50be634507 ("s390/mm: Convert bootmem to memblock") mentions
	"The original bootmem allocator is getting replaced by memblock. To
	cover the needs of the s390 kdump implementation the physical
	memory list is used."

As we can now reference "physmem" managed in the memblock allocator after
init even without ARCH_KEEP_MEMBLOCK, and s390x does no longer need
other memblock metadata after boot (esp., the zcore memmap device that used
it got removed), we can stop setting ARCH_KEEP_MEMBLOCK.

With this change, we no longer create memblocks for standby/hotplugged
memory (added via add_memory()) and free up memblock metadata (except
physmem) after boot.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Philipp Rudo <prudo@linux.ibm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200701141830.18749-3-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10 15:08:14 +02:00
David Hildenbrand 7764990581 mm/memblock: expose only miminal interface to add/walk physmem
"physmem" in the memblock allocator is somewhat weird: it's not actually
used for allocation, it's simply information collected during boot, which
describes the unmodified physical memory map at boot time, without any
standby/hotplugged memory. It's only used on s390 and is currently the
only reason s390 keeps using CONFIG_ARCH_KEEP_MEMBLOCK.

Physmem isn't numa aware and current users don't specify any flags. Let's
hide it from the user, exposing only for_each_physmem(), and simplify. The
interface for physmem is now really minimalistic:
- memblock_physmem_add() to add ranges
- for_each_physmem() / __next_physmem_range() to walk physmem ranges

Don't place it into an __init section and don't discard it without
CONFIG_ARCH_KEEP_MEMBLOCK. As we're reusing __next_mem_range(), remove
the __meminit notifier to avoid section mismatch warnings once
CONFIG_ARCH_KEEP_MEMBLOCK is no longer used with
CONFIG_HAVE_MEMBLOCK_PHYS_MAP.

While fixing up the documentation, sneak in some related cleanups. We can
stop setting CONFIG_ARCH_KEEP_MEMBLOCK for s390 next.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Message-Id: <20200701141830.18749-2-david@redhat.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10 15:08:09 +02:00
Peter Zijlstra 28e5bfd81c s390: Break cyclic percpu include
In order to use <asm/percpu.h> in irqflags.h, we need to make sure
asm/percpu.h does not itself depend on irqflags.h

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200623083721.396143816@infradead.org
2020-07-10 12:00:02 +02:00
Tianjia Zhang 2f0a83bece KVM: s390: clean up redundant 'kvm_run' parameters
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu'
structure. For historical reasons, many kvm-related function parameters
retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This
patch does a unified cleanup of these remaining redundant parameters.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200623131418.31473-2-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 04:26:39 -04:00
Sean Christopherson 2aa9c199cf KVM: Move x86's version of struct kvm_mmu_memory_cache to common code
Move x86's 'struct kvm_mmu_memory_cache' to common code in anticipation
of moving the entire x86 implementation code to common KVM and reusing
it for arm64 and MIPS.  Add a new architecture specific asm/kvm_types.h
to control the existence and parameters of the struct.  The new header
is needed to avoid a chicken-and-egg problem with asm/kvm_host.h as all
architectures define instances of the struct in their vCPU structs.

Add an asm-generic version of kvm_types.h to avoid having empty files on
PPC and s390 in the long term, and for arm64 and mips in the short term.

Suggested-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Ben Gardon <bgardon@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703023545.8771-15-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09 13:29:42 -04:00
Janosch Frank 528a953934 s390/mm: fix huge pte soft dirty copying
If the pmd is soft dirty we must mark the pte as soft dirty (and not dirty).
This fixes some cases for guest migration with huge page backings.

Cc: <stable@vger.kernel.org> # 4.8
Fixes: bc29b7ac1d ("s390/mm: clean up pte/pmd encoding")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-09 15:18:23 +02:00
Vitaly Kuznetsov e8c22266e6 KVM: async_pf: change kvm_setup_async_pf()/kvm_arch_setup_async_pf() return type to bool
Unlike normal 'int' functions returning '0' on success, kvm_setup_async_pf()/
kvm_arch_setup_async_pf() return '1' when a job to handle page fault
asynchronously was scheduled and '0' otherwise. To avoid the confusion
change return type to 'bool'.

No functional change intended.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200615121334.91300-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:36 -04:00
Masahiro Yamada 685969e0bd kbuild: remove cc-option test of -ffreestanding
Some Makefiles already pass -ffreestanding unconditionally.
For example, arch/arm64/lib/Makefile, arch/x86/purgatory/Makefile.

No problem report so far about hard-coding this option. So, we can
assume all supported compilers know -ffreestanding.

I confirmed GCC 4.8 and Clang manuals document this option.

Get rid of cc-option from -ffreestanding.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
2020-07-07 11:13:10 +09:00
Linus Torvalds bfe91da29b Bugfixes and a one-liner patch to silence sparse.
-----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8DWosUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroO8cAf/UskNg8qoLGG17rQwhFpmigSllbiJ
 TAyi3tpb1Y0Z2MfYeGkeiEb1L34bS28Cxl929DoqI3hrXy1wDCmsHPB5c3URXrzd
 aswvr7pJtQV9iH1ykaS2woFJnOUovMFsFYMhj46yUPoAvdKOZKvuqcduxbogYHFw
 YeRhS+1lGfiP2A0j3O/nnNJ0wq+FxKO46G3CgWeqG75+FSL6y/tl0bZJUMKKajQZ
 GNaOv/CYCHAfUdvgy0ZitRD8lV8yxng3dYGjm+a52Kmn2ZWiFlxNrnxzHySk16Rn
 Lq6MfFOqgrYpoZv7SnsFYnRE05U5bEFQ8BGr22fImQ+ktKDgq+9gv6cKwA==
 =+DN/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Bugfixes and a one-liner patch to silence a sparse warning"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART
  KVM: arm64: PMU: Fix per-CPU access in preemptible context
  KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks
  KVM: x86: Mark CR4.TSD as being possibly owned by the guest
  KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode
  kvm: use more precise cast and do not drop __user
  KVM: x86: bit 8 of non-leaf PDPEs is not reserved
  KVM: X86: Fix async pf caused null-ptr-deref
  KVM: arm64: vgic-v4: Plug race between non-residency and v4.1 doorbell
  KVM: arm64: pvtime: Ensure task delay accounting is enabled
  KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE
  KVM: arm64: Annotate hyp NMI-related functions as __always_inline
  KVM: s390: reduce number of IO pins to 1
2020-07-06 12:48:04 -07:00
Christian Brauner 714acdbd1c
arch: rename copy_thread_tls() back to copy_thread()
Now that HAVE_COPY_THREAD_TLS has been removed, rename copy_thread_tls()
back simply copy_thread(). It's a simpler name, and doesn't imply that only
tls is copied here. This finishes an outstanding chunk of internal process
creation work since we've added clone3().

Cc: linux-arch@vger.kernel.org
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>A
Acked-by: Stafford Horne <shorne@gmail.com>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>A
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-04 23:41:37 +02:00
Christian Brauner 140c8180eb
arch: remove HAVE_COPY_THREAD_TLS
All architectures support copy_thread_tls() now, so remove the legacy
copy_thread() function and the HAVE_COPY_THREAD_TLS config option. Everyone
uses the same process creation calling convention based on
copy_thread_tls() and struct kernel_clone_args. This will make it easier to
maintain the core process creation code under kernel/, simplifies the
callpaths and makes the identical for all architectures.

Cc: linux-arch@vger.kernel.org
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-04 23:41:37 +02:00
Harald Freudenberger 74ecbef7b9 s390/zcrypt: code beautification and struct field renames
Some beautifications related to the internal only used
struct ap_message and related code. Instead of one int carrying
only the special flag now a u32 flags field is used.

At struct CPRBX the pointers to additional data are now marked
with __user. This caused some changes needed on code, where
these structs are also used within the zcrypt misc functions.

The ica_rsa_* structs now use the generic types __u8, __u32, ...
instead of char, unsigned int.

zcrypt_msg6 and zcrypt_msg50 use min_t() instead of min().

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:34 +02:00
David Hildenbrand 0ef5d691aa s390/extmem: remove stale -ENOSPC comment and handling
segment_load() will no longer return -ENOSPC. If a segment overlaps with
storage, we now also return -EBUSY. Remove the stale comment from
__segment_load() and the stale handling from segment_warning().

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200630084240.8283-1-david@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:16 +02:00
Heiko Carstens 8e1398f898 s390/smp: add missing linebreak
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:13 +02:00
Heiko Carstens 24840e76bf s390/smp: move smp_cpus_done() to header file
Saves us a couple of bytes.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-03 10:49:08 +02:00
Heiko Carstens 9e9f85e029 s390: update defconfigs
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-01 20:02:38 +02:00
Thomas Richter 5aa98879ef s390/cpum_sf: prohibit callchain data collection
CPU Measurement sampling facility on s390 does not support
perf tool collection of callchain data using --call-graph
option. The sampling facility collects samples in a ring
buffer which includes only the instruction address the
samples were taken. When the ring buffer hits a watermark,
a measurement alert interrupt is triggered and handled
by the performance measurement unit (PMU) device driver.
It collects the samples and feeds each sample to the
perf ring buffer in the common code via functions
perf_prepare_sample()/perf_output_sample(). When function
perf_prepare_sample() is called to collect sample data's
callchain, user register values or stack area, invalid
data is picked, because the context of the collected
information does not match the context when the sample
was taken.

There is currently no way to provide the callchain and other
information, because the hardware sampler does not collect this
information.

Therefore prohibit sampling when the user requests a callchain graph
from the hardware sampler. Return -EOPNOTSUPP to the user in this
case.
If call chains are really wanted, users need to specify software
event cpu-clock to get the callchain information from a
software event.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-01 20:02:33 +02:00
David Hildenbrand f05f62d042 s390/vmem: get rid of memory segment list
I can't come up with a satisfying reason why we still need the memory
segment list. We used to represent in the list:
- boot memory
- standby memory added via add_memory()
- loaded dcss segments

When loading/unloading dcss segments, we already track them in a
separate list and check for overlaps
(arch/s390/mm/extmem.c:segment_overlaps_others()) when loading segments.

The overlap check was introduced for some segments in
commit b2300b9efe ("[S390] dcssblk: add >2G DCSSs support and stacked
contiguous DCSSs support.")
and was extended to cover all dcss segments in
commit ca57114609 ("s390/extmem: remove code for 31 bit addressing
mode").

Although I doubt that overlaps with boot memory and standby memory
are relevant, let's reshuffle the checks in load_segment() to request
the resource first. This will bail out in case we have overlaps with
other resources (esp. boot memory and standby memory). The order
is now different compared to segment_unload() and segment_unload(), but
that should not matter.

This smells like a leftover from ancient times, let's get rid of it. We
can now convert vmem_remove_mapping() into a void function - everybody
ignored the return value already.

Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200625150029.45019-1-david@redhat.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [DCSS]
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-01 20:00:49 +02:00
Sven Schnelle 66a049b764 s390/stp: allow group and users to read stp sysfs files
There are no secrets in these files, so allow all users
to read it.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-07-01 20:00:43 +02:00
Herbert Xu 7999096fa9 iov_iter: Move unnecessary inclusion of crypto/hash.h
The header file linux/uio.h includes crypto/hash.h which pulls in
most of the Crypto API.  Since linux/uio.h is used throughout the
kernel this means that every tiny bit of change to the Crypto API
causes the entire kernel to get rebuilt.

This patch fixes this by moving it into lib/iov_iter.c instead
where it is actually used.

This patch also fixes the ifdef to use CRYPTO_HASH instead of just
CRYPTO which does not guarantee the existence of ahash.

Unfortunately a number of drivers were relying on linux/uio.h to
provide access to linux/slab.h.  This patch adds inclusions of
linux/slab.h as detected by build failures.

Also skbuff.h was relying on this to provide a declaration for
ahash_request.  This patch adds a forward declaration instead.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-06-30 09:34:23 -04:00
Gustavo A. R. Silva 28ccce5f50 s390/appldata: use struct_size() helper
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle and, audited and
fixed manually.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Message-Id: <20200617212930.GA11728@embeddedor>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:32:34 +02:00
Heiko Carstens 6ffb3f6b46 s390/debug: remove struct __debug_entry from uapi
There is no interface to userspace which exposes anything that would
require the struct __debug_entry definition. Therefore remove it from
uapi. This allows to change the definition, since it is only kernel
internally used.

The only exception is the crash utility, however that tool must handle
changes all the time anyway.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:32:25 +02:00
Heiko Carstens ecb1ff6833 s390/debug: remove raw view
There is not a single user of the debug raw view. Therefore remove it
before anybody uses it. If anybody would make use of the view it would
expose the struct __debug_entry definition to userspace and really
would make it uapi. This wouldn't be good, since the definition is
suboptimal and needs to be changed.

Right now the structure definition is only defined to be uapi, however
there is no user.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:32:20 +02:00
Sven Schnelle 7fa0d6ff35 s390/time: remove unused function
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:32:14 +02:00
Sven Schnelle 90ce70f065 s390/pci: remove unused functions
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:32:09 +02:00
Sven Schnelle 0188d08a46 s390: convert to msecs_to_jiffies()
Instead of using the old 'jiffies + HZ {/,*} something' calculation
use msecs_to_jiffies() as that makes the code more readable.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:31:46 +02:00
Vasily Gorbik 95e61b1b5d s390/setup: init jump labels before command line parsing
Command line parameters might set static keys. This is true for s390 at
least since commit 6471384af2 ("mm: security: introduce init_on_alloc=1
and init_on_free=1 boot options"). To avoid the following WARN:

static_key_enable_cpuslocked(): static key 'init_on_alloc+0x0/0x40' used
before call to jump_label_init()

call jump_label_init() just before parse_early_param().
jump_label_init() is safe to call multiple times (x86 does that), doesn't
do any memory allocations and hence should be safe to call that early.

Fixes: 6471384af2 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Cc: <stable@vger.kernel.org> # 5.3: d6df52e9996d: s390/maccess: add no DAT mode to kernel_write
Cc: <stable@vger.kernel.org> # 5.3
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:28:39 +02:00
Vasily Gorbik d6df52e999 s390/maccess: add no DAT mode to kernel_write
To be able to patch kernel code before paging is initialized do plain
memcpy if DAT is off. This is required to enable early jump label
initialization.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:26:36 +02:00
Niklas Schnelle 3047766bc6 s390/pci: fix enabling a reserved PCI function
In usual IPL or hot plug scenarios a zPCI function transitions directly
from reserved (invisible to Linux) to configured state or is configured
by Linux itself using an SCLP, however it can also first go from
reserved to standby and then from standby to configured without
Linux initiative.
In this scenario we first get a PEC event 0x302 and then 0x301.  This may
happen for example when the device is deconfigured at another LPAR and
made available for this LPAR. It may also happen under z/VM when
a device is attached while in some inconsistent state.

However when we get the 0x301 the device is already known to zPCI
so calling zpci_create() will add it twice resulting in the below
BUG. Instead we should only enable the existing device and finally
scan it through the PCI subsystem.

list_add double add: new=00000000ed5a9008, prev=00000000ed5a9008, next=0000000083502300.
kernel BUG at lib/list_debug.c:31!
Krnl PSW : 0704c00180000000 0000000082dc2db8 (__list_add_valid+0x70/0xa8)
Call Trace:
 [<0000000082dc2db8>] __list_add_valid+0x70/0xa8
([<0000000082dc2db4>] __list_add_valid+0x6c/0xa8)
 [<00000000828ea920>] zpci_create_device+0x60/0x1b0
 [<00000000828ef04a>] zpci_event_availability+0x282/0x2f0
 [<000000008315f848>] chsc_process_crw+0x2b8/0xa18
 [<000000008316735c>] crw_collect_info+0x254/0x348
 [<00000000829226ea>] kthread+0x14a/0x168
 [<000000008319d5c0>] ret_from_fork+0x24/0x2c

Fixes: f606b3ef47 ("s390/pci: adapt events for zbus")
Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-29 16:26:28 +02:00
Christian Borntraeger 827c491392 s390/debug: avoid kernel warning on too large number of pages
When specifying insanely large debug buffers a kernel warning is
printed. The debug code does handle the error gracefully, though.
Instead of duplicating the check let us silence the warning to
avoid crashes when panic_on_warn is used.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:55 +02:00
Vasily Gorbik 998f5bbe3d s390/kasan: fix early pgm check handler execution
Currently if early_pgm_check_handler is called it ends up in pgm check
loop. The problem is that early_pgm_check_handler is instrumented by
KASAN but executed without DAT flag enabled which leads to addressing
exception when KASAN checks try to access shadow memory.

Fix that by executing early handlers with DAT flag on under KASAN as
expected.

Reported-and-tested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:50 +02:00
Sven Schnelle e64a1618af s390: fix system call single stepping
When single stepping an svc instruction on s390, the kernel is entered
with a PER program check interruption. The program check handler than
jumps to the system call handler by reloading the PSW. The code didn't
set GPR13 to the thread pointer in struct task_struct. This made the
kernel access invalid memory while trying to fetch the syscall function
address. Fix this by always assigned GPR13 after .Lsysc_per.

Fixes: 0b0ed657fe ("s390: remove critical section cleanup from entry.S")
Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2020-06-23 14:05:45 +02:00
Collin Walling 23a60f8344 s390/kvm: diagnose 0x318 sync and reset
DIAGNOSE 0x318 (diag318) sets information regarding the environment
the VM is running in (Linux, z/VM, etc) and is observed via
firmware/service events.

This is a privileged s390x instruction that must be intercepted by
SIE. Userspace handles the instruction as well as migration. Data
is communicated via VCPU register synchronization.

The Control Program Name Code (CPNC) is stored in the SIE block. The
CPNC along with the Control Program Version Code (CPVC) are stored
in the kvm_vcpu_arch struct.

This data is reset on load normal and clear resets.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200622154636.5499-3-walling@linux.ibm.com
[borntraeger@de.ibm.com: fix sync_reg position]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-06-23 10:55:33 +02:00
Collin Walling a23816f3cd s390/setup: diag 318: refactor struct
The diag 318 struct introduced in include/asm/diag.h can be
reused in KVM, so let's condense the version code fields in the
diag318_info struct for easier usage and simplify it until we
can determine how the data should be formatted.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20200622154636.5499-2-walling@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-06-23 09:16:48 +02:00
Linus Torvalds 1566feea45 s390 fixes for 5.8-rc2
- Few ptrace fixes mostly for strace and seccomp_bpf kernel tests
   findings.
 
 - Cleanup unused pm callbacks in virtio ccw.
 
 - Replace kmalloc + memset with kzalloc in crypto.
 
 - Use $(LD) for vDSO linkage to make clang happy.
 
 - Fix vDSO clock_getres() to preserve the same behaviour as
   posix_get_hrtimer_res().
 
 - Fix workqueue cpumask warning when NUMA=n and nr_node_ids=2.
 
 - Reduce SLSB writes during input processing, improve warnings and
   cleanup qdio_data usage in qdio.
 
 - Few fixes to use scnprintf() instead of snprintf().
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl7uJy8ACgkQjYWKoQLX
 FBitMwgAovHP6O19ZS2RE2Ps20CjM+z0sLLGHF6aMrV7OqmOWrNnFzN4jT2j42Ck
 idSZ6sehVd3Uj6K8NnzrlSS3sjGRhVaQJEjjN+rLyw0HBwxspJJfW5HgcoMtqNH1
 oo+nt+zw5jk+6MqHx4QEwTxN5rgGs6UMhiLIAIlkDu4bivgohvGUxe4RUrN/mINx
 cdYqomCkvovLT5sBTaWyXKNCDAdAWgNpOfdqc9MjOUXSbUg3lrUol0gUULzenPo7
 wUN+sZ0di0Ox0+2+4m8LU1av/kMTLSSvnR9DW5KdpGTon1nwpZcdJnhI5o1v7uaU
 pIaMOYNieEHJ2DnieR9iBBSbGoNCmw==
 =gkgN
 -----END PGP SIGNATURE-----

Merge tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - a few ptrace fixes mostly for strace and seccomp_bpf kernel tests
   findings

 - cleanup unused pm callbacks in virtio ccw

 - replace kmalloc + memset with kzalloc in crypto

 - use $(LD) for vDSO linkage to make clang happy

 - fix vDSO clock_getres() to preserve the same behaviour as
   posix_get_hrtimer_res()

 - fix workqueue cpumask warning when NUMA=n and nr_node_ids=2

 - reduce SLSB writes during input processing, improve warnings and
   cleanup qdio_data usage in qdio

 - a few fixes to use scnprintf() instead of snprintf()

* tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: fix syscall_get_error for compat processes
  s390/qdio: warn about unexpected SLSB states
  s390/qdio: clean up usage of qdio_data
  s390/numa: let NODES_SHIFT depend on NEED_MULTIPLE_NODES
  s390/vdso: fix vDSO clock_getres()
  s390/vdso: Use $(LD) instead of $(CC) to link vDSO
  s390/protvirt: use scnprintf() instead of snprintf()
  s390: use scnprintf() in sys_##_prefix##_##_name##_show
  s390/crypto: use scnprintf() instead of snprintf()
  s390/zcrypt: use kzalloc
  s390/virtio: remove unused pm callbacks
  s390/qdio: reduce SLSB writes during Input Queue processing
  selftests/seccomp: s390 shares the syscall and return value register
  s390/ptrace: fix setting syscall number
  s390/ptrace: pass invalid syscall numbers to tracing
  s390/ptrace: return -ENOSYS when invalid syscall is supplied
  s390/seccomp: pass syscall arguments via seccomp_data
  s390/qdio: fine-tune SLSB update
2020-06-20 12:31:08 -07:00
Christoph Hellwig 25f12ae45f maccess: rename probe_kernel_address to get_kernel_nofault
Better describe what this helper does, and match the naming of
copy_from_kernel_nofault.

Also switch the argument order around, so that it acts and looks
like get_user().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18 11:14:40 -07:00
Christian Borntraeger 774911290c KVM: s390: reduce number of IO pins to 1
The current number of KVM_IRQCHIP_NUM_PINS results in an order 3
allocation (32kb) for each guest start/restart. This can result in OOM
killer activity even with free swap when the memory is fragmented
enough:

kernel: qemu-system-s39 invoked oom-killer: gfp_mask=0x440dc0(GFP_KERNEL_ACCOUNT|__GFP_COMP|__GFP_ZERO), order=3, oom_score_adj=0
kernel: CPU: 1 PID: 357274 Comm: qemu-system-s39 Kdump: loaded Not tainted 5.4.0-29-generic #33-Ubuntu
kernel: Hardware name: IBM 8562 T02 Z06 (LPAR)
kernel: Call Trace:
kernel: ([<00000001f848fe2a>] show_stack+0x7a/0xc0)
kernel:  [<00000001f8d3437a>] dump_stack+0x8a/0xc0
kernel:  [<00000001f8687032>] dump_header+0x62/0x258
kernel:  [<00000001f8686122>] oom_kill_process+0x172/0x180
kernel:  [<00000001f8686abe>] out_of_memory+0xee/0x580
kernel:  [<00000001f86e66b8>] __alloc_pages_slowpath+0xd18/0xe90
kernel:  [<00000001f86e6ad4>] __alloc_pages_nodemask+0x2a4/0x320
kernel:  [<00000001f86b1ab4>] kmalloc_order+0x34/0xb0
kernel:  [<00000001f86b1b62>] kmalloc_order_trace+0x32/0xe0
kernel:  [<00000001f84bb806>] kvm_set_irq_routing+0xa6/0x2e0
kernel:  [<00000001f84c99a4>] kvm_arch_vm_ioctl+0x544/0x9e0
kernel:  [<00000001f84b8936>] kvm_vm_ioctl+0x396/0x760
kernel:  [<00000001f875df66>] do_vfs_ioctl+0x376/0x690
kernel:  [<00000001f875e304>] ksys_ioctl+0x84/0xb0
kernel:  [<00000001f875e39a>] __s390x_sys_ioctl+0x2a/0x40
kernel:  [<00000001f8d55424>] system_call+0xd8/0x2c8

As far as I can tell s390x does not use the iopins as we bail our for
anything other than KVM_IRQ_ROUTING_S390_ADAPTER and the chip/pin is
only used for KVM_IRQ_ROUTING_IRQCHIP. So let us use a small number to
reduce the memory footprint.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200617083620.5409-1-borntraeger@de.ibm.com
2020-06-18 09:48:19 +02:00
Dmitry V. Levin b3583fca5f s390: fix syscall_get_error for compat processes
If both the tracer and the tracee are compat processes, and gprs[2]
is assigned a value by __poke_user_compat, then the higher 32 bits
of gprs[2] are cleared, IS_ERR_VALUE() always returns false, and
syscall_get_error() always returns 0.

Fix the implementation by sign-extending the value for compat processes
the same way as x86 implementation does.

The bug was exposed to user space by commit 201766a20e ("ptrace: add
PTRACE_GET_SYSCALL_INFO request") and detected by strace test suite.

This change fixes strace syscall tampering on s390.

Link: https://lkml.kernel.org/r/20200602180051.GA2427@altlinux.org
Fixes: 753c4dd6a2 ("[S390] ptrace changes")
Cc: Elvira Khabirova <lineprinter@altlinux.org>
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-06-17 23:05:05 +02:00