Commit Graph

738552 Commits

Author SHA1 Message Date
Jerry Hoemann 2b3d89b402 watchdog: hpwdt: Remove legacy NMI sourcing.
Gen8 and prior Proliant systems supported the "CRU" interface
to firmware.  This interfaces allows linux to "call back" into firmware
to source the cause of an NMI.  This feature isn't fully utilized
as the actual source of the NMI isn't printed, the driver only
indicates that the source couldn't be determined when the call
fails.

With the advent of Gen9, iCRU replaces the CRU. The call back
feature is no longer available in firmware.  To be compatible and
not attempt to call back into firmware on system not supporting CRU,
the SMBIOS table is consulted to determine if it is safe to
make the call back or not.

This results in about half of the driver code being devoted
to either making CRU calls or determing if it is safe to make
CRU calls.  As noted, the driver isn't really using the results of
the CRU calls.

Furthermore, as a consequence of the Spectre security issue, the
BIOS/EFI calls are being wrapped into Spectre-disabling section.
Removing the call back in hpwdt_pretimeout assists in this effort.

As the CRU sourcing of the NMI isn't required for handling the
NMI and there are security concerns with making the call back, remove
the legacy (pre Gen9) NMI sourcing and the DMI code to determine if
the system had the CRU interface.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2018-03-03 15:52:33 +01:00
Jayachandran C 93ac3deb7c watchdog: sbsa: use 32-bit read for WCV
According to SBSA spec v3.1 section 5.3:
  All registers are 32 bits in size and should be accessed using
  32-bit reads and writes. If an access size other than 32 bits
  is used then the results are IMPLEMENTATION DEFINED.
  [...]
  The Generic Watchdog is little-endian

The current code uses readq to read the watchdog compare register
which does a 64-bit access. This fails on ThunderX2 which does not
implement 64-bit access to this register.

Fix this by using lo_hi_readq() that does two 32-bit reads.

Signed-off-by: Jayachandran C <jnair@caviumnetworks.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2018-03-03 15:52:32 +01:00
Igor Pylypiv 7bd3e7b743 watchdog: f71808e_wdt: Fix magic close handling
Watchdog close is "expected" when any byte is 'V' not just the last one.
Writing "V" to the device fails because the last byte is the end of string.

$ echo V > /dev/watchdog
f71808e_wdt: Unexpected close, not stopping watchdog!

Signed-off-by: Igor Pylypiv <igor.pylypiv@gmail.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2018-03-03 15:52:32 +01:00
Laurent Vivier 61bd0f66ff KVM: PPC: Book3S HV: Fix guest time accounting with VIRT_CPU_ACCOUNTING_GEN
Since commit 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing
for signals on guest entry"), if CONFIG_VIRT_CPU_ACCOUNTING_GEN is set, the
guest time is not accounted to guest time and user time, but instead to
system time.

This is because guest_enter()/guest_exit() are called while interrupts
are disabled and the tick counter cannot be updated between them.

To fix that, move guest_exit() after local_irq_enable(), and as
guest_enter() is called with IRQ disabled, call guest_enter_irqoff()
instead.

Fixes: 8b24e69fc4 ("KVM: PPC: Book3S HV: Close race with testing for signals on guest entry")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-03 19:28:34 +11:00
Linus Torvalds 03a6c2592f KVM fixes for v4.16-rc4
x86:
 - fix NULL dereference when using userspace lapic
 - optimize spectre v1 mitigations by allowing guests to use LFENCE
 - make microcode revision configurable to prevent guests from
   unnecessarily blacklisting spectre v2 mitigation features
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJambvzAAoJEED/6hsPKofo9HwH/2il8xNSLIYf9pJtxZo/puyQ
 ZSwByGdeLKBZ1GP1dhdZ8kMk3eBoci0a/sQJmhDiEG6GDf1Mrgri/xj3p60sWwXT
 iReG+ZhBKg4QMj/IgOJQrh+53JT73QQP14wIhzc/DSi0Fo0ziqDA/lINxqMKc7oF
 b5qratjsb4xF1db4d1g8Ii1VRk64UoBEVpEoP37OOyAu1rgXgDr+9C832KkP0rb+
 pVYT8hLFiaYiwVN+WN52/NrIkqBlMvMp3ouRtMAajCQ9OznnraDJE6eTkPGDSDBM
 RizSuQbev5R7pcmzCAP9/XKfTbeSUZQei2ZFXQXAOvIXMrQd/ITjbPvnObsceSE=
 =wtQd
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "x86:

   - fix NULL dereference when using userspace lapic

   - optimize spectre v1 mitigations by allowing guests to use LFENCE

   - make microcode revision configurable to prevent guests from
     unnecessarily blacklisting spectre v2 mitigation feature"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: fix vcpu initialization with userspace lapic
  KVM: X86: Allow userspace to define the microcode version
  KVM: X86: Introduce kvm_get_msr_feature()
  KVM: SVM: Add MSR-based feature support for serializing LFENCE
  KVM: x86: Add a framework for supporting MSR-based features
2018-03-02 19:40:43 -08:00
Dan Williams 949b93250a memremap: fix softlockup reports at teardown
The cond_resched() currently in the setup path needs to be duplicated in
the teardown path. Rather than require each instance of
for_each_device_pfn() to open code the same sequence, embed it in the
helper.

Link: https://github.com/intel/ixpdimm_sw/issues/11
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Fixes: 7138970383 ("mm, zone_device: Replace {get, put}_zone_device_page()...")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-03-02 19:34:50 -08:00
Dave Jiang 5fdf8e5ba5 libnvdimm: re-enable deep flush for pmem devices via fsync()
Re-enable deep flush so that users always have a way to be sure that a
write makes it all the way out to media. Writes from the PMEM driver
always arrive at the NVDIMM since movnt is used to bypass the cache, and
the driver relies on the ADR (Asynchronous DRAM Refresh) mechanism to
flush write buffers on power failure. The Deep Flush mechanism is there
to explicitly write buffers to protect against (rare) ADR failure.  This
change prevents a regression in deep flush behavior so that applications
can continue to depend on fsync() as a mechanism to trigger deep flush
in the filesystem-DAX case.

Fixes: 06e8ccdab1 ("acpi: nfit: Add support for detect platform CPU cache...")
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-03-02 19:31:40 -08:00
Masahiro Yamada 50186e121e MAINTAINERS: take over Kconfig maintainership
I have recently picked up Kconfig patches to my tree without any
declaration.  Making it official now.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-03-03 12:28:27 +09:00
Dan Williams 94db151dc8 vfio: disable filesystem-dax page pinning
Filesystem-DAX is incompatible with 'longterm' page pinning. Without
page cache indirection a DAX mapping maps filesystem blocks directly.
This means that the filesystem must not modify a file's block map while
any page in a mapping is pinned. In order to prevent the situation of
userspace holding of filesystem operations indefinitely, disallow
'longterm' Filesystem-DAX mappings.

RDMA has the same conflict and the plan there is to add a 'with lease'
mechanism to allow the kernel to notify userspace that the mapping is
being torn down for block-map maintenance. Perhaps something similar can
be put in place for vfio.

Note that xfs and ext4 still report:

   "DAX enabled. Warning: EXPERIMENTAL, use at your own risk"

...at mount time, and resolving the dax-dma-vs-truncate problem is one
of the last hurdles to remove that designation.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: kvm@vger.kernel.org
Cc: <stable@vger.kernel.org>
Reported-by: Haozhong Zhang <haozhong.zhang@intel.com>
Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
Fixes: d475c6346a ("dax,ext2: replace XIP read and write with DAX I/O")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-03-02 18:00:04 -08:00
Linus Torvalds 329ad5e544 pci-v4.16-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlqZy4cUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vzyuw/+M3EDZaGd0SMj5k47qll1udstDEBL
 /WkCVTgrVov4TqMh+x5Pba7Yd+/2vFeIx2QbruYS7uGF8YwXMtET6+RJg+vUUMkW
 ryIMKl0QRb9X11qJr5GtgOqij//wtV8BXPimGczB02d8ganJFhbxrJSx87z1lshm
 g0u5ZNupddMHfT3JW0H4c2AFrkW9+KXjosShPuVzQbkhXxMAk+zbnuC6wRd1+GJV
 4BewJMHLNFTg4cEz7O8ao7fRyYsHsaV65HdFoTlSGPrv3PEGZTqapmF9ATbEJTQd
 CNQ0QMN+ewFMwZh17oCs0eEkoSLeyzMFVUfXs4IIpMRTxyJbLzuqwlU5VaCtXdkT
 ntlETQ09MmNqvyusAmKvrqBjjw2clgvO7NCjJhaMY1ehOH6sEcbvOx83xp6P1RUc
 mTqOVV7TbSXyyRFOEa6KunyC2tKmV+t2Cl4dOxVKwaja5J/Eh8GhGlM/r7qt4pEa
 h3roNNE1VilHYR/8VcpCVydyueruaMSNrHzlhmIQw+UcsMDzEjBKzuxGFUYNU+0H
 2wXZacv2MaDj3qndh8ZS9IHWW1bw6PMBCdf13HOPdEs9DjHX5uI1RsB49Z6rSM75
 xXoiB/MHlK/3YgZNB2eEBQjFHfMDD1ylwsnIbeIMhvk074rE8d56C8uNXuBvYYPn
 dakUdnUB8abeeD0=
 =0inR
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Update pci.ids location (documentation only) (Randy Dunlap)

 - Fix a crash when BIOS didn't assign a BAR and we try to enlarge it
   (Christian König)

* tag 'pci-v4.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Allow release of resources that were never assigned
  PCI: Update location of pci.ids file
2018-03-02 17:44:39 -08:00
David S. Miller 4a0c7191c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Put back reference on CLUSTERIP configuration structure from the
   error path, patch from Florian Westphal.

2) Put reference on CLUSTERIP configuration instead of freeing it,
   another cpu may still be walking over it, also from Florian.

3) Refetch pointer to IPv6 header from nf_nat_ipv6_manip_pkt() given
   packet manipulation may reallocation the skbuff header, from Florian.

4) Missing match size sanity checks in ebt_among, from Florian.

5) Convert BUG_ON to WARN_ON in ebtables, from Florian.

6) Sanity check userspace offsets from ebtables kernel, from Florian.

7) Missing checksum replace call in flowtable IPv4 DNAT, from Felix
   Fietkau.

8) Bump the right stats on checksum error from bridge netfilter,
   from Taehee Yoo.

9) Unset interface flag in IPv6 fib lookups otherwise we get
   misleading routing lookup results, from Florian.

10) Missing sk_to_full_sk() in ip6_route_me_harder() from Eric Dumazet.

11) Don't allow devices to be part of multiple flowtables at the same
    time, this may break setups.

12) Missing netlink attribute validation in flowtable deletion.

13) Wrong array index in nf_unregister_net_hook() call from error path
    in flowtable addition path.

14) Fix FTP IPVS helper when NAT mangling is in place, patch from
    Julian Anastasov.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02 20:32:15 -05:00
Matt Redfearn fde9fc766e
signals: Move put_compat_sigset to compat.h to silence hardened usercopy
Since commit afcc90f862 ("usercopy: WARN() on slab cache usercopy
region violations"), MIPS systems booting with a compat root filesystem
emit a warning when copying compat siginfo to userspace:

WARNING: CPU: 0 PID: 953 at mm/usercopy.c:81 usercopy_warn+0x98/0xe8
Bad or missing usercopy whitelist? Kernel memory exposure attempt
detected from SLAB object 'task_struct' (offset 1432, size 16)!
Modules linked in:
CPU: 0 PID: 953 Comm: S01logging Not tainted 4.16.0-rc2 #10
Stack : ffffffff808c0000 0000000000000000 0000000000000001 65ac85163f3bdc4a
	65ac85163f3bdc4a 0000000000000000 90000000ff667ab8 ffffffff808c0000
	00000000000003f8 ffffffff808d0000 00000000000000d1 0000000000000000
	000000000000003c 0000000000000000 ffffffff808c8ca8 ffffffff808d0000
	ffffffff808d0000 ffffffff80810000 fffffc0000000000 ffffffff80785c30
	0000000000000009 0000000000000051 90000000ff667eb0 90000000ff667db0
	000000007fe0d938 0000000000000018 ffffffff80449958 0000000020052798
	ffffffff808c0000 90000000ff664000 90000000ff667ab0 00000000100c0000
	ffffffff80698810 0000000000000000 0000000000000000 0000000000000000
	0000000000000000 0000000000000000 ffffffff8010d02c 65ac85163f3bdc4a
	...
Call Trace:
[<ffffffff8010d02c>] show_stack+0x9c/0x130
[<ffffffff80698810>] dump_stack+0x90/0xd0
[<ffffffff80137b78>] __warn+0x100/0x118
[<ffffffff80137bdc>] warn_slowpath_fmt+0x4c/0x70
[<ffffffff8021e4a8>] usercopy_warn+0x98/0xe8
[<ffffffff8021e68c>] __check_object_size+0xfc/0x250
[<ffffffff801bbfb8>] put_compat_sigset+0x30/0x88
[<ffffffff8011af24>] setup_rt_frame_n32+0xc4/0x160
[<ffffffff8010b8b4>] do_signal+0x19c/0x230
[<ffffffff8010c408>] do_notify_resume+0x60/0x78
[<ffffffff80106f50>] work_notifysig+0x10/0x18
---[ end trace 88fffbf69147f48a ]---

Commit 5905429ad8 ("fork: Provide usercopy whitelisting for
task_struct") noted that:

"While the blocked and saved_sigmask fields of task_struct are copied to
userspace (via sigmask_to_save() and setup_rt_frame()), it is always
copied with a static length (i.e. sizeof(sigset_t))."

However, this is not true in the case of compat signals, whose sigset
is copied by put_compat_sigset and receives size as an argument.

At most call sites, put_compat_sigset is copying a sigset from the
current task_struct. This triggers a warning when
CONFIG_HARDENED_USERCOPY is active. However, by marking this function as
static inline, the warning can be avoided because in all of these cases
the size is constant at compile time, which is allowed. The only site
where this is not the case is handling the rt_sigpending syscall, but
there the copy is being made from a stack local variable so does not
trigger the warning.

Move put_compat_sigset to compat.h, and mark it static inline. This
fixes the WARN on MIPS.

Fixes: afcc90f862 ("usercopy: WARN() on slab cache usercopy region violations")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Dmitry V . Levin" <ldv@altlinux.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/18639/
Signed-off-by: James Hogan <jhogan@kernel.org>
2018-03-02 21:31:55 +00:00
Linus Torvalds 5fbdefcf68 Merge branch 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:

 - a patch to change the ordering of cache and TLB flushes to hopefully
   fix the random segfaults we very rarely face (by Dave Anglin).

 - a patch to hide the virtual kernel memory layout due to security
   reasons.

 - two small patches to make the kernel run more smoothly under qemu.

* 'parisc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Reduce irq overhead when run in qemu
  parisc: Use cr16 interval timers unconditionally on qemu
  parisc: Check if secondary CPUs want own PDC calls
  parisc: Hide virtual kernel memory layout
  parisc: Fix ordering of cache and TLB flushes
2018-03-02 13:05:20 -08:00
Linus Torvalds 0573fed92b xen: fixes for v4.16-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJamYZrAAoJELDendYovxMv8N4H/A2HOfHGnmrg+Q1eLf0vRzOD
 +5MsVdjpYCqfkbFF+ITTC/yQL6sQfYIA9pzFKzmyabO3xwXGtg0sJToBbQhtmVDh
 opp/2bYyG8VN+Pmhe9Rc7L0ON0ShDeCs+J5L/8scPE52EKLiinLlBGWMgwIFYMII
 EehLWHtWiVjHG+Od1nnGNJhuxhWzk5FqdTBBerUt4+ra2zT0Luhe3iVnXl7f3I81
 EpYbjNZ7D+yOwVoGJf200RHiGr/ItavQYvTjYP9Mau4InRlOIs6COhWTSRjGvuiC
 Sp4Ra5mR/eSgPb0VO1Qc0nOjZqdgiQEM7hzmKKP+A1gbLhsYH3ghhuHmQOPlbe0=
 =e01A
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.16a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Five minor fixes for Xen-specific drivers"

* tag 'for-linus-4.16a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  pvcalls-front: 64-bit align flags
  x86/xen: add tty0 and hvc0 as preferred consoles for dom0
  xen-netfront: Fix hang on device removal
  xen/pirq: fix error path cleanup when binding MSIs
  xen/pvcalls: fix null pointer dereference on map->sock
2018-03-02 10:19:57 -08:00
Linus Torvalds 2833419a62 A cap handling fix from Zhi that ensures that metadata writeback isn't
delayed and three error path memory leak fixups from Chengguang.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJamV3qAAoJEEp/3jgCEfOLv6kIAI0RXShR20dz9M3GnL2bDws7
 KFkC/aNwbviXMAih3q7Q7ZiwXv6BI/uUK0YIYG/tWZoNHucYGS8ZOFDikdiIMcJR
 2VP8Yas1/NvnLlh0ut45s2imvbXinkHcjWOLqgJJmJ2cEP9Ar/mbwz//TgfVMdNm
 TSHEm2LPRge9jK3Ab/ZiatExGH6dx5hRJvssSauF+f2iN+q4nKj2aBqle+FKaNWT
 Iv0ONi2f7aXnaXRvum4Y1gnKWXzKSFd8Y2Cr5R3tfTrKBv8nmQz54eRsMnnuxK/7
 uV6Ujch4LUXBI1etXLqTwRdZQIcw3Li5PYxGrO8fl8mQfJau9yv5REw/rbgB5FU=
 =f8gA
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-4.16-rc4' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A cap handling fix from Zhi that ensures that metadata writeback isn't
  delayed and three error path memory leak fixups from Chengguang"

* tag 'ceph-for-4.16-rc4' of git://github.com/ceph/ceph-client:
  ceph: fix potential memory leak in init_caches()
  ceph: fix dentry leak when failing to init debugfs
  libceph, ceph: avoid memory leak when specifying same option several times
  ceph: flush dirty caps of unlinked inode ASAP
2018-03-02 10:05:10 -08:00
Linus Torvalds fb6d47a592 for-linus-20180302
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJamXImAAoJEPfTWPspceCmEP4P/3kkm0JIXtbNZFMb1JZtsjwE
 t4OUVEDj4jjRfmZfUVkajPnczM4MSPiXm43PbcOi4NF53mv8k76jyIPhlZREzYzq
 MBknibvpqyiWpbii9tBRrR6FGDR/N51//ya9vdPaYBcBssTg6Aqtt4BE5oPfo011
 PleGROe1jtrBUNBy2dMy4sHb/MvZ0vRuNPxMsD8Agy5UiVeItAelY/lDn1Hw41BY
 O+muE5bw6+yKqB9vGXhV3O4WRh8BofJi1YdzbwbbIzH40ZZK5VTDQc5o19/CFEZ/
 uZ8BStOFEWA0LNuarME5fknWcogiedEtszweyiWBbVZo4VqCsfxPoaRCibY/Wg5F
 a0UNJ4iSzglhfSMoHJlhvlCAMCyubFSeMSdJjrrpIcyBrziJXpcEXcUnWI43yi4P
 FoM8zUni22XnfLWxIdTjVkMRytjtqTLcXOHXdP5N/ESa80jBq3Q76TLmzIKW+kK5
 sAre+hgr52NdgovP/NSxsdvsckAolWNe40JI8wLbwNo+lMHr0ckzOG+sAdz1iPRK
 iVL0CAlby4A94Wcu+OHCwfY7B9lBrMuMfHsesEM6x1cxgAhd3YNfEJ8g2QolCUEV
 KmZizXbV9nnmJfegVC06SgM+D7AR26dwsBG2aoibShuvdxX6jMdUHygyu5DCJdg/
 JS+q71jmxb/r1TWe/62r
 =AMhV
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180302' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A collection of fixes for this series. This is a little larger than
  usual at this time, but that's mainly because I was out on vacation
  last week. Nothing in here is major in any way, it's just two weeks of
  fixes. This contains:

   - NVMe pull from Keith, with a set of fixes from the usual suspects.

   - mq-deadline zone unlock fix from Damien, fixing an issue with the
     SMR zone locking added for 4.16.

   - two bcache fixes sent in by Michael, with changes from Coly and
     Tang.

   - comment typo fix from Eric for blktrace.

   - return-value error handling fix for nbd, from Gustavo.

   - fix a direct-io case where we don't defer to a completion handler,
     making us sleep from IRQ device completion. From Jan.

   - a small series from Jan fixing up holes around handling of bdev
     references.

   - small set of regression fixes from Jiufei, mostly fixing problems
     around the gendisk pointer -> partition index change.

   - regression fix from Ming, fixing a boundary issue with the discard
     page cache invalidation.

   - two-patch series from Ming, fixing both a core blk-mq-sched and
     kyber issue around token freeing on a requeue condition"

* tag 'for-linus-20180302' of git://git.kernel.dk/linux-block: (24 commits)
  block: fix a typo
  block: display the correct diskname for bio
  block: fix the count of PGPGOUT for WRITE_SAME
  mq-deadline: Make sure to always unlock zones
  nvmet: fix PSDT field check in command format
  nvme-multipath: fix sysfs dangerously created links
  nbd: fix return value in error handling path
  bcache: fix kcrashes with fio in RAID5 backend dev
  bcache: correct flash only vols (check all uuids)
  blktrace_api.h: fix comment for struct blk_user_trace_setup
  blockdev: Avoid two active bdev inodes for one device
  genhd: Fix BUG in blkdev_open()
  genhd: Fix use after free in __blkdev_get()
  genhd: Add helper put_disk_and_module()
  genhd: Rename get_disk() to get_disk_and_module()
  genhd: Fix leaked module reference for NVME devices
  direct-io: Fix sleep in atomic due to sync AIO
  nvme-pci: Fix nvme queue cleanup if IRQ setup fails
  block: kyber: fix domain token leak during requeue
  blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch
  ...
2018-03-02 09:35:36 -08:00
Shuah Khan ba004a2955 selftests: memory-hotplug: fix emit_tests regression
Commit 16c513b134
("selftests: memory-hotplug: silence test command echo")

introduced regression in emit_tests and results in the following
failure when selftests are installed and run. Fix it.

Running tests in memory-hotplug
========================================
./run_kselftest.sh: line 121: @./mem-on-off-test.sh: No such file or
directory
selftests: memory-hotplug [FAIL]

Fixes: 16c513b134 (selftests: memory-hotplug: silence test command echo")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-03-02 10:12:59 -07:00
Linus Torvalds ff06b55ec4 MMC core:
- mmc: core: Avoid hang when claiming host
 
 MMC host:
  - dw_mmc: Avoid hang when accessing registers
  - dw_mmc: Fix out-of-bounds access for slot's caps
  - dw_mmc-k3: Fix out-of-bounds access through DT alias
  - sdhci-pci: Fix S0i3 for Intel BYT-based controllers
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJamTk2AAoJEP4mhCVzWIwp108P/RcicjtOpcvTKMYnt7l6hAQ6
 1Qr4tjAtj75FSVZooEKo1B13GTJOqgGh3O5x1nEcCGI2iA8ZV2XclN1QZhJeVJcF
 5B1/hAXMjRaUkXzUrD9pOi+m+s3IBK82hJ40ac4JHYUpRT7fRLkB1PdU0gV3V9yZ
 Uy967pL9spHMbgDDbGut2gnBt9MyTysRfvCnEUKPKvPPqL6QaOGNEEIlXQedrSoi
 vUvvQjpAAUMd3kDWhnxNileLyHUFatpaJOYtxfTWiXNgP9LHv1o1X2QbbagnXaqY
 AybOjvWinbKoaX6+5G+ZZJMrW14D6gG9AZ/vi57U7ta8NiKRWmn0KjzTk0VVKsbo
 5oKqjRT3Hu/SCqyqhVvo8dY+HSqhv0v1EtXVjfI1D8mxBt4v4fwnQrzNr9dBCouD
 Lb+l6hnZJq/xq3BDF+pZk0HiErGdXrXeq/pEKZXRL4EvX6TF0YZhc4VIyShDmRa5
 5ZNgCndkUQ9Y/LYYnuwDvqkqlysR49FTCw5fGZDxtJzud6rupgF+/0QGwyGjGCZk
 4Add0EcTKgWsSBcVwkyRJbDTmQB4tSVwAGMjAnlMJW3YX8chuzg2//DuW3DZLRao
 TH0PMPYTQeWclsz3ySBE/FU9IQhSQ44s0VUO+G+XxtfffaqOcMVs5D2ZKTjk8jWD
 lvdznNBylRrleG9K5zcA
 =pILk
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - mmc: core: Avoid hang when claiming host

  MMC host:
   - dw_mmc: Avoid hang when accessing registers
   - dw_mmc: Fix out-of-bounds access for slot's caps
   - dw_mmc-k3: Fix out-of-bounds access through DT alias
   - sdhci-pci: Fix S0i3 for Intel BYT-based controllers"

* tag 'mmc-v4.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Avoid hanging to claim host for mmc via some nested calls
  mmc: dw_mmc: Avoid accessing registers in runtime suspended state
  mmc: dw_mmc: Fix out-of-bounds access for slot's caps
  mmc: dw_mmc: Factor out dw_mci_init_slot_caps
  mmc: dw_mmc-k3: Fix out-of-bounds access through DT alias
  mmc: sdhci-pci: Fix S0i3 for Intel BYT-based controllers
2018-03-02 08:44:11 -08:00
Linus Torvalds a5c05b7459 Power management updates for v4.16-rc4
- Make the task scheduler load and utilization signals be
    frequency-invariant again after recent changes in the SCPI
    cpufreq driver (Dietmar Eggemann).
 
  - Drop an unnecessary leftover Kconfig dependency from the SCPI
    cpufreq driver (Sudeep Holla).
 
  - Fix the initialization of the s3c24xx cpufreq driver (Viresh
    Kumar).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJamS/lAAoJEILEb/54YlRxwzEP/RWq6hStcKa5ObYjNjJo7/9W
 37fnJkvGMmfl4rXGGcwVGmEJJpZhgRbx3vHpCZ0zvDnJtP8KRa79RhboSXPoTN8s
 KheBp9bggdszz0XWmrwP8pvQdO3l988zEztG3kEkiM6mxqWPuUUWduP3/RRPApA4
 i/i7BuQoQf1SIrWtv8DNbKg6Eha43GT1N/OvR8qzILmj1loaZRulrIyVPm4Cl/KG
 DrvT3Bod+EBshr8xs+GYsfvf07/hNPmRo2N8t5lpxy6ikiekZyfmyuQI1Fhq6W5U
 DxHmQOntaO/4YExgFGmWhp7V9rCRvICBY/leFOFY7UuZX/edtXwTtJz/7hz2IUnb
 wXu1EvUyU4ulyXaoqndeWhETQjkAgainszRAEyJBUNQjBF5CTEUI4ysZVB8H4cJp
 Bg5YWsO1a2w4u4IcIOudJ4mkfgmnAihr1O6S9f/DAGPYN7RqJn9i/tx1vrYd+VIi
 pULdYTEfTsfQUO+wBiorVYjJFIimJZj6Y1uw84oWoYN0f12pxegssLcKEfFqGtFd
 t5g5+XDpU5SUqMo29AnAp5kVmrQNwBUvr/T89cl/pvfEoOPv0Kf9iZ50aF5flsAi
 dudbfIbxhjTRQxtOYdj8vLtgN89BtuYrF1CIH4t3hI2SDA9Xb02mxB8Vsuk9WZ+e
 0OqYKcGKPCcc4uLWf+ii
 =AMSC
 -----END PGP SIGNATURE-----

Merge tag 'pm-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix three issues in cpufreq drivers: one recent regression, one
  leftover Kconfig dependency and one old but "stable" material.

  Specifics:

   - Make the task scheduler load and utilization signals be
     frequency-invariant again after recent changes in the SCPI cpufreq
     driver (Dietmar Eggemann).

   - Drop an unnecessary leftover Kconfig dependency from the SCPI
     cpufreq driver (Sudeep Holla).

   - Fix the initialization of the s3c24xx cpufreq driver (Viresh
     Kumar)"

* tag 'pm-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: s3c24xx: Fix broken s3c_cpufreq_init()
  cpufreq: scpi: Fix incorrect arm_big_little config dependency
  cpufreq: scpi: invoke frequency-invariance setter function
2018-03-02 08:17:49 -08:00
Masahiro Yamada 5ae6fcc4bb kconfig: fix line number in recursive inclusion error message
When recursive inclusion is detected, the line number of the last
'included from:' is wrong.

[Test Case]

Kconfig:
  -------->8--------
  source "Kconfig2"
  -------->8--------

Kconfig2:
  -------->8--------
  source "Kconfig3"
  -------->8--------

Kconfig3:
  -------->8--------
  source "Kconfig"
  -------->8--------

[Result]

  $ make allyesconfig
  scripts/kconfig/conf  --allyesconfig Kconfig
  Kconfig:1: recursive inclusion detected. Inclusion path:
    current file : 'Kconfig'
    included from: 'Kconfig3:1'
    included from: 'Kconfig2:1'
    included from: 'Kconfig:3'
  scripts/kconfig/Makefile:89: recipe for target 'allyesconfig' failed
  make[1]: *** [allyesconfig] Error 1
  Makefile:512: recipe for target 'allyesconfig' failed
  make: *** [allyesconfig] Error 2

where we expect

    current file : 'Kconfig'
    included from: 'Kconfig3:1'
    included from: 'Kconfig2:1'
    included from: 'Kconfig:1'

The 'iter->lineno+1' in the second fpinrtf() should be 'iter->lineno-1'.
I refactored the code to merge the two fprintf() calls.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Ulf Magnusson <ulfalizer@gmail.com>
2018-03-03 00:44:47 +09:00
Dafna Hirschfeld a11761c2dd Coccinelle: memdup: Fix typo in warning messages
Replace 'kmemdep' with 'kmemdup' in warning messages.

Signed-off-by: Dafna Hirschfeld <dafna3@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Nicolas Palix <nicolas.palix@imag.fr>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-03-03 00:41:24 +09:00
David S. Miller d69242bf20 Three more patches:
* fix for a regression in 4-addr mode with fast-RX
  * fix for a Kconfig problem with the new regdb
  * fix for the long-standing TCP performance issue in
    wifi using the new sk_pacing_shift_update()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAlqZEcoACgkQB8qZga/f
 l8SfMw//USsQzokuZOPNW4kI8ntVmp3lPRcIKFOP/CCbC+vuR9nwDE1hNPbGAiBY
 SN+ALcF9mv96vVFBvpVp1eZMLBz07+tBqWHCBkbAhHeZ+cEo0VfNoOvyY7mqQtsS
 OsKfj0xG0M9+aG/bPLiSZJNvvcPZxbtVwE31zVM0Xl6ogJ6/98t5z+wlbFFvLbOF
 e8O/FiO4x2MAY27ITdZepaa7OnS2Jprcoae7csERBH5siqX9jFoH6PDLQObscQZ/
 XBHEvOy2NCZrw0B9hDKexIGvwzLgfnyePjweI8B59p+unBSFOXNNoIo3yux4aFcH
 WmNQJ11PYpVffk8BS97HmeQmIta4TaFNP0OAUc+aTDZiDRxCjMzQF9dMehxXwR2N
 nqeE0wa0LeM+HIH5HFf7iUhv5ORy64eI5mt1HRTzTB4cLRbpD5OPwkdwN94ck4fs
 AY8uN2sgZotYOFtrOt4X0HNoU5RqwI7c6a++n75HRafbeoa1uSQm9NlJcPyLD9J9
 eAZjVQRC70HBP3PSuUgqzVZyS/6TrT2D0Gan2S555Bao0iA/1Vq59VDn+pfftJsN
 8T6d0BM3clv/VYYkqdzP3pFPrzJT/XqWGkRcpfnUBpmTxTkZgkWOpbpTVBCunje7
 Xy1zP8itH3bm4/A+zbNUO3H4I/WrUsOLTyEqIJZatIOfnDrhne0=
 =d3er
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2018-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Three more patches:
 * fix for a regression in 4-addr mode with fast-RX
 * fix for a Kconfig problem with the new regdb
 * fix for the long-standing TCP performance issue in
   wifi using the new sk_pacing_shift_update()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02 09:47:39 -05:00
Ka-Cheong Poon 84eef2b218 rds: Incorrect reference counting in TCP socket creation
Commit 0933a578cd ("rds: tcp: use sock_create_lite() to create the
accept socket") has a reference counting issue in TCP socket creation
when accepting a new connection.  The code uses sock_create_lite() to
create a kernel socket.  But it does not do __module_get() on the
socket owner.  When the connection is shutdown and sock_release() is
called to free the socket, the owner's reference count is decremented
and becomes incorrect.  Note that this bug only shows up when the socket
owner is configured as a kernel module.

v2: Update comments

Fixes: 0933a578cd ("rds: tcp: use sock_create_lite() to create the accept socket")
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-02 09:40:27 -05:00
Guenter Roeck 61e18270f6 s390: Fix runtime warning about negative pgtables_bytes
When running s390 images with 'compat' processes, the following
BUG is seen repeatedly.

BUG: non-zero pgtables_bytes on freeing mm: -16384

Bisect points to commit b4e98d9ac7 ("mm: account pud page tables").
Analysis shows that init_new_context() is called with
mm->context.asce_limit set to _REGION3_SIZE. In this situation,
pgtables_bytes remains set to 0 and is not increased. The message is
displayed when the affected process dies and mm_dec_nr_puds() is called.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Fixes: b4e98d9ac7 ("mm: account pud page tables")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-03-02 12:48:40 +01:00
Jan Glauber 7c4246797b i2c: octeon: Prevent error message on bus error
The error message:

[Fri Feb 16 13:42:13 2018] i2c-thunderx 0000:01:09.4: unhandled state: 0

is mis-leading as state 0 (bus error) is not an unknown state.

Return -EIO as before but avoid printing the message. Also rename
STAT_ERROR to STATE_BUS_ERROR.

Signed-off-by: Jan Glauber <jglauber@cavium.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2018-03-02 11:11:15 +01:00
Wolfram Sang 1a0e3a35c6 - sort the manufacturers in DT bindings alphabetically
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAlqZErcACgkQEacuoBRx
 13JCFg/9Ej3CNc0eHYN1mSfA+fGV59M6m5foSJw1eiZRrOYZJYEB+Bmcg8deJaWS
 9tV3QBl8RCtIVTpwXEYR6PBn56H7dxm5P+o1497LyAcB/RR7CqVLHPWaBjR83xUR
 0qucZTC0SZmCTMWOaqAVKmPBeoEhyJKgVd0cK7AHwE/nWBQl2DC9oWI/AKqows7e
 HjBgeUAF7S/lq2v2jtKv9WLTqyVNVCSELnuFAQsDhaemVfc9TzoCcaXkbPxNp4fb
 /VZb3FWlrdZjFyNF7U6qMlzWYwpxTcCWNvJvynoXfBMk73/YDQRzclm/57AvpTEa
 pJOwq3srtFEMTZGetkwyofn04TE37a2GjyRT2C43dyZGZV4xHX48V9lEgnTuashR
 7bbisQ+dzavyYbLCSH8kx3PjmU/iSL8RAWCjuVXB7rWIQLVqEG8mo2DTJGZdsN/9
 9cFr0aFgBcQhozXBwS/Gwu7FXv0n6v9vsGFv+Tr/UmURCKweW/HsfkFpU9nkLIyU
 VZo4/lu+TBeQGI3pSu8aPp45kIh49xzyhy23AcifeUs69BFl2aB5fM1i0IzIcDfc
 HWB5ZMOy2So7nZZ0u+gxaC0olh7BuVh74xpDJZNLyPUQtfqilJ8OaXnv/2vdXTkP
 A9hIhm34C03OOG/nKuOKD2wWFFYls1VNPxBSKMnmHtLaUsPzRis=
 =vIz0
 -----END PGP SIGNATURE-----

Merge tag 'at24-4.16-rc4-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current

Pull in this fixup to get rid of a dependency for the next cycle:

"- sort the manufacturers in DT bindings alphabetically"
2018-03-02 11:04:33 +01:00
Rafael J. Wysocki b61e070305 Merge branch 'cpufreq-scpi'
* cpufreq-scpi:
  cpufreq: scpi: Fix incorrect arm_big_little config dependency
  cpufreq: scpi: invoke frequency-invariance setter function
2018-03-02 10:44:44 +01:00
Helge Deller 636a415bcc parisc: Reduce irq overhead when run in qemu
When run under QEMU, calling mfctl(16) creates some overhead because the
qemu timer has to be scaled and moved into the register. This patch
reduces the number of calls to mfctl(16) by moving the calls out of the
loops.

Additionally, increase the minimal time interval to 8000 cycles instead
of 500 to compensate possible QEMU delays when delivering interrupts.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
2018-03-02 10:05:07 +01:00
Helge Deller 5ffa851885 parisc: Use cr16 interval timers unconditionally on qemu
When running on qemu we know that the (emulated) cr16 cpu-internal
clocks are syncronized. So let's use them unconditionally on qemu.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
2018-03-02 10:04:59 +01:00
Helge Deller 0ed1fe4ad3 parisc: Check if secondary CPUs want own PDC calls
The architecture specification says (for 64-bit systems): PDC is a per
processor resource, and operating system software must be prepared to
manage separate pointers to PDCE_PROC for each processor.  The address
of PDCE_PROC for the monarch processor is stored in the Page Zero
location MEM_PDC. The address of PDCE_PROC for each non-monarch
processor is passed in gr26 when PDCE_RESET invokes OS_RENDEZ.

Currently we still use one PDC for all CPUs, but in case we face a
machine which is following the specification let's warn about it.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02 10:04:46 +01:00
Helge Deller fd8d0ca256 parisc: Hide virtual kernel memory layout
For security reasons do not expose the virtual kernel memory layout to
userspace.

Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org # 4.15
Reviewed-by: Kees Cook <keescook@chromium.org>
2018-03-02 10:04:35 +01:00
John David Anglin 0adb24e03a parisc: Fix ordering of cache and TLB flushes
The change to flush_kernel_vmap_range() wasn't sufficient to avoid the
SMP stalls.  The problem is some drivers call these routines with
interrupts disabled.  Interrupts need to be enabled for flush_tlb_all()
and flush_cache_all() to work.  This version adds checks to ensure
interrupts are not disabled before calling routines that need IPI
interrupts.  When interrupts are disabled, we now drop into slower code.

The attached change fixes the ordering of cache and TLB flushes in
several cases.  When we flush the cache using the existing PTE/TLB
entries, we need to flush the TLB after doing the cache flush.  We don't
need to do this when we flush the entire instruction and data caches as
these flushes don't use the existing TLB entries.  The same is true for
tmpalias region flushes.

The flush_kernel_vmap_range() and invalidate_kernel_vmap_range()
routines have been updated.

Secondly, we added a new purge_kernel_dcache_range_asm() routine to
pacache.S and use it in invalidate_kernel_vmap_range().  Nominally,
purges are faster than flushes as the cache lines don't have to be
written back to memory.

Hopefully, this is sufficient to resolve the remaining problems due to
cache speculation.  So far, testing indicates that this is the case.  I
did work up a patch using tmpalias flushes, but there is a performance
hit because we need the physical address for each page, and we also need
to sequence access to the tmpalias flush code.  This increases the
probability of stalls.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.9+
Signed-off-by: Helge Deller <deller@gmx.de>
2018-03-02 10:03:28 +01:00
Hui Wang d5078193e5 ALSA: hda - Fix a wrong FIXUP for alc289 on Dell machines
With the alc289, the Pin 0x1b is Headphone-Mic, so we should assign
ALC269_FIXUP_DELL4_MIC_NO_PRESENCE rather than
ALC225_FIXUP_DELL1_MIC_NO_PRESENCE to it. And this change is suggested
by Kailang of Realtek and is verified on the machine.

Fixes: 3f2f7c553d ("ALSA: hda - Fix headset mic detection problem for two Dell machines")
Cc: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2018-03-02 09:59:15 +01:00
Jernej Skrabec f3e5feeb92
drm/sun4i: Release exclusive clock lock when disabling TCON
Currently exclusive TCON clock lock is never released, which, for
example, prevents changing resolution on HDMI.

In order to fix that, release clock when disabling TCON. TCON is always
disabled first before new mode is set.

Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180301213442.16677-7-jernej.skrabec@siol.net
2018-03-02 08:44:17 +01:00
Paul Mackerras debd574f41 KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
The current code for initializing the VRMA (virtual real memory area)
for HPT guests requires the page size of the backing memory to be one
of 4kB, 64kB or 16MB.  With a radix host we have the possibility that
the backing memory page size can be 2MB or 1GB.  In these cases, if the
guest switches to HPT mode, KVM will not initialize the VRMA and the
guest will fail to run.

In fact it is not necessary that the VRMA page size is the same as the
backing memory page size; any VRMA page size less than or equal to the
backing memory page size is acceptable.  Therefore we now choose the
largest page size out of the set {4k, 64k, 16M} which is not larger
than the backing memory page size.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-02 15:38:24 +11:00
Paul Mackerras c3856aeb29 KVM: PPC: Book3S HV: Fix handling of large pages in radix page fault handler
This fixes several bugs in the radix page fault handler relating to
the way large pages in the memory backing the guest were handled.
First, the check for large pages only checked for explicit huge pages
and missed transparent huge pages.  Then the check that the addresses
(host virtual vs. guest physical) had appropriate alignment was
wrong, meaning that the code never put a large page in the partition
scoped radix tree; it was always demoted to a small page.

Fixing this exposed bugs in kvmppc_create_pte().  We were never
invalidating a 2MB PTE, which meant that if a page was initially
faulted in without write permission and the guest then attempted
to store to it, we would never update the PTE to have write permission.
If we find a valid 2MB PTE in the PMD, we need to clear it and
do a TLB invalidation before installing either the new 2MB PTE or
a pointer to a page table page.

This also corrects an assumption that get_user_pages_fast would set
the _PAGE_DIRTY bit if we are writing, which is not true.  Instead we
mark the page dirty explicitly with set_page_dirty_lock().  This
also means we don't need the dirty bit set on the host PTE when
providing write access on a read fault.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2018-03-02 14:05:32 +11:00
David S. Miller a5f7b0eeb2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-02-28

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Add schedule points and reduce the number of loop iterations
   the test_bpf kernel module is performing in order to not hog
   the CPU for too long, from Eric.

2) Fix an out of bounds access in tail calls in the ppc64 BPF
   JIT compiler, from Daniel.

3) Fix a crash on arm64 on unaligned BPF xadd operations that
   could be triggered via interpreter and JIT, from Daniel.

Please not that once you merge net into net-next at some point, there
is a minor merge conflict in test_verifier.c since test cases had
been added at the end in both trees. Resolution is trivial: keep all
the test cases from both trees.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 21:42:07 -05:00
Edward Cree a6d50512b4 net: ethtool: don't ignore return from driver get_fecparam method
If ethtool_ops->get_fecparam returns an error, pass that error on to the
 user, rather than ignoring it.

Fixes: 1a5f3da20b ("net: ethtool: add support for forward error correction modes")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 21:41:06 -05:00
Stephen Suryaputra e2c0dc1f1d vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
When ip_error() is called the device is the l3mdev master instead of the
original device. So the forwarding check should be on the original one.

Changes from v2:
- Handle the original device disappearing (per David Ahern)
- Minimize the change in code order

Changes from v1:
- Only need to reset the device on which __in_dev_get_rcu() is done (per
  David Ahern).

Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 21:40:22 -05:00
Mike Manning 50d629e7a8 net: allow interface to be set into VRF if VLAN interface in same VRF
Setting an interface into a VRF fails with 'RTNETLINK answers: File
exists' if one of its VLAN interfaces is already in the same VRF.
As the VRF is an upper device of the VLAN interface, it is also showing
up as an upper device of the interface itself. The solution is to
restrict this check to devices other than master. As only one master
device can be linked to a device, the check in this case is that the
upper device (VRF) being linked to is not the same as the master device
instead of it not being any one of the upper devices.

The following example shows an interface ens12 (with a VLAN interface
ens12.10) being set into VRF green, which behaves as expected:

  # ip link add link ens12 ens12.10 type vlan id 10
  # ip link set dev ens12 master vrfgreen
  # ip link show dev ens12
    3: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
       master vrfgreen state UP mode DEFAULT group default qlen 1000
       link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff

But if the VLAN interface has previously been set into the same VRF,
then setting the interface into the VRF fails:

  # ip link set dev ens12 nomaster
  # ip link set dev ens12.10 master vrfgreen
  # ip link show dev ens12.10
    39: ens12.10@ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    qdisc noqueue master vrfgreen state UP mode DEFAULT group default
    qlen 1000 link/ether 52:54:00:4c:a0:45 brd ff:ff:ff:ff:ff:ff
  # ip link set dev ens12 master vrfgreen
    RTNETLINK answers: File exists

The workaround is to move the VLAN interface back into the default VRF
beforehand, but it has to be shut first so as to avoid the risk of
traffic leaking from the VRF. This fix avoids needing this workaround.

Signed-off-by: Mike Manning <mmanning@att.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-01 21:25:29 -05:00
Alastair D'Silva e7666d046a ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-02 13:02:15 +11:00
Alastair D'Silva 07c5ccd70a ocxl: Add get_metadata IOCTL to share OCXL information to userspace
Some required information is not exposed to userspace currently (eg. the
PASID), pass this information back, along with other information which
is currently communicated via sysfs, which saves some parsing effort in
userspace.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-02 13:02:14 +11:00
Manish Rangankar 967823d6c3 scsi: qedi: Fix kernel crash during port toggle
BUG: unable to handle kernel NULL pointer dereference at 0000000000000100

[  985.596918] IP: _raw_spin_lock_bh+0x17/0x30
[  985.601581] PGD 0 P4D 0
[  985.604405] Oops: 0002 [#1] SMP
:
[  985.704533] CPU: 16 PID: 1156 Comm: qedi_thread/16 Not tainted 4.16.0-rc2 #1
[  985.712397] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 2.4.3 01/17/2017
[  985.720747] RIP: 0010:_raw_spin_lock_bh+0x17/0x30
[  985.725996] RSP: 0018:ffffa4b1c43d3e10 EFLAGS: 00010246
[  985.731823] RAX: 0000000000000000 RBX: ffff94a31bd03000 RCX: 0000000000000000
[  985.739783] RDX: 0000000000000001 RSI: ffff94a32fa16938 RDI: 0000000000000100
[  985.747744] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000a33
[  985.755703] R10: 0000000000000000 R11: ffffa4b1c43d3af0 R12: 0000000000000000
[  985.763662] R13: ffff94a301f40818 R14: 0000000000000000 R15: 000000000000000c
[  985.771622] FS:  0000000000000000(0000) GS:ffff94a32fa00000(0000) knlGS:0000000000000000
[  985.780649] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  985.787057] CR2: 0000000000000100 CR3: 000000067a009006 CR4: 00000000001606e0
[  985.795017] Call Trace:
[  985.797747]  qedi_fp_process_cqes+0x258/0x980 [qedi]
[  985.803294]  qedi_percpu_io_thread+0x10f/0x1b0 [qedi]
[  985.808931]  kthread+0xf5/0x130
[  985.812434]  ? qedi_free_uio+0xd0/0xd0 [qedi]
[  985.817298]  ? kthread_bind+0x10/0x10
[  985.821372]  ? do_syscall_64+0x6e/0x1a0

Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:54 -05:00
Darren Trapp 2b5b96473e scsi: qla2xxx: Fix FC-NVMe LUN discovery
commit a4239945b8 ("scsi: qla2xxx: Add switch command to simplify
fabric discovery") introduced regression when it did not consider
FC-NVMe code path which broke NVMe LUN discovery.

Fixes: a4239945b8 ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:53 -05:00
Hannes Reinecke e39a97353e scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()
When converting __scsi_error_from_host_byte() to BLK_STS error codes the
case DID_OK was forgotten, resulting in it always returning an error.

Fixes: 2a842acab1 ("block: introduce new block status code type")
Cc: Doug Gilbert <dgilbert@interlog.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:52 -05:00
Bart Van Assche 3be8828fc5 scsi: core: Avoid that ATA error handling can trigger a kernel hang or oops
Avoid that the recently introduced call_rcu() call in the SCSI core
triggers a double call_rcu() call.

Reported-by: Natanael Copa <ncopa@alpinelinux.org>
Reported-by: Damien Le Moal <damien.lemoal@wdc.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=198861
Fixes: 3bd6f43f5c ("scsi: core: Ensure that the SCSI error handler gets woken up")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Natanael Copa <ncopa@alpinelinux.org>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Alexandre Oliva <oliva@gnu.org>
Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:52 -05:00
Hannes Reinecke fa83e65885 scsi: qla2xxx: ensure async flags are reset correctly
The fcport flags FCF_ASYNC_ACTIVE and FCF_ASYNC_SENT are used to
throttle the state machine, so we need to ensure to always set and unset
them correctly. Not doing so will lead to the state machine getting
confused and no login attempt into remote ports.

Cc: Quinn Tran <quinn.tran@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Fixes: 3dbec59bdf ("scsi: qla2xxx: Prevent multiple active discovery commands per session")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:51 -05:00
Hannes Reinecke 07ea4b6026 scsi: qla2xxx: do not check login_state if no loop id is assigned
When no loop id is assigned in qla24xx_fcport_handle_login() the login
state needs to be ignored; it will get set later on in
qla_chk_n2n_b4_login().

Cc: Quinn Tran <quinn.tran@cavium.com>
Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Fixes: 040036bb0b ("scsi: qla2xxx: Delay loop id allocation at login")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:51 -05:00
Hannes Reinecke 1c6cacf4ea scsi: qla2xxx: Fixup locking for session deletion
Commit d8630bb95f ('Serialize session deletion by using work_lock')
tries to fixup a deadlock when deleting sessions, but fails to take into
account the locking rules. This patch resolves the situation by
introducing a separate lock for processing the GNLIST response, and
ensures that sess_lock is released before calling
qlt_schedule_sess_delete().

Cc: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Quinn Tran <quinn.tran@cavium.com>
Fixes: d8630bb95f ("scsi: qla2xxx: Serialize session deletion by using work_lock")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:50 -05:00
himanshu.madhani@cavium.com 1514839b36 scsi: qla2xxx: Fix NULL pointer crash due to active timer for ABTS
This patch fixes NULL pointer crash due to active timer running for abort
IOCB.

From crash dump analysis it was discoverd that get_next_timer_interrupt()
encountered a corrupted entry on the timer list.

 #9 [ffff95e1f6f0fd40] page_fault at ffffffff914fe8f8
    [exception RIP: get_next_timer_interrupt+440]
    RIP: ffffffff90ea3088  RSP: ffff95e1f6f0fdf0  RFLAGS: 00010013
    RAX: ffff95e1f6451028  RBX: 000218e2389e5f40  RCX: 00000001232ad600
    RDX: 0000000000000001  RSI: ffff95e1f6f0fdf0  RDI: 0000000001232ad6
    RBP: ffff95e1f6f0fe40   R8: ffff95e1f6451188   R9: 0000000000000001
    R10: 0000000000000016  R11: 0000000000000016  R12: 00000001232ad5f6
    R13: ffff95e1f6450000  R14: ffff95e1f6f0fdf8  R15: ffff95e1f6f0fe10
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Looking at the assembly of get_next_timer_interrupt(), address came
from %r8 (ffff95e1f6451188) which is pointing to list_head with single
entry at ffff95e5ff621178.

 0xffffffff90ea307a <get_next_timer_interrupt+426>:      mov    (%r8),%rdx
 0xffffffff90ea307d <get_next_timer_interrupt+429>:      cmp    %r8,%rdx
 0xffffffff90ea3080 <get_next_timer_interrupt+432>:      je     0xffffffff90ea30a7 <get_next_timer_interrupt+471>
 0xffffffff90ea3082 <get_next_timer_interrupt+434>:      nopw   0x0(%rax,%rax,1)
 0xffffffff90ea3088 <get_next_timer_interrupt+440>:      testb  $0x1,0x18(%rdx)

 crash> rd ffff95e1f6451188 10
 ffff95e1f6451188:  ffff95e5ff621178 ffff95e5ff621178   x.b.....x.b.....
 ffff95e1f6451198:  ffff95e1f6451198 ffff95e1f6451198   ..E.......E.....
 ffff95e1f64511a8:  ffff95e1f64511a8 ffff95e1f64511a8   ..E.......E.....
 ffff95e1f64511b8:  ffff95e77cf509a0 ffff95e77cf509a0   ...|.......|....
 ffff95e1f64511c8:  ffff95e1f64511c8 ffff95e1f64511c8   ..E.......E.....

 crash> rd ffff95e5ff621178 10
 ffff95e5ff621178:  0000000000000001 ffff95e15936aa00   ..........6Y....
 ffff95e5ff621188:  0000000000000000 00000000ffffffff   ................
 ffff95e5ff621198:  00000000000000a0 0000000000000010   ................
 ffff95e5ff6211a8:  ffff95e5ff621198 000000000000000c   ..b.............
 ffff95e5ff6211b8:  00000f5800000000 ffff95e751f8d720   ....X... ..Q....

 ffff95e5ff621178 belongs to freed mempool object at ffff95e5ff621080.

 CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
 ffff95dc7fd74d00 mnt_cache                384      19785     24948    594    16k
   SLAB              MEMORY            NODE  TOTAL  ALLOCATED  FREE
   ffffdc5dabfd8800  ffff95e5ff620000     1     42         29    13
   FREE / [ALLOCATED]
    ffff95e5ff621080  (cpu 6 cache)

Examining the contents of that memory reveals a pointer to a constant string
in the driver, "abort\0", which is set by qla24xx_async_abort_cmd().

 crash> rd ffffffffc059277c 20
 ffffffffc059277c:  6e490074726f6261 0074707572726574   abort.Interrupt.
 ffffffffc059278c:  00676e696c6c6f50 6920726576697244   Polling.Driver i
 ffffffffc059279c:  646f6d207325206e 6974736554000a65   n %s mode..Testi
 ffffffffc05927ac:  636976656420676e 786c252074612065   ng device at %lx
 ffffffffc05927bc:  6b63656843000a2e 646f727020676e69   ...Checking prod
 ffffffffc05927cc:  6f20444920746375 0a2e706968632066   uct ID of chip..
 ffffffffc05927dc:  5120646e756f4600 204130303232414c   .Found QLA2200A
 ffffffffc05927ec:  43000a2e70696843 20676e696b636568   Chip...Checking
 ffffffffc05927fc:  65786f626c69616d 6c636e69000a2e73   mailboxes...incl
 ffffffffc059280c:  756e696c2f656475 616d2d616d642f78   ude/linux/dma-ma

 crash> struct -ox srb_iocb
 struct srb_iocb {
           union {
               struct {...} logio;
               struct {...} els_logo;
               struct {...} tmf;
               struct {...} fxiocb;
               struct {...} abt;
               struct ct_arg ctarg;
               struct {...} mbx;
               struct {...} nack;
    [0x0 ] } u;
    [0xb8] struct timer_list timer;
    [0x108] void (*timeout)(void *);
 }
 SIZE: 0x110

 crash> ! bc
 ibase=16
 obase=10
 B8+40
 F8

The object is a srb_t, and at offset 0xf8 within that structure
(i.e. ffff95e5ff621080 + f8 -> ffff95e5ff621178) is a struct timer_list.

Cc: <stable@vger.kernel.org> #4.4+
Fixes: 4440e46d5d ("[SCSI] qla2xxx: Add IOCB Abort command asynchronous handling.")
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-01 20:16:33 -05:00