Commit Graph

798277 Commits

Author SHA1 Message Date
Dave Chinner 43feeea88c xfs: zero length symlinks are not valid
A log recovery failure has been reproduced where a symlink inode has
a zero length in extent form. It was caused by a shutdown during a
combined fstress+fsmark workload.

The underlying problem is the issue in xfs_inactive_symlink(): the
inode is unlocked between the symlink inactivation/truncation and
the inode being freed. This opens a window for the inode to be
written to disk before it xfs_ifree() removes it from the unlinked
list, marks it free in the inobt and zeros the mode.

For shortform inodes, the fix is simple. xfs_ifree() clears the data
fork state, so there's no need to do it in xfs_inactive_symlink().
This means the shortform fork verifier will not see a zero length
data fork as it mirrors the inode size through to xfs_ifree()), and
hence if the inode gets written back and the fork verifiers are run
they will still see a fork that matches the on-disk inode size.

For extent form (remote) symlinks, it is a little more tricky. Here
we explicitly set the inode size to zero, so the above race can lead
to zero length symlinks on disk. Because the inode is unlinked at
this point (i.e. on the unlinked list) and unreferenced, it can
never be seen again by a user. Hence when we set the inode size to
zeor, also change the type to S_IFREG. xfs_ifree() expects S_IFREG
inodes to be of zero length, and so this avoids all the problems of
zero length symlinks ever hitting the disk. It also avoids the
problem of needing to handle zero length symlink inodes in log
recovery to replay the extent free intents and the remaining
deferops to free the extents the symlink used.

Also add a couple of asserts to warn us if zero length symlinks end
up in either the symlink create or inactivation paths.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-12 08:47:15 -08:00
Colin Ian King 8c4ce794ee xfs: clean up indentation issues, remove an unwanted space
There is a statement that has an unwanted space in the indentation.
Remove it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-12 08:46:20 -08:00
Pan Bian fe5ed6c22e xfs: libxfs: move xfs_perag_put late
The function xfs_alloc_get_freelist calls xfs_perag_put to drop the
reference. However, pag->pagf_btreeblks is read and written after the
put operation. This patch moves the put operation later.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
[darrick: minor changelog edits]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-12-12 08:46:20 -08:00
Darrick J. Wong d6f215f359 xfs: split up the xfs_reflink_end_cow work into smaller transactions
In xfs_reflink_end_cow, we allocate a single transaction for the entire
end_cow operation and then loop the CoW fork mappings to move them to
the data fork.  This design fails on a heavily fragmented filesystem
where an inode's data fork has exactly one more extent than would fit in
an extents-format fork, because the unmap can collapse the data fork
into extents format (freeing the bmbt block) but the remap can expand
the data fork back into a (newly allocated) bmbt block.  If the number
of extents we end up remapping is large, we can overflow the block
reservation because we reserved blocks assuming that we were adding
mappings into an already-cleared area of the data fork.

Let's say we have 8 extents in the data fork, 8 extents in the CoW fork,
and the data fork can hold at most 7 extents before needing to convert
to btree format; and that blocks A-P are discontiguous single-block
extents:

   0......7
D: ABCDEFGH
C: IJKLMNOP

When a write to file blocks 0-7 completes, we must remap I-P into the
data fork.  We start by removing H from the btree-format data fork.  Now
we have 7 extents, so we convert the fork to extents format, freeing the
bmbt block.   We then move P into the data fork and it now has 8 extents
again.  We must convert the data fork back to btree format, requiring a
block allocation.  If we repeat this sequence for blocks 6-5-4-3-2-1-0,
we'll need a total of 8 block allocations to remap all 8 blocks.  We
reserved only enough blocks to handle one btree split (5 blocks on a 4k
block filesystem), which means we overflow the block reservation.

To fix this issue, create a separate helper function to remap a single
extent, and change _reflink_end_cow to call it in a tight loop over the
entire range we're completing.  As a side effect this also removes the
size restrictions on how many extents we can end_cow at a time, though
nobody ever hit that.  It is not reasonable to reserve N blocks to remap
N blocks.

Note that this can be reproduced after ~320 million fsx ops while
running generic/938 (long soak directio fsx exerciser):

XFS: Assertion failed: tp->t_blk_res >= tp->t_blk_res_used, file: fs/xfs/xfs_trans.c, line: 116
<machine registers snipped>
Call Trace:
 xfs_trans_dup+0x211/0x250 [xfs]
 xfs_trans_roll+0x6d/0x180 [xfs]
 xfs_defer_trans_roll+0x10c/0x3b0 [xfs]
 xfs_defer_finish_noroll+0xdf/0x740 [xfs]
 xfs_defer_finish+0x13/0x70 [xfs]
 xfs_reflink_end_cow+0x2c6/0x680 [xfs]
 xfs_dio_write_end_io+0x115/0x220 [xfs]
 iomap_dio_complete+0x3f/0x130
 iomap_dio_rw+0x3c3/0x420
 xfs_file_dio_aio_write+0x132/0x3c0 [xfs]
 xfs_file_write_iter+0x8b/0xc0 [xfs]
 __vfs_write+0x193/0x1f0
 vfs_write+0xba/0x1c0
 ksys_write+0x52/0xc0
 do_syscall_64+0x50/0x160
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2018-12-12 08:46:19 -08:00
Linus Torvalds 40e020c129 Linux 4.20-rc6 2018-12-09 15:31:00 -08:00
Linus Torvalds d48f782e4f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "A decent batch of fixes here. I'd say about half are for problems that
  have existed for a while, and half are for new regressions added in
  the 4.20 merge window.

   1) Fix 10G SFP phy module detection in mvpp2, from Baruch Siach.

   2) Revert bogus emac driver change, from Benjamin Herrenschmidt.

   3) Handle BPF exported data structure with pointers when building
      32-bit userland, from Daniel Borkmann.

   4) Memory leak fix in act_police, from Davide Caratti.

   5) Check RX checksum offload in RX descriptors properly in aquantia
      driver, from Dmitry Bogdanov.

   6) SKB unlink fix in various spots, from Edward Cree.

   7) ndo_dflt_fdb_dump() only works with ethernet, enforce this, from
      Eric Dumazet.

   8) Fix FID leak in mlxsw driver, from Ido Schimmel.

   9) IOTLB locking fix in vhost, from Jean-Philippe Brucker.

  10) Fix SKB truesize accounting in ipv4/ipv6/netfilter frag memory
      limits otherwise namespace exit can hang. From Jiri Wiesner.

  11) Address block parsing length fixes in x25 from Martin Schiller.

  12) IRQ and ring accounting fixes in bnxt_en, from Michael Chan.

  13) For tun interfaces, only iface delete works with rtnl ops, enforce
      this by disallowing add. From Nicolas Dichtel.

  14) Use after free in liquidio, from Pan Bian.

  15) Fix SKB use after passing to netif_receive_skb(), from Prashant
      Bhole.

  16) Static key accounting and other fixes in XPS from Sabrina Dubroca.

  17) Partially initialized flow key passed to ip6_route_output(), from
      Shmulik Ladkani.

  18) Fix RTNL deadlock during reset in ibmvnic driver, from Thomas
      Falcon.

  19) Several small TCP fixes (off-by-one on window probe abort, NULL
      deref in tail loss probe, SNMP mis-estimations) from Yuchung
      Cheng"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits)
  net/sched: cls_flower: Reject duplicated rules also under skip_sw
  bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips.
  bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.
  bnxt_en: Keep track of reserved IRQs.
  bnxt_en: Fix CNP CoS queue regression.
  net/mlx4_core: Correctly set PFC param if global pause is turned off.
  Revert "net/ibm/emac: wrong bit is used for STA control"
  neighbour: Avoid writing before skb->head in neigh_hh_output()
  ipv6: Check available headroom in ip6_xmit() even without options
  tcp: lack of available data can also cause TSO defer
  ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output
  mlxsw: spectrum_switchdev: Fix VLAN device deletion via ioctl
  mlxsw: spectrum_router: Relax GRE decap matching check
  mlxsw: spectrum_switchdev: Avoid leaking FID's reference count
  mlxsw: spectrum_nve: Remove easily triggerable warnings
  ipv4: ipv6: netfilter: Adjust the frag mem limit when truesize changes
  sctp: frag_point sanity check
  tcp: fix NULL ref in tail loss probe
  tcp: Do not underestimate rwnd_limited
  net: use skb_list_del_init() to remove from RX sublists
  ...
2018-12-09 15:12:33 -08:00
Linus Torvalds 8586ca8a21 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Three fixes: a boot parameter re-(re-)fix, a retpoline build artifact
  fix and an LLVM workaround"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Drop implicit common-page-size linker flag
  x86/build: Fix compiler support check for CONFIG_RETPOLINE
  x86/boot: Clear RSDP address in boot_params for broken loaders
2018-12-09 15:09:55 -08:00
Linus Torvalds ebbd30004d Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull kprobes fixes from Ingo Molnar:
 "Two kprobes fixes: a blacklist fix and an instruction patching related
  corruption fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kprobes/x86: Blacklist non-attachable interrupt functions
  kprobes/x86: Fix instruction patching corruption when copying more than one RIP-relative instruction
2018-12-09 14:21:33 -08:00
Linus Torvalds 4b04e73a78 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "Two fixes: a large-system fix and an earlyprintk fix with certain
  resolutions"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/earlyprintk/efi: Fix infinite loop on some screen widths
  x86/efi: Allocate e820 buffer before calling efi_exit_boot_service
2018-12-09 14:03:56 -08:00
Or Gerlitz 35cc3cefc4 net/sched: cls_flower: Reject duplicated rules also under skip_sw
Currently, duplicated rules are rejected only for skip_hw or "none",
hence allowing users to push duplicates into HW for no reason.

Use the flower tables to protect for that.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reported-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:55:08 -08:00
David S. Miller d4b60e94e9 Merge branch 'bnxt_en-Bug-fixes'
Michael Chan says:

====================
bnxt_en: Bug fixes.

The first patch fixes a regression on CoS queue setup, introduced
recently by the 57500 new chip support patches.  The rest are
fixes related to ring and resource accounting on the new 57500 chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:46:59 -08:00
Michael Chan e30fbc3319 bnxt_en: Fix _bnxt_get_max_rings() for 57500 chips.
The CP rings are accounted differently on the new 57500 chips.  There
must be enough CP rings for the sum of RX and TX rings on the new
chips.  The current logic may be over-estimating the RX and TX rings.

The output parameter max_cp should be the maximum NQs capped by
MSIX vectors available for networking in the context of 57500 chips.
The existing code which uses CMPL rings capped by the MSIX vectors
works most of the time but is not always correct.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:46:58 -08:00
Michael Chan c0b8cda05e bnxt_en: Fix NQ/CP rings accounting on the new 57500 chips.
The new 57500 chips have introduced the NQ structure in addition to
the existing CP rings in all chips.  We need to introduce a new
bnxt_nq_rings_in_use().  On legacy chips, the 2 functions are the
same and one will just call the other.  On the new chips, they
refer to the 2 separate ring structures.  The new function is now
called to determine the resource (NQ or CP rings) associated with
MSIX that are in use.

On 57500 chips, the RDMA driver does not use the CP rings so
we don't need to do the subtraction adjustment.

Fixes: 41e8d79837 ("bnxt_en: Modify the ring reservation functions for 57500 series chips.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:46:58 -08:00
Michael Chan 75720e6323 bnxt_en: Keep track of reserved IRQs.
The new 57500 chips use 1 NQ per MSIX vector, whereas legacy chips use
1 CP ring per MSIX vector.  To better unify this, add a resv_irqs
field to struct bnxt_hw_resc.  On legacy chips, we initialize resv_irqs
with resv_cp_rings.  On new chips, we initialize it with the allocated
MSIX resources.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:46:58 -08:00
Michael Chan 804fba4e9f bnxt_en: Fix CNP CoS queue regression.
Recent changes to support the 57500 devices have created this
regression.  The bnxt_hwrm_queue_qportcfg() call was moved to be
called earlier before the RDMA support was determined, causing
the CoS queues configuration to be set before knowing whether RDMA
was supported or not.  Fix it by moving it to the right place right
after RDMA support is determined.

Fixes: 98f04cf0f1 ("bnxt_en: Check context memory requirements from firmware.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-09 11:46:58 -08:00
Linus Torvalds 0844895a2e Char/Misc driver fixes for 4.20-rc6
Here are some small driver fixes for 4.20-rc6.
 
 There is a hyperv fix that for some reaon took forever to get into a
 shape that could be applied to the tree properly, but resolves a
 much reported issue.  The others are some gnss patches, one a bugfix and
 the two others updates to the MAINTAINERS file to properly match the
 gnss files in the tree.
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXAz5RQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykKlACeKFk6Q9zC1gjaxNxNzV/RvCsMqNcAnjUcE7cw
 KluPsfg8zZp8Jf5eQNHG
 =hpqG
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 4.20-rc6.

  There is a hyperv fix that for some reaon took forever to get into a
  shape that could be applied to the tree properly, but resolves a much
  reported issue. The others are some gnss patches, one a bugfix and the
  two others updates to the MAINTAINERS file to properly match the gnss
  files in the tree.

  All have been in linux-next for a while with no reported issues"

* tag 'char-misc-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching
  MAINTAINERS: add gnss scm tree
  gnss: sirf: fix activation retry handling
  Drivers: hv: vmbus: Offload the handling of channels to two workqueues
2018-12-09 10:43:17 -08:00
Linus Torvalds 47dcb0802d Staging fixes for 4.20-rc6
Here are two staging driver bugfixes for 4.20-rc6.
 
 One is a revert of a previously incorrect patch that was merged a while
 ago, and the other resolves a possible buffer overrun that was found by
 code inspection.
 
 Both of these have been in the linux-next tree with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXAz4ng8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn/YACffYTp0y+UkyjJ/wTtrl15/ny2UdwAn3gErbci
 tbvcc0bF5wvm0F1vKcU/
 =OoII
 -----END PGP SIGNATURE-----

Merge tag 'staging-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging fixes from Greg KH:
 "Here are two staging driver bugfixes for 4.20-rc6.

  One is a revert of a previously incorrect patch that was merged a
  while ago, and the other resolves a possible buffer overrun that was
  found by code inspection.

  Both of these have been in the linux-next tree with no reported
  issues"

* tag 'staging-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  Revert commit ef9209b642 "staging: rtl8723bs: Fix indenting errors and an off-by-one mistake in core/rtw_mlme_ext.c"
  staging: rtl8712: Fix possible buffer overrun
2018-12-09 10:35:33 -08:00
Linus Torvalds 822b7683ff TTY driver fixes for 4.20-rc6
Here are three small tty driver fixes for 4.20-rc6
 
 Nothing major, just some bug fixes for reported issues.  Full details
 are in the shortlog.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXAz36A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymYQgCgqA04cNFcelv38uoh9I1+AtOCRtEAnA64BAPe
 HSAR9D0dbU2kb7vmIHg8
 =vppK
 -----END PGP SIGNATURE-----

Merge tag 'tty-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty driver fixes from Greg KH:
 "Here are three small tty driver fixes for 4.20-rc6

  Nothing major, just some bug fixes for reported issues. Full details
  are in the shortlog.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  kgdboc: fix KASAN global-out-of-bounds bug in param_set_kgdboc_var()
  tty: serial: 8250_mtk: always resume the device in probe.
  tty: do not set TTY_IO_ERROR flag if console port
2018-12-09 10:24:29 -08:00
Linus Torvalds 50a5528a4b USB fixes for 4.20-rc6
Here are some small USB fixes for 4.20-rc6
 
 The "largest" here are some xhci fixes for reported issues.  Also here
 is a USB core fix, some quirk additions, and a usb-serial fix which
 required the export of one of the tty layer's functions to prevent code
 duplication.  The tty maintainer agreed with this change.
 
 All of these have been in linux-next with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXAz3WA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yl+mwCgox0WZiO3qfnpQsIrJ5gJFCTrcmMAoMBC5jDf
 e81bwzu507PcqvEXh+rg
 =giw6
 -----END PGP SIGNATURE-----

Merge tag 'usb-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 4.20-rc6

  The "largest" here are some xhci fixes for reported issues. Also here
  is a USB core fix, some quirk additions, and a usb-serial fix which
  required the export of one of the tty layer's functions to prevent
  code duplication. The tty maintainer agreed with this change.

  All of these have been in linux-next with no reported issues"

* tag 'usb-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  xhci: Prevent U1/U2 link pm states if exit latency is too long
  xhci: workaround CSS timeout on AMD SNPS 3.0 xHC
  USB: check usb_get_extra_descriptor for proper size
  USB: serial: console: fix reported terminal settings
  usb: quirk: add no-LPM quirk on SanDisk Ultra Flair device
  USB: Fix invalid-free bug in port_over_current_notify()
  usb: appledisplay: Add 27" Apple Cinema Display
2018-12-09 10:18:24 -08:00
Linus Torvalds bc4caf186f a fix for smb3 direct i/o, a fix for CIFS DFS for stable and a minor cifs Kconfig fix
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlwMfwoACgkQiiy9cAdy
 T1FkqAv/Tty/Od1u3r/il26vogLunG9JvmC62upgfpb684isEOnYn0oaT9FElunB
 C43FXF8Zw73zKv/9IqmLExNGZJ0Jm+5eCjnGX+e05QWN8ais91V2IW+4JMLwCga3
 KWBD0V1SBa355cObpvWe1WzrIzm1lwa3aCO/lpENo7Ph+kZXcFVIuOPRRxwLOUU9
 dkZCtvs5Pb8LVu2k1/jUXh1hqibUGR3V46txDMH6oomsLNV97xTC9URBCbpeWb3f
 x4mpUeWP3DnUyK8TTr3W1MWh1jlUHg3/8yDHfCuxuluAgRFQpI8Eru18XE8abnkz
 loFu9eLVT6iwJAlzmkXFF4cGi8LXqz3x4Z+1DopH2S2DBwedw9Tc1dptwOuUbpfg
 SO4jMOaNXOUSnN+0pCdEiO1z/Nc0YaNg++5AvGRM0BRohpNk6X2COPocf9HvcFI8
 xWajm+nkSsEaQNtXRAj/ku7cLovNgHTJt0KXQyCgxBVol19Gf5davHTe5+pPnsfQ
 e+jdUek4
 =wWgw
 -----END PGP SIGNATURE-----

Merge tag '4.20-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three small fixes: a fix for smb3 direct i/o, a fix for CIFS DFS for
  stable and a minor cifs Kconfig fix"

* tag '4.20-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: Avoid returning EBUSY to upper layer VFS
  cifs: Fix separator when building path from dentry
  cifs: In Kconfig CONFIG_CIFS_POSIX needs depends on legacy (insecure cifs)
2018-12-09 10:15:13 -08:00
Linus Torvalds fa82dcbf2a dax fixes 4.20-rc6
* Fix the Xarray conversion of fsdax to properly handle
 dax_lock_mapping_entry() in the presense of pmd entries.
 
 * Fix inode destruction racing a new lock request.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcDLC5AAoJEB7SkWpmfYgC0yAP/2NULGPapIoR2aw7GKPOStHR
 tOMGE8+ZDtpd19oVoEiB3PuhMkiCj5Pkyt2ka9eUXSk3VR/N6EtGwBANFUO+0tEh
 yU8nca2T2G1f+MsZbraWA9zj9aPnfpLt46r7p/AjIB2j1/vyXkYMkXmkXtCvMEUi
 3Q9dVsT53fWncJBlTe/WW84wWQp77VsVafN4/UP75dv8cN9F+u91P1xvUXu2AKus
 RBqjlp6G6xMZ9HcWCQmStdPCi+qbO4k3oTZcohd/DYLb70ZL1kLoEMOjK2/3GnaV
 YMHZGWRAK5jroDZbdXXnJHE+4/opZpSWhW/J1WFEJUSSr7YmZsWhwegmrBlQ5cp3
 V0fZ7LjBWjgtUKsjlmfmu4ccLY87LzVlfIEtJH2+QNiLy/Pl/O0P/Jzg9kHVLEgo
 zph2yIQlx3yshej8EEBPylDJVtYVzOyjLNLx7r3bL6AzQv+6CKqjxtYOiIDwYQsA
 Db1InfeDzAAoH6NYSbqN+yG01Bz7zeCYT7Ch4rYSKcA1i8vUMR6iObI32fC3xJQm
 +TxYmqngrJaU8XOMikIem/qGn4fBpYFULrIdK7lj/WmIxQq8Hp0mqz1ikMIoErzN
 ZXV5RcbbvSSZi5EOrrVKbr2IfIcx+HYo/bIAkRCR69o8bcIZnLPHDky/7gl7GS/5
 sc9A/kysseeF45L2+6QG
 =kUil
 -----END PGP SIGNATURE-----

Merge tag 'dax-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull dax fixes from Dan Williams:
 "The last of the known regression fixes and fallout from the Xarray
  conversion of the filesystem-dax implementation.

  On the path to debugging why the dax memory-failure injection test
  started failing after the Xarray conversion a couple more fixes for
  the dax_lock_mapping_entry(), now called dax_lock_page(), surfaced.
  Those plus the bug that started the hunt are now addressed. These
  patches have appeared in a -next release with no issues reported.

  Note the touches to mm/memory-failure.c are just the conversion to the
  new function signature for dax_lock_page().

  Summary:

   - Fix the Xarray conversion of fsdax to properly handle
     dax_lock_mapping_entry() in the presense of pmd entries

   - Fix inode destruction racing a new lock request"

* tag 'dax-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix unlock mismatch with updated API
  dax: Don't access a freed inode
  dax: Check page->mapping isn't NULL
2018-12-09 09:54:04 -08:00
Linus Torvalds bd799eb63d libnvdimm fixes 4.20-rc6
* Unless and until the core mm handles memory hotplug units smaller than
   a section (128M), persistent memory namespaces must be padded to
   section alignment. The libnvdimm core already handled section
   collision with "System RAM", but some configurations overlap
   independent "Persistent Memory" ranges within a section, so additional
   padding injection is added for that case.
 
 * The recent reworks of the ARS (address range scrub) state machine to
   reduce the number of state flags inadvertantly missed a conversion of
   acpi_nfit_ars_rescan() call sites. Fix the regression whereby
   user-requested ARS results in a "short" scrub rather than a "long"
   scrub.
 
 * Fixup the unit tests to handle / test the 128M section alignment of
   mocked test resources.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcDK/gAAoJEB7SkWpmfYgCCDsP/3VYrJfJwDVMGbvR4DX4P9WT
 vmft+ac++0aKId+Ei80GwENAQbMttmQ1woHTGAJpEt+A3AL8DWFSFQiqW0b+/eAg
 DW60dYN2cO0M6jkkJyIUMaWRy3iT7OOfEXLlJOubL7EnbivczjZFOAElvdpzPaK0
 +zIStgCdvmdcnZMi0BIst5XLoZ4/wWrR0Caq+ULpHoeDSO/oz3I2HV5gvxPE/7sJ
 WXSdB8zbLSS67fzFik1FJbbRRzEfBH7RJpgMoMpDLT3HtwcMvMrV8iJgNApccueV
 pFGhb5BpaJLaLc59RYxxVp/vLBEEaWDxq54RzU4twf4mbI1sc/NKJ08Nkwo9l8lH
 I6CW3FvegYzMkHD8PNd32GAr9HIxktvlH4Hc1GzaSSWwyeKx+5Et4llDpuuqW19o
 +wlybxRzZEoRNacwnxk1FPeOYUPPKLogkVOf14umh10tvi4UIuGkophoO1bxXc4d
 2gDPAHr3G1hAz+JV7PW/L+rO43uL8MWBYLdZgLiQ+90OAURu7e/f06j2SWzyHV1S
 9AajUqqLLV3whXHUfpl50Eymml9dEyw9NfbKOkry88Kde1NrwC/ccGB/90nKLoa+
 A2HxroRgxk8my8HXztnxLbhuxk0dnV4jK9v2mm9lGS6IUSWL6tePzneA8gGM2tYm
 rt2gaqY2bFOOSmG56Uac
 =85/B
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "A regression fix for the Address Range Scrub implementation, yes
  another one, and support for platforms that misalign persistent memory
  relative to the Linux memory hotplug section constraint. Longer term,
  support for sub-section memory hotplug would alleviate alignment
  waste, but until then this hack allows a 'struct page' memmap to be
  established for these misaligned memory regions.

  These have all appeared in a -next release, and thanks to Patrick for
  reporting and testing the alignment padding fix.

  Summary:

   - Unless and until the core mm handles memory hotplug units smaller
     than a section (128M), persistent memory namespaces must be padded
     to section alignment.

     The libnvdimm core already handled section collision with "System
     RAM", but some configurations overlap independent "Persistent
     Memory" ranges within a section, so additional padding injection is
     added for that case.

   - The recent reworks of the ARS (address range scrub) state machine
     to reduce the number of state flags inadvertantly missed a
     conversion of acpi_nfit_ars_rescan() call sites. Fix the regression
     whereby user-requested ARS results in a "short" scrub rather than a
     "long" scrub.

   - Fixup the unit tests to handle / test the 128M section alignment of
     mocked test resources.

* tag 'libnvdimm-fixes-4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  acpi/nfit: Fix user-initiated ARS to be "ARS-long" rather than "ARS-short"
  libnvdimm, pfn: Pad pfn namespaces relative to other regions
  tools/testing/nvdimm: Align test resources to 128M
2018-12-09 09:46:54 -08:00
Tarick Bedeir bd5122cd1e net/mlx4_core: Correctly set PFC param if global pause is turned off.
rx_ppp and tx_ppp can be set between 0 and 255, so don't clamp to 1.

Fixes: 6e8814ceb7 ("net/mlx4_en: Fix mixed PFC and Global pause user control requests")
Signed-off-by: Tarick Bedeir <tarick@google.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-08 21:26:36 -08:00
Linus Torvalds 6ec067e3a4 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal SoC fixes from Eduardo Valentin:
 "Fixes for armada and broadcom thermal drivers"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: broadcom: constify thermal_zone_of_device_ops structure
  thermal: armada: constify thermal_zone_of_device_ops structure
  thermal: bcm2835: Switch to SPDX identifier
  thermal: armada: fix legacy resource fixup
  thermal: armada: fix legacy validity test sense
2018-12-08 17:45:20 -08:00
Linus Torvalds 8214bdf7d3 asm-generic: bugfix for asm/unistd.h
Multiple people reported a bug I introduced in asm-generic/unistd.h
 in 4.20, this is the obvious bugfix to get glibc and others to
 correctly build again on new architectures that no longer provide
 the old fstatat64() family of system calls.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJcC+lEAAoJEGCrR//JCVIntacP/iQNGPECdK4PPXabI27nh8sp
 T91SmTso1o1J+TZATZCaUfCN5tlWcNUouaW5uiXWRYjMoHu3joQAfq6S8p8VmBwX
 9hEr+2xak7OQ7txCqucV8bH3LpxhRr/1DoUp4bekLghKPtmMDyXm5J7vcrI+ATbG
 JrfxJUGAJb7CYFuIXt8Es58/Of9cP5hr4HcdkUVb93jkZAubu9xxcf0tAyme8KXI
 6Dzq6t+gccQmz3bB2Ue12m+TbVWFh0onh+N5kQq3KfT5ku5jyUf95VkT8/ThxQlo
 W7Zmyb6MSC0O6Kwz4aRuLNf3VvSYfa0sF4+PlEEs0SSRiRZ8TYaPcBt5gPM9ttFf
 SOe6T+Ft04bsAf97yRkcGwlRPTMd4tVFKWluEihgixbX11/oIyrDPt3Pb/erE7bX
 9ik61vnH8Wb4yAikt/UF7k8lDt9N3aHn0mpXlAZWcxQ66BkgDG8UCjaLQqk94PlV
 Qd1iXFyW9Ihk5xgWORh8wZOfxcHoyNIG4gvBKjIDCrbd0/W7Aiyccg0/2mOQXz7y
 TN9nvRLHXR1dyFxL02BKZCRIHP76rEWL0t0+HNx9Tfgkwjj3B6sPosa260HCWICh
 brN7UzV5g9E/eDNiK47pj2JKIZF02J9U8Pa9vI7pibrleXcTNODAmIAxdhU0wsGk
 LsMR3eFwSx9y+lGg7UGt
 =Xdl0
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic fix from Arnd Bergmann:
 "Multiple people reported a bug I introduced in asm-generic/unistd.h in
  4.20, this is the obvious bugfix to get glibc and others to correctly
  build again on new architectures that no longer provide the old
  fstatat64() family of system calls"

* tag 'asm-generic-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: unistd.h: fixup broken macro include.
2018-12-08 11:44:04 -08:00
Linus Torvalds 570c9139c3 A few clk driver fixes this time:
- Introduce protected-clock DT binding to fix breakage on qcom sdm845-mtp
    boards where the qspi clks introduced this merge window cause the
    firmware on those boards to take down the system if we try to read
    the clk registers
 
  - Fix a couple off-by-one errors found by Dan Carpenter
 
  - Handle failure in zynq fixed factor clk driver to avoid using
    uninitialized data
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAlwLBpERHHNib3lkQGtl
 cm5lbC5vcmcACgkQrQKIl8bklSUmVRAAuswyoavqTsj09/lhEeM1w8f+2QlnCFb2
 6kUuVYmQP35u2MhuXv0sHwe1Gnnf5dL4r1d2TwCdZDTJxGIT6lbkLmeKxJzeAJzO
 TgVCSNxTfwNyAAY16tFSdY3sigjQCFL+0LrKTh8ID6Ub05xz2NUZ17d0eC8WTsn1
 6O66QBzCOAWvJ6LO8ktnpLpRUQAoECjKs9Vn2bnpBV+jxnuqthaD6LbK/snzNgI1
 NE+CKeFjs1bFP4WiXUn7HATw+GQxmMN3MOB9KSd36OILQYA5eMYS0voJW6V/LmJv
 5HImYCPc+vBhptLWflzQrbGmGV57zTH3V6OVXd0BD/tBxbIF/l56Xpv8YDC0ahe3
 RklehCka/o+cF7O9/sRLwLKKNYVNJs+JOX+MxJ3Pv/ANMaeZepMtTII38IcJFKQS
 aVXKbAwHvxWDI88PmdykO2gS5Nu6sdjkriQ63FdvRFL9UmnCTl3ruk/9Ww8aV04v
 cwwHJlw6sHVcRnzHJidh3UAyvEY8cSD9nbS2zV+xbiw1nR3S94hTx05N4UCc2Q/6
 7dRlZ9B+STni4YOnQlU0u6ApICrZEkcZk30dW7bX154x9WKbX7yusOCrpHa2Bx5b
 xvdNmuiSjNjLjmi2ZR0nFF/DAhBW9dAva/keeV7IC0Gilakgp33/MvUiLPFZ0piz
 5/vCWZx88FY=
 =t1kp
 -----END PGP SIGNATURE-----

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "A few clk driver fixes this time:

   - Introduce protected-clock DT binding to fix breakage on qcom
     sdm845-mtp boards where the qspi clks introduced this merge window
     cause the firmware on those boards to take down the system if we
     try to read the clk registers

   - Fix a couple off-by-one errors found by Dan Carpenter

   - Handle failure in zynq fixed factor clk driver to avoid using
     uninitialized data"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: zynqmp: Off by one in zynqmp_is_valid_clock()
  clk: mmp: Off by one in mmp_clk_add()
  clk: mvebu: Off by one bugs in cp110_of_clk_get()
  arm64: dts: qcom: sdm845-mtp: Mark protected gcc clocks
  clk: qcom: Support 'protected-clocks' property
  dt-bindings: clk: Introduce 'protected-clocks' property
  clk: zynqmp: handle fixed factor param query error
2018-12-08 11:33:26 -08:00
Linus Torvalds f896adc42d Changes since last update:
- Fix broken project quota inode counts
 - Fix incorrect PAGE_MASK/PAGE_SIZE usage
 - Fix incorrect return value in btree verifier
 - Fix WARN_ON remap flags false positive
 - Fix splice read overflows
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAlwGu/sACgkQ+H93GTRK
 tOum2g/5AWN2OMMqbCYru3BSrp4my26/CgGgsLdfR5go+fKvSI8txAm5Wx8r2doP
 aBQ0LuJaKm16bufpFx+LQDKvE/AV8yd3tL58ozPSbpPjdQFpl4DR3N95pURQJVhP
 Yt3tDuPk6ODJ/pKYSbM0GTTyJ5jr+tUlKU6kmAqNzg1W644TWZVV9zFCVU202sj9
 xZ6Auan3yb5wXk2vhgxPgHrVh5ngRm2G+/V8KxCVoWyP/lHUjgEcWJEu+/Jh8xOQ
 6dAYxdfngQqdlXI3dpJDOkFHbqcSrhl8+fQEnp+g0RyWcKPDS+L6wk4Xlkhaorjy
 nDB4GcnjCPHdJ8x/9pUlNmBrlXpAz4oNYqlQcJKssd5q9p+eTBH2drRDc5VDM5KU
 xSHsdIIxQ1YewJ3uHcIbaYBsK+XcrWCtypUTC73GeGkLqS1qKCuKCEnMQOR3czeE
 /Hyq6/eTSt2uG62MAOZGdIW4uaJsLTnXnfDElZ1YsdwtMEzKbYg1Ll2s86vudrRu
 Otyl3EVdXQjtWRxrelA5eexspoJNoZS69D6Haqt8MGc2HJoJGTuparV6YlsEeGoW
 bZFO8JV/Q0WFCa2cWRXVe+D+kx8fyFoDsKY1mR4Vwp5s0jQXp9/rQ81+zVjQ4wB2
 TU53a4+hMLKLW5aNH3ge3yB9ZlHZhEzex6hrlZH3kuqpAM3Yj00=
 =18EP
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "Here are hopefully the last set of fixes for 4.20.

  There's a fix for a longstanding statfs reporting problem with project
  quotas, a correction for page cache invalidation behaviors when
  fallocating near EOF, and a fix for a broken metadata verifier return
  code.

  Finally, the most important fix is to the pipe splicing code (aka the
  generic copy_file_range fallback) to avoid pointless short directio
  reads by only asking the filesystem for as much data as there are
  available pages in the pipe buffer. Our previous fix (simulated short
  directio reads because the number of pages didn't match the length of
  the read requested) caused subtle problems on overlayfs, so that part
  is reverted.

  Anyhow, this series passes fstests -g all on xfs and overlay+xfs, and
  has passed 17 billion fsx operations problem-free since I started
  testing

  Summary:

   - Fix broken project quota inode counts

   - Fix incorrect PAGE_MASK/PAGE_SIZE usage

   - Fix incorrect return value in btree verifier

   - Fix WARN_ON remap flags false positive

   - Fix splice read overflows"

* tag 'xfs-4.20-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  iomap: partially revert 4721a60109 (simulated directio short read on EFAULT)
  splice: don't read more than available pipe space
  vfs: allow some remap flags to be passed to vfs_clone_file_range
  xfs: fix inverted return from xfs_btree_sblock_verify_crc
  xfs: fix PAGE_MASK usage in xfs_free_file_space
  fs/xfs: fix f_ffree value for statfs when project quota is set
2018-12-08 11:25:02 -08:00
David Rientjes 356ff8a9a7 Revert "mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask"
This reverts commit 89c83fb539.

This should have been done as part of 2f0799a0ff ("mm, thp: restore
node-local hugepage allocations").  The movement of the thp allocation
policy from alloc_pages_vma() to alloc_hugepage_direct_gfpmask() was
intended to only set __GFP_THISNODE for mempolicies that are not
MPOL_BIND whereas the revert could set this regardless of mempolicy.

While the check for MPOL_BIND between alloc_hugepage_direct_gfpmask()
and alloc_pages_vma() was racy, that has since been removed since the
revert.  What is left is the possibility to use __GFP_THISNODE in
policy_node() when it is unexpected because the special handling for
hugepages in alloc_pages_vma()  was removed as part of the consolidation.

Secondly, prior to 89c83fb539, alloc_pages_vma() implemented a somewhat
different policy for hugepage allocations, which were allocated through
alloc_hugepage_vma().  For hugepage allocations, if the allocating
process's node is in the set of allowed nodes, allocate with
__GFP_THISNODE for that node (for MPOL_PREFERRED, use that node with
__GFP_THISNODE instead).  This was changed for shmem_alloc_hugepage() to
allow fallback to other nodes in 89c83fb539 as it did for new_page() in
mm/mempolicy.c which is functionally different behavior and removes the
requirement to only allocate hugepages locally.

So this commit does a full revert of 89c83fb539 instead of the partial
revert that was done in 2f0799a0ff.  The result is the same thp
allocation policy for 4.20 that was in 4.19.

Fixes: 89c83fb539 ("mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask")
Fixes: 2f0799a0ff ("mm, thp: restore node-local hugepage allocations")
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-12-08 10:26:20 -08:00
Benjamin Herrenschmidt 5b3279e2cb Revert "net/ibm/emac: wrong bit is used for STA control"
This reverts commit 624ca9c33c.

This commit is completely bogus. The STACR register has two formats, old
and new, depending on the version of the IP block used. There's a pair of
device-tree properties that can be used to specify the format used:

	has-inverted-stacr-oc
	has-new-stacr-staopc

What this commit did was to change the bit definition used with the old
parts to match the new parts. This of course breaks the driver on all
the old ones.

Instead, the author should have set the appropriate properties in the
device-tree for the variant used on his board.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 22:37:27 -08:00
David S. Miller 8b78903bc5 Merge branch 'skb-headroom-slab-out-of-bounds'
Stefano Brivio says:

====================
Fix slab out-of-bounds on insufficient headroom for IPv6 packets

Patch 1/2 fixes a slab out-of-bounds occurring with short SCTP packets over
IPv4 over L2TP over IPv6 on a configuration with relatively low HEADER_MAX.

Patch 2/2 makes sure we avoid writing before the allocated buffer in
neigh_hh_output() in case the headroom is enough for the unaligned hardware
header size, but not enough for the aligned one, and that we warn if we hit
this condition.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 16:24:40 -08:00
Stefano Brivio e6ac64d4c4 neighbour: Avoid writing before skb->head in neigh_hh_output()
While skb_push() makes the kernel panic if the skb headroom is less than
the unaligned hardware header size, it will proceed normally in case we
copy more than that because of alignment, and we'll silently corrupt
adjacent slabs.

In the case fixed by the previous patch,
"ipv6: Check available headroom in ip6_xmit() even without options", we
end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware
header and write 16 bytes, starting 2 bytes before the allocated buffer.

Always check we're not writing before skb->head and, if the headroom is
not enough, warn and drop the packet.

v2:
 - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet
   (Eric Dumazet)
 - if we avoid the panic, though, we need to explicitly check the headroom
   before the memcpy(), otherwise we'll have corrupted slabs on a running
   kernel, after we warn
 - use __skb_push() instead of skb_push(), as the headroom check is
   already implemented here explicitly (Eric Dumazet)

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 16:24:40 -08:00
Stefano Brivio 66033f47ca ipv6: Check available headroom in ip6_xmit() even without options
Even if we send an IPv6 packet without options, MAX_HEADER might not be
enough to account for the additional headroom required by alignment of
hardware headers.

On a configuration without HYPERV_NET, WLAN, AX25, and with IPV6_TUNNEL,
sending short SCTP packets over IPv4 over L2TP over IPv6, we start with
100 bytes of allocated headroom in sctp_packet_transmit(), end up with 54
bytes after l2tp_xmit_skb(), and 14 bytes in ip6_finish_output2().

Those would be enough to append our 14 bytes header, but we're going to
align that to 16 bytes, and write 2 bytes out of the allocated slab in
neigh_hh_output().

KASan says:

[  264.967848] ==================================================================
[  264.967861] BUG: KASAN: slab-out-of-bounds in ip6_finish_output2+0x1aec/0x1c70
[  264.967866] Write of size 16 at addr 000000006af1c7fe by task netperf/6201
[  264.967870]
[  264.967876] CPU: 0 PID: 6201 Comm: netperf Not tainted 4.20.0-rc4+ #1
[  264.967881] Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
[  264.967887] Call Trace:
[  264.967896] ([<00000000001347d6>] show_stack+0x56/0xa0)
[  264.967903]  [<00000000017e379c>] dump_stack+0x23c/0x290
[  264.967912]  [<00000000007bc594>] print_address_description+0xf4/0x290
[  264.967919]  [<00000000007bc8fc>] kasan_report+0x13c/0x240
[  264.967927]  [<000000000162f5e4>] ip6_finish_output2+0x1aec/0x1c70
[  264.967935]  [<000000000163f890>] ip6_finish_output+0x430/0x7f0
[  264.967943]  [<000000000163fe44>] ip6_output+0x1f4/0x580
[  264.967953]  [<000000000163882a>] ip6_xmit+0xfea/0x1ce8
[  264.967963]  [<00000000017396e2>] inet6_csk_xmit+0x282/0x3f8
[  264.968033]  [<000003ff805fb0ba>] l2tp_xmit_skb+0xe02/0x13e0 [l2tp_core]
[  264.968037]  [<000003ff80631192>] l2tp_eth_dev_xmit+0xda/0x150 [l2tp_eth]
[  264.968041]  [<0000000001220020>] dev_hard_start_xmit+0x268/0x928
[  264.968069]  [<0000000001330e8e>] sch_direct_xmit+0x7ae/0x1350
[  264.968071]  [<000000000122359c>] __dev_queue_xmit+0x2b7c/0x3478
[  264.968075]  [<00000000013d2862>] ip_finish_output2+0xce2/0x11a0
[  264.968078]  [<00000000013d9b14>] ip_finish_output+0x56c/0x8c8
[  264.968081]  [<00000000013ddd1e>] ip_output+0x226/0x4c0
[  264.968083]  [<00000000013dbd6c>] __ip_queue_xmit+0x894/0x1938
[  264.968100]  [<000003ff80bc3a5c>] sctp_packet_transmit+0x29d4/0x3648 [sctp]
[  264.968116]  [<000003ff80b7bf68>] sctp_outq_flush_ctrl.constprop.5+0x8d0/0xe50 [sctp]
[  264.968131]  [<000003ff80b7c716>] sctp_outq_flush+0x22e/0x7d8 [sctp]
[  264.968146]  [<000003ff80b35c68>] sctp_cmd_interpreter.isra.16+0x530/0x6800 [sctp]
[  264.968161]  [<000003ff80b3410a>] sctp_do_sm+0x222/0x648 [sctp]
[  264.968177]  [<000003ff80bbddac>] sctp_primitive_ASSOCIATE+0xbc/0xf8 [sctp]
[  264.968192]  [<000003ff80b93328>] __sctp_connect+0x830/0xc20 [sctp]
[  264.968208]  [<000003ff80bb11ce>] sctp_inet_connect+0x2e6/0x378 [sctp]
[  264.968212]  [<0000000001197942>] __sys_connect+0x21a/0x450
[  264.968215]  [<000000000119aff8>] sys_socketcall+0x3d0/0xb08
[  264.968218]  [<000000000184ea7a>] system_call+0x2a2/0x2c0

[...]

Just like ip_finish_output2() does for IPv4, check that we have enough
headroom in ip6_xmit(), and reallocate it if we don't.

This issue is older than git history.

Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 16:24:40 -08:00
Eric Dumazet f9bfe4e6a9 tcp: lack of available data can also cause TSO defer
tcp_tso_should_defer() can return true in three different cases :

 1) We are cwnd-limited
 2) We are rwnd-limited
 3) We are application limited.

Neal pointed out that my recent fix went too far, since
it assumed that if we were not in 1) case, we must be rwnd-limited

Fix this by properly populating the is_cwnd_limited and
is_rwnd_limited booleans.

After this change, we can finally move the silly check for FIN
flag only for the application-limited case.

The same move for EOR bit will be handled in net-next,
since commit 1c09f7d073 ("tcp: do not try to defer skbs
with eor mark (MSG_EOR)") is scheduled for linux-4.21

Tested by running 200 concurrent netperf -t TCP_RR -- -r 60000,100
and checking none of them was rwnd_limited in the chrono_stat
output from "ss -ti" command.

Fixes: 41727549de ("tcp: Do not underestimate rwnd_limited")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 16:18:22 -08:00
Linus Torvalds 5f179793f0 virtio fixes
A couple of last-minute fixes.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJcCdNyAAoJECgfDbjSjVRpNAoH/A2eW0+UjIQar+jPKh1XPwN6
 uOoPAS3AnXGC0qhlJb2/W77intpLmF/SMkOSNBfvg1MXYTPJRGXcjS7v7446qpZf
 iCk/UEDN3Ck0OuxoR4AfO5qkbZSDGBGYDoSvB4JLufy9PTaNCrd+gBM349Fg1cRc
 lkK6aXLe7vydB+x0rLyfCrBEccZ8UtmCGs1FIzd9bvDgRGzlPnPGwavX0w8lx7Jn
 ZtQClajG2BjF0kMu+1hW9m781yjh7TYwshG2lYhQama5a8x3X5jOIUDNknfw3uQd
 crWOPPlzIELJcWrmG2/psydNGcq1b0S6vhFbpO7BiFpvCvdW5hAMk75c7hBWpls=
 =DLS0
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull vhost/virtio fixes from Michael Tsirkin:
 "A couple of last-minute fixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/vsock: fix use-after-free in network stack callers
  virtio/s390: fix race in ccw_io_helper()
  virtio/s390: avoid race on vcdev->config
  vhost/vsock: fix reset orphans race with close timeout
2018-12-07 14:34:10 -08:00
Linus Torvalds b8bf4692c9 - Avoid sending IPIs with interrupts disabled
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlwKvQgACgkQa9axLQDI
 XvH+uw//UTEMR3jPre+hLqXQ+/hWWu9yYQeD9PVsyce6a7dIJ2NsL0Y9k4eORArw
 2rftLWq1r2CjndlULwxge01WQ2RXGiDgQ5GFutwGw/6kWid/JPnHj4yhDyDz3mR7
 DtioRhVUx1X3I+gQIqlGZSlF24wHcJfcbbVDBpydpW428Qo0OqM4GZAiCQha7QCZ
 HUMFZgWrYh6vsSe77151cRqoe8gozwlJ3aYrgf1E7cORCVRCbc2uyKLNihizmzIr
 3FwSPPn6zVPiKHfoYPjoyIBGd5Um3I1MgUhRVY/w4sZbylsdVx9whJbC6J5rg3xu
 5gCU3B0tJbZCQJQlJwtUrWNocJ0CGwK20wBukpWtHPyIZg4lf9etzthXEP1Pqn6m
 WT7LXj3SMV0qykp8mPes+AbSKJHOhE9/yoS0wu1As/mGA6sbnu0p/T77DeovChC/
 MTvNFQTvd1MudqOGbvMJrWtyrKG5dPJsE1RZ/xdgqA+jEfgJ7foZz8XCnT8Ypf9X
 Gl0JwcnmhNRXXCZ6FQweyHUgkUMFyLce1YnvtTmWr9QP94HG32IpNBqOLhaIX7Nz
 2NvENq6pmKzzOBOXB8IWz29ZFUhaiM0gaI5BT8XmMQHI6q/Xctv04KY91aJ+RAoU
 zkVt5n1e2gIW3z7oAA6hfRbn7JRvF7TWNaGjRb23783Y3nBHQtw=
 =812H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Avoid sending IPIs with interrupts disabled"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: hibernate: Avoid sending cross-calling with interrupts disabled
2018-12-07 14:18:49 -08:00
Linus Torvalds 1cdc3624a1 Fixes for stackleak
- Remove tracing for inserted stack depth marking function (Anders Roxell)
 - Move gcc-plugin pass location to avoid objtool warnings (Alexander Popov)
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlwKp1IWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJuT9D/9DP75YerfMFxiVx8BsFnGVfPW3
 QWa/nf2c8VMhmouQ9OI8j8Nj+T4q5VXewbGC5I0F6b2YsIPjHOwK0PR557xn7jRi
 7bY3aTRzJs4v+dDYkXqTkGx4zQ9FSD3NDM0T4vtnVGEdOmojcvoLX6+V7WArOTaa
 M9oP4iNn5/+Z677HyMP3DyTY093WpCx0fNOAf1HI/kpM3TPVJiE5OLXBZY957N01
 eBrt0WHJkmaZkHeqUkK06RTxYzIKBQqFRw77pPiKq79ETxBEwHOgU2hmwwHBv4+h
 u6TQmy7aVsUiXfS1GVvkNkX/jCNxYuK8kP5dsd+cQKn3AfkDHj3RvBTOvrkD0xyF
 7F9Toz/Wpw8+/YVx8ks6cNrssmEq4rd6T7MJcoud1TwEG1o/bSUbPc4uednuIUGL
 sB4J6sxApL2vaZtgqUePVZZJVKwiryFa8LymihkHMfPU4dgCycrYLGa3A1ju9WVs
 psGYhFTEfC1KVLgTmfwZlxz/FWbRmSERRF7cl9cdw8mdlqkKxP1C//VgsdJXOnnW
 c51BS+XK9OI8HTYXmWah82ysuCE7qou4DUJA91jhyza5tEp2V5C0uhOQz2odFcBF
 8axjqExFr4YfAwIgtGOClPA0e5CaB4ASRbOIs8+WL03LiNbfP/p6+92TpnwaP637
 Q5CbAMIfKqNpqAcAJg==
 =1JZ6
 -----END PGP SIGNATURE-----

Merge tag 'gcc-plugins-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc stackleak plugin fixes from Kees Cook:

 - Remove tracing for inserted stack depth marking function (Anders
   Roxell)

 - Move gcc-plugin pass location to avoid objtool warnings (Alexander
   Popov)

* tag 'gcc-plugins-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  stackleak: Register the 'stackleak_cleanup' pass before the '*free_cfg' pass
  stackleak: Mark stackleak_track_stack() as notrace
2018-12-07 13:13:07 -08:00
Linus Torvalds 52ab2ec005 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:

 - Disable the new crypto stats interface as it's still being changed

 - Fix potential uses-after-free in cbc/cfb/pcbc.

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: user - Disable statistics interface
  crypto: do not free algorithm before using
2018-12-07 13:07:10 -08:00
Linus Torvalds 7b24f6c082 pci-v4.20-fixes-3
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlwKrrsUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxKmxAAjYa1d5fcj+7tG4xPLwr2OkKVhx5p
 GKiocbCohifhtCaqYf2WONDbE7hKgDaU6GdHOgnKY3ViSOxwuJmOcSn+npCE3kTf
 Ty56mEERt2CzopUr+jqBjgoabpymCVBybDDXbs4rogevc5mp4Y4czOKjiAURKSqn
 eSI/Z9O9g0OM6Bwc3aSqI7D88G1DDryrnKD2TX5YRVh0CPI/JvPR+HAhCM2350qT
 t3o5G0lIhHyI97gtW2ZKcijUYyqYbaUPkUHqe5uFIo6tAbz6LbUEFIFjXJgx5D7P
 GpIjDPm4zYY0W/ENMGgFsijEZF3eaTYK95BbYTge4I0n2wV9qrr9wHcX/iWwA0kT
 1KPMY8rYEepT8l+vtrufX/+z5olP0a36c8ySkmAAPdRtGTeFOa+Er/cNEHJPD26G
 rpCUjLgI2553NlDpf1RDJJXbctpHb9j2JMDRp2m7esRqNU9S4iEnPTa0wCbTaNFL
 OE1qE68ZwcKQPJvcajpmVZAiYwyba5jRhcIy1XHSMgVb5r+xlyJ8WXUatSjxquDq
 yJ14JxCsYVHiv4qATCDmN4SkgZN3f+7nPVrNpJWbwjbNOeofvQti1mKdcuvLKlTE
 VHMzPiOP7AExv/YMWOSsdHQ11pIrUv1/AbMC4ln2eAy0D5bhL3F4gi8FmBaba+53
 exf99NojNiAY1Ts=
 =h1rz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.20-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Revert ASPM change that caused a regression"

* tag 'pci-v4.20-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  Revert "PCI/ASPM: Do not initialize link state when aspm_disabled is set"
2018-12-07 12:58:34 -08:00
Shmulik Ladkani 1b4e5ad5d6 ipv6: sr: properly initialize flowi6 prior passing to ip6_route_output
In 'seg6_output', stack variable 'struct flowi6 fl6' was missing
initialization.

Fixes: 6c8702c60b ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-12-07 12:22:39 -08:00
Linus Torvalds 0b43a29979 for-linus-20181207
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlwKqUQQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpg/8D/9AffzAvrfKo9aX8/yGcwFOE9VxmgGI8N/u
 XHVnuz+g+EjmTC3k+S3uMkp7+ED/oJvicqsbv93097ZVmEF1crdW/Uk0SLBkusTL
 njDBEWKgAz/nTOx8Tp1zWeXCMtIQs3LxUEIj+jPzR/ia9NW3shqNDM0v71u/tE6W
 Kjxi/uVxQ6VzJ8ppY+xCJC3d+QJ6r8SXdSmo84+OK/3u4jablX8QectrnuD/vBXd
 K9BfB5EewH+D+OR0MeGij8QlPdKV4an/RvGawFwRMQoUvJj3Uq0lh9dMEDcYjz9P
 7uHxPFAoUiNLBdasuSt4EVyRvVwnhfIvXa+6vC57Pi+vFo+GIH0v4lMa39nl0NJr
 woMLCIGS9cZCRRT9PL19bOo/JjHe8EvniHnHbJJTFG3w73SRg4TuTnjoodRLPR7G
 8TkNnO1F/LQFRn7J6+QgL17OxI4EySxSqCqTNNapHXealAQy+5P1JOCBfUWMlvsf
 it6piYt38zyKRAd3I+dw7VFkyS+7XCrKwbH6kphPRb+lhuvPo0IwxtFNMaGfbyFW
 FmDdeIHQNZ2U0MaaUn+dujkLi0eCbVw7T5FIxYuj4IQb3vefThya9FBgQsd2nlAn
 Eu2ATCclgp2wfTHlbg8lLGvjzbHgTRu5VrzjDg6If1kN8pXq6vy72oNLjN2smDHi
 eLxCawiC2g==
 =irzk
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20181207' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Let's try this again...

  We're finally happy with the DM livelock issue, and it's also passed
  overnight testing and the corruption regression test. The end result
  is much nicer now too, which is great.

  Outside of that fix, there's a pull request for NVMe with two small
  fixes, and a regression fix for BFQ from this merge window. The BFQ
  fix looks bigger than it is, it's 90% comment updates"

* tag 'for-linus-20181207' of git://git.kernel.dk/linux-block:
  blk-mq: punt failed direct issue to dispatch list
  nvmet-rdma: fix response use after free
  nvme: validate controller state before rescheduling keep alive
  block, bfq: fix decrement of num_active_groups
2018-12-07 10:40:37 -08:00
Linus Torvalds 52f842ccd6 Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "A set of driver bugfixes for the I2C subsystem"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: uniphier-f: fix violation of tLOW requirement for Fast-mode
  i2c: uniphier: fix violation of tLOW requirement for Fast-mode
  i2c: uniphier-f: fill TX-FIFO only in IRQ handler for repeated START
  i2c: uniphier-f: fix timeout error after reading 8 bytes
  i2c: scmi: Fix probe error on devices with an empty SMB0001 ACPI device node
  i2c: axxia: properly handle master timeout
  i2c: rcar: check bus state before reinitializing
  i2c: nvidia-gpu: limit reads also for combined messages
  i2c: nvidia-gpu: adhere to I2C fault codes
2018-12-07 10:31:31 -08:00
Linus Torvalds c431b42058 dmaengine-4.20-rc6
dmaengine fixes for v4.20-rc6
 
  - Fixing imx-sdma handling of channel terminations, this involves
    reverting two commits and implement async termination
  - Fix cppi dma channel deletion from pending list on stop
  - Fix FIFO size for dw controller in Intel Merrifield
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJcCfKmAAoJEHwUBw8lI4NHiDUP/2dOZoPBnzbSXRYVr0PQY1Us
 6aB3cClYtrLeGSKzSjiMgNqxRQ/VnMX3iZMmEseuLd9JNQXF48qWcH0HiPgWmoxy
 OgzcJxc/faAfFumpBnjM5Zm3+kzOWf0xPdVBkFKrJoLIMCZ4rU1bstEujzs+3ViE
 a8U2BoAADW6RWqjBSMjXGhRfmo0QxXz88wzzeV9l2gtFvgN32SlhD3+cHcRdV81K
 23XbeVD7ivmUKPTZWA+4T7O7ShGH2wB4it9DpTgCMsPdrQkdncBTjYg5JwSfNJ4j
 vfQQO02ctqm9ipZacifc0VknISzMIPRY2F/PScCuUfHbDfyw10u4NHGcZDgRrXLz
 BvE624XXZoISmRzLvRxdayo1pPVyVzvkT5BvTcsMGx44SrO8fmTf/M/B4HMlj6F+
 vCe+h6e77eY+FsIvZfrZiOYEp45OEHTT+kuv97kvLdfBl998CpP9K8B6aODIVCt2
 YujqHUSE/Udato9H5jrAms0tZDQqSdTgQU95ZHOWrNn9+MPWRLZ9HUBF7Hmp9zA8
 SbCDseUR1bc/WeW8ge/WAbS+KCL1wbwCXm5/t6JI7yXPn/eiVQgUMI71wphtiDFK
 hMSQX6oZy0vmblMiI7WjInZJn01+NigN7xiF2uxai4Xh7MvRbg9L6pFrgxc3wfB2
 3JPfWLdIXp/Zcf/qlR2E
 =qeTx
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.20-rc6' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "Another pull request for dmaengine. We got bunch of fixes early this
  week and all are tagged to stable. Hope this is last fix for this
  cycle:

   - Fix imx-sdma handling of channel terminations, this involves
     reverting two commits and implement async termination

   - Fix cppi dma channel deletion from pending list on stop

   - Fix FIFO size for dw controller in Intel Merrifield"

* tag 'dmaengine-fix-4.20-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: dw: Fix FIFO size for Intel Merrifield
  dmaengine: cppi41: delete channel from pending list when stop channel
  dmaengine: imx-sdma: use GFP_NOWAIT for dma descriptor allocations
  dmaengine: imx-sdma: implement channel termination via worker
  Revert "dmaengine: imx-sdma: alloclate bd memory from dma pool"
  Revert "dmaengine: imx-sdma: Use GFP_NOWAIT for dma allocations"
2018-12-07 09:58:34 -08:00
Nick Desaulniers ac3e233d29 x86/vdso: Drop implicit common-page-size linker flag
GNU linker's -z common-page-size's default value is based on the target
architecture. arch/x86/entry/vdso/Makefile sets it to the architecture
default, which is implicit and redundant. Drop it.

Fixes: 2aae950b21 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu")
Reported-by: Dmitry Golovin <dima@golovin.in>
Reported-by: Bill Wendling <morbo@google.com>
Suggested-by: Dmitry Golovin <dima@golovin.in>
Suggested-by: Rui Ueyama <ruiu@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Fangrui Song <maskray@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181206191231.192355-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=38774
Link: https://github.com/ClangBuiltLinux/linux/issues/31
2018-12-07 18:57:38 +01:00
Will Deacon b4aecf7808 arm64: hibernate: Avoid sending cross-calling with interrupts disabled
Since commit 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the
I-cache for kernel mappings"), a call to flush_icache_range() will use
an IPI to cross-call other online CPUs so that any stale instructions
are flushed from their pipelines. This triggers a WARN during the
hibernation resume path, where flush_icache_range() is called with
interrupts disabled and is therefore prone to deadlock:

  | Disabling non-boot CPUs ...
  | CPU1: shutdown
  | psci: CPU1 killed.
  | CPU2: shutdown
  | psci: CPU2 killed.
  | CPU3: shutdown
  | psci: CPU3 killed.
  | WARNING: CPU: 0 PID: 1 at ../kernel/smp.c:416 smp_call_function_many+0xd4/0x350
  | Modules linked in:
  | CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0-rc4 #1

Since all secondary CPUs have been taken offline prior to invalidating
the I-cache, there's actually no need for an IPI and we can simply call
__flush_icache_range() instead.

Cc: <stable@vger.kernel.org>
Fixes: 3b8c9f1cdf ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Reported-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Tested-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-12-07 15:52:39 +00:00
Jens Axboe 8b878ee247 Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph.

* 'nvme-4.20' of git://git.infradead.org/nvme:
  nvmet-rdma: fix response use after free
  nvme: validate controller state before rescheduling keep alive
2018-12-07 08:40:13 -07:00
Jens Axboe c616cbee97 blk-mq: punt failed direct issue to dispatch list
After the direct dispatch corruption fix, we permanently disallow direct
dispatch of non read/write requests. This works fine off the normal IO
path, as they will be retried like any other failed direct dispatch
request. But for the blk_insert_cloned_request() that only DM uses to
bypass the bottom level scheduler, we always first attempt direct
dispatch. For some types of requests, that's now a permanent failure,
and no amount of retrying will make that succeed. This results in a
livelock.

Instead of making special cases for what we can direct issue, and now
having to deal with DM solving the livelock while still retaining a BUSY
condition feedback loop, always just add a request that has been through
->queue_rq() to the hardware queue dispatch list. These are safe to use
as no merging can take place there. Additionally, if requests do have
prepped data from drivers, we aren't dependent on them not sharing space
in the request structure to safely add them to the IO scheduler lists.

This basically reverts ffe81d4532 and is based on a patch from Ming,
but with the list insert case covered as well.

Fixes: ffe81d4532 ("blk-mq: fix corruption with direct issue")
Cc: stable@vger.kernel.org
Suggested-by: Ming Lei <ming.lei@redhat.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07 08:16:11 -07:00
Israel Rukshin d7dcdf9d4e nvmet-rdma: fix response use after free
nvmet_rdma_release_rsp() may free the response before using it at error
flow.

Fixes: 8407879 ("nvmet-rdma: fix possible bogus dereference under heavy load")
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07 07:11:11 -08:00
James Smart 86880d6461 nvme: validate controller state before rescheduling keep alive
Delete operations are seeing NULL pointer references in call_timer_fn.
Tracking these back, the timer appears to be the keep alive timer.

nvme_keep_alive_work() which is tied to the timer that is cancelled
by nvme_stop_keep_alive(), simply starts the keep alive io but doesn't
wait for it's completion. So nvme_stop_keep_alive() only stops a timer
when it's pending. When a keep alive is in flight, there is no timer
running and the nvme_stop_keep_alive() will have no affect on the keep
alive io. Thus, if the io completes successfully, the keep alive timer
will be rescheduled.   In the failure case, delete is called, the
controller state is changed, the nvme_stop_keep_alive() is called while
the io is outstanding, and the delete path continues on. The keep
alive happens to successfully complete before the delete paths mark it
as aborted as part of the queue termination, so the timer is restarted.
The delete paths then tear down the controller, and later on the timer
code fires and the timer entry is now corrupt.

Fix by validating the controller state before rescheduling the keep
alive. Testing with the fix has confirmed the condition above was hit.

Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-12-07 07:11:11 -08:00
Paolo Valente ba7aeae553 block, bfq: fix decrement of num_active_groups
Since commit '2d29c9f89fcd ("block, bfq: improve asymmetric scenarios
detection")', if there are process groups with I/O requests waiting for
completion, then BFQ tags the scenario as 'asymmetric'. This detection
is needed for preserving service guarantees (for details, see comments
on the computation * of the variable asymmetric_scenario in the
function bfq_better_to_idle).

Unfortunately, commit '2d29c9f89fcd ("block, bfq: improve asymmetric
scenarios detection")' contains an error exactly in the updating of
the number of groups with I/O requests waiting for completion: if a
group has more than one descendant process, then the above number of
groups, which is renamed from num_active_groups to a more appropriate
num_groups_with_pending_reqs by this commit, may happen to be wrongly
decremented multiple times, namely every time one of the descendant
processes gets all its pending I/O requests completed.

A correct, complete solution should work as follows. Consider a group
that is inactive, i.e., that has no descendant process with pending
I/O inside BFQ queues. Then suppose that num_groups_with_pending_reqs
is still accounting for this group, because the group still has some
descendant process with some I/O request still in
flight. num_groups_with_pending_reqs should be decremented when the
in-flight request of the last descendant process is finally completed
(assuming that nothing else has changed for the group in the meantime,
in terms of composition of the group and active/inactive state of
child groups and processes). To accomplish this, an additional
pending-request counter must be added to entities, and must be
updated correctly.

To avoid this additional field and operations, this commit resorts to
the following tradeoff between simplicity and accuracy: for an
inactive group that is still counted in num_groups_with_pending_reqs,
this commit decrements num_groups_with_pending_reqs when the first
descendant process of the group remains with no request waiting for
completion.

This simplified scheme provides a fix to the unbalanced decrements
introduced by 2d29c9f89f. Since this error was also caused by lack
of comments on this non-trivial issue, this commit also adds related
comments.

Fixes: 2d29c9f89f ("block, bfq: improve asymmetric scenarios detection")
Reported-by: Steven Barrett <steven@liquorix.net>
Tested-by: Steven Barrett <steven@liquorix.net>
Tested-by: Lucjan Lucjanov <lucjan.lucjanov@gmail.com>
Reviewed-by: Federico Motta <federico@willer.it>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-12-07 07:40:07 -07:00
Greg Kroah-Hartman dbde117c31 GNSS fixes for 4.20-rc6
Here's a fix for a broken activation retry loop in the sirf driver.
 
 Included are also two MAINTAINERS updates.
 
 All have been in linux-next with no reported issues.
 
 Signed-off-by: Johan Hovold <johan@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQHbPq+cpGvN/peuzMLxc3C7H1lCAUCXAptswAKCRALxc3C7H1l
 CPSwAQChmb2y7CwP5il1PxqTa98F9hRx4H3DBCyPvS4WT5U1MgEAlrx1fnuAPiS3
 97Q6+ZRCyy5rbyjPpzJ6NwTyt13c4w0=
 =oN0i
 -----END PGP SIGNATURE-----

Merge tag 'gnss-4.20-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/gnss into char-misc-linus

Johan writes:

GNSS fixes for 4.20-rc6

Here's a fix for a broken activation retry loop in the sirf driver.

Included are also two MAINTAINERS updates.

All have been in linux-next with no reported issues.

Signed-off-by: Johan Hovold <johan@kernel.org>

* tag 'gnss-4.20-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/gnss:
  MAINTAINERS: exclude gnss from SIRFPRIMA2 regex matching
  MAINTAINERS: add gnss scm tree
  gnss: sirf: fix activation retry handling
2018-12-07 14:05:28 +01:00