Commit Graph

2642 Commits

Author SHA1 Message Date
Chuck Lever 03ffd92494 xprtrdma: Clean up tracepoints in the reply path
Replace unnecessary display of kernel memory addresses.

Also, there are no longer any trace_xprtrdma_defer_cmp() call sites.
And remove the trace_xprtrdma_leaked_rep() tracepoint because there
doesn't seem to be an overwhelming need to have a tracepoint for
catching a software bug that has long since been fixed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:36:00 -05:00
Chuck Lever 3a9568fedc xprtrdma: Clean up reply parsing error tracepoints
- Rename the tracepoints with the "_err" suffix to indicate these
  are rare error events
- Replace display of kernel memory addresses
- Tie the XID and error to a connection IP address instead

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:32:31 -05:00
Chuck Lever 36a55edfc3 xprtrdma: Clean up trace_xprtrdma_post_linv
- Replace the display of kernel memory addresses
- Add "_err" to the end of its name to indicate that it's a
  tracepoint that fires only when there's an error

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:26:06 -05:00
Chuck Lever 5ecef9c843 xprtrdma: Introduce FRWR completion IDs
Set up a completion ID in each rpcrdma_frwr. The ID is used to match
an incoming completion to a transport (CQ) and other MR-related
activity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:23:29 -05:00
Chuck Lever b2e7467f26 xprtrdma: Introduce Send completion IDs
Set up a completion ID in each rpcrdma_req. The ID is used to match
an incoming Send completion to a transport and to a previous
ib_post_send().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:21:12 -05:00
Chuck Lever af5865d278 xprtrdma: Introduce Receive completion IDs
Set up a completion ID in each rpcrdma_rep. The ID is used to match
an incoming Receive completion to a transport and to a previous
ib_post_recv().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:18:58 -05:00
Chuck Lever 3821e232eb xprtrdma: Replace dprintk call sites in ERR_CHUNK path
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-11-11 10:12:30 -05:00
Linus Torvalds 3552c3709c This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc
refactoring.
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCAAzFiEEYtFWavXG9hZotryuJ5vNeUKO4b4FAl+pk7QVHGJmaWVsZHNA
 ZmllbGRzZXMub3JnAAoJECebzXlCjuG+lwgQAL0WE92H1QJwYtrC5bXko1CjXjL7
 I1lv/rMf1ZHhdbZLZQNSqXFYTGrO3w6n02H7bJcYlryg5YSt8i8evdJXICYyeZIX
 5QAT0K5hzHTNWKnumqBSwoVOPl1e6ImZtmyxqQvA/2sQP18OPvroK/9H0YkdnM3/
 d8lcpKTBCJj0UAWmktaXGYG8PdNSjaNXMfPRwpCOGHiXk+QBAb+QjshB54PKjjhR
 aiJTJzceroLer0YlQSXfVQMt6EwkTkjCbMbxPywfFYGGvl/Y7H4YgVA8rYqO/XZr
 BmP9V+xX87GyB0IEGxoheVcmTMUSw37JUfAC2oBQB9g2emG5avRn4vdhL25nKd1T
 sgaVC+0tnoMQ7KNaYp1SK6orgS+OIYeQLhxbu6jmU+viccJ621JmpRF+95OwEZ9Z
 4+vBwI3Oft20jndgNwrTvCLgkzEVFpJuayBeZCk7pvchM2YjaWwl291ix+cwM2wQ
 fwMVs6dpLIgfB8jNOM6qAfI1jB1HMePrPraqxddxh5tZ0Tt4C4uwpEIDDwaPesmJ
 FK3JB+7GpU/tMHmmaeVFUMGx9V+8fJFEC0MFUrrqAMZ3XbzQ+DM5ysk1TQsO0OEO
 F1ojiYNW8s4U+dLCY0S16vFVoQIuM9Ui1zXGaJHQgS04l+cFCmD495s4HtYA1k7l
 H/T/o416bZlbOhcK
 =bpPt
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
 "This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc
  refactoring"

* tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux:
  net/sunrpc: fix useless comparison in proc_do_xprt()
  net/sunrpc: return 0 on attempt to write to "transports"
  NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy
  NFSD: Fix use-after-free warning when doing inter-server copy
  NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL
  SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow()
  NFSD: NFSv3 PATHCONF Reply is improperly formed
2020-11-09 12:43:12 -08:00
Harshad Shirwadkar 556e0319fb ext4: disable fast commit with data journalling
Fast commits don't work with data journalling. This patch disables the
fast commit support when data journalling is turned on.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20201106035911.1942128-19-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 23:01:05 -05:00
Harshad Shirwadkar b21ebf143a ext4: mark fc ineligible if inode gets evictied due to mem pressure
If inode gets evicted due to memory pressure, we have to remove it
from the fast commit list. However, that inode may have uncommitted
changes that fast commits will lose. So, just fall back to full
commits in this case. Also, rename the fast commit ineligiblity reason
from "EXT4_FC_REASON_MEM" to "EXT4_FC_REASON_MEM_NOMEM" for better
expression.

Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201106035911.1942128-3-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-11-06 23:01:02 -05:00
Chuck Lever d321ff589c SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow()
The TP_fast_assign() section is careful enough not to dereference
xdr->rqst if it's NULL. The TP_STRUCT__entry section is not.

Fixes: 5582863f45 ("SUNRPC: Add XDR overflow trace event")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2020-11-05 17:20:12 -05:00
Adrian Hunter fe1d4c2ebc scsi: ufs: Add DeepSleep feature
DeepSleep is a UFS v3.1 feature that achieves the lowest power consumption
of the device, apart from power off.

In DeepSleep mode, no commands are accepted, and the only way to exit is
using a hardware reset or power cycle.

This patch assumes that if a power cycle was an option, then power off
would be preferable, so only exit via a hardware reset is supported.

Drivers that wish to support DeepSleep need to set a new capability flag
UFSHCD_CAP_DEEPSLEEP and provide a hardware reset via the existing
 ->device_reset() callback.

It is assumed that UFS devices with wspecversion >= 0x310 support
DeepSleep.

[mkp: dropped sysfs ABI doc due to conflicts]

Link: https://lore.kernel.org/r/20201103141403.2142-2-adrian.hunter@intel.com
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Can Guo <cang@codeaurora.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-11-04 23:03:28 -05:00
Chao Yu fa4320cefb f2fs: move ioctl interface definitions to separated file
Like other filesystem does, we introduce a new file f2fs.h in path of
include/uapi/linux/, and move f2fs-specified ioctl interface definitions
to that file, after then, in order to use those definitions, userspace
developer only need to include the new header file rather than
copy & paste definitions from fs/f2fs/f2fs.h.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-11-02 08:33:02 -08:00
David Howells f86726a69d afs: Fix afs_invalidatepage to adjust the dirty region
Fix afs_invalidatepage() to adjust the dirty region recorded in
page->private when truncating a page.  If the dirty region is entirely
removed, then the private data is cleared and the page dirty state is
cleared.

Without this, if the page is truncated and then expanded again by truncate,
zeros from the expanded, but no-longer dirty region may get written back to
the server if the page gets laundered due to a conflicting 3rd-party write.

It mustn't, however, shorten the dirty region of the page if that page is
still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
stored in page->private to record this.

Fixes: 4343d00872 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29 13:53:04 +00:00
David Howells 185f0c7073 afs: Wrap page->private manipulations in inline functions
The afs filesystem uses page->private to store the dirty range within a
page such that in the event of a conflicting 3rd-party write to the server,
we write back just the bits that got changed locally.

However, there are a couple of problems with this:

 (1) I need a bit to note if the page might be mapped so that partial
     invalidation doesn't shrink the range.

 (2) There aren't necessarily sufficient bits to store the entire range of
     data altered (say it's a 32-bit system with 64KiB pages or transparent
     huge pages are in use).

So wrap the accesses in inline functions so that future commits can change
how this works.

Also move them out of the tracing header into the in-directory header.
There's not really any need for them to be in the tracing header.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-29 13:53:04 +00:00
Matthias Kaehlcke cab477d0d4 PM / devfreq: Add tracepoint for frequency changes
Add a tracepoint for frequency changes of devfreq devices and
use it.

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
[cw00.choi: Move print position of tracepoint and add more information]
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2020-10-26 10:52:37 +09:00
Chanwoo Choi 4281461c01 trace: events: devfreq: Use fixed indentation size to improve readability
Each tracepoint infromation consist of the different size value.
So, in order to improve the readability, use the fixed indentation size.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
2020-10-26 10:52:37 +09:00
Joe Perches 33def8498f treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.

Remove the quote operator # from compiler_attributes.h __section macro.

Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.

Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-25 14:51:49 -07:00
Linus Torvalds f9a705ad1c ARM:
- New page table code for both hypervisor and guest stage-2
 - Introduction of a new EL2-private host context
 - Allow EL2 to have its own private per-CPU variables
 - Support of PMU event filtering
 - Complete rework of the Spectre mitigation
 
 PPC:
 - Fix for running nested guests with in-kernel IRQ chip
 - Fix race condition causing occasional host hard lockup
 - Minor cleanups and bugfixes
 
 x86:
 - allow trapping unknown MSRs to userspace
 - allow userspace to force #GP on specific MSRs
 - INVPCID support on AMD
 - nested AMD cleanup, on demand allocation of nested SVM state
 - hide PV MSRs and hypercalls for features not enabled in CPUID
 - new test for MSR_IA32_TSC writes from host and guest
 - cleanups: MMU, CPUID, shared MSRs
 - LAPIC latency optimizations ad bugfixes
 
 For x86, also included in this pull request is a new alternative and
 (in the future) more scalable implementation of extended page tables
 that does not need a reverse map from guest physical addresses to
 host physical addresses.  For now it is disabled by default because
 it is still lacking a few of the existing MMU's bells and whistles.
 However it is a very solid piece of work and it is already available
 for people to hammer on it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
 Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
 tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
 Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
 STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
 qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
 =AD6a
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable
  implementation of extended page tables that does not need a reverse
  map from guest physical addresses to host physical addresses.

  For now it is disabled by default because it is still lacking a few of
  the existing MMU's bells and whistles. However it is a very solid
  piece of work and it is already available for people to hammer on it.

  Other updates:

  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation

  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes

  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations ad bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 11:17:56 -07:00
Linus Torvalds 96485e4462 The siginificant new ext4 feature this time around is Harshad's new
fast_commit mode.  In addition, thanks to Mauricio for fixing a race
 where mmap'ed pages that are being changed in parallel with a
 data=journal transaction commit could result in bad checksums in the
 failure that could cause journal replays to fail.  Also notable is
 Ritesh's buffered write optimization which can result in significant
 improvements on parallel write workloads.  (The kernel test robot
 reported a 330.6% improvement on fio.write_iops on a 96 core system
 using DAX[1].)
 
 Besides that, we have the usual miscellaneous cleanups and bug fixes.
 
 [1] https://lore.kernel.org/r/20200925071217.GO28663@shao2-debian
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAl+RuCQACgkQ8vlZVpUN
 gaNebgf/dUnQp5SG2/2zczSDqr+f8DOiuAdn9I54BAr2HwdkMbbiktKfenfpu41k
 SMGNV6rYSs248dWFtkzM7C2T1dpGrdAe2OCYrU6HPR/xoZlx/RcDz39u7nXBDeup
 NV7RnPgIzCAGZXCOY/Zu1k88T1eosLRTIWvIcNOspt75MC0vJ8GSmkx1bVEUsv8w
 Uq6T0OREfDiLJpEZxtfbl3o+8Rfs82t3Soj4pwN8ESL/RWBTT8PlwAGhIcdjnHy/
 lsgT35IrY4OL6Eas9msUmFYrWhO6cW21kWOugYALQXZ3ny4A+r5nZZcY/wCq01NX
 J2Z02ZiMTZUmFFREbtc0eJukXWEVvA==
 =14K9
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "The siginificant new ext4 feature this time around is Harshad's new
  fast_commit mode.

  In addition, thanks to Mauricio for fixing a race where mmap'ed pages
  that are being changed in parallel with a data=journal transaction
  commit could result in bad checksums in the failure that could cause
  journal replays to fail.

  Also notable is Ritesh's buffered write optimization which can result
  in significant improvements on parallel write workloads. (The kernel
  test robot reported a 330.6% improvement on fio.write_iops on a 96
  core system using DAX)

  Besides that, we have the usual miscellaneous cleanups and bug fixes"

Link: https://lore.kernel.org/r/20200925071217.GO28663@shao2-debian

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits)
  ext4: fix invalid inode checksum
  ext4: add fast commit stats in procfs
  ext4: add a mount opt to forcefully turn fast commits on
  ext4: fast commit recovery path
  jbd2: fast commit recovery path
  ext4: main fast-commit commit path
  jbd2: add fast commit machinery
  ext4 / jbd2: add fast commit initialization
  ext4: add fast_commit feature and handling for extended mount options
  doc: update ext4 and journalling docs to include fast commit feature
  ext4: Detect already used quota file early
  jbd2: avoid transaction reuse after reformatting
  ext4: use the normal helper to get the actual inode
  ext4: fix bs < ps issue reported with dioread_nolock mount opt
  ext4: data=journal: write-protect pages on j_submit_inode_data_buffers()
  ext4: data=journal: fixes for ext4_page_mkwrite()
  jbd2, ext4, ocfs2: introduce/use journal callbacks j_submit|finish_inode_data_buffers()
  jbd2: introduce/export functions jbd2_journal_submit|finish_inode_data_buffers()
  ext4: introduce ext4_sb_bread_unmovable() to replace sb_bread_unmovable()
  ext4: use ext4_sb_bread() instead of sb_bread()
  ...
2020-10-22 10:31:08 -07:00
Harshad Shirwadkar 8016e29f43 ext4: fast commit recovery path
This patch adds fast commit recovery path support for Ext4 file
system. We add several helper functions that are similar in spirit to
e2fsprogs journal recovery path handlers. Example of such functions
include - a simple block allocator, idempotent block bitmap update
function etc. Using these routines and the fast commit log in the fast
commit area, the recovery path (ext4_fc_replay()) performs fast commit
log recovery.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201015203802.3597742-8-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-21 23:22:38 -04:00
Harshad Shirwadkar aa75f4d3da ext4: main fast-commit commit path
This patch adds main fast commit commit path handlers. The overall
patch can be divided into two inter-related parts:

(A) Metadata updates tracking

    This part consists of helper functions to track changes that need
    to be committed during a commit operation. These updates are
    maintained by Ext4 in different in-memory queues. Following are
    the APIs and their short description that are implemented in this
    patch:

    - ext4_fc_track_link/unlink/creat() - Track unlink. link and creat
      operations
    - ext4_fc_track_range() - Track changed logical block offsets
      inodes
    - ext4_fc_track_inode() - Track inodes
    - ext4_fc_mark_ineligible() - Mark file system fast commit
      ineligible()
    - ext4_fc_start_update() / ext4_fc_stop_update() /
      ext4_fc_start_ineligible() / ext4_fc_stop_ineligible() These
      functions are useful for co-ordinating inode updates with
      commits.

(B) Main commit Path

    This part consists of functions to convert updates tracked in
    in-memory data structures into on-disk commits. Function
    ext4_fc_commit() is the main entry point to commit path.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20201015203802.3597742-6-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-10-21 23:22:37 -04:00
Linus Torvalds 59f0e7eb2f NFS Client Updates for Linux 5.10
- Stable Fixes:
   - Wait for stateid updates after CLOSE/OPEN_DOWNGRADE # v5.4+
   - Fix nfs_path in case of a rename retry
   - Support EXCHID4_FLAG_SUPP_FENCE_OPS v4.2 EXCHANGE_ID flag
 
 - New features and improvements:
   - Replace dprintk() calls with tracepoints
   - Make cache consistency bitmap dynamic
   - Added support for the NFS v4.2 READ_PLUS operation
   - Improvements to net namespace uniquifier
 
 - Other bugfixes and cleanups
   - Remove redundant clnt pointer
   - Don't update timeout values on connection resets
   - Remove redundant tracepoints
   - Various cleanups to comments
   - Fix oops when trying to use copy_file_range with v4.0 source server
   - Improvements to flexfiles mirrors
   - Add missing "local_lock=posix" mount option
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl+PJv0ACgkQ18tUv7Cl
 QOtB+hAAza4BwQSAZs0QbszPcoNV0f0qND99hWcJ2ad5KCr8S5eNYaMqB8G8lbS+
 Ahuv6xRy69l2vjHvINXoSaSiXClYNAlAr5myiW0DqMLDpVV8VEMAhqgFJK97VpqQ
 AsbsdFwC4xAcQ+6nJ+IfK9f8AOYhvARkOP3171qkq0jX3Eq4j1IiQARn+JrGFZFL
 waC4UtVhCnx5z5pg7jyu7Ar4N0Xky72+Fm1EvRog4gndX2JbNNSkwavD33mWZ8iN
 3q6DRq/AtEk9F4ttPwVeALvpNCBqjoGcNw5FCBil7x4V+xIbDI2FBhusKas3GzXm
 W2tyUuJMTGpA6ZjkyYDcg0jC8vKUiUpMWgT3oZoQ/6g7P4vPzB/XB/RQcAInMRjS
 /MhWZ3YNQmApnVCIZSwOa9+oNUc69dM0+vqmesLvvb/aNjzRnGwiJNiuWGyrwoGX
 3SzpPUcJixa3+eKLi9uEHojKB+SuPIrB/HW4UYh/ZpOMyilVqUnX8RYbaK3qQ/gK
 Un++LjTmao9cXyjIDCiHJB630W01YSWXq1nwuSYkLuXCuALmmtgA4z8pm+4K+w+G
 SctETKVcD0qi3Yx5mFdU5A1dHXMB7nzYNaiRWYLvLO5qAR33nUPI4lV8rog2TUPe
 2iW4n0j2YxnTlwR3QMPYJRndP7KeR5XvOoNuh3XpKRVbmbc7/A4=
 =6RM5
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.10-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
 "Stable Fixes:
   - Wait for stateid updates after CLOSE/OPEN_DOWNGRADE # v5.4+
   - Fix nfs_path in case of a rename retry
   - Support EXCHID4_FLAG_SUPP_FENCE_OPS v4.2 EXCHANGE_ID flag

  New features and improvements:
   - Replace dprintk() calls with tracepoints
   - Make cache consistency bitmap dynamic
   - Added support for the NFS v4.2 READ_PLUS operation
   - Improvements to net namespace uniquifier

  Other bugfixes and cleanups:
   - Remove redundant clnt pointer
   - Don't update timeout values on connection resets
   - Remove redundant tracepoints
   - Various cleanups to comments
   - Fix oops when trying to use copy_file_range with v4.0 source server
   - Improvements to flexfiles mirrors
   - Add missing 'local_lock=posix' mount option"

* tag 'nfs-for-5.10-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (55 commits)
  NFSv4.2: support EXCHGID4_FLAG_SUPP_FENCE_OPS 4.2 EXCHANGE_ID flag
  NFSv4: Fix up RCU annotations for struct nfs_netns_client
  NFS: Only reference user namespace from nfs4idmap struct instead of cred
  nfs: add missing "posix" local_lock constant table definition
  NFSv4: Use the net namespace uniquifier if it is set
  NFSv4: Clean up initialisation of uniquified client id strings
  NFS: Decode a full READ_PLUS reply
  SUNRPC: Add an xdr_align_data() function
  NFS: Add READ_PLUS hole segment decoding
  SUNRPC: Add the ability to expand holes in data pages
  SUNRPC: Split out _shift_data_right_tail()
  SUNRPC: Split out xdr_realign_pages() from xdr_align_pages()
  NFS: Add READ_PLUS data segment support
  NFS: Use xdr_page_pos() in NFSv4 decode_getacl()
  SUNRPC: Implement a xdr_page_pos() function
  SUNRPC: Split out a function for setting current page
  NFS: fix nfs_path in case of a rename retry
  fs: nfs: return per memcg count for xattr shrinkers
  NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE
  nfs: remove incorrect fallthrough label
  ...
2020-10-20 13:26:30 -07:00
Linus Torvalds 41eea65e2a Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU changes from Ingo Molnar:

 - Debugging for smp_call_function()

 - RT raw/non-raw lock ordering fixes

 - Strict grace periods for KASAN

 - New smp_call_function() torture test

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

[ This doesn't actually pull the tag - I've dropped the last merge from
  the RCU branch due to questions about the series.   - Linus ]

* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  smp: Make symbol 'csd_bug_count' static
  kernel/smp: Provide CSD lock timeout diagnostics
  smp: Add source and destination CPUs to __call_single_data
  rcu: Shrink each possible cpu krcp
  rcu/segcblist: Prevent useless GP start if no CBs to accelerate
  torture: Add gdb support
  rcutorture: Allow pointer leaks to test diagnostic code
  rcutorture: Hoist OOM registry up one level
  refperf: Avoid null pointer dereference when buf fails to allocate
  rcutorture: Properly synchronize with OOM notifier
  rcutorture: Properly set rcu_fwds for OOM handling
  torture: Add kvm.sh --help and update help message
  rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
  torture: Update initrd documentation
  rcutorture: Replace HTTP links with HTTPS ones
  locktorture: Make function torture_percpu_rwsem_init() static
  torture: document --allcpus argument added to the kvm.sh script
  rcutorture: Output number of elapsed grace periods
  rcutorture: Remove KCSAN stubs
  rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
  ...
2020-10-18 14:34:50 -07:00
Linus Torvalds a1e16bc7d5 RDMA 5.10 pull request
The typical set of driver updates across the subsystem:
 
  - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma, hns,
    usnic, qib, qedr, cxgb4, hns, bnxt_re
 
  - Various rtrs fixes and updates
 
  - Bug fix for mlx4 CM emulation for virtualization scenarios where MRA
    wasn't working right
 
  - Use tracepoints instead of pr_debug in the CM code
 
  - Scrub the locking in ucma and cma to close more syzkaller bugs
 
  - Use tasklet_setup in the subsystem
 
  - Revert the idea that 'destroy' operations are not allowed to fail at
    the driver level. This proved unworkable from a HW perspective.
 
  - Revise how the umem API works so drivers make fewer mistakes using it
 
  - XRC support for qedr
 
  - Convert uverbs objects RWQ and MW to new the allocation scheme
 
  - Large queue entry sizes for hns
 
  - Use hmm_range_fault() for mlx5 On Demand Paging
 
  - uverbs APIs to inspect the GID table instead of sysfs
 
  - Move some of the RDMA code for building large page SGLs into
    lib/scatterlist
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl+J37MACgkQOG33FX4g
 mxrKfRAAnIecwdE8df0yvVU5k0Eg6qVjMy9MMHq4va9m7g6GpUcNNI0nIlOASxH2
 l+9vnUQS3ebgsPeECaDYzEr0hh/u53+xw2g4WV5ts/hE8KkQ6erruXb9kasCe8yi
 5QWJ9K36T3c03Cd3EeH6JVtytAxuH42ombfo9BkFLPVyfG/R2tsAzvm5pVi73lxk
 46wtU1Bqi4tsLhyCbifn1huNFGbHp08OIBPAIKPUKCA+iBRPaWS+Dpi+93h3g3Bp
 oJwDhL9CBCGcHM+rKWLzek3Dy87FnQn7R1wmTpUFwkK+4AH3U/XazivhX035w1vL
 YJyhakVU0kosHlX9hJTNKDHJGkt0YEV2mS8dxAuqilFBtdnrVszb5/MirvlzC310
 /b5xCPSEusv9UVZV0G4zbySVNA9knZ4YaRiR3VDVMLKl/pJgTOwEiHIIx+vs3ejk
 p8GRWa1SjXw5LfZEQcq39J689ljt6xjCTonyuBSv7vSQq5v8pjBxvHxiAe2FIa2a
 ZyZeSCYoSh0SwJQukO2VO7aprhHP3TcCJ/987+X03LQ8tV2VWPktHqm62YCaDcOl
 fgiQuQdPivRjDDkJgMfDWDGKfZeHoWLKl5XsJhWByt0lablVrsvc+8ylUl1UI7gI
 16hWB/Qtlhfwg10VdApn+aOFpIS+s5P4XIp8ik57MZO+VeJzpmE=
 =LKpl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A usual cycle for RDMA with a typical mix of driver and core subsystem
  updates:

   - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
     hns, usnic, qib, qedr, cxgb4, hns, bnxt_re

   - Various rtrs fixes and updates

   - Bug fix for mlx4 CM emulation for virtualization scenarios where
     MRA wasn't working right

   - Use tracepoints instead of pr_debug in the CM code

   - Scrub the locking in ucma and cma to close more syzkaller bugs

   - Use tasklet_setup in the subsystem

   - Revert the idea that 'destroy' operations are not allowed to fail
     at the driver level. This proved unworkable from a HW perspective.

   - Revise how the umem API works so drivers make fewer mistakes using
     it

   - XRC support for qedr

   - Convert uverbs objects RWQ and MW to new the allocation scheme

   - Large queue entry sizes for hns

   - Use hmm_range_fault() for mlx5 On Demand Paging

   - uverbs APIs to inspect the GID table instead of sysfs

   - Move some of the RDMA code for building large page SGLs into
     lib/scatterlist"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
  RDMA/ucma: Fix use after free in destroy id flow
  RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
  RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
  RDMA: Explicitly pass in the dma_device to ib_register_device
  lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
  IB/mlx4: Convert rej_tmout radix-tree to XArray
  RDMA/rxe: Fix bug rejecting all multicast packets
  RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
  RDMA/rxe: Remove duplicate entries in struct rxe_mr
  IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
  IB/rdmavt: Fix sizeof mismatch
  MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
  RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
  RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Move to allocate SG table from pages
  lib/scatterlist: Add support in dynamic allocation of SG table from pages
  tools/testing/scatterlist: Show errors in human readable form
  tools/testing/scatterlist: Rejuvenate bit-rotten test
  RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
  RDMA/uverbs: Expose the new GID query API to user space
  ...
2020-10-17 11:18:18 -07:00
Linus Torvalds fad70111d5 afs fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl+JqvsACgkQ+7dXa6fL
 C2sQeRAAiW+rGQa46ybOv8qBLgVEiH6OzNUk62fesbhl5rKRgapS2r2SJbau29gv
 KqBld29lp4o84/THJ6hZ6Al1iwZnO2FW+h53Q47M5TH9AbwZhf/zooRSoGb5AmKJ
 /FR4+K/zTLk7VCSptTtXlKG81bOKQF+tmi6sLdxx70h2T+Ythm3zdVq8PCZZhhGG
 hw1IfuNDNbUDGus5WgZfIdVkdFCs5WW5cEhrPgqKR0mXYQklnkwgtov/+RAh/Ewf
 bZ1JFqap15KJ2AcKgw79NZ01MBJ4KyZckzKwgaTVlIEtCMQMBDhJpbKzdtP767rh
 xkxdzmDmXOop9RNQ8WrIIt6EjpB0We2qxQVGLsPHdEixmlv5n2BjL/wdWa3u/4QZ
 ymjv7B8jcFSpXQzVnrmeNYsgHo1oJ7G8q8Xh+CJm9BNBtNUiz1raQfxWpS5TeWT4
 dQsr/Z/PMCBCBOb6F08pjF8od67DUCx+x1nkCG4qFCeXiC4ouTNkfMZnxWH9mVKP
 v5Hca2E0ilR3PI5+tn8o/ZPul5NDTesBnAvAAEBQDT79Ff8kAgEJ3M3ipy38VoKV
 xFiNpNHwDPp15vzneJ7f7Ir27VauRlhCLEuSUtyYH4pD458K3UQAAQRFZB7bkzMX
 TQTbmEPUbtrs/mOEMKQN6ew8dwawRpMy01j0KnDl0KBFDtfTLU8=
 =0anM
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20201016' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull afs updates from David Howells:
 "A collection of fixes to fix afs_cell struct refcounting, thereby
  fixing a slew of related syzbot bugs:

   - Fix the cell tree in the netns to use an rwsem rather than RCU.

     There seem to be some problems deriving from the use of RCU and a
     seqlock to walk the rbtree, but it's not entirely clear what since
     there are several different failures being seen.

     Changing things to use an rwsem instead makes it more robust. The
     extra performance derived from using RCU isn't necessary in this
     case since the only time we're looking up a cell is during mount or
     when cells are being manually added.

   - Fix the refcounting by splitting the usage counter into a memory
     refcount and an active users counter. The usage counter was doing
     double duty, keeping track of whether a cell is still in use and
     keeping track of when it needs to be destroyed - but this makes the
     clean up tricky. Separating these out simplifies the logic.

   - Fix purging a cell that has an alias. A cell alias pins the cell
     it's an alias of, but the alias is always later in the list. Trying
     to purge in a single pass causes rmmod to hang in such a case.

   - Fix cell removal. If a cell's manager is requeued whilst it's
     removing itself, the manager will run again and re-remove itself,
     causing problems in various places. Follow Hillf Danton's
     suggestion to insert a more terminal state that causes the manager
     to do nothing post-removal.

  In additional to the above, two other changes:

   - Add a tracepoint for the cell refcount and active users count. This
     helped with debugging the above and may be useful again in future.

   - Downgrade an assertion to a print when a still-active server is
     seen during purging. This was happening as a consequence of
     incomplete cell removal before the servers were cleaned up"

* tag 'afs-fixes-20201016' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Don't assert on unpurgeable server records
  afs: Add tracing for cell refcount and active user count
  afs: Fix cell removal
  afs: Fix cell purging with aliases
  afs: Fix cell refcounting by splitting the usage counter
  afs: Fix rapid cell addition/removal by not using RCU on cells tree
2020-10-16 15:22:41 -07:00
Linus Torvalds 7a3dadedc8 f2fs-for-5.10-rc1
In this round, we've added new features such as zone capacity for ZNS and
 a new GC policy, ATGC, along with in-memory segment management. In addition,
 we could improve the decompression speed significantly by changing virtual
 mapping method. Even though we've fixed lots of small bugs in compression
 support, I feel that it becomes more stable so that I could give it a try in
 production.
 
 Enhancement:
  - suport zone capacity in NVMe Zoned Namespace devices
  - introduce in-memory current segment management
  - add standart casefolding support
  - support age threshold based garbage collection
  - improve decompression speed by changing virtual mapping method
 
 Bug fix:
  - fix condition checks in some ioctl() such as compression, move_range, etc
  - fix 32/64bits support in data structures
  - fix memory allocation in zstd decompress
  - add some boundary checks to avoid kernel panic on corrupted image
  - fix disallowing compression for non-empty file
  - fix slab leakage of compressed block writes
 
 In addition, it includes code refactoring for better readability and minor
 bug fixes for compression and zoned device support.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl+I0nQACgkQQBSofoJI
 UNIJdw//Rj0YapYXSu1nlOQppzSAkCiL6wxrrG2qkisRE7uXSYGZfWBqrMT6Asnf
 i0f25i6ywZOZ00dgN/klZRBh4YYSgJqYx9BPkTxZsQZ/S/EmZJPpr8m4VUB69LKL
 VijwUdgcW9vNDJ2/DkDQDVBd/ZqRxXnltffWtP4pS96Gj089/dE2q8KXqQrt3LM1
 lLQjDfHj+0AyWRzKpErTO0W9DOgO7wmmelS0h6m2RYttkbb328JEZezg5bjWNNlk
 eTXxuAFFy8Ap9DngkC/sqvY2NRTv1YgOPfrT8XWwdDIiFTZ+LoYdFI5Ap/UW7QwG
 iz7B/0wPj3+9ncl536LRbFPiLisbYrArYGmZKF6t8w1cP6mTVlileMT4Q6s9+qhn
 GJS88tTBVlR9vbzDu2brjI6qRQVTBdsohIGoA1g6lz0ogbphhmTzujPbFQ6GTSBi
 3sKKp59urkBpVH3TVJU1oshLjIEG2yToMgYwZH9DU7zlzJS6XpetJrzReqItEThc
 VNixg2DxdIFQ+nrMt+LtWaOHs5qzxIIPksguGEhqSkLL5lI75n2MZxrhKXmUsaZa
 qItJE0ndJFfi6vggIkJID+a0bpTss7+AxF1AmSZDafMLkZy8j14DNQAnmBUYRX0J
 5QExZ+LyyaKjWQ/k9SsSzV1Y3dbguDyB+gkeMhr/6XEr9DSwZc8=
 =exgv
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added new features such as zone capacity for ZNS
  and a new GC policy, ATGC, along with in-memory segment management. In
  addition, we could improve the decompression speed significantly by
  changing virtual mapping method. Even though we've fixed lots of small
  bugs in compression support, I feel that it becomes more stable so
  that I could give it a try in production.

  Enhancements:
   - suport zone capacity in NVMe Zoned Namespace devices
   - introduce in-memory current segment management
   - add standart casefolding support
   - support age threshold based garbage collection
   - improve decompression speed by changing virtual mapping method

  Bug fixes:
   - fix condition checks in some ioctl() such as compression, move_range, etc
   - fix 32/64bits support in data structures
   - fix memory allocation in zstd decompress
   - add some boundary checks to avoid kernel panic on corrupted image
   - fix disallowing compression for non-empty file
   - fix slab leakage of compressed block writes

  In addition, it includes code refactoring for better readability and
  minor bug fixes for compression and zoned device support"

* tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
  f2fs: code cleanup by removing unnecessary check
  f2fs: wait for sysfs kobject removal before freeing f2fs_sb_info
  f2fs: fix writecount false positive in releasing compress blocks
  f2fs: introduce check_swap_activate_fast()
  f2fs: don't issue flush in f2fs_flush_device_cache() for nobarrier case
  f2fs: handle errors of f2fs_get_meta_page_nofail
  f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
  f2fs: reject CASEFOLD inode flag without casefold feature
  f2fs: fix memory alignment to support 32bit
  f2fs: fix slab leak of rpages pointer
  f2fs: compress: fix to disallow enabling compress on non-empty file
  f2fs: compress: introduce cic/dic slab cache
  f2fs: compress: introduce page array slab cache
  f2fs: fix to do sanity check on segment/section count
  f2fs: fix to check segment boundary during SIT page readahead
  f2fs: fix uninit-value in f2fs_lookup
  f2fs: remove unneeded parameter in find_in_block()
  f2fs: fix wrong total_sections check and fsmeta check
  f2fs: remove duplicated code in sanity_check_area_boundary
  f2fs: remove unused check on version_bitmap
  ...
2020-10-16 15:14:43 -07:00
David Howells 7530d3eb3d afs: Don't assert on unpurgeable server records
Don't give an assertion failure on unpurgeable afs_server records - which
kills the thread - but rather emit a trace line when we are purging a
record (which only happens during network namespace removal or rmmod) and
print a notice of the problem.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-16 14:39:34 +01:00
David Howells dca54a7bbb afs: Add tracing for cell refcount and active user count
Add a tracepoint to log the cell refcount and active user count and pass in
a reason code through various functions that manipulate these counters.

Additionally, a helper function, afs_see_cell(), is provided to log
interesting places that deal with a cell without actually doing any
accounting directly.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-16 14:39:21 +01:00
Linus Torvalds 9ff9b0d392 networking changes for the 5.10 merge window
Add redirect_neigh() BPF packet redirect helper, allowing to limit stack
 traversal in common container configs and improving TCP back-pressure.
 Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
 
 Expand netlink policy support and improve policy export to user space.
 (Ge)netlink core performs request validation according to declared
 policies. Expand the expressiveness of those policies (min/max length
 and bitmasks). Allow dumping policies for particular commands.
 This is used for feature discovery by user space (instead of kernel
 version parsing or trial and error).
 
 Support IGMPv3/MLDv2 multicast listener discovery protocols in bridge.
 
 Allow more than 255 IPv4 multicast interfaces.
 
 Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
 packets of TCPv6.
 
 In Multi-patch TCP (MPTCP) support concurrent transmission of data
 on multiple subflows in a load balancing scenario. Enhance advertising
 addresses via the RM_ADDR/ADD_ADDR options.
 
 Support SMC-Dv2 version of SMC, which enables multi-subnet deployments.
 
 Allow more calls to same peer in RxRPC.
 
 Support two new Controller Area Network (CAN) protocols -
 CAN-FD and ISO 15765-2:2016.
 
 Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
 kernel problem.
 
 Add TC actions for implementing MPLS L2 VPNs.
 
 Improve nexthop code - e.g. handle various corner cases when nexthop
 objects are removed from groups better, skip unnecessary notifications
 and make it easier to offload nexthops into HW by converting
 to a blocking notifier.
 
 Support adding and consuming TCP header options by BPF programs,
 opening the doors for easy experimental and deployment-specific
 TCP option use.
 
 Reorganize TCP congestion control (CC) initialization to simplify life
 of TCP CC implemented in BPF.
 
 Add support for shipping BPF programs with the kernel and loading them
 early on boot via the User Mode Driver mechanism, hence reusing all the
 user space infra we have.
 
 Support sleepable BPF programs, initially targeting LSM and tracing.
 
 Add bpf_d_path() helper for returning full path for given 'struct path'.
 
 Make bpf_tail_call compatible with bpf-to-bpf calls.
 
 Allow BPF programs to call map_update_elem on sockmaps.
 
 Add BPF Type Format (BTF) support for type and enum discovery, as
 well as support for using BTF within the kernel itself (current use
 is for pretty printing structures).
 
 Support listing and getting information about bpf_links via the bpf
 syscall.
 
 Enhance kernel interfaces around NIC firmware update. Allow specifying
 overwrite mask to control if settings etc. are reset during update;
 report expected max time operation may take to users; support firmware
 activation without machine reboot incl. limits of how much impact
 reset may have (e.g. dropping link or not).
 
 Extend ethtool configuration interface to report IEEE-standard
 counters, to limit the need for per-vendor logic in user space.
 
 Adopt or extend devlink use for debug, monitoring, fw update
 in many drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw,
 mv88e6xxx, dpaa2-eth).
 
 In mlxsw expose critical and emergency SFP module temperature alarms.
 Refactor port buffer handling to make the defaults more suitable and
 support setting these values explicitly via the DCBNL interface.
 
 Add XDP support for Intel's igb driver.
 
 Support offloading TC flower classification and filtering rules to
 mscc_ocelot switches.
 
 Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
 fixed interval period pulse generator and one-step timestamping in
 dpaa-eth.
 
 Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
 offload.
 
 Add Lynx PHY/PCS MDIO module, and convert various drivers which have
 this HW to use it. Convert mvpp2 to split PCS.
 
 Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
 7-port Mediatek MT7531 IP.
 
 Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
 and wcn3680 support in wcn36xx.
 
 Improve performance for packets which don't require much offloads
 on recent Mellanox NICs by 20% by making multiple packets share
 a descriptor entry.
 
 Move chelsio inline crypto drivers (for TLS and IPsec) from the crypto
 subtree to drivers/net. Move MDIO drivers out of the phy directory.
 
 Clean up a lot of W=1 warnings, reportedly the actively developed
 subsections of networking drivers should now build W=1 warning free.
 
 Make sure drivers don't use in_interrupt() to dynamically adapt their
 code. Convert tasklets to use new tasklet_setup API (sadly this
 conversion is not yet complete).
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAl+ItRwACgkQMUZtbf5S
 IrtTMg//UxpdR/MirT1DatBU0K/UGAZY82hV7F/UC8tPgjfHZeHvWlDFxfi3YP81
 PtPKbhRZ7DhwBXefUp6nY3UdvjftrJK2lJm8prJUPSsZRye8Wlcb7y65q7/P2y2U
 Efucyopg6RUrmrM0DUsIGYGJgylQLHnMYUl/keCsD4t5Bp4ksyi9R2t5eitGoWzh
 r3QGdbSa0AuWx4iu0i+tqp6Tj0ekMBMXLVb35dtU1t0joj2KTNEnSgABN3prOa8E
 iWYf2erOau68Ogp3yU3miCy0ZU4p/7qGHTtzbcp677692P/ekak6+zmfHLT9/Pjy
 2Stq2z6GoKuVxdktr91D9pA3jxG4LxSJmr0TImcGnXbvkMP3Ez3g9RrpV5fn8j6F
 mZCH8TKZAoD5aJrAJAMkhZmLYE1pvDa7KolSk8WogXrbCnTEb5Nv8FHTS1Qnk3yl
 wSKXuvutFVNLMEHCnWQLtODbTST9DI/aOi6EctPpuOA/ZyL1v3pl+gfp37S+LUTe
 owMnT/7TdvKaTD0+gIyU53M6rAWTtr5YyRQorX9awIu/4Ha0F0gYD7BJZQUGtegp
 HzKt59NiSrFdbSH7UdyemdBF4LuCgIhS7rgfeoUXMXmuPHq7eHXyHZt5dzPPa/xP
 81P0MAvdpFVwg8ij2yp2sHS7sISIRKq17fd1tIewUabxQbjXqPc=
 =bc1U
 -----END PGP SIGNATURE-----

Merge tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:

 - Add redirect_neigh() BPF packet redirect helper, allowing to limit
   stack traversal in common container configs and improving TCP
   back-pressure.

   Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.

 - Expand netlink policy support and improve policy export to user
   space. (Ge)netlink core performs request validation according to
   declared policies. Expand the expressiveness of those policies
   (min/max length and bitmasks). Allow dumping policies for particular
   commands. This is used for feature discovery by user space (instead
   of kernel version parsing or trial and error).

 - Support IGMPv3/MLDv2 multicast listener discovery protocols in
   bridge.

 - Allow more than 255 IPv4 multicast interfaces.

 - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
   packets of TCPv6.

 - In Multi-patch TCP (MPTCP) support concurrent transmission of data on
   multiple subflows in a load balancing scenario. Enhance advertising
   addresses via the RM_ADDR/ADD_ADDR options.

 - Support SMC-Dv2 version of SMC, which enables multi-subnet
   deployments.

 - Allow more calls to same peer in RxRPC.

 - Support two new Controller Area Network (CAN) protocols - CAN-FD and
   ISO 15765-2:2016.

 - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
   kernel problem.

 - Add TC actions for implementing MPLS L2 VPNs.

 - Improve nexthop code - e.g. handle various corner cases when nexthop
   objects are removed from groups better, skip unnecessary
   notifications and make it easier to offload nexthops into HW by
   converting to a blocking notifier.

 - Support adding and consuming TCP header options by BPF programs,
   opening the doors for easy experimental and deployment-specific TCP
   option use.

 - Reorganize TCP congestion control (CC) initialization to simplify
   life of TCP CC implemented in BPF.

 - Add support for shipping BPF programs with the kernel and loading
   them early on boot via the User Mode Driver mechanism, hence reusing
   all the user space infra we have.

 - Support sleepable BPF programs, initially targeting LSM and tracing.

 - Add bpf_d_path() helper for returning full path for given 'struct
   path'.

 - Make bpf_tail_call compatible with bpf-to-bpf calls.

 - Allow BPF programs to call map_update_elem on sockmaps.

 - Add BPF Type Format (BTF) support for type and enum discovery, as
   well as support for using BTF within the kernel itself (current use
   is for pretty printing structures).

 - Support listing and getting information about bpf_links via the bpf
   syscall.

 - Enhance kernel interfaces around NIC firmware update. Allow
   specifying overwrite mask to control if settings etc. are reset
   during update; report expected max time operation may take to users;
   support firmware activation without machine reboot incl. limits of
   how much impact reset may have (e.g. dropping link or not).

 - Extend ethtool configuration interface to report IEEE-standard
   counters, to limit the need for per-vendor logic in user space.

 - Adopt or extend devlink use for debug, monitoring, fw update in many
   drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
   dpaa2-eth).

 - In mlxsw expose critical and emergency SFP module temperature alarms.
   Refactor port buffer handling to make the defaults more suitable and
   support setting these values explicitly via the DCBNL interface.

 - Add XDP support for Intel's igb driver.

 - Support offloading TC flower classification and filtering rules to
   mscc_ocelot switches.

 - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
   fixed interval period pulse generator and one-step timestamping in
   dpaa-eth.

 - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
   offload.

 - Add Lynx PHY/PCS MDIO module, and convert various drivers which have
   this HW to use it. Convert mvpp2 to split PCS.

 - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
   7-port Mediatek MT7531 IP.

 - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
   and wcn3680 support in wcn36xx.

 - Improve performance for packets which don't require much offloads on
   recent Mellanox NICs by 20% by making multiple packets share a
   descriptor entry.

 - Move chelsio inline crypto drivers (for TLS and IPsec) from the
   crypto subtree to drivers/net. Move MDIO drivers out of the phy
   directory.

 - Clean up a lot of W=1 warnings, reportedly the actively developed
   subsections of networking drivers should now build W=1 warning free.

 - Make sure drivers don't use in_interrupt() to dynamically adapt their
   code. Convert tasklets to use new tasklet_setup API (sadly this
   conversion is not yet complete).

* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
  Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
  net, sockmap: Don't call bpf_prog_put() on NULL pointer
  bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
  bpf, sockmap: Add locking annotations to iterator
  netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
  net: fix pos incrementment in ipv6_route_seq_next
  net/smc: fix invalid return code in smcd_new_buf_create()
  net/smc: fix valid DMBE buffer sizes
  net/smc: fix use-after-free of delayed events
  bpfilter: Fix build error with CONFIG_BPFILTER_UMH
  cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
  net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
  bpf: Fix register equivalence tracking.
  rxrpc: Fix loss of final ack on shutdown
  rxrpc: Fix bundle counting for exclusive connections
  netfilter: restore NF_INET_NUMHOOKS
  ibmveth: Identify ingress large send packets.
  ibmveth: Switch order of ibmveth_helper calls.
  cxgb4: handle 4-tuple PEDIT to NAT mode translation
  selftests: Add VRF route leaking tests
  ...
2020-10-15 18:42:13 -07:00
Linus Torvalds c48b75b727 sound updates for 5.10
The amount of changes is smaller at this round (what a surprise),
 but lots of activity is seen.  Most of changes are about ASoC
 driver development, especially Intel platforms.
 Here are some highlights:
 
 General:
 * Replace all tasklet usages with other alternatives
 * Cleanup of the ASoC error unwinding code
 * Fixes for trivial issues caught by static checker
 * Spell fixes allover the places
 
 ALSA Core:
 * Lockdep fix for control devices
 * Fix for potential OSS sequencer mutex stalls
 
 HD-audio and USB-audio:
 * SoundBlaster AE-7 support
 * Changes in quirk table for the rename handling
 * Quirks for HP and ASUS machines, Pioneer DJ DJM-250MK2.
 
 ASoC:
 * Lots of updates for Intel SOF and SoundWire enablement
 * Replacement of the DSP driver for some older x86 systems;
   the new code was written from scratch, better maintenance
   expected
 * Helpers for parsing auxiluary devices from the device tree
 * New support for AllWinner A64, Cirrus Logic CS4234, Mediatek
   MT6359 Microchip S/PDIF TX and RX controllers, Realtek RT1015P,
   and Texas Instruments J721E, TAS2110, TAS2564 and TAS2764
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAl+HHD4OHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE9eAw//Wgs9LfQE3rBcsGVNTHimW2cPzbdHVK1eth6N
 pFT6rdEG2N+ALR0ESA26CSBniJocqxNvXYzaYT0fy+7tS/chOjhkfr6SttYPDmwc
 q2u1SQIqdx41Q0DVUXYxSLVExjT4Rx96qeibLy5pi8DsbL0DOVa7PkVDl1XHXNJ0
 iSZwA18gCRdezpoOCD+UF8EBplULjYfPp0xstqjaQzTCpJQ5C1xpbZdHWfhTWsKo
 H98d4GL4yUUbJb5/Wi7uqiUGhPIxgBUMVkaY+uRifeNA/MGD5rUZQaf8ft6uQFUL
 D5RCUksJiQfyrj++g9/mzOWVRCFZ6MvaAmEW4xwlPvTsP2uIVIqS5RH8Z2BhwjXr
 J8/4gPuCtoEKbfsOOCOG9MlGsquf9LBeiH5KZ7gqb7ilu4tICR2zXtBr6U7e64Wd
 LsPROQnr/+lxIlEJjlhiarf1jXMfo4glxuoLsDcIH+Baf0lTiMNoBVIZTUdJ0urq
 Srh++Bk/WGvoVJe1PHp7IfhZCoBACozPXq7EifbnCsUM+cVtQtjWrydyi8k/Yona
 5EfS5wQdEH6JvQirkmGJm8kNMu+e3hW2HzoJqV2Z2DUMMnCSra62KD0wPA/wRchu
 mkC47875a+jgo58fq4bX9hzGi2CrE/TMYdii6I2bbAm/Mp7czXZfO0LOTWDc4Bs5
 T8qt+HI=
 =nWAp
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "The amount of changes is smaller at this round (what a surprise), but
  lots of activity is seen. Most of changes are about ASoC driver
  development, especially Intel platforms. Here are some highlights:

  General:
   - Replace all tasklet usages with other alternatives
   - Cleanup of the ASoC error unwinding code
   - Fixes for trivial issues caught by static checker
   - Spell fixes allover the places

  ALSA Core:
   - Lockdep fix for control devices
   - Fix for potential OSS sequencer mutex stalls

  HD-audio and USB-audio:
   - SoundBlaster AE-7 support
   - Changes in quirk table for the rename handling
   - Quirks for HP and ASUS machines, Pioneer DJ DJM-250MK2.

  ASoC:
   - Lots of updates for Intel SOF and SoundWire enablement
   - Replacement of the DSP driver for some older x86 systems; the new
     code was written from scratch, better maintenance expected
   - Helpers for parsing auxiluary devices from the device tree
   - New support for AllWinner A64, Cirrus Logic CS4234, Mediatek MT6359
     Microchip S/PDIF TX and RX controllers, Realtek RT1015P, and Texas
     Instruments J721E, TAS2110, TAS2564 and TAS2764"

* tag 'sound-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (498 commits)
  ALSA: hda/hdmi: fix incorrect locking in hdmi_pcm_close
  ALSA: hda: fix jack detection with Realtek codecs when in D3
  ALSA: fireworks: use semicolons rather than commas to separate statements
  ALSA: hda: use semicolons rather than commas to separate statements
  ALSA: hda/i915 - fix list corruption with concurrent probes
  ASoC: dmaengine: Document support for TX only or RX only streams
  ASoC: mchp-spdiftx: remove 'TX' from playback stream name
  ASoC: ti: davinci-mcasp: Use &pdev->dev for early dev_warn
  ASoC: tas2764: Add the driver for the TAS2764
  dt-bindings: tas2764: Add the TAS2764 binding doc
  ASoC: Intel: catpt: Add explicit DMADEVICES kconfig dependency
  ASoC: Intel: catpt: Fix compilation when CONFIG_MODULES is disabled
  ASoC: stm32: dfsdm: add actual resolution trace
  ASoC: stm32: dfsdm: change rate limits
  ASoC: qcom: sc7180: Add support for audio over DP
  Asoc: qcom: lpass-platform : Increase buffer size
  ASoC: qcom: Add support for lpass hdmi driver
  Asoc: qcom: lpass:Update lpaif_dmactl members order
  Asoc:qcom:lpass-cpu:Update dts property read API
  ASoC: dt-bindings: Add dt binding for lpass hdmi
  ...
2020-10-15 11:07:44 -07:00
Linus Torvalds 55e0500eb5 SCSI misc on 20201013
This series consists of the usual driver updates (ufs, qla2xxx, tcmu,
 ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug
 fixes.  There are only three core changes: adding sense codes,
 cleaning up noretry and adding an option for limitless retries.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX4YulyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaZDAQCT7rwG
 UEZYHgYkU9EX9ERVBQM0SW4mLrxf3g3P5ioJsAEAtkclCM4QsIOP+MIPjIa0EyUY
 khu0kcrmeFR2YwA8zhw=
 =4w4S
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi,
  hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes.

  There are only three core changes: adding sense codes, cleaning up
  noretry and adding an option for limitless retries"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits)
  scsi: hisi_sas: Recover PHY state according to the status before reset
  scsi: hisi_sas: Filter out new PHY up events during suspend
  scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
  scsi: hisi_sas: Add check for methods _PS0 and _PR0
  scsi: hisi_sas: Add controller runtime PM support for v3 hw
  scsi: hisi_sas: Switch to new framework to support suspend and resume
  scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
  scsi: qedf: Remove redundant assignment to variable 'rc'
  scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()
  scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro
  scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed
  scsi: sun_esp: Use module_platform_driver to simplify the code
  scsi: sun3x_esp: Use module_platform_driver to simplify the code
  scsi: sni_53c710: Use module_platform_driver to simplify the code
  scsi: qlogicpti: Use module_platform_driver to simplify the code
  scsi: mac_esp: Use module_platform_driver to simplify the code
  scsi: jazz_esp: Use module_platform_driver to simplify the code
  scsi: mvumi: Fix error return in mvumi_io_attach()
  scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req()
  scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()
  ...
2020-10-14 15:15:35 -07:00
Linus Torvalds 7b540812cc selinux/stable-5.10 PR 20201012
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl+E9UoUHHBhdWxAcGF1
 bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMG2BAApHLKLsfH5gf7gZNjHmQxddg8maCl
 BGt7K1xc9iYBZN56Cbc7v9uKc5pM+UOoOlVmWh+8jaROpX10jJmvhsebQzpcWEEs
 O/BDg/Y/AafoLr5e7gbAnlA7TJXNSR9MG9RB7c9xC14LG/bqBmkaUNsv8isWlLgl
 J2atHLsdlvCbmqJvnc6Fh3VJCbY/I0kt9L04GBQ4pEK3TKOxtORQaQcjVgLhlcw9
 YdMPKYIwy2Ze2HUuyW2o9OuryHhoMrwxpN/35/PAxrRwpO0LVnjjiw6njQqYVGH3
 el8mPXlhHah/7QUKcngSsvcvUcaSencp9sUBrp1vK9C1vkSFyubZweVi4A2TEWnh
 Ctceje7XP/YWDcJ+5BgASvosQdqOBB7huuOOKVpvaBXqgUHFgaxphV4/FDNnlF62
 AteX5RcWb/JiFJ4YnbknPNa/MWxVYuVn78AlNsM2ZponWYWs9JZ17lX4tHAKF1Qm
 x6ZMvMCDJTj8622l8nw3dTZKNDE3nFblDThX8aSrAhCQQE6HvugbKU4Fzo1oiSPl
 84PlCPgb+3tP3OsvZDIOPCJxC6IHgS+meA0IjhjwuCb+U+YWaAIeOlOPSkxUmfLu
 iJVWHmDtsAM3bTBxwQudhgXF3a1oKCEqeqNxM6P6p55jti7xal9FnZNHTbSh2sO1
 Km4oIqTEb1XWNdU=
 =NNLw
 -----END PGP SIGNATURE-----

Merge tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux

Pull selinux updates from Paul Moore:
 "A decent number of SELinux patches for v5.10, twenty two in total. The
  highlights are listed below, but all of the patches pass our test
  suite and merge cleanly.

   - A number of changes to how the SELinux policy is loaded and managed
     inside the kernel with the goal of improving the atomicity of a
     SELinux policy load operation.

     These changes account for the bulk of the diffstat as well as the
     patch count. A special thanks to everyone who contributed patches
     and fixes for this work.

   - Convert the SELinux policy read-write lock to RCU.

   - A tracepoint was added for audited SELinux access control events;
     this should help provide a more unified backtrace across kernel and
     userspace.

   - Allow the removal of security.selinux xattrs when a SELinux policy
     is not loaded.

   - Enable policy capabilities in SELinux policies created with the
     scripts/selinux/mdp tool.

   - Provide some "no sooner than" dates for the SELinux checkreqprot
     sysfs deprecation"

* tag 'selinux-pr-20201012' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (22 commits)
  selinux: provide a "no sooner than" date for the checkreqprot removal
  selinux: Add helper functions to get and set checkreqprot
  selinux: access policycaps with READ_ONCE/WRITE_ONCE
  selinux: simplify away security_policydb_len()
  selinux: move policy mutex to selinux_state, use in lockdep checks
  selinux: fix error handling bugs in security_load_policy()
  selinux: convert policy read-write lock to RCU
  selinux: delete repeated words in comments
  selinux: add basic filtering for audit trace events
  selinux: add tracepoint on audited events
  selinux: Create new booleans and class dirs out of tree
  selinux: Standardize string literal usage for selinuxfs directory names
  selinux: Refactor selinuxfs directory populating functions
  selinux: Create function for selinuxfs directory cleanup
  selinux: permit removing security.selinux xattr before policy load
  selinux: fix memdup.cocci warnings
  selinux: avoid dereferencing the policy prior to initialization
  selinux: fix allocation failure check on newpolicy->sidtab
  selinux: refactor changing booleans
  selinux: move policy commit after updating selinuxfs
  ...
2020-10-13 16:29:55 -07:00
Linus Torvalds 7cd4ecd917 drivers-5.10-2020-10-12
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+EYWYQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpsCgD/9Izy/mbiQMmcBPBuQFds2b2SwPAoB4RVcU
 NU7pcI3EbAlcj7xDF08Z74Sr6MKyg+JhGid15iw47o+qFq6cxDKiESYLIrFmb70R
 lUDkPr9J4OLNDSZ6hpM4sE6Qg9bzDPhRbAceDQRtVlqjuQdaOS2qZAjNG4qjO8by
 3PDO7XHCW+X4HhXiu2PDCKuwyDlHxggYzhBIFZNf58US2BU8+tLn2gvTSvmTb27F
 w0s5WU1Q5Q0W9RLrp4YTQi4SIIOq03BTSqpRjqhomIzhSQMieH95XNKGRitLjdap
 2mFNJ+5I+DTB/TW2BDBrBRXnoV/QNBJsR0DDFnUZsHEejjXKEVt5BRCpSQC9A0WW
 XUyVE1K+3GwgIxSI8tjPtyPEGzzhnqJjzHPq4LJLGlQje95v9JZ6bpODB7HHtZQt
 rbNp8IoVQ0n01nIvkkt/vnzCE9VFbWFFQiiu5/+x26iKZXW0pAF9Dnw46nFHoYZi
 llYvbKDcAUhSdZI8JuqnSnKhi7sLRNPnApBxs52mSX8qaE91sM2iRFDewYXzaaZG
 NjijYCcUtopUvojwxYZaLnIpnKWG4OZqGTNw1IdgzUtfdxoazpg6+4wAF9vo7FEP
 AePAUTKrfkGBm95uAP4bRvXBzS9UhXJvBrFW3grzRZybMj617F01yAR4N0xlMXeN
 jMLrGe7sWA==
 =xE9E
 -----END PGP SIGNATURE-----

Merge tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
 "Here are the driver updates for 5.10.

  A few SCSI updates in here too, in coordination with Martin as they
  depend on core block changes for the shared tag bitmap.

  This contains:

   - NVMe pull requests via Christoph:
      - fix keep alive timer modification (Amit Engel)
      - order the PCI ID list more sensibly (Andy Shevchenko)
      - cleanup the open by controller helper (Chaitanya Kulkarni)
      - use an xarray for the CSE log lookup (Chaitanya Kulkarni)
      - support ZNS in nvmet passthrough mode (Chaitanya Kulkarni)
      - fix nvme_ns_report_zones (Christoph Hellwig)
      - add a sanity check to nvmet-fc (James Smart)
      - fix interrupt allocation when too many polled queues are
        specified (Jeffle Xu)
      - small nvmet-tcp optimization (Mark Wunderlich)
      - fix a controller refcount leak on init failure (Chaitanya
        Kulkarni)
      - misc cleanups (Chaitanya Kulkarni)
      - major refactoring of the scanning code (Christoph Hellwig)

   - MD updates via Song:
      - Bug fixes in bitmap code, from Zhao Heming
      - Fix a work queue check, from Guoqing Jiang
      - Fix raid5 oops with reshape, from Song Liu
      - Clean up unused code, from Jason Yan
      - Discard improvements, from Xiao Ni
      - raid5/6 page offset support, from Yufen Yu

   - Shared tag bitmap for SCSI/hisi_sas/null_blk (John, Kashyap,
     Hannes)

   - null_blk open/active zone limit support (Niklas)

   - Set of bcache updates (Coly, Dongsheng, Qinglang)"

* tag 'drivers-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (78 commits)
  md/raid5: fix oops during stripe resizing
  md/bitmap: fix memory leak of temporary bitmap
  md: fix the checking of wrong work queue
  md/bitmap: md_bitmap_get_counter returns wrong blocks
  md/bitmap: md_bitmap_read_sb uses wrong bitmap blocks
  md/raid0: remove unused function is_io_in_chunk_boundary()
  nvme-core: remove extra condition for vwc
  nvme-core: remove extra variable
  nvme: remove nvme_identify_ns_list
  nvme: refactor nvme_validate_ns
  nvme: move nvme_validate_ns
  nvme: query namespace identifiers before adding the namespace
  nvme: revalidate zone bitmaps in nvme_update_ns_info
  nvme: remove nvme_update_formats
  nvme: update the known admin effects
  nvme: set the queue limits in nvme_update_ns_info
  nvme: remove the 0 lba_shift check in nvme_update_ns_info
  nvme: clean up the check for too large logic block sizes
  nvme: freeze the queue over ->lba_shift updates
  nvme: factor out a nvme_configure_metadata helper
  ...
2020-10-13 13:04:41 -07:00
Linus Torvalds 3ad11d7ac8 block-5.10-2020-10-12
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+EWUgQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpnoxEADCVSNBRkpV0OVkOEC3wf8EGhXhk01Jnjtl
 u5Mg2V55hcgJ0thQxBV/V28XyqmsEBrmAVi0Yf8Vr9Qbq4Ze08Wae4ChS4rEOyh1
 jTcGYWx5aJB3ChLvV/HI0nWQ3bkj03mMrL3SW8rhhf5DTyKHsVeTenpx42Qu/FKf
 fRzi09FSr3Pjd0B+EX6gunwJnlyXQC5Fa4AA0GhnXJzAznANXxHkkcXu8a6Yw75x
 e28CfhIBliORsK8sRHLoUnPpeTe1vtxCBhBMsE+gJAj9ZUOWMzvNFIPP4FvfawDy
 6cCQo2m1azJ/IdZZCDjFUWyjh+wxdKMp+NNryEcoV+VlqIoc3n98rFwrSL+GIq5Z
 WVwEwq+AcwoMCsD29Lu1ytL2PQ/RVqcJP5UheMrbL4vzefNfJFumQVZLIcX0k943
 8dFL2QHL+H/hM9Dx5y5rjeiWkAlq75v4xPKVjh/DHb4nehddCqn/+DD5HDhNANHf
 c1kmmEuYhvLpIaC4DHjE6DwLh8TPKahJjwsGuBOTr7D93NUQD+OOWsIhX6mNISIl
 FFhP8cd0/ZZVV//9j+q+5B4BaJsT+ZtwmrelKFnPdwPSnh+3iu8zPRRWO+8P8fRC
 YvddxuJAmE6BLmsAYrdz6Xb/wqfyV44cEiyivF0oBQfnhbtnXwDnkDWSfJD1bvCm
 ZwfpDh2+Tg==
 =LzyE
 -----END PGP SIGNATURE-----

Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:

 - Series of merge handling cleanups (Baolin, Christoph)

 - Series of blk-throttle fixes and cleanups (Baolin)

 - Series cleaning up BDI, seperating the block device from the
   backing_dev_info (Christoph)

 - Removal of bdget() as a generic API (Christoph)

 - Removal of blkdev_get() as a generic API (Christoph)

 - Cleanup of is-partition checks (Christoph)

 - Series reworking disk revalidation (Christoph)

 - Series cleaning up bio flags (Christoph)

 - bio crypt fixes (Eric)

 - IO stats inflight tweak (Gabriel)

 - blk-mq tags fixes (Hannes)

 - Buffer invalidation fixes (Jan)

 - Allow soft limits for zone append (Johannes)

 - Shared tag set improvements (John, Kashyap)

 - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

 - DM no-wait support (Mike, Konstantin)

 - Request allocation improvements (Ming)

 - Allow md/dm/bcache to use IO stat helpers (Song)

 - Series improving blk-iocost (Tejun)

 - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
   Xianting, Yang, Yufen, yangerkun)

* tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
  block: fix uapi blkzoned.h comments
  blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
  blk-mq: get rid of the dead flush handle code path
  block: get rid of unnecessary local variable
  block: fix comment and add lockdep assert
  blk-mq: use helper function to test hw stopped
  block: use helper function to test queue register
  block: remove redundant mq check
  block: invoke blk_mq_exit_sched no matter whether have .exit_sched
  percpu_ref: don't refer to ref->data if it isn't allocated
  block: ratelimit handle_bad_sector() message
  blk-throttle: Re-use the throtl_set_slice_end()
  blk-throttle: Open code __throtl_de/enqueue_tg()
  blk-throttle: Move service tree validation out of the throtl_rb_first()
  blk-throttle: Move the list operation after list validation
  blk-throttle: Fix IO hang for a corner case
  blk-throttle: Avoid tracking latency if low limit is invalid
  blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
  blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
  block: Remove redundant 'return' statement
  ...
2020-10-13 12:12:44 -07:00
Linus Torvalds 11e3235b43 for-5.10-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl+EuosACgkQxWXV+ddt
 WDtwLBAAiVqGGeLPwiy7R4/vMJKY9RYhf2WtCI7/AEtj6efiT1NU1Y5spw5QyJO6
 +zIvhRx/3zXXkz52poVS8kM0aIxH06a7p9SXBvTf6D5CIQf1QJZVZTjgJKVVaYkE
 0LsudJgstjowc02k/pN2uLQzfWPLIscatXTGbUKmz2QEf/1opef8EJy8asnbyCXP
 FTo5a0EN0B1Gq5GPBpdSOLmGyCA51W709EDR4uDr+iRIjI7yfsyGYNJiIEZsOJ8f
 tXLxwx2/CA7OWmlTwGJr180UnjVG/gUu9wm3x9QG5Fre8TwshLEllJaZ9DYLebQl
 JKs15pbaTJQa4R/AVZyD1JuVvV5tCdcfvNOffKR3iFNTeq72Kxoj7W+DQn4z5CgO
 PX3SnuoujWIY1hOhXQLXVXd3MTzRtZ21mn33tUFY2dfJq0U28LygkQg/QsYZxn9E
 WWF0d/ml1q9aF6B68n9ZH9jIOlfGZLKf4Dp65ND5nCpUe3k1H+93JZySQndmw+G7
 gfZMGr9319irM+MqRoxlP4ujL972axZhguF/YZxjPv47uB34n1Q1uLJApsjA/bQl
 7SSe1mNFcm4h+xXzwXTYum7yQ8i41ThPX1UAOmIz0geiGn3W6vBuaOX0n5vU9Ds3
 3RdBmijQmrnua1zaoCj5sr4XmU3I146iNOapfHyQdQhhENEicSc=
 =FNjY
 -----END PGP SIGNATURE-----

Merge tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "Mostly core updates with a few user visible bits and fixes.

  Hilights:

   - fsync performance improvements
      - less contention of log mutex (throughput +4%, latency -14%,
        dbench with 32 clients)
      - skip unnecessary commits for link and rename (throughput +6%,
        latency -30%, rename latency -75%, dbench with 16 clients)
      - make fast fsync wait only for writeback (throughput +10..40%,
        runtime -1..-20%, dbench with 1 to 64 clients on various
        file/block sizes)

   - direct io is now implemented using the iomap infrastructure, that's
     the main part, we still have a workaround that requires an iomap
     API update, coming in 5.10

   - new sysfs exports:
      - information about the exclusive filesystem operation status
        (balance, device add/remove/replace, ...)
      - supported send stream version

  Core:

   - use ticket space reservations for data, fair policy using the same
     infrastructure as metadata

   - preparatory work to switch locking from our custom tree locks to
     standard rwsem, now the locking context is propagated to all
     callers, actual switch is expected to happen in the next dev cycle

   - seed device structures are now using list API

   - extent tracepoints print proper tree id

   - unified range checks for extent buffer helpers

   - send: avoid using temporary buffer for copying data

   - remove unnecessary RCU protection from space infos

   - remove unused readpage callback for metadata, enabling several
     cleanups

   - replace indirect function calls for end io hooks and remove
     extent_io_ops completely

  Fixes:

   - more lockdep warning fixes

   - fix qgroup reservation for delayed inode and an occasional
     reservation leak for preallocated files

   - fix device replace of a seed device

   - fix metadata reservation for fallocate that leads to transaction
     aborts

   - reschedule if necessary when logging directory items or when
     cloning lots of extents

   - tree-checker: fix false alert caused by legacy btrfs root item

   - send: fix rename/link conflicts for orphanized inodes

   - properly initialize device stats for seed devices

   - skip devices without magic signature when mounting

  Other:

   - error handling improvements, BUG_ONs replaced by proper handling,
     fuzz fixes

   - various function parameter cleanups

   - various W=1 cleanups

   - error/info messages improved

  Mishaps:

   - commit 62cf539120 ("btrfs: move btrfs_rm_dev_replace_free_srcdev
     outside of all locks") is a rebase leftover after the patch got
     merged to 5.9-rc8 as a466c85edc ("btrfs: move
     btrfs_rm_dev_replace_free_srcdev outside of all locks"), the
     remaining part is trivial and the patch is in the middle of the
     series so I'm keeping it there instead of rebasing"

* tag 'for-5.10-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (161 commits)
  btrfs: rename BTRFS_INODE_ORDERED_DATA_CLOSE flag
  btrfs: annotate device name rcu_string with __rcu
  btrfs: skip devices without magic signature when mounting
  btrfs: cleanup cow block on error
  btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK
  fs: remove no longer used dio_end_io()
  btrfs: return error if we're unable to read device stats
  btrfs: init device stats for seed devices
  btrfs: remove struct extent_io_ops
  btrfs: call submit_bio_hook directly for metadata pages
  btrfs: stop calling submit_bio_hook for data inodes
  btrfs: don't opencode is_data_inode in end_bio_extent_readpage
  btrfs: call submit_bio_hook directly in submit_one_bio
  btrfs: remove extent_io_ops::readpage_end_io_hook
  btrfs: replace readpage_end_io_hook with direct calls
  btrfs: send, recompute reference path after orphanization of a directory
  btrfs: send, orphanize first all conflicting inodes when processing references
  btrfs: tree-checker: fix false alert caused by legacy btrfs root item
  btrfs: use unaligned helpers for stack and header set/get helpers
  btrfs: free-space-cache: use unaligned helpers to access data
  ...
2020-10-13 09:01:36 -07:00
Linus Torvalds 53acd35050 File locking fix for v5.10
-----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAl+EOfITHGpsYXl0b25A
 a2VybmVsLm9yZwAKCRAADmhBGVaCFY1pEACkyAuCmEZHMF+8+HjqFQSwDIoNLmbs
 eAjn6TSvurBRNOOVkFSrcWgf+jKbefepfWOnn708XpKcyZiWD374+eLo3JHeVRPN
 qwv0sKo/0l59qqJ2yu2GvcBiNgrUGzYWin8t/4QZRResWkGLSMwXlWHqA9lUwrRW
 CnA9Scs1ELbuBBmC+Xr8z3twSvZeaXtRJ+/+6lBpsamhUEBJwjqUja5/gTOyWFFW
 sChOasLfPBKnxxZfuKZBAQXrcqCQ8kqr9F7sTeO6awchH8/RtBRwEZilNhR2ZUqO
 iPlQ5mJ38DqOs1yFFiYvcBWvxgQqQlHLfURtPGFaz1uF7JttddnNEfUXAFpI34Zl
 0ZzRh8Acc9zvq02J7IeNjPalv/MG9G/Dh44mvknCjP6E0+Uzs08SrixcaeJcmmKg
 a2pqsuZuLNiQoIRo4t+rqeqWFFb44IGa8gEren7QRVkfnWb09OLYAGNxVLC2X7+w
 2JFncEwFH8VOvsX8hw42SFZAOPMGDhdjeElh16GNNp2ttDnYuWMgCmsK33I3b0jQ
 sjLN17b+263pKdHAm7sh6WHEocKGDsL6U4E/E9vThac6mipRKGeSHU83jtH6I56g
 h5O1SpZBVOCBcE0XfMyu9wwzNC8JFZ9XAQW3iFfDKTzxW0CABpisG+ODS59wB8r0
 7lNa3L8c69PeGA==
 =Qa10
 -----END PGP SIGNATURE-----

Merge tag 'locks-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking fix from Jeff Layton:
 "Just a single patch to fix up some tracepoint output"

* tag 'locks-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  locks: Remove extra "0x" in tracepoint format specifier
2020-10-12 16:50:18 -07:00
Linus Torvalds ee4a925107 Clean up the paravirt code after the removal of 32-bit Xen PV support.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EknMRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1jfCw//SHuZDnhwJEA0W6smo3iWs3CIvvvEriM7
 9ARjWparTD6P6ZXwW/xl76W+/QyzoWrsUDKHv9hFD5cpwafw5ZdCm4vhQi/tVLIc
 fqcEzG3I+UEqzs7K8NNVuEQs6b44diVPyVGEz7tRdufnKkXKU9Iyolc8zwa9OFB4
 qknqQXHDfJ2Xsz4zRpwtiKHFq0ZyXzGiDY+O/AYKa8Zw25W0W6Hk3IoR2o2QgBr2
 mE6VbhrO+woTEwMbNVi1fjioK2kQJ0PGleUQcaOz6rf8iMw/Ci4GJY70Yh3KIZMk
 VTNinCdC7GYwi0hsAsuas/dEIitn5B1zn3paN6wlNnpcjr1/Tn2oUw3euSju9X9a
 wvCMJX0ZoF/BLjoe7KSQAMCq0GaPNKWp9qP9gQFj/f1bUd4PC7yXRPJHZZlZfQPn
 M+jqsBye+GAbdeEzSjAutpU1gv4gjfF+heI8eLVtsYEmRmOfI6AxKm6MHjT0h6nK
 /krUyyTfi2IdXQ02FgbM8ufhXfAR6uXiaw4aCUoP53+gZR3R41aIxZ5rW4Tsfpxo
 jWeqYaVUpHnXY+Ses3Ziw1RGvpF0rrFP9xQv8jhsK1dJEPSIlpTahAgdYeQoIWFF
 7WAsRscDtqiFHGr/RdX67LkcNik+GTxQx8moctk3PHueegTwZwyVBMsItlbJrj+q
 fitB13vg18Y=
 =n1W3
 -----END PGP SIGNATURE-----

Merge tag 'x86-paravirt-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 paravirt cleanup from Ingo Molnar:
 "Clean up the paravirt code after the removal of 32-bit Xen PV support"

* tag 'x86-paravirt-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/paravirt: Avoid needless paravirt step clearing page table entries
  x86/paravirt: Remove set_pte_at() pv-op
  x86/entry/32: Simplify CONFIG_XEN_PV build dependency
  x86/paravirt: Use CONFIG_PARAVIRT_XXL instead of CONFIG_PARAVIRT
  x86/paravirt: Clean up paravirt macros
  x86/paravirt: Remove 32-bit support from CONFIG_PARAVIRT_XXL
2020-10-12 15:15:24 -07:00
Linus Torvalds dd502a8107 This tree introduces static_call(), which is the idea of static_branch()
applied to indirect function calls. Remove a data load (indirection) by
 modifying the text.
 
 They give the flexibility of function pointers, but with better
 performance. (This is especially important for cases where
 retpolines would otherwise be used, as retpolines can be pretty
 slow.)
 
 API overview:
 
   DECLARE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL(name, func);
   DEFINE_STATIC_CALL_NULL(name, typename);
 
   static_call(name)(args...);
   static_call_cond(name)(args...);
   static_call_update(name, func);
 
 x86 is supported via text patching, otherwise basic indirect calls are used,
 with function pointers.
 
 There's a second variant using inline code patching, inspired by jump-labels,
 implemented on x86 as well.
 
 The new APIs are utilized in the x86 perf code, a heavy user of function pointers,
 where static calls speed up the PMU handler by 4.2% (!).
 
 The generic implementation is not really excercised on other architectures,
 outside of the trivial test_static_call_init() self-test.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EfAQRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1iEAw//divHeVCJnHhV+YBbuI9ROUsERkzu8VhK
 O1DEmW68Fvj7pszT8NZsMjtkt97ZtxDRK7aCJiiup0eItG9qCJ8lpCLb84ZbizHV
 HhCbhBLrpxSvTrWlQnkgP1OkPAbtoryIjVlZzWhjye2MY8UEbVnZWyviBolbAAxH
 Fk1Yi56fIMu19GO+9Ohzy9E2VDnVEH1iMx5YWoLD2H88Qbq/yEMP+U2tIj8hIVKT
 Y/jdogihNXRIau6QB+YPfDPisdty+RHxfU7zct4Rv8cFF5ylglZB5fD34C3sUQF2
 WqsaYz7zjUj9f02F8pw8hIaAT7InzArPhlNVITxf2oMfmdrNqBptnSCddZqCJLvv
 oDGew21k50Zcbqkv9amclpxXH5tTpRvJeqit2pz/85GMeqBRuhzHUAkCpht5YA73
 qJsHWS3z+qIxKi0tDbhDJswuwa51q5sgdUUwo1uCr3wT3DGDlqNhCAZBzX14dcty
 0shDSbv13TCwqAcb7asPzEoPwE15cwa+x+viGEIL901pyZKyQYjs/abDU26It3BW
 roWRkuVJZ9/QMdZJs1v7kaXw1L8YiKIDkBgke+xbfrDwEvvjudQkl2LUL66DB11j
 RJU3GyxKClvdY06SSRh/H13fqZLNKh1JZ0nPEWSTJECDFN9zcDjrDrod/7PFOcpY
 NAlawLoGG+s=
 =JvpF
 -----END PGP SIGNATURE-----

Merge tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull static call support from Ingo Molnar:
 "This introduces static_call(), which is the idea of static_branch()
  applied to indirect function calls. Remove a data load (indirection)
  by modifying the text.

  They give the flexibility of function pointers, but with better
  performance. (This is especially important for cases where retpolines
  would otherwise be used, as retpolines can be pretty slow.)

  API overview:

      DECLARE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL(name, func);
      DEFINE_STATIC_CALL_NULL(name, typename);

      static_call(name)(args...);
      static_call_cond(name)(args...);
      static_call_update(name, func);

  x86 is supported via text patching, otherwise basic indirect calls are
  used, with function pointers.

  There's a second variant using inline code patching, inspired by
  jump-labels, implemented on x86 as well.

  The new APIs are utilized in the x86 perf code, a heavy user of
  function pointers, where static calls speed up the PMU handler by
  4.2% (!).

  The generic implementation is not really excercised on other
  architectures, outside of the trivial test_static_call_init()
  self-test"

* tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  static_call: Fix return type of static_call_init
  tracepoint: Fix out of sync data passing by static caller
  tracepoint: Fix overly long tracepoint names
  x86/perf, static_call: Optimize x86_pmu methods
  tracepoint: Optimize using static_call()
  static_call: Allow early init
  static_call: Add some validation
  static_call: Handle tail-calls
  static_call: Add static_call_cond()
  x86/alternatives: Teach text_poke_bp() to emulate RET
  static_call: Add simple self-test for static calls
  x86/static_call: Add inline static call implementation for x86-64
  x86/static_call: Add out-of-line static call implementation
  static_call: Avoid kprobes on inline static_call()s
  static_call: Add inline static call infrastructure
  static_call: Add basic static call infrastructure
  compiler.h: Make __ADDRESSABLE() symbol truly unique
  jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved()
  module: Properly propagate MODULE_STATE_COMING failure
  module: Fix up module_notifier return values
  ...
2020-10-12 13:58:15 -07:00
Linus Torvalds edaa5ddf38 Scheduler changes for v5.10:
- Reorganize & clean up the SD* flags definitions and add a bunch
    of sanity checks. These new checks caught quite a few bugs or at
    least inconsistencies, resulting in another set of patches.
 
  - Rseq updates, add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
 
  - Add a new tracepoint to improve CPU capacity tracking
 
  - Improve overloaded SMP system load-balancing behavior
 
  - Tweak SMT balancing
 
  - Energy-aware scheduling updates
 
  - NUMA balancing improvements
 
  - Deadline scheduler fixes and improvements
 
  - CPU isolation fixes
 
  - Misc cleanups, simplifications and smaller optimizations.
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAl+EWRERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hV8A/7BB0nt/zYVZ8Z3Di8V0b9hMtr0d1xtRM5
 ZAvg4hcZl/fVgobFndxBw6KdlK8lSce9Mcq+bTTWeD46CS13cK5Vrpiaf7x7Q00P
 m8YHeYEH13ME0pbBrhDoRCR4XzfXukzjkUl7LiyrTekAvRUtFikJ/uKl8MeJtYGZ
 gANEkadqforxUW0v45iUEGepmCWAl8hSlSMb2mDKsVhw4DFMD+px0EBmmA0VDqjE
 e0rkh6dEoUVNqlic2KoaXULld1rLg1xiaOcLUbTAXnucfhmuv5p/H11AC4ABuf+s
 7d0zLrLEfZrcLJkthYxfMHs7DYMtARiQM9Db/a5hAq9Af4Z2bvvVAaHt3gCGvkV1
 llB6BB2yWCki9Qv7oiGOAhANnyJHG/cU4r6WwMuHdlYi4dFT/iN5qkOMUL1IrDgi
 a6ZzvECChXBeisQXHSlMd8Y5O+j0gRvDR7E18z2q0/PlmO8PGJq4w34mEWveWIg3
 LaVF16bmvaARuNFJTQH/zaHhjqVQANSMx5OIv9swp0OkwvQkw21ICYHG0YxfzWCr
 oa/FESEpOL9XdYp8UwMPI0bmVIsEfx79pmDMF3zInYTpJpwMUhV2yjHE8uYVMqEf
 7U8rZv7gdbZ2us38Gjf2l73hY+recp/GrgZKnk0R98OUeMk1l/iVP6dwco6ITUV5
 czGmKlIB1ec=
 =bXy6
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - reorganize & clean up the SD* flags definitions and add a bunch of
   sanity checks. These new checks caught quite a few bugs or at least
   inconsistencies, resulting in another set of patches.

 - rseq updates, add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ

 - add a new tracepoint to improve CPU capacity tracking

 - improve overloaded SMP system load-balancing behavior

 - tweak SMT balancing

 - energy-aware scheduling updates

 - NUMA balancing improvements

 - deadline scheduler fixes and improvements

 - CPU isolation fixes

 - misc cleanups, simplifications and smaller optimizations

* tag 'sched-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  sched/deadline: Unthrottle PI boosted threads while enqueuing
  sched/debug: Add new tracepoint to track cpu_capacity
  sched/fair: Tweak pick_next_entity()
  rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
  rseq/selftests,x86_64: Add rseq_offset_deref_addv()
  rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
  sched/fair: Use dst group while checking imbalance for NUMA balancer
  sched/fair: Reduce busy load balance interval
  sched/fair: Minimize concurrent LBs between domain level
  sched/fair: Reduce minimal imbalance threshold
  sched/fair: Relax constraint on task's load during load balance
  sched/fair: Remove the force parameter of update_tg_load_avg()
  sched/fair: Fix wrong cpu selecting from isolated domain
  sched: Remove unused inline function uclamp_bucket_base_value()
  sched/rt: Disable RT_RUNTIME_SHARE by default
  sched/deadline: Fix stale throttling on de-/boosted tasks
  sched/numa: Use runnable_avg to classify node
  sched/topology: Move sd_flag_debug out of #ifdef CONFIG_SYSCTL
  MAINTAINERS: Add myself as SCHED_DEADLINE reviewer
  sched/topology: Move SD_DEGENERATE_GROUPS_MASK out of linux/sched/topology.h
  ...
2020-10-12 12:56:01 -07:00
Linus Torvalds 6734e20e39 arm64 updates for 5.10
- Userspace support for the Memory Tagging Extension introduced by Armv8.5.
   Kernel support (via KASAN) is likely to follow in 5.11.
 
 - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
   switching.
 
 - Fix and subsequent rewrite of our Spectre mitigations, including the
   addition of support for PR_SPEC_DISABLE_NOEXEC.
 
 - Support for the Armv8.3 Pointer Authentication enhancements.
 
 - Support for ASID pinning, which is required when sharing page-tables with
   the SMMU.
 
 - MM updates, including treating flush_tlb_fix_spurious_fault() as a no-op.
 
 - Perf/PMU driver updates, including addition of the ARM CMN PMU driver and
   also support to handle CPU PMU IRQs as NMIs.
 
 - Allow prefetchable PCI BARs to be exposed to userspace using normal
   non-cacheable mappings.
 
 - Implementation of ARCH_STACKWALK for unwinding.
 
 - Improve reporting of unexpected kernel traps due to BPF JIT failure.
 
 - Improve robustness of user-visible HWCAP strings and their corresponding
   numerical constants.
 
 - Removal of TEXT_OFFSET.
 
 - Removal of some unused functions, parameters and prototypes.
 
 - Removal of MPIDR-based topology detection in favour of firmware
   description.
 
 - Cleanups to handling of SVE and FPSIMD register state in preparation
   for potential future optimisation of handling across syscalls.
 
 - Cleanups to the SDEI driver in preparation for support in KVM.
 
 - Miscellaneous cleanups and refactoring work.
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+AUXMQHHdpbGxAa2Vy
 bmVsLm9yZwAKCRC3rHDchMFjNFc1B/4q2Kabe+pPu7s1f58Q+OTaEfqcr3F1qh27
 F1YpFZUYxg0GPfPsFrnbJpo5WKo7wdR9ceI9yF/GHjs7A/MSoQJis3pG6SlAd9c0
 nMU5tCwhg9wfq6asJtl0/IPWem6cqqhdzC6m808DjeHuyi2CCJTt0vFWH3OeHEhG
 cfmLfaSNXOXa/MjEkT8y1AXJ/8IpIpzkJeCRA1G5s18PXV9Kl5bafIo9iqyfKPLP
 0rJljBmoWbzuCSMc81HmGUQI4+8KRp6HHhyZC/k0WEVgj3LiumT7am02bdjZlTnK
 BeNDKQsv2Jk8pXP2SlrI3hIUTz0bM6I567FzJEokepvTUzZ+CVBi
 =9J8H
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "There's quite a lot of code here, but much of it is due to the
  addition of a new PMU driver as well as some arm64-specific selftests
  which is an area where we've traditionally been lagging a bit.

  In terms of exciting features, this includes support for the Memory
  Tagging Extension which narrowly missed 5.9, hopefully allowing
  userspace to run with use-after-free detection in production on CPUs
  that support it. Work is ongoing to integrate the feature with KASAN
  for 5.11.

  Another change that I'm excited about (assuming they get the hardware
  right) is preparing the ASID allocator for sharing the CPU page-table
  with the SMMU. Those changes will also come in via Joerg with the
  IOMMU pull.

  We do stray outside of our usual directories in a few places, mostly
  due to core changes required by MTE. Although much of this has been
  Acked, there were a couple of places where we unfortunately didn't get
  any review feedback.

  Other than that, we ran into a handful of minor conflicts in -next,
  but nothing that should post any issues.

  Summary:

   - Userspace support for the Memory Tagging Extension introduced by
     Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

   - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
     switching.

   - Fix and subsequent rewrite of our Spectre mitigations, including
     the addition of support for PR_SPEC_DISABLE_NOEXEC.

   - Support for the Armv8.3 Pointer Authentication enhancements.

   - Support for ASID pinning, which is required when sharing
     page-tables with the SMMU.

   - MM updates, including treating flush_tlb_fix_spurious_fault() as a
     no-op.

   - Perf/PMU driver updates, including addition of the ARM CMN PMU
     driver and also support to handle CPU PMU IRQs as NMIs.

   - Allow prefetchable PCI BARs to be exposed to userspace using normal
     non-cacheable mappings.

   - Implementation of ARCH_STACKWALK for unwinding.

   - Improve reporting of unexpected kernel traps due to BPF JIT
     failure.

   - Improve robustness of user-visible HWCAP strings and their
     corresponding numerical constants.

   - Removal of TEXT_OFFSET.

   - Removal of some unused functions, parameters and prototypes.

   - Removal of MPIDR-based topology detection in favour of firmware
     description.

   - Cleanups to handling of SVE and FPSIMD register state in
     preparation for potential future optimisation of handling across
     syscalls.

   - Cleanups to the SDEI driver in preparation for support in KVM.

   - Miscellaneous cleanups and refactoring work"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
  Revert "arm64: initialize per-cpu offsets earlier"
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
  perf: arm-cmn: Fix conversion specifiers for node type
  perf: arm-cmn: Fix unsigned comparison to less than zero
  arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
  arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
  arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
  arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
  KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
  arm64: Get rid of arm64_ssbd_state
  KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
  KVM: arm64: Get rid of kvm_arm_have_ssbd()
  KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
  ...
2020-10-12 10:00:51 -07:00
Ingo Molnar b36c830f8c Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull v5.10 RCU changes from Paul E. McKenney:

- Debugging for smp_call_function().

- Strict grace periods for KASAN.  The point of this series is to find
  RCU-usage bugs, so the corresponding new RCU_STRICT_GRACE_PERIOD
  Kconfig option depends on both DEBUG_KERNEL and RCU_EXPERT, and is
  further disabled by dfefault.  Finally, the help text includes
  a goodly list of scary caveats.

- New smp_call_function() torture test.

- Torture-test updates.

- Documentation updates.

- Miscellaneous fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-10-09 08:21:56 +02:00
Qu Wenruo 2c53a14dd3 btrfs: use own btree inode io_tree owner id
Btree inode is special compared to all other inode extent io_trees,
although it has a btrfs inode, it doesn't have the track_uptodate bit at
all.

This means a lot of things like extent locking doesn't even need to be
applied to btree io tree.

Since it's so special, adds a new owner value for it to make debuging a
little easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:13:22 +02:00
Nikolay Borisov acbf1dd0fc btrfs: make ordered extent tracepoint take btrfs_inode
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:12:17 +02:00
Qu Wenruo 437490fed3 btrfs: tracepoints: output proper root owner for trace_find_free_extent()
The current trace event always output result like this:

 find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
 find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=4(METADATA)
 find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
 find_free_extent: root=2(EXTENT_TREE) len=8192 empty_size=0 flags=1(DATA)
 find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)
 find_free_extent: root=2(EXTENT_TREE) len=4096 empty_size=0 flags=1(DATA)

T's saying we're allocating data extent for EXTENT tree, which is not
even possible.

It's because we always use EXTENT tree as the owner for
trace_find_free_extent() without using the @root from
btrfs_reserve_extent().

This patch will change the parameter to use proper @root for
trace_find_free_extent():

Now it looks much better:

 find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
 find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
 find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=1(DATA)
 find_free_extent: root=5(FS_TREE) len=4096 empty_size=0 flags=1(DATA)
 find_free_extent: root=5(FS_TREE) len=8192 empty_size=0 flags=1(DATA)
 find_free_extent: root=5(FS_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
 find_free_extent: root=7(CSUM_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
 find_free_extent: root=2(EXTENT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)
 find_free_extent: root=1(ROOT_TREE) len=16384 empty_size=0 flags=36(METADATA|DUP)

Reported-by: Hans van Kranenburg <hans@knorrie.org>
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:06:49 +02:00
Mark Brown fd6b519a30 Linux 5.9-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9epdgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG9IMH/jHCRSbcsIXHuQHn
 xcRLlhrDHfXoBza7auHfPWx2+9DZsmaSJs/SEiTGNag0Bi7jBcWcwBpsep7iVG/+
 WiftD5uOMhZigyuvfMFrt0mjr2Kr3wg5p58lwMBeBdm8iL5uKV8ehKsh05/Fral2
 6hu3jP8L0PCZMpF+sZ7s2jlhfVUMmjA8VzXZCvgQtmhoraHiF3mzfkcSMxnHwBPO
 HLo+TDDm49u+LbVsJT7+cSTiWxuUJCbix9Q4PCTx/BGg4ezYsjc6v0BnYRaYtrrA
 1uYiT6PVBEUkYYBHKQlD3N2KnUmbKx7dGUF4t+peTg5/JiocAJMNi1N9Qzvv7N6Q
 CqTiuio=
 =q+kJ
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAl98iv4THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0KpvB/9X9BTa/SRvi2kxkji2wt8dq2DZpvyR
 wuwuNhs0P49A6um6qYNdeXTdNH7lpf3JIdsznMpJAK8ztD0F2tUe3iRxSAcleSCO
 D2YhL4Ptkap1IwzN0mw/POju8kz+gj3Oj/eYafXe7lUVK+RSI4IAyxnKy4WDrcnp
 I7CWLd5RgSz2t9v9/s34nOUJC4+U/Bk2p46VDLlMu73wQ+m8gIoizWDU6a38pr1V
 7v2srCRzbEdvT/Am43XSBo4haZgubjoCYOuniTRM6Z3HN/C6PD5ssFMFlIUXnCZK
 s3WBU7M79GLATEpPIiSvwf99dbQDnhsCFuEi/s0SLj5phUlAO43NNRGX
 =S5ik
 -----END PGP SIGNATURE-----

Merge tag 'v5.9-rc5' into asoc-5.10

Linux 5.9-rc5
2020-10-06 16:19:24 +01:00
Cezary Rojewski ca756120d4
ASoC: Intel: Remove haswell solution
Newly added catpt solution found in sound/soc/intel/catpt is a direct
replacement to sound/soc/intel/haswell. It covers all features supported
by it and more - by aligning to recommended flows and requirement list
based on Windows driver equivalent. No harm is done to userspace as
catpt - similarly to haswell - loads no extenal topology files while
sharing the exact same ADSP firmware binary.

Given the above, existing haswell code is redundant so remove it.

Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@intel.com>
Link: https://lore.kernel.org/r/20201006064907.16277-2-cezary.rojewski@intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
2020-10-06 15:12:18 +01:00
Vincent Donnefort 51cf18c90c sched/debug: Add new tracepoint to track cpu_capacity
rq->cpu_capacity is a key element in several scheduler parts, such as EAS
task placement and load balancing. Tracking this value enables testing
and/or debugging by a toolkit.

Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1598605249-72651-1-git-send-email-vincent.donnefort@arm.com
2020-10-03 16:30:52 +02:00
Roman Bolshakov 7010645ba7 scsi: target: core: Add CONTROL field for trace events
trace-cmd report doesn't show events from target subsystem because
scsi_command_size() leaks through event format string:

  [target:target_sequencer_start] function scsi_command_size not defined
  [target:target_cmd_complete] function scsi_command_size not defined

Addition of scsi_command_size() to plugin_scsi.c in trace-cmd doesn't
help because an expression is used inside TP_printk(). trace-cmd event
parser doesn't understand minus sign inside [ ]:

  Error: expected ']' but read '-'

Rather than duplicating kernel code in plugin_scsi.c, provide a dedicated
field for CONTROL byte.

Link: https://lore.kernel.org/r/20200929125957.83069-1-r.bolshakov@yadro.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-10-02 18:36:19 -04:00
Coly Li 1132e56e78 bcache: add set_uuid in struct cache_set
This patch adds a separated set_uuid[16] in struct cache_set, to store
the uuid of the cache set. This is the preparation to remove the
embedded struct cache_sb from struct cache_set.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-02 14:25:30 -06:00
Ido Schimmel 5b88823bfe devlink: Add a tracepoint for trap reports
Add a tracepoint for trap reports so that drop monitor could register
its probe on it. Use trace_devlink_trap_report_enabled() to avoid
wasting cycles setting the trap metadata if the tracepoint is not
enabled.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-30 18:01:26 -07:00
Alexander Graf 1ae099540e KVM: x86: Allow deflecting unknown MSR accesses to user space
MSRs are weird. Some of them are normal control registers, such as EFER.
Some however are registers that really are model specific, not very
interesting to virtualization workloads, and not performance critical.
Others again are really just windows into package configuration.

Out of these MSRs, only the first category is necessary to implement in
kernel space. Rarely accessed MSRs, MSRs that should be fine tunes against
certain CPU models and MSRs that contain information on the package level
are much better suited for user space to process. However, over time we have
accumulated a lot of MSRs that are not the first category, but still handled
by in-kernel KVM code.

This patch adds a generic interface to handle WRMSR and RDMSR from user
space. With this, any future MSR that is part of the latter categories can
be handled in user space.

Furthermore, it allows us to replace the existing "ignore_msrs" logic with
something that applies per-VM rather than on the full system. That way you
can run productive VMs in parallel to experimental ones where you don't care
about proper MSR handling.

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

Message-Id: <20200925143422.21718-3-graf@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-28 07:58:04 -04:00
Tejun Heo c5a6561b8d iocost: add iocg_forgive_debt tracepoint
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:35:02 -06:00
Jens Axboe ac8f7a0264 Merge branch 'for-5.10/block' into for-5.10/drivers
* for-5.10/block: (140 commits)
  bdi: replace BDI_CAP_NO_{WRITEBACK,ACCT_DIRTY} with a single flag
  bdi: invert BDI_CAP_NO_ACCT_WB
  bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag
  mm: use SWP_SYNCHRONOUS_IO more intelligently
  bdi: remove BDI_CAP_SYNCHRONOUS_IO
  bdi: remove BDI_CAP_CGROUP_WRITEBACK
  block: lift setting the readahead size into the block layer
  md: update the optimal I/O size on reshape
  bdi: initialize ->ra_pages and ->io_pages in bdi_init
  aoe: set an optimal I/O size
  bcache: inherit the optimal I/O size
  drbd: remove dead code in device_to_statistics
  fs: remove the unused SB_I_MULTIROOT flag
  block: mark blkdev_get static
  PM: mm: cleanup swsusp_swap_check
  mm: split swap_type_of
  PM: rewrite is_hibernate_resume_dev to not require an inode
  mm: cleanup claim_swapfile
  ocfs2: cleanup o2hb_region_dev_store
  dasd: cleanup dasd_scan_partitions
  ...
2020-09-24 13:44:39 -06:00
Chuck Lever 721a1d388b SUNRPC: Remove dprintk call sites in RPC queuing functions
Remove redundant call sites or call sites that are already covered
by tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever 1466c22163 SUNRPC: Clean up RPC scheduler tracepoints
Remove several redundant dprintk call sites, and replace a couple of
potentially useful ones with tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever c3adcc7dfb SUNRPC: Replace rpcbind dprintk call sites with tracepoints
In many cases, tracepoints already report these errors. In others,
the dprintks were mainly useful when this code was less mature.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever ac1ae53421 SUNRPC: Hoist trace_xprtrdma_op_setport into generic code
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever e465cc3fa8 SUNRPC: Remove rpcb_getport_async dprintk call sites
In many cases, tracepoints already report these errors. In others,
the dprintks were mainly useful when this code was less mature.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever 42ebfc2cbf SUNRPC: Clean up call_bind_status() observability
Time to remove dprintk call sites in here.

Regarding the rpc_bind_status tracepoint: It's friendlier to
administrators if they don't have to look up the error code to
figure out what went wrong. Replace trace_rpc_bind_status with a
set of tracepoints that report more specifically what the problem
was, and what RPC program/version was being queried.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever 7c8099f6ad SUNRPC: Trace call_refresh events
Clean up: Replace dprintk call sites.

Note that rpc_call_rpcerror() already has a trace point, so perhaps
adding trace_rpc_refresh_status() isn't necessary. However, it does
report a particular category of error.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever 914cdcc78a SUNRPC: Add trace_rpc_timeout_status()
For a long while we've wanted a tracepoint that fires when a major
timeout is reported in the system log. Such a tracepoint can be
attached to other actions that can take place when a timeout is
detected (eg, server or connection health assessment).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever db0a86c426 SUNRPC: Replace connect dprintk call sites with a tracepoint
This trace event can be used to audit transport connections from the
client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:09 -04:00
Chuck Lever 015747d296 SUNRPC: Replace dprintk() call site in xs_nospace()
"no socket space" is an exceptional and infrequent condition
that troubleshooters want to know about.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Chuck Lever 9ce07ae5eb SUNRPC: Replace dprintk() call site in xprt_prepare_transmit
Generate a trace event when an RPC request is queued without being
sent immediately.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Chuck Lever 09d2ba0cb1 SUNRPC: Update debugging instrumentation in xprt_do_reserve()
Replace a dprintk() with a tracepoint. The tracepoint marks the
point where an RPC request is assigned an XID.

Additional clean up: Remove trace_xprt_enq_xmit, which reports much
the same thing. That tracepoint was added for debugging commit
918f3c1fe8 ("SUNRPC: Improve latency for interactive tasks").

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Chuck Lever 7806948753 SUNRPC: Remove debugging instrumentation from xprt_release
These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

          iozone-2617  [002]   975.713126: rpc_stats_latency:    task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
          iozone-2617  [002]   975.713127: xprt_release_cong:    task:418@5 snd_task:4294967295 cong=256 cwnd=16384
          iozone-2617  [002]   975.713127: xprt_put_cong:        task:418@5 snd_task:4294967295 cong=0 cwnd=16384

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Chuck Lever 06e234c613 SUNRPC: Hoist trace_xprtrdma_op_allocate into generic code
Introduce a tracepoint in call_allocate that reports the exact
sizes in the RPC buffer allocation request and the status of the
result. This helps catch problems with XDR buffer provisioning,
and replaces transport-specific debugging instrumentation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Chuck Lever e4378a0fdd SUNRPC: Remove trace_xprt_complete_rqst()
Request completion is already recorded by an "rpc_task_wakeup
queue=xprt_pending" trace record. A subsequent rpc_xdr_recvfrom
trace record shows the number of bytes received.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-09-21 10:21:08 -04:00
Jason Gunthorpe 5dee5872f8 Merge branch 'mlx5_active_speed' into rdma.git for-next
Leon Romanovsky says:

====================
IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.
====================

Based on the mlx5-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_active_speed':
  RDMA: Fix link active_speed size
  RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
  net/mlx5: Refactor query port speed functions
2020-09-18 10:31:45 -03:00
David Howells 96a9c425e2 rxrpc: Fix a missing NULL-pointer check in a trace
Fix the rxrpc_client tracepoint to not dereference conn to get the cid if
conn is NULL, as it does for other fields.

	RIP: 0010:trace_event_raw_event_rxrpc_client+0x7e/0xe0 [rxrpc]
	Call Trace:
	 rxrpc_activate_channels+0x62/0xb0 [rxrpc]
	 rxrpc_connect_call+0x481/0x650 [rxrpc]
	 ? wake_up_q+0xa0/0xa0
	 ? rxrpc_kernel_begin_call+0x12a/0x1b0 [rxrpc]
	 rxrpc_new_client_call+0x2a5/0x5e0 [rxrpc]

Fixes: 245500d853 ("rxrpc: Rewrite the client connection manager")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
2020-09-14 16:18:59 +01:00
Chao Yu 32c0fec1aa f2fs: trace: fix typo
Fixes a typo from 'compreesed' to 'compressed'.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11 11:11:25 -07:00
Chao Yu 093749e296 f2fs: support age threshold based garbage collection
There are several issues in current background GC algorithm:
- valid blocks is one of key factors during cost overhead calculation,
so if segment has less valid block, however even its age is young or
it locates hot segment, CB algorithm will still choose the segment as
victim, it's not appropriate.
- GCed data/node will go to existing logs, no matter in-there datas'
update frequency is the same or not, it may mix hot and cold data
again.
- GC alloctor mainly use LFS type segment, it will cost free segment
more quickly.

This patch introduces a new algorithm named age threshold based
garbage collection to solve above issues, there are three steps
mainly:

1. select a source victim:
- set an age threshold, and select candidates beased threshold:
e.g.
 0 means youngest, 100 means oldest, if we set age threshold to 80
 then select dirty segments which has age in range of [80, 100] as
 candiddates;
- set candidate_ratio threshold, and select candidates based the
ratio, so that we can shrink candidates to those oldest segments;
- select target segment with fewest valid blocks in order to
migrate blocks with minimum cost;

2. select a target victim:
- select candidates beased age threshold;
- set candidate_radius threshold, search candidates whose age is
around source victims, searching radius should less than the
radius threshold.
- select target segment with most valid blocks in order to avoid
migrating current target segment.

3. merge valid blocks from source victim into target victim with
SSR alloctor.

Test steps:
- create 160 dirty segments:
 * half of them have 128 valid blocks per segment
 * left of them have 384 valid blocks per segment
- run background GC

Benefit: GC count and block movement count both decrease obviously:

- Before:
  - Valid: 86
  - Dirty: 1
  - Prefree: 11
  - Free: 6001 (6001)

GC calls: 162 (BG: 220)
  - data segments : 160 (160)
  - node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
  - data blocks : 40960 (40960)
  - node blocks : 494 (494)

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments

- After:

  - Valid: 87
  - Dirty: 0
  - Prefree: 4
  - Free: 6008 (6008)

GC calls: 75 (BG: 76)
  - data segments : 74 (74)
  - node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
  - data blocks : 12544 (12544)
  - node blocks : 269 (269)

IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-09-11 11:11:15 -07:00
David Howells 245500d853 rxrpc: Rewrite the client connection manager
Rewrite the rxrpc client connection manager so that it can support multiple
connections for a given security key to a peer.  The following changes are
made:

 (1) For each open socket, the code currently maintains an rbtree with the
     connections placed into it, keyed by communications parameters.  This
     is tricky to maintain as connections can be culled from the tree or
     replaced within it.  Connections can require replacement for a number
     of reasons, e.g. their IDs span too great a range for the IDR data
     type to represent efficiently, the call ID numbers on that conn would
     overflow or the conn got aborted.

     This is changed so that there's now a connection bundle object placed
     in the tree, keyed on the same parameters.  The bundle, however, does
     not need to be replaced.

 (2) An rxrpc_bundle object can now manage the available channels for a set
     of parallel connections.  The lock that manages this is moved there
     from the rxrpc_connection struct (channel_lock).

 (3) There'a a dummy bundle for all incoming connections to share so that
     they have a channel_lock too.  It might be better to give each
     incoming connection its own bundle.  This bundle is not needed to
     manage which channels incoming calls are made on because that's the
     solely at whim of the client.

 (4) The restrictions on how many client connections are around are
     removed.  Instead, a previous patch limits the number of client calls
     that can be allocated.  Ordinarily, client connections are reaped
     after 2 minutes on the idle queue, but when more than a certain number
     of connections are in existence, the reaper starts reaping them after
     2s of idleness instead to get the numbers back down.

     It could also be made such that new call allocations are forced to
     wait until the number of outstanding connections subsides.

Signed-off-by: David Howells <dhowells@redhat.com>
2020-09-08 21:11:43 +01:00
Steven Price 4beba9486a mm: Add PG_arch_2 page flag
For arm64 MTE support it is necessary to be able to mark pages that
contain user space visible tags that will need to be saved/restored e.g.
when swapped out.

To support this add a new arch specific flag (PG_arch_2). This flag is
only available on 64-bit architectures due to the limited number of
spare page flags on the 32-bit ones.

Signed-off-by: Steven Price <steven.price@arm.com>
[catalin.marinas@arm.com: use CONFIG_64BIT for guarding this new flag]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
2020-09-04 12:46:06 +01:00
Linus Torvalds 3e8d3bdc2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
    Kivilinna.

 2) Fix loss of RTT samples in rxrpc, from David Howells.

 3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu.

 4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.

 5) We disable BH for too lokng in sctp_get_port_local(), add a
    cond_resched() here as well, from Xin Long.

 6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.

 7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
    Yonghong Song.

 8) Missing of_node_put() in mt7530 DSA driver, from Sumera
    Priyadarsini.

 9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.

10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.

11) Memory leak in rxkad_verify_response, from Dinghao Liu.

12) In tipc, don't use smp_processor_id() in preemptible context. From
    Tuong Lien.

13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.

14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter.

15) Fix ABI mismatch between driver and firmware in nfp, from Louis
    Peens.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
  net/smc: fix sock refcounting in case of termination
  net/smc: reset sndbuf_desc if freed
  net/smc: set rx_off for SMCR explicitly
  net/smc: fix toleration of fake add_link messages
  tg3: Fix soft lockup when tg3_reset_task() fails.
  doc: net: dsa: Fix typo in config code sample
  net: dp83867: Fix WoL SecureOn password
  nfp: flower: fix ABI mismatch between driver and firmware
  tipc: fix shutdown() of connectionless socket
  ipv6: Fix sysctl max for fib_multipath_hash_policy
  drivers/net/wan/hdlc: Change the default of hard_header_len to 0
  net: gemini: Fix another missing clk_disable_unprepare() in probe
  net: bcmgenet: fix mask check in bcmgenet_validate_flow()
  amd-xgbe: Add support for new port mode
  net: usb: dm9601: Add USB ID of Keenetic Plus DSL
  vhost: fix typo in error message
  net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
  pktgen: fix error message with wrong function name
  net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
  cxgb4: fix thermal zone device registration
  ...
2020-09-03 18:50:48 -07:00
Tejun Heo 0460375517 blk-iocost: restore inuse update tracepoints
Update and restore the inuse update tracepoints.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01 19:38:33 -06:00
Tejun Heo 065655c862 blk-iocost: decouple vrate adjustment from surplus transfers
Budget donations are inaccurate and could take multiple periods to converge.
To prevent triggering vrate adjustments while surplus transfers were
catching up, vrate adjustment was suppressed if donations were increasing,
which was indicated by non-zero nr_surpluses.

This entangling won't be necessary with the scheduled rewrite of donation
mechanism which will make it precise and immediate. Let's decouple the two
in preparation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01 19:38:32 -06:00
Tejun Heo 1aa50d020c blk-iocost: calculate iocg->usages[] from iocg->local_stat.usage_us
Currently, iocg->usages[] which are used to guide inuse adjustments are
calculated from vtime deltas. This, however, assumes that the hierarchical
inuse weight at the time of calculation held for the entire period, which
often isn't true and can lead to significant errors.

Now that we have absolute usage information collected, we can derive
iocg->usages[] from iocg->local_stat.usage_us so that inuse adjustment
decisions are made based on actual absolute usage. The calculated usage is
clamped between 1 and WEIGHT_ONE and WEIGHT_ONE is also used to signal
saturation regardless of the current hierarchical inuse weight.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-01 19:38:32 -06:00
Chuck Lever 1ad5f100e3 locks: Remove extra "0x" in tracepoint format specifier
Clean up: %p adds its own 0x already.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-09-01 18:09:34 -04:00
Steven Rostedt (VMware) d25e37d89d tracepoint: Optimize using static_call()
Currently the tracepoint site will iterate a vector and issue indirect
calls to however many handlers are registered (ie. the vector is
long).

Using static_call() it is possible to optimize this for the common
case of only having a single handler registered. In this case the
static_call() can directly call this handler. Otherwise, if the vector
is longer than 1, call a function that iterates the whole vector like
the current code.

[peterz: updated to new interface]

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20200818135805.279421092@infradead.org
2020-09-01 09:58:06 +02:00
Jason Gunthorpe 6989aa62d3 Linux 5.9-rc3
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9ML+IeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGA8EIAIy/kTbFS0yrE9yV
 hb98oX0z9+EU9YQg9vhaRWwPd+rJF/JMQZLqYcwbhjG9abaUL3T3fEcSAefMHw8E
 LAt+hYzA38dHt7tqhsFQX3vV1VorvDVICBVN0yRPRWKKikq4OPIHzaAR9tleGAF5
 8btQisl1PjN+obwYmLuNb6aX16OCwAF+uXOwehcoJs9dvMNhwtXRzfOflWzOvOo6
 tE0bHErlylLDfLv4ZzEfczTdks4QJZ7C0xLSf3oN9AAynW42Xnhct4hi8qZY/hCf
 CMaqeN4hdpub6TvQIqBdDqMMjEXGFgeNSnAEBQY9VpvUqz8NTu6sQxwgJEKDF5tg
 d81lv2c=
 =uW/F
 -----END PGP SIGNATURE-----

Merge tag 'v5.9-rc3' into rdma.git for-next

Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 12:28:12 -03:00
Linus Torvalds 8bb5021cc2 powerpc fixes for 5.9 #4
Revert our removal of PROT_SAO, at least one user expressed an interest in using
 it on Power9. Instead don't allow it to be used in guests unless enabled
 explicitly at compile time.
 
 A fix for a crash introduced by a recent change to FP handling.
 
 Revert a change to our idle code that left Power10 with no idle support.
 
 One minor fix for the new scv system call path to set PPR.
 
 Fix a crash in our "generic" PMU if branch stack events were enabled.
 
 A fix for the IMC PMU, to correctly identify host kernel samples.
 
 The ADB_PMU powermac code was found to be incompatible with VMAP_STACK, so make
 them incompatible in Kconfig until the code can be fixed.
 
 A build fix in drivers/video/fbdev/controlfb.c, and a documentation fix.
 
 Thanks to:
   Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy, Giuseppe Sacco,
   Madhavan Srinivasan, Milton Miller, Nicholas Piggin, Pratik Rajesh Sampat,
   Randy Dunlap, Shawn Anastasio, Vaidyanathan Srinivasan.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl9LlF8THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgEwJD/4nEkp9id7bZyiGruoawqxdpmc9viIp
 JFRH3+eHWbE5rfoXn7fwM1zTE9SsHxCd0q09cHk2rtAwKMXcJW83/pXNuWEjIzcy
 7Ra8Zq2jRl6qgWAx84VKoZVg+W40yNFex0M0akMQV55SjYOTN8gpGe+algi+wPaH
 44oYBYctDi3B9X8CsaUQEdov1EZdWT6TxcN9xIJiIdr53VXMER6C+ytYV8VgkGHW
 Qt+Ardyvp6eNq9+foGegRSk3OmNcmj+CJZYzhkp5+1k9ko9GQ8wg9NzxTV4ZoSJ9
 g5rgD4ztBfLGyUDu6oUypzOnSVbfzJh9JPH/h1zaSOjSv9MnJ20zqvqjD7QXFNbs
 j960PiylTfVWdnOoUUkvON0UOYZM9XiZP63i8z/mBsMJ5BFaLB1TonZ+lDwXc1vK
 MHXhjahP2qP0LnJZ/M5gT3zfLPyrKoeIlmLTOkLjrM5C9mcSxpPnagq+AHacfYpG
 sGrg2LGLfBo/9PomUNHseQhBfsc2uYwM924si9MpNWN6BT+TNgTJYeNPDOnvRCbG
 ivDQ7HFZ6aiOj+b5iTZI2RV3EOaBKZgo+VEryNDnqd7etjyDr5PNbooGaHJDgsnz
 mNFxUNusxzv0vMI3zyFtLMTe/99/NlRSYyMXPL8SL7MvlRt624ngrrxYv+2+dBRt
 aIpxSpgdqTVXSw==
 =t+yB
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Revert our removal of PROT_SAO, at least one user expressed an
   interest in using it on Power9. Instead don't allow it to be used in
   guests unless enabled explicitly at compile time.

 - A fix for a crash introduced by a recent change to FP handling.

 - Revert a change to our idle code that left Power10 with no idle
   support.

 - One minor fix for the new scv system call path to set PPR.

 - Fix a crash in our "generic" PMU if branch stack events were enabled.

 - A fix for the IMC PMU, to correctly identify host kernel samples.

 - The ADB_PMU powermac code was found to be incompatible with
   VMAP_STACK, so make them incompatible in Kconfig until the code can
   be fixed.

 - A build fix in drivers/video/fbdev/controlfb.c, and a documentation
   fix.

Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
Giuseppe Sacco, Madhavan Srinivasan, Milton Miller, Nicholas Piggin,
Pratik Rajesh Sampat, Randy Dunlap, Shawn Anastasio, Vaidyanathan
Srinivasan.

* tag 'powerpc-5.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/32s: Disable VMAP stack which CONFIG_ADB_PMU
  Revert "powerpc/powernv/idle: Replace CPU feature check with PVR check"
  powerpc/perf: Fix reading of MSR[HV/PR] bits in trace-imc
  powerpc/perf: Fix crashes with generic_compat_pmu & BHRB
  powerpc/64s: Fix crash in load_fp_state() due to fpexc_mode
  powerpc/64s: scv entry should set PPR
  Documentation/powerpc: fix malformed table in syscall64-abi
  video: fbdev: controlfb: Fix build for COMPILE_TEST=y && PPC_PMAC=n
  selftests/powerpc: Update PROT_SAO test to skip ISA 3.1
  powerpc/64s: Disallow PROT_SAO in LPARs by default
  Revert "powerpc/64s: Remove PROT_SAO support"
2020-08-30 10:56:12 -07:00
Linus Torvalds e309428590 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAl9JG9wACgkQnJ2qBz9k
 QNlp3ggA3B/Xopb2X3cCpf2fFw63YGJU4i0XJxi+3fC/v6m8U+D4XbqJUjaM5TZz
 +4XABQf7OHvSwDezc3n6KXXD/zbkZCeVm9aohEXvfMYLyKbs+S7QNQALHEtpfBUU
 3IY2pQ90K7JT9cD9pJls/Y/EaA1ObWP7+3F1zpw8OutGchKcE8SvVjzL3SSJaj7k
 d8OTtMosAFuTe4saFWfsf9CmZzbx4sZw3VAzXEXAArrxsmqFKIcY8dI8TQ0WaYNh
 C3wQFvW+n9wHapylyi7RhGl2QH9Tj8POfnCTahNFFJbsmJBx0Z3r42mCBAk4janG
 FW+uDdH5V780bTNNVUKz0v4C/YDiKg==
 =jQnW
 -----END PGP SIGNATURE-----

Merge tag 'writeback_for_v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull writeback fixes from Jan Kara:
 "Fixes for writeback code occasionally skipping writeback of some
  inodes or livelocking sync(2)"

* tag 'writeback_for_v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  writeback: Drop I_DIRTY_TIME_EXPIRE
  writeback: Fix sync livelock due to b_dirty_time processing
  writeback: Avoid skipping inode writeback
  writeback: Protect inode->i_io_list with inode->i_lock
2020-08-28 10:57:14 -07:00
David S. Miller 8d73a73a7f RxRPC fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl9HwPIACgkQ+7dXa6fL
 C2vHdw//S5s+nPNuJUpz3aeYDC1Yl1YP+r2+XBfcCNem504GKfsvTBbLo7FYP6Oc
 TCLJouPxoNSWAtlrJ86YJu8EcFdYNI4w6mCHncIrLXiiIZhsAeUMAw1GiASL5VUr
 QhLjJJU4zUBg3/RWLRC1pUXBXAa3sYx5r1p8j3KdBAkAmAzXdFWSRFeffVeXIk46
 FZ52YoQkplJYqDL+1oQCjJLetVJGGvc68AIOjJOLh6CAjrZbx1aiY79XA+flsoE9
 B+KVjhEn0f9dVybgtF4rEqS88y1FJtSQgul6FOsys+Rx2I0G7ei8PB5TQ2wNCQhy
 37gGfg1AV6apcqXKcHJdovVnApoQzzdCJESCgbMvsczqM88dP7pLWrPz4wpxs655
 3t6qUyXI6yMOJhUWPOoFi30+4NM+MmsqrbYLZt2f/aXRhKnyaeaLd+VAIlkgz3Lj
 sZuuHQsTH0xiR2uCsINWEc7d7UV6WjeVUJ77LzYiiRzEC2pX80tGu5EKHN8W9oBk
 xRAuExXEGtyOR0p3/S3StkT490Tt4bIxUwnKYAaQZEydMCnTlWVbtIXwzmlCbCSB
 p+P/7twR7LiQlHCTJU94jH3Jfpm3jpFpgjQZoZcx6ZLvtSH4+QP11nMY9+sRxEz1
 hpB12AY7Wp2N7P+GhBPuXUCQpzW751aNZz4X9Etu6kRgwjuyaJM=
 =FWHJ
 -----END PGP SIGNATURE-----

Merge tag 'rxrpc-fixes-20200820' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc, afs: Fix probing issues

Here are some fixes for rxrpc and afs to fix issues in the RTT measuring in
rxrpc and thence the Volume Location server probing in afs:

 (1) Move the serial number of a received ACK into a local variable to
     simplify the next patch.

 (2) Fix the loss of RTT samples due to extra interposed ACKs causing
     baseline information to be discarded too early.  This is a particular
     problem for afs when it sends a single very short call to probe a
     server it hasn't talked to recently.

 (3) Fix rxrpc_kernel_get_srtt() to indicate whether it actually has seen
     any valid samples or not.

 (4) Remove a field that's set/woken, but never read/waited on.

 (5) Expose the RTT and other probe information through procfs to make
     debugging of this stuff easier.

 (6) Fix VL rotation in afs to only use summary information from VL probing
     and not the probe running state (which gets clobbered when next a
     probe is issued).

 (7) Fix VL rotation to actually return the error aggregated from the probe
     errors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-27 12:55:46 -07:00
Joel Fernandes (Google) c30068f41a rcu/trace: Print negative GP numbers correctly
GP numbers start from -300 and gp_seq numbers start of -1200 (for a
shift of 2). These negative numbers are printed as unsigned long which
not only takes up more text space, but is rather confusing to the reader
as they have to constantly expend energy to truncate the number. Just
print the negative numbering directly.

Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 18:36:04 -07:00
Chuck Lever b3d03daa7c RDMA/core: Move the rdma_show_ib_cm_event() macro
Refactor: Make it globally available in the utilities header.

Link: https://lore.kernel.org/r/159767239131.2968.9520990257041764685.stgit@klimt.1015granger.net
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-24 16:01:47 -03:00
Shawn Anastasio 12564485ed Revert "powerpc/64s: Remove PROT_SAO support"
This reverts commit 5c9fa16e8a.

Since PROT_SAO can still be useful for certain classes of software,
reintroduce it. Concerns about guest migration for LPARs using SAO
will be addressed next.

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821185558.35561-2-shawn@anastas.io
2020-08-24 14:12:53 +10:00
Peter Enderborg 30969bc8e0 selinux: add basic filtering for audit trace events
This patch adds further attributes to the event. These attributes are
helpful to understand the context of the message and can be used
to filter the events.

There are three common items. Source context, target context and tclass.
There are also items from the outcome of operation performed.

An event is similar to:
           <...>-1309  [002] ....  6346.691689: selinux_audited:
       requested=0x4000000 denied=0x4000000 audited=0x4000000
       result=-13
       scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
       tcontext=system_u:object_r:bin_t:s0 tclass=file

With systems where many denials are occurring, it is useful to apply a
filter. The filtering is a set of logic that is inserted with
the filter file. Example:
 echo "tclass==\"file\" " > events/avc/selinux_audited/filter

This adds that we only get tclass=file.

The trace can also have extra properties. Adding the user stack
can be done with
   echo 1 > options/userstacktrace

Now the output will be
         runcon-1365  [003] ....  6960.955530: selinux_audited:
     requested=0x4000000 denied=0x4000000 audited=0x4000000
     result=-13
     scontext=system_u:system_r:cupsd_t:s0-s0:c0.c1023
     tcontext=system_u:object_r:bin_t:s0 tclass=file
          runcon-1365  [003] ....  6960.955560: <user stack trace>
 =>  <00007f325b4ce45b>
 =>  <00005607093efa57>

Signed-off-by: Peter Enderborg <peter.enderborg@sony.com>
Reviewed-by: Thiébaud Weksteen <tweek@google.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:07:29 -04:00
Thiébaud Weksteen dd8166212d selinux: add tracepoint on audited events
The audit data currently captures which process and which target
is responsible for a denial. There is no data on where exactly in the
process that call occurred. Debugging can be made easier by being able to
reconstruct the unified kernel and userland stack traces [1]. Add a
tracepoint on the SELinux denials which can then be used by userland
(i.e. perf).

Although this patch could manually be added by each OS developer to
trouble shoot a denial, adding it to the kernel streamlines the
developers workflow.

It is possible to use perf for monitoring the event:
  # perf record -e avc:selinux_audited -g -a
  ^C
  # perf report -g
  [...]
      6.40%     6.40%  audited=800000 tclass=4
               |
                  __libc_start_main
                  |
                  |--4.60%--__GI___ioctl
                  |          entry_SYSCALL_64
                  |          do_syscall_64
                  |          __x64_sys_ioctl
                  |          ksys_ioctl
                  |          binder_ioctl
                  |          binder_set_nice
                  |          can_nice
                  |          capable
                  |          security_capable
                  |          cred_has_capability.isra.0
                  |          slow_avc_audit
                  |          common_lsm_audit
                  |          avc_audit_post_callback
                  |          avc_audit_post_callback
                  |

It is also possible to use the ftrace interface:
  # echo 1 > /sys/kernel/debug/tracing/events/avc/selinux_audited/enable
  # cat /sys/kernel/debug/tracing/trace
  tracer: nop
  entries-in-buffer/entries-written: 1/1   #P:8
  [...]
  dmesg-3624  [001] 13072.325358: selinux_denied: audited=800000 tclass=4

The tclass value can be mapped to a class by searching
security/selinux/flask.h. The audited value is a bit field of the
permissions described in security/selinux/av_permissions.h for the
corresponding class.

[1] https://source.android.com/devices/tech/debug/native_stack_dump

Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Suggested-by: Joel Fernandes <joelaf@google.com>
Reviewed-by: Peter Enderborg <peter.enderborg@sony.com>
Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2020-08-21 17:05:22 -04:00
Linus Torvalds d723b99ec9 Improvements to ext4's block allocator performance for very large file
systems, especially when the file system or files which are highly
 fragmented.  There is a new mount option, prefetch_block_bitmaps which
 will pull in the block bitmaps and set up the in-memory buddy bitmaps
 when the file system is initially mounted.
 
 Beyond that, a lot of bug fixes and cleanups.  In particular, a number
 of changes to make ext4 more robust in the face of write errors or
 file system corruptions.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAl8/Q9YACgkQ8vlZVpUN
 gaPz+wgAkiWwpge0pfcukABW9FcHK9R82IPggA/NnFu0I+3trpqVQP8mYWqg+1l7
 X0W6B6GHMcITGdwxVDNGHHv0WabXCqFPT0ENwW1cnl9UL6I91Ev2NjmG9HP6hVZa
 g3+NyXJwiOP38xsxpPJGPoYFw2wZyv8/e41MMnsE6goYjMmB04sHvXCUQkbN41Fn
 3CMdsiueYZDAKflvAlL50Jy7Imz5tq9oy81/z+amqvWo4T0U8zRwQuf25nBAhr25
 1WdT4CbCNGO2Qwyu9X+t/KGNVIQhCctkx/yz71l3p2piEGkw/XE4VJNrkmWb0zN7
 k9F5uGOZlAlQEzx+5PN//Qtz1Db0QQ==
 =E6vv
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "Improvements to ext4's block allocator performance for very large file
  systems, especially when the file system or files which are highly
  fragmented. There is a new mount option, prefetch_block_bitmaps which
  will pull in the block bitmaps and set up the in-memory buddy bitmaps
  when the file system is initially mounted.

  Beyond that, a lot of bug fixes and cleanups. In particular, a number
  of changes to make ext4 more robust in the face of write errors or
  file system corruptions"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (46 commits)
  ext4: limit the length of per-inode prealloc list
  ext4: reorganize if statement of ext4_mb_release_context()
  ext4: add mb_debug logging when there are lost chunks
  ext4: Fix comment typo "the the".
  jbd2: clean up checksum verification in do_one_pass()
  ext4: change to use fallthrough macro
  ext4: remove unused parameter of ext4_generic_delete_entry function
  mballoc: replace seq_printf with seq_puts
  ext4: optimize the implementation of ext4_mb_good_group()
  ext4: delete invalid comments near ext4_mb_check_limits()
  ext4: fix typos in ext4_mb_regular_allocator() comment
  ext4: fix checking of directory entry validity for inline directories
  fs: prevent BUG_ON in submit_bh_wbc()
  ext4: correctly restore system zone info when remount fails
  ext4: handle add_system_zone() failure in ext4_setup_system_zone()
  ext4: fold ext4_data_block_valid_rcu() into the caller
  ext4: check journal inode extents more carefully
  ext4: don't allow overlapping system zones
  ext4: handle error of ext4_setup_system_zone() on remount
  ext4: delete the invalid BUGON in ext4_mb_load_buddy_gfp()
  ...
2020-08-21 11:03:38 -07:00
David Howells 4700c4d80b rxrpc: Fix loss of RTT samples due to interposed ACK
The Rx protocol has a mechanism to help generate RTT samples that works by
a client transmitting a REQUESTED-type ACK when it receives a DATA packet
that has the REQUEST_ACK flag set.

The peer, however, may interpose other ACKs before transmitting the
REQUESTED-ACK, as can be seen in the following trace excerpt:

 rxrpc_tx_data: c=00000044 DATA d0b5ece8:00000001 00000001 q=00000001 fl=07
 rxrpc_rx_ack: c=00000044 00000001 PNG r=00000000 f=00000002 p=00000000 n=0
 rxrpc_rx_ack: c=00000044 00000002 REQ r=00000001 f=00000002 p=00000001 n=0
 ...

DATA packet 1 (q=xx) has REQUEST_ACK set (bit 1 of fl=xx).  The incoming
ping (labelled PNG) hard-acks the request DATA packet (f=xx exceeds the
sequence number of the DATA packet), causing it to be discarded from the Tx
ring.  The ACK that was requested (labelled REQ, r=xx references the serial
of the DATA packet) comes after the ping, but the sk_buff holding the
timestamp has gone and the RTT sample is lost.

This is particularly noticeable on RPC calls used to probe the service
offered by the peer.  A lot of peers end up with an unknown RTT because we
only ever sent a single RPC.  This confuses the server rotation algorithm.

Fix this by caching the information about the outgoing packet in RTT
calculations in the rxrpc_call struct rather than looking in the Tx ring.

A four-deep buffer is maintained and both REQUEST_ACK-flagged DATA and
PING-ACK transmissions are recorded in there.  When the appropriate
response ACK is received, the buffer is checked for a match and, if found,
an RTT sample is recorded.

If a received ACK refers to a packet with a later serial number than an
entry in the cache, that entry is presumed lost and the entry is made
available to record a new transmission.

ACKs types other than REQUESTED-type and PING-type cause any matching
sample to be cancelled as they don't necessarily represent a useful
measurement.

If there's no space in the buffer on ping/data transmission, the sample
base is discarded.

Fixes: 50235c4b5a ("rxrpc: Obtain RTT data by requesting ACKs on DATA packets")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-08-20 17:59:27 +01:00
brookxu 27bc446e2d ext4: limit the length of per-inode prealloc list
In the scenario of writing sparse files, the per-inode prealloc list may
be very long, resulting in high overhead for ext4_mb_use_preallocated().
To circumvent this problem, we limit the maximum length of per-inode
prealloc list to 512 and allow users to modify it.

After patching, we observed that the sys ratio of cpu has dropped, and
the system throughput has increased significantly. We created a process
to write the sparse file, and the running time of the process on the
fixed kernel was significantly reduced, as follows:

Running time on unfixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m2.051s
user    0m0.008s
sys     0m2.026s

Running time on fixed kernel:
[root@TENCENT64 ~]# time taskset 0x01 ./sparse /data1/sparce.dat
real    0m0.471s
user    0m0.004s
sys     0m0.395s

Signed-off-by: Chunguang Xu <brookxu@tencent.com>
Link: https://lore.kernel.org/r/d7a98178-056b-6db5-6bce-4ead23f4a257@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-08-19 12:04:36 -04:00
Juergen Gross e1ac3e66d3 x86/paravirt: Remove set_pte_at() pv-op
On x86 set_pte_at() is now always falling back to set_pte(). So instead
of having this fallback after the paravirt maze just drop the
set_pte_at paravirt operation and let set_pte_at() use the set_pte()
function directly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200815100641.26362-6-jgross@suse.com
2020-08-15 13:52:12 +02:00
Linus Torvalds a1d21081a6 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Some merge window fallout, some longer term fixes:

   1) Handle headroom properly in lapbether and x25_asy drivers, from
      Xie He.

   2) Fetch MAC address from correct r8152 device node, from Thierry
      Reding.

   3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
      from Rouven Czerwinski.

   4) Correct fdputs in socket layer, from Miaohe Lin.

   5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

   6) Fix TCP TFO key reading on big endian, from Jason Baron.

   7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

   8) Fix inet fastreuse optimization with tproxy sockets, from Tim
      Froidcoeur.

   9) Fix 64-bit divide in new SFC driver, from Edward Cree.

  10) Add a tracepoint for prandom_u32 so that we can more easily
      perform usage analysis. From Eric Dumazet.

  11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
  net: openvswitch: introduce common code for flushing flows
  af_packet: TPACKET_V3: fix fill status rwlock imbalance
  random32: add a tracepoint for prandom_u32()
  Revert "ipv4: tunnel: fix compilation on ARCH=um"
  net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
  net: ethernet: stmmac: Disable hardware multicast filter
  net: stmmac: dwmac1000: provide multicast filter fallback
  ipv4: tunnel: fix compilation on ARCH=um
  vsock: fix potential null pointer dereference in vsock_poll()
  sfc: fix ef100 design-param checking
  net: initialize fastreuse on inet_inherit_port
  net: refactor bind_bucket fastreuse into helper
  net: phy: marvell10g: fix null pointer dereference
  net: Fix potential memory leak in proto_register()
  net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
  ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
  net/nfc/rawsock.c: add CAP_NET_RAW check.
  hinic: fix strncpy output truncated compile warnings
  drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
  net/tls: Fix kmap usage
  ...
2020-08-13 20:03:11 -07:00
Eric Dumazet 94c7eb54c4 random32: add a tracepoint for prandom_u32()
There has been some heat around prandom_u32() lately, and some people
were wondering if there was a simple way to determine how often
it was used, before considering making it maybe 10 times more expensive.

This tracepoint exports the generated pseudo random value.

Tested:

perf list | grep prandom_u32
  random:prandom_u32                                 [Tracepoint event]

perf record -a [-g] [-C1] -e random:prandom_u32 sleep 1
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 259.748 MB perf.data (924087 samples) ]

perf report --nochildren
    ...
    97.67%  ksoftirqd/1     [kernel.vmlinux]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |
               |--48.86%--tcp_v4_syn_recv_sock
               |          tcp_check_req
               |          tcp_v4_rcv
               |          ...
                --48.81%--tcp_conn_request
                          tcp_v4_conn_request
                          tcp_rcv_state_process
                          ...
perf script

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-13 15:11:14 -07:00
Linus Torvalds 8cd84b7096 PPC:
* Improvements and bugfixes for secure VM support, giving reduced startup
   time and memory hotplug support.
 * Locking fixes in nested KVM code
 * Increase number of guests supported by HV KVM to 4094
 * Preliminary POWER10 support
 
 ARM:
 * Split the VHE and nVHE hypervisor code bases, build the EL2 code
   separately, allowing for the VHE code to now be built with instrumentation
 * Level-based TLB invalidation support
 * Restructure of the vcpu register storage to accomodate the NV code
 * Pointer Authentication available for guests on nVHE hosts
 * Simplification of the system register table parsing
 * MMU cleanups and fixes
 * A number of post-32bit cleanups and other fixes
 
 MIPS:
 * compilation fixes
 
 x86:
 * bugfixes
 * support for the SERIALIZE instruction
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl8yfuQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNweQgAiEycRbpifAueihK3ScKwYcCFhbHg
 n6KLiFCY3sJRg+ORNb9EuFPJgGygV8DPKbEMvKaGDhNpX3rOpSIrpi5QQ5Hx+WOj
 WHg+aX8Eyy1ys7V84UbiMeZKUbKDDRr0/UOUtJEsF4hiD7s0FgobbQhC/3+awp5k
 sdSTMYlXelep+pjdFX7cNIgjrBNFtqH0ECeuDCcQzDg2zlH+poEPyLaC5+U4RF6r
 pfvcxd6xp50fobo8ro7kMuBeclG3JxLjqqdNrkkHcF1DxROMLLKN7CjHZchYC/BK
 c+S7JHLFnafxiTncMLhv3s4viey05mohW6SxeLw4qcWHfFlz+qyfZwMvZA==
 =d/GI
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "PPC:
   - Improvements and bugfixes for secure VM support, giving reduced
     startup time and memory hotplug support.

   - Locking fixes in nested KVM code

   - Increase number of guests supported by HV KVM to 4094

   - Preliminary POWER10 support

  ARM:
   - Split the VHE and nVHE hypervisor code bases, build the EL2 code
     separately, allowing for the VHE code to now be built with
     instrumentation

   - Level-based TLB invalidation support

   - Restructure of the vcpu register storage to accomodate the NV code

   - Pointer Authentication available for guests on nVHE hosts

   - Simplification of the system register table parsing

   - MMU cleanups and fixes

   - A number of post-32bit cleanups and other fixes

  MIPS:
   - compilation fixes

  x86:
   - bugfixes

   - support for the SERIALIZE instruction"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (70 commits)
  KVM: MIPS/VZ: Fix build error caused by 'kvm_run' cleanup
  x86/kvm/hyper-v: Synic default SCONTROL MSR needs to be enabled
  MIPS: KVM: Convert a fallthrough comment to fallthrough
  MIPS: VZ: Only include loongson_regs.h for CPU_LOONGSON64
  x86: Expose SERIALIZE for supported cpuid
  KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled
  KVM: arm64: Move S1PTW S2 fault logic out of io_mem_abort()
  KVM: arm64: Don't skip cache maintenance for read-only memslots
  KVM: arm64: Handle data and instruction external aborts the same way
  KVM: arm64: Rename kvm_vcpu_dabt_isextabt()
  KVM: arm: Add trace name for ARM_NISV
  KVM: arm64: Ensure that all nVHE hyp code is in .hyp.text
  KVM: arm64: Substitute RANDOMIZE_BASE for HARDEN_EL2_VECTORS
  KVM: arm64: Make nVHE ASLR conditional on RANDOMIZE_BASE
  KVM: PPC: Book3S HV: Rework secure mem slot dropping
  KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up
  KVM: PPC: Book3S HV: Migrate hot plugged memory
  KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs
  KVM: PPC: Book3S HV: Track the state GFNs associated with secure VMs
  KVM: PPC: Book3S HV: Disable page merging in H_SVM_INIT_START
  ...
2020-08-12 12:25:06 -07:00
Anshuman Khandual 1a5bae25e3 mm/vmstat: add events for THP migration without split
Add following new vmstat events which will help in validating THP
migration without split.  Statistics reported through these new VM events
will help in performance debugging.

1. THP_MIGRATION_SUCCESS
2. THP_MIGRATION_FAILURE
3. THP_MIGRATION_SPLIT

In addition, these new events also update normal page migration statistics
appropriately via PGMIGRATE_SUCCESS and PGMIGRATE_FAILURE.  While here,
this updates current trace event 'mm_migrate_pages' to accommodate now
available THP statistics.

[akpm@linux-foundation.org: s/hpage_nr_pages/thp_nr_pages/]
[ziy@nvidia.com: v2]
  Link: http://lkml.kernel.org/r/C5E3C65C-8253-4638-9D3C-71A61858BB8B@nvidia.com
[anshuman.khandual@arm.com: s/thp_nr_pages/hpage_nr_pages/]
  Link: http://lkml.kernel.org/r/1594287583-16568-1-git-send-email-anshuman.khandual@arm.com

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Link: http://lkml.kernel.org/r/1594080415-27924-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:57:57 -07:00
Linus Torvalds 086ba2ec16 f2fs-for-5.9-rc1
In this round, we've added two small interfaces, 1) GC_URGENT_LOW mode for
 performance, and 2) F2FS_IOC_SEC_TRIM_FILE ioctl for security. The new GC
 mode allows Android to run some lower priority GCs in background, while new
 ioctl discards user information without race condition when the account is
 removed. In addition, some patches were merged to address latency-related
 issues. We've fixed some compression-related bug fixes as well as edge race
 conditions.
 
 Enhancement:
  - add GC_URGENT_LOW mode in gc_urgent
  - introduce F2FS_IOC_SEC_TRIM_FILE ioctl
  - bypass racy readahead to improve read latencies
  - shrink node_write lock coverage to avoid long latency
 
 Bug fix:
  - fix missing compression flag control, i_size, and mount option
  - fix deadlock between quota writes and checkpoint
  - remove inode eviction path in synchronous path to avoid deadlock
  - fix to wait GCed compressed page writeback
  - fix a kernel panic in f2fs_is_compressed_page
  - check page dirty status before writeback
  - wait page writeback before update in node page write flow
  - fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry
 
 We've added some minor sanity checks and refactored trivial code blocks for
 better readability and debugging information.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAl8xhqMACgkQQBSofoJI
 UNLYgA//WMoOqBACDuOWwYmgQ8oq4vrH2LOwssZF9/77vEfaHKc+TSq1il54lcUl
 MPEx7FK54CnfT8VLLR5ByobZFyH9FFeAw4FN4LBhcfE8jh8ysAGjeoZjwfcmJF6R
 cVKtn8ltUpgH3IEUuPjTiKkVNHfVJxuuL3zHbg1CEl+AkR6NJ/U9kNLwf7ZgPWq2
 I0qwyXRlUIEChhyPZB+Y6RsdGjkeievKld56DMCgG73f4yHRO/yBcrfsN875sGdM
 ALL+mYiunMT6aXcfoiQiAjeImoNajuflI6Zso2Sk8Vl6sBj0QwAuEBF5x1Z5e1mt
 YVYNuC4ucqsDBKpOqtsPP0MFTC2T5Rr9wWXjqv+9TjN7zvJ8xx+zDWtQxvI2bpqB
 4ZRxaJP45aThLYh/SEYDmj+ppyPtfLDeG0HzUkwMmuopf9eg+kxGPjBsZewgkCKg
 kmMKU0P7deGlkrWLUcz2vm0Lso+ieGm0IeLOQl9/rOLu3IQQFia0Vla7dLDgqF0P
 sz+udIiBztC3JPEmEZhfayA6P6e1TyWQUdquL08jp+DZD17gPqcaZDhHr62U5rmK
 7zoiZqkR03SbNaFhBhhoVOaAVcAnF0pSIgqzkCa3dVXxp1QV+JfD9CGR9NFyiIqB
 HK5RPFskIUCg0K2LSaAKbyoFWa/fJ8ZD8/CbFWcnXfWzoaaSkmc=
 =PjaF
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added two small interfaces: (a) GC_URGENT_LOW
  mode for performance and (b) F2FS_IOC_SEC_TRIM_FILE ioctl for
  security.

  The new GC mode allows Android to run some lower priority GCs in
  background, while new ioctl discards user information without race
  condition when the account is removed.

  In addition, some patches were merged to address latency-related
  issues. We've fixed some compression-related bug fixes as well as edge
  race conditions.

  Enhancements:
   - add GC_URGENT_LOW mode in gc_urgent
   - introduce F2FS_IOC_SEC_TRIM_FILE ioctl
   - bypass racy readahead to improve read latencies
   - shrink node_write lock coverage to avoid long latency

  Bug fixes:
   - fix missing compression flag control, i_size, and mount option
   - fix deadlock between quota writes and checkpoint
   - remove inode eviction path in synchronous path to avoid deadlock
   - fix to wait GCed compressed page writeback
   - fix a kernel panic in f2fs_is_compressed_page
   - check page dirty status before writeback
   - wait page writeback before update in node page write flow
   - fix a race condition between f2fs_write_end_io and f2fs_del_fsync_node_entry

  We've added some minor sanity checks and refactored trivial code
  blocks for better readability and debugging information"

* tag 'f2fs-for-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (52 commits)
  f2fs: prepare a waiter before entering io_schedule
  f2fs: update_sit_entry: Make the judgment condition of f2fs_bug_on more intuitive
  f2fs: replace test_and_set/clear_bit() with set/clear_bit()
  f2fs: make file immutable even if releasing zero compression block
  f2fs: compress: disable compression mount option if compression is off
  f2fs: compress: add sanity check during compressed cluster read
  f2fs: use macro instead of f2fs verity version
  f2fs: fix deadlock between quota writes and checkpoint
  f2fs: correct comment of f2fs_exist_written_data
  f2fs: compress: delay temp page allocation
  f2fs: compress: fix to update isize when overwriting compressed file
  f2fs: space related cleanup
  f2fs: fix use-after-free issue
  f2fs: Change the type of f2fs_flush_inline_data() to void
  f2fs: add F2FS_IOC_SEC_TRIM_FILE ioctl
  f2fs: should avoid inode eviction in synchronous path
  f2fs: segment.h: delete a duplicated word
  f2fs: compress: fix to avoid memory leak on cc->cpages
  f2fs: use generic names for generic ioctls
  f2fs: don't keep meta inode pages used for compressed block migration
  ...
2020-08-10 18:33:22 -07:00
Linus Torvalds 7a6b60441f Highlights:
- Support for user extended attributes on NFS (RFC 8276)
 - Further reduce unnecessary NFSv4 delegation recalls
 
 Notable fixes:
 
 - Fix recent krb5p regression
 - Address a few resource leaks and a rare NULL dereference
 
 Other:
 
 - De-duplicate RPC/RDMA error handling and other utility functions
 - Replace storage and display of kernel memory addresses by tracepoints
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAl8oBt0ACgkQM2qzM29m
 f5dTFQ/9H72E6gr1onsia0/Py0CO8F9qzLgmUBl1vVYAh2/vPqUL1ypxrC5OYrAy
 TOqESTsJvmGluCFc/77XUTD7NvJY3znIWim49okwDiyee4Y14ZfRhhCxyyA6Z94E
 FjJQb5TbF1Mti4X3dN8Gn7O1Y/BfTjDAAXnXGlTA1xoLcxM5idWIj+G8x0bPmeDb
 2fTbgsoETu6MpS2/L6mraXVh3d5ESOJH+73YvpBl0AhYPzlNASJZMLtHtd+A/JbO
 IPkMP/7UA5DuJtWGeuQ4I4D5bQNpNWMfN6zhwtih4IV5bkRC7vyAOLG1R7w9+Ufq
 58cxPiorMcsg1cHnXG0Z6WVtbMEdWTP/FzmJdE5RC7DEJhmmSUG/R0OmgDcsDZET
 GovPARho01yp80GwTjCIctDHRRFRL4pdPfr8PjVHetSnx9+zoRUT+D70Zeg/KSy2
 99gmCxqSY9BZeHoiVPEX/HbhXrkuDjUSshwl98OAzOFmv6kbwtLntgFbWlBdE6dB
 mqOxBb73zEoZ5P9GA2l2ShU3GbzMzDebHBb9EyomXHZrLejoXeUNA28VJ+8vPP5S
 IVHnEwOkdJrNe/7cH4jd/B0NR6f8Da/F9kmkLiG2GNPMqQ8bnVhxTUtZkcAE+fd4
 f34qLxsoht70wSSfISjBs7hP5KxEM1lOAf0w0RpycPUKJNV1FB0=
 =OEpF
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull NFS server updates from Chuck Lever:
 "Highlights:
   - Support for user extended attributes on NFS (RFC 8276)
   - Further reduce unnecessary NFSv4 delegation recalls

  Notable fixes:
   - Fix recent krb5p regression
   - Address a few resource leaks and a rare NULL dereference

  Other:
   - De-duplicate RPC/RDMA error handling and other utility functions
   - Replace storage and display of kernel memory addresses by tracepoints"

* tag 'nfsd-5.9' of git://git.linux-nfs.org/projects/cel/cel-2.6: (38 commits)
  svcrdma: CM event handler clean up
  svcrdma: Remove transport reference counting
  svcrdma: Fix another Receive buffer leak
  SUNRPC: Refresh the show_rqstp_flags() macro
  nfsd: netns.h: delete a duplicated word
  SUNRPC: Fix ("SUNRPC: Add "@len" parameter to gss_unwrap()")
  nfsd: avoid a NULL dereference in __cld_pipe_upcall()
  nfsd4: a client's own opens needn't prevent delegations
  nfsd: Use seq_putc() in two functions
  svcrdma: Display chunk completion ID when posting a rw_ctxt
  svcrdma: Record send_ctxt completion ID in trace_svcrdma_post_send()
  svcrdma: Introduce Send completion IDs
  svcrdma: Record Receive completion ID in svc_rdma_decode_rqst
  svcrdma: Introduce Receive completion IDs
  svcrdma: Introduce infrastructure to support completion IDs
  svcrdma: Add common XDR encoders for RDMA and Read segments
  svcrdma: Add common XDR decoders for RDMA and Read segments
  SUNRPC: Add helpers for decoding list discriminators symbolically
  svcrdma: Remove declarations for functions long removed
  svcrdma: Clean up trace_svcrdma_send_failed() tracepoint
  ...
2020-08-09 13:58:04 -07:00