Commit Graph

53291 Commits

Author SHA1 Message Date
Richard Weinberger aac17948a7 ubifs: Check ubifs_wbuf_sync() return code
If ubifs_wbuf_sync() fails we must not write a master node with the
dirty marker cleared.
Otherwise it is possible that in case of an IO error while syncing we
mark the filesystem as clean and UBIFS refuses to recover upon next
mount.

Cc: <stable@vger.kernel.org>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Signed-off-by: Richard Weinberger <richard@nod.at>
2018-04-04 23:41:44 +02:00
Linus Torvalds 3e968c9f14 Cleanups and bugfixes for ext4, including some fixes to make ext4 more
robust against maliciously crafted file system images.  (I still don't
 recommend that container folks hold any delusions that mounting
 arbitary images that can be crafted by malicious attackers should be
 considered sane thing to do, though!)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlrE13oACgkQ8vlZVpUN
 gaM/fQf/bw5IDWNZ9vAYNEBDsu4Je14XVTZVgpDFESk5BGjdVusAv562QCQjC3Pf
 O3uqRSh7DbulICzYGy0BmjM7Gw0Kj5WhfDjPbVXhnGoNpF7NUEhNos8xWLRWukky
 P8mPxv8cj2oprtfM1Cy8mtl0KT8jywkOahq6OmRo6+F1V32IzQAfpPpItIw9lCv8
 JcFgc/2K9zjAG89TR/vWChi8AGhNUXsYEYt+l3tMu+CRC+FWaJ7aEGPn9pmOlqOV
 svHG9skImvINOO4FPydOu+yw9spqFKBX9NO0J9MsCAHmmmKW2GW3RANGIhY9kgBb
 a/T+/cQZTTc+Xu6VzRf4e/SGUa5mgA==
 =HndN
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "Cleanups and bugfixes for ext4, including some fixes to make ext4 more
  robust against maliciously crafted file system images.

  (I still don't recommend that container folks hold any delusions that
  mounting arbitary images that can be crafted by malicious attackers
  should be considered sane thing to do, though!)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (29 commits)
  ext4: force revalidation of directory pointer after seekdir(2)
  ext4: add extra checks to ext4_xattr_block_get()
  ext4: add bounds checking to ext4_xattr_find_entry()
  ext4: move call to ext4_error() into ext4_xattr_check_block()
  ext4: don't show data=<mode> option if defaulted
  ext4: omit init_itable=n in procfs when disabled
  ext4: show more binary mount options in procfs
  ext4: simplify kobject usage
  ext4: remove unused parameters in sysfs code
  ext4: null out kobject* during sysfs cleanup
  ext4: don't allow r/w mounts if metadata blocks overlap the superblock
  ext4: always initialize the crc32c checksum driver
  ext4: fail ext4_iget for root directory if unallocated
  ext4: limit xattr size to INT_MAX
  ext4: add validity checks for bitmap block numbers
  ext4: fix comments in ext4_swap_extents()
  ext4: use generic_writepages instead of __writepage/write_cache_pages
  ext4: don't complain about incorrect features when probing
  ext4: remove EXT4_STATE_DIOREAD_LOCK flag
  ext4: fix offset overflow on 32-bit archs in ext4_iomap_begin()
  ...
2018-04-04 14:19:24 -07:00
Linus Torvalds a8f8e8ac76 various SMB3/CIFS fixes for 4.17
-----BEGIN PGP SIGNATURE-----
 
 iQGwBAABCAAaBQJaxRzZExxzbWZyZW5jaEBnbWFpbC5jb20ACgkQiiy9cAdyT1Fb
 FQv/Rd/5CrYhZumBrPvFW2jcbRQ2ANTnSRTA3rpd/jJM52DZ7nvcePr/qm9wRLrT
 puMfd8e0a4Df5Mo/ns806iphRtYctKpMKLnkBqPL0WqrXLYSi/Nz3wy/DFyuh3C7
 U22gDYAjQ4dy6Am0CG/y4i1h8D0hRkmMS6PQECpjmNwqjtmfZn5kWJRv+W5UNNj9
 QPldz5PdyNpPw7DxDRetl5uGqUKqsvUATo109hL7ks97qgHUzMHeXWmQpOSS+exh
 P7tPNphIPJYM2VG+uDvIg15l00lgQxzzN0uOs+x7ZDnZ1Bil/a3So823SfJyXNU2
 utJNWSuN/OSHUCmmd7yn+rLd2oJa55+U+Bb3gWaZ8beP639d8P1kEF/isJzu6ede
 gh92lyU2ecfyHNbjKzAbQwwQxnkmMC5XGhP/2+eawsCyo+vk/NQR4NIehqawj/OB
 eQSRnT0vNgi4p4OXgJ3FpBBKNBFtTmWrxqKyl0U9C5nw+YdxBdkr16qlFi2b9DjC
 bW0/
 =VW5l
 -----END PGP SIGNATURE-----

Merge tag '4.17-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:
 "Includes SMB3.11 security improvements, as well as various fixes for
  stable and some debugging improvements"

* tag '4.17-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add minor debug message during negprot
  smb3: Fix root directory when server returns inode number of zero
  cifs: fix sparse warning on previous patch in a few printks
  cifs: add server->vals->header_preamble_size
  cifs: smbd: disconnect transport on RDMA errors
  cifs: smbd: avoid reconnect lockup
  Don't log confusing message on reconnect by default
  Don't log expected error on DFS referral request
  fs: cifs: Replace _free_xid call in cifs_root_iget function
  SMB3.1.1 dialect is no longer experimental
  Tree connect for SMB3.1.1 must be signed for non-encrypted shares
  fix smb3-encryption breakage when CONFIG_DEBUG_SG=y
  CIFS: fix sha512 check in cifs_crypto_secmech_release
  CIFS: implement v3.11 preauth integrity
  CIFS: add sha512 secmech
  CIFS: refactor crypto shash/sdesc allocation&free
  Update README file for cifs.ko
  Update TODO list for cifs.ko
  cifs: fix memory leak in SMB2_open()
  CIFS: SMBD: fix spelling mistake: "faield" and "legnth"
2018-04-04 14:09:27 -07:00
Boris Brezillon fe5f31a801 Linux 4.16-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlqKKI0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGRNAH/0v3+nuJ0oiHE1Cl
 fH89F9Ma17j8oTo28byRPi7X5XJfJAqANhHa209rguvnC27y3ew/l9k93HoxG12i
 ttvyKFDQulQbytfJZXw8lhUyYGXVsTpyNaihPe/NtqPdIxNgfrXsUN9EIEtcnuS2
 SiAj51jUySDRNR4ST6TOx4ulDm1zLrmA28WHOBNOTvDi4jTQMt1TsngHfF5AySBB
 lD4RTRDDwWDWtdMI7euYSq019TiDXCxmwQ94vZjrqmjmSQcl/yCK/JzEV33SZslg
 4WqGIllxONvP/UlwxZLaJ+RrslqxNgDVqQKwJdfYhGaWvpgPFtS1s86zW6IgyXny
 02jJfD0=
 =DLWn
 -----END PGP SIGNATURE-----

Merge tag 'v4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into mtd/next

Backmerge v4.16-rc2 into mtd/next to resolve a conflict between Linus'
master branch and nand/for-4.17.
2018-04-04 22:13:35 +02:00
Linus Torvalds 2bd99df54f We've only got 9 GFS2 patches for this merge window:
1. Abhi Das contributed a patch to report journal recovery times more
    accurately during journal replay.
 2. Andreas Gruenbacher contributed a patch to fix fallocate chunk size.
 3. Andreas added a patch to correctly dirty inodes during rename.
 4. Andreas added a patch to improve the comment for function gfs2_block_map.
 5. Andreas added a patch to improve kernel trace point iomap end:
    The physical block address was added.
 6. Andreas added a patch to fix a nasty file system corruption bug that
    surfaced in xfstests 476 in punch-hole/truncate.
 7. Andreas fixed a problem Christoph Helwig pointed out, namely, that GFS2
    was misusing the IOMAP_ZERO flag. The zeroing of new blocks was moved
    to the proper fallocate code.
 8. I contributed a patch to declare function gfs2_remove_from_ail as static.
 9. I added a patch to only set PageChecked for jdata page writes.
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJaw7mxAAoJENeLYdPf93o7kbUH/3MVpJKJe+jdtb5zMGE+byPZ
 T4x+kjZcGwIjWtqIMnga2X1mMIpJ/MCQzZriT+zno2BRRxPS8MQsKp9rcmEGiwoP
 mKZnVGm85xvBDg/JaAnCOeH2S90BlXX3roaB4U4031tAmcZOz/tVbBiTCBXbZ7FP
 F9tnKt/CkRwBBE8CBvstpgYYvbM+kkfiQ8x38FJFx79lOBytIRhPCHpbR3mgko+p
 HrJNTadGJie65nsplKZcQpDQTPrslwP/ynyGh313ReztThHfCB6teL6sNOgJ5u+T
 ThCA95QesH3P45Dd+xl/SuCDd0pZLLslNgOl/q7qiFWaoxFFBKldN3Bnc7ItnDw=
 =iw4e
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-4.17.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Bob Peterson:
 "We've only got nine GFS2 patches for this merge window:

   - report journal recovery times more accurately during journal replay
     (Abhi Das)

   - fix fallocate chunk size (Andreas Gruenbacher)

   - correctly dirty inodes during rename (Andreas Gruenbacher)

   - improve the comment for function gfs2_block_map (Andreas
     Gruenbacher)

   - improve kernel trace point iomap end: The physical block address
     was added (Andreas Gruenbacher)

   - fix a nasty file system corruption bug that surfaced in xfstests
     476 in punch-hole/truncate (Andreas Gruenbacher)

   - fix a problem Christoph Helwig pointed out, namely, that GFS2 was
     misusing the IOMAP_ZERO flag. The zeroing of new blocks was moved
     to the proper fallocate code (Andreas Gruenbacher)

   - declare function gfs2_remove_from_ail as static (Bob Peterson)

   - only set PageChecked for jdata page writes (Bob Peterson)"

* tag 'gfs2-4.17.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: time journal recovery steps accurately
  gfs2: Zero out fallocated blocks in fallocate_chunk
  gfs2: Check for the end of metadata in punch_hole
  gfs2: gfs2_iomap_end tracepoint: log block address
  gfs2: Improve gfs2_block_map comment
  GFS2: Only set PageChecked for jdata pages
  GFS2: Make function gfs2_remove_from_ail static
  gfs2: Dirty source inode during rename
  gfs2: Fix fallocate chunk size
2018-04-04 13:09:42 -07:00
Linus Torvalds 94514bbe9e for-4.17-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlrDg0UACgkQxWXV+ddt
 WDvtOQ//QCk0zH2EcPQnqOW6HqAfkkDc7D9P51sK1izNM3vBEYtbuPlY6wp3xnJr
 0hbPjGNU7vMC/4SIwSgEdXyAvlpOr1gm9n+w1GcxpkjLa8l4P+2wt9OX0BSzRUMu
 X7LQxqg2zmQibFy4b1MSmDGsO2dxB2eqvVUT/Ir4b56uqkdtValYRWY75APJIZ5l
 6w0Ja3HVvgOX3pVwSmadCpfMEonN4JE+mfHaP8RajAlTGQcUPq8If9w4BtEoWQRl
 QC7kUCCTmp+isnzH7u4EqEQC6XUqEqeuQH+Bli1pNYTvipHY+9EO1dSxHZoCelgk
 M9PpQz8x+N6ZMcMNtJQVifkfN6tAp/acdWBTZtlpqB8nZR4v5bBndS/5TNBMu3/v
 JfhMEsIUz5o2mWz9qUneyK80RoTRkL5SfdgOclx2Yd7K1fKuJsUjmdXaT08BzotS
 5cxTFZu7EcFJyg4eHemjdyRYr3cUkS19P2uIJle9nj3DAMpCvIyL1c1vn5eB/7MN
 3JeRME6AOQcD7sFgVNuYhGVdIBuwHU6kj4mf1WN27YKGdbsaMZsFz7/HH3SsBLOF
 E0p7Q25HcHSrAimcmLTifN+gol8Y5m6dT0Pjuf9y9QbWvHwh7oRq7DVQ3L8hJkGz
 j64b0jlv43P0QorWvsX6/VJBevWedhe1hZtrBhPFSE0CtJ3Ml4Q=
 =wkSk
 -----END PGP SIGNATURE-----

Merge tag 'for-4.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "There are a several user visible changes, the rest is mostly invisible
  and continues to clean up the whole code base.

  User visible changes:
   - new mount option nossd_spread (pair for ssd_spread)

   - mount option subvolid will detect junk after the number and fail
     the mount

   - add message after cancelled device replace

   - direct module dependency on libcrc32, removed own crc wrappers

   - removed user space transaction ioctls

   - use lighter locking when reading /proc/self/mounts, RCU instead of
     mutex to avoid unnecessary contention

  Enhancements:
   - skip writeback of last page when truncating file to same size

   - send: do not issue unnecessary truncate operations

   - mount option token specifiers: use %u for unsigned values, more
     validation

   - selftests: more tree block validations

  qgroups:
   - preparatory work for splitting reservation types for data and
     metadata, this should allow for more accurate tracking and fix some
     issues with underflows or do further enhancements

   - split metadata reservations for started and joined transaction so
     they do not get mixed up and are accounted correctly at commit time

   - with the above, it's possible to revert patch that potentially
     deadlocks when trying to make more space by explicitly committing
     when the quota limit is hit

   - fix root item corruption when multiple same source snapshots are
     created with quota enabled

  RAID56:
   - make sure target is identical to source when raid56 rebuild fails
     after dev-replace

   - faster rebuild during scrub, batch by stripes and not
     block-by-block

   - make more use of cached data when rebuilding from a missing device

  Fixes:
   - null pointer deref when device replace target is missing

   - fix fsync after hole punching when using no-holes feature

   - fix lockdep splat when allocating percpu data with wrong GFP flags

  Cleanups, refactoring, core changes:
   - drop redunant parameters from various functions

   - kill and opencode trivial helpers

   - __cold/__exit function annotations

   - dead code removal

   - continued audit and documentation of memory barriers

   - error handling: handle removal from uuid tree

   - error handling: remove handling of impossible condtitons

   - more debugging or error messages

   - updated tracepoints

   - one VLA use removal (and one still left)"

* tag 'for-4.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (164 commits)
  btrfs: lift errors from add_extent_changeset to the callers
  Btrfs: print error messages when failing to read trees
  btrfs: user proper type for btrfs_mask_flags flags
  btrfs: split dev-replace locking helpers for read and write
  btrfs: remove stale comments about fs_mutex
  btrfs: use RCU in btrfs_show_devname for device list traversal
  btrfs: update barrier in should_cow_block
  btrfs: use lockdep_assert_held for mutexes
  btrfs: use lockdep_assert_held for spinlocks
  btrfs: Validate child tree block's level and first key
  btrfs: tests/qgroup: Fix wrong tree backref level
  Btrfs: fix copy_items() return value when logging an inode
  Btrfs: fix fsync after hole punching when using no-holes feature
  btrfs: use helper to set ulist aux from a qgroup
  Revert "btrfs: qgroups: Retry after commit on getting EDQUOT"
  btrfs: qgroup: Update trace events for metadata reservation
  btrfs: qgroup: Use root::qgroup_meta_rsv_* to record qgroup meta reserved space
  btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and item
  btrfs: qgroup: Use separate meta reservation type for delalloc
  btrfs: qgroup: Introduce function to convert META_PREALLOC into META_PERTRANS
  ...
2018-04-04 13:03:38 -07:00
Linus Torvalds 547c43d777 Changes for this release:
- Various cleanups and code fixes
 - Implement lazytime as a mount option
 - Convert various on-disk metadata checks from asserts to -EFSCORRUPTED
 - Fix accounting problems with the rmap per-ag reservations
 - Refactorings and cleanups for xfs_log_force
 - Various bugfixes for the reflink code
 - Work around v5 AGFL padding problems to prevent fs shutdowns
 - Establish inode fork verifiers to inspect on-disk metadata correctness
 - Various online scrub fixes
 - Fix v5 swapext blowing up on deleted inodes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCgAGBQJawZs5AAoJEPh/dxk0SrTrUbAQAKCT0zaYDHViC6p0yxVMTa1z
 7fivnwtKNYc2LiihV6wPp+Hj5YtTGExYncJOLuTsAIuBZ6px+jlV9bpA8X9mWgbN
 e5XXyqz1O8nn/5iBwKRQm2yFdSnsQQfWXNm0XPNTuPGxuzlzxF/rpFN4UlWGdZul
 tigHom5gZD//GYfYHrsOb/7CIRGw90ebpqM3Nt4eAi5o0H5eK46sHKUYtAngSfPm
 FdPHJwmw5Kx+yZW5EdR+ELbLqGsBKsOfsp9SG+un0R+kvj/CKC2ovgwS6tuU+gsi
 MRD8C0zHlz4ikQrmJ0bV+no7T+9bC8fQDIZu0h7dQ1acWb2F1Epr1LRIxNH/1bLi
 qbtchVZkCNXiV0GMQ2iNo1cDJO3AICsQwTuktpoUMU1QOWgQenvzdZCUOQAUqne6
 xwnrCq19UbmNlCdkRWChrVn9Gb7FNYVhe15W/y0qZhzJxWam6yIzKBm91Zc/XLp8
 L5VUc+FVmtSiHXpEVttSwVeMSzhDfG6qOL42dFmw7xwh7JO/vXi0MlxjGe215ApS
 lhBWjEOGB9kbUxMjhqS5KsFn8E1DhL0AMD7N53z7eBTh5Eani81ytf1PzXWhvLbI
 1auY0+7cVggXFltcW6rfAJFC0EEuw6wsx86rl3G+dQ9vmlhy4zaWlt0EJEGmNC90
 Kw4GpFLDmtV93K++lD1C
 =fdIf
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.17-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Darrick Wong:
 "Here's the first round of fixes for XFS for 4.17.

  The biggest new features this time around are the addition of lazytime
  support, further enhancement of the on-disk inode metadata verifiers,
  and a patch to smooth over some of the AGFL padding problems that have
  intermittently plagued users since 4.5. I forsee sending a second pull
  request next week with further bug fixes and speedups in the online
  scrub code and elsewhere.

  This series has been run through a full xfstests run over the weekend
  and through a quick xfstests run against this morning's master, with
  no major failures reported.

  Summary of changes for this release:

   - Various cleanups and code fixes

   - Implement lazytime as a mount option

   - Convert various on-disk metadata checks from asserts to -EFSCORRUPTED

   - Fix accounting problems with the rmap per-ag reservations

   - Refactorings and cleanups for xfs_log_force

   - Various bugfixes for the reflink code

   - Work around v5 AGFL padding problems to prevent fs shutdowns

   - Establish inode fork verifiers to inspect on-disk metadata
     correctness

   - Various online scrub fixes

   - Fix v5 swapext blowing up on deleted inodes"

* tag 'xfs-4.17-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (49 commits)
  xfs: do not log/recover swapext extent owner changes for deleted inodes
  xfs: clean up xfs_mount allocation and dynamic initializers
  xfs: remove dead inode version setting code
  xfs: catch inode allocation state mismatch corruption
  xfs: xfs_scrub_iallocbt_xref_rmap_inodes should use xref_set_corrupt
  xfs: flag inode corruption if parent ptr doesn't get us a real inode
  xfs: don't accept inode buffers with suspicious unlinked chains
  xfs: move inode extent size hint validation to libxfs
  xfs: record inode buf errors as a xref error in inobt scrubber
  xfs: remove xfs_buf parameter from inode scrub methods
  xfs: inode scrubber shouldn't bother with raw checks
  xfs: bmap scrubber should do rmap xref with bmap for sparse files
  xfs: refactor inode buffer verifier error logging
  xfs: refactor inode verifier error logging
  xfs: refactor bmap record validation
  xfs: sanity-check the unused space before trying to use it
  xfs: detect agfl count corruption and reset agfl
  xfs: unwind the try_again loop in xfs_log_force
  xfs: refactor xfs_log_force_lsn
  xfs: minor cleanup for xfs_reflink_end_cow
  ...
2018-04-04 12:44:02 -07:00
Linus Torvalds 2e08edc5c5 Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs dcache updates from Al Viro:
 "Part of this is what the trylock loop elimination series has turned
  into, part making d_move() preserve the parent (and thus the path) of
  victim, plus some general cleanups"

* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (22 commits)
  d_genocide: move export to definition
  fold dentry_lock_for_move() into its sole caller and clean it up
  make non-exchanging __d_move() copy ->d_parent rather than swap them
  oprofilefs: don't oops on allocation failure
  lustre: get rid of pointless casts to struct dentry *
  debugfs_lookup(): switch to lookup_one_len_unlocked()
  fold lookup_real() into __lookup_hash()
  take out orphan externs (empty_string/slash_string)
  split d_path() and friends into a separate file
  dcache.c: trim includes
  fs/dcache: Avoid a try_lock loop in shrink_dentry_list()
  get rid of trylock loop around dentry_kill()
  handle move to LRU in retain_dentry()
  dput(): consolidate the "do we need to retain it?" into an inlined helper
  split the slow part of lock_parent() off
  now lock_parent() can't run into killed dentry
  get rid of trylock loop in locking dentries on shrink list
  d_delete(): get rid of trylock loop
  fs/dcache: Move dentry_kill() below lock_parent()
  fs/dcache: Remove stale comment from dentry_kill()
  ...
2018-04-04 12:05:25 -07:00
David Howells 402cb8dda9 fscache: Attach the index key and aux data to the cookie
Attach copies of the index key and auxiliary data to the fscache cookie so
that:

 (1) The callbacks to the netfs for this stuff can be eliminated.  This
     can simplify things in the cache as the information is still
     available, even after the cache has relinquished the cookie.

 (2) Simplifies the locking requirements of accessing the information as we
     don't have to worry about the netfs object going away on us.

 (3) The cache can do lazy updating of the coherency information on disk.
     As long as the cache is flushed before reboot/poweroff, there's no
     need to update the coherency info on disk every time it changes.

 (4) Cookies can be hashed or put in a tree as the index key is easily
     available.  This allows:

     (a) Checks for duplicate cookies can be made at the top fscache layer
     	 rather than down in the bowels of the cache backend.

     (b) Caching can be added to a netfs object that has a cookie if the
     	 cache is brought online after the netfs object is allocated.

A certain amount of space is made in the cookie for inline copies of the
data, but if it won't fit there, extra memory will be allocated for it.

The downside of this is that live cache operation requires more memory.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Anna Schumaker <anna.schumaker@netapp.com>
Tested-by: Steve Dickson <steved@redhat.com>
2018-04-04 13:41:28 +01:00
David Howells 08c2e3d087 fscache: Add more tracepoints
Add more tracepoints to fscache, including:

 (*) fscache_page - Tracks netfs pages known to fscache.

 (*) fscache_check_page - Tracks the netfs querying whether a page is
     pending storage.

 (*) fscache_wake_cookie - Tracks cookies being woken up after a page
     completes/aborts storage in the cache.

 (*) fscache_op - Tracks operations being initialised.

 (*) fscache_wrote_page - Tracks return of the backend write_page op.

 (*) fscache_gang_lookup - Tracks lookup of pages to be stored in the write
     operation.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:27 +01:00
David Howells a18feb5576 fscache: Add tracepoints
Add some tracepoints to fscache:

 (*) fscache_cookie - Tracks a cookie's usage count.

 (*) fscache_netfs - Logs registration of a network filesystem, including
     the pointer to the cookie allocated.

 (*) fscache_acquire - Logs cookie acquisition.

 (*) fscache_relinquish - Logs cookie relinquishment.

 (*) fscache_enable - Logs enablement of a cookie.

 (*) fscache_disable - Logs disablement of a cookie.

 (*) fscache_osm - Tracks execution of states in the object state machine.

and cachefiles:

 (*) cachefiles_ref - Tracks a cachefiles object's usage count.

 (*) cachefiles_lookup - Logs result of lookup_one_len().

 (*) cachefiles_mkdir - Logs result of vfs_mkdir().

 (*) cachefiles_create - Logs result of vfs_create().

 (*) cachefiles_unlink - Logs calls to vfs_unlink().

 (*) cachefiles_rename - Logs calls to vfs_rename().

 (*) cachefiles_mark_active - Logs an object becoming active.

 (*) cachefiles_wait_active - Logs a wait for an old object to be
     destroyed.

 (*) cachefiles_mark_inactive - Logs an object becoming inactive.

 (*) cachefiles_mark_buried - Logs the burial of an object.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:27 +01:00
David Howells 2c98425720 fscache: Fix hanging wait on page discarded by writeback
If the fscache asynchronous write operation elects to discard a page that's
pending storage to the cache because the page would be over the store limit
then it needs to wake the page as someone may be waiting on completion of
the write.

The problem is that the store limit may be updated by a different
asynchronous operation - and so may miss the write - and that the store
limit may not even get updated until later by the netfs.

Fix the kernel hang by making fscache_write_op() mark as written any pages
that are over the limit.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:26 +01:00
David Howells d0fb31ecda fscache: Detect multiple relinquishment of a cookie
Report if an fscache cookie is relinquished multiple times by the netfs.

Signed-off-by: David <dhowells@redhat.com>
2018-04-04 13:41:26 +01:00
David Howells b27ddd4624 fscache: Pass the correct cancelled indications to fscache_op_complete()
The last parameter to fscache_op_complete() is a bool indicating whether or
not the operation was cancelled.  A lot of the time the inverse value is
given or no differentiation is made.  Fix this.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:26 +01:00
David Howells bfa3837ec3 fscache, cachefiles: Fix checker warnings
Fix a couple of checker warnings in fscache and cachefiles:

 (1) fscache_n_op_requeue is never used, so get rid of it.

 (2) cachefiles_uncache_page() is passed in a lock that it releases, so
     this needs annotating.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:26 +01:00
David Howells 678edd09c2 afs: Be more aggressive in retiring cached vnodes
When relinquishing cookies, either due to iget failure or to inode
eviction, retire a cookie if we think the corresponding vnode got deleted
on the server rather than just letting it lie in the cache.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:26 +01:00
David Howells 27a3ee3a04 afs: Use the vnode ID uniquifier in the cache key not the aux data
AFS vnodes (files) are referenced by a triplet of { volume ID, vnode ID,
uniquifier }.  Currently, kafs is only using the vnode ID as the file key
in the volume fscache index and checking the uniquifier on cookie
acquisition against the contents of the auxiliary data stored in the cache.

Unfortunately, this is subject to a race in which an FS.RemoveFile or
FS.RemoveDir op is issued against the server but the local afs inode isn't
torn down and disposed off before another thread issues something like
FS.CreateFile.  The latter then gets given the vnode ID that just got
removed, but with a new uniquifier and a cookie collision occurs in the
cache because the cookie is only keyed on the vnode ID whereas the inode is
keyed on the vnode ID plus the uniquifier.

Fix this by keying the cookie on the uniquifier in addition to the vnode ID
and dropping the uniquifier from the auxiliary data supplied.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:25 +01:00
David Howells c1515999bd afs: Invalidate cache on server data change
Invalidate any data stored in fscache for a vnode that changes on the
server so that we don't end up with the cache in a bad state locally.

Signed-off-by: David Howells <dhowells@redhat.com>
2018-04-04 13:41:25 +01:00
Martin Brandenburg 209469d978 orangefs: remove unused code
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-03 21:55:28 -04:00
Martin Brandenburg bdd6f08358 orangefs: make several *_operations structs static
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-03 21:55:27 -04:00
Martin Brandenburg a5135eeab2 orangefs: implement vm_ops->fault
Must retrieve size before running filemap_fault so the kernel has
an up-to-date size.

This should have been caught by xfstests generic/246, but it was masked
by orangefs_new_inode, which set i_size to PAGE_SIZE.  When nothing
caused a getattr prior to a pagefault, i_size was still PAGE_SIZE.
Since xfstests only read 10 bytes, it did not catch this bug.

When orangefs_new_inode was modified to perform a getattr instead,
i_size was set to zero, as it was a newly created file.  Then
orangefs_file_write_iter did NOT set i_size.  Instead it invalidated the
attribute cache, which should have caused the next caller to retrieve
i_size.  But the fault handler did not know it was supposed to retrieve
i_size.  So during xfstests, i_size was still zero, and filemap_fault
returned VM_FAULT_SIGBUS.

Fixes xfstests generic/452.

Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-03 21:55:27 -04:00
Jaegeuk Kim 214c2461a8 f2fs: remain written times to update inode during fsync
This fixes xfstests/generic/392.

The failure was caused by different times between 1) one marked in the last
fsync(2) call and 2) the other given by roll-forward recovery after power-cut.
The reason was that we skipped updating inode block at 1), since its i_size
was recoverable along with 4KB-aligned data writes, which was fixed by:
  "f2fs: fix a wrong condition in f2fs_skip_inode_update"

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-03 18:52:47 -07:00
Linus Torvalds d92cd810e6 Merge branch 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
 "rcu_work addition and a couple trivial changes"

* 'for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: remove the comment about the old manager_arb mutex
  workqueue: fix the comments of nr_idle
  fs/aio: Use rcu_work instead of explicit rcu and work item
  cgroup: Use rcu_work instead of explicit rcu and work item
  RCU, workqueue: Implement rcu_work
2018-04-03 18:00:13 -07:00
Linus Torvalds 5bb053bef8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) Support offloading wireless authentication to userspace via
    NL80211_CMD_EXTERNAL_AUTH, from Srinivas Dasari.

 2) A lot of work on network namespace setup/teardown from Kirill Tkhai.
    Setup and cleanup of namespaces now all run asynchronously and thus
    performance is significantly increased.

 3) Add rx/tx timestamping support to mv88e6xxx driver, from Brandon
    Streiff.

 4) Support zerocopy on RDS sockets, from Sowmini Varadhan.

 5) Use denser instruction encoding in x86 eBPF JIT, from Daniel
    Borkmann.

 6) Support hw offload of vlan filtering in mvpp2 dreiver, from Maxime
    Chevallier.

 7) Support grafting of child qdiscs in mlxsw driver, from Nogah
    Frankel.

 8) Add packet forwarding tests to selftests, from Ido Schimmel.

 9) Deal with sub-optimal GSO packets better in BBR congestion control,
    from Eric Dumazet.

10) Support 5-tuple hashing in ipv6 multipath routing, from David Ahern.

11) Add path MTU tests to selftests, from Stefano Brivio.

12) Various bits of IPSEC offloading support for mlx5, from Aviad
    Yehezkel, Yossi Kuperman, and Saeed Mahameed.

13) Support RSS spreading on ntuple filters in SFC driver, from Edward
    Cree.

14) Lots of sockmap work from John Fastabend. Applications can use eBPF
    to filter sendmsg and sendpage operations.

15) In-kernel receive TLS support, from Dave Watson.

16) Add XDP support to ixgbevf, this is significant because it should
    allow optimized XDP usage in various cloud environments. From Tony
    Nguyen.

17) Add new Intel E800 series "ice" ethernet driver, from Anirudh
    Venkataramanan et al.

18) IP fragmentation match offload support in nfp driver, from Pieter
    Jansen van Vuuren.

19) Support XDP redirect in i40e driver, from Björn Töpel.

20) Add BPF_RAW_TRACEPOINT program type for accessing the arguments of
    tracepoints in their raw form, from Alexei Starovoitov.

21) Lots of striding RQ improvements to mlx5 driver with many
    performance improvements, from Tariq Toukan.

22) Use rhashtable for inet frag reassembly, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1678 commits)
  net: mvneta: improve suspend/resume
  net: mvneta: split rxq/txq init and txq deinit into SW and HW parts
  ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh
  net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
  net: bgmac: Correctly annotate register space
  route: check sysctl_fib_multipath_use_neigh earlier than hash
  fix typo in command value in drivers/net/phy/mdio-bitbang.
  sky2: Increase D3 delay to sky2 stops working after suspend
  net/mlx5e: Set EQE based as default TX interrupt moderation mode
  ibmvnic: Disable irqs before exiting reset from closed state
  net: sched: do not emit messages while holding spinlock
  vlan: also check phy_driver ts_info for vlan's real device
  Bluetooth: Mark expected switch fall-throughs
  Bluetooth: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for BTUSB_QCA_ROME
  Bluetooth: btrsi: remove unused including <linux/version.h>
  Bluetooth: hci_bcm: Remove DMI quirk for the MINIX Z83-4
  sh_eth: kill useless check in __sh_eth_get_regs()
  sh_eth: add sh_eth_cpu_data::no_xdfar flag
  ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data()
  ipv4: factorize sk_wmem_alloc updates done by __ip_append_data()
  ...
2018-04-03 14:04:18 -07:00
J. Bruce Fields 880a3a5325 nfsd: fix incorrect umasks
We're neglecting to clear the umask after it's set, which can cause a
later unrelated rpc to (incorrectly) use the same umask if it happens to
be processed by the same thread.

There's a more subtle problem here too:

An NFSv4 compound request is decoded all in one pass before any
operations are executed.

Currently we're setting current->fs->umask at the time we decode the
compound.  In theory a single compound could contain multiple creates
each setting a umask.  In that case we'd end up using whichever umask
was passed in the *last* operation as the umask for all the creates,
whether that was correct or not.

So, we should just be saving the umask at decode time and waiting to set
it until we actually process the corresponding operation.

In practice it's unlikely any client would do multiple creates in a
single compound.  And even if it did they'd likely be from the same
process (hence carry the same umask).  So this is a little academic, but
we should get it right anyway.

Fixes: 47057abde5 (nfsd: add support for the umask attribute)
Cc: stable@vger.kernel.org
Reported-by: Lucash Stach <l.stach@pengutronix.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 16:27:08 -04:00
Chuck Lever 38a7031559 NFSD: Clean up legacy NFS SYMLINK argument XDR decoders
Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

 - one fewer data copies on transports that support DDP
 - consistent error checking across all versions
 - reduction of code duplication
 - support for both legal forms of SYMLINK requests on RDMA
   transports for all versions of NFS (in particular, NFSv2, for
   completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:16 -04:00
Chuck Lever 8154ef2776 NFSD: Clean up legacy NFS WRITE argument XDR decoders
Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:16 -04:00
Chuck Lever fff4080b2f nfsd: Trace NFSv4 COMPOUND execution
This helps record the identity and timing of the ops in each NFSv4
COMPOUND, replacing dprintk calls that did much the same thing.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever 87c5942e8f nfsd: Add I/O trace points in the NFSv4 read proc
NFSv4 read compound processing invokes nfsd_splice_read and
nfs_readv directly, so the trace points currently in nfsd_read are
not invoked for NFSv4 reads.

Move the NFSD READ trace points to common helpers so that NFSv4
reads are captured.

Also, record any local I/O error that occurs, the total count of
bytes that were actually returned, and whether splice or vectored
read was used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever d890be159a nfsd: Add I/O trace points in the NFSv4 write path
NFSv4 write compound processing invokes nfsd_vfs_write directly. The
trace points currently in nfsd_write are not effective for NFSv4
writes.

Move the trace points into the shared nfsd_vfs_write() helper.

After the I/O, we also want to record any local I/O error that
might have occurred, and the total count of bytes that were actually
moved (rather than the requested number).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:15 -04:00
Chuck Lever f394b62b7b nfsd: Add "nfsd_" to trace point names
Follow naming convention used in client and in sunrpc layers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:14 -04:00
Chuck Lever 79e0b4e247 nfsd: Record request byte count, not count of vectors
Byte count is more helpful to know than vector count.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:14 -04:00
Chuck Lever afa720a091 nfsd: Fix NFSD trace points
nfsd-1915  [003] 77915.780959: write_opened:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780960: write_io_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780964: write_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1

Byte swapping and knfsd_fh_hash() are not available in "trace-cmd
report", where the print format string is actually used. These
data transformations have to be done during the TP_fast_assign step.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:13 -04:00
Stefan Agner 47299f79ea nfsd: use correct enum type in decode_cb_op_status
Use enum nfs_cb_opnum4 in decode_cb_op_status. This fixes warnings
seen with clang:
  fs/nfsd/nfs4callback.c:451:36: warning: implicit conversion from
      enumeration type 'enum nfs_cb_opnum4' to different enumeration
      type 'enum nfs_opnum4' [-Wenum-conversion]
        status = decode_cb_op_status(xdr, OP_CB_SEQUENCE, &cb->cb_seq_status);
                 ~~~~~~~~~~~~~~~~~~~      ^~~~~~~~~~~~~~

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:08 -04:00
Fengguang Wu 51d87bc2bf nfsd: fix boolreturn.cocci warnings
fs/nfsd/nfs4state.c:926:8-9: WARNING: return of 0/1 in function 'nfs4_delegation_exists' with return type bool
fs/nfsd/nfs4state.c:2955:9-10: WARNING: return of 0/1 in function 'nfsd4_compound_in_session' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

Fixes: 68b18f5294 ("nfsd: make nfs4_get_existing_delegation less confusing")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
[bfields: also fix -EAGAIN]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2018-04-03 15:08:08 -04:00
Dan Williams d2c997c0f1 fs, dax: use page->mapping to warn if truncate collides with a busy page
Catch cases where extent unmap operations encounter pages that are
pinned / busy. Typically this is pinned pages that are under active dma.
This warning is a canary for potential data corruption as truncated
blocks could be allocated to a new file while the device is still
performing i/o.

Here is an example of a collision that this implementation catches:

 WARNING: CPU: 2 PID: 1286 at fs/dax.c:343 dax_disassociate_entry+0x55/0x80
 [..]
 Call Trace:
  __dax_invalidate_mapping_entry+0x6c/0xf0
  dax_delete_mapping_entry+0xf/0x20
  truncate_exceptional_pvec_entries.part.12+0x1af/0x200
  truncate_inode_pages_range+0x268/0x970
  ? tlb_gather_mmu+0x10/0x20
  ? up_write+0x1c/0x40
  ? unmap_mapping_range+0x73/0x140
  xfs_free_file_space+0x1b6/0x5b0 [xfs]
  ? xfs_file_fallocate+0x7f/0x320 [xfs]
  ? down_write_nested+0x40/0x70
  ? xfs_ilock+0x21d/0x2f0 [xfs]
  xfs_file_fallocate+0x162/0x320 [xfs]
  ? rcu_read_lock_sched_held+0x3f/0x70
  ? rcu_sync_lockdep_assert+0x2a/0x50
  ? __sb_start_write+0xd0/0x1b0
  ? vfs_fallocate+0x20c/0x270
  vfs_fallocate+0x154/0x270
  SyS_fallocate+0x43/0x80
  entry_SYSCALL_64_fastpath+0x1f/0x96

Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-04-03 05:41:19 -07:00
Dan Williams fb094c9074 ext2, dax: introduce ext2_dax_aops
In preparation for the dax implementation to start associating dax pages
to inodes via page->mapping, we need to provide a 'struct
address_space_operations' instance for dax. Otherwise, direct-I/O
triggers incorrect page cache assumptions and warnings.

Reviewed-by: Jan Kara <jack@suse.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-04-03 05:41:05 -07:00
Linus Torvalds 642e7fd233 Merge branch 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux
Pull removal of in-kernel calls to syscalls from Dominik Brodowski:
 "System calls are interaction points between userspace and the kernel.
  Therefore, system call functions such as sys_xyzzy() or
  compat_sys_xyzzy() should only be called from userspace via the
  syscall table, but not from elsewhere in the kernel.

  At least on 64-bit x86, it will likely be a hard requirement from
  v4.17 onwards to not call system call functions in the kernel: It is
  better to use use a different calling convention for system calls
  there, where struct pt_regs is decoded on-the-fly in a syscall wrapper
  which then hands processing over to the actual syscall function. This
  means that only those parameters which are actually needed for a
  specific syscall are passed on during syscall entry, instead of
  filling in six CPU registers with random user space content all the
  time (which may cause serious trouble down the call chain). Those
  x86-specific patches will be pushed through the x86 tree in the near
  future.

  Moreover, rules on how data may be accessed may differ between kernel
  data and user data. This is another reason why calling sys_xyzzy() is
  generally a bad idea, and -- at most -- acceptable in arch-specific
  code.

  This patchset removes all in-kernel calls to syscall functions in the
  kernel with the exception of arch/. On top of this, it cleans up the
  three places where many syscalls are referenced or prototyped, namely
  kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h"

* 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits)
  bpf: whitelist all syscalls for error injection
  kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions
  kernel/sys_ni: sort cond_syscall() entries
  syscalls/x86: auto-create compat_sys_*() prototypes
  syscalls: sort syscall prototypes in include/linux/compat.h
  net: remove compat_sys_*() prototypes from net/compat.h
  syscalls: sort syscall prototypes in include/linux/syscalls.h
  kexec: move sys_kexec_load() prototype to syscalls.h
  x86/sigreturn: use SYSCALL_DEFINE0
  x86: fix sys_sigreturn() return type to be long, not unsigned long
  x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm()
  mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead()
  mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff()
  mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64()
  fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()
  fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls
  fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()
  fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall
  kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()
  kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare()
  ...
2018-04-02 21:22:12 -07:00
Linus Torvalds f5a8eb632b arch: remove obsolete architecture ports
This removes the entire architecture code for blackfin, cris, frv, m32r,
 metag, mn10300, score, and tile, including the associated device drivers.
 
 I have been working with the (former) maintainers for each one to ensure
 that my interpretation was right and the code is definitely unused in
 mainline kernels. Many had fond memories of working on the respective
 ports to start with and getting them included in upstream, but also saw
 no point in keeping the port alive without any users.
 
 In the end, it seems that while the eight architectures are extremely
 different, they all suffered the same fate: There was one company
 in charge of an SoC line, a CPU microarchitecture and a software
 ecosystem, which was more costly than licensing newer off-the-shelf
 CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems
 that all the SoC product lines are still around, but have not used the
 custom CPU architectures for several years at this point. In contrast,
 CPU instruction sets that remain popular and have actively maintained
 kernel ports tend to all be used across multiple licensees.
 
 The removal came out of a discussion that is now documented at
 https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
 marking any ports as deprecated but remove them all at once after I made
 sure that they are all unused. Some architectures (notably tile, mn10300,
 and blackfin) are still being shipped in products with old kernels,
 but those products will never be updated to newer kernel releases.
 
 After this series, we still have a few architectures without mainline
 gcc support:
 
 - unicore32 and hexagon both have very outdated gcc releases, but the
   maintainers promised to work on providing something newer. At least
   in case of hexagon, this will only be llvm, not gcc.
 
 - openrisc, risc-v and nds32 are still in the process of finishing their
   support or getting it added to mainline gcc in the first place.
   They all have patched gcc-7.3 ports that work to some degree, but
   complete upstream support won't happen before gcc-8.1. Csky posted
   their first kernel patch set last week, their situation will be similar.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJawdL2AAoJEGCrR//JCVInuH0P/RJAZh1nTD+TR34ZhJq2TBoo
 PgygwDU7Z2+tQVU+EZ453Gywz9/NMRFk1RWAZqrLix4ZtyIMvC6A1qfT2yH1Y7Fb
 Qh6tccQeLe4ezq5u4S/46R/fQXu3Txr92yVwzJJUuPyU0arF9rv5MmI8e6p7L1en
 yb74kSEaCe+/eMlsEj1Cc1dgthDNXGKIURHkRsILoweysCpesjiTg4qDcL+yTibV
 FP2wjVbniKESMKS6qL71tiT5sexvLsLwMNcGiHPj94qCIQuI7DLhLdBVsL5Su6gI
 sbtgv0dsq4auRYAbQdMaH1hFvu6WptsuttIbOMnz2Yegi2z28H8uVXkbk2WVLbqG
 ZESUwutGh8MzOL2RJ4jyyQq5sfo++CRGlfKjr6ImZRv03dv0pe/W85062cK5cKNs
 cgDDJjGRorOXW7dyU6jG2gRqODOQBObIv3w5efdq5OgzOWlbI4EC+Y5u1Z0JF/76
 pSwtGXA6YhwC+9LLAlnVTHG+yOwuLmAICgoKcTbzTVDKA2YQZG/cYuQfI5S1wD8e
 X6urPx3Md2GCwLXQ9mzKBzKZUpu/Tuhx0NvwF4qVxy6x1PELjn68zuP7abDHr46r
 57/09ooVN+iXXnEGMtQVS/OPvYHSa2NgTSZz6Y86lCRbZmUOOlK31RDNlMvYNA+s
 3iIVHovno/JuJnTOE8LY
 =fQ8z
 -----END PGP SIGNATURE-----

Merge tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pul removal of obsolete architecture ports from Arnd Bergmann:
 "This removes the entire architecture code for blackfin, cris, frv,
  m32r, metag, mn10300, score, and tile, including the associated device
  drivers.

  I have been working with the (former) maintainers for each one to
  ensure that my interpretation was right and the code is definitely
  unused in mainline kernels. Many had fond memories of working on the
  respective ports to start with and getting them included in upstream,
  but also saw no point in keeping the port alive without any users.

  In the end, it seems that while the eight architectures are extremely
  different, they all suffered the same fate: There was one company in
  charge of an SoC line, a CPU microarchitecture and a software
  ecosystem, which was more costly than licensing newer off-the-shelf
  CPU cores from a third party (typically ARM, MIPS, or RISC-V). It
  seems that all the SoC product lines are still around, but have not
  used the custom CPU architectures for several years at this point. In
  contrast, CPU instruction sets that remain popular and have actively
  maintained kernel ports tend to all be used across multiple licensees.

  [ See the new nds32 port merged in the previous commit for the next
    generation of "one company in charge of an SoC line, a CPU
    microarchitecture and a software ecosystem"   - Linus ]

  The removal came out of a discussion that is now documented at
  https://lwn.net/Articles/748074/. Unlike the original plans, I'm not
  marking any ports as deprecated but remove them all at once after I
  made sure that they are all unused. Some architectures (notably tile,
  mn10300, and blackfin) are still being shipped in products with old
  kernels, but those products will never be updated to newer kernel
  releases.

  After this series, we still have a few architectures without mainline
  gcc support:

   - unicore32 and hexagon both have very outdated gcc releases, but the
     maintainers promised to work on providing something newer. At least
     in case of hexagon, this will only be llvm, not gcc.

   - openrisc, risc-v and nds32 are still in the process of finishing
     their support or getting it added to mainline gcc in the first
     place. They all have patched gcc-7.3 ports that work to some
     degree, but complete upstream support won't happen before gcc-8.1.
     Csky posted their first kernel patch set last week, their situation
     will be similar

  [ Palmer Dabbelt points out that RISC-V support is in mainline gcc
    since gcc-7, although gcc-7.3.0 is the recommended minimum  - Linus ]"

This really says it all:

 2498 files changed, 95 insertions(+), 467668 deletions(-)

* tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits)
  MAINTAINERS: UNICORE32: Change email account
  staging: iio: remove iio-trig-bfin-timer driver
  tty: hvc: remove tile driver
  tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers
  serial: remove tile uart driver
  serial: remove m32r_sio driver
  serial: remove blackfin drivers
  serial: remove cris/etrax uart drivers
  usb: Remove Blackfin references in USB support
  usb: isp1362: remove blackfin arch glue
  usb: musb: remove blackfin port
  usb: host: remove tilegx platform glue
  pwm: remove pwm-bfin driver
  i2c: remove bfin-twi driver
  spi: remove blackfin related host drivers
  watchdog: remove bfin_wdt driver
  can: remove bfin_can driver
  mmc: remove bfin_sdh driver
  input: misc: remove blackfin rotary driver
  input: keyboard: remove bf54x driver
  ...
2018-04-02 20:20:12 -07:00
Yunlong Song c79d152094 f2fs: make assignment of t->dentry_bitmap more readable
In make_dentry_ptr_block, it is confused with "&" for t->dentry_bitmap
but without "&" for t->dentry, so delete "&" to make code more readable.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-02 20:09:35 -07:00
Jaegeuk Kim dc7a10ddee f2fs: truncate preallocated blocks in error case
If write is failed, we must deallocate the blocks that we couldn't write.

Cc: stable@vger.kernel.org
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-02 20:09:34 -07:00
Linus Torvalds ce6eba3dba Merge branch 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull wait_var_event updates from Ingo Molnar:
 "This introduces the new wait_var_event() API, which is a more flexible
  waiting primitive than wait_on_atomic_t().

  All wait_on_atomic_t() users are migrated over to the new API and
  wait_on_atomic_t() is removed. The migration fixes one bug and should
  result in no functional changes for the other usecases"

* 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/wait: Improve __var_waitqueue() code generation
  sched/wait: Remove the wait_on_atomic_t() API
  sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API
  sched/wait: Introduce wait_var_event()
2018-04-02 16:50:39 -07:00
Junling Zheng 235831d7dd f2fs: fix a wrong condition in f2fs_skip_inode_update
Fix commit 97dd26ad83 (f2fs: fix wrong AUTO_RECOVER condition).
We should use ~PAGE_MASK to determine whether i_size is aligned to
the f2fs's block size or not.

Signed-off-by: Junling Zheng <zhengjunling@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2018-04-02 13:21:51 -07:00
Dominik Brodowski edf292c76b fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate()
Using the ksys_fallocate() wrapper allows us to get rid of in-kernel
calls to the sys_fallocate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_fallocate().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:09 +02:00
Dominik Brodowski 36028d5dd7 fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls
Using the ksys_p{read,write}64() wrappers allows us to get rid of
in-kernel calls to the sys_pread64() and sys_pwrite64() syscalls.
The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_p{read,write}64().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:09 +02:00
Dominik Brodowski df260e21e6 fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate()
Using the ksys_truncate() wrapper allows us to get rid of in-kernel
calls to the sys_truncate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_truncate().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:08 +02:00
Dominik Brodowski 806cbae122 fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall
Using this helper allows us to avoid the in-kernel calls to the
sys_sync_file_range() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it uses
the same calling convention as sys_sync_file_range().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:07 +02:00
Dominik Brodowski 70f68ee81e fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()
Using this helper allows us to avoid the in-kernel calls to the
sys_sync() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_sync().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:05 +02:00
Dominik Brodowski 3ce4a7bf66 fs: add ksys_read() helper; remove in-kernel calls to sys_read()
Using this helper allows us to avoid the in-kernel calls to the
sys_read() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_read().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:04 +02:00
Dominik Brodowski 76847e4344 fs: add ksys_lseek() helper; remove in-kernel calls to sys_lseek()
Using this helper allows us to avoid the in-kernel calls to the
sys_lseek() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_lseek().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:03 +02:00
Dominik Brodowski cbb60b924b fs: add ksys_ioctl() helper; remove in-kernel calls to sys_ioctl()
Using this helper allows us to avoid the in-kernel calls to the
sys_ioctl() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_ioctl().

After careful review, at least some of these calls could be converted
to do_vfs_ioctl() in future.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:03 +02:00
Dominik Brodowski 454dab3f96 fs: add ksys_getdents64() helper; remove in-kernel calls to sys_getdents64()
Using this helper allows us to avoid the in-kernel calls to the
sys_getdents64() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_getdents64().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:02 +02:00
Dominik Brodowski bae217ea8c fs: add ksys_open() wrapper; remove in-kernel calls to sys_open()
Using this wrapper allows us to avoid the in-kernel calls to the
sys_open() syscall. The ksys_ prefix denotes that this function is meant
as a drop-in replacement for the syscall. In particular, it uses the
same calling convention as sys_open().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:01 +02:00
Dominik Brodowski 2ca2a09d62 fs: add ksys_close() wrapper; remove in-kernel calls to sys_close()
Using the ksys_close() wrapper allows us to get rid of in-kernel calls
to the sys_close() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_close(), with one subtle
difference:

The few places which checked the return value did not care about the return
value re-writing in sys_close(), so simply use a wrapper around
__close_fd().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:00 +02:00
Dominik Brodowski 411d9475cf fs: add ksys_ftruncate() wrapper; remove in-kernel calls to sys_ftruncate()
Using the ksys_ftruncate() wrapper allows us to get rid of in-kernel
calls to the sys_ftruncate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_ftruncate().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:16:00 +02:00
Dominik Brodowski 55731b3cda fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers
Using the fs-interal do_fchownat() wrapper allows us to get rid of
fs-internal calls to the sys_fchownat() syscall.

Introducing the ksys_fchown() helper and the ksys_{,}chown() wrappers
allows us to avoid the in-kernel calls to the sys_{,l,f}chown() syscalls.
The ksys_ prefix denotes that these functions are meant as a drop-in
replacement for the syscalls. In particular, they use the same calling
convention as sys_{,l,f}chown().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:59 +02:00
Dominik Brodowski cbfe20f565 fs: add do_faccessat() helper and ksys_access() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_faccessat() helper allows us to get rid of
fs-internal calls to the sys_faccessat() syscall.

Introducing the ksys_access() wrapper allows us to avoid the in-kernel
calls to the sys_access() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_access().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:58 +02:00
Dominik Brodowski 03450e271a fs: add ksys_fchmod() and do_fchmodat() helpers and ksys_chmod() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_fchmodat() helper allows us to get rid of
fs-internal calls to the sys_fchmodat() syscall.

Introducing the ksys_fchmod() helper and the ksys_chmod() wrapper allows
us to avoid the in-kernel calls to the sys_fchmod() and sys_chmod()
syscalls. The ksys_ prefix denotes that these functions are meant as a
drop-in replacement for the syscalls. In particular, they use the same
calling convention as sys_fchmod() and sys_chmod().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:57 +02:00
Dominik Brodowski 46ea89eb65 fs: add do_linkat() helper and ksys_link() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_linkat() helper allows us to get rid of
fs-internal calls to the sys_linkat() syscall.

Introducing the ksys_link() wrapper allows us to avoid the in-kernel
calls to sys_link() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it uses
the same calling convention as sys_link().

In the near future, the only fs-external user of ksys_link() should be
converted to use vfs_link() instead.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:57 +02:00
Dominik Brodowski 87c4e19262 fs: add do_mknodat() helper and ksys_mknod() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_mknodat() helper allows us to get rid of
fs-internal calls to the sys_mknodat() syscall.

Introducing the ksys_mknod() wrapper allows us to avoid the in-kernel
calls to sys_mknod() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it uses
the same calling convention as sys_mknod().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:56 +02:00
Dominik Brodowski b724e846b4 fs: add do_symlinkat() helper and ksys_symlink() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_symlinkat() helper allows us to get rid of
fs-internal calls to the sys_symlinkat() syscall.

Introducing the ksys_symlink() wrapper allows us to avoid the in-kernel
calls to the sys_symlink() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In particular,
it uses the same calling convention as sys_symlink().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:55 +02:00
Dominik Brodowski 0101db7a30 fs: add do_mkdirat() helper and ksys_mkdir() wrapper; remove in-kernel calls to syscall
Using the fs-internal do_mkdirat() helper allows us to get rid of
fs-internal calls to the sys_mkdirat() syscall.

Introducing the ksys_mkdir() wrapper allows us to avoid the in-kernel calls
to the sys_mkdir() syscall. The ksys_ prefix denotes that this function is
meant as a drop-in replacement for the syscall. In particular, it uses the
same calling convention as sys_mkdir().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:54 +02:00
Dominik Brodowski f459dffae1 fs: add ksys_rmdir() wrapper; remove in-kernel calls to sys_rmdir()
Using this wrapper allows us to avoid the in-kernel calls to the
sys_rmdir() syscall. The ksys_ prefix denotes that this function is meant
as a drop-in replacement for the syscall. In particular, it uses the same
calling convention as sys_rmdir().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:54 +02:00
Dominik Brodowski 6380161ce9 hostfs: rename do_rmdir() to hostfs_do_rmdir()
do_rmdir() is used in the VFS layer at fs/namei.c, so use a different
name in hostfs.

Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:53 +02:00
Dominik Brodowski 447016e968 fs: add ksys_chdir() helper; remove in-kernel calls to sys_chdir()
Using this helper allows us to avoid the in-kernel calls to the sys_chdir()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_chdir().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:51 +02:00
Dominik Brodowski e7a3e8b2ed fs: add ksys_write() helper; remove in-kernel calls to sys_write()
Using this helper allows us to avoid the in-kernel calls to the sys_write()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_write().

In the near future, the do_mounts / initramfs callers of ksys_write()
should be converted to use filp_open() and vfs_write() instead.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-s390@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:51 +02:00
Dominik Brodowski a16fe33ab5 fs: add ksys_chroot() helper; remove-in kernel calls to sys_chroot()
Using this helper allows us to avoid the in-kernel calls to the
sys_chroot() syscall. The ksys_ prefix denotes that this function is
meant as a drop-in replacement for the syscall. In particular, it uses the
same calling convention as sys_chroot().

In the near future, the fs-external callers of ksys_chroot() should be
converted to use kern_path()/set_fs_root() directly. Then ksys_chroot()
can be moved within sys_chroot() again.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:50 +02:00
Dominik Brodowski c7248321a3 fs: add ksys_dup{,3}() helper; remove in-kernel calls to sys_dup{,3}()
Using ksys_dup() and ksys_dup3() as helper functions allows us to
avoid the in-kernel calls to the sys_dup() and sys_dup3() syscalls.
The ksys_ prefix denotes that these functions are meant as a drop-in
replacement for the syscalls. In particular, they use the same
calling convention as sys_dup{,3}().

In the near future, the fs-external callers of ksys_dup{,3}() should be
converted to call do_dup2() directly. Then, ksys_dup{,3}() can be moved
within sys_dup{,3}() again.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:49 +02:00
Dominik Brodowski 3a18ef5c1b fs: add ksys_umount() helper; remove in-kernel call to sys_umount()
Using this helper allows us to avoid the in-kernel call to the sys_umount()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as ksys_umount().

In the near future, the only fs-external caller of ksys_umount() should be
converted to call do_umount() directly. Then, ksys_umount() can be moved
within sys_umount() again.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:48 +02:00
Dominik Brodowski 312db1aa1d fs: add ksys_mount() helper; remove in-kernel calls to sys_mount()
Using this helper allows us to avoid the in-kernel calls to the sys_mount()
syscall. The ksys_ prefix denotes that this function is meant as a drop-in
replacement for the syscall. In particular, it uses the same calling
convention as sys_mount().

In the near future, all callers of ksys_mount() should be converted to call
do_mount() directly.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:48 +02:00
Dominik Brodowski ab0d1e85bf fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()
While sys32_quotactl() is only needed on x86, it can use the recommended
COMPAT_SYSCALL_DEFINEx() machinery for its setup.

Acked-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:47 +02:00
Dominik Brodowski cb0b476ab1 fs/quota: add kernel_quotactl() helper; remove in-kernel call to syscall
Using the fs-internal kernel_quotactl() helper allows us to get rid of
the fs-internal call to the sys_quotactl() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:46 +02:00
Dominik Brodowski 183caa3c86 fanotify: add do_fanotify_mark() helper; remove in-kernel call to syscall
Using the fs-internal do_fanotify_mark() helper allows us to get rid of
the fs-internal call to the sys_fanotify_mark() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Acked-by: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:45 +02:00
Dominik Brodowski d0d89d1ed3 inotify: add do_inotify_init() helper; remove in-kernel call to syscall
Using the inotify-internal do_inotify_init() helper allows us to get rid
of the in-kernel call to sys_inotify_init1() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Acked-by: Jan Kara <jack@suse.cz>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:45 +02:00
Dominik Brodowski ab641afa73 fs: add do_compat_futimesat() helper; remove in-kernel call to compat syscall
Using the fs-internal do_compat_futimesat() helper allows us to get rid of
the fs-internal call to the compat_sys_futimesat() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:44 +02:00
Dominik Brodowski 570484bfe8 fs: add do_compat_signalfd4() helper; remove in-kernel call to compat syscall
Using the fs-internal do_compat_signalfd4() helper allows us to get rid of
the fs-internal call to the compat_sys_signalfd4() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:43 +02:00
Dominik Brodowski 05585e4495 fs: add do_compat_select() helper; remove in-kernel call to compat syscall
Using the fs-internal do_compat_select() helper allows us to get rid of
the fs-internal call to the compat_sys_select() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:42 +02:00
Dominik Brodowski e02af2ff65 fs: add do_compat_fcntl64() helper; remove in-kernel call to compat syscall
Using the fs-internal do_compat_fcntl64() helper allows us to get rid of
the fs-internal call to the compat_sys_fcntl64() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:42 +02:00
Dominik Brodowski 4bdb9acabf fs: add kern_select() helper; remove in-kernel call to sys_select()
Using this helper allows us to avoid the in-kernel call to the sys_umount()
syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:41 +02:00
Dominik Brodowski 30cfe4ef8b fs: add do_vmsplice() helper; remove in-kernel call to syscall
Using the fs-internal do_vmsplice() helper allows us to get rid of the
fs-internal call to the sys_vmsplice() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:40 +02:00
Dominik Brodowski 98e5f7bd2c fs: add do_lookup_dcookie() helper; remove in-kernel call to syscall
Using the fs-internal do_lookup_dcookie() helper allows us to get rid of
fs-internal calls to the sys_lookup_dcookie() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:39 +02:00
Dominik Brodowski 2fc96f8331 fs: add do_eventfd() helper; remove internal call to sys_eventfd()
Using this helper removes an in-kernel call to the sys_eventfd() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:39 +02:00
Dominik Brodowski 52fb6db0fd fs: add do_signalfd4() helper; remove internal calls to sys_signalfd4()
Using this helper removes in-kernel calls to the sys_signalfd4() syscall
function.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:38 +02:00
Dominik Brodowski 791eb22eef fs: add do_epoll_*() helpers; remove internal calls to sys_epoll_*()
Using the helper functions do_epoll_create() and do_epoll_wait() allows us
to remove in-kernel calls to the related syscall functions.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:37 +02:00
Dominik Brodowski f13903587c fs: add do_futimesat() helper; remove internal call to sys_futimesat()
Using this helper removes the in-kernel call to the sys_futimesat()
syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:37 +02:00
Dominik Brodowski ee81feb64e fs: add do_renameat2() helper; remove internal call to sys_renameat2()
Using this helper removes in-kernel calls to the sys_renameat2() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:36 +02:00
Dominik Brodowski 0a216dd1cf fs: add do_pipe2() helper; remove internal call to sys_pipe2()
Using this helper removes an in-kernel call to the sys_pipe2() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:35 +02:00
Dominik Brodowski 2dae024806 fs: add do_readlinkat() helper; remove internal call to sys_readlinkat()
Using the do_readlinkat() helper removes an in-kernel call to the
sys_readlinkat() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2018-04-02 20:15:34 +02:00
Steve French 07108d0e7c cifs: Add minor debug message during negprot
Check for unknown security mode flags during negotiate protocol
if debugging enabled.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-02 13:11:15 -05:00
Steve French 7ea884c77e smb3: Fix root directory when server returns inode number of zero
Some servers return inode number zero for the root directory, which
causes ls to display incorrect data (missing "." and "..").

If the server returns zero for the inode number of the root directory,
fake an inode number for it.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2018-04-02 13:11:03 -05:00
Steve French 6c4ba31133 cifs: fix sparse warning on previous patch in a few printks
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Ronnie Sahlberg <lsahlber@redhat.com>
2018-04-02 13:10:40 -05:00
Ronnie Sahlberg 93012bf984 cifs: add server->vals->header_preamble_size
This variable is set to 4 for all protocol versions and replaces
the hardcoded constant 4 throughought the code.
This will later be updated to reflect whether a response packet
has a 4 byte length preamble or not once we start removing this
field from the SMB2+ dialects.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-02 13:09:44 -05:00
Martin Brandenburg dbcb5e7fc4 orangefs: open code short single-use functions
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-02 08:10:17 -04:00
Colin Ian King 81e3d0253f orangefs: replace vmalloc and memset with vzalloc
Use vzalloc instead of the vmalloc, memset combo

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-02 08:10:17 -04:00
David Reynolds c2676ef801 orangefs: bug fix for a race condition when getting a slot
When a slot becomes free, call wake_up_locked regardless of the number
of slots available.

Without this patch, wake_up_locked is only called when going from no
free slots to one. This means that there is a chance a waiting task
will not be woken up. In many cases, the system will bounce between 0
and 1 free slots, and the waiting tasks will be woken up. But if there
is still a waiting task and another slot becomes available before the
number of free slots reaches zero, that waiting task may never be woken
up since the number of free slots may never reach zero again.

The bug behavior is easy to reproduce with the following script,
where /mnt/orangefs is an OrangeFS file system.

for i in {1..100}; do
	for j in {1..20}; do
		dd if=/dev/zero of=/mnt/orangefs/tmp$j bs=32768 count=32 &
	done
	wait
done

Signed-off-by: David Reynolds <david@omnibond.com>
Reviewed-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2018-04-02 08:10:17 -04:00
Luis Henriques 9122eed528 ceph: quota: report root dir quota usage in statfs
This commit changes statfs default behaviour when reporting usage
statistics.  Instead of using the overall filesystem usage, statfs now
reports the quota for the filesystem root, if ceph.quota.max_bytes has
been set for this inode.  If quota hasn't been set, it falls back to the
old statfs behaviour.

A new mount option is also added ('noquotadf') to disable this behaviour.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:53 +02:00
Luis Henriques d557c48db7 ceph: quota: add counter for snaprealms with quota
By keeping a counter with the number of snaprealms that have quota set
allows to optimize the functions that need to walk throught the realms
hierarchy looking for quotas.  Thus, if this counter is zero it's safe to
assume that there are no realms with quota.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:53 +02:00
Luis Henriques e3161f17d9 ceph: quota: cache inode pointer in ceph_snap_realm
Keep a pointer to the inode in struct ceph_snap_realm.  This allows to
optimize functions that walk the realms hierarchy (e.g. in quotas).

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:53 +02:00
Yan, Zheng 0eb6bbe4d9 ceph: fix root quota realm check
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:52 +02:00
Yan, Zheng 2596366907 ceph: don't check quota for snap inode
snap inode's i_snap_realm is not pointing to ceph_snap_realm.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:52 +02:00
Luis Henriques 1ab302a0cb ceph: quota: update MDS when max_bytes is approaching
When we're reaching the ceph.quota.max_bytes limit, i.e., when writing
more than 1/16th of the space left in a quota realm, update the MDS with
the new file size.

This mirrors the fuse-client approach with commit 122c50315ed1 ("client:
Inform mds file size when approaching quota limit"), in the ceph git tree.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:52 +02:00
Luis Henriques 2b83845f8b ceph: quota: support for ceph.quota.max_bytes
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:52 +02:00
Luis Henriques cafe21a4fb ceph: quota: don't allow cross-quota renames
This patch changes ceph_rename so that -EXDEV is returned if an attempt is
made to mv a file between two different dir trees with different quotas
setup.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:52 +02:00
Luis Henriques b7a2921765 ceph: quota: support for ceph.quota.max_files
This patch adds support for the max_files quota.  It hooks into all the
ceph functions that add new filesystem objects that need to be checked
against the quota limits.  When these limits are hit, -EDQUOT is returned.

Note that we're not checking quotas on ceph_link().  ceph_link doesn't
really create a new inode,  and since the MDS doesn't update the directory
statistics when a new (hard) link is created (only with symlinks), they
are not accounted as a new file.

Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:51 +02:00
Luis Henriques fb18a57568 ceph: quota: add initial infrastructure to support cephfs quotas
This patch adds the infrastructure required to support cephfs quotas as it
is currently implemented in the ceph fuse client.  Cephfs quotas can be
set on any directory, and can restrict the number of bytes or the number
of files stored beneath that point in the directory hierarchy.

Quotas are set using the extended attributes 'ceph.quota.max_files' and
'ceph.quota.max_bytes', and can be removed by setting these attributes to
'0'.

Link: http://tracker.ceph.com/issues/22372
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 11:17:51 +02:00
Yan, Zheng 7aac453a03 ceph: rename function drop_leases() to a more descriptive name
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:49 +02:00
Chengguang Xu 50c55aeca2 ceph: fix invalid point dereference for error case in mdsc destroy
1. set fsc->mdsc after successfully allocate all necessary memory
in mdsc init.
2. if fsc->mdsc is NULL, just skip destroy operation in mdsc destroy.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:49 +02:00
Chengguang Xu 98cfda8104 ceph: return proper bool type to caller instead of pointer
Change to return true/false only for bool type return code.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:49 +02:00
Chengguang Xu bb48bd4dc4 ceph: optimize memory usage
In current code, regular file and directory use same struct
ceph_file_info to store fs specific data so the struct has to
include some fields which are only used for directory
(e.g., readdir related info), when having plenty of regular files,
it will lead to memory waste.

This patch introduces dedicated ceph_dir_file_info cache for
readdir related thins. So that regular file does not include those
unused fields anymore.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:49 +02:00
Chengguang Xu 47474d0b01 ceph: optimize mds session register
Do memory allocation first, so that avoid unnecessary
initialization of newly allocated session in error case.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:48 +02:00
Chengguang Xu 57a35dfb52 libceph, ceph: add __init attribution to init funcitons
Add __init attribution to the functions which are called only once
during initiating/registering operations and deleting unnecessary
symbol exports.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:48 +02:00
Chengguang Xu 51b10f3fe4 ceph: filter out used flags when printing unused open flags
Filter out used access mode flags when printing unused open flags.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:48 +02:00
Yan, Zheng 1582af2eaa ceph: don't wait on writeback when there is no more dirty pages
In sync mode, writepages() needs to write all dirty pages. But
it can only write dirty pages associated with the oldest snapc.
To write dirty pages associated with next snapc, it needs to wait
until current writes complete.

If there is no more dirty pages, writepages() should not wait on
writeback. Otherwise, dirty page writeback becomes very slow.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:48 +02:00
Yan, Zheng af9cc401ce ceph: invalidate pages that beyond EOF in ceph_writepages_start()
Dirty pages can be associated with different capsnap. Different capsnap
may have different EOF value. So invalidating dirty pages according to
the largest EOF value is wrong. Dirty pages beyond EOF, but associated
with other capsnap, do not get invalidated.

Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:47 +02:00
Chengguang Xu bc4b5ad3a6 ceph: mark the cap cache as unreclaimable
Releasing cap is affected by many factors (e.g., avail_count/reserve_count/min_count)
and min_count could be specified high volume in client mount option. Hence it's better
to mark cap cache as unreclaimable in case of non-trivial discrepancies between memory
shown as reclaimable and what is actually reclaimed.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:47 +02:00
Chengguang Xu 73737682e0 ceph: change variable name to follow common rule
Variable name ci is mostly used for ceph_inode_info.
Variable name fi is mostly used for ceph_file_info.
Variable name cf is mostly used for ceph_cap_flush.

Change variable name to follow above common rules
in case of confusing.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:47 +02:00
Chengguang Xu 79cd674aed ceph: optimizing cap reservation
When caps_avail_count is in a low level, most newly
trimmed caps will probably go into ->caps_list and
caps_avail_count will be increased. Hence after trimming,
should recheck caps_avail_count to effectly reuse
newly trimmed caps. Also, when releasing unnecessary
caps follow the same rule of ceph_put_cap.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:47 +02:00
Chengguang Xu b517c1d87f ceph: release unreserved caps if having enough available caps
When unreserving caps check if there is too mamy available caps
in the ->caps_list, if so release unreserved caps.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:47 +02:00
Chengguang Xu e327ce0685 ceph: optimizing cap allocation
When setting high volume of caps_min_count or having many
unreserved caps, unused caps may always keep in the ->caps_list
even can't get new cap from kmem_cache_alloc because lack of
maximum limitation of caps_avail_count. Hence reuse caps in
->caps_list if available, it's maybe better than setting max
limitation of caps_avail_count and releasing unused caps when
reaching the limit.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:46 +02:00
Chengguang Xu b884014a91 ceph: adding protection for showing cap reservation info
Adding spinlock protection during getting cap reservation
ralated fields so that the numbers match below BUG_ON condition
in the code.

BUG_ON(mdsc->caps_total_count != mdsc->caps_use_count +
				 mdsc->caps_reserve_count +
				 mdsc->caps_avail_count);

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:46 +02:00
Chengguang Xu 4d8969af28 ceph: use seq_show_option for string type options
Using seq_show_option to replace seq_printf for string type options.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:45 +02:00
Chengguang Xu 11e1478df9 libceph, ceph: change permission for readonly debugfs entries
Remove write permission for debugfs entries which only have readonly
function.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:45 +02:00
Chengguang Xu 7ae7a828d9 ceph: keep consistent semantic in fscache related option combination
When specifying multiple fscache related options, the result isn't always
the same as option order, this fix will keep strict consistent meaning
by order.

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:45 +02:00
Chengguang Xu 4c069a5821 ceph: add newline to end of debug message format
Some of dout format do not include newline in the end,
fix for the files which are in fs/ceph and net/ceph directories,
and changing printk to dout for printing debug info in super.c

Signed-off-by: Chengguang Xu <cgxu519@icloud.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:44 +02:00
Ilya Dryomov 08c1ac508b libceph, ceph: move ceph_calc_file_object_mapping() to striper.c
ceph_calc_file_object_mapping() has nothing to do with osdmaps.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2018-04-02 10:12:43 +02:00
Ilya Dryomov dccbf08005 libceph, ceph: change ceph_calc_file_object_mapping() signature
- make it void
- xlen (object extent length) out parameter should be u32 because only
  a single stripe unit is mapped at a time

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2018-04-02 10:12:38 +02:00
Theodore Ts'o e40ff21389 ext4: force revalidation of directory pointer after seekdir(2)
A malicious user could force the directory pointer to be in an invalid
spot by using seekdir(2).  Use the mechanism we already have to notice
if the directory has changed since the last time we called
ext4_readdir() to force a revalidation of the pointer.

Reported-by: syzbot+1236ce66f79263e8a862@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2018-04-01 23:21:03 -04:00
Long Li 21a4e14aae cifs: smbd: disconnect transport on RDMA errors
On RDMA errors, transport should disconnect the RDMA CM connection. This
will notify the upper layer, and it will attempt transport reconnect.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:40 -05:00
Long Li 48f238a79f cifs: smbd: avoid reconnect lockup
During transport reconnect, other processes may have registered memory
and blocked on transport. This creates a deadlock situation because the
transport resources can't be freed, and reconnect is blocked.

Fix this by returning to upper layer on timeout. Before returning,
transport status is set to reconnecting so other processes will release
memory registration resources.

Upper layer will retry the reconnect. This is not in fast I/O path so
setting the timeout to 5 seconds.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:40 -05:00
Steve French 2a18287b54 Don't log confusing message on reconnect by default
Change the following message (which can occur on reconnect) from
a warning to an FYI message.  It is confusing to users.

   [58360.523634] CIFS VFS: Free previous auth_key.response = 00000000a91cdc84

By default this message won't show up on reconnect unless the user bumps
up the log level to include FYI messages.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-01 20:24:40 -05:00
Steve French 2564f2ff83 Don't log expected error on DFS referral request
STATUS_FS_DRIVER_REQUIRED is expected when DFS is not turned
on on the server.  Do not log it on DFS referral response.
It clutters the dmesg log unnecessarily at mount time.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com
Reviewed-by: Ronnie sahlberg <lsahlber@redhat.com>
2018-04-01 20:24:40 -05:00
Phillip Potter 31cd106bb1 fs: cifs: Replace _free_xid call in cifs_root_iget function
Modify end of cifs_root_iget function in fs/cifs/inode.c to call
free_xid(xid) instead of _free_xid(xid), thereby allowing debug
notification of this action when enabled.

Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: Steve French <smfrench@gmail.com>
2018-04-01 20:24:40 -05:00
Steve French d68f353fc9 SMB3.1.1 dialect is no longer experimental
SMB3.1.1 is a very important dialect, with much improved security.
We can remove the ExPERIMENTAL comments about it. It is widely
supported by servers.

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:40 -05:00
Steve French 6188f28bf6 Tree connect for SMB3.1.1 must be signed for non-encrypted shares
SMB3.1.1 tree connect was only being signed when signing was mandatory
but needs to always be signed (for non-guest users).

See MS-SMB2 section 3.2.4.1.1

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:40 -05:00
Ronnie Sahlberg 262916bc69 fix smb3-encryption breakage when CONFIG_DEBUG_SG=y
We can not use the standard sg_set_buf() fucntion since when
CONFIG_DEBUG_SG=y this adds a check that will BUG_ON for cifs.ko
when we pass it an object from the stack.

Create a new wrapper smb2_sg_set_buf() which avoids doing that particular check
and use it for smb3 encryption instead.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:40 -05:00
Gustavo A. R. Silva 70e80655f5 CIFS: fix sha512 check in cifs_crypto_secmech_release
It seems this is a copy-paste error and that the proper variable to use
in this particular case is _sha512_ instead of _md5_.

Addresses-Coverity-ID: 1465358 ("Copy-paste error")
Fixes: 1c6614d229e7 ("CIFS: add sha512 secmech")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <smfrench@gmail.com>
2018-04-01 20:24:40 -05:00
Aurelien Aptel 8bd68c6e47 CIFS: implement v3.11 preauth integrity
SMB3.11 clients must implement pre-authentification integrity.

* new mechanism to certify requests/responses happening before Tree
  Connect.
* supersedes VALIDATE_NEGOTIATE
* fixes signing for SMB3.11

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-04-01 20:24:40 -05:00
Aurelien Aptel 5fcd7f3f96 CIFS: add sha512 secmech
* prepare for SMB3.11 pre-auth integrity
* enable sha512 when SMB311 is enabled in Kconfig
* add sha512 as a soft dependency

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-04-01 20:24:39 -05:00
Aurelien Aptel 82fb82be05 CIFS: refactor crypto shash/sdesc allocation&free
shash and sdesc and always allocated and freed together.
* abstract this in new functions cifs_alloc_hash() and cifs_free_hash().
* make smb2/3 crypto allocation independent from each other.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:39 -05:00
Ronnie Sahlberg b7a73c84eb cifs: fix memory leak in SMB2_open()
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
2018-04-01 20:24:39 -05:00
Colin Ian King ac65cb6203 CIFS: SMBD: fix spelling mistake: "faield" and "legnth"
Trivial fix to spelling mistake in log_rdma_send and log_rdma_mr
message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2018-04-01 20:24:39 -05:00
David S. Miller c0b458a946 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c,
we had some overlapping changes:

1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE -->
   MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE

2) In 'net-next' params->log_rq_size is renamed to be
   params->log_rq_mtu_frames.

3) In 'net-next' params->hard_mtu is added.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 19:49:34 -04:00
Theodore Ts'o 54dd0e0a1b ext4: add extra checks to ext4_xattr_block_get()
Add explicit checks in ext4_xattr_block_get() just in case the
e_value_offs and e_value_size fields in the the xattr block are
corrupted in memory after the buffer_verified bit is set on the xattr
block.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2018-03-30 20:04:11 -04:00
David Sterba 57599c7e77 btrfs: lift errors from add_extent_changeset to the callers
The missing error handling in add_extent_changeset was hidden, so make
it at least visible in the callers.

Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:03:25 +02:00
Liu Bo f50f435390 Btrfs: print error messages when failing to read trees
When mount fails to read trees like fs tree, checksum tree, extent
tree, etc, there is not enough information about where went wrong.

With this, messages like

"BTRFS warning (device sdf): failed to read root (objectid=7): -5"

would help us a bit.

Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:02:14 +02:00
David Sterba 38e82de8cc btrfs: user proper type for btrfs_mask_flags flags
All users pass a local unsigned int and not the __uXX types that are
supposed to be used for userspace interfaces.

Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:01:07 +02:00
David Sterba 7e79cb86be btrfs: split dev-replace locking helpers for read and write
The current calls are unclear in what way btrfs_dev_replace_lock takes
the locks, so drop the argument, split the helpers and use similar
naming as for read and write locks.

Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:01:07 +02:00
David Sterba e7ab0af6c3 btrfs: remove stale comments about fs_mutex
The fs_mutex has been killed in 2008, a213501153 ("Btrfs: Replace
the big fs_mutex with a collection of other locks"), still remembered in
some comments.

We don't have any extra needs for locking in the ACL handlers.

Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:01:07 +02:00
David Sterba 88c14590cd btrfs: use RCU in btrfs_show_devname for device list traversal
The show_devname callback is used to print device name in
/proc/self/mounts, we need to traverse the device list consistently and
read the name that's copied to a seq buffer so we don't need further
locking.

If the first device is being deleted at the same time, the RCU will
allow us to read the device name, though it will become stale right
after the RCU protection ends. This is unavoidable and the user can
expect that the device will disappear from the filesystem's list at some
point.

The device_list_mutex was pretty heavy as it is used eg. for writing
superblock and a few other IO related contexts. This can stall any
application that reads the proc file for no reason.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:01:06 +02:00
David Sterba d1980131ca btrfs: update barrier in should_cow_block
Once there was a simple int force_cow that was used with the plain
barriers, and then converted to a bit, so we should use the appropriate
barrier helper.

Other variables in the complex if condition do not depend on a barrier,
so we should be fine in case the atomic barrier becomes a no-op.

Signed-off-by: David Sterba <dsterba@suse.com>
2018-03-31 02:01:06 +02:00