Commit Graph

2784 Commits

Author SHA1 Message Date
Linus Torvalds a867d7349e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull userns vfs updates from Eric Biederman:
 "This tree contains some very long awaited work on generalizing the
  user namespace support for mounting filesystems to include filesystems
  with a backing store.  The real world target is fuse but the goal is
  to update the vfs to allow any filesystem to be supported.  This
  patchset is based on a lot of code review and testing to approach that
  goal.

  While looking at what is needed to support the fuse filesystem it
  became clear that there were things like xattrs for security modules
  that needed special treatment.  That the resolution of those concerns
  would not be fuse specific.  That sorting out these general issues
  made most sense at the generic level, where the right people could be
  drawn into the conversation, and the issues could be solved for
  everyone.

  At a high level this patchset does a couple of simple things:

   - Add a user namespace owner (s_user_ns) to struct super_block.

   - Teach the vfs to handle filesystem uids and gids not mapping into
     kuids and kgids and being reported as INVALID_UID and
     INVALID_GID in vfs data structures.

  By assigning a user namespace owner filesystems that are mounted with
  only user namespace privilege can be detected.  This allows security
  modules and the like to know which mounts may not be trusted.  This
  also allows the set of uids and gids that are communicated to the
  filesystem to be capped at the set of kuids and kgids that are in the
  owning user namespace of the filesystem.

  One of the crazier corner cases this handles is the case of inodes
  whose i_uid or i_gid are not mapped into the vfs.  Most of the code
  simply doesn't care but it is easy to confuse the inode writeback path
  so no operation that could cause an inode write-back is permitted for
  such inodes (aka only reads are allowed).

  This set of changes starts out by cleaning up the code paths involved
  in user namespace permitted mounts.  Then when things are clean enough
  adds code that cleanly sets s_user_ns.  Then additional restrictions
  are added that are possible now that the filesystem superblock
  contains owner information.

  These changes should not affect anyone in practice, but there are some
  parts of these restrictions that are changes in behavior.

   - Andy's restriction on suid executables that does not honor the
     suid bit when the path is from another mount namespace (think
     /proc/[pid]/fd/) or when the filesystem was mounted by a less
     privileged user.

   - The replacement of the user namespace implicit setting of MNT_NODEV
     with implicitly setting SB_I_NODEV on the filesystem superblock
     instead.

     Using SB_I_NODEV is a stronger form that happens to make this state
     user invisible.  The user visibility can be managed but it caused
     problems when it was introduced from applications reasonably
     expecting mount flags to be what they were set to.

  There is a little bit of work remaining before it is safe to support
  mounting filesystems with backing store in user namespaces, beyond
  what is in this set of changes.

   - Verifying the mounter has permission to read/write the block device
     during mount.

   - Teaching the integrity modules IMA and EVM to handle filesystems
     mounted with only user namespace root and to reduce trust in their
     security xattrs accordingly.

   - Capturing the mounter's credentials and using that for permission
     checks in d_automount and the like.  (Given that overlayfs already
     does this, and we need the work in d_automount it makes sense to
     generalize this case).

  Furthermore there are a few changes that are on the wishlist:

   - Get all filesystems supporting posix acls using the generic posix
     acls so that posix_acl_fix_xattr_from_user and
     posix_acl_fix_xattr_to_user may be removed.  [Maintainability]

   - Reducing the permission checks in places such as remount to allow
     the superblock owner to perform them.

   - Allowing the superblock owner to chown files with unmapped uids and
     gids to something that is mapped so the files may be treated
     normally.

  I am not considering even obvious relaxations of permission checks
  until it is clear there are no more corner cases that need to be
  locked down and handled generically.

  Many thanks to Seth Forshee who kept this code alive, and put up
  with me rewriting substantial portions of what he did to handle more
  corner cases, and for his diligent testing and reviewing of my
  changes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits)
  fs: Call d_automount with the filesystems creds
  fs: Update i_[ug]id_(read|write) to translate relative to s_user_ns
  evm: Translate user/group ids relative to s_user_ns when computing HMAC
  dquot: For now explicitly don't support filesystems outside of init_user_ns
  quota: Handle quota data stored in s_user_ns in quota_setxquota
  quota: Ensure qids map to the filesystem
  vfs: Don't create inodes with a uid or gid unknown to the vfs
  vfs: Don't modify inodes with a uid or gid unknown to the vfs
  cred: Reject inodes with invalid ids in set_create_file_as()
  fs: Check for invalid i_uid in may_follow_link()
  vfs: Verify acls are valid within superblock's s_user_ns.
  userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS
  fs: Refuse uid/gid changes which don't map into s_user_ns
  selinux: Add support for unprivileged mounts from user namespaces
  Smack: Handle labels consistently in untrusted mounts
  Smack: Add support for unprivileged mounts from user namespaces
  fs: Treat foreign mounts as nosuid
  fs: Limit file caps to the user namespace of the super block
  userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag
  userns: Remove implicit MNT_NODEV fragility.
  ...
2016-07-29 15:54:19 -07:00
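The two ideas in the merge message above (a superblock that records its owning user namespace, and on-disk ids that fail to map being reported as an invalid id and then refused for writes) can be illustrated with a small userspace sketch. Everything prefixed with sketch_ is hypothetical and is not the kernel's API; the extent table only loosely imitates a uid_map, and the write check stands in for the "only reads are allowed" rule.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's kuid_t and INVALID_UID. */
typedef uint32_t sketch_kuid_t;
#define SKETCH_INVALID_UID ((sketch_kuid_t)-1)

/* One contiguous mapping range, loosely modelled on a uid_map line. */
struct sketch_uid_extent {
    uint32_t fs_first; /* first uid as stored by the filesystem */
    uint32_t k_first;  /* first corresponding kernel-wide uid */
    uint32_t count;    /* number of ids in the range */
};

struct sketch_user_ns {
    const struct sketch_uid_extent *extents;
    size_t nr_extents;
};

/* The superblock remembers which namespace owns the filesystem. */
struct sketch_super_block {
    const struct sketch_user_ns *s_user_ns;
};

/* Translate a filesystem uid; unmapped ids become SKETCH_INVALID_UID. */
sketch_kuid_t sketch_make_kuid(const struct sketch_user_ns *ns, uint32_t fs_uid)
{
    for (size_t i = 0; i < ns->nr_extents; i++) {
        const struct sketch_uid_extent *e = &ns->extents[i];

        if (fs_uid >= e->fs_first && fs_uid - e->fs_first < e->count)
            return e->k_first + (fs_uid - e->fs_first);
    }
    return SKETCH_INVALID_UID;
}

/* Inodes whose owner does not map are treated as read-only. */
bool sketch_may_write_inode(const struct sketch_super_block *sb, uint32_t fs_uid)
{
    return sketch_make_kuid(sb->s_user_ns, fs_uid) != SKETCH_INVALID_UID;
}

/* Example data: filesystem uids 0..65535 map to 100000..165535. */
const struct sketch_uid_extent sketch_demo_extents[] = { { 0, 100000, 65536 } };
const struct sketch_user_ns sketch_demo_ns = { sketch_demo_extents, 1 };
const struct sketch_super_block sketch_demo_sb = { &sketch_demo_ns };
```

With the example data, filesystem uid 1000 translates to 101000 and may be written, while uid 70000 falls outside the range, translates to the invalid id, and is refused.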
Linus Torvalds 1c88e19b0f Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "The rest of MM"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (101 commits)
  mm, compaction: simplify contended compaction handling
  mm, compaction: introduce direct compaction priority
  mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations
  mm, page_alloc: make THP-specific decisions more generic
  mm, page_alloc: restructure direct compaction handling in slowpath
  mm, page_alloc: don't retry initial attempt in slowpath
  mm, page_alloc: set alloc_flags only once in slowpath
  lib/stackdepot.c: use __GFP_NOWARN for stack allocations
  mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
  mm, kasan: account for object redzone in SLUB's nearest_obj()
  mm: fix use-after-free if memory allocation failed in vma_adjust()
  zsmalloc: Delete an unnecessary check before the function call "iput"
  mm/memblock.c: fix index adjustment error in __next_mem_range_rev()
  mem-hotplug: alloc new page from a nearest neighbor node when mem-offline
  mm: optimize copy_page_to/from_iter_iovec
  mm: add cond_resched() to generic_swapfile_activate()
  Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements"
  mm, compaction: don't isolate PageWriteback pages in MIGRATE_SYNC_LIGHT mode
  mm: hwpoison: remove incorrect comments
  make __section_nr() more efficient
  ...
2016-07-28 16:36:48 -07:00
Mel Gorman 11fb998986 mm: move most file-based accounting to the node
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounting to the node.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Linus Torvalds 554828ee0d Merge branch 'salted-string-hash'
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.

That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.

It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.

Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.

* merge branch 'salted-string-hash':
  fs/dcache.c: Save one 32-bit multiply in dcache lookup
  vfs: make the string hashes salt the hash
2016-07-28 12:26:31 -07:00
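As a rough illustration of mixing the salt in at the beginning rather than at the end, the sketch below seeds the hash state from the salt pointer before any character is processed. This is not the kernel's hash function: the FNV-1a style constants and the name sketch_salted_hash are assumptions chosen only to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical salted string hash. The salt (for a dentry, the parent
 * pointer) is folded into the initial state, so the salt-dependent part
 * is known before the name is seen. A NULL salt gives a hash that
 * depends on the string alone.
 */
uint32_t sketch_salted_hash(const void *salt, const char *name, size_t len)
{
    uint64_t h = 14695981039346656037ULL ^ (uint64_t)(uintptr_t)salt;

    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)name[i];
        h *= 1099511628211ULL;
    }
    /* Fold the 64-bit state down to 32 bits. */
    return (uint32_t)(h ^ (h >> 32));
}
```

Because the salt only affects the starting state, a caller that hashes many names under the same parent could compute that starting state once, which is the kind of saving the merge message describes.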
Eric W. Biederman 0d4d717f25 vfs: Verify acls are valid within superblock's s_user_ns.
Update posix_acl_valid to verify that an acl is within a user namespace.

Update the callers of posix_acl_valid to pass in an appropriate
user namespace.  For posix_acl_xattr_set and v9fs_xattr_set_acl pass in
inode->i_sb->s_user_ns to posix_acl_valid.  For md_unpack_acl pass in
&init_user_ns as no inode or superblock is in sight.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-06-30 18:04:58 -05:00
Andreas Dilger 9936913e54 staging: lustre: quiet lockdep recursive lock warning
Lockdep complains about potential recursive locking during mount
because the client configuration log is holding a lock on the MGC
obd_device to prevent it from being torn down, while also getting
mutexes on the MDC and OSC devices as they are instantiated:

 Lustre: Mounted myth-client
=============================================
[ INFO: possible recursive locking detected ]
4.7.0-rc2-vm-nfs+ #127 Tainted: G         C
---------------------------------------------

 May be due to missing lock nesting notation
2 locks held by ll_cfg_requeue/5928:
 #0:  (&cli->cl_sem){.+.+.+}, at: mgc_requeue_thread+0x15d/0x730 [mgc]
 #1:  (&cld->cld_lock){+.+.+.}, at: mgc_process_log+0x5e/0xf80 [mgc]
CPU: 0 PID: 5928 Comm: ll_cfg_requeue
Call Trace:
 [<ffffffff814a0855>] dump_stack+0x86/0xc1
 [<ffffffff810e7766>] __lock_acquire+0x726/0x1210
 [<ffffffff810e86be>] lock_acquire+0xfe/0x1f0
 [<ffffffff81888171>] down_read+0x51/0xa0
 [<ffffffffa04a8477>] sptlrpc_conf_client_adapt+0x47/0x150 [ptlrpc]
 [<ffffffffa0186b16>] mdc_set_info_async+0x2b6/0x470 [mdc]
 [<ffffffffa0294090>] class_notify_sptlrpc_conf+0x190/0x360 [obdclass]
 [<ffffffffa01a9e85>] mgc_process_log+0x925/0xf80 [mgc]
 [<ffffffffa01abafa>] mgc_requeue_thread+0x1fa/0x730 [mgc]
 [<ffffffff810af331>] kthread+0x101/0x120
 [<ffffffff8188ad6f>] ret_from_fork+0x1f/0x40

Add a separate lock class for the MGC callpath, since it will always
be held first, and none of the other obd_device locks should ever
be held concurrently.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin 281a8273f6 staging/lustre/libcfs: Do not call kthread_run in wrong state
kthread_run might sleep during an allocation, and so
it's considered unsafe to call with a state that's not
RUNNABLE.
Move the state setting to after kthread_run call.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Andriy Skulysh 025fd3c20b staging/lustre/osc: glimpse lock should match only with granted locks
A deadlock is possible during ccc_prep_size()->ldlm_lock_match() vs
cl_io_lock() which is waiting for a matched lock and conflicts with
already taken lock before ccc_prep_size().

It is better to send an additional lock request to avoid deadlock.

Seagate-bug-id: MRP-3312
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/18738
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7829
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin cb96191ff4 staging/lustre: Add documentation for unstable_stats in sysfs
commit ac5b148109 ("staging: lustre: osc: Track and limit
"unstable" pages") added a new sysfs variable, but the corresponding bit of
documentation was forgotten.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Dmitry Eremin c32090fce9 staging/lustre/osc: fix signed one bit field
Bit fields 'oi_lockless' and 'oi_is_active' have one bit each and are signed,
which is confusing.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/19196
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7258
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
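Why a signed one-bit field is confusing can be shown with a short sketch. The struct and function names are hypothetical; the point is that, on two's-complement targets, a signed field of width one can hold only 0 and -1, so a flag stored there never compares equal to 1, while the unsigned variant holds 0 and 1 as expected.

```c
/* Hypothetical flag holders, not the Lustre structures. */
struct sketch_flags_signed {
    signed int active : 1;   /* representable values: 0 and -1 */
};

struct sketch_flags_unsigned {
    unsigned int active : 1; /* representable values: 0 and 1 */
};

/* Setting the unsigned flag to 1 reads back as 1. */
int sketch_unsigned_flag_after_set(void)
{
    struct sketch_flags_unsigned f = { 0 };

    f.active = 1;
    return f.active;
}

/* The only nonzero value a signed one-bit field can hold is -1. */
int sketch_signed_flag_nonzero_value(void)
{
    struct sketch_flags_signed f = { 0 };

    f.active = -1;
    return f.active;
}
```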
akam kumar bharathi d780846e0f staging/lustre/llite: IOC_MDC_GETFILEINFO returns the wrong ino
req_capsule_server_get() through  __req_capsule_get in ll_dir_ioctl()
returns a pointer to a PTLRPC request or reply buffer, which is assigned
to struct mdt_body.

If the command is IOC_MDC_GETFILEINFO then the inode "st.st_ino" should
be assigned from one extracted from mdt_body through cl_fid_build_ino().

Signed-off-by: John Hammond <john.hammond@intel.com>
Signed-off-by: akam kumar bharathi <azurelustre@gmail.com>
Reviewed-on: http://review.whamcloud.com/17618
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5954
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin aae5d55a24 staging/lustre/llite: ll_revalidate_dentry update
There are a couple of cases in ll_revalidate_dentry() where
we are pretty sure the dentry is valid, so check for them early
and save more expensive checks for later.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin 6dad4d8903 staging/lustre/llite: Restore proper opencache operations
Mark dentries that came to us via NFS in a special way so that
we can tell them apart during open and activate open cache
(we really don't want to do open/close RPC for every NFS IO).

This became needed since dentry revalidate no longer reimplements
any RPCs for lookup, and as such if a dentry is valid,
ll_revalidate_dentry returns 1 and ll_lookup_it() is never visited
during opens; we get straight into ll_file_open() without a valid
intent/RPC. This used to be only true for NFS, so opencache was
engaged needlessly, and it carries a cost of its own if there is
in fact no repetitive file opening-closing going on.

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/20354
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8019
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Sergey Cheremencev c681528a2b staging/lustre/llite: don't panic when fid is insane
LASSERT should never be done on data that is
received over the network. Return EINVAL
when the server returns an invalid fid despite
it_status == 0.

Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Seagate-bug-id: MRP-3073
Reviewed-on: http://review.whamcloud.com/17985
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7422
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
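The pattern described in the commit above (validate data that arrived over the network and return an error instead of asserting) might look roughly like the following. The struct layout, the sanity rule, and the function names are all hypothetical; the real fid checks in Lustre are more involved.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified file identifier. */
struct sketch_fid {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

/* Assumed sanity rule for illustration: sequence and object id are nonzero. */
int sketch_fid_is_sane(const struct sketch_fid *fid)
{
    return fid != NULL && fid->seq != 0 && fid->oid != 0;
}

/*
 * A reply can report success (it_status == 0) and still carry a bad fid.
 * That is remote input, so report -EINVAL to the caller rather than
 * crashing the client with an assertion.
 */
int sketch_handle_reply(int it_status, const struct sketch_fid *fid)
{
    if (it_status != 0)
        return it_status;
    if (!sketch_fid_is_sane(fid))
        return -EINVAL;
    return 0;
}
```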
Niu Yawei aea7ccd985 staging/lustre/mdc: Zero atime in close RPC
While atime on close is supposed to only increase, there's
a bug in some older server versions where atime from a client
is taken no matter the value that allows a stale client atime
to overwrite a correct value.

Update atime in close rpc to 0 to help such servers out.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-on: http://review.whamcloud.com/19932
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8041
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Yang Sheng 2323d6d837 staging/lustre/llite: ensure obd is effective in onu_upcall
The watched obd device may still not be set up when onu_upcall
is invoked, so we need to verify it in cl_ocd_update.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-on: http://review.whamcloud.com/19597
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8027
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
John L. Hammond e8beaf670d staging/lustre/ldlm: const qualify struct lustre_handle * params
Add a const qualifier to several struct lustre_handle * parameters in
the LDLM interface.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/17071
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7403
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
John L. Hammond 8bf86fd957 staging/lustre/llite: change it_data to it_request
Change the void *it_data member of struct lookup_intent to struct
ptlrpc_request *it_request.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/17070
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7403
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin 51b39f1d36 staging/lustre: Inline Lustre intent disposition functions
They are just one-liners, so no point in having them exported
and called through a different module.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
John L. Hammond e476f2e55a staging/lustre/llite: flatten struct lookup_intent
Replace the union in struct lookup_intent with the members of struct
lustre_intent_data. Remove the then unused struct lustre_intent_data.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/17069
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7403
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Bob Glossman d55d5e8f49 staging/lustre: Add newline to LU_OBJECT_DEBUG() message
LU_OBJECT_DEBUG expects a non \n terminated message from the caller,
so it should add its own to keep the debug logger happy.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: http://review.whamcloud.com/19960
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8094
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Alex Zhuravlev e93876dd73 staging/lustre: LDLM_DEBUG() shouldn't be passed \n
as it adds its own \n, so any extra \n breaks the log format.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17494
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7521
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Patrick Farrell 0e5fd06ca0 staging/lustre/llite: take trunc_sem only at vvp layer
The lli_trunc_sem is taken in 'read' mode in both
ll_page_mkwrite and vvp_io_fault_start. This can lead to a
deadlock with another thread which asks for the semaphore
in write mode between these two read calls.

Since all users of lli_trunc_sem are in the vvp layer, we
can satisfy the requirement to exclude truncate by taking
the semaphore only in vvp_io_fault_start.

Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: http://review.whamcloud.com/19315
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7981
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Vitaly Fertman 81ea39ecf9 staging/lustre/ptlrpc: lost bulk leads to a hang
The reverse order of request_out_callback() and reply_in_callback()
puts the RPC into UNREGISTERING state, which is waiting for RPC &
bulk md unlink, whereas only RPC md unlink has been called so far.
If bulk is lost, even expired_set does not check for UNREGISTERING
state.

The same for write if server returns an error.

This phase is ambiguous, split to UNREG_RPC and UNREG_BULK.

Signed-off-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Seagate-bug-id:  MRP-2953, MRP-3206
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-by: Alexey Leonidovich Lyashkov <alexey.lyashkov@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: http://review.whamcloud.com/19953
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Ben Evans 63a46519f2 staging/lustre/ptlrpc: Remove __ptlrpc_request_bufs_pack
Combine __ptlrpc_request_bufs_pack into ptlrpc_request_bufs_pack
because it was an unnecessary wrapper otherwise.

Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: http://review.whamcloud.com/16765
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7269
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Vitaly Fertman af38abfcd9 staging/lustre/ptlrpc: Early Reply vs Reply MDunlink
A race between unregister_reply & early reply.
When buffers are busy for the early transfer, they cannot be unlinked
by unregister_reply, so the RPC gets into UNREGISTERING state. The
coming reply_in_callback for the early RPC already has unlinked flag
set due to previous mdunlink attempt, but we handle it properly only
for UNLINK event, whereas this is PUT in this case.

Signed-off-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Seagate-bug-id: MRP-3323
Reviewed-by: Alexey Leonidovich Lyashkov <alexey.lyashkov@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Tested-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Reviewed-on: http://review.whamcloud.com/18934
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7434
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Liang Zhen 9faa2ade3f staging/lustre/ptlrpc: missing wakeup for ptlrpc_check_set
This patch changes a few things:

- There is no guarantee that request_out_callback will happen
  before reply_in_callback, if a request got reply and unlinked
  reply buffer before request_out_callback is called, then the
  thread waiting on ptlrpc_request_set will miss wakeup event.

  This may seriously impact performance of some IO workloads or
  result in RPC timeout

- To make the code easier to understand, this patch changes
  action-bits "rq_req_unlink" and "rq_reply_unlink" to
  status-bits "rq_req_unlinked" and "rq_reply_unlinked"

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-on: http://review.whamcloud.com/12158
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5696
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Liang Zhen 32c8728d87 staging/lustre/ptlrpc: reorganize ptlrpc_request
ptlrpc_request has some structure members that are only for the client
side, and some others that are only for the server side. This patch moves
these members into different structures and then puts them into a union.

By doing this, the size of ptlrpc_request is decreased by about 300 bytes.
Besides saving memory, it can also reduce the memory footprint while
processing.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-on: http://review.whamcloud.com/8806
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-181
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
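The size saving from the reorganization above comes from a general property of unions: members that are never live at the same time can share storage. The sketch below uses made-up member sizes and hypothetical names, not the real ptlrpc_request layout.

```c
#include <stddef.h>

/* Hypothetical client-only and server-only state. */
struct sketch_cli_part {
    char cli_state[200];
};

struct sketch_srv_part {
    char srv_state[120];
};

/* Before: every request carries both parts. */
struct sketch_request_flat {
    int rq_type;
    struct sketch_cli_part cli;
    struct sketch_srv_part srv;
};

/* After: a request is either client-side or server-side, never both. */
struct sketch_request_union {
    int rq_type;
    union {
        struct sketch_cli_part cli;
        struct sketch_srv_part srv;
    } u;
};

/* Bytes saved per request by sharing the storage. */
size_t sketch_bytes_saved(void)
{
    return sizeof(struct sketch_request_flat) -
           sizeof(struct sketch_request_union);
}
```

The union is as large as its largest member, so the smaller part no longer adds to the size of every request.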
Oleg Drokin fbe0456482 staging/lustre/osc: Fix reverted condition in osc_lock_weight
When importing the clio simplification patch, the check for
the object got reversed by mistake: when converting from
if (obj == NULL) it somehow became if (obj), which is obviously wrong,
and so when it does hit, a crash was happening as a result.

Fix the condition and all is fine now.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Jinshan Xiong a6307ff9aa staging/lustre/osc: osc_lock_weight endless loop fix
With a huge number of pages to scan by osc_lock_weight() it is likely
CLP_GANG_RESCHED is returned from osc_page_gang_lookup() and the scan
will be repeated again from the start. To be sure that the scan is
progressing across those restarts, next scan should be started from
the last scanned page index plus one.

Xyratex-bug-id: MRP-2145
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/12362
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5781
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
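The fix described above, resuming a restarted scan from the last scanned index plus one, can be sketched as follows. The names, the batch size of 4, and the simplified lookup function are assumptions; the sketch only models a lookup that gives up after a batch and asks to be called again.

```c
#include <stddef.h>

#define SKETCH_BATCH 4 /* hypothetical number of pages per batch */

enum sketch_scan_result { SKETCH_SCAN_DONE, SKETCH_SCAN_RESCHED };

/*
 * Visit page indices from start to end inclusive. After SKETCH_BATCH
 * pages the lookup stops early and reports the last index it reached.
 */
enum sketch_scan_result sketch_gang_lookup(size_t start, size_t end,
                                           size_t *last, size_t *visited)
{
    size_t n = 0;

    for (size_t i = start; i <= end; i++) {
        *last = i;
        (*visited)++;
        if (++n == SKETCH_BATCH && i < end)
            return SKETCH_SCAN_RESCHED;
    }
    return SKETCH_SCAN_DONE;
}

/* Returns how many page visits the whole scan needed. */
size_t sketch_scan_range(size_t start, size_t end)
{
    size_t visited = 0;
    size_t last = start;

    while (sketch_gang_lookup(start, end, &last, &visited) ==
           SKETCH_SCAN_RESCHED)
        start = last + 1; /* make progress: never rescan from the beginning */

    return visited;
}
```

Each index is visited exactly once, so the number of visits equals the size of the range. If the restart used the original start instead of last + 1, any range longer than one batch would loop forever.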
Bruno Faccini c9cc8d0f6f staging/lustre/llite: lock i_lock before __d_drop()
There have been several Lustre Client crashes reported by sites
running with Lustre versions 2.1/2.5, all showing the same
dentry->d_hash->next corrupted pointer cause.

This patch fixes a regression that was introduced a
long time ago by commit:
(LU-506 kernel: FC15 - support dcache scalability changes.)

where i_lock protection usage was removed, which
is likely to cause a race condition during dentry [un]hashing
and to be the root cause of these crashes.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Reviewed-on: http://review.whamcloud.com/19287
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7973
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Oleg Drokin 2bbec0ed2c staging/lustre/llite: Get rid of ll_lock_dcache/ll_unlock_dcache
These are just doing spin_lock/unlock on inode's i_lock,
so just do the spinlock directly to make the code more clear

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
John L. Hammond 411c9699df staging/lustre/llite: correct request handling after ll_lookup_it()
In the FIFO cases of ll_atomic_open() and ll_lookup_nd() remove
spurious calls to ptlrpc_req_finished(). Explain that these cases are
unreachable in practice anyway.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: http://review.whamcloud.com/17068
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7402
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
Emoly Liu 1b02bde3bc staging/lustre/llite: allocate and free client cache asynchronously
Since the inflight request holds import refcount as well as export,
sometimes obd_disconnect() in client_common_put_super() can't put
the last refcount of OSC import (e.g. due to network disconnection),
this will cause cl_cache being accessed after free.

To fix this issue, ccc_users is used as cl_cache refcount, and
lov/llite/osc all hold one cl_cache refcount respectively, to avoid
the race that a new OST is being added into the system when the client
is mounted.
The following cl_cache functions are added:
- cl_cache_init(): allocate and initialize cl_cache
- cl_cache_incref(): increase cl_cache refcount
- cl_cache_decref(): decrease cl_cache refcount and free the cache
  if refcount=0.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/13746
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6173
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:28:39 -07:00
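The init/incref/decref scheme listed in the commit above is a standard reference-counting pattern. A minimal userspace sketch, assuming C11 atomics and hypothetical sketch_ names (the real cl_cache functions take different arguments and live in kernel code), could look like this:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical cache object; only the user count matters here. */
struct sketch_cache {
    atomic_int users;
};

/* Allocate the cache with one reference held by the creator. */
struct sketch_cache *sketch_cache_init(void)
{
    struct sketch_cache *c = malloc(sizeof(*c));

    if (c)
        atomic_init(&c->users, 1);
    return c;
}

void sketch_cache_incref(struct sketch_cache *c)
{
    atomic_fetch_add(&c->users, 1);
}

/* Drop one reference; returns 1 if this call freed the cache. */
int sketch_cache_decref(struct sketch_cache *c)
{
    if (atomic_fetch_sub(&c->users, 1) == 1) {
        free(c);
        return 1;
    }
    return 0;
}

/*
 * Three holders (standing in for llite, lov and osc) each keep one
 * reference. Returns which decref call freed the cache, or -1 if the
 * allocation failed.
 */
int sketch_demo_three_holders(void)
{
    struct sketch_cache *c = sketch_cache_init();

    if (!c)
        return -1;
    sketch_cache_incref(c);
    sketch_cache_incref(c);

    for (int call = 1; call <= 3; call++)
        if (sketch_cache_decref(c))
            return call;
    return -1;
}
```

The cache outlives any single holder: it is released only by whichever holder drops the last reference, which is what prevents the use-after-free the commit describes.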
Fan Yong 341f1f0aff staging: lustre: remove remote client support
There are several obsolete sub commands for lfs to work with
remote client. We do not support that anymore, so they should be
deleted along with any kernel code related to remote client.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6971
Reviewed-on: http://review.whamcloud.com/19789
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 14:26:02 -07:00
Greg Kroah-Hartman af52739b92 Merge 4.7-rc4 into staging-next
We want the fixes in here, and we can resolve a merge issue in
drivers/iio/industrialio-trigger.c

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-20 08:25:44 -07:00
Oleg Drokin 25ed6a5e97 staging/lustre: Update FID documentation link.
When OpenSFS took over lustre.org, there was some reshuffling.
The FIDs on ZFS document is now at
http://wiki.old.lustre.org/index.php/Architecture_-_Interoperability_fids_zfs
instead of the old location, so update comments accordingly.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reported-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-18 21:18:43 -07:00
Oleg Drokin 9797fb0e25 staging/lustre: Remove unnecessary space after a cast
This patch fixes all checkpatch occurrences of
"CHECK: No space is necessary after a cast"
in Lustre code.
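
As a small illustration of the style this check asks for (the helper
below is hypothetical and not taken from the Lustre tree), the cast is
written directly against its operand:

```c
#include <stdint.h>

/* Hypothetical helper used only to show the cast style. */
uint32_t low_word(uint64_t value)
{
	/* checkpatch would flag "(uint32_t) value"; no space follows the cast. */
	return (uint32_t)value;
}
```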

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-18 21:18:43 -07:00
Emoly Liu d719d2ddd1 staging/lustre: Keep logical continuations on the previous line
This patch fixes all checkpatch occurrences of
"CHECK: Logical continuations should be on the previous line"
in Lustre code.
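
A minimal sketch of the preferred layout, using a hypothetical function
that is not part of the Lustre code: each logical operator ends the line
it continues from rather than starting the next one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounds check used only to show the line-break style. */
bool range_is_valid(const char *buf, size_t start, size_t end, size_t len)
{
	/*
	 * Flagged form puts the operator at the start of the next line:
	 *	return buf
	 *	       && start <= end;
	 */
	return buf &&
	       start <= end &&
	       end <= len;
}
```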

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-18 21:18:43 -07:00
Emoly Liu f1b91de88a staging/lustre: Fix blank line before EXPORT_SYMBOL()
This patch fixes one checkpatch warning in lustre:
WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-18 21:18:43 -07:00
Oleg Drokin 82ce9365ce staging/lustre/libcfs: Remove "Please contact Oracle" from header
The "Please contact Oracle Corporation" lines are removed: Oracle no
longer has anything to do with Lustre, and the header already points
to the GPL in a way that is independent of any particular company.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:37:21 -07:00
Oleg Drokin 837e4e6e51 staging/lustre: Remove stray line from selftest/selftest.h
The 'copy of GPLv2]' line is the tail end of a template that is no longer
needed, so remove it to avoid confusion.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:37:21 -07:00
Oleg Drokin 06894eed50 staging/lustre/lov: Fix gpl URL in lov_pool.c
There's no longer a matching sun.com URL, so refer to
the gnu.org copy.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:37:21 -07:00
Oleg Drokin 6a5b99a46b staging/lustre: Replace sun.com GPLv2 URL with gnu.org one.
http://www.sun.com/software/products/lustre/docs/GPLv2.pdf is no
longer around, so replace it with the (hopefully more permanent)
http://www.gnu.org/licenses/gpl-2.0.html

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reported-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:37:21 -07:00
Oleg Drokin ed4df35478 staging/lustre: Remove the "Please contact SUN for GPL" from headers
Since SUN is no longer around and there's no point in contacting them,
just remove that whole thing. Copy of GPL is available online anyway
(URLs to be updated in next patch).

This patch was generated with:
find drivers/staging/lustre -name "*.[ch]" -exec perl -0777 -i -pe 's/ \* Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,\n \* CA 95054 USA or visit www.sun.com if you need additional information or\n \* have any questions.\n \*\n//igs' {} \;

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reported-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:37:21 -07:00
James Simmons ff13fd40f2 staging: lustre: socklnd: remove typedefs
Remove all remaining typedefs in socklnd driver.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:35:59 -07:00
James Simmons 8d9de3f485 staging: lustre: o2iblnd: remove typedefs
Remove all remaining typedefs in o2iblnd driver.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:35:59 -07:00
Nathaniel Clark 77447a863f staging/lustre/lmv: Fix Multiple Assignments
Fix all multiple assignments in the lustre/lmv directory.
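
A hedged sketch of what such a fix typically looks like; the struct and
function are hypothetical and not taken from lmv. A chained assignment is
split into one statement per variable:

```c
/* Hypothetical type used only to show the style change. */
struct span_sketch {
	int first;
	int last;
};

void span_reset(struct span_sketch *span)
{
	/* Flagged form: span->first = span->last = 0; */
	span->first = 0;
	span->last = 0;
}
```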

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:32:34 -07:00
Nathaniel Clark 3d2b8f5719 staging/lustre/ptlrpc: Fix Multiple Assignments
Fix all multiple assignments in the lustre/ptlrpc directory.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:32:34 -07:00
Nathaniel Clark 89c6036497 staging/lustre/obdclass: Fix Multiple Assignments
Fix all multiple assignments in the lustre/obdclass directory.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-06-17 20:32:34 -07:00