linux

Commit Graph

Author	SHA1	Message	Date
Vyacheslav Dubeyko	aebe17f684	nilfs2: add /sys/fs/nilfs2/features group This patchset implements creation of sysfs groups and attributes with the purpose to show NILFS2 volume details, internal state of the driver and to manage internal state of NILFS2 driver. Sysfs is a virtual file system that exports information about devices and drivers from the kernel device model to user space, and is also used for configuration. NILFS2 is a complex file system that has segctor thread, GC thread, checkpoint/snapshot model and so on. Sysfs namespace provides native and easy way for: (1) getting info and statistics about volume state; (2) getting info and configuration of internal subsystems (segctor thread); (3) snapshots management. Suggested patchset provides basis for managing segctor thread behaviour and manipulation by snapshots. Currently, it informs only about segctor thread's internal parameters and about mounted snapshots. But sysfs interface can provide easy and simple way for deep management of segctor thread and snapshots. This patchset provides opportunity to manage interval of periodical update of superblock (in seconds). Default value is 10 seconds. Now a user can increase this value by means of nilfs2/<device>/superblock/sb_update_frequency attribute in the case of necessity. Also the patchset provides opportunity to get information easily about key volume's parameters (free blocks, superblock write count, superblock update frequency, latest segment info, dirty data blocks count, count of clean segments, count of dirty segments and so on) in a real-time manner. Such information can be used in scripts for subtle management of filesystem. Implemented functionality creates the following groups: (1) /sys/fs/nilfs2 - root group (2) /sys/fs/nilfs2/features - group contains attributes that describe NILFS file system driver features (3) /sys/fs/nilfs2/<device> - group contains attributes that describe file system partition's details (4) /sys/fs/nilfs2/<device>/superblock - group contains attributes that describe superblock's details (5) /sys/fs/nilfs2/<device>/segctor - group contains attributes that describe segctor thread activity details (6) /sys/fs/nilfs2/<device>/segments - group contains attributes that describe details about volume's segments (7) /sys/fs/nilfs2/<device>/checkpoints - group contains attributes that describe details about volume's checkpoints (8) /sys/fs/nilfs2/<device>/mounted_snapshots - group contains group for every mounted snapshot (9) /sys/fs/nilfs2/<device>/mounted_snapshots/<snapshot> - group contains details about mounted snapshot This patch (of 9): This patch adds code that creates the /sys/fs/nilfs2 group and the /sys/fs/nilfs2/features group. The features group contains attributes that describe NILFS file system driver features: (1) revision - show current revision of NILFS file system driver. There are two formats of timestamp output - seconds and human-readable format. Every shown timestamp has two sysfs files (time-<xxx> and time-<xxx>-secs). One sysfs file (time-<xxx>) shows time in human-readable format. Another sysfs file (time-<xxx>-secs) shows time in seconds. It was reported by Michael Semon that timestamp output in human-readable format should be changed from "2014-4-12 14:5:38" to "2014-04-12 14:05:38". Second version of the patch fixes this issue. Reported-by: Michael L. Semon <mlsemon35@gmail.com> Signed-off-by: Vyacheslav Dubeyko <Vyacheslav.Dubeyko@hgst.com> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:20 -07:00
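The timestamp change reported for the nilfs2 patch (from "2014-4-12 14:5:38" to "2014-04-12 14:05:38") comes down to zero-padding each field. The following is a minimal userspace sketch of that idea, not the kernel patch itself; `format_timestamp` is a hypothetical helper name introduced only for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the nilfs2 patch): formats a broken-down
 * time with zero-padded month, day, hour, minute and second fields.
 */
static int format_timestamp(char *buf, size_t len,
                            int year, int mon, int day,
                            int hour, int min, int sec)
{
    /* %02d pads single-digit values to two digits, e.g. 4 becomes "04". */
    return snprintf(buf, len, "%d-%02d-%02d %02d:%02d:%02d",
                    year, mon, day, hour, min, sec);
}
```

With a plain `%d` in place of `%02d`, single-digit fields would print without the leading zero, which is the format the report asked to change.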
Fabian Frederick	834b46c37a	fs/coda: use linux/uaccess.h Fix checkpatch warning WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h> Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:20 -07:00
Fabian Frederick	8e19189ef8	fs/befs/linuxvfs.c: check superblock before dump operation befs_dump_super_block was called between befs_load_sb and befs_check_sb. It has been reported to crash (5/900) with null block testing. This patch loads, checks and only dump superblock if it's a valid one then brelse bh. (befs_dump_super_block uses disk_sb (bh->b_data) so it seems we need to call it before brelse(bh) but I don't know why befs_check_sb was called after brelse. Another thing I don't understand is why this problem appears now). Signed-off-by: Fabian Frederick <fabf@skynet.be> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:20 -07:00
Qi Yong	6d6747f853	minix zmap block counts calculation fix The original minix zmap blocks calculation was correct, in the formula of: sbi->s_nzones - sbi->s_firstdatazone + 1 It is sp->s_zones - (sp->s_firstdatazone - 1) in the minix3 source code. But a later commit `016e8d44bc` ("fs/minix: Verify bitmap block counts before mounting") unfortunately changed it to: sbi->s_nzones - (sbi->s_firstdatazone + 1) This would show one free block fewer than the real count when the total data blocks are in "full zmap blocks plus one". This patch corrects that zmap blocks calculation and tidies a printk message while at it. Signed-off-by: Qi Yong <qiyong@fc-cn.com> Cc: Josh Boyer <jwboyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:20 -07:00
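The two formulas quoted in the minix commit differ by exactly two, which is enough to cross a bitmap-block boundary. The sketch below only illustrates that arithmetic: the function names, the rounding helper `bitmap_blocks`, and the sample values are assumptions, not code from fs/minix.

```c
/* Zone count as in the original formula: nzones - firstdatazone + 1. */
static unsigned long zones_original(unsigned long nzones,
                                    unsigned long firstdatazone)
{
    return nzones - firstdatazone + 1;
}

/* The regressed form: nzones - (firstdatazone + 1), two less than above. */
static unsigned long zones_regressed(unsigned long nzones,
                                     unsigned long firstdatazone)
{
    return nzones - (firstdatazone + 1);
}

/* Hypothetical helper: bitmap blocks needed to hold `bits` bits, rounded up. */
static unsigned long bitmap_blocks(unsigned long bits,
                                   unsigned long bits_per_block)
{
    return (bits + bits_per_block - 1) / bits_per_block;
}
```

Assuming 8192 bits per bitmap block, `nzones = 8200` and `firstdatazone = 8` give 8193 zones under the original formula (two bitmap blocks) but 8191 under the regressed one (a single block), which is the "full zmap blocks plus one" situation the commit describes.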
NeilBrown	3b97dd0581	autofs4: comment typo: remove a a doubled word Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:19 -07:00
NeilBrown	bdac38329e	autofs4: remove some unused inline functions {__,}manage_dentry_{set,clear}_{automount,transit} are 4 unused inline functions. Discard them. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:19 -07:00
NeilBrown	668128e90b	autofs4: don't take spinlock when not needed in autofs4_lookup_expiring If the expiring_list is empty, we can avoid a costly spinlock in the rcu-walk path through autofs4_d_manage (once the rest of the path becomes rcu-walk friendly). Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:19 -07:00
NeilBrown	c312442fe3	autofs4: remove a redundant assignment The variable 'ino' already exists and already has the correct value. The d_fsdata of a dentry is never changed after the d_fsdata is instantiated, so this new assignment cannot be necessary. It was introduced in commit `b5b801779d` ("autofs4: Add d_manage() dentry operation"). Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:19 -07:00
NeilBrown	26b7a54a35	autofs4: remove unused autofs4_ispending() Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:18 -07:00
Fabian Frederick	6f4535ed7d	fs/ramfs/file-nommu.c: replace count*size kzalloc by kcalloc Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Axel Lin <axel.lin@ingics.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:18 -07:00
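The kzalloc-to-kcalloc conversions in this log matter because an array allocator can reject a `count * size` product that would overflow. The block below is a userspace analogue of that check, assuming standard C only; `checked_zalloc_array` is a hypothetical name and is not the kernel's kcalloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a kcalloc-style helper: returns
 * zeroed memory for n elements of `size` bytes, or NULL if the product
 * n * size would overflow size_t.
 */
static void *checked_zalloc_array(size_t n, size_t size)
{
    void *p;

    if (size != 0 && n > SIZE_MAX / size)
        return NULL;            /* n * size does not fit in size_t */

    p = malloc(n * size);
    if (p)
        memset(p, 0, n * size); /* mirror the zeroing behaviour */
    return p;
}
```

An open-coded `malloc(n * size)` would silently wrap around for large `n` and return a buffer that is too small; doing the division check first avoids that.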
Fabian Frederick	ca35664031	fs/efs/namei.c: return is not a function Fix checkpatch errors: "ERROR: return is not a function, parentheses are not required" Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-08 15:57:18 -07:00
Al Viro	c7f3888ad7	switch iov_iter_get_pages() to passing maximal number of pages ... instead of maximal size. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:11 -04:00
Fengguang Wu	49c7dd287a	fs: mark __d_obtain_alias static Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:11 -04:00
J. Bruce Fields	95ad5c2913	dcache: d_splice_alias should detect loops I believe this can only happen in the case of a corrupted filesystem. So -EIO looks like the appropriate error. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:11 -04:00
J. Bruce Fields	8d80d7dabe	dcache: d_find_alias needn't recheck IS_ROOT && DCACHE_DISCONNECTED If we get to this point and discover the dentry is not a root dentry, or not DCACHE_DISCONNECTED--great, we always prefer that anyway. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	52ed46f0fa	dcache: remove unused d_find_alias parameter Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	1a0a397e41	dcache: d_obtain_alias callers don't all want DISCONNECTED There are a few d_obtain_alias callers that are using it to get the root of a filesystem which may already have an alias somewhere else. This is not the same as the filehandle-lookup case, and none of them actually need DCACHE_DISCONNECTED set. It isn't really a serious problem, but it would really be clearer if we reserved DCACHE_DISCONNECTED for those cases where it's actually needed. In the btrfs case this was causing a spurious printk from nfsd/nfsfh.c:fh_verify when it found an unexpected DCACHE_DISCONNECTED dentry. Josef worked around this by unsetting DCACHE_DISCONNECTED manually in `3a0dfa6a12` "Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol", and this replaces that workaround. Cc: Josef Bacik <jbacik@fb.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	da093a9b76	dcache: d_splice_alias should ignore DCACHE_DISCONNECTED Any IS_ROOT() alias should be safe to use; there's nothing special about DCACHE_DISCONNECTED dentries. Note that this is in fact useful for filesystems such as btrfs which can legitimately encounter a directory with a preexisting IS_ROOT alias on a lookup that crosses into a subvolume. (Those aliases are currently marked DCACHE_DISCONNECTED--but not really for any good reason, and we'll change that soon.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	908790fa3b	dcache: d_splice_alias mustn't create directory aliases Currently if d_splice_alias finds a directory with an alias that is not IS_ROOT or not DCACHE_DISCONNECTED, it creates a duplicate directory. Duplicate directory dentries are unacceptable; it is better just to error out. (In the case of a local filesystem the most likely case is filesystem corruption: for example, perhaps two directories point to the same child directory, and the other parent has already been found and cached.) Note that distributed filesystems may encounter this case in normal operation if a remote host moves a directory to a location different from the one we last cached in the dcache. For that reason, such filesystems should instead use d_materialise_unique, which tries to move the old directory alias to the right place instead of erroring out. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	75a2352d01	dcache: close d_move race in d_splice_alias d_splice_alias will d_move an IS_ROOT() directory dentry into place if one exists. This should be safe as long as the dentry remains IS_ROOT, but I can't see what guarantees that: once we drop the i_lock all we hold here is the i_mutex on an unrelated parent directory. Instead copy the logic of d_materialise_unique. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	3f70bd51cb	dcache: move d_splice_alias Just a trivial move to locate it near (similar) d_materialise_unique code and save some forward references in a following patch. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
J. Bruce Fields	d03b29a271	namei: trivial fix to vfs_rename_dir comment Looks like the directory loop check is actually done in renameat? Whatever, leave this out rather than trying to keep it up to date with the code. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
NeilBrown	b8faf035ea	VFS: allow ->d_manage() to declare -EISDIR in rcu_walk mode. In REF-walk mode, ->d_manage can return -EISDIR to indicate that the dentry is not really a mount trap (or even a mount point) and that any mounts or any DCACHE_NEED_AUTOMOUNT flag should be ignored. RCU-walk mode doesn't currently support this, so if there is a dentry with DCACHE_NEED_AUTOMOUNT set but which shouldn't be a mount-trap, lookup_fast() will always drop in REF-walk mode. With this patch, an -EISDIR from ->d_manage will always cause mounts and automounts to be ignored, both in REF-walk and RCU-walk. Bug-fixed-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:10 -04:00
Miklos Szeredi	7c33d5972c	cifs: support RENAME_NOREPLACE This flag gives CIFS the ability to support its native rename semantics. Implementation is simple: just bail out before trying to hack around the noreplace semantics. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Miklos Szeredi	9a423bb6e3	hostfs: support rename flags Support RENAME_NOREPLACE and RENAME_EXCHANGE flags on hostfs if the underlying filesystem supports it. Since renameat2(2) is not yet in any libc, use syscall(2) to invoke the renameat2 syscall. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Miklos Szeredi	80ace85c91	btrfs: add RENAME_NOREPLACE RENAME_NOREPLACE is trivial to implement for most filesystems: switch over to ->rename2() and check for the supported flags. The rest is done by the VFS. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Chris Mason <clm@fb.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Miklos Szeredi	a0dbc56610	bad_inode: add ->rename2() so we return -EIO instead of -EINVAL. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Miklos Szeredi	7177a9c4b5	fs: call rename2 if exists Christoph Hellwig suggests: 1) make vfs_rename call ->rename2 if it exists instead of ->rename 2) switch all filesystems that you're adding NOREPLACE support for to use ->rename2 3) see how many ->rename instances we'll have left after a few iterations of 2. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
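The "call ->rename2 if it exists" suggestion is a dispatch on an optional callback. The sketch below models that in plain C under stated assumptions: the `struct ops` layout, the `do_rename` name, the two sample callbacks, and the choice to return `-EINVAL` when flags are passed to a filesystem that only has `->rename` are all illustrative, not copied from fs/namei.c.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the inode operations table. */
struct ops {
    int (*rename)(const char *from, const char *to);
    int (*rename2)(const char *from, const char *to, unsigned int flags);
};

/* Hypothetical sample callbacks, distinguishable by their return values. */
static int plain_rename(const char *from, const char *to)
{
    (void)from; (void)to;
    return 1;
}

static int flagged_rename(const char *from, const char *to, unsigned int flags)
{
    (void)from; (void)to; (void)flags;
    return 2;
}

/* Prefer ->rename2 when present; fall back to ->rename only without flags. */
static int do_rename(const struct ops *ops, const char *from,
                     const char *to, unsigned int flags)
{
    if (ops->rename2)
        return ops->rename2(from, to, flags);
    if (flags)
        return -EINVAL;         /* assumed: the legacy hook cannot honour flags */
    return ops->rename(from, to);
}
```

Under this arrangement a filesystem gains flag support simply by providing `rename2`, which matches the migration path listed in the commit message.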
Al Viro	3064c3563b	death to mnt_pinned Rather than playing silly buggers with vfsmount refcounts, just have acct_on() ask fs/namespace.c for internal clone of file->f_path.mnt and replace it with said clone. Then attach the pin to original vfsmount. Voila - the clone will be alive until the file gets closed, making sure that underlying superblock remains active, etc., and we can drop the original vfsmount, so that it's not kept busy. If the file lives until the final mntput of the original vfsmount, we'll notice that there's an fs_pin (one in bsd_acct_struct that holds that file) and mnt_pin_kill() will take it out. Since ->kill() is synchronous, we won't proceed past that point until these files are closed (and private clones of our vfsmount are gone), so we get the same ordering warranties we used to get. mnt_pin()/mnt_unpin()/->mnt_pinned is gone now, and good riddance - it never became usable outside of kernel/acct.c (and racy wrt umount even there). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Al Viro	8fa1f1c2bd	make fs/{namespace,super}.c forget about acct.h These externs belong in fs/internal.h. Rename (they are not acct-specific anymore) and move them over there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:09 -04:00
Al Viro	efb170c228	take fs_pin stuff to fs/* Add a new field to fs_pin - kill(pin). That's what umount and r/o remount will be calling for all pins attached to vfsmount and superblock resp. Called after bumping the refcount, so it won't go away under us. Dropping the refcount is responsibility of the instance. All generic stuff moved to fs/fs_pin.c; the next step will rip all the knowledge of kernel/acct.c from fs/super.c and fs/namespace.c. After that - death to mnt_pin(); it was intended to be usable as generic mechanism for code that wants to attach objects to vfsmount, so that they would not make the sucker busy and would get killed on umount. Never got it right; it remained acct.c-specific all along. Now it's very close to being killable. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:08 -04:00
Al Viro	0aec09d049	drop ->s_umount around acct_auto_close() just repeat the frozen check after regaining it, and check that sb is still alive. If several threads hit acct_auto_close() at the same time, acct_auto_close() will survive that just fine. And we really don't want to play with writes and closing the file with ->s_umount held exclusive - it's a deadlock country. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:08 -04:00
Al Viro	215752fce3	acct: get rid of acct_list Put these suckers on per-vfsmount and per-superblock lists instead. Note: right now it's still acct_lock for everything, but that's going to change. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:08 -04:00
Al Viro	ed44724b79	acct: switch to __kernel_write() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-08-07 14:40:07 -04:00
Al Viro	82df9c8beb	Merge commit 'ccbf62d8a284cf181ac28c8e8407dd077d90dd4b' into for-next backmerge to avoid kernel/acct.c conflict	2014-08-07 14:07:57 -04:00
Yan, Zheng	282c105225	ceph: fix kick_requests() __do_request() may unregister the request. So we should update iterator 'p' before calling __do_request() Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>	2014-08-07 14:30:00 +04:00
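The ceph fix applies a general rule: advance the iterator before calling something that may remove the current element. The sketch below shows the rule on a singly linked list rather than the tree walk the commit touches; `struct req`, `handle` and `kick_all` are hypothetical names.

```c
#include <stdlib.h>

struct req {
    int id;
    struct req *next;
};

/* Hypothetical handler: unlinks and frees requests with an even id. */
static void handle(struct req **head, struct req *r)
{
    if (r->id % 2 == 0) {
        struct req **pp = head;

        while (*pp != r)        /* find the link that points at r */
            pp = &(*pp)->next;
        *pp = r->next;
        free(r);
    }
}

/* Walk the list, fetching the successor before each call to handle(). */
static void kick_all(struct req **head)
{
    struct req *p = *head;

    while (p) {
        struct req *cur = p;

        p = p->next;            /* advance first: handle() may free cur */
        handle(head, cur);
    }
}
```

If the loop read `cur->next` after `handle()` returned, it would touch freed memory whenever the handler removed the request.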
Linus Torvalds	33caee3992	Merge branch 'akpm' (patchbomb from Andrew Morton) Merge incoming from Andrew Morton: - Various misc things. - arch/sh updates. - Part of ocfs2. Review is slow. - Slab updates. - Most of -mm. - printk updates. - lib/ updates. - checkpatch updates. * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (226 commits) checkpatch: update $declaration_macros, add uninitialized_var checkpatch: warn on missing spaces in broken up quoted checkpatch: fix false positives for --strict "space after cast" test checkpatch: fix false positive MISSING_BREAK warnings with --file checkpatch: add test for native c90 types in unusual order checkpatch: add signed generic types checkpatch: add short int to c variable types checkpatch: add for_each tests to indentation and brace tests checkpatch: fix brace style misuses of else and while checkpatch: add --fix option for a couple OPEN_BRACE misuses checkpatch: use the correct indentation for which() checkpatch: add fix_insert_line and fix_delete_line helpers checkpatch: add ability to insert and delete lines to patch/file checkpatch: add an index variable for fixed lines checkpatch: warn on break after goto or return with same tab indentation checkpatch: emit a warning on file add/move/delete checkpatch: add test for commit id formatting style in commit log checkpatch: emit fewer kmalloc_array/kcalloc conversion warnings checkpatch: improve "no space after cast" test checkpatch: allow multiple const * types ...	2014-08-06 21:14:42 -07:00
Linus Torvalds	158c12948f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree changes from Jiri Kosina: "Summer edition of trivial tree updates" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits) doc: fix two typos in watchdog-api.txt irq-gic: remove file name from heading comment MAINTAINERS: Add miscdevice.h to file list for char/misc drivers. scsi: mvsas: mv_sas.c: Fix for possible null pointer dereference doc: replace "practise" with "practice" in Documentation befs: remove check for CONFIG_BEFS_RW scsi: doc: fix 'SCSI_NCR_SETUP_MASTER_PARITY' drivers/usb/phy/phy.c: remove a leading space mfd: fix comment cpuidle: fix comment doc: hpfall.c: fix missing null-terminate after strncpy call usb: doc: hotplug.txt code typos kbuild: fix comment in Makefile.modinst SH: add proper prompt to SH_MAGIC_PANEL_R2_VERSION ARM: msm: Remove MSM_SCM crypto: Remove MPILIB_EXTRA doc: CN: remove dead link, kerneltrap.org no longer works media: update reference, kerneltrap.org no longer works hexagon: update reference, kerneltrap.org no longer works doc: LSM: update reference, kerneltrap.org no longer works ...	2014-08-06 21:03:53 -07:00
Ken Helias	1d023284c3	list: fix order of arguments for hlist_add_after(_rcu) All other add functions for lists have the new item as first argument and the position where it is added as second argument. This was changed for no good reason in this function and makes using it unnecessarily confusing. The name was changed to hlist_add_behind() to cause unconverted code to generate a compile error instead of using the wrong parameter order. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Ken Helias <kenhelias@firemail.de> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [intel driver bits] Cc: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:24 -07:00
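The argument order described for hlist_add_behind() (new item first, position second) can be shown on a small model of an hlist-style node. This is a simplified userspace sketch: `struct node` and `add_behind` are stand-in names, and the code makes no claim to match include/linux/list.h line for line.

```c
#include <stddef.h>

/* Simplified model of an hlist-style node. */
struct node {
    struct node *next;
    struct node **pprev;    /* address of the pointer that points at this node */
};

/* Insert n directly after prev: new item first, position second. */
static void add_behind(struct node *n, struct node *prev)
{
    n->next = prev->next;
    prev->next = n;
    n->pprev = &prev->next;
    if (n->next)
        n->next->pprev = &n->next;
}
```

Because both parameters have the same type, swapping them compiles cleanly and corrupts the list, which is why the commit renamed the function so that unconverted callers fail to build.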
Peter Feiner	68b5a65248	mm: softdirty: respect VM_SOFTDIRTY in PTE holes After a VMA is created with the VM_SOFTDIRTY flag set, /proc/pid/pagemap should report that the VMA's virtual pages are soft-dirty until VM_SOFTDIRTY is cleared (i.e., by the next write of "4" to /proc/pid/clear_refs). However, pagemap ignores the VM_SOFTDIRTY flag for virtual addresses that fall in PTE holes (i.e., virtual addresses that don't have a PMD, PUD, or PGD allocated yet). To observe this bug, use mmap to create a VMA large enough such that there's a good chance that the VMA will occupy an unused PMD, then test the soft-dirty bit on its pages. In practice, I found that a VMA that covered a PMD's worth of address space was big enough. This patch adds the necessary VMA lookup to the PTE hole callback in /proc/pid/pagemap's page walk and sets soft-dirty according to the VMAs' VM_SOFTDIRTY flag. Signed-off-by: Peter Feiner <pfeiner@google.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Hugh Dickins <hughd@google.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:22 -07:00
Rafael Aquini	cc7452b6dc	mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces Historically, we exported shared pages to userspace via sysinfo(2) sharedram and /proc/meminfo's "MemShared" fields. With the advent of tmpfs, from kernel v2.4 onward, that old way for accounting shared mem was deemed inaccurate and we started to export a hard-coded 0 for sysinfo.sharedram. Later on, during the 2.6 timeframe, "MemShared" got re-introduced to /proc/meminfo re-branded as "Shmem", but we're still reporting sysinfo.sharedmem as that old hard-coded zero, which makes the "shared memory" report inconsistent across interfaces. This patch leverages the addition of explicit accounting for pages used by shmem/tmpfs -- "4b02108 mm: oom analysis: add shmem vmstat" -- in order to make the users of sysinfo(2) and si_meminfo*() friends aware of that vmstat entry and make them report it consistently across the interfaces, as well to make sysinfo(2) returned data consistent with our current API documentation states. Signed-off-by: Rafael Aquini <aquini@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:19 -07:00
Fabian Frederick	1b7f8ba603	fs/ocfs2/slot_map.c: replace count*size kzalloc by kcalloc kcalloc manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Tariq Saeed	bba1cb17d9	ocfs2: race between umount and unfinished remastering during recovery Orabug: 19074140 When umount is issued during recovery on the new master that has not finished remastering locks, it triggers BUG() in dlm_send_mig_lockres_msg(). Here is the situation: 1) node A has a lock on resource X mastered by node B. 2) node B dies -> node A sets recovering flag for res X 3) Node C becomes the new master for resources owned by the dead node and is remastering locks of the dead node but has not finished the remastering process yet. 4) umount is issued on node C. 5) During processing of umount, ignoring unfinished recovery, node C attempts to migrate resource X to node A. 6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers it a logic error and sends back -EFAULT. 7) node C asserts BUG() upon seeing EFAULT resp from node A. Fix is to delay migrating res X till remastering is finished at which point recovering flag will be cleared on both A and C. Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Xue jiufei	7567c14883	ocfs2: remove conversion of total_backoff in dlm_join_domain() The unit of total_backoff is msecs not jiffies, so no need to do the conversion. Otherwise, the join timeout is not 90 sec. Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com> Signed-off-by: joyce.xue <xuejiufei@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Yingtai Xie	981035b47d	ocfs2: correctly check the return value of ocfs2_search_extent_list ocfs2_search_extent_list may return -1, so we should check the return value in ocfs2_split_and_insert, otherwise it may cause array index out of bound. And ocfs2_search_extent_list can only return value less than el->l_next_free_rec, so check if it is equal or larger than le16_to_cpu(el->l_next_free_rec) is meaningless. Signed-off-by: Yingtai Xie <xieyingtai@huawei.com> Signed-off-by: Joseph Qi <joseph.qi@huawei.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
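The ocfs2 fix is an instance of checking a "not found" return before using it as an array index. The sketch below shows that pattern with hypothetical names (`search_records`, `record_start`) and a simplified record layout; it does not reproduce ocfs2_search_extent_list or its caller.

```c
/*
 * Hypothetical search: returns the index of the record whose range
 * [starts[i], starts[i] + lens[i]) contains pos, or -1 if none does.
 */
static int search_records(const unsigned int *starts, const unsigned int *lens,
                          int count, unsigned int pos)
{
    for (int i = 0; i < count; i++)
        if (pos >= starts[i] && pos - starts[i] < lens[i])
            return i;
    return -1;
}

/* Look up the start of the covering record, rejecting the -1 case first. */
static int record_start(const unsigned int *starts, const unsigned int *lens,
                        int count, unsigned int pos, unsigned int *out)
{
    int idx = search_records(starts, lens, count, pos);

    if (idx < 0)
        return -1;          /* never index the array with -1 */
    *out = starts[idx];
    return 0;
}
```

Since the search can only return an index below `count`, a separate upper-bound test on the result adds nothing, which is the second point the commit message makes.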
Fabian Frederick	c811f5f41e	fs/squashfs/super.c: logging cleanup - Convert printk to pr_foo() - Add pr_fmt for future logging entries - Coalesce formats Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Fabian Frederick	14694888db	fs/squashfs/file_direct.c: replace count*size kmalloc by kmalloc_array kmalloc_array() manages count*sizeof overflow. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Fabian Frederick	2502722dde	ntfs: kernel-doc warning fixes cached_page and lru_pvec were removed from ntfs_attr_extend_initialized in commit `2ec93b0bf3` ("ntfs: clean up ntfs_attr_extend_initialized") lru_pvec has been removed from __ntfs_grab_cache_pages in commit `4c99000ac4` ("ntfs: use add_to_page_cache_lru()") Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:12 -07:00
Fabian Frederick	2122da26c3	fs/logfs/readwrite.c: kernel-doc warning fixes s/-/:/ and fix variable names. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:12 -07:00
Jan Kara	5838d4442b	fanotify: fix double free of pending permission events Commit `8581679424` ("fanotify: Fix use after free for permission events") introduced a double free issue for permission events which are pending in group's notification queue while group is being destroyed. These events are freed from fanotify_handle_event() but they are not removed from group's notification queue and thus they get freed again from fsnotify_flush_notify(). Fix the problem by removing permission events from notification queue before freeing them if we skip processing access response. Also expand comments in fanotify_release() to explain group shutdown in detail. Fixes: `8581679424` Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Douglas Leeder <douglas.leeder@sophos.com> Tested-by: Douglas Leeder <douglas.leeder@sophos.com> Reported-by: Heinrich Schuchard <xypron.glpk@gmx.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:12 -07:00
Jan Kara	8ba8fa9170	fsnotify: rename event handling functions Rename fsnotify_add_notify_event() to fsnotify_add_event() since the "notify" part is redundant. Rename fsnotify_remove_notify_event() and fsnotify_peek_notify_event() to fsnotify_remove_first_event() and fsnotify_peek_first_event() respectively since the "notify" part is redundant and they really look at the first event in the queue. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:12 -07:00
Fabian Frederick	3e58406484	fs/fscache: make ctl_table static fscache_sysctls and fscache_sysctls_root are only used in main.c Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: David Howells <dhowells@redhat.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:12 -07:00
Linus Torvalds	ae045e2455	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: "Highlights: 1) Steady transitioning of the BPF infrastructure to a generic spot so all kernel subsystems can make use of it, from Alexei Starovoitov. 2) SFC driver supports busy polling, from Alexandre Rames. 3) Take advantage of hash table in UDP multicast delivery, from David Held. 4) Lighten locking, in particular by getting rid of the LRU lists, in inet frag handling. From Florian Westphal. 5) Add support for various RFC6458 control messages in SCTP, from Geir Ola Vaagland. 6) Allow to filter bridge forwarding database dumps by device, from Jamal Hadi Salim. 7) virtio-net also now supports busy polling, from Jason Wang. 8) Some low level optimization tweaks in pktgen from Jesper Dangaard Brouer. 9) Add support for ipv6 address generation modes, so that userland can have some input into the process. From Jiri Pirko. 10) Consolidate common TCP connection request code in ipv4 and ipv6, from Octavian Purdila. 11) New ARP packet logger in netfilter, from Pablo Neira Ayuso. 12) Generic resizable RCU hash table, with initial users in netlink and nftables. From Thomas Graf. 13) Maintain a name assignment type so that userspace can see where a network device name came from (enumerated by kernel, assigned explicitly by userspace, etc.) From Tom Gundersen. 14) Automatic flow label generation on transmit in ipv6, from Tom Herbert. 15) New packet timestamping facilities from Willem de Bruijn, meant to assist in measuring latencies going into/out-of the packet scheduler, latency from TCP data transmission to ACK, etc" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1536 commits) cxgb4 : Disable recursive mailbox commands when enabling vi net: reduce USB network driver config options. 
tg3: Modify tg3_tso_bug() to handle multiple TX rings amd-xgbe: Perform phy connect/disconnect at dev open/stop amd-xgbe: Use dma_set_mask_and_coherent to set DMA mask net: sun4i-emac: fix memory leak on bad packet sctp: fix possible seqlock seadlock in sctp_packet_transmit() Revert "net: phy: Set the driver when registering an MDIO bus device" cxgb4vf: Turn off SGE RX/TX Callback Timers and interrupts in PCI shutdown routine team: Simplify return path of team_newlink bridge: Update outdated comment on promiscuous mode net-timestamp: ACK timestamp for bytestreams net-timestamp: TCP timestamping net-timestamp: SCHED timestamp on entering packet scheduler net-timestamp: add key to disambiguate concurrent datagrams net-timestamp: move timestamp flags out of sk_flags net-timestamp: extend SCM_TIMESTAMPING ancillary data struct cxgb4i : Move stray CPL definitions to cxgb4 driver tcp: reduce spurious retransmits due to transient SACK reneging qlcnic: Initialize dcbnl_ops before register_netdev ...	2014-08-06 09:38:14 -07:00
Linus Torvalds	bb2cbf5e93	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this release: - PKCS#7 parser for the key management subsystem from David Howells - appoint Kees Cook as seccomp maintainer - bugfixes and general maintenance across the subsystem" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (94 commits) X.509: Need to export x509_request_asymmetric_key() netlabel: shorter names for the NetLabel catmap funcs/structs netlabel: fix the catmap walking functions netlabel: fix the horribly broken catmap functions netlabel: fix a problem when setting bits below the previously lowest bit PKCS#7: X.509 certificate issuer and subject are mandatory fields in the ASN.1 tpm: simplify code by using %*phN specifier tpm: Provide a generic means to override the chip returned timeouts tpm: missing tpm_chip_put in tpm_get_random() tpm: Properly clean sysfs entries in error path tpm: Add missing tpm_do_selftest to ST33 I2C driver PKCS#7: Use x509_request_asymmetric_key() Revert "selinux: fix the default socket labeling in sock_graft()" X.509: x509_request_asymmetric_keys() doesn't need string length arguments PKCS#7: fix sparse non static symbol warning KEYS: revert encrypted key change ima: add support for measuring and appraising firmware firmware_class: perform new LSM checks security: introduce kernel_fw_from_file hook PKCS#7: Missing inclusion of linux/err.h ...	2014-08-06 08:06:39 -07:00
Linus Torvalds	e7fda6c4c3	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer and time updates from Thomas Gleixner: "A rather large update of timers, timekeeping & co - Core timekeeping code is year-2038 safe now for 32bit machines. Now we just need to fix all in kernel users and the gazillion of user space interfaces which rely on timespec/timeval :) - Better cache layout for the timekeeping internal data structures. - Proper nanosecond based interfaces for in kernel users. - Tree wide cleanup of code which wants nanoseconds but does hoops and loops to convert back and forth from timespecs. Some of it definitely belongs into the ugly code museum. - Consolidation of the timekeeping interface zoo. - A fast NMI safe accessor to clock monotonic for tracing. This is a long standing request to support correlated user/kernel space traces. With proper NTP frequency correction it's also suitable for correlation of traces across separate machines. - Checkpoint/restart support for timerfd. - A few NOHZ[_FULL] improvements in the [hr]timer code. - Code move from kernel to kernel/time of all time* related code. - New clocksource/event drivers from the ARM universe. I'm really impressed that despite an architected timer in the newer chips SoC manufacturers insist on inventing new and differently broken SoC specific timers. [ Ed. "Impressed"? I don't think that word means what you think it means ] - Another round of code move from arch to drivers. Looks like most of the legacy mess in ARM regarding timers is sorted out except for a few obnoxious strongholds. 
- The usual updates and fixlets all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits) timekeeping: Fixup typo in update_vsyscall_old definition clocksource: document some basic timekeeping concepts timekeeping: Use cached ntp_tick_length when accumulating error timekeeping: Rework frequency adjustments to work better w/ nohz timekeeping: Minor fixup for timespec64->timespec assignment ftrace: Provide trace clocks monotonic timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC seqcount: Add raw_write_seqcount_latch() seqcount: Provide raw_read_seqcount() timekeeping: Use tk_read_base as argument for timekeeping_get_ns() timekeeping: Create struct tk_read_base and use it in struct timekeeper timekeeping: Restructure the timekeeper some more clocksource: Get rid of cycle_last clocksource: Move cycle_last validation to core code clocksource: Make delta calculation a function wireless: ath9k: Get rid of timespec conversions drm: vmwgfx: Use nsec based interfaces drm: i915: Use nsec based interfaces timekeeping: Provide ktime_get_raw() hangcheck-timer: Use ktime_get_ns() ...	2014-08-05 17:46:42 -07:00
Jeff Mahoney	27d0e5bc85	reiserfs: fix corruption introduced by balance_leaf refactor Commits `f1f007c308` (reiserfs: balance_leaf refactor, pull out balance_leaf_insert_left) and `cf22df182b` (reiserfs: balance_leaf refactor, pull out balance_leaf_paste_left) missed that the `body' pointer was getting repositioned. Subsequent users of the pointer would expect it to be repositioned, and as a result, parts of the tree would get overwritten. The most common observed corruption is indirect block pointers being overwritten. Since the body value isn't actually used anymore in the called routines, we can pass back the offset it should be shifted. We constify the body and ih pointers in the balance_leaf as a mostly-free preventative measure. Cc: <stable@vger.kernel.org> # 3.16 Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>	2014-08-05 23:18:38 +02:00
Jeff Layton	14a571a8ec	nfsd: add some comments to the nfsd4 object definitions Add some comments that describe what each of these objects is, and how they relate to one another. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 16:09:20 -04:00
Jeff Layton	b687f6863e	nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 15:00:54 -04:00
Steve French	f29ebb47d5	Add worker function to set allocation size Adds setinfo worker function for SMB2/SMB3 support of SET_ALLOCATION_INFORMATION Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilovsky@samba.org>	2014-08-05 12:53:37 -05:00
Jeff Layton	74cf76df0f	nfsd: remove nfs4_lock_state: nfs4_state_shutdown_net Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:20 -04:00
Jeff Layton	dab6ef2415	nfsd: remove nfs4_lock_state: nfs4_laundromat Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:20 -04:00
Trond Myklebust	05149dd4dc	nfsd: Remove nfs4_lock_state(): reclaim_complete() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:19 -04:00
Trond Myklebust	cb86fb1428	nfsd: Remove nfs4_lock_state(): setclientid, setclientid_confirm, renew Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:18 -04:00
Trond Myklebust	3974552dce	nfsd: Remove nfs4_lock_state(): exchange_id, create/destroy_session() Also destroy_clientid and bind_conn_to_session. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:17 -04:00
Trond Myklebust	3234975f47	nfsd: Remove nfs4_lock_state(): nfsd4_open and nfsd4_open_confirm Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:16 -04:00
Trond Myklebust	084d4d4549	nfsd: Remove nfs4_lock_state(): nfsd4_delegreturn() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:15 -04:00
Trond Myklebust	36626a2ecf	nfsd: Remove nfs4_lock_state(): nfsd4_open_downgrade + nfsd4_close Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:14 -04:00
Trond Myklebust	2dd7f2ad4e	nfsd: Remove nfs4_lock_state(): nfsd4_lock/locku/lockt() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:13 -04:00
Trond Myklebust	51f5e78355	nfsd: Remove nfs4_lock_state(): nfsd4_release_lockowner Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:12 -04:00
Trond Myklebust	e7d5dc19ce	nfsd: Remove nfs4_lock_state(): nfsd4_test_stateid/nfsd4_free_stateid Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:12 -04:00
Trond Myklebust	c2d1d6a8f0	nfsd: Remove nfs4_lock_state(): nfs4_preprocess_stateid_op() Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:11 -04:00
Jeff Layton	285abdee53	nfsd: remove old fault injection infrastructure Remove the old nfsd_for_n_state function and move nfsd_find_client higher up into the file to get rid of forward declaration. Remove the struct nfsd_fault_inject_op arguments from the operations as they are no longer needed by any of them. Finally, remove the old "standard" get and set routines, which also eliminates the client_mutex from this code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:10 -04:00
Jeff Layton	98d5c7c5bd	nfsd: add more granular locking to *_delegations fault injectors ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:09 -04:00
Jeff Layton	82e05efaec	nfsd: add more granular locking to forget_openowners fault injector ...instead of relying on the client_mutex. Also, fix up the printk output that is generated when the file is read. It currently says that it's reporting the number of open files, but it's actually reporting the number of openowners. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:08 -04:00
Jeff Layton	016200c373	nfsd: add more granular locking to forget_locks fault injector ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:07 -04:00
Jeff Layton	3738d50e7f	nfsd: add a list_head arg to nfsd_foreach_client_lock In a later patch, we'll want to collect the locks onto a list for later destruction. If "func" is defined and "collect" is defined, then we'll add the lock stateid to the list. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:06 -04:00
Jeff Layton	69fc9edf98	nfsd: add nfsd_inject_forget_clients ...which uses the client_lock for protection instead of client_mutex. Also remove nfsd_forget_client as there are no more callers. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:05 -04:00
Jeff Layton	a0926d1527	nfsd: add a forget_client set_clnt routine ...that relies on the client_lock instead of client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:04 -04:00
Jeff Layton	7ec0e36f1a	nfsd: add a forget_clients "get" routine with proper locking Add a new "get" routine for forget_clients that relies on the client_lock instead of the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:04 -04:00
Jeff Layton	c96223d3b6	nfsd: abstract out the get and set routines into the fault injection ops Now that we've added more granular locking in other places, it's time to address the fault injection code. This code is currently quite reliant on the client_mutex for protection. Start to change this by adding a new set of fault injection op vectors. For now they all use the legacy ones. In later patches we'll add new routines that can deal with more granular locking. Also, move some of the printk routines into the callers to make the results of the operations more uniform. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:02 -04:00
Jeff Layton	294ac32e99	nfsd: protect clid and verifier generation with client_lock The clid counter is a global counter currently. Move it to be a per-net property so that it can be properly protected by the nn->client_lock instead of relying on the client_mutex. The verifier generator is also potentially racy if there are two simultaneous callers. Generate the verifier when we generate the clid value, so it's also created under the client_lock. With this, there's no need to keep two counters as they'd always be in sync anyway, so just use the clientid_counter for both. As Trond points out, what would be best is to eventually move this code to use IDR instead of the hash tables. That would also help ensure uniqueness, but that's probably best done as a separate project. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:02 -04:00
Jeff Layton	fd699b8a48	nfsd: don't destroy clients that are busy It's possible that we'll have an in-progress call on some of the clients while a rogue EXCHANGE_ID or DESTROY_CLIENTID call comes in. Be sure to try and mark the client expired first, so that the refcount is respected. This will only be a problem once the client_mutex is removed. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2014-08-05 10:55:01 -04:00
Kinglong Mee	fb94d766af	NFSD: Put the reference of nfs4_file when freeing stid After testing nfs4 lock, I restart the nfsd service, got messages as, [ 5677.403419] nfsd: last server has exited, flushing export cache [ 5677.463728] ============================================================================= [ 5677.463942] BUG nfsd4_files (Tainted: G B OE): Objects remaining in nfsd4_files on kmem_cache_close() [ 5677.464055] ----------------------------------------------------------------------------- [ 5677.464203] INFO: Slab 0xffffea0000233400 objects=28 used=1 fp=0xffff880008cd3d98 flags=0x3ffc0000004080 [ 5677.464318] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.464420] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.464538] 0000000000000000 0000000036af2c9f ffff88000ce97d68 ffffffff816eacfa [ 5677.464643] ffffea0000233400 ffff88000ce97e40 ffffffff811cda44 ffffffff00000020 [ 5677.464774] ffff88000ce97e50 ffff88000ce97e00 656a624f00000008 616d657220737463 [ 5677.464875] Call Trace: [ 5677.464925] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.464983] [<ffffffff811cda44>] slab_err+0xb4/0xe0 [ 5677.465040] [<ffffffff811d0457>] ? __kmalloc+0x117/0x290 [ 5677.465099] [<ffffffff81100eec>] ? on_each_cpu_cond+0xac/0xf0 [ 5677.465158] [<ffffffff811d1bc0>] ? kmem_cache_close+0x110/0x2e0 [ 5677.465218] [<ffffffff811d1be0>] kmem_cache_close+0x130/0x2e0 [ 5677.465279] [<ffffffff8135a0c1>] ? kobject_cleanup+0x91/0x1b0 [ 5677.465338] [<ffffffff811d22be>] __kmem_cache_shutdown+0xe/0x10 [ 5677.465399] [<ffffffff8119bd28>] kmem_cache_destroy+0x48/0x100 [ 5677.465466] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.465530] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.465589] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.465649] [<ffffffff81013b69>] ? 
do_notify_resume+0x59/0x90 [ 5677.465759] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b [ 5677.465822] INFO: Object 0xffff880008cd0000 @offset=0 [ 5677.465882] INFO: Allocated in nfsd4_process_open1+0x61/0x350 [nfsd] age=7599 cpu=0 pid=3253 [ 5677.466115] __slab_alloc+0x3b0/0x4b1 [ 5677.466166] kmem_cache_alloc+0x1e4/0x240 [ 5677.466220] nfsd4_process_open1+0x61/0x350 [nfsd] [ 5677.466276] nfsd4_open+0xee/0x860 [nfsd] [ 5677.466329] nfsd4_proc_compound+0x4d7/0x7f0 [nfsd] [ 5677.466384] nfsd_dispatch+0xbb/0x200 [nfsd] [ 5677.466447] svc_process_common+0x453/0x6f0 [sunrpc] [ 5677.466506] svc_process+0x103/0x170 [sunrpc] [ 5677.466559] nfsd+0x117/0x190 [nfsd] [ 5677.466609] kthread+0xd8/0xf0 [ 5677.466656] ret_from_fork+0x7c/0xb0 [ 5677.466775] kmem_cache_destroy nfsd4_files: Slab cache still has objects [ 5677.466839] CPU: 0 PID: 3772 Comm: rmmod Tainted: G B OE 3.16.0-rc2+ #29 [ 5677.466937] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013 [ 5677.467049] 0000000000000000 0000000036af2c9f ffff88000ce97eb0 ffffffff816eacfa [ 5677.467150] ffff880020bb2d00 ffff88000ce97ed0 ffffffff8119bdd9 0000000000000000 [ 5677.467250] ffffffffa06065c0 ffff88000ce97ee0 ffffffffa05ef78d ffff88000ce97ef0 [ 5677.467351] Call Trace: [ 5677.467397] [<ffffffff816eacfa>] dump_stack+0x45/0x56 [ 5677.467454] [<ffffffff8119bdd9>] kmem_cache_destroy+0xf9/0x100 [ 5677.467516] [<ffffffffa05ef78d>] nfsd4_free_slabs+0x2d/0x50 [nfsd] [ 5677.467579] [<ffffffffa05fa987>] exit_nfsd+0x34/0x6ad [nfsd] [ 5677.467639] [<ffffffff81104ac2>] SyS_delete_module+0x162/0x200 [ 5677.467765] [<ffffffff81013b69>] ? do_notify_resume+0x59/0x90 [ 5677.467826] [<ffffffff816f2369>] system_call_fastpath+0x16/0x1b Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Fixes: `11b9164ada` "nfsd: Add a struct nfs4_file field to struct nfs4_stid" Signed-off-by: J. 
Bruce Fields <bfields@redhat.com>	2014-08-05 10:53:36 -04:00
Linus Torvalds	8e099d1e8b	Bug fixes and clean ups for the 3.17 merge window -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJT4Eh0AAoJENNvdpvBGATwmaAP/0W9B/VPY4rSnanpXvsYlone wjIjh11NZdU3daW/c/VhgEIO81rFaoFoQ/dN+pIo7uHG8JRqZjyHXZY2eVF5SFFp FQyGEGMRhE+u78IDg3U99OlAUTo8SAHjHVlYUBVpT9lazMLWRPqP7uHJbXow5ijK /bXSVY+6fOWY1/yruCZv1nRtg9JNZgCc1LOPDqn6K16jItBKfYBvVbpw6hise2v2 rPfcSlKJ5Wzo/PgNX+IR9nnUOXzpbdM2CZbxy0qB2jZYirzE6VNeHhc1JWxQWNB1 Dg3j/ynEWPs03+9ywLcQ0kEQvUXhQQlMGmfPkgzWeAUQDUv4QAeYmFiRhc/EgJWY othDmKVqy0Pn9rmGCOMg/TJFH0Mz/c7PTxFVF1onxs2Sqvl3yCdZANT+ie5UoC9m zkUHdY3HkARiK/I6d5CCJzvHMxWNyf6bAJmoR6L/SaOPXebc4cfFtjgV01kbTWv+ rW9MCj3TiIC9MpWdVJADcmc2w3cY0L/NUbFWHZhMSDFiuJvcLUw5afaJTvKEPuKp WnnYICPj6wMP7Gy/isTxBGGi0UjPm67DHGLpG+syPDi1RxU6Vw3p9PjbdvCRj7so UD3xzntCHOzVcgAxE92V4pMZAajtv0sfIBVzMm7k8iUXzb7Er1c5dV6ROkOzguq3 Ogj/c2JHSSB4TSYdVxsG =hf9n -----END PGP SIGNATURE----- Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Bug fixes and clean ups for the 3.17 merge window" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix ext4_discard_allocated_blocks() if we can't allocate the pa struct ext4: fix COLLAPSE RANGE test for bigalloc file systems ext4: check inline directory before converting ext4: fix incorrect locking in move_extent_per_page ext4: use correct depth value ext4: add i_data_sem sanity check ext4: fix wrong size computation in ext4_mb_normalize_request() ext4: make ext4_has_inline_data() as a inline function ext4: remove readpage() check in ext4_mmap_file() ext4: fix punch hole on files with indirect mapping ext4: remove metadata reservation checks ext4: rearrange initialization to fix EXT4FS_DEBUG	2014-08-04 20:46:54 -07:00
Linus Torvalds	b54ecfb702	Merge tag 'for-f2fs-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This series includes patches to: - add nobarrier mount option - support tmpfile and rename2 - enhance the fdatasync behavior - fix the error path - fix the recovery routine - refactor a part of the checkpoint procedure - reduce some lock contentions" * tag 'for-f2fs-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits) f2fs: use for_each_set_bit to simplify the code f2fs: add f2fs_balance_fs for expand_inode_data f2fs: invalidate xattr node page when evict inode f2fs: avoid skipping recover_inline_xattr after recover_inline_data f2fs: add tracepoint for f2fs_direct_IO f2fs: reduce competition among node page writes f2fs: fix coding style f2fs: remove redundant lines in allocate_data_block f2fs: add tracepoint for f2fs_issue_flush f2fs: avoid retrying wrong recovery routine when error was occurred f2fs: test before set/clear bits f2fs: fix wrong condition for unlikely f2fs: enable in-place-update for fdatasync f2fs: skip unnecessary data writes during fsync f2fs: add info of appended or updated data writes f2fs: use radix_tree for ino management f2fs: add infra for ino management f2fs: punch the core function for inode management f2fs: add nobarrier mount option f2fs: fix to put root inode in error path of fill_super ...	2014-08-04 20:30:07 -07:00
Linus Torvalds	29b88e23a9	Driver core patches for 3.17-rc1 Here's the big driver-core pull request for 3.17-rc1. Largest thing in here is the dma-buf rework and fence code, that touched many different subsystems so it was agreed it should go through this tree to handle merge issues. There's also some firmware loading updates, as well as tests added, and a few other tiny changes, the changelog has the details. All have been in linux-next for a long time. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlPf1XcACgkQMUfUDdst+ylREACdHLXBa02yLrRzbrONJ+nARuFv JuQAoMN49PD8K9iMQpXqKBvZBsu+iCIY =w8OJ -----END PGP SIGNATURE----- Merge tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the big driver-core pull request for 3.17-rc1. Largest thing in here is the dma-buf rework and fence code, that touched many different subsystems so it was agreed it should go through this tree to handle merge issues. There's also some firmware loading updates, as well as tests added, and a few other tiny changes, the changelog has the details. 
All have been in linux-next for a long time" * tag 'driver-core-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) ARM: imx: Remove references to platform_bus in mxc code firmware loader: Fix _request_firmware_load() return val for fw load abort platform: Remove most references to platform_bus device test: add firmware_class loader test doc: fix minor typos in firmware_class README staging: android: Cleanup style issues Documentation: devres: Sort managed interfaces Documentation: devres: Add devm_kmalloc() et al fs: debugfs: remove trailing whitespace kernfs: kernel-doc warning fix debugfs: Fix corrupted loop in debugfs_remove_recursive stable_kernel_rules: Add pointer to netdev-FAQ for network patches driver core: platform: add device binding path 'driver_override' driver core/platform: remove unused implicit padding in platform_object firmware loader: inform direct failure when udev loader is disabled firmware: replace ALIGN(PAGE_SIZE) by PAGE_ALIGN firmware: read firmware size using i_size_read() firmware loader: allow disabling of udev as firmware loader reservation: add suppport for read-only access using rcu reservation: update api and add some helpers ... Conflicts: drivers/base/platform.c	2014-08-04 18:34:04 -07:00
Linus Torvalds	98959948a7	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: - Move the nohz kick code out of the scheduler tick to a dedicated IPI, from Frederic Weisbecker. This necessitated quite some background infrastructure rework, including: * Clean up some irq-work internals * Implement remote irq-work * Implement nohz kick on top of remote irq-work * Move full dynticks timer enqueue notification to new kick * Move multi-task notification to new kick * Remove unnecessary barriers on multi-task notification - Remove proliferation of wait_on_bit() action functions and allow wait_on_bit_action() functions to support a timeout. (Neil Brown) - Another round of sched/numa improvements, cleanups and fixes. (Rik van Riel) - Implement fast idling of CPUs when the system is partially loaded, for better scalability. (Tim Chen) - Restructure and fix the CPU hotplug handling code that may leave cfs_rq and rt_rq's throttled when tasks are migrated away from a dead cpu. (Kirill Tkhai) - Robustify the sched topology setup code. (Peter Zijlstra) - Improve sched_feat() handling wrt. static_keys (Jason Baron) - Misc fixes. 
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) sched/fair: Fix 'make xmldocs' warning caused by missing description sched: Use macro for magic number of -1 for setparam sched: Robustify topology setup sched: Fix sched_setparam() policy == -1 logic sched: Allow wait_on_bit_action() functions to support a timeout sched: Remove proliferation of wait_on_bit() action functions sched/numa: Revert "Use effective_load() to balance NUMA loads" sched: Fix static_key race with sched_feat() sched: Remove extra static_key*() function indirection sched/rt: Fix replenish_dl_entity() comments to match the current upstream code sched: Transform resched_task() into resched_curr() sched/deadline: Kill task_struct->pi_top_task sched: Rework check_for_tasks() sched/rt: Enqueue just unthrottled rt_rq back on the stack in __disable_runtime() sched/fair: Disable runtime_enabled on dying rq sched/numa: Change scan period code to match intent sched/numa: Rework best node setting in task_numa_migrate() sched/numa: Examine a task move when examining a task swap sched/numa: Simplify task_numa_compare() sched/numa: Use effective_load() to balance NUMA loads ...	2014-08-04 16:23:30 -07:00
Scott Mayhew	71a6ec8ac5	nfs: reject changes to resvport and sharecache during remount Commit `c8e47028` made it possible to change resvport/noresvport and sharecache/nosharecache via a remount operation, neither of which should be allowed. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Fixes: `c8e47028` (nfs: Apply NFS_MOUNT_CMP_FLAGMASK to nfs_compare_remount_data) Cc: stable@vger.kernel.org # 3.16+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-04 17:41:52 -04:00
Kinglong Mee	5b53dc88b0	NFS: Avoid infinite loop when RELEASE_LOCKOWNER getting expired error Fix commit `60ea681299` (NFS: Migration support for RELEASE_LOCKOWNER). On an expired error, the client will enter an infinite loop as follows: client server RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(old clid) -----> <--- expired error SETCLIENTID -----> <--- a new clid SETCLIENTID_CONFIRM (new clid) --> <--- ok RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(new clid) -----> <-- ok RELEASE_LOCKOWNER(old clid) -----> <--- expired error RENEW(new clid) -----> <-- ok ... ... Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> [Trond: replace call to nfs4_async_handle_error() with nfs4_schedule_lease_recovery()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-04 16:51:38 -04:00
Chao Yu	b65ee14818	f2fs: use for_each_set_bit to simplify the code This patch uses for_each_set_bit to simplify some code in f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-04 13:20:53 -07:00
Chao Yu	497a0930bb	f2fs: add f2fs_balance_fs for expand_inode_data This patch adds f2fs_balance_fs in expand_inode_data to avoid allocation failure with segment. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-04 13:02:03 -07:00
Chao Yu	002a41cabb	f2fs: invalidate xattr node page when evict inode When an inode is evicted, all page cache belonging to that inode should be released, including the xattr node page. Previously this was not done; this patch fixes the issue. v2: o reposition invalidate_mapping_pages() to the right place, as suggested by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-04 13:01:22 -07:00
Linus Torvalds	f2a84170ed	Merge branch 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: - Major reorganization of percpu header files which I think makes things a lot more readable and logical than before. - percpu-refcount is updated so that it requires explicit destruction and can be reinitialized if necessary. This was pulled into the block tree to replace the custom percpu refcnting implemented in blk-mq. - In the process, percpu and percpu-refcount got cleaned up a bit * 'for-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (21 commits) percpu-refcount: implement percpu_ref_reinit() and percpu_ref_is_zero() percpu-refcount: require percpu_ref to be exited explicitly percpu-refcount: use unsigned long for pcpu_count pointer percpu-refcount: add helpers for ->percpu_count accesses percpu-refcount: one bit is enough for REF_STATUS percpu-refcount, aio: use percpu_ref_cancel_init() in ioctx_alloc() workqueue: stronger test in process_one_work() workqueue: clear POOL_DISASSOCIATED in rebind_workers() percpu: Use ALIGN macro instead of hand coding alignment calculation percpu: invoke __verify_pcpu_ptr() from the generic part of accessors and operations percpu: preffity percpu header files percpu: use raw_cpu_() to define __this_cpu_() percpu: reorder macros in percpu header files percpu: move {raw\|this}_cpu_() definitions to include/linux/percpu-defs.h percpu: move generic {raw\|this}_cpu__N() definitions to include/asm-generic/percpu.h percpu: only allow sized arch overrides for {raw\|this}_cpu_*() ops percpu: reorganize include/linux/percpu-defs.h percpu: move accessors from include/linux/percpu.h to percpu-defs.h percpu: include/asm-generic/percpu.h should contain only arch-overridable parts percpu: introduce arch_raw_cpu_ptr() ...	2014-08-04 10:09:27 -07:00
Eric W. Biederman	344470cac4	proc: Point /proc/mounts at /proc/thread-self/mounts instead of /proc/self/mounts In oddball cases where the thread has a different mount namespace than the thread group leader or more likely in cases where the thread remains and the thread group leader has exited this ensures that /proc/mounts continues to work. This should not cause any problems but if it does this patch can just be reverted. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-08-04 10:07:15 -07:00
Eric W. Biederman	e813244072	proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net In oddball cases where the thread has a different network namespace than the primary thread group leader or more likely in cases where the thread remains and the thread group leader has exited this ensures that /proc/net continues to work. This should not cause any problems but if it does this patch can just be reverted. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-08-04 10:07:13 -07:00
Eric W. Biederman	0097875bd4	proc: Implement /proc/thread-self to point at the directory of the current thread /proc/thread-self is derived from /proc/self. /proc/thread-self points to the directory in proc containing information about the current thread. This functionality has been missing for a long time, and is tricky to implement in userspace as gettid() is not exported by glibc. More importantly this allows fixing defects in /proc/mounts and /proc/net where in a threaded application today they wind up being empty files when only the initial pthread has exited, causing problems for other threads. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-08-04 10:07:11 -07:00
Eric W. Biederman	6ba8ed79a3	proc: Have net show up under /proc/<tgid>/task/<tid> Network namespaces are per task so it makes sense for them to show up in the task directory. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-08-04 10:07:08 -07:00
Linus Torvalds	1bff598860	File locking related changes for v3.17 (pile #1 ) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJT34bvAAoJEAAOaEEZVoIVAPIQAINMD2fqeF3g9ZHxyzsKWoUp f14ZKeF/6nbG4Bn+iIihzxz/Bs9qS+03oVeI4oAg1c9crT+qZ6+nLM4C1n5gfck0 Z0DvF1ITFcr+Nv0D/GSIiI4NY8ZJLP5gZWCPYaO4xamwVs2Bh4/B4uxi7ETIkfXh uL6dN739D2fBDNZBbeRh4VJTGXbT6ipzkTIBFXkMfmqGtUxzeTfepN+IdhE5gVnx xXc8ZZOVNmWI7g/YAYKSMlLbufHHgX47U2sNTljtHII4GXf98DmiYulcJvhfQ1JP 7xSmbIrvn9Gm2iGobzbfED/OjXA0rsdw1vSzTO/uHUYPRriMOwuDRGE+S3oP0dRD ZdxQa8iOZjWEsWbDTRekBBAIXWcTUN8g8EbPj74EN0GWi3HYFj/ORkowj5Ym6zWh Sv4w9SafNMOKy9tt4RVh4iwendU/pNLrRgvR407aM+UWkhwCpinlO6vSLHppUwlC dgxFZtkdeBf5tMkm8Tja+XAV2SjU8DwP4nFU1kHu25L0W7m7hmmIeu6Crq5qlL3J 0NCPTO1LeGNP1WiOQf99nXoJVeL3//CfD+H4LjIMcGCc4P7gJc346rH91Zd6rXY/ kGomnkBMw+5WLvfOJ1NhuaEy3g8Wfk84QzlsmWgTEzl5qT+SEjPLcDHWqWJ2GTvB gFUPWHenVMcrFN/n0CWg =hvV8 -----END PGP SIGNATURE----- Merge tag 'locks-v3.17-1' of git://git.samba.org/jlayton/linux Pull file locking related changes from Jeff Layton: "Just a couple of changes from Christoph to start us down the road toward getting rid of the fl_owner_t typedef" * tag 'locks-v3.17-1' of git://git.samba.org/jlayton/linux: locks: purge fl_owner_t from fs/locks.c locks: typedef fl_owner_t to void *	2014-08-04 10:03:10 -07:00
Eric W. Biederman	65b38851a1	NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes The usage of pid_ns->child_reaper->nsproxy->net_ns in nfs_server_list_open and nfs_client_list_open is not safe. /proc for a pid namespace can remain mounted after all of the processes in that pid namespace have exited. There are also times before the initial process in a pid namespace has started or after the initial process in a pid namespace has exited where pid_ns->child_reaper can be NULL or stale. Making the idiom pid_ns->child_reaper->nsproxy a double whammy of problems. Luckily all that needs to happen is to move /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes under /proc/net to /proc/net/nfsfs/servers and /proc/net/nfsfs/volumes and add a symlink from the original location, and to use seq_open_net as it has been designed. Cc: stable@vger.kernel.org Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2014-08-04 09:28:32 -07:00
NeilBrown	50d77739fa	NFS: fix two problems in lookup_revalidate in RCU-walk 1/ rcu_dereference isn't correct: that field isn't RCU protected. It could potentially change at any time so ACCESS_ONCE might be justified. Changes to ->d_parent are protected by ->d_seq. However that isn't always checked after ->d_revalidate is called, so it is safest to keep the double-check that ->d_parent hasn't changed at the end of these functions. 2/ in nfs4_lookup_revalidate, "->d_parent" was forgotten. So 'parent' was not the parent of 'dentry'. This fails safe as the context is that dentry->d_inode is NULL, and the result of parent->d_inode being NULL is that ECHILD is returned, which is always safe. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-04 09:22:08 -04:00
Dave Chinner	645f985721	Merge branch 'xfs-misc-fixes-3.17-2' into for-next	2014-08-04 13:55:27 +10:00
Dave Chinner	b076d8720d	Merge branch 'xfs-bulkstat-refactor' into for-next	2014-08-04 13:54:46 +10:00
Dave Chinner	4d7eece2c0	Merge branch 'xfs-misc-fixes-3.17-1' into for-next	2014-08-04 13:54:14 +10:00
Dave Chinner	e0ac6d45bc	Merge branch 'xfs-quota-eofblocks-scan' into for-next	2014-08-04 13:53:47 +10:00
kbuild test robot	6eee8972cc	xfs: fix coccinelle warnings Removes unneeded semicolon, introduced by commit `a70a4fa5` ("xfs: fix a couple error sequence jumps in xfs_mountfs"): fs/xfs/xfs_mount.c:858:24-25: Unneeded semicolon Generated by: scripts/coccinelle/misc/semicolon.cocci Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:49:40 +10:00
Dave Chinner	4ef897a275	xfs: flush both inodes in xfs_swap_extents We need to treat both inodes identically from a page cache point of view when preparing them for extent swapping. We don't do this right now - we assume that one of the inodes is empty, because that's what xfs_fsr currently does. Remove this assumption from the code. While factoring out the flushing and related checks, move the transaction reservation to immediately after the flushes so that we don't need to pick up and then drop the ilock to do the transaction reservation. There are no issues with aborting the transaction if the checks fail before we join the inodes to the transaction and dirty them, so this is a safe change to make. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:44:08 +10:00
Dave Chinner	8121768321	xfs: fix swapext ilock deadlock xfs_swap_extents() holds the ilock over a call to filemap_write_and_wait(), which can then try to write data and take the ilock. That causes a self-deadlock. Fix the deadlock and clean up the code by separating the locking appropriately. Add a lockflags variable to track what locks we are holding as we gain and drop them and cleanup the error handling to always use "out_unlock" with the lockflags variable. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:29:32 +10:00
Dave Chinner	b92cc59f69	xfs: kill xfs_vnode.h Move the IO flag definitions to xfs_inode.h and kill the header file as it is now empty. Removing the xfs_vnode.h file showed up an implicit header include path: xfs_linux.h -> xfs_vnode.h -> xfs_fs.h And so every xfs header file has implicitly been including xfs_fs.h whether it is needed or not. Hence the removal of xfs_vnode.h causes all sorts of build issues because BBTOB() and friends are no longer automatically included in the build. This also gets fixed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:28:20 +10:00
Dave Chinner	dd8c38bab0	xfs: kill VN_MAPPED Only one user, no longer needed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:23:35 +10:00
Dave Chinner	2667c6f935	xfs: kill VN_CACHED Only has 2 users, has outlived its usefulness. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:23:15 +10:00
Dave Chinner	eac152b474	xfs: kill VN_DIRTY() Only one user of the macro and the dirty mapping check is redundant so just get rid of it. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 13:22:49 +10:00
Dave Chinner	ad3714b82c	xfs: dquot recovery needs verifiers dquot recovery should add verifiers to the dquot buffers that it recovers changes into. Unfortunately, it doesn't attach the verifiers to the buffers in a consistent manner. For example, xlog_recover_dquot_pass2() reads dquot buffers without a verifier and then writes them without ever having attached a verifier to the buffer. Further, dquot buffer recovery may write a dquot buffer that has not been modified, or indeed, should not be written because quotas are not enabled and hence changes to the buffer were not replayed. In this case, we again write buffers without verifiers attached because that doesn't happen until after the buffer changes have been replayed. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 12:59:31 +10:00
Dave Chinner	5fd364fee8	xfs: quotacheck leaves dquot buffers without verifiers When running xfs/305, I noticed that quotacheck was flushing dquot buffers that did not have the xfs_dquot_buf_ops verifiers attached: XFS (vdb): _xfs_buf_ioapply: no ops on block 0x1dc8/0x1dc8 ffff880052489000: 44 51 01 04 00 00 65 b8 00 00 00 00 00 00 00 00 DQ....e......... ffff880052489010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ffff880052489020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ ffff880052489030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ CPU: 1 PID: 2376 Comm: mount Not tainted 3.16.0-rc2-dgc+ #306 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 ffff88006fe38000 ffff88004a0ffae8 ffffffff81cf1cca 0000000000000001 ffff88004a0ffb88 ffffffff814d50ca 000010004a0ffc70 0000000000000000 ffff88006be56dc4 0000000000000021 0000000000001dc8 ffff88007c773d80 Call Trace: [<ffffffff81cf1cca>] dump_stack+0x45/0x56 [<ffffffff814d50ca>] _xfs_buf_ioapply+0x3ca/0x3d0 [<ffffffff810db520>] ? wake_up_state+0x20/0x20 [<ffffffff814d51f5>] ? xfs_bdstrat_cb+0x55/0xb0 [<ffffffff814d513b>] xfs_buf_iorequest+0x6b/0xd0 [<ffffffff814d51f5>] xfs_bdstrat_cb+0x55/0xb0 [<ffffffff814d53ab>] __xfs_buf_delwri_submit+0x15b/0x220 [<ffffffff814d6040>] ? xfs_buf_delwri_submit+0x30/0x90 [<ffffffff814d6040>] xfs_buf_delwri_submit+0x30/0x90 [<ffffffff8150f89d>] xfs_qm_quotacheck+0x17d/0x3c0 [<ffffffff81510591>] xfs_qm_mount_quotas+0x151/0x1e0 [<ffffffff814ed01c>] xfs_mountfs+0x56c/0x7d0 [<ffffffff814f0f12>] xfs_fs_fill_super+0x2c2/0x340 [<ffffffff811c9fe4>] mount_bdev+0x194/0x1d0 [<ffffffff814f0c50>] ? xfs_finish_flags+0x170/0x170 [<ffffffff814ef0f5>] xfs_fs_mount+0x15/0x20 [<ffffffff811ca8c9>] mount_fs+0x39/0x1b0 [<ffffffff811e4d67>] vfs_kern_mount+0x67/0x120 [<ffffffff811e757e>] do_mount+0x23e/0xad0 [<ffffffff8117abde>] ? __get_free_pages+0xe/0x50 [<ffffffff811e71e6>] ? 
copy_mount_options+0x36/0x150 [<ffffffff811e8103>] SyS_mount+0x83/0xc0 [<ffffffff81cfd40b>] tracesys+0xdd/0xe2 This was caused by dquot buffer readahead not attaching a verifier structure to the buffer when readahead was issued, resulting in the followup read of the buffer finding a valid buffer and so not attaching new verifiers to the buffer as part of the read. Also, when a verifier failure occurs, we then read the buffer without verifiers. Attach the verifiers manually after this read so that if the buffer is then written it will be verified that the corruption has been repaired. Further, when flushing a dquot we don't ask for a verifier when reading in the dquot buffer the dquot belongs to. Most of the time this isn't an issue because the buffer is still cached, but when it is not cached it will result in writing the dquot buffer without having the verifier attached. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 12:43:26 +10:00
Dave Chinner	67dc288c21	xfs: ensure verifiers are attached to recovered buffers Crash testing of CRC enabled filesystems has resulted in a number of reports of bad CRCs being detected after the filesystem was mounted. Errors such as the following were being seen: XFS (sdb3): Mounting V5 Filesystem XFS (sdb3): Starting recovery (logdev: internal) XFS (sdb3): Metadata CRC error detected at xfs_agf_read_verify+0x5a/0x100 [xfs], block 0x1 XFS (sdb3): Unmount and run xfs_repair XFS (sdb3): First 64 bytes of corrupted metadata buffer: ffff880136ffd600: 58 41 47 46 00 00 00 01 00 00 00 00 00 0f aa 40 XAGF...........@ ffff880136ffd610: 00 02 6d 53 00 02 77 f8 00 00 00 00 00 00 00 01 ..mS..w......... ffff880136ffd620: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 03 ................ ffff880136ffd630: 00 00 00 04 00 08 81 d0 00 08 81 a7 00 00 00 00 ................ XFS (sdb3): metadata I/O error: block 0x1 ("xfs_trans_read_buf_map") error 74 numblks 1 The errors were typically being seen in AGF, AGI and their related btree block buffers some time after log recovery had run. Often it wasn't until later subsequent mounts that the problem was discovered. The common symptom was a buffer with the correct contents, but a CRC and an LSN that matched an older version of the contents. Some debug added to _xfs_buf_ioapply() indicated that buffers were being written without verifiers attached to them from log recovery, and Jan Kara isolated the cause to log recovery readahead and its interactions with buffers that had a more recent LSN on disk than the transaction being recovered. In this case, the buffer did not get a verifier attached, and so when the second phase of log recovery ran and recovered EFIs and unlinked inodes, the buffers were modified and written without the verifier running. Hence they had up to date contents, but stale LSNs and CRCs. 
Fix it by attaching verifiers to buffers we skip due to future LSN values so they don't escape into the buffer cache without the correct verifier attached. This patch is based on analysis and a patch from Jan Kara. cc: <stable@vger.kernel.org> Reported-by: Jan Kara <jack@suse.cz> Reported-by: Fanael Linithien <fanael4@gmail.com> Reported-by: Grozdan <neutrino8@gmail.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 12:43:06 +10:00
Dave Chinner	400b9d8875	xfs: catch buffers written without verifiers attached We recently had a bug where buffers were slipping through log recovery without any verifier attached to them. This was resulting in on-disk CRC mismatches for valid data. Add some warning code to catch this occurrence so that we catch such bugs during development rather than not being aware they exist. Note that we cannot do this verification unconditionally as non-CRC filesystems don't always attach verifiers to the buffers being written. e.g. during log recovery we cannot identify all the different types of buffers correctly on non-CRC filesystems, so we can't attach the correct verifiers in all cases and so we don't attach any. Hence we don't warn on non-CRC filesystems to avoid spamming the logs with false indications. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 12:42:40 +10:00
Eric Sandeen	5ef828c415	xfs: avoid false quotacheck after unclean shutdown The commit `83e782e` xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD added a new function xfs_sb_quota_from_disk() which swaps on-disk XFS_OQUOTA_* flags for in-core XFS_GQUOTA_* and XFS_PQUOTA_* flags after the superblock is read. However, if log recovery is required, the superblock is read again, and the modified in-core flags are re-read from disk, so we have XFS_OQUOTA_* flags in memory again. This causes the XFS_QM_NEED_QUOTACHECK() test to be true, because the XFS_OQUOTA_CHKD is still set, and not XFS_GQUOTA_CHKD or XFS_PQUOTA_CHKD. Change xfs_sb_from_disk to call xfs_sb_quota_from_disk and always convert the disk flags to in-memory flags. Add a lower-level function which can be called with "false" to not convert the flags, so that the sb verifier can verify exactly what was on disk, per Brian Foster's suggestion. Reported-by: Cyril B. <cbay@excellency.fr> Signed-off-by: Eric Sandeen <sandeen@redhat.com>	2014-08-04 11:35:44 +10:00
Brian Foster	eedf32bfca	xfs: fix rounding error of fiemap length parameter The offset and length parameters are converted from bytes to basic blocks by xfs_vn_fiemap(). The BTOBB() converter rounds the value up to the nearest basic block. This leads to unexpected behavior when unaligned offsets are provided to FIEMAP. Fix the conversions of byte values to block values to cover the provided offsets. Round down the start offset to the nearest basic block. Calculate the end offset based on the provided values, round up and calculate length based on the start block offset. Reported-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 11:35:35 +10:00
Jie Liu	1e773c4989	xfs: introduce xfs_bulkstat_ag_ichunk Introduce xfs_bulkstat_ag_ichunk() to process inodes in chunk with a pointer to a formatter function that will iget the inode and fill in the appropriate structure. Refactor xfs_bulkstat() with it. Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>	2014-08-04 11:22:31 +10:00
NeilBrown	f682a398b2	NFS: allow lockless access to access_cache The access cache is used during RCU-walk path lookups, so it is best to avoid locking if possible as taking a lock kills concurrency. The rbtree is not rcu-safe and cannot easily be made so. Instead we simply check the last (i.e. most recent) entry on the LRU list. If this doesn't match, then we return -ECHILD and retry in lock/refcount mode. This requires freeing the nfs_access_entry struct with rcu, and requires using rcu access primitives when adding entries to the lru, and when examining the last entry. Calling put_rpccred before kfree_rcu looks a bit odd, but as put_rpccred already provides rcu protection, we know that the cred will not actually be freed until the next grace period, so any concurrent access will be safe. This patch provides about 5% performance improvement on a stat-heavy synthetic work load with 4 threads on a 2-core CPU. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:13 -04:00
NeilBrown	1fa1e38447	NFS: teach nfs_lookup_verify_inode to handle LOOKUP_RCU It fails with -ECHILD rather than make an RPC call. This allows nfs_lookup_revalidate to call it in RCU-walk mode. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:13 -04:00
NeilBrown	912a108da7	NFS: teach nfs_neg_need_reval to understand LOOKUP_RCU This requires nfs_check_verifier to take an rcu_walk flag, and requires an rcu version of nfs_revalidate_inode which returns -ECHILD rather than making an RPC call. With this, nfs_lookup_revalidate can call nfs_neg_need_reval in RCU-walk mode. We can also move the LOOKUP_RCU check past the nfs_check_verifier() call in nfs_lookup_revalidate. If RCU_WALK prevents nfs_check_verifier or nfs_neg_need_reval from doing a full check, they return a status indicating that a revalidation is required. As this revalidation will not be possible in RCU_WALK mode, -ECHILD will ultimately be returned, which is the desired result. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:12 -04:00
NeilBrown	f3324a2a94	NFS: support RCU_WALK in nfs_permission() nfs_permission makes two calls which are not always safe in RCU_WALK, rpc_lookup_cred and nfs_do_access. The second can easily be made rcu-safe by aborting with -ECHILD before making the RPC call. The former can be made rcu-safe by calling rpc_lookup_cred_nonblock() instead. As this will almost always succeed, we use it even when RCU_WALK isn't being used as it still saves some spinlocks in a common case. We only fall back to rpc_lookup_cred() if rpc_lookup_cred_nonblock() fails and MAY_NOT_BLOCK isn't set. This optimisation (always trying rpc_lookup_cred_nonblock()) is particularly important when a security module is active. In that case inode_permission() may return -ECHILD from security_inode_permission() even though ->permission() succeeded in RCU_WALK mode. This leads to may_lookup() retrying inode_permission after performing unlazy_walk(). The spinlock that rpc_lookup_cred() takes is often more expensive than anything security_inode_permission() does, so that spinlock becomes the main bottleneck. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:12 -04:00
NeilBrown	d51ac1a8e9	NFS: prepare for RCU-walk support by pushing tests later in code. nfs_lookup_revalidate, nfs4_lookup_revalidate, and nfs_permission all need to understand and handle RCU-walk for NFS to gain the benefits of RCU-walk for cached information. Currently these functions all immediately return -ECHILD if the relevant flag (LOOKUP_RCU or MAY_NOT_BLOCK) is set. This patch pushes those tests later in the code so that we only abort immediately before we enter rcu-unsafe code. As subsequent patches make that rcu-unsafe code rcu-safe, several of these new tests will disappear. With this patch there are several paths through the code which will no longer return -ECHILD during an RCU-walk. However these are mostly error paths or other uninteresting cases. A noteworthy change in nfs_lookup_revalidate is that we don't take (or put) the reference to ->d_parent when LOOKUP_RCU is set. Rather we rcu_dereference ->d_parent, and check that ->d_inode is not NULL. We also check that ->d_parent hasn't changed after all the tests. In nfs4_lookup_revalidate we simply avoid testing LOOKUP_RCU on the path that only calls nfs_lookup_revalidate() as that function already performs the required test. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:11 -04:00
NeilBrown	49317a7fda	NFS: nfs4_lookup_revalidate: only evaluate parent if it will be used. nfs4_lookup_revalidate only uses 'parent' to get 'dir', and only uses 'dir' if 'inode == NULL'. So we don't need to find out what 'parent' or 'dir' is until we know that 'inode' is NULL. By moving 'dget_parent' inside the 'if', we can reduce the number of call sites for 'dput(parent)'. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:11 -04:00
Alexey Khoroshilov	1f70ef96b1	NFS: add checks for returned value of try_module_get() There are a couple of places in client code where the returned value of try_module_get() is ignored. As a result there is a small chance of premature module unload because of unbalanced refcounting. The patch adds error handling in those places. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:14:10 -04:00
Weston Andros Adamson	411a99adff	nfs: clear_request_commit while holding i_lock Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:26 -04:00
Weston Andros Adamson	e6cf82d183	pnfs: add pnfs_put_lseg_async This is useful when lsegs need to be released while holding locks. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:25 -04:00
Weston Andros Adamson	02d1426c70	pnfs: find swapped pages on pnfs commit lists too nfs_page_find_head_request_locked looks through the regular nfs commit lists when the page is swapped out, but doesn't look through the pnfs commit lists. I'm not sure if anyone has hit any issues caused by this. Suggested-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:25 -04:00
Weston Andros Adamson	b412ddf066	nfs: fix comment and add warn_on for PG_INODE_REF Fix the comment in nfs_page.h for PG_INODE_REF to reflect that it's no longer set only on head requests. Also add a WARN_ON_ONCE in nfs_inode_remove_request as PG_INODE_REF should always be set. Suggested-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:25 -04:00
Weston Andros Adamson	e7029206ff	nfs: check wait_on_bit_lock err in page_group_lock Return errors from wait_on_bit_lock from nfs_page_group_lock. Add a bool argument @wait to nfs_page_group_lock. If true, loop over wait_on_bit_lock until it returns cleanly. If false, return the error from wait_on_bit_lock. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:24 -04:00
NeilBrown	4fa2c54b51	NFS: nfs4_do_open should add negative results to the dcache. If you have an NFSv4 mounted directory which does not contain 'foo' and: ls -l foo ssh $server touch foo cat foo then the 'cat' will fail (usually, depending a bit on the various cache ages). This is correct as negative lookups are cached by default. However with the same initial conditions: cat foo ssh $server touch foo cat foo will usually succeed. This is because an "open" does not add a negative dentry to the dcache, while a "lookup" does. This can have negative performance effects. When "gcc" searches for an include file, it will try to "open" the file in every directory in the search path. Without caching of negative "open" results, this generates much more traffic to the server than it should (or than NFSv3 does). The root of the problem is that _nfs4_open_and_get_state() will call d_add_unique() on a positive result, but not on a negative result. Compare with nfs_lookup() which calls d_materialise_unique on both a positive result and on ENOENT. This patch adds a call to d_add() in the ENOENT case for _nfs4_open_and_get_state() and also calls nfs_set_verifier(). With it, many fewer "open" requests for known-non-existent files are sent to the server. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:22 -04:00
Andrey Utkin	7a9e75a185	nfs3_list_one_acl(): check get_acl() result with IS_ERR_OR_NULL There was a check for result being not NULL. But get_acl() may return NULL, or ERR_PTR, or actual pointer. The purpose of the function where current change is done is to "list ACLs only when they are available", so any error condition of get_acl() mustn't be elevated, and returning 0 there is still valid. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81111 Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: `74adf83f5d` (nfs: only show Posix ACLs in listxattr if actually...) Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:05:22 -04:00
Trond Myklebust	9806755c56	Merge branch 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma into linux-next * 'nfs-rdma' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (916 commits) xprtrdma: Handle additional connection events xprtrdma: Remove RPCRDMA_PERSISTENT_REGISTRATION macro xprtrdma: Make rpcrdma_ep_disconnect() return void xprtrdma: Schedule reply tasklet once per upcall xprtrdma: Allocate each struct rpcrdma_mw separately xprtrdma: Rename frmr_wr xprtrdma: Disable completions for LOCAL_INV Work Requests xprtrdma: Disable completions for FAST_REG_MR Work Requests xprtrdma: Don't post a LOCAL_INV in rpcrdma_register_frmr_external() xprtrdma: Reset FRMRs after a flushed LOCAL_INV Work Request xprtrdma: Reset FRMRs when FAST_REG_MR is flushed by a disconnect xprtrdma: Properly handle exhaustion of the rb_mws list xprtrdma: Chain together all MWs in same buffer pool xprtrdma: Back off rkey when FAST_REG_MR fails xprtrdma: Unclutter struct rpcrdma_mr_seg xprtrdma: Don't invalidate FRMRs if registration fails xprtrdma: On disconnect, don't ignore pending CQEs xprtrdma: Update rkeys after transport reconnect xprtrdma: Limit data payload size for ALLPHYSICAL xprtrdma: Protect ia->ri_id when unmapping/invalidating MRs ...	2014-08-03 17:04:51 -04:00
Trond Myklebust	3a505845cd	NFS: Enforce an upper limit on the number of cached access call This may be used to limit the number of cached credentials building up inside the access cache. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2014-08-03 17:03:22 -04:00
Steve French	59b04c5df7	[CIFS] Fix incorrect hex vs. decimal in some debug print statements Joe Perches and Hans Wennborg noticed that various places in the kernel were printing decimal numbers with 0x prefix. printk("0x%d") or equivalent This fixes the instances of this in the cifs driver. CC: Hans Wennborg <hans@hanshq.net> CC: Joe Perches <joe@perches.com> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 21:16:48 -05:00
Chao Yu	70cfed88ef	f2fs: avoid skipping recover_inline_xattr after recover_inline_data When we recover the data of an inode in the roll-forward procedure, and the inode has both inline data and inline xattr, we may skip recovering the inline xattr if we recover inline data from the node page first. This patch fixes the problem that we lose inline xattr data in the above scenario. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-02 07:43:51 -07:00
Chao Yu	70407fad85	f2fs: add tracepoint for f2fs_direct_IO This patch adds a tracepoint for f2fs_direct_IO. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-02 07:34:46 -07:00
Steve French	81691503b2	Update cifs version to 2.04 Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	21496687a7	CIFS: Fix STATUS_CANNOT_DELETE error mapping for SMB2 The existing mapping causes unlink() call to return error after delete operation. Changing the mapping to -EACCES makes the client process the call like CIFS protocol does - reset dos attributes with ATTR_READONLY flag masked off and retry the operation. Cc: stable@vger.kernel.org Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	b770ddfa26	CIFS: Optimize readpages in a short read case on reconnects by marking pages with a data from a partially received response up-to-date. This is suitable for non-signed connections. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	d913ed17f0	CIFS: Optimize cifs_user_read() in a short read case on reconnects by filling the output buffer with a data got from a partially received response and requesting the remaining data from the server. This is suitable for non-signed connections. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	fb8a3e5255	CIFS: Improve indentation in cifs_user_read() Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	2e8a05d802	CIFS: Fix possible buffer corruption in cifs_user_read() If there was a short read in the middle of the rdata list, we can end up with a corrupt output buffer. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	b3160aebb4	CIFS: Count got bytes in read_into_pages() that let us know how many bytes we have already got before reconnect. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	34a54d6177	CIFS: Use separate var for the number of bytes got in async read and don't mix it with the number of bytes that was requested. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:04 -05:00
Pavel Shilovsky	3fabaa2746	CIFS: Indicate reconnect with ECONNABORTED error code that let us not mix it with EAGAIN. Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:03 -05:00
Pavel Shilovsky	bed9da0213	CIFS: Use multicredits for SMB 2.1/3 reads If we negotiate SMB 2.1 and higher version of the protocol and a server supports large read buffer size, we need to consume 1 credit per 65536 bytes. So, we need to know how many credits we have and obtain the required number of them before constructing a readdata structure in readpages and user read. Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:03 -05:00
Pavel Shilovsky	e374d90f8a	CIFS: Fix rsize usage for sync read If a server changes maximum buffer size for read requests (rsize) on reconnect we can fail on repeating with a big size buffer on -EAGAIN error in cifs_read. Fix this by checking rsize all the time before repeating requests. Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:03 -05:00
Pavel Shilovsky	25f402598d	CIFS: Fix rsize usage in user read If a server changes maximum buffer size for read (rsize) requests on reconnect we can fail on repeating with a big size buffer on -EAGAIN error in user read. Fix this by checking rsize all the time before repeating requests. Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:03 -05:00
Pavel Shilovsky	0ada36b244	CIFS: Separate page reading from user read Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com> Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org> Signed-off-by: Steve French <smfrench@gmail.com>	2014-08-02 01:23:03 -05:00

37456 Commits