If a server changes maximum buffer size for read requests (rsize)
on reconnect we can fail on repeating with a big size buffer on
-EAGAIN error in cifs_read. Fix this by checking rsize all the
time before repeating requests.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If a server changes maximum buffer size for read (rsize) requests
on reconnect we can fail on repeating with a big size buffer on
-EAGAIN error in user read. Fix this by checking rsize all the
time before repeating requests.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If a server changes maximum buffer size for read (rsize) requests
on reconnect we can fail on repeating with a big size buffer on
-EAGAIN error in readpages. Fix this by checking rsize all the
time before repeating requests.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If we negotiate SMB 2.1 and higher version of the protocol and
a server supports large write buffer size, we need to consume 1
credit per 65536 bytes. So, we need to know how many credits
we have and obtain the required number of them before constructing
a writedata structure in writepages and iovec write.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If a server change maximum buffer size for write (wsize) requests
on reconnect we can fail on repeating with a big size buffer on
-EAGAIN error in iovec write. Fix this by checking wsize all the
time before repeating request in iovec write.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If wsize changes on reconnect we need to use new writedata structure
that for retrying.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
If a server change maximum buffer size for write (wsize) requests
on reconnect we can fail on repeating with a big size buffer on
-EAGAIN error in writepages. Fix this by checking wsize all the
time before repeating request in writepages.
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
The recent session setup patch set
(cifs-Separate-rawntlmssp-auth-from-CIFS_SessSetup.patch)
had introduced a trivial sparse build warning.
Signed-off-by: Steve French <smfrench@gmail.com>
Cc: Sachin Prabhu <sprabhu@redhat.com>
If we get into read_into_pages() from cifs_readv_receive() and then
loose a network, we issue cifs_reconnect that moves all mids to
a private list and issue their callbacks. The callback of the async
read request sets a mid to retry, frees it and wakes up a process
that waits on the rdata completion.
After the connection is established we return from read_into_pages()
with a short read, use the mid that was freed before and try to read
the remaining data from the a newly created socket. Both actions are
not what we want to do. In reconnect cases (-EAGAIN) we should not
mask off the error with a short read but should return the error
code instead.
Acked-by: Jeff Layton <jlayton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Separate rawntlmssp authentication from CIFS_SessSetup(). Also cleanup
CIFS_SessSetup() since we no longer do any auth within it.
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
In preparation for splitting CIFS_SessSetup() into smaller more
manageable chunks, we first add helper functions.
We then proceed to split out lanman auth out of CIFS_SessSetup()
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
The functionality provided by free_rsp_buf() is duplicated in a number
of places. Replace these instances with a call to free_rsp_buf().
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Correctly assemble the client UUID by OR'ing in the flags rather than
assigning them over the other components.
Reported-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull vfs fixes from Christoph Hellwig:
"A vfsmount leak fix, and a compile warning fix"
* 'vfs-for-3.16' of git://git.infradead.org/users/hch/vfs:
fs: umount on symlink leaks mnt count
direct-io: fix uninitialized warning in do_direct_IO()
Pull fuse fixes from Miklos Szeredi:
"These two pathes fix issues with the kernel-userspace protocol changes
in v3.15"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: add FUSE_NO_OPEN_SUPPORT flag to INIT
fuse: s_time_gran fix
Currently umount on symlink blocks following umount:
/vz is separate mount
# ls /vz/ -al | grep test
drwxr-xr-x. 2 root root 4096 Jul 19 01:14 testdir
lrwxrwxrwx. 1 root root 11 Jul 19 01:16 testlink -> /vz/testdir
# umount -l /vz/testlink
umount: /vz/testlink: not mounted (expected)
# lsof /vz
# umount /vz
umount: /vz: device is busy. (unexpected)
In this case mountpoint_last() gets an extra refcount on path->mnt
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Ian Kent <raven@themaw.net>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christoph Hellwig <hch@lst.de>
The following warnings:
fs/direct-io.c: In function ‘__blockdev_direct_IO’:
fs/direct-io.c:1011:12: warning: ‘to’ may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/direct-io.c:913:16: note: ‘to’ was declared here
fs/direct-io.c:1011:12: warning: ‘from’ may be used uninitialized in this function [-Wmaybe-uninitialized]
fs/direct-io.c:913:10: note: ‘from’ was declared here
are false positive because dio_get_page() either fails, or sets both
'from' and 'to'.
Paul Bolle said ...
Maybe it's better to move initializing "to" and "from" out of
dio_get_page(). That _might_ make it easier for both the the reader and
the compiler to understand what's going on. Something like this:
Christoph Hellwig said ...
The fix of moving the code definitively looks nicer, while I think
uninitialized_var is horrible wart that won't get anywhere near my code.
Boaz Harrosh: I agree with Christoph and Paul
Signed-off-by: Boaz Harrosh <boaz@plexistor.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Pull nfsd bugfix from Bruce Fields:
"Another regression from the xdr encoding rewrite"
* 'for-3.16' of git://linux-nfs.org/~bfields/linux:
NFSD: Fix crash encoding lock reply on 32-bit
If a filesystem uses simple_xattr to support user extended attributes,
LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
values of zero size. Fix that off-by-one (but apparently no filesystem
needs them yet).
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 079148b919 ("coredump: factor out the setting of PF_DUMPCORE")
cleaned up the setting of PF_DUMPCORE by removing it from all the
linux_binfmt->core_dump() and moving it to zap_threads().But this ended
up clearing all the previously set flags. This causes issues during
core generation when tsk->flags is checked again (eg. for PF_USED_MATH
to dump floating point registers). Fix this.
Signed-off-by: Silesh C V <svellattu@mvista.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space" forgot to free conf->data in nfsd4_encode_lockt and before
sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.
Worse, kfree() can be called on an uninitialized pointer in the case of
a succesful lock (or one that fails for a reason other than a conflict).
(Note that lock->lk_denied.ld_owner.data appears it should be zero here,
until you notice that it's one arm of a union the other arm of which is
written to in the succesful case by the
memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
sizeof(stateid_t));
in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.)
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 8c7424cff6 ""nfsd4: don't try to encode conflicting owner if low on space"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Here some additional changes to set a capability flag so that clients can
detect when it's appropriate to return -ENOSYS from open.
This amends the following commit introduced in 3.14:
7678ac5061 fuse: support clients that don't implement 'open'
However we can only add the flag to 3.15 and later since there was no
protocol version update in 3.14.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: <stable@vger.kernel.org> # v3.15+
Pull btrfs fixes from Chris Mason:
"We have two more fixes in my for-linus branch.
I was hoping to also include a fix for a btrfs deadlock with
compression enabled, but we're still nailing that one down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: test for valid bdev before kobj removal in btrfs_rm_device
Btrfs: fix abnormal long waiting in fsync
Highlights include;
- Stable fix for an NFSv3 posix ACL regression
- Multiple fixes for regressions to the NFS generic read/write code
- Fix page splitting bugs that come into play when a small rsize/wsize
read/write needs to be sent again (due to error conditions or page
redirty).
- Fix nfs_wb_page_cancel, which is called by the "invalidatepage" method
- Fix 2 compile warnings about unused variables.
- Fix a performance issue affecting unstable writes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTys5JAAoJEGcL54qWCgDygakP/1JOfQZzyHNRml5aDVRLtfp/
QQpn8ne3jjEov+0BhzSxqkpHP+8fCcF0UgD5asEnbM83ruoB8EUHErXdq7kRkAYf
COpDZAww4sPXUszG/xGqIED503DKbf69Ds5pQMh8g71hfQpw04UmMQYbnRNBP2bI
Z5eu0Xiwaf3MVRaxHMXVy9sl7in/cQBvrXnKLTIYOLA0U5bdCI9JWT5+qbOUCC/3
YuYm5EjM+OMyhvEWyVDXFp9kmw0vtBS8FwAXYKBjwfJLNl8dGuERWKlFDPNOxdpZ
QrOBhUH73d2tgvTHzUg/RRtpA6mKOfznU3SQK3muP188/2/sbb4y/RChwv+Ignat
YqWFgbDTz1idKvjj+Vzcd7eL3HO9Kono/YkAG8i5xmOlSeenzDra1lDAbuB+zNzd
oLVY1AJp8+13c74gaCDurxJ7fq6Fth97eqxdo8fdQH8Mn6m6qRm2V57rOo59eFQS
bX6EN4ja8ashrKprCvlhXDGJmQKZWJMEGoTXJCj5w+HBklONV6jdbyzGFKT+Zmhg
IP+USsaLJDswZNpdWq8Zb9RthfFAURKEQEemwV5eLQNtTa+xQ22ZOF2ycrbBqsED
etCCm8gyNukl2RCAxGfLrtpBytlcNMmcRKGktKqO1SAAua3IP33wGD7zql8rwGCZ
m9XMXUkVKiHoErl3jZ4T
=qklz
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Apologies for the relative lateness of this pull request, however the
commits fix some issues with the NFS read/write code updates in
3.16-rc1 that can cause serious Oopsing when using small r/wsize. The
delay was mainly due to extra testing to make sure that the fixes
behave correctly.
Highlights include;
- Stable fix for an NFSv3 posix ACL regression
- Multiple fixes for regressions to the NFS generic read/write code:
- Fix page splitting bugs that come into play when a small
rsize/wsize read/write needs to be sent again (due to error
conditions or page redirty)
- Fix nfs_wb_page_cancel, which is called by the "invalidatepage"
method
- Fix 2 compile warnings about unused variables
- Fix a performance issue affecting unstable writes"
* tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Don't reset pg_moreio in __nfs_pageio_add_request
NFS: Remove 2 unused variables
nfs: handle multiple reqs in nfs_wb_page_cancel
nfs: handle multiple reqs in nfs_page_async_flush
nfs: change find_request to find_head_request
nfs: nfs_page should take a ref on the head req
nfs: mark nfs_page reqs with flag for extra ref
nfs: only show Posix ACLs in listxattr if actually present
commit 99994cd btrfs: dev delete should remove sysfs entry
added a btrfs_kobj_rm_device, which dereferences device->bdev...
right after we check whether device->bdev might be NULL.
I don't honestly know if it's possible to have a NULL device->bdev
here, but assuming that it is (given the test), we need to move
the kobject removal to be under that test.
(Coverity spotted this)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
xfstests generic/127 detected this problem.
With commit 7fc34a62ca, now fsync will only flush
data within the passed range. This is the cause of the above problem,
-- btrfs's fsync has a stage called 'sync log' which will wait for all the
ordered extents it've recorded to finish.
In xfstests/generic/127, with mixed operations such as truncate, fallocate,
punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
mmap, and then msync. And I find that msync will wait for quite a long time
(about 20s in my case), thanks to ftrace, it turns out that the previous
fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,
there can be some ordered extents created but not getting corresponding pages
flushed, then they're left in memory until we fsync which runs into the
stage 'sync log', and fsync will just wait for the system writeback thread
to flush those pages and get ordered extents finished, so the latency is
inevitable.
This adds a flush similar to btrfs_start_ordered_extent() in
btrfs_wait_logged_extents() to fix that.
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
flock, a change to use GFP_NOFS to avoid recursion on a rarely used
code path and a fix for a race relating to the glock lru.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQIcBAABAgAGBQJTyPZQAAoJEMrg3m4a/8jSFBEQAKSnJQUP9MSxVwNBrgOiybXW
kQd8RYs7cdt33i97C3Im9xSVktPz4HKTvuwHyvNV1oyWScfWSyqCgC//cU+/zlYV
wJDZWIASNoQheY6UfxR6TeBPZo9Hgq7RQRGj4h1ttag9+b8Zz9aV5TCxcoh28ULF
629TyNwg4xdiEKX2xZusDwGCoHn5f5l9pAa5MyPrcyPzn1lOJP1lz++Lci2nqC4g
DvA/KzQzDLQ2lKXdSd95avwQxnHqmeCTvClPmK9GgONrt66tqq6CcCLB1jPRE7/O
J7f0VWy/PEeo8ot+9siiA380EvM6hWvJx5Fuen/Qb9dQ5sgsJMkvgbqlHK6zB/i3
Je6Qq+aVPz3qktmXdyEagpXfZAQAxy0PUWezQBQH8HIlhwKMGC1QaFgMoAFIks1Y
S38IBHCwlymytWYdVaRhyUOnlzzaSyeYROzs7hZoxRRUilge5rPkrqtv4HWLSRtZ
rGFEid181+qTO2TyoiMRY2oR3U0PHfbE9Dhv5Pu9caTl55kj9eAGwvqnOn6IpyvF
eiUoWOnDYFO+8sxFKPYFndglEZx0zBU6B/7axyQ3qam3BojTJwKh+2+4TqauM0zo
4ehwJEzVmV21sbyMfUHCKTQEkW8OjQ+EkxAEmGhp4IODNwZ3vPfFBdhFi3fBipqO
WhDmeDmOddb9cCoQG8WZ
=VTve
-----END PGP SIGNATURE-----
Merge tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes
Pull gfs2 fixes from Steven Whitehouse:
"This patch set contains two minor docs/spelling fixes, some fixes for
flock, a change to use GFP_NOFS to avoid recursion on a rarely used
code path and a fix for a race relating to the glock lru"
* tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: fs/gfs2/rgrp.c: kernel-doc warning fixes
GFS2: memcontrol: Spelling s/invlidate/invalidate/
GFS2: Allow caching of glocks for flock
GFS2: Allow flocks to use normal glock dq rather than dq_wait
GFS2: replace count*size kzalloc by kcalloc
GFS2: Use GFP_NOFS when allocating glocks
GFS2: Fix race in glock lru glock disposal
GFS2: Only wait for demote when last holder is dequeued
Fixes for low memory perforamnce regressions and a quota inode handling
regression.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJTyFtkAAoJEK3oKUf0dfodAioP/0nsIq3be9zch0/Jjf9aduON
1aMaUE/9p4g/zu0f96ld3GI5/guRVllZ/0qJFyYAnAgOGs1hGARQfOuNnHBjgdLZ
D261JbT9z8d8inQ7BMSc3EBBQ2CAZsAwRmg6UbeaWBE4hjlJ03RGWBE/aMMh10Wh
t2fWUoeYWZWLi+Gfa+lPpnTESH7nBK5cW+daC16I1fU9Z/RcXAl+pCN2s5Ls6h7G
1mQNfhAuII+/ydB7UWXL6SQ7/sDvDwNvedMLljwg6oYu/riSG2kYPZKW6O9qwL6T
z9onPg5lEQMjWlbl6qzNo6OT3pAs37vssG0zVw2ZS8GjZ84edRzw2PGctJAFu2Hh
sWqmtYGNKBjtjnxJ4zlRfeBkwpHbZGlLyOwzKoDlyQ8j9KZ8v8lsKyEpJK6/XJNG
1rMJZV5twu+xvZUwf0zkg0tuxoT/3T3kbIHsFkEaJmQW7jrxTvdybW/rp6KHSbcb
rzCpuZ5Ghh9qss+EeCv3k2nnWjysDP4kSwQMZ0zCedzvDTmga2TMw//MUwaM+i7M
D7Raq4Qcs2updrFk2j9OyML2hi49KuPTtEu2OC7ObfxvBsZgSbTvyw1Vq/rsiDM5
FZMV/giKRoCFpRpp7xF+db0zkBC2xDU9tGz196dzGtg7rvp6Z5401mS8fAr9H/LJ
D2Wf2OXx3oss9v4rrO7N
=rnN8
-----END PGP SIGNATURE-----
Merge tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"Fixes for low memory perforamnce regressions and a quota inode
handling regression.
These are regression fixes for issues recently introduced - the change
in the stack switch location is fairly important, so I've held off
sending this update until I was sure that it still addresses the stack
usage problem the original solved. So while the commits in the xfs
tree are recent, it has been under tested for several weeks now"
* tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: null unused quota inodes when quota is on
xfs: refine the allocation stack switch
Revert "xfs: block allocation work needs to be kswapd aware"
This patch removes the GLF_NOCACHE flag from the glocks associated with
flocks. There should be no good reason not to cache glocks for flocks:
they only force the glock to be demoted before they can be reacquired,
which can slow down performance and even cause glock hangs, especially
in cases where the flocks are held in Shared (SH) mode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch allows flock glocks to use a non-blocking dequeue rather
than dq_wait. It also reverts the previous patch I had posted regarding
dq_wait. The reverted patch isn't necessarily a bad idea, but I decided
this might avoid unforeseen side effects, and was therefore safer.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Normally GFP_KERNEL is ok here, but there is now a rarely used code path
relating to deallocation of unlinked inodes (in certain corner cases)
which if hit at times of memory shortage can cause recursion while
trying to free memory.
One solution would be to try and move the gfs2_glock_get() call so
that it is no longer called while another glock is held, but that
doesn't look at all easy, so GFP_NOFS is the best solution for the
time being.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
We must not leave items on the LRU list with GLF_LOCK set, since
they can be removed if the glock is brought back into use, which
may then potentially result in a hang, waiting for GLF_LOCK to
clear.
It doesn't happen very often, since it requires a glock that has
not been used for a long time to be brought back into use at the
same moment that the shrinker is part way through disposing of
glocks.
The fix is to set GLF_LOCK at a later time, when we already know
that the other locks can be obtained. Also, we now only release
the lru_lock in case a resched is needed, rather than on every
iteration.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Function gfs2_glock_dq_wait is supposed to dequeue a glock and then
wait for the lock to be demoted. The problem is, if this is a shared
lock, its demote will depend on the other holders, which means you
might end up waiting forever because the other process is blocked.
This problem is especially apparent when dealing with nested flocks.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Pull quota fix from Jan Kara:
"Fix locking of dquot shrinker"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: missing lock in dqcache_shrink_scan()
Commit 1ab6c4997e (fs: convert fs shrinkers to new scan/count API)
accidentally removed locking from quota shrinker. Fix it -
dqcache_shrink_scan() should use dq_list_lock to protect the
scan on free_dquots list.
CC: stable@vger.kernel.org
Fixes: 1ab6c4997e
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Pull fuse fixes from Miklos Szeredi:
"This contains miscellaneous fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: replace count*size kzalloc by kcalloc
fuse: release temporary page if fuse_writepage_locked() failed
fuse: restructure ->rename2()
fuse: avoid scheduling while atomic
fuse: handle large user and group ID
fuse: inode: drop cast
fuse: ignore entry-timeout on LOOKUP_REVAL
fuse: timeout comparison fix
When quota is on, it is expected that unused quota inodes have a
value of NULLFSINO. The changes to support a separate project quota
in 3.12 broken this rule for non-project quota inode enabled
filesystem, as the code now refuses to write the group quota inode
if neither group or project quotas are enabled. This regression was
introduced by commit d892d58 ("xfs: Start using pquotaino from the
superblock").
In this case, we should be writing NULLFSINO rather than nothing to
ensure that we leave the group quota inode in a valid state while
quotas are enabled.
Failure to do so doesn't cause a current kernel to break - the
separate project quota inodes introduced translation code to always
treat a zero inode as NULLFSINO. This was introduced by commit
0102629 ("xfs: Initialize all quota inodes to be NULLFSINO") with is
also in 3.12 but older kernels do not do this and hence taking a
filesystem back to an older kernel can result in quotas failing
initialisation at mount time. When that happens, we see this in
dmesg:
[ 1649.215390] XFS (sdb): Mounting Filesystem
[ 1649.316894] XFS (sdb): Failed to initialize disk quotas.
[ 1649.316902] XFS (sdb): Ending clean mount
By ensuring that we write NULLFSINO to quota inodes that aren't
active, we avoid this problem. We have to be really careful when
determining if the quota inodes are active or not, because we don't
want to write a NULLFSINO if the quota inodes are active and we
simply aren't updating them.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>