linux

Commit Graph

Author	SHA1	Message	Date
Chuck Lever	5e2e7721f0	NFS: fix nfs_parse_ip_address() corner case Bruce observed that nfs_parse_ip_address() will successfully parse an IPv6 address that looks like this: "::1%" A scope delimiter is present, but there is no scope ID following it. This is harmless, as it would simply set the scope ID to zero. However, in some cases we would like to flag this as an improperly formed address. We are now also careful to reject addresses where garbage follows the address (up to the length of the string), instead of ignoring the non-address characters; and where the scope ID is nonsense (not a valid device name, but also not numeric). Before, both of these cases would result in a harmless zero scope ID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 14:41:51 -04:00
J. Bruce Fields	456018d791	NFS: Cleanup nfs_set_port Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-10 14:41:50 -04:00
Trond Myklebust	03254e65a6	NFS: Fix attribute updates This fixes a regression seen when running the Connectathon testsuite against an ext3 filesystem. The reason was that the inode was constantly being marked as 'just updated' by the jiffy wraparound test. This again meant that newer GETATTR calls were failing to pass the nfs_inode_attrs_need_update() test unless the changes caused a ctime update on the server, since they were perceived as having been started before the latest inode update. Given that nfs_inode_attrs_need_update() already checks for wraparound of nfsi->last_updated, we can drop the buggy "protection" in nfs_update_inode(). Also make a slight micro-optimisation of nfs_inode_attrs_need_update(): we are more often going to see time_after(fattr->time_start, nfsi->last_updated) be true, rather than seeing an update of ctime/size, so put that test first to ensure that we optimise away the ctime/size tests. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-09 13:34:07 -04:00
Trond Myklebust	d7fb120774	NFS: Don't use range_cyclic for data integrity syncs It is more efficient to write linearly starting from the beginning of the file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:19:05 -04:00
Steve Dickson	8491945f11	NFS: Client mounts hang when exported directory do not exist This patch fixes a regression that was introduced by the string based mounts. nfs_mount() statically returns -EACCES for every error returned by the remote mounted. This is incorrect because -EACCES is an non-fatal error to the mount.nfs command. This error causes mount.nfs to retry the mount even in the case when the exported directory does not exist. This patch maps the errors returned by the remote mountd into valid errno values, exactly how it was done pre-string based mounts. By returning the correct errno enables mount.nfs to do the right thing. Signed-off-by: Steve Dickson <steved@redhat.com> [Trond.Myklebust@netapp.com: nfs_stat_to_errno() now correctly returns negative errors, so remove the sign change.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:19:01 -04:00
J. Bruce Fields	ea31a4437c	nfs: Fix misparsing of nfsv4 fs_locations attribute The code incorrectly assumes here that the server name (or ip address) is null-terminated. This can cause referrals to fail in some cases. Also support ipv6 addresses. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:17:47 -04:00
J. Bruce Fields	f0c929251e	nfs: prepare to share nfs_set_port We plan to use this function elsewhere. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:17:36 -04:00
J. Bruce Fields	460cdbc832	nfs: replace while loop by for loops in nfs_follow_referral Whoever wrote this had a bizarre allergy to for loops. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:17:20 -04:00
J. Bruce Fields	4ada29d5c4	nfs: break up nfs_follow_referral This function is a little longer and more deeply nested than necessary. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:16:40 -04:00
EG Keizer	37ca8f5c60	nfs: authenticated deep mounting Allow mount to do authenticated mounts below the root of the exported tree. The wording in RFC 2623, sec 2.3.2. allows fsinfo with UNIX authentication on the root of the export. Mounts are not always done on the root of the exported tree. Especially autoumounts often mount below the root of the exported tree. Some server implementations (justly) require full authentication for the so-called deep mounts. The old code used AUTH_SYS only. This caused deep mounts to fail on systems requiring stronger authentication.. The client should try both authentication types and use the first one that succeeds. This method was already partially implemented. This patch completes the implementation for NFS2 and NFS3. This patch was developed to allow Debian systems to automount home directories on Solaris servers with krb5 authentication. Tested on kernel 2.6.24-etchnhalf.1 Signed-off-by: E.G. Keizer <keie@few.vu.nl> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:16:22 -04:00
Jeff Layton	f25b874d39	NFS: missing nfs_fattr_init in nfs3_proc_getacl and nfs3_proc_setacls (resend #2 ) The fattrs used in the NFSv3 getacl/setacl calls are not being properly initialized. This occasionally causes nfs_update_inode to fall into NFSv4 specific codepaths when handling post-op attrs from these calls. Thanks to Cai Qian for noticing the spurious NFSv4 messages in debug output from a v3 mount... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:16:22 -04:00
J. Bruce Fields	f200c11c25	nfs: remove an obsolete nfs_flock comment We do now allow bsd flocks over nfs. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:16:21 -04:00
Denis V. Lunev	44d5759d3f	nfs: BUG_ON in nfs_follow_mountpoint Unfortunately, BUG_ON(IS_ROOT(dentry)) can happen inside nfs_follow_mountpoint with NFS running Fedora 8 using a specific setup. https://bugzilla.redhat.com/show_bug.cgi?id=458622 So, the situation should be handled on NFS client gracefully. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Trond Myklebust <Trond.Myklebust@netapp.com> CC: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:15:16 -04:00
Denis V. Lunev	fd08d7e9d1	nfs: ERR_PTR is expected on failure from nfs_do_clone_mount Replace NULL with ERR_PTR(-EINVAL). Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 18:14:34 -04:00
Adrian Bunk	bb8a3b53c2	fix fs/nfs/nfsroot.c compilation This patch fixes the following compile error caused by commit `f9247273cb` (UFS: add const to parser token tabl): <-- snip --> ... CC fs/nfs/nfsroot.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict make[3]: *** [fs/nfs/nfsroot.o] Error 1 <-- snip --> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:59:49 -04:00
Trond Myklebust	691beb13cd	NFS: Allow concurrent inode revalidation Currently, if two processes are both trying to revalidate metadata for the same inode, they will find themselves being serialised. There is no good justification for this now that we have improved our ability to detect stale attribute data, so we should remove that serialisation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:59:43 -04:00
Trond Myklebust	2f28ea614f	NFS: Fix up nfs_setattr_update_inode() Ensure that it sets the inode metadata under the correct spinlock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:41:46 -04:00
Trond Myklebust	076f1fc94c	NFS: Don't clear nfsi->cache_validity in nfs_check_inode_attributes() If we're merely checking the inode attributes because we suspect that the 'updated' attributes returned by the RPC call are stale, then we shouldn't be doing weak cache consistency updates or clearing the cache_validity flags. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:41:33 -04:00
Trond Myklebust	4dc05efb86	NFS: Convert __nfs_revalidate_inode() to use nfs_refresh_inode() In the case where there are parallel RPC calls to the same inode, we may receive stale metadata due to the lack of ordering, hence the sanity checking of metadata in nfs_refresh_inode(). Currently, __nfs_revalidate_inode() is calling nfs_update_inode() directly, without any further sanity checks, and hence may end up setting the inode up with stale metadata. Fix is to use nfs_refresh_inode() instead of nfs_update_inode(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:41:17 -04:00
Trond Myklebust	d65f557f39	NFS: Fix nfs_post_op_update_inode_force_wcc() If we believe that the attributes are old (see nfs_refresh_inode()), then we shouldn't force an update. Also ensure that we hold the inode->i_lock across attribute checks and the call to nfs_refresh_inode_locked() to ensure that we don't race with other attribute updates. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:41:00 -04:00
Trond Myklebust	a10ad17630	NFS: Fix the NFS attribute update Currently nfs_refresh_inode() will only update the inode metadata if it sees that the RPC call that returned the nfs_fattr was started after the last update of the inode. This means that if we have parallel RPC calls to the same inode (when sending WRITE calls, for instance), we may often miss updates. This patch attempts to recover those missed updates by also accepting them if the ctime in the nfs_fattr is more recent than the inode's cached ctime. It also recovers the case where the file size has increased, but the ctime has not been updated due to limited ctime resolution. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:34:17 -04:00
Trond Myklebust	870a5be8b9	NFS: Clean up nfs_refresh_inode() and nfs_post_op_update_inode() Try to avoid taking and dropping the inode->i_lock more than once. Do so by moving the code in nfs_refresh_inode() that needs to be done under the spinlock into a function nfs_refresh_inode_locked(), and then having both nfs_refresh_inode() and nfs_post_op_update_inode() call it directly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:29:49 -04:00
Trond Myklebust	7973c1f15a	NFS: Add mount options for controlling the lookup cache Add the following NFS-specific mount options to the parser. -o lookupcache=all /* Default: cache positive & negative dentries / -o lookupcache=pos[itive] / Don't cache negative dentries / -o lookupcache=none / Strict revalidation of all dentries */ Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:23:57 -04:00
Trond Myklebust	ff3525a539	NFS: Don't apply NFS_MOUNT_FLAGMASK to text-based mounts The point of introducing text-based mounts was to allow us to add functionality without having to worry about legacy binary mount formats. The mask should be there in order to ensure that binary formats don't start enabling features that they cannot support. There is no justification for applying it to the text mount path. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:23:56 -04:00
Trond Myklebust	4eec952e42	NFS: Add options for finer control of the lookup cache Add the flag NFS_MOUNT_LOOKUP_CACHE_NONEG to turn off the caching of negative dentries. In reality what we do is to force nfs_lookup_revalidate() to always discard negative dentries. Add the flag NFS_MOUNT_LOOKUP_CACHE_NONE for enforcing stricter revalidation of dentries. It forces the revalidate code to always do a lookup instead of just checking the cached mtime of the parent directory. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-07 17:22:20 -04:00
Trond Myklebust	1daef0a868	NFS: Clean up nfs_sb_active/nfs_sb_deactive Instead of causing umount requests to block on server->active_wq while the asynchronous sillyrename deletes are executing, we can use the sb->s_active counter to obtain a reference to the super_block, and then release that reference in nfs_async_unlink_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-06 20:08:26 -04:00
Trond Myklebust	d5e66348bb	NFS: Fix nfs_file_llseek() After the BKL removal patches were applied to the rest of the NFS code, the BKL protection in nfs_file_llseek() is no longer sufficient to ensure that inode->i_size is read safely in generic_file_llseek_unlocked(). In order to fix the situation, we either have to replace the naked read of inode->i_size in generic_file_llseek_unlocked() with i_size_read(), or the whole thing needs to be executed under the inode->i_lock; In order to avoid disrupting other filesystems, avoid touching generic_file_llseek_unlocked() for now... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-10-06 20:08:26 -04:00
Nick Piggin	4b19de6d1c	mm: tiny-shmem nommu fix The previous patch `db203d53d4` ("mm: tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU architectures because it was using the expanding truncate to signal ramfs to allocate a physically contiguous RAM backing the inode (otherwise it is unusable for "memory mapping" it to userspace). However do_truncate is what caused the lock ordering error, due to it taking i_mutex. In this case, we can actually just call ramfs directly to allocate memory for the mapping, rather than go via truncate. Acked-by: David Howells <dhowells@redhat.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-02 15:53:13 -07:00
Nick Piggin	16dbc6c961	inotify: fix lock ordering wrt do_page_fault's mmap_sem Fix inotify lock order reversal with mmap_sem due to holding locks over copy_to_user. Signed-off-by: Nick Piggin <npiggin@suse.de> Reported-by: "Daniel J Blueman" <daniel.blueman@gmail.com> Tested-by: "Daniel J Blueman" <daniel.blueman@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-02 15:53:13 -07:00
Balbir Singh	31a78f23ba	mm owner: fix race between swapoff and exit There's a race between mm->owner assignment and swapoff, more easily seen when task slab poisoning is turned on. The condition occurs when try_to_unuse() runs in parallel with an exiting task. A similar race can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats> or ptrace or page migration. CPU0 CPU1 try_to_unuse looks at mm = task0->mm increments mm->mm_users task 0 exits mm->owner needs to be updated, but no new owner is found (mm_users > 1, but no other task has task->mm = task0->mm) mm_update_next_owner() leaves mmput(mm) decrements mm->mm_users task0 freed dereferencing mm->owner fails The fix is to notify the subsystem via mm_owner_changed callback(), if no new owner is found, by specifying the new task as NULL. Jiri Slaby: mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but must be set after that, so as not to pass NULL as old owner causing oops. Daisuke Nishimura: mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task() and its callers need to take account of this situation to avoid oops. Hugh Dickins: Lockdep warning and hang below exec_mmap() when testing these patches. exit_mm() up_reads mmap_sem before calling mm_update_next_owner(), so exec_mmap() now needs to do the same. And with that repositioning, there's now no point in mm_need_new_owner() allowing for NULL mm. Reported-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-29 08:41:47 -07:00
Linus Torvalds	d0185c0882	Fix NULL pointer dereference in proc_sys_compare The VFS interface for the 'd_compare()' is a bit special (read: 'odd'), because it really just essentially replaces a memcmp(). The filesystem is supposed to just compare the two names with whatever case-independent or other function. And when I say 'is supposed to', I obviously mean that 'procfs does odd things, and actually looks at the dentry that we don't even pass down, rather than just the name'. Which results in problems, because we actually call d_compare before we have even verified that the dentry is still hashed at all. And that causes a problm since the inode that procfs looks at may have been free'd and the d_inode pointer is NULL. procfs just assumes that all dentries are positive, since procfs itself never generates a negative one. But memory pressure will still result in the dentry getting torn down, and as it is removed by RCU, it still remains visible on some lists - and to d_compare. If the filesystem just did a name comparison, we wouldn't care. And we could just fix procfs to know about negative dentries too. But rather than have the low-level filesystems know about internal VFS details, just move the check for a unhashed dentry up a bit, so that we will only call d_compare on dentries that are still active. The actual oops this caused didn't look like a NULL pointer dereference because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)' to get at its internal proc_inode information from the inode pointer, and accessed a field below the inode. So the oops would look something like BUG: unable to handle kernel paging request at fffffffffffffff0 IP: [<ffffffff802bc6c6>] proc_sys_compare+0x36/0x50 and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and ppc64 (Hugh Dickins). Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Hugh Dickins <hugh@veritas.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-29 07:42:57 -07:00
Linus Torvalds	ec4d90287e	Merge git://oss.sgi.com:8090/xfs/linux-2.6 * git://oss.sgi.com:8090/xfs/linux-2.6: [XFS] Remove xfs_iext_irec_compact_full() [XFS] Fix extent list corruption in xfs_iext_irec_compact_full().	2008-09-26 08:49:34 -07:00
Linus Torvalds	bde40fe071	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: fix printk format warnings UBIFS: remove incorrect assert UBIFS: TNC / GC race fixes UBIFS: create the name of the background thread in every case	2008-09-26 08:20:26 -07:00
Lachlan McIlroy	71a8c87fb3	[XFS] Remove xfs_iext_irec_compact_full() Yet another bug was found in xfs_iext_irec_compact_full() and while the source of the bug was found it wasn't an easy task to track it down because the conditions are very difficult to reproduce. A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their significant effort in tracking down the source of this corruption. xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost identical - they both compact indirect extent lists by moving extents from subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only moves extents if all of the extents in the next buffer will fit into the empty space in the buffer before it. xfs_iext_irec_compact_full() will go a step further and move part of the next buffer if all the extents wont fit. It will then shift the remaining extents in the next buffer up to the start of the buffer. The bug here was that we did not update er_extoff and this caused extent list corruption. It does not appear that this extra functionality gains us much. Calling xfs_iext_irec_compact_pages() instead will do a good enough job at compacting the indirect list and will be quicker too. For the case in xfs_iext_indirect_to_direct() the total number of extents in the indirect list will fit into one buffer so we will never need the extra functionality of xfs_iext_irec_compact_full() there. Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the buffers will never overlap) so we don't want the performance hit that can incur. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32166a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>	2008-09-26 12:17:57 +10:00
Lachlan McIlroy	f1ccd29551	[XFS] Fix extent list corruption in xfs_iext_irec_compact_full(). If we don't move all the records from the next buffer into the current buffer then we need to update the er_extoff field of the next buffer as we shift the remaining records to the start of the buffer. SGI-PV: 987159 SGI-Modid: xfs-linux-melb:xfs-kern:32165a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Russell Cattelan <cattelan@thebarn.com>	2008-09-26 12:16:46 +10:00
Julien Brunel	62aa528e02	9p: use an IS_ERR test rather than a NULL test In case of error, the function p9_client_walk returns an ERR pointer, but never returns a NULL pointer. So a NULL test that comes after an IS_ERR test should be deleted. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = p9_client_walk(...) ... when != x = E * if (x != NULL) S1 else S2 // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-09-24 16:22:22 -05:00
Alexander Beregalov	7424bac82f	UBIFS: fix printk format warnings fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type 'long unsigned int' fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'long unsigned int' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-18 09:57:57 +03:00
Adrian Hunter	6e14968c86	UBIFS: remove incorrect assert The assert was not valid because one of the variables 'taken_empty_lebs' has transient values out of sync with the other variables. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 14:47:09 +03:00
Adrian Hunter	6dcfac4f13	UBIFS: TNC / GC race fixes - update GC sequence number if any nodes may have been moved even if GC did not finish the LEB - don't ignore error return when reading Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 14:23:26 +03:00
Sebastian Siewior	0855f310df	UBIFS: create the name of the background thread in every case If the ubifs partition is mounted RO and then remounted RW we end up with no thread name in ubifs_remount_rw() and the thread appears nameless. Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 10:07:56 +03:00
Lachlan McIlroy	2fd6f6ec64	[XFS] Don't do I/O beyond eof when unreserving space When unreserving space with boundaries that are not block aligned we round up the start and round down the end boundaries and then use this function, xfs_zero_remaining_bytes(), to zero the parts of the blocks that got dropped during the rounding. The problem is we don't consider if these blocks are beyond eof. Worse still is if we encounter delayed allocations beyond eof we will try to use the magic delayed allocation block number as a real block number. If the file size is ever extended to expose these blocks then we'll go through xfs_zero_eof() to zero them anyway. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:32055a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:52:50 +10:00
Lachlan McIlroy	e1f5dbd707	[XFS] Fix use-after-free with buffers We have a use-after-free issue where log completions access buffers via the buffer log item and the buffer has already been freed. Fix this by taking a reference on the buffer when attaching the buffer log item and release the hold when the buffer log item is detached and we no longer need the buffer. Also create a new function xfs_buf_item_free() to combine some common code. SGI-PV: 985757 SGI-Modid: xfs-linux-melb:xfs-kern:32025a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:52:13 +10:00
David Chinner	f9114eba1e	[XFS] Prevent lockdep false positives when locking two inodes. If we call xfs_lock_two_inodes() to grab both the iolock and the ilock, then drop the ilocks on both inodes, then grab them again (as xfs_swap_extents() does) then lockdep will report a locking order problem. This is a false positive. To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks at once - force calers to make two separate calls. This means that nested dropping and regaining of the ilocks will retain the same lockdep subclass and so lockdep will not see anything wrong with this code. SGI-PV: 986238 SGI-Modid: xfs-linux-melb:xfs-kern:31999a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:51:21 +10:00
David Chinner	b5b8c9acd5	[XFS] Fix barrier status change detection. The current code in xlog_iodone() uses the wrong macro to check if the barrier has been cleared due to an EOPNOTSUPP error form the lower layer. SGI-PV: 986143 SGI-Modid: xfs-linux-melb:xfs-kern:31984a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:50:50 +10:00
Lachlan McIlroy	364f358a73	[XFS] Prevent direct I/O from mapping extents beyond eof With the help from some tracing I found that we try to map extents beyond eof when doing a direct I/O read. It appears that the way to inform the generic direct I/O path (ie do_direct_IO()) that we have breached eof is to return an unmapped buffer from xfs_get_blocks_direct(). This will cause do_direct_IO() to jump to the hole handling code where is will check for eof and then abort. This problem was found because a direct I/O read was trying to map beyond eof and was encountering delayed allocations. The delayed allocations beyond eof are speculative allocations and they didn't get converted when the direct I/O flushed the file because there was only enough space in the current AG to convert and write out the dirty pages within eof. Note that xfs_iomap_write_allocate() wont necessarily convert all the delayed allocation passed to it - it will return after allocating the first extent - so if the delayed allocation extends beyond eof then it will stay that way. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:31929a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:50:14 +10:00
Christoph Hellwig	6efdf28177	[XFS] Fix regression introduced by remount fixup Logically we would return an error in xfs_fs_remount code to prevent users from believing they might have changed mount options using remount which can't be changed. But unfortunately mount(8) adds all options from mtab and fstab to the mount arguments in some cases so we can't blindly reject options, but have to check for each specified option if it actually differs from the currently set option and only reject it if that's the case. Until that is implemented we return success for every remount request, and silently ignore all options that we can't actually change. SGI-PV: 985710 SGI-Modid: xfs-linux-melb:xfs-kern:31908a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:49:33 +10:00
Lachlan McIlroy	31bd61f2bb	[XFS] Move memory allocations for log tracing out of the critical path Memory allocations for log->l_grant_trace and iclog->ic_trace are done on demand when the first event is logged. In xlog_state_get_iclog_space() we call xlog_trace_iclog() under a spinlock and allocating memory here can cause us to sleep with a spinlock held and deadlock the system. For the log grant tracing we use KM_NOSLEEP but that means we can lose trace entries. Since there is no locking to serialize the log grant tracing we could race and have multiple allocations and leak memory. So move the allocations to where we initialize the log/iclog structures. Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace since it's not even used. SGI-PV: 983738 SGI-Modid: xfs-linux-melb:xfs-kern:31896a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:45:37 +10:00
Andrew Morton	8d99f83b94	rescan_partitions(): make device capacity errors non-fatal Herton Krzesinski reports that the error-checking changes in `04ebd4aee5` ("block/ioctl.c and fs/partition/check.c: check value returned by add_partition") cause his buggy USB camera to no longer mount. "The camera is an Olympus X-840. The original issue comes from the camera itself: its format program creates a partition with an off by one error". Buggy devices happen. It is better for the kernel to warn and to proceed with the mount. Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Cc: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:52 -07:00
Hugh Dickins	d7a3e4959c	mm: ifdef Quicklists in /proc/meminfo A "Quicklists: 0 kB" line has just started appearing in /proc/meminfo, but most architectures (including x86) don't have them configured, so #ifdef it, like the highmem lines. And those architectures which do have quicklists configured are using them for page tables: so let's place it next to PageTables. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:51 -07:00
Eric Sesterhenn	1558182f65	bfs: fix Lockdep warning This fixes: ============================================= [ INFO: possible recursive locking detected ] 2.6.27-rc5-00283-g70bb089 #68 --------------------------------------------- touch/6855 is trying to acquire lock: (&info->bfs_lock){--..}, at: [<c02262f5>] bfs_delete_inode+0x9e/0x18c but task is already holding lock: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 other info that might help us debug this: 2 locks held by touch/6855: #0: (&type->i_mutex_dir_key#5){--..}, at: [<c018ad13>] do_filp_open+0x10b/0x62f #1: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 stack backtrace: Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb089 #68 [<c013e769>] validate_chain+0x458/0x9f4 [<c013bece>] ? trace_hardirqs_off+0xb/0xd [<c013f36b>] __lock_acquire+0x666/0x6e0 [<c013f440>] lock_acquire+0x5b/0x77 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c06aab74>] mutex_lock_nested+0xbc/0x234 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] bfs_delete_inode+0x9e/0x18c [<c0226257>] ? bfs_delete_inode+0x0/0x18c [<c01925e1>] generic_delete_inode+0x94/0xfe [<c019265d>] generic_drop_inode+0x12/0x12f [<c0191b7e>] iput+0x4b/0x4e [<c0226d1e>] bfs_create+0x163/0x187 [<c0188b42>] vfs_create+0xa6/0x114 [<c018adb5>] do_filp_open+0x1ad/0x62f [<c0107cdc>] ? native_sched_clock+0x82/0x96 [<c06ac309>] ? _spin_unlock+0x27/0x3c [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c06ae2f4>] ? sub_preempt_count+0x9d/0xab [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c0180391>] do_sys_open+0x42/0xb8 [<c041d564>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0180449>] sys_open+0x1e/0x26 [<c01038bd>] sysenter_do_call+0x12/0x31 ======================= The problem is that we don't unlock the bfs->lock mutex before calling iput (we do in the other cases). Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:51 -07:00

1 2 3 4 5 ...

10020 Commits