linux_old1

Commit Graph

Author	SHA1	Message	Date
Theodore Ts'o	f702ba0fd7	ext4: Don't use 'struct dentry' for internal lookups This is a port of a patch from Linus which fixes a 200+ byte stack usage problem in ext4_get_parent(). It's more efficient to pass down only the actual parts of the dentry that matter: the parent inode and the name, instead of allocating a struct dentry on the stack. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-22 15:21:01 -04:00
Theodore Ts'o	914258bf2c	ext4/jbd2: Avoid WARN() messages when failing to write to the superblock This fixes some very common warnings reported by kerneloops.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-06 21:35:40 -04:00
Steven Whitehouse	719ee34467	GFS2: high time to take some time over atime Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-09-18 13:53:59 +01:00
Steven Whitehouse	37ec89e83c	GFS2: The war on bloat The following patch shrinks the gfs2_args structure which is embedded in every GFS2 superblock. It cuts down the size of the options to a single unsigned int (the 13 bits of bitfields will be rounded up to that size by the compiler) from the current 11 unsigned ints. So on x86 thats 44 bytes shrinking to 4 bytes, in each and every GFS2 superblock. Signed-off-by: Steven Whitehouse <swhitho@redhat.com>	2008-09-18 13:49:32 +01:00
Alexander Beregalov	7424bac82f	UBIFS: fix printk format warnings fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type 'long unsigned int' fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long unsigned int', but argument 2 has type 'long unsigned int' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-18 09:57:57 +03:00
Adrian Hunter	6e14968c86	UBIFS: remove incorrect assert The assert was not valid because one of the variables 'taken_empty_lebs' has transient values out of sync with the other variables. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 14:47:09 +03:00
Adrian Hunter	6dcfac4f13	UBIFS: TNC / GC race fixes - update GC sequence number if any nodes may have been moved even if GC did not finish the LEB - don't ignore error return when reading Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 14:23:26 +03:00
Sebastian Siewior	0855f310df	UBIFS: create the name of the background thread in every case If the ubifs partition is mounted RO and then remounted RW we end up with no thread name in ubifs_remount_rw() and the thread appears nameless. Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-17 10:07:56 +03:00
Lachlan McIlroy	2fd6f6ec64	[XFS] Don't do I/O beyond eof when unreserving space When unreserving space with boundaries that are not block aligned we round up the start and round down the end boundaries and then use this function, xfs_zero_remaining_bytes(), to zero the parts of the blocks that got dropped during the rounding. The problem is we don't consider if these blocks are beyond eof. Worse still is if we encounter delayed allocations beyond eof we will try to use the magic delayed allocation block number as a real block number. If the file size is ever extended to expose these blocks then we'll go through xfs_zero_eof() to zero them anyway. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:32055a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:52:50 +10:00
Lachlan McIlroy	e1f5dbd707	[XFS] Fix use-after-free with buffers We have a use-after-free issue where log completions access buffers via the buffer log item and the buffer has already been freed. Fix this by taking a reference on the buffer when attaching the buffer log item and release the hold when the buffer log item is detached and we no longer need the buffer. Also create a new function xfs_buf_item_free() to combine some common code. SGI-PV: 985757 SGI-Modid: xfs-linux-melb:xfs-kern:32025a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:52:13 +10:00
David Chinner	f9114eba1e	[XFS] Prevent lockdep false positives when locking two inodes. If we call xfs_lock_two_inodes() to grab both the iolock and the ilock, then drop the ilocks on both inodes, then grab them again (as xfs_swap_extents() does) then lockdep will report a locking order problem. This is a false positive. To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks at once - force calers to make two separate calls. This means that nested dropping and regaining of the ilocks will retain the same lockdep subclass and so lockdep will not see anything wrong with this code. SGI-PV: 986238 SGI-Modid: xfs-linux-melb:xfs-kern:31999a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:51:21 +10:00
David Chinner	b5b8c9acd5	[XFS] Fix barrier status change detection. The current code in xlog_iodone() uses the wrong macro to check if the barrier has been cleared due to an EOPNOTSUPP error form the lower layer. SGI-PV: 986143 SGI-Modid: xfs-linux-melb:xfs-kern:31984a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Nathaniel W. Turner <nate@houseofnate.net> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:50:50 +10:00
Lachlan McIlroy	364f358a73	[XFS] Prevent direct I/O from mapping extents beyond eof With the help from some tracing I found that we try to map extents beyond eof when doing a direct I/O read. It appears that the way to inform the generic direct I/O path (ie do_direct_IO()) that we have breached eof is to return an unmapped buffer from xfs_get_blocks_direct(). This will cause do_direct_IO() to jump to the hole handling code where is will check for eof and then abort. This problem was found because a direct I/O read was trying to map beyond eof and was encountering delayed allocations. The delayed allocations beyond eof are speculative allocations and they didn't get converted when the direct I/O flushed the file because there was only enough space in the current AG to convert and write out the dirty pages within eof. Note that xfs_iomap_write_allocate() wont necessarily convert all the delayed allocation passed to it - it will return after allocating the first extent - so if the delayed allocation extends beyond eof then it will stay that way. SGI-PV: 983683 SGI-Modid: xfs-linux-melb:xfs-kern:31929a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:50:14 +10:00
Christoph Hellwig	6efdf28177	[XFS] Fix regression introduced by remount fixup Logically we would return an error in xfs_fs_remount code to prevent users from believing they might have changed mount options using remount which can't be changed. But unfortunately mount(8) adds all options from mtab and fstab to the mount arguments in some cases so we can't blindly reject options, but have to check for each specified option if it actually differs from the currently set option and only reject it if that's the case. Until that is implemented we return success for every remount request, and silently ignore all options that we can't actually change. SGI-PV: 985710 SGI-Modid: xfs-linux-melb:xfs-kern:31908a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-09-17 16:49:33 +10:00
Lachlan McIlroy	31bd61f2bb	[XFS] Move memory allocations for log tracing out of the critical path Memory allocations for log->l_grant_trace and iclog->ic_trace are done on demand when the first event is logged. In xlog_state_get_iclog_space() we call xlog_trace_iclog() under a spinlock and allocating memory here can cause us to sleep with a spinlock held and deadlock the system. For the log grant tracing we use KM_NOSLEEP but that means we can lose trace entries. Since there is no locking to serialize the log grant tracing we could race and have multiple allocations and leak memory. So move the allocations to where we initialize the log/iclog structures. Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace since it's not even used. SGI-PV: 983738 SGI-Modid: xfs-linux-melb:xfs-kern:31896a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-09-17 16:45:37 +10:00
Steve French	388e57b275	[CIFS] use common code for turning off ATTR_READONLY in cifs_unlink We already have a cifs_set_file_info function that can flip DOS attribute bits. Have cifs_unlink call it to handle turning ATTR_HIDDEN on and ATTR_READONLY off when an unlink attempt returns -EACCES. This also removes a level of indentation from cifs_unlink. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-09-16 23:50:58 +00:00
Jeff Layton	5f0319a790	cifs: clean up variables in cifs_unlink Change parameters to cifs_unlink to match the ones used in the generic VFS. Add some local variables to cut down on the amount of struct dereferencing that needs to be done, and eliminate some unneeded NULL pointer checks on the parent directory inode. Finally, rename pTcon to "tcon" to more closely match standard kernel coding style. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-09-16 20:14:34 +00:00
Abhijith Das	acd2c8aa02	GFS2: GFS2 will panic if you misspell any mount options The gfs2 superblock pointer is NULL after a failed mount. When control eventually goes to gfs2_kill_sb, we dereference this NULL pointer. This patch ensures that the gfs2 superblock pointer is not NULL before being dereferenced in gfs2_kill_sb. Signed-off-by: Abhijith Das <adas@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-09-15 16:08:32 +01:00
Bob Peterson	acb57a3652	GFS2: Direct IO write at end of file error This patch fixes a problem whereby a direct_io write doesn't fall back to buffered write properly at end of file. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-09-15 10:31:54 +01:00
Andrew Morton	8d99f83b94	rescan_partitions(): make device capacity errors non-fatal Herton Krzesinski reports that the error-checking changes in `04ebd4aee5` ("block/ioctl.c and fs/partition/check.c: check value returned by add_partition") cause his buggy USB camera to no longer mount. "The camera is an Olympus X-840. The original issue comes from the camera itself: its format program creates a partition with an off by one error". Buggy devices happen. It is better for the kernel to warn and to proceed with the mount. Reported-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Cc: Abdel Benamrouche <draconux@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:52 -07:00
Hugh Dickins	d7a3e4959c	mm: ifdef Quicklists in /proc/meminfo A "Quicklists: 0 kB" line has just started appearing in /proc/meminfo, but most architectures (including x86) don't have them configured, so #ifdef it, like the highmem lines. And those architectures which do have quicklists configured are using them for page tables: so let's place it next to PageTables. Signed-off-by: Hugh Dickins <hugh@veritas.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:51 -07:00
Eric Sesterhenn	1558182f65	bfs: fix Lockdep warning This fixes: ============================================= [ INFO: possible recursive locking detected ] 2.6.27-rc5-00283-g70bb089 #68 --------------------------------------------- touch/6855 is trying to acquire lock: (&info->bfs_lock){--..}, at: [<c02262f5>] bfs_delete_inode+0x9e/0x18c but task is already holding lock: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 other info that might help us debug this: 2 locks held by touch/6855: #0: (&type->i_mutex_dir_key#5){--..}, at: [<c018ad13>] do_filp_open+0x10b/0x62f #1: (&info->bfs_lock){--..}, at: [<c0226c00>] bfs_create+0x45/0x187 stack backtrace: Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb089 #68 [<c013e769>] validate_chain+0x458/0x9f4 [<c013bece>] ? trace_hardirqs_off+0xb/0xd [<c013f36b>] __lock_acquire+0x666/0x6e0 [<c013f440>] lock_acquire+0x5b/0x77 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c06aab74>] mutex_lock_nested+0xbc/0x234 [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] ? bfs_delete_inode+0x9e/0x18c [<c02262f5>] bfs_delete_inode+0x9e/0x18c [<c0226257>] ? bfs_delete_inode+0x0/0x18c [<c01925e1>] generic_delete_inode+0x94/0xfe [<c019265d>] generic_drop_inode+0x12/0x12f [<c0191b7e>] iput+0x4b/0x4e [<c0226d1e>] bfs_create+0x163/0x187 [<c0188b42>] vfs_create+0xa6/0x114 [<c018adb5>] do_filp_open+0x1ad/0x62f [<c0107cdc>] ? native_sched_clock+0x82/0x96 [<c06ac309>] ? _spin_unlock+0x27/0x3c [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c06ae2f4>] ? sub_preempt_count+0x9d/0xab [<c019379e>] ? alloc_fd+0xbf/0xc9 [<c0180391>] do_sys_open+0x42/0xb8 [<c041d564>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c0180449>] sys_open+0x1e/0x26 [<c01038bd>] sysenter_do_call+0x12/0x31 ======================= The problem is that we don't unlock the bfs->lock mutex before calling iput (we do in the other cases). Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:51 -07:00
Alexey Dobriyan	665020c35e	proc: more debugging for "already registered" case Print parent directory name as well. The aim is to catch non-creation of parent directory when proc_mkdir will return NULL and all subsequent registrations go directly in /proc instead of intended directory. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Fixed insane printk string while at it. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-13 14:41:50 -07:00
Eric Sandeen	730c213c79	ext4: use percpu data structures for lg_prealloc_list lg_prealloc_list seems to cry out for a per-cpu data structure; on a large smp system I think this should be better. I've lightly tested this change on a 4-cpu system. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-13 15:23:29 -04:00
Theodore Ts'o	8eea80d52b	ext4: Renumber EXT4_IOC_MIGRATE Pick an ioctl number for EXT4_IOC_MIGRATE that won't conflict with other ext4 ioctl's. Since there haven't been any major userspace users of this ioctl, we can afford to change this now, to avoid potential problems later. Also, reorder the ioctl numbers in ext4.h to avoid this sort of mistake in the future. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-13 19:54:35 -04:00
Aneesh Kumar K.V	4db46fc266	ext4: hook the ext3 migration interface to the EXT4_IOC_SETFLAGS ioctl This patch hooks the ext3 to ext4 migrate interface to EXT4_IOC_SETFLAGS ioctl. The userspace interface is via chattr +e. We only allow setting extent flags. Clearing extent flag (migrating from ext4 to ext3) is not supported. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-08 23:34:06 -04:00
Aneesh Kumar K.V	2a43a87800	ext4: elevate write count for migrate ioctl The migrate ioctl writes to the filsystem, so we need to elevate the write count. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-13 12:52:26 -04:00
Linus Torvalds	e2858ce3ed	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: add llseek method udf: Fix error paths in udf_new_inode() udf: Fix lock inversion between iprune_mutex and alloc_mutex (v2)	2008-09-11 08:40:11 -07:00
Tao Ma	0e116227a0	ocfs2: Fix a bug in direct IO read. ocfs2 will become read-only if we try to read the bytes which pass the end of i_size. This can be easily reproduced by following steps: 1. mkfs a ocfs2 volume with bs=4k cs=4k and nosparse. 2. create a small file(say less than 100 bytes) and we will create the file which is allocated 1 cluster. 3. read 8196 bytes from the kernel using O_DIRECT which exceeds the limit. 4. The ocfs2 volume becomes read-only and dmesg shows: OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks: Inode 66010 has a hole at block 1 File system is now read-only due to the potential of on-disk corruption. Please run fsck.ocfs2 once the file system is unmounted. So suppress the ERROR message. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-09-10 01:44:08 -07:00
Linus Torvalds	b975dee381	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: UBIFS: make minimum fanout 3 UBIFS: fix division by zero UBIFS: amend f_fsid UBIFS: fill f_fsid UBIFS: improve statfs reporting even more UBIFS: introduce LEB overhead UBIFS: add forgotten gc_idx_lebs component UBIFS: fix assertion UBIFS: improve statfs reporting UBIFS: remove incorrect index space check UBIFS: push empty flash hack down UBIFS: do not update min_idx_lebs in stafs UBIFS: allow for racing between GC and TNC UBIFS: always read hashed-key nodes under TNC mutex UBIFS: fix zero-length truncations	2008-09-09 11:52:12 -07:00
Chuck Lever	af904deaf6	NFS: Restore missing hunk in NFS mount option parser Automounter maps can contain mount options valid for other NFS implementations but not for Linux. The Linux automounter uses the mount command's "-s" command line option ("s" for "sloppy") so that mount requests containing such options are not rejected. Commit `f45663ce5f` attempted to address a known regression with text-based NFS mount option parsing. Unrecognized mount options would cause mount requests to fail, even if the "-s" option was used on the mount command line. Unfortunately, this commit was not complete as submitted. It adds a new mount option, "sloppy". But it is missing a hunk, so it now allows NFS mounts with unrecognized mount options, even if the "sloppy" option is not present. This could be a problem if a required critical mount option such as "sync" is misspelled, for example, and is considered a regression from 2.6.26. This patch restores the missing hunk. Now, the default behavior of text-based NFS mount options is as before: any unrecognized mount option will cause the mount to fail. Please include this in 2.6.27-rc. Thanks to Neil Brown for reporting this. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-08 15:35:19 -07:00
Christoph Hellwig	5c89468c12	udf: add llseek method UDF currently doesn't set a llseek method for regular files, which means it will fall back to default_llseek. This means no one can seek beyond 2 Gigabytes on udf, and that there's not protection vs the i_size updates from writers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>	2008-09-08 20:31:04 +02:00
Li Zefan	7ee1ec4ca3	ext4: add missing unlock in ext4_check_descriptors() on error path If there group descriptors are corrupted we need unlock the block group lock before returning from the function; else we will oops when freeing a spinlock which is still being held. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 10:47:19 -04:00
Theodore Ts'o	05496769e5	jbd2: clean up how the journal device name is printed Calculate the journal device name once and stash it away in the journal_s structure. This avoids needing to call bdevname() everywhere and reduces stack usage by not needing to allocate an on-stack buffer. In addition, we eliminate the '/' that can appear in device names (e.g. "cciss/c0d0p9" --- see kernel bugzilla #11321) that can cause problems when creating proc directory names, and include the inode number to support ocfs2 which creates multiple journals with different inode numbers. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-16 14:36:17 -04:00
Alexey Dobriyan	899fc1a4cf	ext4: fix #11321 : create /proc/ext4/*/stats more carefully ext4 creates per-suberblock directory in /proc/ext4/ . Name used as basis is taken from bdevname, which, surprise, can contain slash. However, proc while allowing to use proc_create("a/b", parent) form of PDE creation, assumes that parent/a was already created. bdevname in question is 'cciss/c0d0p9', directory is not created and all this stuff goes directly into /proc (which is real bug). Warning comes when _second_ partition is mounted. http://bugzilla.kernel.org/show_bug.cgi?id=11321 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-14 10:21:33 -04:00
Frederic Bohe	c62a11fd95	Update flex_bg free blocks and free inodes counters when resizing. This fixes a bug which prevented the newly created inodes after a resize from being used on filesystems with flex_bg. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 10:20:24 -04:00
Eric Sandeen	9d9f177572	ext4: Avoid printk floods in the face of directory corruption Note: some people thinks this represents a security bug, since it might make the system go away while it is printing a large number of console messages, especially if a serial console is involved. Hence, it has been assigned CVE-2008-3528, but it requires that the attacker either has physical access to your machine to insert a USB disk with a corrupted filesystem image (at which point why not just hit the power button), or is otherwise able to convince the system administrator to mount an arbitrary filesystem image (at which point why not just include a setuid shell or world-writable hard disk device file or some such). Me, I think they're just being silly. --tytso Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-ext4@vger.kernel.org Cc: Eugene Teo <eugeneteo@kernel.sg>	2008-10-09 11:15:52 -04:00
Aneesh Kumar K.V	cf17fea657	ext4: Properly update i_disksize. With delayed allocation we use i_data_sem to update i_disksize. We need to update i_disksize only if the new size specified is greater than the current value and we need to make sure we don't race with other i_disksize update. With delayed allocation we will switch to the write_begin function for non-delayed allocation if we are low on free blocks. This means the write_begin function for non-delayed allocation also needs to use the same locking. We also need to check and update i_disksize even if the new size is less that inode.i_size because of delayed allocation. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-13 13:06:18 -04:00
Aneesh Kumar K.V	ae4d537211	ext4: truncate block allocated on a failed ext4_write_begin For blocksize < pagesize we need to remove blocks that got allocated in block_write_begin() if we fail with ENOSPC for later blocks. block_write_begin() internally does this if it allocated pages locally. This makes sure we don't have blocks outside inode.i_size during ENOSPC. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-13 13:10:25 -04:00
Aneesh Kumar K.V	df22291ff0	ext4: Retry block allocation if we have free blocks left When we truncate files, the meta-data blocks released are not reused untill we commit the truncate transaction. That means delayed get_block request will return ENOSPC even if we have free blocks left. Force a journal commit and retry block allocation if we get ENOSPC with free blocks left. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 23:05:34 -04:00
Aneesh Kumar K.V	166348dd37	ext4: Don't add the inode to journal handle until after the block is allocated Make sure we don't add the inode to the journal handle until after the block allocation, so that a journal commit will not include the inode in case of block allocation failure. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 23:08:40 -04:00
Aneesh Kumar K.V	68629f29c6	ext4: Fix ext4 nomballoc allocator for ENOSPC We run into ENOSPC error on nonmballoc ext4, even when there is free blocks on the filesystem. The patch includes two changes: a) Set reservation to NULL if we trying to allocate near group_target_block from the goal group if the free block in the group is less than windows. This should give us a better chance to allocate near group_target_block. This also ensures that if we are not allocating near group_target_block then we don't trun off reservation. This should enable us to allocate with reservation from other groups that have large free blocks count. b) we don't need to check the window size if the block reservation is off. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 23:09:17 -04:00
Aneesh Kumar K.V	5c79161689	ext4: Signed arithmetic fix This patch converts some usage of ext4_fsblk_t to s64. This is needed so that some of the sign conversion works as expected in if loops. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-08 23:12:24 -04:00
Aneesh Kumar K.V	79f0be8d2e	ext4: Switch to non delalloc mode when we are low on free blocks count. The delayed allocation code allocates blocks during writepages(), which can not handle block allocation failures. To deal with this, we switch away from delayed allocation mode when we are running low on free blocks. This also allows us to avoid needing to reserve a large number of meta-data blocks in case all of the requested blocks are discontiguous. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-08 23:13:30 -04:00
Aneesh Kumar K.V	6bc6e63fcd	ext4: Add percpu dirty block accounting. This patch adds dirty block accounting using percpu_counters. Delayed allocation block reservation is now done by updating dirty block counter. In a later patch we switch to non delalloc mode if the filesystem free blocks is greater than 150% of total filesystem dirty blocks Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao<cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-10 09:39:00 -04:00
Aneesh Kumar K.V	030ba6bc67	ext4: Retry block reservation During block reservation if we don't have enough blocks left, retry block reservation with smaller block counts. This makes sure we try fallocate and DIO with smaller request size and don't fail early. The delayed allocation reservation cannot try with smaller block count. So retry block reservation to handle temporary disk full conditions. Also print free blocks details if we fail block allocation during writepages. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 23:14:50 -04:00
Aneesh Kumar K.V	a30d542a00	ext4: Make sure all the block allocation paths reserve blocks With delayed allocation we need to make sure block are reserved before we attempt to allocate them. Otherwise we get block allocation failure (ENOSPC) during writepages which cannot be handled. This would mean silent data loss (We do a printk stating data will be lost). This patch updates the DIO and fallocate code path to do block reservation before block allocation. This is needed to make sure parallel DIO and fallocate request doesn't take block out of delayed reserve space. When free blocks count go below a threshold we switch to a slow patch which looks at other CPU's accumulated percpu counter values. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-09 10:56:23 -04:00
David Woodhouse	e17c6d5616	Introduce HAVE_AOUT symbol to remove hard-coded arch list for BINFMT_AOUT HAVE_AOUT doesn't quite do the same thing as the recently removed ARCH_SUPPORTS_AOUT config option. That was set even on platforms where binfmt_aout isn't supported, although it's not entirely clear why. So it's best just to introduce a new symbol, handled consistently with other similar HAVE_xxx symbols; with a simple 'select' in the arch Kconfig. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-09-06 19:30:22 +01:00
David Woodhouse	6b213e1bc2	Remove redundant CONFIG_ARCH_SUPPORTS_AOUT We don't need this any more; arguably we never really did. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-09-06 19:30:20 +01:00
Artem Bityutskiy	a5cb562d69	UBIFS: make minimum fanout 3 UBIFS does not really work correctly when fanout is 2, because of the way we manage the indexing tree. It may just become a list and UBIFS screws up. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-05 20:02:35 +03:00
Artem Bityutskiy	f171d4d769	UBIFS: fix division by zero If fanout is 3, we have division by zero in 'ubifs_read_superblock()': divide error: 0000 [#1] PREEMPT SMP Pid: 28744, comm: mount Not tainted (2.6.27-rc4-ubifs-2.6 #23) EIP: 0060:[<f8f9e3ef>] EFLAGS: 00010202 CPU: 0 EIP is at ubifs_reported_space+0x2d/0x69 [ubifs] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000 ESI: 00000000 EDI: f0ae64b0 EBP: f1f9fcf4 ESP: f1f9fce0 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-05 20:01:59 +03:00
Balbir Singh	49048622ea	sched: fix process time monotonicity Spencer reported a problem where utime and stime were going negative despite the fixes in commit `b27f03d4bd`. The suspected reason for the problem is that signal_struct maintains it's own utime and stime (of exited tasks), these are not updated using the new task_utime() routine, hence sig->utime can go backwards and cause the same problem to occur (sig->utime, adds tsk->utime and not task_utime()). This patch fixes the problem TODO: using max(task->prev_utime, derived utime) works for now, but a more generic solution is to implement cputime_max() and use the cputime_gt() function for comparison. Reported-by: spencer@bluehost.com Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-05 18:14:35 +02:00
Andrew Morton	27eccf4649	dlm: choose better identifiers sparc32: fs/dlm/config.c:397: error: expected identifier or '(' before '{' token fs/dlm/config.c: In function 'drop_node': fs/dlm/config.c:589: warning: initialization from incompatible pointer type fs/dlm/config.c:589: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'release_node': fs/dlm/config.c:601: warning: initialization from incompatible pointer type fs/dlm/config.c:601: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'show_node': fs/dlm/config.c:717: warning: initialization from incompatible pointer type fs/dlm/config.c:717: warning: initialization from incompatible pointer type fs/dlm/config.c: In function 'store_node': fs/dlm/config.c:726: warning: initialization from incompatible pointer type fs/dlm/config.c:726: warning: initialization from incompatible pointer type Cc: Christine Caulfield <ccaulfie@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Teigland <teigland@redhat.com>	2008-09-05 09:51:30 -05:00
Julien Brunel	bd1eb8818c	GFS2: Use an IS_ERR test rather than a NULL test In case of error, the function gfs2_inode_lookup returns an ERR pointer, but never returns a NULL pointer. So a NULL test that necessarily comes after an IS_ERR test should be deleted, and a NULL test that may come after a call to this function should be strengthened by an IS_ERR test. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @match_bad_null_test@ expression x, E; statement S1,S2; @@ x = gfs2_inode_lookup(...) ... when != x = E * if (x != NULL) S1 else S2 // </smpl> Signed-off-by: Julien Brunel <brunel@diku.dk> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-09-05 14:19:44 +01:00
Steven Whitehouse	dff5257473	GFS2: Fix race relating to glock min-hold time In the case that a request for a glock arrives right after the grant reply has arrived, it sometimes means that the gl_tstamp field hasn't been updated recently enough. The net result is that the min-hold time for the glock is ignored. If this happens often enough, it leads to poor performance. This patch adds an additional test, so that if the reply pending bit is set on a glock, then it will select the maximum length of time for the min-hold time, rather than looking at gl_tstamp. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-09-05 14:18:02 +01:00
David Teigland	f9f2ed4862	dlm: remove bkl BLK from recent pushdown is not needed. Signed-off-by: David Teigland <teigland@redhat.com>	2008-09-04 12:55:13 -05:00
Artem Bityutskiy	7c7cbadf73	UBIFS: amend f_fsid David Woodhouse suggested to be consistent with other FSes and xor the beginning and the end of the UUID. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-09-03 14:56:22 +03:00
KOSAKI Motohiro	4b8561521d	mm: show quicklist usage in /proc/meminfo Quicklists can consume several GB of memory. We should provide a means of monitoring this. After this patch is applied, /proc/meminfo will output the following: % cat /proc/meminfo MemTotal: 7715392 kB MemFree: 5401600 kB Buffers: 80384 kB Cached: 300800 kB SwapCached: 0 kB Active: 235584 kB Inactive: 262656 kB SwapTotal: 2031488 kB SwapFree: 2031488 kB Dirty: 3520 kB Writeback: 0 kB AnonPages: 117696 kB Mapped: 38528 kB Slab: 1589952 kB SReclaimable: 23104 kB SUnreclaim: 1566848 kB PageTables: 14656 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 5889152 kB Committed_AS: 393152 kB VmallocTotal: 17592177655808 kB VmallocUsed: 29056 kB VmallocChunk: 17592177626432 kB Quicklists: 130944 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 262144 kB Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-02 19:21:38 -07:00
Adrian Bunk	169ccbd44e	NTFS: update homepage Update the location of the NTFS homepage in several files. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-09-02 19:21:37 -07:00
David Woodhouse	dc8e190948	EFS: Don't set f_fsid in statfs(). We don't have any suitable value to put in f_fsid. Using EFS_MAGIC really isn't a good idea, because all EFS file systems will have the same f_fsid then. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-09-02 23:15:22 +01:00
David Teigland	44be6fdf10	dlm: fix address compare Compare only the addr and port fields of sockaddr structures. Fixes a problem with ipv6 where sin6_scope_id does not match. Signed-off-by: David Teigland <teigland@redhat.com>	2008-09-02 14:32:08 -05:00
Linus Torvalds	e77295dc9e	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: nfsd: fix buffer overrun decoding NFSv4 acl sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports nfsd: fix compound state allocation error handling svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet	2008-09-02 10:58:11 -07:00
J. Bruce Fields	91b80969ba	nfsd: fix buffer overrun decoding NFSv4 acl The array we kmalloc() here is not large enough. Thanks to Johann Dahm and David Richter for bug report and testing. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: David Richter <richterd@citi.umich.edu> Tested-by: Johann Dahm <jdahm@umich.edu>	2008-09-01 14:24:24 -04:00
Andy Adamson	c228c24bf1	nfsd: fix compound state allocation error handling Move the cstate_alloc call so that if it fails, the response is setup to encode the NFS error. The out label now means that the nfsd4_compound_state has not been allocated. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-09-01 14:17:48 -04:00
Artem Bityutskiy	b3385c278d	UBIFS: fill f_fsid UBIFS stores 16-bit UUID in the superblock, and it is a good idea to return part of it in 'f_fsid' filed of kstatfs structure. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:22:11 +03:00
Artem Bityutskiy	7dad181bbe	UBIFS: improve statfs reporting even more Since free space we report in statfs is file size which should fit to the FS - change the way we calculate free space and use leb_overhead instead of dark_wm in calculations. Results of "freespace" test (120MiB volume, 16KiB LEB size, 512 bytes page size). Before the change: freespace: Test 1: fill the space we have 3 times freespace: was free: 85204992 bytes 81.3 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 11284480 bytes 10.8 MiB, wrote 13.2% more than predicted freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 12935168 bytes 12.3 MiB, wrote 15.5% more than predicted freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 12939264 bytes 12.3 MiB, wrote 15.5% more than predicted freespace: Test 1 finished freespace: Test 2: gradually lessen amount of free space and fill the FS freespace: do 10 steps, lessen free space by 7596218 bytes 7.2 MiB each time freespace: was free: 78675968 bytes 75.0 MiB, wrote: 88903680 bytes 84.8 MiB, delta: 10227712 bytes 9.8 MiB, wrote 13.0% more than predicted freespace: was free: 72015872 bytes 68.7 MiB, wrote: 81514496 bytes 77.7 MiB, delta: 9498624 bytes 9.1 MiB, wrote 13.2% more than predicted freespace: was free: 63938560 bytes 61.0 MiB, wrote: 72589312 bytes 69.2 MiB, delta: `8650752` bytes 8.2 MiB, wrote 13.5% more than predicted freespace: was free: 56127488 bytes 53.5 MiB, wrote: 63762432 bytes 60.8 MiB, delta: 7634944 bytes 7.3 MiB, wrote 13.6% more than predicted freespace: was free: 48336896 bytes 46.1 MiB, wrote: 54935552 bytes 52.4 MiB, delta: 6598656 bytes 6.3 MiB, wrote 13.7% more than predicted freespace: was free: 40587264 bytes 38.7 MiB, wrote: 46157824 bytes 44.0 MiB, delta: 5570560 bytes 5.3 MiB, wrote 13.7% more than predicted freespace: was free: 32841728 bytes 31.3 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 4542464 bytes 4.3 MiB, wrote 13.8% more than predicted freespace: was free: 25100288 bytes 23.9 MiB, wrote: 28618752 bytes 27.3 MiB, delta: 3518464 bytes 3.4 MiB, wrote 14.0% more than predicted freespace: was free: 17342464 bytes 16.5 MiB, wrote: 19841024 bytes 18.9 MiB, delta: 2498560 bytes 2.4 MiB, wrote 14.4% more than predicted freespace: was free: 9605120 bytes 9.2 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 1458176 bytes 1.4 MiB, wrote 15.2% more than predicted freespace: Test 2 finished freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS freespace: do 10 steps, lessen free space by 7606272 bytes 7.3 MiB each time freespace: trashing: was free: 83668992 bytes 79.8 MiB, need free: 7606272 bytes 7.3 MiB, files created: 248297, delete 225724 (90.9% of them) freespace: was free: 70803456 bytes 67.5 MiB, wrote: 82485248 bytes 78.7 MiB, delta: 11681792 bytes 11.1 MiB, wrote 16.5% more than predicted freespace: trashing: was free: 81080320 bytes 77.3 MiB, need free: 15212544 bytes 14.5 MiB, files created: 248711, delete 202047 (81.2% of them) freespace: was free: 59867136 bytes 57.1 MiB, wrote: 71897088 bytes 68.6 MiB, delta: 12029952 bytes 11.5 MiB, wrote 20.1% more than predicted freespace: trashing: was free: 82243584 bytes 78.4 MiB, need free: 22818816 bytes 21.8 MiB, files created: 248866, delete 179817 (72.3% of them) freespace: was free: 50905088 bytes 48.5 MiB, wrote: 63168512 bytes 60.2 MiB, delta: 12263424 bytes 11.7 MiB, wrote 24.1% more than predicted freespace: trashing: was free: 83402752 bytes 79.5 MiB, need free: 30425088 bytes 29.0 MiB, files created: 248920, delete 158114 (63.5% of them) freespace: was free: 42651648 bytes 40.7 MiB, wrote: 55406592 bytes 52.8 MiB, delta: 12754944 bytes 12.2 MiB, wrote 29.9% more than predicted freespace: trashing: was free: 84402176 bytes 80.5 MiB, need free: 38031360 bytes 36.3 MiB, files created: 248709, delete 136641 (54.9% of them) freespace: was free: 35233792 bytes 33.6 MiB, wrote: 48250880 bytes 46.0 MiB, delta: 13017088 bytes 12.4 MiB, wrote 36.9% more than predicted freespace: trashing: was free: 82530304 bytes 78.7 MiB, need free: 45637632 bytes 43.5 MiB, files created: 248778, delete 111208 (44.7% of them) freespace: was free: 27287552 bytes 26.0 MiB, wrote: 40267776 bytes 38.4 MiB, delta: 12980224 bytes 12.4 MiB, wrote 47.6% more than predicted freespace: trashing: was free: 85114880 bytes 81.2 MiB, need free: 53243904 bytes 50.8 MiB, files created: 248508, delete 93052 (37.4% of them) freespace: was free: 22437888 bytes 21.4 MiB, wrote: 35328000 bytes 33.7 MiB, delta: 12890112 bytes 12.3 MiB, wrote 57.4% more than predicted freespace: trashing: was free: 84103168 bytes 80.2 MiB, need free: 60850176 bytes 58.0 MiB, files created: 248637, delete 68743 (27.6% of them) freespace: was free: 15536128 bytes 14.8 MiB, wrote: 28319744 bytes 27.0 MiB, delta: 12783616 bytes 12.2 MiB, wrote 82.3% more than predicted freespace: trashing: was free: 84357120 bytes 80.4 MiB, need free: 68456448 bytes 65.3 MiB, files created: 248567, delete 46852 (18.8% of them) freespace: was free: 9015296 bytes 8.6 MiB, wrote: 22044672 bytes 21.0 MiB, delta: 13029376 bytes 12.4 MiB, wrote 144.5% more than predicted freespace: trashing: was free: 84942848 bytes 81.0 MiB, need free: 76062720 bytes 72.5 MiB, files created: 248636, delete 25993 (10.5% of them) freespace: was free: 6086656 bytes 5.8 MiB, wrote: 8331264 bytes 7.9 MiB, delta: 2244608 bytes 2.1 MiB, wrote 36.9% more than predicted freespace: Test 3 finished freespace: finished successfully After the change: freespace: Test 1: fill the space we have 3 times freespace: was free: 94048256 bytes 89.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 2441216 bytes 2.3 MiB, wrote 2.6% more than predicted freespace: was free: 92246016 bytes 88.0 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 4247552 bytes 4.1 MiB, wrote 4.6% more than predicted freespace: was free: 92254208 bytes 88.0 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 4235264 bytes 4.0 MiB, wrote 4.6% more than predicted freespace: Test 1 finished freespace: Test 2: gradually lessen amount of free space and fill the FS freespace: do 10 steps, lessen free space by 8386001 bytes 8.0 MiB each time freespace: was free: 86605824 bytes 82.6 MiB, wrote: 88252416 bytes 84.2 MiB, delta: 1646592 bytes 1.6 MiB, wrote 1.9% more than predicted freespace: was free: 78667776 bytes 75.0 MiB, wrote: 80715776 bytes 77.0 MiB, delta: 2048000 bytes 2.0 MiB, wrote 2.6% more than predicted freespace: was free: 69615616 bytes 66.4 MiB, wrote: 71630848 bytes 68.3 MiB, delta: 2015232 bytes 1.9 MiB, wrote 2.9% more than predicted freespace: was free: 61018112 bytes 58.2 MiB, wrote: 62783488 bytes 59.9 MiB, delta: 1765376 bytes 1.7 MiB, wrote 2.9% more than predicted freespace: was free: 52424704 bytes 50.0 MiB, wrote: 53968896 bytes 51.5 MiB, delta: 1544192 bytes 1.5 MiB, wrote 2.9% more than predicted freespace: was free: 43880448 bytes 41.8 MiB, wrote: 45199360 bytes 43.1 MiB, delta: 1318912 bytes 1.3 MiB, wrote 3.0% more than predicted freespace: was free: 35332096 bytes 33.7 MiB, wrote: 36425728 bytes 34.7 MiB, delta: `1093632` bytes 1.0 MiB, wrote 3.1% more than predicted freespace: was free: 26771456 bytes 25.5 MiB, wrote: 27643904 bytes 26.4 MiB, delta: 872448 bytes 852.0 KiB, wrote 3.3% more than predicted freespace: was free: 18231296 bytes 17.4 MiB, wrote: 18878464 bytes 18.0 MiB, delta: 647168 bytes 632.0 KiB, wrote 3.5% more than predicted freespace: was free: 9674752 bytes 9.2 MiB, wrote: 10088448 bytes 9.6 MiB, delta: 413696 bytes 404.0 KiB, wrote 4.3% more than predicted freespace: Test 2 finished freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS freespace: do 10 steps, lessen free space by 8397544 bytes 8.0 MiB each time freespace: trashing: was free: 92372992 bytes 88.1 MiB, need free: 8397552 bytes 8.0 MiB, files created: 248296, delete 225723 (90.9% of them) freespace: was free: 71909376 bytes 68.6 MiB, wrote: 82472960 bytes 78.7 MiB, delta: 10563584 bytes 10.1 MiB, wrote 14.7% more than predicted freespace: trashing: was free: 88989696 bytes 84.9 MiB, need free: 16795096 bytes 16.0 MiB, files created: 248794, delete 201838 (81.1% of them) freespace: was free: 60354560 bytes 57.6 MiB, wrote: 71782400 bytes 68.5 MiB, delta: 11427840 bytes 10.9 MiB, wrote 18.9% more than predicted freespace: trashing: was free: 90304512 bytes 86.1 MiB, need free: 25192640 bytes 24.0 MiB, files created: 248733, delete 179342 (72.1% of them) freespace: was free: 51187712 bytes 48.8 MiB, wrote: 62943232 bytes 60.0 MiB, delta: 11755520 bytes 11.2 MiB, wrote 23.0% more than predicted freespace: trashing: was free: 91209728 bytes 87.0 MiB, need free: 33590184 bytes 32.0 MiB, files created: 248779, delete 157160 (63.2% of them) freespace: was free: 42704896 bytes 40.7 MiB, wrote: 55050240 bytes 52.5 MiB, delta: 12345344 bytes 11.8 MiB, wrote 28.9% more than predicted freespace: trashing: was free: 92700672 bytes 88.4 MiB, need free: 41987728 bytes 40.0 MiB, files created: 248848, delete 136135 (54.7% of them) freespace: was free: 35250176 bytes 33.6 MiB, wrote: 48115712 bytes 45.9 MiB, delta: 12865536 bytes 12.3 MiB, wrote 36.5% more than predicted freespace: trashing: was free: 93986816 bytes 89.6 MiB, need free: 50385272 bytes 48.1 MiB, files created: 248723, delete 115385 (46.4% of them) freespace: was free: 29995008 bytes 28.6 MiB, wrote: 41582592 bytes 39.7 MiB, delta: 11587584 bytes 11.1 MiB, wrote 38.6% more than predicted freespace: trashing: was free: 91881472 bytes 87.6 MiB, need free: 58782816 bytes 56.1 MiB, files created: 248645, delete 89569 (36.0% of them) freespace: was free: 22511616 bytes 21.5 MiB, wrote: 34705408 bytes 33.1 MiB, delta: 12193792 bytes 11.6 MiB, wrote 54.2% more than predicted freespace: trashing: was free: 91774976 bytes 87.5 MiB, need free: 67180360 bytes 64.1 MiB, files created: 248580, delete 66616 (26.8% of them) freespace: was free: 16908288 bytes 16.1 MiB, wrote: 26898432 bytes 25.7 MiB, delta: `9990144` bytes 9.5 MiB, wrote 59.1% more than predicted freespace: trashing: was free: 92450816 bytes 88.2 MiB, need free: 75577904 bytes 72.1 MiB, files created: 248654, delete 45381 (18.3% of them) freespace: was free: 10170368 bytes 9.7 MiB, wrote: 19111936 bytes 18.2 MiB, delta: 8941568 bytes 8.5 MiB, wrote 87.9% more than predicted freespace: trashing: was free: 93282304 bytes 89.0 MiB, need free: 83975448 bytes 80.1 MiB, files created: 248513, delete 24794 (10.0% of them) freespace: was free: 3911680 bytes 3.7 MiB, wrote: 7872512 bytes 7.5 MiB, delta: 3960832 bytes 3.8 MiB, wrote 101.3% more than predicted freespace: Test 3 finished freespace: finished successfully Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:21:51 +03:00
Artem Bityutskiy	9bbb5726ef	UBIFS: introduce LEB overhead This is a preparational patch for the following statfs() report fix. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:21:40 +03:00
Artem Bityutskiy	131130b9a1	UBIFS: add forgotten gc_idx_lebs component We add this component at other similar places, but not in this one. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:21:31 +03:00
Artem Bityutskiy	ad507653a3	UBIFS: fix assertion The assertion was incorrect, because it did not take into account free space. This patch also amends the comments correspondingly, and cleans them up a little. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:21:22 +03:00
Artem Bityutskiy	4b5f2762ec	UBIFS: improve statfs reporting Make free space calculation less pessimistic and more realistic, which in turn improves 'statfs()' reports. Now it lies by 10%-20%, instead of 20%-30% (10% more honest). Results of "freespace" test (120MiB volume, 16KiB LEB size, 512 bytes page size). Before the change: freespace: Test 1: fill the space we have 3 times freespace: was free: 78274560 bytes 74.6 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 18214912 bytes 17.4 MiB, wrote 23.3% more than predicted freespace: was free: 76754944 bytes 73.2 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 19738624 bytes 18.8 MiB, wrote 25.7% more than predicted freespace: was free: 76759040 bytes 73.2 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 19730432 bytes 18.8 MiB, wrote 25.7% more than predicted freespace: Test 1 finished freespace: Test 2: gradually lessen amount of free space and fill the FS freespace: do 10 steps, lessen free space by `6977722` bytes 6.7 MiB each time freespace: was free: 72273920 bytes 68.9 MiB, wrote: 88891392 bytes 84.8 MiB, delta: 16617472 bytes 15.8 MiB, wrote 23.0% more than predicted freespace: was free: 66154496 bytes 63.1 MiB, wrote: 81506304 bytes 77.7 MiB, delta: 15351808 bytes 14.6 MiB, wrote 23.2% more than predicted freespace: was free: 58732544 bytes 56.0 MiB, wrote: 72572928 bytes 69.2 MiB, delta: 13840384 bytes 13.2 MiB, wrote 23.6% more than predicted freespace: was free: 51552256 bytes 49.2 MiB, wrote: 63754240 bytes 60.8 MiB, delta: 12201984 bytes 11.6 MiB, wrote 23.7% more than predicted freespace: was free: 44404736 bytes 42.3 MiB, wrote: 54943744 bytes 52.4 MiB, delta: 10539008 bytes 10.1 MiB, wrote 23.7% more than predicted freespace: was free: 37285888 bytes 35.6 MiB, wrote: 46161920 bytes 44.0 MiB, delta: 8876032 bytes 8.5 MiB, wrote 23.8% more than predicted freespace: was free: 30171136 bytes 28.8 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 7213056 bytes 6.9 MiB, wrote 23.9% more than predicted freespace: was free: 23048192 bytes 22.0 MiB, wrote: 28606464 bytes 27.3 MiB, delta: 5558272 bytes 5.3 MiB, wrote 24.1% more than predicted freespace: was free: 15941632 bytes 15.2 MiB, wrote: 19828736 bytes 18.9 MiB, delta: 3887104 bytes 3.7 MiB, wrote 24.4% more than predicted freespace: was free: 8830976 bytes 8.4 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 2232320 bytes 2.1 MiB, wrote 25.3% more than predicted freespace: Test 2 finished freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS freespace: do 10 steps, lessen free space by 6985541 bytes 6.7 MiB each time freespace: trashing: was free: 76840960 bytes 73.3 MiB, need free: 6985550 bytes 6.7 MiB, files created: 248311, delete 225737 (90.9% of them) freespace: was free: 65228800 bytes 62.2 MiB, wrote: 82530304 bytes 78.7 MiB, delta: 17301504 bytes 16.5 MiB, wrote 26.5% more than predicted freespace: trashing: was free: 74485760 bytes 71.0 MiB, need free: 13971091 bytes 13.3 MiB, files created: 248712, delete 202061 (81.2% of them) freespace: was free: 55025664 bytes 52.5 MiB, wrote: 71925760 bytes 68.6 MiB, delta: 16900096 bytes 16.1 MiB, wrote 30.7% more than predicted freespace: trashing: was free: 75550720 bytes 72.1 MiB, need free: 20956632 bytes 20.0 MiB, files created: 248849, delete 179822 (72.3% of them) freespace: was free: 46669824 bytes 44.5 MiB, wrote: 63197184 bytes 60.3 MiB, delta: 16527360 bytes 15.8 MiB, wrote 35.4% more than predicted freespace: trashing: was free: 76214272 bytes 72.7 MiB, need free: 27942173 bytes 26.6 MiB, files created: 248789, delete 157576 (63.3% of them) freespace: was free: 39129088 bytes 37.3 MiB, wrote: 55164928 bytes 52.6 MiB, delta: 16035840 bytes 15.3 MiB, wrote 41.0% more than predicted freespace: trashing: was free: 77398016 bytes 73.8 MiB, need free: 34927714 bytes 33.3 MiB, files created: 248711, delete 136474 (54.9% of them) freespace: was free: 32325632 bytes 30.8 MiB, wrote: 48234496 bytes 46.0 MiB, delta: 15908864 bytes 15.2 MiB, wrote 49.2% more than predicted freespace: trashing: was free: 75796480 bytes 72.3 MiB, need free: 41913255 bytes 40.0 MiB, files created: 248674, delete 111164 (44.7% of them) freespace: was free: 25079808 bytes 23.9 MiB, wrote: 40775680 bytes 38.9 MiB, delta: 15695872 bytes 15.0 MiB, wrote 62.6% more than predicted freespace: trashing: was free: 78209024 bytes 74.6 MiB, need free: 48898796 bytes 46.6 MiB, files created: 248708, delete 93207 (37.5% of them) freespace: was free: 20582400 bytes 19.6 MiB, wrote: 34844672 bytes 33.2 MiB, delta: 14262272 bytes 13.6 MiB, wrote 69.3% more than predicted freespace: trashing: was free: 77328384 bytes 73.7 MiB, need free: 55884337 bytes 53.3 MiB, files created: 248644, delete 68951 (27.7% of them) freespace: was free: 14368768 bytes 13.7 MiB, wrote: 28278784 bytes 27.0 MiB, delta: 13910016 bytes 13.3 MiB, wrote 96.8% more than predicted freespace: trashing: was free: 77434880 bytes 73.8 MiB, need free: 62869878 bytes 60.0 MiB, files created: 248640, delete 46767 (18.8% of them) freespace: was free: 8286208 bytes 7.9 MiB, wrote: 21811200 bytes 20.8 MiB, delta: 13524992 bytes 12.9 MiB, wrote 163.2% more than predicted freespace: trashing: was free: 77856768 bytes 74.2 MiB, need free: 69855419 bytes 66.6 MiB, files created: 248576, delete 25546 (10.3% of them) freespace: was free: 5570560 bytes 5.3 MiB, wrote: 8187904 bytes 7.8 MiB, delta: 2617344 bytes 2.5 MiB, wrote 47.0% more than predicted freespace: Test 3 finished freespace: finished successfully After the change: freespace: Test 1: fill the space we have 3 times freespace: was free: 85204992 bytes 81.3 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 11284480 bytes 10.8 MiB, wrote 13.2% more than predicted freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 12935168 bytes 12.3 MiB, wrote 15.5% more than predicted freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 12939264 bytes 12.3 MiB, wrote 15.5% more than predicted freespace: Test 1 finished freespace: Test 2: gradually lessen amount of free space and fill the FS freespace: do 10 steps, lessen free space by 7596218 bytes 7.2 MiB each time freespace: was free: 78675968 bytes 75.0 MiB, wrote: 88903680 bytes 84.8 MiB, delta: 10227712 bytes 9.8 MiB, wrote 13.0% more than predicted freespace: was free: 72015872 bytes 68.7 MiB, wrote: 81514496 bytes 77.7 MiB, delta: 9498624 bytes 9.1 MiB, wrote 13.2% more than predicted freespace: was free: 63938560 bytes 61.0 MiB, wrote: 72589312 bytes 69.2 MiB, delta: `8650752` bytes 8.2 MiB, wrote 13.5% more than predicted freespace: was free: 56127488 bytes 53.5 MiB, wrote: 63762432 bytes 60.8 MiB, delta: 7634944 bytes 7.3 MiB, wrote 13.6% more than predicted freespace: was free: 48336896 bytes 46.1 MiB, wrote: 54935552 bytes 52.4 MiB, delta: 6598656 bytes 6.3 MiB, wrote 13.7% more than predicted freespace: was free: 40587264 bytes 38.7 MiB, wrote: 46157824 bytes 44.0 MiB, delta: 5570560 bytes 5.3 MiB, wrote 13.7% more than predicted freespace: was free: 32841728 bytes 31.3 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 4542464 bytes 4.3 MiB, wrote 13.8% more than predicted freespace: was free: 25100288 bytes 23.9 MiB, wrote: 28618752 bytes 27.3 MiB, delta: 3518464 bytes 3.4 MiB, wrote 14.0% more than predicted freespace: was free: 17342464 bytes 16.5 MiB, wrote: 19841024 bytes 18.9 MiB, delta: 2498560 bytes 2.4 MiB, wrote 14.4% more than predicted freespace: was free: 9605120 bytes 9.2 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 1458176 bytes 1.4 MiB, wrote 15.2% more than predicted freespace: Test 2 finished freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS freespace: do 10 steps, lessen free space by 7606272 bytes 7.3 MiB each time freespace: trashing: was free: 83668992 bytes 79.8 MiB, need free: 7606272 bytes 7.3 MiB, files created: 248297, delete 225724 (90.9% of them) freespace: was free: 70803456 bytes 67.5 MiB, wrote: 82485248 bytes 78.7 MiB, delta: 11681792 bytes 11.1 MiB, wrote 16.5% more than predicted freespace: trashing: was free: 81080320 bytes 77.3 MiB, need free: 15212544 bytes 14.5 MiB, files created: 248711, delete 202047 (81.2% of them) freespace: was free: 59867136 bytes 57.1 MiB, wrote: 71897088 bytes 68.6 MiB, delta: 12029952 bytes 11.5 MiB, wrote 20.1% more than predicted freespace: trashing: was free: 82243584 bytes 78.4 MiB, need free: 22818816 bytes 21.8 MiB, files created: 248866, delete 179817 (72.3% of them) freespace: was free: 50905088 bytes 48.5 MiB, wrote: 63168512 bytes 60.2 MiB, delta: 12263424 bytes 11.7 MiB, wrote 24.1% more than predicted freespace: trashing: was free: 83402752 bytes 79.5 MiB, need free: 30425088 bytes 29.0 MiB, files created: 248920, delete 158114 (63.5% of them) freespace: was free: 42651648 bytes 40.7 MiB, wrote: 55406592 bytes 52.8 MiB, delta: 12754944 bytes 12.2 MiB, wrote 29.9% more than predicted freespace: trashing: was free: 84402176 bytes 80.5 MiB, need free: 38031360 bytes 36.3 MiB, files created: 248709, delete 136641 (54.9% of them) freespace: was free: 35233792 bytes 33.6 MiB, wrote: 48250880 bytes 46.0 MiB, delta: 13017088 bytes 12.4 MiB, wrote 36.9% more than predicted freespace: trashing: was free: 82530304 bytes 78.7 MiB, need free: 45637632 bytes 43.5 MiB, files created: 248778, delete 111208 (44.7% of them) freespace: was free: 27287552 bytes 26.0 MiB, wrote: 40267776 bytes 38.4 MiB, delta: 12980224 bytes 12.4 MiB, wrote 47.6% more than predicted freespace: trashing: was free: 85114880 bytes 81.2 MiB, need free: 53243904 bytes 50.8 MiB, files created: 248508, delete 93052 (37.4% of them) freespace: was free: 22437888 bytes 21.4 MiB, wrote: 35328000 bytes 33.7 MiB, delta: 12890112 bytes 12.3 MiB, wrote 57.4% more than predicted freespace: trashing: was free: 84103168 bytes 80.2 MiB, need free: 60850176 bytes 58.0 MiB, files created: 248637, delete 68743 (27.6% of them) freespace: was free: 15536128 bytes 14.8 MiB, wrote: 28319744 bytes 27.0 MiB, delta: 12783616 bytes 12.2 MiB, wrote 82.3% more than predicted freespace: trashing: was free: 84357120 bytes 80.4 MiB, need free: 68456448 bytes 65.3 MiB, files created: 248567, delete 46852 (18.8% of them) freespace: was free: 9015296 bytes 8.6 MiB, wrote: 22044672 bytes 21.0 MiB, delta: 13029376 bytes 12.4 MiB, wrote 144.5% more than predicted freespace: trashing: was free: 84942848 bytes 81.0 MiB, need free: 76062720 bytes 72.5 MiB, files created: 248636, delete 25993 (10.5% of them) freespace: was free: 6086656 bytes 5.8 MiB, wrote: 8331264 bytes 7.9 MiB, delta: 2244608 bytes 2.1 MiB, wrote 36.9% more than predicted freespace: Test 3 finished freespace: finished successfully Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:20:26 +03:00
Artem Bityutskiy	8aabb75017	UBIFS: remove incorrect index space check When we report free space to user-space, we should not report 0 if the amount of empty LEBs is too low, because they would be produced by GC when needed. Thus, just call 'ubifs_calc_available()' straight away which would take 'min_idx_lebs' into account anyway. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:16:03 +03:00
Artem Bityutskiy	9e5de35496	UBIFS: push empty flash hack down We have a hack which forces the amount of flash space to be equivalent to 'c->blocks_cnt' in case of empty FS. This is to make users happy and see '%0' used in 'df' when they mount an empty FS. This hack is not needed in 'ubifs_calc_available()', but it is only needed the caller, in 'ubifs_budg_get_free_space()'. So push it down there. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:15:53 +03:00
Artem Bityutskiy	8191e1fa81	UBIFS: do not update min_idx_lebs in stafs This is bad because the rest of the code should not depend on it, and this may hide bugss, instead of revealing them. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-31 17:15:43 +03:00
David Teigland	c1dcf65ffc	dlm: fix locking of lockspace list in dlm_scand The dlm_scand thread needs to lock the list of lockspaces when going through it. Signed-off-by: David Teigland <teigland@redhat.com>	2008-08-28 11:50:07 -05:00
David Teigland	dc68c7ed36	dlm: detect available userspace daemon If dlm_controld (the userspace daemon that controls the setup and recovery of the dlm) fails, the kernel should shut down the lockspaces in the kernel rather than leaving them running. This is detected by having dlm_controld hold a misc device open while running, and if the kernel detects a close while the daemon is still needed, it stops the lockspaces in the kernel. Knowing that the userspace daemon isn't running also allows the lockspace create/remove routines to avoid waiting on the daemon for join/leave operations. Signed-off-by: David Teigland <teigland@redhat.com>	2008-08-28 11:49:43 -05:00
David Teigland	0f8e0d9a31	dlm: allow multiple lockspace creates Add a count for lockspace create and release so that create can be called multiple times to use the lockspace from different places. Also add the new flag DLM_LSFL_NEWEXCL to create a lockspace with the previous behavior of returning -EEXIST if the lockspace already exists. Signed-off-by: David Teigland <teigland@redhat.com>	2008-08-28 11:49:15 -05:00
Steve French	c76da9da1f	[CIFS] Turn off Unicode during session establishment for plaintext authentication LANMAN session setup did not support Unicode (after session setup, unicode can still be used though). Fixes samba bug# 5319 CC: Jeff Layton <jlayton@redhat.com> CC: Stable Kernel <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-28 15:32:22 +00:00
Steve French	2e655021b8	[CIFS] update cifs change log Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-28 15:30:06 +00:00
Jeff Layton	838726c475	cifs: fix O_APPEND on directio mounts The direct I/O write codepath for CIFS is done through cifs_user_write(). That function does not currently call generic_write_checks() so the file position isn't being properly set when the file is opened with O_APPEND. It's also not doing the other "normal" checks that should be done for a write call. The problem is currently that when you open a file with O_APPEND on a mount with the directio mount option, the file position is set to the beginning of the file. This makes any subsequent writes clobber the data in the file starting at the beginning. This seems to fix the problem in cursory testing. It is, however important to note that NFS disallows the combination of (O_DIRECT\|O_APPEND). If my understanding is correct, the concern is races with multiple clients appending to a file clobbering each others' data. Since the write model for CIFS and NFS is pretty similar in this regard, CIFS is probably subject to the same sort of races. What's unclear to me is why this is a particular problem with O_DIRECT and not with buffered writes... Regardless, disallowing O_APPEND on an entire mount is probably not reasonable, so we'll probably just have to deal with it and reevaluate this flag combination when we get proper support for O_DIRECT. In the meantime this patch at least fixes the existing problem. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Stable Tree <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-28 14:15:32 +00:00
Steve French	6405c9cd9b	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-08-28 02:47:00 +00:00
Linus Torvalds	325a9a3d39	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Add destroy routine for dns_resolver [CIFS] Reorder cifs config item for better clarity [CIFS] Correct keys dependency for cifs kerberos support	2008-08-27 14:34:49 -07:00
Linus Torvalds	5b51a7e9d8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: [PATCH] deal with the first call of ->show() generating no output [PATCH] fix ->llseek() for a bunch of directories [PATCH] fix regular readdir() and friends [PATCH] fix hpux_getdents() [PATCH] fix osf_getdirents() [PATCH] ntfs: use d_add_ci [PATCH] change d_add_ci argument ordering [PATCH] fix efs_lookup() [PATCH] proc: inode number fixlet	2008-08-27 14:31:44 -07:00
Steve French	bcc55c6664	[CIFS] Fix plaintext authentication The last eight bytes of the password field were not cleared when doing lanman plaintext password authentication. This patch fixes that. I tested it with Samba by setting password encryption to no in the server's smb.conf. Other servers also can be configured to force plaintext authentication. Note that plaintexti authentication requires setting /proc/fs/cifs/SecurityFlags to 0x30030 on the client (enabling both LANMAN and also plaintext password support). Also note that LANMAN support (and thus plaintext password support) requires CONFIG_CIFS_WEAK_PW_HASH to be enabled in menuconfig. CC: Jeff Layton <jlayton@redhat.com> CC: Stable Kernel <stable@vger.kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-27 21:30:22 +00:00
Jeff Layton	87ed1d65fb	[CIFS] Add destroy routine for dns_resolver Otherwise, we're leaking the payload memory. CC: Stable Kernel <stable@vger.kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-27 21:17:41 +00:00
Linus Torvalds	0559bc8e9b	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: remove blk_queue_tag_depth() and blk_queue_tag_queue() block: remove unused ->busy part of the block queue tag map bio: fix __bio_copy_iov() handling of bio->bv_len bio: fix bio_copy_kern() handling of bio->bv_len block: submit_bh() inadvertently discards barrier flag on a sync write block: clean up cmdfilter sysfs interface block: rename blk_scsi_cmd_filter to blk_cmd_filter sg: restore command permission for TYPE_SCANNER block: move cmdfilter from gendisk to request_queue	2008-08-27 13:55:35 -07:00
Linus Torvalds	e472233fc5	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: Increment the reference count of an already-active stack. [PATCH] configfs: Consolidate locking around configfs_detach_prep() in configfs_rmdir() ocfs2: correctly set i_blocks after inline dir gets expanded ocfs2: Jump to correct label in ocfs2_expand_inline_dir() ocfs2: Fix sleep-with-spinlock recovery regression [PATCH] ocfs2/cluster/netdebug.c: fix warning [PATCH] ocfs2/cluster/tcp.c: make some functions static	2008-08-27 13:54:55 -07:00
Steven Whitehouse	0188d6c580	GFS2: Fix & clean up GFS2 rename This patch fixes a locking issue in the rename code by ensuring that we hold the per sb rename lock over both directory and "other" renames which involve different parent directories. At the same time, this moved the (only called from one place) function gfs2_ok_to_move into the file that its called from, so we can mark it static. This should make a code a bit easier to follow. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Peter Staubach <staubach@redhat.com>	2008-08-27 13:33:10 +01:00
FUJITA Tomonori	aefcc28a3a	bio: fix __bio_copy_iov() handling of bio->bv_len The commit `c5dec1c303` introduced __bio_copy_iov() to add bounce support to blk_rq_map_user_iov. __bio_copy_iov() uses bio->bv_len to copy data for READ commands after the completion but it doesn't work with a request that partially completed. SCSI always completes a PC request as a whole but seems some don't. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-27 09:50:19 +02:00
FUJITA Tomonori	76029ff37f	bio: fix bio_copy_kern() handling of bio->bv_len The commit `68154e90c9` introduced bio_copy_kern() to add bounce support to blk_rq_map_kern. bio_copy_kern() uses bio->bv_len to copy data for READ commands after the completion but it doesn't work with a request that partially completed. SCSI always completes a PC request as a whole but seems some don't. This patch fixes bio_copy_kern to handle the above case. As bio_copy_user does, bio_copy_kern uses struct bio_map_data to store struct bio_vec. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Reported-by: Nix <nix@esperi.org.uk> Tested-by: Nix <nix@esperi.org.uk> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-27 09:50:19 +02:00
Jens Axboe	48fd4f93a0	block: submit_bh() inadvertently discards barrier flag on a sync write Reported by Milan Broz <mbroz@redhat.com>, commit `18ce3751` inadvertently made submit_bh() discard the barrier bit for a WRITE_SYNC request. Fix that up. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-27 09:50:19 +02:00
Steve French	96c2a1137b	[CIFS] Reorder cifs config item for better clarity Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-26 18:32:28 +00:00
Steve French	e9775843ec	[CIFS] Correct keys dependency for cifs kerberos support Must also depend on CIFS ... Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-26 18:22:50 +00:00
Steve French	3dae49abef	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2008-08-26 16:56:05 +00:00
Steve French	6ce5eecb9c	[CIFS] check version in spnego upcall response Currently, we don't check the version in the SPNEGO upcall response even though one is provided. Jeff and Q have made the corresponding change to the Samba client (cifs.upcall). Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-26 00:37:14 +00:00
Joel Becker	d6817cdbd1	ocfs2: Increment the reference count of an already-active stack. The ocfs2_stack_driver_request() function failed to increment the refcount of an already-active stack. It only did the increment on the first reference. Whoops. Signed-off-by: Joel Becker <joel.becker@oracle.com> Tested-by: Marcos Matsunaga <marcos.matsunaga@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-25 07:29:47 -07:00
Adrian Hunter	601c0bc467	UBIFS: allow for racing between GC and TNC The TNC mutex is unlocked prematurely when reading leaf nodes with non-hashed keys. This is unsafe because the node may be moved by garbage collection and the eraseblock unmapped, although that has never actually happened during stress testing. This patch fixes the flaw by detecting the race and retrying with the TNC mutex locked. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-25 14:34:02 +03:00
Adrian Hunter	761e29f3bb	UBIFS: always read hashed-key nodes under TNC mutex Leaf-nodes that have a hashed key are stored in the leaf-node-cache (LNC) which is protected by the TNC mutex. Consequently, when reading a leaf node with a hashed key (i.e. directory entries, xattr entries) the TNC mutex is always required. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-25 14:33:41 +03:00
Al Viro	4cdfe84b51	[PATCH] deal with the first call of ->show() generating no output seq_read() has a subtle bug - we want the first loop there to go until at least one non-empty record had fit entirely into buffer. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:10 -04:00
Al Viro	59af1584bf	[PATCH] fix ->llseek() for a bunch of directories Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:09 -04:00
Al Viro	8f3f655da7	[PATCH] fix regular readdir() and friends Handling of -EOVERFLOW. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:08 -04:00
Christoph Hellwig	2690421743	[PATCH] ntfs: use d_add_ci d_add_ci was lifted 1:1 from ntfs. Change ntfs to use the common version. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:06 -04:00
Christoph Hellwig	e45b590b97	[PATCH] change d_add_ci argument ordering As pointed out during review d_add_ci argument order should match d_add, so switch the dentry and inode arguments. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:05 -04:00
Al Viro	2d8a10cd17	[PATCH] fix efs_lookup() it needs to use d_splice_alias(), not d_add() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:04 -04:00
Alexey Dobriyan	cc99609917	[PATCH] proc: inode number fixlet Ouch, if number taken from IDA is too big, the intent was to signal an error, not check for overflow and still do overflowing addition. One still needs 2^28 proc entries to notice this. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-25 01:18:03 -04:00
Adrian Bunk	7a8fc9b248	removed unused #include <linux/version.h>'s This patch lets the files using linux/version.h match the files that #include it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-23 12:14:12 -07:00
Louis Rilling	de6bf18e9c	[PATCH] configfs: Consolidate locking around configfs_detach_prep() in configfs_rmdir() It appears that configfs_rmdir() can protect configfs_detach_prep() retries with less calls to {spin,mutex}_{lock,unlock}, and a cleaner code. This patch does not change any behavior, except that it removes two useless lock/unlock pairs having nothing inside to protect and providing a useless barrier. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <Joel.Becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 11:09:02 -07:00
Mark Fasheh	9780eb6cfa	ocfs2: correctly set i_blocks after inline dir gets expanded We were setting i_blocks based on allocation before the extent insert, which is wrong as the value is a calculation based on ip_clusters which gets updated as a result of the insert. This patch moves the line in question to just after the call to ocfs2_insert_extent(). Without this fix, inline directories were temporarily having an i_blocks value of zero immediately after expansion to extents. Reported-and-tested-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 11:09:02 -07:00
Tao Ma	83cab5338f	ocfs2: Jump to correct label in ocfs2_expand_inline_dir() When we fail to insert extent in ocfs2_expand_inline_dir(), we should go to out_commit, not out. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 11:09:02 -07:00
Mark Fasheh	a1af7d15a1	ocfs2: Fix sleep-with-spinlock recovery regression This fixes a bug introduced with 539d8264093560b917ee3afe4c7f74e5da09d6a5: [PATCH 2/2] ocfs2: Fix race between mount and recovery ocfs2_mark_dead_nodes() was reading journal inodes while holding the spinlock protecting our in-memory recovery state. The fix is very simple - the disk state is protected by a cluster lock that's already held, so we just move the spinlock down past the read. Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 11:08:38 -07:00
Alexander Beregalov	a57a874b04	[PATCH] ocfs2/cluster/netdebug.c: fix warning ocfs2/cluster/netdebug.c: fix warning fs/ocfs2/cluster/netdebug.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 17 has type 'suseconds_t' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 10:56:57 -07:00
Adrian Bunk	18496e80f7	[PATCH] ocfs2/cluster/tcp.c: make some functions static Commit `0f475b2abe` (ocfs2/net: Silence build warnings) made sense as far as it fixed compile warnings, but it was not required that it made the functions global. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-08-22 10:56:40 -07:00
Linus Torvalds	ee26562772	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Update documentation to remind users to update mke2fs.conf ext4: Fix small file fragmentation ext4: Initialize writeback_index to 0 when allocating a new inode ext4: make sure ext4_has_free_blocks returns 0 for ENOSPC ext4: journal credit fix for the delayed allocation's writepages() function ext4: Rework the ext4_da_writepages() function ext4: journal credits reservation fixes for DIO, fallocate ext4: journal credits reservation fixes for extent file writepage ext4: journal credits calulation cleanup and fix for non-extent writepage ext4: Fix bug where we return ENOSPC even though we have plenty of inodes ext4: don't try to resize if there are no reserved gdt blocks left ext4: Use ext4_discard_reservations instead of mballoc-specific call ext4: Fix ext4_dx_readdir hash collision handling ext4: Fix delalloc release block reservation for truncate ext4: Fix potential truncate BUG due to i_prealloc_list being non-empty ext4: Handle unwritten extent properly with delayed allocation	2008-08-22 08:37:07 -07:00
Artem Bityutskiy	04da11bfcf	UBIFS: fix zero-length truncations Always allow truncations to zero, even if budgeting thinks there is no space. UBIFS reserves some space for deletions anyway. Otherwise, the following happans: 1. create a file, and write as much as possible there, until ENOSPC 2. truncate the file, which fails with ENOSPC, which is not good. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-21 16:48:52 +03:00
Al Viro	82d63fc9e3	cramfs: fix named-pipe handling After commit `a97c9bf33f` (fix cramfs making duplicate entries in inode cache) in kernel 2.6.14, named-pipe on cramfs does not work properly. It seems the commit make all named-pipe on cramfs share their inode (and named-pipe buffer). Make ..._test() refuse to merge inodes with ->i_ino == 1, take inode setup back to get_cramfs_inode() and make ->drop_inode() evict ones with ->i_ino == 1 immediately. Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> [2.6.14 and later] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-20 15:40:32 -07:00
Ken Chen	2d70b68d42	fix setpriority(PRIO_PGRP) thread iterator breakage When user calls sys_setpriority(PRIO_PGRP ...) on a NPTL style multi-LWP process, only the task leader of the process is affected, all other sibling LWP threads didn't receive the setting. The problem was that the iterator used in sys_setpriority() only iteartes over one task for each process, ignoring all other sibling thread. Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk each thread of a process. Convert 4 call sites in {set/get}priority and ioprio_{set/get}. Signed-off-by: Ken Chen <kenchen@google.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-20 15:40:32 -07:00
Pavel Emelyanov	ff9bc512f1	binfmt_misc: fix false -ENOEXEC when coupled with other binary handlers In case the binfmt_misc binary handler is registered before the e.g. script one (when for example being compiled as a module) the following situation may occur: 1. user launches a script, whose interpreter is a misc binary; 2. the load_misc_binary sets the misc_bang and returns -ENOEVEC, since the binary is a script; 3. the load_script_binary loads one and calls for search_binary_hander to run the interpreter; 4. the load_misc_binary is called again, but refuses to load the binary due to misc_bang bit set. The fix is to move the misc_bang setting lower - prior to the actual call to the search_binary_handler. Caused by the commit `3a2e7f47` (binfmt_misc.c: avoid potential kernel stack overflow) Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Tested-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: <stable@kernel.org> [2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-20 15:40:31 -07:00
Clement Calmels	1804dc6e14	/proc/self/maps doesn't display the real file offset This addresses http://bugzilla.kernel.org/show_bug.cgi?id=11318 In function show_map (file: fs/proc/task_mmu.c), if vma->vm_pgoff > 2^20 than (vma->vm_pgoff << PAGE_SIZE) is greater than 2^32 (with PAGE_SIZE equal to 4096 (i.e. 2^12). The next seq_printf use an unsigned long for the conversion of (vma->vm_pgoff << PAGE_SIZE), as a result the offset value displayed in /proc/self/maps is truncated if the page offset is greater than 2^20. A test that shows this issue: #define _GNU_SOURCE #include <sys/types.h> #include <sys/stat.h> #include <sys/mman.h> #include <stdlib.h> #include <stdio.h> #include <fcntl.h> #include <unistd.h> #include <string.h> #define PAGE_SIZE (getpagesize()) #if __i386__ # define U64_STR "%llx" #elif __x86_64 # define U64_STR "%lx" #else # error "Architecture Unsupported" #endif int main(int argc, char argv[]) { int fd; char addr; off64_t offset = 0x10000000; char filename = "/dev/zero"; fd = open(filename, O_RDONLY); if (fd < 0) { perror("open"); return 1; } offset = 0x10; printf("offset = " U64_STR "\n", offset); addr = (char)mmap64(NULL, PAGE_SIZE, PROT_READ, MAP_PRIVATE, fd, offset); if ((void)addr == MAP_FAILED) { perror("mmap64"); return 1; } { FILE fmaps; char line = NULL; size_t len = 0; ssize_t read; size_t filename_len = strlen(filename); fmaps = fopen("/proc/self/maps", "r"); if (!fmaps) { perror("fopen"); return 1; } while ((read = getline(&line, &len, fmaps)) != -1) { if ((read > filename_len + 1) && (strncmp(&line[read - filename_len - 1], filename, filename_len) == 0)) printf("%s", line); } if (line) free(line); fclose(fmaps); } close(fd); return 0; } [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Clement Calmels <cboulte@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-20 15:40:30 -07:00
Linus Torvalds	1bbe44f69d	Merge branch 'sh/for-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: sh: Provide a FLAT_PLAT_INIT() definition. binfmt_flat: Stub in a FLAT_PLAT_INIT(). video: export sh_mobile_lcdc panel size sh: select memchunk size using kernel cmdline sh: export sh7723 VEU as VEU2H input: migor_ts compile and detection fix sh: remove MSTPCR defines from Migo-R header file sh: Update sh7763rdp defconfig sh: Add support sh7760fb to sh7763rdp board sh: Add support sh_eth to sh7763rdp board sh: Disable 64kB hugetlbpage size when using 64kB PAGE_SIZE. sh: Don't export __{s,u}divsi3_i4i from SH-2 libgcc. fix SH7705_CACHE_32KB compilation sh: mach-x3proto: Fix up smc91x platform data.	2008-08-20 08:46:11 -07:00
Linus Torvalds	5f22ca9b13	vfat: fix 'sync' mount deadlock due to BKL->lock_super conversion There was another FAT BKL conversion deadlock reported by Bart Trojanowski due to the BKL being used as a recursive lock by FAT, which was missed because it only triggers with 'sync' (or 'dirsync') mounts. The recursion worked for the BKL, but after the conversion to lock_super (which uses a mutex), it just deadlocks. Thanks to Bart for debugging this and testing the fix. The lock debugging information from the original report: ============================================= [ INFO: possible recursive locking detected ] 2.6.27-rc3-bisect-00448-ga7f5aaf #16 --------------------------------------------- mv/4020 is trying to acquire lock: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 but task is already holding lock: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 other info that might help us debug this: 3 locks held by mv/4020: #0: (&sb->s_type->i_mutex_key#9/1){--..}, at: [<c01b2336>] do_unlinkat+0x66/0x140 #1: (&sb->s_type->i_mutex_key#9){--..}, at: [<c01b0954>] vfs_unlink+0x84/0x110 #2: (&type->s_lock_key#9){--..}, at: [<c01a90fe>] lock_super+0x1e/0x20 stack backtrace: Pid: 4020, comm: mv Not tainted 2.6.27-rc3-bisect-00448-ga7f5aaf #16 [<c014e694>] validate_chain+0x984/0xea0 [<c0108d70>] ? native_sched_clock+0x0/0xf0 [<c014ee9c>] __lock_acquire+0x2ec/0x9b0 [<c014f5cf>] lock_acquire+0x6f/0x90 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c044e5fd>] mutex_lock_nested+0xad/0x300 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c01a90fe>] ? lock_super+0x1e/0x20 [<c01a90fe>] lock_super+0x1e/0x20 [<f8b3a700>] fat_write_inode+0x60/0x2b0 [fat] [<c0450878>] ? _spin_unlock_irqrestore+0x48/0x80 [<f8b3a953>] ? fat_sync_inode+0x3/0x20 [fat] [<f8b3a962>] fat_sync_inode+0x12/0x20 [fat] [<f8b37c7e>] fat_remove_entries+0xbe/0x120 [fat] [<f8b422ef>] vfat_unlink+0x5f/0x90 [vfat] [<f8b42290>] ? vfat_unlink+0x0/0x90 [vfat] [<c01b0968>] vfs_unlink+0x98/0x110 [<c01b2400>] do_unlinkat+0x130/0x140 [<c016a8f5>] ? audit_syscall_entry+0x105/0x150 [<c01b253b>] sys_unlinkat+0x3b/0x40 [<c01040d3>] sysenter_do_call+0x12/0x3f ======================= where the deadlock is due to the nesting of lock_super from vfat_unlink to fat_write_inode: - do_unlinkat - vfs_unlink - vfat_unlink * lock_super - fat_remove_entries - fat_sync_inode - fat_write_inode * lock_super and the fix is to simply remove the use of lock_super() in fat_write_inode. The lock_super() there had been just an automatic conversion of the kernel lock to the superblock lock, but no locking was actually needed there, since the code in fat_write_inode already protected all relevant accesses with a spinlock (sbi->inode_hash_lock to be exact). The only code inside the BKL (and thus the superblock lock) was accesses tp local variables or calls to functions that have long been SMP-safe (i.e. sb_bread, mark_buffe_dirty and brlese). Bart reports: "Looks good. I ran 10 parallel processes creating 1M files truncating them, writing to them again and then deleting them. This patch fixes the issue I ran into. Signed-off-by: Bart Trojanowski <bart@jukie.net>" Reported-and-tested-by: Bart Trojanowski <bart@jukie.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-20 08:31:19 -07:00
Aneesh Kumar K.V	c4a0c46ec9	ext4: invalidate pages if delalloc block allocation fails. We are a bit agressive in invalidating all the pages. But it is ok because we really don't know why the block allocation failed and it is better to come of the writeback path so that user can look for more info. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2008-08-19 21:08:18 -04:00
Theodore Ts'o	af5bc92dde	ext4: Fix whitespace checkpatch warnings/errors Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 22:25:24 -04:00
Theodore Ts'o	e5f8eab885	ext4: Fix long long checkpatch warnings Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 22:25:04 -04:00
Theodore Ts'o	4776004f54	ext4: Add printk priority levels to clean up checkpatch warnings Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-09-08 23:00:52 -04:00
Mingming Cao	1f7c14c62c	percpu counter: clean up percpu_counter_sum_and_set() percpu_counter_sum_and_set() and percpu_counter_sum() is the same except the former updates the global counter after accounting. Since we are taking the fbc->lock to calculate the precise value of the counter in percpu_counter_sum() anyway, it should simply set fbc->count too, as the percpu_counter_sum_and_set() does. This patch merges these two interfaces into one. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-10-09 12:50:59 -04:00
Steve French	3d2af3465e	[CIFS] Kerberos support not considered experimental anymore Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-19 20:51:09 +00:00
Steve French	c16fefa563	[CIFS] distinguish between Kerberos and MSKerberos in upcall Properly handle MSKRB5 by passing sec=mskrb5 to the upcall so that the spengo blob can be generated appropriately. Also, make decode_negTokenInit prefer whichever mechanism is first in the list. Needed for some NetApp servers, and possibly some older versions of Windows which treat the two KRB5 mechanisms differently. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-19 19:35:33 +00:00
Jeff Layton	cb7691b648	cifs: add local server pointer to cifs_setup_session cifs_setup_session references pSesInfo->server several times. That pointer shouldn't change during the life of the function so grab it once and store it in a local var. This makes the code look a little cleaner too. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-19 17:11:35 +00:00
Ilpo Järvinen	aab3a8c7a3	[CIFS] reindent misindented statement Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-19 14:23:37 +00:00
Jan Kara	97e1cfb086	udf: Fix error paths in udf_new_inode() I case we failed to allocate memory for inode when creating it, we did not properly free block already allocated for this inode. Move memory allocation before the block allocation which fixes this issue (thanks for the idea go to Ingo Oeser <ioe-lkml@rameria.de>). Also remove a few superfluous initializations already done in udf_alloc_inode(). Reviewed-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: Jan Kara <jack@suse.cz>	2008-08-19 11:05:05 +02:00
Jan Kara	db0badc58e	udf: Fix lock inversion between iprune_mutex and alloc_mutex (v2) A memory allocation inside alloc_mutex must not recurse back into the filesystem itself because that leads to lock inversion between iprune_mutex and alloc_mutex (and thus to deadlocks - see traces below). alloc_mutex is actually needed only to update allocation statistics in the superblock so we can drop it before we start allocating memory for the inode. tar D ffff81015b9c8c90 0 6614 6612 ffff8100d5a21a20 0000000000000086 0000000000000000 00000000ffff0000 ffff81015b9c8c90 ffff81015b8f0cd0 ffff81015b9c8ee0 0000000000000000 0000000000000003 0000000000000000 0000000000000000 0000000000000000 Call Trace: [<ffffffff803c1d8a>] __mutex_lock_slowpath+0x64/0x9b [<ffffffff803c1bef>] mutex_lock+0xa/0xb [<ffffffff8027f8c2>] shrink_icache_memory+0x38/0x200 [<ffffffff80257742>] shrink_slab+0xe3/0x15b [<ffffffff802579db>] try_to_free_pages+0x221/0x30d [<ffffffff8025657e>] isolate_pages_global+0x0/0x31 [<ffffffff8025324b>] __alloc_pages_internal+0x252/0x3ab [<ffffffff8026b08b>] cache_alloc_refill+0x22e/0x47b [<ffffffff8026ae37>] kmem_cache_alloc+0x3b/0x61 [<ffffffff8026b15b>] cache_alloc_refill+0x2fe/0x47b [<ffffffff8026b34e>] __kmalloc+0x76/0x9c [<ffffffffa00751f2>] :udf:udf_new_inode+0x202/0x2e2 [<ffffffffa007ae5e>] :udf:udf_create+0x2f/0x16d [<ffffffffa0078f27>] :udf:udf_lookup+0xa6/0xad ... kswapd0 D ffff81015b9d9270 0 125 2 ffff81015b903c28 0000000000000046 ffffffff8028cbb0 00000000fffffffb ffff81015b9d9270 ffff81015b8f0cd0 ffff81015b9d94c0 000000000271b490 ffffe2000271b458 ffffe2000271b420 ffffe20002728dc8 ffffe20002728d90 Call Trace: [<ffffffff8028cbb0>] __set_page_dirty+0xeb/0xf5 [<ffffffff8025403a>] get_dirty_limits+0x1d/0x22f [<ffffffff803c1d8a>] __mutex_lock_slowpath+0x64/0x9b [<ffffffff803c1bef>] mutex_lock+0xa/0xb [<ffffffffa0073f58>] :udf:udf_bitmap_free_blocks+0x47/0x1eb [<ffffffffa007df31>] :udf:udf_discard_prealloc+0xc6/0x172 [<ffffffffa007875a>] :udf:udf_clear_inode+0x1e/0x48 [<ffffffff8027f121>] clear_inode+0x6d/0xc4 [<ffffffff8027f7f2>] dispose_list+0x56/0xee [<ffffffff8027fa5a>] shrink_icache_memory+0x1d0/0x200 [<ffffffff80257742>] shrink_slab+0xe3/0x15b [<ffffffff80257e93>] kswapd+0x346/0x447 ... Reported-by: Tibor Tajti <tibor.tajti@gmail.com> Reviewed-by: Ingo Oeser <ioe-lkml@rameria.de> Signed-off-by: Jan Kara <jack@suse.cz>	2008-08-19 11:04:36 +02:00
Linus Torvalds	45edb89ffd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] mount of IPC$ breaks with iget patch [CIFS] remove trailing whitespace [CIFS] if get root inode fails during mount, cleanup tree connection	2008-08-15 11:02:35 -07:00
Linus Torvalds	21d3bdb160	Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6 * 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6: (29 commits) UBIFS: xattr bugfixes UBIFS: remove unneeded check UBIFS: few commentary fixes UBIFS: fix budgeting request alignment in xattr code UBIFS: improve arguments checking in debugging messages UBIFS: always set i_generation to 0 UBIFS: correct spelling of "thrice". UBIFS: support splice_write UBIFS: minor tweaks in commit UBIFS: reserve more space for index UBIFS: print pid in dump function UBIFS: align inode data to eight UBIFS: improve budgeting checks UBIFS: correct orphan deletion order UBIFS: fix typos in comments UBIFS: do not union creat_sqnum and del_cmtno UBIFS: optimize deletions UBIFS: increment commit number earlier UBIFS: remove another unneeded function parameter UBIFS: remove unneeded function parameter ...	2008-08-15 10:33:07 -07:00
Bob Copeland	9419fc1c95	omfs: fix oops when file metadata is corrupted A fuzzed fileystem image failed with OMFS when the extent count was used in a loop without being checked against the max number of extents. It also provoked a signed division for an array index that was checked as if unsigned, leading to index by -1. omfsck will be updated to fix these cases, in the meantime bail out gracefully. Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-15 08:35:44 -07:00
Bob Copeland	c963343a11	omfs: fix potential oops when directory size is corrupted Testing with a modified fsfuzzer reveals a couple of locations in omfs where filesystem variables are ultimately used as loop counters with insufficient sanity checking. In this case, dir->i_size is used to compute the number of buckets in the directory hash. If too large, readdir will overrun a buffer. Since it's an invariant that dir->i_size is equal to the sysblock size, and we already sanity check that, just use that value instead. This fixes the following oops: BUG: unable to handle kernel paging request at c978e004 IP: [<c032298e>] omfs_readdir+0x18e/0x32f Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC Modules linked in: Pid: 4796, comm: ls Not tainted (2.6.27-rc2 #12) EIP: 0060:[<c032298e>] EFLAGS: 00010287 CPU: 0 EIP is at omfs_readdir+0x18e/0x32f EAX: c978d000 EBX: 00000000 ECX: cbfcfaf8 EDX: cb2cf100 ESI: 00001000 EDI: 00000800 EBP: cb2d3f68 ESP: cb2d3f0c DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process ls (pid: 4796, ti=cb2d3000 task=cb175f40 task.ti=cb2d3000) Stack: 00000002 00000000 00000000 c018a820 cb2d3f94 cb2cf100 cbfb0000 ffffff10 cbfb3b80 cbfcfaf8 000001c9 00000a09 00000000 00000000 00000000 cbfcfbc8 c9697000 cbfb3b80 22222222 00001000 c08e6cd0 cb2cf100 cbfb3b80 cb2d3f88 Call Trace: [<c018a820>] ? filldir64+0x0/0xcd [<c018a9f2>] ? vfs_readdir+0x56/0x82 [<c018a820>] ? filldir64+0x0/0xcd [<c018aa7c>] ? sys_getdents64+0x5e/0xa0 [<c01038bd>] ? sysenter_do_call+0x12/0x31 ======================= Code: 00 89 f0 89 f3 0f ac f8 14 81 e3 ff ff 0f 00 48 8d 14 c5 b8 01 00 00 89 45 cc 89 55 f0 e9 8c 01 00 00 8b 4d c8 8b 75 f0 8b 41 18 <8b> 54 30 04 8b 04 30 31 f6 89 5d dc 89 d1 8b 55 b8 0f c8 0f c9 Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-15 08:35:44 -07:00
Chris Mason	7d455e0030	fs/inode.c: properly init address_space->writeback_index write_cache_pages() uses i_mapping->writeback_index to pick up where it left off the last time a given inode was found by pdflush or balance_dirty_pages (or anyone else who sets wbc->range_cyclic) alloc_inode() should set it to a sane value so that writeback doesn't start in the middle of a file. It is somewhat difficult to notice the bug since write_cache_pages will loop around to the start of the file and the elevator helps hide the resulting seeks. For whatever reason, Btrfs hits this often. Unpatched, untarring 30 copies of the linux kernel in series runs at 47MB/s on a single sata drive. With this fix, it jumps to 62MB/s. Signed-off-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-15 08:35:44 -07:00
Artem Bityutskiy	c78c7e35a4	UBIFS: xattr bugfixes Xattr code has not been tested for a while and there were serveral bugs. One of them is using wrong inode in 'ubifs_jnl_change_xattr()'. The other is a deadlock in 'ubifs_setxattr()': the i_mutex is locked in 'cap_inode_need_killpriv()' path, so deadlock happens when 'ubifs_setxattr()' tries to lock it again. Thanks to Zoltan Sogor for finding these bugs. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-14 12:46:20 +03:00
Steve French	ad661334b8	[CIFS] mount of IPC$ breaks with iget patch In looking at network named pipe support on cifs, I noticed that Dave Howell's iget patch: iget: stop CIFS from using iget() and read_inode() broke mounts to IPC$ (the interprocess communication share), and don't handle the error case (when getting info on the root inode fails). Thanks to Gunter who noted a typo in a debug line in the original version of this patch. CC: David Howells <dhowells@redhat.com> CC: Gunter Kukkukk <linux@kukkukk.com> CC: Stable Kernel <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-14 03:55:14 +00:00
David Howells	9e2b2dc413	CRED: Introduce credential access wrappers The patches that are intended to introduce copy-on-write credentials for 2.6.28 require abstraction of access to some fields of the task structure, particularly for the case of one task accessing another's credentials where RCU will have to be observed. Introduced here are trivial no-op versions of the desired accessors for current and other tasks so that other subsystems can start to be converted over more easily. Wrappers are introduced into a new header (linux/cred.h) for UID/GID, EUID/EGID, SUID/SGID, FSUID/FSGID, cap_effective and current's subscribed user_struct. These wrappers are macros because the ordering between header files mitigates against making them inline functions. linux/cred.h is #included from linux/sched.h. Further, XFS is modified such that it no longer defines and uses parameterised versions of current_fs[ug]id(), thus getting rid of the namespace collision otherwise incurred. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>	2008-08-14 09:35:23 +10:00
Linus Torvalds	9ea319b616	Merge git://oss.sgi.com:8090/xfs/linux-2.6 * git://oss.sgi.com:8090/xfs/linux-2.6: (45 commits) [XFS] Fix use after free in xfs_log_done(). [XFS] Make xfs_bmap__count_leaves void. [XFS] Use KM_NOFS for debug trace buffers [XFS] use KM_MAYFAIL in xfs_mountfs [XFS] refactor xfs_mount_free [XFS] don't call xfs_freesb from xfs_unmountfs [XFS] xfs_unmountfs should return void [XFS] cleanup xfs_mountfs [XFS] move root inode IRELE into xfs_unmountfs [XFS] stop using file_update_time [XFS] optimize xfs_ichgtime [XFS] update timestamp in xfs_ialloc manually [XFS] remove the sema_t from XFS. [XFS] replace dquot flush semaphore with a completion [XFS] replace inode flush semaphore with a completion [XFS] extend completions to provide XFS object flush requirements [XFS] replace the XFS buf iodone semaphore with a completion [XFS] clean up stale references to semaphores [XFS] use get_unaligned_ helpers [XFS] Fix compile failure in xfs_buf_trace() ...	2008-08-13 15:17:49 -07:00
David Teigland	51409340d2	dlm: rename structs Add a dlm_ prefix to the struct names in config.c. This resolves a conflict with struct node in particular, when include/linux/node.h happens to be included. Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Teigland <teigland@redhat.com>	2008-08-13 12:47:36 -05:00
David Teigland	cb980d9a3e	dlm: add missing kfrees A couple of unlikely error conditions were missing a kfree on the error exit path. Reported-by: Juha Leppanen <juha_motorsportcom@luukku.com> Signed-off-by: David Teigland <teigland@redhat.com>	2008-08-13 12:47:36 -05:00
Artem Bityutskiy	720b499c80	UBIFS: remove unneeded check Commit `d70b67c8bc` fixed VFS and it never calls FS lookup function in deleted directories now. We may remove corresponding UBIFS check. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 18:59:09 +03:00
Artem Bityutskiy	0a883a05c5	UBIFS: few commentary fixes Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 18:59:02 +03:00
Bob Peterson	72dbf4790f	GFS2: rm on multiple nodes causes panic This patch fixes a problem whereby simultaneous unlink, rmdir, rename and link operations (e.g. rm -fR *) from multiple nodes on the same GFS2 file system can cause kernel panics, hangs, and/or memory corruption. It also gets rid of all the non-rgrp calls to gfs2_glock_nq_m. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-08-13 10:00:12 +01:00
Steven Whitehouse	9b8df98fc8	GFS2: Fix metafs mounts This patch is intended to fix the issues reported in bz #457798. Instead of having the metafs as a separate filesystem, it becomes a second root of gfs2. As a result it will appear as type gfs2 in /proc/mounts, but it is still possible (for backwards compatibility purposes) to mount it as type gfs2meta. A new mount flag "meta" is introduced so that its possible to tell the two cases apart in /proc/mounts. As a result it becomes possible to mount type gfs2 with -o meta and get the same result as mounting type gfs2meta. So it is possible to mount just the metafs on its own. Currently if you do this, its then impossible to mount the "normal" root of the gfs2 filesystem without first unmounting the metafs root. I'm not sure if thats a feature or a bug :-) Either way, this is a great improvement on the previous scheme and I've verified that it works ok with bind mounts on both the "normal" root and the metafs root in various combinations. There were also a bunch of functions in super.c which didn't belong there, so this moves them into ops_fstype.c where they can be static. Hopefully the mount/umount sequence is now more obvious as a result. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com>	2008-08-13 09:59:40 +01:00
Steven Whitehouse	c1e817d03a	GFS2: Fix debugfs glock file iterator Due to an incorrect iterator, some glocks were being missed from the glock dumps obtained via debugfs. This patch fixes the problem and ensures that we don't miss any glocks in future. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2008-08-13 09:59:10 +01:00
Zoltan Sogor	5acd6ff8ac	UBIFS: fix budgeting request alignment in xattr code Data length has to be aligned in the budgeting request. Code in xattr.c did not do this. Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:43:56 +03:00
Artem Bityutskiy	840dc6b891	UBIFS: improve arguments checking in debugging messages Use "if (0) printk()" construct in debugging print macros to make the debugging messages be checked even if debugging is off. This patch also removes some unneeded spaces and blank lines. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:42:47 +03:00
Adrian Hunter	81ffa38e15	UBIFS: always set i_generation to 0 UBIFS does not presently re-use inode numbers, so leaving i_generation zero is most appropriate for now. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:39:53 +03:00
Adrian Hunter	3a13252c6f	UBIFS: correct spelling of "thrice". Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:39:20 +03:00
Zoltan Sogor	22bc7fa8c5	UBIFS: support splice_write Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:38:43 +03:00
Artem Bityutskiy	0010f18afc	UBIFS: minor tweaks in commit No functional changes, just lessen the amount of indentations. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:38:19 +03:00
Artem Bityutskiy	b364b41aeb	UBIFS: reserve more space for index At the moment UBIFS reserves twice old index size space for the index. But this is not enough in some cases, because if the indexing node are very fragmented and there are many small gaps, while the dirty index has big znodes - in-the-gaps method would fail. Thus, reserve trise as more, in which case we are guaranteed that we can commit in any case. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:37:28 +03:00
Artem Bityutskiy	1de9415906	UBIFS: print pid in dump function Useful when something fails and there are many processes racing. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:35:58 +03:00
Artem Bityutskiy	dab4b4d2f9	UBIFS: align inode data to eight UBIFS aligns node lengths to 8, so budgeting has to do the same. Well, direntry, inode, and page budgets are already aligned, but not inode data budget (e.g., data in special devices or symlinks). Do this for inode data as well. Also, add corresponding debugging checks. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:35:16 +03:00
Artem Bityutskiy	547000da64	UBIFS: improve budgeting checks Budgeting is a crucial UBIFS subsystem - add more assertions to improve requests checking. This is not compiled in when UBIFS debugging is disabled. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:34:27 +03:00
Adrian Hunter	f769108424	UBIFS: correct orphan deletion order The debug function that checks orphans, does so using the TNC mutex. That means it will not see a correct picture if the inode is removed from the orphan tree before it is removed from TNC. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:32:53 +03:00
Adrian Hunter	7d62ff2c39	UBIFS: fix typos in comments Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:32:21 +03:00
Adrian Hunter	bc813355c7	UBIFS: do not union creat_sqnum and del_cmtno The values in these two fields need to be preserved independently and so a union cannot be used. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:30:04 +03:00
Artem Bityutskiy	de94eb558b	UBIFS: optimize deletions Every time anything is deleted, UBIFS writes the deletion inode node twice - once in 'ubifs_jnl_update()' and the second time in 'ubifs_jnl_write_inode()'. However, the second write is not needed if no commit happened after 'ubifs_jnl_update()'. This patch checks that condition and avoids writing the deletion inode for the second time. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:28:44 +03:00
Artem Bityutskiy	014eb04b03	UBIFS: increment commit number earlier Increment the commit number at the beginnig of the commit, instead of doing this after the commit. This is needed for further optimizations. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:27:47 +03:00
Artem Bityutskiy	fd6c6b51e3	UBIFS: remove another unneeded function parameter The 'last_reference' parameter of 'pack_inode()' is not really needed because 'inode->i_nlink' may be tested instead. Zap it. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:27:10 +03:00
Artem Bityutskiy	1f28681ad3	UBIFS: remove unneeded function parameter Simplify 'ubifs_jnl_write_inode()' by removing the 'deletion' parameter which is not really needed because we may test inode->i_nlink and check whether this is a deletion or not. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:26:25 +03:00
Artem Bityutskiy	fbfa6c884a	UBIFS: do not write orphans back Orphan inodes are deleted inodes which will disappear after FS re-mount. There is not need to write orphan inodes back, because they are not needed on the flash media. So optimize orphans a little by not writing them back. Just mark them as clean, free the budget, and report success to VFS. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:25:27 +03:00
Adrian Hunter	ff46d7b3e0	UBIFS: make ubifs_ro_mode() not inline We use ubifs_ro_mode() quite a lot, and not in fast-path, so there is no reason to blow the code up by having it inlined. Also, we usually want R/O mode change to be seen to other CPUs as soon as possible, so when we make this a function call, we will automatically have a memory barrier. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:24:26 +03:00
Adrian Hunter	2fb42b11f6	UBIFS: ensure UBIFS switches to read-only on error UBI transparently handles write errors by automatically copying and remapping the affected eraseblock. If UBI is unable to do that, for example its pool of eraseblocks reserved for bad block handling is empty, then the error is propagated to UBIFS. UBIFS must protect the media from falling into an inconsistent state by immediately switching to read-only mode. In the case of log updates, this was not being done. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:24:00 +03:00
Adrian Hunter	16dfd804b4	UBIFS: fix error return in failure mode UBIFS recovery testing debug facility simulates media failures. When simulating an IO error, the error code returned must be -EIO but it was not always if the user switched off the debug recovery testing option at the same time. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-08-13 11:22:41 +03:00
Artem Bityutskiy	1e0f358e29	UBIFS: free budget in delete_inode as well Although the inode is marked as clean when it is being deleted, it might stay and be used as orphan, and be marked as dirty. So we have to free the budget when we delete it. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:22:09 +03:00
Artem Bityutskiy	7d32c2bb14	UBIFS: improve debugging 1. Print inode mode in some of debugging messages 2. Add few more useful assertions Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:20:07 +03:00
Artem Bityutskiy	182854b46f	UBIFS: fix budgeting calculations The 'ubifs_release_dirty_inode_budget()' was buggy and incorrectly freed the budget, which led to not freeing all dirty data budget. This patch fixes that. Also, this patch fixes ubifs_mkdir() which passed 1 in dirty_ino_d, which makes no sense. Well, it is harmless though. Also, add few more useful assertions. And improve few debugging messages. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:20:05 +03:00
Artem Bityutskiy	ce769caa50	UBIFS: print volume name as well We encouredge people to mount using volume name, not device numbers. So print the name of the mounted UBI volume, not just IDs. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:15:50 +03:00
Lachlan McIlroy	c6a7b0f8a4	[XFS] Fix use after free in xfs_log_done(). The ticket allocation code got reworked in 2.6.26 and we now free tickets whereas before we used to cache them so the use-after-free went undetected. SGI-PV: 985525 SGI-Modid: xfs-linux-melb:xfs-kern:31877a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>	2008-08-13 16:52:50 +10:00
Ruben Porras	c94312de22	[XFS] Make xfs_bmap_*_count_leaves void. xfs_bmap_count_leaves and xfs_bmap_disk_count_leaves always return always 0, make them void. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31844a Signed-off-by: Ruben Porras <ruben.porras@linworks.de> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:52:25 +10:00
Lachlan McIlroy	5695ef46ef	[XFS] Use KM_NOFS for debug trace buffers Use KM_NOFS to prevent recursion back into the filesystem which can cause deadlocks. In the case of xfs_iread() we hold the lock on the inode cluster buffer while allocating memory for the trace buffers. If we recurse back into XFS to flush data that may require a transaction to allocate extents which needs log space. This can deadlock with the xfsaild thread which can't push the tail of the log because it is trying to get the inode cluster buffer lock. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31838a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>	2008-08-13 16:51:57 +10:00
Christoph Hellwig	d62c251fe4	[XFS] use KM_MAYFAIL in xfs_mountfs Use KM_MAYFAIL for the m_perag allocation, we can deal with the error easily and blocking forever during mount is not a good idea either. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31837a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:51:29 +10:00
Christoph Hellwig	ff4f038c6b	[XFS] refactor xfs_mount_free xfs_mount_free mostly frees the perag data, which is something that is duplicated in the mount error path. Move the XFS_QM_DONE call to the caller and remove the useless mutex_destroy/spinlock_destroy calls so that we can re-use it for the mount error path. Also rename it to xfs_free_perag to reflect what it does. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31836a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:50:47 +10:00
Christoph Hellwig	6203300e5e	[XFS] don't call xfs_freesb from xfs_unmountfs xfs_readsb is called before xfs_mount so xfs_freesb should be called after xfs_unmountfs, too. This means it now happens after a few things during the of xfs_unmount which all have nothing to do with the superblock. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31835a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:50:21 +10:00
Christoph Hellwig	41b5c2e77a	[XFS] xfs_unmountfs should return void xfs_unmounts can't and shouldn't return errors so declare it as returning void. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31833a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:49:57 +10:00
Christoph Hellwig	4249023a5d	[XFS] cleanup xfs_mountfs Remove all the useless flags and code keyed off it in xfs_mountfs. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31831a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:49:32 +10:00
Christoph Hellwig	77508ec8e6	[XFS] move root inode IRELE into xfs_unmountfs The root inode is allocated in xfs_mountfs so it should be release in xfs_unmountfs. For the unmount case that means we do it after the the xfs_sync(mp, SYNC_WAIT \| SYNC_CLOSE) in the forced shutdown case and the dmapi unmount event. Note that both reference the rip variable which might be freed by that time in case inode flushing has kicked in, so strictly speaking this might count as a bug fix SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31830a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:49:04 +10:00
Christoph Hellwig	3a76c1ea07	[XFS] stop using file_update_time xfs_ichtime updates the xfs_inode and Linux inode timestamps just fine, no need to call file_update_time and then copy the values over to the XFS inode. The only additional thing in file_update_time are checks not applicable to the write path. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31829a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>	2008-08-13 16:48:12 +10:00
Christoph Hellwig	8e5975c82f	[XFS] optimize xfs_ichgtime Port a little optmization from file_update_time to xfs_ichgtime, and only update the timestamp and mark the inode dirty if the timestamp actually changes in the timer tick resultion supported by the running kernel. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31827a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:45:13 +10:00
Christoph Hellwig	dff35fd41f	[XFS] update timestamp in xfs_ialloc manually In xfs_ialloc we just want to set all timestamps to the current time. We don't need to mark the inode dirty like xfs_ichgtime does, and we don't need nor want the opimizations in xfs_ichgtime that I will introduce in the next patch. So just opencode the timestamp update in xfs_ialloc, and remove the new unused XFS_ICHGTIME_ACC case in xfs_ichgtime. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31825a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:44:15 +10:00
David Chinner	ab4a9b04a3	[XFS] remove the sema_t from XFS. Now that all users of the sema_t are gone from XFS we can finally kill it. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31823a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:42:10 +10:00
David Chinner	e1f49cf20c	[XFS] replace dquot flush semaphore with a completion Use the new completion flush code to implement the dquot flush lock. Removes one of the final users of semaphores in the XFS code base. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31822a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:41:43 +10:00
David Chinner	c63942d3ee	[XFS] replace inode flush semaphore with a completion Use the new completion flush code to implement the inode flush lock. Removes one of the final users of semaphores in the XFS code base. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31817a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:41:16 +10:00
David Chinner	b4dd330b9e	[XFS] replace the XFS buf iodone semaphore with a completion The xfs_buf_t b_iodonesema is really just a semaphore that wants to be a completion. Change it to a completion and remove the last user of the sema_t from XFS. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31815a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:36:11 +10:00
David Chinner	12017faf38	[XFS] clean up stale references to semaphores A lot of code has been converted away from semaphores, but there are still comments that reference semaphore behaviour. The log code is the worst offender. Update the comments to reflect what the code really does now. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31814a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:34:31 +10:00
Harvey Harrison	597bca6378	[XFS] use get_unaligned_* helpers SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31813a Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:29:21 +10:00
Lachlan McIlroy	d63f154a36	[XFS] Fix compile failure in xfs_buf_trace() SGI-PV: 957103 SGI-Modid: xfs-linux-melb:xfs-kern:31804a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:28:40 +10:00
Christoph Hellwig	169d6227a7	[XFS] Use the same btree_cur union member for alloc and inobt trees. The alloc and inobt btree use the same agbp/agno pair in the btree_cur union. Make them use the same bc_private.a union member so that code for these two short form btree implementations can be shared. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31788a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:25:27 +10:00
Christoph Hellwig	cdcf43335c	[XFS] small cleanups in xfs_btree.c Remove unneeded xfs_btree_get_block forward declaration. Move xfs_btree_firstrec next to xfs_btree_lastrec. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31787a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:23:50 +10:00
Christoph Hellwig	41be8bed1f	[XFS] sanitize xfs_initialize_vnode Sanitize setting up the Linux indode. Setting up the xfs_inode <-> inode link is opencoded in xfs_iget_core now because that's the only place it needs to be done, xfs_initialize_vnode is renamed to xfs_setup_inode and loses all superflous paramaters. The check for I_NEW is removed because it always is true and the di_mode check moves into xfs_iget_core because it's only needed there. xfs_set_inodeops and xfs_revalidate_inode are merged into xfs_setup_inode and the whole things is moved into xfs_iops.c where it belongs. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31782a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:23:13 +10:00
Christoph Hellwig	5ec7f8c7d1	[XFS] kill bhv_vnode_t All remaining bhv_vnode_t instance are in code that's more or less Linux specific. (Well, for xfs_acl.c that could be argued, but that code is on the removal list, too). So just do an s/bhv_vnode_t/struct inode/ over the whole tree. We can clean up variable naming and some useless helpers later. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31781a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:22:40 +10:00
Christoph Hellwig	df80c933f9	[XFS] remove some easy bhv_vnode_t instances In various places we can just move a VFS_I call into the argument list of called functions/macros instead of having a local bhv_vnode_t. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31776a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:22:09 +10:00
Christoph Hellwig	e1cccd917b	[XFS] kill xfs_lock_dir_and_entry When multiple inodes are locked in XFS it happens in order of the inode number, with the everything but the first inode trylocked if any of the previous inodes is in the AIL. Except for the sorting of the inodes this logic is implemented in xfs_lock_inodes, but also partially duplicated in xfs_lock_dir_and_entry in a particularly stupid way adds a lock roundtrip if the inode ordering is not optimal. This patch adds a new helper xfs_lock_two_inodes that takes two inodes and locks them in the most optimal way according to the above locking protocol and uses it for all places that want to lock two inodes. The only caller of xfs_lock_inodes is xfs_rename which might lock up to four inodes. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31772a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:18:07 +10:00
Christoph Hellwig	1550d0b0b0	[XFS] kill INDUCE_IO_ERROR All the error injection is already enabled through ifdef DEBUG, so kill the never set second cpp symbol to activate it without the rest of the debugging infrastructure. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31771a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:17:37 +10:00
Christoph Hellwig	907f49a8f5	[XFS] implement IHOLD/IRELE directly Now that all direct calls to VN_HOLD/VN_RELE are gone we can implement IHOLD/IRELE directly. For the IHOLD case also replace igrab with a direct increment of i_count because we are guaranteed to already have a live and referenced inode by the VFS. Also remove the vn_hold statistic because it's been rather meaningless for some time with most references done by other callers. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31764a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:13:45 +10:00
Christoph Hellwig	0b1f917730	[XFS] remove remaining VN_HOLD calls Use IHOLD(ip) instead of VN_HOLD(VFS_I(ip)). SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31765a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:13:09 +10:00
Christoph Hellwig	604323ca76	[XFS] remove spurious VN_HOLD/VN_RELE calls from xfs_acl.c All the ACL routines are called from inode operations which are guaranteed to have a referenced inode by the VFS, so there's no need for the ACL code to grab another temporary one. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31763a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:12:37 +10:00
Christoph Hellwig	863890cd90	[XFS] kill vn_to_inode bhv_vnode_t is just a typedef for struct inode, so there's no need for a helper to convert between the two. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31761a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:12:05 +10:00
Christoph Hellwig	a19d033cd2	[XFS] Remove vn_from_inode() bhv_vnode_t is just a typedef for struct inode, so there's no need for a helper to convert between the two. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31760a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:11:26 +10:00
Eric Sandeen	39dab9d7da	[XFS] remove shouting-indirection macros from xfs_trans.h SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31758a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:10:52 +10:00
Eric Sandeen	db7a2c71d2	[XFS] convert xfs to use ERR_CAST Looks like somehow xfs got missed in the conversion that took place in `e231c2ee64`, "Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) <http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit diff;h=e231c2ee64eb1c5cd3c63c31da9dac7d888dcf7f>" SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31757a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:09:25 +10:00
Eric Sandeen	cdeb380aa2	[XFS] remove INT_GET and friends Thanks to hch's endian work, INT_GET etc are no longer used, and may as well be removed. INT_SET is still used in the acl code, though. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31756a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:07:53 +10:00
Niv Sardi	322ff6b8cd	[XFS] Move xfs_attr_rolltrans to xfs_trans_roll Move it from the attr code to the transaction code and make the attr code call the new function. We rolltrans is really usefull whenever we want to use rolling transaction, should be generic, it isn't dependent on any part of the attr code anyway. We use this excuse to change all the: if ((error = xfs_attr_rolltrans())) calls into: error = xfs_trans_roll(); if (error) SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31729a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:05:49 +10:00
Christoph Hellwig	a738159df2	[XFS] don't leak m_fsname/m_rtname/m_logname Add a helper to free the m_fsname/m_rtname/m_logname allocations and use it properly for all mount failure cases. Also switch the allocations for these to kstrdup while we're at it. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31728a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:04:05 +10:00
Niv Sardi	5e9da7b7a1	[XFS] Move attr log alloc size calculator to another function. We will need that to be able to calculate the size of log we need for a specific attr (for Create+EA). The local flag is needed so that we can fail if we run into ENOSPC when trying to alloc blocks. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31727a Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:03:35 +10:00
David Chinner	6785073ba1	[XFS] Use KM_NOFS for incore inode extent tree allocation V2 If we allow incore extent tree allocations to recurse into the filesystem under memory pressure, new delayed allocations through xfs_iomap_write_delay() can deadlock on themselves if memory reclaim tries to write back dirty pages from that inode. It will deadlock in xfs_iomap_write_allocate() trying to take the ilock we already hold. This can also show up as complex ABBA deadlocks when multiple threads are triggering memory reclaim when trying to allocate extents. The main cause of this is the fact that delayed allocation is not done in a transaction, so KM_NOFS is not automatically added to the allocations to prevent this recursion. Mark all allocations done for the incore inode extent tree as KM_NOFS to ensure they never recurse back into the filesystem. Version 2: o KM_NOFS implies KM_SLEEP, so just use KM_NOFS SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31726a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:02:51 +10:00
David Chinner	e6064d30c3	[XFS] XFS: Kill xfs_vtoi() xfs_vtoi() is redundant and only unsed in small sections of code. Replace them with widely used XFS_I() inline and kill xfs_vtoi(). SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31725a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:01:45 +10:00
David Chinner	e4f7529108	[XFS] Kill shouty XFS_ITOV() macro Replace XFS_ITOV() with the new VFS_I() inline. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31724a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 16:00:45 +10:00
David Chinner	705db4a24e	[XFS] kill shouty XFS_ITOV_NULL macro Replace XFS_ITOV_NULL() with the new VFS_I() inline. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31722a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 15:47:43 +10:00
David Chinner	0165164625	[XFS] Avoid directly referencing the VFS inode. In several places we directly convert from the XFS inode to the linux (VFS) inode by a simple deference of ip->i_vnode. We should not do this - a helper function should be used to extract the VFS inode from the XFS inode. Introduce the function VFS_I() to extract the VFS inode from the XFS inode. The name was chosen to match XFS_I() which is used to extract the XFS inode from the VFS inode. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31720a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 15:45:15 +10:00
Lachlan McIlroy	3790689fa3	[XFS] Do not access buffers after dropping reference count We should not access a buffer after dropping it's reference count otherwise we could race with another thread that releases the final reference count and frees the buffer causing us to access potentially unmapped memory. The bug this change fixes only occured on DEBUG XFS since the offending code was in an ASSERT. SGI-PV: 984429 SGI-Modid: xfs-linux-melb:xfs-kern:31715a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <david@fromorbit.com>	2008-08-13 15:42:10 +10:00
David Chinner	79071eb0b2	[XFS] Use the generic bitops rather than implementing them ourselves. This keeps xfs_lowbit64 as it was since there aren't good generic helpers there ... Patch inspired by Andi Kleen. SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31472a Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-08-13 15:41:12 +10:00
Linus Torvalds	b0e0c9e7f6	Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux * 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: fs/nfsd/export.c: Adjust error handling code involving auth_domain_put MAINTAINERS: mention lockd and sunrpc in nfs entries lockd: trivial sparse endian annotations	2008-08-12 16:39:22 -07:00
Alexey Dobriyan	50ac2d694f	seq_file: add seq_cpumask(), seq_nodemask() Short enough reads from /proc/irq/*/smp_affinity return -EINVAL for no good reason. This became noticed with NR_CPUS=4096 patches, when length of printed representation of cpumask becase 1152, but cat(1) continued to read with 1024-byte chunks. bitmap_scnprintf() in good faith fills buffer, returns 1023, check returns -EINVAL. Fix it by switching to seq_file, so handler will just fill buffer and doesn't care about offsets, length, filling EOF and all this crap. For that add seq_bitmap(), and wrappers around it -- seq_cpumask() and seq_nodemask(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Paul Jackson <pj@sgi.com> Cc: Mike Travis <travis@sgi.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Huang Weiyi	dd763460eb	reiserfs: removed duplicated #include Removed duplicated #include <linux/quotaops.h> in fs/reiserfs/super.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Andrew Morton	523723bb50	fs/eventpoll.c: fix sys_epoll_create1() comment The `size' argument was removed. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ulrich Drepper <drepper@redhat.com> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-12 16:07:30 -07:00
Steve French	54b4602d5f	[CIFS] remove trailing whitespace Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-11 22:31:40 +00:00
Steve French	2c731afb0d	[CIFS] if get root inode fails during mount, cleanup tree connection CC: Stable Kernel <stable@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-11 22:28:53 +00:00
Ingo Molnar	23a0ee908c	Merge branch 'core/locking' into core/urgent	2008-08-12 00:11:49 +02:00
Takashi YOSHII	74c27c43eb	binfmt_flat: Stub in a FLAT_PLAT_INIT(). This provides a FLAT_PLAT_INIT() arch hook for platforms that need to set up specific register state prior to calling in to the process, as per ELF_PLAT_INIT(). Signed-off-by: Takashi YOSHII <yoshii.takashi@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-08-11 20:17:55 +09:00
Ingo Molnar	3295f0ef9f	lockdep: rename map_[acquire\|release]() => lock_map_[acquire\|release]() the names were too generic: drivers/uio/uio.c:87: error: expected identifier or '(' before 'do' drivers/uio/uio.c:87: error: expected identifier or '(' before 'while' drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function) Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 10:30:30 +02:00
Peter Zijlstra	4f3e7524b2	lockdep: map_acquire Most the free-standing lock_acquire() usages look remarkably similar, sweep them into a new helper. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-11 09:30:23 +02:00
Linus Torvalds	56831a1a88	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] list entry can not return null turn cifs_setattr into a multiplexor that calls the correct function move file time and dos attribute setting logic into new function spin off cifs_setattr with unix extensions to its own function [CIFS] Code cleanup in old sessionsetup code [CIFS] cifs_mkdir and cifs_create should respect the setgid bit on parent dir Rename CIFSSMBSetFileTimes to CIFSSMBSetFileInfo and add PID arg change CIFSSMBSetTimes to CIFSSMBSetPathInfo [CIFS] fix trailing whitespace bundle up Unix SET_PATH_INFO args into a struct and change name Fix missing braces in cifs_revalidate() remove locking around tcpSesAllocCount atomic variable [CIFS] properly account for new user= field in SPNEGO upcall string allocation [CIFS] remove level of indentation from decode_negTokenInit [CIFS] cifs send2 not retrying enough in some cases on full socket [CIFS] oid should also be checked against class in cifs asn	2008-08-08 16:18:34 -07:00
Steve French	ad8b15f0ff	[CIFS] list entry can not return null Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-08 21:10:16 +00:00
Adrian Bunk	f1c7f79b6a	[NFSD] uninline nfsd4_op_name() There doesn't seem to be a compelling reason why nfsd4_op_name() is marked as "inline": It's only used in a dprintk(), and as long as it has only one caller non-ancient gcc versions anyway inline it automatically. This patch fixes the following compile error with gcc 3.4: ... CC fs/nfsd/nfs4proc.o nfs4proc.c: In function `nfsd4_proc_compound': nfs4proc.c:854: sorry, unimplemented: inlining failed in call to nfs4proc.c:897: sorry, unimplemented: called from here make[3]: *** [fs/nfsd/nfs4proc.o] Error 1 Reported-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> [ Also made it "const char *" - Linus] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-08 11:22:19 -07:00
Jeff Layton	0510eeb736	turn cifs_setattr into a multiplexor that calls the correct function Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 22:39:46 +00:00
Jeff Layton	feb3e20cee	move file time and dos attribute setting logic into new function Break up cifs_setattr further by moving the logic that sets file times and dos attributes into a separate function. This patch also refactors the logic a bit so that when the file is already open then we go ahead and do a SetFileInfo call. SetPathInfo seems to be unreliable when setting times on open files. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 22:28:06 +00:00
Jeff Layton	3fe5c1dd0a	spin off cifs_setattr with unix extensions to its own function Create a new cifs_setattr_unix function to handle a setattr when unix extensions are enabled and have cifs_setattr call it. Also, clean up variable declarations in cifs_setattr. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 22:14:52 +00:00
Denis ChengRq	1ac0ae062c	bio: make use of bvec_nr_vecs Since introduced in `7ba1ba12ee`, it should be made use of. Signed-off-by: Denis ChengRq <crquan@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-08-06 12:30:04 +02:00
Steve French	26b994fad6	[CIFS] Code cleanup in old sessionsetup code Remove some long lines Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 05:11:33 +00:00
Jeff Layton	9508991093	[CIFS] cifs_mkdir and cifs_create should respect the setgid bit on parent dir If a server supports unix extensions but does not support POSIX create routines, then the client will create a new inode with a standard SMB mkdir or create/open call and then will set the mode. When it does this, it does not take the setgid bit on the parent directory into account. This patch has CIFS flip on the setgid bit when the parent directory has it. If the share is mounted with "setuids" then also change the group owner to the gid of the parent. This patch should apply cleanly on top of the setattr cleanup patches that I sent a few weeks ago. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 04:39:02 +00:00
Jeff Layton	2dd2dfa060	Rename CIFSSMBSetFileTimes to CIFSSMBSetFileInfo and add PID arg The new name is more clear since this is also used to set file attributes. We'll need the pid_of_opener arg so that we can pass in filehandles of other pids and spare ourselves an open call. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 04:24:50 +00:00
Jeff Layton	6fc000e519	change CIFSSMBSetTimes to CIFSSMBSetPathInfo CIFSSMBSetTimes is a deceptive name. This function does more that just set file times. Change it to CIFSSMBSetPathInfo, which is closer to its real purpose. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 04:24:01 +00:00
Steve French	063ea27925	[CIFS] fix trailing whitespace Jeff left trailing whitespace in previous patch Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 04:23:13 +00:00
Jeff Layton	4e1e7fb9e8	bundle up Unix SET_PATH_INFO args into a struct and change name We'd like to be able to use the unix SET_PATH_INFO_BASIC args to set file times as well, but that makes the argument list rather long. Bundle up the args for unix SET_PATH_INFO call into a struct. For now, we don't actually use the times fields anywhere. That will be done in a follow-on patch. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-06 04:17:20 +00:00
Alexander Beregalov	7c44319dc6	proc: fix warnings proc: fix warnings fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 4 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 5 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 6 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 7 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 8 has type 'u64' fs/proc/base.c:2429: warning: format '%llu' expects type 'long long unsigned int', but argument 9 has type 'u64' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Acked-by: Andrea Righi <righi.andrea@gmail.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:50 -07:00
Alexander Beregalov	dc60bf1d83	omfs: fix warning fs/omfs/inode.c:495: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'u64' fs/omfs/inode.c:495: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type '__be64' Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-05 14:33:49 -07:00
Suresh Jayaraman	9e96af8525	Fix missing braces in cifs_revalidate() Fix missing braces introduced during commit `cea218054a`. Though setting wbrc to 0 keeps this from causing real bug, this should have been there. Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-05 16:51:53 +00:00
Nick Piggin	ca5de404ff	fs: rename buffer trylock Like the page lock change, this also requires name change, so convert the raw test_and_set bitop to a trylock. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 21:56:09 -07:00
Nick Piggin	529ae9aaa0	mm: rename page trylock Converting page lock to new locking bitops requires a change of page flag operation naming, so we might as well convert it to something nicer (!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked). This also facilitates lockdeping of page lock. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 21:31:34 -07:00
Linus Torvalds	1a3f7d98e5	Revert "UFS: add const to parser token table" This reverts commit `f9247273cb` (and `fb2e405fc1` - "fix fs/nfs/nfsroot.c compilation" - that fixed a missed conversion). The changes cause problems for at least the sparc build. Let's re-do them when the exact issues are resolved. Requested-by: Andrew Morton <akpm@linux-foundation.org> Requested-by: Steven Whitehouse <swhiteho@redhat.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-04 16:50:38 -07:00
Jeff Layton	93d0ec8518	remove locking around tcpSesAllocCount atomic variable The global tcpSesAllocCount variable is an atomic already and doesn't really need the extra locking around it. Remove the locking and just use the atomic_inc_return and atomic_dec_return functions to make sure we access it correctly. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-04 02:02:15 +00:00
Linus Torvalds	8f616cd524	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: remove write-only variables from ext4_ordered_write_end ext4: unexport jbd2_journal_update_superblock ext4: Cleanup whitespace and other miscellaneous style issues ext4: improve ext4_fill_flex_info() a bit ext4: Cleanup the block reservation code path ext4: don't assume extents can't cross block groups when truncating ext4: Fix lack of credits BUG() when deleting a badly fragmented inode ext4: Fix ext4_ext_journal_restart() ext4: fix ext4_da_write_begin error path jbd2: don't abort if flushing file data failed ext4: don't read inode block if the buffer has a write error ext4: Don't allow lg prealloc list to be grow large. ext4: Convert the usage of NR_CPUS to nr_cpu_ids. ext4: Improve error handling in mballoc ext4: lock block groups when initializing ext4: sync up block and inode bitmap reading functions ext4: Allow read/only mounts with corrupted block group checksums ext4: Fix data corruption when writing to prealloc area	2008-08-03 10:50:44 -07:00
Eric Sandeen	7d55992d60	ext4: remove write-only variables from ext4_ordered_write_end The variables 'from' and 'to' are not used anywhere. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2008-08-02 21:22:18 -04:00
OGAWA Hirofumi	17263849c7	fat: Fix allow_utime option FAT has to handle the newly introduced ATTR_TIMES_SET for allow_utime option. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-02 09:13:55 -07:00
Linus Torvalds	b8a327be3f	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull: (64 commits) [XFS] Remove vn_revalidate calls in xfs. [XFS] Now that xfs_setattr is only used for attributes set from ->setattr [XFS] xfs_setattr currently doesn't just handle the attributes set through [XFS] fix use after free with external logs or real-time devices [XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a [XFS] fix compilation without CONFIG_PROC_FS [XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/ [XFS] fix mount option parsing in remount [XFS] Disable queue flag test in barrier check. [XFS] streamline init/exit path [XFS] Fix up problem when CONFIG_XFS_POSIX_ACL is not set and yet we still [XFS] Don't assert if trying to mount with blocksize > pagesize [XFS] Don't update mtime on rename source [XFS] Allow xfs_bmbt_split() to fallback to the lowspace allocator [XFS] Restore the lowspace extent allocator algorithm [XFS] use minleft when allocating in xfs_bmbt_split() [XFS] attrmulti cleanup [XFS] Check for invalid flags in xfs_attrlist_by_handle. [XFS] Fix CI lookup in leaf-form directories [XFS] Use the generic xattr methods. ...	2008-08-01 12:39:09 -07:00
Linus Torvalds	63a16f9016	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: [PATCH] ocfs2: Release mutex in error handling code [PATCH] ocfs2: Fix oops when racing files truncates with writes into an mmap region [PATCH 2/2] ocfs2: Fix race between mount and recovery [PATCH 1/2] ocfs2: Add counter in struct ocfs2_dinode to track journal replays [PATCH] configfs: Convenience macros for attribute definition. [PATCH] configfs: Pin configfs subsystems separately from new config_items. [PATCH] configfs: Fix open directory making rmdir() fail [PATCH] configfs: Lock new directory inodes before removing on cleanup after failure [PATCH] configfs: Prevent userspace from creating new entries under attaching directories [PATCH] configfs: Fix failing symlink() making rmdir() fail [PATCH] configfs: Fix symlink() to a removing item [PATCH] configfs: Include linux/err.h in linux/configfs.h	2008-08-01 11:54:05 -07:00
Linus Torvalds	623fa579e6	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [MTD] [NAND] drivers/mtd/nand/nandsim.c: fix printk warnings [MTD] [NAND] Blackfin NFC Driver: Cleanup the error exit path of bf5xx_nand_probe function [MTD] [NAND] Blackfin NFC Driver: use standard dev_err() rather than printk() [MTD] [NAND] Blackfin NFC Driver: enable Blackfin nand HWECC support by default [MTD] [NAND] Blackfin NFC Driver: add proper devinit/devexit markings to probe/remove functions [MTD] [NAND] Blackfin NFC Driver: add support for the ECC layout the Blackfin bootrom uses [MTD] [NAND] Blackfin NFC Driver: fix bug - hw ecc calc by making sure we extract 11 bits from each register instead of 10 [MTD] [NAND] Blackfin NFC Driver: fix bug - do not clobber the status from the first 256 bytes if operating on 512 pages [MTD] [NAND] diskonchip.c fix sparse endian warnings [MTD] [NAND] drivers/mtd/nand/nandsim.c needs div64.h [JFFS2] Fix allocation of summary buffer Fix rename of at91_nand -> atmel_nand [MTD] [NOR] drivers/mtd/chips/jedec_probe.c: fix Am29DL800BB device ID [MTD] MTD_DEBUG always does compile-time typechecks [MTD] DataFlash: bugfix, binary page sizes now handled [MTD] [NAND] fsl_elbc_nand.c: fix printk warning [MTD] [NAND] nandsim: support random page read command [MTD] [NAND] fix subpage read for small page NAND	2008-08-01 11:29:54 -07:00
Jeff Layton	66b8bd3c40	[CIFS] properly account for new user= field in SPNEGO upcall string allocation ...it doesn't look like it's being accounted for at the moment. Also try to reorganize the calculation to make it a little more evident what each piece means. This should probably go to the stable series as well... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-08-01 17:54:32 +00:00
Al Viro	8d66bf5481	[PATCH] pass struct path * to do_add_mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:32 -04:00
Al Viro	d5686b444f	[PATCH] switch mtd and dm-table to lookup_bdev() No need to open-code it... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:31 -04:00
Miklos Szeredi	a95164d979	[patch 3/4] vfs: remove unused nameidata argument of may_create() Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:30 -04:00
Alexey Dobriyan	7ee7c12b71	[PATCH] devpts: switch to IDA Devpts code wants just numbers for tty indexes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:29 -04:00
Alexey Dobriyan	9a18540915	[PATCH 2/2] proc: switch inode number allocation to IDA proc doesn't use "associate pointer with id" feature of IDR, so switch to IDA. NOTE, NOTE, NOTE: Do not apply if release_inode_number() still mantions MAX_ID_MASK! Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:28 -04:00
Alexey Dobriyan	67935df49d	[PATCH 1/2] proc: fix inode number bogorithmetic Id which proc gets from IDR for inode number and id which proc removes from IDR do not match. E.g. 0x11a transforms into 0x8000011a. Which stayed unnoticed for a long time because, surprise, idr_remove() masks out that high bit before doing anything. All of this due to "\| ~MAX_ID_MASK" in release_inode_number(). I still don't understand how it's supposed to work, because "\| ~MASK" is not an inversion for "& MAX" operation. So, use just one nice, working addition. Make start offset unsigned int, while I'm at it. It's longness is not used anywhere. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:27 -04:00
Al Viro	8266602033	[PATCH] fix bdev leak in block_dev.c do_open() Callers expect it to drop reference to bdev on all failure exits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:26 -04:00
Al Viro	77e69dac3c	[PATCH] fix races and leaks in vfs_quota_on() users * new helper: vfs_quota_on_path(); equivalent of vfs_quota_on() sans the pathname resolution. * callers of vfs_quota_on() that do their own pathname resolution and checks based on it are switched to vfs_quota_on_path(); that way we avoid the races. * reiserfs leaked dentry/vfsmount references on several failure exits. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:25 -04:00
Al Viro	1b7e190b47	[PATCH] clean dup2() up a bit Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:24 -04:00
Al Viro	1027abe882	[PATCH] merge locate_fd() and get_unused_fd() New primitive: alloc_fd(start, flags). get_unused_fd() and get_unused_fd_flags() become wrappers on top of it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:23 -04:00
Stephen Smalley	f418b00607	Re: BUG at security/selinux/avc.c:883 (was: Re: linux-next: Tree for July 17: early crash on x86-64) SELinux needs MAY_APPEND to be passed down to the security hook. Otherwise, we get permission denials when only append permission is granted by policy even if the opening process specified O_APPEND. Shows up as a regression in the ltp selinux testsuite, fixed by this patch. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-08-01 11:25:21 -04:00
David Woodhouse	b7600dba6d	[JFFS2] Fix allocation of summary buffer We can't use vmalloc for the buffer we use for writing summaries, because some drivers may want to DMA from it. So limit the size to 64KiB and use kmalloc for it instead. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>	2008-08-01 10:07:51 +01:00
Julia Lawall	c259ae52e2	[PATCH] ocfs2: Release mutex in error handling code The mutex is released on a successful return, so it would seem that it should be released on an error return as well. The semantic patch finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @@ expression l; @@ mutex_lock(l); ... when != mutex_unlock(l) when any when strict ( if (...) { ... when != mutex_unlock(l) + mutex_unlock(l); return ...; } \| mutex_unlock(l); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:14 -07:00
Sunil Mushran	961cecbee6	[PATCH] ocfs2: Fix oops when racing files truncates with writes into an mmap region This patch fixes an oops that is reproduced when one races writes to a mmap-ed region with another process truncating the file. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:14 -07:00
Sunil Mushran	539d826409	[PATCH 2/2] ocfs2: Fix race between mount and recovery As the fs recovery is asynchronous, there is a small chance that another node can mount (and thus recover) the slot before the recovery thread gets to it. If this happens, the recovery thread will block indefinitely on the journal/slot lock as that lock will be held for the duration of the mount (by design) by the node assigned to that slot. The solution implemented is to keep track of the journal replays using a recovery generation in the journal inode, which will be incremented by the thread replaying that journal. The recovery thread, before attempting the blocking lock on the journal/slot lock, will compare the generation on disk with what it has cached and skip recovery if it does not match. This bug appears to have been inadvertently introduced during the mount/umount vote removal by mainline commit `34d024f843`. In the mount voting scheme, the messaging would indirectly indicate that the slot was being recovered. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:14 -07:00
Sunil Mushran	c69991aac7	[PATCH 1/2] ocfs2: Add counter in struct ocfs2_dinode to track journal replays This patch renames the ij_pad to ij_recovery_generation in struct ocfs2_dinode. This will be used to keep count of journal replays after an unclean shutdown. Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Joel Becker	70526b6744	[PATCH] configfs: Pin configfs subsystems separately from new config_items. configfs_mkdir() creates a new item by calling its parent's ->make_item/group() functions. Once that object is created, configfs_mkdir() calls try_module_get() on the new item's module. If it succeeds, the module owning the new item cannot be unloaded, and configfs is safe to reference the item. If the item and the subsystem it belongs to are part of the same module, the subsystem is also pinned. This is the common case. However, if the subsystem is made up of multiple modules, this may not pin the subsystem. Thus, it would be possible to unload the toplevel subsystem module while there is still a child item. Thus, we now try_module_get() the subsystem's module. This only really affects children of the toplevel subsystem group. Deeper children already have their parents pinned. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	99cefda42a	[PATCH] configfs: Fix open directory making rmdir() fail When checking for user-created elements under an item to be removed by rmdir(), configfs_detach_prep() counts fake configfs_dirents created by dir_open() as user-created and fails when finding one. It is however perfectly valid to remove a directory that is open. Simply make configfs_detach_prep() skip fake configfs_dirent, like it already does for attributes, and like detach_groups() does. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	2e2ce171c3	[PATCH] configfs: Lock new directory inodes before removing on cleanup after failure Once a new configfs directory is created by configfs_attach_item() or configfs_attach_group(), a failure in the remaining initialization steps leads to removing a directory which inode the VFS may have already accessed. This commit adds the necessary inode locking to safely remove configfs directories while cleaning up after a failure. As an advantage, the locking rules of populate_groups() and detach_groups() become the same: the caller must have the group's inode mutex locked. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	2a109f2a41	[PATCH] configfs: Prevent userspace from creating new entries under attaching directories process 1: process 2: configfs_mkdir("A") attach_group("A") attach_item("A") d_instantiate("A") populate_groups("A") mutex_lock("A") attach_group("A/B") attach_item("A") d_instantiate("A/B") mkdir("A/B/C") do_path_lookup("A/B/C", LOOKUP_PARENT) ok lookup_create("A/B/C") mutex_lock("A/B") ok configfs_mkdir("A/B/C") ok attach_group("A/C") attach_item("A/C") d_instantiate("A/C") populate_groups("A/C") mutex_lock("A/C") attach_group("A/C/D") attach_item("A/C/D") failure mutex_unlock("A/C") detach_groups("A/C") nothing to do mkdir("A/C/E") do_path_lookup("A/C/E", LOOKUP_PARENT) ok lookup_create("A/C/E") mutex_lock("A/C") ok configfs_mkdir("A/C/E") ok detach_item("A/C") d_delete("A/C") mutex_unlock("A") detach_groups("A") mutex_lock("A/B") detach_group("A/B") detach_groups("A/B") nothing since no _default_ group detach_item("A/B") mutex_unlock("A/B") d_delete("A/B") detach_item("A") d_delete("A") Two bugs: 1/ "A/B/C" and "A/C/E" are created, but never removed while their parent are removed in the end. The same could happen with symlink() instead of mkdir(). 2/ "A" and "A/C" inodes are not locked while detach_item() is called on them, which may probably confuse VFS. This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before building the inode and instantiating the dentry, and validating the whole group+default groups hierarchy in a second pass by clearing CONFIGFS_USET_CREATING. mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This does not prevent userspace from calling stat() successfuly on such directories, but this prevents userspace from adding (children to \| symlinking from/to \| read/write attributes of \| listing the contents of) not validated items. In other words, userspace will not interact with the subsystem on a new item until the new item creation completes correctly. It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the target of a new symlink: a valid target directory in the middle of attaching a new user-created child item could be wrongly detected as being attached. 2/ is fixed by next commit. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	9a73d78cda	[PATCH] configfs: Fix failing symlink() making rmdir() fail On a similar pattern as mkdir() vs rmdir(), a failing symlink() may make rmdir() fail for the symlink's parent and the symlink's target as well. failing symlink() making target's rmdir() fail: process 1: process 2: symlink("A/S" -> "B") allow_link() create_link() attach to "B" links list rmdir("B") detach_prep("B") error because of new link configfs_create_link("A", "S") error (eg -ENOMEM) failing symlink() making parent's rmdir() fail: process 1: process 2: symlink("A/D/S" -> "B") allow_link() create_link() attach to "B" links list configfs_create_link("A/D", "S") make_dirent("A/D", "S") rmdir("A") detach_prep("A") detach_prep("A/D") error because of "S" create("S") error (eg -ENOMEM) We cannot use the same solution as for mkdir() vs rmdir(), since rmdir() on the target cannot wait on the i_mutex of the new symlink's parent without risking a deadlock (with other symlink() or sys_rename()). Instead we define a global mutex protecting all configfs symlinks attachment, so that rmdir() can avoid the races above. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:13 -07:00
Louis Rilling	4768e9b18d	[PATCH] configfs: Fix symlink() to a removing item The rule for configfs symlinks is that symlinks always point to valid config_items, and prevent the target from being removed. However, configfs_symlink() only checks that it can grab a reference on the target item, without ensuring that it remains alive until the symlink is correctly attached. This patch makes configfs_symlink() fail whenever the target is being removed, using the CONFIGFS_USET_DROPPING flag set by configfs_detach_prep() and protected by configfs_dirent_lock. This patch introduces a similar (weird?) behavior as with mkdir failures making rmdir fail: if symlink() races with rmdir() of the parent directory (or its youngest user-created ancestor if parent is a default group) or rmdir() of the target directory, and then fails in configfs_create(), this can make the racing rmdir() fail despite the concerned directory having no user-created entry (resp. no symlink pointing to it or one of its default groups) in the end. This behavior is fixed in later patches. Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com> Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:12 -07:00
Joel Becker	dacdd0e047	[PATCH] configfs: Include linux/err.h in linux/configfs.h We now use PTR_ERR() in the ->make_item() and ->make_group() operations. Folks including configfs.h need err.h. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>	2008-07-31 16:21:12 -07:00
Jeff Layton	2f0e58ac3a	[CIFS] remove level of indentation from decode_negTokenInit Most of this function takes place inside of an unnecessary "else" clause. The other 2 cases both return 0, so we can remove some indentation here. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-07-31 21:30:11 +00:00
Linus Torvalds	0056e65f9e	romfs_readpage: don't report errors for pages beyond i_size We zero-fill them like we are supposed to, and that's all fine. It's only an error if the 'romfs_copyfrom()' routine isn't able to fill the data that is supposed to be there. Most of the patch is really just re-organizing the code a bit, and using separate variables for the error value and for how much of the page we actually filled from the filesystem. Reported-and-tested-by: Chris Fester <cfester@wms.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Matt Waddel <matt.waddel@freescale.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 14:30:34 -07:00
Julia Lawall	53e6d8d182	fs/nfsd/export.c: Adjust error handling code involving auth_domain_put Once clp is assigned, it never becomes NULL, so we can make a label for it in the error handling code. Because the call to path_lookup follows the call to auth_domain_find, its error handling code should jump to this new label. The semantic match that finds this problem is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @r@ expression x,E; statement S; position p1,p2,p3; @@ ( if ((x = auth_domain_find@p1(...)) == NULL \|\| ...) S \| x = auth_domain_find@p1(...) ... when != x if (x == NULL \|\| ...) S ) <... if@p3 (...) { ... when != auth_domain_put(x) when != if (x) { ... auth_domain_put(x); ...} return@p2 ...; } ...> ( return x; \| return 0; \| x = E \| E = x \| auth_domain_put(x) ) @exists@ position r.p1,r.p2,r.p3; expression x; int ret != 0; statement S; @@ * x = auth_domain_find@p1(...) <... * if@p3 (...) S ...> * return@p2 $NULL\\|ret$; // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2008-07-30 13:20:20 -04:00
Thomas Petazzoni	dbacefc9c4	fs/buffer.c: uninline __remove_assoc_queue() Uninline the __remove_assoc_queue() function in fs/buffer.c, called at too many places and too long to really be inlined. Size results: text data bss dec hex filename 1134606 118840 212992 1466438 166046 vmlinux.old 1134303 118840 212992 1466135 165f17 vmlinux -303 0 0 -303 -12F +/- This patch is part of the Linux Tiny project and has been originally written by Matt Mackall <mpm@selenic.com>. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Cc: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:46 -07:00
Harvey Harrison	d406f66ddb	omfs: sparse annotations Missing cpu_to_be64 on some constant assignments. fs/omfs/dir.c:107:16: warning: incorrect type in assignment (different base types) fs/omfs/dir.c:107:16: expected restricted __be64 [usertype] i_sibling fs/omfs/dir.c:107:16: got unsigned long long fs/omfs/file.c:33:13: warning: incorrect type in assignment (different base types) fs/omfs/file.c:33:13: expected restricted __be64 [usertype] e_next fs/omfs/file.c:33:13: got unsigned long long fs/omfs/file.c:36:24: warning: incorrect type in assignment (different base types) fs/omfs/file.c:36:24: expected restricted __be64 [usertype] e_cluster fs/omfs/file.c:36:24: got unsigned long long fs/omfs/file.c:37:23: warning: incorrect type in assignment (different base types) fs/omfs/file.c:37:23: expected restricted __be64 [usertype] e_blocks fs/omfs/file.c:37:23: got unsigned long long fs/omfs/bitmap.c:74:18: warning: incorrect type in argument 2 (different signedness) fs/omfs/bitmap.c:74:18: expected unsigned long volatile addr fs/omfs/bitmap.c:74:18: got long <noident> fs/omfs/bitmap.c:77:20: warning: incorrect type in argument 2 (different signedness) fs/omfs/bitmap.c:77:20: expected unsigned long volatile addr fs/omfs/bitmap.c:77:20: got long <noident> fs/omfs/bitmap.c:112:17: warning: incorrect type in argument 2 (different signedness) fs/omfs/bitmap.c:112:17: expected unsigned long volatile addr fs/omfs/bitmap.c:112:17: got long <noident> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:46 -07:00
Alex Nixon	3971e1a917	VFS: increase pseudo-filesystem block size to PAGE_SIZE This commit: commit `ba52de123d` Author: Theodore Ts'o <tytso@mit.edu> Date: Wed Sep 27 01:50:49 2006 -0700 [PATCH] inode-diet: Eliminate i_blksize from the inode structure caused the block size used by pseudo-filesystems to decrease from PAGE_SIZE to 1024 leading to a doubling of the number of context switches during a kernbench run. Signed-off-by: Alex Nixon <Alex.Nixon@citrix.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Ian Campbell <Ian.Campbell@eu.citrix.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Hugh Dickins <hugh@veritas.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-30 09:41:44 -07:00
Shirish Pargaonkar	176803562b	[CIFS] cifs send2 not retrying enough in some cases on full socket There are cases in which, on a full socket which requires retry on sending data by the app (cifs in this case), that we were not retrying since we did not reinitialize a counter. This fixes the retry logic to retry up to 15 seconds on stuck sockets. Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-07-29 21:26:13 +00:00
Steve French	44051fed57	[CIFS] oid should also be checked against class in cifs asn The oid coming back from asn1_header_decode is a primitive object so class should be checked to be universal. Acked-by: Love Hörnquist Åstrand <lha@kth.se> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-07-29 21:20:14 +00:00
Eric Sandeen	7fcba05437	eCryptfs: use page_alloc not kmalloc to get a page of memory With SLUB debugging turned on in 2.6.26, I was getting memory corruption when testing eCryptfs. The root cause turned out to be that eCryptfs was doing kmalloc(PAGE_CACHE_SIZE); virt_to_page() and treating that as a nice page-aligned chunk of memory. But at least with SLUB debugging on, this is not always true, and the page we get from virt_to_page does not necessarily match the PAGE_CACHE_SIZE worth of memory we got from kmalloc. My simple testcase was 2 loops doing "rm -f fileX; cp /tmp/fileX ." for 2 different multi-megabyte files. With this change I no longer see the corruption. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Yoichi Yuasa	e3b6e806cf	bio-integrity: remove EXPORT_SYMBOL for bio_integrity_init_slab() I got section mismatch message about bio_integrity_init_slab(). WARNING: fs/built-in.o(__ksymtab+0xb60): Section mismatch in reference from the variable __ksymtab_bio_integrity_init_slab to the function .init.text:bio_integrity_init_slab() The symbol bio_integrity_init_slab is exported and annotated __init Fix this by removing the __init annotation of bio_integrity_init_slab or drop the export. It only call from init_bio(). The EXPORT_SYMBOL() can be removed. Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Hisashi Hifumi	8ab22b9abb	vfs: pagecache usage optimization for pagesize!=blocksize When we read some part of a file through pagecache, if there is a pagecache of corresponding index but this page is not uptodate, read IO is issued and this page will be uptodate. I think this is good for pagesize == blocksize environment but there is room for improvement on pagesize != blocksize environment. Because in this case a page can have multiple buffers and even if a page is not uptodate, some buffers can be uptodate. So I suggest that when all buffers which correspond to a part of a file that we want to read are uptodate, use this pagecache and copy data from this pagecache to user buffer even if a page is not uptodate. This can reduce read IO and improve system throughput. I wrote a benchmark program and got result number with this program. This benchmark do: 1: mount and open a test file. 2: create a 512MB file. 3: close a file and umount. 4: mount and again open a test file. 5: pwrite randomly 300000 times on a test file. offset is aligned by IO size(1024bytes). 6: measure time of preading randomly 100000 times on a test file. The result was: 2.6.26 330 sec 2.6.26-patched 226 sec Arch:i386 Filesystem:ext3 Blocksize:1024 bytes Memory: 1GB On ext3/4, a file is written through buffer/block. So random read/write mixed workloads or random read after random write workloads are optimized with this patch under pagesize != blocksize environment. This test result showed this. The benchmark program is as follows: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #include <time.h> #include <stdlib.h> #include <string.h> #include <sys/mount.h> #define LEN 1024 #define LOOP 1024512 / 512MB / main(void) { unsigned long i, offset, filesize; int fd; char buf[LEN]; time_t t1, t2; if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } memset(buf, 0, LEN); fd = open("/root/test1/testfile", O_CREAT\|O_RDWR\|O_TRUNC); if (fd < 0) { perror("cannot open file\n"); exit(1); } for (i = 0; i < LOOP; i++) write(fd, buf, LEN); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } if (mount("/dev/sda1", "/root/test1/", "ext3", 0, 0) < 0) { perror("cannot mount\n"); exit(1); } fd = open("/root/test1/testfile", O_RDWR); if (fd < 0) { perror("cannot open file\n"); exit(1); } filesize = LEN LOOP; for (i = 0; i < 300000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pwrite(fd, buf, LEN, offset); } printf("start test\n"); time(&t1); for (i = 0; i < 100000; i++){ offset = (random() % filesize) & (~(LEN - 1)); pread(fd, buf, LEN, offset); } time(&t2); printf("%ld sec\n", t2-t1); close(fd); if (umount("/root/test1/") < 0) { perror("cannot umount\n"); exit(1); } } Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jan Kara <jack@ucw.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:21 -07:00
Hugh Dickins	ca5b172bd2	exec: include pagemap.h again to fix build Fix compilation errors on avr32 and without CONFIG_SWAP, introduced by `ba92a43dba` ("exec: remove some includes") In file included from include/asm/tlb.h:24, from fs/exec.c:55: include/asm-generic/tlb.h: In function 'tlb_flush_mmu': include/asm-generic/tlb.h:76: error: implicit declaration of function 'release_pages' include/asm-generic/tlb.h: In function 'tlb_remove_page': include/asm-generic/tlb.h:105: error: implicit declaration of function 'page_cache_release' make[1]: *** [fs/exec.o] Error 1 This straightforward part-revert is nobody's favourite patch to address the underlying tlb.h needs swap.h needs pagemap.h (but sparc won't like that) mess; but appropriate to fix the build now before any overhaul. Reported-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp> Reported-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Tested-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-28 16:30:20 -07:00
Linus Torvalds	3988ba0708	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm: dlm: fix uninitialized variable for search_rsb_list callers dlm: release socket on error dlm: fix basts for granted CW waiting PR/CW dlm: check for null in device_write	2008-07-28 09:46:00 -07:00
Paul Mundt	3bc24a1a54	sh: Initial ELF FDPIC support. This adds initial support for ELF FDPIC on MMU-less SH, as per version 0.2 of the ABI definition at: http://www.codesourcery.com/public/docs/sh-fdpic/sh-fdpic-abi.txt Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2008-07-28 18:10:28 +09:00
Paul Mundt	9b14ec35f0	binfmt_elf_fdpic: Magical stack pointer index, for NEW_AUX_ENT compat. While implementing binfmt_elf_fdpic on SH it quickly became apparent that SH was the first platform to support both binfmt_elf_fdpic and binfmt_elf, as well as the only of the FDPIC platforms to make use of the auxvt. Currently binfmt_elf_fdpic uses a special version of NEW_AUX_ENT() where the first argument is the entry displacement after csp has been adjusted, being reset after each adjustment. As we have no ability to sort this out through the platform's ARCH_DLINFO, this index needs to be managed entirely in create_elf_fdpic_tables(). Presently none of the platforms that set their own auxvt entries are able to do so through their respective ARCH_DLINFOs when using binfmt_elf_fdpic. In addition to this, binfmt_elf_fdpic has been looking at DLINFO_ARCH_ITEMS for the number of architecture-specific entries in the auxvt. This is legacy cruft, and is not defined by any platforms in-tree, even those that make heavy use of the auxvt. AT_VECTOR_SIZE_ARCH is always available, and contains the number that is of interest here, so we switch to using that unconditionally as well. As this has direct bearing on how much stack is used, platforms that have configurable (or dynamically adjustable) NEW_AUX_ENT calls need to either make AT_VECTOR_SIZE_ARCH more fine-grained, or leave it as a worst-case and live with some lost stack space if those entries aren't pushed (some platforms may also need to purposely sacrifice some space here for alignment considerations, as noted in the code -- although not an issue for any FDPIC-capable platform today). Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: David Howells <dhowells@redhat.com>	2008-07-28 18:10:28 +09:00
Christoph Hellwig	f13fae2d2a	[XFS] Remove vn_revalidate calls in xfs. These days most of the attributes in struct inode are properly kept in sync by XFS. This patch removes the need for vn_revalidate completely by: - keeping inode.i_flags uptodate after any flags are updated in xfs_ioctl_setattr - keeping i_mode, i_uid and i_gid uptodate in xfs_setattr SGI-PV: 984566 SGI-Modid: xfs-linux-melb:xfs-kern:31679a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:39 +10:00
Christoph Hellwig	0f285c8a1c	[XFS] Now that xfs_setattr is only used for attributes set from ->setattr it can be switched to take struct iattr directly and thus simplify the implementation greatly. Also rename the ATTR_ flags to XFS_ATTR_ to not conflict with the ATTR_ flags used by the VFS. SGI-PV: 984565 SGI-Modid: xfs-linux-melb:xfs-kern:31678a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:37 +10:00
Christoph Hellwig	25fe55e814	[XFS] xfs_setattr currently doesn't just handle the attributes set through ->setattr but also addition XFS-specific attributes: project id, inode flags and extent size hint. Having these in a single function makes it more complicated and forces to have us a bhv_vattr intermediate structure eating up stackspace. This patch adds a new xfs_ioctl_setattr helper for the XFS ioctls that set these attributes and remove the code to set them through xfs_setattr. SGI-PV: 984564 SGI-Modid: xfs-linux-melb:xfs-kern:31677a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:36 +10:00
Lachlan McIlroy	c032bfcf46	[XFS] fix use after free with external logs or real-time devices SGI-PV: 983806 SGI-Modid: xfs-linux-melb:xfs-kern:31666a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2008-07-28 16:59:34 +10:00
Tim Shimmin	6a617dd22b	[XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a particular case, the delta param which is supposed to describe the region where extents have changed was not updated appropriately. SGI-PV: 984030 SGI-Modid: xfs-linux-melb:xfs-kern:31663a Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Olaf Weber <olaf@sgi.com>	2008-07-28 16:59:32 +10:00
Christoph Hellwig	766b0925c0	[XFS] fix compilation without CONFIG_PROC_FS SGI-PV: 984019 SGI-Modid: xfs-linux-melb:xfs-kern:31408a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:31 +10:00
Christoph Hellwig	26cc002180	[XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/ SGI-PV: 981498 SGI-Modid: xfs-linux-melb:xfs-kern:31405a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:29 +10:00
Christoph Hellwig	62a877e35d	[XFS] fix mount option parsing in remount Remount currently happily accept any option thrown at it, although the only filesystem specific option it actually handles is barrier/nobarrier. And it actually doesn't handle these correctly either because it only uses the value it parsed when we're doing a ro->rw transition. In addition to that there's also a bad bug in xfs_parseargs which doesn't touch the actual option in the mount point except for a single one, XFS_MOUNT_SMALL_INUMS and thus forced any filesystem that's every remounted in some way to not support 64bit inodes with no way to recover unless unmounted. This patch changes xfs_fs_remount to use it's own linux/parser.h based options parse instead of xfs_parseargs and reject all options except for barrier/nobarrier and to the right thing in general. Eventually I'd like to have a single big option table used for mount aswell but that can wait for a while. SGI-PV: 983964 SGI-Modid: xfs-linux-melb:xfs-kern:31382a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:28 +10:00
Eric Sandeen	deeb5912db	[XFS] Disable queue flag test in barrier check. md raid1 can pass down barriers, but does not set an ordered flag on the queue, so xfs does not even attempt a barrier write, and will never use barriers on these block devices. Remove the flag check and just let the barrier write test determine barrier support. A possible risk here is that if something does not set an ordered flag and also does not properly return an error on a barrier write... but if it's any consolation jbd/ext3/reiserfs never test the flag, and don't even do a test write, they just disable barriers the first time an actual journal barrier write fails. SGI-PV: 983924 SGI-Modid: xfs-linux-melb:xfs-kern:31377a Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:26 +10:00
Christoph Hellwig	9f8868ffb3	[XFS] streamline init/exit path Currently the xfs module init/exit code is a mess. It's farmed out over a lot of function with very little error checking. This patch makes sure we propagate all initialization failures properly and clean up after them. Various runtime initializations are replaced with compile-time initializations where possible to make this easier. The exit path is similarly consolidated. There's now split out function to create/destroy the kmem zones and alloc/free the trace buffers. I've also changed the ktrace allocations to KM_MAYFAIL and handled errors resulting from that. And yes, we really should replace the XFS_*_TRACE ifdefs with a single XFS_TRACE.. SGI-PV: 976035 SGI-Modid: xfs-linux-melb:xfs-kern:31354a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Niv Sardi <xaiki@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-07-28 16:59:25 +10:00

... 4 5 6 7 8 ...

10281 Commits