linux_old1

Commit Graph

Author	SHA1	Message	Date
Al Viro	6badd79bd0	kill ->dir_notify() Remove the hopelessly misguided ->dir_notify(). The only instance (cifs) has been broken by design from the very beginning; the objects it creates are never destroyed, keep references to struct file they can outlive, nothing that could possibly evict them exists on close(2) path and no locking whatsoever is done to prevent races with close(), should the previous, er, deficiencies someday be dealt with. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:43 -05:00
Eric Dumazet	b6b3fdead2	filp_cachep can be static in fs/file_table.c Instead of creating the "filp" kmem_cache in vfs_caches_init(), we can do it a litle be later in files_init(), so that filp_cachep is static to fs/file_table.c Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:42 -05:00
Steven Rostedt	1239f26c05	make INIT_FS use the __RW_LOCK_UNLOCKED initialization [AV: rediffed on top of unification of init_fs] Initialization of init_fs still uses the deprecated RW_LOCK_UNLOCKED macro. This patch updates it to use the __RW_LOCK_UNLOCKED(lock) macro. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:42 -05:00
Al Viro	18d8fda7c3	take init_fs to saner place Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:42 -05:00
Christoph Hellwig	cb23beb551	kill vfs_permission With all the nameidata removal there's no point anymore for this helper. Of the three callers left two will go away with the next lookup series anyway. Also add proper kerneldoc to inode_permission as this is the main permission check routine now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Christoph Hellwig	3fb64190aa	pass a struct path * to may_open No need for the nameidata in may_open - a struct path is enough. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Christoph Hellwig	b4091d5f6f	kill walk_init_root walk_init_root is a tiny helper that is marked __always_inline, has just one caller and an unused argument. Just merge it into the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Christoph Hellwig	66f221875d	remove incorrect comment in inode_permission We now pass on all MAY_ flags to the filesystems permission routines, so remove the comment stating the contrary. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Arjan van de Ven	52afeefb9d	expand some comments (d_path / seq_path) Explain that you really need to use the return value of d_path rather than the buffer you passed into it. Also fix the comment for seq_path(), the function arguments changed recently but the comment hadn't been updated in sync. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:41 -05:00
Zhaolei	be42c4c433	correct wrong function name of d_put in kernel document and source comment no function named d_put(), it should be dput(). Impact: fix document and comment, no functionality changed Signed-off-by: Zhao Lei <zhaolei@cn.fuijtsu.com> Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:40 -05:00
Al Viro	dc711ca35f	fix switch_names() breakage in short-to-short case We want ->name.len to match the resulting name on both source and target Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:40 -05:00
Duane Griffin	7df5fa06de	befs: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Sergey S. Kostyliov <rathamahata@php4.ru> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:40 -05:00
Duane Griffin	a63d0ff31a	freevxfs: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:40 -05:00
Duane Griffin	21acaf8e8d	sysv: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:39 -05:00
Duane Griffin	e83c1397ca	ext4: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: adilger@sun.com Cc: linux-ext4@vger.kernel.org Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:39 -05:00
Duane Griffin	b5ed3112b5	ext3: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Tweedie <sct@redhat.com> Cc: linux-ext4@vger.kernel.org Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:39 -05:00
Duane Griffin	8d6d0c4da2	ext2: ensure fast symlinks are NUL-terminated Ensure fast symlink targets are NUL-terminated, even if corrupted on-disk. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:39 -05:00
Duane Griffin	ebd09abbd9	vfs: ensure page symlinks are NUL-terminated On-disk data corruption could cause a page link to have its i_size set to PAGE_SIZE (or a multiple thereof) and its contents all non-NUL. NUL-terminate the link name to ensure this doesn't cause further problems for the kernel. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:39 -05:00
Duane Griffin	a17d5232de	eCryptfs: check readlink result was not an error before using it The result from readlink is being used to index into the link name buffer without checking whether it is a valid length. If readlink returns an error this will fault or cause memory corruption. Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: ecryptfs-devel@lists.launchpad.net Signed-off-by: Duane Griffin <duaneg@dghda.com> Acked-by: Michael Halcrow <mhalcrow@us.ibm.com> Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Julia Lawall	5cc4a0341a	fs/namespace.c: drop code after return The extra semicolon serves no purpose. Signed-off-by: Julia Lawall <julia@diku.dk> Reviewed-by: Richard Genoud <richard.genoud@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Nick Piggin	c2452f3278	shrink struct dentry struct dentry is one of the most critical structures in the kernel. So it's sad to see it going neglected. With CONFIG_PROFILING turned on (which is probably the common case at least for distros and kernel developers), sizeof(struct dcache) == 208 here (64-bit). This gives 19 objects per slab. I packed d_mounted into a hole, and took another 4 bytes off the inline name length to take the padding out from the end of the structure. This shinks it to 200 bytes. I could have gone the other way and increased the length to 40, but I'm aiming for a magic number, read on... I then got rid of the d_cookie pointer. This shrinks it to 192 bytes. Rant: why was this ever a good idea? The cookie system should increase its hash size or use a tree or something if lookups are a problem. Also the "fast dcookie lookups" in oprofile should be moved into the dcookie code -- how can oprofile possibly care about the dcookie_mutex? It gets dropped after get_dcookie() returns so it can't be providing any sort of protection. At 192 bytes, 21 objects fit into a 4K page, saving about 3MB on my system with ~140 000 entries allocated. 192 is also a multiple of 64, so we get nice cacheline alignment on 64 and 32 byte line systems -- any given dentry will now require 3 cachelines to touch all fields wheras previously it would require 4. I know the inline name size was chosen quite carefully, however with the reduction in cacheline footprint, it should actually be just about as fast to do a name lookup for a 36 character name as it was before the patch (and faster for other sizes). The memory footprint savings for names which are <= 32 or > 36 bytes long should more than make up for the memory cost for 33-36 byte names. Performance is a feature... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:38 -05:00
Richard Kennedy	e2b689d82c	fs: reorder struct inotify_device on 64bits to remove padding Reorder struct inotify_device to remove 8 bytes of padding on 64bit builds, reducing size to 128 bytes . Therefore allocating from a smaller slab & using one fewer cachelines. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> ---- Hi, patch against 2.6.28-rc7. built & tested on AMDX2 desktop. I've not been able to send this to the listed inotify maintainers, I just get mail failures. So I guessed filesystem was the best home for it, hope that's ok. regards Richard Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:37 -05:00
Kentaro Takeda	be6d3e56a6	introduce new LSM hooks where vfsmount is available. Add new LSM hooks for path-based checks. Call them on directory-modifying operations at the points where we still know the vfsmount involved. Signed-off-by: Kentaro Takeda <takedakn@nttdata.co.jp> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Toshiharu Harada <haradats@nttdata.co.jp> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-12-31 18:07:37 -05:00
Linus Torvalds	db200df0b3	Merge branch 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sparseirq: move __weak symbols into separate compilation unit sparseirq: work around __weak alias bug sparseirq: fix hang with !SPARSE_IRQ sparseirq: set lock_class for legacy irq when sparse_irq is selected sparseirq: work around compiler optimizing away __weak functions sparseirq: fix desc->lock init sparseirq: do not printk when migrating IRQ descriptors sparseirq: remove duplicated arch_early_irq_init() irq: simplify for_each_irq_desc() usage proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c irq: for_each_irq_desc() move to irqnr.h hrtimer: remove #include <linux/irq.h>	2008-12-31 09:00:59 -08:00
Christian Borntraeger	e3a2a0d4e5	anon_inodes: use fops->owner for module refcount There is an imbalance for anonymous inodes. If the fops->owner field is set, the module reference count of owner is decreases on release. ("filp_close" --> "__fput" ---> "fops_put") On the other hand, anon_inode_getfd does not increase the module reference count of owner. This causes two problems: - if owner is set, the module refcount goes negative - if owner is not set, the module can be unloaded while code is running This patch changes anon_inode_getfd to be symmetric regarding fops->owner handling. I have checked all existing users of anon_inode_getfd. Noone sets fops->owner, thats why nobody has seen the module refcount negative. The refcounting was tested with a patched and unpatched KVM module.(see patch 2/2) I also did an epoll_open/close test. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:44 +02:00
Rusty Russell	2ca1a61583	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: arch/x86/kernel/io_apic.c	2008-12-31 23:05:57 +10:30
Artem Bityutskiy	8e5033adc7	UBIFS: add more useful debugging prints Print node sizes and maximum node sizes. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	5d38b3ac78	UBIFS: print debugging messages properly We cannot use ubifs_err() macro with DBGKEY() and DBGKEY1(), because this is racy and holding dbg_lock is needed. Use dbg_err() instead, which does have the lock held. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	80736d41f8	UBIFS: fix numerous spelling mistakes Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	57a450e959	UBIFS: allow mounting when short of space It is fine if there is not free space - we should still allow mounting this FS. This patch relaxes the free space requirements and adds info dumps. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	a9f2fc0e25	UBIFS: fix writing uncompressed files UBIFS does not disable compression if ui->flags is non-zero, e.g. if the file has "sync" flag. This is because of the typo which is fixed by this patch. The patch also adds a couple of useful debugging prints. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	f92b982680	UBIFS: fix checkpatch.pl warnings These are mostly long lines and wrong indentation warning fixes. But also there are two volatile variables and checkpatch.pl complains about them: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt + volatile int gc_seq; WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt + volatile int gced_lnum; Well, we anyway use smp_wmb() for c->gc_seq and c->gced_lnum, so these 'volatile' modifiers can be just dropped. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	6a4a9b438f	UBIFS: fix sparse warnings fs/ubifs/compress.c:111:8: warning: incorrect type in argument 5 (different signedness) fs/ubifs/compress.c:111:8: expected unsigned int dlen fs/ubifs/compress.c:111:8: got int out_len fs/ubifs/compress.c:175:10: warning: incorrect type in argument 5 (different signedness) fs/ubifs/compress.c:175:10: expected unsigned int dlen fs/ubifs/compress.c:175:10: got int out_len Fix this by adding a cast to (unsigned int *). We guarantee that our lengths are small and no overflow is possible. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	2acf806758	UBIFS: simplify make_free_space The 'make_free_space()' function was too complex and this patch simplifies it. It also fixes a bug - the freespace test failed straight away on UBI volumes with 512 bytes LEB size. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	2edc2025c2	UBIFS: do not lie about used blocks Do not force UBIFS return 0 used space when it is empty. It leads to a situation when creating any file immediately produces tens of used blocks, which looks very weird. It is better to be honest and say that some blocks are used even if the FS is empty. And ext2 does the same. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	6edbfafda6	UBIFS: restore budg_uncommitted_idx UBIFS stores uncommitted index size in c->budg_uncommitted_idx, and this affect budgeting calculations. When mounting and replaying, this variable is not updated, so we may end up with "over-budgeting". This patch fixes the issue. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	26d05777b0	UBIFS: always commit on unmount UBIFS commits on unmount to make the next mount faster. Currently, it commits only if there is more than LEB size bytes in the journal. This is not very good, because journal size may be large (512KiB). And there may be few deletions in the journal which do not take much journal space, but which do introduce a lot of TNC changes and make mount slow. Thus, jurt remove this condition and always commit. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	cb5c6a2b2b	UBIFS: use ubi_sync UBI now has (fake for now, though) synchronization call - use it. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	f10383006c	UBIFS: always commit in sync_fs Always run commit in sync_fs, because even if the journal seems to be almost empty, there may be a deletion which removes a large file, which affects the index greatly. And because we want better free space predictions after 'sync_fs()', we have to commit. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	304d427cd9	UBIFS: fix file-system synchronization Argh. The ->sync_fs call is called _before_ all inodes are flushed. This means we first sync write buffers and commit, then all inodes are synced, and we end up with unflushed write buffers! Fix this by forcing synching all indoes from 'ubifs_sync_fs()'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:24 +02:00
Artem Bityutskiy	79807d075a	UBIFS: fix constants initialization The c->min_idx_lebs constant depends on c->old_idx_sz, which is read from the master node. This means that we have to initialize c->min_idx_lebs only after we have read the master node. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:23 +02:00
Linus Torvalds	ec270e59a7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/hirofumi/fatfs-2.6: fat: make sure to set d_ops in fat_get_parent fat: fix duplicate addition of ->llseek handler fat: drop negative dentry on rename() path	2008-12-30 20:33:34 -08:00
Linus Torvalds	6a94cb7306	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: (184 commits) [XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI [XFS] handle unaligned data in xfs_bmbt_disk_get_all [XFS] avoid memory allocations in xfs_fs_vcmn_err [XFS] Fix speculative allocation beyond eof [XFS] Remove XFS_BUF_SHUT() and friends [XFS] Use the incore inode size in xfs_file_readdir() [XFS] set b_error from bio error in xfs_buf_bio_end_io [XFS] use inode_change_ok for setattr permission checking [XFS] add a FMODE flag to make XFS invisible I/O less hacky [XFS] resync headers with libxfs [XFS] simplify projid check in xfs_rename [XFS] replace b_fspriv with b_mount [XFS] Remove unused tracing code [XFS] Remove unnecessary assertion [XFS] Remove unused variable in ktrace_free() [XFS] Check return value of xfs_buf_get_noaddr() [XFS] Fix hang after disallowed rename across directory quota domains [XFS] Fix compile with CONFIG_COMPAT enabled move inode tracing out of xfs_vnode. move vn_iowait / vn_iowake into xfs_aops.c ...	2008-12-30 17:48:25 -08:00
Linus Torvalds	f57fa1d6a6	Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (70 commits) fs/nfs/nfs4proc.c: make nfs4_map_errors() static rpc: add service field to new upcall rpc: add target field to new upcall nfsd: support callbacks with gss flavors rpc: allow gss callbacks to client rpc: pass target name down to rpc level on callbacks nfsd: pass client principal name in rsc downcall rpc: implement new upcall rpc: store pointer to pipe inode in gss upcall message rpc: use count of pipe openers to wait for first open rpc: track number of users of the gss upcall pipe rpc: call release_pipe only on last close rpc: add an rpc_pipe_open method rpc: minor gss_alloc_msg cleanup rpc: factor out warning code from gss_pipe_destroy_msg rpc: remove unnecessary assignment NFS: remove unused status from encode routines NFS: increment number of operations in each encode routine NFS: fix comment placement in nfs4xdr.c NFS: fix tabs in nfs4xdr.c ...	2008-12-30 17:45:45 -08:00
Linus Torvalds	14eeee88bf	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6: jfs: ensure symlinks are NUL-terminated	2008-12-30 17:33:33 -08:00
Linus Torvalds	1dff81f20c	Merge branch 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.29' of git://git.kernel.dk/linux-2.6-block: (43 commits) bio: get rid of bio_vec clearing bounce: don't rely on a zeroed bio_vec list cciss: simplify parameters to deregister_disk function cfq-iosched: fix race between exiting queue and exiting task loop: Do not call loop_unplug for not configured loop device. loop: Flush possible running bios when loop device is released. alpha: remove dead BIO_VMERGE_BOUNDARY Get rid of CONFIG_LSF block: make blk_softirq_init() static block: use min_not_zero in blk_queue_stack_limits block: add one-hit cache for disk partition lookup cfq-iosched: remove limit of dispatch depth of max 4 times quantum nbd: tell the block layer that it is not a rotational device block: get rid of elevator_t typedef aio: make the lookup_ioctx() lockless bio: add support for inlining a number of bio_vecs inside the bio bio: allow individual slabs in the bio_set bio: move the slab pointer inside the bio_set bio: only mempool back the largest bio_vec slab cache block: don't use plugging on SSD devices ...	2008-12-30 17:20:05 -08:00
Linus Torvalds	179475a3b4	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, sparseirq: clean up Kconfig entry x86: turn CONFIG_SPARSE_IRQ off by default sparseirq: fix numa_migrate_irq_desc dependency and comments sparseirq: add kernel-doc notation for new member in irq_desc, -v2 locking, irq: enclose irq_desc_lock_class in CONFIG_LOCKDEP sparseirq, xen: make sure irq_desc is allocated for interrupts sparseirq: fix !SMP building, #2 x86, sparseirq: move irq_desc according to smp_affinity, v7 proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ sparse irqs: add irqnr.h to the user headers list sparse irqs: handle !GENIRQ platforms sparseirq: fix !SMP && !PCI_MSI && !HT_IRQ build sparseirq: fix Alpha build failure sparseirq: fix typo in !CONFIG_IO_APIC case x86, MSI: pass irq_cfg and irq_desc x86: MSI start irq numbering from nr_irqs_gsi x86: use NR_IRQS_LEGACY sparse irq_desc[] array: core kernel and x86 changes genirq: record IRQ_LEVEL in irq_desc[] irq.h: remove padding from irq_desc on 64bits	2008-12-30 16:20:19 -08:00
Linus Torvalds	bb758e9637	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimers: fix warning in kernel/hrtimer.c x86: make sure we really have an hpet mapping before using it x86: enable HPET on Fujitsu u9200 linux/timex.h: cleanup for userspace posix-timers: simplify de_thread()->exit_itimers() path posix-timers: check ->it_signal instead of ->it_pid to validate the timer posix-timers: use "struct pid" instead of "struct task_struct" nohz: suppress needless timer reprogramming clocksource, acpi_pm.c: put acpi_pm_read_slow() under CONFIG_PCI nohz: no softirq pending warnings for offline cpus hrtimer: removing all ur callback modes, fix hrtimer: removing all ur callback modes, fix hotplug hrtimer: removing all ur callback modes x86: correct link to HPET timer specification rtc-cmos: export second NVRAM bank Fixed up conflicts in sound/drivers/pcsp/pcsp.c and sound/core/hrtimer.c manually.	2008-12-30 16:16:21 -08:00
Trond Myklebust	08cc36cbd1	Merge branch 'devel' into next	2008-12-30 16:51:43 -05:00
WANG Cong	46f72f57d2	fs/nfs/nfs4proc.c: make nfs4_map_errors() static nfs4_map_errors() can become static. Signed-off-by: WANG Cong <wangcong@zeuux.org> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-30 16:35:55 -05:00
Rusty Russell	cb78a0ce69	bitmap: fix seq_bitmap and seq_cpumask to take const pointer Impact: cleanup seq_bitmap just calls bitmap_scnprintf on the bits: that arg can be const. Similarly, seq_cpumask just calls seq_bitmap. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-12-30 09:05:14 +10:30
Jens Axboe	d3f761104b	bio: get rid of bio_vec clearing We don't need to clear the memory used for adding bio_vec entries, since nobody should be looking at members unitialized. Any valid use should be below bio->bi_vcnt, and that members up until that count must be valid since they were added through bio_add_page(). Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:53 +01:00
Jens Axboe	b3a6ffe16b	Get rid of CONFIG_LSF We have two seperate config entries for large devices/files. One is CONFIG_LBD that guards just the devices, the other is CONFIG_LSF that handles large files. This doesn't make a lot of sense, you typically want both or none. So get rid of CONFIG_LSF and change CONFIG_LBD wording to indicate that it covers both. Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:51 +01:00
Jens Axboe	abf137dd77	aio: make the lookup_ioctx() lockless The mm->ioctx_list is currently protected by a reader-writer lock, so we always grab that lock on the read side for doing ioctx lookups. As the workload is extremely reader biased, turn this into an rcu hlist so we can make lookup_ioctx() lockless. Get rid of the rwlock and use a spinlock for providing update side exclusion. There's usually only 1 entry on this list, so it doesn't make sense to look into fancier data structures. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	392ddc3298	bio: add support for inlining a number of bio_vecs inside the bio When we go and allocate a bio for IO, we actually do two allocations. One for the bio itself, and one for the bi_io_vec that holds the actual pages we are interested in. This feature inlines a definable amount of io vecs inside the bio itself, so we eliminate the bio_vec array allocation for IO's up to a certain size. It defaults to 4 vecs, which is typically 16k of IO. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:50 +01:00
Jens Axboe	bb799ca020	bio: allow individual slabs in the bio_set Instead of having a global bio slab cache, add a reference to one in each bio_set that is created. This allows for personalized slabs in each bio_set, so that they can have bios of different sizes. This means we can personalize the bios we return. File systems may want to embed the bio inside another structure, to avoid allocation more items (and stuffing them in ->bi_private) after the get a bio. Or we may want to embed a number of bio_vecs directly at the end of a bio, to avoid doing two allocations to return a bio. This is now possible. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:29:23 +01:00
Jens Axboe	1b43449869	bio: move the slab pointer inside the bio_set In preparation for adding differently sized bios. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Jens Axboe	7ff9345ffa	bio: only mempool back the largest bio_vec slab cache We only very rarely need the mempool backing, so it makes sense to get rid of all but one of the mempool in a bio_set. So keep the largest bio_vec count mempool so we can always honor the largest allocation, and "upgrade" callers that fail. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:46 +01:00
Keith Mannthey	08bafc0341	block: Supress Buffer I/O errors when SCSI REQ_QUIET flag set Allow the scsi request REQ_QUIET flag to be propagated to the buffer file system layer. The basic ideas is to pass the flag from the scsi request to the bio (block IO) and then to the buffer layer. The buffer layer can then suppress needless printks. This patch declutters the kernel log by removed the 40-50 (per lun) buffer io error messages seen during a boot in my multipath setup . It is a good chance any real errors will be missed in the "noise" it the logs without this patch. During boot I see blocks of messages like " __ratelimit: 211 callbacks suppressed Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242847 Buffer I/O error on device sdm, logical block 1 Buffer I/O error on device sdm, logical block 5242878 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block 5242879 Buffer I/O error on device sdm, logical block `5242872` " in my logs. My disk environment is multipath fiber channel using the SCSI_DH_RDAC code and multipathd. This topology includes an "active" and "ghost" path for each lun. IO's to the "ghost" path will never complete and the SCSI layer, via the scsi device handler rdac code, quick returns the IOs to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi layer messages. I am wanting to extend the QUIET behavior to include the buffer file system layer to deal with these errors as well. I have been running this patch for a while now on several boxes without issue. A few runs of bonnie++ show no noticeable difference in performance in my setup. Thanks for John Stultz for the quiet_error finalization. Submitted-by: Keith Mannthey <kmannth@us.ibm.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2008-12-29 08:28:44 +01:00
Lachlan McIlroy	0a8c5395f9	[XFS] Fix merge failures Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: fs/xfs/linux-2.6/xfs_cred.h fs/xfs/linux-2.6/xfs_globals.h fs/xfs/linux-2.6/xfs_ioctl.c fs/xfs/xfs_vnodeops.h Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-29 16:47:18 +11:00
Linus Torvalds	3c92ec8ae9	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions	2008-12-28 16:54:33 -08:00
Stephen Rothwell	bf66542bef	cifs: update for new IP4/6 address printing Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-12-28 16:29:58 -08:00
Linus Torvalds	0191b625ca	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).	2008-12-28 12:49:40 -08:00
Linus Torvalds	54a696bd07	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (31 commits) [CIFS] Remove redundant test [CIFS] make sure that DFS pathnames are properly formed Remove an already-checked error condition in SendReceiveBlockingLock Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition [CIFS] Streamline SendReceive[2] by using "goto out:" in an error condition Slightly streamline SendReceive[2] Check the return value of cifs_sign_smb[2] [CIFS] Cleanup: Move the check for too large R/W requests [CIFS] Slightly simplify wait_for_free_request(), remove an unnecessary "else" branch Simplify allocate_mid() slightly: Remove some unnecessary "else" branches [CIFS] In SendReceive, move consistency check out of the mutexed region cifs: store password in tcon cifs: have calc_lanman_hash take more granular args cifs: zero out session password before freeing it cifs: fix wait_for_response to time out sleeping processes correctly [CIFS] Can not mount with prefixpath if root directory of share is inaccessible [CIFS] various minor cleanups pointed out by checkpatch script [CIFS] fix typo [CIFS] remove sparse warning ... Fix trivial conflict in fs/cifs/cifs_fs_sb.h due to comment changes for the CIFS_MOUNT_xyz bit definitions between cifs updates and security updates.	2008-12-28 12:37:14 -08:00
Linus Torvalds	1db2a5c11e	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits) [S390] provide documentation for hvc_iucv kernel parameter. [S390] convert ctcm printks to dev_xxx and pr_xxx macros. [S390] convert zfcp printks to pr_xxx macros. [S390] convert vmlogrdr printks to pr_xxx macros. [S390] convert zfcp dumper printks to pr_xxx macros. [S390] convert cpu related printks to pr_xxx macros. [S390] convert qeth printks to dev_xxx and pr_xxx macros. [S390] convert sclp printks to pr_xxx macros. [S390] convert iucv printks to dev_xxx and pr_xxx macros. [S390] convert ap_bus printks to pr_xxx macros. [S390] convert dcssblk and extmem printks messages to pr_xxx macros. [S390] convert monwriter printks to pr_xxx macros. [S390] convert s390 debug feature printks to pr_xxx macros. [S390] convert monreader printks to pr_xxx macros. [S390] convert appldata printks to pr_xxx macros. [S390] convert setup printks to pr_xxx macros. [S390] convert hypfs printks to pr_xxx macros. [S390] convert time printks to pr_xxx macros. [S390] convert cpacf printks to pr_xxx macros. [S390] convert cio printks to pr_xxx macros. ...	2008-12-28 12:33:21 -08:00
Linus Torvalds	a39b863342	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) sched: fix warning in fs/proc/base.c schedstat: consolidate per-task cpu runtime stats sched: use RCU variant of list traversal in for_each_leaf_rt_rq() sched, cpuacct: export percpu cpuacct cgroup stats sched, cpuacct: refactoring cpuusage_read / cpuusage_write sched: optimize update_curr() sched: fix wakeup preemption clock sched: add missing arch_update_cpu_topology() call sched: let arch_update_cpu_topology indicate if topology changed sched: idle_balance() does not call load_balance_newidle() sched: fix sd_parent_degenerate on non-numa smp machine sched: add uid information to sched_debug for CONFIG_USER_SCHED sched: move double_unlock_balance() higher sched: update comment for move_task_off_dead_cpu sched: fix inconsistency when redistribute per-cpu tg->cfs_rq shares sched/rt: removed unneeded defintion sched: add hierarchical accounting to cpu accounting controller sched: include group statistics in /proc/sched_debug sched: rename SCHED_NO_NO_OMIT_FRAME_POINTER => SCHED_OMIT_FRAME_POINTER sched: clean up SCHED_CPUMASK_ALLOC ...	2008-12-28 12:27:58 -08:00
Linus Torvalds	b0f4b285d7	Merge branch 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits) sched, trace: update trace_sched_wakeup() tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3 Revert "x86: disable X86_PTRACE_BTS" ring-buffer: prevent false positive warning ring-buffer: fix dangling commit race ftrace: enable format arguments checking x86, bts: memory accounting x86, bts: add fork and exit handling ftrace: introduce tracing_reset_online_cpus() helper tracing: fix warnings in kernel/trace/trace_sched_switch.c tracing: fix warning in kernel/trace/trace.c tracing/ring-buffer: remove unused ring_buffer size trace: fix task state printout ftrace: add not to regex on filtering functions trace: better use of stack_trace_enabled for boot up code trace: add a way to enable or disable the stack tracer x86: entry_64 - introduce FTRACE_ frame macro v2 tracing/ftrace: add the printk-msg-only option tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp() x86, bts: correctly report invalid bts records ... Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits being already partly merged by the SH merge.	2008-12-28 12:21:10 -08:00
KOSAKI Motohiro	26ddd8d5ca	proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c Impact: cleanup irq_desc can be NULL when CONFIG_SPARSE_IRQ=y only. therefore, NULL checking can move into kstat_irqs_cpu() of SPARSE_IRQ version. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: "Yinghai Lu" <yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-26 09:48:18 +01:00
Julia Lawall	359d67d6ad	[CIFS] Remove redundant test In fs/cifs/cifssmb.c, pLockData is tested for being NULL at the beginning of the function, and not reassigned subsequently. A simplified version of the semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:13 +00:00
Steve French	c6fbba0546	[CIFS] make sure that DFS pathnames are properly formed The paths in a DFS request are supposed to only have a single preceding backslash, but we are sending them with a double backslash. This is exposing a bug in Windows where it also sends a path in the response that has a double backslash. The existing code that builds the mount option string however expects a double backslash prefix in a couple of places when it tries to use the path returned by build_path_from_dentry. Fix compose_mount_options to expect properly formed DFS paths (single backslash at front). Also clean up error handling in that function. There was a possible NULL pointer dereference and situations where a partially built option string would be returned. Tested against Samba 3.0.28-ish server and Samba 3.3 and Win2k8. CC: Stable <stable@kernel.org> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:13 +00:00
Volker Lendecke	ac6a3ef405	Remove an already-checked error condition in SendReceiveBlockingLock Remove an already-checked error condition in SendReceiveBlockingLock Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:13 +00:00
Volker Lendecke	698e96a826	Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:13 +00:00
Volker Lendecke	17c8bfed8a	Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition Streamline SendReceiveBlockingLock: Use "goto out:" in an error condition Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Steve French	2b2bdfba7a	[CIFS] Streamline SendReceive[2] by using "goto out:" in an error condition Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Volker Lendecke	8e4f2e8a1e	Slightly streamline SendReceive[2] Slightly streamline SendReceive[2] Remove an else branch by naming the error condition what it is Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Volker Lendecke	829049cbb1	Check the return value of cifs_sign_smb[2] Check the return value of cifs_sign_smb[2] Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Steve French	4c3130efda	[CIFS] Cleanup: Move the check for too large R/W requests This avoids an unnecessary else branch Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Volker Lendecke	27a97a613b	[CIFS] Slightly simplify wait_for_free_request(), remove an unnecessary "else" branch This is no functional change, because in the "if" branch we do an early "return 0;". Signed-off-by: Volker Lendecke <vl@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Volker Lendecke	8fbbd365cc	Simplify allocate_mid() slightly: Remove some unnecessary "else" branches Simplify allocate_mid() slightly: Remove some unnecessary "else" branches Signed-off-by: Volker Lendecke <vl@samba.org> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:12 +00:00
Volker Lendecke	6d9c6d5431	[CIFS] In SendReceive, move consistency check out of the mutexed region inbuf->smb_buf_length does not change in in wait_for_free_request() or in allocate_mid(), so we can check it early. Signed-off-by: Volker Lendecke <vl@samba.org> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Jeff Layton	00e485b019	cifs: store password in tcon cifs: store password in tcon Each tcon has its own password for share-level security. Store it in the tcon and wipe it clean and free it when freeing the tcon. When doing the tree connect with share-level security, use the tcon password instead of the session password. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Jeff Layton	4e53a3fb98	cifs: have calc_lanman_hash take more granular args cifs: have calc_lanman_hash take more granular args We need to use this routine to encrypt passwords associated with the tcon too. Don't assume that the password will be attached to the smb_session. Also, make some of the values in the lower encryption functions const since they aren't changed. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Jeff Layton	55162dec93	cifs: zero out session password before freeing it cifs: zero out session password before freeing it ...just to be on the safe side. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Jeff Layton	8570552425	cifs: fix wait_for_response to time out sleeping processes correctly cifs: fix wait_for_response to time out sleeping processes correctly The current scheme that CIFS uses to sleep and wait for a response is not quite what we want. After sending a request, wait_for_response puts the task to sleep with wait_event(). One of the conditions for wait_event is a timeout (using time_after()). The problem with this is that there is no guarantee that the process will ever be woken back up. If the server stops sending data, then cifs_demultiplex_thread will leave its response queue sleeping. I think the only thing that saves us here is the fact that cifs_dnotify_thread periodically (every 15s) wakes up sleeping processes on all response_q's that have calls in flight. This makes for unnecessary wakeups of some processes. It also means large variability in the timeouts since they're all woken up at once. Instead of this, put the tasks to sleep with wait_event_timeout. This makes them wake up on their own if they time out. With this change, cifs_dnotify_thread should no longer be needed. I've been testing this in conjunction with some other patches that I'm working on. It doesn't seem to affect performance at all with with heavy I/O. Identical iozone -ac runs complete in almost exactly the same time (<1% difference in times). Thanks to Wasrshi Nimara for initially pointing this out. Wasrshi, it would be nice to know whether this patch also helps your testcase. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Wasrshi Nimara <warshinimara@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Steve French	8be0ed44c2	[CIFS] Can not mount with prefixpath if root directory of share is inaccessible Windows allows you to deny access to the top of a share, but permit access to a directory lower in the path. With the prefixpath feature of cifs (ie mounting \\server\share\directory\subdirectory\etc.) this should have worked if the user specified a prefixpath which put the root of the mount at a directory to which he had access, but we still were doing a lookup on the root of the share (null path) when we should have been doing it on the prefixpath subdirectory. This fixes Samba bug # 5925 Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:11 +00:00
Steve French	61e7480158	[CIFS] various minor cleanups pointed out by checkpatch script Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	3de2091ac7	[CIFS] fix typo Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	acc18aa1e6	[CIFS] remove sparse warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Steve French	13a6e42af8	[CIFS] add mount option to send mandatory rather than advisory locks Some applications/subsystems require mandatory byte range locks (as is used for Windows/DOS/OS2 etc). Sending advisory (posix style) byte range lock requests (instead of mandatory byte range locks) can lead to problems for these applications (which expect that other clients be prevented from writing to portions of the file which they have locked and are updating). This mount option allows mounting cifs with the new mount option "forcemand" (or "forcemandatorylock") in order to have the cifs client use mandatory byte range locks (ie SMB/CIFS/Windows/NTFS style locks) rather than posix byte range lock requests, even if the server would support posix byte range lock requests. This has no effect if the server does not support the CIFS Unix Extensions (since posix style locks require support for the CIFS Unix Extensions), but for mounts to Samba servers this can be helpful for Wine and applications that require mandatory byte range locks. Acked-by: Jeff Layton <jlayton@redhat.com> CC: Alexander Bokovoy <ab@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	d5c5605c27	cifs: make ipv6_connect take a TCP_Server_Info arg Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	bcf4b1063d	cifs: make ipv4_connect take a TCP_Server_Info arg In order to unify the smb_send routines, we need to reorganize the routines that connect the sockets. Have ipv4_connect take a TCP_Server_Info pointer and get the necessary fields from that. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	7586b76585	cifs: don't declare smb_vol info on the stack struct smb_vol is fairly large, it's probably best to kzalloc it... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	63c038c297	cifs: move allocation of new TCP_Server_Info into separate function Clean up cifs_mount a bit by moving the code that creates new TCP sessions into a separate function. Have that function search for an existing socket and then create a new one if one isn't found. Also reorganize the initializion of TCP_Server_Info a bit to prepare for cleanup of the socket connection code. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:10 +00:00
Jeff Layton	8ecaf67a8e	cifs: account for IPv6 in ses->serverName and clean up netbios name handling The current code for setting the session serverName is IPv4-specific. Allow it to be an IPv6 address as well. Use NIP* macros to set the format. This also entails increasing the length of the serverName field, so declare a new macro for RFC1001 name length and use it in the appropriate places. Finally, drop the unicode_server_Name field from TCP_Server_Info since it's not used. We can add it back later if needed, but for now it just wastes memory. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	954d7a1cf1	cifs: make dnotify thread experimental code Now that tasks sleeping in wait_for_response will time out on their own, we're not reliant on the dnotify thread to do this. Mark it as experimental code for now. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	72ca545b2d	cifs: convert tcpSem to a mutex Mutexes are preferred for single-holder semaphores... Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	0468a2cf91	cifs: take module reference when starting cifsd cifsd can outlive the last cifs mount. We need to hold a module reference until it exits to prevent someone from unplugging the module until we're ready. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	80909022ce	cifs: display addr and prefixpath options in /proc/mounts Have cifs_show_options display the addr and prefixpath options in /proc/mounts. Reduce struct dereferencing by adding some local variables. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Jeff Layton	24b9b06ba7	cifs: remove unused SMB session pointer from struct mid_q_entry Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2008-12-26 02:29:09 +00:00
Ingo Molnar	32e8d18683	Merge branches 'timers/clocksource', 'timers/hpet', 'timers/hrtimers', 'timers/nohz', 'timers/ntp', 'timers/posixtimers' and 'timers/rtc' into timers/core	2008-12-25 18:02:25 +01:00
Ingo Molnar	860cf8894b	Merge branches 'irq/sparseirq', 'irq/genirq' and 'irq/urgent'; commit 'v2.6.28' into irq/core	2008-12-25 16:27:54 +01:00
Ingo Molnar	4e202284e6	Merge branch 'sched/urgent'; commit 'v2.6.28' into sched/core	2008-12-25 13:42:23 +01:00
Martin Schwidefsky	fc5243d98a	[S390] arch_setup_additional_pages arguments arch_setup_additional_pages currently gets two arguments, the binary format descripton and an indication if the process uses an executable stack or not. The second argument is not used by anybody, it could be removed without replacement. What actually does make sense is to pass an indication if the process uses the elf interpreter or not. The glibc code will not use anything from the vdso if the process does not use the dynamic linker, so for statically linked binaries the architecture backend can choose not to map the vdso. Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-12-25 13:38:54 +01:00
James Morris	cbacc2c7f0	Merge branch 'next' into for-linus	2008-12-25 11:40:09 +11:00
Ingo Molnar	db8862eafe	Merge branch 'linus' into tracing/hw-branch-tracing	2008-12-24 21:08:26 +01:00
Lachlan McIlroy	25051158bb	[XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI The iolock is dropped and re-acquired around the call to XFS_SEND_NAMESP(). While the iolock is released the file can become cached. We then 'goto retry' and - if we are doing direct I/O - mapping->nrpages may now be non zero but need_i_mutex will be zero and we will hit the WARN_ON(). Since we have dropped the I/O lock then the file size may have also changed so what we need to do here is 'goto start' like we do for the XFS_SEND_DATA() DMAPI event. We also need to update the filesize before releasing the iolock so that needs to be done before the XFS_SEND_NAMESP event. If we drop the iolock before setting the filesize we could race with a truncate. Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2008-12-24 14:07:32 +11:00
Olga Kornievskaia	61054b14d5	nfsd: support callbacks with gss flavors This patch adds server-side support for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:19:00 -05:00
Olga Kornievskaia	945b34a772	rpc: allow gss callbacks to client This patch adds client-side support to allow for callbacks other than AUTH_SYS. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:18:34 -05:00
Olga Kornievskaia	608207e888	rpc: pass target name down to rpc level on callbacks The rpc client needs to know the principal that the setclientid was done as, so it can tell gssd who to authenticate to. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:40 -05:00
Olga Kornievskaia	68e76ad0ba	nfsd: pass client principal name in rsc downcall Two principals are involved in krb5 authentication: the target, who we authenticate to (normally the name of the server, like nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we authenticate as (normally a user, like bfields@UMICH.EDU) In the case of NFSv4 callbacks, the target of the callback should be the source of the client's setclientid call, and the source should be the nfs server's own principal. Therefore we allow svcgssd to pass down the name of the principal that just authenticated, so that on setclientid we can store that principal name with the new client, to be used later on callbacks. Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:17:15 -05:00
Andy Adamson	cf8cdbe5bd	NFS: remove unused status from encode routines Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:18 -05:00
Andy Adamson	d017931cff	NFS: increment number of operations in each encode routine Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:17 -05:00
Benny Halevy	49c2559e29	NFS: fix comment placement in nfs4xdr.c Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:16 -05:00
Andy Adamson	05d564fe00	NFS: fix tabs in nfs4xdr.c Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:15 -05:00
Andy Adamson	6c0195a468	NFS: remove white space from nfs4xdr.c Clean-up Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:15 -05:00
Benny Halevy	374130770e	nfs: remove incorrect usage of nfs4 compound response hdr.status 3 call sites look at hdr.status before returning success. hdr.status must be zero in this case so there's no point in this. Currently, hdr.status is correctly processed at decode_op_hdr time if the op status cannot be decoded. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:14 -05:00
Benny Halevy	aadf615211	nfs: return compound hdr.status when there are no op replies When there are no op replies encoded in the compound reply hdr.status still contains the overall status of the compound rpc. This can happen, e.g., when the server returns a NFS4ERR_MINOR_VERS_MISMATCH error. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:06:13 -05:00
Trond Myklebust	027b6ca021	NFSv4: Fix an infinite loop in the NFS state recovery code Marten Gajda <marten.gajda@fernuni-hagen.de> states: I tracked the problem down to the function nfs4_do_open_expired. Within this function _nfs4_open_expired is called and may return -NFS4ERR_DELAY. When a further call to _nfs4_open_expired is executed and does not return -NFS4ERR_DELAY the "exception.retry" variable is not reset to 0, causing the loop to iterate again (and as long as err != -NFS4ERR_DELAY, probably forever) Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 16:04:13 -05:00
Peter Staubach	64672d55d9	optimize attribute timeouts for "noac" and "actimeo=0" Hi. I've been looking at a bugzilla which describes a problem where a customer was advised to use either the "noac" or "actimeo=0" mount options to solve a consistency problem that they were seeing in the file attributes. It turned out that this solution did not work reliably for them because sometimes, the local attribute cache was believed to be valid and not timed out. (With an attribute cache timeout of 0, the cache should always appear to be timed out.) In looking at this situation, it appears to me that the problem is that the attribute cache timeout code has an off-by-one error in it. It is assuming that the cache is valid in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo]. The cache should be considered valid only in the region, [read_cache_jiffies, read_cache_jiffies + attrtimeo). With this change, the options, "noac" and "actimeo=0", work as originally expected. This problem was previously addressed by special casing the attrtimeo == 0 case. However, since the problem is only an off- by-one error, the cleaner solution is address the off-by-one error and thus, not require the special case. Thanx... ps Signed-off-by: Peter Staubach <staubach@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	dc0b027dfa	NFSv4: Convert the open and close ops to use fmode Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:56 -05:00
Trond Myklebust	7a50c60e46	NFS: Use delegations to optimise ACCESS calls Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:55 -05:00
Trond Myklebust	15860ab1d7	NFSv4: Ensure that we set the verifier when revalidating delegated dentries This ensures that we don't have to look up the dentry again after we return the delegation if we know that the directory didn't change. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:54 -05:00
Trond Myklebust	5584c30630	NFSv4: Clean up is_atomic_open() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:54 -05:00
Trond Myklebust	bd7bf9d540	NFSv4: Convert delegation->type field to fmode_t Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:53 -05:00
Trond Myklebust	9082a5cc1e	NFSv4: Fix up delegation callbacks Currently, the callback server is listening on IPv6 if it is enabled. This means that IPv4 addresses will always be mapped. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:53 -05:00
Trond Myklebust	b7391f44f2	NFSv4: Return unreferenced delegations more promptly If the client is not using a delegation, the right thing to do is to return it as soon as possible. This helps reduce the amount of state the server has to track, as well as reducing the potential for conflicts with other clients. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:52 -05:00
Trond Myklebust	6411bd4a47	NFSv4: Clean up the asynchronous delegation return Reuse the state management thread in order to return delegations when we get a callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:51 -05:00
Trond Myklebust	b0d3ded1a2	NFSv4: Clean up nfs_expire_all_delegations() Let the actual delegreturn stuff be run in the state manager thread rather than allocating a separate kthread. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:50 -05:00
Trond Myklebust	0d62f85a81	NFSv4: Fix a BAD_SEQUENCEID condition. We really shouldn't be resetting the sequence ids when doing state expiration recovery, since we don't know if the server still remembers our previous state owners. There are servers out there that do attempt to preserve client state even if the lease has expired. Such a server would only release that state if a conflicting OPEN request occurs. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:49 -05:00
Trond Myklebust	f3c76491e7	NFSv4: Don't exit the state management if there are still tasks to do Fix up a potential race... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:48 -05:00
Trond Myklebust	e005e8041c	NFSv4: Rename the state reclaimer thread It is really a more general purpose state management thread at this point. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:48 -05:00
Trond Myklebust	707fb4b324	NFSv4: Clean up NFS4ERR_CB_PATH_DOWN error management... Add a delegation cleanup phase to the state management loop, and do the NFS4ERR_CB_PATH_DOWN recovery there. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:47 -05:00
Trond Myklebust	515d861177	NFSv4: Clean up the support for returning multiple delegations Add a flag to mark delegations as requiring return, then run a garbage collector. In the future, this will allow for more flexible delegation management, where delegations may be marked for return if it turns out that they are not being referenced. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:46 -05:00
Trond Myklebust	9e33bed552	NFSv4: Add recovery for individual stateids NFSv4 defines a number of state errors which the client does not currently handle. Among those we should worry about are: NFS4ERR_ADMIN_REVOKED - the server's administrator revoked our locks and/or delegations. NFS4ERR_BAD_STATEID - the client and server are out of sync, possibly due to a delegation return racing with an OPEN request. NFS4ERR_OPENMODE - the client attempted to do something not sanctioned by the open mode of the stateid. Should normally just occur as a result of a delegation return race. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:46 -05:00
Trond Myklebust	95d35cb4c4	NFSv4: Remove nfs_client->cl_sem Now that we're using the flags to indicate state that needs to be recovered, as well as having implemented proper refcounting and spinlocking on the state and open_owners, we can get rid of nfs_client->cl_sem. The only remaining case that was dubious was the file locking, and that case is now covered by the nfsi->rwsem. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:45 -05:00
Trond Myklebust	19e03c570e	NFSv4: Ensure that file unlock requests don't conflict with state recovery The unlock path is currently failing to take the nfs_client->cl_sem read lock, and hence the recovery path may see locks disappear from underneath it. Also ensure that it takes the nfs_inode->rwsem read lock so that it there is no conflict with delegation recalls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:44 -05:00
Trond Myklebust	65de872ed6	NFS: Remove the unnecessary argument to nfs4_wait_clnt_recover() ...and move some code around in order to clear out an unnecessary forward declaration. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:44 -05:00
Trond Myklebust	fe1d81952e	NFSv4: Ensure that nfs4_reclaim_open_state() doesn't depend on cl_sem Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:43 -05:00
Trond Myklebust	7eff03aec9	NFSv4: Add a recovery marking scheme for state owners Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:43 -05:00
Trond Myklebust	0f605b5600	NFSv4: Don't tell server we rebooted when not necessary Instead of doing a full setclientid, try doing a RENEW call first. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:42 -05:00
Trond Myklebust	e598d843c0	NFSv4: Remove redundant RENEW calls if we know the lease has expired Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:42 -05:00
Trond Myklebust	b79a4a1b45	NFSv4: Fix state recovery when the client runs over the grace period If the client for some reason is not able to recover all its state within the time allotted for the grace period, and the server reboots again, the client is not allowed to recover the state that was 'lost' using reboot recovery. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:41 -05:00
Trond Myklebust	6dc9d57af9	NFSv4: Callers to nfs4_get_renew_cred() need to hold nfs_client->cl_lock Ditto for nfs4_get_setclientid_cred(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:41 -05:00
Trond Myklebust	0286001430	NFSv4: Clean up for the state loss reclaimer Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:40 -05:00
Trond Myklebust	15c831bf1a	NFS: Use atomic bitops when changing struct nfs_delegation->flags Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:39 -05:00
Trond Myklebust	86e8948998	NFSv4: Fix up the dereferencing of delegation->inode Without an extra lock, we cannot just assume that the delegation->inode is valid when we're traversing the rcu-protected nfs_client lists. Use the delegation->lock to ensure that it is truly valid. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:39 -05:00
Trond Myklebust	343104308a	NFSv4: Fix up another delegation related race When we can update_open_stateid(), we need to be certain that we don't race with a delegation return. While we could do this by grabbing the nfs_client->cl_lock, a dedicated spin lock in the delegation structure will scale better. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:38 -05:00
Chuck Lever	0cb2659b81	NLM: allow lockd requests from an unprivileged port If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's lockd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:38 -05:00
Chuck Lever	50a737f86d	NFS: "[no]resvport" mount option changes mountd client too If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's mountd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:37 -05:00
Chuck Lever	d740351bf0	NFS: add "[no]resvport" mount option The standard default security setting for NFS is AUTH_SYS. An NFS client connects to NFS servers via a privileged source port and a fixed standard destination port (2049). The client sends raw uid and gid numbers to identify users making NFS requests, and the server assumes an appropriate authority on the client has vetted these values because the source port is privileged. On Linux, by default in-kernel RPC services use a privileged port in the range between 650 and 1023 to avoid using source ports of well- known IP services. Using such a small range limits the number of NFS mount points and the number of unique NFS servers to which a client can connect concurrently. An NFS client can use unprivileged source ports to expand the range of source port numbers, allowing more concurrent server connections and more NFS mount points. Servers must explicitly allow NFS connections from unprivileged ports for this to work. In the past, bumping the value of the sunrpc.max_resvport sysctl on the client would permit the NFS client to use unprivileged ports. Bumping this setting also changes the maximum port number used by other in-kernel RPC services, some of which still required a port number less than 1023. This is exacerbated by the way source port numbers are chosen by the Linux RPC client, which starts at the top of the range and works downwards. It means that bumping the maximum means all RPC services requesting a source port will likely get an unprivileged port instead of a privileged one. Changing this setting effects all NFS mount points on a client. A sysadmin could not selectively choose which mount points would use non-privileged ports and which could not. Lastly, this mechanism of expanding the limit on the number of NFS mount points was entirely undocumented. To address the need for the NFS client to use a large range of source ports without interfering with the activity of other in-kernel RPC services, we introduce a new NFS mount option. This option explicitly tells only the NFS client to use a non-privileged source port when communicating with the NFS server for one specific mount point. This new mount option is called "resvport," like the similar NFS mount option on FreeBSD and Mac OS X. A sister patch for nfs-utils will be submitted that documents this new option in nfs(5). The default setting for this new mount option requires the NFS client to use a privileged port, as before. Explicitly specifying the "noresvport" mount option allows the NFS client to use an unprivileged source port for this mount point when connecting to the NFS server port. This mount option is supported only for text-based NFS mounts. [ Sidebar: it is widely known that security mechanisms based on the use of privileged source ports are ineffective. However, the NFS client can combine the use of unprivileged ports with the use of secure authentication mechanisms, such as Kerberos. This allows a large number of connections and mount points while ensuring a useful level of security. Eventually we may change the default setting for this option depending on the security flavor used for the mount. For example, if the mount is using only AUTH_SYS, then the default setting will be "resvport;" if the mount is using a strong security flavor such as krb5, the default setting will be "noresvport." ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Trond.Myklebust@netapp.com: Fixed a bug whereby nfs4_init_client() was being called with incorrect arguments.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2008-12-23 15:21:37 -05:00

1 2 3 4 5 ...

11274 Commits