linux_old1

Commit Graph

Author	SHA1	Message	Date
Mark Fasheh	0879c584ff	ocfs2: Allow for debugging of transaction extends The nastiest cases of transaction extends are also the rarest. We can expose them more quickly at the expense of performance by going straight to the journal_restart() in ocfs2_extend_trans(). Wrap things in OCFS2_DEBUG_FS so that we only do this when "expensive debugging" is turned on. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:51:14 -08:00
Mark Fasheh	92295d8054	ocfs2: Don't panic when truncating an empty extent This BUG_ON() was unintentionally left in after the sparse file support was written. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:51:04 -08:00
Mark Fasheh	a86370fbb6	ocfs2: fix exit-while-locked bug in ocfs2_queue_orphans() We're holding the cluster lock when a failure might happen in ocfs2_dir_foreach() so it needs to be released. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-12-17 10:49:43 -08:00
Trond Myklebust	a10db50a4a	NFS: Fix an Oops in NFS unmount Ensure that the dummy 'root dentry' is invisible to d_find_alias(). If not, then it may be spliced into the tree if a parent directory from the same filesystem gets mounted at a later time. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-12-12 11:12:15 -05:00
Trond Myklebust	a5576cfa5c	Revert "NFS: Ensure we return zero if applications attempt to write zero bytes" This reverts commit `b9148c6b80`. On Wed, 12 Dec 2007 10:57:30 -0500, Chuck Lever wrote > commit `b9148c6b` should be reverted. It was recently forward-ported > from some years-old patches, and is clearly not needed now. > > On Dec 11, 2007, at 5:21 PM, Adrian Bunk wrote: > >> This code became dead after commit >> `b9148c6b80` >> (which BTW doesn't seem to have changed any behaviour) and can >> therefore >> be removed. >> >> Spotted by the Coverity checker. >> >> Signed-off-by: Adrian Bunk <bunk@kernel.org> >> >> --- >> --- linux-2.6/fs/nfs/direct.c.old 2007-12-02 21:54:53.000000000 +0100 >> +++ linux-2.6/fs/nfs/direct.c 2007-12-02 21:55:10.000000000 +0100 >> @@ -897,15 +897,12 @@ ssize_t nfs_file_direct_write(struct kio >> if (!count) >> goto out; /* return 0 */ >> >> retval = -EINVAL; >> if ((ssize_t) count < 0) >> goto out; >> - retval = 0; >> - if (!count) >> - goto out; >> >> retval = nfs_sync_mapping(mapping); >> if (retval) >> goto out; >> >> retval = nfs_direct_write(iocb, iov, nr_segs, pos, count); >> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-12-12 11:08:33 -05:00
Trond Myklebust	5cef338b30	NFSv2/v3: Fix a memory leak when using -onolock Neil Brown said: > Hi Trond, > > We found that a machine which made moderately heavy use of > 'automount' was leaking some nfs data structures - particularly the > 4K allocated by rpc_alloc_iostats. > It turns out that this only happens with filesystems with -onolock > set. > The problem is that if NFS_MOUNT_NONLM is set, nfs_start_lockd doesn't > set server->destroy, so when the filesystem is unmounted, the > ->client_acl is not shutdown, and so several resources are still > held. Multiple mount/umount cycles will slowly eat away memory > several pages at a time. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: NeilBrown <neilb@suse.de>	2007-12-11 22:01:56 -05:00
Trond Myklebust	4584f520e1	NFS: Fix NFS mountpoint crossing... The check that was added to nfs_xdev_get_sb() to work around broken servers, works fine for NFSv2, but causes mountpoint crossing on NFSv3 to always return ESTALE. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-12-11 19:01:45 -05:00
Eric W. Biederman	3790ee4bd8	proc: remove/Fix proc generic d_revalidate Ultimately to implement /proc perfectly we need an implementation of d_revalidate because files and directories can be removed behind the back of the VFS, and d_revalidate is the only way we can let the VFS know that this has happened. Unfortunately the linux VFS can not cope with anything in the path to a mount point going away. So a proper d_revalidate method that calls d_drop also needs to call have_submounts which is moderately expensive, so you really don't want a d_revalidate method that unconditionally calls it, but instead only calls it when the backing object has really gone away. proc generic entries only disappear on module_unload (when not counting the fledgling network namespace) so it is quite rare that we actually encounter that case and has not actually caused us real world trouble yet. So until we get a proper test for keeping dentries in the dcache fix the current d_revalidate method by completely removing it. This returns us to the current status quo. So with CONFIG_NETNS=n things should look as they have always looked. For CONFIG_NETNS=y things work most of the time but there are a few rare corner cases that don't behave properly. As the network namespace is barely present in 2.6.24 this should not be a problem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "Denis V. Lunev" <den@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-10 19:43:55 -08:00
Linus Torvalds	41f81e88e0	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] Fix xfs_ichgtime()s broken usage of I_SYNC [XFS] Make xfsbufd threads freezable [XFS] revert to double-buffering readdir [XFS] Fix broken inode cluster setup. [XFS] Clear XBF_READ_AHEAD flag on I/O completion. [XFS] Fixed a few bugs in xfs_buf_associate_memory() [XFS] 971064 Various fixups for xfs_bulkstat(). [XFS] Fix dbflush panic in xfs_qm_sync.	2007-12-10 10:18:27 -08:00
David Chinner	cf10e82bdc	[XFS] Fix xfs_ichgtime()s broken usage of I_SYNC The recent I_LOCK->I_SYNC changes mistakenly changed xfs_ichgtime to look at I_SYNC instead of I_LOCK. This was incorrect and prevents newly created inodes from moving to the dirty list. Change this to the correct check which is for I_NEW, not I_LOCK or I_SYNC so that behaviour is correct. SGI-PV: 974225 SGI-Modid: xfs-linux-melb:xfs-kern:30204a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2007-12-10 13:47:56 +11:00
Rafael J. Wysocki	978c7b2ff4	[XFS] Make xfsbufd threads freezable Fix breakage caused by commit `8314418629` that did not introduce the necessary call to set_freezable() in xfs/linux-2.6/xfs_buf.c . SGI-PV: 974224 SGI-Modid: xfs-linux-melb:xfs-kern:30203a Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2007-12-10 13:47:36 +11:00
Christoph Hellwig	e89bc612d6	[XFS] revert to double-buffering readdir The current readdir implementation deadlocks on a btree buffers locks because nfsd calls back into ->lookup from the filldir callback. The only short-term fix for this is to revert to the old inefficient double-buffering scheme. SGI-PV: 973377 SGI-Modid: xfs-linux-melb:xfs-kern:30201a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2007-12-10 13:47:15 +11:00
David Chinner	a7430847fc	[XFS] Fix broken inode cluster setup. The radix tree based inode caches did away with the inode cluster hashes, replacing them with a bunch of masking and gang lookups on the radix tree. This masking got broken when moving the code to per-ag radix trees and indexing by agino # rather than straight inode number. The result is clustered inode writeback does not cluster and things can go extremely slowly when there are lots of inodes to write. Fix it up by comparing the agino # of the inode we just looked up to the index of the cluster we are looking for. Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com> SGI-PV: 972915 SGI-Modid: xfs-linux-melb:xfs-kern:30033a Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2007-12-10 13:46:59 +11:00
Lachlan McIlroy	77be55a5a1	[XFS] Clear XBF_READ_AHEAD flag on I/O completion. SGI-PV: 972554 SGI-Modid: xfs-linux-melb:xfs-kern:30128a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2007-12-10 13:46:45 +11:00
Lachlan McIlroy	d1afb678ce	[XFS] Fixed a few bugs in xfs_buf_associate_memory() - calculation of 'page_count' was incorrect as it did not consider the offset of 'mem' into the first page. The logic to bump 'page_count' didn't work if 'len' was <= PAGE_CACHE_SIZE (ie offset = 3k, len = 2k). - setting b_buffer_length to 'len' is incorrect if 'offset' is > 0. Set it to the total length of the buffer. - I suspect that passing a non-aligned address into mem_to_page() for the first page may have been causing issues - don't know but just tidy up that code anyway. SGI-PV: 971596 SGI-Modid: xfs-linux-melb:xfs-kern:30143a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>	2007-12-10 13:46:20 +11:00
Lachlan McIlroy	cd57e594ad	[XFS] 971064 Various fixups for xfs_bulkstat(). - sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]() - remove the special case for XFS_IOC_FSBULKSTAT with count == 1. This special case causes bulkstat to fail because the special case uses xfs_bulkstat_single() instead of xfs_bulkstat() and the two functions have different semantics. xfs_bulkstat() will return the next inode after the one supplied while skipping internal inodes (ie quota inodes). xfs_bulkstate_single() will only lookup the inode supplied and return an error if it is an internal inode. - in xfs_bulkstat(), need to initialise 'lastino' to the inode supplied so in cases were we return without examining any inodes the scan wont restart back at zero. - sanity check for valid *ubcountp values. Cannot sanity check for valid ubuffer here because some users of xfs_bulkstat() don't supply a buffer. - checks against 'ubleft' (the space left in the user's buffer) should be against 'statstruct_size' which is the supplied minimum object size. The mixture of checks against statstruct_size and 0 was one of the reasons we were skipping inodes. - if the formatter function returns BULKSTAT_RV_NOTHING and an error and the error is not ENOENT or EINVAL then we need to abort the scan. ENOENT is for inodes that are no longer valid and we just skip them. EINVAL is returned if we try to lookup an internal inode so we skip them too. For a DMF scan if the inode and DMF attribute cannot fit into the space left in the user's buffer it would return ERANGE. We didn't handle this error and skipped the inode. We would continue to skip inodes until one fitted into the user's buffer or we completed the scan. - put back the recalculation of agino (that got removed with the last fix) at the end of the while loop. This is because the code at the start of the loop expects agino to be the last inode examined if it is non-zero. - if we found some inodes but then encountered an error, return success this time and the error next time. If the formatter aborted with ENOMEM we will now return this error but only if we couldn't read any inodes. Previously if we encountered ENOMEM without reading any inodes we returned a zero count and no error which falsely indicated the scan was complete. SGI-PV: 973431 SGI-Modid: xfs-linux-melb:xfs-kern:30089a Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com>	2007-12-10 13:44:11 +11:00
Donald Douwsma	d757762bf2	[XFS] Fix dbflush panic in xfs_qm_sync. The recent behaviour layer removal dropped the check for quotas that have been requested at mount time but have subsequently been turned off. This results in a panic when accessing m_quotainfo which has been freed. This patch adds the check originally made by xfs_qm_syncall() to xfs_qm_sync(). SGI-PV: 969769 SGI-Modid: xfs-linux-melb:xfs-kern:29908a Signed-off-by: Donald Douwsma <donaldd@sgi.com> Signed-off-by: David Chinner <dgc@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>	2007-12-10 13:40:10 +11:00
Len Brown	f7a5274d7d	Pull suspend-2.6.24 into release branch	2007-12-06 16:26:52 -05:00
Al Viro	97bd7919e2	remove nonsense force-casts from ocfs2 endianness annotations in networking code had been in place for quite a while; in particular, sin_port and s_addr are annotated as big-endian. Code in ocfs2 had __force casts added apparently to shut the sparse warnings up; of course, these days they only serve to produce warnings for no reason whatsoever... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:25:20 -08:00
Al Viro	7e46aa5c8c	regression: bfs endianness bug BFS_FILEBLOCKS() expects struct bfs_inode * (on-disk data, with little- endian fields), not struct bfs_inode_info * (in-core stuff, with host- endian ones). It's a macro and fields with the right names are present in bfs_inode_info, so it compiles, but on big-endian host it gives bogus results. Introduced in commit `f433dc5634` ("Fixes to the BFS filesystem driver"). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:25:20 -08:00
Al Viro	9b5e6857b3	regression: cifs endianness bug access_flags_to_mode() gets on-the-wire data (little-endian) and treats it as host-endian. Introduced in commit `e01b640013` ("[CIFS] enable get mode from ACL when cifsacl mount option specified") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:25:19 -08:00
Alexey Dobriyan	5a622f2d0f	proc: fix proc_dir_entry refcounting Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! / free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); / boom! / if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Jan Kara	d4beaf4ab5	jbd: Fix assertion failure in fs/jbd/checkpoint.c Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:20 -08:00
Evgeniy Dushistov	0c664f9742	ufs: fix nexstep dir block size This patch fixes regression, introduced since 2.6.16. NextStep variant of UFS as OpenStep uses directory block size equals to 1024. Without this change, ufs_check_page fails in many cases. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru> Cc: Dave Bailey <dsbailey@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:18 -08:00
Jeff Moyer	e00ba3dae0	aio: only account I/O wait time in read_events if there are active requests On 2.6.24, top started showing 100% iowait on one CPU when a UML instance was running (but completely idle). The UML code sits in io_getevents waiting for an event to be submitted and completed. Fix this by checking ctx->reqs_active before scheduling to determine whether or not we are waiting for I/O. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Jeff Dike <jdike@addtoit.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-12-05 09:21:18 -08:00
Rafael J. Wysocki	e136e769d4	Freezer: Fix JFFS2 garbage collector freezing issue (rev. 2) Fix breakage caused by commit `d5d8c5976d` "freezer: do not send signals to kernel threads" in jffs2_garbage_collect_thread() that assumed it would be sent signals by the freezer. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Pete MacKay <armlinux@architechnical.net> Signed-off-by: Len Brown <len.brown@intel.com>	2007-12-04 01:35:41 -05:00
Linus Torvalds	8002cedc1a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/net-2.6: (27 commits) [INET]: Fix inet_diag dead-lock regression [NETNS]: Fix /proc/net breakage [TEXTSEARCH]: Do not allow zero length patterns in the textsearch infrastructure [NETFILTER]: fix forgotten module release in xt_CONNMARK and xt_CONNSECMARK [NETFILTER]: xt_TCPMSS: remove network triggerable WARN_ON [DECNET]: dn_nl_deladdr() almost always returns no error [IPV6]: Restore IPv6 when MTU is big enough [RXRPC]: Add missing select on CRYPTO mac80211: rate limit wep decrypt failed messages rfkill: fix double-mutex-locking mac80211: drop unencrypted frames if encryption is expected mac80211: Fix behavior of ieee80211_open and ieee80211_close ieee80211: fix unaligned access in ieee80211_copy_snap mac80211: free ifsta->extra_ie and clear IEEE80211_STA_PRIVACY_INVOKED SCTP: Fix build issues with SCTP AUTH. SCTP: Fix chunk acceptance when no authenticated chunks were listed. SCTP: Fix the supported extensions paramter SCTP: Fix SCTP-AUTH to correctly add HMACS paramter. SCTP: Fix the number of HB transmissions. [TCP] illinois: Incorrect beta usage ...	2007-12-03 08:15:36 -08:00
Eric W. Biederman	2b1e300a9d	[NETNS]: Fix /proc/net breakage Well I clearly goofed when I added the initial network namespace support for /proc/net. Currently things work but there are odd details visible to user space, even when we have a single network namespace. Since we do not cache proc_dir_entry dentries at the moment we can just modify ->lookup to return a different directory inode depending on the network namespace of the process looking at /proc/net, replacing the current technique of using a magic and fragile follow_link method. To accomplish that this patch: - introduces a shadow_proc method to allow different dentries to be returned from proc_lookup. - Removes the old /proc/net follow_link magic - Fixes a weakness in our not caching of proc generic dentries. As shadow_proc uses a task struct to decided which dentry to return we can go back later and fix the proc generic caching without modifying any code that uses the shadow_proc method. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2007-12-02 00:33:17 +11:00
Heiko Carstens	81257def2a	tty: add the new termios2 ioctls to the compatible list. Make them depend on TCGETS2. If that one is implemented the rest should be there as well. Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:55 -08:00
Miklos Szeredi	08b633070a	fuse: fix attribute caching after rename Invalidate attributes on rename, since some filesystems may update st_ctime. Reported by Szabolcs Szakacsits Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
John Muir	fbee36b92a	fuse: fix uninitialized field in fuse_inode I found problems accessing (executing) previously existing files, until I did chmod on them (or setattr). If the fi->attr_version is not initialized, then it could be larger than fc->attr_version until a setattr is executed, and as a result the inode attributes would never be set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	d0186b25e6	fuse: fix FUSE_FILE_OPS sending FUSE_FILE_OPS is meant to signal that the kernel will send the open file to to the userspace filesystem for operations on open files, so that sillyrenaming unlinked files becomes unnecessary. However this needs VFS changes, which won't make it into 2.6.24. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	a6643094e7	fuse: pass open flags to read and write Some open flags (O_APPEND, O_DIRECT) can be changed with fcntl(F_SETFL, ...) after open, but fuse currently only sends the flags to userspace in open. To make it possible to correcly handle changing flags, send the current value to userspace in each read and write. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	7dca9fd39f	fuse: cleanup: add fuse_get_attr_version() Extract repeated code into helper function, as suggested by Akpm. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Miklos Szeredi	bcb4be809d	fuse: fix reading past EOF Currently reading a fuse file will stop at cached i_size and return EOF, even though the file might have grown since the attributes were last updated. So detect if trying to read past EOF, and refresh the attributes before continuing with the read. Thanks to mpb for the report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:54 -08:00
Tobias Poschwatta	6454d1f903	fix up ext2_fs.h for userspace after reservations backport In commit a686cd898bd999fd026a51e90fb0a3410d258ddb: "Val's cross-port of the ext3 reservations code into ext2." include/linux/ext2_fs.h got a new function whose return value is only defined if __KERNEL__ is defined. Putting #ifdef __KERNEL__ around the function seems to help, patch below. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:53 -08:00
Eric W. Biederman	19fd4bb2a0	proc: remove races from proc_id_readdir() Oleg noticed that the call of task_pid_nr_ns() in proc_pid_readdir is racy with respect to tasks exiting. After a bit of examination it also appears that the call itself is completely unnecessary. So to fix the problem this patch modifies next_tgid() to return both a tgid and the task struct in question. A structure is introduced to return these values because it is slightly cleaner and easier to optimize, and the resulting code is a little shorter. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Alexey Dobriyan	c2319540cd	proc: fix NULL ->i_fop oops proc_kill_inodes() can clear ->i_fop in the middle of vfs_readdir resulting in NULL dereference during "file->f_op->readdir(file, buf, filler)". The solution is to remove proc_kill_inodes() completely: a) we don't have tricky modules implementing their tricky readdir hooks which could keeping this revoke from hell. b) In a situation when module is gone but PDE still alive, standard readdir will return only "." and "..", because pde->next was cleared by remove_proc_entry(). c) the race proc_kill_inode() destined to prevent is not completely fixed, just race window made smaller, because vfs_readdir() is run without sb_lock held and without file_list_lock held. Effectively, ->i_fop is cleared at random moment, which can't fix properly anything. BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018 printing eip: c1061205 pdpt = 0000000005b22001 pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw sr_mod k8temp cdrom hwmon amd_rng Pid: 2033, comm: find Not tainted (2.6.24-rc1-b1d08ac064268d0ae2281e98bf5e82627e0f0c56 #2) EIP: 0060:[<c1061205>] EFLAGS: 00010246 CPU: 0 EIP is at vfs_readdir+0x47/0x74 EAX: c6b6a780 EBX: 00000000 ECX: c1061040 EDX: c5decf94 ESI: c6b6a780 EDI: fffffffe EBP: c9797c54 ESP: c5decf78 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process find (pid: 2033, ti=c5dec000 task=c64bba90 task.ti=c5dec000) Stack: c5decf94 c1061040 fffffff7 0805ffbc 00000000 c6b6a780 c1061295 0805ffbc 00000000 00000400 00000000 00000004 0805ffbc 4588eff4 c5dec000 c10026ba 00000004 0805ffbc 00000400 0805ffbc 4588eff4 bfdc6c70 000000dc 0000007b Call Trace: [<c1061040>] filldir64+0x0/0xc5 [<c1061295>] sys_getdents64+0x63/0xa5 [<c10026ba>] sysenter_past_esp+0x5f/0x85 ======================= Code: 49 83 78 18 00 74 43 8d 6b 74 bf fe ff ff ff 89 e8 e8 b8 c0 12 00 f6 83 2c 01 00 00 10 75 22 8b 5e 10 8b 4c 24 04 89 f0 8b 14 24 <ff> 53 18 f6 46 1a 04 89 c7 75 0b 8b 56 0c 8b 46 08 e8 c8 66 00 EIP: [<c1061205>] vfs_readdir+0x47/0x74 SS:ESP 0068:c5decf78 hch: "Nice, getting rid of this is a very good step formwards. Unfortunately we have another copy of this junk in security/selinux/selinuxfs.c:sel_remove_entries() which would need the same treatment." Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-29 09:24:52 -08:00
Linus Torvalds	cae2f9c46d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: sysfs: fix off-by-one error in fill_read_buffer() kobject: two typo fixes UIO: add UIO documentation target to DocBook Makefile UIO: fix up the UIO documentation create /sys/.../power when CONFIG_PM is set allow LEGACY_PTYS to be set to 0	2007-11-28 15:59:50 -08:00
Miao Xie	8118a859dc	sysfs: fix off-by-one error in fill_read_buffer() I found that there is a off-by-one problem in the following code. Version: 2.6.24-rc2 File: fs/sysfs/file.c:118-122 Function: fill_read_buffer -------------------------------------------------------------------- count = ops->show(kobj, attr_sd->s_attr.attr, buffer->page); sysfs_put_active_two(attr_sd); BUG_ON(count > (ssize_t)PAGE_SIZE); -------------------------------------------------------------------- Because according to the specification of the sysfs and the implement of the show methods, the show methods return the number of bytes which would be generated for the given input, excluding the trailing null.So if the return value of the show methods equals PAGE_SIZE - 1, the buffer is full in fact. And if the return value equals PAGE_SIZE, the resulting string was already truncated,or buffer overflow occurred. This patch fixes an off-by-one error in fill_read_buffer. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Tejun Heo <teheo@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-11-28 13:53:53 -08:00
Ingo Molnar	c46f739dd3	vfs: coredumping fix fix: http://bugzilla.kernel.org/show_bug.cgi?id=3043 only allow coredumping to the same uid that the coredumping task runs under. Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Alan Cox <alan@redhat.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Al Viro <viro@ftp.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-28 10:58:01 -08:00
Mark Fasheh	b1967d0edd	ocfs2: reverse inline-data truncate args ocfs2_truncate() and ocfs2_remove_inode_range() had reversed their "set i_size" arguments to ocfs2_truncate_inline(). Fix things so that truncate sets i_size, and punching a hole ignores it. This exposed a problem where punching a hole in an inline-data file wasn't updating the page cache, so fix that too. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:03 -08:00
Mark Fasheh	0d8a4e0cd6	ocfs2: Fix comparison in ocfs2_size_fits_inline_data() This was causing us to prematurely push out inline data by one byte. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:03 -08:00
Mark Fasheh	bccb9dad89	ocfs2: Remove bug statement in ocfs2_dentry_iput() The existing bug statement didn't take into account unhashed dentries which might not have a cluster lock on them. This could happen if a node exporting the file system via NFS is rebooted, re-exported to nfs clients and then unmounted. It's fine in this case to not have a dentry cluster lock. Just remove the bug statement and replace it with an error print, which does the proper checks. Though we want to know if something has happened which might have prevented a cluster lock from being created, it's definitely not necessary to panic the machine for this. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Jan Kara	5a58c3ef22	[PATCH] ocfs2: Remove expensive bitmap scanning Enable expensive bitmap scanning only if DEBUG option is enabled. The bitmap scanning quite loads the CPU and on my machine the write throughput of dd if=/dev/zero of=/ocfs2/file bs=1M count=500 conv=sync improves from 37 MB/s to 45.4 MB/s in local mode... Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Mark Fasheh	a46043e08f	ocfs2: log valid inode # on bad inode If the inode block isn't valid then we don't want to print the value from that, instead print the block number which was passed in (which should always be correct). Also, turn this into a debug print for now - folks who hit an actual problem always have other logs indicating what the source is. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:02 -08:00
Mark Fasheh	ef9f86ceb6	ocfs2: Filter -ENOSPC in mlog_errno() It's almost never worth printing in that situation and we keep forgetting to manually filter it out. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Joe Perches	2759236f84	[PATCH] fs/ocfs2: Add missing "space" Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Mark Fasheh	e001e796e4	ocfs2: Reset journal parameters after s_mount_opt update Right now we're just setting them from the existing parameters, not the new ones that a remount specified. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-27 16:47:01 -08:00
Linus Torvalds	423eaf8f00	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFS: Clean up new multi-segment direct I/O changes NFS: Ensure we return zero if applications attempt to write zero bytes NFS: Support multiple segment iovecs in the NFS direct I/O path NFS: Introduce iovec I/O helpers to fs/nfs/direct.c SUNRPC: Add missing "space" to net/sunrpc/auth_gss.c SUNRPC: make sunrpc/xprtsock.c:xs_setup_{udp,tcp}() static NFS: fs/nfs/dir.c should #include "internal.h" NFS: make nfs_wb_page_priority() static NFS: mount failure causes bad page state SUNRPC: remove NFS/RDMA client's binary sysctls kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server sunrpc: rpc_pipe_poll may miss available data in some cases sunrpc: return error if unsupported enctype or cksumtype is encountered sunrpc: gss_pipe_downcall(), don't assume all errors are transient NFS: Fix the ustat() regression	2007-11-26 19:42:59 -08:00
Linus Torvalds	0685ab4fb8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: bump version of kernel/sched_debug.c sched: fix minimum granularity tunings sched: fix RLIMIT_CPU comment sched: fix kernel/acct.c comment sched: fix prev_stime calculation sched: don't forget to unlock uids_mutex on error paths	2007-11-26 19:42:08 -08:00
Chuck Lever	02fe494619	NFS: Clean up new multi-segment direct I/O changes Simplify calling sequence of nfs_direct_{read,write}_schedule(), and rename them to reflect their new role. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:32:40 -05:00
Chuck Lever	b9148c6b80	NFS: Ensure we return zero if applications attempt to write zero bytes A zero byte count direct write request should be a successful no-op, not an error. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:32:38 -05:00
Chuck Lever	c216fd708e	NFS: Support multiple segment iovecs in the NFS direct I/O path Allow applications to perform asynchronous scatter-gather direct I/O to NFS files. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:32:36 -05:00
Chuck Lever	19f737879c	NFS: Introduce iovec I/O helpers to fs/nfs/direct.c Add helpers that iterate over multi-segment iovecs. These will be used to support multi-segment scatter/gather direct I/O in a later patch. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:32:35 -05:00
Adrian Bunk	4c30d56edc	NFS: fs/nfs/dir.c should #include "internal.h" Every file should include the headers containing the prototypes for its global functions (in this case nfs_access_cache_shrinker()). Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:49 -05:00
Adrian Bunk	5334eb13d4	NFS: make nfs_wb_page_priority() static nfs_wb_page_priority() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:48 -05:00
Russell King	f16c960332	NFS: mount failure causes bad page state While testing a kernel based upon `ecd744eec3` (with wrong boot arguments), I got the following bad page state entry while NFS was trying to mount it's rootfs: IP-Config: Complete: device=eth0, addr=192.168.1.101, mask=255.255.255.0, gw=255.255.255.255, host=192.168.1.101, domain=, nis-domain=(none), bootserver=192.168.1.100, rootserver=192.168.1.100, rootpath= Looking up port of RPC 100003/2 on 192.168.1.100 rpcbind: server 192.168.1.100 not responding, timed out Root-NFS: Unable to get nfsd port number from server, using default Looking up port of RPC 100005/1 on 192.168.1.100 rpcbind: server 192.168.1.100 not responding, timed out Root-NFS: Unable to get mountd port number from server, using default mount: server 192.168.1.100 not responding, timed out Root-NFS: Server returned error -5 while mounting /nfs/rootfs/ VFS: Unable to mount root fs via NFS, trying floppy. Bad page state in process 'swapper' page:c02b1260 flags:0x00000400 mapping:00000000 mapcount:0 count:0 Trying to fix it up, but a reboot is needed Backtrace: [<c0023e34>] (dump_stack+0x0/0x14) from [<c0062570>] (bad_page+0x70/0xac) [<c0062500>] (bad_page+0x0/0xac) from [<c0064914>] (free_hot_cold_page+0x80/0x178) [<c0064894>] (free_hot_cold_page+0x0/0x178) from [<c0064a74>] (free_hot_page+0x14/0x18) [<c0064a60>] (free_hot_page+0x0/0x18) from [<c0067078>] (put_page+0xf8/0x154) [<c0066f80>] (put_page+0x0/0x154) from [<c007dbc8>] (kfree+0xc8/0xd0) [<c007db00>] (kfree+0x0/0xd0) from [<c00cbb54>] (nfs_get_sb+0x230/0x710) [<c00cb924>] (nfs_get_sb+0x0/0x710) from [<c0084334>] (vfs_kern_mount+0x58/0xac)[<c00842dc>] (vfs_kern_mount+0x0/0xac) from [<c00843c0>] (do_kern_mount+0x38/0xf4) [<c0084388>] (do_kern_mount+0x0/0xf4) from [<c0099c7c>] (do_mount+0x1e8/0x614) ... This seems to be caused by use of an uninitialised structure due to NULL options being passed to nfs_validate_mount_data(). Ensure that the parsed mount data is always initialised. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> (Trond: added fix for the same bug in nfs4_validate_mount_data()). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-26 16:24:22 -05:00
Ingo Molnar	08e4570a4a	sched: fix prev_stime calculation Srivatsa Vaddagiri noticed occasionally incorrect CPU usage values in top and tracked it down to stime going below 0 in task_stime(). Negative values are possible there due to the sampled nature of stime/utime. Fix suggested by Balbir Singh. Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <balbir@linux.vnet.ibm.com>	2007-11-26 21:21:49 +01:00
Steve French	2b83457bde	[CIFS] Fix check after use error in ACL code Spotted by the coverity scanner. CC: Adrian Bunk <bunk@kernel.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-25 10:01:00 +00:00
Steve French	058250a0d5	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-11-25 09:53:27 +00:00
Jeff Layton	cea218054a	[CIFS] Fix potential data corruption when writing out cached dirty pages Fix RedHat bug 329431 The idea here is separate "conscious" from "unconscious" flushes. Conscious flushes are those due to a fsync() or close(). Unconscious ones are flushes that occur as a side effect of some other operation or due to memory pressure. Currently, when an error occurs during an unconscious flush (ENOSPC or EIO), we toss out the page and don't preserve that error to report to the user when a conscious flush occurs. If after the unconscious flush, there are no more dirty pages for the inode, the conscious flush will simply return success even though there were previous errors when writing out pages. This can lead to data corruption. The easiest way to reproduce this is to mount up a CIFS share that's very close to being full or where the user is very close to quota. mv a file to the share that's slightly larger than the quota allows. The writes will all succeed (since they go to pagecache). The mv will do a setattr to set the new file's attributes. This calls filemap_write_and_wait, which will return an error since all of the pages can't be written out. Then later, when the flush and release ops occur, there are no more dirty pages in pagecache for the file and those operations return 0. mv then assumes that the file was written out correctly and deletes the original. CIFS already has a write_behind_rc variable where it stores the results from earlier flushes, but that value is only reported in cifs_close. Since the VFS ignores the return value from the release operation, this isn't helpful. We should be reporting this error during the flush operation. This patch does the following: 1) changes cifs_fsync to use filemap_write_and_wait and cifs_flush and also sync to check its return code. If it returns successful, they then check the value of write_behind_rc to see if an earlier flush had reported any errors. If so, they return that error and clear write_behind_rc. 2) sets write_behind_rc in a few other places where pages are written out as a side effect of other operations and the code waits on them. 3) changes cifs_setattr to only call filemap_write_and_wait for ATTR_SIZE changes. 4) makes cifs_writepages accurately distinguish between EIO and ENOSPC errors when writing out pages. Some simple testing indicates that the patch works as expected and that it fixes the reproduceable known problem. Acked-by: Dave Kleikamp <shaggy@austin.rr.com> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-20 23:19:03 +00:00
Petr Tesarik	2a97468024	[CIFS] Fix spurious reconnect on 2nd peek from read of SMB length When retrying kernel_recvmsg() because of a short read, check returned length against the remaining length, not against total length. This avoids unneeded session reconnects which would otherwise occur when kernel_recvmsg() finally returns zero when asked to read zero bytes. Signed-off-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-20 02:24:08 +00:00
Neil Brown	4c1fe2f78a	kernel BUG at fs/nfs/namespace.c:108! - can be triggered by bad server Hi Trond, I have discovered that the BUG_ON in nfs_follow_mountpoint: BUG_ON(IS_ROOT(dentry)); can be triggered by a misbehaving server. What happens is the client does a lookup and discoveres that the named directory has a different fsid, so it initiates a mount. It then performs a GETATTR on the mounted directory and gets a different fsid again (due to a bug in the NFS server). This causes nfs_follow_mountpoint to be called on the newly mounted root, which triggers the BUG_ON. To duplicate this, have a directory which contains some mountpoints, and export that directory with the "crossmnt" flag using nfs-utils 1.1.1 (or 1.1.0 I think) The GETATTR on the root of the mounted filesystem will return the information for the top exportpoint, while a lookup will return the correct information. This difference causes the NFS client to BUG. I think the best way to fix this is to trap this possibility early, so just before completing the mount in the NFS client, check that it isn't going to use nfs_mountpoint_inode_operations. As long as i_op will never change once set (is that true?), this should be adequately safe. The following patch shows a possible approach, and it works for me. i.e. when the NFS server is misbehaving, I get ESTALE on those mountpoints, while when the NFS server is working correctly, I get correct behaviour on the client. NeilBrown Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-17 13:08:48 -05:00
Trond Myklebust	b09b9417d0	NFS: Fix the ustat() regression Since 2.6.18, the superblock sb->s_root has been a dummy dentry with a dummy inode. This breaks ustat(), which actually uses sb->s_root in a vfstat() call. Fix this by making the s_root a dummy alias to the directory inode that was used when creating the superblock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-11-17 13:08:44 -05:00
Steve French	f7a44eadd5	[CIFS] remove build warning CC: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-17 00:01:51 +00:00
Steve French	2442421b17	[CIFS] Have CIFS_SessSetup build correct SPNEGO SessionSetup request Have CIFS_SessSetup call cifs_get_spnego_key when Kerberos is negotiated. Use the info in the key payload to build a session setup request packet. Also clean up how the request buffer in the function is freed on error. With appropriate user space helper (in samba/source/client). Kerberos support (secure session establishment can be done now via Kerberos, previously users would have to use NTLMv2 instead for more secure session setup). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 23:37:35 +00:00
Steve French	8840dee9dc	[CIFS] minor checkpatch cleanup Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 23:05:52 +00:00
Jeff Layton	d6c2e4d02b	[CIFS] have cifs_get_spnego_key get the hostname from TCP_Server_Info Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 22:23:17 +00:00
Jeff Layton	c359cf3c61	[CIFS] add hostname field to TCP_Server_Info struct ...and populate it with the hostname portion of the UNC string. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 22:22:06 +00:00
Jeff Layton	70fe7dc055	[CIFS] clean up error handling in cifs_mount Move all of the kfree's sprinkled in the middle of the function to the end, and have the code set rc and just goto there on error. Also zero out the password string before freeing it. Looks like this should also fix a potential memory leak of the prepath string if an error occurs near the end of the function. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 22:21:07 +00:00
Steve French	68bf728a22	[CIFS] add ver= prefix to upcall format version Acked-by: Jeff Layton <jlayton@redhat.com> Acked-by: Igor Mammedov <niallan@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-16 18:32:52 +00:00
Jan Kara	7c06a8dc64	Fix 64KB blocksize in ext3 directories With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry lenght. So we store 0xffff instead and convert value when read from / written to disk. The patch also converts some places to use ext3_next_entry() when we are changing them anyway. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:43 -08:00
Jeff Layton	dbaf4c024a	smbfs: fix debug builds Fix some warnings with SMBFS_DEBUG_* builds. This patch makes it so that builds with -Werror don't fail. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:43 -08:00
Arjan van de Ven	cb51f973bc	mark sys_open/sys_read exports unused sys_open / sys_read were used in the early 1.2 days to load firmware from disk inside drivers. Since 2.0 or so this was deprecated behavior, but several drivers still were using this. Since a few years we have a request_firmware() API that implements this in a nice, consistent way. Only some old ISA sound drivers (pre-ALSA) still straggled along for some time.... however with commit `c2b1239a9f` the last user is now gone. This is a good thing, since using sys_open / sys_read etc for firmware is a very buggy to dangerous thing to do; these operations put an fd in the process file descriptor table.... which then can be tampered with from other threads for example. For those who don't want the firmware loader, filp_open()/vfs_read are the better APIs to use, without this security issue. The patch below marks sys_open and sys_read unused now that they're really not used anymore, and for deletion in the 2.6.25 timeframe. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:42 -08:00
Eric W. Biederman	9fcc2d15b1	proc: simplify and correct proc_flush_task Currently we special case when we have only the initial pid namespace. Unfortunately in doing so the copied case for the other namespaces was broken so we don't properly flush the thread directories :( So this patch removes the unnecessary special case (removing a usage of proc_mnt) and corrects the flushing of the thread directories. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:42 -08:00
Adrian Bunk	8744969a81	fuse_file_alloc(): fix NULL dereferences Fix obvious NULL dereferences spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:42 -08:00
Fengguang Wu	c06a018fa5	reiserfs: don't drop PG_dirty when releasing sub-page-sized dirty file This is not a new problem in 2.6.23-git17. 2.6.22/2.6.23 is buggy in the same way. Reiserfs could accumulate dirty sub-page-size files until umount time. They cannot be synced to disk by pdflush routines or explicit `sync' commands. Only `umount' can do the trick. The direct cause is: the dirty page's PG_dirty is wrongly _cleared_. Call trace: [<ffffffff8027e920>] cancel_dirty_page+0xd0/0xf0 [<ffffffff8816d470>] :reiserfs:reiserfs_cut_from_item+0x660/0x710 [<ffffffff8816d791>] :reiserfs:reiserfs_do_truncate+0x271/0x530 [<ffffffff8815872d>] :reiserfs:reiserfs_truncate_file+0xfd/0x3b0 [<ffffffff8815d3d0>] :reiserfs:reiserfs_file_release+0x1e0/0x340 [<ffffffff802a187c>] __fput+0xcc/0x1b0 [<ffffffff802a1ba6>] fput+0x16/0x20 [<ffffffff8029e676>] filp_close+0x56/0x90 [<ffffffff8029fe0d>] sys_close+0xad/0x110 [<ffffffff8020c41e>] system_call+0x7e/0x83 Fix the bug by removing the cancel_dirty_page() call. Tests show that it causes no bad behaviors on various write sizes. === for the patient === Here are more detailed demonstrations of the problem. 1) the page has both PG_dirty(D)/PAGECACHE_TAG_DIRTY(d) after being written to; and then only PAGECACHE_TAG_DIRTY(d) remains after the file is closed. ------------------------------ screen 0 ------------------------------ [T0] root /home/wfg# cat > /test/tiny [T1] hi [T2] root /home/wfg# ------------------------------ screen 1 ------------------------------ [T1] root /home/wfg# echo /test/tiny > /proc/filecache [T1] root /home/wfg# cat /proc/filecache # file /test/tiny # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback # idx len state refcnt 0 1 ___UD__Bd_ 2 [T2] root /home/wfg# cat /proc/filecache # file /test/tiny # flags R:referenced A:active M:mmap U:uptodate D:dirty W:writeback O:owner B:buffer d:dirty w:writeback # idx len state refcnt 0 1 ___U___Bd_ 2 2) note the non-zero 'cancelled_write_bytes' after /tmp/hi is copied. ------------------------------ screen 0 ------------------------------ [T0] root /home/wfg# echo hi > /tmp/hi [T1] root /home/wfg# cp /tmp/hi /dev/stdin /test [T2] hi [T3] root /home/wfg# ------------------------------ screen 1 ------------------------------ [T1] root /proc/4397# cd /proc/`pidof cp` [T1] root /proc/4713# cat io rchar: 8396 wchar: 3 syscr: 20 syscw: 1 read_bytes: 0 write_bytes: 20480 cancelled_write_bytes: 4096 [T2] root /proc/4713# cat io rchar: 8399 wchar: 6 syscr: 21 syscw: 2 read_bytes: 0 write_bytes: 24576 cancelled_write_bytes: 4096 //Question: the 'write_bytes' is a bit more than expected ;-) Tested-by: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn> Reviewed-by: Chris Mason <chris.mason@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:41 -08:00
Dmitri Vorobiev	f433dc5634	Fixes to the BFS filesystem driver I found a few bugs in the BFS driver. Detailed description of the bugs as well as the steps to reproduce the errors are given in the kernel bugzilla. Please follow these links for more information: http://bugzilla.kernel.org/show_bug.cgi?id=9363 http://bugzilla.kernel.org/show_bug.cgi?id=9364 http://bugzilla.kernel.org/show_bug.cgi?id=9365 http://bugzilla.kernel.org/show_bug.cgi?id=9366 This patch fixes the bugs described above. Besides, the patch introduces coding style changes to make the BFS driver conform to the requirements specified for Linux kernel code. Finally, I made a few cosmetic changes such as removal of trivial debug output. Also, the patch removes the fields `si_lf_ioff' and `si_lf_sblk' of the in-core superblock structure. These fields are initialized but never actually used. If you are wondering why I need BFS, here is the answer: I am using this driver in the context of Linux kernel classes I am teaching in the Moscow State University and in the International Institute of Information Technology in Pune, India. Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Cc: Tigran Aivazian <tigran@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Adam Litke	9a119c056d	hugetlb: allow bulk updating in hugetlb_*_quota() Add a second parameter 'delta' to hugetlb_get_quota and hugetlb_put_quota to allow bulk updating of the sbinfo->free_blocks counter. This will be used by the next patch in the series. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Ken Chen <kenchen@google.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <hermes@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Adam Litke	c79fb75e5a	hugetlb: fix quota management for private mappings The hugetlbfs quota management system was never taught to handle MAP_PRIVATE mappings when that support was added. Currently, quota is debited at page instantiation and credited at file truncation. This approach works correctly for shared pages but is incomplete for private pages. In addition to hugetlb_no_page(), private pages can be instantiated by hugetlb_cow(); but this function does not respect quotas. Private huge pages are treated very much like normal, anonymous pages. They are not "backed" by the hugetlbfs file and are not stored in the mapping's radix tree. This means that private pages are invisible to truncate_hugepages() so that function will not credit the quota. This patch (based on a prototype provided by Ken Chen) moves quota crediting for all pages into free_huge_page(). page->private is used to store a pointer to the mapping to which this page belongs. This is used to credit quota on the appropriate hugetlbfs instance. Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Ken Chen <kenchen@google.com> Cc: Ken Chen <kenchen@google.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: David Gibson <hermes@gibson.dropbear.id.au> Cc: William Lee Irwin III <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:40 -08:00
Eric W. Biederman	e1a1c997af	proc: fix proc_kill_inodes to kill dentries on all proc superblocks It appears we overlooked support for removing generic proc files when we added support for multiple proc super blocks. Handle that now. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: Alexey Dobriyan <adobriyan@sw.ru> Acked-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:38 -08:00
Jan Kara	e47776a0a4	Forbid user to change file flags on quota files Forbid user from changing file flags on quota files. User has no bussiness in playing with these flags when quota is on. Furthermore there is a remote possibility of deadlock due to a lock inversion between quota file's i_mutex and transaction's start (i_mutex for quota file is locked only when trasaction is started in quota operations) in ext3 and ext4. Signed-off-by: Jan Kara <jack@suse.cz> Cc: LIOU Payphone <lioupayphone@gmail.com> Cc: <linux-ext4@vger.kernel.org> Acked-by: Dave Kleikamp <shaggy@austin.ibm.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:38 -08:00
Michael Halcrow	8a146a2b0d	eCryptfs: cast page->index to loff_t instead of off_t page->index should be cast to loff_t instead of off_t. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Reported-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-14 18:45:36 -08:00
Steve French	133672efbc	[CIFS] Fix buffer overflow if server sends corrupt response to small request In SendReceive() function in transport.c - it memcpy's message payload into a buffer passed via out_buf param. The function assumes that all buffers are of size (CIFSMaxBufSize + MAX_CIFS_HDR_SIZE) , unfortunately it is also called with smaller (MAX_CIFS_SMALL_BUFFER_SIZE) buffers. There are eight callers (SMB worker functions) which are primarily affected by this change: TreeDisconnect, uLogoff, Close, findClose, SetFileSize, SetFileTimes, Lock and PosixLock CC: Dave Kleikamp <shaggy@austin.ibm.com> CC: Przemyslaw Wegrzyn <czajnik@czajsoft.pl> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-13 22:41:37 +00:00
Linus Torvalds	31083eba37	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (45 commits) [NETFILTER]: xt_time should not assume CONFIG_KTIME_SCALAR [NET]: Move unneeded data to initdata section. [NET]: Cleanup pernet operation without CONFIG_NET_NS [TEHUTI]: Fix incorrect usage of strncat in bdx_get_drvinfo() [MYRI_SBUS]: Prevent that myri_do_handshake lies about ticks. [NETFILTER]: bridge: fix double POSTROUTING hook invocation [NETFILTER]: Consolidate nf_sockopt and compat_nf_sockopt [NETFILTER]: nf_nat: fix memset error [INET]: Use list_head-s in inetpeer.c [IPVS]: Remove unused exports. [NET]: Unexport sysctl_{r,w}mem_max. [TG3]: Update version to 3.86 [TG3]: MII => TP [TG3]: Add A1 revs [TG3]: Increase the PCI MRRS [TG3]: Prescaler fix [TG3]: Limit 5784 / 5764 to MAC LED mode [TG3]: Disable GPHY autopowerdown [TG3]: CPMU adjustments for loopback tests [TG3]: Fix nvram selftest failures ...	2007-11-13 09:04:48 -08:00
Linus Torvalds	0b832a4b93	Revert "ext2/ext3/ext4: add block bitmap validation" This reverts commit `7c9e69faa2`, fixing up conflicts in fs/ext4/balloc.c manually. The cost of doing the bitmap validation on each lookup - even when the bitmap is cached - is absolutely prohibitive. We could, and probably should, do it only when adding the bitmap to the buffer cache. However, right now we are better off just reverting it. Peter Zijlstra measured the cost of this extra validation as a 85% decrease in cached iozone, and while I had a patch that took it down to just 17% by not being _quite_ so stupid in the validation, it was still a big slowdown that could have been avoided by just doing it right. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: Andreas Dilger <adilger@clusterfs.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-13 08:09:11 -08:00
Denis V. Lunev	022cbae611	[NET]: Move unneeded data to initdata section. This patch reverts Eric's commit `2b008b0a8e` It diets .text & .data section of the kernel if CONFIG_NET_NS is not set. This is safe after list operations cleanup. Signed-of-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-13 03:23:50 -08:00
Trond Myklebust	91cf45f02a	[NET]: Add the helper kernel_sock_shutdown() ...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers. Looking at the sock->op->shutdown() handlers, it looks as if all of them take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the RCV_SHUTDOWN/SEND_SHUTDOWN arguments. Add a helper, and then define the SHUT_* enum to ensure that kernel users of shutdown() don't get confused. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Mark Fasheh <mark.fasheh@oracle.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-12 18:10:39 -08:00
J. Bruce Fields	6fa02839bf	nfsd4: recheck for secure ports in fh_verify As with commit `7fc90ec93a` ("knfsd: nfsd: call nfsd_setuser() on fh_compose(), fix nfsd4 permissions problem") this is a case where we need to redo a security check in fh_verify() even though the filehandle already has an associated dentry--if the filehandle was created by fh_compose() in an earlier operation of the nfsv4 compound, then we may not have done these checks yet. Without this fix it is possible, for example, to traverse from an export without the secure ports requirement to one with it in a single compound, and bypass the secure port check on the new export. While we're here, fix up some minor style problems and change a printk() to a dprintk(), to make it harder for random unprivileged users to spam the logs. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-By: NeilBrown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-12 14:28:08 -08:00
J. Bruce Fields	ac8587dcb5	knfsd: fix spurious EINVAL errors on first access of new filesystem The v2/v3 acl code in nfsd is translating any return from fh_verify() to nfserr_inval. This is particularly unfortunate in the case of an nfserr_dropit return, which is an internal error meant to indicate to callers that this request has been deferred and should just be dropped pending the results of an upcall to mountd. Thanks to Roland <devzero@web.de> for bug report and data collection. Cc: Roland <devzero@web.de> Acked-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-By: NeilBrown <neilb@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-12 14:28:08 -08:00
Linus Torvalds	46015977e7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (21 commits) [CIFS] fix oops on second mount to same server when null auth is used [CIFS] Fix stale mode after readdir when cifsacl specified [CIFS] add mode to acl conversion helper function [CIFS] Fix incorrect mode when ACL had deny access control entries [CIFS] Add uid to key description so krb can handle user mounts [CIFS] Fix walking out end of cifs dacl [CIFS] Add upcall files for cifs to use spnego/kerberos [CIFS] add OIDs for KRB5 and MSKRB5 to ASN1 parsing routines [CIFS] Register and unregister cifs_spnego_key_type on module init/exit [CIFS] implement upcalls for SPNEGO blob via keyctl API [CIFS] allow cifs_calc_signature2 to deal with a zero length iovec [CIFS] If no Access Control Entries, set mode perm bits to zero [CIFS] when mount helper missing fix slash wrong direction in share [CIFS] Don't request too much permission when reading an ACL [CIFS] enable get mode from ACL when cifsacl mount option specified [CIFS] ACL support part 8 [CIFS] acl support part 7 [CIFS] acl support part 6 [CIFS] acl support part 6 [CIFS] remove unused funtion compile warning when experimental off ...	2007-11-12 11:11:39 -08:00
Roland McGrath	00ec99da43	core dump: remain dumpable The coredump code always calls set_dumpable(0) when it starts (even if RLIMIT_CORE prevents any core from being dumped). The effect of this (via task_dumpable) is to make /proc/pid/* files owned by root instead of the user, so the user can no longer examine his own process--in a case where there was never any privileged data to protect. This affects e.g. auxv, environ, fd; in Fedora (execshield) kernels, also maps. In practice, you can only notice this when a debugger has requested PTRACE_EVENT_EXIT tracing. set_dumpable was only used in do_coredump for synchronization and not intended for any security purpose. (It doesn't secure anything that wasn't already unsecured when a process dies by SIGTERM instead of SIGQUIT.) This changes do_coredump to check the core_waiters count as the means of synchronization, which is sufficient. Now we leave the "dumpable" bits alone. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-12 10:32:29 -08:00
Jeff Layton	9b8f5f5737	[CIFS] fix oops on second mount to same server when null auth is used When a share is mounted using no username, cifs_mount sets volume_info.username as a NULL pointer, and the sesInfo userName as an empty string. The volume_info.username is passed to a couple of other functions to see if there is an existing unc or tcp connection that can be used. These functions assume that the username will be a valid string that can be passed to strncmp. If the pointer is NULL, then the kernel will oops if there's an existing session to which the string can be compared. This patch changes cifs_mount to set volume_info.username to an empty string in this situation, which prevents the oops and should make it so that the comparison to other null auth sessions match. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-09 23:25:04 +00:00
Linus Torvalds	4c31c30302	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Add UNPLUG traces to all appropriate places block: fix requeue handling in blk_queue_invalidate_tags() mmc: Fix sg helper copy-and-paste error pktcdvd: fix BUG caused by sysfs module reference semantics change ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting cfq_idle_class_timer: add paranoid checks for jiffies overflow cfq: fix IOPRIO_CLASS_IDLE delays cfq: fix IOPRIO_CLASS_IDLE accounting	2007-11-09 15:17:49 -08:00
Linus Torvalds	e36aeee65d	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: ocfs2: fix rename vs unlink race [PATCH] Fix possibly too long write in o2hb_setup_one_bio() ocfs2: fix write() performance regression ocfs2: Commit journal on sync writes ocfs2: Re-order iput in ocfs2_drop_dentry_lock ocfs2: Create locks at initially requested level [PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c} [2.6 patch] make ocfs2_find_entry_el() static	2007-11-09 15:11:58 -08:00
Steve French	a6f8de3d9b	[CIFS] Fix stale mode after readdir when cifsacl specified When mounted with cifsacl mount option, readdir can not instantiate the inode with the estimated mode based on the ACL for each file since we have not queried for the ACL for each of these files yet. So set the refresh time to zero for these inodes so that the next stat will cause the client to go to the server for the ACL info so we can build the estimated mode (this means we also will issue an extra QueryPathInfo if the stat happens within 1 second, but this is trivial compared to the time required to open/getacl/close for each). ls -l is slower when cifsacl mount option is specified, but displays correct mode information. Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-08 23:10:32 +00:00
Steve French	ce06c9f025	[CIFS] add mode to acl conversion helper function Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-08 21:12:01 +00:00
Steve French	15b0395911	[CIFS] Fix incorrect mode when ACL had deny access control entries When mounted with the cifsacl mount option, we were treating any deny ACEs found like allow ACEs and it turns out for SFU and SUA Windows set these type of access control entries often. The order of ACEs is important too. The canonical order that most ACL tools and Windows explorer consruct ACLs with is to begin with DENY entries then follow with ALLOW, otherwise an allow entry could be encountered first, making the subsequent deny entry like "dead code which would be superflous since Windows stops when a match is made for the operation you are trying to perform for your user We start with no permissions in the mode and build up as we find permissions (ie allow ACEs). This fixes deny ACEs so they affect the mask used to set the subsequent allow ACEs. Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> CC: Alexander Bokovoy <ab@samba.org> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-08 17:57:40 +00:00
Igor Mammedov	9eae8a8903	[CIFS] Add uid to key description so krb can handle user mounts Adds uid to key description fro supporting user mounts and minor formating changes Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Igor Mammedov <niallain@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-08 16:13:31 +00:00
Jens Axboe	8ec680e4c3	ioprio: allow sys_ioprio_set() value of 0 to reset ioprio setting Normally io priorities follow the CPU nice, unless a specific scheduling class has been set. Once that is set, there's no way to reset the behaviour to 'none' so that it follows CPU nice again. Currently passing in 0 as the ioprio class/value will return -1/EINVAL, change that to allow resetting of a set scheduling class. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-07 13:54:07 +01:00
David S. Miller	df61c95262	[DLM] lowcomms: Do not muck with sysctl_rmem_max. Use SO_RCVBUFFORCE instead. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:11:42 -08:00
David S. Miller	44656ba128	[NET]: Kill proc_net_create() There are no more users. Signed-off-by: David S. Miller <davem@davemloft.net>	2007-11-07 04:10:52 -08:00
Srinivas Eeda	e325a88f17	ocfs2: fix rename vs unlink race If another node unlinks the destination while ocfs2_rename() is waiting on a cluster lock, ocfs2_rename() simply logs an error and continues. This causes a crash because the renaming node is now trying to delete a non-existent inode. The correct solution is to return -ENOENT. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:40 -08:00
Jan Kara	bc7e97cbdd	[PATCH] Fix possibly too long write in o2hb_setup_one_bio() We should subtract start of our IO from PAGE_CACHE_SIZE to get the right length of the write we want to perform. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:35 -08:00
Mark Fasheh	4e9563fd55	ocfs2: fix write() performance regression On file systems which don't support sparse files, Ocfs2_map_page_blocks() was reading blocks on appending writes. This caused write performance to suffer dramatically. Fix this by detecting an appending write on a nonsparse fs and skipping the read. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:35:29 -08:00
Mark Fasheh	9ea2d32f40	ocfs2: Commit journal on sync writes We're missing a meta data commit for extending sync writes. In thoery, write could return with the meta data required to read the data uncommitted to disk. Fix that by detecting an allocating write and forcing a journal commit in the sync case. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:32:00 -08:00
Mark Fasheh	9f70968af3	ocfs2: Re-order iput in ocfs2_drop_dentry_lock Do this to avoid a theoretical (I haven't seen this in practice) race where the downconvert thread might drop the dentry lock, allowing a remote unlink to proceed before dropping the inode locks. This could bounce access to the orphan dir between nodes. There doesn't seem to be a need to do the same in ocfs2_dentry_iput() as that's never called for the last ref drop from the downconvert thread. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:52 -08:00
Mark Fasheh	019d1b2247	ocfs2: Create locks at initially requested level If we have not yet created a cluster lock, ocfs2_cluster_lock() will first create it at NLMODE, and then convert the lock to either PRMODE or EXMODE (whichever is requested). Change ocfs2_cluster_lock() to just create the lock at the initially requested level. ocfs2_locking_ast() handles this case fine, so the only update required was in setup of locking state. This should reduce the number of network messages required for a new lock by one, providing an incremental performance enhancement. Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:45 -08:00
Roel Kluin	3cf0c507dd	[PATCH] Fix priority mistakes in fs/ocfs2/{alloc.c, dlmglue.c} Fixes priority mistakes similar to '!x & y' Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:39 -08:00
Adrian Bunk	0af4bd3887	[2.6 patch] make ocfs2_find_entry_el() static ocfs2_find_entry_el() can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>	2007-11-06 15:31:06 -08:00
Latchesar Ionkov	8999e04f3b	9p: use copy of the options value instead of original v9fs_parse_options function uses strsep which modifies the value of the v9ses->options field. That modified value is later passed to the function that creates the transport potentially making the transport creation function to fail. This patch creates a copy of v9ses->option field that v9fs_parse_options function uses instead of the original value. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-11-06 08:02:53 -06:00
Latchesar Ionkov	dda6b022f3	9p: fix memory leak in v9fs_get_sb This patch fixes a memory leak in v9fs_get_sb. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Acked-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-11-06 08:02:53 -06:00
Michael Halcrow	8a29f2b028	eCryptfs: release mutex on hash error path Release the crypt_stat hash mutex on allocation error. Check for error conditions when doing crypto hash calls. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Reported-by: Kazuki Ohta <kazuki.ohta@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-05 15:12:33 -08:00
Michael Halcrow	778d1a2bd4	eCryptfs: increment extent_offset once per loop interation The extent_offset is getting incremented twice per loop iteration through any given page. It should only be getting incremented once. This bug should only impact hosts with >4K page sizes. Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-05 15:12:33 -08:00
Adrian Bunk	6551198a20	fs/afs/vlocation.c: fix off-by-one This patch fixes an off-by-one error spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-05 15:12:32 -08:00
Steve French	63d2583f5a	[CIFS] Fix walking out end of cifs dacl Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-05 21:46:10 +00:00
Steve French	f1d662a7d5	[CIFS] Add upcall files for cifs to use spnego/kerberos Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-05 14:38:08 +00:00
Linus Torvalds	160acc2e89	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: [SG] Get rid of __sg_mark_end() cleanup asm/scatterlist.h includes SG: Make sg_init_one() use general table init functions	2007-11-03 12:43:21 -07:00
Anton Altaparmakov	ebab89909e	NTFS: Fix read regression. The regression was caused by: commit[`a32ea1e1f9`] Fix read/truncate race This causes ntfs_readpage() to be called for a zero i_size inode, which failed when the file was compressed and non-resident. Thanks a lot to Mike Galbraith for reporting the issue and tracking down the commit that caused the regression. Looking into it I found three bugs which the patch fixes. Signed-off-by: Anton Altaparmakov <aia21@cantab.net> Tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-11-03 12:27:21 -07:00
Jeff Layton	e545937a51	[CIFS] add OIDs for KRB5 and MSKRB5 to ASN1 parsing routines Also, fix the parser to recognize them and set the secType accordingly. Make CIFSSMBNegotiate not error out automatically after parsing the securityBlob. Also thanks to Q (Igor) and Simo for their help on this set of kerberos patches (and Dave Howells for help on the upcall). Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-03 05:11:06 +00:00
Jeff Layton	84a15b9354	[CIFS] Register and unregister cifs_spnego_key_type on module init/exit Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-03 05:02:24 +00:00
Jeff Layton	09fe7ba78d	[CIFS] implement upcalls for SPNEGO blob via keyctl API Add routines to handle upcalls to userspace via keyctl for the purpose of getting a SPNEGO blob for a particular uid and server combination. Clean up the Makefile a bit and set it up to only compile cifs_spnego if CONFIG_CIFS_UPCALL is set. Also change CONFIG_CIFS_UPCALL to depend on CONFIG_KEYS rather than CONFIG_CONNECTOR. cifs_spnego.h defines the communications between kernel and userspace and is intended to be shared with userspace programs. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-03 04:48:29 +00:00
Jeff Layton	745542e210	[CIFS] allow cifs_calc_signature2 to deal with a zero length iovec Currently, cifs_calc_signature2 errors out if it gets a zero-length iovec. Fix it to silently continue in that case. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-03 04:34:04 +00:00
Adrian Bunk	87ae9afdca	cleanup asm/scatterlist.h includes Not architecture specific code should not #include <asm/scatterlist.h>. This patch therefore either replaces them with #include <linux/scatterlist.h> or simply removes them if they were unused. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-11-02 08:47:06 +01:00
Steve French	7505e0525c	[CIFS] If no Access Control Entries, set mode perm bits to zero Also clean up ACL code Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-01 18:03:01 +00:00
Steve French	1fb64bfc45	[CIFS] when mount helper missing fix slash wrong direction in share Kernel bugzilla bug #9228 If mount helper (mount.cifs) missing, mounts with form like //10.11.12.13/c$ would not work (only mounts with slash e.g. //10.11.12.13\\c$ would work) due to problem with slash supposed to be converted to backslash by the mount helper (which is not there). If we fail on converting an IPv4 address in in4_pton then try to canonicalize the first slash (ie between sharename and host ip address) if necessary. If we have to retry to check for IPv6 address the slash is already converted if necessary. Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-11-01 02:12:10 +00:00
Linus Torvalds	dd13810b42	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [AF_KEY]: suppress a warning for 64k pages. [TIPC]: Fix headercheck wrt. tipc_config.h [COMPAT]: Fix build on COMPAT platforms when CONFIG_NET is disabled. [CONNECTOR]: Fix a spurious kfree_skb() call [COMPAT]: Fix new dev_ifname32 returning -EFAULT [NET]: Fix incorrect sg_mark_end() calls. [IPVS]: Remove /proc/net/ip_vs_lblcr [IPV6]: remove duplicate call to proc_net_remove [NETNS]: fix net released by rcu callback [NET]: Fix free_netdev on register_netdev failure. [WAN]: fix drivers/net/wan/lmc/ compilation	2007-10-31 07:46:51 -07:00
Steve French	953f868138	[CIFS] Don't request too much permission when reading an ACL We were requesting GENERIC_READ but that fails when we do not have read permission on the file (even if we could read the ACL). Also move the dump access control entry code into debug ifdef. Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-31 04:54:42 +00:00
Adrian Bunk	78e9d3678c	sysfs: make sysfs_{get,put}_active() static sysfs_{get,put}_active() can now become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-10-30 21:52:33 -07:00
Benjamin Herrenschmidt	be48be08a8	[COMPAT]: Fix new dev_ifname32 returning -EFAULT A stray semicolon slipped in the patch that updated dev_ifname32 to not be inline, causing it to always return -EFAULT. This fixes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-30 21:29:42 -07:00
Dirk Hohndel	e403149c92	Kbuild/doc: fix links to Documentation files Fix links to files in Documentation/* in various Kconfig files Signed-off-by: Dirk Hohndel <hohndel@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-30 14:26:30 -07:00
J. Bruce Fields	97855b49b6	locks: fix possible infinite loop in posix deadlock detection It's currently possible to send posix_locks_deadlock() into an infinite loop (under the BKL). For now, fix this just by bailing out after a few iterations. We may want to fix this in a way that better clarifies the semantics of deadlock detection. But that will take more time, and this minimal fix is probably adequate for any realistic scenario, and is simple enough to be appropriate for applying to stable kernels now. Thanks to George Davis for reporting the problem. Cc: "George G. Davis" <gdavis@mvista.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-30 09:04:18 -07:00
Andrew Morton	f664f1f9b7	revert "ufs: Fix mount check in ufs_fill_super()" Evgeniy said: I wonder on what type of UFS do you test this patch? NetBSD and FreeBSD do not use "fs_state", they use "fs_clean" flag, only Solaris does check like this: fs_state + fs_time == FSOK. That's why parentheses was like that. At now with linux-2.6.24-rc1-git1, I get: fs need fsck, but NetBSD's fsck says that's all ok. I suggest revert this patch. Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Satyam Sharma <satyam.sharma@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-30 08:06:55 -07:00
Shirish Pargaonkar	e01b640013	[CIFS] enable get mode from ACL when cifsacl mount option specified Part 9 of ACL patch series. getting mode from ACL now works in some cases (and requires CIFS_EXPERIMENTAL config option). Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-30 04:45:14 +00:00
Balbir Singh	9301899be7	sched: fix /proc/<PID>/stat stime/utime monotonicity, part 2 Extend Peter's patch to fix accounting issues, by keeping stime monotonic too. Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Tested-by: Frans Pop <elendil@planet.nl>	2007-10-30 00:26:32 +01:00
Peter Zijlstra	73a2bcb0ed	sched: keep utime/stime monotonic keep utime/stime monotonic. cpustats use utime/stime as a ratio against sum_exec_runtime, as a consequence it can happen - when the ratio changes faster than time accumulates - that either can be appear to go backwards. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-29 21:18:11 +01:00
Linus Torvalds	ef49c32b84	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [JFFS2] Update MAINTAINERS entry -- the jffs-dev list is dead [JFFS2] Prevent return of initialised variable in jffs2_init_acl_post()	2007-10-27 10:14:04 -07:00
David Woodhouse	8d6ea587d9	[JFFS2] Prevent return of initialised variable in jffs2_init_acl_post() Spotted by the Coverity checker, and pointed out by Adrian Bunk. Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-10-27 10:36:44 -04:00
Herbert Xu	68e3f5dd4d	[CRYPTO] users: Fix up scatterlist conversion errors This patch fixes the errors made in the users of the crypto layer during the sg_init_table conversion. It also adds a few conversions that were missing altogether. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-27 00:52:07 -07:00
Eric W. Biederman	2b008b0a8e	[NET]: Marking struct pernet_operations __net_initdata was inappropriate It is not safe to to place struct pernet_operations in a special section. We need struct pernet_operations to last until we call unregister_pernet_subsys. Which doesn't happen until module unload. So marking struct pernet_operations is a disaster for modules in two ways. - We discard it before we call the exit method it points to. - Because I keep struct pernet_operations on a linked list discarding it for compiled in code removes elements in the middle of a linked list and does horrible things for linked insert. So this looks safe assuming __exit_refok is not discarded for modules. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 22:54:53 -07:00
Steve French	b9c7a2bb1e	[CIFS] ACL support part 8 Now GetACL in getinodeinfo path when cifsacl mount option used, and ACL is parsed for SIDs. Missing only one piece now to be able to retrieve the mode Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-26 23:40:20 +00:00
Adrian Bunk	253879e62f	[NET] fs/proc/proc_net.c: make a struct static Struct proc_net_ns_ops can become static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-10-26 03:55:44 -07:00
Steve French	d61e5808d9	[CIFS] acl support part 7 Also fixes typo, build break Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-26 04:32:43 +00:00
Linus Torvalds	7f14957453	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: fix sg_phys to use dma_addr_t ub: add sg_init_table for sense and read capacity commands x86: pci-gart fix blackfin: fix sg fallout xtensa: dma-mapping.h is using linux/scatterlist.h functions, so include it SG: audit of drivers that use blk_rq_map_sg() arch/um/drivers/ubd_kern.c: fix a building error SG: Change sg_set_page() to take length and offset argument AVR32: Fix sg_page breakage mmc: sg fallout m68k: sg fallout More SG build fixes sg: add missing sg_init_table calls to zfcp SG build fix	2007-10-25 15:44:54 -07:00
Ram Gupta	f9e83489cb	fs: Fix to correct the mbcache entries counter This patch fixes the c_entry_count counter of the mbcache. Currently it increments the counter first & allocate the cache entry later. In case of failure to allocate the entry due to insufficient memory this counter is still left incremented. This patch fixes this anomaly. Signed-off-by: Ram Gupta <ram.gupta5@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-25 15:18:29 -07:00
David Howells	2a2da53b18	Fix pointer mismatches in proc_sysctl.c Fix pointer mismatches in proc_sysctl.c. The proc_handler() method returns a size_t through an arg pointer, but is given a pointer to a ssize_t to return into. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-25 15:16:49 -07:00
Steve French	630f3f0c45	[CIFS] acl support part 6 Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> CC: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-25 21:17:17 +00:00
Jens Axboe	642f149031	SG: Change sg_set_page() to take length and offset argument Most drivers need to set length and offset as well, so may as well fold those three lines into one. Add sg_assign_page() for those two locations that only needed to set the page, where the offset/length is set outside of the function context. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-24 11:20:47 +02:00
Neil Brown	432409eebc	NFS: Fix for bug in handling of errors for O_DIRECT writes Commit `eda3cef8dd` ("NFS: Fix error handling in nfs_direct_write_result()") ensured that if a WRITE returns an error, then data->res.verf->committed is not tested (as it is not initialised). Then commit `60fa3f769f` ("NFS: Fix two bugs in the O_DIRECT write code") inadvertently reverted this while fixing other problems. So move the test so that we never examine ->committed in an error case, and fix a speeling error while we are there. Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-23 16:41:21 -07:00
Steve French	44093ca2fe	[CIFS] acl support part 6 CC: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-23 21:22:55 +00:00
Latchesar Ionkov	22150c4f0f	9p: v9fs_vfs_rename incorrect clunk order In v9fs_vfs_rename function labels don't match the fids that are clunked. The correct clunk order is clunking newdirfid first and then olddirfid next. Signed-off-by: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-23 13:48:33 -05:00
Adrian Bunk	0a976297e1	9p: fix memleak in fs/9p/v9fs.c This patch fixes a memory leak introduced by commit `ba17674fe0`. Spotted by the Coverity checker. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2007-10-23 13:48:50 -05:00
Parag Warudkar	c94897790e	[CIFS] remove unused funtion compile warning when experimental off get rid of couple of unused function warnings which show up when CONFIG_CIFS_EXPERIMENTAL is not defined - wrap them in #ifdef CONFIG_CIFS_EXPERIMENTAL. Patch against current git. Signed-off-by: Parag Warudkar <kernel-stuff@comcast.net> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-23 18:09:48 +00:00
Linus Torvalds	6e506079c8	Merge git://git.infradead.org/mtd-2.6 * git://git.infradead.org/mtd-2.6: [MTD] [NOR] Fix deadlock in Intel chip driver caused by get_chip recursion [JFFS2] Fix return value from jffs2_write_end() [MTD] [OneNAND] Fix wrong free the static address in onenand_sim [MTD] [NAND] Replace -1 with -EBADMSG in nand error correction code [RSLIB] BUG() when passing illegal parameters to decode_rs8() or decode_rs16() [MTD] [NAND] treat any negative return value from correct() as an error [MTD] [NAND] nandsim: bugfix in initialization [MTD] Fix typo in Alauda config option help text. [MTD] [NAND] add s3c2440-specific read_buf/write_buf [MTD] [OneNAND] onenand-sim: fix kernel-doc and typos [JFFS2] Tidy up fix for ACL/permissions problem.	2007-10-23 08:56:50 -07:00
Randy Dunlap	0895e91d60	procfs: fix kernel-doc param warnings Fix mnt_flush_task() misplaced kernel-doc. Fix typos in some of the doc text. Warning(linux-2.6.23-git17//fs/proc/base.c:2280): No description found for parameter 'mnt' Warning(linux-2.6.23-git17//fs/proc/base.c:2280): No description found for parameter 'pid' Warning(linux-2.6.23-git17//fs/proc/base.c:2280): No description found for parameter 'tgid' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 19:40:02 -07:00
Linus Torvalds	69450bb5eb	Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block * 'sg' of git://git.kernel.dk/linux-2.6-block: Add CONFIG_DEBUG_SG sg validation Change table chaining layout Update arch/ to use sg helpers Update swiotlb to use sg helpers Update net/ to use sg helpers Update fs/ to use sg helpers [SG] Update drivers to use sg helpers [SG] Update crypto/ to sg helpers [SG] Update block layer to use sg helpers [SG] Add helpers for manipulating SG entries	2007-10-22 19:11:06 -07:00
Jens Axboe	60c74f8193	Update fs/ to use sg helpers Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:19:55 +02:00
Steve French	7efb35af73	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-10-22 16:28:19 +00:00
Christoph Hellwig	e38f981758	exportfs: update documentation Update documentation to the current state of affairs. Remove duplicated method descruptions in exportfs.h and point to Documentation/filesystems/ Exporting instead. Add a little file header comment in expfs.c describing what's going on and mentioning Neils and my copyright [1]. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	3965516440	exportfs: make struct export_operations const Now that nfsd has stopped writing to the find_exported_dentry member we an mark the export_operations const Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	cfaea787c0	exportfs: remove old methods Now that all filesystems are converted remove support for the old methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:21 -07:00
Christoph Hellwig	644f9ab3b0	ocfs2: new export ops OCFS2 has it's own 64bit-firendly filehandle format so we can't use the generic helpers here. I'll add a struct for the types later. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	34c0d15424	gfs2: new export ops Convert gfs2 to the new ops. Uses a similar structure to the generic helpers, but gfs2 has it's own file handle formats. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	be55caf177	reiserfs: new export ops Another nice little cleanup by using the new methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	905251a02e	isofs: new export ops Nice little cleanup by consolidating things a little and using a structure for the special file handle format. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	1305edad01	fat: new export ops Very little changes here, fat had a mostly no op decode_fh before and does not store any parent information. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	c38344fe9e	xfs: new export ops This one is a lot more complicated than the previous ones. XFS already had a very clever scheme for supporting 64bit inode numbers in filehandles, and I've reworked this to be some kind of a prototype for the generic 64bit inode filehandle support. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	a35132068a	ntfs: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	d425de7043	jfs: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	05da080482	efs: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	1b961ac05a	ext4: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:20 -07:00
Christoph Hellwig	74af0baad4	ext3: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Christoph Hellwig	2e4c68e303	ext2: new export ops Trivial switch over to the new generic helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Christoph Hellwig	2596110a39	exportfs: add new methods Add the guts for the new filesystem API to exportfs. There's now a fh_to_dentry method that returns a dentry for the object looked for given a filehandle fragment, and a fh_to_parent operation that returns the dentry for the encoded parent directory in case the file handle contains it. There are default implementations for these methods that only take a callback for an nfs-enhanced iget variant and implement the rest of the semantics. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Christoph Hellwig	6e91ea2bb0	exportfs: add fid type This patchset is a medium scale rewrite of the export operations interface. The goal is to make the interface less complex, and easier to understand from the filesystem side, aswell as preparing generic support for exporting of 64bit inode numbers. This touches all nfs exporting filesystems, and I've done testing on all of the filesystems I have here locally (xfs, ext2, ext3, reiserfs, jfs) This patch: Add a structured fid type so that we don't have to pass an array of u32 values around everywhere. It's a union of possible layouts. As a start there's only the u32 array and the traditional 32bit inode format, but there will be more in one of my next patchset when I start to document the various filehandle formats we have in lowlevel filesystems better. Also add an enum that gives the various filehandle types human- readable names. Note: Some people might think the struct containing an anonymous union is ugly, but I didn't want to pass around a raw union type. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: <linux-ext4@vger.kernel.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: David Chinner <dgc@sgi.com> Cc: Timothy Shimmin <tes@sgi.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Hugh Dickins <hugh@veritas.com> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: "Vladimir V. Saveliev" <vs@namesys.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:19 -07:00
Jan Kara	89910cccb8	ext2: avoid rec_len overflow with 64KB block size With 64KB blocksize, a directory entry can have size 64KB which does not fit into 16 bits we have for entry length. So we store 0xffff instead and convert the value when read from / written to disk. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
J. Bruce Fields	321bcf9216	dcache: don't expose uninitialized memory in /proc/<pid>/fd/<fd> Well, it's not especially important that target->d_iname get the contents of dentry->d_iname, but it's important that it get initialized with something, otherwise we're just exposing some random piece of memory to anyone who reads the link at /proc/<pid>/fd/<fd> for the deleted file, when it's still held open by someone. I've run a test program that copies a short (<36 character) name ontop of a long (>=36 character) name and see that the first time I run it, without this patch, I get unpredicatable results out of /proc/<pid>/fd/<fd>. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-22 08:13:18 -07:00
Nick Piggin	2a754b51aa	[JFFS2] Fix return value from jffs2_write_end() jffs2_write_end() is sometimes passing back a "written" length greater than the length we passed into it, leading to a BUG at mm/filemap.c:1749 when used with unionfs. It happens because we actually write more than was requested, to reduce log fragmentation. These "longer" writes are fine, but they shouldn't get propagated back to the vm/vfs. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-10-22 10:24:44 +01:00
Trond Myklebust	55b70a0300	NFS: Fix a typo in nfs_call_unlink() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-21 13:37:07 -04:00
Trond Myklebust	bad2a52411	NFSv2: Ensure that the directory metadata gets revalidated on file create Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-21 13:37:02 -04:00
Linus Torvalds	2fb59d623a	Merge branch 'audit.b43' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current * 'audit.b43' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: [PATCH] audit: watching subtrees [PATCH] new helper - inotify_evict_watch() [PATCH] new helper - inotify_clone_watch() [PATCH] new helpers - collect_mounts() and release_collected_mounts() [PATCH] pass dentry to audit_inode()/audit_inode_child()	2007-10-21 08:54:32 -07:00
Nick Piggin	efdc31319d	nobh: nobh_write_end fix This path mustn't have been tested :( I did attempt to exercise it by injecting failures here, but I suspect PageMappedToDisk may have been getting in the way. Will need more of a look, although I think nobh mode is OK for an -rc1 (it shouldn't eat anyone's data). Commit `03158cd7eb` ("fs: restore nobh") introcduced a NULL deref. Spotted by the Coverity checker. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-21 08:54:05 -07:00
Al Viro	74c3cbe33b	[PATCH] audit: watching subtrees New kind of audit rule predicates: "object is visible in given subtree". The part that can be sanely implemented, that is. Limitations: * if you have hardlink from outside of tree, you'd better watch it too (or just watch the object itself, obviously) * if you mount something under a watched tree, tell audit that new chunk should be added to watched subtrees * if you umount something in a watched tree and it's still mounted elsewhere, you will get matches on events happening there. New command tells audit to recalculate the trees, trimming such sources of false positives. Note that it's _not_ about path - if something mounted in several places (multiple mount, bindings, different namespaces, etc.), the match does _not_ depend on which one we are using for access. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:45 -04:00
Al Viro	455434d450	[PATCH] new helper - inotify_evict_watch() Kicks the watch out without dropping it. Called under ->inotify_mutex Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:38 -04:00
Al Viro	b9efe8a234	[PATCH] new helper - inotify_clone_watch() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:32 -04:00
Al Viro	8aec080945	[PATCH] new helpers - collect_mounts() and release_collected_mounts() Get a snapshot of a subtree, creating private clones of vfsmounts for all its components and release such snapshot resp. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:25 -04:00
Al Viro	5a190ae697	[PATCH] pass dentry to audit_inode()/audit_inode_child() makes caller simpler and allows to scan ancestors Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2007-10-21 02:37:18 -04:00
KaiGai Kohei	cfc8dc6f6f	[JFFS2] Tidy up fix for ACL/permissions problem. [In commit `9ed437c50d` we fixed a problem with standard permissions on newly-created inodes, when POSIX ACLs are enabled. This cleans it up...] The attached patch separate jffs2_init_acl() into two parts. The one is jffs2_init_acl_pre() called from jffs2_new_inode(). It compute ACL oriented inode->i_mode bits, and allocate in-memory ACL objects associated with the new inode just before when inode meta infomation is written to the medium. The other is jffs2_init_acl_post() called from jffs2_symlink(), jffs2_mkdir(), jffs2_mknod() and jffs2_do_create(). It actually writes in-memory ACL objects into the medium next to the success of writing meta-information. In the current implementation, we have to write a same inode meta infomation twice when inode->i_mode is updated by the default ACL. However, we can avoid the behavior by putting an updated i_mode before it is written at first, as jffs2_init_acl_pre() doing. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>	2007-10-20 14:10:54 +01:00
Steve French	748c5151de	Merge branch 'master' of /pub/scm/linux/kernel/git/torvalds/linux-2.6	2007-10-20 04:26:44 +00:00
Linus Torvalds	c00046c279	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (74 commits) fix do_sys_open() prototype sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake Documentation: Fix typo in SubmitChecklist. Typo: depricated -> deprecated Add missing profile=kvm option to Documentation/kernel-parameters.txt fix typo about TBI in e1000 comment proc.txt: Add /proc/stat field small documentation fixes Fix compiler warning in smount example program from sharedsubtree.txt docs/sysfs: add missing word to sysfs attribute explanation documentation/ext3: grammar fixes Documentation/java.txt: typo and grammar fixes Documentation/filesystems/vfs.txt: typo fix include/asm-*/system.h: remove unused set_rmb(), set_wmb() macros trivial copy_data_pages() tidy up Fix typo in arch/x86/kernel/tsc_32.c file link fix for Pegasus USB net driver help remove unused return within void return function Typo fixes retrun -> return x86 hpet.h: remove broken links ...	2007-10-19 20:36:17 -07:00
Linus Torvalds	b35e704118	Avoid compile error in fs/nfs/unlink.c Erez Zadok reports that certain configurations fail to build due to schedule() TASK_[UN]INTERRUPTIBLE not being declared. Add proper include files to fix. Cc: Erez Zadok <ezk@cs.sunysb.edu> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 19:59:18 -07:00
Olof Johansson	e9a404580c	nfs: Fix build break with CONFIG_NFS_V4=n Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 19:27:46 -07:00
Chris Malley	3932bf6059	sysfs: trivial: fix sysfs_create_file kerneldoc spelling mistake Spelling error in sysfs_create_file kerneldoc. Signed-off-by: Chris Malley <mail@chrismalley.co.uk> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 03:14:32 +02:00
Uwe Kleine-König	dbe7f76dd6	fix typo "insted" -> "instead" Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-20 01:55:04 +02:00
Steve French	4879b44829	[CIFS] ACL support part 5 Acked-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-19 21:57:39 +00:00
Linus Torvalds	b04cde34cf	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: NFSv4: Fix an rpc_cred reference leakage in fs/nfs/delegation.c NFSv4: Ensure that we wait for the CLOSE request to complete NFS: Fix a race in sillyrename NFS: Fix a writeback race...	2007-10-19 14:31:42 -07:00
Jan Engelhardt	96de0e252c	Convert files to UTF-8 and some cleanups * Convert files to UTF-8. * Also correct some people's names (one example is Eißfeldt, which was found in a source file. Given that the author used an ß at all in a source file indicates that the real name has in fact a 'ß' and not an 'ss', which is commonly used as a substitute for 'ß' when limited to 7bit.) * Correct town names (Goettingen -> Göttingen) * Update Eberhard Mönkeberg's address (http://lkml.org/lkml/2007/1/8/313) Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:21:04 +02:00
Trond Myklebust	603c83da19	NFSv4: Fix an rpc_cred reference leakage in fs/nfs/delegation.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-19 17:19:30 -04:00
Trond Myklebust	a49c3c7736	NFSv4: Ensure that we wait for the CLOSE request to complete Otherwise, we do end up breaking close-to-open semantics. We also end up breaking some of the silly-rename tests in Connectathon on some setups. Please refer to the bug-report at http://bugzilla.linux-nfs.org/show_bug.cgi?id=150 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-19 17:19:25 -04:00
Trond Myklebust	565277f63c	NFS: Fix a race in sillyrename lookup() and sillyrename() can race one another because the sillyrename() completion cannot take the parent directory's inode->i_mutex since the latter may be held by whoever is calling dput(). We therefore have little option but to add extra locking to ensure that nfs_lookup() and nfs_atomic_open() do not race with the sillyrename completion. If somebody has looked up the sillyrenamed file in the meantime, we just transfer the sillydelete information to the new dentry. Please refer to the bug-report at http://bugzilla.linux-nfs.org/show_bug.cgi?id=150 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-19 17:19:16 -04:00
Trond Myklebust	61e930a904	NFS: Fix a writeback race... This patch fixes a regression that was introduced by commit `44dd151d5c` We cannot zero the user page in nfs_mark_uptodate() any more, since a) We'd be modifying the page without holding the page lock b) We can race with other updates of the page, most notably because of the call to nfs_wb_page() in nfs_writepage_setup(). Instead, we do the zeroing in nfs_update_request() if we see that we're creating a request that might potentially be marked as up to date. Thanks to Olivier Paquet for reporting the bug and providing a test-case. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2007-10-19 17:18:57 -04:00
Robert P. J. Day	3a4fa0a25d	Fix misspellings of "system", "controller", "interrupt" and "necessary". Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2007-10-19 23:10:43 +02:00
Linus Torvalds	ec2626815b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: sched: fix guest time accounting going faster than user time accounting	2007-10-19 12:07:03 -07:00
Linus Torvalds	2843483d2e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (51 commits) [CIFS] log better errors on failed mounts [CIFS] Return better error when server requires signing but client forbids [CIFS] fix typo [CIFS] acl support part 4 [CIFS] Fix minor problems noticed by scan [CIFS] fix bad handling of EAGAIN error on kernel_recvmsg in cifs_demultiplex_thread [CIFS] build break [CIFS] endian fixes [CIFS] endian fixes in new acl code [CIFS] Fix some endianness problems in new acl code [CIFS] missing #endif from a previous patch [CIFS] formatting fixes [CIFS] Break up unicode_sessetup string functions [CIFS] parse server_GUID in SPNEGO negProt response [CIFS] [CIFS] Fix endian conversion problem in posix mkdir [CIFS] fix build break when lanman not enabled [CIFS] remove two sparse warnings [CIFS] remove compile warnings when debug disabled [CIFS] CIFS ACL support part 3 ...	2007-10-19 12:00:58 -07:00
Linus Torvalds	f768f9d375	Merge branch 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6 * 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-2.6: [XFS] cleanup fid types mess [XFS] fixups after behavior removal merge into mainline git	2007-10-19 11:55:50 -07:00
Pavel Emelyanov	69cccb887a	Use task_pid_nr() instead of pid_nr(task_pid()) There are two places that do so - the cgroups subsystem and the autofs code. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Ian Kent <raven@themaw.net> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Pavel Emelyanov	457c25107b	Remove unused variables from fs/proc/base.c When removing the explicit task_struct->pid usage I found that proc_readfd_common() and proc_pident_readdir() get this field, but do not use it at all. So this cleanup is a cheap help with the task_struct->pid isolation. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Pavel Emelyanov	ba25f9dcc4	Use helpers to obtain task pid in printks The task_struct->pid member is going to be deprecated, so start using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in the kernel. The first thing to start with is the pid, printed to dmesg - in this case we may safely use task_pid_nr(). Besides, printks produce more (much more) than a half of all the explicit pid usage. [akpm@linux-foundation.org: git-drm went and changed lots of stuff] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:43 -07:00
Eugene Teo	270f722d4d	Fix tsk->exit_state usage tsk->exit_state can only be 0, EXIT_ZOMBIE, or EXIT_DEAD. A non-zero test is the same as tsk->exit_state & (EXIT_ZOMBIE \| EXIT_DEAD), so just testing tsk->exit_state is sufficient. Signed-off-by: Eugene Teo <eugeneteo@kernel.sg> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Neil Horman	d85f50d5e1	proc: export a processes resource limits via /proc/pid Currently, there exists no method for a process to query the resource limits of another process. They can be inferred via some mechanisms but they cannot be explicitly determined. Given that this information can be usefull to know during the debugging of an application, I've written this patch which exports all of a processes limits via /proc/<pid>/limits. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:42 -07:00
Jiri Slaby	1276b103c2	fs/select, remove unused macros fs/select, remove unused macros this is due to preparation for global BIT macro Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00
Pavel Emelyanov	bac0abd617	Isolate some explicit usage of task->tgid With pid namespaces this field is now dangerous to use explicitly, so hide it behind the helpers. Also the pid and pgrp fields o task_struct and signal_struct are to be deprecated. Unfortunately this patch cannot be sent right now as this leads to tons of warnings, so start isolating them, and deprecate later. Actually the p->tgid == pid has to be changed to has_group_leader_pid(), but Oleg pointed out that in case of posix cpu timers this is the same, and thread_group_leader() is more preferable. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	228ebcbe63	Uninline find_task_by_xxx set of functions The find_task_by_something is a set of macros are used to find task by pid depending on what kind of pid is proposed - global or virtual one. All of them are wrappers above the most generic one - find_task_by_pid_type_ns() - and just substitute some args for it. It turned out, that dereferencing the current->nsproxy->pid_ns construction and pushing one more argument on the stack inline cause kernel text size to grow. This patch moves all this stuff out-of-line into kernel/pid.c. Together with the next patch it saves a bit less than 400 bytes from the .text section. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	b488893a39	pid namespaces: changes to show virtual ids to user This is the largest patch in the set. Make all (I hope) the places where the pid is shown to or get from user operate on the virtual pids. The idea is: - all in-kernel data structures must store either struct pid itself or the pid's global nr, obtained with pid_nr() call; - when seeking the task from kernel code with the stored id one should use find_task_by_pid() call that works with global pids; - when showing pid's numerical value to the user the virtual one should be used, but however when one shows task's pid outside this task's namespace the global one is to be used; - when getting the pid from userspace one need to consider this as the virtual one and use appropriate task/pid-searching functions. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: nuther build fix] [akpm@linux-foundation.org: yet nuther build fix] [akpm@linux-foundation.org: remove unneeded casts] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	6f4e643353	pid namespaces: initialize the namespace's proc_mnt The namespace's proc_mnt must be kern_mount-ed to make this pointer always valid, independently of whether the user space mounted the proc or not. This solves raced in proc_flush_task, etc. with the proc_mnt switching from NULL to not-NULL. The initialization is done after the init's pid is created and hashed to make proc_get_sb() finr it and get for root inode. Sice the namespace holds the vfsmnt, vfsmnt holds the superblock and the superblock holds the namespace we must explicitly break this circle to destroy all the stuff. This is done after the init of the namespace dies. Running a few steps forward - when init exits it will kill all its children, so no proc_mnt will be needed after its death. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:40 -07:00
Pavel Emelyanov	130f77ecb2	pid namespaces: make proc_flush_task() actually from entries from multiple namespaces This means that proc_flush_task_mnt() is to be called for many proc mounts and with different ids, depending on the namespace this pid is to be flushed from. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	07543f5c75	pid namespaces: make proc have multiple superblocks - one for each namespace Each pid namespace have to be visible through its own proc mount. Thus we need to have per-namespace proc trees with their own superblocks. We cannot easily show different pid namespace via one global proc tree, since each pid refers to different tasks in different namespaces. E.g. pid 1 refers to the init task in the initial namespace and to some other task when seeing from another namespace. Moreover - pid, exisintg in one namespace may not exist in the other. This approach has one move advantage is that the tasks from the init namespace can see what tasks live in another namespace by reading entries from another proc tree. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	198fe21b0a	pid namespaces: helpers to find the task by its numerical ids When searching the task by numerical id on may need to find it using global pid (as it is done now in kernel) or by its virtual id, e.g. when sending a signal to a task from one namespace the sender will specify the task's virtual id and we should find the task by this value. [akpm@linux-foundation.org: fix gfs2 linkage] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:39 -07:00
Pavel Emelyanov	60347f6716	pid namespaces: prepare proc_flust_task() to flush entries from multiple proc trees The first part is trivial - we just make the proc_flush_task() to operate on arbitrary vfsmount with arbitrary ids and pass the pid and global proc_mnt to it. The other change is more tricky: I moved the proc_flush_task() call in release_task() higher to address the following problem. When flushing task from many proc trees we need to know the set of ids (not just one pid) to find the dentries' names to flush. Thus we need to pass the task's pid to proc_flush_task() as struct pid is the only object that can provide all the pid numbers. But after __exit_signal() task has detached all his pids and this information is lost. This creates a tiny gap for proc_pid_lookup() to bring some dentries back to tree and keep them in hash (since pids are still alive before __exit_signal()) till the next shrink, but since proc_flush_task() does not provide a 100% guarantee that the dentries will be flushed, this is OK to do so. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Pavel Emelyanov	8bf9725c29	pid namespaces: introduce MS_KERNMOUNT flag This flag tells the .get_sb callback that this is a kern_mount() call so that it can trust data pointer to be valid in-kernel one. If this flag is passed from the user process, it is cleared since the data pointer is not a valid kernel object. Running a few steps forward - this will be needed for proc to create the superblock and store a valid pid namespace on it during the namespace creation. The reason, why the namespace cannot live without proc mount is described in the appropriate patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Paul Menage <menage@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Matthias Kaehlcke	d473012710	fs/super.c: use list_for_each_entry() instead of list_for_each() fs/super.c: use list_for_each_entry() instead of list_for_each() in sget() [akpm@linux-foundation.org: clean up some crap while we're there] Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Matthias Kaehlcke	b70c394099	fs/eventpoll.c: use list_for_each_entry() instead of list_for_each() fs/eventpoll.c: use list_for_each_entry() instead of list_for_each() in ep_poll_safewake() Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Matthias Kaehlcke	cfdaf9e5f9	fs/file_table.c: use list_for_each_entry() instead of list_for_each() fs/file_table.c: use list_for_each_entry() instead of list_for_each() in fs_may_remount_ro() Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:38 -07:00
Pavel Emelyanov	cf7b708c8d	Make access to task's nsproxy lighter When someone wants to deal with some other taks's namespaces it has to lock the task and then to get the desired namespace if the one exists. This is slow on read-only paths and may be impossible in some cases. E.g. Oleg recently noticed a race between unshare() and the (sent for review in cgroups) pid namespaces - when the task notifies the parent it has to know the parent's namespace, but taking the task_lock() is impossible there - the code is under write locked tasklist lock. On the other hand switching the namespace on task (daemonize) and releasing the namespace (after the last task exit) is rather rare operation and we can sacrifice its speed to solve the issues above. The access to other task namespaces is proposed to be performed like this: rcu_read_lock(); nsproxy = task_nsproxy(tsk); if (nsproxy != NULL) { / * * work with the namespaces here * e.g. get the reference on one of them * / } / * * NULL task_nsproxy() means that this task is * almost dead (zombie) * / rcu_read_unlock(); This patch has passed the review by Eric and Oleg :) and, of course, tested. [clg@fr.ibm.com: fix unshare()] [ebiederm@xmission.com: Update get_net_ns_by_pid] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Sukadev Bhattiprolu	3743ca05ff	pid namespaces: use task_pid() to find leader's pid Use task_pid() to get leader's 'struct pid' and avoid the find_pid(). Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Sukadev Bhattiprolu	88f21d8182	pid namespaces: rename child_reaper() function Rename the child_reaper() function to task_child_reaper() to be similar to other task_* functions and to distinguish the function from 'struct pid_namspace.child_reaper'. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Cc: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Sukadev Bhattiprolu	2894d650cd	pid namespaces: define and use task_active_pid_ns() wrapper With multiple pid namespaces, a process is known by some pid_t in every ancestor pid namespace. Every time the process forks, the child process also gets a pid_t in every ancestor pid namespace. While a process is visible in >=1 pid namespaces, it can see pid_t's in only one pid namespace. We call this pid namespace it's "active pid namespace", and it is always the youngest pid namespace in which the process is known. This patch defines and uses a wrapper to find the active pid namespace of a process. The implementation of the wrapper will be changed in when support for multiple pid namespaces are added. Changelog: 2.6.22-rc4-mm2-pidns1: - [Pavel Emelianov, Alexey Dobriyan] Back out the change to use task_active_pid_ns() in child_reaper() since task->nsproxy can be NULL during task exit (so child_reaper() continues to use init_pid_ns). to implement child_reaper() since init_pid_ns.child_reaper to implement child_reaper() since tsk->nsproxy can be NULL during exit. 2.6.21-rc6-mm1: - Rename task_pid_ns() to task_active_pid_ns() to reflect that a process can have multiple pid namespaces. Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Pavel Emelianov	a47afb0f9d	pid namespaces: round up the API The set of functions process_session, task_session, process_group and task_pgrp is confusing, as the names can be mixed with each other when looking at the code for a long time. The proposals are to * equip the functions that return the integer with _nr suffix to represent that fact, * and to make all functions work with task (not process) by making the common prefix of the same name. For monotony the routines signal_session() and set_signal_session() are replaced with task_session_nr() and set_task_session(), especially since they are only used with the explicit task->signal dereference. Signed-off-by: Pavel Emelianov <xemul@openvz.org> Acked-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Paul Menage	8793d854ed	Task Control Groups: make cpusets a client of cgroups Remove the filesystem support logic from the cpusets system and makes cpusets a cgroup subsystem The "cpuset" filesystem becomes a dummy filesystem; attempts to mount it get passed through to the cgroup filesystem with the appropriate options to emulate the old cpuset filesystem behaviour. Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Paul Menage	a424316ca1	Task Control Groups: add procfs interface Add: /proc/cgroups - general system info /proc/*/cgroup - per-task cgroup membership info [a.p.zijlstra@chello.nl: cgroups: bdi init hooks] Signed-off-by: Paul Menage <menage@google.com> Cc: Serge E. Hallyn <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Paul Jackson <pj@sgi.com> Cc: Kirill Korotaev <dev@openvz.org> Cc: Herbert Poetzl <herbert@13thfloor.at> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:36 -07:00
Jeff Mahoney	cb680c1be6	reiserfs: ignore on disk s_bmap_nr value Implement support for file systems larger than 8 TiB. The reiserfs superblock contains a 16 bit value for counting the number of bitmap blocks. The rest of the disk format supports file systems up to 2^32 blocks, but the bitmap block limitation artificially limits this to 8 TiB with a 4KiB block size. Rather than trust the superblock's 16-bit bitmap block count, we calculate it dynamically based on the number of blocks in the file system. When an incorrect value is observed in the superblock, it is zeroed out, ensuring that older kernels will not be able to mount the file system. Userspace support has already been implemented and shipped in reiserfsprogs 3.6.20. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	4d20851d37	reiserfs: remove first_zero_hint The first_zero_hint metadata caching was never actually used, and it's of dubious optimization quality. This patch removes it. It doesn't actually shrink the size of the reiserfs_bitmap_info struct, since that doesn't work with block sizes larger than 8K. There was a big fixme in there, and with all the work lately in allowing block size > page size, I might as well kill the fixme as well. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	3ee1667042	reiserfs: fix usage of signed ints for block numbers Do a quick signedness check for block numbers. There are a number of places where signed integers are used for block numbers, which limits the usable file system size to 8 TiB. The disk format, excepting a problem which will be fixed in the following patch, supports file systems up to 16 TiB in size. This patch cleans up those sites so that we can enable the full usable size. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	6c57c2c8d3	reiserfs: fix memset byte count during resize Correct the memset in reiserfs_resize to clear the memory allocated for the new bitmap info structs. Previously, it would clear the memory used by the old size. Depending on the contents of memory, this could cause incorrect caching behavior for bitmap blocks in the newly allocated area. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	d4c3d19d0c	reiserfs: use is_reusable to catch corruption Build in is_reusable() unconditionally and use it to catch corruption before it reaches the block freeing paths. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	8e186e454e	reiserfs: dont use BUG when panicking Change reiserfs_panic() to use panic() initially instead of BUG(). Using BUG() ignores the configurable panic behavior, so systems that should be failing and rebooting are left hanging. This causes problems in active/standby HA scenarios. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jeff Mahoney	7598392894	reiserfs: fix up lockdep warnings Add I_MUTEX_XATTR annotations to the inode locking in the reiserfs xattr code. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jose R. Santos	9ad163ae0d	JBD: Fix JBD warnings when compiling with CONFIG_JBD_DEBUG Note from Mingming's JBD2 fix: Noticed all warnings are occurs when the debug level is 0. Then found the "jbd2: Move jbd2-debug file to debugfs" patch http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=0f49d5d019afa4e94253bfc92f0daca3badb990b changed the jbd2_journal_enable_debug from int type to u8, makes the jbd_debug comparision is always true when the debugging level is 0. Thus the compile warning occurs. Thought about changing the jbd2_journal_enable_debug data type back to int, but can't, because the jbd2-debug is moved to debug fs, where calling debugfs_create_u8() to create the debugfs entry needs the value to be u8 type. Even if we changed the data type back to int, the code is still buggy, kernel should not print jbd2 debug message if the jbd2_journal_enable_debug is set to 0. But this is not the case. The fix is change the level of debugging to 1. The same should fixed in ext3/JBD, but currently ext3 jbd-debug via /proc fs is broken, so we probably should fix it all together. Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jan Kara	7a266e75cf	jbd: fix commit code to properly abort journal We should really call journal_abort() and not __journal_abort_hard() in case of errors. The latter call does not record the error in the journal superblock and thus filesystem won't be marked as with errors later (and user could happily mount it without any warning). Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Jose R. Santos	c2a9159cdd	jbd: config_jbd_debug cannot create /proc entry The jbd-debug file used to be located in /proc/sys/fs/jbd-debug, but create_proc_entry() does not do lookups on file names that are more that one directory deep. This causes the entry creation to fail and hence, no proc file is created. Instead of fixing this on procfs might as well move the jbd2-debug file to debugfs which would be the preferred location for this kind of tunable. The new location is now /sys/kernel/debug/jbd/jbd-debug. [akpm@linux-foundation.org: zillions of cleanups] Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Acked-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:35 -07:00
Mingming Cao	8c3478a523	JBD/ext3 cleanups: convert to kzalloc Convert kmalloc to kzalloc() and get rid of the memset(). Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:34 -07:00
Coly Li	3ed75eb8f1	setup vma->vm_page_prot by vm_get_page_prot() This patch uses vm_get_page_prot() to setup vma->vm_page_prot. Though inside vm_get_page_prot() the protection flags is AND with (VM_READ\|VM_WRITE\|VM_EXEC\|VM_SHARED), it does not hurt correct code. Signed-off-by: Coly Li <coyli@suse.de> Cc: Hugh Dickins <hugh@veritas.com> Cc: Tony Luck <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:34 -07:00
Miklos Szeredi	c18479fe01	put declaration of put_filesystem() in fs.h Declarations go into headers. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Ram Pai <linuxram@us.ibm.com> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:33 -07:00
Christian Borntraeger	f9e26291be	sched: fix guest time accounting going faster than user time accounting cputime_add already adds, dont do it twice. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2007-10-19 20:52:40 +02:00
Christoph Hellwig	c6143911a7	[XFS] cleanup fid types mess Currently XFs has three different fid types: struct fid, struct xfs_fid and struct xfs_fid2 with hte latter two beeing identicaly and the first one beeing the same size but an unstructured array with the same size. This patch consolidates all this to alway uuse struct xfs_fid. This patch is required for an upcoming patch series from me that revamps the nfs exporting code and introduces a Linux-wide struct fid. SGI-PV: 970336 SGI-Modid: xfs-linux-melb:xfs-kern:29651a Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-10-19 18:02:55 +10:00
Christoph Hellwig	c8fcfac5a2	[XFS] fixups after behavior removal merge into mainline git Fixup for lack of dmapi support and no quota module support. SGI-PV: 969985 Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com>	2007-10-19 17:14:45 +10:00
Steve French	a761ac579b	[CIFS] log better errors on failed mounts Also returns more accurate errors to mount for the cases of account expired and password expired Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2007-10-18 21:45:27 +00:00
Stephen Hemminger	c80544dc0b	sparse pointer use of zero as null Get rid of sparse related warnings from places that use integer as NULL pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Ian Kent <raven@themaw.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00
Miklos Szeredi	0e9663ee45	fuse: add blksize field to fuse_attr There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes it, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:31 -07:00

... 3 4 5 6 7 ...

7415 Commits