linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	16b9057804	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "This the bunch that sat in -next + lock_parent() fix. This is the minimal set; there's more pending stuff. In particular, I really hope to get acct.c fixes merged this cycle - we need that to deal sanely with delayed-mntput stuff. In the next pile, hopefully - that series is fairly short and localized (kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more iov_iter work. Most of prereqs for ->splice_write with sane locking order are there and Kent's dio rewrite would also fit nicely on top of this pile" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits) lock_parent: don't step on stale ->d_parent of all-but-freed one kill generic_file_splice_write() ceph: switch to iter_file_splice_write() shmem: switch to iter_file_splice_write() nfs: switch to iter_splice_write_file() fs/splice.c: remove unneeded exports ocfs2: switch to iter_file_splice_write() ->splice_write() via ->write_iter() bio_vec-backed iov_iter optimize copy_page_{to,from}_iter() bury generic_file_aio_{read,write} lustre: get rid of messing with iovecs ceph: switch to ->write_iter() ceph_sync_direct_write: stop poking into iov_iter guts ceph_sync_read: stop poking into iov_iter guts new helper: copy_page_from_iter() fuse: switch to ->write_iter() btrfs: switch to ->write_iter() ocfs2: switch to ->write_iter() xfs: switch to ->write_iter() ...	2014-06-12 10:30:18 -07:00
Al Viro	8d0207652c	->splice_write() via ->write_iter() iter_file_splice_write() - a ->splice_write() instance that gathers the pipe buffers, builds a bio_vec-based iov_iter covering those and feeds it to ->write_iter(). A bunch of simple cases coverted to that... [AV: fixed the braino spotted by Cyrill] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-06-12 00:18:51 -04:00
Artem Bityutskiy	ba6a7d5563	UBIFS: fix debugging check The debugging check which verifies that we never write outside of the file length was incorrect, since it was multiplying file length by the page size, instead of dividing. Fix this. Spotted-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2014-05-28 11:10:09 +03:00
Daniel Golle	a0fd59511e	UBIFS: add missing ui pointer in debugging code If UBIFS_DEBUG is defined an additional assertion of the ui_lock spinlock in do_writepage cannot compile because the ui pointer has not been previously declared. Fix this by declaring and initializing the ui pointer in case UBIFS_DEBUG is defined. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2014-05-27 15:23:40 +03:00
hujianyang	691a7c6f28	UBIFS: fix an mmap and fsync race condition There is a race condition in UBIFS: Thread A (mmap) Thread B (fsync) ->__do_fault ->write_cache_pages -> ubifs_vm_page_mkwrite -> budget_space -> lock_page -> release/convert_page_budget -> SetPagePrivate -> TestSetPageDirty -> unlock_page -> lock_page -> TestClearPageDirty -> ubifs_writepage -> do_writepage -> release_budget -> ClearPagePrivate -> unlock_page -> !(ret & VM_FAULT_LOCKED) -> lock_page -> set_page_dirty -> ubifs_set_page_dirty -> TestSetPageDirty (set page dirty without budgeting) -> unlock_page This leads to situation where we have a diry page but no budget allocated for this page, so further write-back may fail with -ENOSPC. In this fix we return from page_mkwrite without performing unlock_page. We return VM_FAULT_LOCKED instead. After doing this, the race above will not happen. Signed-off-by: hujianyang <hujianyang@huawei.com> Tested-by: Laurence Withers <lwithers@guralp.com> Cc: stable@vger.kernel.org Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2014-05-13 13:45:15 +03:00
Al Viro	f5674c31ee	ubifs: switch to ->write_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-06 17:39:38 -04:00
Al Viro	aad4f8bb42	switch simple generic_file_aio_read() users to ->read_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-05-06 17:37:55 -04:00
Kirill A. Shutemov	f1820361f8	mm: implement ->map_pages for page cache filemap_map_pages() is generic implementation of ->map_pages() for filesystems who uses page cache. It should be safe to use filemap_map_pages() for ->map_pages() if filesystem use filemap_fault() for ->fault(). Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Cc: Ning Qu <quning@gmail.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-04-07 16:35:53 -07:00
Lukas Czerner	d47992f86b	mm: change invalidatepage prototype to accept length Currently there is no way to truncate partial page where the end truncate point is not at the end of the page. This is because it was not needed and the functionality was enough for file system truncate operation to work properly. However more file systems now support punch hole feature and it can benefit from mm supporting truncating page just up to the certain point. Specifically, with this functionality truncate_inode_pages_range() can be changed so it supports truncating partial page at the end of the range (currently it will BUG_ON() if 'end' is not at the end of the page). This commit changes the invalidatepage() address space operation prototype to accept range to be invalidated and update all the instances for it. We also change the block_invalidatepage() in the same way and actually make a use of the new length argument implementing range invalidation. Actual file system implementations will follow except the file systems where the changes are really simple and should not change the behaviour in any way .Implementation for truncate_page_range() which will be able to accept page unaligned ranges will follow as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Hugh Dickins <hughd@google.com>	2013-05-21 23:17:23 -04:00
Kent Overstreet	a27bb332c0	aio: don't include aio.h in sched.h Faster kernel compiles by way of fewer unnecessary includes. [akpm@linux-foundation.org: fix fallout] [akpm@linux-foundation.org: fix build] Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Zach Brown <zab@redhat.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jens Axboe <axboe@kernel.dk> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Reviewed-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 20:16:25 -07:00
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Jan Kara	182dcfd648	ubifs: wait for page writeback to provide stable pages When stable pages are required, we have to wait if the page is just going to disk and we want to modify it. Add proper callback to ubifs_vm_page_mkwrite(). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-21 17:22:20 -08:00
Konstantin Khlebnikov	0b173bc4da	mm: kill vma flag VM_CAN_NONLINEAR Move actual pte filling for non-linear file mappings into the new special vma operation: ->remap_pages(). Filesystems must implement this method to get non-linear mapping support, if it uses filemap_fault() then generic_file_remap_pages() can be used. Now device drivers can implement this method and obtain nonlinear vma support. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Chris Metcalf <cmetcalf@tilera.com> #arch/tile Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Eric Paris <eparis@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Kentaro Takeda <takedakn@nttdata.co.jp> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Venkatesh Pallipadi <venki@google.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:17 +09:00
Artem Bityutskiy	79fda5179a	UBIFS: comply with coding style Join all the split printk lines in order to stop checkpatch complaining. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-08-31 17:32:57 +03:00
Artem Bityutskiy	5c57f20b82	UBIFS: nuke pdflush from comments The pdflush thread is long gone, so this patch removes references to pdflush from UBIFS comments. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-08-04 12:15:41 +04:00
Artem Bityutskiy	edf6be245f	UBIFS: rename dumping functions This commit re-names all functions which dump something from "dbg_dump_()" to "ubifs_dump_()". This is done for consistency with UBI and because this way it will be more logical once we remove the debugging sompilation option. Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-05-16 19:15:56 +03:00
Subodh Nijsure	1bdcc63112	UBIFS: remove xattr Kconnfig option Remove CONFIG_UBIFS_FS_XATTR configuration option and associated UBIFS_FS_XATTR ifdefs. Testing: Tested using integck while using nandsim on x86 & MX28 based platform with Micron MT29F2G08ABAEAH4 nand. Signed-off-by: Subodh Nijsure <snijsure@grid-net.com> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>	2012-05-03 14:11:11 +03:00
Cong Wang	a1c7c13783	ubifs: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:26 +08:00
Linus Torvalds	bbd9d6f7fb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits) vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp isofs: Remove global fs lock jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory fix IN_DELETE_SELF on overwriting rename() on ramfs et.al. mm/truncate.c: fix build for CONFIG_BLOCK not enabled fs:update the NOTE of the file_operations structure Remove dead code in dget_parent() AFS: Fix silly characters in a comment switch d_add_ci() to d_splice_alias() in "found negative" case as well simplify gfs2_lookup() jfs_lookup(): don't bother with . or .. get rid of useless dget_parent() in btrfs rename() and link() get rid of useless dget_parent() in fs/btrfs/ioctl.c fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers drivers: fix up various ->llseek() implementations fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek Ext4: handle SEEK_HOLE/SEEK_DATA generically Btrfs: implement our own ->llseek fs: add SEEK_HOLE and SEEK_DATA flags reiserfs: make reiserfs default to barrier=flush ... Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new shrinker callout for the inode cache, that clashed with the xfs code to start the periodic workers later.	2011-07-22 19:02:39 -07:00
Josef Bacik	02c24a8218	fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers Btrfs needs to be able to control how filemap_write_and_wait_range() is called in fsync to make it less of a painful operation, so push down taking i_mutex and the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some file systems can drop taking the i_mutex altogether it seems, like ext3 and ocfs2. For correctness sake I just pushed everything down in all cases to make sure that we keep the current behavior the same for everybody, and then each individual fs maintainer can make up their mind about what to do from there. Thanks, Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-07-20 20:47:59 -04:00
Artem Bityutskiy	d808efb407	UBIFS: amend debugging inode size check function prototype Add 'const struct ubifs_info *c' parameter to 'dbg_check_synced_i_size()' function because we'll need it in the next patch when we switch to debugfs. So this patch is just a preparation. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-07-04 10:54:27 +03:00
Artem Bityutskiy	2cd0a60cf4	UBIFS: remove strange commentary Remove the following commentary from 'ubifs_file_mmap()': /* 'generic_file_mmap()' takes care of NOMMU case */ I do not understand what it means, and I could not find anything relater to NOMMU in 'generic_file_mmap()'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-05-13 19:23:55 +03:00
Artem Bityutskiy	3b2f9a019e	UBIFS: use ro_mount instead of MS_RDONLY We have our own flags indicating R/O mode, and c->ro_mode is equivalent to MS_RDONLY. Let's be consistent and use UBIFS flags everywhere. This patch is just a minor cleanup. Additionally, add a comment that we are surprised with VFS behavior - as a reminder to look at this some day. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-05-13 19:23:55 +03:00
Artem Bityutskiy	b137545c44	UBIFS: introduce a separate structure for budgeting info This patch separates out all the budgeting-related information from 'struct ubifs_info' to 'struct ubifs_budg_info'. This way the code looks a bit cleaner. However, the main driver for this is that we want to save budgeting information and print it later, so a separate data structure for this is helpful. This patch is a preparation for the further debugging output improvements. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-05-13 19:23:53 +03:00
Artem Bityutskiy	c43615702f	UBIFS: fix minor stylistic issues Fix several minor stylistic issues: * lines longer than 80 characters * space before closing parenthesis ')' * spaces in the indentations Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-05-13 19:23:53 +03:00
Artem Bityutskiy	78530bf7f2	UBIFS: fix oops when R/O file-system is fsync'ed This patch fixes severe UBIFS bug: UBIFS oopses when we 'fsync()' an file on R/O-mounter file-system. We (the UBIFS authors) incorrectly thought that VFS would not propagate 'fsync()' down to the file-system if it is read-only, but this is not the case. It is easy to exploit this bug using the following simple perl script: use strict; use File::Sync qw(fsync sync); die "File path is not specified" if not defined $ARGV[0]; my $path = $ARGV[0]; open FILE, "<", "$path" or die "Cannot open $path: $!"; fsync(\*FILE) or die "cannot fsync $path: $!"; close FILE or die "Cannot close $path: $!"; Thanks to Reuben Dowle <Reuben.Dowle@navico.com> for reporting about this issue. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Reported-by: Reuben Dowle <Reuben.Dowle@navico.com> Cc: stable@kernel.org	2011-04-13 10:43:32 +03:00
Artem Bityutskiy	6ed09c34b7	UBIFS: fix assertion warning and refine comments This patch fixes the following UBIFS assertion warning: UBIFS assert failed in do_readpage at 115 (pid 199) [<b00321b8>] (unwind_backtrace+0x0/0xdc) from [<af025118>] (do_readpage+0x108/0x594 [ubifs]) [<af025118>] (do_readpage+0x108/0x594 [ubifs]) from [<af025764>] (ubifs_write_end+0x1c0/0x2e8 [ubifs]) [<af025764>] (ubifs_write_end+0x1c0/0x2e8 [ubifs]) from [<b00a0164>] (generic_file_buffered_write+0x18c/0x270) [<b00a0164>] (generic_file_buffered_write+0x18c/0x270) from [<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0) [<b00a08d4>] (__generic_file_aio_write+0x478/0x4c0) from [<b00a0984>] (generic_file_aio_write+0x68/0xc8) [<b00a0984>] (generic_file_aio_write+0x68/0xc8) from [<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs]) [<af024a78>] (ubifs_aio_write+0x178/0x1d8 [ubifs]) from [<b00d104c>] (do_sync_write+0xb0/0x100) [<b00d104c>] (do_sync_write+0xb0/0x100) from [<b00d1abc>] (vfs_write+0xac/0x154) [<b00d1abc>] (vfs_write+0xac/0x154) from [<b00d1c10>] (sys_write+0x3c/0x68) [<b00d1c10>] (sys_write+0x3c/0x68) from [<b002d9a0>] (ret_fast_syscall+0x0/0x2c) The 'PG_checked' flag is used to indicate that the page does not supposedly exist on the media (e.g., a hole or a page beyond the inode size), so it requires slightly bigger budget, because we have to account the indexing size increase. And this flag basically tells that the budget for this page has to be "new page budget". The "new page budget" is slightly bigger than the "existing page budget". The 'do_readpage()' function has the following assertion which sometimes is hit: 'ubifs_assert(!PageChecked(page))'. Obviously, the meaning of this assertion is: "I should not be asked to read a page which does not exist on the media". However, in 'ubifs_write_begin()' we have a small "trick". Notice, that VFS may write pages which were not read yet, so the page data were not loaded from the media to the page cache yet. If VFS tells that it is going to change only some part of the page, we obviously have to load it from the media. However, if VFS tells that it is going to change whole page, we do not read it from the media for optimization purposes. However, since we do not read it, we do not know if it exists on the media or not (a hole, etc). So we set the 'PG_checked' flag to this page to force bigger budget, just in case. So 'ubifs_write_begin()' sets 'PG_checked'. Then we are in 'ubifs_write_end()'. And VFS tells us: "hey, for some reasons I changed my mind and did not change whole page". Frankly, I do not know why this happens, but I hit this somehow on an ARM platform. And this is extremely rare. So in this case UBIFS does the following: 1. Cancels allocated budget. 2. Loads the page from the media by calling 'do_readpage()'. 3. Asks VFS to repeat the whole write operation from the very beginning (call '->write_begin() again, etc). And the assertion warning is hit at the step 2 - remember we have the 'PG_checked' set for this page, and 'do_readpage()' does not like this. So this patch fixes the problem by adding step 1.5 and cleaning the 'PG_checked' before calling 'do_readpage()'. All in all, this patch does not fix any functionality issue, but it silences UBIFS false positive warning which may happen in very very rare cases. And while on it, this patch also improves a commentary which explains the reasons of setting the 'PG_checked' flag for the page. The old commentary was a bit difficult to understand. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2011-03-24 16:16:18 +02:00
Artem Bityutskiy	2ef13294d2	UBIFS: introduce new flags for RO mounts Commit `2fde99cb55` "UBIFS: mark VFS SB RO too" introduced regression. This commit made UBIFS set the 'MS_RDONLY' flag in the VFS superblock when it switches to R/O mode due to an error. This was done to make VFS show the R/O UBIFS flag in /proc/mounts. However, several places in UBIFS relied on the 'MS_RDONLY' flag and assume this flag can only change when we re-mount. For example, 'ubifs_put_super()'. This patch introduces new UBIFS flag - 'c->ro_mount' which changes only when we re-mount, and preserves the way UBIFS was originally mounted (R/W or R/O). This allows us to de-initialize UBIFS cleanly in 'ubifs_put_super()'. This patch also changes all 'ubifs_assert(!c->ro_media)' assertions to 'ubifs_assert(!c->ro_media && !c->ro_mount)', because we never should write anything if the FS was mounter R/O. All the places where we test for 'MS_RDONLY' flag in the VFS SB were changed and now we test the 'c->ro_mount' flag instead, because it preserves the original UBIFS mount type, unlike the 'MS_RDONLY' flag. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2010-09-19 21:07:58 +03:00
Artem Bityutskiy	2680d722bf	UBIFS: introduce new flag for RO due to errors The R/O state may have various reasons: 1. The UBI volume is R/O 2. The FS is mounted R/O 3. The FS switched to R/O mode because of an error However, in UBIFS we have only one variable which represents cases 1 and 3 - 'c->ro_media'. Indeed, we set this to 1 if we switch to R/O mode due to an error, and then we test it in many places to make sure that we stop writing as soon as the error happens. But this is very unclean. One consequence of this, for example, is that in 'ubifs_remount_fs()' we use 'c->ro_media' to check whether we are in R/O mode because on an error, and we print a message in this case. However, if we are in R/O mode because the media is R/O, our message is bogus. This patch introduces new flag - 'c->ro_error' which is set when we switch to R/O mode because of an error. It also changes all "if (c->ro_media)" checks to "if (c->ro_error)" checks, because this is what the checks actually mean. We do not need to check for 'c->ro_media' because if the UBI volume is in R/O mode, we do not allow R/W mounting, and now writes can happen. This is guaranteed by VFS. But it is good to double-check this, so this patch also adds many "ubifs_assert(!c->ro_media)" checks. In the 'ubifs_remount_fs()' function this patch makes a bit more changes - it fixes the error messages as well. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2010-09-17 17:08:09 +03:00
Christoph Hellwig	2c27c65ed0	check ATTR_SIZE contraints in inode_change_ok Make sure we check the truncate constraints early on in ->setattr by adding those checks to inode_change_ok. Also clean up and document inode_change_ok to make this obvious. As a fallout we don't have to call inode_newsize_ok from simple_setsize and simplify it down to a truncate_setsize which doesn't return an error. This simplifies a lot of setattr implementations and means we use truncate_setsize almost everywhere. Get rid of fat_setsize now that it's trivial and mark ext2_setsize static to make the calling convention obvious. Keep the inode_newsize_ok in vmtruncate for now as all callers need an audit for its removal anyway. Note: setattr code in ecryptfs doesn't call inode_change_ok at all and needs a deeper audit, but that is left for later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-08-09 16:47:39 -04:00
npiggin@suse.de	15c6fd9786	kill spurious reference to vmtruncate Lots of filesystems calls vmtruncate despite not implementing the old ->truncate method. Switch them to use simple_setsize and add some comments about the truncate code where it seems fitting. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:15:42 -04:00
Christoph Hellwig	7ea8085910	drop unused dentry argument to ->fsync Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-05-27 22:05:02 -04:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Christoph Hellwig	a9185b41a4	pass writeback_control to ->write_inode This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-05 13:25:52 -05:00
Christoph Hellwig	eaff8079d4	kill I_LOCK After I_SYNC was split from I_LOCK the leftover is always used together with I_NEW and thus superflous. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-17 11:03:25 -05:00
Christoph Hellwig	774888bcd6	UBIFS: remove manual O_SYNC handling generic_file_aio_write already calls into ->fsync to handle O_SYNC/O_DSYNC. Remove the duplicate call to ubifs_sync_wbufs_by_inode which is already covered by ubifs_fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-11-24 08:18:55 +02:00
Alexey Dobriyan	f0f37e2f77	const: mark struct vm_struct_operations * mark struct vm_area_struct::vm_ops as const * mark vm_ops in AGP code But leave TTM code alone, something is fishy there with global vm_ops being used. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-27 11:39:25 -07:00
Artem Bityutskiy	873a64c762	UBIFS: amend commentaries This patch amends and nicifies commentaries in file.c, as well as fixes some spelling problems. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-09-10 12:06:47 +03:00
Linus Torvalds	e0724bf6e4	Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6 * 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: fix recovery bug UBIFS: add R/O compatibility UBIFS: fix compiler warnings UBIFS: fully sort GCed nodes UBIFS: fix commentaries UBIFS: introduce a helpful variable UBIFS: use KERN_CONT UBIFS: fix lprops committing bug UBIFS: fix bogus assertion UBIFS: fix bug where page is marked uptodate when out of space UBIFS: amend key_hash return value UBIFS: improve find function interface UBIFS: list usage cleanup UBIFS: fix dbg_chk_lpt_sz()	2009-04-06 15:00:19 -07:00
Nick Piggin	c2ec175c39	mm: page_mkwrite change prototype to match fault Change the page_mkwrite prototype to take a struct vm_fault, and return VM_FAULT_xxx flags. There should be no functional change. This makes it possible to return much more detailed error information to the VM (and also can provide more information eg. virtual_address to the driver, which might be important in some special cases). This is required for a subsequent fix. And will also make it easier to merge page_mkwrite() with fault() in future. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Chris Mason <chris.mason@oracle.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Felix Blyakher <felixb@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-01 08:59:14 -07:00
Artem Bityutskiy	7d4e9ccb43	UBIFS: fix commentaries Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-03-20 19:11:12 +02:00
Adrian Hunter	f55aa59106	UBIFS: fix bug where page is marked uptodate when out of space UBIFS fast path in write_begin may mark a page up to date and then discover that there may not be enough space to do the write, and so fall back to a slow path. The slow path tries harder, but may still find no space - leaving the page marked up to date, when it is not. This patch ensures that the page is marked not up to date in that case. The bug that this patch fixes becomes evident when the write is into a hole (sparse file) or is at the end of the file and a subsequent read is off the end of the file. In both cases, the file system should return zeros but was instead returning the page that had not been written because the file system was out of space. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-03-14 16:46:33 +02:00
Artem Bityutskiy	84abf972cc	UBIFS: add re-mount debugging checks We observe space corrupted accounting when re-mounting. So add some debbugging checks to catch problems like this. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-01-26 12:54:11 +02:00
Artem Bityutskiy	e8b815663b	UBIFS: constify operations Mark super, file, and inode operation structcutes with 'const'. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-01-18 14:05:08 +02:00
Nick Piggin	54566b2c15	fs: symlink write_begin allocation context fix With the write_begin/write_end aops, page_symlink was broken because it could no longer pass a GFP_NOFS type mask into the point where the allocations happened. They are done in write_begin, which would always assume that the filesystem can be entered from reclaim. This bug could cause filesystem deadlocks. The funny thing with having a gfp_t mask there is that it doesn't really allow the caller to arbitrarily tinker with the context in which it can be called. It couldn't ever be GFP_ATOMIC, for example, because it needs to take the page lock. The only thing any callers care about is __GFP_FS anyway, so turn that into a single flag. Add a new flag for write_begin, AOP_FLAG_NOFS. Filesystems can now act on this flag in their write_begin function. Change __grab_cache_page to accept a nofs argument as well, to honour that flag (while we're there, change the name to grab_cache_page_write_begin which is more instructive and does away with random leading underscores). This is really a more flexible way to go in the end anyway -- if a filesystem happens to want any extra allocations aside from the pagecache ones in ints write_begin function, it may now use GFP_KERNEL (rather than GFP_NOFS) for common case allocations (eg. ocfs2_alloc_write_ctxt, for a random example). [kosaki.motohiro@jp.fujitsu.com: fix ubifs] [kosaki.motohiro@jp.fujitsu.com: fix fuse] Signed-off-by: Nick Piggin <npiggin@suse.de> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> [ Cleaned up the calling convention: just pass in the AOP flags untouched to the grab_cache_page_write_begin() function. That just simplifies everybody, and may even allow future expansion of the logic. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-04 13:33:20 -08:00
Artem Bityutskiy	f92b982680	UBIFS: fix checkpatch.pl warnings These are mostly long lines and wrong indentation warning fixes. But also there are two volatile variables and checkpatch.pl complains about them: WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt + volatile int gc_seq; WARNING: Use of volatile is usually wrong: see Documentation/volatile-considered-harmful.txt + volatile int gced_lnum; Well, we anyway use smp_wmb() for c->gc_seq and c->gced_lnum, so these 'volatile' modifiers can be just dropped. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-31 14:13:25 +02:00
Artem Bityutskiy	7bbe5b5aa6	UBIFS: use PAGE_CACHE_MASK correctly It has high bits set, not low bits set as the UBIFS code assumed. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-12-23 12:19:14 +02:00
Artem Bityutskiy	3477d20465	UBIFS: pre-allocate bulk-read buffer To avoid memory allocation failure during bulk-read, pre-allocate a bulk-read buffer, so that if there is only one bulk-reader at a time, it would just use the pre-allocated buffer and would not do any memory allocation. However, if there are more than 1 bulk- reader, then only one reader would use the pre-allocated buffer, while the other reader would allocate the buffer for itself. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-11-21 18:59:33 +02:00
Artem Bityutskiy	6c0c42cdfd	UBIFS: do not allocate too much Bulk-read allocates 128KiB or more using kmalloc. The allocation starts failing often when the memory gets fragmented. UBIFS still works fine in this case, because it falls-back to standard (non-optimized) read method, though. This patch teaches bulk-read to allocate exactly the amount of memory it needs, instead of allocating 128KiB every time. This patch is also a preparation to the further fix where we'll have a pre-allocated bulk-read buffer as well. For example, now the @bu object is prepared in 'ubifs_bulk_read()', so we could path either pre-allocated or allocated information to 'ubifs_do_bulk_read()' later. Or teaching 'ubifs_do_bulk_read()' not to allocate 'bu->buf' if it is already there. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-11-21 18:59:25 +02:00
Artem Bityutskiy	39ce81ce71	UBIFS: do not print scary memory allocation warnings Bulk-read allocates a lot of memory with 'kmalloc()', and when it is/gets fragmented 'kmalloc()' fails with a scarry warning. But because bulk-read is just an optimization, UBIFS keeps working fine. Supress the warning by passing __GFP_NOWARN option to 'kmalloc()'. This patch also introduces a macro for the magic 128KiB constant. This is just neater. Note, this is not really fixes the problem we had, but just hides the warnings. The further patches fix the problem. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-11-21 18:59:16 +02:00
Harvey Harrison	0ecb9529a4	UBIFS: endian handling fixes and annotations Noticed by sparse: fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer This should be checked to ensure the ubifs_assert is working as intended, I've done the suggested annotation in this patch. fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:298:6: expected int [signed] [assigned] tmp fs/ubifs/sb.c:298:6: got restricted __le64 [usertype] <noident> fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:299:19: expected restricted __le64 [usertype] atime_sec fs/ubifs/sb.c:299:19: got int [signed] [assigned] tmp fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:300:19: expected restricted __le64 [usertype] ctime_sec fs/ubifs/sb.c:300:19: got int [signed] [assigned] tmp fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types) fs/ubifs/sb.c:301:19: expected restricted __le64 [usertype] mtime_sec fs/ubifs/sb.c:301:19: got int [signed] [assigned] tmp This looks like a bugfix as your tmp was a u32 so there was truncation in the atime, mtime, ctime value, probably not intentional, add a tmp_le64 and use it here. fs/ubifs/key.h:348:9: warning: cast to restricted __le32 fs/ubifs/key.h:348:9: warning: cast to restricted __le32 fs/ubifs/key.h:419:9: warning: cast to restricted __le32 Read from the annotated union member instead. fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types) fs/ubifs/recovery.c:175:13: expected unsigned int [unsigned] [usertype] save_flags fs/ubifs/recovery.c:175:13: got restricted __le32 [usertype] flags fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types) fs/ubifs/recovery.c:186:13: expected restricted __le32 [usertype] flags fs/ubifs/recovery.c:186:13: got unsigned int [unsigned] [usertype] save_flags Do byteshifting at compile time of the flag value. Annotate the saved_flags as le32. fs/ubifs/debug.c:368:10: warning: cast to restricted __le32 fs/ubifs/debug.c:368:10: warning: cast from restricted __le64 Should be checked if the truncation was intentional, I've changed the printk to print the full width. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-11-06 11:06:19 +02:00
Adrian Hunter	5c0013c16b	UBIFS: fix bulk-read handling uptodate pages Bulk-read skips uptodate pages but this was putting its array index out and causing it to treat subsequent pages as holes. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-09-30 11:12:59 +03:00
Adrian Hunter	ed382d5898	UBIFS: ensure data read beyond i_size is zeroed out correctly Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-09-30 11:12:57 +03:00
Adrian Hunter	4793e7c5e1	UBIFS: add bulk-read facility Some flash media are capable of reading sequentially at faster rates. UBIFS bulk-read facility is designed to take advantage of that, by reading in one go consecutive data nodes that are also located consecutively in the same LEB. Read speed on Arm platform with OneNAND goes from 17 MiB/s to 19 MiB/s. Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-09-30 11:12:56 +03:00
Artem Bityutskiy	04da11bfcf	UBIFS: fix zero-length truncations Always allow truncations to zero, even if budgeting thinks there is no space. UBIFS reserves some space for deletions anyway. Otherwise, the following happans: 1. create a file, and write as much as possible there, until ENOSPC 2. truncate the file, which fails with ENOSPC, which is not good. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-21 16:48:52 +03:00
Zoltan Sogor	22bc7fa8c5	UBIFS: support splice_write Signed-off-by: Zoltan Sogor <weth@inf.u-szeged.hu> Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:38:43 +03:00
Artem Bityutskiy	dab4b4d2f9	UBIFS: align inode data to eight UBIFS aligns node lengths to 8, so budgeting has to do the same. Well, direntry, inode, and page budgets are already aligned, but not inode data budget (e.g., data in special devices or symlinks). Do this for inode data as well. Also, add corresponding debugging checks. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:35:16 +03:00
Artem Bityutskiy	7d32c2bb14	UBIFS: improve debugging 1. Print inode mode in some of debugging messages 2. Add few more useful assertions Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-08-13 11:20:07 +03:00
Al Viro	3f8206d496	[PATCH] get rid of indirect users of namei.h fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all. Several places in the tree needed direct include. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2008-07-26 20:53:42 -04:00
Artem Bityutskiy	1e51764a3c	UBIFS: add new flash file system This is a new flash file system. See http://www.linux-mtd.infradead.org/doc/ubifs.html Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>	2008-07-15 17:35:15 +03:00

1 2 3

111 Commits