linux

Commit Graph

Author	SHA1	Message	Date
Al Viro	21fc61c73c	don't put symlink bodies in pagecache into highmem kmap() in page_follow_link_light() needed to go - allowing to hold an arbitrary number of kmaps for long is a great way to deadlocking the system. new helper (inode_nohighmem(inode)) needs to be used for pagecache symlinks inodes; done for all in-tree cases. page_follow_link_light() instrumented to yell about anything missed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-12-08 22:41:36 -05:00
Yaowei Bai	a8415e4b13	fs/f2fs/namei.c: remove unnecessary new_valid_dev() check new_valid_dev() always returns 1, so the !new_valid_dev() check is not needed. Remove it. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Changman Lee <cm224.lee@samsung.com> Cc: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-09 15:11:24 -08:00
Chao Yu	a6be014e1d	f2fs: fix error path of ->symlink Now, in ->symlink of f2fs, we kept the fixed invoking order between f2fs_add_link and page_symlink since we should init node info firstly in f2fs_add_link, then such node info can be used in page_symlink. But we didn't fix to release meta info which was done before page_symlink in our error path, so this will leave us corrupt symlink entry in its parent's dentry page. Fix this issue by adding f2fs_unlink in the error path for removing such linking. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-10-22 09:39:24 -07:00
Jaegeuk Kim	569cf1876a	f2fs crypto: allocate buffer for decrypting filename We got dentry pages from high_mem, and its address space directly goes into the decryption path via f2fs_fname_disk_to_usr. But, sg_init_one assumes the address is not from high_mem, so we can get this panic since it doesn't call kmap_high but kunmap_high is triggered at the end. kernel BUG at ../../../../../../kernel/mm/highmem.c:290! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM ... (kunmap_high+0xb0/0xb8) from [<c0114534>] (__kunmap_atomic+0xa0/0xa4) (__kunmap_atomic+0xa0/0xa4) from [<c035f028>] (blkcipher_walk_done+0x128/0x1ec) (blkcipher_walk_done+0x128/0x1ec) from [<c0366c24>] (crypto_cbc_decrypt+0xc0/0x170) (crypto_cbc_decrypt+0xc0/0x170) from [<c0367148>] (crypto_cts_decrypt+0xc0/0x114) (crypto_cts_decrypt+0xc0/0x114) from [<c035ea98>] (async_decrypt+0x40/0x48) (async_decrypt+0x40/0x48) from [<c032ca34>] (f2fs_fname_disk_to_usr+0x124/0x304) (f2fs_fname_disk_to_usr+0x124/0x304) from [<c03056fc>] (f2fs_fill_dentries+0xac/0x188) (f2fs_fill_dentries+0xac/0x188) from [<c03059c8>] (f2fs_readdir+0x1f0/0x300) (f2fs_readdir+0x1f0/0x300) from [<c0218054>] (vfs_readdir+0x90/0xb4) (vfs_readdir+0x90/0xb4) from [<c0218418>] (SyS_getdents64+0x64/0xcc) (SyS_getdents64+0x64/0xcc) from [<c0105ba0>] (ret_fast_syscall+0x0/0x30) Cc: <stable@vger.kernel.org> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-10-09 16:20:51 -07:00
Jaegeuk Kim	a21c20f0c8	f2fs: go out for insert_inode_locked failure We should not call unlock_new_inode when insert_inode_locked failed. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-08-20 09:00:13 -07:00
Chao Yu	d5e8f6c980	f2fs: stat inline xattr inode number This patch adds to stat the number of inline xattr inode for showing in debugfs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-08-05 08:08:05 -07:00
Chao Yu	741a7bea79	f2fs: restrict multimedia filename When testing with fs_mark, some blocks were written out as cold data which were mixed with warm data, resulting in splitting more bios. This is because fs_mark will create file with random filename as below: 559551ee~~~~~~~~15Z29OCC05JCKQP60JQ42MKV 559551ee~~~~~~~~NZAZ6X8OA8LHIIP6XD0L58RM 559551ef~~~~~~~~B15YDSWAK789HPSDZKYTW6WM 559551f1~~~~~~~~2DAE5DPS79785BUNTFWBEMP3 559551f1~~~~~~~~1MYDY0BKSQCJPI32Q8C514RM 559551f1~~~~~~~~YQOTMAOMN5CVRFOUNI026MP4 559551f3~~~~~~~~1WF42LPRTQJNPPGR3EINKMPE 559551f3~~~~~~~~8Y2NRK7CEPPAA02LY936PJPG They are regarded as cold file since their filename are ended with multimedia files' extension, but this should be wrong as we only match the extension of filename, not the whole one. In this patch, we try to fix the format of multimedia filename to: "filename + '.' + extension", then we set cold file only its filename matches the format. So after this change, it will reduce the probability we set the wrong cold file, also it helps a little for fs_mark's performance on f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-08-04 14:09:57 -07:00
Jaegeuk Kim	3e72f72139	f2fs: use extent_cache by default We don't need to handle the duplicate extent information. The integrated rule is: - update on-disk extent with largest one tracked by in-memory extent_cache - destroy extent_tree for the truncation case - drop per-inode extent_cache by shrinker Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-08-04 14:09:56 -07:00
Jaegeuk Kim	c9b63bd01d	f2fs: avoid to use failed inode immediately Before iput is called, the inode number used by a bad inode can be reassigned to other new inode, resulting in any abnormal behaviors on the new inode. This should not happen for the new inode. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-08-04 14:09:53 -07:00
Linus Torvalds	cfcc0ad47f	Merge tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "New features: - per-file encryption (e.g., ext4) - FALLOC_FL_ZERO_RANGE - FALLOC_FL_COLLAPSE_RANGE - RENAME_WHITEOUT Major enhancement/fixes: - recovery broken superblocks - enhance f2fs_trim_fs with a discard_map - fix a race condition on dentry block allocation - fix a deadlock during summary operation - fix a missing fiemap result .. and many minor bug fixes and clean-ups were done" * tag 'for-f2fs-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (83 commits) f2fs: do not trim preallocated blocks when truncating after i_size f2fs crypto: add alloc_bounce_page f2fs crypto: fix to handle errors likewise ext4 f2fs: drop the volatile_write flag only f2fs: skip committing valid superblock f2fs: setting discard option in parse_options() f2fs: fix to return exact trimmed size f2fs: support FALLOC_FL_INSERT_RANGE f2fs: hide common code in f2fs_replace_block f2fs: disable the discard option when device doesn't support f2fs crypto: remove alloc_page for bounce_page f2fs: fix a deadlock for summary page lock vs. sentry_lock f2fs crypto: clean up error handling in f2fs_fname_setup_filename f2fs crypto: avoid f2fs_inherit_context for symlink f2fs crypto: do not set encryption policy for non-directory by ioctl f2fs crypto: allow setting encryption policy once f2fs crypto: check context consistent for rename2 f2fs: avoid duplicated code by reusing f2fs_read_end_io f2fs crypto: use per-inode tfm structure f2fs: recovering broken superblock during mount ...	2015-06-24 20:38:29 -07:00
Jaegeuk Kim	e992e238ff	f2fs crypto: avoid f2fs_inherit_context for symlink This patch fixes to call f2fs_inherit_context twice for newly created symlink. The original one is called by f2fs_add_link(), which invokes f2fs_setxattr. If the second one is called again, f2fs_setxattr is triggered again with same encryption index. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-06-01 16:21:07 -07:00
Chao Yu	d3baf7c472	f2fs crypto: check context consistent for rename2 For exchange rename, we should check context consistent of encryption between new_dir and old_inode or old_dir and new_inode. Otherwise inheritance of parent's encryption context will be broken. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: sync with ext4 approach] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-06-01 16:21:05 -07:00
Jaegeuk Kim	26bf3dc7e2	f2fs crypto: use per-inode tfm structure This patch applies the following ext4 patch: ext4 crypto: use per-inode tfm structure As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel. Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-06-01 16:21:04 -07:00
Jaegeuk Kim	304eecc346	f2fs crypto: check encryption for tmpfile This patch adds to check encryption for tmpfile in early stage. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-06-01 16:21:02 -07:00
Chao Yu	7e01e7ad74	f2fs: support RENAME_WHITEOUT As described in the rename manual page, RENAME_WHITEOUT is a special operation that only makes sense for overlay/union type filesystems. When performing a rename with RENAME_WHITEOUT, dst will be replaced with src, and meanwhile a 'whiteout' will be created with the name of src. A "whiteout" is designed to be a char device with a 0,0 device number; it has a special meaning for stackable filesystems. In these filesystems, multiple layers exist, and only the top one can be modified. So a whiteout in the top layer is used to hide a corresponding file in a lower layer, and removing the whiteout makes the file appear again. Now in overlayfs, when we rename a file that exists in a lower layer, it is copied up to the upper layer if it is not there yet and then renamed on the upper layer; at the same time, the source file is whited out to hide the corresponding file in the lower layer. So in the upper layer filesystem, the implementation of RENAME_WHITEOUT provides an atomic operation for stackable filesystems to support the rename operation. There are multiple ways to implement RENAME_WHITEOUT, described in the log of this commit: `7dcf5c3e45` ("xfs: add RENAME_WHITEOUT support"), which was pointed out by Dave Chinner. For now, we just try to follow the way that xfs/ext4 use. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-06-01 16:21:01 -07:00
Jaegeuk Kim	d690358b2b	f2fs crypto: remove checking key context during lookup No matter what the key is valid or not, readdir shows the dir entries correctly. So, lookup should not failed. But, we expect further accesses should be denied from open, rename, link, and so on. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:57 -07:00
Jaegeuk Kim	cbaf042a3c	f2fs crypto: add symlink encryption This patch implements encryption support for symlink. Signed-off-by: Uday Savagaonkar <savagaon@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:55 -07:00
Jaegeuk Kim	e7d5545285	f2fs crypto: add filename encryption for roll-forward recovery This patch adds a bit flag to indicate whether or not i_name in the inode is encrypted. If this name is encrypted, we can't do recover_dentry during roll-forward. So, f2fs_sync_file() needs to do checkpoint, if this will be needed in future. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:55 -07:00
Jaegeuk Kim	fcc85a4d86	f2fs crypto: activate encryption support for fs APIs This patch activates the following APIs for encryption support. The rules quoted by ext4 are: - An unencrypted directory may contain encrypted or unencrypted files or directories. - All files or directories in a directory must be protected using the same key as their containing directory. - Encrypted inode for regular file should not have inline_data. - Encrypted symlink and directory may have inline_data and inline_dentry. This patch activates the following APIs. 1. f2fs_link : validate context 2. f2fs_lookup : '' 3. f2fs_rename : '' 4. f2fs_create/f2fs_mkdir : inherit its dir's context 5. f2fs_direct_IO : do buffered io for regular files 6. f2fs_open : check encryption info 7. f2fs_file_mmap : '' 8. f2fs_setattr : '' 9. f2fs_file_write_iter : '' (Called by sys_io_submit) 10. f2fs_fallocate : do not support fcollapse 11. f2fs_evict_inode : free_encryption_info Signed-off-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:51 -07:00
Jaegeuk Kim	2fb2c95496	f2fs: fix counting the number of inline_data inodes This patch fixes to count the missing symlink case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:36 -07:00
Jaegeuk Kim	01b960e94a	f2fs: add f2fs_may_inline_{data, dentry} This patch adds f2fs_may_inline_data and f2fs_may_inline_dentry. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:32 -07:00
Jaegeuk Kim	06957e8fe6	f2fs: clean up f2fs_lookup This patch cleans up to avoid deep indentation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-28 15:41:31 -07:00
Al Viro	5f2c4179e1	switch ->put_link() from dentry to inode only one instance looks at that argument at all; that sole exception wants inode rather than dentry. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-11 08:13:12 -04:00
Al Viro	6e77137b36	don't pass nameidata to ->follow_link() its only use is getting passed to nd_jump_link(), which can obtain it from current->nameidata Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:20:15 -04:00
Al Viro	680baacbca	new ->follow_link() and ->put_link() calling conventions a) instead of storing the symlink body (via nd_set_link()) and returning an opaque pointer later passed to ->put_link(), ->follow_link() _stores_ that opaque pointer (into void * passed by address by caller) and returns the symlink body. Returning ERR_PTR() on error, NULL on jump (procfs magic symlinks) and pointer to symlink body for normal symlinks. Stored pointer is ignored in all cases except the last one. Storing NULL for opaque pointer (or not storing it at all) means no call of ->put_link(). b) the body used to be passed to ->put_link() implicitly (via nameidata). Now only the opaque pointer is. In the cases when we used the symlink body to free stuff, ->follow_link() now should store it as opaque pointer in addition to returning it. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-05-10 22:19:45 -04:00
Jaegeuk Kim	7263b1bd04	f2fs: fix wrong error hanlder in f2fs_follow_link page_follow_link_light() returns NULL, and its error pointer remained in nd->path. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-05-04 14:15:16 -07:00
Linus Torvalds	9ec3a646fe	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only	2015-04-26 17:22:07 -07:00
Jaegeuk Kim	feb7cbb079	f2fs: avoid abnormal behavior on broken symlink When f2fs_symlink was triggered and checkpoint was done before syncing its link path, f2fs can get broken symlink like "xxx -> \0\0\0". This incurs abnormal path_walk by VFS. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-16 09:45:40 -07:00
Jaegeuk Kim	d0cae97cb6	f2fs: flush symlink path to avoid broken symlink after POR This patch tries to avoid broken symlink case after POR in best effort. This results in performance regression. But, if f2fs has inline_data and the target path is under 3KB-sized long, the page would be stored in its inode_block, so that there would be no performance regression. Note that, if user wants to keep this file atomically, it needs to trigger dir->fsync. And, there is still a hole to produce broken symlink. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-16 09:45:35 -07:00
David Howells	2b0143b5c9	VFS: normal filesystems (and lustre): d_inode() annotations that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-04-15 15:06:57 -04:00
Jaegeuk Kim	510022a858	f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries If f2fs was corrupted with missing dot dentries, it needs to recover them after fsck.f2fs detection. The underlying procedure is: 1. fsck.f2fs leaves the F2FS_INLINE_DOTS flag in the directory inode if it detects missing dot dentries. 2. When f2fs looks up the corrupted directory, it triggers f2fs_add_link with the proper inode numbers and the dot and dotdot names. 3. Once f2fs recovers the directory without errors, it finally removes F2FS_INLINE_DOTS. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-04-10 15:08:57 -07:00
Chao Yu	3c0d84d6f1	f2fs: fix incorrectly stat number of inline data inode We should stat inline data information for temp file in f2fs_tmpfile if we enable inline_data feature. Otherwise, inline data stat number will be wrong after this temp file is evicted. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-03-03 09:58:45 -08:00
Chao Yu	560d4672e2	f2fs: fix to use highmem for pages of newly created directory In commit `a78186ebe5` ("f2fs: use highmem for directory pages"), we have set __GFP_HIGHMEM into dir mapping's gfp flag in f2fs_iget, so high address memory could be used for these existing dir's page. But we forgot to set flag for newly created dir, due to this reason, our newly created dir pages could not be allocated from high address memory. Fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2015-02-11 17:04:45 -08:00
Jaegeuk Kim	9486ba442b	f2fs: introduce f2fs_dentry_kunmap to clean up This patch introduces f2fs_dentry_kunmap to clean up dirty codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-23 21:51:53 -08:00
Jaegeuk Kim	b7e1d80003	f2fs: implement -o dirsync If a mount option has dirsync, we should call checkpoint for all the directory operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-10 06:51:39 -08:00
Jaegeuk Kim	b3d208f96d	f2fs: revisit inline_data to avoid data races and potential bugs This patch simplifies the inline_data usage with the following rule. 1. inline_data is set during the file creation. 2. If new data is requested to be written ranges out of inline_data, f2fs converts that inode permanently. 3. There is no cases which converts non-inline_data inode to inline_data. 4. The inline_data flag should be changed under inode page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-04 17:34:11 -08:00
Jaegeuk Kim	e7a2bf2283	f2fs: fix counting inline_data inode numbers This patch fixes wrongly counting inline_data inode numbers. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Jaegeuk Kim	3289c061c5	f2fs: add stat info for inline_dentry inodes This patch adds status information for inline_dentry inodes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Chao Yu	622f28ae9b	f2fs: enable inline dir handling Add inline dir functions into normal dir ops' function to handle inline ops. Besides, we enable inline dir mode when a new dir inode is created if inline_data option is on. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:32 -08:00
Chao Yu	dbeacf02eb	f2fs: export dir operations for inline dir This patch exports some dir operations for inline dir, additionally introduces f2fs_drop_nlink from f2fs_delete_entry for reusing by inline dir function. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Jaegeuk Kim	44c1615651	f2fs: call f2fs_unlock_op after error was handled This patch relocates f2fs_unlock_op in every directory operations to be called after any error was processed. Otherwise, the checkpoint can be entered with valid node ids without its dentry when -ENOSPC is occurred. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:34:55 -07:00
Jaegeuk Kim	4081363fbe	f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SB This patch adds three inline functions to clean up dirty casting codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-03 17:37:13 -07:00
Chao Yu	b73e52824c	f2fs: reposition unlock_new_inode to prevent accessing invalid inode As the race condition on the inode cache, following scenario can appear: [Thread a] [Thread b] ->f2fs_mkdir ->f2fs_add_link ->__f2fs_add_link ->init_inode_metadata failed here ->gc_thread_func ->f2fs_gc ->do_garbage_collect ->gc_data_segment ->f2fs_iget ->iget_locked ->wait_on_inode ->unlock_new_inode ->move_data_page ->make_bad_inode ->iput When we fail in create/symlink/mkdir/mknod/tmpfile, the new allocated inode should be set as bad to avoid being accessed by other thread. But in above scenario, it allows f2fs to access the invalid inode before this inode was set as bad. This patch fix the potential problem, and this issue was found by code review. change log from v1: o Add condition judgment in gc_data_segment() suggested by Changman Lee. o use iget_failed to simplify code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-02 00:22:24 -07:00
Jaegeuk Kim	04859dba50	f2fs: remove rename and use rename2 Refer the following patch. commit `7177a9c4b5` Author: Miklos Szeredi <mszeredi@suse.cz> Date: Wed Jul 23 15:15:30 2014 +0200 fs: call rename2 if exists Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:04 -07:00
arter97	e1c4204520	f2fs: fix typo Fix typo and some grammatical errors. The words "filesystem" and "readahead" are being used without the space treewide. Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-19 10:01:33 -07:00
Chao Yu	32f9bc25cb	f2fs: support ->rename2() Now new interface ->rename2() is added to VFS, here are related description: https://lkml.org/lkml/2014/2/7/873 https://lkml.org/lkml/2014/2/7/758 This patch adds function f2fs_rename2() to support ->rename2() including handling both RENAME_EXCHANGE and RENAME_NOREPLACE flag. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-25 08:14:08 -07:00
Chao Yu	1256010ab1	f2fs: reduce region of f2fs_lock_op covered for better concurrency In our rename process, region of f2fs_lock_op covered is too big as some of the code like f2fs_empty_dir/f2fs_find_entry are not needed to protect by this lock. So in the extreme case like doing checkpoint when we rename old inode to exist inode in a large directory could cause lower concurrency. Let's reduce the region of f2fs_lock_op to fix this. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 14:04:25 -07:00
Jaegeuk Kim	a014e037be	f2fs: clean up an unused parameter and assignment This patch cleans up simple unnecessary codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 14:04:25 -07:00
Jaegeuk Kim	b97a9b5da8	f2fs: introduce f2fs_do_tmpfile for code consistency This patch adds f2fs_do_tmpfile to eliminate the redundant init_inode_metadata flow. Through this, we can provide consistent lock usage, e.g., fi->i_sem, and this will enable better debugging. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 14:04:24 -07:00
Chao Yu	50732df02e	f2fs: support ->tmpfile() Add function f2fs_tmpfile() to support O_TMPFILE file creation, and modify logic of init_inode_metadata to enable linkat temp file. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 14:04:24 -07:00
Jaegeuk Kim	b2c0829912	f2fs: do checkpoint for the renamed inode If an inode is renamed, it should be registered as file_lost_pino to conduct checkpoint at f2fs_sync_file. Otherwise, the inode cannot be recovered due to no dent_mark in the following scenario. Note that, this scenario is from xfstests/322. 1. create "a" 2. fsync "a" 3. rename "a" to "b" 4. fsync "b" 5. Sudden power-cut After recovery is done, "b" should be seen. However, the result shows "a", since the recovery procedure does not enter recover_dentry due to no dent_mark. The reason is like below. - The nid of "a" is checkpointed during #2, f2fs_sync_file. - The inode page for "b" produced by #3 is written without dent_mark by sync_node_pages. So, this patch fixes this bug by assigning file_lost_pino to "a"'s inode. If the pino is lost, f2fs_sync_file conducts checkpoint, and then recovers the latest pino and its dentry information for further recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 05:59:31 -07:00
Chao Yu	dd4d961fe7	f2fs: release new entry page correctly in error path of f2fs_rename This patch corrects the release code for new_page to avoid a BUG_ON in the error path of f2fs_rename. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-07-09 05:59:11 -07:00
Chao Yu	70ff5dfeb6	f2fs: use inode_init_owner() to simplify codes This patch uses exported inode_init_owner() to simplify codes in f2fs_new_inode(). Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-05-08 18:23:21 +09:00
Chao Yu	48b230a583	f2fs: fix wrong statistics of inline data If we remove a file that has inline data after mount, our statistics become inaccurate. cat /sys/kernel/debug/f2fs/status - Inline_data Inode: 4294967295 Let's add stat_inc_inline_inode() to stat inline info of the file during lookup. Change log from v1: o stat in f2fs_lookup() instead of in do_read_inode() for excluding wrong stat. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-04-07 12:40:58 +09:00
Jaegeuk Kim	d928bfbfe7	f2fs: introduce fi->i_sem to protect fi's info This patch introduces fi->i_sem to protect fi's info that includes xattr_ver, pino, i_nlink. This enables to remove i_mutex during f2fs_sync_file, resulting in performance improvement when a number of fsync calls are triggered from many concurrent threads. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-03-20 22:10:11 +09:00
Linus Torvalds	bf3d846b78	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "Assorted stuff; the biggest pile here is Christoph's ACL series. Plus assorted cleanups and fixes all over the place... There will be another pile later this week" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (43 commits) __dentry_path() fixes vfs: Remove second variable named error in __dentry_path vfs: Is mounted should be testing mnt_ns for NULL or error. Fix race when checking i_size on direct i/o read hfsplus: remove can_set_xattr nfsd: use get_acl and ->set_acl fs: remove generic_acl nfs: use generic posix ACL infrastructure for v3 Posix ACLs gfs2: use generic posix ACL infrastructure jfs: use generic posix ACL infrastructure xfs: use generic posix ACL infrastructure reiserfs: use generic posix ACL infrastructure ocfs2: use generic posix ACL infrastructure jffs2: use generic posix ACL infrastructure hfsplus: use generic posix ACL infrastructure f2fs: use generic posix ACL infrastructure ext2/3/4: use generic posix ACL infrastructure btrfs: use generic posix ACL infrastructure fs: make posix_acl_create more useful fs: make posix_acl_chmod more useful ...	2014-01-28 08:38:04 -08:00
Christoph Hellwig	a6dda0e63e	f2fs: use generic posix ACL infrastructure f2fs has some weird mode bit handling, so still using the old chmod code for now. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-01-25 23:58:19 -05:00
Jaegeuk Kim	a18ff06340	f2fs: call mark_inode_dirty to flush dirty pages If a dentry page is updated, we should call mark_inode_dirty to add the inode into the dirty list, so that its dentry pages are flushed to the disk. Otherwise, the inode can be evicted without flush. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2014-01-22 18:40:34 +09:00
Chao Yu	deead09009	f2fs: avoid to set wrong pino of inode when rename dir When we rename a dir to a new name that did not previously exist, we set the pino of the parent inode to the ino of the child inode in f2fs_set_link. This destroys the consistency of pino and should be fixed. Thanks to Shu Tan for the previous work. Signed-off-by: Shu Tan <shu.tan@samsung.com> Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-12-23 10:42:51 +09:00
Jaegeuk Kim	ccaaca2591	f2fs: fix writing incorrect orphan blocks Previously, there was a erroneous scenario like below. thread 1: thread 2: f2fs_unlink - acquire_orphan_inode : sbi->n_orphans++ write_checkpoint - block_operations : f2fs_lock_all - do_checkpoint : write orphan blocks with sbi->n_orphans - unblock_operations - f2fs_lock_op - release_orphan_inode - f2fs_unlock_op During the checkpoint by thread 2, f2fs stores a wrong orphan block according to the wrong sbi->n_orphans. To avoid this, simply we should make cover acquire_orphan_inode too with f2fs_lock_op. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-10-08 10:19:28 +09:00
Gu Zheng	e479556bfd	f2fs: use rw_sem instead of fs_lock(locks mutex) The fs_locks is used to block other ops(ex, recovery) when doing checkpoint. And each other operate routine(besides checkpoint) needs to acquire a fs_lock, there is a terrible problem here, if these are too many concurrency threads acquiring fs_lock, so that they will block each other and may lead to some performance problem, but this is not the phenomenon we want to see. Though there are some optimization patches introduced to enhance the usage of fs_lock, but the thorough solution is using a rw_sem to replace the fs_lock. Checkpoint routine takes write_sem, and other ops take read_sem, so that we can block other ops(ex, recovery) when doing checkpoint, and other ops will not disturb each other, this can avoid the problem described above completely. Because of the weakness of rw_sem, the above change may introduce a potential problem that the checkpoint thread might get starved if other threads are intensively locking the read semaphore for I/O.(Pointed out by Xu Jin) In order to avoid this, a wait_list is introduced, the appending read semaphore ops will be dropped into the wait_list if checkpoint thread is waiting for write semaphore, and will be waked up when checkpoint thread gives up write semaphore. Thanks to Kim's previous review and test, and will be very glad to see other guys' performance tests about this patch. V2: -fix the potential starvation problem. -use more suitable func name suggested by Xu Jin. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> [Jaegeuk Kim: adjust minor coding standard] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-10-07 11:33:05 +09:00
Gu Zheng	749ebfd174	f2fs: use strncasecmp() simplify the string comparison Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-27 21:50:12 +09:00
Jaegeuk Kim	8cb8268809	f2fs: fix omitting to update inode page The f2fs_set_link updates its parent inode number, so we should sync this to the inode block. Otherwise, the data can be lost after sudden-power-off. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-08-27 21:49:04 +09:00
Jaegeuk Kim	cbd56e7d20	f2fs: fix handling orphan inodes This patch fixes mishandling of the sbi->n_orphans variable. If users request lots of f2fs_unlink(), check_orphan_space() could be contended. In such the case, sbi->n_orphans can be read incorrectly so that f2fs_unlink() would fall into the wrong state which results in the failure of add_orphan_inode(). So, let's increment sbi->n_orphans virtually prior to the actual orphan inode stuffs. After that, let's release sbi->n_orphans by calling release_orphan_inode or remove_orphan_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Jaegeuk Kim	1cd14cafc6	f2fs: update file name in the inode block during f2fs_rename The error is reproducible by: 0. mkfs.f2fs /dev/sdb1 & mount 1. touch test1 2. touch test2 3. mv test1 test2 4. umount 5. dump.f2fs -i 4 /dev/sdb1 After this, when we retrieve the inode->i_name of test2 by dump.f2fs, we get test1 instead of test2. This is because f2fs didn't update the file name during the f2fs_rename. So, this patch fixes that. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-07-30 15:17:03 +09:00
Jaegeuk Kim	354a3399dc	f2fs: recover wrong pino after checkpoint during fsync If a file is linked, f2fs loses its parent inode number so that fsync calls for the linked file should do checkpoint all the time. But, if we can recover its parent inode number after the checkpoint, we can adjust roll-forward mechanism for the further fsync calls, which is able to improve the fsync performance significantly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-14 09:04:45 +09:00
Jaegeuk Kim	2d4d9fb591	f2fs: fix i_blocks translation on various types of files Basically an inode manages the number of allocated blocks with inode->i_blocks which is represented in a unit of sectors, not file system blocks. But, f2fs has used i_blocks in a unit of file system blocks, and f2fs_getattr translates it to the number of sectors when fstat is called. However, previously f2fs_file_inode_operations only has this, so this patch adds it to all the types of inode_operations. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-06-11 16:01:09 +09:00
Jaegeuk Kim	83d5d6f66b	f2fs: cover cp_file information with ilock If a file is linked with other files, it should be checkpointed at every fsync calls. For this, we use set_cp_file() with FADVISE_CP_BIT, but previously we didn't cover the flag by the global lock. This patch fixes that the inode page stores this correctly. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:06 +09:00
Jaegeuk Kim	6f6fd833e1	f2fs: use ihold Use the following helper function committed by Al. commit `7de9c6ee3e` Author: Al Viro <viro@zeniv.linux.org.uk> Date: Sat Oct 23 11:11:40 2010 -0400 new helper: ihold() ... Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:04 +09:00
Jaegeuk Kim	93ff10d690	f2fs: should not make_bad_inode on f2fs_link failure If -ENOSPC is met during f2fs_link, we should not make the inode as bad. The inode is still alive. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:04 +09:00
Jaegeuk Kim	0a364af18f	f2fs: remove unnecessary por_doing check This por_doing check is totally not related to the recovery process. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-28 15:03:01 +09:00
Jaegeuk Kim	531ad7d58c	f2fs: avoid deadlock during evict after f2fs_gc o Deadlock case #1 Thread 1: - writeback_sb_inodes - do_writepages - f2fs_write_data_pages - write_cache_pages - f2fs_write_data_page - f2fs_balance_fs - wait mutex_lock(gc_mutex) Thread 2: - f2fs_balance_fs - mutex_lock(gc_mutex) - f2fs_gc - f2fs_iget - wait iget_locked(inode->i_lock) Thread 3: - do_unlinkat - iput - lock(inode->i_lock) - evict - inode_wait_for_writeback o Deadlock case #2 Thread 1: - __writeback_single_inode : set I_SYNC - do_writepages - f2fs_write_data_page - f2fs_balance_fs - f2fs_gc - iput - evict - inode_wait_for_writeback(I_SYNC) In order to avoid this, even though iput is called with the zero-reference count, we need to stop the eviction procedure if the inode is on writeback. So this patch links f2fs_drop_inode which checks the I_SYNC flag. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-05-08 19:54:08 +09:00
Jaegeuk Kim	d70b4f53b9	f2fs: add a tracepoint on f2fs_new_inode This can help when debugging the free nid allocation flows. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-29 10:52:01 +09:00
Namjae Jeon	a2a4a7e4ab	f2fs: add tracepoints for sync & inode operations Add tracepoints in f2fs for tracing the syncing operations like filesystem sync, file sync enter/exit. It will help to trace the code under debugging scenarios. Also add tracepoints for tracing the various inode operations like building inode, eviction of inode, link/unlink of inodes. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: combine and modify the tracepoint structures] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-23 15:30:27 +09:00
Namjae Jeon	e66509f03e	f2fs: make is_multimedia_file code align with its name The code conditions put inside the function is_multimedia_file are reverse to the name i.e, we need to negate the return to actually check if the file is a multimedia file. So, change the code and usage path to align both the name and comparison conditions. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-23 08:56:21 +09:00
Jaegeuk Kim	399368372e	f2fs: introduce a new global lock scheme In the previous version, f2fs uses global locks according to the usage types, such as directory operations, block allocation, block write, and so on. Reference the following lock types in f2fs.h. enum lock_type { RENAME, /* for renaming operations */ DENTRY_OPS, /* for directory operations */ DATA_WRITE, /* for data write */ DATA_NEW, /* for data allocation */ DATA_TRUNC, /* for data truncate */ NODE_NEW, /* for node allocation */ NODE_TRUNC, /* for node truncate */ NODE_WRITE, /* for node write */ NR_LOCK_TYPE, }; In that case, we lose the performance under the multi-threading environment, since every types of operations must be conducted one at a time. In order to address the problem, let's share the locks globally with a mutex array regardless of any types. So, let users grab a mutex and perform their jobs in parallel as much as possible. For this, I propose a new global lock scheme as follows. 0. Data structure - f2fs_sb_info -> mutex_lock[NR_GLOBAL_LOCKS] - f2fs_sb_info -> node_write 1. mutex_lock_op(sbi) - try to get an available lock from the array. - returns the index of the gotten lock variable. 2. mutex_unlock_op(sbi, index of the lock) - unlock the given index of the lock. 3. mutex_lock_all(sbi) - grab all the locks in the array before the checkpoint. 4. mutex_unlock_all(sbi) - release all the locks in the array after checkpoint. 5. block_operations() - call mutex_lock_all() - sync_dirty_dir_inodes() - grab node_write - sync_node_pages() Note that, the pairs of mutex_lock_op()/mutex_unlock_op() and mutex_lock_all()/mutex_unlock_all() should be used together. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-04-09 18:21:18 +09:00
Jaegeuk Kim	953a3e27e1	f2fs: fix to give correct parent inode number for roll forward When we recover fsync'ed data after power-off-recovery, we should guarantee that any parent inode number should be correct for each direct inode blocks. So, let's make the following rules. - The fsync should do checkpoint to all the inodes that were experienced hard links. - So, the only normal files can be recovered by roll-forward. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-27 09:16:25 +09:00
Jaegeuk Kim	5a20d339c7	f2fs: align f2fs maximum name length to linux based filesystem The maximum filename length supported in linux is 255 characters. So let's follow that. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2013-03-18 21:00:35 +09:00
Leon Romanovsky	9836b8b949	f2fs: unify string length declarations and usage This patch is intended to unify string length declarations and usage. There are number of calls to strlen which return size_t object. The size of this object depends on compiler if it will be bigger, equal or even smaller than an unsigned int Signed-off-by: Leon Romanovsky <leon@leon.nu> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-28 11:27:53 +09:00
Jaegeuk Kim	1efef83202	f2fs: do f2fs_balance_fs in front of dir operations In order to conserve free sections to deal with the worst-case scenarios, f2fs should be able to freeze all the directory operations especially when there are not enough free sections. The f2fs_balance_fs() is for this use. When FS utilization becomes almost 100%, directory operations can be failed due to -ENOSPC frequently, which produces some dirty node pages occasionally. Previously, in such a case, f2fs_balance_fs() is not able to be triggered since it is triggered only if the directory operation ends up with success. So, this patch triggers f2fs_balance_fs() at first before handling directory operations. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-26 10:39:52 +09:00
Namjae Jeon	a0d42539e1	f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask Since, GFP_NOFS and __GFP_ZERO is being used to set gfp_mask. We can instead make use of already predefined macro GFP_F2FS_ZERO. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>	2012-12-11 13:43:44 +09:00
Namjae Jeon	61412b64b9	f2fs: move error condition for mkdir at proper place In function f2fs_mkdir, err is being initialized without even checking if there was any error in new inode creation. So, instead check the inode error and make use of error/return condition. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>	2012-12-11 13:43:44 +09:00
Jaegeuk Kim	0a8165d7c2	f2fs: adjust kernel coding style As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment blocks. Instead, just use "/*". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:42 +09:00
Jaegeuk Kim	57397d86c6	f2fs: add inode operations for special inodes This adds inode operations for directory, symlink, and special inodes. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>	2012-12-11 13:43:41 +09:00

134 Commits