linux_old1

Commit Graph

Author	SHA1	Message	Date
Jaegeuk Kim	427a45c8e2	f2fs: flush_dcache_page for inline data When reading inline data, we should call flush_dcache_page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:37 -08:00
Jaegeuk Kim	ca4b02eeed	f2fs: call write_checkpoint under disabled gc During write_checkpoint, we should avoid triggering f2fs_gc to prevent any filesystem inconsistency. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:37 -08:00
Jan Kara	9234f3190b	f2fs: fix possible data corruption in f2fs_write_begin() f2fs_write_begin() doesn't initialize the 'dn' variable if the inode has inline data. However it uses its contents to decide whether it should just zero out the page or load data to it. Thus if we are unlucky we can zero out page contents instead of loading inline data into a page. CC: stable@vger.kernel.org CC: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:37 -08:00
Gu Zheng	2cc2218611	f2fs: use current_sit_addr to replace the open code Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:37 -08:00
Gu Zheng	52aca07425	f2fs: rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bit Rename f2fs_set/clear_bit to f2fs_test_and_set/clear_bit, which mean set/clear bit and return the old value, for better readability. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:36 -08:00
Gu Zheng	1730663cb7	f2fs: set raw_super default to NULL to avoid compile warning Set raw_super default to NULL to avoid the possibly used uninitialized warning, though we may never hit it in fact. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:36 -08:00
Gu Zheng	c6ac4c0ec4	f2fs: introduce f2fs_change_bit to simplify the change bit logic Introduce f2fs_change_bit to simplify the change bit logic in function set_to_next_nat{sit}. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:36 -08:00
Gu Zheng	fa528722d0	f2fs: remove the redundant function cond_clear_inode_flag Use clear_inode_flag to replace the redundant cond_clear_inode_flag. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:36 -08:00
Gu Zheng	8a2d0ace3a	f2fs: remove the seems unneeded argument 'type' from __get_victim Remove the unneeded argument 'type' from __get_victim, use NO_CHECK_TYPE directly when calling v_ops->get_victim(). Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:35 -08:00
Jan Kara	9bd27ae4aa	f2fs: avoid returning uninitialized value to userspace from f2fs_trim_fs() If user specifies too low end sector for trimming, f2fs_trim_fs() will use uninitialized value as a number of trimmed blocks and returns it to userspace. Initialize number of trimmed blocks early to avoid the problem. Coverity-id: 1248809 CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:35 -08:00
Jaegeuk Kim	d64948a4df	f2fs: declare f2fs_convert_inline_dir as a static function This patch declares f2fs_convert_inline_dir as a static function, which was reported by kbuild test robot. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:35 -08:00
Jaegeuk Kim	f1e33a041e	f2fs: use kmap_atomic instead of kmap For better performance, we need to use kmap_atomic instead of kmap. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:35 -08:00
Jaegeuk Kim	062a3e7ba7	f2fs: reuse make_empty_dir code for inline_dentry This patch introduces do_make_empty_dir to mitigate code redundancy for inline_dentry. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:34 -08:00
Jaegeuk Kim	7b3cd7d6f0	f2fs: introduce f2fs_dentry_ptr structure for code clean-up This patch introduces f2fs_dentry_ptr structure for the use of a function parameter in inline_dentry operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:34 -08:00
Jaegeuk Kim	5ab18570b8	f2fs: should not truncate any inline_dentry If the inode has inline_dentry, we should not truncate any block indices. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:34 -08:00
Jaegeuk Kim	38594de767	f2fs: reuse core function in f2fs_readdir for inline_dentry This patch introduces a core function, f2fs_fill_dentries, to remove redundant code in f2fs_readdir and f2fs_read_inline_dir. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:34 -08:00
Jaegeuk Kim	e7a2bf2283	f2fs: fix counting inline_data inode numbers This patch fixes wrongly counting inline_data inode numbers. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Jaegeuk Kim	3289c061c5	f2fs: add stat info for inline_dentry inodes This patch adds status information for inline_dentry inodes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Jaegeuk Kim	bce8d11207	f2fs: avoid deadlock on init_inode_metadata Previously, init_inode_metadata does not hold any parent directory's inode page. So, f2fs_init_acl can grab its parent inode page without any problem. But, when we use inline_dentry, that page is grabbed during f2fs_add_link, so that we can fall into deadlock condition like below. INFO: task mknod:11006 blocked for more than 120 seconds. Tainted: G OE 3.17.0-rc1+ #13 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. mknod D ffff88003fc94580 0 11006 11004 0x00000000 ffff880007717b10 0000000000000002 ffff88003c323220 ffff880007717fd8 0000000000014580 0000000000014580 ffff88003daecb30 ffff88003c323220 ffff88003fc94e80 ffff88003ffbb4e8 ffff880007717ba0 0000000000000002 Call Trace: [<ffffffff8173dc40>] ? bit_wait+0x50/0x50 [<ffffffff8173d4cd>] io_schedule+0x9d/0x130 [<ffffffff8173dc6c>] bit_wait_io+0x2c/0x50 [<ffffffff8173da3b>] __wait_on_bit_lock+0x4b/0xb0 [<ffffffff811640a7>] __lock_page+0x67/0x70 [<ffffffff810acf50>] ? autoremove_wake_function+0x40/0x40 [<ffffffff811652cc>] pagecache_get_page+0x14c/0x1e0 [<ffffffffa029afa9>] get_node_page+0x59/0x130 [f2fs] [<ffffffffa02a63ad>] read_all_xattrs+0x24d/0x430 [f2fs] [<ffffffffa02a6ca2>] f2fs_getxattr+0x52/0xe0 [f2fs] [<ffffffffa02a7481>] f2fs_get_acl+0x41/0x2d0 [f2fs] [<ffffffff8122d847>] get_acl+0x47/0x70 [<ffffffff8122db5a>] posix_acl_create+0x5a/0x150 [<ffffffffa02a7759>] f2fs_init_acl+0x29/0xcb [f2fs] [<ffffffffa0286a8d>] init_inode_metadata+0x5d/0x340 [f2fs] [<ffffffffa029253a>] f2fs_add_inline_entry+0x12a/0x2e0 [f2fs] [<ffffffffa0286ea5>] __f2fs_add_link+0x45/0x4a0 [f2fs] [<ffffffffa028b5b6>] ? f2fs_new_inode+0x146/0x220 [f2fs] [<ffffffffa028b816>] f2fs_mknod+0x86/0xf0 [f2fs] [<ffffffff811e3ec1>] vfs_mknod+0xe1/0x160 [<ffffffff811e4b26>] SyS_mknod+0x1f6/0x200 [<ffffffff81741d7f>] tracesys+0xe1/0xe6 Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Jaegeuk Kim	59a0615540	f2fs: fix to wait correct block type The inode page needs to wait NODE block io. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:33 -08:00
Jaegeuk Kim	4e6ebf6d49	f2fs: reuse find_in_block code for find_in_inline_dir This patch removes redundant copied code in find_in_inline_dir. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:32 -08:00
Jaegeuk Kim	a82afa2019	f2fs: reuse room_for_filename for inline dentry operation This patch introduces to reuse the existing room_for_filename for inline dentry operation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:32 -08:00
Chao Yu	622f28ae9b	f2fs: enable inline dir handling Add inline dir functions into normal dir ops' function to handle inline ops. Besides, we enable inline dir mode when a new dir inode is created if inline_data option is on. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:32 -08:00
Chao Yu	201a05be96	f2fs: add key function to handle inline dir Adds Functions to implement inline dir init/lookup/insert/delete/convert ops. Signed-off-by: Chao Yu <chao2.yu@samsung.com> [Jaegeuk Kim: remove needless reserved area copy, pointed by Dan Carpenter] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Chao Yu	dbeacf02eb	f2fs: export dir operations for inline dir This patch exports some dir operations for inline dir, additionally introduces f2fs_drop_nlink from f2fs_delete_entry for reusing by inline dir function. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Chao Yu	5efd3c6f1b	f2fs: add a new mount option for inline dir Adds a new mount option 'inline_dentry' for inline dir. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Chao Yu	34d67debe0	f2fs: add infra struct and helper for inline dir This patch defines macro/inline dentry structure, and adds some helpers for inline dir infrastructure. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Jaegeuk Kim	af41d3ee00	f2fs: avoid infinite loop at cp_error This patch avoids an infinite loop in sync_dirty_inode_page when -EIO was detected. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:31 -08:00
Jaegeuk Kim	4a257ed677	f2fs: avoid build warning This patch removes build warning. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:30 -08:00
Jaegeuk Kim	13fd8f89f6	f2fs: fix to call f2fs_unlock_op This patch fixes to call f2fs_unlock_op, which was missing before. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:30 -08:00
Jaegeuk Kim	9ba69cf987	f2fs: avoid to allocate when inline_data was written The scenario is like this. inline_data i_size page write_begin/vm_page_mkwrite X 30 dirty_page X 30 write to #4096 position X 30 get_dnode_of_data wait for get_dnode_of_data O 30 write inline_data O 30 get_dnode_of_data O 30 reserve data block .. In this case, we have #0 = NEW_ADDR and inline_data as well. We should not allow this condition for further access. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:30 -08:00
Jaegeuk Kim	a78186ebe5	f2fs: use highmem for directory pages This patch fixes to use highmem for directory pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:30 -08:00
Jaegeuk Kim	1ce86bf6f8	f2fs: fix race condition on truncation with inline_data Let's consider the following scenario. blkaddr[0] inline_data i_size i_blocks writepage truncate NEW X 4096 2 dirty page #0 NEW X 0 change i_size NEW X 0 2 f2fs_write_inline_data NEW X 0 2 get_dnode_of_data NEW X 0 2 truncate_data_blocks_range NULL O 0 1 memcpy(inline_data) NULL O 0 1 f2fs_put_dnode NULL O 0 1 f2fs_truncate NULL O 0 1 get_dnode_of_data NULL O 0 1 invalid block addr This patch adds a check of the inline_data flag during f2fs_truncate so that it does not refer to corrupted block indices. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:29 -08:00
Jaegeuk Kim	c08a690b46	f2fs: should truncate any allocated block for inline_data write When trying to write inline_data, we should truncate any data block allocated and pointed by the inode block. We should consider the data index is not 0. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:29 -08:00
Jaegeuk Kim	cbcb2872e3	f2fs: invalidate inmemory page If user truncates file's data, we should truncate inmemory pages too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:29 -08:00
Jaegeuk Kim	34ba94bac9	f2fs: do not make dirty any inmemory pages This patch let inmemory pages be clean all the time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-11-03 16:07:29 -08:00
Jaegeuk Kim	02a1335f25	f2fs: support volatile operations for transient data This patch adds support for volatile writes which keep data pages in memory until f2fs_evict_inode is called by iput. For instance, we can use this feature for the sqlite database as follows. While supporting atomic writes for main database file, we can keep its journal data temporarily in the page cache by the following sequence. 1. open -> ioctl(F2FS_IOC_START_VOLATILE_WRITE); 2. writes : keep all the data in the page cache. 3. flush to the database file with atomic writes a. ioctl(F2FS_IOC_START_ATOMIC_WRITE); b. writes c. ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); 4. close -> drop the cached data Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-10-07 11:54:41 -07:00
Jaegeuk Kim	88b88a6679	f2fs: support atomic writes This patch introduces a very limited functionality for atomic write support. In order to support atomic write, this patch adds two ioctls: o F2FS_IOC_START_ATOMIC_WRITE o F2FS_IOC_COMMIT_ATOMIC_WRITE The database engine should be aware of the following sequence. 1. open -> ioctl(F2FS_IOC_START_ATOMIC_WRITE); 2. writes : all the written data will be treated as atomic pages. 3. commit -> ioctl(F2FS_IOC_COMMIT_ATOMIC_WRITE); : this flushes all the data blocks to the disk, which will be shown all or nothing by f2fs recovery procedure. 4. repeat to #2. The IO patterns should be: ,- START_ATOMIC_WRITE ,- COMMIT_ATOMIC_WRITE CP \| D D D D D D \| FSYNC \| D D D D \| FSYNC ... `- COMMIT_ATOMIC_WRITE Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-10-06 17:39:50 -07:00
Jaegeuk Kim	120c2cba1d	f2fs: remove unused return value Don't return any value without any usage. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-10-05 21:05:15 -07:00
Jaegeuk Kim	52656e6cf7	f2fs: clean up f2fs_ioctl functions This patch cleans up f2fs_ioctl functions for better readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:34:56 -07:00
Dan Carpenter	8a21984d5d	f2fs: potential shift wrapping bug in f2fs_trim_fs() My static checker complains that segment is a u64 but only the lower 31 bits can be used before we hit a shift wrapping bug. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:34:56 -07:00
Jaegeuk Kim	44c1615651	f2fs: call f2fs_unlock_op after error was handled This patch relocates f2fs_unlock_op in every directory operation so that it is called after any error has been processed. Otherwise, the checkpoint can be entered with valid node ids without its dentry when -ENOSPC occurs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:34:55 -07:00
Jaegeuk Kim	7cd8558baa	f2fs: check the use of macros on block counts and addresses This patch cleans up the existing and new macros for readability. Rule is like this. ,-----------------------------------------> MAX_BLKADDR -, \| ,------------- TOTAL_BLKS ----------------------------, \| \| \| \| ,- seg0_blkaddr ,----- sit/nat/ssa/main blkaddress \| block \| \| (SEG0_BLKADDR) \| \| \| \| (e.g., MAIN_BLKADDR) \| address 0..x................ a b c d ............................. \| \| global seg# 0...................... m ............................. \| \| \| \| `------- MAIN_SEGS -----------' `-------------- TOTAL_SEGS ---------------------------' \| \| seg# 0..........xx.................. = Note = o GET_SEGNO_FROM_SEG0 : blk address -> global segno o GET_SEGNO : blk address -> segno o START_BLOCK : segno -> starting block address Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:34:47 -07:00
Jaegeuk Kim	309cc2b6e7	f2fs: refactor flush_nat_entries to remove costly reorganizing ops Previously, f2fs tries to reorganize the dirty nat entries into multiple sets according to its nid ranges. This can improve the flushing nat pages, however, if there are a lot of cached nat entries, it becomes a bottleneck. This patch introduces a new set management flow by removing dirty nat list and adding a series of set operations when the nat entry becomes dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:30:41 -07:00
Jaegeuk Kim	4b2fecc846	f2fs: introduce FITRIM in f2fs_ioctl This patch introduces FITRIM in f2fs_ioctl. In this case, f2fs will issue small discards and prefree discards as many as possible for the given area. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:06:09 -07:00
Jaegeuk Kim	75ab4cb830	f2fs: introduce cp_control structure This patch add a new data structure to control checkpoint parameters. Currently, it presents the reason of checkpoint such as is_umount and normal sync. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-30 15:01:28 -07:00
Jaegeuk Kim	95dd897301	f2fs: use more free segments until SSR is activated Previously, f2fs activates SSR if the # of free segments reaches the # of overprovisioned segments. In this case, SSR starts to use dirty segments only, so that the overprovisioned space cannot be selected for new data. This means that we have no chance to utilize the overprovisioned space at all. This patch fixes that by allowing LFS allocations until the # of free segments reaches the last threshold, reserved space. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:24 -07:00
Jaegeuk Kim	9b5f136fd4	f2fs: change the ipu_policy option to enable combinations This patch changes the ipu_policy setting to use any combination of orthogonal policies. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:24 -07:00
Chao Yu	210f41bc04	f2fs: fix to search whole dirty segmap when get_victim In ->get_victim we get max_search value from dirty_i->nr_dirty without protection of seglist_lock, after that, nr_dirty can be increased/decreased before we hold seglist_lock lock. Then in main loop we attempt to traverse all dirty section one time to find victim section, but it's not accurate to use max_search as the total loop count, because we might lose checking several sections or check sections redundantly for the case of nr_dirty are increased or decreased previously. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:23 -07:00
Chao Yu	26666c8a43	f2fs: fix to clean previous mount option when remount_fs The mount manual describes remount as below: "mount -o remount,rw /dev/foo /dir After this call all old mount options are replaced and arbitrary stuff from fstab is ignored, except the loop= option which is internally generated and maintained by the mount command." Previously f2fs did not clear old mount options on remount_fs, so we had no chance of disabling a previous option (e.g. flush_merge). Fix it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:22 -07:00
Chao Yu	14cecc5cd6	f2fs: skip punching hole in special condition Now punching hole in directory is not supported in f2fs, so let's limit file type in punch_hole(). In addition, in punch_hole if offset is exceed file size, we should skip punching hole. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:21 -07:00
Chao Yu	55cf9cb63f	f2fs: support large sector size Block size in f2fs is 4096 bytes, so theoretically, f2fs can support 4096 bytes sector device at maximum. But now f2fs only support 512 bytes size sector, so block device such as zRAM which uses page cache as its block storage space will not be mounted successfully as mismatch between sector size of zRAM and sector size of f2fs supported. In this patch we support large sector size in f2fs, so block device with sector size of 512/1024/2048/4096 bytes can be supported in f2fs. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:20 -07:00
Chao Yu	09db6a2ef8	f2fs: fix to truncate blocks past EOF in ->setattr By using FALLOC_FL_KEEP_SIZE in ->fallocate of f2fs, we can fallocate block past EOF without changing i_size of inode. These blocks past EOF will not be truncated in ->setattr as we truncate them only when change the file size. We should give a chance to truncate blocks out of filesize in setattr(). Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:20 -07:00
Jaegeuk Kim	976e4c50ae	f2fs: update i_size when __allocate_data_block The f2fs_direct_IO uses __allocate_data_block, but inside the allocation path, we should update i_size at the changed time to update its inode page. Otherwise, we can get wrong i_size after roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:19 -07:00
Jaegeuk Kim	90a893c749	f2fs: use MAX_BIO_BLOCKS(sbi) This patch cleans up a simple macro. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:18 -07:00
Jaegeuk Kim	c52e1b10b1	f2fs: remove redundant operation during roll-forward recovery If the same data is updated multiple times, we don't need to redo all of the operations. Let's just update the latest one. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:17 -07:00
Jaegeuk Kim	19c9c466e5	f2fs: do not skip latest inode information In f2fs_sync_file, if there is no written appended writes, it skips to write its node blocks. But, if there is up-to-date inode page, we should write it to update its metadata during the roll-forward recovery. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:16 -07:00
Jaegeuk Kim	441ac5cb32	f2fs: fix roll-forward missing scenarios We can summarize the roll forward recovery scenarios as follows. [Term] F: fsync_mark, D: dentry_mark 1. inode(x) \| CP \| inode(x) \| dnode(F) -> Update the latest inode(x). 2. inode(x) \| CP \| inode(F) \| dnode(F) -> No problem. 3. inode(x) \| CP \| dnode(F) \| inode(x) -> Recover to the latest dnode(F), and drop the last inode(x) 4. inode(x) \| CP \| dnode(F) \| inode(F) -> No problem. 5. CP \| inode(x) \| dnode(F) -> The inode(DF) was missing. Should drop this dnode(F). 6. CP \| inode(DF) \| dnode(F) -> No problem. 7. CP \| dnode(F) \| inode(DF) -> If f2fs_iget fails, then goto next to find inode(DF). 8. CP \| dnode(F) \| inode(x) -> If f2fs_iget fails, then goto next to find inode(DF). But it will fail due to no inode(DF). So, this patch adds some missing points such as #1, #5, #7, and #8. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:16 -07:00
Jaegeuk Kim	88bd02c947	f2fs: fix conditions to remain recovery information in f2fs_sync_file This patch revisits all the recovery information used during f2fs_sync_file. In this patch, there are three pieces of information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC, /* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) \| CP \| inode(x) \| dnode(F) 2. inode(x) \| CP \| inode(F) \| dnode(F) 3. inode(x) \| CP \| dnode(F) \| inode(x) \| inode(F) 4. inode(x) \| CP \| dnode(F) \| inode(F) 5. CP \| inode(x) \| dnode(F) \| inode(DF) 6. CP \| inode(DF) \| dnode(F) 7. CP \| dnode(F) \| inode(DF) 8. CP \| dnode(F) \| inode(x) \| inode(DF) For example, #3, the three conditions should be changed as follows. inode(x) \| CP \| dnode(F) \| inode(x) \| inode(F) a) x o o o o b) x x x x o c) x o o x o If f2fs_sync_file stops ------^, it should write inode(F) --------------^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP \| alloc \| dnode(F) \| inode(x) \| inode(DF) a) o x x x x b) x x x o c) o o x o If f2fs_sync_file stops -------^, it should write inode(DF) --------------^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we don't need to recover that inode. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:15 -07:00
Jaegeuk Kim	7ef35e3b9e	f2fs: introduce a flag to represent each nat entry information This patch introduces a flag in the nat entry structure to merge various information such as checkpointed and fsync_done marks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:14 -07:00
Jaegeuk Kim	4c521f493b	f2fs: use meta_inode cache to improve roll-forward speed Previously, all the dnode pages had to be read during the roll-forward recovery. Even worse, the whole chain was traversed twice. This patch removes those redundant and costly read operations by using the page cache of meta_inode and the readahead function as well. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-23 11:10:12 -07:00
Jaegeuk Kim	60979115a6	f2fs: fix double lock for inode page during roll-forward recovery If the inode is the same and its data index needs to be truncated, we can fall into a double lock for its inode page via get_dnode_of_data. Error case is like this. 1. write data 1, 2, 3, 4, 5 in inode #4. 2. write data 100, 102, 103, 104, 105 in dnode #6 of inode #4. 3. sync 4. update data 100->106 in dnode #6. 5. fsync inode #4. 6. power-cut -> Then, 1. go back to #3's checkpoint 2. in do_recover_data, get_dnode_of_data() gets inode #4. 3. detect 100->106 in dnode #6. 4. check_index_in_prev_nodes tries to truncate 100 in dnode #6. 5. to trigger truncate_hole, get_dnode_of_data should grab inode #4. 6. detect kernel hang This patch should resolve that bug. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-16 04:10:47 -07:00
Huang Ying	c6e489305e	f2fs: fix a race condition in next_free_nid The nm_i->fcnt checking is executed before spin_lock, so if another thread deletes the last free_nid from the list, the wrong nid may be returned. So fix the race condition by moving the nm_i->fcnt checking into spin_lock. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-16 04:10:46 -07:00
Huang Ying	7704182387	f2fs: use nm_i->next_scan_nid as default for next_free_nid Now, if there is no free nid in nm_i->free_nid_list, 0 may be saved into next_free_nid of checkpoint, this may cause useless scanning for next mount. nm_i->next_scan_nid should be a better default value than 0. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-16 04:10:45 -07:00
Jaegeuk Kim	c1ce1b02bb	f2fs: give an option to enable in-place-updates during fsync to users If user wrote F2FS_IPU_FSYNC:4 in /sys/fs/f2fs/ipu_policy, f2fs_sync_file only starts to try in-place-updates. And, if the number of dirty pages is over /sys/fs/f2fs/min_fsync_blocks, it keeps out-of-order manner. Otherwise, it triggers in-place-updates. This may be used by storage showing very high random write performance. For example, it can be used when, Seq. writes (Data) + wait + Seq. writes (Node) is pretty much slower than, Rand. writes (Data) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-16 04:10:44 -07:00
Jaegeuk Kim	a7ffdbe22c	f2fs: expand counting dirty pages in the inode page cache Previously f2fs only counts dirty dentry pages, but there is no reason not to expand the scope. This patch changes the names on the management of dirty pages and to count dirty pages in each inode info as well. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-16 04:10:39 -07:00
Jaegeuk Kim	2403c155b8	f2fs: remove lengthy inode->i_ino This patch is to remove lengthy name by adding a new variable. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-10 17:00:25 -07:00
Jaegeuk Kim	0b4c5afde9	f2fs: fix negative value for lseek offset If an application passes a negative offset to lseek with SEEK_DATA\|SEEK_HOLE, previous f2fs went into BUG_ON in get_dnode_of_data, which was reported by Tommi Rantala. He could make a simple code to detect this having: lseek(fd, -17595150933902LL, SEEK_DATA); This patch should resolve that bug. Reported-by: Tommi Rantala <tt.rantala@gmail.com> [Jaegeuk Kim: relocate the condition as suggested by Chao] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 14:46:36 -07:00
Huang Ying	9a01b56b1a	f2fs: avoid node page to be written twice in gc_node_segment In gc_node_segment, if node page gc is run concurrently with node page writeback, and check_valid_map and get_node_page run after page locked and before cur_valid_map is updated as below, it is possible for the page to be written twice unnecessarily. sync_node_pages try_lock_page ... check_valid_map f2fs_write_node_page ... write_node_page do_write_page allocate_data_block ... refresh_sit_entry /* update cur_valid_map */ ... ... unlock_page get_node_page ... set_page_dirty ... f2fs_put_page unlock_page This can be solved via calling check_valid_map after get_node_page again. Signed-off-by: Huang, Ying <ying.huang@intel.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:07 -07:00
Gu Zheng	721bd4d5c3	f2fs: use lock-less list(llist) to simplify the flush cmd management We use flush cmd control to collect many flush cmds, and flush them together. In this case, we use two lists to manage the flush cmds (collect and dispatch), and one spin lock is used to protect this. In fact, the lock-less list(llist) is very suitable to this case, and we use it to simplify this routine. v2: -use llist_for_each_entry_safe to fix possible use-after-free issue. -remove the unused field from struct flush_cmd. Thanks for Yu's suggestion. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:06 -07:00
Chao Yu	184a5cd2ce	f2fs: refactor flush_sit_entries codes for reducing SIT writes In commit `aec71382c6` ("f2fs: refactor flush_nat_entries codes for reducing NAT writes"), we described the issue as below: "Although building NAT journal in cursum reduce the read/write work for NAT block, but previous design leave us lower performance when write checkpoint frequently for these cases: 1. if journal in cursum has already full, it's a bit of waste that we flush all nat entries to page for persistence, but not to cache any entries. 2. if journal in cursum is not full, we fill nat entries to journal until journal is full, then flush the left dirty entries to disk without merge journaled entries, so these journaled entries may be flushed to disk at next checkpoint but lost chance to flushed last time." Actually, we have the same problem in using SIT journal area. In this patch, firstly we will update sit journal with dirty entries as many as possible. Secondly if there is no space in sit journal, we will remove all entries in journal and walk through the whole dirty entry bitmap of sit, accounting dirty sit entries located in same SIT block to sit entry set. All entry sets are linked to list sit_entry_set in sm_info, sorted ascending order by count of entries in set. Later we flush entries in set which have fewest entries into journal as many as we can, and then flush dense set with merged entries to disk. In this way we can use sit journal area more effectively, also we will reduce SIT update, result in gaining in performance and saving lifetime of flash device. In my testing environment, it shows this patch can help to reduce SIT block update obviously. virtual machine + hard disk: fsstress -p 20 -n 400 -l 5 sit page num cp count sit pages/cp based 2006.50 1349.75 1.486 patched 1566.25 1463.25 1.070 Our latency of merging op is small when handling a great number of dirty SIT entries in flush_sit_entries: latency(ns) dirty sit count 36038 2151 49168 2123 37174 2232 Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:05 -07:00
Chao Yu	d3a14afd5e	f2fs: remove unneeded sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO is not used, remove it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:05 -07:00
Jaegeuk Kim	b0c44f05a2	f2fs: need fsck.f2fs if the recovery was failed If the roll-forward recovery fails, we had better run fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:04 -07:00
Jaegeuk Kim	ec325b5270	f2fs: handle bug cases by letting fsck.f2fs initiate This patch adds to handle corner buggy cases for fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:03 -07:00
Jaegeuk Kim	05796763b8	f2fs: add BUG cases to initiate fsck.f2fs This patch replaces BUG cases with f2fs_bug_on to retain fsck.f2fs information. It also implements some void functions to initiate fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:03 -07:00
Jaegeuk Kim	9850cf4a89	f2fs: need fsck.f2fs when f2fs_bug_on is triggered If any f2fs_bug_on is triggered, fsck.f2fs is needed. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:15:02 -07:00
Jaegeuk Kim	2ae4c673e3	f2fs: retain inconsistency information to initiate fsck.f2fs This patch adds sbi->need_fsck to conduct fsck.f2fs later. This flag can only be removed by fsck.f2fs. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-09 13:14:25 -07:00
Jaegeuk Kim	4081363fbe	f2fs: introduce F2FS_I_SB, F2FS_M_SB, and F2FS_P_SB This patch adds three inline functions to clean up dirty casting codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-03 17:37:13 -07:00
Chao Yu	b73e52824c	f2fs: reposition unlock_new_inode to prevent accessing invalid inode Due to a race condition on the inode cache, the following scenario can appear: [Thread a] [Thread b] ->f2fs_mkdir ->f2fs_add_link ->__f2fs_add_link ->init_inode_metadata failed here ->gc_thread_func ->f2fs_gc ->do_garbage_collect ->gc_data_segment ->f2fs_iget ->iget_locked ->wait_on_inode ->unlock_new_inode ->move_data_page ->make_bad_inode ->iput When we fail in create/symlink/mkdir/mknod/tmpfile, the newly allocated inode should be set as bad to avoid being accessed by another thread. But the above scenario allows f2fs to access the invalid inode before this inode is set as bad. This patch fixes the potential problem; the issue was found by code review. change log from v1: o Add condition judgment in gc_data_segment() suggested by Changman Lee. o use iget_failed to simplify code. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-09-02 00:22:24 -07:00
Jaegeuk Kim	3304b56401	f2fs: fix wrong casting for dentry name The dentry name type is unsigned char *. If we don't match this type, some character codes can be changed by signed bit. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-29 00:26:50 -07:00
Dan Carpenter	922cedbd00	f2fs: simplify by using a literal We can make the code a bit simpler because we know that "!retry" is zero. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-28 09:25:29 -07:00
Jaegeuk Kim	c2e69583a4	f2fs: truncate stale block for inline_data This verifies to truncate any allocated blocks, offset[0], by inline_data. Not figured out, but for making sure. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-25 14:52:09 -07:00
Chao Yu	b5b822050c	f2fs: use macro for code readability This patch introduces DEF_NIDS_PER_INODE/GET_ORPHAN_BLOCKS/F2FS_CP_PACKS macro instead of numbers in code for readability. change log from v1: o fix typo pointed out by Jaegeuk Kim. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-22 13:56:47 -07:00
Chao Yu	9d1589ef2e	f2fs: introduce need_do_checkpoint for readability This patch introduces need_do_checkpoint() to gather the numerous judgment conditions for readability. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:07 -07:00
Chao Yu	c200b1aa6c	f2fs: fix incorrect calculation with total/free inode num Theoretically, our total inode number is the same as the total node number, but three node ids are reserved in f2fs: 0, 1 (node nid), and 2 (meta nid). They should never be used by users, so the total/free inode number calculated in ->statfs is wrong. This patch introduces F2FS_RESERVED_NODE_NUM and then fixes this issue by recalculating the total/free inode number with the macro. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:06 -07:00
Jaegeuk Kim	04859dba50	f2fs: remove rename and use rename2 Refer the following patch. commit `7177a9c4b5` Author: Miklos Szeredi <mszeredi@suse.cz> Date: Wed Jul 23 15:15:30 2014 +0200 fs: call rename2 if exists Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:04 -07:00
Jaegeuk Kim	ec4e7af4ca	f2fs: skip if inline_data was converted already This patch checks inline_data one more time under the inode page lock whether its inline_data is converted or not. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:03 -07:00
Jaegeuk Kim	202095a7a0	f2fs: remove rewrite_node_page I think we need to let the dirty node pages remain in the page cache instead of rewriting them in their places. So, after done with successful recovery, write_checkpoint will flush all of them through the normal write path. Through this, we can avoid potential error cases in terms of block allocation. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:02 -07:00
Jaegeuk Kim	764aa3e978	f2fs: avoid double lock in truncate_blocks The init_inode_metadata calls truncate_blocks when an error occurs. The callers hold f2fs_lock_op, so we should not call it again in truncate_blocks. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:01 -07:00
Jaegeuk Kim	14f4e69085	f2fs: prevent checkpoint during roll-forward Any checkpoint should not be done during the core roll-forward procedure. Especially, it includes error cases too. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:57:00 -07:00
Jaegeuk Kim	b3fe0a0da2	f2fs: add WARN_ON in f2fs_bug_on This patch adds WARN_ON when f2fs_bug_on is disabled to see kernel messages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:56:59 -07:00
Jaegeuk Kim	cf779cab14	f2fs: handle EIO not to break fs consistency There are two rules when EIO occurs. 1. don't write any checkpoint data to preserve the previous checkpoint 2. don't lose the cached dentry/node/meta pages So, at first, this patch adds set_page_dirty in f2fs_write_end_io's failure path. Then, writing checkpoint/dentry/node blocks is not allowed. Note that, for the data pages, we can't just throw away by redirtying them. Otherwise, kworker can fall into an infinite loop to flush them. (Ref. xfstests/019) Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 13:55:05 -07:00
Jaegeuk Kim	8501017e50	f2fs: check s_dirty under cp_mutex It needs to check s_dirty under cp_mutex, since s_dirty is reset under that mutex. And previous condition was not correct, since we can omit doing checkpoint when checkpoint was done followed by all the node pages were written back. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 09:21:02 -07:00
Jaegeuk Kim	5274651927	f2fs: unlock_page when node page is redirtied out This patch fixes missing unlock_page when a node page is redirtied out. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 09:21:01 -07:00
Jaegeuk Kim	1e968fdfe6	f2fs: introduce f2fs_cp_error for readability This patch adds f2fs_cp_error for readability. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 09:21:00 -07:00
Jaegeuk Kim	ed2e621a95	f2fs: give a chance to mount again when encountering errors This patch gives another chance to try mount process when we encounter an error. This makes an effect on the roll-forward recovery failures as well. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 09:21:00 -07:00
Jaegeuk Kim	6f12ac25f0	f2fs: trigger release_dirty_inode in f2fs_put_super The generic_shutdown_super calls sync_filesystem, evict_inode, and then f2fs_put_super. In f2fs_evict_inode, we remain some dirty inode information so we should release them at f2fs_put_super. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-21 09:20:29 -07:00
Jaegeuk Kim	97c3c5cac2	f2fs: don't skip checkpoint if there is no dirty node pages This is the erroneous scenario. 1. write data 2. do checkpoint 3. produce some dirty node pages by the gc thread 4. write back dirty node pages 5. f2fs_put_super will skip the checkpoint, since dirty count for node pages is zero. This patch removes this wrong condition check. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-19 10:01:35 -07:00
Jaegeuk Kim	b307384e4f	f2fs: avoid bug_on when error is occurred During the recovery, if an error like EIO or ENOMEM occurs, f2fs_bug_on should be skipped. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-19 10:01:35 -07:00
Jaegeuk Kim	1c35a90e8a	f2fs: fix to recover inline_xattr/data and blocks This patch fixes not to skip xattr recovery and inline xattr/data recovery order. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>	2014-08-19 10:01:34 -07:00
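
The ->statfs fix in commit c200b1aa6c above comes down to simple arithmetic, and a small sketch may make it easier to follow. The code below is an illustration only, not the kernel code: the struct and function names are hypothetical, and the value 3 for the reserved count is assumed from the three reserved node ids (0, 1, and 2) listed in that commit message.

```c
#include <stdint.h>

/* Assumed value: the commit names F2FS_RESERVED_NODE_NUM but this listing
 * does not show its definition. 3 matches the reserved ids 0, 1, and 2. */
#define SKETCH_RESERVED_NODE_NUM 3u

/* Hypothetical container for the two numbers reported by statfs. */
struct sketch_inode_counts {
	uint64_t total; /* inodes a user could ever allocate */
	uint64_t free;  /* inodes still available */
};

/*
 * Hypothetical helper: subtract the reserved node ids from the total node
 * count before deriving the free inode count. Inputs that would underflow
 * are clamped to zero instead of wrapping around.
 */
static struct sketch_inode_counts
sketch_usable_inodes(uint64_t total_node_count, uint64_t valid_inode_count)
{
	struct sketch_inode_counts c = { 0, 0 };

	if (total_node_count > SKETCH_RESERVED_NODE_NUM)
		c.total = total_node_count - SKETCH_RESERVED_NODE_NUM;
	if (c.total > valid_inode_count)
		c.free = c.total - valid_inode_count;
	return c;
}
```

For example, with 1000 node ids and 10 inodes in use, the helper reports 997 total and 987 free inodes; without the subtraction both figures would be overstated by three.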


752 Commits