linux/fs
Theodore Ts'o f7ad6d2e92 ext4: handle writeback of inodes which are being freed
The following BUG can occur when an inode is being freed while it
still has dirty pages outstanding, and it is being deleted (in this
case because it was the target of a rename).  In ordered mode, we need
to make sure the data pages are written just in case we crash before
the rename (or unlink) is committed.  If the inode is being freed,
then when we try to igrab the inode, we end up tripping the BUG_ON at
fs/ext4/page-io.c:146.

To solve this problem, we need to keep track of the number of I/O
callbacks which are pending, and avoid destroying the inode until they
have all been completed.  That way we don't have to bump the inode
count to keep the inode from being destroyed, an approach which
doesn't work because the count could have already been dropped down to
zero before the inode writeback has started (at which point we're not
allowed to bump the count back up to 1, since the inode has already
started getting freed).
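
The counting scheme described above can be illustrated with a minimal
userspace sketch.  This is not the ext4 patch itself: struct fake_inode
and the fake_* helpers are hypothetical names invented for illustration,
and a pthread mutex and condition variable stand in for whatever
counting and waiting primitives the kernel code actually uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an inode; not the ext4 structures. */
struct fake_inode {
	pthread_mutex_t lock;
	pthread_cond_t  io_done;    /* signalled when pending_io drops to 0 */
	int             pending_io; /* I/O completion callbacks outstanding */
	bool            destroyed;
};

void fake_inode_init(struct fake_inode *ino)
{
	pthread_mutex_init(&ino->lock, NULL);
	pthread_cond_init(&ino->io_done, NULL);
	ino->pending_io = 0;
	ino->destroyed = false;
}

/* Called when a write is submitted: one more callback is now pending. */
void fake_io_submit(struct fake_inode *ino)
{
	pthread_mutex_lock(&ino->lock);
	ino->pending_io++;
	pthread_mutex_unlock(&ino->lock);
}

/* Called from the I/O completion callback. */
void fake_io_complete(struct fake_inode *ino)
{
	pthread_mutex_lock(&ino->lock);
	assert(ino->pending_io > 0);
	if (--ino->pending_io == 0)
		pthread_cond_broadcast(&ino->io_done);
	pthread_mutex_unlock(&ino->lock);
}

/*
 * Eviction path: rather than taking a reference on an inode that is
 * already being freed, wait until every pending callback has run.
 */
void fake_inode_destroy(struct fake_inode *ino)
{
	pthread_mutex_lock(&ino->lock);
	while (ino->pending_io > 0)
		pthread_cond_wait(&ino->io_done, &ino->lock);
	ino->destroyed = true;
	pthread_mutex_unlock(&ino->lock);
}
```

In this sketch the writeback path never touches a reference count.  It
only increments pending_io at submission, and the destroy path blocks
until the matching completions have brought the count back to zero.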

Thanks to Dave Chinner for suggesting this approach, which is also
used by XFS.

  kernel BUG at /scratch_space/linux-2.6/fs/ext4/page-io.c:146!
  Call Trace:
   [<ffffffff811075b1>] ext4_bio_write_page+0x172/0x307
   [<ffffffff811033a7>] mpage_da_submit_io+0x2f9/0x37b
   [<ffffffff811068d7>] mpage_da_map_and_submit+0x2cc/0x2e2
   [<ffffffff811069b3>] mpage_add_bh_to_extent+0xc6/0xd5
   [<ffffffff81106c66>] write_cache_pages_da+0x2a4/0x3ac
   [<ffffffff81107044>] ext4_da_writepages+0x2d6/0x44d
   [<ffffffff81087910>] do_writepages+0x1c/0x25
   [<ffffffff810810a4>] __filemap_fdatawrite_range+0x4b/0x4d
   [<ffffffff810815f5>] filemap_fdatawrite_range+0xe/0x10
   [<ffffffff81122a2e>] jbd2_journal_begin_ordered_truncate+0x7b/0xa2
   [<ffffffff8110615d>] ext4_evict_inode+0x57/0x24c
   [<ffffffff810c14a3>] evict+0x22/0x92
   [<ffffffff810c1a3d>] iput+0x212/0x249
   [<ffffffff810bdf16>] dentry_iput+0xa1/0xb9
   [<ffffffff810bdf6b>] d_kill+0x3d/0x5d
   [<ffffffff810be613>] dput+0x13a/0x147
   [<ffffffff810b990d>] sys_renameat+0x1b5/0x258
   [<ffffffff81145f71>] ? _atomic_dec_and_lock+0x2d/0x4c
   [<ffffffff810b2950>] ? cp_new_stat+0xde/0xea
   [<ffffffff810b29c1>] ? sys_newlstat+0x2d/0x38
   [<ffffffff810b99c6>] sys_rename+0x16/0x18
   [<ffffffff81002a2b>] system_call_fastpath+0x16/0x1b

Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Tested-by: Nick Bowler <nbowler@elliptictech.com>
2010-11-08 13:43:33 -05:00
..
9p convert v9fs 2010-10-29 04:16:38 -04:00
adfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
affs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
afs convert afs 2010-10-29 04:17:13 -04:00
autofs4 convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
befs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
bfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
btrfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2010-10-30 09:05:48 -07:00
cachefiles llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
ceph convert ceph 2010-10-29 04:17:18 -04:00
cifs locks: let the caller free file_lock on ->setlease failure 2010-10-31 06:35:15 -07:00
coda convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
configfs convert get_sb_single() users 2010-10-29 04:16:28 -04:00
cramfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
debugfs convert get_sb_single() users 2010-10-29 04:16:28 -04:00
devpts convert get_sb_single() users 2010-10-29 04:16:28 -04:00
dlm Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm 2010-10-22 17:33:16 -07:00
ecryptfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 2010-10-29 14:15:12 -07:00
efs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
exofs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
exportfs exportfs: use dget_parent 2010-10-25 21:26:13 -04:00
ext2 new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ext3 new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ext4 ext4: handle writeback of inodes which are being freed 2010-11-08 13:43:33 -05:00
fat new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
freevxfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
fscache Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
fuse convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
gfs2 locks: let the caller free file_lock on ->setlease failure 2010-10-31 06:35:15 -07:00
hfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
hfsplus new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
hostfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
hpfs Merge branches 'irq-core-for-linus' and 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-31 20:40:24 -04:00
hppfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
hugetlbfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
isofs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
jbd Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 2010-10-27 20:13:18 -07:00
jbd2 jbd2: Convert jbd2_slab_create_sem to mutex 2010-10-30 12:12:50 +02:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2010-10-30 08:31:35 -07:00
jfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
lockd lockd: fix nlmsvc_notify_blocked locking 2010-10-27 21:39:50 +02:00
logfs switch logfs to ->mount() 2010-10-29 04:16:51 -04:00
minix new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ncpfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
nfs locks: let the caller free file_lock on ->setlease failure 2010-10-31 06:35:15 -07:00
nfs_common
nfsd locks: let the caller free file_lock on ->setlease failure 2010-10-31 06:35:15 -07:00
nilfs2 convert nilfs 2010-10-29 04:16:53 -04:00
nls
notify make fanotify_read() restartable across signals 2010-10-30 14:07:35 -04:00
ntfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ocfs2 convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
omfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
openpromfs convert get_sb_single() users 2010-10-29 04:16:28 -04:00
partitions Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block 2010-10-25 07:45:10 -07:00
proc switch procfs to ->mount() 2010-10-29 04:17:01 -04:00
qnx4 new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
quota quota: Fix possible oops in __dquot_initialize() 2010-10-28 01:30:06 +02:00
ramfs convert get_sb_nodev() users 2010-10-29 04:16:31 -04:00
reiserfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
romfs convert get_sb_mtd() users to ->mount() 2010-10-29 04:16:26 -04:00
squashfs Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus 2010-10-29 08:48:58 -07:00
sysfs convert sysfs 2010-10-29 04:17:08 -04:00
sysv new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ubifs convert ubifs 2010-10-29 04:16:36 -04:00
udf new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
ufs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
xfs new helper: mount_bdev() 2010-10-29 04:16:13 -04:00
Kconfig Merge 'staging-next' to Linus's tree 2010-10-28 09:44:56 -07:00
Kconfig.binfmt coredump: default CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y 2010-10-27 18:03:12 -07:00
Makefile Merge 'staging-next' to Linus's tree 2010-10-28 09:44:56 -07:00
aio.c new helper: ihold() 2010-10-25 21:26:11 -04:00
anon_inodes.c convert get_sb_pseudo() users 2010-10-29 04:16:33 -04:00
attr.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
bad_inode.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
binfmt_aout.c Don't dump task struct in a.out core-dumps 2010-10-14 10:57:40 -07:00
binfmt_elf.c ARM: 6342/1: fix ASLR of PIE executables 2010-10-08 10:02:53 +01:00
binfmt_elf_fdpic.c binfmt_elf_fdpic: Fix clear_user() error handling 2010-06-01 08:11:06 -07:00
binfmt_em86.c
binfmt_flat.c flat: tweak default stack alignment 2010-06-29 15:29:31 -07:00
binfmt_misc.c convert get_sb_single() users 2010-10-29 04:16:28 -04:00
binfmt_script.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
binfmt_som.c
bio-integrity.c fs/bio-integrity.c: return -ENOMEM on kmalloc failure 2010-08-23 13:36:59 +02:00
bio.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
block_dev.c convert get_sb_pseudo() users 2010-10-29 04:16:33 -04:00
buffer.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-10-26 17:58:44 -07:00
char_dev.c Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl 2010-10-22 10:52:56 -07:00
compat.c fs/compat.c: fix build on MIPS/s390 2010-10-30 08:19:35 -07:00
compat_binfmt_elf.c
compat_ioctl.c Merge 'staging-next' to Linus's tree 2010-10-28 09:44:56 -07:00
dcache.c fs: use RCU read side protection in d_validate 2010-10-25 21:26:13 -04:00
dcookies.c
direct-io.c fs/direct-io.c: fix truncation error in dio_complete() return 2010-10-26 16:52:13 -07:00
drop_caches.c simplify checks for I_CLEAR/I_FREEING 2010-08-09 16:47:44 -04:00
eventfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
eventpoll.c epoll: make epoll_wait() use the hrtimer range feature 2010-10-27 18:03:18 -07:00
exec.c exec: don't turn PF_KTHREAD off when a target command was not found 2010-10-27 18:03:13 -07:00
fcntl.c fasync: Fix placement of FASYNC flag comment 2010-10-27 18:17:02 -07:00
fifo.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
file.c vfs: use kmalloc() to allocate fdmem if possible 2010-08-11 08:59:02 -07:00
file_table.c fs: allow for more than 2^31 files 2010-10-26 16:52:15 -07:00
filesystems.c
fs-writeback.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable 2010-10-30 09:05:48 -07:00
fs_struct.c fs: fs_struct rwlock to spinlock 2010-08-18 08:35:46 -04:00
generic_acl.c vfs: update ctime when changing the file's permission by setfacl 2010-08-18 01:04:22 -04:00
inode.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-10-26 17:58:44 -07:00
internal.h braino in internal.h 2010-10-29 05:49:13 -04:00
ioctl.c fs: Add FITRIM ioctl 2010-10-27 21:30:11 -04:00
ioprio.c
libfs.c convert get_sb_pseudo() users 2010-10-29 04:16:33 -04:00
locks.c locks: remove fl_copy_lock lock_manager operation 2010-10-31 06:35:15 -07:00
mbcache.c mbcache: Limit the maximum number of cache entries 2010-08-18 06:24:41 -04:00
mpage.c
namei.c fix open/umount race 2010-10-29 04:14:56 -04:00
namespace.c vfs: fix infinite loop caused by clone_mnt race 2010-10-25 21:24:16 -04:00
nfsctl.c
no-block.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
open.c fix open/umount race 2010-10-29 04:14:56 -04:00
pipe.c convert get_sb_pseudo() users 2010-10-29 04:16:33 -04:00
pnode.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
pnode.h
posix_acl.c
read_write.c readv/writev: do the same MAX_RW_COUNT truncation that read/write does 2010-10-29 10:36:49 -07:00
read_write.h
readdir.c vfs: fix warning: 'dirent' is used uninitialized in this function 2010-08-09 20:45:05 -07:00
select.c epoll: make epoll_wait() use the hrtimer range feature 2010-10-27 18:03:18 -07:00
seq_file.c fs: take dcache_lock inside __d_path 2010-10-25 21:26:12 -04:00
signalfd.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2010-10-26 10:13:10 -07:00
splice.c splice: fix misuse of SPLICE_F_NONBLOCK 2010-08-07 18:52:56 +02:00
stack.c
stat.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
statfs.c add f_flags to struct statfs(64) 2010-08-09 16:48:44 -04:00
super.c switch get_sb_ns() users 2010-10-29 04:17:03 -04:00
sync.c get rid of file_fsync() 2010-08-09 16:47:43 -04:00
timerfd.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
utimes.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
xattr.c fs: xattr_handler table should be const 2010-05-21 18:31:18 -04:00
xattr_acl.c