linux/fs
Peter Staubach c293621bbf [PATCH] stale POSIX lock handling
I believe that there is a problem with the handling of POSIX locks, which
the attached patch should address.

The problem appears to be a race between fcntl(2) and close(2).  A
multithreaded application could close a file descriptor at the same time as
it is trying to acquire a lock using the same file descriptor.  I would
suggest that that multithreaded application is not providing the proper
synchronization for itself, but the OS should still behave correctly.

SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
a file descriptor is closed, that all POSIX locks on the file, owned by the
process which closed the file descriptor, should be released.

The trick here is when those locks are released.  The current code releases
all locks which exist when close is processing, but any locks in progress
are handled when the last reference to the open file is released.

There are three cases to consider.

One is the simple case, a multithreaded (mt) process has a file open and
races to close it and acquire a lock on it.  In this case, the close will
release one reference to the open file and when the fcntl is done, it will
release the other reference.  For this situation, no locks should exist on
the file when both the close and fcntl operations are done.  The current
system will handle this case because the last reference to the open file is
being released.

The second case is when the mt process has dup(2)'d the file descriptor.
The close will release one reference to the file and the fcntl, when done,
will release another, but there will still be at least one more reference
to the open file.  One could argue that the existence of a lock on the file
after the close has completed is okay, because it was acquired after the
close operation and there is still a way for the application to release the
lock on the file, using an existing file descriptor.

The third case is when the mt process has forked, after opening the file
and either before or after becoming an mt process.  In this case, each
process would hold a reference to the open file.  For each process, this
degenerates to first case above.  However, the lock continues to exist
until both processes have released their references to the open file.  This
lock could block other lock requests.

The changes to release the lock when the last reference to the open file
aren't quite right because they would allow the lock to exist as long as
there was a reference to the open file.  This is too long.

The new proposed solution is to add support in the fcntl code path to
detect a race with close and then to release the lock which was just
acquired when such as race is detected.  This causes locks to be released
in a timely fashion and for the system to conform to the POSIX semantic
specification.

This was tested by instrumenting a kernel to detect the handling locks and
then running a program which generates case #3 above.  A dangling lock
could be reliably generated.  When the changes to detect the close/fcntl
race were added, a dangling lock could no longer be generated.

Cc: Matthew Wilcox <willy@debian.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-27 16:26:06 -07:00
..
adfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
affs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
afs [PATCH] Cleanup patch for process freezing 2005-06-25 17:10:13 -07:00
autofs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
autofs4 [PATCH] autofs4: fix infamous "Busy inodes after umount ..." message 2005-07-27 16:25:51 -07:00
befs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
bfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
cifs [CIFS] Fix cifs update of page cache. Write at correct offset when out of memory 2005-06-09 14:44:07 -07:00
coda [PATCH] class: convert the remaining class_simple users in the kernel to usee the new class api 2005-06-20 15:15:11 -07:00
cramfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
debugfs [PATCH] remove duplicate get_dentry functions in various places 2005-06-23 09:45:20 -07:00
devfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
devpts Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
efs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
exportfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
ext2 [PATCH] fix xip sparse file handling in ext2 2005-07-27 16:25:53 -07:00
ext3 [PATCH] ext3: drop quota references before releasing inode 2005-07-27 16:25:50 -07:00
fat [PATCH] fatfs sectioning fix 2005-06-30 22:29:48 -07:00
freevxfs [PATCH] freevxfs: minor cleanups 2005-06-30 08:45:12 -07:00
hfs [PATCH] hfs, hfsplus: don't leak s_fs_info and fix an oops 2005-05-01 08:59:16 -07:00
hfsplus [PATCH] hfs, hfsplus: don't leak s_fs_info and fix an oops 2005-05-01 08:59:16 -07:00
hostfs [PATCH] uml: hostfs: unuse ROOT_DEV 2005-07-14 09:00:25 -07:00
hpfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
hppfs [PATCH] uml: fix hppfs error path 2005-07-14 09:00:25 -07:00
hugetlbfs [PATCH] Avoiding mmap fragmentation 2005-06-21 18:46:16 -07:00
isofs [PATCH] isofs: show hidden files, add granularity for assoc/hidden files flags 2005-06-21 19:07:38 -07:00
jbd [PATCH] Cleanup patch for process freezing 2005-06-25 17:10:13 -07:00
jffs [PATCH] Fix missing refrigerator invocation in jffs2 2005-07-27 16:25:49 -07:00
jffs2 [JFFS2] Fix node allocation leak 2005-07-15 08:14:44 +02:00
jfs JFS: Need to be root to create files with security context 2005-07-13 09:15:18 -05:00
lockd [PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64 2005-07-13 11:25:24 -07:00
minix Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
msdos Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
ncpfs [PATCH] fs/ncpfs/: remove unused #ifdef USE_OLD_SLOW_DIRECTORY_LISTING code 2005-06-25 16:25:04 -07:00
nfs [PATCH] really remove xattr_acl.h 2005-06-28 21:20:31 -07:00
nfs_common [PATCH] NFSD: Add server support for NFSv3 ACLs. 2005-06-22 16:07:23 -04:00
nfsd [PATCH] inotify 2005-07-12 20:38:38 -07:00
nls [PATCH] make some things static 2005-05-05 16:36:47 -07:00
ntfs NTFS: Fix a nasty deadlock that appeared in recent kernels. 2005-06-26 22:12:02 +01:00
openpromfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
partitions [PATCH] small partitions/msdos cleanups 2005-06-25 16:24:59 -07:00
proc [PATCH] kdump: Parse elf32 headers and export through /proc/vmcore 2005-06-25 16:24:53 -07:00
qnx4 [PATCH] fs/qnx4/*: fix sparse warnings 2005-06-24 14:14:24 -07:00
ramfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
reiserfs [PATCH] reiserfs: fix deadlock in inode creation failure path w/ default ACL 2005-07-27 16:25:50 -07:00
romfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
smbfs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
sysfs [PATCH] inotify 2005-07-12 20:38:38 -07:00
sysv Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
udf [PATCH] udf_find_entry() cleanup 2005-06-30 08:45:11 -07:00
ufs Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
umsdos Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
vfat Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
xfs [PATCH] Cleanup patch for process freezing 2005-06-25 17:10:13 -07:00
Kconfig [PATCH] inotify 2005-07-12 20:38:38 -07:00
Kconfig.binfmt Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
Makefile [PATCH] inotify 2005-07-12 20:38:38 -07:00
aio.c [PATCH] aio-retry-fix: fix aio retry work queueing 2005-06-28 21:20:32 -07:00
attr.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
bad_inode.c [PATCH] make some things static 2005-05-05 16:36:47 -07:00
binfmt_aout.c [PATCH] Avoiding mmap fragmentation 2005-06-21 18:46:16 -07:00
binfmt_elf.c [PATCH] Avoiding mmap fragmentation 2005-06-21 18:46:16 -07:00
binfmt_elf_fdpic.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
binfmt_em86.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
binfmt_flat.c [PATCH] binfmt_flat mmap flag fix 2005-06-06 14:57:51 -07:00
binfmt_misc.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
binfmt_script.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
binfmt_som.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
bio.c [PATCH] mostly_read data section 2005-07-07 18:23:46 -07:00
block_dev.c [PATCH] block: add unlocked_ioctl support for block devices 2005-06-23 09:45:32 -07:00
buffer.c [PATCH] page_uptodate locking scalability 2005-07-07 18:23:45 -07:00
char_dev.c [PATCH] cdev: cdev_put oops 2005-07-12 16:01:02 -07:00
compat.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
compat_ioctl.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
dcache.c [PATCH] make some things static 2005-05-05 16:36:47 -07:00
dcookies.c [PATCH] dcookies.c: use proper refcounting functions 2005-07-07 18:23:52 -07:00
direct-io.c [PATCH] pass iocb to dio_iodone_t 2005-06-24 00:05:19 -07:00
dnotify.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
dquot.c [PATCH] list_for_each_entry: fs-dquot.c 2005-06-25 16:25:11 -07:00
eventpoll.c [PATCH] Remove eventpoll macro obfuscation 2005-06-23 09:45:30 -07:00
exec.c [PATCH] reset real_timer target on exec leader change 2005-07-12 16:01:01 -07:00
fcntl.c [PATCH] stale POSIX lock handling 2005-07-27 16:26:06 -07:00
fifo.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
file.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
file_table.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
filesystems.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
fs-writeback.c [PATCH] O(1) sb list traversing on syncs 2005-06-23 09:45:27 -07:00
inode.c [PATCH] Fix soft lockup due to NTFS: VFS part and explanation 2005-07-13 11:25:24 -07:00
inotify.c [PATCH] inotify: fix oops fix 2005-07-26 14:34:18 -07:00
ioctl.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
ioprio.c [PATCH] move ioprio syscalls into syscalls.h 2005-07-07 18:23:37 -07:00
libfs.c [PATCH] fix fsync(dir) return value for ram-based filesystems 2005-06-25 16:24:38 -07:00
locks.c [PATCH] stale POSIX lock handling 2005-07-27 16:26:06 -07:00
mbcache.c [PATCH] make some things static 2005-05-05 16:36:47 -07:00
mpage.c [PATCH] mpage_end_io_write() I/O error handling fix 2005-06-04 17:12:59 -07:00
namei.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
namespace.c [PATCH] namespace: rename mnt_fslink to mnt_expire 2005-07-07 18:23:52 -07:00
nfsctl.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
open.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
pipe.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
posix_acl.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
quota.c [PATCH] O(1) sb list traversing on syncs 2005-06-23 09:45:27 -07:00
quota_v1.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
quota_v2.c [PATCH] quota: possible bug in quota format v2 support 2005-04-16 15:25:47 -07:00
read_write.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
readdir.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
select.c [PATCH] make some things static 2005-05-05 16:36:47 -07:00
seq_file.c [PATCH] DocBook: fix some descriptions 2005-05-01 08:59:26 -07:00
stat.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00
super.c [PATCH] set mnt_namespace in the correct place 2005-07-07 18:23:52 -07:00
xattr.c [PATCH] inotify 2005-07-12 20:38:38 -07:00
xattr_acl.c Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00