Commit Graph

46313 Commits

Author SHA1 Message Date
Gerald Schaefer c1821c2e97 [S390] noexec protection
This provides a noexec protection on s390 hardware. Our hardware does
not have any bits left in the pte for a hw noexec bit, so this is a
different approach using shadow page tables and a special addressing
mode that allows separate address spaces for code and data.

As a special feature of our "secondary-space" addressing mode, separate
page tables can be specified for the translation of data addresses
(storage operands) and instruction addresses. The shadow page table is
used for the instruction addresses and the standard page table for the
data addresses.
The shadow page table is linked to the standard page table by a pointer
in page->lru.next of the struct page corresponding to the page that
contains the standard page table (since page->private is not really
private with the pte_lock and the page table pages are not in the LRU
list).
Depending on the software bits of a pte, it is either inserted into
both page tables or just into the standard (data) page table. Pages of
a vma that does not have the VM_EXEC bit set get mapped only in the
data address space. Any try to execute code on such a page will cause a
page translation exception. The standard reaction to this is a SIGSEGV
with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn)
and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the
kernel to the signal stack frame. Unfortunately, the signal return
mechanism cannot be modified to use an SA_RESTORER because the
exception unwinding code depends on the system call opcode stored
behind the signal stack frame.

This feature requires that user space is executed in secondary-space
mode and the kernel in home-space mode, which means that the addressing
modes need to be switched and that the noexec protection only works
for user space.
After switching the addressing modes, we cannot use the mvcp/mvcs
instructions anymore to copy between kernel and user space. A new
mvcos instruction has been added to the z9 EC/BC hardware which allows
to copy between arbitrary address spaces, but on older hardware the
page tables need to be walked manually.

Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:18:17 +01:00
Jan Glauber 86aa9fc245 [S390] move crypto options and some cleanup.
This patch moves the config options for the s390 crypto instructions
to the standard "Hardware crypto devices" menu. In addition some
cleanup has been done: use a flag for supported keylengths, add a
warning about machien limitation, return ENOTSUPP in case the
hardware has no support, remove superfluous printks and update
email addresses.

Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:18:14 +01:00
Cornelia Huck 347d59d7e9 [S390] cio: Don't spam debug feature.
Lower priority of "Blacklisted device detected" messages so we don't
overwrite more useful messages.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:56 +01:00
Peter Oberparleiter 184357a596 [S390] Cleanup of CHSC event handling.
Change CHSC event handling to be more easily extensible.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:42 +01:00
Peter Oberparleiter 0f008aa300 [S390] cio: declare hardware structures packed.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:40 +01:00
Heiko Carstens 9b241cc862 [S390] Add set_fs(USER_DS) to start_thread().
Currently works anyway since search_binary_handler has a
set_fs(USER_DS). But start_thread() is the place where this should be
done. Following all other architectures...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:38 +01:00
Cornelia Huck 758976f9a5 [S390] cio: Catch operand exceptions on stsch.
If we have a subchannel id which has been generated via
for_each_subchannel(), it might contain an invalid subchannel set id.
We need to catch the ensuing operand exception by using stsch_err()
instead of stsch() in all possible cases.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:36 +01:00
Heiko Carstens d8c351a97e [S390] Fix register usage description.
Fix description of register usage as pointed out by Andreas Krebbel.
Since this document is completely outdated and would need a lot of
fixing, it might be worth considering to get rid of it...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:34 +01:00
Heiko Carstens d42335a33b [S390] kretprobe_trampoline_holder() in wrong section.
kretprobe_trampoline_holder() is in kprobes section but used to
register a kprobe in arch_init_kprobes(). Hence register_kprobe()
and therefore arch_init_kprobes() will fail.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:32 +01:00
Heiko Carstens 35df8d53f5 [S390] Fix kprobes breakpoint handling.
In case of an illegal op the die notifier gets called with DIE_TRAP
instead of DIE_BPT first.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:29 +01:00
Martin Schwidefsky d58140cc18 [S390] Update maintainers file.
Use the new linux-s390@vger.kernel.org mailing list instead of
linux-390@vm.marist.edu.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:27 +01:00
Horst Hummel 336c340b68 [S390] dasd: fix unconditional reserve handling.
The reserve/release IOCTLs sometimes do not work. If second system
does a 'steal lock' the pending unit check (Format 3 Msg F) is
delivered. Since ERP is disabled for reserve/release, the IOCTL call
fails. We have to allow basic ERP (retries) for reserve/release IOCTLs.

Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:24 +01:00
Horst Hummel db2738197b [S390] Remove dasd_ccw_log function.
Logging of relevant information is already done by disciplines
dump_sense function.

Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:22 +01:00
Heiko Carstens c48e09131b [S390] Small barrier() and cpu_relax() cleanup.
cpu_relax() has barrier() semantics hence there is no need to use both
of them in conjunction in sclp_sync_wait(). Also change cpu_relax()
so it's more obvious that it has barrier semantics.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:20 +01:00
Cornelia Huck 1125b4640f [S390] cio: Use device_{create,remove}_bin_file.
Create/remove the channel measurement binary files with
device_{create,remove}_bin_file instead of sysfs_{create,remove}_bin_file.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:18 +01:00
Heiko Carstens c59d744bd8 [S390] sclp: don't call local_bh_disable/_local_bh_enable if in_interrupt()
local_bh_disable/_local_bh_enable must not be called if in_irq() is
true. Besides that if in_interrupt() is true bottom halves are
disabled anyway.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:16 +01:00
Gerald Schaefer 444f0e5489 [S390] Show loaded DCSS segments under /proc/iomem.
Currently loaded DCSS segments are now listed in /proc/iomem with
their name followed by a trailing "(DCSS)".

Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:11 +01:00
Cornelia Huck 18374d376c [S390] cio: Restart path verification after unsolicited interrupt.
If we try to start path verification when an unsolicited interrupt
is already pending, stctl shows status pending and we delay path
verification again. We need to check for the doverify bit when the
unsolicited interrupt comes in and then do path verification.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:09 +01:00
Heiko Carstens b075083f35 [S390] Fix FCP dump feature detection.
FCP dump feature detection works only if the sclp command in head.S
was succesful. Since the sclp command is skipped if diag260 works,
we don't have any dump feature detection anymore.
Bug was introduced with d57de5a367.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:07 +01:00
Stefan Weinhuber e3c699b38e [S390] dasd: fix bug in dasd initialization cleanup
The initialization of the dasd_eer code is one of the last steps of the
dasd driver initialization. When initialization fails in one of the
earlier steps, the dasd_exit function is called to clean up what has been
done so far. So the dasd_eer_exit function may be called, although the
dasd_eer_init function wasn't called before and dasd_eer_exit tries to
unregister a misc device that wasn't registered, which results in a BUG.

Make sure that dasd_eer_exit can be called without initialization. Use a
dynamically allocated struct miscdevice instead of a static one, so we
only try to unregister the device if it exists and was actually registered.

Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:04 +01:00
Peter Oberparleiter dbd8ae6306 [S390] sclp: invalid handling of temporary 'not operational' status
Requests are aborted when the sclp interface reports 'not operational'
even though they may still be active at the sclp, leading to concurrent
writes to request memory by both the kernel and the sclp interface.
Do not abort requests for which the sclp interface reports not
operational status during request retry.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>5A
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:17:00 +01:00
Heiko Carstens 3b0b4af2c7 [S390] Simplify virt_to_phys.
No need to use lrag in 64 bit addressing mode since lra will do the
same.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:58 +01:00
Cornelia Huck 32c5b05092 [S390] cio: Remove check for ssd in chpids_show().
Since ssd_info is now available before the subchannel is registered,
we don't need to check whether it is available.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:56 +01:00
Christian Borntraeger bda3563fb2 [S390] cpcmd with vmalloc addresses.
Change the bounce buffer logic of cpcmd. diag8 needs _real_ memory below
2GB. Therefore vmalloced data does not work. As the data might cross a
page boundary, we cannot use virt_to_page either. The solution is to use
virt_to_page only in the check for a bounce buffer.

There was a redundant check for response==NULL. response < 2GB contains
this check as well.

I also removed the rlen==0 check, since rlen=0 and response!=NULL would
be a caller bug and response==NULL is already checked.

Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:54 +01:00
Heiko Carstens 60383201c2 [S390] Remove pointless/unreliable kernel messages.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:52 +01:00
Akinobu Mita b0f1779a87 [S390] Check the return value of kthread_run().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:49 +01:00
Heiko Carstens 2b67fc4606 [S390] Get rid of a lot of sparse warnings.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:47 +01:00
Heiko Carstens 55dff5224a [S390] Move init_irq_proc to the other irq related functions.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-02-05 21:16:44 +01:00
Horms 9473252f20 [IA64] add newline to PAL-code warning message
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:32:59 -08:00
Horms abac08dbb4 [IA64] kexec: Remove inline declaration of efi_get_pal_addr()
Remove the Remove inline declaration of efi_get_pal_addr() as it is
declared in linux/efi.h.

Signed-Off-By: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:31:43 -08:00
Horms 8a697d0a4c [IA64] kexec: Minor enhancement to includes in crash.c
linux/uaccess.h was being included, but it seems that
really the following includes are needed.

asm/page.h: for __va() and PAGE_SHIFT
asm/uaccess.h: for copy_to_user()

I guess that linux/uaccess.h pulls in both asm/page.h and asm/uaccess.h.
I notices this while backporting the code to xen's linux-2.6.16.33,
which does not have linux/uaccess.h. I'm posting it as I think it is a
correct, though somewhat cosmetic fix.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:31:04 -08:00
Horms 233c2f99d6 [IA64] kexec: typo in the saved_max_pfn description in contig.c
Fix a typo in the saved_max_pfn description in contig.c

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:30:25 -08:00
Horms 475c63bded [IA64] Zero size /proc/vmcore on ia64
Set saved_max_pfn when discontig memory is in use.

This sets up saved_max_pfn when disctontig memory is in use.
This mirrors the code for contig memory.

This patch does not entirely solve the problem of making vmcore work,
however it does appear to be neccessary. Please consider applying.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:29:33 -08:00
Magnus Damm bcb9b99d1f [IA64] kexec: Fix CONFIG_SMP=n compilation
Kexec support for 2.6.20 on ia64 does not build properly using a config
made up by CONFIG_SMP=n and CONFIG_HOTPLUG_CPU=n:

Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
Acked-by: Simon Horman <horms@verge.net.au>
Acked-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2007-02-05 11:27:21 -08:00
Patrick Caulfield a34fbc6363 [DLM] fix softlockup in dlm_recv
This patch stops the dlm_recv workqueue from busy-waiting when a node
disconnects. This can cause soft lockup errors on debug systems and bad
performance generally.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:27 -05:00
David Teigland 62a0f62369 [DLM] zero new user lvbs
A new lvb for a userland lock wasn't being initialized to zero.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:24 -05:00
Randy Dunlap 9beeb9f3c5 [DLM/GFS2] indent help text
Indent help text as expected.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:20 -05:00
Russell Cattelan ddee76089c [GFS2] Fix unlink deadlocks
Move the glock acquisition to outside of the transactions.

Lock odering must be preserved in order to prevent ABBA
deadlocks. The current gfs2_change_nlink code would tries
to grab the glock after having started a transaction and thus is holding
the log lock. This is inconsistent with other code paths in
gfs that grab the resource group glock prior to staring
a tranactions.

One problem with this fix is that the resource group
lock is always grabbed now even if the inode still has
ref count and can not be marked for unlink.

Signed-off-by: Russell Cattelan <cattelan@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:17 -05:00
Steven Whitehouse 61be084efc [GFS2] Put back semaphore to avoid umount problem
Dave Teigland fixed this bug a while back, but I managed to mistakenly
remove the semaphore during later development. It is required to avoid
the list of inodes changing during an invalidate_inodes call. I have
made it an rwsem since the read side will be taken frequently during
normal filesystem operation. The write site will only happen during
umount of the file system.

Also the bug only triggers when using the DLM lock manager and only then
under certain conditions as its timing related.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2007-02-05 13:38:14 -05:00
Eric Sandeen bbb28ab759 [GFS2] more CURRENT_TIME_SEC
Whoops, quilt user error, missed this one in the previous patch.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:11 -05:00
Adrian Bunk 0011727785 [GFS2/DLM] fix GFS2 circular dependency
On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote:
> Andrew Morton napsal(a):
> >Temporarily at
> >
> >	http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
>
> Unable to select IPV6. Menuconfig doesn't offer it when INET is selected.
> When it's not it appears in the menu, but after state change it gets away.
> The same behaviour in xconfig, gconfig.
>
> $ mkdir ../a/tst
> $ make O=../a/tst menuconfig
>   HOSTCC  scripts/basic/fixdep
> [...]
>   HOSTLD  scripts/kconfig/mconf
> scripts/kconfig/mconf arch/i386/Kconfig
> Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS
> OCFS2_FS INET
>
> Maybe this is the problem?

Yes, patch below.

> regards,

cu
Adrian

<--  snip  -->

This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM
and DLM depend on instead of select SYSFS.

Since SYSFS depends on EMBEDDED this change shouldn't cause any problems
for users.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:08 -05:00
Randy Dunlap 67f55897ee [GFS2/DLM] use sysfs
With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build
fails with:

WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined!
WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2

Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use
kernel_subsys, they should either DEPEND on it or SELECT it.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:05 -05:00
David Teigland ee32e4f3d3 [GFS2] make lock_dlm drop_count tunable in sysfs
We want to be able to change or disable the default drop_count (number at
which the dlm asks gfs to limit the the number of locks it's holding).
Add it to the collection of sysfs tunables for an fs.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:38:01 -05:00
David Teigland 2f708649ba [GFS2] increase default lock limit
Increase the number of locks at which point the dlm begins asking gfs to
reduce its lock usage.  The default value is largely arbitrary, but the
current value of 50,000 ends up limiting performance unnecessarily for too
many users.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:59 -05:00
Steven Whitehouse 8bd9572769 [GFS2] Fix list corruption in lops.c
The patch below appears to fix the list corruption that we are seeing on
occasion. Although the transaction structure is private to a single
thread, when the queued structures are dismantled during an in-core
commit, its possible for a different thread to be trying to add the same
structure to another, new, transaction at the same time.

To avoid this, this patch takes the log spinlock during this operation.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:56 -05:00
Steven Whitehouse d7c103d0bd [GFS2] Fix recursive locking attempt with NFS
In certain cases, its possible for NFS to call the lookup code while
holding the glock (when doing a readdirplus operation) so we need to
check for that and not try and lock the glock twice. This also fixes a
typo in a previous NFS related GFS2 patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:53 -05:00
David Teigland b790c3b7c3 [DLM] can miss clearing resend flag
A long, complicated sequence of events, beginning with the RESEND flag not
being cleared on an lkb, can result in an unlock never completing.

- lkb on waiters list for remote lookup
- the remote node is both the dir node and the master node, so
  it optimizes the lookup into a request and sends a request
  reply back
- the request reply is saved on the requestqueue to be processed
  after recovery
- recovery runs dlm_recover_waiters_pre() which sets RESEND flag
  so the lookup will be resent after recovery
- end of recovery: process_requestqueue takes saved request reply
  which removes the lkb off the waitesr list, _without_ clearing
  the RESEND flag
- end of recovery: dlm_recover_waiters_post() doesn't do anything
  with the now completed lookup lkb (would usually clear RESEND)
- later, the node unmounts, unlocks this lkb that still has RESEND
  flag set
- the lkb is on the waiters list again, now for unlock, when recovery
  occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND
  set, doesn't do anything since the master still exists
- end of recovery: dlm_recover_waiters_post() takes this lkb off
  the waiters list because it has the RESEND flag set, then reports
  an error because unlocks are never supposed to be handled in
  recover_waiters_post().
- later, the unlock reply is received, doesn't find the lkb on
  the waiters list because recover_waiters_post() has wrongly
  removed it.
- the unlock operation has been lost, and we're left with a
  stray granted lock
- unmount spins waiting for the unlock to complete

The visible evidence of this problem will be a node where gfs umount is
spinning, the dlm waiters list will be empty, and the dlm locks list will
show a granted lock.

The fix is simply to clear the RESEND flag when taking an lkb off the
waiters list.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:50 -05:00
David Teigland 8fd3a98f2c [DLM] saved dlm message can be dropped
dlm_receive_message() returns 0 instead of returning 'error'.  What would
happen is that process_requestqueue would take a saved message off the
requestqueue and call receive_message on it.  receive_message would then
see that recovery had been aborted, set error to EINTR, and 'goto out',
expecting that the error would be returned.  Instead, 0 was always
returned, so process_requestqueue would think that the message had been
processed and delete it instead of saving it to process next time.  This
means the message (usually an unlock in my tests) would be lost.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:47 -05:00
Patrick Caulfield f1f1c1ccf7 [DLM] Make sock_sem into a mutex
Now that there can be multiple dlm_recv threads running we need to prevent two
recvs running for the same connection - it's unlikely but it can happen and it
causes message corruption.

Signed-Off-By: Patrick Caulfield <pcaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:44 -05:00
Steven Whitehouse d043e1900c [GFS2] Fix typo in glock.c
This is a one letter typo fix in glock.c, spotted by Rob Kenna.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2007-02-05 13:37:41 -05:00