Commit Graph

147 Commits

Author SHA1 Message Date
Dave Chinner 7f8275d0d6 mm: add context argument to shrinker callback
The current shrinker implementation requires the registered callback
to have global state to work from. This makes it difficult to shrink
caches that are not global (e.g. per-filesystem caches). Pass the shrinker
structure to the callback so that users can embed the shrinker structure
in the context the shrinker needs to operate on and get back to it in the
callback via container_of().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-07-19 14:56:17 +10:00
Bob Peterson b7dc2df572 GFS2: recovery stuck on transaction lock
This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
transaction lock.  We set the frozen flag on the glock when we receive
a completion that cannot be delivered due to blocked locks. At that
point we check to see whether the first waiting holder has the noexp
flag set. If the noexp lock is queued later, then we need to unfreeze
the glock at that point in time, namely, in the glock work function.

This patch was originally written by Steve Whitehouse, but since
he's on holiday, I'm submitting it.  It's been well tested with a
complex recovery test called revolver.

Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2010-07-15 09:05:57 +01:00
Bob Peterson 1a0eae8848 GFS2: glock livelock
This patch fixes a couple gfs2 problems with the reclaiming of
unlinked dinodes.  First, there were a couple of livelocks where
everything would come to a halt waiting for a glock that was
seemingly held by a process that no longer existed.  In fact, the
process did exist, it just had the wrong pid number in the holder
information.  Second, there was a lock ordering problem between
inode locking and glock locking.  Third, glock/inode contention
could sometimes cause inodes to be improperly marked invalid by
iget_failed.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2010-04-14 16:48:05 +01:00
Bob Peterson 4818972efb GFS2: print glock numbers in hex
This patch changes glock numbers from printing in decimal to hex.
Since DLM prints corresponding resource IDs in hex, it makes debugging
easier.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-03-01 14:09:04 +00:00
Steven Whitehouse c1184f8ab7 GFS2: Remove loopy umount code
As a consequence of the previous patch, we can now remove the
loop which used to be required due to the circular dependency
between the inodes and glocks. Instead we can just invalidate
the inodes, and then clear up any glocks which are left.

Also we no longer need the rwsem since there is no longer any
danger of the inode invalidation calling back into the glock
code (and from there back into the inode code).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-03-01 14:07:53 +00:00
Steven Whitehouse 009d851837 GFS2: Metadata address space clean up
Since the start of GFS2, an "extra" inode has been used to store
the metadata belonging to each inode. The only reason for using
this inode was to have an extra address space, the other fields
were unused. This means that the memory usage was rather inefficient.

The reason for keeping each inode's metadata in a separate address
space is that when glocks are requested on remote nodes, we need to
be able to efficiently locate the data and metadata which relating
to that glock (inode) in order to sync or sync and invalidate it
(depending on the remotely requested lock mode).

This patch adds a new type of glock, which has in addition to
its normal fields, has an address space. This applies to all
inode and rgrp glocks (but to no other glock types which remain
as before). As a result, we no longer need to have the second
inode.

This results in three major improvements:
 1. A saving of approx 25% of memory used in caching inodes
 2. A removal of the circular dependency between inodes and glocks
 3. No confusion between "normal" and "metadata" inodes in super.c

Although the first of these is the more immediately apparent, the
second is just as important as it now enables a number of clean
ups at umount time. Those will be the subject of future patches.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-03-01 14:07:37 +00:00
Steven Whitehouse 8f05228ee7 GFS2: Extend umount wait coverage to full glock lifetime
Although all glocks are, by the time of the umount glock wait,
scheduled for demotion, some of them haven't made it far
enough through the process for the original set of waiting
code to wait for them.

This extends the ref count to the whole glock lifetime in order
to ensure that the waiting does catch all glocks. It does make
it a bit more invasive, but it seems the only sensible solution
at the moment.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2010-02-03 09:56:21 +00:00
Steven Whitehouse 26bb7505cf GFS2: Fix glock refcount issues
This patch fixes some ref counting issues. Firstly by moving
the point at which we drop the ref count after a dlm lock
operation has completed we ensure that we never call
gfs2_glock_hold() on a lock with a zero ref count.

Secondly, by using atomic_dec_and_lock() in gfs2_glock_put()
we ensure that at no time will a glock with zero ref count
appear on the lru_list. That means that we can remove the
check for this in our shrinker (which was racy).

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-12-03 12:00:12 +00:00
Steven Whitehouse 7e71c55ee7 GFS2: Fix potential race in glock code
We need to be careful of the ordering between clearing the
GLF_LOCK bit and scheduling the workqueue.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-12-03 11:42:25 +00:00
Benjamin Marzinski b94a170e96 GFS2: remove dcache entries for remote deleted inodes
When a file is deleted from a gfs2 filesystem on one node, a dcache
entry for it may still exist on other nodes in the cluster. If this
happens, gfs2 will be unable to free this file on disk. Because of this,
it's possible to have a gfs2 filesystem with no files on it and no free
space. With this patch, when a node receives a callback notifying it
that the file is being deleted on another node, it schedules a new
workqueue thread to remove the file's dcache entry.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30 11:01:03 +01:00
Benjamin Marzinski 8ff22a6f9b GFS2: Don't put unlikely reclaim candidates on the reclaim list.
GFS2 was placing far too many glocks on the reclaim list that were not good
candidates for freeing up from cache.  These locks would sit there and
repeatedly get scanned to see if they could be reclaimed, wasting a lot
of time when there was memory pressure. This fix does more checks on the
locks to see if they are actually likely to be removable from cache.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30 11:00:09 +01:00
Benjamin Marzinski a51b56fff3 GFS2: Fix panic in glock memory shrinker
It is possible for gfs2_shrink_glock_memory() to check a glock for
demotion
that's in the process of being freed by gfs2_glock_put().  In this case,
gfs2_shrink_glock_memory() will acquire a new reference to this glock,
and
then try to free the glock itself when it drops the refernce.  To solve
this, gfs2_shrink_glock_memory() just needs to check if the glock is in
the process of being freed, and if so skip it without ever unlocking the
lru_lock.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Acked-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30 10:59:28 +01:00
Steven Whitehouse 2163b1e616 GFS2: Shrink the shrinker
This patch removes some of the special cases that the shrinker
was trying to deal with. As a result we leave fewer items on
the list and none at all which cannot be demoted. This makes
the list scanning more efficient and solves some issues seen
with large numbers of inodes.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-07-30 10:52:14 +01:00
Steven Whitehouse 63997775b7 GFS2: Add tracepoints
This patch adds the ability to trace various aspects of the GFS2
filesystem. The trace points are divided into three groups,
glocks, logging and bmap. These points have been chosen because
they allow inspection of the major internal functions of GFS2
and they are also generic enough that they are unlikely to need
any major changes as the filesystem evolves.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-06-12 08:49:20 +01:00
Steven Whitehouse fe64d517df GFS2: Umount recovery race fix
This patch fixes a race condition where we can receive recovery
requests part way through processing a umount. This was causing
problems since the recovery thread had already gone away.

Looking in more detail at the recovery code, it was really trying
to implement a slight variation on a work queue, and that happens to
align nicely with the recently introduced slow-work subsystem. As a
result I've updated the code to use slow-work, rather than its own home
grown variety of work queue.

When using the wait_on_bit() function, I noticed that the wait function
that was supplied as an argument was appearing in the WCHAN field, so
I've updated the function names in order to produce more meaningful
output.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-19 10:01:18 +01:00
Steven Whitehouse 0c7a531a20 GFS2: Fix glock ref counting bug
Depending on the ordering of events as we go around the
glock shrinker loop, it is possible to drop the ref count
of a glock incorrectly. It doesn't happen very often. This
patch corrects the got_ref variable, fixing the problem.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-05-09 15:15:17 +01:00
Steven Whitehouse a228df6339 GFS2: Move umount flush rwsem
The rwsem, used only on umount, is in the wrong place in glock.c.
This patch moves it up a bit so that it does not get called under
a spinlock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-04-15 10:16:13 +01:00
Steven Whitehouse 64d576ba23 GFS2: Add a "demote a glock" interface to sysfs
This adds a sysfs file called demote_rq to GFS2's
per filesystem directory. Its possible to use this
file to demote arbitrary glocks in exactly the same
way as if a request had come in from a remote node.

This is intended for testing issues relating to caching
of data under glocks. Despite that, the interface is
generic enough to send requests to any type of glock,
but be careful as its not always safe to send an
arbitrary message to an arbitrary glock. For that reason
and to prevent DoS, this interface is restricted to root
only.

The messages look like this:

<type>:<glocknumber> <mode>

Example:

echo -n "2:13324 EX" >/sys/fs/gfs2/unity:myfs/demote_rq

Which means "please demote inode glock (type 2) number 13324 so that
I can get an EX (exclusive) lock". The lock modes are those which
would normally be sent by a remote node in its callback so if you
want to unlock a glock, you use EX, to demote to shared, use SH or PR
(depending on whether you like GFS2 or DLM lock modes better!).

If the glock doesn't exist, you'll get -ENOENT returned. If the
arguments don't make sense, you'll get -EINVAL returned.

The plan is that this interface will be used in combination with
the blktrace patch which I recently posted for comments although
it is, of course, still useful in its own right.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:22 +00:00
Steven Whitehouse d8348de06f GFS2: Fix deadlock on journal flush
This patch fixes a deadlock when the journal is flushed and there
are dirty inodes other than the one which caused the journal flush.
Originally the journal flushing code was trying to obtain the
transaction glock while running the flush code for an inode glock.
We no longer require the transaction glock at this point in time
since we know that any attempt to get the transaction glock from
another node will result in a journal flush. So if we are flushing
the journal, we can be sure that the transaction lock is still
cached from when the transaction was started.

By inlining a version of gfs2_trans_begin() (minus the bit which
gets the transaction glock) we can avoid the deadlock problems
caused if there is a demote request queued up on the transaction
glock.

In addition I've also moved the umount rwsem so that it covers
the glock workqueue, since it all demotions are done by this
workqueue now. That fixes a bug on umount which I came across
while fixing the original problem.

Reported-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:18 +00:00
Steven Whitehouse ac2425e7d3 GFS2: Remove unused field from glock
The time stamp field is unused in the glock now that we are
using a shrinker, so that we can remove it and save sizeof(unsigned long)
bytes in each glock.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:17 +00:00
Steven Whitehouse f057f6cdf6 GFS2: Merge lock_dlm module into GFS2
This is the big patch that I've been working on for some time
now. There are many reasons for wanting to make this change
such as:
 o Reducing overhead by eliminating duplicated fields between structures
 o Simplifcation of the code (reduces the code size by a fair bit)
 o The locking interface is now the DLM interface itself as proposed
   some time ago.
 o Fewer lookups of glocks when processing replies from the DLM
 o Fewer memory allocations/deallocations for each glock
 o Scope to do further optimisations in the future (but this patch is
   more than big enough for now!)

Please note that (a) this patch relates to the lock_dlm module and
not the DLM itself, that is still a separate module; and (b) that
we retain the ability to build GFS2 as a standalone single node
filesystem with out requiring the DLM.

This patch needs a lot of testing, hence my keeping it I restarted
my -git tree after the last merge window. That way, this has the maximum
exposure before its merged. This is (modulo a few minor bug fixes) the
same patch that I've been posting on and off the the last three months
and its passed a number of different tests so far.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-03-24 11:21:14 +00:00
Julia Lawall eb8374e71f GFS2: Use DEFINE_SPINLOCK
SPIN_LOCK_UNLOCKED is deprecated.  The following makes the change suggested
in Documentation/spinlocks.txt

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
declarer name DEFINE_SPINLOCK;
identifier xxx_lock;
@@

- spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED;
+ DEFINE_SPINLOCK(xxx_lock);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:45:02 +00:00
Steven Whitehouse fefc03bfed Revert "GFS2: Fix use-after-free bug on umount"
This reverts commit 78802499912f1ba31ce83a94c55b5a980f250a43.

The original patch is causing problems in relation to order of
operations at umount in relation to jdata files. I need to fix
this a different way.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:18 +00:00
Steven Whitehouse 3af165ac4d GFS2: Fix use-after-free bug on umount
There was a use-after-free with the GFS2 super block during
umount. This patch moves almost all of the umount code from
->put_super into ->kill_sb, the only bit that cannot be moved
being the glock hash clearing which has to remain as ->put_super
due to umount ordering requirements. As a result its now obvious
that the kfree is the final operation, whereas before it was
hidden in ->put_super.

Also gfs2_jindex_free is then only referenced from a single file
so thats moved and marked static too.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:14 +00:00
Steven Whitehouse 2bfb6449b7 GFS2: Move four functions from super.c
The functions which are being moved can all be marked
static in their new locations, since they only have
a single caller each. Their new locations are more
logical than before and some of the functions are
small enough that the compiler might well inline them.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:12 +00:00
Steven Whitehouse 97cc1025b1 GFS2: Kill two daemons with one patch
This patch removes the two daemons, gfs2_scand and gfs2_glockd
and replaces them with a shrinker which is called from the VM.

The net result is that GFS2 responds better when there is memory
pressure, since it shrinks the glock cache at the same rate
as the VFS shrinks the dcache and icache. There are no longer
any time based criteria for shrinking glocks, they are kept
until such time as the VM asks for more memory and then we
demote just as many glocks as required.

There are potential future changes to this code, including the
possibility of sorting the glocks which are to be written back
into inode number order, to get a better I/O ordering. It would
be very useful to have an elevator based workqueue implementation
for this, as that would automatically deal with the read I/O cases
at the same time.

This patch is my answer to Andrew Morton's remark, made during
the initial review of GFS2, asking why GFS2 needs so many kernel
threads, the answer being that it doesn't :-) This patch is a
net loss of about 200 lines of code.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:09 +00:00
Steven Whitehouse 813e0c46c9 GFS2: Fix "truncate in progress" hang
Following on from the recent clean up of gfs2_quotad, this patch moves
the processing of "truncate in progress" inodes from the glock workqueue
into gfs2_quotad. This fixes a hang due to the "truncate in progress"
processing requiring glocks in order to complete.

It might seem odd to use gfs2_quotad for this particular item, but
we have to use a pre-existing thread since creating a thread implies
a GFP_KERNEL memory allocation which is not allowed from the glock
workqueue context. Of the existing threads, gfs2_logd and gfs2_recoverd
may deadlock if used for this operation. gfs2_scand and gfs2_glockd are
both scheduled for removal at some (hopefully not too distant) future
point. That leaves only gfs2_quotad whose workload is generally fairly
light and is easily adapted for this extra task.

Also, as a result of this change, it opens the way for a future patch to
make the reading of the inode's information asynchronous with respect to
the glock workqueue, which is another improvement that has been on the list
for some time now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:39:06 +00:00
Harvey Harrison 55ba474dae GFS2: sparse annotation of gl->gl_spin
fs/gfs2/glock.c:308:5: warning: context problem in 'do_promote': '_spin_unlock' expected different context
fs/gfs2/glock.c:308:5:    context '*gl+28': wanted >= 1, got 0
fs/gfs2/glock.c:529:2: warning: context problem in 'do_xmote': '_spin_unlock' expected different context
fs/gfs2/glock.c:529:2:    context '*gl+28': wanted >= 1, got 0
fs/gfs2/glock.c:925:3: warning: context problem in 'add_to_queue': '_spin_unlock' expected different context
fs/gfs2/glock.c:925:3:    context '*gl+28': wanted >= 1, got 0

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-01-05 07:38:50 +00:00
Steven Whitehouse 719ee34467 GFS2: high time to take some time over atime
Until now, we've used the same scheme as GFS1 for atime. This has failed
since atime is a per vfsmnt flag, not a per fs flag and as such the
"noatime" flag was not getting passed down to the filesystems. This
patch removes all the "special casing" around atime updates and we
simply use the VFS's atime code.

The net result is that GFS2 will now support all the same atime related
mount options of any other filesystem on a per-vfsmnt basis. We do lose
the "lazy atime" updates, but we gain "relatime". We could add lazy
atime to the VFS at a later date, if there is a requirement for that
variant still - I suspect relatime will be enough.

Also we lose about 100 lines of code after this patch has been applied,
and I have a suspicion that it will speed things up a bit, even when
atime is "on". So it seems like a nice clean up as well.

From a user perspective, everything stays the same except the loss of
the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very
least, and to be honest I don't think anybody ever used it) and that a
number of options which were ignored before now work correctly.

Please let me know if you've got any comments. I'm pushing this out
early so that you can all see what my plans are.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-18 13:53:59 +01:00
Steven Whitehouse dff5257473 GFS2: Fix race relating to glock min-hold time
In the case that a request for a glock arrives right after the
grant reply has arrived, it sometimes means that the gl_tstamp
field hasn't been updated recently enough. The net result is that
the min-hold time for the glock is ignored. If this happens
often enough, it leads to poor performance.

This patch adds an additional test, so that if the reply pending
bit is set on a glock, then it will select the maximum length of
time for the min-hold time, rather than looking at gl_tstamp.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-09-05 14:18:02 +01:00
Steven Whitehouse c1e817d03a GFS2: Fix debugfs glock file iterator
Due to an incorrect iterator, some glocks were being missed from the
glock dumps obtained via debugfs. This patch fixes the problem and
ensures that we don't miss any glocks in future.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-08-13 09:59:10 +01:00
Steven Whitehouse 209806aba9 [GFS2] Allow local DF locks when holding a cached EX glock
We already allow local SH locks while we hold a cached EX glock, so here
we allow DF locks as well. This works only because we rely on the VFS's
invalidation for locally cached data, and because if we hold an EX lock,
then we know that no other node can be caching data relating to this
file.

It dramatically speeds up initial writes to O_DIRECT files since we fall
back to buffered I/O for this and would otherwise bounce between DF and
EX modes on each and every write call. The lessons to be learned from
that are to ensure that (for the time being anyway) O_DIRECT files are
preallocated and that they are written to using reasonably large I/O
sizes. Even so this change fixes that corner case nicely

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-07-07 10:07:28 +01:00
Steven Whitehouse 265d529cef [GFS2] Fix delayed demote race
There is a race in the delayed demote code where it does the wrong thing
if a demotion to UN has occurred for other reasons before the delay has
expired. This patch adds an assert to catch that condition as well as
fixing the root cause by adding an additional check for the UN state.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2008-07-07 10:02:36 +01:00
Steven Whitehouse 1bdad60633 [GFS2] Remove remote lock dropping code
There are several reasons why this is undesirable:

 1. It never happens during normal operation anyway
 2. If it does happen it causes performance to be very, very poor
 3. It isn't likely to solve the original problem (memory shortage
    on remote DLM node) it was supposed to solve
 4. It uses a bunch of arbitrary constants which are unlikely to be
    correct for any particular situation and for which the tuning seems
    to be a black art.
 5. In an N node cluster, only 1/N of the dropped locked will actually
    contribute to solving the problem on average.

So all in all we are better off without it. This also makes merging
the lock_dlm module into GFS2 a bit easier.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-06-27 09:39:44 +01:00
Steven Whitehouse 048bca2237 [GFS2] No lock_nolock
This patch merges the lock_nolock module into GFS2 itself. As well as removing
some of the overhead of the module, it also means that its now impossible to
build GFS2 without a lock module (which would be a pointless thing to do
anyway).

We also plan to merge lock_dlm into GFS2 in the future, but that is a more
tricky task, and will therefore be a separate patch.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Teigland <teigland@redhat.com>
2008-06-27 09:39:28 +01:00
Steven Whitehouse 6802e3400f [GFS2] Clean up the glock core
This patch implements a number of cleanups to the core of the
GFS2 glock code. As a result a lot of code is removed. It looks
like a really big change, but actually a large part of this patch
is either removing or moving existing code.

There are some new bits too though, such as the new run_queue()
function which is considerably streamlined. Highlights of this
patch include:

 o Fixes a cluster coherency bug during SH -> EX lock conversions
 o Removes the "glmutex" code in favour of a single bit lock
 o Removes the ->go_xmote_bh() for inodes since it was duplicating
   ->go_lock()
 o We now only use the ->lm_lock() function for both locks and
   unlocks (i.e. unlock is a lock with target mode LM_ST_UNLOCKED)
 o The fast path is considerably shortly, giving performance gains
   especially with lock_nolock
 o The glock_workqueue is now used for all the callbacks from the DLM
   which allows us to simplify the lock_dlm module (see following patch)
 o The way is now open to make further changes such as eliminating the two
   threads (gfs2_glockd and gfs2_scand) in favour of a more efficient
   scheme.

This patch has undergone extensive testing with various test suites
so it should be pretty stable by now.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2008-06-27 09:39:22 +01:00
Benjamin Marzinski 58e9fee13e [GFS2] Invalidate cache at correct point
GFS2 wasn't invalidating its cache before it called into the lock manager
with a request that could potentially drop a lock.  This was leaving a
window where the lock could be actually be held by another node, but the
file's page cache would still appear valid, causing coherency problems.
This patch moves the cache invalidation to before the lock manager call
when dropping a lock. It also adds the option to the lock_dlm lock
manager to not use conversion mode deadlock avoidance, which, on a
conversion from shared to exclusive, could internally drop the lock, and
then reacquire in. GFS2 now asks lock_dlm to not do this.  Instead, GFS2
manually drops the lock and reacquires it.

Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:41:44 +01:00
Steven Whitehouse 840ca0ec70 [GFS2] Fix bug where we called drop_bh incorrectly
As a result of an earlier patch, drop_bh was being called in cases
when it shouldn't have been. Since we never have a gh in the drop
case and we always have a gh in the promote case, we can use that
extra information to tell which case has been seen.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
2008-03-31 10:41:01 +01:00
Bob Peterson cf45b752c9 [GFS2] Remove rgrp and glock version numbers
This patch further reduces GFS2's memory requirements by
eliminating the 64-bit version number fields in lieu of
a couple bits.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:29 +01:00
Steven Whitehouse da755fdb41 [GFS2] Remove lm.[ch] and distribute content
The functions in lm.c were just wrappers which were mostly
only used in one other file. By moving the functions to
the files where they are being used, they can be marked
static and also this will usually result in them being inlined
since they are often only used from one point in the code.

A couple of really trivial functions have been inlined by hand
into the function which called them as it makes the code clearer
to do that.

We also gain from one fewer function call in the glock lock and
unlock paths.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:26 +01:00
Bob Peterson ab0d756681 [GFS2] Eliminate gl_req_bh
This patch further reduces the memory needs of GFS2 by
eliminating the gl_req_bh variable from struct gfs2_glock.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:23 +01:00
Bob Peterson 29d38cd163 [GFS2] Get rid of gl_waiters2
This patch reduces memory by replacing the int variable
gl_waiters2 by a single bit in the gl_flags.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:13 +01:00
Adrian Bunk 048786f1e6 [GFS2] make gfs2_glock_hold() static
gfs2_glock_hold() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:02 +01:00
Bob Peterson ef8c441cb7 [GFS2] Only wake the reclaim daemon if we need to
This patch only wakes up the glock reclaim daemon if there is
actually something to be reclaimed.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-03-31 10:40:00 +01:00
Pavel Emelyanov eccba06891 gfs2: make gfs2_glock.gl_owner_pid be a struct pid *
The gl_owner_pid field is used to get the lock owning task by its pid, so make
it in a proper manner, i.e.  by using the struct pid pointer and pid_task()
function.

The pid_task() becomes exported for the gfs2 module.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:06 -08:00
Pavel Emelyanov b1e058da50 gfs2: make gfs2_holder.gh_owner_pid be a struct pid *
The gl_owner_pid field is used to get the holder task by its pid and check
whether the current is a holder, so make it in a proper manner, i.e.  via the
struct pid * manipulations.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-07 08:42:06 -08:00
Bob Peterson 398bbe6832 [GFS2] Reorganize function gfs2_glmutex_lock
This patch optimizes the function gfs2_glmutex_lock.
The basic theory is: Why bother initializing a holder, setting up
wait bits and then waiting on them, if you know the glock can be
yours.  So the holder stuff is placed inside the if checking if the
glock is locked.  This one needs careful scrutiny because changing
anything to do with locking should strike terror into one's heart.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:13:52 +00:00
Fabio Massimo Di Nitto 1a2781cfa5 [GFS2] Fix runtime issue with UP kernels
The issue is indeed UP vs SMP and it is totally random.

spin_is_locked() is a bad assertion because there is no correct answer on UP.
on UP spin_is_locked() has to return either one value or another, always.

This means that in my setup I am lucky enough to trigger the issue and your you
are lucky enough not to.

the patch in attachment removes the bogus calls to BUG_ON and according to David
(in CC and thanks for the long explanation on the problem) we can rely upon
things like lockdep to find problem that might be trying to catch.

Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:08:06 +00:00
Steven Whitehouse 2bcd610d2f [GFS2] Don't add glocks to the journal
The only reason for adding glocks to the journal was to keep track
of which locks required a log flush prior to release. We add a
flag to the glock to allow this check to be made in a simpler way.

This reduces the size of a glock (by 12 bytes on i386, 24 on x86_64)
and means that we can avoid extra work during the journal flush.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:07:52 +00:00
Steven Whitehouse e589665eb9 [GFS2] Remove flags no longer required
The HIF_MUTEX and HIF_PROMOTE flags were set on the glock holders
depending upon which of the two waiters lists they were going to
be queued upon. They were then tested when the holders were taken
off the lists to ensure that the right type of holder was being
dequeued.

Since we are already using separate lists, there doesn't seem a
lot of point having these flags as well, and since setting them
and testing them is in the fast path for locking and unlocking
glock, this patch removes them.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2008-01-25 08:07:44 +00:00