linux_old1/fs/fscache/operation.c

540 lines
14 KiB
C
Raw Normal View History

FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
/* FS-Cache worker operation management routines
*
* Copyright (C) 2008 Red Hat, Inc. All Rights Reserved.
* Written by David Howells (dhowells@redhat.com)
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*
* See Documentation/filesystems/caching/operations.txt
*/
#define FSCACHE_DEBUG_LEVEL OPERATION
#include <linux/module.h>
#include <linux/seq_file.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
#include "internal.h"
atomic_t fscache_op_debug_id;
EXPORT_SYMBOL(fscache_op_debug_id);
/**
* fscache_enqueue_operation - Enqueue an operation for processing
* @op: The operation to enqueue
*
* Enqueue an operation for processing by the FS-Cache thread pool.
*
* This will get its own ref on the object.
*/
void fscache_enqueue_operation(struct fscache_operation *op)
{
_enter("{OBJ%x OP%x,%u}",
op->object->debug_id, op->debug_id, atomic_read(&op->usage));
ASSERT(list_empty(&op->pend_link));
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ASSERT(op->processor != NULL);
ASSERT(fscache_object_is_available(op->object));
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ASSERTCMP(atomic_read(&op->usage), >, 0);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_IN_PROGRESS);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
fscache_stat(&fscache_n_op_enqueue);
switch (op->flags & FSCACHE_OP_TYPE) {
case FSCACHE_OP_ASYNC:
_debug("queue async");
atomic_inc(&op->usage);
if (!queue_work(fscache_op_wq, &op->work))
fscache_put_operation(op);
break;
case FSCACHE_OP_MYTHREAD:
_debug("queue for caller's attention");
break;
default:
printk(KERN_ERR "FS-Cache: Unexpected op type %lx",
op->flags);
BUG();
break;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
}
}
EXPORT_SYMBOL(fscache_enqueue_operation);
/*
* start an op running
*/
static void fscache_run_op(struct fscache_object *object,
struct fscache_operation *op)
{
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
op->state = FSCACHE_OP_ST_IN_PROGRESS;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
object->n_in_progress++;
if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
if (op->processor)
fscache_enqueue_operation(op);
fscache_stat(&fscache_n_op_run);
}
/*
* submit an exclusive operation for an object
* - other ops are excluded from running simultaneously with this one
* - this gets any extra refs it needs on an op
*/
int fscache_submit_exclusive_op(struct fscache_object *object,
struct fscache_operation *op)
{
FS-Cache: Exclusive op submission can BUG if there's been an I/O error The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG if there's been an I/O error because it may see the parent cache object in an unexpected state. It should only BUG if there hasn't been an I/O error. In this case the problem was produced by remounting the cache partition to be R/O. The EROFS state was detected and the cache was aborted, but not everything handled the aborting correctly. SysRq : Emergency Remount R/O EXT4-fs (sda6): re-mounted. Opts: (null) Emergency Remount complete CacheFiles: I/O Error: Failed to update xattr with error -30 FS-Cache: Cache cachefiles stopped due to I/O error ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:128! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache] RSP: 0018:ffff880000853d40 EFLAGS: 00010206 RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0 RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280 RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268 R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162 FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040) Stack: ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308 Call Trace: [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache] [<ffffffff81047e44>] process_one_work+0x1eb/0x37f [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache] [<ffffffff810482e4>] worker_thread+0x15a/0x21a [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188 [<ffffffff8104bf96>] kthread+0x7f/0x87 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-05 21:34:48 +08:00
int ret;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
_enter("{OBJ%x OP%x},", object->debug_id, op->debug_id);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_INITIALISED);
ASSERTCMP(atomic_read(&op->usage), >, 0);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
spin_lock(&object->lock);
ASSERTCMP(object->n_ops, >=, object->n_in_progress);
ASSERTCMP(object->n_ops, >=, object->n_exclusive);
ASSERT(list_empty(&op->pend_link));
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_PENDING;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
if (fscache_object_is_active(object)) {
op->object = object;
object->n_ops++;
object->n_exclusive++; /* reads and writes must wait */
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
if (object->n_in_progress > 0) {
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
} else if (!list_empty(&object->pending_ops)) {
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
fscache_start_operations(object);
} else {
ASSERTCMP(object->n_in_progress, ==, 0);
fscache_run_op(object, op);
}
/* need to issue a new write op after this */
clear_bit(FSCACHE_OBJECT_PENDING_WRITE, &object->flags);
FS-Cache: Exclusive op submission can BUG if there's been an I/O error The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG if there's been an I/O error because it may see the parent cache object in an unexpected state. It should only BUG if there hasn't been an I/O error. In this case the problem was produced by remounting the cache partition to be R/O. The EROFS state was detected and the cache was aborted, but not everything handled the aborting correctly. SysRq : Emergency Remount R/O EXT4-fs (sda6): re-mounted. Opts: (null) Emergency Remount complete CacheFiles: I/O Error: Failed to update xattr with error -30 FS-Cache: Cache cachefiles stopped due to I/O error ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:128! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache] RSP: 0018:ffff880000853d40 EFLAGS: 00010206 RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0 RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280 RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268 R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162 FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040) Stack: ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308 Call Trace: [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache] [<ffffffff81047e44>] process_one_work+0x1eb/0x37f [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache] [<ffffffff810482e4>] worker_thread+0x15a/0x21a [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188 [<ffffffff8104bf96>] kthread+0x7f/0x87 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-05 21:34:48 +08:00
ret = 0;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
} else if (object->state == FSCACHE_OBJECT_CREATING) {
op->object = object;
object->n_ops++;
object->n_exclusive++; /* reads and writes must wait */
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
FS-Cache: Exclusive op submission can BUG if there's been an I/O error The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG if there's been an I/O error because it may see the parent cache object in an unexpected state. It should only BUG if there hasn't been an I/O error. In this case the problem was produced by remounting the cache partition to be R/O. The EROFS state was detected and the cache was aborted, but not everything handled the aborting correctly. SysRq : Emergency Remount R/O EXT4-fs (sda6): re-mounted. Opts: (null) Emergency Remount complete CacheFiles: I/O Error: Failed to update xattr with error -30 FS-Cache: Cache cachefiles stopped due to I/O error ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:128! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache] RSP: 0018:ffff880000853d40 EFLAGS: 00010206 RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0 RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280 RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268 R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162 FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040) Stack: ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308 Call Trace: [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache] [<ffffffff81047e44>] process_one_work+0x1eb/0x37f [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache] [<ffffffff810482e4>] worker_thread+0x15a/0x21a [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188 [<ffffffff8104bf96>] kthread+0x7f/0x87 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-05 21:34:48 +08:00
ret = 0;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
} else {
FS-Cache: Exclusive op submission can BUG if there's been an I/O error The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG if there's been an I/O error because it may see the parent cache object in an unexpected state. It should only BUG if there hasn't been an I/O error. In this case the problem was produced by remounting the cache partition to be R/O. The EROFS state was detected and the cache was aborted, but not everything handled the aborting correctly. SysRq : Emergency Remount R/O EXT4-fs (sda6): re-mounted. Opts: (null) Emergency Remount complete CacheFiles: I/O Error: Failed to update xattr with error -30 FS-Cache: Cache cachefiles stopped due to I/O error ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:128! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache] RSP: 0018:ffff880000853d40 EFLAGS: 00010206 RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0 RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280 RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268 R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162 FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040) Stack: ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308 Call Trace: [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache] [<ffffffff81047e44>] process_one_work+0x1eb/0x37f [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache] [<ffffffff810482e4>] worker_thread+0x15a/0x21a [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188 [<ffffffff8104bf96>] kthread+0x7f/0x87 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-05 21:34:48 +08:00
/* If we're in any other state, there must have been an I/O
* error of some nature.
*/
ASSERT(test_bit(FSCACHE_IOERROR, &object->cache->flags));
ret = -EIO;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
}
spin_unlock(&object->lock);
FS-Cache: Exclusive op submission can BUG if there's been an I/O error The function to submit an exclusive op (fscache_submit_exclusive_op()) can BUG if there's been an I/O error because it may see the parent cache object in an unexpected state. It should only BUG if there hasn't been an I/O error. In this case the problem was produced by remounting the cache partition to be R/O. The EROFS state was detected and the cache was aborted, but not everything handled the aborting correctly. SysRq : Emergency Remount R/O EXT4-fs (sda6): re-mounted. Opts: (null) Emergency Remount complete CacheFiles: I/O Error: Failed to update xattr with error -30 FS-Cache: Cache cachefiles stopped due to I/O error ------------[ cut here ]------------ kernel BUG at fs/fscache/operation.c:128! invalid opcode: 0000 [#1] SMP CPU 0 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 6612, comm: kworker/u:2 Not tainted 3.1.0-rc8-fsdevel+ #1093 /DG965RY RIP: 0010:[<ffffffffa00739c0>] [<ffffffffa00739c0>] fscache_submit_exclusive_op+0x2ad/0x2c2 [fscache] RSP: 0018:ffff880000853d40 EFLAGS: 00010206 RAX: ffff880038ac72a8 RBX: ffff8800181f2260 RCX: ffffffff81f2b2b0 RDX: 0000000000000001 RSI: ffffffff8179a478 RDI: ffff8800181f2280 RBP: ffff880000853d60 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff880038ac7268 R13: ffff8800181f2280 R14: ffff88003a359190 R15: 000000010122b162 FS: 0000000000000000(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000034cc4a77f0 CR3: 0000000010e96000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:2 (pid: 6612, threadinfo ffff880000852000, task ffff880014c3c040) Stack: ffff8800181f2260 ffff8800181f2310 ffff880038ac7268 ffff8800181f2260 ffff880000853dc0 ffffffffa0072375 ffff880037ecfe00 ffff88003a359198 ffff880000853dc0 0000000000000246 0000000000000000 ffff88000a91d308 Call Trace: [<ffffffffa0072375>] fscache_object_work_func+0x792/0xe65 [fscache] [<ffffffff81047e44>] process_one_work+0x1eb/0x37f [<ffffffff81047de6>] ? process_one_work+0x18d/0x37f [<ffffffffa0071be3>] ? fscache_enqueue_dependents+0xd8/0xd8 [fscache] [<ffffffff810482e4>] worker_thread+0x15a/0x21a [<ffffffff8104818a>] ? rescuer_thread+0x188/0x188 [<ffffffff8104bf96>] kthread+0x7f/0x87 [<ffffffff813ad6f4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abd1d>] ? retint_restore_args+0xe/0xe [<ffffffff8104bf17>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6f0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-05 21:34:48 +08:00
return ret;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
}
/*
* report an unexpected submission
*/
static void fscache_report_unexpected_submission(struct fscache_object *object,
struct fscache_operation *op,
unsigned long ostate)
{
static bool once_only;
struct fscache_operation *p;
unsigned n;
if (once_only)
return;
once_only = true;
kdebug("unexpected submission OP%x [OBJ%x %s]",
op->debug_id, object->debug_id,
fscache_object_states[object->state]);
kdebug("objstate=%s [%s]",
fscache_object_states[object->state],
fscache_object_states[ostate]);
kdebug("objflags=%lx", object->flags);
kdebug("objevent=%lx [%lx]", object->events, object->event_mask);
kdebug("ops=%u inp=%u exc=%u",
object->n_ops, object->n_in_progress, object->n_exclusive);
if (!list_empty(&object->pending_ops)) {
n = 0;
list_for_each_entry(p, &object->pending_ops, pend_link) {
ASSERTCMP(p->object, ==, object);
kdebug("%p %p", op->processor, op->release);
n++;
}
kdebug("n=%u", n);
}
dump_stack();
}
/*
* submit an operation for an object
* - objects may be submitted only in the following states:
* - during object creation (write ops may be submitted)
* - whilst the object is active
* - after an I/O error incurred in one of the two above states (op rejected)
* - this gets any extra refs it needs on an op
*/
int fscache_submit_op(struct fscache_object *object,
struct fscache_operation *op)
{
unsigned long ostate;
int ret;
_enter("{OBJ%x OP%x},{%u}",
object->debug_id, op->debug_id, atomic_read(&op->usage));
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_INITIALISED);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ASSERTCMP(atomic_read(&op->usage), >, 0);
spin_lock(&object->lock);
ASSERTCMP(object->n_ops, >=, object->n_in_progress);
ASSERTCMP(object->n_ops, >=, object->n_exclusive);
ASSERT(list_empty(&op->pend_link));
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ostate = object->state;
smp_rmb();
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_PENDING;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
if (fscache_object_is_active(object)) {
op->object = object;
object->n_ops++;
if (object->n_exclusive > 0) {
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
} else if (!list_empty(&object->pending_ops)) {
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
fscache_start_operations(object);
} else {
ASSERTCMP(object->n_exclusive, ==, 0);
fscache_run_op(object, op);
}
ret = 0;
} else if (object->state == FSCACHE_OBJECT_CREATING) {
op->object = object;
object->n_ops++;
atomic_inc(&op->usage);
list_add_tail(&op->pend_link, &object->pending_ops);
fscache_stat(&fscache_n_op_pend);
ret = 0;
FS-Cache: Handle read request vs lookup, creation or other cache failure FS-Cache doesn't correctly handle the netfs requesting a read from the cache on an object that failed or was withdrawn by the cache. A trace similar to the following might be seen: CacheFiles: Lookup failed error -105 [exe ] unexpected submission OP165afe [OBJ6cac OBJECT_LC_DYING] [exe ] objstate=OBJECT_LC_DYING [OBJECT_LC_DYING] [exe ] objflags=0 [exe ] objevent=9 [fffffffffffffffb] [exe ] ops=0 inp=0 exc=0 Pid: 6970, comm: exe Not tainted 2.6.32-rc6-cachefs #50 Call Trace: [<ffffffffa0076477>] fscache_submit_op+0x3ff/0x45a [fscache] [<ffffffffa0077997>] __fscache_read_or_alloc_pages+0x187/0x3c4 [fscache] [<ffffffffa00b6480>] ? nfs_readpage_from_fscache_complete+0x0/0x66 [nfs] [<ffffffffa00b6388>] __nfs_readpages_from_fscache+0x7e/0x176 [nfs] [<ffffffff8108e483>] ? __alloc_pages_nodemask+0x11c/0x5cf [<ffffffffa009d796>] nfs_readpages+0x114/0x1d7 [nfs] [<ffffffff81090314>] __do_page_cache_readahead+0x15f/0x1ec [<ffffffff81090228>] ? __do_page_cache_readahead+0x73/0x1ec [<ffffffff810903bd>] ra_submit+0x1c/0x20 [<ffffffff810906bb>] ondemand_readahead+0x227/0x23a [<ffffffff81090762>] page_cache_sync_readahead+0x17/0x19 [<ffffffff8108a99e>] generic_file_aio_read+0x236/0x5a0 [<ffffffffa00937bd>] nfs_file_read+0xe4/0xf3 [nfs] [<ffffffff810b2fa2>] do_sync_read+0xe3/0x120 [<ffffffff81354cc3>] ? _spin_unlock_irq+0x2b/0x31 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff811848e5>] ? selinux_file_permission+0x5d/0x10f [<ffffffff81352bdb>] ? thread_return+0x3e/0x101 [<ffffffff8117d7b0>] ? security_file_permission+0x11/0x13 [<ffffffff810b3b06>] vfs_read+0xaa/0x16f [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130 [<ffffffff810b3c84>] sys_read+0x45/0x6c [<ffffffff8100ae2b>] system_call_fastpath+0x16/0x1b The object state might also be OBJECT_DYING or OBJECT_WITHDRAWING. This should be handled by simply rejecting the new operation with ENOBUFS. There's no need to log an error for it. Events of this type now appear in the stats file under Ops:rej. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-20 02:11:32 +08:00
} else if (object->state == FSCACHE_OBJECT_DYING ||
object->state == FSCACHE_OBJECT_LC_DYING ||
object->state == FSCACHE_OBJECT_WITHDRAWING) {
fscache_stat(&fscache_n_op_rejected);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_CANCELLED;
FS-Cache: Handle read request vs lookup, creation or other cache failure FS-Cache doesn't correctly handle the netfs requesting a read from the cache on an object that failed or was withdrawn by the cache. A trace similar to the following might be seen: CacheFiles: Lookup failed error -105 [exe ] unexpected submission OP165afe [OBJ6cac OBJECT_LC_DYING] [exe ] objstate=OBJECT_LC_DYING [OBJECT_LC_DYING] [exe ] objflags=0 [exe ] objevent=9 [fffffffffffffffb] [exe ] ops=0 inp=0 exc=0 Pid: 6970, comm: exe Not tainted 2.6.32-rc6-cachefs #50 Call Trace: [<ffffffffa0076477>] fscache_submit_op+0x3ff/0x45a [fscache] [<ffffffffa0077997>] __fscache_read_or_alloc_pages+0x187/0x3c4 [fscache] [<ffffffffa00b6480>] ? nfs_readpage_from_fscache_complete+0x0/0x66 [nfs] [<ffffffffa00b6388>] __nfs_readpages_from_fscache+0x7e/0x176 [nfs] [<ffffffff8108e483>] ? __alloc_pages_nodemask+0x11c/0x5cf [<ffffffffa009d796>] nfs_readpages+0x114/0x1d7 [nfs] [<ffffffff81090314>] __do_page_cache_readahead+0x15f/0x1ec [<ffffffff81090228>] ? __do_page_cache_readahead+0x73/0x1ec [<ffffffff810903bd>] ra_submit+0x1c/0x20 [<ffffffff810906bb>] ondemand_readahead+0x227/0x23a [<ffffffff81090762>] page_cache_sync_readahead+0x17/0x19 [<ffffffff8108a99e>] generic_file_aio_read+0x236/0x5a0 [<ffffffffa00937bd>] nfs_file_read+0xe4/0xf3 [nfs] [<ffffffff810b2fa2>] do_sync_read+0xe3/0x120 [<ffffffff81354cc3>] ? _spin_unlock_irq+0x2b/0x31 [<ffffffff8104c0f1>] ? autoremove_wake_function+0x0/0x34 [<ffffffff811848e5>] ? selinux_file_permission+0x5d/0x10f [<ffffffff81352bdb>] ? thread_return+0x3e/0x101 [<ffffffff8117d7b0>] ? security_file_permission+0x11/0x13 [<ffffffff810b3b06>] vfs_read+0xaa/0x16f [<ffffffff81058df0>] ? trace_hardirqs_on_caller+0x10c/0x130 [<ffffffff810b3c84>] sys_read+0x45/0x6c [<ffffffff8100ae2b>] system_call_fastpath+0x16/0x1b The object state might also be OBJECT_DYING or OBJECT_WITHDRAWING. This should be handled by simply rejecting the new operation with ENOBUFS. There's no need to log an error for it. Events of this type now appear in the stats file under Ops:rej. Signed-off-by: David Howells <dhowells@redhat.com>
2009-11-20 02:11:32 +08:00
ret = -ENOBUFS;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
} else if (!test_bit(FSCACHE_IOERROR, &object->cache->flags)) {
fscache_report_unexpected_submission(object, op, ostate);
ASSERT(!fscache_object_is_active(object));
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_CANCELLED;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ret = -ENOBUFS;
} else {
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_CANCELLED;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ret = -ENOBUFS;
}
spin_unlock(&object->lock);
return ret;
}
/*
* queue an object for withdrawal on error, aborting all following asynchronous
* operations
*/
void fscache_abort_object(struct fscache_object *object)
{
_enter("{OBJ%x}", object->debug_id);
fscache_raise_event(object, FSCACHE_OBJECT_EV_ERROR);
}
/*
* jump start the operation processing on an object
* - caller must hold object->lock
*/
void fscache_start_operations(struct fscache_object *object)
{
struct fscache_operation *op;
bool stop = false;
while (!list_empty(&object->pending_ops) && !stop) {
op = list_entry(object->pending_ops.next,
struct fscache_operation, pend_link);
if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags)) {
if (object->n_in_progress > 0)
break;
stop = true;
}
list_del_init(&op->pend_link);
fscache_run_op(object, op);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
/* the pending queue was holding a ref on the object */
fscache_put_operation(op);
}
ASSERTCMP(object->n_in_progress, <=, object->n_ops);
_debug("woke %d ops on OBJ%x",
object->n_in_progress, object->debug_id);
}
/*
* cancel an operation that's pending on an object
*/
FS-Cache: Clear remaining page count on retrieval cancellation Provide fscache_cancel_op() with a pointer to a function it should invoke under lock if it cancels an operation. Use this to clear the remaining page count upon cancellation of a pending retrieval operation so that fscache_release_retrieval_op() doesn't get an assertion failure (see below). This can happen when a signal occurs, say from CTRL-C being pressed during data retrieval. FS-Cache: Assertion failed 3 == 0 is false ------------[ cut here ]------------ kernel BUG at fs/fscache/page.c:237! invalid opcode: 0000 [#641] SMP Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F) CPU 0 Pid: 6075, comm: slurp-q Tainted: GF D 3.7.0-rc8-fsdevel+ #411 /DG965RY RIP: 0010:[<ffffffffa007f328>] [<ffffffffa007f328>] fscache_release_retrieval_op+0x75/0xff [fscache] RSP: 0000:ffff88001c6d7988 EFLAGS: 00010296 RAX: 000000000000000f RBX: ffff880014cdfe00 RCX: ffffffff6c102000 RDX: ffffffff8102d1ad RSI: ffffffff6c102000 RDI: ffffffff8102d1d6 RBP: ffff88001c6d7998 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffe00 R13: ffff88001c6d7ab4 R14: ffff88001a8638a0 R15: ffff88001552b190 FS: 00007f877aaf0700(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007fff11378fd2 CR3: 000000001c6c6000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process slurp-q (pid: 6075, threadinfo ffff88001c6d6000, task ffff88001c6c4080) Stack: ffffffffa007ec07 ffff880014cdfe00 ffff88001c6d79c8 ffffffffa007db4d ffffffffa007ec07 ffff880014cdfe00 00000000fffffe00 ffff88001c6d7ab4 ffff88001c6d7a38 ffffffffa008116d 0000000000000000 ffff88001c6c4080 Call Trace: [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache] [<ffffffffa007db4d>] fscache_put_operation+0x135/0x2ed [fscache] [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache] [<ffffffffa008116d>] __fscache_read_or_alloc_pages+0x413/0x4bc [fscache] [<ffffffff810ac8ae>] ? __alloc_pages_nodemask+0x195/0x75c [<ffffffffa00aab0f>] __nfs_readpages_from_fscache+0x86/0x13d [nfs] [<ffffffffa00a5fe0>] nfs_readpages+0x186/0x1bd [nfs] [<ffffffff810d23c8>] ? alloc_pages_current+0xc7/0xe4 [<ffffffff810a68b5>] ? __page_cache_alloc+0x84/0x91 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0 [<ffffffff810afaa3>] __do_page_cache_readahead+0x237/0x2e0 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0 [<ffffffff810afe3e>] ra_submit+0x1c/0x20 [<ffffffff810b019b>] ondemand_readahead+0x359/0x382 [<ffffffff810b0279>] page_cache_sync_readahead+0x38/0x3a [<ffffffff810a77b5>] generic_file_aio_read+0x26b/0x637 [<ffffffffa00f1852>] ? nfs_mark_delegation_referenced+0xb/0xb [nfsv4] [<ffffffffa009cc85>] nfs_file_read+0xaa/0xcf [nfs] [<ffffffff810db5b3>] do_sync_read+0x91/0xd1 [<ffffffff810dbb8b>] vfs_read+0x9b/0x144 [<ffffffff810dbc78>] sys_read+0x44/0x75 [<ffffffff81422892>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-14 19:02:22 +08:00
int fscache_cancel_op(struct fscache_operation *op,
void (*do_cancel)(struct fscache_operation *))
{
struct fscache_object *object = op->object;
int ret;
_enter("OBJ%x OP%x}", op->object->debug_id, op->debug_id);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, >=, FSCACHE_OP_ST_PENDING);
ASSERTCMP(op->state, !=, FSCACHE_OP_ST_CANCELLED);
ASSERTCMP(atomic_read(&op->usage), >, 0);
spin_lock(&object->lock);
ret = -EBUSY;
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
if (op->state == FSCACHE_OP_ST_PENDING) {
ASSERT(!list_empty(&op->pend_link));
fscache_stat(&fscache_n_op_cancelled);
list_del_init(&op->pend_link);
FS-Cache: Clear remaining page count on retrieval cancellation Provide fscache_cancel_op() with a pointer to a function it should invoke under lock if it cancels an operation. Use this to clear the remaining page count upon cancellation of a pending retrieval operation so that fscache_release_retrieval_op() doesn't get an assertion failure (see below). This can happen when a signal occurs, say from CTRL-C being pressed during data retrieval. FS-Cache: Assertion failed 3 == 0 is false ------------[ cut here ]------------ kernel BUG at fs/fscache/page.c:237! invalid opcode: 0000 [#641] SMP Modules linked in: cachefiles(F) nfsv4(F) nfsv3(F) nfsv2(F) nfs(F) fscache(F) auth_rpcgss(F) nfs_acl(F) lockd(F) sunrpc(F) CPU 0 Pid: 6075, comm: slurp-q Tainted: GF D 3.7.0-rc8-fsdevel+ #411 /DG965RY RIP: 0010:[<ffffffffa007f328>] [<ffffffffa007f328>] fscache_release_retrieval_op+0x75/0xff [fscache] RSP: 0000:ffff88001c6d7988 EFLAGS: 00010296 RAX: 000000000000000f RBX: ffff880014cdfe00 RCX: ffffffff6c102000 RDX: ffffffff8102d1ad RSI: ffffffff6c102000 RDI: ffffffff8102d1d6 RBP: ffff88001c6d7998 R08: 0000000000000002 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffe00 R13: ffff88001c6d7ab4 R14: ffff88001a8638a0 R15: ffff88001552b190 FS: 00007f877aaf0700(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007fff11378fd2 CR3: 000000001c6c6000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process slurp-q (pid: 6075, threadinfo ffff88001c6d6000, task ffff88001c6c4080) Stack: ffffffffa007ec07 ffff880014cdfe00 ffff88001c6d79c8 ffffffffa007db4d ffffffffa007ec07 ffff880014cdfe00 00000000fffffe00 ffff88001c6d7ab4 ffff88001c6d7a38 ffffffffa008116d 0000000000000000 ffff88001c6c4080 Call Trace: [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache] [<ffffffffa007db4d>] fscache_put_operation+0x135/0x2ed [fscache] [<ffffffffa007ec07>] ? fscache_cancel_op+0x194/0x1cf [fscache] [<ffffffffa008116d>] __fscache_read_or_alloc_pages+0x413/0x4bc [fscache] [<ffffffff810ac8ae>] ? __alloc_pages_nodemask+0x195/0x75c [<ffffffffa00aab0f>] __nfs_readpages_from_fscache+0x86/0x13d [nfs] [<ffffffffa00a5fe0>] nfs_readpages+0x186/0x1bd [nfs] [<ffffffff810d23c8>] ? alloc_pages_current+0xc7/0xe4 [<ffffffff810a68b5>] ? __page_cache_alloc+0x84/0x91 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0 [<ffffffff810afaa3>] __do_page_cache_readahead+0x237/0x2e0 [<ffffffff810af912>] ? __do_page_cache_readahead+0xa6/0x2e0 [<ffffffff810afe3e>] ra_submit+0x1c/0x20 [<ffffffff810b019b>] ondemand_readahead+0x359/0x382 [<ffffffff810b0279>] page_cache_sync_readahead+0x38/0x3a [<ffffffff810a77b5>] generic_file_aio_read+0x26b/0x637 [<ffffffffa00f1852>] ? nfs_mark_delegation_referenced+0xb/0xb [nfsv4] [<ffffffffa009cc85>] nfs_file_read+0xaa/0xcf [nfs] [<ffffffff810db5b3>] do_sync_read+0x91/0xd1 [<ffffffff810dbb8b>] vfs_read+0x9b/0x144 [<ffffffff810dbc78>] sys_read+0x44/0x75 [<ffffffff81422892>] system_call_fastpath+0x16/0x1b Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-14 19:02:22 +08:00
if (do_cancel)
do_cancel(op);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
op->state = FSCACHE_OP_ST_CANCELLED;
if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
object->n_exclusive--;
if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
fscache_put_operation(op);
ret = 0;
}
spin_unlock(&object->lock);
_leave(" = %d", ret);
return ret;
}
/*
* Cancel all pending operations on an object
*/
void fscache_cancel_all_ops(struct fscache_object *object)
{
struct fscache_operation *op;
_enter("OBJ%x", object->debug_id);
spin_lock(&object->lock);
while (!list_empty(&object->pending_ops)) {
op = list_entry(object->pending_ops.next,
struct fscache_operation, pend_link);
fscache_stat(&fscache_n_op_cancelled);
list_del_init(&op->pend_link);
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_PENDING);
op->state = FSCACHE_OP_ST_CANCELLED;
if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
object->n_exclusive--;
if (test_and_clear_bit(FSCACHE_OP_WAITING, &op->flags))
wake_up_bit(&op->flags, FSCACHE_OP_WAITING);
fscache_put_operation(op);
cond_resched_lock(&object->lock);
}
spin_unlock(&object->lock);
_leave("");
}
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
/*
* Record the completion or cancellation of an in-progress operation.
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
*/
void fscache_op_complete(struct fscache_operation *op, bool cancelled)
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
{
struct fscache_object *object = op->object;
_enter("OBJ%x", object->debug_id);
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_IN_PROGRESS);
ASSERTCMP(object->n_in_progress, >, 0);
ASSERTIFCMP(test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags),
object->n_exclusive, >, 0);
ASSERTIFCMP(test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags),
object->n_in_progress, ==, 1);
spin_lock(&object->lock);
op->state = cancelled ?
FSCACHE_OP_ST_CANCELLED : FSCACHE_OP_ST_COMPLETE;
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
if (test_bit(FSCACHE_OP_EXCLUSIVE, &op->flags))
object->n_exclusive--;
object->n_in_progress--;
if (object->n_in_progress == 0)
fscache_start_operations(object);
spin_unlock(&object->lock);
_leave("");
}
EXPORT_SYMBOL(fscache_op_complete);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
/*
* release an operation
* - queues pending ops if this is the last in-progress op
*/
void fscache_put_operation(struct fscache_operation *op)
{
struct fscache_object *object;
struct fscache_cache *cache;
_enter("{OBJ%x OP%x,%d}",
op->object->debug_id, op->debug_id, atomic_read(&op->usage));
ASSERTCMP(atomic_read(&op->usage), >, 0);
if (!atomic_dec_and_test(&op->usage))
return;
_debug("PUT OP");
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTIFCMP(op->state != FSCACHE_OP_ST_COMPLETE,
op->state, ==, FSCACHE_OP_ST_CANCELLED);
op->state = FSCACHE_OP_ST_DEAD;
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
fscache_stat(&fscache_n_op_release);
if (op->release) {
op->release(op);
op->release = NULL;
}
object = op->object;
if (test_bit(FSCACHE_OP_DEC_READ_CNT, &op->flags)) {
if (atomic_dec_and_test(&object->n_reads)) {
clear_bit(FSCACHE_COOKIE_WAITING_ON_READS,
&object->cookie->flags);
wake_up_bit(&object->cookie->flags,
FSCACHE_COOKIE_WAITING_ON_READS);
}
}
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
/* now... we may get called with the object spinlock held, so we
* complete the cleanup here only if we can immediately acquire the
* lock, and defer it otherwise */
if (!spin_trylock(&object->lock)) {
_debug("defer put");
fscache_stat(&fscache_n_op_deferred_release);
cache = object->cache;
spin_lock(&cache->op_gc_list_lock);
list_add_tail(&op->pend_link, &cache->op_gc_list);
spin_unlock(&cache->op_gc_list_lock);
schedule_work(&cache->op_gc);
_leave(" [defer]");
return;
}
ASSERTCMP(object->n_ops, >, 0);
object->n_ops--;
if (object->n_ops == 0)
fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
spin_unlock(&object->lock);
kfree(op);
_leave(" [done]");
}
EXPORT_SYMBOL(fscache_put_operation);
/*
* garbage collect operations that have had their release deferred
*/
void fscache_operation_gc(struct work_struct *work)
{
struct fscache_operation *op;
struct fscache_object *object;
struct fscache_cache *cache =
container_of(work, struct fscache_cache, op_gc);
int count = 0;
_enter("");
do {
spin_lock(&cache->op_gc_list_lock);
if (list_empty(&cache->op_gc_list)) {
spin_unlock(&cache->op_gc_list_lock);
break;
}
op = list_entry(cache->op_gc_list.next,
struct fscache_operation, pend_link);
list_del(&op->pend_link);
spin_unlock(&cache->op_gc_list_lock);
object = op->object;
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
spin_lock(&object->lock);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
_debug("GC DEFERRED REL OBJ%x OP%x",
object->debug_id, op->debug_id);
fscache_stat(&fscache_n_op_gc);
ASSERTCMP(atomic_read(&op->usage), ==, 0);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
ASSERTCMP(op->state, ==, FSCACHE_OP_ST_DEAD);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
ASSERTCMP(object->n_ops, >, 0);
object->n_ops--;
if (object->n_ops == 0)
fscache_raise_event(object, FSCACHE_OBJECT_EV_CLEARED);
spin_unlock(&object->lock);
FS-Cache: Fix operation state management and accounting Fix the state management of internal fscache operations and the accounting of what operations are in what states. This is done by: (1) Give struct fscache_operation a enum variable that directly represents the state it's currently in, rather than spreading this knowledge over a bunch of flags, who's processing the operation at the moment and whether it is queued or not. This makes it easier to write assertions to check the state at various points and to prevent invalid state transitions. (2) Add an 'operation complete' state and supply a function to indicate the completion of an operation (fscache_op_complete()) and make things call it. The final call to fscache_put_operation() can then check that an op in the appropriate state (complete or cancelled). (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better govern the state of an object: (a) The ->n_ops is now the number of extant operations on the object and is now decremented by fscache_put_operation() only. (b) The ->n_in_progress is simply the number of objects that have been taken off of the object's pending queue for the purposes of being run. This is decremented by fscache_op_complete() only. (c) The ->n_exclusive is the number of exclusive ops that have been submitted and queued or are in progress. It is decremented by fscache_op_complete() and by fscache_cancel_op(). fscache_put_operation() and fscache_operation_gc() now no longer try to clean up ->n_exclusive and ->n_in_progress. That was leading to double decrements against fscache_cancel_op(). fscache_cancel_op() now no longer decrements ->n_ops. That was leading to double decrements against fscache_put_operation(). fscache_submit_exclusive_op() now decides whether it has to queue an op based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter will persist in being true even after all preceding operations have been cancelled or completed. Furthermore, if an object is active and there are runnable ops against it, there must be at least one op running. (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and provide a function to record completion of the pages as they complete. When n_pages reaches 0, the operation is deemed to be complete and fscache_op_complete() is called. Add calls to fscache_retrieval_complete() anywhere we've finished with a page we've been given to read or allocate for. This includes places where we just return pages to the netfs for reading from the server and where accessing the cache fails and we discard the proposed netfs page. The bugs in the unfixed state management manifest themselves as oopses like the following where the operation completion gets out of sync with return of the cookie by the netfs. This is possible because the cache unlocks and returns all the netfs pages before recording its completion - which means that there's nothing to stop the netfs discarding them and returning the cookie. FS-Cache: Cookie 'NFS.fh' still has outstanding reads ------------[ cut here ]------------ kernel BUG at fs/fscache/cookie.c:519! invalid opcode: 0000 [#1] SMP CPU 1 Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090 /DG965RY RIP: 0010:[<ffffffffa007050a>] [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache] RSP: 0018:ffff8800368cfb00 EFLAGS: 00010282 RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000 RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000 R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98 R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370 FS: 0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040) Stack: ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0 ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0 ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91 Call Trace: [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs] [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs] [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs] [<ffffffff810d8d47>] evict+0xa1/0x15c [<ffffffff810d8e2e>] dispose_list+0x2c/0x38 [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b [<ffffffff810c56b7>] prune_super+0xd5/0x140 [<ffffffff8109b615>] shrink_slab+0x102/0x1ab [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595 [<ffffffff8103e009>] ? process_timeout+0xb/0xb [<ffffffff8109dba3>] kswapd+0x270/0x289 [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46 [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595 [<ffffffff8104bf7a>] kthread+0x7f/0x87 [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10 [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0 [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53 [<ffffffff813ad6b0>] ? gs_change+0xb/0xb Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-21 05:52:35 +08:00
kfree(op);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
} while (count++ < 20);
if (!list_empty(&cache->op_gc_list))
schedule_work(&cache->op_gc);
_leave("");
}
/*
* execute an operation using fs_op_wq to provide processing context -
* the caller holds a ref to this object, so we don't need to hold one
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
*/
void fscache_op_work_func(struct work_struct *work)
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
{
struct fscache_operation *op =
container_of(work, struct fscache_operation, work);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
unsigned long start;
_enter("{OBJ%x OP%x,%d}",
op->object->debug_id, op->debug_id, atomic_read(&op->usage));
ASSERT(op->processor != NULL);
start = jiffies;
op->processor(op);
fscache_hist(fscache_ops_histogram, start);
fscache_put_operation(op);
FS-Cache: Add and document asynchronous operation handling Add and document asynchronous operation handling for use by FS-Cache's data storage and retrieval routines. The following documentation is added to: Documentation/filesystems/caching/operations.txt ================================ ASYNCHRONOUS OPERATIONS HANDLING ================================ ======== OVERVIEW ======== FS-Cache has an asynchronous operations handling facility that it uses for its data storage and retrieval routines. Its operations are represented by fscache_operation structs, though these are usually embedded into some other structure. This facility is available to and expected to be be used by the cache backends, and FS-Cache will create operations and pass them off to the appropriate cache backend for completion. To make use of this facility, <linux/fscache-cache.h> should be #included. =============================== OPERATION RECORD INITIALISATION =============================== An operation is recorded in an fscache_operation struct: struct fscache_operation { union { struct work_struct fast_work; struct slow_work slow_work; }; unsigned long flags; fscache_operation_processor_t processor; ... }; Someone wanting to issue an operation should allocate something with this struct embedded in it. They should initialise it by calling: void fscache_operation_init(struct fscache_operation *op, fscache_operation_release_t release); with the operation to be initialised and the release function to use. The op->flags parameter should be set to indicate the CPU time provision and the exclusivity (see the Parameters section). The op->fast_work, op->slow_work and op->processor flags should be set as appropriate for the CPU time provision (see the Parameters section). FSCACHE_OP_WAITING may be set in op->flags prior to each submission of the operation and waited for afterwards. ========== PARAMETERS ========== There are a number of parameters that can be set in the operation record's flag parameter. There are three options for the provision of CPU time in these operations: (1) The operation may be done synchronously (FSCACHE_OP_MYTHREAD). A thread may decide it wants to handle an operation itself without deferring it to another thread. This is, for example, used in read operations for calling readpages() on the backing filesystem in CacheFiles. Although readpages() does an asynchronous data fetch, the determination of whether pages exist is done synchronously - and the netfs does not proceed until this has been determined. If this option is to be used, FSCACHE_OP_WAITING must be set in op->flags before submitting the operation, and the operating thread must wait for it to be cleared before proceeding: wait_on_bit(&op->flags, FSCACHE_OP_WAITING, fscache_wait_bit, TASK_UNINTERRUPTIBLE); (2) The operation may be fast asynchronous (FSCACHE_OP_FAST), in which case it will be given to keventd to process. Such an operation is not permitted to sleep on I/O. This is, for example, used by CacheFiles to copy data from a backing fs page to a netfs page after the backing fs has read the page in. If this option is used, op->fast_work and op->processor must be initialised before submitting the operation: INIT_WORK(&op->fast_work, do_some_work); (3) The operation may be slow asynchronous (FSCACHE_OP_SLOW), in which case it will be given to the slow work facility to process. Such an operation is permitted to sleep on I/O. This is, for example, used by FS-Cache to handle background writes of pages that have just been fetched from a remote server. If this option is used, op->slow_work and op->processor must be initialised before submitting the operation: fscache_operation_init_slow(op, processor) Furthermore, operations may be one of two types: (1) Exclusive (FSCACHE_OP_EXCLUSIVE). Operations of this type may not run in conjunction with any other operation on the object being operated upon. An example of this is the attribute change operation, in which the file being written to may need truncation. (2) Shareable. Operations of this type may be running simultaneously. It's up to the operation implementation to prevent interference between other operations running at the same time. ========= PROCEDURE ========= Operations are used through the following procedure: (1) The submitting thread must allocate the operation and initialise it itself. Normally this would be part of a more specific structure with the generic op embedded within. (2) The submitting thread must then submit the operation for processing using one of the following two functions: int fscache_submit_op(struct fscache_object *object, struct fscache_operation *op); int fscache_submit_exclusive_op(struct fscache_object *object, struct fscache_operation *op); The first function should be used to submit non-exclusive ops and the second to submit exclusive ones. The caller must still set the FSCACHE_OP_EXCLUSIVE flag. If successful, both functions will assign the operation to the specified object and return 0. -ENOBUFS will be returned if the object specified is permanently unavailable. The operation manager will defer operations on an object that is still undergoing lookup or creation. The operation will also be deferred if an operation of conflicting exclusivity is in progress on the object. If the operation is asynchronous, the manager will retain a reference to it, so the caller should put their reference to it by passing it to: void fscache_put_operation(struct fscache_operation *op); (3) If the submitting thread wants to do the work itself, and has marked the operation with FSCACHE_OP_MYTHREAD, then it should monitor FSCACHE_OP_WAITING as described above and check the state of the object if necessary (the object might have died whilst the thread was waiting). When it has finished doing its processing, it should call fscache_put_operation() on it. (4) The operation holds an effective lock upon the object, preventing other exclusive ops conflicting until it is released. The operation can be enqueued for further immediate asynchronous processing by adjusting the CPU time provisioning option if necessary, eg: op->flags &= ~FSCACHE_OP_TYPE; op->flags |= ~FSCACHE_OP_FAST; and calling: void fscache_enqueue_operation(struct fscache_operation *op) This can be used to allow other things to have use of the worker thread pools. ===================== ASYNCHRONOUS CALLBACK ===================== When used in asynchronous mode, the worker thread pool will invoke the processor method with a pointer to the operation. This should then get at the container struct by using container_of(): static void fscache_write_op(struct fscache_operation *_op) { struct fscache_storage *op = container_of(_op, struct fscache_storage, op); ... } The caller holds a reference on the operation, and will invoke fscache_put_operation() when the processor function returns. The processor function is at liberty to call fscache_enqueue_operation() or to take extra references. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
2009-04-03 23:42:39 +08:00
_leave("");
}