mirror of https://gitee.com/openkylin/linux.git
docs/vm: cleancache.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent d04f9f5a78
commit 5ef829e056
@@ -1,4 +1,11 @@
-MOTIVATION
+.. _cleancache:
+
+==========
+Cleancache
+==========
+
+Motivation
+==========
 
 Cleancache is a new optional feature provided by the VFS layer that
 potentially dramatically increases page cache effectiveness for
@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented
 in Xen (using hypervisor memory) and zcache (using in-kernel compressed
 memory) and other implementations are in development.
 
-FAQs are included below.
+:ref:`FAQs <faq>` are included below.
 
-IMPLEMENTATION OVERVIEW
+Implementation Overview
+=======================
 
 A cleancache "backend" that provides transcendent memory registers itself
 to the kernel's cleancache "frontend" by calling cleancache_register_ops,
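Reviewer note (not part of the commit): the heading changes in the hunks above follow the ReST rule that a section title needs an underline at least as long as the title text. A minimal sketch of that rule, assuming a hypothetical helper named `rst_heading`; the commit itself does not say how the headings were produced, and the title casing (for example keeping `FAQ` in capitals) is left to the caller.

```python
def rst_heading(title, underline="="):
    """Return a ReST section title followed by an underline of matching length."""
    # The underline character is repeated once per character of the title,
    # which satisfies the "at least as long as the title" requirement.
    return f"{title}\n{underline * len(title)}"
```

Calling `rst_heading("Implementation Overview")` yields the two added lines shown in the hunk above.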
@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page
 with the same handle, the results are indeterminate. Callers must
 lock the page to ensure serial behavior.
 
-CLEANCACHE PERFORMANCE METRICS
+Cleancache Performance Metrics
+==============================
 
 If properly configured, monitoring of cleancache is done via debugfs in
-the /sys/kernel/debug/cleancache directory. The effectiveness of cleancache
+the `/sys/kernel/debug/cleancache` directory. The effectiveness of cleancache
 can be measured (across all filesystems) with:
 
-succ_gets - number of gets that were successful
-failed_gets - number of gets that failed
-puts - number of puts attempted (all "succeed")
-invalidates - number of invalidates attempted
+``succ_gets``
+	number of gets that were successful
+
+``failed_gets``
+	number of gets that failed
+
+``puts``
+	number of puts attempted (all "succeed")
+
+``invalidates``
+	number of invalidates attempted
 
 A backend implementation may provide additional metrics.
 
-FAQ
+.. _faq:
 
-1) Where's the value? (Andrew Morton)
+FAQ
+===
+
+* Where's the value? (Andrew Morton)
 
 Cleancache provides a significant performance benefit to many workloads
 in many environments with negligible overhead by improving the
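Reviewer note (not part of the commit): the four counters documented in the hunk above can be combined into a rough hit ratio. The sketch below is an illustration only. It assumes each counter is exposed as a file of the same name under the debugfs directory and holds a single decimal integer; the helper names `read_counters` and `hit_ratio` are hypothetical, and the hit ratio is a derived figure that the document itself does not define.

```python
from pathlib import Path

# Counter names as listed in the documentation.
COUNTERS = ("succ_gets", "failed_gets", "puts", "invalidates")


def read_counters(directory="/sys/kernel/debug/cleancache"):
    """Read the documented counters from `directory` into a dict of ints."""
    base = Path(directory)
    # Assumption: one file per counter, each containing one decimal integer.
    return {name: int((base / name).read_text().strip()) for name in COUNTERS}


def hit_ratio(counters):
    """Return succ_gets / (succ_gets + failed_gets), or None if no gets occurred."""
    total = counters["succ_gets"] + counters["failed_gets"]
    return counters["succ_gets"] / total if total else None
```

On a system with debugfs mounted and cleancache enabled, `hit_ratio(read_counters())` would typically need root privileges to read the directory.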
@@ -137,8 +156,8 @@ device that stores pages of data in a compressed state. And
 the proposed "RAMster" driver shares RAM across multiple physical
 systems.
 
-2) Why does cleancache have its sticky fingers so deep inside the
-   filesystems and VFS? (Andrew Morton and Christoph Hellwig)
+* Why does cleancache have its sticky fingers so deep inside the
+  filesystems and VFS? (Andrew Morton and Christoph Hellwig)
 
 The core hooks for cleancache in VFS are in most cases a single line
 and the minimum set are placed precisely where needed to maintain
@@ -168,9 +187,9 @@ filesystems in the future.
 The total impact of the hooks to existing fs and mm files is only
 about 40 lines added (not counting comments and blank lines).
 
-3) Why not make cleancache asynchronous and batched so it can
-   more easily interface with real devices with DMA instead
-   of copying each individual page? (Minchan Kim)
+* Why not make cleancache asynchronous and batched so it can more
+  easily interface with real devices with DMA instead of copying each
+  individual page? (Minchan Kim)
 
 The one-page-at-a-time copy semantics simplifies the implementation
 on both the frontend and backend and also allows the backend to
@@ -182,8 +201,8 @@ are avoided. While the interface seems odd for a "real device"
 or for real kernel-addressable RAM, it makes perfect sense for
 transcendent memory.
 
-4) Why is non-shared cleancache "exclusive"? And where is the
-   page "invalidated" after a "get"? (Minchan Kim)
+* Why is non-shared cleancache "exclusive"? And where is the
+  page "invalidated" after a "get"? (Minchan Kim)
 
 The main reason is to free up space in transcendent memory and
 to avoid unnecessary cleancache_invalidate calls. If you want inclusive,
@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call.
 
 The invalidate is done by the cleancache backend implementation.
 
-5) What's the performance impact?
+* What's the performance impact?
 
 Performance analysis has been presented at OLS'09 and LCA'10.
 Briefly, performance gains can be significant on most workloads,
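Reviewer note (not part of the commit): the "exclusive" get described in the FAQ entry above means that a successful get also removes the page from the backend, so no separate invalidate call is needed. The toy model below sketches that behaviour under stated assumptions. The class name `ToyExclusiveBackend`, the method signatures, and the `(pool_id, file_key, index)` handle layout are all hypothetical and are not the kernel's cleancache API; they only mirror the terms "pool id", "inode/filehandle" and "handle" used in the text.

```python
class ToyExclusiveBackend:
    """In-memory stand-in for a cleancache backend (illustration only)."""

    def __init__(self):
        # Maps a hypothetical handle (pool_id, file_key, index) to page bytes.
        self._pages = {}

    def put(self, pool_id, file_key, index, data):
        # Store a copy of the page under its handle, replacing any older copy.
        self._pages[(pool_id, file_key, index)] = bytes(data)

    def get(self, pool_id, file_key, index):
        # Exclusive semantics: a successful get removes the page, which is
        # the backend-side "invalidate" the FAQ refers to.
        return self._pages.pop((pool_id, file_key, index), None)

    def invalidate(self, pool_id, file_key, index):
        # Drop the page if present; invalidating a missing page is a no-op.
        self._pages.pop((pool_id, file_key, index), None)
```

In this model a second get for the same handle returns `None`, which is the space-saving effect the FAQ answer gives as the main reason for exclusivity.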
@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache
 has little value, but in newer multicore machines, especially
 consolidated/virtualized machines, it has great value.
 
-6) How do I add cleancache support for filesystem X? (Boaz Harrash)
+* How do I add cleancache support for filesystem X? (Boaz Harrash)
 
 Filesystems that are well-behaved and conform to certain
 restrictions can utilize cleancache simply by making a call to
@@ -217,26 +236,26 @@ not enable the optional cleancache.
 
 Some points for a filesystem to consider:
 
-- The FS should be block-device-based (e.g. a ram-based FS such
-  as tmpfs should not enable cleancache)
-- To ensure coherency/correctness, the FS must ensure that all
-  file removal or truncation operations either go through VFS or
-  add hooks to do the equivalent cleancache "invalidate" operations
-- To ensure coherency/correctness, either inode numbers must
-  be unique across the lifetime of the on-disk file OR the
-  FS must provide an "encode_fh" function.
-- The FS must call the VFS superblock alloc and deactivate routines
-  or add hooks to do the equivalent cleancache calls done there.
-- To maximize performance, all pages fetched from the FS should
-  go through the do_mpag_readpage routine or the FS should add
-  hooks to do the equivalent (cf. btrfs)
-- Currently, the FS blocksize must be the same as PAGESIZE. This
-  is not an architectural restriction, but no backends currently
-  support anything different.
-- A clustered FS should invoke the "shared_init_fs" cleancache
-  hook to get best performance for some backends.
+  - The FS should be block-device-based (e.g. a ram-based FS such
+    as tmpfs should not enable cleancache)
+  - To ensure coherency/correctness, the FS must ensure that all
+    file removal or truncation operations either go through VFS or
+    add hooks to do the equivalent cleancache "invalidate" operations
+  - To ensure coherency/correctness, either inode numbers must
+    be unique across the lifetime of the on-disk file OR the
+    FS must provide an "encode_fh" function.
+  - The FS must call the VFS superblock alloc and deactivate routines
+    or add hooks to do the equivalent cleancache calls done there.
+  - To maximize performance, all pages fetched from the FS should
+    go through the do_mpag_readpage routine or the FS should add
+    hooks to do the equivalent (cf. btrfs)
+  - Currently, the FS blocksize must be the same as PAGESIZE. This
+    is not an architectural restriction, but no backends currently
+    support anything different.
+  - A clustered FS should invoke the "shared_init_fs" cleancache
+    hook to get best performance for some backends.
 
-7) Why not use the KVA of the inode as the key? (Christoph Hellwig)
+* Why not use the KVA of the inode as the key? (Christoph Hellwig)
 
 If cleancache would use the inode virtual address instead of
 inode/filehandle, the pool id could be eliminated. But, this
@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache
 is potentially much larger than the kernel pagecache and is most
 useful if the pages survive inode cache removal.
 
-8) Why is a global variable required?
+* Why is a global variable required?
 
 The cleancache_enabled flag is checked in all of the frequently-used
 cleancache hooks. The alternative is a function call to check a static
@@ -262,14 +281,14 @@ global variable allows cleancache to be enabled by default at compile
 time, but have insignificant performance impact when cleancache remains
 disabled at runtime.
 
-9) Does cleanache work with KVM?
+* Does cleanache work with KVM?
 
 The memory model of KVM is sufficiently different that a cleancache
 backend may have less value for KVM. This remains to be tested,
 especially in an overcommitted system.
 
-10) Does cleancache work in userspace? It sounds useful for
-   memory hungry caches like web browsers. (Jamie Lokier)
+* Does cleancache work in userspace? It sounds useful for
+  memory hungry caches like web browsers. (Jamie Lokier)
 
 No plans yet, though we agree it sounds useful, at least for
 apps that bypass the page cache (e.g. O_DIRECT).