docs/vm: unevictable-lru.txt: convert to ReST format
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
parent
44f380fe90
commit
a5e4da91e0
|
@ -1,37 +1,13 @@
|
||||||
==============================
|
.. _unevictable_lru:
|
||||||
UNEVICTABLE LRU INFRASTRUCTURE
|
|
||||||
==============================
|
|
||||||
|
|
||||||
========
|
==============================
|
||||||
CONTENTS
|
Unevictable LRU Infrastructure
|
||||||
========
|
==============================
|
||||||
|
|
||||||
(*) The Unevictable LRU
|
.. contents:: :local:
|
||||||
|
|
||||||
- The unevictable page list.
|
|
||||||
- Memory control group interaction.
|
|
||||||
- Marking address spaces unevictable.
|
|
||||||
- Detecting Unevictable Pages.
|
|
||||||
- vmscan's handling of unevictable pages.
|
|
||||||
|
|
||||||
(*) mlock()'d pages.
|
|
||||||
|
|
||||||
- History.
|
|
||||||
- Basic management.
|
|
||||||
- mlock()/mlockall() system call handling.
|
|
||||||
- Filtering special vmas.
|
|
||||||
- munlock()/munlockall() system call handling.
|
|
||||||
- Migrating mlocked pages.
|
|
||||||
- Compacting mlocked pages.
|
|
||||||
- mmap(MAP_LOCKED) system call handling.
|
|
||||||
- munmap()/exit()/exec() system call handling.
|
|
||||||
- try_to_unmap().
|
|
||||||
- try_to_munlock() reverse map scan.
|
|
||||||
- Page reclaim in shrink_*_list().
|
|
||||||
|
|
||||||
|
|
||||||
============
|
Introduction
|
||||||
INTRODUCTION
|
|
||||||
============
|
============
|
||||||
|
|
||||||
This document describes the Linux memory manager's "Unevictable LRU"
|
This document describes the Linux memory manager's "Unevictable LRU"
|
||||||
|
@ -46,8 +22,8 @@ details - the "what does it do?" - by reading the code. One hopes that the
|
||||||
descriptions below add value by providing the answer to "why does it do that?".
|
descriptions below add value by providing the answer to "why does it do that?".
|
||||||
|
|
||||||
|
|
||||||
===================
|
|
||||||
THE UNEVICTABLE LRU
|
The Unevictable LRU
|
||||||
===================
|
===================
|
||||||
|
|
||||||
The Unevictable LRU facility adds an additional LRU list to track unevictable
|
The Unevictable LRU facility adds an additional LRU list to track unevictable
|
||||||
|
@ -66,17 +42,17 @@ completely unresponsive.
|
||||||
|
|
||||||
The unevictable list addresses the following classes of unevictable pages:
|
The unevictable list addresses the following classes of unevictable pages:
|
||||||
|
|
||||||
(*) Those owned by ramfs.
|
* Those owned by ramfs.
|
||||||
|
|
||||||
(*) Those mapped into SHM_LOCK'd shared memory regions.
|
* Those mapped into SHM_LOCK'd shared memory regions.
|
||||||
|
|
||||||
(*) Those mapped into VM_LOCKED [mlock()ed] VMAs.
|
* Those mapped into VM_LOCKED [mlock()ed] VMAs.
|
||||||
|
|
||||||
The infrastructure may also be able to handle other conditions that make pages
|
The infrastructure may also be able to handle other conditions that make pages
|
||||||
unevictable, either by definition or by circumstance, in the future.
|
unevictable, either by definition or by circumstance, in the future.
|
||||||
|
|
||||||
|
|
||||||
THE UNEVICTABLE PAGE LIST
|
The Unevictable Page List
|
||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list
|
The Unevictable LRU infrastructure consists of an additional, per-zone, LRU list
|
||||||
|
@ -118,7 +94,7 @@ the unevictable list when one task has the page isolated from the LRU and other
|
||||||
tasks are changing the "evictability" state of the page.
|
tasks are changing the "evictability" state of the page.
|
||||||
|
|
||||||
|
|
||||||
MEMORY CONTROL GROUP INTERACTION
|
Memory Control Group Interaction
|
||||||
--------------------------------
|
--------------------------------
|
||||||
|
|
||||||
The unevictable LRU facility interacts with the memory control group [aka
|
The unevictable LRU facility interacts with the memory control group [aka
|
||||||
|
@ -144,7 +120,9 @@ effects:
|
||||||
the control group to thrash or to OOM-kill tasks.
|
the control group to thrash or to OOM-kill tasks.
|
||||||
|
|
||||||
|
|
||||||
MARKING ADDRESS SPACES UNEVICTABLE
|
.. _mark_addr_space_unevict:
|
||||||
|
|
||||||
|
Marking Address Spaces Unevictable
|
||||||
----------------------------------
|
----------------------------------
|
||||||
|
|
||||||
For facilities such as ramfs none of the pages attached to the address space
|
For facilities such as ramfs none of the pages attached to the address space
|
||||||
|
@ -152,15 +130,15 @@ may be evicted. To prevent eviction of any such pages, the AS_UNEVICTABLE
|
||||||
address space flag is provided, and this can be manipulated by a filesystem
|
address space flag is provided, and this can be manipulated by a filesystem
|
||||||
using a number of wrapper functions:
|
using a number of wrapper functions:
|
||||||
|
|
||||||
(*) void mapping_set_unevictable(struct address_space *mapping);
|
* ``void mapping_set_unevictable(struct address_space *mapping);``
|
||||||
|
|
||||||
Mark the address space as being completely unevictable.
|
Mark the address space as being completely unevictable.
|
||||||
|
|
||||||
(*) void mapping_clear_unevictable(struct address_space *mapping);
|
* ``void mapping_clear_unevictable(struct address_space *mapping);``
|
||||||
|
|
||||||
Mark the address space as being evictable.
|
Mark the address space as being evictable.
|
||||||
|
|
||||||
(*) int mapping_unevictable(struct address_space *mapping);
|
* ``int mapping_unevictable(struct address_space *mapping);``
|
||||||
|
|
||||||
Query the address space, and return true if it is completely
|
Query the address space, and return true if it is completely
|
||||||
unevictable.
|
unevictable.
|
||||||
|
@ -177,12 +155,13 @@ These are currently used in two places in the kernel:
|
||||||
ensure they're in memory.
|
ensure they're in memory.
|
||||||
|
|
||||||
|
|
||||||
DETECTING UNEVICTABLE PAGES
|
Detecting Unevictable Pages
|
||||||
---------------------------
|
---------------------------
|
||||||
|
|
||||||
The function page_evictable() in vmscan.c determines whether a page is
|
The function page_evictable() in vmscan.c determines whether a page is
|
||||||
evictable or not using the query function outlined above [see section "Marking
|
evictable or not using the query function outlined above [see section
|
||||||
address spaces unevictable"] to check the AS_UNEVICTABLE flag.
|
:ref:`Marking address spaces unevictable <mark_addr_space_unevict>`]
|
||||||
|
to check the AS_UNEVICTABLE flag.
|
||||||
|
|
||||||
For address spaces that are so marked after being populated (as SHM regions
|
For address spaces that are so marked after being populated (as SHM regions
|
||||||
might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate
|
might be), the lock action (eg: SHM_LOCK) can be lazy, and need not populate
|
||||||
|
@ -202,7 +181,7 @@ flag, PG_mlocked (as wrapped by PageMlocked()), which is set when a page is
|
||||||
faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED.
|
faulted into a VM_LOCKED vma, or found in a vma being VM_LOCKED.
|
||||||
|
|
||||||
|
|
||||||
VMSCAN'S HANDLING OF UNEVICTABLE PAGES
|
Vmscan's Handling of Unevictable Pages
|
||||||
--------------------------------------
|
--------------------------------------
|
||||||
|
|
||||||
If unevictable pages are culled in the fault path, or moved to the unevictable
|
If unevictable pages are culled in the fault path, or moved to the unevictable
|
||||||
|
@ -233,8 +212,7 @@ extra evictability checks should not occur in the majority of calls to
|
||||||
putback_lru_page().
|
putback_lru_page().
|
||||||
|
|
||||||
|
|
||||||
=============
|
MLOCKED Pages
|
||||||
MLOCKED PAGES
|
|
||||||
=============
|
=============
|
||||||
|
|
||||||
The unevictable page list is also useful for mlock(), in addition to ramfs and
|
The unevictable page list is also useful for mlock(), in addition to ramfs and
|
||||||
|
@ -242,7 +220,7 @@ SYSV SHM. Note that mlock() is only available in CONFIG_MMU=y situations; in
|
||||||
NOMMU situations, all mappings are effectively mlocked.
|
NOMMU situations, all mappings are effectively mlocked.
|
||||||
|
|
||||||
|
|
||||||
HISTORY
|
History
|
||||||
-------
|
-------
|
||||||
|
|
||||||
The "Unevictable mlocked Pages" infrastructure is based on work originally
|
The "Unevictable mlocked Pages" infrastructure is based on work originally
|
||||||
|
@ -263,7 +241,7 @@ replaced by walking the reverse map to determine whether any VM_LOCKED VMAs
|
||||||
mapped the page. More on this below.
|
mapped the page. More on this below.
|
||||||
|
|
||||||
|
|
||||||
BASIC MANAGEMENT
|
Basic Management
|
||||||
----------------
|
----------------
|
||||||
|
|
||||||
mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable
|
mlocked pages - pages mapped into a VM_LOCKED VMA - are a class of unevictable
|
||||||
|
@ -304,10 +282,10 @@ mlocked pages become unlocked and rescued from the unevictable list when:
|
||||||
(4) before a page is COW'd in a VM_LOCKED VMA.
|
(4) before a page is COW'd in a VM_LOCKED VMA.
|
||||||
|
|
||||||
|
|
||||||
mlock()/mlockall() SYSTEM CALL HANDLING
|
mlock()/mlockall() System Call Handling
|
||||||
---------------------------------------
|
---------------------------------------
|
||||||
|
|
||||||
Both [do_]mlock() and [do_]mlockall() system call handlers call mlock_fixup()
|
Both [do\_]mlock() and [do\_]mlockall() system call handlers call mlock_fixup()
|
||||||
for each VMA in the range specified by the call. In the case of mlockall(),
|
for each VMA in the range specified by the call. In the case of mlockall(),
|
||||||
this is the entire active address space of the task. Note that mlock_fixup()
|
this is the entire active address space of the task. Note that mlock_fixup()
|
||||||
is used for both mlocking and munlocking a range of memory. A call to mlock()
|
is used for both mlocking and munlocking a range of memory. A call to mlock()
|
||||||
|
@ -351,7 +329,7 @@ mlock_vma_page() is unable to isolate the page from the LRU, vmscan will handle
|
||||||
it later if and when it attempts to reclaim the page.
|
it later if and when it attempts to reclaim the page.
|
||||||
|
|
||||||
|
|
||||||
FILTERING SPECIAL VMAS
|
Filtering Special VMAs
|
||||||
----------------------
|
----------------------
|
||||||
|
|
||||||
mlock_fixup() filters several classes of "special" VMAs:
|
mlock_fixup() filters several classes of "special" VMAs:
|
||||||
|
@ -379,8 +357,9 @@ VM_LOCKED flag. Therefore, we won't have to deal with them later during
|
||||||
munlock(), munmap() or task exit. Neither does mlock_fixup() account these
|
munlock(), munmap() or task exit. Neither does mlock_fixup() account these
|
||||||
VMAs against the task's "locked_vm".
|
VMAs against the task's "locked_vm".
|
||||||
|
|
||||||
|
.. _munlock_munlockall_handling:
|
||||||
|
|
||||||
munlock()/munlockall() SYSTEM CALL HANDLING
|
munlock()/munlockall() System Call Handling
|
||||||
-------------------------------------------
|
-------------------------------------------
|
||||||
|
|
||||||
The munlock() and munlockall() system calls are handled by the same functions -
|
The munlock() and munlockall() system calls are handled by the same functions -
|
||||||
|
@ -426,7 +405,7 @@ This is fine, because we'll catch it later if and when vmscan tries to reclaim
|
||||||
the page. This should be relatively rare.
|
the page. This should be relatively rare.
|
||||||
|
|
||||||
|
|
||||||
MIGRATING MLOCKED PAGES
|
Migrating MLOCKED Pages
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
A page that is being migrated has been isolated from the LRU lists and is held
|
A page that is being migrated has been isolated from the LRU lists and is held
|
||||||
|
@ -451,7 +430,7 @@ list because of a race between munlock and migration, page migration uses the
|
||||||
putback_lru_page() function to add migrated pages back to the LRU.
|
putback_lru_page() function to add migrated pages back to the LRU.
|
||||||
|
|
||||||
|
|
||||||
COMPACTING MLOCKED PAGES
|
Compacting MLOCKED Pages
|
||||||
------------------------
|
------------------------
|
||||||
|
|
||||||
The unevictable LRU can be scanned for compactable regions and the default
|
The unevictable LRU can be scanned for compactable regions and the default
|
||||||
|
@ -461,7 +440,7 @@ unevictable LRU is enabled, the work of compaction is mostly handled by
|
||||||
the page migration code and the same work flow as described in MIGRATING
|
the page migration code and the same work flow as described in MIGRATING
|
||||||
MLOCKED PAGES will apply.
|
MLOCKED PAGES will apply.
|
||||||
|
|
||||||
MLOCKING TRANSPARENT HUGE PAGES
|
MLOCKING Transparent Huge Pages
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
|
||||||
A transparent huge page is represented by a single entry on an LRU list.
|
A transparent huge page is represented by a single entry on an LRU list.
|
||||||
|
@ -483,7 +462,7 @@ to unevictable LRU and the rest can be reclaimed.
|
||||||
|
|
||||||
See also comment in follow_trans_huge_pmd().
|
See also comment in follow_trans_huge_pmd().
|
||||||
|
|
||||||
mmap(MAP_LOCKED) SYSTEM CALL HANDLING
|
mmap(MAP_LOCKED) System Call Handling
|
||||||
-------------------------------------
|
-------------------------------------
|
||||||
|
|
||||||
In addition to the mlock()/mlockall() system calls, an application can request
|
In addition to the mlock()/mlockall() system calls, an application can request
|
||||||
|
@ -514,7 +493,7 @@ memory range accounted as locked_vm, as the protections could be changed later
|
||||||
and pages allocated into that region.
|
and pages allocated into that region.
|
||||||
|
|
||||||
|
|
||||||
munmap()/exit()/exec() SYSTEM CALL HANDLING
|
munmap()/exit()/exec() System Call Handling
|
||||||
-------------------------------------------
|
-------------------------------------------
|
||||||
|
|
||||||
When unmapping an mlocked region of memory, whether by an explicit call to
|
When unmapping an mlocked region of memory, whether by an explicit call to
|
||||||
|
@ -568,16 +547,18 @@ munlock or munmap system calls, mm teardown (munlock_vma_pages_all), reclaim,
|
||||||
holepunching, and truncation of file pages and their anonymous COWed pages.
|
holepunching, and truncation of file pages and their anonymous COWed pages.
|
||||||
|
|
||||||
|
|
||||||
try_to_munlock() REVERSE MAP SCAN
|
try_to_munlock() Reverse Map Scan
|
||||||
---------------------------------
|
---------------------------------
|
||||||
|
|
||||||
[!] TODO/FIXME: a better name might be page_mlocked() - analogous to the
|
.. warning::
|
||||||
page_referenced() reverse map walker.
|
[!] TODO/FIXME: a better name might be page_mlocked() - analogous to the
|
||||||
|
page_referenced() reverse map walker.
|
||||||
|
|
||||||
When munlock_vma_page() [see section "munlock()/munlockall() System Call
|
When munlock_vma_page() [see section :ref:`munlock()/munlockall() System Call
|
||||||
Handling" above] tries to munlock a page, it needs to determine whether or not
|
Handling <munlock_munlockall_handling>` above] tries to munlock a
|
||||||
the page is mapped by any VM_LOCKED VMA without actually attempting to unmap
|
page, it needs to determine whether or not the page is mapped by any
|
||||||
all PTEs from the page. For this purpose, the unevictable/mlock infrastructure
|
VM_LOCKED VMA without actually attempting to unmap all PTEs from the
|
||||||
|
page. For this purpose, the unevictable/mlock infrastructure
|
||||||
introduced a variant of try_to_unmap() called try_to_munlock().
|
introduced a variant of try_to_unmap() called try_to_munlock().
|
||||||
|
|
||||||
try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
|
try_to_munlock() calls the same functions as try_to_unmap() for anonymous and
|
||||||
|
@ -595,7 +576,7 @@ large region or tearing down a large address space that has been mlocked via
|
||||||
mlockall(), overall this is a fairly rare event.
|
mlockall(), overall this is a fairly rare event.
|
||||||
|
|
||||||
|
|
||||||
PAGE RECLAIM IN shrink_*_list()
|
Page Reclaim in shrink_*_list()
|
||||||
-------------------------------
|
-------------------------------
|
||||||
|
|
||||||
shrink_active_list() culls any obviously unevictable pages - i.e.
|
shrink_active_list() culls any obviously unevictable pages - i.e.
|
||||||
|
|
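
The "Marking Address Spaces Unevictable" hunk above lists three wrappers around the AS_UNEVICTABLE flag: one to set it, one to clear it, and one to query it. The sketch below is a user-space model of that set/clear/query pattern only, not the kernel implementation. The ``address_space_model`` struct, the ``model_*`` function names, and the bit position are all assumptions made for illustration; the real ``struct address_space`` and flag encoding live in the kernel headers and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct address_space: only a flags word. */
struct address_space_model {
	unsigned long flags;
};

/* Bit position picked for illustration; not taken from the kernel. */
enum { AS_UNEVICTABLE_MODEL = 3 };

/* Mark the whole mapping as unevictable (models mapping_set_unevictable()). */
void model_set_unevictable(struct address_space_model *mapping)
{
	mapping->flags |= 1UL << AS_UNEVICTABLE_MODEL;
}

/* Mark the mapping as evictable again (models mapping_clear_unevictable()). */
void model_clear_unevictable(struct address_space_model *mapping)
{
	mapping->flags &= ~(1UL << AS_UNEVICTABLE_MODEL);
}

/* Query the flag; a NULL mapping is treated as evictable in this model. */
bool model_unevictable(const struct address_space_model *mapping)
{
	return mapping && (mapping->flags & (1UL << AS_UNEVICTABLE_MODEL));
}

/* Walk one mapping through set, query, clear; returns 0 if all steps agree. */
int model_demo(void)
{
	struct address_space_model m = { .flags = 0 };

	if (model_unevictable(&m))
		return 1;
	model_set_unevictable(&m);
	if (!model_unevictable(&m))
		return 2;
	model_clear_unevictable(&m);
	return model_unevictable(&m) ? 3 : 0;
}
```

A caller such as a ramfs-like filesystem would set the flag once when the mapping is created, and a page_evictable()-style check would then only need the query function.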