mm, thp: tweak reclaim/compaction effort of local-only and all-node allocations
THP page faults now attempt a __GFP_THISNODE allocation first, which
should only compact existing free memory, followed by another attempt
that can allocate from any node using the reclaim/compaction effort
specified by the global defrag setting and madvise.
This patch makes the following changes to the scheme:
- Before the patch, the first allocation relies on a check for
  pageblock order and __GFP_IO to prevent excessive reclaim. However,
  this also affects the second attempt, which is not limited to a
  single node.

  Instead of that, reuse the existing check for costly order
  __GFP_NORETRY allocations, and make sure the first THP attempt uses
  __GFP_NORETRY. As a side-effect, all costly order __GFP_NORETRY
  allocations will bail out if compaction needs reclaim, while
  previously they only bailed out when compaction was deferred due to
  previous failures.

  This should still be acceptable within the __GFP_NORETRY semantics.
- Before the patch, the second allocation attempt (on all nodes) was
  passing __GFP_NORETRY. This is redundant as the check for pageblock
  order (discussed above) was stronger. It's also contrary to
  madvise(MADV_HUGEPAGE), which means some effort to allocate THP is
  requested.

  After this patch, the second attempt passes neither __GFP_THISNODE
  nor __GFP_NORETRY.
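
The bail-out rule described in the first item can be modelled in a few lines of userspace C. This is only a sketch, not the kernel code: every `sketch_`/`SKETCH_` name and the flag bit value are hypothetical stand-ins, and the threshold of 3 mirrors the kernel's PAGE_ALLOC_COSTLY_ORDER. It assumes that "compaction needs reclaim" corresponds to a skipped compaction result, as the diff context suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real definitions live in kernel headers. */
#define SKETCH_GFP_NORETRY  0x4u
#define SKETCH_COSTLY_ORDER 3	/* mirrors PAGE_ALLOC_COSTLY_ORDER */

enum sketch_compact_result {
	SKETCH_COMPACT_SKIPPED,		/* compaction would need reclaim first */
	SKETCH_COMPACT_DEFERRED,	/* compaction failed recently */
	SKETCH_COMPACT_OTHER,		/* any other outcome */
};

/*
 * Returns true when a costly-order __GFP_NORETRY allocation should give
 * up instead of entering direct reclaim (post-patch behaviour as
 * described in the commit message).
 */
static bool sketch_should_bail_out(unsigned int gfp_mask, unsigned int order,
				   enum sketch_compact_result result)
{
	bool costly_order = order > SKETCH_COSTLY_ORDER;

	if (!costly_order || !(gfp_mask & SKETCH_GFP_NORETRY))
		return false;

	return result == SKETCH_COMPACT_SKIPPED ||
	       result == SKETCH_COMPACT_DEFERRED;
}

/* Usage example: order 9 is the THP order on x86-64 with 4 KiB pages. */
static void sketch_bail_out_examples(void)
{
	assert(sketch_should_bail_out(SKETCH_GFP_NORETRY, 9,
				      SKETCH_COMPACT_SKIPPED));
	assert(!sketch_should_bail_out(0, 9, SKETCH_COMPACT_SKIPPED));
}
```

In this model the first THP attempt (which carries __GFP_NORETRY) bails out early, while the second attempt (which no longer carries it) falls through to reclaim.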
To sum up, THP page faults now make the following attempts:

1. local node only THP allocation with no reclaim, just compaction.
2. for madvised VMAs or when synchronous compaction is always enabled - THP
   allocation from any node with effort determined by global defrag setting
   and VMA madvise
3. fallback to base pages on any node
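
The three attempts can be sketched as a small userspace C model. It is illustrative only: the `sketch_` names, the flag bit values, and the callback standing in for the page allocator are all hypothetical, and only the order of attempts and the flags passed to each follow the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real values live in kernel headers. */
#define SKETCH_GFP_DIRECT_RECLAIM 0x1u
#define SKETCH_GFP_THISNODE       0x2u
#define SKETCH_GFP_NORETRY        0x4u

enum sketch_outcome {
	SKETCH_THP_LOCAL,	/* attempt 1 succeeded */
	SKETCH_THP_ANY_NODE,	/* attempt 2 succeeded */
	SKETCH_BASE_PAGES,	/* attempt 3: fall back to base pages */
};

/* Hypothetical allocator hook: true if a huge page was obtained. */
typedef bool (*sketch_alloc_fn)(unsigned int gfp);

static enum sketch_outcome sketch_thp_fault(unsigned int gfp,
					    sketch_alloc_fn try_alloc)
{
	assert(try_alloc != NULL);

	/* 1. Local node only, compaction without reclaim. */
	if (try_alloc(gfp | SKETCH_GFP_THISNODE | SKETCH_GFP_NORETRY))
		return SKETCH_THP_LOCAL;

	/* 2. Any node, only when direct reclaim is allowed. */
	if ((gfp & SKETCH_GFP_DIRECT_RECLAIM) && try_alloc(gfp))
		return SKETCH_THP_ANY_NODE;

	/* 3. Caller falls back to base pages on any node. */
	return SKETCH_BASE_PAGES;
}

/* Stub allocators for the usage example below. */
static bool sketch_local_is_free(unsigned int gfp)
{
	(void)gfp;
	return true;
}

static bool sketch_local_is_fragmented(unsigned int gfp)
{
	return !(gfp & SKETCH_GFP_THISNODE);	/* only remote nodes succeed */
}

static void sketch_thp_fault_examples(void)
{
	assert(sketch_thp_fault(0, sketch_local_is_free) == SKETCH_THP_LOCAL);
	assert(sketch_thp_fault(SKETCH_GFP_DIRECT_RECLAIM,
				sketch_local_is_fragmented) ==
	       SKETCH_THP_ANY_NODE);
	assert(sketch_thp_fault(0, sketch_local_is_fragmented) ==
	       SKETCH_BASE_PAGES);
}
```

The model treats the presence of the direct-reclaim bit as the signal that the VMA was madvised or that the defrag setting asks for synchronous compaction, which matches the `gfp & __GFP_DIRECT_RECLAIM` test in the diff.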
Link: http://lkml.kernel.org/r/08a3f4dd-c3ce-0009-86c5-9ee51aba8557@suse.cz
Fixes: b39d0ee263 ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent b3a987b026
commit cc638f329e
@@ -2148,18 +2148,22 @@ alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
 		nmask = policy_nodemask(gfp, pol);
 		if (!nmask || node_isset(hpage_node, *nmask)) {
 			mpol_cond_put(pol);
+			/*
+			 * First, try to allocate THP only on local node, but
+			 * don't reclaim unnecessarily, just compact.
+			 */
 			page = __alloc_pages_node(hpage_node,
-						gfp | __GFP_THISNODE, order);
+				gfp | __GFP_THISNODE | __GFP_NORETRY, order);
 
 			/*
 			 * If hugepage allocations are configured to always
 			 * synchronous compact or the vma has been madvised
 			 * to prefer hugepage backing, retry allowing remote
-			 * memory as well.
+			 * memory with both reclaim and compact as well.
 			 */
 			if (!page && (gfp & __GFP_DIRECT_RECLAIM))
 				page = __alloc_pages_node(hpage_node,
-						gfp | __GFP_NORETRY, order);
+							gfp, order);
 
 			goto out;
 		}
@@ -4476,8 +4476,11 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		if (page)
 			goto got_pg;
 
-		if (order >= pageblock_order && (gfp_mask & __GFP_IO) &&
-		    !(gfp_mask & __GFP_RETRY_MAYFAIL)) {
+		/*
+		 * Checks for costly allocations with __GFP_NORETRY, which
+		 * includes some THP page fault allocations
+		 */
+		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
 			/*
 			 * If allocating entire pageblock(s) and compaction
 			 * failed because all zones are below low watermarks
@@ -4498,23 +4501,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 			if (compact_result == COMPACT_SKIPPED ||
 			    compact_result == COMPACT_DEFERRED)
 				goto nopage;
-		}
-
-		/*
-		 * Checks for costly allocations with __GFP_NORETRY, which
-		 * includes THP page fault allocations
-		 */
-		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
-			/*
-			 * If compaction is deferred for high-order allocations,
-			 * it is because sync compaction recently failed. If
-			 * this is the case and the caller requested a THP
-			 * allocation, we do not want to heavily disrupt the
-			 * system, so we fail the allocation instead of entering
-			 * direct reclaim.
-			 */
-			if (compact_result == COMPACT_DEFERRED)
-				goto nopage;
 
 			/*
 			 * Looks like reclaim/compaction is worth trying, but