linux/mm
Muchun Song b0ba3bff3e mm: memcontrol: fix NR_ANON_THPS accounting in charge moving
Patch series "Convert all THP vmstat counters to pages", v6.

This patch series is aimed to convert all THP vmstat counters to pages.

The unit of some vmstat counters are pages, some are bytes, some are
HPAGE_PMD_NR, and some are KiB. When we want to expose these vmstat
counters to the userspace, we have to know the unit of the vmstat counters
is which one. When the unit is bytes or kB, both clearly distinguishable
by the B/KB suffix. But for the THP vmstat counters, we may make mistakes.

For example, the below is some bug fix for the THP vmstat counters:

  - 7de2e9f195 ("mm: memcontrol: correct the NR_ANON_THPS counter of hierarchical memcg")
  - The first commit in this series ("fix NR_ANON_THPS accounting in charge moving")

This patch series can make the code clear. And make all the unit of the THP
vmstat counters in pages. Finally, the unit of the vmstat counters are
pages, kB and bytes. The B/KB suffix can tell us that the unit is bytes
or kB. The rest which is without suffix are pages.

In this series, I changed the following vmstat counters unit from HPAGE_PMD_NR
to pages. However, there is no change to the print format of output to user
space.

  - NR_ANON_THPS
  - NR_FILE_THPS
  - NR_SHMEM_THPS
  - NR_SHMEM_PMDMAPPED
  - NR_FILE_PMDMAPPED

Doing this also can make the statistics more accuracy for the THP vmstat
counters. This series is consistent with 8f182270df ("mm/swap.c: flush lru
pvecs on compound page arrival").

Because we use struct per_cpu_nodestat to cache the vmstat counters, which
leads to inaccurate statistics especially THP vmstat counters. In the systems
with hundreds of processors it can be GBs of memory. For example, for a 96
CPUs system, the threshold is the maximum number of 125. And the per cpu
counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth of
memory in one go) so skipping the batching seems like sensible. Although every
THP stats update overflows the per-cpu counter, resorting to atomic global
updates. But it can make the statistics more accuracy for the THP vmstat
counters. From this point of view, I think that do this converting is
reasonable.

Thanks Hugh for mentioning this. This was inspired by Johannes and Roman.
Thanks to them.

This patch (of 7):

The unit of NR_ANON_THPS is HPAGE_PMD_NR already.  So it should inc/dec by
one rather than nr_pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20201228164110.2838-2-songmuchun@bytedance.com
Fixes: 468c398233 ("mm: memcontrol: switch to native NR_ANON_THPS counter")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
..
kasan kasan: fix stack traces dependency for HW_TAGS 2021-02-09 17:26:44 -08:00
Kconfig media: videobuf2: Move frame_vector into media subsystem 2021-01-12 14:15:31 +01:00
Kconfig.debug mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO 2020-12-15 12:13:46 -08:00
Makefile media: videobuf2: Move frame_vector into media subsystem 2021-01-12 14:15:31 +01:00
backing-dev.c mm: backing-dev: Remove duplicated macro definition 2021-02-24 13:38:28 -08:00
balloon_compaction.c mm/balloon_compaction: suppress allocation warnings 2019-09-04 07:42:01 -04:00
cleancache.c
cma.c mm: cma: improve pr_debug log in cma_release() 2020-12-15 12:13:46 -08:00
cma.h mm: cma: use CMA_MAX_NAME to define the length of cma name array 2020-09-01 09:19:43 +02:00
cma_debug.c debugfs: make sure we can remove u32_array files cleanly 2020-07-10 13:54:00 -07:00
compaction.c mm, compaction: move high_pfn to the for loop scope 2021-02-05 11:03:47 -08:00
debug.c mm/debug: improve memcg debugging 2021-02-24 13:38:27 -08:00
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable/basic: iterate over entire protection_map[] 2021-02-24 13:38:27 -08:00
dmapool.c mm/dmapool.c: replace hard coded function name with __func__ 2020-10-13 18:38:32 -07:00
early_ioremap.c mm/early_ioremap.c: use %pa to print resource_size_t variables 2020-01-31 10:30:38 -08:00
fadvise.c mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED 2020-10-13 18:38:29 -07:00
failslab.c
filemap.c mm/filemap: simplify generic_file_read_iter 2021-02-24 13:38:28 -08:00
frontswap.c mm/frontswap: mark various intentional data races 2020-08-14 19:56:56 -07:00
gup.c Merge branch 'akpm' (patches from Andrew) 2020-12-15 12:53:37 -08:00
gup_test.c mm/gup_test.c: mark gup_test_init as __init function 2020-12-15 12:13:38 -08:00
gup_test.h selftests/vm: gup_test: introduce the dump_pages() sub-test 2020-12-15 12:13:38 -08:00
highmem.c mm/highmem: prepare for overriding set_pte_at() 2021-01-24 10:34:52 -08:00
hmm.c mm: do page fault accounting in handle_mm_fault 2020-08-12 10:58:02 -07:00
huge_memory.c mm/filemap: pass a sleep state to put_and_wait_on_page_locked 2021-02-24 13:38:28 -08:00
hugetlb.c These changes fix MM (soft-)dirty bit management in the procfs code & clean up the API. 2021-02-21 12:19:56 -08:00
hugetlb_cgroup.c hugetlb_cgroup: fix offline of hugetlb cgroup with reservations 2020-12-06 10:19:07 -08:00
hwpoison-inject.c mm,hwpoison-inject: don't pin for hwpoison_filter 2020-10-16 11:11:16 -07:00
init-mm.c mm/gup: prevent gup_fast from racing with COW during fork 2020-12-15 12:13:39 -08:00
internal.h mm, page_alloc: disable pcplists during memory offline 2020-12-15 12:13:43 -08:00
interval_tree.c
ioremap.c mm: move p?d_alloc_track to separate header file 2020-08-07 11:33:26 -07:00
khugepaged.c mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() 2021-01-21 12:50:18 +00:00
kmemleak.c mm/kmemleak: rely on rcu for task stack scanning 2020-10-13 18:38:27 -07:00
ksm.c mm: cleanup kstrto*() usage 2020-12-15 12:13:47 -08:00
list_lru.c mm: list_lru: set shrinker map bit when child nr_items is not zero 2020-12-06 10:19:07 -08:00
maccess.c uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
madvise.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
mapping_dirty_helpers.c mm/mapping_dirty_helpers: enhance the kernel-doc markups 2020-12-15 12:13:41 -08:00
memblock.c memblock: remove return value of memblock_free_all() 2021-02-22 13:01:23 -08:00
memcontrol.c mm: memcontrol: fix NR_ANON_THPS accounting in charge moving 2021-02-24 13:38:29 -08:00
memfd.c mm: page cache: store only head pages in i_pages 2019-09-24 15:54:08 -07:00
memory-failure.c mm: fix page reference leak in soft_offline_page() 2021-01-24 10:34:52 -08:00
memory.c Fixes around VM_FPNMAP and follow_pfn 2021-02-22 17:45:02 -08:00
memory_hotplug.c mm: memmap defer init doesn't work as expected 2020-12-29 15:36:49 -08:00
mempolicy.c mm: migrate: initialize err in do_migrate_pages 2021-01-12 18:12:54 -08:00
mempool.c kasan, mm: rename kasan_poison_kfree 2020-12-22 12:55:09 -08:00
memremap.c mm/mremap_pages: fix static key devmap_managed_key updates 2020-11-02 12:14:18 -08:00
memtest.c
migrate.c mm/filemap: pass a sleep state to put_and_wait_on_page_locked 2021-02-24 13:38:28 -08:00
mincore.c inode: make init and permission helpers idmapped mount aware 2021-01-24 14:27:16 +01:00
mlock.c mm/lru: introduce relock_page_lruvec() 2020-12-15 14:48:04 -08:00
mm_init.c mm: fix fall-through warnings for Clang 2020-12-15 12:13:47 -08:00
mmap.c tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu() 2021-01-29 20:02:29 +01:00
mmap_lock.c mm: mmap_lock: add tracepoints around lock acquisition 2020-12-15 12:13:41 -08:00
mmu_gather.c tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu() 2021-01-29 20:02:29 +01:00
mmu_notifier.c mm: track mmu notifiers in fs_reclaim_acquire/release 2020-12-15 12:13:41 -08:00
mmzone.c mm/lru: replace pgdat lru_lock with lruvec lock 2020-12-15 14:48:04 -08:00
mprotect.c mm: Add 'mprotect' hook to struct vm_operations_struct 2020-11-17 14:36:14 +01:00
mremap.c This pull request contains the following changes for UML: 2021-02-21 13:53:00 -08:00
msync.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
nommu.c mm/nommu: Fix return type of filemap_map_pages() 2021-01-28 14:10:31 +00:00
oom_kill.c tlb: mmu_gather: Remove start/end arguments from tlb_gather_mmu() 2021-01-29 20:02:29 +01:00
page-writeback.c mm: make wait_on_page_writeback() wait for multiple pending writebacks 2021-01-05 11:33:00 -08:00
page_alloc.c mm: page_frag: Introduce page_frag_alloc_align() 2021-02-06 11:57:28 -08:00
page_counter.c mm/page_counter: use page_counter_read in page_counter_set_max 2020-12-15 12:13:40 -08:00
page_ext.c mm: fix some spelling mistakes in comments 2020-12-15 22:46:19 -08:00
page_idle.c mm: page_idle_get_page() does not need lru_lock 2020-12-15 14:48:03 -08:00
page_io.c mm/page_io: use pr_alert_ratelimited for swap read/write errors 2021-02-24 13:38:29 -08:00
page_isolation.c mm/page_isolation: do not isolate the max order page 2020-12-15 12:13:45 -08:00
page_owner.c mm/page_owner: use helper function zone_end_pfn() to get end_pfn 2021-02-24 13:38:27 -08:00
page_poison.c kasan, mm: reset tags when accessing metadata 2020-12-22 12:55:08 -08:00
page_reporting.c mm: rename page_order() to buddy_order() 2020-10-16 11:11:19 -07:00
page_reporting.h mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
page_vma_mapped.c mm/page_vma_mapped.c: add colon to fix kernel-doc markups error for check_pte 2020-12-15 12:13:41 -08:00
pagewalk.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
percpu-internal.h mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-km.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-stats.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu-vm.c mm: memcg/percpu: account percpu memory to memory cgroups 2020-08-12 10:57:55 -07:00
percpu.c percpu: fix clang modpost section mismatch 2021-02-14 18:15:15 +00:00
pgalloc-track.h mm: move p?d_alloc_track to separate header file 2020-08-07 11:33:26 -07:00
pgtable-generic.c mm: introduce include/linux/pgtable.h 2020-06-09 09:39:13 -07:00
process_vm_access.c mm/process_vm_access.c: include compat.h 2021-01-12 18:12:54 -08:00
ptdump.c kasan, arm64: expand CONFIG_KASAN checks 2020-12-22 12:55:08 -08:00
readahead.c mm: use limited read-ahead to satisfy read 2020-10-17 13:49:08 -06:00
rmap.c mm/lru: revise the comments of lru_lock 2020-12-15 14:48:04 -08:00
rodata_test.c mm/rodata_test.c: fix missing function declaration 2020-08-21 09:52:53 -07:00
shmem.c idmapped-mounts-v5.12 2021-02-23 13:39:45 -08:00
shuffle.c mm: rename page_order() to buddy_order() 2020-10-16 11:11:19 -07:00
shuffle.h mm/shuffle: remove dynamic reconfiguration 2020-08-07 11:33:29 -07:00
slab.c mm: memcg/slab: pre-allocate obj_cgroups for slab caches with SLAB_ACCOUNT 2021-02-24 13:38:29 -08:00
slab.h mm: memcg/slab: pre-allocate obj_cgroups for slab caches with SLAB_ACCOUNT 2021-02-24 13:38:29 -08:00
slab_common.c mm, slab, slub: stop taking cpu hotplug lock 2021-02-24 13:38:27 -08:00
slob.c mm, tracing: record slab name for kmem_cache_free() 2021-02-24 13:38:26 -08:00
slub.c mm: memcg/slab: pre-allocate obj_cgroups for slab caches with SLAB_ACCOUNT 2021-02-24 13:38:29 -08:00
sparse-vmemmap.c mm/sparse: only sub-section aligned range would be populated 2020-08-07 11:33:27 -07:00
sparse.c mm/memory_hotplug: guard more declarations by CONFIG_MEMORY_HOTPLUG 2020-10-16 11:11:18 -07:00
swap.c mm/lru: introduce relock_page_lruvec() 2020-12-15 14:48:04 -08:00
swap_cgroup.c mm: memcontrol: make swap tracking an integral part of memory control 2020-06-03 20:09:48 -07:00
swap_slots.c mm/swap_slots.c: remove redundant NULL check 2021-02-24 13:38:28 -08:00
swap_state.c mm/swap: don't SetPageWorkingset unconditionally during swapin 2021-02-24 13:38:29 -08:00
swapfile.c mm/swapfile.c: fix debugging information problem 2021-02-24 13:38:28 -08:00
truncate.c mm: fix kernel-doc markups 2020-12-15 12:13:47 -08:00
usercopy.c mm/usercopy.c: delete duplicated word 2020-08-12 10:57:58 -07:00
userfaultfd.c mm/vmscan: protect the workingset on anonymous LRU 2020-08-12 10:57:55 -07:00
util.c mm: Make mem_dump_obj() handle vmalloc() memory 2021-01-22 15:24:04 -08:00
vmacache.c kernel: better document the use_mm/unuse_mm API contract 2020-06-10 19:14:18 -07:00
vmalloc.c Merge branch 'for-mingo-rcu' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu 2021-02-12 12:56:55 +01:00
vmpressure.c mm: vmpressure: use mem_cgroup_is_root API 2020-04-02 09:35:31 -07:00
vmscan.c mm: don't put pinned pages into the swap cache 2021-01-17 12:08:04 -08:00
vmstat.c arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL 2020-12-15 12:13:42 -08:00
workingset.c Merge branch 'akpm' (patches from Andrew) 2020-12-15 14:55:10 -08:00
z3fold.c z3fold: remove preempt disabled sections for RT 2020-12-15 12:13:45 -08:00
zbud.c mm/zbud: remove redundant initialization 2020-10-13 18:38:34 -07:00
zpool.c mm/zpool.c: delete duplicated word and fix grammar 2020-08-12 10:57:58 -07:00
zsmalloc.c mm/zsmalloc.c: rework the list_add code in insert_zspage() 2020-12-15 12:13:46 -08:00
zswap.c mm/zswap: move to use crypto_acomp API for hardware acceleration 2020-12-15 12:13:46 -08:00