Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
commit 14726903c8
@@ -0,0 +1,24 @@
What:		/sys/kernel/mm/numa/
Date:		June 2021
Contact:	Linux memory management mailing list <linux-mm@kvack.org>
Description:	Interface for NUMA

What:		/sys/kernel/mm/numa/demotion_enabled
Date:		June 2021
Contact:	Linux memory management mailing list <linux-mm@kvack.org>
Description:	Enable/disable demoting pages during reclaim

		Page migration during reclaim is intended for systems
		with tiered memory configurations. These systems have
		multiple types of memory with varied performance
		characteristics instead of plain NUMA systems where
		the same kind of memory is found at varied distances.
		Allowing page migration during reclaim enables these
		systems to migrate pages from fast tiers to slow tiers
		when the fast tier is under pressure. This migration
		is performed before swap. It may move data to a NUMA
		node that does not fall into the cpuset of the
		allocating process which might be construed to violate
		the guarantees of cpusets. This should not be enabled
		on systems which need strict cpuset location
		guarantees.
@@ -245,6 +245,13 @@ MPOL_INTERLEAVED
	address range or file. During system boot up, the temporary
	interleaved system default policy works in this mode.

MPOL_PREFERRED_MANY
	This mode specifies that the allocation should be preferably
	satisfied from the nodemask specified in the policy. If there is
	memory pressure on all nodes in the nodemask, the allocation
	can fall back to all existing numa nodes. This is effectively
	MPOL_PREFERRED allowed for a mask rather than a single node.

NUMA memory policy supports the following optional mode flags:

MPOL_F_STATIC_NODES
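For readers who want to see the new mode in use, here is a minimal user-space sketch that selects MPOL_PREFERRED_MANY through the raw set_mempolicy(2) syscall. The fallback #define (value 5) is an assumption for libc headers that predate this series, and the maxnode convention follows the existing manual page.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MPOL_PREFERRED_MANY
#define MPOL_PREFERRED_MANY 5	/* assumption: value of the mode added by this series */
#endif

int main(void)
{
	/* Prefer nodes 0 and 2; unlike MPOL_BIND the kernel may still
	 * fall back to any other node under memory pressure. */
	unsigned long nodemask = (1UL << 0) | (1UL << 2);

	if (syscall(SYS_set_mempolicy, MPOL_PREFERRED_MANY, &nodemask,
		    8 * sizeof(nodemask) + 1) == -1) {
		perror("set_mempolicy");
		return 1;
	}

	/* Later anonymous allocations are preferably satisfied from the mask. */
	return malloc(1 << 20) ? 0 : 1;
}
```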
@@ -253,10 +260,10 @@ MPOL_F_STATIC_NODES
	nodes changes after the memory policy has been defined.

	Without this flag, any time a mempolicy is rebound because of a
	change in the set of allowed nodes, the node (Preferred) or
	nodemask (Bind, Interleave) is remapped to the new set of
	allowed nodes. This may result in nodes being used that were
	previously undesired.
	change in the set of allowed nodes, the preferred nodemask (Preferred
	Many), preferred node (Preferred) or nodemask (Bind, Interleave) is
	remapped to the new set of allowed nodes. This may result in nodes
	being used that were previously undesired.

	With this flag, if the user-specified nodes overlap with the
	nodes allowed by the task's cpuset, then the memory policy is
@@ -118,7 +118,8 @@ compaction_proactiveness

This tunable takes a value in the range [0, 100] with a default value of
20. This tunable determines how aggressively compaction is done in the
background. Setting it to 0 disables proactive compaction.
background. Writing a non-zero value to this tunable will immediately
trigger proactive compaction. Setting it to 0 disables proactive compaction.

Note that compaction has a non-trivial system-wide impact as pages
belonging to different processes are moved around, which could also lead
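As a small illustration of the write-triggers-compaction behaviour documented above, a program (or an init script) only needs to write an integer to the sysctl file. A hedged sketch, using the documented /proc path and nothing beyond perror() for error handling:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write a new proactiveness value; per the text above, any non-zero
 * write also triggers a round of proactive compaction immediately. */
int main(void)
{
	int fd = open("/proc/sys/vm/compaction_proactiveness", O_WRONLY);

	if (fd < 0 || dprintf(fd, "20\n") < 0) {
		perror("compaction_proactiveness");
		return 1;
	}
	close(fd);
	return 0;
}
```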
@@ -271,10 +271,15 @@ maps this page at its virtual address.

``void flush_dcache_page(struct page *page)``

	Any time the kernel writes to a page cache page, _OR_
	the kernel is about to read from a page cache page and
	user space shared/writable mappings of this page potentially
	exist, this routine is called.
	This routine must be called when:

	  a) the kernel did write to a page that is in the page cache page
	     and / or in high memory
	  b) the kernel is about to read from a page cache page and user space
	     shared/writable mappings of this page potentially exist. Note
	     that {get,pin}_user_pages{_fast} already call flush_dcache_page
	     on any page found in the user address space and thus driver
	     code rarely needs to take this into account.

.. note::
@@ -284,38 +289,34 @@ maps this page at its virtual address.
	handling vfs symlinks in the page cache need not call
	this interface at all.

	The phrase "kernel writes to a page cache page" means,
	specifically, that the kernel executes store instructions
	that dirty data in that page at the page->virtual mapping
	of that page. It is important to flush here to handle
	D-cache aliasing, to make sure these kernel stores are
	visible to user space mappings of that page.
	The phrase "kernel writes to a page cache page" means, specifically,
	that the kernel executes store instructions that dirty data in that
	page at the page->virtual mapping of that page. It is important to
	flush here to handle D-cache aliasing, to make sure these kernel stores
	are visible to user space mappings of that page.

	The corollary case is just as important, if there are users
	which have shared+writable mappings of this file, we must make
	sure that kernel reads of these pages will see the most recent
	stores done by the user.
	The corollary case is just as important, if there are users which have
	shared+writable mappings of this file, we must make sure that kernel
	reads of these pages will see the most recent stores done by the user.

	If D-cache aliasing is not an issue, this routine may
	simply be defined as a nop on that architecture.
	If D-cache aliasing is not an issue, this routine may simply be defined
	as a nop on that architecture.

	There is a bit set aside in page->flags (PG_arch_1) as
	"architecture private". The kernel guarantees that,
	for pagecache pages, it will clear this bit when such
	a page first enters the pagecache.
	There is a bit set aside in page->flags (PG_arch_1) as "architecture
	private". The kernel guarantees that, for pagecache pages, it will
	clear this bit when such a page first enters the pagecache.

	This allows these interfaces to be implemented much more
	efficiently. It allows one to "defer" (perhaps indefinitely)
	the actual flush if there are currently no user processes
	mapping this page. See sparc64's flush_dcache_page and
	update_mmu_cache implementations for an example of how to go
	about doing this.
	This allows these interfaces to be implemented much more efficiently.
	It allows one to "defer" (perhaps indefinitely) the actual flush if
	there are currently no user processes mapping this page. See sparc64's
	flush_dcache_page and update_mmu_cache implementations for an example
	of how to go about doing this.

	The idea is, first at flush_dcache_page() time, if
	page->mapping->i_mmap is an empty tree, just mark the architecture
	private page flag bit. Later, in update_mmu_cache(), a check is
	made of this flag bit, and if set the flush is done and the flag
	bit is cleared.
	The idea is, first at flush_dcache_page() time, if page_file_mapping()
	returns a mapping, and mapping_mapped on that mapping returns %false,
	just mark the architecture private page flag bit. Later, in
	update_mmu_cache(), a check is made of this flag bit, and if set the
	flush is done and the flag bit is cleared.

.. important::
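A condensed sketch of the deferred-flush scheme the paragraph above describes, written as hypothetical architecture code: the __arch_flush_dcache() helper and the omitted SMP/highmem handling are assumptions, and sparc64's real implementation is the reference the text points to.

```c
#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/pagemap.h>

/* Hypothetical arch-internal primitive that actually cleans the D-cache. */
static void __arch_flush_dcache(struct page *page);

void flush_dcache_page(struct page *page)
{
	struct address_space *mapping = page_file_mapping(page);

	/* No user-space mappings yet: just record that a flush is owed. */
	if (mapping && !mapping_mapped(mapping)) {
		set_bit(PG_arch_1, &page->flags);
		return;
	}

	__arch_flush_dcache(page);
}

void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
		      pte_t *ptep)
{
	struct page *page = pte_page(*ptep);

	/* First user mapping of a deferred page: perform the flush now. */
	if (test_and_clear_bit(PG_arch_1, &page->flags))
		__arch_flush_dcache(page);
}
```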
@@ -351,19 +352,6 @@ maps this page at its virtual address.
	architectures). For incoherent architectures, it should flush
	the cache of the page at vmaddr.

``void flush_kernel_dcache_page(struct page *page)``

	When the kernel needs to modify a user page is has obtained
	with kmap, it calls this function after all modifications are
	complete (but before kunmapping it) to bring the underlying
	page up to date. It is assumed here that the user has no
	incoherent cached copies (i.e. the original page was obtained
	from a mechanism like get_user_pages()). The default
	implementation is a nop and should remain so on all coherent
	architectures. On incoherent architectures, this should flush
	the kernel cache for page (using page_address(page)).


``void flush_icache_range(unsigned long start, unsigned long end)``

	When the kernel stores into addresses that it will execute
@@ -181,9 +181,16 @@ By default, KASAN prints a bug report only for the first invalid memory access.
With ``kasan_multi_shot``, KASAN prints a report on every invalid access. This
effectively disables ``panic_on_warn`` for KASAN reports.

Alternatively, independent of ``panic_on_warn`` the ``kasan.fault=`` boot
parameter can be used to control panic and reporting behaviour:

- ``kasan.fault=report`` or ``=panic`` controls whether to only print a KASAN
  report or also panic the kernel (default: ``report``). The panic happens even
  if ``kasan_multi_shot`` is enabled.

Hardware tag-based KASAN mode (see the section about various modes below) is
intended for use in production as a security mitigation. Therefore, it supports
boot parameters that allow disabling KASAN or controlling its features.
additional boot parameters that allow disabling KASAN or controlling features:

- ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``).
@@ -199,10 +206,6 @@ boot parameters that allow disabling KASAN or controlling its features.
- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
  traces collection (default: ``on``).

- ``kasan.fault=report`` or ``=panic`` controls whether to only print a KASAN
  report or also panic the kernel (default: ``report``). The panic happens even
  if ``kasan_multi_shot`` is enabled.

Implementation details
----------------------
@@ -298,15 +298,6 @@ HyperSparc cpu就是这样一个具有这种属性的cpu。
	用。默认的实现是nop(对于所有相干的架构应该保持这样)。对于不一致性
	的架构,它应该刷新vmaddr处的页面缓存。

``void flush_kernel_dcache_page(struct page *page)``

	当内核需要修改一个用kmap获得的用户页时,它会在所有修改完成后(但在
	kunmapping之前)调用这个函数,以使底层页面达到最新状态。这里假定用
	户没有不一致性的缓存副本(即原始页面是从类似get_user_pages()的机制
	中获得的)。默认的实现是一个nop,在所有相干的架构上都应该如此。在不
	一致性的架构上,这应该刷新内核缓存中的页面(使用page_address(page))。


``void flush_icache_range(unsigned long start, unsigned long end)``

	当内核存储到它将执行的地址中时(例如在加载模块时),这个函数被调用。
@@ -180,7 +180,6 @@ Limitations
===========
- Not all page types are supported and never will. Most kernel internal
  objects cannot be recovered, only LRU pages for now.
- Right now hugepage support is missing.

---
Andi Kleen, Oct 2009
@@ -486,3 +486,5 @@
554	common	landlock_create_ruleset		sys_landlock_create_ruleset
555	common	landlock_add_rule		sys_landlock_add_rule
556	common	landlock_restrict_self		sys_landlock_restrict_self
# 557 reserved for memfd_secret
558	common	process_mrelease		sys_process_mrelease
@@ -291,6 +291,7 @@ extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
extern void flush_dcache_page(struct page *);

#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
static inline void flush_kernel_vmap_range(void *addr, int size)
{
	if ((cache_is_vivt() || cache_is_vipt_aliasing()))

@@ -312,9 +313,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
	__flush_anon_page(vma, page, vmaddr);
}

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
extern void flush_kernel_dcache_page(struct page *);

#define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
#define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)
@@ -1012,31 +1012,25 @@ static void __init reserve_crashkernel(void)
		unsigned long long lowmem_max = __pa(high_memory - 1) + 1;
		if (crash_max > lowmem_max)
			crash_max = lowmem_max;
		crash_base = memblock_find_in_range(CRASH_ALIGN, crash_max,
						    crash_size, CRASH_ALIGN);

		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
						       CRASH_ALIGN, crash_max);
		if (!crash_base) {
			pr_err("crashkernel reservation failed - No suitable area found.\n");
			return;
		}
	} else {
		unsigned long long crash_max = crash_base + crash_size;
		unsigned long long start;

		start = memblock_find_in_range(crash_base,
					       crash_base + crash_size,
					       crash_size, SECTION_SIZE);
		if (start != crash_base) {
		start = memblock_phys_alloc_range(crash_size, SECTION_SIZE,
						  crash_base, crash_max);
		if (!start) {
			pr_err("crashkernel reservation failed - memory is in use.\n");
			return;
		}
	}

	ret = memblock_reserve(crash_base, crash_size);
	if (ret < 0) {
		pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n",
			(unsigned long)crash_base);
		return;
	}

	pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
		(unsigned long)(crash_size >> 20),
		(unsigned long)(crash_base >> 20),
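The crashkernel hunk above is one instance of a conversion that recurs throughout this series: memblock_find_in_range() becomes private to memblock, so the former find-then-reserve pair is collapsed into a single memblock_phys_alloc_range() call that returns 0 on failure. A side-by-side sketch of the two patterns (the old helper is removed from the public API by this series, so the first function is illustrative only; the names are placeholders):

```c
#include <linux/memblock.h>

/* Old two-step pattern: find a candidate range, then reserve it. */
static phys_addr_t __init reserve_old(phys_addr_t size, phys_addr_t align,
				      phys_addr_t start, phys_addr_t end)
{
	phys_addr_t base = memblock_find_in_range(start, end, size, align);

	if (base)
		memblock_reserve(base, size);
	return base;
}

/* New pattern: one call both finds and reserves the range. */
static phys_addr_t __init reserve_new(phys_addr_t size, phys_addr_t align,
				      phys_addr_t start, phys_addr_t end)
{
	return memblock_phys_alloc_range(size, align, start, end);
}
```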
@@ -345,39 +345,6 @@ void flush_dcache_page(struct page *page)
}
EXPORT_SYMBOL(flush_dcache_page);

/*
 * Ensure cache coherency for the kernel mapping of this page. We can
 * assume that the page is pinned via kmap.
 *
 * If the page only exists in the page cache and there are no user
 * space mappings, this is a no-op since the page was already marked
 * dirty at creation. Otherwise, we need to flush the dirty kernel
 * cache lines directly.
 */
void flush_kernel_dcache_page(struct page *page)
{
	if (cache_is_vivt() || cache_is_vipt_aliasing()) {
		struct address_space *mapping;

		mapping = page_mapping_file(page);

		if (!mapping || mapping_mapped(mapping)) {
			void *addr;

			addr = page_address(page);
			/*
			 * kmap_atomic() doesn't set the page virtual
			 * address for highmem pages, and
			 * kunmap_atomic() takes care of cache
			 * flushing already.
			 */
			if (!IS_ENABLED(CONFIG_HIGHMEM) || addr)
				__cpuc_flush_dcache_area(addr, PAGE_SIZE);
		}
	}
}
EXPORT_SYMBOL(flush_kernel_dcache_page);

/*
 * Flush an anonymous page so that users of get_user_pages()
 * can safely access the data. The expected sequence is:
@@ -166,12 +166,6 @@ void flush_dcache_page(struct page *page)
}
EXPORT_SYMBOL(flush_dcache_page);

void flush_kernel_dcache_page(struct page *page)
{
	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
}
EXPORT_SYMBOL(flush_kernel_dcache_page);

void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
		       unsigned long uaddr, void *dst, const void *src,
		       unsigned long len)
@@ -460,3 +460,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -38,7 +38,7 @@
#define __ARM_NR_compat_set_tls	(__ARM_NR_COMPAT_BASE + 5)
#define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)

#define __NR_compat_syscalls		447
#define __NR_compat_syscalls		449
#endif

#define __ARCH_WANT_SYS_CLONE
@@ -901,6 +901,8 @@ __SYSCALL(__NR_landlock_create_ruleset, sys_landlock_create_ruleset)
__SYSCALL(__NR_landlock_add_rule, sys_landlock_add_rule)
#define __NR_landlock_restrict_self 446
__SYSCALL(__NR_landlock_restrict_self, sys_landlock_restrict_self)
#define __NR_process_mrelease 448
__SYSCALL(__NR_process_mrelease, sys_process_mrelease)

/*
 * Please add new compat syscalls above this comment and update
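The process_mrelease() syscall wired up in the tables above takes a pidfd plus a flags argument (currently 0) and reaps the memory of a process that already has a pending SIGKILL. A hedged user-space sketch follows; the syscall numbers are only defined if the installed headers lack them, and 448/434 are the generic/x86-64 values used above, so treat them as assumptions on other ABIs.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_pidfd_open
#define __NR_pidfd_open 434
#endif
#ifndef __NR_process_mrelease
#define __NR_process_mrelease 448	/* assumption: generic/x86-64 number */
#endif

/* Kill a victim and immediately release its memory instead of waiting
 * for the exiting task to tear it down itself (what a user-space OOM
 * killer would want to do). */
static int kill_and_reap(pid_t pid)
{
	int pidfd = syscall(__NR_pidfd_open, pid, 0);

	if (pidfd < 0)
		return -1;
	if (kill(pid, SIGKILL) == 0 &&
	    syscall(__NR_process_mrelease, pidfd, 0) == 0) {
		close(pidfd);
		return 0;
	}
	close(pidfd);
	return -1;
}

int main(int argc, char **argv)
{
	if (argc < 2)
		return 1;
	return kill_and_reap((pid_t)atoi(argv[1])) ? 1 : 0;
}
```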
@@ -92,12 +92,10 @@ void __init kvm_hyp_reserve(void)
	 * this is unmapped from the host stage-2, and fallback to PAGE_SIZE.
	 */
	hyp_mem_size = hyp_mem_pages << PAGE_SHIFT;
	hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(),
					      ALIGN(hyp_mem_size, PMD_SIZE),
					      PMD_SIZE);
	hyp_mem_base = memblock_phys_alloc(ALIGN(hyp_mem_size, PMD_SIZE),
					   PMD_SIZE);
	if (!hyp_mem_base)
		hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(),
						      hyp_mem_size, PAGE_SIZE);
		hyp_mem_base = memblock_phys_alloc(hyp_mem_size, PAGE_SIZE);
	else
		hyp_mem_size = ALIGN(hyp_mem_size, PMD_SIZE);

@@ -105,7 +103,6 @@ void __init kvm_hyp_reserve(void)
		kvm_err("Failed to reserve hyp memory\n");
		return;
	}
	memblock_reserve(hyp_mem_base, hyp_mem_size);

	kvm_info("Reserved %lld MiB at 0x%llx\n", hyp_mem_size >> 20,
		 hyp_mem_base);
@@ -74,6 +74,7 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
static void __init reserve_crashkernel(void)
{
	unsigned long long crash_base, crash_size;
	unsigned long long crash_max = arm64_dma_phys_limit;
	int ret;

	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),

@@ -84,33 +85,18 @@ static void __init reserve_crashkernel(void)

	crash_size = PAGE_ALIGN(crash_size);

	if (crash_base == 0) {
		/* Current arm64 boot protocol requires 2MB alignment */
		crash_base = memblock_find_in_range(0, arm64_dma_phys_limit,
						    crash_size, SZ_2M);
		if (crash_base == 0) {
			pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
				crash_size);
			return;
		}
	} else {
		/* User specifies base address explicitly. */
		if (!memblock_is_region_memory(crash_base, crash_size)) {
			pr_warn("cannot reserve crashkernel: region is not memory\n");
			return;
		}
	/* User specifies base address explicitly. */
	if (crash_base)
		crash_max = crash_base + crash_size;

		if (memblock_is_region_reserved(crash_base, crash_size)) {
			pr_warn("cannot reserve crashkernel: region overlaps reserved memory\n");
			return;
		}

		if (!IS_ALIGNED(crash_base, SZ_2M)) {
			pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
			return;
		}
	/* Current arm64 boot protocol requires 2MB alignment */
	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
					       crash_base, crash_max);
	if (!crash_base) {
		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
			crash_size);
		return;
	}
	memblock_reserve(crash_base, crash_size);

	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
		crash_base, crash_base + crash_size, crash_size >> 20);
@@ -56,17 +56,6 @@ void update_mmu_cache(struct vm_area_struct *vma, unsigned long addr,
	}
}

void flush_kernel_dcache_page(struct page *page)
{
	struct address_space *mapping;

	mapping = page_mapping_file(page);

	if (!mapping || mapping_mapped(mapping))
		dcache_wbinv_all();
}
EXPORT_SYMBOL(flush_kernel_dcache_page);

void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
		       unsigned long end)
{
@@ -14,12 +14,10 @@ extern void flush_dcache_page(struct page *);
#define flush_cache_page(vma, page, pfn)	cache_wbinv_all()
#define flush_cache_dup_mm(mm)			cache_wbinv_all()

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
extern void flush_kernel_dcache_page(struct page *);

#define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&mapping->i_pages)
#define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)

#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
static inline void flush_kernel_vmap_range(void *addr, int size)
{
	dcache_wbinv_all();
@@ -283,8 +283,7 @@ int __kprobes kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr)
		 * normal page fault.
		 */
		regs->pc = (unsigned long) cur->addr;
		if (!instruction_pointer(regs))
			BUG();
		BUG_ON(!instruction_pointer(regs));

		if (kcb->kprobe_status == KPROBE_REENTER)
			restore_previous_kprobe(kcb);
@@ -29,7 +29,6 @@ struct rsvd_region {
};

extern struct rsvd_region rsvd_region[IA64_MAX_RSVD_REGIONS + 1];
extern int num_rsvd_regions;

extern void find_memory (void);
extern void reserve_memory (void);

@@ -40,7 +39,6 @@ extern unsigned long efi_memmap_init(u64 *s, u64 *e);
extern int find_max_min_low_pfn (u64, u64, void *);

extern unsigned long vmcore_find_descriptor_size(unsigned long address);
extern int reserve_elfcorehdr(u64 *start, u64 *end);

/*
 * For rounding an address to the next IA64_GRANULE_SIZE or order
@@ -906,6 +906,6 @@ EXPORT_SYMBOL(acpi_unregister_ioapic);
/*
 * acpi_suspend_lowlevel() - save kernel state and suspend.
 *
 * TBD when when IA64 starts to support suspend...
 * TBD when IA64 starts to support suspend...
 */
int acpi_suspend_lowlevel(void) { return 0; }
@@ -131,7 +131,7 @@ unsigned long ia64_cache_stride_shift = ~0;
 * We use a special marker for the end of memory and it uses the extra (+1) slot
 */
struct rsvd_region rsvd_region[IA64_MAX_RSVD_REGIONS + 1] __initdata;
int num_rsvd_regions __initdata;
static int num_rsvd_regions __initdata;

/*
@@ -325,6 +325,31 @@ static inline void __init setup_crashkernel(unsigned long total, int *n)
{}
#endif

#ifdef CONFIG_CRASH_DUMP
static int __init reserve_elfcorehdr(u64 *start, u64 *end)
{
	u64 length;

	/* We get the address using the kernel command line,
	 * but the size is extracted from the EFI tables.
	 * Both address and size are required for reservation
	 * to work properly.
	 */

	if (!is_vmcore_usable())
		return -EINVAL;

	if ((length = vmcore_find_descriptor_size(elfcorehdr_addr)) == 0) {
		vmcore_unusable();
		return -EINVAL;
	}

	*start = (unsigned long)__va(elfcorehdr_addr);
	*end = *start + length;
	return 0;
}
#endif /* CONFIG_CRASH_DUMP */

/**
 * reserve_memory - setup reserved memory areas
 *

@@ -522,32 +547,6 @@ static __init int setup_nomca(char *s)
}
early_param("nomca", setup_nomca);

#ifdef CONFIG_CRASH_DUMP
int __init reserve_elfcorehdr(u64 *start, u64 *end)
{
	u64 length;

	/* We get the address using the kernel command line,
	 * but the size is extracted from the EFI tables.
	 * Both address and size are required for reservation
	 * to work properly.
	 */

	if (!is_vmcore_usable())
		return -EINVAL;

	if ((length = vmcore_find_descriptor_size(elfcorehdr_addr)) == 0) {
		vmcore_unusable();
		return -EINVAL;
	}

	*start = (unsigned long)__va(elfcorehdr_addr);
	*end = *start + length;
	return 0;
}

#endif /* CONFIG_PROC_VMCORE */

void __init
setup_arch (char **cmdline_p)
{
@@ -367,3 +367,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -446,3 +446,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -112,8 +112,7 @@ extern int page_is_ram(unsigned long pfn);
# define page_to_phys(page)	(page_to_pfn(page) << PAGE_SHIFT)

# define ARCH_PFN_OFFSET	(memory_start >> PAGE_SHIFT)
# define pfn_valid(pfn)		((pfn) < (max_mapnr + ARCH_PFN_OFFSET))

# define pfn_valid(pfn)		((pfn) >= ARCH_PFN_OFFSET && (pfn) < (max_mapnr + ARCH_PFN_OFFSET))
# endif /* __ASSEMBLY__ */

#define virt_addr_valid(vaddr)	(pfn_valid(virt_to_pfn(vaddr)))
@@ -443,8 +443,6 @@ extern int mem_init_done;

asmlinkage void __init mmu_init(void);

void __init *early_get_page(void);

#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
@@ -452,3 +452,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -265,18 +265,6 @@ asmlinkage void __init mmu_init(void)
	dma_contiguous_reserve(memory_start + lowmem_size - 1);
}

/* This is only called until mem_init is done. */
void __init *early_get_page(void)
{
	/*
	 * Mem start + kernel_tlb -> here is limit
	 * because of mem mapping from head.S
	 */
	return memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
				MEMBLOCK_LOW_LIMIT, memory_start + kernel_tlb,
				NUMA_NO_NODE);
}

void * __ref zalloc_maybe_bootmem(size_t size, gfp_t mask)
{
	void *p;
@@ -33,6 +33,7 @@
#include <linux/init.h>
#include <linux/mm_types.h>
#include <linux/pgtable.h>
#include <linux/memblock.h>

#include <asm/pgalloc.h>
#include <linux/io.h>

@@ -242,15 +243,13 @@ unsigned long iopa(unsigned long addr)

__ref pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
{
	pte_t *pte;
	if (mem_init_done) {
		pte = (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
	} else {
		pte = (pte_t *)early_get_page();
		if (pte)
			clear_page(pte);
	}
	return pte;
	if (mem_init_done)
		return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
	else
		return memblock_alloc_try_nid(PAGE_SIZE, PAGE_SIZE,
				MEMBLOCK_LOW_LIMIT,
				memory_start + kernel_tlb,
				NUMA_NO_NODE);
}

void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t flags)
@@ -125,13 +125,7 @@ static inline void kunmap_noncoherent(void)
	kunmap_coherent();
}

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
static inline void flush_kernel_dcache_page(struct page *page)
{
	BUG_ON(cpu_has_dc_aliases && PageHighMem(page));
	flush_dcache_page(page);
}
#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
/*
 * For now flush_kernel_vmap_range and invalidate_kernel_vmap_range both do a
 * cache writeback and invalidate operation.
@@ -452,8 +452,9 @@ static void __init mips_parse_crashkernel(void)
		return;

	if (crash_base <= 0) {
		crash_base = memblock_find_in_range(CRASH_ALIGN, CRASH_ADDR_MAX,
						    crash_size, CRASH_ALIGN);
		crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
						       CRASH_ALIGN,
						       CRASH_ADDR_MAX);
		if (!crash_base) {
			pr_warn("crashkernel reservation failed - No suitable area found.\n");
			return;

@@ -461,8 +462,9 @@ static void __init mips_parse_crashkernel(void)
	} else {
		unsigned long long start;

		start = memblock_find_in_range(crash_base, crash_base + crash_size,
					       crash_size, 1);
		start = memblock_phys_alloc_range(crash_size, 1,
						  crash_base,
						  crash_base + crash_size);
		if (start != crash_base) {
			pr_warn("Invalid memory region reserved for crash kernel\n");
			return;

@@ -656,10 +658,6 @@ static void __init arch_mem_init(char **cmdline_p)
	mips_reserve_vmcore();

	mips_parse_crashkernel();
#ifdef CONFIG_KEXEC
	if (crashk_res.start != crashk_res.end)
		memblock_reserve(crashk_res.start, resource_size(&crashk_res));
#endif
	device_tree_init();

	/*
@@ -385,3 +385,5 @@
444	n32	landlock_create_ruleset		sys_landlock_create_ruleset
445	n32	landlock_add_rule		sys_landlock_add_rule
446	n32	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	n32	process_mrelease		sys_process_mrelease

@@ -361,3 +361,5 @@
444	n64	landlock_create_ruleset		sys_landlock_create_ruleset
445	n64	landlock_add_rule		sys_landlock_add_rule
446	n64	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	n64	process_mrelease		sys_process_mrelease

@@ -434,3 +434,5 @@
444	o32	landlock_create_ruleset		sys_landlock_create_ruleset
445	o32	landlock_add_rule		sys_landlock_add_rule
446	o32	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	o32	process_mrelease		sys_process_mrelease
@@ -36,8 +36,7 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
void flush_anon_page(struct vm_area_struct *vma,
		     struct page *page, unsigned long vaddr);

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
void flush_kernel_dcache_page(struct page *page);
#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
void flush_kernel_vmap_range(void *addr, int size);
void invalidate_kernel_vmap_range(void *addr, int size);
#define flush_dcache_mmap_lock(mapping)		xa_lock_irq(&(mapping)->i_pages)
@@ -318,15 +318,6 @@ void flush_anon_page(struct vm_area_struct *vma,
	local_irq_restore(flags);
}

void flush_kernel_dcache_page(struct page *page)
{
	unsigned long flags;
	local_irq_save(flags);
	cpu_dcache_wbinval_page((unsigned long)page_address(page));
	local_irq_restore(flags);
}
EXPORT_SYMBOL(flush_kernel_dcache_page);

void flush_kernel_vmap_range(void *addr, int size)
{
	unsigned long flags;
@@ -36,16 +36,12 @@ void flush_cache_all_local(void);
void flush_cache_all(void);
void flush_cache_mm(struct mm_struct *mm);

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
void flush_kernel_dcache_page_addr(void *addr);
static inline void flush_kernel_dcache_page(struct page *page)
{
	flush_kernel_dcache_page_addr(page_address(page));
}

#define flush_kernel_dcache_range(start,size) \
	flush_kernel_dcache_range_asm((start), (start)+(size));

#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
void flush_kernel_vmap_range(void *vaddr, int size);
void invalidate_kernel_vmap_range(void *vaddr, int size);

@@ -59,7 +55,7 @@ extern void flush_dcache_page(struct page *page);
#define flush_dcache_mmap_unlock(mapping)	xa_unlock_irq(&mapping->i_pages)

#define flush_icache_page(vma,page)	do {		\
	flush_kernel_dcache_page(page);			\
	flush_kernel_dcache_page_addr(page_address(page)); \
	flush_kernel_icache_page(page_address(page));	\
} while (0)
@@ -334,7 +334,7 @@ void flush_dcache_page(struct page *page)
		return;
	}

	flush_kernel_dcache_page(page);
	flush_kernel_dcache_page_addr(page_address(page));

	if (!mapping)
		return;

@@ -375,7 +375,6 @@ EXPORT_SYMBOL(flush_dcache_page);

/* Defined in arch/parisc/kernel/pacache.S */
EXPORT_SYMBOL(flush_kernel_dcache_range_asm);
EXPORT_SYMBOL(flush_kernel_dcache_page_asm);
EXPORT_SYMBOL(flush_data_cache_local);
EXPORT_SYMBOL(flush_kernel_icache_range_asm);
@@ -444,3 +444,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease

@@ -526,3 +526,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -211,13 +211,11 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
static struct memory_block *lmb_to_memblock(struct drmem_lmb *lmb)
{
	unsigned long section_nr;
	struct mem_section *mem_sect;
	struct memory_block *mem_block;

	section_nr = pfn_to_section_nr(PFN_DOWN(lmb->base_addr));
	mem_sect = __nr_to_section(section_nr);

	mem_block = find_memory_block(mem_sect);
	mem_block = find_memory_block(section_nr);
	return mem_block;
}
@@ -819,38 +819,22 @@ static void __init reserve_crashkernel(void)

	crash_size = PAGE_ALIGN(crash_size);

	if (crash_base == 0) {
		/*
		 * Current riscv boot protocol requires 2MB alignment for
		 * RV64 and 4MB alignment for RV32 (hugepage size)
		 */
		crash_base = memblock_find_in_range(search_start, search_end,
						    crash_size, PMD_SIZE);

		if (crash_base == 0) {
			pr_warn("crashkernel: couldn't allocate %lldKB\n",
				crash_size >> 10);
			return;
		}
	} else {
		/* User specifies base address explicitly. */
		if (!memblock_is_region_memory(crash_base, crash_size)) {
			pr_warn("crashkernel: requested region is not memory\n");
			return;
		}

		if (memblock_is_region_reserved(crash_base, crash_size)) {
			pr_warn("crashkernel: requested region is reserved\n");
			return;
		}

		if (!IS_ALIGNED(crash_base, PMD_SIZE)) {
			pr_warn("crashkernel: requested region is misaligned\n");
			return;
		}
	if (crash_base) {
		search_start = crash_base;
		search_end = crash_base + crash_size;
	}

	/*
	 * Current riscv boot protocol requires 2MB alignment for
	 * RV64 and 4MB alignment for RV32 (hugepage size)
	 */
	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
					       search_start, search_end);
	if (crash_base == 0) {
		pr_warn("crashkernel: couldn't allocate %lldKB\n",
			crash_size >> 10);
		return;
	}
	memblock_reserve(crash_base, crash_size);

	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
		crash_base, crash_base + crash_size, crash_size >> 20);
@@ -677,8 +677,9 @@ static void __init reserve_crashkernel(void)
			return;
		}
		low = crash_base ?: low;
		crash_base = memblock_find_in_range(low, high, crash_size,
						    KEXEC_CRASH_MEM_ALIGN);
		crash_base = memblock_phys_alloc_range(crash_size,
						       KEXEC_CRASH_MEM_ALIGN,
						       low, high);
	}

	if (!crash_base) {

@@ -687,8 +688,10 @@ static void __init reserve_crashkernel(void)
		return;
	}

	if (register_memory_notifier(&kdump_mem_nb))
	if (register_memory_notifier(&kdump_mem_nb)) {
		memblock_free(crash_base, crash_size);
		return;
	}

	if (!oldmem_data.start && MACHINE_IS_VM)
		diag10_range(PFN_DOWN(crash_base), PFN_DOWN(crash_size));
@@ -449,3 +449,5 @@
444  common	landlock_create_ruleset	sys_landlock_create_ruleset	sys_landlock_create_ruleset
445  common	landlock_add_rule	sys_landlock_add_rule		sys_landlock_add_rule
446  common	landlock_restrict_self	sys_landlock_restrict_self	sys_landlock_restrict_self
# 447 reserved for memfd_secret
448  common	process_mrelease	sys_process_mrelease		sys_process_mrelease
@@ -822,7 +822,7 @@ void do_secure_storage_access(struct pt_regs *regs)
		break;
	case KERNEL_FAULT:
		page = phys_to_page(addr);
		if (unlikely(!try_get_page(page)))
		if (unlikely(!try_get_compound_head(page, 1)))
			break;
		rc = arch_make_page_accessible(page);
		put_page(page);
@@ -63,6 +63,8 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
	if (boot_cpu_data.dcache.n_aliases && PageAnon(page))
		__flush_anon_page(page, vmaddr);
}

#define ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE 1
static inline void flush_kernel_vmap_range(void *addr, int size)
{
	__flush_wback_region(addr, size);

@@ -72,12 +74,6 @@ static inline void invalidate_kernel_vmap_range(void *addr, int size)
	__flush_invalidate_region(addr, size);
}

#define ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
static inline void flush_kernel_dcache_page(struct page *page)
{
	flush_dcache_page(page);
}

extern void copy_to_user_page(struct vm_area_struct *vma,
	struct page *page, unsigned long vaddr, void *dst, const void *src,
	unsigned long len);
@@ -449,3 +449,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -492,3 +492,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -452,3 +452,4 @@
445	i386	landlock_add_rule	sys_landlock_add_rule
446	i386	landlock_restrict_self	sys_landlock_restrict_self
447	i386	memfd_secret		sys_memfd_secret
448	i386	process_mrelease	sys_process_mrelease
@@ -369,6 +369,7 @@
445	common	landlock_add_rule	sys_landlock_add_rule
446	common	landlock_restrict_self	sys_landlock_restrict_self
447	common	memfd_secret		sys_memfd_secret
448	common	process_mrelease	sys_process_mrelease

#
# Due to a historical design error, certain syscalls are numbered differently
@@ -109,14 +109,13 @@ static u32 __init allocate_aperture(void)
	 * memory. Unfortunately we cannot move it up because that would
	 * make the IOMMU useless.
	 */
	addr = memblock_find_in_range(GART_MIN_ADDR, GART_MAX_ADDR,
				      aper_size, aper_size);
	addr = memblock_phys_alloc_range(aper_size, aper_size,
					 GART_MIN_ADDR, GART_MAX_ADDR);
	if (!addr) {
		pr_err("Cannot allocate aperture memory hole [mem %#010lx-%#010lx] (%uKB)\n",
		       addr, addr + aper_size - 1, aper_size >> 10);
		return 0;
	}
	memblock_reserve(addr, aper_size);
	pr_info("Mapping aperture over RAM [mem %#010lx-%#010lx] (%uKB)\n",
		addr, addr + aper_size - 1, aper_size >> 10);
	register_nosave_region(addr >> PAGE_SHIFT,
@@ -154,7 +154,7 @@ static struct ldt_struct *alloc_ldt_struct(unsigned int num_entries)
	if (num_entries > LDT_ENTRIES)
		return NULL;

	new_ldt = kmalloc(sizeof(struct ldt_struct), GFP_KERNEL);
	new_ldt = kmalloc(sizeof(struct ldt_struct), GFP_KERNEL_ACCOUNT);
	if (!new_ldt)
		return NULL;

@@ -168,9 +168,9 @@ static struct ldt_struct *alloc_ldt_struct(unsigned int num_entries)
	 * than PAGE_SIZE.
	 */
	if (alloc_size > PAGE_SIZE)
		new_ldt->entries = vzalloc(alloc_size);
		new_ldt->entries = __vmalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_ZERO);
	else
		new_ldt->entries = (void *)get_zeroed_page(GFP_KERNEL);
		new_ldt->entries = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);

	if (!new_ldt->entries) {
		kfree(new_ldt);
@@ -127,14 +127,12 @@ __ref void *alloc_low_pages(unsigned int num)
		unsigned long ret = 0;

		if (min_pfn_mapped < max_pfn_mapped) {
			ret = memblock_find_in_range(
			ret = memblock_phys_alloc_range(
					PAGE_SIZE * num, PAGE_SIZE,
					min_pfn_mapped << PAGE_SHIFT,
					max_pfn_mapped << PAGE_SHIFT,
					PAGE_SIZE * num , PAGE_SIZE);
					max_pfn_mapped << PAGE_SHIFT);
		}
		if (ret)
			memblock_reserve(ret, PAGE_SIZE * num);
		else if (can_use_brk_pgt)
		if (!ret && can_use_brk_pgt)
			ret = __pa(extend_brk(PAGE_SIZE * num, PAGE_SIZE));

		if (!ret)

@@ -610,8 +608,17 @@ static void __init memory_map_top_down(unsigned long map_start,
	unsigned long addr;
	unsigned long mapped_ram_size = 0;

	/* xen has big range in reserved near end of ram, skip it at first.*/
	addr = memblock_find_in_range(map_start, map_end, PMD_SIZE, PMD_SIZE);
	/*
	 * Systems that have many reserved areas near top of the memory,
	 * e.g. QEMU with less than 1G RAM and EFI enabled, or Xen, will
	 * require lots of 4K mappings which may exhaust pgt_buf.
	 * Start with top-most PMD_SIZE range aligned at PMD_SIZE to ensure
	 * there is enough mapped memory that can be allocated from
	 * memblock.
	 */
	addr = memblock_phys_alloc_range(PMD_SIZE, PMD_SIZE, map_start,
					 map_end);
	memblock_free(addr, PMD_SIZE);
	real_end = addr + PMD_SIZE;

	/* step_size need to be small so pgt_buf from BRK could cover it */
@@ -376,15 +376,14 @@ static int __init numa_alloc_distance(void)
		cnt++;
	size = cnt * cnt * sizeof(numa_distance[0]);

	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
				      size, PAGE_SIZE);
	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0,
					 PFN_PHYS(max_pfn_mapped));
	if (!phys) {
		pr_warn("Warning: can't allocate distance table!\n");
		/* don't retry until explicitly reset */
		numa_distance = (void *)1LU;
		return -ENOMEM;
	}
	memblock_reserve(phys, size);

	numa_distance = __va(phys);
	numa_distance_cnt = cnt;
@@ -447,13 +447,12 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
	if (numa_dist_cnt) {
		u64 phys;

		phys = memblock_find_in_range(0, PFN_PHYS(max_pfn_mapped),
					      phys_size, PAGE_SIZE);
		phys = memblock_phys_alloc_range(phys_size, PAGE_SIZE, 0,
						 PFN_PHYS(max_pfn_mapped));
		if (!phys) {
			pr_warn("NUMA: Warning: can't allocate copy of distance table, disabling emulation\n");
			goto no_emu;
		}
		memblock_reserve(phys, phys_size);
		phys_dist = __va(phys);

		for (i = 0; i < numa_dist_cnt; i++)
@@ -28,7 +28,7 @@ void __init reserve_real_mode(void)
	WARN_ON(slab_is_available());

	/* Has to be under 1M so we can execute real-mode AP code. */
	mem = memblock_find_in_range(0, 1<<20, size, PAGE_SIZE);
	mem = memblock_phys_alloc_range(size, PAGE_SIZE, 0, 1<<20);
	if (!mem)
		pr_info("No sub-1M memory is available for the trampoline\n");
	else
@@ -417,3 +417,5 @@
444	common	landlock_create_ruleset		sys_landlock_create_ruleset
445	common	landlock_add_rule		sys_landlock_add_rule
446	common	landlock_restrict_self		sys_landlock_restrict_self
# 447 reserved for memfd_secret
448	common	process_mrelease		sys_process_mrelease
@@ -309,7 +309,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,

static void bio_invalidate_vmalloc_pages(struct bio *bio)
{
#ifdef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
#ifdef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
	if (bio->bi_private && !op_is_write(bio_op(bio))) {
		unsigned long i, len = 0;
@@ -583,8 +583,8 @@ void __init acpi_table_upgrade(void)
	}

	acpi_tables_addr =
		memblock_find_in_range(0, ACPI_TABLE_UPGRADE_MAX_PHYS,
				       all_tables_size, PAGE_SIZE);
		memblock_phys_alloc_range(all_tables_size, PAGE_SIZE,
					  0, ACPI_TABLE_UPGRADE_MAX_PHYS);
	if (!acpi_tables_addr) {
		WARN_ON(1);
		return;

@@ -599,7 +599,6 @@ void __init acpi_table_upgrade(void)
	 * Both memblock_reserve and e820__range_add (via arch_reserve_mem_area)
	 * works fine.
	 */
	memblock_reserve(acpi_tables_addr, all_tables_size);
	arch_reserve_mem_area(acpi_tables_addr, all_tables_size);

	/*
@@ -279,13 +279,10 @@ static int __init numa_alloc_distance(void)
	int i, j;

	size = nr_node_ids * nr_node_ids * sizeof(numa_distance[0]);
	phys = memblock_find_in_range(0, PFN_PHYS(max_pfn),
				      size, PAGE_SIZE);
	phys = memblock_phys_alloc_range(size, PAGE_SIZE, 0, PFN_PHYS(max_pfn));
	if (WARN_ON(!phys))
		return -ENOMEM;

	memblock_reserve(phys, size);

	numa_distance = __va(phys);
	numa_distance_cnt = nr_node_ids;
@@ -578,9 +578,9 @@ static struct memory_block *find_memory_block_by_id(unsigned long block_id)
/*
 * Called under device_hotplug_lock.
 */
struct memory_block *find_memory_block(struct mem_section *section)
struct memory_block *find_memory_block(unsigned long section_nr)
{
	unsigned long block_id = memory_block_id(__section_nr(section));
	unsigned long block_id = memory_block_id(section_nr);

	return find_memory_block_by_id(block_id);
}
@@ -578,10 +578,6 @@ static bool jz4740_mmc_read_data(struct jz4740_mmc_host *host,
			}
		}
		data->bytes_xfered += miter->length;

		/* This can go away once MIPS implements
		 * flush_kernel_dcache_page */
		flush_dcache_page(miter->page);
	}
	sg_miter_stop(miter);
@@ -941,7 +941,7 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct mmc_command *cmd,

		/* discard mappings */
		if (direction == DMA_FROM_DEVICE)
			flush_kernel_dcache_page(sg_page(sg));
			flush_dcache_page(sg_page(sg));
		kunmap(sg_page(sg));
		if (dma_dev)
			dma_unmap_page(dma_dev, dma_addr, PAGE_SIZE, dir);
@@ -33,18 +33,22 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
	phys_addr_t *res_base)
{
	phys_addr_t base;
	int err = 0;

	end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end;
	align = !align ? SMP_CACHE_BYTES : align;
	base = memblock_find_in_range(start, end, size, align);
	base = memblock_phys_alloc_range(size, align, start, end);
	if (!base)
		return -ENOMEM;

	*res_base = base;
	if (nomap)
		return memblock_mark_nomap(base, size);
	if (nomap) {
		err = memblock_mark_nomap(base, size);
		if (err)
			memblock_free(base, size);
	}

	return memblock_reserve(base, size);
	return err;
}

/*
@@ -3,6 +3,7 @@
 * Implement the manual drop-all-pagecache function
 */

#include <linux/pagemap.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/fs.h>

@@ -27,7 +28,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
		 * we need to reschedule to avoid softlockups.
		 */
		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
		    (mapping_empty(inode->i_mapping) && !need_resched())) {
			spin_unlock(&inode->i_lock);
			continue;
		}
@@ -217,8 +217,10 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
	 * We are doing an exec(). 'current' is the process
	 * doing the exec and bprm->mm is the new process's mm.
	 */
	mmap_read_lock(bprm->mm);
	ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
			&page, NULL, NULL);
	mmap_read_unlock(bprm->mm);
	if (ret <= 0)
		return NULL;

@@ -574,7 +576,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
		}

		if (kmapped_page) {
			flush_kernel_dcache_page(kmapped_page);
			flush_dcache_page(kmapped_page);
			kunmap(kmapped_page);
			put_arg_page(kmapped_page);
		}

@@ -592,7 +594,7 @@ static int copy_strings(int argc, struct user_arg_ptr argv,
	ret = 0;
out:
	if (kmapped_page) {
		flush_kernel_dcache_page(kmapped_page);
		flush_dcache_page(kmapped_page);
		kunmap(kmapped_page);
		put_arg_page(kmapped_page);
	}

@@ -634,7 +636,7 @@ int copy_string_kernel(const char *arg, struct linux_binprm *bprm)
		kaddr = kmap_atomic(page);
		flush_arg_page(bprm, pos & PAGE_MASK, page);
		memcpy(kaddr + offset_in_page(pos), arg, bytes_to_copy);
		flush_kernel_dcache_page(page);
		flush_dcache_page(page);
		kunmap_atomic(kaddr);
		put_arg_page(page);
	}
@@ -1051,7 +1051,8 @@ static int __init fcntl_init(void)
			__FMODE_EXEC | __FMODE_NONOTIFY));

	fasync_cache = kmem_cache_create("fasync_cache",
		sizeof(struct fasync_struct), 0, SLAB_PANIC, NULL);
		sizeof(struct fasync_struct), 0,
		SLAB_PANIC | SLAB_ACCOUNT, NULL);
	return 0;
}
@@ -406,6 +406,11 @@ static bool inode_do_switch_wbs(struct inode *inode,
		inc_wb_stat(new_wb, WB_WRITEBACK);
	}

	if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) {
		atomic_dec(&old_wb->writeback_inodes);
		atomic_inc(&new_wb->writeback_inodes);
	}

	wb_get(new_wb);

	/*

@@ -1034,20 +1039,20 @@ static void bdi_split_work_to_wbs(struct backing_dev_info *bdi,
 * cgroup_writeback_by_id - initiate cgroup writeback from bdi and memcg IDs
 * @bdi_id: target bdi id
 * @memcg_id: target memcg css id
 * @nr: number of pages to write, 0 for best-effort dirty flushing
 * @reason: reason why some writeback work initiated
 * @done: target wb_completion
 *
 * Initiate flush of the bdi_writeback identified by @bdi_id and @memcg_id
 * with the specified parameters.
 */
int cgroup_writeback_by_id(u64 bdi_id, int memcg_id, unsigned long nr,
int cgroup_writeback_by_id(u64 bdi_id, int memcg_id,
			   enum wb_reason reason, struct wb_completion *done)
{
	struct backing_dev_info *bdi;
	struct cgroup_subsys_state *memcg_css;
	struct bdi_writeback *wb;
	struct wb_writeback_work *work;
	unsigned long dirty;
	int ret;

	/* lookup bdi and memcg */

@@ -1076,24 +1081,22 @@ int cgroup_writeback_by_id(u64 bdi_id, int memcg_id, unsigned long nr,
	}

	/*
	 * If @nr is zero, the caller is attempting to write out most of
	 * The caller is attempting to write out most of
	 * the currently dirty pages. Let's take the current dirty page
	 * count and inflate it by 25% which should be large enough to
	 * flush out most dirty pages while avoiding getting livelocked by
	 * concurrent dirtiers.
	 *
	 * BTW the memcg stats are flushed periodically and this is best-effort
	 * estimation, so some potential error is ok.
	 */
	if (!nr) {
		unsigned long filepages, headroom, dirty, writeback;

		mem_cgroup_wb_stats(wb, &filepages, &headroom, &dirty,
				    &writeback);
		nr = dirty * 10 / 8;
	}
	dirty = memcg_page_state(mem_cgroup_from_css(memcg_css), NR_FILE_DIRTY);
	dirty = dirty * 10 / 8;

	/* issue the writeback work */
	work = kzalloc(sizeof(*work), GFP_NOWAIT | __GFP_NOWARN);
	if (work) {
		work->nr_pages = nr;
		work->nr_pages = dirty;
		work->sync_mode = WB_SYNC_NONE;
		work->range_cyclic = 1;
		work->reason = reason;

@@ -1999,7 +2002,6 @@ static long writeback_inodes_wb(struct bdi_writeback *wb, long nr_pages,
static long wb_writeback(struct bdi_writeback *wb,
			 struct wb_writeback_work *work)
{
	unsigned long wb_start = jiffies;
	long nr_pages = work->nr_pages;
	unsigned long dirtied_before = jiffies;
	struct inode *inode;

@@ -2053,8 +2055,6 @@ static long wb_writeback(struct bdi_writeback *wb,
		progress = __writeback_inodes_wb(wb, work);
		trace_writeback_written(wb, work);

		wb_update_bandwidth(wb, wb_start);

		/*
		 * Did we write something? Try for more
		 *
@@ -254,7 +254,7 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
	struct fs_context *fc;
	int ret = -ENOMEM;

	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL);
	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
	if (!fc)
		return ERR_PTR(-ENOMEM);

@@ -649,7 +649,7 @@ const struct fs_context_operations legacy_fs_context_ops = {
 */
static int legacy_init_fs_context(struct fs_context *fc)
{
	fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL);
	fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL_ACCOUNT);
	if (!fc->fs_private)
		return -ENOMEM;
	fc->ops = &legacy_fs_context_ops;
@@ -770,7 +770,7 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
		return LRU_ROTATE;
	}

	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
	if (inode_has_buffers(inode) || !mapping_empty(&inode->i_data)) {
		__iget(inode);
		spin_unlock(&inode->i_lock);
		spin_unlock(lru_lock);
@@ -2941,10 +2941,12 @@ static int __init filelock_init(void)
	int i;

	flctx_cache = kmem_cache_create("file_lock_ctx",
-			sizeof(struct file_lock_context), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock_context), 0,
+			SLAB_PANIC | SLAB_ACCOUNT, NULL);

	filelock_cache = kmem_cache_create("file_lock_cache",
-			sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock), 0,
+			SLAB_PANIC | SLAB_ACCOUNT, NULL);

	for_each_possible_cpu(i) {
		struct file_lock_list_struct *fll = per_cpu_ptr(&file_lock_list, i);
@@ -4089,7 +4089,9 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
		return -EPERM;

	inode_lock(target);
-	if (is_local_mountpoint(dentry))
+	if (IS_SWAPFILE(target))
+		error = -EPERM;
+	else if (is_local_mountpoint(dentry))
		error = -EBUSY;
	else {
		error = security_inode_unlink(dir, dentry);

@@ -4597,6 +4599,10 @@ int vfs_rename(struct renamedata *rd)
	else if (target)
		inode_lock(target);

+	error = -EPERM;
+	if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
+		goto out;
+
	error = -EBUSY;
	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
		goto out;
@@ -203,7 +203,8 @@ static struct mount *alloc_vfsmnt(const char *name)
		goto out_free_cache;

	if (name) {
-		mnt->mnt_devname = kstrdup_const(name, GFP_KERNEL);
+		mnt->mnt_devname = kstrdup_const(name,
+						 GFP_KERNEL_ACCOUNT);
		if (!mnt->mnt_devname)
			goto out_free_id;
	}

@@ -3370,7 +3371,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns, bool a
	if (!ucounts)
		return ERR_PTR(-ENOSPC);

-	new_ns = kzalloc(sizeof(struct mnt_namespace), GFP_KERNEL);
+	new_ns = kzalloc(sizeof(struct mnt_namespace), GFP_KERNEL_ACCOUNT);
	if (!new_ns) {
		dec_mnt_namespaces(ucounts);
		return ERR_PTR(-ENOMEM);

@@ -4306,7 +4307,7 @@ void __init mnt_init(void)
	int err;

	mnt_cache = kmem_cache_create("mnt_cache", sizeof(struct mount),
-			0, SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
+			0, SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);

	mount_hashtable = alloc_large_system_hash("Mount-cache",
				sizeof(struct hlist_head),
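The hunks above and in the fs_context/select/locks changes switch allocations made on behalf of userspace to the accounted variants (GFP_KERNEL_ACCOUNT, SLAB_ACCOUNT) so the memory is charged to the allocating task's memcg. A hedged sketch of the same pattern for a hypothetical cache; "example_cache" and "struct example_obj" are illustrative names, not part of the patch:

#include <linux/init.h>
#include <linux/slab.h>

struct example_obj {
	int id;
};

static struct kmem_cache *example_cache;

static int __init example_cache_init(void)
{
	/* SLAB_ACCOUNT makes every object from this cache memcg-charged */
	example_cache = kmem_cache_create("example_cache",
			sizeof(struct example_obj), 0,
			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT, NULL);
	return 0;
}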
@@ -16,6 +16,7 @@
#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include <linux/time.h>
+#include <linux/delay.h>
#include <linux/quotaops.h>
#include <linux/sched/signal.h>

@@ -2721,7 +2722,7 @@ int ocfs2_inode_lock_tracker(struct inode *inode,
			return status;
		}
	}
-	return tmp_oh ? 1 : 0;
+	return 1;
}

void ocfs2_inode_unlock_tracker(struct inode *inode,

@@ -3912,6 +3913,17 @@ static int ocfs2_unblock_lock(struct ocfs2_super *osb,
	spin_unlock_irqrestore(&lockres->l_lock, flags);
	ret = ocfs2_downconvert_lock(osb, lockres, new_level, set_lvb,
				     gen);
+	/* The dlm lock convert is being cancelled in background,
+	 * ocfs2_cancel_convert() is asynchronous in fs/dlm,
+	 * requeue it, try again later.
+	 */
+	if (ret == -EBUSY) {
+		ctl->requeue = 1;
+		mlog(ML_BASTS, "lockres %s, ReQ: Downconvert busy\n",
+		     lockres->l_name);
+		ret = 0;
+		msleep(20);
+	}

leave:
	if (ret)
@@ -357,7 +357,6 @@ int ocfs2_global_read_info(struct super_block *sb, int type)
	}
	oinfo->dqi_gi.dqi_sb = sb;
	oinfo->dqi_gi.dqi_type = type;
-	ocfs2_qinfo_lock_res_init(&oinfo->dqi_gqlock, oinfo);
	oinfo->dqi_gi.dqi_entry_size = sizeof(struct ocfs2_global_disk_dqblk);
	oinfo->dqi_gi.dqi_ops = &ocfs2_global_ops;
	oinfo->dqi_gqi_bh = NULL;

@@ -702,6 +702,8 @@ static int ocfs2_local_read_info(struct super_block *sb, int type)
	info->dqi_priv = oinfo;
	oinfo->dqi_type = type;
	INIT_LIST_HEAD(&oinfo->dqi_chunk);
+	oinfo->dqi_gqinode = NULL;
+	ocfs2_qinfo_lock_res_init(&oinfo->dqi_gqlock, oinfo);
	oinfo->dqi_rec = NULL;
	oinfo->dqi_lqi_bh = NULL;
	oinfo->dqi_libh = NULL;
@@ -191,7 +191,7 @@ EXPORT_SYMBOL(generic_pipe_buf_try_steal);
 */
bool generic_pipe_buf_get(struct pipe_inode_info *pipe, struct pipe_buffer *buf)
{
-	return try_get_page(buf->page);
+	return try_get_compound_head(buf->page, 1);
}
EXPORT_SYMBOL(generic_pipe_buf_get);
@@ -655,7 +655,7 @@ int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
			goto out_nofds;

		alloc_size = 6 * size;
-		bits = kvmalloc(alloc_size, GFP_KERNEL);
+		bits = kvmalloc(alloc_size, GFP_KERNEL_ACCOUNT);
		if (!bits)
			goto out_nofds;
	}

@@ -1000,7 +1000,7 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,

		len = min(todo, POLLFD_PER_PAGE);
		walk = walk->next = kmalloc(struct_size(walk, entries, len),
-					    GFP_KERNEL);
+					    GFP_KERNEL_ACCOUNT);
		if (!walk) {
			err = -ENOMEM;
			goto out_fds;
fs/userfaultfd.c | 116
@@ -33,11 +33,6 @@ int sysctl_unprivileged_userfaultfd __read_mostly;

static struct kmem_cache *userfaultfd_ctx_cachep __read_mostly;

-enum userfaultfd_state {
-	UFFD_STATE_WAIT_API,
-	UFFD_STATE_RUNNING,
-};
-
/*
 * Start with fault_pending_wqh and fault_wqh so they're more likely
 * to be in the same cacheline.

@@ -69,12 +64,10 @@ struct userfaultfd_ctx {
	unsigned int flags;
	/* features requested from the userspace */
	unsigned int features;
-	/* state machine */
-	enum userfaultfd_state state;
	/* released */
	bool released;
	/* memory mappings are changing because of non-cooperative event */
-	bool mmap_changing;
+	atomic_t mmap_changing;
	/* mm with one ore more vmas attached to this userfaultfd_ctx */
	struct mm_struct *mm;
};

@@ -104,6 +97,14 @@ struct userfaultfd_wake_range {
	unsigned long len;
};

+/* internal indication that UFFD_API ioctl was successfully executed */
+#define UFFD_FEATURE_INITIALIZED	(1u << 31)
+
+static bool userfaultfd_is_initialized(struct userfaultfd_ctx *ctx)
+{
+	return ctx->features & UFFD_FEATURE_INITIALIZED;
+}
+
static int userfaultfd_wake_function(wait_queue_entry_t *wq, unsigned mode,
				     int wake_flags, void *key)
{
@@ -623,7 +624,8 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
	 * already released.
	 */
out:
-	WRITE_ONCE(ctx->mmap_changing, false);
+	atomic_dec(&ctx->mmap_changing);
+	VM_BUG_ON(atomic_read(&ctx->mmap_changing) < 0);
	userfaultfd_ctx_put(ctx);
}

@@ -666,15 +668,14 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)

	refcount_set(&ctx->refcount, 1);
	ctx->flags = octx->flags;
-	ctx->state = UFFD_STATE_RUNNING;
	ctx->features = octx->features;
	ctx->released = false;
-	ctx->mmap_changing = false;
+	atomic_set(&ctx->mmap_changing, 0);
	ctx->mm = vma->vm_mm;
	mmgrab(ctx->mm);

	userfaultfd_ctx_get(octx);
-	WRITE_ONCE(octx->mmap_changing, true);
+	atomic_inc(&octx->mmap_changing);
	fctx->orig = octx;
	fctx->new = ctx;
	list_add_tail(&fctx->list, fcs);
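mmap_changing becomes an atomic counter rather than a bool, so overlapping non-cooperative events (fork, mremap, unmap) each take and drop their own count and one completion no longer clears another event's in-progress marker. A small user-space sketch of the counting idea using C11 atomics; the names are illustrative, not the kernel API:

#include <stdatomic.h>
#include <stdbool.h>

static atomic_int mmap_changing;

static void event_start(void) { atomic_fetch_add(&mmap_changing, 1); }
static void event_end(void)   { atomic_fetch_sub(&mmap_changing, 1); }

/* ioctl paths bail out with -EAGAIN while any event is still in flight */
static bool may_copy(void)    { return atomic_load(&mmap_changing) == 0; }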
@@ -721,7 +722,7 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma,
	if (ctx->features & UFFD_FEATURE_EVENT_REMAP) {
		vm_ctx->ctx = ctx;
		userfaultfd_ctx_get(ctx);
-		WRITE_ONCE(ctx->mmap_changing, true);
+		atomic_inc(&ctx->mmap_changing);
	} else {
		/* Drop uffd context if remap feature not enabled */
		vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;

@@ -766,7 +767,7 @@ bool userfaultfd_remove(struct vm_area_struct *vma,
		return true;

	userfaultfd_ctx_get(ctx);
-	WRITE_ONCE(ctx->mmap_changing, true);
+	atomic_inc(&ctx->mmap_changing);
	mmap_read_unlock(mm);

	msg_init(&ewq.msg);

@@ -810,7 +811,7 @@ int userfaultfd_unmap_prep(struct vm_area_struct *vma,
		return -ENOMEM;

	userfaultfd_ctx_get(ctx);
-	WRITE_ONCE(ctx->mmap_changing, true);
+	atomic_inc(&ctx->mmap_changing);
	unmap_ctx->ctx = ctx;
	unmap_ctx->start = start;
	unmap_ctx->end = end;
@@ -943,38 +944,33 @@ static __poll_t userfaultfd_poll(struct file *file, poll_table *wait)

	poll_wait(file, &ctx->fd_wqh, wait);

-	switch (ctx->state) {
-	case UFFD_STATE_WAIT_API:
+	if (!userfaultfd_is_initialized(ctx))
		return EPOLLERR;
-	case UFFD_STATE_RUNNING:
-		/*
-		 * poll() never guarantees that read won't block.
-		 * userfaults can be waken before they're read().
-		 */
-		if (unlikely(!(file->f_flags & O_NONBLOCK)))
-			return EPOLLERR;
-		/*
-		 * lockless access to see if there are pending faults
-		 * __pollwait last action is the add_wait_queue but
-		 * the spin_unlock would allow the waitqueue_active to
-		 * pass above the actual list_add inside
-		 * add_wait_queue critical section. So use a full
-		 * memory barrier to serialize the list_add write of
-		 * add_wait_queue() with the waitqueue_active read
-		 * below.
-		 */
-		ret = 0;
-		smp_mb();
-		if (waitqueue_active(&ctx->fault_pending_wqh))
-			ret = EPOLLIN;
-		else if (waitqueue_active(&ctx->event_wqh))
-			ret = EPOLLIN;
-
-		return ret;
-	default:
-		WARN_ON_ONCE(1);
+
+	/*
+	 * poll() never guarantees that read won't block.
+	 * userfaults can be waken before they're read().
+	 */
+	if (unlikely(!(file->f_flags & O_NONBLOCK)))
		return EPOLLERR;
-	}
+
+	/*
+	 * lockless access to see if there are pending faults
+	 * __pollwait last action is the add_wait_queue but
+	 * the spin_unlock would allow the waitqueue_active to
+	 * pass above the actual list_add inside
+	 * add_wait_queue critical section. So use a full
+	 * memory barrier to serialize the list_add write of
+	 * add_wait_queue() with the waitqueue_active read
+	 * below.
+	 */
+	ret = 0;
+	smp_mb();
+	if (waitqueue_active(&ctx->fault_pending_wqh))
+		ret = EPOLLIN;
+	else if (waitqueue_active(&ctx->event_wqh))
+		ret = EPOLLIN;
+
+	return ret;
}

static const struct file_operations userfaultfd_fops;
@@ -1169,7 +1165,7 @@ static ssize_t userfaultfd_read(struct file *file, char __user *buf,
	int no_wait = file->f_flags & O_NONBLOCK;
	struct inode *inode = file_inode(file);

-	if (ctx->state == UFFD_STATE_WAIT_API)
+	if (!userfaultfd_is_initialized(ctx))
		return -EINVAL;

	for (;;) {

@@ -1700,7 +1696,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
	user_uffdio_copy = (struct uffdio_copy __user *) arg;

	ret = -EAGAIN;
-	if (READ_ONCE(ctx->mmap_changing))
+	if (atomic_read(&ctx->mmap_changing))
		goto out;

	ret = -EFAULT;

@@ -1757,7 +1753,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
	user_uffdio_zeropage = (struct uffdio_zeropage __user *) arg;

	ret = -EAGAIN;
-	if (READ_ONCE(ctx->mmap_changing))
+	if (atomic_read(&ctx->mmap_changing))
		goto out;

	ret = -EFAULT;

@@ -1807,7 +1803,7 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
	struct userfaultfd_wake_range range;
	bool mode_wp, mode_dontwake;

-	if (READ_ONCE(ctx->mmap_changing))
+	if (atomic_read(&ctx->mmap_changing))
		return -EAGAIN;

	user_uffdio_wp = (struct uffdio_writeprotect __user *) arg;
@@ -1855,7 +1851,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
	user_uffdio_continue = (struct uffdio_continue __user *)arg;

	ret = -EAGAIN;
-	if (READ_ONCE(ctx->mmap_changing))
+	if (atomic_read(&ctx->mmap_changing))
		goto out;

	ret = -EFAULT;

@@ -1908,9 +1904,10 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
static inline unsigned int uffd_ctx_features(__u64 user_features)
{
	/*
-	 * For the current set of features the bits just coincide
+	 * For the current set of features the bits just coincide. Set
+	 * UFFD_FEATURE_INITIALIZED to mark the features as enabled.
	 */
-	return (unsigned int)user_features;
+	return (unsigned int)user_features | UFFD_FEATURE_INITIALIZED;
}

/*

@@ -1923,12 +1920,10 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
{
	struct uffdio_api uffdio_api;
	void __user *buf = (void __user *)arg;
+	unsigned int ctx_features;
	int ret;
	__u64 features;

-	ret = -EINVAL;
-	if (ctx->state != UFFD_STATE_WAIT_API)
-		goto out;
	ret = -EFAULT;
	if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api)))
		goto out;

@@ -1952,9 +1947,13 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
	ret = -EFAULT;
	if (copy_to_user(buf, &uffdio_api, sizeof(uffdio_api)))
		goto out;
-	ctx->state = UFFD_STATE_RUNNING;

	/* only enable the requested features for this uffd context */
-	ctx->features = uffd_ctx_features(features);
+	ctx_features = uffd_ctx_features(features);
+	ret = -EINVAL;
+	if (cmpxchg(&ctx->features, 0, ctx_features) != 0)
+		goto err_out;
+
	ret = 0;
out:
	return ret;
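UFFDIO_API now publishes the negotiated features with a single compare-and-swap, so only the first caller can move ctx->features from 0 to a nonzero value (UFFD_FEATURE_INITIALIZED guarantees the stored value is nonzero) and concurrent UFFDIO_API calls fail with -EINVAL. A user-space sketch of the same initialize-once pattern; the names are illustrative:

#include <stdatomic.h>
#include <stdbool.h>

static _Atomic unsigned int ctx_features;

static bool publish_features_once(unsigned int features)
{
	unsigned int expected = 0;

	/* analogous in spirit to: cmpxchg(&ctx->features, 0, features) == 0 */
	return atomic_compare_exchange_strong(&ctx_features, &expected, features);
}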
@@ -1971,7 +1970,7 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
	int ret = -EINVAL;
	struct userfaultfd_ctx *ctx = file->private_data;

-	if (cmd != UFFDIO_API && ctx->state == UFFD_STATE_WAIT_API)
+	if (cmd != UFFDIO_API && !userfaultfd_is_initialized(ctx))
		return -EINVAL;

	switch(cmd) {

@@ -2085,9 +2084,8 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
	refcount_set(&ctx->refcount, 1);
	ctx->flags = flags;
	ctx->features = 0;
-	ctx->state = UFFD_STATE_WAIT_API;
	ctx->released = false;
-	ctx->mmap_changing = false;
+	atomic_set(&ctx->mmap_changing, 0);
	ctx->mm = current->mm;
	/* prevent the mm struct to be freed */
	mmgrab(ctx->mm);
@@ -116,6 +116,7 @@ struct bdi_writeback {
	struct list_head b_dirty_time;	/* time stamps are dirty */
	spinlock_t list_lock;		/* protects the b_* lists */

+	atomic_t writeback_inodes;	/* number of inodes under writeback */
	struct percpu_counter stat[NR_WB_STAT_ITEMS];

	unsigned long congested;	/* WB_[a]sync_congested flags */

@@ -142,6 +143,7 @@ struct bdi_writeback {
	spinlock_t work_lock;		/* protects work_list & dwork scheduling */
	struct list_head work_list;
	struct delayed_work dwork;	/* work item used for writeback */
+	struct delayed_work bw_dwork;	/* work item used for bandwidth estimate */

	unsigned long dirty_sleep;	/* last wait */
@@ -288,6 +288,17 @@ static inline struct bdi_writeback *inode_to_wb(const struct inode *inode)
	return inode->i_wb;
}

+static inline struct bdi_writeback *inode_to_wb_wbc(
+				struct inode *inode,
+				struct writeback_control *wbc)
+{
+	/*
+	 * If wbc does not have inode attached, it means cgroup writeback was
+	 * disabled when wbc started. Just use the default wb in that case.
+	 */
+	return wbc->wb ? wbc->wb : &inode_to_bdi(inode)->wb;
+}
+
/**
 * unlocked_inode_to_wb_begin - begin unlocked inode wb access transaction
 * @inode: target inode

@@ -366,6 +377,14 @@ static inline struct bdi_writeback *inode_to_wb(struct inode *inode)
	return &inode_to_bdi(inode)->wb;
}

+static inline struct bdi_writeback *inode_to_wb_wbc(
+				struct inode *inode,
+				struct writeback_control *wbc)
+{
+	return inode_to_wb(inode);
+}
+
+
static inline struct bdi_writeback *
unlocked_inode_to_wb_begin(struct inode *inode, struct wb_lock_cookie *cookie)
{
@@ -409,7 +409,7 @@ static inline void invalidate_inode_buffers(struct inode *inode) {}
static inline int remove_inode_buffers(struct inode *inode) { return 1; }
static inline int sync_mapping_buffers(struct address_space *mapping) { return 0; }
static inline void invalidate_bh_lrus_cpu(int cpu) {}
-static inline bool has_bh_in_lru(int cpu, void *dummy) { return 0; }
+static inline bool has_bh_in_lru(int cpu, void *dummy) { return false; }
#define buffer_heads_over_limit 0

#endif /* CONFIG_BLOCK */
@@ -84,6 +84,8 @@ static inline unsigned long compact_gap(unsigned int order)
extern unsigned int sysctl_compaction_proactiveness;
extern int sysctl_compaction_handler(struct ctl_table *table, int write,
			void *buffer, size_t *length, loff_t *ppos);
+extern int compaction_proactiveness_sysctl_handler(struct ctl_table *table,
+		int write, void *buffer, size_t *length, loff_t *ppos);
extern int sysctl_extfrag_threshold;
extern int sysctl_compact_unevictable_allowed;
@@ -130,10 +130,7 @@ static inline void flush_anon_page(struct vm_area_struct *vma, struct page *page
}
#endif

-#ifndef ARCH_HAS_FLUSH_KERNEL_DCACHE_PAGE
-static inline void flush_kernel_dcache_page(struct page *page)
-{
-}
+#ifndef ARCH_IMPLEMENTS_FLUSH_KERNEL_VMAP_RANGE
static inline void flush_kernel_vmap_range(void *vaddr, int size)
{
}
@@ -121,6 +121,13 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
	css_put(&h_cg->css);
}

+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+	if (resv_map->css)
+		css_get(resv_map->css);
+}
+
extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
					struct hugetlb_cgroup **ptr);
extern int hugetlb_cgroup_charge_cgroup_rsvd(int idx, unsigned long nr_pages,

@@ -199,6 +206,11 @@ static inline void hugetlb_cgroup_put_rsvd_cgroup(struct hugetlb_cgroup *h_cg)
{
}

+static inline void resv_map_dup_hugetlb_cgroup_uncharge_info(
+						struct resv_map *resv_map)
+{
+}
+
static inline int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
					       struct hugetlb_cgroup **ptr)
{
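resv_map_dup_hugetlb_cgroup_uncharge_info() appears to take an extra css reference when a reservation map's cgroup info is duplicated, so each copy can later drop its own reference. A small standard-C sketch of that copy-takes-a-reference pattern; the types and names are illustrative only:

struct counted {
	int refs;
};

static void get_ref(struct counted *c)
{
	if (c)
		c->refs++;	/* mirrors css_get() on resv_map->css */
}

struct holder {
	struct counted *css;
};

static struct holder dup_holder(const struct holder *src)
{
	struct holder copy = *src;

	get_ref(copy.css);	/* the duplicate owns its own reference */
	return copy;
}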
@@ -99,8 +99,6 @@ void memblock_discard(void);
static inline void memblock_discard(void) {}
#endif

-phys_addr_t memblock_find_in_range(phys_addr_t start, phys_addr_t end,
-				   phys_addr_t size, phys_addr_t align);
void memblock_allow_resize(void);
int memblock_add_node(phys_addr_t base, phys_addr_t size, int nid);
int memblock_add(phys_addr_t base, phys_addr_t size);
@@ -105,14 +105,6 @@ struct mem_cgroup_reclaim_iter {
	unsigned int generation;
};

-struct lruvec_stat {
-	long count[NR_VM_NODE_STAT_ITEMS];
-};
-
-struct batched_lruvec_stat {
-	s32 count[NR_VM_NODE_STAT_ITEMS];
-};
-
/*
 * Bitmap and deferred work of shrinker::id corresponding to memcg-aware
 * shrinkers, which have elements charged to this memcg.

@@ -123,24 +115,30 @@ struct shrinker_info {
	unsigned long *map;
};

+struct lruvec_stats_percpu {
+	/* Local (CPU and cgroup) state */
+	long state[NR_VM_NODE_STAT_ITEMS];
+
+	/* Delta calculation for lockless upward propagation */
+	long state_prev[NR_VM_NODE_STAT_ITEMS];
+};
+
+struct lruvec_stats {
+	/* Aggregated (CPU and subtree) state */
+	long state[NR_VM_NODE_STAT_ITEMS];
+
+	/* Pending child counts during tree propagation */
+	long state_pending[NR_VM_NODE_STAT_ITEMS];
+};
+
/*
 * per-node information in memory controller.
 */
struct mem_cgroup_per_node {
	struct lruvec		lruvec;

-	/*
-	 * Legacy local VM stats. This should be struct lruvec_stat and
-	 * cannot be optimized to struct batched_lruvec_stat. Because
-	 * the threshold of the lruvec_stat_cpu can be as big as
-	 * MEMCG_CHARGE_BATCH * PAGE_SIZE. It can fit into s32. But this
-	 * filed has no upper limit.
-	 */
-	struct lruvec_stat __percpu *lruvec_stat_local;
-
-	/* Subtree VM stats (batched updates) */
-	struct batched_lruvec_stat __percpu *lruvec_stat_cpu;
-	atomic_long_t lruvec_stat[NR_VM_NODE_STAT_ITEMS];
+	struct lruvec_stats_percpu __percpu	*lruvec_stats_percpu;
+	struct lruvec_stats			lruvec_stats;

	unsigned long		lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
@@ -595,13 +593,6 @@ static inline struct obj_cgroup **page_objcgs_check(struct page *page)
}
#endif

-static __always_inline bool memcg_stat_item_in_bytes(int idx)
-{
-	if (idx == MEMCG_PERCPU_B)
-		return true;
-	return vmstat_item_in_bytes(idx);
-}
-
static inline bool mem_cgroup_is_root(struct mem_cgroup *memcg)
{
	return (memcg == root_mem_cgroup);

@@ -693,13 +684,35 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *memcg)
		page_counter_read(&memcg->memory);
}

-int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask);
+int __mem_cgroup_charge(struct page *page, struct mm_struct *mm,
+			gfp_t gfp_mask);
+static inline int mem_cgroup_charge(struct page *page, struct mm_struct *mm,
+				    gfp_t gfp_mask)
+{
+	if (mem_cgroup_disabled())
+		return 0;
+	return __mem_cgroup_charge(page, mm, gfp_mask);
+}
+
int mem_cgroup_swapin_charge_page(struct page *page, struct mm_struct *mm,
				  gfp_t gfp, swp_entry_t entry);
void mem_cgroup_swapin_uncharge_swap(swp_entry_t entry);

-void mem_cgroup_uncharge(struct page *page);
-void mem_cgroup_uncharge_list(struct list_head *page_list);
+void __mem_cgroup_uncharge(struct page *page);
+static inline void mem_cgroup_uncharge(struct page *page)
+{
+	if (mem_cgroup_disabled())
+		return;
+	__mem_cgroup_uncharge(page);
+}
+
+void __mem_cgroup_uncharge_list(struct list_head *page_list);
+static inline void mem_cgroup_uncharge_list(struct list_head *page_list)
+{
+	if (mem_cgroup_disabled())
+		return;
+	__mem_cgroup_uncharge_list(page_list);
+}
+
void mem_cgroup_migrate(struct page *oldpage, struct page *newpage);
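The charge/uncharge entry points are split into an out-of-line __ variant plus a static inline wrapper that returns early when the memory controller is disabled, so the common case costs only an inline check instead of a function call. A compilable user-space sketch of the wrapper shape; all names are illustrative stand-ins, not the kernel API:

#include <stdbool.h>

static bool feature_disabled(void) { return true; }		/* stand-in for mem_cgroup_disabled() */
static int __feature_charge(int id) { (void)id; return 0; }	/* stand-in for __mem_cgroup_charge() */

static inline int feature_charge(int id)
{
	if (feature_disabled())		/* fast path: no out-of-line call */
		return 0;
	return __feature_charge(id);
}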
@@ -884,11 +897,6 @@ static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
	return !!(memcg->css.flags & CSS_ONLINE);
}

-/*
- * For memory reclaim.
- */
-int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
-
void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
				int zid, int nr_pages);
@@ -955,22 +963,21 @@ static inline void mod_memcg_state(struct mem_cgroup *memcg,
	local_irq_restore(flags);
}

+static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+	return READ_ONCE(memcg->vmstats.state[idx]);
+}
+
static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
					      enum node_stat_item idx)
{
	struct mem_cgroup_per_node *pn;
-	long x;

	if (mem_cgroup_disabled())
		return node_page_state(lruvec_pgdat(lruvec), idx);

	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	x = atomic_long_read(&pn->lruvec_stat[idx]);
-#ifdef CONFIG_SMP
-	if (x < 0)
-		x = 0;
-#endif
-	return x;
+	return READ_ONCE(pn->lruvec_stats.state[idx]);
}

static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,

@@ -985,7 +992,7 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,

	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
	for_each_possible_cpu(cpu)
-		x += per_cpu(pn->lruvec_stat_local->count[idx], cpu);
+		x += per_cpu(pn->lruvec_stats_percpu->state[idx], cpu);
#ifdef CONFIG_SMP
	if (x < 0)
		x = 0;

@@ -993,6 +1000,8 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
	return x;
}

+void mem_cgroup_flush_stats(void);
+
void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
			      int val);
void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val);
@@ -1391,6 +1400,11 @@ static inline void mod_memcg_state(struct mem_cgroup *memcg,
{
}

+static inline unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
+{
+	return 0;
+}
+
static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
					      enum node_stat_item idx)
{

@@ -1403,6 +1417,10 @@ static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
	return node_page_state(lruvec_pgdat(lruvec), idx);
}

+static inline void mem_cgroup_flush_stats(void)
+{
+}
+
static inline void __mod_memcg_lruvec_state(struct lruvec *lruvec,
					    enum node_stat_item idx, int val)
{
@@ -90,7 +90,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
void remove_memory_block_devices(unsigned long start, unsigned long size);
extern void memory_dev_init(void);
extern int memory_notify(unsigned long val, void *v);
-extern struct memory_block *find_memory_block(struct mem_section *);
+extern struct memory_block *find_memory_block(unsigned long section_nr);
typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
extern int walk_memory_blocks(unsigned long start, unsigned long size,
			      void *arg, walk_memory_blocks_func_t func);
@@ -184,6 +184,14 @@ extern bool vma_migratable(struct vm_area_struct *vma);
extern int mpol_misplaced(struct page *, struct vm_area_struct *, unsigned long);
extern void mpol_put_task_policy(struct task_struct *);

+extern bool numa_demotion_enabled;
+
+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return (pol->mode == MPOL_PREFERRED_MANY);
+}
+
+
#else

struct mempolicy {};

@@ -292,5 +300,13 @@ static inline nodemask_t *policy_nodemask_current(gfp_t gfp)
{
	return NULL;
}
+
+#define numa_demotion_enabled	false
+
+static inline bool mpol_is_preferred_many(struct mempolicy *pol)
+{
+	return false;
+}
+
#endif /* CONFIG_NUMA */
#endif
@@ -28,6 +28,7 @@ enum migrate_reason {
	MR_NUMA_MISPLACED,
	MR_CONTIG_RANGE,
	MR_LONGTERM_PIN,
+	MR_DEMOTION,
	MR_TYPES
};

@@ -41,7 +42,8 @@ extern int migrate_page(struct address_space *mapping,
			struct page *newpage, struct page *page,
			enum migrate_mode mode);
extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
-		unsigned long private, enum migrate_mode mode, int reason);
+		unsigned long private, enum migrate_mode mode, int reason,
+		unsigned int *ret_succeeded);
extern struct page *alloc_migration_target(struct page *page, unsigned long private);
extern int isolate_movable_page(struct page *page, isolate_mode_t mode);

@@ -56,7 +58,7 @@ extern int migrate_page_move_mapping(struct address_space *mapping,
static inline void putback_movable_pages(struct list_head *l) {}
static inline int migrate_pages(struct list_head *l, new_page_t new,
		free_page_t free, unsigned long private, enum migrate_mode mode,
-		int reason)
+		int reason, unsigned int *ret_succeeded)
	{ return -ENOSYS; }
static inline struct page *alloc_migration_target(struct page *page,
		unsigned long private)

@@ -166,6 +168,14 @@ struct migrate_vma {
int migrate_vma_setup(struct migrate_vma *args);
void migrate_vma_pages(struct migrate_vma *migrate);
void migrate_vma_finalize(struct migrate_vma *migrate);
+int next_demotion_node(int node);
+
+#else /* CONFIG_MIGRATION disabled: */
+
+static inline int next_demotion_node(int node)
+{
+	return NUMA_NO_NODE;
+}

#endif /* CONFIG_MIGRATION */
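migrate_pages() gains an optional *ret_succeeded out-parameter: callers that only care about the return value pass NULL, while the demotion path can ask how many pages were actually moved. A sketch of that calling convention with illustrative names, not the kernel function itself:

#include <stddef.h>

static int process_items(int nitems, unsigned int *ret_succeeded)
{
	unsigned int ok = 0;

	for (int i = 0; i < nitems; i++)
		ok++;				/* pretend each item succeeds */
	if (ret_succeeded)
		*ret_succeeded = ok;		/* written only when the caller asks for it */
	return 0;				/* count of items left unprocessed */
}

/* existing callers stay unchanged: process_items(n, NULL); */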
@@ -1216,18 +1216,10 @@ static inline void get_page(struct page *page)
}

bool __must_check try_grab_page(struct page *page, unsigned int flags);
-__maybe_unused struct page *try_grab_compound_head(struct page *page, int refs,
-						   unsigned int flags);
+struct page *try_grab_compound_head(struct page *page, int refs,
+				    unsigned int flags);

-
-static inline __must_check bool try_get_page(struct page *page)
-{
-	page = compound_head(page);
-	if (WARN_ON_ONCE(page_ref_count(page) <= 0))
-		return false;
-	page_ref_inc(page);
-	return true;
-}
+struct page *try_get_compound_head(struct page *page, int refs);

static inline void put_page(struct page *page)
{

@@ -1849,7 +1841,6 @@ int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
struct kvec;
int get_kernel_pages(const struct kvec *iov, int nr_pages, int write,
		     struct page **pages);
-int get_kernel_page(unsigned long start, int write, struct page **pages);
struct page *get_dump_page(unsigned long addr);

extern int try_to_release_page(struct page * page, gfp_t gfp_mask);

@@ -3121,7 +3112,7 @@ extern void memory_failure_queue_kick(int cpu);
extern int unpoison_memory(unsigned long pfn);
extern int sysctl_memory_failure_early_kill;
extern int sysctl_memory_failure_recovery;
-extern void shake_page(struct page *p, int access);
+extern void shake_page(struct page *p);
extern atomic_long_t num_poisoned_pages __read_mostly;
extern int soft_offline_page(unsigned long pfn, int flags);
@@ -846,6 +846,7 @@ typedef struct pglist_data {
	enum zone_type kcompactd_highest_zoneidx;
	wait_queue_head_t kcompactd_wait;
	struct task_struct *kcompactd;
+	bool proactive_compact_trigger;
#endif
	/*
	 * This is a per-node reserve of pages that are not available

@@ -1342,7 +1343,6 @@ static inline struct mem_section *__nr_to_section(unsigned long nr)
		return NULL;
	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}
-extern unsigned long __section_nr(struct mem_section *ms);
extern size_t mem_section_usage_size(void);

/*

@@ -1365,7 +1365,7 @@ extern size_t mem_section_usage_size(void);
#define SECTION_TAINT_ZONE_DEVICE	(1UL<<4)
#define SECTION_MAP_LAST_BIT		(1UL<<5)
#define SECTION_MAP_MASK		(~(SECTION_MAP_LAST_BIT-1))
-#define SECTION_NID_SHIFT		3
+#define SECTION_NID_SHIFT		6

static inline struct page *__section_mem_map_addr(struct mem_section *section)
{
@@ -736,7 +736,7 @@ extern void add_page_wait_queue(struct page *page, wait_queue_entry_t *waiter);
/*
 * Fault everything in given userspace address range in.
 */
-static inline int fault_in_pages_writeable(char __user *uaddr, int size)
+static inline int fault_in_pages_writeable(char __user *uaddr, size_t size)
{
	char __user *end = uaddr + size - 1;

@@ -763,7 +763,7 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size)
	return 0;
}

-static inline int fault_in_pages_readable(const char __user *uaddr, int size)
+static inline int fault_in_pages_readable(const char __user *uaddr, size_t size)
{
	volatile char c;
	const char __user *end = uaddr + size - 1;
@@ -174,13 +174,13 @@ static inline gfp_t current_gfp_context(gfp_t flags)
}

#ifdef CONFIG_LOCKDEP
-extern void __fs_reclaim_acquire(void);
-extern void __fs_reclaim_release(void);
+extern void __fs_reclaim_acquire(unsigned long ip);
+extern void __fs_reclaim_release(unsigned long ip);
extern void fs_reclaim_acquire(gfp_t gfp_mask);
extern void fs_reclaim_release(gfp_t gfp_mask);
#else
-static inline void __fs_reclaim_acquire(void) { }
-static inline void __fs_reclaim_release(void) { }
+static inline void __fs_reclaim_acquire(unsigned long ip) { }
+static inline void __fs_reclaim_release(unsigned long ip) { }
static inline void fs_reclaim_acquire(gfp_t gfp_mask) { }
static inline void fs_reclaim_release(gfp_t gfp_mask) { }
#endif

@@ -306,7 +306,7 @@ set_active_memcg(struct mem_cgroup *memcg)
{
	struct mem_cgroup *old;

-	if (in_interrupt()) {
+	if (!in_task()) {
		old = this_cpu_read(int_active_memcg);
		this_cpu_write(int_active_memcg, memcg);
	} else {
@@ -18,6 +18,7 @@ struct shmem_inode_info {
	unsigned long		flags;
	unsigned long		alloced;	/* data pages alloced to file */
	unsigned long		swapped;	/* subtotal assigned to swap */
+	pgoff_t			fallocend;	/* highest fallocate endindex */
	struct list_head	shrinklist;	/* shrinkable hpage inodes */
	struct list_head	swaplist;	/* chain of maybes on swap */
	struct shared_policy	policy;		/* NUMA memory alloc policy */

@@ -31,7 +32,7 @@ struct shmem_sb_info {
	struct percpu_counter used_blocks;  /* How many are allocated */
	unsigned long max_inodes;   /* How many inodes are allowed */
	unsigned long free_inodes;  /* How many are left for allocation */
-	spinlock_t stat_lock;	    /* Serialize shmem_sb_info changes */
+	raw_spinlock_t stat_lock;   /* Serialize shmem_sb_info changes */
	umode_t mode;		    /* Mount mode for root directory */
	unsigned char huge;	    /* Whether to try for hugepages */
	kuid_t uid;		    /* Mount uid for root directory */

@@ -85,7 +86,12 @@ extern void shmem_truncate_range(struct inode *inode, loff_t start, loff_t end);
extern int shmem_unuse(unsigned int type, bool frontswap,
		       unsigned long *fs_pages_to_unuse);

-extern bool shmem_huge_enabled(struct vm_area_struct *vma);
+extern bool shmem_is_huge(struct vm_area_struct *vma,
+			  struct inode *inode, pgoff_t index);
+static inline bool shmem_huge_enabled(struct vm_area_struct *vma)
+{
+	return shmem_is_huge(vma, file_inode(vma->vm_file), vma->vm_pgoff);
+}
extern unsigned long shmem_swap_usage(struct vm_area_struct *vma);
extern unsigned long shmem_partial_swap_usage(struct address_space *mapping,
						pgoff_t start, pgoff_t end);

@@ -93,9 +99,8 @@ extern unsigned long shmem_partial_swap_usage(struct address_space *mapping,
/* Flag allocation requirements to shmem_getpage */
enum sgp_type {
	SGP_READ,	/* don't exceed i_size, don't allocate page */
+	SGP_NOALLOC,	/* similar, but fail on hole or use fallocated page */
	SGP_CACHE,	/* don't exceed i_size, may allocate page */
-	SGP_NOHUGE,	/* like SGP_CACHE, but no huge pages */
-	SGP_HUGE,	/* like SGP_CACHE, huge pages preferred */
	SGP_WRITE,	/* may exceed i_size, may allocate !Uptodate page */
	SGP_FALLOC,	/* like SGP_WRITE, but make existing page Uptodate */
};

@@ -119,6 +124,18 @@ static inline bool shmem_file(struct file *file)
	return shmem_mapping(file->f_mapping);
}

+/*
+ * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
+ * beyond i_size's notion of EOF, which fallocate has committed to reserving:
+ * which split_huge_page() must therefore not delete. This use of a single
+ * "fallocend" per inode errs on the side of not deleting a reservation when
+ * in doubt: there are plenty of cases when it preserves unreserved pages.
+ */
+static inline pgoff_t shmem_fallocend(struct inode *inode, pgoff_t eof)
+{
+	return max(eof, SHMEM_I(inode)->fallocend);
+}
+
extern bool shmem_charge(struct inode *inode, long pages);
extern void shmem_uncharge(struct inode *inode, long pages);
@@ -408,7 +408,7 @@ static inline bool node_reclaim_enabled(void)

extern void check_move_unevictable_pages(struct pagevec *pvec);

-extern int kswapd_run(int nid);
+extern void kswapd_run(int nid);
extern void kswapd_stop(int nid);

#ifdef CONFIG_SWAP

@@ -721,7 +721,13 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
#endif

#if defined(CONFIG_SWAP) && defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-extern void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask);
+extern void __cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask);
+static inline void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)
+{
+	if (mem_cgroup_disabled())
+		return;
+	__cgroup_throttle_swaprate(page, gfp_mask);
+}
#else
static inline void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)
{

@@ -730,8 +736,22 @@ static inline void cgroup_throttle_swaprate(struct page *page, gfp_t gfp_mask)

#ifdef CONFIG_MEMCG_SWAP
extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
-extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
-extern void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages);
+extern int __mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
+static inline int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry)
+{
+	if (mem_cgroup_disabled())
+		return 0;
+	return __mem_cgroup_try_charge_swap(page, entry);
+}
+
+extern void __mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages);
+static inline void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages)
+{
+	if (mem_cgroup_disabled())
+		return;
+	__mem_cgroup_uncharge_swap(entry, nr_pages);
+}
+
extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
extern bool mem_cgroup_swap_full(struct page *page);
#else
@@ -915,6 +915,7 @@ asmlinkage long sys_mincore(unsigned long start, size_t len,
asmlinkage long sys_madvise(unsigned long start, size_t len, int behavior);
asmlinkage long sys_process_madvise(int pidfd, const struct iovec __user *vec,
			size_t vlen, int behavior, unsigned int flags);
+asmlinkage long sys_process_mrelease(int pidfd, unsigned int flags);
asmlinkage long sys_remap_file_pages(unsigned long start, unsigned long size,
			unsigned long prot, unsigned long pgoff,
			unsigned long flags);
@@ -60,16 +60,16 @@ extern int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,

extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
			    unsigned long src_start, unsigned long len,
-			    bool *mmap_changing, __u64 mode);
+			    atomic_t *mmap_changing, __u64 mode);
extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
			      unsigned long dst_start,
			      unsigned long len,
-			      bool *mmap_changing);
+			      atomic_t *mmap_changing);
extern ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long dst_start,
-			      unsigned long len, bool *mmap_changing);
+			      unsigned long len, atomic_t *mmap_changing);
extern int mwriteprotect_range(struct mm_struct *dst_mm,
			       unsigned long start, unsigned long len,
-			       bool enable_wp, bool *mmap_changing);
+			       bool enable_wp, atomic_t *mmap_changing);

/* mm helpers */
static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
@@ -33,6 +33,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
		PGREUSE,
		PGSTEAL_KSWAPD,
		PGSTEAL_DIRECT,
+		PGDEMOTE_KSWAPD,
+		PGDEMOTE_DIRECT,
		PGSCAN_KSWAPD,
		PGSCAN_DIRECT,
		PGSCAN_DIRECT_THROTTLE,
Some files were not shown because too many files have changed in this diff.