platform_kernel-5.15/kernel
Dan Williams 7138970383 mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
The x86 conversion to the generic GUP code included a small change which causes
crashes and data corruption in the pmem code - not good.

The root cause is that the /dev/pmem driver code implicitly relies on the x86
get_user_pages() implementation doing a get_page() on the page refcount, because
get_page() does a get_zone_device_page() which properly refcounts pmem's separate
page struct arrays that are not present in the regular page struct structures.
(The pmem driver does this because it can cover huge memory areas.)

But the x86 conversion to the generic GUP code changed the get_page() to
page_cache_get_speculative() which is faster but doesn't do the
get_zone_device_page() call the pmem code relies on.

One way to solve the regression would be to change the generic GUP code to use
get_page(), but that would slow things down a bit and punish other generic-GUP
using architectures for an x86-ism they did not care about. (Arguably the pmem
driver was probably not working reliably for them: but nvdimm is an Intel
feature, so non-x86 exposure is probably still limited.)

So restructure the pmem code's interface with the MM instead: get rid of the
get/put_zone_device_page() distinction, integrate put_zone_device_page() into
__put_page() and and restructure the pmem completion-wait and teardown machinery:

Kirill points out that the calls to {get,put}_dev_pagemap() can be
removed from the mm fast path if we take a single get_dev_pagemap()
reference to signify that the page is alive and use the final put of the
page to drop that reference.

This does require some care to make sure that any waits for the
percpu_ref to drop to zero occur *after* devm_memremap_page_release(),
since it now maintains its own elevated reference.

This speeds up things while also making the pmem refcounting more robust going
forward.

Suggested-by: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/149339998297.24933.1129582806028305912.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-01 09:15:53 +02:00
..
bpf bpf: fix hashmap extra_elems logic 2017-03-22 14:12:18 -07:00
cgroup Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2017-03-14 15:11:19 -07:00
configs config: android-base: enable hardened usercopy and kernel ASLR 2017-02-27 18:43:46 -08:00
debug sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
events Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-17 13:59:52 -07:00
gcov
irq sched/headers: Prepare to move the get_task_struct()/put_task_struct() and related APIs from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:40 +01:00
livepatch livepatch/module: make TAINT_LIVEPATCH module-specific 2016-08-26 14:42:08 +02:00
locking locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y 2017-03-16 09:28:30 +01:00
power Merge branch 'WIP.sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-03 10:16:38 -08:00
printk sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
rcu sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
sched Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-04-02 09:25:10 -07:00
time Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-07 14:45:22 -08:00
trace scripts/spelling.txt: add "overide" pattern and fix typo instances 2017-03-09 17:01:09 -08:00
.gitignore
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/mutex: Allow MUTEX_SPIN_ON_OWNER when DEBUG_MUTEXES 2016-10-25 11:31:51 +02:00
Kconfig.preempt
Makefile cgroup: move cgroup files under kernel/cgroup/ 2016-12-27 14:49:05 -05:00
acct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
async.c
audit.c audit: fix auditd/kernel connection state tracking 2017-03-21 11:26:35 -04:00
audit.h audit: fix auditd/kernel connection state tracking 2017-03-21 11:26:35 -04:00
audit_fsnotify.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
audit_tree.c Merge branch 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit 2017-01-05 23:06:06 -08:00
audit_watch.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-17 18:44:00 -08:00
auditfilter.c audit: add support for session ID user filter 2016-11-29 15:10:12 -05:00
auditsc.c audit: fix auditd/kernel connection state tracking 2017-03-21 11:26:35 -04:00
backtracetest.c
bounds.c
capability.c capability: export has_capability 2017-01-12 07:01:56 -07:00
compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
configs.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
context_tracking.c
cpu.c cpu/hotplug: Serialize callback invocations proper 2017-03-14 19:19:27 +01:00
cpu_pm.c
crash_dump.c
cred.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/coredump.h> 2017-03-02 08:42:28 +01:00
delayacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c userfaultfd: non-cooperative: rollback userfaultfd_exit 2017-03-09 17:01:09 -08:00
extable.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-02-22 10:15:09 -08:00
fork.c sched/headers, RCU: Move rcu_copy_process() from <linux/sched/task.h> to kernel/fork.c 2017-03-03 01:43:46 +01:00
freezer.c
futex.c futex: Add missing error handling to FUTEX_REQUEUE_PI 2017-03-14 21:45:36 +01:00
futex_compat.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
groups.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
hung_task.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
irq_work.c
jump_label.c This release has no new tracing features, just clean ups, minor fixes 2017-02-27 13:26:17 -08:00
kallsyms.c bpf: make jited programs visible in traces 2017-02-17 13:40:05 -05:00
kcmp.c
kcov.c kcov: make kcov work properly with KASLR enabled 2016-12-20 09:48:47 -08:00
kexec.c
kexec_core.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2017-02-22 17:33:34 -08:00
kexec_file.c kexec, x86/purgatory: Unbreak it and clean it up 2017-03-10 20:55:09 +01:00
kexec_internal.h kexec, x86/purgatory: Unbreak it and clean it up 2017-03-10 20:55:09 +01:00
kmod.c sched/headers, vfs/execve: Prepare to move the do_execve*() prototypes from <linux/sched.h> to <linux/binfmts.h> 2017-03-02 08:42:39 +01:00
kprobes.c powerpc updates for 4.11 part 1. 2017-02-22 10:30:38 -08:00
ksysfs.c kernel/ksysfs.c: add __ro_after_init to bin_attribute structure 2017-02-24 17:46:56 -08:00
kthread.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
latencytop.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
membarrier.c Fix: Disable sys_membarrier when nohz_full is enabled 2017-01-23 11:32:16 -08:00
memremap.c mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash 2017-05-01 09:15:53 +02:00
module-internal.h
module.c Modules updates for v4.11 2017-02-22 17:08:33 -08:00
module_signing.c
notifier.c kernel/notifier.c: simplify expression 2017-02-24 17:46:56 -08:00
nsproxy.c
padata.c padata: avoid race in reordering 2017-03-24 21:51:33 +08:00
panic.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
params.c
pid.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
pid_namespace.c sched/headers: Prepare for the reduction of <linux/sched.h>'s signal API dependency 2017-03-02 08:42:37 +01:00
profile.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
ptrace.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
range.c
reboot.c
relay.c lib/vsprintf.c: remove %Z support 2017-02-27 18:43:47 -08:00
resource.c
seccomp.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task_stack.h> 2017-03-02 08:42:36 +01:00
signal.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
smp.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/idle.h> 2017-03-02 08:42:26 +01:00
smpboot.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
smpboot.h
softirq.c softirq: Display IRQ_POLL for irq-poll statistics 2016-10-21 15:45:47 -06:00
stacktrace.c stacktrace, lockdep: Fix address, newline ugliness 2017-02-08 08:21:31 +01:00
stop_machine.c locking/core, stop_machine: Yield the CPU during stop machine() 2016-11-16 10:15:09 +01:00
sys.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
sys_ni.c move aio compat to fs/aio.c 2016-12-22 22:58:37 -05:00
sysctl.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/coredump.h> 2017-03-02 08:42:28 +01:00
sysctl_binary.c sysctl: add KERN_CONT to deprecated_sysctl_warning() 2016-12-14 16:04:07 -08:00
task_work.c
taskstats.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-11-15 10:54:36 -05:00
test_kprobes.c
torture.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
tracepoint.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
tsacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
ucount.c ucount: Remove the atomicity from ucount->count 2017-03-06 15:26:37 -06:00
uid16.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
up.c smp: Add function to execute a function synchronously on a CPU 2016-09-05 13:52:39 +02:00
user-return-notifier.c
user.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
user_namespace.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> 2017-03-02 08:42:29 +01:00
utsname.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
utsname_sysctl.c sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> 2017-03-03 01:45:36 +01:00
watchdog.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
watchdog_hld.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
workqueue.c workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq 2017-03-06 15:33:42 -05:00
workqueue_internal.h