Go to file
Yang Shi 88aa7cc688 mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct
mmap_sem is on the hot path of kernel, and it very contended, but it is
abused too.  It is used to protect arg_start|end and evn_start|end when
reading /proc/$PID/cmdline and /proc/$PID/environ, but it doesn't make
sense since those proc files just expect to read 4 values atomically and
not related to VM, they could be set to arbitrary values by C/R.

And, the mmap_sem contention may cause unexpected issue like below:

INFO: task ps:14018 blocked for more than 120 seconds.
       Tainted: G            E 4.9.79-009.ali3000.alios7.x86_64 #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
 ps              D    0 14018      1 0x00000004
 Call Trace:
   schedule+0x36/0x80
   rwsem_down_read_failed+0xf0/0x150
   call_rwsem_down_read_failed+0x18/0x30
   down_read+0x20/0x40
   proc_pid_cmdline_read+0xd9/0x4e0
   __vfs_read+0x37/0x150
   vfs_read+0x96/0x130
   SyS_read+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xc5

Both Alexey Dobriyan and Michal Hocko suggested to use dedicated lock
for them to mitigate the abuse of mmap_sem.

So, introduce a new spinlock in mm_struct to protect the concurrent
access to arg_start|end, env_start|end and others, as well as replace
write map_sem to read to protect the race condition between prctl and
sys_brk which might break check_data_rlimit(), and makes prctl more
friendly to other VM operations.

This patch just eliminates the abuse of mmap_sem, but it can't resolve
the above hung task warning completely since the later
access_remote_vm() call needs acquire mmap_sem.  The mmap_sem
scalability issue will be solved in the future.

[yang.shi@linux.alibaba.com: add comment about mmap_sem and arg_lock]
  Link: http://lkml.kernel.org/r/1524077799-80690-1-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1523730291-109696-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-07 17:34:34 -07:00
Documentation media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
LICENSES LICENSES: Add Linux-OpenIB license text 2018-04-27 16:41:53 -06:00
arch media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
block Changes for 4.18: 2018-06-05 13:24:20 -07:00
certs certs/blacklist_nohashes.c: fix const confusion in certs blacklist 2018-02-21 15:35:43 -08:00
crypto - Introduce arithmetic overflow test helper functions (Rasmus) 2018-06-06 17:27:14 -07:00
drivers media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
include mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
ipc Merge branch 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-06-04 21:02:18 -07:00
kernel mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
lib Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
mm mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct 2018-06-07 17:34:34 -07:00
net net/9p/trans_xen.c: don't inclide rwlock.h directly 2018-06-07 17:34:34 -07:00
samples samples/bpf: xdpsock: use skb Tx path for XDP_SKB 2018-06-05 15:48:57 +02:00
scripts scripts: use SPDX tag in get_maintainer and checkpatch 2018-06-07 17:34:33 -07:00
security Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
sound media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
tools powerpc updates for 4.18 2018-06-07 10:23:33 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2018-06-04 15:23:48 -07:00
.clang-format clang-format: add configuration file 2018-04-11 10:28:35 -07:00
.cocciconfig
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap Merge branch 'asoc-4.17' into asoc-4.18 for compress dependencies 2018-04-26 12:24:28 +01:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS/CREDITS: Drop METAG ARCHITECTURE 2018-03-05 16:34:24 +00:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: add basic helper macros to scripts/Kconfig.include 2018-05-29 03:31:19 +09:00
MAINTAINERS media updates for v4.18-rc1 2018-06-07 12:34:37 -07:00
Makefile Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-06-06 18:39:49 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.