mirror of https://gitee.com/openkylin/linux.git
872485 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Arnaldo Carvalho de Melo | 9afec87ec1 |
perf trace: Pass a syscall_arg to syscall_arg_fmt->strtoul()
With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray' pointer to be used, just like with syscall_arg_fmt->scnprintf() for the other direction (number -> string). With this all the strarrays that are associated with syscalls can be used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be possible as well to use with the strace-like shorter form, with just the syscall names, i.e. something like: -e lseek/whence==END/ For now we have to use the longer form: # perf trace -e syscalls:sys_enter_lseek 0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) ^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR" 0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET) 0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET) 0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET) ^C# And for those curious about what are those lseek(DSO, offset, SET), well, its the loader: # perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR" 0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) ^C# :-) With this we can use strings in strarrays in filters, which allows us to reuse all these that are in place for syscalls: $ find tools/perf/trace/beauty/ -name "*.c" | xargs grep -w DEFINE_STRARRAY tools/perf/trace/beauty/fcntl.c: static DEFINE_STRARRAY(fcntl_setlease, "F_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(mmap_flags, "MAP_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(madvise_advices, "MADV_"); tools/perf/trace/beauty/sync_file_range.c: static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_"); tools/perf/trace/beauty/socket.c: static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_"); tools/perf/trace/beauty/mount_flags.c: static DEFINE_STRARRAY(mount_flags, "MS_"); tools/perf/trace/beauty/pkey_alloc.c: static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_"); tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_"); tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR"); tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_options, "PR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_"); tools/perf/trace/beauty/fspick.c: static DEFINE_STRARRAY(fspick_flags, "FSPICK_"); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(ioctl_tty_cmd, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(drm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(kvm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(perf_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, ""); tools/perf/trace/beauty/fsmount.c: static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_"); tools/perf/trace/beauty/renameat.c: static DEFINE_STRARRAY(rename_flags, "RENAME_"); tools/perf/trace/beauty/kcmp.c: static DEFINE_STRARRAY(kcmp_types, "KCMP_"); tools/perf/trace/beauty/move_mount.c: static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_"); $ Well, some, as the mmap flags are like: $ tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x008000) + 1] = "POPULATE", [ilog2(0x010000) + 1] = "NONBLOCK", [ilog2(0x020000) + 1] = "STACK", [ilog2(0x040000) + 1] = "HUGETLB", [ilog2(0x080000) + 1] = "SYNC", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", }; $ So we'll need a strarray__strtoul_flags() that will break donw the flags into tokens separated by '|' before doing the lookup and then go on reconstructing the value from, say: # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE" into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2|0x10|0x0800" and finally into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812" That is what we see if we don't use the augmented view obtained from: # perf trace -e mmap <SNIP> 211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000 <SNIP> But plain use tracefs: procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c6mgkjt8ujnc263eld5tb7q3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Linus Torvalds | 998d75510e |
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton: "Rather a lot of fixes, almost all affecting mm/" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits) scripts/gdb: fix debugging modules on s390 kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register mm/thp: allow dropping THP from page cache mm/vmscan.c: support removing arbitrary sized pages from mapping mm/thp: fix node page state in split_huge_page_to_list() proc/meminfo: fix output alignment mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition mm: include <linux/huge_mm.h> for is_vma_temporary_stack zram: fix race between backing_dev_show and backing_dev_store mm/memcontrol: update lruvec counters in mem_cgroup_move_account ocfs2: fix panic due to ocfs2_wq is null hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic() mm: memblock: do not enforce current limit for memblock_phys* family mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size mm/gup: fix a misnamed "write" argument, and a related bug mm/gup_benchmark: add a missing "w" to getopt string ocfs2: fix error handling in ocfs2_setattr() mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release mm/memunmap: don't access uninitialized memmap in memunmap_pages() ... |
|
Ilya Leoshkevich | 585d730d41 |
scripts/gdb: fix debugging modules on s390
Currently lx-symbols assumes that module text is always located at module->core_layout->base, but s390 uses the following layout: +------+ <- module->core_layout->base | GOT | +------+ <- module->core_layout->base + module->arch->plt_offset | PLT | +------+ <- module->core_layout->base + module->arch->plt_offset + | TEXT | module->arch->plt_size +------+ Therefore, when trying to debug modules on s390, all the symbol addresses are skewed by plt_offset + plt_size. Fix by adding plt_offset + plt_size to module_addr in load_module_symbols(). Link: http://lkml.kernel.org/r/20191017085917.81791-1-iii@linux.ibm.com Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Song Liu | aa5de305c9 |
kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register
Attaching uprobe to text section in THP splits the PMD mapped page table
into PTE mapped entries. On uprobe detach, we would like to regroup PMD
mapped page table entry to regain performance benefit of THP.
However, the regroup is broken For perf_event based trace_uprobe. This
is because perf_event based trace_uprobe calls uprobe_unregister twice
on close: first in TRACE_REG_PERF_CLOSE, then in
TRACE_REG_PERF_UNREGISTER. The second call will split the PMD mapped
page table entry, which is not the desired behavior.
Fix this by only use FOLL_SPLIT_PMD for uprobe register case.
Add a WARN() to confirm uprobe unregister never work on huge pages, and
abort the operation when this WARN() triggers.
Link: http://lkml.kernel.org/r/20191017164223.2762148-6-songliubraving@fb.com
Fixes:
|
|
Kirill A. Shutemov | ef18a1ca84 |
mm/thp: allow dropping THP from page cache
Once a THP is added to the page cache, it cannot be dropped via
/proc/sys/vm/drop_caches. Fix this issue with proper handling in
invalidate_mapping_pages().
Link: http://lkml.kernel.org/r/20191017164223.2762148-5-songliubraving@fb.com
Fixes:
|
|
William Kucharski | 906d278d75 |
mm/vmscan.c: support removing arbitrary sized pages from mapping
__remove_mapping() assumes that pages can only be either base pages or
HPAGE_PMD_SIZE. Ask the page what size it is.
Link: http://lkml.kernel.org/r/20191017164223.2762148-4-songliubraving@fb.com
Fixes:
|
|
Kirill A. Shutemov | 06d3eff62d |
mm/thp: fix node page state in split_huge_page_to_list()
Make sure split_huge_page_to_list() handles the state of shmem THP and
file THP properly.
Link: http://lkml.kernel.org/r/20191017164223.2762148-3-songliubraving@fb.com
Fixes:
|
|
Kirill A. Shutemov | 2be5fbf9a9 |
proc/meminfo: fix output alignment
Patch series "Fixes for THP in page cache", v2.
This patch (of 5):
Add extra space for FileHugePages and FilePmdMapped, so the output is
aligned with other rows.
Link: http://lkml.kernel.org/r/20191017164223.2762148-2-songliubraving@fb.com
Fixes:
|
|
Ben Dooks (Codethink) | a2ae8c0551 |
mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch
mm_init.c needs to include <linux/mman.h> for the definition of vm_committed_as_batch. Fixes the following sparse warning: mm/mm_init.c:141:5: warning: symbol 'vm_committed_as_batch' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191016091509.26708-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Ben Dooks | d0e6a5821c |
mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition
The generic_file_vm_ops is defined in <linux/ramfs.h> so include it to fix the following warning: mm/filemap.c:2717:35: warning: symbol 'generic_file_vm_ops' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191008102311.25432-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Ben Dooks | 444f84fd2a |
mm: include <linux/huge_mm.h> for is_vma_temporary_stack
Include <linux/huge_mm.h> for the definition of is_vma_temporary_stack to fix the following sparse warning: mm/rmap.c:1673:6: warning: symbol 'is_vma_temporary_stack' was not declared. Should it be static? Link: http://lkml.kernel.org/r/20191009151155.27763-1-ben.dooks@codethink.co.uk Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Chenwandun | f7daefe423 |
zram: fix race between backing_dev_show and backing_dev_store
CPU0: CPU1: backing_dev_show backing_dev_store ...... ...... file = zram->backing_dev; down_read(&zram->init_lock); down_read(&zram->init_init_lock) file_path(file, ...); zram->backing_dev = backing_dev; up_read(&zram->init_lock); up_read(&zram->init_lock); gets the value of zram->backing_dev too early in backing_dev_show, which resultin the value being NULL at the beginning, and not NULL later. backtrace: d_path+0xcc/0x174 file_path+0x10/0x18 backing_dev_show+0x40/0xb4 dev_attr_show+0x20/0x54 sysfs_kf_seq_show+0x9c/0x10c kernfs_seq_show+0x28/0x30 seq_read+0x184/0x488 kernfs_fop_read+0x5c/0x1a4 __vfs_read+0x44/0x128 vfs_read+0xa0/0x138 SyS_read+0x54/0xb4 Link: http://lkml.kernel.org/r/1571046839-16814-1-git-send-email-chenwandun@huawei.com Signed-off-by: Chenwandun <chenwandun@huawei.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Konstantin Khlebnikov | ae8af4388d |
mm/memcontrol: update lruvec counters in mem_cgroup_move_account
Mapped, dirty and writeback pages are also counted in per-lruvec stats.
These counters needs update when page is moved between cgroups.
Currently is nobody *consuming* the lruvec versions of these counters and
that there is no user-visible effect.
Link: http://lkml.kernel.org/r/157112699975.7360.1062614888388489788.stgit@buzz
Fixes:
|
|
Yi Li | b918c43021 |
ocfs2: fix panic due to ocfs2_wq is null
mount.ocfs2 failed when reading ocfs2 filesystem superblock encounters an error. ocfs2_initialize_super() returns before allocating ocfs2_wq. ocfs2_dismount_volume() triggers the following panic. Oct 15 16:09:27 cnwarekv-205120 kernel: On-disk corruption discovered.Please run fsck.ocfs2 once the filesystem is unmounted. Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_read_locked_inode:537 ERROR: status = -30 Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_init_global_system_inodes:458 ERROR: status = -30 Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_init_global_system_inodes:491 ERROR: status = -30 Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_initialize_super:2313 ERROR: status = -30 Oct 15 16:09:27 cnwarekv-205120 kernel: (mount.ocfs2,22804,44): ocfs2_fill_super:1033 ERROR: status = -30 ------------[ cut here ]------------ Oops: 0002 [#1] SMP NOPTI CPU: 1 PID: 11753 Comm: mount.ocfs2 Tainted: G E 4.14.148-200.ckv.x86_64 #1 Hardware name: Sugon H320-G30/35N16-US, BIOS 0SSDX017 12/21/2018 task: ffff967af0520000 task.stack: ffffa5f05484000 RIP: 0010:mutex_lock+0x19/0x20 Call Trace: flush_workqueue+0x81/0x460 ocfs2_shutdown_local_alloc+0x47/0x440 [ocfs2] ocfs2_dismount_volume+0x84/0x400 [ocfs2] ocfs2_fill_super+0xa4/0x1270 [ocfs2] ? ocfs2_initialize_super.isa.211+0xf20/0xf20 [ocfs2] mount_bdev+0x17f/0x1c0 mount_fs+0x3a/0x160 Link: http://lkml.kernel.org/r/1571139611-24107-1-git-send-email-yili@winhong.com Signed-off-by: Yi Li <yilikernel@gmail.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
David Hildenbrand | f231fe4235 |
hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic()
Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get touched. Let's make sure that we only consider online memory (managed by the buddy) that has initialized memmaps. ZONE_DEVICE is not applicable. page_zone() will call page_to_nid(), which will trigger VM_BUG_ON_PGFLAGS(PagePoisoned(page), page) with CONFIG_PAGE_POISONING and CONFIG_DEBUG_VM_PGFLAGS when called on uninitialized memmaps. This can be the case when an offline memory block (e.g., never onlined) is spanned by a zone. Note: As explained by Michal in [1], alloc_contig_range() will verify the range. So it boils down to the wrong access in this function. [1] http://lkml.kernel.org/r/20180423000943.GO17484@dhcp22.suse.cz Link: http://lkml.kernel.org/r/20191015120717.4858-1-david@redhat.com Fixes: |
|
Mike Rapoport | f3057ad767 |
mm: memblock: do not enforce current limit for memblock_phys* family
Until commit |
|
Honglei Wang | b11edebbc9 |
mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size
Commit |
|
John Hubbard | 0cd22afdce |
mm/gup: fix a misnamed "write" argument, and a related bug
In several routines, the "flags" argument is incorrectly named "write".
Change it to "flags".
Also, in one place, the misnaming led to an actual bug:
"flags & FOLL_WRITE" is required, rather than just "flags".
(That problem was flagged by krobot, in v1 of this patch.)
Also, change the flags argument from int, to unsigned int.
You can see that this was a simple oversight, because the
calling code passes "flags" to the fifth argument:
gup_pgd_range():
...
if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
PGDIR_SHIFT, next, flags, pages, nr))
...which, until this patch, the callees referred to as "write".
Also, change two lines to avoid checkpatch line length
complaints, and another line to fix another oversight
that checkpatch called out: missing "int" on pdshift.
Link: http://lkml.kernel.org/r/20191014184639.1512873-3-jhubbard@nvidia.com
Fixes:
|
|
John Hubbard | 6f24c8d30d |
mm/gup_benchmark: add a missing "w" to getopt string
Even though gup_benchmark.c has code to handle the -w command-line option, the "w" is not part of the getopt string. It looks as if it has been missing the whole time. On my machine, this leads naturally to the following predictable result: $ sudo ./gup_benchmark -w ./gup_benchmark: invalid option -- 'w' ...which is fixed with this commit. Link: http://lkml.kernel.org/r/20191014184639.1512873-2-jhubbard@nvidia.com Signed-off-by: John Hubbard <jhubbard@nvidia.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: kbuild test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Chengguang Xu | ce750f43f5 |
ocfs2: fix error handling in ocfs2_setattr()
Should set transfer_to[USRQUOTA/GRPQUOTA] to NULL on error case before jumping to do dqput(). Link: http://lkml.kernel.org/r/20191010082349.1134-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
Roman Gushchin | b749ecfaf6 |
mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release
Karsten reported the following panic in __free_slab() happening on a s390x
machine:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:00000000017d4007 R3:000000007fbd0007 S:000000007fbff000 P:000000000000003d
Oops: 0004 ilc:3 Ý#1¨ PREEMPT SMP
Modules linked in: tcp_diag inet_diag xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_at nf_nat
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-05872-g6133e3e4bada-dirty #14
Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
Krnl PSW : 0704d00180000000 00000000003cadb6 (__free_slab+0x686/0x6b0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
Krnl GPRS: 00000000f3a32928 0000000000000000 000000007fbf5d00 000000000117c4b8
0000000000000000 000000009e3291c1 0000000000000000 0000000000000000
0000000000000003 0000000000000008 000000002b478b00 000003d080a97600
0000000000000003 0000000000000008 000000002b478b00 000003d080a97600
000000000117ba00 000003e000057db0 00000000003cabcc 000003e000057c78
Krnl Code: 00000000003cada6: e310a1400004 lg %r1,320(%r10)
00000000003cadac: c0e50046c286 brasl %r14,ca32b8
#00000000003cadb2: a7f4fe36 brc 15,3caa1e
>00000000003cadb6: e32060800024 stg %r2,128(%r6)
00000000003cadbc: a7f4fd9e brc 15,3ca8f8
00000000003cadc0: c0e50046790c brasl %r14,c99fd8
00000000003cadc6: a7f4fe2c brc 15,3caa
00000000003cadc6: a7f4fe2c brc 15,3caa1e
00000000003cadca: ecb1ffff00d9 aghik %r11,%r1,-1
Call Trace:
(<00000000003cabcc> __free_slab+0x49c/0x6b0)
<00000000001f5886> rcu_core+0x5a6/0x7e0
<0000000000ca2dea> __do_softirq+0xf2/0x5c0
<0000000000152644> irq_exit+0x104/0x130
<000000000010d222> do_IRQ+0x9a/0xf0
<0000000000ca2344> ext_int_handler+0x130/0x134
<0000000000103648> enabled_wait+0x58/0x128
(<0000000000103634> enabled_wait+0x44/0x128)
<0000000000103b00> arch_cpu_idle+0x40/0x58
<0000000000ca0544> default_idle_call+0x3c/0x68
<000000000018eaa4> do_idle+0xec/0x1c0
<000000000018ee0e> cpu_startup_entry+0x36/0x40
<000000000122df34> arch_call_rest_init+0x5c/0x88
<0000000000000000> 0x0
INFO: lockdep is turned off.
Last Breaking-Event-Address:
<00000000003ca8f4> __free_slab+0x1c4/0x6b0
Kernel panic - not syncing: Fatal exception in interrupt
The kernel panics on an attempt to dereference the NULL memcg pointer.
When shutdown_cache() is called from the kmem_cache_destroy() context, a
memcg kmem_cache might have empty slab pages in a partial list, which are
still charged to the memory cgroup.
These pages are released by free_partial() at the beginning of
shutdown_cache(): either directly or by scheduling a RCU-delayed work
(if the kmem_cache has the SLAB_TYPESAFE_BY_RCU flag). The latter case
is when the reported panic can happen: memcg_unlink_cache() is called
immediately after shrinking partial lists, without waiting for scheduled
RCU works. It sets the kmem_cache->memcg_params.memcg pointer to NULL,
and the following attempt to dereference it by __free_slab() from the
RCU work context causes the panic.
To fix the issue, let's postpone the release of the memcg pointer to
destroy_memcg_params(). It's called from a separate work context by
slab_caches_to_rcu_destroy_workfn(), which contains a full RCU barrier.
This guarantees that all scheduled page release RCU works will complete
before the memcg pointer will be zeroed.
Big thanks for Karsten for the perfect report containing all necessary
information, his help with the analysis of the problem and testing of the
fix.
Link: http://lkml.kernel.org/r/20191010160549.1584316-1-guro@fb.com
Fixes:
|
|
Aneesh Kumar K.V | 77e080e768 |
mm/memunmap: don't access uninitialized memmap in memunmap_pages()
Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6. This series fixes the access of uninitialized memmaps when shrinking zones/nodes and when removing memory. Also, it contains all fixes for crashes that can be triggered when removing certain namespace using memunmap_pages() - ZONE_DEVICE, reported by Aneesh. We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be more involved (we don't have SECTION_IS_ONLINE as an indicator), and shrinking is only of limited use (set_zone_contiguous() cannot detect the ZONE_DEVICE as contiguous). We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of code to a minimum. Shrinking is especially necessary to keep zone->contiguous set where possible, especially, on memory unplug of DIMMs at zone boundaries. -------------------------------------------------------------------------- Zones are now properly shrunk when offlining memory blocks or when onlining failed. This allows to properly shrink zones on memory unplug even if the separate memory blocks of a DIMM were onlined to different zones or re-onlined to a different zone after offlining. Example: :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 :/# echo "online_movable" > /sys/devices/system/memory/memory41/state :/# echo "online_movable" > /sys/devices/system/memory/memory43/state :/# cat /proc/zoneinfo Node 1, zone Movable spanned 98304 present 65536 managed 65536 :/# echo 0 > /sys/devices/system/memory/memory43/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 32768 present 32768 managed 32768 :/# echo 0 > /sys/devices/system/memory/memory41/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 This patch (of 10): With an altmap, the memmap falling into the reserved altmap space are not initialized and, therefore, contain a garbage NID and a garbage zone. Make sure to read the NID/zone from a memmap that was initialized. This fixes a kernel crash that is observed when destroying a namespace: kernel BUG at include/linux/mm.h:1107! cpu 0x1: Vector: 700 (Program Check) at [c000000274087890] pc: c0000000004b9728: memunmap_pages+0x238/0x340 lr: c0000000004b9724: memunmap_pages+0x234/0x340 ... pid = 3669, comm = ndctl kernel BUG at include/linux/mm.h:1107! devm_action_release+0x30/0x50 release_nodes+0x268/0x2d0 device_release_driver_internal+0x174/0x240 unbind_store+0x13c/0x190 drv_attr_store+0x44/0x60 sysfs_kf_write+0x70/0xa0 kernfs_fop_write+0x1ac/0x290 __vfs_write+0x3c/0x70 vfs_write+0xe4/0x200 ksys_write+0x7c/0x140 system_call+0x5c/0x68 The "page_zone(pfn_to_page(pfn)" was introduced by |
|
David Hildenbrand | 00d6c019b5 |
mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()
We might use the nid of memmaps that were never initialized. For example, if the memmap was poisoned, we will crash the kernel in pfn_to_nid() right now. Let's use the calculated boundaries of the separate zones instead. This now also avoids having to iterate over a whole bunch of subsections again, after shrinking one zone. Before commit |
|
Qian Cai | a26ee565b6 |
mm/page_owner: don't access uninitialized memmaps when reading /proc/pagetypeinfo
Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get touched. For example, when not onlining a memory block that is spanned by a zone and reading /proc/pagetypeinfo with CONFIG_DEBUG_VM_PGFLAGS and CONFIG_PAGE_POISONING, we can trigger a kernel BUG: :/# echo 1 > /sys/devices/system/memory/memory40/online :/# echo 1 > /sys/devices/system/memory/memory42/online :/# cat /proc/pagetypeinfo > test.file page:fffff2c585200000 is uninitialized and poisoned raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p)) There is not page extension available. ------------[ cut here ]------------ kernel BUG at include/linux/mm.h:1107! invalid opcode: 0000 [#1] SMP NOPTI Please note that this change does not affect ZONE_DEVICE, because pagetypeinfo_showmixedcount_print() is called from mm/vmstat.c:pagetypeinfo_showmixedcount() only for populated zones, and ZONE_DEVICE is never populated (zone->present_pages always 0). [david@redhat.com: move check to outer loop, add comment, rephrase description] Link: http://lkml.kernel.org/r/20191011140638.8160-1-david@redhat.com Fixes: |
|
Joel Colledge | ca210ba32e |
scripts/gdb: fix lx-dmesg when CONFIG_PRINTK_CALLER is set
When CONFIG_PRINTK_CALLER is set, struct printk_log contains an additional member caller_id. This affects the offset of the log text. Account for this by using the type information from gdb to determine all the offsets instead of using hardcoded values. This fixes following error: (gdb) lx-dmesg Python Exception <class 'ValueError'> embedded null character: Error occurred in Python command: embedded null character The read_u* utility functions now take an offset argument to make them easier to use. Link: http://lkml.kernel.org/r/20191011142500.2339-1-joel.colledge@linbit.com Signed-off-by: Joel Colledge <joel.colledge@linbit.com> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Leonard Crestez <leonard.crestez@nxp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
|
David Hildenbrand | 96c804a6ae |
mm/memory-failure.c: don't access uninitialized memmaps in memory_failure()
We should check for pfn_to_online_page() to not access uninitialized memmaps. Reshuffle the code so we don't have to duplicate the error message. Link: http://lkml.kernel.org/r/20191009142435.3975-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Fixes: |
|
David Hildenbrand | aad5f69bc1 |
fs/proc/page.c: don't access uninitialized memmaps in fs/proc/page.c
There are three places where we access uninitialized memmaps, namely: - /proc/kpagecount - /proc/kpageflags - /proc/kpagecgroup We have initialized memmaps either when the section is online or when the page was initialized to the ZONE_DEVICE. Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. For example, not onlining a DIMM during boot and calling /proc/kpagecount with CONFIG_PAGE_POISONING: :/# cat /proc/kpagecount > tmp.test BUG: unable to handle page fault for address: fffffffffffffffe #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 114616067 P4D 114616067 PUD 114618067 PMD 0 Oops: 0000 [#1] SMP NOPTI CPU: 0 PID: 469 Comm: cat Not tainted 5.4.0-rc1-next-20191004+ #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 RIP: 0010:kpagecount_read+0xce/0x1e0 Code: e8 09 83 e0 3f 48 0f a3 02 73 2d 4c 89 e7 48 c1 e7 06 48 03 3d ab 51 01 01 74 1d 48 8b 57 08 480 RSP: 0018:ffffa14e409b7e78 EFLAGS: 00010202 RAX: fffffffffffffffe RBX: 0000000000020000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 00007f76b5595000 RDI: fffff35645000000 RBP: 00007f76b5595000 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000020000 R14: 00007f76b5595000 R15: ffffa14e409b7f08 FS: 00007f76b577d580(0000) GS:ffff8f41bd400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffffe CR3: 0000000078960000 CR4: 00000000000006f0 Call Trace: proc_reg_read+0x3c/0x60 vfs_read+0xc5/0x180 ksys_read+0x68/0xe0 do_syscall_64+0x5c/0xa0 entry_SYSCALL_64_after_hwframe+0x49/0xbe For now, let's drop support for ZONE_DEVICE from the three pseudo files in order to fix this. To distinguish offline memory (with garbage memmap) from ZONE_DEVICE memory with properly initialized memmaps, we would have to check get_dev_pagemap() and pfn_zone_device_reserved() right now. The usage of both (especially, special casing devmem) is frowned upon and needs to be reworked. The fundamental issue we have is: if (pfn_to_online_page(pfn)) { /* memmap initialized */ } else if (pfn_valid(pfn)) { /* * ??? * a) offline memory. memmap garbage. * b) devmem: memmap initialized to ZONE_DEVICE. * c) devmem: reserved for driver. memmap garbage. * (d) devmem: memmap currently initializing - garbage) */ } We'll leave the pfn_zone_device_reserved() check in stable_page_flags() in place as that function is also used from memory failure. We now no longer dump information about pages that are not in use anymore - offline. Link: http://lkml.kernel.org/r/20191009142435.3975-2-david@redhat.com Fixes: |
|
David Hildenbrand | 641fe2e938 |
drivers/base/memory.c: don't access uninitialized memmaps in soft_offline_page_store()
Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get touched. Right now, when trying to soft-offline a PFN that resides on a memory block that was never onlined, one gets a misleading error with CONFIG_PAGE_POISONING: :/# echo 5637144576 > /sys/devices/system/memory/soft_offline_page [ 23.097167] soft offline: 0x150000 page already poisoned But the actual result depends on the garbage in the memmap. soft_offline_page() can only work with online pages, it returns -EIO in case of ZONE_DEVICE. Make sure to only forward pages that are online (iow, managed by the buddy) and, therefore, have an initialized memmap. Add a check against pfn_to_online_page() and similarly return -EIO. Link: http://lkml.kernel.org/r/20191010141200.8985-1-david@redhat.com Fixes: |
|
Ingo Molnar | ae79d5588a |
perf/core: Fix !CONFIG_PERF_EVENTS build warnings and failures
sparc64 runs into this warning: include/linux/security.h:1913:52: warning: 'struct perf_event' declared inside parameter list will not be visible outside of this definition or declaration which is escalated to a build error in some of the .c files due to -Werror. Fix it via a forward declaration, like we do for perf_event_attr, the stub inlines don't actually need to know the structure of this struct. Fixes: da97e18458fb: ("perf_event: Add support for LSM and SELinux checks") Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> Cc: linux-kernel@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | d418d07005 |
for-linus-2019-10-18
-----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2qbF0QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgptsuEADEKL8pta74uy50pl0t8l9fZ++U+wdIeEIW 9uumpOEPnI2GpkG1sOyKWK6tl8InQLw6pAquP9MoT2BHXqFHk7NIgtvk67lwQeoc dRwklVfvOLAdnKzyfODqE9Fh9BgczZIuOLzgdtNqrPKqgJfFRCwN94Kj/r2tYuy7 v+riK3A49u12dOLtjU6ciNgZ0m1iUX9s0+PFYVUXtJHU/1OYToQaKP+sgWiue0Ca VJP/L4MLYD0a7tfd92WAK7xWLsYWTDw1Gg20hXH/tV+IIDQ5+OXhu2s6PuqI7c0y cZqWHQHBDkZMQvT8+V+YqZtEa+xwVCom51prJEPasmdq3fGx+2sDC1HQiySao1ML wfFxZvFvY9fm6M7p2xsSNEcOmamrx1aLLyNSbjIvAqLUDYJWWS56BHsKyTU5Z+Jp RA9dpq8iR6ISaIAcFf0IB0pJSv1HEeHyo/ixlALqezBFJaMdhWy/M+dEbWKtix9M s19ozcpe+omN9+O0anlLtzKNgj2Xnjiwuu8mhVcqn6uG/p6GUOup+lNvTW/fig3I JBH8kObjYXL181V9rYVqFutnuqcf2HYqMvV2vzAmg4LYnPVUmU7HMj8zEpxc4N+f Evd77j0wXmY9S+4JERxaqQZuvKBEIkvM1rkk3N4NbNghfa7QL4aW+I9cWtuelPC2 E+DK7if0Gg== =rvkw -----END PGP SIGNATURE----- Merge tag 'for-linus-2019-10-18' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: - NVMe pull request from Keith that address deadlocks, double resets, memory leaks, and other regression. - Fixup elv_support_iosched() for bio based devices (Damien) - Fixup for the ahci PCS quirk (Dan) - Socket O_NONBLOCK handling fix for io_uring (me) - Timeout sequence io_uring fixes (yangerkun) - MD warning fix for parameter default_layout (Song) - blkcg activation fixes (Tejun) - blk-rq-qos node deletion fix (Tejun) * tag 'for-linus-2019-10-18' of git://git.kernel.dk/linux-block: nvme-pci: Set the prp2 correctly when using more than 4k page io_uring: fix logic error in io_timeout io_uring: fix up O_NONBLOCK handling for sockets md/raid0: fix warning message for parameter default_layout libata/ahci: Fix PCS quirk application blk-rq-qos: fix first node deletion of rq_qos_del() blkcg: Fix multiple bugs in blkcg_activate_policy() io_uring: consider the overflow of sequence for timeout req nvme-tcp: fix possible leakage during error flow nvmet-loop: fix possible leakage during error flow block: Fix elv_support_iosched() nvme-tcp: Initialize sk->sk_ll_usec only with NET_RX_BUSY_POLL nvme: Wait for reset state when required nvme: Prevent resets during paused controller state nvme: Restart request timers in resetting state nvme: Remove ADMIN_ONLY state nvme-pci: Free tagset if no IO queues nvme: retain split access workaround for capability reads nvme: fix possible deadlock when nvme_update_formats fails |
|
Linus Torvalds | dfdcff3215 |
RISC-V updates for v5.4-rc4
Some RISC-V fixes for v5.4-rc4: - Fix the virtual memory layout so the fixaddr region doesn't overlap with other regions. (This was originally intended to go in as part of an earlier patch, but I inadvertently dropped it during a rebase.) - Add the DT chosen/stdout-path property to the HiFive Unleashed DT file. This is so "earlycon" can be specified with no arguments on the kernel command line, and the correct UART will be automatically selected. And two cleanup patches: - Simplify the code in our breakpoint trap handler. - Drop a comment in our TLB flush code that has caused some confusion. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl2qS2MACgkQx4+xDQu9 KksbNg/9FkWN0OvkFNWvj79IYYSc5wvdntTnDdaij9a5/RQ/mTQKjB411ZnL99Yt 7ac+NxzJAJ2i3h+D0ijudwY27vAbvVBEA4rT4SDjo5ENa7ceomt5tdCGmRKaJVpN Qz5Tvhx+KYJ3iROp8+MLkYzIgHFpB4vcSVwruxi5r5Dtnr+doclzgD6bAuzEz3It AQ00upeuB02cbTsu5OclFQ+6BuJ+V2ERQ3CQzNs9+p69ax/1etCYY+n+ZDwhlZ0U XoXSVs8hn9zjFviU0CySCVwReoZUAermU+q7r06BxWrLHixAmIQEgNZGKc3E+KRt nvMKlgjGFMpSo28OrJS7ron2jjC9bGLpI177SgelnH5lZCAyx/xWT9snk51AUyv0 aCVwEZVVSsohfNB5d7zkZq3uWHnoxOlGcISkBe0bVER4o8jBKNYJcF1rjQKt2dYR +4zZVpQNS9aOyrvpa4zIKIyuy4sxuOgE1gNmgVU/rEOpzgysqM+zqzW2JJ9UKJdd 9IXFGpUYSJMAQxIEslSfEQ2Ep+L/n1AMGn4fFEDdNiDGYwC6miLxrLc1zw6MR3DA ds470VAZL6uvt0TvSWxRLVI/2LA80S2TrfzEcHp/Stzc8U/wPppO4ZCw7npgTYfa qMedZQm58+t99/Uh1RXRWq8ciC/3aC490WtrN6ZgMofZRyLU4XI= =aGYX -----END PGP SIGNATURE----- Merge tag 'riscv/for-v5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Some RISC-V fixes: - Fix the virtual memory layout so the fixaddr region doesn't overlap with other regions. (This was originally intended to go in as part of an earlier patch, but I inadvertently dropped it during a rebase) - Add the DT chosen/stdout-path property to the HiFive Unleashed DT file. This is so "earlycon" can be specified with no arguments on the kernel command line, and the correct UART will be automatically selected. And two cleanup patches: - Simplify the code in our breakpoint trap handler. - Drop a comment in our TLB flush code that has caused some confusion" * tag 'riscv/for-v5.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: fix virtual address overlapped in FIXADDR_START and VMEMMAP_START riscv: tlbflush: remove confusing comment on local_flush_tlb_all() riscv: dts: HiFive Unleashed: add default chosen/stdout-path riscv: remove the switch statement in do_trap_break() |
|
Linus Torvalds | b9959c7a34 |
filldir[64]: remove WARN_ON_ONCE() for bad directory entries
This was always meant to be a temporary thing, just for testing and to
see if it actually ever triggered.
The only thing that reported it was syzbot doing disk image fuzzing, and
then that warning is expected. So let's just remove it before -rc4,
because the extra sanity testing should probably go to -stable, but we
don't want the warning to do so.
Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com
Fixes:
|
|
Linus Torvalds | 6b95cf9b8b |
A future-proofing decoding fix from Jeff intended for stable and
a patch for a mostly benign race from Dongsheng. -----BEGIN PGP SIGNATURE----- iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAl2qAEkTHGlkcnlvbW92 QGdtYWlsLmNvbQAKCRBKf944AhHzi4mZB/9HMfEfZ8JC9keyVaJpAyvV8ufTR4qs 4b8NNc0MDM01z1Z23G0o89b5M0WEDcGslh25plCifxyNIMa+L/lYKl8CTr7CLVQS qCEtNgJ7ibfM26v7rfHOlk6Nnd07/OmjcioaHu/R3bqEQmXpcWQg+aX9C6mPh/2f yzZTKZdKhTZfUyQQctuhNo9G+wD8K86DYT1XRbubPNQ3VtXKPuNH9rLhvLCZzbVA 6FHW05A4mwSv80MsLgN6qLSKxv/+LjV/voHepH4HygqUKw2+1lwi9BC/4k7sprQs 1jFONZ0p1sv/LdWwJYUyCpwj6d3NliXM0uvYxfyzKveWWCxb3l3gaWUS =7scd -----END PGP SIGNATURE----- Merge tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A future-proofing decoding fix from Jeff intended for stable and a patch for a mostly benign race from Dongsheng" * tag 'ceph-for-5.4-rc4' of git://github.com/ceph/ceph-client: rbd: cancel lock_dwork if the wait is interrupted ceph: just skip unrecognized info in ceph_reply_info_extra |
|
Linus Torvalds | fb8527e5c1 |
- Fix DM snapshot deadlock that can occur due to COW throttling
preventing locks from being released. - Fix DM cache's GFP_NOWAIT allocation failure error paths by switching to GFP_NOIO. - Make __hash_find() static in the DM clone target. -----BEGIN PGP SIGNATURE----- iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl2pyB0THHNuaXR6ZXJA cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWgevB/9+YCl73OAJfvT2d74ZO+2qNU8LiWVF /kGlClB7Z7lIyoe7ikwR5wQzmflxqOjBss6/ISOQ7lgizpzNL3xj4LCJnmaoDfDg pjzBevFI7U/Dknx+J6+RHGDRfIFMBHF119+QY1ayjsYpaqVPZQcZI74ZbPJzJoky lrlBbXuW1jClIvsvzN1wfYX3jua7kpGGYlWiEqezWLuv8tfz/6A28chCkjx9/SlN VaIXb2dZuODXpT/0m9HRjAnk04ceqV6fXRhpQp217WR/foK8/EUWznKmWP9d1nLN hkuqIOVQA0AfUvJFMVf9PrZz4TdIICV9fz6+tOogRqYMBEe3sU1chmIB =93dq -----END PGP SIGNATURE----- Merge tag 'for-5.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM snapshot deadlock that can occur due to COW throttling preventing locks from being released. - Fix DM cache's GFP_NOWAIT allocation failure error paths by switching to GFP_NOIO. - Make __hash_find() static in the DM clone target. * tag 'for-5.4/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm cache: fix bugs when a GFP_NOWAIT allocation fails dm snapshot: rework COW throttling to fix deadlock dm snapshot: introduce account_start_copy() and account_end_copy() dm clone: Make __hash_find static |
|
Linus Torvalds | 90105ae1ee |
IOMMU Fixes for Linux v5.4-rc3:
Including: - Fixes for page-table issues on Mali GPUs - Missing free in an error path for ARM-SMMU - PASID decoding in the AMD IOMMU Event log code - Another update for the locking fixes in the AMD IOMMU driver - Reduce the calls to platform_get_irq() in the IPMMU-VMSA and Rockchip IOMMUs to get rid of the warning message added to this function recently -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAl2p08MACgkQK/BELZcB GuMlzA/7Bs9TTOr6Cn0idE3FY9T998ZezYi1vLInlyfe+5dD0NZThn1QCDUguTjA KLaWkNkQUPsPr36EWEgrh85mCeUMjf/WB/ua7zUHKdox4om5bQwtb4o8dnDfsrzr V2geah7mwuzTtksTsNhC8oHqHHT2bMw/Uw0ykMOHyOOrwFQ/fwe2Aj+LTqlOASex eUbqeDbE8XkHcflKGjA8r+2fpyaZPUIGj6fSEkxLcgj5S2uDFQJ1Og82TogkQb57 ox4pex+ZcdJbMJd4kP4iv/AQosuYX83S+nOrICZkdfbXN/YQOqDfhesy3uhHw/du +MoFn/lxUA6DX0/hNewuXXH/+uO6JwMOPOZqa+rkxDEDSUQ1NsNALoqUIW3l+CJe GWkjObEllWEDb1ddSa6C/x3qKKzleN5XOkZuKFHRuN0gtkgZnceqtHKU3ZGXIBWg RP+UjCE6DBujoiWIi2ywE88ccEz0lh/3Aad4TiBlE4zkfphXyF2wfzW9oQntqThQ tRPZlJqdc5d1Vx2D743t3Ueq/vvLspbREa8kIDW9MTJESKenK4O+PVca6Q/zy1lI d8dJ7uFqO+PyUW19IXoRqFHh2MRDGugcWPkzgagWuyeWqFeCuOMhRQDiwImBCpJo BepX1ksmhjACASIq6M7yjV7K/6InXvkakNXTo77hugQIrClvAs8= =Pawk -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fixes for page-table issues on Mali GPUs - Missing free in an error path for ARM-SMMU - PASID decoding in the AMD IOMMU Event log code - Another update for the locking fixes in the AMD IOMMU driver - Reduce the calls to platform_get_irq() in the IPMMU-VMSA and Rockchip IOMMUs to get rid of the warning message added to this function recently * tag 'iommu-fixes-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Check PM_LEVEL_SIZE() condition in locked section iommu/amd: Fix incorrect PASID decoding from event log iommu/ipmmu-vmsa: Only call platform_get_irq() when interrupt is mandatory iommu/rockchip: Don't use platform_get_irq to implicitly count irqs iommu/io-pgtable-arm: Support all Mali configurations iommu/io-pgtable-arm: Correct Mali attributes iommu/arm-smmu: Free context bitmap in the err path of arm_smmu_init_domain_context |
|
Linus Torvalds | 8eb4b3b0dd |
copy-struct-from-user-v5.4-rc4
-----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXacV8gAKCRCRxhvAZXjc oqaZAQDG+ziyN6umUemQPEX1Ar+FOJPIwDrEJdMRmoz3ozTFQAEA0RxquU3LkVnR Rx9wX07ObZB5nMi/V4yANpuH7Vbzrg4= =7JJk -----END PGP SIGNATURE----- Merge tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux Pull usercopy test fixlets from Christian Brauner: "This contains two improvements for the copy_struct_from_user() tests: - a coding style change to get rid of the ugly "if ((ret |= test()))" pointed out when pulling the original patchset. - avoid a soft lockups when running the usercopy tests on machines with large page sizes by scanning only a 1024 byte region" * tag 'copy-struct-from-user-v5.4-rc4' of gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux: usercopy: Avoid soft lockups in test_check_nonzero_user() lib: test_user_copy: style cleanup |
|
Vincenzo Frascino |
8a1bef4193
|
mips: vdso: Fix __arch_get_hw_counter()
On some MIPS variants (e.g. MIPS r1), vDSO clock_mode is set to VDSO_CLOCK_NONE. When VDSO_CLOCK_NONE is set the expected kernel behavior is to fallback on syscalls. To do that the generic vDSO library expects UULONG_MAX as return value of __arch_get_hw_counter(). Fix __arch_get_hw_counter() on MIPS defining a __VDSO_USE_SYSCALL case that addressed the described scenario. Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: linux-mips@vger.kernel.org |
|
Paul Burton |
0ad8f7aa9f
|
MAINTAINERS: Use @kernel.org address for Paul Burton
Switch to using my paulburton@kernel.org email address in order to avoid subject mangling that's being imposed on my previous address. Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Paul Burton <paulburton@kernel.org> Cc: linux-kernel@vger.kernel.org |
|
Roger Quadros | 9794476942 |
usb: cdns3: Error out if USB_DR_MODE_UNKNOWN in cdns3_core_init_role()
USB_DR_MODE_UNKNOWN should be treated as error as it is done in
cdns3_drd_update_mode().
Fixes:
|
|
Stefan Wahren | 626c45d223 |
ARM: dts: bcm2837-rpi-cm3: Avoid leds-gpio probing issue
bcm2835-rpi.dtsi defines the behavior of the ACT LED, which is available on all Raspberry Pi boards. But there is no driver for this particual GPIO on CM3 in mainline yet, so this node was left incomplete without the actual GPIO definition. Since commit |
|
Johan Hovold | 7a6f22d747 |
USB: ldusb: fix read info leaks
Fix broken read implementation, which could be used to trigger slab info
leaks.
The driver failed to check if the custom ring buffer was still empty
when waking up after having waited for more data. This would happen on
every interrupt-in completion, even if no data had been added to the
ring buffer (e.g. on disconnect events).
Due to missing sanity checks and uninitialised (kmalloced) ring-buffer
entries, this meant that huge slab info leaks could easily be triggered.
Note that the empty-buffer check after wakeup is enough to fix the info
leak on disconnect, but let's clear the buffer on allocation and add a
sanity check to read() to prevent further leaks.
Fixes:
|
|
Greg Kroah-Hartman | ec83e4c9af |
USB-serial fixes for 5.4-rc4
Here's a fix for a long-standing locking bug in ti_usb_3410_5052 and related clean up. Both have been in linux-next with no reported issues. Signed-off-by: Johan Hovold <johan@kernel.org> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQQHbPq+cpGvN/peuzMLxc3C7H1lCAUCXancfAAKCRALxc3C7H1l CNClAQCOgbL6NYZwy8xDBOHe5iPHJ218RfCHYwxiRpz3qA830QEA44yBBjPp44Z+ pi3yDRiNcS7epVN7mXGs+9GIrKPTdA8= =8+KD -----END PGP SIGNATURE----- Merge tag 'usb-serial-5.4-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for 5.4-rc4 Here's a fix for a long-standing locking bug in ti_usb_3410_5052 and related clean up. Both have been in linux-next with no reported issues. Signed-off-by: Johan Hovold <johan@kernel.org> * tag 'usb-serial-5.4-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: ti_usb_3410_5052: clean up serial data access USB: serial: ti_usb_3410_5052: fix port-close races |
|
Zhengjun Xing | 9fa8c9c647 |
tracing: Fix "gfp_t" format for synthetic events
In the format of synthetic events, the "gfp_t" is shown as "signed:1", but in fact the "gfp_t" is "unsigned", should be shown as "signed:0". The issue can be reproduced by the following commands: echo 'memlatency u64 lat; unsigned int order; gfp_t gfp_flags; int migratetype' > /sys/kernel/debug/tracing/synthetic_events cat /sys/kernel/debug/tracing/events/synthetic/memlatency/format name: memlatency ID: 2233 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:u64 lat; offset:8; size:8; signed:0; field:unsigned int order; offset:16; size:4; signed:0; field:gfp_t gfp_flags; offset:24; size:4; signed:1; field:int migratetype; offset:32; size:4; signed:1; print fmt: "lat=%llu, order=%u, gfp_flags=%x, migratetype=%d", REC->lat, REC->order, REC->gfp_flags, REC->migratetype Link: http://lkml.kernel.org/r/20191018012034.6404-1-zhengjun.xing@linux.intel.com Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> |
|
Andrew Lunn | 38b4fe3201 |
net: usb: lan78xx: Connect PHY before registering MAC
As soon as the netdev is registers, the kernel can start using the
interface. If the driver connects the MAC to the PHY after the netdev
is registered, there is a race condition where the interface can be
opened without having the PHY connected.
Change the order to close this race condition.
Fixes:
|
|
David S. Miller | e381d2b4e2 |
Merge branch 'vsock-virtio-make-the-credit-mechanism-more-robust'
Stefano Garzarella says: ==================== vsock/virtio: make the credit mechanism more robust This series makes the credit mechanism implemented in the virtio-vsock devices more robust. Patch 1 sends an update to the remote peer when the buf_alloc change. Patch 2 prevents a malicious peer (especially the guest) can consume all the memory of the other peer, discarding packets when the credit available is not respected. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Stefano Garzarella | ae6fcfbf5f |
vsock/virtio: discard packets if credit is not respected
If the remote peer doesn't respect the credit information (buf_alloc, fwd_cnt), sending more data than it can send, we should drop the packets to prevent a malicious peer from using all of our memory. This is patch follows the VIRTIO spec: "VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has sufficient free buffer space for the payload" Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Stefano Garzarella | ec3359b685 |
vsock/virtio: send a credit update when buffer size is changed
When the user application set a new buffer size value, we should update the remote peer about this change, since it uses this information to calculate the credit available. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Ido Schimmel | 2e978795bb |
mlxsw: spectrum_trap: Push Ethernet header before reporting trap
devlink maintains packets and bytes statistics for each trap. Since
eth_type_trans() was called to set the skb's protocol, the data pointer
no longer points to the start of the packet and the bytes accounting is
off by 14 bytes.
Fix this by pushing the skb's data pointer to the start of the packet.
Fixes:
|
|
Dragos Tarcatu |
95a32c9805
|
ASoC: SOF: control: return true when kcontrol values change
All the kcontrol put() functions are currently returning 0 when successful. This does not go well with alsamixer as it does not seem to get notified on SND_CTL_EVENT_MASK_VALUE callbacks when values change for (some of) the sof kcontrols. This patch fixes that by returning true for volume, switch and enum type kcontrols when values do change in put(). Signed-off-by: Dragos Tarcatu <dragos_tarcatu@mentor.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20191018123806.18063-1-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> |
|
Olivier Moysan |
9b7a7f9216
|
ASoC: stm32: sai: fix sysclk management on shutdown
The commit below, adds a call to sysclk callback on shutdown.
This introduces a regression in stm32 SAI driver, as some clock
services are called twice, leading to unbalanced calls.
Move processing related to mclk from shutdown to sysclk callback.
When requested frequency is 0, assume shutdown and release mclk.
Fixes:
|