Commit Graph

737777 Commits

Author SHA1 Message Date
Jason Gunthorpe 104f268d43 IB/uverbs: Improve lockdep_check
This is really being used as an assert that the expected usecnt
is being held and implicitly that the usecnt is valid. Rename it to
assert_uverbs_usecnt and tighten the checks to only accept valid
values of usecnt (eg 0 and < -1 are invalid).

The tigher checkes make the assertion cover more cases and is more
likely to find bugs via syzkaller/etc.

Fixes: 3832125624 ("IB/core: Add support for idr types")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:47 -07:00
Leon Romanovsky 6623e3e3cd RDMA/uverbs: Protect from races between lookup and destroy of uobjects
The race is between lookup_get_idr_uobject and
uverbs_idr_remove_uobj -> uverbs_uobject_put.

We deliberately do not call sychronize_rcu after the idr_remove in
uverbs_idr_remove_uobj for performance reasons, instead we call
kfree_rcu() during uverbs_uobject_put.

However, this means we can obtain pointers to uobj's that have
already been released and must protect against krefing them
using kref_get_unless_zero.

==================================================================
BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
Read of size 4 at addr ffff88005fda1ac8 by task syz-executor2/441

CPU: 1 PID: 441 Comm: syz-executor2 Not tainted 4.15.0-rc2+ #56
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
dump_stack+0x8d/0xd4
print_address_description+0x73/0x290
kasan_report+0x25c/0x370
? copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
? uverbs_try_lock_object+0x68/0xc0
? modify_qp.isra.7+0xdc4/0x10e0
modify_qp.isra.7+0xdc4/0x10e0
ib_uverbs_modify_qp+0xfe/0x170
? ib_uverbs_query_qp+0x970/0x970
? __lock_acquire+0xa11/0x1da0
ib_uverbs_write+0x55a/0xad0
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_query_qp+0x970/0x970
? ib_uverbs_open+0x760/0x760
? futex_wake+0x147/0x410
? sched_clock_cpu+0x18/0x180
? check_prev_add+0x1680/0x1680
? do_futex+0x3b6/0xa30
? sched_clock_cpu+0x18/0x180
__vfs_write+0xf7/0x5c0
? ib_uverbs_open+0x760/0x760
? kernel_read+0x110/0x110
? lock_acquire+0x370/0x370
? __fget+0x264/0x3b0
vfs_write+0x18a/0x460
SyS_write+0xc7/0x1a0
? SyS_read+0x1a0/0x1a0
? trace_hardirqs_on_thunk+0x1a/0x1c
entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x448e29
RSP: 002b:00007f443fee0c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f443fee16bc RCX: 0000000000448e29
RDX: 0000000000000078 RSI: 00000000209f8000 RDI: 0000000000000012
RBP: 000000000070bea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000008e98 R14: 00000000006ebf38 R15: 0000000000000000

Allocated by task 1:
kmem_cache_alloc_trace+0x16c/0x2f0
mlx5_alloc_cmd_msg+0x12e/0x670
cmd_exec+0x419/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30

Freed by task 1:
kfree+0xeb/0x2f0
mlx5_free_cmd_msg+0xcd/0x140
cmd_exec+0xeba/0x1810
mlx5_cmd_exec+0x40/0x70
mlx5_core_mad_ifc+0x187/0x220
mlx5_MAD_IFC+0xd7/0x1b0
mlx5_query_mad_ifc_gids+0x1f3/0x650
mlx5_ib_query_gid+0xa4/0xc0
ib_query_gid+0x152/0x1a0
ib_query_port+0x21e/0x290
mlx5_port_immutable+0x30f/0x490
ib_register_device+0x5dd/0x1130
mlx5_ib_add+0x3e7/0x700
mlx5_add_device+0x124/0x510
mlx5_register_interface+0x11f/0x1c0
mlx5_ib_init+0x56/0x61
do_one_initcall+0xa3/0x250
kernel_init_freeable+0x309/0x3b8
kernel_init+0x14/0x180
ret_from_fork+0x24/0x30

The buggy address belongs to the object at ffff88005fda1ab0
which belongs to the cache kmalloc-32 of size 32
The buggy address is located 24 bytes inside of
32-byte region [ffff88005fda1ab0, ffff88005fda1ad0)
The buggy address belongs to the page:
page:00000000d5655c19 count:1 mapcount:0 mapping: (null)
index:0xffff88005fda1fc0
flags: 0x4000000000000100(slab)
raw: 4000000000000100 0000000000000000 ffff88005fda1fc0 0000000180550008
raw: ffffea00017f6780 0000000400000004 ffff88006c803980 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff88005fda1980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
ffff88005fda1a00: fb fb fc fc fb fb fb fb fc fc 00 00 00 00 fc fc
ffff88005fda1a80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
ffff88005fda1b00: fc fc 00 00 00 00 fc fc fb fb fb fb fc fc fb fb
ffff88005fda1b80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
==================================================================@

Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 4.11
Fixes: 3832125624 ("IB/core: Add support for idr types")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:47 -07:00
Jason Gunthorpe d9dc7a3500 IB/uverbs: Hold the uobj write lock after allocate
This clarifies the design intention that time between allocate and
commit has the uobj exclusive to the caller. We already guarantee
this by delaying publishing the uobj pointer via idr_insert,
fd_install, list_add, etc.

Additionally holding the usecnt lock during this period provides
extra clarity and more protection against future mistakes.

Fixes: 3832125624 ("IB/core: Add support for idr types")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:46 -07:00
Matan Barak 4d39a959bc IB/uverbs: Fix possible oops with duplicate ioctl attributes
If the same attribute is listed twice by the user in the ioctl attribute
list then error unwind can cause the kernel to deref garbage.

This happens when an object with WRITE access is sent twice. The second
parse properly fails but corrupts the state required for the error unwind
it triggers.

Fixing this by making duplicates in the attribute list invalid. This is
not something we need to support.

The ioctl interface is currently recommended to be disabled in kConfig.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:46 -07:00
Matan Barak 9dfb2ff400 IB/uverbs: Add ioctl support for 32bit processes
32 bit processes running on a 64 bit kernel call compat_ioctl so that
implementations can revise any structure layout issues. Point compat_ioctl
at our normal ioctl because:

- All our structures are designed to be the same on 32 and 64 bit, ie we
  use __aligned_u64 when required and are careful to manage padding.

- Any pointers are stored in u64's and userspace is expected
  to prepare them properly.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:46 -07:00
Jason Gunthorpe 5d2beb576d IB/uverbs: Use __aligned_u64 for uapi headers
This has no impact on the structure layout since these structs already
have their u64s already properly aligned, but it does document that we
have this requirement for 32 bit compatibility.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:45 -07:00
Matan Barak 3d89459e2e IB/uverbs: Fix method merging in uverbs_ioctl_merge
Fix a bug in uverbs_ioctl_merge that looked at the object's iterator
number instead of the method's iterator number when merging methods.

While we're at it, make the uverbs_ioctl_merge code a bit more clear
and faster.

Fixes: 118620d368 ('IB/core: Add uverbs merge trees functionality')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:45 -07:00
Jason Gunthorpe 2f36028ce9 IB/uverbs: Use u64_to_user_ptr() not a union
The union approach will get the endianness wrong sometimes if the kernel's
pointer size is 32 bits resulting in EFAULTs when trying to copy to/from
user.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:45 -07:00
Jason Gunthorpe 6c976c30ad IB/uverbs: Use inline data transfer for UHW_IN
The rule for the API is pointers less than 8 bytes are inlined into
the .data field of the attribute. Fix the creation of the driver udata
struct to follow this rule and point to the .data itself when the size
is less than 8 bytes.

Otherwise if the UHW struct is less than 8 bytes the driver will get
EFAULT during copy_from_user.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:44 -07:00
Matan Barak 89d9e8d3f1 IB/uverbs: Always use the attribute size provided by the user
This fixes several bugs around the copy_to/from user path:
 - copy_to used the user provided size of the attribute
   and could copy data beyond the end of the kernel buffer into
   userspace.
 - copy_from didn't know the size of the kernel buffer and
   could have left kernel memory unexpectedly un-initialized.
 - copy_from did not use the user length to determine if the
   attribute data is inlined or not.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:44 -07:00
Leon Romanovsky 415bb699d7 RDMA/restrack: Remove unimplemented XRCD object
Resource tracking of XRCD objects is not implemented in current
version of restrack and hence can be removed.

Fixes: 02d8883f52 ("RDMA/restrack: Add general infrastructure to track RDMA resources")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:44 -07:00
Alaa Hleihel 14fa91e0fe IB/ipoib: Do not warn if IPoIB debugfs doesn't exist
netdev_wait_allrefs() could rebroadcast NETDEV_UNREGISTER event
multiple times until all refs are gone, which will result in calling
ipoib_delete_debug_files multiple times and printing a warning.

Remove the WARN_ONCE since checks of NULL pointers before calling
debugfs_remove are not needed.

Fixes: 771a525840 ("IB/IPoIB: ibX: failed to create mcg debug file")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-15 14:59:43 -07:00
James Hogan 5efad9eee3
sparc,leon: Select USB_UHCI_BIG_ENDIAN_{MMIO,DESC}
Now that USB_UHCI_BIG_ENDIAN_MMIO and USB_UHCI_BIG_ENDIAN_DESC are moved
outside of the USB_SUPPORT conditional, simply select them from
SPARC_LEON rather than by the symbol's defaults in drivers/usb/Kconfig,
similar to how it is done for USB_EHCI_BIG_ENDIAN_MMIO and
USB_EHCI_BIG_ENDIAN_DESC.

Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: sparclinux@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Patchwork: https://patchwork.linux-mips.org/patch/18560/
2018-02-15 21:45:16 +00:00
James Hogan ec897569ad
usb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT
Move the Kconfig symbols USB_UHCI_BIG_ENDIAN_MMIO and
USB_UHCI_BIG_ENDIAN_DESC out of drivers/usb/host/Kconfig, which is
conditional upon USB && USB_SUPPORT, so that it can be freely selected
by platform Kconfig symbols in architecture code.

For example once the MIPS_GENERIC platform selects are fixed in commit
2e6522c565 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN"), the MIPS
32r6_defconfig warns like so:

warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB)
warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB)

Fixes: 2e6522c565 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-usb@vger.kernel.org
Cc: linux-mips@linux-mips.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patchwork: https://patchwork.linux-mips.org/patch/18559/
2018-02-15 21:29:13 +00:00
Jack Stocker 7a1646d922 Add delay-init quirk for Corsair K70 RGB keyboards
Following on from this patch: https://lkml.org/lkml/2017/11/3/516,
Corsair K70 RGB keyboards also require the DELAY_INIT quirk to
start correctly at boot.

Device ids found here:
usb 3-3: New USB device found, idVendor=1b1c, idProduct=1b13
usb 3-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 3-3: Product: Corsair K70 RGB Gaming Keyboard

Signed-off-by: Jack Stocker <jackstocker.93@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 20:52:56 +01:00
Max Filippov 6ac5a11dc6 xtensa: fix high memory/reserved memory collision
Xtensa memory initialization code frees high memory pages without
checking whether they are in the reserved memory regions or not. That
results in invalid value of totalram_pages and duplicate page usage by
CMA and highmem. It produces a bunch of BUGs at startup looking like
this:

BUG: Bad page state in process swapper  pfn:70800
page:be60c000 count:0 mapcount:-127 mapping:  (null) index:0x1
flags: 0x80000000()
raw: 80000000 00000000 00000001 ffffff80 00000000 be60c014 be60c014 0000000a
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Tainted: G    B            4.16.0-rc1-00015-g7928b2cbe55b-dirty #23
Stack:
 bd839d33 00000000 00000018 ba97b64c a106578c bd839d70 be60c000 00000000
 a1378054 bd86a000 00000003 ba97b64c a1066166 bd839da0 be60c000 ffe00000
 a1066b58 bd839dc0 be504000 00000000 000002f4 bd838000 00000000 0000001e
Call Trace:
 [<a1065734>] bad_page+0xac/0xd0
 [<a106578c>] free_pages_check_bad+0x34/0x4c
 [<a1066166>] __free_pages_ok+0xae/0x14c
 [<a1066b58>] __free_pages+0x30/0x64
 [<a1365de5>] init_cma_reserved_pageblock+0x35/0x44
 [<a13682dc>] cma_init_reserved_areas+0xf4/0x148
 [<a10034b8>] do_one_initcall+0x80/0xf8
 [<a1361c16>] kernel_init_freeable+0xda/0x13c
 [<a125b59d>] kernel_init+0x9/0xd0
 [<a1004304>] ret_from_kernel_thread+0xc/0x18

Only free high memory pages that are not reserved.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2018-02-15 09:46:42 -08:00
AMAN DEEP 46408ea558 usb: ohci: Proper handling of ed_rm_list to handle race condition between usb_kill_urb() and finish_unlinks()
There is a race condition between finish_unlinks->finish_urb() function
and usb_kill_urb() in ohci controller case. The finish_urb calls
spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call,
then if during this time, usb_kill_urb is called for another endpoint,
then new ed will be added to ed_rm_list at beginning for unlink, and
ed_rm_list will point to newly added.

When finish_urb() is completed in finish_unlinks() and ed->td_list
becomes empty as in below code (in finish_unlinks() function):

        if (list_empty(&ed->td_list)) {
                *last = ed->ed_next;
                ed->ed_next = NULL;
        } else if (ohci->rh_state == OHCI_RH_RUNNING) {
                *last = ed->ed_next;
                ed->ed_next = NULL;
                ed_schedule(ohci, ed);
        }

The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next
and previously added ed by usb_kill_urb will be left unreferenced by
ed_rm_list. This causes usb_kill_urb() hang forever waiting for
finish_unlink to remove added ed from ed_rm_list.

The main reason for hang in this race condtion is addition and removal
of ed from ed_rm_list in the beginning during usb_kill_urb and later
last* is modified in finish_unlinks().

As suggested by Alan Stern, the solution for proper handling of
ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
any URBs. Then at the end, we can add ed back to the list if necessary.

This properly handle the updated ohci->ed_rm_list in usb_kill_urb().

Fixes: 977dcfdc60 ("USB: OHCI: don't lose track of EDs when a controller dies")
Acked-by: Alan Stern <stern@rowland.harvard.edu>
CC: <stable@vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep@samsung.com>
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:45:34 +01:00
Peter Chen 91b119359c usb: host: ehci: always enable interrupt for qtd completion at test mode
At former code, the SETUP stage does not enable interrupt
for qtd completion, it relies on IAA watchdog to complete
interrupt, then the transcation would be considered timeout
if the flag need_io_watchdog is cleared by platform code.

In this commit, we always add enable interrupt for qtd completion,
then the qtd completion can be notified by hardware interrupt.

Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:45:34 +01:00
Karsten Koop 52ad2bd891 usb: ldusb: add PIDs for new CASSY devices supported by this driver
This patch adds support for new CASSY devices to the ldusb driver. The
PIDs are also added to the ignore list in hid-quirks.

Signed-off-by: Karsten Koop <kkoop@ld-didactic.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:44:03 +01:00
Yoshihiro Shimoda d6efa938ac usb: renesas_usbhs: missed the "running" flag in usb_dmac with rx path
This fixes an issue that a gadget driver (usb_f_fs) is possible to
stop rx transactions after the usb-dmac is used because the following
functions missed to set/check the "running" flag.
 - usbhsf_dma_prepare_pop_with_usb_dmac()
 - usbhsf_dma_pop_done_with_usb_dmac()

So, if next transaction uses pio, the usbhsf_prepare_pop() can not
start the transaction because the "running" flag is 0.

Fixes: 8355b2b308 ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle")
Cc: <stable@vger.kernel.org> # v3.19+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:43:57 +01:00
Peter Chen 02a10f061a usb: host: ehci: use correct device pointer for dma ops
commit a8c06e407e ("usb: separate out sysdev pointer from usb_bus")
converted to use hcd->self.sysdev for DMA operations instead of
hcd->self.controller, but forgot to do it for hcd test mode. Replace
the correct one in this commit.

Fixes: a8c06e407e ("usb: separate out sysdev pointer from usb_bus")
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:43:57 +01:00
Shuah Khan 009f41aed4 usbip: keep usbip_device sockfd state in sync with tcp_socket
Keep usbip_device sockfd state in sync with tcp_socket. When tcp_socket
is reset to null, reset sockfd to -1 to keep it in sync.

Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:43:57 +01:00
Shigeru Yoshida b2685bdacd ohci-hcd: Fix race condition caused by ohci_urb_enqueue() and io_watchdog_func()
Running io_watchdog_func() while ohci_urb_enqueue() is running can
cause a race condition where ohci->prev_frame_no is corrupted and the
watchdog can mis-detect following error:

  ohci-platform 664a0800.usb: frame counter not updating; disabled
  ohci-platform 664a0800.usb: HC died; cleaning up

Specifically, following scenario causes a race condition:

  1. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
     and enters the critical section
  2. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
     returns false
  3. ohci_urb_enqueue() sets ohci->prev_frame_no to a frame number
     read by ohci_frame_no(ohci)
  4. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
  5. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
     flags) and exits the critical section
  6. Later, ohci_urb_enqueue() is called
  7. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
     and enters the critical section
  8. The timer scheduled on step 4 expires and io_watchdog_func() runs
  9. io_watchdog_func() calls spin_lock_irqsave(&ohci->lock, flags)
     and waits on it because ohci_urb_enqueue() is already in the
     critical section on step 7
 10. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
     returns false
 11. ohci_urb_enqueue() sets ohci->prev_frame_no to new frame number
     read by ohci_frame_no(ohci) because the frame number proceeded
     between step 3 and 6
 12. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
 13. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
     flags) and exits the critical section, then wake up
     io_watchdog_func() which is waiting on step 9
 14. io_watchdog_func() enters the critical section
 15. io_watchdog_func() calls ohci_frame_no(ohci) and set frame_no
     variable to the frame number
 16. io_watchdog_func() compares frame_no and ohci->prev_frame_no

On step 16, because this calling of io_watchdog_func() is scheduled on
step 4, the frame number set in ohci->prev_frame_no is expected to the
number set on step 3.  However, ohci->prev_frame_no is overwritten on
step 11.  Because step 16 is executed soon after step 11, the frame
number might not proceed, so ohci->prev_frame_no must equals to
frame_no.

To address above scenario, this patch introduces a special sentinel
value IO_WATCHDOG_OFF and set this value to ohci->prev_frame_no when
the watchdog is not pending or running.  When ohci_urb_enqueue()
schedules the watchdog (step 4 and 12 above), it compares
ohci->prev_frame_no to IO_WATCHDOG_OFF so that ohci->prev_frame_no is
not overwritten while io_watchdog_func() is running.

Signed-off-by: Shigeru Yoshida <Shigeru.Yoshida@windriver.com>
Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:43:57 +01:00
Kristian Evensen 71a0483d56 USB: serial: option: Add support for Quectel EP06
The Quectel EP06 is a Cat. 6 LTE modem, and the interface mapping is as
follows:

0: Diag
1: NMEA
2: AT
3: Modem

Interface 4 is QMI and interface 5 is ADB, so they are blacklisted.

This patch should also be considered for -stable. The QMI-patch for this
modem is already in the -stable-queue.

v1->v2:
* Updated commit prefix (thanks Johan Hovold)
* Updated commit message slightly.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Acked-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Zhengjun Xing 11cd764dc9 xhci: fix xhci debugfs errors in xhci_stop
In function xhci_stop, xhci_debugfs_exit called before xhci_mem_cleanup.
xhci_debugfs_exit removed the xhci debugfs root nodes, xhci_mem_cleanup
called function xhci_free_virt_devices_depth_first which in turn called
function xhci_debugfs_remove_slot.
Function xhci_debugfs_remove_slot removed the nodes for devices, the nodes
folders are sub folder of xhci debugfs.

It is unreasonable to remove xhci debugfs root folder before
xhci debugfs sub folder. Function xhci_mem_cleanup should be called
before function xhci_debugfs_exit.

Fixes: 02b6fdc2a1 ("usb: xhci: Add debugfs interface for xHCI driver")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Zhengjun Xing 8c5a93ebf7 xhci: xhci debugfs device nodes weren't removed after device plugged out
There is a bug after plugged out USB device, the device and its ep00
nodes are still kept, we need to remove the nodes in xhci_free_dev when
USB device is plugged out.

Fixes: 052f71e25a ("xhci: Fix xhci debugfs NULL pointer dereference in resume from hibernate")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Zhengjun Xing d916767172 xhci: Fix xhci debugfs devices node disappearance after hibernation
During system resume from hibernation, xhci host is reset, all the
nodes in devices folder are removed in xhci_mem_cleanup function.
Later nodes in /sys/kernel/debug/usb/xhci/* are created again in
function xhci_run, but the nodes already exist, so the nodes still
keep the old ones, finally device nodes in xhci debugfs folder
/sys/kernel/debug/usb/xhci/*/devices/* are disappeared.

This fix removed xhci debugfs nodes before the nodes are re-created,
so all the nodes in xhci debugfs can be re-created successfully.

Fixes: 02b6fdc2a1 ("usb: xhci: Add debugfs interface for xHCI driver")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Zhengjun Xing fa2dfd0ec2 xhci: Fix NULL pointer in xhci debugfs
Commit dde634057d ("xhci: Fix use-after-free in xhci debugfs") causes a
null pointer dereference while fixing xhci-debugfs usage of ring pointers
that were freed during hibernate.

The fix passed addresses to ring pointers instead, but forgot to do this
change for the xhci_ring_trb_show function.

The address of the ring pointer passed to xhci-debugfs was of a temporary
ring pointer "new_ring" instead of the actual ring "ring" pointer. The
temporary new_ring pointer will be set to NULL later causing the NULL
pointer dereference.

This issue was seen when reading xhci related files in debugfs:

cat /sys/kernel/debug/usb/xhci/*/devices/*/ep*/trbs

[  184.604861] BUG: unable to handle kernel NULL pointer dereference at (null)
[  184.613776] IP: xhci_ring_trb_show+0x3a/0x890
[  184.618733] PGD 264193067 P4D 264193067 PUD 263238067 PMD 0
[  184.625184] Oops: 0000 [#1] SMP
[  184.726410] RIP: 0010:xhci_ring_trb_show+0x3a/0x890
[  184.731944] RSP: 0018:ffffba8243c0fd90 EFLAGS: 00010246
[  184.737880] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000000295d6
[  184.746020] RDX: 00000000000295d5 RSI: 0000000000000001 RDI: ffff971a6418d400
[  184.754121] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  184.762222] R10: ffff971a64c98a80 R11: ffff971a62a00e40 R12: ffff971a62a85500
[  184.770325] R13: 0000000000020000 R14: ffff971a6418d400 R15: ffff971a6418d400
[  184.778448] FS:  00007fe725a79700(0000) GS:ffff971a6ec00000(0000) knlGS:0000000000000000
[  184.787644] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  184.794168] CR2: 0000000000000000 CR3: 000000025f365005 CR4: 00000000003606f0
[  184.802318] Call Trace:
[  184.805094]  ? seq_read+0x281/0x3b0
[  184.809068]  seq_read+0xeb/0x3b0
[  184.812735]  full_proxy_read+0x4d/0x70
[  184.817007]  __vfs_read+0x23/0x120
[  184.820870]  vfs_read+0x91/0x130
[  184.824538]  SyS_read+0x42/0x90
[  184.828106]  entry_SYSCALL_64_fastpath+0x1a/0x7d

Fixes: dde634057d ("xhci: Fix use-after-free in xhci debugfs")
Cc: <stable@vger.kernel.org> # v4.15
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Mathias Nyman 1208d8a84f xhci: Don't print a warning when setting link state for disabled ports
When disabling a USB3 port the hub driver will set the port link state to
U3 to prevent "ejected" or "safely removed" devices that are still
physically connected from immediately re-enumerating.

If the device was really unplugged, then error messages were printed
as the hub tries to set the U3 link state for a port that is no longer
enabled.

xhci-hcd ee000000.usb: Cannot set link state.
usb usb8-port1: cannot disable (err = -32)

Don't print error message in xhci-hub if hub tries to set port link state
for a disabled port. Return -ENODEV instead which also silences hub driver.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Joe Lee bde0716d1f xhci: workaround for AMD Promontory disabled ports wakeup
For AMD Promontory xHCI host, although you can disable USB ports in
BIOS settings, those ports will be enabled anyway after you remove a
device on that port and re-plug it in again. It's a known limitation of
the chip. As a workaround we can clear the PORT_WAKE_BITS.

[commit and code comment rephrasing -Mathias]
Signed-off-by: Joe Lee <asmt.swfae@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-15 18:36:19 +01:00
Linus Torvalds 1388c80438 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes:

   - fix rq->lock lockdep annotation bug

   - fix/improve update_curr_rt() and update_curr_dl() accounting

   - update documentation

   - remove unused macro"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macro
  sched/core: Fix DEBUG_SPINLOCK annotation for rq->lock
  sched/rt: Make update_curr_rt() more accurate
  sched/deadline: Make update_curr_dl() more accurate
  membarrier-sync-core: Document architecture support
2018-02-15 09:28:47 -08:00
Linus Torvalds e9e3b3002f Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "This contains two qspinlock fixes and three documentation and comment
  fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/semaphore: Update the file path in documentation
  locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()
  locking/qspinlock: Ensure node->count is updated before initialising node
  locking/qspinlock: Ensure node is initialised before updating prev->next
  Documentation/locking/mutex-design: Update to reflect latest changes
2018-02-15 09:05:26 -08:00
Minwoo Im 096392e071 block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
Update comment typo _consisitent_ to _consistent_ from following commit.
commit 0206319fdf ("blk-mq: Fix poll_stat for new size-based bucketing.")

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-02-15 08:27:06 -07:00
Roger Quadros 98112041bc usb: dwc3: core: Fix ULPI PHYs and prevent phy_get/ulpi_init during suspend/resume
In order for ULPI PHYs to work, dwc3_phy_setup() and dwc3_ulpi_init()
must be doene before dwc3_core_get_phy().

commit 541768b08a ("usb: dwc3: core: Call dwc3_core_get_phy() before initializing phys")
broke this.

The other issue is that dwc3_core_get_phy() and dwc3_ulpi_init() should
be called only once during the life cycle of the driver. However,
as dwc3_core_init() is called during system suspend/resume it will
result in multiple calls to dwc3_core_get_phy() and dwc3_ulpi_init()
which is wrong.

Fix this by moving dwc3_ulpi_init() out of dwc3_phy_setup()
into dwc3_core_ulpi_init(). Use a flag 'ulpi_ready' to ensure that
dwc3_core_ulpi_init() is called only once from dwc3_core_init().

Use another flag 'phys_ready' to call dwc3_core_get_phy() only once from
dwc3_core_init().

Fixes: 541768b08a ("usb: dwc3: core: Call dwc3_core_get_phy() before initializing phys")
Fixes: f54edb539c ("usb: dwc3: core: initialize ULPI before trying to get the PHY")
Cc: linux-stable <stable@vger.kernel.org> # >= v4.13
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2018-02-15 15:28:35 +02:00
Wei Yongjun 8874ae5f15 USB: gadget: udc: Add missing platform_device_put() on error in bdc_pci_probe()
Add the missing platform_device_put() before return from bdc_pci_probe()
in the platform_device_add_resources() error handling case.

Fixes: efed421a94 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
2018-02-15 15:28:13 +02:00
Hendrik Brueckner f1d0b4cde9 Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
This reverts commit f120c7b187e6c418238710b48723ce141f467543 which is no
longer required with the introduction of a syscall.tbl on s390.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-q1lg0nvhha1tk39ri9aqalcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:15 -03:00
Hendrik Brueckner 690d22d9d4 perf s390: Rework system call table creation by using syscall.tbl
Recently, s390 uses a syscall.tbl input file to generate its system call
table and unistd uapi header files.  Hence, update mksyscalltbl to use
it as input to create the system table for perf.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-4-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-bdyhllhsq1zgxv2qx4m377y6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:08 -03:00
Hendrik Brueckner baa6761030 perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
Grab a copy of the s390 system call table file introduced with commit
857f46bfb0 "s390/syscalls: add system call
table".

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-3-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-hpw7vdjp7g92ivgpddrp5ydq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 10:06:00 -03:00
Ingo Molnar f091f1d6a2 tools/headers: Synchronize kernel ABI headers, v4.16-rc1
Sync the following tooling headers with the latest kernel version:

  tools/arch/powerpc/include/uapi/asm/kvm.h
  tools/arch/x86/include/asm/cpufeatures.h
  tools/include/uapi/drm/i915_drm.h
  tools/include/uapi/linux/if_link.h
  tools/include/uapi/linux/kvm.h

All the changes are new ABI additions which don't impact their use
in existing tooling.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 10:01:46 -03:00
Thomas Richter 7a92453620 perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
On Intel test case trace+probe_libc_inet_pton.sh succeeds and the
output is:

[root@f27 perf]# ./perf trace --no-syscalls
                  -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
     0.000 probe_libc:inet_pton:(7fa40ac618a0))
              __GI___inet_pton (/usr/lib64/libc-2.26.so)
              getaddrinfo (/usr/lib64/libc-2.26.so)
              main (/usr/bin/ping)

The kernel stack unwinder is used, it is specified implicitly
as call-graph=fp (frame pointer).

On s390x only dwarf is available for stack unwinding. It is also
done in user space. This requires different parameter setup
and result checking for s390x and Intel.

This patch adds separate perf trace setup and result checking
for Intel and s390x. On s390x specify this command line to
get a call-graph and handle the different call graph result
checking:

[root@s35lp76 perf]# ./perf trace --no-syscalls
	-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms
     0.000 probe_libc:inet_pton:(3ffb9942060))
            __GI___inet_pton (/usr/lib64/libc-2.26.so)
            gaih_inet (inlined)
            __GI_getaddrinfo (inlined)
            main (/usr/bin/ping)
            __libc_start_main (/usr/lib64/libc-2.26.so)
            _start (/usr/bin/ping)
[root@s35lp76 perf]#

Before:
[root@s8360047 perf]# ./perf test -vv 58
58: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 26349
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
0.000 probe_libc:inet_pton:(3ff925c2060))
test child finished with -1
 ---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@s8360047 perf]#

After:
[root@s35lp76 perf]# ./perf test -vv 57
57: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 38708
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms
0.000 probe_libc:inet_pton:(3ff87342060))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
__GI_getaddrinfo (inlined)
main (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/ping)
test child finished with 0
 ---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
[root@s35lp76 perf]#

On Intel the test case runs unchanged and succeeds.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:47 -03:00
Sangwon Hong ba7e851642 perf data: Document missing --force option
Add the --force option to the man page.

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1517831315-31490-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:33 -03:00
Andy Shevchenko 6677d26c8b perf tools: Substitute yet another strtoull()
Instead of home grown function let's use what library provides us.

Signed-off-by: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180129130359.1490-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:19 -03:00
Kan Liang 8cc42de736 perf top: Check the latency of perf_top__mmap_read()
The latency of perf_top__mmap_read() should be lower than refresh time.
If not, give some hints to reduce the latency.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-18-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:57:06 -03:00
Kan Liang ebebbf0823 perf top: Switch default mode to overwrite mode
perf_top__mmap_read() has a severe performance issue in the Knights
Landing/Mill platform, when monitoring heavy load systems. It costs
several minutes to finish, which is unacceptable.

Currently, 'perf top' uses the non overwrite mode. For non overwrite
mode, it tries to read everything in the ringbuffer and doesn't pause
it. Once there are lots of samples delivered persistently, the
processing time could be very long. Also, the latest samples could be
lost when the ringbuffer is full.

For overwrite mode, it takes a snapshot for the system by pausing the
ringbuffer, which could significantly reduce the processing time.  Also,
the overwrite mode always keep the latest samples.  Considering the real
time requirement for 'perf top', the overwrite mode is more suitable for
it.

Actually, 'perf top' was overwrite mode. It is changed to non overwrite
mode since commit 93fc64f144 ("perf top: Switch to non overwrite
mode"). It's better to change it back to overwrite mode by default.

For the kernel which doesn't support overwrite mode, it will fall back
to non overwrite mode.

There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and can be tolerated.

For overwrite mode, unconditionally wait 100 ms before each snapshot. It
also reduces the overhead caused by pausing ringbuffer, especially on
light load system.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-17-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:54 -03:00
Kan Liang a1ff5b05e9 perf top: Remove lost events checking
There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and could be tolerated by 'perf top'.

Remove the lost events checking.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-16-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:43 -03:00
Kan Liang 06cc1a470a perf hists browser: Add parameter to disable lost event warning
For overwrite mode, the ringbuffer will be paused. The event lost is
expected. It needs a way to notify the browser not print the warning.

It will be used later for perf top to disable lost event warning in
overwrite mode. There is no behavior change for now.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-15-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:26 -03:00
Kan Liang 204721d7ea perf top: Add overwrite fall back
Switch to non-overwrite mode if kernel doesnot support overwrite
ringbuffer.

It's only effect when overwrite mode is supported.  No change to current
behavior.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-14-git-send-email-kan.liang@intel.com
[ Use perf_missing_features.write_backward instead of the non merged is_write_backward_fail() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:56:14 -03:00
Arnaldo Carvalho de Melo 9a831b3a32 perf evsel: Expose the perf_missing_features struct
As tools may need to adjust to missing features, as 'perf top' will, in
the next csets, to cope with a missing 'write_backward' feature.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jelngl9q1ooaizvkcput9tic@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:53 -03:00
Kan Liang 63878a53ce perf top: Check per-event overwrite term
Per-event overwrite term is not forbidden in 'perf top', which can bring
problems. Because 'perf top' only support non-overwrite mode now.

Add new rules and check regarding to overwrite term for 'perf top'.
- All events either have same per-event term or don't have per-event
  mode setting. Otherwise, it will error out.
- Per-event overwrite term should be consistent as opts->overwrite.
  If not, updating the opts->overwrite according to per-event term.

Make it possible to support either non-overwrite or overwrite mode.
The overwrite mode is forbidden now, which will be removed when the
overwrite mode is supported later.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-12-git-send-email-kan.liang@intel.com
[ Renamed perf_top_overwrite_check to perf_top__overwrite_check, to follow existing convention ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:42 -03:00
Kan Liang 3effc2f165 perf mmap: Discard legacy interface for mmap read
Discards perf_mmap__read_backward() and perf_mmap__read_catchup(). No
tools use them.

There are tools still use perf_mmap__read_forward(). Keep it, but add
comments to point to the new interface for future use.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-11-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-15 09:54:17 -03:00