Pull perf fixes from Thomas Gleixner:
"A set of perf related fixes:
- fix a CR4.PCE propagation issue caused by usage of mm instead of
active_mm and therefore propagated the wrong value.
- perf core fixes, which plug a use-after-free issue and make the
event inheritance on fork more robust.
- a tooling fix for symbol handling"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf symbols: Fix symbols__fixup_end heuristic for corner cases
x86/perf: Clarify why x86_pmu_event_mapped() isn't racy
x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm
perf/core: Better explain the inherit magic
perf/core: Simplify perf_event_free_task()
perf/core: Fix event inheritance on fork()
perf/core: Fix use-after-free in perf_release()
The current symbols__fixup_end() heuristic for the last entry in the rb
tree is suboptimal as it leads to not being able to recognize the symbol
in the call graph in a couple of corner cases, for example:
i) If the symbol has a start address (f.e. exposed via kallsyms)
that is at a page boundary, then the roundup(curr->start, 4096)
for the last entry will result in curr->start == curr->end with
a symbol length of zero.
ii) If the symbol has a start address that is shortly before a page
boundary, then also here, curr->end - curr->start will just be
very few bytes, where it's unrealistic that we could perform a
match against.
Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so
that we can catch such corner cases and have a better chance to find
that specific symbol. It's still just best effort as the real end of the
symbol is unknown to us (and could even be at a larger offset than the
current range), but better than the current situation.
Alexei reported that he recently run into case i) with a JITed eBPF
program (these are all page aligned) as the last symbol which wasn't
properly shown in the call graph (while other eBPF program symbols in
the rb tree were displayed correctly). Since this is a generic issue,
lets try to improve the heuristic a bit.
Reported-and-Tested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 2e538c4a18 ("perf tools: Improve kernel/modules symbol lookup")
Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Conflicts:
drivers/net/ethernet/broadcom/genet/bcmgenet.c
net/core/sock.c
Conflicts were overlapping changes in bcmgenet and the
lockdep handling of sockets.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Ensure that mtu is at least IPV6_MIN_MTU in ipv6 VTI tunnel driver,
from Steffen Klassert.
2) Fix crashes when user tries to get_next_key on an LPM bpf map, from
Alexei Starovoitov.
3) Fix detection of VLAN fitlering feature for bnx2x VF devices, from
Michal Schmidt.
4) We can get a divide by zero when TCP socket are morphed into
listening state, fix from Eric Dumazet.
5) Fix socket refcounting bugs in skb_complete_wifi_ack() and
skb_complete_tx_timestamp(). From Eric Dumazet.
6) Use after free in dccp_feat_activate_values(), also from Eric
Dumazet.
7) Like bonding team needs to use ETH_MAX_MTU as netdev->max_mtu, from
Jarod Wilson.
8) Fix use after free in vrf_xmit(), from David Ahern.
9) Don't do UDP Fragmentation Offload on IPComp ipsec packets, from
Alexey Kodanev.
10) Properly check napi_complete_done() return value in order to decide
whether to re-enable IRQs or not in amd-xgbe driver, from Thomas
Lendacky.
11) Fix double free of hwmon device in marvell phy driver, from Andrew
Lunn.
12) Don't crash on malformed netlink attributes in act_connmark, from
Etienne Noss.
13) Don't remove routes with a higher metric in ipv6 ECMP route replace,
from Sabrina Dubroca.
14) Don't write into a cloned SKB in ipv6 fragmentation handling, from
Florian Westphal.
15) Fix routing redirect races in dccp and tcp, basically the ICMP
handler can't modify the socket's cached route in it's locked by the
user at this moment. From Jon Maxwell.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (108 commits)
qed: Enable iSCSI Out-of-Order
qed: Correct out-of-bound access in OOO history
qed: Fix interrupt flags on Rx LL2
qed: Free previous connections when releasing iSCSI
qed: Fix mapping leak on LL2 rx flow
qed: Prevent creation of too-big u32-chains
qed: Align CIDs according to DORQ requirement
mlxsw: reg: Fix SPVMLR max record count
mlxsw: reg: Fix SPVM max record count
net: Resend IGMP memberships upon peer notification.
dccp: fix memory leak during tear-down of unsuccessful connection request
tun: fix premature POLLOUT notification on tun devices
dccp/tcp: fix routing redirect race
ucc/hdlc: fix two little issue
vxlan: fix ovs support
net: use net->count to check whether a netns is alive or not
bridge: drop netfilter fake rtable unconditionally
ipv6: avoid write to a possibly cloned skb
net: wimax/i2400m: fix NULL-deref at probe
isdn/gigaset: fix NULL-deref at probe
...
The main item is the addition of the Power9 Machine Check handler. This was
delayed to make sure some details were correct, and is as minimal as possible.
The rest is small fixes, two for the Power9 PMU, two dealing with obscure
toolchain problems, two for the PowerNV IOMMU code (used by VFIO), and one to
fix a crash on 32-bit machines with macio devices due to missing dma_ops.
Thanks to:
Alexey Kardashevskiy, Cyril Bur, Larry Finger, Madhavan Srinivasan, Nicholas
Piggin.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYx0IxAAoJEFHr6jzI4aWAplIQAKtEJklDDnu/lqnR1iR+Tiqf
fyVAdiPJ2MBcwkodcodg12PNcU2vB9nQwzfNc2BbZe81xZjjAPLNSA3IwAZGm+oB
U+B+oltJu5eKMg7wjRp3rkZZ7h19jT5j/auUAq+kJ9EmtT0Auo0CiQXBuxm2XBpF
77s52A64Ey1EIiSQz/GUW8/vJtGiWj5+tQj55Fsstv8vDyPCrq2AZCoU27z8keFs
iGXSLIuBUCC/VH3U6CmxzBH+g8eYm7ccL/D0T51qgxmUFWh/5NStzIPzjRP1Kq57
iV7hcKiSfNvzLY/rKYr+ziPDH8E3fixZUtcFBMpLKTEfLqJhRZQL8dDvxsfHNe2E
LpWabvnuHCIEf5UEyrrfev+CYVGIrlSC+BD9Ra895KH2h2zmmziRAuQ7gB/h72+o
FDpfcy1Pzgw3BA+CVqL73jZZSgL3GkGigozO1jpU8h+7ufBRKHqdFehvso72N18U
NOHVrNil5qerwN3R9obaVUnXDLCVj67c8ep6cW2zYRkX3oDaXDlBf88VIc4bU9dm
adHUdkmbWIQB096bMTfukY+lsxA3KFq2xfPjlkAwoRkrXx55Qa4ZYCnLcE1rwj8M
18zjroq+7UQsbVGH4rK3iUgUxYbvT7seVA/U7lLchyLdn4qn1TAYXYscW0GIZDdM
dZELElGPncH5x4uEA6Sy
=390M
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull some more powerpc fixes from Michael Ellerman:
"The main item is the addition of the Power9 Machine Check handler.
This was delayed to make sure some details were correct, and is as
minimal as possible.
The rest is small fixes, two for the Power9 PMU, two dealing with
obscure toolchain problems, two for the PowerNV IOMMU code (used by
VFIO), and one to fix a crash on 32-bit machines with macio devices
due to missing dma_ops.
Thanks to:
Alexey Kardashevskiy, Cyril Bur, Larry Finger, Madhavan Srinivasan,
Nicholas Piggin"
* tag 'powerpc-4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: POWER9 machine check handler
powerpc/64s: allow machine check handler to set severity and initiator
powerpc/64s: fix handling of non-synchronous machine checks
powerpc/pmac: Fix crash in dma-mapping.h with NULL dma_ops
powerpc/powernv/ioda2: Update iommu table base on ownership change
powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
selftests/powerpc: Replace stxvx and lxvx with stxvd2x/lxvd2x
powerpc/perf: Handle sdar_mode for marked event in power9
powerpc/perf: Fix perf_get_data_addr() for power9 DD1
powerpc/boot: Fix zImage TOC alignment
Recent merge of 'linux-kselftest-4.11-rc1' tree broke bpf test build.
None of the tests were building and test_verifier.c had tons of compiler errors.
Fix it and add #ifdef CAP_IS_SUPPORTED to support old versions of libcap.
Tested on centos 6.8 and 7
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
linux/tools/testing/selftests/vm $ make
gcc -Wall -I ../../../../usr/include compaction_test.c -lrt -o /compaction_test
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.4/../../../../x86_64-pc-linux-gnu/bin/ld: cannot open output file /compaction_test: Permission denied
collect2: error: ld returned 1 exit status
make: *** [../lib.mk:54: /compaction_test] Error 1
Since commit a8ba798bc8 ("selftests: enable O and KBUILD_OUTPUT")
selftests/vm build fails if run from the "selftests/vm" directory, but
it works in the selftests/ directory. It's quicker to be able to do a
local vm-only build after a tree wipe and this patch allows for it
again.
Link: http://lkml.kernel.org/r/20170302173738.18994-4-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
overide||override
While we are here, fix the doubled "address" in the touched line
Documentation/devicetree/bindings/regulator/ti-abb-regulator.txt.
Also, fix the comment block style in the touched hunks in
drivers/media/dvb-frontends/drx39xyj/drx_driver.h.
Link: http://lkml.kernel.org/r/1481573103-11329-21-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On POWER8 (ISA 2.07) lxvx and stxvx are defined to be extended mnemonics
of lxvd2x and stxvd2x. For POWER9 (ISA 3.0) the HW architects in their
infinite wisdom made lxvx and stxvx instructions in their own right.
POWER9 aware GCC will use the POWER9 instruction for lxvx and stxvx
causing these selftests to fail on POWER8. Further compounding the
issue, because of the way -mvsx works it will cause the power9
instructions to be used regardless of -mcpu=power8 to GCC or -mpower8 to
AS.
The safest way to address the problem for now is to not use the extended
mnemonic. We don't care how the CPU loads the values from memory since
the tests only performs register comparisons, so using stdvd2x/lxvd2x
does not impact the test.
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Balbir Singh<bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
infinite loop while doing the make mrproper. Looking into the cause I noticed
that a recent update to the function run_command (used for running all
shell commands, including "make mrproper") changed the internal loop to
use the function wait_for_input. The wait_for_input uses select to look
at two file descriptors. One is the file descriptor of the command it is
running, the other is STDIN. The STDIN check was not checking the return
status of the sysread call, and was also just writing a lot of data into
syswrite without regard to the size of the data read.
Changing the code to check the return status of sysread, and also to still
process the passed in descriptor data without looping back to the select
fixed Greg's problem.
While looking at this code I also realized that the loop did not honor
the timeout if STDIN always had input (or for some reason return error).
this could prevent wait_for_input to timeout on the file descriptor it
is suppose to be waiting for. That is fixed too.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJYwChiFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
0vwH/0gxaT134N6lkZ5Bdv2RJNVUu8mvAbjnXNPpUz1XSBd4zUVpfKONhxc7O50V
mNb9WfmJ4nhcjp4qeEIhdpJvO2Fjm1grIVWcvnT6FwNfvGG9S73OYyRdK0ggcYhE
gFRsdXBipVNL0pNlJhl1//XHq644IMhqDGRBQmR+eKUym2iiJHYhgteeGOQ3PHg1
L5MW1zORbPzeuVPDKGBVA4LDqlu3/gwJSIGZyYivAJp7f5Q5+t+1FPfUMdhodvps
XiNsgHkHSpjhcCKxbjgSFrIX52AyrciYt+ZlIDps97R+IRk671BFHoOEcSZDux9O
Cm3L3eBA8zIJQn9yXjlVvHfbVxU=
=sGdD
-----END PGP SIGNATURE-----
Merge tag 'ktest-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest fixes from Steven Rostedt:
"Greg Kroah-Hartman reported to me that the ktest of v4.11-rc1 locked
up in an infinite loop while doing the make mrproper.
Looking into the cause I noticed that a recent update to the function
run_command (used for running all shell commands, including "make
mrproper") changed the internal loop to use the function
wait_for_input.
The wait_for_input function uses select to look at two file
descriptors. One is the file descriptor of the command it is running,
the other is STDIN. The STDIN check was not checking the return status
of the sysread call, and was also just writing a lot of data into
syswrite without regard to the size of the data read.
Changing the code to check the return status of sysread, and also to
still process the passed in descriptor data without looping back to
the select fixed Greg's problem.
While looking at this code I also realized that the loop did not honor
the timeout if STDIN always had input (or for some reason return
error). this could prevent wait_for_input to timeout on the file
descriptor it is suppose to be waiting for. That is fixed too"
* tag 'ktest-v4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Make sure wait_for_input does honor the timeout
ktest: Fix while loop in wait_for_input
The function wait_for_input takes in a timeout, and even has a default
timeout. But if for some reason the STDIN descriptor keeps sending in data,
the function will never time out. The timout is to wait for the data from
the passed in file descriptor, not for STDIN. Adding a test in the case
where there's no data from the passed in file descriptor that checks to see
if the timeout passed, will ensure that it will timeout properly even if
there's input in STDIN.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
The run_command function was changed to use the wait_for_input function to
allow having a timeout if the command to run takes too much time. There was
a bug in the wait_for_input where it could end up going into an infinite
loop. There's two issues here. One is that the return value of the sysread
wasn't used for the write (to write a proper size), and that it should
continue processing the passed in file descriptor too even if there was
input. There was no check for error, if for some reason STDIN returned an
error, the function would go into an infinite loop and never exit.
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 6e98d1b441 ("ktest: Add timeout to ssh command")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull x86 fixes from Ingo Molnar:
"Misc fixes and minor updates all over the place:
- an SGI/UV fix
- a defconfig update
- a build warning fix
- move the boot_params file to the arch location in debugfs
- a pkeys fix
- selftests fix
- boot message fixes
- sparse fixes
- a resume warning fix
- ioapic hotplug fixes
- reboot quirks
... plus various minor cleanups"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build/x86_64_defconfig: Enable CONFIG_R8169
x86/reboot/quirks: Add ASUS EeeBook X205TA/W reboot quirk
x86/hpet: Prevent might sleep splat on resume
x86/boot: Correct setup_header.start_sys name
x86/purgatory: Fix sparse warning, symbol not declared
x86/purgatory: Make functions and variables static
x86/events: Remove last remnants of old filenames
x86/pkeys: Check against max pkey to avoid overflows
x86/ioapic: Split IOAPIC hot-removal into two steps
x86/PCI: Implement pcibios_release_device to release IRQ from IOAPIC
x86/intel_rdt: Remove duplicate inclusion of linux/cpu.h
x86/vmware: Remove duplicate inclusion of asm/timer.h
x86/hyperv: Hide unused label
x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk
x86/platform/uv/BAU: Fix HUB errors by remove initial write to sw-ack register
x86/selftests: Add clobbers for int80 on x86_64
x86/apic: Simplify enable_IR_x2apic(), remove try_to_enable_IR()
x86/apic: Fix a warning message in logical CPU IDs allocation
x86/kdebugfs: Move boot params hierarchy under (debugfs)/x86/
Pull core fixes from Ingo Molnar:
"A couple of sched.h splitup related build fixes, plus an objtool fix"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix another GCC jump table detection issue
drivers/char/nwbutton: Fix build breakage caused by include file reshuffling
h8300: Fix build breakage caused by header file changes
avr32: Fix build error caused by include file reshuffling
Pull idr fix (and new tests) from Matthew Wilcox:
"One urgent patch in here; freeing the correct IDA bitmap.
Everything else is changes to the test suite"
* 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax:
radix tree test suite: Specify -m32 in LDFLAGS too
ida: Free correct IDA bitmap
radix tree test suite: Depend on Makefile and quieten grep
radix tree test suite: Fix build with --as-needed
radix tree test suite: Build 32 bit binaries
radix tree test suite: Add performance test for radix_tree_join()
radix tree test suite: Add performance test for radix_tree_split()
radix tree test suite: Add performance benchmarks
radix tree test suite: Add test for radix_tree_clear_tags()
radix tree test suite: Add tests for ida_simple_get() and ida_simple_remove()
radix tree test suite: Add test for idr_get_next()
Five fairly small fixes for things that went in this cycle.
A fairly large patch to rework the CAS logic on Power9, necessitated by a late
change to the firmware API, and we can't boot without it.
Three fixes going to stable, allowing more instructions to be emulated on LE,
fixing a boot crash on 32-bit Freescale BookE machines, and the OPAL XICS
workaround.
And a patch from me to sort the selects under CONFIG PPC. Annoying churn, but
worth it in the long run, and best for it to go in now to avoid conflicts.
Thanks to:
Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R. Shenoy,
Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Sachin Sant,
Shile Zhang, Suraj Jitindar Singh.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJYvqSxAAoJEFHr6jzI4aWAjMQP/06OFGz3VQvO5Q8jPsqRF22y
Wr+04OKFmKnYVObdQk15HGOagp1fSkWWHfP/eu50kx1WNCzq7tQdLjNSi7H4F3s1
4NwlaOfSQoxctsVtfnITJkfVScjcxK7XVagswtb3wvBpBx4lwD8fGwxkSxj6NhRw
PNxLi44wobb8mDyR6L/6tJKBI2Jt12qXZY+kBQIleun5+lF8fNXIu4qPiglMOia6
oPhXlp4RASt8wz74H8JuMTwGv17MxG+zvbkDPwQC7PI/fohJLybgWEfByN4H5UMy
7Xi/lWHlShAyc7ulAIN+A1mHKY9LSv45U6qrrHFUJgRftZihoZHe6ekcI+h5oFVX
chP9oUrQNeeZ5QqUC4rYdWwsMfiXBI0y5+BCupItixXc1LANBH9Ym9IECbgPRP93
LQVqiS4958KijHlYBOA2zPicl/FnVO16orqakyRS0B3lQ54XBvhcgG8gIXjQr8PM
Mt2W4r6RtGJ4ddhUPpF/W4lEuR4+dmXfEqs7DkgBKRbvi8XYkiLx2byBNh/OMRUG
T4ILXsYf50AKRAq/jFTs9A0zkjtmtBeDdn96Mcan8i3WZuTQ7b8mQlC46zEg23A8
XmTG2xt7N1dMjjwS78CfnvQ8sIVtA9AUfK37aTc0ICMsBCqEcWLAhHKZyCw0h25C
wq9BMn4e5Gdg2xLTHKlL
=SxON
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Five fairly small fixes for things that went in this cycle.
A fairly large patch to rework the CAS logic on Power9, necessitated
by a late change to the firmware API, and we can't boot without it.
Three fixes going to stable, allowing more instructions to be emulated
on LE, fixing a boot crash on 32-bit Freescale BookE machines, and the
OPAL XICS workaround.
And a patch from me to sort the selects under CONFIG PPC. Annoying
churn, but worth it in the long run, and best for it to go in now to
avoid conflicts.
Thanks to:
Alexey Kardashevskiy, Anton Blanchard, Balbir Singh, Gautham R.
Shenoy, Laurentiu Tudor, Nicholas Piggin, Paul Mackerras, Ravi
Bangoria, Sachin Sant, Shile Zhang, Suraj Jitindar Singh"
* tag 'powerpc-4.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Sort the selects under CONFIG_PPC
powerpc/64: Fix L1D cache shape vector reporting L1I values
powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info()
powerpc: Update to new option-vector-5 format for CAS
powerpc: Parse the command line before calling CAS
powerpc/xics: Work around limitations of OPAL XICS priority handling
powerpc/64: Fix checksum folding in csum_add()
powerpc/powernv: Fix opal tracepoints with JUMP_LABEL=n
powerpc/booke: Fix boot crash due to null hugepd
powerpc: Fix compiling a BE kernel with a powerpc64le toolchain
selftest/powerpc: Fix false failures for skipped tests
powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop
powerpc/64: Invalidate process table caching after setting process table
powerpc: emulate_step() tests for load/store instructions
powerpc: Emulation support for load/store instructions on LE
Michael's patch to use the default make rule for linking and the patch
from Rehas to use -m32 if building a 32-bit test-suite on a 64-bit
platform don't work well together.
Reported-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
There's a relatively rare race where we look at the per-cpu preallocated
IDA bitmap, see it's NULL, allocate a new one, and atomically update it.
If the kmalloc() happened to sleep and we were rescheduled to a different
CPU, or an interrupt came in at the exact right time, another task
might have successfully allocated a bitmap and already deposited it.
I forgot what the semantics of cmpxchg() were and ended up freeing the
wrong bitmap leading to KASAN reporting a use-after-free.
Dmitry found the bug with syzkaller & wrote the patch. I wrote the test
case that will reproduce the bug without his patch being applied.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Changing the CFLAGS in the Makefile didn't always lead to a
recompilation because the OFILES didn't depend on the Makefile.
Also, after doing make clean, grep would still complain about
a missing map-shift.h; we need -s as well as -q.
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Currently the radix tree test suite doesn't build with toolchains that
use --as-needed by default, for example Ubuntu's:
cc -I. -I../../include -g -O2 -Wall -D_LGPL_SOURCE -fsanitize=address -lpthread -lurcu main.o ... -o main
/usr/bin/ld: regression1.o: undefined reference to symbol 'pthread_join@@GLIBC_2.17'
/lib/powerpc64le-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
This is caused by the custom makefile rules placing LDFLAGS before the
.o files that need the libraries.
We could fix it by using --no-as-needed, or rewriting the custom rules.
But we can also just drop the custom rules and move the libraries to
LDLIBS, and then the default rules work correctly - with the one caveat
that we need to add -fsanitize=address to LDFLAGS because that must be
passed to the linker as well as the compiler.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Add performance benchmarks for radix tree insertion, tagging and deletion.
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Assert that radix_tree_clear_tags() clears the tags on the passed node and
slot. Assert that the case where the radix tree has only one entry at index
zero and the node is NULL, is also handled.
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Assert that ida_simple_get() allocates an id in the passed range or returns
error on failure, and ida_simple_remove() releases an allocated id.
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Assert that idr_get_next() returns the next populated entry in the tree with
an ID greater than or equal to the value pointed to by @nextid argument.
Signed-off-by: Rehas Sachdeva <aquannie@gmail.com>
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Arnd Bergmann reported a (false positive) objtool warning:
drivers/infiniband/sw/rxe/rxe_resp.o: warning: objtool: rxe_responder()+0xfe: sibling call from callable instruction with changed frame pointer
The issue is in find_switch_table(). It tries to find a switch
statement's jump table by walking backwards from an indirect jump
instruction, looking for a relocation to the .rodata section. In this
case it stopped walking prematurely: the first .rodata relocation it
encountered was for a variable (resp_state_name) instead of a jump
table, so it just assumed there wasn't a jump table.
The fix is to ignore any .rodata relocation which refers to an ELF
object symbol. This works because the jump tables are anonymous and
have no symbols associated with them.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 3732710ff6 ("objtool: Improve rare switch jump table pattern detection")
Link: http://lkml.kernel.org/r/20170302225723.3ndbsnl4hkqbne7a@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch adds a function to clean up duplicate config info
on Ubuntu.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
turbostat displays a GFXMHz column, which comes from reading
/sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
But GFXMHz was not changing, even when a manual
cat /sys/class/graphics/fb0/device/drm/card0/gt_cur_freq_mhz
showed a new value.
It turns out that a rewind() on the open file is not sufficient,
fflush() (or a close/open) is needed to read fresh values.
Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
PPC:
* correct assumption about ASDR on POWER9
* fix MMIO emulation on POWER9
x86:
* add a simple test for ioperm
* cleanup TSS
(going through KVM tree as the whole undertaking was caused by VMX's
use of TSS)
* fix nVMX interrupt delivery
* fix some performance counters in the guest
And two cleanup patches.
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJYuu5qAAoJEED/6hsPKofoRAUH/jkx/KFDcw3FggixysWVgRai
iLSbbAZemnSLFSOkOU/t7Bz0fXCUgB0tAcMJd9ow01Dg1zObiTpuUIo6qEPaYHdX
gqtUzlHuyECZEcgK0RXS9kDYLrvw7EFocxnDWQfV91qCZSS6nBSSLF3ST1rNV69W
mUvcZG+MciDcZUe1lTexoswVTh1m7avvozEnQ5OHnZR9yicoXiadBQjzL6yqWoqf
Ml/29zRk5+MvloTudxjkAKm3mh7psW88jNMh37TXbAA7i+Xwl9cU6GLR9mFWstoP
7Ot7ecq9mNAUO3lTIQh7lqvB60LMFznS4IlYK7MbplC3kvJLkfzhTWaN1aGvh90=
=cqHo
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.11-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Radim Krčmář:
"Second batch of KVM changes for the 4.11 merge window:
PPC:
- correct assumption about ASDR on POWER9
- fix MMIO emulation on POWER9
x86:
- add a simple test for ioperm
- cleanup TSS (going through KVM tree as the whole undertaking was
caused by VMX's use of TSS)
- fix nVMX interrupt delivery
- fix some performance counters in the guest
... and two cleanup patches"
* tag 'kvm-4.11-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: nVMX: Fix pending events injection
x86/kvm/vmx: remove unused variable in segment_base()
selftests/x86: Add a basic selftest for ioperm
x86/asm: Tidy up TSS limit code
kvm: convert kvm.users_count from atomic_t to refcount_t
KVM: x86: never specify a sample period for virtualized in_tx_cp counters
KVM: PPC: Book3S HV: Don't use ASDR for real-mode HPT faults on POWER9
KVM: PPC: Book3S HV: Fix software walk of guest process page tables
Pull misc final vfs updates from Al Viro:
"A few unrelated patches that got beating in -next.
Everything else will have to go into the next window ;-/"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
hfs: fix hfs_readdir()
selftest for default_file_splice_read() infoleak
9p: constify ->d_name handling
Pull libnvdimm fixes from Dan Williams:
"A fix and regression test case for nvdimm namespace label
compatibility.
Details:
- An "nvdimm namespace label" is metadata on an nvdimm that
provisions dimm capacity into a "namespace" that can host a block
device / dax-filesytem, or a device-dax character device.
A namespace is an object that other operating environment and
platform firmware needs to comprehend for capabilities like booting
from an nvdimm.
The label metadata contains a checksum that Linux was not
calculating correctly leading to other environments rejecting the
Linux label.
These have received a build success notification from the kbuild
robot, and a positive test result from Nick who reported the problem"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit, libnvdimm: fix interleave set cookie calculation
tools/testing/nvdimm: make iset cookie predictable
This update consists of an urgent fix for individual test build failures
introduced in the 4.11-rc1 update.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYuY6GAAoJEAsCRMQNDUMcEE8QAJ4omNZEcezwGdWSkvV+gDFf
+ybgSxucds/uBGsLv18ephrgOvROvVbmVym5YO1DeIYvE4X2Ze0R+ZzTen49JJis
Zslw3K9GCwviVyn5MjL1JRkq2ea47Gm5EbfAcDaxcphg/xfQtFSGrD4NeaQ8FrNh
M4/9GMbKOfxnOya0V56M4RY+WxLFySk/zfUmgPLLH0gBtY8aAyxQDOFQKvqOBExv
CUFMw7dkQx/JvrVRHZcF34nqXDcKQQeZXBoDAqiHt7M/4YaYlBPVqou/8URgWF6J
hhI1F0O9gadsQ/uK+ENLDe5IWYRGm8fU3gMQnfdPCh9I2Dt6euvmrzjUO2dN56Ps
klo6vlIMamJrCsIi/q+7ak15JVkk2L7YCwu644ZpCVc5Ts8Oa9pukzKA/2ozBx0L
u7xcsC4qiOnxw3I2mgigbT6PP9t7TLQHcSYDZlXvQH2C9ZLjYS9437taAts2iZSM
JpRSMdt6XIQv09Ij6GRnCTo/e+vMAqoLLA8rpFLtecgN1W3YznsStPX/lg47zfdF
rTwvYSrq9JXTgGOZOULa8qDR/ng1Oe5AltLzn5h6gO/sv+gi1bmc+e6s7LqUUQot
2vi20tfCShy6rgB2GZ3pM9Yn/QwItUQW8zfzCtQpJ/9C38AsYFlFBtlde7BGnaow
dPfrL4Ux+7FT/Ne3priu
=A1Z0
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.11-rc1-urgent_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fix from Shuah Khan:
"This update consists of an urgent fix for individual test build
failures introduced in the 4.11-rc1 update"
* tag 'linux-kselftest-4.11-rc1-urgent_fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: lib.mk Fix individual test builds
These update turbostat significantly and in particular:
- Default output is now verbose, --debug is no longer required to
get all counters. As a result, some options have been added to
specify exactly what output is wanted.
- Added --quiet to skip system configuration output
- Added --list, --show and --hide parameters
- Added --cpu parameter
- Enhanced Baytrail SoC support
- Added Gemini Lake SoC support
- Added sysfs C-state columns
Also the symbol definitions in arch/x86/include/asm/intel-family.h
and arch/x86/include/asm/msr-index.h are updated and the intel_idle
and intel_pstate drivers are modified to use the updated symbols.
Credits to Len Brown for all of these changes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJYuLyNAAoJEILEb/54YlRxEvkQAJsggzpgGrlhrO6KHSm4yC9M
CqhBVsdeppX1ZTAVPiMk/pcXQYtL5fZ97ELk2So/CjT5Nh3jwDPMA/ux5n3uiob+
O2BTdtxnpNLxPQPQM1mW7Dr/uAIRlJug9gSMxKDbFSU9Oe3aET58PUdUTs7xaT59
nbtLxVSvzrdGk/bX6WO4ic+7F2licJLZPfDGhYidnoika8LxD4M+cIO73gFpgqQi
yoKrTZyLimvneFT0eAUUvHIyKjkJIxeMfslW57uBpz8rW5my+3UwsdpRG4AIVeWc
wSBlsNqj+TuR4BBiZ2VR2RoHF3qbH/SceI+k864BqyThfyK/g2q/vV/GvLZQCR/R
yWcajWD9kvLKvnm1D3XYOIQDBeP4l60j3vVwHytSvmaPYjn5Ms3jq6b+2K6zkXMM
8y3leW/hgw+rGCacdXPrKIlpBykSV7h+TnD2iMxeeDISNkbefWWDe/WB6HncocAg
HDtKRvU9ntRq6/MlnTKbCFM5c0oCXWRw4QNjDy3AsjJELgeAIwiqpHWMKO6XltFj
qU/rdyW/BTCuAlIjWVbjooAIJZ268geupeug3zvE3uGzrxT4DaVIo8W1wtJ+XQrt
By7sOW/gMQ2EcTJQiuFjS/Gz5gOKQ2F8OLCm6T8Prjh6SxrCUAiuIvP0LmxUCa8i
KMlx+8c9E2f9j+TTt9AP
=oMZe
-----END PGP SIGNATURE-----
Merge tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull turbostat utility updates from Rafael Wysocki:
"Power management turbostat utility updates.
These update turbostat significantly and in particular:
- default output is now verbose, --debug is no longer required to get
all counters. As a result, some options have been added to specify
exactly what output is wanted.
- added --quiet to skip system configuration output
- added --list, --show and --hide parameters
- added --cpu parameter
- enhanced Baytrail SoC support
- added Gemini Lake SoC support
- added sysfs C-state columns
Also the symbol definitions in arch/x86/include/asm/intel-family.h and
arch/x86/include/asm/msr-index.h are updated and the intel_idle and
intel_pstate drivers are modified to use the updated symbols.
Credits to Len Brown for all of these changes"
* tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits)
tools/power turbostat: version 17.02.24
tools/power turbostat: bugfix: --add u32 was printed as u64
tools/power turbostat: show error on exec
tools/power turbostat: dump p-state software config
tools/power turbostat: show package number, even without --debug
tools/power turbostat: support "--hide C1" etc.
tools/power turbostat: move --Package and --processor into the --cpu option
tools/power turbostat: turbostat.8 update
tools/power turbostat: update --list feature
tools/power turbostat: use wide columns to display large numbers
tools/power turbostat: Add --list option to show available header names
tools/power turbostat: fix zero IRQ count shown in one-shot command mode
tools/power turbostat: add --cpu parameter
tools/power turbostat: print sysfs C-state stats
tools/power turbostat: extend --add option to accept /sys path
tools/power turbostat: skip unused counters on BDX
tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
tools/power turbostat: skip unused counters on SKX
tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
tools/power turbostat: initial Gemini Lake SOC support
...
Tests under alignment subdirectory are skipped when executed on previous
generation hardware, but harness still marks them as failed.
test: test_copy_unaligned
tags: git_version:unknown
[SKIP] Test skipped on line 26
skip: test_copy_unaligned
selftests: copy_unaligned [FAIL]
The MAGIC_SKIP_RETURN_VALUE value assigned to rc variable is retained till
the program exit which causes the test to be marked as failed.
This patch resets the value before returning to the main() routine.
With this patch the test o/p is as follows:
test: test_copy_unaligned
tags: git_version:unknown
[SKIP] Test skipped on line 26
skip: test_copy_unaligned
selftests: copy_unaligned [PASS]
Signed-off-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
gcc-7 has an "optimization" pass that completely screws up, and
generates the code expansion for the (impossible) case of calling
ilog2() with a zero constant, even when the code gcc compiles does not
actually have a zero constant.
And we try to generate a compile-time error for anybody doing ilog2() on
a constant where that doesn't make sense (be it zero or negative). So
now gcc7 will fail the build due to our sanity checking, because it
created that constant-zero case that didn't actually exist in the source
code.
There's a whole long discussion on the kernel mailing about how to work
around this gcc bug. The gcc people themselevs have discussed their
"feature" in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72785
but it's all water under the bridge, because while it looked at one
point like it would be solved by the time gcc7 was released, that was
not to be.
So now we have to deal with this compiler braindamage.
And the only simple approach seems to be to just delete the code that
tries to warn about bad uses of ilog2().
So now "ilog2()" will just return 0 not just for the value 1, but for
any non-positive value too.
It's not like I can recall anybody having ever actually tried to use
this function on any invalid value, but maybe the sanity check just
meant that such code never made it out in public.
Reported-by: Laura Abbott <labbott@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit a8ba798bc8 ("selftests: enable O and KBUILD_OUTPUT"), added
support to generate compile targets in a user specified directory. OUTPUT
variable controls the location which is undefined when tests are built in
the test directory or with "make -C tools/testing/selftests/x86".
make -C tools/testing/selftests/x86/
make: Entering directory '/lkml/linux_4.11/tools/testing/selftests/x86'
Makefile:44: warning: overriding recipe for target 'clean'
../lib.mk:51: warning: ignoring old recipe for target 'clean'
gcc -m64 -o /single_step_syscall_64 -O2 -g -std=gnu99 -pthread -Wall single_step_syscall.c -lrt -ldl
/usr/bin/ld: cannot open output file /single_step_syscall_64: Permission denied
collect2: error: ld returned 1 exit status
Makefile:50: recipe for target '/single_step_syscall_64' failed
make: *** [/single_step_syscall_64] Error 1
make: Leaving directory '/lkml/linux_4.11/tools/testing/selftests/x86'
Same failure with "cd tools/testing/selftests/x86/;make" run.
Fix this with a change to lib.mk to define OUTPUT to be the pwd when
MAKELEVEL is 0. This covers both cases mentioned above.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Pull changes related to turbostat for v4.11 from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (44 commits)
tools/power turbostat: version 17.02.24
tools/power turbostat: bugfix: --add u32 was printed as u64
tools/power turbostat: show error on exec
tools/power turbostat: dump p-state software config
tools/power turbostat: show package number, even without --debug
tools/power turbostat: support "--hide C1" etc.
tools/power turbostat: move --Package and --processor into the --cpu option
tools/power turbostat: turbostat.8 update
tools/power turbostat: update --list feature
tools/power turbostat: use wide columns to display large numbers
tools/power turbostat: Add --list option to show available header names
tools/power turbostat: fix zero IRQ count shown in one-shot command mode
tools/power turbostat: add --cpu parameter
tools/power turbostat: print sysfs C-state stats
tools/power turbostat: extend --add option to accept /sys path
tools/power turbostat: skip unused counters on BDX
tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits
tools/power turbostat: skip unused counters on SKX
tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7
tools/power turbostat: initial Gemini Lake SOC support
...
The '__unreachable' and '__func_stack_frame_non_standard' sections are
only used at compile time. They're discarded for vmlinux but they
should also be discarded for modules.
Since this is a recurring pattern, prefix the section names with
".discard.". It's a nice convention and vmlinux.lds.h already discards
such sections.
Also remove the 'a' (allocatable) flag from the __unreachable section
since it doesn't make sense for a discarded section.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d1091c7fa3 ("objtool: Improve detection of BUG() and other dead ends")
Link: http://lkml.kernel.org/r/20170301180444.lhd53c5tibc4ns77@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This doesn't fully exercise the interaction between KVM and ioperm(),
but it does test basic functionality.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Update to the new file paths, remove them from introductory comments.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170218113140.8051-1-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Kernel erases R8..R11 registers prior returning to userspace
from int80:
https://lkml.org/lkml/2009/10/1/164
GCC can reuse these registers and doesn't expect them to change
during syscall invocation. I met this kind of bug in CRIU once
GCC 6.1 and CLANG stored local variables in those registers
and the kernel zerofied them during syscall:
990d33f1a1
By that reason I suggest to add those registers to clobbers
in selftests. Also, as noted by Andy - removed unneeded clobber
for flags in INT $0x80 inline asm.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kselftest@vger.kernel.org
Link: http://lkml.kernel.org/r/20170213101336.20486-1-dsafonov@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
For testing changes to the iset cookie algorithm we need a value that is
constant from run-to-run.
Stop including dynamic data in the emulated region_offset values. Also,
pick values that sort in a different order depending on whether the
comparison is a memcmp() of two 8-byte arrays or subtraction of two
64-bit values.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The turbostat before this last set of changes is obsolete.
This new version can do a lot more, but it also has
some different defaults, that might catch some off-guard.
So it seems a good time to give a new version number.
Signed-off-by: Len Brown <len.brown@intel.com>
When the "u32" keyword is used with --add, it means that
the output should be truncated to 32-bits. This was not
happening and all 64-bits were printed.
Also, when no column name was used for an added MSR,
The default column name was in deximal, eg. MSR16.
Users report that they tend to use hex MSR numbers,
so print them in hex. To always fit into the columns,
use the syntax M0x10. Note that the user can always
supply any column header that they want.
eg --add msr0x10,MY_TSC
Signed-off-by: Len Brown <len.brown@intel.com>
When turbostat is run in one-shot command mode,
the parent takes the 'before' counter snapshot,
fork/exec/wait for the child to exit,
takes the 'after' counter snapshot,
and prints the results.
however, if the child fails to exec the command,
it immediately returns, without indicating that
anythign was wrong.
Add an error message showing that exec failed:
sudo turbostat sleeeep 4
...
turbostat: exec sleeeep: No such file or directory
...
Note that the parent will still print out the statistics,
because it can't tell the difference between the failed
exec and a command that is purposefully returning
the same status. Unfortunately, this may obscure the
error message. However, if the --out parameter is used,
the error message is evident on stderr.
Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
On multi-package systems, the "Package" column was being displayed
only if --debug was used. Show it always.
Signed-off-by: Len Brown <len.brown@intel.com>
Originally, the only way to hide the sysfs C-state statistics columns
was with "--hide sysfs". This was because we process "--hide" before
we probe for those columns.
hack --hide to remember deferred hide requests, and apply
them when sysfs is probed.
"--hide sysfs" is still available as short-hand to refer to
the entire group of counters.
The down-side of this change is that we no longer error check for
bogus --hide column names. But the user will quickly figure that
out if a column they mean to hide is still there...
Signed-off-by: Len Brown <len.brown@intel.com>
--Package is now "--cpu package",
which will display just the 1st CPU in each package
--processor is not "--cpu core"
which will display just the 1st CPU in each core
Signed-off-by: Len Brown <len.brown@intel.com>
Make it possible to take the entire un-edited output
from `turbostat --list` and feed it to "turbostat --show"
or "turbostat --hide".
To do this, the leading comma was removed
(no mater what columns are active)
and also they dynamic C-state "C1, C2, C3" etc are replaced
by the string "sysfs", which refers to them as a group.
Signed-off-by: Len Brown <len.brown@intel.com>
When a counter overlfows 7 columns, it shifts the remaining
columns to the right, so they no longer line up under
their column header.
Update turbostat to dectect when it is handling large
numbers, and switch to wider columns where, necessary.
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
It is handy to know the list of column header names,
so that they can be used with --add and --skip
The new --list option shows them:
sudo ./turbostat --list --hide sysfs
,Core,CPU,Avg_MHz,Busy%,Bzy_MHz,TSC_MHz,IRQ,SMI,CPU%c1,CPU%c3,CPU%c6,CPU%c7,CoreTmp,PkgTmp,GFX%rc6,GFXMHz,PkgWatt,CorWatt,GFXWatt
Signed-off-by: Len Brown <len.brown@intel.com>
The IRQ column has been working for periodic mode,
but not in one-shot command mode, it shows only 0.
until now.
Signed-off-by: Len Brown <len.brown@intel.com>
With the --cpu parameter, turbostat prints only lines
for the specified set of CPUs:
sudo ./turbostat --quiet --show Core,CPU --cpu 0,1,3..5,6-7
Core CPU
- -
0 0
0 4
1 1
1 5
2 6
3 3
3 7
Signed-off-by: Len Brown <len.brown@intel.com>
When turbostat shows % of time in a CPU idle power state,
it has always been showing information from underlying
hardware residency counters.
While this reflects what the hardware is doing, and is thus
useful for understanding the hardware,
it doesn't directly tell us what Linux requested --
which is useful for tuning Linux itself.
Here we add columns to turbostat to show the
Linux cpuidle sub-system statistics:
/sys/devices/system/cpu/cpu*/cpuidle/state*/*
The first group of columns are the "usage", which is the
number of times software requested that C-state in the
measurement interval. eg C1 below.
The second group of columns are the "time", which is the percentage
of the measurement interval time that software has requested
the specified C-state. eg C1% below.
These software counters can be compared to the underlying
hardware residency counters (eg CPU%c1 CPU%c3 CPU%c6 CPU%c7)
to compare what sofware requested to what the hardware delivered.
These sysfs attributes are discovered when turbostat starts,
rather than being "built in". So the --show and --hide
parameters do not know about these dynamic column names.
However "--show sysfs" and "--hide sysfs" act on the
entire group of columns:
turbostat --show sysfs
...
cpu4: POLL: CPUIDLE CORE POLL IDLE
cpu4: C1: MWAIT 0x00
cpu4: C1E: MWAIT 0x01
cpu4: C3: MWAIT 0x10
cpu4: C6: MWAIT 0x20
cpu4: C7s: MWAIT 0x32
...
C1 C1E C3 C6 C7s C1% C1E% C3% C6% C7s%
3 6 5 1 188 0.00 0.02 0.00 0.00 99.93
0 6 5 0 58 0.00 0.16 0.02 0.00 99.70
0 0 0 0 9 0.00 0.00 0.00 0.00 99.96
0 0 0 1 24 0.00 0.00 0.00 0.02 99.93
0 0 0 0 9 0.00 0.00 0.00 0.00 99.97
0 0 0 0 32 0.00 0.00 0.00 0.00 99.96
0 0 0 0 7 0.00 0.00 0.00 0.00 99.98
2 0 0 0 36 0.00 0.00 0.00 0.00 99.97
1 0 0 0 13 0.00 0.00 0.00 0.00 99.98
Signed-off-by: Len Brown <len.brown@intel.com>
Previously, the --add option could specify only an MSR.
Here is is extended so an arbitrary /sys attribute,
as specified by an absolute file path name.
sudo ./turbostat --add /sys/devices/system/cpu/cpu0/cpuidle/state5/usage
Signed-off-by: Len Brown <len.brown@intel.com>
Newer processors do not hard-code the the number of cpus in each bin
to {1, 2, 3, 4, 5, 6, 7, 8} Rather, they can specify any number
of CPUS in each of the 8 bins:
eg.
...
37 * 100.0 = 3600.0 MHz max turbo 4 active cores
38 * 100.0 = 3700.0 MHz max turbo 3 active cores
39 * 100.0 = 3800.0 MHz max turbo 2 active cores
39 * 100.0 = 3900.0 MHz max turbo 1 active cores
could now look something like this:
...
37 * 100.0 = 3600.0 MHz max turbo 16 active cores
38 * 100.0 = 3700.0 MHz max turbo 8 active cores
39 * 100.0 = 3800.0 MHz max turbo 4 active cores
39 * 100.0 = 3900.0 MHz max turbo 2 active cores
Signed-off-by: Len Brown <len.brown@intel.com>
The CC1 column in tubostat can be computed by subtracting
the core c-state residency countes from the total Cx residency.
CC1 = (Idle_time_as_measured by MPERF) - (all core C-states with
residency counters)
However, as the underlying counter reads are not atomic,
error can be noticed in this calculations, especially
when the numbers are small.
Denverton has a hardware CC1 residency counter
to improve the accuracy of the cc1 statistic -- use it.
At the same time, Denverton has no concept of CC3, PC3, CC7, PC7,
so skip collecting and printing those columns.
Finally, a note of clarification.
Turbostat prints the standard PC2 residency counter,
but on Denverton hardware, that actually means PC1E.
Turbostat prints the standard PC6 residency counter,
but on Denverton hardware, that actually means PC2.
At this point, we document that differnce in this commit message,
rather than adding a quirk to the software.
Signed-off-by: Len Brown <len.brown@intel.com>
Fix a bug with --add, where the title of the column
is un-initialized if not specified by the user.
The initial implementation of --show and --hide
neglected to handle the pc8/pc9/pc10 counters.
Fix a bug where "--show Core" only worked with --debug
Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The CPU ticks at a rate in the "bus clock" domain.
eg. 100 MHz * bus_ratio.
On newer processors, the TSC has been moved out of this BCLK
domain and into a separate crystal-clock domain.
While the TSC ticks "close to" the base frequency, those that look
closely at the numbers will notice small errors in calculations that
mix units of TSC clocks and bus clocks.
"tsc_tweak" was introduced to address the most visible
mixing -- the %Busy and the the Busy_MHz calculations.
(A simplification as since removed TSC from the BusyMHz calculation)
Here we apply the tsc_tweak to everyplace where BCLK
and TSC units are mixed. The results is that
on a system which is 100% idle, the sum of the C-states
are now much more likely to be closer to 100%.
Reported-by: Travis Downs <travis.downs@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Some users want turbostat to tell them everything, by default.
Some users want turbostat to be quiet, by default.
I find that I'm in the 1st camp, and so I've never liked
needing to type the --debug parameter to decode the system
configuration.
So here we change the default and print the system configuration,
by default. (The --debug option is now un-documented, though
it does still exist for debugging turbostat internals)
When you do not want to see the system configuration
header, use the new "--quiet" option.
Signed-off-by: Len Brown <len.brown@intel.com>
Some time ago, turbostat overflowed 80 columns.
So on the assumption that a "casual" user would always
want topology and frequency columns, we hid the rest
of the columns and the system configuration decoding
behind the --debug option.
Not everybody liked that change -- including me.
I use --debug 99% of the time...
Well, now we have "-o file" to put turbostat output into a file,
so unless you are watching real-time in a small window,
column count is less frequently a factor.
And more recently, we got the "--hide columnA,columnB" option
to specify columns to skip.
So now we "un-hide" the rest of the columns from behind --debug,
and show them all, by default.
Signed-off-by: Len Brown <len.brown@intel.com>
useful for observing if the BIOS disabled prefetch
Not architectural, but docuemented as present on NHM, SNB
and is present on others.
Signed-off-by: Len Brown <len.brown@intel.com>
show the CPUID feature for turbo to clarify the case
when it may not be shown in MISC_ENABLE
CPUID(6): APERF, TURBO, DTS, PTM, No-HWP, No-HWPnotify, No-HWPwindow, No-HWPepp, No-HWPpkg, EPB
cpu4: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT TURBO)
Signed-off-by: Len Brown <len.brown@intel.com>
Turbostat dumps MSR_TURBO_RATIO_LIMIT on Core Architecture.
But Atom Architecture uses MSR_ATOM_CORE_RATIOS and
MSR_ATOM_CORE_TURBO_RATIOS.
Signed-off-by: Len Brown <len.brown@intel.com>
Decode MISC_ENABLE.NO_TURBO,
also use the #defines in msr-index.h for decoding this register
cpu0: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT TURBO)
Although it is not architectural, decode also
MSR_IA32_MISC_ENABLE.prefetch-disable (bit-9).
documented to be present on: Core, P4, Intel-Xeon
reserved on: Atom, Silvermont, Nehalem, SNB, PHI ec.
Signed-off-by: Len Brown <len.brown@intel.com>
Add a digit of precision to the --debug output for frequency range.
This is useful when BCLK is not an integer.
old:
6 * 83 = 500 MHz max efficiency frequency
26 * 83 = 2166 MHz base frequency
new:
6 * 83.3 = 499.8 MHz max efficiency frequency
26 * 83.3 = 2165.8 MHz base frequency
Signed-off-by: Len Brown <len.brown@intel.com>
The Baytrail SOC, with its Silvermont core, has some unique properties:
1. a hardware CC1 residency counter
2. a module-c6 residency counter
3. a package-c6 counter at traditional package-c7 counter address.
The SOC does not support c3, pc3, c7 or pc7 counters.
Signed-off-by: Len Brown <len.brown@intel.com>
Pull IDR rewrite from Matthew Wilcox:
"The most significant part of the following is the patch to rewrite the
IDR & IDA to be clients of the radix tree. But there's much more,
including an enhancement of the IDA to be significantly more space
efficient, an IDR & IDA test suite, some improvements to the IDR API
(and driver changes to take advantage of those improvements), several
improvements to the radix tree test suite and RCU annotations.
The IDR & IDA rewrite had a good spin in linux-next and Andrew's tree
for most of the last cycle. Coupled with the IDR test suite, I feel
pretty confident that any remaining bugs are quite hard to hit. 0-day
did a great job of watching my git tree and pointing out problems; as
it hit them, I added new test-cases to be sure not to be caught the
same way twice"
Willy goes on to expand a bit on the IDR rewrite rationale:
"The radix tree and the IDR use very similar data structures.
Merging the two codebases lets us share the memory allocation pools,
and results in a net deletion of 500 lines of code. It also opens up
the possibility of exposing more of the features of the radix tree to
users of the IDR (and I have some interesting patches along those
lines waiting for 4.12)
It also shrinks the size of the 'struct idr' from 40 bytes to 24 which
will shrink a fair few data structures that embed an IDR"
* 'idr-4.11' of git://git.infradead.org/users/willy/linux-dax: (32 commits)
radix tree test suite: Add config option for map shift
idr: Add missing __rcu annotations
radix-tree: Fix __rcu annotations
radix-tree: Add rcu_dereference and rcu_assign_pointer calls
radix tree test suite: Run iteration tests for longer
radix tree test suite: Fix split/join memory leaks
radix tree test suite: Fix leaks in regression2.c
radix tree test suite: Fix leaky tests
radix tree test suite: Enable address sanitizer
radix_tree_iter_resume: Fix out of bounds error
radix-tree: Store a pointer to the root in each node
radix-tree: Chain preallocated nodes through ->parent
radix tree test suite: Dial down verbosity with -v
radix tree test suite: Introduce kmalloc_verbose
idr: Return the deleted entry from idr_remove
radix tree test suite: Build separate binaries for some tests
ida: Use exceptional entries for small IDAs
ida: Move ida_bitmap to a percpu variable
Reimplement IDR and IDA using the radix tree
radix-tree: Add radix_tree_iter_delete
...
Pull objtool fixes from Ingo Molnar:
"A handful of objtool fixes related to unreachable code, plus a build
fix for out of tree modules"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Enclose contents of unreachable() macro in a block
objtool: Prevent GCC from merging annotate_unreachable()
objtool: Improve detection of BUG() and other dead ends
objtool: Fix CONFIG_STACK_VALIDATION=y warning for out-of-tree modules
I've been asked to get the upstream repo back up-to-date.
-----BEGIN PGP SIGNATURE-----
iQExBAABCAAbBQJYtD0iFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
7E0H/0Sg5YxCH3iaKyZ0Z6vj9k09wBcXpzBLtUKrCiK46XEhvXCkh7vFJSY5AmGO
TcIsSBwmCdHWT5M5jM2n0j2GdxJOE0jXHlbnMvXob5EjW6QAw+mRI+KK0KGzmQIq
M4cYxrmD+HDwgulaFkF3P/nQ64wU35zaUQxFAmh0SY2xNUTg/BL8BMebfH9BauFB
SWBWF156kon0+eXvqLDfgfRyWiDH4IB65CEu9N+1ubJPWKrHoqHTopMW+jsqbVeu
qEEiigfmRlPiCOEiD4LQn+4qdmTji4dR/ZRvOCEGBGXtJPtqvI275m3/UxDvp9jx
07hZ+1701J+aPaYx0hS5iPegFq4=
=P7wC
-----END PGP SIGNATURE-----
Merge tag 'ktest-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
"These are various fixes that I have made and never got around to
pushing. I've been asked to get the upstream repo back up-to-date"
* tag 'ktest-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Add variable run_command_status to save status of commands executed
ktest.pl: Powercycle the box on reboot if no connection can be made
ktest: Add timeout to ssh command
ktest: Fix child exit code processing
ktest: Have POST_TEST run after the test has totally completed
Pull cgroup updates from Tejun Heo:
"Several noteworthy changes.
- Parav's rdma controller is finally merged. It is very straight
forward and can limit the abosolute numbers of common rdma
constructs used by different cgroups.
- kernel/cgroup.c got too chubby and disorganized. Created
kernel/cgroup/ subdirectory and moved all cgroup related files
under kernel/ there and reorganized the core code. This hurts for
backporting patches but was long overdue.
- cgroup v2 process listing reimplemented so that it no longer
depends on allocating a buffer large enough to cache the entire
result to sort and uniq the output. v2 has always mangled the sort
order to ensure that users don't depend on the sorted output, so
this shouldn't surprise anybody. This makes the pid listing
functions use the same iterators that are used internally, which
have to have the same iterating capabilities anyway.
- perf cgroup filtering now works automatically on cgroup v2. This
patch was posted a long time ago but somehow fell through the
cracks.
- misc fixes asnd documentation updates"
* 'for-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (27 commits)
kernfs: fix locking around kernfs_ops->release() callback
cgroup: drop the matching uid requirement on migration for cgroup v2
cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy
cgroup: misc cleanups
cgroup: call subsys->*attach() only for subsystems which are actually affected by migration
cgroup: track migration context in cgroup_mgctx
cgroup: cosmetic update to cgroup_taskset_add()
rdmacg: Fixed uninitialized current resource usage
cgroup: Add missing cgroup-v2 PID controller documentation.
rdmacg: Added documentation for rdmacg
IB/core: added support to use rdma cgroup controller
rdmacg: Added rdma cgroup controller
cgroup: fix a comment typo
cgroup: fix RCU related sparse warnings
cgroup: move namespace code to kernel/cgroup/namespace.c
cgroup: rename functions for consistency
cgroup: move v1 mount functions to kernel/cgroup/cgroup-v1.c
cgroup: separate out cgroup1_kf_syscall_ops
cgroup: refactor mount path and clearly distinguish v1 and v2 paths
cgroup: move cgroup v1 specific code to kernel/cgroup/cgroup-v1.c
...
Fix typos and add the following to the scripts/spelling.txt:
overrided||overridden
Link: http://lkml.kernel.org/r/1481573103-11329-22-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
an one||a one
I dropped the "an" before "one or more" in
drivers/net/ethernet/sfc/mcdi_pcol.h.
Link: http://lkml.kernel.org/r/1481573103-11329-6-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
an union||a union
Link: http://lkml.kernel.org/r/1481573103-11329-5-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typos and add the following to the scripts/spelling.txt:
an user||a user
an userspace||a userspace
I also added "userspace" to the list since it is a common word in Linux.
I found some instances for "an userfaultfd", but I did not add it to the
list. I felt it is endless to find words that start with "user" such as
"userland" etc., so must draw a line somewhere.
Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently it uses %i for bitmasks, which makes it difficult to properly
decode the values. Use %x instead.
Link: http://lkml.kernel.org/r/b7b4c45d-2f21-de6c-d1c8-16c8386da27c@list.ru
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This update consists of:
-- fixes to several existing tests from Stafford Horne
-- cpufreq tests from Viresh Kumar
-- Selftest build and install fixes from Bamvor Jian Zhang
and Michael Ellerman
-- Fixes to protection-keys tests from Dave Hansen
-- Warning fixes from Shuah Khan
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJYsJgxAAoJEAsCRMQNDUMcNCUQALy+jVZV3U1yypLCQinlgbdH
rlh7oKIpGfWGXNe1BQVLS5S+bjil9XDdty+4VOB7x9gfQ6fvea3w0IQhI5CyONmm
hZg/miheZzN5ujqKjfuUQrHzEbEAs+CH0A0sVH+ueptw37roTWhf1ZCSpQBpas5p
XMZrfBI0mQLd9Z3D0G5TSsVjSPcMhKeoYDMGPMCulZuamVMY40XkPcvaYe1Zg1Mj
7nD7Aw6JxxV0tlZwo0n540w8tdx/yQ+49jqhulozCQNL+KmXO8FlM/Jnu1b24/YW
hlu5dvLmi9rAHYEHwqFf5yqZci/50Q+LHuxcxEp3RLxRW+KXJP7c53Kn8eutIwqH
HR03TSA1TRv9b4MvWJs/ULF/EYYtTPUDSinAtNMf4iegXp0BbT7P0eOibF1vj3tz
bcfPB5vi1SxQqLQwCPomUzhlPB4muBu9lHjZ2tI5EKynXXZxN33zugHYqBY0zNPm
7dS+4iXs/phEDlW0j+3BhHQz2of+Q6fSOC/jvgAYGdmqh1aNHl9WpIWfFubuBQhd
fkKJmgpJ1Mk5mBG/dGdCGTryv38tzFLr+n4MJWthfya84cbvk1W0HQQjwmROrIiP
qxC4F1Da6F88mfrpFDKW9LxAwfJFCgSxnYFygRsyzZK/VKdm2CI8yeoY2rt2lyRF
jUdxx7SJ7+71sO1xWcAE
=F3yO
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest update from Shuah Khan:
"This update consists of:
- fixes to several existing tests from Stafford Horne
- cpufreq tests from Viresh Kumar
- Selftest build and install fixes from Bamvor Jian Zhang and Michael
Ellerman
- Fixes to protection-keys tests from Dave Hansen
- Warning fixes from Shuah Khan"
* tag 'linux-kselftest-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (28 commits)
selftests/powerpc: Fix remaining fallout from recent changes
selftests/powerpc: Fix the clean rule since recent changes
selftests: Fix the .S and .S -> .o rules
selftests: Fix the .c linking rule
selftests: Fix selftests build to just build, not run tests
selftests, x86, protection_keys: fix wrong offset in siginfo
selftests, x86, protection_keys: fix uninitialized variable warning
selftest: cpufreq: Update MAINTAINERS file
selftest: cpufreq: Add special tests
selftest: cpufreq: Add support to test cpufreq modules
selftest: cpufreq: Add suspend/resume/hibernate support
selftest: cpufreq: Add support for cpufreq tests
selftests: Add intel_pstate to TARGETS
selftests/intel_pstate: Update makefile to match new style
selftests/intel_pstate: Fix warning on loop index overflow
cpupower: Restore format of frequency-info limit
selftests/futex: Add headers to makefile dependencies
selftests/futex: Add stdio used for logging
selftests: x86 protection_keys remove dead code
selftests: x86 protection_keys fix unused variable compile warnings
...
with --debug, see:
cpu0: MSR_CC6_DEMOTION_POLICY_CONFIG: 0x00000000 (DISable-CC6-Demotion)
cpu0: MSR_MC6_DEMOTION_POLICY_CONFIG: 0x00000000 (DISable-MC6-Demotion)
Note that the hardware default is to enable demotion,
and Linux started clearing these registers in 3.17.
Signed-off-by: Len Brown <len.brown@intel.com>
and so --debug fails with:
turbostat: msr 1 offset 0x1aa read failed: Input/output error
It seems that baytrail, and airmont do not have this MSR.
It is included in subsequent Goldmont Atom.
Signed-off-by: Len Brown <len.brown@intel.com>
Add the "--show" and "--hide" cmdline parameters.
By default, turbostat shows all columns.
turbostat --hide counter_list
will continue showing all columns, except for those listed.
turbostat --show counter_list
will show _only_ the listed columns
These features work for built-in counters, and have no effect
on columns added with the --add parameter.
Signed-off-by: Len Brown <len.brown@intel.com>
When --add was used more than once, overflowed buffers
caused some counters to be stored on top of others,
corrupting the results. Simplify the code by simply
reserving space for up to 16 added counters per each
cpu, core, package.
Per-cpu added counters were being printed only per-core.
Signed-off-by: Len Brown <len.brown@intel.com>
This saves 32 bytes on my x86-64 build, mostly due to alignment
considerations and sharing more code between find_next_bit and
find_next_zero_bit, but it does save a couple of instructions.
There's really two parts to this commit:
- First, the first half of the test: (!nbits || start >= nbits) is
trivially a subset of the second half, since nbits and start are both
unsigned
- Second, while looking at the disassembly, I noticed that GCC was
predicting the branch taken. Since this is a failure case, it's
clearly the less likely of the two branches, so add an unlikely() to
override GCC's heuristics.
[mawilcox@microsoft.com: v2]
Link: http://lkml.kernel.org/r/1483709016-1834-1-git-send-email-mawilcox@linuxonhyperv.com
Link: http://lkml.kernel.org/r/1483709016-1834-1-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now when madvise(MADV_REMOVE) notifies uffd reader, we should verify
that appliciation actually sees zeros at the removed range.
Link: http://lkml.kernel.org/r/1484814154-1557-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "userfaultfd: non-cooperative: add madvise() event for
MADV_REMOVE request".
These patches add notification of madvise(MADV_REMOVE) event to
non-cooperative userfaultfd monitor.
The first pacth renames EVENT_MADVDONTNEED to EVENT_REMOVE along with
relevant functions and structures. Using _REMOVE instead of
_MADVDONTNEED describes the event semantics more clearly and I hope it's
not too late for such change in the ABI.
This patch (of 3):
The UFFD_EVENT_MADVDONTNEED purpose is to notify uffd monitor about
removal of certain range from address space tracked by userfaultfd.
Hence, UFFD_EVENT_REMOVE seems to better reflect the operation
semantics. Respectively, 'madv_dn' field of uffd_msg is renamed to
'remove' and the madvise_userfault_dontneed callback is renamed to
userfaultfd_remove.
Link: http://lkml.kernel.org/r/1484814154-1557-2-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The BUG() macro's use of __builtin_unreachable() via the unreachable()
macro tells gcc that the instruction is a dead end, and that it's safe
to assume the current code path will not execute past the previous
instruction.
On x86, the BUG() macro is implemented with the 'ud2' instruction. When
objtool's branch analysis sees that instruction, it knows the current
code path has come to a dead end.
Peter Zijlstra has been working on a patch to change the WARN macros to
use 'ud2'. That patch will break objtool's assumption that 'ud2' is
always a dead end.
Generally it's best for objtool to avoid making those kinds of
assumptions anyway. The more ignorant it is of kernel code internals,
the better.
So create a more generic way for objtool to detect dead ends by adding
an annotation to the unreachable() macro. The annotation stores a
pointer to the end of the unreachable code path in an '__unreachable'
section. Objtool can read that section to find the dead ends.
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/41a6d33971462ebd944a1c60ad4bf5be86c17b77.1487712920.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYr5aeAAoJEAx081l5xIa+ZK4P/RD3XUsduYqziVFCRQ2n0X8r
+D92F4peTnSeSq7ZcZvprv+fezUGAHbfsWFs8feYCI5quUO6pEQSPwN+wyGazUi0
4hUVB/K9Iq7U/Bj7Z/SmsU3NuWJnkNqbmvSFvUdqYK9D/kl+Tnllzap2N4cTzjwu
GZOObz4n85cx94NqC3qw+7/ptL1X2MhXa+z0MzbkKyas84Bko1LwCSHRHsDKUnJc
IcSpOcYZ6pSRMIsKH4Kd79Go4vWm7djXT9XL3PwDk2NcXXUOuR+cfdHqYchYaM/O
iD2hvaSywBcflxSAml5x6vlXraoRd91ZZulgOObXtFfnUXdZB81TVq4uv6LU4Bx3
jLFixUZuk/TJT+W/8N10l7M6yMIFaTpNoNMc5n4IF5RNNyWba4BKnrI+f+lQiOpY
mmjIaidb0t5BICnJzCD264RhCEXmP0HaDV+iQQV6y6jJRXfd1bgnOXLKP73JekzB
TsbDshCoE7UO0dJ7n0LFpXSTQDTYzlazoEp14f2kFBxir5/l7r67nUlnDTvUQfuN
tSRvpN/s0wqvH3o7zhmpHxyJ/ZasPMQjNCFAuUEbx8L5SKXsua0FubIzN4aVpilb
XvfdFRWM/lkOT/q+8cGI/TcE3YTqEmALmGxdV/akbdNCiCg6aClyCLRE/DZhgmSQ
UMFjr9wlHl5Qo/OqLKj0
=Yjfg
-----END PGP SIGNATURE-----
Merge tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.11.
Nothing too major, the tinydrm and mmu-less support should make
writing smaller drivers easier for some of the simpler platforms, and
there are a bunch of documentation updates.
Intel grew displayport MST audio support which is hopefully useful to
people, and FBC is on by default for GEN9+ (so people know where to
look for regressions). AMDGPU has a lot of fixes that would like new
firmware files installed for some GPUs.
Other than that it's pretty scattered all over.
I may have a follow up pull request as I know BenH has a bunch of AST
rework and fixes and I'd like to get those in once they've been tested
by AST, and I've got at least one pull request I'm just trying to get
the author to fix up.
Core:
- drm_mm reworked
- Connector list locking and iterators
- Documentation updates
- Format handling rework
- MMU-less support for fbdev helpers
- drm_crtc_from_index helper
- Core CRC API
- Remove drm_framebuffer_unregister_private
- Debugfs cleanup
- EDID/Infoframe fixes
- Release callback
- Tinydrm support (smaller drivers for simple hw)
panel:
- Add support for some new simple panels
i915:
- FBC by default for gen9+
- Shared dpll cleanups and docs
- GEN8 powerdomain cleanup
- DMC support on GLK
- DP MST audio support
- HuC loading support
- GVT init ordering fixes
- GVT IOMMU workaround fix
amdgpu/radeon:
- Power/clockgating improvements
- Preliminary SR-IOV support
- TTM buffer priority and eviction fixes
- SI DPM quirks removed due to firmware fixes
- Powerplay improvements
- VCE/UVD powergating fixes
- Cleanup SI GFX code to match CI/VI
- Support for > 2 displays on 3/5 crtc asics
- SI headless fixes
nouveau:
- Rework securre boot code in prep for GP10x secure boot
- Channel recovery improvements
- Initial power budget code
- MMU rework preperation
vmwgfx:
- Bunch of fixes and cleanups
exynos:
- Runtime PM support for MIC driver
- Cleanups to use atomic helpers
- UHD Support for TM2/TM2E boards
- Trigger mode fix for Rinato board
etnaviv:
- Shader performance fix
- Command stream validator fixes
- Command buffer suballocator
rockchip:
- CDN DisplayPort support
- IOMMU support for arm64 platform
imx-drm:
- Fix i.MX5 TV encoder probing
- Remove lower fb size limits
msm:
- Support for HW cursor on MDP5 devices
- DSI encoder cleanup
- GPU DT bindings cleanup
sti:
- stih410 cleanups
- Create fbdev at binding
- HQVDP fixes
- Remove stih416 chip functionality
- DVI/HDMI mode selection fixes
- FPS statistic reporting
omapdrm:
- IRQ code cleanup
dwi-hdmi bridge:
- Cleanups and fixes
adv-bridge:
- Updates for nexus
sii8520 bridge:
- Add interlace mode support
- Rework HDMI and lots of fixes
qxl:
- probing/teardown cleanups
ZTE drm:
- HDMI audio via SPDIF interface
- Video Layer overlay plane support
- Add TV encoder output device
atmel-hlcdc:
- Rework fbdev creation logic
tegra:
- OF node fix
fsl-dcu:
- Minor fixes
mali-dp:
- Assorted fixes
sunxi:
- Minor fix"
[ This was the "fixed" pull, that still had build warnings due to people
not even having build tested the result. I'm not a happy camper
I've fixed the things I noticed up in this merge. - Linus ]
* tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux: (1177 commits)
lib/Kconfig: make PRIME_NUMBERS not user selectable
drm/tinydrm: helpers: Properly fix backlight dependency
drm/tinydrm: mipi-dbi: Fix field width specifier warning
drm/tinydrm: mipi-dbi: Silence: ‘cmd’ may be used uninitialized
drm/sti: fix build warnings in sti_drv.c and sti_vtg.c files
drm/amd/powerplay: fix PSI feature on Polars12
drm/amdgpu: refuse to reserve io mem for split VRAM buffers
drm/ttm: fix use-after-free races in vm fault handling
drm/tinydrm: Add support for Multi-Inno MI0283QT display
dt-bindings: Add Multi-Inno MI0283QT binding
dt-bindings: display/panel: Add common rotation property
of: Add vendor prefix for Multi-Inno
drm/tinydrm: Add MIPI DBI support
drm/tinydrm: Add helper functions
drm: Add DRM support for tiny LCD displays
drm/amd/amdgpu: post card if there is real hw resetting performed
drm/nouveau/tmr: provide backtrace when a timeout is hit
drm/nouveau/pci/g92: Fix rearm
drm/nouveau/drm/therm/fan: add a fallback if no fan control is specified in the vbios
drm/nouveau/hwmon: expose power_max and power_crit
..
Core changes:
- Augment fwnode_get_named_gpiod() to configure the GPIO pin
immediately after requesting it like all other APIs do.
This is a treewide change also updating all users.
- Pass a GPIO label down to gpiod_request() from
fwnode_get_named_gpiod(). This makes debugfs and the userspace
ABI correctly reflect the current in-kernel consumer of a pin
taken using this abstraction. This is a treewide change also
updating all users.
- Rename devm_get_gpiod_from_child() to
devm_fwnode_get_gpiod_from_child() to reflect the fact that this
function is operating on a fwnode object. This is a treewide
change also updating all users.
- Make it possible to take multiple GPIOs in a single hog of device
tree hogs.
- The refactorings switching GPIO chips to use the .set_config()
callback using standard pin control properties and providing
a backend into the pin control subsystem that were also merged
into the pin control tree naturally appear here too.
Testing instrumentation:
- A whole slew of cleanups and improvements to the mockup GPIO
driver. We now have an extended userspace test exercising the
subsystem, and we can inject interrupts etc from userspace
to fully test the core GPIO functionality.
New drivers:
- New driver for the Cortina Systems Gemini GPIO controller.
- New driver for the Exar XR17V352/354/358 chips.
- New driver for the ACCES PCI-IDIO-16 PCI GPIO card.
Driver changes:
- RCAR: set the irqchip parent device, add fine-grained runtime
PM support.
- pca953x: support optional RESET control line on the chip.
- DaVinci: cleanups and simplifications. Add support for multiple
instances.
- .set_multiple() and naming of lines on more or less all of the
ISA/PCI GPIO controllers.
- mcp23s08: refactored to use regmap as a first step to further
rewrites and modernizations.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJYrqvqAAoJEEEQszewGV1zoHsP/i1iZBEywR9+yIx/p2/F2mJu
nriuYFlp0V3FjHQAQ//YCA9+Catri+ZqT5l+BmG/EYdqqikHbziTyS0YArlfrMHv
OOBfDmfftexvRI/jQAl+X/nIW531ZjYo6ZApFy/2TirTwfkI7DIMi6ujm09fcG5D
BgCT1KuszbVtyrmhrQvbeEdVKw0qLAgwnn5eOOCQE4KuDB3s7eyal0rJaDEXhpMF
kH/y6eySs4FChEhAEmCkM6205F5T4c2YFjL1bo5Fkh/WPrVPaKI0Ny16qbaDWU9K
W9RaJUzf92KIW0MgcRl+r8Lxn+GekN6/jvrxddQ/Ajs/Dkh5r2JCrm7RIC9tBPcJ
VbLfjL+cMehlSEu9eyxRQcAIeuUYCqkN8ghuVoj9xt/tDtNYsQIcJZtfW1yjmONq
mFsd5KhfBFgspQkwF4IX3hthaqj8MH4zefQdWzAGPZMGEA1rrx2kVSEdZD3EV4VN
84qt5Cx9hLllafthJOGjEIZFCjPIpbMRwTQ+fmc+1IB1DgN8Kc5E1FMssKbUEoOK
2eLquLvd7iNDMidTjoi87YAisW9qnrPeRDywsqeXdQf7fzpB97gX4MQfJ5fJWEYr
3uHCfu2u4J4cff9ygg8c4ut7ePEjz+ld/sBh9EHicbbryR4I5ZG7Ne1aQhsmb2M5
dHZSRfQYEQ4Nl7cMJQuh
=O81I
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.11 cycle
Core changes:
- Augment fwnode_get_named_gpiod() to configure the GPIO pin
immediately after requesting it like all other APIs do. This is a
treewide change also updating all users.
- Pass a GPIO label down to gpiod_request() from
fwnode_get_named_gpiod(). This makes debugfs and the userspace ABI
correctly reflect the current in-kernel consumer of a pin taken
using this abstraction. This is a treewide change also updating all
users.
- Rename devm_get_gpiod_from_child() to
devm_fwnode_get_gpiod_from_child() to reflect the fact that this
function is operating on a fwnode object. This is a treewide change
also updating all users.
- Make it possible to take multiple GPIOs in a single hog of device
tree hogs.
- The refactorings switching GPIO chips to use the .set_config()
callback using standard pin control properties and providing a
backend into the pin control subsystem that were also merged into
the pin control tree naturally appear here too.
Testing instrumentation:
- A whole slew of cleanups and improvements to the mockup GPIO
driver. We now have an extended userspace test exercising the
subsystem, and we can inject interrupts etc from userspace to fully
test the core GPIO functionality.
New drivers:
- New driver for the Cortina Systems Gemini GPIO controller.
- New driver for the Exar XR17V352/354/358 chips.
- New driver for the ACCES PCI-IDIO-16 PCI GPIO card.
Driver changes:
- RCAR: set the irqchip parent device, add fine-grained runtime PM
support.
- pca953x: support optional RESET control line on the chip.
- DaVinci: cleanups and simplifications. Add support for multiple
instances.
- .set_multiple() and naming of lines on more or less all of the
ISA/PCI GPIO controllers.
- mcp23s08: refactored to use regmap as a first step to further
rewrites and modernizations"
* tag 'gpio-v4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (61 commits)
gpio: reintroduce devm_get_gpiod_from_child()
gpio: pci-idio-16: Fix PCI BAR index
gpio: pci-idio-16: Fix PCI device ID code
gpio: mockup: implement event injecting over debugfs
gpio: mockup: add a dummy irqchip
gpio: mockup: implement naming the lines
gpio: mockup: code shrink
gpio: mockup: readability tweaks
gpio: Add GPIO support for the ACCES PCI-IDIO-16
gpio: Add the devm_fwnode_get_index_gpiod_from_child() helper
gpio: Rename devm_get_gpiod_from_child()
gpio: mcp23s08: Select REGMAP/REGMAP_I2C to fix build error
gpio: ws16c48: Add support for GPIO names
gpio: gpio-mm: Add support for GPIO names
gpio: 104-idio-16: Add support for GPIO names
gpio: 104-idi-48: Add support for GPIO names
gpio: 104-dio-48e: Add support for GPIO names
gpio: ws16c48: Remove unnecessary driver_data set
gpio: gpio-mm: Remove unnecessary driver_data set
gpio: 104-idio-16: Remove unnecessary driver_data set
...
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJYoM2fAAoJEHm+PkMAQRiGr9MH/izEAMri7rJ0QMc3ejt+WmD0
8pkZw3+MVn71z6cIEgpzk4QkEWJd5rfhkETCeCp7qQ9V6cDW1FDE9+0OmPjiphDt
nnzKs7t7skEBwH5Mq5xygmIfkv+Z0QGHZ20gfQWY3F56Uxo+ARF88OBHBLKhqx3v
98C7YbMFLKBslKClA78NUEIdx0UfBaRqerlERx0Lfl9aoOrbBS6WI3iuREiylpih
9o7HTrwaGKkU4Kd6NdgJP2EyWPsd1LGalxBBjeDSpm5uokX6ALTdNXDZqcQscHjE
RmTqJTGRdhSThXOpNnvUJvk9L442yuNRrVme/IqLpxMdHPyjaXR3FGSIDb2SfjY=
=VMy8
-----END PGP SIGNATURE-----
Merge tag 'v4.10-rc8' into drm-next
Linux 4.10-rc8
Backmerge Linus rc8 to fix some conflicts, but also
to avoid pulling it in via a fixes pull from someone.