mirror of https://gitee.com/openkylin/linux.git
3051 Commits
Author | SHA1 | Message | Date |
---|---|---|---|
Linus Torvalds | 1200b6809d |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: "Highlights: 1) Support more Realtek wireless chips, from Jes Sorenson. 2) New BPF types for per-cpu hash and arrap maps, from Alexei Starovoitov. 3) Make several TCP sysctls per-namespace, from Nikolay Borisov. 4) Allow the use of SO_REUSEPORT in order to do per-thread processing of incoming TCP/UDP connections. The muxing can be done using a BPF program which hashes the incoming packet. From Craig Gallek. 5) Add a multiplexer for TCP streams, to provide a messaged based interface. BPF programs can be used to determine the message boundaries. From Tom Herbert. 6) Add 802.1AE MACSEC support, from Sabrina Dubroca. 7) Avoid factorial complexity when taking down an inetdev interface with lots of configured addresses. We were doing things like traversing the entire address less for each address removed, and flushing the entire netfilter conntrack table for every address as well. 8) Add and use SKB bulk free infrastructure, from Jesper Brouer. 9) Allow offloading u32 classifiers to hardware, and implement for ixgbe, from John Fastabend. 10) Allow configuring IRQ coalescing parameters on a per-queue basis, from Kan Liang. 11) Extend ethtool so that larger link mode masks can be supported. From David Decotigny. 12) Introduce devlink, which can be used to configure port link types (ethernet vs Infiniband, etc.), port splitting, and switch device level attributes as a whole. From Jiri Pirko. 13) Hardware offload support for flower classifiers, from Amir Vadai. 14) Add "Local Checksum Offload". Basically, for a tunneled packet the checksum of the outer header is 'constant' (because with the checksum field filled into the inner protocol header, the payload of the outer frame checksums to 'zero'), and we can take advantage of that in various ways. From Edward Cree" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits) bonding: fix bond_get_stats() net: bcmgenet: fix dma api length mismatch net/mlx4_core: Fix backward compatibility on VFs phy: mdio-thunder: Fix some Kconfig typos lan78xx: add ndo_get_stats64 lan78xx: handle statistics counter rollover RDS: TCP: Remove unused constant RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket net: smc911x: convert pxa dma to dmaengine team: remove duplicate set of flag IFF_MULTICAST bonding: remove duplicate set of flag IFF_MULTICAST net: fix a comment typo ethernet: micrel: fix some error codes ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it bpf, dst: add and use dst_tclassid helper bpf: make skb->tc_classid also readable net: mvneta: bm: clarify dependencies cls_bpf: reset class and reuse major in da ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c ldmvsw: Add ldmvsw.c driver code ... |
|
Linus Torvalds | 277edbabf6 |
Power management and ACPI material for v4.6-rc1, part 1
- Redesign of cpufreq governors and the intel_pstate driver to make them use callbacks invoked by the scheduler to trigger CPU frequency evaluation instead of using per-CPU deferrable timers for that purpose (Rafael Wysocki). - Reorganization and cleanup of cpufreq governor code to make it more straightforward and fix some concurrency problems in it (Rafael Wysocki, Viresh Kumar). - Cleanup and improvements of locking in the cpufreq core (Viresh Kumar). - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh Kumar, Eric Biggers). - intel_pstate driver updates including fixes, optimizations and a modification to make it enable enable hardware-coordinated P-state selection (HWP) by default if supported by the processor (Philippe Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe Franciosi). - Operating Performance Points (OPP) framework updates to improve its handling of voltage regulators and device clocks and updates of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter). - Updates of the powernv cpufreq driver to fix initialization and cleanup problems in it and correct its worker thread handling with respect to CPU offline, new powernv_throttle tracepoint (Shilpasri Bhat). - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki). - ACPICA updates including one fix for a regression introduced by previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box, Colin Ian King). - Support for installing ACPI tables from initrd (Lv Zheng). - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin Chaugule). - Support for _HID(ACPI0010) devices (ACPI processor containers) and ACPI processor driver cleanups (Sudeep Holla). - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory, Aleksey Makarov). - Modification of the ACPI PCI IRQ management code to make it treat 255 in the Interrupt Line register as "not connected" on x86 (as per the specification) and avoid attempts to use that value as a valid interrupt vector (Chen Fan). - ACPI APEI fixes related to resource leaks (Josh Hunt). - Removal of modularity from a few ACPI drivers (BGRT, GHES, intel_pmic_crc) that cannot be built as modules in practice (Paul Gortmaker). - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS as a valid resource type (Harb Abdulhamid). - New device ID (future AMD I2C controller) in the ACPI driver for AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu). - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin). - cpuidle menu governor optimization to avoid a square root computation in it (Rasmus Villemoes). - Fix for potential use-after-free in the generic device properties framework (Heikki Krogerus). - Updates of the generic power domains (genpd) framework including support for multiple power states of a domain, fixes and debugfs output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart, Geert Uytterhoeven). - Intel RAPL power capping driver updates to reduce IPI overhead in it (Jacob Pan). - System suspend/hibernation code cleanups (Eric Biggers, Saurabh Sengar). - Year 2038 fix for the process freezer (Abhilash Jindal). - turbostat utility updates including new features (decoding of more registers and CPUID fields, sub-second intervals support, GFX MHz and RC6 printout, --out command line option), fixes (syscall jitter detection and workaround, reductioin of the number of syscalls made, fixes related to Xeon x200 processors, compiler warning fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJW50NXAAoJEILEb/54YlRxvr8QAIktC9+ft0y5AmU46hDcBWcK QutyWJL9X9BS6DWBJZA2qclDYFmhMfi5Fza1se0gQ9TnLB/KrBwHWLsiYoTsb1k+ nPKf214aPk+qAhkVuyB4leNWML9Qz9n9jwku/EYxWWpgtbSRf3+0ioIKZeWWc/8V JvuaOu4O+g/tkmL7QTrnGWBwhIIssAAV85QPsHkx+g68MrCj4UMMzm7z9G21SPXX bmP8yIHsczX/XnRsY0W2NSno7Vdk6ImHpDJ26IAZg28WRNPWICHgGYHvB0TTWMvb tts+yqfF7/7QLRjT/M8k9CzDBDE/DnVqoZ0fNJ+aYr7hNKF32mtAN+jH9ZB9dl/P fEFapJkPxnWyzAoVoB9Dz0rkcZkYMlbxlLWzUGpaPq0JflUUTzLk0ApSjmMn4HRO UddwCDdyHTaYThp3gn6GbOb0pIP0SdOVbI1M2QV2x/4PLcT2Ft8Np1+1RFWOeinZ Bdl9AE890big0808mqbBzw/buETwr9FjHtCdDPXpP0vJpkBLu3nIYRNb0LCt39es mWMp6dFhGgvGj3D3ahTuV3GI8hdpDkh9SObexa11RCjkTKrXcwEmFxHxLeFXwKYq alG278bo6cSChRMziS1lis+W/3tsJRN4TXUSv1PPzJHrFgptQVFRStU9ngBKP+pN WB+itPc4Fw0YHOrAFsrx =cfty -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "This time the majority of changes go into cpufreq and they are significant. First off, the way CPU frequency updates are triggered is different now. Instead of having to set up and manage a deferrable timer for each CPU in the system to evaluate and possibly change its frequency periodically, cpufreq governors set up callbacks to be invoked by the scheduler on a regular basis (basically on utilization updates). The "old" governors, "ondemand" and "conservative", still do all of their work in process context (although that is triggered by the scheduler now), but intel_pstate does it all in the callback invoked by the scheduler with no need for any additional asynchronous processing. Of course, this eliminates the overhead related to the management of all those timers, but also it allows the cpufreq governor code to be simplified quite a bit. On top of that, the common code and data structures used by the "ondemand" and "conservative" governors are cleaned up and made more straightforward and some long-standing and quite annoying problems are addressed. In particular, the handling of governor sysfs attributes is modified and the related locking becomes more fine grained which allows some concurrency problems to be avoided (particularly deadlocks with the core cpufreq code). In principle, the new mechanism for triggering frequency updates allows utilization information to be passed from the scheduler to cpufreq. Although the current code doesn't make use of it, in the works is a new cpufreq governor that will make decisions based on the scheduler's utilization data. That should allow the scheduler and cpufreq to work more closely together in the long run. In addition to the core and governor changes, cpufreq drivers are updated too. Fixes and optimizations go into intel_pstate, the cpufreq-dt driver is updated on top of some modification in the Operating Performance Points (OPP) framework and there are fixes and other updates in the powernv cpufreq driver. Apart from the cpufreq updates there is some new ACPICA material, including a fix for a problem introduced by previous ACPICA updates, and some less significant changes in the ACPI code, like CPPC code optimizations, ACPI processor driver cleanups and support for loading ACPI tables from initrd. Also updated are the generic power domains framework, the Intel RAPL power capping driver and the turbostat utility and we have a bunch of traditional assorted fixes and cleanups. Specifics: - Redesign of cpufreq governors and the intel_pstate driver to make them use callbacks invoked by the scheduler to trigger CPU frequency evaluation instead of using per-CPU deferrable timers for that purpose (Rafael Wysocki). - Reorganization and cleanup of cpufreq governor code to make it more straightforward and fix some concurrency problems in it (Rafael Wysocki, Viresh Kumar). - Cleanup and improvements of locking in the cpufreq core (Viresh Kumar). - Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh Kumar, Eric Biggers). - intel_pstate driver updates including fixes, optimizations and a modification to make it enable enable hardware-coordinated P-state selection (HWP) by default if supported by the processor (Philippe Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe Franciosi). - Operating Performance Points (OPP) framework updates to improve its handling of voltage regulators and device clocks and updates of the cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter). - Updates of the powernv cpufreq driver to fix initialization and cleanup problems in it and correct its worker thread handling with respect to CPU offline, new powernv_throttle tracepoint (Shilpasri Bhat). - ACPI cpufreq driver optimization and cleanup (Rafael Wysocki). - ACPICA updates including one fix for a regression introduced by previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box, Colin Ian King). - Support for installing ACPI tables from initrd (Lv Zheng). - Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin Chaugule). - Support for _HID(ACPI0010) devices (ACPI processor containers) and ACPI processor driver cleanups (Sudeep Holla). - Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory, Aleksey Makarov). - Modification of the ACPI PCI IRQ management code to make it treat 255 in the Interrupt Line register as "not connected" on x86 (as per the specification) and avoid attempts to use that value as a valid interrupt vector (Chen Fan). - ACPI APEI fixes related to resource leaks (Josh Hunt). - Removal of modularity from a few ACPI drivers (BGRT, GHES, intel_pmic_crc) that cannot be built as modules in practice (Paul Gortmaker). - PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS as a valid resource type (Harb Abdulhamid). - New device ID (future AMD I2C controller) in the ACPI driver for AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu). - Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin). - cpuidle menu governor optimization to avoid a square root computation in it (Rasmus Villemoes). - Fix for potential use-after-free in the generic device properties framework (Heikki Krogerus). - Updates of the generic power domains (genpd) framework including support for multiple power states of a domain, fixes and debugfs output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart, Geert Uytterhoeven). - Intel RAPL power capping driver updates to reduce IPI overhead in it (Jacob Pan). - System suspend/hibernation code cleanups (Eric Biggers, Saurabh Sengar). - Year 2038 fix for the process freezer (Abhilash Jindal). - turbostat utility updates including new features (decoding of more registers and CPUID fields, sub-second intervals support, GFX MHz and RC6 printout, --out command line option), fixes (syscall jitter detection and workaround, reductioin of the number of syscalls made, fixes related to Xeon x200 processors, compiler warning fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)" * tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits) tools/power turbostat: bugfix: TDP MSRs print bits fixing tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump tools/power turbostat: call __cpuid() instead of __get_cpuid() tools/power turbostat: indicate SMX and SGX support tools/power turbostat: detect and work around syscall jitter tools/power turbostat: show GFX%rc6 tools/power turbostat: show GFXMHz tools/power turbostat: show IRQs per CPU tools/power turbostat: make fewer systems calls tools/power turbostat: fix compiler warnings tools/power turbostat: add --out option for saving output in a file tools/power turbostat: re-name "%Busy" field to "Busy%" tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding tools/power turbostat: Intel Xeon x200: fix erroneous bclk value tools/power turbostat: allow sub-sec intervals ACPI / APEI: ERST: Fixed leaked resources in erst_init ACPI / APEI: Fix leaked resources intel_pstate: Do not skip samples partially intel_pstate: Remove freq calculation from intel_pstate_calc_busy() intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance() ... |
|
Linus Torvalds | e71c2c1eeb |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar: "Main kernel side changes: - Big reorganization of the x86 perf support code. The old code grew organically deep inside arch/x86/kernel/cpu/perf* and its naming became somewhat messy. The new location is under arch/x86/events/, using the following cleaner hierarchy of source code files: perf/x86: Move perf_event.c .................. => x86/events/core.c perf/x86: Move perf_event_amd.c .............. => x86/events/amd/core.c perf/x86: Move perf_event_amd_ibs.c .......... => x86/events/amd/ibs.c perf/x86: Move perf_event_amd_iommu.[ch] ..... => x86/events/amd/iommu.[ch] perf/x86: Move perf_event_amd_uncore.c ....... => x86/events/amd/uncore.c perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch] perf/x86: Move perf_event_intel_rapl.c ....... => x86/events/intel/rapl.c perf/x86: Move perf_event_intel_uncore.[ch] .. => x86/events/intel/uncore.[ch] perf/x86: Move perf_event_intel_uncore_nhmex.c => x86/events/intel/uncore_nmhex.c perf/x86: Move perf_event_intel_uncore_snb.c => x86/events/intel/uncore_snb.c perf/x86: Move perf_event_intel_uncore_snbep.c => x86/events/intel/uncore_snbep.c perf/x86: Move perf_event_knc.c .............. => x86/events/intel/knc.c perf/x86: Move perf_event_p4.c ............... => x86/events/intel/p4.c perf/x86: Move perf_event_p6.c ............... => x86/events/intel/p6.c perf/x86: Move perf_event_msr.c .............. => x86/events/msr.c (Borislav Petkov) - Update various x86 PMU constraint and hw support details (Stephane Eranian) - Optimize kprobes for BPF execution (Martin KaFai Lau) - Rewrite, refactor and fix the Intel uncore PMU driver code (Thomas Gleixner) - Rewrite, refactor and fix the Intel RAPL PMU code (Thomas Gleixner) - Various fixes and smaller cleanups. There are lots of perf tooling updates as well. A few highlights: perf report/top: - Hierarchy histogram mode for 'perf top' and 'perf report', showing multiple levels, one per --sort entry: (Namhyung Kim) On a mostly idle system: # perf top --hierarchy -s comm,dso Then expand some levels and use 'P' to take a snapshot: # cat perf.hist.0 - 92.32% perf 58.20% perf 22.29% libc-2.22.so 5.97% [kernel] 4.18% libelf-0.165.so 1.69% [unknown] - 4.71% qemu-system-x86 3.10% [kernel] 1.60% qemu-system-x86_64 (deleted) + 2.97% swapper # - Add 'L' hotkey to dynamicly set the percent threshold for histogram entries and callchains, i.e. dynamicly do what the --percent-limit command line option to 'top' and 'report' does. (Namhyung Kim) perf mem: - Allow specifying events via -e in 'perf mem record', also listing what events can be specified via 'perf mem record -e list' (Jiri Olsa) perf record: - Add 'perf record' --all-user/--all-kernel options, so that one can tell that all the events in the command line should be restricted to the user or kernel levels (Jiri Olsa), i.e.: perf record -e cycles:u,instructions:u is equivalent to: perf record --all-user -e cycles,instructions - Make 'perf record' collect CPU cache info in the perf.data file header: $ perf record usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] $ perf report --header-only -I | tail -10 | head -8 # CPU cache info: # L1 Data 32K [0-1] # L1 Instruction 32K [0-1] # L1 Data 32K [2-3] # L1 Instruction 32K [2-3] # L2 Unified 256K [0-1] # L2 Unified 256K [2-3] # L3 Unified 4096K [0-3] Will be used in 'perf c2c' and eventually in 'perf diff' to allow, for instance running the same workload in multiple machines and then when using 'diff' show the hardware difference. (Jiri Olsa) - Improved support for Java, using the JVMTI agent library to do jitdumps that then will be inserted in synthesized PERF_RECORD_MMAP2 events via 'perf inject' pointed to synthesized ELF files stored in ~/.debug and keyed with build-ids, to allow symbol resolution and even annotation with source line info, see the changeset comments to see how to use it (Stephane Eranian) perf script/trace: - Decode data_src values (e.g. perf.data files generated by 'perf mem record') in 'perf script': (Jiri Olsa) # perf script perf 693 [1] 4.088652: 1 cpu/mem-loads,ldlat=30/P: ffff88007d0b0f40 68100142 L1 hit|SNP None|TLB L1 or L2 hit|LCK No <SNIP> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - Improve support to 'data_src', 'weight' and 'addr' fields in 'perf script' (Jiri Olsa) - Handle empty print fmts in 'perf script -s' i.e. when running python or perl scripts (Taeung Song) perf stat: - 'perf stat' now shows shadow metrics (insn per cycle, etc) in interval mode too. E.g: # perf stat -I 1000 -e instructions,cycles sleep 1 # time counts unit events 1.000215928 519,620 instructions # 0.69 insn per cycle 1.000215928 752,003 cycles <SNIP> - Port 'perf kvm stat' to PowerPC (Hemant Kumar) - Implement CSV metrics output in 'perf stat' (Andi Kleen) perf BPF support: - Support converting data from bpf events in 'perf data' (Wang Nan) - Print bpf-output events in 'perf script': (Wang Nan). # perf record -e bpf-output/no-inherit,name=evt/ -e ./test_bpf_output_3.c/map:channel.event=evt/ usleep 1000 # perf script usleep 4882 21384.532523: evt: ffffffff810e97d1 sys_nanosleep ([kernel.kallsyms]) BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" # - Add API to set values of map entries in a BPF object, be it individual map slots or ranges (Wang Nan) - Introduce support for the 'bpf-output' event (Wang Nan) - Add glue to read perf events in a BPF program (Wang Nan) - Improve support for bpf-output events in 'perf trace' (Wang Nan) ... and tons of other changes as well - see the shortlog and git log for details!" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (342 commits) perf stat: Add --metric-only support for -A perf stat: Implement --metric-only mode perf stat: Document CSV format in manpage perf hists browser: Check sort keys before hot key actions perf hists browser: Allow thread filtering for comm sort key perf tools: Add sort__has_comm variable perf tools: Recalc total periods using top-level entries in hierarchy perf tools: Remove nr_sort_keys field perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry() perf tools: Remove hist_entry->fmt field perf tools: Fix command line filters in hierarchy mode perf tools: Add more sort entry check functions perf tools: Fix hist_entry__filter() for hierarchy perf jitdump: Build only on supported archs tools lib traceevent: Add '~' operation within arg_num_eval() perf tools: Omit unnecessary cast in perf_pmu__parse_scale perf tools: Pass perf_hpp_list all the way through setup_sort_list perf tools: Fix perf script python database export crash perf jitdump: DWARF is also needed perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changes ... |
|
Rafael J. Wysocki | 4ed3900427 |
Merge branch 'pm-cpufreq'
* pm-cpufreq: (94 commits) intel_pstate: Do not skip samples partially intel_pstate: Remove freq calculation from intel_pstate_calc_busy() intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance() intel_pstate: Optimize calculation for max/min_perf_adj intel_pstate: Remove extra conversions in pid calculation cpufreq: Move scheduler-related code to the sched directory Revert "cpufreq: postfix policy directory with the first CPU in related_cpus" cpufreq: Reduce cpufreq_update_util() overhead a bit cpufreq: Select IRQ_WORK if CPU_FREQ_GOV_COMMON is set cpufreq: Remove 'policy->governor_enabled' cpufreq: Rename __cpufreq_governor() to cpufreq_governor() cpufreq: Relocate handle_update() to kill its declaration cpufreq: governor: Drop unnecessary checks from show() and store() cpufreq: governor: Fix race in dbs_update_util_handler() cpufreq: governor: Make gov_set_update_util() static cpufreq: governor: Narrow down the dbs_data_mutex coverage cpufreq: governor: Make dbs_data_mutex static cpufreq: governor: Relocate definitions of tuners structures cpufreq: governor: Move per-CPU data to the common code cpufreq: governor: Make governor private data per-policy ... |
|
Alexei Starovoitov | b121d1e74d |
bpf: prevent kprobe+bpf deadlocks
if kprobe is placed within update or delete hash map helpers that hold bucket spin lock and triggered bpf program is trying to grab the spinlock for the same bucket on the same cpu, it will deadlock. Fix it by extending existing recursion prevention mechanism. Note, map_lookup and other tracing helpers don't have this problem, since they don't hold any locks and don't modify global data. bpf_trace_printk has its own recursive check and ok as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
David S. Miller | 810813c47a |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Several cases of overlapping changes, as well as one instance (vxlan) of a bug fix in 'net' overlapping with code movement in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | 78baab7aa8 |
A feature was added in 4.3 that allowed users to filter trace points on
a tasks "comm" field. But this prevented filtering on a comm field that is within a trace event (like sched_migrate_task). When trying to filter on when a program migrated, this change prevented the filtering of the sched_migrate_task. To fix this, the event fields are examined first, and then the extra fields like "comm" and "cpu" are examined. Also, instead of testing to assign the comm filter function based on the field's name, the generic comm field is given a new filter type (FILTER_COMM). When this field is used to filter the type is checked. The same is done for the cpu filter field. Two new special filter types are added: "COMM" and "CPU". This allows users to still filter the tasks comm for events that have "comm" as one of their fields, in cases that users would like to filter sched_migrate_task on the comm of the task that called the event, and not the comm of the task that is being migrated. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW2argAAoJEKKk/i67LK/8b78H/32nYPizDIsK/p2bL1mgbtMl vrkcfb+maPOC7cjB+CdQmyV4EIVpSn06XFouYghGprdoVocVyBuIflxn0j3Gbymy zLCg8lR70KTATTqst1wsWMbnh+UvAKNEiXj8jf2qcK2xhgalXMDwsTC4+LDlLugu YAx89lmsjK1YpP/wIzMww2jQG+07Nhm9gHWXF2MC3egZ+sgYxARnfds0yTcGgS8o dc/yJGZDCI44JMDNThcCFxNvsmoTa9tpm+JNe2YTht6KCympa+Ht9Jj9MMlD06cq M5CqMQlok+mrVsW5LbJPCk1u83ynr6d/PcPQuT2nykRx8bGvKjA7AKMPaxw1Jz4= =ixBz -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "A feature was added in 4.3 that allowed users to filter trace points on a tasks "comm" field. But this prevented filtering on a comm field that is within a trace event (like sched_migrate_task). When trying to filter on when a program migrated, this change prevented the filtering of the sched_migrate_task. To fix this, the event fields are examined first, and then the extra fields like "comm" and "cpu" are examined. Also, instead of testing to assign the comm filter function based on the field's name, the generic comm field is given a new filter type (FILTER_COMM). When this field is used to filter the type is checked. The same is done for the cpu filter field. Two new special filter types are added: "COMM" and "CPU". This allows users to still filter the tasks comm for events that have "comm" as one of their fields, in cases that users would like to filter sched_migrate_task on the comm of the task that called the event, and not the comm of the task that is being migrated" * tag 'trace-fixes-v4.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Do not have 'comm' filter override event 'comm' field |
|
Steven Rostedt (Red Hat) | e57cbaf0eb |
tracing: Do not have 'comm' filter override event 'comm' field
Commit |
|
Taeung Song | 026842d148 |
tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:
Some tracepoint have multiple fields with the same name, "nr", the first one is a unique syscall ID, the other is a syscall argument: # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_io_getevents/format name: sys_enter_io_getevents ID: 747 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int nr; offset:8; size:4; signed:1; field:aio_context_t ctx_id; offset:16; size:8; signed:0; field:long min_nr; offset:24; size:8; signed:0; field:long nr; offset:32; size:8; signed:0; field:struct io_event * events; offset:40; size:8; signed:0; field:struct timespec * timeout; offset:48; size:8; signed:0; print fmt: "ctx_id: 0x%08lx, min_nr: 0x%08lx, nr: 0x%08lx, events: 0x%08lx, timeout: 0x%08lx", ((unsigned long)(REC->ctx_id)), ((unsigned long)(REC->min_nr)), ((unsigned long)(REC->nr)), ((unsigned long)(REC->events)), ((unsigned long)(REC->timeout)) # Fix it by renaming the "/format" common tracepoint field "nr" to "__syscall_nr". Signed-off-by: Taeung Song <treeze.taeung@gmail.com> [ Do not rename the struct member, just the '/format' field name ] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160226132301.3ae065a4@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
|
Ingo Molnar | 0a7348925f |
Linux 4.5-rc6
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW0yM6AAoJEHm+PkMAQRiGeUwIAJRTHFPJTFpJcJjeZEV4/EL1 7Pl0WSHs/CWBkXIevAg2HgkECSQ9NI9FAUFvoGxCldDpFAnL1U2QV8+Ur2qhiXMG 5v0jILJuiw57qT/NfhEudZolerlRoHILmB3JRTb+DUV4GHZuWpTkJfUSI9j5aTEl w83XUgtK4bKeIyFbHdWQk6xqfzfFBSuEITuSXreOMwkFfMmeScE0WXOPLBZWyhPa v0rARJLYgM+vmRAnJjnG8unH+SgnqiNcn2oOFpevKwmpVcOjcEmeuxh/HdeZf7HM /R8F86OwdmXsO+z8dQxfcucLg+I9YmKfFr8b6hopu1sRztss2+Uk6H1j2J7IFIg= =tvkh -----END PGP SIGNATURE----- Merge tag 'v4.5-rc6' into perf/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 5bb9871eb8 |
Another small bug reported to me by Chunyu Hu.
When perf added a "reg" function to the function tracing event (not a tracepoint), it caused that event to be displayed as a tracepoint and could cause errors in tracepoint handling. That was solved by adding a flag to ignore ftrace non-tracepoint events. But that flag was missed when displaying events in available_events, which should only contain tracepoint events. This broke a documented way to enable all events with: cat available_events > set_event As the function non-tracepoint event would cause that to error out. The commit here fixes that by having the available_events file not list events that have the ignore flag set. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWzxLPAAoJEKKk/i67LK/8uAMH+gPedE1YbBmhFOgOEEyjADsV AJ/XRkqHhNyjB5h05jvO9KeoaEL27E0unvRNIKFjeKoco1fqQ5USMXr1hdOpQKt6 X16478XGWiZI2w11Yp86UmhrzVFmRcCv1zSIhTFhidOIQVwu6aUSRA3DchaBXCez nzQBgRUzsHLWhS1tRNtdnf6gRe6PSImjwmpyYETPzzatedAW4D1Xp+u+7z2uKXka jJIaE90hYDCT08+6Q+QeD1RR2Uzh+E72ZhM5wCcnuwKd5BfxLcZs23PCIlZjX24k Iu2hPeW1VVIN/mbqY04R4oAKRXJG8Wio8l1S5ldJTPReBCZLDK5N2O+LWPwIEIU= =Ypbd -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Another small bug reported to me by Chunyu Hu. When perf added a "reg" function to the function tracing event (not a tracepoint), it caused that event to be displayed as a tracepoint and could cause errors in tracepoint handling. That was solved by adding a flag to ignore ftrace non-tracepoint events. But that flag was missed when displaying events in available_events, which should only contain tracepoint events. This broke a documented way to enable all events with: cat available_events > set_event As the function non-tracepoint event would cause that to error out. The commit here fixes that by having the available_events file not list events that have the ignore flag set" * tag 'trace-fixes-v4.5-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix showing function event in available_events |
|
Steven Rostedt (Red Hat) | d045437a16 |
tracing: Fix showing function event in available_events
The ftrace:function event is only displayed for parsing the function tracer data. It is not used to enable function tracing, and does not include an "enable" file in its event directory. Originally, this event was kept separate from other events because it did not have a ->reg parameter. But perf added a "reg" parameter for its use which caused issues, because it made the event available to functions where it was not compatible for. Commit |
|
David S. Miller | b633353115 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts: drivers/net/phy/bcm7xxx.c drivers/net/phy/marvell.c drivers/net/vxlan.c All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Linus Torvalds | 4de8ebeff8 |
Two more small fixes.
One is by Yang Shi who added a READ_ONCE_NOCHECK() to the scan of the stack made by the stack tracer. As the stack tracer scans the entire kernel stack, KASAN triggers seeing it as a "stack out of bounds" error. As the scan is looking at the contents of the stack from parent functions. The NOCHECK() tells KASAN that this is done on purpose, and is not some kind of stack overflow. The second fix is to the ftrace selftests, to retrieve the PID of executed commands from the shell with "$!" and not by parsing "jobs". -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWyycyAAoJEKKk/i67LK/8ALoH/RkMZ7Cih7vXb30wB13xSNrB 6o4ApuC4YOS9Un/4ruCXb+cGbW2LJLHkEU2ageoHLOZMvwuAM7iQ6fTUW1KxCRP2 ECvqyqi0ZRyoi/CibxVVH9hHEAJzUTwok67nkLeZBqIN9Fglcfd7toAwgcrH3y59 Pybyv5CV2eaff5IKoLXKZJNRLdrVLeM7v4BvdI0dxEikhWZ0XsA0RoIaNfTPqyQJ F6sJ/njdoMMJK4N8CCPxlvnvEOzn0DnJnfUNUQEj5J3YU9DbAHAACaBSg5oSh9oK BcFYKV2GIzPku1cafutRRlErcGyB2yqv7bB8Eo86zXRHbeonaj4XGJmH276ldVg= =srlj -----END PGP SIGNATURE----- Merge tag 'trace-fixes-v4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two more small fixes. One is by Yang Shi who added a READ_ONCE_NOCHECK() to the scan of the stack made by the stack tracer. As the stack tracer scans the entire kernel stack, KASAN triggers seeing it as a "stack out of bounds" error. As the scan is looking at the contents of the stack from parent functions. The NOCHECK() tells KASAN that this is done on purpose, and is not some kind of stack overflow. The second fix is to the ftrace selftests, to retrieve the PID of executed commands from the shell with '$!' and not by parsing 'jobs'" * tag 'trace-fixes-v4.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing, kasan: Silence Kasan warning in check_stack of stack_tracer ftracetest: Fix instance test to use proper shell command for pids |
|
Alexei Starovoitov | d5a3b1f691 |
bpf: introduce BPF_MAP_TYPE_STACK_TRACE
add new map type to store stack traces and corresponding helper bpf_get_stackid(ctx, map, flags) - walk user or kernel stack and return id @ctx: struct pt_regs* @map: pointer to stack_trace map @flags: bits 0-7 - numer of stack frames to skip bit 8 - collect user stack instead of kernel bit 9 - compare stacks by hash only bit 10 - if two different stacks hash into the same stackid discard old other bits - reserved Return: >= 0 stackid on success or negative error stackid is a 32-bit integer handle that can be further combined with other data (including other stackid) and used as a key into maps. Userspace will access stackmap using standard lookup/delete syscall commands to retrieve full stack trace for given stackid. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> |
|
Yang Shi | 6e22c83664 |
tracing, kasan: Silence Kasan warning in check_stack of stack_tracer
When enabling stack trace via "echo 1 > /proc/sys/kernel/stack_tracer_enabled", the below KASAN warning is triggered: BUG: KASAN: stack-out-of-bounds in check_stack+0x344/0x848 at addr ffffffc0689ebab8 Read of size 8 by task ksoftirqd/4/29 page:ffffffbdc3a27ac0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x0() page dumped because: kasan: bad access detected CPU: 4 PID: 29 Comm: ksoftirqd/4 Not tainted 4.5.0-rc1 #129 Hardware name: Freescale Layerscape 2085a RDB Board (DT) Call trace: [<ffffffc000091300>] dump_backtrace+0x0/0x3a0 [<ffffffc0000916c4>] show_stack+0x24/0x30 [<ffffffc0009bbd78>] dump_stack+0xd8/0x168 [<ffffffc000420bb0>] kasan_report_error+0x6a0/0x920 [<ffffffc000421688>] kasan_report+0x70/0xb8 [<ffffffc00041f7f0>] __asan_load8+0x60/0x78 [<ffffffc0002e05c4>] check_stack+0x344/0x848 [<ffffffc0002e0c8c>] stack_trace_call+0x1c4/0x370 [<ffffffc0002af558>] ftrace_ops_no_ops+0x2c0/0x590 [<ffffffc00009f25c>] ftrace_graph_call+0x0/0x14 [<ffffffc0000881bc>] fpsimd_thread_switch+0x24/0x1e8 [<ffffffc000089864>] __switch_to+0x34/0x218 [<ffffffc0011e089c>] __schedule+0x3ac/0x15b8 [<ffffffc0011e1f6c>] schedule+0x5c/0x178 [<ffffffc0001632a8>] smpboot_thread_fn+0x350/0x960 [<ffffffc00015b518>] kthread+0x1d8/0x2b0 [<ffffffc0000874d0>] ret_from_fork+0x10/0x40 Memory state around the buggy address: ffffffc0689eb980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 f4 f4 f4 ffffffc0689eba00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffc0689eba80: 00 00 f1 f1 f1 f1 00 f4 f4 f4 f3 f3 f3 f3 00 00 ^ ffffffc0689ebb00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc0689ebb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 The stacker tracer traverses the whole kernel stack when saving the max stack trace. It may touch the stack red zones to cause the warning. So, just disable the instrumentation to silence the warning. Link: http://lkml.kernel.org/r/1455309960-18930-1-git-send-email-yang.shi@linaro.org Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Linus Torvalds | 705d43dbe1 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatching fixes from Jiri Kosina: - regression (from 4.4) fix for ordering issue, introduced by an earlier ftrace change, that broke live patching of modules. The fix replaces the ftrace module notifier by direct call in order to make the ordering guaranteed and well-defined. The patch, from Jessica Yu, has been acked both by Steven and Rusty - error message fix from Miroslav Benes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching: ftrace/module: remove ftrace module notifier livepatch: change the error message in asm/livepatch.h header files |
|
Jessica Yu | 7dcd182bec |
ftrace/module: remove ftrace module notifier
Remove the ftrace module notifier in favor of directly calling
ftrace_module_enable() and ftrace_release_mod() in the module loader.
Hard-coding the function calls directly in the module loader removes
dependence on the module notifier call chain and provides better
visibility and control over what gets called when, which is important
to kernel utilities such as livepatch.
This fixes a notifier ordering issue in which the ftrace module notifier
(and hence ftrace_module_enable()) for coming modules was being called
after klp_module_notify(), which caused livepatch modules to initialize
incorrectly. This patch removes dependence on the module notifier call
chain in favor of hard coding the corresponding function calls in the
module loader. This ensures that ftrace and livepatch code get called in
the correct order on patch module load and unload.
Fixes:
|
|
Rafael J. Wysocki | b2a3b193b7 | Merge branch 'pm-opp' into pm-cpufreq | |
Martin KaFai Lau | a7636d9ecf |
kprobes: Optimize hot path by using percpu counter to collect 'nhit' statistics
When doing ebpf+kprobe on some hot TCP functions (e.g. tcp_rcv_established), the kprobe_dispatcher() function shows up in 'perf report'. In kprobe_dispatcher(), there is a lot of cache bouncing on 'tk->nhit++'. 'tk->nhit' and 'tk->tp.flags' also share the same cacheline. perf report (cycles:pp): 8.30% ipv4_dst_check 4.74% copy_user_enhanced_fast_string 3.93% dst_release 2.80% tcp_v4_rcv 2.31% queued_spin_lock_slowpath 2.30% _raw_spin_lock 1.88% mlx4_en_process_rx_cq 1.84% eth_get_headlen 1.81% ip_rcv_finish ~~~~ 1.71% kprobe_dispatcher ~~~~ 1.55% mlx4_en_xmit 1.09% __probe_kernel_read perf report after patch: 9.15% ipv4_dst_check 5.00% copy_user_enhanced_fast_string 4.12% dst_release 2.96% tcp_v4_rcv 2.50% _raw_spin_lock 2.39% queued_spin_lock_slowpath 2.11% eth_get_headlen 2.03% mlx4_en_process_rx_cq 1.69% mlx4_en_xmit 1.19% ip_rcv_finish 1.12% __probe_kernel_read 1.02% ehci_hcd_cleanup Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Kernel Team <kernel-team@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1454531308-2441898-1-git-send-email-kafai@fb.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Shilpasri G Bhat | 0306e481d4 |
cpufreq: powernv/tracing: Add powernv_throttle tracepoint
This patch adds the powernv_throttle tracepoint to trace the CPU frequency throttling event, which is used by the powernv-cpufreq driver in POWER8. Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
|
Linus Torvalds | ef582d095d |
A cleanup to the stack tracer broke stack tracing on s390.
Here's a simple fix to correct that issue. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWr9ZIAAoJEKKk/i67LK/8Zw4H/jTSM58YqENMrXLkfL1DbzXR UsJM+tnJX1BjYDy57yAj3HXYYWKB9h+T9Fku4CMxzRqkFHA3Vu95YIJN8hpQ0fqT R4/nvetq214bH27DNFuDHzBwVJL368De0Kcmqy83FB5G89G8JXoxiY6nvDkmQUIq mzYU9duCbCRvXrOSDCVSVf/hVg71Ek/erZMVfYwSf56yy8ICOoiW8Fyv6kludBAu /71ztEWPlIXJWijIQsH2fWsdOln7N/Ej5+9wtotSlbHtTuhJJi2xr817WwOLNUBN HC5OM5K6mWqnLveZZLTp6o77Ap6BYw2vCyElvARt23Eywz3iUE1ZzeRGSPtQwI8= =s+F8 -----END PGP SIGNATURE----- Merge tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "A cleanup to the stack tracer broke stack tracing on s390. Here's a simple fix to correct that issue" * tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/stacktrace: Show entire trace if passed in function not found |
|
Linus Torvalds | 29d14f0835 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner: "This is much bigger than typical fixes, but Peter found a category of races that spurred more fixes and more debugging enhancements. Work started before the merge window, but got finished only now. Aside of that this contains the usual small fixes to perf and tools. Nothing particular exciting" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) perf: Remove/simplify lockdep annotation perf: Synchronously clean up child events perf: Untangle 'owner' confusion perf: Add flags argument to perf_remove_from_context() perf: Clean up sync_child_event() perf: Robustify event->owner usage and SMP ordering perf: Fix STATE_EXIT usage perf: Update locking order perf: Remove __free_event() perf/bpf: Convert perf_event_array to use struct file perf: Fix NULL deref perf/x86: De-obfuscate code perf/x86: Fix uninitialized value usage perf: Fix race in perf_event_exit_task_context() perf: Fix orphan hole perf stat: Do not clean event's private stats perf hists: Fix HISTC_MEM_DCACHELINE width setting perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed perf tests: Remove wrong semicolon in while loop in CQM test perf: Synchronously free aux pages in case of allocation failure ... |
|
Steven Rostedt | 6ccd83714a |
tracing/stacktrace: Show entire trace if passed in function not found
When a max stack trace is discovered, the stack dump is saved. In order to
not record the overhead of the stack tracer, the ip of the traced function
is looked for within the dump. The trace is started from the location of
that function. But if for some reason the ip is not found, the entire stack
trace is then truncated. That's not very useful. Instead, print everything
if the ip of the traced function is not found within the trace.
This issue showed up on s390.
Link: http://lkml.kernel.org/r/20160129102241.1b3c9c04@gandalf.local.home
Fixes:
|
|
Alexei Starovoitov | e03e7ee34f |
perf/bpf: Convert perf_event_array to use struct file
Robustify refcounting. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Wang Nan <wangnan0@huawei.com> Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160126045947.GA40151@ast-mbp.thefacebook.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
|
Linus Torvalds | 26cd83670f |
This includes three minor fixes, mostly due to cut-and-paste issues.
The first is a cut and paste issue that changed the amount of stack to skip when tracing a stack dump from 0 to 6, which basically made the stack disappear for small stack traces. The second fix is just removing an unused field in a struct that is no longer used, and currently just wastes space. The third is another cut-and-paste fix that had a tracepoint recording the wrong field (it was recording the previous field a second time). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWqibPAAoJEKKk/i67LK/8/NkH/3M6WB7RIiMMd4O403imbKcs yIH0j9vH6Z5hwoAUUr0bEw+gHVgzsiRky5z+fP0f1J3QdVAdgEig6RgQtIbWRynu i7fohNAiSMBob0wOIHTohQDKkQjvgoO9gO5S8nY6Axgpf4iqOTy3RF2a/gcltULY qdgy9A0vLk6yMbP6c0P+kEzg4y+Q90DsUh8YzQKW7F1EJPneDmNdug3VM16gefTR 4yrodSBHxr8NV3kAhN8G7FjWmK5cBDFwD66vsti64mKVCW00hjYRCQ+5BrgQ7h0V EDC7kHisckLb415SQxe8XdF4fKbfE1PuQYZhjTo02hx9XCMeyxDWbjTF2PrZCHw= =gab6 -----END PGP SIGNATURE----- Merge tag 'trace-v4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull minor tracing fixes from Steven Rostedt: "This includes three minor fixes, mostly due to cut-and-paste issues. The first is a cut and paste issue that changed the amount of stack to skip when tracing a stack dump from 0 to 6, which basically made the stack disappear for small stack traces. The second fix is just removing an unused field in a struct that is no longer used, and currently just wastes space. The third is another cut-and-paste fix that had a tracepoint recording the wrong field (it was recording the previous field a second time)" * tag 'trace-v4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/dma-buf/fence: Fix timeline str value on fence_annotate_wait_on ftrace: Remove unused nr_trampolines var tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs() |
|
Steven Rostedt (Red Hat) | 7717c6be69 |
tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs()
While cleaning the stacktrace code I unintentially changed the skip depth of
trace_buffer_unlock_commit_regs() from 0 to 6. kprobes uses this function,
and with skipping 6 call backs, it can easily produce no stack.
Here's how I tested it:
# echo 'p:ext4_sync_fs ext4_sync_fs ' > /sys/kernel/debug/tracing/kprobe_events
# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
# cat /sys/kernel/debug/trace
sync-2394 [005] 502.457060: ext4_sync_fs: (ffffffff81317650)
sync-2394 [005] 502.457063: kernel_stack: <stack trace>
sync-2394 [005] 502.457086: ext4_sync_fs: (ffffffff81317650)
sync-2394 [005] 502.457087: kernel_stack: <stack trace>
sync-2394 [005] 502.457091: ext4_sync_fs: (ffffffff81317650)
After putting back the skip stack to zero, we have:
sync-2270 [000] 748.052693: ext4_sync_fs: (ffffffff81317650)
sync-2270 [000] 748.052695: kernel_stack: <stack trace>
=> iterate_supers (ffffffff8126412e)
=> sys_sync (ffffffff8129c4b6)
=> entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
sync-2270 [000] 748.053017: ext4_sync_fs: (ffffffff81317650)
sync-2270 [000] 748.053019: kernel_stack: <stack trace>
=> iterate_supers (ffffffff8126412e)
=> sys_sync (ffffffff8129c4b6)
=> entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
sync-2270 [000] 748.053381: ext4_sync_fs: (ffffffff81317650)
sync-2270 [000] 748.053383: kernel_stack: <stack trace>
=> iterate_supers (ffffffff8126412e)
=> sys_sync (ffffffff8129c4b6)
=> entry_SYSCALL_64_fastpath (ffffffff8181f0b2)
Cc: stable@vger.kernel.org # v4.4+
Fixes:
|
|
Linus Torvalds | c17488d066 |
Not much new with tracing for this release. Mostly just clean ups and
minor fixes. Here's what else is new: o A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for those that want both. o New selftest to test the instance create and delete o Better debug output when ftrace fails -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWlU8tAAoJEKKk/i67LK/8JckH/2XIhjwMunm35uCg1308sDqy d44G3+p0pm8ztjBf8iD8wH2nP3m7z+nC8JBmSPIUgAHsKOYHWsBy2A/36OVWv5lK 1hVXvBwOuZXnyWXr7bC2RO9S9f9acSFaabZXWDi1BCJRJSgEcknz32V7ZAL4jOCO SfBWBNrWJfUsURbfbElfVxPLArvyUg9Bb5dW5B+QFf6PuoJaORYzNLYXHlbsq++T WlrlnD+mFZ/DKFZ/gl3FMSGMPaGimw09/3eqMzv/tLQobp6PbCWlJTwjUoxJ/9dO XOY4sWUrUUZilU8qCk0i0ZSEumWmE+SWS3eq+Ef18B/5haIj/LkoM4UQD3h2Rc4= =FDR+ -----END PGP SIGNATURE----- Merge tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "Not much new with tracing for this release. Mostly just clean ups and minor fixes. Here's what else is new: - A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for those that want both. - New selftest to test the instance create and delete - Better debug output when ftrace fails" * tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits) ftrace: Fix the race between ftrace and insmod ftrace: Add infrastructure for delayed enabling of module functions x86: ftrace: Fix the comments for ftrace_modify_code_direct() tracing: Fix comment to use tracing_on over tracing_enable metag: ftrace: Fix the comments for ftrace_modify_code sh: ftrace: Fix the comments for ftrace_modify_code() ia64: ftrace: Fix the comments for ftrace_modify_code() ftrace: Clean up ftrace_module_init() code ftrace: Join functions ftrace_module_init() and ftrace_init_module() tracing: Introduce TRACE_EVENT_FN_COND macro tracing: Use seq_buf_used() in seq_buf_to_user() instead of len bpf: Constify bpf_verifier_ops structure ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too ftrace: Remove use of control list and ops ftrace: Fix output of enabled_functions for showing tramp ftrace: Fix a typo in comment ftrace: Show all tramps registered to a record on ftrace_bug() ftrace: Add variable ftrace_expected for archs to show expected code ftrace: Add new type to distinguish what kind of ftrace_bug() tracing: Update cond flag when enabling or disabling a trigger ... |
|
Linus Torvalds | 33caf82acf |
Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro: "All kinds of stuff. That probably should've been 5 or 6 separate branches, but by the time I'd realized how large and mixed that bag had become it had been too close to -final to play with rebasing. Some fs/namei.c cleanups there, memdup_user_nul() introduction and switching open-coded instances, burying long-dead code, whack-a-mole of various kinds, several new helpers for ->llseek(), assorted cleanups and fixes from various people, etc. One piece probably deserves special mention - Neil's lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets called without ->i_mutex and tries to avoid ever taking it. That, of course, means that it's not useful for any directory modifications, but things like getting inode attributes in nfds readdirplus are fine with that. I really should've asked for moratorium on lookup-related changes this cycle, but since I hadn't done that early enough... I *am* asking for that for the coming cycle, though - I'm going to try and get conversion of i_mutex to rwsem with ->lookup() done under lock taken shared. There will be a patch closer to the end of the window, along the lines of the one Linus had posted last May - mechanical conversion of ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/ inode_is_locked()/inode_lock_nested(). To quote Linus back then: ----- | This is an automated patch using | | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/' | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/' | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/' | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/' | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/' | | with a very few manual fixups ----- I'm going to send that once the ->i_mutex-affecting stuff in -next gets mostly merged (or when Linus says he's about to stop taking merges)" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits) nfsd: don't hold i_mutex over userspace upcalls fs:affs:Replace time_t with time64_t fs/9p: use fscache mutex rather than spinlock proc: add a reschedule point in proc_readfd_common() logfs: constify logfs_block_ops structures fcntl: allow to set O_DIRECT flag on pipe fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE fs: xattr: Use kvfree() [s390] page_to_phys() always returns a multiple of PAGE_SIZE nbd: use ->compat_ioctl() fs: use block_device name vsprintf helper lib/vsprintf: add %*pg format specifier fs: use gendisk->disk_name where possible poll: plug an unused argument to do_poll amdkfd: don't open-code memdup_user() cdrom: don't open-code memdup_user() rsxx: don't open-code memdup_user() mtip32xx: don't open-code memdup_user() [um] mconsole: don't open-code memdup_user_nul() [um] hostaudio: don't open-code memdup_user() ... |
|
Qiu Peiyang | 5156dca34a |
ftrace: Fix the race between ftrace and insmod
We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip. Basically, there is a race between insmod and ftrace_run_update_code. After load_module=>ftrace_module_init, another thread jumps in to call ftrace_run_update_code=>ftrace_arch_code_modify_prepare =>set_all_modules_text_rw, to change all modules as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute is not changed. Then, the 2nd thread goes ahead to change codes. However, load_module continues to call complete_formation=>set_section_ro_nx, then 2nd thread would fail when probing the module's TEXT. The patch fixes it by using notifier to delay the enabling of ftrace records to the time when module is at state MODULE_STATE_COMING. Link: http://lkml.kernel.org/r/567CE628.3000609@intel.com Signed-off-by: Qiu Peiyang <peiyangx.qiu@intel.com> Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | b7ffffbb46 |
ftrace: Add infrastructure for delayed enabling of module functions
Qiu Peiyang pointed out that there's a race when enabling function tracing and loading a module. In order to make the modifications of converting nops in the prologue of functions into callbacks, the text needs to be converted from read-only to read-write. When enabling function tracing, the text permission is updated, the functions are modified, and then they are put back. When loading a module, the updates to convert function calls to mcount is done before the module text is set to read-only. But after it is done, the module text is visible by the function tracer. Thus we have the following race: CPU 0 CPU 1 ----- ----- start function tracing set text to read-write load_module add functions to ftrace set module text read-only update all functions to callbacks modify module functions too < Can't it's read-only > When this happens, ftrace detects the issue and disables itself till the next reboot. To fix this, a new DISABLED flag is added for ftrace records, which all module functions get when they are added. Then later, after the module code is all set, the records will have the DISABLED flag cleared, and they will be enabled if any callback wants all functions to be traced. Note, this doesn't add the delay to later. It simply changes the ftrace_module_init() to do both the setting of DISABLED records, and then immediately calls the enable code. This helps with testing this new code as it has the same behavior as previously. Another change will come after this to have the ftrace_module_enable() called after the text is set to read-only. Cc: Qiu Peiyang <peiyangx.qiu@intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Qiu Peiyang | f36d1be293 |
tracing: Fix setting of start_index in find_next()
When we do cat /sys/kernel/debug/tracing/printk_formats, we hit kernel
panic at t_show.
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 0 PID: 2957 Comm: sh Tainted: G W O 3.14.55-x86_64-01062-gd4acdc7 #2
RIP: 0010:[<ffffffff811375b2>]
[<ffffffff811375b2>] t_show+0x22/0xe0
RSP: 0000:ffff88002b4ebe80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
RDX: 0000000000000004 RSI: ffffffff81fd26a6 RDI: ffff880032f9f7b1
RBP: ffff88002b4ebe98 R08: 0000000000001000 R09: 000000000000ffec
R10: 0000000000000000 R11: 000000000000000f R12: ffff880004d9b6c0
R13: 7365725f6d706400 R14: ffff880004d9b6c0 R15: ffffffff82020570
FS: 0000000000000000(0000) GS:ffff88003aa00000(0063) knlGS:00000000f776bc40
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 00000000f6c02ff0 CR3: 000000002c2b3000 CR4: 00000000001007f0
Call Trace:
[<ffffffff811dc076>] seq_read+0x2f6/0x3e0
[<ffffffff811b749b>] vfs_read+0x9b/0x160
[<ffffffff811b7f69>] SyS_read+0x49/0xb0
[<ffffffff81a3a4b9>] ia32_do_call+0x13/0x13
---[ end trace 5bd9eb630614861e ]---
Kernel panic - not syncing: Fatal exception
When the first time find_next calls find_next_mod_format, it should
iterate the trace_bprintk_fmt_list to find the first print format of
the module. However in current code, start_index is smaller than *pos
at first, and code will not iterate the list. Latter container_of will
get the wrong address with former v, which will cause mod_fmt be a
meaningless object and so is the returned mod_fmt->fmt.
This patch will fix it by correcting the start_index. After fixed,
when the first time calls find_next_mod_format, start_index will be
equal to *pos, and code will iterate the trace_bprintk_fmt_list to
get the right module printk format, so is the returned mod_fmt->fmt.
Link: http://lkml.kernel.org/r/5684B900.9000309@intel.com
Cc: stable@vger.kernel.org # 3.12+
Fixes:
|
|
Al Viro | 70f6cbb6f9 |
kernel/*: switch to memdup_user_nul()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
|
Al Viro | 16e5c1fc36 |
convert a bunch of open-coded instances of memdup_user_nul()
A _lot_ of ->write() instances were open-coding it; some are converted to memdup_user_nul(), a lot more remain... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> |
|
Chuyu Hu | 05a724bd44 |
tracing: Fix comment to use tracing_on over tracing_enable
The file tracing_enable is obsolete and does not exist anymore. Replace the comment that references it with the proper tracing_on file. Link: http://lkml.kernel.org/r/1450787141-45544-1-git-send-email-chuhu@redhat.com Signed-off-by: Chuyu Hu <chuhu@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 97e9b4fca5 |
ftrace: Clean up ftrace_module_init() code
The start and end variables were only used when ftrace_module_init() was split up into multiple functions. No need to keep them around after the merger. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Abel Vesa | b6b71f66a1 |
ftrace: Join functions ftrace_module_init() and ftrace_init_module()
Simple cleanup. No need for two functions here. The whole work can simply be done inside 'ftrace_module_init'. Link: http://lkml.kernel.org/r/1449067197-5718-1-git-send-email-abelvesa@linux.com Signed-off-by: Abel Vesa <abelvesa@linux.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Julia Lawall | 27dff4e041 |
bpf: Constify bpf_verifier_ops structure
This bpf_verifier_ops structure is never modified, like the other bpf_verifier_ops structures, so declare it as const. Done with the help of Coccinelle. Link: http://lkml.kernel.org/r/1449855359-13724-1-git-send-email-Julia.Lawall@lip6.fr Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | c68c0fa293 |
ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
Jiri Olsa noted that the change to replace the control_ops did not update the trampoline for when running perf on a single CPU and with CONFIG_PREEMPT disabled (where dynamic ops, like perf, can use trampolines directly). The result was that perf function could be called when RCU is not watching as well as not handle the ftrace_local_disable(). Modify the ftrace_ops_get_func() to also check the RCU and PER_CPU ops flags and use the recursive function if they are set. The recursive function is modified to check those flags and execute the appropriate checks if they are set. Link: http://lkml.kernel.org/r/20151201134213.GA14155@krava.brq.redhat.com Reported-by: Jiri Olsa <jolsa@redhat.com> Patch-fixed-up-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | ba27f2bc73 |
ftrace: Remove use of control list and ops
Currently perf has its own list function within the ftrace infrastructure that seems to be used only to allow for it to have per-cpu disabling as well as a check to make sure that it's not called while RCU is not watching. It uses something called the "control_ops" which is used to iterate over ops under it with the control_list_func(). The problem is that this control_ops and control_list_func unnecessarily complicates the code. By replacing FTRACE_OPS_FL_CONTROL with two new flags (FTRACE_OPS_FL_RCU and FTRACE_OPS_FL_PER_CPU) we can remove all the code that is special with the control ops and add the needed checks within the generic ftrace_list_func(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 030f4e1cb8 |
ftrace: Fix output of enabled_functions for showing tramp
When showing all tramps registered to a ftrace record in the file
enabled_functions, it exits the loop with ops == NULL. But then it is
suppose to show the function on the ops->trampoline and
add_trampoline_func() is called with the given ops. But because ops is now
NULL (to exit the loop), it always shows the static trampoline instead of
the one that is really registered to the record.
The call to add_trampoline_func() that shows the trampoline for the given
ops needs to be called at every iteration.
Fixes:
|
|
Li Bin | b8ec330a63 |
ftrace: Fix a typo in comment
s/ARCH_SUPPORT_FTARCE_OPS/ARCH_SUPPORTS_FTRACE_OPS/ Link: http://lkml.kernel.org/r/1448879016-8659-1-git-send-email-huawei.libin@huawei.com Signed-off-by: Li Bin <huawei.libin@huawei.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Linus Torvalds | 51825c8a86 |
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar: "This tree includes four core perf fixes for misc bugs, three fixes to x86 PMU drivers, and two updates to old email addresses" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Do not send exit event twice perf/x86/intel: Fix INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_NA macro perf/x86/intel: Make L1D_PEND_MISS.FB_FULL not constrained on Haswell perf: Fix PERF_EVENT_IOC_PERIOD deadlock treewide: Remove old email address perf/x86: Fix LBR call stack save/restore perf: Update email address in MAINTAINERS perf/core: Robustify the perf_cgroup_from_task() RCU checks perf/core: Fix RCU problem with cgroup context switching code |
|
Steven Rostedt (Red Hat) | 0f72e37e42 |
tracing: Add sched_wakeup_new and sched_waking tracepoints for pid filter
The set_event_pid filter relies on attaching to the sched_switch and sched_wakeup tracepoints to see if it should filter the tracing on schedule tracepoints. By adding the callbacks to sched_wakeup, pids in the set_event_pid file will trace the wakeups of those tasks with those pids. But sched_wakeup_new and sched_waking were missed. These two should also be traced. Luckily, these tracepoints share the same class as sched_wakeup which means they can use the same pre and post callbacks as sched_wakeup does. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 39daa7b9e8 |
ftrace: Show all tramps registered to a record on ftrace_bug()
When an anomaly is detected in the function call modification code, ftrace_bug() is called to disable function tracing as well as give any information that may help debug the problem. Currently, only the first found trampoline that is attached to the failed record is reported. Instead, show all trampolines that are hooked to it. Also, not only show the ops pointer but also report the function it calls. While at it, add this info to the enabled_functions debug file too. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | b05086c77a |
ftrace: Add variable ftrace_expected for archs to show expected code
When an anomaly is found while modifying function code, ftrace_bug() is called which disables the function tracing infrastructure and reports information about what failed. If the code that is to be replaced does not match what is expected, then actual code is shown. Currently there is no arch generic way to show what was expected. Add a new variable pointer calld ftrace_expected that the arch code can set to point to what it expected so that ftrace_bug() can report the actual text as well as the text that was expected to be there. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 02a392a043 |
ftrace: Add new type to distinguish what kind of ftrace_bug()
The ftrace function hook utility has several internal checks to make sure that whatever it modifies is exactly what it expects to be modifying. This is essential as modifying running code can be extremely dangerous to the system. When an anomaly is detected, ftrace_bug() is called which sends a splat to the console and disables function tracing. There's some extra information that is printed to help diagnose the issue. One thing that is missing though is output of what ftrace was doing at the time of the crash. Was it updating a call site or perhaps converting a call site to a nop? A new global enum variable is created to state what ftrace was doing at the time of the anomaly, and this is reported in ftrace_bug(). Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Tom Zanussi | 4e4a4d7570 |
tracing: Update cond flag when enabling or disabling a trigger
When a trigger is enabled, the cond flag should be set beforehand, otherwise a trigger that's expecting to process a trace record (e.g. one with post_trigger set) could be invoked without one. Likewise a trigger's cond flag should be reset after it's disabled, not before. Link: http://lkml.kernel.org/r/a420b52a67b1c2d3cab017914362d153255acb99.1448303214.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 4239c38fe0 |
ring-buffer: Process commits whenever moving to a new page.
When crossing over to a new page, commit the current work. This will allow readers to get data with less latency, and also simplifies the work to get timestamps working for interrupted events. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |
|
Steven Rostedt (Red Hat) | 70004986ff |
ring-buffer: Remove redundant update of page timestamp
The first commit of a buffer page updates the timestamp of that page. No need to have the update to the next page add the timestamp too. It will only be replaced by the first commit on that page anyway. Only update to a page if it contains an event. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> |