Instead of creating tombstone FDs in place and passing them out to
crash_dump directly, create them as O_TMPFILEs and link them into place
when crash_dump reports success, to avoid creating empty tombstones
in cases like an aborting thread racing with another thread that
manages to cleanly exit_group before the dump finishes.
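A minimal sketch of the O_TMPFILE-then-link pattern described above, assuming a Linux filesystem that supports O_TMPFILE and a mounted /proc. The helper names OpenUnnamed and LinkIntoPlace are hypothetical and are not taken from this change:

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cstdio>
#include <string>

// Hypothetical helper: open an unnamed file inside `dir`. Nothing shows up in
// the directory until LinkIntoPlace() succeeds. Returns -1 on failure, for
// example when the filesystem does not support O_TMPFILE.
int OpenUnnamed(const std::string& dir) {
  return open(dir.c_str(), O_TMPFILE | O_RDWR | O_CLOEXEC, 0640);
}

// Hypothetical helper: give the unnamed file a name once the dump is known to
// be good. If this is never called, the data vanishes when `fd` is closed.
bool LinkIntoPlace(int fd, const std::string& final_path) {
  char fd_path[64];
  snprintf(fd_path, sizeof(fd_path), "/proc/self/fd/%d", fd);
  // linkat() fails with EEXIST if the name is taken, so clear any stale file
  // first (an assumed policy for this sketch).
  unlink(final_path.c_str());
  return linkat(AT_FDCWD, fd_path, AT_FDCWD, final_path.c_str(),
                AT_SYMLINK_FOLLOW) == 0;
}
```

A caller would hand the descriptor from OpenUnnamed to the dumper and call LinkIntoPlace only after success is reported, so an interrupted dump leaves no empty file behind.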
Bug: http://b/77729983
Test: debuggerd_test
Test: adb shell 'for x in `seq 0 50`; do crasher; done'
Change-Id: I31ce4fd4a524abf8bde57152450209483d9d0ba9
Host services are attempted after handle_host_request, which means that
failing to find a transport to give to handle_forward_request shouldn't
send an error over to the other end.
Bug: http://b/78294734
Test: `adb track-devices` with multiple devices connected
Change-Id: I46c89cc1894b51d48fea7d4e629b1d57f73e3fd6
(cherry picked from commit 78f133d7d4)
This will prevent services from reaching out to logd if this tag is
present in the event log.
Bug: 64734187
Test: tree-hugger
Merged-In: If117e1c0cfa678af4190913f0ca87f4e92c54373
Change-Id: If117e1c0cfa678af4190913f0ca87f4e92c54373
(cherry picked from commit dcc4b2bb4a)
Also include relevant new metric_logger.proto values.
Test: m
Test: Exercised by ag/3890335 in art
Bug: 77517571
Change-Id: Ia527f2b94c7a6147ad9d537376266e5ffc597b04
memunreachable_binder_test is pulled in by
test/vts/tools/build/tasks/list/vts_test_bin_package_list.mk, so it
doesn't need to be listed in test_suites.
Fixes warnings:
build/make/core/base_rules.mk:620: warning: overriding commands for target `out/host/linux-x86/vts/android-vts/testcases/memunreachable_binder_test'
build/make/core/base_rules.mk:620: warning: ignoring old commands for target `out/host/linux-x86/vts/android-vts/testcases/memunreachable_binder_test'
Bug: 78229249
Test: vts-tradefed run commandAndExit vts -m VtsKernelBinderTest
Change-Id: Ifd282b2f5bb652295fa34ad247919eb85ea7abc8
Merged-In: Ifd282b2f5bb652295fa34ad247919eb85ea7abc8
(cherry picked from commit f013b62152)
Report kernel_panic,sysrq,livelock,<state> reboot reason via last
dmesg (pstore console). Add ro.llk.killtest property, which will
allow reliable ABA platforms to drop kill test and go directly
to kernel panic. This should also allow some manual unit testing
of the canonical boot reason report.
New canonical boot reasons from llkd are:
- kernel_panic,sysrq,livelock,alarm llkd itself locked up (Hail Mary)
- kernel_panic,sysrq,livelock,driver uninterruptible D state
- kernel_panic,sysrq,livelock,zombie uninterruptible Z state
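As an illustration of the reason format listed above, a small sketch that extracts the <state> suffix from a boot reason string. The function name LivelockState is hypothetical, and treating any non-empty suffix as a state is an assumption of this sketch:

```cpp
#include <string>

// Hypothetical helper: return the <state> part of a
// "kernel_panic,sysrq,livelock,<state>" boot reason, or an empty string if
// the reason does not start with that prefix.
std::string LivelockState(const std::string& boot_reason) {
  static const std::string kPrefix = "kernel_panic,sysrq,livelock,";
  if (boot_reason.compare(0, kPrefix.size(), kPrefix) != 0) return "";
  return boot_reason.substr(kPrefix.size());
}
```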
Manual test assumptions:
- llkd is built by the platform and landed on system partition
- unit test is built and landed in /data/nativetest (could
land in /data/nativetest64, adjust test correspondingly)
- llkd not enabled, ro.llk.enable and ro.llk.killtest
are not set by platform allowing test to adjust all the
configuration properties and start llkd.
- or, llkd is enabled, ro.llk.enable is true, and killtest is
disabled, ro.llk.killtest is false, set up by the platform.
This breaks the go/apct generic operations of the unit test
for llk.zombie and llk.driver, because a kernel panic results
and manual intervention is otherwise required. If the test
moves to go/apct, we will be forced to bypass these tests under
this condition (but allow them to run if ro.llk.killtest
is "off" so the specific testing above/below can be run).
for i in driver zombie; do
adb shell su root setprop ro.llk.killtest off
adb shell /data/nativetest/llkd_unit_test/llkd_unit_test --gtest_filter=llkd.${i}
adb wait-for-device
adb shell su root setprop ro.llk.killtest off
sleep 60
adb shell getprop sys.boot.reason
adb shell /data/nativetest/llkd_unit_test/llkd_unit_test --gtest_filter=llkd.${i}
done
Test: llkd_unit_test (see test assumptions)
Bug: 33808187
Bug: 72838192
Change-Id: I2b24875376ddfdbc282ba3da5c5b3567de85dbc0
If LLK_ENABLE_DEFAULT is false, check whether "ro.llk.enable" is "eng",
which is also the value assumed when the property is not set, and if so
check whether this is a userdebug build to establish a default of true
for enable. The same applies to ro.khungtask.enable.
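The default resolution described above can be sketched roughly as follows. ResolveEnable and its parameters are hypothetical stand-ins, and honoring explicit "true"/"false" values is an assumption of this sketch, not something this message states:

```cpp
#include <string>

// Hypothetical sketch. `compiled_default` stands in for LLK_ENABLE_DEFAULT,
// `prop` for the raw ro.llk.enable value ("" when unset), and `is_userdebug`
// for the build-type check.
bool ResolveEnable(bool compiled_default, const std::string& prop,
                   bool is_userdebug) {
  // Assumption: an explicit boolean setting always wins.
  if (prop == "true") return true;
  if (prop == "false") return false;
  if (compiled_default) return true;
  // An unset property behaves like "eng".
  const std::string value = prop.empty() ? "eng" : prop;
  // "eng" enables the daemon only when the build check passes.
  return value == "eng" && is_userdebug;
}
```

The same shape would apply to ro.khungtask.enable.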
Test: llkd_unit_test report eng status on "userdebug" or "user" builds
Bug: 33808187
Bug: 72838192
Change-Id: I2adb23c7629dccaa2856c50bccbf4e363703c82c
Introduce a standalone live-lock daemon (llkd) to catch kernel
or native user space deadlocks and take mitigating actions. It will
also configure [khungtaskd] to fortify those actions.
If a thread is in D or Z state with no forward progress for longer
than ro.llk.timeout_ms, or ro.llk.[D|Z].timeout_ms, kill the process
or parent process respectively. If another scan shows that the same
process continues to exist, the live-lock condition is confirmed and
llkd needs to panic. Panic the kernel in a manner that provides the
greatest bugreporting detail about the condition. Add an alarm self
watchdog, set to double the expected time to flow through the
mainloop, in case llkd itself ever gets locked up. Sampling is every
ro.llk.check_ms.
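The escalation described above (wait, kill, then panic if the kill did not help) can be sketched as a small per-thread state machine. Scanner, ThreadRecord and Action are hypothetical names, and the caller is assumed to supply the thread state and a forward-progress flag on each sampling pass:

```cpp
#include <chrono>
#include <map>

using std::chrono::milliseconds;

enum class Action { kNone, kKillProcess, kKillParent, kPanic };

struct ThreadRecord {
  char state = 'R';       // state seen on the previous scan
  milliseconds stuck{0};  // time in D or Z without forward progress
  bool killed = false;    // a kill was already requested
};

class Scanner {
 public:
  explicit Scanner(milliseconds timeout) : timeout_(timeout) {}

  // Feed one observation of thread `tid`, taken `elapsed` after the last one.
  Action Observe(int tid, char state, bool progressed, milliseconds elapsed) {
    ThreadRecord& rec = records_[tid];
    const bool blocked = (state == 'D' || state == 'Z');
    if (!blocked || progressed || state != rec.state) {
      rec = ThreadRecord{};  // healthy or newly blocked: restart the clock
      rec.state = state;
      return Action::kNone;
    }
    // Still present and stuck after a kill: confirmed live-lock.
    if (rec.killed) return Action::kPanic;
    rec.stuck += elapsed;
    if (rec.stuck <= timeout_) return Action::kNone;
    rec.killed = true;
    // D: kill the process itself. Z: kill its parent.
    return state == 'D' ? Action::kKillProcess : Action::kKillParent;
  }

 private:
  milliseconds timeout_;
  std::map<int, ThreadRecord> records_;
};
```

A single timeout is used here for brevity; separate D and Z timeouts would only change which limit is compared against.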
By default llkd will not monitor init, or [kthreadd] and all that
[kthreadd] spawns. This reduces the effectiveness of llkd by limiting
its coverage. If in the future there is value in covering threads
spawned by [kthreadd], the requirement will be to code drivers so that
they do not remain in a persistent 'D' state, or so that they have
mechanisms to recover the thread should it be killed externally. The
blacklists can then be adjusted accordingly once these conditions are met.
An accompanying gTest set has been added. It will set up a persistent
D or Z process, with and without forward progress, but not in a
live-lock state, because that would require a buggy kernel, or a module
or kernel modification to stimulate.
Android Properties llkd responds to (*_ms params are in milliseconds):
- ro.config.low_ram default false, if true do not sysrq t (dump
all threads).
- ro.llk.enable default false, allow live-lock daemon to be enabled.
- ro.khungtask.enable default false, allow [khungtaskd] to be enabled.
- ro.llk.mlockall default true, allow mlock'd live-lock daemon.
- ro.khungtask.timeout default 12 minutes.
- ro.llk.timeout_ms default 10 minutes, D or Z maximum timelimit,
double this value and it sets the alarm watchdog for llkd.
- ro.llk.D.timeout_ms default ro.llk.timeout_ms, D maximum timelimit.
- ro.llk.Z.timeout_ms default ro.llk.timeout_ms, Z maximum timelimit.
- ro.llk.check_ms default 2 minutes sampling interval
(ro.llk.timeout_ms / 5) for threads in D or Z state.
- ro.llk.blacklist.process default 0,1,2 (kernel, init and
[kthreadd]), and process names (/comm or /cmdline) init,[kthreadd],
lmkd,lmkd.llkd,llkd,[khungtaskd],watchdogd,[watchdogd],
[watchdogd/0] ...
- ro.llk.blacklist.parent default 0,2 (kernel and [kthreadd]) and
"[kthreadd]". A comma-separated list of process ids, /comm names
or /cmdline names.
- ro.llk.blacklist.uid default <empty>, comma separated list of
uid numbers or names from getpwuid/getpwnam.
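How the timing defaults above relate to one another, and how a comma-separated blacklist value might be split, can be sketched as follows. LlkTimeouts, DeriveTimeouts and SplitCommaList are hypothetical names. A zero argument stands for "property not set", and an explicit ro.llk.check_ms override is not modeled:

```cpp
#include <chrono>
#include <set>
#include <sstream>
#include <string>

using std::chrono::milliseconds;
using std::chrono::minutes;

struct LlkTimeouts {
  milliseconds timeout, d_timeout, z_timeout, check, watchdog;
};

// Hypothetical helper: apply the documented fallbacks between the properties.
LlkTimeouts DeriveTimeouts(milliseconds timeout = milliseconds(0),
                           milliseconds d = milliseconds(0),
                           milliseconds z = milliseconds(0)) {
  LlkTimeouts t;
  t.timeout = timeout.count() ? timeout : milliseconds(minutes(10));
  t.d_timeout = d.count() ? d : t.timeout;  // ro.llk.D.timeout_ms
  t.z_timeout = z.count() ? z : t.timeout;  // ro.llk.Z.timeout_ms
  t.check = t.timeout / 5;                  // sampling interval
  t.watchdog = t.timeout * 2;               // alarm watchdog for llkd itself
  return t;
}

// Hypothetical helper: split a blacklist property value on commas, dropping
// empty entries.
std::set<std::string> SplitCommaList(const std::string& value) {
  std::set<std::string> out;
  std::stringstream stream(value);
  std::string item;
  while (std::getline(stream, item, ',')) {
    if (!item.empty()) out.insert(item);
  }
  return out;
}
```

With no overrides, the sketch yields the 10 minute timeout, 2 minute sampling interval and 20 minute watchdog implied by the defaults in the list.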
Test: llkd_unit_test
Bug: 33808187
Bug: 72838192
Change-Id: I32e8aa78aef10834e093265d0f3ed5b4199807c6