Revert "Let crash_dump read /proc/$PID."
Revert submission 1556807-tombstone_proto
Reason for revert: b/178455196, Broken test: android.seccomp.cts.SeccompHostJUnit4DeviceTest#testAppZygoteSyscalls on git_master on cf_x86_64_phone-userdebug
Reverted Changes:
Ide6811297:tombstoned: switch from goto to RAII.
I8d285c4b4:tombstoned: make it easier to add more types of ou...
Id0f0fa285:tombstoned: support for protobuf fds.
I6be6082ab:Let crash_dump read /proc/$PID.
Id812ca390:Make protobuf vendor_ramdisk_available.
Ieeece6e6d:libdebuggerd: add protobuf implementation.
Change-Id: Ia0a1ee57e7630e01c495dc166218f665340aad7f
* changes:
libdebuggerd: add protobuf implementation.
tombstoned: support for protobuf fds.
tombstoned: make it easier to add more types of outputs.
tombstoned: switch from goto to RAII.
Currently, all MTE failures end up displaying 'Fault address falls at
0x<addr> after any mapped regions'. Clearly when scanning, we should use
the untagged address to figure out which ranges it's in.
I've taken the liberty of removing all si_addr parsing and moving it
into the common ProcessInfo, as well as making it really explicit
whether you want the (possibly tagged) original si_addr, or whether you
want the untagged variant (for scanning /proc/maps or whatever).
This is not particularly easily testable, as ReadCrashInfo isn't easily
injectable and `dump_all_maps` should already be passed the untagged
pointer to scan for. I've tested this locally on FVP under SYNC MTE with
a simple UaF binary and noted the problem is fixed. Given that this is
making the code more clear, I'm hoping the owners see no need for a
regression test :).
Bug: 135772972
Test: On FVP, run 'adb shell MEMTAG_OPTIONS=sync sanitizer-status' and
check that the use-after-free test ends up with the /proc/maps
desription in the right place.
Change-Id: I220e4200c75a72474a95a67e5bbc36173a438dd2
This commit implements protobuf output for tombstones, along with a
translator that should emit bytewise identical output to the existing
tombstone dumping code, except for ancillary data from GWP-ASan and
Scudo, which haven't been implemented yet.
Test: setprop debug.debuggerd.translate.translate_proto_to_text 1 &&
/data/nativetest64/debuggerd_test/debuggerd_test
Test: for TOMBSTONE in /data/tombstones/tombstone_??; do
pbtombstone $TOMBSTONE.pb | diff $TOMBSTONE -
done
Change-Id: Ieeece6e6d1c26eb608b00ec24e2e725e161c8c92
This simplifies some of the logic and removes the need to pass an
Arch value to functions that should already know about the arch
it is operating on.
Includes fixes for debuggerd/libbacktrace.
Added new unit tests to cover new cases.
Test: All unit tests pass.
Test: Faked unwinder failing to verify debuggerd error messages display
Test: properly in backtrace and tombstone.
Change-Id: I439fcae0695befcfb1cb4c0a786cc74949d33425
This value indicates whether memory tagging is enabled on a thread,
the mode (sync or async) and the set of excluded tags. This information
can sometimes be important for understanding an MTE related crash,
so include it in the per-thread tombstone output.
Bug: 135772972
Change-Id: I25a16e10ac7fbb2b1ab2a961a5279f787039000b
Until 77fdb22cf6, logd started as
AID_ROOT and then dropped its privileges. Since then, there's been no
reason to use string comparisons rather than checking the uid.
Test: pkill -SEGV logd
Test: treehugger
Change-Id: Ia709f8f59cb0ab9abac7df84c96c701b5d0a83ea
An OEM asks for sub-second granularity, and that's most easily done if
we only have one timestamp generator. I'm not convinced sub-second
granularity is particularly useful myself, and I definitely don't think
that nanosecond resolution is meaningful but I do like this cleanup, and
if I'm going to use sub-second precision I may as well use the maximum
precision available to me.
Also reduce some duplication of code reading cmdline/comm.
Bug: https://issuetracker.google.com/161860597
Test: head /data/tombstones/*
Change-Id: I035ecfd4a3338ccd84dae0ef973a998a7c7c5056
Teach debuggerd to use the new scudo APIs proposed in
https://reviews.llvm.org/D77283 for extracing MTE error reports from crashed
processes, and include those reports in tombstones if possible.
Bug: 135772972
Change-Id: I082dfd0ac9d781cfed2b8c34cc73562614bb0dbb
log/log.h primarily concerns itself with writing logs. The few users
who read logs should directly include log/log_read.h.
Bug: 78370064
Test: build
Change-Id: Ie95c55ea2ffc76fc95768323d445ada6ad4f2520
On aarch64, the top 8 bits of the address (i.e. the tag bits) of
the fault address in si_addr are always clear. This isn't ideal for
MTE which will require these bits in order to correctly diagnose
tag mismatches.
A proposed kernel patch [1] exposes the full fault address including
the tag bits as part of the ucontext. Change debuggerd to read this
fault address if available.
[1] https://patchwork.kernel.org/patch/11435077/
Bug: 135772972
Change-Id: Ia05be574113860f4e9ecc36a310c4b740e0c4afb
A future change will introduce a version lock between linker and
crash_dump. Move crash_dump into the runtime APEX alongside linker in order to
ensure that they will be the same version even if the runtime APEX is updated.
Bug: 135772972
Change-Id: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
Merged-In: Ic2eae31b6927eb0e8a62315ac141f50933c00bcc
We're now passing around a couple of addresses for GWP-ASan in addition
to abort_msg_address and fdsan_table_address, and I'm going to need to add
more of them for MTE. Move them into a data structure in order to simplify
various function signatures.
Bug: 135772972
Change-Id: Ie01e1bd93a9ab64f21865f56574696825a6a125f
GWP-ASan can provide information about a crash that it caused. Grab the
GWP-ASan regions from the globals shared by the linker for crash-handler
purpopses, pull the information from GWP-ASan, and display it.
This adds two regions:
1. Causality tracking by GWP-ASan. We now print a cause header about
the crash, like `Cause: [GWP-ASan]: Use After Free on a 1-byte
allocation at 0x7365bb3ff8`
2. Allocation and deallocation stack traces.
Bug: 135634846
Test: atest debuggerd_test
Change-Id: Id28d5400c9a9a053fcde83a4788f971e677d4643
This takes a lot of space, isn't convincingly useful, and makes it
likely that the far more valuable stuff that comes after it gets
truncated. So let's just drop it.
Bug: http://b/139860930
Test: manual crasher, presubmit
Change-Id: Ie417ffc07e3cb17e95fdb3d183f8c87de0f34b89
A thread's PSTATE can sometimes be critical for understanding a crash,
especially with MTE and other new features that store per-thread state
in PSTATE.
Bug: 135772972
Change-Id: I1bee25bffe7eea395f04b6449dc9227298cf866e
logger_entry and logger_entry_v2 were used for the kernel logger,
which we have long since deprecated. logger_entry_v3 is the same as
logger_entry_v4 without a uid field, so it is trivially removable,
especially since we're now always providing uids in log messages.
liblog and logd already get updated in sync with each other, so we
have no reason for backwards compatibility with their format.
Test: build, unit tests
Change-Id: I27c90609f28c8d826e5614fdb3fe59bde22b5042
Test: Ran new unit tests.
Test: Ran crasher stack-overflow, crasher64 stack-overflow and verified
Test: stack overflow cause is shown.
Test: Ran stack overflow app and verified tombstone includes stack-overflow
Test: message.
Change-Id: I9bb01186dff5ed81c77d84b6aaedb5332ddd7256
This is for Android Telemetry to be able to categorise the processes
that produce tombstones.
Test: atest debugerd_test:TombstoneTest
Change-Id: Ie635347c9839eb58bfd27739050bd68cbdbf98da
Modify the unwinder library to indicate that at least one of the stack
frames contains an elf file that is unreadable.
Modify debuggerd to display a note about the unreadable frame and a possible
way to fix it.
Bug: 129769339
Test: New unit tests pass.
Test: Ran an app that crashes and has an unreadable file and verified the
Test: message is displayed. Then setenforce 0 and verify the message is
Test: not displayed.
Change-Id: Ibc4fe1d117e9b5840290454e90914ddc698d3cc2
Somehow the code was still including this include from libbacktrace.
I think the libbacktrace include directory was coming from some
transitive includes. I verified that nothing in debuggerd is using
the libbacktace.so shared library.
Bug: 120606663
Test: Builds, unit tests pass.
Change-Id: I85c2837c5a539ccefc5a7140949988058d21697a
Update the entries only when the list is modified by the runtime.
Check that the list wasn't concurrently modified when being read.
Bug: 124287208
Test: libunwindstack_test
Test: art/test.py -b --host -r -t 137-cfi
Change-Id: I87ba70322053a01b3d5be1fdf6310e1dc21bb084
Update debuggerd to print BuildId information by default.
Bug: 120975492
Test: New unit tests pass.
Test: debuggerd -b <PID> shows build id information.
Test: tombstones include build id information.
Change-Id: I019b031113d0b77385516223c63455b868924440
Currently, moving or copying a Maps object leads to double free of MapInfo.
Even moving a Maps object did not prevent this, as after a move
the object only has to be in an "unspecified but valid state", which can
be the original state for a vector of raw pointers (but not for a vector
of unique_ptrs).
Changing to unique_ptrs is the most failsafe way to make sure we never
accidentally destruct MapInfo.
Test: atest libuwindstack_test
Failed LocalUnwinderTest#unwind_after_dlopen which also fails at master.
Change-Id: Id1c9739b334da5c1ba532fd55366e115940a66d3
Small modifications to the dump_stack method and added unit tests to
verify the output.
Bug: 120606663
Test: Unit tests pass, debuggerd run on processes on target.
Change-Id: Id385a915b751abda3dd6baebed6c3ce498c3bf6e
Make XOM related crashes a little less mysterious by adding an abort
cause explaining the crash.
Bug: 77958880
Test: Abort cause in tombstone for a XOM-related crash.
Change-Id: I7af1bc251d9823bc755ad98d8b3b87c12bbaecba
Include the illegal instruction in the header if we get a
SIGILL. Otherwise (since these tend to be one-off bit flips), we don't
usually have any information to try to confirm our suspicion that any
given instance is actually a one-off bit flip.
Also add `SIGILL` as a crasher option to easily generate such crashes.
Before:
signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0xab1456da
After:
signal 4 (SIGILL), code 1 (ILL_ILLOPC), fault addr 0xab1456da (*pc=0xe7f0def0)
Bug: http://b/77274448
Test: ran crasher
Change-Id: I5f8dedca5eea2b117b1b1e48430214b38e1366ed
Suicide doesn't change:
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
But homicide now looks like this (this is `sleep 666` killed by
`kill -SEGV` as root:
signal 11 (SIGSEGV), code 0 (SI_USER from pid 4446, uid 0), fault addr --------
Bug: http://b/78594105
Test: manual
Change-Id: I8c2feafba8cc5a3db85e8250004d428a464c5d9e
Let the logging implementation be the imposer of limits.
Bug: http://b/64759619
Test: debuggerd_test
Change-Id: I8bc73bf2301ce071668993b740880224846a4e75
The stack dump was not printing leading zeros for data after the
change to remove uintptr_t types from the libbacktrace API.
Bug: 65682279
Test: Created an arm tombstone and an arm64 tombstone and verified
Test: that the stack data has leading zeros.
Change-Id: I1fbec2c4fa7c8b0fab18894c5628d18c5a580299
In order to support the offline unwinding properly, get rid of the
usage of non-fixed type uintptr_t from all API calls.
In addition, completely remove the old local and remote unwinding code
that used libunwind.
The next step will be to move the offline unwinding to the new unwinder.
Bug: 65682279
Test: Ran unit tests for libbacktrace/debuggerd.
Test: Ran debuggerd -b on a few arm and arm64 processes.
Test: Ran crasher and crasher64 and verified tombstones look correct.
Change-Id: Ib0c6cee3ad6785a102b74908a3d8e5e93e5c6b33
The abort message was accidentally relocated to be printed below the
registers, backtrace, and stack, which isn't very helpful. Move it back
to its rightful place.
Test: treehugger
Change-Id: I8aa5b63e58081f27ccdb42481fed8d9eb3a892a4
Reduce the amount of time that a process remains paused by pausing its
threads, fetching their registers, and then performing unwinding on a
copy of its address space. This also works around a kernel change
that's in 4.9 that prevents ptrace from reading memory of processes
that we don't have immediate permissions to ptrace (even if we
previously ptraced them).
Bug: http://b/62112103
Bug: http://b/63989615
Test: treehugger
Change-Id: I7b9cc5dd8f54a354bc61f1bda0d2b7a8a55733c4
Add a static GetLoadBias method to the Elf object that only reads just
enough to get the load bias.
Add a method to MapInfo that gets the load bias. First attempt to get
it if the elf object already exists. If no elf object was created, use
the new static method to get the load bias.
In BacktraceMap, add a custom iterator so that when code dereferences
a map element, that's when the load bias will be retrieved if it hasn't
already been set.
Bug: 69871050
Test: New unit tests, verify tombstones have non-zero load bias values for
Test: libraries with a non-zero load bias.
Change-Id: I125f4abc827589957fce2f0df24b0f25d037d732
Nobody is looking at the mismatches, and it can cause problems
with tombstone parsers.
Also, fix the dump_header_info test and remove unused properties_fake.cpp.
Test: Ran unit tests, verified tombstones still work.
Change-Id: I4261646016b4e84b26a5aee72f3227f1ce48ec9a