The exit of init panics the system *after* process context (mm, stack,
...etc.) are recycled, according to Linux kernel's 'do_exit'
implementation. To preserve most init process context for debugging,
triggers the panic via proc-sysrq explicitly.
Note: after this change, there will be no "Attempt to kill init" panic
when androidboot.init_fatal_panic is set.
Test: Insert data abort fault in init, the full process context is
preserved in memory dump captured after panic.
Bug: 155940351
Change-Id: I3393bd00f99b8cb432cfa19a105b7d636b411764
Since this function is used in userspace reboot, we need to be more
diligent with error handling, e.g.:
* If init fails to read /sys/block/zram0/backing_dev, then fail and
fallback to hard reboot.
* Always call swapoff.
* Always reset zram.
* Tear down loop device only if zram is backed by a loop device.
Test: adb reboot userspace
Bug: 153917129
Change-Id: I4709da1d08cf427ad9c898cfb2506b6a29f1d680
Merged-In: I4709da1d08cf427ad9c898cfb2506b6a29f1d680
(cherry picked from commit a840d405eb)
Similarly to other recovery mechanisms, timeout is controlled by a
read-only property that can be configured per-device.
Test: adb root
Test: adb shell setprop init.userspace_reboot.started.timeoutmillis 2
Test: adb reboot userspace
Bug: 152803929
Change-Id: Id70710b46da798945ac5422ef7d69265911ea5ef
Devices in the lab are hitting an issue where they're getting stuck
likely in the sync() call in DoReboot() before we start the reboot
monitor thread and before we shut down services.
It's possible that concurrent writing to RW file systems is causing
this sync() call to take essentially forever. To protect against
this, we need to remove this sync(). Note that we will still call
sync() after shutting down services.
Note that the service shutdown code has a timeout and there is a
reboot monitor thread that will shutdown the device if more than 30
seconds pass above that timeout. This change increases that timeout
to 300 seconds to give the final sync() calls explicitly more time to
finish.
Bug: 150863651
Test: reboot functions normally
Test: put an infinite loop in DoReboot and the the reboot monitor thread
triggers and shuts down the device appropriately
Change-Id: I6fd7d3a25d3225081388e39a14c9fdab21b592ba
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible, the
only difference is the socket's buffer must be filled before init deadlocks.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
A previous version of this change was reverted and added locks around
all service operations and allowed the property thread to spawn
services directly. This was complex due to the fact that this code
was not designed to be multi-threaded. It was reverted due to
apparent issues during reboot. This change keeps a queue of processes
pending control messages, which it will then handle in the future. It
is less flexible but safer.
Bug: 146877356
Bug: 148236233
Bug: 150863651
Bug: 151251827
Test: multiple reboot tests, safely restarting hwservicemanager
Change-Id: Ice773436e85d3bf636bb0a892f3f6002bdf996b6
This is apparently causing problems with reboot.
This reverts commit 7205c62933.
Bug: 150863651
Test: build
Change-Id: Ib8a4835cdc8358a54c7acdebc5c95038963a0419
If init is wedged, then the write will never succeed and reboot won't
happen.
Also, in case of normal reboot, move call to PersistRebootReason to the
top of DoReboot() function, to make sure we persist it even if /data is
not mounted.
Test: builds
Test: adb shell svc power reboot userspace
Test: atest CtsUserspaceRebootHostSideTestCases
Bug: 148767783
Change-Id: I4ae40e1f6fdc41cc0bcae57020fa3d3385dda1b4
A previous change moved property_service into its own thread, since
there was otherwise a deadlock whenever a process called by init would
try to set a property. This new thread, however, would send a message
via a blocking socket to init for each property that it received,
since init may need to take action depending on which property it is.
Unfortunately, this means that the deadlock is still possible, the
only difference is the socket's buffer must be filled before init deadlocks.
There are possible partial solutions here: the socket's buffer may be
increased or property_service may only send messages for the
properties that init will take action on, however all of these
solutions still lead to eventual deadlock. The only complete solution
is to handle these messages asynchronously.
This change, therefore, adds the following:
1) A lock for instructing init to reboot
2) A lock for waiting on properties
3) A lock for queueing new properties
4) A lock for any actions with ServiceList or any Services, enforced
through thread annotations, particularly since this code was not
designed with the intention of being multi-threaded.
Bug: 146877356
Bug: 148236233
Test: boot
Test: kill hwservicemanager without deadlock
Change-Id: I84108e54217866205a48c45e8b59355012c32ea8
Helps with support of recovery and rollback boot reason history, by
also using /metadata/bootstat/persist.sys.boot.reason to file the
reboot reason.
Test: manual
Bug: 129007837
Change-Id: Id1d21c404067414847bef14a0c43f70cafe1a3e2
Instead they will be logged from system_server. This CL just prepares
grounds for logging CL to land.
Test: adb reboot userspace
Bug: 148767783
Change-Id: Ie9482ef735344ecfb0de8a37785d314a3c0417ff
Also reset some more properties to make bootanimation work properly.
Test: adb reboot userspace
Bug: 148172262
Change-Id: I0154d4fe9377c019150f5b1a709c406925db584d
When BCB (bootloader message structure inside of misc partition) is
malformed (contains some non-printable characters in its fields),
"reboot recovery" command won't be able to write required string to
"command" field. It can happen for example when partition table was
created anew and 'misc' partition area contains some garbage. Also this
behavior can be emulated with this command:
$ fastboot erase misc
which leads to 'misc' partition to be filled with 0xFF characters. Hence
this code:
if (boot.command[0] == '\0') {
won't let us to set new string to "command" field. Let's check if
"command" field is malformed and fix it, before actually checking for
previously set content.
"fastboot erase" shouldn't be used for testing purposes though, as it
doesn't work sometimes due to alignment, on bootloader side:
Erasing blocks 6144 to 6144 due to alignment
........ erased 0 bytes from 'misc'
Instead one might use "dd" command to fill 'misc' with 0xFF's:
$ dd if=/dev/zero ibs=2k count=1 | tr "\000" "\377" >misc.img
$ fastboot flash misc misc.img
Test: Fill 'misc' partition with 0xFF's, then do "adb reboot recovery"
Change-Id: Ica8ca31012b9b2249645e7305830c07a20dd013c
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Since I was there, added two more properties to reset, and switched
ordering of sys.init.updatable_crashing and
sys.init.updatable_crashing_process_name setprops to make sure that
process name is already set when apexd/PackageWatchdog get's notified
about sys.init.updatable_crashing.
Also fixed a typo in what HandleUserspaceReboot function.
Test: adb reboot userspace
Bug: 135984674
Change-Id: I954ec49aae0734cda1bd833ad68f386ecd808f73
Such services will be re-parsed and added back to the service list
during post-fs-data stage.
Test: adb reboot userspace
Test: atest CtsInitTestCases
Bug: 145669993
Bug: 135984674
Change-Id: Ibb393dfe0f101c4ebe37bc763733fd5d981d3691
Init is no longer a special case and talks to property service just
like every other client, therefore move it away from property_set()
and to android::base::SetProperty().
In doing so, this change moves the initial property set up from the
kernel command line and property files directly into PropertyInit().
This makes the responsibilities between init and property services
more clear.
Test: boot, unit test cases
Change-Id: I36b8c83e845d887f1b203355c2391ec123c3d05f
sys.init.userspace_reboot.in_progress will be used to notify all
the processes (including vendor ones) that userspace reboot is
happening, hence it should be treated as stable public api.
All other sys.init.userspace_reboot.* props will be internal to /system
partition and don't require any stability guarantees.
Test: builds
Test: adb reboot userspace
Bug: 135984674
Change-Id: Ifb64a6bfae2de76bac67edea68df44e33c9cfe2d
Watchdog is just a forked process that is going to fall back to the
full reboot in case device wasn't able to boot in given amount of time.
Currently this amount is hard-coded to 1 minute, but in the future it
will be controlled by a read-only property.
Also added sync calls before and after tearing down services.
Test: adb reboot userspace
Bug: 135984674
Change-Id: Ie6053c9446a6761deae6dc104036bb35b09ef0e2
There will be useful in debugging/logging events to statsd.
Also as part of this CL, sys.init.userspace_reboot.in_progress property
is now used as a mean of synchronization. It is set directly in
DoUserspaceReboot, to make sure that all the setprop actions triggered
by userspace-reboot-requested were processed.
Test: adb reboot userspace
Test: adb shell getprop sys.init.userspace_reboot.last_started
Test: adb shell getprop sys.init.userspace_reboot.last_finished
Bug: 135984674
Change-Id: I9debcd4f058e790855200d5295344dafb30e496a
Previously, we assumed that TriggerShutdown() should never be called
from vendor_init and used property service as a back up in case it
ever did. We have since then found out that vendor_init may indeed
call TriggerShutdown() and we want to make it just as strict as it is
in init, wherein it will immediately start the shutdown sequence
without executing any further commands.
Test: init unit tests, trigger shuttdown from init and vendor_init
Change-Id: I1f44dae801a28269eb8127879a8b7d6adff6f353
This will bring device to the state closer to the one during normal boot
Bug: 135984674
Test: adb install system/apex/shim/com.android.apex.cts.shim.v1.apex
Test: adb reboot userspace
Test: verified install succeeded
Change-Id: I6ef73bde2ca817c8a62bf19b8f1895dd0d6d2829
We are going to teamfood userspace reboot soon, and in order to gather
as much data as possible we are fine with ignoring checkpointing for the
devices with ext4 (teamfood will be a very limited set of people that
are aware what they've signed for).
As result of this, we don't need to reset vold and kill zram backing
device. Added a TODO to restore that functionality if needed.
Since I was there, fixed yet another typo in userspace-reboot-resume -_-
Bug: 135984674
Test: adb reboot userspace
Change-Id: I2b7a93aaf738fe9bec9d606d7e11aefb325550b1
Especially now that property_service is a thread, there may be some
delay between when init sets sys.powerctl and when the main thread of
init receives this and triggers shutdown. It's possible that
outstanding init commands are run during this gap and that is not
desirable.
Instead, have builtins call TriggerShutdown() directly, so we can be
sure that the next action that init runs will be to shutdown the
device.
Test: reboot works
Test: reboot into recovery due to bad /data works
Change-Id: I26fb9f4f57f46c7451b8b58187138cfedd6fd9eb
* Refactored code around stopping services a little bit to reuse it
between full reboot and userspace reboot.
* Add a scope_guard to fallback to full reboot in case userspace reboot
fails.
* In case of userspace reboot init will also wait for services to be
terminated/killed and log the ones that didn't react to
SIGTERM/SIGKILL in time.
* If some of the services didn't react to SIGKILL, fail userspace reboot.
Test: adb reboot userspace
Bug: 135984674
Change-Id: I820c7bc406169333b0f929f0eea028d8384eb2ac
This CL only draws boundaries between userspace and full reboots, and
adds some functionality that will be required for userspace reboot:
* Whenever device is shutting down is now controlled in reboot.cpp,
since during userspace reboot this state can change.
* Now it's also possible to restart handling of control messages inside
property service. In case of userspace reboot, init will restart it
after stopping post-data services.
* New userspace-reboot-requested trigger is added similar to shutdown
one for full reboot.
Test: adb reboot
Test: adb reboot userspace
Bug: 135984674
Change-Id: Id55a53ba781d2b90ce40449037b6d8d47e72c476
All of the logic in reboot.cpp is meant to safely shutdown services,
safely unmount emulated RW file systems, then finally unmount the
remaining RW file systems, particularly /data. If /data hasn't been
mounted, then none of this logic is required.
Running this logic caused a lock up when shutting down blueline from
early-init. Vold, or potentially a related HAL, locked up during the
ShutdownVold() calls. debuggerd separately locked up in the watchdog
thread.
Therefore, this change immediately reboots if /data is not mounted.
It also removes the lines to call into debuggerd. debuggerd will not
run due to SELinux in any case, so it can only be used when hands-on
debugging a device.
Bug: 141082587
Test: shutdown with /data mounted continues as normal
Test: shutdown from early-init immediately shuts the device down
Change-Id: I79c72346b17c7dfe57e955d9739bcaf559badc14
It's been a long standing issue that init cannot respond to property
set messages when it is running a builtin command. This is
particularly problematic when the commands involve IPC to vold or
other daemons, as it prevents them from being able to set properties.
This change has init run property service in a thread, which
eliminates the above issue.
This change may also serve as a starting block to running property
service in an entirely different process to better isolate init from
handling property requests.
Reland: during reboot, init stops processing property_changed messages
from property service, since it will not act on these anyway. This
had an unexpected effect of causing future property_set calls to block
indefinitely, since the buffer between init and property_service was
filling up and the send() call from property_service would then
block. This change has init tell property_service to stop sending it
property_changed messages once reboot begins.
Test: CF boots, walleye boots, properties are set appropriately
Change-Id: I26902708e8be788caa6dbcf4b6d2968d90962785
This reverts commit 4d35f2e59c.
Reason for revert: b/137523800 This breaks factory reset on all devices (and potentially rescue party and non-ab updates). Because the init code unconditionally clear the arguments like "--wipe_data" written by framework; as a result, device boots into recovery without doing wipe.
I guess one fix is to check the content of BCB, and skip the overwrite if it already boots into recovery. Revert the cl first to unblock p1, will submit the fix separately.
Change-Id: Iccaf3dce6999005c2199490a138844d5a5d99e7f
Without this change "adb reboot recovery" leads to normal boot.
Change-Id: I361d0a1f6f6f2c57f3dc80102c21970b462c9b9c
Signed-off-by: Sam Protsenko <semen.protsenko@linaro.org>
Init had some pretty horrid Error() << StringPrintf(...) calls that
are all much better replaced by Errorf(...) now.
Test: build, check that keyword_map errors look correct
Change-Id: I572588c7541b928c72ae1bf140b814acdef1cd60
Now that Result<T> is actually expected<T, ...>, and the expected
proposal states expected<void, ...> as the way to indicate an expected
object that returns either successfully with no object or an error,
let's move init's Result<Success> to the preferred Result<void>.
Bug: 132145659
Test: boot, init unit tests
Change-Id: Ib2f98396d8e6e274f95a496fcdfd8341f77585ee