This change explicitly drops all inheritable capabilities (and, by
extension, ambient capabilities) when there are no explicit capabilities
being set by a service and the user is changed. This prevents Android
running in a container from accidentally granting extra capabilities to
services.
Bug: 69320306
Test: aosp_sailfish still boots
Test: sailfish:/ $ grep Cap /proc/`pidof android.hardware.audio@2.0-service`/status
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Test: sailfish:/ $ grep Cap /proc/`pidof logd`/status
CapInh: 0000000000000000
CapPrm: 0000000440000000
CapEff: 0000000440000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Test: Android in Chrome OS still boots
Test: localhost ~ # grep Cap /proc/`pidof android.hardware.audio@2.0-service`/status
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000000006daefdff
CapAmb: 0000000000000000
Test: localhost ~ # grep Cap /proc/`pidof logd`/status
CapInh: 0000000000000000
CapPrm: 0000000040000000
CapEff: 0000000040000000
CapBnd: 000000006daefdff
CapAmb: 0000000000000000
Change-Id: I9218f2e27ff4fb4d91d50f9a98c0fdb4e272952c
For instance, on vendor.img:
service foo /vendor/bin/nfc
...
And then on odm.img:
service foo /odm/bin/super-nfc
override
Allows a service on ODM to override a HAL on vendor.
Bug: 69050941
Test: boot, init_tests
Change-Id: I4e908fb66e89fc6e021799fe1fa6603d3072d62a
This is paving the way to allow an "override" tag
in init services. This also means that errors for
part of a service definition in its section will
be shown in addition to the fact that the service
is duplicated.
Bug: 69050941
Test: boot, init_tests
Change-Id: Ic1ea8597789f45ead1083451b3e933db1524bdc9
Allow it to fail. When there is an error for a section ending,
print the error pointing to the line where the section starts.
Bug: 69050941
Test: boot, init_tests
Change-Id: I1d8ed25f4b74cc9ac24d38b8075751c7d606aea8
If there is a restart follow a stop/reset immediately or vice versa,
clear previous flag bits.
Test: manual - trigger restart after stop immediately to check if
service get started.
Change-Id: I4503177d7cb5ed054dbcf50cd8e09728415404d4
For a oneshot service, if start happens immediately after stop,
the service could be still in stopping status and then start
won't do anything. This fix this race condition.
Test: manual - see reproduce instructions in bug.
Bug: 68020256
Change-Id: I20202fa346f1949a8bda3d90deedc8b6a6d814d3
Fixed issues related to forking services into new PID + mount
namespaces.
Remounting rootfs recursively as slave when creating a service in new
PID + mount namespaces. This prevents the service from interfering with
mount points in the parent namespace.
Unmount then mount /proc instead of mounting it with MS_REMOUNT, since
MS_REMOUNT is not sufficient to update /proc to the state appropriate
for the new PID namespace. Note that the /proc mount options specified
here are not the same as those used in the default mount namespace. I
kept them consistent with those used in the code prior to this fix.
Test: Used custom sleepd service to test init 'namespace' keyword.
Tested on angler in oreo-dev - I had to add PID namespaces to the
kernel (commit ad82c662).
Change-Id: I859104525f82fef3400d5abbad465331fc3d732f
This associates every service with a list of HIDL services
it provides. If these are disabled, hwservicemanager will
request for the service to startup.
Bug: 64678982
Test: manual with the light service
Change-Id: Ibf8a6f1cd38312c91c798b74574fa792f23c2df4
One of the major aspects of treble is the compartmentalization of system
and vendor components, however init leaves a huge gap here, as vendor
init scripts run in the same context as system init scripts and thus can
access and modify the same properties, files, etc as the system can.
This change is meant to close that gap. It forks a separate 'subcontext'
init that runs in a different SELinux context with permissions that match
what vendors should have access to. Commands get sent over a socket to
this 'subcontext' init that then runs them in this SELinux context and
returns the result.
Note that not all commands run in the subcontext; some commands such as
those dealing with services only make sense in the context of the main
init process.
Bug: 62875318
Test: init unit tests, boot bullhead, boot sailfish
Change-Id: Idf4a4ebf98842d27b8627f901f961ab9eb412aee
ExpandArgs() was factored out of Service::Start() to clean up init,
however this introduced a bug: the scope of expanded_args ends when
ExpandArgs() returns, yet pointers to the c strings contained within
those std::strings are returned from the function. These pointers are
invalid and have been seen to cause failures on real devices.
This change moves the execv() into ExpandArgs() and renames it
ExpandArgsAndExecv() to keep the clean separation of Service::Start()
but fix the variable scope issue.
Bug: 65303004
Test: boot fugu
Change-Id: I612128631f5b58d040bffcbc2220593ad16cd450
Add a new service option, `rlimit` that allows a given rlimit to be
set for a specific service instead of globally.
Use the same parsing, now allowing text such as 'cpu' or 'rtprio'
instead of relying on the enum value for the `setrlimit` builtin
command as well.
Bug: 63882119
Bug: 64894637
Test: boot bullhead, run a test app that attempts to set its rtprio to
95, see that the priority set fails normally but passes when
`rlimit rtprio 99 99` is used as its service option.
See that this fails when `rlimit rtprio 50 50` is used as well.
Test: new unit tests
Change-Id: I4a13ca20e8529937d8b4bc11718ffaaf77523a52
1) Attempt to make the error message associated with a missing service
better.
2) Provide a link to more in-depth documentation.
Bug: 65023716
Test: code compiles.
Change-Id: Ie0f1896fb41d5afd11501f046cb51d4c8afe0a62
Log Service failures via Result<T> such that their context can be
captured when interacting with services through builtin functions.
Test: boot bullhead
Change-Id: I4d99744d64008d4a06a404e3c9817182c6e177bc
Init keep its own copy of the environment that it uses for execve when
starting services. This is unnecessary however as libc already has
functions that mutate the environment and the environment that init
uses is clean for starting services. This change removes init's copy
of the environment and uses the libc functions instead.
This also makes small clean-up to the way the Service class stores
service specific environment variables.
Test: boot bullhead
Change-Id: I7c98a0b7aac9fa8f195ae33bd6a7515bb56faf78
Currently, init attempts to set ro.boottime.<service> properties
whenever a service starts, however since these properties are ro. this
means that an error is printed whenever a service is restarted.
Since these properties are intended for reporting boottime, these
subsequent writes during restarts are erroneous and therefore this
change stops attempting to write them, thus silencing the error.
Test: boot bullhead, restart processes, observe no error print
Change-Id: I372f8d5c26590fc0661b92f632410e23e6418841
Test: boot bullhead
Test: Introduce LOG(FATAL) at various points of init and ensure that
it reboots to the bootloader successfully
Test: Introduce LOG(FATAL) during DoReboot() and ensure that it reboots
instead of recursing infinitely
Test: Ensure that fatal signals reboot to bootloader
Change-Id: I409005b6fab379df2d635e3e33d2df48a1a97df3
init tries to propagate error information up to build context before
logging errors. This is a good thing, however too often init has the
overly verbose paradigm for error handling, below:
bool CalculateResult(const T& input, U* output, std::string* err)
bool CalculateAndUseResult(const T& input, std::string* err) {
U output;
std::string calculate_result_err;
if (!CalculateResult(input, &output, &calculate_result_err)) {
*err = "CalculateResult " + input + " failed: " +
calculate_result_err;
return false;
}
UseResult(output);
return true;
}
Even more common are functions that return only true/false but also
require passing a std::string* err in order to see the error message.
This change introduces a Result<T> that is use to either hold a
successful return value of type T or to hold an error message as a
std::string. If the functional only returns success or a failure with
an error message, Result<Success> may be used. The classes Error and
ErrnoError are used to indicate a failed Result<T>.
A successful Result<T> is constructed implicitly from any type that
can be implicitly converted to T or from the constructor arguments for
T. This allows you to return a type T directly from a function that
returns Result<T>.
Error and ErrnoError are used to construct a Result<T> has
failed. Each of these classes take an ostream as an input and are
implicitly cast to a Result<T> containing that failure. ErrnoError()
additionally appends ": " + strerror(errno) to the end of the failure
string to aid in interacting with C APIs.
The end result is that the above code snippet is turned into the much
clearer example below:
Result<U> CalculateResult(const T& input);
Result<Success> CalculateAndUseResult(const T& input) {
auto output = CalculateResult(input);
if (!output) {
return Error() << "CalculateResult " << input << " failed: "
<< output.error();
}
UseResult(*output);
return Success();
}
This change also makes this conversion for some of the util.cpp
functions that used the old paradigm.
Test: boot bullhead, init unit tests
Merged-In: I1e7d3a8820a79362245041251057fbeed2f7979b
Change-Id: I1e7d3a8820a79362245041251057fbeed2f7979b
ServiceManager is essentially just a list now that the rest of its
functionality has been moved elsewhere, so the class is renamed
appropriately.
The ServiceList::Find* functions have been cleaned up into a single
smaller interface.
The ServiceList::ForEach functions have been removed in favor of
ServiceList itself being directly iterable.
Test: boot bullhead
Change-Id: Ibd57c103338f03b83d81e8b48ea0e46cd48fd8f0
signal_handler.cpp itself needs to be cleaned up, but this is a step
to clean up ServiceManager.
Test: boot bullhead
Change-Id: I81f1e8ac4d09692cfb364bc702cbd3deb61aa55a
These can be implemented without ServiceManager, so we remove them and
make ServiceManager slightly less of a God class.
Test: boot bullhead
Test: init unit tests
Change-Id: Ia6e546fe5292255412245256f7d230af4ece135f
The time data types associated with restarting processes halfway moved
to std::chrono and halfway didn't. In this intermediate state, the
times would get converted from nanoseconds to seconds then to
milliseconds. The precision lost when converting to seconds would
cause the main loop of init to spin whenever a process was within a
second of being restarted.
This patch cleans up this logic and uses nanoseconds and milliseconds
explicitly, with a ceiling to milliseconds to prevent unneeded
spinning.
Test: boot bullhead, kill processes, see that they restart sanely.
Change-Id: I0b017ba0e50c09704b0c5cdfcde1dba461804593
prctl(PR_SET_SECUREBITS, ...) expects an unsigned long as its 2nd argument.
Passing in a int64_t happens to work with a 64-bit kernel, but does not
work with a 32-bit kernel.
Bug: 63680332
Test: boot 32-bit kernel; verify services with capabilities can successfully
set those capabilties
Change-Id: I60250d107a77b54b2e9fe3419b4480b921c7e2f8
Signed-off-by: Ben Fennema <fennema@google.com>
Currently, the order that we kill to services during shutdown is the
order of services_ in ServiceManager and that is defacto the order in
which they were parsed, which is not a very useful ordering.
Related to this, we have seen a few issues during shutdown that may be
related to services with dependencies on other services, where the
dependency is killed first and the dependent service then misbehaves.
This change allows services to keep track of the order in which they
were started and shutdown then uses that information to kill running
services in the opposite order that they were started.
Bug: 64067984
Test: Boot and reboot bullhead
Change-Id: I6b4cacb03aed2a72ae98a346bce41ed5434a09c2
Allow configuring memory.swappiness, memory.soft_limit_in_bytes
and memory.limit_in_bytes by init; by doing so there is better
control of memory consumption per native app.
Test: tested on gobo branch.
bug: 63765067
Change-Id: I8906f3ff5ef77f75a0f4cdfbf9d424a579ed52bb
- "shutdown critical" prevents killing the service during
shutdown. And the service will be started if not running.
- Without it, services will be killed by SIGTERM / SIGKILL during shutdown.
- Even services with "shutdown critical" will be killed if shutdown
times out.
- Removes ueventd and vold from hard coded list. Each service's rc will
be updated to add "shutdown critical". watchdogd is still kept in the list.
bug: 37626581
Test: reboot and check last kmsg
Change-Id: Ie8cc699d1efbc59b9a2561bdd40fec64aed5a4bb
We have been seeing panics and errors during shutdown sequence in
some vendor's platform, and it is required to disable error handling
during shutdown.
This CL separates the shutdown request to execute another "shutdown"
trigger at the beginning of shutdown stage. And vendor can use this
trigger to add custom commands needed for shutting down gracefully.
Bug: 38203024
Bug: 62084631
Test: device reboot/shutdown
Change-Id: I3fac4ed59f06667d86e477ee55ed391cf113717f
When Android is running in a container, some of the securebits might be
locked, which makes prctl(PR_SET_SECUREBITS) fail.
This change gets the previous state of the process' securebits and adds
the desired bits.
Bug: 62388055
Test: aosp_bullhead-eng boots
Test: If init has non-zero securebits, it can also boot
Change-Id: Ie03bf2538f9dca40955bc58314d269246f5731bd
When init gets SIGCHLD, it uses waitpid() to get the pid of an exited
process. It then calls kill(-pid, ...) to ensure that all processes
in the process group started by that process are killed as well.
There is a bug here however as waitpid() reaps the pid when it
returns, meaning that the call to kill(-pid, ...) may fail with ESRCH
as there are no remaining references to that pid. Or worse, if the
pid is reused, the wrong processes may get the signal.
This fixes the bug by using waitid() with WNOWAIT to get the pid of an
exited process, which does not reap the pid. It then uses waitpid()
with the returned pid to do the reap only after the above kill(-pid,
...) and other operations have completed.
Bug: 38164998
Test: kill surfaceflinger and see that processes exit and are reaped
appropriately
Test: `adb reboot` and observe that the extraneous kill() failed
messages do not appear
Change-Id: Ic0213e1c97e0141e6c13129dc2abbfed86de138b
1) property_set() takes const std::string& for both of its arguments,
so stop using .c_str() with its parameters
2) Simplify a few places where StringPrintf() is used to concatenate strings
3) Use std::to_string() instead of StringPrintf() where it's better suited
Test: Boot bullhead
Test: init unit tests
Change-Id: I68ebda0e469f6230c8f9ad3c8d5f9444e0c4fdfd
restorecon_recursive may take a long time if there are a lot of files on
the volume. This can trigger a watchdog timeout in any process that
tries to set a property while it is running. Fix this by running
restorecon_recursive in its own process.
See https://jira.lineageos.org/browse/BUGBASH-555
Change-Id: I2ce26ff2b5bfc9a133ea42f4dbac50a3ac289c04
libprocessgroup kills the cgroup associated with a given pid and uid,
but not the POSIX process group associated with it. This means that
to kill both, two of the same signals must be sent, which may cause
some issues.
This change kills all POSIX process groups whose group leaders are
found within a cgroup. It only then kills processes in the cgroup
that are not part of the POSIX process groups that have been killed.
Bug: 37853905
Bug: 62418791
Test: Boot, kill zygote, reboot
Change-Id: Id1d96935745899b4c454c36c351ec16a0b1d3827
* changes:
init: change kill order and fix error reporting in KillProcessGroup()
Better logging in libprocessgroup and make resources clean up themselves
In the init scripts for socket, the type can have a suffix of
"+passcred" to request that the socket be bound to report SO_PASSCRED
credentials as part of socket transactions.
Test: gTest logd-unit-tests --gtest_filter=logd.statistics right after boot
(fails without logd.rc change)
Bug: 37985222
Change-Id: Ie5b50e99fb92fa9bec9a32463a0e6df26a968bfd
Check the result of DecodeUid() and return failure when uids/gids are
unable to be decoded.
Also, use an error string instead of logging directly such that more
context can be added when decoding fails.
Bug: 38038887
Test: Boot bullhead
Test: Init unit tests
Change-Id: I84c11aa5a8041bf5d2f754ee9af748344b789b37
First kill the process group before killing the cgroup to catch
the hopeful case that killing the cgroup becomes a no-op as all of its
processes have already been killed.
Do not report an error if kill fails due to ESRCH, as this happens
often when reaping processes due to the order in which we call
waitpid() and kill().
Do not call killProcessGroup in libprocessgroup if we have already
successfully killed and removed a process group.
Bug: 36661364
Bug: 36701253
Bug: 37540956
Test: Reboot bullhead
Test: Start and stop services
Test: Init unit tests
Change-Id: I172acf0f8e00189f910f865f4635a7b1782fc7e3
Add unit test to ensure all POD types of Service are initialized.
Bug: 37855222
Test: Ensure bugreport is triggered via keychord properly.
Test: New unit tests
Change-Id: If2cfea15a74ab417a7b909a60c264cb8eb990de7
- allows easier tracking of wait time from monitoring tools
- this change also reduces unnecessary log spam
- service exit log looks like this:
init: Service 'exec 4 (/system/bin/otapreopt_slot)' (pid 611) exited with status 0 waiting took 0.060771 seconds
bug: 37752410
Test: reboot and check log
Change-Id: I122902538697f33939eede548e39f155ec419e03
This line shows up immediately before starting a service for each
service without a 'seclabel' option, essentially becoming log spam.
We already log if we fail to compute the context as well.
Test: Boot bullhead
Change-Id: Ibe91fd2dd9f53a8ae2ca95ccea1636ecef2af224