bwrap Containers Developer Alexander Larsson Developer Colin Walters bwrap 1 User Commands bwrap container setup utility bwrap OPTION COMMAND Description bwrap is a unprivileged low-level sandboxing tool (optionally setuid on older distributions). You are unlikely to use it directly from the commandline, although that is possible. It works by creating a new, completely empty, filesystem namespace where the root is on a tmpfs that is invisible from the host, and which will be automatically cleaned up when the last process exits. You can then use commandline options to construct the root filesystem and process environment for the command to run in the namespace. By default, bwrap creates a new mount namespace for the sandbox. Optionally it also sets up new user, ipc, pid, network and uts namespaces (but note the user namespace is required if bwrap is not installed setuid root). The application in the sandbox can be made to run with a different UID and GID. If needed (e.g. when using a PID namespace) bwrap is running a minimal pid 1 process in the sandbox that is responsible for reaping zombies. It also detects when the initial application process (pid 2) dies and reports its exit status back to the original spawner. The pid 1 process exits to clean up the sandbox when there are no other processes in the sandbox left. Options When options are used multiple times, the last option wins, unless otherwise specified. General options: Print help and exit Print version Parse nul-separated arguments from the given file descriptor. This option can be used multiple times to parse options from multiple sources. Set argv[0] to the value VALUE before running the program Options related to kernel namespaces: Create a new user namespace Create a new user namespace if possible else skip it Create a new ipc namespace Create a new pid namespace Create a new network namespace Create a new uts namespace Create a new cgroup namespace Create a new cgroup namespace if possible else skip it Unshare all possible namespaces. Currently equivalent with: Retain the network namespace, overriding an earlier or Use an existing user namespace instead of creating a new one. The namespace must fulfil the permission requirements for setns(), which generally means that it must be a descendant of the currently active user namespace, owned by the same user. This is incompatible with --unshare-user, and doesn't work in the setuid version of bubblewrap. After setting up the new namespace, switch into the specified namespace. For this to work the specified namespace must be a descendant of the user namespace used for the setup, so this is only useful in combination with --userns. This is useful because sometimes bubblewrap itself creates nested user namespaces (to work around some kernel issues) and --userns2 can be used to enter these. Prevent the process in the sandbox from creating further user namespaces, so that it cannot rearrange the filesystem namespace or do other more complex namespace modification. This is currently implemented by setting the user.max_user_namespaces sysctl to 1, and then entering a nested user namespace which is unable to raise that limit in the outer namespace. This option requires , and doesn't work in the setuid version of bubblewrap. Confirm that the process in the sandbox has been prevented from creating further user namespaces, but without taking any particular action to prevent that. For example, this can be combined with to check that the given user namespace has already been set up to prevent the creation of further user namespaces. Use an existing pid namespace instead of creating one. This is often used with --userns, because the pid namespace must be owned by the same user namespace that bwrap uses. Note that this can be combined with --unshare-pid, and in that case it means that the sandbox will be in its own pid namespace, which is a child of the passed in one. Use a custom user id in the sandbox (requires ) Use a custom group id in the sandbox (requires ) Use a custom hostname in the sandbox (requires ) Options about environment setup: Change directory to DIR Set an environment variable Unset an environment variable Unset all environment variables, except for PWD and any that are subsequently set by Options for monitoring the sandbox from the outside: Take a lock on DEST while the sandbox is running. This option can be used multiple times to take locks on multiple files. Keep this file descriptor open while the sandbox is running Filesystem related options. These are all operations that modify the filesystem directly, or mounts stuff in the filesystem. These are applied in the order they are given as arguments. Any missing parent directories that are required to create a specified destination are automatically created as needed. Their permissions are normally set to 0755 (rwxr-xr-x). However, if a option is in effect, and it sets the permissions for group or other to zero, then newly-created parent directories will also have their corresponding permission set to zero. modifies the size of the created mount when preceding a action; and can be combined. This option does nothing on its own, and must be followed by one of the options that it affects. It sets the permissions for the next operation to OCTAL. Subsequent operations are not affected: for example, --perms 0700 --tmpfs /a --tmpfs /b will mount /a with permissions 0700, then return to the default permissions for /b. Note that and can be combined: --perms 0700 --size 10485760 --tmpfs /s will apply permissions as well as a maximum size to the created tmpfs. This option does nothing on its own, and must be followed by --tmpfs. It sets the size in bytes for the next tmpfs. For example, --size 10485760 --tmpfs /tmp will create a tmpfs at /tmp of size 10MiB. Subsequent operations are not affected: for example, --size 10485760 --tmpfs /a --tmpfs /b will mount /a with size 10MiB, then return to the default size for /b. Note that and can be combined: --size 10485760 --perms 0700 --tmpfs /s will apply permissions as well as a maximum size to the created tmpfs. Bind mount the host path SRC on DEST Equal to but ignores non-existent SRC Bind mount the host path SRC on DEST, allowing device access Equal to but ignores non-existent SRC Bind mount the host path SRC readonly on DEST Equal to but ignores non-existent SRC Remount the path DEST as readonly. It works only on the specified mount point, without changing any other mount point under the specified path Mount procfs on DEST Mount new devtmpfs on DEST Mount new tmpfs on DEST. If the previous option was , it sets the mode of the tmpfs. Otherwise, the tmpfs has mode 0755. If the previous option was , it sets the size in bytes of the tmpfs. Otherwise, the tmpfs has the default size. Mount new mqueue on DEST Create a directory at DEST. If the directory already exists, its permissions are unmodified, ignoring (use if the permissions of an existing directory need to be changed). If the directory is newly created and the previous option was , it sets the mode of the directory. Otherwise, newly-created directories have mode 0755. Copy from the file descriptor FD to DEST. If the previous option was , it sets the mode of the new file. Otherwise, the file has mode 0666 (note that this is not the same as ). Copy from the file descriptor FD to a file which is bind-mounted on DEST. If the previous option was , it sets the mode of the new file. Otherwise, the file has mode 0600 (note that this is not the same as ). Copy from the file descriptor FD to a file which is bind-mounted read-only on DEST. If the previous option was , it sets the mode of the new file. Otherwise, the file has mode 0600 (note that this is not the same as ). Create a symlink at DEST with target SRC. Since version 0.9.0, it is not considered to be an error if DEST already exists as a symbolic link and its target is exactly SRC. Before version 0.9.0, if DEST already existed, this would be treated as an error (even if its target was identical to SRC). Set the permissions of PATH, which must already exist, to OCTAL. Lockdown options: Load and use seccomp rules from FD. The rules need to be in the form of a compiled cBPF program, as generated by seccomp_export_bpf. If this option is given more than once, only the last one is used. Use if multiple seccomp programs are needed. Load and use seccomp rules from FD. The rules need to be in the form of a compiled cBPF program, as generated by seccomp_export_bpf. This option can be repeated, in which case all the seccomp programs will be loaded in the order given (note that the kernel will evaluate them in reverse order, so the last program on the bwrap command-line is evaluated first). All of them, except possibly the last, must allow use of the PR_SET_SECCOMP prctl. This option cannot be combined with . Exec Label from the sandbox. On an SELinux system you can specify the SELinux context for the sandbox process(s). File label for temporary sandbox content. On an SELinux system you can specify the SELinux context for the sandbox content. Block the sandbox on reading from FD until some data is available. Do not initialize the user namespace but wait on FD until it is ready. This allow external processes (like newuidmap/newgidmap) to setup the user namespace before it is used by the sandbox process. Write information in JSON format about the sandbox to FD. Multiple JSON documents are written to FD, one per line ("JSON lines" format). Each line is a single JSON object. After bwrap has started the child process inside the sandbox, it writes an object with a child-pid member to the (this duplicates the older ). The corresponding value is the process ID of the child process in the pid namespace from which bwrap was run. If available, the namespace IDs are also included in the object with the child-pid; again, this duplicates the older . When the child process inside the sandbox exits, bwrap writes an object with an exit-code member, and then closes the . The value corresponding to exit-code is the exit status of the child, in the usual shell encoding (n if it exited normally with status n, or 128+n if it was killed by signal n). Other members may be added to those objects in future versions of bwrap, and other JSON objects may be added before or after the current objects, so readers must ignore members and objects that they do not understand. Create a new terminal session for the sandbox (calls setsid()). This disconnects the sandbox from the controlling terminal which means the sandbox can't for instance inject input into the terminal. Note: In a general sandbox, if you don't use --new-session, it is recommended to use seccomp to disallow the TIOCSTI ioctl, otherwise the application can feed keyboard input to the terminal which can e.g. lead to out-of-sandbox command execution (see CVE-2017-5226). Ensures child process (COMMAND) dies when bwrap's parent dies. Kills (SIGKILL) all bwrap sandbox processes in sequence from parent to child including COMMAND process when bwrap or bwrap's parent dies. See prctl, PR_SET_PDEATHSIG. Do not create a process with PID=1 in the sandbox to reap child processes. Add the specified capability CAP, e.g. CAP_DAC_READ_SEARCH, when running as privileged user. It accepts the special value ALL to add all the permitted caps. Drop the specified capability when running as privileged user. It accepts the special value ALL to drop all the caps. By default no caps are left in the sandboxed process. The and options are processed in the order they are specified on the command line. Please be careful to the order they are specified. Environment HOME Used as the cwd in the sandbox if has not been explicitly specified and the current cwd is not present inside the sandbox. The option can be used to override the value that is used here. Exit status The bwrap command returns the exit status of the initial application process (pid 2 in the sandbox).