virtiofsd: probe unshare(CLONE_FS) and print an error

An assertion failure is raised during request processing if
unshare(CLONE_FS) fails. Implement a probe at startup so the problem can
be detected right away.

Unfortunately the default Docker/Moby seccomp.json profile does not
allow unshare unless CAP_SYS_ADMIN is given. Other seccomp.json
profiles always allow unshare (e.g. podman is unaffected):
https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json

Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the
default seccomp.json is missing unshare.
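
The startup probe described above can be sketched as a standalone
translation unit. This is a minimal illustration, not the virtiofsd
code: the names probe_unshare_clone_fs() and unshare_failure_hint() are
hypothetical, and stderr stands in for fuse_log(). As in the patch, only
EPERM is treated as fatal.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for unshare() and CLONE_FS in <sched.h> */
#endif
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn an unshare() errno into a readable hint. */
const char *unshare_failure_hint(int err)
{
    if (err == EPERM) {
        return "EPERM; if running in a container please check that the "
               "container runtime seccomp policy allows unshare";
    }
    return strerror(err);
}

/*
 * Hypothetical probe: returns -1 if unshare(CLONE_FS) is rejected with
 * EPERM and 0 otherwise. Call it while the process is still
 * single-threaded, where the call has no visible effect.
 */
int probe_unshare_clone_fs(void)
{
    if (unshare(CLONE_FS) == -1) {
        int err = errno; /* save errno before calling stdio */

        if (err == EPERM) {
            fprintf(stderr, "unshare(CLONE_FS) failed with %s\n",
                    unshare_failure_hint(err));
            return -1;
        }
    }
    return 0;
}
```

A daemon would call probe_unshare_clone_fs() once during startup, before
any worker threads exist, and exit if it returns -1.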

Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200727190223.422280-4-stefanha@redhat.com>
Reviewed-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Stefan Hajnoczi 2020-07-27 20:02:23 +01:00 committed by Dr. David Alan Gilbert
parent 1c7cb1f52e
commit fd9279ec99
1 changed files with 16 additions and 0 deletions

@@ -949,6 +949,22 @@ int virtio_session_mount(struct fuse_session *se)
{
    int ret;

    /*
     * Test that unshare(CLONE_FS) works. fv_queue_worker() will need it. It's
     * an unprivileged system call but some Docker/Moby versions are known to
     * reject it via seccomp when CAP_SYS_ADMIN is not given.
     *
     * Note that the program is single-threaded here so this syscall has no
     * visible effect and is safe to make.
     */
    ret = unshare(CLONE_FS);
    if (ret == -1 && errno == EPERM) {
        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_FS) failed with EPERM. If "
                 "running in a container please check that the container "
                 "runtime seccomp policy allows unshare.\n");
        return -1;
    }

    ret = fv_create_listen_socket(se);
    if (ret < 0) {
        return ret;