Commit Graph

461 Commits

Author SHA1 Message Date
Markus Armbruster dec9776049 trace-events: Fix attribution of trace points to source
Some trace points are attributed to the wrong source file.  Happens
when we neglect to update trace-events for code motion, or add events
in the wrong place, or misspell the file name.

Clean up with help of cleanup-trace-events.pl.  Same funnies as in the
previous commit, of course.  Manually shorten its change to
linux-user/trace-events to */signal.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20190314180929.27722-6-armbru@redhat.com
Message-Id: <20190314180929.27722-6-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 16:18:07 +00:00
Markus Armbruster 500016e5db trace-events: Shorten file names in comments
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to
source files.  That's because when trace-events got split up, the
comments were moved verbatim.

Delete the sub/dir/ part from these comments.  Gets rid of several
misspellings.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190314180929.27722-3-armbru@redhat.com
Message-Id: <20190314180929.27722-3-armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22 16:18:07 +00:00
Yang Zhong b42075bb77 virtio: express virtio dependencies with Kconfig
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-42-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paolo Bonzini e0e312f352 build: switch to Kconfig
The make_device_config.sh script is replaced by minikconf, which
is modified to support the same command line as its predecessor.

The roots of the parsing are default-configs/*.mak, Kconfig.host and
hw/Kconfig.  One difference with make_device_config.sh is that all symbols
have to be defined in a Kconfig file, including those coming from the
configure script.  This is the reason for the Kconfig.host file introduced
in the previous patch. Whenever a file in default-configs/*.mak used
$(...) to refer to a config-host.mak symbol, this is replaced by a
Kconfig dependency; this part must be done already in this patch
for bisectability.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190123065618.3520-28-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paolo Bonzini 82f5181777 kconfig: introduce kconfig files
The Kconfig files were generated mostly with this script:

  for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do
    set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' `
    shift
    if test $# = 1; then
      cat >> $(dirname $1)/Kconfig << EOF
config ${i#CONFIG_}
    bool

EOF
      git add $(dirname $1)/Kconfig
    else
      echo $i $*
    fi
  done
  sed -i '$d' hw/*/Kconfig
  for i in hw/*; do
    if test -d $i && ! test -f $i/Kconfig; then
      touch $i/Kconfig
      git add $i/Kconfig
    fi
  done

Whenever a symbol is referenced from multiple subdirectories, the
script prints the list of directories that reference the symbol.
These symbols have to be added manually to the Kconfig files.

Kconfig.host and hw/Kconfig were created manually.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20190123065618.3520-27-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paolo Bonzini 3fa749654e 9pfs: remove unnecessary conditionals
The VIRTIO_9P || VIRTFS && XEN condition can be computed in hw/Makefile.objs,
removing an "if" from hw/9pfs/Makefile.objs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07 21:45:53 +01:00
Paul Durrant 2d0ed5e642 xen: re-name XenDevice to XenLegacyDevice...
...and xen_backend.h to xen-legacy-backend.h

Rather than attempting to convert the existing backend infrastructure to
be QOM compliant (which would be hard to do in an incremental fashion),
subsequent patches will introduce a completely new framework for Xen PV
backends. Hence it is necessary to re-name parts of existing code to avoid
name clashes. The re-named 'legacy' infrastructure will be removed once all
backends have been ported to the new framework.

This patch is purely cosmetic. No functional change.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2019-01-14 13:45:40 +00:00
Greg Kurz 93aee84f57 9p: remove support for the "handle" backend
The "handle" fsdev backend was deprecated in QEMU 2.12.0 with:

commit db3b3c7281
Author: Greg Kurz <groug@kaod.org>
Date:   Mon Jan 8 11:18:23 2018 +0100

    9pfs: deprecate handle backend

    This backend raise some concerns:

    - doesn't support symlinks
    - fails +100 tests in the PJD POSIX file system test suite [1]
    - requires the QEMU process to run with the CAP_DAC_READ_SEARCH
      capability, which isn't recommended for security reasons

    This backend should not be used and wil be removed. The 'local'
    backend is the recommended alternative.

    [1] https://www.tuxera.com/community/posix-test-suite/

    Signed-off-by: Greg Kurz <groug@kaod.org>
    Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
    Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

It has passed the two release cooling period without any complaint.

Remove it now.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2018-12-12 14:18:10 +01:00
Greg Kurz 75607e0dcc xen/9pfs: use g_new(T, n) instead of g_malloc(sizeof(T) * n)
Because it is a recommended coding practice (see HACKING).

Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2018-12-12 14:18:10 +01:00
Greg Kurz 1923923bfa 9p: use g_new(T, n) instead of g_malloc(sizeof(T) * n)
Because it is a recommended coding practice (see HACKING).

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2018-12-12 14:18:10 +01:00
Greg Kurz 1d20398694 9p: fix QEMU crash when renaming files
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
 flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
#0  0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8,
 path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
#1  0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8,
 path=0x0) at hw/9pfs/9p-local.c:92
#2  0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8,
 fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
#3  0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498,
 path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
#4  0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498)
 at hw/9pfs/9p.c:1083
#5  0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767)
 at util/coroutine-ucontext.c:116
#6  0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6
#7  0x0000000000000000 in  ()
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-11-23 13:28:03 +01:00
Greg Kurz 5b3c77aa58 9p: take write lock on fid path updates (CVE-2018-19364)
Recent commit 5b76ef50f6 fixed a race where v9fs_co_open2() could
possibly overwrite a fid path with v9fs_path_copy() while it is being
accessed by some other thread, ie, use-after-free that can be detected
by ASAN with a custom 9p client.

It turns out that the same can happen at several locations where
v9fs_path_copy() is used to set the fid path. The fix is again to
take the write lock.

Fixes CVE-2018-19364.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-11-20 13:00:35 +01:00
Greg Kurz 5b76ef50f6 9p: write lock path in v9fs_co_open2()
The assumption that the fid cannot be used by any other operation is
wrong. At least, nothing prevents a misbehaving client to create a
file with a given fid, and to pass this fid to some other operation
at the same time (ie, without waiting for the response to the creation
request). The call to v9fs_path_copy() performed by the worker thread
after the file was created can race with any access to the fid path
performed by some other thread. This causes use-after-free issues that
can be detected by ASAN with a custom 9p client.

Unlike other operations that only read the fid path, v9fs_co_open2()
does modify it. It should hence take the write lock.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-11-08 21:19:05 +01:00
Markus Armbruster b836723dfe fsdev: Clean up error reporting in qemu_fsdev_add()
Calling error_report() from within a function that takes an Error **
argument is suspicious.  qemu_fsdev_add() does that, and its caller
fsdev_init_func() then fails without setting an error.  Its caller
main(), via qemu_opts_foreach(), is fine with it, but clean it up
anyway.

Cc: Greg Kurz <groug@kaod.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Message-Id: <20181017082702.5581-32-armbru@redhat.com>
2018-10-19 14:51:34 +02:00
Markus Armbruster 1245402e3c 9pfs: Fix CLI parsing crash on error
Calling error_report() in a function that takes an Error ** argument
is suspicious.  9p-handle.c's handle_parse_opts() does that, and then
fails without setting an error.  Wrong.  Its caller crashes when it
tries to report the error:

    $ qemu-system-x86_64 -nodefaults -fsdev id=foo,fsdriver=handle
    qemu-system-x86_64: -fsdev id=foo,fsdriver=handle: warning: handle backend is deprecated
    qemu-system-x86_64: -fsdev id=foo,fsdriver=handle: fsdev: No path specified
    Segmentation fault (core dumped)

Screwed up when commit 91cda4e8f3 (v2.12.0) converted the function to
Error.  Fix by calling error_setg() instead of error_report().

Fixes: 91cda4e8f3
Cc: Greg Kurz <groug@kaod.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-9-armbru@redhat.com>
2018-10-19 14:51:34 +02:00
Markus Armbruster 4b5766488f error: Fix use of error_prepend() with &error_fatal, &error_abort
From include/qapi/error.h:

  * Pass an existing error to the caller with the message modified:
  *     error_propagate(errp, err);
  *     error_prepend(errp, "Could not frobnicate '%s': ", name);

Fei Li pointed out that doing error_propagate() first doesn't work
well when @errp is &error_fatal or &error_abort: the error_prepend()
is never reached.

Since I doubt fixing the documentation will stop people from getting
it wrong, introduce error_propagate_prepend(), in the hope that it
lures people away from using its constituents in the wrong order.
Update the instructions in error.h accordingly.

Convert existing error_prepend() next to error_propagate to
error_propagate_prepend().  If any of these get reached with
&error_fatal or &error_abort, the error messages improve.  I didn't
check whether that's the case anywhere.

Cc: Fei Li <fli@suse.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20181017082702.5581-2-armbru@redhat.com>
2018-10-19 14:51:34 +02:00
Keno Fischer 230f1b31c5 9p: darwin: Explicitly cast comparisons of mode_t with -1
Comparisons of mode_t with -1 require an explicit cast, since mode_t
is unsigned on Darwin.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-29 12:32:10 +02:00
Keno Fischer 5c99fa375d cutils: Provide strchrnul
strchrnul is a GNU extension and thus unavailable on a number of targets.
In the review for a commit removing strchrnul from 9p, I was asked to
create a qemu_strchrnul helper to factor out this functionality.
Do so, and use it in a number of other places in the code base that inlined
the replacement pattern in a place where strchrnul could be used.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-29 12:32:10 +02:00
Keno Fischer aca6897fba 9p: xattr: Properly translate xattrcreate flags
As with unlinkat, these flags come from the client and need to
be translated to their host values. The protocol values happen
to match linux, but that need not be true in general.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:22 +02:00
Keno Fischer 67e8734574 9p: Properly check/translate flags in unlinkat
The 9p-local code previously relied on P9_DOTL_AT_REMOVEDIR and AT_REMOVEDIR
having the same numerical value and deferred any errorchecking to the
syscall itself. However, while the former assumption is true on Linux,
it is not true in general. 9p-handle did this properly however. Move
the translation code to the generic 9p server code and add an error
if unrecognized flags are passed.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:22 +02:00
Keno Fischer 5b7b2f9a85 9p: local: Avoid warning if FS_IOC_GETVERSION is not defined
Both `stbuf` and `local_ioc_getversion` where unused when
FS_IOC_GETVERSION was not defined, causing a compiler warning.

Reorganize the code to avoid this warning.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:22 +02:00
Keno Fischer a647502c58 9p: xattr: Fix crashes due to free of uninitialized value
If the size returned from llistxattr/lgetxattr is 0, we skipped
the malloc call, leaving xattr.value uninitialized. However, this
value is later passed to `g_free` without any further checks,
causing an error. Fix that by always calling g_malloc unconditionally.
If `size` is 0, it will return NULL, which is safe to pass to g_free.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:22 +02:00
Keno Fischer ec70b956fd 9p: Move a couple xattr functions to 9p-util
These functions will need custom implementations on Darwin. Since the
implementation is very similar among all of them, and 9p-util already
has the _nofollow version of fgetxattrat, let's move them all there.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:22 +02:00
Keno Fischer 2306271c38 9p: local: Properly set errp in fstatfs error path
In the review of

    9p: Avoid warning if FS_IOC_GETVERSION is not defined

Grep Kurz noted this error path was failing to set errp.
Fix that.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
[added local: to commit title, Greg Kurz]
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:21 +02:00
Keno Fischer fde1f3e4a0 9p: proxy: Fix size passed to `connect`
The size to pass to the `connect` call is the size of the entire
`struct sockaddr_un`. Passing anything shorter than this causes errors
on darwin.

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-06-07 12:17:21 +02:00
Paolo Bonzini b5dfdb082f hw: make virtio devices configurable via default-configs/
This is only half of the work, because the proxy devices (virtio-*-pci,
virtio-*-ccw, etc.) are still included unconditionally.  It is still a
move in the right direction.

Based-on: <20180522194943.24871-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 15:14:31 +02:00
Paul Durrant 58560f2ae7 xen: remove other open-coded use of libxengnttab
Now that helpers are available in xen_backend, use them throughout all
Xen PV backends.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2018-05-22 11:43:21 -07:00
Greg Kurz 8f9c64bfa5 9p: add trace event for v9fs_setattr()
Don't print the tv_nsec part of atime and mtime, to stay below the 10
argument limit of trace events.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2018-05-02 08:59:24 +02:00
Marc-André Lureau 6ce7177ae2 9p: fix leak in synth_name_to_path()
Leak found thanks to ASAN:

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55995789ac90 in __interceptor_malloc (/home/elmarco/src/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x1510c90)
    #1 0x7f0a91190f0c in g_malloc /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:94
    #2 0x5599580a281c in v9fs_path_copy /home/elmarco/src/qemu/hw/9pfs/9p.c:196:17
    #3 0x559958f9ec5d in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:116:9
    #4 0x7f0a8766ebbf  (/lib64/libc.so.6+0x50bbf)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-02-19 18:27:32 +01:00
Marc-André Lureau e446a1eb5e 9p: v9fs_path_copy() readability
lhs/rhs doesn't tell much about how argument are handled, dst/src is
and const arguments is clearer in my mind. Use g_memdup() while at it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
2018-02-19 18:27:15 +01:00
Markus Armbruster 922a01a013 Move include qemu/option.h from qemu-common.h to actual users
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter.  Drop the include, and add it
to the places that actually need it.

While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.

This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
2018-02-09 13:52:16 +01:00
Markus Armbruster e688df6bc4 Include qapi/error.h exactly where needed
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.

While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
2018-02-09 13:50:17 +01:00
Greg Kurz 357e2f7f4e tests: virtio-9p: add FLUSH operation test
The idea is to send a victim request that will possibly block in the
server and to send a flush request to cancel the victim request.

This patch adds two test to verifiy that:
- the server does not reply to a victim request that was actually
  cancelled
- the server replies to the flush request after replying to the
  victim request if it could not cancel it

9p request cancellation reference:

http://man.cat-v.org/plan_9/5/flush

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(groug, change the test to only write a single byte to avoid
        any alignment or endianess consideration)
2018-02-02 11:11:55 +01:00
Greg Kurz 354b86f85f tests: virtio-9p: add WRITE operation test
Trivial test of a successful write.

Signed-off-by: Greg Kurz <groug@kaod.org>
(groug, handle potential overflow when computing request size,
        add missing g_free(buf),
        backend handles one written byte at a time to validate
        the server doesn't do short-reads)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-02-01 21:21:28 +01:00
Greg Kurz 82469aaefe tests: virtio-9p: add LOPEN operation test
Trivial test of a successful open.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-02-01 21:21:28 +01:00
Greg Kurz 2893ddd598 tests: virtio-9p: use the synth backend
The purpose of virtio-9p-test is to test the virtio-9p device, especially
the 9p server state machine. We don't really care what fsdev backend we're
using. Moreover, if we want to be able to test the flush request or a
device reset with in-flights I/O, it is close to impossible to achieve
with a physical backend because we cannot ask it reliably to put an I/O
on hold at a specific point in time.

Fortunately, we can do that with the synthetic backend, which allows to
register callbacks on read/write accesses to a specific file. This will
be used by a later patch to test the 9P flush request.

The walk request test is converted to using the synth backend.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2018-02-01 21:21:27 +01:00
Keno Fischer fc78d5ee76 9pfs: Correctly handle cancelled requests
# Background

I was investigating spurious non-deterministic EINTR returns from
various 9p file system operations in a Linux guest served from the
qemu 9p server.

 ## EINTR, ERESTARTSYS and the linux kernel

When a signal arrives that the Linux kernel needs to deliver to user-space
while a given thread is blocked (in the 9p case waiting for a reply to its
request in 9p_client_rpc -> wait_event_interruptible), it asks whatever
driver is currently running to abort its current operation (in the 9p case
causing the submission of a TFLUSH message) and return to user space.
In these situations, the error message reported is generally ERESTARTSYS.
If the userspace processes specified SA_RESTART, this means that the
system call will get restarted upon completion of the signal handler
delivery (assuming the signal handler doesn't modify the process state
in complicated ways not relevant here). If SA_RESTART is not specified,
ERESTARTSYS gets translated to EINTR and user space is expected to handle
the restart itself.

 ## The 9p TFLUSH command

The 9p TFLUSH commands requests that the server abort an ongoing operation.
The man page [1] specifies:

```
If it recognizes oldtag as the tag of a pending transaction, it should
abort any pending response and discard that tag.
[...]
When the client sends a Tflush, it must wait to receive the corresponding
Rflush before reusing oldtag for subsequent messages. If a response to the
flushed request is received before the Rflush, the client must honor the
response as if it had not been flushed, since the completed request may
signify a state change in the server
```

In particular, this means that the server must not send a reply with the
orignal tag in response to the cancellation request, because the client is
obligated to interpret such a reply as a coincidental reply to the original
request.

 # The bug

When qemu receives a TFlush request, it sets the `cancelled` flag on the
relevant pdu. This flag is periodically checked, e.g. in
`v9fs_co_name_to_path`, and if set, the operation is aborted and the error
is set to EINTR. However, the server then violates the spec, by returning
to the client an Rerror response, rather than discarding the message
entirely. As a result, the client is required to assume that said Rerror
response is a result of the original request, not a result of the
cancellation and thus passes the EINTR error back to user space.
This is not the worst thing it could do, however as discussed above, the
correct error code would have been ERESTARTSYS, such that user space
programs with SA_RESTART set get correctly restarted upon completion of
the signal handler.
Instead, such programs get spurious EINTR results that they were not
expecting to handle.

It should be noted that there are plenty of user space programs that do not
set SA_RESTART and do not correctly handle EINTR either. However, that is
then a userspace bug. It should also be noted that this bug has been
mitigated by a recent commit to the Linux kernel [2], which essentially
prevents the kernel from sending Tflush requests unless the process is about
to die (in which case the process likely doesn't care about the response).
Nevertheless, for older kernels and to comply with the spec, I believe this
change is beneficial.

 # Implementation

The fix is fairly simple, just skipping notification of a reply if
the pdu was previously cancelled. We do however, also notify the transport
layer that we're doing this, so it can clean up any resources it may be
holding. I also added a new trace event to distinguish
operations that caused an error reply from those that were cancelled.

One complication is that we only omit sending the message on EINTR errors in
order to avoid confusing the rest of the code (which may assume that a
client knows about a fid if it sucessfully passed it off to pud_complete
without checking for cancellation status). This does mean that if the server
acts upon the cancellation flag, it always needs to set err to EINTR. I
believe this is true of the current code.

[1] https://9fans.github.io/plan9port/man/man9/flush.html
[2] https://github.com/torvalds/linux/commit/9523feac272ccad2ad8186ba4fcc891

Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, send a zero-sized reply instead of detaching the buffer]
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2018-02-01 21:21:27 +01:00
Greg Kurz 066eb006b5 9pfs: drop v9fs_register_transport()
No good reasons to do this outside of v9fs_device_realize_common().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2018-02-01 21:21:27 +01:00
Greg Kurz db3b3c7281 9pfs: deprecate handle backend
This backend raise some concerns:

- doesn't support symlinks
- fails +100 tests in the PJD POSIX file system test suite [1]
- requires the QEMU process to run with the CAP_DAC_READ_SEARCH
  capability, which isn't recommended for security reasons

This backend should not be used and wil be removed. The 'local'
backend is the recommended alternative.

[1] https://www.tuxera.com/community/posix-test-suite/

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2018-01-08 11:18:23 +01:00
Greg Kurz 65603a801e fsdev: improve error handling of backend init
This patch changes some error messages in the backend init code and
convert backends to propagate QEMU Error objects instead of calling
error_report().

One notable improvement is that the local backend now provides a more
detailed error report when it fails to open the shared directory.

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:23 +01:00
Greg Kurz 91cda4e8f3 fsdev: improve error handling of backend opts parsing
This patch changes some error messages in the backend opts parsing
code and convert backends to propagate QEMU Error objects instead
of calling error_report().

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:23 +01:00
Greg Kurz 7567359094 9pfs: make pdu_marshal() and pdu_unmarshal() static functions
They're only used by the 9p core code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-01-08 11:18:22 +01:00
Greg Kurz d1471233bb 9pfs: fix error path in pdu_submit()
If we receive an unsupported request id, we first decide to
return -ENOTSUPP to the client, but since the request id
causes is_read_only_op() to return false, we change the
error to be -EROFS if the fsdev is read-only. This doesn't
make sense since we don't know what the client asked for.

This patch ensures that -EROFS can only be returned if the
request id is supported.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-01-08 11:18:22 +01:00
Greg Kurz 7bd41d3db6 9pfs: fix type in *_parse_opts declarations
To comply with the QEMU coding style.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2018-01-08 11:18:22 +01:00
Greg Kurz c4ce2c0ff3 9pfs: handle: fix type definition
To comply with the QEMU coding style.

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:22 +01:00
Greg Kurz 8e71b96c62 9pfs: fix some type definitions
To comply with the QEMU coding style.

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:22 +01:00
Greg Kurz 01847522bc 9pfs: fix XattrOperations typedef
To comply with the QEMU coding style.

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:22 +01:00
Greg Kurz bd3be4dbbf virtio-9p: move unrealize/realize after virtio_9p_transport definition
And drop the now useless forward declaration of virtio_9p_transport.

Signed-off-by: Greg Kurz <groug@kaod.org>
2018-01-08 11:18:22 +01:00
Greg Kurz 267fcadf32 9pfs: fix v9fs_mark_fids_unreclaim() return value
The return value of v9fs_mark_fids_unreclaim() is then propagated to
pdu_complete(). It should be a negative errno, not -1.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-11-06 18:05:35 +01:00
Greg Kurz 21cf9edf4f 9pfs: drop one user of struct V9fsFidState
To comply with QEMU coding style.

Signed-off-by: Greg Kurz <groug@kaod.org>
2017-11-06 18:05:35 +01:00