Commit Graph

136 Commits

Author SHA1 Message Date
Peter Maydell 80c7c2b00d nbd: Don't take address of fields in packed structs
Taking the address of a field in a packed struct is a bad idea, because
it might not be actually aligned enough for that pointer type (and
thus cause a crash on dereference on some host architectures). Newer
versions of clang warn about this. Avoid the bug by not using the
"modify in place" byte swapping functions.

This patch was produced with the following spatch script:
@@
expression E;
@@
-be16_to_cpus(&E);
+E = be16_to_cpu(E);
@@
expression E;
@@
-be32_to_cpus(&E);
+E = be32_to_cpu(E);
@@
expression E;
@@
-be64_to_cpus(&E);
+E = be64_to_cpu(E);
@@
expression E;
@@
-cpu_to_be16s(&E);
+E = cpu_to_be16(E);
@@
expression E;
@@
-cpu_to_be32s(&E);
+E = cpu_to_be32(E);
@@
expression E;
@@
-cpu_to_be64s(&E);
+E = cpu_to_be64(E);

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180927164200.15097-1-peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: rebase, and squash in missed changes]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-10-03 09:58:43 -05:00
Vladimir Sementsov-Ogievskiy fb7afc797e nbd/server: send more than one extent of base:allocation context
This is necessary for efficient block-status export, for clients which
support it.  (qemu is not yet such a client, but could become one.)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180704112302.471456-3-vsementsov@virtuozzo.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-09-26 21:37:48 -05:00
Vladimir Sementsov-Ogievskiy 6545916d52 nbd/server: fix bitmap export
bitmap_to_extents function is broken: it switches dirty variable after
every iteration, however it can process only part of dirty (or zero)
area during one iteration in case when this area is too large for one
extent.

Fortunately, the bug doesn't produce wrong extent flags: it just inserts
a zero-length extent between sequential extents representing large dirty
(or zero) area. However, zero-length extents are forbidden by the NBD
protocol. So, a careful client should consider such a reply as a server
fault, while a less-careful will likely ignore zero-length extents.

The bug can only be triggered by a client that requests block status
for nearly 4G at once (a request of 4G and larger is impossible per
the protocol, and requests smaller than 4G less the bitmap granularity
cause the loop to quit iterating rather than revisit the tail of the
large area); it also cannot trigger if the client used the
NBD_CMD_FLAG_REQ_ONE flag.  Since qemu 3.0 as client (using the
x-dirty-bitmap extension) always passes the flag, it is immune; and
we are not aware of other open-source clients that know how to request
qemu:dirty-bitmap:FOO contexts.  Clients that want to avoid the bug
could cap block status requests to a smaller length, such as 2G or 3G.

Fix this by more careful handling of dirty variable.

Bug was introduced in 3d068aff16
 "nbd/server: implement dirty bitmap export", with the whole function.
and is present in v3.0.0 release.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180914165116.23182-1-vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: improved commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-09-26 10:08:55 -05:00
Vladimir Sementsov-Ogievskiy 0c0eaed147 nbd/server: fix nbd_co_send_block_status
Call nbd_co_send_extents() with correct length parameter
(extent.length may be smaller than original length).

Also, switch length parameter type to uint32_t, to correspond with
request->len and similar nbd_co_send_bitmap().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180704112302.471456-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-07-07 20:30:09 -05:00
Eric Blake 7606c99a04 nbd/server: Fix dirty bitmap logic regression
In my hurry to fix a build failure, I introduced a logic bug.
The assertion conditional is backwards, meaning that qemu will
now abort instead of reporting dirty bitmap status.

The bug can only be tickled by an NBD client using an exported
dirty bitmap (which is still an experimental QMP command), so
it's not the end of the world for supported usage (and neither
'make check' nor qemu-iotests fails); but it also shows that we
really want qemu-io support for reading dirty bitmaps if only
so that I can add iotests coverage to prevent future
brown-bag-of-shame events like this one.

Fixes: 45eb6fb6
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180622153509.375130-1-eblake@redhat.com>
2018-07-02 14:27:48 -05:00
Eric Blake 45eb6fb6ce nbd/server: Silence gcc false positive
The code has a while() loop that always initialized 'end', and
the loop always executes at least once (as evidenced by the assert()
just prior to the loop).  But some versions of gcc still complain
that 'end' is used uninitialized, so silence them.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180622125814.345274-1-eblake@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-22 14:18:36 +01:00
Vladimir Sementsov-Ogievskiy bc37b06a5c nbd/server: introduce NBD_CMD_CACHE
Handle nbd CACHE command. Just do read, without sending read data back.
Cache mechanism should be done by exported node driver chain.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180413143156.11409-1-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix two missing case labels in switch statements]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-06-21 09:41:39 -05:00
Vladimir Sementsov-Ogievskiy 3d068aff16 nbd/server: implement dirty bitmap export
Handle a new NBD meta namespace: "qemu", and corresponding queries:
"qemu:dirty-bitmap:<export bitmap name>".

With the new metadata context negotiated, BLOCK_STATUS query will reply
with dirty-bitmap data, converted to extents. The new public function
nbd_export_bitmap selects which bitmap to export. For now, only one bitmap
may be exported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: wording tweaks, minor cleanups, additional tracing]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-06-21 09:23:58 -05:00
Vladimir Sementsov-Ogievskiy b0769d8f8d nbd/server: add nbd_meta_empty_or_pattern helper
Add nbd_meta_pattern() and nbd_meta_empty_or_pattern() helpers for
metadata query parsing. nbd_meta_pattern() will be reused for the
"qemu" namespace in following patches.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-06-21 09:23:58 -05:00
Vladimir Sementsov-Ogievskiy af736e5467 nbd/server: refactor NBDExportMetaContexts
Use NBDExport pointer instead of just export name: there is no need to
store a duplicated name in the struct; moreover, NBDExport will be used
further.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: commit message grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-06-21 09:23:58 -05:00
Vladimir Sementsov-Ogievskiy dbb8b396bb nbd/server: fix trace
Return code = 1 doesn't mean that we parsed base:allocation. Use
correct traces in both -parsed and -skipped cases.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180609151758.17343-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-06-21 09:23:58 -05:00
Eric Blake d8b20291cb nbd/server: Reject 0-length block status request
The NBD spec says that behavior is unspecified if the client
requests 0 length for block status; but since the structured
reply is documenting as returning a non-zero length, it's
easier to just diagnose this with an EINVAL error than to
figure out what to return.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180621124937.166549-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-06-21 09:21:34 -05:00
Eric Blake 2b53af2523 nbd: trace meta context negotiation
Having a more detailed log of the interaction between client and
server is invaluable in debugging how meta context negotiation
actually works.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180330130950.1931229-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-04-02 09:10:49 -05:00
Vladimir Sementsov-Ogievskiy e7b1948d51 nbd: BLOCK_STATUS for standard get_block_status function: server part
Minimal realization: only one extent in server answer is supported.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: tweak whitespace, move constant from .h to .c, improve
logic of check_meta_export_name, simplify nbd_negotiate_options
by doing more in nbd_negotiate_meta_queries]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:56 -05:00
Vladimir Sementsov-Ogievskiy 12296459f4 nbd/server: add nbd_read_opt_name helper
Add helper to read name in format:

  uint32 len       (<= NBD_MAX_NAME_SIZE)
  len bytes string (not 0-terminated)

The helper will be reused in following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180312152126.286890-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar fixes, actually check error]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy 2e425fd568 nbd/server: add nbd_opt_invalid helper
NBD_REP_ERR_INVALID is often parameter to nbd_opt_drop and it would
be used more in following patches. So, let's add a helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180312152126.286890-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Eric Blake 65529782f8 nbd/server: Honor FUA request on NBD_CMD_TRIM
The NBD spec states that since trim requests can affect disk contents,
then they should allow for FUA semantics just like writes for ensuring
the disk has settled before returning.  As bdrv_[co_]pdiscard() does
not support a flags argument, we can't pass FUA down the block layer
stack, and must therefore emulate it with a flush at the NBD layer.

Note that in all reality, generic well-behaved clients will never
send TRIM+FUA (in fact, qemu as a client never does, and we have no
intention to plumb flags into bdrv_pdiscard).  This is because the
NBD protocol states that it is unspecified to READ a trimmed area
(you might read stale data, all zeroes, or even random unrelated
data) without first rewriting it, and even the experimental
BLOCK_STATUS extension states that TRIM need not affect reported
status.  Thus, in the general case, a client cannot tell the
difference between an arbitrary server that ignores TRIM, a server
that had a power outage without flushing to disk, and a server that
actually affected the disk before returning; so waiting for the
trim actions to flush to disk makes little sense.  However, for a
specific client and server pair, where the client knows the server
treats TRIM'd areas as guaranteed reads-zero, waiting for a flush
makes sense, hence why the protocol documents that FUA is valid on
trim.  So, even though the NBD protocol doesn't have a way for the
server to advertise what effects (if any) TRIM will actually have,
and thus any client that relies on specific effects is probably
in error, we can at least support a client that requests TRIM+FUA.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180307225732.155835-1-eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy 6f302e6093 nbd/server: refactor nbd_trip: split out nbd_handle_request
Split out request handling logic.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: touch up blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy 6a4175997b nbd/server: refactor nbd_trip: cmd_read and generic reply
nbd_trip has difficult logic when sending replies: it tries to use one
code path for all replies. It is ok for simple replies, but is not
comfortable for structured replies. Also, two types of error (and
corresponding messages in local_err) - fatal (leading to disconnect)
and not-fatal (just to be sent to the client) are difficult to follow.

To make things a bit clearer, the following is done:
 - split CMD_READ logic to separate function. It is the most difficult
   command for now, and it is definitely cramped inside nbd_trip. Also,
   it is difficult to follow CMD_READ logic, shared between
   "case NBD_CMD_READ" and "if"s under "reply:" label.
 - create separate helper function nbd_send_generic_reply() and use it
   both in new nbd_do_cmd_read and for other commands in nbd_trip instead
   of common code-path under "reply:" label in nbd_trip. The helper
   supports an error message, so logic with local_err in nbd_trip is
   simplified.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweaks and blank line placement]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy a0d7ce20a9 nbd/server: fix: check client->closing before sending reply
Since the unchanged code has just set client->recv_coroutine to
NULL before calling nbd_client_receive_next_request(), we are
spawning a new coroutine unconditionally, but the first thing
that coroutine will do is check for client->closing, making it
a no-op if we have already detected that the client is going
away.  Furthermore, for any error other than EIO (where we
disconnect, which itself sets client->closing), if the client
has already gone away, we'll probably encounter EIO later
in the function and attempt disconnect at that point.  Logically,
as soon as we know the connection is closing, there is no need
to try a likely-to-fail a response or spawn a no-op coroutine.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: squash in further reordering: hoist check before spawning
next coroutine, and document rationale in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy 37e02aebf8 nbd/server: fix sparse read
In case of io error in nbd_co_send_sparse_read we should not
"goto reply:", as it was a fatal error and the common behavior
is to disconnect in this case. We should not try to send the
client an additional error reply, since we already hit a
channel-io error on our previous attempt to send one.

Fix this by handling block-status error in nbd_co_send_sparse_read,
so nbd_co_send_sparse_read fails only on io error. Then just skip
common "reply:" code path in nbd_trip.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-3-vsementsov@virtuozzo.com>
[eblake: grammar tweaks]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Vladimir Sementsov-Ogievskiy 60ace2bacf nbd/server: move nbd_co_send_structured_error up
To be reused in nbd_co_send_sparse_read() in the following patch.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180308184636.178534-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-13 15:38:55 -05:00
Peter Xu 1939ccdaa6 qio: non-default context for TLS handshake
A new parameter "context" is added to qio_channel_tls_handshake() is to
allow the TLS to be run on a non-default context.  Still, no functional
change.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-03-06 10:19:07 +00:00
Vladimir Sementsov-Ogievskiy 28fb494f9b nbd/client: fix error messages in nbd_handle_reply_err
1. NBD_REP_ERR_INVALID is not only about length, so, make message more
   general

2. hex format is not very good: it's hard to read something like
   "option a (set meta context)", so switch to dec.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1518702707-7077-6-git-send-email-vsementsov@virtuozzo.com>
[eblake: expand scope of patch: ALL uses of nbd_opt_lookup and
nbd_rep_lookup are now decimal]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-03-01 14:48:23 -06:00
Vladimir Sementsov-Ogievskiy a3b0dc7582 qapi: add nbd-server-remove
Add command for removing an export. It is needed for cases when we
don't want to keep the export after the operation on it was completed.
The other example is a temporary node, created with blockdev-add.
If we want to delete it we should firstly remove any corresponding
NBD export.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180119135719.24745-3-vsementsov@virtuozzo.com>
[eblake: drop dead nb_clients code]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-26 09:37:20 -06:00
Vladimir Sementsov-Ogievskiy 1d17922a28 nbd/server: structurize option reply sending
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171122101958.17065-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake 894e02804c nbd/server: Add helper functions for parsing option payload
Rather than making every callsite perform length sanity checks
and error reporting, add the helper functions nbd_opt_read()
and nbd_opt_drop() that use the length stored in the client
struct; also add an assertion that optlen is 0 before any
option (ie. any previous option was fully handled), complementing
the assertion added in an earlier patch that optlen is 0 after
all negotiation completes.

Note that the call in nbd_negotiate_handle_export_name() does
not use the new helper (in part because the server cannot
reply to NBD_OPT_EXPORT_NAME - it either succeeds or the
connection drops).

Based on patches by Vladimir Sementsov-Ogievskiy.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-6-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake 41f5dfafbb nbd/server: Add va_list form of nbd_negotiate_send_rep_err()
This will be useful for the next patch.

Based on a patch by Vladimir Sementsov-Ogievskiy

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-5-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake 32f158a635 nbd/server: Better error for NBD_OPT_EXPORT_NAME failure
When a client abruptly disconnects before we've finished reading
the name sent with NBD_OPT_EXPORT_NAME, we are better off logging
the failure as EIO (we can't communicate with the client), rather
than EINVAL (the client sent bogus data).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-4-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Vladimir Sementsov-Ogievskiy 0cfae925d2 nbd/server: refactor negotiation functions parameters
Instead of passing currently negotiating option and its length to
many of negotiation functions let's just store them on NBDClient
struct to be state-variables of negotiation phase.

This unifies semantics of negotiation functions and allows
tracking changes of remaining option length in future patches.

Asssert that optlen is back to 0 after negotiation (including
old-style connections which don't negotiate), although we need
more patches before we can assert optlen is 0 between options
during negotiation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171122101958.17065-2-vsementsov@virtuozzo.com>
[eblake: rebase, commit message tweak, assert !optlen after
negotiation completes]
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Eric Blake a16a790770 nbd/server: Hoist nbd_reject_length() earlier
No semantic change, but will make it easier for an upcoming patch
to refactor code without having to add forward declarations.  Fix
a poor comment while at it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-2-eblake@redhat.com>
2018-01-17 20:14:12 -06:00
Vladimir Sementsov-Ogievskiy 9156245ec4 nbd/server: add additional assert to nbd_export_put
This place is not obvious, nbd_export_close may theoretically reduce
refcount to 0. It may happen if someone calls nbd_export_put on named
export not through nbd_export_set_name when refcount is 1.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20171207155102.66622-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2018-01-09 12:53:44 -06:00
Eric Blake e2de3256c3 nbd/server: Optimize final chunk of sparse read
If we are careful to handle 0-length read requests correctly,
we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
needing a separate chunk.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-01-08 09:12:23 -06:00
Eric Blake 418638d3e4 nbd/server: Implement sparse reads atop structured reply
The reason that NBD added structured reply in the first place was
to allow for efficient reads of sparse files, by allowing the
reply to include chunks to quickly communicate holes to the client
without sending lots of zeroes over the wire.  Time to implement
this in the server; our client can already read such data.

We can only skip holes insofar as the block layer can query them;
and only if the client is okay with a fragmented request (if a
client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
that's a fringe case not worth catering to here).  Sadly, the
control flow is a bit wonkier than I would have preferred, but
it was minimally invasive to have a split in the action between
a fragmented read (handled directly where we recognize
NBD_CMD_READ with the right conditions, and sending multiple
chunks) vs. a single read (handled at the end of nbd_trip, for
both simple and structured replies, when we know there is only
one thing being read).  Likewise, I didn't make any effort to
optimize the final chunk of a fragmented read to set the
NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
NBD_REPLY_TYPE_NONE.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2018-01-08 09:12:23 -06:00
Eric Blake 51ae4f8455 nbd/server: CVE-2017-15118 Stack smash on large export name
Introduced in commit f37708f6b8 (2.10).  The NBD spec says a client
can request export names up to 4096 bytes in length, even though
they should not expect success on names longer than 256.  However,
qemu hard-codes the limit of 256, and fails to filter out a client
that probes for a longer name; the result is a stack smash that can
potentially give an attacker arbitrary control over the qemu
process.

The smash can be easily demonstrated with this client:
$ qemu-io f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a)

If the qemu NBD server binary (whether the standalone qemu-nbd, or
the builtin server of QMP nbd-server-start) was compiled with
-fstack-protector-strong, the ability to exploit the stack smash
into arbitrary execution is a lot more difficult (but still
theoretically possible to a determined attacker, perhaps in
combination with other CVEs).  Still, crashing a running qemu (and
losing the VM) is bad enough, even if the attacker did not obtain
full execution control.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-11-28 06:58:01 -06:00
Eric Blake fdad35ef6c nbd/server: CVE-2017-15119 Reject options larger than 32M
The NBD spec gives us permission to abruptly disconnect on clients
that send outrageously large option requests, rather than having
to spend the time reading to the end of the option.  No real
option request requires that much data anyways; and meanwhile, we
already have the practice of abruptly dropping the connection on
any client that sends NBD_CMD_WRITE with a payload larger than 32M.

For comparison, nbdkit drops the connection on any request with
more than 4096 bytes; however, that limit is probably too low
(as the NBD spec states an export name can theoretically be up
to 4096 bytes, which means a valid NBD_OPT_INFO could be even
longer) - even if qemu doesn't permit exports longer than 256
bytes.

It could be argued that a malicious client trying to get us to
read nearly 4G of data on a bad request is a form of denial of
service.  In particular, if the server requires TLS, but a client
that does not know the TLS credentials sends any option (other
than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
payload of nearly 4G, then the server was keeping the connection
alive trying to read all the payload, tying up resources that it
would rather be spending on a client that can get past the TLS
handshake.  Hence, this warranted a CVE.

Present since at least 2.5 when handling known options, and made
worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
to handle unknown options.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-11-28 06:42:26 -06:00
Eric Blake fed5f8f820 nbd/server: Fix error reporting for bad requests
The NBD spec says an attempt to NBD_CMD_TRIM on a read-only
export should fail with EPERM, as a trim has the potential
to change disk contents, but we were relying on the block
layer to catch that for us, which might not always give the
right error (and even if it does, it does not let us pass
back a sane message for structured replies).

The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of
bounds should fail with ENOSPC, not EINVAL.

Our check for u64 offset + u32 length wraparound up front is
pointless; nothing uses offset until after the second round
of sanity checks, and we can just as easily ensure there is
no wraparound by checking whether offset is in bounds (since
a disk size cannot exceed off_t which is 63 bits, adding a
32-bit number for a valid offset can't overflow).  Bonus:
dropping the up-front check lets us keep the connection alive
after NBD_CMD_WRITE, whereas before we would drop the
connection (of course, any client sending a packet that would
trigger the failure is already buggy, so it's also okay to
drop the connection, but better quality-of-implementation
never hurts).

Solve all of these issues by some code motion and improved
request validation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171115213557.3548-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-17 08:38:38 -06:00
Eric Blake ef8c887ee0 nbd/server: Fix structured read of length 0
The NBD spec was recently clarified to state that a read of length 0
should not be attempted by a compliant client; but that a server must
still handle it correctly in an unspecified manner (that is, either
a successful no-op or an error reply, but not a crash) [1].  However,
it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero
payload length, but our existing code was replying with a chunk
that a picky client could reject as invalid because it was missing
a payload (our own client implementation was recently patched to be
that picky, after first fixing it to not send 0-length requests).

We are already doing successful no-ops for 0-length writes and for
non-structured reads; so for consistency, we want structured reply
reads to also be a no-op.  The easiest way to do this is to return
a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper
function (especially since future patches for other structured
replies may benefit from using the same helper).

[1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-8-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-09 10:25:11 -06:00
Eric Blake efdc0c103d nbd: Fix struct name for structured reads
A closer read of the NBD spec shows that a structured reply chunk
for a hole is not quite identical to the prefix of a data chunk,
because the hole has to also send a 32-bit size field.  Although
we do not yet send holes, we should fix the misleading information
in our header and make it easier for a future patch to support
sparse reads.  Messed up in commit bae245d1.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-11-09 10:17:12 -06:00
Vladimir Sementsov-Ogievskiy 46321d6b5f nbd/server: fix nbd_negotiate_handle_info
namelen should be here, length is unrelated, and always 0 at this
point.  Broken in introduction in commit f37708f6, but mostly
harmless (replying with '' as the name does not violate protocol,
and does not confuse qemu as the nbd client since our implementation
does not ask for the name; but might confuse some other client that
does ask for the name especially if the default export is different
than the export name being queried).

Adding an assert makes it obvious that we are not skipping any bytes
in the client's message, as well as making it obvious that we were
using the wrong variable.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Message-Id: <20171101154204.27146-1-vsementsov@virtuozzo.com>
[eblake: improve commit message, squash in assert addition]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-11-08 16:32:26 -06:00
Eric Blake a57f6dea02 nbd/server: Include human-readable message in structured errors
The NBD spec permits including a human-readable error string if
structured replies are in force, so we might as well send the
client the message that we logged on any error.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-9-eblake@redhat.com>
2017-10-30 21:48:11 +01:00
Vladimir Sementsov-Ogievskiy 5c54e7fa71 nbd: Minimal structured read for server
Minimal implementation of structured read: one structured reply chunk,
no segmentation.
Minimal structured error implementation: no text message.
Support DF flag, but just ignore it, as there is no segmentation any
way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-8-eblake@redhat.com>
2017-10-30 21:48:06 +01:00
Eric Blake e68c35cfb8 nbd/server: Refactor zero-length option check
Consolidate the response for a non-zero-length option payload
into a new function, nbd_reject_length().  This check will
also be used when introducing support for structured replies.

Note that STARTTLS response differs based on time: if the connection
is still unencrypted, we set fatal to true (a client that can't
request TLS correctly may still think that we are ready to start
the TLS handshake, so we must disconnect); while if the connection
is already encrypted, the client is sending a bogus request but
is no longer at risk of being confused by continuing the connection.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-7-eblake@redhat.com>
[eblake: correct return value on STARTTLS]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-10-30 21:47:18 +01:00
Eric Blake 8cbee49ed7 nbd/server: Simplify nbd_negotiate_options loop
Instead of making each caller check whether a transmission error
occurred, we can sink a common error check to the end of the loop.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-6-eblake@redhat.com>
[eblake: squash in compiler warning fix]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2017-10-30 21:07:59 +01:00
Eric Blake 8fb48b8b38 nbd/server: Report error for write to read-only export
When the server is read-only, we were already reporting an error
message for NBD_CMD_WRITE_ZEROES, but failed to set errp for a
similar NBD_CMD_WRITE.  This will matter more once structured
replies allow the server to propagate the errp information back
to the client.  While at it, use an error message that makes a
bit more sense if viewed on the client side.

Note that when using qemu-io to test qemu-nbd behavior, it is
rather difficult to convince qemu-io to send protocol violations
(such as a read beyond bounds), because we have a lot of active
checking on the client side that a qemu-io request makes sense
before it ever goes over the wire to the server.  The case of a
client attempting a write when the server is started as
'qemu-nbd -r' is one of the few places where we can easily test
error path handling, without having to resort to hacking in known
temporary bugs to either the server or client.  [Maybe we want a
future patch to the client to do up-front checking on writes to a
read-only export, the way it does up-front bounds checking; but I
don't see anything in the NBD spec that points to a protocol
violation in our current behavior.]

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-5-eblake@redhat.com>
2017-10-30 21:07:44 +01:00
Eric Blake bae245d19a nbd: Expose constants and structs for structured read
Upcoming patches will implement the NBD structured reply
extension [1] for both client and server roles.  Declare the
constants, structs, and lookup routines that will be valuable
whether the server or client code is backported in isolation.

This includes moving one constant from an internal header to
the public header, as part of the structured read processing
will be done in block/nbd-client.c rather than nbd/client.c.

[1]https://github.com/NetworkBlockDevice/nbd/blob/extension-structured-reply/doc/proto.md

Based on patches from Vladimir Sementsov-Ogievskiy.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-4-eblake@redhat.com>
2017-10-30 21:07:21 +01:00
Eric Blake e7a78d0eff nbd: Include error names in trace messages
NBD errors were originally sent over the wire based on Linux errno
values; but not all the world is Linux, and not all platforms share
the same values.  Since a number isn't very easy to decipher on all
platforms, update the trace messages to include the name of NBD
errors being sent/received over the wire.  Tweak the trace messages
to be at the point where we are using the NBD error, not the
translation to the host errno values.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-2-eblake@redhat.com>
2017-10-30 21:07:21 +01:00
Vladimir Sementsov-Ogievskiy de79bfc36f nbd/server: simplify reply transmission
Send qiov via qio_channel_writev_all instead of calling nbd_write twice
with a cork.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171012095319.136610-8-vsementsov@virtuozzo.com>
[eblake: rebase to tweaks earlier in series]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-10-13 08:05:16 -05:00
Vladimir Sementsov-Ogievskiy 978df1b6bf nbd/server: refactor nbd_co_send_simple_reply parameters
Pass client and buffer (*data) parameters directly, to make the function
consistent with further structured reply sending functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171012095319.136610-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-10-13 08:05:14 -05:00
Vladimir Sementsov-Ogievskiy 14cea41d39 nbd/server: do not use NBDReply structure
NBDReply structure will be upgraded in future patches to handle both
simple and structured replies and will be used only in the client

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171012095319.136610-6-vsementsov@virtuozzo.com>
[eblake: rebase to tweaks earlier in series]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-10-13 08:05:11 -05:00