Detection of unclean shutdown has moved into user space.
The kernel code will, whenever it updates the meta data, mark it as
"unclean", and will refuse to attach to such unclean meta data.
"drbdadm up" now schedules "drbdmeta apply-al", which will apply
the activity log to the bitmap, and/or reinitialize it, if necessary,
as well as set a "clean" indicator flag.
This moves a bit code out of kernel space.
As a side effect, it also prevents some 8.3 module from accidentally
ignoring the 8.4 style activity log, if someone should downgrade,
whether on purpose, or accidentally because he changed kernel versions
without providing an 8.4 for the new kernel, and the new kernel comes
with in-tree 8.3.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* drbd-8.3:
documentation: Documented detach's --force and disk's --disk-timeout
drbd: Implemented the disk-timeout option
drbd: Force flag for the detach operation
drbd: Allow new IOs while the local disk in in FAILED state
drbd: Bitmap IO functions can not return prematurely if the disk breaks
drbd: Added a kref to bm_aio_ctx
drbd: Hold a reference to ldev while doing meta-data IO
drbd: Keep a reference to the bio until the completion handler finished
drbd: Implemented wait_until_done_or_disk_failure()
drbd: Replaced md_io_mutex by an atomic: md_io_in_use
drbd: moved md_io into mdev
drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
drbd: Keep a reference to barrier acked requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* commit 'ae57a0a':
drbd: Only print sanitize state's warnings, if the state change happens
drbd: we should write meta data updates with FLUSH FUA
drbd: fix limit define, we support 1 PiByte now
drbd: fix log message argument order
drbd: Typo in user-visible message.
drbd: Make "(rcv|snd)buf-size" and "ping-timeout" available for the proxy, too.
drbd: Allow keywords to be used in multiple config sections.
drbd: fix typos in comments.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Preparing to use the same mutex for disk_conf updates
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
There are races where the receiver may be exiting,
but still need the worker to process some stuff.
Do not wait for the receiver to die from an exiting worker.
The receiver must already be dead in case the worker decides to exit.
If the receiver was still alive, it may still want to queue work, and do
drbd_flush_workqueue() from it's disconnect cleanup code,
which would no longer be processed by an exiting worker.
This also would deadlock,
if the worker was to synchornously wait for the receiver to die.
Do not implicitly stop the worker.
The worker will only be stopped from configuration context, from
conn_reconfig_done(), drbd_adm_down() or drbd_adm_delete_connection(),
after making sure the receiver is already stopped.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
If a forced disconnect hits a restarting receiver right after it passed
its final "if (C_DISCONNECTING)" test in drbdd_init(), but before it was
actually restarted by drbd_thread_setup, we could be left with a
connection stuck in C_DISCONNECTING, never reaching C_STANDALONE,
which would be necessary to take it down or reconfigure it.
Move the last cleanup into w_after_conn_state_ch(), and do an additional
state change request in conn_try_disconnect(), just in case.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
If an admin requests disconnect at a time when the state handling
already disconnects/reconnects, there have been some races.
Make sure to always really stop the network threads before
returning success for disconnect. Do not pretend successfull
forced disconnect, if the state handling returned an error.
Return success from drbd_adm_down() only after all threads are finished.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
References hold by:
* Each (running) drbd thread has a reference on tconn
* Each mdev has a referenc on tconn
* Beeing in the all_tconn list counts for one reference
* Each after_conn_state_chg_work has a reference to tconn
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Removing the get_net_conf()/put_net_conf() calls
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
During a disconnect the oc variable in _conn_request_state()
could become outdated. Determin the common old state after
sleeping.
While at it, I implemented that for all parts of the state
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This helps to ensure that we don't miss one of them when changing their
return value semantics.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
It actually returned the lowest volume number. While doing that
renamed a few wrongly named variables.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This commit breaks the API again.
Move per-volume former syncer options into disk_conf.
Move per-connection former syncer options into net_conf.
Renamed the remainign sync_conf to res_opts
Syncer settings have been changeable at runtime, so we need to prepare
for these settings to be runtime-changeable in their new home as well.
Introduce new configuration operations, and share the netlink attribute
between "attach" (create new disk) and "disk-opts" (change options).
Same for "connect" and "net-opts".
Some fields cannot be changed at runtime, however.
Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in
the generated validation and assignment functions.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
somehow a "goto abort" was introduced with commit
drbd: Extracted is_valid_transition() out of sanitize_state()
which left drbd_req_state still holding the spin lock.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Get rid of a temporary variable and, funny bitand assignment.
Just short circuit, returning false, once we encounter the first
still configured volume.
FIXME verify call sites for need of rcu_read_lock or stronger.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* Moved CONFIG_PENDING and DEVICE_DYING from mdev to tconn.
* Renamed drbd_reconfig_start() and drbd_reconfig_done() to
conn_reconfig_start() and conn_reconfig_done().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
That is used for graceful disconnect only
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
In compatibility mode with old DRBDs, use that as the state_mutex
as well.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The lock they constructed is only taken when the state_mutex
was already taken. It is superficial.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>