mirror of https://gitee.com/openkylin/linux.git
srcu: Allow use of Tiny/Tree SRCU from both process and interrupt context
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit719d93cd5f
("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit5a41344a3d
("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise. Unlike Classic and Tree SRCU, Tiny SRCU does increments and decrements on a single variable. Therefore, as Peter Zijlstra pointed out, Tiny SRCU's implementation already supports mixed-context use of srcu_read_lock() and srcu_read_unlock(), at least as long as uses of srcu_read_lock() and srcu_read_unlock() in each handler are nested and paired properly. In other words, it is still illegal to (say) invoke srcu_read_lock() in an interrupt handler and to invoke the matching srcu_read_unlock() in a softirq handler. Therefore, the only change required for Tiny SRCU is to its comments. Fixes:719d93cd5f
("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com>
This commit is contained in:
parent
3c2993b8c6
commit
cdf7abc461
|
@ -97,8 +97,9 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Counts the new reader in the appropriate per-CPU element of the
|
* Counts the new reader in the appropriate per-CPU element of the
|
||||||
* srcu_struct. Must be called from process context.
|
* srcu_struct. Can be invoked from irq/bh handlers, but the matching
|
||||||
* Returns an index that must be passed to the matching srcu_read_unlock().
|
* __srcu_read_unlock() must be in the same handler instance. Returns an
|
||||||
|
* index that must be passed to the matching srcu_read_unlock().
|
||||||
*/
|
*/
|
||||||
int __srcu_read_lock(struct srcu_struct *sp)
|
int __srcu_read_lock(struct srcu_struct *sp)
|
||||||
{
|
{
|
||||||
|
@ -112,7 +113,7 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Removes the count for the old reader from the appropriate element of
|
* Removes the count for the old reader from the appropriate element of
|
||||||
* the srcu_struct. Must be called from process context.
|
* the srcu_struct.
|
||||||
*/
|
*/
|
||||||
void __srcu_read_unlock(struct srcu_struct *sp, int idx)
|
void __srcu_read_unlock(struct srcu_struct *sp, int idx)
|
||||||
{
|
{
|
||||||
|
|
|
@ -357,7 +357,7 @@ EXPORT_SYMBOL_GPL(cleanup_srcu_struct);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Counts the new reader in the appropriate per-CPU element of the
|
* Counts the new reader in the appropriate per-CPU element of the
|
||||||
* srcu_struct. Must be called from process context.
|
* srcu_struct.
|
||||||
* Returns an index that must be passed to the matching srcu_read_unlock().
|
* Returns an index that must be passed to the matching srcu_read_unlock().
|
||||||
*/
|
*/
|
||||||
int __srcu_read_lock(struct srcu_struct *sp)
|
int __srcu_read_lock(struct srcu_struct *sp)
|
||||||
|
@ -365,7 +365,7 @@ int __srcu_read_lock(struct srcu_struct *sp)
|
||||||
int idx;
|
int idx;
|
||||||
|
|
||||||
idx = READ_ONCE(sp->srcu_idx) & 0x1;
|
idx = READ_ONCE(sp->srcu_idx) & 0x1;
|
||||||
__this_cpu_inc(sp->sda->srcu_lock_count[idx]);
|
this_cpu_inc(sp->sda->srcu_lock_count[idx]);
|
||||||
smp_mb(); /* B */ /* Avoid leaking the critical section. */
|
smp_mb(); /* B */ /* Avoid leaking the critical section. */
|
||||||
return idx;
|
return idx;
|
||||||
}
|
}
|
||||||
|
@ -375,7 +375,6 @@ EXPORT_SYMBOL_GPL(__srcu_read_lock);
|
||||||
* Removes the count for the old reader from the appropriate per-CPU
|
* Removes the count for the old reader from the appropriate per-CPU
|
||||||
* element of the srcu_struct. Note that this may well be a different
|
* element of the srcu_struct. Note that this may well be a different
|
||||||
* CPU than that which was incremented by the corresponding srcu_read_lock().
|
* CPU than that which was incremented by the corresponding srcu_read_lock().
|
||||||
* Must be called from process context.
|
|
||||||
*/
|
*/
|
||||||
void __srcu_read_unlock(struct srcu_struct *sp, int idx)
|
void __srcu_read_unlock(struct srcu_struct *sp, int idx)
|
||||||
{
|
{
|
||||||
|
|
Loading…
Reference in New Issue