Merge branches 'cond_resched.2017.12.04a', 'dyntick.2017.11.28a', 'fixes.2017.12.11a', 'srbd.2017.12.05a' and 'torture.2017.12.11a' into HEAD

cond_resched.2017.12.04a: Convert cond_resched_rcu_qs() to cond_resched()
dyntick.2017.11.28a: Make RCU dynticks handle interrupts from NMI
fixes.2017.12.11a: Miscellaneous fixes
srbd.2017.12.05a: Remove now-redundant smp_read_barrier_depends()
torture.2017.12.11a: Torture-testing update
This commit is contained in:
Paul E. McKenney 2017-12-11 09:21:58 -08:00
72 changed files with 482 additions and 729 deletions

View File

@ -1183,8 +1183,8 @@ CPU (and from tracing) unless otherwise stated.
Its fields are as follows:
<pre>
1 int dynticks_nesting;
2 int dynticks_nmi_nesting;
1 long dynticks_nesting;
2 long dynticks_nmi_nesting;
3 atomic_t dynticks;
4 bool rcu_need_heavy_qs;
5 unsigned long rcu_qs_ctr;
@ -1192,15 +1192,31 @@ Its fields are as follows:
</pre>
<p>The <tt>-&gt;dynticks_nesting</tt> field counts the
nesting depth of normal interrupts.
In addition, this counter is incremented when exiting dyntick-idle
mode and decremented when entering it.
nesting depth of process execution, so that in normal circumstances
this counter has value zero or one.
NMIs, irqs, and tracers are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
field.
Because NMIs cannot be masked, changes to this variable have to be
undertaken carefully using an algorithm provided by Andy Lutomirski.
The initial transition from idle adds one, and nested transitions
add two, so that a nesting level of five is represented by a
<tt>-&gt;dynticks_nmi_nesting</tt> value of nine.
This counter can therefore be thought of as counting the number
of reasons why this CPU cannot be permitted to enter dyntick-idle
mode, aside from non-maskable interrupts (NMIs).
NMIs are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
field, except that NMIs that interrupt non-dyntick-idle execution
are not counted.
mode, aside from process-level transitions.
<p>However, it turns out that when running in non-idle kernel context,
the Linux kernel is fully capable of entering interrupt handlers that
never exit and perhaps also vice versa.
Therefore, whenever the <tt>-&gt;dynticks_nesting</tt> field is
incremented up from zero, the <tt>-&gt;dynticks_nmi_nesting</tt> field
is set to a large positive number, and whenever the
<tt>-&gt;dynticks_nesting</tt> field is decremented down to zero,
the the <tt>-&gt;dynticks_nmi_nesting</tt> field is set to zero.
Assuming that the number of misnested interrupts is not sufficient
to overflow the counter, this approach corrects the
<tt>-&gt;dynticks_nmi_nesting</tt> field every time the corresponding
CPU enters the idle loop from process context.
</p><p>The <tt>-&gt;dynticks</tt> field counts the corresponding
CPU's transitions to and from dyntick-idle mode, so that this counter
@ -1232,14 +1248,16 @@ in response.
<tr><th>&nbsp;</th></tr>
<tr><th align="left">Quick Quiz:</th></tr>
<tr><td>
Why not just count all NMIs?
Wouldn't that be simpler and less error prone?
Why not simply combine the <tt>-&gt;dynticks_nesting</tt>
and <tt>-&gt;dynticks_nmi_nesting</tt> counters into a
single counter that just counts the number of reasons that
the corresponding CPU is non-idle?
</td></tr>
<tr><th align="left">Answer:</th></tr>
<tr><td bgcolor="#ffffff"><font color="ffffff">
It seems simpler only until you think hard about how to go about
updating the <tt>rcu_dynticks</tt> structure's
<tt>-&gt;dynticks</tt> field.
Because this would fail in the presence of interrupts whose
handlers never return and of handlers that manage to return
from a made-up interrupt.
</font></td></tr>
<tr><td>&nbsp;</td></tr>
</table>

View File

@ -581,7 +581,8 @@ This guarantee was only partially premeditated.
DYNIX/ptx used an explicit memory barrier for publication, but had nothing
resembling <tt>rcu_dereference()</tt> for subscription, nor did it
have anything resembling the <tt>smp_read_barrier_depends()</tt>
that was later subsumed into <tt>rcu_dereference()</tt>.
that was later subsumed into <tt>rcu_dereference()</tt> and later
still into <tt>READ_ONCE()</tt>.
The need for these operations made itself known quite suddenly at a
late-1990s meeting with the DEC Alpha architects, back in the days when
DEC was still a free-standing company.

View File

@ -122,11 +122,7 @@ o Be very careful about comparing pointers obtained from
Note that if checks for being within an RCU read-side
critical section are not required and the pointer is never
dereferenced, rcu_access_pointer() should be used in place
of rcu_dereference(). The rcu_access_pointer() primitive
does not require an enclosing read-side critical section,
and also omits the smp_read_barrier_depends() included in
rcu_dereference(), which in turn should provide a small
performance gain in some CPUs (e.g., the DEC Alpha).
of rcu_dereference().
o The comparison is against a pointer that references memory
that was initialized "a long time ago." The reason

View File

@ -600,8 +600,7 @@ don't forget about them when submitting patches making use of RCU!]
#define rcu_dereference(p) \
({ \
typeof(p) _________p1 = p; \
smp_read_barrier_depends(); \
typeof(p) _________p1 = READ_ONCE(p); \
(_________p1); \
})

View File

@ -2049,9 +2049,6 @@
This tests the locking primitive's ability to
transition abruptly to and from idle.
locktorture.torture_runnable= [BOOT]
Start locktorture running at boot time.
locktorture.torture_type= [KNL]
Specify the locking implementation to test.
@ -3459,9 +3456,6 @@
the same as for rcuperf.nreaders.
N, where N is the number of CPUs
rcuperf.perf_runnable= [BOOT]
Start rcuperf running at boot time.
rcuperf.perf_type= [KNL]
Specify the RCU implementation to test.
@ -3595,9 +3589,6 @@
Test RCU's dyntick-idle handling. See also the
rcutorture.shuffle_interval parameter.
rcutorture.torture_runnable= [BOOT]
Start rcutorture running at boot time.
rcutorture.torture_type= [KNL]
Specify the RCU implementation to test.

View File

@ -220,8 +220,7 @@ before it writes the new tail pointer, which will erase the item.
Note the use of READ_ONCE() and smp_load_acquire() to read the
opposition index. This prevents the compiler from discarding and
reloading its cached value - which some compilers will do across
smp_read_barrier_depends(). This isn't strictly needed if you can
reloading its cached value. This isn't strictly needed if you can
be sure that the opposition index will _only_ be used the once.
The smp_load_acquire() additionally forces the CPU to order against
subsequent memory references. Similarly, smp_store_release() is used

View File

@ -57,11 +57,6 @@ torture_type Type of lock to torture. By default, only spinlocks will
o "rwsem_lock": read/write down() and up() semaphore pairs.
torture_runnable Start locktorture at boot time in the case where the
module is built into the kernel, otherwise wait for
torture_runnable to be set via sysfs before starting.
By default it will begin once the module is loaded.
** Torture-framework (RCU + locking) **

View File

@ -227,17 +227,20 @@ There are some minimal guarantees that may be expected of a CPU:
(*) On any given CPU, dependent memory accesses will be issued in order, with
respect to itself. This means that for:
Q = READ_ONCE(P); smp_read_barrier_depends(); D = READ_ONCE(*Q);
Q = READ_ONCE(P); D = READ_ONCE(*Q);
the CPU will issue the following memory operations:
Q = LOAD P, D = LOAD *Q
and always in that order. On most systems, smp_read_barrier_depends()
does nothing, but it is required for DEC Alpha. The READ_ONCE()
is required to prevent compiler mischief. Please note that you
should normally use something like rcu_dereference() instead of
open-coding smp_read_barrier_depends().
and always in that order. However, on DEC Alpha, READ_ONCE() also
emits a memory-barrier instruction, so that a DEC Alpha CPU will
instead issue the following memory operations:
Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER
Whether on DEC Alpha or not, the READ_ONCE() also prevents compiler
mischief.
(*) Overlapping loads and stores within a particular CPU will appear to be
ordered within that CPU. This means that for:
@ -1815,7 +1818,7 @@ The Linux kernel has eight basic CPU memory barriers:
GENERAL mb() smp_mb()
WRITE wmb() smp_wmb()
READ rmb() smp_rmb()
DATA DEPENDENCY read_barrier_depends() smp_read_barrier_depends()
DATA DEPENDENCY READ_ONCE()
All memory barriers except the data dependency barriers imply a compiler
@ -2864,7 +2867,10 @@ access depends on a read, not all do, so it may not be relied on.
Other CPUs may also have split caches, but must coordinate between the various
cachelets for normal memory accesses. The semantics of the Alpha removes the
need for coordination in the absence of memory barriers.
need for hardware coordination in the absence of memory barriers, which
permitted Alpha to sport higher CPU clock rates back in the day. However,
please note that smp_read_barrier_depends() should not be used except in
Alpha arch-specific code and within the READ_ONCE() macro.
CACHE COHERENCY VS DMA

View File

@ -8194,6 +8194,7 @@ F: arch/*/include/asm/rwsem.h
F: include/linux/seqlock.h
F: lib/locking*.[ch]
F: kernel/locking/
X: kernel/locking/locktorture.c
LOGICAL DISK MANAGER SUPPORT (LDM, Windows 2000/XP/Vista Dynamic Disks)
M: "Richard Russon (FlatCap)" <ldm@flatcap.org>
@ -11451,15 +11452,6 @@ L: linux-wireless@vger.kernel.org
S: Orphan
F: drivers/net/wireless/ray*
RCUTORTURE MODULE
M: Josh Triplett <josh@joshtriplett.org>
M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
L: linux-kernel@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
F: Documentation/RCU/torture.txt
F: kernel/rcu/rcutorture.c
RCUTORTURE TEST FRAMEWORK
M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
M: Josh Triplett <josh@joshtriplett.org>
@ -13748,6 +13740,18 @@ L: platform-driver-x86@vger.kernel.org
S: Maintained
F: drivers/platform/x86/topstar-laptop.c
TORTURE-TEST MODULES
M: Davidlohr Bueso <dave@stgolabs.net>
M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
M: Josh Triplett <josh@joshtriplett.org>
L: linux-kernel@vger.kernel.org
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
F: Documentation/RCU/torture.txt
F: kernel/torture.c
F: kernel/rcu/rcutorture.c
F: kernel/locking/locktorture.c
TOSHIBA ACPI EXTRAS DRIVER
M: Azael Avalos <coproscefalo@gmail.com>
L: platform-driver-x86@vger.kernel.org

View File

@ -550,7 +550,7 @@ static void mn10300_serial_receive_interrupt(struct mn10300_serial_port *port)
return;
}
smp_read_barrier_depends();
/* READ_ONCE() enforces dependency, but dangerous through integer!!! */
ch = port->rx_buffer[ix++];
st = port->rx_buffer[ix++];
smp_mb();
@ -1728,7 +1728,10 @@ static int mn10300_serial_poll_get_char(struct uart_port *_port)
if (CIRC_CNT(port->rx_inp, ix, MNSC_BUFFER_SIZE) == 0)
return NO_POLL_CHAR;
smp_read_barrier_depends();
/*
* READ_ONCE() enforces dependency, but dangerous
* through integer!!!
*/
ch = port->rx_buffer[ix++];
st = port->rx_buffer[ix++];
smp_mb();

View File

@ -597,7 +597,6 @@ static void __cleanup(struct ioatdma_chan *ioat_chan, dma_addr_t phys_complete)
for (i = 0; i < active && !seen_current; i++) {
struct dma_async_tx_descriptor *tx;
smp_read_barrier_depends();
prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
desc = ioat_get_ring_ent(ioat_chan, idx + i);
dump_desc_dbg(ioat_chan, desc);
@ -715,7 +714,6 @@ static void ioat_abort_descs(struct ioatdma_chan *ioat_chan)
for (i = 1; i < active; i++) {
struct dma_async_tx_descriptor *tx;
smp_read_barrier_depends();
prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
desc = ioat_get_ring_ent(ioat_chan, idx + i);

View File

@ -4,6 +4,7 @@ menuconfig INFINIBAND
depends on NET
depends on INET
depends on m || IPV6 != m
depends on !ALPHA
select IRQ_POLL
---help---
Core support for InfiniBand (IB). Make sure to also select

View File

@ -302,7 +302,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -346,7 +345,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
newreq = 0;
if (qp->s_cur == qp->s_tail) {
/* Check if send work queue is empty. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_tail == READ_ONCE(qp->s_head)) {
clear_ahg(qp);
goto bail;
@ -900,7 +898,6 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd,
}
/* Ensure s_rdma_ack_cnt changes are committed */
smp_read_barrier_depends();
if (qp->s_rdma_ack_cnt) {
hfi1_queue_rc_ack(qp, is_fecn);
return;
@ -1562,7 +1559,6 @@ static void rc_rcv_resp(struct hfi1_packet *packet)
trace_hfi1_ack(qp, psn);
/* Ignore invalid responses. */
smp_read_barrier_depends(); /* see post_one_send */
if (cmp_psn(psn, READ_ONCE(qp->s_next_psn)) >= 0)
goto ack_done;

View File

@ -362,7 +362,6 @@ static void ruc_loopback(struct rvt_qp *sqp)
sqp->s_flags |= RVT_S_BUSY;
again:
smp_read_barrier_depends(); /* see post_one_send() */
if (sqp->s_last == READ_ONCE(sqp->s_head))
goto clr_busy;
wqe = rvt_get_swqe_ptr(sqp, sqp->s_last);

View File

@ -553,7 +553,6 @@ static void sdma_hw_clean_up_task(unsigned long opaque)
static inline struct sdma_txreq *get_txhead(struct sdma_engine *sde)
{
smp_read_barrier_depends(); /* see sdma_update_tail() */
return sde->tx_ring[sde->tx_head & sde->sdma_mask];
}

View File

@ -79,7 +79,6 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -119,7 +118,6 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
RVT_PROCESS_NEXT_SEND_OK))
goto bail;
/* Check if send work queue is empty. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_cur == READ_ONCE(qp->s_head)) {
clear_ahg(qp);
goto bail;

View File

@ -486,7 +486,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -500,7 +499,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
}
/* see post_one_send() */
smp_read_barrier_depends();
if (qp->s_cur == READ_ONCE(qp->s_head))
goto bail;

View File

@ -246,7 +246,6 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -293,7 +292,6 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
newreq = 0;
if (qp->s_cur == qp->s_tail) {
/* Check if send work queue is empty. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_tail == READ_ONCE(qp->s_head))
goto bail;
/*
@ -1340,7 +1338,6 @@ static void qib_rc_rcv_resp(struct qib_ibport *ibp,
goto ack_done;
/* Ignore invalid responses. */
smp_read_barrier_depends(); /* see post_one_send */
if (qib_cmp24(psn, READ_ONCE(qp->s_next_psn)) >= 0)
goto ack_done;

View File

@ -367,7 +367,6 @@ static void qib_ruc_loopback(struct rvt_qp *sqp)
sqp->s_flags |= RVT_S_BUSY;
again:
smp_read_barrier_depends(); /* see post_one_send() */
if (sqp->s_last == READ_ONCE(sqp->s_head))
goto clr_busy;
wqe = rvt_get_swqe_ptr(sqp, sqp->s_last);

View File

@ -60,7 +60,6 @@ int qib_make_uc_req(struct rvt_qp *qp, unsigned long *flags)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -90,7 +89,6 @@ int qib_make_uc_req(struct rvt_qp *qp, unsigned long *flags)
RVT_PROCESS_NEXT_SEND_OK))
goto bail;
/* Check if send work queue is empty. */
smp_read_barrier_depends(); /* see post_one_send() */
if (qp->s_cur == READ_ONCE(qp->s_head))
goto bail;
/*

View File

@ -252,7 +252,6 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
goto bail;
/* We are in the error state, flush the work request. */
smp_read_barrier_depends(); /* see post_one_send */
if (qp->s_last == READ_ONCE(qp->s_head))
goto bail;
/* If DMAs are in progress, we can't flush immediately. */
@ -266,7 +265,6 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
}
/* see post_one_send() */
smp_read_barrier_depends();
if (qp->s_cur == READ_ONCE(qp->s_head))
goto bail;

View File

@ -1684,7 +1684,6 @@ static inline int rvt_qp_is_avail(
/* non-reserved operations */
if (likely(qp->s_avail))
return 0;
smp_read_barrier_depends(); /* see rc.c */
slast = READ_ONCE(qp->s_last);
if (qp->s_head >= slast)
avail = qp->s_size - (qp->s_head - slast);

View File

@ -97,9 +97,7 @@ static int __qed_spq_block(struct qed_hwfn *p_hwfn,
while (iter_cnt--) {
/* Validate we receive completion update */
if (READ_ONCE(comp_done->done) == 1) {
/* Read updated FW return value */
smp_read_barrier_depends();
if (smp_load_acquire(&comp_done->done) == 1) { /* ^^^ */
if (p_fw_ret)
*p_fw_ret = comp_done->fw_return_code;
return 0;

View File

@ -1877,12 +1877,7 @@ static unsigned next_desc(struct vhost_virtqueue *vq, struct vring_desc *desc)
return -1U;
/* Check they're not leading us off end of descriptors. */
next = vhost16_to_cpu(vq, desc->next);
/* Make sure compiler knows to grab that: we don't want it changing! */
/* We will use the result as an index in an array, so most
* architectures only need a compiler barrier here. */
read_barrier_depends();
next = vhost16_to_cpu(vq, READ_ONCE(desc->next));
return next;
}

View File

@ -1636,8 +1636,7 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
dname[name->len] = 0;
/* Make sure we always see the terminating NUL character */
smp_wmb();
dentry->d_name.name = dname;
smp_store_release(&dentry->d_name.name, dname); /* ^^^ */
dentry->d_lockref.count = 1;
dentry->d_flags = 0;
@ -3047,17 +3046,14 @@ static int prepend(char **buffer, int *buflen, const char *str, int namelen)
* retry it again when a d_move() does happen. So any garbage in the buffer
* due to mismatched pointer and length will be discarded.
*
* Data dependency barrier is needed to make sure that we see that terminating
* NUL. Alpha strikes again, film at 11...
* Load acquire is needed to make sure that we see that terminating NUL.
*/
static int prepend_name(char **buffer, int *buflen, const struct qstr *name)
{
const char *dname = READ_ONCE(name->name);
const char *dname = smp_load_acquire(&name->name); /* ^^^ */
u32 dlen = READ_ONCE(name->len);
char *p;
smp_read_barrier_depends();
*buflen -= dlen + 1;
if (*buflen < 0)
return -ENAMETOOLONG;

View File

@ -31,8 +31,7 @@ extern wait_queue_head_t genl_sk_destructing_waitq;
* @p: The pointer to read, prior to dereferencing
*
* Return the value of the specified RCU-protected pointer, but omit
* both the smp_read_barrier_depends() and the READ_ONCE(), because
* caller holds genl mutex.
* the READ_ONCE(), because caller holds genl mutex.
*/
#define genl_dereference(p) \
rcu_dereference_protected(p, lockdep_genl_is_held())

View File

@ -67,8 +67,7 @@ static inline bool lockdep_nfnl_is_held(__u8 subsys_id)
* @ss: The nfnetlink subsystem ID
*
* Return the value of the specified RCU-protected pointer, but omit
* both the smp_read_barrier_depends() and the READ_ONCE(), because
* caller holds the NFNL subsystem mutex.
* the READ_ONCE(), because caller holds the NFNL subsystem mutex.
*/
#define nfnl_dereference(p, ss) \
rcu_dereference_protected(p, lockdep_nfnl_is_held(ss))

View File

@ -139,12 +139,12 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
* when using it as a pointer, __PERCPU_REF_ATOMIC may be set in
* between contaminating the pointer value, meaning that
* READ_ONCE() is required when fetching it.
*
* The smp_read_barrier_depends() implied by READ_ONCE() pairs
* with smp_store_release() in __percpu_ref_switch_to_percpu().
*/
percpu_ptr = READ_ONCE(ref->percpu_count_ptr);
/* paired with smp_store_release() in __percpu_ref_switch_to_percpu() */
smp_read_barrier_depends();
/*
* Theoretically, the following could test just ATOMIC; however,
* then we'd have to mask off DEAD separately as DEAD may be

View File

@ -433,12 +433,12 @@ static inline void rcu_preempt_sleep_check(void) { }
* @p: The pointer to read
*
* Return the value of the specified RCU-protected pointer, but omit the
* smp_read_barrier_depends() and keep the READ_ONCE(). This is useful
* when the value of this pointer is accessed, but the pointer is not
* dereferenced, for example, when testing an RCU-protected pointer against
* NULL. Although rcu_access_pointer() may also be used in cases where
* update-side locks prevent the value of the pointer from changing, you
* should instead use rcu_dereference_protected() for this use case.
* lockdep checks for being in an RCU read-side critical section. This is
* useful when the value of this pointer is accessed, but the pointer is
* not dereferenced, for example, when testing an RCU-protected pointer
* against NULL. Although rcu_access_pointer() may also be used in cases
* where update-side locks prevent the value of the pointer from changing,
* you should instead use rcu_dereference_protected() for this use case.
*
* It is also permissible to use rcu_access_pointer() when read-side
* access to the pointer was removed at least one grace period ago, as
@ -521,12 +521,11 @@ static inline void rcu_preempt_sleep_check(void) { }
* @c: The conditions under which the dereference will take place
*
* Return the value of the specified RCU-protected pointer, but omit
* both the smp_read_barrier_depends() and the READ_ONCE(). This
* is useful in cases where update-side locks prevent the value of the
* pointer from changing. Please note that this primitive does *not*
* prevent the compiler from repeating this reference or combining it
* with other references, so it should not be used without protection
* of appropriate locks.
* the READ_ONCE(). This is useful in cases where update-side locks
* prevent the value of the pointer from changing. Please note that this
* primitive does *not* prevent the compiler from repeating this reference
* or combining it with other references, so it should not be used without
* protection of appropriate locks.
*
* This function is only for update-side use. Using this function
* when protected only by rcu_read_lock() will result in infrequent

View File

@ -111,7 +111,6 @@ static inline void rcu_cpu_stall_reset(void) { }
static inline void rcu_idle_enter(void) { }
static inline void rcu_idle_exit(void) { }
static inline void rcu_irq_enter(void) { }
static inline bool rcu_irq_enter_disabled(void) { return false; }
static inline void rcu_irq_exit_irqson(void) { }
static inline void rcu_irq_enter_irqson(void) { }
static inline void rcu_irq_exit(void) { }

View File

@ -85,7 +85,6 @@ void rcu_irq_enter(void);
void rcu_irq_exit(void);
void rcu_irq_enter_irqson(void);
void rcu_irq_exit_irqson(void);
bool rcu_irq_enter_disabled(void);
void exit_rcu(void);

View File

@ -70,8 +70,7 @@ static inline bool lockdep_rtnl_is_held(void)
* @p: The pointer to read, prior to dereferencing
*
* Return the value of the specified RCU-protected pointer, but omit
* both the smp_read_barrier_depends() and the READ_ONCE(), because
* caller holds RTNL.
* the READ_ONCE(), because caller holds RTNL.
*/
#define rtnl_dereference(p) \
rcu_dereference_protected(p, lockdep_rtnl_is_held())

View File

@ -278,9 +278,8 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)
static inline int raw_read_seqcount_latch(seqcount_t *s)
{
int seq = READ_ONCE(s->sequence);
/* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
smp_read_barrier_depends();
int seq = READ_ONCE(s->sequence); /* ^^^ */
return seq;
}

View File

@ -40,7 +40,7 @@ struct srcu_data {
unsigned long srcu_unlock_count[2]; /* Unlocks per CPU. */
/* Update-side state. */
raw_spinlock_t __private lock ____cacheline_internodealigned_in_smp;
spinlock_t __private lock ____cacheline_internodealigned_in_smp;
struct rcu_segcblist srcu_cblist; /* List of callbacks.*/
unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */
unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */
@ -58,7 +58,7 @@ struct srcu_data {
* Node in SRCU combining tree, similar in function to rcu_data.
*/
struct srcu_node {
raw_spinlock_t __private lock;
spinlock_t __private lock;
unsigned long srcu_have_cbs[4]; /* GP seq for children */
/* having CBs, but only */
/* is > ->srcu_gq_seq. */
@ -78,7 +78,7 @@ struct srcu_struct {
struct srcu_node *level[RCU_NUM_LVLS + 1];
/* First node at each level. */
struct mutex srcu_cb_mutex; /* Serialize CB preparation. */
raw_spinlock_t __private lock; /* Protect counters */
spinlock_t __private lock; /* Protect counters */
struct mutex srcu_gp_mutex; /* Serialize GP work. */
unsigned int srcu_idx; /* Current rdr array element. */
unsigned long srcu_gp_seq; /* Grace-period seq #. */
@ -107,7 +107,7 @@ struct srcu_struct {
#define __SRCU_STRUCT_INIT(name) \
{ \
.sda = &name##_srcu_data, \
.lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
.srcu_gp_seq_needed = 0 - 1, \
__SRCU_DEP_MAP_INIT(name) \
}

View File

@ -79,7 +79,7 @@ void stutter_wait(const char *title);
int torture_stutter_init(int s);
/* Initialization and cleanup. */
bool torture_init_begin(char *ttype, bool v, int *runnable);
bool torture_init_begin(char *ttype, bool v);
void torture_init_end(void);
bool torture_cleanup_begin(void);
void torture_cleanup_end(void);
@ -96,4 +96,10 @@ void _torture_stop_kthread(char *m, struct task_struct **tp);
#define torture_stop_kthread(n, tp) \
_torture_stop_kthread("Stopping " #n " task", &(tp))
#ifdef CONFIG_PREEMPT
#define torture_preempt_schedule() preempt_schedule()
#else
#define torture_preempt_schedule()
#endif
#endif /* __LINUX_TORTURE_H */

View File

@ -137,11 +137,8 @@ extern void syscall_unregfunc(void);
\
if (!(cond)) \
return; \
if (rcucheck) { \
if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
return; \
if (rcucheck) \
rcu_irq_enter_irqson(); \
} \
rcu_read_lock_sched_notrace(); \
it_func_ptr = rcu_dereference_sched((tp)->funcs); \
if (it_func_ptr) { \

View File

@ -243,6 +243,7 @@ TRACE_EVENT(rcu_exp_funnel_lock,
__entry->grphi, __entry->gpevent)
);
#ifdef CONFIG_RCU_NOCB_CPU
/*
* Tracepoint for RCU no-CBs CPU callback handoffs. This event is intended
* to assist debugging of these handoffs.
@ -285,6 +286,7 @@ TRACE_EVENT(rcu_nocb_wake,
TP_printk("%s %d %s", __entry->rcuname, __entry->cpu, __entry->reason)
);
#endif
/*
* Tracepoint for tasks blocking within preemptible-RCU read-side
@ -421,76 +423,40 @@ TRACE_EVENT(rcu_fqs,
/*
* Tracepoint for dyntick-idle entry/exit events. These take a string
* as argument: "Start" for entering dyntick-idle mode, "End" for
* leaving it, "--=" for events moving towards idle, and "++=" for events
* moving away from idle. "Error on entry: not idle task" and "Error on
* exit: not idle task" indicate that a non-idle task is erroneously
* toying with the idle loop.
* as argument: "Start" for entering dyntick-idle mode, "Startirq" for
* entering it from irq/NMI, "End" for leaving it, "Endirq" for leaving it
* to irq/NMI, "--=" for events moving towards idle, and "++=" for events
* moving away from idle.
*
* These events also take a pair of numbers, which indicate the nesting
* depth before and after the event of interest. Note that task-related
* events use the upper bits of each number, while interrupt-related
* events use the lower bits.
* depth before and after the event of interest, and a third number that is
* the ->dynticks counter. Note that task-related and interrupt-related
* events use two separate counters, and that the "++=" and "--=" events
* for irq/NMI will change the counter by two, otherwise by one.
*/
TRACE_EVENT(rcu_dyntick,
TP_PROTO(const char *polarity, long long oldnesting, long long newnesting),
TP_PROTO(const char *polarity, long oldnesting, long newnesting, atomic_t dynticks),
TP_ARGS(polarity, oldnesting, newnesting),
TP_ARGS(polarity, oldnesting, newnesting, dynticks),
TP_STRUCT__entry(
__field(const char *, polarity)
__field(long long, oldnesting)
__field(long long, newnesting)
__field(long, oldnesting)
__field(long, newnesting)
__field(int, dynticks)
),
TP_fast_assign(
__entry->polarity = polarity;
__entry->oldnesting = oldnesting;
__entry->newnesting = newnesting;
__entry->dynticks = atomic_read(&dynticks);
),
TP_printk("%s %llx %llx", __entry->polarity,
__entry->oldnesting, __entry->newnesting)
);
/*
* Tracepoint for RCU preparation for idle, the goal being to get RCU
* processing done so that the current CPU can shut off its scheduling
* clock and enter dyntick-idle mode. One way to accomplish this is
* to drain all RCU callbacks from this CPU, and the other is to have
* done everything RCU requires for the current grace period. In this
* latter case, the CPU will be awakened at the end of the current grace
* period in order to process the remainder of its callbacks.
*
* These tracepoints take a string as argument:
*
* "No callbacks": Nothing to do, no callbacks on this CPU.
* "In holdoff": Nothing to do, holding off after unsuccessful attempt.
* "Begin holdoff": Attempt failed, don't retry until next jiffy.
* "Dyntick with callbacks": Entering dyntick-idle despite callbacks.
* "Dyntick with lazy callbacks": Entering dyntick-idle w/lazy callbacks.
* "More callbacks": Still more callbacks, try again to clear them out.
* "Callbacks drained": All callbacks processed, off to dyntick idle!
* "Timer": Timer fired to cause CPU to continue processing callbacks.
* "Demigrate": Timer fired on wrong CPU, woke up correct CPU.
* "Cleanup after idle": Idle exited, timer canceled.
*/
TRACE_EVENT(rcu_prep_idle,
TP_PROTO(const char *reason),
TP_ARGS(reason),
TP_STRUCT__entry(
__field(const char *, reason)
),
TP_fast_assign(
__entry->reason = reason;
),
TP_printk("%s", __entry->reason)
TP_printk("%s %lx %lx %#3x", __entry->polarity,
__entry->oldnesting, __entry->newnesting,
__entry->dynticks & 0xfff)
);
/*
@ -799,8 +765,7 @@ TRACE_EVENT(rcu_barrier,
grplo, grphi, gp_tasks) do { } \
while (0)
#define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0)
#define trace_rcu_dyntick(polarity, oldnesting, newnesting) do { } while (0)
#define trace_rcu_prep_idle(reason) do { } while (0)
#define trace_rcu_dyntick(polarity, oldnesting, newnesting, dyntick) do { } while (0)
#define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0)
#define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \
do { } while (0)

View File

@ -1167,8 +1167,8 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
}
ret = 0;
smp_wmb(); /* pairs with get_xol_area() */
mm->uprobes_state.xol_area = area;
/* pairs with get_xol_area() */
smp_store_release(&mm->uprobes_state.xol_area, area); /* ^^^ */
fail:
up_write(&mm->mmap_sem);
@ -1230,8 +1230,8 @@ static struct xol_area *get_xol_area(void)
if (!mm->uprobes_state.xol_area)
__create_xol_area(0);
area = mm->uprobes_state.xol_area;
smp_read_barrier_depends(); /* pairs with wmb in xol_add_vma() */
/* Pairs with xol_add_vma() smp_store_release() */
area = READ_ONCE(mm->uprobes_state.xol_area); /* ^^^ */
return area;
}
@ -1528,8 +1528,8 @@ static unsigned long get_trampoline_vaddr(void)
struct xol_area *area;
unsigned long trampoline_vaddr = -1;
area = current->mm->uprobes_state.xol_area;
smp_read_barrier_depends();
/* Pairs with xol_add_vma() smp_store_release() */
area = READ_ONCE(current->mm->uprobes_state.xol_area); /* ^^^ */
if (area)
trampoline_vaddr = area->vaddr;

View File

@ -77,10 +77,6 @@ struct lock_stress_stats {
long n_lock_acquired;
};
int torture_runnable = IS_ENABLED(MODULE);
module_param(torture_runnable, int, 0444);
MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init");
/* Forward reference. */
static void lock_torture_cleanup(void);
@ -130,10 +126,8 @@ static void torture_lock_busted_write_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2000 * longdelay_ms)))
mdelay(longdelay_ms);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_lock_busted_write_unlock(void)
@ -179,10 +173,8 @@ static void torture_spin_lock_write_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2 * shortdelay_us)))
udelay(shortdelay_us);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
@ -352,10 +344,8 @@ static void torture_mutex_delay(struct torture_random_state *trsp)
mdelay(longdelay_ms * 5);
else
mdelay(longdelay_ms / 5);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_mutex_unlock(void) __releases(torture_mutex)
@ -507,10 +497,8 @@ static void torture_rtmutex_delay(struct torture_random_state *trsp)
if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2 * shortdelay_us)))
udelay(shortdelay_us);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_rtmutex_unlock(void) __releases(torture_rtmutex)
@ -547,10 +535,8 @@ static void torture_rwsem_write_delay(struct torture_random_state *trsp)
mdelay(longdelay_ms * 10);
else
mdelay(longdelay_ms / 10);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_rwsem_up_write(void) __releases(torture_rwsem)
@ -570,14 +556,12 @@ static void torture_rwsem_read_delay(struct torture_random_state *trsp)
/* We want a long delay occasionally to force massive contention. */
if (!(torture_random(trsp) %
(cxt.nrealwriters_stress * 2000 * longdelay_ms)))
(cxt.nrealreaders_stress * 2000 * longdelay_ms)))
mdelay(longdelay_ms * 2);
else
mdelay(longdelay_ms / 2);
#ifdef CONFIG_PREEMPT
if (!(torture_random(trsp) % (cxt.nrealreaders_stress * 20000)))
preempt_schedule(); /* Allow test to be preempted. */
#endif
torture_preempt_schedule(); /* Allow test to be preempted. */
}
static void torture_rwsem_up_read(void) __releases(torture_rwsem)
@ -715,8 +699,7 @@ static void __torture_print_stats(char *page,
{
bool fail = 0;
int i, n_stress;
long max = 0;
long min = statp[0].n_lock_acquired;
long max = 0, min = statp ? statp[0].n_lock_acquired : 0;
long long sum = 0;
n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
@ -823,7 +806,7 @@ static void lock_torture_cleanup(void)
* such, only perform the underlying torture-specific cleanups,
* and avoid anything related to locktorture.
*/
if (!cxt.lwsa)
if (!cxt.lwsa && !cxt.lrsa)
goto end;
if (writer_tasks) {
@ -879,7 +862,7 @@ static int __init lock_torture_init(void)
&percpu_rwsem_lock_ops,
};
if (!torture_init_begin(torture_type, verbose, &torture_runnable))
if (!torture_init_begin(torture_type, verbose))
return -EBUSY;
/* Process args and tell the world that the torturer is on the job. */
@ -898,6 +881,13 @@ static int __init lock_torture_init(void)
firsterr = -EINVAL;
goto unwind;
}
if (nwriters_stress == 0 && nreaders_stress == 0) {
pr_alert("lock-torture: must run at least one locking thread\n");
firsterr = -EINVAL;
goto unwind;
}
if (cxt.cur_ops->init)
cxt.cur_ops->init();
@ -921,7 +911,7 @@ static int __init lock_torture_init(void)
#endif
/* Initialize the statistics so that each run gets its own numbers. */
if (nwriters_stress) {
lock_is_write_held = 0;
cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
if (cxt.lwsa == NULL) {
@ -929,10 +919,12 @@ static int __init lock_torture_init(void)
firsterr = -ENOMEM;
goto unwind;
}
for (i = 0; i < cxt.nrealwriters_stress; i++) {
cxt.lwsa[i].n_lock_fail = 0;
cxt.lwsa[i].n_lock_acquired = 0;
}
}
if (cxt.cur_ops->readlock) {
if (nreaders_stress >= 0)
@ -948,6 +940,7 @@ static int __init lock_torture_init(void)
cxt.nrealreaders_stress = cxt.nrealwriters_stress;
}
if (nreaders_stress) {
lock_is_read_held = 0;
cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
if (cxt.lrsa == NULL) {
@ -963,6 +956,7 @@ static int __init lock_torture_init(void)
cxt.lrsa[i].n_lock_acquired = 0;
}
}
}
lock_torture_print_module_parms(cxt.cur_ops, "Start of test");
@ -990,6 +984,7 @@ static int __init lock_torture_init(void)
goto unwind;
}
if (nwriters_stress) {
writer_tasks = kzalloc(cxt.nrealwriters_stress * sizeof(writer_tasks[0]),
GFP_KERNEL);
if (writer_tasks == NULL) {
@ -997,6 +992,7 @@ static int __init lock_torture_init(void)
firsterr = -ENOMEM;
goto unwind;
}
}
if (cxt.cur_ops->readlock) {
reader_tasks = kzalloc(cxt.nrealreaders_stress * sizeof(reader_tasks[0]),

View File

@ -170,7 +170,7 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
* @tail : The new queue tail code word
* Return: The previous queue tail code word
*
* xchg(lock, tail)
* xchg(lock, tail), which heads an address dependency
*
* p,*,* -> n,*,* ; prev = xchg(lock, node)
*/
@ -409,13 +409,11 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
if (old & _Q_TAIL_MASK) {
prev = decode_tail(old);
/*
* The above xchg_tail() is also a load of @lock which generates,
* through decode_tail(), a pointer.
*
* The address dependency matches the RELEASE of xchg_tail()
* such that the access to @prev must happen after.
* The above xchg_tail() is also a load of @lock which
* generates, through decode_tail(), a pointer. The address
* dependency matches the RELEASE of xchg_tail() such that
* the subsequent access to @prev happens after.
*/
smp_read_barrier_depends();
WRITE_ONCE(prev->next, node);

View File

@ -30,31 +30,8 @@
#define RCU_TRACE(stmt)
#endif /* #else #ifdef CONFIG_RCU_TRACE */
/*
* Process-level increment to ->dynticks_nesting field. This allows for
* architectures that use half-interrupts and half-exceptions from
* process context.
*
* DYNTICK_TASK_NEST_MASK defines a field of width DYNTICK_TASK_NEST_WIDTH
* that counts the number of process-based reasons why RCU cannot
* consider the corresponding CPU to be idle, and DYNTICK_TASK_NEST_VALUE
* is the value used to increment or decrement this field.
*
* The rest of the bits could in principle be used to count interrupts,
* but this would mean that a negative-one value in the interrupt
* field could incorrectly zero out the DYNTICK_TASK_NEST_MASK field.
* We therefore provide a two-bit guard field defined by DYNTICK_TASK_MASK
* that is set to DYNTICK_TASK_FLAG upon initial exit from idle.
* The DYNTICK_TASK_EXIT_IDLE value is thus the combined value used upon
* initial exit from idle.
*/
#define DYNTICK_TASK_NEST_WIDTH 7
#define DYNTICK_TASK_NEST_VALUE ((LLONG_MAX >> DYNTICK_TASK_NEST_WIDTH) + 1)
#define DYNTICK_TASK_NEST_MASK (LLONG_MAX - DYNTICK_TASK_NEST_VALUE + 1)
#define DYNTICK_TASK_FLAG ((DYNTICK_TASK_NEST_VALUE / 8) * 2)
#define DYNTICK_TASK_MASK ((DYNTICK_TASK_NEST_VALUE / 8) * 3)
#define DYNTICK_TASK_EXIT_IDLE (DYNTICK_TASK_NEST_VALUE + \
DYNTICK_TASK_FLAG)
/* Offset to allow for unmatched rcu_irq_{enter,exit}(). */
#define DYNTICK_IRQ_NONIDLE ((LONG_MAX / 2) + 1)
/*

View File

@ -106,10 +106,6 @@ static int rcu_perf_writer_state;
#define MAX_MEAS 10000
#define MIN_MEAS 100
static int perf_runnable = IS_ENABLED(MODULE);
module_param(perf_runnable, int, 0444);
MODULE_PARM_DESC(perf_runnable, "Start rcuperf at boot");
/*
* Operations vector for selecting different types of tests.
*/
@ -646,7 +642,7 @@ rcu_perf_init(void)
&tasks_ops,
};
if (!torture_init_begin(perf_type, verbose, &perf_runnable))
if (!torture_init_begin(perf_type, verbose))
return -EBUSY;
/* Process args and tell the world that the perf'er is on the job. */

View File

@ -187,10 +187,6 @@ static const char *rcu_torture_writer_state_getname(void)
return rcu_torture_writer_state_names[i];
}
static int torture_runnable = IS_ENABLED(MODULE);
module_param(torture_runnable, int, 0444);
MODULE_PARM_DESC(torture_runnable, "Start rcutorture at boot");
#if defined(CONFIG_RCU_BOOST) && !defined(CONFIG_HOTPLUG_CPU)
#define rcu_can_boost() 1
#else /* #if defined(CONFIG_RCU_BOOST) && !defined(CONFIG_HOTPLUG_CPU) */
@ -315,11 +311,9 @@ static void rcu_read_delay(struct torture_random_state *rrsp)
}
if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us)))
udelay(shortdelay_us);
#ifdef CONFIG_PREEMPT
if (!preempt_count() &&
!(torture_random(rrsp) % (nrealreaders * 20000)))
preempt_schedule(); /* No QS if preempt_disable() in effect */
#endif
!(torture_random(rrsp) % (nrealreaders * 500)))
torture_preempt_schedule(); /* QS only if preemptible. */
}
static void rcu_torture_read_unlock(int idx) __releases(RCU)
@ -1731,7 +1725,7 @@ rcu_torture_init(void)
&sched_ops, &tasks_ops,
};
if (!torture_init_begin(torture_type, verbose, &torture_runnable))
if (!torture_init_begin(torture_type, verbose))
return -EBUSY;
/* Process args and tell the world that the torturer is on the job. */

View File

@ -53,6 +53,33 @@ static void srcu_invoke_callbacks(struct work_struct *work);
static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay);
static void process_srcu(struct work_struct *work);
/* Wrappers for lock acquisition and release, see raw_spin_lock_rcu_node(). */
#define spin_lock_rcu_node(p) \
do { \
spin_lock(&ACCESS_PRIVATE(p, lock)); \
smp_mb__after_unlock_lock(); \
} while (0)
#define spin_unlock_rcu_node(p) spin_unlock(&ACCESS_PRIVATE(p, lock))
#define spin_lock_irq_rcu_node(p) \
do { \
spin_lock_irq(&ACCESS_PRIVATE(p, lock)); \
smp_mb__after_unlock_lock(); \
} while (0)
#define spin_unlock_irq_rcu_node(p) \
spin_unlock_irq(&ACCESS_PRIVATE(p, lock))
#define spin_lock_irqsave_rcu_node(p, flags) \
do { \
spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \
smp_mb__after_unlock_lock(); \
} while (0)
#define spin_unlock_irqrestore_rcu_node(p, flags) \
spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \
/*
* Initialize SRCU combining tree. Note that statically allocated
* srcu_struct structures might already have srcu_read_lock() and
@ -77,7 +104,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
/* Each pass through this loop initializes one srcu_node structure. */
rcu_for_each_node_breadth_first(sp, snp) {
raw_spin_lock_init(&ACCESS_PRIVATE(snp, lock));
spin_lock_init(&ACCESS_PRIVATE(snp, lock));
WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
ARRAY_SIZE(snp->srcu_data_have_cbs));
for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) {
@ -111,7 +138,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
snp_first = sp->level[level];
for_each_possible_cpu(cpu) {
sdp = per_cpu_ptr(sp->sda, cpu);
raw_spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
rcu_segcblist_init(&sdp->srcu_cblist);
sdp->srcu_cblist_invoking = false;
sdp->srcu_gp_seq_needed = sp->srcu_gp_seq;
@ -170,7 +197,7 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
/* Don't re-initialize a lock while it is held. */
debug_check_no_locks_freed((void *)sp, sizeof(*sp));
lockdep_init_map(&sp->dep_map, name, key, 0);
raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
spin_lock_init(&ACCESS_PRIVATE(sp, lock));
return init_srcu_struct_fields(sp, false);
}
EXPORT_SYMBOL_GPL(__init_srcu_struct);
@ -187,7 +214,7 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
*/
int init_srcu_struct(struct srcu_struct *sp)
{
raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
spin_lock_init(&ACCESS_PRIVATE(sp, lock));
return init_srcu_struct_fields(sp, false);
}
EXPORT_SYMBOL_GPL(init_srcu_struct);
@ -210,13 +237,13 @@ static void check_init_srcu_struct(struct srcu_struct *sp)
/* The smp_load_acquire() pairs with the smp_store_release(). */
if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/
return; /* Already initialized. */
raw_spin_lock_irqsave_rcu_node(sp, flags);
spin_lock_irqsave_rcu_node(sp, flags);
if (!rcu_seq_state(sp->srcu_gp_seq_needed)) {
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
spin_unlock_irqrestore_rcu_node(sp, flags);
return;
}
init_srcu_struct_fields(sp, true);
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
spin_unlock_irqrestore_rcu_node(sp, flags);
}
/*
@ -513,7 +540,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
mutex_lock(&sp->srcu_cb_mutex);
/* End the current grace period. */
raw_spin_lock_irq_rcu_node(sp);
spin_lock_irq_rcu_node(sp);
idx = rcu_seq_state(sp->srcu_gp_seq);
WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
cbdelay = srcu_get_delay(sp);
@ -522,7 +549,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
gpseq = rcu_seq_current(&sp->srcu_gp_seq);
if (ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, gpseq))
sp->srcu_gp_seq_needed_exp = gpseq;
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
mutex_unlock(&sp->srcu_gp_mutex);
/* A new grace period can start at this point. But only one. */
@ -530,7 +557,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
idxnext = (idx + 1) % ARRAY_SIZE(snp->srcu_have_cbs);
rcu_for_each_node_breadth_first(sp, snp) {
raw_spin_lock_irq_rcu_node(snp);
spin_lock_irq_rcu_node(snp);
cbs = false;
if (snp >= sp->level[rcu_num_lvls - 1])
cbs = snp->srcu_have_cbs[idx] == gpseq;
@ -540,7 +567,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
snp->srcu_gp_seq_needed_exp = gpseq;
mask = snp->srcu_data_have_cbs[idx];
snp->srcu_data_have_cbs[idx] = 0;
raw_spin_unlock_irq_rcu_node(snp);
spin_unlock_irq_rcu_node(snp);
if (cbs)
srcu_schedule_cbs_snp(sp, snp, mask, cbdelay);
@ -548,11 +575,11 @@ static void srcu_gp_end(struct srcu_struct *sp)
if (!(gpseq & counter_wrap_check))
for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
sdp = per_cpu_ptr(sp->sda, cpu);
raw_spin_lock_irqsave_rcu_node(sdp, flags);
spin_lock_irqsave_rcu_node(sdp, flags);
if (ULONG_CMP_GE(gpseq,
sdp->srcu_gp_seq_needed + 100))
sdp->srcu_gp_seq_needed = gpseq;
raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
spin_unlock_irqrestore_rcu_node(sdp, flags);
}
}
@ -560,17 +587,17 @@ static void srcu_gp_end(struct srcu_struct *sp)
mutex_unlock(&sp->srcu_cb_mutex);
/* Start a new grace period if needed. */
raw_spin_lock_irq_rcu_node(sp);
spin_lock_irq_rcu_node(sp);
gpseq = rcu_seq_current(&sp->srcu_gp_seq);
if (!rcu_seq_state(gpseq) &&
ULONG_CMP_LT(gpseq, sp->srcu_gp_seq_needed)) {
srcu_gp_start(sp);
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
/* Throttle expedited grace periods: Should be rare! */
srcu_reschedule(sp, rcu_seq_ctr(gpseq) & 0x3ff
? 0 : SRCU_INTERVAL);
} else {
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
}
}
@ -590,18 +617,18 @@ static void srcu_funnel_exp_start(struct srcu_struct *sp, struct srcu_node *snp,
if (rcu_seq_done(&sp->srcu_gp_seq, s) ||
ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
return;
raw_spin_lock_irqsave_rcu_node(snp, flags);
spin_lock_irqsave_rcu_node(snp, flags);
if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
spin_unlock_irqrestore_rcu_node(snp, flags);
return;
}
WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
spin_unlock_irqrestore_rcu_node(snp, flags);
}
raw_spin_lock_irqsave_rcu_node(sp, flags);
spin_lock_irqsave_rcu_node(sp, flags);
if (!ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, s))
sp->srcu_gp_seq_needed_exp = s;
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
spin_unlock_irqrestore_rcu_node(sp, flags);
}
/*
@ -623,12 +650,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
for (; snp != NULL; snp = snp->srcu_parent) {
if (rcu_seq_done(&sp->srcu_gp_seq, s) && snp != sdp->mynode)
return; /* GP already done and CBs recorded. */
raw_spin_lock_irqsave_rcu_node(snp, flags);
spin_lock_irqsave_rcu_node(snp, flags);
if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
snp_seq = snp->srcu_have_cbs[idx];
if (snp == sdp->mynode && snp_seq == s)
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
spin_unlock_irqrestore_rcu_node(snp, flags);
if (snp == sdp->mynode && snp_seq != s) {
srcu_schedule_cbs_sdp(sdp, do_norm
? SRCU_INTERVAL
@ -644,11 +671,11 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
snp->srcu_gp_seq_needed_exp = s;
raw_spin_unlock_irqrestore_rcu_node(snp, flags);
spin_unlock_irqrestore_rcu_node(snp, flags);
}
/* Top of tree, must ensure the grace period will be started. */
raw_spin_lock_irqsave_rcu_node(sp, flags);
spin_lock_irqsave_rcu_node(sp, flags);
if (ULONG_CMP_LT(sp->srcu_gp_seq_needed, s)) {
/*
* Record need for grace period s. Pair with load
@ -667,7 +694,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
queue_delayed_work(system_power_efficient_wq, &sp->work,
srcu_get_delay(sp));
}
raw_spin_unlock_irqrestore_rcu_node(sp, flags);
spin_unlock_irqrestore_rcu_node(sp, flags);
}
/*
@ -830,7 +857,7 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
rhp->func = func;
local_irq_save(flags);
sdp = this_cpu_ptr(sp->sda);
raw_spin_lock_rcu_node(sdp);
spin_lock_rcu_node(sdp);
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp, false);
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&sp->srcu_gp_seq));
@ -844,7 +871,7 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
sdp->srcu_gp_seq_needed_exp = s;
needexp = true;
}
raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
spin_unlock_irqrestore_rcu_node(sdp, flags);
if (needgp)
srcu_funnel_gp_start(sp, sdp, s, do_norm);
else if (needexp)
@ -900,7 +927,7 @@ static void __synchronize_srcu(struct srcu_struct *sp, bool do_norm)
/*
* Make sure that later code is ordered after the SRCU grace
* period. This pairs with the raw_spin_lock_irq_rcu_node()
* period. This pairs with the spin_lock_irq_rcu_node()
* in srcu_invoke_callbacks(). Unlike Tree RCU, this is needed
* because the current CPU might have been totally uninvolved with
* (and thus unordered against) that grace period.
@ -1024,7 +1051,7 @@ void srcu_barrier(struct srcu_struct *sp)
*/
for_each_possible_cpu(cpu) {
sdp = per_cpu_ptr(sp->sda, cpu);
raw_spin_lock_irq_rcu_node(sdp);
spin_lock_irq_rcu_node(sdp);
atomic_inc(&sp->srcu_barrier_cpu_cnt);
sdp->srcu_barrier_head.func = srcu_barrier_cb;
debug_rcu_head_queue(&sdp->srcu_barrier_head);
@ -1033,7 +1060,7 @@ void srcu_barrier(struct srcu_struct *sp)
debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
atomic_dec(&sp->srcu_barrier_cpu_cnt);
}
raw_spin_unlock_irq_rcu_node(sdp);
spin_unlock_irq_rcu_node(sdp);
}
/* Remove the initial count, at which point reaching zero can happen. */
@ -1082,17 +1109,17 @@ static void srcu_advance_state(struct srcu_struct *sp)
*/
idx = rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq)); /* ^^^ */
if (idx == SRCU_STATE_IDLE) {
raw_spin_lock_irq_rcu_node(sp);
spin_lock_irq_rcu_node(sp);
if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq));
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
mutex_unlock(&sp->srcu_gp_mutex);
return;
}
idx = rcu_seq_state(READ_ONCE(sp->srcu_gp_seq));
if (idx == SRCU_STATE_IDLE)
srcu_gp_start(sp);
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
if (idx != SRCU_STATE_IDLE) {
mutex_unlock(&sp->srcu_gp_mutex);
return; /* Someone else started the grace period. */
@ -1141,19 +1168,19 @@ static void srcu_invoke_callbacks(struct work_struct *work)
sdp = container_of(work, struct srcu_data, work.work);
sp = sdp->sp;
rcu_cblist_init(&ready_cbs);
raw_spin_lock_irq_rcu_node(sdp);
spin_lock_irq_rcu_node(sdp);
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&sp->srcu_gp_seq));
if (sdp->srcu_cblist_invoking ||
!rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
raw_spin_unlock_irq_rcu_node(sdp);
spin_unlock_irq_rcu_node(sdp);
return; /* Someone else on the job or nothing to do. */
}
/* We are on the job! Extract and invoke ready callbacks. */
sdp->srcu_cblist_invoking = true;
rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs);
raw_spin_unlock_irq_rcu_node(sdp);
spin_unlock_irq_rcu_node(sdp);
rhp = rcu_cblist_dequeue(&ready_cbs);
for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
debug_rcu_head_unqueue(rhp);
@ -1166,13 +1193,13 @@ static void srcu_invoke_callbacks(struct work_struct *work)
* Update counts, accelerate new callbacks, and if needed,
* schedule another round of callback invocation.
*/
raw_spin_lock_irq_rcu_node(sdp);
spin_lock_irq_rcu_node(sdp);
rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
rcu_seq_snap(&sp->srcu_gp_seq));
sdp->srcu_cblist_invoking = false;
more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
raw_spin_unlock_irq_rcu_node(sdp);
spin_unlock_irq_rcu_node(sdp);
if (more)
srcu_schedule_cbs_sdp(sdp, 0);
}
@ -1185,7 +1212,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
{
bool pushgp = true;
raw_spin_lock_irq_rcu_node(sp);
spin_lock_irq_rcu_node(sp);
if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
if (!WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq))) {
/* All requests fulfilled, time to go idle. */
@ -1195,7 +1222,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
/* Outstanding request and no GP. Start one. */
srcu_gp_start(sp);
}
raw_spin_unlock_irq_rcu_node(sp);
spin_unlock_irq_rcu_node(sp);
if (pushgp)
queue_delayed_work(system_power_efficient_wq, &sp->work, delay);

View File

@ -265,24 +265,11 @@ void rcu_bh_qs(void)
#endif
static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
.dynticks_nesting = 1,
.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
.dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
};
/*
* There's a few places, currently just in the tracing infrastructure,
* that uses rcu_irq_enter() to make sure RCU is watching. But there's
* a small location where that will not even work. In those cases
* rcu_irq_enter_disabled() needs to be checked to make sure rcu_irq_enter()
* can be called.
*/
static DEFINE_PER_CPU(bool, disable_rcu_irq_enter);
bool rcu_irq_enter_disabled(void)
{
return this_cpu_read(disable_rcu_irq_enter);
}
/*
* Record entry into an extended quiescent state. This is only to be
* called when not already in an extended quiescent state.
@ -762,68 +749,39 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
}
/*
* rcu_eqs_enter_common - current CPU is entering an extended quiescent state
* Enter an RCU extended quiescent state, which can be either the
* idle loop or adaptive-tickless usermode execution.
*
* Enter idle, doing appropriate accounting. The caller must have
* disabled interrupts.
* We crowbar the ->dynticks_nmi_nesting field to zero to allow for
* the possibility of usermode upcalls having messed up our count
* of interrupt nesting level during the prior busy period.
*/
static void rcu_eqs_enter_common(bool user)
static void rcu_eqs_enter(bool user)
{
struct rcu_state *rsp;
struct rcu_data *rdp;
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
struct rcu_dynticks *rdtp;
rdtp = this_cpu_ptr(&rcu_dynticks);
WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
rdtp->dynticks_nesting == 0);
if (rdtp->dynticks_nesting != 1) {
rdtp->dynticks_nesting--;
return;
}
lockdep_assert_irqs_disabled();
trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
!user && !is_idle_task(current)) {
struct task_struct *idle __maybe_unused =
idle_task(smp_processor_id());
trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
rcu_ftrace_dump(DUMP_ORIG);
WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
current->pid, current->comm,
idle->pid, idle->comm); /* must be idle task! */
}
trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
for_each_rcu_flavor(rsp) {
rdp = this_cpu_ptr(rsp->rda);
do_nocb_deferred_wakeup(rdp);
}
rcu_prepare_for_idle();
__this_cpu_inc(disable_rcu_irq_enter);
rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
__this_cpu_dec(disable_rcu_irq_enter);
WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
rcu_dynticks_eqs_enter();
rcu_dynticks_task_enter();
/*
* It is illegal to enter an extended quiescent state while
* in an RCU read-side critical section.
*/
RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map),
"Illegal idle entry in RCU read-side critical section.");
RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),
"Illegal idle entry in RCU-bh read-side critical section.");
RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map),
"Illegal idle entry in RCU-sched read-side critical section.");
}
/*
* Enter an RCU extended quiescent state, which can be either the
* idle loop or adaptive-tickless usermode execution.
*/
static void rcu_eqs_enter(bool user)
{
struct rcu_dynticks *rdtp;
rdtp = this_cpu_ptr(&rcu_dynticks);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
(rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
rcu_eqs_enter_common(user);
else
rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
}
/**
@ -834,10 +792,6 @@ static void rcu_eqs_enter(bool user)
* critical sections can occur in irq handlers in idle, a possibility
* handled by irq_enter() and irq_exit().)
*
* We crowbar the ->dynticks_nesting field to zero to allow for
* the possibility of usermode upcalls having messed up our count
* of interrupt nesting level during the prior busy period.
*
* If you add or remove a call to rcu_idle_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
@ -866,6 +820,46 @@ void rcu_user_enter(void)
}
#endif /* CONFIG_NO_HZ_FULL */
/**
* rcu_nmi_exit - inform RCU of exit from NMI context
*
* If we are returning from the outermost NMI handler that interrupted an
* RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
* to let the RCU grace-period handling know that the CPU is back to
* being RCU-idle.
*
* If you add or remove a call to rcu_nmi_exit(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
void rcu_nmi_exit(void)
{
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
/*
* Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
* (We are exiting an NMI handler, so RCU better be paying attention
* to us!)
*/
WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
/*
* If the nesting level is not 1, the CPU wasn't RCU-idle, so
* leave it in non-RCU-idle state.
*/
if (rdtp->dynticks_nmi_nesting != 1) {
trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nmi_nesting, rdtp->dynticks_nmi_nesting - 2, rdtp->dynticks);
WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* No store tearing. */
rdtp->dynticks_nmi_nesting - 2);
return;
}
/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
trace_rcu_dyntick(TPS("Startirq"), rdtp->dynticks_nmi_nesting, 0, rdtp->dynticks);
WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
rcu_dynticks_eqs_enter();
}
/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
*
@ -875,8 +869,8 @@ void rcu_user_enter(void)
*
* This code assumes that the idle loop never does anything that might
* result in unbalanced calls to irq_enter() and irq_exit(). If your
* architecture violates this assumption, RCU will give you what you
* deserve, good and hard. But very infrequently and irreproducibly.
* architecture's idle loop violates this assumption, RCU will give you what
* you deserve, good and hard. But very infrequently and irreproducibly.
*
* Use things like work queues to work around this limitation.
*
@ -887,23 +881,14 @@ void rcu_user_enter(void)
*/
void rcu_irq_exit(void)
{
struct rcu_dynticks *rdtp;
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
lockdep_assert_irqs_disabled();
rdtp = this_cpu_ptr(&rcu_dynticks);
/* Page faults can happen in NMI handlers, so check... */
if (rdtp->dynticks_nmi_nesting)
return;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
rdtp->dynticks_nesting < 1);
if (rdtp->dynticks_nesting <= 1) {
rcu_eqs_enter_common(true);
} else {
trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
rdtp->dynticks_nesting--;
}
if (rdtp->dynticks_nmi_nesting == 1)
rcu_prepare_for_idle();
rcu_nmi_exit();
if (rdtp->dynticks_nmi_nesting == 0)
rcu_dynticks_task_enter();
}
/*
@ -921,56 +906,34 @@ void rcu_irq_exit_irqson(void)
local_irq_restore(flags);
}
/*
* rcu_eqs_exit_common - current CPU moving away from extended quiescent state
*
* If the new value of the ->dynticks_nesting counter was previously zero,
* we really have exited idle, and must do the appropriate accounting.
* The caller must have disabled interrupts.
*/
static void rcu_eqs_exit_common(long long oldval, int user)
{
RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
rcu_dynticks_task_exit();
rcu_dynticks_eqs_exit();
rcu_cleanup_after_idle();
trace_rcu_dyntick(TPS("End"), oldval, rdtp->dynticks_nesting);
if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
!user && !is_idle_task(current)) {
struct task_struct *idle __maybe_unused =
idle_task(smp_processor_id());
trace_rcu_dyntick(TPS("Error on exit: not idle task"),
oldval, rdtp->dynticks_nesting);
rcu_ftrace_dump(DUMP_ORIG);
WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
current->pid, current->comm,
idle->pid, idle->comm); /* must be idle task! */
}
}
/*
* Exit an RCU extended quiescent state, which can be either the
* idle loop or adaptive-tickless usermode execution.
*
* We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
* allow for the possibility of usermode upcalls messing up our count of
* interrupt nesting level during the busy period that is just now starting.
*/
static void rcu_eqs_exit(bool user)
{
struct rcu_dynticks *rdtp;
long long oldval;
long oldval;
lockdep_assert_irqs_disabled();
rdtp = this_cpu_ptr(&rcu_dynticks);
oldval = rdtp->dynticks_nesting;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
if (oldval & DYNTICK_TASK_NEST_MASK) {
rdtp->dynticks_nesting += DYNTICK_TASK_NEST_VALUE;
} else {
__this_cpu_inc(disable_rcu_irq_enter);
rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
rcu_eqs_exit_common(oldval, user);
__this_cpu_dec(disable_rcu_irq_enter);
if (oldval) {
rdtp->dynticks_nesting++;
return;
}
rcu_dynticks_task_exit();
rcu_dynticks_eqs_exit();
rcu_cleanup_after_idle();
trace_rcu_dyntick(TPS("End"), rdtp->dynticks_nesting, 1, rdtp->dynticks);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
WRITE_ONCE(rdtp->dynticks_nesting, 1);
WRITE_ONCE(rdtp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
}
/**
@ -979,11 +942,6 @@ static void rcu_eqs_exit(bool user)
* Exit idle mode, in other words, -enter- the mode in which RCU
* read-side critical sections can occur.
*
* We crowbar the ->dynticks_nesting field to DYNTICK_TASK_NEST to
* allow for the possibility of usermode upcalls messing up our count
* of interrupt nesting level during the busy period that is just
* now starting.
*
* If you add or remove a call to rcu_idle_exit(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
@ -1012,65 +970,6 @@ void rcu_user_exit(void)
}
#endif /* CONFIG_NO_HZ_FULL */
/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
*
* Enter an interrupt handler, which might possibly result in exiting
* idle mode, in other words, entering the mode in which read-side critical
* sections can occur. The caller must have disabled interrupts.
*
* Note that the Linux kernel is fully capable of entering an interrupt
* handler that it never exits, for example when doing upcalls to
* user mode! This code assumes that the idle loop never does upcalls to
* user mode. If your architecture does do upcalls from the idle loop (or
* does anything else that results in unbalanced calls to the irq_enter()
* and irq_exit() functions), RCU will give you what you deserve, good
* and hard. But very infrequently and irreproducibly.
*
* Use things like work queues to work around this limitation.
*
* You have been warned.
*
* If you add or remove a call to rcu_irq_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
void rcu_irq_enter(void)
{
struct rcu_dynticks *rdtp;
long long oldval;
lockdep_assert_irqs_disabled();
rdtp = this_cpu_ptr(&rcu_dynticks);
/* Page faults can happen in NMI handlers, so check... */
if (rdtp->dynticks_nmi_nesting)
return;
oldval = rdtp->dynticks_nesting;
rdtp->dynticks_nesting++;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
rdtp->dynticks_nesting == 0);
if (oldval)
trace_rcu_dyntick(TPS("++="), oldval, rdtp->dynticks_nesting);
else
rcu_eqs_exit_common(oldval, true);
}
/*
* Wrapper for rcu_irq_enter() where interrupts are enabled.
*
* If you add or remove a call to rcu_irq_enter_irqson(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
void rcu_irq_enter_irqson(void)
{
unsigned long flags;
local_irq_save(flags);
rcu_irq_enter();
local_irq_restore(flags);
}
/**
* rcu_nmi_enter - inform RCU of entry to NMI context
*
@ -1086,7 +985,7 @@ void rcu_irq_enter_irqson(void)
void rcu_nmi_enter(void)
{
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
int incby = 2;
long incby = 2;
/* Complain about underflow. */
WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0);
@ -1103,45 +1002,61 @@ void rcu_nmi_enter(void)
rcu_dynticks_eqs_exit();
incby = 1;
}
rdtp->dynticks_nmi_nesting += incby;
trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
rdtp->dynticks_nmi_nesting,
rdtp->dynticks_nmi_nesting + incby, rdtp->dynticks);
WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* Prevent store tearing. */
rdtp->dynticks_nmi_nesting + incby);
barrier();
}
/**
* rcu_nmi_exit - inform RCU of exit from NMI context
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
*
* If we are returning from the outermost NMI handler that interrupted an
* RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
* to let the RCU grace-period handling know that the CPU is back to
* being RCU-idle.
* Enter an interrupt handler, which might possibly result in exiting
* idle mode, in other words, entering the mode in which read-side critical
* sections can occur. The caller must have disabled interrupts.
*
* If you add or remove a call to rcu_nmi_exit(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
* Note that the Linux kernel is fully capable of entering an interrupt
* handler that it never exits, for example when doing upcalls to user mode!
* This code assumes that the idle loop never does upcalls to user mode.
* If your architecture's idle loop does do upcalls to user mode (or does
* anything else that results in unbalanced calls to the irq_enter() and
* irq_exit() functions), RCU will give you what you deserve, good and hard.
* But very infrequently and irreproducibly.
*
* Use things like work queues to work around this limitation.
*
* You have been warned.
*
* If you add or remove a call to rcu_irq_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
void rcu_nmi_exit(void)
void rcu_irq_enter(void)
{
struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
/*
* Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
* (We are exiting an NMI handler, so RCU better be paying attention
* to us!)
*/
WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
/*
* If the nesting level is not 1, the CPU wasn't RCU-idle, so
* leave it in non-RCU-idle state.
*/
if (rdtp->dynticks_nmi_nesting != 1) {
rdtp->dynticks_nmi_nesting -= 2;
return;
lockdep_assert_irqs_disabled();
if (rdtp->dynticks_nmi_nesting == 0)
rcu_dynticks_task_exit();
rcu_nmi_enter();
if (rdtp->dynticks_nmi_nesting == 1)
rcu_cleanup_after_idle();
}
/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
rdtp->dynticks_nmi_nesting = 0;
rcu_dynticks_eqs_enter();
/*
* Wrapper for rcu_irq_enter() where interrupts are enabled.
*
* If you add or remove a call to rcu_irq_enter_irqson(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
void rcu_irq_enter_irqson(void)
{
unsigned long flags;
local_irq_save(flags);
rcu_irq_enter();
local_irq_restore(flags);
}
/**
@ -1233,7 +1148,8 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
*/
static int rcu_is_cpu_rrupt_from_idle(void)
{
return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 1;
return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 &&
__this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1;
}
/*
@ -2789,6 +2705,11 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
rdp->n_force_qs_snap = rsp->n_force_qs;
} else if (count < rdp->qlen_last_fqs_check - qhimark)
rdp->qlen_last_fqs_check = count;
/*
* The following usually indicates a double call_rcu(). To track
* this down, try building with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.
*/
WARN_ON_ONCE(rcu_segcblist_empty(&rdp->cblist) != (count == 0));
local_irq_restore(flags);
@ -3723,7 +3644,7 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
raw_spin_lock_irqsave_rcu_node(rnp, flags);
rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE);
WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1);
WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks)));
rdp->cpu = cpu;
rdp->rsp = rsp;
@ -3752,7 +3673,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */
!init_nocb_callback_list(rdp))
rcu_segcblist_init(&rdp->cblist); /* Re-enable callbacks. */
rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
rdp->dynticks->dynticks_nesting = 1; /* CPU not up, no tearing. */
rcu_dynticks_eqs_online();
raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */

View File

@ -38,9 +38,8 @@
* Dynticks per-CPU state.
*/
struct rcu_dynticks {
long long dynticks_nesting; /* Track irq/process nesting level. */
/* Process level is worth LLONG_MAX/2. */
int dynticks_nmi_nesting; /* Track NMI nesting level. */
long dynticks_nesting; /* Track process nesting level. */
long dynticks_nmi_nesting; /* Track irq/NMI nesting level. */
atomic_t dynticks; /* Even value for idle, else odd. */
bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */
unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */

View File

@ -61,7 +61,6 @@ DEFINE_PER_CPU(char, rcu_cpu_has_work);
#ifdef CONFIG_RCU_NOCB_CPU
static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
static bool have_rcu_nocb_mask; /* Was rcu_nocb_mask allocated? */
static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
@ -1687,7 +1686,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
delta = rdp->mynode->gpnum - rdp->rcu_iw_gpnum;
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%llx/%d softirq=%u/%u fqs=%ld %s\n",
pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%ld softirq=%u/%u fqs=%ld %s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
@ -1752,7 +1751,6 @@ static void increment_cpu_stall_ticks(void)
static int __init rcu_nocb_setup(char *str)
{
alloc_bootmem_cpumask_var(&rcu_nocb_mask);
have_rcu_nocb_mask = true;
cpulist_parse(str, rcu_nocb_mask);
return 1;
}
@ -1801,7 +1799,7 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
/* Is the specified CPU a no-CBs CPU? */
bool rcu_is_nocb_cpu(int cpu)
{
if (have_rcu_nocb_mask)
if (cpumask_available(rcu_nocb_mask))
return cpumask_test_cpu(cpu, rcu_nocb_mask);
return false;
}
@ -2295,14 +2293,13 @@ void __init rcu_init_nohz(void)
need_rcu_nocb_mask = true;
#endif /* #if defined(CONFIG_NO_HZ_FULL) */
if (!have_rcu_nocb_mask && need_rcu_nocb_mask) {
if (!cpumask_available(rcu_nocb_mask) && need_rcu_nocb_mask) {
if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
return;
}
have_rcu_nocb_mask = true;
}
if (!have_rcu_nocb_mask)
if (!cpumask_available(rcu_nocb_mask))
return;
#if defined(CONFIG_NO_HZ_FULL)
@ -2428,7 +2425,7 @@ static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp)
struct rcu_data *rdp_leader = NULL; /* Suppress misguided gcc warn. */
struct rcu_data *rdp_prev = NULL;
if (!have_rcu_nocb_mask)
if (!cpumask_available(rcu_nocb_mask))
return;
if (ls == -1) {
ls = int_sqrt(nr_cpu_ids);

View File

@ -47,6 +47,7 @@
#include <linux/ktime.h>
#include <asm/byteorder.h>
#include <linux/torture.h>
#include "rcu/rcu.h"
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>");
@ -60,7 +61,6 @@ static bool verbose;
#define FULLSTOP_RMMOD 2 /* Normal rmmod of torture. */
static int fullstop = FULLSTOP_RMMOD;
static DEFINE_MUTEX(fullstop_mutex);
static int *torture_runnable;
#ifdef CONFIG_HOTPLUG_CPU
@ -500,7 +500,7 @@ static int torture_shutdown(void *arg)
torture_shutdown_hook();
else
VERBOSE_TOROUT_STRING("No torture_shutdown_hook(), skipping.");
ftrace_dump(DUMP_ALL);
rcu_ftrace_dump(DUMP_ALL);
kernel_power_off(); /* Shut down the system. */
return 0;
}
@ -572,17 +572,19 @@ static int stutter;
*/
void stutter_wait(const char *title)
{
int spt;
cond_resched_rcu_qs();
while (READ_ONCE(stutter_pause_test) ||
(torture_runnable && !READ_ONCE(*torture_runnable))) {
if (stutter_pause_test)
if (READ_ONCE(stutter_pause_test) == 1)
spt = READ_ONCE(stutter_pause_test);
for (; spt; spt = READ_ONCE(stutter_pause_test)) {
if (spt == 1) {
schedule_timeout_interruptible(1);
else
} else if (spt == 2) {
while (READ_ONCE(stutter_pause_test))
cond_resched();
else
} else {
schedule_timeout_interruptible(round_jiffies_relative(HZ));
}
torture_shutdown_absorb(title);
}
}
@ -596,17 +598,15 @@ static int torture_stutter(void *arg)
{
VERBOSE_TOROUT_STRING("torture_stutter task started");
do {
if (!torture_must_stop()) {
if (stutter > 1) {
if (!torture_must_stop() && stutter > 1) {
WRITE_ONCE(stutter_pause_test, 1);
schedule_timeout_interruptible(stutter - 1);
WRITE_ONCE(stutter_pause_test, 2);
}
schedule_timeout_interruptible(1);
WRITE_ONCE(stutter_pause_test, 1);
}
WRITE_ONCE(stutter_pause_test, 0);
if (!torture_must_stop())
schedule_timeout_interruptible(stutter);
WRITE_ONCE(stutter_pause_test, 0);
torture_shutdown_absorb("torture_stutter");
} while (!torture_must_stop());
torture_kthread_stopping("torture_stutter");
@ -647,7 +647,7 @@ static void torture_stutter_cleanup(void)
* The runnable parameter points to a flag that controls whether or not
* the test is currently runnable. If there is no such flag, pass in NULL.
*/
bool torture_init_begin(char *ttype, bool v, int *runnable)
bool torture_init_begin(char *ttype, bool v)
{
mutex_lock(&fullstop_mutex);
if (torture_type != NULL) {
@ -659,7 +659,6 @@ bool torture_init_begin(char *ttype, bool v, int *runnable)
}
torture_type = ttype;
verbose = v;
torture_runnable = runnable;
fullstop = FULLSTOP_DONTSTOP;
return true;
}

View File

@ -2682,17 +2682,6 @@ void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
if (unlikely(in_nmi()))
return;
/*
* It is possible that a function is being traced in a
* location that RCU is not watching. A call to
* rcu_irq_enter() will make sure that it is, but there's
* a few internal rcu functions that could be traced
* where that wont work either. In those cases, we just
* do nothing.
*/
if (unlikely(rcu_irq_enter_disabled()))
return;
rcu_irq_enter_irqson();
__ftrace_trace_stack(buffer, flags, skip, pc, NULL);
rcu_irq_exit_irqson();

View File

@ -212,11 +212,10 @@ static int tracepoint_add_func(struct tracepoint *tp,
}
/*
* rcu_assign_pointer has a smp_wmb() which makes sure that the new
* probe callbacks array is consistent before setting a pointer to it.
* This array is referenced by __DO_TRACE from
* include/linux/tracepoints.h. A matching smp_read_barrier_depends()
* is used.
* rcu_assign_pointer has as smp_store_release() which makes sure
* that the new probe callbacks array is consistent before setting
* a pointer to it. This array is referenced by __DO_TRACE from
* include/linux/tracepoint.h using rcu_dereference_sched().
*/
rcu_assign_pointer(tp->funcs, tp_funcs);
if (!static_key_enabled(&tp->key))

View File

@ -38,12 +38,10 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
if (assoc_array_ptr_is_shortcut(cursor)) {
/* Descend through a shortcut */
shortcut = assoc_array_ptr_to_shortcut(cursor);
smp_read_barrier_depends();
cursor = READ_ONCE(shortcut->next_node);
cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
}
node = assoc_array_ptr_to_node(cursor);
smp_read_barrier_depends();
slot = 0;
/* We perform two passes of each node.
@ -55,15 +53,12 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
*/
has_meta = 0;
for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
ptr = READ_ONCE(node->slots[slot]);
ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
has_meta |= (unsigned long)ptr;
if (ptr && assoc_array_ptr_is_leaf(ptr)) {
/* We need a barrier between the read of the pointer
* and dereferencing the pointer - but only if we are
* actually going to dereference it.
/* We need a barrier between the read of the pointer,
* which is supplied by the above READ_ONCE().
*/
smp_read_barrier_depends();
/* Invoke the callback */
ret = iterator(assoc_array_ptr_to_leaf(ptr),
iterator_data);
@ -86,10 +81,8 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
continue_node:
node = assoc_array_ptr_to_node(cursor);
smp_read_barrier_depends();
for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
ptr = READ_ONCE(node->slots[slot]);
ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
if (assoc_array_ptr_is_meta(ptr)) {
cursor = ptr;
goto begin_node;
@ -98,16 +91,15 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
finished_node:
/* Move up to the parent (may need to skip back over a shortcut) */
parent = READ_ONCE(node->back_pointer);
parent = READ_ONCE(node->back_pointer); /* Address dependency. */
slot = node->parent_slot;
if (parent == stop)
return 0;
if (assoc_array_ptr_is_shortcut(parent)) {
shortcut = assoc_array_ptr_to_shortcut(parent);
smp_read_barrier_depends();
cursor = parent;
parent = READ_ONCE(shortcut->back_pointer);
parent = READ_ONCE(shortcut->back_pointer); /* Address dependency. */
slot = shortcut->parent_slot;
if (parent == stop)
return 0;
@ -147,7 +139,7 @@ int assoc_array_iterate(const struct assoc_array *array,
void *iterator_data),
void *iterator_data)
{
struct assoc_array_ptr *root = READ_ONCE(array->root);
struct assoc_array_ptr *root = READ_ONCE(array->root); /* Address dependency. */
if (!root)
return 0;
@ -194,7 +186,7 @@ assoc_array_walk(const struct assoc_array *array,
pr_devel("-->%s()\n", __func__);
cursor = READ_ONCE(array->root);
cursor = READ_ONCE(array->root); /* Address dependency. */
if (!cursor)
return assoc_array_walk_tree_empty;
@ -216,11 +208,9 @@ assoc_array_walk(const struct assoc_array *array,
consider_node:
node = assoc_array_ptr_to_node(cursor);
smp_read_barrier_depends();
slot = segments >> (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
slot &= ASSOC_ARRAY_FAN_MASK;
ptr = READ_ONCE(node->slots[slot]);
ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
pr_devel("consider slot %x [ix=%d type=%lu]\n",
slot, level, (unsigned long)ptr & 3);
@ -254,7 +244,6 @@ assoc_array_walk(const struct assoc_array *array,
cursor = ptr;
follow_shortcut:
shortcut = assoc_array_ptr_to_shortcut(cursor);
smp_read_barrier_depends();
pr_devel("shortcut to %d\n", shortcut->skip_to_level);
sc_level = level + ASSOC_ARRAY_LEVEL_STEP;
BUG_ON(sc_level > shortcut->skip_to_level);
@ -294,7 +283,7 @@ assoc_array_walk(const struct assoc_array *array,
} while (sc_level < shortcut->skip_to_level);
/* The shortcut matches the leaf's index to this point. */
cursor = READ_ONCE(shortcut->next_node);
cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
if (((level ^ sc_level) & ~ASSOC_ARRAY_KEY_CHUNK_MASK) != 0) {
level = sc_level;
goto jumped;
@ -331,20 +320,18 @@ void *assoc_array_find(const struct assoc_array *array,
return NULL;
node = result.terminal_node.node;
smp_read_barrier_depends();
/* If the target key is available to us, it's has to be pointed to by
* the terminal node.
*/
for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
ptr = READ_ONCE(node->slots[slot]);
ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
if (ptr && assoc_array_ptr_is_leaf(ptr)) {
/* We need a barrier between the read of the pointer
* and dereferencing the pointer - but only if we are
* actually going to dereference it.
*/
leaf = assoc_array_ptr_to_leaf(ptr);
smp_read_barrier_depends();
if (ops->compare_object(leaf, index_key))
return (void *)leaf;
}

View File

@ -197,10 +197,10 @@ static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
atomic_long_add(PERCPU_COUNT_BIAS, &ref->count);
/*
* Restore per-cpu operation. smp_store_release() is paired with
* smp_read_barrier_depends() in __ref_is_percpu() and guarantees
* that the zeroing is visible to all percpu accesses which can see
* the following __PERCPU_REF_ATOMIC clearing.
* Restore per-cpu operation. smp_store_release() is paired
* with READ_ONCE() in __ref_is_percpu() and guarantees that the
* zeroing is visible to all percpu accesses which can see the
* following __PERCPU_REF_ATOMIC clearing.
*/
for_each_possible_cpu(cpu)
*per_cpu_ptr(percpu_count, cpu) = 0;

View File

@ -675,15 +675,8 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
expected_mapping = (void *)((unsigned long)stable_node |
PAGE_MAPPING_KSM);
again:
kpfn = READ_ONCE(stable_node->kpfn);
kpfn = READ_ONCE(stable_node->kpfn); /* Address dependency. */
page = pfn_to_page(kpfn);
/*
* page is computed from kpfn, so on most architectures reading
* page->mapping is naturally ordered after reading node->kpfn,
* but on Alpha we need to be more careful.
*/
smp_read_barrier_depends();
if (READ_ONCE(page->mapping) != expected_mapping)
goto stale;

View File

@ -202,13 +202,8 @@ unsigned int arpt_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
private = table->private;
private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
smp_read_barrier_depends();
table_base = private->entries;
jumpstack = (struct arpt_entry **)private->jumpstack[cpu];

View File

@ -260,13 +260,8 @@ ipt_do_table(struct sk_buff *skb,
WARN_ON(!(table->valid_hooks & (1 << hook)));
local_bh_disable();
addend = xt_write_recseq_begin();
private = table->private;
private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
smp_read_barrier_depends();
table_base = private->entries;
jumpstack = (struct ipt_entry **)private->jumpstack[cpu];

View File

@ -282,12 +282,7 @@ ip6t_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
private = table->private;
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
smp_read_barrier_depends();
private = READ_ONCE(table->private); /* Address dependency. */
cpu = smp_processor_id();
table_base = private->entries;
jumpstack = (struct ip6t_entry **)private->jumpstack[cpu];

View File

@ -5586,6 +5586,12 @@ sub process {
}
}
# check for smp_read_barrier_depends and read_barrier_depends
if (!$file && $line =~ /\b(smp_|)read_barrier_depends\s*\(/) {
WARN("READ_BARRIER_DEPENDS",
"$1read_barrier_depends should only be used in READ_ONCE or DEC Alpha code\n" . $herecurr);
}
# check of hardware specific defines
if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) {
CHK("ARCH_DEFINES",

View File

@ -713,7 +713,6 @@ static bool search_nested_keyrings(struct key *keyring,
* doesn't contain any keyring pointers.
*/
shortcut = assoc_array_ptr_to_shortcut(ptr);
smp_read_barrier_depends();
if ((shortcut->index_key[0] & ASSOC_ARRAY_FAN_MASK) != 0)
goto not_this_keyring;
@ -723,8 +722,6 @@ static bool search_nested_keyrings(struct key *keyring,
}
node = assoc_array_ptr_to_node(ptr);
smp_read_barrier_depends();
ptr = node->slots[0];
if (!assoc_array_ptr_is_meta(ptr))
goto begin_node;
@ -736,7 +733,6 @@ static bool search_nested_keyrings(struct key *keyring,
kdebug("descend");
if (assoc_array_ptr_is_shortcut(ptr)) {
shortcut = assoc_array_ptr_to_shortcut(ptr);
smp_read_barrier_depends();
ptr = READ_ONCE(shortcut->next_node);
BUG_ON(!assoc_array_ptr_is_node(ptr));
}
@ -744,7 +740,6 @@ static bool search_nested_keyrings(struct key *keyring,
begin_node:
kdebug("begin_node");
smp_read_barrier_depends();
slot = 0;
ascend_to_node:
/* Go through the slots in a node */
@ -792,14 +787,12 @@ static bool search_nested_keyrings(struct key *keyring,
if (ptr && assoc_array_ptr_is_shortcut(ptr)) {
shortcut = assoc_array_ptr_to_shortcut(ptr);
smp_read_barrier_depends();
ptr = READ_ONCE(shortcut->back_pointer);
slot = shortcut->parent_slot;
}
if (!ptr)
goto not_this_keyring;
node = assoc_array_ptr_to_node(ptr);
smp_read_barrier_depends();
slot++;
/* If we've ascended to the root (zero backpointer), we must have just

View File

@ -1,25 +0,0 @@
#!/bin/bash
# Usage: config2frag.sh < .config > configfrag
#
# Converts the "# CONFIG_XXX is not set" to "CONFIG_XXX=n" so that the
# resulting file becomes a legitimate Kconfig fragment.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, you can access it online at
# http://www.gnu.org/licenses/gpl-2.0.html.
#
# Copyright (C) IBM Corporation, 2013
#
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
LANG=C sed -e 's/^# CONFIG_\([a-zA-Z0-9_]*\) is not set$/CONFIG_\1=n/'

View File

@ -51,7 +51,7 @@ then
mkdir $builddir
fi
else
echo Bad build directory: \"$builddir\"
echo Bad build directory: \"$buildloc\"
exit 2
fi
fi

View File

@ -29,11 +29,6 @@ then
exit 1
fi
builddir=${2}
if test -z "$builddir" -o ! -d "$builddir" -o ! -w "$builddir"
then
echo "kvm-build.sh :$builddir: Not a writable directory, cannot build into it"
exit 1
fi
T=${TMPDIR-/tmp}/test-linux.sh.$$
trap 'rm -rf $T' 0

View File

@ -23,7 +23,7 @@
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
i="$1"
if test -d $i
if test -d "$i" -a -r "$i"
then
:
else

View File

@ -23,14 +23,14 @@
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
i="$1"
if test -d $i
if test -d "$i" -a -r "$i"
then
:
else
echo Unreadable results directory: $i
exit 1
fi
. tools/testing/selftests/rcutorture/bin/functions.sh
. functions.sh
configfile=`echo $i | sed -e 's/^.*\///'`
ngps=`grep ver: $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* ver: //' -e 's/ .*$//'`

View File

@ -26,7 +26,7 @@
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
i="$1"
. tools/testing/selftests/rcutorture/bin/functions.sh
. functions.sh
if test "`grep -c 'rcu_exp_grace_period.*start' < $i/console.log`" -lt 100
then

View File

@ -23,7 +23,7 @@
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
i="$1"
if test -d $i
if test -d "$i" -a -r "$i"
then
:
else
@ -31,7 +31,7 @@ else
exit 1
fi
PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
. tools/testing/selftests/rcutorture/bin/functions.sh
. functions.sh
if kvm-recheck-rcuperf-ftrace.sh $i
then

View File

@ -25,7 +25,7 @@
# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
. tools/testing/selftests/rcutorture/bin/functions.sh
. functions.sh
for rd in "$@"
do
firsttime=1

View File

@ -42,7 +42,7 @@ T=${TMPDIR-/tmp}/kvm-test-1-run.sh.$$
trap 'rm -rf $T' 0
mkdir $T
. $KVM/bin/functions.sh
. functions.sh
. $CONFIGFRAG/ver_functions.sh
config_template=${1}
@ -154,9 +154,7 @@ cpu_count=`configfrag_boot_cpus "$boot_args" "$config_template" "$cpu_count"`
vcpus=`identify_qemu_vcpus`
if test $cpu_count -gt $vcpus
then
echo CPU count limited from $cpu_count to $vcpus
touch $resdir/Warnings
echo CPU count limited from $cpu_count to $vcpus >> $resdir/Warnings
echo CPU count limited from $cpu_count to $vcpus | tee -a $resdir/Warnings
cpu_count=$vcpus
fi
qemu_args="`specify_qemu_cpus "$QEMU" "$qemu_args" "$cpu_count"`"

View File

@ -1,8 +1,7 @@
#!/bin/bash
#
# Run a series of 14 tests under KVM. These are not particularly
# well-selected or well-tuned, but are the current set. Run from the
# top level of the source tree.
# well-selected or well-tuned, but are the current set.
#
# Edit the definitions below to set the locations of the various directories,
# as well as the test duration.
@ -34,6 +33,8 @@ T=${TMPDIR-/tmp}/kvm.sh.$$
trap 'rm -rf $T' 0
mkdir $T
cd `dirname $scriptname`/../../../../../
dur=$((30*60))
dryrun=""
KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
@ -70,7 +71,7 @@ usage () {
echo " --kmake-arg kernel-make-arguments"
echo " --mac nn:nn:nn:nn:nn:nn"
echo " --no-initrd"
echo " --qemu-args qemu-system-..."
echo " --qemu-args qemu-arguments"
echo " --qemu-cmd qemu-system-..."
echo " --results absolute-pathname"
echo " --torture rcu"
@ -150,7 +151,7 @@ do
TORTURE_INITRD=""; export TORTURE_INITRD
;;
--qemu-args|--qemu-arg)
checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error'
checkarg --qemu-args "(qemu arguments)" $# "$2" '^-' '^error'
TORTURE_QEMU_ARG="$2"
shift
;;
@ -238,7 +239,6 @@ BEGIN {
}
END {
alldone = 0;
batch = 0;
nc = -1;
@ -331,8 +331,7 @@ awk < $T/cfgcpu.pack \
# Dump out the scripting required to run one test batch.
function dump(first, pastlast, batchnum)
{
print "echo ----Start batch " batchnum ": `date`";
print "echo ----Start batch " batchnum ": `date` >> " rd "/log";
print "echo ----Start batch " batchnum ": `date` | tee -a " rd "log";
print "needqemurun="
jn=1
for (j = first; j < pastlast; j++) {
@ -349,21 +348,18 @@ function dump(first, pastlast, batchnum)
ovf = "-ovf";
else
ovf = "";
print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date`";
print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` >> " rd "/log";
print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` | tee -a " rd "log";
print "rm -f " builddir ".*";
print "touch " builddir ".wait";
print "mkdir " builddir " > /dev/null 2>&1 || :";
print "mkdir " rd cfr[jn] " || :";
print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn] "/kvm-test-1-run.sh.out 2>&1 &"
print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`";
print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log";
print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` | tee -a " rd "log";
print "while test -f " builddir ".wait"
print "do"
print "\tsleep 1"
print "done"
print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date`";
print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` >> " rd "/log";
print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` | tee -a " rd "log";
jn++;
}
for (j = 1; j < jn; j++) {
@ -371,8 +367,7 @@ function dump(first, pastlast, batchnum)
print "rm -f " builddir ".ready"
print "if test -f \"" rd cfr[j] "/builtkernel\""
print "then"
print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date`";
print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date` >> " rd "/log";
print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date` | tee -a " rd "log";
print "\tneedqemurun=1"
print "fi"
}
@ -386,31 +381,26 @@ function dump(first, pastlast, batchnum)
njitter = ja[1];
if (TORTURE_BUILDONLY && njitter != 0) {
njitter = 0;
print "echo Build-only run, so suppressing jitter >> " rd "/log"
print "echo Build-only run, so suppressing jitter | tee -a " rd "log"
}
if (TORTURE_BUILDONLY) {
print "needqemurun="
}
print "if test -n \"$needqemurun\""
print "then"
print "\techo ---- Starting kernels. `date`";
print "\techo ---- Starting kernels. `date` >> " rd "/log";
print "\techo ---- Starting kernels. `date` | tee -a " rd "log";
for (j = 0; j < njitter; j++)
print "\tjitter.sh " j " " dur " " ja[2] " " ja[3] "&"
print "\twait"
print "\techo ---- All kernel runs complete. `date`";
print "\techo ---- All kernel runs complete. `date` >> " rd "/log";
print "\techo ---- All kernel runs complete. `date` | tee -a " rd "log";
print "else"
print "\twait"
print "\techo ---- No kernel runs. `date`";
print "\techo ---- No kernel runs. `date` >> " rd "/log";
print "\techo ---- No kernel runs. `date` | tee -a " rd "log";
print "fi"
for (j = 1; j < jn; j++) {
builddir=KVM "/b" j
print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results:";
print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: >> " rd "/log";
print "cat " rd cfr[j] "/kvm-test-1-run.sh.out";
print "cat " rd cfr[j] "/kvm-test-1-run.sh.out >> " rd "/log";
print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: | tee -a " rd "log";
print "cat " rd cfr[j] "/kvm-test-1-run.sh.out | tee -a " rd "log";
}
}

View File

@ -55,7 +55,7 @@ then
exit
fi
grep --binary-files=text 'torture:.*ver:' $file | grep --binary-files=text -v '(null)' | sed -e 's/^(initramfs)[^]]*] //' -e 's/^\[[^]]*] //' |
grep --binary-files=text 'torture:.*ver:' $file | egrep --binary-files=text -v '\(null\)|rtc: 000000000* ' | sed -e 's/^(initramfs)[^]]*] //' -e 's/^\[[^]]*] //' |
awk '
BEGIN {
ver = 0;

View File

@ -38,6 +38,5 @@ per_version_boot_params () {
echo $1 `locktorture_param_onoff "$1" "$2"` \
locktorture.stat_interval=15 \
locktorture.shutdown_secs=$3 \
locktorture.torture_runnable=1 \
locktorture.verbose=1
}

View File

@ -51,7 +51,6 @@ per_version_boot_params () {
`rcutorture_param_n_barrier_cbs "$1"` \
rcutorture.stat_interval=15 \
rcutorture.shutdown_secs=$3 \
rcutorture.torture_runnable=1 \
rcutorture.test_no_idle_hz=1 \
rcutorture.verbose=1
}

View File

@ -46,7 +46,6 @@ rcuperf_param_nwriters () {
per_version_boot_params () {
echo $1 `rcuperf_param_nreaders "$1"` \
`rcuperf_param_nwriters "$1"` \
rcuperf.perf_runnable=1 \
rcuperf.shutdown=1 \
rcuperf.verbose=1
}