linux

Commit Graph

Author	SHA1	Message	Date
Paul E. McKenney	dad81a2026	srcu: Introduce CLASSIC_SRCU Kconfig option The TREE_SRCU rewrite is large and a bit on the non-simple side, so this commit helps reduce risk by allowing the old v4.11 SRCU algorithm to be selected using a new CLASSIC_SRCU Kconfig option that depends on RCU_EXPERT. The default is to use the new TREE_SRCU and TINY_SRCU algorithms, in order to help get these the testing that they need. However, if your users do not require the update-side scalability that is to be provided by TREE_SRCU, select RCU_EXPERT and then CLASSIC_SRCU to revert back to the old classic SRCU algorithm. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:23 -07:00
Paul E. McKenney	f60d231a87	srcu: Crude control of expedited grace periods SRCU's implementation of expedited grace periods has always assumed that the SRCU instance is idle when the expedited request arrives. This commit improves this a bit by maintaining a count of the number of outstanding expedited requests, thus allowing prior non-expedited grace periods accommodate these requests by shifting to expedited mode. However, any non-expedited wait already in progress will still wait for the full duration. Improved control of expedited grace periods is planned, but one step at a time. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:22 -07:00
Paul E. McKenney	80a7956fe3	srcu: Merge ->srcu_state into ->srcu_gp_seq Updating ->srcu_state and ->srcu_gp_seq will lead to extremely complex race conditions given multiple callback queues, so this commit takes advantage of the two-bit state now available in rcu_seq counters to store the state in the bottom two bits of ->srcu_gp_seq. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:22 -07:00
Paul E. McKenney	91e27c356c	srcu: Fix bogus try_check_zero() comment Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:21 -07:00
Paul E. McKenney	2b34c43cc1	srcu: Move rcu_init_levelspread() to rcu_tree_node.h This commit moves the rcu_init_levelspread() function from kernel/rcu/tree.c to kernel/rcu/rcu.h so that SRCU can access it. This is another step towards enabling SRCU to create its own combining tree. This commit is code-movement only, give or take knock-on adjustments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	8660b7d8a5	srcu: Use rcu_segcblist to track SRCU callbacks This commit switches SRCU from custom-built callback queues to the new rcu_segcblist structure. This change associates grace-period sequence numbers with groups of callbacks, which will be needed for efficient processing of per-CPU callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	ac367c1c62	srcu: Add grace-period sequence numbers This commit adds grace-period sequence numbers, which will be used to handle mid-boot grace periods and per-CPU callback lists. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	c2a8ec0778	srcu: Move to state-based grace-period sequencing The current SRCU grace-period processing might never reach the last portion of srcu_advance_batches(). This is OK given the current implementation, as the first portion, up to the try_check_zero() following the srcu_flip() is sufficient to drive grace periods forward. However, it has the unfortunate side-effect of making it impossible to determine when a given grace period has ended, and it will be necessary to efficiently trace ends of grace periods in order to efficiently handle per-CPU SRCU callback lists. This commit therefore adds states to the SRCU grace-period processing, so that the end of a given SRCU grace period is marked by the transition to the SRCU_STATE_DONE state. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	c6e56f593a	srcu: Push srcu_advance_batches() fastpath into common case This commit simplifies the SRCU state machine by pushing the srcu_advance_batches() idle-SRCU fastpath into the common case. This is done by giving srcu_reschedule() a delay parameter, which is zero in the call from srcu_advance_batches(). This commit is a step towards numbering callbacks in order to efficiently handle per-CPU callback lists. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:19 -07:00
Paul E. McKenney	b5eaeaa509	srcu: Allow early boot use of synchronize_srcu() This commit checks for pre-scheduler state, and if that early in the boot process, synchronize_srcu() and friends are no-ops. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:18 -07:00
Paul E. McKenney	15c68f7f63	srcu: Check for tardy grace-period activity in cleanup_srcu_struct() Users of SRCU are obliged to complete all grace-period activity before invoking cleanup_srcu_struct(). This means that all calls to either synchronize_srcu() or synchronize_srcu_expedited() must have returned, and all calls to call_srcu() must have returned, and the last call to call_srcu() must have been followed by a call to srcu_barrier(). Furthermore, the caller must have done something to prevent any further calls to synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu(). Therefore, if there has ever been an invocation of call_srcu() on the srcu_struct in question, the sequence of events must be as follows: 1. Prevent any further calls to call_srcu(). 2. Wait for any pre-existing call_srcu() invocations to return. 3. Invoke srcu_barrier(). 4. It is now safe to invoke cleanup_srcu_struct(). On the other hand, if there has ever been a call to synchronize_srcu() or synchronize_srcu_expedited(), the sequence of events must be as follows: 1. Prevent any further calls to synchronize_srcu() or synchronize_srcu_expedited(). 2. Wait for any pre-existing synchronize_srcu() or synchronize_srcu_expedited() invocations to return. 3. It is now safe to invoke cleanup_srcu_struct(). If there have been calls to all both types of functions (call_srcu() and either of synchronize_srcu() and synchronize_srcu_expedited()), then the caller must do the first three steps of the call_srcu() procedure above and the first two steps of the synchronize_s*() procedure above, and only then invoke cleanup_srcu_struct(). Note that cleanup_srcu_struct() does some probabilistic checks for the caller failing to follow these procedures, in which case cleanup_srcu_struct() does WARN_ON() and avoids freeing the per-CPU structures associated with the specified srcu_struct structure. Reported-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2017-04-18 11:30:58 -07:00
Paul E. McKenney	cc985822a0	srcu: Consolidate batch checking into rcu_all_batches_empty() The srcu_reschedule() function invokes rcu_batch_empty() on each of the four rcu_batch structures in the srcu_struct in question twice. Given that this check will also be needed in cleanup_srcu_struct(), this commit consolidates these four checks into a new rcu_all_batches_empty() function. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2017-04-18 11:22:16 -07:00
Ingo Molnar	f9411ebe3d	rcu: Separate the RCU synchronization types and APIs into <linux/rcupdate_wait.h> So rcupdate.h is a pretty complex header, in particular it includes <linux/completion.h> which includes <linux/wait.h> - creating a dependency that includes <linux/wait.h> in <linux/sched.h>, which prevents the isolation of <linux/sched.h> from the derived <linux/wait.h> header. Solve part of the problem by decoupling rcupdate.h from completions: this can be done by separating out the rcu_synchronize types and APIs, and updating their usage sites. Since this is a mostly RCU-internal types this will not just simplify <linux/sched.h>'s dependencies, but will make all the hundreds of .c files that include rcupdate.h but not completions or wait.h build faster. ( For rcutiny this means that two dependent APIs have to be uninlined, but that shouldn't be much of a problem as they are rare variants. ) Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-03-02 08:42:24 +01:00
Paul E. McKenney	7f554a3d05	srcu: Reduce probability of SRCU ->unlock_count[] counter overflow Because there are no memory barriers between the srcu_flip() ->completed increment and the summation of the read-side ->unlock_count[] counters, both the compiler and the CPU can reorder the summation with the ->completed increment. If the updater is preempted long enough during this process, the read-side counters could overflow, resulting in a too-short grace period. This commit therefore adds a memory barrier just after the ->completed increment, ensuring that if the summation misses an increment of ->unlock_count[] from __srcu_read_unlock(), the next __srcu_read_lock() will see the new value of ->completed, thus bounding the number of ->unlock_count[] increments that can be missed to NR_CPUS. The actual overflow computation is more complex due to the possibility of nesting of __srcu_read_lock(). Reported-by: Lance Roy <ldr709@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-01-25 12:54:22 -08:00
Paul E. McKenney	d85b62f18d	srcu: Force full grace-period ordering If a process invokes synchronize_srcu(), is delayed just the right amount of time, and thus does not sleep when waiting for the grace period to complete, there is no ordering between the end of the grace period and the code following the synchronize_srcu(). Similarly, there can be a lack of ordering between the end of the SRCU grace period and callback invocation. This commit adds the necessary ordering. Reported-by: Lance Roy <ldr709@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Further smp_mb() adjustment per email with Lance Roy. ]	2017-01-25 12:54:22 -08:00
Lance Roy	f2c4689640	srcu: Implement more-efficient reader counts SRCU uses two per-cpu counters: a nesting counter to count the number of active critical sections, and a sequence counter to ensure that the nesting counters don't change while they are being added together in srcu_readers_active_idx_check(). This patch instead uses per-cpu lock and unlock counters. Because both counters only increase and srcu_readers_active_idx_check() reads the unlock counter before the lock counter, this achieves the same end without having to increment two different counters in srcu_read_lock(). This also saves a smp_mb() in srcu_readers_active_idx_check(). Possible bug: There is no guarantee that the lock counter won't overflow during srcu_readers_active_idx_check(), as there are no memory barriers around srcu_flip() (see comment in srcu_readers_active_idx_check() for details). However, this problem was already present before this patch. Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Lance Roy <ldr709@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-01-25 12:53:20 -08:00
Paul E. McKenney	5a9be7c628	rcu: Add rcu_normal kernel parameter to suppress expediting Although expedited grace periods can be quite useful, and although their OS jitter has been greatly reduced, they can still pose problems for extreme real-time workloads. This commit therefore adds a rcu_normal kernel boot parameter (which can also be manipulated via sysfs) to suppress expedited grace periods, that is, to treat requests for expedited grace periods as if they were requests for normal grace periods. If both rcu_expedited and rcu_normal are specified, rcu_normal wins. This means that if you are relying on expedited grace periods to speed up boot, you will want to specify rcu_expedited on the kernel command line, and then specify rcu_normal via sysfs once boot completes. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-12-04 12:26:53 -08:00
Paul E. McKenney	49f5903b47	rcu: Move preemption disabling out of __srcu_read_lock() Currently, __srcu_read_lock() cannot be invoked from restricted environments because it contains calls to preempt_disable() and preempt_enable(), both of which can invoke lockdep, which is a bad idea in some restricted execution modes. This commit therefore moves the preempt_disable() and preempt_enable() from __srcu_read_lock() to srcu_read_lock(). It also inserts the preempt_disable() and preempt_enable() around the call to __srcu_read_lock() in do_exit(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2015-10-06 11:15:43 -07:00
Boqun Feng	b6a4ae766e	rcu: Use rcu_callback_t in call_rcu() and friends As we now have rcu_callback_t typedefs as the type of rcu callbacks, we should use it in call_rcu() and friends as the type of parameters. This could save us a few lines of code and make it clear which function requires an rcu callbacks rather than other callbacks as its argument. Besides, this can also help cscope to generate a better database for code reading. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2015-10-06 11:08:05 -07:00
Paul E. McKenney	f78f5b90c4	rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() This commit renames rcu_lockdep_assert() to RCU_LOCKDEP_WARN() for consistency with the WARN() series of macros. This also requires inverting the sense of the conditional, which this commit also does. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Ingo Molnar <mingo@kernel.org>	2015-07-22 15:27:32 -07:00
Nicholas Mc Guire	f765d11307	rcu: Change return type to bool Type-checking coccinelle spatches are being used to locate type mismatches between function signatures and return values in this case this produced: ./kernel/rcu/srcu.c:271 WARNING: return of wrong type int != unsigned long, srcu_readers_active() returns an int that is the sum of per_cpu unsigned long but the only user is cleanup_srcu_struct() which is using it as a boolean (condition) to see if there is any readers rather than actually using the approximate number of readers. The theoretically possible unsigned long overflow case does not need to be handled explicitly - if we had 4G++ readers then something else went wrong a long time ago. proposal: change the return type to boolean. The function name is left unchanged as it fits the naming expectation for a boolean. patch was compile tested for x86_64_defconfig (implies CONFIG_SRCU=y) patch is against 4.1-rc5 (localversion-next is -next-20150525) Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-07-15 14:43:53 -07:00
Paul E. McKenney	7d0ae8086b	rcu: Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE() This commit moves from the old ACCESS_ONCE() API to the new READ_ONCE() and WRITE_ONCE() APIs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Updated to include kernel/torture.c as suggested by Jason Low. ]	2015-05-27 12:56:15 -07:00
Paul E. McKenney	42528795ac	Merge branches 'doc.2015.02.26a', 'earlycb.2015.03.03a', 'fixes.2015.03.03a', 'gpexp.2015.02.26a', 'hotplug.2015.03.20a', 'sysidle.2015.02.26b' and 'tiny.2015.02.26a' into HEAD doc.2015.02.26a: Documentation changes earlycb.2015.03.03a: Permit early-boot RCU callbacks fixes.2015.03.03a: Miscellaneous fixes gpexp.2015.02.26a: In-kernel expediting of normal grace periods hotplug.2015.03.20a: CPU hotplug fixes sysidle.2015.02.26b: NO_HZ_FULL_SYSIDLE fixes tiny.2015.02.26a: TINY_RCU fixes	2015-03-20 08:31:01 -07:00
Paul E. McKenney	5afff48bdf	rcu: Update from rcu_expedited variable to rcu_gp_is_expedited() This commit updates open-coded tests of the rcu_expedited variable to instead use rcu_gp_is_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-02-26 12:03:01 -08:00
Paul E. McKenney	ee376dbdf2	rcu: Consolidate rcu_synchronize and wakeme_after_rcu() There are currently duplicate identical definitions of the rcu_synchronize() structure and the wakeme_after_rcu() function. Thie commit therefore consolidates them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-02-25 17:03:03 -08:00
Paul E. McKenney	a5c198f4f7	rcu: Expand SRCU ->completed to 64 bits When rcutorture used only the low-order 32 bits of the grace-period number, it was not a problem for SRCU to use a 32-bit completed field. However, rcutorture now uses the full 64 bits on 64-bit systems, so this commit converts SRCU's ->completed field to unsigned long so as to provide 64 bits on 64-bit systems. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2015-01-06 11:04:26 -08:00
Paul E. McKenney	a792563bd4	rcu: Eliminate read-modify-write ACCESS_ONCE() calls RCU contains code of the following forms: ACCESS_ONCE(x)++; ACCESS_ONCE(x) += y; ACCESS_ONCE(x) -= y; Now these constructs do operate correctly, but they really result in a pair of volatile accesses, one to do the load and another to do the store. This can be confusing, as the casual reader might well assume that (for example) gcc might generate a memory-to-memory add instruction for each of these three cases. In fact, gcc will do no such thing. Also, there is a good chance that the kernel will move to separate load and store variants of ACCESS_ONCE(), and constructs like the above could easily confuse both people and scripts attempting to make that sort of change. Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really only need the store to be volatile, so that the read-modify-write form might be misleading. This commit therefore changes the above forms in RCU so that each instance of ACCESS_ONCE() either does a load or a store, but not both. In a few cases, ACCESS_ONCE() was not critical, for example, for maintaining statisitics. In these cases, ACCESS_ONCE() has been dispensed with entirely. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2014-07-09 09:14:49 -07:00
Paul Gortmaker	5cb5c6e18f	rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone The kbuild test bot uncovered an implicit dependence on the trace header being present before rcu.h in ia64 allmodconfig that looks like this: In file included from kernel/ksysfs.c:22:0: kernel/rcu/rcu.h: In function '__rcu_reclaim': kernel/rcu/rcu.h:107:3: error: implicit declaration of function 'trace_rcu_invoke_kfree_callback' [-Werror=implicit-function-declaration] kernel/rcu/rcu.h:112:3: error: implicit declaration of function 'trace_rcu_invoke_callback' [-Werror=implicit-function-declaration] cc1: some warnings being treated as errors Looking at other rcu.h users, we can find that they all were sourcing the trace header in advance of rcu.h itself, as seen in the context of this diff. There were also some inconsistencies as to whether it was or wasn't sourced based on the parent tracing Kconfig. Rather than "fix" it at each use site, and have inconsistent use based on whether "#ifdef CONFIG_RCU_TRACE" was used or not, lets just source the trace header just once, in the actual consumer of it, which is rcu.h itself. We include it unconditionally, as build testing shows us that is a hard requirement for some files. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2014-02-26 06:35:18 -08:00
Shaibal Dutta	ae1670339c	rcu: Move SRCU grace period work to power efficient workqueue For better use of CPU idle time, allow the scheduler to select the CPU on which the SRCU grace period work would be scheduled. This improves idle residency time and conserves power. This functionality is enabled when CONFIG_WQ_POWER_EFFICIENT is selected. Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Shaibal Dutta <shaibal.dutta@broadcom.com> [zoran.markovic@linaro.org: Rebased to latest kernel version. Added commit message. Fixed code alignment.] Signed-off-by: Zoran Markovic <zoran.markovic@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-17 15:02:14 -08:00
Paul E. McKenney	87de1cfdc5	rcu: Stop tracking FSF's postal address All of the RCU source files have the usual GPL header, which contains a long-obsolete postal address for FSF. To avoid the need to track the FSF office's movements, this commit substitutes the URL where GPL may be found. Reported-by: Greg KH <gregkh@linuxfoundation.org> Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-17 15:01:37 -08:00
Paul E. McKenney	bc72d962d6	rcu: Improve SRCU's grace-period comments This commit documents the memory-barrier guarantees provided by synchronize_srcu() and call_srcu(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-12-09 15:12:38 -08:00
Paul E. McKenney	4461212aa0	rcu: Fix srcu_barrier() docbook header The srcu_barrier() docbook header left out the "sp" argument, so this commit adds that argument's docbook text. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-12-03 10:10:19 -08:00
Paul E. McKenney	4102adab91	rcu: Move RCU-related source code to kernel/rcu directory Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Ingo Molnar <mingo@kernel.org>	2013-10-15 12:53:31 -07:00

33 Commits