linux

Commit Graph

Author	SHA1	Message	Date
Paul E. McKenney	e4696a1d3b	documentation: Fix some inconsistencies in RTFP.txt Some of the early history leaves out some citations and vice versa. This commit fixes these up. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-17 14:56:10 -08:00
Paul E. McKenney	6e67669678	documentation: Document call_rcu() safety mechanisms and limitations The call_rcu() family of primitives will take action to accelerate grace periods when the number of callbacks pending on a given CPU becomes excessive. Although this safety mechanism can be useful, it is no substitute for users of call_rcu() having rate-limit controls in place. This commit adds this nuance to the documentation. Reported-by: "Michael S. Tsirkin" <mst@redhat.com> Reported-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2014-02-17 14:55:58 -08:00
Henrik Austad	3cf8ca1c25	Documentation/: update 00-INDEX files Some of the 00-INDEX files are somewhat outdated and some folders does not contain 00-INDEX at all. Only outdated (with the notably exception of spi) indexes are touched here, the 169 folders without 00-INDEX has not been touched. New 00-INDEX - spi/* was added in a series of commits dating back to 2006 Added files (missing in (/)00-INDEX) - dmatest.txt was added by commit `851b7e16a0` ("dmatest: run test via debugfs") - this_cpu_ops.txt was added by commit `a1b2a555d6` ("percpu: add documentation on this_cpu operations") - ww-mutex-design.txt was added by commit `040a0a3710` ("mutex: Add support for wound/wait style locks") - bcache.txt was added by commit `cafe563591` ("bcache: A block layer cache") - kernel-per-CPU-kthreads.txt was added by commit `49717cb404` ("kthread: Document ways of reducing OS jitter due to per-CPU kthreads") - phy.txt was added by commit `ff76496347` ("drivers: phy: add generic PHY framework") - block/null_blk was added by commit `12f8f4fc03` ("null_blk: documentation") - module-signing.txt was added by commit `3cafea3076` ("Add Documentation/module-signing.txt file") - assoc_array.txt was added by commit `3cb989501c` ("Add a generic associative array implementation.") - arm/IXP4xx was part of the initial repo - arm/cluster-pm-race-avoidance.txt was added by commit `7fe31d28e8` ("ARM: mcpm: introduce helpers for platform coherency exit/setup") - arm/firmware.txt was added by commit `7366b92a77` ("ARM: Add interface for registering and calling firmware-specific operations") - arm/kernel_mode_neon.txt was added by commit `2afd0a0524` ("ARM: 7825/1: document the use of NEON in kernel mode") - arm/tcm.txt was added by commit `bc581770cf` ("ARM: 5580/2: ARM TCM (Tightly-Coupled Memory) support v3") - arm/vlocks.txt was added by commit `9762f12d3e` ("ARM: mcpm: Add baremetal voting mutexes") - blackfin/gptimers-example.c, Makefile was added by commit `4b60779d5e` ("Blackfin: add an 
example showing how to use the gptimers API") - devicetree/usage-model.txt was added by commit `31134efc68` ("dt: Linux DT usage model documentation") - fb/api.txt was added by commit `fb21c2f428` ("fbdev: Add FOURCC-based format configuration API") - fb/sm501.txt was added by commit `e6a0498071` ("video, sm501: add edid and commandline support") - fb/udlfb.txt was added by commit `96f8d864af` ("fbdev: move udlfb out of staging.") - filesystems/Makefile was added by commit `1e0051ae48` ("Documentation/fs/: split txt and source files") - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit `8a4c6e19cf` ("nfsd: document kernel interfaces for nfsd configuration") - ide/warm-plug-howto.txt was added by commit `f74c91413e` ("ide: add warm-plug support for IDE devices (take 2)") - laptops/Makefile was added by commit `d49129accc` ("Documentation/laptop/: split txt and source files") - leds/leds-blinkm.txt was added by commit `b54cf35a7f` ("LEDS: add BlinkM RGB LED driver, documentation and update MAINTAINERS") - leds/ledtrig-oneshot.txt was added by commit `5e417281cd` ("leds: add oneshot trigger") - leds/ledtrig-transient.txt was added by commit `44e1e9f8e7` ("leds: add new transient trigger for one shot timer activation") - m68k/README.buddha was part of the initial repo - networking/LICENSE.(qla3xxx\|qlcnic\|qlge) was added by commits `40839129f7`, `c4e84bde1d`, `5a4faa8737` - networking/Makefile was added by commit `3794f3e812` ("docsrc: build Documentation/ sources") - networking/i40evf.txt was added by commit `105bf2fe6b` ("i40evf: add driver to kernel build system") - networking/ipsec.txt was added by commit `b3c6efbc36` ("xfrm: Add file to document IPsec corner case") - networking/mac80211-auth-assoc-deauth.txt was added by commit `3cd7920a2b` ("mac80211: add auth/assoc/deauth flow diagram") - networking/netlink_mmap.txt was added by commit `5683264c39` ("netlink: add documentation for memory mapped I/O") - networking/nf_conntrack-sysctl.txt was added by 
commit `c9f9e0e159` ("netfilter: doc: add nf_conntrack sysctl api documentation") lan) - networking/team.txt was added by commit `3d249d4ca7` ("net: introduce ethernet teaming device") - networking/vxlan.txt was added by commit `d342894c5d` ("vxlan: virtual extensible lan") - power/runtime_pm.txt was added by commit `5e928f77a0` ("PM: Introduce core framework for run-time PM of I/O devices (rev. 17)") - power/charger-manager.txt was added by commit `3bb3dbbd56` ("power_supply: Add initial Charger-Manager driver") - RCU/lockdep-splat.txt was added by commit `d7bd2d68aa` ("rcu: Document interpretation of RCU-lockdep splats") - s390/kvm.txt was added by `5ecee4b` (KVM: s390: API documentation) - s390/qeth.txt was added by commit `b4d72c08b3` ("qeth: bridgeport support - basic control") - scheduler/sched-bwc.txt was added by commit `88ebc08ea9` ("sched: Add documentation for bandwidth control") - scsi/advansys.txt was added by commit `4bd6d7f356` ("[SCSI] advansys: Move documentation to Documentation/scsi") - scsi/bfa.txt was added by commit `1ec90174bd` ("[SCSI] bfa: add readme file") - scsi/bnx2fc.txt was added by commit `12b8fc10ea` ("[SCSI] bnx2fc: Add driver documentation") - scsi/cxgb3i.txt was added by commit `c3673464eb` ("[SCSI] cxgb3i: Add cxgb3i iSCSI driver.") - scsi/hpsa.txt was added by commit `992ebcf14f` ("[SCSI] hpsa: Add hpsa.txt to Documentation/scsi") - scsi/link_power_management_policy.txt was added by commit `ca77329fb7` ("[libata] Link power management infrastructure") - scsi/osd.txt was added by commit `78e0c621de` ("[SCSI] osd: Documentation for OSD library") - scsi/scsi-parameter.txt was created/moved by commit `163475fb11` ("Documentation: move SCSI parameters to their own text file") - serial/driver was part of the initial repo - serial/n_gsm.txt was added by commit `323e84122e` ("n_gsm: add a documentation") - timers/Makefile was added by commit `3794f3e812` ("docsrc: build Documentation/ sources") - virt/kvm/s390.txt was added by commit 
`d9101fca3d` ("KVM: s390: diagnose call documentation") - vm/split_page_table_lock was added by commit `49076ec2cc` ("mm: dynamically allocate page->ptl if it cannot be embedded to struct page") - w1/slaves/w1_ds28e04 was added by commit `fbf7f7b4e2` ("w1: Add 1-wire slave device driver for DS28E04-100") - w1/masters/omap-hdq was added by commit `e0a29382c6` ("hdq: documentation for OMAP HDQ") - x86/early-microcode.txt was added by commit `0d91ea86a8` ("x86, doc: Documentation for early microcode loading") - x86/earlyprintk.txt was added by commit `a1aade4788` ("x86/doc: mini-howto for using earlyprintk=dbgp") - x86/entry_64.txt was added by commit `8b4777a4b5` ("x86-64: Document some of entry_64.S") - x86/pat.txt was added by commit `d27554d874` ("x86: PAT documentation") Moved files - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by commit `37b8304642` ("ARM: kuser: move interface documentation out of the source code") - efi-stub.txt was moved out of x86/ and down into Documentation/ in commit `4172fe2f8a` ("EFI stub documentation updates") - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit `efcfed9bad` ("Move hp_accel to drivers/platform/x86") - commit `5616c23ad9` ("x86: doc: move x86-generic documentation from Doc/x86/i386"): x86/usb-legacy-support.txt * x86/boot.txt * x86/zero_page.txt - power/video_extension.txt was moved to acpi in commit `70e66e4df1` ("ACPI / video: move video_extension.txt to Documentation/acpi") Removed files (left in 00-INDEX) - memory.txt was removed by commit `00ea8990aa` ("memory.txt: remove stray information") - gpio.txt was moved to gpio/ in commit `fd8e198cfc` ("Documentation: gpiolib: document new interface") - networking/DLINK.txt was removed by commit `168e06ae26` ("drivers/net: delete old parallel port de600/de620 drivers") - serial/hayes-esp.txt was removed by commit `f53a2ade0b` ("tty: esp: remove broken driver") - s390/TAPE was removed by commit `9e280f6693` ("[S390] remove tape block 
docu") - vm/locking was removed by commit `57ea8171d2` ("mm: documentation: remove hopelessly out-of-date locking doc") - laptops/acer-wmi.txt was removed by commit `020036678e` ("acer-wmi: Delete out-of-date documentation") Typos/misc issues - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit `030d794bf4` ("SUNRPC: Use gssproxy upcall for server RPCGSS authentication.") - commit `b88cf73d92` ("net: add missing entries to Documentation/networking/00-INDEX") * generic-hdlc.txt was added as generic_hdlc.txt * spider_net.txt was added as spider-net.txt - w1/master/mxc-w1 was added as mxc_w1 by commit `a5fd9139f7` ("w1: add 1-wire master driver for i.MX27 / i.MX31") - s390/zfcpdump.txt was added as zfcpdump by commit `6920c12a40` ("[S390] Add Documentation/s390/00-INDEX.") Signed-off-by: Henrik Austad <henrik@austad.us> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [rcu bits] Acked-by: Rob Landley <rob@landley.net> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Mark Brown <broonie@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Gleb Natapov <gleb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Len Brown <len.brown@intel.com> Cc: James Bottomley <JBottomley@parallels.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-02-10 16:01:40 -08:00
Paul E. McKenney	0d3c55bc9f	Merge branches 'doc.2013.12.03a', 'fixes.2013.12.12a', 'rcutorture.2013.12.03a' and 'sparse.2013.12.12a' into HEAD doc.2013.12.03a: Topic branch for documentation changes. fixes.2013.12.12a: Topic branch for miscellaneous fixes. rcutorture.2013.12.03a: Topic branch for new rcutorture/KVM scripting. sparse.2013.12.12a: Topic branch for sparse-RCU changes.	2013-12-12 12:35:38 -08:00
Paul E. McKenney	96d3fd0d31	rcu: Break call_rcu() deadlock involving scheduler and perf Dave Jones got the following lockdep splat: > ====================================================== > [ INFO: possible circular locking dependency detected ] > 3.12.0-rc3+ #92 Not tainted > ------------------------------------------------------- > trinity-child2/15191 is trying to acquire lock: > (&rdp->nocb_wq){......}, at: [<ffffffff8108ff43>] __wake_up+0x23/0x50 > > but task is already holding lock: > (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230 > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #3 (&ctx->lock){-.-...}: > [<ffffffff810cc243>] lock_acquire+0x93/0x200 > [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80 > [<ffffffff811500ff>] __perf_event_task_sched_out+0x2df/0x5e0 > [<ffffffff81091b83>] perf_event_task_sched_out+0x93/0xa0 > [<ffffffff81732052>] __schedule+0x1d2/0xa20 > [<ffffffff81732f30>] preempt_schedule_irq+0x50/0xb0 > [<ffffffff817352b6>] retint_kernel+0x26/0x30 > [<ffffffff813eed04>] tty_flip_buffer_push+0x34/0x50 > [<ffffffff813f0504>] pty_write+0x54/0x60 > [<ffffffff813e900d>] n_tty_write+0x32d/0x4e0 > [<ffffffff813e5838>] tty_write+0x158/0x2d0 > [<ffffffff811c4850>] vfs_write+0xc0/0x1f0 > [<ffffffff811c52cc>] SyS_write+0x4c/0xa0 > [<ffffffff8173d4e4>] tracesys+0xdd/0xe2 > > -> #2 (&rq->lock){-.-.-.}: > [<ffffffff810cc243>] lock_acquire+0x93/0x200 > [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80 > [<ffffffff810980b2>] wake_up_new_task+0xc2/0x2e0 > [<ffffffff81054336>] do_fork+0x126/0x460 > [<ffffffff81054696>] kernel_thread+0x26/0x30 > [<ffffffff8171ff93>] rest_init+0x23/0x140 > [<ffffffff81ee1e4b>] start_kernel+0x3f6/0x403 > [<ffffffff81ee1571>] x86_64_start_reservations+0x2a/0x2c > [<ffffffff81ee1664>] x86_64_start_kernel+0xf1/0xf4 > > -> #1 (&p->pi_lock){-.-.-.}: > [<ffffffff810cc243>] lock_acquire+0x93/0x200 > [<ffffffff8173419b>] 
_raw_spin_lock_irqsave+0x4b/0x90 > [<ffffffff810979d1>] try_to_wake_up+0x31/0x350 > [<ffffffff81097d62>] default_wake_function+0x12/0x20 > [<ffffffff81084af8>] autoremove_wake_function+0x18/0x40 > [<ffffffff8108ea38>] __wake_up_common+0x58/0x90 > [<ffffffff8108ff59>] __wake_up+0x39/0x50 > [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0 > [<ffffffff81111450>] __call_rcu+0x140/0x820 > [<ffffffff81111b8d>] call_rcu+0x1d/0x20 > [<ffffffff81093697>] cpu_attach_domain+0x287/0x360 > [<ffffffff81099d7e>] build_sched_domains+0xe5e/0x10a0 > [<ffffffff81efa7fc>] sched_init_smp+0x3b7/0x47a > [<ffffffff81ee1f4e>] kernel_init_freeable+0xf6/0x202 > [<ffffffff817200be>] kernel_init+0xe/0x190 > [<ffffffff8173d22c>] ret_from_fork+0x7c/0xb0 > > -> #0 (&rdp->nocb_wq){......}: > [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0 > [<ffffffff810cc243>] lock_acquire+0x93/0x200 > [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90 > [<ffffffff8108ff43>] __wake_up+0x23/0x50 > [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0 > [<ffffffff81111450>] __call_rcu+0x140/0x820 > [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30 > [<ffffffff81149abf>] put_ctx+0x4f/0x70 > [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230 > [<ffffffff81056b8d>] do_exit+0x30d/0xcc0 > [<ffffffff8105893c>] do_group_exit+0x4c/0xc0 > [<ffffffff810589c4>] SyS_exit_group+0x14/0x20 > [<ffffffff8173d4e4>] tracesys+0xdd/0xe2 > > other info that might help us debug this: > > Chain exists of: > &rdp->nocb_wq --> &rq->lock --> &ctx->lock > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(&ctx->lock); > lock(&rq->lock); > lock(&ctx->lock); > lock(&rdp->nocb_wq); > > * DEADLOCK * > > 1 lock held by trinity-child2/15191: > #0: (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230 > > stack backtrace: > CPU: 2 PID: 15191 Comm: trinity-child2 Not tainted 3.12.0-rc3+ #92 > ffffffff82565b70 ffff880070c2dbf8 ffffffff8172a363 ffffffff824edf40 > ffff880070c2dc38 ffffffff81726741 
ffff880070c2dc90 ffff88022383b1c0 > ffff88022383aac0 0000000000000000 ffff88022383b188 ffff88022383b1c0 > Call Trace: > [<ffffffff8172a363>] dump_stack+0x4e/0x82 > [<ffffffff81726741>] print_circular_bug+0x200/0x20f > [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0 > [<ffffffff810c6439>] ? get_lock_stats+0x19/0x60 > [<ffffffff8100b2f4>] ? native_sched_clock+0x24/0x80 > [<ffffffff810cc243>] lock_acquire+0x93/0x200 > [<ffffffff8108ff43>] ? __wake_up+0x23/0x50 > [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90 > [<ffffffff8108ff43>] ? __wake_up+0x23/0x50 > [<ffffffff8108ff43>] __wake_up+0x23/0x50 > [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0 > [<ffffffff81111450>] __call_rcu+0x140/0x820 > [<ffffffff8109bc8f>] ? local_clock+0x3f/0x50 > [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30 > [<ffffffff81149abf>] put_ctx+0x4f/0x70 > [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230 > [<ffffffff81056b8d>] do_exit+0x30d/0xcc0 > [<ffffffff810c9af5>] ? trace_hardirqs_on_caller+0x115/0x1e0 > [<ffffffff810c9bcd>] ? trace_hardirqs_on+0xd/0x10 > [<ffffffff8105893c>] do_group_exit+0x4c/0xc0 > [<ffffffff810589c4>] SyS_exit_group+0x14/0x20 > [<ffffffff8173d4e4>] tracesys+0xdd/0xe2 The underlying problem is that perf is invoking call_rcu() with the scheduler locks held, but in NOCB mode, call_rcu() will with high probability invoke the scheduler -- which just might want to use its locks. The reason that call_rcu() needs to invoke the scheduler is to wake up the corresponding rcuo callback-offload kthread, which does the job of starting up a grace period and invoking the callbacks afterwards. One solution (championed on a related problem by Lai Jiangshan) is to simply defer the wakeup to some point where scheduler locks are no longer held. Since we don't want to unnecessarily incur the cost of such deferral, the task before us is threefold: 1. Determine when it is likely that a relevant scheduler lock is held. 2. Defer the wakeup in such cases. 3. 
Ensure that all deferred wakeups eventually happen, preferably sooner rather than later. We use irqs_disabled_flags() as a proxy for relevant scheduler locks being held. This works because the relevant locks are always acquired with interrupts disabled. We may defer more often than needed, but that is at least safe. The wakeup deferral is tracked via a new field in the per-CPU and per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup. This flag is checked by the RCU core processing. The __rcu_pending() function now checks this flag, which causes rcu_check_callbacks() to initiate RCU core processing at each scheduling-clock interrupt where this flag is set. Of course this is not sufficient because scheduling-clock interrupts are often turned off (the things we used to be able to count on!). So the flags are also checked on entry to any state that RCU considers to be idle, which includes both NO_HZ_IDLE idle state and NO_HZ_FULL user-mode-execution state. This approach should allow call_rcu() to be invoked regardless of what locks you might be holding, the key word being "should". Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org>	2013-12-03 10:10:18 -08:00
Paul E. McKenney	c9592ecb98	rcu: Fix typo in Documentation/RCU/trace.txt Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-12-03 10:08:56 -08:00
Michael Opdenacker	4b0d3f0fde	rcu: Fix occurrence of "the the" in checklist.txt Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Add "then" as suggested by Josh Triplett. ] Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 10:07:02 -07:00
Paul E. McKenney	64d3b7a1d5	rcu: Update stall-warning documentation Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-09-25 06:49:37 -07:00
Paul E. McKenney	25f27ce4a6	Merge branches 'doc.2013.08.19a', 'fixes.2013.08.20a', 'sysidle.2013.08.31a' and 'torture.2013.08.20a' into HEAD doc.2013.08.19a: Documentation updates fixes.2013.08.20a: Miscellaneous fixes sysidle.2013.08.31a: Detect system-wide idle state. torture.2013.08.20a: rcutorture updates.	2013-08-31 14:44:45 -07:00
Paul E. McKenney	2ec1f2d987	rcu: Increase rcutorture test coverage Currently, rcutorture has separate torture_types to test synchronous, asynchronous, and expedited grace-period primitives. This has two disadvantages: (1) Three times the number of runs to cover the combinations and (2) Little testing of concurrent combinations of the three options. This commit therefore adds a pair of module parameters that control normal and expedited state, with the default being both types, randomly selected, by the fakewriter processes, thus reducing source-code size and increasing test coverage. In addition, the writer task switches between asynchronous-normal and expedited grace-period primitives driven by the same pair of module parameters. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-20 11:38:41 -07:00
Paul E. McKenney	6ae3771850	rcu: Update RTFP documentation Note that this commit also updates the formatting of several of the bibtex entries to conform to that of my .bib files. I started accumulating entries back in the 1980s, back when bibtex insisted that comma (",") was a separator, not a terminator. This rule forced commas to the fronts of lines. 25 years later, bibtex allows commas to be terminators, but I am too lazy to rework all my .bib files. Keeping the same format as my .bib files allows me to simply incorporate my RCU.bib file into Documentation/RCU/RTFP.txt, which is much easier than my earlier practice of keeping track of what had changed and adding individual entries. (I sometimes find relevant papers that were published some years back, for example.) In addition, this change adds entries for papers published in the last year or so. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-19 21:39:27 -07:00
Paul E. McKenney	d84297c99b	rcu: Fix rcu_barrier() documentation There was a time when rcu_barrier() was guaranteed to wait for at least a grace period, but that time ended due to energy-efficiency concerns. So now rcu_barrier() is a no-op if there are no RCU callbacks queued in the system. This commit updates the documentation to reflect this change. Now, rcu_barrier() often does wait for a grace period, so, one could imagine some modification to rcu_barrier() to more efficiently handle cases where both rcu_barrier() and a grace period are needed. But this must wait until someone shows a real-world need for a change. Reported-by: Bob Copeland <bob@cozybit.com> Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-08-18 17:05:32 -07:00
Paul E. McKenney	be77f87c00	Merge branches 'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', 'srcu.2013.06.10a' and 'tiny.2013.06.10a' into HEAD cbnum.2013.06.10a: Apply simplifications stemming from the new callback numbering. doc.2013.06.10a: Documentation updates. fixes.2013.06.10a: Miscellaneous fixes. srcu.2013.06.10a: Updates to SRCU. tiny.2013.06.10a: Eliminate TINY_PREEMPT_RCU.	2013-06-10 13:46:44 -07:00
Paul E. McKenney	7807acdb6b	rcu: Remove TINY_PREEMPT_RCU tracing documentation Because TINY_PREEMPT_RCU is no more, this commit removes its tracing formats from the documentation. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-06-10 13:45:52 -07:00
Paul E. McKenney	99f88919f8	rcu: Remove srcu_read_lock_raw() and srcu_read_unlock_raw(). These interfaces never did get used, so this commit removes them, their rcutorture tests, and documentation referencing them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2013-06-10 13:45:25 -07:00
Frederic Weisbecker	c032862fba	Merge commit '8700c95adb03' into timers/nohz The full dynticks tree needs the latest RCU and sched upstream updates in order to fix some dependencies. Merge a common upstream merge point that has these updates. Conflicts: include/linux/perf_event.h kernel/rcutree.h kernel/rcutree_plugin.h Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2013-05-02 17:54:19 +02:00
Frederic Weisbecker	3451d0243c	nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMON We are planning to convert the dynticks Kconfig options layout into a choice menu. The user must be able to easily pick any of the following implementations: constant periodic tick, idle dynticks, full dynticks. As this implies a mutual exclusion, the two dynticks implementations need to converge on the selection of a common Kconfig option in order to ease the sharing of a common infrastructure. It would thus seem pretty natural to reuse CONFIG_NO_HZ to that end. It already implements all the idle dynticks code and the full dynticks depends on all that code for now. So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ. On the other hand we want to stay backward compatible: if CONFIG_NO_HZ is set in an older config file, we want to enable CONFIG_NO_HZ_IDLE by default. But we can't afford both at the same time or we run into a circular dependency: 1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select CONFIG_NO_HZ 2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE We might be able to support that from Kconfig/Kbuild but it may not be wise to introduce such a confusing behaviour. So to solve this, create a new CONFIG_NO_HZ_COMMON option which gathers the common code between idle and full dynticks (that common code for now is simply the idle dynticks code) and select it from their referring Kconfig. Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ to it for backward compatibility. 
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>	2013-04-03 13:56:03 +02:00
Paul E. McKenney	6d87669357	Merge branches 'doc.2013.03.12a', 'fixes.2013.03.13a' and 'idlenocb.2013.03.26b' into HEAD doc.2013.03.12a: Documentation changes. fixes.2013.03.13a: Miscellaneous fixes. idlenocb.2013.03.26b: Remove restrictions on no-CBs CPUs, make RCU_FAST_NO_HZ take advantage of numbered callbacks, add callback acceleration based on numbered callbacks.	2013-03-26 08:07:38 -07:00
Paul E. McKenney	6231069bda	rcu: Add softirq-stall indications to stall-warning messages If RCU's softirq handler is prevented from executing, an RCU CPU stall warning can result. Ways to prevent RCU's softirq handler from executing include: (1) CPU spinning with interrupts disabled, (2) infinite loop in some softirq handler, and (3) in -rt kernels, an infinite loop in a set of real-time threads running at priorities higher than that of RCU's softirq handler. Because this situation can be difficult to track down, this commit causes the count of RCU softirq handler invocations to be printed with RCU CPU stall warnings. This information does require some interpretation, as now documented in Documentation/RCU/stallwarn.txt. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2013-03-13 14:43:56 -07:00
Paul E. McKenney	3f944adb9d	rcu: Documentation update This commit applies a few updates based on a quick review of the RCU documentations. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-03-12 14:09:02 -07:00
Paul E. McKenney	4357fb570b	rcu: Make bugginess of code sample more evident One of the code samples in whatisRCU.txt shows a bug, but someone scanning the document quickly might mistake it for a valid use of RCU. Add some screaming comments to help keep speed-readers on track. Reported-by: Nathan Zimmer <nzimmer@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-03-12 14:09:01 -07:00
Paul E. McKenney	aac1cda34b	Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD urgent.2012.10.27a: Fix for RCU user-mode transition (already in -tip). doc.2012.11.08a: Documentation updates, most notably codifying the memory-barrier guarantees inherent to grace periods. fixes.2012.11.13a: Miscellaneous fixes. srcu.2012.10.27a: Allow statically allocated and initialized srcu_struct structures (courtesy of Lai Jiangshan). stall.2012.11.13a: Add more diagnostic information to RCU CPU stall warnings, also decrease from 60 seconds to 21 seconds. hotplug.2012.11.08a: Minor updates to CPU hotplug handling. tracing.2012.11.08a: Improved debugfs tracing, courtesy of Michael Wang. idle.2012.10.24a: Updates to RCU idle/adaptive-idle handling, including a boot parameter that maps normal grace periods to expedited. Resolved conflict in kernel/rcutree.c due to side-by-side change.	2012-11-16 09:59:58 -08:00
Paul E. McKenney	d484a21513	rcu: Add documentation for the new rcuexp debugfs trace file This commit adds the documentation of the rcuexp debugfs trace file that records statistics for expedited grace periods. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-16 09:55:31 -08:00
Paul E. McKenney	40e80c469f	rcu: Update documentation for TREE_RCU debugfs tracing This commit updates the tracing documentation to reflect the new format that has per-RCU-flavor directories. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-16 09:54:02 -08:00
Paul E. McKenney	bb08f76d84	rcu: Remove list_for_each_continue_rcu() The list_for_each_continue_rcu() macro is no longer used, so this commit removes it. The list_for_each_entry_continue_rcu() macro should be used instead. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-13 14:08:21 -08:00
Paul E. McKenney	a4d611fdca	rcu: Document alternative RCU/reference-count algorithms The approach for mixing RCU and reference counting listed in the RCU documentation only describes one possible approach. This approach can result in failure on the read side, which is nice if you want fresh data, but not so good if you want simple code. This commit therefore adds two additional approaches that feature unconditional reference-count acquisition by RCU readers. These approaches are very similar to that used in the security code. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-08 11:44:38 -08:00
Kees Cook	57d34a6cee	rcu: Update docs to include kfree_rcu() Mention kfree_rcu() in the call_rcu() section. Additionally fix the example code for list replacement that used the wrong structure element. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-08 11:44:25 -08:00
Dhaval Giani	0f9574d832	rcu: Correct the name of a reference in list of RCU papers Trying to go through the history of RCU (not for the weak minded) led me to search for a non-existent paper. Correct it to the actual reference Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-23 14:44:47 -07:00
Paul E. McKenney	bda4ec9f6a	Merge branches 'bigrt.2012.09.23a', 'doctorture.2012.09.23a', 'fixes.2012.09.23a', 'hotplug.2012.09.23a' and 'idlechop.2012.09.23a' into HEAD bigrt.2012.09.23a contains additional commits to reduce scheduling latency from RCU on huge systems (many hundreds or thousands of CPUs). doctorture.2012.09.23a contains documentation changes and rcutorture fixes. fixes.2012.09.23a contains miscellaneous fixes. hotplug.2012.09.23a contains CPU-hotplug-related changes. idle.2012.09.23a fixes architectures for which RCU no longer considered the idle loop to be a quiescent state due to earlier adaptive-dynticks changes. Affected architectures are alpha, cris, frv, h8300, m32r, m68k, mn10300, parisc, score, xtensa, and ia64.	2012-09-24 20:02:22 -07:00
Paul E. McKenney	86f343b50b	rcu: Fix CONFIG_RCU_FAST_NO_HZ stall warning message The print_cpu_stall_fast_no_hz() function attempts to print -1 when the ->idle_gp_timer is not pending, but unsigned arithmetic causes it to instead print ULONG_MAX, which is 4294967295 on 32-bit systems and 18446744073709551615 on 64-bit systems. Neither of these is a particularly reader-friendly value, so this commit instead causes "timer not pending" to be printed when ->idle_gp_timer is not pending. Reported-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-09-23 07:42:52 -07:00
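The unsigned-arithmetic pitfall described in the row above can be sketched in plain userspace C. This is only an illustration, not the kernel's code: the function names `buggy_remaining()` and `format_timer_remaining()` and their parameters are hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical illustration of the bug pattern: the intent is to report -1
 * for "not pending", but the result type is unsigned long, so -1 is
 * converted to ULONG_MAX.
 */
unsigned long buggy_remaining(int pending, unsigned long expires,
                              unsigned long now)
{
    return pending ? expires - now : -1;
}

/*
 * Hypothetical fixed variant: emit a readable string instead of relying
 * on a sentinel value that cannot be represented in an unsigned type.
 */
void format_timer_remaining(char *buf, size_t len, int pending,
                            unsigned long expires, unsigned long now)
{
    if (!pending) {
        snprintf(buf, len, "timer not pending");
        return;
    }
    snprintf(buf, len, "%lu", expires - now);
}
```

With these definitions, `buggy_remaining(0, 0, 0)` yields `ULONG_MAX`, whereas the string-based variant reports the non-pending case explicitly.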
Paul E. McKenney	2aef619c75	rcu: Document SRCU dead-CPU capabilities, emphasize read-side limits The current documentation did not help someone grepping for SRCU to learn that disabling preemption is not a replacement for srcu_read_lock(), so upgrade the documentation to bring this out, not just for SRCU, but also for RCU-bh. Also document the fact that SRCU readers are respected on CPUs executing in user mode, idle CPUs, and even on offline CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>	2012-09-23 07:42:23 -07:00
Paul E. McKenney	4605c0143c	rcu: Adjust debugfs tracing for kthread-based quiescent-state forcing Moving quiescent-state forcing into a kthread dispenses with the need for the ->n_rp_need_fqs field, so this commit removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-09-23 07:41:54 -07:00
Paul E. McKenney	74d874e7bd	rcu: Update documentation to cover call_srcu() and srcu_barrier(). The advent of call_srcu() and srcu_barrier() obsoleted some of the documentation, so this commit brings that up to date. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-07-02 12:34:03 -07:00
Paul E. McKenney	fae4b54f28	rcu: Introduce rcutorture testing for rcu_barrier() Although rcutorture does invoke rcu_barrier() and friends, it cannot really be called a torture test given that it invokes them only once at the end of the test. This commit therefore introduces heavy-duty rcutorture testing for rcu_barrier(), which may be carried out concurrently with normal rcutorture testing. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:18 -07:00
Paul E. McKenney	236fefafe5	rcu: Call out dangers of expedited RCU primitives The expedited RCU primitives can be quite useful, but they have some high costs as well. This commit updates and creates docbook comments calling out the costs, and updates the RCU documentation as well. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:06:08 -08:00
Paul E. McKenney	2036d94a7b	rcu: Rework detection of use of RCU by offline CPUs Because newly offlined CPUs continue executing after completing the CPU_DYING notifiers, they legitimately enter the scheduler and use RCU while appearing to be offline. This calls for a more sophisticated approach as follows: 1. RCU marks the CPU online during the CPU_UP_PREPARE phase. 2. RCU marks the CPU offline during the CPU_DEAD phase. 3. Diagnostics regarding use of read-side RCU by offline CPUs use RCU's accounting rather than the cpu_online_map. (Note that __call_rcu() still uses cpu_online_map to detect illegal invocations within CPU_DYING notifiers.) 4. Offline CPUs are prevented from hanging the system by force_quiescent_state(), which pays attention to cpu_online_map. Some additional work (in a later commit) will be needed to guarantee that force_quiescent_state() waits a full jiffy before assuming that a CPU is offline, for example, when called from idle entry. (This commit also makes the one-jiffy wait explicit, since the old-style implicit wait can now be defeated by RCU_FAST_NO_HZ and by rcutorture.) This approach avoids the false positives encountered when attempting to use more exact classification of CPU online/offline state. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:06:07 -08:00
Paul E. McKenney	24cd7fd0ea	rcu: Update stall-warning documentation Add documentation of CONFIG_RCU_CPU_STALL_VERBOSE, CONFIG_RCU_CPU_STALL_INFO, and RCU_STALL_DELAY_DELTA. Describe multiple stall-warning messages from a single stall, and the timing of the subsequent messages. Add headings. Remove RCU_SECONDS_TILL_STALL_RECHECK because this value is now computed at runtime from RCU_CPU_STALL_TIMEOUT, so that sysfs changes to the timeout value now directly affect the RCU_SECONDS_TILL_STALL_RECHECK value. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:52 -08:00
Paul E. McKenney	c13f3757d0	rcu: Add CPU-stall capability to rcutorture Add module parameters to rcutorture that induce a CPU stall. The stall_cpu parameter specifies how long to stall in seconds, defaulting to zero, which indicates no stalling is to be undertaken. The stall_cpu_holdoff parameter specifies how many seconds after insmod (or boot, if rcutorture is built into the kernel) that this stall is to start. The default value for stall_cpu_holdoff is ten seconds. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:52 -08:00
Paul E. McKenney	105617da8d	rcu: Make documentation give more realistic rcutorture duration The torture.txt documentation gives an example rcutorture run with a 100-second duration. This is ridiculously short, unless maybe testing a fix for an egregious bug. Use a more-realistic one-hour duration for the example. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:51 -08:00
Paul E. McKenney	9b9ec9b90e	rcutorture: Permit holding off CPU-hotplug operations during boot When rcutorture is started automatically at boot time, it might well also start CPU-hotplug operations at that time, which might not be desirable. This commit therefore adds an rcutorture parameter that allows CPU-hotplug operations to be held off for the specified number of seconds after the start of boot. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:50 -08:00
Paul E. McKenney	4c62abc90b	rcu: Bring RTFP.txt up to date. Add publications from 2010 and 2011 to RTFP.txt. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:21 -08:00
Kees Cook	d493011a37	docs: Additional LWN links to RCU API Tyler Hicks pointed me at an additional article on RCU and I figured it should probably be mentioned with the others. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:32:23 -08:00
Paul E. McKenney	b58bdccaa8	rcu: Add rcutorture CPU-hotplug capability Running CPU-hotplug operations concurrently with rcutorture has historically been a good way to find bugs in both RCU and CPU hotplug. This commit therefore adds an rcutorture module parameter called "onoff_interval" that causes a randomly selected CPU-hotplug operation to be executed at the specified interval, in seconds. The default value of "onoff_interval" is zero, which disables rcutorture-instigated CPU-hotplug operations. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:56 -08:00
Paul E. McKenney	d5f546d834	rcu: Add rcutorture system-shutdown capability Although it is easy to run rcutorture tests under KVM, there is currently no nice way to run such a test for a fixed time period, collect all of the rcutorture data, and then shut the system down cleanly. This commit therefore adds an rcutorture module parameter named "shutdown_secs" that specifies the run duration in seconds, after which rcutorture terminates the test and powers the system down. The default value for "shutdown_secs" is zero, which disables shutdown. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:46 -08:00
Paul E. McKenney	9ceae0e248	rcu: Add documentation for raw SRCU read-side primitives Update various files in Documentation/RCU to reflect srcu_read_lock_raw() and srcu_read_unlock_raw(). Credit to Peter Zijlstra for suggesting use of the existing _raw suffix instead of the earlier bulkref names. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-12-11 10:31:41 -08:00
Paul E. McKenney	2c01531f08	rcu: Document failing tick as cause of RCU CPU stall warning One of lclaudio's systems was seeing RCU CPU stall warnings from idle. These turned out to be caused by a bug that stopped scheduling-clock tick interrupts from being sent to a given CPU for several hundred seconds. This commit therefore updates the documentation to call this out as a possible cause for RCU CPU stall warnings. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:26 -08:00
Paul E. McKenney	9b2e4f1880	rcu: Track idleness independent of idle tasks Earlier versions of RCU used the scheduling-clock tick to detect idleness by checking for the idle task, but handled idleness differently for CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side critical sections in the idle task, for example, for tracing. A more fine-grained detection of idleness is therefore required. This commit presses the old dyntick-idle code into full-time service, so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is always invoked at the beginning of an idle-loop iteration. Similarly, rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked at the end of an idle-loop iteration. This allows the idle task to use RCU everywhere except between consecutive rcu_idle_enter() and rcu_idle_exit() calls, in turn allowing architecture maintainers to specify exactly where in the idle loop RCU may be used. Because some of the userspace upcall uses can result in what looks to RCU like half of an interrupt, it is not possible to expect that the irq_enter() and irq_exit() hooks will give exact counts. This patch therefore expands the ->dynticks_nesting counter to 64 bits and uses two separate bitfields to count process/idle transitions and interrupt entry/exit transitions. It is presumed that userspace upcalls do not happen in the idle loop or from usermode execution (though usermode might do a system call that results in an upcall). The counter is hard-reset on each process/idle transition, which prevents the interrupt entry/exit error from accumulating. Overflow is avoided by the 64-bitness of the ->dynticks_nesting counter. This commit also adds warnings if a non-idle task asks RCU to enter idle state (and these checks will need some adjustment before applying Frederic's OS-jitter patches (http://lkml.org/lkml/2011/10/7/246)). In addition, validation of ->dynticks and ->dynticks_nesting is added. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:24 -08:00
Paul E. McKenney	d7bd2d68aa	rcu: Document interpretation of RCU-lockdep splats There has been quite a bit of confusion about what RCU-lockdep splats mean, so this commit adds some documentation describing how to interpret them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:28 -07:00
Paul E. McKenney	8cd889cbb6	rcu: Update documentation for additional RCU lockdep functions Add documentation for rcu_dereference_bh_check(), rcu_dereference_sched_check(), srcu_dereference_check(), and rcu_dereference_index_check(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:25 -07:00
Michal Hocko	e5177ec77d	rcu: Not necessary to pass rcu_read_lock_held() to rcu_dereference_protected() Since `ca5ecddf` (rcu: define __rcu address space modifier for sparse), rcu_dereference_check() automatically uses rcu_read_lock_held() as part of its condition. Therefore, callers of rcu_dereference_check() no longer need to pass rcu_read_lock_held() to rcu_dereference_check(). Signed-off-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:23 -07:00
Paul E. McKenney	e4cc1f22b2	rcu: Simplify quiescent-state accounting There is often a delay between the time that a CPU passes through a quiescent state and the time that this quiescent state is reported to the RCU core. It is quite possible that the grace period ended before the quiescent state could be reported, for example, some other CPU might have deduced that this CPU passed through dyntick-idle mode. It is critically important that quiescent state be counted only against the grace period that was in effect at the time that the quiescent state was detected. Previously, this was handled by recording the number of the last grace period to complete when passing through a quiescent state. The RCU core then checks this number against the current value, and rejects the quiescent state if there is a mismatch. However, one additional possibility must be accounted for, namely that the quiescent state was recorded after the prior grace period completed but before the current grace period started. In this case, the RCU core must reject the quiescent state, but the recorded number will match. This is handled when the CPU becomes aware of a new grace period -- at that point, it invalidates any prior quiescent state. This works, but is a bit indirect. The new approach records the current grace period, and the RCU core checks to see (1) that this is still the current grace period and (2) that this grace period has not yet ended. This approach simplifies reasoning about correctness, and this commit changes over to this new approach. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:38:22 -07:00
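The two-part check described in the row above (the recorded grace period is still the current one, and that grace period has not yet ended) might be sketched as follows. This is a minimal userspace model, not the kernel's implementation: the struct `gp_state`, its fields, and `qs_counts()` are hypothetical names, and the sketch assumes that a grace period is in progress exactly when the number of the current grace period differs from the number of the last completed one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of grace-period bookkeeping. */
struct gp_state {
    unsigned long gpnum;     /* number of the current grace period */
    unsigned long completed; /* number of the last completed grace period */
};

/*
 * Return true if a quiescent state recorded during grace period
 * recorded_gpnum may be counted: (1) that grace period must still be the
 * current one, and (2) it must not have ended yet (assumed here to mean
 * completed != gpnum).
 */
bool qs_counts(const struct gp_state *gp, unsigned long recorded_gpnum)
{
    assert(gp != NULL);
    return recorded_gpnum == gp->gpnum && gp->completed != gp->gpnum;
}
```

Under this model, a quiescent state recorded against an older grace period, or against one that has already completed, is rejected without any separate invalidation step.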
Paul E. McKenney	b15a2e7d16	rcu: Fix RCU's NMI documentation It has long been the case that the architecture must call nmi_enter() and nmi_exit() rather than irq_enter() and irq_exit() in order to permit RCU read-side critical sections in NMIs. Catch the documentation up with reality. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>	2011-09-28 21:36:44 -07:00
Paul E. McKenney	bdf2a43649	rcu: Catch rcutorture up to new RCU API additions Now that the RCU API contains synchronize_rcu_bh(), synchronize_sched(), call_rcu_sched(), and rcu_bh_expedited()... Make rcutorture test synchronize_rcu_bh(), getting rid of the old rcu_bh_torture_synchronize() workaround. Similarly, make rcutorture test synchronize_sched(), getting rid of the old sched_torture_synchronize() workaround. Make rcutorture test call_rcu_sched() instead of wrappering synchronize_sched(). Also add testing of rcu_bh_expedited(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:43 -07:00
Paul E. McKenney	63cd758e07	rcu: Update rcutorture documentation Update rcutorture documentation to account for boosting, new types of RCU torture testing that have been added over the past few years, and the memory-barrier testing that was added an embarrassingly long time ago. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:39 -07:00
Paul E. McKenney	d5988af531	rcu: Update documentation to flag RCU_BOOST trace information Call out the RCU_TRACE information that is provided only in kernels built with RCU_BOOST. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:35 -07:00
Wanlong Gao	25eb650a69	doc: fix wrong arch/i386 references Change all "arch/i386" to "arch/x86" in Documentation/, since the directory has changed. Also update the files which have changed their filename in the meantime accordingly. Signed-off-by: Wanlong Gao <wanlong.gao@gmail.com> [jkosina@suse.cz: reword changelog] Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-06-13 13:43:05 +02:00
Paul E. McKenney	23b5c8fa01	rcu: Decrease memory-barrier usage based on semi-formal proof (Note: this was reverted, and is now being re-applied in pieces, with this being the fifth and final piece. See below for the reason that it is now felt to be safe to re-apply this.) Commit `d09b62d` fixed grace-period synchronization, but left some smp_mb() invocations in rcu_process_callbacks() that are no longer needed, but sheer paranoia prevented them from being removed. This commit removes them and provides a proof of correctness in their absence. It also adds a memory barrier to rcu_report_qs_rsp() immediately before the update to rsp->completed in order to handle the theoretical possibility that the compiler or CPU might move massive quantities of code into a lock-based critical section. This also proves that the sheer paranoia was not entirely unjustified, at least from a theoretical point of view. In addition, the old dyntick-idle synchronization depended on the fact that grace periods were many milliseconds in duration, so that it could be assumed that no dyntick-idle CPU could reorder a memory reference across an entire grace period. Unfortunately for this design, the addition of expedited grace periods breaks this assumption, which has the unfortunate side-effect of requiring atomic operations in the functions that track dyntick-idle state for RCU. (There is some hope that the algorithms used in user-level RCU might be applied here, but some work is required to handle the NMIs that user-space applications can happily ignore. For the short term, better safe than sorry.) This proof assumes that neither compiler nor CPU will allow a lock acquisition and release to be reordered, as doing so can result in deadlock. The proof is as follows: 1. A given CPU declares a quiescent state under the protection of its leaf rcu_node's lock. 2. 
If there is more than one level of rcu_node hierarchy, the last CPU to declare a quiescent state will also acquire the ->lock of the next rcu_node up in the hierarchy, but only after releasing the lower level's lock. The acquisition of this lock clearly cannot occur prior to the acquisition of the leaf node's lock. 3. Step 2 repeats until we reach the root rcu_node structure. Please note again that only one lock is held at a time through this process. The acquisition of the root rcu_node's ->lock must occur after the release of that of the leaf rcu_node. 4. At this point, we set the ->completed field in the rcu_state structure in rcu_report_qs_rsp(). However, if the rcu_node hierarchy contains only one rcu_node, then in theory the code preceding the quiescent state could leak into the critical section. We therefore precede the update of ->completed with a memory barrier. All CPUs will therefore agree that any updates preceding any report of a quiescent state will have happened before the update of ->completed. 5. Regardless of whether a new grace period is needed, rcu_start_gp() will propagate the new value of ->completed to all of the leaf rcu_node structures, under the protection of each rcu_node's ->lock. If a new grace period is needed immediately, this propagation will occur in the same critical section that ->completed was set in, but courtesy of the memory barrier in #4 above, is still seen to follow any pre-quiescent-state activity. 6. When a given CPU invokes __rcu_process_gp_end(), it becomes aware of the end of the old grace period and therefore makes any RCU callbacks that were waiting on that grace period eligible for invocation. If this CPU is the same one that detected the end of the grace period, and if there is but a single rcu_node in the hierarchy, we will still be in the single critical section. In this case, the memory barrier in step #4 guarantees that all callbacks will be seen to execute after each CPU's quiescent state. 
On the other hand, if this is a different CPU, it will acquire the leaf rcu_node's ->lock, and will again be serialized after each CPU's quiescent state for the old grace period. On the strength of this proof, this commit therefore removes the memory barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp(). The effect is to reduce the number of memory barriers by one and to reduce the frequency of execution from about once per scheduling tick per CPU to once per grace period. This was reverted due to hangs found during testing by Yinghai Lu and Ingo Molnar. Frederic Weisbecker supplied Yinghai with tracing that located the underlying problem, and Frederic also provided the fix. The underlying problem was that the HARDIRQ_ENTER() macro from lib/locking-selftest.c invoked irq_enter(), which in turn invokes rcu_irq_enter(), but HARDIRQ_EXIT() invoked __irq_exit(), which does not invoke rcu_irq_exit(). This situation resulted in calls to rcu_irq_enter() that were not balanced by the required calls to rcu_irq_exit(). Therefore, after these locking selftests completed, RCU's dyntick-idle nesting count was a large number (for example, 72), which caused RCU to conclude that the affected CPU was not in dyntick-idle mode when in fact it was. RCU would therefore incorrectly wait for this dyntick-idle CPU, resulting in hangs. In contrast, with Frederic's patch, which replaces the irq_enter() in HARDIRQ_ENTER() with an __irq_enter(), these tests don't ever call either rcu_irq_enter() or rcu_irq_exit(), which works because the CPU running the test is already marked as not being in dyntick-idle mode. This means that the rcu_irq_enter() and rcu_irq_exit() calls remain balanced, and RCU then has no problem working out which CPUs are in dyntick-idle mode and which are not. The reason that the imbalance was not noticed before the barrier patch was applied is that the old implementation of rcu_enter_nohz() ignored the nesting depth. 
This could still result in delays, but much shorter ones. Whenever there was a delay, RCU would IPI the CPU with the unbalanced nesting level, which would eventually result in rcu_enter_nohz() being called, which in turn would force RCU to see that the CPU was in dyntick-idle mode. The reason that very few people noticed the problem is that the mismatched irq_enter() vs. __irq_exit() occurred only when the kernel was built with CONFIG_DEBUG_LOCKING_API_SELFTESTS. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-26 09:42:23 -07:00
Paul E. McKenney	80d02085d9	Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" This reverts commit `e59fb3120b`. This reversion was due to (extreme) boot-time slowdowns on SPARC seen by Yinghai Lu and on x86 by Ingo Molnar. This is a non-trivial reversion due to intervening commits. Conflicts: Documentation/RCU/trace.txt kernel/rcutree.c Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-05-19 23:25:29 +02:00
Paul E. McKenney	5ece5bab3e	rcu: Add forward-progress diagnostic for per-CPU kthreads Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s service loop, and add it to the rcudata trace output. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:57 -07:00
Paul E. McKenney	15ba0ba860	rcu: add grace-period age and more kthread state to tracing This commit adds the age in jiffies of the current grace period along with the duration in jiffies of the longest grace period since boot to the rcu/rcugp debugfs file. It also adds an additional "O" state to kthread tracing to differentiate between the kthread waiting due to having nothing to do on the one hand and waiting due to being on the wrong CPU on the other hand. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-05-05 23:16:56 -07:00
Paul E. McKenney	90e6ac3657	rcu: update tracing documentation for new rcutorture and rcuboost This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost trace files. The description has been updated as suggested by Josh Triplett. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-05-05 23:16:56 -07:00
Paul E. McKenney	0ac3d136b2	rcu: add callback-queue information to rcudata output This commit adds an indication of the state of the callback queue using a string of four characters following the "ql=" integer queue length. The first character is "N" if there are callbacks that have been queued that are not yet ready to be handled by the next grace period, or "." otherwise. The second character is "R" if there are callbacks queued that are ready to be handled by the next grace period, or "." otherwise. The third character is "W" if there are callbacks waiting for the current grace period, or "." otherwise. Finally, the fourth character is "D" if there are callbacks that have been handled by a prior grace period and are waiting to be invoked, or ".". Note that callbacks that are in the process of being invoked are not shown. These callbacks would have been removed from the rcu_data structure's list by rcu_do_batch() prior to being executed. (These callbacks are also not reflected in the "ql=" total, FWIW.) Also, document the new callback-queue trace information. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:56 -07:00
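The four-character callback-queue indicator described in the row above can be modeled with a small formatter. This is a sketch under stated assumptions: the struct `cb_queue_state`, its boolean fields, and `format_cb_flags()` are hypothetical stand-ins for whatever the kernel derives from the rcu_data callback list, and only the "N", "R", "W", "D" and "." characters come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of which callback-list segments are non-empty. */
struct cb_queue_state {
    bool not_ready;      /* queued, not yet ready for the next grace period */
    bool ready_next_gp;  /* ready to be handled by the next grace period */
    bool waiting_cur_gp; /* waiting for the current grace period */
    bool done;           /* handled by a prior grace period, awaiting invocation */
};

/* Write the four flag characters plus a terminating NUL into out. */
void format_cb_flags(const struct cb_queue_state *s, char out[5])
{
    assert(s != NULL && out != NULL);
    out[0] = s->not_ready ? 'N' : '.';
    out[1] = s->ready_next_gp ? 'R' : '.';
    out[2] = s->waiting_cur_gp ? 'W' : '.';
    out[3] = s->done ? 'D' : '.';
    out[4] = '\0';
}
```

For example, a queue holding only not-yet-ready callbacks and callbacks waiting on the current grace period would be rendered as "N.W." after the "ql=" length.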
Paul E. McKenney	2fa218d8bb	rcu: Update RCU's trace.txt documentation for new format The trace.txt file had obsolete output for the debugfs rcu/rcudata file, so update it. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:56 -07:00
Paul E. McKenney	12f5f524ca	rcu: merge TREE_PREEMPT_RCU blocked_tasks[] lists Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the rcu_node structure into a single ->blkd_tasks list with ->gp_tasks and ->exp_tasks tail pointers. This is in preparation for RCU priority boosting, which will add a third dimension to the combinatorial explosion in the ->blocked_tasks[] case, but simply a third pointer in the new ->blkd_tasks case. Also update documentation to reflect the blocked_tasks[] merge. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:54 -07:00
Paul E. McKenney	e59fb3120b	rcu: Decrease memory-barrier usage based on semi-formal proof Commit `d09b62d` fixed grace-period synchronization, but left some smp_mb() invocations in rcu_process_callbacks() that are no longer needed, but sheer paranoia prevented them from being removed. This commit removes them and provides a proof of correctness in their absence. It also adds a memory barrier to rcu_report_qs_rsp() immediately before the update to rsp->completed in order to handle the theoretical possibility that the compiler or CPU might move massive quantities of code into a lock-based critical section. This also proves that the sheer paranoia was not entirely unjustified, at least from a theoretical point of view. In addition, the old dyntick-idle synchronization depended on the fact that grace periods were many milliseconds in duration, so that it could be assumed that no dyntick-idle CPU could reorder a memory reference across an entire grace period. Unfortunately for this design, the addition of expedited grace periods breaks this assumption, which has the unfortunate side-effect of requiring atomic operations in the functions that track dyntick-idle state for RCU. (There is some hope that the algorithms used in user-level RCU might be applied here, but some work is required to handle the NMIs that user-space applications can happily ignore. For the short term, better safe than sorry.) This proof assumes that neither compiler nor CPU will allow a lock acquisition and release to be reordered, as doing so can result in deadlock. The proof is as follows: 1. A given CPU declares a quiescent state under the protection of its leaf rcu_node's lock. 2. If there is more than one level of rcu_node hierarchy, the last CPU to declare a quiescent state will also acquire the ->lock of the next rcu_node up in the hierarchy, but only after releasing the lower level's lock. The acquisition of this lock clearly cannot occur prior to the acquisition of the leaf node's lock. 3. 
Step 2 repeats until we reach the root rcu_node structure. Please note again that only one lock is held at a time through this process. The acquisition of the root rcu_node's ->lock must occur after the release of that of the leaf rcu_node. 4. At this point, we set the ->completed field in the rcu_state structure in rcu_report_qs_rsp(). However, if the rcu_node hierarchy contains only one rcu_node, then in theory the code preceding the quiescent state could leak into the critical section. We therefore precede the update of ->completed with a memory barrier. All CPUs will therefore agree that any updates preceding any report of a quiescent state will have happened before the update of ->completed. 5. Regardless of whether a new grace period is needed, rcu_start_gp() will propagate the new value of ->completed to all of the leaf rcu_node structures, under the protection of each rcu_node's ->lock. If a new grace period is needed immediately, this propagation will occur in the same critical section that ->completed was set in, but courtesy of the memory barrier in #4 above, is still seen to follow any pre-quiescent-state activity. 6. When a given CPU invokes __rcu_process_gp_end(), it becomes aware of the end of the old grace period and therefore makes any RCU callbacks that were waiting on that grace period eligible for invocation. If this CPU is the same one that detected the end of the grace period, and if there is but a single rcu_node in the hierarchy, we will still be in the single critical section. In this case, the memory barrier in step #4 guarantees that all callbacks will be seen to execute after each CPU's quiescent state. On the other hand, if this is a different CPU, it will acquire the leaf rcu_node's ->lock, and will again be serialized after each CPU's quiescent state for the old grace period. On the strength of this proof, this commit therefore removes the memory barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp(). 
The effect is to reduce the number of memory barriers by one and to reduce the frequency of execution from about once per scheduling tick per CPU to once per grace period. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:54 -07:00
Paul E. McKenney	a00e0d714f	rcu: Remove conditional compilation for RCU CPU stall warnings The RCU CPU stall warnings can now be controlled using the rcu_cpu_stall_suppress boot-time parameter or via the same parameter from sysfs. There is therefore no longer any reason to have kernel config parameters for this feature. This commit therefore removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE kernel config parameters. The RCU_CPU_STALL_TIMEOUT parameter remains to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter remains to allow task-stall information to be suppressed if desired. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:54 -07:00
Paul E. McKenney	fea651267e	rcu: add documentation saying which RCU flavor to choose Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-03-04 08:05:25 -08:00
Paul E. McKenney	2d999e03b7	rcu: update documentation/comments for Lai's adoption patch Lai's RCU-callback immediate-adoption patch changes the RCU tracing output, so update tracing.txt. Also update a few comments to clarify the synchronization design. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:59 -08:00
Paul E. McKenney	8e79e1f961	rcu: document TINY_RCU and TINY_PREEMPT_RCU tracing. Add the required verbiage to Documentation/RCU/trace.txt. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:56 -08:00
Paul E. McKenney	269dcc1c2e	rcu: Add tracing data to support queueing models The current tracing data is not sufficient to deduce the average time that a callback spends waiting for a grace period to end. Add three per-CPU counters recording the number of callbacks invoked (ci), the number of callbacks orphaned (co), and the number of callbacks adopted (ca). Given the existing callback queue length (ql), the average wait time in absence of CPU hotplug operations is ql/ci. The units of wait time will be in terms of the duration over which ci was measured. In the presence of CPU hotplug operations, there is room for argument, but ql/(ci-co+ca) won't steer you too far wrong. Also fixes a typo called out by Lucas De Marchi <lucas.de.marchi@gmail.com>. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-09-23 09:16:53 -07:00
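The wait-time arithmetic in the commit above can be sketched in a few lines of userspace C. This is only an illustration: the struct and function names are hypothetical, not kernel code, and only the field names ql, ci, co, and ca come from the commit message. A negative return value is an assumed convention for "no estimate available".

```c
/* Hypothetical container for the per-CPU counters named in the commit. */
struct rcu_cb_counts {
	unsigned long ql; /* callbacks currently queued */
	unsigned long ci; /* callbacks invoked over the measurement interval */
	unsigned long co; /* callbacks orphaned */
	unsigned long ca; /* callbacks adopted */
};

/*
 * Average wait with no CPU hotplug activity: ql / ci, expressed in
 * units of the interval over which ci was measured.
 * Returns -1.0 when ci is zero (assumed convention, not from the commit).
 */
static double avg_wait(const struct rcu_cb_counts *c)
{
	if (c->ci == 0)
		return -1.0;
	return (double)c->ql / (double)c->ci;
}

/*
 * Approximation in the presence of CPU hotplug: ql / (ci - co + ca).
 * Returns -1.0 when the denominator is not positive.
 */
static double avg_wait_hotplug(const struct rcu_cb_counts *c)
{
	double denom = (double)c->ci - (double)c->co + (double)c->ca;

	if (denom <= 0.0)
		return -1.0;
	return (double)c->ql / denom;
}
```

For example, with ql = 50 and ci = 100 the first estimate is half of the measurement interval.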
Paul E. McKenney	2c96c7751d	rcu: upgrade stallwarn.txt documentation for CPU-bound RT processes CPU-bound real-time processes can cause RCU CPU stall warnings, and much other trouble as well. Document the fact that they can cause RCU CPU stall warnings. Suggested-by: Darren Hart <dvhltc@us.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-23 16:34:02 -07:00
Paul E. McKenney	5cc6517abd	rcu: document ways of stalling updates in low-memory situations Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-20 09:00:14 -07:00
Paul E. McKenney	84483ea42c	rcu: add shiny new debug assists to Documentation/RCU/checklist.txt Add a section describing PROVE_RCU, DEBUG_OBJECTS_RCU_HEAD, and the __rcu sparse checking to the RCU checklist. Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2010-08-19 17:18:01 -07:00
Justin P. Mattock	0ea6e61122	Documentation: update broken web addresses. Below you will find an updated version of the original series, bunching all patches into one big patch that updates broken web addresses located in Documentation/*. Some of the addresses date as far back as 1995, so searching became a bit difficult; the best way to deal with these is to use web.archive.org to locate the outdated addresses. There are also some addresses pointing to .spec files: some were located, but some (after searching on the company's site) were still nowhere to be found. In those cases I just changed the address to the company site, so that users can contact the company, which can locate the files for them. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Thomas Weber <weber@corscience.de> Signed-off-by: Mike Frysinger <vapier.adi@gmail.com> Cc: Paulo Marques <pmarques@grupopie.com> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Michael Neuling <mikey@neuling.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-08-04 15:21:40 +02:00
Linus Torvalds	b8ae30ee26	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits) stop_machine: Move local variable closer to the usage site in cpu_stop_cpu_callback() sched, wait: Use wrapper functions sched: Remove a stale comment ondemand: Make the iowait-is-busy time a sysfs tunable ondemand: Solve a big performance issue by counting IOWAIT time as busy sched: Intoduce get_cpu_iowait_time_us() sched: Eliminate the ts->idle_lastupdate field sched: Fold updating of the last_update_time_info into update_ts_time_stats() sched: Update the idle statistics in get_cpu_idle_time_us() sched: Introduce a function to update the idle statistics sched: Add a comment to get_cpu_idle_time_us() cpu_stop: add dummy implementation for UP sched: Remove rq argument to the tracepoints rcu: need barrier() in UP synchronize_sched_expedited() sched: correctly place paranioa memory barriers in synchronize_sched_expedited() sched: kill paranoia check in synchronize_sched_expedited() sched: replace migration_thread with cpu_stop stop_machine: reimplement using cpu_stop cpu_stop: implement stop_cpu[s]() sched: Fix select_idle_sibling() logic in select_task_rq_fair() ...	2010-05-18 08:27:54 -07:00
Paul E. McKenney	f1d507beea	rcu: improve the RCU CPU-stall warning documentation The existing Documentation/RCU/stallwarn.txt has proven unhelpful, so rework it a bit. In particular, show how to interpret the stall-warning messages. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-05-10 11:08:35 -07:00
Paul E. McKenney	d21670acab	rcu: reduce the number of spurious RCU_SOFTIRQ invocations Lai Jiangshan noted that up to 10% of the RCU_SOFTIRQ are spurious, and traced this down to the fact that the current grace-period machinery will uselessly raise RCU_SOFTIRQ when a given CPU needs to go through a quiescent state, but has not yet done so. In this situation, there might well be nothing that RCU_SOFTIRQ can do, and the overhead can be worth worrying about in the ksoftirqd case. This patch therefore avoids raising RCU_SOFTIRQ in this situation. Changes since v1 (http://lkml.org/lkml/2010/3/30/122 from Lai Jiangshan): o Omit the rcu_qs_pending() prechecks, as they aren't that much less expensive than the quiescent-state checks. o Merge with the set_need_resched() patch that reduces IPIs. o Add the new n_rp_report_qs field to the rcu_pending tracing output. o Update the tracing documentation accordingly. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-05-10 11:08:35 -07:00
Ingo Molnar	e7858f52a5	Merge branch 'cpu_stop' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into sched/core	2010-05-08 18:11:19 +02:00
Tejun Heo	969c79215a	sched: replace migration_thread with cpu_stop Currently migration_thread is serving three purposes - migration pusher, context to execute active_load_balance() and forced context switcher for expedited RCU synchronize_sched. All three roles are hardcoded into migration_thread() and determining which job is scheduled is slightly messy. This patch kills migration_thread and replaces all three uses with cpu_stop. The three different roles of migration_thread() are split into three separate cpu_stop callbacks - migration_cpu_stop(), active_load_balance_cpu_stop() and synchronize_sched_expedited_cpu_stop() - and each use case now simply asks cpu_stop to execute the callback as necessary. synchronize_sched_expedited() was implemented with private preallocated resources and custom multi-cpu queueing and waiting logic, both of which are provided by cpu_stop. synchronize_sched_expedited_count is made atomic and all other shared resources along with the mutex are dropped. synchronize_sched_expedited() also implemented a check to detect cases where not all the callbacks got executed on their assigned cpus and fall back to synchronize_sched(). If called with cpu hotplug blocked, cpu_stop already guarantees that and the condition cannot happen; otherwise, stop_machine() would break. However, this patch preserves the paranoid check using a cpumask to record on which cpus the stopper ran so that it can serve as a bisection point if something actually goes wrong there. Because the internal execution state is no longer visible, rcu_expedited_torture_stats() is removed. This patch also renames cpu_stop threads from "stopper/%d" to "migration/%d". The names of these threads ultimately don't matter and there's no reason to make unnecessary userland visible changes. With this patch applied, stop_machine() and sched now share the same resources. stop_machine() is faster without wasting any resources and sched migration users are much cleaner. 
Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>	2010-05-06 18:49:21 +02:00
Paul E. McKenney	50aec0024e	rcu: Update docs for rcu_access_pointer and rcu_dereference_protected Update examples and lists of APIs to include these new primitives. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: eric.dumazet@gmail.com LKML-Reference: <1270852752-25278-3-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-14 12:20:12 +02:00
Paul E. McKenney	1bd22e374b	rcu: Use canonical URL for Mathieu's dissertation The version numbers change too quickly, so use a canonical URL that represents the most recent version. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-16-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 10:34:56 +01:00
Paul E. McKenney	998f2ac3fe	rcu: Fix citation of Mathieu's dissertation Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-14-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 10:34:54 +01:00
Paul E. McKenney	c598a070bc	rcu: Documentation update for CONFIG_PROVE_RCU Adds a lockdep.txt file and updates checklist.txt and whatisRCU.txt to reflect the new lockdep-enabled capabilities of RCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-13-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 10:34:53 +01:00
Paul E. McKenney	4c54005ca4	rcu: 1Q2010 update for RCU documentation Add expedited functions. Review documentation and update obsolete verbiage. Also fix the advice for the RCU CPU-stall kernel configuration parameter, and document RCU CPU-stall warnings. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <12635142581866-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-01-16 10:25:22 +01:00
Paul E. McKenney	64179861cb	rcu: Add synchronize_srcu_expedited() to the documentation Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: avi@redhat.com Cc: mtosatti@redhat.com LKML-Reference: <12565226354176-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-26 09:40:31 +01:00
Paul E. McKenney	0edf1a683e	rcu: Update trace.txt documentation for blocked-tasks lists Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: npiggin@suse.de Cc: jens.axboe@oracle.com LKML-Reference: <12555405592804-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 11:20:23 +02:00
Paul E. McKenney	bd58b43003	rcu: Update trace.txt documentation to reflect recent changes o Remove the CONFIG_PREEMPT_RCU documentation since this config option has now been removed. o Change the now-incorrect references to "rcu" labels to instead be "rcu_sched". o Add notes stating that CONFIG_TREE_PREEMPT_RCU kernels will have additional "rcu_preempt" output. o Note the new "oqlen" field in the rcuhier output (for RCU callbacks orphaned by an offlined CPU). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: npiggin@suse.de Cc: jens.axboe@oracle.com LKML-Reference: <1255540559799-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-10-15 11:20:23 +02:00
Paul E. McKenney	6b3ef48adf	rcu: Remove CONFIG_PREEMPT_RCU Now that CONFIG_TREE_PREEMPT_RCU is in place, there is no further need for CONFIG_PREEMPT_RCU. Remove it, along with whatever subtle bugs it may (or may not) contain. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <125097461396-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-23 10:32:40 +02:00
Paul E. McKenney	d6714c22b4	rcu: Renamings to increase RCU clarity Make RCU-sched, RCU-bh, and RCU-preempt be underlying implementations, with "RCU" defined in terms of one of the three. Update the outdated rcu_qsctr_inc() names, as these functions no longer increment anything. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: akpm@linux-foundation.org Cc: mathieu.desnoyers@polymtl.ca Cc: josht@linux.vnet.ibm.com Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org LKML-Reference: <12509746132696-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-23 10:32:37 +02:00
Ingo Molnar	fa08661af8	Merge commit 'v2.6.31-rc6' into core/rcu Merge reason: the branch was on pre-rc1 .30, update to latest. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-15 18:56:13 +02:00
Eric Dumazet	941297f443	netfilter: nf_conntrack: nf_conntrack_alloc() fixes When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating objects, since the slab allocator could hand out a freed object that is still used by lockless readers. In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next always being valid (i.e., containing a valid 'nulls' value or a valid pointer to the next object in the hash chain). kmem_cache_zalloc() sets up the object with NULL values, but a NULL value is not valid for ct->tuplehash[xxx].hnnode.next. The fix is to call kmem_cache_alloc() and do the zeroing ourselves. As spotted by Patrick, we also need to make sure lookup keys are committed to memory before setting refcount to 1, or a lockless reader could get a reference on the old version of the object. Its key re-check could then pass the barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-07-16 14:03:40 +02:00
Paul E. McKenney	240ebbf81f	rcu: Add synchronize_sched_expedited() rcutorture doc + updates This patch updates the rcutorture documentation to include updated output format. It also brings the RCU documentation up to date. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Cc: davem@davemloft.net Cc: dada1@cosmosbay.com Cc: zbr@ioremap.net Cc: jeff.chua.linux@gmail.com Cc: paulus@samba.org Cc: laijs@cn.fujitsu.com Cc: jengelh@medozas.de Cc: r000n@r000n.net Cc: benh@kernel.crashing.org Cc: mathieu.desnoyers@polymtl.ca LKML-Reference: <12459460983193-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-07-03 10:02:29 +02:00
Matt LaPlante	19f5946001	trivial: Miscellaneous documentation typo fixes Fix various typos in documentation txts. Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:47 +02:00
Paul E. McKenney	6fd9b3a40b	rcu: Update RCU tracing documentation for __rcu_pending This patch updates the RCU documentation to reflect the changes in tracing made in the previous patch in the set. Located-by: Anton Blanchard <anton@au1.ibm.com> Tested-by: Anton Blanchard <anton@au1.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: anton@samba.org Cc: akpm@linux-foundation.org Cc: dipankar@in.ibm.com Cc: manfred@colorfullife.com Cc: cl@linux-foundation.org Cc: josht@linux.vnet.ibm.com Cc: schamp@sgi.com Cc: niv@us.ibm.com Cc: dvhltc@us.ibm.com Cc: ego@in.ibm.com Cc: laijs@cn.fujitsu.com Cc: rostedt@goodmis.org Cc: peterz@infradead.org Cc: penberg@cs.helsinki.fi Cc: andi@firstfloor.org Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> LKML-Reference: <12396834792865-git-send-email-> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-04-14 11:33:43 +02:00
Jesper Dangaard Brouer	edd4070f5d	Doc: Fix spelling in RCU/rculist_nulls.txt. Trivial spelling fixes in RCU/rculist_nulls.txt. Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Tested-by: Jarek Poplawski <jarkao2@gmail.com;-> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-02 01:33:51 -07:00
Jesper Dangaard Brouer	3943ac5d99	Doc: Fix wrong API example usage of call_rcu(). At some point the API of call_rcu() changed from three parameters to two parameters; correct the documentation. One confusing thing in RCU/listRCU.txt, which is NOT fixed in this patch, is that no reason or explanation is given for using call_rcu() instead of the normal synchronize_rcu() call. Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-02 01:33:50 -07:00
Jesper Dangaard Brouer	9ba30d7444	Doc: Fix missing whitespaces in RCU documentation. Trivial fix while reading through the RCU docs. Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-04-02 01:33:50 -07:00
Paul E. McKenney	0612ea00a0	rcu: documentation 1Q09 update Update the RCU documentation to call out the need for callers of primitives like call_rcu() and synchronize_rcu() to prevent subsequent RCU readers from hazard. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-03-10 15:55:11 -07:00
Linus Torvalds	1df2d017fe	Merge branch 'docs-next' of git://git.lwn.net/linux-2.6 * 'docs-next' of git://git.lwn.net/linux-2.6: Fix a typo in the development process document. Document handling of bad memory Document RCU and unloadable modules	2009-01-08 15:52:13 -08:00
Linus Torvalds	5f34fe1cfc	Merge branch 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits) stacktrace: provide save_stack_trace_tsk() weak alias rcu: provide RCU options on non-preempt architectures too printk: fix discarding message when recursion_bug futex: clean up futex_(un)lock_pi fault handling "Tree RCU": scalable classic RCU implementation futex: rename field in futex_q to clarify single waiter semantics x86/swiotlb: add default swiotlb_arch_range_needs_mapping x86/swiotlb: add default phys<->bus conversion x86: unify pci iommu setup and allow swiotlb to compile for 32 bit x86: add swiotlb allocation functions swiotlb: consolidate swiotlb info message printing swiotlb: support bouncing of HighMem pages swiotlb: factor out copy to/from device swiotlb: add arch hook to force mapping swiotlb: allow architectures to override phys<->bus<->phys conversions swiotlb: add comment where we handle the overflow of a dma mask on 32 bit rcu: fix rcutorture behavior during reboot resources: skip sanity check of busy resources swiotlb: move some definitions to header swiotlb: allow architectures to override swiotlb pool allocation ... Fix up trivial conflicts in arch/x86/kernel/Makefile arch/x86/mm/init_32.c include/linux/hardirq.h as per Ingo's suggestions.	2008-12-30 16:10:19 -08:00
Paul E. McKenney	64db4cfff9	"Tree RCU": scalable classic RCU implementation This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU. This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome. Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation. Updates from v9 (http://lkml.org/lkml/2008/12/2/334): o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism: http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan. o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit. 
Updates from v8 (http://lkml.org/lkml/2008/11/15/139): o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs. o Apply Ingo's printk-standardization patch. o Substitute local variables for repeated accesses to global variables. o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it). o Apply checkpatch fixes. Updates from v7 (http://lkml.org/lkml/2008/10/10/291): o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-) o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positive, as suggested by Ingo Molnar. o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below. o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions. o Add more data to tracing. o Remove unused "rcu_barrier" field from rcu_data structure. o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting. o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs... Updates from v6 (http://lkml.org/lkml/2008/9/23/448): o Fix a number of checkpatch.pl complaints. o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code. o Fix several bugs in !CONFIG_SMP builds. o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured. 
o Run tests on numerous combinations of configurations parameters, which after the fixes above, now build and run correctly. Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line): o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option). o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed. o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes. o A number of optimizations and usability improvements: o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress. o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress. o Rearrange struct fields to improve struct layout. o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt. o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out. o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion. o Document tracing files and formats for both rcupreempt and rcutree. Updates from v4 for those missing v5 given its bad subject line: o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-) o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting. o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode. o Review comments have been applied (thank you all!!!). 
For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-) o Adjusted rcuclassic and rcupreempt to handle dynticks changes. Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion. See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas. This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. 
If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.) In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems. Some shortcomings: o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required. o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required. o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside. Credits: o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-) o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments. o Thomas Gleixner for much-needed help with some timer issues (see patches below). o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-12-18 21:56:04 +01:00
Paul E. McKenney	1c12757c56	Document RCU and unloadable modules Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>	2008-12-03 15:58:01 -07:00
Eric Dumazet	536533e69e	rcu: documents rculist_nulls Adds Documentation/RCU/rculist_nulls.txt file to describe how 'nulls' end-of-list can help in some RCU algos. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-11-16 19:41:14 -08:00
Lai Jiangshan	e8aed68614	doc/RCU: fix pseudocode in rcuref.txt atomic_inc_not_zero(v) return 0 if *v = 0. use spin_lock instead of write_lock for update lock. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-09-10 08:36:07 +02:00
Paul E. McKenney	34d7c2b38d	rcu: remove list_for_each_rcu() All of the in-tree uses of list_for_each_rcu() have been converted to list_for_each_entry_rcu(), so list_for_each_rcu() can now be removed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-08-15 17:03:06 +02:00
Paul E. McKenney	0729fbf3bc	rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers) This patch allows torturing RCU from irq handlers (timers, in this case). A new module parameter irqreader enables such additional torturing, and is enabled by default. Variants of RCU that do not tolerate readers being called from irq handlers (e.g., SRCU) ignore irqreader. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: josh@freedesktop.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: dino@in.ibm.com Cc: akpm@linux-foundation.org Cc: torvalds@linux-foundation.org Cc: vegard.nossum@gmail.com Cc: adobriyan@gmail.com Cc: oleg@tv-sign.ru Cc: bunk@kernel.org Cc: rjw@sisk.pl Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-26 09:24:33 +02:00
Paul E. McKenney	31a72bce0b	rcu: make rcutorture more vicious: reinstate boot-time testing This patch re-institutes the ability to build rcutorture directly into the Linux kernel. The reason that this capability was removed was that this could result in your kernel being pretty much useless, as rcutorture would be running starting from early boot. This problem has been avoided by (1) making rcutorture run only three seconds of every six by default, (2) adding a CONFIG_RCU_TORTURE_TEST_RUNNABLE that permits rcutorture to be quiesced at boot time, and (3) adding a sysctl in /proc named /proc/sys/kernel/rcutorture_runnable that permits rcutorture to be quiesced and unquiesced when built into the kernel. Please note that this /proc file is -not- available when rcutorture is built as a module. Please also note that to get the earlier take-no-prisoners behavior, you must use the boot command line to set rcutorture's "stutter" parameter to zero. The rcutorture quiescing mechanism is currently quite crude: loops in each rcutorture process that poll a global variable once per tick. Suggestions for improvement are welcome. The default action will be to reduce the polling rate to a few times per second. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Suggested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-19 11:22:15 +02:00
Paul E. McKenney	d120f65f3a	rcu: make rcutorture more vicious: add stutter feature This patch takes a step towards making rcutorture more brutal by allowing the test to be automatically periodically paused, with the default being to run the test for five seconds then pause for five seconds and repeat. This behavior can be controlled using a new "stutter" module parameter, so that "stutter=0" gives the old default behavior of running continuously. Starting and stopping rcutorture more heavily stresses RCU's interaction with the scheduler, as well as exercising more paths through the grace-period detection code. Note that the default to "shuffle_interval" has also been adjusted from 5 seconds to 3 seconds to provide varying overlap with the "stutter" interval. I am still unable to provoke the failures that Alexey has been seeing, even with this patch, but will be doing a few additional things to beef up rcutorture. Suggested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-06-18 15:06:56 +02:00
Paul E. McKenney	32300751b4	sched: 1Q08 RCU doc update, add call_rcu_sched() Long-delayed update to the RCU documentation, including adding the new call_rcu_sched() and rcu_barrier_sched() APIs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2008-05-19 10:01:37 +02:00
Harvey Harrison	b5606c2d44	remove final fastcall users fastcall always expands to empty, remove it. Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-13 16:21:18 -08:00
Paul E. McKenney	f85d6c7168	Preempt-RCU: update RCU Documentation. This patch updates the RCU documentation to reflect preemptible RCU as well as recent publications. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Gautham R Shenoy <ego@in.ibm.com> Reviewed-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-25 21:08:25 +01:00
Rob Landley	e54e54a94c	Add Documentation/RCU/00-Index Add Documentation/RCU/00-INDEX Signed-off-by: Rob Landley <rob@landley.net> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-17 08:43:07 -07:00
Paul E. McKenney	ef48bd2461	Document the fact that RCU callbacks can run in parallel Add an item to the RCU documentation checklist noting that RCU callbacks can run in parallel. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-16 09:05:47 -07:00
Josh Triplett	4b6c2cca6e	[PATCH] rcu: add sched torture type to rcutorture Implement torture testing for the "sched" variant of RCU, which uses preempt_disable, preempt_enable, and synchronize_sched. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:31 -07:00
Josh Triplett	11a147013e	[PATCH] rcu: add rcu_bh_sync torture type to rcutorture Use the newly-generic synchronous deferred free function to implement torture testing for rcu_bh using synchronize_rcu_bh rather than the asynchronous call_rcu_bh. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:31 -07:00
Josh Triplett	20d2e4283a	[PATCH] rcu: add rcu_sync torture type to rcutorture Use the newly-generic synchronous deferred free function to implement torture testing for RCU using synchronize_rcu rather than the asynchronous call_rcu. Signed-off-by: Josh Triplett <josh@freedesktop.org> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:31 -07:00
Josh Triplett	b772e1dd4b	[PATCH] RCU: add fake writers to rcutorture rcutorture currently has one writer and an arbitrary number of readers. To better exercise some of the code paths in RCU implementations, add fake writer threads which call the synchronize function for the RCU variant in a loop, with a delay between calls to arrange for different numbers of writers running in parallel. [bunk@stusta.de: cleanup] Acked-by: Paul McKenney <paulmck@us.ibm.com> Cc: Dipkanar Sarma <dipankar@in.ibm.com> Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:31 -07:00
Paul E. McKenney	b2896d2e75	[PATCH] srcu-3: add SRCU operations to rcutorture Adds SRCU operations to rcutorture and updates rcutorture documentation. Also increases the stress imposed by the rcutorture test. [bunk@stusta.de: make needlessly global code static] Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:30 -07:00
Paul E. McKenney	621934ee7e	[PATCH] srcu-3: RCU variant permitting read-side blocking Updated patch adding a variant of RCU that permits sleeping in read-side critical sections. SRCU is as follows: o Each use of SRCU creates its own srcu_struct, and each srcu_struct has its own set of grace periods. This is critical, as it prevents one subsystem with a blocking reader from holding up SRCU grace periods for other subsystems. o The SRCU primitives (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) all take a pointer to a srcu_struct. o The SRCU primitives must be called from process context. o srcu_read_lock() returns an int that must be passed to the matching srcu_read_unlock(). Realtime RCU avoids the need for this by storing the state in the task struct, but SRCU needs to allow a given code path to pass through multiple SRCU domains -- storing state in the task struct would therefore require either arbitrary space in the task struct or arbitrary limits on SRCU nesting. So I kicked the state-storage problem up to the caller. Of course, it is not permitted to call synchronize_srcu() while in an SRCU read-side critical section. o There is no call_srcu(). It would not be hard to implement one, but it seems like too easy a way to OOM the system. (Hey, we have enough trouble with call_rcu(), which does -not- permit readers to sleep!!!) So, if you want it, please tell me why... [josht@us.ibm.com: sparse notation] Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Josh Triplett <josh@freedesktop.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-10-04 07:55:30 -07:00
Paolo Ornati	670e9f34ee	Documentation: remove duplicated words Remove many duplicated words under Documentation/ and do other small cleanups. Examples: "and and" --> "and" "in in" --> "in" "the the" --> "the" "the the" --> "to the" ... Signed-off-by: Paolo Ornati <ornati@fastwebnet.it> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-10-03 22:57:56 +02:00
Matt LaPlante	53cb47268e	Fix typos in Documentation/: 'S' This patch fixes typos in various Documentation txts. The patch addresses some words starting with the letter 'S'. Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com> Acked-by: Alan Cox <alan@redhat.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-10-03 22:55:17 +02:00
Urs Thuermann	82a854ec4f	[PATCH] RCU Documentation fix Updater should use _rcu variant of list_del(). Signed-off-by: Urs Thuermann <urs@isnogud.escape.de> Acked-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-10 13:24:15 -07:00
Paul E. McKenney	72e9bb5492	[PATCH] rcutorture: add ops vector and Classic RCU ops Add an ops vector to rcutorture, and add the ops for Classic RCU. Update the rcutorture documentation to reflect slight change to the dmesg formats. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:40 -07:00
Paul E. McKenney	29766f1eb3	[PATCH] rcutorture: catchup doc fixes for idle-hz tests This just catches the RCU torture documentation up with the recent fixes that test RCU for architectures that turn off the scheduling-clock interrupt for idle CPUs and the addition of a SUCCESS/FAILURE indication, fixing up an obsolete comment as well. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 17:32:40 -07:00
Paul E. McKenney	165d6c78ee	[PATCH] RCU documentation: self-limiting updates and call_rcu() An update to the RCU documentation calling out the self-limiting-update-rate advantages of synchronize_rcu(), and describing how to use call_rcu() in a way that results in self-limiting updates. Self-limiting updates are important to avoiding RCU-induced OOM in face of denial-of-service attacks. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 10:01:17 -07:00
Paul E. McKenney	d83015b8f6	[PATCH] Make RCU API inaccessible to non-GPL Linux kernel modules Remove synchronize_kernel() (deprecated 2-APR-2005 in http://lkml.org/lkml/2005/4/3/11) and makes the RCU API inaccessible to non-GPL Linux kernel modules (as was announced more than one year ago in http://lkml.org/lkml/2005/4/3/8). Tested on x86 and ppc64. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:07 -07:00
KAMEZAWA Hiroyuki	3c30a75256	[PATCH] for_each_possible_cpu: documentaion Replace for_each_cpu with for_each_possible_cpu. Modifies occurences in documentaion. for_each_cpu in whatisRCU.txt should be for_each_online_cpu ??? (I'm not sure..) Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-28 09:16:05 -08:00
Baruch Even	de0dfcdf55	rcu: undeclared variable used in documentation The RCU documentation uses an fp variable which is not declared in the code snippets. Use the new_fp variable instead. Signed-Off-By: Baruch Even <baruch@ev-en.org> Acked-by: <paulmck@us.ibm.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-03-24 18:25:25 +01:00
Paul E. McKenney	d19720a909	[PATCH] RCU documentation fixes (January 2006 update) Updates to in-tree RCU documentation based on comments over the past few months. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-02-01 08:53:25 -08:00
Nick Piggin	095975da26	[PATCH] rcu file: use atomic primitives Use atomic_inc_not_zero for rcu files instead of special case rcuref. Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:48 -08:00
Paul E. McKenney	665a7583f3	[PATCH] Remove hlist_for_each_rcu() API, convert existing use to hlist_for_each_entry_rcu Remove the hlist_for_each_rcu() API, which is used only in one place, and is trivially converted to hlist_for_each_entry_rcu(), making the code shorter and more readable. Any out-of-tree uses may be similarly converted. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:53:35 -08:00
Paul E. McKenney	a241ec65ae	[PATCH] RCU torture-testing kernel module This patch is a rewrite of the one submitted on October 1st, using modules (http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2). This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an intense torture test of the RCU infrastructure. This is needed due to the continued changes to the RCU infrastructure to accommodate dynamic ticks, CPU hotplug, realtime, and so on. Most of the code is in a separate file that is compiled only if the CONFIG variable is set. Documentation on how to run the test and interpret the output is also included. This code has been tested on i386 and ppc64, and an earlier version of the code has received extensive testing on a number of architectures as part of the PREEMPT_RT patchset. Signed-off-by: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 17:37:27 -08:00
Paul E. McKenney	dd81eca83c	[PATCH] Yet another RCU documentation update Update RCU documentation based on discussions and review of RCU-based tree patches. Add an introductory whatisRCU.txt file. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-10 10:06:24 -07:00
Dipankar Sarma	c0dfb29051	[PATCH] files: rcuref APIs Adds a set of primitives to do reference counting for objects that are looked up without locks using RCU. Signed-off-by: Ravikiran Thirumalai <kiran_th@gmail.com> Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-09 13:57:54 -07:00
Paul E. McKenney	19306059cd	[PATCH] NMI: Update NMI users of RCU to use new API Uses of RCU for dynamically changeable NMI handlers need to use the new rcu_dereference() and rcu_assign_pointer() facilities. This change makes it clear that these uses are safe from a memory-barrier viewpoint, but the main purpose is to document exactly what operations are being protected by RCU. This has been tested on x86 and x86-64, which are the only architectures affected by this change. Signed-off-by: <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-09-07 16:57:19 -07:00
Paul E. McKenney	a83f1fe27f	[PATCH] Update RCU documentation Update the RCU documentation to allow for the new synchronize_rcu() and synchronize_sched() primitives. Fix a few other nits as well. Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-05-01 08:59:05 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

237 Commits