linux

Commit Graph

Author	SHA1	Message	Date
Rusty Russell	811d66a0e1	module: group post-relocation functions into post_relocation() This simply hoists more code out of load_module; we also put the identification of the extable and dynamic debug table in with the others in find_module_sections(). We move the taint check to the actual add/remove of the dynamic debug info: this is certain (find_module_sections is too early). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Yehuda Sadeh <yehuda@hq.newdream.net>	2010-08-05 12:59:13 +09:30
Rusty Russell	6526c534b2	module: move module args strndup_user to just before use Instead of copying and allocating the args and storing it in load_info, we can just allocate them right before we need them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:12 +09:30
Rusty Russell	49668688dd	module: pass load_info into other functions Pass the struct load_info into all the other functions in module loading. This neatens things and makes them more consistent. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:10 +09:30
Rusty Russell	36b0360d17	module: fix sysfs cleanup for !CONFIG_SYSFS Restore the stub module_remove_modinfo_attrs, remove the now-unused !CONFIG_SYSFS module_sysfs_init. Also, rename mod_kobject_remove() to mod_sysfs_teardown() as it is the logical counterpart to mod_sysfs_setup now. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:10 +09:30
Rusty Russell	8f6d037815	module: sysfs cleanup We change the sysfs functions to take struct load_info, and call them all in mod_sysfs_setup(). We also clean up the #ifdefs a little. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:09 +09:30
Rusty Russell	d913188c75	module: layout_and_allocate layout_and_allocate() does everything up to and including the final struct module placement inside the allocated module memory. We have to store the symbol layout information in our struct load_info though. This avoids the nasty code we had before where 'mod' pointed first to the version inside the temporary allocation containing the entire file, then later was moved to point to the real struct module: now the main code only ever sees the final module address. (Includes fix for the Tony Luck-found Linus-diagnosed failure path error). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:09 +09:30
Rusty Russell	511ca6ae43	module: fix crash in get_ksymbol() when oopsing in module init Andrew had the sole pleasure of tickling this bug in linux-next; when we set up "info->strtab" it's pointing into the temporary copy of the module. For most uses that is fine, but kallsyms keeps a pointer around during module load (inside mod->strtab). If we oops for some reason inside a module's init function, kallsyms will use the mod->strtab pointer into the now-freed temporary module copy. (Later oopses work fine: after init we overwrite mod->strtab to point to a compacted core-only strtab). Reported-by: Andrew "Grumpy" Morton <akpm@linux-foundation.org> Signed-off-by: Rusty "Buggy" Russell <rusty@rustcorp.com.au> Tested-by: Andrew "Happy" Morton <akpm@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:08 +09:30
Rusty Russell	eded41c1c6	module: kallsyms functions take struct load_info Simple refactor causes us to lift struct definition to top of file. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:08 +09:30
Rusty Russell	d6df72a06e	module: refactor out section header rewriting: FIX modversions We can't do the find_sec after removing the SHF_ALLOC flags; it won't find the sections. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:07 +09:30
Rusty Russell	8b5f61a795	module: refactor out section header rewriting Put all the "rewrite and check section headers" in one place. This adds another iteration over the sections, but it's far clearer. We iterate once for every find_section() so we already iterate over many times. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:06 +09:30
Linus Torvalds	3264d3f9dd	module: add load_info Btw, here's a patch that _looks_ large, but it really pretty trivial, and sets things up so that it would be way easier to split off pieces of the module loading. The reason it looks large is that it creates a "module_info" structure that contains all the module state that we're building up while loading, instead of having individual variables for all the indices etc. So the patch ends up being large, because every "symindex" access instead becomes "info.index.sym" etc. That may be a few characters longer, but it then means that we can just pass a pointer to that "info" structure around. and let all the pieces fill it in very naturally. As an example of that, the patch also moves the initialization of all those convenience variables into a "setup_module_info()" function. And at this point it really does become very natural to start to peel off some of the error labels and move them into the helper functions - now the "truncated" case is gone, and is handled inside that setup function instead. So maybe you don't like this approach, and it does make the variable accesses a bit longer, but I don't think unreadably so. And the patch really does look big and scary, but there really should be absolutely no semantic changes - most of it was a trivial and mindless rename. In fact, it was so mindless that I on purpose kept the existing helper functions looking like this: - err = check_modinfo(mod, sechdrs, infoindex, versindex); + err = check_modinfo(mod, info.sechdrs, info.index.info, info.index.vers); rather than changing them to just take the "info" pointer. IOW, a second phase (if you think the approach is ok) would change that calling convention to just do err = check_modinfo(mod, &info); (and same for "layout_sections()", "layout_symtabs()" etc.) Similarly, while right now it makes things _look_ bigger, with things like this: versindex = find_sec(hdr, sechdrs, secstrings, "__versions"); becoming info->index.vers = find_sec(info->hdr, info->sechdrs, info->secstrings, "__versions"); in the new "setup_module_info()" function, that's again just a result of it being a search-and-replace patch. By using the 'info' pointer, we could just change the 'find_sec()' interface so that it ends up being info->index.vers = find_sec(info, "__versions"); instead, and then we'd actually have a shorter and more readable line. So for a lot of those mindless variable name expansions there's would be room for separate cleanups. I didn't move quite everything in there - if we do this to layout_symtabs, for example, we'd want to move the percpu, symoffs, stroffs, *strmap variables to be fields in that module_info structure too. But that's a much smaller patch, I moved just the really core stuff that is currently being set up and used in various parts. But even in this rough form, it removes close to 70 lines from that function (but adds 22 lines overall, of course - the structure definition, the helper function declarations and call-sites etc etc). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:06 +09:30
Linus Torvalds	44032e6316	module: reduce stack usage for each_symbol() And now that I'm looking at that call-chain (to see if it would make sense to use some other more specific lock - doesn't look like it: all the readers are using RCU and this is the only writer), I also give you this trivial one-liner. It changes each_symbol() to not put that constant array on the stack, resulting in changing movq $C.388.31095, %rsi #, tmp85 subq $376, %rsp #, movq %rdi, %rbx # fn, fn leaq -208(%rbp), %rdi #, tmp84 movq %rbx, %rdx # fn, rep movsl xorl %esi, %esi # leaq -208(%rbp), %rdi #, tmp87 movq %r12, %rcx # data, call each_symbol_in_section.clone.0 # into xorl %esi, %esi # subq $216, %rsp #, movq %rdi, %rbx # fn, fn movq $arr.31078, %rdi #, call each_symbol_in_section.clone.0 # which is not so much about being obviously shorter and simpler because we don't unnecessarily copy that constant array around onto the stack, but also about having a much smaller stack footprint (376 vs 216 bytes - see the update of 'rsp'). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:06 +09:30
Rusty Russell	22e268ebec	module: refactor load_module part 5 1) Extract out the relocation loop into apply_relocations 2) Extract license and version checks into check_module_license_and_versions 3) Extract icache flushing into flush_module_icache 4) Move __obsparm warning into find_module_sections 5) Move license setting into check_modinfo. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:05 +09:30
Rusty Russell	9f85a4bbb1	module: refactor load_module part 4 Allocate references inside module_unload_init(), clean up inside module_unload_free(). This version fixed to do allocation before __this_cpu_write, thanks to bug reports from linux-next from Dave Young <hidave.darkstar@gmail.com> and Stephen Rothwell <sfr@canb.auug.org.au>. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:05 +09:30
Rusty Russell	40dd2560ec	module: refactor load_module part 3 Extract out the allocation and copying in from userspace, and the first set of modinfo checks. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:04 +09:30
Linus Torvalds	65b8a9b4d5	module: refactor load_module part 2 Here's a second one. It's slightly less trivial - since we now have error cases - and equally untested so it may well be totally broken. But it also cleans up a bit more, and avoids one of the goto targets, because the "move_module()" helper now does both allocations or none. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:03 +09:30
Linus Torvalds	f91a13bb99	module: refactor load_module I'd start from the trivial stuff. There's a fair amount of straight-line code that just makes the function hard to read just because you have to page up and down so far. Some of it is trivial to just create a helper function for. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-08-05 12:59:02 +09:30
Eric Dumazet	2409e74278	module: module_unload_init() cleanup No need to clear mod->refptr in module_unload_init(), since alloc_percpu() already clears allocated chunks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed unused var)	2010-08-05 12:59:02 +09:30
Jason Baron	b82bab4bbe	dynamic debug: move ddebug_remove_module() down into free_module() The command echo "file ec.c +p" >/sys/kernel/debug/dynamic_debug/control causes an oops. Move the call to ddebug_remove_module() down into free_module(). In this way it should be called from all error paths. Currently, we are missing the remove if the module init routine fails. Signed-off-by: Jason Baron <jbaron@redhat.com> Reported-by: Thomas Renninger <trenn@suse.de> Tested-by: Thomas Renninger <trenn@suse.de> Cc: <stable@kernel.org> [2.6.32+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-27 14:32:06 -07:00
Yehuda Sadeh	ff49d74ad3	module: initialize module dynamic debug later We should initialize the module dynamic debug datastructures only after determining that the module is not loaded yet. This fixes a bug that introduced in 2.6.35-rc2, where when a trying to load a module twice, we also load it's dynamic printing data twice which causes all sorts of nasty issues. Also handle the dynamic debug cleanup later on failure. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (removed a #ifdef) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-04 20:17:22 -07:00
Rusty Russell	9bea7f2395	module: fix bne2 "gave up waiting for init of module libcrc32c" Problem: it's hard to avoid an init routine stumbling over a request_module these days. And it's not clear it's always a bad idea: for example, a module like kvm with dynamic dependencies on kvm-intel or kvm-amd would be neater if it could simply request_module the right one. In this particular case, it's libcrc32c: libcrc32c_mod_init crypto_alloc_shash crypto_alloc_tfm crypto_find_alg crypto_alg_mod_lookup crypto_larval_lookup request_module If another module is waiting inside resolve_symbol() for libcrc32c to finish initializing (ie. bne2 depends on libcrc32c) then it does so holding the module lock, and our request_module() can't make progress until that is released. Waiting inside resolve_symbol() without the lock isn't all that hard: we just need to pass the -EBUSY up the call chain so we can sleep where we don't hold the lock. Error reporting is a bit trickier: we need to copy the name of the unfinished module before releasing the lock. Other notes: 1) This also fixes a theoretical issue where a weak dependency would allow symbol version mismatches to be ignored. 2) We rename use_module to ref_module to make life easier for the only external user (the out-of-tree ksplice patches). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Tim Abbot <tabbott@ksplice.com> Tested-by: Brandon Philips <bphilips@suse.de>	2010-06-05 11:17:37 +09:30
Rusty Russell	be593f4ce4	module: verify_export_symbols under the lock It disabled preempt so it was "safe", but nothing stops another module slipping in before this module is added to the global list now we don't hold the lock the whole time. So we check this just after we check for duplicate modules, and just before we put the module in the global list. (find_symbol finds symbols in coming and going modules, too). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-06-05 11:17:37 +09:30
Linus Torvalds	3bafeb6247	module: move find_module check to end I think Rusty may have made the lock a bit _too_ finegrained there, and didn't add it to some places that needed it. It looks, for example, like PATCH 1/2 actually drops the lock in places where it's needed ("find_module()" is documented to need it, but now load_module() didn't hold it at all when it did the find_module()). Rather than adding a new "module_loading" list, I think we should be able to just use the existing "modules" list, and just fix up the locking a bit. In fact, maybe we could just move the "look up existing module" a bit later - optimistically assuming that the module doesn't exist, and then just undoing the work if it turns out that we were wrong, just before adding ourselves to the list. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-06-05 11:17:37 +09:30
Rusty Russell	75676500f8	module: make locking more fine-grained. Kay Sievers <kay.sievers@vrfy.org> reports that we still have some contention over module loading which is slowing boot. Linus also disliked a previous "drop lock and regrab" patch to fix the bne2 "gave up waiting for init of module libcrc32c" message. This is more ambitious: we only grab the lock where we need it. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Brandon Philips <brandon@ifup.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>	2010-06-05 11:17:36 +09:30
Rusty Russell	6407ebb271	module: Make module sysfs functions private. These were placed in the header in `ef665c1a06` to get the various SYSFS/MODULE config combintations to compile. That may have been necessary then, but it's not now. These functions are all local to module.c. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Randy Dunlap <randy.dunlap@oracle.com>	2010-06-05 11:17:36 +09:30
Rusty Russell	80a3d1bb41	module: move sysfs exposure to end of load_module This means a little extra work, but is more logical: we don't put anything in sysfs until we're about to put the module into the global list an parse its parameters. This also gives us a logical place to put duplicate module detection in the next patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2010-06-05 11:17:36 +09:30
Rusty Russell	c8e21ced08	module: fix kdb's illicit use of struct module_use. Linus changed the structure, and luckily this didn't compile any more. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Martin Hicks <mort@sgi.com>	2010-06-05 11:17:36 +09:30
Linus Torvalds	2c02dfe7fe	module: Make the 'usage' lists be two-way When adding a module that depends on another one, we used to create a one-way list of "modules_which_use_me", so that module unloading could see who needs a module. It's actually quite simple to make that list go both ways: so that we not only can see "who uses me", but also see a list of modules that are "used by me". In fact, we always wanted that list in "module_unload_free()": when we unload a module, we want to also release all the other modules that are used by that module. But because we didn't have that list, we used to first iterate over all modules, and then iterate over each "used by me" list of that module. By making the list two-way, we simplify module_unload_free(), and it allows for some trivial fixes later too. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned & rebased)	2010-06-05 11:17:35 +09:30
Linus Torvalds	1f73897861	Merge branch 'for-35' of git://repo.or.cz/linux-kbuild * 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of `e8d400a` to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...	2010-06-01 08:55:52 -07:00
Rusty Russell	293a7cfeed	module: fix reference to mod->percpu after freeing module. Rafael sees a sometimes crash at precpu_modfree from kernel/module.c; it only occurred with another (since-reverted) patch, but that patch simply changed timing to uncover this bug, it was otherwise unrelated. The comment about the mod being freed is self-explanatory, but neither Tejun nor I read it. This bug was introduced in `259354deaa`, after it had previously been fixed in `6e2b75740b`. How embarrassing. Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Embarrassingly-Acked-by: Tejun Heo <tj@kernel.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-31 11:21:32 -07:00
Linus Torvalds	218ce73514	Revert "module: drop the lock while waiting for module to complete initialization." This reverts commit `480b02df3a`, since Rafael reports that it causes occasional kernel paging request faults in load_module(). Dropping the module lock and re-taking it deep in the call-chain is definitely not the right thing to do. That just turns the mutex from a lock into a "random non-locking data structure" that doesn't actually protect what it's supposed to protect. Requested-and-tested-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Brandon Philips <brandon@ifup.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 16:48:30 -07:00
Wenji Huang	7d52669b14	module: remove duplicate declaration of __ksymtab_gpl_future Minor cleanup on duplicate __{start/stop}__ksymtab_gpl_future. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:04 -07:00
Linus Torvalds	a8251096b4	Merge branch 'modules' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'modules' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: module: drop the lock while waiting for module to complete initialization. MODULE_DEVICE_TABLE(isapnp, ...) does nothing hisax_fcpcipnp: fix broken isapnp device table. isapnp: move definitions to mod_devicetable.h so file2alias can reach them.	2010-05-21 17:15:44 -07:00
Linus Torvalds	90b9a32d8f	Merge branch 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb * 'kdb-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb: (25 commits) kdb,debug_core: Allow the debug core to receive a panic notification MAINTAINERS: update kgdb, kdb, and debug_core info debug_core,kdb: Allow the debug core to process a recursive debug entry printk,kdb: capture printk() when in kdb shell kgdboc,kdb: Allow kdb to work on a non open console port kgdb: Add the ability to schedule a breakpoint via a tasklet mips,kgdb: kdb low level trap catch and stack trace powerpc,kgdb: Introduce low level trap catching x86,kgdb: Add low level debug hook kgdb: remove post_primary_code references kgdb,docs: Update the kgdb docs to include kdb kgdboc,keyboard: Keyboard driver for kdb with kgdb kgdb: gdb "monitor" -> kdb passthrough sparc,sunzilog: Add console polling support for sunzilog serial driver sh,sh-sci: Use NO_POLL_CHAR in the SCIF polled console code kgdb,8250,pl011: Return immediately from console poll kgdb: core changes to support kdb kdb: core for kgdb back end (2 of 2) kdb: core for kgdb back end (1 of 2) kgdb,blackfin: Add in kgdb_arch_set_pc for blackfin ...	2010-05-21 11:08:05 -07:00
Chris Wright	2c3c8bea60	sysfs: add struct file* to bin_attr callbacks This allows bin_attr->read,write,mmap callbacks to check file specific data (such as inode owner) as part of any privilege validation. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-05-21 09:37:31 -07:00
Jason Wessel	67fc4e0cb9	kdb: core for kgdb back end (2 of 2) This patch contains the hooks and instrumentation into kernel which live outside the kernel/debug directory, which the kdb core will call to run commands like lsmod, dmesg, bt etc... CC: linux-arch@vger.kernel.org Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Martin Hicks <mort@sgi.com>	2010-05-20 21:04:21 -05:00
Rusty Russell	480b02df3a	module: drop the lock while waiting for module to complete initialization. This fixes "gave up waiting for init of module libcrc32c." which happened at boot time due to multiple parallel module loads. The problem was a deadlock: we wait for a module to finish initializing, but we keep the module_lock mutex so it can't complete. In particular, this could reasonably happen if a module does a request_module() in its initialization routine. So we change use_module() to return an errno rather than a bool, and if it's -EBUSY we drop the lock and wait in the caller, then reaquire the lock. Reported-by: Brandon Philips <brandon@ifup.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Brandon Philips <brandon@ifup.org>	2010-05-19 17:33:39 +09:30
Linus Torvalds	b8ae30ee26	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (49 commits) stop_machine: Move local variable closer to the usage site in cpu_stop_cpu_callback() sched, wait: Use wrapper functions sched: Remove a stale comment ondemand: Make the iowait-is-busy time a sysfs tunable ondemand: Solve a big performance issue by counting IOWAIT time as busy sched: Intoduce get_cpu_iowait_time_us() sched: Eliminate the ts->idle_lastupdate field sched: Fold updating of the last_update_time_info into update_ts_time_stats() sched: Update the idle statistics in get_cpu_idle_time_us() sched: Introduce a function to update the idle statistics sched: Add a comment to get_cpu_idle_time_us() cpu_stop: add dummy implementation for UP sched: Remove rq argument to the tracepoints rcu: need barrier() in UP synchronize_sched_expedited() sched: correctly place paranioa memory barriers in synchronize_sched_expedited() sched: kill paranoia check in synchronize_sched_expedited() sched: replace migration_thread with cpu_stop stop_machine: reimplement using cpu_stop cpu_stop: implement stop_cpu[s]() sched: Fix select_idle_sibling() logic in select_task_rq_fair() ...	2010-05-18 08:27:54 -07:00
Tejun Heo	3fc1f1e27a	stop_machine: reimplement using cpu_stop Reimplement stop_machine using cpu_stop. As cpu stoppers are guaranteed to be available for all online cpus, stop_machine_create/destroy() are no longer necessary and removed. With resource management and synchronization handled by cpu_stop, the new implementation is much simpler. Asking the cpu_stop to execute the stop_cpu() state machine on all online cpus with cpu hotplug disabled is enough. stop_machine itself doesn't need to manage any global resources anymore, so all per-instance information is rolled into struct stop_machine_data and the mutex and all static data variables are removed. The previous implementation created and destroyed RT workqueues as necessary which made stop_machine() calls highly expensive on very large machines. According to Dimitri Sivanich, preventing the dynamic creation/destruction makes booting faster more than twice on very large machines. cpu_stop resources are preallocated for all online cpus and should have the same effect. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Dimitri Sivanich <sivanich@sgi.com>	2010-05-06 18:49:20 +02:00
Ingo Molnar	c1ab9cab75	Merge branch 'linus' into tracing/core Conflicts: include/linux/module.h kernel/module.c Semantic conflict: include/trace/events/module.h Merge reason: Resolve the conflict with upstream commit `5fbfb18` ("Fix up possibly racy module refcounting") Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-08 10:18:47 +02:00
Nick Piggin	5fbfb18d7a	Fix up possibly racy module refcounting Module refcounting is implemented with a per-cpu counter for speed. However there is a race when tallying the counter where a reference may be taken by one CPU and released by another. Reference count summation may then see the decrement without having seen the previous increment, leading to lower than expected count. A module which never has its actual reference drop below 1 may return a reference count of 0 due to this race. Module removal generally runs under stop_machine, which prevents this race causing bugs due to removal of in-use modules. However there are other real bugs in module.c code and driver code (module_refcount is exported) where the callers do not run under stop_machine. Fix this by maintaining running per-cpu counters for the number of module refcount increments and the number of refcount decrements. The increments are tallied after the decrements, so any decrement seen will always have its corresponding increment counted. The final refcount is the difference of the total increments and decrements, preventing a low-refcount from being returned. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-05 19:50:02 -07:00
Steven Rostedt	eb0c53771f	tracing: Fix compile error in module tracepoints when MODULE_UNLOAD not set If modules are configured in the build but unloading of modules is not, then the refcnt is not defined. Place the get/put module tracepoints under CONFIG_MODULE_UNLOAD since it references this field in the module structure. As a side-effect, this patch also reduces the code when MODULE_UNLOAD is not set, because these unused tracepoints are not created. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-03-31 22:56:59 -04:00
Li Zefan	ae832d1e03	tracing: Remove side effect from module tracepoints that caused a GPF Remove the @refcnt argument, because it has side-effects, and arguments with side-effects are not skipped by the jump over disabled instrumentation and are executed even when the tracepoint is disabled. This was also causing a GPF as found by Randy Dunlap: Subject: 2.6.33 GP fault only when built with tracing LKML-Reference: <4BA2B69D.3000309@oracle.com> Note, the current 2.6.34-rc has a fix for the actual cause of the GPF, but this fixes one of its triggers. Tested-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4BA97FA7.6040406@cn.fujitsu.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2010-03-31 22:56:58 -04:00
Tejun Heo	10fad5e46f	percpu, module: implement and use is_kernel/module_percpu_address() lockdep has custom code to check whether a pointer belongs to static percpu area which is somewhat broken. Implement proper is_kernel/module_percpu_address() and replace the custom code. On UP, percpu variables are regular static variables and can't be distinguished from them. Always return %false on UP. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>	2010-03-29 23:07:12 +09:00
Tejun Heo	259354deaa	module: encapsulate percpu handling better and record percpu_size Better encapsulate module static percpu area handling so that code outsidef of CONFIG_SMP ifdef doesn't deal with mod->percpu directly and add mod->percpu_size and record percpu_size in it. Both percpu fields are compiled out on UP. While at it, mark mod->percpu w/ __percpu. This is to prepare for is_module_percpu_address(). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au>	2010-03-29 23:07:12 +09:00
Eric W. Biederman	361795b1eb	sysfs: Use sysfs_attr_init and sysfs_bin_attr_init on module dynamic attributes A little more whack-a-mole annotating the dynamic sysfs attributes. I had everything built into my earlier test kernel, and so I missed these. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-03-07 17:04:51 -08:00
Denys Vlasenko	3d9a854c2d	Rename .data[.percpu][.XXX] to .data[..percpu][..XXX]. Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com> Signed-off-by: Michal Marek <mmarek@suse.cz>	2010-03-03 11:26:00 +01:00
Tejun Heo	ab386128f2	Merge branch 'master' into percpu	2010-02-02 14:38:15 +09:00
Ben Hutchings	10b465aaf9	modules: Skip empty sections when exporting section notes Commit `35dead4` "modules: don't export section names of empty sections via sysfs" changed the set of sections that have attributes, but did not change the iteration over these attributes in add_notes_attrs(). This can lead to add_notes_attrs() creating attributes with the wrong names or with null name pointers. Introduce a sect_empty() function and use it in both add_sect_attrs() and add_notes_attrs(). Reported-by: Martin Michlmayr <tbm@cyrius.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Martin Michlmayr <tbm@cyrius.com> Cc: stable@kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-06 01:11:29 -08:00
Christoph Lameter	e1783a240f	module: Use this_cpu_xx to dynamically allocate counters Use cpu ops to deal with the per cpu data instead of a local_t. Reduces memory requirements, cache footprint and decreases cycle counts. The this_cpu_xx operations are also used for !SMP mode. Otherwise we could not drop the use of __module_ref_addr() which would make per cpu data handling complicated. this_cpu_xx operations have their own fallback for !SMP. V8-V9: - Leave include asm/module.h since ringbuffer.c depends on it. Nothing else does though. Another patch will deal with that. - Remove spurious free. Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Tejun Heo <tj@kernel.org>	2010-01-05 15:34:50 +09:00
Linus Torvalds	dcc7cd0112	Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6 * 'kmemleak' of git://linux-arm.org/linux-2.6: kmemleak: fix kconfig for crc32 build error kmemleak: Reduce the false positives by checking for modified objects kmemleak: Show the age of an unreferenced object kmemleak: Release the object lock before calling put_object() kmemleak: Scan the _ftrace_events section in modules kmemleak: Simplify the kmemleak_scan_area() function prototype kmemleak: Do not use off-slab management with SLAB_NOLEAKTRACE	2009-12-17 16:00:19 -08:00
Rusty Russell	d4703aefdb	module: handle ppc64 relocating kcrctabs when CONFIG_RELOCATABLE=y powerpc applies relocations to the kcrctab. They're absolute symbols, but it's not completely unreasonable: other archs may too, but the relocation is often 0. http://lists.ozlabs.org/pipermail/linuxppc-dev/2009-November/077972.html Inspired-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Paul Mackerras <paulus@samba.org>	2009-12-15 16:28:34 +10:30
Linus Torvalds	d0316554d3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits) m68k: rename global variable vmalloc_end to m68k_vmalloc_end percpu: add missing per_cpu_ptr_to_phys() definition for UP percpu: Fix kdump failure if booted with percpu_alloc=page percpu: make misc percpu symbols unique percpu: make percpu symbols in ia64 unique percpu: make percpu symbols in powerpc unique percpu: make percpu symbols in x86 unique percpu: make percpu symbols in xen unique percpu: make percpu symbols in cpufreq unique percpu: make percpu symbols in oprofile unique percpu: make percpu symbols in tracer unique percpu: make percpu symbols under kernel/ and mm/ unique percpu: remove some sparse warnings percpu: make alloc_percpu() handle array types vmalloc: fix use of non-existent percpu variable in put_cpu_var() this_cpu: Use this_cpu_xx in trace_functions_graph.c this_cpu: Use this_cpu_xx for ftrace this_cpu: Use this_cpu_xx in nmi handling this_cpu: Use this_cpu operations in RCU this_cpu: Use this_cpu ops for VM statistics ... Fix up trivial (famous last words) global per-cpu naming conflicts in arch/x86/kvm/svm.c mm/slab.c	2009-12-14 09:58:24 -08:00
Helge Deller	35dead4235	modules: don't export section names of empty sections via sysfs On the parisc architecture we face for each and every loaded kernel module this kernel "badness warning": sysfs: cannot create duplicate filename '/module/ac97_bus/sections/.text' Badness at fs/sysfs/dir.c:487 Reason for that is, that on parisc all kernel modules do have multiple .text sections due to the usage of the -ffunction-sections compiler flag which is needed to reach all jump targets on this platform. An objdump on such a kernel module gives: Sections: Idx Name Size VMA LMA File off Algn 0 .note.gnu.build-id 00000024 00000000 00000000 00000034 22 CONTENTS, ALLOC, LOAD, READONLY, DATA 1 .text 00000000 00000000 00000000 00000058 20 CONTENTS, ALLOC, LOAD, READONLY, CODE 2 .text.ac97_bus_match 0000001c 00000000 00000000 00000058 22 CONTENTS, ALLOC, LOAD, READONLY, CODE 3 .text 00000000 00000000 00000000 000000d4 20 CONTENTS, ALLOC, LOAD, READONLY, CODE ... Since the .text sections are empty (size of 0 bytes) and won't be loaded by the kernel module loader anyway, I don't see a reason why such sections need to be listed under /sys/module/<module_name>/sections/<section_name> either. The attached patch does solve this issue by not exporting section names which are empty. This fixes bugzilla http://bugzilla.kernel.org/show_bug.cgi?id=14703 Signed-off-by: Helge Deller <deller@gmx.de> CC: rusty@rustcorp.com.au CC: akpm@linux-foundation.org CC: James.Bottomley@HansenPartnership.com CC: roland@redhat.com CC: dave@hiauly1.hia.nrc.ca Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-02 15:38:25 -08:00
Catalin Marinas	a6f5aa1ea0	kmemleak: Scan the _ftrace_events section in modules This section contains pointers to allocated objects and not scanning it leads to false positives. Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2009-10-28 17:07:54 +00:00
Catalin Marinas	c017b4be3e	kmemleak: Simplify the kmemleak_scan_area() function prototype This function was taking non-necessary arguments which can be determined by kmemleak. The patch also modifies the calling sites. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au>	2009-10-28 15:11:00 +00:00
Tejun Heo	23fb064bb9	percpu: kill legacy percpu allocator With ia64 converted, there's no arch left which still uses legacy percpu allocator. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org> Delightedly-acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Lameter <cl@linux-foundation.org>	2009-10-02 13:29:29 +09:00
Paul Mundt	3ae91c21dd	module: fix up CONFIG_KALLSYMS=n build. Starting from commit `4a4962263f` "reduce symbol table for loaded modules (v2)", the kernel/module.c build is broken with CONFIG_KALLSYMS disabled. CC kernel/module.o kernel/module.c:1995: warning: type defaults to 'int' in declaration of 'Elf_Hdr' kernel/module.c:1995: error: expected ';', ',' or ')' before '' token kernel/module.c: In function 'load_module': kernel/module.c:2203: error: 'strmap' undeclared (first use in this function) kernel/module.c:2203: error: (Each undeclared identifier is reported only once kernel/module.c:2203: error: for each function it appears in.) kernel/module.c:2239: error: 'symoffs' undeclared (first use in this function) kernel/module.c:2239: error: implicit declaration of function 'layout_symtab' kernel/module.c:2240: error: 'stroffs' undeclared (first use in this function) make[1]: [kernel/module.o] Error 1 make: * [kernel/module.o] Error 2 There are three different issues: - layout_symtab() takes a const Elf_Ehdr - layout_symtab() needs to return a value - symoffs/stroffs/strmap are referenced by the load_module() code despite being ifdefed out, which seems unnecessary given the noop behaviour of layout_symtab()/add_kallsyms() in the case of CONFIG_KALLSYMS=n. Signed-off-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-01 16:11:11 -07:00
Linus Torvalds	4187e7e9f1	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: modules, tracing: Remove stale struct marker signature from module_layout() tracing/workqueue: Use %pf in workqueue trace events tracing: Fix a comment and a trivial format issue in tracepoint.h tracing: Fix failure path in ftrace_regex_open() tracing: Fix failure path in ftrace_graph_write() tracing: Check the return value of trace_get_user() tracing: Fix off-by-one in trace_get_user()	2009-09-26 10:13:54 -07:00
Rusty Russell	ffa9f12a41	module: don't call percpu_modfree on NULL pointer. The general one handles NULL, the static obsolescent (CONFIG_HAVE_LEGACY_PER_CPU_AREA) one in module.c doesn't; Eric's commit `720eba31` assumed it did, and various frobbings since then kept that assumption. All other callers in module.c all protect it with an if; this effectively does the same as free_init is only goto if we fail percpu_modalloc(). Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Américo Wang <xiyou.wangcong@gmail.com> Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>	2009-09-25 00:32:59 +09:30
Rusty Russell	a263f7763c	module: fix memory leak when load fails after srcversion/version allocated Normally the twisty paths of sysfs will free the attributes, but not if we fail before we hook it into sysfs (which is the last thing we do in load_module). (This sysfs code is a turd, no doubt there are other issues lurking too). Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>	2009-09-25 00:32:59 +09:30
Jan Beulich	554bdfe5ac	module: reduce string table for loaded modules (v2) Also remove all parts of the string table (referenced by the symbol table) that are not needed for kallsyms use (i.e. which were only referenced by symbols discarded by the previous patch, or not referenced at all for whatever reason). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-25 00:32:57 +09:30
Jan Beulich	4a4962263f	module: reduce symbol table for loaded modules (v2) Discard all symbols not interesting for kallsyms use: absolute, section, and in the common case (!KALLSYMS_ALL) data ones. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-25 00:32:57 +09:30
Ingo Molnar	115e8a2882	modules, tracing: Remove stale struct marker signature from module_layout() Linus reported this new build warning: kernel/module.c:2951: warning: ?struct marker? declared inside parameter list kernel/module.c:2951: warning: its scope is only this definition or declaration, which is probably not what you want Caused by: fc53776: tracing: Remove markers module_layout() is an artificial symbol with 'significant' symbols listed in its argument list so that it gets a proper argument types signature that modversions can pick up to decide whether a module is version-compatible or not. If these dont match then we wont even look at a module. Remove the stale marker symbol. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <alpine.LFD.2.01.0909210908020.4950@localhost.localdomain> Cc: Christoph Hellwig <hch@lst.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-23 10:34:21 +02:00
Bernd Schmidt	eb8cdec4a9	nommu: add support for Memory Protection Units (MPU) Some architectures (like the Blackfin arch) implement some of the "simpler" features that one would expect out of a MMU such as memory protection. In our case, we actually get read/write/exec protection down to the page boundary so processes can't stomp on each other let alone the kernel. There is a performance decrease (which depends greatly on the workload) however as the hardware/software interaction was not optimized at design time. Signed-off-by: Bernd Schmidt <bernds_cb1@t-online.de> Signed-off-by: Bryan Wu <cooloney@kernel.org> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-22 07:17:43 -07:00
Christoph Hellwig	fc5377668c	tracing: Remove markers Now that the last users of markers have migrated to the event tracer we can kill off the (now orphan) support code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090917173527.GA1699@lst.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-18 21:22:08 +02:00
Linus Torvalds	ada3fa1505	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c	2009-09-15 09:39:44 -07:00
Ingo Molnar	ed011b22ce	Merge commit 'v2.6.31-rc9' into tracing/core Merge reason: move from -rc5 to -rc9. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-06 06:11:42 +02:00
Ingo Molnar	ea6bff3685	modules: Fix build error in the !CONFIG_KALLSYMS case > James Bottomley (1): > module: workaround duplicate section names -tip testing found that this patch breaks the build on x86 if CONFIG_KALLSYMS is disabled: kernel/module.c: In function ‘load_module’: kernel/module.c:2367: error: ‘struct module’ has no member named ‘sect_attrs’ distcc[8269] ERROR: compile kernel/module.c on ph/32 failed make[1]: * [kernel/module.o] Error 1 make: * [kernel] Error 2 make: *** Waiting for unfinished jobs.... Commit `1b364bf` misses the fact that section attributes are only built and dealt with if kallsyms is enabled. The patch below fixes this. ( note, technically speaking this should depend on CONFIG_SYSFS as well but this patch is correct too and keeps the #ifdef less intrusive - in the KALLSYMS && !SYSFS case the code is a NOP. ) Signed-off-by: Ingo Molnar <mingo@elte.hu> [ Replaced patch with a slightly cleaner variation by James Bottomley ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-28 19:35:00 -10:00
James Bottomley	1b364bf438	module: workaround duplicate section names The root cause is a duplicate section name (.text); is this legal? [ Amerigo Wang: "AFAIK, yes." ] However, there's a problem with commit `6d76013381` in that if you fail to allocate a mod->sect_attrs (in this case it's null because of the duplication), it still gets used without checking in add_notes_attrs() This should fix it [ This patch leaves other problems, particularly the sections directory, but recent parisc toolchains seem to produce these modules and this prevents a crash and is a minimal change -- RR ] Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-27 12:33:19 -07:00
Rusty Russell	7d1d16e416	module: fix BUG_ON() for powerpc (and other function descriptor archs) The rarely-used symbol_put_addr() needs to use dereference_function_descriptor on powerpc. Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-27 12:33:19 -07:00
Li Zefan	7ead8b8313	tracing/events: Add module tracepoints Add trace points to trace module_load, module_free, module_get, module_put and module_request, and use trace_event facility to get the trace output. Here's the sample output: TASK-PID CPU# TIMESTAMP FUNCTION \| \| \| \| \| <...>-42 [000] 1.758380: module_request: fb0 wait=1 call_site=fb_open ... <...>-60 [000] 3.269403: module_load: scsi_wait_scan <...>-60 [000] 3.269432: module_put: scsi_wait_scan call_site=sys_init_module refcnt=0 <...>-61 [001] 3.273168: module_free: scsi_wait_scan ... <...>-1021 [000] 13.836081: module_load: sunrpc <...>-1021 [000] 13.840589: module_put: sunrpc call_site=sys_init_module refcnt=-1 <...>-1027 [000] 13.848098: module_get: sunrpc call_site=try_module_get refcnt=0 <...>-1027 [000] 13.848308: module_get: sunrpc call_site=get_filesystem refcnt=1 <...>-1027 [000] 13.848692: module_put: sunrpc call_site=put_filesystem refcnt=0 ... modprobe-2587 [001] 1088.437213: module_load: trace_events_sample F modprobe-2587 [001] 1088.437786: module_put: trace_events_sample call_site=sys_init_module refcnt=0 Note: - the taints flag can be 'F', 'C' and/or 'P' if mod->taints != 0 - the module refcnt is percpu, so it can be negative in a specific cpu Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> LKML-Reference: <4A891B3C.5030608@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-17 11:25:08 +02:00
Tejun Heo	384be2b18a	Merge branch 'percpu-for-linus' into percpu-for-next Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-08-14 14:45:31 +09:00
Mike Frysinger	6560dc160f	module: use MODULE_SYMBOL_PREFIX with module_layout The check_modstruct_version() needs to look up the symbol "module_layout" in the kernel, but it does so literally and not by a C identifier. The trouble is that it does not include a symbol prefix for those ports that need it (like the Blackfin and H8300 port). So make sure we tack on the MODULE_SYMBOL_PREFIX define to the front of it. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-27 12:15:45 -07:00
Joe Perches	ad361c9884	Remove multiple KERN_ prefixes from printk formats Commit `5fd29d6ccb` ("printk: clean up handling of log-levels and newlines") changed printk semantics. printk lines with multiple KERN_<level> prefixes are no longer emitted as before the patch. <level> is now included in the output on each additional use. Remove all uses of multiple KERN_<level>s in formats. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-08 10:30:03 -07:00
Tejun Heo	e74e396204	percpu: use dynamic percpu allocator as the default percpu allocator This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use dynamic percpu allocator. The first chunk is allocated using embedding helper and 8k is reserved for modules. This ensures that the new allocator behaves almost identically to the original allocator as long as static percpu variables are concerned, so it shouldn't introduce much breakage. s390 and alpha use custom SHIFT_PERCPU_PTR() to work around addressing range limit the addressing model imposes. Unfortunately, this breaks if the address is specified using a variable, so for now, the two archs aren't converted. The following architectures are affected by this change. * sh * arm * cris * mips * sparc(32) * blackfin * avr32 * parisc (broken, under investigation) * m32r * powerpc(32) As this change makes the dynamic allocator the default one, CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its invert - CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted archs. These archs implement their own setup_per_cpu_areas() and the conversion is not trivial. * powerpc(64) * sparc(64) * ia64 * alpha * s390 Boot and batch alloc/free tests on x86_32 with debug code (x86_32 doesn't use default first chunk initialization). Compile tested on sparc(32), powerpc(32), arm and alpha. Kyle McMartin reported that this change breaks parisc. The problem is still under investigation and he is okay with pushing this patch forward and fixing parisc later. [ Impact: use dynamic allocator for most archs w/o custom percpu setup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mikael Starvik <starvik@axis.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bryan Wu <cooloney@kernel.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Grant Grundler <grundler@parisc-linux.org> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu>	2009-06-24 15:13:35 +09:00
Peter Oberparleiter	b99b87f70c	kernel: constructor support Call constructors (gcc-generated initcall-like functions) during kernel start and module load. Constructors are e.g. used for gcov data initialization. Disable constructor support for usermode Linux to prevent conflicts with host glibc. Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Huang Ying <ying.huang@intel.com> Cc: Li Wei <W.Li@Sun.COM> Cc: Michael Ellerman <michaele@au1.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Heiko Carstens <heicars2@linux.vnet.ibm.com> Cc: Martin Schwidefsky <mschwid2@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-18 13:03:57 -07:00
Linus Torvalds	b231125af7	printk: add KERN_DEFAULT loglevel to print_modules() Several WARN_ON() messages omit the '\n' at the end of the string, which is a simple (and understandable) error. The next line printed after that warning line is usually the current module list, and that printk does not have a log-level marker - resulting in one long mixed-up line. Adding this loglevel marker will now avoid this unreadable mess. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-16 11:07:14 -07:00
Rusty Russell	ad6561dffa	module: trim exception table on init free. It's theoretically possible that there are exception table entries which point into the (freed) init text of modules. These could cause future problems if other modules get loaded into that memory and cause an exception as we'd see the wrong fixup. The only case I know of is kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n). Amerigo fixed this long-standing FIXME in the x86 version, but this patch is more general. This implements trim_init_extable(); most archs are simple since they use the standard lib/extable.c sort code. Alpha and IA64 use relative addresses in their fixups, so thier trimming is a slight variation. Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE, yet it defines its own sort_extable() which overrides the one in lib. It doesn't sort, so we have to mark deleted entries instead of actually trimming them. Inspired-by: Amerigo Wang <amwang@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: linux-alpha@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-ia64@vger.kernel.org	2009-06-12 21:47:04 +09:30
Linus Torvalds	512626a04e	Merge branch 'for-linus' of git://linux-arm.org/linux-2.6 * 'for-linus' of git://linux-arm.org/linux-2.6: kmemleak: Add the corresponding MAINTAINERS entry kmemleak: Simple testing module for kmemleak kmemleak: Enable the building of the memory leak detector kmemleak: Remove some of the kmemleak false positives kmemleak: Add modules support kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash kmemleak: Add the vmalloc memory allocation/freeing hooks kmemleak: Add the slub memory allocation/freeing hooks kmemleak: Add the slob memory allocation/freeing hooks kmemleak: Add the slab memory allocation/freeing hooks kmemleak: Add documentation on the memory leak detector kmemleak: Add the base support Manual conflict resolution (with the slab/earlyboot changes) in: drivers/char/vt.c init/main.c mm/slab.c	2009-06-11 14:15:57 -07:00
Linus Torvalds	3296ca27f5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (44 commits) nommu: Provide mmap_min_addr definition. TOMOYO: Add description of lists and structures. TOMOYO: Remove unused field. integrity: ima audit dentry_open failure TOMOYO: Remove unused parameter. security: use mmap_min_addr indepedently of security models TOMOYO: Simplify policy reader. TOMOYO: Remove redundant markers. SELinux: define audit permissions for audit tree netlink messages TOMOYO: Remove unused mutex. tomoyo: avoid get+put of task_struct smack: Remove redundant initialization. integrity: nfsd imbalance bug fix rootplug: Remove redundant initialization. smack: do not beyond ARRAY_SIZE of data integrity: move ima_counts_get integrity: path_check update IMA: Add __init notation to ima functions IMA: Minimal IMA policy and boot param for TCB IMA policy selinux: remove obsolete read buffer limit from sel_read_bool ...	2009-06-11 10:01:41 -07:00
Catalin Marinas	4f2294b6dc	kmemleak: Add modules support This patch handles the kmemleak operations needed for modules loading so that memory allocations from inside a module are properly tracked. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2009-06-11 17:03:31 +01:00
James Morris	d254117099	Merge branch 'master' into next	2009-05-08 17:56:47 +10:00
Steven Rostedt	93eb677d74	ftrace: use module notifier for function tracer The hooks in the module code for the function tracer must be called before any of that module code runs. The function tracer hooks modify the module (replacing calls to mcount to nops). If the code is executed while the change occurs, then the CPU can take a GPF. To handle the above with a bit of paranoia, I originally implemented the hooks as calls directly from the module code. After examining the notifier calls, it looks as though the start up notify is called before any of the module's code is executed. This makes the use of the notify safe with ftrace. Only the startup notify is required to be "safe". The shutdown simply removes the entries from the ftrace function list, and does not modify any code. This change has another benefit. It removes a issue with a reverse dependency in the mutexes of ftrace_lock and module_mutex. [ Impact: fix lock dependency bug, cleanup ] Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-04-17 16:59:15 +02:00
Stephen Rothwell	19e4529ee7	modules: Fix up build when CONFIG_MODULE_UNLOAD=n. Commit `3d43321b70` ("modules: sysctl to block module loading") introduces a modules_disabled variable that is only defined if CONFIG_MODULE_UNLOAD is enabled, despite being used in other places. This moves it up and fixes up the build. CC kernel/module.o kernel/module.c: In function 'sys_init_module': kernel/module.c:2401: error: 'modules_disabled' undeclared (first use in this function) kernel/module.c:2401: error: (Each undeclared identifier is reported only once kernel/module.c:2401: error: for each function it appears in.) make[1]: * [kernel/module.o] Error 1 make: * [kernel/module.o] Error 2 Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: James Morris <jmorris@namei.org>	2009-04-15 08:17:31 +10:00
Steven Rostedt	6d723736e4	tracing/events: add support for modules to TRACE_EVENT Impact: allow modules to add TRACE_EVENTS on load This patch adds the final hooks to allow modules to use the TRACE_EVENT macro. A notifier and a data structure are used to link the TRACE_EVENTs defined in the module to connect them with the ftrace event tracing system. It also adds the necessary automated clean ups to the trace events when a module is removed. Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-04-14 12:58:03 -04:00
Linus Torvalds	d6de2c80e9	async: Fix module loading async-work regression Several drivers use asynchronous work to do device discovery, and we synchronize with them in the compiled-in case before we actually try to mount root filesystems etc. However, when compiled as modules, that synchronization is missing - the module loading completes, but the driver hasn't actually finished probing for devices, and that means that any user mode that expects to use the devices after the 'insmod' is now potentially broken. We already saw one case of a similar issue in the ACPI battery code, where the kernel itself expected the module to be all done, and unmapped the init memory - but the async device discovery was still running. That got hacked around by just removing the "__init" (see commit `5d38258ec0` "ACPI battery: fix async boot oops"), but the real fix is to just make the module loading wait for all async work to be completed. It will slow down module loading, but since common devices should be built in anyway, and since the bug is really annoying and hard to handle from user space (and caused several S3 resume regressions), the simple fix to wait is the right one. This fixes at least http://bugzilla.kernel.org/show_bug.cgi?id=13063 but probably a few other bugzilla entries too (12936, for example), and is confirmed to fix Rafael's storage driver breakage after resume bug report (no bugzilla entry). We should also be able to now revert that ACPI battery fix. Reported-and-tested-by: Rafael J. Wysocki <rjw@suse.com> Tested-by: Heinz Diehl <htd@fancy-poultry.org> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-04-11 12:44:49 -07:00
Rusty Russell	2e45e77787	Revert "module: remove the SHF_ALLOC flag on the __versions section." This reverts commit `9cb610d8e3`. This was an impressively stupid patch. Firstly, we reset the SHF_ALLOC flag lower down in the same function, so the patch was useless. Even better, find_sec() ignores sections with SHF_ALLOC not set, so it breaks CONFIG_MODVERSIONS=y with CONFIG_MODULE_FORCE_LOAD=n, which refuses to load the module since it can't find the __versions section. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-04-07 17:12:43 +09:30
Linus Torvalds	714f83d5d9	Merge branch 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits) tracing, net: fix net tree and tracing tree merge interaction tracing, powerpc: fix powerpc tree and tracing tree interaction ring-buffer: do not remove reader page from list on ring buffer free function-graph: allow unregistering twice trace: make argument 'mem' of trace_seq_putmem() const tracing: add missing 'extern' keywords to trace_output.h tracing: provide trace_seq_reserve() blktrace: print out BLK_TN_MESSAGE properly blktrace: extract duplidate code blktrace: fix memory leak when freeing struct blk_io_trace blktrace: fix blk_probes_ref chaos blktrace: make classic output more classic blktrace: fix off-by-one bug blktrace: fix the original blktrace blktrace: fix a race when creating blk_tree_root in debugfs blktrace: fix timestamp in binary output tracing, Text Edit Lock: cleanup tracing: filter fix for TRACE_EVENT_FORMAT events ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release() x86: kretprobe-booster interrupt emulation code fix ... Fix up trivial conflicts in arch/parisc/include/asm/ftrace.h include/linux/memory.h kernel/extable.c kernel/module.c	2009-04-05 11:04:19 -07:00
Kees Cook	3d43321b70	modules: sysctl to block module loading Implement a sysctl file that disables module-loading system-wide since there is no longer a viable way to remove CAP_SYS_MODULE after the system bounding capability set was removed in 2.6.25. Value can only be set to "1", and is tested only if standard capability checks allow CAP_SYS_MODULE. Given existing /dev/mem protections, this should allow administrators a one-way method to block module loading after initial boot-time module loading has finished. Signed-off-by: Kees Cook <kees.cook@canonical.com> Acked-by: Serge Hallyn <serue@us.ibm.com> Signed-off-by: James Morris <jmorris@namei.org>	2009-04-03 11:47:11 +11:00
Ingo Molnar	8302294f43	Merge branch 'tracing/core-v2' into tracing-for-linus Conflicts: include/linux/slub_def.h lib/Kconfig.debug mm/slob.c mm/slub.c	2009-04-02 00:49:02 +02:00
Rusty Russell	49502677e1	module: use strstarts() Impact: minor cleanup. I'm not going to neaten anyone else's code, but I'm happy to clean up my own. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:37 +10:30
Rusty Russell	e91defa26c	module: don't use stop_machine on module load Kay Sievers <kay.sievers@vrfy.org> discovered that boot times are slowed by about half a second because all the stop_machine_create() calls, and he only probes about 40 modules (I have 125 loaded on this laptop). We only do stop_machine_create() so we can unlink the module if something goes wrong, but it's overkill (and buggy anyway: if stop_machine_create() fails we still call stop_machine_destroy()). Since we are only protecting against kallsyms (esp. oops) walking the list, synchronize_sched() is sufficient (synchronize_rcu() is probably sufficient, but we're not in a hurry). Kay says of this patch: ... no module takes more than 40 millisecs to link now, most of them are between 3 and 8 millisecs. That looks very different to the numbers without this patch and the otherwise same setup, where we get heavy noise in the traces and many delays of up to 200 millisecs until linking, most of them taking 30+ millisecs. Tested-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:35 +10:30
Rusty Russell	8c8ef42aee	module: include other structures in module version check With CONFIG_MODVERSIONS, we version 'struct module' using a dummy export, but other things matter too: 1) 'struct modversion_info' determines the layout of the __versions section, 2) 'struct kernel_param' determines the layout of the __params section, 3) 'struct kernel_symbol' determines __ksymtab*. 4) 'struct marker' determines __markers. 5) 'struct tracepoint' determines __tracepoints. So we rename 'struct_module' to 'module_layout' and include these in the signature. Now it's general we can add others later on without confusion. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:34 +10:30
Rusty Russell	9cb610d8e3	module: remove the SHF_ALLOC flag on the __versions section. Impact: reduce kernel memory usage This patch just takes off the SHF_ALLOC flag on __versions so we don't keep them around after module load. This saves about 7% of module memory if CONFIG_MODVERSIONS=y. Cc: Shawn Bohrer <shawn.bohrer@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:34 +10:30
Rusty Russell	c6e665c8f0	module: clarify the force-loading taint message. Impact: Message cleanup Two of three callers of try_to_force_load() are not because of a missing version, so change the messages: Old: <modname>: no version for "magic" found: kernel tainted. New: <modname>: bad vermagic: kernel tainted. Old: <modname>: no version for "nocrc" found: kernel tainted. New: <modname>: no versions for exported symbols: kernel tainted. Old: <modname>: no version for "<symname>" found: kernel tainted. New: <modname>: <symname>: kernel tainted. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:33 +10:30
Tim Abbott	c6b3780191	module: Export symbols needed for Ksplice Impact: Expose some module.c symbols Ksplice uses several functions from module.c in order to resolve symbols and implement dependency handling. Calling these functions requires holding module_mutex, so it is exported. (This is just the module part of a bigger add-exports patch from Tim). Cc: Anders Kaseorg <andersk@mit.edu> Cc: Jeff Arnold <jbarnold@mit.edu> Signed-off-by: Tim Abbott <tabbott@mit.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:33 +10:30
Anders Kaseorg	75a66614db	Ksplice: Add functions for walking kallsyms symbols Impact: New API kallsyms_lookup_name only returns the first match that it finds. Ksplice needs information about all symbols with a given name in order to correctly resolve local symbols. kallsyms_on_each_symbol provides a generic mechanism for iterating over the kallsyms table. Cc: Jeff Arnold <jbarnold@mit.edu> Cc: Tim Abbott <tabbott@mit.edu> Signed-off-by: Anders Kaseorg <andersk@mit.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:32 +10:30
Rusty Russell	a6e6abd575	module: remove module_text_address() Impact: Replace and remove risky (non-EXPORTed) API module_text_address() returns a pointer to the module, which given locking improvements in module.c, is useless except to test for NULL: 1) If the module can't go away, use __module_text_address. 2) Otherwise, just use is_module_text_address(). Cc: linux-mtd@lists.infradead.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:32 +10:30
Rusty Russell	e610499e26	module: __module_address Impact: New API, cleanup ksplice wants to know the bounds of a module, not just the module text. It makes sense to have __module_address. We then implement is_module_address and __module_text_address in terms of this (and change is_module_text_address() to bool while we're at it). Also, add proper kerneldoc for them all. Cc: Anders Kaseorg <andersk@mit.edu> Cc: Jeff Arnold <jbarnold@mit.edu> Cc: Tim Abbott <tabbott@mit.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-03-31 13:05:31 +10:30

1 2 3 4 5 ...

323 Commits