linux

Commit Graph

Author	SHA1	Message	Date
Al Viro	5411be59db	[PATCH] drop task argument of audit_syscall_{entry,exit} ... it's always current, and that's a good thing - allows simpler locking. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-05-01 06:06:18 -04:00
Linus Torvalds	37e53db8aa	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] update sn2 defconfig [IA64] Add mca recovery failure messages [IA64-SGI] fix SGI Altix tioce_reserve_m32() bug [IA64] enable dumps to capture second page of kernel stack [IA64-SGI] - Reduce overhead of reading sn_topology [IA64-SGI] - Fix discover of nearest cpu node to IO node [IA64] IOC4 config option ordering [IA64] Setup an IA64 specific reclaim distance [IA64] eliminate compile time warnings [IA64] eliminate compile time warnings [IA64-SGI] SN SAL call to inject memory errors [IA64] - Fix MAX_PXM_DOMAINS for systems with > 256 nodes [IA64] Remove unused variable in sn_sal.h [IA64] Remove redundant NULL checks before kfree [IA64] wire up compat_sys_adjtimex()	2006-04-27 17:01:37 -07:00
Jes Sorensen	7384c8bd90	[IA64] update sn2 defconfig Update SN2 defconfig to latest kernel and add QLA FC drivers commonly found in SN2 boxes. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-27 14:38:03 -07:00
Russ Anderson	189979619f	[IA64] Add mca recovery failure messages When the mca recovery code encounters a condition that makes the MCA non-recoverable, print the reason it could not recover. This will make it easier to identify why the recovery code did not recover. Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-27 14:34:01 -07:00
Mike Habeck	cda3d4a069	[IA64-SGI] fix SGI Altix tioce_reserve_m32() bug The following patch fixes a bug in the SGI Altix tioce_reserve_m32() code. The bug was that we could walking past the end of the CE ASIC 32/40bit PMU ATE Buffer, resulting in a PIO Reply Error. Signed-off-by: Mike Habeck <habeck@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-27 14:32:07 -07:00
Jack Steiner	dd4cb9f8ac	[IA64-SGI] - Reduce overhead of reading sn_topology MPI programs using certain debug options have a long startup time. This was traced to a "vmalloc/vfree" in the code that reads /proc/sgi_sn/sn_topology. On large systems, vfree requires an IPI to all cpus to do TLB purging. Replace the vmalloc/vfree with kmalloc/kfree. Although the size of the structure being allocated is unknown, it will not not exceed 96 bytes. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-27 14:29:05 -07:00
Jack Steiner	f0fe253c47	[IA64-SGI] - Fix discover of nearest cpu node to IO node Fix a bug that causes discovery of the nearest node/cpu to a TIO (IO node) to fail. Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-27 14:28:37 -07:00
Chandra Seetharaman	83d722f7e1	[PATCH] Remove __devinit and __cpuinit from notifier_call definitions Few of the notifier_chain_register() callers use __init in the definition of notifier_call. It is incorrect as the function definition should be available after the initializations (they do not unregister them during initializations). This patch fixes all such usages to _not_ have the notifier_call __init section. Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-26 08:30:03 -07:00
Jens Axboe	912d35f867	[PATCH] Add support for the sys_vmsplice syscall sys_splice() moves data to/from pipes with a file input/output. sys_vmsplice() moves data to a pipe, with the input being a user address range instead. This uses an approach suggested by Linus, where we can hold partial ranges inside the pages[] map. Hopefully this will be useful for network receive support as well. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-26 10:59:21 +02:00
Brent Casavant	c1311af12c	[IA64] IOC4 config option ordering SERIAL_SGI_IOC4 and BLK_DEV_SGIIOC4 depend upon SGI_IOC4, and SERIAL_SGI_IOC3 depends upon SGI_IOC3. Currently the definitions are out of order in the config sequence. Fix by including drivers/sn/Kconfig immediately after SGI_SN, upon which SGI_IOC4 and SGI_IOC3 depend. Signed-off-by: Brent Casavant <bcasavan@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-21 10:59:00 -07:00
Satoru Takeuchi	a72391e42f	[IA64] eliminate compile time warnings This patch removes following compile time warnings: drivers/pci/pci-sysfs.c: In function `pci_read_legacy_io': drivers/pci/pci-sysfs.c:257: warning: implicit declaration of function `ia64_pci_legacy_read' drivers/pci/pci-sysfs.c: In function `pci_write_legacy_io': drivers/pci/pci-sysfs.c:280: warning: implicit declaration of function `ia64_pci_legacy_write' It also fixes wrong definition of ia64_pci_legacy_write (type of `bus' is not `pci_dev', but `pci_bus'). Signed-Off-By: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-20 17:06:54 -07:00
Satoru Takeuchi	ee6d4b6ef8	[IA64] eliminate compile time warnings This is a trivial patch to remove following compile time warning: arch/ia64/ia32/../../../fs/binfmt_elf.c:508: warning: 'randomize_stack_top' defined but not used Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-20 17:06:35 -07:00
Jesper Juhl	cbf283c048	[IA64] Remove redundant NULL checks before kfree Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-20 10:11:09 -07:00
Luck, Tony	c6180deb1d	[IA64] wire up compat_sys_adjtimex() Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-20 10:02:37 -07:00
Prasanna S Panchamukhi	3ca269d8b4	[PATCH] Switch Kprobes inline functions to __kprobes for ia64 Andrew Morton pointed out that compiler might not inline the functions marked for inline in kprobes. There-by allowing the insertion of probes on these kprobes routines, which might cause recursion. This patch removes all such inline and adds them to kprobes section there by disallowing probes on all such routines. Some of the routines can even still be inlined, since these routines gets executed after the kprobes had done necessay setup for reentrancy. Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-19 09:13:53 -07:00
Bjorn Helgaas	4f705ae3e9	[PATCH] DMI: move dmi_scan.c from arch/i386 to drivers/firmware/ dmi_scan.c is arch-independent and is used by i386, x86_64, and ia64. Currently all three arches compile it from arch/i386, which means that ia64 and x86_64 depend on things in arch/i386 that they wouldn't otherwise care about. This is simply "mv arch/i386/kernel/dmi_scan.c drivers/firmware/" (removing trailing whitespace) and the associated Makefile changes. All three architectures already set CONFIG_DMI in their top-level Kconfig files. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Andi Kleen <ak@muc.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andrey Panin <pazke@orbita1.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2006-04-14 11:41:25 -07:00
Linus Torvalds	9ca686626c	Merge branch 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block * 'tee' of git://brick.kernel.dk/data/git/linux-2.6-block: [PATCH] splice: add support for sys_tee() [PATCH] splice: pass offset around for ->splice_read() and ->splice_write()	2006-04-14 09:02:07 -07:00
Robin Holt	ace1d816a1	[IA64] Make show_mem() skip holes in a pgdat This patch modifies ia64's show_mem() to walk the vmem_map page tables and rapidly skip forward across regions where the page tables are missing. This prevents the pfn_valid() check from causing numerous unnecessary page faults. Without this patch on a 512 node 512 cpu system where every node has four memory holes, the show_mem() call takes 1 hour 18 minutes. With this patch, it takes less than 3 seconds. Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-13 15:34:45 -07:00
Keith Owens	356a5c1c6f	[IA64] ia64_wait_for_slaves() incorrectly reports MCA ia64_wait_for_slaves() was changed in 2.6.17-rc1 to report the slave state. It incorrectly assumes that all slaves are for MCA, but ia64_wait_for_slaves() is also called from the INIT monarch handler. The existing message is very misleading, so correct it. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-13 14:57:18 -07:00
Jens Axboe	70524490ee	[PATCH] splice: add support for sys_tee() Basically an in-kernel implementation of tee, which uses splice and the pipe buffers as an intelligent way to pass data around by reference. Where the user space tee consumes the input and produces a stdout and file output, this syscall merely duplicates the data inside a pipe to another pipe. No data is copied, the output just grabs a reference to the input pipe data. Signed-off-by: Jens Axboe <axboe@suse.de>	2006-04-11 15:51:17 +02:00
Linus Torvalds	b3967dc566	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] Prefetch mmap_sem in ia64_do_page_fault() [IA64] Failure to resume after INIT in user space [IA64] Pass more data to the MCA/INIT notify_die hooks [IA64] always map VGA framebuffer UC, even if it supports WB [IA64] fix bug in ia64 __mutex_fastpath_trylock [IA64] for_each_possible_cpu: ia64 [IA64] update HP CSR space discovery via ACPI [IA64] Wire up new syscalls {set,get}_robust_list [IA64] 'msg' may be used uninitialized in xpc_initiate_allocate() [IA64] Wire up new syscall sync_file_range()	2006-04-11 06:40:17 -07:00
Yasunori Goto	c80d79d746	[PATCH] Configurable NODES_SHIFT Current implementations define NODES_SHIFT in include/asm-xxx/numnodes.h for each arch. Its definition is sometimes configurable. Indeed, ia64 defines 5 NODES_SHIFT values in the current git tree. But it looks a bit messy. SGI-SN2(ia64) system requires 1024 nodes, and the number of nodes already has been changeable by config. Suitable node's number may be changed in the future even if it is other architecture. So, I wrote configurable node's number. This patch set defines just default value for each arch which needs multi nodes except ia64. But, it is easy to change to configurable if necessary. On ia64 the number of nodes can be already configured in generic ia64 and SN2 config. But, NODES_SHIFT is defined for DIG64 and HP'S machine too. So, I changed it so that all platforms can be configured via CONFIG_NODES_SHIFT. It would be simpler. See also: http://marc.theaimsgroup.com/?l=linux-kernel&m=114358010523896&w=2 Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jack Steiner <steiner@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-04-11 06:18:39 -07:00
Christoph Lameter	0ffe984917	[IA64] Prefetch mmap_sem in ia64_do_page_fault() Take a hint from an x86_64 optimization by Arjan van de Ven and use it for ia64. See `a9ba9a3b38` Prefetch the mmap_sem, which is critical for the performance of the page fault handler. Note: mm may be NULL but I guess that is safe. See `458f935527` Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-07 23:08:16 -07:00
Keith Owens	8cab7ccccb	[IA64] Failure to resume after INIT in user space The OS INIT handler is loading incorrect values into cr.ifa on exit. This shows up as a hang when resuming after an INIT that is delivered while a cpu is in user space. Correct the value loaded into cr.ifa. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-07 23:01:32 -07:00
Keith Owens	958b166c00	[IA64] Pass more data to the MCA/INIT notify_die hooks The MCA/INIT handlers maintain important state in the SAL to OS (sos) area and in the monarch_cpu flag. Kernel debuggers (such as KDB) need this data, and may need to adjust the monarch_cpu field so make the data available to the notify_die hooks. Define two more events for calling the functions on the notify_die chain. Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-07 22:51:51 -07:00
KAMEZAWA Hiroyuki	0681226661	[IA64] for_each_possible_cpu: ia64 for_each_cpu() actually iterates across all possible CPUs. We've had mistakes in the past where people were using for_each_cpu() where they should have been iterating across only online or present CPUs. This is inefficient and possibly buggy. We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the future. This patch replaces for_each_cpu with for_each_possible_cpu under arch/ia64/kernel/. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fjitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-06 15:03:49 -07:00
Bjorn Helgaas	03fbaca36a	[IA64] update HP CSR space discovery via ACPI Get rid of the manual search of _CRS, in favor of acpi_get_vendor_resource() which is now provided by the ACPI CA. And fall back to searching for a consumer-only address space descriptor if no vendor-defined resource is found. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-06 14:42:38 -07:00
Tony Luck	b8cd2af862	[IA64] Wire up new syscalls {set,get}_robust_list Join the dots to enable Ingo's robut futex syscalls. Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-06 14:20:16 -07:00
Tony Luck	27f4aa3db0	[IA64] 'msg' may be used uninitialized in xpc_initiate_allocate() Found by gcc4.1 and reported by Dean Nelson. Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-04 14:11:49 -07:00
Tony Luck	d905b00b3b	[IA64] Wire up new syscall sync_file_range() Also reserve syscall numbers for {set,get}_robust_list Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-04-04 14:08:11 -07:00
Tony Luck	2ab9391dea	[IA64] Avoid "u64 foo : 32;" for gcc3 vs. gcc4 compatibility gcc3 thinks that a 32-bit field of a u64 type is itself a u64, so should be printed with "%ld". gcc4 thinks it needs just "%d". Make both versions happy by avoiding this construct. Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-31 10:28:29 -08:00
Zhang, Yanmin	f19180056e	[IA64] Export cpu cache info by sysfs The patch exports 8 attributes of cpu cache info under /sys/devices/system/cpu/cpuX/cache/indexX: 1) level 2) type 3) coherency_line_size 4) ways_of_associativity 5) size 6) shared_cpu_map 7) attributes 8) number_of_sets: number_of_sets=size/ways_of_associativity/coherency_line_size. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-30 17:14:12 -08:00
Linus Torvalds	d1127e40e8	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6 * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: [IA64] ioremap() should prefer WB over UC [IA64] Add __mca_table to the DISCARD list in gate.lds [IA64] Move __mca_table out of the __init section [IA64] simplify some condition checks in iosapic_check_gsi_range [IA64] correct some messages and fixes some minor things [IA64-SGI] fix for-loop in sn_hwperf_geoid_to_cnode() [IA64-SGI] sn_hwperf use of num_online_cpus() [IA64] optimize flush_tlb_range on large numa box [IA64] lazy_mmu_prot_update needs to be aware of huge pages	2006-03-30 12:38:18 -08:00
Jens Axboe	5274f052e7	[PATCH] Introduce sys_splice() system call This adds support for the sys_splice system call. Using a pipe as a transport, it can connect to files or sockets (latter as output only). From the splice.c comments: "splice": joining two ropes together by interweaving their strands. This is the "extended pipe" functionality, where a pipe is used as an arbitrary in-memory buffer. Think of a pipe as a small kernel buffer that you can use to transfer data from one end to the other. The traditional unix read/write is extended with a "splice()" operation that transfers data buffers to or from a pipe buffer. Named by Larry McVoy, original implementation from Linus, extended by Jens to support splicing to files and fixing the initial implementation bugs. Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-30 12:28:18 -08:00
Bjorn Helgaas	c1c57d7671	[IA64] ioremap() should prefer WB over UC efi_memmap_init() collects full granules of WB memory, without regard for whether they also support UC. So in order for ioremap() to work for main memory, it must prefer WB mappings when possible. Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-30 09:05:41 -08:00
Jes Sorensen	3283a67d86	[IA64] Add __mca_table to the DISCARD list in gate.lds Add __mca_table to the DISCARD list for the gate.lds linker script to avoid broken linker references when linking the final vmlinux file. Also add comment to include/asm-ia64/asmmacros.h to avoid anyone else hitting this problem in the future. Credits to James Bottomley <James.Bottomley@SteelEye.com> for spotting the DISCARD list in gate.lds.S Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-30 09:04:19 -08:00
Russ Anderson	d89cfe7f1e	[IA64] Move __mca_table out of the __init section Move __mca_table out of the __init section. Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-29 10:43:04 -08:00
Satoru Takeuchi	e6d1ba5cd9	[IA64] simplify some condition checks in iosapic_check_gsi_range Some condition checks on iosapic_check_gsi_range() can be omitted because always `base <= end' is assured. This patch simplifies those checks. Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-28 10:42:15 -08:00
Satoru Takeuchi	46cba3dcae	[IA64] correct some messages and fixes some minor things This patch corrects some wrong comments and a printk message. It also fixes some minor things, and makes all lines fit in 80 columns. Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-28 10:41:23 -08:00
Andrew Morton	ec1b9466cb	[PATCH] ia64: const f_ops fix Tweak the proc setup code so things work OK with const proc_dir_entry.proc_fops. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-28 09:16:05 -08:00
Dean Roe	769ebc66de	[IA64-SGI] fix for-loop in sn_hwperf_geoid_to_cnode() Fix a for-loop in sn_hwperf_geoid_to_cnode(). It needs to loop over num_cnodes to ensure it can still process TIO nodes in addition to compute nodes on systems with many nodes. Interim fix until better support for many (>265) nodes is complete. Signed-off-by: Dean Roe <roe@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-28 08:39:10 -08:00
hawkes@sgi.com	e6ef0fca2c	[IA64-SGI] sn_hwperf use of num_online_cpus() Eliminate an unnecessary -- and flawed -- use of the expensive num_online_cpus(). Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-27 13:47:44 -08:00
Chen, Kenneth W	ce9eed5a98	[IA64] optimize flush_tlb_range on large numa box It was reported from a field customer that global spin lock ptcg_lock is giving a lot of grief on munmap performance running on a large numa machine. What appears to be a problem coming from flush_tlb_range(), which currently unconditionally calls platform_global_tlb_purge(). For some of the numa machines in existence today, this function is mapped into ia64_global_tlb_purge(), which holds ptcg_lock spin lock while executing ptc.ga instruction. Here is a patch that attempt to avoid global tlb purge whenever possible. It will use local tlb purge as much as possible. Though the conditions to use local tlb purge is pretty restrictive. One of the side effect of having flush tlb range instruction on ia64 is that kernel don't get a chance to clear out cpu_vm_mask. On ia64, this mask is sticky and it will accumulate if process bounces around. Thus diminishing the possible use of ptc.l. Thoughts? Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Jack Steiner <steiner@sgi.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-27 10:20:03 -08:00
Zhang, Yanmin	5e48521e86	[IA64] lazy_mmu_prot_update needs to be aware of huge pages Function lazy_mmu_prot_update is also used on huge pages when it is called by set_huge_ptep_writable, but it isn't aware of huge pages. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Acked-by: Ken Chen <kenneth.w.chen@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2006-03-27 10:15:41 -08:00
Alan Stern	e041c68341	[PATCH] Notifier chain update: API changes The kernel's implementation of notifier chains is unsafe. There is no protection against entries being added to or removed from a chain while the chain is in use. The issues were discussed in this thread: http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2 We noticed that notifier chains in the kernel fall into two basic usage classes: "Blocking" chains are always called from a process context and the callout routines are allowed to sleep; "Atomic" chains can be called from an atomic context and the callout routines are not allowed to sleep. We decided to codify this distinction and make it part of the API. Therefore this set of patches introduces three new, parallel APIs: one for blocking notifiers, one for atomic notifiers, and one for "raw" notifiers (which is really just the old API under a new name). New kinds of data structures are used for the heads of the chains, and new routines are defined for registration, unregistration, and calling a chain. The three APIs are explained in include/linux/notifier.h and their implementation is in kernel/sys.c. With atomic and blocking chains, the implementation guarantees that the chain links will not be corrupted and that chain callers will not get messed up by entries being added or removed. For raw chains the implementation provides no guarantees at all; users of this API must provide their own protections. (The idea was that situations may come up where the assumptions of the atomic and blocking APIs are not appropriate, so it should be possible for users to handle these things in their own way.) There are some limitations, which should not be too hard to live with. For atomic/blocking chains, registration and unregistration must always be done in a process context since the chain is protected by a mutex/rwsem. Also, a callout routine for a non-raw chain must not try to register or unregister entries on its own chain. (This did happen in a couple of places and the code had to be changed to avoid it.) Since atomic chains may be called from within an NMI handler, they cannot use spinlocks for synchronization. Instead we use RCU. The overhead falls almost entirely in the unregister routine, which is okay since unregistration is much less frequent that calling a chain. Here is the list of chains that we adjusted and their classifications. None of them use the raw API, so for the moment it is only a placeholder. ATOMIC CHAINS ------------- arch/i386/kernel/traps.c: i386die_chain arch/ia64/kernel/traps.c: ia64die_chain arch/powerpc/kernel/traps.c: powerpc_die_chain arch/sparc64/kernel/traps.c: sparc64die_chain arch/x86_64/kernel/traps.c: die_chain drivers/char/ipmi/ipmi_si_intf.c: xaction_notifier_list kernel/panic.c: panic_notifier_list kernel/profile.c: task_free_notifier net/bluetooth/hci_core.c: hci_notifier net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_chain net/ipv4/netfilter/ip_conntrack_core.c: ip_conntrack_expect_chain net/ipv6/addrconf.c: inet6addr_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_chain net/netfilter/nf_conntrack_core.c: nf_conntrack_expect_chain net/netlink/af_netlink.c: netlink_chain BLOCKING CHAINS --------------- arch/powerpc/platforms/pseries/reconfig.c: pSeries_reconfig_chain arch/s390/kernel/process.c: idle_chain arch/x86_64/kernel/process.c idle_notifier drivers/base/memory.c: memory_chain drivers/cpufreq/cpufreq.c cpufreq_policy_notifier_list drivers/cpufreq/cpufreq.c cpufreq_transition_notifier_list drivers/macintosh/adb.c: adb_client_list drivers/macintosh/via-pmu.c sleep_notifier_list drivers/macintosh/via-pmu68k.c sleep_notifier_list drivers/macintosh/windfarm_core.c wf_client_list drivers/usb/core/notify.c usb_notifier_list drivers/video/fbmem.c fb_notifier_list kernel/cpu.c cpu_chain kernel/module.c module_notify_list kernel/profile.c munmap_notifier kernel/profile.c task_exit_notifier kernel/sys.c reboot_notifier_list net/core/dev.c netdev_chain net/decnet/dn_dev.c: dnaddr_chain net/ipv4/devinet.c: inetaddr_chain It's possible that some of these classifications are wrong. If they are, please let us know or submit a patch to fix them. Note that any chain that gets called very frequently should be atomic, because the rwsem read-locking used for blocking chains is very likely to incur cache misses on SMP systems. (However, if the chain's callout routines may sleep then the chain cannot be atomic.) The patch set was written by Alan Stern and Chandra Seetharaman, incorporating material written by Keith Owens and suggestions from Paul McKenney and Andrew Morton. [jes@sgi.com: restructure the notifier chain initialization macros] Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:44:50 -08:00
KAMEZAWA Hiroyuki	3571761fe4	[PATCH] for_each_online_pgdat: remove sorting pgdat Because pgdat_list was linked to pgdat_list in reverse order, (By default) some of arch has to sort it by themselves. for_each_pgdat has gone..for_each_online_pgdat() uses node_online_map, which doesn't need to be sorted. This patch removes codes for sorting pgdat. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:44:48 -08:00
KAMEZAWA Hiroyuki	ec936fc563	[PATCH] for_each_online_pgdat: renaming for_each_pgdat Replace for_each_pgdat() with for_each_online_pgdat(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-27 08:44:48 -08:00
Akinobu Mita	55b0f8a68a	[PATCH] bitops: ia64: make partial_page.bitmap an unsigned long The find__bit() routines are defined to work on a pointer to unsigned long. But partial_page.bitmap is unsigned int and it is passed to find__bit() in arch/ia64/ia32/sys_ia32.c. So the compiler will print warnings. This patch changes to unsigned long instead. Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:57:15 -08:00
Akinobu Mita	2875aef8bd	[PATCH] bitops: ia64: use generic bitops - remove generic_fls64() - remove find_{next,first}{,_zero}_bit() - remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit() - remove minix_{test,set,test_and_clear,test,find_first_zero}_bit() - remove sched_find_first_bit() Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:57:12 -08:00
Akinobu Mita	4668f0cd0a	[PATCH] bitops: ia64: use cpu_set() instead of __set_bit() __set_bit() --> cpu_set() cleanup Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:57:09 -08:00

1 2 3 4 5 ...

669 Commits