linux

Commit Graph

Author	SHA1	Message	Date
Andi Kleen	ce06e0b21d	vfs: optimize touch_time() too Do a similar optimization as earlier for touch_atime. Getting the lock in mnt_get_write is relatively costly, so try all avenues to avoid it first. This patch is careful to still only update inode fields inside the lock region. This didn't show up in benchmarks, but it's easy enough to do. [akpm@linux-foundation.org: fix typo in comment] [hugh.dickins@tiscali.co.uk: fix inverted test of mnt_want_write_file()] Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Valerie Aurora <vaurora@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 07:47:27 -04:00
Andi Kleen	b12536c270	vfs: optimization for touch_atime() Some benchmark testing shows touch_atime to be high up in profile logs for IO intensive workloads. Most likely that's due to the lock in mnt_want_write(). Unfortunately touch_atime first takes the lock, and then does all the other tests that could avoid atime updates (like noatime or relatime). Do it the other way round -- first try to avoid the update and only then if that didn't succeed take the lock. That works because none of the atime avoidance tests rely on locking. This also eliminates a goto. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Reviewed-by: Valerie Aurora <vaurora@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 07:47:26 -04:00
Jan Kara	22fe404218	vfs: split generic_forget_inode() so that hugetlbfs does not have to copy it Hugetlbfs needs to do special things instead of truncate_inode_pages(). Currently, it copied generic_forget_inode() except for truncate_inode_pages() call which is asking for trouble (the code there isn't trivial). So create a separate function generic_detach_inode() which does all the list magic done in generic_forget_inode() and call it from hugetlbfs_forget_inode(). Signed-off-by: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 07:47:25 -04:00
Manish Katiyar	af0d9ae811	fs/inode.c: add dev-id and inode number for debugging in init_special_inode() Add device-id and inode number for better debugging. This was suggested by Andreas in one of the threads http://article.gmane.org/gmane.comp.file-systems.ext4/12062 . "If anyone has a chance, fixing this error message to be not-useless would be good... Including the device name and the inode number would help track down the source of the problem." Signed-off-by: Manish Katiyar <mkatiyar@gmail.com> Cc: Andreas Dilger <adilger@sun.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 07:47:24 -04:00
Steven Rostedt	14be27460e	libfs: make simple_read_from_buffer conventional Impact: have simple_read_from_buffer conform to standards It was brought to my attention by Andrew Morton, Theodore Tso, and H. Peter Anvin that a read from userspace should only return -EFAULT if nothing was actually read. Looking at the simple_read_from_buffer I noticed that this function does not conform to that rule. This patch fixes that function. [akpm@linux-foundation.org: simplification suggested by hpa] [hpa@zytor.com: fix count==0 handling] Signed-off-by: Steven Rostedt <srostedt@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 07:47:22 -04:00
Jason Wessel	429a6e5e2c	x86: early_printk: Protect against using the same device twice If you use the kernel argument: earlyprintk=serial,ttyS0,115200 This will cause a recursive hang printing the same line again and again: BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data) BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) bootconsole [earlyser0] enabled Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009 Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009 Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009 Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009 Linux version 2.6.31-07863-gb64ada6 (mingo@sirius) (gcc version 4.3.2 20081105 (Red Hat 4.3.2-7) (GCC) ) #16789 SMP Wed Sep 23 21:09:43 CEST 2009 Instead warn the end user that they specified the device a second time, and ignore that second console. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Cc: Len Brown <lenb@kernel.org> Cc: Greg KH <gregkh@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <4ABAAB89.1080407@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 13:01:13 +02:00
Ingo Molnar	d2ff6de537	Merge branch 'linus' into x86/urgent Merge reason: Queueing up dependent early-printk fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 12:59:18 +02:00
Mike Galbraith	dd906a0fe8	perf tools: Handle relative paths while loading module symbols Inform util/module.c::mod_dso__load_module_paths() that relative paths do exist in some modules.dep, and make it fail noisily should it encounter a path that it doesn't understand, or a module it cannot open. Reported-by: Avi Kivity <avi@redhat.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: rostedt@goodmis.org Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1253779628.10513.8.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 11:40:35 +02:00
Roland Dreier	e23a8b6a8f	x86: Reduce verbosity of "PAT enabled" kernel message On modern systems, the kernel prints the message x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106 once for every CPU. This gets kind of ridiculous on huge systems; for example, on a 64-thread system I was lucky enough to get: dmesg | grep 'PAT enabled' | wc 64 704 5174 There is already a BUG() if non-boot CPUs have PAT capabilities that don't match the boot CPU, so just print the message on the boot CPU. (I kept the print after the wrmsrl() that enables PAT, so that the log output continues to mean that the system survived enabling PAT on the boot CPU) Signed-off-by: Roland Dreier <rolandd@cisco.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> LKML-Reference: <adavdj92sso.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 11:35:19 +02:00
Roland Dreier	ea01c0d731	x86: Reduce verbosity of "TSC is reliable" message On modern systems, the kernel prints the message Skipping synchronization checks as TSC is reliable. once for every non-boot CPU. This gets kind of ridiculous on huge systems; for example, on a 64-thread system I was lucky enough to get: $ dmesg | grep 'TSC is reliable' | wc 63 567 4221 There's no point to doing this for every CPU, since the code is just checking the boot CPU anyway, so change this to a printk_once() to make the message appear only once. Signed-off-by: Roland Dreier <rolandd@cisco.com> LKML-Reference: <adazl8l2swc.fsf@cisco.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-24 11:35:19 +02:00
Paul Mundt	40258ee97d	sh: Fix up uninitialized variable use caught by gcc 4.4. In the unaligned kernel exception fixup case the printk() was ordered before the copy_from_user(), resulting in a nonsensical instruction value. This fixes up the ordering properly. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-09-24 17:48:15 +09:00
Paul Mundt	23c4c82171	sh: Handle unaligned 16-bit instructions on SH-2A. This adds some sanity checking in the unaligned instruction handler to verify the instruction size, which enables basic support for 16-bit fixups on SH-2A parts. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-09-24 17:38:18 +09:00
Michal Simek	bfc8125858	microblaze: Disable heartbeat/enable emaclite in defconfigs I need to disable heartbeat function because this features breaks testing in Qemu. Signed-off-by: Michal Simek <monstr@monstr.eu>	2009-09-24 10:30:27 +02:00
Michal Simek	f05131cd7a	microblaze: Support simpleImage.dts make target Instead of remembering to specify DTB= on the make commandline, this commit allows the much friendlier make simpleImage.<dts> where <dts>.dts is expected to be found in arch/microblaze/boot/dts/ The resulting vmlinux, with the compiled DTS linked in, will be copied to boot/simpleImage.<dts> This mirrors the same functionality as on PowerPC, albeit achieving it in a slightly different way. + strip simpleImage file The size of output file is very similar to linux.bin. vmlinux - full elf without fdt blob simpleImage.<dtb name>.unstrip - full elf with fdt blob simpleImage.<dtb name> - stripped elf with fdt blob Add symlink to generic system.dts in platform folder Signed-off-by: John Williams <john.williams@petalogix.com> Signed-off-by: Michal Simek <monstr@monstr.eu>	2009-09-24 10:28:22 +02:00
Michael Abbott	96830a57de	[PATCH] Fix idle time field in /proc/uptime Git commit `79741dd` changes idle cputime accounting, but unfortunately the /proc/uptime file hasn't caught up. Here the idle time calculation from /proc/stat is copied over. Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-24 10:16:24 +02:00
Paul Moore	d81165919e	lsm: Use a compressed IPv6 string format in audit events Currently the audit subsystem prints uncompressed IPv6 addresses which not only differs from common usage but also results in ridiculously large audit strings which is not a good thing. This patch fixes this by simply converting audit to always print compressed IPv6 addresses. Old message example: audit(1253576792.161:30): avc: denied { ingress } for saddr=0000:0000:0000:0000:0000:0000:0000:0001 src=5000 daddr=0000:0000:0000:0000:0000:0000:0000:0001 dest=35502 netif=lo scontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif New message example: audit(1253576792.161:30): avc: denied { ingress } for saddr=::1 src=5000 daddr=::1 dest=35502 netif=lo scontext=system_u:object_r:unlabeled_t:s15:c0.c1023 tcontext=system_u:object_r:lo_netif_t:s0-s15:c0.c1023 tclass=netif Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 03:50:26 -04:00
Eric Paris	939cbf260c	Audit: send signal info if selinux is disabled Audit will not respond to signal requests if selinux is disabled since it is unable to translate the 0 sid from the sending process to a context. This patch just doesn't send the context info if there isn't any. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 03:50:26 -04:00
Eric Paris	44e51a1b78	Audit: rearrange audit_context to save 16 bytes per struct pahole pointed out that on x86_64 struct audit_context can be rearranged to save 16 bytes per struct. Since we have an audit_context per task this can actually be a pretty significant gain. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 03:50:26 -04:00
Eric Paris	e08b061ec0	Audit: reorganize struct audit_watch to save 8 bytes pahole showed that struct audit_watch had two holes: struct audit_watch { atomic_t count; /* 0 4 */ /* XXX 4 bytes hole, try to pack */ char *path; /* 8 8 */ dev_t dev; /* 16 4 */ /* XXX 4 bytes hole, try to pack */ long unsigned int ino; /* 24 8 */ struct audit_parent *parent; /* 32 8 */ struct list_head wlist; /* 40 16 */ struct list_head rules; /* 56 16 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ /* size: 72, cachelines: 2, members: 7 */ /* sum members: 64, holes: 2, sum holes: 8 */ /* last cacheline: 8 bytes */ }; /* definitions: 1 */ by moving dev after count we save 8 bytes, actually improving cacheline usage. There are typically very few of these in the kernel so it won't be a large savings, but it's a good thing no matter what. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-09-24 03:50:25 -04:00
Kuninori Morimoto	acf3cc283f	sh: mach-ecovec24: Add active low setting for sh_eth Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-09-24 16:21:51 +09:00
Jaswinder Singh Rajput	a6bbce200d	sh: includecheck fix: dwarf.c fix the following 'make includecheck' warning: arch/sh/kernel/dwarf.c: asm/dwarf.h is included more than once. Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2009-09-24 16:21:50 +09:00
Benjamin Herrenschmidt	09dd3fc19c	Fix build of cpm_uart due to core changes Commit `ebd2c8f6d2` "serial: kill off uart_info" broke the build of this driver, this fixes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 17:01:22 +10:00
Rex Feany	e0908085fc	powerpc/8xx: Fix regression introduced by cache coherency rewrite After upgrading to the latest kernel on my mpc875 userspace started running incredibly slow (hours to get to a shell, even!). I tracked it down to commit `8d30c14cab`, that patch removed a work-around for the 8xx. Adding it back makes my problem go away. Signed-off-by: Rex Feany <rfeany@mrv.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:56:30 +10:00
Josh Boyer	daf8f40391	powerpc/4xx: Fix erroneous xmon warning on PowerPC 4xx The xmon code relies on MSR_RI being non-zero to indicate that an exception is recoverable. If it is not, it prints a warning message. However, the PowerPC 4xx cores do not have an MSR_RI bit and this warning is produced for every xmon event. This introduces an unrecoverable_excp function to determine if an exception is recoverable or not. This gets rid of the erroneous warnings on 4xx. Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:49 +10:00
Benjamin Herrenschmidt	f32af63ed1	powerpc/mm: Fix 40x and 8xx vs. _PAGE_SPECIAL The test to check whether we have _PAGE_SPECIAL defined is broken, since we always define it, just not always to a meaningful value :-) That broke 8xx and 40x under some circumstances. This fixes it by adding _PAGE_SPECIAL for both of these since they had a free PTE bit, and removing the condition around advertising it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:49 +10:00
Tim Abbott	142597dbbd	powerpc: Cleanup linker script using new linker script macros. Signed-off-by: Tim Abbott <tabbott@ksplice.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@ozlabs.org Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:48 +10:00
Anton Blanchard	049d049706	powerpc: Fix ibm,client-architecture-support printout On machines without the ibm,client-architecture-support call we were missing a newline. We may as well print the full name in all its glory too - its ibm,client-architecture-support, not ibm,client-architecture as I mistakenly wrote (a name only an IBM architect could love). For my penance I will write out ibm,client-architecture-support 100 times. Before: Calling ibm,client-architecture...command line: root=/dev/sda6 console=hvc0 quiet After: Calling ibm,client-architecture-support... not implemented command line: root=/dev/sda6 console=hvc0 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:47 +10:00
Anton Blanchard	ea55bf2912	powerpc: Increase NODES_SHIFT on 64bit from 4 to 8 Some System p configurations can already have more than 16 nodes so we need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the future, although we can look at something smaller if the memory bloat is considered too much. Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in early_node_map in mm/page_alloc.c: < 6144 early_node_map > 307200 early_node_map due to: #if MAX_NUMNODES >= 32 /* If there can be many nodes, allow up to 50 holes per node */ #define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50) #else /* By default, allow up to 256 distinct regions */ #define MAX_ACTIVE_REGIONS 256 Since our memory is mostly contiguous it seems reasonable to keep this at 256 for now. I also set 32bit to 32 to save space (is there any chance a 32bit system will have more than 32 discontiguous memory ranges?). Even with that fixed we have a few data structures that grow: < 896 bootmem_node_data > 14336 bootmem_node_data < 1280 node_devices > 20480 node_devices < 25088 kmalloc_caches > 59648 kmalloc_caches < 1632 hstates > 21792 hstates Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:46 +10:00
Anton Blanchard	f2053f1a7b	powerpc/perf_counter: Fix vdso detection perf_counter uses arch_vma_name() to detect a vdso region which in turn uses current->mm->context.vdso_base. We need to initialise this before doing the mmap or else we fail to detect the vdso. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:45 +10:00
Anton Blanchard	8bbde7a706	powerpc: Move 64bit heap above 1TB on machines with 1TB segments If we are using 1TB segments and we are allowed to randomise the heap, we can put it above 1TB so it is backed by a 1TB segment. Otherwise the heap will be in the bottom 1TB which always uses 256MB segments and this may result in a performance penalty. This functionality is disabled when heap randomisation is turned off: echo 1 > /proc/sys/kernel/randomize_va_space which may be useful when trying to allocate the maximum amount of 16M or 16G pages. On a microbenchmark that repeatedly touches 32GB of memory with a stride of 256MB + 4kB (designed to stress 256MB segments while still mapping nicely into the L1 cache), we see the improvement: Force malloc to use heap all the time: # export MALLOC_MMAP_MAX_=0 MALLOC_TRIM_THRESHOLD_=-1 Disable heap randomization: # echo 1 > /proc/sys/kernel/randomize_va_space # time ./test 12.51s Enable heap randomization: # echo 2 > /proc/sys/kernel/randomize_va_space # time ./test 1.70s Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:44 +10:00
Becky Bruce	738ef42e32	powerpc: Change archdata dma_data to a union Sometimes this is used to hold a simple offset, and sometimes it is used to hold a pointer. This patch changes it to a union containing void * and dma_addr_t. get/set accessors are also provided, because it was getting a bit ugly to get to the actual data. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:43 +10:00
Becky Bruce	1cebd7a0f6	powerpc: Rename get_dma_direct_offset get_dma_offset The former is no longer really accurate with the swiotlb case now a possibility. I also move it into dma-mapping.h - it no longer needs to be in dma.c, and there are about to be some more accessors that should all end up in the same place. A comment is added to indicate that this function is not used in configs where there is no simple dma offset, such as the iommu case. Signed-off-by: Becky Bruce <beckyb@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:43 +10:00
Huang Weiyi	b9eceb2307	powerpc/mm: Remove duplicated #include Remove duplicated #include('s) in arch/powerpc/mm/tlb_low_64e.S Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:42 +10:00
Huang Weiyi	5c8f382c0b	powerpc/book3e-64: Remove duplicated #include Remove duplicated #include('s) in arch/powerpc/kernel/exceptions-64e.S Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:41 +10:00
Tony Breeds	144ef909c0	powerpc: Check for unsupported relocs when using CONFIG_RELOCATABLE When using CONFIG_RELOCATABLE, we build the kernel as a position independent executable. The kernel then uses a little bit of relocation code to relocate itself. That code only deals with R_PPC64_RELATIVE relocations though. If for some reason you use assembly constructs such as LOAD_REG_IMMEDIATE() to load the address of a symbol, you'll generate different kinds of relocations that won't be processed properly and bad things will happen. (We have 2 such bugs today). The perl script tries to filter out "known" bad ones. It's possible that we are missing some in the case of a weak function that nobody implements, we'll see if we get false positive and fix it. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:40 +10:00
Benjamin Herrenschmidt	ad08587e5d	powerpc/pmc: Don't access lppaca on Book3E It doesn't exist ! Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:39 +10:00
roel kluin	0f3372741f	powerpc: kmalloc failure ignored in vio_build_iommu_table() Prevent NULL dereference if kmalloc() fails. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:31:38 +10:00
Hendrik Brueckner	254be490f2	hvc_console: Provide (un)locked version for hvc_resize() Rename the locking free hvc_resize() function to __hvc_resize() and provide an inline function that locks the hvc_struct and calls __hvc_resize(). The rationale for this patch is that virtio_console calls the hvc_resize() function without locking the hvc_struct. So it needs to call the lock itself. According to naming rules, the unlocked version is renamed and prefixed with "__". References to unlocked function calls in hvc back-ends has been updated. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2009-09-24 15:12:47 +10:00
Linus Torvalds	94a8d5caba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (39 commits) cpumask: Move deprecated functions to end of header. cpumask: remove unused deprecated functions, avoid accusations of insanity cpumask: use new-style cpumask ops in mm/quicklist. cpumask: use mm_cpumask() wrapper: x86 cpumask: use mm_cpumask() wrapper: um cpumask: use mm_cpumask() wrapper: mips cpumask: use mm_cpumask() wrapper: mn10300 cpumask: use mm_cpumask() wrapper: m32r cpumask: use mm_cpumask() wrapper: arm cpumask: Use accessors for cpu_*_mask: um cpumask: Use accessors for cpu_*_mask: powerpc cpumask: Use accessors for cpu_*_mask: mips cpumask: Use accessors for cpu_*_mask: m32r cpumask: remove arch_send_call_function_ipi cpumask: arch_send_call_function_ipi_mask: s390 cpumask: arch_send_call_function_ipi_mask: powerpc cpumask: arch_send_call_function_ipi_mask: mips cpumask: arch_send_call_function_ipi_mask: m32r cpumask: arch_send_call_function_ipi_mask: alpha cpumask: remove obsolete topology_core_siblings and topology_thread_siblings: ia64 ...	2009-09-23 18:14:11 -07:00
Alexey Dobriyan	2bcd57ab61	headers: utsname.h redux * remove asm/atomic.h inclusion from linux/utsname.h -- not needed after kref conversion * remove linux/utsname.h inclusion from files which do not need it NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however due to some personality stuff it _is_ needed -- cowardly leave ELF-related headers and files alone. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:13:10 -07:00
Sebastian Andrzej Siewior	95e0d86bad	Revert "kmod: fix race in usermodehelper code" This reverts commit `c02e3f361c` ("kmod: fix race in usermodehelper code") The patch is wrong. UMH_WAIT_EXEC is called with VFORK what ensures that the child finishes prior returning back to the parent. No race. In fact, the patch makes it even worse because it does the thing it claims not do: - It calls ->complete() on UMH_WAIT_EXEC - the complete() callback may de-allocate subinfo as seen in the following call chain: [<c009f904>] (__link_path_walk+0x20/0xeb4) from [<c00a094c>] (path_walk+0x48/0x94) [<c00a094c>] (path_walk+0x48/0x94) from [<c00a0a34>] (do_path_lookup+0x24/0x4c) [<c00a0a34>] (do_path_lookup+0x24/0x4c) from [<c00a158c>] (do_filp_open+0xa4/0x83c) [<c00a158c>] (do_filp_open+0xa4/0x83c) from [<c009ba90>] (open_exec+0x24/0xe0) [<c009ba90>] (open_exec+0x24/0xe0) from [<c009bfa8>] (do_execve+0x7c/0x2e4) [<c009bfa8>] (do_execve+0x7c/0x2e4) from [<c0026a80>] (kernel_execve+0x34/0x80) [<c0026a80>] (kernel_execve+0x34/0x80) from [<c004b514>] (____call_usermodehelper+0x130/0x148) [<c004b514>] (____call_usermodehelper+0x130/0x148) from [<c0024858>] (kernel_thread_exit+0x0/0x8) and the path pointer was NULL. Good that ARM's kernel_execve() doesn't check the pointer for NULL or else I wouldn't notice it. The only race there might be is with UMH_NO_WAIT but it is too late for me to investigate it now. UMH_WAIT_PROC could probably also use VFORK and we could save one exec. So the only race I see is with UMH_NO_WAIT and recent scheduler changes where the child does not always run first might have trigger here something but as I said, it is late.... Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-23 18:12:10 -07:00
Chris Mason	11ef160fda	Btrfs: fix releasepage to avoid unlocking extents we haven't locked During releasepage, we try to drop any extent_state structs for the byte offsets of the page we're releasing. But the code was incorrectly telling clear_extent_bit to delete the state struct unconditionally. Normally this would be fine because we have the page locked, but other parts of btrfs will lock down an entire extent, the most common place being IO completion. releasepage was deleting the extent state without first locking the extent, which may result in removing a state struct that another process had locked down. The fix here is to leave the NODATASUM and EXTENT_LOCKED bits alone in releasepage. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-09-23 20:30:53 -04:00
Chris Mason	46562cec98	Btrfs: Fix test_range_bit for whole file extents If test_range_bit finds an extent that goes all the way to (u64)-1, it can incorrectly wrap the u64 instead of treating it like the end of the address space. This just adds a check for the highest possible offset so we don't wrap. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-09-23 20:30:52 -04:00
Chris Mason	42daec299b	Btrfs: fix errors handling cached state in set/clear_extent_bit Both set and clear_extent_bit allow passing a cached state struct to reduce rbtree search times. clear_extent_bit was improperly bypassing some of the checks around making sure the extent state fields were correct for a given operation. The fix used here (from Yan Zheng) is to use the hit_next goto target instead of jumping all the way down to start clearing bits without making sure the cached state was exactly correct for the operation we were doing. This also fixes up the setting of the start variable for both ops in the case where we find an overlapping extent that begins before the range we want to change. In both cases we were incorrectly going backwards from the original requested change. Signed-off-by: Chris Mason <chris.mason@oracle.com>	2009-09-23 20:30:52 -04:00
Amit Shah	0aea51c37f	virtio_net: Check for room in the vq before adding buffer Saves us one cycle of alloc-add-free if the queue was full. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (modified)	2009-09-24 09:59:21 +09:30
Rusty Russell	48925e372f	virtio_net: avoid (most) NETDEV_TX_BUSY by stopping queue early. Now we can tell the theoretical capacity remaining in the output queue, virtio_net can waste entries by stopping the queue early. It doesn't work in the case of indirect buffers and kmalloc failure, but that's rare (we could drop the packet in that case, but other drivers return TX_BUSY for similar reasons). For the record, I think this patch reflects poorly on the linux network API. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Dinesh Subhraveti <dineshs@us.ibm.com>	2009-09-24 09:59:20 +09:30
Rusty Russell	b3f24698a7	virtio_net: formalize skb_vnet_hdr We put the virtio_net_hdr into the skb's cb region; turn this into a union to clean up the code slightly and allow future expansion. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Mark McLoughlin <markmc@redhat.com> Cc: Dinesh Subhraveti <dineshs@us.ibm.com>	2009-09-24 09:59:20 +09:30
Rusty Russell	b0c39dbdc2	virtio_net: don't free buffers in xmit ring The virtio_net driver is complicated by the two methods of freeing old xmit buffers (in addition to freeing old ones at the start of the xmit path). The original code used a 1/10 second timer attached to xmit_free(), reset on every xmit. Before we orphaned skbs on xmit, the transmitting userspace could block with a full socket until the timer fired, the skb destructor was called, and they were re-woken. So we added the VIRTIO_F_NOTIFY_ON_EMPTY feature: supporting devices send an interrupt (even if normally suppressed) on an empty xmit ring which makes us schedule xmit_tasklet(). This was a benchmark win. Unfortunately, VIRTIO_F_NOTIFY_ON_EMPTY makes quite a lot of work: a host which is faster than the guest will fire the interrupt every xmit packet (slowing the guest down further). Attempting mitigation in the host adds overhead of userspace timers (possibly with the additional pain of signals), and risks increasing latency anyway if you get it wrong. In practice, this effect was masked by benchmarks which take advantage of GSO (with its inherent transmit batching), but it's still there. Now we orphan xmitted skbs, the pressure is off: remove both paths and no longer request VIRTIO_F_NOTIFY_ON_EMPTY. Note that the current QEMU will notify us even if we don't negotiate this feature (legal, but suboptimal); a patch is outstanding to improve that. Move the skb_orphan/nf_reset to after we've done the send and notified the other end, for a slight optimization. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Mark McLoughlin <markmc@redhat.com>	2009-09-24 09:59:19 +09:30
Rusty Russell	8958f574db	virtio_net: return NETDEV_TX_BUSY instead of queueing an extra skb. This effectively reverts `99ffc696d1` "virtio: wean net driver off NETDEV_TX_BUSY". The complexity of queuing an skb (setting a tasklet to re-xmit) is questionable, especially once we get rid of the other reason for the tasklet in the next patch. If the skb won't fit in the tx queue, just return NETDEV_TX_BUSY. This is frowned upon, so a followup patch uses a more complex solution. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Herbert Xu <herbert@gondor.apana.org.au>	2009-09-24 09:59:19 +09:30
Rusty Russell	2b5bbe3b8b	virtio_net: skb_orphan() and nf_reset() in xmit path. The complex transmit free logic was introduced to avoid hangs on removing the ip_conntrack module and also because drivers aren't generally supposed to keep stale skbs for unbounded times. After some debate, it was decided that while doing skb_orphan() generally is a rat's nest, we can do it in this driver. Following patches take advantage of this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2009-09-24 09:59:18 +09:30
