access_ok() checks must be done on every part of the userspace structure
that is accessed. If access_ok() on one part of the struct succeeded, it
does not imply it will succeed on other parts of the struct. (Does
depend on the architecture implementation of access_ok()).
This changes the __get_user() users to first check access_ok() on the
data structure.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: stable <stable@kernel.org>
Cc: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
I am submitting a patch for the pl2303 driver. This patch adds support
for the "Sony QN-3USB" cable (vendor=0x054c, product=0x0437). This USB
cable is a so-called data cable used to connect a Sony mobile phone to a
computer. Supported models are Sony CMD-J5, J6, J7, J16, J26, J70 and
Z7.
I have used this patch with my Sony CMD-J70 for several days and I
haven't encountered any kernel/hardware issue.
From: Khanh-Dang Nguyen Thu Lam <kdntl@yahoo.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Do not exceed array status_map[]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
In a non-sparse extend, we correctly allocate (and zero) the clusters between
the old_i_size and pos, but we don't zero the portions of the cluster we're
writing to outside of pos<->len.
It handles clustersize > pagesize and blocksize < pagesize.
[Cleaned up by Joel Becker.]
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
invalidate_inode_pages2_range may return -EBUSY occasionally
which results Oops. This patch fixes the issue by moving
invalidate_inode_pages2_range into a loop and keeping calling
it until the return value is not -EBUSY.
The EBUSY return is temporary, and can happen when the btrfs release page
function is unable to release a page because the EXTENT_LOCK
bit is set.
Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
find_zlib_workspace returns an ERR_PTR value in an error case instead of NULL.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@match exists@
expression x, E;
statement S1, S2;
@@
x = find_zlib_workspace(...)
... when != x = E
(
* if (x == NULL || ...) S1 else S2
|
* if (x == NULL && ...) S1 else S2
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
This takes care of the following entry from Dan's list:
fs/btrfs/inode.c +4788 btrfs_rename(36) warning: variable derefenced before check 'old_inode'
Reported-by: Dan Carpenter <error27@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Eugene Teo <eteo@redhat.com>
Cc: Julia Lawall <julia@diku.dk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Fix double list iteration in per task precise stats
perf: Auto-detect libelf
perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink
perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it
ftrace: Fix perf-tracepoint OOPS
perf report: Add missing command line options to man page
perf: Auto-detect libbfd
perf report: Make --sort comm,dso,symbol the default
* git://git.infradead.org/mtd-2.6:
jffs2: Fix return value from jffs2_do_readpage_nolock()
mtd: mtdblock: introduce mtdblks_lock
mtd: remove 'SBC8240 Wind River' Device Driver Code
mtd: OneNAND: OMAP2/3: free GPMC CS on module removal
mtd: OneNAND: fix incorrect bufferram offset
mtd: blkdevs: do not forget to get MTD devices
mtd: fix the conversion from dev to mtd_info
mtd: let include/linux/mtd/partitions.h stand on its own
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: setup MC/VRAM the same way for suspend/resume
drm/radeon/kms: Fix caching mode selection for GTT object
The new credentials code broke load_flat_shared_library() as it now uses
an uninitialized cred pointer.
Reported-by: Bernd Schmidt <bernds_cb1@t-online.de>
Tested-by: Bernd Schmidt <bernds_cb1@t-online.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These includes were added by 079effb693
("kmemtrace, kbuild: fix slab.h dependency problem in
lib/decompress_inflate.c") to fix the build when using kmemtrace. However
this is not necessary when used to create a compressed kernel, and
actually creates issues (brings a lot of things unavailable in the
decompression environment), so don't include it if STATIC is defined.
Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
decompress_bunzip2 and decompress_unlzma have a nasty hack that subtracts
4 from the input length if being called in the pre-boot environment.
This is a nasty hack because it relies on the fact that flush = NULL only
when called from the pre-boot environment (i.e.
arch/x86/boot/compressed/misc.c). initramfs.c/do_mounts_rd.c pass in a
flush buffer (flush != NULL).
This hack prevents the decompressors from being used with flush = NULL by
other callers unless knowledge of the hack is propagated to them.
This patch removes the hack by making decompress (called only from the
pre-boot environment) a wrapper function that subtracts 4 from the input
length before calling the decompressor.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix and improve comments in decompress/generic.h that describe the
decompressor API. Also remove an unused definition, and rename INBUF_LEN
in lib/decompress_inflate.c to conform to bzip2/lzma naming.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While looking at Jens Rosenboom bug report
(http://lkml.org/lkml/2009/7/27/35) about strange sys_futex call done from
a dying "ps" program, we found following problem.
clone() syscall has special support for TID of created threads. This
support includes two features.
One (CLONE_CHILD_SETTID) is to set an integer into user memory with the
TID value.
One (CLONE_CHILD_CLEARTID) is to clear this same integer once the created
thread dies.
The integer location is a user provided pointer, provided at clone()
time.
kernel keeps this pointer value into current->clear_child_tid.
At execve() time, we should make sure kernel doesnt keep this user
provided pointer, as full user memory is replaced by a new one.
As glibc fork() actually uses clone() syscall with CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID set, chances are high that we might corrupt user
memory in forked processes.
Following sequence could happen:
1) bash (or any program) starts a new process, by a fork() call that
glibc maps to a clone( ... CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
...) syscall
2) When new process starts, its current->clear_child_tid is set to a
location that has a meaning only in bash (or initial program) context
(&THREAD_SELF->tid)
3) This new process does the execve() syscall to start a new program.
current->clear_child_tid is left unchanged (a non NULL value)
4) If this new program creates some threads, and initial thread exits,
kernel will attempt to clear the integer pointed by
current->clear_child_tid from mm_release() :
if (tsk->clear_child_tid
&& !(tsk->flags & PF_SIGNALED)
&& atomic_read(&mm->mm_users) > 1) {
u32 __user * tidptr = tsk->clear_child_tid;
tsk->clear_child_tid = NULL;
/*
* We don't check the error code - if userspace has
* not set up a proper pointer then tough luck.
*/
<< here >> put_user(0, tidptr);
sys_futex(tidptr, FUTEX_WAKE, 1, NULL, NULL, 0);
}
5) OR : if new program is not multi-threaded, but spied by /proc/pid
users (ps command for example), mm_users > 1, and the exiting program
could corrupt 4 bytes in a persistent memory area (shm or memory mapped
file)
If current->clear_child_tid points to a writeable portion of memory of the
new program, kernel happily and silently corrupts 4 bytes of memory, with
unexpected effects.
Fix is straightforward and should not break any sane program.
Reported-by: Jens Rosenboom <jens@mcbone.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sdhci_alloc_host returns an ERR_PTR value in an error case instead of NULL.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@match exists@
expression x, E;
statement S1, S2;
@@
x = sdhci_alloc_host(...)
... when != x = E
(
* if (x == NULL || ...) S1 else S2
|
* if (x == NULL && ...) S1 else S2
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: Ian Molton <ian@mnementh.co.uk>
Cc: "Roberto A. Foglietta" <roberto.foglietta@gmail.com>
Cc: Philip Langdale <philipl@overt.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recent framebuffer locking patches first made affected systems unbootable,
then the dead-lock has been fixed but as of 2.6.31-rc4 the framebuffer on
mx3 machines doesn't work. Fix this.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I suspect that mnt_want_write_file() may have wrong assumption. I think
mnt_want_write_file() is assuming it increments ->mnt_writers if
(file->f_mode & FMODE_WRITE). But, if it's special_file(), it is false?
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The FIEMAP_IOC_FIEMAP mapping ioctl was missing a 32-bit compat handler,
which means that 32-bit suerspace on 64-bit kernels cannot use this ioctl
command.
The structure is nicely aligned, padded, and sized, so it is just this
simple.
Tested w/ 32-bit ioctl tester (from Josef) on a 64-bit kernel on ext4.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Mark Lord <lkml@rtr.ca>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Josef Bacik <josef@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Catalin and kmemleak spotted a leak of a VC screen buffer in
vc_allocate() due to the following chain of events:
vc_allocate()
visual_init(init=1)
vc->vc_sw->con_init(init=1)
fbcon_init()
vc_resize()
vc->screen_buf = kmalloc()
vc->screen_buf = kmalloc()
The common way for the VC drivers is to set the screen dimension
parameters manually in the init case and only call vc_resize() for
!init - which allocates a screen buffer according to the new
dimensions.
fbcon instead would do vc_resize() unconditionally and afterwards set
the dimensions manually (again) for !init - i.e. completely upside
down. The vc_resize() allocated buffer would then get lost by
vc_allocate() allocating a fresh one.
Use vc_resize() only for actual resizing to close the leak.
Set the dimensions manually only in initialization mode to remove the
redundant setting in resize mode.
The kmemleak trace from Catalin:
unreferenced object 0xde158000 (size 12288):
comm "Xorg", pid 1439, jiffies 4294961016
hex dump (first 32 bytes):
20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . .
20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . .
backtrace:
[<c006f74b>] __save_stack_trace+0x17/0x1c
[<c006f81d>] create_object+0xcd/0x188
[<c01f5457>] kmemleak_alloc+0x1b/0x3c
[<c006e303>] __kmalloc+0xdb/0xe8
[<c012cc4b>] vc_do_resize+0x73/0x1e0
[<c012cdf1>] vc_resize+0x15/0x18
[<c011afc1>] fbcon_init+0x1f9/0x2b8
[<c0129e87>] visual_init+0x9f/0xdc
[<c012aff3>] vc_allocate+0x7f/0xfc
[<c012b087>] con_open+0x17/0x80
[<c0120e43>] tty_open+0x1f7/0x2e4
[<c0072fa1>] chrdev_open+0x101/0x118
[<c006ffad>] __dentry_open+0x105/0x1cc
[<c00700fd>] nameidata_to_filp+0x2d/0x38
[<c00788cd>] do_filp_open+0x2c1/0x54c
[<c006fdff>] do_sys_open+0x3b/0xb4
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a bug caused by changing pointers (viafb_mode, viafb_mode1)
assigned by module_param. It reduces driver complexity by not needlessly
changing these vars as they are only read once and removing now
superfluous code.
On unpatched kernels loading viafb with viafb_mode or viafb_mode1 option
used and afterwards unloading it results in:
kernel BUG at mm/slub.c:2926!
invalid opcode: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/block/loop0/removable
Modules linked in: snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm rtl8187 snd_timer eeprom_93cx6 mmc_block snd soundcore
via_sdmmc fb snd_page_alloc i2c_algo_bit i2c_viapro ehci_hcd uhci_hcd
cfbcopyarea mmc_core cfbimgblt cfbfillrect video output [last unloaded:
viafb]
Pid: 3355, comm: rmmod Not tainted (2.6.31-rc1 #0)
EIP: 0060:[<c106a759>] EFLAGS: 00010246 CPU: 0
EIP is at kfree+0x80/0xda
EAX: c17c2da0 EBX: dc7edbdc ECX: 0000010f EDX: 00000000
ESI: c102c700 EDI: dc7ed8fa EBP: d703ff2c ESP: d703ff20
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rmmod (pid: 3355, ti=d703e000 task=db1412c0 task.ti=d703e000)
Stack:
dc7edbdc 00000014 00000016 d703ff40 c102c700 dc7f45d4 dc7f45d4 00000880
d703ff4c c103e571 00000000 d703ffac c103e751 66616976 da140062 db89ba80
00000328 d702edf8 db89ba80 d703ff9c c105d0f0 00000200 da14f898 00000014
Call Trace:
[<c102c700>] ? destroy_params+0x1e/0x2b
[<c103e571>] ? free_module+0xa2/0xd7
[<c103e751>] ? sys_delete_module+0x1ab/0x1da
[<c105d0f0>] ? do_munmap+0x20a/0x225
[<c10029b4>] ? sysenter_do_call+0x12/0x26
Code: 10 76 7a 8d 87 00 00 00 40 c1 e8 0c c1 e0 05 03 05 1c 87 41 c1 66 83 38 00 79 03 8b 40 0c 8b 10 84 d2 78 12 66 f7 c2 00 c0 75 04 <0f> 0b eb fe e8 6f 5a fe ff eb 47 8b 55 04 8b 58 0c 9c 5e fa 3b
EIP: [<c106a759>] kfree+0x80/0xda SS:ESP 0068:d703ff20
This is caused by the current code changing the pointers assigned by
module_param. During unload it tries to free the memory the pointers
point at which is now part of an internal structure.
The patch simply avoids changing the pointers. This is okay as they are
read only once during the initialization process.
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Scott Fang <ScottFang@viatech.com.cn>
Cc: Joseph Chan <JosephChan@via.com.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At first, init_task's mems_allowed is initialized as this.
init_task->mems_allowed == node_state[N_POSSIBLE]
And cpuset's top_cpuset mask is initialized as this
top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]
Before 2.6.29:
policy's mems_allowed is initialized as this.
1. update tasks->mems_allowed by its cpuset->mems_allowed.
2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)
Updating task's mems_allowed in reference to top_cpuset's one.
cpuset's mems_allowed is aware of N_HIGH_MEMORY, always.
In 2.6.30: After commit 58568d2a82
("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed
is initialized as this.
1. policy->mems_allowd = nodes_and(task->mems_allowed, user's mask)
Here, if task is in top_cpuset, task->mems_allowed is not updated from
init's one. Assume user excutes command as #numactrl --interleave=all
,....
policy->mems_allowd = nodes_and(N_POSSIBLE, ALL_SET_MASK)
Then, policy's mems_allowd can includes a possible node, which has no pgdat.
MPOL's INTERLEAVE just scans nodemask of task->mems_allowd and access this
directly.
NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL
Then, what's we need is making policy->mems_allowed be aware of
N_HIGH_MEMORY. This patch does that. But to do so, extra nodemask will
be on statck. Because I know cpumask has a new interface of
CPUMASK_ALLOC(), I added it to node.
This patch stands on old behavior. But I feel this fix itself is just a
Band-Aid. But to do fundametal fix, we have to take care of memory
hotplug and it takes time. (task->mems_allowd should be N_HIGH_MEMORY, I
think.)
mpol_set_nodemask() should be aware of N_HIGH_MEMORY and policy's nodemask
should be includes only online nodes.
In old behavior, this is guaranteed by frequent reference to cpuset's
code. Now, most of them are removed and mempolicy has to check it by
itself.
To do check, a few nodemask_t will be used for calculating nodemask. But,
size of nodemask_t can be big and it's not good to allocate them on stack.
Now, cpumask_t has CPUMASK_ALLOC/FREE an easy code for get scratch area.
NODEMASK_ALLOC/FREE shoudl be there.
[akpm@linux-foundation.org: cleanups & tweaks]
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the rotate_ud() function not to crash in case of a font which has not
a width of multiple by 8: The inner loop of the font pixel copy should not
access a bit outside the font memory area. Subtract the shift offset from
the font width will prevent this.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG
When hot-unpluging a cpu, it will leak memory allocated at cpu hotplug,
but only if CPUMASK_OFFSTACK=y, which is default to n.
The bug was introduced by 8969a5ede0
("generic-ipi: remove kmalloc()").
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was found using a semantic patch, more info can be found at:
http://www.emn.fr/x-info/coccinelle/
Signed-off-by: Stoyan Gaydarov <sgayda2@uiuc.edu>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When freeing an inode that lost race getting added to the inode cache we
must not call into ->destroy_inode, because that would delete the inode
that won the race from the inode cache radix tree.
This patch uses splits a new xfs_inode_free helper out of xfs_ireclaim
and uses that plus __destroy_inode to make sure we really only free
the memory allocted for the inode that lost the race, and not mess with
the inode cache state.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reported-by: Alex Samad <alex@samad.com.au>
Reported-by: Andrew Randrianasulu <randrik@mail.ru>
Reported-by: Stephane <sharnois@max-t.com>
Reported-by: Tommy <tommy@news-service.com>
Reported-by: Miah Gregory <mace@darksilence.net>
Reported-by: Gabriel Barazer <gabriel@oxeva.fr>
Reported-by: Leandro Lucarella <llucax@gmail.com>
Reported-by: Daniel Burr <dburr@fami.com.au>
Reported-by: Nickolay <newmail@spaces.ru>
Reported-by: Michael Guntsche <mike@it-loops.com>
Reported-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com>
Reported-by: Michael Ole Olsen <gnu@gmx.net>
Reported-by: Michael Weissenbacher <mw@dermichi.com>
Reported-by: Martin Spott <Martin.Spott@mgras.net>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Michael Guntsche <mike@it-loops.com>
Tested-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
When we want to tear down an inode that lost the add to the cache race
in XFS we must not call into ->destroy_inode because that would delete
the inode that won the race from the inode cache radix tree.
This patch provides the __destroy_inode helper needed to fix this,
the actual fix will be in th next patch. As XFS was the only reason
destroy_inode was exported we shift the export to the new __destroy_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Currently inode_init_always calls into ->destroy_inode if the additional
initialization fails. That's not only counter-intuitive because
inode_init_always did not allocate the inode structure, but in case of
XFS it's actively harmful as ->destroy_inode might delete the inode from
a radix-tree that has never been added. This in turn might end up
deleting the inode for the same inum that has been instanciated by
another process and cause lots of cause subtile problems.
Also in the case of re-initializing a reclaimable inode in XFS it would
free an inode we still want to keep alive.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
By the pci slot changes, callbacks of attributes under slot directory
(/sys/bus/pci/slots) had been changed to get the pointer to struct
pci_slot instead of struct hotplug_slot. So the path_show() that
assumes the parameter is a pointer to struct hotplug_slot seems
broken.
Tested-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The commit bd3d99c170 ("PCI: Remove
untested Electromechanical Interlock (EMI) support in pciehp."), which
removes the definition of "struct hotplug_slot_attr", broke SGI
hotplug driver. By this commit, we get the following compile error.
drivers/pci/hotplug/sgi_hotplug.c:106: error: variable 'sn_slot_path_attr' has initializer but incomplete type
drivers/pci/hotplug/sgi_hotplug.c:106: error: unknown field 'attr' specified in initializer
drivers/pci/hotplug/sgi_hotplug.c:106: error: extra brace group at end of initializer
drivers/pci/hotplug/sgi_hotplug.c:106: error: (near initialization for 'sn_slot_path_attr')
drivers/pci/hotplug/sgi_hotplug.c:106: warning: excess elements in struct initializer
drivers/pci/hotplug/sgi_hotplug.c:106: warning: (near initialization for 'sn_slot_path_attr')
drivers/pci/hotplug/sgi_hotplug.c:106: error: unknown field 'show' specified in initializer
drivers/pci/hotplug/sgi_hotplug.c:106: warning: excess elements in struct initializer
drivers/pci/hotplug/sgi_hotplug.c:106: warning: (near initialization for 'sn_slot_path_attr')
drivers/pci/hotplug/sgi_hotplug.c: In function 'sn_hp_destroy':
drivers/pci/hotplug/sgi_hotplug.c:203: error: invalid use of undefined type 'struct hotplug_slot_attribute'
drivers/pci/hotplug/sgi_hotplug.c: In function 'sn_hotplug_slot_register':
drivers/pci/hotplug/sgi_hotplug.c:655: error: invalid use of undefined type 'struct hotplug_slot_attribute'
This patch fixes this regression by adding the definition of struct
hotplug_slot_attr into sgi_hotplug.c.
Tested-by: Mike Habeck <habeck@sgi.com>
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
I noticed oprofile memleaked in linux-2.6 current tree,
and tracked this ring-buffer leak.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
LKML-Reference: <4A7C06B9.2090302@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
we should align the GTT after VRAM no matter what, as we can
come back from resume and put in a different place and bad things happen.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Check whether index is within bounds before testing the element.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Currently, the machine_flags are stored late in the startup
initialization which results in failing machine type checks
(e.g. for MACHINE_IS_VM).
To allow these checks, store the machine flags in the lowcore
when the machine type has been detected.
Moving the machine_flags to the lowcore has been introduced with
git commit 25097bf153
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Roland Dreier found that a section that contained only a weak
function in one of the staging drivers and this caused
recordmcount.pl to spit out a warning and fail.
Although it is strange that a driver would have a weak function, and
this function only be used in one place, it should not be something
to make recordmcount.pl fail.
This patch fixes the issue in a simple manner: if only weak
functions exist in a section, then that section will not be
recorded.
Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Brice Goglin reported this crash with per task precise stats:
> I finally managed to test the threaded perfcounter statistics (thanks a
> lot for implementing it). I am running 2.6.31-rc5 (with the AMD
> magny-cours patches but I don't think they matter here). I am trying to
> measure local/remote memory accesses per thread during the well-known
> stream benchmark. It's compiled with OpenMP using 16 threads on a
> quad-socket quad-core barcelona machine.
>
> Command line is:
> /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s
> -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream
>
> It seems to work fine with a single -e <counter> on the command line
> while it crashes when there are at least 2 of them.
> It seems to work fine without -s as well.
A silly copy-paste resulted in a messed up iteration which would
cause the OOPS.
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
LKML-Reference: <1249574786.32113.550.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adds autodetection for libelf as well, and simplifies the
libbfd code. Furthermore, fail make with an error when libelf
is not found and warn about the lack of libbfd.
Also provide an option to build a 32bit version even though you
might be running a 64bit kernel.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some cases distros have binaries and debuginfo in weird places:
[root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
-rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
-rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
[root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/firefox-3.5.2/firefox
[root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
xulrunner-1.9.1.2-1.fc11.x86_64
firefox-3.5.2-2.fc11.x86_64
[root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
-rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug
Seemingly we don't have a .symtab when we actually can find it
if we use the .note.gnu.build-id ELF section put in place by
some distros. Use it and find the symbols we need.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When calling rb_buffer_peek() from ring_buffer_consume() and a
padding event is returned, the function rb_advance_reader() is
called twice. This may lead to missing samples or under high
workloads to the warning below. This patch fixes this. If a padding
event is returned by rb_buffer_peek() it will be consumed by the
calling function now.
Also, I simplified some code in ring_buffer_consume().
------------[ cut here ]------------
WARNING: at /dev/shm/.source/linux/kernel/trace/ring_buffer.c:2289 rb_advance_reader+0x2e/0xc5()
Hardware name: Anaheim
Modules linked in:
Pid: 29, comm: events/2 Tainted: G W 2.6.31-rc3-oprofile-x86_64-standard-00059-g5050dc2 #1
Call Trace:
[<ffffffff8106776f>] ? rb_advance_reader+0x2e/0xc5
[<ffffffff81039ffe>] warn_slowpath_common+0x77/0x8f
[<ffffffff8103a025>] warn_slowpath_null+0xf/0x11
[<ffffffff8106776f>] rb_advance_reader+0x2e/0xc5
[<ffffffff81068bda>] ring_buffer_consume+0xa0/0xd2
[<ffffffff81326933>] op_cpu_buffer_read_entry+0x21/0x9e
[<ffffffff810be3af>] ? __find_get_block+0x4b/0x165
[<ffffffff8132749b>] sync_buffer+0xa5/0x401
[<ffffffff810be3af>] ? __find_get_block+0x4b/0x165
[<ffffffff81326c1b>] ? wq_sync_buffer+0x0/0x78
[<ffffffff81326c76>] wq_sync_buffer+0x5b/0x78
[<ffffffff8104aa30>] worker_thread+0x113/0x1ac
[<ffffffff8104dd95>] ? autoremove_wake_function+0x0/0x38
[<ffffffff8104a91d>] ? worker_thread+0x0/0x1ac
[<ffffffff8104dc9a>] kthread+0x88/0x92
[<ffffffff8100bdba>] child_rip+0xa/0x20
[<ffffffff8104dc12>] ? kthread+0x0/0x92
[<ffffffff8100bdb0>] ? child_rip+0x0/0x20
---[ end trace f561c0a58fcc89bd ]---
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@kernel.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If the current CPU doesn't support performance counters,
cur_cpu_spec->oprofile_cpu_type can be NULL. The current
perf_counter modules don't test for that case and would thus
crash at boot time.
Bug reported by David Woodhouse.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19066.48028.446975.501454@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Two defects work together result in KVM device passthrough randomly can't
work:
1. iommu_snooping is not initialized to zero when vm_iommu_init() called.
So it is possible to get a random value.
2. One line added by commit 2c2e2c38("IOMMU Identity Mapping Support")
change the code path, let it bypass domain_update_iommu_cap(), as well as
missing the increment of domain iommu reference count.
The latter is also likely to cause a leak of domains on repeated VMM
assignment and deassignment.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Otherwise the host can spend too long traversing an rmap chain, which
happens under a spinlock.
Cc: stable@kernel.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>