Use mddev->new_layout in setup_conf.
Also use new_chunk, and don't set ->degraded in takeover(). That
gets set in run()
Signed-off-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Most array level changes leave the list of devices largely unchanged,
possibly causing one at the end to become redundant.
However conversions between RAID0 and RAID10 need to renumber
all devices (except 0).
This renumbering is currently being done in the ->run method when the
new personality takes over. However this is too late as the common
code in md.c might already have invalidated some of the devices if
they had a ->raid_disk number that appeared to high.
Moving it into the ->takeover method is too early as the array is
still active at that time and wrong ->raid_disk numbers could cause
confusion.
So add a ->new_raid_disk field to mdk_rdev_s and use it to communicate
the new raid_disk number.
Now the common code knows exactly which devices need to be renumbered,
and which can be invalidated, and can do it all at a convenient time
when the array is suspend.
It can also update some symlinks in sysfs which previously were not be
updated correctly.
Reported-by: Maciej Trela <maciej.trela@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Such NULL pointer dereference can occur when the driver was fixing the
read errors/bad blocks and the disk was physically removed
causing a system crash. This patch check if the
rcu_dereference() returns valid rdev before accessing it in fix_read_error().
Cc: stable@kernel.org
Signed-off-by: Prasanna S. Panchamukhi <prasanna.panchamukhi@riverbed.com>
Signed-off-by: Rob Becker <rbecker@riverbed.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Commit b821eaa572 broke partition
detection for md arrays.
The logic was almost right. However if revalidate_disk is called
when the device is not yet open, bdev->bd_disk won't be set, so the
flush_disk() Call will not set bd_invalidated.
So when md_open is called we still need to ensure that
->bd_invalidated gets set. This is easily done with a call to
check_disk_size_change in the place where the offending commit removed
check_disk_change. At the important times, the size will have changed
from 0 to non-zero, so check_disk_size_change will set bd_invalidated.
Tested-by: Duncan <1i5t5.duncan@cox.net>
Reported-by: Duncan <1i5t5.duncan@cox.net>
Signed-off-by: NeilBrown <neilb@suse.de>
As reported by Sergei, a couple of braces were missing after
the WARN removal patch.
[07/22] OMAP: hwmod: Replace WARN by pr_warning if clock lookup failed
https://patchwork.kernel.org/patch/100756/
Signed-off-by: Benoit Cousson <b-cousson@ti.com>
[paul@pwsan.com: fixed patch description per Anand's E-mail]
Signed-off-by: Paul Walmsley <paul@pwsan.com>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Cc: Anand Gadiyar <gadiyar@ti.com>
Because cgroup_fork() is ran before sched_fork() [ from copy_process() ]
and the child's pid is not yet visible the child is pinned to its
cgroup. Therefore we can silence this warning.
A nicer solution would be moving cgroup_fork() to right after
dup_task_struct() and exclude PF_STARTING from task_subsys_state().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.
However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to
stop working:
$ ethtool -r eth0
$ ethtool -A eth0 autoneg off
Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Tested-by: Brandon Philips <bphilips@suse.de>
Cc: stable@kernel.org
Tested-by: Mike McCormack <mikem@ring3k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We're moving the mailing list to linux-cifs@vger.kernel.org.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Disable statistics initialization for eth clients that do not support
statistics. This prevents memory corruption on bnx2x hw.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
netxen_nic and qlcnic driver depends on firmware_class module.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit aa2ea0586d (tcp: fix outsegs stat for TSO segments) incorrectly
assumed SNMP_ADD_STATS() was used from BH context.
Fix this using mib[!in_softirq()] instead of mib[0]
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to rcu_dereference_raw() given that many callers may have many
different locking models.
Located-by: Miles Lane <miles.lane@gmail.com>
Tested-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.
Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
virtio-pci resets the device at startup by writing to the status
register, but this does not clear the pci config space,
specifically msi enable status which affects register
layout.
This breaks things like kdump when they try to use e.g. virtio-blk.
Fix by forcing msi off at startup. Since pci.c already has
a routine to do this, we export and use it instead of duplicating code.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
add_buf returns ring size on out of memory,
this is not what devices expect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # .34.x
Cintiq 21UX2 added 8 more bits for the tool serial number and more
buttons for the expresskey. We did not enable them properly in the
last patch.
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Some of the recent X86_MRST additions make some "select"s
conditional on X86_MRST but missed some related kconfig symbols,
causing:
drivers/built-in.o: In function `ps2_end_command':
(.text+0x257ab2): undefined reference to `i8042_check_port_owner'
drivers/built-in.o: In function `ps2_end_command':
(.text+0x257ae1): undefined reference to `i8042_unlock_chip'
drivers/built-in.o: In function `ps2_begin_command':
(.text+0x257b40): undefined reference to `i8042_check_port_owner'
drivers/built-in.o: In function `ps2_begin_command':
(.text+0x257b6f): undefined reference to `i8042_lock_chip'
when SERIO_I8042=m, SERIO_LIBPS2=y, KEYBOARD_ATKBD=y.
We need to make i8042 dependant upon !X86_MRST and allow deselecting
atkbd on Moorestown even when !CONFIG_EMBEDDED.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Apparently, we have never been able to set the atime correctly from the
NFSv4 client.
Reported-by: 小倉一夫 <ka-ogura@bd6.so-net.ne.jp>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Currently, we do not display the minor version mount parameter in the
/proc mount info.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Put the code that is common to both the referral and ordinary mount cases
into a common helper routine.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the attempt to read the calldir fails, then instead of storing the read
bytes, we currently discard them. This leads to a garbage final result when
upon re-entry to the same routine, we read the remaining bytes.
Fixes the regression in bugzilla number 16213. Please see
https://bugzilla.kernel.org/show_bug.cgi?id=16213
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
S_ISDIR(fsinfo.fattr->mode) checks the file type rather than the mode bits,
so we should be checking for the NFS_ATTR_FATTR_TYPE fattr property.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
This patch removes the setting of the low_latency flag.
tty_flip_buffer_push() is occasionally being called in irq context, which
causes a hang if the low_latency flag is set.
Removing the low_latency flag only seems to impact the flush to ldisc,
which will now be put on a workqueue.
Signed-off-by: Filip Aben <f.aben@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The previous CMT fixup accidentally copied in the TMU shift value, reset
this back to its original value while preserving the TMU fix.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
It has been reported that the new UFO software fallback path
fails under certain conditions with NFS. I tracked the problem
down to the generation of UFO packets that are smaller than the
MTU. The software fallback path simply discards these packets.
This patch fixes the problem by not generating such packets on
the UFO path.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix commit 4cd24eaf0 (net: use netdev_mc_count and netdev_mc_empty when
appropriate)
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
$ make CONFIG_DEBUG_SECTION_MISMATCH=y
[...]
WARNING: drivers/net/built-in.o(.data+0x0): Section mismatch in reference from the variable mipsnet_driver to the function .init.text:mipsnet_probe()
The variable mipsnet_driver references
the function __init mipsnet_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
[...]
Fixed by making mipsnet_probe __devinit.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
drivers/net/mipsnet.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
The header file include/linux/tracepoint.h may be included without
include/linux/errno.h and then the compiler will fail on building for
undelcared ENOSYS. This patch fixes this problem via including <linux/errno.h>
to include/linux/tracepoint.h.
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
LKML-Reference: <1277118549-622-1-git-send-email-wuzhangjin@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Stanse found that in snd_usb_parse_audio_endpoints, there is a
dangling pointer dereference. When snd_usb_parse_audio_format fails,
fp is freed, and continue invoked. On the next loop, there is
"fp && fp->altsetting == 1 && fp->channels == 1" test, but fp is set
from the last iteration (but is bogus) and thus ilegally dereferenced.
Set fp to NULL before "continue".
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Daniel Mack <daniel@caiaq.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
e98ef89b has a typo, causing cfq_blkiocg_update_completion_stats()
to call itself instead of blkiocg_update_completion_stats().
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The function begins and ends with a read_lock. The latter is changed to a
read_unlock.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@locked@
expression E1;
position p;
@@
read_lock(E1@p,...);
@r exists@
expression x <= locked.E1;
expression locked.E1;
expression E2;
identifier lock;
position locked.p,p1,p2;
@@
*lock@p1 (E1@p,...);
... when != E1
when != \(x = E2\|&x\)
*lock@p2 (E1,...);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Commit a2e066bba2 introduced core
swapping for CPU models 64 and later. I recently had a report about
a Sempron 3200+, model 95, for which this patch broke temperature
reading. It happens that this is a single-core processor, so the
effect of the swapping was to read a temperature value for a core
that didn't exist, leading to an incorrect value (-49 degrees C.)
Disabling core swapping on singe-core processors should fix this.
Additional comment from Andreas:
The BKDG says
Thermal Sensor Core Select (ThermSenseCoreSel)-Bit 2. This bit
selects the CPU whose temperature is reported in the CurTemp
field. This bit only applies to dual core processors. For
single core processors CPU0 Thermal Sensor is always selected.
k8temp_probe() correctly detected that SEL_CORE can't be used on single
core CPU. Thus k8temp did never update the temperature values stored
in temp[1][x] and -49 degrees was reported. For single core CPUs we
must use the values read into temp[0][x].
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Rick Moritz <rhavin@gmx.net>
Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: stable@kernel.org
i5k_amb.ko uses dynamically allocated memory (by kmalloc) for
attributes passed to sysfs. So, sysfs_attr_init() should be called
for working happy with lockdep.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org [2.6.34 only]
When detecting AM2+ or AM3 socket with DDR2, only blacklist cores
which are known to exist in AM2+ format.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: stable@kernel.org
This fixes the -Os breaks with gcc 4.5 bug. rdtsc_barrier needs to be
force inlined, otherwise user space will jump into kernel space and
kill init.
This also addresses http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44129
I believe.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <20100618210859.GA10913@basil.fritz.box>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
Commit c7f486567c
(PCI PM: PCIe PME root port service driver) causes the native PCIe
PME signaling to be used by default, if the BIOS allows the kernel to
control the standard configuration registers of PCIe root ports.
However, the native PCIe PME is coupled to the native PCIe hotplug
and calling pcie_pme_acpi_setup() makes some BIOSes expect that
the native PCIe hotplug will be used as well. That, in turn, causes
problems to appear on systems where the PCIe hotplug driver is not
loaded. The usual symptom, as reported by Jaroslav Kameník and
others, is that the ACPI GPE associated with PCIe hotplug keeps
firing continuously causing kacpid to take substantial percentage
of CPU time.
To work around this issue, change the default so that the native
PCIe PME signaling is only used if directly requested with the help
of the pcie_pme= command line switch.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=15924 , which is
a listed regression from 2.6.33.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Jaroslav Kameník <jaroslav@kamenik.cz>
Tested-by: Antoni Grzymala <antekgrzymala@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
per_cpu_ptr_to_phys() determines whether the passed in @addr belongs
to the first_chunk or not by just matching the address against the
address range of the base unit (unit0, used by cpu0). When an adress
from another cpu was passed in, it will always determine that the
address doesn't belong to the first chunk even when it does. This
makes the function return a bogus physical address which may lead to
crash.
This problem was discovered by Cliff Wickman while investigating a
crash during kdump on a SGI UV system.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Cliff Wickman <cpw@sgi.com>
Tested-by: Cliff Wickman <cpw@sgi.com>
Cc: stable@kernel.org
Commit e70971591 ("sched: Optimize unused cgroup configuration") introduced
an imbalanced scheduling bug.
If we do not use CGROUP, function update_h_load won't update h_load. When the
system has a large number of tasks far more than logical CPU number, the
incorrect cfs_rq[cpu]->h_load value will cause load_balance() to pull too
many tasks to the local CPU from the busiest CPU. So the busiest CPU keeps
going in a round robin. That will hurt performance.
The issue was found originally by a scientific calculation workload that
developed by Yanmin. With that commit, the workload performance drops
about 40%.
CPU before after
00 : 2 : 7
01 : 1 : 7
02 : 11 : 6
03 : 12 : 7
04 : 6 : 6
05 : 11 : 7
06 : 10 : 6
07 : 12 : 7
08 : 11 : 6
09 : 12 : 6
10 : 1 : 6
11 : 1 : 6
12 : 6 : 6
13 : 2 : 6
14 : 2 : 6
15 : 1 : 6
Reviewed-by: Yanmin zhang <yanmin.zhang@intel.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1276754893.9452.5442.camel@debian>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It is common in end-node, non STP bridges to set forwarding
delay to zero; which causes the forwarding database cleanup
to run every clock tick. Change to run only as soon as needed
or at next ageing timer interval which ever is sooner.
Use round_jiffies_up macro rather than attempting round up
by changing value.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hi,
A user reported a kernel bug when running a particular program that did
the following:
created 32 threads
- each thread took a mutex, grabbed a global offset, added a buffer size
to that offset, released the lock
- read from the given offset in the file
- created a new thread to do the same
- exited
The result is that cfq's close cooperator logic would trigger, as the
threads were issuing I/O within the mean seek distance of one another.
This workload managed to routinely trigger a use after free bug when
walking the list of merge candidates for a particular cfqq
(cfqq->new_cfqq). The logic used for merging queues looks like this:
static void cfq_setup_merge(struct cfq_queue *cfqq, struct cfq_queue *new_cfqq)
{
int process_refs, new_process_refs;
struct cfq_queue *__cfqq;
/* Avoid a circular list and skip interim queue merges */
while ((__cfqq = new_cfqq->new_cfqq)) {
if (__cfqq == cfqq)
return;
new_cfqq = __cfqq;
}
process_refs = cfqq_process_refs(cfqq);
/*
* If the process for the cfqq has gone away, there is no
* sense in merging the queues.
*/
if (process_refs == 0)
return;
/*
* Merge in the direction of the lesser amount of work.
*/
new_process_refs = cfqq_process_refs(new_cfqq);
if (new_process_refs >= process_refs) {
cfqq->new_cfqq = new_cfqq;
atomic_add(process_refs, &new_cfqq->ref);
} else {
new_cfqq->new_cfqq = cfqq;
atomic_add(new_process_refs, &cfqq->ref);
}
}
When a merge candidate is found, we add the process references for the
queue with less references to the queue with more. The actual merging
of queues happens when a new request is issued for a given cfqq. In the
case of the test program, it only does a single pread call to read in
1MB, so the actual merge never happens.
Normally, this is fine, as when the queue exits, we simply drop the
references we took on the other cfqqs in the merge chain:
/*
* If this queue was scheduled to merge with another queue, be
* sure to drop the reference taken on that queue (and others in
* the merge chain). See cfq_setup_merge and cfq_merge_cfqqs.
*/
__cfqq = cfqq->new_cfqq;
while (__cfqq) {
if (__cfqq == cfqq) {
WARN(1, "cfqq->new_cfqq loop detected\n");
break;
}
next = __cfqq->new_cfqq;
cfq_put_queue(__cfqq);
__cfqq = next;
}
However, there is a hole in this logic. Consider the following (and
keep in mind that each I/O keeps a reference to the cfqq):
q1->new_cfqq = q2 // q2 now has 2 process references
q3->new_cfqq = q2 // q2 now has 3 process references
// the process associated with q2 exits
// q2 now has 2 process references
// queue 1 exits, drops its reference on q2
// q2 now has 1 process reference
// q3 exits, so has 0 process references, and hence drops its references
// to q2, which leaves q2 also with 0 process references
q4 comes along and wants to merge with q3
q3->new_cfqq still points at q2! We follow that link and end up at an
already freed cfqq.
So, the fix is to not follow a merge chain if the top-most queue does
not have a process reference, otherwise any queue in the chain could be
already freed. I also changed the logic to disallow merging with a
queue that does not have any process references. Previously, we did
this check for one of the merge candidates, but not the other. That
doesn't really make sense.
Without the attached patch, my system would BUG within a couple of
seconds of running the reproducer program. With the patch applied, my
system ran the program for over an hour without issues.
This addresses the following bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=16217
Thanks a ton to Phil Carns for providing the bug report and an excellent
reproducer.
[ Note for stable: this applies to 2.6.32/33/34 ].
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reported-by: Phil Carns <carns@mcs.anl.gov>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
serial console regression, unresponsiveness, and indeed it does. The
reason is that the nohz code is skipped even when the tick was already
stopped before the nohz_ratelimit(cpu) condition changed.
Move the nohz_ratelimit() check to the other conditions which prevent
long idle sleeps.
Reported-by: Chris Wedgwood <cw@f00f.org>
Tested-by: Brian Bloniarz <bmb@athenacr.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg KH <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Jef Driesen <jefdriesen@telenet.be>
LKML-Reference: <1276790557.27822.516.camel@twins>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
At exit, perf record will kill the process it was profiling by sending a
SIGTERM to child_pid (if it had been initialised), but in certain situations
child_pid may be 0 and perf would mistakenly kill more processes than intended.
child_pid is set to the return of fork() to either 0 or the pid of the child.
Ordinarily this would not present an issue as the child calls execvp to spawn
the process to be profiled and would therefore never run it's sig_atexit and
never attempt to kill pid 0.
However, if a nonexistant binary had been passed in to perf record the call to
execvp would fail and child_pid would be left set to 0. The child would then
exit and it's atexit handler, finding that child_pid was initialised to 0,
would call kill(0, SIGTERM), resulting in every process within it's process
group being killed.
In the case that perf was being run directly from the shell this typically
would not be an issue as the shell isolates the process. However, if perf was
being called from another program it could kill unexpected processes, which may
even include X.
This patch changes the logic of the test for whether child_pid was initialised
to only consider positive pids as valid, thereby never attempting to kill pid
0.
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1276072680-17378-1-git-send-email-imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>