Commit Graph

56 Commits

Author SHA1 Message Date
Daniel Thompson d07ce4e32a kdb: Avoid array subscript warnings on non-SMP builds
Recent versions of gcc (reported on gcc-7.4) issue array subscript
warnings for builds where SMP is not enabled.

kernel/debug/debug_core.c: In function 'kdb_dump_stack_on_cpu':
kernel/debug/debug_core.c:452:17: warning: array subscript is outside array
+bounds [-Warray-bounds]
     if (!(kgdb_info[cpu].exception_state & DCPU_IS_SLAVE)) {
           ~~~~~~~~~^~~~~
   kernel/debug/debug_core.c:469:33: warning: array subscript is outside array
+bounds [-Warray-bounds]
     kgdb_info[cpu].exception_state |= DCPU_WANT_BT;
   kernel/debug/debug_core.c:470:18: warning: array subscript is outside array
+bounds [-Warray-bounds]
     while (kgdb_info[cpu].exception_state & DCPU_WANT_BT)

There is no bug here but there is scope to improve the code
generation for non-SMP systems (whilst also silencing the warning).

Reported-by: kbuild test robot <lkp@intel.com>
Fixes: 2277b49258 ("kdb: Fix stack crawling on 'running' CPUs that aren't the master")
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20191021101057.23861-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
2019-10-24 15:34:58 +01:00
Douglas Anderson 2277b49258 kdb: Fix stack crawling on 'running' CPUs that aren't the master
In kdb when you do 'btc' (back trace on CPU) it doesn't necessarily
give you the right info.  Specifically on many architectures
(including arm64, where I tested) you can't dump the stack of a
"running" process that isn't the process running on the current CPU.
This can be seen by this:

 echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
 # wait 2 seconds
 <sysrq>g

Here's what I see now on rk3399-gru-kevin.  I see the stack crawl for
the CPU that handled the sysrq but everything else just shows me stuck
in __switch_to() which is bogus:

======

[0]kdb> btc
btc: cpu status: Currently on cpu 0
Available cpus: 0, 1-3(I), 4, 5(I)
Stack traceback for pid 0
0xffffff801101a9c0        0        0  1    0   R  0xffffff801101b3b0 *swapper/0
Call trace:
 dump_backtrace+0x0/0x138
 ...
 kgdb_compiled_brk_fn+0x34/0x44
 ...
 sysrq_handle_dbg+0x34/0x5c
Stack traceback for pid 0
0xffffffc0f175a040        0        0  1    1   I  0xffffffc0f175aa30  swapper/1
Call trace:
 __switch_to+0x1e4/0x240
 0xffffffc0f65616c0
Stack traceback for pid 0
0xffffffc0f175d040        0        0  1    2   I  0xffffffc0f175da30  swapper/2
Call trace:
 __switch_to+0x1e4/0x240
 0xffffffc0f65806c0
Stack traceback for pid 0
0xffffffc0f175b040        0        0  1    3   I  0xffffffc0f175ba30  swapper/3
Call trace:
 __switch_to+0x1e4/0x240
 0xffffffc0f659f6c0
Stack traceback for pid 1474
0xffffffc0dde8b040     1474      727  1    4   R  0xffffffc0dde8ba30  bash
Call trace:
 __switch_to+0x1e4/0x240
 __schedule+0x464/0x618
 0xffffffc0dde8b040
Stack traceback for pid 0
0xffffffc0f17b0040        0        0  1    5   I  0xffffffc0f17b0a30  swapper/5
Call trace:
 __switch_to+0x1e4/0x240
 0xffffffc0f65dd6c0

===

The problem is that 'btc' eventually boils down to
  show_stack(task_struct, NULL);

...and show_stack() doesn't work for "running" CPUs because their
registers haven't been stashed.

On x86 things might work better (I haven't tested) because kdb has a
special case for x86 in kdb_show_stack() where it passes the stack
pointer to show_stack().  This wouldn't work on arm64 where the stack
crawling function seems needs the "fp" and "pc", not the "sp" which is
presumably why arm64's show_stack() function totally ignores the "sp"
parameter.

NOTE: we _can_ get a good stack dump for all the cpus if we manually
switch each one to the kdb master and do a back trace.  AKA:
  cpu 4
  bt
...will give the expected trace.  That's because now arm64's
dump_backtrace will now see that "tsk == current" and go through a
different path.

In this patch I fix the problems by catching a request to stack crawl
a task that's running on a CPU and then I ask that CPU to do the stack
crawl.

NOTE: this will (presumably) change what stack crawls are printed for
x86 machines.  Now kdb functions will show up in the stack crawl.
Presumably this is OK but if it's not we can go back and add a special
case for x86 again.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2019-10-10 16:28:48 +01:00
Douglas Anderson 7d92bda271 kgdb: don't use a notifier to enter kgdb at panic; call directly
Right now kgdb/kdb hooks up to debug panics by registering for the panic
notifier.  This works OK except that it means that kgdb/kdb gets called
_after_ the CPUs in the system are taken offline.  That means that if
anything important was happening on those CPUs (like something that might
have contributed to the panic) you can't debug them.

Specifically I ran into a case where I got a panic because a task was
"blocked for more than 120 seconds" which was detected on CPU 2.  I nicely
got shown stack traces in the kernel log for all CPUs including CPU 0,
which was running 'PID: 111 Comm: kworker/0:1H' and was in the middle of
__mmc_switch().

I then ended up at the kdb prompt where switched over to kgdb to try to
look at local variables of the process on CPU 0.  I found that I couldn't.
Digging more, I found that I had no info on any tasks running on CPUs
other than CPU 2 and that asking kdb for help showed me "Error: no saved
data for this cpu".  This was because all the CPUs were offline.

Let's move the entry of kdb/kgdb to a direct call from panic() and stop
using the generic notifier.  Putting a direct call in allows us to order
things more properly and it also doesn't seem like we're breaking any
abstractions by calling into the debugger from the panic function.

Daniel said:

: This patch changes the way kdump and kgdb interact with each other.
: However it would seem rather odd to have both tools simultaneously armed
: and, even if they were, the user still has the option to use panic_timeout
: to force a kdump to happen.  Thus I think the change of order is
: acceptable.

Link: http://lkml.kernel.org/r/20190703170354.217312-1-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Feng Tang <feng.tang@intel.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-09-25 17:51:40 -07:00
Nadav Amit d8a050f5a3 kgdb: fix comment regarding static function
The comment that says that module_event() is not static is clearly
wrong.

Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2019-09-03 11:19:41 +01:00
Douglas Anderson 162bc7f5af kdb: Don't back trace on a cpu that didn't round up
If you have a CPU that fails to round up and then run 'btc' you'll end
up crashing in kdb becaue we dereferenced NULL.  Let's add a check.
It's wise to also set the task to NULL when leaving the debugger so
that if we fail to round up on a later entry into the debugger we
won't backtrace a stale task.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30 08:31:23 +00:00
Douglas Anderson 87b0959285 kgdb: Don't round up a CPU that failed rounding up before
If we're using the default implementation of kgdb_roundup_cpus() that
uses smp_call_function_single_async() we can end up hanging
kgdb_roundup_cpus() if we try to round up a CPU that failed to round
up before.

Specifically smp_call_function_single_async() will try to wait on the
csd lock for the CPU that we're trying to round up.  If the previous
round up never finished then that lock could still be held and we'll
just sit there hanging.

There's not a lot of use trying to round up a CPU that failed to round
up before.  Let's keep a flag that indicates whether the CPU started
but didn't finish to round up before.  If we see that flag set then
we'll skip the next round up.

In general we have a few goals here:
- We never want to end up calling smp_call_function_single_async()
  when the csd is still locked.  This is accomplished because
  flush_smp_call_function_queue() unlocks the csd _before_ invoking
  the callback.  That means that when kgdb_nmicallback() runs we know
  for sure the the csd is no longer locked.  Thus when we set
  "rounding_up = false" we know for sure that the csd is unlocked.
- If there are no timeouts rounding up we should never skip a round
  up.

NOTE #1: In general trying to continue running after failing to round
up CPUs doesn't appear to be supported in the debugger.  When I
simulate this I find that kdb reports "Catastrophic error detected"
when I try to continue.  I can overrule and continue anyway, but it
should be noted that we may be entering the land of dragons here.
Possibly the "Catastrophic error detected" was added _because_ of the
future failure to round up, but even so this is an area of the code
that hasn't been strongly tested.

NOTE #2: I did a bit of testing before and after this change.  I
introduced a 10 second hang in the kernel while holding a spinlock
that I could invoke on a certain CPU with 'taskset -c 3 cat /sys/...".

Before this change if I did:
- Invoke hang
- Enter debugger
- g (which warns about Catastrophic error, g again to go anyway)
- g
- Enter debugger

...I'd hang the rest of the 10 seconds without getting a debugger
prompt.  After this change I end up in the debugger the 2nd time after
only 1 second with the standard warning about 'Timed out waiting for
secondary CPUs.'

I'll also note that once the CPU finished waiting I could actually
debug it (aka "btc" worked)

I won't promise that everything works perfectly if the errant CPU
comes back at just the wrong time (like as we're entering or exiting
the debugger) but it certainly seems like an improvement.

NOTE #3: setting 'kgdb_info[cpu].rounding_up = false' is in
kgdb_nmicallback() instead of kgdb_call_nmi_hook() because some
implementations override kgdb_call_nmi_hook().  It shouldn't hurt to
have it in kgdb_nmicallback() in any case.

NOTE #4: this logic is really only needed because there is no API call
like "smp_try_call_function_single_async()" or "smp_csd_is_locked()".
If such an API existed then we'd use it instead, but it seemed a bit
much to add an API like this just for kgdb.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30 08:29:13 +00:00
Douglas Anderson 3cd99ac355 kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()
When I had lockdep turned on and dropped into kgdb I got a nice splat
on my system.  Specifically it hit:
  DEBUG_LOCKS_WARN_ON(current->hardirq_context)

Specifically it looked like this:
  sysrq: SysRq : DEBUG
  ------------[ cut here ]------------
  DEBUG_LOCKS_WARN_ON(current->hardirq_context)
  WARNING: CPU: 0 PID: 0 at .../kernel/locking/lockdep.c:2875 lockdep_hardirqs_on+0xf0/0x160
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0 #27
  pstate: 604003c9 (nZCv DAIF +PAN -UAO)
  pc : lockdep_hardirqs_on+0xf0/0x160
  ...
  Call trace:
   lockdep_hardirqs_on+0xf0/0x160
   trace_hardirqs_on+0x188/0x1ac
   kgdb_roundup_cpus+0x14/0x3c
   kgdb_cpu_enter+0x53c/0x5cc
   kgdb_handle_exception+0x180/0x1d4
   kgdb_compiled_brk_fn+0x30/0x3c
   brk_handler+0x134/0x178
   do_debug_exception+0xfc/0x178
   el1_dbg+0x18/0x78
   kgdb_breakpoint+0x34/0x58
   sysrq_handle_dbg+0x54/0x5c
   __handle_sysrq+0x114/0x21c
   handle_sysrq+0x30/0x3c
   qcom_geni_serial_isr+0x2dc/0x30c
  ...
  ...
  irq event stamp: ...45
  hardirqs last  enabled at (...44): [...] __do_softirq+0xd8/0x4e4
  hardirqs last disabled at (...45): [...] el1_irq+0x74/0x130
  softirqs last  enabled at (...42): [...] _local_bh_enable+0x2c/0x34
  softirqs last disabled at (...43): [...] irq_exit+0xa8/0x100
  ---[ end trace adf21f830c46e638 ]---

Looking closely at it, it seems like a really bad idea to be calling
local_irq_enable() in kgdb_roundup_cpus().  If nothing else that seems
like it could violate spinlock semantics and cause a deadlock.

Instead, let's use a private csd alongside
smp_call_function_single_async() to round up the other CPUs.  Using
smp_call_function_single_async() doesn't require interrupts to be
enabled so we can remove the offending bit of code.

In order to avoid duplicating this across all the architectures that
use the default kgdb_roundup_cpus(), we'll add a "weak" implementation
to debug_core.c.

Looking at all the people who previously had copies of this code,
there were a few variants.  I've attempted to keep the variants
working like they used to.  Specifically:
* For arch/arc we passed NULL to kgdb_nmicallback() instead of
  get_irq_regs().
* For arch/mips there was a bit of extra code around
  kgdb_nmicallback()

NOTE: In this patch we will still get into trouble if we try to round
up a CPU that failed to round up before.  We'll try to round it up
again and potentially hang when we try to grab the csd lock.  That's
not new behavior but we'll still try to do better in a future patch.

Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30 08:28:02 +00:00
Douglas Anderson 9ef7fa507d kgdb: Remove irq flags from roundup
The function kgdb_roundup_cpus() was passed a parameter that was
documented as:

> the flags that will be used when restoring the interrupts. There is
> local_irq_save() call before kgdb_roundup_cpus().

Nobody used those flags.  Anyone who wanted to temporarily turn on
interrupts just did local_irq_enable() and local_irq_disable() without
looking at them.  So we can definitely remove the flags.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2018-12-30 08:24:21 +00:00
Ingo Molnar 38b8d208a4 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h>
We are going to move softlockup APIs out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

<linux/nmi.h> already includes <linux/sched.h>.

Include the <linux/nmi.h> header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:30 +01:00
Ingo Molnar 314ff7851f mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from <linux/sched.h> to <linux/mm_types>
The <linux/sched.h> header includes various vmacache related defines,
which are arguably misplaced.

Move them to mm_types.h and minimize the sched.h impact by putting
all task vmacache state into a new 'struct vmacache' structure.

No change in functionality.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02 08:42:25 +01:00
Douglas Anderson 2d13bb6494 kernel/debug/debug_core.c: more properly delay for secondary CPUs
We've got a delay loop waiting for secondary CPUs.  That loop uses
loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures.  It is quite
common to see things like this in the boot log:

  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout.  We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Colin Cross 5516fd7b92 debug: prevent entering debug mode on panic/exception.
On non-developer devices, kgdb prevents the device from rebooting
after a panic.

Incase of panics and exceptions, to allow the device to reboot, prevent
entering debug mode to avoid getting stuck waiting for the user to
interact with debugger.

To avoid entering the debugger on panic/exception without any extra
configuration, panic_timeout is being used which can be set via
/proc/sys/kernel/panic at run time and CONFIG_PANIC_TIMEOUT sets the
default value.

Setting panic_timeout indicates that the user requested machine to
perform unattended reboot after panic. We dont want to get stuck waiting
for the user input incase of panic.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Colin Cross <ccross@android.com>
[Kiran: Added context to commit message.
panic_timeout is used instead of break_on_panic and
break_on_exception to honor CONFIG_PANIC_TIMEOUT
Modified the commit as per community feedback]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19 12:39:03 -06:00
Jason Wessel df0036d117 kdb: Fix off by one error in kdb_cpu()
There was a follow on replacement patch against the prior
"kgdb: Timeout if secondary CPUs ignore the roundup".

See: https://lkml.org/lkml/2015/1/7/442

This patch is the delta vs the patch that was committed upstream:
  * Fix an off-by-one error in kdb_cpu().
  * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
    really want a static limit.
  * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
    (kgdb-next contains a patch which introduced pr_fmt() to this file
    to the tag will now be applied automatically).

Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2015-02-19 12:39:02 -06:00
Fabian Frederick 0f16996cf2 kernel/debug/debug_core.c: Logging clean-up
-Convert printk( to pr_foo()
-Add pr_fmt
-Coalesce formats

Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2014-11-11 09:31:53 -06:00
Daniel Thompson a1465d2f39 kgdb: timeout if secondary CPUs ignore the roundup
Currently if an active CPU fails to respond to a roundup request the CPU
that requested the roundup will become stuck.  This needlessly reduces the
robustness of the debugger.

This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors.  It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2014-11-11 09:31:53 -06:00
Peter Zijlstra 4e857c58ef arch: Mass conversion of smp_mb__*()
Mostly scripted conversion of the smp_mb__* barriers.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Davidlohr Bueso 615d6e8756 mm: per-thread vma caching
This patch is a continuation of efforts trying to optimize find_vma(),
avoiding potentially expensive rbtree walks to locate a vma upon faults.
The original approach (https://lkml.org/lkml/2013/11/1/410), where the
largest vma was also cached, ended up being too specific and random,
thus further comparison with other approaches were needed.  There are
two things to consider when dealing with this, the cache hit rate and
the latency of find_vma().  Improving the hit-rate does not necessarily
translate in finding the vma any faster, as the overhead of any fancy
caching schemes can be too high to consider.

We currently cache the last used vma for the whole address space, which
provides a nice optimization, reducing the total cycles in find_vma() by
up to 250%, for workloads with good locality.  On the other hand, this
simple scheme is pretty much useless for workloads with poor locality.
Analyzing ebizzy runs shows that, no matter how many threads are
running, the mmap_cache hit rate is less than 2%, and in many situations
below 1%.

The proposed approach is to replace this scheme with a small per-thread
cache, maximizing hit rates at a very low maintenance cost.
Invalidations are performed by simply bumping up a 32-bit sequence
number.  The only expensive operation is in the rare case of a seq
number overflow, where all caches that share the same address space are
flushed.  Upon a miss, the proposed replacement policy is based on the
page number that contains the virtual address in question.  Concretely,
the following results are seen on an 80 core, 8 socket x86-64 box:

1) System bootup: Most programs are single threaded, so the per-thread
   scheme does improve ~50% hit rate by just adding a few more slots to
   the cache.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 50.61%   | 19.90            |
| patched        | 73.45%   | 13.58            |
+----------------+----------+------------------+

2) Kernel build: This one is already pretty good with the current
   approach as we're dealing with good locality.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 75.28%   | 11.03            |
| patched        | 88.09%   | 9.31             |
+----------------+----------+------------------+

3) Oracle 11g Data Mining (4k pages): Similar to the kernel build workload.

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 70.66%   | 17.14            |
| patched        | 91.15%   | 12.57            |
+----------------+----------+------------------+

4) Ebizzy: There's a fair amount of variation from run to run, but this
   approach always shows nearly perfect hit rates, while baseline is just
   about non-existent.  The amounts of cycles can fluctuate between
   anywhere from ~60 to ~116 for the baseline scheme, but this approach
   reduces it considerably.  For instance, with 80 threads:

+----------------+----------+------------------+
| caching scheme | hit-rate | cycles (billion) |
+----------------+----------+------------------+
| baseline       | 1.06%    | 91.54            |
| patched        | 99.97%   | 14.18            |
+----------------+----------+------------------+

[akpm@linux-foundation.org: fix nommu build, per Davidlohr]
[akpm@linux-foundation.org: document vmacache_valid() logic]
[akpm@linux-foundation.org: attempt to untangle header files]
[akpm@linux-foundation.org: add vmacache_find() BUG_ON]
[hughd@google.com: add vmacache_valid_mm() (from Oleg)]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: adjust and enhance comments]
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-04-07 16:35:53 -07:00
Vijaya Kumar K d498d4b47f KGDB: make kgdb_breakpoint() as noinline
The function kgdb_breakpoint() sets up break point at
compile time by calling arch_kgdb_breakpoint();
Though this call is surrounded by wmb() barrier,
the compile can still re-order the break point,
because this scheduling barrier is not a code motion
barrier in gcc.

Making kgdb_breakpoint() as noinline solves this problem
of code reording around break point instruction and also
avoids problem of being called as inline function from
other places

More details about discussion on this can be found here
http://comments.gmane.org/gmane.linux.ports.arm.kernel/269732

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2014-02-26 11:16:25 +00:00
Mike Travis fc8b13740b kgdb/kdb: Fix no KDB config problem
Some code added to the debug_core module had KDB dependencies
that it shouldn't have.  Move the KDB dependent REASON back to
the caller to remove the dependency in the debug core code.

Update the call from the UV NMI handler to conform to the new
interface.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20140114162551.318251993@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-25 08:55:09 +01:00
Mike Travis 8daaa5f826 kdb: Add support for external NMI handler to call KGDB/KDB
This patch adds a kgdb_nmicallin() interface that can be used by
external NMI handlers to call the KGDB/KDB handler.  The primary
need for this is for those types of NMI interrupts where all the
CPUs have already received the NMI signal.  Therefore no
send_IPI(NMI) is required, and in fact it will cause a 2nd
unhandled NMI to occur. This generates the "Dazed and Confuzed"
messages.

Since all the CPUs are getting the NMI at roughly the same time,
it's not guaranteed that the first CPU that hits the NMI handler
will manage to enter KGDB and set the dbg_master_lock before the
slaves start entering. The new argument "send_ready" was added
for KGDB to signal the NMI handler to release the slave CPUs for
entry into KGDB.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
Reviewed-by: Hedi Berriche <hedi@sgi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20131002151417.928886849@asylum.americas.sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-10-03 18:47:54 +02:00
zhangwei(Jovi) f345650964 kgdb/sysrq: fix inconstistent help message of sysrq key
Currently help message of /proc/sysrq-trigger highlight its upper-case
characters, like below:

      SysRq : HELP : loglevel(0-9) reBoot Crash terminate-all-tasks(E)
      memory-full-oom-kill(F) kill-all-tasks(I) ...

this would confuse user trigger sysrq by upper-case character, which is
inconsistent with the real lower-case character registed key.

This inconsistent help message will also lead more confused when
26 upper-case letters put into use in future.

This patch fix kgdb sysrq key: "debug(g)"

Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:10 -07:00
Greg Kroah-Hartman 16559ae48c kgdb: remove #include <linux/serial_8250.h> from kgdb.h
There's no reason kgdb.h itself needs to include the 8250 serial port
header file.  So push it down to the _very_ limited number of individual
drivers that need the values in that file, and fix up the places where
people really wanted serial_core.h and platform_device.h.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-02-04 15:35:26 -08:00
Linus Torvalds 6c536a17fa KGDB/KDB fixes and cleanups
Cleanups
    Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c
    Add module event hooks for simplified debugging with gdb
  Fixes
    Fix kdb to stop paging with 'q' on bta and dmesg
    Fix for data that scrolls off the vga console due to line wrapping
      when using the kdb pager
  New
    The debug core registers for kernel module events which allows a
      kernel aware gdb to automatically load symbols and break on entry
      to a kernel module
    Allow kgdboc=kdb to setup kdb on the vga console
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJQeB8KAAoJEIciOldedpOjpbIP/j+LXEkzXKKfi/3m79VQ87DB
 5iUmTS84t84pomHamXX175AC0gA/2mC0FbbcHpqjlhxF4awXcviCNIiTdtSOTbbu
 G102naLHY8i77X+XbHuN2utJeaRLw8rsfMMZGmjJnjfpc4LtsaH0YTkUzbt3qvba
 N6/QvknadzIrmoCJvHipdOdsSmL0YmTS22+koG4es9B5jvOqVH/W7jZs1qRlVw96
 VxG5Psx4LPB+RI+ZwF1WwbGxbtqKGwkVvkcGG1XIW7FQojHmjw+vUERQCjoFueJ5
 NkKfus98j85/+MvSTkWx3L1K46MHMCFbtJs9RWftJ8GtoNNnm7GDxasoIG2bJKyG
 HFD3IGPuKAokE/equF3eGTRHeEM0IUGwT3EnBqdKd73zud27WsHaSqC/1CPR+74v
 ojLQ2ft1QF+pEkGrhRTdQpLyVnvEmxu8q+j9z9n/HlGEVv8kZ6LGxDPjWB+um/Yi
 Cs0XAryYrL5gE5O+Vwna61luughtIYJwR7+DeVxnQYJ43x/0MtN/SoURnwvrCTEo
 9FeoMgZm1nLh6EW29ahIT/hMu4f0sM91Kiwrmc/zEWZgoB++wo1n470qQmUUrOx4
 CPD7zdmDrf6YxDG2QTHjCtVErO4aJ5zN4Dq0+YyodV545SZVn3t4qBDTVvKhq4Y6
 NIhZAxrv5RKABwtLcP9E
 =uf0L
 -----END PGP SIGNATURE-----

Merge tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB fixes and cleanups from Jason Wessel:
 "Cleanups
   - Clean up compile warnings in kgdboc.c and x86/kernel/kgdb.c
   - Add module event hooks for simplified debugging with gdb
 Fixes
   - Fix kdb to stop paging with 'q' on bta and dmesg
   - Fix for data that scrolls off the vga console due to line wrapping
     when using the kdb pager
 New
   - The debug core registers for kernel module events which allows a
     kernel aware gdb to automatically load symbols and break on entry
     to a kernel module
   - Allow kgdboc=kdb to setup kdb on the vga console"

* tag 'for_linus-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  tty/console: fix warnings in drivers/tty/serial/kgdboc.c
  kdb,vt_console: Fix missed data due to pager overruns
  kdb: Fix dmesg/bta scroll to quit with 'q'
  kgdboc: Accept either kbd or kdb to activate the vga + keyboard kdb shell
  kgdb,x86: fix warning about unused variable
  mips,kgdb: fix recursive page fault with CONFIG_KPROBES
  kgdb: Add module event hooks
2012-10-13 11:16:58 +09:00
Jason Wessel f30fed10c4 kgdb: Add module event hooks
Allow gdb to auto load kernel modules when it is attached,
which makes it trivially easy to debug module init functions
or pre-set breakpoints in a kernel module that has not loaded yet.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-10-12 06:37:33 -05:00
Anton Vorontsov 5a14fead07 kernel/debug: Mask KGDB NMI upon entry
The new arch callback should manage NMIs that usually cause KGDB to
enter. That is, not all NMIs should be enabled/disabled, but only
those that issue kgdb_handle_exception().

We must mask it as serial-line interrupt can be used as an NMI, so
if the original KGDB-entry cause was say a breakpoint, then every
input to KDB console will cause KGDB to reenter, which we don't want.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-26 13:42:25 -07:00
Linus Torvalds 6c216ec636 KGDB/KDB regression fixes
3.4: Fix an an Smatch warning that appeared in the 3.4 merge window
    3.0: Fix kgdb test suite with SMP for all archs without HW single stepping
 2.6.36: Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86
 2.6.26: Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA
         Fix kgdb test suite with SMP for all archs with HW single stepping
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPedocAAoJEIciOldedpOjn7EP/397Rh0zmRlG8oQwMEJcK3E5
 gaRyBNpkGoU3ekHXHx/nzgQ/CS9opzW7nBZDu8weWLjRKMT4RyHfuJcWyu525GvQ
 SnoiX2ZUzP315d8llCYwXmaCEYA7lHQi4T2bGMlDSn1J8kS235EQxllgEfhXDdEC
 DxRWgHABG2UR62C62sGKbPaMMDO9TcNcrAQK27LDLTS7pKLmYqBWBdZKgWzBM/Pr
 AF8vakqSgUw3Aq9qrLge+483uT7uhMoUJofxRppWtm1QgnDcTmri9LOagiazDotz
 RQliRGwVxj9hEo5mLEiQtI0N1kIGCAsK0+9aUJEZRXovRBR9kvqaqHT4c5xdhznr
 VKYvqqTcHBkKLIfNXFvQZnn2cXtNVNqve9CZZwdBJaFYEkaR7ZVQqE6f2xq8KAb2
 RmhvzlEUyLU+89YKkH66uSa22VLSazkeH+4b8AJ4JxYDEab3BHoBCe8axcBQrTsj
 7X5NOs7V3Oj+4J3bS1fbUbxq4t0dfpLLyg8e/lELWtT+Kq7nQRzA2XHRZAMTve8M
 T0cTdrwtUbgY9ZMTpywNB2KlPgTvhWOyfYbH6/Kcks7ecSXlkow3edXoiUbw79iE
 hP8vcMWbT2Rv3IbLkSMFZEQGAG9qL1YyGv4NDmLOoljO1c/Bi3WQIR5aI+di6asV
 Z5q5s/bmGa4+OhFFITSd
 =SW2N
 -----END PGP SIGNATURE-----

Merge tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb

Pull KGDB/KDB regression fixes from Jason Wessel:
 - Fix a Smatch warning that appeared in the 3.4 merge window
 - Fix kgdb test suite with SMP for all archs without HW single stepping
 - Fix kgdb sw breakpoints with CONFIG_DEBUG_RODATA=y limitations on x86
 - Fix oops on kgdb test suite with CONFIG_DEBUG_RODATA
 - Fix kgdb test suite with SMP for all archs with HW single stepping

* tag 'for_linus-3.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
  x86,kgdb: Fix DEBUG_RODATA limitation using text_poke()
  kgdb,debug_core: pass the breakpoint struct instead of address and memory
  kgdbts: (2 of 2) fix single step awareness to work correctly with SMP
  kgdbts: (1 of 2) fix single step awareness to work correctly with SMP
  kgdbts: Fix kernel oops with CONFIG_DEBUG_RODATA
  kdb: Fix smatch warning on dbg_io_ops->is_console
2012-04-04 17:26:08 -07:00
Jason Wessel 98b54aa1a2 kgdb,debug_core: pass the breakpoint struct instead of address and memory
There is extra state information that needs to be exposed in the
kgdb_bpt structure for tracking how a breakpoint was installed.  The
debug_core only uses the the probe_kernel_write() to install
breakpoints, but this is not enough for all the archs.  Some arch such
as x86 need to use text_poke() in order to install a breakpoint into a
read only page.

Passing the kgdb_bpt structure to kgdb_arch_set_breakpoint() and
kgdb_arch_remove_breakpoint() allows other archs to set the type
variable which indicates how the breakpoint was installed.

Cc: stable@vger.kernel.org # >= 2.6.36
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-29 17:41:25 -05:00
Linus Torvalds 0195c00244 Disintegrate and delete asm/system.h
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk
 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m
 u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB
 ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m
 rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl
 eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45
 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+
 /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9
 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK
 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR
 FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD
 NypQthI85pc=
 =G9mT
 -----END PGP SIGNATURE-----

Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system

Pull "Disintegrate and delete asm/system.h" from David Howells:
 "Here are a bunch of patches to disintegrate asm/system.h into a set of
  separate bits to relieve the problem of circular inclusion
  dependencies.

  I've built all the working defconfigs from all the arches that I can
  and made sure that they don't break.

  The reason for these patches is that I recently encountered a circular
  dependency problem that came about when I produced some patches to
  optimise get_order() by rewriting it to use ilog2().

  This uses bitops - and on the SH arch asm/bitops.h drags in
  asm-generic/get_order.h by a circuituous route involving asm/system.h.

  The main difficulty seems to be asm/system.h.  It holds a number of
  low level bits with no/few dependencies that are commonly used (eg.
  memory barriers) and a number of bits with more dependencies that
  aren't used in many places (eg.  switch_to()).

  These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

        Move memory barriers here.  This already done for MIPS and Alpha.

    (2) asm/switch_to.h

        Move switch_to() and related stuff here.

    (3) asm/exec.h

        Move arch_align_stack() here.  Other process execution related bits
        could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

        Move xchg() and cmpxchg() here as they're full word atomic ops and
        frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

        Move die() and related bits.

    (6) asm/auxvec.h

        Move AT_VECTOR_SIZE_ARCH here.

  Other arch headers are created as needed on a per-arch basis."

Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that.  We'll find out anything that got broken and fix it..

* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
  Delete all instances of asm/system.h
  Remove all #inclusions of asm/system.h
  Add #includes needed to permit the removal of asm/system.h
  Move all declarations of free_initmem() to linux/mm.h
  Disintegrate asm/system.h for OpenRISC
  Split arch_align_stack() out from asm-generic/system.h
  Split the switch_to() wrapper out of asm-generic/system.h
  Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
  Create asm-generic/barrier.h
  Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
  Disintegrate asm/system.h for Xtensa
  Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
  Disintegrate asm/system.h for Tile
  Disintegrate asm/system.h for Sparc
  Disintegrate asm/system.h for SH
  Disintegrate asm/system.h for Score
  Disintegrate asm/system.h for S390
  Disintegrate asm/system.h for PowerPC
  Disintegrate asm/system.h for PA-RISC
  Disintegrate asm/system.h for MN10300
  ...
2012-03-28 15:58:21 -07:00
David Howells 9ffc93f203 Remove all #inclusions of asm/system.h
Remove all #inclusions of asm/system.h preparatory to splitting and killing
it.  Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-28 18:30:03 +01:00
Jason Wessel bec4d62ead kgdb,debug_core: add the ability to control the reboot notifier
Sometimes it is desirable to stop the kernel debugger before allowing
a system to reboot either with kdb or kgdb.  This patch adds the
ability to turn the reboot notifier on and off or enter the debugger
and stop kernel execution before rebooting.

It is possible to change the setting after booting the kernel with the
following:

echo 1 > /sys/module/debug_core/parameters/kgdbreboot

It is also possible to change this setting using kdb / kgdb to
manipulate the variable directly.

Using KDB:
   mm kgdbreboot 1

Using gdb:
   set kgdbreboot=1

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-22 15:07:16 -05:00
Jason Wessel 2366e04784 kgdb,debug-core,gdbstub: Hook the reboot notifier for debugger detach
The gdbstub and kdb should get detached if the system is rebooting.
Calling gdbstub_exit() will set the proper debug core state and send a
message to any debugger that is connected to correctly detach.

An attached debugger will receive the exit code from
include/linux/reboot.h based on SYS_HALT, SYS_REBOOT, etc...

Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2012-03-22 15:07:15 -05:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Lucas De Marchi 25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Dongdong Deng d7ba979d45 debug_core,x86,blackfin: Clean up hw debug disable API
The kgdb_disable_hw_debug() was an architecture specific function for
disabling all hardware breakpoints on a per cpu basis when entering
the debug core.

This patch will remove the weak function kdbg_disable_hw_debug() and
change it into a call back which lives with the rest of hw breakpoint
call backs in struct kgdb_arch.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-29 13:14:41 -05:00
Jason Wessel 495363d380 kdb,debug_core: adjust master cpu switch logic against new debug_core locking
The kdb shell needs to enforce switching back to the original CPU that
took the exception before restoring normal kernel execution.  Resuming
from a different CPU than what took the original exception will cause
problems with spin locks that are freed from the a different processor
than had taken the lock.

The special logic in dbg_cpu_switch() can go away entirely with
because the state of what cpus want to be masters or slaves will
remain unchanged between entry and exit of the debug_core exception
context.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22 15:34:13 -05:00
Jason Wessel dfee3a7b92 debug_core: refactor locking for master/slave cpus
For quite some time there have been problems with memory barriers and
various races with NMI on multi processor systems using the kernel
debugger.  The algorithm for entering the kernel debug core and
resuming kernel execution was racy and had several known edge case
problems with attempting to debug something on a heavily loaded system
using breakpoints that are hit repeatedly and quickly.

The prior "locking" design entry worked as follows:

  * The atomic counter kgdb_active was used with atomic exchange in
    order to elect a master cpu out of all the cpus that may have
    taken a debug exception.
  * The master cpu increments all elements of passive_cpu_wait[].
  * The master cpu issues the round up cpus message.
  * Each "slave cpu" that enters the debug core increments its own
    element in cpu_in_kgdb[].
  * Each "slave cpu" spins on passive_cpu_wait[] until it becomes 0.
  * The master cpu debugs the system.

The new scheme removes the two arrays of atomic counters and replaces
them with 2 single counters.  One counter is used to count the number
of cpus waiting to become a master cpu (because one or more hit an
exception). The second counter is use to indicate how many cpus have
entered as slave cpus.

The new entry logic works as follows:

  * One or more cpus enters via kgdb_handle_exception() and increments
    the masters_in_kgdb. Each cpu attempts to get the spin lock called
    dbg_master_lock.
  * The master cpu sets kgdb_active to the current cpu.
  * The master cpu takes the spinlock dbg_slave_lock.
  * The master cpu asks to round up all the other cpus.
  * Each slave cpu that is not already in kgdb_handle_exception()
    will enter and increment slaves_in_kgdb.  Each slave will now spin
    try_locking on dbg_slave_lock.
  * The master cpu waits for the sum of masters_in_kgdb and slaves_in_kgdb
    to be equal to the sum of the online cpus.
  * The master cpu debugs the system.

In the new design the kgdb_active can only be changed while holding
dbg_master_lock.  Stress testing has not turned up any further
entry/exit races that existed in the prior locking design.  The prior
locking design suffered from atomic variables not being truly atomic
(in the capacity as used by kgdb) along with memory barrier races.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Dongdong Deng <dongdong.deng@windriver.com>
2010-10-22 15:34:13 -05:00
Dongdong Deng c1bb9a9c19 debug_core: disable hw_breakpoints on all cores in kgdb_cpu_enter()
The slave cpus do not have the hw breakpoints disabled upon entry to
the debug_core and as a result could cause unrecoverable recursive
faults on badly placed breakpoints, or get out of sync with the arch
specific hw breakpoint operations.

This patch addresses the problem by invoking kgdb_disable_hw_debug()
earlier in kgdb_enter_cpu for each cpu that enters the debug core.

The hw breakpoint dis/enable flow should be:

master_debug_cpu   slave_debug_cpu
         \              /
          kgdb_cpu_enter
                |
        kgdb_disable_hw_debug --> uninstall pre-enabled hw_breakpoint
                |
 do add/rm dis/enable operates to hw_breakpoints on master_debug_cpu..
                |
        correct_hw_break --> correct/install the enabled hw_breakpoint
                |
           leave_kgdb

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22 15:34:12 -05:00
Jason Wessel fb70b5888b debug_core: stop rcu warnings on kernel resume
When returning from the kernel debugger reset the rcu jiffies_stall
value to prevent the rcu stall detector from sending NMI events which
invoke a stack dump for each cpu in the system.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22 15:34:10 -05:00
Jason Wessel 16cdc628c3 debug_core: move all watch dog syncs to a single function
Move the various clock and watch dog syncs to a single function in
advance of adding another sync for the rcu stall detector.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-10-22 15:34:10 -05:00
Dmitry Torokhov 1495cc9df4 Input: sysrq - drop tty argument from sysrq ops handlers
Noone is using tty argument so let's get rid of it.

Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2010-08-19 22:07:06 -07:00
Linus Torvalds 89a6c8cb9e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  debug_core,kdb: fix crash when arch does not have single step
  kgdb,x86: use macro HBP_NUM to replace magic number 4
  kgdb,mips: remove unused kgdb_cpu_doing_single_step operations
  mm,kdb,kgdb: Add a debug reference for the kdb kmap usage
  KGDB: Remove set but unused newPC
  ftrace,kdb: Allow dumping a specific cpu's buffer with ftdump
  ftrace,kdb: Extend kdb to be able to dump the ftrace buffer
  kgdb,powerpc: Replace hardcoded offset by BREAK_INSTR_SIZE
  arm,kgdb: Add ability to trap into debugger on notify_die
  gdbstub: do not directly use dbg_reg_def[] in gdb_cmd_reg_set()
  gdbstub: Implement gdbserial 'p' and 'P' packets
  kgdb,arm: Individual register get/set for arm
  kgdb,mips: Individual register get/set for mips
  kgdb,x86: Individual register get/set for x86
  kgdb,kdb: individual register set and and get API
  gdbstub: Optimize kgdb's "thread:" response for the gdb serial protocol
  kgdb: remove custom hex_to_bin()implementation
2010-08-05 15:59:48 -07:00
Jason Wessel 3fa43aba08 debug_core,kdb: fix crash when arch does not have single step
When an arch such as mips and microblaze does not implement either HW
or software single stepping the debug core should re-enter kdb.  The
kdb code will properly ignore the single step operation.  Attempting
to single step the kernel without software or hardware support causes
unpredictable kernel crashes.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-08-05 09:22:25 -05:00
Jiri Kosina d790d4d583 Merge branch 'master' into for-next 2010-08-04 15:14:38 +02:00
Jason Wessel b0679c63db debug_core,kdb: fix kgdb_connected bit set in the wrong place
Immediately following an exit from the kdb shell the kgdb_connected
variable should be set to zero, unless there are breakpoints planted.
If the kgdb_connected variable is not zeroed out with kdb, it is
impossible to turn off kdb.

This patch is merely a work around for now, the real fix will check
for the breakpoints.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-07-21 19:27:07 -05:00
Pavel Machek a2531293db update email address
pavel@suse.cz no longer works, replace it with working address.

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-07-19 10:56:54 +02:00
Jason Wessel 0b4b3827db x86, kgdb, init: Add early and late debug states
The kernel debugger can operate well before mm_init(), but the x86
hardware breakpoint code which uses the perf api requires that the
kernel allocators are initialized.

This means the kernel debug core needs to provide an optional arch
specific call back to allow the initialization functions to run after
the kernel has been further initialized.

The kdb shell already had a similar restriction with an early
initialization and late initialization.  The kdb_init() was moved into
the debug core's version of the late init which is called
dbg_late_init();

CC: kgdb-bugreport@lists.sourceforge.net
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:29 -05:00
Jason Wessel 4402c153cb kdb,debug_core: Allow the debug core to receive a panic notification
It is highly desirable to trap into kdb on panic.  The debug core will
attempt to register as the first in line for the panic notifier.

CC: Ingo Molnar <mingo@elte.hu>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:28 -05:00
Jason Wessel 6d90634076 debug_core,kdb: Allow the debug core to process a recursive debug entry
This allows kdb to debug a crash with in the kms code with a
single level recursive re-entry.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:27 -05:00
Jason Wessel 1cee5e35f1 kgdb: Add the ability to schedule a breakpoint via a tasklet
Some kgdb I/O modules require the ability to create a breakpoint
tasklet, such as kgdboc and external modules such as kgdboe.  The
breakpoint tasklet is used as an asynchronous entry point into the
debugger which will have a different function scope than the current
execution path where it might not be safe to have an inline
breakpoint.  This is true of some of the kgdb I/O drivers which share
code with kgdb and rest of the kernel users.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:26 -05:00
Jason Wessel f503b5ae53 x86,kgdb: Add low level debug hook
The only way the debugger can handle a trap in inside rcu_lock,
notify_die, or atomic_notifier_call_chain without a triple fault is
to have a low level "first opportunity handler" in the int3 exception
handler.

Generally this will be something the vast majority of folks will not
need, but for those who need it, it is added as a kernel .config
option called KGDB_LOW_LEVEL_TRAP.

CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: H. Peter Anvin <hpa@zytor.com>
CC: x86@kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:25 -05:00