A couple issues here:
* Some resources weren't released.
* If alloc_etherdev() failed it would have caused a NULL dereference
because "pep" would be null when we checked "if (pep->clk)".
* Also it's better to propagate the error codes from mdiobus_register()
instead of just returning -ENOMEM.
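The three points above can be illustrated with a minimal userspace sketch. It is not the pxa168_eth code: struct sk_priv, sk_bus_register(), sk_probe() and sk_remove() are hypothetical stand-ins, and malloc()/free() take the place of the real resources. The sketch only shows the unwind pattern: bail out before touching a NULL pointer, release what was acquired in reverse order, and hand back the callee's error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data (not the real struct). */
struct sk_priv {
	int *clk;	/* stands in for the clock reference */
	int *bus;	/* stands in for the MDIO bus */
};

/* Hypothetical registration step: returns 0 or the given negative errno. */
static int sk_bus_register(int *bus, int result)
{
	(void)bus;
	return result;
}

static int sk_probe(int alloc_fails, int register_result, struct sk_priv **out)
{
	struct sk_priv *pep;
	int err;

	assert(out != NULL);
	*out = NULL;

	pep = alloc_fails ? NULL : calloc(1, sizeof(*pep));
	if (!pep)
		return -ENOMEM;	/* nothing acquired yet, so pep is never touched */

	pep->clk = malloc(sizeof(*pep->clk));
	if (!pep->clk) {
		err = -ENOMEM;
		goto err_free_priv;
	}

	pep->bus = malloc(sizeof(*pep->bus));
	if (!pep->bus) {
		err = -ENOMEM;
		goto err_free_clk;
	}

	err = sk_bus_register(pep->bus, register_result);
	if (err)
		goto err_free_bus;	/* keep the callee's error code */

	*out = pep;
	return 0;

	/* Unwind in reverse order of acquisition. */
err_free_bus:
	free(pep->bus);
err_free_clk:
	free(pep->clk);
err_free_priv:
	free(pep);
	return err;
}

/* Counterpart of sk_probe() for the success case. */
static void sk_remove(struct sk_priv *pep)
{
	if (!pep)
		return;
	free(pep->bus);
	free(pep->clk);
	free(pep);
}
```

With this shape, a failing registration step returns its own error (for example -EIO) rather than a blanket -ENOMEM, and an allocation failure returns before any field of "pep" is read.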
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
"pep->pd" isn't checked consistently in this function. For example it's
dereferenced unconditionally on the next line after the end of the if
condition. This function is only called from pxa168_eth_probe() and
pep->pd is always non-NULL so I removed the check.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible that phylib will call adjust_link before returning
from {,of_}phy_connect(), which may cause the following [very rare,
though] oops upon reopening the device:
Unable to handle kernel paging request for data at address 0x0000024c
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2 LTT NESTING LEVEL : 0
P1021 RDB
Modules linked in:
NIP: c0345dac LR: c0345dac CTR: c0345d84
TASK = dffab6b0[30] 'events/0' THREAD: c0d24000 CPU: 0
[...]
NIP [c0345dac] adjust_link+0x28/0x19c
LR [c0345dac] adjust_link+0x28/0x19c
Call Trace:
[c0d25f00] [000045e1] 0x45e1 (unreliable)
[c0d25f30] [c036c158] phy_state_machine+0x3ac/0x554
[...]
Here is why. Drivers store phydev in their private structures, e.g.
gianfar driver:
static int init_phy(struct net_device *dev)
{
...
priv->phydev = of_phy_connect(...);
...
}
So that adjust_link could retrieve it back:
static void adjust_link(struct net_device *dev)
{
...
struct phy_device *phydev = priv->phydev;
...
}
If the device has been opened before, then phydev->state is set to
PHY_HALTED (or undefined if the driver didn't call phy_stop()).
Now, phy_connect starts the PHY state machine before returning phydev to
the driver:
phy_start_machine(phydev, NULL);
if (phydev->irq > 0)
phy_start_interrupts(phydev);
return phydev;
The time between 'phy_start_machine()' and 'return phydev' is undefined.
The start machine routine delays execution for 1 second, which is enough
for most cases. But under heavy load, or if you're unlucky, it is quite
possible that PHY state machine will execute before phy_connect()
returns, and so adjust_link callback will try to dereference phydev,
which is not yet ready.
To fix the issue, simply initialize the PHY's state to PHY_READY during
phy_attach(). This will ensure that phylib won't call adjust_link before
phy_start().
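The race and the fix can be modelled with a small userspace sketch. Everything in it is a simplified assumption rather than phylib code: the sk_* names are hypothetical, only two states are modelled, and the "state machine runs early" case is forced with a flag instead of a delayed work item.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical subset of PHY states. */
enum sk_phy_state { SK_PHY_READY, SK_PHY_HALTED };

struct sk_phy;

struct sk_netdev {
	struct sk_phy *phydev;	/* set by the driver only after connect returns */
	int bad_derefs;		/* counts callbacks that saw phydev == NULL */
};

struct sk_phy {
	enum sk_phy_state state;
	int link;
	struct sk_netdev *dev;
};

/* Driver callback: a real driver would dereference dev->phydev here. */
static void sk_adjust_link(struct sk_netdev *dev)
{
	if (dev->phydev == NULL)
		dev->bad_derefs++;	/* this is where the oops would occur */
}

/* One pass of the state machine: only a halted PHY with link up calls back. */
static void sk_state_machine(struct sk_phy *phy)
{
	if (phy->state == SK_PHY_HALTED && phy->link) {
		phy->link = 0;
		sk_adjust_link(phy->dev);
	}
}

static struct sk_phy *sk_phy_connect(struct sk_phy *phy, struct sk_netdev *dev,
				     int reset_state, int machine_runs_early)
{
	assert(phy != NULL && dev != NULL);
	phy->dev = dev;
	if (reset_state)
		phy->state = SK_PHY_READY;	/* the attach-time initialization */
	if (machine_runs_early)
		sk_state_machine(phy);		/* runs before the caller stores phy */
	return phy;
}
```

In this model, a PHY left in the halted state with link up triggers the callback before the driver has stored the returned pointer. Resetting the state during connect makes the early pass a no-op.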
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PCIe port driver's module exit routine is never used, so drop it.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The PCIe PME code only consists of one file, so it doesn't need to
occupy its own directory. Move it to drivers/pci/pcie/pme.c and
remove the contents of drivers/pci/pcie/pme .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In principle PCIe port services may be enabled by the BIOS, so it's
better to disable them during port initialization to avoid spurious
events from being generated.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
After commit 852972acff (ACPI: Disable
ASPM if the platform won't provide _OSC control for PCIe) control of
the PCIe Capability Structure is unconditionally requested by
acpi_pci_root_add(), which in principle may cause problems to
happen in two ways. First, the BIOS may refuse to give control of
the PCIe Capability Structure if it is not asked for any of the
_OSC features depending on it at the same time. Second, the BIOS may
assume that control of the _OSC features depending on the PCIe
Capability Structure will be requested in the future and may behave
incorrectly if that doesn't happen. For this reason, control of
the PCIe Capability Structure should always be requested along with
control of any other _OSC features that may depend on it (i.e. PCIe
native PME, PCIe native hot-plug, PCIe AER).
Rework the PCIe port driver so that (1) it checks which native PCIe
port services can be enabled, according to the BIOS, and (2) it
requests control of all these services simultaneously. In
particular, this causes pcie_portdrv_probe() to fail if the BIOS
refuses to grant control of the PCIe Capability Structure, which
means that no native PCIe port services can be enabled for the PCIe
Root Complex the given port belongs to. If that happens, ASPM is
disabled to avoid problems with mishandling it by the part of the
PCIe hierarchy for which control of the PCIe Capability Structure
has not been received.
Make it possible to override this behavior using 'pcie_ports=native'
(use the PCIe native services regardless of the BIOS response to the
control request), or 'pcie_ports=compat' (do not use the PCIe native
services at all).
Accordingly, rework the existing PCIe port service drivers so that
they don't request control of the services directly.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
It is possible that the BIOS will not grant control of all _OSC
features requested via acpi_pci_osc_control_set(), so it is
recommended to negotiate the final set of _OSC features with the
query flag set before calling _OSC to request control of these
features.
To implement it, rework acpi_pci_osc_control_set() so that the caller
can specify the mask of _OSC control bits to negotiate and the mask
of _OSC control bits that are absolutely necessary to it. Then,
acpi_pci_osc_control_set() will run _OSC queries in a loop until
the mask of _OSC control bits returned by the BIOS is equal to the
mask passed to it. Also, before running the _OSC request
acpi_pci_osc_control_set() will check if the caller's required
control bits are present in the final mask.
Using this mechanism we will be able to avoid situations in which the
BIOS doesn't grant control of certain _OSC features, because they
depend on some other _OSC features that have not been requested.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
There is the assumption in acpi_pci_osc_control_set() that it is
always sufficient to compare the mask of _OSC control bits to be
requested with the result of an _OSC query where all of the known
control bits have been checked. However, in general, that need not
be the case. For example, if an _OSC feature A depends on an _OSC
feature B and control of A, B plus another _OSC feature C is
requested simultaneously, the BIOS may return A, B, C, while it would
only return C if A and C were requested without B.
That may result in passing a wrong mask of _OSC control bits to an
_OSC control request, in which case the BIOS may only grant control
of a subset of the requested features. Moreover, acpi_pci_run_osc()
will return an error code if that happens and the caller of
acpi_pci_osc_control_set() will not know that it's been granted
control of some _OSC features. Consequently, the system will
generally not work as expected.
Apart from this acpi_pci_osc_control_set() always uses the mask
of _OSC control bits returned by the very first invocation of
acpi_pci_query_osc(), but that is done with the second argument
equal to OSC_PCI_SEGMENT_GROUPS_SUPPORT which generally happens
to affect the returned _OSC control bits.
For these reasons, make acpi_pci_osc_control_set() always check if
control of the requested _OSC features will be granted before making
the final control request. As a result, the osc_control_qry and
osc_queried members of struct acpi_pci_root are not necessary any
more, so drop them and remove the remaining code referring to them.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make acpi_pci_query_osc() use an additional pointer argument to
return the mask of control bits obtained from the BIOS to the
caller.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Make acpi_pci_osc_control_set() attempt to find the handle of the
_OSC object under the given PCI root bridge object after verifying
that its second argument is correct and that there is a struct
acpi_pci_root object for the given root bridge handle, which is
more logical than the old code.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce kernel command line switch pcie_ports= allowing one to
disable all of the native PCIe port services, so that PCIe ports
are treated like PCI-to-PCI bridges.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Introduce a function allowing the caller to check whether to try to
enable PCIe AER.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Fix this error on an s390 allyesconfig build:
linux-2.6/drivers/net/caif/caif_spi.c:98:
undefined reference to `dma_free_coherent'
Cc: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PA-RISC and ia64 have stacks that grow upwards. Check that
they do not run into other mappings. By making VM_GROWSUP
0x0 on architectures that do not ever use it, we can avoid
some unpleasant #ifdefs in check_stack_guard_page().
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When converting this to the new wait_for macro I inverted the wait
condition, which causes all sorts of problems. So correct it to fix
several failures caused by the bad wait (flickering, bad output
detection, tearing, etc.).
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sparse spotted that the kzalloc() in pm_qos_power_open() in the
current Linus' git tree had its parameters swapped. Fix this.
Signed-off-by: David Alan Gilbert <linux@treblig.org>
Acked-by: mark gross <markgross@thegnar.org>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.
This has the important side-effect of fixing a long-standing bug of
events getting lost if:
- an event's interrupt handler is running
- the event is migrated to a different vcpu
- the event is re-triggered
The most noticeable symptom of these lost events is occasional lockups
of blkfront.
Many thanks to Tom Kopec and Daniel Stodden for tracking this down.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Tom Kopec <tek@acm.org>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
- use a specific percpu irq_chip implementation, and
- handle them with handle_percpu_irq
This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoids problems with attempts
to migrate them.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Eliminate sparse warning during usage of crypto_shash_* APIs
error: bad constant expression
Allocate memory for shash descriptors once, so that we do not kmalloc/kfree it
for every signature generation (shash descriptor for md5 hash).
From ed7538619817777decc44b5660b52268077b74f3 Mon Sep 17 00:00:00 2001
From: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Date: Tue, 24 Aug 2010 11:47:43 -0500
Subject: [PATCH] eliminate sparse warnings during crypto_shash_* APis usage
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
We should pass the data to the data register.
Signed-off-by: Jianwei Yang <jianwei.yang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It looks like there is an off-by-one error in one of your changes to
drivers/staging/rar_register/rar_register.c:
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This build bug triggers:
drivers/built-in.o: In function `mantis_exit':
(.text+0x377413): undefined reference to `ir_input_unregister'
drivers/built-in.o: In function `mantis_input_init':
(.text+0x3774ff): undefined reference to `__ir_input_register'
if MANTIS_CORE is enabled but IR_CORE is not. Add the correct
dependency.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Get rid of indirect p1275 PROM call buffer.
sparc64: Fill a missing delay slot.
sparc64: Make lock backoff really a NOP on UP builds.
sparc64: simple microoptimizations for atomic functions
sparc64: Make rwsems 64-bit.
sparc64: Really fix atomic64_t interface types.
Now that the worklist is global, having works pending after wq
destruction can easily lead to oops and destroy_workqueue() has
several BUG_ON()s to catch these cases. Unfortunately, BUG_ON()
doesn't tell much about how the work became pending after the final
flush_workqueue().
This patch adds WQ_DYING which is set before the final flush begins.
If a work is requested to be queued on a dying workqueue,
WARN_ON_ONCE() is triggered and the request is ignored. This clearly
indicates which caller is trying to queue a work on a dying workqueue
and keeps the system working in most cases.
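A minimal userspace model of the dying-flag check, under stated assumptions: struct sk_workqueue, SK_WQ_DYING and the sk_* helpers are hypothetical, work items are reduced to a counter, and a "rejected" counter stands in for the one-time warning.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_WQ_DYING 0x1u

struct sk_workqueue {
	unsigned int flags;
	int pending;	/* queued but not yet executed work items */
	int rejected;	/* stands in for the warning that would be emitted */
};

/* Queue one work item; refuse if the workqueue is being destroyed. */
static bool sk_queue_work(struct sk_workqueue *wq)
{
	if (wq->flags & SK_WQ_DYING) {
		wq->rejected++;
		return false;
	}
	wq->pending++;
	return true;
}

/* Pretend to run everything that is pending. */
static void sk_flush_workqueue(struct sk_workqueue *wq)
{
	wq->pending = 0;
}

static void sk_destroy_workqueue(struct sk_workqueue *wq)
{
	wq->flags |= SK_WQ_DYING;	/* set before the final flush begins */
	sk_flush_workqueue(wq);
	assert(wq->pending == 0);	/* nothing may be left behind */
}
```

The point of the ordering is that any late queueing attempt is caught at its call site, where the offender is visible, instead of surfacing later as stale pending work.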
Locking rule comment is updated such that the 'I' rule includes
modifying the field from destruction path.
Signed-off-by: Tejun Heo <tj@kernel.org>
If netconsole is in use, there is a possibility for deadlock in 3c59x between
boomerang_interrupt and boomerang_start_xmit. Both routines take the vp->lock,
and if netconsole is in use, a pr_* call from the boomerang_interrupt routine
will result in the netconsole code attempting to transmit an skb, which can try
to take the same spin lock, resulting in deadlock.
The fix is pretty straightforward. This patch allocates a bit in the 3c59x
private structure to indicate that it's handling an interrupt. If we get into
the transmit routine and that bit is set, we can be sure that we have recursed
and will deadlock if we continue, so instead we just return NETDEV_TX_BUSY, so
the stack requeues the skb to try again later.
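The recursion guard can be modelled in userspace like this. None of it is the 3c59x code: struct sk_nic, the SK_TX_* values and the sk_* functions are hypothetical, and a plain boolean stands in for the spinlock so that the would-be deadlock shows up as an assertion instead of a hang.

```c
#include <assert.h>
#include <stdbool.h>

enum { SK_TX_OK = 0, SK_TX_BUSY = 1 };

struct sk_nic {
	bool lock_held;		/* stands in for the driver's spinlock */
	bool handling_irq;	/* the new "in interrupt handler" bit */
	int sent;
	int busy;
};

static int sk_start_xmit(struct sk_nic *nic)
{
	if (nic->handling_irq) {
		/* We recursed from the interrupt path: back off, retry later. */
		nic->busy++;
		return SK_TX_BUSY;
	}
	assert(!nic->lock_held);	/* taking it again would self-deadlock */
	nic->lock_held = true;
	nic->sent++;
	nic->lock_held = false;
	return SK_TX_OK;
}

/* Interrupt handler; logging through the network console re-enters xmit. */
static int sk_interrupt(struct sk_nic *nic, bool log_via_netconsole)
{
	int rc = SK_TX_OK;

	nic->lock_held = true;
	nic->handling_irq = true;
	if (log_via_netconsole)
		rc = sk_start_xmit(nic);
	nic->handling_irq = false;
	nic->lock_held = false;
	return rc;
}
```

A transmit attempted from inside the handler is turned away with the busy code, while the same call made afterwards goes through normally.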
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tlb flushing code uses the mm_users field of the mm_struct to
decide if each page table entry needs to be flushed individually with
IPTE or if a global flush for the mm_struct is sufficient after all page
table updates have been done. The comment for mm_users says "How many
users with user space?" but the /proc code increases mm_users after it
found the process structure by pid without creating a new user process,
which makes mm_users useless for the decision between the two tlb
flushing methods. The current code can be confused into not flushing tlb
entries by a concurrent access to /proc files if e.g. a fork is in
progress. The solution for this problem is to make the tlb flushing
logic independent from the mm_users field.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
powerpc: Fix config dependency problem with MPIC_U3_HT_IRQS
via-pmu: Add compat_pmu_ioctl
powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls
powerpc/pci: Fix checking for child bridges in PCI code.
powerpc: Fix typo in uImage target
powerpc: Initialise paca->kstack before early_setup_secondary
powerpc: Fix bogus it_blocksize in VIO iommu code
powerpc: Inline ppc64_runlatch_off
powerpc: Correct smt_enabled=X boot option for > 2 threads per core
powerpc: Silence xics_migrate_irqs_away() during cpu offline
powerpc: Silence __cpu_up() under normal operation
powerpc: Re-enable preemption before cpu_die()
powerpc/pci: Drop unnecessary null test
powerpc/powermac: Drop unnecessary null test
powerpc/powermac: Drop unnecessary of_node_put
powerpc/kdump: Stop all other CPUs before running crash handlers
powerpc/mm: Fix vsid_scrample typo
powerpc: Use is_32bit_task() helper to test 32 bit binary
powerpc: Export memstart_addr and kernstart_addr on ppc64
powerpc: Make rwsem use "long" type
...
Fix this build error:
arch/s390/kernel/process.c:272: error: conflicting types for 'sys_execve'
arch/s390/kernel/entry.h:45: error: previous declaration of 'sys_execve' was here
make[1]: *** [arch/s390/kernel/process.o] Error 1
make: *** [arch/s390/kernel] Error 2
introduced by d7627467b7
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
68328serial: check return value of copy_*_user() instead of access_ok()
synclink: add mutex_unlock() on error path
rocket: add a mutex_unlock()
ip2: return -EFAULT on copy_to_user errors
ip2: remove unneeded NULL check
serial: print early console device address in hex
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
kobject_uevent: fix typo in comments
firmware_class: fix typo in error path
kobject: Break the kobject namespace defs into their own header
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (29 commits)
ARM: imx: fix build failure concerning otg/ulpi
USB: ftdi_sio: add product ID for Lenz LI-USB
USB: adutux: fix misuse of return value of copy_to_user()
USB: iowarrior: fix misuse of return value of copy_to_user()
USB: xHCI: update ring dequeue pointer when process missed tds
USB: xhci: Remove buggy assignment in next_trb()
USB: ftdi_sio: Add ID for Ionics PlugComputer
USB: serial: io_ti.c: don't return 0 if writing the download record failed
USB: otg: twl4030: fix wrong assumption of starting state
USB: gadget: Return -ENOMEM on memory allocation failure
USB: gadget: fix composite kernel-doc warnings
USB: ssu100: set tty_flags in ssu100_process_packet
USB: ssu100: add disconnect function for ssu100
USB: serial: export symbol usb_serial_generic_disconnect
USB: ssu100: rework logic for TIOCMIWAIT
USB: ssu100: add register parameter to ssu100_setregister
USB: ssu100: remove duplicate #defines in ssu100
USB: ssu100: refine process_packet in ssu100
USB: ssu100: add locking for port private data in ssu100
USB: r8a66597-udc: return -ENOMEM if kzalloc() fails
...
This is based upon a report by Meelis Roos showing that it's possible
that we'll try to fetch a property that is 32K in size with some
devices. With the current fixed 3K buffer we use for moving data in
and out of the firmware during PROM calls, that simply won't work.
In fact, it will scramble random kernel data during bootup.
The reasoning behind the temporary buffer is entirely historical. It
used to be the case that we had problems referencing dynamic kernel
memory (including the stack) early in the boot process before we
explicitly told the firmware to switch us over to the kernel trap
table.
So what we did was always give the firmware buffers that were locked
into the main kernel image.
But we no longer have problems like that, so get rid of all of this
indirect bounce buffering.
Besides fixing Meelis's bug, this also makes the kernel data about 3K
smaller.
It was also discovered during these conversions that the
implementation of prom_retain() was completely wrong, so that was
fixed here as well. Currently that interface is not in use.
Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
MPIC_U3_HT_IRQS is selected both by PPC_PMAC64 and PPC_MAPLE, but depends
on PPC_MAPLE, so a PPC_PMAC64-only config gets this warning:
warning: (PPC_PMAC64 && PPC_PMAC && POWER4 || PPC_MAPLE && PPC64 && PPC_BOOK3S) selects MPIC_U3_HT_IRQS which has unmet direct dependencies (PPC_MAPLE)
Fix that by removing the dependency on PPC_MAPLE.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The ioctls are actually compatible, but due to historical mistake the
numbers differ between 32bit and 64bit.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.
Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit e32e78c5ee
(powerpc: fix build with make 3.82) introduced a
typo in uImage target and broke building uImage:
make: *** No rule to make target `uImage'. Stop.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"
Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data). It's not always allowable to take such a miss, and
intermittent crashes will result.
Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)). This patch therefore only
affects the init of secondaries.
Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:
address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048
Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.
Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non-compulsory
field to iommu code and forget to initialise it.
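A small userspace sketch of why the block size matters and why zeroing allocations helps. The structure and helpers (struct sk_iommu_table, sk_table_create(), sk_table_alloc()) are hypothetical, calloc() stands in for kzalloc, and the allocator is reduced to a single "next free entry" hint that is rounded up to the block size after each allocation.

```c
#include <assert.h>
#include <stdlib.h>

struct sk_iommu_table {
	unsigned long next;		/* hint: where the next search starts */
	unsigned long blocksize;	/* alignment applied to the hint */
};

static unsigned long sk_align_up(unsigned long v, unsigned long a)
{
	return (v + a - 1) / a * a;
}

static struct sk_iommu_table *sk_table_create(unsigned long blocksize)
{
	struct sk_iommu_table *tbl;

	assert(blocksize != 0);
	/* Zeroed allocation: fields added later start from a known value. */
	tbl = calloc(1, sizeof(*tbl));
	if (!tbl)
		return NULL;
	tbl->blocksize = blocksize;	/* set explicitly, never left to chance */
	return tbl;
}

/* Hand out npages entries and advance the hint to the next block boundary. */
static unsigned long sk_table_alloc(struct sk_iommu_table *tbl,
				    unsigned long npages)
{
	unsigned long start = tbl->next;

	tbl->next = sk_align_up(start + npages, tbl->blocksize);
	return start;
}
```

With a sane block size the hint advances in small, regular steps. A garbage value in that field would throw successive allocations far apart, which matches the kind of gaps in the listing above.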
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it
into the callers. To avoid a mess of circular includes I didn't add
it as an inline function.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The 'smt_enabled=X' boot option does not handle values of X > 2.
For Power 7 processors with smt modes of 0,1,2,3, and 4 this does
not work. This patch allows the smt_enabled option to be set to
any value limited to a max equal to the number of threads per
core.
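One way to picture the clamping is the userspace sketch below. It is an assumption-laden illustration, not the powerpc setup code: sk_parse_smt_enabled() is a hypothetical helper, "on" and "off" are assumed to map to all threads and zero respectively, and unparsable input is assumed to fall back to all threads.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the value of an smt_enabled= style option.
 * Returns the number of threads to enable, capped at threads_per_core.
 */
static int sk_parse_smt_enabled(const char *arg, int threads_per_core)
{
	char *end;
	long v;

	assert(threads_per_core > 0);

	if (arg == NULL || strcmp(arg, "on") == 0)
		return threads_per_core;
	if (strcmp(arg, "off") == 0)
		return 0;

	v = strtol(arg, &end, 10);
	if (end == arg || *end != '\0' || v < 0)
		return threads_per_core;	/* assumed fallback for bad input */

	if (v > threads_per_core)
		v = threads_per_core;		/* cap at the hardware limit */
	return (int)v;
}
```

On a core with four threads, values up to 4 are accepted as given and anything larger is capped at 4.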
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
All IRQs are migrated away from a CPU that is being offlined so the
following messages suggest a problem when the system is behaving as
designed:
IRQ 262 affinity broken off cpu 1
IRQ 17 affinity broken off cpu 0
IRQ 18 affinity broken off cpu 0
IRQ 19 affinity broken off cpu 0
IRQ 256 affinity broken off cpu 0
IRQ 261 affinity broken off cpu 0
IRQ 262 affinity broken off cpu 0
Don't print these messages when the CPU is not online.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During CPU offline/online tests __cpu_up would flood the logs with
the following message:
Processor 0 found.
This provides no useful information to the user as there is no context
provided, and since the operation was a success (to this point) it is expected
that the CPU will come back online, providing all the feedback necessary.
Change the "Processor found" message to DBG() similar to other such messages in
the same function. Also, add an appropriate log level for the "Processor is
stuck" message.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
start_secondary() is called shortly after _start and also via
cpu_idle()->cpu_die()->pseries_mach_cpu_die()
start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is
called via the cpu_idle() routine with preemption disabled, resulting in the
following repeating message during rapid cpu offline/online tests
with CONFIG_PREEMPT=y:
BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14
Move the cpu_die() call inside the existing preemption enabled block of
cpu_idle(). This is safe as the idle task is affined to a single CPU so the
debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are
in a "migration disabled" region.
Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>