The calculation of the size for the exception save area of the TLB
miss handler is wrong, luckily it's too big not too small.
Rework it to make it a bit clearer, and also correct. We want 3 save
areas, each EX_TLB_SIZE _bytes_.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The MSR bit which indicates 64-bit-ness is different between server and
booke, so add a #define which gives you the right mask regardless.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The goal is to avoid adding overhead to MMIO when only PIO is needed
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The DSCR (aka Data Stream Control Register) is supported on some
server PowerPC chips and allow some control over the prefetch
of data streams.
This patch allows the value to be specified per thread by emulating
the corresponding mfspr and mtspr instructions. Children of such
threads inherit the value. Other threads use a default value that
can be specified in sysfs - /sys/devices/system/cpu/dscr_default.
If a thread starts with non default value in the sysfs entry,
all children threads inherit this non default value even if
the sysfs value is changed later.
Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we set up the TLB for ourselves on Book3E, we need to flush out any
old mappings established by the firmware or bootloader. At present we
attempt this with a tlbilx to flush everything, but this will leave behind
any entries with the IPROT bit set.
There are several good reason firmware might establish mappings with IPROT,
and in fact ePAPR compliant firmwares are required to establish their
initial mapped area with IPROT.
This patch, therefore adds more complex code to scan through the TLB upon
entry and flush away any entries that are not our own.
Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add the cputable entry, regs and setup & restore entries for
the PowerPC A2 core.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mac-fec.c was setting individual UDP address registers instead of multicast
group address registers when joining a multicast group.
This prevented from correctly receiving UDP multicast packets.
According to datasheet, replaced hash_table_high and hash_table_low
with grp_hash_table_high and grp_hash_table_low respectively.
Also renamed hash_table_* with grp_hash_table_* in struct fec declaration
for 8xx: these registers are used only for multicast there.
Tested on a MPC5121 based board.
Build tested also against mpc866_ads_defconfig.
Signed-off-by: Andrea Galbusera <gizero@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
An upcoming new ics backend will need to implement different matching
semantics to the current ones, which are essentially the RTAS ics
backends. So move the current match into the RTAS backend, and allow
other ics backends to override.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
SCOM is a side-band configuration bus implemented on some processors.
This code provides a way for code to map and operate on devices via
SCOM, while the details of how that is implemented is left up to a
SCOM "controller" in the platform code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we start a cpu we use smp_ops->kick_cpu(), which currently
returns void, it should be able to fail. Convert it to return
int, and update all uses.
Convert all the current error cases to return -ENOENT, which is
what would eventually be returned by __cpu_up() currently when
it doesn't detect the cpu as coming up in time.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On Book3E, MMU_NO_CONTEXT != 0, but the slice_mm_new_context()
macro assumes that it is. This means that the map of the
page sizes for each slice is always initialized to zeroes
(which happens to be 4k pages), rather than to the correct
default base page size value - which might be 64k.
This patch corrects the problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use MMU_NO_CONTEXT as the initialiser for mm_context.id on
nohash and hash64.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Wakeup comes from the system reset handler with a potential loss of
the non-hypervisor CPU state. We save the non-volatile state on the
stack and a pointer to it in the PACA, which the system reset handler
uses to restore things
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We need to wait a bit for them to have done their CPU setup
or we might end up with translation and EE on with different
LPCR values between threads
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This uses feature sections to arrange that we always use HSPRG1
as the scratch register in the interrupt entry code rather than
SPRG2 when we're running in hypervisor mode on POWER7. This will
ensure that we don't trash the guest's SPRG2 when we are running
KVM guests. To simplify the code, we define GET_SCRATCH0() and
SET_SCRATCH0() macros like the GET_PACA/SET_PACA macros.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rework exception macros a bit to split offset from vector and add
some basic support for HDEC, HDSI, HISI and a few more.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pass the register type to the prolog, also provides alternate "HV"
version of hardware interrupt (0x500) and adjust LPES accordingly
We tag those interrupts by setting bit 0x2 in the trap number
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When running in Hypervisor mode (arch 2.06 or later), we store the PACA
in HSPRG0 instead of SPRG1. The architecture specifies that SPRGs may be
lost during a "nap" power management operation (though they aren't
currently on POWER7) and this enables use of SPRG1 by KVM guests.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This bit indicates that we are operating in hypervisor mode on a CPU
compliant to architecture 2.06 or later (currently server only).
We set it on POWER7 and have a boot-time CPU setup function that
clears it if MSR:HV isn't set (booting under a hypervisor).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This adds more SPR definitions used on newer processors when running
in hypervisor mode. Along with some other P7 specific bits and pieces
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a significant rework of the XICS driver, too significant to
conveniently break it up into a series of smaller patches to be honest.
The driver is moved to a more generic location to allow new platforms
to use it, and is broken up into separate ICP and ICS "backends". For
now we have the native and "hypervisor" ICP backends and one common
RTAS ICS backend.
The driver supports one ICP backend instanciation, and many ICS ones,
in order to accomodate future platforms with multiple possibly different
interrupt "sources" mechanisms.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This problem was noticed on an MPC855T platform. Ftrace did oops
when trying to write to the kernel text segment.
Many thanks to Joakim for finding the root cause of this problem.
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Joakim Tjernlund <joakim.tjernlund@transmode.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
e500mc does not support the HID0/MSR mechanism that is used by e500_idle
(and there are also issues with waking on certain types of interrupts).
Further, even if napping is never actually enabled, just having
CPU_FTR_CAN_NAP will cause machine_init() to overwrite the board's supplied
ppc_md.power_save().
We drop CPU_FTR_MAYBE_CAN_DOZE becuase we should use 'wait' instead on
e500mc.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The CPU_FTRS_POSSIBLE and CPU_FTRS_ALWAYS defines did not encompass
e5500 CPU features when built for 64-bit. This causes issues with
cpu_has_feature() as it utilizes the POSSIBLE & ALWAYS defines as part
of its check.
Create a unique CPU_FTRS_E5500 (as its different from CPU_FTRS_E500MC),
created a new group for 64-bit Book3e based CPUs and add CPU_FTRS_E5500
to that group.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current code soft-disables, and then goes to NAP mode which
turns interrupts on. That means that if an interrupt occurs, we
will hit the masked interrupt code path which isn't what we want,
as it will return with EE off, which will either get us out of
NAP mode, or fail to enter it (according to spec).
Instead, let's just rely on the fact that it is safe to take
decrementer interrupts on an offline CPU and leave interrupts
enabled. We can also get rid of the special case in asm for
power4_cpu_offline_powersave() and just use power4_idle().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use generic cpu_state, call idle_task_exit() properly, and
remove smp_core99_cpu_die() which isn't useful, the generic
function does the job just fine.
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...
Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.
We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
avr32: Fix missing irq namespace conversion
powerpc: qe_ic: Rename get_irq_desc_data and get_irq_desc_chip
genirq: Remove the now obsolete config options and select statements
arm: versatile : Fix typo introduced in irq namespace cleanup
sound: Fixup the last user of the old irq functions
genirq: Remove obsolete comment
genirq: Remove now obsolete set_irq_wake()
sh: Fix irq cleanup fallout
x86: apb_timer: Fixup genirq fallout
genirq: Fix misnamed label in handle_edge_eoi_irq
Fix up crazy conflict in arch/powerpc/include/asm/qe_ic.h:
- commit eead4d5c63 ("powerpc: qe_ic: Rename get_irq_desc_data and
get_irq_desc_chip") made the helper functions use
irq_desc_get_handler_data() instead of the legacy (and no longer
existing) get_irq_desc_data.
- commit d4db35e8dc ("powerpc/qe_ic: Fix another breakage from the
irq_data conversion") used irq_desc_get_chip_data() instead.
According to Thomas, the former is the correct direct conversion, but it
does look like both should work (arch/powerpc/sysdev/qe_lib/qe_ic.c
seems to initialize both to the same thing), and the chip data in some
ways is the more logical. Somebody should really decide on one of the
other.
This merge picks irq_desc_get_handler_data() as the straightforward pure
conversion to new names, as per Thomas.
These two functions disappeared in commit
0c6f8a8b91
"genirq: Remove compat code"
but they still exist in qe_ic.h.
This patch renames the function to their new names.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Lennert Buytenhek <buytenh@secretlab.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LKML-Reference: <20110330132504.GA31832@riccoc20.at.omicron.at>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Recent upstream builds with allmodconfig fail due to lack of space
between 0x3000 and 0x6000. We have a hard block at 0x7000 but we can
spare a page by moving the STAB0 from 0x6000 to 0x8000.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These syscalls have been added recently:
name_to_handle_at
open_by_handle_at
clock_adjtime
syncfs
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
339 is the SPR number for MAS5 documented by Power ISA 2.06, and
implemented by e500mc. It is not yet used anywhere in the kernel,
so nothing should be relying on the wrong number.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
pfns are unsigned long, but MEMORY_START is phys_addr_t. This leads
to page_to_pfn() returning phys_addr_t, and thus type mismatches in a few
print statements.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is used by Alsa to mmap buffers allocated with dma_alloc_coherent()
into userspace. We need a special variant to handle machines with
non-coherent DMAs as those buffers have "special" virt addresses and
require non-cachable mappings
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There is no user now.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: David Miller <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
minix bit operations are only used by minix filesystem and useless by
other modules. Because byte order of inode and block bitmaps is different
on each architecture like below:
m68k:
big-endian 16bit indexed bitmaps
h8300, microblaze, s390, sparc, m68knommu:
big-endian 32 or 64bit indexed bitmaps
m32r, mips, sh, xtensa:
big-endian 32 or 64bit indexed bitmaps for big-endian mode
little-endian bitmaps for little-endian mode
Others:
little-endian bitmaps
In order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.
CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa). The architectures which always use little-endian
bitmaps do not select these options.
Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself. Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce little-endian bit operations by renaming existing powerpc native
little-endian bit operations and changing them to take any pointer types.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes the little-endian bitops take any pointer types by changing the
prototypes and adding casts in the preprocessor macros.
That would seem to at least make all the filesystem code happier, and they
can continue to do just something like
#define ext2_set_bit __test_and_set_bit_le
(or whatever the exact sequence ends up being).
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All architectures can use the common dma_addr_t typedef now. We can
remove the arch specific dma_addr_t.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a node parameter to alloc_thread_info(), and change its name to
alloc_thread_info_node()
This change is needed to allow NUMA aware kthread_create_on_cpu()
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In some cases during a threaded core dump not all the threads will have
a full register set. This happens when the signal causing the core dump
races with a thread exiting. The race happens when the exiting thread
has entered the kernel for the last time before the signal arrives, but
doesn't get far enough through the exit code to avoid being included
in the core dump.
So we get a thread included in the core dump which is never going to go
out to userspace again and only has a partial register set recorded
Normally we would catch each thread as it is about to go into userspace
and capture the full register set then.
However, this exiting thread is never going to go out to userspace
again, so we have no way to capture its full register set. It doesn't
really matter, though, as this is a thread which is effectively
already dead.
So instead of hitting a BUG() in this case (a really bad choice of
action in the first place), we use a poison value for the register
values.
[BenH]: Some cosmetic/stylistic changes and fix build on ppc32
Signed-off-by: Mike Wolf <mjw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This property, defined in the Open PIC binding, tells the kernel not to use
the reset bit in the global configuration register. Additionally, its
presence mandates that only sources which are actually used (i.e. appear in
the device tree) should have their VECPRI bits initialized.
Although, "pic-no-reset" can be used for the same use cases that
"protected-sources" is covering, the "protected-sources" implementation was
left completely intact. This is a more pragmatic approach as there are
already several existing systems which use protected sources. If
"pic-no-reset" *and* "protected-sources" are both used, however, then
"pic-no-reset" takes precedence in terms of the init behavior and the
sanity checks done by protected sources will still take place.
Signed-off-by: Meador Inge <meador_inge@mentor.com>
Cc: Hollis Blanchard <hollis_blanchard@mentor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
trace, filters: Initialize the match variable in process_ops() properly
trace, documentation: Fix branch profiling location in debugfs
oprofile, s390: Cleanups
oprofile, s390: Remove hwsampler_files.c and merge it into init.c
perf: Fix tear-down of inherited group events
perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
perf: Handle stopped state with tracepoints
perf: Fix the software events state check
perf, powerpc: Handle events that raise an exception without overflowing
perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
perf, x86: Clean up SandyBridge PEBS events
perf lock: Fix sorting by wait_min
perf tools: Version incorrect with some versions of grep
perf evlist: New command to list the names of events present in a perf.data file
perf script: Add support for H/W and S/W events
perf script: Add support for dumping symbols
perf script: Support custom field selection for output
perf script: Move printing of 'common' data from print_event and rename
perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
perf script: Change process_event prototype
...
* 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
pch_uart: reference clock on CM-iTC
pch_phub: add new device ML7213
n_gsm: fix UIH control byte : P bit should be 0
n_gsm: add a documentation
serial: msm_serial_hs: Add MSM high speed UART driver
tty_audit: fix tty_audit_add_data live lock on audit disabled
tty: move cd1865.h to drivers/staging/tty/
Staging: tty: fix build with epca.c driver
pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
Staging: generic_serial: fix double locking bug
nozomi: don't use flush_scheduled_work()
tty/serial: Relax the device_type restriction from of_serial
MAINTAINERS: Update HVC file patterns
tty: phase out of ioctl file pointer for tty3270 as well
tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
pch_uart: Fix DMA channel miss-setting issue.
pch_uart: fix exclusive access issue
pch_uart: fix auto flow control miss-setting issue
pch_uart: fix uart clock setting issue
pch_uart : Use dev_xxx not pr_xxx
...
Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
twice, then changes to the same area in one branch)
Events on POWER7 can roll back if a speculative event doesn't
eventually complete. Unfortunately in some rare cases they will
raise a performance monitor exception. We need to catch this to
ensure we reset the PMC. In all cases the PMC will be 256 or less
cycles from overflow.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org> # as far back as it applies cleanly
LKML-Reference: <20110309143842.6c22845e@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
x86: Clean up apic.c and apic.h
x86: Remove superflous goal definition of tsc_sync
x86: dt: Correct local apic documentation in device tree bindings
x86: dt: Cleanup local apic setup
x86: dt: Fix OLPC=y/INTEL_CE=n build
rtc: cmos: Add OF bindings
x86: ce4100: Use OF to setup devices
x86: ioapic: Add OF bindings for IO_APIC
x86: dtb: Add generic bus probe
x86: dtb: Add support for PCI devices backed by dtb nodes
x86: dtb: Add device tree support for HPET
x86: dtb: Add early parsing of IO_APIC
x86: dtb: Add irq domain abstraction
x86: dtb: Add a device tree for CE4100
x86: Add device tree support
x86: e820: Remove conditional early mapping in parse_e820_ext
x86: OLPC: Make OLPC=n build again
x86: OLPC: Remove extra OLPC_OPENFIRMWARE_DT indirection
x86: OLPC: Cleanup config maze completely
x86: OLPC: Hide OLPC_OPENFIRMWARE config switch
...
Fix up conflicts in arch/x86/platform/ce4100/ce4100.c
* 'core-futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
arm: Remove bogus comment in futex_atomic_cmpxchg_inatomic()
futex: Deobfuscate handle_futex_death()
plist: Add priority list test
plist: Shrink struct plist_head
futex,plist: Remove debug lock assignment from plist_node
futex,plist: Pass the real head of the priority list to plist_del()
futex: Sanitize futex ops argument types
futex: Sanitize cmpxchg_futex_value_locked API
futex: Remove redundant pagefault_disable in futex_atomic_cmpxchg_inatomic()
futex: Avoid redudant evaluation of task_pid_vnr()
futex: Update futex_wait_setup comments about locking
This erratum can occur if a single-precision floating-point,
double-precision floating-point or vector floating-point instruction on a
mispredicted branch path signals one of the floating-point data interrupts
which are enabled by the SPEFSCR (FINVE, FDBZE, FUNFE or FOVFE bits). This
interrupt must be recorded in a one-cycle window when the misprediction is
resolved. If this extremely rare event should occur, the result could be:
The SPE Data Exception from the mispredicted path may be reported
erroneously if a single-precision floating-point, double-precision
floating-point or vector floating-point instruction is the second
instruction on the correct branch path.
According to errata description, some efp instructions which are not
supposed to trigger SPE exceptions can trigger the exceptions in this case.
However, as we haven't emulated these instructions here, a signal will
send to userspace, and userspace application would exit.
This patch re-issue the efp instruction that we haven't emulated,
so that hardware can properly execute it again if this case happen.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Change futex_atomic_op_inuser and futex_atomic_cmpxchg_inatomic
prototypes to use u32 types for the futex as this is the data type the
futex core code uses all over the place.
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311025058.GD26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The cmpxchg_futex_value_locked API was funny in that it returned either
the original, user-exposed futex value OR an error code such as -EFAULT.
This was confusing at best, and could be a source of livelocks in places
that retry the cmpxchg_futex_value_locked after trying to fix the issue
by running fault_in_user_writeable().
This change makes the cmpxchg_futex_value_locked API more similar to the
get_futex_value_locked one, returning an error code and updating the
original value through a reference argument.
Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile]
Acked-by: Tony Luck <tony.luck@intel.com> [ia64]
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michal Simek <monstr@monstr.eu> [microblaze]
Acked-by: David Howells <dhowells@redhat.com> [frv]
Cc: Darren Hart <darren@dvhart.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20110311024851.GC26122@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The combination of commit
8154c5d22d and
93c22703ef
Broke boot on iSeries.
The problem is that iSeries very early boot code, which generates
the device-tree and runs before our normal early initializations
does need access the lppaca's very early, before the PACA array is
initialized, and in fact even before the boot PACA has been
initialized (it contains all 0's at this stage).
However, the first patch above makes that code use the new
llpaca_of(cpu) accessor, which itself is changed by the second patch to
use the PACA array.
We fix that by reverting iSeries to directly dereferencing the array. In
addition, we fix all iterators in the iSeries code to always skip CPU
whose number is above 63 which is the maximum size of that array and
the maximum number of supported CPUs on these machines.
Additionally, we make sure the boot_paca is properly initialized
in our early startup code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move SPRN_PID declearations in various locations into one place.
Signed-off-by: Tseng-Hui (Frank) Lin <thlin@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Adapt the functions used to create and write to the RTAS-log partition
to work with any OS-type partition.
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
A number of drivers are using pgprot_writecombine() to enable write
combining on userspace mappings. Implement it on powerpc.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Define the ARCH_IRQ_INIT_FLAGS instead of fixing it up in a loop.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kyle Moffett points out that mpc85xx has started using the
ppc_md.machine_kexec hook. As such, revert patch c94868788c
(powerpc/kexec: Remove ppc_md.machine_kexec).
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reason: Import mainline device tree changes on which further patches
depend on or conflict.
Trivial conflict in: drivers/spi/pxa2xx_spi_pci.c
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This is useful for system management software so that it can kick
off things like gettys and everything that's started from a tty,
before we reuse it from/for something else or shut it down.
Without this ioctl it would have to temporarily become the owner of
the tty, then call vhangup() and then give it up again.
Cc: Lennart Poettering <lennart@poettering.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
max_mapnr is a pfn, not an index innto mem_map[]. So don't add
ARCH_PFN_OFFSET a second time.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Currently, ppc32 uses sysdata for the pci_controller pointer, and
ppc64 uses it to hold the device_node pointer. This patch moves the
of_node pointer into (struct pci_bus*)->dev.of_node and
(struct pci_dev*)->dev.of_node so that sysdata can be converted to always
use the pci_controller pointer instead. It also fixes up the
allocating of pci devices so that the of_node pointer gets assigned
consistently and increments the ref count.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
There is a tiny difference between PPC32 and PPC64. Microblaze uses the
PPC32 variant.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
[grant.likely@secretlab.ca: Added comment to #endif, moved documentation
block to function implementation, fixed for non ppc and microblaze
compiles]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
The DD2 core still has some unstability. Define CPU_FTR_476_DD2 to
enable workarounds in later patches.
This is based on an earlier, unreleased patch for DD1 by Ben Herrenschmidt.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
All architecture specific rwsem headers carry the same function
prototypes. Just x86 adds asmregparm, which is an empty define on all
other architectures. S390 has a stale rwsem_downgrade_write()
prototype.
Remove the duplicates and add the prototypes to linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Henderson <rth@twiddle.net>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.970840140@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Instead of having the same implementation in each architecture, move
it to linux/rwsem.h and remove the duplicates. It's unlikely that an
arch will ever implement something different, but we can deal with
that when it happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.876773757@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The rwsem initializers and related macros and functions are mostly the
same. Some of them lack the lockdep initializer, but having it in
place does not matter for architectures which do not support lockdep.
powerpc, sparc, x86: No functional change
sh, s390: Removes the duplicate init_rwsem (inline and #define)
alpha, ia64, xtensa: Use the lockdep capable init function in
lib/rwsem.c which is just uninlining the init
function for the LOCKDEP=n case
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.771812729@linutronix.de>
The difference between these declarations is the data type of the
count member and the lack of lockdep in some architectures/
long is equivivalent to signed long and the #ifdef guarded dep_map
member does not hurt anyone.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.679641914@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All rwsem implementations include the same headers. Include them from
include/linux/rwsem.h
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Matt Turner <mattst88@gmail.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
LKML-Reference: <20110126195833.483520950@linutronix.de>
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.
While I'm here I noticed "CPUSs reliabally"
so fix the spelling MISTAKESs reliabally.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_crash_shutdown, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
No one uses ppc_md.machine_kexec_cleanup, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move all the kexec handlers together.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
With one level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack.
Add a second stack when calling trace_hardirqs_on/off() otherwise
the following oops might occur:
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2 PA Semi PWRficient
last sysfs file: /sys/block/sda/size
Modules linked in: ohci_hcd ehci_hcd usbcore
NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440
REGS: c00000003e2f3af0 TRAP: 0300 Not tainted (2.6.37-rc6+)
MSR: 9000000000001032 <ME,IR,DR> CR: 48044444 XER: 20000000
DAR: 00000001ffb9db50, DSISR: 0000000040000000
TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1
GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8
GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000
GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000
GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008
GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950
GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000
GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38
GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70
NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0
LR [c0000000000034d4] decrementer_common+0xd4/0x100
Call Trace:
[c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable)
[c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100
Instruction dump:
81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000
80030000 e92a0000 eb6301f8 2f800000 <eb890010> 41fe00dc a06d000a eb1e8050
---[ end trace 4ec7fd2be9240928 ]---
Reported-by: Joerg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.
Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.
The only change is to add the ifgt block, but that effects the alignment
of the tabs and so the whole macro is modified.
Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>