Commit Graph

749 Commits

Author SHA1 Message Date
David S. Miller 87b385da1f [SPARC]: Add unique device_node IDs and a ".node" property.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:18:57 -07:00
David S. Miller fb7cd9d9ac [SPARC]: Add of_set_property() interface.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:18:36 -07:00
David S. Miller dda9beb414 [SPARC64]: Export auxio_register to modules.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:15:08 -07:00
David S. Miller 3505599615 [SPARC64]: Allow floppy driver to build modular.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:15:01 -07:00
David S. Miller 698539187a [SPARC]: Export x_bus_type to modules.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-25 23:14:59 -07:00
David S. Miller b5ba0740f8 [SPARC64]: Make auxio a real driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:16:21 -07:00
David S. Miller 576c352e89 [SBUS]: Rewrite and plug into of_device framework.
I severely apologize, I was still learning how to program
in C when I wrote this stuff 10 years ago...

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:50 -07:00
David S. Miller a2bd4fd179 [SPARC64]: Add of_device layer and make ebus/isa use it.
Sparcspkr and power drivers are converted, to make sure it works.
Eventually the SBUS device layer will use this as a sub-class.

I really cannot cut loose on that bit until sparc32 is given the
same infrastructure.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:43 -07:00
David S. Miller 8cd24ed4f8 [SPARC64]: Expand of_*() interfaces some more.
Import some more stuff from powerpc.

Add of_device_is_compatible(), and of_find_compatible_node().
Export some more of the other routines to modules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:41 -07:00
David S. Miller 92c4e22593 [SPARC64]: Kill unused local vars in map_prom_timers().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:38 -07:00
David S. Miller 25c7581bcd [SPARC64]: Kill off some more prom_getproperty() remnants.
The remaining ones occur before we have imported the
device tree.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:36 -07:00
David S. Miller 44bdef5e8f [SPARC64]: Convert Cheetah memory controller driver to in-kernel PROM tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:34 -07:00
David S. Miller cecc4e9222 [SPARC64]: Convert central bus layer to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:32 -07:00
David S. Miller 9c10a58ed6 [SPARC64]: Kill ebus/isa range and interrupt mapping struct members.
Unused outside of initial bus probe scan.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:30 -07:00
David S. Miller 690c8fd31f [SPARC64]: Use in-kernel PROM tree for EBUS and ISA.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:28 -07:00
David S. Miller de8d28b16f [SPARC64]: Convert sparc64 PCI layer to in-kernel device tree.
One thing this change pointed out was that we really should
pull the "get 'local-mac-address' property" logic into a helper
function all the network drivers can call.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:26 -07:00
David S. Miller 765b5f3273 [SPARC64]: Must run smp_setup_cpu_possible_map() after paging_init()
Otherwise the in-kernel PROM device tree isn't built yet,
and therefore the present cpu bits don't get set properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:23 -07:00
David S. Miller c2a5a46be4 [SPARC64]: Fix for Niagara memory corruption.
On some sun4v systems, after netboot the ethernet controller and it's
DMA mappings can be left active.  The net result is that the kernel
can end up using memory the ethernet controller will continue to DMA
into, resulting in corruption.

To deal with this, we are more careful about importing IOMMU
translations which OBP has left in the IO-TLB.  If the mapping maps
into an area the firmware claimed was free and available memory for
the kernel to use, we demap instead of import that IOMMU entry.

This is going to cause the network chip to take a PCI master abort on
the next DMA it attempts, if it has been left going like this.  All
tests show that this is handled properly by the PCI layer and the e1000
drivers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:21 -07:00
David S. Miller 07f8e5f358 [SPARC64]: Convert cpu_find_by_*() interface to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:17 -07:00
David S. Miller 6d307724cb [SPARC64]: Add of_getintprop_default().
This encodes a common idiomatic coding pattern used when
dealing with integer properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:15 -07:00
David S. Miller 6760d28bc6 [SPARC64]: Convert sun4v virtual-device layer to in-kernel PROM device tree.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:13 -07:00
David S. Miller 27cc64c7cc [SPARC64]: Rate limited kernel unaligned trap logging, ala IA64.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:11 -07:00
David S. Miller 20edac8ad4 [SPARC64]: Disable verbose PCI IRQ probing messages by default.
Allow them to be enabled with "pci=irq_verbose" on the
boot command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:09 -07:00
David S. Miller e87dc35020 [SPARC64]: Use in-kernel OBP device tree for PCI controller probing.
It can be pushed even further down, but this is a first step.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:07 -07:00
David S. Miller aaf7cec276 [SPARC64]: Add of_find_node_by_{name,type}().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:04 -07:00
David S. Miller 372b07bb5a [SPARC64]: Import OBP device tree into kernel data structures.
The basic framework is based on the PowerPC OF code.

This code even tries to get the device addressing components
correct in the full path names.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:02 -07:00
David S. Miller 8fae097deb [SBUS]: Start cleaning up generic sbus support layer.
In particular, move the IRQ probing out to sparc32/sparc64
arch specific code where it belongs.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-23 23:15:00 -07:00
David S. Miller c8bfcd95de [SPARC64]: Don't double-export synchronize_irq.
It is done by the generic IRQ layer now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:23:56 -07:00
David S. Miller e18e2a00ef [SPARC64]: Move over to GENERIC_HARDIRQS.
This is the long overdue conversion of sparc64 over to
the generic IRQ layer.

The kernel image is slightly larger, but the BSS is ~60K
smaller due to the reduced size of struct ino_bucket.

A lot of IRQ implementation details, including ino_bucket,
were moved out of asm-sparc64/irq.h and are now private to
arch/sparc64/kernel/irq.c, and most of the code in irq.c
totally disappeared.

One thing that's different at the moment is IRQ distribution,
we do it at enable_irq() time.  If the cpu mask is ALL then
we round-robin using a global rotating cpu counter, else
we pick the first cpu in the mask to support single cpu
targetting.  This is similar to what powerpc's XICS IRQ
support code does.

This works fine on my UP SB1000, and the SMP build goes
fine and runs on that machine, but lots of testing on
different setups is needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:23:32 -07:00
David S. Miller 8047e247c8 [SPARC64]: Virtualize IRQ numbers.
Inspired by PowerPC XICS interrupt support code.

All IRQs are virtualized in order to keep NR_IRQS from needing
to be too large.  Interrupts on sparc64 are arbitrary 11-bit
values, but we don't need to define NR_IRQS to 2048 if we
virtualize the IRQs.

As PCI and SBUS controller drivers build device IRQs, we divy
out virtual IRQ numbers incrementally starting at 1.  Zero is
a special virtual IRQ used for the timer interrupt.

So device drivers all see virtual IRQs, and all the normal
interfaces such as request_irq(), enable_irq(), etc. translate
that into a real IRQ number in order to configure the IRQ.

At this point knowledge of the struct ino_bucket is almost
entirely contained within arch/sparc64/kernel/irq.c  There are
a few small bits in the PCI controller drivers that need to
be swept away before we can remove ino_bucket's definition
out of asm-sparc64/irq.h and privately into kernel/irq.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:22:35 -07:00
David S. Miller 37cdcd9e82 [SPARC64]: Kill ino_bucket->pil
And reuse that struct member for virt_irq, which will
be used in future changesets for the implementation of
mapping between real and virtual IRQ numbers.

This nicely kills off a ton of SBUS and PCI controller
PIL assignment code which is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:21:57 -07:00
David S. Miller 6a76267f0e [SPARC64]: bp->pil can never be zero
Only pil0_dummy_bucket had a pil of zero and we just killed that
off, so we can delete all special case code that used bp->pil==0
as a way to identify a dummy bucket.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:20:30 -07:00
David S. Miller fd0504c321 [SPARC64]: Send all device interrupts via one PIL.
This is the first in a series of cleanups that will hopefully
allow a seamless attempt at using the generic IRQ handling
infrastructure in the Linux kernel.

Define PIL_DEVICE_IRQ and vector all device interrupts through
there.

Get rid of the ugly pil0_dummy_{bucket,desc}, instead vector
the timer interrupt directly to a specific handler since the
timer interrupt is the only event that will be signaled on
PIL 14.

The irq_worklist is now in the per-cpu trap_block[].

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-20 01:20:00 -07:00
David S. Miller ccefb5f3f6 [SPARC64]: Do not double-export sys_close() when CONFIG_SOLARIS_EMUL_MODULE
It is already exported by fs/open.c

Noticed by Ben Collins.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-11 21:05:25 -07:00
David S. Miller 9145bcf635 [SPARC64]: Set appropriate max_cache_size.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10 22:02:17 -07:00
David S. Miller 46b304934d [SPARC64]: Avoid JBUS errors on some Niagara systems.
Doing PCI config space accesses to non-present PCI slots
can result in fatal JBUS errors if the PCI config access
hypervisor call is performed on cpus other than the boot
cpu.

PCI config space accesses to present PCI slots works just
fine.

Recursively traverse the OBP device tree under the PCI
controller node and record all present device IDs into
a small hash table.

Avoid the hypervisor call for any PCI config space access
attempt for a device not recorded in the hash table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-10 01:06:25 -07:00
David S. Miller 5224e6cc3a [SPARC64]: Dump local cpu registers in sun4v_log_error()
This makes the debugging information more usable.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-06-09 12:03:49 -07:00
David S. Miller 951bc82c53 [SPARC64]: Make smp_processor_id() functional before start_kernel()
Uses of smp_processor_id() get pushed earlier and earlier in
the start_kernel() sequence.  So just get it working before
we call start_kernel() to avoid all possible problems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-31 01:24:02 -07:00
David S. Miller 42f142371e [SPARC64]: Respect gfp_t argument to dma_alloc_coherent().
Using asm-generic/dma-mapping.h does not work because pushing
the call down to pci_alloc_coherent() causes the gfp_t argument
of dma_alloc_coherent() to be ignored.

Fix this by implementing things directly, and adding a gfp_t
argument we can use in the internal call down to the PCI DMA
implementation of pci_alloc_coherent().

This fixes massive memory corruption when using the sound driver
layer, which passes things like __GFP_COMP down into these
routines and (correctly) expects that to work.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-23 02:07:22 -07:00
David S. Miller 353b28bafd [SPARC]: Add robust futex syscall entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-21 21:22:53 -07:00
David S. Miller 06a1be167e [SPARC]: Handle UNWIND_INFO properly.
For sparc32 we need R_SPARC_UA32 relocation support, for
sparc64 we need the handle R_SPARC_DISP32 relocations.

Based upon reports and initial patch by Martin Habets.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-12 12:45:50 -07:00
David S. Miller 8c45112b82 [SPARC]: Hook up vmsplice into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-05-03 13:55:46 -07:00
Al Viro 5411be59db [PATCH] drop task argument of audit_syscall_{entry,exit}
... it's always current, and that's a good thing - allows simpler locking.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2006-05-01 06:06:18 -04:00
Prasanna S Panchamukhi 07fab8da80 [PATCH] Switch Kprobes inline functions to __kprobes for sparc64
Andrew Morton pointed out that compiler might not inline the functions
marked for inline in kprobes.  There-by allowing the insertion of probes
on these kprobes routines, which might cause recursion.

This patch removes all such inline and adds them to kprobes section
there by disallowing probes on all such routines.  Some of the routines
can even still be inlined, since these routines gets executed after the
kprobes had done necessay setup for reentrancy.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-19 09:13:53 -07:00
David S. Miller 5fdfd42e3a [SPARC64]: Export pcibios_resource_to_bus().
SYM2 driver uses it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-17 13:34:44 -07:00
David S. Miller 5fdef39495 [SPARC]: Hook up sys_tee() into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-14 15:29:32 -07:00
Kyle McMartin 894b5779ce [PATCH] No arch-specific strpbrk implementations
While cleaning up parisc_ksyms.c earlier, I noticed that strpbrk wasn't
being exported from lib/string.c.  Investigating further, I noticed a
changeset that removed its export and added it to _ksyms.c on a few more
architectures.  The justification was that "other arches do it."

I think this is wrong, since no architecture currently defines
__HAVE_ARCH_STRPBRK, there's no reason for any of them to be exporting it
themselves.  Therefore, consolidate the export to lib/string.c.

Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:40 -07:00
KAMEZAWA Hiroyuki a283a52520 [PATCH] for_each_possible_cpu: sparc64
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu.
for sparc64.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-11 06:18:31 -07:00
David S. Miller aa1d1a0af6 [SPARC64]: smp_call_function() fixups...
1) Take doc-book function comment from i386 implementation.
2) cacheline align call_lock, taken from powerpc
3) Need memory barrier after setting call_data
4) Remove timeout

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:44 -07:00
David S. Miller 731bbe431f [SPARC64]: Translate PTRACE_GETEVENTMSG for 32-bit tasks.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:41 -07:00
David S. Miller 955c054f79 [SPARC64]: Print out return PC in cheetah_log_errors().
This makes debugging things a little bit easier.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:37 -07:00
David S. Miller 1759e58ed2 [SPARC64]: Add dummy PTRACE_PEEKUSR for gdb.
GDB uses a PTRACE_PEEKUSR call with offset 0 to see
if a thread is alive, so provide a success return for
this particular special case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-04-09 22:56:35 -07:00
David S. Miller 289eee6fa7 [SPARC]: Wire up sys_sync_file_range() into syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:49:34 -08:00
David S. Miller 1339713a32 [SPARC]: Wire up sys_splice() into the syscall tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:03:38 -08:00
David S. Miller 6f25f3986a [SPARC64]: Make tsb_sync() mm comparison more precise.
switch_mm() changes the mm state and does a tsb_context_switch()
first, then we do the cpu register state switch which changes
current_thread_info() and current().

So it's safer to check the PGD physical address stored in the
trap block (which will be updated by the tsb_context_switch() in
switch_mm()) than current->active_mm.

Technically we should never run here in between those two
updates, because interrupts are disabled during the entire
context switch operation.  But some day we might like to leave
interrupts enabled during the context switch and this change
allows that to happen without any surprises.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-31 23:03:34 -08:00
Matt Mackall 3dedf53bb1 [PATCH] RTC: Remove RTC UIP synchronization on Sparc64
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-28 09:16:00 -08:00
Alan Stern e041c68341 [PATCH] Notifier chain update: API changes
The kernel's implementation of notifier chains is unsafe.  There is no
protection against entries being added to or removed from a chain while the
chain is in use.  The issues were discussed in this thread:

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113018709002036&w=2

We noticed that notifier chains in the kernel fall into two basic usage
classes:

	"Blocking" chains are always called from a process context
	and the callout routines are allowed to sleep;

	"Atomic" chains can be called from an atomic context and
	the callout routines are not allowed to sleep.

We decided to codify this distinction and make it part of the API.  Therefore
this set of patches introduces three new, parallel APIs: one for blocking
notifiers, one for atomic notifiers, and one for "raw" notifiers (which is
really just the old API under a new name).  New kinds of data structures are
used for the heads of the chains, and new routines are defined for
registration, unregistration, and calling a chain.  The three APIs are
explained in include/linux/notifier.h and their implementation is in
kernel/sys.c.

With atomic and blocking chains, the implementation guarantees that the chain
links will not be corrupted and that chain callers will not get messed up by
entries being added or removed.  For raw chains the implementation provides no
guarantees at all; users of this API must provide their own protections.  (The
idea was that situations may come up where the assumptions of the atomic and
blocking APIs are not appropriate, so it should be possible for users to
handle these things in their own way.)

There are some limitations, which should not be too hard to live with.  For
atomic/blocking chains, registration and unregistration must always be done in
a process context since the chain is protected by a mutex/rwsem.  Also, a
callout routine for a non-raw chain must not try to register or unregister
entries on its own chain.  (This did happen in a couple of places and the code
had to be changed to avoid it.)

Since atomic chains may be called from within an NMI handler, they cannot use
spinlocks for synchronization.  Instead we use RCU.  The overhead falls almost
entirely in the unregister routine, which is okay since unregistration is much
less frequent that calling a chain.

Here is the list of chains that we adjusted and their classifications.  None
of them use the raw API, so for the moment it is only a placeholder.

  ATOMIC CHAINS
  -------------
arch/i386/kernel/traps.c:		i386die_chain
arch/ia64/kernel/traps.c:		ia64die_chain
arch/powerpc/kernel/traps.c:		powerpc_die_chain
arch/sparc64/kernel/traps.c:		sparc64die_chain
arch/x86_64/kernel/traps.c:		die_chain
drivers/char/ipmi/ipmi_si_intf.c:	xaction_notifier_list
kernel/panic.c:				panic_notifier_list
kernel/profile.c:			task_free_notifier
net/bluetooth/hci_core.c:		hci_notifier
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_chain
net/ipv4/netfilter/ip_conntrack_core.c:	ip_conntrack_expect_chain
net/ipv6/addrconf.c:			inet6addr_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_chain
net/netfilter/nf_conntrack_core.c:	nf_conntrack_expect_chain
net/netlink/af_netlink.c:		netlink_chain

  BLOCKING CHAINS
  ---------------
arch/powerpc/platforms/pseries/reconfig.c:	pSeries_reconfig_chain
arch/s390/kernel/process.c:		idle_chain
arch/x86_64/kernel/process.c		idle_notifier
drivers/base/memory.c:			memory_chain
drivers/cpufreq/cpufreq.c		cpufreq_policy_notifier_list
drivers/cpufreq/cpufreq.c		cpufreq_transition_notifier_list
drivers/macintosh/adb.c:		adb_client_list
drivers/macintosh/via-pmu.c		sleep_notifier_list
drivers/macintosh/via-pmu68k.c		sleep_notifier_list
drivers/macintosh/windfarm_core.c	wf_client_list
drivers/usb/core/notify.c		usb_notifier_list
drivers/video/fbmem.c			fb_notifier_list
kernel/cpu.c				cpu_chain
kernel/module.c				module_notify_list
kernel/profile.c			munmap_notifier
kernel/profile.c			task_exit_notifier
kernel/sys.c				reboot_notifier_list
net/core/dev.c				netdev_chain
net/decnet/dn_dev.c:			dnaddr_chain
net/ipv4/devinet.c:			inetaddr_chain

It's possible that some of these classifications are wrong.  If they are,
please let us know or submit a patch to fix them.  Note that any chain that
gets called very frequently should be atomic, because the rwsem read-locking
used for blocking chains is very likely to incur cache misses on SMP systems.
(However, if the chain's callout routines may sleep then the chain cannot be
atomic.)

The patch set was written by Alan Stern and Chandra Seetharaman, incorporating
material written by Keith Owens and suggestions from Paul McKenney and Andrew
Morton.

[jes@sgi.com: restructure the notifier chain initialization macros]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-27 08:44:50 -08:00
David S. Miller 5d5d7727a8 [SPARC64]: Kill duplicate exports of string library functions.
Kbuild now points these out.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-26 15:30:29 -08:00
Akinobu Mita 2d78d4beb6 [PATCH] bitops: sparc64: use generic bitops
- remove __{,test_and_}{set,clear,change}_bit() and test_bit()
- remove ffz()
- remove __ffs()
- remove generic_fls()
- remove generic_fls64()
- remove sched_find_first_bit()
- remove ffs()

- unless defined(ULTRA_HAS_POPULATION_COUNT)

  - remove generic_hweight{64,32,16,8}()

- remove find_{next,first}{,_zero}_bit()
- remove ext2_{set,clear,test,find_first_zero,find_next_zero}_bit()
- remove minix_{test,set,test_and_clear,test,find_first_zero}_bit()

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:14 -08:00
Prasanna S Panchamukhi b67000962f [PATCH] kprobes: fix broken fault handling for sparc64
Provide proper kprobes fault handling, if a user-specified pre/post handlers
tries to access user address space, through copy_from_user(), get_user() etc.

The user-specified fault handler gets called only if the fault occurs while
executing user-specified handlers.  In such a case user-specified handler is
allowed to fix it first, later if the user-specifed fault handler does not fix
it, we try to fix it by calling fix_exception().

The user-specified handler will not be called if the fault happens when single
stepping the original instruction, instead we reset the current probe and
allow the system page fault handler to fix it up.

I could not test this patch for sparc64.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:05 -08:00
bibo,mao 2326c77017 [PATCH] kprobe handler: discard user space trap
Currently kprobe handler traps only happen in kernel space, so function
kprobe_exceptions_notify should skip traps which happen in user space.
This patch modifies this, and it is based on 2.6.16-rc4.

Signed-off-by: bibo mao <bibo.mao@intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>
Cc: <hiramatu@sdl.hitachi.co.jp>
Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:57:04 -08:00
Stephen Rothwell 3158e9411a [PATCH] consolidate sys32/compat_adjtimex
Create compat_sys_adjtimex and use it an all appropriate places.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
Stephen Rothwell 88959ea968 [PATCH] create struct compat_timex and use it everywhere
We had a copy of the compatibility version of struct timex in each 64 bit
architecture.  This patch just creates a global one and replaces all the
usages of the old ones.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-26 08:56:57 -08:00
David S. Miller 7d3aee9a96 [SPARC64]: Keep cpu_present_map in sync with phys_cpu_present_map.
Don't rely on fixup_cpu_present_map() to do this as that function
is about to be removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-25 13:00:17 -08:00
Alexey Dobriyan 53b3531bbb [PATCH] s/;;/;/g
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:24 -08:00
Andrew Morton 394e3902c5 [PATCH] more for_each_cpu() conversions
When we stop allocating percpu memory for not-possible CPUs we must not touch
the percpu data for not-possible CPUs at all.  The correct way of doing this
is to test cpu_possible() or to use for_each_cpu().

This patch is a kernel-wide sweep of all instances of NR_CPUS.  I found very
few instances of this bug, if any.  But the patch converts lots of open-coded
test to use the preferred helper macros.

Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Kyle McMartin <kyle@parisc-linux.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Christian Zankel <chris@zankel.net>
Cc: Philippe Elie <phil.el@wanadoo.fr>
Cc: Nathan Scott <nathans@sgi.com>
Cc: Jens Axboe <axboe@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23 07:38:17 -08:00
David S. Miller dcc1e8dd88 [SPARC64]: Add a secondary TSB for hugepage mappings.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 01:15:14 -08:00
David S. Miller 14778d9072 [SPARC]: Respect vm_page_prot in io_remap_page_range().
Make sure the callers do a pgprot_noncached() on
vma->vm_page_prot.

Pointed out by Hugh Dickens.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-22 01:15:13 -08:00
Andrew Morton 467418f350 [SPARC64]: CONFIG_BLK_DEV_RAM fix
init/do_mounts_rd.c depends upon CONFIG_BLK_DEV_RAM, not CONFIG_BLK_DEV_INITRD.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:41 -08:00
David S. Miller bb8646d834 [SPARC64]: Optimized TSB table initialization.
We only need to write an invalid tag every 16 bytes,
so taking advantage of this can save many instructions
compared to the simple memset() call we make now.

A prefetching implementation is implemented for sun4u
and a block-init store version if implemented for Niagara.

The next trick is to be able to perform an init and
a copy_tsb() in parallel when growing a TSB table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:41 -08:00
David S. Miller 05f9ca8359 [SPARC64]: Randomize mm->mmap_base when PF_RANDOMIZE is set.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:37 -08:00
David S. Miller d61e16df94 [SPARC64]: Increase top of 32-bit process stack.
Put it one page below the top of the 32-bit address space.
This gives us ~16MB more address space to work with.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:36 -08:00
David S. Miller a91690ddd0 [SPARC64]: Top-down address space allocation for 32-bit tasks.
Currently allocations are very constrained for 32-bit processes.
It grows down-up from 0x70000000 to 0xf0000000 which gives about
2GB of stack + dynamic mmap() space.

So support the top-down method, and we need to override the
generic helper function in order to deal with D-cache coloring.

With these changes I was able to squeeze out a mmap() just over
3.6GB in size in a 32-bit process.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:35 -08:00
David S. Miller 7a1ac52641 [SPARC64]: Fix and re-enable dynamic TSB sizing.
This is good for up to %50 performance improvement of some test cases.
The problem has been the race conditions, and hopefully I've plugged
them all up here.

1) There was a serious race in switch_mm() wrt. lazy TLB
   switching to and from kernel threads.

   We could erroneously skip a tsb_context_switch() and thus
   use a stale TSB across a TSB grow event.

   There is a big comment now in that function describing
   exactly how it can happen.

2) All code paths that do something with the TSB need to be
   guarded with the mm->context.lock spinlock.  This makes
   page table flushing paths properly synchronize with both
   TSB growing and TLB context changes.

3) TSB growing events are moved to the end of successful fault
   processing.  Previously it was in update_mmu_cache() but
   that is deadlock prone.  At the end of do_sparc64_fault()
   we hold no spinlocks that could deadlock the TSB grow
   sequence.  We also have dropped the address space semaphore.

While we're here, add prefetching to the copy_tsb() routine
and put it in assembler into the tsb.S file.  This piece of
code is quite time critical.

There are some small negative side effects to this code which
can be improved upon.  In particular we grab the mm->context.lock
even for the tsb insert done by update_mmu_cache() now and that's
a bit excessive.  We can get rid of that locking, and the same
lock taking in flush_tsb_user(), by disabling PSTATE_IE around
the whole operation including the capturing of the tsb pointer
and tsb_nentries value.  That would work because anyone growing
the TSB won't free up the old TSB until all cpus respond to the
TSB change cross call.

I'm not quite so confident in that optimization to put it in
right now, but eventually we might be able to and the description
is here for reference.

This code seems very solid now.  It passes several parallel GCC
bootstrap builds, and our favorite "nut cruncher" stress test which is
a full "make -j8192" build of a "make allmodconfig" kernel.  That puts
about 256 processes on each cpu's run queue, makes lots of process cpu
migrations occur, causes lots of page table and TLB flushing activity,
incurs many context version number changes, and it swaps the machine
real far out to disk even though there is 16GB of ram on this test
system. :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:16:33 -08:00
David S. Miller 0c51ed93ca [SPARC64]: First cut at VIS simulator for Niagara.
Niagara does not implement some of the VIS instructions in
hardware, so we have to emulate them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:26 -08:00
David S. Miller 90a6646bf6 [SPARC64]: Fix system type in /proc/cpuinfo and remove bogus OBP check.
Report 'sun4v' when appropriate in /proc/cpuinfo

Remove all the verifications of the OBP version string.  Just
make sure it's there, and report it raw in the bootup logs and
via /proc/cpuinfo.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:25 -08:00
David S. Miller 8935dced54 [SPARC64]: Add SMT scheduling support for Niagara.
The mapping is a simple "(cpuid >> 2) == core" for now.
Later we'll add more sophisticated code that will walk
the sun4v machine description and figure this out from
there.

We should also add core mappings for jaguar and panther
processors.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:24 -08:00
David S. Miller d1112018b4 [SPARC64]: Move over to sparsemem.
This has been pending for a long time, and the fact
that we waste a ton of ram on some configurations
kind of pushed things over the edge.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:22 -08:00
David S. Miller ee29074d3b [SPARC64]: Fix new context version SMP handling.
Don't piggy back the SMP receive signal code to do the
context version change handling.

Instead allocate another fixed PIL number for this
asynchronous cross-call.  We can't use smp_call_function()
because this thing is invoked with interrupts disabled
and a few spinlocks held.

Also, fix smp_call_function_mask() to count "cpus" correctly.
There is no guarentee that the local cpu is in the mask
yet that is exactly what this code was assuming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:21 -08:00
Eric Sesterhenn 9132983ae1 [SPARC64]: kzalloc() conversion
this patch converts arch/sparc64 to kzalloc usage.
Crosscompile tested with allyesconfig.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:19 -08:00
David S. Miller 74ae998772 [SPARC64]: Simplify TSB insert checks.
Don't try to avoid putting non-base page sized entries
into the user TSB.  It actually costs us more to check
this than it helps.

Eventually we'll have a multiple TSB scheme for user
processes.  Once a process starts using larger pages,
we'll allocate and use such a TSB.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:18 -08:00
David S. Miller 3cab0c3e86 [SPARC64]: More SUN4V cpu mondo bug fixing.
This cpu mondo sending interface isn't all that easy to
use correctly...

We were clearing out the wrong bits from the "mask" after getting
something other than EOK from the hypervisor.

It turns out the hypervisor can just be resent the same cpu_list[]
array, with the 0xffff "done" entries still in there, and it will do
the right thing.

So don't update or try to rebuild the cpu_list[] array to condense it.

This requires the "forward_progress" check to be done slightly
differently, but this new scheme is less bug prone than what we were
doing before.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:17 -08:00
David S. Miller bcc28ee0bf [SPARC64]: Fix sun4v mna winfixup handling.
We were clobbering a base register before we were done
using it.  Fix a comment typo while we're here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:16 -08:00
David S. Miller c4f8ef77f9 [SPARC64]: Fix mini RTC driver reading.
Need to subtract 1900 from year and 1 from month before
giving it back to userspace.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:15 -08:00
David S. Miller 8bcd174116 [SPARC64]: Do not allow mapping pages within 4GB of 64-bit VA hole.
The UltraSPARC T1 manual recommends this because the chip
could instruction prefetch into the VA hole, and this would
also make decoding  certain kinds of memory access traps
more difficult (because the chip sign extends certain pieces
of trap state).

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:14 -08:00
David S. Miller 45f791eb0f [SPARC64]: Fix _PAGE_EXEC handling.
First of all, use the known _PAGE_EXEC_{4U,4V} value instead
of loading _PAGE_EXEC from memory.  We either know which one
to use by context, or we can code patch the test.

Next, we need to check executability of a PTE in the generic
TSB miss handler.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:13 -08:00
David S. Miller 92daa77e9a [SPARC64]: Fix typo in SUN4V D-TLB miss handler.
Should put FAULT_CODE_DTLB into %g3 not FAULT_CODE_ITLB.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:12 -08:00
David S. Miller 8ba706a95b [SPARC64]: Add mini-RTC driver for Starfire and SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:10 -08:00
David S. Miller b830ab665a [SPARC64]: Fix bugs in SUN4V cpu mondo dispatch.
There were several bugs in the SUN4V cpu mondo dispatch code.

In fact, if we ever got a EWOULDBLOCK or other error from
the hypervisor call, we'd potentially send a cpu mondo multiple
times to the same cpu and even worse we could loop until the
timeout resending the same mondo over and over to such cpus.

So let's bulletproof this thing as follows:

1) Implement cpu_mondo_send() and cpu_state() hypervisor calls
   in arch/sparc64/kernel/entry.S, add prototypes to asm/hypervisor.h

2) Don't build and update the cpulist using inline functions, this
   was causing the cpu mask to not get updated in the caller.

3) Disable interrupts during the entire mondo send, otherwise our
   cpu list and/or mondo block could get overwritten if we take
   an interrupt and do a cpu mondo send on the current cpu.

4) Check for all possible error return types from the cpu_mondo_send()
   hypervisor call.  In particular:

   HV_EOK) Our work is done, all cpus have received the mondo.
   HV_CPUERROR) One or more of the cpus in the cpu list we passed
                to the hypervisor are in error state.  Use cpu_state()
                calls over the entries in the cpu list to see which
		ones.  Record them in "error_mask" and report this
		after we are done sending the mondo to cpus which are
		not in error state.
   HV_EWOULDBLOCK) We need to keep trying.

   Any other error we consider fatal, we report the event and exit
   immediately.

5) We only timeout if forward progress is not made.  Forward progress
   is defined as having at least one cpu get the mondo successfully
   in a given cpu_mondo_send() call.  Otherwise we bump a counter
   and delay a little.  If the counter hits a limit, we signal an
   error and report the event.

Also, smp_call_function_mask() error handling reports the number
of cpus incorrectly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:09 -08:00
David S. Miller aac0aadf09 [SPARC64]: Fix bugs in SMP TLB context version expiration handling.
1) We must flush the TLB, duh.

2) Even if the sw context was seen to be valid, the local cpu's
   hw context can be out of date, so reload it unconditionally.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:08 -08:00
David S. Miller 6889331a12 [SPARC64]: Fix indexing into kpte_linear_bitmap.
Need to shift back up by 3 bits to get 8-byte entry
index.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:07 -08:00
David S. Miller 2a3a5f5ddb [SPARC64]: Bulletproof hypervisor TLB flushing.
Check TLB flush hypervisor calls for errors and report them.

Pass HV_MMU_ALL always for now, we can add back the optimization
to avoid the I-TLB flush later.

Always explicitly page align the virtual address arguments.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:05 -08:00
David S. Miller 6cc80cfab8 [SPARC64]: Report mondo error correctly in hypervisor_xcall_deliver().
It's in "arg0" not "func".

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:04 -08:00
David S. Miller 3634476239 [SPARC64]: Niagara optimized XOR functions for RAID.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:03 -08:00
Andrew Morton c4e9249b19 [SPARC64]: Fix binfmt_aout32.c build.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:02 -08:00
David S. Miller a0663a79ad [SPARC64]: Fix TLB context allocation with SMT style shared TLBs.
The context allocation scheme we use depends upon there being a 1<-->1
mapping from cpu to physical TLB for correctness.  Chips like Niagara
break this assumption.

So what we do is notify all cpus with a cross call when the context
version number changes, and if necessary this makes them allocate
a valid context for the address space they are running at the time.

Stress tested with make -j1024, make -j2048, and make -j4096 kernel
builds on a 32-strand, 8 core, T2000 with 16GB of ram.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:14:00 -08:00
David S. Miller 074d82cf68 [SPARC64]: Put syscall tables after trap table.
Otherwise with too much stuff enabled in the kernel config
we can end up with an unaligned trap table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:59 -08:00
David S. Miller fc50492867 [SPARC64]: Drop %gl to 0 before re-enabling PSTATE_IE in rtrap
If we take a window fault, on SUN4V set %gl to zero before we
turn PSTATE_IE back on in %pstate.  Otherwise if we take an
interrupt we'll end up with corrupt register state.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:57 -08:00
David S. Miller d7744a0950 [SPARC64]: Create a seperate kernel TSB for 4MB/256MB mappings.
It can map all of the linear kernel mappings with zero TSB hash
conflicts for systems with 16GB or less ram.  In such cases, on
SUN4V, once we load up this TSB the first time with all the
mappings, we never take a linear kernel mapping TLB miss ever
again, the hypervisor handles them all.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:56 -08:00
David S. Miller 9cc3a1ac9a [SPARC64]: Make use of Niagara 256MB PTEs for kernel mappings.
We use a bitmap, one bit for every 256MB of memory.  If the
bit is set we can use a 256MB PTE for linear mappings, else
we have to use a 4MB PTE.

SUN4V support is there, and we can very easily add support
for Panther cpu 256MB PTEs in the future.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:55 -08:00
David S. Miller 30c91d576e [SPARC64]: Use sun4v_cpu_idle() in cpu_idle() on SUN4V.
We have to turn off the "polling nrflag" bit when we sleep
the cpu like this, so that we'll get a cross-cpu interrupt
to wake the processor up from the yield.

We also have to disable PSTATE_IE in %pstate around the yield
call and recheck need_resched() in order to avoid any races.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:54 -08:00
David S. Miller 6f5374c91f [SPARC64]: Add sun4v_cpu_yield().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:52 -08:00
David S. Miller 1bd0cd74d1 [SPARC64]: Kill cpudata->idle_volume.
Set, but never used.

We used to use this for dynamic IRQ retargetting, but that
code died a long time ago.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:51 -08:00
David S. Miller 8ca2557c48 [SPARC64]: Niagara optimized memset/bzero/clear_user.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:50 -08:00
David S. Miller d371c0c174 [SPARC64]: Pass multiple CPUs at once to hypervisor cross-call API.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:49 -08:00
David S. Miller 55555633bd [SPARC64]: Typo in sun4v_data_access_exception log message.
Should be "Dax" not "Iax".

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:46 -08:00
David S. Miller d82965c167 [SPARC64]: Handle zero-length map requests in pci_sun4v.c
By simply changing the do-while loop into a plain
while loop.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:45 -08:00
David S. Miller abf3b7bd89 [SPARC64]: Kill stray PGLIST_NENTS check in pci_sun4v.c
I forgot to remove the one in pci_4v_map_sg() during the
iommu batching commit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:44 -08:00
David S. Miller 39334a4b2c [SPARC64]: Fix typo in dump_tl1_traplog()
Actually make use of the 'limit' we compute.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:43 -08:00
David S. Miller 37133c006c [SPARC64]: Disable smp_report_regs() for now.
It's extremely noisy and causes much grief on slow
consoles with large numbers of cpus.

We'll have to provide this some saner way in order
to re-enable this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:42 -08:00
David S. Miller 6a32fd4d0d [SPARC64]: Remove PGLIST_NENTS PCI IOMMU mapping limitation on SUN4V.
Use a batching queue system for IOMMU mapping setup,
with a page sized batch.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:41 -08:00
David S. Miller 04d74758eb [SPARC64]: Use KERN_EMERG in dump_tl1_traplog() and sun4v TLB errors.
We're about to seriously die in these cases so it is important
that the messages make it to the console.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:40 -08:00
David S. Miller 24c523ecc6 [SPARC64]: Fix unaligned access winfxup handling on SUN4V.
Another case where we have to force ourselves into global register
level one.  Also make sure the arguments passed to sun4v_do_mna() are
correct.

This area actually needs some more work, for example spill fixup is
not necessarily going to do the right thing for this case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:39 -08:00
David S. Miller 6cc200db95 [SPARC64]: Set %gl to 1 in kvmap_itlb_longpath on SUN4V.
Just like kvmap_dtlb_longpath we have to force the
global register level to one in order to mimick the
PSTATE_MG --> PSTATE_AG trasition done on SUN4U.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:39 -08:00
David S. Miller 8b23427441 [SPARC64]: More TLB/TSB handling fixes.
The SUN4V convention with non-shared TSBs is that the context
bit of the TAG is clear.  So we have to choose an "invalid"
bit and initialize new TSBs appropriately.  Otherwise a zero
TAG looks "valid".

Make sure, for the window fixup cases, that we use the right
global registers and that we don't potentially trample on
the live global registers in etrap/rtrap handling (%g2 and
%g6) and that we put the missing virtual address properly
in %g5.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:34 -08:00
David S. Miller 7adb37fe80 [SPARC64]: Don't do anything in flush_ptrace_access() on SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:33 -08:00
David S. Miller 6c8927c963 [SPARC64]: Fix some SUN4V TLB handling bugs.
1) Add error return checking for TLB load hypervisor
   calls.

2) Don't fallthru to dtlb tsb miss handler from itlb tsb
   miss handler, oops.

3) On window fixups, propagate fault information to fixup
   handler correctly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:32 -08:00
David S. Miller 52845cdb3b [SPARC64]: Init boot cpu's trap_block[] before paging_init()
It must be ready when we take over the trap table.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:30 -08:00
David S. Miller 3763be32d5 [SPARC64]: Define ARCH_HAS_READ_CURRENT_TIMER.
This gives more consistent bogomips and delay() semantics,
especially on sun4v.  It gives weird looking values though...

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:29 -08:00
David S. Miller c857e3fdbc [SPARC64]: __bzero_noasi --> __clear_user
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:28 -08:00
David S. Miller 46f8604714 [SPARC64]: Put SUN4V ITSB miss into correct trap table entry.
It's 0x9 not 0xb.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:27 -08:00
David S. Miller ebd8c56c5a [SPARC64]: Fix uniprocessor IRQ targetting on SUN4V.
We need to use the real hardware processor ID when
targetting interrupts, not the "define to 0" thing
the uniprocessor build gives us.

Also, fill in the Node-ID and Agent-ID fields properly
on sun4u/Safari.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:24 -08:00
David S. Miller 101d5c18a9 [SPARC64]: Fix PCI IRQ probing regression.
If the top-level cnode had multi entries in it's "reg"
property, we'd fail.  The buffer wasn't large enough in
such cases.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:23 -08:00
David S. Miller 72aff53f1f [SPARC64]: Get SUN4V SMP working.
The sibling cpu bringup is extremely fragile.  We can only
perform the most basic calls until we take over the trap
table from the firmware/hypervisor on the new cpu.

This means no accesses to %g4, %g5, %g6 since those can't be
TLB translated without our trap handlers.

In order to achieve this:

1) Change sun4v_init_mondo_queues() so that it can operate in
   several modes.

   It can allocate the queues, or install them in the current
   processor, or both.

   The boot cpu does both in it's call early on.

   Later, the boot cpu allocates the sibling cpu queue, starts
   the sibling cpu, then the sibling cpu loads them in.

2) init_cur_cpu_trap() is changed to take the current_thread_info()
   as an argument instead of reading %g6 directly on the current
   cpu.

3) Create a trampoline stack for the sibling cpus.  We do our basic
   kernel calls using this stack, which is locked into the kernel
   image, then go to our proper thread stack after taking over the
   trap table.

4) While we are in this delicate startup state, we put 0xdeadbeef
   into %g4/%g5/%g6 in order to catch accidental accesses.

5) On the final prom_set_trap_table*() call, we put &init_thread_union
   into %g6.  This is a hack to make prom_world(0) work.  All that
   wants to do is restore the %asi register using
   get_thread_current_ds().

Longer term we should just do the OBP calls to set the trap table by
hand just like we do for everything else.  This would avoid that silly
prom_world(0) issue, then we can remove the init_thread_union hack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:22 -08:00
David S. Miller 19a0d585e8 [SPARC64]: Disable smp_report_regs() for now.
For 32 cpus and a slow console, it just wedges the
machine especially with DETECT_SOFTLOCKUP enabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:21 -08:00
David S. Miller 6154f94f0e [SPARC64]: Rewrite pci_intmap_match().
The whole algorithm was wrong.  What we need to do is:

1) Walk each PCI bus above this device on the path to the
   PCI controller nexus, and for each:
      a) If interrupt-map exists, apply it, record IRQ controller node
      b) Else, swivel interrupt number using PCI_SLOT(), use PCI bus
	 parent OBP node as controller node
      c) Walk up to "controller node" until we hit the first PCI bus
	 in this domain, or "controller node" is the PCI controller
	 OBP node
2) If we walked to PCI controller OBP node, we're done.
3) Else, apply PCI controller interrupt-map to interrupt.

There is some stuff that needs to be checked out for ebus and
isa, but the PCI part is good to go.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:20 -08:00
David S. Miller 14f6689cbb [SPARC64]: Don't set interrupt state to IDLE in enable_irq().
We'll lose events that way.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:20 -08:00
David S. Miller af02bec662 [SPARC64]: Fix return from trap on SUN4V.
We need to set the global register set _AND_ disable
PSTATE_IE in %pstate.  The original patch sequence was
leaving PSTATE_IE enabled when returning to kernel mode,
oops.

This fixes the random register corruption being seen
on SUN4V.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:19 -08:00
David S. Miller 22780e23c6 [SPARC64]: Set dummy bucket->{imap,iclr} unique on SUN4V.
So that free_irq() disable's the IRQ correctly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:17 -08:00
David S. Miller 94f8762db9 [SPARC64]: Add sun4v_cpu_qconf() hypervisor call.
Call it from register_one_mondo().

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:16 -08:00
David S. Miller 8e42550c68 [SPARC64]: do_fptrap needs to load the thread reg into %g6.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:14 -08:00
David S. Miller 9b6b46470c [SPARC64]: Fix bogus call to sun4v_mna in winfixup code.
The C function is named sun4v_do_mna not sun4v_mna.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:13 -08:00
David S. Miller 3d6395cb77 [SPARC64]: Fix tl1 trap state capture/dump on SUN4V.
No trap levels above 2 in privileged mode on SUN4V.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:12 -08:00
David S. Miller e7a0453ef8 [SPARC64] PCI: Size TSB correctly on SUN4V.
Forgot to multiply by 8 * 1024, oops.  Correct the size constant when
the virtual-dma arena is 2GB in size, it should bet 256 not 128.

Finally, log some info about the TSB at probe time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:10 -08:00
David S. Miller c7f81d42d3 [SPARC64]: Don't use ASI_QUAD_LDD_PHYS on SUN4V.
Need to use ASI_QUAD_LDD_PHYS_4V instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:09 -08:00
David S. Miller a7b31bac69 [SPARC64]: Do not write garbage into %pstate in tsb_context_switch().
For SUN4V, we were clobbering %o5 to do the hypervisor call.
This clobbers the saved %pstate value and we end up writing
garbage into that register as a result.  Oops.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:08 -08:00
David S. Miller 9d29a3fafd [SPARC64]: Decode virtual-devices interrupts correctly.
Need to translate through the interrupt-map{,-mask] properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:05 -08:00
David S. Miller 7890f794e0 [SPARC64]: Add prom_{start,stop}cpu_cpuid().
Use prom_startcpu_cpuid() on SUN4V instead of prom_startcpu().

We should really test for "SUNW,start-cpu-by-cpuid" presence
and use it if present even on SUN4U.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:04 -08:00
David S. Miller 63c2a0e598 [SPARC64]: Fix pci_intmap_match().
When crawling up the PCI bus chain, stop at the first node
that has an interrupt-map property before we hit the root.

Also, if we use a bus interrupt-{map,mask} do not forget to
update the 'intmask' pointer as we do for the 'intmap' pointer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:03 -08:00
David S. Miller ab66a50e31 [SPARC64]: Two IRQ handling fixes.
On SUN4V, force IRQ state to idle in enable_irq().  However,
I'm still not sure this is %100 correct.

Call add_interrupt_randomness() on SUN4V too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:02 -08:00
David S. Miller f03b8a5468 [SPARC64]: Use different cache sizing defaults on SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:01 -08:00
David S. Miller 329c68b218 [SPARC64]: Make lack of interrupt-map-* a fatal error on SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:13:00 -08:00
David S. Miller abd92b2d21 [SPARC64]: Fix sun4v_intr_setenabled() return value check in enable_irq().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:59 -08:00
David S. Miller 355db99860 [SPARC64]: Explicitly init *nregs to 0 in find_device_prom_node().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:58 -08:00
David S. Miller 987b6de710 [SPARC64]: Restrict PCI bus scanning on SUN4V.
On the PBM's first bus number, only allow device 0, function 0, to be
poked at with PCI config space accesses.

For some reason, this single device responds to all device numbers.

Also, reduce the verbiage of the debugging log printk's for PCI cfg
space accesses in the SUN4V PCI controller driver, so that it doesn't
overwhelm the slow SUN4V hypervisor console.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:57 -08:00
David S. Miller 9f8a5b843f [SPARC64]: Fix C-function name called by sun4v_mna trap code.
The trap code was calling itself :-)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:56 -08:00
David S. Miller fbf1c68eaf [SPARC64]: Don't printk() any messaages in sun4v_build_irq().
It just clutters up the log.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:55 -08:00
David S. Miller e7093703d9 [SPARC64]: INO is never fully specified already on SUN4V.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:54 -08:00
David S. Miller 4a07e646c5 [SPARC64]: Kill sun4v_register_fault_status() on SMP.
That now gets done as a side effect of taking over the
trap table from OBP.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:54 -08:00
David S. Miller 3af6e01e9a [SPARC64]: arch/sparc64/kernel/trampoline.S needs asm/cpudata.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:52 -08:00
David S. Miller c4bea28839 [SPARC64]: Make error codes available from sun4v_intr_get*().
And check for errors at call sites.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:51 -08:00
David S. Miller 4bf447d6f7 [SPARC64]: Pass correct ino to sun4v_intr_*().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:50 -08:00
David S. Miller a615fea48b [SPARC64]: Use TRAP_LOAD_IRQ_WORK() in sun4v device mondo handler.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:49 -08:00
David S. Miller 7c8f486ae7 [SPARC64]: Fix IOMMU mapping on sun4v.
We should dynamically allocate the per-cpu pglist not use
an in-kernel-image datum, since __pa() does not work on
such addresses.

Also, consistently use "u32" for devhandle.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:48 -08:00
David S. Miller 87bdc367ca [SPARC64]: Trim down sun4v IRQ translation kernel log message.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:47 -08:00
David S. Miller d5a2aa241a [SPARC64] sunhv: Bug fixes.
Add udelay to polling console write loop, and increment
the loop limit.

Name the device "ttyHV" and pass that to add_preferred_console()
when we're using hypervisor console.

Kill sunhv_console_setup(), it's empty.

Handle the case where we don't want to use hypervisor console.
(ie. we have a head attached to a sun4v machine)

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:46 -08:00
David S. Miller e77227eb4e [SPARC64]: Probe virtual-devices root node on sun4v.
This is where we learn how to get the interrupts
for things like the hypervisor console device.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:44 -08:00
David S. Miller d5eb400430 [SPARC64]: Kill spurious semicolon in sun4v_pci_init().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:43 -08:00
David S. Miller 10951ee610 [SPARC64]: Program IRQ registers correctly on sun4v.
Need to use hypervisor calls instead of direct register
accesses.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:41 -08:00
David S. Miller e3999574b4 [SPARC64]: Generic sun4v_build_irq().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:40 -08:00
David S. Miller 10804828fd [SPARC64]: More SUN4V PCI work.
Get bus range from child of PCI controller root nexus.
This is actually a hack, but the PCI-E bridge sitting
at the top of the PCI tree responds to PCI config cycles
for every device number, so best to just ignore it for now.

Preliminary PCI irq routing, needs lots of work.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:39 -08:00
David S. Miller 6c0f402f6c [SPARC64]: Implement rest of generic interrupt hypervisor calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:37 -08:00
David S. Miller 85dfa19ba9 [SPARC64]: Move devino_to_sysino out of pci_sun4v_asm.S
It is not PCI specific, it is for all system interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:36 -08:00
David S. Miller 059833eb81 [SPARC64]: Range check bus number in SUN4V PCI controller driver.
It has to be somewhere in the range from pbm->pci_first_busno to
pbm->pci_last_busno, inclusive.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:36 -08:00
David S. Miller 0b522497a1 [SPARC64]: Missing 'return' statement in sun4v_pci_init().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:35 -08:00
David S. Miller c260926750 [SPARC64]: Implement basic pci_sun4v_scan_bus().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:34 -08:00
David S. Miller 3833789bb2 [SPARC64]: PCI-SUN4V fixes.
Clear top 8-bits of physical addresses in "ranges" property.
This gives the actual physical address.

Detect PBM-A vs. PBM-B by checking bit 0x40 of the devhandle.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:33 -08:00
David S. Miller 221b2fb818 [SPARC64]: Don't expect cfg space in PCI PBM ranges on SUN4V.
PCI cfg space is accessed transparently through the Hypervisor and not
through direct cpu PIO operations.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:30 -08:00
David S. Miller 1a7a242c89 [SPARC64]: Recognize "virtual-console" as input and output console device.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:28 -08:00
David S. Miller 02fead7505 [SPARC64]: Do not try to synchronize %stick registers on SUN4V.
Writes by privileged code are not allowed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:27 -08:00
David S. Miller 7aa6264543 [SPARC64]: Do not try to write to %tick or %stick on SUN4V.
Writes by privileged code are disallowed.  The hypervisor manages
the non-privileged bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:27 -08:00
David S. Miller b5a37e96b8 [SPARC64]: Fix mondo queue allocations.
We have to use bootmem during init_IRQ and page alloc
for sibling cpu calls.

Also, fix incorrect hypervisor call return value
checks in the hypervisor SMP cpu mondo send code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:26 -08:00
David S. Miller c4bce90ea2 [SPARC64]: Deal with PTE layout differences in SUN4V.
Yes, you heard it right, they changed the PTE layout for
SUN4V.  Ho hum...

This is the simple and inefficient way to support this.
It'll get optimized, don't worry.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:25 -08:00
David S. Miller 490384e752 [SPARC64]: Register kernel TSB with hypervisor.
We do this right after we take over the trap table from OBP.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:23 -08:00
David S. Miller 459b6e621e [SPARC64]: Fix some SUN4V TLB miss bugs.
Code patching did not sign extend negative branch
offsets correctly.

Kernel TLB miss path needs patching and %g4 register
preservation in order to handle SUN4V correctly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:23 -08:00
David S. Miller fd05068d7b [SPARC64]: Fix typo in sun4v_patch().
Second instruction offset is '4' not '3'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:22 -08:00
David S. Miller 6cebb52094 [SPARC64]: Fix sun4v early bootup.
prom_sun4v_name should be "sun4v" not "SUNW,sun4v"

Also, this is too early to make use of the
.sun4v_Xinsn_patch code patching, so just check
things manually.

This gets us at least to prom_init() on Niagara.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:21 -08:00
David S. Miller 4bdff41464 [SPARC64]: Fetch bootup time of day from Hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:17 -08:00
David S. Miller 36a68e77c5 [SPARC64]: Simplify sun4v TLB handling using macros.
There was also a bug in sun4v_itlb_miss, it loaded the
MMU Fault Status base into %g3 instead of %g2.

This pointed out a fast path for TSB miss processing,
since we have %g2 with the MMU Fault Status base, we
can use that to quickly load up the PGD phys address.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:16 -08:00
David S. Miller 12eaa328f9 [SPARC64]: Use ASI_SCRATCHPAD address 0x0 properly.
This is where the virtual address of the fault status
area belongs.

To set it up we don't make a hypervisor call, instead
we call OBP's SUNW,set-trap-table with the real address
of the fault status area as the second argument.  And
right before that call we write the virtual address into
ASI_SCRATCHPAD vaddr 0x0.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:15 -08:00
David S. Miller 1839794464 [SPARC64]: First cut at SUN4V PCI IOMMU handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:15 -08:00
David S. Miller 164c220fa3 [SPARC64]: Fix hypervisor call arg passing.
Function goes in %o5, args go in %o0 --> %o5.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:14 -08:00
David S. Miller 7eae642f75 [SPARC64]: Implement SUN4V PCI config space access.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:12 -08:00
David S. Miller bade562216 [SPARC64]: More SUN4V PCI controller work.
Add assembler file for PCI hypervisor calls.
Setup basic skeleton of SUN4V PCI controller driver.

Add 32-bit devhandle to PBM struct, as this is needed for
hypervisor calls.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:11 -08:00
David S. Miller 8f6a93a196 [SPARC64]: Beginnings of SUN4V PCI controller support.
Abstract out IOMMU operations so that we can have a different
set of calls on sun4v, which needs to do things through
hypervisor calls.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:10 -08:00
David S. Miller 4cce4b7cc5 [SPARC64]: Fetch cpu mid properly on sun4v.
If there is a "cpuid" property, use that.  Else suck
it out of the top bits of the "reg" property.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:09 -08:00
David S. Miller ed6b0b4543 [SPARC64]: SUN4V memory exception trap handlers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:07 -08:00
David S. Miller 618e9ed98a [SPARC64]: Hypervisor TSB context switching.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:06 -08:00
David S. Miller aa9143b971 [SPARC64]: Implement sun4v TSB miss handlers.
When we register a TSB with the hypervisor, so that it or hardware can
handle TLB misses and do the TSB walk for us, the hypervisor traps
down to these trap when it incurs a TSB miss.

Processing is simple, we load the missing virtual address and context,
and do a full page table walk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:05 -08:00
David S. Miller 12816ab38a [SPARC64]: kernel/cpu.c needs asm/spitfire.h
For 'tlb_type'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:05 -08:00
David S. Miller 3a8c069d0e [SPARC64]: Print ARCH as SUN4V when tlb_type is hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:04 -08:00
David S. Miller d82ace7dc4 [SPARC64]: Detect sun4v early in boot process.
We look for "SUNW,sun4v" in the 'compatible' property
of the root OBP device tree node.

Protect every %ver register access, to make sure it is
not touched on sun4v, as %ver is hyperprivileged there.

Lock kernel TLB entries using hypervisor calls instead of
calls into OBP.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:03 -08:00
David S. Miller 1d2f1f90a1 [SPARC64]: Sun4v cross-call sending support.
Technically the hypervisor call supports sending in a list
of all cpus to get the cross-call, but I only pass in one
cpu at a time for now.

The multi-cpu support is there, just ifdef'd out so it's easy to
enable or delete it later.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:02 -08:00
David S. Miller 5b0c0572fc [SPARC64]: Sun4v interrupt handling.
Sun4v has 4 interrupt queues: cpu, device, resumable errors,
and non-resumable errors.  A set of head/tail offset pointers
help maintain a work queue in physical memory.  The entries
are 64-bytes in size.

Each queue is allocated then registered with the hypervisor
as we bring cpus up.

The two error queues each get a kernel side buffer that we
use to quickly empty the main interrupt queue before we
call up to C code to log the event and possibly take evasive
action.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:01 -08:00
David S. Miller ac29c11d4c [SPARC64]: Allocate and register the 4 sun4v mondo queues at bootup.
Needs to occur before we enable PSTATE_IE in %pstate.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:12:00 -08:00
David S. Miller e088ad7ca3 [SPARC64]: Verify all trap_per_cpu assembler offsets in trap_init()
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:59 -08:00
David S. Miller 8b11bd12af [SPARC64]: Patch up mmu context register writes for sun4v.
sun4v uses ASI_MMU instead of ASI_DMMU

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:56 -08:00
David S. Miller 481295f982 [SPARC64]: Register per-cpu fault status area with sun4v hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:55 -08:00
David S. Miller 8591e30272 [SPARC64]: Niagara copy/clear page.
Happily we have no D-cache aliasing issues on these
chips, so the implementation is very straightforward.

Add a stub in bootup which will be where the patching
calls will be made for niagara/sun4v/hypervisor.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:54 -08:00
David S. Miller df7d6aec96 [SPARC64]: Rename gl_{1,2}insn_patch --> sun4v_{1,2}insn_patch
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:53 -08:00
David S. Miller d257d5da39 [SPARC64]: Initial sun4v TLB miss handling infrastructure.
Things are a little tricky because, unlike sun4u, we have
to:

1) do a hypervisor trap to do the TLB load.
2) do the TSB lookup calculations by hand

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:52 -08:00
David S. Miller 840aaef8db [SPARC64]: Add missing memory barriers to instruction patching functions.
V9 requires a write memory barrier before the instruction flush.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:51 -08:00
David S. Miller 45fec05f80 [SPARC64]: Sanitize %pstate writes for sun4v.
If we're just switching between different alternate global
sets, nop it out on sun4v.  Also, get rid of all of the
alternate global save/restore in the OBP CIF trampoline code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:50 -08:00
David S. Miller 314981ac71 [SPARC64]: Kill all %pstate changes in context switch code.
They are totally unnecessary because:

1) Interrupts are already disabled when switch_to()
   runs.

2) We don't use hard-coded alternate globals any longer.

This found a case in rtrap, which still assumed alternate
global %g6 was current_thread_info(), and that is fixed
by this changeset as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:49 -08:00
David S. Miller 936f482af1 [SPARC64]: Add initial code to twiddle %gl on trap entry/exit.
Instead of setting/clearing PSTATE_AG we have to change
the %gl register value on sun4v.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:48 -08:00
David S. Miller 6e02493a7f [SPARC64]: Fill dead cycles on trap entry with real work.
As we save trap state onto the stack, the store buffer fills up
mid-way through and we stall for several cycles as the store buffer
trickles out to the L2 cache.  Meanwhile we can do some privileged
register reads and other calculations, essentially for free.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:47 -08:00
David S. Miller d96b81533b [SPARC64]: Add sun4v case to __GET_CPUID() patch tables.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:45 -08:00
David S. Miller a43fe0e789 [SPARC64]: Add some hypervisor tlb_type checks.
And more consistently check cheetah{,_plus} instead
of assuming anything not spitfire is cheetah{,_plus}.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:40 -08:00
David S. Miller 314ef68597 [SPARC64]: Refine register window trap handling.
When saving and restoing trap state, do the window spill/fill
handling inline so that we never trap deeper than 2 trap levels.
This is important for chips like Niagara.

The window fixup code is massively simplified, and many more
improvements are now possible.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:36 -08:00
David S. Miller ffe483d552 [SPARC64]: Add explicit register args to trap state loading macros.
This, as well as making the code cleaner, allows a simplification in
the TSB miss handling path.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:35 -08:00
David S. Miller 92704a1c63 [SPARC64]: Refine code sequences to get the cpu id.
On uniprocessor, it's always zero for optimize that.

On SMP, the jmpl to the stub kills the return address stack in the cpu
branch prediction logic, so expand the code sequence inline and use a
code patching section to fix things up.  This also always better and
explicit register selection, which will be taken advantage of in a
future changeset.

The hard_smp_processor_id() function is big, so do not inline it.

Fix up tests for Jalapeno to also test for Serrano chips too.  These
tests want "jbus Ultra-IIIi" cases to match, so that is what we should
test for.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:35 -08:00
David S. Miller 7bec08e38a [SPARC64]: Correctable ECC errors cannot occur at trap level > 0.
The are distrupting, which by the sparc v9 definition means they
can only occur when interrupts are enabled in the %pstate register.
This never occurs in any of the trap handling code running at
trap levels > 0.

So just mark it as an unexpected trap.

This allows us to kill off the cee_stuff member of struct thread_info.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:33 -08:00
David S. Miller 517af33237 [SPARC64]: Access TSB with physical addresses when possible.
This way we don't need to lock the TSB into the TLB.
The trick is that every TSB load/store is registered into
a special instruction patch section.  The default uses
virtual addresses, and the patch instructions use physical
address load/stores.

We can't do this on all chips because only cheetah+ and later
have the physical variant of the atomic quad load.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:32 -08:00
David S. Miller 30a6ecad96 [SPARC64]: Don't clobber alt-global %g4 on window fixups.
If we are returning back to kernel mode, %g4 could be live
(for example, in the case where we window spill in the etrap
code).  So do not change it's value if going back to kernel.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:30 -08:00
David S. Miller 86b818687d [SPARC64]: Fix race in LOAD_PER_CPU_BASE()
Since we use %g5 itself as a temporary, it can get clobbered
if we take an interrupt mid-stream and thus cause end up with
the final %g5 value too early as a result of rtrap processing.

Set %g5 at the very end, atomically, to avoid this problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:29 -08:00
David S. Miller 9bc657b28e [SPARC64]: Fix too early reference to %g6
%g6 is not necessarily set to current_thread_info()
at sparc64_realfault_common.  So store the fault
code and address after we invoke etrap and %g6 is
properly set up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:27 -08:00
David S. Miller 764afe2edb [SPARC64]: Kill hard-coded %pstate setting in sparc_exit.
Just flip the bit off of whatever it's currently set to.
PSTATE_IE is guarenteed to be enabled when we get here.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:26 -08:00
David S. Miller 2f7ee7c63f [SPARC64]: Increase swapper_tsb size to 32K.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:26 -08:00
David S. Miller a8b900d801 [SPARC64]: Kill sole argument passed to setup_tba().
No longer used, and move extern declaration to a header file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:25 -08:00
David S. Miller 3487d1d441 [SPARC64]: Kill PROM locked TLB entry preservation code.
It is totally unnecessary complexity.  After we take over
the trap table, we handle all PROM tlb misses fully.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:24 -08:00
David S. Miller 6b6d017235 [SPARC64]: Use sparc64_highest_unlocked_tlb_ent in __tsb_context_switch()
Instead of ugly hard-coded value.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:23 -08:00
David S. Miller 4da808c352 [SPARC64]: Fix bogus flush instruction usage.
Some of the trap code was still assuming that alternate
global %g6 was hard coded with current_thread_info().
Let's just consistently flush at KERNBASE when we need
a pipeline synchronization.  That's locked into the TLB
and will always work.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:22 -08:00
David S. Miller 96c6e0d8e2 [SPARC64]: Kill {save,restore}_alternate_globals()
No longer needed now that we no longer have hard-coded
alternate global register usage.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:20 -08:00
David S. Miller b70c0fa161 [SPARC64]: Preload TSB entries from update_mmu_cache().
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:19 -08:00
David S. Miller bd40791e1d [SPARC64]: Dynamically grow TSB in response to RSS growth.
As the RSS grows, grow the TSB in order to reduce the likelyhood
of hash collisions and thus poor hit rates in the TSB.

This definitely needs some serious tuning.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:18 -08:00
David S. Miller 98c5584cfc [SPARC64]: Add infrastructure for dynamic TSB sizing.
This also cleans up tsb_context_switch().  The assembler
routine is now __tsb_context_switch() and the former is
an inline function that picks out the bits from the mm_struct
and passes it into the assembler code as arguments.

setup_tsb_parms() computes the locked TLB entry to map the
TSB.  Later when we support using the physical address quad
load instructions of Cheetah+ and later, we'll simply use
the physical address for the TSB register value and set
the map virtual and PTE both to zero.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:17 -08:00
David S. Miller 09f94287f7 [SPARC64]: TSB refinements.
Move {init_new,destroy}_context() out of line.

Do not put huge pages into the TSB, only base page size translations.
There are some clever things we could do here, but for now let's be
correct instead of fancy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:16 -08:00
David S. Miller 56fb4df6da [SPARC64]: Elminate all usage of hard-coded trap globals.
UltraSPARC has special sets of global registers which are switched to
for certain trap types.  There is one set for MMU related traps, one
set of Interrupt Vector processing, and another set (called the
Alternate globals) for all other trap types.

For what seems like forever we've hard coded the values in some of
these trap registers.  Some examples include:

1) Interrupt Vector global %g6 holds current processors interrupt
   work struct where received interrupts are managed for IRQ handler
   dispatch.

2) MMU global %g7 holds the base of the page tables of the currently
   active address space.

3) Alternate global %g6 held the current_thread_info() value.

Such hardcoding has resulted in some serious issues in many areas.
There are some code sequences where having another register available
would help clean up the implementation.  Taking traps such as
cross-calls from the OBP firmware requires some trick code sequences
wherein we have to save away and restore all of the special sets of
global registers when we enter/exit OBP.

We were also using the IMMU TSB register on SMP to hold the per-cpu
area base address, which doesn't work any longer now that we actually
use the TSB facility of the cpu.

The implementation is pretty straight forward.  One tricky bit is
getting the current processor ID as that is different on different cpu
variants.  We use a stub with a fancy calling convention which we
patch at boot time.  The calling convention is that the stub is
branched to and the (PC - 4) to return to is in register %g1.  The cpu
number is left in %g6.  This stub can be invoked by using the
__GET_CPUID macro.

We use an array of per-cpu trap state to store the current thread and
physical address of the current address space's page tables.  The
TRAP_LOAD_THREAD_REG loads %g6 with the current thread from this
table, it uses __GET_CPUID and also clobbers %g1.

TRAP_LOAD_IRQ_WORK is used by the interrupt vector processing to load
the current processor's IRQ software state into %g6.  It also uses
__GET_CPUID and clobbers %g1.

Finally, TRAP_LOAD_PGD_PHYS loads the physical address base of the
current address space's page tables into %g7, it clobbers %g1 and uses
__GET_CPUID.

Many refinements are possible, as well as some tuning, with this stuff
in place.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:16 -08:00
David S. Miller 3c93646524 [SPARC64]: Kill pgtable quicklists and use SLAB.
Taking a nod from the powerpc port.

With the per-cpu caching of both the page allocator and SLAB, the
pgtable quicklist scheme becomes relatively silly and primitive.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:14 -08:00
David S. Miller 74bf4312ff [SPARC64]: Move away from virtual page tables, part 1.
We now use the TSB hardware assist features of the UltraSPARC
MMUs.

SMP is currently knowingly broken, we need to find another place
to store the per-cpu base pointers.  We hid them away in the TSB
base register, and that obviously will not work any more :-)

Another known broken case is non-8KB base page size.

Also noticed that flush_tlb_all() is not referenced anywhere, only
the internal __flush_tlb_all() (local cpu only) is used by the
sparc64 port, so we can get rid of flush_tlb_all().

The kernel gets it's own 8KB TSB (swapper_tsb) and each address space
gets it's own private 8K TSB.  Later we can add code to dynamically
increase the size of per-process TSB as the RSS grows.  An 8KB TSB is
good enough for up to about a 4MB RSS, after which the TSB starts to
incur many capacity and conflict misses.

We even accumulate OBP translations into the kernel TSB.

Another area for refinement is large page size support.  We could use
a secondary address space TSB to handle those.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:11:13 -08:00
Bernhard R Link 94bbc1763b [SPARC64]: fix sparc_floppy_irq's auxio_register reseting
The patch "[SPARC64]: Get rid of fast IRQ feature"
moved the the code from arch/sparc64/kernel/entry.S:
      lduba           [%g7] ASI_PHYS_BYPASS_EC_E, %g5
      or              %g5, AUXIO_AUX1_FTCNT, %g5
      stba            %g5, [%g7] ASI_PHYS_BYPASS_EC_E
      andn            %g5, AUXIO_AUX1_FTCNT, %g5
      stba            %g5, [%g7] ASI_PHYS_BYPASS_EC_E
to arch/sparc64/kernel/irq.c:
              val = readb(auxio_register);
              val |= AUXIO_AUX1_FTCNT;
              writeb(val, auxio_register);
              val &= AUXIO_AUX1_FTCNT;
              writeb(val, auxio_register);
This looks like it it missing a bitwise not, which is reintroduced
by this patch.

Due to lack of a floppy device, I could not test it, but it looks
evident.

Signed-off-by: Bernhard R Link <brlink@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-20 01:10:34 -08:00
David S. Miller 4d000d5b96 [SPARC64]: Mark __ex_table section correctly.
We must use the "a" (allocate) attribute every time we
emit an entry into the __ex_table section.

For consistency, use "a" instead of #alloc which is some
Solaris compat cruft GNU as provides on Sparc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-03-04 23:23:56 -08:00
David S. Miller 7abea92145 [SPARC64]: Make cpu_present_map available earlier.
The change to kernel/sched.c's init code to use for_each_cpu()
requires that the cpu_possible_map be setup much earlier.

Set it up via setup_arch(), constrained to NR_CPUS, and later
constrain it to max_cpus in smp_prepare_cpus().

This fixes SMP booting on sparc64.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-26 19:36:00 -08:00
David S. Miller 40ad7a6afc [SPARC]: sys_newfstatat --> sys_fstatat64
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-12 23:30:11 -08:00
Heiko Carstens 642fe301c3 [SPARC64]: Fix sys_newfstatat syscall table entry for 64-bit.
The sparc64 64 bit syscall table seems to be broken as it has
compat_sys_newfstatat in its syscall table instead of sys_newfstatat.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-09 15:09:15 -08:00
David S. Miller 1b9a428901 [SPARC]: Wire up sys_unshare().
Also, the Solaris syscall table is sized differrently,
and does not go beyond entry 255, so trim off the excess
entries.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-02-07 18:11:24 -08:00
David S. Miller cddfc12e25 [SPARC64]: Kill compat_sys_clock_settime sign extension stub.
It's wrong and totally unneeded.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-30 01:31:09 -08:00
David S. Miller 4415863773 [SPARC]: Increase NR_SYSCALLS to 299
To let new syscalls through.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-22 12:12:01 -08:00
David S. Miller 3ee68c4af3 [SPARC64]: Use compat_sys_futimesat in 32-bit syscall table.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-20 01:49:15 -08:00
David S. Miller 2d7d5f0511 [SPARC]: Add support for *at(), ppoll, and pselect syscalls.
This also includes by necessity _TIF_RESTORE_SIGMASK support,
which actually resulted in a lot of cleanups.

The sparc signal handling code is quite a mess and I should
clean it up some day.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-19 02:42:49 -08:00
Ulrich Drepper 5590ff0d55 [PATCH] vfs: *at functions: core
Here is a series of patches which introduce in total 13 new system calls
which take a file descriptor/filename pair instead of a single file
name.  These functions, openat etc, have been discussed on numerous
occasions.  They are needed to implement race-free filesystem traversal,
they are necessary to implement a virtual per-thread current working
directory (think multi-threaded backup software), etc.

We have in glibc today implementations of the interfaces which use the
/proc/self/fd magic.  But this code is rather expensive.  Here are some
results (similar to what Jim Meyering posted before).

The test creates a deep directory hierarchy on a tmpfs filesystem.  Then
rm -fr is used to remove all directories.  Without syscall support I get
this:

real    0m31.921s
user    0m0.688s
sys     0m31.234s

With syscall support the results are much better:

real    0m20.699s
user    0m0.536s
sys     0m20.149s

The interfaces are for obvious reasons currently not much used.  But they'll
be used.  coreutils (and Jeff's posixutils) are already using them.
Furthermore, code like ftw/fts in libc (maybe even glob) will also start using
them.  I expect a patch to make follow soon.  Every program which is walking
the filesystem tree will benefit.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 19:20:29 -08:00
David S. Miller 959a85ada5 [SPARC64]: Fix build with CONFIG_COMPAT disabled.
Based upon a report and preliminary patch from Jim Gifford.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18 14:58:05 -08:00
Eddie C. Dost c126cf80d4 [SPARC64]: Serial Console for E250 Patch
From: Eddie C. Dost <ecd@brainaid.de>

I have the following patch for serial console over the RSC
(remote system controller) on my E250 machine. It basically adds
support for input-device=rsc and output-device=rsc from OBP, and
allows 115200,8,n,1,- serial mode setting.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-18 14:54:31 -08:00
Richard Mortimer 9eb3394bf2 [SPARC64]: Eliminate race condition reading Hummingbird STICK register
Ensure a consistent value is read from the STICK register by ensuring
that both high and low are read without high changing due to a roll
over of the low register.

Various Debian/SPARC users (myself include) have noticed problems with
Hummingbird based systems. The symptoms are that the system time is
seen to jump forward 3 days, 6 hours, 11 minutes give or take a few
seconds. In many cases the system then hangs some time afterwards.

I've spotted a race condition in the code to read the STICK register.
I could not work out why 3d, 6h, 11m is important but guess that it is
due to the 2^32 jump of STICK (forwards on one read and then the next
read will seem to be backwards) during a timer interrupt. I'm guessing
that a change of -2^32 will get converted to a large unsigned
increment after the arithmetic manipulation between STICK,
nanoseconds, jiffies etc.

I did a test where I modified __hbird_read_stick to artificially
inject rollover faults forcefully every few seconds. With this I saw
the clock jump over 6 times in 12 hours compared to once every month
or so.

Signed-off-by: Richard Mortimer <richm@oldelvet.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-17 15:21:01 -08:00
Al Viro 26ecbdea4b [PATCH] sparc64: task_pt_regs()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:52 -08:00
Al Viro ee3eea165e [PATCH] sparc64: task_stack_page()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:52 -08:00
Al Viro f3169641c1 [PATCH] sparc64: task_thread_info()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-12 09:08:52 -08:00
Randy Dunlap a941564458 [PATCH] capable/capability.h (arch/)
arch: Use <linux/capability.h> where capable() is used.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:14 -08:00
Keshavamurthy Anil S eb3a72921c [PATCH] kprobes: fix race in recovery of reentrant probe
There is a window where a probe gets removed right after the probe is hit
on some different cpu.  In this case probe handlers can't find a matching
probe instance related to break address.  In this case we need to read the
original instruction at break address to see if that is not a break/int3
instruction and recover safely.

Previous code had a bug where we were not checking for the above race in
case of reentrant probes and the below patch fixes this race.

Tested on IA64, Powerpc, x86_64.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-11 18:42:12 -08:00
Linus Torvalds dd49f96777 Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2006-01-10 08:28:53 -08:00
Anil S Keshavamurthy e597c2984c [PATCH] kprobes: arch_remove_kprobe
Currently arch_remove_kprobes() is only implemented/required for x86_64 and
powerpc.  All other architecture like IA64, i386 and sparc64 implementes a
dummy function which is being called from arch independent kprobes.c file.

This patch removes the dummy functions and replaces it with
#define arch_remove_kprobe(p, s)	do { } while(0)

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:40 -08:00
Anil S Keshavamurthy 49a2a1b83b [PATCH] kprobes: changed from using spinlock to mutex
Since Kprobes runtime exception handlers is now lock free as this code path is
now using RCU to walk through the list, there is no need for the
register/unregister{_kprobe} to use spin_{lock/unlock}_isr{save/restore}.  The
serialization during registration/unregistration is now possible using just a
mutex.

In the above process, this patch also fixes a minor memory leak for x86_64 and
powerpc.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:40 -08:00
Christoph Hellwig e6a6d2efcb [PATCH] sanitize building of fs/compat_ioctl.c
Now that all these entries in the arch ioctl32.c files are gone [1], we can
build fs/compat_ioctl.c as a normal object and kill tons of cruft.  We need a
special do_ioctl32_pointer handler for s390 so the compat_ptr call is done.
This is not needed but harmless on all other architectures.  Also remove some
superflous includes in fs/compat_ioctl.c

Tested on ppc64.

[1] parisc still had it's PPP handler left, which is not fully correct
    for ppp and besides that ppp uses the generic SIOCPRIV ioctl so it'd
    kick in for all netdevice users.  We can introduce a proper handler
    in one of the next patch series by adding a compat_ioctl method to
    struct net_device but for now let's just kill it - parisc doesn't
    compile in mainline anyway and I don't want this to block this
    patchset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:33 -08:00
Christoph Hellwig 3a0f69d59b [PATCH] common compat_sys_timer_create
The comment in compat.c is wrong, every architecture provides a
get_compat_sigevent() for the IPC compat code already.

This basically moves the x86_64 version to common code and removes all the
others.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:32 -08:00
akpm@osdl.org df2e71fb91 [PATCH] dump_thread() cleanup
)

From: Adrian Bunk <bunk@stusta.de>

- create one common dump_thread() prototype in kernel.h

- dump_thread() is only used in fs/binfmt_aout.c and can therefore be
  removed on all architectures where CONFIG_BINFMT_AOUT is not
  available

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-10 08:01:25 -08:00
David S. Miller cdade109d3 [SPARC64]: Fix sys_fstat64() entry in 64-bit syscall table.
Noticed by Jakub Jelinek.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 20:43:57 -08:00
Linus Torvalds 977127174a Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6 2006-01-09 18:41:42 -08:00
Richard Mortimer 695ca07bf1 [SPARC64]: Fix ptrace/strace
Don't clobber register %l0 while checking TI_SYS_NOERROR value in
syscall return path.  This bug was introduced by:

db7d9a4eb7

Problem narrowed down by Luis F. Ortiz and Richard Mortimer.

I tried using %l2 as suggested by Luis and that works for me.

Looking at the code I wonder if it makes sense to simplify the code
a little bit. The following works for me but I'm not sure how to
exercise the "NOERROR" codepath.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:35:50 -08:00
David S. Miller 90bf811664 [SPARC64]: Add needed pm_power_off symbol.
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-09 14:12:50 -08:00
Jiri Slaby d08fa1a22e [PATCH] PCI: pci_find_device remove (sparc64/kernel/ebus.c)
Signed-off-by: Jiri Slaby <xslaby@fi.muni.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-09 12:13:15 -08:00
Christoph Hellwig 6b9c7ed848 [PATCH] use ptrace_get_task_struct in various places
The ptrace_get_task_struct() helper that I added as part of the ptrace
consolidation is useful in variety of places that currently opencode it.
Switch them to the common helpers.

Add a ptrace_traceme() helper that needs to be explicitly called, and simplify
the ptrace_get_task_struct() interface.  We don't need the request argument
now, and we return the task_struct directly, using ERR_PTR() for error
returns.  It's a bit more code in the callers, but we have two sane routines
that do one thing well now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:51 -08:00
David S. Miller d5784b57d2 [SPARC]: Use STABS_DEBUG and DWARF_DEBUG macros in vmlinux.lds.S
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-28 13:22:54 -08:00
David S. Miller 597d1f0622 [SPARC]: Kill CHILD_MAX.
It's definition is wrong (-1 means "no limit" not 999),
only the Sparc SunOS/Solaris compat code uses it, so
let's just kill it off completely from limits.h and
all referencing code.

Noticed by Ulrich Drepper.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-12-22 23:10:03 -08:00
Keshavamurthy Anil S bf8d5c52c3 [PATCH] kprobes: increment kprobe missed count for multiprobes
When multiple probes are registered at the same address and if due to some
recursion (probe getting triggered within a probe handler), we skip calling
pre_handlers and just increment nmissed field.

The below patch make sure it walks the list for multiple probes case.
Without the below patch we get incorrect results of nmissed count for
multiple probe case.

Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-12-12 08:57:45 -08:00
Hugh Dickins f3d48f0373 [PATCH] unpaged: fix sound Bad page states
Earlier I unifdefed PageCompound, so that snd_pcm_mmap_control_nopage and
others can give out a 0-order component of a higher-order page, which won't
be mistakenly freed when zap_pte_range unmaps it.  But many Bad page states
reported a PG_reserved was freed after all: I had missed that we need to
say __GFP_COMP to get compound page behaviour.

Some of these higher-order pages are allocated by snd_malloc_pages, some by
snd_malloc_dev_pages; or if SBUS, by sbus_alloc_consistent - but that has
no gfp arg, so add __GFP_COMP into its sparc32/64 implementations.

I'm still rather puzzled that DRM seems not to need a similar change.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-22 09:13:43 -08:00
Christoph Hellwig 9ffb83bcc5 [SBUSFB]: implement ->compat_ioctl
This patch adds a new function, sbusfb_compat_ioctl() to
drivers/video/sbuslib.c and uses it as compat_ioctl in all sbus fb
drivers

This remove the last per-arch compat ioctl bits in
arch/sparc64/kernel/ioctl32.c so it would be nice if people could test
if this actually copiles and works and if yes apply it :)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-12 12:11:12 -08:00
David S. Miller 4d45cbacb8 [SPARC64]: Restore 2.4.x /proc/cpuinfo behavior for "ncpus probed" field.
Noticed by Tom 'spot' Callaway.

Even on uniprocessor we always reported the number of physical
cpus in the system via /proc/cpuinfo.  But when this got changed
to use num_possible_cpus() it always reads as "1" on uniprocessor.
This change was unintentional.

So scan the firmware device tree and count the number of cpu
nodes, and report that, as we always did.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-11 12:48:56 -08:00
Christoph Hellwig 8ca2bdc7a9 [SPARC] sbus rtc: implement ->compat_ioctl
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:07:18 -08:00
Tobias Klauser 84c1a13a30 [SPARC64]: Use ARRAY_SIZE macro
Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE which is never used anyways.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-09 12:03:42 -08:00
Nick Piggin 64c7c8f885 [PATCH] sched: resched and cpu_idle rework
Make some changes to the NEED_RESCHED and POLLING_NRFLAG to reduce
confusion, and make their semantics rigid.  Improves efficiency of
resched_task and some cpu_idle routines.

* In resched_task:
- TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
  and as we hold it during resched_task, then there is no need for an
  atomic test and set there. The only other time this should be set is
  when the task's quantum expires, in the timer interrupt - this is
  protected against because the rq lock is irq-safe.

- If TIF_NEED_RESCHED is set, then we don't need to do anything. It
  won't get unset until the task get's schedule()d off.

- If we are running on the same CPU as the task we resched, then set
  TIF_NEED_RESCHED and no further action is required.

- If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
  after TIF_NEED_RESCHED has been set, then we need to send an IPI.

Using these rules, we are able to remove the test and set operation in
resched_task, and make clear the previously vague semantics of
POLLING_NRFLAG.

* In idle routines:
- Enter cpu_idle with preempt disabled. When the need_resched() condition
  becomes true, explicitly call schedule(). This makes things a bit clearer
  (IMO), but haven't updated all architectures yet.

- Many do a test and clear of TIF_NEED_RESCHED for some reason. According
  to the resched_task rules, this isn't needed (and actually breaks the
  assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
  held). So remove that. Generally one less locked memory op when switching
  to the idle thread.

- Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the inner
  most polling idle loops. The above resched_task semantics allow it to be
  set until before the last time need_resched() is checked before going into
  a halt requiring interrupt wakeup.

  Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
  can be always left set, completely eliminating resched IPIs when rescheduling
  the idle task.

  POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Nick Piggin 5bfb5d690f [PATCH] sched: disable preempt in idle tasks
Run idle threads with preempt disabled.

Also corrected a bugs in arm26's cpu_idle (make it actually call schedule()).
How did it ever work before?

Might fix the CPU hotplugging hang which Nigel Cunningham noted.

We think the bug hits if the idle thread is preempted after checking
need_resched() and before going to sleep, then the CPU offlined.

After calling stop_machine_run, the CPU eventually returns from preemption and
into the idle thread and goes to sleep.  The CPU will continue executing
previous idle and have no chance to call play_dead.

By disabling preemption until we are ready to explicitly schedule, this bug is
fixed and the idle threads generally become more robust.

From: alexs <ashepard@u.washington.edu>

  PPC build fix

From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>

  MIPS build fix

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:33 -08:00
Christoph Hellwig 7e4c54a2a4 [PATCH] remove ioctl32_handler_t
Some architectures define and use this type in their compat_ioctl code, but
all of them can easily use the identical ioctl_trans_handler_t type that is
defined in common code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-09 07:56:00 -08:00
David S. Miller dd3e2dcf34 [SPARC64]: Kill some unnecessary includes from ioctl32.c
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:46 -08:00
Christoph Hellwig f48497e383 [SPARC64]: remove drm compat ioctl handling
drivers/drm/ now implements proper ->compat_ioctl methods, so this isn't
needed anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:27 -08:00
Christoph Hellwig b66621fef3 [SPARC] cpwatchdog: implement ->compat_ioctl
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:14 -08:00
Christoph Hellwig 1d5d00bd9c [SPARC] display7seg: implement ->unlocked_ioctl and ->compat_ioctl
all ioctls are 32bit compat clean, so the driver can use ->compat_ioctl
and ->unlocked_ioctl easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:13:01 -08:00
Christoph Hellwig b31023fc24 [SPARC] openprom: implement ->compat_ioctl
implement a compat_ioctl handle in the driver instead of having table
entries in sparc64 ioctl32.c (I plan to get rid of the arch ioctl32.c
file eventually)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:47 -08:00
Christoph Hellwig 1928f8e541 [SPARC] envctrl: implement ->unlocked_ioctl and ->compat_ioctl
all the ioctls in the driver are 32bit compat clean and don't need BKL,
so we can switch it to ->unlocked_ioctl and ->compat_ioctl trivially.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:34 -08:00
Christoph Hellwig 16cf0d8165 [SPARC]: Kill remaining kbio.h references.
Would you mind applying the following patch that kills those two + the
m68k and Documentation/ references?

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:12:21 -08:00
Christoph Hellwig 261b033afc [SPARC64]: remove duplicated compat ioctl entries
all these are handled by fs/compat_ioctls.c already.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:49 -08:00
Christoph Hellwig 59f85dc95e [SPARC]: remove vuid_event.h
I don't know if we ever implemented this, but the only user in any 2.6
tree are the compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:38 -08:00
Christoph Hellwig e1413315b8 [SPARC]: remove kbio.h
The old keyboard driver is gone in 2.6, so the only user left are the
compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:25 -08:00
Christoph Hellwig 9d3c7d1bfd [SPARC]: remove audioio.h
The old sound drivers are gone in 2.6, so the only user left are the
compat ioctls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:14 -08:00
Christoph Hellwig e0436b3164 [SPARC64]: remove alloc_user_space()
this inline routine in arch/sparc64/kernel/ioctl32.c is completely
unused and superceeded by compat_alloc_user_space()

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:11:02 -08:00
David S. Miller fc3214952f [SPARC64]: Kill off dummy_tick_ops.
It only serves to generate false-positive buildcheck warnings.
Just set it initially to tick_operations which uses the v9
%tick register which every sparc64 processor has.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:10:10 -08:00
David S. Miller 62dbec78be [SPARC64] mm: Do not flush TLB mm in tlb_finish_mmu()
It isn't needed any longer, as noted by Hugh Dickins.

We still need the flush routines, due to the one remaining
call site in hugetlb_prefault_arch_hook().  That can be
eliminated at some later point, however.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:09:58 -08:00
Hugh Dickins dedeb0029b [SPARC64] mm: context switch ptlock
sparc64 is unique among architectures in taking the page_table_lock in
its context switch (well, cris does too, but erroneously, and it's not
yet SMP anyway).

This seems to be a private affair between switch_mm and activate_mm,
using page_table_lock as a per-mm lock, without any relation to its uses
elsewhere.  That's fine, but comment it as such; and unlock sooner in
switch_mm, more like in activate_mm (preemption is disabled here).

There is a block of "if (0)"ed code in smp_flush_tlb_pending which would
have liked to rely on the page_table_lock, in switch_mm and elsewhere;
but its comment explains how dup_mmap's flush_tlb_mm defeated it.  And
though that could have been changed at any time over the past few years,
now the chance vanishes as we push the page_table_lock downwards, and
perhaps split it per page table page.  Just delete that block of code.

Which leaves the mysterious spin_unlock_wait(&oldmm->page_table_lock)
in kernel/fork.c copy_mm.  Textual analysis (supported by Nick Piggin)
suggests that the comment was written by DaveM, and that it relates to
the defeated approach in the sparc64 smp_flush_tlb_pending.  Just delete
this block too.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:09:01 -08:00
Hugh Dickins b8ae48656d [SPARC64] mm: don't re-evaluate *ptep
sparc64 prom_callback and new_setup_frame32 each operates on a user page
table without holding lock, and no doubt they've good reason.  But I'd
feel more confident if they were to do a "pte = *ptep" and then operate
on pte, rather than re-evaluating *ptep.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-11-07 14:08:46 -08:00
Jesper Juhl b2325fe1b7 [PATCH] kfree cleanup: arch
This is the arch/ part of the big kfree cleanup patch.

Remove pointless checks for NULL prior to calling kfree() in arch/.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:54:06 -08:00
Ananth N Mavinakayanahalli d217d5450f [PATCH] Kprobes: preempt_disable/enable() simplification
Reorganize the preempt_disable/enable calls to eliminate the extra preempt
depth.  Changes based on Paul McKenney's review suggestions for the kprobes
RCU changeset.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 991a51d83a [PATCH] Kprobes: Use RCU for (un)register synchronization - arch changes
Changes to the arch kprobes infrastructure to take advantage of the locking
changes introduced by usage of RCU for synchronization.  All handlers are now
run without any locks held, so they have to be re-entrant or provide their own
synchronization.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli f215d985e9 [PATCH] Kprobes: Track kprobe on a per_cpu basis - sparc64 changes
Sparc64 changes to track kprobe execution on a per-cpu basis.  We now track
the kprobe state machine independently on each cpu using an arch specific
kprobe control block.

Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:46 -08:00
Ananth N Mavinakayanahalli 66ff2d0691 [PATCH] Kprobes: rearrange preempt_disable/enable() calls
The following set of patches are aimed at improving kprobes scalability.  We
currently serialize kprobe registration, unregistration and handler execution
using a single spinlock - kprobe_lock.

With these changes, kprobe handlers can run without any locks held.  It also
allows for simultaneous kprobe handler executions on different processors as
we now track kprobe execution on a per processor basis.  It is now necessary
that the handlers be re-entrant since handlers can run concurrently on
multiple processors.

All changes have been tested on i386, ia64, ppc64 and x86_64, while sparc64
has been compile tested only.

The patches can be viewed as 3 logical chunks:

patch 1: 	Reorder preempt_(dis/en)able calls
patches 2-7: 	Introduce per_cpu data areas to track kprobe execution
patches 8-9: 	Use RCU to synchronize kprobe (un)registration and handler
		execution.

Thanks to Maneesh Soni, James Keniston and Anil Keshavamurthy for their
review and suggestions. Thanks again to Anil, Hien Nguyen and Kevin Stafford
for testing the patches.

This patch:

Reorder preempt_disable/enable() calls in arch kprobes files in preparation to
introduce locking changes.  No functional changes introduced by this patch.

Signed-off-by: Ananth N Mavinakayahanalli <ananth@in.ibm.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:45 -08:00
Thomas Gleixner ecea8d19c9 [PATCH] jiffies_64 cleanup
Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
defines in each architecture.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:25 -08:00
Christoph Hellwig 9c0cbd54ce [PATCH] TIOC* compat ioctl handling
TIOCSTART and TIOCSTOP are defined in asm/ioctls.h and asm/termios.h by
various architectures but not actually implemented anywhere but in the IRIX
compatibility layer, so remove their COMPATIBLE_IOCTL from parisc, ppc64
and sparc64.

Move the TIOCSLTC COMPATIBLE_IOCTL to common code, guided by an ifdef to
only show up on architectures that support it (same as the code handling it
in tty_ioctl.c), aswell as it's brother TIOCGLTC that wasn't handled so
far.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30 17:37:17 -08:00
Hugh Dickins 404351e67a [PATCH] mm: mm_init set_mm_counters
How is anon_rss initialized?  In dup_mmap, and by mm_alloc's memset; but
that's not so good if an mm_counter_t is a special type.  And how is rss
initialized?  By set_mm_counter, all over the place.  Come on, we just need to
initialize them both at once by set_mm_counter in mm_init (which follows the
memcpy when forking).

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:38 -07:00
David S. Miller b4d1b82578 [SPARC64]: Fix powering off on SMP.
Doing a "SUNW,stop-self" firmware call on the other cpus is not the
correct thing to do when dropping into the firmware for a halt,
reboot, or power-off.

For now, just do nothing to quiet the other cpus, as the system should
be quiescent enough.  Later we may decide to implement smp_send_stop()
like the other SMP platforms do.

Based upon a report from Christopher Zimmermann.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-14 15:26:08 -07:00
David S. Miller 688cb30bdc [SPARC64]: Eliminate PCI IOMMU dma mapping size limit.
The hairy fast allocator in the sparc64 PCI IOMMU code
has a hard limit of 256 pages.  Certain devices can
exceed this when performing very large I/Os.

So replace with a more simple allocator, based largely
upon the arch/ppc64/kernel/iommu.c code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 22:15:24 -07:00
David S. Miller 51e8513615 [SPARC64]: Consolidate common PCI IOMMU init code.
All the PCI controller drivers were doing the same thing
setting up the IOMMU software state, put it all in one spot.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-13 21:10:08 -07:00
David S. Miller c9c1083074 [SPARC64]: Fix boot failures on SunBlade-150
The sequence to move over to the Linux trap tables from
the firmware ones needs to be more air tight.  It turns
out that to be %100 safe we do need to be able to translate
OBP mappings in our TLB miss handlers early.

In order not to eat up a lot of kernel image memory with
static page tables, just use the translations array in
the OBP TLB miss handlers.  That solves the bulk of the
problem.

Furthermore, to make sure the OBP TLB miss path will work
even before the fixed MMU globals are loaded, explicitly
load %g1 to TLB_SFSR at the beginning of the i-TLB and
d-TLB miss handlers.

To ease the OBP TLB miss walking of the prom_trans[] array,
we sort it then delete all of the non-OBP entries in there
(for example, there are entries for the kernel image itself
which we're not interested in at all).

We also save about 32K of kernel image size with this change.
Not a bad side effect :-)

There are still some reasons why trampoline.S can't use the
setup_trap_table() yet.  The most noteworthy are:

1) OBP boots secondary processors with non-bias'd stack for
   some reason.  This is easily fixed by using a small bootup
   stack in the kernel image explicitly for this purpose.

2) Doing a firmware call via the normal C call prom_set_trap_table()
   goes through the whole OBP enter/exit sequence that saves and
   restores OBP and Linux kernel state in the MMUs.  This path
   unfortunately does a "flush %g6" while loading up the OBP locked
   TLB entries for the firmware call.

   If we setup the %g6 in the trampoline.S code properly, that
   is in the PAGE_OFFSET linear mapping, but we're not on the
   kernel trap table yet so those addresses won't translate properly.

   One idea is to do a by-hand firmware call like we do in the
   early bootup code and elsewhere here in trampoline.S  But this
   fails as well, as aparently the secondary processors are not
   booted with OBP's special locked TLB entries loaded.  These
   are necessary for the firwmare to processes TLB misses correctly
   up until the point where we take over the trap table.

This does need to be resolved at some point.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-12 12:22:46 -07:00
David S. Miller b1b510aa28 [SPARC64]: Fix net booting on Ultra5
We were not doing alignment properly when remapping the kernel image.

What we want is a 4MB aligned physical address to map at KERNBASE.
Mistakedly we were 4MB aligning the virtual address where the kernel
initially sits, that's wrong.

Instead, we should PAGE align the virtual address, then 4MB align the
physical address result the prom gives to us.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-11 15:45:16 -07:00
David S. Miller 5d8e1b181c [SPARC64]: Fix Ultra5, Ultra60, et al. boot failures.
On the boot processor, we need to do the move onto the Linux trap
table a little bit differently else we'll take unhandlable faults in
the firmware address space.

Previously we would do the following:

1) Disable PSTATE_IE in %pstate.
2) Set %tba by hand to sparc64_ttable_tl0
3) Initialize alternate, mmu, and interrupt global
   trap registers.
4) Call prom_set_traptable()

That doesn't work very well actually with the way we boot the kernel
VM these days.  It worked by luck on many systems because the firmware
accesses for the prom_set_traptable() call happened to be loaded into
the TLB already, something we cannot assume.

So the new scheme is this:

1) Clear PSTATE_IE in %pstate and set %pil to 15
2) Call prom_set_traptable()
3) Initialize alternate, mmu, and interrupt global
   trap registers.

and this works quite well.  This sequence has been moved into a
callable function in assembler named setup-trap_table().  The idea is
that eventually trampoline.S can use this code as well.  That isn't
possible currently due to some complications, but eventually we should
be able to do it.

Thanks to Meelis Roos for the Ultra5 boot failure report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-10 16:12:13 -07:00
Sven Hartge 2e457ef667 [SPARC64]: Fix compile error in irq.c
irq.c is missing the inclusion of asm/io.h, which causes
readb() and writeb() the be undefined.

Signed-off-by: Sven Hartge <hartge@ds9.argh.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-08 21:12:04 -07:00
David S. Miller ba6399334d [SPARC64]: Fix userland FPU state corruption.
We need to use stricter memory barriers around the block
load and store instructions we use to save and restore the
FPU register file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-07 13:30:49 -07:00
David S. Miller 2256c13b99 [SPARC64]: Probe for power device on ISA bus too.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-06 20:43:54 -07:00
David S. Miller 0835ae0f27 [SPARC64]: Replace cheetah+ code patching with variables.
Instead of code patching to handle the page size fields in
the context registers, just use variables from which we get
the proper values.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-10-04 15:23:20 -07:00
David S. Miller 717463d806 [SPARC64]: Fix several bugs in flush_ptrace_access().
1) Use cpudata cache line sizes, not magic constants.
2) Align start address in cheetah case so we do not get
   unaligned address traps.  (pgrep was good at triggering
   this, via /proc/${pid}/cmdline accesses)

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 18:50:34 -07:00
David S. Miller 13edad7a5c [SPARC64]: Rewrite convoluted physical memory probing.
Delete all of the code working with sp_banks[] and replace
with clean acquisition and sorting of physical memory
parameters from the firmware.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-29 17:58:26 -07:00
David S. Miller ed3ffaf7b5 [SPARC64]: Solidify check in cheetah_check_main_memory().
Need to make sure the address is below high_memory before
passing it to kern_addr_valid().

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 21:48:25 -07:00
David S. Miller 10147570f9 [SPARC64]: Kill all external references to sp_banks[]
Thus, we can mark sp_banks[] static in arch/sparc64/mm/init.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 21:46:43 -07:00
David S. Miller 0836a0eb40 [SPARC64]: Move phys_base, kern_{base,size}, and sp_banks[] init to paging_init
Also, move prom_probe_memory() into arch/sparc64/mm/init.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 21:38:08 -07:00
David S. Miller 801ab3c731 [SPARC]: Declare paging_init() in asm/pgtable.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 21:31:25 -07:00
David S. Miller 5fd29752f0 [SPARC64]: Fix fault handling in unaligned trap handler.
We were not calling kernel_mna_trap_fault() correctly.
Instead of being fancy, just return 0 vs. -EFAULT from
the assembler stubs, and handle that return value as
appropriate.

Create an "__retl_efault" stub for assembler exception
table entries and use it where possible.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 20:41:45 -07:00
David S. Miller 8cf14af0a7 [SPARC64]: Convert to use generic exception table support.
The funny "range" exception table entries we had were only
used by the compat layer socketcall assembly, and it wasn't
even needed there.

For free we now get proper exception table sorting and fast
binary searching.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 20:21:11 -07:00
David S. Miller 705747ab87 [SPARC64]: Fix bug in unaligned load endianness swapping
The in-memory value was being swapped, not the value we
loaded into the register.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-28 16:48:40 -07:00
David S. Miller d2212bc7db [SPARC64]: Add missing IDs for newer cpus.
Also, the us3_cpufreq driver can work on Ultra-IV and IV+.
They use the SAFARI bus register to control the clock divider
just like Ultra-III and III+ do.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-27 22:50:06 -07:00
David S. Miller 80dc0d6b44 [SPARC64]: Probe D/I/E-cache config and use.
At boot time, determine the D-cache, I-cache and E-cache size and
line-size.  Use them in cache flushes when appropriate.

This change was motivated by discovering that the D-cache on
UltraSparc-IIIi and later are 64K not 32K, and the flushes done by the
Cheetah error handlers were assuming a 32K size.

There are still some pieces of code that are hard coding things and
will need to be fixed up at some point.

While we're here, fix the D-cache and I-cache parity error handlers
to run with interrupts disabled, and when the trap occurs at trap
level > 1 log the event via a counter displayed in /proc/cpuinfo.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-26 00:32:17 -07:00
David S. Miller 5642530651 [SPARC64]: Add CONFIG_DEBUG_PAGEALLOC support.
The trick is that we do the kernel linear mapping TLB miss starting
with an instruction sequence like this:

	ba,pt		%xcc, kvmap_load
	 xor		%g2, %g4, %g5

succeeded by an instruction sequence which performs a full page table
walk starting at swapper_pg_dir.

We first take over the trap table from the firmware.  Then, using this
constant PTE generation for the linear mapping area above, we build
the kernel page tables for the linear mapping.

After this is setup, we patch that branch above into a "nop", which
will cause TLB misses to fall through to the full page table walk.

With this, the page unmapping for CONFIG_DEBUG_PAGEALLOC is trivial.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-25 16:46:57 -07:00
David S. Miller 52f26deb7c [SPARC64]: Fix mask formation in tomatillo_wsync_handler()
"1" needs to be "1UL", this is a 64-bit mask we're creating.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-24 23:06:14 -07:00
David S. Miller 1c9ea5db00 [SPARC64]: Kill unused variable in setup_arch()
'highest_paddr' is set, but never actually used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-23 11:54:43 -07:00
David S. Miller a8201c6106 [SPARC64]: Fix comment typo in head.S
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 20:31:29 -07:00
David S. Miller bff06d5522 [SPARC64]: Rewrite bootup sequence.
Instead of all of this cpu-specific code to remap the kernel
to the correct location, use portable firmware calls to do
this instead.

What we do now is the following in position independant
assembler:

	chosen_node = prom_finddevice("/chosen");
	prom_mmu_ihandle_cache = prom_getint(chosen_node, "mmu");
	vaddr = 4MB_ALIGN(current_text_addr());
	prom_translate(vaddr, &paddr_high, &paddr_low, &mode);
	prom_boot_mapping_mode = mode;
	prom_boot_mapping_phys_high = paddr_high;
	prom_boot_mapping_phys_low = paddr_low;
	prom_map(-1, 8 * 1024 * 1024, KERNBASE, paddr_low);

and that replaces the massive amount of by-hand TLB probing and
programming we used to do here.

The new code should also handle properly the case where the kernel
is mapped at the correct address already (think: future kexec
support).

Consequently, the bulk of remap_kernel() dies as does the entirety
of arch/sparc64/prom/map.S

We try to share some strings in the PROM library with the ones used
at bootup, and while we're here mark input strings to oplib.h routines
with "const" when appropriate.

There are many more simplifications now possible.  For one thing, we
can consolidate the two copies we now have of a lot of cpu setup code
sitting in head.S and trampoline.S.

This is a significant step towards CONFIG_DEBUG_PAGEALLOC support.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-22 20:11:33 -07:00
David S. Miller 1ac4f5ebaa [SPARC64]: Remove ktlb.S instruction patching.
This was kind of ugly, and actually buggy.  The bug was that
we didn't handle a machine with memory starting > 4GB.  If
the 'prompmd' was allocated in physical memory > 4GB we'd
croak because the obp_iaddr_patch and obp_daddr_patch things
only supported a 32-bit physical address.

So fix this by just loading the appropriate values from two
variables in the kernel image, which is locked into the TLB
and thus accesses to them can't cause a recursive TLB miss.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 21:49:32 -07:00
David S. Miller 059deb693e [SPARC64]: Kill SZ_BITS define from dtlb_backend.S
This is just a replica of the existing _PAGE_SZBITS,
and thus unnecessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 19:23:48 -07:00
David S. Miller 2a7e299034 [SPARC64]: Move kernel TLB miss handling into a seperate file.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-21 18:50:51 -07:00
David S. Miller 729b4f7de6 [SPARC64]: Verify vmalloc TLB misses more strictly.
Arrange the modules, OBP, and vmalloc areas such that a range
verification can be done quite minimally.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-20 12:18:38 -07:00
David S. Miller 6a9b490d5f [SPARC64]: Move DCACHE_ALIASING_POSSIBLE define to asm/page.h
This showed that arch/sparc64/kernel/ptrace.c was not getting
the define properly, and thus the code protected by this ifdef
was never actually compiled before.  So fix that too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 20:11:57 -07:00
David S. Miller ff171d8f66 [SPARC64]: Handle little-endian unaligned loads/stores correctly.
Because we use byte loads/stores to cons up the value
in and out of registers, we can't expect the ASI endianness
setting to take care of this for us.  So do it by hand.

This case is triggered by drivers/block/aoe/aoecmd.c in the
ataid_complete() function where it goes:

		/* word 100: number lba48 sectors */
		ssize = le64_to_cpup((__le64 *) &id[100<<1]);

This &id[100<<1] address is 4 byte, rather than 8 byte aligned,
thus triggering the unaligned exception.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-19 19:56:06 -07:00
David S. Miller 4db2ce0199 [LIB]: Consolidate _atomic_dec_and_lock()
Several implementations were essentialy a common piece of C code using
the cmpxchg() macro.  Put the implementation in one spot that everyone
can share, and convert sparc64 over to using this.

Alpha is the lone arch-specific implementation, which codes up a
special fast path for the common case in order to avoid GP reloading
which a pure C version would require.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-14 21:47:01 -07:00
Ingo Molnar fb1c8f93d8 [PATCH] spinlock consolidation
This patch (written by me and also containing many suggestions of Arjan van
de Ven) does a major cleanup of the spinlock code.  It does the following
things:

 - consolidates and enhances the spinlock/rwlock debugging code

 - simplifies the asm/spinlock.h files

 - encapsulates the raw spinlock type and moves generic spinlock
   features (such as ->break_lock) into the generic code.

 - cleans up the spinlock code hierarchy to get rid of the spaghetti.

Most notably there's now only a single variant of the debugging code,
located in lib/spinlock_debug.c.  (previously we had one SMP debugging
variant per architecture, plus a separate generic one for UP builds)

Also, i've enhanced the rwlock debugging facility, it will now track
write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
All locks have lockup detection now, which will work for both soft and hard
spin/rwlock lockups.

The arch-level include files now only contain the minimally necessary
subset of the spinlock code - all the rest that can be generalized now
lives in the generic headers:

 include/asm-i386/spinlock_types.h       |   16
 include/asm-x86_64/spinlock_types.h     |   16

I have also split up the various spinlock variants into separate files,
making it easier to see which does what. The new layout is:

   SMP                         |  UP
   ----------------------------|-----------------------------------
   asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
   linux/spinlock_types.h      |  linux/spinlock_types.h
   asm/spinlock_smp.h          |  linux/spinlock_up.h
   linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
   linux/spinlock.h            |  linux/spinlock.h

/*
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the __raw_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */

All SMP and UP architectures are converted by this patch.

arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via
crosscompilers.  m32r, mips, sh, sparc, have not been tested yet, but should
be mostly fine.

From: Grant Grundler <grundler@parisc-linux.org>

  Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
  Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
  non-SMP kernels.  That should be trivial to fix up later if necessary.

  I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
  some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
  are well tested and contained entirely inside arch specific code.  I do NOT
  expect any new issues to arise with them.

 If someone does ever need to use debug/metrics with them, then they will
  need to unravel this hairball between spinlocks, atomic ops, and bit ops
  that exist only because parisc has exactly one atomic instruction: LDCW
  (load and clear word).

From: "Luck, Tony" <tony.luck@intel.com>

   ia64 fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <willy@debian.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:21 -07:00
Linus Torvalds 486a153f0e Merge master.kernel.org:/pub/scm/linux/kernel/git/sam/kbuild 2005-09-09 15:46:49 -07:00
Sam Ravnborg 0037c78a96 kbuild: frv,m32r,sparc64 introduce fake asm-offsets.h file
Needed to get them to build.
And a hint to avoid hardcoding to many constants in assembler.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2005-09-09 22:47:53 +02:00
Linus Torvalds 7bbedd5213 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6 2005-09-08 15:55:23 -07:00
David S. Miller 085ae41f66 [PATCH] Make sparc64 use setup-res.c
There were three changes necessary in order to allow
sparc64 to use setup-res.c:

1) Sparc64 roots the PCI I/O and MEM address space using
   parent resources contained in the PCI controller structure.
   I'm actually surprised no other platforms do this, especially
   ones like Alpha and PPC{,64}.  These resources get linked into the
   iomem/ioport tree when PCI controllers are probed.

   So the hierarchy looks like this:

   iomem --|
	   PCI controller 1 MEM space --|
				        device 1
					device 2
					etc.
	   PCI controller 2 MEM space --|
				        ...
   ioport --|
            PCI controller 1 IO space --|
					...
            PCI controller 2 IO space --|
					...

   You get the idea.  The drivers/pci/setup-res.c code allocates
   using plain iomem_space and ioport_space as the root, so that
   wouldn't work with the above setup.

   So I added a pcibios_select_root() that is used to handle this.
   It uses the PCI controller struct's io_space and mem_space on
   sparc64, and io{port,mem}_resource on every other platform to
   keep current behavior.

2) quirk_io_region() is buggy.  It takes in raw BUS view addresses
   and tries to use them as a PCI resource.

   pci_claim_resource() expects the resource to be fully formed when
   it gets called.  The sparc64 implementation would do the translation
   but that's absolutely wrong, because if the same resource gets
   released then re-claimed we'll adjust things twice.

   So I fixed up quirk_io_region() to do the proper pcibios_bus_to_resource()
   conversion before passing it on to pci_claim_resource().

3) I was mistakedly __init'ing the function methods the PCI controller
   drivers provide on sparc64 to implement some parts of these
   routines.  This was, of course, easy to fix.

So we end up with the following, and that nasty SPARC64 makefile
ifdef in drivers/pci/Makefile is finally zapped.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-08 14:57:25 -07:00
John W. Linville 064b53dbcc [PATCH] PCI: restore BAR values after D3hot->D0 for devices that need it
Some PCI devices (e.g. 3c905B, 3c556B) lose all configuration
(including BARs) when transitioning from D3hot->D0.  This leaves such
a device in an inaccessible state.  The patch below causes the BARs
to be restored when enabling such a device, so that its driver will
be able to access it.

The patch also adds pci_restore_bars as a new global symbol, and adds a
correpsonding EXPORT_SYMBOL_GPL for that.

Some firmware (e.g. Thinkpad T21) leaves devices in D3hot after a
(re)boot.  Most drivers call pci_enable_device very early, so devices
left in D3hot that lose configuration during the D3hot->D0 transition
will be inaccessible to their drivers.

Drivers could be modified to account for this, but it would
be difficult to know which drivers need modification.  This is
especially true since often many devices are covered by the same
driver.  It likely would be necessary to replicate code across dozens
of drivers.

The patch below should trigger only when transitioning from D3hot->D0
(or at boot), and only for devices that have the "no soft reset" bit
cleared in the PM control register.  I believe it is safe to include
this patch as part of the PCI infrastructure.

The cleanest implementation of pci_restore_bars was to call
pci_update_resource.  Unfortunately, that does not currently exist
for the sparc64 architecture.  The patch below includes a null
implemenation of pci_update_resource for sparc64.

Some have expressed interest in making general use of the the
pci_restore_bars function, so that has been exported to GPL licensed
modules.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-09-08 14:57:24 -07:00
David S. Miller 4d803fcdcd [SPARC64]: Inline membar()'s again.
Since GCC has to emit a call and a delay slot to the
out-of-line "membar" routines in arch/sparc64/lib/mb.S
it is much better to just do the necessary predicted
branch inline instead as:

	ba,pt	%xcc, 1f
	 membar	#whatever
1:

instead of the current:

	call	membar_foo
	 dslot

because this way GCC is not required to allocate a stack
frame if the function can be a leaf function.

This also makes this bug fix easier to backport to 2.4.x

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-08 14:37:53 -07:00
Linus Torvalds 946e91f36e Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 2005-09-07 17:21:17 -07:00
Prasanna S Panchamukhi 05e14cb3ba [PATCH] Kprobes: prevent possible race conditions sparc64 changes
This patch contains the sparc64 architecture specific changes to prevent the
possible race conditions.

Signed-off-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:58:00 -07:00
Miklos Szeredi e922efc342 [PATCH] remove duplicated sys_open32() code from 64bit archs
64 bit architectures all implement their own compatibility sys_open(),
when in fact the difference is simply not forcing the O_LARGEFILE
flag.  So use the a common function instead.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <viro@parcelfarce.linux.theplanet.co.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:43 -07:00
john stultz b149ee2233 [PATCH] NTP: ntp-helper functions
This patch cleans up a commonly repeated set of changes to the NTP state
variables by adding two helper inline functions:

ntp_clear(): Clears the ntp state variables

ntp_synced(): Returns 1 if the system is synced with a time server.

This was compile tested for alpha, arm, i386, x86-64, ppc64, s390, sparc,
sparc64.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07 16:57:34 -07:00
David S. Miller 09bbe1043a [SPARC64]: Fix set/get MTU cases in sunos_ioctl()
Need to use compat struct sizes and compat_sys_ioctl().
Reported by Adrian Bunk via kernel bugzilla #2683

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-06 20:12:15 -07:00
David S. Miller a7a6cac204 [SPARC]: Kill io_remap_page_range()
It's been deprecated long enough and there are no in-tree
users any longer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-09-01 21:51:26 -07:00
David S. Miller 3c2cafaf50 [SPARC64]: Do not expand CHEETAH_LOG_ERROR 3 times.
We only need to expand this thing once, saving some
text section space.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-30 15:11:52 -07:00
David S. Miller dbd2fdf549 [SPARC64]: Kill BRANCH_IF_ANY_CHEETAH() from copy page.
Just patch the branch at boot time instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-30 11:26:15 -07:00
David S. Miller d7ce78fd9a [SPARC64]: Eliminate irq_cpustat_t.
We can put the __softirq_pending mask in the cpudata,
no need for the silly NR_CPUS array in kernel/softirq.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 22:46:43 -07:00
David S. Miller 4f07118f65 [SPARC64]: More fully work around Spitfire Errata 51.
It appears that a memory barrier soon after a mispredicted
branch, not just in the delay slot, can cause the hang
condition of this cpu errata.

So move them out-of-line, and explicitly put them into
a "branch always, predict taken" delay slot which should
fully kill this problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:46:22 -07:00
David S. Miller 442464a500 [SPARC64]: Make debugging spinlocks usable again.
When the spinlock routines were moved out of line into
kernel/spinlock.c this made it so that the debugging
spinlocks record lock acquisition program counts in the
kernel/spinlock.c functions not in their callers.
This makes the debugging info kind of useless.

So record the correct caller's program counter and
now this feature is useful once more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:46:07 -07:00
Kumar Gala 3d6364abcf [SPARC64]: remove use of asm/segment.h
Removed sparc64 architecture specific users of asm/segment.h and
asm-sparc64/segment.h itself

Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:45:30 -07:00
David S. Miller 6c52a96e6c [SPARC64]: Revamp Spitfire error trap handling.
Current uncorrectable error handling was poor enough
that the processor could just loop taking the same
trap over and over again.  Fix things up so that we
at least get a log message and perhaps even some register
state.

In the process, much consolidation became possible,
particularly with the correctable error handler.

Prefix assembler and C function names with "spitfire"
to indicate that these are for Ultra-I/II/IIi/IIe only.

More work is needed to make these routines robust and
featureful to the level of the Ultra-III error handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:45:11 -07:00
David S. Miller bde4e4ee9f [SPARC64]: Do not call winfix_dax blindly
Verify we really are taking a data access exception trap, at TL1, from
one of the window spill/fill handlers.

Else call a new function, data_access_exception_tl1, to log the error.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:44:57 -07:00
David S. Miller 5ea68e0276 [SPARC64]: Fix trap state reading for instruction_access_exception.
1) Read ASI_IMMU SFSR not ASI_DMMU.
2) IMMU has no SFAR, read TPC instead
3) Delete old and incorrect comment about the DTLB protection
   trap having a dependency on the SFSR contents in order to
   function correctly

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-29 12:44:40 -07:00
Steven Rostedt 69be8f1896 [PATCH] convert signal handling of NODEFER to act like other Unix boxes.
It has been reported that the way Linux handles NODEFER for signals is
not consistent with the way other Unix boxes handle it.  I've written a
program to test the behavior of how this flag affects signals and had
several reports from people who ran this on various Unix boxes,
confirming that Linux seems to be unique on the way this is handled.

The way NODEFER affects signals on other Unix boxes is as follows:

1) If NODEFER is set, other signals in sa_mask are still blocked.

2) If NODEFER is set and the signal is in sa_mask, then the signal is
still blocked. (Note: this is the behavior of all tested but Linux _and_
NetBSD 2.0 *).

The way NODEFER affects signals on Linux:

1) If NODEFER is set, other signals are _not_ blocked regardless of
sa_mask (Even NetBSD doesn't do this).

2) If NODEFER is set and the signal is in sa_mask, then the signal being
handled is not blocked.

The patch converts signal handling in all current Linux architectures to
the way most Unix boxes work.

Unix boxes that were tested:  DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.

* NetBSD was the only other Unix to behave like Linux on point #2. The
main concern was brought up by point #1 which even NetBSD isn't like
Linux.  So with this patch, we leave NetBSD as the lonely one that
behaves differently here with #2.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-29 10:03:11 -07:00
Keith Owens 41290c1464 [PATCH] Export pcibios_bus_to_resource
pcibios_bus_to_resource is exported on all architectures except ia64
and sparc.  Add exports for the two missing architectures.  Needed when
Yenta socket support is compiled as a module.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-24 10:22:44 -07:00
David S. Miller a3f9985843 [SPARC64]: Move kernel unaligned trap handlers into assembler file.
GCC 4.x really dislikes the games we are playing in
unaligned.c, and the cleanest way to fix this is to
move things into assembler.

Noted by Al Viro.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-19 15:55:33 -07:00
David S. Miller 2cab224d1f [SPARC64]: Fix 2 bugs in cpufreq drivers.
1) cpufreq wants frequenceis in KHZ not MHZ
2) provide ->get() method so curfreq node is created

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-08-18 14:35:38 -07:00
Linus Torvalds dc836b5b6f Revert "[PATCH] PCI: restore BAR values..."
Revert commit fec59a711e, which is
breaking sparc64 that doesn't have a working pci_update_resource.

We'll re-do this after 2.6.13 when we'll do it all properly.
2005-08-08 18:46:09 -07:00
John W. Linville fec59a711e [PATCH] PCI: restore BAR values after D3hot->D0 for devices that need it
Some PCI devices (e.g. 3c905B, 3c556B) lose all configuration
(including BARs) when transitioning from D3hot->D0.  This leaves such
a device in an inaccessible state.  The patch below causes the BARs
to be restored when enabling such a device, so that its driver will
be able to access it.

The patch also adds pci_restore_bars as a new global symbol, and adds a
correpsonding EXPORT_SYMBOL_GPL for that.

Some firmware (e.g. Thinkpad T21) leaves devices in D3hot after a
(re)boot.  Most drivers call pci_enable_device very early, so devices
left in D3hot that lose configuration during the D3hot->D0 transition
will be inaccessible to their drivers.

Drivers could be modified to account for this, but it would
be difficult to know which drivers need modification.  This is
especially true since often many devices are covered by the same
driver.  It likely would be necessary to replicate code across dozens
of drivers.

The patch below should trigger only when transitioning from D3hot->D0
(or at boot), and only for devices that have the "no soft reset" bit
cleared in the PM control register.  I believe it is safe to include
this patch as part of the PCI infrastructure.

The cleanest implementation of pci_restore_bars was to call
pci_update_resource.  Unfortunately, that does not currently exist
for the sparc64 architecture.  The patch below includes a null
implemenation of pci_update_resource for sparc64.

Some have expressed interest in making general use of the the
pci_restore_bars function, so that has been exported to GPL licensed
modules.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-04 21:32:46 -07:00
David S. Miller 40a085c41d [SPARC]: Add inotify syscall entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-27 14:14:39 -07:00
Eric W. Biederman 59586e5a26 [PATCH] Don't export machine_restart, machine_halt, or machine_power_off.
machine_restart, machine_halt and machine_power_off are machine
specific hooks deep into the reboot logic, that modules
have no business messing with.  Usually code should be calling
kernel_restart, kernel_halt, kernel_power_off, or
emergency_restart. So don't export machine_restart,
machine_halt, and machine_power_off so we can catch buggy users.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-26 14:35:42 -07:00
David S. Miller db7d9a4eb7 [SPARC64]: Move syscall success and newchild state out of thread flags.
These two bits were accesses non-atomically from assembler
code.  So, in order to eliminate any potential races resulting
from that, move these pieces of state into two bytes elsewhere
in struct thread_info.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:36:26 -07:00
David S. Miller cdd5186f75 [SPARC64]: Privatize sun5_timer.
It is only used by some localized code in irq.c, and also
delete enable_prom_timer() as that is totally unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-24 19:36:13 -07:00
Andrew Morton c12a828982 [SPARC64]: Fix SMP build failure.
arch/sparc64/kernel/smp.c:48: error: parse error before "__attribute__"
arch/sparc64/kernel/smp.c:49: error: parse error before "__attribute__"

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-12 12:09:43 -07:00
David S. Miller f7ceba360c [SPARC64]: Add syscall auditing support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 19:29:45 -07:00
David S. Miller 8d8a64796f [SPARC64]: Pass regs and entry/exit boolean to syscall_trace()
Also fix a bug in 32-bit syscall tracing.  We forgot to update
this code when we moved over to the convention that all 32-bit
syscall arguments are zero extended by default.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 16:55:48 -07:00
David S. Miller bb49bcda15 [SPARC64]: Add SECCOMP support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 16:49:28 -07:00
David S. Miller af166d15c3 [SPARC64]: Kill ancient and unused SYSCALL_TRACING debugging code.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 15:56:40 -07:00
David S. Miller d369ddd2fc [SPARC64]: Add __read_mostly support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 15:45:11 -07:00
David S. Miller 9126dfde9e [SPARC]: Add ioprio system call support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-10 15:11:45 -07:00
David S. Miller dcc83a0285 [SPARC64]: Typo in dtlb_backend.S, _PAGE_SZ4M --> _PAGE_SZ4MB
Noticed by Eddie C. Dost

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-08 13:33:10 -07:00
Eddie C. Dost 12cf649f41 [SPARC64]: Fix set_intr_affinity()
Do not cat bucket->irq_info to struct irqaction * directly,
but go through struct irq_desc *.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-06 15:40:21 -07:00
Rusty Lynch 6772926bef [PATCH] kprobes: fix namespace problem and sparc64 build
The following renames arch_init, a kprobes function for performing any
architecture specific initialization, to arch_init_kprobes in order to
cleanup the namespace.

Also, this patch adds arch_init_kprobes to sparc64 to fix the sparc64 kprobes
build from the last return probe patch.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-05 19:19:00 -07:00
David S. Miller 864ae18007 [SPARC64]: Fix IRQ retry interval timer value on sparc64 PCI controllers.
Use '5' instead of 'infinity'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-04 15:58:19 -07:00
David S. Miller 9fba62a59c [SPARC64]: Small Schizo PCI controller programming tweaks.
Use macro instead of magic value for Tomatillo discard-
timeout interrupt enable register bit.

Leave OBP programming PTO value unless Tomatillo and
version >= 0x2.

If no-bus-parking property is present, explicitly clear
PCICTRL_PARK bit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-04 14:53:33 -07:00
David S. Miller bb6743f4f0 [SPARC64]: Do proper DMA IRQ syncing on Tomatillo
This was the main impetus behind adding the PCI IRQ shim.

In order to properly order DMA writes wrt. interrupts, you have to
write to a PCI controller register, then poll for that bit clearing.
There is one bit for each interrupt source, and setting this register
bit tells Tomatillo to drain all pending DMA from that device.

Furthermore, Tomatillo's with revision less than 4 require us to do a
block store due to some memory transaction ordering issues it has on
JBUS.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-04 13:26:04 -07:00
David S. Miller 088dd1f81b [SPARC64]: Add support for IRQ pre-handlers.
This allows a PCI controller to shim into IRQ delivery
so that DMA queues can be drained, if necessary.

If some bus specific code needs to run before an IRQ
handler is invoked, the bus driver simply needs to setup
the function pointer in bucket->irq_info->pre_handler and
the two args bucket->irq_info->pre_handler_arg[12].

The Schizo PCI driver is converted over to use a pre-handler
for the DMA write-sync processing it needs when a device
is behind a PCI->PCI bus deeper than the top-level APB
bridges.

While we're here, clean up all of the action allocation
and handling.  Now, we allocate the irqaction as part of
the bucket->irq_info area.  There is an array of 4 irqaction
(for PCI irq sharing) and a bitmask saying which entries
are active.

The bucket->irq_info is allocated at build_irq() time, not
at request_irq() time.  This simplifies request_irq() and
free_irq() tremendously.

The SMP dynamic IRQ retargetting code got removed in this
change too.  It was disabled for a few months now, and we
can resurrect it in the future if we want.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-07-04 13:24:38 -07:00
David S. Miller 63b614522c [SPARC64]: Get rid of fast IRQ feature.
The only real user was the assembler floppy interrupt
handler, which does not need to be in assembly.

This makes it so that there are less pieces of code which
know about the internal layout of ivector_table[] and
friends.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-27 17:04:45 -07:00
David S. Miller b445e26cbf [SPARC64]: Avoid membar instructions in delay slots.
In particular, avoid membar instructions in the delay
slot of a jmpl instruction.

UltraSPARC-I, II, IIi, and IIe have a bug, documented in
the UltraSPARC-IIi User's Manual, Appendix K, Erratum 51

The long and short of it is that if the IMU unit misses
on a branch or jmpl, and there is a store buffer synchronizing
membar in the delay slot, the chip can stop fetching instructions.

If interrupts are enabled or some other trap is enabled, the
chip will unwedge itself, but performance will suffer.

We already had a workaround for this bug in a few spots, but
it's better to have the entire tree sanitized for this rule.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-27 15:42:04 -07:00
Stephen Rothwell 0d77e5a2c2 [PATCH] compat: introduce compat_time_t
This patch is based on work by Carlos O'Donell and Matthew Wilcox.  It
introduces/updates the compat_time_t type and uses it for compat siginfo
structures.  I have built this on ppc64 and x86_64.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:32 -07:00
Prasanna S Panchamukhi e539c23314 [PATCH] kprobes: Temporary disarming of reentrant probe for sparc64
This patch includes sparc64 architecture specific changes to support temporary
disarming on reentrancy of probes.

Signed-of-by: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:25 -07:00
Rusty Lynch 7e1048b11c [PATCH] Move kprobe [dis]arming into arch specific code
The architecture independent code of the current kprobes implementation is
arming and disarming kprobes at registration time.  The problem is that the
code is assuming that arming and disarming is a just done by a simple write
of some magic value to an address.  This is problematic for ia64 where our
instructions look more like structures, and we can not insert break points
by just doing something like:

*p->addr = BREAKPOINT_INSTRUCTION;

The following patch to 2.6.12-rc4-mm2 adds two new architecture dependent
functions:

     * void arch_arm_kprobe(struct kprobe *p)
     * void arch_disarm_kprobe(struct kprobe *p)

and then adds the new functions for each of the architectures that already
implement kprobes (spar64/ppc64/i386/x86_64).

I thought arch_[dis]arm_kprobe was the most descriptive of what was really
happening, but each of the architectures already had a disarm_kprobe()
function that was really a "disarm and do some other clean-up items as
needed when you stumble across a recursive kprobe." So...  I took the
liberty of changing the code that was calling disarm_kprobe() to call
arch_disarm_kprobe(), and then do the cleanup in the block of code dealing
with the recursive kprobe case.

So far this patch as been tested on i386, x86_64, and ppc64, but still
needs to be tested in sparc64.

Signed-off-by: Rusty Lynch <rusty.lynch@intel.com>
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23 09:45:21 -07:00
Wolfgang Wander 1363c3cd86 [PATCH] Avoiding mmap fragmentation
Ingo recently introduced a great speedup for allocating new mmaps using the
free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
causes huge performance increases in thread creation.

The downside of this patch is that it does lead to fragmentation in the
mmap-ed areas (visible via /proc/self/maps), such that some applications
that work fine under 2.4 kernels quickly run out of memory on any 2.6
kernel.

The problem is twofold:

  1) the free_area_cache is used to continue a search for memory where
     the last search ended.  Before the change new areas were always
     searched from the base address on.

     So now new small areas are cluttering holes of all sizes
     throughout the whole mmap-able region whereas before small holes
     tended to close holes near the base leaving holes far from the base
     large and available for larger requests.

  2) the free_area_cache also is set to the location of the last
     munmap-ed area so in scenarios where we allocate e.g.  five regions of
     1K each, then free regions 4 2 3 in this order the next request for 1K
     will be placed in the position of the old region 3, whereas before we
     appended it to the still active region 1, placing it at the location
     of the old region 2.  Before we had 1 free region of 2K, now we only
     get two free regions of 1K -> fragmentation.

The patch addresses thes issues by introducing yet another cache descriptor
cached_hole_size that contains the largest known hole size below the
current free_area_cache.  If a new request comes in the size is compared
against the cached_hole_size and if the request can be filled with a hole
below free_area_cache the search is started from the base instead.

The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
(earlier posted) leakme.c test program terminates after 50000+ iterations
with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
(as expected) with thread creation, Ingo's test_str02 with 20000 threads
requires 0.7s system time.

Taking out Ingo's patch (un-patch available per request) by basically
deleting all mentions of free_area_cache from the kernel and starting the
search for new memory always at the respective bases we observe: leakme
terminates successfully with 11 distinctive hardly fragmented areas in
/proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
time for Ingo's test_str02 with 20000 threads.

Now - drumroll ;-) the appended patch works fine with leakme: it ends with
only 7 distinct areas in /proc/self/maps and also thread creation seems
sufficiently fast with 0.71s for 20000 threads.

Signed-off-by: Wolfgang Wander <wwc@rentec.com>
Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:16 -07:00
David S. Miller 88314ee73f [SPARC64]: Refine PCI strbuf ctx-based flush.
The initial peek read PIO of the match register is just a waste.
Just do the flush writes first, as that is more efficient.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-31 19:13:52 -07:00
David S. Miller 7c963ad1d1 [SPARC64]: Fix streaming buffer flushing on PCI and SBUS.
Firstly, if the direction is TODEVICE, then dirty data in the
streaming cache is impossible so we can elide the flush-flag
synchronization in that case.

Next, the context allocator is broken.  It is highly likely
that contexts get used multiple times for different dma
mappings, which confuses the strbuf flushing code and makes
it run inefficiently.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-31 16:57:59 -07:00
David S. Miller 816242da37 [SPARC64]: Add boot option to force UltraSPARC-III P-Cache on.
Older UltraSPARC-III chips have a P-Cache bug that makes us disable it
by default at boot time.

However, this does hurt performance substantially, particularly with
memcpy(), and the bug is _incredibly_ obscure.  I have never seen it
triggered in practice, ever.

So provide a "-P" boot option that forces the P-Cache on.  It taints
the kernel, so if it does trigger and cause some data corruption or
OOPS, we will find out in the logs that this option was on when it
happened.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-23 15:52:08 -07:00
David S. Miller a228dfd5dc [SPARC64]: Fix bad performance side effect of strbuf timeout changes.
The recent change to add a timeout to strbuf flushing had
a negative performance impact.  The udelay()'s are too long,
and they were done in the wrong order wrt. the register read
checks.  Fix both, and things are happy again.

There are more possible improvements in this area.  In fact,
PCI streaming buffer flushing seems to be part of the bottleneck
in network receive performance on my SunBlade1000 box.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-20 11:40:32 -07:00
David S. Miller 4dbc30fb27 [SPARC64]: Add timeouts to streaming buffer synchronization.
If some hardware error occurs and the flush flag never updates,
we will hang forever in these routines.  Add a timeout, and
print out a diagnostic if it is reached.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-11 11:37:00 -07:00
Coywolf Qi Hunt 7cc1712b8a [SPARC]: Remove legacy stuff from cpu_idle().
Currently sparc and sparc64's UP cpu_idle() checks current pid. This
is old time legacy. Now it's paranoia.

Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org>
Acked-by: William Irwin <wli@holomorphy.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-05 14:53:01 -07:00
Christoph Hellwig 8edf72ebce [SPARC64]: Kill useless __pte_alloc_one_kernel indirection
warning: untested, but it there's not too much chance for screwups

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-05 14:27:56 -07:00
David S. Miller 41832a08fe [SPARC64]: Disable IRQ forwarding.
There is some race whereby IRQs get stuck, the IRQ status
is pending but no processor actually handles the IRQ vector
and thus the interrupt.
 
This is a temporary workaround.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 22:05:43 -07:00
David S. Miller cee2824f85 [SPARC64]: Fix goal_cpu tracking in retarget_one_irq().
We would never advance the goal_cpu counter like we
should, so all IRQs would go to a single processor.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-05-03 22:04:36 -07:00
Jesper Juhl 7ed20e1ad5 [PATCH] convert that currently tests _NSIG directly to use valid_signal()
Convert most of the current code that uses _NSIG directly to instead use
valid_signal().  This avoids gcc -W warnings and off-by-one errors.

Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01 08:59:14 -07:00
Al Viro ef0299bf8e [PATCH] mostek bogus sparse annotations fixed
void * __iomem foo is not a pointer to iomem - it's an iomem variable
containing void *.  A pile of such guys in arch/sparc64/kernel/time.c,
drivers/sbus/char/rtc.c and include/asm-sparc64/mostek.h turned into
intended void __iomem *. 

Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-24 12:28:36 -07:00
David S. Miller b4bca26c01 [SPARC]: Provide generic ioctls in Sparc RTC driver.
Provide support for drivers/char/rtc.c ioctls in the
Mostek rtc driver as well as the Sparc specific RTCGET
and RTCSET.

This allows userspace to be much less messy.  Currently
util-linux and other spots jump through hoops trying
various ioctl variants until it hits the right one whatever
driver actually being used supports.

Eventually all of this should move over to the genrtc.c
driver, but not today...

While we are here, fix up the register types for sparse.

Thanks to Frans Pop for helping point out this issue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-04-21 21:42:34 -07:00
David S. Miller 0ba4da03cc [PATCH] sparc64: Fix stat
Like Alpha, sparc64's struct stat was defined before we had the
nanosecond et al.  fields added.  So like Alpha I have to cons up a
struct stat64 to get this stuff.  I'll work on the glibc bits soon. 

Also, we were forgetting to fill in the nanosecond fields in the sparc
compat stat64 syscalls. 

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-18 15:13:15 -07:00
Jurij Smakov 9c7d3b3a6b [PATCH] sparc64: Fix copy_sigingo_to_user32()
The compat routine to copy over this data structure was not
handling SI_POLL correctly, breaking various fcntl() variants
in compat tasks.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-17 18:03:12 -07:00
David S. Miller dadeafdfc8 [PATCH] sparc64: Reduce ptrace cache flushing
We were flushing the D-cache excessively for ptrace() processing
and this makes debugging threads so slow as to be totally unusable.

All process page accesses via ptrace() go via access_process_vm().
This routine, for each process page, uses get_user_pages().  That
in turn does a flush_dcache_page() on the child pages before we
copy in/out the ptrace request data.

Therefore, all we need to do after the data movement is:

1) Flush the D-cache pages if the kernel maps the page to a different
   color than userspace does.
2) If we wrote to the page, we need to flush the I-cache on older cpus.

Previously we just flushed the entire cache at the end of a ptrace()
request, and that was beyond stupid.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-17 18:03:11 -07:00
David S. Miller fb65b9619b [PATCH] sparc: Fix PTRACE_CONT bogosity
SunOS aparently had this weird PTRACE_CONT semantic which
we copied.  If the addr argument is something other than
1, it sets the process program counter to whatever that
value is.

This is different from every other Linux architecture, which
don't do anything with the addr and data args.

This difference in particular breaks the Linux native GDB support
for fork and vfork tracing on sparc and sparc64.

There is no interest in running SunOS binaries using this weird
PTRACE_CONT behavior, so just delete it so we behave like other
platforms do.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-17 18:03:11 -07:00
David S. Miller 961f8bc9fc [PATCH] sparc64: use message queue compat syscalls
A couple message queue system call entries for compat tasks
were not using the necessary compat_sys_*() functions, causing
some glibc test cases to fail.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-17 18:03:10 -07:00
Linus Torvalds 1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00