Commit Graph

28 Commits

Author SHA1 Message Date
Chris Metcalf 35f059761c tilegx: change how we find the kernel stack
Previously, we used a special-purpose register (SPR_SYSTEM_SAVE_K_0)
to hold the CPU number and the top of the current kernel stack
by using the low bits to hold the CPU number, and using the high
bits to hold the address of the page just above where we'd want
the kernel stack to be.  That way we could initialize a new SP
when first entering the kernel by just masking the SPR value and
subtracting a couple of words.

However, it's actually more useful to be able to place an arbitrary
kernel-top value in the SPR.  This allows us to create a new stack
context (e.g. for virtualization) with an arbitrary top-of-stack VA.
To make this work, we now store the CPU number in the high bits,
above the highest legal VA bit (42 bits in the current tilegx
microarchitecture).  The full 42 bits are thus available to store the
top of stack value.  Getting the current cpu (a relatively common
operation) is still fast; it's now a shift rather than a mask.

We make this change only for tilegx, since tilepro has too few SPR
bits to do this, and we don't need this support on tilepro anyway.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-08-30 11:56:58 -04:00
Chris Metcalf 9ae0983847 tile: provide traceability for hypervisor calls
This change adds infrastructure (CONFIG_TILE_HVGLUE_TRACE) that
provides C code wrappers for the calls the kernel makes to the Tilera
hypervisor.  This allows standard kernel infrastructure like FTRACE to
be able to instrument hypervisor calls.

To allow direct calls to the true API, we export their names with a
leading underscore as well.  This is important for the few contexts
where we need to make hypervisor calls without touching the stack.

As part of this change, we also switch from creating the symbols
with linker magic to creating them with assembler magic.  This lets
us provide a symbol type and generally make them appear more as symbols
and less as just random values in the Elf namespace.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-08-13 16:26:31 -04:00
Chris Metcalf bc1a298f4e tile: support CONFIG_PREEMPT
This change adds support for CONFIG_PREEMPT (full kernel preemption).
In addition to the core support, this change includes a number
of places where we fix up uses of smp_processor_id() and per-cpu
variables.  I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN
values for page homing, as it turns out they weren't being used.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-08-13 16:26:01 -04:00
Chris Metcalf 2f9ac29eec tile: fast-path unaligned memory access for tilegx
This change enables unaligned userspace memory access via a kernel
fast path on tilegx.  The kernel tracks user PC/instruction pairs
per-thread using a direct-mapped cache in userspace.  The cache
maps those PC/instruction pairs to JIT'ed instruction sequences that
load or store using byte-wide load store intructions and then
synthesize 2-, 4- or 8-byte load or store results.  Once an
instruction has been seen to generate an unaligned access once,
subsequent hits on that instruction typically require overhead
of only around 50 cycles if cache and TLB is hot.

We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to
enable or disable unaligned fixups on a per-process basis.

To do this we pull some of the tilepro unaligned support out of the
single_step.c file; tilepro uses instruction disassembly for both
single-step and unaligned access support.  Since tilegx actually has
hardware singlestep support, though, it's cleaner to keep the tilegx
unaligned access code in a separate file.  While we're at it,
properly rename the tilepro-specific types, etc., to have tilepro
suffixes instead of generic tile suffixes.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-08-13 16:04:10 -04:00
Chris Metcalf b63ea7121c tile: fix comment bug in sys_cmpxchg description
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2013-08-12 14:46:47 -04:00
Simon Marchi ef18272453 arch/tile: Call tracehook_report_syscall_{entry,exit} in syscall trace
Call tracehook functions for syscall tracing.

The check for TIF_SYSCALL_TRACE was removed, because the same check is
done right before in the assembly file.

Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [with ptrace.h fixup]
2013-03-21 15:39:34 -04:00
Chris Metcalf 6b14e4198c arch/tile: eliminate pt_regs trampolines for syscalls
Using the new current_pt_regs() model, we can remove some trampolines
from assembly code and call directly to the C syscall implementations.
rt_sigreturn() and clone() still need some assembly wrapping, but no
longer are passed a pt_regs pointer.  sigaltstack() and the
tilepro-specific cmpxchg_badaddr() syscalls are now just straight C.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-10-23 16:23:58 -04:00
Al Viro 530550651f tile: switch to generic sys_execve()
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-23 13:40:12 -04:00
Chris Metcalf 0f8b983812 tile: support GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE
Also provide an optimized current_pt_regs() while we're at it.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-10-20 13:13:29 -04:00
Chris Metcalf fc327e268f arch/tile: fix up some issues in calling do_work_pending()
First, we were at risk of handling thread-info flags, in particular
do_signal(), when returning from kernel space.  This could happen
after a failed kernel_execve(), or when forking a kernel thread.
The fix is to test in do_work_pending() for user_mode() and return
immediately if so; we already had this test for one of the flags,
so I just hoisted it to the top of the function.

Second, if a ptraced process updated the callee-saved registers
in the ptregs struct and then processed another thread-info flag, we
would overwrite the modifications with the original callee-saved
registers.  To fix this, we add a register to note if we've already
saved the registers once, and skip doing it on additional passes
through the loop.  To avoid a performance hit from the couple of
extra instructions involved, I modified the GET_THREAD_INFO() macro
to be guaranteed to be one instruction, then bundled it with adjacent
instructions, yielding an overall net savings.

Reported-By: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-05-16 16:01:16 -04:00
Chris Metcalf e1d5c01950 arch/tile: avoid accidentally unmasking NMI-type interrupt accidentally
The return path as we reload registers and core state requires that r30
hold a boolean indicating whether we are returning from an NMI, but in a
couple of cases we weren't setting this properly, with the result that we
could accidentally unmask the NMI interrupt(s), which could cause confusion.
Now we set r30 in every place where we jump into the interrupt return path.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2012-04-02 12:14:03 -04:00
Chris Metcalf d52104b29a tile: revert change from <asm/atomic.h> to <linux/atomic.h> in asm files
The 32-bit TILEPro support uses some #defines in <asm/atomic_32.h>
for atomic support routines in assembly.  To make this more explicit,
I've turned those includes into includes of <asm/atomic_32.h>, which
should hopefully make it clear that they shouldn't be bombed into
<linux/atomic.h> in any cleanups.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-10-13 08:25:01 -04:00
Arun Sharma 60063497a9 atomic: use <linux/atomic.h>
This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-26 16:49:47 -07:00
Chris Metcalf df29ccb6c0 arch/tile: allow nonatomic stores to interoperate with fast atomic syscalls
This semantic was already true for atomic operations within the kernel,
and this change makes it true for the fast atomic syscalls (__NR_cmpxchg
and __NR_atomic_update) as well.  Previously, user-space had to use
the fast atomic syscalls exclusively to update memory, since raw stores
could lose a race with the atomic update code even when the atomic update
hadn't actually modified the value.

With this change, we no longer write back the value to memory if it
hasn't changed.  This allows certain types of idioms in user space to
work as expected, e.g. "atomic exchange" to acquire a spinlock, followed
by a raw store of zero to release the lock.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-05-04 14:40:07 -04:00
Chris Metcalf 313ce674d3 arch/tile: support TIF_NOTIFY_RESUME
This support is required for CONFIG_KEYS, NFSv4 kernel DNS, etc.
The change is slightly more complex than the minimal thing, since
I took advantage of having to go into the assembly code to just
move a bunch of stuff into C code: specifically, the schedule(),
do_async_page_fault(), do_signal(), and single_step_once() support,
in addition to the TIF_NOTIFY_RESUME support.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-05-02 18:53:35 -04:00
Chris Metcalf 76c567fbba arch/tile: support 4KB page size as well as 64KB
The Tilera architecture traditionally supports 64KB page sizes
to improve TLB utilization and improve performance when the
hardware is being used primarily to run a single application.

For more generic server scenarios, it can be beneficial to run
with 4KB page sizes, so this commit allows that to be specified
(by modifying the arch/tile/include/hv/pagesize.h header).

As part of this change, we also re-worked the PTE management
slightly so that PTE writes all go through a __set_pte() function
where we can do some additional validation.  The set_pte_order()
function was eliminated since the "order" argument wasn't being used.

One bug uncovered was in the PCI DMA code, which wasn't properly
flushing the specified range.  This was benign with 64KB pages,
but with 4KB pages we were getting some larger flushes wrong.

The per-cpu memory reservation code also needed updating to
conform with the newer percpu stuff; before it always chose 64KB,
and that was always correct, but with 4KB granularity we now have
to pay closer attention and reserve the amount of memory that will
be requested when the percpu code starts allocating.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-03-10 13:17:53 -05:00
Chris Metcalf 5fb682b064 arch/tile: fix some comments and whitespace
This is a grab bag of changes with no actual change to generated code.
This includes whitespace and comment typos, plus a couple of stale
comments being removed.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-03-10 13:14:03 -05:00
Chris Metcalf b2ce2bdaf9 arch/tile: stop disabling INTCTRL_1 interrupts during hypervisor downcalls
The problem was that this could lead to IPIs being disabled during
the softirq processing after a hypervisor downcall (e.g. for I/O),
since both IPI and device interrupts use the INCTRL_1 downcall mechanism.
When this happened at the wrong time, it could lead to deadlock.

Luckily, we were already maintaining the per-interrupt state we need,
and using it in the proper way in the hypervisor, so all we had to do
was to change Linux to stop blocking downcall interrupts for the entire
length of the downcall.  (Now they're blocked while we're executing the
downcall routine itself, but not while we're executing any subsequent
softirq routines.)  The hypervisor is doing a very small amount of
work it no longer needs to do (masking INTCTRL_1 on entry to the client
interrupt routine), but doing so means that older versions of Tile Linux
will continue to work with a current hypervisor, so that seems reasonable.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2011-03-01 16:20:10 -05:00
Chris Metcalf 81711cee93 arch/tile: handle rt_sigreturn() more cleanly
The current tile rt_sigreturn() syscall pattern uses the common idiom
of loading up pt_regs with all the saved registers from the time of
the signal, then anticipating the fact that we will clobber the ABI
"return value" register (r0) as we return from the syscall by setting
the rt_sigreturn return value to whatever random value was in the pt_regs
for r0.

However, this breaks in our 64-bit kernel when running "compat" tasks,
since we always sign-extend the "return value" register to properly
handle returned pointers that are in the upper 2GB of the 32-bit compat
address space.  Doing this to the sigreturn path then causes occasional
random corruption of the 64-bit r0 register.

Instead, we stop doing the crazy "load the return-value register"
hack in sigreturn.  We already have some sigreturn-specific assembly
code that we use to pass the pt_regs pointer to C code.  We extend that
code to also set the link register to point to a spot a few instructions
after the usual syscall return address so we don't clobber the saved r0.
Now it no longer matters what the rt_sigreturn syscall returns, and the
pt_regs structure can be cleanly and completely reloaded.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-12-17 16:59:29 -05:00
Chris Metcalf 233325b949 arch/tile: enable single-step support for TILE-Gx
This is not quite the complete support, since we're not yet shipping
intvec_64.S, but it is the support relevant to the set of files we are
currently shipping, and makes it easier to track changes between
our internal sources and our public GIT repository.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-15 15:38:26 -04:00
Chris Metcalf a78c942df6 arch/tile: parameterize system PLs to support KVM port
While not a port to KVM (yet), this change modifies the kernel
to be able to build either at PL1 or at PL2 with a suitable
config switch.  Pushing up this change avoids handling branch
merge issues going forward with the KVM work.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-15 15:38:09 -04:00
Chris Metcalf a4dbc5ee52 arch/tile: change lower bound on syscall error return to -4095
Previously we were using -1023, which is fine for normal syscall
error returns, but the common value in use for other platforms
is -4095, and one Tilera-specific driver does use values in the
-1100 range, so tickled this bug.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-14 15:14:29 -04:00
Chris Metcalf d6f0f22c3c arch/tile: update some comments to clarify register usage.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-14 14:42:58 -04:00
Chris Metcalf d929b6aeaa arch/tile: Use <asm-generic/syscalls.h>
With this change we now include <asm-generic/syscalls.h> into the "tile"
version of the header.  To take full advantage of the prototypes there,
we also change our naming convention for "struct pt_regs *" syscalls so
that, e.g., _sys_execve() is the "true" syscall entry, which sets the
appropriate register to point to the pt_regs before calling sys_execve().

While doing this I realized I no longer needed the fork and vfork
entry point stubs, since those functions aren't in the generic
syscall ABI, so I removed them as well.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-10-14 14:34:33 -04:00
Chris Metcalf ea44e06e79 arch/tile: remove dead code from intvec_32.S
This "bpt_code" instruction was killed off in our development line a while
ago (the actual definition of bpt_code that is used is in kernel/traps.c)
but I didn't push it for 2.6.36 because it seemed harmless and I didn't
want to try to push more than absolutely necessary.

However, we recently fixed a bug in our gcc that had been causing
"-gdwarf2" not to be passed to the assembler, and passing this flag causes
an erroneous assembler failure in the presence of code in a data section,
sometimes.  While we'd like to track down the bug in the assembler,
we'd also like to make sure 2.6.36 builds with the current toolchain,
so I'm removing this dead code as well.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-09-24 17:19:20 -04:00
Chris Metcalf ba00376b0b arch/tile: extend syscall ABI to set r1 on return as well.
Until now, the tile architecture ABI for syscall return has just been
that r0 holds the return value, and an error is only signalled like it is
for kernel code, with a negative small number.

However, this means that in multiple places in userspace we end up writing
the same three-cycle idiom that tests for a small negative number for
error.  It seems cleaner to instead move that code into the kernel, and
set r1 to hold zero on success or errno on failure; previously, r1 was
just zeroed on return from the kernel (to avoid leaking kernel state).
This way a single conditional branch after the syscall is sufficient
to test for the failure case.  The number of cycles taken is the same,
but the error-checking code is in just one place, so total code size is
smaller, and random userspace syscall code is easier to understand.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
2010-08-13 16:37:00 -04:00
Chris Metcalf 9f9c0382cd arch/tile: Add driver to enable access to the user dynamic network.
This network (the "UDN") connects all the cpus on the chip in a
wormhole-routed dynamic network.  Subrectangles of the chip can
be allocated by a "create" ioctl on /dev/hardwall, and then to access the
UDN in that rectangle, tasks must perform an "activate" ioctl on that
same file object after affinitizing themselves to a single cpu in
the region.  Sending a wormhole-routed message that tries to leave
that subrectangle causes all activated tasks to receive a SIGILL
(just as they would if they tried to access the UDN without first
activating themselves to a hardwall rectangle).

The original submission of this code to LKML had the driver
instantiated under /proc/tile/hardwall.  Now we just use a character
device for this, conventionally /dev/hardwall.  Some futures planning
for the TILE-Gx chip suggests that we may want to have other types of
devices that share the general model of "bind a task to a cpu, then
'activate' a file descriptor on a pseudo-device that gives access to
some hardware resource".  As such, we are using a device rather
than, for example, a syscall, to set up and activate this code.

As part of this change, the compat_ptr() declaration was fixed and used
to pass the compat_ioctl argument to the normal ioctl.  So far we limit
compat code to 2GB, so the difference between zero-extend and sign-extend
(the latter being correct, eventually) had been overlooked.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2010-07-06 13:34:15 -04:00
Chris Metcalf 867e359b97 arch/tile: core support for Tilera 32-bit chips.
This change is the core kernel support for TILEPro and TILE64 chips.
No driver support (except the console driver) is included yet.

This includes the relevant Linux headers in asm/; the low-level
low-level "Tile architecture" headers in arch/, which are
shared with the hypervisor, etc., and are build-system agnostic;
and the relevant hypervisor headers in hv/.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Paul Mundt <lethal@linux-sh.org>
2010-06-04 17:11:18 -04:00