This patch exports per-context statistics in spufs as long as spu
statistics in sysfs.
It was formed by merging:
"spufs: add spu stats in sysfs" From: Christoph Hellwig
"spufs: add stat file to spufs" From: Christoph Hellwig
"spufs: fix libassist accounting" From: Jeremy Kerr
"spusched: fix spu utilization statistics" From: Luke Browning
And some adjustments by myself, after suggestions on cbe-oss-dev.
Having separate patches was making the review process harder
than it should, as we end up integrating spus and ctx statistics
accounting much more than it was on the first implementation.
Signed-off-by: Andre Detsch <adetsch@br.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Reading from the signal{1,2} files requires a spu_acquire_saved, so
make these files write-only for contexts created with
SPU_CREATE_NOSCHED.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
When waiting for I/O events on mfc in an SPU context by using
poll/epoll syscalls, some of the events can be lost because of wrong
order of poll_wait and MFC status checks in the spufs_mfc_poll
function and non-atomic update of tagwait. This fixes the
problem.
Signed-off-by: Kazunori Asayama <asayama@sm.sony.co.jp>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The new tid file contains the ID of the thread currently running the
context, if any. This is used so that the new spu-top and spu-ps
tools can find the thread in /proc.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove redundant whitespace in arch/powerpc/platforms/cell/spufs/
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a "capabilities" file to spu contexts consisting of a
list of linefeed separated capability names. The current exposed
capabilities are "sched" (the context is scheduleable) and
"step" (the context supports single stepping).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make sure the mapping_lock also protects access to the various address_space
pointers used for tearing down the ptes on a spu context switch.
Because unmap_mapping_range can sleep we need to turn mapping_lock from
a spinlock into a sleeping mutex.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently spufs_mem_release and the mem file doesn't have any release
method hooked up, leading to leaks everytime is used.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds an option to spufs when the kernel is configured for
4K page to give it the ability to use 64K pages for SPE local store
mappings.
Currently, we are optimistic and try order 4 allocations when creating
contexts. If that fails, the code will fallback to 4K automatically.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We now have proper locking around assignets of the mapping pointers,
and the spin_unlock implies enough of a barrier to get rid of the
explicit one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
This patch checks return value of spu_acquire_runnable() in
spufs_mfc_write().
Signed-off-by: Akinobu Mita <mita@fixstars.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Make sure the pointers to various mappings are cleared once the last
user stopped using them. This avoids accessing freed memory when
tearing down the gang directory aswell as optimizing away
pte invalidations if no one uses these.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Due to a buggy unsigned comparison, it was possible to write
beyond the end of the local store file in spufs under some
circumstances.
This rewrites the buggy function to look more like
simple_copy_from_buffer.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Cc: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
I found an exploit in current kernel.
Currently, there is no range check about mmapping "/mem" node in
spufs. Thus, an application can access privilege memory region.
In case this kernel already worked on a public server, I send this
information only here.
If there are such servers in somewhere, please replace it, ASAP.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
There is no need to directly wake up contexts in spu_activate when
called from spu_run, so add a flag to surpress this wakeup.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
It looks like we've had some serious bitrot there mostly due to tracking
of address_space's of mmap'ed files getting out of sync with the actual
mmap code. The mfc, mss and psmap were not tracked properly and thus
not invalidated on context switches (oops !)
I also removed the various file->f_mapping = inode->i_mapping;
assignments that were done in the other open() routines since that
is already done for us by __dentry_open.
One improvement we might want to do later is to assign the various
ctx-> fields at mmap time instead of file open/close time so that we
don't call unmap_mapping_range() on thing that have not been mmap'ed
Finally, I added some smp_wmb's after assigning the ctx-> fields to make
sure they are visible to other CPUs. I don't think this is really
necessary as I suspect locking in the fs layer will make that happen
anyway but better safe than sorry.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch removes the need for struct page for SPE local store
and registers from spufs. It also makes the locking much more
obvious and no longer relying on the truncate logic black magic
for protecting against races between unmap_mapping_range() and
new pages faulted in. It does so by switching to a nopfn() handler
and using the new vm_insert_pfn() to setup the PTEs itself while
holding a lock on the SPE.
The nice thing is that this patch actually removes a lot more code
than it adds :-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.
[akpm@osdl.org: sparc64 fix]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds SPU elf notes to the coredump. It creates a separate note
for each of /regs, /fpcr, /lslr, /decr, /decr_status, /mem, /signal1,
/signal1_type, /signal2, /signal2_type, /event_mask, /event_status,
/mbox_info, /ibox_info, /wbox_info, /dma_info, /proxydma_info, /object-id.
A new macro, ARCH_HAVE_EXTRA_NOTES, was created for architectures to
specify they have extra elf core notes.
A new macro, ELF_CORE_EXTRA_NOTES_SIZE, was created so the size of the
additional notes could be calculated and added to the notes phdr entry.
A new macro, ELF_CORE_WRITE_EXTRA_NOTES, was created so the new notes
would be written after the existing notes.
The SPU coredump code resides in spufs. Stub functions are provided in the
kernel which are hooked into the spufs code which does the actual work via
register_arch_coredump_calls().
A new set of __spufs_<file>_read/get() functions was provided to allow the
coredump code to read from the spufs files without having to lock the
SPU context for each file read from.
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
In order to fit with the "don't-run-spus-outside-of-spu_run" model, this
patch starts the isolated-mode loader in spu_run, rather than
spu_create. If spu_run is passed an isolated-mode context that isn't in
isolated mode state, it will run the loader.
This fixes potential races with the isolated SPE app doing a
stop-and-signal before the PPE has called spu_run: bugzilla #29111.
Also (in conjunction with a mambo patch), this addresses #28565, as we
always set the runcntrl register when entering spu_run.
It is up to libspe to ensure that isolated-mode apps are cleaned up
after running to completion - ie, put the app through the "ISOLATE EXIT"
state (see Ch11 of the CBEA).
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When fixing spufs to map the 'mem' file backing store cacheable,
I incorrectly set the physical mapping to use both cache-inhibited
and guarded mapping, which resulted in a serious performance
degradation.
Debugged-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When one of the spufs files is mapped into a process address
space, regular users can use ptrace to attempt accessing
them with access_process_vm(). With the way that the
mappings currently work, this likely causes an oops.
Setting the vm_flags to VM_IO makes sure that ptrace can
not access them but returns an error code. This is not
the perfect solution in case of the local store mapping,
but it fixes the oops in a well-defined way.
Also remove leftover VM_RESERVED flags in spufs. The
VM_RESERVED flag is on it's way out and not checked by
the memory managment code anymore.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Christoph Hellwig <chellwig@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We need to check the channel count of the signal notification registers
before reading them, because it can be undefined when the count is
zero. In order to read count and data atomically, we read from the
saved context.
This patch uses spu_acquire_saved() to force a context save before a
/signal1 or /signal2 read. Because of this it is no longer necessary to
have backing_ops and hw_ops versions of this function so they have been
removed.
Regular applications should not rely on reading this register
to be fast, as it's conceptually a write-only file from the PPE
perspective.
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch implements read only access to
/mbox_info - SPU Write Outbound Mailbox
/ibox_info - SPU Write Outbound Interrupt Mailbox
/wbox_info - SPU Read Inbound Mailbox
These files are used by gdb in order to look into the current mailbox
queues without changing the contents at the same time. They are
not meant for general programming use, since the access requires
a context save and is therefore rather slow.
It would be good to complement this patch with one that adds
write support as well.
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch removes the /spu_tag_mask file from spufs. The data provided by
this file is also available from the /dma_info file in the dma_info_mask
of the spu_dma_info struct.
The file was intended to be used by gdb, but that never used it, and
now it has been replaced with the more verbose dma_info file.
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The /lslr file gives read access to the SPU_LSLR register in hex; 0x3fff
for example The /dma_info file provides read access to the SPU Command
Queue in a binary format. The /proxydma_info files provides read access
access to the Proxy Command Queue in a binary format. The spu_info.h
file provides data structures for interpreting the binary format of
/dma_info and /proxydma_info.
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patches changes /npc, /decr, /decr_status, /spu_tag_mask,
/event_mask, /event_status, and /srr0 files to provide output according to
the format string "0x%llx" instead of "%llx".
Before this patch some files used "0x%llx" and other used "%llx" which is
inconsistent and potentially confusing. A user might assume "%llx" numbers
were decimal if they happened to not contain any a-f digits. This change
will break any code cannot tolerate a leading 0x in the file contents. The
only known users of these files are the libspe but there might also be
some scripts which access these files. This risk is deemed acceptable for
future consistency.
Signed-off-by: Dwayne Grant McConnell <decimal@us.ibm.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When in isolated mode, SPEs have access to an area of persistent
storage, which is per-SPE. In order for isolated-mode apps to
communicate arbitrary data through this storage, we need to ensure that
isolated physical SPEs can be reused for subsequent applications.
Add a file ("recycle") in a spethread dir to enable isolated-mode
recycling. By writing to this file, the kernel will reload the
isolated-mode loader kernel, allowing a new app to be run on the same
physical SPE.
This requires the spu_acquire_exclusive function to enforce exclusive
access to the SPE while the loader is initialised.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds two new flags to spu_create:
SPU_CREATE_NONSCHED: create a context that is never moved
away from an SPE once it has started running. This flag
can only be used by tasks with the CAP_SYS_NICE capability.
SPU_CREATE_ISOLATED: create a nonschedulable context that
enters isolation mode upon first run. This requires the
SPU_CREATE_NONSCHED flag.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently, spufs_mbox_read transfers more bytes than requested on a
read. If you ask for four bytes, you get eight. This fixes it to
transfer the largest multiple of four bytes that is less than or equal
to the number you asked for.
Note: one nasty property of this file in spufs is that you can only
read multiples of four bytes in the first place, since there is no way
to atomically put back a few bytes into the hardware register. Thus,
reading less than four bytes returns -EINVAL. Asking for more than
four returns the largest possible multiple of four.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a memory leak introduced by "spufs: add support
for read/write oncntl", which was missing a call to simple_attr_close.
Signed-off-by: Masato Noguchi <Masato.Noguchi@jp.sony.com>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds an 'object-id' file that the spe library can
use to store a pointer to its ELF object. This was
originally meant for use by oprofile, but is now
also used by the GNU debugger, if available.
In order for oprofile to find the location in an spu-elf
binary where an event counter triggered, we need a way
to identify the binary in the first place.
Unfortunately, that binary itself can be embedded in a
powerpc ELF binary. Since we can assume it is mapped into
the effective address space of the running process,
have that one write the pointer value into a new spufs
file.
When a context switch occurs, pass the user value to
the profiler so that can look at the mapped file (with
some care).
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Writing to cntl can be used to stop execution on the
spu and to restart it, reading from cntl gives the
contents of the current status register.
The access is always in ascii, as for most other files.
This was always meant to be there, but we had a little
problem with writing to runctl so it was left out so
far.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since libspe2 will provide a function that can read/write
multiple mailbox elements at once, the kernel should handle
that efficiently.
read/write on the three mailbox files can now access the
spe context multiple times to operate on any number of
mailbox data elements.
If the spu application keeps writing to its outbound
mailbox, the read call will pick up all the data in a
single system call.
Unfortunately, if the user passes an invalid pointer,
we may lose a mailbox element on read, since we can't
put it back. This probably impossible to solve, if the
user also accesses the mailbox through direct register
access.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This hopefully fixes a long-standing bug in the spu file system.
An spu context comes with local memory that can be either saved
in kernel pages or point directly to a physical SPE.
When mapping the physical SPE, that mapping needs to be cache-inhibited.
For simplicity, we used to map the kernel backing memory that way
too, but unfortunately that was not only inefficient, but also incorrect
because the same page could then be accessed simultaneously through
a cacheable and a cache-inhibited mapping, which is not allowed
by the powerpc specification and in our case caused data inconsistency
for which we did a really ugly workaround in user space.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds a new "psmap" file to spufs that allows mmap of all of
the problem state mapping of SPEs. It is compatible with 64k pages. In
addition, it removes mmap ability of individual files when using 64k
pages, with the exception of signal1 and signal2 which will both map the
entire 64k page holding both registers. It also removes
CONFIG_SPUFS_MMAP as there is no point in not building mmap support in
spufs.
It goes along a separate patch to libspe implementing usage of that new
file to access problem state registers.
Another patch will follow up to fix races opened up by accessing
the 'runcntl' register directly, which is made possible with this
patch.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a bug where we don't properly map SPE MMIO space as guarded,
causing various test cases to fail, probably due to write combining and other
niceties caused by the lack of the G bit.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For performance analysis, it is often interesting to know
which physical SPE a thread is currently running on, and,
more importantly, if it is running at all.
This patch adds a simple attribute to each SPU directory
with that information.
The attribute is read-only and called 'phys-id'. It contains
an ascii string with the number of the physical SPU (e.g.
"0x5"), or alternatively the string "0xffffffff" (32 bit -1)
when it is not running at all at the time that the file
is read.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A recent change to the way that the mfc file gets mapped made it
impossible to map the SPE Multi-Source Synchronization register
into user space, but that may be needed by some applications.
This restores the missing functionality.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch is layered on top of CONFIG_SPARSEMEM
and is patterned after direct mapping of LS.
This patch allows mmap() of the following regions:
"mfc", which represents the area from [0x3000 - 0x3fff];
"cntl", which represents the area from [0x4000 - 0x4fff];
"signal1" which begins at offset 0x14000; "signal2" which
begins at offset 0x1c000.
The signal1 & signal2 files may be mmap()'d by regular user
processes. The cntl and mfc file, on the other hand, may
only be accessed if the owning process has CAP_SYS_RAWIO,
because they have the potential to confuse the kernel
with regard to parallel access to the same files with
regular file operations: the kernel always holds a spinlock
when accessing registers in these areas to serialize them,
which can not be guaranteed with user mmaps,
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds a new file called 'mfc' to each spufs directory.
The file accepts DMA commands that are a subset of what would
be legal DMA commands for problem state register access. Upon
reading the file, a bitmask is returned with the completed
tag groups set.
The file is meant to be used from an abstraction in libspe
that is added by a different patch.
From the kernel perspective, this means a process can now
offload a memory copy from or into an SPE local store
without having to run code on the SPE itself.
The transfer will only be performed while the SPE is owned
by one thread that is waiting in the spu_run system call
and the data will be transferred into that thread's
address space, independent of which thread started the
transfer.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The logic for sys_spu_run keeps growing and it does
not really belong into file.c any more since we
moved away from using regular file operations to our
own syscall.
No functional change in here.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
During an earlier cleanup, we lost the serialization
of multiple spu_run calls performed on the same
spu_context. In order to get this back, introduce a
mutex in the spu_context that is held inside of spu_run.
Noticed by Al Viro.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Handling mailbox interrupts was broken in multiple respects,
the combination of which was hiding the bugs most of the time.
- The ibox interrupt mask was open initially even though there
are no waiters on a newly created SPU.
- Acknowledging the mailbox interrupt did not work because
it is level triggered and the mailbox data is never retrieved
from inside the interrupt handler.
- The interrupt handler delivered interrupts with a disabled
mask if another interrupt is triggered for the same class
but a different mask.
- The poll function did not enable the interrupt if it had not
been enabled, so we might run into the poll timeout if none of
the other bugs saved us and no signal was delivered.
We probably still have a similar problem with blocking
read/write on mailbox files, but that will result in extra
wakeup in the worst case, not in incorrect behaviour.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch makes it easier to preempt an SPU context by
having the scheduler hold ctx->state_sema for much shorter
periods of time.
As part of this restructuring, the control logic for the "run"
operation is moved from arch/ppc64/kernel/spu_base.c to
fs/spufs/file.c. Of course the base retains "bottom half"
handlers for class{0,1} irqs. The new run loop will re-acquire
an SPU if preempted.
From: Mark Nutter <mnutter@us.ibm.com>
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>