Create an SRP FMR pool on HCAs that support FMRs, and use FMRs to map
gather/scatter lists that have more than one entry into a single
memory region that appears virtually contiguous to the SRP target
(which is the RDMA initiator).
This patch bails out on FMR mapping for SCSI commands where the
gather/scatter list cannot be mapped into a single FMR because there
are sub-page-sized entries in middle of the list. An unaligned
start or end of the list is OK.
Based on a patch by Vu Pham <vuhuong@mellanox.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Kernel connection management agent over InfiniBand that connects based
on IP addresses. The agent defines a generic RDMA connection
abstraction to support clients wanting to connect over different RDMA
devices.
The agent also handles RDMA device hotplug events on behalf of clients.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an address translation service that maps IP addresses to
InfiniBand GID addresses using IPoIB.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Export ip_dev_find() to allow locating a net_device given an IP address.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Extend matching connection requests to listens in the InfiniBand CM to
include private data checks.
This allows applications to listen on the same service identifier,
with private data directing the request to the appropriate application.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Provide common handling for marshalling data between userspace clients
and kernel InfiniBand drivers.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Memfree firmware is in rare cases reporting WQE index == base - 1 in
receive completion with error, instead of (rq size - 1); base is 0 in
mthca. Here is a patch to avoid kernel crash and report a correct WR
id in this case.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mthca does not restore the following PCI-X/PCI Express registers after reset:
PCI-X device: PCI-X command register
PCI-X bridge: upstream and downstream split transaction registers
PCI Express : PCI Express device control and link control registers
This causes instability and/or bad performance on systems where one of
these registers is set to a non-default value by BIOS.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Especially when summary code is used, we can have in-memory data
structures referencing certain nodes without them actually being readable
on the flash. Discard the nodes gracefully in that case, rather than
triggering a BUG().
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Reflect the fact that the Cell Broadband Engine supports 64k
pages by adding the bit to the CPU features.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The page size encoding passed to tlbie is incorrect for new-style
large pages. This fixes it. This doesn't affect anything on older
machines because mmu_psize_defs[psize].penc (the page size encoding)
is 0 for 4k and 16M pages (the two are distinguished by a separate "is
a large page" bit).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state)
in run_posix_cpu_timers().
However, for some reason it does so only for CPUCLOCK_PERTHREAD
case (which is imho wrong).
Also, this check is not reliable, PF_EXITING could be set on
another cpu without any locks/barriers just after the check,
so it can't prevent from attaching the timer to the exiting
task.
The previous patch makes this check unneeded.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu to attach the timer to exiting process after that.
arm_timer() tries to protect against this race, but the check
is racy.
After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule() local timer interrupt can find
tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4) interrupted task has ->signal == NULL.
At this moment exiting task has no pending cpu timers, they were
cleanuped in __exit_signal()->posix_cpu_timers_exit{,_group}(),
so we can just return from irq.
John Stultz recently confirmed this bug, see
http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the local timer interrupt happens just after do_exit() sets PF_EXITING
(and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call
check_process_timers() with tasklist_lock + ->siglock held and
check_process_timers:
t = tsk;
do {
....
do {
t = next_thread(t);
} while (unlikely(t->flags & PF_EXITING));
} while (t != tsk);
the outer loop will never stop.
Actually, the window is bigger. Another process can attach the timer
after ->it_xxx_expires was cleared (see the next commit) and the 'if
(PF_EXITING)' check in arm_timer() is racy (see the one after that).
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of fixes that should prevent crashes when using netconsole and
suspend/resume. First, netconsole poll routine shouldn't run unless the
device is up; second, the NAPI poll should be disabled during suspend.
This is only an issue on sky2, because it has to have one NAPI poll
routine for both ports on dual port boards. Normal drivers use
netif_rx_schedule_prep and that checks for netif_running.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If get_user_pages() returns less pages than what we asked for, we jump
to out_unmap which will return ERR_PTR(ret). But ret can contain a
positive number just smaller than local_nr_pages, so be sure to set it
to -EFAULT always.
Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some time ago the cdrom open routine was changed so that we call the
driver's open routine before checking to see if it is read only. However,
if we discovered that a read write open was not possible and the open
flags required a writable open, we just returned -EROFS without calling
the driver's release routine. This seems to work for most cdrom drivers,
but breaks the Powerpc iSeries virtual cdrom rather badly.
This just inserts the release call in the error path to balance the call
to "->open()" done by "open_for_data()".
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Juha Yrjola suggests that adding the string "MMC" to the
maintainers file entry will make it easier to find. Add
it to the file.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Let's not attempt the abolition of mtd->type until/unless it's properly
thought through. And certainly, let's not do it by halves.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().
Just memset the entire cic instead of filling invididual values
independently.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If flock_lock_file() failed to allocate flock with locks_alloc_lock()
then "error = 0" is returned. Need to return some non-zero.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the definitions of __FD_SET, __FD_CLR and __FD_ISSET independent
from asm/bitops.h and remove the macro magic that tests for __GLIBC__.
Use simple C inline functions instead of set_bit, clear_bit and test_bit.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
The resume bug was caused not by an early interrupt but because the idle
timeout was not being stopped on suspend. Also disable hardware IRQ's
on suspend. Will need to revisit this with hotplug?
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hardware should be fully shut off during suspend, and the base
irq mask restored during resume.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the poll routine detects no hardware available, it needs to dequeue
it self from the network poll list. Linus didn't understand NAPI.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is cleaner, to not loop over both ports if only one exists.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The set power state function is cleaner if it doesn't return anything.
The only caller that could fail is in suspend() and it can check the argument
there.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Randy Dunlap <rdunlap@xenotime.net>
According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware
hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ernst Herzberg <earny@net4u.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
...and __constant_ntohs, __constant_ntohl, __constant_cpu_to_be32 too
where possible. Htons and friends are resolved to constants in these
places anyway. Also fix an endianess glitch in a log message, spotted
by Alexey Dobriyan.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Replace spaces by tabulators, wrap lines at 80 columns, delete some
blank lines and superfluous braces. Collapse some if()-within-if()
constructs. Replace a literal CSR address by its preprocessor constant.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
ohci1394 and pcilynx call highlevel_host_reset from their hardware
interrupt handler (via hpsb_selfid_complete). Therefore all readers and
writers of hl_irqs_lock have to disable interrupts. Reported by Jiri
Slaby and J. A. Magallon.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
This gets also rid of the MODPOST warning "drivers/ieee1394/ieee1394.o -
Section mismatch: reference to .exit.text: from .smp_locks after '' (at
offset 0x18)".
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
We only support x86 and ppc, due to the use of bus_to_virt() and friends.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Replace occurrences of the magic value ~(u64)0 for invalid
CSR address spaces by a named constant for better readability.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
The proper designator of an invalid CSR address is ~(u64)0, not (u64)0.
Use the correct value in initialization and deregistration.
Also, scsi_id->sbp2_lun does not need to be initialized twice.
(scsi_id was kzalloc'd.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
I've been experimenting to track down the cause of suspend/resume problems
on my Compaq Presario X1050 laptop:
http://bugzilla.kernel.org/show_bug.cgi?id=6075
Essentially the ACPI Embedded Controller and keyboard controller would
get into a bizarre, confused state after resume.
I found that unloading the ohci1394 module before suspend and reloading it
after resume made the problem go away. Diffing the dmesg output from
resume, with and without the module loaded, I found that with the module
loaded I was missing these:
PM: Writing back config space on device 0000:02:00.0 at offset 1. (Was 2100080, writing 2100007)
PM: Writing back config space on device 0000:02:00.0 at offset 3. (Was 0, writing 8008)
PM: Writing back config space on device 0000:02:00.0 at offset 4. (Was 0, writing 90200000)
PM: Writing back config space on device 0000:02:00.0 at offset 5. (Was 1, writing 2401)
PM: Writing back config space on device 0000:02:00.0 at offset f. (Was 20000100, writing 2000010a)
The default PCI driver performs the pci_restore_state when no driver is
loaded for the device. When the ohci1394 driver is loaded, it is supposed
to do this, however it appears not to do so.
I created the patch below and tested it, and it appears to resolve the
suspend problems I was having with the module loaded. I only added in the
pci_save_state and pci_restore_state - however, though I know little of
this hardware, surely the driver should really be doing more than this when
suspending and resuming? Currently it does almost nothing, what if there
are commands in progress, etc?
Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Cc: Jody McIntyre <scjody@modernduck.com>
Cc: Ben Collins <bcollins@debian.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
It seems to have worked without the attribute during all the years
just because sizes of all struct members are multiples of 32 bits.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
It appears I will not get it fixed overnight.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
If sbp2 is forced to move data via ARM handler, the maximum packet size
allowed for S800 transfers exceeds ohci1394's buffer size on platforms
where PAGE_SIZE is 4096.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Being able to switch physical DMA on and off at run time would be a nice
feature but a PITA to support by highlevel drivers and userspace apps.
Therefore allow it only to be set when the driver is being loaded.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
This patch supplies the API extension introduced by patch
"ieee1394: extend lowlevel API for address range properties"
with proper addresses.
Like in patch ''ohci1394, sbp2: fix "scsi_add_device failed"
with PL-3507 based devices'', 1 TeraByte is chosen as physical
upper bound. This leaves a window for the middle address range.
This choice is only relevant for adapters which actually have a
programmable pysical upper bound register. (Only ALi and
Fujitsu adapters are known for this. Most adapters have a fixed
bound at 4 GB.) The middle address range is suitable for posted
writes.
AFAIK, PCILynx does not support physical DMA nor posted writes,
therefore no equivalent change in the pcilynx driver is necessary.
There is also a driver for GP2Lynx, although not in mainline Linux.
I assume this hardware does not support these OHCI features either.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Host adapter hardware imposes certain restrictions and features on
address ranges. Instead of hard-wire such ranges into the ieee1394
core or even into protocol drivers, let lowlevel drivers specify
these ranges via struct hpsb_host.
Patch "ohci1394: set address range properties" must be applied too,
else hpsb_allocate_and_register_addrspace() won't work properly.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Since this is useful information, promote it from a debug macro to
a regular log message. The message appears only if the user set
exclusive_login=0, therefore won't clutter the logs in normal use.
Also update the comment on exclusive_login.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jody McIntyre <scjody@modernduck.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>