Commit Graph

1070 Commits

Author SHA1 Message Date
Linus Torvalds 21ba0f88ae Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits)
  PCI: Only build PCI syscalls on architectures that want them
  PCI: limit pci_get_bus_and_slot to domain 0
  PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure
  PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge
  PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3
  PCI: hotplug: pciehp: wait for 1 second after power off slot
  PCI: pci_set_power_state(): check for PM capabilities earlier
  PCI: cpci_hotplug: Convert to use the kthread API
  PCI: add pci_try_set_mwi
  PCI: pcie: remove SPIN_LOCK_UNLOCKED
  PCI: ROUND_UP macro cleanup in drivers/pci
  PCI: remove pci_dac_dma_... APIs
  PCI: pci-x-pci-express-read-control-interfaces cleanups
  PCI: Fix typo in include/linux/pci.h
  PCI: pci_ids, remove double or more empty lines
  PCI: pci_ids, add atheros and 3com_2 vendors
  PCI: pci_ids, reorder some entries
  PCI: i386: traps, change VENDOR to DEVICE
  PCI: ATM: lanai, change VENDOR to DEVICE
  PCI: Change all drivers to use pci_device->revision
  ...
2007-07-12 13:40:57 -07:00
Tejun Heo 7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Auke Kok 44c10138fd PCI: Change all drivers to use pci_device->revision
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.

This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.

In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.

Compile tested with make all{yes,mod}config on x86_64 and i386.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:10 -07:00
Ralph Campbell 841adfca9c IPoIB/cm: Partial error clean up unmaps wrong address
If a page can't be allocated for the frag list of a skb, the code to
unmap the partially allocated list is off by one.  For exaple, if
'frags' equals one, i == 0, and the alloc_page() fails, then the old
loop would have unmapped mapping[1] which is uninitialized.  The same
would happen if the call to ib_dma_map_page() failed.

Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-07-02 20:48:31 -07:00
Jack Morgenstein c8681f1401 IB/mlx4: Correct max_srq_wr returned from mlx4_ib_query_device()
We need to keep a spare entry in the SRQ so that there always is a
next WQE available when posting receives (so that we can tell the
difference between a full queue and an empty queue).  So subtract 1
from the value HW gives us before reporting the limit on SRQ entries
to consumers.

Found by Mellanox QA.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:39:10 -07:00
Roland Dreier 13ef5f44c3 IPoIB/cm: Remove dead definition of struct ipoib_cm_id
It's completely unused.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:39:08 -07:00
Michael S. Tsirkin 82c3aca6ad IPoIB/cm: Fix interoperability when MTU doesn't match
IPoIB connected mode currently rejects a connection request unless the
supported MTU is >= the local netdevice MTU. This breaks
interoperability with implementations that might have tweaked
IPOIB_CM_MTU, and there's real no longer a reason to do so: this test
is just a leftover from when we did not tweak MTU per-connection.  Fix
this by making the test as permissive as possible.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:38:08 -07:00
Michael S. Tsirkin 3ec7393a68 IPoIB/cm: Initialize RX before moving QP to RTR
Fix a crasher bug in IPoIB CM: once a QP is in the RTR state, a
receive completion (or even an asynchronous error) might be observed
on this QP, so we have to initialize all of our receive data
structures before moving to the RTR state.

As an optimization (since modify_qp might take a long time), the
jiffies update done when moving RX to the passive_ids list is also
left in place to reduce the chance of the RX being misdetected as
stale.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=662>.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 13:03:50 -07:00
Roland Dreier 24bce50803 IB/umem: Fix possible hang on process exit
If ib_umem_release() is called after ib_uverbs_close() sets context->closing,
then a process can get stuck in a D state, because the code boils down to

	if (down_write_trylock(&mm->mmap_sem))
		down_write(&mm->mmap_sem);

which is obviously a stupid instant deadlock.  Fix the code so that we
only try to take the lock once.

This bug was introduced in commit f7c6a7b5 ("IB/uverbs: Export
ib_umem_get()/ib_umem_release() to modules") which fortunately never
made it into a release, and was reported by Pete Wyckoff <pw@osc.edu>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-21 11:05:58 -07:00
Roland Dreier e61ef2416b IB/mlx4: Make sure inline data segments don't cross a 64 byte boundary
Inline data segments in send WQEs are not allowed to cross a 64 byte
boundary.  We use inline data segments to hold the UD headers for MLX
QPs (QP0 and QP1).  A send with GRH on QP1 will have a UD header that
is too big to fit in a single inline data segment without crossing a
64 byte boundary, so split the header into two inline data segments.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-18 09:23:47 -07:00
Roland Dreier 5ae2a7a836 IB/mlx4: Handle FW command interface rev 3
Upcoming firmware introduces command interface revision 3, which
changes the way port capabilities are queried and set.  Update the
driver to handle both the new and old command interfaces by adding a
new MLX4_FLAG_OLD_PORT_CMDS that it is set after querying the firmware
interface revision and then using the correct interface based on the
setting of the flag.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-18 08:15:02 -07:00
Jack Morgenstein 082dee3216 IB/mlx4: Handle buffer wraparound in __mlx4_ib_cq_clean()
When compacting CQ entries, we need to set the correct value of the
ownership bit in case the value is different between the index we copy
the CQE from and the index we copy it to.

Found by Ronni Zimmerman of Mellanox.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-18 08:13:59 -07:00
Roland Dreier 54e95f8dcb IB/mlx4: Get rid of max_inline_data calculation
The calculation of max_inline_data in set_kernel_sq_size() is bogus,
since it doesn't take into account the fact that inline segments may
not cross a 64-byte boundary, and hence multiple inline segments will
probably need to be used to post large inline sends.

We don't support inline sends for kernel QPs anyway, so there's no
point in doing this calculation anyway, since the field is just zeroed
out a little later.  So just delete the bogus calculation.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-18 08:13:53 -07:00
Roland Dreier 0e6e741621 IB/mlx4: Handle new FW requirement for send request prefetching
New ConnectX firmware introduces FW command interface revision 2,
which requires that for each QP, a chunk of send queue entries (the
"headroom") is kept marked as invalid, so that the HCA doesn't get
confused if it prefetches entries that haven't been posted yet.  Add
code to the driver to do this, and also update the user ABI so that
userspace can request that the prefetcher be turned off for userspace
QPs (we just leave the prefetcher on for all kernel QPs).

Unfortunately, marking send queue entries this way is confuses older
firmware, so we change the driver to allow only FW command interface
revisions 2.  This means that users will have to update their firmware
to work with the new driver, but the firmware is changing quickly and
the old firmware has lots of other bugs anyway, so this shouldn't be too
big a deal.

Based on a patch from Jack Morgenstein <jackm@dev.mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-18 08:13:48 -07:00
Roland Dreier 42c059ea2b IB/mlx4: Fix warning in rounding up queue sizes
Doing max(1, foo) where foo is u32 generates a warning, because 1 is a
signed constant.  Fix this by using 1U instead.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-12 10:52:02 -07:00
Roland Dreier 614c3c85b5 IB/mlx4: Fix handling of wq->tail for send completions
Cast the increment added to wq->tail when send completions are
processed to u16 to avoid using wrong values caused by standard
integer promotions.

The same bug was fixed in libmlx4 by Eli Cohen <eli@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-12 10:50:42 -07:00
Linus Torvalds 99f9f3d49c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Make sure RQ allocation is always valid
  RDMA/cma: Fix initialization of next_port
  IB/mlx4: Fix zeroing of rnr_retry value in ib_modify_qp()
  mlx4_core: Don't set MTT address in dMPT entries with PA set
  mlx4_core: Check firmware command interface revision
  IB/mthca, mlx4_core: Fix typo in comment
  mlx4_core: Free catastrophic error MSI-X interrupt with correct dev_id
  mlx4_core: Initialize ctx_list and ctx_lock earlier
  mlx4_core: Fix CQ context layout
2007-06-11 15:46:08 -07:00
Roland Dreier a4cd7ed86f IB/mlx4: Make sure RQ allocation is always valid
QPs attached to an SRQ must never have their own RQ, and QPs not
attached to SRQs must have an RQ with at least 1 entry.  Enforce all
of this in set_rq_size().

Based on a patch by Eli Cohen <eli@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-07 23:24:39 -07:00
Sean Hefty bf2944bd56 RDMA/cma: Fix initialization of next_port
next_port should be between sysctl_local_port_range[0] and [1].
However, it is initially set to a random value with get_random_bytes().  
If the value is negative when treated as a signed integer, next_port
can end up outside the expected range because of the result of the % 
operator being negative.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-07 23:24:38 -07:00
Jack Morgenstein 57f01b5339 IB/mlx4: Fix zeroing of rnr_retry value in ib_modify_qp()
The code in __mlx4_ib_modify_qp() overwrites context->params1 after
the RNR retry parameter is ORed in, which results in the RNR retry
parameter always being set to 0.  Fix this by moving where we OR in
the value to later in the function, after the initial assignment of
context->params1.

Found by the Mellanox firmware group.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-07 23:24:38 -07:00
Herbert Xu 42f811b8bc [IPV4]: Convert IPv4 devconf to an array
This patch converts the ipv4_devconf config members (everything except
sysctl) to an array.  This allows easier manipulation which will be
needed later on to provide better management of default config values.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 13:39:13 -07:00
Roland Dreier 3e1db334dc IB/mthca, mlx4_core: Fix typo in comment
s/signifant/significant/

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-06-07 11:51:59 -07:00
Sean Hefty d998ccce02 IB/cm: Fix stale connection detection
The ib_cm can incorrectly detect a stale connection (a new connection
request for a QPN that is already connected) as a duplicate connection
request.  Separate the handling of potential duplicate REQs from stale
connections.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-29 16:07:09 -07:00
Michael S. Tsirkin ec56dc0b7f IPoIB/cm: Fix performance regression on Mellanox
commit 518b1646 ("IPoIB/cm: Fix SRQ WR leak") introduced a severe
performance regression on Mellanox cards, because keeping a QP in the
error state for extended periods of time moves hardware to the slow
path (until the QP is destroyed).  For example, MPI latency goes from
~3 usecs to ~7 usecs.

Fix this by posting a send WR on one of the QPs that are being
flushed, instead of using a separate drain QP that is kept in the
error state.

This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>,
reported and bisected by Scott Weitzenkamp at Cisco and debugged by
Sasha Mikheev at Voltaire.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-29 16:07:09 -07:00
Michael S. Tsirkin 8b7e15772a IB/mthca: Fix handling of send CQE with error for QPs connected to SRQ
mthca_free_err_wqe() currently treats both send and receive CQEs
identically if a QP is using an SRQ.  But for Tavor hardware, send
CQEs with error can be chained together even if the RQ is part of SRQ,
so we may miss some CQEs.

Fix by following the WQE chain for all send CQEs even for non-SRQ QPs.

This fixes crashes in IPoIB CM:
<https://bugs.openfabrics.org//show_bug.cgi?id=604>

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-29 16:07:09 -07:00
Michael S. Tsirkin 2dfbfc3712 IPoIB/cm: Drain cq in ipoib_cm_dev_stop()
Since NAPI polling is disabled while ipoib_cm_dev_stop() is running,
ipoib_cm_dev_stop() must poll the CQ itself in order to see the
packets draining.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:40 -07:00
Michael S. Tsirkin 8fd357a6e3 IPoIB/cm: Fix timeout check in ipoib_cm_dev_stop()
time_after() was used backwards, so the timeout occurred immediately.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:39 -07:00
Stefan Roscher 65a2c841d6 IB/ehca: Fix number of send WRs reported for new QP
Due to a typo, the driver was reporting the wrong number of "actual send
WRs" after ehca_create_qp().

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:39 -07:00
Eli Cohen c0be5fb5f8 IB/mlx4: Initialize send queue entry ownership bits
We need to initialize the owner bit of send queue WQEs to hardware 
ownership whenever the QP is modified from reset to init, not just 
when the QP is first allocated.  This avoids having the hardware 
process stale WQEs when the QP is moved to reset but not destroyed and 
then modified to init again. 

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-24 14:02:38 -07:00
Roland Dreier 02d89b8708 IB/mlx4: Don't allocate RQ doorbell if using SRQ
If a QP is attached to a shared receive queue (SRQ), then it doesn't
have a receive queue (RQ).  So don't allocate an RQ doorbell (or map a
doorbell from userspace for userspace QPs) for that QP.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-23 15:16:08 -07:00
Linus Torvalds 8aee74c8ee Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
  IB/cm: Improve local id allocation
  IPoIB/cm: Fix SRQ WR leak
  IB/ipoib: Fix typos in error messages
  IB/mlx4: Check if SRQ is full when posting receive
  IB/mlx4: Pass send queue sizes from userspace to kernel
  IB/mlx4: Fix check of opcode in mlx4_ib_post_send()
  mlx4_core: Fix array overrun in dump_dev_cap_flags()
  IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions
  IB/mthca: Fix RESET to ERROR transition
  IB/mlx4: Set GRH:HopLimit when sending globally routed MADs
  IB/mthca: Set GRH:HopLimit when building MLX headers
  IB/mlx4: Fix check of max_qp_dest_rdma in modify QP
  IB/mthca: Fix use-after-free on device restart
  IB/ehca: Return proper error code if register_mr fails
  IPoIB: Handle P_Key table reordering
  IB/core: Use start_port() and end_port()
  IB/core: Add helpers for uncached GID and P_Key searches
  IB/ipath: Fix potential deadlock with multicast spinlocks
  IB/core: Free umem when mm is already gone
2007-05-21 16:19:32 -07:00
Michael S. Tsirkin 9f81036c54 IB/cm: Improve local id allocation
The IB CM uses an idr for local id allocations, with a running counter
as start_id.  This fails to generate distinct ids if

1. An id is constantly created and destroyed
2. A chunk of ids just beyond the current next_id value is occupied

This in turn leads to an increased chance of connection request being
mis-detected as a duplicate, sometimes for several retries, until
next_id gets past the block of allocated ids. This has been observed
in practice.

As a fix, remember the last id allocated and start immediately above it.
This also fixes a problem with the old code, where next_id might
overflow and become negative.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21 13:41:29 -07:00
Michael S. Tsirkin 518b1646f8 IPoIB/cm: Fix SRQ WR leak
SRQ WR leakage has been observed with IPoIB/CM: e.g. flipping ports on
and off will, with time, leak out all WRs and then all connections
will start getting RNR NAKs.  Fix this in the way suggested by spec:
move the QP being destroyed to the error state, wait for "Last WQE
Reached" event and then post WR on a "drain QP" connected to the same
CQ.  Once we observe a completion on the drain QP, it's safe to call
ib_destroy_qp.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21 13:35:40 -07:00
Michael S. Tsirkin 24bd1e4e32 IB/ipoib: Fix typos in error messages
Trivial error message fixups.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-21 13:29:15 -07:00
Alexey Dobriyan e8edc6e03a Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.

This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
   getting them indirectly

Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
   they don't need sched.h
b) sched.h stops being dependency for significant number of files:
   on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
   after patch it's only 3744 (-8.3%).

Cross-compile tested on

	all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
	alpha alpha-up
	arm
	i386 i386-up i386-defconfig i386-allnoconfig
	ia64 ia64-up
	m68k
	mips
	parisc parisc-up
	powerpc powerpc-up
	s390 s390-up
	sparc sparc-up
	sparc64 sparc64-up
	um-x86_64
	x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 09:18:19 -07:00
Roland Dreier 56a8c8b6ac IB/mlx4: Check if SRQ is full when posting receive
Make mlx4_post_srq_recv() fail if the SRQ is full (head == tail).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-20 20:19:24 -07:00
Eli Cohen 2446304dd6 IB/mlx4: Pass send queue sizes from userspace to kernel
Pass the number of WQEs for the send queue and their size from userspace
to the kernel to avoid having to keep the QP size calculations in sync 
between the kernel driver and libmlx4.  This fixes a bug seen with the 
current mlx4_ib driver and current libmlx4 caused by a difference in the 
calculated sizes for SQ WQEs.  Also, this gives more flexibility for 
userspace to experiment with using multiple WQE BBs for a single SQ WQE.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-20 10:18:04 -07:00
Roland Dreier 59b0ed1212 IB/mlx4: Fix check of opcode in mlx4_ib_post_send()
wr->opcode is invalid if it's >= ARRAY_SIZE(mlx4_ib_opcode), not just
strictly >.

This was spotted by the Coverity checker (CID 1643).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:58 -07:00
Michael S. Tsirkin 65adfa911a IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions
According to the IB spec, a QP can be moved from RESET back to RESET
or to the ERROR state, but mlx4 firmware does not support this and
returns an error if we try.  Fix the RESET to RESET transition by
just returning 0 without doing anything, and fix RESET to ERROR by
moving the QP from RESET to INIT with dummy parameters and then
transitioning from INIT to ERROR.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:57 -07:00
Michael S. Tsirkin b18aad7150 IB/mthca: Fix RESET to ERROR transition
According to the IB spec, a QP can be moved from RESET to the ERROR 
state, but mthca firmware does not support this and returns an error if 
we try.  Work around this FW limitation by moving the QP from RESET to
INIT with dummy parameters and then transitioning from INIT to ERROR.

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:57 -07:00
Roland Dreier 1526130351 IB/mlx4: Set GRH:HopLimit when sending globally routed MADs
This is the same issue discovered in mthca by Rolf Manderscheid
<rvm@obsidianresearch.com>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:57 -07:00
Rolf Manderscheid 3f37cae694 IB/mthca: Set GRH:HopLimit when building MLX headers
Global CM packets used by rmda_cm were being sent with a GRH:hopLimit
of zero, causing them to be dropped by the router.  The problem is a
missing initialization of the hop_limit field in mthca_read_ah(),
which was called by build_mlx_header() when sending a MAD on QP1.

Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:56 -07:00
Eli Cohen 1f8f7b7a7b IB/mlx4: Fix check of max_qp_dest_rdma in modify QP
max_qp_dest_rdma is already in natural units - no need to shift.  This
was discovered by a test that deliberately requests more outstanding
atomic operation than the device supports.

Found by Sagi Rotem at Mellanox.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:56 -07:00
Ali Ayoub de57c9f102 IB/mthca: Fix use-after-free on device restart
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:56 -07:00
Hoang-Nam Nguyen bd5a6ccc0e IB/ehca: Return proper error code if register_mr fails
Set the return code of ehca_register_mr() to ENOMEM if the corresponding
firmware call fails due to out of resources.  Some other error codes
were explicitly mapped to EINVAL -- just remove those cases so they
get mapped to the default case, which already returns EINVAL anyway.

Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:54 -07:00
Yosef Etigin 26bbf13ce1 IPoIB: Handle P_Key table reordering
SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes.  Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.

Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:54 -07:00
Roland Dreier 1af4c435f3 IB/core: Use start_port() and end_port()
Clean up ib_query_port() and ib_modify_port() slightly by using the 
just-added start_port() and end_port() helpers.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:54 -07:00
Yosef Etigin 5eb620c81c IB/core: Add helpers for uncached GID and P_Key searches
Add ib_find_gid() and ib_find_pkey() functions that use uncached device
queries.  The calls might block but the returns are always up-to-date.
Cache P_Key and GID table lengths in core to avoid extra port info queries.

Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:53 -07:00
Roland Dreier 8b8c8bca3a IB/ipath: Fix potential deadlock with multicast spinlocks
Lockdep found the following potential deadlock between mcast_lock and
n_mcast_grps_lock: mcast_lock is taken from both interrupt context and
process context, so spin_lock_irqsave() must be used to take it.
n_mcast_grps_lock is only taken from process context, so at first it
seems safe to take it with plain spin_lock(); however, it also nests
inside mcast_lock, and hence we could deadlock:

  cpu A                                   cpu B
    ipath_mcast_add():
      spin_lock_irq(&mcast_lock);

                                            ipath_mcast_detach():
                                              spin_lock(&n_mcast_grps_lock);

                                            <enter interrupt>

                                            ipath_mcast_find():
                                              spin_lock_irqsave(&mcast_lock);

      spin_lock(&n_mcast_grps_lock);

Fix this by using spin_lock_irq() to take n_mcast_grps_lock.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:53 -07:00
Eli Cohen 7b82cd8ee7 IB/core: Free umem when mm is already gone
Free umem when task's mm is already destroyed by the time
ib_umem_release gets called.

Found by Dotan Barak at Mellanox.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-05-19 08:51:53 -07:00