Rmmod myri10ge crash at free_netdev() -> netif_napi_del(), because napi
structures are already deallocated. To fix call netif_napi_del() before
kfree() at myri10ge_free_slices().
Cc: stable@kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all the infrastructure is in place, we will do the
right thing if we remove this special casing.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.
Only whole file layout support means that there is only one IOMODE_RW layout
segment.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement all the hooks created in the previous patches.
This requires exporting quite a few functions and adding a few
structure fields.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Any COMMIT compound directed to a data server needs to have the
GETATTR calls suppressed. We here, make sure the field we are testing
(data->lseg) is set and refcounted correctly.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We create three major hooks for the pnfs code.
pnfs_mark_request_commit() is called during writeback_done from
nfs_mark_request_commit, which gives the driver an opportunity to
claim it wants control over commiting a particular req.
pnfs_choose_commit_list() is called from nfs_scan_list
to choose which list a given req should be added to, based on
where we intend to send it for COMMIT. It is up to the driver
to have preallocated list headers for each destination it may need.
pnfs_commit_list() is how the driver actually takes control, it is
used instead of nfs_commit_list().
In order to pass information between the above functions, we create
a union in nfs_page to hold a lseg (which is possible because the req is
not on any list while in transition), and add some flags to indicate
if we need to use the pnfs code.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Create a preallocated list header to hold nfs_pages for each
non-MDS COMMIT destination. Note this is not necessarily each DS,
but is basically each <DS, fh> pair.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Move it up to avoid forward declaration in later patch.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Create a separate support function for later use by data server
commit code.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Create a separate support function for later use by data server
commit code.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Add a callback that the pnfs layout driver can use to do its own
error handling of the data server's COMMIT response.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Based on consensus reached in Feb 2011 interim IETF meeting regarding
use of LAYOUTCOMMIT, it has been decided that a NFS_DATA_SYNC return
from a WRITE to data server should not initiate a COMMIT.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Doorbell is used according to usage of BlueFlame.
For Blue Flame to work in Ethernet mode QP number should have 0
at bits 6,7.
Allocating range of QPs accordingly.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not allow a kernel consumer to allocate a UAR to serve for blue flame if the
number of available UARs gets below MLX4_NUM_RESERVED_UARS (currently 8). This
will allow userspace apps to open a device file and run things like
ibv_devinfo.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add mlx4_bitmap_avail() to give the number of available resources. We want to
use this as a hint to whether to allocate a resources or not. This patch is
introduced to be used with allocation blue flame registers.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using blue flame can improve latency by allowing the HW to more efficiently
access the WQE. This patch presents two functions that are used to allocate or
release HW resources for using blue flame; the caller need to supply a struct
mlx4_bf object when allocating resources. Consumers that make use of this API
should post doorbells to the UAR object pointed by the initialized struct
mlx4_bf;
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mlx4_en module now uses the new steering mechanism.
The RX packets are now steered through the MCG table instead
of Mac table for unicast, and default entry for multicast.
The feature is enabled through INIT_HCA
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
For Ethernet mode only,
When we want to register QP as promiscuous, it must be added to all the
existing steering entries and also to the default one.
The promiscuous QP might also be on of "real" QPs,
which means we need to monitor every entry to avoid duplicates and ensure
we close an entry when all it has is promiscuous QPs.
Same mechanism both for unicast and multicast.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same packet steering mechanism would be used both for IB and Ethernet,
Both multicasts and unicasts.
This commit prepares the general infrastructure for this.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW revision is derived from device ID and rev id.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver queries the FW for WOL support.
Ethtool get/set_wol is implemented accordingly.
Only magic packets are supported at the time.
Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each RX ring will have its own interrupt vector, and TX rings will share one
(we mostly use polling for TX completions).
The vectors are assigned first time device is opened, and its name includes
the interface name and ring number.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core
customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the
interrupt vectors. Those vectors are not opened at mlx4 device initialization,
opened by demand.
Changed the max number of possible EQs according to the new scheme, no longer relies on
on number of cores.
The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq().
Customers that do not use the new API will get completion vectors as before.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of reseting the module parameters each ifup or mtu change,
they are being set once at device initialization
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some irq_set_type() callbacks need to change the chip and the handler
when the trigger mode changes. We have already a (misnomed) setter
function for the handler which can be called from irq_set_type().
Provide one which allows to set chip and name as well. Put the
misnomed function under the COMPAT switch and provide a replacement.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some irq chips need to call genirq functions for nested chips from
their callbacks. That upsets lockdep. So they need to set a different
lock class for those nested chips. Provide a helper function to avoid
open access to irq_desc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
commit 86271e460a introduced a
regression that caused mac80211 queues in stopped state.
ath_drain_all_txq is called in driver flush which would reset
the stopped flag and the mac80211 queues were never started
after that. iperf traffic is completely stalled due to this issue.
Restart the mac80211 queues in driver flush only if the txqs were
drained.
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
With the recent tx status optimization in mac80211, we bail out as
and and when invalid rate index is found. So the behavior of resetting
rate idx to -1 and count to 0 has changed for the rate indexes that
were not part of the driver's retry series.
This has resulted in ath9k using incorrect rate table index which
caused the system to panic. Ideally ath9k need to loop only for the
indexes that were part of the retry series and so simply use hw->max_rates
as the loop counter.
Pasted the stack trace of the panic issue for reference.
[ 754.093192] BUG: unable to handle kernel paging request at ffff88046a9025b0
[ 754.093256] IP: [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
[ 754.094888] Call Trace:
[ 754.094903] <IRQ>
[ 754.094928] [<ffffffffa051f883>] ieee80211_tx_status+0x203/0x9e0 [mac80211]
[ 754.094975] [<ffffffffa053e305>] ? __ieee80211_wake_queue+0x125/0x140 [mac80211]
[ 754.095017] [<ffffffffa02e66c9>] ath_tx_complete_buf+0x1b9/0x370 [ath9k]
[ 754.095054] [<ffffffffa02e6fcf>] ath_tx_complete_aggr+0x51f/0xb50 [ath9k]
[ 754.095098] [<ffffffffa05382a3>] ? ieee80211_prepare_and_rx_handle+0x173/0xab0 [mac80211]
[ 754.095148] [<ffffffff81350e62>] ? _raw_spin_unlock_irqrestore+0x32/0x40
[ 754.095186] [<ffffffffa02e9735>] ath_tx_tasklet+0x365/0x4b0 [ath9k]
[ 754.095224] [<ffffffff8107a2a2>] ? clockevents_program_event+0x62/0xa0
[ 754.095261] [<ffffffffa02e2628>] ath9k_tasklet+0x168/0x1c0 [ath9k]
[ 754.095298] [<ffffffff8105599b>] tasklet_action+0x6b/0xe0
[ 754.095331] [<ffffffff81056278>] __do_softirq+0x98/0x120
[ 754.095361] [<ffffffff8100cd5c>] call_softirq+0x1c/0x30
[ 754.095393] [<ffffffff8100efb5>] do_softirq+0x65/0xa0
[ 754.095423] [<ffffffff810563fd>] irq_exit+0x8d/0x90
[ 754.095453] [<ffffffff8100ebc1>] do_IRQ+0x61/0xe0
[ 754.095482] [<ffffffff81351413>] ret_from_intr+0x0/0x15
[ 754.095513] <EOI>
[ 754.095531] [<ffffffff81014375>] ? native_sched_clock+0x15/0x70
[ 754.096475] [<ffffffffa02bcfa6>] ? acpi_idle_enter_bm+0x24d/0x285 [processor]
[ 754.096475] [<ffffffffa02bcf9f>] ? acpi_idle_enter_bm+0x246/0x285 [processor]
[ 754.096475] [<ffffffff8127fab2>] cpuidle_idle_call+0x82/0x100
[ 754.096475] [<ffffffff8100a236>] cpu_idle+0xa6/0xf0
[ 754.096475] [<ffffffff81339bc1>] rest_init+0x91/0xa0
[ 754.096475] [<ffffffff814efccd>] start_kernel+0x3fd/0x408
[ 754.096475] [<ffffffff814ef347>] x86_64_start_reservations+0x132/0x136
[ 754.096475] [<ffffffff814ef451>] x86_64_start_kernel+0x106/0x115
[ 754.096475] RIP [<ffffffffa02eac49>] ath_tx_status+0x209/0x2f0 [ath9k]
Signed-off-by: Senthil Balasubramanian <senthilkumar@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
On hardware busy the scan request pointer should be cleared, as higher
levels will release. This avoids a crash when that pointer is
erroneously used later.
Signed-off-by: Joseph J. Gunn <armadefuego@yahoo.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Clearly a mistake, since pointers won't suddenly
change their value...
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 2c8cec5c10 (Cache learned PMTU information in inetpeer) added
an extra inet_putpeer() call in ip_rt_update_pmtu().
This results in various problems, since we can free one inetpeer, while
it is still in use.
Ref: http://www.spinics.net/lists/netdev/msg159121.html
Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 9435eb1cf0
("ipv4: Implement __ip_dev_find using new interface address hash.")
we reimplemented __ip_dev_find() so that it doesn't have to
do a full FIB table lookup.
Instead, it consults a hash table of addresses configured to
interfaces.
This works identically to the old code in all except one case,
and that is for loopback subnets.
The old code would match the loopback device for any IP address
that falls within a subnet configured to the loopback device.
Handle this corner case by doing the FIB lookup.
We could implement this via inet_addr_onlink() but:
1) Someone could configure many addresses to loopback and
inet_addr_onlink() is a simple list traversal.
2) We know the old code works.
Reported-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some filesystems (such as ext4) can return the same cookie value for
multiple files. If we try to start a readdir with one of these cookies,
the server will return the first file found with a cookie of the same
value. This can cause the client to enter an infinite loop.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
nfs_opendir() created a context that held much more information than we
need for a readdir. This patch introduces a slimmed-down
nfs_open_dir_context that contains only the cookie and the cred used for
RPC operations. The new context will eventually be used to help detect
readdir loops.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If we're doing a search by readdir cookie, we need to ensure that the
resulting f_pos is updated. To do so, we need to update the
desc->current_index, in the same way that we do in the search by
file offset case.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds the TCM_Loop Linux/SCSI LLD fabric module for
accessing TCM device backstores as locally accessable SCSI LUNs in
virtual SAS, FC, and iSCSI Target ports using the generic fabric
TransportID and Target Port WWN naming handlers from TCM's
target_core_fabric_lib.c The TCM_Loop module uses the generic fabric
configfs infratructure provided by target_core_fabric_configfs.c and
adds a module dependent attribute for the creation/release of the
virtual I_T Nexus connected the TCM_Loop Target and Initiator Ports.
TCM_Loop can also be used with scsi-generic and BSG drivers so that
STGT userspace fabric modules, QEMU-KVM and other hypervisor SCSI
passthrough support can access TCM device backstore and control CDB
emulation.
For more information please see:
http://linux-iscsi.org/wiki/Tcm_loop
[jejb: fixed up checkpatch stuff]
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The map_bh() call will have already set the buffer_head to mapped.
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Disable Interrupt MBX completion will disable the interrupt on
successful completion. Fixed the bug where driver was waiting for
Interrupt to come in for its completion. Now driver will poll for
disable interrupt MBX completion.
Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This reverts commit 24d720b726.
Previously we thought there was little possibility that devices would
crash with this, but some have been found.
Reported-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Driver does not detect a new CQE (completion queue entry) if a thread receives
the wakup when it is in TASK_RUNNING state. Fix is to set the state to
TASK_INTERRUPTIBLE while holding the fp_work_lock.
Also, Use __set_current_task() since it is now set inside a spinlock with
synchronization.
Two other related optimizations:
1. After we exit the while (!kthread_should_stop()) loop, use
__set_current_state() since synchronization is no longer needed.
2. Remove set_current_state(TASK_RUNNING) after schedule() since it
should always be TASK_RUNNING after schedule().
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>