mmiowb() is now implied by spin_unlock() on architectures that require
it, so there is no reason to call it from driver code. This patch was
generated using coccinelle:
@mmiowb@
@@
- mmiowb();
and invoked as:
$ for d in drivers include/linux/qed sound; do \
spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done
NOTE: mmiowb() has only ever guaranteed ordering in conjunction with
spin_unlock(). However, pairing each mmiowb() removal in this patch with
the corresponding call to spin_unlock() is not at all trivial, so there
is a small chance that this change may regress any drivers incorrectly
relying on mmiowb() to order MMIO writes between CPUs using lock-free
synchronisation. If you've ended up bisecting to this commit, you can
reintroduce the mmiowb() calls using wmb() instead, which should restore
the old behaviour on all architectures other than some esoteric ia64
systems.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
transport improvements, and a reworking of the peer_db_addr to all for
better abstraction
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJci/y4AAoJEG5mS6x6i9IjyPkQAJvUPQwQ8r4MD/+rD+mNDe/E
uZDh10x8upLnO0OvD0hPy6mMm+lcwYnCNpTyBhuQr1Lnp/y3pucSa3tRAzCeuYfx
cEvftE9AJ1CfA54YR5VCJTR90EhKQ4oJziD3zvuZpQvnm+0JW7C6lWBIGoBx5gtk
uhdIeN5QyfGPfZcmXjGoSNaYNpqq+6maJkcgV3eATWZdtIZX1Ts6qrJP/RhdH/qy
LrF+rwFU5Diz206jkw6p3AHqU4jT2uzvEAxNV4WIOQ6iuZMKqEHcDBmBtYNoouyw
4+1G4e6mgrl5xd83lRNwtiUTPD6cN+RrGfGPFOyvM8luJheL6Zq/tDdRbDPFXO+M
rlpcRhUtih7x8ev3V/3buRsf/gSmDY+TlYwQ/OPx0U9zjfYAzc+SIEmTjRX+axvZ
/LEi9z+PXLJP6BMS7CjmEfuwlZtx1tV93gtlaDLGhdtbw0dK6qJScV7Fudl7CG4Q
HOl9qDQOoK2oDd8fSb4vm8N0augLNm42ynRwfLKceJpL9Uv2FSlkET9Gt45zmtqz
LLW9N4gvMIW148/i1S2+66rJKdJdvW3v3yqpBab4TigMLEBoIr9QMua5Dd84qGYx
YqPKBKqVAVtSIslQGPrKEZJEg6d8DJcj1XDbdaLCxtEiJw2rvn3TURg7FTeo47FE
FnIuFsui+wHrLHC97XSz
=7EC8
-----END PGP SIGNATURE-----
Merge tag 'ntb-5.1' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
- fixes for switchtec debugability and mapping table entries
- NTB transport improvements
- a reworking of the peer_db_addr for better abstraction
* tag 'ntb-5.1' of git://github.com/jonmason/ntb:
NTB: add new parameter to peer_db_addr() db_bit and db_data
NTB: ntb_transport: Ensure the destination buffer is mapped for TX DMA
NTB: ntb_transport: Free MWs in ntb_transport_link_cleanup()
ntb_hw_switchtec: Added support of >=4G memory windows
ntb_hw_switchtec: NT req id mapping table register entry number should be 512
ntb_hw_switchtec: debug print 64bit aligned crosslink BAR Numbers
NTB door bell usage depends on NTB hardware.
ex: intel NTB gen1 has one peer door bell register which can be controlled
by the bitmap writen to it, while Intel NTB gen3 has a registers
per door bell and the data trigering the each door bell is always 1.
therefore exposing only peer door bell address forcing the user
to be aware of such low level details
Signed-off-by: Leonid Ravich <Leonid.Ravich@emc.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Presently, when ntb_transport is used with DMA and the IOMMU turned on,
it fails with errors from the IOMMU such as:
DMAR: DRHD: handling fault status reg 202
DMAR: [DMA Write] Request device [00:04.0] fault addr
381fc0340000 [fault reason 05] PTE Write access is not set
This is because ntb_transport does not map the BAR space with the IOMMU.
To fix this, we map the entire MW region for each QP after we assign
the DMA channel. This prevents needing an extra DMA map in the fast
path.
Link: https://lore.kernel.org/linux-pci/499934e7-3734-1aee-37dd-b42a5d2a2608@intel.com/
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
If NTB peer host crashes or reboots, the NTB transport link will be
down and the MWs of NTB transport will be invalid. But the
ntb_transport_link_cleanup() does not free these invalid MWs. When
the NTB peer host is recovered later, NTB transport link will be
up and the ntb_set_mw() will not reset up MWs. Because the MWs of
NTB transport are invalid, the NTB transport will not work.
We can fix it by freeing MWs when NTB transport link is down, then
the ntb_set_mw() will reset up MWs when NTB transport link is up.
Signed-off-by: Joey Zhang <joey.zhang@microchip.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Current Switchtec's BAR setup registers are limited to 32bits,
corresponding to the maximum MW (memory window) size is <4G.
Increase the MW sizes with the addition of the BAR Setup Extension
Register for the upper 32bits of a 64bits MW size. This increases the MW
range to between 4K and 2^63.
Reported-by: Boris Glimcher <boris.glimcher@emc.com>
Signed-off-by: Paul Selles <paul.selles@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Switchtec NTB crosslink BARs are 64bit addressed but they are printed as
32bit addressed BARs. Fix debug log to increment the BAR numbers by 2 to
reflect the 64bit address alignment.
Fixes: 0175250182 ("ntb_hw_switchtec: Add initialization code for crosslink")
Signed-off-by: Paul Selles <paul.selles@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Clean up the ifdefs which conditionally defined the io{read|write}64
functions in favour of the new common io-64-nonatomic-lo-hi header.
Per a nit from Andy Shevchenko, the include list is also made
alphabetical.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that ioread64 and iowrite64 are available in io-64-nonatomic,
we can remove the hack at the top of ntb_hw_intel.c and replace it
with an include.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.
This change was generated with the following Coccinelle SmPL patch:
@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@
-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Since IDT PCIe-switch temperature sensor is now always available
irregardless of the EEPROM/BIOS settings, Kconfig and in-code
description should be properly altered. In addition lets update
the driver copyright lines.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
IDT PCIe-switch temperature sensor interface is very broken. First
of all only a few combinations of TMPCTL threshold enable bits
really cause the interrupts unmasked. Even if an individual bit
indicates the event unmasked, corresponding IRQ just isn't generated.
Most of the threshold enable bits combinations are in fact useless and
non of them can help to create a fully functional alarm interface.
So to speak, we can't create a well defined hwmon alarms based on
the IDT PCI-switch threshold IRQs.
Secondly a single threshold IRQ (not a combination of thresholds) can
be successfully enabled without the issue described above. But in this
case we experienced an enormous number of interrupts generated by
the chip if the temperature got near the enabled threshold value. Filter
adjustment didn't help much. It also doesn't provide a hysteresis settings.
Due to the temperature sample fluctuations near the threshold the
interrupts spate makes the system nearly unusable until the temperature
value finally settled so being pushed either to be fully higher or lower
the threshold.
All of these issues makes the temperature sensor alarm interface useless
and even at some point dangerous to be used in the driver. In this case
it is safer to completely discard it and disable the temperature alarm
interrupts.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
IDT PCIe switches provide an embedded temperature sensor working
within [0; 127.5]C with resolution of 0.5C. They also can generate
a PCIe upstream interrupt in case if the temperature passes through
specified thresholds. Since this thresholds interface is very broken
the created hwmon-sysfs interface exposes only the next set of hwmon
nodes: current input temperature, lowest and highest values measured,
history resetting, value offset. HWmon alarm interface isn't provided.
IDT PCIe switch also've got an ADC/filter settings of the sensor.
This driver doesn't expose them to the hwmon-sysfs interface at the
moment, except the offset node.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In order to create a hwmon interface for the IDT PCIe-switch temperature
sensor the already available reader method should be improved. Particularly
we need to redesign it so one would be able to read temperature/offset
values from registers of the passed types. Since IDT sensor interface
provides temperature in unsigned format 0:7:1 (7 bits for real value
and one for fraction) we also need to have helpers for the typical sysfs
temperature data type conversion to and from this format. Even though
the IDT PCIe-switch provided temperature offset got the same but signed
type it can be translated by these methods too.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Be a little wasteful if the (likely CMA) message window buffer is not
suitably aligned after our first attempt; allocate a buffer twice as big
as we need and manually align our MW buffer within it.
This was needed on Intel Broadwell DE platforms with intel_iommu=off
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 1373888 ("Missing break in switch")
Addresses-Coverity-ID: 1373889 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
IDT NTB driver sets the upper limit of actual translation address
being written to the corresponding memory window setup. It is achieved
by BARLIMITx register initialization. Needless to say, that the register
works within PCIe bus address space.
In general CPU and PCIe address spaces are different. It means,
that addresses used for Memory TLPs routine can be different from
CPU addresses. While in most of cases they are the same, there are
exceptions when the proper mapping must be performed to have the
portable driver code. There used to be a virt_to_bus()/bus_to_virt()
interface for this purpose. But it's deprecated now. It was also a
mistake to use pci_resource_start() since the return address of the
method is at the CPU address space. In order to achieve the desired
purpose we need to use pci_bus_address() helper. This method shall
return a PCIe bus base address of the corresponding BAR resource.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <allenbh@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Both devm_kcalloc() and devm_kzalloc() return NULL on error. They
never return error pointers.
The use of IS_ERR_OR_NULL is currently applied to the wrong
context.
Fix this by replacing IS_ERR_OR_NULL with regular NULL checks.
Fixes: bf2a952d31 ("NTB: Add IDT 89HPESxNTx PCIe-switches support")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
ndev_vec_mask() should be returning u64 mask value instead of int.
Otherwise the mask value returned can be incorrect for larger
vectors.
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Lucas Van <lucas.van@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Move the Microsemi Switchtec PCI Vendor ID (same as
PCI_VENDOR_ID_PMC_Sierra) to pci_ids.h. Also, replace Microsemi class
constants with the standard PCI definitions.
Signed-off-by: Doug Meyer <dmeyer@gigaio.com>
[bhelgaas: restore SPDX (I assume it was removed by mistake), remove
device ID definitions]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed (Kees)
-----BEGIN PGP SIGNATURE-----
Comment: Kees Cook <kees@outflux.net>
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
AtfvEeaPHaOyD8/h2Q==
=zUUp
-----END PGP SIGNATURE-----
Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more overflow updates from Kees Cook:
"The rest of the overflow changes for v4.18-rc1.
This includes the explicit overflow fixes from Silvio, further
struct_size() conversions from Matthew, and a bug fix from Dan.
But the bulk of it is the treewide conversions to use either the
2-factor argument allocators (e.g. kmalloc(a * b, ...) into
kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
b) into vmalloc(array_size(a, b)).
Coccinelle was fighting me on several fronts, so I've done a bunch of
manual whitespace updates in the patches as well.
Summary:
- Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
- Explicitly reported overflow fixes (Silvio, Kees)
- Add missing kvcalloc() function (Kees)
- Treewide conversions of allocators to use either 2-factor argument
variant when available, or array_size() and array3_size() as needed
(Kees)"
* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
treewide: Use array_size in f2fs_kvzalloc()
treewide: Use array_size() in f2fs_kzalloc()
treewide: Use array_size() in f2fs_kmalloc()
treewide: Use array_size() in sock_kmalloc()
treewide: Use array_size() in kvzalloc_node()
treewide: Use array_size() in vzalloc_node()
treewide: Use array_size() in vzalloc()
treewide: Use array_size() in vmalloc()
treewide: devm_kzalloc() -> devm_kcalloc()
treewide: devm_kmalloc() -> devm_kmalloc_array()
treewide: kvzalloc() -> kvcalloc()
treewide: kvmalloc() -> kvmalloc_array()
treewide: kzalloc_node() -> kcalloc_node()
treewide: kzalloc() -> kcalloc()
treewide: kmalloc() -> kmalloc_array()
mm: Introduce kvcalloc()
video: uvesafb: Fix integer overflow in allocation
UBIFS: Fix potential integer overflow in allocation
leds: Use struct_size() in allocation
Convert intel uncore to struct_size
...
ntb_transport_create_queue() is never called in atomic context.
ntb_transport_create_queue() is only called by ntb_netdev_probe(),
which is set as ".probe" in struct ntb_transport_client.
Despite never getting called from atomic context,
ntb_transport_create_queue() calls kzalloc_node() with GFP_ATOMIC,
which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
which can sleep and improve the possibility of sucessful allocation.
This is found by a static analysis tool named DCNS written by myself.
And I also manually check it
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
ntb_transport_setup_qp_mw() is never called in atomic context.
ntb_transport_setup_qp_mw() is only called by ntb_transport_link_work(),
which is set as a parameter of INIT_DELAYED_WORK()
in ntb_transport_probe().
Despite never getting called from atomic context,
ntb_transport_setup_qp_mw() calls kzalloc_node() with GFP_ATOMIC,
which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL,
which can sleep and improve the possibility of sucessful allocation.
This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Move the Intel hw gen3 code to its own source file. The ntb_hw_intel.c was
getting too large and makes it hard to maintain with future hardware
changes.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Break out the generation specific definitions to different headers
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Sparse is whining about the u32 and __le32 mixed usage in the driver
drivers/ntb/test/ntb_perf.c:288:21: warning: cast to restricted __le32
drivers/ntb/test/ntb_perf.c:295:37: warning: incorrect type in argument 4 (different base types)
drivers/ntb/test/ntb_perf.c:295:37: expected unsigned int [unsigned] [usertype] val
drivers/ntb/test/ntb_perf.c:295:37: got restricted __le32 [usertype] <noident>
...
NTB hardware drivers shall accept CPU-endian data and translate it to
the portable formate by internal means, so the explicit conversions
are not necessary before Scratchpad/Messages API usage anymore.
Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
We accidentally return success if dmaengine_submit() fails. The fix is
to preserve the error code from dma_submit_error().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes the following sparse warnings:
drivers/ntb/hw/mscc/ntb_hw_switchtec.c:1552:6: warning:
symbol 'switchtec_ntb_remove' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Currently there is a memory leak on buf when the call to ntb_mw_get_align
fails. Add an exit err label and jump to this so that kfree on buf frees
the memory.
Detected by CoverityScan, CID#1464286 ("Resource leak")
Fixes: d637628ce00c ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
On 32-bit architectures, resource_size_t is usually 'unsigned int' or
'unsigned long' but not 'unsigned long long', so we get a warning
about printing the wrong data:
drivers/ntb/test/ntb_perf.c: In function 'perf_setup_peer_mw':
drivers/ntb/test/ntb_perf.c:1390:35: error: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t {aka unsigned int}' [-Werror=format=]
This changes the format string to the special %pa that is already
used elsewhere in the same file.
Fixes: b83003b3fdc1 ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Since Switchtec patch there has been a new topology added to
the NTB API. It's called NTB_TOPO_SWITCH and dedicated for
PCIe switch chips. Even though topo field isn't used within the
IDT driver much, lets set it for the sake of unification.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Former NTB Performance driver could only work with NTB devices, which
got Scratchpads available and had just two ports. Since there are
devices, which don't have Scratchpads and got more than two peer
ports, the performance measuring tool needs to be rewritten. This
patch adds the ability to test any available NTB peer.
Additionally it allows to set NTB memory windows up using any
available data exchange interface: Scratchpad or Message registers.
Some cleanups are also added here.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Former NTB Debugging tool driver supported only the limited
functionality of the recently updated NTB API, which is now available
to work with the truly NTB multi-port devices and devices, which
got NTB Message registers instead of Scratchpads. This patch
fully rewrites the driver so one would fully expose all the new
NTB API interfaces. Particularly it concerns the Message registers,
peer ports API, NTB link settings. Additional cleanups are also added
here.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Current Ping Pong driver can't truly work with multi-port devices.
Additionally it requires the Scratchpad registers being available
on NTB device. This patches rewrites the driver so one would
perform the cyclic Ping-Pong algorithm around all the available
NTB peers and makes it working with NTB hardware, which doesn't
support Scratchpads, but such alternative as NTB Message register.
Additional cleanups are also added here.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The dma_mask and dma_coherent_mask fields of the NTB struct device
weren't initialized in hardware drivers. In fact it should be done
instead of PCIe interface usage, since NTB clients are supposed to
use NTB API and left unaware of real hardware implementation.
In addition to that ntb_device_register() method shouldn't clear
the passed ntb_dev structure, since it dma_mask is initialized
by hardware drivers.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
There is a common methods signature form used over all the NTB API
like functions naming scheme, arguments names and order, etc.
Recently added NTB messaging API IO callbacks were named a bit
different so should be renamed to be in compliance with the rest
of the API.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Newer gcc (version 7 and 8 presumably) warn about a statement mixing
the << operator with logical and:
drivers/ntb/hw/mscc/ntb_hw_switchtec.c: In function 'switchtec_ntb_init_sndev':
drivers/ntb/hw/mscc/ntb_hw_switchtec.c:888:24: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]
My interpretation here is that the author must have intended a bitmask
rather than a comparison, so I'm changing the '&&' to '&', which makes
a lot more sense in the context.
Fixes: 1b249475275d ("ntb_hw_switchtec: Allow using Switchtec NTB in multi-partition setups")
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
With Switchtec hardware, the buffer used for a memory window must be
aligned to its size (the hardware only replaces the lower bits). In
certain circumstances dma_alloc_coherent() will not provide a buffer
that adheres to this requirement like when using the CMA and
CONFIG_CMA_ALIGNMENT is set lower than the buffer size.
When we get an unaligned buffer mw_set_trans() should return an error.
We also log an error so we know the cause of the problem.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
When using the max_mw_size parameter of ntb_transport to limit the size of
the Memory windows, communication cannot be established and the queues
freeze.
This is because the mw_size that's reported to the peer is correctly
limited but the size used locally is not. So the MW is initialized
with a buffer smaller than the window but the TX side is using the
full window. This means the TX side will be writing to a region of the
window that points nowhere.
This is easily fixed by applying the same limit to tx_size in
ntb_transport_init_queue().
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
If one host crashes and soft reboots, the other host may not see a
link down event. Then when the crashed host comes back up, the
surviving host may not know the link was reset and the NTB clients
may not work without being reset.
To solve this, we send a LINK_FORCE_DOWN message to each peer every
time we come up, before we register the NTB device. If a surviving
host still thinks the link is up it will take it down immediately.
In this way, once the crashed host comes up fully, it will send a
regular link up event as per usual and the link will be properly
restarted.
While we are in the area, this also fixes the MSG_LINK_UP message that
was in the link down function that was reported by Doug Meyers.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reported-by: ThanhTuThai <cruisethai@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In a crosslink configuration doorbells and messages largely work the
same but the NTB registers must be accessed through the reserved LUT
window. Also, as a bonus, seeing there are now two independent sets of
NTB links, both partitions can actually use all 60 doorbell registers
instead of them having to be split into two for each partition.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Crosslink is a feature of the Switchtec switches that is similar to
the B2B mode of other NTB devices. It allows a system to be designed
that is perfectly symmetric with two identical switches that link
two hosts together.
In order for the system to be symmetric, there is an empty host-less
partition between the two switches which the host must enumerate and
assign BAR addresses to. The firmware in the switch manages this
specially so that the BAR addresses on both sides of the empty
partition will be identical despite being in the same partition with
the same address space.
The driver determines whether crosslink is enabled by a flag set in
the NTB partition info registers which are set by the switch's
configuration file.
When crosslink is enabled, a reserved LUT window is setup to point to
the peer's switch's NTB registers and the local MWs are set to forward
to the host-less partition's BARs. (Yes, this hurts my brain too.)
Once this is setup, largely the same NTB infrastructure is used to
communicate between the two hosts.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This is a prep patch in order to support the crosslink feature which
will require the driver to setup the requester ID table in another
partition as well as it's own. To aid this, create a helper function
which sets up the requester IDs from an array.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This is a prep patch in order to support the crosslink feature which
will require the driver to use another reserved LUT window. To
simplify this we move the code which sets up the reserved LUT window
into a helper function which will be used by the crosslink
initialization.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This is a prep patch in order to support the crosslink feature which will
require the driver to use another reserved LUT window. To simplify this,
we add some code to track the number of reserved LUT windows in use
instead of assuming this is always 1.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Allow using Switchtec NTB in setups that have more than two partitions.
Note: this does not enable having multi-host communication, it only
allows for a single NTB link between two hosts in a network that might
have more than two.
Use following logic to determine the NT peer partition:
1) If there are 2 partitions, and the target vector is set in
the Switchtec configuration, use the partition specified in target
vector.
2) If there are 2 partitions and target vector is unset
use the only other partition as specified in the NT EP map.
3) If there are more than 2 partitions and target vector is set
use the other partition specified in target vector.
4) If there are more than 2 partitions and target vector is unset,
this is invalid and report an error.
Signed-off-by: Kelvin Cao <kelvin.cao@microsemi.com>
[logang@deltatee.com: commit message fleshed out]
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Trivial fix to spelling mistake in dev_err error message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-By: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
There is no need to #define the license of the driver, just put it in
the MODULE_LICENSE() line directly as a text string.
This allows tools that check that the module license matches the source
code license to work properly, as there is no need to unwind the
unneeded dereference, especially when the string is defined just a few
lines above the usage of it.
Reported-and-reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <Allen.Hubbe@emc.com>
Cc: Gary R Hook <gary.hook@amd.com>
Cc: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This resolves a bug which may incorrectly configure the peer host's
LUT for shared memory window access. The code was using the local
host's first BAR number, rather than the peer hosts's first BAR
number, to determine what peer NT control register to program.
The bug will cause the Switchtec NTB link to work only if both peers
have the same first NTB BAR configured. In all other configurations,
the link will not come up, failing silently.
When both hosts have the same first BAR, the configuration works only
because the first BAR numbers happent to be the same. When the hosts
do not have the same first BAR, then the LUT translation will not be
configured in the correct peer LUT and will not give the peer the
shared memory window access required for the link to operate.
Signed-off-by: Doug Meyer <dmeyer@gigaio.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: 678784a44ae8 ("NTB: switchtec_ntb: Initialize hardware for memory windows")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The workaround code is never used because Skylake NTB does not need it.
Reported-by: Allen Hubbe <allen.hubbe@dell.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Make these const as they are only used during a copy operation.
Done using Coccinelle.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The Switchtec hardware has two types of memory windows: LUTs and Direct.
The first area in each BAR is for LUT windows and the remaining area is
for the direct region. The total number of LUT entries is set by a
configuration setting in hardware and they all must be the same
size. (This is fixed by switchtec_ntb to be 64K.)
switchtec_ntb enables the LUTs only for the first BAR and enables the
highest power of two possible. Seeing the LUTs are at the beginning of
the BAR, the direct memory window's alignment is affected. Therefore,
the maximum direct memory window size can not be greater than the number
of LUTs times 64K. The direct window in other BARs will not have this
restriction as the LUTs will not be enabled there. LUTs will only be
exposed through the NTB API if the use_lut_mw parameter is set.
Seeing the Switchtec hardware, by default, configures BARs to be 4G a
module parameter is given to limit the size of the advertised memory
windows. Higher layers tend to allocate the maximum BAR size and this
has a tendency to fail when they try to allocate 4GB of contiguous
memory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Seeing there is no dedicated hardware for this, we simply add
these as entries in the shared memory window. Thus, we could support
any number of them but 128 seems like enough, for now.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Pretty straightforward implementation of doorbell registers.
The shift and mask were setup in an earlier patch and this just hooks
up the appropriate portion of the IDB register as the local doorbells
and the opposite portion of ODB as the peer doorbells. The DB mask is
protected by a spinlock to avoid concurrent read-modify-write accesses.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
switchtec_ntb checks for a link by looking at the shared memory
window. If the magic number is correct and the other side indicates
their link is enabled then we take the link to be up.
Whenever we change our local link status we send a msg to the
other side to check whether it's up and change their status.
The current status is maintained in a flag so ntb_is_link_up
can return quickly.
We utilize Switchtec's link status notifier to also check link changes
when the switch notices a port changes state.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Add a skeleton NTB driver which will be filled out in subsequent patches.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Set up some hardware registers and creates interrupt service routines
for the doorbells and messages.
There are 64 doorbells in the switch that are shared between all
partitions. The upper 4 doorbells are also shared with the messages
and are therefore not used. Thus, this provides 28 doorbells for each
partition.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Add the code to initialize the memory windows in the hardware.
This includes setting up the requester ID table, and figuring out
which BAR corresponds to which memory window. (Seeing the switch
can be configured with any number of BARs.)
Also, seeing the device doesn't have hardware for scratchpads or
determining the link status, we create a shared memory window that has
these features. A magic number with a version component will be used
to determine if the other side's driver is actually up.
The shared memory window also informs the other side of the
size and count of the local memory windows.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Seeing the Switchtec NTB hardware shares the same endpoint as the
management endpoint we utilize the class_interface API to register
an NTB driver for every Switchtec device in the system that has the
NTB class code.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Stephen Bates <sbates@raithlin.com>
Reviewed-by: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
With Switchtec hardware it's impossible to get the alignment parameters
for a peer's memory window until the peer's driver has configured its
windows. Strictly speaking, the link doesn't have to be up for this,
but the link being up is the only way the client can tell that
the other side has been configured.
This patch converts ntb_transport and ntb_perf to use this function after
the link goes up. This simplifies these clients slightly because they
no longer have to store the alignment parameters. It also tweaks
ntb_tool so that peer_mw_trans will print zero if it is run before
the link goes up.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
It seems that under certain scenarios the SPAD can have bogus values caused
by an agent (i.e. BIOS or other software) that is not the kernel driver, and
that causes memory window setup failure. This should not cause the link to
be disabled because if we do that, the driver will never recover again. We
have verified in testing that this issue happens and prevents proper link
recovery.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes: 84f766855f ("ntb: stop link work when we do not have memory")
After converting to the new API, both ntb_tool and ntb_transport are
using ntb_mw_count to iterate through ntb_peer_get_addr when they
should be using ntb_peer_mw_count.
This probably isn't an issue with the Intel and AMD drivers but
this will matter for any future driver with asymetric memory window
counts.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixes: 443b9a14ec ("NTB: Alter MW API to support multi-ports devices")
If a failure occurs when creating Debug FS entries, unroll all of
the work that's been done.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The ntb_perf tool uses module parameters to control the
characteristics of its test. Enable the changing of these
options through debugfs, and eliminating the need to unload
and reload the module to make changes and run additional tests.
Add a new module parameter that forces the DMA channel
selection onto the same node as the NTB device (default: true).
- seg_order: Size of the NTB memory window; power of 2.
- run_order: Size of the data buffer; power of 2.
- use_dma: Use DMA or memcpy? Default: 0.
- on_node: Only use DMA channel(s) on the NTB node. Default: true.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The Debug FS entries manage themselves; we don't need to hang onto
them in the context structure.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The DMA channel(s)/memory used to transfer data to an NTB device
may not be required to be on the same node as the device. Add a
module parameter that allows any candidate channel (aside from
node assocation) and allocated memory to be used.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
IDT 89HPESxNTx device series is PCIe-switches, which support
Non-Transparent bridging between domains connected to the device ports.
Since new NTB API exposes multi-port interface and messaging API, the
IDT NT-functions can be now supported in the kernel. This driver adds
the following functionality:
1) Multi-port NTB API to have information of possible NT-functions
activated in compliance with available device ports.
2) Memory windows of direct and look up table based address translation
with all possible combinations of BARs setup.
3) Traditional doorbell NTB API.
4) One-on-one messaging NTB API.
There are some IDT PCIe-switch setups, which must be done before any of
the NTB peers started. It can be performed either by system BIOS via
IDT SMBus-slave interface or by pre-initialized IDT PCIe-switch EEPROM:
1) NT-functions of corresponding ports must be activated using
SWPARTxCTL and SWPORTxCTL registers.
2) BAR0 must be configured to expose NT-function configuration
registers map.
3) The rest of the BARs must have at least one memory window
configured, otherwise the driver will just return an error.
Temperature sensor of IDT PCIe-switches can be also optionally
activated by BIOS or EEPROM.
(See IDT documentations for details of how the pre-initialization can
be done)
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should
be cleaned up. This makes it more clear what's actually going on when
reading the code.
[1] http://www.spinics.net/lists/linux-pci/msg56904.html
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
As per a comments in [1] by Greg Kroah-Hartman, the ndev_* macros should
be cleaned up. This makes it more clear what's actually going on when
reading the code.
[1] http://www.spinics.net/lists/linux-pci/msg56904.html
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Some IDT NTB-capable PCIe-switches have message registers to communicate with
peer devices. This patch adds new NTB API callback methods, which can be used
to utilize these registers functionality:
ntb_msg_count(); - get number of message registers
ntb_msg_inbits(); - get bitfield of inbound message registers status
ntb_msg_outbits(); - get bitfield of outbound message registers status
ntb_msg_read_sts(); - read the inbound and outbound message registers status
ntb_msg_clear_sts(); - clear status bits of message registers
ntb_msg_set_mask(); - mask interrupts raised by status bits of message
registers.
ntb_msg_clear_mask(); - clear interrupts mask bits of message registers
ntb_msg_read(midx, *pidx); - read message register with specified index,
additionally getting peer port index which data received from
ntb_msg_write(midx, pidx); - write data to the specified message register
sending it to the passed peer device connected over a pidx port
ntb_msg_event(); - notify driver context of a new message event
Of course there is hardware which doesn't support Message registers, so
this API is made optional.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Even though there is no any real NTB hardware, which would have both more
than two ports and Scratchpad registers, it is logically correct to have
Scratchpad API accepting a peer port index as well. Intel/AMD drivers utilize
Primary and Secondary topology to split Scratchpad between connected root
devices. Since port-index API introduced, Intel/AMD NTB hardware drivers can
use device port to determine which Scratchpad registers actually belong to
local and peer devices. The same approach can be used if some potential
hardware in future will be multi-port and have some set of Scratchpads.
Here are the brief of changes in the API:
ntb_spad_count() - return number of Scratchpads per each port
ntb_peer_spad_addr(pidx, sidx) - address of Scratchpad register of the
peer device with pidx-index
ntb_peer_spad_read(pidx, sidx) - read specified Scratchpad register of the
peer with pidx-index
ntb_peer_spad_write(pidx, sidx) - write data to Scratchpad register of the
peer with pidx-index
Since there is hardware which doesn't support Scratchpad registers, the
corresponding API methods are now made optional.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Multi-port NTB devices permit to share a memory between all accessible peers.
Memory Windows API is altered to correspondingly initialize and map memory
windows for such devices:
ntb_mw_count(pidx); - number of inbound memory windows, which can be allocated
for shared buffer with specified peer device.
ntb_mw_get_align(pidx, widx); - get alignment and size restriction parameters
to properly allocate inbound memory region.
ntb_peer_mw_count(); - get number of outbound memory windows.
ntb_peer_mw_get_addr(widx); - get mapping address of an outbound memory window
If hardware supports inbound translation configured on the local ntb port:
ntb_mw_set_trans(pidx, widx); - set translation address of allocated inbound
memory window so a peer device could access it.
ntb_mw_clear_trans(pidx, widx); - clear the translation address of an inbound
memory window.
If hardware supports outbound translation configured on the peer ntb port:
ntb_peer_mw_set_trans(pidx, widx); - set translation address of a memory
window retrieved from a peer device
ntb_peer_mw_clear_trans(pidx, widx); - clear the translation address of an
outbound memory window
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Multi-port devices permit the NTB connections between multiple domains,
so a local device can have NTB link being up with one peer and being
down with another. NTB link-state API is appropriately altered to return
a bitfield of the link-states between the local device and possible peers.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
There is some NTB hardware, which can combine more than just two domains
over NTB. For instance, some IDT PCIe-switches can have NTB-functions
activated on more than two-ports. The different domains are distinguished
by ports they are connected to. So the new port-related methods are added to
the NTB API:
ntb_port_number() - return local port
ntb_peer_port_count() - return number of peers local port can connect to
ntb_peer_port_number(pdix) - return port number by it index
ntb_peer_port_idx(port) - return port index by it number
Current test-drivers aren't changed much. They still support two-ports devices
for the time being while multi-ports hardware drivers aren't added.
By default port-related API is declared for two-ports hardware.
So corresponding hardware drivers won't need to implement it.
Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Do not sleep in ntb_async_tx_submit, which could deadlock.
This reverts commit "8c874cc140d667f84ae4642bb5b5e0d6396d2ca4"
Fixes: 8c874cc140 ("NTB: Address out of DMA descriptor issue with NTB")
Reported-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
doorbell registers are 32bit write only registers. The source for the
doorbell is a 64bit register that shows the interrupt bits.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6cc4 ("ntb: Adding Skylake Xeon NTB support")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
A divide by zero error occurs if qp_count is less than mw_count because
num_qps_mw is calculated to be zero. The calculation appears to be
incorrect.
The requirement is for num_qps_mw to be set to qp_count / mw_count
with any remainder divided among the earlier mws.
For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
than when mw_num >= qp_count.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In cases where there are more mw's than spads/2-2, the mw count gets
reduced to match the limitation. ntb_transport also tries to ensure that
there are fewer qps than mws but uses the full mw count instead of
the reduced one. When this happens, the math in
'ntb_transport_setup_qp_mw' will get confused and result in a kernel
paging request bug.
This patch fixes the bug by reducing qp_count to the reduced mw count
instead of the full mw count.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Fixes: e26a5843f7 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The order parameters are powers of 2; adjust the usage information
to use correct mathematical representations.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Fixes: 8a7b6a778a ("ntb: ntb perf tool")
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
On Skylake hardware, the link_poll isn't clearing the pending interrupt
bit. Adding a new function for SKX that handles clearing of status bit the
right way.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 783dfa6c ("ntb: Adding Skylake Xeon NTB support")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Fix typo causing ntb_transport_create_queue to select the first
queue every time, instead of using the next free queue.
Signed-off-by: Thomas VanSelus <tvanselus@xes-inc.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Fixes: fce8a7bb5 ("PCI-Express Non-Transparent Bridge Support")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
In the normal I/O execution path, ntb_perf is missing a call to
dmaengine_unmap_put() after submission. That causes us to leak
unmap objects.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a77 ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The call to debugfs_remove_recursive(qp->debugfs_dir) of the sub-level
directory must not be later than
debugfs_remove_recursive(nt_debugfs_dir) of the top-level directory.
Otherwise, the sub-level directory will not exist, and it would be
invalid (panic) to attempt to remove it. This removes the top-level
directory last, after sub-level directories have been cleaned up.
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Fixes: e26a5843f ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The results were previously ignored, anyway.
Signed-off-by: Steve Wahl <Steve.Wahl@dell.com>
Fixes: e26a5843f7
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
'request_irq()' and 'free_irq()' should have the same 'dev_id'.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The offsets for the SZ registers are wrong. Updated.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Sandeep Mann <sandeep@purestorage.com>
Tested-by: Zachary Ross <zacharyx.ross@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
When the underlying NTB H/W driver advertises more memory windows
than the number of scratchpads available to setup MW's, it is likely
that we may end up filling the remaining memory windows with garbage.
So to avoid that, lets limit the memory windows that transport driver
can setup based on the available scratchpads.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Due to incorrect limit and translation register values, NTB link was
going down when the memory window was setup. Made appropriate changes
as per spec.
Fix limit register values for BAR1, which was overlapping
with the BAR23 address.
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
AMD NTB support hotplug under B2B mode. NTB will trigger link
up/down interrupt event when doing plug add/remove, this patch
implements the two interrupt event to support B2B hotplug function.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The Skylake Xeon NTB hardware has made some changes to the register name,
offset, and the way doorbells work. Adding driver support for the new
hardware.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@dell.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
This is a static checker warning, not something I'm desperately
concerned about. But snprintf() returns the number of bytes that
would have been copied if there were space. We really care about the
number of bytes that actually were copied so we should use scnprintf()
instead.
It probably won't overrun, and in that case we may as well just use
sprintf() but these sorts of things make static checkers and code
reviewers happier.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
The peer_addr member of intel_ntb_dev is not set, therefore when
acquiring ntb_peer_db and ntb_peer_spad we only get the offset rather
than the actual physical address. Adding fix to correct that.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>